Overview

Dataset statistics

Number of variables: 7
Number of observations: 3840
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 225.1 KiB
Average record size in memory: 60.0 B
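
As a quick sanity check, the average record size can be recomputed from the total memory and the number of observations. This is only a sketch using the figures printed in this block; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
total_memory_kib = 225.1
n_observations = 3840

# 1 KiB = 1024 bytes, so convert to bytes before dividing by the row count.
avg_record_bytes = total_memory_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "60.0 B"
```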

Variable types

Categorical: 5
Numeric: 2

Dataset

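Description: Platform gap data for Line 4, operated by Seoul Metro (서울교통공사). The fields are the railway operator name (철도운영기관명), line name (선명), station name (역명), and, for each platform gap measurement, the platform number (승강장번호), car order (차량순서), car door number (차량출입문번호), and safety distance (안전거리).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041517/fileData.do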

Alerts

선명 has constant value "4호선" (Constant)
역명 is highly overall correlated with 철도운영기관명 (High correlation)
철도운영기관명 is highly overall correlated with 안전거리 and 1 other field (High correlation)
안전거리 is highly overall correlated with 철도운영기관명 (High correlation)

Reproduction

Analysis started: 2023-12-12 13:31:24.563996
Analysis finished: 2023-12-12 13:31:25.524515
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
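
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 13:31:24.563996")
finished = datetime.fromisoformat("2023-12-12 13:31:25.524515")

# Subtracting two datetimes yields a timedelta; convert it to seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "0.96 seconds"
```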

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.1 KiB
서울교통공사: 2080
코레일: 1760

Length

Max length: 6
Median length: 6
Mean length: 4.625
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value  Count  Frequency (%)
서울교통공사  2080  54.2%
코레일  1760  45.8%
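
The mean length of 4.625 is the count-weighted average of the two string lengths (6 and 3 characters). A small sketch that recomputes it, along with the frequency percentages, from the counts in the table; the dictionary below is just a transcription of those counts.

```python
# Value counts copied from the Common Values table for 철도운영기관명.
counts = {"서울교통공사": 2080, "코레일": 1760}

total = sum(counts.values())

# Weighted mean of the string lengths, weighting each value by its count.
mean_length = sum(len(value) * count for value, count in counts.items()) / total

# Share of each value as a percentage, rounded to one decimal place.
shares = {value: round(100 * count / total, 1) for value, count in counts.items()}

print(mean_length)  # 4.625
print(shares)       # {'서울교통공사': 54.2, '코레일': 45.8}
```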

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.1 KiB
4호선: 3840

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4호선
2nd row: 4호선
3rd row: 4호선
4th row: 4호선
5th row: 4호선

Common Values

Value  Count  Frequency (%)
4호선  3840  100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

역명
Categorical

HIGH CORRELATION 

Distinct: 48
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 30.1 KiB
길음: 80
남태령: 80
삼각지: 80
노원: 80
당고개: 80
Other values (43): 3440

Length

Max length: 11
Median length: 10
Mean length: 3.8541667
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 길음
2nd row: 길음
3rd row: 길음
4th row: 길음
5th row: 길음

Common Values

Value  Count  Frequency (%)
길음  80  2.1%
남태령  80  2.1%
삼각지  80  2.1%
노원  80  2.1%
당고개  80  2.1%
동대문  80  2.1%
동대문역사문화공원  80  2.1%
동작(현충원)  80  2.1%
명동  80  2.1%
미아  80  2.1%
Other values (38)  3040  79.2%

Length

[Figure: histogram of lengths of the category]

Value  Count  Frequency (%)
길음  80  2.1%
남태령  80  2.1%
상록수  80  2.1%
경마공원  80  2.1%
고잔  80  2.1%
과천  80  2.1%
금정  80  2.1%
대공원  80  2.1%
대야미  80  2.1%
반월  80  2.1%
Other values (38)  3040  79.2%

승강장번호
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.1 KiB
1: 1920
2: 1920

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  1920  50.0%
2  1920  50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

차량순서
Real number (ℝ)

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.5
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5.5
Q3: 8
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 5
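
The quartiles above can be reproduced from the value counts of 차량순서 (each of the values 1 to 10 occurs 384 times). The sketch below assumes linear interpolation between order statistics, which is what `statistics.quantiles` does with `method="inclusive"`; whether the report uses exactly this interpolation is an assumption, although the results agree.

```python
import statistics

# Rebuild the column from its value counts: values 1..10, 384 rows each.
values = sorted(v for v in range(1, 11) for _ in range(384))

# Quartiles with linear interpolation over the observed data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(values) - min(values)

print(q1, median, q3)    # 3.0 5.5 8.0
print(iqr, value_range)  # 5.0 9
```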

Descriptive statistics

Standard deviation: 2.8726554
Coefficient of variation (CV): 0.52230098
Kurtosis: -1.2242739
Mean: 5.5
Median Absolute Deviation (MAD): 2.5
Skewness: 0
Sum: 21120
Variance: 8.252149
Monotonicity: Not monotonic
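
Because 차량순서 takes each value from 1 to 10 exactly 384 times, the descriptive statistics can be checked by hand. The sketch below assumes the report uses the sample (n - 1) variance, which is consistent with the printed 8.252149 (the population variance would be exactly 8.25).

```python
import statistics

# Rebuild the column from its value counts: values 1..10, 384 rows each.
values = [v for v in range(1, 11) for _ in range(384)]

total = sum(values)                      # Sum
mean = statistics.fmean(values)          # Mean
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation

print(total, mean)          # 21120 5.5
print(round(variance, 6))   # 8.252149
print(round(std, 7))        # 2.8726554
print(round(cv, 8))         # 0.52230098
```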
[Figure: histogram with fixed size bins (bins=10)]

Value  Count  Frequency (%)
1  384  10.0%
2  384  10.0%
3  384  10.0%
4  384  10.0%
5  384  10.0%
6  384  10.0%
7  384  10.0%
8  384  10.0%
9  384  10.0%
10  384  10.0%

Value  Count  Frequency (%)
10  384  10.0%
9  384  10.0%
8  384  10.0%
7  384  10.0%
6  384  10.0%
5  384  10.0%
4  384  10.0%
3  384  10.0%
2  384  10.0%
1  384  10.0%

차량출입문번호
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.1 KiB
2: 960
3: 960
4: 960
1: 960

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 4
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
2  960  25.0%
3  960  25.0%
4  960  25.0%
1  960  25.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

안전거리
Real number (ℝ)

HIGH CORRELATION 

Distinct: 130
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.1465625
Minimum: 0
Maximum: 28
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 33.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 6
median: 8
Q3: 9.9
95-th percentile: 15
Maximum: 28
Range: 28
Interquartile range (IQR): 3.9

Descriptive statistics

Standard deviation: 3.3551186
Coefficient of variation (CV): 0.4118447
Kurtosis: 2.7316594
Mean: 8.1465625
Median Absolute Deviation (MAD): 2
Skewness: 1.2218673
Sum: 31282.8
Variance: 11.256821
Monotonicity: Not monotonic
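
Several of the figures for 안전거리 are tied together by simple identities: mean = sum / n, variance = standard deviation squared, and CV = standard deviation / mean. The sketch below only checks that the printed values are mutually consistent; it does not recompute anything from the raw data, which is not included in this report.

```python
# Figures copied from the report for 안전거리.
n_observations = 3840
total = 31282.8
std = 3.3551186

mean = total / n_observations   # should match the reported Mean
variance = std ** 2             # should match the reported Variance
cv = std / mean                 # should match the reported CV

print(round(mean, 7))      # 8.1465625
print(round(variance, 5))  # 11.25682
print(round(cv, 6))        # 0.411845
```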
[Figure: histogram with fixed size bins (bins=50)]

Value  Count  Frequency (%)
6.0  481  12.5%
5.0  382  9.9%
7.0  329  8.6%
9.0  247  6.4%
10.0  230  6.0%
8.0  211  5.5%
4.0  164  4.3%
9.5  138  3.6%
11.0  135  3.5%
12.0  112  2.9%
Other values (120)  1411  36.7%

Value  Count  Frequency (%)
0.0  1  < 0.1%
0.5  1  < 0.1%
1.0  4  0.1%
1.5  2  0.1%
1.9  1  < 0.1%
2.0  26  0.7%
2.1  1  < 0.1%
2.3  2  0.1%
2.5  11  0.3%
2.6  1  < 0.1%

Value  Count  Frequency (%)
28.0  1  < 0.1%
25.0  2  0.1%
24.0  3  0.1%
23.5  1  < 0.1%
23.0  6  0.2%
22.5  2  0.1%
22.0  2  0.1%
21.0  7  0.2%
20.0  12  0.3%
19.5  1  < 0.1%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

                철도운영기관명  역명  승강장번호  차량순서  차량출입문번호  안전거리
철도운영기관명  1.000  1.000  0.000  0.000  0.000  0.650
역명            1.000  1.000  0.000  0.000  0.000  0.837
승강장번호      0.000  0.000  1.000  0.000  0.000  0.232
차량순서        0.000  0.000  0.000  1.000  0.000  0.056
차량출입문번호  0.000  0.000  0.000  0.000  1.000  0.020
안전거리        0.650  0.837  0.232  0.056  0.020  1.000

[Figure: correlation heatmap]

                역명  차량출입문번호  승강장번호  철도운영기관명
역명            1.000  0.000  0.000  0.994
차량출입문번호  0.000  1.000  0.000  0.000
승강장번호      0.000  0.000  1.000  0.000
철도운영기관명  0.994  0.000  0.000  1.000

[Figure: correlation heatmap]

                차량순서  안전거리  철도운영기관명  역명  승강장번호  차량출입문번호
차량순서        1.000  -0.004  0.000  0.000  0.000  0.000
안전거리        -0.004  1.000  0.505  0.464  0.177  0.022
철도운영기관명  0.000  0.505  1.000  0.994  0.000  0.000
역명            0.000  0.464  0.994  1.000  0.000  0.000
승강장번호      0.000  0.177  0.000  0.000  1.000  0.000
차량출입문번호  0.000  0.022  0.000  0.000  0.000  1.000
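
The 0.994 between 역명 and 철도운영기관명 is what a bias-corrected Cramér's V gives when every station belongs to exactly one operator. The report does not name the statistic behind each matrix, so treating this value as bias-corrected Cramér's V is an assumption. The 26/22 station split is also inferred rather than stated: it follows from 2080 / 80 and 1760 / 80 if every station contributes 80 rows. The function name is illustrative.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed layout: 26 stations under 서울교통공사 and 22 under 코레일, 80 rows each.
# Columns are (서울교통공사, 코레일); each row is one station.
table = [[80, 0]] * 26 + [[0, 80]] * 22
v = cramers_v_corrected(table)
print(round(v, 3))  # 0.994
```

Without the correction, a perfect station-to-operator mapping would give exactly 1.0; the correction pulls the value slightly below 1 because the table has many rows relative to the sample size.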

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

      철도운영기관명  선명  역명  승강장번호  차량순서  차량출입문번호  안전거리
0     서울교통공사  4호선  길음  1  1  2  11.0
1     서울교통공사  4호선  길음  1  1  3  11.0
2     서울교통공사  4호선  길음  1  1  4  11.0
3     서울교통공사  4호선  길음  1  1  1  10.0
4     서울교통공사  4호선  길음  1  2  2  5.0
5     서울교통공사  4호선  길음  1  2  3  6.0
6     서울교통공사  4호선  길음  1  2  4  6.0
7     서울교통공사  4호선  길음  1  2  1  6.0
8     서울교통공사  4호선  길음  1  3  3  6.0
9     서울교통공사  4호선  길음  1  3  2  5.0

      철도운영기관명  선명  역명  승강장번호  차량순서  차량출입문번호  안전거리
3830  코레일  4호선  한대앞  2  8  2  6.0
3831  코레일  4호선  한대앞  2  8  4  6.0
3832  코레일  4호선  한대앞  2  9  1  6.0
3833  코레일  4호선  한대앞  2  9  3  6.0
3834  코레일  4호선  한대앞  2  9  2  6.0
3835  코레일  4호선  한대앞  2  9  4  6.0
3836  코레일  4호선  한대앞  2  10  4  6.0
3837  코레일  4호선  한대앞  2  10  1  5.5
3838  코레일  4호선  한대앞  2  10  2  6.0
3839  코레일  4호선  한대앞  2  10  3  5.5
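
The row count is consistent with one measurement per platform, car, and door at every station: 2 platforms x 10 cars x 4 doors = 80 rows per station, and 48 stations x 80 = 3840. The sketch below enumerates that grid. That every station has the complete grid is an assumption suggested by the uniform value counts, not something the report states, and the order of the enumerated tuples is not meant to match the row order in the file.

```python
from itertools import product

# Distinct values reported for each positional field.
platforms = range(1, 3)   # 승강장번호: 1..2
cars = range(1, 11)       # 차량순서: 1..10
doors = range(1, 5)       # 차량출입문번호: 1..4

# One (platform, car, door) combination per measurement at a station.
rows_per_station = list(product(platforms, cars, doors))

n_stations = 48           # distinct 역명 values
total_rows = n_stations * len(rows_per_station)

print(len(rows_per_station), total_rows)  # 80 3840
```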