Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 586.1 KiB
Average record size in memory: 60.0 B
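
The two memory figures are related by a simple division. As a quick sanity check, and assuming 1 KiB = 1024 bytes, the average record size can be recomputed from the total size and the number of observations. The function name below is illustrative and not part of ydata-profiling.

```python
def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows

# Figures taken from the dataset statistics of this report.
size = average_record_size_bytes(586.1, 10000)
print(f"{size:.1f} B")  # 60.0 B
```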

Variable types

Categorical: 4
Text: 1
Numeric: 2

Dataset

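Description: Data on platform gap distances for the lines operated by 서울도시철도공사 (Seoul Metropolitan Rapid Transit Corporation). Fields include 철도운영기관명 (railway operator name), 선명 (line name), 역명 (station name), 승강장번호 (platform number), 차량순서 (car order), 차량출입문번호 (car door number), and 안전거리 (safety distance).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041524/fileData.do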

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
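
A constant-value alert of this kind means the column has exactly one distinct value. The sketch below shows one way such a check could be written in plain Python; `constant_columns` is a hypothetical helper, and the three rows are illustrative stand-ins shaped like the dataset, not the actual data or the profiler's own implementation.

```python
def constant_columns(rows: list[dict]) -> dict:
    """Map each column that has exactly one distinct value to that value."""
    if not rows:
        return {}
    result = {}
    for col in rows[0]:
        distinct = {row[col] for row in rows}
        if len(distinct) == 1:
            result[col] = next(iter(distinct))
    return result

# Illustrative rows only (assumed values, shaped like the dataset sample).
rows = [
    {"철도운영기관명": "서울교통공사", "선명": "5호선", "안전거리": 9},
    {"철도운영기관명": "서울교통공사", "선명": "7호선", "안전거리": 8},
    {"철도운영기관명": "서울교통공사", "선명": "8호선", "안전거리": 6},
]
print(constant_columns(rows))
```

A constant column carries no information for comparisons between rows, so it is usually safe to drop it before further analysis.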

Reproduction

Analysis started: 2023-12-12 06:55:39.832835
Analysis finished: 2023-12-12 06:55:41.021449
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Value  Count
서울교통공사  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value  Count  Frequency (%)
서울교통공사  10000  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 서울교통공사 10000 (100.0%)]

선명
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Value  Count
5호선  3424
7호선  3392
6호선  2368
8호선  816

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 5호선
3rd row: 5호선
4th row: 5호선
5th row: 5호선

Common Values

Value  Count  Frequency (%)
5호선  3424  34.2%
7호선  3392  33.9%
6호선  2368  23.7%
8호선  816  8.2%
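
The frequency column is each count divided by the 10000 observations. A minimal sketch of that calculation, rebuilding the 선명 column from the reported counts (the helper name `value_frequencies` is an assumption for illustration):

```python
from collections import Counter

def value_frequencies(values):
    """Return (value, count, percent) tuples, most common first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common()]

# Rebuild the 선명 column from the counts reported in this section.
lines = ["5호선"] * 3424 + ["7호선"] * 3392 + ["6호선"] * 2368 + ["8호선"] * 816
for value, count, pct in value_frequencies(lines):
    print(value, count, f"{pct}%")
```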

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 5호선 34.2%, 7호선 33.9%, 6호선 23.7%, 8호선 8.2%]

역명
Text

Distinct: 152
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.4336
Min length: 2

Characters and Unicode

Total characters: 44336
Distinct characters: 202
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
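
As a rough illustration of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point: `Lo` is Other Letter, `Ps` Open Punctuation, `Pe` Close Punctuation, `Nd` Decimal Number and `Po` Other Punctuation. The sketch below counts categories for one station name from the dataset sample. Note that `unicodedata` does not expose scripts or blocks, so those parts of the report are not reproduced here, and the profiler itself may use a different library.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Ps')."""
    return Counter(unicodedata.category(ch) for ch in text)

# A station name taken from the dataset sample.
counts = category_counts("천호(풍납토성)")
print(counts)
```

For this name the six Hangul syllables fall under `Lo` and the two parentheses under `Ps` and `Pe`.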

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강동
2nd row: 강동
3rd row: 강동
4th row: 강동
5th row: 강동

Value  Count  Frequency (%)
방화  128  1.3%
상일동  128  1.3%
온수(성공회대입구  128  1.3%
공덕  128  1.3%
태릉입구  128  1.3%
청담  128  1.3%
군자(능동  128  1.3%
청구  128  1.3%
봉화산(서울의료원  128  1.3%
천호(풍납토성  112  1.1%
Other values (142)  8736  87.4%

Most occurring characters

Value  Count  Frequency (%)
(  2336  5.3%
)  2336  5.3%
[character lost]  1584  3.6%
[character lost]  1520  3.4%
[character lost]  1232  2.8%
[character lost]  1152  2.6%
[character lost]  944  2.1%
[character lost]  848  1.9%
[character lost]  816  1.8%
[character lost]  768  1.7%
Other values (192)  30800  69.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  39488  89.1%
Open Punctuation  2336  5.3%
Close Punctuation  2336  5.3%
Decimal Number  128  0.3%
Other Punctuation  48  0.1%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[character lost]  1584  4.0%
[character lost]  1520  3.8%
[character lost]  1232  3.1%
[character lost]  1152  2.9%
[character lost]  944  2.4%
[character lost]  848  2.1%
[character lost]  816  2.1%
[character lost]  768  1.9%
[character lost]  752  1.9%
[character lost]  704  1.8%
Other values (187)  29168  73.9%

Decimal Number

Value  Count  Frequency (%)
4  64  50.0%
3  64  50.0%

Open Punctuation

Value  Count  Frequency (%)
(  2336  100.0%

Close Punctuation

Value  Count  Frequency (%)
)  2336  100.0%

Other Punctuation

Value  Count  Frequency (%)
·  48  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  39488  89.1%
Common  4848  10.9%

Most frequent character per script

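Hangul

Value  Count  Frequency (%)
[character lost]  1584  4.0%
[character lost]  1520  3.8%
[character lost]  1232  3.1%
[character lost]  1152  2.9%
[character lost]  944  2.4%
[character lost]  848  2.1%
[character lost]  816  2.1%
[character lost]  768  1.9%
[character lost]  752  1.9%
[character lost]  704  1.8%
Other values (187)  29168  73.9%

Common

Value  Count  Frequency (%)
(  2336  48.2%
)  2336  48.2%
4  64  1.3%
3  64  1.3%
·  48  1.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  39488  89.1%
ASCII  4800  10.8%
None  48  0.1%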

Most frequent character per block

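ASCII

Value  Count  Frequency (%)
(  2336  48.7%
)  2336  48.7%
4  64  1.3%
3  64  1.3%

Hangul

Value  Count  Frequency (%)
[character lost]  1584  4.0%
[character lost]  1520  3.8%
[character lost]  1232  3.1%
[character lost]  1152  2.9%
[character lost]  944  2.4%
[character lost]  848  2.1%
[character lost]  816  2.1%
[character lost]  768  1.9%
[character lost]  752  1.9%
[character lost]  704  1.8%
Other values (187)  29168  73.9%

None

Value  Count  Frequency (%)
·  48  100.0%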

승강장번호
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Value  Count
1  4888
2  4664
3  224
4  224

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  4888  48.9%
2  4664  46.6%
3  224  2.2%
4  224  2.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 1 48.9%, 2 46.6%, 3 2.2%, 4 2.2%]

차량순서
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4184
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.266066
Coefficient of variation (CV): 0.51287026
Kurtosis: -1.2074931
Mean: 4.4184
Median Absolute Deviation (MAD): 2
Skewness: 0.039704612
Sum: 44184
Variance: 5.1350549
Monotonicity: Not monotonic
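
Several of these descriptive statistics can be recomputed from the value counts reported for 차량순서 (1284 rows each for cars 1 to 6, 1148 each for cars 7 and 8). The sketch below is one way to do so in plain Python. The n - 1 (sample) denominator is an assumption; it is the choice that reproduces the reported variance. `stats_from_counts` is a hypothetical helper name.

```python
import math

def stats_from_counts(counts: dict[int, int]) -> dict[str, float]:
    """Sum, mean, sample variance (n - 1), std and CV from value counts."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = sq_dev / (n - 1)  # assumed sample variance
    std = math.sqrt(variance)
    return {"sum": total, "mean": mean, "variance": variance,
            "std": std, "cv": std / mean}

# 차량순서 value counts as reported for this variable.
car_order = {1: 1284, 2: 1284, 3: 1284, 4: 1284,
             5: 1284, 6: 1284, 7: 1148, 8: 1148}
for name, value in stats_from_counts(car_order).items():
    print(name, round(value, 7))
```

The coefficient of variation is simply the standard deviation divided by the mean, which is why it is dimensionless.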
[Figure: Histogram with fixed size bins (bins=8)]

Common Values

Value  Count  Frequency (%)
1  1284  12.8%
2  1284  12.8%
3  1284  12.8%
4  1284  12.8%
5  1284  12.8%
6  1284  12.8%
7  1148  11.5%
8  1148  11.5%

Minimum 10 values

Value  Count  Frequency (%)
1  1284  12.8%
2  1284  12.8%
3  1284  12.8%
4  1284  12.8%
5  1284  12.8%
6  1284  12.8%
7  1148  11.5%
8  1148  11.5%

Maximum 10 values

Value  Count  Frequency (%)
8  1148  11.5%
7  1148  11.5%
6  1284  12.8%
5  1284  12.8%
4  1284  12.8%
3  1284  12.8%
2  1284  12.8%
1  1284  12.8%

차량출입문번호
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Value  Count
3  2500
2  2500
4  2500
1  2500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 4
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3  2500  25.0%
2  2500  25.0%
4  2500  25.0%
1  2500  25.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 3, 2, 4 and 1 at 25.0% each]

안전거리
Real number (ℝ)

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.7403
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 5
Median: 8
Q3: 9
95-th percentile: 13
Maximum: 21
Range: 20
Interquartile range (IQR): 4
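
These quantiles can be checked against the value counts reported for 안전거리. The sketch below expands the counts into a sorted list and takes percentiles with linear interpolation between closest ranks. That interpolation rule is an assumption; with this distribution every quantile lands inside a run of identical values, so other common rules give the same answers. `percentile` is a hypothetical helper.

```python
import math

def percentile(sorted_values: list[int], q: float) -> float:
    """Percentile with linear interpolation between closest ranks (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# 안전거리 value counts as reported for this variable (value -> count).
gap_counts = {1: 10, 2: 40, 3: 103, 4: 514, 5: 2267, 6: 766, 7: 662,
              8: 1396, 9: 2639, 10: 543, 11: 315, 12: 224, 13: 159, 14: 91,
              15: 85, 16: 46, 17: 62, 18: 35, 19: 16, 20: 26, 21: 1}
values = sorted(v for v, c in gap_counts.items() for _ in range(c))

q1, q3 = percentile(values, 0.25), percentile(values, 0.75)
print("5%:", percentile(values, 0.05), "Q1:", q1,
      "median:", percentile(values, 0.5), "Q3:", q3,
      "95%:", percentile(values, 0.95))
print("IQR:", q3 - q1, "Range:", values[-1] - values[0])
```

The interquartile range is Q3 minus Q1, and the range is the maximum minus the minimum.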

Descriptive statistics

Standard deviation: 2.7595683
Coefficient of variation (CV): 0.35651956
Kurtosis: 2.0248968
Mean: 7.7403
Median Absolute Deviation (MAD): 2
Skewness: 0.97265994
Sum: 77403
Variance: 7.6152174
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=21)]

Common Values

Value  Count  Frequency (%)
9  2639  26.4%
5  2267  22.7%
8  1396  14.0%
6  766  7.7%
7  662  6.6%
10  543  5.4%
4  514  5.1%
11  315  3.1%
12  224  2.2%
13  159  1.6%
Other values (11)  515  5.1%

Minimum 10 values

Value  Count  Frequency (%)
1  10  0.1%
2  40  0.4%
3  103  1.0%
4  514  5.1%
5  2267  22.7%
6  766  7.7%
7  662  6.6%
8  1396  14.0%
9  2639  26.4%
10  543  5.4%

Maximum 10 values

Value  Count  Frequency (%)
21  1  < 0.1%
20  26  0.3%
19  16  0.2%
18  35  0.4%
17  62  0.6%
16  46  0.5%
15  85  0.9%
14  91  0.9%
13  159  1.6%
12  224  2.2%

Interactions

[Figure: interaction plots]

Correlations

                선명     승강장번호  차량순서  차량출입문번호  안전거리
선명            1.000  0.112      0.198     0.000           0.489
승강장번호      0.112  1.000      0.000     0.000           0.123
차량순서        0.198  0.000      1.000     0.000           0.027
차량출입문번호  0.000  0.000      0.000     1.000           0.119
안전거리        0.489  0.123      0.027     0.119           1.000

                차량출입문번호  선명   승강장번호
차량출입문번호  1.000           0.000  0.000
선명            0.000           1.000  0.045
승강장번호      0.000           0.045  1.000
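
The extracted report no longer shows which association measure produced each matrix. For pairs of categorical variables such as 선명 and 승강장번호, one common choice is Cramér's V, so the sketch below assumes that measure purely for illustration. It implements the plain, uncorrected form on toy contingency tables; neither the tables nor the function name `cramers_v` come from the report, and the profiler may apply a bias correction that this sketch omits.

```python
import math

def cramers_v(table: list[list[int]]) -> float:
    """Uncorrected Cramér's V for a contingency table of observed counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Toy tables, not taken from the dataset.
independent = [[10, 20], [20, 40]]  # proportional rows: no association
dependent = [[50, 0], [0, 50]]      # each row maps to exactly one column
print(cramers_v(independent), cramers_v(dependent))
```

Values near 0 indicate that knowing one variable says little about the other, while values near 1 indicate a near-deterministic relationship.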

                차량순서  안전거리  선명   승강장번호  차량출입문번호
차량순서        1.000     0.004     0.090  0.000      0.000
안전거리        0.004     1.000     0.317  0.075      0.071
선명            0.090     0.317     1.000  0.045      0.000
승강장번호      0.000     0.075     0.045  1.000      0.000
차량출입문번호  0.000     0.071     0.000  0.000      1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      철도운영기관명  선명   역명  승강장번호  차량순서  차량출입문번호  안전거리
0     서울교통공사    5호선  강동  1          1         3               9
1     서울교통공사    5호선  강동  1          1         2               9
2     서울교통공사    5호선  강동  1          1         4               9
3     서울교통공사    5호선  강동  1          1         1               8
4     서울교통공사    5호선  강동  1          2         3               9
5     서울교통공사    5호선  강동  1          2         1               9
6     서울교통공사    5호선  강동  1          2         4               9
7     서울교통공사    5호선  강동  1          2         2               9
8     서울교통공사    5호선  강동  1          3         3               9
9     서울교통공사    5호선  강동  1          3         1               9

Last rows

      철도운영기관명  선명   역명            승강장번호  차량순서  차량출입문번호  안전거리
9990  서울교통공사    8호선  천호(풍납토성)  2          4         1               6
9991  서울교통공사    8호선  천호(풍납토성)  2          4         4               6
9992  서울교통공사    8호선  천호(풍납토성)  2          5         1               5
9993  서울교통공사    8호선  천호(풍납토성)  2          5         2               5
9994  서울교통공사    8호선  천호(풍납토성)  2          5         3               5
9995  서울교통공사    8호선  천호(풍납토성)  2          5         4               5
9996  서울교통공사    8호선  천호(풍납토성)  2          6         1               5
9997  서울교통공사    8호선  천호(풍납토성)  2          6         3               5
9998  서울교통공사    8호선  천호(풍납토성)  2          6         2               6
9999  서울교통공사    8호선  천호(풍납토성)  2          6         4               6