Overview

Dataset statistics

Number of variables: 10
Number of observations: 302
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.3 KiB
Average record size in memory: 82.4 B

Variable types

Categorical: 5
Text: 1
Numeric: 1
Boolean: 3

Dataset

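Description: Platform information for the lines operated by Seoul Metro (서울교통공사), Lines 5 through 8. The data contains the railway operator name (철도운영기관명), line name (선명), station name (역명), platform number (승강장번호), up/down direction (상하행구분), above/below ground indicator (지상구분), station floor (역층), platform connection (승강장연결 여부), platform screen door presence (스크린도어 유무), and safety footboard presence (안전발판 유무).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041197/fileData.do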

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
스크린도어 유무 has constant value "True" (Constant)
상하행 is highly overall correlated with 승강장번호 (High correlation)
승강장번호 is highly overall correlated with 상하행 (High correlation)
역층 is highly overall correlated with 지상구분 (High correlation)
지상구분 is highly overall correlated with 역층 (High correlation)
지상구분 is highly imbalanced (85.9%) (Imbalance)
승강장연결 여부 is highly imbalanced (51.2%) (Imbalance)
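
The imbalance percentages in the alerts above are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how such a figure can be derived, not a quotation of the profiler's source. The helper name `imbalance_score` is hypothetical, and the counts are copied from the frequency tables in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of
    the class distribution and k is the number of classes.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single-class column is reported as not imbalanced.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts copied from this report's frequency tables.
ground_split = [296, 6]        # 지상구분: 지하 vs 지상
connection_split = [270, 32]   # 승강장연결 여부: True vs False

print(f"{imbalance_score(ground_split):.1%}")
print(f"{imbalance_score(connection_split):.1%}")
```

Under this definition a 50/50 column such as 상하행 scores 0, which would explain why it raises no imbalance alert.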

Reproduction

Analysis started: 2023-12-12 13:48:16.238992
Analysis finished: 2023-12-12 13:48:17.152952
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value  Count  Frequency (%)
서울교통공사  302  100.0%

선명
Categorical

Distinct: 4
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 5호선
3rd row: 5호선
4th row: 5호선
5th row: 5호선

Common Values

Value  Count  Frequency (%)
5호선  106  35.1%
7호선  84  27.8%
6호선  78  25.8%
8호선  34  11.3%

역명
Text

Distinct: 146
Distinct (%): 48.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.3708609
Min length: 2

Characters and Unicode

Total characters: 1320
Distinct characters: 198
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
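
As a rough illustration of the character properties mentioned above, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the three station names are copied from the sample rows of this report; it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Station names copied from the sample rows of this report.
names = ["강동", "천호(풍납토성)", "잠실(송파구청)"]
counts = category_counts(names)

# 'Lo' = Other Letter, 'Ps' = Open Punctuation, 'Pe' = Close Punctuation
for category, count in counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter, while the parentheses around the secondary station name account for the Open and Close Punctuation categories.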

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강동
2nd row: 강동
3rd row: 개롱
4th row: 개롱
5th row: 개화산

Value  Count  Frequency (%)
천호(풍납토성  4  1.3%
청구  4  1.3%
태릉입구  4  1.3%
공덕  4  1.3%
군자(능동  4  1.3%
논현  2  0.7%
대림(구로구청  2  0.7%
강동  2  0.7%
남구로  2  0.7%
남성  2  0.7%
Other values (136)  272  90.1%

Most occurring characters

Value  Count  Frequency (%)
(  70  5.3%
)  70  5.3%
(unreadable)  48  3.6%
(unreadable)  46  3.5%
(unreadable)  36  2.7%
(unreadable)  30  2.3%
(unreadable)  28  2.1%
(unreadable)  26  2.0%
(unreadable)  22  1.7%
(unreadable)  22  1.7%
Other values (188)  922  69.8%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1174  88.9%
Open Punctuation  70  5.3%
Close Punctuation  70  5.3%
Decimal Number  4  0.3%
Other Punctuation  2  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(unreadable)  48  4.1%
(unreadable)  46  3.9%
(unreadable)  36  3.1%
(unreadable)  30  2.6%
(unreadable)  28  2.4%
(unreadable)  26  2.2%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  20  1.7%
Other values (183)  874  74.4%
Decimal Number
Value  Count  Frequency (%)
4  2  50.0%
3  2  50.0%
Open Punctuation
Value  Count  Frequency (%)
(  70  100.0%
Close Punctuation
Value  Count  Frequency (%)
)  70  100.0%
Other Punctuation
Value  Count  Frequency (%)
·  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1174  88.9%
Common  146  11.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(unreadable)  48  4.1%
(unreadable)  46  3.9%
(unreadable)  36  3.1%
(unreadable)  30  2.6%
(unreadable)  28  2.4%
(unreadable)  26  2.2%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  20  1.7%
Other values (183)  874  74.4%
Common
Value  Count  Frequency (%)
(  70  47.9%
)  70  47.9%
·  2  1.4%
4  2  1.4%
3  2  1.4%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1174  88.9%
ASCII  144  10.9%
None  2  0.2%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(  70  48.6%
)  70  48.6%
4  2  1.4%
3  2  1.4%
Hangul
Value  Count  Frequency (%)
(unreadable)  48  4.1%
(unreadable)  46  3.9%
(unreadable)  36  3.1%
(unreadable)  30  2.6%
(unreadable)  28  2.4%
(unreadable)  26  2.2%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  22  1.9%
(unreadable)  20  1.7%
Other values (183)  874  74.4%
None
Value  Count  Frequency (%)
·  2  100.0%

승강장번호
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  151  50.0%
2  151  50.0%

상하행
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상행
2nd row: 하행
3rd row: 하행
4th row: 상행
5th row: 상행

Common Values

Value  Count  Frequency (%)
상행  151  50.0%
하행  151  50.0%

지상구분
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

Value  Count  Frequency (%)
지하  296  98.0%
지상  6  2.0%

역층
Real number (ℝ)

HIGH CORRELATION

Distinct: 7
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2350993
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2.25
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 8
Range: 7
Interquartile range (IQR): 1.75

Descriptive statistics

Standard deviation: 1.0818772
Coefficient of variation (CV): 0.33441854
Kurtosis: 1.7323186
Mean: 3.2350993
Median Absolute Deviation (MAD): 1
Skewness: 0.99462238
Sum: 977
Variance: 1.1704583
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value  Count  Frequency (%)
3  128  42.4%
2  74  24.5%
4  59  19.5%
5  31  10.3%
6  6  2.0%
8  2  0.7%
1  2  0.7%

Value  Count  Frequency (%)
1  2  0.7%
2  74  24.5%
3  128  42.4%
4  59  19.5%
5  31  10.3%
6  6  2.0%
8  2  0.7%

Value  Count  Frequency (%)
8  2  0.7%
6  6  2.0%
5  31  10.3%
4  59  19.5%
3  128  42.4%
2  74  24.5%
1  2  0.7%
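
The summary statistics reported for 역층 can be re-derived from its frequency table alone. The sketch below uses only Python's standard `statistics` module; the variable names are illustrative, and the choice of the sample (n - 1) variance and the "inclusive" quantile method is an assumption that happens to agree with the figures in this report.

```python
import statistics

# Value -> count, copied from the 역층 frequency table in this report.
floor_counts = {1: 2, 2: 74, 3: 128, 4: 59, 5: 31, 6: 6, 8: 2}

# Expand the table back into a sorted list of individual observations.
floors = sorted(v for v, c in floor_counts.items() for _ in range(c))

mean = statistics.mean(floors)
variance = statistics.variance(floors)  # sample variance (n - 1 denominator)
std = statistics.stdev(floors)
# "inclusive" interpolates linearly between the sorted observations.
q1, median, q3 = statistics.quantiles(floors, n=4, method="inclusive")

print("n =", len(floors), "sum =", sum(floors))
print("mean =", round(mean, 7), "variance =", round(variance, 7), "std =", round(std, 7))
print("Q1 =", q1, "median =", median, "Q3 =", q3, "IQR =", q3 - q1)
```

The fractional Q1 of 2.25 arises because the 25% position falls between the last observation with value 2 and the first with value 3.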

승강장연결 여부
Boolean

IMBALANCE

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 434.0 B

Value  Count  Frequency (%)
True  270  89.4%
False  32  10.6%

스크린도어 유무
Boolean

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 434.0 B

Value  Count  Frequency (%)
True  302  100.0%

안전발판 유무
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 434.0 B

Value  Count  Frequency (%)
False  154  51.0%
True  148  49.0%

Interactions


Correlations

Variable  선명  승강장번호  상하행  지상구분  역층  승강장연결 여부  안전발판 유무
선명  1.000  0.000  0.000  0.311  0.343  0.121  0.462
승강장번호  0.000  1.000  1.000  0.000  0.000  0.000  0.000
상하행  0.000  1.000  1.000  0.000  0.000  0.000  0.000
지상구분  0.311  0.000  0.000  1.000  0.527  0.000  0.000
역층  0.343  0.000  0.000  0.527  1.000  0.125  0.319
승강장연결 여부  0.121  0.000  0.000  0.000  0.125  1.000  0.246
안전발판 유무  0.462  0.000  0.000  0.000  0.319  0.246  1.000

Variable  승강장연결 여부  선명  상하행  지상구분  안전발판 유무  승강장번호
승강장연결 여부  1.000  0.079  0.000  0.000  0.158  0.000
선명  0.079  1.000  0.000  0.207  0.311  0.000
상하행  0.000  0.000  1.000  0.000  0.000  0.993
지상구분  0.000  0.207  0.000  1.000  0.000  0.000
안전발판 유무  0.158  0.311  0.000  0.000  1.000  0.000
승강장번호  0.000  0.000  0.993  0.000  0.000  1.000

Variable  역층  선명  승강장번호  상하행  지상구분  승강장연결 여부  안전발판 유무
역층  1.000  0.240  0.000  0.000  0.562  0.133  0.338
선명  0.240  1.000  0.000  0.000  0.207  0.079  0.311
승강장번호  0.000  0.000  1.000  0.993  0.000  0.000  0.000
상하행  0.000  0.000  0.993  1.000  0.000  0.000  0.000
지상구분  0.562  0.207  0.000  0.000  1.000  0.000  0.000
승강장연결 여부  0.133  0.079  0.000  0.000  0.000  1.000  0.158
안전발판 유무  0.338  0.311  0.000  0.000  0.000  0.158  1.000
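
The 0.993 between 승강장번호 and 상하행 looks odd next to the 1.000 in the first matrix, given that every displayed sample row pairs platform 1 with 상행 and platform 2 with 하행. One possible reading, offered as an assumption rather than a statement about the profiler's internals, is a bias-corrected Cramér's V computed with Yates' continuity correction. The sketch below works through that arithmetic for a 2x2 table; the function name `cramers_v_2x2` is hypothetical, and the 151/151 perfectly paired table is inferred from the frequency counts and sample rows, not read from the raw data.

```python
import math


def cramers_v_2x2(a, b, c, d, yates=True):
    """Bias-corrected Cramér's V for the 2x2 contingency table [[a, b], [c, d]].

    With yates=True the chi-square statistic uses Yates' continuity correction.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi2 = chi2 / n
    r = k = 2  # rows and columns of the table
    # Bias correction for finite samples.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))


# Assumed table: 151 rows (platform 1, 상행) and 151 rows (platform 2, 하행).
print(round(cramers_v_2x2(151, 0, 0, 151, yates=True), 3))
print(round(cramers_v_2x2(151, 0, 0, 151, yates=False), 3))
```

Under these assumptions the continuity correction alone accounts for the gap between the two reported values, so the columns would still be fully redundant.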

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row  철도운영기관명  선명  역명  승강장번호  상하행  지상구분  역층  승강장연결 여부  스크린도어 유무  안전발판 유무
0  서울교통공사  5호선  강동  1  상행  지하  4  Y  Y  N
1  서울교통공사  5호선  강동  2  하행  지하  4  Y  Y  N
2  서울교통공사  5호선  개롱  2  하행  지하  2  Y  Y  N
3  서울교통공사  5호선  개롱  1  상행  지하  2  Y  Y  N
4  서울교통공사  5호선  개화산  1  상행  지하  4  Y  Y  Y
5  서울교통공사  5호선  개화산  2  하행  지하  4  Y  Y  Y
6  서울교통공사  5호선  거여  1  상행  지하  3  Y  Y  Y
7  서울교통공사  5호선  거여  2  하행  지하  3  Y  Y  Y
8  서울교통공사  5호선  고덕  1  상행  지하  2  Y  Y  N
9  서울교통공사  5호선  고덕  2  하행  지하  2  Y  Y  N

Row  철도운영기관명  선명  역명  승강장번호  상하행  지상구분  역층  승강장연결 여부  스크린도어 유무  안전발판 유무
292  서울교통공사  8호선  신흥  1  상행  지하  2  Y  Y  N
293  서울교통공사  8호선  신흥  2  하행  지하  2  Y  Y  N
294  서울교통공사  8호선  암사  1  상행  지하  2  Y  Y  N
295  서울교통공사  8호선  암사  2  하행  지하  2  Y  Y  N
296  서울교통공사  8호선  잠실(송파구청)  2  하행  지하  3  N  Y  Y
297  서울교통공사  8호선  잠실(송파구청)  1  상행  지하  3  N  Y  Y
298  서울교통공사  8호선  장지  2  하행  지하  2  Y  Y  N
299  서울교통공사  8호선  장지  1  상행  지하  2  Y  Y  N
300  서울교통공사  8호선  천호(풍납토성)  2  하행  지하  2  Y  Y  N
301  서울교통공사  8호선  천호(풍납토성)  1  상행  지하  2  Y  Y  N