Overview

Dataset statistics

Number of variables: 10
Number of observations: 87
Missing cells: 73
Missing cells (%): 8.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.3 KiB
Average record size in memory: 85.5 B
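
The percentage and size figures above follow from simple arithmetic on the reported counts. The sketch below rechecks them using only numbers listed in this block; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 10
n_observations = 87
missing_cells = 73
avg_record_bytes = 85.5

# Every observation holds one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size implied by the average record size, in KiB (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_kib:.1f} KiB")
```

Both derived values round to the figures the report prints, so the block is internally consistent.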

Variable types

Categorical: 7
Numeric: 2
Text: 1

Dataset

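Description: Data on urban and metropolitan railway stations managed by Seoul Metropolitan Rapid Transit Corporation (서울도시철도공사). The columns are the railway operator name (철도운영기관명), line name (선명), station name (역명), wheelchair lift management number (휠체어리프트의 관리번호), entrance number (출입구번호), detailed location (상세위치), length (길이), width (폭), start floor (시작층), and end floor (종료층).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041427/fileData.do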

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
길이 has constant value "125" (Constant)
출입구번호 is highly overall correlated with 폭 and 1 other field (High correlation)
선명 is highly overall correlated with 역명 (High correlation)
역명 is highly overall correlated with 선명 and 2 other fields (High correlation)
폭 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
시작층 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
종료층 is highly overall correlated with 역명 and 1 other field (High correlation)
폭 is highly imbalanced (90.9%) (Imbalance)
출입구번호 has 73 (83.9%) missing values (Missing)
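
The 90.9% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. It applies to the two-valued width column (폭 in the dataset description: 86 rows with 80 and 1 row with 90). Whether ydata-profiling computes the score exactly this way is an assumption here; the helper `imbalance_score` is hypothetical and only illustrates the arithmetic.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the base-2 Shannon entropy of the class
    frequencies and k is the number of classes (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch only
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts of the width column as reported: 80 -> 86 rows, 90 -> 1 row.
score = imbalance_score([86, 1])
print(f"{100 * score:.1f}%")
```

A perfectly balanced column would score 0 under this definition, and a column dominated by one value approaches 1.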

Reproduction

Analysis started: 2023-12-12 19:03:29.246238
Analysis finished: 2023-12-12 19:03:30.425095
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
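
As a rough illustration of how a report like this one could be regenerated, the sketch below first rechecks the reported duration from the two timestamps, then defines a hypothetical helper `build_report`. The CSV path, the `cp949` encoding, and the report title are assumptions and are not stated anywhere in this report.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 19:03:29.246238")
finished = datetime.fromisoformat("2023-12-12 19:03:30.425095")
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")


def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is a guess; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Wheelchair lift dataset profile")
    report.to_file(output_path)
    return report
```

Calling `build_report("lifts.csv")` with a locally downloaded copy of the file (the file name is a placeholder) would write `report.html` next to the script.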

Variables

철도운영기관명 (railway operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 87 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

선명 (line name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 5호선
3rd row: 5호선
4th row: 5호선
5th row: 5호선

Common Values

Value | Count | Frequency (%)
7호선 | 40 | 46.0%
6호선 | 26 | 29.9%
5호선 | 15 | 17.2%
8호선 | 6 | 6.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

역명 (station name)
Categorical

HIGH CORRELATION

Distinct: 42
Distinct (%): 48.3%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 14
Median length: 11
Mean length: 5.1034483
Min length: 2

Unique

Unique: 21
Unique (%): 24.1%

Sample

1st row: 강동
2nd row: 광나루(장신대)
3rd row: 까치산
4th row: 까치산
5th row: 까치산

Common Values

Value | Count | Frequency (%)
남구로 | 11 | 12.6%
신당 | 5 | 5.7%
온수(성공회대입구) | 5 | 5.7%
고속터미널 | 4 | 4.6%
가산디지털단지 | 4 | 4.6%
잠실(송파구청) | 3 | 3.4%
까치산 | 3 | 3.4%
총신대입구(이수) | 3 | 3.4%
건대입구 | 3 | 3.4%
디지털미디어시티 | 3 | 3.4%
Other values (32) | 43 | 49.4%

Length

[Figure: Histogram of lengths of the category]

휠체어리프트의 관리번호 (wheelchair lift management number)
Real number (ℝ)

Distinct: 11
Distinct (%): 12.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2988506
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 6.7
Maximum: 11
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.0523098
Coefficient of variation (CV): 0.89275475
Kurtosis: 6.2552798
Mean: 2.2988506
Median Absolute Deviation (MAD): 1
Skewness: 2.4106794
Sum: 200
Variance: 4.2119754
Monotonicity: Not monotonic
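
The descriptive statistics above can be rechecked from the value counts the report lists for this variable. The sketch below expands those counts into the 87 observations and recomputes the mean, variance, standard deviation, and coefficient of variation. It assumes the sample (n - 1) estimators, which is what reproduces the reported figures.

```python
import statistics

# Value -> count, taken from the frequency tables reported for this variable.
value_counts = {1: 42, 2: 21, 3: 10, 4: 5, 5: 3,
                6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1}

# Expand the counts back into one entry per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

print(len(values), sum(values))
print(mean, variance, std, cv)
```

The coefficient of variation is simply the standard deviation divided by the mean, which is why it is unitless.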
[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value | Count | Frequency (%)
1 | 42 | 48.3%
2 | 21 | 24.1%
3 | 10 | 11.5%
4 | 5 | 5.7%
5 | 3 | 3.4%
6 | 1 | 1.1%
8 | 1 | 1.1%
9 | 1 | 1.1%
10 | 1 | 1.1%
11 | 1 | 1.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 42 | 48.3%
2 | 21 | 24.1%
3 | 10 | 11.5%
4 | 5 | 5.7%
5 | 3 | 3.4%
6 | 1 | 1.1%
7 | 1 | 1.1%
8 | 1 | 1.1%
9 | 1 | 1.1%
10 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
11 | 1 | 1.1%
10 | 1 | 1.1%
9 | 1 | 1.1%
8 | 1 | 1.1%
7 | 1 | 1.1%
6 | 1 | 1.1%
5 | 3 | 3.4%
4 | 5 | 5.7%
3 | 10 | 11.5%
2 | 21 | 24.1%

출입구번호 (entrance number)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 8
Distinct (%): 57.1%
Missing: 73
Missing (%): 83.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9285714
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3.5
Q3: 5.75
95-th percentile: 7.7
Maximum: 9
Range: 8
Interquartile range (IQR): 3.75

Descriptive statistics

Standard deviation: 2.4007783
Coefficient of variation (CV): 0.61110719
Kurtosis: -0.23550638
Mean: 3.9285714
Median Absolute Deviation (MAD): 1.5
Skewness: 0.68138057
Sum: 55
Variance: 5.7637363
Monotonicity: Not monotonic
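
The two percentages for this variable use different denominators: the missing share is taken over all 87 observations, while the distinct share is consistent with being taken over the 14 non-missing values only. The sketch below rechecks both, along with the mean, from the value counts the report lists for this variable; the variable names are illustrative.

```python
n_observations = 87

# Value -> count for the non-missing entries, from the frequency table
# reported for this variable.
value_counts = {1: 2, 2: 3, 3: 2, 4: 2, 5: 1, 6: 2, 7: 1, 9: 1}

n_present = sum(value_counts.values())
n_missing = n_observations - n_present

# Missing share is relative to all rows.
missing_pct = 100 * n_missing / n_observations
# Distinct share is relative to the non-missing rows only.
distinct_pct = 100 * len(value_counts) / n_present

mean = sum(value * count for value, count in value_counts.items()) / n_present

print(n_present, n_missing)
print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct, mean {mean:.7f}")
```

With only 14 populated rows, the quantiles and correlations reported for this column rest on a small sample.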
[Figure: Histogram with fixed size bins (bins=8)]

Common values

Value | Count | Frequency (%)
2 | 3 | 3.4%
6 | 2 | 2.3%
1 | 2 | 2.3%
3 | 2 | 2.3%
4 | 2 | 2.3%
5 | 1 | 1.1%
7 | 1 | 1.1%
9 | 1 | 1.1%
(Missing) | 73 | 83.9%

Minimum values

Value | Count | Frequency (%)
1 | 2 | 2.3%
2 | 3 | 3.4%
3 | 2 | 2.3%
4 | 2 | 2.3%
5 | 1 | 1.1%
6 | 2 | 2.3%
7 | 1 | 1.1%
9 | 1 | 1.1%

Maximum values

Value | Count | Frequency (%)
9 | 1 | 1.1%
7 | 1 | 1.1%
6 | 2 | 2.3%
5 | 1 | 1.1%
4 | 2 | 2.3%
3 | 2 | 2.3%
2 | 3 | 3.4%
1 | 2 | 2.3%

상세위치 (detailed location)
Text

Distinct: 81
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 30
Median length: 24
Mean length: 19.908046
Min length: 8

Characters and Unicode

Total characters: 1732
Distinct characters: 100
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 89.7%

Sample

1st row: (B3)상승시-대합실(B4-B3)
2nd row: (B1)B2대합실-B1화장실층
3rd row: (F1)#2출입구(B1대-F1)
4th row: (B4)신정방향 시/승-E/V(B5-B4)
5th row: (B4)화곡방향 시/승-E/V(B5-B4)

Words

Value | Count | Frequency (%)
환승 | 6 | 4.2%
b1)대-대(b2-b1 | 4 | 2.8%
시/승-대(b3-b2 | 4 | 2.8%
b5)대림방향 | 3 | 2.1%
b1)b2대합실-b1화장실층 | 3 | 2.1%
상선 | 3 | 2.1%
하선 | 3 | 2.1%
시/승 | 2 | 1.4%
b1)국철환승 | 2 | 1.4%
승/종-대(b4-b3 | 2 | 1.4%
Other values (97) | 111 | 77.6%

Most occurring characters

Value | Count | Frequency (%)
B | 212 | 12.2%
( | 186 | 10.7%
) | 185 | 10.7%
- | 138 | 8.0%
1 | 129 | 7.4%
2 | 84 | 4.8%
[Hangul character lost] | 73 | 4.2%
[Hangul character lost] | 67 | 3.9%
(space) | 57 | 3.3%
/ | 42 | 2.4%
Other values (90) | 559 | 32.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 543 | 31.4%
Decimal Number | 299 | 17.3%
Uppercase Letter | 263 | 15.2%
Open Punctuation | 186 | 10.7%
Close Punctuation | 185 | 10.7%
Dash Punctuation | 138 | 8.0%
Other Punctuation | 61 | 3.5%
Space Separator | 57 | 3.3%
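
The category names in this table are Unicode general categories. As a small illustration, the sketch below classifies the characters of a single 상세위치 value copied from the sample rows of this report (row 80) with Python's standard `unicodedata` module. It is not the report's own implementation, and the two-letter codes (`Lo`, `Nd`, `Lu`, `Ps`, `Pe`, `Pd`) are the standard abbreviations for the categories named above.

```python
import unicodedata
from collections import Counter

# One detailed-location value copied from the sample rows (row 80).
text = "(B1)대-대((B2-B1)"

# Count characters per Unicode general category.
category_counts = Counter(unicodedata.category(ch) for ch in text)
for code, count in category_counts.most_common():
    print(code, count)

# Ps = Open Punctuation, Pe = Close Punctuation.
unbalanced = category_counts["Ps"] - category_counts["Pe"]
print("extra opening brackets:", unbalanced)
```

This value carries one more opening than closing parenthesis, which is consistent with the table reporting 186 Open Punctuation characters against 185 Close Punctuation characters.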

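Most frequent character per category

Other Letter

Value | Count | Frequency (%)
[Hangul character lost] | 73 | 13.4%
[Hangul character lost] | 67 | 12.3%
[Hangul character lost] | 29 | 5.3%
[Hangul character lost] | 25 | 4.6%
[Hangul character lost] | 20 | 3.7%
[Hangul character lost] | 18 | 3.3%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 16 | 2.9%
Other values (70) | 244 | 44.9%

Decimal Number

Value | Count | Frequency (%)
1 | 129 | 43.1%
2 | 84 | 28.1%
3 | 31 | 10.4%
4 | 31 | 10.4%
5 | 17 | 5.7%
6 | 4 | 1.3%
9 | 2 | 0.7%
7 | 1 | 0.3%

Uppercase Letter

Value | Count | Frequency (%)
B | 212 | 80.6%
F | 40 | 15.2%
E | 4 | 1.5%
M | 3 | 1.1%
V | 2 | 0.8%
S | 2 | 0.8%

Other Punctuation

Value | Count | Frequency (%)
/ | 42 | 68.9%
# | 19 | 31.1%

Open Punctuation

Value | Count | Frequency (%)
( | 186 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 185 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 138 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 57 | 100.0%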

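Most occurring scripts

Value | Count | Frequency (%)
Common | 926 | 53.5%
Hangul | 543 | 31.4%
Latin | 263 | 15.2%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
[Hangul character lost] | 73 | 13.4%
[Hangul character lost] | 67 | 12.3%
[Hangul character lost] | 29 | 5.3%
[Hangul character lost] | 25 | 4.6%
[Hangul character lost] | 20 | 3.7%
[Hangul character lost] | 18 | 3.3%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 16 | 2.9%
Other values (70) | 244 | 44.9%

Common

Value | Count | Frequency (%)
( | 186 | 20.1%
) | 185 | 20.0%
- | 138 | 14.9%
1 | 129 | 13.9%
2 | 84 | 9.1%
(space) | 57 | 6.2%
/ | 42 | 4.5%
3 | 31 | 3.3%
4 | 31 | 3.3%
# | 19 | 2.1%
Other values (4) | 24 | 2.6%

Latin

Value | Count | Frequency (%)
B | 212 | 80.6%
F | 40 | 15.2%
E | 4 | 1.5%
M | 3 | 1.1%
V | 2 | 0.8%
S | 2 | 0.8%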

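Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1189 | 68.6%
Hangul | 543 | 31.4%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
B | 212 | 17.8%
( | 186 | 15.6%
) | 185 | 15.6%
- | 138 | 11.6%
1 | 129 | 10.8%
2 | 84 | 7.1%
(space) | 57 | 4.8%
/ | 42 | 3.5%
F | 40 | 3.4%
3 | 31 | 2.6%
Other values (10) | 85 | 7.1%

Hangul

Value | Count | Frequency (%)
[Hangul character lost] | 73 | 13.4%
[Hangul character lost] | 67 | 12.3%
[Hangul character lost] | 29 | 5.3%
[Hangul character lost] | 25 | 4.6%
[Hangul character lost] | 20 | 3.7%
[Hangul character lost] | 18 | 3.3%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 17 | 3.1%
[Hangul character lost] | 16 | 2.9%
Other values (70) | 244 | 44.9%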

길이 (length)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 125
2nd row: 125
3rd row: 125
4th row: 125
5th row: 125

Common Values

Value | Count | Frequency (%)
125 | 87 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]


폭 (width)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 80
2nd row: 80
3rd row: 80
4th row: 80
5th row: 80

Common Values

Value | Count | Frequency (%)
80 | 86 | 98.9%
90 | 1 | 1.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시작층 (start floor)
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하3층
2nd row: 지하1층
3rd row: 지상1층
4th row: 지하4층
5th row: 지하4층

Common Values

Value | Count | Frequency (%)
지하1층 | 32 | 36.8%
지하2층 | 19 | 21.8%
지상1층 | 18 | 20.7%
지하4층 | 6 | 6.9%
지하3층 | 4 | 4.6%
지상2층 | 4 | 4.6%
지하5층 | 4 | 4.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

종료층 (end floor)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 지하4층
2nd row: 지하2층
3rd row: 지하1층
4th row: 지하5층
5th row: 지하5층

Common Values

Value | Count | Frequency (%)
지하2층 | 27 | 31.0%
지하1층 | 26 | 29.9%
지하3층 | 20 | 23.0%
지하4층 | 9 | 10.3%
지하5층 | 4 | 4.6%
지상2층 | 1 | 1.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

(variable) | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 폭 | 시작층 | 종료층
선명 | 1.000 | 1.000 | 0.000 | 0.000 | 0.975 | 0.000 | 0.333 | 0.445
역명 | 1.000 | 1.000 | 0.000 | 0.901 | 0.000 | 1.000 | 0.793 | 0.932
휠체어리프트의 관리번호 | 0.000 | 0.000 | 1.000 | 0.592 | 0.993 | 0.000 | 0.218 | 0.000
출입구번호 | 0.000 | 0.901 | 0.592 | 1.000 | 1.000 | NaN | NaN | 0.000
상세위치 | 0.975 | 0.000 | 0.993 | 1.000 | 1.000 | 1.000 | 1.000 | 0.994
폭 | 0.000 | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 0.000 | 0.000
시작층 | 0.333 | 0.793 | 0.218 | NaN | 1.000 | 0.000 | 1.000 | 0.845
종료층 | 0.445 | 0.932 | 0.000 | 0.000 | 0.994 | 0.000 | 0.845 | 1.000

[Figure: correlation heatmap]

(variable) | 역명 | 종료층 | 선명 | 폭 | 시작층
역명 | 1.000 | 0.507 | 0.736 | 0.728 | 0.321
종료층 | 0.507 | 1.000 | 0.297 | 0.000 | 0.697
선명 | 0.736 | 0.297 | 1.000 | 0.000 | 0.228
폭 | 0.728 | 0.000 | 0.000 | 1.000 | 0.000
시작층 | 0.321 | 0.697 | 0.228 | 0.000 | 1.000

[Figure: correlation heatmap]

(variable) | 휠체어리프트의 관리번호 | 출입구번호 | 선명 | 역명 | 폭 | 시작층 | 종료층
휠체어리프트의 관리번호 | 1.000 | 0.105 | 0.000 | 0.000 | 0.000 | 0.077 | 0.000
출입구번호 | 0.105 | 1.000 | 0.000 | 0.204 | 1.000 | 1.000 | 0.000
선명 | 0.000 | 0.000 | 1.000 | 0.736 | 0.000 | 0.228 | 0.297
역명 | 0.000 | 0.204 | 0.736 | 1.000 | 0.728 | 0.321 | 0.507
폭 | 0.000 | 1.000 | 0.000 | 0.728 | 1.000 | 0.000 | 0.000
시작층 | 0.077 | 1.000 | 0.228 | 0.321 | 0.000 | 1.000 | 0.697
종료층 | 0.000 | 0.000 | 0.297 | 0.507 | 0.000 | 0.697 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

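First rows

(index) | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층
0 | 서울교통공사 | 5호선 | 강동 | 1 | <NA> | (B3)상승시-대합실(B4-B3) | 125 | 80 | 지하3층 | 지하4층
1 | 서울교통공사 | 5호선 | 광나루(장신대) | 1 | <NA> | (B1)B2대합실-B1화장실층 | 125 | 80 | 지하1층 | 지하2층
2 | 서울교통공사 | 5호선 | 까치산 | 3 | 2 | (F1)#2출입구(B1대-F1) | 125 | 80 | 지상1층 | 지하1층
3 | 서울교통공사 | 5호선 | 까치산 | 2 | <NA> | (B4)신정방향 시/승-E/V(B5-B4) | 125 | 80 | 지하4층 | 지하5층
4 | 서울교통공사 | 5호선 | 까치산 | 1 | <NA> | (B4)화곡방향 시/승-E/V(B5-B4) | 125 | 80 | 지하4층 | 지하5층
5 | 서울교통공사 | 5호선 | 답십리 | 1 | <NA> | (B1)B2대합실-B1화장실층 | 125 | 80 | 지하1층 | 지하1층
6 | 서울교통공사 | 5호선 | 동대문역사문화공원 | 1 | <NA> | (B4)승강장 시점B5-대B4 | 125 | 80 | 지하4층 | 지하5층
7 | 서울교통공사 | 5호선 | 동대문역사문화공원 | 2 | 6 | (F1)#7출입구(B1대-F1) | 125 | 80 | 지상1층 | 지하1층
8 | 서울교통공사 | 5호선 | 마천 | 1 | 1 | (F1)#1출입구(B1대-F1) | 125 | 80 | 지상1층 | 지하1층
9 | 서울교통공사 | 5호선 | 상일동 | 1 | <NA> | (B1)고덕방향 시/승-대합실(B2-B1) | 125 | 80 | 지하1층 | 지하2층

Last rows

(index) | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층
77 | 서울교통공사 | 7호선 | 청담 | 2 | <NA> | (B3)강남구청방향 승/종-대(B4-B3) | 125 | 80 | 지하3층 | 지하4층
78 | 서울교통공사 | 7호선 | 총신대입구(이수) | 1 | <NA> | (B2)4환승통로상선(승-승)(B4-B2) | 125 | 80 | 지하2층 | 지하3층
79 | 서울교통공사 | 7호선 | 총신대입구(이수) | 2 | <NA> | (B2)4환승통로하선(승-승)(B4-B2) | 125 | 80 | 지하2층 | 지하3층
80 | 서울교통공사 | 7호선 | 총신대입구(이수) | 3 | <NA> | (B1)대-대((B2-B1) | 125 | 80 | 지하1층 | 지하2층
81 | 서울교통공사 | 8호선 | 모란 | 1 | <NA> | (B2)승상종점-분당선환승(B3-B2) | 125 | 80 | 지하2층 | 지하3층
82 | 서울교통공사 | 8호선 | 모란 | 2 | <NA> | (B2)승하종점-분당선환승(B3-B2) | 125 | 80 | 지하2층 | 지하3층
83 | 서울교통공사 | 8호선 | 복정 | 1 | 4 | (F1)#4출구(B1-F1) | 125 | 80 | 지상1층 | 지하1층
84 | 서울교통공사 | 8호선 | 잠실(송파구청) | 2 | <NA> | (B2)하승-환승계단(B3-B2) | 125 | 80 | 지하2층 | 지하3층
85 | 서울교통공사 | 8호선 | 잠실(송파구청) | 3 | <NA> | (B1)환승계단끝-2승(사당)(통로-B2) | 125 | 80 | 지하1층 | 지하2층
86 | 서울교통공사 | 8호선 | 잠실(송파구청) | 1 | <NA> | (B2)상승-환승계단(B3-B2) | 125 | 80 | 지하2층 | 지하3층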