Overview

Dataset statistics

Number of variables: 9
Number of observations: 120
Missing cells: 233
Missing cells (%): 21.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.3 KiB
Average record size in memory: 79.1 B
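
The missing-cell figures can be cross-checked by hand. The sketch below assumes that only the three columns flagged as Missing in the Alerts section contribute missing cells; the variable names are illustrative and not part of the report.

```python
# Per-column missing counts, copied from the Alerts section of this report.
missing_by_column = {
    "외부 에스컬레이터(E/S)": 52,
    "내부 에스컬레이터(E/S)": 82,
    "비고": 99,
}

n_rows, n_cols = 120, 9  # observations and variables from the overview

missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(f"{missing_cells} missing cells ({missing_pct:.1f}%)")
# prints "233 missing cells (21.6%)"
```

The categorical columns report 0 missing because their `<NA>` entries are counted as an ordinary category, so they do not enter this total.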

Variable types

Categorical: 5
Text: 2
Numeric: 2

Dataset

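Description: 호선 (line), 역명 (station name), 외부 엘리베이터(E/V) (external elevators), 내부 엘리베이터(E/V) (internal elevators), 외부 에스컬레이터(E/S) (external escalators), 내부 에스컬레이터(E/S) (internal escalators), 휠체어리프트(W/L) (wheelchair lifts), 수평자동보도(M/W) (moving walkways), 비고 (remarks)
Author: 서울교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-11573/S/1/datasetView.do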

Alerts

[Imbalance] 휠체어리프트(W/L) is highly imbalanced (55.2%)
[Imbalance] 수평자동보도(M/W) is highly imbalanced (93.0%)
[Missing] 외부 에스컬레이터(E/S) has 52 (43.3%) missing values
[Missing] 내부 에스컬레이터(E/S) has 82 (68.3%) missing values
[Missing] 비고 has 99 (82.5%) missing values
[Unique] 역명 has unique values
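
The report does not say how the imbalance percentages are computed. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. Under that assumption it reproduces both reported figures from the value counts listed later in this report. The function name is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (assumed definition)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as 0
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


# Value counts of 휠체어리프트(W/L): <NA>, 1, 2, 6, 5, 3
wheelchair_lift = imbalance_score([92, 17, 6, 3, 1, 1])
# Value counts of 수평자동보도(M/W): <NA>, 2
moving_walkway = imbalance_score([119, 1])

print(f"{wheelchair_lift:.1%}, {moving_walkway:.1%}")
```

A score of 0 corresponds to perfectly balanced classes and a score near 1 to a column dominated by a single value, which is why 수평자동보도(M/W), with 119 of 120 rows equal to `<NA>`, scores so high.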

Reproduction

Analysis started: 2023-12-11 05:46:54.146351
Analysis finished: 2023-12-11 05:46:55.557513
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선
Categorical

Distinct: 4
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2호선 | 50 | 41.7% |
| 3호선 | 34 | 28.3% |
| 4호선 | 26 | 21.7% |
| 1호선 | 10 | 8.3% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

역명
Text

UNIQUE 

Distinct: 120
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 10
Median length: 3
Mean length: 3.525
Min length: 2

Characters and Unicode

Total characters: 423
Distinct characters: 145
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
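
As a rough illustration of how such a per-category breakdown can be obtained, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample station names shown for this variable, so its counts are far smaller than the report's totals. The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample values of 역명 shown in this report.
sample_names = ["서울①", "시청①", "종 각", "종로3가①", "종로5가"]


def category_counts(strings):
    """Count Unicode general categories over all characters."""
    return Counter(
        unicodedata.category(ch) for s in strings for ch in s
    )


counts = category_counts(sample_names)
# Lo = Other Letter (Hangul syllables), No = Other Number (circled digits),
# Nd = Decimal Number, Zs = Space Separator
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the tables that follow: the circled digits such as ① fall under Other Number, while the padding spaces inside names such as "종 각" fall under Space Separator.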

Unique

Unique: 120
Unique (%): 100.0%

Sample

1st row: 서울①
2nd row: 시청①
3rd row: 종 각
4th row: 종로3가①
5th row: 종로5가
ValueCountFrequency (%)
6
 
3.4%
5
 
2.9%
4
 
2.3%
3
 
1.7%
3
 
1.7%
2
 
1.1%
2
 
1.1%
2
 
1.1%
2
 
1.1%
2
 
1.1%
Other values (133) 144
82.3%

Most occurring characters

ValueCountFrequency (%)
55
 
13.0%
21
 
5.0%
15
 
3.5%
15
 
3.5%
13
 
3.1%
9
 
2.1%
9
 
2.1%
7
 
1.7%
7
 
1.7%
7
 
1.7%
Other values (135) 265
62.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 341
80.6%
Space Separator 55
 
13.0%
Other Number 21
 
5.0%
Decimal Number 6
 
1.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
21
 
6.2%
15
 
4.4%
15
 
4.4%
13
 
3.8%
9
 
2.6%
9
 
2.6%
7
 
2.1%
7
 
2.1%
7
 
2.1%
6
 
1.8%
Other values (127) 232
68.0%
Other Number
ValueCountFrequency (%)
7
33.3%
5
23.8%
5
23.8%
4
19.0%
Decimal Number
ValueCountFrequency (%)
3 4
66.7%
4 1
 
16.7%
5 1
 
16.7%
Space Separator
ValueCountFrequency (%)
55
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 341
80.6%
Common 82
 
19.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
21
 
6.2%
15
 
4.4%
15
 
4.4%
13
 
3.8%
9
 
2.6%
9
 
2.6%
7
 
2.1%
7
 
2.1%
7
 
2.1%
6
 
1.8%
Other values (127) 232
68.0%
Common
ValueCountFrequency (%)
55
67.1%
7
 
8.5%
5
 
6.1%
5
 
6.1%
3 4
 
4.9%
4
 
4.9%
4 1
 
1.2%
5 1
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 341
80.6%
ASCII 61
 
14.4%
Enclosed Alphanum 21
 
5.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
55
90.2%
3 4
 
6.6%
4 1
 
1.6%
5 1
 
1.6%
Hangul
ValueCountFrequency (%)
21
 
6.2%
15
 
4.4%
15
 
4.4%
13
 
3.8%
9
 
2.6%
9
 
2.6%
7
 
2.1%
7
 
2.1%
7
 
2.1%
6
 
1.8%
Other values (127) 232
68.0%
Enclosed Alphanum
ValueCountFrequency (%)
7
33.3%
5
23.8%
5
23.8%
4
19.0%

외부 엘리베이터(E/V)
Categorical

Distinct: 4
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.5
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: <NA>
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 75 | 62.5% |
| 2 | 23 | 19.2% |
| <NA> | 20 | 16.7% |
| 3 | 2 | 1.7% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

내부 엘리베이터(E/V)
Categorical

Distinct: 4
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.225
Min length: 1

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 75 | 62.5% |
| 1 | 35 | 29.2% |
| <NA> | 9 | 7.5% |
| 5 | 1 | 0.8% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

외부 에스컬레이터(E/S)
Real number (ℝ)

MISSING 

Distinct: 8
Distinct (%): 11.8%
Missing: 52
Missing (%): 43.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2058824
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 2
Q3: 4
95-th percentile: 7.3
Maximum: 11
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.0190054
Coefficient of variation (CV): 0.6297815
Kurtosis: 4.1069923
Mean: 3.2058824
Median Absolute Deviation (MAD): 0
Skewness: 1.9744045
Sum: 218
Variance: 4.0763828
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]
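
Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 38 | 31.7% |
| 4 | 13 | 10.8% |
| 6 | 6 | 5.0% |
| 3 | 5 | 4.2% |
| 1 | 2 | 1.7% |
| 8 | 2 | 1.7% |
| 11 | 1 | 0.8% |
| 10 | 1 | 0.8% |
| (Missing) | 52 | 43.3% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2 | 1.7% |
| 2 | 38 | 31.7% |
| 3 | 5 | 4.2% |
| 4 | 13 | 10.8% |
| 6 | 6 | 5.0% |
| 8 | 2 | 1.7% |
| 10 | 1 | 0.8% |
| 11 | 1 | 0.8% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 11 | 1 | 0.8% |
| 10 | 1 | 0.8% |
| 8 | 2 | 1.7% |
| 6 | 6 | 5.0% |
| 4 | 13 | 10.8% |
| 3 | 5 | 4.2% |
| 2 | 38 | 31.7% |
| 1 | 2 | 1.7% |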
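
The descriptive statistics reported for 외부 에스컬레이터(E/S) can be cross-checked from its value counts alone. The sketch below rebuilds the 68 non-missing observations from the frequency table and recomputes the mean, sample variance, standard deviation, coefficient of variation, and median absolute deviation with Python's standard `statistics` module. Variable names are illustrative.

```python
import statistics

# Value -> count for the 68 non-missing observations, copied from the
# frequency table of 외부 에스컬레이터(E/S).
value_counts = {1: 2, 2: 38, 3: 5, 4: 13, 6: 6, 8: 2, 10: 1, 11: 1}

# Expand the table back into a flat list of observations.
values = [v for v, n in value_counts.items() for _ in range(n)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(len(values), sum(values))
print(round(mean, 7), round(variance, 7), round(std, 7), round(cv, 7))
print(median, mad)
```

The MAD of 0 follows directly from the table: 38 of the 68 observations equal the median of 2, so more than half of the absolute deviations are zero.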

내부 에스컬레이터(E/S)
Real number (ℝ)

MISSING 

Distinct: 11
Distinct (%): 28.9%
Missing: 82
Missing (%): 68.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9736842
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.85
Q1: 2
Median: 4
Q3: 8
95-th percentile: 18.3
Maximum: 24
Range: 23
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.3599458
Coefficient of variation (CV): 0.89725964
Kurtosis: 3.4873982
Mean: 5.9736842
Median Absolute Deviation (MAD): 2
Skewness: 1.8728497
Sum: 227
Variance: 28.729018
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]
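
Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 12 | 10.0% |
| 4 | 7 | 5.8% |
| 6 | 6 | 5.0% |
| 8 | 5 | 4.2% |
| 1 | 2 | 1.7% |
| 9 | 1 | 0.8% |
| 24 | 1 | 0.8% |
| 18 | 1 | 0.8% |
| 14 | 1 | 0.8% |
| 12 | 1 | 0.8% |
| (Missing) | 82 | 68.3% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2 | 1.7% |
| 2 | 12 | 10.0% |
| 4 | 7 | 5.8% |
| 6 | 6 | 5.0% |
| 8 | 5 | 4.2% |
| 9 | 1 | 0.8% |
| 12 | 1 | 0.8% |
| 14 | 1 | 0.8% |
| 18 | 1 | 0.8% |
| 20 | 1 | 0.8% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 24 | 1 | 0.8% |
| 20 | 1 | 0.8% |
| 18 | 1 | 0.8% |
| 14 | 1 | 0.8% |
| 12 | 1 | 0.8% |
| 9 | 1 | 0.8% |
| 8 | 5 | 4.2% |
| 6 | 6 | 5.0% |
| 4 | 7 | 5.8% |
| 2 | 12 | 10.0% |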

휠체어리프트(W/L)
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.3
Min length: 1

Unique

Unique: 2
Unique (%): 1.7%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: 1
5th row: <NA>

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| <NA> | 92 | 76.7% |
| 1 | 17 | 14.2% |
| 2 | 6 | 5.0% |
| 6 | 3 | 2.5% |
| 5 | 1 | 0.8% |
| 3 | 1 | 0.8% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

수평자동보도(M/W)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.975
Min length: 1

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| <NA> | 119 | 99.2% |
| 2 | 1 | 0.8% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

비고
Text

MISSING 

Distinct: 15
Distinct (%): 71.4%
Missing: 99
Missing (%): 82.5%
Memory size: 1.1 KiB

Length

Max length: 19
Median length: 11
Mean length: 10.47619
Min length: 7

Characters and Unicode

Total characters: 220
Distinct characters: 45
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 47.6%

Sample

1st row: 피카디리 연결통로이용
2nd row: 내부(E/V)에 Lift 1대 포함
3rd row: 3호선이용가능
4th row: 외부5호선이용
5th row: 내부(E/V)에 Lift 2대 포함
ValueCountFrequency (%)
내부(e/v)에 5
12.5%
포함 5
12.5%
lift 5
12.5%
1대 3
 
7.5%
이용 3
 
7.5%
2대 2
 
5.0%
외부경사로 2
 
5.0%
외부연결통로이용 2
 
5.0%
내부경사형ev 2
 
5.0%
내부철도공사이용 1
 
2.5%
Other values (10) 10
25.0%

Most occurring characters

ValueCountFrequency (%)
19
 
8.6%
19
 
8.6%
14
 
6.4%
14
 
6.4%
10
 
4.5%
10
 
4.5%
V 7
 
3.2%
E 7
 
3.2%
6
 
2.7%
6
 
2.7%
Other values (35) 108
49.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 142
64.5%
Space Separator 19
 
8.6%
Uppercase Letter 19
 
8.6%
Lowercase Letter 15
 
6.8%
Decimal Number 10
 
4.5%
Close Punctuation 5
 
2.3%
Other Punctuation 5
 
2.3%
Open Punctuation 5
 
2.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19
 
13.4%
14
 
9.9%
14
 
9.9%
10
 
7.0%
10
 
7.0%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (19) 48
33.8%
Decimal Number
ValueCountFrequency (%)
2 3
30.0%
1 3
30.0%
4 1
 
10.0%
6 1
 
10.0%
5 1
 
10.0%
3 1
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
V 7
36.8%
E 7
36.8%
L 5
26.3%
Lowercase Letter
ValueCountFrequency (%)
t 5
33.3%
f 5
33.3%
i 5
33.3%
Space Separator
ValueCountFrequency (%)
19
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 142
64.5%
Common 44
 
20.0%
Latin 34
 
15.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
 
13.4%
14
 
9.9%
14
 
9.9%
10
 
7.0%
10
 
7.0%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (19) 48
33.8%
Common
ValueCountFrequency (%)
19
43.2%
) 5
 
11.4%
/ 5
 
11.4%
( 5
 
11.4%
2 3
 
6.8%
1 3
 
6.8%
4 1
 
2.3%
6 1
 
2.3%
5 1
 
2.3%
3 1
 
2.3%
Latin
ValueCountFrequency (%)
V 7
20.6%
E 7
20.6%
t 5
14.7%
f 5
14.7%
i 5
14.7%
L 5
14.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 142
64.5%
ASCII 78
35.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
19
24.4%
V 7
 
9.0%
E 7
 
9.0%
) 5
 
6.4%
/ 5
 
6.4%
t 5
 
6.4%
f 5
 
6.4%
i 5
 
6.4%
L 5
 
6.4%
( 5
 
6.4%
Other values (6) 10
12.8%
Hangul
ValueCountFrequency (%)
19
 
13.4%
14
 
9.9%
14
 
9.9%
10
 
7.0%
10
 
7.0%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (19) 48
33.8%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

| | 호선 | 외부 엘리베이터(E/V) | 내부 엘리베이터(E/V) | 외부 에스컬레이터(E/S) | 내부 에스컬레이터(E/S) | 휠체어리프트(W/L) | 비고 |
|---|---|---|---|---|---|---|---|
| 호선 | 1.000 | 0.136 | 0.209 | 0.000 | 0.000 | 0.000 | 0.225 |
| 외부 엘리베이터(E/V) | 0.136 | 1.000 | 0.240 | 0.237 | 0.280 | 0.000 | 1.000 |
| 내부 엘리베이터(E/V) | 0.209 | 0.240 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 외부 에스컬레이터(E/S) | 0.000 | 0.237 | 0.000 | 1.000 | 0.264 | 0.000 | 0.000 |
| 내부 에스컬레이터(E/S) | 0.000 | 0.280 | 0.000 | 0.264 | 1.000 | 0.565 | 0.000 |
| 휠체어리프트(W/L) | 0.000 | 0.000 | 0.000 | 0.000 | 0.565 | 1.000 | 1.000 |
| 비고 | 0.225 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 |

[Figure: correlation heatmap]

| | 수평자동보도(M/W) | 외부 엘리베이터(E/V) | 호선 | 휠체어리프트(W/L) | 내부 엘리베이터(E/V) |
|---|---|---|---|---|---|
| 수평자동보도(M/W) | 1.000 | NaN | NaN | NaN | NaN |
| 외부 엘리베이터(E/V) | NaN | 1.000 | 0.127 | 0.000 | 0.075 |
| 호선 | NaN | 0.127 | 1.000 | 0.000 | 0.197 |
| 휠체어리프트(W/L) | NaN | 0.000 | 0.000 | 1.000 | 0.000 |
| 내부 엘리베이터(E/V) | NaN | 0.075 | 0.197 | 0.000 | 1.000 |

[Figure: correlation heatmap]

| | 외부 에스컬레이터(E/S) | 내부 에스컬레이터(E/S) | 호선 | 외부 엘리베이터(E/V) | 내부 엘리베이터(E/V) | 휠체어리프트(W/L) | 수평자동보도(M/W) |
|---|---|---|---|---|---|---|---|
| 외부 에스컬레이터(E/S) | 1.000 | -0.065 | 0.000 | 0.182 | 0.073 | 0.000 | 0.000 |
| 내부 에스컬레이터(E/S) | -0.065 | 1.000 | 0.000 | 0.161 | 0.000 | 0.244 | NaN |
| 호선 | 0.000 | 0.000 | 1.000 | 0.127 | 0.197 | 0.000 | NaN |
| 외부 엘리베이터(E/V) | 0.182 | 0.161 | 0.127 | 1.000 | 0.075 | 0.000 | 0.000 |
| 내부 엘리베이터(E/V) | 0.073 | 0.000 | 0.197 | 0.075 | 1.000 | 0.000 | 0.000 |
| 휠체어리프트(W/L) | 0.000 | 0.244 | 0.000 | 0.000 | 0.000 | 1.000 | NaN |
| 수평자동보도(M/W) | 0.000 | NaN | NaN | 0.000 | 0.000 | NaN | 1.000 |

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion visually.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
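
One common way to quantify nullity correlation is the Pearson correlation between the 0/1 missingness indicators of two columns. The sketch below assumes that definition. It uses only the ten first rows shown in the Sample section for the two escalator columns, so the result illustrates the calculation and is not the value plotted in the heatmap. The helper `pearson` and the list names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# First ten rows of the two escalator columns; None marks <NA>.
outer_es = [None, 2, None, 2, None, 1, 6, None, 2, 2]
inner_es = [1, None, None, None, None, None, 6, None, None, None]

# 1 where the value is missing, 0 where it is present.
outer_null = [int(v is None) for v in outer_es]
inner_null = [int(v is None) for v in inner_es]

nullity_corr = pearson(outer_null, inner_null)
print(round(nullity_corr, 3))
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one tends to be present when the other is missing, and a value near 0 that their missingness is unrelated.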

Sample

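First rows

| | 호선 | 역명 | 외부 엘리베이터(E/V) | 내부 엘리베이터(E/V) | 외부 에스컬레이터(E/S) | 내부 에스컬레이터(E/S) | 휠체어리프트(W/L) | 수평자동보도(M/W) | 비고 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1호선 | 서울① | 2 | 1 | <NA> | 1 | 1 | <NA> | <NA> |
| 1 | 1호선 | 시청① | 1 | 2 | 2 | <NA> | <NA> | <NA> | <NA> |
| 2 | 1호선 | 종 각 | 1 | 2 | <NA> | <NA> | <NA> | <NA> | <NA> |
| 3 | 1호선 | 종로3가① | <NA> | 2 | 2 | <NA> | 1 | <NA> | 피카디리 연결통로이용 |
| 4 | 1호선 | 종로5가 | 1 | 2 | <NA> | <NA> | <NA> | <NA> | <NA> |
| 5 | 1호선 | 동대문① | 1 | 2 | 1 | <NA> | <NA> | <NA> | <NA> |
| 6 | 1호선 | 동묘앞 | 2 | 5 | 6 | 6 | <NA> | <NA> | <NA> |
| 7 | 1호선 | 신설동① | 3 | 2 | <NA> | <NA> | 6 | <NA> | <NA> |
| 8 | 1호선 | 제기동 | 1 | 2 | 2 | <NA> | <NA> | <NA> | <NA> |
| 9 | 1호선 | 청량리 | 1 | 1 | 2 | <NA> | 1 | <NA> | 내부(E/V)에 Lift 1대 포함 |

Last rows

| | 호선 | 역명 | 외부 엘리베이터(E/V) | 내부 엘리베이터(E/V) | 외부 에스컬레이터(E/S) | 내부 에스컬레이터(E/S) | 휠체어리프트(W/L) | 수평자동보도(M/W) | 비고 |
|---|---|---|---|---|---|---|---|---|---|
| 110 | 4호선 | 회 현 | <NA> | 1 | <NA> | 6 | 1 | <NA> | <NA> |
| 111 | 4호선 | 서울④ | 1 | 1 | 2 | 2 | 2 | <NA> | <NA> |
| 112 | 4호선 | 숙대입구 | 1 | 2 | <NA> | <NA> | <NA> | <NA> | <NA> |
| 113 | 4호선 | 삼각지 | 2 | 2 | 4 | <NA> | <NA> | <NA> | <NA> |
| 114 | 4호선 | 신용산 | <NA> | 2 | 2 | <NA> | 1 | <NA> | <NA> |
| 115 | 4호선 | 이 촌 | 2 | 2 | 4 | <NA> | 1 | <NA> | <NA> |
| 116 | 4호선 | 동 작 | <NA> | 2 | <NA> | <NA> | <NA> | <NA> | <NA> |
| 117 | 4호선 | 총신대입구 | 2 | 2 | <NA> | <NA> | <NA> | <NA> | <NA> |
| 118 | 4호선 | 사당④ | 1 | 1 | 6 | 1 | 1 | <NA> | <NA> |
| 119 | 4호선 | 남태령 | 1 | 2 | <NA> | 2 | <NA> | <NA> | 내부경사형EV |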