Overview

Dataset statistics

Number of variables: 5
Number of observations: 5468
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 229.7 KiB
Average record size in memory: 43.0 B
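
The derived figures in this table follow from the raw counts. The sketch below recomputes them from the values reported above; it assumes 1 KiB = 1024 bytes and that the missing-cell percentage is taken over all cells (variables × observations).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 5468
missing_cells = 1
total_size_kib = 229.7

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

A single missing cell out of 27,340 is far below the 0.1% display threshold, which is why the report shows "< 0.1%".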

Variable types

Numeric: 3
Text: 2

Dataset

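Description: Installation locations of the voice guidance devices for visually impaired passengers on Seoul Metro (서울교통공사) Lines 1-8: 5,436 devices at 275 stations (including 201 devices at 5 Hanam Line stations). The data consists of serial number (연번), line (호선), station number (역번호), station name (역명), and voice guidance device installation location.
URL: https://www.data.go.kr/data/15100171/fileData.do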

Alerts

연번 is highly overall correlated with 호선 and 1 other field (High correlation)
호선 is highly overall correlated with 연번 and 1 other field (High correlation)
외부역번호 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-13 00:55:21.892116
Analysis finished: 2023-12-13 00:55:23.142290
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
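
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the helper name `build_profile`, the file paths, the report title, and the `cp949` encoding are all assumptions (the actual settings are in the config.json referenced above).

```python
def build_profile(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source CSV is cp949-encoded; adjust if it is UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original run may have used different options.
    profile = ProfileReport(df, title="Voice guidance device locations")
    profile.to_file(html_path)
```

Calling `build_profile("locations.csv", "report.html")` with a local copy of the dataset would write the HTML report; both file names are placeholders.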

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 5468
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2734.5
Minimum: 1
Maximum: 5468
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 274.35
Q1: 1367.75
Median: 2734.5
Q3: 4101.25
95th percentile: 5194.65
Maximum: 5468
Range: 5467
Interquartile range (IQR): 2733.5

Descriptive statistics

Standard deviation: 1578.62
Coefficient of variation (CV): 0.57729748
Kurtosis: -1.2
Mean: 2734.5
Median Absolute Deviation (MAD): 1367
Skewness: 0
Sum: 14952246
Variance: 2492041
Monotonicity: Strictly increasing
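
These figures are what one would expect if 연번 is simply the running index 1, 2, ..., 5468. The sketch below checks that reading with the standard library; the assumption that the column holds exactly these integers is ours, inferred from the distinct count, the minimum, the maximum, and the strict monotonicity.

```python
import statistics

# Assumption: 연번 is exactly the sequence 1, 2, ..., 5468.
values = list(range(1, 5469))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

# method="inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
p5 = statistics.quantiles(values, n=20, method="inclusive")[0]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance, round(std, 2))
print(p5, q1, median, q3, q3 - q1, mad)
```

Under that assumption the sum, mean, variance, quartiles, and MAD come out equal to the reported values.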
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
1                    1      < 0.1%
3645                 1      < 0.1%
3653                 1      < 0.1%
3652                 1      < 0.1%
3651                 1      < 0.1%
3650                 1      < 0.1%
3649                 1      < 0.1%
3648                 1      < 0.1%
3647                 1      < 0.1%
3646                 1      < 0.1%
Other values (5458)  5458   99.8%

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
5468   1      < 0.1%
5467   1      < 0.1%
5466   1      < 0.1%
5465   1      < 0.1%
5464   1      < 0.1%
5463   1      < 0.1%
5462   1      < 0.1%
5461   1      < 0.1%
5460   1      < 0.1%
5459   1      < 0.1%

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6305779
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 5
Q3: 6
95th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.0375911
Coefficient of variation (CV): 0.44002955
Kurtosis: -1.1115275
Mean: 4.6305779
Median Absolute Deviation (MAD): 2
Skewness: -0.12059121
Sum: 25320
Variance: 4.1517775
Monotonicity: Increasing
Histogram with fixed size bins (bins=8)

Value  Count  Frequency (%)
5      1145   20.9%
2      963    17.6%
7      885    16.2%
6      718    13.1%
4      602    11.0%
3      496    9.1%
8      373    6.8%
1      286    5.2%

Value  Count  Frequency (%)
1      286    5.2%
2      963    17.6%
3      496    9.1%
4      602    11.0%
5      1145   20.9%
6      718    13.1%
7      885    16.2%
8      373    6.8%

Value  Count  Frequency (%)
8      373    6.8%
7      885    16.2%
6      718    13.1%
5      1145   20.9%
4      602    11.0%
3      496    9.1%
2      963    17.6%
1      286    5.2%
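
Because 호선 takes only eight values, its summary statistics can be recovered from the frequency table alone. The sketch below does so with the counts listed above; it assumes the reported variance is the sample variance (n - 1 in the denominator).

```python
# Counts per line number, copied from the 호선 frequency table.
counts = {1: 286, 2: 963, 3: 496, 4: 602, 5: 1145, 6: 718, 7: 885, 8: 373}

n = sum(counts.values())                               # number of observations
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance, i.e. dividing by n - 1.
variance = sum(
    count * (value - mean) ** 2 for value, count in counts.items()
) / (n - 1)

# Share of each line, as a percentage rounded to one decimal place.
shares = {value: round(100 * count / n, 1) for value, count in counts.items()}

print(n, total, round(mean, 7), round(variance, 7))
print(shares)
```

The counts add up to the 5468 observations, and the weighted sum and mean agree with the reported Sum and Mean.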

외부역번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 238
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 502.76372
Minimum: 126
Maximum: 2114
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.2 KiB

Quantile statistics

Minimum: 126
5th percentile: 133
Q1: 327
Median: 530
Q3: 644
95th percentile: 815
Maximum: 2114
Range: 1988
Interquartile range (IQR): 317

Descriptive statistics

Standard deviation: 241.93721
Coefficient of variation (CV): 0.48121455
Kurtosis: 10.912883
Mean: 502.76372
Median Absolute Deviation (MAD): 185.5
Skewness: 1.7407462
Sum: 2749112
Variance: 58533.616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
423                 52     1.0%
130                 48     0.9%
208                 44     0.8%
128                 44     0.8%
562                 44     0.8%
564                 43     0.8%
425                 41     0.7%
329                 41     0.7%
561                 41     0.7%
421                 40     0.7%
Other values (228)  5030   92.0%

Value  Count  Frequency (%)
126    22     0.4%
127    37     0.7%
128    44     0.8%
129    32     0.6%
130    48     0.9%
131    38     0.7%
132    32     0.6%
133    33     0.6%
202    38     0.7%
203    32     0.6%

Value  Count  Frequency (%)
2114   17     0.3%
2113   19     0.3%
827    19     0.3%
826    13     0.2%
825    14     0.3%
824    25     0.5%
823    18     0.3%
822    13     0.2%
821    31     0.6%
820    14     0.3%

역명
Text

Distinct: 220
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 42.8 KiB

Length

Max length: 12
Median length: 9
Mean length: 3.4575713
Min length: 2

Characters and Unicode

Total characters: 18906
Distinct characters: 203
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
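
As an illustration of these character properties, the sketch below counts the Unicode general categories in one station name from this column using Python's `unicodedata` module. The helper `category_counts` and the name mapping are ours and cover only the five categories that occur in this variable.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not in the mapping.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


print(category_counts("서울역 (1)"))
```

The three Hangul syllables count as Other Letter, while the space, the parentheses, and the digit each fall into their own category.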

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 서울역 (1)
2nd row: 서울역 (1)
3rd row: 서울역 (1)
4th row: 서울역 (1)
5th row: 서울역 (1)

Value               Count  Frequency (%)
1                   113    1.9%
종로3가               111    1.9%
3                   109    1.9%
을지로3가              71     1.2%
2                   67     1.2%
왕십리                65     1.1%
잠실                 62     1.1%
동묘앞                59     1.0%
불광                 59     1.0%
고속터미널              58     1.0%
Other values (210)  5045   86.7%

Most occurring characters

ValueCountFrequency (%)
740
 
3.9%
) 567
 
3.0%
( 567
 
3.0%
529
 
2.8%
486
 
2.6%
432
 
2.3%
392
 
2.1%
384
 
2.0%
380
 
2.0%
375
 
2.0%
Other values (193) 14054
74.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16554
87.6%
Decimal Number 834
 
4.4%
Close Punctuation 567
 
3.0%
Open Punctuation 567
 
3.0%
Space Separator 384
 
2.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
740
 
4.5%
529
 
3.2%
486
 
2.9%
432
 
2.6%
392
 
2.4%
380
 
2.3%
375
 
2.3%
363
 
2.2%
349
 
2.1%
328
 
2.0%
Other values (185) 12180
73.6%
Decimal Number
ValueCountFrequency (%)
3 291
34.9%
4 231
27.7%
1 157
18.8%
2 113
 
13.5%
5 42
 
5.0%
Close Punctuation
ValueCountFrequency (%)
) 567
100.0%
Open Punctuation
ValueCountFrequency (%)
( 567
100.0%
Space Separator
ValueCountFrequency (%)
384
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16554
87.6%
Common 2352
 
12.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
740
 
4.5%
529
 
3.2%
486
 
2.9%
432
 
2.6%
392
 
2.4%
380
 
2.3%
375
 
2.3%
363
 
2.2%
349
 
2.1%
328
 
2.0%
Other values (185) 12180
73.6%
Common
ValueCountFrequency (%)
) 567
24.1%
( 567
24.1%
384
16.3%
3 291
12.4%
4 231
9.8%
1 157
 
6.7%
2 113
 
4.8%
5 42
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16554
87.6%
ASCII 2352
 
12.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
740
 
4.5%
529
 
3.2%
486
 
2.9%
432
 
2.6%
392
 
2.4%
380
 
2.3%
375
 
2.3%
363
 
2.2%
349
 
2.1%
328
 
2.0%
Other values (185) 12180
73.6%
ASCII
ValueCountFrequency (%)
) 567
24.1%
( 567
24.1%
384
16.3%
3 291
12.4%
4 231
9.8%
1 157
 
6.7%
2 113
 
4.8%
5 42
 
1.8%

설치위치
Text

Distinct: 3818
Distinct (%): 69.8%
Missing: 1
Missing (%): < 0.1%
Memory size: 42.8 KiB

Length

Max length: 41
Median length: 32
Mean length: 12.67752
Min length: 3

Characters and Unicode

Total characters: 69308
Distinct characters: 334
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3409
Unique (%): 62.4%

Sample

1st row: 4호선 환승통로입구
2nd row: 서울역쪽 개표소
3rd row: 서울역쪽 발매기 앞
4th row: 역무실 앞 발매기 앞기둥
5th row: 역무실 앞 중간개표소(내부)
ValueCountFrequency (%)
1459
 
7.3%
출구 969
 
4.9%
계단 931
 
4.7%
승강장 824
 
4.1%
상선 722
 
3.6%
하선 683
 
3.4%
e/v 674
 
3.4%
내부 571
 
2.9%
468
 
2.3%
e/s 403
 
2.0%
Other values (1422) 12217
61.3%

Most occurring characters

ValueCountFrequency (%)
14527
 
21.0%
2197
 
3.2%
2119
 
3.1%
2084
 
3.0%
1 2000
 
2.9%
1865
 
2.7%
E 1592
 
2.3%
- 1509
 
2.2%
1488
 
2.1%
/ 1451
 
2.1%
Other values (324) 38476
55.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 36574
52.8%
Space Separator 14527
 
21.0%
Decimal Number 8050
 
11.6%
Uppercase Letter 3981
 
5.7%
Other Punctuation 2365
 
3.4%
Dash Punctuation 1509
 
2.2%
Close Punctuation 1116
 
1.6%
Open Punctuation 1114
 
1.6%
Math Symbol 49
 
0.1%
Lowercase Letter 16
 
< 0.1%
Other values (2) 7
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2197
 
6.0%
2119
 
5.8%
2084
 
5.7%
1865
 
5.1%
1488
 
4.1%
1356
 
3.7%
1343
 
3.7%
1342
 
3.7%
1304
 
3.6%
1141
 
3.1%
Other values (267) 20335
55.6%
Uppercase Letter
ValueCountFrequency (%)
E 1592
40.0%
V 847
21.3%
S 554
 
13.9%
B 254
 
6.4%
I 237
 
6.0%
L 123
 
3.1%
C 98
 
2.5%
A 54
 
1.4%
F 45
 
1.1%
G 38
 
1.0%
Other values (10) 139
 
3.5%
Decimal Number
ValueCountFrequency (%)
1 2000
24.8%
2 1408
17.5%
4 1294
16.1%
3 1188
14.8%
5 617
 
7.7%
6 443
 
5.5%
7 401
 
5.0%
8 329
 
4.1%
9 187
 
2.3%
0 183
 
2.3%
Lowercase Letter
ValueCountFrequency (%)
o 2
12.5%
x 2
12.5%
e 2
12.5%
v 2
12.5%
t 2
12.5%
a 2
12.5%
b 1
6.2%
s 1
6.2%
i 1
6.2%
j 1
6.2%
Other Punctuation
ValueCountFrequency (%)
/ 1451
61.4%
, 887
37.5%
# 21
 
0.9%
. 5
 
0.2%
& 1
 
< 0.1%
Letter Number
ValueCountFrequency (%)
2
33.3%
2
33.3%
2
33.3%
Close Punctuation
ValueCountFrequency (%)
) 1109
99.4%
] 7
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 1107
99.4%
[ 7
 
0.6%
Math Symbol
ValueCountFrequency (%)
~ 46
93.9%
> 3
 
6.1%
Space Separator
ValueCountFrequency (%)
14527
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1509
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 36534
52.7%
Common 28731
41.5%
Latin 4003
 
5.8%
Han 40
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2197
 
6.0%
2119
 
5.8%
2084
 
5.7%
1865
 
5.1%
1488
 
4.1%
1356
 
3.7%
1343
 
3.7%
1342
 
3.7%
1304
 
3.6%
1141
 
3.1%
Other values (265) 20295
55.6%
Latin
ValueCountFrequency (%)
E 1592
39.8%
V 847
21.2%
S 554
 
13.8%
B 254
 
6.3%
I 237
 
5.9%
L 123
 
3.1%
C 98
 
2.4%
A 54
 
1.3%
F 45
 
1.1%
G 38
 
0.9%
Other values (23) 161
 
4.0%
Common
ValueCountFrequency (%)
14527
50.6%
1 2000
 
7.0%
- 1509
 
5.3%
/ 1451
 
5.1%
2 1408
 
4.9%
4 1294
 
4.5%
3 1188
 
4.1%
) 1109
 
3.9%
( 1107
 
3.9%
, 887
 
3.1%
Other values (14) 2251
 
7.8%
Han
ValueCountFrequency (%)
23
57.5%
17
42.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 36534
52.7%
ASCII 32727
47.2%
CJK 40
 
0.1%
Number Forms 6
 
< 0.1%
Geometric Shapes 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14527
44.4%
1 2000
 
6.1%
E 1592
 
4.9%
- 1509
 
4.6%
/ 1451
 
4.4%
2 1408
 
4.3%
4 1294
 
4.0%
3 1188
 
3.6%
) 1109
 
3.4%
( 1107
 
3.4%
Other values (43) 5542
 
16.9%
Hangul
ValueCountFrequency (%)
2197
 
6.0%
2119
 
5.8%
2084
 
5.7%
1865
 
5.1%
1488
 
4.1%
1356
 
3.7%
1343
 
3.7%
1342
 
3.7%
1304
 
3.6%
1141
 
3.1%
Other values (265) 20295
55.6%
CJK
ValueCountFrequency (%)
23
57.5%
17
42.5%
Number Forms
ValueCountFrequency (%)
2
33.3%
2
33.3%
2
33.3%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%

Interactions

[Figure: pairwise interaction plots for the numeric variables 연번, 호선, and 외부역번호]

Correlations

            연번      호선      외부역번호
연번          1.000   0.929   0.981
호선          0.929   1.000   0.860
외부역번호      0.981   0.860   1.000

            연번      호선      외부역번호
연번          1.000   0.988   0.974
호선          0.988   1.000   0.960
외부역번호      0.974   0.960   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      연번    호선  외부역번호  역명        설치위치
0     1     1   133    서울역 (1)  4호선 환승통로입구
1     2     1   133    서울역 (1)  서울역쪽 개표소
2     3     1   133    서울역 (1)  서울역쪽 발매기 앞
3     4     1   133    서울역 (1)  역무실 앞 발매기 앞기둥
4     5     1   133    서울역 (1)  역무실 앞 중간개표소(내부)
5     6     1   133    서울역 (1)  2번 출구 (내부)
6     7     1   133    서울역 (1)  2번 입구 (외부)
7     8     1   133    서울역 (1)  E/V 앞 2번 출구 하단
8     9     1   133    서울역 (1)  E/V 앞 2번 출구 상단
9     10    1   133    서울역 (1)  대합실에서 승강장가는 E/V 앞기둥

      연번    호선  외부역번호  역명   설치위치
5458  5459  8   827    모란   분당선 환승통로
5459  5460  8   827    모란   상선 승강장 가는 계단
5460  5461  8   827    모란   상선 2-4
5461  5462  8   827    모란   하선 5-1
5462  5463  8   827    모란   상선 4-3
5463  5464  8   827    모란   상선 5-3 (E/V앞)
5464  5465  8   827    모란   하선 3-2
5465  5466  8   827    모란   하선 2-3 (E/V앞)
5466  5467  8   827    모란   상선 6-4
5467  5468  8   827    모란   하선 1-1