Overview

Dataset statistics

Number of variables: 8
Number of observations: 332
Missing cells: 232
Missing cells (%): 8.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.2 KiB
Average record size in memory: 65.4 B
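
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only the numbers printed in this block (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 332
missing_cells = 232
total_size_kib = 21.2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; spread the total evenly over the observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing, {avg_record_bytes:.1f} B per record")
```

All 232 missing cells sit in a single column, which is why the dataset-wide share (8.7%) is so much lower than that column's own 69.9%.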

Variable types

Categorical: 4
Text: 4

Dataset

Description: File download
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-2732/F/1/datasetView.do

Alerts

시 설 명 (역사명) (facility name / station name) has 232 (69.9%) missing values. Alert type: Missing
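
The sample rows at the end of this report suggest why this column is mostly empty: the station name appears only on the first row of each station group, and the rows for the individual measurement points leave it blank. Assuming that layout holds for the whole file, a forward fill would recover the names. The sketch below uses a hypothetical helper, `forward_fill`, on the station column of the first eight data rows.

```python
from typing import Optional


def forward_fill(values: list[Optional[str]]) -> list[Optional[str]]:
    """Replace each None with the most recent non-None value before it."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Station-name column for the first eight data rows of the sample (None = <NA>).
station = ["서울역", None, None, None, "시 청", None, None, None]
print(forward_fill(station))
```

Leading missing values stay missing, which matters here because the first two rows of the file look like extra header rows rather than measurements.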

Reproduction

Analysis started: 2024-04-29 22:00:01.813393
Analysis finished: 2024-04-29 22:00:02.283742
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
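
The reported duration is the difference between the two timestamps. A short check with the standard library, using the values exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-04-29 22:00:01.813393")
finished = datetime.fromisoformat("2024-04-29 22:00:02.283742")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```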

Variables

호선 (subway line)
Categorical

Distinct: 5
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Category  Count
2         120
3         108
4         69
1         33
<NA>      2

Length

Max length: 4
Median length: 1
Mean length: 1.0180723
Min length: 1
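
These length statistics can be reproduced from the category counts listed for this variable, on the assumption that `<NA>` was stored as a literal four-character string (which would also explain why the variable reports 0 missing values). A small sketch:

```python
from statistics import mean, median

# Category counts copied from the 호선 overview; "<NA>" is treated as plain text.
counts = {"2": 120, "3": 108, "4": 69, "1": 33, "<NA>": 2}

# One length per observation: 330 one-character values and two "<NA>" strings.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
```

The mean works out to 338 / 332, which matches the reported 1.0180723.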

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
2      120    36.1%
3      108    32.5%
4      69     20.8%
1      33     9.9%
<NA>   2      0.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value  Count  Frequency (%)
2      120    36.1%
3      108    32.5%
4      69     20.8%
1      33     9.9%
na     2      0.6%

시 설 명 (역사명) (facility name / station name)
Text

Distinct: 91
Distinct (%): 91.0%
Missing: 232
Missing (%): 69.9%
Memory size: 2.7 KiB

Length

Max length: 10
Median length: 5
Mean length: 4.41
Min length: 3

Characters and Unicode

Total characters: 441
Distinct characters: 125
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
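
As a rough illustration of that idea, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample station names listed for this variable. It is not the profiler's own implementation, and it assumes the spaces in the names are ordinary ASCII spaces.

```python
import unicodedata
from collections import Counter

# First five sample values of the station-name column, as listed in this report.
samples = ["서울역", "시 청", "종 각", "종로3가", "종로5가"]

# Tally the two-letter Unicode general category of every character.
categories = Counter(unicodedata.category(ch) for text in samples for ch in text)
print(categories)
```

The two-letter codes correspond to the labels used in the tables of this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, and `Cc` is Control.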

Unique

Unique: 82
Unique (%): 82.0%

Sample

1st row: 서울역 (Seoul Station)
2nd row: 시 청 (City Hall)
3rd row: 종 각 (Jonggak)
4th row: 종로3가 (Jongno 3-ga)
5th row: 종로5가 (Jongno 5-ga)
ValueCountFrequency (%)
5
 
3.3%
5
 
3.3%
3
 
2.0%
3
 
2.0%
3
 
2.0%
3
 
2.0%
2
 
1.3%
2
 
1.3%
2
 
1.3%
2
 
1.3%
Other values (103) 121
80.1%

Most occurring characters

ValueCountFrequency (%)
137
31.1%
17
 
3.9%
13
 
2.9%
11
 
2.5%
11
 
2.5%
10
 
2.3%
9
 
2.0%
7
 
1.6%
7
 
1.6%
6
 
1.4%
Other values (115) 213
48.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 295
66.9%
Space Separator 137
31.1%
Decimal Number 6
 
1.4%
Control 3
 
0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
17
 
5.8%
13
 
4.4%
11
 
3.7%
11
 
3.7%
10
 
3.4%
9
 
3.1%
7
 
2.4%
7
 
2.4%
6
 
2.0%
5
 
1.7%
Other values (110) 199
67.5%
Decimal Number
ValueCountFrequency (%)
3 4
66.7%
5 1
 
16.7%
4 1
 
16.7%
Space Separator
ValueCountFrequency (%)
137
100.0%
Control
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 295
66.9%
Common 146
33.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
17
 
5.8%
13
 
4.4%
11
 
3.7%
11
 
3.7%
10
 
3.4%
9
 
3.1%
7
 
2.4%
7
 
2.4%
6
 
2.0%
5
 
1.7%
Other values (110) 199
67.5%
Common
ValueCountFrequency (%)
137
93.8%
3 4
 
2.7%
3
 
2.1%
5 1
 
0.7%
4 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 295
66.9%
ASCII 146
33.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
137
93.8%
3 4
 
2.7%
3
 
2.1%
5 1
 
0.7%
4 1
 
0.7%
Hangul
ValueCountFrequency (%)
17
 
5.8%
13
 
4.4%
11
 
3.7%
11
 
3.7%
10
 
3.4%
9
 
3.1%
7
 
2.4%
7
 
2.4%
6
 
2.0%
5
 
1.7%
Other values (110) 199
67.5%

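측정 지점 (measurement point)
Categorical

Distinct: 8
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Category                Count
평 균 (average)          100
승강장 (platform)        100
대합실 (concourse)       82
대합실-1 (concourse 1)   18
대합실-2 (concourse 2)   18
Other values (3)        14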

Length

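Max length: 6
Median length: 3
Mean length: 3.5662651
Min length: 3

Unique

Unique: 2
Unique (%): 0.6%

Sample

1st row: <NA>
2nd row: 공기질 기준 (air quality standard)
3rd row: 평 균 (average)
4th row: 승강장 (platform)
5th row: 대합실-1 (concourse 1)

Common Values

Value                              Count  Frequency (%)
평 균 (average)                     100    30.1%
승강장 (platform)                   100    30.1%
대합실 (concourse)                  82     24.7%
대합실-1 (concourse 1)              18     5.4%
대합실-2 (concourse 2)              18     5.4%
환승통로 (transfer passage)         12     3.6%
<NA>                               1      0.3%
공기질 기준 (air quality standard)  1      0.3%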

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value                        Count  Frequency (%)
[illegible]                  100    23.1%
[illegible]                  100    23.1%
승강장 (platform)             100    23.1%
대합실 (concourse)            82     18.9%
대합실-1 (concourse 1)        18     4.2%
대합실-2 (concourse 2)        18     4.2%
환승통로 (transfer passage)   12     2.8%
na                           1      0.2%
공기질 (air quality)          1      0.2%
기준 (standard)               1      0.2%

유지기준 (maintenance standard; PM10 column)
Text

Distinct: 195
Distinct (%): 58.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.9246988
Min length: 2

Characters and Unicode

Total characters: 1303
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 108
Unique (%): 32.5%

Sample

1st row: PM10
2nd row: 140㎍/㎥
3rd row: 97
4th row: 121.7
5th row: 81.4
Value               Count  Frequency (%)
97.5                6      1.8%
98.4                5      1.5%
90.9                5      1.5%
93.6                5      1.5%
92.8                4      1.2%
87.2                4      1.2%
95.8                4      1.2%
97.4                4      1.2%
95.7                4      1.2%
95.1                4      1.2%
Other values (185)  287    86.4%

Most occurring characters

ValueCountFrequency (%)
. 297
22.8%
9 235
18.0%
8 176
13.5%
1 109
 
8.4%
7 86
 
6.6%
5 75
 
5.8%
0 72
 
5.5%
4 70
 
5.4%
3 60
 
4.6%
2 60
 
4.6%
Other values (6) 63
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1001
76.8%
Other Punctuation 298
 
22.9%
Other Symbol 2
 
0.2%
Uppercase Letter 2
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 235
23.5%
8 176
17.6%
1 109
10.9%
7 86
 
8.6%
5 75
 
7.5%
0 72
 
7.2%
4 70
 
7.0%
3 60
 
6.0%
2 60
 
6.0%
6 58
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 297
99.7%
/ 1
 
0.3%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
50.0%
M 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1301
99.8%
Latin 2
 
0.2%

Most frequent character per script

Common
ValueCountFrequency (%)
. 297
22.8%
9 235
18.1%
8 176
13.5%
1 109
 
8.4%
7 86
 
6.6%
5 75
 
5.8%
0 72
 
5.5%
4 70
 
5.4%
3 60
 
4.6%
2 60
 
4.6%
Other values (4) 61
 
4.7%
Latin
ValueCountFrequency (%)
P 1
50.0%
M 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1301
99.8%
CJK Compat 2
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 297
22.8%
9 235
18.1%
8 176
13.5%
1 109
 
8.4%
7 86
 
6.6%
5 75
 
5.8%
0 72
 
5.5%
4 70
 
5.4%
3 60
 
4.6%
2 60
 
4.6%
Other values (4) 61
 
4.7%
CJK Compat
ValueCountFrequency (%)
1
50.0%
1
50.0%

유지기준.1 (maintenance standard; CO2 column)
Text

Distinct: 177
Distinct (%): 53.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.0150602
Min length: 3

Characters and Unicode

Total characters: 1001
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 26.2%

Sample

1st row: CO2
2nd row: 1,000ppm
3rd row: 638
4th row: 656
5th row: 616
Value               Count  Frequency (%)
482                 6      1.8%
466                 4      1.2%
528                 4      1.2%
415                 4      1.2%
503                 4      1.2%
502                 4      1.2%
508                 4      1.2%
530                 4      1.2%
454                 4      1.2%
520                 4      1.2%
Other values (167)  290    87.3%

Most occurring characters

ValueCountFrequency (%)
4 229
22.9%
5 198
19.8%
6 101
10.1%
0 79
 
7.9%
7 76
 
7.6%
8 70
 
7.0%
3 68
 
6.8%
2 66
 
6.6%
1 60
 
6.0%
9 48
 
4.8%
Other values (5) 6
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 995
99.4%
Lowercase Letter 3
 
0.3%
Uppercase Letter 2
 
0.2%
Other Punctuation 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 229
23.0%
5 198
19.9%
6 101
10.2%
0 79
 
7.9%
7 76
 
7.6%
8 70
 
7.0%
3 68
 
6.8%
2 66
 
6.6%
1 60
 
6.0%
9 48
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
p 2
66.7%
m 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 996
99.5%
Latin 5
 
0.5%

Most frequent character per script

Common
ValueCountFrequency (%)
4 229
23.0%
5 198
19.9%
6 101
10.1%
0 79
 
7.9%
7 76
 
7.6%
8 70
 
7.0%
3 68
 
6.8%
2 66
 
6.6%
1 60
 
6.0%
9 48
 
4.8%
Latin
ValueCountFrequency (%)
p 2
40.0%
m 1
20.0%
C 1
20.0%
O 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1001
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 229
22.9%
5 198
19.8%
6 101
10.1%
0 79
 
7.9%
7 76
 
7.6%
8 70
 
7.0%
3 68
 
6.8%
2 66
 
6.6%
1 60
 
6.0%
9 48
 
4.8%
Other values (5) 6
 
0.6%

유지기준.2 (maintenance standard; HCHO column)
Text

Distinct: 182
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.6506024
Min length: 1

Characters and Unicode

Total characters: 1212
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 30.4%

Sample

1st row: HCHO
2nd row: 100㎍/㎥
3rd row: 13.7
4th row: 13.4
5th row: 14
Value               Count  Frequency (%)
12.8                6      1.8%
16.8                6      1.8%
16.3                6      1.8%
13.4                5      1.5%
16                  5      1.5%
7.9                 5      1.5%
17                  5      1.5%
13.9                5      1.5%
14.7                5      1.5%
15.9                4      1.2%
Other values (172)  280    84.3%

Most occurring characters

ValueCountFrequency (%)
. 293
24.2%
1 282
23.3%
2 108
 
8.9%
3 82
 
6.8%
7 77
 
6.4%
6 72
 
5.9%
4 69
 
5.7%
5 68
 
5.6%
9 62
 
5.1%
8 60
 
5.0%
Other values (7) 39
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 912
75.2%
Other Punctuation 294
 
24.3%
Uppercase Letter 4
 
0.3%
Other Symbol 2
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 282
30.9%
2 108
 
11.8%
3 82
 
9.0%
7 77
 
8.4%
6 72
 
7.9%
4 69
 
7.6%
5 68
 
7.5%
9 62
 
6.8%
8 60
 
6.6%
0 32
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
H 2
50.0%
C 1
25.0%
O 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 293
99.7%
/ 1
 
0.3%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1208
99.7%
Latin 4
 
0.3%

Most frequent character per script

Common
ValueCountFrequency (%)
. 293
24.3%
1 282
23.3%
2 108
 
8.9%
3 82
 
6.8%
7 77
 
6.4%
6 72
 
6.0%
4 69
 
5.7%
5 68
 
5.6%
9 62
 
5.1%
8 60
 
5.0%
Other values (4) 35
 
2.9%
Latin
ValueCountFrequency (%)
H 2
50.0%
C 1
25.0%
O 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1210
99.8%
CJK Compat 2
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 293
24.2%
1 282
23.3%
2 108
 
8.9%
3 82
 
6.8%
7 77
 
6.4%
6 72
 
6.0%
4 69
 
5.7%
5 68
 
5.6%
9 62
 
5.1%
8 60
 
5.0%
Other values (5) 37
 
3.1%
CJK Compat
ValueCountFrequency (%)
1
50.0%
1
50.0%

유지기준.3 (maintenance standard; CO column)
Categorical

Distinct: 19
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Category           Count
1                  50
0.5                47
0.7                40
0.8                37
0.6                37
Other values (14)  121

Length

Max length: 4
Median length: 3
Mean length: 2.686747
Min length: 1

Unique

Unique: 5
Unique (%): 1.5%

Sample

1st row: CO
2nd row: 9ppm
3rd row: 0.6
4th row: 0.6
5th row: 0.6

Common Values

Value             Count  Frequency (%)
1                 50     15.1%
0.5               47     14.2%
0.7               40     12.0%
0.8               37     11.1%
0.6               37     11.1%
0.4               32     9.6%
0.9               32     9.6%
0.3               17     5.1%
1.1               15     4.5%
0.2               6      1.8%
Other values (9)  19     5.7%

Length

[Figure: Histogram of lengths of the category]

유지기준.4 (maintenance standard; asbestos column)
Categorical

Distinct: 16
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Category           Count
0.0004             141
0.0008             96
0.0006             17
0.0012             14
0                  13
Other values (11)  51

Length

Max length: 8
Median length: 6
Mean length: 5.7771084
Min length: 1

Unique

Unique: 3
Unique (%): 0.9%

Sample

1st row: 석면 (asbestos)
2nd row: 0.01개/cc (0.01 count/cc)
3rd row: 0.0005
4th row: 0.0008
5th row: 0.0004

Common Values

Value             Count  Frequency (%)
0.0004            141    42.5%
0.0008            96     28.9%
0.0006            17     5.1%
0.0012            14     4.2%
0                 13     3.9%
0.0005            12     3.6%
0.0009            11     3.3%
0.0013            8      2.4%
0.001             7      2.1%
0.0011            4      1.2%
Other values (6)  9      2.7%

Length

[Figure: Histogram of lengths of the category]

Correlations

[Figure: correlation heatmap]

                   호선    시 설 명 (역사명)  측정 지점  유지기준.3  유지기준.4
호선                1.000  0.000            0.000     0.352      0.039
시 설 명 (역사명)    0.000  1.000            NaN       0.911      0.695
측정 지점           0.000  NaN              1.000     0.702      0.760
유지기준.3          0.352  0.911            0.702     1.000      0.748
유지기준.4          0.039  0.695            0.760     0.748      1.000

[Figure: correlation heatmap]

            유지기준.4  측정 지점  유지기준.3  호선
유지기준.4   1.000      0.468     0.341      0.019
측정 지점    0.468      1.000     0.400      0.000
유지기준.3   0.341      0.400     1.000      0.198
호선         0.019      0.000     0.198      1.000

[Figure: correlation heatmap]

            호선    측정 지점  유지기준.3  유지기준.4
호선         1.000  0.000     0.198      0.019
측정 지점    0.000  1.000     0.400      0.468
유지기준.3   0.198  0.400     1.000      0.341
유지기준.4   0.019  0.468     0.341      1.000
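
The report does not say which coefficient fills these matrices. For pairs of categorical columns, one widely used association measure is Cramér's V, which is 0 for independent columns and 1 for a perfect association. The sketch below is an uncorrected, standard-library version offered only as an illustration of that kind of measure. It is an assumption that the values above are of this type, the function name `cramers_v` is hypothetical, and the toy columns are invented for the example.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs: list[str], ys: list[str]) -> float:
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy columns, not taken from the dataset.
line = ["1", "1", "2", "2"]
point = ["승강장", "대합실", "승강장", "대합실"]
print(cramers_v(line, line))   # a column against itself
print(cramers_v(line, point))  # two columns with no association
```

With columns as sparse as the ones profiled here (many categories, 332 rows), uncorrected values tend to be inflated, so a bias-corrected variant is usually preferable in practice.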

Missing values

[Figure: count of values per column] A simple visualization of nullity by column.

[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
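
The per-column counts behind such a chart are simple to compute. A minimal sketch with a hypothetical helper, `null_counts`, applied to four rows shaped like this report's sample (only three of the eight columns are included to keep it short):

```python
# A few rows in the shape of this report's sample; None stands for <NA>.
rows = [
    {"호선": "1", "시 설 명 (역사명)": "서울역", "측정 지점": "평 균"},
    {"호선": "1", "시 설 명 (역사명)": None, "측정 지점": "승강장"},
    {"호선": "1", "시 설 명 (역사명)": None, "측정 지점": "대합실-1"},
    {"호선": "1", "시 설 명 (역사명)": None, "측정 지점": "대합실-2"},
]


def null_counts(records: list[dict]) -> dict[str, int]:
    """Count the None values in each column of a list of row dictionaries."""
    counts: dict[str, int] = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


print(null_counts(rows))
```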

Sample

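First rows

     호선   시 설 명 (역사명)  측정 지점    유지기준   유지기준.1  유지기준.2  유지기준.3  유지기준.4
0    <NA>  <NA>             <NA>        PM10      CO2        HCHO       CO         석면
1    <NA>  <NA>             공기질 기준   140㎍/㎥   1,000ppm   100㎍/㎥    9ppm       0.01개/cc
2    1     서울역            평 균        97        638        13.7       0.6        0.0005
3    1     <NA>             승강장       121.7     656        13.4       0.6        0.0008
4    1     <NA>             대합실-1     81.4      616        14         0.6        0.0004
5    1     <NA>             대합실-2     87.9      643        13.7       0.6        0.0004
6    1     시 청             평 균        98.5      587        14.7       0.6        0.0004
7    1     <NA>             승강장       101       600        13         0.7        0
8    1     <NA>             대합실-1     95.5      613        16.9       0.6        0.0008
9    1     <NA>             대합실-2     99.1      549        14.3       0.6        0.0004

Last rows

     호선   시 설 명 (역사명)  측정 지점    유지기준   유지기준.1  유지기준.2  유지기준.3  유지기준.4
322  4     총신대 입구        평 균        96.8      475        18.3       0.6        0.0004
323  4     <NA>             승강장       101.3     485        14.7       0.7        0.0008
324  4     <NA>             대합실       92.2      464        21.8       0.5        0.0001
325  4     사 당             평 균        95.6      543        20.9       0.6        0.0004
326  4     <NA>             승강장       93.4      597        22.3       0.8        0.0004
327  4     <NA>             대합실       97.5      478        21.2       0.4        0
328  4     <NA>             환승통로      96        553        19.3       0.5        0.0008
329  4     남태령            평 균        92.7      479        7.9        0.6        0.0008
330  4     <NA>             승강장       91.9      485        6.6        0.7        0.0008
331  4     <NA>             대합실       93.5      472        9.1        0.4        0.0008

Legend: 호선 = subway line; 시 설 명 (역사명) = facility name (station name); 측정 지점 = measurement point; 유지기준 = maintenance standard; 공기질 기준 = air quality standard; 평 균 = average; 승강장 = platform; 대합실 = concourse; 환승통로 = transfer passage; 석면 = asbestos; 개/cc = count per cc.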
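
The sample shows why the five measurement columns were profiled as text or categorical: the first row holds the pollutant names and the second holds limits with units attached (for example `1,000ppm`). A sketch of how such strings might be split into a number and a unit before comparing measurements against the limits; `parse_measurement` is a hypothetical helper, and reading the second row as the applicable limit is an assumption based on its label, 공기질 기준 (air quality standard).

```python
import re


def parse_measurement(text: str) -> tuple[float, str]:
    """Split a string such as '1,000ppm' into (1000.0, 'ppm')."""
    match = re.fullmatch(r"\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*(.*)", text)
    if match is None:
        raise ValueError(f"not a measurement: {text!r}")
    # Drop thousands separators before converting to float.
    number = float(match.group(1).replace(",", ""))
    return number, match.group(2).strip()


# Limits from row 1 and the Seoul Station averages from row 2 of the sample.
limits = ["140㎍/㎥", "1,000ppm", "100㎍/㎥", "9ppm", "0.01개/cc"]
seoul_station = ["97", "638", "13.7", "0.6", "0.0005"]

for limit, measured in zip(limits, seoul_station):
    limit_value, unit = parse_measurement(limit)
    value, _ = parse_measurement(measured)
    status = "within limit" if value <= limit_value else "exceeds limit"
    print(value, unit, status)
```

Header cells such as `PM10` do not start with a digit, so the helper raises `ValueError` for them, which gives a simple way to detect and skip the two leading non-data rows.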