Overview

Dataset statistics

Number of variables: 7
Number of observations: 298
Missing cells: 10
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.7 KiB
Average record size in memory: 57.4 B
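
The percentage and per-record figures above can be recovered from the raw counts. The following is a minimal sketch of that arithmetic. It assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 7
n_observations = 298
missing_cells = 10
total_size_kib = 16.7  # already rounded in the report

# Assumption: the percentage is taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes. The result is approximate because the
# total size is only given to one decimal place.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 0.5% and 57.4 B.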

Variable types

Numeric: 1
Categorical: 1
Text: 5

Dataset

Description: File download (파일 다운로드)
Author: 서울교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-2751/F/1/datasetView.do

Alerts

연번 is highly overall correlated with 호선 (High correlation)
호선 is highly overall correlated with 연번 (High correlation)
한자 has 10 (3.4%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 22:02:00.568814
Analysis finished: 2024-04-29 22:02:02.454678
Duration: 1.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
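
A report of this kind is typically regenerated by loading the source file into a pandas DataFrame and passing it to `ProfileReport`. The sketch below is hedged: the file names, the report title and the `cp949` encoding are assumptions for illustration, since the report does not record how the data was loaded. Only the dataset metadata is taken from the "Dataset" section above.

```python
# Dataset metadata as shown in the report's "Dataset" section.
DATASET_META = {
    "description": "파일 다운로드",
    "author": "서울교통공사",
    "url": "https://data.seoul.go.kr/dataList/OA-2751/F/1/datasetView.do",
}


def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the HTML report.

    csv_path, html_path and the cp949 default are assumptions for
    illustration; the original loading step is not documented.
    """
    # Imported lazily so this module can be read without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling report", dataset=DATASET_META)
    profile.to_file(html_path)


# Hypothetical usage:
# build_report("stations.csv", "report.html")
```

Timings and memory figures will differ between runs and machines, so a regenerated report is not expected to match the values above exactly.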

Variables

연번 (serial number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 298
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 149.5
Minimum: 1
Maximum: 298
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.85
Q1: 75.25
Median: 149.5
Q3: 223.75
95-th percentile: 283.15
Maximum: 298
Range: 297
Interquartile range (IQR): 148.5

Descriptive statistics

Standard deviation: 86.169407
Coefficient of variation (CV): 0.57638399
Kurtosis: -1.2
Mean: 149.5
Median Absolute Deviation (MAD): 74.5
Skewness: 0
Sum: 44551
Variance: 7425.1667
Monotonicity: Strictly increasing
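
The quantile and descriptive statistics above are what one would expect if 연번 is simply the integers 1 through 298, which the minimum, maximum, distinct count and strict monotonicity all suggest. The sketch below recomputes several of them with Python's standard library under that assumption. Using the sample variance (n − 1 denominator) and linear-interpolation quantiles is also an assumption, chosen because it reproduces the reported figures.

```python
import statistics

# Assumption: 연번 is exactly the integers 1..298.
values = list(range(1, 299))

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean

# Linear-interpolation quantiles over the closed range [min, max].
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"sum={sum(values)} mean={mean} median={median}")
print(f"variance={variance:.4f} std={std:.6f} cv={cv:.8f}")
print(f"p5={p5} q1={q1} q3={q3} p95={p95} iqr={q3 - q1} mad={mad}")
```

Skewness and kurtosis are left out because the standard library has no functions for them.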
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.3%
206 | 1 | 0.3%
204 | 1 | 0.3%
203 | 1 | 0.3%
202 | 1 | 0.3%
201 | 1 | 0.3%
200 | 1 | 0.3%
199 | 1 | 0.3%
198 | 1 | 0.3%
197 | 1 | 0.3%
Other values (288) | 288 | 96.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
298 | 1 | 0.3%
297 | 1 | 0.3%
296 | 1 | 0.3%
295 | 1 | 0.3%
294 | 1 | 0.3%
293 | 1 | 0.3%
292 | 1 | 0.3%
291 | 1 | 0.3%
290 | 1 | 0.3%
289 | 1 | 0.3%

호선 (subway line)
Categorical

Alerts: HIGH CORRELATION

Distinct: 9
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

5호선: 56
2호선: 51
7호선: 51
6호선: 39
3호선: 34
Other values (4): 67

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
5호선 | 56 | 18.8%
2호선 | 51 | 17.1%
7호선 | 51 | 17.1%
6호선 | 39 | 13.1%
3호선 | 34 | 11.4%
4호선 | 26 | 8.7%
8호선 | 18 | 6.0%
9호선 | 13 | 4.4%
1호선 | 10 | 3.4%
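
Each percentage in the Common Values table is the category count divided by the 298 observations. A small sketch that recomputes the shares, and the "Other values" bucket of the summary view, from the counts alone (the dictionary is transcribed from the table):

```python
# Counts per line, transcribed from the Common Values table.
counts = {
    "5호선": 56, "2호선": 51, "7호선": 51, "6호선": 39, "3호선": 34,
    "4호선": 26, "8호선": 18, "9호선": 13, "1호선": 10,
}

total = sum(counts.values())  # should equal the number of observations
shares = {line: round(100 * n / total, 1) for line, n in counts.items()}

# The summary view shows the five largest categories and lumps the rest.
top5 = sorted(counts, key=counts.get, reverse=True)[:5]
other = total - sum(counts[line] for line in top5)

for line, pct in shares.items():
    print(f"{line}: {counts[line]} ({pct}%)")
print(f"Other values ({len(counts) - len(top5)}): {other}")
```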

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the 호선 value counts; the values match the Common Values table above]

역명 (station name)
Text

Distinct: 260
Distinct (%): 87.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.4228188
Min length: 2

Characters and Unicode

Total characters: 1318
Distinct characters: 251
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
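
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two sample values from this report. It is not the profiler's own implementation, and the helper name `category_counts` is made up for this example. Script and block lookups are not available in the standard library, so they are omitted.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in these samples.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Po": "Other Punctuation", "Zs": "Space Separator",
}


def category_counts(text: str) -> Counter:
    """Count the characters of text by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


# Sample values taken from the report.
print(category_counts("종로3가"))
print(category_counts("Jongno 3(sam)ga"))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the Korean-language columns.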

Unique

Unique: 223
Unique (%): 74.8%

Sample

1st row: 서울역
2nd row: 시청
3rd row: 종각
4th row: 종로3가
5th row: 종로5가

Value | Count | Frequency (%)
동대문역사문화공원(ddp | 3 | 1.0%
대림(구로구청 | 2 | 0.7%
영등포구청 | 2 | 0.7%
충정로(경기대입구 | 2 | 0.7%
충무로 | 2 | 0.7%
시청 | 2 | 0.7%
공덕 | 2 | 0.7%
사당 | 2 | 0.7%
석촌 | 2 | 0.7%
교대(법원·검찰청 | 2 | 0.7%
Other values (251) | 278 | 93.0%

Most occurring characters

ValueCountFrequency (%)
( 67
 
5.1%
) 67
 
5.1%
50
 
3.8%
50
 
3.8%
36
 
2.7%
32
 
2.4%
28
 
2.1%
26
 
2.0%
23
 
1.7%
20
 
1.5%
Other values (241) 919
69.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1161 | 88.1%
Open Punctuation | 67 | 5.1%
Close Punctuation | 67 | 5.1%
Uppercase Letter | 9 | 0.7%
Decimal Number | 8 | 0.6%
Other Punctuation | 4 | 0.3%
Space Separator | 2 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
50
 
4.3%
50
 
4.3%
36
 
3.1%
32
 
2.8%
28
 
2.4%
26
 
2.2%
23
 
2.0%
20
 
1.7%
19
 
1.6%
18
 
1.6%
Other values (231) 859
74.0%
Decimal Number
ValueCountFrequency (%)
3 5
62.5%
4 2
 
25.0%
5 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%
Other Punctuation
ValueCountFrequency (%)
· 3
75.0%
1
 
25.0%
Open Punctuation
ValueCountFrequency (%)
( 67
100.0%
Close Punctuation
ValueCountFrequency (%)
) 67
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1161 | 88.1%
Common | 148 | 11.2%
Latin | 9 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
50
 
4.3%
50
 
4.3%
36
 
3.1%
32
 
2.8%
28
 
2.4%
26
 
2.2%
23
 
2.0%
20
 
1.7%
19
 
1.6%
18
 
1.6%
Other values (231) 859
74.0%
Common
ValueCountFrequency (%)
( 67
45.3%
) 67
45.3%
3 5
 
3.4%
· 3
 
2.0%
4 2
 
1.4%
2
 
1.4%
1
 
0.7%
5 1
 
0.7%
Latin
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1161 | 88.1%
ASCII | 153 | 11.6%
None | 3 | 0.2%
Punctuation | 1 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 67
43.8%
) 67
43.8%
D 6
 
3.9%
3 5
 
3.3%
P 3
 
2.0%
4 2
 
1.3%
2
 
1.3%
5 1
 
0.7%
Hangul
ValueCountFrequency (%)
50
 
4.3%
50
 
4.3%
36
 
3.1%
32
 
2.8%
28
 
2.4%
26
 
2.2%
23
 
2.0%
20
 
1.7%
19
 
1.6%
18
 
1.6%
Other values (231) 859
74.0%
None
ValueCountFrequency (%)
· 3
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

한자 (station name in Hanja)
Text

Alerts: MISSING

Distinct: 254
Distinct (%): 88.2%
Missing: 10
Missing (%): 3.4%
Memory size: 2.5 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.4548611
Min length: 2

Characters and Unicode

Total characters: 1283
Distinct characters: 410
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 221
Unique (%): 76.7%
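
The ratios reported for 한자 appear to use the 288 non-missing values as their denominator rather than all 298 rows. That reading is an inference from the numbers, not something the report states. A quick check under that assumption:

```python
# Figures copied from the 한자 summary above.
n_rows = 298
n_missing = 10
n_present = n_rows - n_missing  # 288 non-missing values

distinct = 254
unique = 221
total_chars = 1283

# Missing (%) is relative to all rows.
missing_pct = 100 * n_missing / n_rows

# Assumption: the remaining ratios are relative to non-missing values only.
distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_chars / n_present

print(f"missing: {missing_pct:.1f}%")
print(f"distinct: {distinct_pct:.1f}%  unique: {unique_pct:.1f}%")
print(f"mean length: {mean_length:.7f}")
```

With 298 as the denominator the distinct share would come out near 85.2% instead of the reported 88.2%, which is what points to the non-missing count.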

Sample

1st row: 市廳
2nd row: 鐘閣
3rd row: 鍾路3街
4th row: 鍾路5街
5th row: 東大門

Value | Count | Frequency (%)
東大門歷史文化公園(ddp | 3 | 1.0%
市廳 | 2 | 0.7%
石村 | 2 | 0.7%
忠正路(京畿大入口 | 2 | 0.7%
綜合運動場 | 2 | 0.7%
孔德 | 2 | 0.7%
舍堂 | 2 | 0.7%
藥水 | 2 | 0.7%
泰陵入口 | 2 | 0.7%
까치山 | 2 | 0.7%
Other values (247) | 270 | 92.8%

Most occurring characters

ValueCountFrequency (%)
( 66
 
5.1%
) 66
 
5.1%
46
 
3.6%
27
 
2.1%
23
 
1.8%
22
 
1.7%
20
 
1.6%
20
 
1.6%
20
 
1.6%
15
 
1.2%
Other values (400) 958
74.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1127 | 87.8%
Open Punctuation | 66 | 5.1%
Close Punctuation | 66 | 5.1%
Uppercase Letter | 9 | 0.7%
Decimal Number | 8 | 0.6%
Space Separator | 4 | 0.3%
Other Punctuation | 2 | 0.2%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
46
 
4.1%
27
 
2.4%
23
 
2.0%
22
 
2.0%
20
 
1.8%
20
 
1.8%
20
 
1.8%
15
 
1.3%
15
 
1.3%
14
 
1.2%
Other values (389) 905
80.3%
Decimal Number
ValueCountFrequency (%)
3 5
62.5%
4 2
 
25.0%
5 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%
Space Separator
ValueCountFrequency (%)
2
50.0%
  2
50.0%
Open Punctuation
ValueCountFrequency (%)
( 66
100.0%
Close Punctuation
ValueCountFrequency (%)
) 66
100.0%
Other Punctuation
ValueCountFrequency (%)
· 2
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 985 | 76.8%
Common | 147 | 11.5%
Hangul | 142 | 11.1%
Latin | 9 | 0.7%

Most frequent character per script

Han
ValueCountFrequency (%)
46
 
4.7%
27
 
2.7%
23
 
2.3%
22
 
2.2%
20
 
2.0%
20
 
2.0%
20
 
2.0%
15
 
1.5%
15
 
1.5%
14
 
1.4%
Other values (319) 763
77.5%
Hangul
ValueCountFrequency (%)
8
 
5.6%
8
 
5.6%
7
 
4.9%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
4
 
2.8%
4
 
2.8%
Other values (60) 84
59.2%
Common
ValueCountFrequency (%)
( 66
44.9%
) 66
44.9%
3 5
 
3.4%
2
 
1.4%
  2
 
1.4%
· 2
 
1.4%
4 2
 
1.4%
5 1
 
0.7%
1
 
0.7%
Latin
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 951 | 74.1%
ASCII | 151 | 11.8%
Hangul | 142 | 11.1%
CJK Compat Ideographs | 34 | 2.7%
None | 4 | 0.3%
Math Operators | 1 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 66
43.7%
) 66
43.7%
D 6
 
4.0%
3 5
 
3.3%
P 3
 
2.0%
2
 
1.3%
4 2
 
1.3%
5 1
 
0.7%
CJK
ValueCountFrequency (%)
46
 
4.8%
27
 
2.8%
23
 
2.4%
22
 
2.3%
20
 
2.1%
20
 
2.1%
20
 
2.1%
15
 
1.6%
15
 
1.6%
14
 
1.5%
Other values (301) 729
76.7%
Hangul
ValueCountFrequency (%)
8
 
5.6%
8
 
5.6%
7
 
4.9%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
4
 
2.8%
4
 
2.8%
Other values (60) 84
59.2%
CJK Compat Ideographs
ValueCountFrequency (%)
6
17.6%
4
11.8%
3
 
8.8%
3
 
8.8%
2
 
5.9%
2
 
5.9%
2
 
5.9%
2
 
5.9%
1
 
2.9%
1
 
2.9%
Other values (8) 8
23.5%
None
ValueCountFrequency (%)
  2
50.0%
· 2
50.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

영문 (station name in English)
Text

Distinct: 263
Distinct (%): 88.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 59
Median length: 50
Mean length: 14.275168
Min length: 3

Characters and Unicode

Total characters: 4254
Distinct characters: 60
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 229
Unique (%): 76.8%

Sample

1st row: Seoul Station
2nd row: City Hall
3rd row: Jonggak
4th row: Jongno 3(sam)ga
5th row: Jongno 5(o)ga

Value | Count | Frequency (%)
univ | 24 | 4.7%
office | 22 | 4.3%
seoul | 11 | 2.2%
(blank) | 9 | 1.8%
national | 6 | 1.2%
center | 6 | 1.2%
dongdaemun | 5 | 1.0%
city | 5 | 1.0%
euljiro | 5 | 1.0%
terminal | 5 | 1.0%
Other values (325) | 413 | 80.8%

Most occurring characters

ValueCountFrequency (%)
n 454
 
10.7%
o 355
 
8.3%
a 348
 
8.2%
g 302
 
7.1%
e 298
 
7.0%
218
 
5.1%
i 203
 
4.8%
u 187
 
4.4%
s 121
 
2.8%
r 120
 
2.8%
Other values (50) 1648
38.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3256 | 76.5%
Uppercase Letter | 543 | 12.8%
Space Separator | 219 | 5.1%
Open Punctuation | 75 | 1.8%
Close Punctuation | 75 | 1.8%
Other Punctuation | 45 | 1.1%
Dash Punctuation | 29 | 0.7%
Decimal Number | 9 | 0.2%
Final Punctuation | 2 | < 0.1%
Initial Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 454
13.9%
o 355
10.9%
a 348
10.7%
g 302
 
9.3%
e 298
 
9.2%
i 203
 
6.2%
u 187
 
5.7%
s 121
 
3.7%
r 120
 
3.7%
m 111
 
3.4%
Other values (14) 757
23.2%
Uppercase Letter
ValueCountFrequency (%)
S 103
19.0%
G 48
 
8.8%
C 44
 
8.1%
D 40
 
7.4%
M 31
 
5.7%
U 31
 
5.7%
O 30
 
5.5%
H 30
 
5.5%
N 25
 
4.6%
P 24
 
4.4%
Other values (12) 137
25.2%
Decimal Number
ValueCountFrequency (%)
3 5
55.6%
4 2
 
22.2%
5 1
 
11.1%
1 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 26
57.8%
' 10
 
22.2%
& 9
 
20.0%
Space Separator
ValueCountFrequency (%)
218
99.5%
  1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 75
100.0%
Close Punctuation
ValueCountFrequency (%)
) 75
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 29
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3799 | 89.3%
Common | 455 | 10.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 454
 
12.0%
o 355
 
9.3%
a 348
 
9.2%
g 302
 
7.9%
e 298
 
7.8%
i 203
 
5.3%
u 187
 
4.9%
s 121
 
3.2%
r 120
 
3.2%
m 111
 
2.9%
Other values (36) 1300
34.2%
Common
ValueCountFrequency (%)
218
47.9%
( 75
 
16.5%
) 75
 
16.5%
- 29
 
6.4%
. 26
 
5.7%
' 10
 
2.2%
& 9
 
2.0%
3 5
 
1.1%
2
 
0.4%
4 2
 
0.4%
Other values (4) 4
 
0.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4250 | 99.9%
Punctuation | 3 | 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 454
 
10.7%
o 355
 
8.4%
a 348
 
8.2%
g 302
 
7.1%
e 298
 
7.0%
218
 
5.1%
i 203
 
4.8%
u 187
 
4.4%
s 121
 
2.8%
r 120
 
2.8%
Other values (47) 1644
38.7%
Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
None
ValueCountFrequency (%)
  1
100.0%

중국어 (station name in Chinese)
Text

Distinct: 261
Distinct (%): 87.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 4.4060403
Min length: 2

Characters and Unicode

Total characters: 1313
Distinct characters: 374
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 225
Unique (%): 75.5%

Sample

1st row: 首尔站
2nd row: 市厅
3rd row: 钟阁
4th row: 钟路三街
5th row: 钟路五街

Value | Count | Frequency (%)
东大门历史文化公园(ddp | 3 | 1.0%
药水 | 2 | 0.7%
忠正路(京畿大学 | 2 | 0.7%
忠武路 | 2 | 0.7%
市厅 | 2 | 0.7%
孔德 | 2 | 0.7%
舍堂 | 2 | 0.7%
石村 | 2 | 0.7%
大林(九老区厅 | 2 | 0.7%
首尔教育大学(法院·检察厅 | 2 | 0.7%
Other values (251) | 277 | 93.0%

Most occurring characters

ValueCountFrequency (%)
( 66
 
5.0%
) 66
 
5.0%
47
 
3.6%
32
 
2.4%
28
 
2.1%
25
 
1.9%
23
 
1.8%
22
 
1.7%
18
 
1.4%
15
 
1.1%
Other values (364) 971
74.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1168 | 89.0%
Open Punctuation | 66 | 5.0%
Close Punctuation | 66 | 5.0%
Uppercase Letter | 9 | 0.7%
Other Punctuation | 4 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
47
 
4.0%
32
 
2.7%
28
 
2.4%
25
 
2.1%
23
 
2.0%
22
 
1.9%
18
 
1.5%
15
 
1.3%
15
 
1.3%
14
 
1.2%
Other values (359) 929
79.5%
Uppercase Letter
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%
Open Punctuation
ValueCountFrequency (%)
( 66
100.0%
Close Punctuation
ValueCountFrequency (%)
) 66
100.0%
Other Punctuation
ValueCountFrequency (%)
· 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 1168 | 89.0%
Common | 136 | 10.4%
Latin | 9 | 0.7%

Most frequent character per script

Han
ValueCountFrequency (%)
47
 
4.0%
32
 
2.7%
28
 
2.4%
25
 
2.1%
23
 
2.0%
22
 
1.9%
18
 
1.5%
15
 
1.3%
15
 
1.3%
14
 
1.2%
Other values (359) 929
79.5%
Common
ValueCountFrequency (%)
( 66
48.5%
) 66
48.5%
· 4
 
2.9%
Latin
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 1155 | 88.0%
ASCII | 141 | 10.7%
CJK Compat Ideographs | 13 | 1.0%
None | 4 | 0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 66
46.8%
) 66
46.8%
D 6
 
4.3%
P 3
 
2.1%
CJK
ValueCountFrequency (%)
47
 
4.1%
32
 
2.8%
28
 
2.4%
25
 
2.2%
23
 
2.0%
22
 
1.9%
18
 
1.6%
15
 
1.3%
15
 
1.3%
14
 
1.2%
Other values (351) 916
79.3%
None
ValueCountFrequency (%)
· 4
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
4
30.8%
3
23.1%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%
1
 
7.7%

일본어 (station name in Japanese)
Text

Distinct: 261
Distinct (%): 87.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 22
Median length: 16
Mean length: 5.6845638
Min length: 2

Characters and Unicode

Total characters: 1694
Distinct characters: 88
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 226
Unique (%): 75.8%

Sample

1st row: ソウルヨク
2nd row: シチョン
3rd row: チョンガク
4th row: チョンノサムガ
5th row: チョンノオガ

Value | Count | Frequency (%)
チョンノサムガ | 3 | 1.0%
トンデムンヨクサムンファゴンウォン(ddp | 3 | 1.0%
チョング | 2 | 0.7%
ソウルヨク | 2 | 0.7%
テルンイック | 2 | 0.7%
ハプチョン | 2 | 0.7%
チャムシル | 2 | 0.7%
サムガクチ | 2 | 0.7%
チョンハブンドンジャン | 2 | 0.7%
コンドク | 2 | 0.7%
Other values (251) | 277 | 92.6%

Most occurring characters

ValueCountFrequency (%)
367
21.7%
93
 
5.5%
87
 
5.1%
72
 
4.3%
68
 
4.0%
58
 
3.4%
51
 
3.0%
43
 
2.5%
38
 
2.2%
34
 
2.0%
Other values (78) 783
46.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1642 | 96.9%
Other Punctuation | 12 | 0.7%
Open Punctuation | 10 | 0.6%
Close Punctuation | 10 | 0.6%
Uppercase Letter | 9 | 0.5%
Space Separator | 6 | 0.4%
Modifier Letter | 4 | 0.2%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
367
22.4%
93
 
5.7%
87
 
5.3%
72
 
4.4%
68
 
4.1%
58
 
3.5%
51
 
3.1%
43
 
2.6%
38
 
2.3%
34
 
2.1%
Other values (69) 731
44.5%
Uppercase Letter
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%
Space Separator
ValueCountFrequency (%)
  3
50.0%
3
50.0%
Other Punctuation
ValueCountFrequency (%)
12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10
100.0%
Modifier Letter
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Katakana | 1632 | 96.3%
Common | 43 | 2.5%
Han | 10 | 0.6%
Latin | 9 | 0.5%

Most frequent character per script

Katakana
ValueCountFrequency (%)
367
22.5%
93
 
5.7%
87
 
5.3%
72
 
4.4%
68
 
4.2%
58
 
3.6%
51
 
3.1%
43
 
2.6%
38
 
2.3%
34
 
2.1%
Other values (61) 721
44.2%
Han
ValueCountFrequency (%)
2
20.0%
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Common
ValueCountFrequency (%)
12
27.9%
( 10
23.3%
) 10
23.3%
4
 
9.3%
  3
 
7.0%
3
 
7.0%
1
 
2.3%
Latin
ValueCountFrequency (%)
D 6
66.7%
P 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Katakana | 1648 | 97.3%
ASCII | 32 | 1.9%
CJK | 10 | 0.6%
None | 3 | 0.2%
Math Operators | 1 | 0.1%

Most frequent character per block

Katakana
ValueCountFrequency (%)
367
22.3%
93
 
5.6%
87
 
5.3%
72
 
4.4%
68
 
4.1%
58
 
3.5%
51
 
3.1%
43
 
2.6%
38
 
2.3%
34
 
2.1%
Other values (63) 737
44.7%
ASCII
ValueCountFrequency (%)
( 10
31.2%
) 10
31.2%
D 6
18.8%
3
 
9.4%
P 3
 
9.4%
None
ValueCountFrequency (%)
  3
100.0%
CJK
ValueCountFrequency (%)
2
20.0%
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 호선
연번 | 1.000 | 0.946
호선 | 0.946 | 1.000

Variable | 연번 | 호선
연번 | 1.000 | 0.812
호선 | 0.812 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
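
Both views reduce to a present/missing flag per cell. The sketch below illustrates this on the first three sample rows of the dataset, with `None` standing in for `<NA>`. The `#`/`.` text rendering is an illustrative stand-in for the plots, not the profiler's own output.

```python
# First three sample rows from the report; None stands for <NA>.
columns = ["연번", "호선", "역명", "한자", "영문", "중국어", "일본어"]
rows = [
    (1, "1호선", "서울역", None, "Seoul Station", "首尔站", "ソウルヨク"),
    (2, "1호선", "시청", "市廳", "City Hall", "市厅", "シチョン"),
    (3, "1호선", "종각", "鐘閣", "Jonggak", "钟阁", "チョンガク"),
]

# Per-column count of present values: the quantity a nullity bar chart plots.
present = {
    col: sum(row[i] is not None for row in rows)
    for i, col in enumerate(columns)
}

# Nullity matrix: one line per record, '#' for present, '.' for missing.
matrix = ["".join("#" if v is not None else "." for v in row) for row in rows]

print(present)
print("\n".join(matrix))
```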

Sample

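First rows

Index | 연번 | 호선 | 역명 | 한자 | 영문 | 중국어 | 일본어
0 | 1 | 1호선 | 서울역 | <NA> | Seoul Station | 首尔站 | ソウルヨク
1 | 2 | 1호선 | 시청 | 市廳 | City Hall | 市厅 | シチョン
2 | 3 | 1호선 | 종각 | 鐘閣 | Jonggak | 钟阁 | チョンガク
3 | 4 | 1호선 | 종로3가 | 鍾路3街 | Jongno 3(sam)ga | 钟路三街 | チョンノサムガ
4 | 5 | 1호선 | 종로5가 | 鍾路5街 | Jongno 5(o)ga | 钟路五街 | チョンノオガ
5 | 6 | 1호선 | 동대문 | 東大門 | Dongdaemun | 东大门 | トンデムン
6 | 7 | 1호선 | 동묘앞 | 東廟앞 | Dongmyo | 东庙 | トンミョアプ
7 | 8 | 1호선 | 신설동 | 新設洞 | Sinseoldong | 新设洞 | シンソルトン
8 | 9 | 1호선 | 제기동 | 祭基洞 | Jegidong | 祭基洞 | チェギドン
9 | 10 | 1호선 | 청량리(서울시립대입구) | 淸凉里(서울市立大入口) | Cheongnyangni(University of Seoul) | 清凉里(首尔市立大学) | チョンニャンニ

Last rows

Index | 연번 | 호선 | 역명 | 한자 | 영문 | 중국어 | 일본어
288 | 289 | 9호선 | 봉은사 | 奉恩寺 | Bongeunsa | 奉恩寺 | ポンウンサ
289 | 290 | 9호선 | 종합운동장 | 綜合運動場 | Sports Complex | 综合运动场 | チョンハブンドンジャン
290 | 291 | 9호선 | 삼전 | 三田 | Samjeon | 三田 | サムジョン
291 | 292 | 9호선 | 석촌고분 | 石村古墳 | Seokchon Gobun | 石村古坟 | ソクチョンコブン
292 | 293 | 9호선 | 석촌 | 石村 | Seokchon | 石村 | ソクチョン
293 | 294 | 9호선 | 송파나루 | 松坡나루 | Songpanaru | 松坡渡口 | ソンパナル
294 | 295 | 9호선 | 한성백제 | 漢城百済 | Hanseong Baekje | 汉城百济 | ハンソンベクチェ
295 | 296 | 9호선 | 올림픽공원(한국체대) | 올림픽公園(韓國體大) | Olympic Park(Korea National Sport University) | 奥林匹克公园(韩国体育大学) | オリンピックゴンウォン
296 | 297 | 9호선 | 둔촌오륜 | 遁村五輪 | Dunchon Oryun | 遁村五轮 | トゥンチョノリュン
297 | 298 | 9호선 | 중앙보훈병원 | 中央報勲病院 | VHS Medical Center | 中央报勋医院 | チュンアンボフンビョンウォン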