Overview

Dataset statistics

Number of variables: 12
Number of observations: 377
Missing cells: 372
Missing cells (%): 8.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 35.8 KiB
Average record size in memory: 97.4 B
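
The missing-cell percentage above can be checked from the other figures: it is the number of missing cells divided by the total number of cells (variables × observations). A minimal sketch of that arithmetic follows; the function name `missing_cell_share` is illustrative and is not part of ydata-profiling.

```python
def missing_cell_share(n_missing: int, n_variables: int, n_observations: int) -> float:
    """Return the fraction of all cells that are missing."""
    total_cells = n_variables * n_observations
    return n_missing / total_cells


# Figures taken from the dataset statistics above: 372 missing cells,
# 12 variables, 377 observations (4524 cells in total).
share = missing_cell_share(372, 12, 377)
print(f"{share:.1%}")  # prints 8.2%
```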

Variable types

Text: 10
Categorical: 1
Numeric: 1

Alerts

317106 is highly overall correlated with "경원선 (High correlation)
"경원선 is highly overall correlated with 317106 (High correlation)
"401901" has 372 (98.7%) missing values (Missing)
"D00057" has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 06:30:11.044322
Analysis finished: 2023-12-10 06:30:12.796658
Duration: 1.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
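
The reported duration is the difference between the two timestamps above. A small sketch using only the Python standard library shows the calculation; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-10 06:30:11.044322")
finished = datetime.fromisoformat("2023-12-10 06:30:12.796658")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints 1.75 seconds
```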

Variables

"D00057"
Text

UNIQUE 

Distinct: 377
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 3016
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
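
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only, because the standard library does not expose script or block properties. The helper name `category_counts` is hypothetical, and this is not necessarily how the report computes its figures. The codes `Nd`, `Po` and `Lu` correspond to Decimal Number, Other Punctuation and Uppercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values shown for "D00057", including their literal quote marks.
sample = ['"D00058"', '"D00060"', '"D00064"', '"D00074"', '"D00076"']
counts = category_counts(sample)
print(counts)  # 25 decimal digits (Nd), 10 quote marks (Po), 5 uppercase letters (Lu)
```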

Unique

Unique: 377
Unique (%): 100.0%

Sample

1st row: "D00058"
2nd row: "D00060"
3rd row: "D00064"
4th row: "D00074"
5th row: "D00076"

Value | Count | Frequency (%)
d00058 | 1 | 0.3%
d00278 | 1 | 0.3%
d00223 | 1 | 0.3%
d00222 | 1 | 0.3%
d00220 | 1 | 0.3%
d00218 | 1 | 0.3%
d00294 | 1 | 0.3%
d00292 | 1 | 0.3%
d00290 | 1 | 0.3%
d00289 | 1 | 0.3%
Other values (367) | 367 | 97.3%

Most occurring characters

ValueCountFrequency (%)
0 926
30.7%
" 754
25.0%
D 377
12.5%
2 174
 
5.8%
1 173
 
5.7%
3 160
 
5.3%
8 78
 
2.6%
7 77
 
2.6%
6 77
 
2.6%
5 75
 
2.5%
Other values (2) 145
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1885
62.5%
Other Punctuation 754
 
25.0%
Uppercase Letter 377
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 926
49.1%
2 174
 
9.2%
1 173
 
9.2%
3 160
 
8.5%
8 78
 
4.1%
7 77
 
4.1%
6 77
 
4.1%
5 75
 
4.0%
4 75
 
4.0%
9 70
 
3.7%
Other Punctuation
ValueCountFrequency (%)
" 754
100.0%
Uppercase Letter
ValueCountFrequency (%)
D 377
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2639
87.5%
Latin 377
 
12.5%

Most frequent character per script

Common
ValueCountFrequency (%)
0 926
35.1%
" 754
28.6%
2 174
 
6.6%
1 173
 
6.6%
3 160
 
6.1%
8 78
 
3.0%
7 77
 
2.9%
6 77
 
2.9%
5 75
 
2.8%
4 75
 
2.8%
Latin
ValueCountFrequency (%)
D 377
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 926
30.7%
" 754
25.0%
D 377
12.5%
2 174
 
5.8%
1 173
 
5.7%
3 160
 
5.3%
8 78
 
2.6%
7 77
 
2.6%
6 77
 
2.6%
5 75
 
2.5%
Other values (2) 145
 
4.8%

"동두천"
Text

Distinct: 376
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 10
Median length: 4
Mean length: 4.2679045
Min length: 4

Characters and Unicode

Total characters: 1609
Distinct characters: 211
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 375
Unique (%): 99.5%

Sample

1st row: "임진강"
2nd row: "문산"
3rd row: "의정부"
4th row: "퇴계원"
5th row: "곡산"

Value | Count | Frequency (%)
서울 | 2 | 0.5%
진해 | 1 | 0.3%
점촌 | 1 | 0.3%
이하 | 1 | 0.3%
마사 | 1 | 0.3%
예천 | 1 | 0.3%
영주 | 1 | 0.3%
북천 | 1 | 0.3%
부산진 | 1 | 0.3%
범일 | 1 | 0.3%
Other values (366) | 366 | 97.1%

Most occurring characters

ValueCountFrequency (%)
" 754
46.9%
34
 
2.1%
29
 
1.8%
21
 
1.3%
20
 
1.2%
18
 
1.1%
18
 
1.1%
17
 
1.1%
16
 
1.0%
16
 
1.0%
Other values (201) 666
41.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 848
52.7%
Other Punctuation 754
46.9%
Uppercase Letter 3
 
0.2%
Open Punctuation 2
 
0.1%
Close Punctuation 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
34
 
4.0%
29
 
3.4%
21
 
2.5%
20
 
2.4%
18
 
2.1%
18
 
2.1%
17
 
2.0%
16
 
1.9%
16
 
1.9%
16
 
1.9%
Other values (195) 643
75.8%
Uppercase Letter
ValueCountFrequency (%)
T 1
33.3%
X 1
33.3%
K 1
33.3%
Other Punctuation
ValueCountFrequency (%)
" 754
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 848
52.7%
Common 758
47.1%
Latin 3
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
34
 
4.0%
29
 
3.4%
21
 
2.5%
20
 
2.4%
18
 
2.1%
18
 
2.1%
17
 
2.0%
16
 
1.9%
16
 
1.9%
16
 
1.9%
Other values (195) 643
75.8%
Common
ValueCountFrequency (%)
" 754
99.5%
( 2
 
0.3%
) 2
 
0.3%
Latin
ValueCountFrequency (%)
T 1
33.3%
X 1
33.3%
K 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 848
52.7%
ASCII 761
47.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 754
99.1%
( 2
 
0.3%
) 2
 
0.3%
T 1
 
0.1%
X 1
 
0.1%
K 1
 
0.1%
Hangul
ValueCountFrequency (%)
34
 
4.0%
29
 
3.4%
21
 
2.5%
20
 
2.4%
18
 
2.1%
18
 
2.1%
17
 
2.0%
16
 
1.9%
16
 
1.9%
16
 
1.9%
Other values (195) 643
75.8%

592104
Text

Distinct: 372
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.7214854
Min length: 6

Characters and Unicode

Total characters: 2911
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 367
Unique (%): 97.3%

Sample

1st row: "174010"
2nd row: "402201"
3rd row: "283840"
4th row: "145941"
5th row: "289341"

Value | Count | Frequency (%)
509434 | 2 | 0.5%
445738 | 2 | 0.5%
323236 | 2 | 0.5%
509132 | 2 | 0.5%
232860 | 2 | 0.5%
514953 | 1 | 0.3%
443369 | 1 | 0.3%
331452 | 1 | 0.3%
168504 | 1 | 0.3%
81664 | 1 | 0.3%
Other values (362) | 362 | 96.0%

Most occurring characters

ValueCountFrequency (%)
" 744
25.6%
3 305
10.5%
1 281
 
9.7%
4 254
 
8.7%
2 238
 
8.2%
0 199
 
6.8%
6 195
 
6.7%
5 185
 
6.4%
8 174
 
6.0%
7 169
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2167
74.4%
Other Punctuation 744
 
25.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 305
14.1%
1 281
13.0%
4 254
11.7%
2 238
11.0%
0 199
9.2%
6 195
9.0%
5 185
8.5%
8 174
8.0%
7 169
7.8%
9 167
7.7%
Other Punctuation
ValueCountFrequency (%)
" 744
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2911
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
" 744
25.6%
3 305
10.5%
1 281
 
9.7%
4 254
 
8.7%
2 238
 
8.2%
0 199
 
6.8%
6 195
 
6.7%
5 185
 
6.4%
8 174
 
6.0%
7 169
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2911
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 744
25.6%
3 305
10.5%
1 281
 
9.7%
4 254
 
8.7%
2 238
 
8.2%
0 199
 
6.8%
6 195
 
6.7%
5 185
 
6.4%
8 174
 
6.0%
7 169
 
5.8%

"401901"
Text

MISSING 

Distinct: 5
Distinct (%): 100.0%
Missing: 372
Missing (%): 98.7%
Memory size: 3.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.8
Min length: 7

Characters and Unicode

Total characters: 39
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: "28425"
2nd row: "302468"
3rd row: "302596"
4th row: "515654"
5th row: "400827"

Value | Count | Frequency (%)
28425 | 1 | 20.0%
302468 | 1 | 20.0%
302596 | 1 | 20.0%
515654 | 1 | 20.0%
400827 | 1 | 20.0%

Most occurring characters

ValueCountFrequency (%)
" 10
25.6%
2 5
12.8%
5 5
12.8%
4 4
 
10.3%
0 4
 
10.3%
8 3
 
7.7%
6 3
 
7.7%
3 2
 
5.1%
9 1
 
2.6%
1 1
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29
74.4%
Other Punctuation 10
 
25.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 5
17.2%
5 5
17.2%
4 4
13.8%
0 4
13.8%
8 3
10.3%
6 3
10.3%
3 2
 
6.9%
9 1
 
3.4%
1 1
 
3.4%
7 1
 
3.4%
Other Punctuation
ValueCountFrequency (%)
" 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 39
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
" 10
25.6%
2 5
12.8%
5 5
12.8%
4 4
 
10.3%
0 4
 
10.3%
8 3
 
7.7%
6 3
 
7.7%
3 2
 
5.1%
9 1
 
2.6%
1 1
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 10
25.6%
2 5
12.8%
5 5
12.8%
4 4
 
10.3%
0 4
 
10.3%
8 3
 
7.7%
6 3
 
7.7%
3 2
 
5.1%
9 1
 
2.6%
1 1
 
2.6%

"경원선
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

"경부선 | 55
"중앙선 | 50
"호남선 | 35
"경전선 | 33
"동해남부선 | 23
Other values (23) | 181

Length

Max length: 8
Median length: 4
Mean length: 4.1671088
Min length: 4

Unique

Unique: 5
Unique (%): 1.3%

Sample

1st row: "경의선
2nd row: "경의선
3rd row: "경원선
4th row: "경춘선
5th row: "경의선

Common Values

Value | Count | Frequency (%)
"경부선 | 55 | 14.6%
"중앙선 | 50 | 13.3%
"호남선 | 35 | 9.3%
"경전선 | 33 | 8.8%
"동해남부선 | 23 | 6.1%
"장항선 | 22 | 5.8%
"전라선 | 22 | 5.8%
"영동선 | 20 | 5.3%
"경춘선 | 15 | 4.0%
"충북선 | 14 | 3.7%
Other values (18) | 88 | 23.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
경부선 | 55 | 14.6%
중앙선 | 50 | 13.2%
호남선 | 35 | 9.3%
경전선 | 33 | 8.7%
동해남부선 | 23 | 6.1%
장항선 | 22 | 5.8%
전라선 | 22 | 5.8%
영동선 | 20 | 5.3%
경춘선 | 15 | 4.0%
충북선 | 14 | 3.7%
Other values (18) | 89 | 23.5%

"경기도 동두천시 평화로 2687(동두천동)( 경기도 동두천동 245-210 동두천역)"
Text

Distinct: 376
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 48
Median length: 42
Mean length: 24.787798
Min length: 4

Characters and Unicode

Total characters: 9345
Distinct characters: 280
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 375
Unique (%): 99.5%

Sample

1st row: "경기 파주시 문산읍 임진각로 115"
2nd row: "경기도 파주시 문산읍 문산역로 94"
3rd row: "경기도 의정부시 평화로 525번지"
4th row: "경기도 남양주시 퇴계원면 경춘북로 545 (퇴계원리 218-8)"
5th row: "경기도 고양시 일산동구 경의로 160 (백석동)"

Value | Count | Frequency (%)
경북 | 49 | 2.4%
경기도 | 37 | 1.8%
강원도 | 36 | 1.8%
경상북도 | 26 | 1.3%
전북 | 26 | 1.3%
충북 | 23 | 1.1%
전남 | 23 | 1.1%
충남 | 22 | 1.1%
경기 | 18 | 0.9%
봉화군 | 17 | 0.8%
Other values (1192) | 1724 | 86.2%

Most occurring characters

ValueCountFrequency (%)
1633
 
17.5%
" 748
 
8.0%
1 323
 
3.5%
268
 
2.9%
244
 
2.6%
2 200
 
2.1%
200
 
2.1%
3 169
 
1.8%
165
 
1.8%
162
 
1.7%
Other values (270) 5233
56.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5027
53.8%
Space Separator 1633
 
17.5%
Decimal Number 1543
 
16.5%
Other Punctuation 748
 
8.0%
Dash Punctuation 151
 
1.6%
Open Punctuation 122
 
1.3%
Close Punctuation 121
 
1.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
268
 
5.3%
244
 
4.9%
200
 
4.0%
165
 
3.3%
162
 
3.2%
152
 
3.0%
144
 
2.9%
130
 
2.6%
129
 
2.6%
121
 
2.4%
Other values (255) 3312
65.9%
Decimal Number
ValueCountFrequency (%)
1 323
20.9%
2 200
13.0%
3 169
11.0%
5 141
9.1%
4 135
8.7%
6 130
8.4%
8 118
 
7.6%
0 116
 
7.5%
7 109
 
7.1%
9 102
 
6.6%
Space Separator
ValueCountFrequency (%)
1633
100.0%
Other Punctuation
ValueCountFrequency (%)
" 748
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 151
100.0%
Open Punctuation
ValueCountFrequency (%)
( 122
100.0%
Close Punctuation
ValueCountFrequency (%)
) 121
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5027
53.8%
Common 4318
46.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
268
 
5.3%
244
 
4.9%
200
 
4.0%
165
 
3.3%
162
 
3.2%
152
 
3.0%
144
 
2.9%
130
 
2.6%
129
 
2.6%
121
 
2.4%
Other values (255) 3312
65.9%
Common
ValueCountFrequency (%)
1633
37.8%
" 748
17.3%
1 323
 
7.5%
2 200
 
4.6%
3 169
 
3.9%
- 151
 
3.5%
5 141
 
3.3%
4 135
 
3.1%
6 130
 
3.0%
( 122
 
2.8%
Other values (5) 566
 
13.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5027
53.8%
ASCII 4318
46.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1633
37.8%
" 748
17.3%
1 323
 
7.5%
2 200
 
4.6%
3 169
 
3.9%
- 151
 
3.5%
5 141
 
3.3%
4 135
 
3.1%
6 130
 
3.0%
( 122
 
2.8%
Other values (5) 566
 
13.1%
Hangul
ValueCountFrequency (%)
268
 
5.3%
244
 
4.9%
200
 
4.0%
165
 
3.3%
162
 
3.2%
152
 
3.0%
144
 
2.9%
130
 
2.6%
129
 
2.6%
121
 
2.4%
Other values (255) 3312
65.9%

"4125010700002450210"
Text

Distinct: 375
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 31
Median length: 21
Mean length: 21.015915
Min length: 8

Characters and Unicode

Total characters: 7923
Distinct characters: 43
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 373
Unique (%): 98.9%

Sample

1st row: "4148025027012530003"
2nd row: "4148025021000170014"
3rd row: "4115010100001680054"
4th row: "4136037021002180142"
5th row: "4128510600011850001"

Value | Count | Frequency (%)
 | 3 | 0.8%
경기도 | 3 | 0.8%
1117010700000430205 | 2 | 0.5%
전곡리 | 2 | 0.5%
연천군 | 2 | 0.5%
전곡읍 | 2 | 0.5%
4711325049001370001 | 2 | 0.5%
2920011900002120002 | 1 | 0.3%
2671025330004820000 | 1 | 0.3%
4721010400002570000 | 1 | 0.3%
Other values (377) | 377 | 95.2%

Most occurring characters

ValueCountFrequency (%)
0 2917
36.8%
1 886
 
11.2%
2 756
 
9.5%
" 750
 
9.5%
4 594
 
7.5%
3 587
 
7.4%
5 349
 
4.4%
7 316
 
4.0%
8 249
 
3.1%
6 248
 
3.1%
Other values (33) 271
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7084
89.4%
Other Punctuation 750
 
9.5%
Other Letter 57
 
0.7%
Space Separator 22
 
0.3%
Close Punctuation 4
 
0.1%
Open Punctuation 3
 
< 0.1%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
8.8%
5
 
8.8%
5
 
8.8%
4
 
7.0%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
2
 
3.5%
Other values (18) 21
36.8%
Decimal Number
ValueCountFrequency (%)
0 2917
41.2%
1 886
 
12.5%
2 756
 
10.7%
4 594
 
8.4%
3 587
 
8.3%
5 349
 
4.9%
7 316
 
4.5%
8 249
 
3.5%
6 248
 
3.5%
9 182
 
2.6%
Other Punctuation
ValueCountFrequency (%)
" 750
100.0%
Space Separator
ValueCountFrequency (%)
22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7866
99.3%
Hangul 57
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
8.8%
5
 
8.8%
5
 
8.8%
4
 
7.0%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
2
 
3.5%
Other values (18) 21
36.8%
Common
ValueCountFrequency (%)
0 2917
37.1%
1 886
 
11.3%
2 756
 
9.6%
" 750
 
9.5%
4 594
 
7.6%
3 587
 
7.5%
5 349
 
4.4%
7 316
 
4.0%
8 249
 
3.2%
6 248
 
3.2%
Other values (5) 214
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7866
99.3%
Hangul 57
 
0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2917
37.1%
1 886
 
11.3%
2 756
 
9.6%
" 750
 
9.5%
4 594
 
7.6%
3 587
 
7.5%
5 349
 
4.4%
7 316
 
4.0%
8 249
 
3.2%
6 248
 
3.2%
Other values (5) 214
 
2.7%
Hangul
ValueCountFrequency (%)
5
 
8.8%
5
 
8.8%
5
 
8.8%
4
 
7.0%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
3
 
5.3%
2
 
3.5%
Other values (18) 21
36.8%

"4125010700102450210002100"
Text

Distinct: 365
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 27
Median length: 27
Mean length: 26.190981
Min length: 2

Characters and Unicode

Total characters: 9874
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 362
Unique (%): 96.0%

Sample

1st row: "4148025027112510002016856"
2nd row: "4148025021100170550011681"
3rd row: "4115010100101680054021634"
4th row: "4136037021102180142006174"
5th row: "4128510600111850001007542"

Value | Count | Frequency (%)
 | 11 | 2.9%
4711325049101370001000003 | 2 | 0.5%
1117010700100430205024398 | 2 | 0.5%
4519035025100810001016537 | 1 | 0.3%
4148025027112510002016856 | 1 | 0.3%
4717031029103800013009717 | 1 | 0.3%
4717032025100100004030424 | 1 | 0.3%
4790025026100790003040896 | 1 | 0.3%
4721010400103490001012992 | 1 | 0.3%
4885039023105220005000001 | 1 | 0.3%
Other values (355) | 355 | 94.2%

Most occurring characters

ValueCountFrequency (%)
0 3388
34.3%
1 1459
14.8%
2 910
 
9.2%
" 754
 
7.6%
3 722
 
7.3%
4 714
 
7.2%
5 472
 
4.8%
7 435
 
4.4%
6 358
 
3.6%
8 354
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9120
92.4%
Other Punctuation 754
 
7.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3388
37.1%
1 1459
16.0%
2 910
 
10.0%
3 722
 
7.9%
4 714
 
7.8%
5 472
 
5.2%
7 435
 
4.8%
6 358
 
3.9%
8 354
 
3.9%
9 308
 
3.4%
Other Punctuation
ValueCountFrequency (%)
" 754
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9874
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3388
34.3%
1 1459
14.8%
2 910
 
9.2%
" 754
 
7.6%
3 722
 
7.3%
4 714
 
7.2%
5 472
 
4.8%
7 435
 
4.4%
6 358
 
3.6%
8 354
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9874
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3388
34.3%
1 1459
14.8%
2 910
 
9.2%
" 754
 
7.6%
3 722
 
7.3%
4 714
 
7.2%
5 472
 
4.8%
7 435
 
4.4%
6 358
 
3.6%
8 354
 
3.6%

"경기도 동두천시 동두천동 245-210번지"
Text

Distinct: 375
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 32
Median length: 30
Mean length: 24.687003
Min length: 18

Characters and Unicode

Total characters: 9307
Distinct characters: 252
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 373
Unique (%): 98.9%

Sample

1st row: "경기도 파주시 문산읍 마정리 1253-3번지"
2nd row: "경기도 파주시 문산읍 문산리 17-14번지"
3rd row: "경기도 의정부시 의정부동 168-54번지"
4th row: "경기도 남양주시 퇴계원면 퇴계원리 218-142번지"
5th row: "경기도 고양시 일산동구 백석동 1185-1번지"

Value | Count | Frequency (%)
경상북도 | 76 | 4.3%
경기도 | 51 | 2.9%
강원도 | 42 | 2.4%
전라남도 | 33 | 1.9%
전라북도 | 28 | 1.6%
충청남도 | 28 | 1.6%
경상남도 | 27 | 1.5%
충청북도 | 27 | 1.5%
부산광역시 | 16 | 0.9%
서울특별시 | 13 | 0.7%
Other values (1037) | 1425 | 80.7%

Most occurring characters

ValueCountFrequency (%)
1389
 
14.9%
" 754
 
8.1%
390
 
4.2%
372
 
4.0%
1 354
 
3.8%
326
 
3.5%
- 286
 
3.1%
263
 
2.8%
244
 
2.6%
2 216
 
2.3%
Other values (242) 4713
50.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5274
56.7%
Decimal Number 1604
 
17.2%
Space Separator 1389
 
14.9%
Other Punctuation 754
 
8.1%
Dash Punctuation 286
 
3.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
390
 
7.4%
372
 
7.1%
326
 
6.2%
263
 
5.0%
244
 
4.6%
192
 
3.6%
174
 
3.3%
155
 
2.9%
152
 
2.9%
129
 
2.4%
Other values (229) 2877
54.6%
Decimal Number
ValueCountFrequency (%)
1 354
22.1%
2 216
13.5%
0 169
10.5%
3 161
10.0%
4 152
9.5%
5 132
 
8.2%
6 120
 
7.5%
9 104
 
6.5%
8 102
 
6.4%
7 94
 
5.9%
Space Separator
ValueCountFrequency (%)
1389
100.0%
Other Punctuation
ValueCountFrequency (%)
" 754
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 286
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5274
56.7%
Common 4033
43.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
390
 
7.4%
372
 
7.1%
326
 
6.2%
263
 
5.0%
244
 
4.6%
192
 
3.6%
174
 
3.3%
155
 
2.9%
152
 
2.9%
129
 
2.4%
Other values (229) 2877
54.6%
Common
ValueCountFrequency (%)
1389
34.4%
" 754
18.7%
1 354
 
8.8%
- 286
 
7.1%
2 216
 
5.4%
0 169
 
4.2%
3 161
 
4.0%
4 152
 
3.8%
5 132
 
3.3%
6 120
 
3.0%
Other values (3) 300
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5274
56.7%
ASCII 4033
43.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1389
34.4%
" 754
18.7%
1 354
 
8.8%
- 286
 
7.1%
2 216
 
5.4%
0 169
 
4.2%
3 161
 
4.0%
4 152
 
3.8%
5 132
 
3.3%
6 120
 
3.0%
Other values (3) 300
 
7.4%
Hangul
ValueCountFrequency (%)
390
 
7.4%
372
 
7.1%
326
 
6.2%
263
 
5.0%
244
 
4.6%
192
 
3.6%
174
 
3.3%
155
 
2.9%
152
 
2.9%
129
 
2.4%
Other values (229) 2877
54.6%

"경기도 동두천시 평화로 2687"
Text

Distinct: 363
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 32
Median length: 29
Mean length: 21.079576
Min length: 2

Characters and Unicode

Total characters: 7947
Distinct characters: 267
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 360
Unique (%): 95.5%

Sample

1st row: "경기도 파주시 문산읍 임진각로 115"
2nd row: "경기도 파주시 문산읍 문산역로 94"
3rd row: "경기도 의정부시 평화로 525"
4th row: "경기도 남양주시 퇴계원면 경춘북로 545"
5th row: "경기도 고양시 일산동구 경의로 160"

Value | Count | Frequency (%)
경상북도 | 74 | 4.3%
경기도 | 54 | 3.1%
강원도 | 40 | 2.3%
전라남도 | 31 | 1.8%
충청남도 | 27 | 1.6%
충청북도 | 27 | 1.6%
전라북도 | 26 | 1.5%
경상남도 | 25 | 1.4%
부산광역시 | 16 | 0.9%
 | 13 | 0.7%
Other values (930) | 1406 | 80.9%

Most occurring characters

ValueCountFrequency (%)
1362
 
17.1%
" 754
 
9.5%
318
 
4.0%
261
 
3.3%
254
 
3.2%
1 236
 
3.0%
184
 
2.3%
151
 
1.9%
146
 
1.8%
2 142
 
1.8%
Other values (257) 4139
52.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4625
58.2%
Space Separator 1362
 
17.1%
Decimal Number 1127
 
14.2%
Other Punctuation 754
 
9.5%
Dash Punctuation 79
 
1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
318
 
6.9%
261
 
5.6%
254
 
5.5%
184
 
4.0%
151
 
3.3%
146
 
3.2%
128
 
2.8%
124
 
2.7%
115
 
2.5%
113
 
2.4%
Other values (244) 2831
61.2%
Decimal Number
ValueCountFrequency (%)
1 236
20.9%
2 142
12.6%
3 134
11.9%
5 104
9.2%
6 94
 
8.3%
4 87
 
7.7%
7 85
 
7.5%
0 85
 
7.5%
8 82
 
7.3%
9 78
 
6.9%
Space Separator
ValueCountFrequency (%)
1362
100.0%
Other Punctuation
ValueCountFrequency (%)
" 754
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 79
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4625
58.2%
Common 3322
41.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
318
 
6.9%
261
 
5.6%
254
 
5.5%
184
 
4.0%
151
 
3.3%
146
 
3.2%
128
 
2.8%
124
 
2.7%
115
 
2.5%
113
 
2.4%
Other values (244) 2831
61.2%
Common
ValueCountFrequency (%)
1362
41.0%
" 754
22.7%
1 236
 
7.1%
2 142
 
4.3%
3 134
 
4.0%
5 104
 
3.1%
6 94
 
2.8%
4 87
 
2.6%
7 85
 
2.6%
0 85
 
2.6%
Other values (3) 239
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4625
58.2%
ASCII 3322
41.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1362
41.0%
" 754
22.7%
1 236
 
7.1%
2 142
 
4.3%
3 134
 
4.0%
5 104
 
3.1%
6 94
 
2.8%
4 87
 
2.6%
7 85
 
2.6%
0 85
 
2.6%
Other values (3) 239
 
7.2%
Hangul
ValueCountFrequency (%)
318
 
6.9%
261
 
5.6%
254
 
5.5%
184
 
4.0%
151
 
3.3%
146
 
3.2%
128
 
2.8%
124
 
2.7%
115
 
2.5%
113
 
2.4%
Other values (244) 2831
61.2%
Distinct: 375
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Length

Max length: 23
Median length: 6
Mean length: 6.2015915
Min length: 6

Characters and Unicode

Total characters: 2338
Distinct characters: 40
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 373
Unique (%): 98.9%

Sample

1st row: 289980
2nd row: 293497
3rd row: 316100
4th row: 324693
5th row: 294424

Value | Count | Frequency (%)
경기도 | 3 | 0.8%
연천군 | 3 | 0.8%
309342 | 2 | 0.5%
전곡읍 | 2 | 0.5%
521072 | 2 | 0.5%
338591 | 1 | 0.3%
460385 | 1 | 0.3%
418465 | 1 | 0.3%
466387 | 1 | 0.3%
465684 | 1 | 0.3%
Other values (379) | 379 | 95.7%

Most occurring characters

ValueCountFrequency (%)
3 330
14.1%
4 303
13.0%
2 259
11.1%
9 214
9.2%
5 212
9.1%
6 196
8.4%
1 191
8.2%
8 187
8.0%
0 179
7.7%
7 175
7.5%
Other values (30) 92
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2246
96.1%
Other Letter 62
 
2.7%
Space Separator 19
 
0.8%
Other Punctuation 10
 
0.4%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6
 
9.7%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (17) 23
37.1%
Decimal Number
ValueCountFrequency (%)
3 330
14.7%
4 303
13.5%
2 259
11.5%
9 214
9.5%
5 212
9.4%
6 196
8.7%
1 191
8.5%
8 187
8.3%
0 179
8.0%
7 175
7.8%
Space Separator
ValueCountFrequency (%)
19
100.0%
Other Punctuation
ValueCountFrequency (%)
" 10
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2276
97.3%
Hangul 62
 
2.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6
 
9.7%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (17) 23
37.1%
Common
ValueCountFrequency (%)
3 330
14.5%
4 303
13.3%
2 259
11.4%
9 214
9.4%
5 212
9.3%
6 196
8.6%
1 191
8.4%
8 187
8.2%
0 179
7.9%
7 175
7.7%
Other values (3) 30
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2276
97.3%
Hangul 62
 
2.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 330
14.5%
4 303
13.3%
2 259
11.4%
9 214
9.4%
5 212
9.3%
6 196
8.6%
1 191
8.4%
8 187
8.2%
0 179
7.9%
7 175
7.7%
Other values (3) 30
 
1.3%
Hangul
ValueCountFrequency (%)
6
 
9.7%
5
 
8.1%
5
 
8.1%
4
 
6.5%
4
 
6.5%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
3
 
4.8%
Other values (17) 23
37.1%

317106
Real number (ℝ)

HIGH CORRELATION 

Distinct: 375
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 419105.7
Minimum: 239462
Maximum: 623663
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 239462
5-th percentile: 268358.2
Q1: 321683
Median: 413703
Q3: 514133
95-th percentile: 567420.8
Maximum: 623663
Range: 384201
Interquartile range (IQR): 192450

Descriptive statistics

Standard deviation: 104183.82
Coefficient of variation (CV): 0.24858604
Kurtosis: -1.3034323
Mean: 419105.7
Median Absolute Deviation (MAD): 97124
Skewness: 0.008328072
Sum: 1.5800285 × 10^8
Variance: 1.0854269 × 10^10
Monotonicity: Not monotonic
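
Several of the figures above are derived from one another. The following sketch recomputes the range, IQR, coefficient of variation, variance and sum from the reported values, as a consistency check only. The inputs are the rounded numbers printed in this report, so the results agree to within rounding rather than exactly.

```python
import math

# Values copied from the statistics reported for variable 317106.
mean = 419105.7
std = 104183.82
q1, q3 = 321683, 514133
minimum, maximum = 239462, 623663
n_observations = 377

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n_observations    # Sum is mean times count (no missing values)

print(value_range, iqr)
print(f"{cv:.6f}", f"{variance:.4e}", f"{total:.4e}")
```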
Histogram with fixed size bins (bins=50)
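
To make "fixed size bins" concrete, the sketch below computes the edges of 50 equal-width bins over the reported minimum and maximum. It assumes the bins span exactly that range, which the caption suggests but does not state; the function name `fixed_width_bin_edges` is illustrative.

```python
def fixed_width_bin_edges(minimum, maximum, bins):
    """Return bins + 1 equally spaced edges from minimum to maximum."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


# Minimum and maximum of variable 317106, with bins=50 as in the caption.
edges = fixed_width_bin_edges(239462, 623663, 50)
bin_width = edges[1] - edges[0]
print(len(edges), round(bin_width, 2))
```

Under this assumption each bin is about 7684.02 units wide, that is, the range 384201 divided by 50.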
Value | Count | Frequency (%)
550785 | 2 | 0.5%
386539 | 2 | 0.5%
588065 | 1 | 0.3%
440956 | 1 | 0.3%
443989 | 1 | 0.3%
445968 | 1 | 0.3%
450138 | 1 | 0.3%
449663 | 1 | 0.3%
467952 | 1 | 0.3%
279180 | 1 | 0.3%
Other values (365) | 365 | 96.8%

Value | Count | Frequency (%)
239462 | 1 | 0.3%
240674 | 1 | 0.3%
241360 | 1 | 0.3%
242385 | 1 | 0.3%
242720 | 1 | 0.3%
244637 | 1 | 0.3%
245340 | 1 | 0.3%
246025 | 1 | 0.3%
248021 | 1 | 0.3%
251059 | 1 | 0.3%

Value | Count | Frequency (%)
623663 | 1 | 0.3%
620578 | 1 | 0.3%
598023 | 1 | 0.3%
594333 | 1 | 0.3%
588065 | 1 | 0.3%
586932 | 1 | 0.3%
584641 | 1 | 0.3%
584245 | 1 | 0.3%
582172 | 1 | 0.3%
580999 | 1 | 0.3%

Interactions


Correlations

 | "401901" | "경원선 | 317106
"401901" | 1.000 | 1.000 | 1.000
"경원선 | 1.000 | 1.000 | 0.857
317106 | 1.000 | 0.857 | 1.000

 | 317106 | "경원선
317106 | 1.000 | 0.509
"경원선 | 0.509 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
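
Both displays summarise which cells are missing in each column. A minimal sketch of the underlying count, using plain Python dictionaries with `None` standing in for `<NA>`, might look like this. The helper `nullity_by_column` is hypothetical, and the two rows reuse only two columns of the first sample rows for brevity.

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column for a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Two columns from the first two sample rows; "401901" is <NA> in both.
rows = [
    {'"D00057"': '"D00058"', '"401901"': None},
    {'"D00057"': '"D00060"', '"401901"': None},
]
nullity = nullity_by_column(rows)
print(nullity)
```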

Sample

 | "D00057" | "동두천" | 592104 | "401901" | "경원선 | "경기도 동두천시 평화로 2687(동두천동)( 경기도 동두천동 245-210 동두천역)" | "4125010700002450210" | "4125010700102450210002100" | "경기도 동두천시 동두천동 245-210번지" | "경기도 동두천시 평화로 2687" |  | 317106
0 | "D00058" | "임진강" | "174010" | <NA> | "경의선 | "경기 파주시 문산읍 임진각로 115" | "4148025027012530003" | "4148025027112510002016856" | "경기도 파주시 문산읍 마정리 1253-3번지" | "경기도 파주시 문산읍 임진각로 115" | 289980 | 588065
1 | "D00060" | "문산" | "402201" | <NA> | "경의선 | "경기도 파주시 문산읍 문산역로 94" | "4148025021000170014" | "4148025021100170550011681" | "경기도 파주시 문산읍 문산리 17-14번지" | "경기도 파주시 문산읍 문산역로 94" | 293497 | 584245
2 | "D00064" | "의정부" | "283840" | <NA> | "경원선 | "경기도 의정부시 평화로 525번지" | "4115010100001680054" | "4115010100101680054021634" | "경기도 의정부시 의정부동 168-54번지" | "경기도 의정부시 평화로 525" | 316100 | 571043
3 | "D00074" | "퇴계원" | "145941" | <NA> | "경춘선 | "경기도 남양주시 퇴계원면 경춘북로 545 (퇴계원리 218-8)" | "4136037021002180142" | "4136037021102180142006174" | "경기도 남양주시 퇴계원면 퇴계원리 218-142번지" | "경기도 남양주시 퇴계원면 경춘북로 545" | 324693 | 560998
4 | "D00076" | "곡산" | "289341" | <NA> | "경의선 | "경기도 고양시 일산동구 경의로 160 (백석동)" | "4128510600011850001" | "4128510600111850001007542" | "경기도 고양시 일산동구 백석동 1185-1번지" | "경기도 고양시 일산동구 경의로 160" | 294424 | 561029
5 | "D00077" | "능곡" | "291347" | <NA> | "경의선 | "경기도 고양시 덕양구 토당로 35(토당동 454-3)" | "4128112000004540003" | "4128112000104540003016817" | "경기도 고양시 덕양구 토당동 454-3번지" | "경기도 고양시 덕양구 토당로 35" | 296077 | 558015
6 | "D00078" | "금곡" | "146491" | <NA> | "경춘선 | "경기도 남양주시 금곡로 19번길 47 (금곡동404-276) " | "4136010300004040276" | "4136010300104040276000001" | "경기도 남양주시 금곡동 404-276번지" | "경기도 남양주시 금곡로19번길 47" | 330289 | 559749
7 | "D00082" | "도농" | "146978" | <NA> | "중앙선 | "경기 남양주시 경춘로 433(도농동 56-1) 도농역사" | "4136011200040560001" | "4136011000100560001015990" | "경기도 남양주시 다산동 4056-1번지" | "경기도 남양주시 경춘로 433" | 326132 | 556611
8 | "D00085" | "가좌" | "215765" | <NA> | "경의선 | "서울특별시 서대문구 수색로 27(남가좌동)" | "1141012000002930064" | "1141012000102930064008452" | "서울특별시 서대문구 남가좌동 293-64번지" | "서울특별시 서대문구 수색로 27" | 304361 | 552400
9 | "D00088" | "아신" | "303791" | <NA> | "중앙선 | "경기 양평군 옥천면 아신리 359-3" | "4183034024003590003" | "4183034024201120002013880" | "경기도 양평군 옥천면 아신리 359-3번지" | "경기도 양평군 옥천면 아신역1길 23" | 350963 | 545868

 | "D00057" | "동두천" | 592104 | "401901" | "경원선 | "경기도 동두천시 평화로 2687(동두천동)( 경기도 동두천동 245-210 동두천역)" | "4125010700002450210" | "4125010700102450210002100" | "경기도 동두천시 동두천동 245-210번지" | "경기도 동두천시 평화로 2687" |  | 317106
367 | "D00219" | "문수" | "330703" | <NA> | "중앙선 | "경상북도 영주시 문수면 문수로 1360" | "4721033023007310010" | "4721033023107170004035467" | "경상북도 영주시 문수면 적동리 731-10번지" | "경상북도 영주시 문수면 문수로 1360" | 456452 | 463152
368 | "D00389" | "공항화물청사" | "509434" | <NA> | "공항철도 | "인천광역시 중구 공항동로 135번길 86 공항화물청사역" | "2811014700021640000" | "2811014700128440002223116" | "인천광역시 중구 운서동 2164번지" | "인천광역시 중구 공항동로135번길 86" | 265495 | 540744
369 | "D00336" | "관촌" | "51759" | <NA> | "전라선 | "전북 임실군 관촌면 춘향로 3326-1" | "4575038024003530015" | "4575038024103530015118490" | "전라북도 임실군 관촌면 병암리 353-15번지" | "전라북도 임실군 관촌면 춘향로 3326-1" | 334703 | 340495
370 | "D00337" | "임실" | "54337" | <NA> | "전라선 | "전북 임실군 임실읍 춘향로 2932" | "4575025024000130035" | "4575025024100130000122124" | "전라북도 임실군 임실읍 두곡리 13-35번지" | "전라북도 임실군 임실읍 춘향로 2932" | 335829 | 337289
371 | "D00342" | "산성" | "50450" | <NA> | "전라선 | "전북 남원시 내척동 294-11" | "4519011400002940011" | "4519011400102940011300612" | "전라북도 남원시 내척동 294-11번지" | "전라북도 남원시 내척길 179" | 342364 | 315844
372 | "D00343" | "주생" | "50749" | <NA> | "전라선 | "전북 남원시 주생면 요천로 779-9" | "4519034026001210002" | "4519034026101190003019032" | "전라북도 남원시 주생면 제천리 121-2번지" | "전라북도 남원시 주생면 요천로 779-9" | 338172 | 308805
373 | "D00348" | "극락강" | "481642" | <NA> | "광주선 | "광주광역시 광산구 목련로310-23(신가동 212-2)" | "2920011900002120002" | "2920011900102120002004091" | "광주광역시 광산구 신가동 212-2번지" | "광주광역시 광산구 목련로 310-23" | 293702 | 287021
374 | "D00371" | "덕양" | "113731" | <NA> | "전라선 | "전라남도 여수시 소라면 여순로 56" | "4613031021014890006" | "4613031021114890006000001" | "전라남도 여수시 소라면 덕양리 1489-6번지" | "전라남도 여수시 소라면 여순로 56" | 366111 | 246025
375 | "D00379" | "오수" | "3262" | <NA> | "전라선 | "전북 임실군 오수면 충효로 1967-19" | "4575035528005650001" | "4575035528105650001127040" | "전라북도 임실군 오수면 대명리 565-1번지" | "전라북도 임실군 오수면 충효로 1967-19" | 338591 | 327261
376 | "D00383" | "검안" | "105833" | <NA> | "공항철도 | "인천광역시 서구 검바위로 26 검암역" | "2826010300004140204" | "2826010300104140204026696" | "인천광역시 서구 검암동 414-204번지" | "인천광역시 서구 검바위로 26" | 283027 | 552703