Overview

Dataset statistics

Number of variables: 13
Number of observations: 400
Missing cells: 4
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.9 KiB
Average record size in memory: 107.3 B
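
The derived figures in this block follow from the raw counts. A minimal sketch, assuming the missing-cell percentage is taken over all rows × columns cells and that 1 KiB = 1024 bytes (the function names are illustrative, not part of the profiling tool):

```python
# Figures copied from the Dataset statistics block.
n_variables = 13
n_observations = 400
missing_cells = 4
total_size_kib = 41.9


def missing_cell_pct(missing: int, rows: int, cols: int) -> float:
    """Share of empty cells among all rows * cols cells, in percent."""
    return 100.0 * missing / (rows * cols)


def avg_record_size_bytes(total_kib: float, rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / rows


pct = missing_cell_pct(missing_cells, n_observations, n_variables)
rec = avg_record_size_bytes(total_size_kib, n_observations)
print(f"Missing cells: {pct:.1f}%")
print(f"Average record size: {rec:.1f} B")
```

Both results round to the values the report shows (0.1% and 107.3 B).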

Variable types

Text: 8
Numeric: 3
Categorical: 2

Alerts

"동해선" is highly overall correlated with 502202 and 2 other fields (High correlation)
"KORAIL" is highly overall correlated with 502202 and 2 other fields (High correlation)
502202 is highly overall correlated with 288265 and 2 other fields (High correlation)
288265 is highly overall correlated with 502202 and 2 other fields (High correlation)
"947" has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 06:33:39.791343
Analysis finished: 2023-12-10 06:33:43.503510
Duration: 3.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
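
When re-running the analysis, note that the variable names in this report ("947", "동해선", 502202, and so on) match the first record shown in the Sample section. This suggests the source file has no header row and its first record was consumed as column names. Every text variable also counts exactly 800 `"` characters (two per row), so the quotes appear to have been kept as data. The sketch below shows one way to load such a file with the standard `csv` module. The column names and the space-after-comma layout are assumptions, since the report does not describe the source file.

```python
import csv
import io

# Hypothetical column names; the real schema is not given in the report.
COLUMNS = ["station_id", "station_name", "x", "y",
           "code", "operator", "line", "value"]

# First eight fields of two records from the report's sample, written in an
# ASSUMED layout: quoted text with a space after each comma.
RAW = (
    '"947", "재송", 502202, 288265, "483141", "KORAIL", "동해선", 2810\n'
    '"645", "평촌", 308446, 532968, "135191", "KORAIL", "과천선", 33160\n'
)


def read_headerless(text: str, columns: list[str]) -> list[dict[str, str]]:
    """Parse CSV text that has no header row, attaching explicit names.

    skipinitialspace=True lets the reader recognise a quote that follows
    a space, so the quotes are stripped instead of kept as data.
    """
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(columns, row)) for row in reader]


rows = read_headerless(RAW, COLUMNS)
print(len(rows))
print(rows[0]["station_name"])
```

With explicit names the first record stays in the data, and the station name is read without surrounding quotes.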

Variables

"947"
Text

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.915
Min length: 3

Characters and Unicode

Total characters: 1966
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
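
As an illustration of these character properties, the general category of each character can be looked up with Python's standard `unicodedata` module. This is a sketch of the idea, not the profiling tool's own implementation. The two-letter codes correspond to the category names used in this report: `Nd` is Decimal Number, `Po` is Other Punctuation, and `Lo` is Other Letter.

```python
import unicodedata
from collections import Counter


def category_counts(values: list[str]) -> Counter:
    """Count Unicode general categories over every character in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Two values as they appear in the report's samples, quotes included.
counts = category_counts(['"645"', '"평촌"'])
print(counts)
```

The surrounding quotes are counted as Other Punctuation, which is why that category shows up for every text variable in this report.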

Unique

Unique: 400
Unique (%): 100.0%

Sample

1st row: "645"
2nd row: "533"
3rd row: "525"
4th row: "572"
5th row: "640"
ValueCountFrequency (%)
645 1
 
0.2%
925 1
 
0.2%
206 1
 
0.2%
898 1
 
0.2%
22 1
 
0.2%
983 1
 
0.2%
755 1
 
0.2%
834 1
 
0.2%
715 1
 
0.2%
698 1
 
0.2%
Other values (390) 390
97.5%

Most occurring characters

ValueCountFrequency (%)
" 800
40.7%
5 136
 
6.9%
3 136
 
6.9%
1 135
 
6.9%
2 131
 
6.7%
6 126
 
6.4%
8 117
 
6.0%
4 105
 
5.3%
9 101
 
5.1%
0 92
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1166
59.3%
Other Punctuation 800
40.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 136
11.7%
3 136
11.7%
1 135
11.6%
2 131
11.2%
6 126
10.8%
8 117
10.0%
4 105
9.0%
9 101
8.7%
0 92
7.9%
7 87
7.5%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1966
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
" 800
40.7%
5 136
 
6.9%
3 136
 
6.9%
1 135
 
6.9%
2 131
 
6.7%
6 126
 
6.4%
8 117
 
6.0%
4 105
 
5.3%
9 101
 
5.1%
0 92
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1966
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 800
40.7%
5 136
 
6.9%
3 136
 
6.9%
1 135
 
6.9%
2 131
 
6.7%
6 126
 
6.4%
8 117
 
6.0%
4 105
 
5.3%
9 101
 
5.1%
0 92
 
4.7%

"재송"
Text

Distinct: 380
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 14
Median length: 4
Mean length: 5
Min length: 4

Characters and Unicode

Total characters: 2000
Distinct characters: 264
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 361
Unique (%): 90.2%

Sample

1st row: "평촌"
2nd row: "명학"
3rd row: "구로"
4th row: "제물포"
5th row: "경마공원"
ValueCountFrequency (%)
청량리 3
 
0.8%
금곡 2
 
0.5%
제기동 2
 
0.5%
신도림 2
 
0.5%
옥수 2
 
0.5%
판교 2
 
0.5%
회기 2
 
0.5%
중앙로 2
 
0.5%
천안 2
 
0.5%
김포공항 2
 
0.5%
Other values (370) 379
94.8%

Most occurring characters

ValueCountFrequency (%)
" 800
40.0%
41
 
2.1%
28
 
1.4%
) 26
 
1.3%
( 26
 
1.3%
25
 
1.2%
24
 
1.2%
24
 
1.2%
23
 
1.1%
23
 
1.1%
Other values (254) 960
48.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1120
56.0%
Other Punctuation 801
40.1%
Decimal Number 27
 
1.4%
Close Punctuation 26
 
1.3%
Open Punctuation 26
 
1.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
41
 
3.7%
28
 
2.5%
25
 
2.2%
24
 
2.1%
24
 
2.1%
23
 
2.1%
23
 
2.1%
21
 
1.9%
19
 
1.7%
18
 
1.6%
Other values (242) 874
78.0%
Decimal Number
ValueCountFrequency (%)
5 7
25.9%
7 5
18.5%
6 4
14.8%
3 4
14.8%
1 3
11.1%
4 2
 
7.4%
8 1
 
3.7%
2 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
" 800
99.9%
. 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 26
100.0%
Open Punctuation
ValueCountFrequency (%)
( 26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1120
56.0%
Common 880
44.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
41
 
3.7%
28
 
2.5%
25
 
2.2%
24
 
2.1%
24
 
2.1%
23
 
2.1%
23
 
2.1%
21
 
1.9%
19
 
1.7%
18
 
1.6%
Other values (242) 874
78.0%
Common
ValueCountFrequency (%)
" 800
90.9%
) 26
 
3.0%
( 26
 
3.0%
5 7
 
0.8%
7 5
 
0.6%
6 4
 
0.5%
3 4
 
0.5%
1 3
 
0.3%
4 2
 
0.2%
. 1
 
0.1%
Other values (2) 2
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1120
56.0%
ASCII 880
44.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 800
90.9%
) 26
 
3.0%
( 26
 
3.0%
5 7
 
0.8%
7 5
 
0.6%
6 4
 
0.5%
3 4
 
0.5%
1 3
 
0.3%
4 2
 
0.2%
. 1
 
0.1%
Other values (2) 2
 
0.2%
Hangul
ValueCountFrequency (%)
41
 
3.7%
28
 
2.5%
25
 
2.2%
24
 
2.1%
24
 
2.1%
23
 
2.1%
23
 
2.1%
21
 
1.9%
19
 
1.7%
18
 
1.6%
Other values (242) 874
78.0%

502202
Real number (ℝ)

HIGH CORRELATION 

Distinct: 393
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 357166.37
Minimum: 265501
Maximum: 504776
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 265501
5-th percentile: 285100.7
Q1: 305054.25
Median: 316581.5
Q3: 449675
95-th percentile: 499092.4
Maximum: 504776
Range: 239275
Interquartile range (IQR): 144620.75

Descriptive statistics

Standard deviation: 77775.416
Coefficient of variation (CV): 0.21775683
Kurtosis: -0.85180409
Mean: 357166.37
Median Absolute Deviation (MAD): 16476
Skewness: 0.9629055
Sum: 1.4286655 × 10^8
Variance: 6.0490154 × 10^9
Monotonicity: Not monotonic
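
Several of the statistics above are derived from others: the range from the extremes, the IQR from the quartiles, the CV from the standard deviation and mean, and the variance from the standard deviation. The sketch below recomputes them from the values reported for this variable. The function name is illustrative, and small differences are expected because the inputs are already rounded.

```python
import math

# Values reported for variable 502202.
minimum, maximum = 265501, 504776
q1, q3 = 305054.25, 449675
mean, std = 357166.37, 77775.416


def spread_summary(minimum, maximum, q1, q3, mean, std):
    """Recompute the derived spread statistics from the basic ones."""
    return {
        "range": maximum - minimum,   # max - min
        "iqr": q3 - q1,               # Q3 - Q1
        "cv": std / mean,             # coefficient of variation
        "variance": std ** 2,         # square of the standard deviation
    }


summary = spread_summary(minimum, maximum, q1, q3, mean, std)
for name, value in summary.items():
    print(f"{name}: {value:.8g}")
```

The recomputed values agree with the reported Range, IQR, CV, and Variance to the precision shown in the report.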
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
316057 3
 
0.8%
324033 2
 
0.5%
499510 2
 
0.5%
320054 2
 
0.5%
314954 2
 
0.5%
317014 2
 
0.5%
308446 1
 
0.2%
303017 1
 
0.2%
487691 1
 
0.2%
502525 1
 
0.2%
Other values (383) 383
95.8%
ValueCountFrequency (%)
265501 1
0.2%
278914 1
0.2%
279251 1
0.2%
279738 1
0.2%
280124 1
0.2%
280728 1
0.2%
281190 1
0.2%
281412 1
0.2%
281703 1
0.2%
283032 1
0.2%
ValueCountFrequency (%)
504776 1
0.2%
503939 1
0.2%
503726 1
0.2%
503698 1
0.2%
503288 1
0.2%
502607 1
0.2%
502525 1
0.2%
501421 1
0.2%
501153 1
0.2%
501144 1
0.2%

288265
Real number (ℝ)

HIGH CORRELATION 

Distinct: 390
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 475329.36
Minimum: 274486
Maximum: 594434
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 274486
5-th percentile: 283963.25
Q1: 362358
Median: 542631
Q3: 550984
95-th percentile: 564821.25
Maximum: 594434
Range: 319948
Interquartile range (IQR): 188626

Descriptive statistics

Standard deviation: 109156.34
Coefficient of variation (CV): 0.2296436
Kurtosis: -0.98065352
Mean: 475329.36
Median Absolute Deviation (MAD): 12609
Skewness: -0.90400898
Sum: 1.9013174 × 10^8
Variance: 1.1915107 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
553467 3
 
0.8%
551941 2
 
0.5%
467926 2
 
0.5%
555608 2
 
0.5%
545067 2
 
0.5%
553316 2
 
0.5%
285371 2
 
0.5%
554588 2
 
0.5%
543668 2
 
0.5%
279191 1
 
0.2%
Other values (380) 380
95.0%
ValueCountFrequency (%)
274486 1
0.2%
275468 1
0.2%
276295 1
0.2%
277216 1
0.2%
278110 1
0.2%
278339 1
0.2%
278486 1
0.2%
278740 1
0.2%
278828 1
0.2%
279035 1
0.2%
ValueCountFrequency (%)
594434 1
0.2%
589217 1
0.2%
588167 1
0.2%
586980 1
0.2%
584254 1
0.2%
582716 1
0.2%
579995 1
0.2%
579584 1
0.2%
578195 1
0.2%
575058 1
0.2%

"483141"
Text

Distinct: 376
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.71
Min length: 2

Characters and Unicode

Total characters: 3084
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 360
Unique (%): 90.0%

Sample

1st row: "135191"
2nd row: "129835"
3rd row: "418762"
4th row: "95041"
5th row: "62433"
ValueCountFrequency (%)
9
 
2.2%
362986 3
 
0.8%
226528 2
 
0.5%
500346 2
 
0.5%
414095 2
 
0.5%
418558 2
 
0.5%
193938 2
 
0.5%
213546 2
 
0.5%
362028 2
 
0.5%
96292 2
 
0.5%
Other values (366) 372
93.0%

Most occurring characters

ValueCountFrequency (%)
" 800
25.9%
2 358
11.6%
4 274
 
8.9%
1 273
 
8.9%
3 229
 
7.4%
0 221
 
7.2%
5 217
 
7.0%
8 195
 
6.3%
9 192
 
6.2%
6 168
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2284
74.1%
Other Punctuation 800
 
25.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 358
15.7%
4 274
12.0%
1 273
12.0%
3 229
10.0%
0 221
9.7%
5 217
9.5%
8 195
8.5%
9 192
8.4%
6 168
7.4%
7 157
6.9%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3084
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
" 800
25.9%
2 358
11.6%
4 274
 
8.9%
1 273
 
8.9%
3 229
 
7.4%
0 221
 
7.2%
5 217
 
7.0%
8 195
 
6.3%
9 192
 
6.2%
6 168
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3084
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
" 800
25.9%
2 358
11.6%
4 274
 
8.9%
1 273
 
8.9%
3 229
 
7.4%
0 221
 
7.2%
5 217
 
7.0%
8 195
 
6.3%
9 192
 
6.2%
6 168
 
5.4%

"KORAIL"
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
"KORAIL": 122
"부산교통공사": 68
"서울메트로": 66
"서울특별시도시철도공사": 58
"대구광역시도시철도공사": 39
Other values (6): 47

Length

Max length: 13
Median length: 9
Mean length: 9.1725
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: "KORAIL"
2nd row: "KORAIL"
3rd row: "KORAIL"
4th row: "KORAIL"
5th row: "KORAIL"

Common Values

ValueCountFrequency (%)
"KORAIL" 122
30.5%
"부산교통공사" 68
17.0%
"서울메트로" 66
16.5%
"서울특별시도시철도공사" 58
14.5%
"대구광역시도시철도공사" 39
 
9.8%
"인천메트로" 15
 
3.8%
"네오트랜스" 9
 
2.2%
"대전광역시도시철도공사" 7
 
1.8%
"광주광역시도시철도공사" 7
 
1.8%
"의정부경전철" 5
 
1.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
korail 122
30.5%
부산교통공사 68
17.0%
서울메트로 66
16.5%
서울특별시도시철도공사 58
14.5%
대구광역시도시철도공사 39
 
9.8%
인천메트로 15
 
3.8%
네오트랜스 9
 
2.2%
대전광역시도시철도공사 7
 
1.8%
광주광역시도시철도공사 7
 
1.8%
의정부경전철 5
 
1.2%

"동해선"
Categorical

HIGH CORRELATION 

Distinct: 38
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
"서울2호선": 27
"서울5호선": 23
"서울7호선": 21
"부산2호선": 21
"부산1호선": 21
Other values (33): 287

Length

Max length: 13
Median length: 7
Mean length: 6.525
Min length: 5

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: "과천선"
2nd row: "경부선"
3rd row: "경부선"
4th row: "경인선"
5th row: "과천선"

Common Values

ValueCountFrequency (%)
"서울2호선" 27
 
6.8%
"서울5호선" 23
 
5.8%
"서울7호선" 21
 
5.2%
"부산2호선" 21
 
5.2%
"부산1호선" 21
 
5.2%
"대구1호선" 16
 
4.0%
"서울3호선" 16
 
4.0%
"분당선" 15
 
3.8%
"경원선" 14
 
3.5%
"경부선" 14
 
3.5%
Other values (28) 212
53.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
서울2호선 27
 
6.8%
서울5호선 23
 
5.8%
서울7호선 21
 
5.2%
부산1호선 21
 
5.2%
부산2호선 21
 
5.2%
대구1호선 16
 
4.0%
서울3호선 16
 
4.0%
분당선 15
 
3.8%
경부선 14
 
3.5%
경인선 14
 
3.5%
Other values (28) 212
53.0%

2810
Real number (ℝ)

Distinct: 393
Distinct (%): 99.2%
Missing: 4
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23736.758
Minimum: 99
Maximum: 243307
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 99
5-th percentile: 2149.5
Q1: 6290.75
Median: 14048.5
Q3: 28710.5
95-th percentile: 85870.75
Maximum: 243307
Range: 243208
Interquartile range (IQR): 22419.75

Descriptive statistics

Standard deviation: 30323.412
Coefficient of variation (CV): 1.2774875
Kurtosis: 13.906692
Mean: 23736.758
Median Absolute Deviation (MAD): 9028.5
Skewness: 3.249425
Sum: 9399756
Variance: 9.1950929 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16097 2
 
0.5%
1745 2
 
0.5%
5174 2
 
0.5%
8355 1
 
0.2%
1369 1
 
0.2%
8404 1
 
0.2%
262 1
 
0.2%
12550 1
 
0.2%
4561 1
 
0.2%
29505 1
 
0.2%
Other values (383) 383
95.8%
(Missing) 4
 
1.0%
ValueCountFrequency (%)
99 1
0.2%
262 1
0.2%
282 1
0.2%
433 1
0.2%
662 1
0.2%
839 1
0.2%
962 1
0.2%
978 1
0.2%
1090 1
0.2%
1332 1
0.2%
ValueCountFrequency (%)
243307 1
0.2%
199699 1
0.2%
181692 1
0.2%
162002 1
0.2%
143470 1
0.2%
136578 1
0.2%
135912 1
0.2%
122815 1
0.2%
121702 1
0.2%
118867 1
0.2%

"부산광역시해운대구재송동909-2"
Text

Distinct: 393
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 38
Median length: 27
Mean length: 20.7
Min length: 14

Characters and Unicode

Total characters: 8280
Distinct characters: 221
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 387
Unique (%): 96.8%

Sample

1st row: "경기도 안양시 동안구 부림동 1608"
2nd row: "경기도 안양시 만안구 안양8동 383-1"
3rd row: "서울 특별시 구로구 구로5동585-5"
4th row: "인천광역시 남구 도화동 450-39"
5th row: "경기도 과천시 과천동 646"
ValueCountFrequency (%)
서울특별시 145
 
8.9%
경기도 67
 
4.1%
부산광역시 45
 
2.8%
대구광역시 37
 
2.3%
인천광역시 25
 
1.5%
중구 18
 
1.1%
17
 
1.0%
강서구 16
 
1.0%
북구 15
 
0.9%
부산 15
 
0.9%
Other values (807) 1225
75.4%

Most occurring characters

ValueCountFrequency (%)
1226
 
14.8%
" 800
 
9.7%
420
 
5.1%
403
 
4.9%
372
 
4.5%
1 346
 
4.2%
2 226
 
2.7%
214
 
2.6%
- 194
 
2.3%
154
 
1.9%
Other values (211) 3925
47.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4490
54.2%
Decimal Number 1534
 
18.5%
Space Separator 1226
 
14.8%
Other Punctuation 800
 
9.7%
Dash Punctuation 194
 
2.3%
Open Punctuation 18
 
0.2%
Close Punctuation 18
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
420
 
9.4%
403
 
9.0%
372
 
8.3%
214
 
4.8%
154
 
3.4%
153
 
3.4%
151
 
3.4%
145
 
3.2%
130
 
2.9%
118
 
2.6%
Other values (196) 2230
49.7%
Decimal Number
ValueCountFrequency (%)
1 346
22.6%
2 226
14.7%
3 153
10.0%
5 139
9.1%
4 135
 
8.8%
7 124
 
8.1%
6 116
 
7.6%
0 109
 
7.1%
8 97
 
6.3%
9 89
 
5.8%
Space Separator
ValueCountFrequency (%)
1226
100.0%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 194
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4490
54.2%
Common 3790
45.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
420
 
9.4%
403
 
9.0%
372
 
8.3%
214
 
4.8%
154
 
3.4%
153
 
3.4%
151
 
3.4%
145
 
3.2%
130
 
2.9%
118
 
2.6%
Other values (196) 2230
49.7%
Common
ValueCountFrequency (%)
1226
32.3%
" 800
21.1%
1 346
 
9.1%
2 226
 
6.0%
- 194
 
5.1%
3 153
 
4.0%
5 139
 
3.7%
4 135
 
3.6%
7 124
 
3.3%
6 116
 
3.1%
Other values (5) 331
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4490
54.2%
ASCII 3790
45.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1226
32.3%
" 800
21.1%
1 346
 
9.1%
2 226
 
6.0%
- 194
 
5.1%
3 153
 
4.0%
5 139
 
3.7%
4 135
 
3.6%
7 124
 
3.3%
6 116
 
3.1%
Other values (5) 331
 
8.7%
Hangul
ValueCountFrequency (%)
420
 
9.4%
403
 
9.0%
372
 
8.3%
214
 
4.8%
154
 
3.4%
153
 
3.4%
151
 
3.4%
145
 
3.2%
130
 
2.9%
118
 
2.6%
Other values (196) 2230
49.7%

"2635010400009090002"
Text

Distinct: 387
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Characters and Unicode

Total characters: 8400
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 375
Unique (%): 93.8%

Sample

1st row: "4117310200016080000"
2nd row: "4117110100003830001"
3rd row: "1153010200005850005"
4th row: "2817710400004500014"
5th row: "4129010500006460000"
ValueCountFrequency (%)
1123010400005880001 3
 
0.8%
1123010300000520001 2
 
0.5%
1126010200001720003 2
 
0.5%
1123010900003170101 2
 
0.5%
4413110100000570001 2
 
0.5%
2647010200015150001 2
 
0.5%
1159010700011120000 2
 
0.5%
1144010400000250013 2
 
0.5%
2817710500001250000 2
 
0.5%
1153010100004600026 2
 
0.5%
Other values (377) 379
94.8%

Most occurring characters

ValueCountFrequency (%)
0 3642
43.4%
1 1430
 
17.0%
" 800
 
9.5%
2 607
 
7.2%
4 368
 
4.4%
3 332
 
4.0%
5 312
 
3.7%
6 289
 
3.4%
7 270
 
3.2%
8 205
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7600
90.5%
Other Punctuation 800
 
9.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3642
47.9%
1 1430
 
18.8%
2 607
 
8.0%
4 368
 
4.8%
3 332
 
4.4%
5 312
 
4.1%
6 289
 
3.8%
7 270
 
3.6%
8 205
 
2.7%
9 145
 
1.9%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8400
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3642
43.4%
1 1430
 
17.0%
" 800
 
9.5%
2 607
 
7.2%
4 368
 
4.4%
3 332
 
4.0%
5 312
 
3.7%
6 289
 
3.4%
7 270
 
3.2%
8 205
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3642
43.4%
1 1430
 
17.0%
" 800
 
9.5%
2 607
 
7.2%
4 368
 
4.4%
3 332
 
4.0%
5 312
 
3.7%
6 289
 
3.4%
7 270
 
3.2%
8 205
 
2.4%

"2635010400109090002000002"
Text

Distinct: 247
Distinct (%): 61.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 27
Median length: 27
Mean length: 17.875
Min length: 2

Characters and Unicode

Total characters: 7150
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 239
Unique (%): 59.8%

Sample

1st row: "4117310200116080000004377"
2nd row: ""
3rd row: "1153010200105890014000039"
4th row: ""
5th row: ""
ValueCountFrequency (%)
146
36.5%
1123010400105880001027819 3
 
0.8%
1159010700105880044000001 2
 
0.5%
1123010900103170101008795 2
 
0.5%
1126010200100530003016550 2
 
0.5%
4413110100100570001010739 2
 
0.5%
1153010100104600026000002 2
 
0.5%
2817010600101250000166865 2
 
0.5%
4115010200103260028000001 1
 
0.2%
4420041025103460007000002 1
 
0.2%
Other values (237) 237
59.2%

Most occurring characters

ValueCountFrequency (%)
0 2950
41.3%
1 1371
19.2%
" 800
 
11.2%
2 452
 
6.3%
4 287
 
4.0%
3 260
 
3.6%
5 250
 
3.5%
7 229
 
3.2%
6 228
 
3.2%
8 184
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6350
88.8%
Other Punctuation 800
 
11.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2950
46.5%
1 1371
21.6%
2 452
 
7.1%
4 287
 
4.5%
3 260
 
4.1%
5 250
 
3.9%
7 229
 
3.6%
6 228
 
3.6%
8 184
 
2.9%
9 139
 
2.2%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7150
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2950
41.3%
1 1371
19.2%
" 800
 
11.2%
2 452
 
6.3%
4 287
 
4.0%
3 260
 
3.6%
5 250
 
3.5%
7 229
 
3.2%
6 228
 
3.2%
8 184
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7150
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2950
41.3%
1 1371
19.2%
" 800
 
11.2%
2 452
 
6.3%
4 287
 
4.0%
3 260
 
3.6%
5 250
 
3.5%
7 229
 
3.2%
6 228
 
3.2%
8 184
 
2.6%

"부산광역시 해운대구 재송동 909-2번지"
Text

Distinct: 385
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 30
Median length: 27
Mean length: 22.525
Min length: 2

Characters and Unicode

Total characters: 9010
Distinct characters: 218
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 373
Unique (%): 93.2%

Sample

1st row: "경기도 안양시 동안구 관양동 1608번지"
2nd row: "경기도 안양시 만안구 안양동 383-1번지"
3rd row: "서울특별시 구로구 구로동 585-5번지"
4th row: "인천광역시 미추홀구 도화동 450-14번지"
5th row: "경기도 과천시 과천동 646번지"
ValueCountFrequency (%)
서울특별시 154
 
9.3%
경기도 81
 
4.9%
부산광역시 63
 
3.8%
대구광역시 38
 
2.3%
인천광역시 29
 
1.8%
중구 18
 
1.1%
강서구 16
 
1.0%
북구 16
 
1.0%
동구 13
 
0.8%
고양시 12
 
0.7%
Other values (776) 1209
73.3%

Most occurring characters

ValueCountFrequency (%)
1249
 
13.9%
" 800
 
8.9%
431
 
4.8%
400
 
4.4%
392
 
4.4%
391
 
4.3%
391
 
4.3%
1 367
 
4.1%
- 270
 
3.0%
215
 
2.4%
Other values (208) 4104
45.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5163
57.3%
Decimal Number 1528
 
17.0%
Space Separator 1249
 
13.9%
Other Punctuation 800
 
8.9%
Dash Punctuation 270
 
3.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
431
 
8.3%
400
 
7.7%
392
 
7.6%
391
 
7.6%
391
 
7.6%
215
 
4.2%
165
 
3.2%
156
 
3.0%
154
 
3.0%
154
 
3.0%
Other values (195) 2314
44.8%
Decimal Number
ValueCountFrequency (%)
1 367
24.0%
2 205
13.4%
3 143
 
9.4%
4 141
 
9.2%
5 140
 
9.2%
7 127
 
8.3%
6 116
 
7.6%
0 107
 
7.0%
8 98
 
6.4%
9 84
 
5.5%
Space Separator
ValueCountFrequency (%)
1249
100.0%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 270
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5163
57.3%
Common 3847
42.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
431
 
8.3%
400
 
7.7%
392
 
7.6%
391
 
7.6%
391
 
7.6%
215
 
4.2%
165
 
3.2%
156
 
3.0%
154
 
3.0%
154
 
3.0%
Other values (195) 2314
44.8%
Common
ValueCountFrequency (%)
1249
32.5%
" 800
20.8%
1 367
 
9.5%
- 270
 
7.0%
2 205
 
5.3%
3 143
 
3.7%
4 141
 
3.7%
5 140
 
3.6%
7 127
 
3.3%
6 116
 
3.0%
Other values (3) 289
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5163
57.3%
ASCII 3847
42.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1249
32.5%
" 800
20.8%
1 367
 
9.5%
- 270
 
7.0%
2 205
 
5.3%
3 143
 
3.7%
4 141
 
3.7%
5 140
 
3.6%
7 127
 
3.3%
6 116
 
3.0%
Other values (3) 289
 
7.5%
Hangul
ValueCountFrequency (%)
431
 
8.3%
400
 
7.7%
392
 
7.6%
391
 
7.6%
391
 
7.6%
215
 
4.2%
165
 
3.2%
156
 
3.0%
154
 
3.0%
154
 
3.0%
Other values (195) 2314
44.8%

"부산광역시 해운대구 해운대로 100"
Text

Distinct: 245
Distinct (%): 61.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 31
Median length: 26
Mean length: 14.0175
Min length: 2

Characters and Unicode

Total characters: 5607
Distinct characters: 205
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 238
Unique (%): 59.5%

Sample

1st row: "경기도 안양시 동안구 시민대로 300"
2nd row: ""
3rd row: "서울특별시 구로구 구로중앙로 174"
4th row: ""
5th row: ""
ValueCountFrequency (%)
150
 
11.7%
서울특별시 95
 
7.4%
지하 93
 
7.2%
경기도 55
 
4.3%
대구광역시 28
 
2.2%
부산광역시 27
 
2.1%
인천광역시 23
 
1.8%
강남구 11
 
0.9%
동구 8
 
0.6%
중구 8
 
0.6%
Other values (482) 788
61.3%

Most occurring characters

ValueCountFrequency (%)
886
 
15.8%
" 800
 
14.3%
252
 
4.5%
248
 
4.4%
247
 
4.4%
1 173
 
3.1%
2 127
 
2.3%
124
 
2.2%
112
 
2.0%
100
 
1.8%
Other values (195) 2538
45.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3116
55.6%
Space Separator 886
 
15.8%
Other Punctuation 800
 
14.3%
Decimal Number 787
 
14.0%
Dash Punctuation 18
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
252
 
8.1%
248
 
8.0%
247
 
7.9%
124
 
4.0%
112
 
3.6%
100
 
3.2%
99
 
3.2%
96
 
3.1%
96
 
3.1%
95
 
3.0%
Other values (182) 1647
52.9%
Decimal Number
ValueCountFrequency (%)
1 173
22.0%
2 127
16.1%
0 84
10.7%
3 73
9.3%
4 60
 
7.6%
5 59
 
7.5%
7 58
 
7.4%
6 57
 
7.2%
9 56
 
7.1%
8 40
 
5.1%
Space Separator
ValueCountFrequency (%)
886
100.0%
Other Punctuation
ValueCountFrequency (%)
" 800
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3116
55.6%
Common 2491
44.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
252
 
8.1%
248
 
8.0%
247
 
7.9%
124
 
4.0%
112
 
3.6%
100
 
3.2%
99
 
3.2%
96
 
3.1%
96
 
3.1%
95
 
3.0%
Other values (182) 1647
52.9%
Common
ValueCountFrequency (%)
886
35.6%
" 800
32.1%
1 173
 
6.9%
2 127
 
5.1%
0 84
 
3.4%
3 73
 
2.9%
4 60
 
2.4%
5 59
 
2.4%
7 58
 
2.3%
6 57
 
2.3%
Other values (3) 114
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3116
55.6%
ASCII 2491
44.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
886
35.6%
" 800
32.1%
1 173
 
6.9%
2 127
 
5.1%
0 84
 
3.4%
3 73
 
2.9%
4 60
 
2.4%
5 59
 
2.4%
7 58
 
2.3%
6 57
 
2.3%
Other values (3) 114
 
4.6%
Hangul
ValueCountFrequency (%)
252
 
8.1%
248
 
8.0%
247
 
7.9%
124
 
4.0%
112
 
3.6%
100
 
3.2%
99
 
3.2%
96
 
3.1%
96
 
3.1%
95
 
3.0%
Other values (182) 1647
52.9%

Interactions

[9 interaction plots; images not preserved]

Correlations

           502202   288265   "KORAIL"   "동해선"   2810
502202     1.000    0.808    0.856      0.946      0.110
288265     0.808    1.000    0.873      0.956      0.000
"KORAIL"   0.856    0.873    1.000      1.000      0.233
"동해선"   0.946    0.956    1.000      1.000      0.325
2810       0.110    0.000    0.233      0.325      1.000
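
The High correlation alerts in the Overview can be traced back to this first matrix. The sketch below flags every pair whose coefficient meets a cutoff. The cutoff of 0.5 is an assumption chosen for illustration (the report does not state the value it used), and the function name is hypothetical.

```python
LABELS = ["502202", "288265", '"KORAIL"', '"동해선"', "2810"]

# First correlation matrix from this section, rows and columns in LABELS order.
MATRIX = [
    [1.000, 0.808, 0.856, 0.946, 0.110],
    [0.808, 1.000, 0.873, 0.956, 0.000],
    [0.856, 0.873, 1.000, 1.000, 0.233],
    [0.946, 0.956, 1.000, 1.000, 0.325],
    [0.110, 0.000, 0.233, 0.325, 1.000],
]

THRESHOLD = 0.5  # ASSUMED cutoff, not taken from the report


def high_correlations(labels, matrix, threshold):
    """Map each variable to the other variables it is strongly correlated with."""
    flagged = {}
    for i, name in enumerate(labels):
        partners = [
            labels[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(LABELS, MATRIX, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

Under this assumed cutoff, four variables are each flagged with three partners, which is consistent with the "and 2 other fields" wording of the alerts, and 2810 is not flagged at all.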

           "동해선"   "KORAIL"
"동해선"   1.000      0.965
"KORAIL"   0.965      1.000

           502202   288265   2810     "KORAIL"   "동해선"
502202     1.000    -0.568   -0.290   0.633      0.716
288265     -0.568   1.000    0.272    0.649      0.738
2810       -0.290   0.272    1.000    0.101      0.114
"KORAIL"   0.633    0.649    0.101    1.000      0.965
"동해선"   0.716    0.738    0.114    0.965      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
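
Both views can be approximated in plain Python. This is a minimal sketch on made-up rows (the column names and values below are illustrative, not taken from the dataset): one function counts missing cells per column, the other renders a text nullity matrix with `#` for present and `.` for missing.

```python
def nullity_by_column(rows: list[dict]) -> dict[str, int]:
    """Count missing (None) cells per column."""
    counts: dict[str, int] = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


def nullity_matrix(rows: list[dict], columns: list[str]) -> list[str]:
    """One string per row: '#' where a value is present, '.' where missing."""
    return [
        "".join("#" if row[col] is not None else "." for col in columns)
        for row in rows
    ]


# Illustrative rows only; not records from the profiled dataset.
rows = [
    {"name": "a", "value": 10},
    {"name": "b", "value": None},
    {"name": "c", "value": 7},
]

counts = nullity_by_column(rows)
matrix = nullity_matrix(rows, ["name", "value"])
print(counts)
print("\n".join(matrix))
```

In this report the four missing cells all belong to the numeric variable 2810, so a matrix of this kind would show gaps in a single column only.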

Sample

, "947", "재송", 502202, 288265, "483141", "KORAIL", "동해선", 2810, "부산광역시해운대구재송동909-2", "2635010400009090002", "2635010400109090002000002", "부산광역시 해운대구 재송동 909-2번지", "부산광역시 해운대구 해운대로 100"
0, "645", "평촌", 308446, 532968, "135191", "KORAIL", "과천선", 33160, "경기도 안양시 동안구 부림동 1608", "4117310200016080000", "4117310200116080000004377", "경기도 안양시 동안구 관양동 1608번지", "경기도 안양시 동안구 시민대로 300"
1, "533", "명학", 305933, 531923, "129835", "KORAIL", "경부선", 19661, "경기도 안양시 만안구 안양8동 383-1", "4117110100003830001", "", "경기도 안양시 만안구 안양동 383-1번지", ""
2, "525", "구로", 301294, 545065, "418762", "KORAIL", "경부선", 28957, "서울 특별시 구로구 구로5동585-5", "1153010200005850005", "1153010200105890014000039", "서울특별시 구로구 구로동 585-5번지", "서울특별시 구로구 구로중앙로 174"
3, "572", "제물포", 281412, 541362, "95041", "KORAIL", "경인선", 22038, "인천광역시 남구 도화동 450-39", "2817710400004500014", "", "인천광역시 미추홀구 도화동 450-14번지", ""
4, "640", "경마공원", 312408, 538428, "62433", "KORAIL", "과천선", 13605, "경기도 과천시 과천동 646", "4129010500006460000", "", "경기도 과천시 과천동 646번지", ""
5, "559", "온수", 296206, 544014, "19181", "KORAIL", "경인선", 64204, "서울특별시 구로구 온수동 51-7", "1153011000000510007", "1153011000100510007001491", "서울특별시 구로구 온수동 51-7번지", "서울특별시 구로구 부일로 872"
6, "531", "관악", 303573, 535850, "133563", "KORAIL", "경부선", 16329, "경기도 안양시 만안구 석수1동 110-21", "4117110200001100021", "4117110200101100021013357", "경기도 안양시 만안구 석수동 110-21번지", "경기도 안양시 만안구 경수대로1273번길 46"
7, "830", "수원", 311577, 518762, "128117", "KORAIL", "분당선", 60012, "경기도 수원시 팔달구 매산로1가 18", "4111513400000180000", "4111513400100180000005527", "경기도 수원시 팔달구 매산로1가 18번지", "경기도 수원시 팔달구 덕영대로 924"
8, "614", "덕소", 330325, 554140, "413156", "KORAIL", "중앙선", 12571, "경기도 남양주시 와부읍 덕소리590-17", "4136025021005900017", "4136025021105900017018093", "경기도 남양주시 와부읍 덕소리 590-17번지", "경기도 남양주시 와부읍 덕소로 56"
9, "574", "동인천", 279251, 542355, "20658", "KORAIL", "경인선", 34088, "인천광역시 중구 인현동1-618", "2811013600000010618", "2811013600100010001134129", "인천광역시 중구 인현동 1-618번지", "인천광역시 중구 참외전로 125"

, "947", "재송", 502202, 288265, "483141", "KORAIL", "동해선", 2810, "부산광역시해운대구재송동909-2", "2635010400009090002", "2635010400109090002000002", "부산광역시 해운대구 재송동 909-2번지", "부산광역시 해운대구 해운대로 100"
390, "286", "금호", 313233, 550004, "197264", "서울메트로", "서울3호선", 16562, "서울특별시 성동구 금호동4가 1470", "1120011200014700000", "1120011200114700000000001", "서울특별시 성동구 금호동4가 1470번지", "서울특별시 성동구 동호로 지하 104"
391, "314", "성신여대입구", 313335, 554940, "218152", "서울메트로", "서울4호선", 47551, "서울특별시 성북구 동소문동5가 65-1", "1129010800000650001", "", "서울특별시 성북구 동소문동5가 65-1번지", ""
392, "308", "창동", 316204, 561641, "363925", "서울메트로", "서울4호선", 59015, "서울특별시 도봉구 창5동135", "1132010700001350000", "", "서울특별시 도봉구 창동 135번지", ""
393, "319", "충무로", 311309, 551471, "206785", "서울메트로", "서울4호선", 63219, "서울특별시 중구 충무로4가125", "1114013200001250000", "", "서울특별시 중구 충무로4가 125번지", ""
394, "256", "문래", 302492, 546770, "230288", "서울메트로", "서울2호선", 38263, "서울특별시 영등포구 문래동 3가 54", "1156012100000540000", "1156011900100030000023823", "서울특별시 영등포구 문래동3가 54번지", "서울특별시 영등포구 경인로94길 9-7"
395, "235", "강변", 320195, 548476, "211864", "서울메트로", "서울2호선", 91144, "서울특별시 광진구 구의3동 546-6", "1121510300005460006", "1121510300105460006000001", "서울특별시 광진구 구의동 546-6번지", "서울특별시 광진구 강변역로 53"
396, "348", "여의나루", 305864, 547739, "226342", "서울특별시도시철도공사", "서울5호선", 20648, "서울특별시 영등포구 여의도동 85-1", "1156011000000850001", "1156011000100860001000003", "서울특별시 영등포구 여의도동 85-1번지", "서울특별시 영등포구 여의동로 280"
397, "404", "버티고개", 312451, 549992, "207332", "서울특별시도시철도공사", "서울6호선", 4288, "서울특별시 중구 신당2동 432", "1114016200004320001", "", "서울특별시 중구 신당동 432-1번지", ""
398, "109", "당리", 488975, 278740, "246013", "부산교통공사", "부산1호선", 13958, "부산광역시 사하구 당리동 323", "2638010200003230000", "2638010200103230000000001", "부산광역시 사하구 당리동 323번지", "부산광역시 사하구 낙동대로 지하 405"
399, "236", "잠실나루", 320979, 546868, "152187", "서울메트로", "서울2호선", 32657, "서울특별시 송파구 신천동 1번지", "1171010200000010000", "1171010200100010000000001", "서울특별시 송파구 신천동 1번지", "서울특별시 송파구 송파대로 624"