Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 7390
Missing cells (%): 6.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 966.8 KiB
Average record size in memory: 99.0 B
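
The two derived figures above can be reproduced from the raw counts. The sketch below is only a sanity check; it assumes the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes. The variable names are illustrative.

```python
# Values copied from the dataset statistics above.
n_variables = 11
n_observations = 10_000
missing_cells = 7_390
total_memory_kib = 966.8

# Assumption: the percentage is taken over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```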

Variable types

Categorical: 2
Text: 6
Numeric: 3

Alerts

[High correlation] 소재지우편번호 is highly overall correlated with WGS84위도 and 1 other field
[High correlation] WGS84위도 is highly overall correlated with 소재지우편번호 and 1 other field
[High correlation] WGS84경도 is highly overall correlated with 시군명
[High correlation] 시군명 is highly overall correlated with 소재지우편번호 and 2 other fields
[Imbalance] 교습과정명 is highly imbalanced (61.6%)
[Missing] 구/읍면동명 has 368 (3.7%) missing values
[Missing] 전화번호 has 6436 (64.4%) missing values
[Missing] 소재지도로명주소 has 368 (3.7%) missing values
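
The report does not state how the imbalance percentage is derived. One common definition, assumed here, is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes, so that 0 means perfectly balanced and values near 1 mean one class dominates. The function name and the toy counts below are hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Hypothetical toy counts, not taken from the dataset.
balanced = imbalance_score([25, 25, 25, 25])
skewed = imbalance_score([97, 1, 1, 1])
print(balanced, skewed)
```

Under this definition a column such as 교습과정명, where three of the 39 categories cover more than 90% of the rows, would be expected to score well above zero.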

Reproduction

Analysis started: 2023-12-10 22:23:01.596361
Analysis finished: 2023-12-10 22:23:05.171230
Duration: 3.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
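
As a quick check, the reported duration is simply the difference between the two timestamps above. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 22:23:01.596361")
finished = datetime.fromisoformat("2023-12-10 22:23:05.171230")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```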

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
수원시 | 1240
고양시 | 1068
성남시 | 866
용인시 | 825
화성시 | 668
Other values (29) | 5333

Length

Max length: 14
Median length: 3
Mean length: 3.0899
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 화성시
2nd row: 의왕시
3rd row: 화성시
4th row: 화성시
5th row: 부천시

Common Values

Value | Count | Frequency (%)
수원시 | 1240 | 12.4%
고양시 | 1068 | 10.7%
성남시 | 866 | 8.7%
용인시 | 825 | 8.2%
화성시 | 668 | 6.7%
안양시 | 630 | 6.3%
남양주시 | 561 | 5.6%
부천시 | 479 | 4.8%
의정부시 | 299 | 3.0%
시흥시 | 288 | 2.9%
Other values (24) | 3076 | 30.8%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
수원시 | 1240 | 12.4%
고양시 | 1068 | 10.7%
성남시 | 866 | 8.7%
용인시 | 825 | 8.2%
화성시 | 668 | 6.7%
안양시 | 630 | 6.3%
남양주시 | 561 | 5.6%
부천시 | 479 | 4.8%
의정부시 | 299 | 3.0%
시흥시 | 288 | 2.9%
Other values (24) | 3077 | 30.8%

구/읍면동명
Text

MISSING 

Distinct: 314
Distinct (%): 3.3%
Missing: 368
Missing (%): 3.7%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0316653
Min length: 2

Characters and Unicode

Total characters: 29201
Distinct characters: 191
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 0.4%

Sample

1st row: 오산동
2nd row: 내손동
3rd row: 장지동
4th row: 청계동
5th row: 옥길동

Value | Count | Frequency (%)
분당구 | 656 | 6.8%
동안구 | 445 | 4.6%
일산서구 | 397 | 4.1%
덕양구 | 373 | 3.9%
기흥구 | 343 | 3.6%
수지구 | 342 | 3.6%
권선구 | 321 | 3.3%
영통구 | 317 | 3.3%
일산동구 | 290 | 3.0%
장안구 | 278 | 2.9%
Other values (304) | 5870 | 60.9%

Most occurring characters

ValueCountFrequency (%)
5094
 
17.4%
4674
 
16.0%
1264
 
4.3%
1022
 
3.5%
755
 
2.6%
745
 
2.6%
656
 
2.2%
595
 
2.0%
592
 
2.0%
526
 
1.8%
Other values (181) 13278
45.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29201
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5094
 
17.4%
4674
 
16.0%
1264
 
4.3%
1022
 
3.5%
755
 
2.6%
745
 
2.6%
656
 
2.2%
595
 
2.0%
592
 
2.0%
526
 
1.8%
Other values (181) 13278
45.5%

Most occurring scripts

ValueCountFrequency (%)
Hangul 29201
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5094
 
17.4%
4674
 
16.0%
1264
 
4.3%
1022
 
3.5%
755
 
2.6%
745
 
2.6%
656
 
2.2%
595
 
2.0%
592
 
2.0%
526
 
1.8%
Other values (181) 13278
45.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 29201
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5094
 
17.4%
4674
 
16.0%
1264
 
4.3%
1022
 
3.5%
755
 
2.6%
745
 
2.6%
656
 
2.2%
595
 
2.0%
592
 
2.0%
526
 
1.8%
Other values (181) 13278
45.5%

시설명
Text

Distinct: 8426
Distinct (%): 84.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 42
Median length: 36
Mean length: 9.6337
Min length: 3

Characters and Unicode

Total characters: 96337
Distinct characters: 898
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7409
Unique (%): 74.1%

Sample

1st row: 뉴늘봄피아노음악교습소
2nd row: 스마트해법영어벨포레점영어교습소
3rd row: 늘봄피아노음악교습소
4th row: 라움음악교습소
5th row: 솔루니옥길논술교습소

Value | Count | Frequency (%)
교습소 | 55 | 0.5%
영어 | 14 | 0.1%
영어교습소 | 10 | 0.1%
스마트해법수학교습소 | 10 | 0.1%
리드인독서논술교습소 | 9 | 0.1%
erc영어독서클럽영어교습소 | 8 | 0.1%
유니팝미술교습소 | 8 | 0.1%
채움수학교습소 | 8 | 0.1%
탑수학교습소 | 8 | 0.1%
수학교습소 | 8 | 0.1%
Other values (8512) | 10112 | 98.7%

Most occurring characters

ValueCountFrequency (%)
9924
 
10.3%
9867
 
10.2%
9675
 
10.0%
2967
 
3.1%
2718
 
2.8%
2667
 
2.8%
2549
 
2.6%
2451
 
2.5%
2400
 
2.5%
2139
 
2.2%
Other values (888) 48980
50.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 92934
96.5%
Uppercase Letter 1282
 
1.3%
Lowercase Letter 1026
 
1.1%
Space Separator 260
 
0.3%
Close Punctuation 249
 
0.3%
Open Punctuation 249
 
0.3%
Decimal Number 218
 
0.2%
Other Punctuation 105
 
0.1%
Dash Punctuation 7
 
< 0.1%
Math Symbol 6
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9924
 
10.7%
9867
 
10.6%
9675
 
10.4%
2967
 
3.2%
2718
 
2.9%
2667
 
2.9%
2549
 
2.7%
2451
 
2.6%
2400
 
2.6%
2139
 
2.3%
Other values (812) 45577
49.0%
Uppercase Letter
ValueCountFrequency (%)
E 152
 
11.9%
S 123
 
9.6%
M 96
 
7.5%
T 81
 
6.3%
A 69
 
5.4%
L 69
 
5.4%
J 69
 
5.4%
C 61
 
4.8%
Y 59
 
4.6%
G 54
 
4.2%
Other values (16) 449
35.0%
Lowercase Letter
ValueCountFrequency (%)
e 140
13.6%
n 110
10.7%
i 109
10.6%
s 104
10.1%
o 81
7.9%
a 81
7.9%
l 64
 
6.2%
h 62
 
6.0%
t 42
 
4.1%
g 40
 
3.9%
Other values (13) 193
18.8%
Other Punctuation
ValueCountFrequency (%)
& 48
45.7%
. 17
 
16.2%
; 16
 
15.2%
, 12
 
11.4%
% 2
 
1.9%
' 2
 
1.9%
" 2
 
1.9%
· 2
 
1.9%
: 2
 
1.9%
? 1
 
1.0%
Decimal Number
ValueCountFrequency (%)
1 62
28.4%
3 46
21.1%
2 42
19.3%
0 36
16.5%
5 10
 
4.6%
7 8
 
3.7%
4 7
 
3.2%
8 4
 
1.8%
9 2
 
0.9%
6 1
 
0.5%
Space Separator
ValueCountFrequency (%)
260
100.0%
Close Punctuation
ValueCountFrequency (%)
) 249
100.0%
Open Punctuation
ValueCountFrequency (%)
( 249
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7
100.0%
Math Symbol
ValueCountFrequency (%)
+ 6
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 92929
96.5%
Latin 2308
 
2.4%
Common 1095
 
1.1%
Han 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9924
 
10.7%
9867
 
10.6%
9675
 
10.4%
2967
 
3.2%
2718
 
2.9%
2667
 
2.9%
2549
 
2.7%
2451
 
2.6%
2400
 
2.6%
2139
 
2.3%
Other values (808) 45572
49.0%
Latin
ValueCountFrequency (%)
E 152
 
6.6%
e 140
 
6.1%
S 123
 
5.3%
n 110
 
4.8%
i 109
 
4.7%
s 104
 
4.5%
M 96
 
4.2%
o 81
 
3.5%
T 81
 
3.5%
a 81
 
3.5%
Other values (39) 1231
53.3%
Common
ValueCountFrequency (%)
260
23.7%
) 249
22.7%
( 249
22.7%
1 62
 
5.7%
& 48
 
4.4%
3 46
 
4.2%
2 42
 
3.8%
0 36
 
3.3%
. 17
 
1.6%
; 16
 
1.5%
Other values (17) 70
 
6.4%
Han
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 92929
96.5%
ASCII 3400
 
3.5%
CJK 5
 
< 0.1%
None 2
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9924
 
10.7%
9867
 
10.6%
9675
 
10.4%
2967
 
3.2%
2718
 
2.9%
2667
 
2.9%
2549
 
2.7%
2451
 
2.6%
2400
 
2.6%
2139
 
2.3%
Other values (808) 45572
49.0%
ASCII
ValueCountFrequency (%)
260
 
7.6%
) 249
 
7.3%
( 249
 
7.3%
E 152
 
4.5%
e 140
 
4.1%
S 123
 
3.6%
n 110
 
3.2%
i 109
 
3.2%
s 104
 
3.1%
M 96
 
2.8%
Other values (64) 1808
53.2%
None
ValueCountFrequency (%)
· 2
100.0%
CJK
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%

대표자명
Text

Distinct: 6985
Distinct (%): 69.9%
Missing: 4
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 3
Mean length: 3.0635254
Min length: 2

Characters and Unicode

Total characters: 30623
Distinct characters: 370
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5552
Unique (%): 55.5%

Sample

1st row: 이명희
2nd row: 이예슬
3rd row: 강정님
4th row: 최한나
5th row: 감요셉

Value | Count | Frequency (%)
김민정 | 26 | 0.3%
김지영 | 23 | 0.2%
김은경 | 20 | 0.2%
김지현 | 17 | 0.2%
김현정 | 16 | 0.2%
이현정 | 16 | 0.2%
김은정 | 15 | 0.1%
김미영 | 15 | 0.1%
김혜진 | 14 | 0.1%
이수진 | 14 | 0.1%
Other values (7034) | 9898 | 98.3%

Most occurring characters

ValueCountFrequency (%)
2044
 
6.7%
1640
 
5.4%
1467
 
4.8%
1171
 
3.8%
953
 
3.1%
908
 
3.0%
906
 
3.0%
854
 
2.8%
832
 
2.7%
713
 
2.3%
Other values (360) 19135
62.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29858
97.5%
Uppercase Letter 580
 
1.9%
Space Separator 84
 
0.3%
Lowercase Letter 42
 
0.1%
Open Punctuation 29
 
0.1%
Close Punctuation 29
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2044
 
6.8%
1640
 
5.5%
1467
 
4.9%
1171
 
3.9%
953
 
3.2%
908
 
3.0%
906
 
3.0%
854
 
2.9%
832
 
2.8%
713
 
2.4%
Other values (314) 18370
61.5%
Uppercase Letter
ValueCountFrequency (%)
N 75
12.9%
A 65
11.2%
I 58
 
10.0%
E 43
 
7.4%
O 41
 
7.1%
G 38
 
6.6%
H 37
 
6.4%
U 27
 
4.7%
Y 25
 
4.3%
L 24
 
4.1%
Other values (15) 147
25.3%
Lowercase Letter
ValueCountFrequency (%)
e 8
19.0%
o 5
11.9%
n 5
11.9%
a 3
 
7.1%
t 3
 
7.1%
s 3
 
7.1%
w 2
 
4.8%
i 2
 
4.8%
r 2
 
4.8%
h 2
 
4.8%
Other values (7) 7
16.7%
Space Separator
ValueCountFrequency (%)
84
100.0%
Open Punctuation
ValueCountFrequency (%)
( 29
100.0%
Close Punctuation
ValueCountFrequency (%)
) 29
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 29858
97.5%
Latin 622
 
2.0%
Common 143
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2044
 
6.8%
1640
 
5.5%
1467
 
4.9%
1171
 
3.9%
953
 
3.2%
908
 
3.0%
906
 
3.0%
854
 
2.9%
832
 
2.8%
713
 
2.4%
Other values (314) 18370
61.5%
Latin
ValueCountFrequency (%)
N 75
 
12.1%
A 65
 
10.5%
I 58
 
9.3%
E 43
 
6.9%
O 41
 
6.6%
G 38
 
6.1%
H 37
 
5.9%
U 27
 
4.3%
Y 25
 
4.0%
L 24
 
3.9%
Other values (32) 189
30.4%
Common
ValueCountFrequency (%)
84
58.7%
( 29
 
20.3%
) 29
 
20.3%
- 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 29858
97.5%
ASCII 765
 
2.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2044
 
6.8%
1640
 
5.5%
1467
 
4.9%
1171
 
3.9%
953
 
3.2%
908
 
3.0%
906
 
3.0%
854
 
2.9%
832
 
2.8%
713
 
2.4%
Other values (314) 18370
61.5%
ASCII
ValueCountFrequency (%)
84
 
11.0%
N 75
 
9.8%
A 65
 
8.5%
I 58
 
7.6%
E 43
 
5.6%
O 41
 
5.4%
G 38
 
5.0%
H 37
 
4.8%
( 29
 
3.8%
) 29
 
3.8%
Other values (36) 266
34.8%

교습과정명
Categorical

IMBALANCE 

Distinct: 39
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
보습 | 4990
음악 | 2486
미술 | 1640
보습·논술 | 268
실용외국어(유아/초·중·고) | 198
Other values (34) | 418

Length

Max length: 16
Median length: 2
Mean length: 2.3871
Min length: 2

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: 음악
2nd row: 보습
3rd row: 음악
4th row: 음악
5th row: 보습·논술

Common Values

Value | Count | Frequency (%)
보습 | 4990 | 49.9%
음악 | 2486 | 24.9%
미술 | 1640 | 16.4%
보습·논술 | 268 | 2.7%
실용외국어(유아/초·중·고) | 198 | 2.0%
서예 | 84 | 0.8%
기타(소) | 81 | 0.8%
주산 | 46 | 0.5%
바둑 | 42 | 0.4%
컴퓨터(소) | 25 | 0.2%
Other values (29) | 140 | 1.4%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
보습 | 4990 | 49.9%
음악 | 2486 | 24.9%
미술 | 1640 | 16.4%
보습·논술 | 268 | 2.7%
실용외국어(유아/초·중·고 | 198 | 2.0%
서예 | 84 | 0.8%
기타(소 | 81 | 0.8%
주산 | 46 | 0.5%
바둑 | 42 | 0.4%
컴퓨터(소 | 25 | 0.2%
Other values (29) | 140 | 1.4%

전화번호
Text

MISSING 

Distinct: 3430
Distinct (%): 96.2%
Missing: 6436
Missing (%): 64.4%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.02862
Min length: 11

Characters and Unicode

Total characters: 42870
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3299
Unique (%): 92.6%
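
The Distinct and Unique percentages for this column only add up if they are taken over the non-missing rows rather than over all 10000 observations. The sketch below checks that reading against the reported numbers; it is an inference from the figures, not something the report states.

```python
# Values copied from the 전화번호 statistics above.
n_observations = 10_000
missing = 6_436
distinct = 3_430
unique = 3_299

# Assumption: Distinct (%) and Unique (%) use non-missing rows as the base.
present = n_observations - missing
distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present

# Missing (%) uses all rows as the base.
missing_pct = 100 * missing / n_observations

print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique, {missing_pct:.1f}% missing")
```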

Sample

1st row: 031-922-0594
2nd row: 031-503-6066
3rd row: 070-4406-1184
4th row: 031-388-3278
5th row: 031-915-5395
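
The sample values all follow a prefix-exchange-line layout separated by hyphens, and the character statistics for this column report two stray space characters. A minimal cleaning sketch is shown below; the regular expression is an assumption inferred from the samples and the 11 to 13 character length range, and `normalize_phone` is a hypothetical helper name.

```python
import re

# Assumed layout: 2-3 digit prefix starting with 0, 3-4 digit exchange, 4 digit line.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

def normalize_phone(raw):
    """Strip surrounding whitespace; return the number if it matches, else None."""
    if raw is None:
        return None
    candidate = raw.strip()
    return candidate if PHONE_RE.fullmatch(candidate) else None

print(normalize_phone(" 070-4406-1184 "))
```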

Value | Count | Frequency (%)
031-562-5925 | 4 | 0.1%
031-377-5056 | 3 | 0.1%
031-8015-0504 | 2 | 0.1%
031-444-1318 | 2 | 0.1%
031-222-0307 | 2 | 0.1%
031-375-8007 | 2 | 0.1%
031-554-0844 | 2 | 0.1%
031-473-3869 | 2 | 0.1%
031-235-4140 | 2 | 0.1%
031-246-2432 | 2 | 0.1%
Other values (3421) | 3543 | 99.4%

Most occurring characters

ValueCountFrequency (%)
- 7128
16.6%
0 6168
14.4%
3 5957
13.9%
1 5485
12.8%
2 3263
7.6%
7 2986
7.0%
5 2651
 
6.2%
9 2540
 
5.9%
8 2320
 
5.4%
6 2248
 
5.2%
Other values (2) 2124
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 35740
83.4%
Dash Punctuation 7128
 
16.6%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6168
17.3%
3 5957
16.7%
1 5485
15.3%
2 3263
9.1%
7 2986
8.4%
5 2651
7.4%
9 2540
7.1%
8 2320
 
6.5%
6 2248
 
6.3%
4 2122
 
5.9%
Dash Punctuation
ValueCountFrequency (%)
- 7128
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42870
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 7128
16.6%
0 6168
14.4%
3 5957
13.9%
1 5485
12.8%
2 3263
7.6%
7 2986
7.0%
5 2651
 
6.2%
9 2540
 
5.9%
8 2320
 
5.4%
6 2248
 
5.2%
Other values (2) 2124
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42870
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 7128
16.6%
0 6168
14.4%
3 5957
13.9%
1 5485
12.8%
2 3263
7.6%
7 2986
7.0%
5 2651
 
6.2%
9 2540
 
5.9%
8 2320
 
5.4%
6 2248
 
5.2%
Other values (2) 2124
 
5.0%

소재지우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2350
Distinct (%): 23.6%
Missing: 58
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 14334.775
Minimum: 5544
Maximum: 18633
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 5544
5-th percentile: 10348
Q1: 12151.25
median: 14240
Q3: 16572
95-th percentile: 18414
Maximum: 18633
Range: 13089
Interquartile range (IQR): 4420.75
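
This column is profiled as a real number, and its minimum of 5544 has only four digits while every other listed value has five. If the values are five-digit postal codes (an assumption; the report does not say so), a leading zero was probably lost when the column was parsed as numeric. A hedged sketch of restoring a fixed-width string form, with `format_postal_code` as a hypothetical helper:

```python
def format_postal_code(value):
    """Render a numeric postal code as a 5-character, zero-padded string.

    Returns None for missing input (None or a float NaN).
    """
    if value is None or value != value:  # NaN is the only value not equal to itself
        return None
    return f"{int(value):05d}"

print(format_postal_code(5544))
print(format_postal_code(18633.0))
```

Treating the column as text would also make the mean, variance and correlation figures reported for it less tempting to over-interpret, since postal codes are labels rather than quantities.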

Descriptive statistics

Standard deviation: 2555.0147
Coefficient of variation (CV): 0.17823891
Kurtosis: -1.1433875
Mean: 14334.775
Median Absolute Deviation (MAD): 2256
Skewness: -0.08375281
Sum: 1.4251634 × 10^8
Variance: 6528100.3
Monotonicity: Not monotonic
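
Several of the descriptive statistics above are functions of one another, which gives a cheap consistency check. The sketch below recomputes the coefficient of variation, variance and sum from the reported mean, standard deviation and non-missing count (10000 observations minus 58 missing); small differences are expected because the inputs are already rounded.

```python
import math

# Values copied from the statistics above.
mean = 14334.775
std = 2555.0147
n_present = 10_000 - 58  # observations minus missing values

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
total = mean * n_present   # sum over non-missing values

print(f"CV={cv:.8f} variance={variance:.1f} sum={total:.4e}")
```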
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
14103 | 111 | 1.1%
11921 | 76 | 0.8%
14099 | 62 | 0.6%
10375 | 59 | 0.6%
13600 | 55 | 0.5%
13599 | 54 | 0.5%
13837 | 49 | 0.5%
10416 | 49 | 0.5%
12248 | 38 | 0.4%
10374 | 36 | 0.4%
Other values (2340) | 9353 | 93.5%
(Missing) | 58 | 0.6%

Lowest values:

Value | Count | Frequency (%)
5544 | 2 | < 0.1%
10011 | 1 | < 0.1%
10019 | 3 | < 0.1%
10031 | 2 | < 0.1%
10036 | 1 | < 0.1%
10037 | 1 | < 0.1%
10040 | 1 | < 0.1%
10047 | 1 | < 0.1%
10059 | 3 | < 0.1%
10060 | 1 | < 0.1%

Highest values:

Value | Count | Frequency (%)
18633 | 1 | < 0.1%
18613 | 2 | < 0.1%
18611 | 3 | < 0.1%
18610 | 1 | < 0.1%
18603 | 2 | < 0.1%
18602 | 6 | 0.1%
18601 | 4 | < 0.1%
18600 | 5 | 0.1%
18599 | 2 | < 0.1%
18596 | 4 | < 0.1%

소재지지번주소
Text

Distinct: 9382
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 74
Median length: 62
Mean length: 34.7424
Min length: 16

Characters and Unicode

Total characters: 347424
Distinct characters: 615
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8783
Unique (%): 87.8%

Sample

1st row: 경기도 화성시 오산동 972번지 동탄역반도유보라아이비파크6.0상가A동 A-102,A-103호
2nd row: 경기도 의왕시 내손동 416-3번지 512호
3rd row: 경기도 화성시 장지동 923번지 금호어울림레이크2차 상가동 103호
4th row: 경기도 화성시 청계동 520번지 동탄역 시범한화 꿈에그린 프레스티지 근린생활시설3 123호
5th row: 경기도 부천시 옥길동 744-3번지 O 612호

Value | Count | Frequency (%)
경기도 | 9998 | 14.0%
일부 | 1649 | 2.3%
상가동 | 1502 | 2.1%
수원시 | 1241 | 1.7%
고양시 | 1068 | 1.5%
1층 | 933 | 1.3%
성남시 | 866 | 1.2%
용인시 | 825 | 1.2%
상가 | 825 | 1.2%
2층 | 740 | 1.0%
Other values (10223) | 51753 | 72.5%

Most occurring characters

ValueCountFrequency (%)
61458
 
17.7%
1 13843
 
4.0%
13754
 
4.0%
11899
 
3.4%
2 11331
 
3.3%
10841
 
3.1%
0 10627
 
3.1%
10494
 
3.0%
10463
 
3.0%
10126
 
2.9%
Other values (605) 182588
52.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 208251
59.9%
Decimal Number 68780
 
19.8%
Space Separator 61458
 
17.7%
Dash Punctuation 5870
 
1.7%
Uppercase Letter 1119
 
0.3%
Other Punctuation 694
 
0.2%
Open Punctuation 492
 
0.1%
Close Punctuation 448
 
0.1%
Lowercase Letter 270
 
0.1%
Math Symbol 21
 
< 0.1%
Other values (2) 21
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13754
 
6.6%
11899
 
5.7%
10841
 
5.2%
10494
 
5.0%
10463
 
5.0%
10126
 
4.9%
9985
 
4.8%
8923
 
4.3%
5402
 
2.6%
3993
 
1.9%
Other values (532) 112371
54.0%
Uppercase Letter
ValueCountFrequency (%)
B 212
18.9%
A 174
15.5%
I 89
 
8.0%
S 80
 
7.1%
E 71
 
6.3%
C 56
 
5.0%
K 50
 
4.5%
L 49
 
4.4%
T 40
 
3.6%
W 38
 
3.4%
Other values (14) 260
23.2%
Lowercase Letter
ValueCountFrequency (%)
e 110
40.7%
m 34
 
12.6%
o 33
 
12.2%
a 18
 
6.7%
l 16
 
5.9%
h 13
 
4.8%
t 9
 
3.3%
r 7
 
2.6%
n 6
 
2.2%
p 6
 
2.2%
Other values (8) 18
 
6.7%
Decimal Number
ValueCountFrequency (%)
1 13843
20.1%
2 11331
16.5%
0 10627
15.5%
3 6745
9.8%
4 5248
 
7.6%
5 5122
 
7.4%
6 4558
 
6.6%
7 4119
 
6.0%
8 3809
 
5.5%
9 3378
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 504
72.6%
. 111
 
16.0%
& 48
 
6.9%
· 8
 
1.2%
? 7
 
1.0%
@ 7
 
1.0%
/ 7
 
1.0%
: 1
 
0.1%
* 1
 
0.1%
Letter Number
ValueCountFrequency (%)
9
45.0%
8
40.0%
3
 
15.0%
Open Punctuation
ValueCountFrequency (%)
( 491
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 447
99.8%
] 1
 
0.2%
Math Symbol
ValueCountFrequency (%)
~ 20
95.2%
1
 
4.8%
Space Separator
ValueCountFrequency (%)
61458
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5870
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 208251
59.9%
Common 137764
39.7%
Latin 1409
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13754
 
6.6%
11899
 
5.7%
10841
 
5.2%
10494
 
5.0%
10463
 
5.0%
10126
 
4.9%
9985
 
4.8%
8923
 
4.3%
5402
 
2.6%
3993
 
1.9%
Other values (532) 112371
54.0%
Latin
ValueCountFrequency (%)
B 212
15.0%
A 174
 
12.3%
e 110
 
7.8%
I 89
 
6.3%
S 80
 
5.7%
E 71
 
5.0%
C 56
 
4.0%
K 50
 
3.5%
L 49
 
3.5%
T 40
 
2.8%
Other values (35) 478
33.9%
Common
ValueCountFrequency (%)
61458
44.6%
1 13843
 
10.0%
2 11331
 
8.2%
0 10627
 
7.7%
3 6745
 
4.9%
- 5870
 
4.3%
4 5248
 
3.8%
5 5122
 
3.7%
6 4558
 
3.3%
7 4119
 
3.0%
Other values (18) 8843
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 208251
59.9%
ASCII 139144
40.1%
Number Forms 20
 
< 0.1%
None 8
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
61458
44.2%
1 13843
 
9.9%
2 11331
 
8.1%
0 10627
 
7.6%
3 6745
 
4.8%
- 5870
 
4.2%
4 5248
 
3.8%
5 5122
 
3.7%
6 4558
 
3.3%
7 4119
 
3.0%
Other values (58) 10223
 
7.3%
Hangul
ValueCountFrequency (%)
13754
 
6.6%
11899
 
5.7%
10841
 
5.2%
10494
 
5.0%
10463
 
5.0%
10126
 
4.9%
9985
 
4.8%
8923
 
4.3%
5402
 
2.6%
3993
 
1.9%
Other values (532) 112371
54.0%
Number Forms
ValueCountFrequency (%)
9
45.0%
8
40.0%
3
 
15.0%
None
ValueCountFrequency (%)
· 8
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

소재지도로명주소
Text

Distinct: 5218
Distinct (%): 54.2%
Missing: 368
Missing (%): 3.7%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 28
Mean length: 19.195494
Min length: 13

Characters and Unicode

Total characters: 184891
Distinct characters: 357
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3271
Unique (%): 34.0%

Sample

1st row: 경기도 화성시 동탄기흥로 479-12
2nd row: 경기도 의왕시 내손중앙로 4
3rd row: 경기도 화성시 동탄대로1길 32
4th row: 경기도 화성시 동탄대로시범길 20
5th row: 경기도 부천시 옥길로 110-11

Value | Count | Frequency (%)
경기도 | 9630 | 21.9%
수원시 | 1117 | 2.5%
고양시 | 1060 | 2.4%
성남시 | 856 | 1.9%
용인시 | 786 | 1.8%
분당구 | 656 | 1.5%
화성시 | 655 | 1.5%
안양시 | 586 | 1.3%
남양주시 | 548 | 1.2%
부천시 | 478 | 1.1%
Other values (3899) | 27575 | 62.7%

Most occurring characters

ValueCountFrequency (%)
34315
18.6%
10085
 
5.5%
9913
 
5.4%
9895
 
5.4%
9865
 
5.3%
9273
 
5.0%
1 7091
 
3.8%
5027
 
2.7%
2 4784
 
2.6%
3 3740
 
2.0%
Other values (347) 80903
43.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 116042
62.8%
Space Separator 34315
 
18.6%
Decimal Number 33131
 
17.9%
Dash Punctuation 1403
 
0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10085
 
8.7%
9913
 
8.5%
9895
 
8.5%
9865
 
8.5%
9273
 
8.0%
5027
 
4.3%
3669
 
3.2%
3148
 
2.7%
3028
 
2.6%
2182
 
1.9%
Other values (335) 49957
43.1%
Decimal Number
ValueCountFrequency (%)
1 7091
21.4%
2 4784
14.4%
3 3740
11.3%
4 3006
9.1%
5 2799
 
8.4%
7 2651
 
8.0%
6 2494
 
7.5%
0 2378
 
7.2%
8 2240
 
6.8%
9 1948
 
5.9%
Space Separator
ValueCountFrequency (%)
34315
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1403
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 116042
62.8%
Common 68849
37.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10085
 
8.7%
9913
 
8.5%
9895
 
8.5%
9865
 
8.5%
9273
 
8.0%
5027
 
4.3%
3669
 
3.2%
3148
 
2.7%
3028
 
2.6%
2182
 
1.9%
Other values (335) 49957
43.1%
Common
ValueCountFrequency (%)
34315
49.8%
1 7091
 
10.3%
2 4784
 
6.9%
3 3740
 
5.4%
4 3006
 
4.4%
5 2799
 
4.1%
7 2651
 
3.9%
6 2494
 
3.6%
0 2378
 
3.5%
8 2240
 
3.3%
Other values (2) 3351
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 116042
62.8%
ASCII 68849
37.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
34315
49.8%
1 7091
 
10.3%
2 4784
 
6.9%
3 3740
 
5.4%
4 3006
 
4.4%
5 2799
 
4.1%
7 2651
 
3.9%
6 2494
 
3.6%
0 2378
 
3.5%
8 2240
 
3.3%
Other values (2) 3351
 
4.9%
Hangul
ValueCountFrequency (%)
10085
 
8.7%
9913
 
8.5%
9895
 
8.5%
9865
 
8.5%
9273
 
8.0%
5027
 
4.3%
3669
 
3.2%
3148
 
2.7%
3028
 
2.6%
2182
 
1.9%
Other values (335) 49957
43.1%

WGS84위도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5528
Distinct (%): 55.7%
Missing: 78
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.425769
Minimum: 36.960541
Maximum: 38.090853
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 36.960541
5-th percentile: 37.163836
Q1: 37.288885
median: 37.387315
Q3: 37.604822
95-th percentile: 37.728621
Maximum: 38.090853
Range: 1.1303121
Interquartile range (IQR): 0.31593683

Descriptive statistics

Standard deviation: 0.18807846
Coefficient of variation (CV): 0.0050253733
Kurtosis: -0.57785799
Mean: 37.425769
Median Absolute Deviation (MAD): 0.11961573
Skewness: 0.1584533
Sum: 371338.48
Variance: 0.035373508
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
37.6048216394 | 46 | 0.5%
37.3832475488 | 32 | 0.3%
37.3699876639 | 29 | 0.3%
37.3822361787 | 25 | 0.2%
37.383931025 | 22 | 0.2%
37.2070424917 | 22 | 0.2%
37.421221953 | 19 | 0.2%
37.0146734408 | 17 | 0.2%
37.3706164267 | 17 | 0.2%
37.3849920846 | 15 | 0.1%
Other values (5518) | 9678 | 96.8%
(Missing) | 78 | 0.8%

Lowest values:

Value | Count | Frequency (%)
36.9605410794 | 1 | < 0.1%
36.9636098749 | 1 | < 0.1%
36.9643269427 | 2 | < 0.1%
36.9773330934 | 1 | < 0.1%
36.977548528 | 3 | < 0.1%
36.9775627222 | 1 | < 0.1%
36.9780218483 | 1 | < 0.1%
36.9787197587 | 5 | 0.1%
36.9789601868 | 1 | < 0.1%
36.9799377935 | 1 | < 0.1%

Highest values:

Value | Count | Frequency (%)
38.0908531891 | 1 | < 0.1%
38.0286990957 | 1 | < 0.1%
38.0286715484 | 1 | < 0.1%
38.0283340084 | 1 | < 0.1%
38.0281669258 | 1 | < 0.1%
37.9590722157 | 1 | < 0.1%
37.9566836141 | 1 | < 0.1%
37.9561834287 | 1 | < 0.1%
37.9546376274 | 1 | < 0.1%
37.9479236529 | 1 | < 0.1%

WGS84경도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5528
Distinct (%): 55.7%
Missing: 78
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.0058
Minimum: 126.58259
Maximum: 127.75262
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 126.58259
5-th percentile: 126.74889
Q1: 126.86463
median: 127.03364
Q3: 127.11955
95-th percentile: 127.25035
Maximum: 127.75262
Range: 1.1700322
Interquartile range (IQR): 0.25492475

Descriptive statistics

Standard deviation: 0.17414916
Coefficient of variation (CV): 0.0013711906
Kurtosis: 0.48323897
Mean: 127.0058
Median Absolute Deviation (MAD): 0.1045376
Skewness: 0.31794159
Sum: 1260151.6
Variance: 0.030327931
Monotonicity: Not monotonic
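
The reported minimum and maximum of WGS84위도 and WGS84경도 define a bounding box that can serve as a rough validity check for new records, for example to catch swapped latitude and longitude. The sketch below is illustrative only: the bounds are the rounded values from this report, the small tolerance is an assumption added to absorb that rounding, and `in_observed_bbox` is a hypothetical helper.

```python
# Bounds copied from the WGS84위도 / WGS84경도 statistics (rounded in the report).
LAT_MIN, LAT_MAX = 36.960541, 38.090853
LON_MIN, LON_MAX = 126.58259, 127.75262
TOLERANCE = 1e-4  # assumed slack for the rounding of the reported bounds

def in_observed_bbox(lat, lon):
    """Return True if (lat, lon) lies inside the observed bounding box."""
    return (LAT_MIN - TOLERANCE <= lat <= LAT_MAX + TOLERANCE
            and LON_MIN - TOLERANCE <= lon <= LON_MAX + TOLERANCE)

print(in_observed_bbox(37.203624, 127.090335))   # coordinates from the first sample row
print(in_observed_bbox(127.090335, 37.203624))   # same point with lat/lon swapped
```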
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
127.1415855185 | 46 | 0.5%
126.9608025327 | 32 | 0.3%
127.1236529514 | 29 | 0.3%
127.2285064261 | 25 | 0.2%
126.9604587618 | 22 | 0.2%
127.0730745759 | 22 | 0.2%
126.9912252368 | 19 | 0.2%
127.2647231995 | 17 | 0.2%
127.1229608786 | 17 | 0.2%
126.9598976523 | 15 | 0.1%
Other values (5518) | 9678 | 96.8%
(Missing) | 78 | 0.8%

Lowest values:

Value | Count | Frequency (%)
126.5825880896 | 1 | < 0.1%
126.5826055651 | 1 | < 0.1%
126.5927922822 | 1 | < 0.1%
126.5937185429 | 1 | < 0.1%
126.5940538809 | 1 | < 0.1%
126.596309692 | 1 | < 0.1%
126.5971793331 | 1 | < 0.1%
126.6190656834 | 1 | < 0.1%
126.621915191 | 1 | < 0.1%
126.6228753128 | 1 | < 0.1%

Highest values:

Value | Count | Frequency (%)
127.7526202722 | 1 | < 0.1%
127.7141703278 | 1 | < 0.1%
127.7080232141 | 1 | < 0.1%
127.6800067605 | 2 | < 0.1%
127.6739566483 | 1 | < 0.1%
127.6602708537 | 1 | < 0.1%
127.65104844 | 1 | < 0.1%
127.6450606674 | 1 | < 0.1%
127.6432635465 | 1 | < 0.1%
127.6430759773 | 1 | < 0.1%

Interactions

[Figures: 9 interaction plots; images not preserved in this text export]

Correlations

[Figure: correlation heatmap]

 | 시군명 | 교습과정명 | 소재지우편번호 | WGS84위도 | WGS84경도
시군명 | 1.000 | 0.215 | 0.990 | 0.964 | 0.941
교습과정명 | 0.215 | 1.000 | 0.177 | 0.160 | 0.133
소재지우편번호 | 0.990 | 0.177 | 1.000 | 0.807 | 0.695
WGS84위도 | 0.964 | 0.160 | 0.807 | 1.000 | 0.642
WGS84경도 | 0.941 | 0.133 | 0.695 | 0.642 | 1.000

[Figure: correlation heatmap]

 | 시군명 | 교습과정명
시군명 | 1.000 | 0.046
교습과정명 | 0.046 | 1.000

[Figure: correlation heatmap]

 | 소재지우편번호 | WGS84위도 | WGS84경도 | 시군명 | 교습과정명
소재지우편번호 | 1.000 | -0.925 | 0.226 | 0.889 | 0.063
WGS84위도 | -0.925 | 1.000 | -0.256 | 0.791 | 0.056
WGS84경도 | 0.226 | -0.256 | 1.000 | 0.707 | 0.046
시군명 | 0.889 | 0.791 | 0.707 | 1.000 | 0.046
교습과정명 | 0.063 | 0.056 | 0.046 | 0.046 | 1.000
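
The report does not say which association measures produced the matrices above. For a pair of categorical variables such as 시군명 and 교습과정명, one standard choice is Cramér's V. The sketch below implements the plain, uncorrected version purely as an illustration; it is not claimed to be the profiler's exact method, and the contingency tables are hypothetical toy data.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy tables, not taken from the dataset.
perfect = cramers_v([[10, 0], [0, 10]])       # categories determine each other
independent = cramers_v([[10, 10], [10, 10]])  # no association
print(perfect, independent)
```

A value near 0, like the 0.046 in the two-variable matrix, would indicate that the course type is almost unrelated to the city under such a measure.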

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
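
The per-variable Missing counts reported in the Variables section can be reconciled with the overall "Missing cells" figure of 7390. In the sketch below the counts are copied from this report; the column names for the blocks that appear without a heading (for example 대표자명 with 4 missing values) are inferred from the column order of the Sample table, which is an assumption.

```python
n_observations = 10_000

# Missing counts per column, copied from the Variables section.
# Columns with no missing values are omitted.
missing_by_column = {
    "구/읍면동명": 368,
    "대표자명": 4,          # name inferred from the Sample table column order
    "전화번호": 6436,
    "소재지우편번호": 58,
    "소재지도로명주소": 368,  # name inferred from the Sample table column order
    "WGS84위도": 78,
    "WGS84경도": 78,
}

total_missing = sum(missing_by_column.values())
missing_pct = {
    name: 100 * count / n_observations
    for name, count in missing_by_column.items()
}

print(total_missing)
for name, pct in missing_pct.items():
    print(f"{name}: {pct:.1f}%")
```

The identical counts for 구/읍면동명 and 소재지도로명주소 (368 each) and for the two coordinate columns (78 each) suggest these fields tend to be missing together, which is the kind of pattern the nullity heatmap is meant to surface.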

Sample

(index) | 시군명 | 구/읍면동명 | 시설명 | 대표자명 | 교습과정명 | 전화번호 | 소재지우편번호 | 소재지지번주소 | 소재지도로명주소 | WGS84위도 | WGS84경도
10325 | 화성시 | 오산동 | 뉴늘봄피아노음악교습소 | 이명희 | 음악 | <NA> | 18479 | 경기도 화성시 오산동 972번지 동탄역반도유보라아이비파크6.0상가A동 A-102,A-103호 | 경기도 화성시 동탄기흥로 479-12 | 37.203624 | 127.090335
8567 | 의왕시 | 내손동 | 스마트해법영어벨포레점영어교습소 | 이예슬 | 보습 | <NA> | 16024 | 경기도 의왕시 내손동 416-3번지 512호 | 경기도 의왕시 내손중앙로 4 | 37.389079 | 126.976859
10406 | 화성시 | 장지동 | 늘봄피아노음악교습소 | 강정님 | 음악 | <NA> | 18499 | 경기도 화성시 장지동 923번지 금호어울림레이크2차 상가동 103호 | 경기도 화성시 동탄대로1길 32 | 37.161407 | 127.102355
10485 | 화성시 | 청계동 | 라움음악교습소 | 최한나 | 음악 | <NA> | 18477 | 경기도 화성시 청계동 520번지 동탄역 시범한화 꿈에그린 프레스티지 근린생활시설3 123호 | 경기도 화성시 동탄대로시범길 20 | 37.198312 | 127.105328
3519 | 부천시 | 옥길동 | 솔루니옥길논술교습소 | 감요셉 | 보습·논술 | <NA> | 14786 | 경기도 부천시 옥길동 744-3번지 O 612호 | 경기도 부천시 옥길로 110-11 | 37.466475 | 126.823183
1132 | 고양시 | 일산서구 | 탄현주니어랩영어교습소 | 양섭규 | 보습 | 031-922-0594 | 10249 | 경기도 고양시 일산서구 탄현동 1473번지 탄현마을 204호 | 경기도 고양시 일산서구 탄현로 80 | 37.698807 | 126.769441
5642 | 수원시 | 팔달구 | 마리아피아노 | 김영중 | 음악 | <NA> | 16263 | 경기도 수원시 팔달구 중동 14번지 | 경기도 수원시 팔달구 향교로 154 | 37.274094 | 127.01587
10076 | 화성시 | 반월동 | 제이엘영어교습소 | 정혜경 | 보습 | <NA> | 18383 | 경기도 화성시 반월동 960번지 e편한세상신동탄 201동 212호 | 경기도 화성시 효행로 1337-23 | 37.226269 | 127.060702
6506 | 안산시 | 상록구 | 송호수학교습소 | 신승엽 | 보습 | <NA> | 15487 | 경기도 안산시 상록구 이동 651-3번지 102호 | 경기도 안산시 상록구 송호2길 5-1 | 37.312563 | 126.84656
8182 | 용인시 | 수지구 | 용인대일수학교습소 | 최미헌 | 보습 | <NA> | 16873 | 경기도 용인시 수지구 죽전동 1258-1번지 1층 일부 | 경기도 용인시 수지구 푸른솔로 95 | 37.326847 | 127.11801

(index) | 시군명 | 구/읍면동명 | 시설명 | 대표자명 | 교습과정명 | 전화번호 | 소재지우편번호 | 소재지지번주소 | 소재지도로명주소 | WGS84위도 | WGS84경도
4272 | 성남시 | 분당구 | 그린아이미술교습소 | 김지영 | 음악 | <NA> | 13544 | 경기도 성남시 분당구 대장동 595번지 힐스테이트 판교 엘포레 A6BL 209호 | 경기도 성남시 분당구 판교대장로7길 16 | 37.372601 | 127.068004
9965 | 화성시 | 목동 | 올리브논술교습소 | 정다은 | 보습 | <NA> | 18484 | 경기도 화성시 목동 우성스타파크아파트 604호 | 경기도 화성시 동탄순환대로20길 124 | 37.182663 | 127.124697
4408 | 성남시 | 분당구 | 카덴차바이올린(cadenza violin)교습소 | 현유라 | 음악 | <NA> | 13602 | 경기도 성남시 분당구 정자동 193번지 정든마을근린상가동 208호 일부 | 경기도 성남시 분당구 불정로 119 | 37.359523 | 127.118274
8381 | 용인시 | 수지구 | 뮤엠신리영어교습소 | 김수경 | 보습 | <NA> | 16816 | 경기도 용인시 수지구 신봉동 50-2번지 뉴골드프라자1 101호 | 경기도 용인시 수지구 신봉1로48번길 11 | 37.323104 | 127.079879
7203 | 안양시 | 별양동 | 서밋영어교습소 | 황금란 | 보습 | <NA> | 13837 | 경기도 과천시 별양동 1-13번지 제일쇼핑 3층 317호 | 경기도 과천시 별양상가2로 14 | 37.427332 | 126.993004
715 | 고양시 | 일산동구 | 제니의리드앤톡영어교습소 | 신동희 | 보습 | <NA> | 10394 | 경기도 고양시 일산동구 장항동 1762번지 킨텍스원시티 3블럭 334호 | 경기도 고양시 일산동구 월드고양로 21 | 37.662206 | 126.749517
94 | 고양시 | 덕양구 | 에이스픽영어교습소 | 서희원 | 보습 | 031-979-1790 | 10501 | 경기도 고양시 덕양구 화정동 946번지 문정교육센타 5층 일부 | 경기도 고양시 덕양구 화신로 319 | 37.632655 | 126.826291
7920 | 용인시 | 기흥구 | 삼성영어동백어정영어교습소 | 전승희 | 보습 | <NA> | 16993 | 경기도 용인시 기흥구 동백동 683번지 동원마을동원로얄듀크 상가동 202호 일부 | 경기도 용인시 기흥구 동백죽전대로 455-17 | 37.279559 | 127.149755
5140 | 수원시 | 영통구 | 다움수학교습소 | 김다움 | 보습 | <NA> | 16509 | 경기도 수원시 영통구 이의동 1348번지 광교중앙역 SK VIEW B동 712호 | 경기도 수원시 영통구 에듀타운로 102 | 37.289791 | 127.046422
10099 | 화성시 | 병점동 | 나는과학교습소 | 임미정 | 보습 | <NA> | 18414 | 경기도 화성시 병점동 292번지 태안병점 브이타운 403-1호 | 경기도 화성시 병점중앙로 87 | 37.206286 | 127.041896