Overview

Dataset statistics

Number of variables: 12
Number of observations: 583
Missing cells: 1432
Missing cells (%): 20.5%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 57.1 KiB
Average record size in memory: 100.2 B
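
The missing-cell percentage can be cross-checked from the counts above. The sketch below assumes the percentage is defined as missing cells divided by all cells (variables × observations); the report does not state the formula, so treat this as a consistency check rather than the tool's implementation.

```python
# Figures copied from the dataset statistics above.
n_variables = 12
n_observations = 583
missing_cells = 1432

# Assumption: the percentage is missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 20.5% shown in the report.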

Variable types

Categorical: 2
Text: 4
Boolean: 1
DateTime: 1
Numeric: 4

Alerts

Dataset has 1 (0.2%) duplicate rows (Duplicates)
정제우편번호 is highly overall correlated with 정제WGS84위도 and 3 other fields (High correlation)
정제WGS84위도 is highly overall correlated with 정제우편번호 and 2 other fields (High correlation)
정제WGS84경도 is highly overall correlated with 시군명 and 2 other fields (High correlation)
시군명 is highly overall correlated with 정제우편번호 and 4 other fields (High correlation)
업종명 is highly overall correlated with 정제우편번호 and 4 other fields (High correlation)
폐업여부 is highly overall correlated with 정제우편번호 and 3 other fields (High correlation)
폐업여부 is highly imbalanced (84.5%) (Imbalance)
점포명 has 104 (17.8%) missing values (Missing)
허가번호 has 186 (31.9%) missing values (Missing)
폐업여부 has 183 (31.4%) missing values (Missing)
인허가일자 has 215 (36.9%) missing values (Missing)
점용면적 has 171 (29.3%) missing values (Missing)
정제도로명주소 has 138 (23.7%) missing values (Missing)
정제지번주소 has 98 (16.8%) missing values (Missing)
정제우편번호 has 127 (21.8%) missing values (Missing)
정제WGS84위도 has 105 (18.0%) missing values (Missing)
정제WGS84경도 has 105 (18.0%) missing values (Missing)
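
The 84.5% imbalance figure for 폐업여부 is consistent with one minus the normalised Shannon entropy of the class counts. That definition is an assumption here, since the report does not state its formula; the counts (391 False, 9 True) come from the 폐업여부 variable summary, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class counts and k is the number of classes (k >= 2)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Non-missing 폐업여부 counts from the report: False = 391, True = 9.
score = imbalance_score([391, 9])
print(f"imbalance = {score:.1%}")
```

A score of 0 would mean perfectly balanced classes and a score near 1 means almost all rows share one value, which is the case here.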

Reproduction

Analysis started: 2024-05-10 20:24:36.787273
Analysis finished: 2024-05-10 20:24:43.727011
Duration: 6.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
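
The reported duration can be reproduced from the two timestamps above. A minimal check, using only the values printed in this section:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-10 20:24:36.787273", FMT)
finished = datetime.strptime("2024-05-10 20:24:43.727011", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```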

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.7 KiB

Value | Count
가평군 | 171
고양시 | 140
수원시 | 93
부천시 | 73
성남시 | 69
Other values (7) | 37

Length

Max length: 4
Median length: 3
Mean length: 3.0034305
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 고양시
2nd row: 고양시
3rd row: 고양시
4th row: 고양시
5th row: 고양시

Common Values

Value | Count | Frequency (%)
가평군 | 171 | 29.3%
고양시 | 140 | 24.0%
수원시 | 93 | 16.0%
부천시 | 73 | 12.5%
성남시 | 69 | 11.8%
구리시 | 10 | 1.7%
광명시 | 8 | 1.4%
여주시 | 8 | 1.4%
의왕시 | 4 | 0.7%
하남시 | 4 | 0.7%
Other values (2) | 3 | 0.5%

Length

[Figure: histogram of lengths of the category]

점포명
Text

MISSING 

Distinct: 302
Distinct (%): 63.0%
Missing: 104
Missing (%): 17.8%
Memory size: 4.7 KiB

Length

Max length: 24
Median length: 16
Mean length: 8.0939457
Min length: 3

Characters and Unicode

Total characters: 3877
Distinct characters: 167
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 295
Unique (%): 61.6%

Sample

1st row: 교통카드판매소
2nd row: 교통카드판매소
3rd row: 교통카드판매소
4th row: 교통카드판매소
5th row: 교통카드판매소

Value | Count | Frequency (%)
햇살가게 | 73 | 14.7%
구두수선점 | 64 | 12.9%
구두수선소 | 15 | 3.0%
교통카드판매소 | 13 | 2.6%
가로판매점 | 12 | 2.4%
구두수선대 | 6 | 1.2%
토큰박스 | 3 | 0.6%
길벗가게a-38호 | 1 | 0.2%
가평5일시장(뻥튀기③ | 1 | 0.2%
소품 | 1 | 0.2%
Other values (307) | 307 | 61.9%

Most occurring characters

ValueCountFrequency (%)
291
 
7.5%
( 219
 
5.6%
) 217
 
5.6%
5 195
 
5.0%
180
 
4.6%
179
 
4.6%
173
 
4.5%
172
 
4.4%
144
 
3.7%
106
 
2.7%
Other values (157) 2001
51.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2686
69.3%
Decimal Number 403
 
10.4%
Open Punctuation 219
 
5.6%
Close Punctuation 217
 
5.6%
Other Number 114
 
2.9%
Dash Punctuation 84
 
2.2%
Uppercase Letter 84
 
2.2%
Lowercase Letter 30
 
0.8%
Other Punctuation 22
 
0.6%
Space Separator 17
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
291
 
10.8%
180
 
6.7%
179
 
6.7%
173
 
6.4%
172
 
6.4%
144
 
5.4%
106
 
3.9%
106
 
3.9%
95
 
3.5%
94
 
3.5%
Other values (128) 1146
42.7%
Decimal Number
ValueCountFrequency (%)
5 195
48.4%
0 32
 
7.9%
1 29
 
7.2%
3 28
 
6.9%
4 26
 
6.5%
2 25
 
6.2%
7 20
 
5.0%
6 17
 
4.2%
8 16
 
4.0%
9 15
 
3.7%
Other Number
ValueCountFrequency (%)
35
30.7%
34
29.8%
18
15.8%
9
 
7.9%
8
 
7.0%
5
 
4.4%
2
 
1.8%
2
 
1.8%
1
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
B 40
47.6%
C 36
42.9%
A 8
 
9.5%
Open Punctuation
ValueCountFrequency (%)
( 219
100.0%
Close Punctuation
ValueCountFrequency (%)
) 217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 84
100.0%
Lowercase Letter
ValueCountFrequency (%)
a 30
100.0%
Other Punctuation
ValueCountFrequency (%)
, 22
100.0%
Space Separator
ValueCountFrequency (%)
17
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2687
69.3%
Common 1076
27.8%
Latin 114
 
2.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
291
 
10.8%
180
 
6.7%
179
 
6.7%
173
 
6.4%
172
 
6.4%
144
 
5.4%
106
 
3.9%
106
 
3.9%
95
 
3.5%
94
 
3.5%
Other values (129) 1147
42.7%
Common
ValueCountFrequency (%)
( 219
20.4%
) 217
20.2%
5 195
18.1%
- 84
 
7.8%
35
 
3.3%
34
 
3.2%
0 32
 
3.0%
1 29
 
2.7%
3 28
 
2.6%
4 26
 
2.4%
Other values (14) 177
16.4%
Latin
ValueCountFrequency (%)
B 40
35.1%
C 36
31.6%
a 30
26.3%
A 8
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2686
69.3%
ASCII 1076
27.8%
Enclosed Alphanum 114
 
2.9%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
291
 
10.8%
180
 
6.7%
179
 
6.7%
173
 
6.4%
172
 
6.4%
144
 
5.4%
106
 
3.9%
106
 
3.9%
95
 
3.5%
94
 
3.5%
Other values (128) 1146
42.7%
ASCII
ValueCountFrequency (%)
( 219
20.4%
) 217
20.2%
5 195
18.1%
- 84
 
7.8%
B 40
 
3.7%
C 36
 
3.3%
0 32
 
3.0%
a 30
 
2.8%
1 29
 
2.7%
3 28
 
2.6%
Other values (9) 166
15.4%
Enclosed Alphanum
ValueCountFrequency (%)
35
30.7%
34
29.8%
18
15.8%
9
 
7.9%
8
 
7.0%
5
 
4.4%
2
 
1.8%
2
 
1.8%
1
 
0.9%
None
ValueCountFrequency (%)
1
100.0%

허가번호
Text

MISSING 

Distinct: 307
Distinct (%): 77.3%
Missing: 186
Missing (%): 31.9%
Memory size: 4.7 KiB

Length

Max length: 15
Median length: 12
Mean length: 7.604534
Min length: 3

Characters and Unicode

Total characters: 3019
Distinct characters: 39
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 260
Unique (%): 65.5%

Sample

1st row: 제2023-31호
2nd row: 제2023-32호
3rd row: 제2023-33호
4th row: 제2023-34호
5th row: 제2023-35호
ValueCountFrequency (%)
18
 
4.3%
제2023-40호 3
 
0.7%
제2023-20호 3
 
0.7%
제2023-38호 3
 
0.7%
제2023-04호 3
 
0.7%
제2023-09호 3
 
0.7%
제2023-11호 3
 
0.7%
제2023-14호 3
 
0.7%
제2023-16호 3
 
0.7%
제2023-27호 3
 
0.7%
Other values (298) 370
89.2%

Most occurring characters

ValueCountFrequency (%)
2 647
21.4%
0 442
14.6%
- 406
13.4%
3 257
 
8.5%
203
 
6.7%
201
 
6.7%
1 182
 
6.0%
4 83
 
2.7%
5 47
 
1.6%
6 46
 
1.5%
Other values (29) 505
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1829
60.6%
Other Letter 682
 
22.6%
Dash Punctuation 406
 
13.4%
Uppercase Letter 73
 
2.4%
Space Separator 18
 
0.6%
Open Punctuation 4
 
0.1%
Close Punctuation 4
 
0.1%
Other Punctuation 3
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
203
29.8%
201
29.5%
45
 
6.6%
45
 
6.6%
39
 
5.7%
33
 
4.8%
18
 
2.6%
18
 
2.6%
17
 
2.5%
17
 
2.5%
Other values (9) 46
 
6.7%
Decimal Number
ValueCountFrequency (%)
2 647
35.4%
0 442
24.2%
3 257
 
14.1%
1 182
 
10.0%
4 83
 
4.5%
5 47
 
2.6%
6 46
 
2.5%
7 45
 
2.5%
8 44
 
2.4%
9 36
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
H 29
39.7%
A 21
28.8%
C 12
16.4%
B 8
 
11.0%
D 3
 
4.1%
Dash Punctuation
ValueCountFrequency (%)
- 406
100.0%
Space Separator
ValueCountFrequency (%)
18
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2264
75.0%
Hangul 682
 
22.6%
Latin 73
 
2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
203
29.8%
201
29.5%
45
 
6.6%
45
 
6.6%
39
 
5.7%
33
 
4.8%
18
 
2.6%
18
 
2.6%
17
 
2.5%
17
 
2.5%
Other values (9) 46
 
6.7%
Common
ValueCountFrequency (%)
2 647
28.6%
0 442
19.5%
- 406
17.9%
3 257
 
11.4%
1 182
 
8.0%
4 83
 
3.7%
5 47
 
2.1%
6 46
 
2.0%
7 45
 
2.0%
8 44
 
1.9%
Other values (5) 65
 
2.9%
Latin
ValueCountFrequency (%)
H 29
39.7%
A 21
28.8%
C 12
16.4%
B 8
 
11.0%
D 3
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2337
77.4%
Hangul 682
 
22.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 647
27.7%
0 442
18.9%
- 406
17.4%
3 257
 
11.0%
1 182
 
7.8%
4 83
 
3.6%
5 47
 
2.0%
6 46
 
2.0%
7 45
 
1.9%
8 44
 
1.9%
Other values (10) 138
 
5.9%
Hangul
ValueCountFrequency (%)
203
29.8%
201
29.5%
45
 
6.6%
45
 
6.6%
39
 
5.7%
33
 
4.8%
18
 
2.6%
18
 
2.6%
17
 
2.5%
17
 
2.5%
Other values (9) 46
 
6.7%

업종명
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.7 KiB

Value | Count
<NA> | 171
길벗가게 | 106
스낵 | 51
구두수선대 | 50
가두판매대 | 45
Other values (23) | 160

Length

Max length: 9
Median length: 8
Mean length: 4.1955403
Min length: 2

Unique

Unique: 11
Unique (%): 1.9%

Sample

1st row: 교통카드판매소
2nd row: 교통카드판매소
3rd row: 교통카드판매소
4th row: 교통카드판매소
5th row: 교통카드판매소

Common Values

Value | Count | Frequency (%)
<NA> | 171 | 29.3%
길벗가게 | 106 | 18.2%
스낵 | 51 | 8.7%
구두수선대 | 50 | 8.6%
가두판매대 | 45 | 7.7%
구두수선소 | 42 | 7.2%
가로판매대 | 25 | 4.3%
교통카드판매소 | 17 | 2.9%
구두수선점 | 13 | 2.2%
가로판매점 | 12 | 2.1%
Other values (18) | 51 | 8.7%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
na 171
29.2%
길벗가게 106
18.1%
스낵 51
 
8.7%
구두수선대 50
 
8.5%
가두판매대 45
 
7.7%
구두수선소 42
 
7.2%
가로판매대 25
 
4.3%
교통카드판매소 17
 
2.9%
구두수선점 13
 
2.2%
가로판매점 12
 
2.1%
Other values (20) 53
 
9.1%

폐업여부
Boolean

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 0.5%
Missing: 183
Missing (%): 31.4%
Memory size: 1.3 KiB

Value | Count | Frequency (%)
False | 391 | 67.1%
True | 9 | 1.5%
(Missing) | 183 | 31.4%

인허가일자
Date

MISSING 

Distinct: 37
Distinct (%): 10.1%
Missing: 215
Missing (%): 36.9%
Memory size: 4.7 KiB
Minimum: 1998-01-01 00:00:00
Maximum: 2023-01-01 00:00:00
[Figure: histogram with fixed size bins (bins=37)]

점용면적
Real number (ℝ)

MISSING 

Distinct: 59
Distinct (%): 14.3%
Missing: 171
Missing (%): 29.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1935437
Minimum: 1.8
Maximum: 7.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 1.8
5-th percentile: 3
Q1: 3.8
Median: 4.14
Q3: 4.5
95-th percentile: 5.4
Maximum: 7.1
Range: 5.3
Interquartile range (IQR): 0.7

Descriptive statistics

Standard deviation: 0.75780196
Coefficient of variation (CV): 0.18070682
Kurtosis: 1.892716
Mean: 4.1935437
Median Absolute Deviation (MAD): 0.36
Skewness: 0.14327625
Sum: 1727.74
Variance: 0.57426381
Monotonicity: Not monotonic
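
Several of the 점용면적 statistics are derived from others, so they can be checked against each other. The sketch below uses only values printed in this section and the standard textbook definitions (CV = standard deviation / mean, IQR = Q3 − Q1, range = maximum − minimum); the variable names are illustrative.

```python
import math

# Values copied from the 점용면적 summary.
mean = 4.1935437
std = 0.75780196
variance = 0.57426381
q1, q3 = 3.8, 4.5
minimum, maximum = 1.8, 7.1

cv = std / mean                          # coefficient of variation
iqr = q3 - q1                            # interquartile range
value_range = maximum - minimum          # range
std_from_variance = math.sqrt(variance)  # should agree with the reported std

print(f"CV={cv:.8f}  IQR={iqr:.1f}  range={value_range:.1f}")
```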
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
4.5 98
16.8%
4.08 54
 
9.3%
4.14 28
 
4.8%
4.84 26
 
4.5%
4.0 24
 
4.1%
3.61 20
 
3.4%
3.0 18
 
3.1%
3.6 12
 
2.1%
3.52 11
 
1.9%
4.8 10
 
1.7%
Other values (49) 111
19.0%
(Missing) 171
29.3%
ValueCountFrequency (%)
1.8 3
 
0.5%
2.25 5
 
0.9%
2.38 1
 
0.2%
2.4 1
 
0.2%
2.66 3
 
0.5%
2.8 1
 
0.2%
2.88 1
 
0.2%
3.0 18
3.1%
3.08 3
 
0.5%
3.09 1
 
0.2%
ValueCountFrequency (%)
7.1 1
 
0.2%
7.0 1
 
0.2%
6.7 1
 
0.2%
6.3 1
 
0.2%
6.11 9
1.5%
6.0 3
 
0.5%
5.7 1
 
0.2%
5.4 6
1.0%
5.32 1
 
0.2%
5.3 1
 
0.2%

정제도로명주소
Text

MISSING 

Distinct: 242
Distinct (%): 54.4%
Missing: 138
Missing (%): 23.7%
Memory size: 4.7 KiB

Length

Max length: 33
Median length: 31
Mean length: 21.54382
Min length: 13

Characters and Unicode

Total characters: 9587
Distinct characters: 155
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 222
Unique (%): 49.9%

Sample

1st row: 경기도 고양시 덕양구 화정로 지하 60
2nd row: 경기도 구리시 건원대로 55
3rd row: 경기도 수원시 장안구 송원로 65
4th row: 경기도 수원시 장안구 연무로42번길 12
5th row: 경기도 수원시 장안구 장안로115번길 38
ValueCountFrequency (%)
경기도 445
19.8%
가평군 171
 
7.6%
가평읍 96
 
4.3%
34번길 96
 
4.3%
일원 96
 
4.3%
보납로 96
 
4.3%
수원시 82
 
3.6%
부천시 73
 
3.2%
성남시 69
 
3.1%
시장중앙로 48
 
2.1%
Other values (382) 981
43.5%

Most occurring characters

ValueCountFrequency (%)
1808
18.9%
457
 
4.8%
446
 
4.7%
445
 
4.6%
445
 
4.6%
1 357
 
3.7%
326
 
3.4%
323
 
3.4%
268
 
2.8%
3 205
 
2.1%
Other values (145) 4507
47.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6062
63.2%
Space Separator 1808
 
18.9%
Decimal Number 1384
 
14.4%
Dash Punctuation 200
 
2.1%
Close Punctuation 63
 
0.7%
Open Punctuation 63
 
0.7%
Connector Punctuation 7
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
457
 
7.5%
446
 
7.4%
445
 
7.3%
445
 
7.3%
326
 
5.4%
323
 
5.3%
268
 
4.4%
198
 
3.3%
194
 
3.2%
171
 
2.8%
Other values (130) 2789
46.0%
Decimal Number
ValueCountFrequency (%)
1 357
25.8%
3 205
14.8%
2 194
14.0%
4 167
12.1%
9 102
 
7.4%
0 86
 
6.2%
5 84
 
6.1%
6 68
 
4.9%
7 64
 
4.6%
8 57
 
4.1%
Space Separator
ValueCountFrequency (%)
1808
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 200
100.0%
Close Punctuation
ValueCountFrequency (%)
) 63
100.0%
Open Punctuation
ValueCountFrequency (%)
( 63
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6062
63.2%
Common 3525
36.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
457
 
7.5%
446
 
7.4%
445
 
7.3%
445
 
7.3%
326
 
5.4%
323
 
5.3%
268
 
4.4%
198
 
3.3%
194
 
3.2%
171
 
2.8%
Other values (130) 2789
46.0%
Common
ValueCountFrequency (%)
1808
51.3%
1 357
 
10.1%
3 205
 
5.8%
- 200
 
5.7%
2 194
 
5.5%
4 167
 
4.7%
9 102
 
2.9%
0 86
 
2.4%
5 84
 
2.4%
6 68
 
1.9%
Other values (5) 254
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6062
63.2%
ASCII 3525
36.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1808
51.3%
1 357
 
10.1%
3 205
 
5.8%
- 200
 
5.7%
2 194
 
5.5%
4 167
 
4.7%
9 102
 
2.9%
0 86
 
2.4%
5 84
 
2.4%
6 68
 
1.9%
Other values (5) 254
 
7.2%
Hangul
ValueCountFrequency (%)
457
 
7.5%
446
 
7.4%
445
 
7.3%
445
 
7.3%
326
 
5.4%
323
 
5.3%
268
 
4.4%
198
 
3.3%
194
 
3.2%
171
 
2.8%
Other values (130) 2789
46.0%

정제지번주소
Text

MISSING 

Distinct: 316
Distinct (%): 65.2%
Missing: 98
Missing (%): 16.8%
Memory size: 4.7 KiB

Length

Max length: 54
Median length: 49
Mean length: 25.34433
Min length: 15

Characters and Unicode

Total characters: 12292
Distinct characters: 194
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 264
Unique (%): 54.4%

Sample

1st row: 경기도 고양시 덕양구 화정동 1098번지 (970번지선)
2nd row: 경기도 고양시 덕양구 행신동 858번지 (783번지선)
3rd row: 경기도 고양시 덕양구 성사동 386-5번지 (726번지선)
4th row: 경기도 고양시 덕양구 화정동 971-3번지 (974번지선)
5th row: 경기도 고양시 덕양구 행신동 1019번지 (993번지선)
ValueCountFrequency (%)
경기도 486
 
18.1%
고양시 141
 
5.2%
수원시 94
 
3.5%
가평군 75
 
2.8%
부천시 71
 
2.6%
성남시 69
 
2.6%
원미구 59
 
2.2%
일산동구 52
 
1.9%
청평면 48
 
1.8%
청평리 48
 
1.8%
Other values (532) 1546
57.5%

Most occurring characters

ValueCountFrequency (%)
2206
 
17.9%
1 537
 
4.4%
500
 
4.1%
490
 
4.0%
486
 
4.0%
486
 
4.0%
421
 
3.4%
414
 
3.4%
2 335
 
2.7%
- 306
 
2.5%
Other values (184) 6111
49.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 7136
58.1%
Decimal Number 2315
 
18.8%
Space Separator 2206
 
17.9%
Dash Punctuation 306
 
2.5%
Close Punctuation 157
 
1.3%
Open Punctuation 157
 
1.3%
Other Punctuation 12
 
0.1%
Lowercase Letter 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
500
 
7.0%
490
 
6.9%
486
 
6.8%
486
 
6.8%
421
 
5.9%
414
 
5.8%
278
 
3.9%
261
 
3.7%
195
 
2.7%
185
 
2.6%
Other values (167) 3420
47.9%
Decimal Number
ValueCountFrequency (%)
1 537
23.2%
2 335
14.5%
8 217
9.4%
4 203
 
8.8%
3 199
 
8.6%
6 177
 
7.6%
0 172
 
7.4%
7 163
 
7.0%
5 160
 
6.9%
9 152
 
6.6%
Lowercase Letter
ValueCountFrequency (%)
s 2
66.7%
b 1
33.3%
Space Separator
ValueCountFrequency (%)
2206
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 306
100.0%
Close Punctuation
ValueCountFrequency (%)
) 157
100.0%
Open Punctuation
ValueCountFrequency (%)
( 157
100.0%
Other Punctuation
ValueCountFrequency (%)
, 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 7136
58.1%
Common 5153
41.9%
Latin 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
500
 
7.0%
490
 
6.9%
486
 
6.8%
486
 
6.8%
421
 
5.9%
414
 
5.8%
278
 
3.9%
261
 
3.7%
195
 
2.7%
185
 
2.6%
Other values (167) 3420
47.9%
Common
ValueCountFrequency (%)
2206
42.8%
1 537
 
10.4%
2 335
 
6.5%
- 306
 
5.9%
8 217
 
4.2%
4 203
 
3.9%
3 199
 
3.9%
6 177
 
3.4%
0 172
 
3.3%
7 163
 
3.2%
Other values (5) 638
 
12.4%
Latin
ValueCountFrequency (%)
s 2
66.7%
b 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 7136
58.1%
ASCII 5156
41.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2206
42.8%
1 537
 
10.4%
2 335
 
6.5%
- 306
 
5.9%
8 217
 
4.2%
4 203
 
3.9%
3 199
 
3.9%
6 177
 
3.4%
0 172
 
3.3%
7 163
 
3.2%
Other values (7) 641
 
12.4%
Hangul
ValueCountFrequency (%)
500
 
7.0%
490
 
6.9%
486
 
6.8%
486
 
6.8%
421
 
5.9%
414
 
5.8%
278
 
3.9%
261
 
3.7%
195
 
2.7%
185
 
2.6%
Other values (167) 3420
47.9%

정제우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 195
Distinct (%): 42.8%
Missing: 127
Missing (%): 21.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 13288.064
Minimum: 10237
Maximum: 16713
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 10237
5-th percentile: 10364
Q1: 10586.75
Median: 13314.5
Q3: 14644
95-th percentile: 16565.25
Maximum: 16713
Range: 6476
Interquartile range (IQR): 4057.25

Descriptive statistics

Standard deviation: 2157.4064
Coefficient of variation (CV): 0.16235672
Kurtosis: -1.1825515
Mean: 13288.064
Median Absolute Deviation (MAD): 1362
Skewness: 0.085299818
Sum: 6059357
Variance: 4654402.6
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
12453 48
 
8.2%
12465 27
 
4.6%
10386 23
 
3.9%
14637 13
 
2.2%
10500 13
 
2.2%
10364 12
 
2.1%
14644 10
 
1.7%
10414 10
 
1.7%
14742 8
 
1.4%
12621 8
 
1.4%
Other values (185) 284
48.7%
(Missing) 127
21.8%
ValueCountFrequency (%)
10237 1
 
0.2%
10241 1
 
0.2%
10242 1
 
0.2%
10286 1
 
0.2%
10293 3
0.5%
10306 1
 
0.2%
10338 2
0.3%
10346 1
 
0.2%
10357 1
 
0.2%
10358 1
 
0.2%
ValueCountFrequency (%)
16713 2
0.3%
16709 1
0.2%
16708 1
0.2%
16707 1
0.2%
16704 1
0.2%
16703 1
0.2%
16700 1
0.2%
16699 2
0.3%
16698 2
0.3%
16692 2
0.3%

정제WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 298
Distinct (%): 62.3%
Missing: 105
Missing (%): 18.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.517153
Minimum: 37.237227
Maximum: 37.831801
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 37.237227
5-th percentile: 37.261396
Q1: 37.378203
Median: 37.504463
Q3: 37.669325
95-th percentile: 37.734606
Maximum: 37.831801
Range: 0.59457359
Interquartile range (IQR): 0.29112227

Descriptive statistics

Standard deviation: 0.1639575
Coefficient of variation (CV): 0.0043702009
Kurtosis: -1.35122
Mean: 37.517153
Median Absolute Deviation (MAD): 0.15832218
Skewness: -0.27301937
Sum: 17933.199
Variance: 0.026882061
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
37.7346061435 48
 
8.2%
37.6775904442 27
 
4.6%
37.6697335206 15
 
2.6%
37.4869364026 8
 
1.4%
37.2956458819 8
 
1.4%
37.6520635602 7
 
1.2%
37.6567111412 6
 
1.0%
37.6645130115 6
 
1.0%
37.6627851331 6
 
1.0%
37.6709272392 5
 
0.9%
Other values (288) 342
58.7%
(Missing) 105
 
18.0%
ValueCountFrequency (%)
37.23722736 1
0.2%
37.2392058629 1
0.2%
37.2451780596 1
0.2%
37.2460950051 1
0.2%
37.2464754073 1
0.2%
37.2490869204 1
0.2%
37.2496770879 1
0.2%
37.2505474909 1
0.2%
37.2509511287 1
0.2%
37.2514263539 1
0.2%
ValueCountFrequency (%)
37.8318009486 1
 
0.2%
37.7346061435 48
8.2%
37.6969627091 1
 
0.2%
37.6946392537 1
 
0.2%
37.6933893917 1
 
0.2%
37.6919995515 1
 
0.2%
37.68872941 1
 
0.2%
37.6878832356 1
 
0.2%
37.6873787383 1
 
0.2%
37.6780504044 1
 
0.2%

정제WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 298
Distinct (%): 62.3%
Missing: 105
Missing (%): 18.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.01344
Minimum: 126.74777
Maximum: 127.63755
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 KiB

Quantile statistics

Minimum: 126.74777
5-th percentile: 126.75303
Q1: 126.77817
Median: 127.00786
Q3: 127.13316
95-th percentile: 127.4929
Maximum: 127.63755
Range: 0.8897837
Interquartile range (IQR): 0.35499092

Descriptive statistics

Standard deviation: 0.25038193
Coefficient of variation (CV): 0.0019713026
Kurtosis: -0.54641902
Mean: 127.01344
Median Absolute Deviation (MAD): 0.219917
Skewness: 0.74208925
Sum: 60712.423
Variance: 0.062691108
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
127.415880581 48
 
8.2%
127.4929009409 27
 
4.6%
126.7626394714 15
 
2.6%
126.7530269102 8
 
1.4%
127.6375487482 8
 
1.4%
126.7780161833 7
 
1.2%
126.7716224656 6
 
1.0%
126.7652308692 6
 
1.0%
126.7671073188 6
 
1.0%
126.7618339197 5
 
0.9%
Other values (288) 342
58.7%
(Missing) 105
 
18.0%
ValueCountFrequency (%)
126.7477650455 2
0.3%
126.7478408715 2
0.3%
126.7480186671 1
0.2%
126.7510489352 1
0.2%
126.7523469055 1
0.2%
126.7523682616 2
0.3%
126.7523870918 2
0.3%
126.7524263231 1
0.2%
126.7527945904 1
0.2%
126.7528074666 1
0.2%
ValueCountFrequency (%)
127.6375487482 8
 
1.4%
127.4929009409 27
4.6%
127.415880581 48
8.2%
127.2362512352 1
 
0.2%
127.2355632511 1
 
0.2%
127.2130625029 1
 
0.2%
127.2084336222 1
 
0.2%
127.2082411875 1
 
0.2%
127.2019461542 1
 
0.2%
127.1669905006 1
 
0.2%

Correlations

Variable | 시군명 | 업종명 | 폐업여부 | 인허가일자 | 점용면적 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
시군명 | 1.000 | 0.971 | 0.584 | 0.998 | 0.723 | 0.975 | 0.939 | 0.956
업종명 | 0.971 | 1.000 | 0.689 | 0.958 | 0.805 | 0.969 | 0.941 | 0.928
폐업여부 | 0.584 | 0.689 | 1.000 | 0.826 | 0.195 | 0.533 | 0.177 | 0.547
인허가일자 | 0.998 | 0.958 | 0.826 | 1.000 | 0.903 | 0.996 | 0.981 | 0.964
점용면적 | 0.723 | 0.805 | 0.195 | 0.903 | 1.000 | 0.631 | 0.552 | 0.592
정제우편번호 | 0.975 | 0.969 | 0.533 | 0.996 | 0.631 | 1.000 | 0.916 | 0.862
정제WGS84위도 | 0.939 | 0.941 | 0.177 | 0.981 | 0.552 | 0.916 | 1.000 | 0.877
정제WGS84경도 | 0.956 | 0.928 | 0.547 | 0.964 | 0.592 | 0.862 | 0.877 | 1.000

Variable | 업종명 | 시군명 | 폐업여부
업종명 | 1.000 | 0.807 | 0.590
시군명 | 0.807 | 1.000 | 0.582
폐업여부 | 0.590 | 0.582 | 1.000

Variable | 점용면적 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도 | 시군명 | 업종명 | 폐업여부
점용면적 | 1.000 | -0.043 | 0.043 | -0.120 | 0.412 | 0.439 | 0.148
정제우편번호 | -0.043 | 1.000 | -0.811 | 0.084 | 0.928 | 0.828 | 0.568
정제WGS84위도 | 0.043 | -0.811 | 1.000 | -0.013 | 0.774 | 0.727 | 0.175
정제WGS84경도 | -0.120 | 0.084 | -0.013 | 1.000 | 0.818 | 0.717 | 0.584
시군명 | 0.412 | 0.928 | 0.774 | 0.818 | 1.000 | 0.807 | 0.582
업종명 | 0.439 | 0.828 | 0.727 | 0.717 | 0.807 | 1.000 | 0.590
폐업여부 | 0.148 | 0.568 | 0.175 | 0.584 | 0.582 | 0.590 | 1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
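
To make "nullity correlation" concrete, the sketch below computes a Pearson correlation between missingness indicators for two columns. It assumes that nullity correlation means the Pearson correlation of 0/1 missing flags, and it uses only the first ten rows from the Sample section, so the result illustrates the idea and is not the value shown in the report's heatmap. The `pearson` helper is written out by hand for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Missingness indicators (1 = <NA>) for sample rows 0..9.
zip_missing = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # 정제우편번호
lat_missing = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # 정제WGS84위도

nullity_corr = pearson(zip_missing, lat_missing)
print(f"nullity correlation = {nullity_corr:.3f}")
```

A value near 1 would mean the two columns tend to be missing in the same rows; a value near 0 would mean their missingness is unrelated.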

Sample

Row | 시군명 | 점포명 | 허가번호 | 업종명 | 폐업여부 | 인허가일자 | 점용면적 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
0 | 고양시 | 교통카드판매소 | 제2023-31호 | 교통카드판매소 | N | 2022-12-30 | 5.04 | 경기도 고양시 덕양구 화정로 지하 60 | 경기도 고양시 덕양구 화정동 1098번지 (970번지선) | 10500 | 37.634503 | 126.832622
1 | 고양시 | 교통카드판매소 | 제2023-32호 | 교통카드판매소 | N | 2022-12-30 | 3.6 | <NA> | 경기도 고양시 덕양구 행신동 858번지 (783번지선) | 10527 | 37.615419 | 126.838571
2 | 고양시 | 교통카드판매소 | 제2023-33호 | 교통카드판매소 | N | 2022-12-30 | 3.08 | <NA> | 경기도 고양시 덕양구 성사동 386-5번지 (726번지선) | <NA> | <NA> | <NA>
3 | 고양시 | 교통카드판매소 | 제2023-34호 | 교통카드판매소 | N | 2022-12-30 | 2.8 | <NA> | 경기도 고양시 덕양구 화정동 971-3번지 (974번지선) | 10500 | 37.634418 | 126.832924
4 | 고양시 | 교통카드판매소 | 제2023-35호 | 교통카드판매소 | N | 2022-12-30 | 3.6 | <NA> | 경기도 고양시 덕양구 행신동 1019번지 (993번지선) | <NA> | 37.623161 | 126.835982
5 | 고양시 | 교통카드판매소 | 제2023-36호 | 교통카드판매소 | N | 2022-12-30 | 2.4 | <NA> | 경기도 고양시 덕양구 관산동 227-9번지 (227-11번지선) | 10286 | 37.687883 | 126.864817
6 | 고양시 | 교통카드판매소 | 제2023-37호 | 교통카드판매소 | N | 2022-12-30 | 3.6 | <NA> | 경기도 고양시 덕양구 행신동 187-2번지 (971번지선) | 10492 | 37.618522 | 126.844415
7 | 고양시 | 교통카드판매소 | 제2023-39호 | 교통카드판매소 | N | 2022-12-30 | 3.08 | <NA> | 경기도 고양시 덕양구 행신동 881번지 (796번지선) | <NA> | 37.618118 | 126.844875
8 | 구리시 | 구두수선점 | 2019-158 | 구두수선점 | N | 2020-01-01 | 4.86 | 경기도 구리시 건원대로 55 | 경기도 구리시 인창동 679-7 | 11918 | 37.605968 | 127.139303
9 | 고양시 | 구두수선소 | 제2023-41호 | 구두수선소 | N | 2022-12-30 | 4.48 | <NA> | 경기도 고양시 덕양구 삼송동 26-11번지 (26-8번지선) | 10590 | 37.653369 | 126.894245

Row | 시군명 | 점포명 | 허가번호 | 업종명 | 폐업여부 | 인허가일자 | 점용면적 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
573 | 가평군 | 청평5일시장(양말②,속옷②) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
574 | 가평군 | 청평5일시장(고추) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
575 | 가평군 | 청평5일시장(의류③) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
576 | 가평군 | 청평5일시장(과일②) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
577 | 가평군 | 청평5일시장(신발) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
578 | 가평군 | 청평5일시장(약재,건강식품) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
579 | 가평군 | 청평5일시장(먹거리) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
580 | 가평군 | 청평5일시장(낚시) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
581 | 가평군 | 청평5일시장(왕애치킨) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881
582 | 가평군 | 청평5일시장(각종약초) | <NA> | <NA> | <NA> | <NA> | <NA> | 경기도 가평군 청평면 시장중앙로 19 | 경기도 가평군 청평면 청평리 81-2번지 | 12453 | 37.734606 | 127.415881

Duplicate rows

Most frequently occurring

Row | 시군명 | 점포명 | 허가번호 | 업종명 | 폐업여부 | 인허가일자 | 점용면적 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도 | # duplicates
0 | 부천시 | 햇살가게 | B-2,3 | 스낵 | N | 2012-10-15 | 3.33 | 경기도 부천시 송내대로 37-2 | 경기도 부천시 소사구 송내동 280-11번지 | 14742 | 37.486936 | 126.753027 | 3