Overview

Dataset statistics

Number of variables: 14
Number of observations: 10000
Missing cells: 35159
Missing cells (%): 25.1%
Duplicate rows: 3
Duplicate rows (%): < 0.1%
Total size in memory: 1.2 MiB
Average record size in memory: 126.0 B
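
The missing-cell percentage can be checked from the other figures in this overview: it is the number of missing cells divided by the total number of cells (variables × observations). The short sketch below redoes that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 10_000
missing_cells = 35_159

# Total cells in the table, then the share of them that are missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")  # 25.1%
```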

Variable types

Text: 5
Categorical: 2
DateTime: 1
Unsupported: 3
Numeric: 3

Dataset

Description: Status of general restaurants, Japanese cuisine (일반음식점(일식) 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=BHUQYP2K2WG5MUEQAQFB13646445&infSeq=1

Alerts

위생업태명 has constant value "일식" (Constant)
Dataset has 3 (< 0.1%) duplicate rows (Duplicates)
소재지우편번호 is highly overall correlated with WGS84위도 (High correlation)
WGS84위도 is highly overall correlated with 소재지우편번호 (High correlation)
폐업일자 has 3986 (39.9%) missing values (Missing)
다중이용업소여부 has 10000 (100.0%) missing values (Missing)
총시설규모(㎡) has 10000 (100.0%) missing values (Missing)
위생업종명 has 10000 (100.0%) missing values (Missing)
소재지도로명주소 has 524 (5.2%) missing values (Missing)
소재지우편번호 has 235 (2.4%) missing values (Missing)
WGS84위도 has 206 (2.1%) missing values (Missing)
WGS84경도 has 206 (2.1%) missing values (Missing)
다중이용업소여부 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
총시설규모(㎡) is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
위생업종명 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-10 21:16:50.623374
Analysis finished: 2023-12-10 21:16:55.138737
Duration: 4.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Text

Distinct: 58
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.8804
Min length: 3

Characters and Unicode

Total characters: 38804
Distinct characters: 39
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
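
As a rough illustration of how per-category character counts like the ones in this section can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` and the choice of input strings (taken from sample values shown in this report) are assumptions. The two-letter codes map to the long names used here: `Lo` is "Other Letter" and `Zs` is "Space Separator". The standard library does not expose scripts or blocks, so only categories are counted.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# A few sample values that appear in this report.
sample = ["화성시", "수원시", "고양시", "동우회 화정점"]
counts = category_counts(sample)
print(counts)  # Hangul syllables fall under 'Lo', the space under 'Zs'
```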

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 화성시
2nd row: 수원시
3rd row: 고양시
4th row: 화성시
5th row: 수원시
Value | Count | Frequency (%)
부천시 | 1014 | 10.1%
고양시 | 1001 | 10.0%
수원시 | 924 | 9.2%
성남시 | 898 | 9.0%
안양시 | 600 | 6.0%
평택시 | 586 | 5.9%
용인시 | 545 | 5.5%
안산시 | 494 | 4.9%
화성시 | 472 | 4.7%
시흥시 | 442 | 4.4%
Other values (21) | 3024 | 30.2%

Most occurring characters

ValueCountFrequency (%)
10371
26.7%
8021
20.7%
2161
 
5.6%
1511
 
3.9%
1455
 
3.7%
1333
 
3.4%
1322
 
3.4%
1179
 
3.0%
1046
 
2.7%
1001
 
2.6%
Other values (29) 9404
24.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30783
79.3%
Space Separator 8021
 
20.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10371
33.7%
2161
 
7.0%
1511
 
4.9%
1455
 
4.7%
1333
 
4.3%
1322
 
4.3%
1179
 
3.8%
1046
 
3.4%
1001
 
3.3%
924
 
3.0%
Other values (28) 8480
27.5%
Space Separator
ValueCountFrequency (%)
8021
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30783
79.3%
Common 8021
 
20.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10371
33.7%
2161
 
7.0%
1511
 
4.9%
1455
 
4.7%
1333
 
4.3%
1322
 
4.3%
1179
 
3.8%
1046
 
3.4%
1001
 
3.3%
924
 
3.0%
Other values (28) 8480
27.5%
Common
ValueCountFrequency (%)
8021
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30783
79.3%
ASCII 8021
 
20.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10371
33.7%
2161
 
7.0%
1511
 
4.9%
1455
 
4.7%
1333
 
4.3%
1322
 
4.3%
1179
 
3.8%
1046
 
3.4%
1001
 
3.3%
924
 
3.0%
Other values (28) 8480
27.5%
ASCII
ValueCountFrequency (%)
8021
100.0%

사업장명
Text

Distinct: 8324
Distinct (%): 83.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 34
Median length: 31
Mean length: 5.6081
Min length: 1

Characters and Unicode

Total characters: 56081
Distinct characters: 987
Distinct categories: 12
Distinct scripts: 6
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7448
Unique (%): 74.5%

Sample

1st row: 나루참치
2nd row: 스시나미
3rd row: 동우회 화정점
4th row: 스시규
5th row: 음식문화축제 일식협의회
Value | Count | Frequency (%)
스시 | 47 | 0.4%
미소야 | 37 | 0.3%
참치 | 31 | 0.3%
초밥 | 25 | 0.2%
판교점 | 20 | 0.2%
(value missing in extract) | 19 | 0.2%
동탄점 | 19 | 0.2%
일산점 | 19 | 0.2%
바다횟집 | 17 | 0.1%
청해수산 | 17 | 0.1%
Other values (8653) | 11633 | 97.9%

Most occurring characters

ValueCountFrequency (%)
1886
 
3.4%
1626
 
2.9%
1608
 
2.9%
1578
 
2.8%
1285
 
2.3%
1108
 
2.0%
1097
 
2.0%
1031
 
1.8%
1010
 
1.8%
938
 
1.7%
Other values (977) 42914
76.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 51362
91.6%
Space Separator 1886
 
3.4%
Decimal Number 642
 
1.1%
Close Punctuation 519
 
0.9%
Open Punctuation 514
 
0.9%
Uppercase Letter 499
 
0.9%
Lowercase Letter 444
 
0.8%
Other Punctuation 197
 
0.4%
Dash Punctuation 13
 
< 0.1%
Letter Number 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1626
 
3.2%
1608
 
3.1%
1578
 
3.1%
1285
 
2.5%
1108
 
2.2%
1097
 
2.1%
1031
 
2.0%
1010
 
2.0%
938
 
1.8%
900
 
1.8%
Other values (899) 39181
76.3%
Uppercase Letter
ValueCountFrequency (%)
O 53
 
10.6%
A 42
 
8.4%
S 37
 
7.4%
T 35
 
7.0%
K 33
 
6.6%
I 31
 
6.2%
N 30
 
6.0%
U 27
 
5.4%
R 25
 
5.0%
E 20
 
4.0%
Other values (15) 166
33.3%
Lowercase Letter
ValueCountFrequency (%)
e 59
13.3%
i 48
10.8%
o 39
 
8.8%
n 34
 
7.7%
a 33
 
7.4%
s 33
 
7.4%
h 30
 
6.8%
r 23
 
5.2%
u 18
 
4.1%
t 18
 
4.1%
Other values (14) 109
24.5%
Other Punctuation
ValueCountFrequency (%)
& 96
48.7%
, 41
20.8%
. 29
 
14.7%
' 9
 
4.6%
· 8
 
4.1%
! 4
 
2.0%
? 3
 
1.5%
/ 3
 
1.5%
# 2
 
1.0%
: 1
 
0.5%
Decimal Number
ValueCountFrequency (%)
0 136
21.2%
9 120
18.7%
1 110
17.1%
2 83
12.9%
3 66
10.3%
5 37
 
5.8%
8 30
 
4.7%
4 27
 
4.2%
7 19
 
3.0%
6 14
 
2.2%
Math Symbol
ValueCountFrequency (%)
+ 1
50.0%
~ 1
50.0%
Space Separator
ValueCountFrequency (%)
1886
100.0%
Close Punctuation
ValueCountFrequency (%)
) 519
100.0%
Open Punctuation
ValueCountFrequency (%)
( 514
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 51275
91.4%
Common 3774
 
6.7%
Latin 945
 
1.7%
Han 73
 
0.1%
Hiragana 11
 
< 0.1%
Katakana 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1626
 
3.2%
1608
 
3.1%
1578
 
3.1%
1285
 
2.5%
1108
 
2.2%
1097
 
2.1%
1031
 
2.0%
1010
 
2.0%
938
 
1.8%
900
 
1.8%
Other values (837) 39094
76.2%
Latin
ValueCountFrequency (%)
e 59
 
6.2%
O 53
 
5.6%
i 48
 
5.1%
A 42
 
4.4%
o 39
 
4.1%
S 37
 
3.9%
T 35
 
3.7%
n 34
 
3.6%
a 33
 
3.5%
s 33
 
3.5%
Other values (40) 532
56.3%
Han
ValueCountFrequency (%)
5
 
6.8%
4
 
5.5%
4
 
5.5%
3
 
4.1%
3
 
4.1%
2
 
2.7%
2
 
2.7%
2
 
2.7%
2
 
2.7%
2
 
2.7%
Other values (39) 44
60.3%
Common
ValueCountFrequency (%)
1886
50.0%
) 519
 
13.8%
( 514
 
13.6%
0 136
 
3.6%
9 120
 
3.2%
1 110
 
2.9%
& 96
 
2.5%
2 83
 
2.2%
3 66
 
1.7%
, 41
 
1.1%
Other values (18) 203
 
5.4%
Hiragana
ValueCountFrequency (%)
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 51271
91.4%
ASCII 4709
 
8.4%
CJK 69
 
0.1%
Hiragana 11
 
< 0.1%
None 8
 
< 0.1%
CJK Compat Ideographs 4
 
< 0.1%
Compat Jamo 4
 
< 0.1%
Katakana 3
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1886
40.1%
) 519
 
11.0%
( 514
 
10.9%
0 136
 
2.9%
9 120
 
2.5%
1 110
 
2.3%
& 96
 
2.0%
2 83
 
1.8%
3 66
 
1.4%
e 59
 
1.3%
Other values (66) 1120
23.8%
Hangul
ValueCountFrequency (%)
1626
 
3.2%
1608
 
3.1%
1578
 
3.1%
1285
 
2.5%
1108
 
2.2%
1097
 
2.1%
1031
 
2.0%
1010
 
2.0%
938
 
1.8%
900
 
1.8%
Other values (833) 39090
76.2%
None
ValueCountFrequency (%)
· 8
100.0%
CJK
ValueCountFrequency (%)
5
 
7.2%
4
 
5.8%
4
 
5.8%
3
 
4.3%
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
Other values (37) 40
58.0%
Number Forms
ValueCountFrequency (%)
2
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
2
50.0%
2
50.0%
Hiragana
ValueCountFrequency (%)
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Compat Jamo
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

인허가일자
Text

Distinct: 5893
Distinct (%): 58.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 8
Mean length: 8.2514
Min length: 6

Characters and Unicode

Total characters: 82514
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3425
Unique (%): 34.2%

Sample

1st row: 20171220
2nd row: 20141010
3rd row: 2017-11-23
4th row: 2023-08-22
5th row: 20130924
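
The sample values mix two date layouts, `YYYYMMDD` and `YYYY-MM-DD`, which may be part of why this column is profiled as Text rather than DateTime. A minimal sketch of normalizing both layouts with the standard library follows. The helper name `parse_license_date` is hypothetical, and the sketch handles only the two layouts visible in the sample; the reported minimum length of 6 suggests other layouts exist, and those would raise `ValueError` here.

```python
from datetime import date, datetime


def parse_license_date(text):
    """Parse 'YYYYMMDD' or 'YYYY-MM-DD' into a date; raise ValueError otherwise."""
    for fmt in ("%Y%m%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"unrecognized date format: {text!r}")


# The five sample values listed above.
samples = ["20171220", "20141010", "2017-11-23", "2023-08-22", "20130924"]
parsed = [parse_license_date(s) for s in samples]
print([d.isoformat() for d in parsed])
```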
Value | Count | Frequency (%)
19981117 | 14 | 0.1%
20021227 | 11 | 0.1%
19990914 | 10 | 0.1%
19981202 | 9 | 0.1%
20210517 | 8 | 0.1%
20210520 | 8 | 0.1%
20210303 | 8 | 0.1%
19971103 | 8 | 0.1%
2023-04-07 | 8 | 0.1%
20060918 | 7 | 0.1%
Other values (5883) | 9909 | 99.1%

Most occurring characters

ValueCountFrequency (%)
0 24575
29.8%
2 17624
21.4%
1 14639
17.7%
9 6060
 
7.3%
3 3355
 
4.1%
8 2890
 
3.5%
7 2821
 
3.4%
6 2751
 
3.3%
4 2724
 
3.3%
5 2559
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 79998
97.0%
Dash Punctuation 2516
 
3.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 24575
30.7%
2 17624
22.0%
1 14639
18.3%
9 6060
 
7.6%
3 3355
 
4.2%
8 2890
 
3.6%
7 2821
 
3.5%
6 2751
 
3.4%
4 2724
 
3.4%
5 2559
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 2516
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 82514
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 24575
29.8%
2 17624
21.4%
1 14639
17.7%
9 6060
 
7.3%
3 3355
 
4.1%
8 2890
 
3.5%
7 2821
 
3.4%
6 2751
 
3.3%
4 2724
 
3.3%
5 2559
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 82514
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 24575
29.8%
2 17624
21.4%
1 14639
17.7%
9 6060
 
7.3%
3 3355
 
4.1%
8 2890
 
3.5%
7 2821
 
3.4%
6 2751
 
3.3%
4 2724
 
3.3%
5 2559
 
3.1%

영업상태명
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
폐업: 6014
영업: 3986

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업
2nd row: 폐업
3rd row: 영업
4th row: 영업
5th row: 폐업

Common Values

Value | Count | Frequency (%)
폐업 | 6014 | 60.1%
영업 | 3986 | 39.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
폐업 | 6014 | 60.1%
영업 | 3986 | 39.9%

폐업일자
Date

MISSING 

Distinct: 3723
Distinct (%): 61.9%
Missing: 3986
Missing (%): 39.9%
Memory size: 156.2 KiB
Minimum: 1991-05-07 00:00:00
Maximum: 2023-11-28 00:00:00
Histogram with fixed size bins (bins=50)

다중이용업소여부
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

총시설규모(㎡)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

위생업종명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

위생업태명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
일식: 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일식
2nd row: 일식
3rd row: 일식
4th row: 일식
5th row: 일식

Common Values

Value | Count | Frequency (%)
일식 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일식 | 10000 | 100.0%

소재지도로명주소
Text

Distinct: 8977
Distinct (%): 94.7%
Missing: 524
Missing (%): 5.2%
Memory size: 156.2 KiB

Length

Max length: 85
Median length: 62
Mean length: 31.468974
Min length: 13

Characters and Unicode

Total characters: 298200
Distinct characters: 643
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8620
Unique (%): 91.0%

Sample

1st row: 경기도 화성시 동탄문화센터로 65, 에이스타운 1층 132호 (반송동)
2nd row: 경기도 수원시 영통구 월드컵로97번길 53-2 (원천동)
3rd row: 경기도 고양시 덕양구 화신로260번길 31, 2(일부)층 (화정동, 백양빌딩)
4th row: 경기도 화성시 남양읍 시청로45번길 3-9, 1층 101호
5th row: 경기도 남양주시 별내5로5번길 4-22, 1층 (102)호 (별내동)
Value | Count | Frequency (%)
경기도 | 9476 | 15.2%
1층 | 2575 | 4.1%
고양시 | 974 | 1.6%
부천시 | 945 | 1.5%
성남시 | 887 | 1.4%
수원시 | 872 | 1.4%
분당구 | 609 | 1.0%
안양시 | 552 | 0.9%
평택시 | 541 | 0.9%
용인시 | 526 | 0.8%
Other values (9610) | 44484 | 71.2%

Most occurring characters

ValueCountFrequency (%)
53070
 
17.8%
1 16151
 
5.4%
10326
 
3.5%
9870
 
3.3%
9832
 
3.3%
9776
 
3.3%
9040
 
3.0%
8683
 
2.9%
2 7950
 
2.7%
, 7719
 
2.6%
Other values (633) 155783
52.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 167842
56.3%
Space Separator 53070
 
17.8%
Decimal Number 53033
 
17.8%
Other Punctuation 7769
 
2.6%
Close Punctuation 6565
 
2.2%
Open Punctuation 6563
 
2.2%
Dash Punctuation 2113
 
0.7%
Uppercase Letter 1018
 
0.3%
Lowercase Letter 143
 
< 0.1%
Math Symbol 64
 
< 0.1%
Other values (2) 20
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10326
 
6.2%
9870
 
5.9%
9832
 
5.9%
9776
 
5.8%
9040
 
5.4%
8683
 
5.2%
4769
 
2.8%
4761
 
2.8%
4427
 
2.6%
4318
 
2.6%
Other values (562) 92040
54.8%
Uppercase Letter
ValueCountFrequency (%)
B 244
24.0%
A 146
14.3%
C 92
 
9.0%
E 50
 
4.9%
I 49
 
4.8%
K 46
 
4.5%
S 45
 
4.4%
R 36
 
3.5%
D 36
 
3.5%
L 33
 
3.2%
Other values (16) 241
23.7%
Lowercase Letter
ValueCountFrequency (%)
e 30
21.0%
c 15
10.5%
a 15
10.5%
t 13
9.1%
i 10
 
7.0%
m 9
 
6.3%
b 8
 
5.6%
l 7
 
4.9%
n 7
 
4.9%
r 6
 
4.2%
Other values (8) 23
16.1%
Decimal Number
ValueCountFrequency (%)
1 16151
30.5%
2 7950
15.0%
0 5741
 
10.8%
3 5047
 
9.5%
4 3666
 
6.9%
5 3579
 
6.7%
6 2981
 
5.6%
7 2831
 
5.3%
8 2583
 
4.9%
9 2504
 
4.7%
Other Punctuation
ValueCountFrequency (%)
, 7719
99.4%
. 44
 
0.6%
· 2
 
< 0.1%
& 2
 
< 0.1%
/ 1
 
< 0.1%
@ 1
 
< 0.1%
Letter Number
ValueCountFrequency (%)
10
52.6%
6
31.6%
2
 
10.5%
1
 
5.3%
Math Symbol
ValueCountFrequency (%)
~ 62
96.9%
+ 2
 
3.1%
Space Separator
ValueCountFrequency (%)
53070
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6565
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6563
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2113
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 167841
56.3%
Common 129178
43.3%
Latin 1180
 
0.4%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10326
 
6.2%
9870
 
5.9%
9832
 
5.9%
9776
 
5.8%
9040
 
5.4%
8683
 
5.2%
4769
 
2.8%
4761
 
2.8%
4427
 
2.6%
4318
 
2.6%
Other values (561) 92039
54.8%
Latin
ValueCountFrequency (%)
B 244
20.7%
A 146
 
12.4%
C 92
 
7.8%
E 50
 
4.2%
I 49
 
4.2%
K 46
 
3.9%
S 45
 
3.8%
R 36
 
3.1%
D 36
 
3.1%
L 33
 
2.8%
Other values (38) 403
34.2%
Common
ValueCountFrequency (%)
53070
41.1%
1 16151
 
12.5%
2 7950
 
6.2%
, 7719
 
6.0%
) 6565
 
5.1%
( 6563
 
5.1%
0 5741
 
4.4%
3 5047
 
3.9%
4 3666
 
2.8%
5 3579
 
2.8%
Other values (13) 13127
 
10.2%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 167841
56.3%
ASCII 130337
43.7%
Number Forms 19
 
< 0.1%
None 2
 
< 0.1%
CJK 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
53070
40.7%
1 16151
 
12.4%
2 7950
 
6.1%
, 7719
 
5.9%
) 6565
 
5.0%
( 6563
 
5.0%
0 5741
 
4.4%
3 5047
 
3.9%
4 3666
 
2.8%
5 3579
 
2.7%
Other values (56) 14286
 
11.0%
Hangul
ValueCountFrequency (%)
10326
 
6.2%
9870
 
5.9%
9832
 
5.9%
9776
 
5.8%
9040
 
5.4%
8683
 
5.2%
4769
 
2.8%
4761
 
2.8%
4427
 
2.6%
4318
 
2.6%
Other values (561) 92039
54.8%
Number Forms
ValueCountFrequency (%)
10
52.6%
6
31.6%
2
 
10.5%
1
 
5.3%
None
ValueCountFrequency (%)
· 2
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

소재지지번주소
Text

Distinct: 9662
Distinct (%): 96.6%
Missing: 2
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 82
Median length: 60
Mean length: 27.461192
Min length: 13

Characters and Unicode

Total characters: 274557
Distinct characters: 624
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9379
Unique (%): 93.8%

Sample

1st row: 경기도 화성시 반송동 107-3 에이스타운 132호
2nd row: 경기도 수원시 영통구 원천동 619-2
3rd row: 경기도 고양시 덕양구 화정동 979-2 백양빌딩 2층 일부
4th row: 경기도 화성시 남양읍 남양리 2257-3
5th row: 경기도 수원시 팔달구 남창동 38-6 화성행궁주차장내
Value | Count | Frequency (%)
경기도 | 9998 | 17.1%
1층 | 1643 | 2.8%
부천시 | 1014 | 1.7%
고양시 | 1001 | 1.7%
수원시 | 924 | 1.6%
성남시 | 898 | 1.5%
분당구 | 611 | 1.0%
안양시 | 601 | 1.0%
평택시 | 586 | 1.0%
용인시 | 545 | 0.9%
Other values (12567) | 40501 | 69.4%

Most occurring characters

ValueCountFrequency (%)
49408
 
18.0%
1 15993
 
5.8%
10968
 
4.0%
10698
 
3.9%
10328
 
3.8%
10268
 
3.7%
10084
 
3.7%
- 7850
 
2.9%
2 7420
 
2.7%
6750
 
2.5%
Other values (614) 134790
49.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 154253
56.2%
Decimal Number 58944
 
21.5%
Space Separator 49408
 
18.0%
Dash Punctuation 7850
 
2.9%
Other Punctuation 1418
 
0.5%
Uppercase Letter 986
 
0.4%
Close Punctuation 736
 
0.3%
Open Punctuation 735
 
0.3%
Lowercase Letter 136
 
< 0.1%
Math Symbol 73
 
< 0.1%
Other values (2) 18
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10968
 
7.1%
10698
 
6.9%
10328
 
6.7%
10268
 
6.7%
10084
 
6.5%
6750
 
4.4%
5168
 
3.4%
4871
 
3.2%
4082
 
2.6%
3271
 
2.1%
Other values (539) 77765
50.4%
Uppercase Letter
ValueCountFrequency (%)
B 213
21.6%
A 152
15.4%
C 75
 
7.6%
I 53
 
5.4%
E 53
 
5.4%
S 48
 
4.9%
K 41
 
4.2%
D 35
 
3.5%
T 32
 
3.2%
L 32
 
3.2%
Other values (16) 252
25.6%
Lowercase Letter
ValueCountFrequency (%)
e 28
20.6%
a 14
10.3%
t 14
10.3%
c 13
9.6%
m 12
8.8%
i 9
 
6.6%
l 7
 
5.1%
n 7
 
5.1%
r 6
 
4.4%
k 6
 
4.4%
Other values (8) 20
14.7%
Decimal Number
ValueCountFrequency (%)
1 15993
27.1%
2 7420
12.6%
0 6416
10.9%
3 5398
 
9.2%
4 4596
 
7.8%
5 4405
 
7.5%
6 4180
 
7.1%
7 3936
 
6.7%
8 3345
 
5.7%
9 3255
 
5.5%
Other Punctuation
ValueCountFrequency (%)
, 1285
90.6%
. 109
 
7.7%
/ 12
 
0.8%
@ 7
 
0.5%
& 3
 
0.2%
* 1
 
0.1%
· 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 66
90.4%
< 2
 
2.7%
> 2
 
2.7%
+ 2
 
2.7%
1
 
1.4%
Letter Number
ValueCountFrequency (%)
10
58.8%
4
 
23.5%
2
 
11.8%
1
 
5.9%
Space Separator
ValueCountFrequency (%)
49408
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7850
100.0%
Close Punctuation
ValueCountFrequency (%)
) 736
100.0%
Open Punctuation
ValueCountFrequency (%)
( 735
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 154253
56.2%
Common 119165
43.4%
Latin 1139
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10968
 
7.1%
10698
 
6.9%
10328
 
6.7%
10268
 
6.7%
10084
 
6.5%
6750
 
4.4%
5168
 
3.4%
4871
 
3.2%
4082
 
2.6%
3271
 
2.1%
Other values (539) 77765
50.4%
Latin
ValueCountFrequency (%)
B 213
18.7%
A 152
 
13.3%
C 75
 
6.6%
I 53
 
4.7%
E 53
 
4.7%
S 48
 
4.2%
K 41
 
3.6%
D 35
 
3.1%
T 32
 
2.8%
L 32
 
2.8%
Other values (38) 405
35.6%
Common
ValueCountFrequency (%)
49408
41.5%
1 15993
 
13.4%
- 7850
 
6.6%
2 7420
 
6.2%
0 6416
 
5.4%
3 5398
 
4.5%
4 4596
 
3.9%
5 4405
 
3.7%
6 4180
 
3.5%
7 3936
 
3.3%
Other values (17) 9563
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 154253
56.2%
ASCII 120285
43.8%
Number Forms 17
 
< 0.1%
Math Operators 1
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
49408
41.1%
1 15993
 
13.3%
- 7850
 
6.5%
2 7420
 
6.2%
0 6416
 
5.3%
3 5398
 
4.5%
4 4596
 
3.8%
5 4405
 
3.7%
6 4180
 
3.5%
7 3936
 
3.3%
Other values (59) 10683
 
8.9%
Hangul
ValueCountFrequency (%)
10968
 
7.1%
10698
 
6.9%
10328
 
6.7%
10268
 
6.7%
10084
 
6.5%
6750
 
4.4%
5168
 
3.4%
4871
 
3.2%
4082
 
2.6%
3271
 
2.1%
Other values (539) 77765
50.4%
Number Forms
ValueCountFrequency (%)
10
58.8%
4
 
23.5%
2
 
11.8%
1
 
5.9%
Math Operators
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

소재지우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 2535
Distinct (%): 26.0%
Missing: 235
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 14248.04
Minimum: 10003
Maximum: 18611
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10003
5-th percentile: 10358
Q1: 12105
Median: 14427
Q3: 16481
95-th percentile: 18141
Maximum: 18611
Range: 8608
Interquartile range (IQR): 4376

Descriptive statistics

Standard deviation: 2495.9651
Coefficient of variation (CV): 0.17517954
Kurtosis: -1.0786642
Mean: 14248.04
Median Absolute Deviation (MAD): 2134
Skewness: -0.049730924
Sum: 1.3913212 × 10^8
Variance: 6229841.8
Monotonicity: Not monotonic
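
Several of the figures reported for 소재지우편번호 are derived from others, so they can be cross-checked by hand: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below redoes that arithmetic with the rounded values printed in this report (variable names are illustrative), so small rounding differences are expected.

```python
# Values copied from the statistics reported for 소재지우편번호.
minimum, maximum = 10003, 18611
q1, q3 = 12105, 16481
mean, std = 14248.04, 2495.9651

value_range = maximum - minimum   # reported Range: 8608
iqr = q3 - q1                     # reported IQR: 4376
cv = std / mean                   # reported CV: 0.17517954
variance = std ** 2               # reported Variance: 6229841.8

print(value_range, iqr, cv, variance)
```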
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
14548 | 91 | 0.9%
14963 | 70 | 0.7%
15865 | 57 | 0.6%
14072 | 53 | 0.5%
13595 | 53 | 0.5%
10401 | 48 | 0.5%
17948 | 47 | 0.5%
15011 | 45 | 0.4%
13591 | 44 | 0.4%
11813 | 43 | 0.4%
Other values (2525) | 9214 | 92.1%
(Missing) | 235 | 2.4%

Value | Count | Frequency (%)
10003 | 1 | < 0.1%
10011 | 1 | < 0.1%
10016 | 2 | < 0.1%
10017 | 1 | < 0.1%
10018 | 2 | < 0.1%
10019 | 5 | 0.1%
10020 | 4 | < 0.1%
10024 | 1 | < 0.1%
10031 | 2 | < 0.1%
10036 | 1 | < 0.1%

Value | Count | Frequency (%)
18611 | 5 | 0.1%
18606 | 14 | 0.1%
18602 | 4 | < 0.1%
18600 | 4 | < 0.1%
18598 | 6 | 0.1%
18597 | 1 | < 0.1%
18596 | 1 | < 0.1%
18594 | 1 | < 0.1%
18593 | 2 | < 0.1%
18592 | 1 | < 0.1%

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 7693
Distinct (%): 78.5%
Missing: 206
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.430875
Minimum: 36.913414
Maximum: 38.157701
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 36.913414
5-th percentile: 37.016752
Q1: 37.293052
Median: 37.3995
Q3: 37.613724
95-th percentile: 37.747598
Maximum: 38.157701
Range: 1.2442866
Interquartile range (IQR): 0.32067115

Descriptive statistics

Standard deviation: 0.20803514
Coefficient of variation (CV): 0.0055578487
Kurtosis: -0.23499369
Mean: 37.430875
Median Absolute Deviation (MAD): 0.12773941
Skewness: -0.024459718
Sum: 366597.99
Variance: 0.043278619
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
37.3831019563 | 21 | 0.2%
37.3927822598 | 16 | 0.2%
37.2549888048 | 15 | 0.1%
37.6679786706 | 14 | 0.1%
37.2915640479 | 13 | 0.1%
37.6470962619 | 13 | 0.1%
37.4888795916 | 12 | 0.1%
37.5043171668 | 12 | 0.1%
37.397320957 | 12 | 0.1%
37.3868933577 | 12 | 0.1%
Other values (7683) | 9654 | 96.5%
(Missing) | 206 | 2.1%

Value | Count | Frequency (%)
36.9134140728 | 1 | < 0.1%
36.9144185431 | 1 | < 0.1%
36.9145005642 | 2 | < 0.1%
36.9146408009 | 1 | < 0.1%
36.9149387574 | 1 | < 0.1%
36.9149627585 | 3 | < 0.1%
36.9149925174 | 1 | < 0.1%
36.9159574251 | 11 | 0.1%
36.9165803501 | 1 | < 0.1%
36.9171154729 | 4 | < 0.1%

Value | Count | Frequency (%)
38.157700698 | 1 | < 0.1%
38.0996829364 | 1 | < 0.1%
38.0995779441 | 1 | < 0.1%
38.0980563987 | 1 | < 0.1%
38.0912656647 | 1 | < 0.1%
38.0907760417 | 1 | < 0.1%
38.0905998143 | 1 | < 0.1%
38.0903659058 | 1 | < 0.1%
38.0571933121 | 1 | < 0.1%
38.037337717 | 1 | < 0.1%

WGS84경도
Real number (ℝ)

MISSING 

Distinct: 7693
Distinct (%): 78.5%
Missing: 206
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.97699
Minimum: 126.52981
Maximum: 127.70062
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 126.52981
5-th percentile: 126.73536
Q1: 126.80504
Median: 126.99146
Q3: 127.1106
95-th percentile: 127.24454
Maximum: 127.70062
Range: 1.1708081
Interquartile range (IQR): 0.305562

Descriptive statistics

Standard deviation: 0.1815793
Coefficient of variation (CV): 0.0014300173
Kurtosis: 0.047367192
Mean: 126.97699
Median Absolute Deviation (MAD): 0.14210647
Skewness: 0.3540729
Sum: 1243612.7
Variance: 0.032971042
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
126.9704727553 | 21 | 0.2%
127.1120948679 | 16 | 0.2%
127.0304265019 | 15 | 0.1%
126.7516242854 | 14 | 0.1%
127.0504732206 | 13 | 0.1%
126.8948092677 | 13 | 0.1%
126.7552924952 | 12 | 0.1%
126.7620745903 | 12 | 0.1%
127.1135515109 | 12 | 0.1%
126.7414592187 | 12 | 0.1%
Other values (7683) | 9654 | 96.5%
(Missing) | 206 | 2.1%

Value | Count | Frequency (%)
126.52980712 | 1 | < 0.1%
126.5300155839 | 1 | < 0.1%
126.5346414169 | 1 | < 0.1%
126.5376365181 | 1 | < 0.1%
126.5418098658 | 2 | < 0.1%
126.5434441568 | 1 | < 0.1%
126.5441024 | 1 | < 0.1%
126.5459013281 | 4 | < 0.1%
126.5497387546 | 9 | 0.1%
126.5510468081 | 1 | < 0.1%

Value | Count | Frequency (%)
127.7006152686 | 1 | < 0.1%
127.6630841716 | 1 | < 0.1%
127.6566740849 | 1 | < 0.1%
127.6512743251 | 1 | < 0.1%
127.6483259144 | 1 | < 0.1%
127.6475935954 | 1 | < 0.1%
127.6460438809 | 1 | < 0.1%
127.6418414917 | 1 | < 0.1%
127.6406486558 | 1 | < 0.1%
127.6403452029 | 1 | < 0.1%

Interactions

[9 interaction plots; images not preserved in this extract]

Correlations

Variable | 시군명 | 영업상태명 | 소재지우편번호 | WGS84위도 | WGS84경도
시군명 | 1.000 | 0.408 | 0.994 | 0.961 | 0.951
영업상태명 | 0.408 | 1.000 | 0.159 | 0.118 | 0.146
소재지우편번호 | 0.994 | 0.159 | 1.000 | 0.922 | 0.853
WGS84위도 | 0.961 | 0.118 | 0.922 | 1.000 | 0.673
WGS84경도 | 0.951 | 0.146 | 0.853 | 0.673 | 1.000

Variable | 소재지우편번호 | WGS84위도 | WGS84경도 | 영업상태명
소재지우편번호 | 1.000 | -0.912 | 0.183 | 0.122
WGS84위도 | -0.912 | 1.000 | -0.252 | 0.090
WGS84경도 | 0.183 | -0.252 | 1.000 | 0.112
영업상태명 | 0.122 | 0.090 | 0.112 | 1.000
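
To give a feel for the strong negative relationship between 소재지우편번호 and WGS84위도, the sketch below computes Pearson's r by hand on the nine rows of the first Sample table that have coordinates. This is only an illustration on a tiny subset: the report does not state here which coefficient each matrix uses, so the result is not expected to reproduce either matrix exactly. The function name `pearson` is illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Postal code and latitude of the nine sample rows that have coordinates.
postal = [18455, 16518, 10500, 18262, 12102, 13992, 10274, 14542, 10596]
latitude = [37.199983, 37.274301, 37.63214, 37.20626, 37.650498,
            37.398551, 37.704186, 37.50611, 37.646766]

r = pearson(postal, latitude)
print(round(r, 3))  # strongly negative on this subset
```

In this dataset, higher postal codes tend to lie further south (lower latitude), which is consistent with the negative sign.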

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index) | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 다중이용업소여부 | 총시설규모(㎡) | 위생업종명 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
11214 | 화성시 | 나루참치 | 20171220 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 화성시 동탄문화센터로 65, 에이스타운 1층 132호 (반송동) | 경기도 화성시 반송동 107-3 에이스타운 132호 | 18455 | 37.199983 | 127.072596
6066 | 수원시 | 스시나미 | 20141010 | 폐업 | 20170316 | <NA> | <NA> | <NA> | 일식 | 경기도 수원시 영통구 월드컵로97번길 53-2 (원천동) | 경기도 수원시 영통구 원천동 619-2 | 16518 | 37.274301 | 127.050113
100 | 고양시 | 동우회 화정점 | 2017-11-23 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 고양시 덕양구 화신로260번길 31, 2(일부)층 (화정동, 백양빌딩) | 경기도 고양시 덕양구 화정동 979-2 백양빌딩 2층 일부 | 10500 | 37.63214 | 126.832461
11121 | 화성시 | 스시규 | 2023-08-22 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 화성시 남양읍 시청로45번길 3-9, 1층 101호 | 경기도 화성시 남양읍 남양리 2257-3 | 18262 | 37.20626 | 126.820567
5981 | 수원시 | 음식문화축제 일식협의회 | 20130924 | 폐업 | 20131111 | <NA> | <NA> | <NA> | 일식 | <NA> | 경기도 수원시 팔달구 남창동 38-6 화성행궁주차장내 | <NA> | <NA> | <NA>
2604 | 남양주시 | 미중연어 | 20171102 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 남양주시 별내5로5번길 4-22, 1층 (102)호 (별내동) | 경기도 남양주시 별내동 1096-7 | 12102 | 37.650498 | 127.114669
7553 | 안양시 | 스시스토리 | 20150423 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 안양시 만안구 장내로 151, 1층 (안양동) | 경기도 안양시 만안구 안양동 674-165번지 | 13992 | 37.398551 | 126.924178
335 | 고양시 | 오사카부루스 | 20210720 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 고양시 덕양구 혜음로 41, 2층 (고양동) | 경기도 고양시 덕양구 고양동 123-1 | 10274 | 37.704186 | 126.901878
2991 | 부천시 | 호시타코야끼 | 2023-09-27 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 부천시 길주로 137, 그린힐 113호 (상동) | 경기도 부천시 상동 538-4 그린힐 113호 | 14542 | 37.50611 | 126.75769
569 | 고양시 | 다이버하우스 | 20150403 | 폐업 | 20170504 | <NA> | <NA> | <NA> | 일식 | 경기도 고양시 덕양구 고양대로 1998, 1층 (동산동) | 경기도 고양시 덕양구 동산동 47-1번지 1층 | 10596 | 37.646766 | 126.900774

(index) | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 다중이용업소여부 | 총시설규모(㎡) | 위생업종명 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
9876 | 파주시 | 우마이 | 20180907 | 폐업 | 20200214 | <NA> | <NA> | <NA> | 일식 | 경기도 파주시 와석순환로192번길 27, 1층 101호 (동패동) | 경기도 파주시 동패동 1821-1번지 1층 101호 | 10901 | 37.715661 | 126.740707
2512 | 남양주시 | 미담 마라탕 별내점 | 2021-08-09 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 남양주시 별내3로 320, 센트럴프라자 1층 104호 (별내동) | 경기도 남양주시 별내동 826-9 센트럴프라자 1층 104호 | 12097 | 37.666011 | 127.117821
9236 | 의정부시 | 어참치 | 20190429 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 의정부시 청사로48번길 19, 세하메디칼스타 1층 112호 (금오동) | 경기도 의정부시 금오동 474-3번지 세하메디칼스타 | 11757 | 37.753081 | 127.070933
7689 | 안양시 | 호계수산 | 20000417 | 폐업 | 20021016 | <NA> | <NA> | <NA> | 일식 | <NA> | 경기도 안양시 동안구 호계동 952-40번지 (지상1층) | 14116 | 37.372275 | 126.955929
9749 | 파주시 | 총각네 축산 | 20220214 | 영업 | <NA> | <NA> | <NA> | <NA> | 일식 | 경기도 파주시 교하로 442-11, 나동 전체 (동패동) | 경기도 파주시 동패동 1343-4 나동 전체 | 10904 | 37.706034 | 126.729483
6119 | 수원시 | 미즈하루 | 20020401 | 폐업 | 20110530 | <NA> | <NA> | <NA> | 일식 | <NA> | 경기도 수원시 팔달구 인계동 1039-7 ,8 | 16489 | 37.266098 | 127.031542
9419 | 의정부시 | 유조스시 | 20091030 | 폐업 | 20221228 | <NA> | <NA> | <NA> | 일식 | 경기도 의정부시 용현로159번길 18 (민락동, 지상1층, 107호) | 경기도 의정부시 민락동 692-3 | 11805 | 37.741311 | 127.094234
4713 | 성남시 | 스시히로바 | 20120424 | 폐업 | 20121228 | <NA> | <NA> | <NA> | 일식 | 경기도 성남시 분당구 느티로 16 (정자동, 젤존타워 2층 207호,208호) | 경기도 성남시 분당구 정자동 17-1번지 젤존타워 2층 207호,208호 | 13558 | 37.368165 | 127.106315
8304 | 여주시 | 반달카레 | 20150527 | 폐업 | 20221230 | <NA> | <NA> | <NA> | 일식 | 경기도 여주시 세종로 356-4 (점봉동) | 경기도 여주시 점봉동 429-37 | 12653 | 37.266647 | 127.637925
3980 | 부천시 | 눈썹스시엔롤 | 20051103 | 폐업 | 20070730 | <NA> | <NA> | <NA> | 일식 | 경기도 부천시 소향로 17 | 경기도 부천시 상동 546-2번지 두성프라자 105호 | 14544 | 37.504095 | 126.749688

Duplicate rows

Most frequently occurring

(index) | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 위생업태명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | # duplicates
0 | 고양시 | 본키친 일산점 | 2023-01-26 | 폐업 | 2023-06-15 | 일식 | 경기도 고양시 일산동구 백석로 177, 백송마을7단지 주상가동 101호 (백석동) | 경기도 고양시 일산동구 백석동 1136 백송마을7단지 | 10418 | 37.651921 | 126.796178 | 2
1 | 성남시 | 본가스시 | 2023-01-26 | 폐업 | 2023-04-14 | 일식 | 경기도 성남시 분당구 판교역로146번길 20, 현대백화점 판교점 지하1층 일부호 (백현동) | 경기도 성남시 분당구 백현동 541 현대백화점 판교점 지하1층 일부호 | 13529 | 37.392782 | 127.112095 | 2
2 | 의정부시 | 스시하루 | 2023-01-26 | 영업 | <NA> | 일식 | 경기도 의정부시 비우로108번길 8-2, 지상1층 (녹양동) | 경기도 의정부시 녹양동 330-5 1층 | 11613 | 37.757971 | 127.038879 | 2
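
A table like the one above can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below shows one way to do that with the standard library; it is an assumption about the general technique, not the profiler's implementation. The helper name `duplicate_groups` is hypothetical, and the rows are abbreviated to three columns for illustration.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Abbreviated, illustrative rows (시군명, 사업장명, 인허가일자).
rows = [
    ("고양시", "본키친 일산점", "2023-01-26"),
    ("성남시", "본가스시", "2023-01-26"),
    ("고양시", "본키친 일산점", "2023-01-26"),
    ("의정부시", "스시하루", "2023-01-26"),
]

groups = duplicate_groups(rows)
# Rows that would be dropped if each group kept a single representative.
extra_rows = sum(n - 1 for n in groups.values())
print(groups, extra_rows)
```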