Overview

Dataset statistics

Number of variables: 12
Number of observations: 4784
Missing cells: 3668
Missing cells (%): 6.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 458.0 KiB
Average record size in memory: 98.0 B
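
The percentage and per-record figures in this block follow from the raw counts. The sketch below is a minimal illustration, assuming that missing cells are divided by the total cell count (variables × observations) and that 1 KiB = 1024 bytes. The function names are illustrative and are not part of ydata-profiling.

```python
def missing_cell_percentage(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Share of missing cells among all cells, in percent."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


def average_record_size_bytes(total_kib: float, n_observations: int) -> float:
    """Average memory per row in bytes, assuming 1 KiB = 1024 B."""
    return total_kib * 1024 / n_observations


# Counts taken from the Dataset statistics block.
print(round(missing_cell_percentage(3668, 12, 4784), 1))
print(round(average_record_size_bytes(458.0, 4784), 1))
```

Under these assumptions the two calls reproduce the reported 6.4% and 98.0 B.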

Variable types

Numeric: 2
Categorical: 5
Text: 4
Boolean: 1

Alerts

주소1 has constant value "" (Constant)
선정여부 is highly overall correlated with 전화번호 and 2 other fields (High correlation)
시도명 is highly overall correlated with 선정여부 and 1 other fields (High correlation)
업종 is highly overall correlated with 전화번호 (High correlation)
전화번호 is highly overall correlated with 업종 and 1 other fields (High correlation)
업종상세 is highly overall correlated with 시도명 and 1 other fields (High correlation)
시도명 is highly imbalanced (79.9%) (Imbalance)
선정여부 is highly imbalanced (98.7%) (Imbalance)
안심식당SEQ is highly imbalanced (50.8%) (Imbalance)
업종상세 is highly imbalanced (60.9%) (Imbalance)
주소2 has 3667 (76.7%) missing values (Missing)
업종 has unique values (Unique)
전화번호 has unique values (Unique)
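
The report does not say how the Imbalance percentages are computed. They are consistent with one minus the normalized Shannon entropy of the class counts, and the sketch below works under that assumption. It is an inference from the reported figures, and `imbalance_score` is an illustrative name, not a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts taken from the 시도명 and 안심식당SEQ sections of this report.
print(round(100 * imbalance_score([4634, 150]), 1))
print(round(100 * imbalance_score([4270, 514]), 1))
```

With the class counts listed for 시도명 and 안심식당SEQ, this gives 79.9 and 50.8, matching the alerts.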

Reproduction

Analysis started: 2024-01-09 19:53:33.436339
Analysis finished: 2024-01-09 19:53:34.977886
Duration: 1.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 4784
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2392.5
Minimum: 1
Maximum: 4784
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 42.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 240.15
Q1: 1196.75
median: 2392.5
Q3: 3588.25
95-th percentile: 4544.85
Maximum: 4784
Range: 4783
Interquartile range (IQR): 2391.5
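
The fractional percentiles listed here are what linear interpolation between neighbouring order statistics produces. That is the default quantile definition in NumPy and pandas, and it is assumed in this sketch. The helper `quantile` is illustrative only.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# 업종 takes every integer from 1 to 4784 exactly once.
values = list(range(1, 4785))
q1 = quantile(values, 0.25)
q3 = quantile(values, 0.75)
print(quantile(values, 0.05), q1, q3, q3 - q1)
```

For this column the sketch yields approximately 240.15, 1196.75, 3588.25 and an IQR of 2391.5, in line with the figures listed.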

Descriptive statistics

Standard deviation: 1381.1662
Coefficient of variation (CV): 0.57728994
Kurtosis: -1.2
Mean: 2392.5
Median Absolute Deviation (MAD): 1196
Skewness: 0
Sum: 11445720
Variance: 1907620
Monotonicity: Strictly increasing
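
Most of these descriptive statistics can be re-derived with the Python standard library. The sketch assumes the sample variance (dividing by n − 1), which is what matches the reported 1907620, and takes the MAD as the median of absolute deviations from the median. The function `describe` is an illustrative helper.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance, n - 1 in the denominator
    std = statistics.stdev(values)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return {
        "sum": sum(values),
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        "mad": mad,
    }


# 업종 takes every integer from 1 to 4784 exactly once.
stats = describe(list(range(1, 4785)))
print(stats)
```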
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
3197 | 1 | < 0.1%
3195 | 1 | < 0.1%
3194 | 1 | < 0.1%
3193 | 1 | < 0.1%
3192 | 1 | < 0.1%
3191 | 1 | < 0.1%
3190 | 1 | < 0.1%
3189 | 1 | < 0.1%
3188 | 1 | < 0.1%
Other values (4774) | 4774 | 99.8%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
4784 | 1 | < 0.1%
4783 | 1 | < 0.1%
4782 | 1 | < 0.1%
4781 | 1 | < 0.1%
4780 | 1 | < 0.1%
4779 | 1 | < 0.1%
4778 | 1 | < 0.1%
4777 | 1 | < 0.1%
4776 | 1 | < 0.1%
4775 | 1 | < 0.1%

시도명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

일반음식점: 4634
일반음식점_외: 150

Length

Max length: 7
Median length: 5
Mean length: 5.062709
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

Value | Count | Frequency (%)
일반음식점 | 4634 | 96.9%
일반음식점_외 | 150 | 3.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반음식점 | 4634 | 96.9%
일반음식점_외 | 150 | 3.1%

주소1
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

충청남도: 4784

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value | Count | Frequency (%)
충청남도 | 4784 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
충청남도 | 4784 | 100.0%

주소2
Text

MISSING 

Distinct: 121
Distinct (%): 10.8%
Missing: 3667
Missing (%): 76.7%
Memory size: 37.5 KiB

Length

Max length: 34
Median length: 2
Mean length: 3.0581916
Min length: 1

Characters and Unicode

Total characters: 3416
Distinct characters: 157
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 104
Unique (%): 9.3%
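
The report lists Distinct and Unique separately. Distinct counts the different values that occur, while Unique counts the values that occur exactly once. For 주소2 both percentages are consistent with division by the 1117 non-missing rows (121 / 1117 ≈ 10.8%, 104 / 1117 ≈ 9.3%). A minimal sketch of the distinction, with `distinct_and_unique` as an illustrative helper and `None` standing in for a missing value:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values occurring exactly once).

    Missing values (None) are ignored.
    """
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The five sample rows of 주소2 shown in this report, plus one missing value.
sample = ["2층", "2층", "102호", "3층", "1층", None]
print(distinct_and_unique(sample))  # (4, 3)
```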

Sample

1st row: 2층
2nd row: 2층
3rd row: 102호
4th row: 3층
5th row: 1층

Value | Count | Frequency (%)
1층 | 907 | 72.7%
2층 | 93 | 7.5%
1층&#44 | 15 | 1.2%
3층 | 10 | 0.8%
101호 | 9 | 0.7%
지하1층 | 9 | 0.7%
102호 | 8 | 0.6%
103호 | 6 | 0.5%
상가동 | 6 | 0.5%
4층 | 5 | 0.4%
Other values (134) | 179 | 14.4%

Most occurring characters

ValueCountFrequency (%)
1 1059
31.0%
1059
31.0%
2 158
 
4.6%
130
 
3.8%
4 128
 
3.7%
93
 
2.7%
0 90
 
2.6%
72
 
2.1%
# 52
 
1.5%
; 52
 
1.5%
Other values (147) 523
15.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1527
44.7%
Decimal Number 1505
44.1%
Other Punctuation 156
 
4.6%
Space Separator 130
 
3.8%
Close Punctuation 28
 
0.8%
Open Punctuation 28
 
0.8%
Uppercase Letter 16
 
0.5%
Lowercase Letter 12
 
0.4%
Math Symbol 10
 
0.3%
Dash Punctuation 3
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1059
69.4%
93
 
6.1%
72
 
4.7%
20
 
1.3%
14
 
0.9%
13
 
0.9%
12
 
0.8%
11
 
0.7%
8
 
0.5%
7
 
0.5%
Other values (112) 218
 
14.3%
Decimal Number
ValueCountFrequency (%)
1 1059
70.4%
2 158
 
10.5%
4 128
 
8.5%
0 90
 
6.0%
3 27
 
1.8%
5 15
 
1.0%
6 9
 
0.6%
9 8
 
0.5%
7 6
 
0.4%
8 5
 
0.3%
Lowercase Letter
ValueCountFrequency (%)
e 3
25.0%
n 2
16.7%
u 1
 
8.3%
v 1
 
8.3%
h 1
 
8.3%
c 1
 
8.3%
r 1
 
8.3%
a 1
 
8.3%
o 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
A 7
43.8%
B 3
18.8%
C 2
 
12.5%
I 1
 
6.2%
M 1
 
6.2%
D 1
 
6.2%
W 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
# 52
33.3%
; 52
33.3%
& 52
33.3%
Space Separator
ValueCountFrequency (%)
130
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Math Symbol
ValueCountFrequency (%)
~ 10
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1861
54.5%
Hangul 1527
44.7%
Latin 28
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1059
69.4%
93
 
6.1%
72
 
4.7%
20
 
1.3%
14
 
0.9%
13
 
0.9%
12
 
0.8%
11
 
0.7%
8
 
0.5%
7
 
0.5%
Other values (112) 218
 
14.3%
Common
ValueCountFrequency (%)
1 1059
56.9%
2 158
 
8.5%
130
 
7.0%
4 128
 
6.9%
0 90
 
4.8%
# 52
 
2.8%
; 52
 
2.8%
& 52
 
2.8%
) 28
 
1.5%
( 28
 
1.5%
Other values (9) 84
 
4.5%
Latin
ValueCountFrequency (%)
A 7
25.0%
B 3
10.7%
e 3
10.7%
n 2
 
7.1%
C 2
 
7.1%
I 1
 
3.6%
u 1
 
3.6%
v 1
 
3.6%
h 1
 
3.6%
c 1
 
3.6%
Other values (6) 6
21.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1889
55.3%
Hangul 1527
44.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1059
56.1%
2 158
 
8.4%
130
 
6.9%
4 128
 
6.8%
0 90
 
4.8%
# 52
 
2.8%
; 52
 
2.8%
& 52
 
2.8%
) 28
 
1.5%
( 28
 
1.5%
Other values (25) 112
 
5.9%
Hangul
ValueCountFrequency (%)
1059
69.4%
93
 
6.1%
72
 
4.7%
20
 
1.3%
14
 
0.9%
13
 
0.9%
12
 
0.8%
11
 
0.7%
8
 
0.5%
7
 
0.5%
Other values (112) 218
 
14.3%

사업자명
Text

Distinct: 4526
Distinct (%): 94.6%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

Length

Max length: 32
Median length: 25
Mean length: 5.9260033
Min length: 1

Characters and Unicode

Total characters: 28350
Distinct characters: 804
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4320
Unique (%): 90.3%

Sample

1st row: 청와삼대
2nd row: 토마토아저씨서산점
3rd row: 하루엔소쿠(서산점)
4th row: 거성정육식당
5th row: 김영희강남동태찜예산점

Value | Count | Frequency (%)
서산점 | 17 | 0.3%
부여점 | 9 | 0.2%
등촌샤브칼국수 | 7 | 0.1%
양평해장국 | 7 | 0.1%
서산호수공원점 | 7 | 0.1%
이삭토스트 | 6 | 0.1%
논산점 | 6 | 0.1%
예천점 | 6 | 0.1%
아산점 | 6 | 0.1%
함지박 | 5 | 0.1%
Other values (4677) | 5054 | 98.5%

Most occurring characters

ValueCountFrequency (%)
685
 
2.4%
539
 
1.9%
524
 
1.8%
499
 
1.8%
482
 
1.7%
481
 
1.7%
479
 
1.7%
430
 
1.5%
405
 
1.4%
353
 
1.2%
Other values (794) 23473
82.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 26984
95.2%
Space Separator 347
 
1.2%
Decimal Number 235
 
0.8%
Lowercase Letter 192
 
0.7%
Open Punctuation 180
 
0.6%
Close Punctuation 179
 
0.6%
Other Punctuation 120
 
0.4%
Uppercase Letter 96
 
0.3%
Other Symbol 16
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
685
 
2.5%
539
 
2.0%
524
 
1.9%
499
 
1.8%
482
 
1.8%
481
 
1.8%
479
 
1.8%
430
 
1.6%
405
 
1.5%
353
 
1.3%
Other values (729) 22107
81.9%
Lowercase Letter
ValueCountFrequency (%)
a 45
23.4%
m 33
17.2%
p 33
17.2%
e 12
 
6.2%
n 10
 
5.2%
o 9
 
4.7%
t 7
 
3.6%
r 6
 
3.1%
y 5
 
2.6%
l 5
 
2.6%
Other values (11) 27
14.1%
Uppercase Letter
ValueCountFrequency (%)
R 14
14.6%
C 13
13.5%
K 10
10.4%
F 8
 
8.3%
E 6
 
6.2%
M 6
 
6.2%
A 5
 
5.2%
B 5
 
5.2%
G 4
 
4.2%
D 4
 
4.2%
Other values (10) 21
21.9%
Decimal Number
ValueCountFrequency (%)
4 44
18.7%
1 43
18.3%
2 39
16.6%
0 21
8.9%
9 20
8.5%
3 17
 
7.2%
6 15
 
6.4%
8 15
 
6.4%
7 11
 
4.7%
5 10
 
4.3%
Other Punctuation
ValueCountFrequency (%)
; 46
38.3%
& 38
31.7%
# 15
 
12.5%
. 8
 
6.7%
· 5
 
4.2%
3
 
2.5%
/ 3
 
2.5%
* 1
 
0.8%
! 1
 
0.8%
Space Separator
ValueCountFrequency (%)
347
100.0%
Open Punctuation
ValueCountFrequency (%)
( 180
100.0%
Close Punctuation
ValueCountFrequency (%)
) 179
100.0%
Other Symbol
ValueCountFrequency (%)
16
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 26996
95.2%
Common 1062
 
3.7%
Latin 288
 
1.0%
Han 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
685
 
2.5%
539
 
2.0%
524
 
1.9%
499
 
1.8%
482
 
1.8%
481
 
1.8%
479
 
1.8%
430
 
1.6%
405
 
1.5%
353
 
1.3%
Other values (726) 22119
81.9%
Latin
ValueCountFrequency (%)
a 45
15.6%
m 33
 
11.5%
p 33
 
11.5%
R 14
 
4.9%
C 13
 
4.5%
e 12
 
4.2%
n 10
 
3.5%
K 10
 
3.5%
o 9
 
3.1%
F 8
 
2.8%
Other values (31) 101
35.1%
Common
ValueCountFrequency (%)
347
32.7%
( 180
16.9%
) 179
16.9%
; 46
 
4.3%
4 44
 
4.1%
1 43
 
4.0%
2 39
 
3.7%
& 38
 
3.6%
0 21
 
2.0%
9 20
 
1.9%
Other values (13) 105
 
9.9%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 26980
95.2%
ASCII 1342
 
4.7%
None 24
 
0.1%
CJK 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
685
 
2.5%
539
 
2.0%
524
 
1.9%
499
 
1.8%
482
 
1.8%
481
 
1.8%
479
 
1.8%
430
 
1.6%
405
 
1.5%
353
 
1.3%
Other values (725) 22103
81.9%
ASCII
ValueCountFrequency (%)
347
25.9%
( 180
13.4%
) 179
13.3%
; 46
 
3.4%
a 45
 
3.4%
4 44
 
3.3%
1 43
 
3.2%
2 39
 
2.9%
& 38
 
2.8%
m 33
 
2.5%
Other values (52) 348
25.9%
None
ValueCountFrequency (%)
16
66.7%
· 5
 
20.8%
3
 
12.5%
CJK
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

선정여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

<NA>: 4770
프랜차이즈협회우선추천매장: 10
http://www.인삼장어백탄구이.kr/: 1
http://www.강변가든.com/: 1
http://www.gardenofsky.com/dining/: 1

Length

Max length: 34
Median length: 4
Mean length: 4.0319816
Min length: 2

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 4770 | 99.7%
프랜차이즈협회우선추천매장 | 10 | 0.2%
http://www.인삼장어백탄구이.kr/ | 1 | < 0.1%
http://www.강변가든.com/ | 1 | < 0.1%
http://www.gardenofsky.com/dining/ | 1 | < 0.1%
횟집 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 4770 | 99.7%
프랜차이즈협회우선추천매장 | 10 | 0.2%
http://www.인삼장어백탄구이.kr | 1 | < 0.1%
http://www.강변가든.com | 1 | < 0.1%
http://www.gardenofsky.com/dining | 1 | < 0.1%
횟집 | 1 | < 0.1%

안심식당SEQ
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

True: 4270
False: 514

Value | Count | Frequency (%)
True | 4270 | 89.3%
False | 514 | 10.7%

전화번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 4784
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31649.528
Minimum: 4290
Maximum: 70509
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 42.2 KiB

Quantile statistics

Minimum: 4290
5-th percentile: 6292.15
Q1: 10399.75
median: 31867.5
Q3: 49670.25
95-th percentile: 64195.85
Maximum: 70509
Range: 66219
Interquartile range (IQR): 39270.5

Descriptive statistics

Standard deviation: 20917.715
Coefficient of variation (CV): 0.66091711
Kurtosis: -1.5056701
Mean: 31649.528
Median Absolute Deviation (MAD): 21241
Skewness: 0.15095074
Sum: 1.5141134 × 10^8
Variance: 4.3755078 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
11515 | 1 | < 0.1%
48880 | 1 | < 0.1%
48878 | 1 | < 0.1%
48877 | 1 | < 0.1%
48876 | 1 | < 0.1%
48875 | 1 | < 0.1%
48874 | 1 | < 0.1%
48873 | 1 | < 0.1%
48872 | 1 | < 0.1%
48871 | 1 | < 0.1%
Other values (4774) | 4774 | 99.8%

Value | Count | Frequency (%)
4290 | 1 | < 0.1%
4291 | 1 | < 0.1%
4292 | 1 | < 0.1%
4293 | 1 | < 0.1%
4294 | 1 | < 0.1%
4295 | 1 | < 0.1%
4296 | 1 | < 0.1%
4297 | 1 | < 0.1%
4298 | 1 | < 0.1%
4299 | 1 | < 0.1%

Value | Count | Frequency (%)
70509 | 1 | < 0.1%
70408 | 1 | < 0.1%
70407 | 1 | < 0.1%
70406 | 1 | < 0.1%
70405 | 1 | < 0.1%
70404 | 1 | < 0.1%
70403 | 1 | < 0.1%
70402 | 1 | < 0.1%
70401 | 1 | < 0.1%
70400 | 1 | < 0.1%

시군구명
Text

Distinct: 4329
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

Length

Max length: 29
Median length: 26
Mean length: 20.234323
Min length: 14

Characters and Unicode

Total characters: 96801
Distinct characters: 339
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3981
Unique (%): 83.2%

Sample

1st row: 충청남도 서산시 석림6로 17
2nd row: 충청남도 서산시 호수공원14로 36
3rd row: 충청남도 서산시 안견로 242
4th row: 충청남도 예산군 삽교읍 수암산로 272-23
5th row: 충청남도 예산군 예산읍 산성공원1길 20

Value | Count | Frequency (%)
충청남도 | 4784 | 21.2%
천안시 | 830 | 3.7%
아산시 | 616 | 2.7%
당진시 | 595 | 2.6%
서산시 | 497 | 2.2%
서북구 | 418 | 1.8%
동남구 | 412 | 1.8%
논산시 | 345 | 1.5%
보령시 | 310 | 1.4%
공주시 | 293 | 1.3%
Other values (3391) | 13507 | 59.7%

Most occurring characters

ValueCountFrequency (%)
17823
18.4%
5369
 
5.5%
5116
 
5.3%
4908
 
5.1%
4874
 
5.0%
3774
 
3.9%
1 3644
 
3.8%
3287
 
3.4%
2591
 
2.7%
2 2276
 
2.4%
Other values (329) 43139
44.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 61840
63.9%
Space Separator 17823
 
18.4%
Decimal Number 15815
 
16.3%
Dash Punctuation 1322
 
1.4%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5369
 
8.7%
5116
 
8.3%
4908
 
7.9%
4874
 
7.9%
3774
 
6.1%
3287
 
5.3%
2591
 
4.2%
2172
 
3.5%
1425
 
2.3%
1368
 
2.2%
Other values (316) 26956
43.6%
Decimal Number
ValueCountFrequency (%)
1 3644
23.0%
2 2276
14.4%
3 1857
11.7%
4 1490
9.4%
5 1299
 
8.2%
6 1157
 
7.3%
7 1155
 
7.3%
8 1064
 
6.7%
0 946
 
6.0%
9 927
 
5.9%
Space Separator
ValueCountFrequency (%)
17823
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1322
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 61840
63.9%
Common 34961
36.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5369
 
8.7%
5116
 
8.3%
4908
 
7.9%
4874
 
7.9%
3774
 
6.1%
3287
 
5.3%
2591
 
4.2%
2172
 
3.5%
1425
 
2.3%
1368
 
2.2%
Other values (316) 26956
43.6%
Common
ValueCountFrequency (%)
17823
51.0%
1 3644
 
10.4%
2 2276
 
6.5%
3 1857
 
5.3%
4 1490
 
4.3%
- 1322
 
3.8%
5 1299
 
3.7%
6 1157
 
3.3%
7 1155
 
3.3%
8 1064
 
3.0%
Other values (3) 1874
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 61840
63.9%
ASCII 34961
36.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17823
51.0%
1 3644
 
10.4%
2 2276
 
6.5%
3 1857
 
5.3%
4 1490
 
4.3%
- 1322
 
3.8%
5 1299
 
3.7%
6 1157
 
3.3%
7 1155
 
3.3%
8 1064
 
3.0%
Other values (3) 1874
 
5.4%
Hangul
ValueCountFrequency (%)
5369
 
8.7%
5116
 
8.3%
4908
 
7.9%
4874
 
7.9%
3774
 
6.1%
3287
 
5.3%
2591
 
4.2%
2172
 
3.5%
1425
 
2.3%
1368
 
2.2%
Other values (316) 26956
43.6%

시도코드
Text

Distinct: 3064
Distinct (%): 64.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 37.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.868702
Min length: 7

Characters and Unicode

Total characters: 56768
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2987
Unique (%): 62.5%

Sample

1st row: 041-669-1008
2nd row: 050-5006-2544
3rd row: 041-664-3637
4th row: 041-338-5485
5th row: 041-335-9992

Value | Count | Frequency (%)
010 | 1320 | 27.5%
1234567 | 147 | 3.1%
12345678 | 94 | 2.0%
000-000-0000 | 27 | 0.6%
000-0000-0000 | 20 | 0.4%
0413397483 | 19 | 0.4%
041 | 18 | 0.4%
041-332-8852 | 8 | 0.2%
041-858-0561 | 8 | 0.2%
041-856-8351 | 5 | 0.1%
Other values (3059) | 3139 | 65.3%

Most occurring characters

ValueCountFrequency (%)
* 10560
18.6%
- 8761
15.4%
0 7896
13.9%
1 6086
10.7%
4 5032
8.9%
3 3573
 
6.3%
5 3424
 
6.0%
6 2589
 
4.6%
2 2361
 
4.2%
7 2357
 
4.2%
Other values (3) 4129
 
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 37425
65.9%
Other Punctuation 10560
 
18.6%
Dash Punctuation 8761
 
15.4%
Space Separator 22
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7896
21.1%
1 6086
16.3%
4 5032
13.4%
3 3573
9.5%
5 3424
9.1%
6 2589
 
6.9%
2 2361
 
6.3%
7 2357
 
6.3%
8 2163
 
5.8%
9 1944
 
5.2%
Other Punctuation
ValueCountFrequency (%)
* 10560
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8761
100.0%
Space Separator
ValueCountFrequency (%)
22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 56768
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
* 10560
18.6%
- 8761
15.4%
0 7896
13.9%
1 6086
10.7%
4 5032
8.9%
3 3573
 
6.3%
5 3424
 
6.0%
6 2589
 
4.6%
2 2361
 
4.2%
7 2357
 
4.2%
Other values (3) 4129
 
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56768
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 10560
18.6%
- 8761
15.4%
0 7896
13.9%
1 6086
10.7%
4 5032
8.9%
3 3573
 
6.3%
5 3424
 
6.0%
6 2589
 
4.6%
2 2361
 
4.2%
7 2357
 
4.2%
Other values (3) 4129
 
7.3%

대표자명
Categorical

Distinct: 15
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

천안시: 830
아산시: 616
당진시: 595
서산시: 497
논산시: 345
Other values (10): 1901

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서산시
2nd row: 서산시
3rd row: 서산시
4th row: 예산군
5th row: 예산군

Common Values

Value | Count | Frequency (%)
천안시 | 830 | 17.3%
아산시 | 616 | 12.9%
당진시 | 595 | 12.4%
서산시 | 497 | 10.4%
논산시 | 345 | 7.2%
보령시 | 310 | 6.5%
공주시 | 293 | 6.1%
홍성군 | 270 | 5.6%
예산군 | 209 | 4.4%
태안군 | 190 | 4.0%
Other values (5) | 629 | 13.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
천안시 | 830 | 17.3%
아산시 | 616 | 12.9%
당진시 | 595 | 12.4%
서산시 | 497 | 10.4%
논산시 | 345 | 7.2%
보령시 | 310 | 6.5%
공주시 | 293 | 6.1%
홍성군 | 270 | 5.6%
예산군 | 209 | 4.4%
태안군 | 190 | 4.0%
Other values (5) | 629 | 13.1%

업종상세
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.5 KiB

한식: 3963
기타 음식점업: 304
중식: 221
일식: 164
서양식: 107

Length

Max length: 7
Median length: 2
Mean length: 2.3557692
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 한식
2nd row: 서양식
3rd row: 일식
4th row: 한식
5th row: 한식

Common Values

Value | Count | Frequency (%)
한식 | 3963 | 82.8%
기타 음식점업 | 304 | 6.4%
중식 | 221 | 4.6%
일식 | 164 | 3.4%
서양식 | 107 | 2.2%
기타외국식 | 25 | 0.5%
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
한식 | 3963 | 77.9%
기타 | 304 | 6.0%
음식점업 | 304 | 6.0%
중식 | 221 | 4.3%
일식 | 164 | 3.2%
서양식 | 107 | 2.1%
기타외국식 | 25 | 0.5%

Interactions


Correlations

Variable | 업종 | 시도명 | 선정여부 | 안심식당SEQ | 전화번호 | 대표자명 | 업종상세
업종 | 1.000 | 0.131 | 0.632 | 0.223 | 0.892 | 0.598 | 0.116
시도명 | 0.131 | 1.000 | NaN | 0.014 | 0.118 | 0.155 | 0.827
선정여부 | 0.632 | NaN | 1.000 | 0.000 | 0.784 | 0.000 | 1.000
안심식당SEQ | 0.223 | 0.014 | 0.000 | 1.000 | 0.180 | 0.231 | 0.000
전화번호 | 0.892 | 0.118 | 0.784 | 0.180 | 1.000 | 0.600 | 0.145
대표자명 | 0.598 | 0.155 | 0.000 | 0.231 | 0.600 | 1.000 | 0.198
업종상세 | 0.116 | 0.827 | 1.000 | 0.000 | 0.145 | 0.198 | 1.000

Variable | 선정여부 | 대표자명 | 시도명 | 업종상세 | 안심식당SEQ
선정여부 | 1.000 | 0.000 | 1.000 | 0.866 | 0.000
대표자명 | 0.000 | 1.000 | 0.141 | 0.093 | 0.210
시도명 | 1.000 | 0.141 | 1.000 | 0.631 | 0.009
업종상세 | 0.866 | 0.093 | 0.631 | 1.000 | 0.000
안심식당SEQ | 0.000 | 0.210 | 0.009 | 0.000 | 1.000

Variable | 업종 | 전화번호 | 시도명 | 선정여부 | 안심식당SEQ | 대표자명 | 업종상세
업종 | 1.000 | 0.892 | 0.101 | 0.430 | 0.171 | 0.269 | 0.061
전화번호 | 0.892 | 1.000 | 0.118 | 0.715 | 0.179 | 0.294 | 0.072
시도명 | 0.101 | 0.118 | 1.000 | 1.000 | 0.009 | 0.141 | 0.631
선정여부 | 0.430 | 0.715 | 1.000 | 1.000 | 0.000 | 0.000 | 0.866
안심식당SEQ | 0.171 | 0.179 | 0.009 | 0.000 | 1.000 | 0.210 | 0.000
대표자명 | 0.269 | 0.294 | 0.141 | 0.000 | 0.210 | 1.000 | 0.093
업종상세 | 0.061 | 0.072 | 0.631 | 0.866 | 0.000 | 0.093 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
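
Nullity correlation can be illustrated as the Pearson correlation between the presence indicators of two columns. The sketch assumes that definition, which the report does not spell out. `nullity_correlation` is a hypothetical helper, and `None` stands in for a missing value.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two equal-length columns.

    Returns NaN when either column is entirely present or entirely missing,
    because its indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy columns: the first two are missing together, the third is missing where they are present.
together = nullity_correlation([1, None, 3, None], ["x", None, "y", None])
opposite = nullity_correlation([1, None, 3, None], [None, "p", None, "q"])
print(together, opposite)
```

A value of 1 means two columns are always missing together, and -1 means one is present exactly when the other is missing.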

Sample

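Index | 업종 | 시도명 | 주소1 | 주소2 | 사업자명 | 선정여부 | 안심식당SEQ | 전화번호 | 시군구명 | 시도코드 | 대표자명 | 업종상세
0 | 1 | 일반음식점 | 충청남도 | <NA> | 청와삼대 | <NA> | Y | 11515 | 충청남도 서산시 석림6로 17 | 041-669-1008 | 서산시 | 한식
1 | 2 | 일반음식점 | 충청남도 | <NA> | 토마토아저씨서산점 | <NA> | Y | 11516 | 충청남도 서산시 호수공원14로 36 | 050-5006-2544 | 서산시 | 서양식
2 | 3 | 일반음식점 | 충청남도 | 2층 | 하루엔소쿠(서산점) | <NA> | Y | 11517 | 충청남도 서산시 안견로 242 | 041-664-3637 | 서산시 | 일식
3 | 4 | 일반음식점 | 충청남도 | <NA> | 거성정육식당 | <NA> | Y | 11518 | 충청남도 예산군 삽교읍 수암산로 272-23 | 041-338-5485 | 예산군 | 한식
4 | 5 | 일반음식점 | 충청남도 | <NA> | 김영희강남동태찜예산점 | <NA> | Y | 11519 | 충청남도 예산군 예산읍 산성공원1길 20 | 041-335-9992 | 예산군 | 한식
5 | 6 | 일반음식점 | 충청남도 | <NA> | 늘봄우와돈 | <NA> | Y | 11520 | 충청남도 예산군 예산읍 역전로160번길 30 | 041-335-3309 | 예산군 | 한식
6 | 7 | 일반음식점 | 충청남도 | <NA> | 대성식당 | <NA> | Y | 11521 | 충청남도 예산군 신양면 대덕로 6 | 010-****-**** | 예산군 | 한식
7 | 8 | 일반음식점 | 충청남도 | <NA> | 대술촌돼지찌개 | <NA> | Y | 11522 | 충청남도 예산군 대술면 대술로 119 | 010-****-**** | 예산군 | 한식
8 | 9 | 일반음식점 | 충청남도 | <NA> | 대술칼국수 | <NA> | Y | 11523 | 충청남도 예산군 대술면 대술로 120-1 | 010-****-**** | 예산군 | 한식
9 | 10 | 일반음식점 | 충청남도 | <NA> | 막창도둑 | <NA> | Y | 11524 | 충청남도 예산군 예산읍 벚꽃로155번길 5 | 041-331-5161 | 예산군 | 한식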

Index | 업종 | 시도명 | 주소1 | 주소2 | 사업자명 | 선정여부 | 안심식당SEQ | 전화번호 | 시군구명 | 시도코드 | 대표자명 | 업종상세
4774 | 4775 | 일반음식점 | 충청남도 | <NA> | 월명산장 | <NA> | Y | 70509 | 충청남도 서천군 비인면 충서로740번길 2-1 | 010-****-**** | 서천군 | 한식
4775 | 4776 | 일반음식점 | 충청남도 | 1층 | 온정곰탕 | <NA> | Y | 70399 | 충청남도 아산시 충무로20번길 10 | 041-532-6094 | 아산시 | 한식
4776 | 4777 | 일반음식점 | 충청남도 | <NA> | 장미치킨호프 | <NA> | Y | 70400 | 충청남도 아산시 신창면 서부북로 933-30 | 041-546-3392 | 아산시 | 한식
4777 | 4778 | 일반음식점 | 충청남도 | <NA> | 다도횟집 | <NA> | Y | 70401 | 충청남도 아산시 시민로457번길 35 | 041-545-9555 | 아산시 | 한식
4778 | 4779 | 일반음식점 | 충청남도 | <NA> | 서경한식뷔페 | <NA> | Y | 70402 | 충청남도 아산시 영인면 영인로 136 | 041-543-5657 | 아산시 | 한식
4779 | 4780 | 일반음식점 | 충청남도 | <NA> | 그램그램 (아산음봉포스코점) | <NA> | Y | 70403 | 충청남도 아산시 음봉면 음봉로 574 | 041-546-9002 | 아산시 | 한식
4780 | 4781 | 일반음식점 | 충청남도 | <NA> | 짱이야 | <NA> | Y | 70404 | 충청남도 아산시 온화로65번길 25 | 0507-1482-0230 | 아산시 | 한식
4781 | 4782 | 일반음식점 | 충청남도 | <NA> | 이춘봉인생치킨 아산1호점 | <NA> | Y | 70405 | 충청남도 아산시 음봉면 음봉로 672 | 041-531-3139 | 아산시 | 한식
4782 | 4783 | 일반음식점 | 충청남도 | <NA> | 마미쿡 | <NA> | Y | 70406 | 충청남도 아산시 배방읍 용연로 20-2 | 0507-1404-1183 | 아산시 | 한식
4783 | 4784 | 일반음식점 | 충청남도 | <NA> | 새로운가주식회사 족발야시장 | <NA> | Y | 70407 | 충청남도 아산시 배방읍 용연로 84-7 | 041-532-4969 | 아산시 | 한식