Overview

Dataset statistics

Number of variables: 4
Number of observations: 3923
Missing cells: 2059
Missing cells (%): 13.1%
Duplicate rows: 3
Duplicate rows (%): 0.1%
Total size in memory: 122.7 KiB
Average record size in memory: 32.0 B
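
The percentages in these statistics follow from the raw counts. A minimal sketch in plain Python, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows. The helper name `percent` is illustrative and not part of ydata-profiling.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts copied from the dataset statistics.
n_variables = 4
n_observations = 3923
missing_cells = 2059
duplicate_rows = 3

# Assumption: the denominator for missing cells is the full cell grid.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)

print(f"Missing cells: {missing_pct}%  Duplicate rows: {duplicate_pct}%")
```

Under these assumptions the computed values agree with the reported 13.1% and 0.1%.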

Variable types

Categorical: 1
Text: 3

Dataset

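Description: Status of general restaurants (일반음식점) and snack and refreshment establishments (휴게음식점) among the food service businesses in Hanam-si, Gyeonggi-do, provided with business type, business name, location address, and location phone number.
Author: 경기도 하남시 (Hanam-si, Gyeonggi-do)
URL: https://www.data.go.kr/data/15016348/fileData.do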

Alerts

Dataset has 3 (0.1%) duplicate rows [Duplicates]
소재지전화 has 2052 (52.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 05:16:51.618890
Analysis finished: 2023-12-12 05:16:52.765071
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.8 KiB
일반음식점: 2898
휴게음식점: 1025

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
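
The report distinguishes "Distinct" from "Unique". A minimal sketch of the difference, assuming "Distinct" counts the different values present and "Unique" counts the values that occur exactly once, which is consistent with the figures for this column (2 distinct, 0 unique). The function name is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Rebuild the 업종명 column from the category counts reported above.
column = ["일반음식점"] * 2898 + ["휴게음식점"] * 1025
distinct, unique = distinct_and_unique(column)
distinct_pct = round(100 * distinct / len(column), 1)

print(distinct, unique, distinct_pct)
```

Both categories occur many times, so neither counts as unique even though the column has two distinct values.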

Sample

1st row: 일반음식점
2nd row: 휴게음식점
3rd row: 휴게음식점
4th row: 휴게음식점
5th row: 휴게음식점

Common Values

Value | Count | Frequency (%)
일반음식점 | 2898 | 73.9%
휴게음식점 | 1025 | 26.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: common values]

업소명
Text

Distinct: 3857
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Memory size: 30.8 KiB

Length

Max length: 36
Median length: 28
Mean length: 8.1547285
Min length: 1

Characters and Unicode

Total characters: 31991
Distinct characters: 923
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
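
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only, since the standard library does not expose scripts or blocks. The mapping to long category names is partial and was written for this example.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes; deliberately not exhaustive.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the two-letter code when no long name is listed.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# One of the sample values from this report.
sample = "(주)지에스리테일 GS수퍼 하남미사점"
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables in this report.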

Unique

Unique: 3795
Unique (%): 96.7%

Sample

1st row: 미사반점
2nd row: 르브레드랩 하남스타필드점(신세계)
3rd row: (주)지에스리테일 GS수퍼 하남미사점
4th row: 폴 바셋 스타필드하남 1호점
5th row: (주)플레이타임그룹 SS스타필드 하남점 맘스카페 플레이타임

Value | Count | Frequency (%)
하남미사점 | 165 | 2.6%
하남점 | 112 | 1.7%
미사점 | 99 | 1.5%
미사역점 | 53 | 0.8%
카페 | 45 | 0.7%
씨유 | 44 | 0.7%
gs25 | 43 | 0.7%
하남 | 41 | 0.6%
미사강변점 | 39 | 0.6%
세븐일레븐 | 38 | 0.6%
Other values (4464) | 5763 | 89.5%
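
The frequency listing above counts tokens within each value rather than whole values. A minimal sketch of such a count, assuming tokens are lowercased, split on whitespace, and stripped of surrounding ASCII punctuation. The lowercase `gs25` entry suggests lowercasing, but the exact tokenizer used by the report is an assumption here.

```python
import string
from collections import Counter

def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Strip punctuation at the token edges only; interior marks stay.
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts

# A few business names taken from the samples in this report.
names = [
    "미사반점",
    "(주)지에스리테일 GS수퍼 하남미사점",
    "GS25 하남광암",
]
counts = word_counts(names)
for token, n in counts.most_common():
    print(token, n)
```

Because only the token edges are stripped, a name such as `(주)지에스리테일` keeps its interior closing parenthesis.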

Most occurring characters

ValueCountFrequency (%)
2522
 
7.9%
1316
 
4.1%
906
 
2.8%
872
 
2.7%
843
 
2.6%
821
 
2.6%
636
 
2.0%
600
 
1.9%
( 450
 
1.4%
) 450
 
1.4%
Other values (913) 22575
70.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25614 | 80.1%
Space Separator | 2522 | 7.9%
Uppercase Letter | 1156 | 3.6%
Lowercase Letter | 1017 | 3.2%
Decimal Number | 632 | 2.0%
Open Punctuation | 451 | 1.4%
Close Punctuation | 451 | 1.4%
Other Punctuation | 133 | 0.4%
Dash Punctuation | 9 | < 0.1%
Letter Number | 4 | < 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1316
 
5.1%
906
 
3.5%
872
 
3.4%
843
 
3.3%
821
 
3.2%
636
 
2.5%
600
 
2.3%
399
 
1.6%
258
 
1.0%
246
 
1.0%
Other values (832) 18717
73.1%
Uppercase Letter
ValueCountFrequency (%)
C 120
 
10.4%
E 112
 
9.7%
S 107
 
9.3%
G 104
 
9.0%
O 73
 
6.3%
U 60
 
5.2%
F 57
 
4.9%
A 55
 
4.8%
B 54
 
4.7%
R 53
 
4.6%
Other values (16) 361
31.2%
Lowercase Letter
ValueCountFrequency (%)
e 160
15.7%
a 106
 
10.4%
o 99
 
9.7%
i 62
 
6.1%
f 58
 
5.7%
n 58
 
5.7%
c 55
 
5.4%
t 52
 
5.1%
s 47
 
4.6%
r 45
 
4.4%
Other values (15) 275
27.0%
Decimal Number
ValueCountFrequency (%)
2 172
27.2%
5 105
16.6%
1 95
15.0%
4 60
 
9.5%
0 52
 
8.2%
9 44
 
7.0%
3 40
 
6.3%
7 25
 
4.0%
6 21
 
3.3%
8 18
 
2.8%
Other Punctuation
ValueCountFrequency (%)
& 62
46.6%
. 25
18.8%
, 19
 
14.3%
' 13
 
9.8%
/ 5
 
3.8%
! 4
 
3.0%
· 2
 
1.5%
; 1
 
0.8%
? 1
 
0.8%
: 1
 
0.8%
Open Punctuation
ValueCountFrequency (%)
( 450
99.8%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 450
99.8%
1
 
0.2%
Letter Number
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
2522
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25606 | 80.0%
Common | 4200 | 13.1%
Latin | 2177 | 6.8%
Han | 8 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1316
 
5.1%
906
 
3.5%
872
 
3.4%
843
 
3.3%
821
 
3.2%
636
 
2.5%
600
 
2.3%
399
 
1.6%
258
 
1.0%
246
 
1.0%
Other values (826) 18709
73.1%
Latin
ValueCountFrequency (%)
e 160
 
7.3%
C 120
 
5.5%
E 112
 
5.1%
S 107
 
4.9%
a 106
 
4.9%
G 104
 
4.8%
o 99
 
4.5%
O 73
 
3.4%
i 62
 
2.8%
U 60
 
2.8%
Other values (43) 1174
53.9%
Common
ValueCountFrequency (%)
2522
60.0%
( 450
 
10.7%
) 450
 
10.7%
2 172
 
4.1%
5 105
 
2.5%
1 95
 
2.3%
& 62
 
1.5%
4 60
 
1.4%
0 52
 
1.2%
9 44
 
1.0%
Other values (18) 188
 
4.5%
Han
ValueCountFrequency (%)
3
37.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25606 | 80.0%
ASCII | 6369 | 19.9%
CJK | 8 | < 0.1%
Number Forms | 4 | < 0.1%
None | 4 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2522
39.6%
( 450
 
7.1%
) 450
 
7.1%
2 172
 
2.7%
e 160
 
2.5%
C 120
 
1.9%
E 112
 
1.8%
S 107
 
1.7%
a 106
 
1.7%
5 105
 
1.6%
Other values (66) 2065
32.4%
Hangul
ValueCountFrequency (%)
1316
 
5.1%
906
 
3.5%
872
 
3.4%
843
 
3.3%
821
 
3.2%
636
 
2.5%
600
 
2.3%
399
 
1.6%
258
 
1.0%
246
 
1.0%
Other values (826) 18709
73.1%
Number Forms
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
CJK
ValueCountFrequency (%)
3
37.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%
1
 
12.5%
None
ValueCountFrequency (%)
· 2
50.0%
1
25.0%
1
25.0%

소재지(도로명)
Text

Distinct: 3673
Distinct (%): 93.8%
Missing: 7
Missing (%): 0.2%
Memory size: 30.8 KiB

Length

Max length: 72
Median length: 60
Mean length: 36.871297
Min length: 19

Characters and Unicode

Total characters: 144388
Distinct characters: 386
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3527
Unique (%): 90.1%

Sample

1st row: 경기도 하남시 미사강변중앙로204번길 45, 성산타워플러스 1층 107,108호 (망월동)
2nd row: 경기도 하남시 미사대로 750, 스타필드 하남(신세계백화점) 지하2층 (신장동)
3rd row: 경기도 하남시 미사강변북로 65-1, 미사강변 더샵 리버포레 지하1층 (선동)
4th row: 경기도 하남시 미사대로 750, 스타필드 하남 지층 B112호 (신장동)
5th row: 경기도 하남시 미사대로 750, 2층 (신장동, 스타필드하남 쇼핑센터)

Value | Count | Frequency (%)
경기도 | 3916 | 13.7%
하남시 | 3916 | 13.7%
1층 | 2233 | 7.8%
망월동 | 1318 | 4.6%
신장동 | 595 | 2.1%
덕풍동 | 549 | 1.9%
2층 | 427 | 1.5%
풍산동 | 387 | 1.3%
미사강변중앙로 | 341 | 1.2%
미사대로 | 287 | 1.0%
Other values (2424) | 14705 | 51.3%

Most occurring characters

ValueCountFrequency (%)
24795
 
17.2%
1 9042
 
6.3%
4967
 
3.4%
4803
 
3.3%
, 4677
 
3.2%
4633
 
3.2%
( 4360
 
3.0%
) 4357
 
3.0%
4118
 
2.9%
3981
 
2.8%
Other values (376) 74655
51.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 78116 | 54.1%
Decimal Number | 25963 | 18.0%
Space Separator | 24795 | 17.2%
Other Punctuation | 4685 | 3.2%
Open Punctuation | 4360 | 3.0%
Close Punctuation | 4357 | 3.0%
Uppercase Letter | 1028 | 0.7%
Dash Punctuation | 763 | 0.5%
Lowercase Letter | 174 | 0.1%
Math Symbol | 105 | 0.1%
Other values (2) | 42 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4967
 
6.4%
4803
 
6.1%
4633
 
5.9%
4118
 
5.3%
3981
 
5.1%
3936
 
5.0%
3928
 
5.0%
3921
 
5.0%
3496
 
4.5%
2699
 
3.5%
Other values (312) 37634
48.2%
Uppercase Letter
ValueCountFrequency (%)
B 207
20.1%
C 118
11.5%
R 117
11.4%
A 113
11.0%
L 86
8.4%
E 80
 
7.8%
T 40
 
3.9%
N 37
 
3.6%
U 35
 
3.4%
K 33
 
3.2%
Other values (12) 162
15.8%
Lowercase Letter
ValueCountFrequency (%)
e 55
31.6%
c 19
 
10.9%
t 19
 
10.9%
r 18
 
10.3%
n 18
 
10.3%
l 11
 
6.3%
a 9
 
5.2%
p 8
 
4.6%
o 3
 
1.7%
b 3
 
1.7%
Other values (7) 11
 
6.3%
Decimal Number
ValueCountFrequency (%)
1 9042
34.8%
2 3574
 
13.8%
0 3484
 
13.4%
3 1906
 
7.3%
5 1848
 
7.1%
4 1525
 
5.9%
7 1418
 
5.5%
6 1097
 
4.2%
9 1036
 
4.0%
8 1033
 
4.0%
Other Punctuation
ValueCountFrequency (%)
, 4677
99.8%
& 3
 
0.1%
. 3
 
0.1%
· 1
 
< 0.1%
@ 1
 
< 0.1%
Letter Number
ValueCountFrequency (%)
25
61.0%
10
 
24.4%
6
 
14.6%
Math Symbol
ValueCountFrequency (%)
~ 96
91.4%
+ 9
 
8.6%
Space Separator
ValueCountFrequency (%)
24795
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4360
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4357
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 763
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 78116 | 54.1%
Common | 65029 | 45.0%
Latin | 1243 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4967
 
6.4%
4803
 
6.1%
4633
 
5.9%
4118
 
5.3%
3981
 
5.1%
3936
 
5.0%
3928
 
5.0%
3921
 
5.0%
3496
 
4.5%
2699
 
3.5%
Other values (312) 37634
48.2%
Latin
ValueCountFrequency (%)
B 207
16.7%
C 118
 
9.5%
R 117
 
9.4%
A 113
 
9.1%
L 86
 
6.9%
E 80
 
6.4%
e 55
 
4.4%
T 40
 
3.2%
N 37
 
3.0%
U 35
 
2.8%
Other values (32) 355
28.6%
Common
ValueCountFrequency (%)
24795
38.1%
1 9042
 
13.9%
, 4677
 
7.2%
( 4360
 
6.7%
) 4357
 
6.7%
2 3574
 
5.5%
0 3484
 
5.4%
3 1906
 
2.9%
5 1848
 
2.8%
4 1525
 
2.3%
Other values (12) 5461
 
8.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 78116 | 54.1%
ASCII | 66230 | 45.9%
Number Forms | 41 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
24795
37.4%
1 9042
 
13.7%
, 4677
 
7.1%
( 4360
 
6.6%
) 4357
 
6.6%
2 3574
 
5.4%
0 3484
 
5.3%
3 1906
 
2.9%
5 1848
 
2.8%
4 1525
 
2.3%
Other values (50) 6662
 
10.1%
Hangul
ValueCountFrequency (%)
4967
 
6.4%
4803
 
6.1%
4633
 
5.9%
4118
 
5.3%
3981
 
5.1%
3936
 
5.0%
3928
 
5.0%
3921
 
5.0%
3496
 
4.5%
2699
 
3.5%
Other values (312) 37634
48.2%
Number Forms
ValueCountFrequency (%)
25
61.0%
10
 
24.4%
6
 
14.6%
None
ValueCountFrequency (%)
· 1
100.0%

소재지전화
Text

MISSING 

Distinct: 1815
Distinct (%): 97.0%
Missing: 2052
Missing (%): 52.3%
Memory size: 30.8 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.021913
Min length: 11

Characters and Unicode

Total characters: 22493
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1774
Unique (%): 94.8%

Sample

1st row: 02-0323-6933
2nd row: 02-1522-1261
3rd row: 02-2006-2353
4th row: 02-2056-3035
5th row: 02-2226-0970

Value | Count | Frequency (%)
031-791-4200 | 8 | 0.4%
031-524-1053 | 6 | 0.3%
031-796-6709 | 4 | 0.2%
031-795-7953 | 3 | 0.2%
031-793-0172 | 3 | 0.2%
031-795-0592 | 3 | 0.2%
031-794-9592 | 2 | 0.1%
031-796-7892 | 2 | 0.1%
031-791-3808 | 2 | 0.1%
031-791-1320 | 2 | 0.1%
Other values (1805) | 1836 | 98.1%

Most occurring characters

Value | Count | Frequency (%)
- | 3742 | 16.6%
0 | 3025 | 13.4%
1 | 2654 | 11.8%
3 | 2649 | 11.8%
7 | 2403 | 10.7%
9 | 2345 | 10.4%
2 | 1528 | 6.8%
5 | 1154 | 5.1%
4 | 1057 | 4.7%
8 | 1034 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 18751 | 83.4%
Dash Punctuation | 3742 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3025
16.1%
1 2654
14.2%
3 2649
14.1%
7 2403
12.8%
9 2345
12.5%
2 1528
8.1%
5 1154
 
6.2%
4 1057
 
5.6%
8 1034
 
5.5%
6 902
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 3742
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22493
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 3742
16.6%
0 3025
13.4%
1 2654
11.8%
3 2649
11.8%
7 2403
10.7%
9 2345
10.4%
2 1528
6.8%
5 1154
 
5.1%
4 1057
 
4.7%
8 1034
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22493
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 3742
16.6%
0 3025
13.4%
1 2654
11.8%
3 2649
11.8%
7 2403
10.7%
9 2345
10.4%
2 1528
6.8%
5 1154
 
5.1%
4 1057
 
4.7%
8 1034
 
4.6%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
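
As a rough illustration of nullity correlation, the sketch below computes the Pearson correlation between the is-missing indicators of two columns. The function name and the toy columns are hypothetical and are not taken from the dataset. How the report's heatmap treats columns with no variation in missingness is not stated, so returning `None` in that case is an assumption.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return None
    return cov / sqrt(var_a * var_b)

# Toy columns (hypothetical values): None marks a missing cell.
phone = ["p1", None, None, "p4"]
address = ["a1", None, None, "a4"]
name = ["n1", "n2", "n3", "n4"]

same_pattern = nullity_correlation(phone, address)
no_missing = nullity_correlation(phone, name)
print(same_pattern, no_missing)
```

Two columns that are missing in exactly the same rows correlate at 1, while a column with no missing values has no defined nullity correlation with any other.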

Sample

Row | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
0 | 일반음식점 | 미사반점 | 경기도 하남시 미사강변중앙로204번길 45, 성산타워플러스 1층 107,108호 (망월동) | 02-0323-6933
1 | 휴게음식점 | 르브레드랩 하남스타필드점(신세계) | 경기도 하남시 미사대로 750, 스타필드 하남(신세계백화점) 지하2층 (신장동) | 02-1522-1261
2 | 휴게음식점 | (주)지에스리테일 GS수퍼 하남미사점 | 경기도 하남시 미사강변북로 65-1, 미사강변 더샵 리버포레 지하1층 (선동) | 02-2006-2353
3 | 휴게음식점 | 폴 바셋 스타필드하남 1호점 | 경기도 하남시 미사대로 750, 스타필드 하남 지층 B112호 (신장동) | 02-2056-3035
4 | 휴게음식점 | (주)플레이타임그룹 SS스타필드 하남점 맘스카페 플레이타임 | 경기도 하남시 미사대로 750, 2층 (신장동, 스타필드하남 쇼핑센터) | 02-2226-0970
5 | 휴게음식점 | 배스킨라빈스 위례트레이더스점 | 경기도 하남시 위례대로 200, 스타필드시티 위례점 지하3층 (학암동) | 02-2276-4639
6 | 일반음식점 | 부어치킨 하남신장점 | 경기도 하남시 신장1로21번길 46-1, 1층 (신장동) | 02-2662-1660
7 | 일반음식점 | 버무리떡볶이 하남미사강변점 | 경기도 하남시 미사강변한강로 270-1, 미사강변 호반 써밋 1층 1-192호 (망월동) | 02-2694-7602
8 | 휴게음식점 | GS25 하남광암 | 경기도 하남시 초광로 106, 1층 (광암동) | 02-3013-0229
9 | 일반음식점 | 초광한식부페 | 경기도 하남시 초광산단서로16번길 5 (초이동) | 02-3013-7333

Row | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
3913 | 휴게음식점 | 친정 커피 | 경기도 하남시 감일로 9, 지1층 B106호 (감일동, 소슬빌) | <NA>
3914 | 휴게음식점 | 오에이디(OAD coffee Lab) | 경기도 하남시 미사대로 540, A동 1층 AC01-011호 (덕풍동) | <NA>
3915 | 휴게음식점 | 컴포즈커피 하남감일백제로점 | 경기도 하남시 감일백제로 105, 신성메디타워 1층 103호 (감이동) | <NA>
3916 | 휴게음식점 | 오유오(ouo) | 경기도 하남시 미사강변한강로334번길 8, 1층 일부호 (망월동) | <NA>
3917 | 휴게음식점 | 올슨(olsson) | 경기도 하남시 신장로 107-1, 1동 1층 일부호 (신장동) | <NA>
3918 | 휴게음식점 | 설빙 경기하남감일점 | 경기도 하남시 감일백제로 155, 2층 206, 207, 208호 (감이동) | <NA>
3919 | 휴게음식점 | 버드리&커피쉬 | 경기도 하남시 세미로 33, 도윤빌딩2 3층 일부호 (풍산동) | <NA>
3920 | 휴게음식점 | 배스킨라빈스 북위례힐스테이트점 | 경기도 하남시 위례학암로14번길 18, 1층 107, 108호 (학암동) | <NA>
3921 | 휴게음식점 | 와로샐러드 하남시청역점 | 경기도 하남시 하남대로 815, 102호 (신장동, 명지캐럿108) | <NA>
3922 | 휴게음식점 | 세븐일레븐 미사리버에비뉴점 | 경기도 하남시 미사강변중앙로 173, 미사 리버에비뉴 1층 112호 (망월동) | <NA>

Duplicate rows

Most frequently occurring

Row | 업종명 | 업소명 | 소재지(도로명) | 소재지전화 | # duplicates
2 | 휴게음식점 | (주)와이오케이푸드 | 경기도 하남시 덕풍서로 70 (덕풍동,하남풍산 E-MART 내(2층)) | 031-524-1053 | 3
0 | 일반음식점 | 큰고기수산 | 경기도 하남시 신장1로3번길 3, 1층 (신장동) | <NA> | 2
1 | 일반음식점 | 하남정덮밥 | 경기도 하남시 신평로 74 (신장동,(1층)) | 031-795-7953 | 2
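
A listing like the one above can be derived by grouping identical rows and keeping the groups that occur more than once. A minimal sketch, assuming the "# duplicates" column is the number of occurrences of each repeated row. The rows below are abbreviated (address omitted) and the function name is hypothetical.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Abbreviated rows: (업종명, 업소명, 소재지전화); None stands for <NA>.
rows = [
    ("일반음식점", "큰고기수산", None),
    ("일반음식점", "큰고기수산", None),
    ("휴게음식점", "(주)와이오케이푸드", "031-524-1053"),
    ("휴게음식점", "(주)와이오케이푸드", "031-524-1053"),
    ("휴게음식점", "(주)와이오케이푸드", "031-524-1053"),
    ("일반음식점", "미사반점", "02-0323-6933"),
]
groups = duplicate_groups(rows)

# Print the groups, most frequent first, as in the listing above.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Counting groups rather than surplus copies matches the overview figure of 3 duplicate rows, given that three groups are listed here. That reading of the figure is an inference from this report, not a documented definition.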