Overview

Dataset statistics

Number of variables: 5
Number of observations: 8196
Missing cells: 4602
Missing cells (%): 11.2%
Duplicate rows: 334
Duplicate rows (%): 4.1%
Total size in memory: 320.3 KiB
Average record size in memory: 40.0 B
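The percentages in this table can be re-derived from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total size divided by the number of observations; both assumptions reproduce the reported figures.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 8196
missing_cells = 4602
duplicate_rows = 334
total_size_kib = 320.3


def dataset_ratios():
    """Return (missing %, duplicate %, average bytes per record)."""
    total_cells = n_variables * n_observations
    missing_pct = 100 * missing_cells / total_cells
    duplicate_pct = 100 * duplicate_rows / n_observations
    avg_record_bytes = total_size_kib * 1024 / n_observations
    return missing_pct, duplicate_pct, avg_record_bytes


missing_pct, duplicate_pct, avg_record_bytes = dataset_ratios()
print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicated, "
      f"{avg_record_bytes:.1f} B/record")
```

All 4602 missing cells come from the 전화번호 column, which is why the dataset-level share (11.2%) is one fifth of that column's own missing share.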

Variable types

Categorical: 1
Text: 3
DateTime: 1

Dataset

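Description: Data on the status of food sanitation businesses in Seo-gu, Incheon Metropolitan City (인천광역시 서구). It provides fields such as serial number (연번), business type (업종명), business name (업소명), and road-name address (소재지 (도로명)).
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://www.data.go.kr/data/15039517/fileData.do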

Alerts

데이터기준일자 has constant value "2023-10-26" [Constant]
Dataset has 334 (4.1%) duplicate rows [Duplicates]
업종명 is highly imbalanced (55.8%) [Imbalance]
전화번호 has 4602 (56.1%) missing values [Missing]
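The 55.8% imbalance figure can be reproduced from the 업종명 category counts. The sketch below assumes the score is one minus the normalized Shannon entropy of the class shares; the report itself does not state the formula, so treat the definition as an assumption that happens to match the reported value.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts for 업종명 as listed in this report.
counts = [6045, 1875, 143, 68, 65]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1.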

Reproduction

Analysis started: 2023-12-12 20:48:38.094471
Analysis finished: 2023-12-12 20:48:39.387641
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 KiB

Value | Count
일반음식점 | 6045
휴게음식점 | 1875
제과점영업 | 143
유흥주점영업 | 68
단란주점 | 65

Length

Max length: 6
Median length: 5
Mean length: 5.000366
Min length: 4
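These length statistics follow directly from the five category labels and their row counts. A minimal sketch, assuming length is measured in characters (one per Hangul syllable):

```python
# Category counts for 업종명, copied from this report.
category_counts = {
    "일반음식점": 6045,
    "휴게음식점": 1875,
    "제과점영업": 143,
    "유흥주점영업": 68,
    "단란주점": 65,
}


def length_summary(counts):
    """Return (min, max, mean) of the label lengths, weighted by row count."""
    lengths = [len(value) for value in counts]
    total_rows = sum(counts.values())
    total_chars = sum(len(value) * n for value, n in counts.items())
    return min(lengths), max(lengths), total_chars / total_rows


min_len, max_len, mean_len = length_summary(category_counts)
print(min_len, max_len, round(mean_len, 6))
```

The mean sits just above 5 because the 68 six-character rows slightly outnumber the 65 four-character rows.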

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

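Value | Count | Frequency (%)
일반음식점 (general restaurant) | 6045 | 73.8%
휴게음식점 (light-refreshment restaurant) | 1875 | 22.9%
제과점영업 (bakery) | 143 | 1.7%
유흥주점영업 (entertainment bar) | 68 | 0.8%
단란주점 (karaoke bar) | 65 | 0.8%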

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the 업종명 category frequencies (image not preserved). Its legend repeats the counts and percentages of the Common Values table.

업소명
Text

Distinct: 7514
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 KiB

Length

Max length: 44
Median length: 29
Mean length: 7.4082479
Min length: 1

Characters and Unicode

Total characters: 60718
Distinct characters: 1066
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
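As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in one of the 업소명 sample values. The helper name `category_counts` is introduced here for illustration only. The two-letter codes correspond to the category names used in this report (`Lo` = Other Letter, `Lu` = Uppercase Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the 업소명 sample values shown in this report.
counts = category_counts("비알케이(BRK)")
print(dict(counts))  # {'Lo': 4, 'Ps': 1, 'Lu': 3, 'Pe': 1}
```

Summing such counters over every value in a column gives per-category totals of the kind tabulated in this section.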

Unique

Unique: 6964
Unique (%): 85.0%

Sample

1st row: 인천국제컨트리클럽 대식당
2nd row: 비알케이(BRK)
3rd row: 예전
4th row: 일미정
5th row: 드럼통숯불구이

Value | Count | Frequency (%)
청라점 | 160 | 1.5%
검단신도시점 | 103 | 1.0%
검단점 | 76 | 0.7%
세븐일레븐 | 64 | 0.6%
씨유 | 59 | 0.6%
루원시티점 | 53 | 0.5%
인천청라점 | 48 | 0.4%
메가엠지씨커피 | 32 | 0.3%
지에스25 | 28 | 0.3%
가좌점 | 28 | 0.3%
Other values (7717) | 10059 | 93.9%

Most occurring characters

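Value | Count | Frequency (%)
(not preserved) | 2632 | 4.3%
(space) | 2516 | 4.1%
(not preserved) | 1191 | 2.0%
(not preserved) | 1174 | 1.9%
(not preserved) | 935 | 1.5%
(not preserved) | 851 | 1.4%
(not preserved) | 792 | 1.3%
(not preserved) | 780 | 1.3%
(not preserved) | 748 | 1.2%
(not preserved) | 711 | 1.2%
Other values (1056) | 48388 | 79.7%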

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 52498 | 86.5%
Space Separator | 2516 | 4.1%
Uppercase Letter | 1913 | 3.2%
Decimal Number | 1172 | 1.9%
Lowercase Letter | 1136 | 1.9%
Close Punctuation | 614 | 1.0%
Open Punctuation | 614 | 1.0%
Other Punctuation | 238 | 0.4%
Dash Punctuation | 11 | < 0.1%
Math Symbol | 4 | < 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2632 | 5.0%
(not preserved) | 1191 | 2.3%
(not preserved) | 1174 | 2.2%
(not preserved) | 935 | 1.8%
(not preserved) | 851 | 1.6%
(not preserved) | 792 | 1.5%
(not preserved) | 780 | 1.5%
(not preserved) | 748 | 1.4%
(not preserved) | 711 | 1.4%
(not preserved) | 655 | 1.2%
Other values (972) | 42029 | 80.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 185 | 9.7%
C | 185 | 9.7%
E | 162 | 8.5%
O | 143 | 7.5%
G | 133 | 7.0%
A | 114 | 6.0%
B | 106 | 5.5%
F | 87 | 4.5%
P | 84 | 4.4%
T | 77 | 4.0%
Other values (16) | 637 | 33.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 179 | 15.8%
o | 105 | 9.2%
a | 100 | 8.8%
c | 73 | 6.4%
s | 70 | 6.2%
f | 67 | 5.9%
n | 63 | 5.5%
i | 61 | 5.4%
t | 57 | 5.0%
r | 50 | 4.4%
Other values (16) | 311 | 27.4%

Other Punctuation
Value | Count | Frequency (%)
& | 125 | 52.5%
. | 36 | 15.1%
, | 35 | 14.7%
' | 17 | 7.1%
· | 8 | 3.4%
# | 6 | 2.5%
: | 6 | 2.5%
! | 1 | 0.4%
% | 1 | 0.4%
? | 1 | 0.4%
Other values (2) | 2 | 0.8%

Decimal Number
Value | Count | Frequency (%)
2 | 336 | 28.7%
5 | 178 | 15.2%
1 | 134 | 11.4%
0 | 126 | 10.8%
4 | 99 | 8.4%
9 | 82 | 7.0%
8 | 68 | 5.8%
3 | 67 | 5.7%
7 | 58 | 4.9%
6 | 24 | 2.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 25.0%
< | 1 | 25.0%
> | 1 | 25.0%
+ | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2516 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 614 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 614 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 52460 | 86.4%
Common | 5170 | 8.5%
Latin | 3050 | 5.0%
Han | 38 | 0.1%

Most frequent character per script

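Hangul
Value | Count | Frequency (%)
(not preserved) | 2632 | 5.0%
(not preserved) | 1191 | 2.3%
(not preserved) | 1174 | 2.2%
(not preserved) | 935 | 1.8%
(not preserved) | 851 | 1.6%
(not preserved) | 792 | 1.5%
(not preserved) | 780 | 1.5%
(not preserved) | 748 | 1.4%
(not preserved) | 711 | 1.4%
(not preserved) | 655 | 1.2%
Other values (940) | 41991 | 80.0%

Latin
Value | Count | Frequency (%)
S | 185 | 6.1%
C | 185 | 6.1%
e | 179 | 5.9%
E | 162 | 5.3%
O | 143 | 4.7%
G | 133 | 4.4%
A | 114 | 3.7%
B | 106 | 3.5%
o | 105 | 3.4%
a | 100 | 3.3%
Other values (43) | 1638 | 53.7%

Han
Value | Count | Frequency (%)
(not preserved) | 3 | 7.9%
(not preserved) | 3 | 7.9%
(not preserved) | 2 | 5.3%
(not preserved) | 2 | 5.3%
(not preserved) | 1 | 2.6%
(not preserved) | 1 | 2.6%
(not preserved) | 1 | 2.6%
(not preserved) | 1 | 2.6%
(not preserved) | 1 | 2.6%
(not preserved) | 1 | 2.6%
Other values (22) | 22 | 57.9%

Common
Value | Count | Frequency (%)
(space) | 2516 | 48.7%
) | 614 | 11.9%
( | 614 | 11.9%
2 | 336 | 6.5%
5 | 178 | 3.4%
1 | 134 | 2.6%
0 | 126 | 2.4%
& | 125 | 2.4%
4 | 99 | 1.9%
9 | 82 | 1.6%
Other values (21) | 346 | 6.7%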

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 52460 | 86.4%
ASCII | 8211 | 13.5%
CJK | 37 | 0.1%
None | 8 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

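Hangul
Value | Count | Frequency (%)
(not preserved) | 2632 | 5.0%
(not preserved) | 1191 | 2.3%
(not preserved) | 1174 | 2.2%
(not preserved) | 935 | 1.8%
(not preserved) | 851 | 1.6%
(not preserved) | 792 | 1.5%
(not preserved) | 780 | 1.5%
(not preserved) | 748 | 1.4%
(not preserved) | 711 | 1.4%
(not preserved) | 655 | 1.2%
Other values (940) | 41991 | 80.0%

ASCII
Value | Count | Frequency (%)
(space) | 2516 | 30.6%
) | 614 | 7.5%
( | 614 | 7.5%
2 | 336 | 4.1%
S | 185 | 2.3%
C | 185 | 2.3%
e | 179 | 2.2%
5 | 178 | 2.2%
E | 162 | 2.0%
O | 143 | 1.7%
Other values (72) | 3099 | 37.7%

None
Value | Count | Frequency (%)
· | 8 | 100.0%

CJK
Value | Count | Frequency (%)
(not preserved) | 3 | 8.1%
(not preserved) | 3 | 8.1%
(not preserved) | 2 | 5.4%
(not preserved) | 2 | 5.4%
(not preserved) | 1 | 2.7%
(not preserved) | 1 | 2.7%
(not preserved) | 1 | 2.7%
(not preserved) | 1 | 2.7%
(not preserved) | 1 | 2.7%
(not preserved) | 1 | 2.7%
Other values (21) | 21 | 56.8%

CJK Compat Ideographs
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%
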
소재지 (도로명)
Text

Distinct: 7579
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 KiB

Length

Max length: 96
Median length: 69
Mean length: 35.478038
Min length: 9

Characters and Unicode

Total characters: 290778
Distinct characters: 503
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7111
Unique (%): 86.8%

Sample

1st row: 인천광역시 서구 도요지로 37 (경서동, 외120필지 국제골프장내)
2nd row: 인천광역시 서구 방축로 339, 2층 일부호 (가좌동)
3rd row: 인천광역시 서구 가정로98번길 9-4 (가좌동)
4th row: 인천광역시 서구 방축로 335 (가좌동)
5th row: 인천광역시 서구 길주로75번길 1 (석남동)

Value | Count | Frequency (%)
서구 | 8171 | 14.4%
인천광역시 | 8171 | 14.4%
1층 | 2270 | 4.0%
청라동 | 1358 | 2.4%
석남동 | 998 | 1.8%
일부호 | 848 | 1.5%
가좌동 | 796 | 1.4%
가정동 | 737 | 1.3%
마전동 | 543 | 1.0%
원당동 | 509 | 0.9%
Other values (4751) | 32147 | 56.8%

Most occurring characters

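Value | Count | Frequency (%)
(space) | 48381 | 16.6%
1 | 15735 | 5.4%
, | 9818 | 3.4%
(not preserved) | 9296 | 3.2%
(not preserved) | 8978 | 3.1%
(not preserved) | 8631 | 3.0%
(not preserved) | 8500 | 2.9%
( | 8468 | 2.9%
) | 8467 | 2.9%
(not preserved) | 8343 | 2.9%
Other values (493) | 156161 | 53.7%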

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 157843 | 54.3%
Decimal Number | 53425 | 18.4%
Space Separator | 48381 | 16.6%
Other Punctuation | 9914 | 3.4%
Open Punctuation | 8470 | 2.9%
Close Punctuation | 8469 | 2.9%
Dash Punctuation | 1870 | 0.6%
Uppercase Letter | 1808 | 0.6%
Lowercase Letter | 476 | 0.2%
Math Symbol | 103 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 9296 | 5.9%
(not preserved) | 8978 | 5.7%
(not preserved) | 8631 | 5.5%
(not preserved) | 8500 | 5.4%
(not preserved) | 8343 | 5.3%
(not preserved) | 8321 | 5.3%
(not preserved) | 8220 | 5.2%
(not preserved) | 8211 | 5.2%
(not preserved) | 8200 | 5.2%
(not preserved) | 5552 | 3.5%
Other values (430) | 75591 | 47.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 545 | 30.1%
A | 216 | 11.9%
E | 147 | 8.1%
S | 133 | 7.4%
K | 114 | 6.3%
L | 103 | 5.7%
I | 87 | 4.8%
W | 86 | 4.8%
V | 83 | 4.6%
C | 57 | 3.2%
Other values (12) | 237 | 13.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 155 | 32.6%
s | 85 | 17.9%
a | 75 | 15.8%
r | 74 | 15.5%
d | 74 | 15.5%
c | 5 | 1.1%
b | 3 | 0.6%
p | 1 | 0.2%
k | 1 | 0.2%
n | 1 | 0.2%
Other values (2) | 2 | 0.4%

Decimal Number
Value | Count | Frequency (%)
1 | 15735 | 29.5%
2 | 7321 | 13.7%
0 | 6805 | 12.7%
3 | 4560 | 8.5%
4 | 3863 | 7.2%
5 | 3500 | 6.6%
6 | 3079 | 5.8%
7 | 3010 | 5.6%
8 | 2879 | 5.4%
9 | 2673 | 5.0%

Other Punctuation
Value | Count | Frequency (%)
, | 9818 | 99.0%
' | 75 | 0.8%
. | 12 | 0.1%
& | 5 | 0.1%
* | 2 | < 0.1%
/ | 1 | < 0.1%
@ | 1 | < 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 53 | 51.5%
< | 25 | 24.3%
> | 25 | 24.3%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 17 | 89.5%
(not preserved) | 1 | 5.3%
(not preserved) | 1 | 5.3%

Open Punctuation
Value | Count | Frequency (%)
( | 8468 | > 99.9%
[ | 2 | < 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 8467 | > 99.9%
] | 2 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 48381 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1870 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 157842 | 54.3%
Common | 130632 | 44.9%
Latin | 2303 | 0.8%
Han | 1 | < 0.1%

Most frequent character per script

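Hangul
Value | Count | Frequency (%)
(not preserved) | 9296 | 5.9%
(not preserved) | 8978 | 5.7%
(not preserved) | 8631 | 5.5%
(not preserved) | 8500 | 5.4%
(not preserved) | 8343 | 5.3%
(not preserved) | 8321 | 5.3%
(not preserved) | 8220 | 5.2%
(not preserved) | 8211 | 5.2%
(not preserved) | 8200 | 5.2%
(not preserved) | 5552 | 3.5%
Other values (429) | 75590 | 47.9%

Latin
Value | Count | Frequency (%)
B | 545 | 23.7%
A | 216 | 9.4%
e | 155 | 6.7%
E | 147 | 6.4%
S | 133 | 5.8%
K | 114 | 5.0%
L | 103 | 4.5%
I | 87 | 3.8%
W | 86 | 3.7%
s | 85 | 3.7%
Other values (27) | 632 | 27.4%

Common
Value | Count | Frequency (%)
(space) | 48381 | 37.0%
1 | 15735 | 12.0%
, | 9818 | 7.5%
( | 8468 | 6.5%
) | 8467 | 6.5%
2 | 7321 | 5.6%
0 | 6805 | 5.2%
3 | 4560 | 3.5%
4 | 3863 | 3.0%
5 | 3500 | 2.7%
Other values (16) | 13714 | 10.5%

Han
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%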

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 157757 | 54.3%
ASCII | 132916 | 45.7%
Compat Jamo | 85 | < 0.1%
Number Forms | 19 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

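ASCII
Value | Count | Frequency (%)
(space) | 48381 | 36.4%
1 | 15735 | 11.8%
, | 9818 | 7.4%
( | 8468 | 6.4%
) | 8467 | 6.4%
2 | 7321 | 5.5%
0 | 6805 | 5.1%
3 | 4560 | 3.4%
4 | 3863 | 2.9%
5 | 3500 | 2.6%
Other values (50) | 15998 | 12.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 9296 | 5.9%
(not preserved) | 8978 | 5.7%
(not preserved) | 8631 | 5.5%
(not preserved) | 8500 | 5.4%
(not preserved) | 8343 | 5.3%
(not preserved) | 8321 | 5.3%
(not preserved) | 8220 | 5.2%
(not preserved) | 8211 | 5.2%
(not preserved) | 8200 | 5.2%
(not preserved) | 5552 | 3.5%
Other values (428) | 75505 | 47.9%

Compat Jamo
Value | Count | Frequency (%)
(not preserved) | 85 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 17 | 89.5%
(not preserved) | 1 | 5.3%
(not preserved) | 1 | 5.3%

CJK
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%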

전화번호
Text

MISSING 

Distinct: 3199
Distinct (%): 89.0%
Missing: 4602
Missing (%): 56.1%
Memory size: 64.2 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.036728
Min length: 3
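A minimum length of 3 and a maximum of 14 suggest that some 전화번호 entries do not follow the usual hyphenated layout. The sketch below flags such values with a regular expression. The pattern is an assumption about what a well-formed number looks like (area code, 3 to 4 digit exchange not starting with 0, 4-digit line number), not a rule taken from the dataset, and the value `"032"` is a hypothetical 3-character entry; the other samples appear in this report.

```python
import re

# Assumed layout: 0 + 1-2 digit area code, 3-4 digit exchange that does not
# start with 0, and a 4-digit line number, separated by hyphens.
PHONE_PATTERN = re.compile(r"0\d{1,2}-[1-9]\d{2,3}-\d{4}")


def is_well_formed(phone):
    """True when `phone` fully matches the assumed layout."""
    return PHONE_PATTERN.fullmatch(phone) is not None


# "032" is hypothetical; the other values are taken from this report.
samples = ["032-562-6666", "02-3284-8112", "032-0572-8872", "032"]
for phone in samples:
    print(phone, is_well_formed(phone))
```

Values matching this pattern are 11 to 13 characters long, so anything at the reported extremes of 3 or 14 characters would be flagged for manual review.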

Characters and Unicode

Total characters: 43260
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2838
Unique (%): 79.0%
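For 전화번호 the distinct and unique percentages only add up if they are taken over the non-missing rows rather than over all 8196 observations. The sketch below checks this; the choice of denominator is an inference from the figures, not something the report states.

```python
# Figures for 전화번호 copied from this report.
n_rows = 8196
n_missing = 4602
n_distinct = 3199
n_unique = 2838


def share_of_present(count, total_rows, missing):
    """Percentage of `count` relative to the rows that have a value."""
    return 100 * count / (total_rows - missing)


missing_pct = 100 * n_missing / n_rows
distinct_pct = share_of_present(n_distinct, n_rows, n_missing)
unique_pct = share_of_present(n_unique, n_rows, n_missing)
print(round(missing_pct, 1), round(distinct_pct, 1), round(unique_pct, 1))
```

With 3594 non-missing rows, 3199 distinct values give 89.0% and 2838 values occurring exactly once give 79.0%, matching the report.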

Sample

1st row: 032-562-6666
2nd row: 032-578-0062
3rd row: 032-584-4661
4th row: 032-577-7932
5th row: 032-576-0521

Value | Count | Frequency (%)
032-562-6666 | 15 | 0.4%
02-3284-8112 | 6 | 0.2%
032-560-1717 | 5 | 0.1%
032-582-5959 | 4 | 0.1%
02-3284-8116 | 4 | 0.1%
032-563-3838 | 3 | 0.1%
032-575-9285 | 3 | 0.1%
032-573-5770 | 3 | 0.1%
032-581-6422 | 3 | 0.1%
032-566-3355 | 3 | 0.1%
Other values (3189) | 3545 | 98.6%

Most occurring characters

Value | Count | Frequency (%)
- | 7187 | 16.6%
2 | 5833 | 13.5%
0 | 5525 | 12.8%
3 | 5238 | 12.1%
5 | 5193 | 12.0%
6 | 3473 | 8.0%
7 | 3148 | 7.3%
8 | 2335 | 5.4%
1 | 1941 | 4.5%
9 | 1864 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 36073 | 83.4%
Dash Punctuation | 7187 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 5833 | 16.2%
0 | 5525 | 15.3%
3 | 5238 | 14.5%
5 | 5193 | 14.4%
6 | 3473 | 9.6%
7 | 3148 | 8.7%
8 | 2335 | 6.5%
1 | 1941 | 5.4%
9 | 1864 | 5.2%
4 | 1523 | 4.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 7187 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 43260 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 7187 | 16.6%
2 | 5833 | 13.5%
0 | 5525 | 12.8%
3 | 5238 | 12.1%
5 | 5193 | 12.0%
6 | 3473 | 8.0%
7 | 3148 | 7.3%
8 | 2335 | 5.4%
1 | 1941 | 4.5%
9 | 1864 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43260 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 7187 | 16.6%
2 | 5833 | 13.5%
0 | 5525 | 12.8%
3 | 5238 | 12.1%
5 | 5193 | 12.0%
6 | 3473 | 8.0%
7 | 3148 | 7.3%
8 | 2335 | 5.4%
1 | 1941 | 4.5%
9 | 1864 | 4.3%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 KiB
Minimum: 2023-10-26 00:00:00
Maximum: 2023-10-26 00:00:00
Histogram with fixed size bins (bins=1)

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 업종명 | 업소명 | 소재지 (도로명) | 전화번호 | 데이터기준일자
0 | 일반음식점 | 인천국제컨트리클럽 대식당 | 인천광역시 서구 도요지로 37 (경서동, 외120필지 국제골프장내) | 032-562-6666 | 2023-10-26
1 | 일반음식점 | 비알케이(BRK) | 인천광역시 서구 방축로 339, 2층 일부호 (가좌동) | 032-578-0062 | 2023-10-26
2 | 일반음식점 | 예전 | 인천광역시 서구 가정로98번길 9-4 (가좌동) | 032-584-4661 | 2023-10-26
3 | 일반음식점 | 일미정 | 인천광역시 서구 방축로 335 (가좌동) | 032-577-7932 | 2023-10-26
4 | 일반음식점 | 드럼통숯불구이 | 인천광역시 서구 길주로75번길 1 (석남동) | <NA> | 2023-10-26
5 | 일반음식점 | 효정통닭 | 인천광역시 서구 가정로126번길 16 (가좌동) | 032-576-0521 | 2023-10-26
6 | 일반음식점 | 엄스키친 | 인천광역시 서구 검단로487번길 4 (마전동) | 032-562-5927 | 2023-10-26
7 | 일반음식점 | 성원닭갈비식당 | 인천광역시 서구 가정로98번길 6 (가좌동) | 032-572-9331 | 2023-10-26
8 | 일반음식점 | 본죽 | 인천광역시 서구 가정로 187 (석남동) | 032-582-4069 | 2023-10-26
9 | 일반음식점 | 예당추어탕 | 인천광역시 서구 원적로100번길 6 (가좌동) | 032-0572-8872 | 2023-10-26

Last rows

Index | 업종명 | 업소명 | 소재지 (도로명) | 전화번호 | 데이터기준일자
8186 | 휴게음식점 | 플러스82 청라점 | 인천광역시 서구 크리스탈로102번길 22, 경연타워 106호 (청라동) | <NA> | 2023-10-26
8187 | 휴게음식점 | 온안 | 인천광역시 서구 원당대로 705, 1층 (마전동) | 032-562-0888 | 2023-10-26
8188 | 휴게음식점 | 이디야 심곡동점 | 인천광역시 서구 심곡로 44, 1층 전체호 (심곡동) | <NA> | 2023-10-26
8189 | 휴게음식점 | 지에스25 루원베르힐 | 인천광역시 서구 서곶로 50, 판매1동 A103,104호 (가정동, 루원시티 대성베르힐 더 센트로) | <NA> | 2023-10-26
8190 | 휴게음식점 | 팔천순대 | 인천광역시 서구 봉오재3로 66, 1동 104호 (가정동, 루원시티프라디움) | <NA> | 2023-10-26
8191 | 휴게음식점 | 스마일탕후루 | 인천광역시 서구 청라라임로 85, A동 107호 (청라동, 청라린스트라우스) | <NA> | 2023-10-26
8192 | 휴게음식점 | 달달탕후루 | 인천광역시 서구 청마로34번길 6, 513동 102호 (당하동, 인천 검단 힐스테이트 5차) | <NA> | 2023-10-26
8193 | 휴게음식점 | 아몽즈커피인천검단아라점 | 인천광역시 서구 바리미로 21, 103호 (원당동) | 032-569-2588 | 2023-10-26
8194 | 제과점영업 | 밀:릇 | 인천광역시 서구 청라커낼로 280, 청라골든프라자 103호 (청라동) | <NA> | 2023-10-26
8195 | 제과점영업 | 플래비츠 | 인천광역시 서구 염곡로498번안길 5, 1층 (가정동) | <NA> | 2023-10-26

Duplicate rows

Most frequently occurring

Index | 업종명 | 업소명 | 소재지 (도로명) | 전화번호 | 데이터기준일자 | # duplicates
0 | 일반음식점 | (주)촌장 | 인천광역시 서구 서곶로 296 (심곡동) | 032-562-4343 | 2023-10-26 | 2
1 | 일반음식점 | (주)해피락플러스 | 인천광역시 서구 한서로 58-2 (백석동) | 032-571-5287 | 2023-10-26 | 2
2 | 일반음식점 | 3학년8반 | 인천광역시 서구 원적로 33 (가좌동) | 032-583-9462 | 2023-10-26 | 2
3 | 일반음식점 | OB캠프 | 인천광역시 서구 길주로 96 (석남동) | 032-572-7905 | 2023-10-26 | 2
4 | 일반음식점 | OK치킨(오케이치킨) | 인천광역시 서구 새오개로35번길 2 (석남동) | 032-582-2869 | 2023-10-26 | 2
5 | 일반음식점 | 가산팔복초밥 서구청역점 | 인천광역시 서구 탁옥로86번길 2 (심곡동) | 032-568-9797 | 2023-10-26 | 2
6 | 일반음식점 | 가시골염소마을 | 인천광역시 서구 심곡로55번길 6, 1층 (심곡동) | 032-562-2667 | 2023-10-26 | 2
7 | 일반음식점 | 가야촌 오리마을 | 인천광역시 서구 심곡로49번길 7, 녹성아파트 103호 (심곡동) | 032-567-7446 | 2023-10-26 | 2
8 | 일반음식점 | 가정식백반칼국수 | 인천광역시 서구 염곡로 343-16 (신현동,(1층)) | 032-581-2540 | 2023-10-26 | 2
9 | 일반음식점 | 가정옥 | 인천광역시 서구 가정로352번길 3 (가정동) | 032-572-4936 | 2023-10-26 | 2
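A "# duplicates" column like the one in this table can be produced by counting identical rows. The sketch below uses only the standard library and a hypothetical four-row miniature of the data with three of the five columns. It returns both the number of distinct repeated rows and the number of extra copies; which of the two the report's figure of 334 corresponds to is not stated here, so no claim is made about that.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (repeated, n_groups, n_extra_copies) for an iterable of row tuples.

    repeated       -- {row: occurrences} for rows seen more than once
    n_groups       -- number of distinct rows that are repeated
    n_extra_copies -- occurrences beyond the first, summed over those rows
    """
    occurrences = Counter(rows)
    repeated = {row: n for row, n in occurrences.items() if n > 1}
    n_extra_copies = sum(n - 1 for n in repeated.values())
    return repeated, len(repeated), n_extra_copies


# Hypothetical miniature of the table: (업종명, 업소명, 전화번호) tuples.
rows = [
    ("일반음식점", "OB캠프", "032-572-7905"),
    ("일반음식점", "가정옥", "032-572-4936"),
    ("일반음식점", "OB캠프", "032-572-7905"),
    ("일반음식점", "본죽", "032-582-4069"),
]
repeated, n_groups, n_extra_copies = duplicate_summary(rows)
print(n_groups, n_extra_copies, repeated)
```

Comparing whole tuples means two rows count as duplicates only when every column matches, which is consistent with each listed row sharing name, address, and phone number.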