Overview

Dataset statistics

Number of variables: 4
Number of observations: 6453
Missing cells: 3741
Missing cells (%): 14.5%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 201.8 KiB
Average record size in memory: 32.0 B
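
The derived figures in this block can be checked from the raw counts. The sketch below is a sanity check in plain Python, not the profiler's own code. It assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative.

```python
# Counts copied from the Dataset statistics in this report.
n_variables = 4
n_observations = 6453
missing_cells = 3741
total_size_kib = 201.8

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported above (14.5% and 32.0 B).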

Variable types

Categorical: 1
Text: 3

Dataset

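Description: Information on the status of food service establishments in Michuhol-gu, Incheon Metropolitan City. The dataset provides each establishment's business type, business name, address, and telephone number.
Author: Michuhol-gu, Incheon Metropolitan City
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15099964&srcSe=7661IVAWM27C61E190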

Alerts

Duplicates: Dataset has 1 (< 0.1%) duplicate row
Imbalance: 업종명 is highly imbalanced (50.9%)
Missing: 전화번호 has 3741 (58.0%) missing values
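
The 50.9% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition (it is inferred from the numbers, not taken from the report) and uses the 업종명 value counts listed under Common Values; `imbalance_score` is an illustrative name.

```python
import math

# Value counts for 업종명, copied from this report.
counts = [4549, 1384, 245, 115, 104, 56]


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    H is the Shannon entropy (base 2) of the class distribution and
    H_max = log2(number of classes). A perfectly balanced column scores 0;
    the score approaches 1 as a single class dominates.
    """
    if len(counts) < 2:
        return 0.0  # a single class has no meaningful entropy ratio
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(len(counts))


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this assumption the six counts give a score that rounds to 50.9%, matching the alert.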

Reproduction

Analysis started: 2024-04-06 09:43:05.929662
Analysis finished: 2024-04-06 09:43:07.010789
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
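
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the CSV path, the output path, and the helper name `build_report` are hypothetical, and the actual settings live in config.json, which is not reproduced in this document.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report.

    Both paths are hypothetical; the source file name is not given here.
    """
    # Imported lazily so that defining this helper does not require
    # pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # dtype=str keeps values such as 032-425-3300 as text.
    df = pd.read_csv(csv_path, dtype=str)
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
```

Reading every column as text is a choice made for this sketch; it is consistent with the three Text variables reported above but is not confirmed by the source.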

Variables

업종명
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

일반음식점 | 4549
휴게음식점 | 1384
유흥주점영업 | 245
단란주점 | 115
제과점영업 | 104

Length

Max length: 6
Median length: 5
Mean length: 5.0288238
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%
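
The report lists "Distinct" and "Unique" separately. The figures are consistent with "Distinct" meaning the number of different values and "Unique" meaning the number of values that occur exactly once. The sketch below assumes those definitions and rebuilds the 업종명 column from the value counts in this report; `distinct_and_unique` is an illustrative helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) for a sequence of values.

    distinct: how many different values occur.
    unique:   how many values occur exactly once.
    """
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Value counts for 업종명, copied from this report.
category_counts = {
    "일반음식점": 4549,
    "휴게음식점": 1384,
    "유흥주점영업": 245,
    "단란주점": 115,
    "제과점영업": 104,
    "위탁급식영업": 56,
}
column = [name for name, n in category_counts.items() for _ in range(n)]

print(distinct_and_unique(column))  # (6, 0)
```

Every category appears many times, so none is unique even though six are distinct.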

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

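Value | Count | Frequency (%)
일반음식점 (general restaurant) | 4549 | 70.5%
휴게음식점 (rest restaurant) | 1384 | 21.4%
유흥주점영업 (entertainment bar business) | 245 | 3.8%
단란주점 (karaoke bar) | 115 | 1.8%
제과점영업 (bakery business) | 104 | 1.6%
위탁급식영업 (contract food service business) | 56 | 0.9%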

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the 업종명 value frequencies; the values match the Common Values table.]

업소명
Text

Distinct: 6094
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Length

Max length: 34
Median length: 25
Mean length: 6.827987
Min length: 1

Characters and Unicode

Total characters: 44061
Distinct characters: 991
Distinct categories: 12
Distinct scripts: 5
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
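
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the sample value 세컨드 찬스(second chance) from this report; `category_counts` is an illustrative helper, and the two-letter codes are the standard abbreviations for the category names used in this report (Lo = Other Letter, Ll = Lowercase Letter, Zs = Space Separator, Ps = Open Punctuation, Pe = Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of text by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "세컨드 찬스(second chance)"  # a sample value from this report
counts = category_counts(sample)

for code in ("Lo", "Ll", "Zs", "Ps", "Pe"):
    print(code, counts[code])
```

The five Hangul syllables fall under Lo, the twelve Latin letters under Ll, and the two spaces and the pair of parentheses under Zs, Ps, and Pe.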

Unique

Unique: 5850
Unique (%): 90.7%

Sample

1st row: 서문삼계탕
2nd row: 황금오리
3rd row: 금호횟집
4th row: 세컨드 찬스(second chance)
5th row: 화룽
Value | Count | Frequency (%)
주안점 | 116 | 1.3%
씨유 | 79 | 0.9%
인하대점 | 78 | 0.9%
카페 | 69 | 0.8%
용현점 | 66 | 0.7%
세븐일레븐 | 66 | 0.7%
도화점 | 46 | 0.5%
지에스25 | 41 | 0.4%
인천도화점 | 34 | 0.4%
인하대역점 | 31 | 0.3%
Other values (6549) | 8525 | 93.2%

Most occurring characters

ValueCountFrequency (%)
2707
 
6.1%
1581
 
3.6%
934
 
2.1%
709
 
1.6%
686
 
1.6%
679
 
1.5%
584
 
1.3%
498
 
1.1%
471
 
1.1%
461
 
1.0%
Other values (981) 34751
78.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 37789 | 85.8%
Space Separator | 2707 | 6.1%
Uppercase Letter | 930 | 2.1%
Decimal Number | 885 | 2.0%
Lowercase Letter | 691 | 1.6%
Close Punctuation | 454 | 1.0%
Open Punctuation | 454 | 1.0%
Other Punctuation | 73 | 0.2%
Connector Punctuation | 72 | 0.2%
Dash Punctuation | 3 | < 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1581
 
4.2%
934
 
2.5%
709
 
1.9%
686
 
1.8%
679
 
1.8%
584
 
1.5%
498
 
1.3%
471
 
1.2%
461
 
1.2%
433
 
1.1%
Other values (901) 30753
81.4%
Lowercase Letter
Value | Count | Frequency (%)
e | 104 | 15.1%
a | 79 | 11.4%
o | 59 | 8.5%
c | 48 | 6.9%
i | 41 | 5.9%
n | 40 | 5.8%
f | 37 | 5.4%
l | 36 | 5.2%
r | 34 | 4.9%
s | 30 | 4.3%
Other values (17) | 183 | 26.5%
Uppercase Letter
Value | Count | Frequency (%)
C | 124 | 13.3%
S | 88 | 9.5%
G | 65 | 7.0%
B | 61 | 6.6%
P | 61 | 6.6%
A | 56 | 6.0%
O | 54 | 5.8%
E | 50 | 5.4%
F | 46 | 4.9%
T | 34 | 3.7%
Other values (16) | 291 | 31.3%
Decimal Number
Value | Count | Frequency (%)
2 | 199 | 22.5%
0 | 129 | 14.6%
1 | 120 | 13.6%
5 | 119 | 13.4%
8 | 74 | 8.4%
9 | 72 | 8.1%
7 | 57 | 6.4%
4 | 48 | 5.4%
3 | 44 | 5.0%
6 | 23 | 2.6%
Other Punctuation
Value | Count | Frequency (%)
. | 26 | 35.6%
, | 22 | 30.1%
' | 13 | 17.8%
! | 4 | 5.5%
: | 3 | 4.1%
· | 3 | 4.1%
# | 2 | 2.7%
Close Punctuation
Value | Count | Frequency (%)
) | 453 | 99.8%
] | 1 | 0.2%
Open Punctuation
Value | Count | Frequency (%)
( | 453 | 99.8%
[ | 1 | 0.2%
Math Symbol
Value | Count | Frequency (%)
< | 1 | 50.0%
> | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2707 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 72 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 37771 | 85.7%
Common | 4650 | 10.6%
Latin | 1621 | 3.7%
Han | 18 | < 0.1%
Greek | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1581
 
4.2%
934
 
2.5%
709
 
1.9%
686
 
1.8%
679
 
1.8%
584
 
1.5%
498
 
1.3%
471
 
1.2%
461
 
1.2%
433
 
1.1%
Other values (889) 30735
81.4%
Latin
Value | Count | Frequency (%)
C | 124 | 7.6%
e | 104 | 6.4%
S | 88 | 5.4%
a | 79 | 4.9%
G | 65 | 4.0%
B | 61 | 3.8%
P | 61 | 3.8%
o | 59 | 3.6%
A | 56 | 3.5%
O | 54 | 3.3%
Other values (43) | 870 | 53.7%
Common
Value | Count | Frequency (%)
(space) | 2707 | 58.2%
) | 453 | 9.7%
( | 453 | 9.7%
2 | 199 | 4.3%
0 | 129 | 2.8%
1 | 120 | 2.6%
5 | 119 | 2.6%
8 | 74 | 1.6%
9 | 72 | 1.5%
_ | 72 | 1.5%
Other values (16) | 252 | 5.4%
Han
ValueCountFrequency (%)
5
27.8%
2
 
11.1%
2
 
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Other values (2) 2
 
11.1%
Greek
Value | Count | Frequency (%)
α | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 37766 | 85.7%
ASCII | 6267 | 14.2%
CJK | 18 | < 0.1%
Compat Jamo | 5 | < 0.1%
None | 4 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2707 | 43.2%
) | 453 | 7.2%
( | 453 | 7.2%
2 | 199 | 3.2%
0 | 129 | 2.1%
C | 124 | 2.0%
1 | 120 | 1.9%
5 | 119 | 1.9%
e | 104 | 1.7%
S | 88 | 1.4%
Other values (67) | 1771 | 28.3%
Hangul
ValueCountFrequency (%)
1581
 
4.2%
934
 
2.5%
709
 
1.9%
686
 
1.8%
679
 
1.8%
584
 
1.5%
498
 
1.3%
471
 
1.2%
461
 
1.2%
433
 
1.1%
Other values (888) 30730
81.4%
CJK
ValueCountFrequency (%)
5
27.8%
2
 
11.1%
2
 
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Other values (2) 2
 
11.1%
Compat Jamo
ValueCountFrequency (%)
5
100.0%
None
Value | Count | Frequency (%)
· | 3 | 75.0%
α | 1 | 25.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

소재지
Text

Distinct: 5839
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Length

Max length: 75
Median length: 68
Mean length: 31.618317
Min length: 22

Characters and Unicode

Total characters: 204033
Distinct characters: 427
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5376
Unique (%): 83.3%

Sample

1st row: 인천광역시 미추홀구 미추홀대로 690 (주안동)
2nd row: 인천광역시 미추홀구 석정로 428 (주안동)
3rd row: 인천광역시 미추홀구 용삼길 75 (용현동)
4th row: 인천광역시 미추홀구 인하로 73 (용현동)
5th row: 인천광역시 미추홀구 주안로104번길 57 (주안동)
Value | Count | Frequency (%)
인천광역시 | 6453 | 16.6%
미추홀구 | 6453 | 16.6%
주안동 | 2377 | 6.1%
용현동 | 1633 | 4.2%
1층 | 1490 | 3.8%
도화동 | 711 | 1.8%
숭의동 | 543 | 1.4%
학익동 | 494 | 1.3%
2층 | 415 | 1.1%
경인로 | 327 | 0.8%
Other values (3142) | 17888 | 46.1%

Most occurring characters

ValueCountFrequency (%)
32343
 
15.9%
8395
 
4.1%
1 8260
 
4.0%
7142
 
3.5%
6988
 
3.4%
6966
 
3.4%
6903
 
3.4%
6670
 
3.3%
( 6578
 
3.2%
) 6576
 
3.2%
Other values (417) 107212
52.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 120840 | 59.2%
Space Separator | 32343 | 15.9%
Decimal Number | 31300 | 15.3%
Open Punctuation | 6579 | 3.2%
Close Punctuation | 6577 | 3.2%
Other Punctuation | 4747 | 2.3%
Dash Punctuation | 1077 | 0.5%
Uppercase Letter | 406 | 0.2%
Lowercase Letter | 111 | 0.1%
Math Symbol | 50 | < 0.1%
Other values (3) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8395
 
6.9%
7142
 
5.9%
6988
 
5.8%
6966
 
5.8%
6903
 
5.7%
6670
 
5.5%
6548
 
5.4%
6538
 
5.4%
6538
 
5.4%
6474
 
5.4%
Other values (358) 51678
42.8%
Uppercase Letter
Value | Count | Frequency (%)
B | 81 | 20.0%
A | 55 | 13.5%
S | 53 | 13.1%
I | 34 | 8.4%
E | 30 | 7.4%
K | 27 | 6.7%
W | 23 | 5.7%
V | 23 | 5.7%
C | 16 | 3.9%
N | 11 | 2.7%
Other values (11) | 53 | 13.1%
Lowercase Letter
Value | Count | Frequency (%)
e | 31 | 27.9%
y | 23 | 20.7%
k | 22 | 19.8%
o | 6 | 5.4%
w | 5 | 4.5%
r | 5 | 4.5%
b | 3 | 2.7%
v | 3 | 2.7%
g | 3 | 2.7%
c | 3 | 2.7%
Other values (3) | 7 | 6.3%
Decimal Number
Value | Count | Frequency (%)
1 | 8260 | 26.4%
2 | 4309 | 13.8%
3 | 3578 | 11.4%
0 | 2982 | 9.5%
4 | 2760 | 8.8%
5 | 2242 | 7.2%
6 | 2038 | 6.5%
8 | 1861 | 5.9%
7 | 1839 | 5.9%
9 | 1431 | 4.6%
Other Punctuation
Value | Count | Frequency (%)
, | 4715 | 99.3%
. | 28 | 0.6%
@ | 2 | < 0.1%
/ | 1 | < 0.1%
* | 1 | < 0.1%
Open Punctuation
Value | Count | Frequency (%)
( | 6578 | > 99.9%
[ | 1 | < 0.1%
Close Punctuation
Value | Count | Frequency (%)
) | 6576 | > 99.9%
] | 1 | < 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 32343 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1077 | 100.0%
Math Symbol
Value | Count | Frequency (%)
~ | 50 | 100.0%
Modifier Symbol
Value | Count | Frequency (%)
` | 1 | 100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 120839 | 59.2%
Common | 82675 | 40.5%
Latin | 517 | 0.3%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8395
 
6.9%
7142
 
5.9%
6988
 
5.8%
6966
 
5.8%
6903
 
5.7%
6670
 
5.5%
6548
 
5.4%
6538
 
5.4%
6538
 
5.4%
6474
 
5.4%
Other values (357) 51677
42.8%
Latin
Value | Count | Frequency (%)
B | 81 | 15.7%
A | 55 | 10.6%
S | 53 | 10.3%
I | 34 | 6.6%
e | 31 | 6.0%
E | 30 | 5.8%
K | 27 | 5.2%
W | 23 | 4.4%
y | 23 | 4.4%
V | 23 | 4.4%
Other values (24) | 137 | 26.5%
Common
Value | Count | Frequency (%)
(space) | 32343 | 39.1%
1 | 8260 | 10.0%
( | 6578 | 8.0%
) | 6576 | 8.0%
, | 4715 | 5.7%
2 | 4309 | 5.2%
3 | 3578 | 4.3%
0 | 2982 | 3.6%
4 | 2760 | 3.3%
5 | 2242 | 2.7%
Other values (14) | 8332 | 10.1%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 120838 | 59.2%
ASCII | 83192 | 40.8%
CJK | 2 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 32343 | 38.9%
1 | 8260 | 9.9%
( | 6578 | 7.9%
) | 6576 | 7.9%
, | 4715 | 5.7%
2 | 4309 | 5.2%
3 | 3578 | 4.3%
0 | 2982 | 3.6%
4 | 2760 | 3.3%
5 | 2242 | 2.7%
Other values (48) | 8849 | 10.6%
Hangul
ValueCountFrequency (%)
8395
 
6.9%
7142
 
5.9%
6988
 
5.8%
6966
 
5.8%
6903
 
5.7%
6670
 
5.5%
6548
 
5.4%
6538
 
5.4%
6538
 
5.4%
6474
 
5.4%
Other values (356) 51676
42.8%
None
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%

전화번호
Text

MISSING 

Distinct: 2662
Distinct (%): 98.2%
Missing: 3741
Missing (%): 58.0%
Memory size: 50.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.012906
Min length: 9

Characters and Unicode

Total characters: 32579
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2617
Unique (%): 96.5%

Sample

1st row: 032-425-3300
2nd row: 032-861-2020
3rd row: 032-882-3786
4th row: 032-868-4227
5th row: 032-435-8192
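
The sample values share a hyphenated layout, which suggests a simple format check before the column is used. The sketch below is one possible validator, not part of the report. The pattern (a code beginning with 0, then 3 or 4 digits, then 4 digits) is an assumption based only on the values shown in this report. The column's minimum length is 9, so some values are shorter than anything this pattern accepts and would be flagged for review.

```python
import re

# Assumed layout: 0XX-XXX(X)-XXXX, as in 032-425-3300 or 070-4024-3993.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def is_valid_phone(value):
    """Return True if value is a string matching the assumed layout.

    Missing values (None) are reported as invalid rather than raising.
    """
    return value is not None and PHONE_RE.fullmatch(value) is not None


for candidate in ("032-425-3300", "070-4024-3993", None):
    print(candidate, is_valid_phone(candidate))
```

Missing entries, which make up 58.0% of this column, are treated as invalid here; whether that is appropriate depends on the downstream use.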
Value | Count | Frequency (%)
032-437-9999 | 4 | 0.1%
032-423-2343 | 3 | 0.1%
070-4024-3993 | 3 | 0.1%
032-227-5000 | 3 | 0.1%
032-456-3043 | 2 | 0.1%
032-874-5222 | 2 | 0.1%
032-434-3081 | 2 | 0.1%
032-422-1221 | 2 | 0.1%
032-883-0330 | 2 | 0.1%
032-867-8050 | 2 | 0.1%
Other values (2652) | 2687 | 99.1%

Most occurring characters

Value | Count | Frequency (%)
- | 5408 | 16.6%
2 | 4872 | 15.0%
3 | 4329 | 13.3%
0 | 4186 | 12.8%
8 | 3739 | 11.5%
4 | 2109 | 6.5%
7 | 1931 | 5.9%
6 | 1770 | 5.4%
5 | 1557 | 4.8%
1 | 1401 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 27171 | 83.4%
Dash Punctuation | 5408 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 4872 | 17.9%
3 | 4329 | 15.9%
0 | 4186 | 15.4%
8 | 3739 | 13.8%
4 | 2109 | 7.8%
7 | 1931 | 7.1%
6 | 1770 | 6.5%
5 | 1557 | 5.7%
1 | 1401 | 5.2%
9 | 1277 | 4.7%
Dash Punctuation
Value | Count | Frequency (%)
- | 5408 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 32579 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 5408 | 16.6%
2 | 4872 | 15.0%
3 | 4329 | 13.3%
0 | 4186 | 12.8%
8 | 3739 | 11.5%
4 | 2109 | 6.5%
7 | 1931 | 5.9%
6 | 1770 | 5.4%
5 | 1557 | 4.8%
1 | 1401 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 32579 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 5408 | 16.6%
2 | 4872 | 15.0%
3 | 4329 | 13.3%
0 | 4186 | 12.8%
8 | 3739 | 11.5%
4 | 2109 | 6.5%
7 | 1931 | 5.9%
6 | 1770 | 5.4%
5 | 1557 | 4.8%
1 | 1401 | 4.3%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

Index | 업종명 | 업소명 | 소재지 | 전화번호
0 | 일반음식점 | 서문삼계탕 | 인천광역시 미추홀구 미추홀대로 690 (주안동) | 032-425-3300
1 | 일반음식점 | 황금오리 | 인천광역시 미추홀구 석정로 428 (주안동) | 032-861-2020
2 | 일반음식점 | 금호횟집 | 인천광역시 미추홀구 용삼길 75 (용현동) | 032-882-3786
3 | 일반음식점 | 세컨드 찬스(second chance) | 인천광역시 미추홀구 인하로 73 (용현동) | 032-868-4227
4 | 일반음식점 | 화룽 | 인천광역시 미추홀구 주안로104번길 57 (주안동) | 032-435-8192
5 | 일반음식점 | 이나술 | 인천광역시 미추홀구 경인남길30번길 38 (용현동) | 032-873-8505
6 | 일반음식점 | 아름장 | 인천광역시 미추홀구 경인로418번길 60 (주안동) | 032-422-2907
7 | 일반음식점 | 맵당 | 인천광역시 미추홀구 경인로 392 (주안동) | 032-425-6222
8 | 일반음식점 | 조선화로집 주안점 | 인천광역시 미추홀구 미추홀대로734번길 6 (주안동) | 032-423-4455
9 | 일반음식점 | 펀비어킹 | 인천광역시 미추홀구 길파로 1 (주안동) | <NA>

Last 10 rows

Index | 업종명 | 업소명 | 소재지 | 전화번호
6443 | 단란주점 | 풍금7080 | 인천광역시 미추홀구 경인로 437, 2층 (주안동) | <NA>
6444 | 단란주점 | 아바 | 인천광역시 미추홀구 토금중로 8, 3층 (용현동) | <NA>
6445 | 단란주점 | 오니 | 인천광역시 미추홀구 주안로 116, 주안리가스퀘어 B106호 (주안동) | <NA>
6446 | 단란주점 | 골드노래짱 | 인천광역시 미추홀구 독배로 430, 지하1층 (용현동) | <NA>
6447 | 단란주점 | 황금성단란주점 | 인천광역시 미추홀구 주안중로 19, 창보빌딩 지하1층 (주안동) | 032-426-7777
6448 | 단란주점 | 여기요 | 인천광역시 미추홀구 미추홀대로 722, 지하1층 (주안동) | <NA>
6449 | 단란주점 | SONG송라이브 | 인천광역시 미추홀구 미추홀대로 665, 2층 (주안동) | <NA>
6450 | 단란주점 | 티파니노래짱 | 인천광역시 미추홀구 독배로 420, 지하1층 (용현동) | <NA>
6451 | 단란주점 | 홀리데이 | 인천광역시 미추홀구 독배로 426, 중앙프라자 쇼핑센타 지하1층 (용현동) | <NA>
6452 | 단란주점 | 만남라이브 단란 | 인천광역시 미추홀구 제일로 41, 2층 일부호 (도화동) | <NA>

Duplicate rows

Most frequently occurring

Index | 업종명 | 업소명 | 소재지 | 전화번호 | # duplicates
0 | 휴게음식점 | 커피메이커 | 인천광역시 미추홀구 미추홀대로 726 (주안동) | <NA> | 2
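
A duplicate row is one whose values in all four columns repeat. The sketch below shows one way to find such rows with the standard library. It treats two missing phone numbers (represented here as `None`) as equal, which is an assumption of this sketch; `duplicate_rows` is an illustrative helper and the three input rows are taken from this report.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for every row that appears more than once.

    Each row must be hashable, for example a tuple of column values.
    """
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    ("휴게음식점", "커피메이커", "인천광역시 미추홀구 미추홀대로 726 (주안동)", None),
    ("휴게음식점", "커피메이커", "인천광역시 미추홀구 미추홀대로 726 (주안동)", None),
    ("일반음식점", "서문삼계탕", "인천광역시 미추홀구 미추홀대로 690 (주안동)", "032-425-3300"),
]

for row, n in duplicate_rows(rows).items():
    print(n, row[1])
```

Whether the repeated 커피메이커 entry is a data-entry error or two registrations at one address cannot be determined from the report alone.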