Overview

Dataset statistics

Number of variables: 4
Number of observations: 220
Missing cells: 112
Missing cells (%): 12.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.0 KiB
Average record size in memory: 32.6 B
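
The derived figures in this block can be checked from the raw counts. The sketch below is a plain arithmetic check, not part of the profiling tool; the variable names are illustrative, and the memory total is the rounded 7.0 KiB shown in the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 4
n_observations = 220
missing_cells = 112
total_size_bytes = 7.0 * 1024  # 7.0 KiB, as rounded in the report

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_bytes / n_observations

print(f"total cells: {total_cells}")                      # total cells: 880
print(f"missing cells: {missing_pct:.1f}%")               # missing cells: 12.7%
print(f"average record size: {avg_record_bytes:.1f} B")   # average record size: 32.6 B
```

All 112 missing cells come from a single column, which is why the per-column rate reported in the alerts (50.9%) is four times the dataset-wide rate.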

Variable types

Categorical: 1
Text: 3

Dataset

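Description: 부산광역시_동구_건강기능식품판매업현황_20210117 (Busan Metropolitan City, Dong-gu: status of health functional food sales businesses, 2021-01-17)
Author: 부산광역시 동구 (Dong-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15028638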

Alerts

업종명 (business type) is highly imbalanced (92.5%) [Imbalance]
소재지전화 (location phone number) has 112 (50.9%) missing values [Missing]
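
The 92.5% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. The report does not state its formula, so the sketch below is an assumption that happens to reproduce the number; `imbalance_score` is an illustrative name, and the counts 218 and 2 are taken from the 업종명 variable.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score([218, 2])
print(f"{score:.1%}")  # 92.5%
```

Under this definition a perfectly balanced column scores 0% and a column dominated by one class approaches 100%.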

Reproduction

Analysis started: 2023-12-10 16:35:22.395907
Analysis finished: 2023-12-10 16:35:23.032886
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
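
The reported duration follows from the two timestamps. A short check using only the standard library (the format string is an assumption about how the timestamps are laid out, based on the values shown):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 16:35:22.395907", FMT)
finished = datetime.strptime("2023-12-10 16:35:23.032886", FMT)

# Elapsed wall-clock time between the two report timestamps.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.64 seconds
```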

Variables

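업종명 (business type)
Categorical

Alert: IMBALANCE

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
건강기능식품일반판매업 (general health functional food sales) | 218
건강기능식품유통전문판매업 (specialized health functional food distribution sales) | 2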

Length

Max length: 13
Median length: 11
Mean length: 11.018182
Min length: 11
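
These length statistics can be reproduced from the two category values and their counts. The sketch assumes that length means the number of Unicode code points, which is what Python's `len` returns for these precomposed Hangul strings; the report does not say how it measures length.

```python
from statistics import mean, median

# Category values and their counts, as listed for this variable.
value_counts = {
    "건강기능식품일반판매업": 218,
    "건강기능식품유통전문판매업": 2,
}

# One length per row of the column.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

print("min:", min(lengths))
print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 6))
```

The mean is (218 × 11 + 2 × 13) / 220, which rounds to the 11.018182 shown above.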

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건강기능식품일반판매업
2nd row: 건강기능식품일반판매업
3rd row: 건강기능식품일반판매업
4th row: 건강기능식품일반판매업
5th row: 건강기능식품일반판매업

Common Values

Value | Count | Frequency (%)
건강기능식품일반판매업 | 218 | 99.1%
건강기능식품유통전문판매업 | 2 | 0.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

업소명 (business name)
Text

Distinct: 218
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 18
Median length: 15
Mean length: 7.2136364
Min length: 2

Characters and Unicode

Total characters: 1587
Distinct characters: 333
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
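
As an illustration of how such per-category counts can be derived, the sketch below classifies characters with Python's standard `unicodedata` module. It is not the profiler's own code: the `CATEGORY_NAMES` mapping and the function name are illustrative, and only the general category is covered, because the standard library does not expose scripts or blocks.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator", "Ps": "Open Punctuation",
    "Pe": "Close Punctuation", "Po": "Other Punctuation",
    "Pd": "Dash Punctuation", "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for the 업소명 column.
sample = ["집합몰", "(주)굿헬스코리아", "유한회사 스노우에이치", "명품프라자", "참선진녹즙"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables for the Korean text columns.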

Unique

Unique: 216
Unique (%): 98.2%
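
"Distinct" and "Unique" measure different things here: the figures are consistent with distinct being the number of different values and unique being the number of values that occur exactly once. A minimal sketch under that reading, using placeholder strings shaped like this column (216 names occurring once and 2 names occurring twice) rather than the real business names:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Placeholder data, not the real names: 216 singletons plus 2 duplicated values.
values = [f"name-{i}" for i in range(216)] + ["dup-a", "dup-a", "dup-b", "dup-b"]
distinct, unique = distinct_and_unique(values)
n = len(values)

print(distinct, f"{distinct / n:.1%}")  # 218 99.1%
print(unique, f"{unique / n:.1%}")      # 216 98.2%
```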

Sample

1st row: 집합몰
2nd row: (주)굿헬스코리아
3rd row: 유한회사 스노우에이치
4th row: 명품프라자
5th row: 참선진녹즙

Words

Value | Count | Frequency (%)
주식회사 | 8 | 2.9%
세븐일레븐 | 6 | 2.2%
유니베라 | 2 | 0.7%
찜질 | 2 | 0.7%
초량지점 | 2 | 0.7%
수정점 | 2 | 0.7%
애터미 | 2 | 0.7%
주)엑셀 | 2 | 0.7%
문문칩 | 1 | 0.4%
연구소 | 1 | 0.4%
Other values (251) | 251 | 90.0%

Most occurring characters

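Value | Count | Frequency (%)
(space) | 59 | 3.7%
(Hangul character, not preserved) | 49 | 3.1%
(Hangul character, not preserved) | 44 | 2.8%
(Hangul character, not preserved) | 42 | 2.6%
( | 40 | 2.5%
) | 40 | 2.5%
(Hangul character, not preserved) | 39 | 2.5%
(Hangul character, not preserved) | 35 | 2.2%
(Hangul character, not preserved) | 34 | 2.1%
(Hangul character, not preserved) | 29 | 1.8%
Other values (323) | 1176 | 74.1%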

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1398 | 88.1%
Space Separator | 59 | 3.7%
Open Punctuation | 40 | 2.5%
Close Punctuation | 40 | 2.5%
Uppercase Letter | 22 | 1.4%
Lowercase Letter | 16 | 1.0%
Decimal Number | 9 | 0.6%
Other Punctuation | 2 | 0.1%
Math Symbol | 1 | 0.1%

Most frequent character per category

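Other Letter
Value | Count | Frequency (%)
(Hangul character, not preserved) | 49 | 3.5%
(Hangul character, not preserved) | 44 | 3.1%
(Hangul character, not preserved) | 42 | 3.0%
(Hangul character, not preserved) | 39 | 2.8%
(Hangul character, not preserved) | 35 | 2.5%
(Hangul character, not preserved) | 34 | 2.4%
(Hangul character, not preserved) | 29 | 2.1%
(Hangul character, not preserved) | 27 | 1.9%
(Hangul character, not preserved) | 23 | 1.6%
(Hangul character, not preserved) | 20 | 1.4%
Other values (290) | 1056 | 75.5%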
Lowercase Letter
Value | Count | Frequency (%)
o | 2 | 12.5%
n | 2 | 12.5%
c | 2 | 12.5%
u | 2 | 12.5%
p | 1 | 6.2%
i | 1 | 6.2%
d | 1 | 6.2%
s | 1 | 6.2%
j | 1 | 6.2%
y | 1 | 6.2%
Other values (2) | 2 | 12.5%
Uppercase Letter
Value | Count | Frequency (%)
S | 4 | 18.2%
G | 4 | 18.2%
I | 3 | 13.6%
B | 3 | 13.6%
O | 3 | 13.6%
D | 1 | 4.5%
N | 1 | 4.5%
L | 1 | 4.5%
C | 1 | 4.5%
V | 1 | 4.5%
Decimal Number
Value | Count | Frequency (%)
5 | 3 | 33.3%
2 | 3 | 33.3%
1 | 1 | 11.1%
6 | 1 | 11.1%
3 | 1 | 11.1%
Other Punctuation
Value | Count | Frequency (%)
, | 1 | 50.0%
& | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 59 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 40 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 40 | 100.0%
Math Symbol
Value | Count | Frequency (%)
= | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1398 | 88.1%
Common | 151 | 9.5%
Latin | 38 | 2.4%

Most frequent character per script

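Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 49 | 3.5%
(Hangul character, not preserved) | 44 | 3.1%
(Hangul character, not preserved) | 42 | 3.0%
(Hangul character, not preserved) | 39 | 2.8%
(Hangul character, not preserved) | 35 | 2.5%
(Hangul character, not preserved) | 34 | 2.4%
(Hangul character, not preserved) | 29 | 2.1%
(Hangul character, not preserved) | 27 | 1.9%
(Hangul character, not preserved) | 23 | 1.6%
(Hangul character, not preserved) | 20 | 1.4%
Other values (290) | 1056 | 75.5%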
Latin
Value | Count | Frequency (%)
S | 4 | 10.5%
G | 4 | 10.5%
I | 3 | 7.9%
B | 3 | 7.9%
O | 3 | 7.9%
o | 2 | 5.3%
n | 2 | 5.3%
c | 2 | 5.3%
u | 2 | 5.3%
p | 1 | 2.6%
Other values (12) | 12 | 31.6%
Common
Value | Count | Frequency (%)
(space) | 59 | 39.1%
( | 40 | 26.5%
) | 40 | 26.5%
5 | 3 | 2.0%
2 | 3 | 2.0%
1 | 1 | 0.7%
, | 1 | 0.7%
& | 1 | 0.7%
6 | 1 | 0.7%
3 | 1 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1398 | 88.1%
ASCII | 189 | 11.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 59 | 31.2%
( | 40 | 21.2%
) | 40 | 21.2%
S | 4 | 2.1%
G | 4 | 2.1%
I | 3 | 1.6%
B | 3 | 1.6%
O | 3 | 1.6%
5 | 3 | 1.6%
2 | 3 | 1.6%
Other values (23) | 27 | 14.3%
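Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 49 | 3.5%
(Hangul character, not preserved) | 44 | 3.1%
(Hangul character, not preserved) | 42 | 3.0%
(Hangul character, not preserved) | 39 | 2.8%
(Hangul character, not preserved) | 35 | 2.5%
(Hangul character, not preserved) | 34 | 2.4%
(Hangul character, not preserved) | 29 | 2.1%
(Hangul character, not preserved) | 27 | 1.9%
(Hangul character, not preserved) | 23 | 1.6%
(Hangul character, not preserved) | 20 | 1.4%
Other values (290) | 1056 | 75.5%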

소재지(도로명) (location, road-name address)
Text

Distinct: 211
Distinct (%): 95.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 51
Median length: 44
Mean length: 32.1
Min length: 21

Characters and Unicode

Total characters: 7062
Distinct characters: 175
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 205
Unique (%): 93.2%

Sample

1st row: 부산광역시 동구 고관로 164, 205호 (좌천동)
2nd row: 부산광역시 동구 범일로 125, 현대백화점 부산점 지하2층 (범일동)
3rd row: 부산광역시 동구 조방로 32, 1-2층 (범일동)
4th row: 부산광역시 동구 중앙대로196번길 12-8, 중앙빌딩 2층 203호 (초량동)
5th row: 부산광역시 동구 수정중로20번길 31 (수정동)

Words

Value | Count | Frequency (%)
부산광역시 | 220 | 15.7%
동구 | 220 | 15.7%
초량동 | 87 | 6.2%
범일동 | 66 | 4.7%
중앙대로 | 50 | 3.6%
범일로 | 31 | 2.2%
1층 | 23 | 1.6%
수정동 | 23 | 1.6%
2층 | 20 | 1.4%
125 | 14 | 1.0%
Other values (380) | 650 | 46.3%
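
The word frequencies for this address column show neighbourhood names such as 초량동 without their parentheses, which suggests that tokens are split on whitespace and stripped of surrounding punctuation. The report does not document its tokenizer, so the sketch below is only an assumption along those lines; `word_counts` is an illustrative name and the input is the five sample addresses.

```python
import string
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens after trimming surrounding ASCII punctuation."""
    counts = Counter()
    for value in values:
        for token in value.split():
            word = token.strip(string.punctuation)
            if word:
                counts[word] += 1
    return counts


addresses = [
    "부산광역시 동구 고관로 164, 205호 (좌천동)",
    "부산광역시 동구 범일로 125, 현대백화점 부산점 지하2층 (범일동)",
    "부산광역시 동구 조방로 32, 1-2층 (범일동)",
    "부산광역시 동구 중앙대로196번길 12-8, 중앙빌딩 2층 203호 (초량동)",
    "부산광역시 동구 수정중로20번길 31 (수정동)",
]
counts = word_counts(addresses)
for word, n in counts.most_common(3):
    print(word, n)
```

Because every address begins with the same city and district, those two tokens each account for 220 occurrences in the full column and carry no distinguishing information.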

Most occurring characters

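Value | Count | Frequency (%)
(space) | 1184 | 16.8%
(Hangul character, not preserved) | 466 | 6.6%
1 | 261 | 3.7%
(Hangul character, not preserved) | 238 | 3.4%
(Hangul character, not preserved) | 237 | 3.4%
( | 234 | 3.3%
) | 234 | 3.3%
(Hangul character, not preserved) | 227 | 3.2%
(Hangul character, not preserved) | 226 | 3.2%
(Hangul character, not preserved) | 223 | 3.2%
Other values (165) | 3532 | 50.0%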

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3944 | 55.8%
Decimal Number | 1196 | 16.9%
Space Separator | 1184 | 16.8%
Open Punctuation | 234 | 3.3%
Close Punctuation | 234 | 3.3%
Other Punctuation | 214 | 3.0%
Dash Punctuation | 44 | 0.6%
Uppercase Letter | 11 | 0.2%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

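Other Letter
Value | Count | Frequency (%)
(Hangul character, not preserved) | 466 | 11.8%
(Hangul character, not preserved) | 238 | 6.0%
(Hangul character, not preserved) | 237 | 6.0%
(Hangul character, not preserved) | 227 | 5.8%
(Hangul character, not preserved) | 226 | 5.7%
(Hangul character, not preserved) | 223 | 5.7%
(Hangul character, not preserved) | 220 | 5.6%
(Hangul character, not preserved) | 216 | 5.5%
(Hangul character, not preserved) | 132 | 3.3%
(Hangul character, not preserved) | 128 | 3.2%
Other values (141) | 1631 | 41.4%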
Decimal Number
Value | Count | Frequency (%)
1 | 261 | 21.8%
2 | 193 | 16.1%
0 | 154 | 12.9%
3 | 130 | 10.9%
4 | 94 | 7.9%
5 | 84 | 7.0%
6 | 81 | 6.8%
7 | 70 | 5.9%
9 | 67 | 5.6%
8 | 62 | 5.2%
Uppercase Letter
Value | Count | Frequency (%)
A | 3 | 27.3%
Y | 2 | 18.2%
M | 2 | 18.2%
C | 2 | 18.2%
G | 1 | 9.1%
T | 1 | 9.1%
Other Punctuation
Value | Count | Frequency (%)
, | 212 | 99.1%
. | 1 | 0.5%
/ | 1 | 0.5%
Space Separator
Value | Count | Frequency (%)
(space) | 1184 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 234 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 234 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 44 | 100.0%
Lowercase Letter
Value | Count | Frequency (%)
b | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3944 | 55.8%
Common | 3106 | 44.0%
Latin | 12 | 0.2%

Most frequent character per script

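Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 466 | 11.8%
(Hangul character, not preserved) | 238 | 6.0%
(Hangul character, not preserved) | 237 | 6.0%
(Hangul character, not preserved) | 227 | 5.8%
(Hangul character, not preserved) | 226 | 5.7%
(Hangul character, not preserved) | 223 | 5.7%
(Hangul character, not preserved) | 220 | 5.6%
(Hangul character, not preserved) | 216 | 5.5%
(Hangul character, not preserved) | 132 | 3.3%
(Hangul character, not preserved) | 128 | 3.2%
Other values (141) | 1631 | 41.4%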
Common
Value | Count | Frequency (%)
(space) | 1184 | 38.1%
1 | 261 | 8.4%
( | 234 | 7.5%
) | 234 | 7.5%
, | 212 | 6.8%
2 | 193 | 6.2%
0 | 154 | 5.0%
3 | 130 | 4.2%
4 | 94 | 3.0%
5 | 84 | 2.7%
Other values (7) | 326 | 10.5%
Latin
Value | Count | Frequency (%)
A | 3 | 25.0%
Y | 2 | 16.7%
M | 2 | 16.7%
C | 2 | 16.7%
G | 1 | 8.3%
T | 1 | 8.3%
b | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3944 | 55.8%
ASCII | 3118 | 44.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1184 | 38.0%
1 | 261 | 8.4%
( | 234 | 7.5%
) | 234 | 7.5%
, | 212 | 6.8%
2 | 193 | 6.2%
0 | 154 | 4.9%
3 | 130 | 4.2%
4 | 94 | 3.0%
5 | 84 | 2.7%
Other values (14) | 338 | 10.8%
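Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 466 | 11.8%
(Hangul character, not preserved) | 238 | 6.0%
(Hangul character, not preserved) | 237 | 6.0%
(Hangul character, not preserved) | 227 | 5.8%
(Hangul character, not preserved) | 226 | 5.7%
(Hangul character, not preserved) | 223 | 5.7%
(Hangul character, not preserved) | 220 | 5.6%
(Hangul character, not preserved) | 216 | 5.5%
(Hangul character, not preserved) | 132 | 3.3%
(Hangul character, not preserved) | 128 | 3.2%
Other values (141) | 1631 | 41.4%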

소재지전화 (location phone number)
Text

Alert: MISSING

Distinct: 103
Distinct (%): 95.4%
Missing: 112
Missing (%): 50.9%
Memory size: 1.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 11.972222
Min length: 9

Characters and Unicode

Total characters: 1293
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 90.7%

Sample

1st row: 1661-0956
2nd row: 051-946-8008
3rd row: 051-912-9880
4th row: 051-897-9673
5th row: 051-863-7038

Words

Value | Count | Frequency (%)
051-513-2560 | 2 | 1.9%
051-714-0671 | 2 | 1.9%
051-667-1257 | 2 | 1.9%
051-469-6366 | 2 | 1.9%
051-441-1122 | 2 | 1.9%
051-466-9450 | 1 | 0.9%
051-466-3919 | 1 | 0.9%
051-464-1293 | 1 | 0.9%
051-464-2505 | 1 | 0.9%
051-464-3455 | 1 | 0.9%
Other values (93) | 93 | 86.1%
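
The length statistics for this column are computed over the non-missing values only: 1293 characters spread over 220 − 112 = 108 phone numbers gives the reported mean of 11.972222. The sketch below checks that and classifies the sample rows by shape. The two regular expressions are assumptions based on the samples shown, not an authoritative description of Korean phone number formats, and `phone_shape` is an illustrative name.

```python
import re

n_observations = 220
n_missing = 112
total_characters = 1293  # "Total characters" reported for this column

# Length statistics only cover rows that actually hold a value.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

# Two shapes seen in the sample rows (assumed patterns).
AREA_CODE = re.compile(r"^0\d{1,2}-\d{3,4}-\d{4}$")  # e.g. 051-946-8008
SHORT_FORM = re.compile(r"^\d{4}-\d{4}$")            # e.g. 1661-0956


def phone_shape(value):
    """Label a phone string by which assumed pattern it matches."""
    if AREA_CODE.match(value):
        return "area-code"
    if SHORT_FORM.match(value):
        return "short"
    return "other"


samples = ["1661-0956", "051-946-8008", "051-912-9880", "051-897-9673", "051-863-7038"]
shapes = {s: (len(s), phone_shape(s)) for s in samples}

print(n_present, round(mean_length, 6))  # 108 11.972222
for value, (length, shape) in shapes.items():
    print(value, length, shape)
```

The 9-character short form accounts for the minimum length of 9, while the 12-character area-code form accounts for both the median and the maximum.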

Most occurring characters

Value | Count | Frequency (%)
- | 215 | 16.6%
1 | 186 | 14.4%
0 | 175 | 13.5%
5 | 173 | 13.4%
6 | 146 | 11.3%
4 | 128 | 9.9%
3 | 70 | 5.4%
2 | 70 | 5.4%
7 | 56 | 4.3%
8 | 44 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1078 | 83.4%
Dash Punctuation | 215 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 186 | 17.3%
0 | 175 | 16.2%
5 | 173 | 16.0%
6 | 146 | 13.5%
4 | 128 | 11.9%
3 | 70 | 6.5%
2 | 70 | 6.5%
7 | 56 | 5.2%
8 | 44 | 4.1%
9 | 30 | 2.8%
Dash Punctuation
Value | Count | Frequency (%)
- | 215 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1293 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 215 | 16.6%
1 | 186 | 14.4%
0 | 175 | 13.5%
5 | 173 | 13.4%
6 | 146 | 11.3%
4 | 128 | 9.9%
3 | 70 | 5.4%
2 | 70 | 5.4%
7 | 56 | 4.3%
8 | 44 | 3.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1293 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 215 | 16.6%
1 | 186 | 14.4%
0 | 175 | 13.5%
5 | 173 | 13.4%
6 | 146 | 11.3%
4 | 128 | 9.9%
3 | 70 | 5.4%
2 | 70 | 5.4%
7 | 56 | 4.3%
8 | 44 | 3.4%

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Row | 업종명 (business type) | 업소명 (business name) | 소재지(도로명) (road-name address) | 소재지전화 (phone number)
0 | 건강기능식품일반판매업 | 집합몰 | 부산광역시 동구 고관로 164, 205호 (좌천동) | 1661-0956
1 | 건강기능식품일반판매업 | (주)굿헬스코리아 | 부산광역시 동구 범일로 125, 현대백화점 부산점 지하2층 (범일동) | <NA>
2 | 건강기능식품일반판매업 | 유한회사 스노우에이치 | 부산광역시 동구 조방로 32, 1-2층 (범일동) | 051-946-8008
3 | 건강기능식품일반판매업 | 명품프라자 | 부산광역시 동구 중앙대로196번길 12-8, 중앙빌딩 2층 203호 (초량동) | 051-912-9880
4 | 건강기능식품일반판매업 | 참선진녹즙 | 부산광역시 동구 수정중로20번길 31 (수정동) | 051-897-9673
5 | 건강기능식품일반판매업 | 솔고헬스케어 | 부산광역시 동구 망양로 897 (범일동) | 051-863-7038
6 | 건강기능식품일반판매업 | 네드베드 | 부산광역시 동구 자성로141번길 11, 삼환오피스텔 1101호 일부 (범일동) | 051-804-6555
7 | 건강기능식품일반판매업 | 락재팬 | 부산광역시 동구 중앙대로 263, 국제오피스텔 901호 (초량동) | 051-761-1220
8 | 건강기능식품일반판매업 | 나비엘 범일갤러리 | 부산광역시 동구 범일로 101-3, 2층 (범일동) | 051-758-0646
9 | 건강기능식품일반판매업 | (주)엑셀 | 부산광역시 동구 조방로 39, 썬오피스텔 8층 803호 (범일동) | 051-714-0671

Last rows

Row | 업종명 (business type) | 업소명 (business name) | 소재지(도로명) (road-name address) | 소재지전화 (phone number)
210 | 건강기능식품일반판매업 | 퓨처앤나우 | 부산광역시 동구 중앙대로 263, 국제오피스텔 1606호 (초량동) | <NA>
211 | 건강기능식품일반판매업 | 한국암웨이IBO이정연 | 부산광역시 동구 중앙대로248번길 7 (초량동,502호) | <NA>
212 | 건강기능식품일반판매업 | 한국테크 | 부산광역시 동구 망양로821번길 12-3 (수정동) | <NA>
213 | 건강기능식품일반판매업 | 한희프로덕션 | 부산광역시 동구 망양로668번길 28, 3동 353호 (수정동, 덕림아파트) | <NA>
214 | 건강기능식품일반판매업 | 해피랑힐링센터 | 부산광역시 동구 범일로 120, 지하 1층 (범일동) | <NA>
215 | 건강기능식품일반판매업 | 헬로그리니 | 부산광역시 동구 망양로610번길 61, 201호 (초량동, 양지빌라) | <NA>
216 | 건강기능식품일반판매업 | 헬스플래너 | 부산광역시 동구 중앙대로 487, 202호 (범일동) | <NA>
217 | 건강기능식품일반판매업 | 힐링 다이어트 찜질 카페 | 부산광역시 동구 범일로 78, 2층 (범일동) | <NA>
218 | 건강기능식품유통전문판매업 | (주)엑셀 | 부산광역시 동구 조방로 39, 썬오피스텔 8층 803호 (범일동) | 051-714-0671
219 | 건강기능식품유통전문판매업 | 이디지씨글로벌주식회사 | 부산광역시 동구 중앙대로214번길 7-8, 304호 (초량동) | <NA>