Overview

Dataset statistics

Number of variables: 4
Number of observations: 487
Missing cells: 251
Missing cells (%): 12.9%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 15.3 KiB
Average record size in memory: 32.3 B
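The two percentages above follow directly from the counts in the same table. The short check below is only a sketch of that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures taken from the Dataset statistics table above.
n_variables = 4
n_observations = 487
missing_cells = 251
duplicate_rows = 1

total_cells = n_variables * n_observations           # every cell in the table
missing_pct = 100 * missing_cells / total_cells      # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Note that the missing-cell share is taken over all cells (rows times columns), while the per-variable figure reported later for 소재지전화 is taken over rows only.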

Variable types

Categorical: 1
Text: 3

Dataset

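Description: Status of health functional food sales businesses in Yeonje-gu, Busan Metropolitan City (부산광역시연제구건강기능식품판매업현황, 20190613)
Author: Yeonje-gu, Busan Metropolitan City (부산광역시 연제구)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3082197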

Alerts

Dataset has 1 (0.2%) duplicate rows [Duplicates]
업종명 is highly imbalanced (84.4%) [Imbalance]
소재지전화 has 251 (51.5%) missing values [Missing]
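The report does not say how the 84.4% imbalance figure is derived. One reading that reproduces it from the class counts of 업종명 (476 and 11) is one minus the Shannon entropy normalised by the logarithm of the number of classes. The sketch below assumes that formula; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts of 업종명 as reported under Common Values.
score = imbalance_score([476, 11])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced variable scores 0 and a variable dominated by one class approaches 1.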

Reproduction

Analysis started: 2023-12-10 16:01:28.318398
Analysis finished: 2023-12-10 16:01:29.077635
Duration: 0.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
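A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact procedure used here: the file paths, the cp949 encoding, and the choice to read every column as text are assumptions, and the original config.json is not reproduced.

```python
from pathlib import Path

def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report.

    The file paths and the cp949 encoding are assumptions for illustration;
    the original export may use a different name or encoding.
    """
    # Imported lazily so the module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Read everything as text so phone numbers keep their hyphens and zeros.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)
    profile = ProfileReport(df, title=Path(csv_path).stem)
    profile.to_file(html_path)
    return profile
```

Calling `build_report("some_file.csv")` with a real path writes `report.html` next to the working directory; the file name here is a placeholder.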

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB
건강기능식품일반판매업: 476
건강기능식품유통전문판매업: 11

Length

Max length: 13
Median length: 11
Mean length: 11.045175
Min length: 11
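These length statistics can be reproduced from the two category labels and their counts listed under Common Values. The sketch below assumes that length means the number of Unicode code points, which is what Python's `len()` returns for these precomposed Hangul strings.

```python
from statistics import mean, median

# The two category labels and their counts, as listed under Common Values.
counts = {
    "건강기능식품일반판매업": 476,
    "건강기능식품유통전문판매업": 11,
}

# One length per observation (len() counts Unicode code points).
lengths = [len(label) for label, n in counts.items() for _ in range(n)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print(f"Mean length: {mean(lengths):.6f}")
print("Min length:", min(lengths))
```

The mean works out to (476 × 11 + 11 × 13) / 487, which matches the reported 11.045175.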

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건강기능식품일반판매업
2nd row: 건강기능식품일반판매업
3rd row: 건강기능식품일반판매업
4th row: 건강기능식품일반판매업
5th row: 건강기능식품일반판매업

Common Values

Value | Count | Frequency (%)
건강기능식품일반판매업 | 476 | 97.7%
건강기능식품유통전문판매업 | 11 | 2.3%

Length

Histogram of lengths of the category

Common Values (Plot)


업소명
Text

Distinct: 477
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 20
Median length: 16
Mean length: 7.6570842
Min length: 2

Characters and Unicode

Total characters: 3729
Distinct characters: 416
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
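As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample row of 업소명; it covers categories only, since script and block lookups are not part of the standard library. The helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the Unicode general category (e.g. 'Lo', 'Ps') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample row of the 업소명 column.
sample = "(주)이마트연제점"
for category, n in category_counts(sample).most_common():
    print(category, n)
```

The two-letter codes map onto the names used in this report: `Lo` is Other Letter, `Ps` Open Punctuation, `Pe` Close Punctuation, `Zs` Space Separator, `Nd` Decimal Number, and `Pd` Dash Punctuation.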

Unique

Unique: 467
Unique (%): 95.9%

Sample

1st row: (주)이마트연제점
2nd row: 아모레시청특약점
3rd row: 미니스톱법조타운점
4th row: 유니시티코리아(유)부산지점
5th row: 미니스톱(연산점)
Value | Count | Frequency (%)
연제점 | 6 | 1.0%
주식회사 | 6 | 1.0%
한국암웨이 | 6 | 1.0%
연산점 | 5 | 0.8%
gs25 | 5 | 0.8%
허브다이어트 | 5 | 0.8%
라파플러스 | 4 | 0.7%
세븐일레븐 | 3 | 0.5%
시너지 | 3 | 0.5%
ibo | 2 | 0.3%
Other values (536) | 555 | 92.5%

Most occurring characters

ValueCountFrequency (%)
128
 
3.4%
) 120
 
3.2%
( 119
 
3.2%
114
 
3.1%
113
 
3.0%
96
 
2.6%
95
 
2.5%
94
 
2.5%
74
 
2.0%
70
 
1.9%
Other values (406) 2706
72.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3238 | 86.8%
Close Punctuation | 120 | 3.2%
Open Punctuation | 119 | 3.2%
Space Separator | 113 | 3.0%
Uppercase Letter | 83 | 2.2%
Decimal Number | 35 | 0.9%
Lowercase Letter | 12 | 0.3%
Other Punctuation | 8 | 0.2%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
128
 
4.0%
114
 
3.5%
96
 
3.0%
95
 
2.9%
94
 
2.9%
74
 
2.3%
70
 
2.2%
67
 
2.1%
57
 
1.8%
52
 
1.6%
Other values (366) 2391
73.8%
Uppercase Letter
ValueCountFrequency (%)
G 17
20.5%
S 15
18.1%
B 7
8.4%
I 7
8.4%
O 7
8.4%
C 4
 
4.8%
T 4
 
4.8%
P 3
 
3.6%
H 3
 
3.6%
K 3
 
3.6%
Other values (9) 13
15.7%
Lowercase Letter
ValueCountFrequency (%)
a 4
33.3%
k 2
16.7%
c 1
 
8.3%
o 1
 
8.3%
m 1
 
8.3%
b 1
 
8.3%
y 1
 
8.3%
z 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
5 13
37.1%
2 12
34.3%
4 5
 
14.3%
3 3
 
8.6%
6 1
 
2.9%
9 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
. 6
75.0%
& 1
 
12.5%
, 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 120
100.0%
Open Punctuation
ValueCountFrequency (%)
( 119
100.0%
Space Separator
ValueCountFrequency (%)
113
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3238 | 86.8%
Common | 396 | 10.6%
Latin | 95 | 2.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
128
 
4.0%
114
 
3.5%
96
 
3.0%
95
 
2.9%
94
 
2.9%
74
 
2.3%
70
 
2.2%
67
 
2.1%
57
 
1.8%
52
 
1.6%
Other values (366) 2391
73.8%
Latin
ValueCountFrequency (%)
G 17
17.9%
S 15
15.8%
B 7
 
7.4%
I 7
 
7.4%
O 7
 
7.4%
a 4
 
4.2%
C 4
 
4.2%
T 4
 
4.2%
P 3
 
3.2%
H 3
 
3.2%
Other values (17) 24
25.3%
Common
ValueCountFrequency (%)
) 120
30.3%
( 119
30.1%
113
28.5%
5 13
 
3.3%
2 12
 
3.0%
. 6
 
1.5%
4 5
 
1.3%
3 3
 
0.8%
6 1
 
0.3%
& 1
 
0.3%
Other values (3) 3
 
0.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3238 | 86.8%
ASCII | 491 | 13.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
128
 
4.0%
114
 
3.5%
96
 
3.0%
95
 
2.9%
94
 
2.9%
74
 
2.3%
70
 
2.2%
67
 
2.1%
57
 
1.8%
52
 
1.6%
Other values (366) 2391
73.8%
ASCII
ValueCountFrequency (%)
) 120
24.4%
( 119
24.2%
113
23.0%
G 17
 
3.5%
S 15
 
3.1%
5 13
 
2.6%
2 12
 
2.4%
B 7
 
1.4%
I 7
 
1.4%
O 7
 
1.4%
Other values (30) 61
12.4%

소재지(도로명)
Text

Distinct: 461
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 57
Median length: 47
Mean length: 31.186858
Min length: 20

Characters and Unicode

Total characters: 15188
Distinct characters: 236
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 440
Unique (%): 90.3%

Sample

1st row: 부산광역시 연제구 연수로 89 (연산동)
2nd row: 부산광역시 연제구 신촌로 3 (연산동)
3rd row: 부산광역시 연제구 거제동 1474번지 1호
4th row: 부산광역시 연제구 중앙대로 1091, 13층 1301,1302호 (연산동)
5th row: 부산광역시 연제구 반송로 13-10 (연산동)
Value | Count | Frequency (%)
부산광역시 | 487 | 16.7%
연제구 | 485 | 16.6%
연산동 | 308 | 10.6%
거제동 | 93 | 3.2%
1층 | 73 | 2.5%
중앙대로 | 64 | 2.2%
과정로 | 33 | 1.1%
2층 | 31 | 1.1%
월드컵대로 | 30 | 1.0%
반송로 | 26 | 0.9%
Other values (653) | 1285 | 44.1%

Most occurring characters

ValueCountFrequency (%)
2450
 
16.1%
906
 
6.0%
872
 
5.7%
1 672
 
4.4%
662
 
4.4%
560
 
3.7%
532
 
3.5%
492
 
3.2%
) 489
 
3.2%
( 489
 
3.2%
Other values (226) 7064
46.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8816 | 58.0%
Decimal Number | 2464 | 16.2%
Space Separator | 2450 | 16.1%
Close Punctuation | 489 | 3.2%
Open Punctuation | 489 | 3.2%
Other Punctuation | 399 | 2.6%
Dash Punctuation | 41 | 0.3%
Uppercase Letter | 31 | 0.2%
Lowercase Letter | 8 | 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
906
 
10.3%
872
 
9.9%
662
 
7.5%
560
 
6.4%
532
 
6.0%
492
 
5.6%
489
 
5.5%
487
 
5.5%
487
 
5.5%
474
 
5.4%
Other values (187) 2855
32.4%
Uppercase Letter
ValueCountFrequency (%)
S 5
16.1%
B 5
16.1%
K 4
12.9%
E 3
9.7%
A 2
 
6.5%
J 2
 
6.5%
H 2
 
6.5%
T 1
 
3.2%
V 1
 
3.2%
I 1
 
3.2%
Other values (5) 5
16.1%
Decimal Number
ValueCountFrequency (%)
1 672
27.3%
2 323
13.1%
0 315
12.8%
3 281
11.4%
4 188
 
7.6%
5 176
 
7.1%
6 133
 
5.4%
8 131
 
5.3%
7 126
 
5.1%
9 119
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
k 2
25.0%
s 2
25.0%
w 1
12.5%
e 1
12.5%
i 1
12.5%
v 1
12.5%
Other Punctuation
ValueCountFrequency (%)
, 393
98.5%
@ 5
 
1.3%
/ 1
 
0.3%
Space Separator
ValueCountFrequency (%)
2450
100.0%
Close Punctuation
ValueCountFrequency (%)
) 489
100.0%
Open Punctuation
ValueCountFrequency (%)
( 489
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8816 | 58.0%
Common | 6333 | 41.7%
Latin | 39 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
906
 
10.3%
872
 
9.9%
662
 
7.5%
560
 
6.4%
532
 
6.0%
492
 
5.6%
489
 
5.5%
487
 
5.5%
487
 
5.5%
474
 
5.4%
Other values (187) 2855
32.4%
Latin
ValueCountFrequency (%)
S 5
12.8%
B 5
12.8%
K 4
 
10.3%
E 3
 
7.7%
k 2
 
5.1%
A 2
 
5.1%
s 2
 
5.1%
J 2
 
5.1%
H 2
 
5.1%
T 1
 
2.6%
Other values (11) 11
28.2%
Common
ValueCountFrequency (%)
2450
38.7%
1 672
 
10.6%
) 489
 
7.7%
( 489
 
7.7%
, 393
 
6.2%
2 323
 
5.1%
0 315
 
5.0%
3 281
 
4.4%
4 188
 
3.0%
5 176
 
2.8%
Other values (8) 557
 
8.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8816 | 58.0%
ASCII | 6372 | 42.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2450
38.4%
1 672
 
10.5%
) 489
 
7.7%
( 489
 
7.7%
, 393
 
6.2%
2 323
 
5.1%
0 315
 
4.9%
3 281
 
4.4%
4 188
 
3.0%
5 176
 
2.8%
Other values (29) 596
 
9.4%
Hangul
ValueCountFrequency (%)
906
 
10.3%
872
 
9.9%
662
 
7.5%
560
 
6.4%
532
 
6.0%
492
 
5.6%
489
 
5.5%
487
 
5.5%
487
 
5.5%
474
 
5.4%
Other values (187) 2855
32.4%

소재지전화
Text

MISSING 

Distinct: 229
Distinct (%): 97.0%
Missing: 251
Missing (%): 51.5%
Memory size: 3.9 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 2832
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
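Every non-missing 소재지전화 value is 12 characters long and uses only digits and hyphens, which fits the `051-NNN-NNNN` shape seen in the sample rows. The sketch below checks that shape against the five sample values and reproduces the character totals from the 236 non-missing rows (487 minus 251). The regular expression is an assumption inferred from the samples, not something the report states.

```python
import re

# Assumed pattern, inferred from the sample rows: area code 051, then 3 and 4 digits.
PHONE_RE = re.compile(r"051-\d{3}-\d{4}")

samples = ["051-860-1052", "051-752-8392", "051-502-6942",
           "051-865-6669", "051-862-8800"]
all_match = all(PHONE_RE.fullmatch(s) for s in samples)

# Character totals implied by 236 non-missing values of fixed length 12.
non_missing = 487 - 251
total_chars = non_missing * 12      # compare with "Total characters"
hyphens = non_missing * 2           # compare with the "-" count
digits = total_chars - hyphens      # compare with "Decimal Number"

print(all_match, total_chars, hyphens, digits)
```

The totals agree with the reported 2832 characters, 472 hyphens, and 2360 digits, so a fixed two-hyphen layout is at least consistent with the whole column.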

Unique

Unique: 223
Unique (%): 94.5%

Sample

1st row: 051-860-1052
2nd row: 051-752-8392
3rd row: 051-502-6942
4th row: 051-865-6669
5th row: 051-862-8800
Value | Count | Frequency (%)
051-758-5090 | 3 | 1.3%
051-865-6669 | 2 | 0.8%
051-863-3071 | 2 | 0.8%
051-853-3664 | 2 | 0.8%
051-505-0395 | 2 | 0.8%
051-819-4949 | 2 | 0.8%
051-751-0456 | 1 | 0.4%
051-862-4517 | 1 | 0.4%
051-864-0160 | 1 | 0.4%
051-866-9024 | 1 | 0.4%
Other values (219) | 219 | 92.8%

Most occurring characters

ValueCountFrequency (%)
5 511
18.0%
- 472
16.7%
0 414
14.6%
1 362
12.8%
8 250
8.8%
6 190
 
6.7%
7 157
 
5.5%
3 138
 
4.9%
2 135
 
4.8%
9 104
 
3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2360 | 83.3%
Dash Punctuation | 472 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 511
21.7%
0 414
17.5%
1 362
15.3%
8 250
10.6%
6 190
 
8.1%
7 157
 
6.7%
3 138
 
5.8%
2 135
 
5.7%
9 104
 
4.4%
4 99
 
4.2%
Dash Punctuation
ValueCountFrequency (%)
- 472
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2832
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 511
18.0%
- 472
16.7%
0 414
14.6%
1 362
12.8%
8 250
8.8%
6 190
 
6.7%
7 157
 
5.5%
3 138
 
4.9%
2 135
 
4.8%
9 104
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2832
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 511
18.0%
- 472
16.7%
0 414
14.6%
1 362
12.8%
8 250
8.8%
6 190
 
6.7%
7 157
 
5.5%
3 138
 
4.9%
2 135
 
4.8%
9 104
 
3.7%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
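The per-column nullity that these plots summarise can be computed with a simple count. The sketch below uses the first ten rows of the Sample table, reduced to two columns for brevity, with `None` standing in for `<NA>`. The helper name is hypothetical, and the ten-row result is only illustrative of the method, not of the full 51.5% figure.

```python
def missing_by_column(rows):
    """Return {column: (missing_count, missing_fraction)} for a list of dict rows."""
    n = len(rows)
    result = {}
    for col in rows[0].keys():
        missing = sum(1 for row in rows if row[col] is None)
        result[col] = (missing, missing / n)
    return result

# First ten rows of the Sample table; None stands in for <NA>.
rows = [
    {"업소명": "(주)이마트연제점", "소재지전화": "051-860-1052"},
    {"업소명": "아모레시청특약점", "소재지전화": "051-752-8392"},
    {"업소명": "미니스톱법조타운점", "소재지전화": "051-502-6942"},
    {"업소명": "유니시티코리아(유)부산지점", "소재지전화": "051-865-6669"},
    {"업소명": "미니스톱(연산점)", "소재지전화": "051-862-8800"},
    {"업소명": "알로에마임토곡지사", "소재지전화": None},
    {"업소명": "홈플러스(주)아시아드점", "소재지전화": "051-500-8000"},
    {"업소명": "(주)유니베라 연제영업국", "소재지전화": None},
    {"업소명": "건강생활 교대역 지점", "소재지전화": "051-751-6244"},
    {"업소명": "(주)서원유통 탑마트 연제점", "소재지전화": "051-504-9551"},
]

for col, (count, frac) in missing_by_column(rows).items():
    print(f"{col}: {count} missing ({frac:.1%})")
```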

Sample

index | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
0 | 건강기능식품일반판매업 | (주)이마트연제점 | 부산광역시 연제구 연수로 89 (연산동) | 051-860-1052
1 | 건강기능식품일반판매업 | 아모레시청특약점 | 부산광역시 연제구 신촌로 3 (연산동) | 051-752-8392
2 | 건강기능식품일반판매업 | 미니스톱법조타운점 | 부산광역시 연제구 거제동 1474번지 1호 | 051-502-6942
3 | 건강기능식품일반판매업 | 유니시티코리아(유)부산지점 | 부산광역시 연제구 중앙대로 1091, 13층 1301,1302호 (연산동) | 051-865-6669
4 | 건강기능식품일반판매업 | 미니스톱(연산점) | 부산광역시 연제구 반송로 13-10 (연산동) | 051-862-8800
5 | 건강기능식품일반판매업 | 알로에마임토곡지사 | 부산광역시 연제구 과정로 221 (연산동) | <NA>
6 | 건강기능식품일반판매업 | 홈플러스(주)아시아드점 | 부산광역시 연제구 종합운동장로 7 (거제동) | 051-500-8000
7 | 건강기능식품일반판매업 | (주)유니베라 연제영업국 | 부산광역시 연제구 반송로 33 (연산동,샤르망라이프 503호) | <NA>
8 | 건강기능식품일반판매업 | 건강생활 교대역 지점 | 부산광역시 연제구 중앙대로 1197, 4층 403호 (거제동, 중보빙딩) | 051-751-6244
9 | 건강기능식품일반판매업 | (주)서원유통 탑마트 연제점 | 부산광역시 연제구 중앙대로 1174, 탑마트 연제점 1층 (거제동) | 051-504-9551

index | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
477 | 건강기능식품유통전문판매업 | 이공유통 | 부산광역시 연제구 월드컵대로10번길 75, 1~2층 (연산동) | <NA>
478 | 건강기능식품유통전문판매업 | 관회 양봉원 | 부산광역시 연제구 거제대로252번길 7, 1층 (거제동) | <NA>
479 | 건강기능식품유통전문판매업 | 우정약품(주) | 부산광역시 연제구 세병로 39 (연산동) | 051-863-2222
480 | 건강기능식품유통전문판매업 | 허브플렛폼 | 부산광역시 연제구 마곡천로30번길 18, 1층 (연산동) | <NA>
481 | 건강기능식품유통전문판매업 | 에이치케이바이오텍 | 부산광역시 연제구 반송로 13-7, 4층 (연산동, 석촌빌딩) | 051-647-8170
482 | 건강기능식품유통전문판매업 | 주식회사 제이에이치에프엔비 | 부산광역시 연제구 중앙대로 1076, 5층 (연산동) | <NA>
483 | 건강기능식품유통전문판매업 | 웰빙플러스 | 부산광역시 연제구 과정로 324, 4층 (연산동) | <NA>
484 | 건강기능식품유통전문판매업 | (주)성원제이에스 | 부산광역시 연제구 반송로 33, 6층 604호 (연산동) | <NA>
485 | 건강기능식품유통전문판매업 | 선진약품 | 부산광역시 연제구 세병로 39, 4층 (연산동) | <NA>
486 | 건강기능식품유통전문판매업 | 삼에스개발(주) | 부산광역시 연제구 월드컵대로 83, 케이티엔지 1층 109호 (연산동) | 051-851-8701

Duplicate rows

Most frequently occurring

index | 업종명 | 업소명 | 소재지(도로명) | 소재지전화 | # duplicates
0 | 건강기능식품일반판매업 | 경희메디팜 | 부산광역시 연제구 반송로 33, 604호 (연산동) | <NA> | 2
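Duplicate rows like the one above can be found by counting identical records. The sketch below is one way to do that with the standard library; the record list is a small illustrative subset (the duplicated row twice, plus one unrelated sample row), not the full dataset, and the function name is hypothetical.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows are tuples so they are hashable; None stands in for <NA>.
records = [
    ("건강기능식품일반판매업", "경희메디팜", "부산광역시 연제구 반송로 33, 604호 (연산동)", None),
    ("건강기능식품일반판매업", "(주)이마트연제점", "부산광역시 연제구 연수로 89 (연산동)", "051-860-1052"),
    ("건강기능식품일반판매업", "경희메디팜", "부산광역시 연제구 반송로 33, 604호 (연산동)", None),
]

dupes = duplicate_rows(records)
extra_rows = sum(n - 1 for n in dupes.values())  # rows beyond the first copy
print(len(dupes), extra_rows)
```

The two numbers correspond to the two ways the report counts the same fact: the record occurs 2 times ("# duplicates"), which leaves 1 redundant row ("Duplicate rows" in the overview).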