Overview

Dataset statistics

Number of variables: 4
Number of observations: 558
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.1 KiB
Average record size in memory: 33.2 B
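The average record size is simply the total in-memory size divided by the number of observations. The short sketch below redoes that arithmetic from the two figures above; the helper name `average_record_size` is illustrative, and the input is the rounded 18.1 KiB shown in the report, so the result is approximate.

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row for a table of n_rows rows
    that occupies total_kib KiB in memory (1 KiB = 1024 bytes)."""
    return total_kib * 1024 / n_rows

# Figures taken from the dataset statistics: 18.1 KiB over 558 observations.
size = average_record_size(18.1, 558)
print(f"{size:.1f} B")
```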

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

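Description: 부산광역시연제구_건강기능식품판매업현황_20230920 (health functional food sales businesses in Yeonje-gu, Busan Metropolitan City, as of 2023-09-20)
Author: 부산광역시 연제구 (Yeonje-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3082197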

Alerts

연번 is highly overall correlated with 업종명 (High correlation)
업종명 is highly overall correlated with 연번 (High correlation)
업종명 is highly imbalanced (76.0%) (Imbalance)
연번 has unique values (Unique)
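The 76.0% imbalance figure for 업종명 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies, divided by the maximum entropy possible for that number of classes. The sketch below assumes that definition (the report does not state the formula) and applies it to the class counts 536 and 22 listed for 업종명; the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumed definition, chosen because it reproduces the figure in the alert.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure; treated as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts reported for 업종명.
score = imbalance_score([536, 22])
print(f"{score:.1%}")
```

A 96.1% / 3.9% split therefore scores about 0.76, while an even 50/50 split would score 0.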

Reproduction

Analysis started: 2023-12-10 16:01:18.201818
Analysis finished: 2023-12-10 16:01:18.974272
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
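A report of this kind is normally produced by running ydata-profiling over a pandas DataFrame. The sketch below is a minimal illustration only: the function name, the file paths, the report title, and the `cp949` default encoding are assumptions, since the report does not record how the source file was loaded.

```python
def build_report(csv_path: str, output_html: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report.

    csv_path, output_html, the title, and the cp949 default are illustrative
    assumptions; none of them is stated in the report itself.
    """
    # Imported inside the function so it can be defined without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Health functional food sellers, Yeonje-gu")
    report.to_file(output_html)

# Hypothetical usage:
# build_report("yeonje_health_food_sellers.csv", "report.html")
```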

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 558
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 279.5
Minimum: 1
Maximum: 558
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 28.85
Q1: 140.25
Median: 279.5
Q3: 418.75
95-th percentile: 530.15
Maximum: 558
Range: 557
Interquartile range (IQR): 278.5

Descriptive statistics

Standard deviation: 161.225
Coefficient of variation (CV): 0.57683362
Kurtosis: -1.2
Mean: 279.5
Median Absolute Deviation (MAD): 139.5
Skewness: 0
Sum: 155961
Variance: 25993.5
Monotonicity: Strictly increasing
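Every statistic listed for 연번 matches what the plain sequence 1, 2, ..., 558 would produce, which supports reading the column as a row counter rather than a measurement. The sketch below recomputes several of the figures with the standard library. It assumes the column holds exactly the integers 1 through 558 (consistent with the minimum, maximum, distinct count, and strictly increasing order above), and it assumes the report uses the sample variance and linearly interpolated quartiles.

```python
import statistics

def summarize(values):
    """Recompute a few of the report's statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Median absolute deviation around the median.
    mad = statistics.median(abs(v - median) for v in values)
    return {
        "sum": sum(values),
        "mean": mean,
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "stdev": stdev,
        "cv": stdev / mean,
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
        "mad": mad,
        "strictly_increasing": all(a < b for a, b in zip(values, values[1:])),
    }

serial = list(range(1, 559))  # assumed contents of 연번
stats = summarize(serial)
for name, value in stats.items():
    print(name, value)
```

For a run of consecutive integers 1..n the sample variance is n(n + 1)/12, which for n = 558 gives 25993.5, the value in the report.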
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.2%
376 | 1 | 0.2%
370 | 1 | 0.2%
371 | 1 | 0.2%
372 | 1 | 0.2%
373 | 1 | 0.2%
374 | 1 | 0.2%
375 | 1 | 0.2%
377 | 1 | 0.2%
385 | 1 | 0.2%
Other values (548) | 548 | 98.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
558 | 1 | 0.2%
557 | 1 | 0.2%
556 | 1 | 0.2%
555 | 1 | 0.2%
554 | 1 | 0.2%
553 | 1 | 0.2%
552 | 1 | 0.2%
551 | 1 | 0.2%
550 | 1 | 0.2%
549 | 1 | 0.2%

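업종명 (business type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

건강기능식품일반판매업 (general sales of health functional foods): 536
건강기능식품유통전문판매업 (specialized distribution and sales of health functional foods): 22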

Length

Max length: 13
Median length: 11
Mean length: 11.078853
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건강기능식품일반판매업
2nd row: 건강기능식품일반판매업
3rd row: 건강기능식품일반판매업
4th row: 건강기능식품일반판매업
5th row: 건강기능식품일반판매업

Common Values

Value | Count | Frequency (%)
건강기능식품일반판매업 | 536 | 96.1%
건강기능식품유통전문판매업 | 22 | 3.9%

Length

Histogram of lengths of the category

업소명 (business name)
Text

Distinct: 545
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 29
Median length: 18
Mean length: 6.844086
Min length: 2

Characters and Unicode

Total characters: 3819
Distinct characters: 464
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
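
As a concrete illustration of those character properties, the sketch below tallies the Unicode general category of each character in the first sample value, `(주)이마트연제점`, using Python's standard `unicodedata` module. The mapping from two-letter category codes to long names covers only the categories that appear in this report, and the helper name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(text):
    """Count the characters of text by Unicode general category (long name)."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

counts = category_counts("(주)이마트연제점")
print(counts)
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case, which is why that category dominates the tables for this variable.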

Unique

Unique: 533
Unique (%): 95.5%
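
The report separates distinct values (how many different values occur) from unique values (how many values occur exactly once), which is why 업소명 can have 545 distinct but only 533 unique entries. The sketch below shows the difference on a small made-up list; the sample names are hypothetical and not taken from the dataset, and the helper name `distinct_and_unique` is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values,
    and the number of values that appear exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical shop names: "A" appears twice, so it is distinct but not unique.
names = ["A", "B", "A", "C", "D"]
print(distinct_and_unique(names))
```

Applied to the figures in this section, 545 - 533 = 12 names occur more than once, and together they account for 558 - 533 = 25 rows.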

Sample

1st row: (주)이마트연제점
2nd row: 유니시티코리아(유)부산지점
3rd row: 알로에마임토곡지사
4th row: 홈플러스(주)아시아드점
5th row: 마임부산연제남부지사

Words

Value | Count | Frequency (%)
주식회사 | 18 | 2.5%
연제점 | 6 | 0.8%
연산점 | 6 | 0.8%
인셀덤 | 4 | 0.6%
씨제이올리브영(주 | 4 | 0.6%
스튜디오 | 3 | 0.4%
세븐일레븐 | 3 | 0.4%
대리점 | 3 | 0.4%
라라샵 | 3 | 0.4%
연산토곡점 | 3 | 0.4%
Other values (635) | 662 | 92.6%

Most occurring characters

ValueCountFrequency (%)
159
 
4.2%
157
 
4.1%
124
 
3.2%
84
 
2.2%
( 83
 
2.2%
) 83
 
2.2%
76
 
2.0%
73
 
1.9%
67
 
1.8%
65
 
1.7%
Other values (454) 2848
74.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3276 | 85.8%
Space Separator | 157 | 4.1%
Lowercase Letter | 120 | 3.1%
Open Punctuation | 83 | 2.2%
Close Punctuation | 83 | 2.2%
Uppercase Letter | 73 | 1.9%
Decimal Number | 25 | 0.7%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
159
 
4.9%
124
 
3.8%
84
 
2.6%
76
 
2.3%
73
 
2.2%
67
 
2.0%
65
 
2.0%
61
 
1.9%
56
 
1.7%
48
 
1.5%
Other values (399) 2463
75.2%
Uppercase Letter
ValueCountFrequency (%)
J 7
 
9.6%
M 7
 
9.6%
T 6
 
8.2%
S 6
 
8.2%
H 6
 
8.2%
F 4
 
5.5%
G 4
 
5.5%
N 4
 
5.5%
P 4
 
5.5%
I 4
 
5.5%
Other values (12) 21
28.8%
Lowercase Letter
ValueCountFrequency (%)
e 16
13.3%
l 13
10.8%
a 10
 
8.3%
i 10
 
8.3%
s 8
 
6.7%
n 7
 
5.8%
o 7
 
5.8%
g 6
 
5.0%
p 6
 
5.0%
y 5
 
4.2%
Other values (10) 32
26.7%
Decimal Number
ValueCountFrequency (%)
2 8
32.0%
3 5
20.0%
9 4
16.0%
4 3
 
12.0%
5 2
 
8.0%
6 1
 
4.0%
7 1
 
4.0%
1 1
 
4.0%
Other Punctuation
ValueCountFrequency (%)
& 1
50.0%
# 1
50.0%
Space Separator
ValueCountFrequency (%)
157
100.0%
Open Punctuation
ValueCountFrequency (%)
( 83
100.0%
Close Punctuation
ValueCountFrequency (%)
) 83
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3276 | 85.8%
Common | 350 | 9.2%
Latin | 193 | 5.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
159
 
4.9%
124
 
3.8%
84
 
2.6%
76
 
2.3%
73
 
2.2%
67
 
2.0%
65
 
2.0%
61
 
1.9%
56
 
1.7%
48
 
1.5%
Other values (399) 2463
75.2%
Latin
ValueCountFrequency (%)
e 16
 
8.3%
l 13
 
6.7%
a 10
 
5.2%
i 10
 
5.2%
s 8
 
4.1%
n 7
 
3.6%
J 7
 
3.6%
o 7
 
3.6%
M 7
 
3.6%
T 6
 
3.1%
Other values (32) 102
52.8%
Common
ValueCountFrequency (%)
157
44.9%
( 83
23.7%
) 83
23.7%
2 8
 
2.3%
3 5
 
1.4%
9 4
 
1.1%
4 3
 
0.9%
5 2
 
0.6%
6 1
 
0.3%
7 1
 
0.3%
Other values (3) 3
 
0.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3276 | 85.8%
ASCII | 543 | 14.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
159
 
4.9%
124
 
3.8%
84
 
2.6%
76
 
2.3%
73
 
2.2%
67
 
2.0%
65
 
2.0%
61
 
1.9%
56
 
1.7%
48
 
1.5%
Other values (399) 2463
75.2%
ASCII
ValueCountFrequency (%)
157
28.9%
( 83
15.3%
) 83
15.3%
e 16
 
2.9%
l 13
 
2.4%
a 10
 
1.8%
i 10
 
1.8%
s 8
 
1.5%
2 8
 
1.5%
n 7
 
1.3%
Other values (45) 148
27.3%

소재지(도로명) (location, road-name address)
Text

Distinct: 529
Distinct (%): 94.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 70
Median length: 51
Mean length: 36.603943
Min length: 21

Characters and Unicode

Total characters: 20425
Distinct characters: 280
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 505
Unique (%): 90.5%

Sample

1st row: 부산광역시 연제구 연수로 89 (연산동)
2nd row: 부산광역시 연제구 중앙대로 1125, C T 타워 A동 201,202,203호 (연산동)
3rd row: 부산광역시 연제구 과정로 221 (연산동)
4th row: 부산광역시 연제구 종합운동장로 7 (거제동)
5th row: 부산광역시 연제구 반송로 33 (연산동,샤르망라이프 503호)

Words

Value | Count | Frequency (%)
부산광역시 | 558 | 14.2%
연제구 | 558 | 14.2%
연산동 | 418 | 10.6%
거제동 | 141 | 3.6%
1층 | 86 | 2.2%
월드컵대로 | 47 | 1.2%
중앙대로 | 47 | 1.2%
2층 | 46 | 1.2%
과정로 | 27 | 0.7%
101동 | 27 | 0.7%
Other values (792) | 1980 | 50.3%

Most occurring characters

ValueCountFrequency (%)
3377
 
16.5%
1110
 
5.4%
1048
 
5.1%
1 1007
 
4.9%
808
 
4.0%
785
 
3.8%
, 728
 
3.6%
623
 
3.1%
0 595
 
2.9%
593
 
2.9%
Other values (270) 9751
47.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11449 | 56.1%
Decimal Number | 3573 | 17.5%
Space Separator | 3377 | 16.5%
Other Punctuation | 733 | 3.6%
Open Punctuation | 570 | 2.8%
Close Punctuation | 570 | 2.8%
Uppercase Letter | 99 | 0.5%
Dash Punctuation | 50 | 0.2%
Lowercase Letter | 3 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1110
 
9.7%
1048
 
9.2%
808
 
7.1%
785
 
6.9%
623
 
5.4%
593
 
5.2%
569
 
5.0%
558
 
4.9%
558
 
4.9%
558
 
4.9%
Other values (234) 4239
37.0%
Uppercase Letter
ValueCountFrequency (%)
K 13
13.1%
S 13
13.1%
I 12
12.1%
E 10
10.1%
V 9
9.1%
W 9
9.1%
B 6
 
6.1%
A 5
 
5.1%
C 4
 
4.0%
T 3
 
3.0%
Other values (7) 15
15.2%
Decimal Number
ValueCountFrequency (%)
1 1007
28.2%
0 595
16.7%
2 522
14.6%
3 361
 
10.1%
5 240
 
6.7%
4 236
 
6.6%
6 198
 
5.5%
7 144
 
4.0%
8 140
 
3.9%
9 130
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 728
99.3%
& 4
 
0.5%
/ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
3377
100.0%
Open Punctuation
ValueCountFrequency (%)
( 570
100.0%
Close Punctuation
ValueCountFrequency (%)
) 570
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 50
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11449 | 56.1%
Common | 8874 | 43.4%
Latin | 102 | 0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1110
 
9.7%
1048
 
9.2%
808
 
7.1%
785
 
6.9%
623
 
5.4%
593
 
5.2%
569
 
5.0%
558
 
4.9%
558
 
4.9%
558
 
4.9%
Other values (234) 4239
37.0%
Common
ValueCountFrequency (%)
3377
38.1%
1 1007
 
11.3%
, 728
 
8.2%
0 595
 
6.7%
( 570
 
6.4%
) 570
 
6.4%
2 522
 
5.9%
3 361
 
4.1%
5 240
 
2.7%
4 236
 
2.7%
Other values (8) 668
 
7.5%
Latin
ValueCountFrequency (%)
K 13
12.7%
S 13
12.7%
I 12
11.8%
E 10
9.8%
V 9
8.8%
W 9
8.8%
B 6
 
5.9%
A 5
 
4.9%
C 4
 
3.9%
T 3
 
2.9%
Other values (8) 18
17.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11449 | 56.1%
ASCII | 8976 | 43.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3377
37.6%
1 1007
 
11.2%
, 728
 
8.1%
0 595
 
6.6%
( 570
 
6.4%
) 570
 
6.4%
2 522
 
5.8%
3 361
 
4.0%
5 240
 
2.7%
4 236
 
2.6%
Other values (26) 770
 
8.6%
Hangul
ValueCountFrequency (%)
1110
 
9.7%
1048
 
9.2%
808
 
7.1%
785
 
6.9%
623
 
5.4%
593
 
5.2%
569
 
5.0%
558
 
4.9%
558
 
4.9%
558
 
4.9%
Other values (234) 4239
37.0%

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 업종명
연번 | 1.000 | 0.760
업종명 | 0.760 | 1.000

Variable | 연번 | 업종명
연번 | 1.000 | 0.594
업종명 | 0.594 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 연번 | 업종명 | 업소명 | 소재지(도로명)
0 | 1 | 건강기능식품일반판매업 | (주)이마트연제점 | 부산광역시 연제구 연수로 89 (연산동)
1 | 2 | 건강기능식품일반판매업 | 유니시티코리아(유)부산지점 | 부산광역시 연제구 중앙대로 1125, C T 타워 A동 201,202,203호 (연산동)
2 | 3 | 건강기능식품일반판매업 | 알로에마임토곡지사 | 부산광역시 연제구 과정로 221 (연산동)
3 | 4 | 건강기능식품일반판매업 | 홈플러스(주)아시아드점 | 부산광역시 연제구 종합운동장로 7 (거제동)
4 | 5 | 건강기능식품일반판매업 | 마임부산연제남부지사 | 부산광역시 연제구 반송로 33 (연산동,샤르망라이프 503호)
5 | 6 | 건강기능식품일반판매업 | 건강생활 교대역 지점 | 부산광역시 연제구 중앙대로 1197, 4층 403호 (거제동, 중보빙딩)
6 | 7 | 건강기능식품일반판매업 | (주)서원유통 탑마트 연제점 | 부산광역시 연제구 중앙대로 1174, 탑마트 연제점 1층 (거제동)
7 | 8 | 건강기능식품일반판매업 | 유니베라 사직대리점 | 부산광역시 연제구 아시아드대로 99 (거제동)
8 | 9 | 건강기능식품일반판매업 | 유경희소아과 | 부산광역시 연제구 고분로 25-1 (연산동)
9 | 10 | 건강기능식품일반판매업 | 이신영산부인과 | 부산광역시 연제구 반송로 5 (연산동)

Last rows

Index | 연번 | 업종명 | 업소명 | 소재지(도로명)
548 | 549 | 건강기능식품유통전문판매업 | 주식회사 유바생명과학 | 부산광역시 연제구 월드컵대로 197, 8층 (거제동)
549 | 550 | 건강기능식품유통전문판매업 | (주)민즈 | 부산광역시 연제구 미남로 1, 하나빌딩 501호 (거제동)
550 | 551 | 건강기능식품유통전문판매업 | 광림사 | 부산광역시 연제구 월드컵대로187번길 61 (거제동)
551 | 552 | 건강기능식품유통전문판매업 | 디오아이그룹 | 부산광역시 연제구 월드컵대로 282, 705호 (거제동, 청목파르테논)
552 | 553 | 건강기능식품유통전문판매업 | 하나파트너스 | 부산광역시 연제구 법원남로15번길 22, 파라존빌딩 7층 (거제동)
553 | 554 | 건강기능식품유통전문판매업 | 퍼스트인글로벌 주식회사 | 부산광역시 연제구 중앙대로 1091, 제세빌딩 3층 (연산동)
554 | 555 | 건강기능식품유통전문판매업 | 나해솔하우스 | 부산광역시 연제구 월드컵대로 160, 4층 432호 (연산동)
555 | 556 | 건강기능식품유통전문판매업 | 주식회사 더블엔 | 부산광역시 연제구 안연로 33, 106동 602호 (연산동, 더샵 파크시티)
556 | 557 | 건강기능식품유통전문판매업 | 이공유통 | 부산광역시 연제구 월드컵대로10번길 75 (연산동)
557 | 558 | 건강기능식품유통전문판매업 | 제이엠건강 | 부산광역시 연제구 신금로 25, 노블레스스퀘어 211호 (연산동)