Overview

Dataset statistics

Number of variables: 5
Number of observations: 3510
Missing cells: 1150
Missing cells (%): 6.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 140.7 KiB
Average record size in memory: 41.0 B
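
The missing-cell percentage follows directly from the counts above: 5 variables times 3510 observations gives the total number of cells. A minimal Python sketch of that arithmetic; the variable names are illustrative and not part of the report. It also assumes, as the Alerts section indicates, that all 1150 missing cells belong to the 소재지전화 column.

```python
# Counts taken from the dataset statistics in this report.
n_variables = 5
n_observations = 3510
missing_cells = 1150

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of all cells

# Assumption: all missing cells sit in one column (소재지전화), so the
# per-column share is measured against the number of rows instead.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{missing_pct:.1f}%")
print(f"{column_missing_pct:.1f}%")
```

The two results round to the 6.6% reported for the dataset and the 32.8% reported for the column.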

Variable types

Numeric: 1
Categorical: 1
Text: 3

Dataset

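Description: 부산광역시연제구_식품접객업소현황_20230517 (food service establishments in Yeonje-gu, Busan Metropolitan City, 20230517)
Author: 부산광역시 연제구 (Yeonje-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3043636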

Alerts

업종명 (business type) is highly imbalanced (74.5%) [Imbalance]
소재지전화 (location phone number) has 1150 (32.8%) missing values [Missing]
연번 (serial number) has unique values [Unique]
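
The 74.5% imbalance figure for 업종명 is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the six category counts listed for that variable. The sketch below is an assumption about how such a score can be computed, not a statement of the profiler's internals; the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, closer to 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits over the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts for 업종명 as listed in this report.
counts = [3088, 339, 35, 19, 18, 11]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

For these counts the score comes out at roughly 0.745, which matches the alert.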

Reproduction

Analysis started: 2023-12-10 16:39:55.080168
Analysis finished: 2023-12-10 16:39:56.838727
Duration: 1.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 3510
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1755.5
Minimum: 1
Maximum: 3510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 176.45
Q1: 878.25
Median: 1755.5
Q3: 2632.75
95-th percentile: 3334.55
Maximum: 3510
Range: 3509
Interquartile range (IQR): 1754.5
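
The quantiles above are what linear interpolation between neighbouring order statistics gives for the sequence 1, 2, ..., 3510. The sketch below assumes 연번 is exactly that sequence (the sample rows and the reported sum are consistent with this) and that the profiler interpolates linearly; `quantile` is a hypothetical helper, not a library function.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an ascending sequence (0 <= q <= 1)."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)                              # index at or below the target
    upper = min(lower + 1, len(sorted_values) - 1)     # neighbour above, clamped
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 3511))   # assumed: 연번 is 1, 2, ..., 3510
q1 = quantile(values, 0.25)
q3 = quantile(values, 0.75)
iqr = q3 - q1
print(quantile(values, 0.05), q1, quantile(values, 0.5), q3, quantile(values, 0.95), iqr)
```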

Descriptive statistics

Standard deviation: 1013.394
Coefficient of variation (CV): 0.57726804
Kurtosis: -1.2
Mean: 1755.5
Median Absolute Deviation (MAD): 877.5
Skewness: 0
Sum: 6161805
Variance: 1026967.5
Monotonicity: Strictly increasing
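
Since 연번 is strictly increasing from 1 to 3510 with 3510 distinct values, these figures can be checked against the integer sequence 1..3510. A minimal sketch using only the Python standard library; treating the reported variance as the sample variance (n - 1 denominator) is an assumption that happens to match the reported value.

```python
import statistics

values = list(range(1, 3511))               # assumed: 연번 is 1, 2, ..., 3510

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 3), round(cv, 8), mad)
```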
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2346 | 1 | < 0.1%
2335 | 1 | < 0.1%
2336 | 1 | < 0.1%
2337 | 1 | < 0.1%
2338 | 1 | < 0.1%
2339 | 1 | < 0.1%
2340 | 1 | < 0.1%
2341 | 1 | < 0.1%
2342 | 1 | < 0.1%
Other values (3500) | 3500 | 99.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
3510 | 1 | < 0.1%
3509 | 1 | < 0.1%
3508 | 1 | < 0.1%
3507 | 1 | < 0.1%
3506 | 1 | < 0.1%
3505 | 1 | < 0.1%
3504 | 1 | < 0.1%
3503 | 1 | < 0.1%
3502 | 1 | < 0.1%
3501 | 1 | < 0.1%

업종명
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 27.6 KiB

일반음식점: 3088
휴게음식점: 339
유흥주점영업: 35
제과점영업: 19
단란주점: 18

Length

Max length: 6
Median length: 5
Mean length: 5.0079772
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

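Value | Count | Frequency (%)
일반음식점 (general restaurant) | 3088 | 88.0%
휴게음식점 (snack and beverage shop) | 339 | 9.7%
유흥주점영업 (entertainment bar) | 35 | 1.0%
제과점영업 (bakery) | 19 | 0.5%
단란주점 (karaoke bar) | 18 | 0.5%
위탁급식영업 (contract catering) | 11 | 0.3%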

Length

Histogram of lengths of the category

Common Values (Plot)


업소명
Text

Distinct: 2803
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Memory size: 27.6 KiB

Length

Max length: 30
Median length: 24
Mean length: 6.6202279
Min length: 1

Characters and Unicode

Total characters: 23237
Distinct characters: 846
Distinct categories: 11
Distinct scripts: 5
Distinct blocks: 6

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
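
As a rough illustration of the category counts in this section, the general category of each character can be read from Python's standard `unicodedata` module. This is a sketch of the idea, not the profiler's own code; `category_counts` is a hypothetical helper, and the input is one of the sample values of 업소명. The two-letter codes map to the names used in the report, for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Ps` for Open Punctuation and `Pe` for Close Punctuation. Script and block lookups are not provided by `unicodedata` and are not shown.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category (e.g. 'Lo', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("치킨업(UP)")
print(dict(counts))
```

Here the three Hangul syllables fall under `Lo`, the two Latin capitals under `Lu`, and the parentheses under `Ps` and `Pe`.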

Unique

Unique: 2124
Unique (%): 60.5%
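
Distinct and Unique measure different things: as these counts are usually defined in this kind of profile, Distinct is the number of different values, while Unique is the number of values that occur exactly once. A minimal sketch under that assumption; `distinct_and_unique` is a hypothetical helper and the input list is made up for illustration.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values seen, and values seen exactly once."""
    frequency = Counter(values)
    distinct = len(frequency)
    unique = sum(1 for count in frequency.values() if count == 1)
    return distinct, unique

names = ["연산식당", "고성횟집", "연산식당", "먹거리"]   # illustrative values only
result = distinct_and_unique(names)
print(result)
```

Under this definition Unique can never exceed Distinct, which agrees with every variable in this report.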

Sample

1st row: 컴포즈교대점
2nd row: 언양진미식당
3rd row: 용문각(2호점)
4th row: 치킨업(UP)
5th row: 밀양돼지국밥

Words

Value | Count | Frequency (%)
연산점 | 140 | 2.7%
시청점 | 47 | 0.9%
부산시청점 | 42 | 0.8%
부산연산점 | 31 | 0.6%
연산토곡점 | 21 | 0.4%
연제점 | 17 | 0.3%
세븐일레븐 | 16 | 0.3%
칼국수 | 16 | 0.3%
씨유 | 12 | 0.2%
카페 | 11 | 0.2%
Other values (3287) | 4772 | 93.1%

Most occurring characters

ValueCountFrequency (%)
1617
 
7.0%
779
 
3.4%
713
 
3.1%
555
 
2.4%
425
 
1.8%
319
 
1.4%
294
 
1.3%
270
 
1.2%
267
 
1.1%
) 265
 
1.1%
Other values (836) 17733
76.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 19683 | 84.7%
Space Separator | 1617 | 7.0%
Uppercase Letter | 499 | 2.1%
Lowercase Letter | 448 | 1.9%
Decimal Number | 361 | 1.6%
Close Punctuation | 265 | 1.1%
Open Punctuation | 265 | 1.1%
Other Punctuation | 92 | 0.4%
Dash Punctuation | 3 | < 0.1%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
779
 
4.0%
713
 
3.6%
555
 
2.8%
425
 
2.2%
319
 
1.6%
294
 
1.5%
270
 
1.4%
267
 
1.4%
241
 
1.2%
230
 
1.2%
Other values (764) 15590
79.2%
Uppercase Letter
ValueCountFrequency (%)
C 49
 
9.8%
B 46
 
9.2%
S 34
 
6.8%
O 33
 
6.6%
A 33
 
6.6%
E 31
 
6.2%
G 31
 
6.2%
T 24
 
4.8%
K 19
 
3.8%
D 19
 
3.8%
Other values (16) 180
36.1%
Lowercase Letter
ValueCountFrequency (%)
e 88
19.6%
n 38
 
8.5%
a 37
 
8.3%
r 33
 
7.4%
o 31
 
6.9%
i 30
 
6.7%
c 22
 
4.9%
f 20
 
4.5%
s 19
 
4.2%
t 16
 
3.6%
Other values (14) 114
25.4%
Decimal Number
ValueCountFrequency (%)
2 80
22.2%
1 68
18.8%
0 51
14.1%
5 42
11.6%
3 29
 
8.0%
8 26
 
7.2%
9 24
 
6.6%
4 16
 
4.4%
6 15
 
4.2%
7 10
 
2.8%
Other Punctuation
ValueCountFrequency (%)
& 47
51.1%
, 19
20.7%
. 19
20.7%
' 3
 
3.3%
! 2
 
2.2%
; 2
 
2.2%
Space Separator
ValueCountFrequency (%)
1617
100.0%
Close Punctuation
ValueCountFrequency (%)
) 265
100.0%
Open Punctuation
ValueCountFrequency (%)
( 265
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Letter Number
ValueCountFrequency (%)
3
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 19658 | 84.6%
Common | 2604 | 11.2%
Latin | 950 | 4.1%
Han | 23 | 0.1%
Hiragana | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
779
 
4.0%
713
 
3.6%
555
 
2.8%
425
 
2.2%
319
 
1.6%
294
 
1.5%
270
 
1.4%
267
 
1.4%
241
 
1.2%
230
 
1.2%
Other values (745) 15565
79.2%
Latin
ValueCountFrequency (%)
e 88
 
9.3%
C 49
 
5.2%
B 46
 
4.8%
n 38
 
4.0%
a 37
 
3.9%
S 34
 
3.6%
r 33
 
3.5%
O 33
 
3.5%
A 33
 
3.5%
E 31
 
3.3%
Other values (41) 528
55.6%
Common
ValueCountFrequency (%)
1617
62.1%
) 265
 
10.2%
( 265
 
10.2%
2 80
 
3.1%
1 68
 
2.6%
0 51
 
2.0%
& 47
 
1.8%
5 42
 
1.6%
3 29
 
1.1%
8 26
 
1.0%
Other values (11) 114
 
4.4%
Han
ValueCountFrequency (%)
3
13.0%
2
 
8.7%
2
 
8.7%
2
 
8.7%
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (7) 7
30.4%
Hiragana
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 19658 | 84.6%
ASCII | 3550 | 15.3%
CJK | 23 | 0.1%
Number Forms | 3 | < 0.1%
Hiragana | 2 | < 0.1%
Letterlike Symbols | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1617
45.5%
) 265
 
7.5%
( 265
 
7.5%
e 88
 
2.5%
2 80
 
2.3%
1 68
 
1.9%
0 51
 
1.4%
C 49
 
1.4%
& 47
 
1.3%
B 46
 
1.3%
Other values (60) 974
27.4%
Hangul
ValueCountFrequency (%)
779
 
4.0%
713
 
3.6%
555
 
2.8%
425
 
2.2%
319
 
1.6%
294
 
1.5%
270
 
1.4%
267
 
1.4%
241
 
1.2%
230
 
1.2%
Other values (745) 15565
79.2%
Number Forms
ValueCountFrequency (%)
3
100.0%
CJK
ValueCountFrequency (%)
3
13.0%
2
 
8.7%
2
 
8.7%
2
 
8.7%
2
 
8.7%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other values (7) 7
30.4%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Hiragana
ValueCountFrequency (%)
1
50.0%
1
50.0%

소재지(도로명)
Text

Distinct: 2595
Distinct (%): 73.9%
Missing: 0
Missing (%): 0.0%
Memory size: 27.6 KiB

Length

Max length: 72
Median length: 59
Mean length: 30.17037
Min length: 9

Characters and Unicode

Total characters: 105898
Distinct characters: 295
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1866
Unique (%): 53.2%

Sample

1st row: 부산광역시 연제구 교대로 18 (거제동)
2nd row: 부산광역시 연제구 교대로 11 (거제동)
3rd row: 부산광역시 연제구 중앙대로1175번길 33-1 (거제동)
4th row: 부산광역시 연제구 거제천로 183 (거제동)
5th row: 부산광역시 연제구 중앙대로1120번길 13 (연산동)

Words

Value | Count | Frequency (%)
부산광역시 | 3506 | 16.9%
연제구 | 3506 | 16.9%
연산동 | 2671 | 12.9%
1층 | 1352 | 6.5%
거제동 | 667 | 3.2%
일부호 | 212 | 1.0%
2층 | 172 | 0.8%
과정로 | 168 | 0.8%
거제천로 | 160 | 0.8%
월드컵대로 | 140 | 0.7%
Other values (1338) | 8216 | 39.6%

Most occurring characters

ValueCountFrequency (%)
17264
 
16.3%
6758
 
6.4%
6460
 
6.1%
1 5589
 
5.3%
4740
 
4.5%
3881
 
3.7%
3863
 
3.6%
3792
 
3.6%
( 3630
 
3.4%
) 3630
 
3.4%
Other values (285) 46291
43.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 61725 | 58.3%
Space Separator | 17264 | 16.3%
Decimal Number | 16277 | 15.4%
Open Punctuation | 3630 | 3.4%
Close Punctuation | 3630 | 3.4%
Other Punctuation | 2486 | 2.3%
Dash Punctuation | 521 | 0.5%
Uppercase Letter | 335 | 0.3%
Math Symbol | 24 | < 0.1%
Lowercase Letter | 6 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6758
 
10.9%
6460
 
10.5%
4740
 
7.7%
3881
 
6.3%
3863
 
6.3%
3792
 
6.1%
3549
 
5.7%
3514
 
5.7%
3507
 
5.7%
3506
 
5.7%
Other values (245) 18155
29.4%
Uppercase Letter
ValueCountFrequency (%)
E 52
15.5%
B 39
11.6%
A 37
11.0%
S 35
10.4%
W 34
10.1%
I 30
9.0%
K 30
9.0%
V 29
8.7%
C 20
 
6.0%
D 10
 
3.0%
Other values (5) 19
 
5.7%
Decimal Number
ValueCountFrequency (%)
1 5589
34.3%
2 2360
14.5%
3 1756
 
10.8%
0 1283
 
7.9%
4 1194
 
7.3%
5 1148
 
7.1%
6 829
 
5.1%
8 792
 
4.9%
7 726
 
4.5%
9 600
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 2459
98.9%
. 14
 
0.6%
· 8
 
0.3%
& 5
 
0.2%
Lowercase Letter
ValueCountFrequency (%)
e 2
33.3%
b 2
33.3%
s 1
16.7%
a 1
16.7%
Math Symbol
ValueCountFrequency (%)
~ 16
66.7%
< 4
 
16.7%
> 4
 
16.7%
Space Separator
ValueCountFrequency (%)
17264
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3630
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3630
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 521
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 61725 | 58.3%
Common | 43832 | 41.4%
Latin | 341 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6758
 
10.9%
6460
 
10.5%
4740
 
7.7%
3881
 
6.3%
3863
 
6.3%
3792
 
6.1%
3549
 
5.7%
3514
 
5.7%
3507
 
5.7%
3506
 
5.7%
Other values (245) 18155
29.4%
Common
ValueCountFrequency (%)
17264
39.4%
1 5589
 
12.8%
( 3630
 
8.3%
) 3630
 
8.3%
, 2459
 
5.6%
2 2360
 
5.4%
3 1756
 
4.0%
0 1283
 
2.9%
4 1194
 
2.7%
5 1148
 
2.6%
Other values (11) 3519
 
8.0%
Latin
ValueCountFrequency (%)
E 52
15.2%
B 39
11.4%
A 37
10.9%
S 35
10.3%
W 34
10.0%
I 30
8.8%
K 30
8.8%
V 29
8.5%
C 20
 
5.9%
D 10
 
2.9%
Other values (9) 25
7.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 61724 | 58.3%
ASCII | 44165 | 41.7%
None | 8 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17264
39.1%
1 5589
 
12.7%
( 3630
 
8.2%
) 3630
 
8.2%
, 2459
 
5.6%
2 2360
 
5.3%
3 1756
 
4.0%
0 1283
 
2.9%
4 1194
 
2.7%
5 1148
 
2.6%
Other values (29) 3852
 
8.7%
Hangul
ValueCountFrequency (%)
6758
 
10.9%
6460
 
10.5%
4740
 
7.7%
3881
 
6.3%
3863
 
6.3%
3792
 
6.1%
3549
 
5.7%
3514
 
5.7%
3507
 
5.7%
3506
 
5.7%
Other values (244) 18154
29.4%
None
ValueCountFrequency (%)
· 8
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

소재지전화
Text

MISSING 

Distinct: 1878
Distinct (%): 79.6%
Missing: 1150
Missing (%): 32.8%
Memory size: 27.6 KiB

Length

Max length: 14
Median length: 14
Mean length: 13.949576
Min length: 9

Characters and Unicode

Total characters: 32921
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1408
Unique (%): 59.7%

Sample

1st row: 051-506-0355
2nd row:  051-504-7226
3rd row:  051-864-6005
4th row:  051-866-9050
5th row: 051-996-7666

Words

Value | Count | Frequency (%)
051 | 2185 | 35.6%
852 | 171 | 2.8%
851 | 137 | 2.2%
868 | 136 | 2.2%
867 | 132 | 2.1%
853 | 125 | 2.0%
866 | 66 | 1.1%
861 | 61 | 1.0%
864 | 43 | 0.7%
070 | 42 | 0.7%
Other values (1867) | 3047 | 49.6%

Most occurring characters

Value | Count | Frequency (%)
- | 4712 | 14.3%
5 | 4613 | 14.0%
(space) | 4513 | 13.7%
0 | 3880 | 11.8%
1 | 3637 | 11.0%
8 | 2892 | 8.8%
6 | 1944 | 5.9%
7 | 1708 | 5.2%
2 | 1481 | 4.5%
3 | 1327 | 4.0%
Other values (10) | 2214 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 23678 | 71.9%
Dash Punctuation | 4712 | 14.3%
Space Separator | 4513 | 13.7%
Uppercase Letter | 14 | < 0.1%
Math Symbol | 4 | < 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 4613
19.5%
0 3880
16.4%
1 3637
15.4%
8 2892
12.2%
6 1944
8.2%
7 1708
 
7.2%
2 1481
 
6.3%
3 1327
 
5.6%
9 1250
 
5.3%
4 946
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
E 4
28.6%
R 2
14.3%
P 2
14.3%
L 2
14.3%
A 2
14.3%
C 2
14.3%
Math Symbol
ValueCountFrequency (%)
< 2
50.0%
> 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4712
100.0%
Space Separator
ValueCountFrequency (%)
4513
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 32907 | > 99.9%
Latin | 14 | < 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 4712
14.3%
5 4613
14.0%
4513
13.7%
0 3880
11.8%
1 3637
11.1%
8 2892
8.8%
6 1944
5.9%
7 1708
 
5.2%
2 1481
 
4.5%
3 1327
 
4.0%
Other values (4) 2200
6.7%
Latin
ValueCountFrequency (%)
E 4
28.6%
R 2
14.3%
P 2
14.3%
L 2
14.3%
A 2
14.3%
C 2
14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 32921 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 4712
14.3%
5 4613
14.0%
4513
13.7%
0 3880
11.8%
1 3637
11.0%
8 2892
8.8%
6 1944
5.9%
7 1708
 
5.2%
2 1481
 
4.5%
3 1327
 
4.0%
Other values (10) 2214
6.7%

Interactions


Correlations

Variable | 연번 | 업종명
연번 | 1.000 | 0.407
업종명 | 0.407 | 1.000

Variable | 연번 | 업종명
연번 | 1.000 | 0.228
업종명 | 0.228 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 연번 | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
0 | 1 | 일반음식점 | 컴포즈교대점 | 부산광역시 연제구 교대로 18 (거제동) | 051-506-0355
1 | 2 | 일반음식점 | 언양진미식당 | 부산광역시 연제구 교대로 11 (거제동) | 051-504-7226
2 | 3 | 일반음식점 | 용문각(2호점) | 부산광역시 연제구 중앙대로1175번길 33-1 (거제동) | 051-864-6005
3 | 4 | 일반음식점 | 치킨업(UP) | 부산광역시 연제구 거제천로 183 (거제동) | 051-866-9050
4 | 5 | 일반음식점 | 밀양돼지국밥 | 부산광역시 연제구 중앙대로1120번길 13 (연산동) | 051-996-7666
5 | 6 | 일반음식점 | 연산식당 | 부산광역시 연제구 과정로265번가길 5 (연산동) | 051-866-5258
6 | 7 | 일반음식점 | 고성횟집 | 부산광역시 연제구 중앙대로 1116-9 (연산동) | 051-861-7666
7 | 8 | 일반음식점 | 먹거리 | 부산광역시 연제구 중앙대로 1116-11 (연산동) | 051-867-4789
8 | 9 | 일반음식점 | 24시콩나물해장국 | 부산광역시 연제구 중앙대로 1116-11 (연산동) | 051-852-5453
9 | 10 | 일반음식점 | 참나무숯불갈비 | 부산광역시 연제구 중앙대로1133번길 13 (연산동) | 051-852-2014

Last rows

Index | 연번 | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
3500 | 3501 | 일반음식점 | 모미루 돈 | 부산광역시 연제구 중앙대로1235번길 20 (거제동) | 051- 506-4710
3501 | 3502 | 일반음식점 | 삼환축산 수영강변점 | 부산광역시 연제구 고분로 251 (연산동,(1층)) | 051 -722-2333
3502 | 3503 | 일반음식점 | 갑이네 | 부산광역시 연제구 연동로 9-1, 1층 (연산동) | 051 -900-7555
3503 | 3504 | 일반음식점 | 인생닭강정 부산시청점 | 부산광역시 연제구 연제로42번길 35, 1,2층 (연산동) | 051-338-2222
3504 | 3505 | 일반음식점 | 키리츠 | 부산광역시 연제구 반송로 62-9, 1층 일부호 (연산동) | <NA>
3505 | 3506 | 일반음식점 | 카츠우노 | 부산광역시 연제구 중앙천로19번길 23, 1층 (연산동) | <NA>
3506 | 3507 | 일반음식점 | 온도카페 | 부산광역시 연제구 중앙대로 1125, C T 타워 비동 103,104호 (연산동) | <NA>
3507 | 3508 | 휴게음식점 | 포근당 | 부산광역시 연제구 중앙천로4번길 5, 1층 (연산동) | <NA>
3508 | 3509 | 휴게음식점 | 세븐일레븐 부산물만골역점 | 부산광역시 연제구 월드컵대로58번길 7, 1층 일부호 (연산동) | <NA>
3509 | 3510 | 제과점영업 | 다온브레드 | 부산광역시 연제구 연수로 102, 1층 일부호 (연산동) | <NA>
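
Several 소재지전화 values in the last rows carry stray spaces around the hyphens (for example `051- 506-4710`), which is in line with the 4513 space characters counted for this column. A minimal sketch of one way to normalize such values before comparing or deduplicating them; `normalize_phone` is a hypothetical helper, and treating `<NA>` as a missing value is an assumption about how the export renders nulls.

```python
import re

def normalize_phone(raw):
    """Remove all whitespace from a phone string; return None for missing values."""
    if raw is None or raw.strip() in ("", "<NA>"):
        return None
    return re.sub(r"\s+", "", raw)

# Values copied from the sample rows of this report.
samples = ["051- 506-4710", "051 -722-2333", "051-338-2222", "<NA>"]
cleaned = [normalize_phone(s) for s in samples]
print(cleaned)
```

Whether the distinct and unique counts for this column would change after such cleaning cannot be told from the report alone.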