Overview

Dataset statistics

Number of variables: 5
Number of observations: 102
Missing cells: 3
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 42.3 B
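
The percentage and size figures above follow from the raw counts. The sketch below redoes that arithmetic, assuming the missing-cell rate is missing cells divided by (variables × observations) and that total size is average record size times row count; the variable names are illustrative only.

```python
# Counts reported in the dataset statistics above.
n_variables = 5
n_observations = 102
missing_cells = 3

# Assumption: the missing-cell rate is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = missing_cells / total_cells

# Assumption: total size = average record size (bytes) x rows, shown in KiB.
avg_record_bytes = 42.3
total_kib = avg_record_bytes * n_observations / 1024

print(f"{missing_pct:.1%}")
print(f"{total_kib:.1f} KiB")
```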

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

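Description: Hygiene management grade status for Nam-gu, Busan Metropolitan City (2019) [부산광역시남구위생관리등급현황(2019년)]
Author: Nam-gu, Busan Metropolitan City [부산광역시 남구]
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15047966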

Alerts

등급 (grade) is highly overall correlated with 연번 (serial number) and 1 other field [High correlation]
업종 (business type) is highly overall correlated with 연번 (serial number) and 1 other field [High correlation]
연번 (serial number) is highly overall correlated with 업종 (business type) and 1 other field [High correlation]
업종 (business type) is highly imbalanced (92.1%) [Imbalance]
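
The 92.1% imbalance figure can be reproduced from the 업종 counts (101 rows of 이용업, one `<NA>` placeholder) with a normalized-entropy score. This is a sketch under the assumption that the score is 1 minus the Shannon entropy divided by log2 of the number of classes; the report does not state its formula, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes. 0 means balanced."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# 업종 counts taken from the report: 이용업 = 101, "<NA>" = 1.
score = imbalance_score([101, 1])
print(f"{score:.1%}")
```

A perfectly balanced two-class column scores 0 under this definition, so a value above 0.9 says that one category carries almost all rows.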

Reproduction

Analysis started: 2023-12-10 16:26:42.641298
Analysis finished: 2023-12-10 16:26:44.220401
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 101
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51
Minimum: 1
Maximum: 101
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 26
Median: 51
Q3: 76
95-th percentile: 96
Maximum: 101
Range: 100
Interquartile range (IQR): 50
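
These quantiles are what linear interpolation gives for the integers 1 to 101. The sketch below assumes that 연번 is exactly that run, which is consistent with the reported distinct count, minimum, maximum and sum, and that the single missing value is ignored.

```python
import statistics

# Assumption: the non-missing 연번 values are the integers 1..101.
values = list(range(1, 102))

# method="inclusive" interpolates linearly between the minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = max(values) - min(values)
print(p5, q1, median, q3, p95, value_range, iqr)
```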

Descriptive statistics

Standard deviation: 29.300171
Coefficient of variation (CV): 0.57451315
Kurtosis: -1.2
Mean: 51
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5151
Variance: 858.5
Monotonicity: Strictly increasing
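
Most of the descriptive statistics above can be checked with the standard library. The sketch assumes the non-missing 연번 values are the integers 1 to 101 and that the variance is the sample variance (divisor n - 1). Kurtosis and skewness are left out because the report does not say which estimator it uses.

```python
import math
import statistics

# Assumption: the non-missing 연번 values are the integers 1..101.
values = list(range(1, 102))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)

# Strictly increasing: every value is larger than its predecessor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance, round(std, 6), round(cv, 8), mad)
```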
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (91) | 91 | 89.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
101 | 1 | 1.0%
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%

업종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Category counts: 이용업 101, <NA> 1

Length

Max length: 4
Median length: 3
Mean length: 3.0098039
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 이용업
2nd row: 이용업
3rd row: 이용업
4th row: 이용업
5th row: 이용업

Common Values

Value | Count | Frequency (%)
이용업 | 101 | 99.0%
<NA> | 1 | 1.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: common values plot (이용업 101, 99.0%; <NA> 1, 1.0%)

업소명
Text

Distinct: 100
Distinct (%): 99.0%
Missing: 1
Missing (%): 1.0%
Memory size: 948.0 B

Length

Max length: 13
Median length: 6
Mean length: 6.5940594
Min length: 3

Characters and Unicode

Total characters: 666
Distinct characters: 146
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
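
As an illustration of the per-category counts reported for text variables, the sketch below tallies Unicode general categories with Python's `unicodedata` module. The two input strings are sample values that appear in this report; `category_counts` is a hypothetical helper, and the two-letter codes map to the report's labels as Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number, Lu = Uppercase Letter, Ps/Pe = Open/Close Punctuation, Po = Other Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# A business-name sample: Hangul syllables and one space.
name_counts = category_counts("가나안 남성컷트")

# An address sample: Hangul, digits, spaces, brackets, a slash, Latin capitals.
address_counts = category_counts("우암동 189번지 (T/B)")

print(dict(name_counts))
print(dict(address_counts))
```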

Unique

Unique: 99
Unique (%): 98.0%

Sample

1st row: 남도탕구내이용원
2nd row: 용호헬스사우나이용원
3rd row: 중앙해수탕구내이용원
4th row: 가나안 남성컷트
5th row: 경덕 이용원

Value | Count | Frequency (%)
이용원 | 54 | 31.2%
구내이용 | 4 | 2.3%
구내이용원 | 4 | 2.3%
구내 | 2 | 1.2%
진주 | 2 | 1.2%
효원 | 1 | 0.6%
신신 | 1 | 0.6%
(not recovered) | 1 | 0.6%
유정 | 1 | 0.6%
부경해수탕구내이용원 | 1 | 0.6%
Other values (102) | 102 | 59.0%

Most occurring characters

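Value | Count | Frequency (%)
(not recovered) | 91 | 13.7%
(not recovered) | 90 | 13.5%
(not recovered) | 86 | 12.9%
(space) | 72 | 10.8%
(not recovered) | 22 | 3.3%
(not recovered) | 20 | 3.0%
(not recovered) | 17 | 2.6%
(not recovered) | 12 | 1.8%
(not recovered) | 10 | 1.5%
(not recovered) | 10 | 1.5%
Other values (136) | 236 | 35.4%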

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 592 | 88.9%
Space Separator | 72 | 10.8%
Open Punctuation | 1 | 0.2%
Close Punctuation | 1 | 0.2%

Most frequent character per category

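Other Letter
Value | Count | Frequency (%)
(not recovered) | 91 | 15.4%
(not recovered) | 90 | 15.2%
(not recovered) | 86 | 14.5%
(not recovered) | 22 | 3.7%
(not recovered) | 20 | 3.4%
(not recovered) | 17 | 2.9%
(not recovered) | 12 | 2.0%
(not recovered) | 10 | 1.7%
(not recovered) | 10 | 1.7%
(not recovered) | 9 | 1.5%
Other values (133) | 225 | 38.0%

Space Separator
Value | Count | Frequency (%)
(space) | 72 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%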

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 592 | 88.9%
Common | 74 | 11.1%

Most frequent character per script

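Hangul
Value | Count | Frequency (%)
(not recovered) | 91 | 15.4%
(not recovered) | 90 | 15.2%
(not recovered) | 86 | 14.5%
(not recovered) | 22 | 3.7%
(not recovered) | 20 | 3.4%
(not recovered) | 17 | 2.9%
(not recovered) | 12 | 2.0%
(not recovered) | 10 | 1.7%
(not recovered) | 10 | 1.7%
(not recovered) | 9 | 1.5%
Other values (133) | 225 | 38.0%

Common
Value | Count | Frequency (%)
(space) | 72 | 97.3%
( | 1 | 1.4%
) | 1 | 1.4%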

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 592 | 88.9%
ASCII | 74 | 11.1%

Most frequent character per block

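Hangul
Value | Count | Frequency (%)
(not recovered) | 91 | 15.4%
(not recovered) | 90 | 15.2%
(not recovered) | 86 | 14.5%
(not recovered) | 22 | 3.7%
(not recovered) | 20 | 3.4%
(not recovered) | 17 | 2.9%
(not recovered) | 12 | 2.0%
(not recovered) | 10 | 1.7%
(not recovered) | 10 | 1.7%
(not recovered) | 9 | 1.5%
Other values (133) | 225 | 38.0%

ASCII
Value | Count | Frequency (%)
(space) | 72 | 97.3%
( | 1 | 1.4%
) | 1 | 1.4%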

소재지
Text

Distinct: 100
Distinct (%): 99.0%
Missing: 1
Missing (%): 1.0%
Memory size: 948.0 B

Length

Max length: 40
Median length: 26
Mean length: 18.29703
Min length: 12

Characters and Unicode

Total characters: 1848
Distinct characters: 103
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 98.0%

Sample

1st row: 용호로76번길 83 (용호동)
2nd row: 용호로159번길 108 (용호동)
3rd row: 용호로 64, 5층 (용호동)
4th row: 용호로 98-24, 101호 (용호동, 성지빌라)
5th row: 동명로146번길 48, 1층 (용호동)

Value | Count | Frequency (%)
대연동 | 30 | 9.1%
용호동 | 25 | 7.6%
감만동 | 11 | 3.3%
문현동 | 11 | 3.3%
우암동 | 8 | 2.4%
t/b | 8 | 2.4%
용당동 | 4 | 1.2%
39 | 4 | 1.2%
유엔로 | 4 | 1.2%
지게골로 | 4 | 1.2%
Other values (181) | 222 | 67.1%

Most occurring characters

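Value | Count | Frequency (%)
(space) | 364 | 19.7%
(not recovered) | 114 | 6.2%
) | 104 | 5.6%
( | 104 | 5.6%
(not recovered) | 90 | 4.9%
1 | 88 | 4.8%
2 | 55 | 3.0%
(not recovered) | 55 | 3.0%
(not recovered) | 53 | 2.9%
(not recovered) | 50 | 2.7%
Other values (93) | 771 | 41.7%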

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 810 | 43.8%
Decimal Number | 399 | 21.6%
Space Separator | 364 | 19.7%
Close Punctuation | 104 | 5.6%
Open Punctuation | 104 | 5.6%
Other Punctuation | 39 | 2.1%
Uppercase Letter | 16 | 0.9%
Dash Punctuation | 12 | 0.6%

Most frequent character per category

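Other Letter
Value | Count | Frequency (%)
(not recovered) | 114 | 14.1%
(not recovered) | 90 | 11.1%
(not recovered) | 55 | 6.8%
(not recovered) | 53 | 6.5%
(not recovered) | 50 | 6.2%
(not recovered) | 43 | 5.3%
(not recovered) | 40 | 4.9%
(not recovered) | 36 | 4.4%
(not recovered) | 21 | 2.6%
(not recovered) | 16 | 2.0%
Other values (75) | 292 | 36.0%

Decimal Number
Value | Count | Frequency (%)
1 | 88 | 22.1%
2 | 55 | 13.8%
3 | 47 | 11.8%
6 | 41 | 10.3%
9 | 36 | 9.0%
4 | 31 | 7.8%
0 | 29 | 7.3%
5 | 25 | 6.3%
8 | 25 | 6.3%
7 | 22 | 5.5%

Other Punctuation
Value | Count | Frequency (%)
, | 31 | 79.5%
/ | 8 | 20.5%

Uppercase Letter
Value | Count | Frequency (%)
B | 8 | 50.0%
T | 8 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 364 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 104 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 104 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%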

Most occurring scripts

Value | Count | Frequency (%)
Common | 1022 | 55.3%
Hangul | 810 | 43.8%
Latin | 16 | 0.9%

Most frequent character per script

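Hangul
Value | Count | Frequency (%)
(not recovered) | 114 | 14.1%
(not recovered) | 90 | 11.1%
(not recovered) | 55 | 6.8%
(not recovered) | 53 | 6.5%
(not recovered) | 50 | 6.2%
(not recovered) | 43 | 5.3%
(not recovered) | 40 | 4.9%
(not recovered) | 36 | 4.4%
(not recovered) | 21 | 2.6%
(not recovered) | 16 | 2.0%
Other values (75) | 292 | 36.0%

Common
Value | Count | Frequency (%)
(space) | 364 | 35.6%
) | 104 | 10.2%
( | 104 | 10.2%
1 | 88 | 8.6%
2 | 55 | 5.4%
3 | 47 | 4.6%
6 | 41 | 4.0%
9 | 36 | 3.5%
4 | 31 | 3.0%
, | 31 | 3.0%
Other values (6) | 121 | 11.8%

Latin
Value | Count | Frequency (%)
B | 8 | 50.0%
T | 8 | 50.0%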

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1038 | 56.2%
Hangul | 810 | 43.8%

Most frequent character per block

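ASCII
Value | Count | Frequency (%)
(space) | 364 | 35.1%
) | 104 | 10.0%
( | 104 | 10.0%
1 | 88 | 8.5%
2 | 55 | 5.3%
3 | 47 | 4.5%
6 | 41 | 3.9%
9 | 36 | 3.5%
4 | 31 | 3.0%
, | 31 | 3.0%
Other values (8) | 137 | 13.2%

Hangul
Value | Count | Frequency (%)
(not recovered) | 114 | 14.1%
(not recovered) | 90 | 11.1%
(not recovered) | 55 | 6.8%
(not recovered) | 53 | 6.5%
(not recovered) | 50 | 6.2%
(not recovered) | 43 | 5.3%
(not recovered) | 40 | 4.9%
(not recovered) | 36 | 4.4%
(not recovered) | 21 | 2.6%
(not recovered) | 16 | 2.0%
Other values (75) | 292 | 36.0%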

등급
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Category counts: 백색 53, 황색 27, 녹색 21, <NA> 1

Length

Max length: 4
Median length: 2
Mean length: 2.0196078
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 녹색
2nd row: 녹색
3rd row: 녹색
4th row: 녹색
5th row: 녹색

Common Values

Value | Count | Frequency (%)
백색 | 53 | 52.0%
황색 | 27 | 26.5%
녹색 | 21 | 20.6%
<NA> | 1 | 1.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: common values plot (백색 53, 52.0%; 황색 27, 26.5%; 녹색 21, 20.6%; <NA> 1, 1.0%)

Interactions


Correlations

 | 연번 | 업소명 | 소재지 | 등급
연번 | 1.000 | 0.945 | 0.945 | 0.952
업소명 | 0.945 | 1.000 | 0.999 | 1.000
소재지 | 0.945 | 0.999 | 1.000 | 1.000
등급 | 0.952 | 1.000 | 1.000 | 1.000
 | 등급 | 업종
등급 | 1.000 | 1.000
업종 | 1.000 | 1.000
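
The report does not say which coefficient each matrix uses. As one hedged illustration of how two categorical columns can reach 1.000, the sketch below computes the uncorrected Cramér's V for 등급 against 업종. It assumes the single `<NA>` row is the same row in both columns, as the last row of the sample suggests. `cramers_v` is a hypothetical helper, and the tool's own estimator may apply a bias correction.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Rows: 등급 = 백색, 황색, 녹색, "<NA>"; columns: 업종 = 이용업, "<NA>".
# Assumption: the one "<NA>" row is shared by both columns.
table = [[53, 0], [27, 0], [21, 0], [0, 1]]
v = cramers_v(table)
print(round(v, 3))
```

Under these assumptions the perfect association comes entirely from the one placeholder row: knowing 등급 determines 업종, even though 업종 is otherwise constant.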
 | 연번 | 업종 | 등급
연번 | 1.000 | 1.000 | 0.932
업종 | 1.000 | 1.000 | 1.000
등급 | 0.932 | 1.000 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

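First rows

 | 연번 | 업종 | 업소명 | 소재지 | 등급
0 | 1 | 이용업 | 남도탕구내이용원 | 용호로76번길 83 (용호동) | 녹색
1 | 2 | 이용업 | 용호헬스사우나이용원 | 용호로159번길 108 (용호동) | 녹색
2 | 3 | 이용업 | 중앙해수탕구내이용원 | 용호로 64, 5층 (용호동) | 녹색
3 | 4 | 이용업 | 가나안 남성컷트 | 용호로 98-24, 101호 (용호동, 성지빌라) | 녹색
4 | 5 | 이용업 | 경덕 이용원 | 동명로146번길 48, 1층 (용호동) | 녹색
5 | 6 | 이용업 | 대영온천 | 황령대로492번길 10 (대연동) | 녹색
6 | 7 | 이용업 | 동선이용원 | 용호로42번길 127 (용호동) | 녹색
7 | 8 | 이용업 | 동호 이용원 | 용주로 11 (용호동) | 녹색
8 | 9 | 이용업 | 메트로랜드이용원 | 분포로 66-14 (용호동) | 녹색
9 | 10 | 이용업 | 블루클럽 | 분포로 113, 1005동 127-1호 (용호동) | 녹색

Last rows

 | 연번 | 업종 | 업소명 | 소재지 | 등급
92 | 93 | 이용업 | 진경이용원 | 수영로 195-2 (대연동,지하1층) | 백색
93 | 94 | 이용업 | 궁전 이용원 | 유엔평화로 133 (용당동) | 백색
94 | 95 | 이용업 | 신선탕구내이용원 | 유엔로 38 (우암동) | 백색
95 | 96 | 이용업 | 신신 이용원 | 지게골로 76 (문현동) | 백색
96 | 97 | 이용업 | 효원 이용원 | 전포대로 94 (문현동) | 백색
97 | 98 | 이용업 | 동명대학교 구내이용원 | 신선로 428 (용당동) | 백색
98 | 99 | 이용업 | 문현탕컷트실 | 진남로 188 (문현동) | 백색
99 | 100 | 이용업 | 백천 이용원 | 우암동 189번지 (T/B) | 백색
100 | 101 | 이용업 | 삼천리 이용원 | 문현동 781번지 10호 (T/B) | 백색
101 | <NA> | <NA> | <NA> | <NA> | <NA>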