Overview

Dataset statistics

Number of variables: 4
Number of observations: 173
Missing cells: 24
Missing cells (%): 3.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.5 KiB
Average record size in memory: 32.8 B
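
As a quick consistency check, the missing-cell percentage follows from the counts in this report: 24 missing cells out of 173 observations times 4 variables. The sketch below reproduces it, and also the 13.9% reported for 소재지전화 alone. The function name `missing_pct` is an illustrative assumption, not part of ydata-profiling.

```python
def missing_pct(n_missing: int, n_rows: int, n_cols: int = 1) -> float:
    """Share of missing cells in percent, rounded to one decimal place."""
    return round(100 * n_missing / (n_rows * n_cols), 1)

# Dataset level: 24 missing cells across 173 rows x 4 variables.
dataset_level = missing_pct(24, 173, 4)
# Column level: all 24 missing cells fall in the 소재지전화 column.
column_level = missing_pct(24, 173)
print(dataset_level, column_level)
```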

Variable types

Categorical: 1
Text: 3

Dataset

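Description: 부산광역시_동구_숙박업현황_20220113 (Busan Metropolitan City, Dong-gu, lodging business status, 20220113)
Author: 부산광역시 동구 (Dong-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15028641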

Alerts

업종명 is highly imbalanced (63.6%) [Imbalance]
소재지전화 has 24 (13.9%) missing values [Missing]
업소명 has unique values [Unique]
영업소 주소(도로명) has unique values [Unique]
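
The 63.6% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition. It reproduces the reported value from the counts 161 and 12 listed for 업종명, but it is not copied from the ydata-profiling source, and the name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no defined imbalance under this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts for 업종명: 숙박업(일반) = 161, 숙박업(생활) = 12.
score = imbalance_score([161, 12])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1.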

Reproduction

Analysis started: 2023-12-10 16:48:14.725780
Analysis finished: 2023-12-10 16:48:15.049407
Duration: 0.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명 (business type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
숙박업(일반) | 161
숙박업(생활) | 12

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 161 | 93.1%
숙박업(생활) | 12 | 6.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 숙박업(일반) 161 (93.1%), 숙박업(생활) 12 (6.9%)]

업소명 (establishment name)
Text

UNIQUE

Distinct: 173
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 21
Median length: 19
Mean length: 5.6589595
Min length: 2

Characters and Unicode

Total characters: 979
Distinct characters: 225
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
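
To illustrate how a category breakdown of this kind can be derived, the sketch below counts the Unicode general category of each character in one sample value from this column, using Python's standard `unicodedata` module. The long category names mirror the labels used in this report. The helper name `category_counts` and the choice of sample value are illustrative assumptions, not the tool's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)          # e.g. "Lo", "Nd"
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

counts = category_counts("호텔26(HOTEL26)")
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this column.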

Unique

Unique: 173
Unique (%): 100.0%

Sample

1st row: 삼오여관
2nd row: 동명여인숙
3rd row: 대성그린빌
4th row: 첵앤아웃게스트하우스
5th row: 호텔26(HOTEL26)

Value | Count | Frequency (%)
모텔 | 4 | 1.9%
게스트하우스 | 4 | 1.9%
부산역 | 4 | 1.9%
호텔 | 3 | 1.4%
부산역점 | 2 | 1.0%
하운드호텔 | 2 | 1.0%
탑모텔 | 2 | 1.0%
(value lost in extraction) | 2 | 1.0%
프린스 | 2 | 1.0%
레지던스 | 2 | 1.0%
Other values (180) | 180 | 87.0%
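
The counts in this table sum to more than the 173 rows, which suggests that it counts whitespace-separated words rather than whole values. A minimal sketch under that assumption, using five establishment names taken from the Sample table at the end of this report (the tokenization rule is an assumption, not the tool's documented behavior):

```python
from collections import Counter

# Establishment names copied from the Sample table of this report.
names = [
    "민트 파라다이스",
    "원웨이 게스트하우스",
    "부산역 오름 레지던스",
    "스마일 모텔",
    "부산숙박닷컴 게스트하우스",
]

# Split each name on whitespace and count the resulting words.
word_counts = Counter(word for name in names for word in name.split())
total = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{count / total:.1%}")
```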

Most occurring characters

ValueCountFrequency (%)
88
 
9.0%
50
 
5.1%
46
 
4.7%
38
 
3.9%
38
 
3.9%
34
 
3.5%
32
 
3.3%
27
 
2.8%
19
 
1.9%
18
 
1.8%
Other values (215) 589
60.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 862 | 88.0%
Space Separator | 34 | 3.5%
Uppercase Letter | 34 | 3.5%
Lowercase Letter | 14 | 1.4%
Decimal Number | 12 | 1.2%
Close Punctuation | 10 | 1.0%
Open Punctuation | 10 | 1.0%
Other Punctuation | 2 | 0.2%
Letter Number | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
88
 
10.2%
50
 
5.8%
46
 
5.3%
38
 
4.4%
38
 
4.4%
32
 
3.7%
27
 
3.1%
19
 
2.2%
18
 
2.1%
15
 
1.7%
Other values (180) 491
57.0%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 14.7%
T | 4 | 11.8%
C | 4 | 11.8%
O | 4 | 11.8%
H | 3 | 8.8%
K | 2 | 5.9%
L | 2 | 5.9%
M | 2 | 5.9%
J | 1 | 2.9%
W | 1 | 2.9%
Other values (6) | 6 | 17.6%

Lowercase Letter
Value | Count | Frequency (%)
o | 4 | 28.6%
n | 2 | 14.3%
z | 2 | 14.3%
i | 1 | 7.1%
h | 1 | 7.1%
s | 1 | 7.1%
t | 1 | 7.1%
e | 1 | 7.1%
l | 1 | 7.1%

Decimal Number
Value | Count | Frequency (%)
6 | 4 | 33.3%
9 | 2 | 16.7%
3 | 2 | 16.7%
2 | 2 | 16.7%
7 | 2 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 34 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 862 | 88.0%
Common | 68 | 6.9%
Latin | 49 | 5.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
88
 
10.2%
50
 
5.8%
46
 
5.3%
38
 
4.4%
38
 
4.4%
32
 
3.7%
27
 
3.1%
19
 
2.2%
18
 
2.1%
15
 
1.7%
Other values (180) 491
57.0%

Latin
Value | Count | Frequency (%)
E | 5 | 10.2%
o | 4 | 8.2%
T | 4 | 8.2%
C | 4 | 8.2%
O | 4 | 8.2%
H | 3 | 6.1%
K | 2 | 4.1%
n | 2 | 4.1%
L | 2 | 4.1%
z | 2 | 4.1%
Other values (16) | 17 | 34.7%

Common
Value | Count | Frequency (%)
(space) | 34 | 50.0%
) | 10 | 14.7%
( | 10 | 14.7%
6 | 4 | 5.9%
9 | 2 | 2.9%
. | 2 | 2.9%
3 | 2 | 2.9%
2 | 2 | 2.9%
7 | 2 | 2.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 862 | 88.0%
ASCII | 116 | 11.8%
Number Forms | 1 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
88
 
10.2%
50
 
5.8%
46
 
5.3%
38
 
4.4%
38
 
4.4%
32
 
3.7%
27
 
3.1%
19
 
2.2%
18
 
2.1%
15
 
1.7%
Other values (180) 491
57.0%

ASCII
Value | Count | Frequency (%)
(space) | 34 | 29.3%
) | 10 | 8.6%
( | 10 | 8.6%
E | 5 | 4.3%
o | 4 | 3.4%
T | 4 | 3.4%
C | 4 | 3.4%
6 | 4 | 3.4%
O | 4 | 3.4%
H | 3 | 2.6%
Other values (24) | 34 | 29.3%

Number Forms
ValueCountFrequency (%)
1
100.0%

영업소 주소(도로명) (business address, road-name format)
Text

UNIQUE

Distinct: 173
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 50
Median length: 36
Mean length: 27.098266
Min length: 20

Characters and Unicode

Total characters: 4688
Distinct characters: 72
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 173
Unique (%): 100.0%

Sample

1st row: 부산광역시 동구 범일일길 13-1 (범일동)
2nd row: 부산광역시 동구 중앙대로389번길 8 (수정동)
3rd row: 부산광역시 동구 자성로103번길 18 (범일동)
4th row: 부산광역시 동구 중앙대로226번길 3-7 (초량동)
5th row: 부산광역시 동구 중앙대로209번길 10-15 (초량동)

Value | Count | Frequency (%)
부산광역시 | 173 | 19.5%
동구 | 173 | 19.5%
초량동 | 108 | 12.2%
범일동 | 48 | 5.4%
중앙대로196번길 | 12 | 1.4%
수정동 | 12 | 1.4%
대영로243번길 | 11 | 1.2%
초량로13번길 | 10 | 1.1%
7 | 10 | 1.1%
중앙대로 | 10 | 1.1%
Other values (201) | 318 | 35.9%

Most occurring characters

ValueCountFrequency (%)
712
 
15.2%
348
 
7.4%
1 190
 
4.1%
179
 
3.8%
174
 
3.7%
174
 
3.7%
) 173
 
3.7%
173
 
3.7%
173
 
3.7%
173
 
3.7%
Other values (62) 2219
47.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2678 | 57.1%
Decimal Number | 821 | 17.5%
Space Separator | 712 | 15.2%
Close Punctuation | 173 | 3.7%
Open Punctuation | 173 | 3.7%
Dash Punctuation | 88 | 1.9%
Other Punctuation | 34 | 0.7%
Math Symbol | 7 | 0.1%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
348
13.0%
179
 
6.7%
174
 
6.5%
174
 
6.5%
173
 
6.5%
173
 
6.5%
173
 
6.5%
166
 
6.2%
140
 
5.2%
138
 
5.2%
Other values (44) 840
31.4%

Decimal Number
Value | Count | Frequency (%)
1 | 190 | 23.1%
2 | 140 | 17.1%
3 | 102 | 12.4%
9 | 70 | 8.5%
4 | 66 | 8.0%
6 | 59 | 7.2%
7 | 58 | 7.1%
0 | 54 | 6.6%
5 | 42 | 5.1%
8 | 40 | 4.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 1 | 50.0%
A | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 712 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 173 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 173 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 88 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 34 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2678 | 57.1%
Common | 2008 | 42.8%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
348
13.0%
179
 
6.7%
174
 
6.5%
174
 
6.5%
173
 
6.5%
173
 
6.5%
173
 
6.5%
166
 
6.2%
140
 
5.2%
138
 
5.2%
Other values (44) 840
31.4%

Common
Value | Count | Frequency (%)
(space) | 712 | 35.5%
1 | 190 | 9.5%
) | 173 | 8.6%
( | 173 | 8.6%
2 | 140 | 7.0%
3 | 102 | 5.1%
- | 88 | 4.4%
9 | 70 | 3.5%
4 | 66 | 3.3%
6 | 59 | 2.9%
Other values (6) | 235 | 11.7%

Latin
Value | Count | Frequency (%)
B | 1 | 50.0%
A | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2678 | 57.1%
ASCII | 2010 | 42.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 712 | 35.4%
1 | 190 | 9.5%
) | 173 | 8.6%
( | 173 | 8.6%
2 | 140 | 7.0%
3 | 102 | 5.1%
- | 88 | 4.4%
9 | 70 | 3.5%
4 | 66 | 3.3%
6 | 59 | 2.9%
Other values (8) | 237 | 11.8%

Hangul
ValueCountFrequency (%)
348
13.0%
179
 
6.7%
174
 
6.5%
174
 
6.5%
173
 
6.5%
173
 
6.5%
173
 
6.5%
166
 
6.2%
140
 
5.2%
138
 
5.2%
Other values (44) 840
31.4%

소재지전화 (location telephone number)
Text

MISSING

Distinct: 149
Distinct (%): 100.0%
Missing: 24
Missing (%): 13.9%
Memory size: 1.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1788
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 149
Unique (%): 100.0%

Sample

1st row: 051-321-1220
2nd row: 051-409-8888
3rd row: 051-441-0010
4th row: 051-441-0708
5th row: 051-441-5171

Value | Count | Frequency (%)
051-462-0089 | 1 | 0.7%
051-643-2710 | 1 | 0.7%
051-468-9434 | 1 | 0.7%
051-469-0688 | 1 | 0.7%
051-469-1918 | 1 | 0.7%
051-469-4274 | 1 | 0.7%
051-469-4747 | 1 | 0.7%
051-631-1504 | 1 | 0.7%
051-631-6866 | 1 | 0.7%
051-631-7780 | 1 | 0.7%
Other values (139) | 139 | 93.3%

Most occurring characters

Value | Count | Frequency (%)
- | 298 | 16.7%
0 | 234 | 13.1%
1 | 234 | 13.1%
5 | 231 | 12.9%
4 | 198 | 11.1%
6 | 192 | 10.7%
3 | 100 | 5.6%
7 | 93 | 5.2%
8 | 84 | 4.7%
2 | 80 | 4.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1490 | 83.3%
Dash Punctuation | 298 | 16.7%
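
These totals are what one would expect if every one of the 149 non-missing values has the 12-character form seen in the sample rows (ten digits and two dashes): 149 × 12 = 1788 characters, 149 × 10 = 1490 digits, 149 × 2 = 298 dashes. The sketch below checks that arithmetic against the five sample rows. The helper name `char_profile` is a hypothetical illustration.

```python
# The five sample rows shown for 소재지전화 in this report.
samples = [
    "051-321-1220",
    "051-409-8888",
    "051-441-0010",
    "051-441-0708",
    "051-441-5171",
]

def char_profile(values):
    """Count total characters, digits, and dashes across all values."""
    return {
        "total": sum(len(v) for v in values),
        "digits": sum(ch.isdigit() for v in values for ch in v),
        "dashes": sum(ch == "-" for v in values for ch in v),
    }

profile = char_profile(samples)

# Scale the per-value pattern (12 chars, 10 digits, 2 dashes) to 149 values.
n_values = 149
expected = {"total": 12 * n_values, "digits": 10 * n_values, "dashes": 2 * n_values}
print(profile, expected)
```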

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 234 | 15.7%
1 | 234 | 15.7%
5 | 231 | 15.5%
4 | 198 | 13.3%
6 | 192 | 12.9%
3 | 100 | 6.7%
7 | 93 | 6.2%
8 | 84 | 5.6%
2 | 80 | 5.4%
9 | 44 | 3.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 298 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1788 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 298 | 16.7%
0 | 234 | 13.1%
1 | 234 | 13.1%
5 | 231 | 12.9%
4 | 198 | 11.1%
6 | 192 | 10.7%
3 | 100 | 5.6%
7 | 93 | 5.2%
8 | 84 | 4.7%
2 | 80 | 4.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1788 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 298 | 16.7%
0 | 234 | 13.1%
1 | 234 | 13.1%
5 | 231 | 12.9%
4 | 198 | 11.1%
6 | 192 | 10.7%
3 | 100 | 5.6%
7 | 93 | 5.2%
8 | 84 | 4.7%
2 | 80 | 4.5%

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
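
As a rough illustration of what these two displays summarize, the sketch below computes a per-column missing count and a boolean nullity matrix for three rows taken from the Sample table of this report, with `None` standing in for `<NA>`. The function names are hypothetical; the report's own figures are produced by ydata-profiling, not by this code.

```python
COLUMNS = ["업종명", "업소명", "영업소 주소(도로명)", "소재지전화"]

# Rows 0, 169 and 171 of the Sample table; None represents <NA>.
rows = [
    ("숙박업(일반)", "삼오여관", "부산광역시 동구 범일일길 13-1 (범일동)", None),
    ("숙박업(일반)", "정원장", "부산광역시 동구 범일일길 12-2 (범일동)", "051-868-1130"),
    ("숙박업(일반)", "부산숙박닷컴 게스트하우스", "부산광역시 동구 초량중로 60 (초량동)", None),
]

def nullity_by_column(rows, columns):
    """Number of missing values in each column."""
    return {col: sum(row[i] is None for row in rows) for i, col in enumerate(columns)}

def nullity_matrix(rows):
    """One boolean per cell: True where a value is present, False where missing."""
    return [[value is not None for value in row] for row in rows]

missing = nullity_by_column(rows, COLUMNS)
matrix = nullity_matrix(rows)
print(missing)
```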

Sample

First rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화
0 | 숙박업(일반) | 삼오여관 | 부산광역시 동구 범일일길 13-1 (범일동) | <NA>
1 | 숙박업(일반) | 동명여인숙 | 부산광역시 동구 중앙대로389번길 8 (수정동) | <NA>
2 | 숙박업(일반) | 대성그린빌 | 부산광역시 동구 자성로103번길 18 (범일동) | <NA>
3 | 숙박업(일반) | 첵앤아웃게스트하우스 | 부산광역시 동구 중앙대로226번길 3-7 (초량동) | <NA>
4 | 숙박업(일반) | 호텔26(HOTEL26) | 부산광역시 동구 중앙대로209번길 10-15 (초량동) | <NA>
5 | 숙박업(생활) | 민트 파라다이스 | 부산광역시 동구 대영로239번길 20 (초량동) | <NA>
6 | 숙박업(생활) | 마리나레지던스호텔 | 부산광역시 동구 대영로243번길 73-5, 조이팰리스 2~7층 (초량동) | <NA>
7 | 숙박업(일반) | 원웨이 게스트하우스 | 부산광역시 동구 중앙대로196번길 6-3, 1~5층 (초량동) | <NA>
8 | 숙박업(생활) | 부산역 오름 레지던스 | 부산광역시 동구 중앙대로180번길 16-8, 지하1층일부,지상1층일부,2~20층 (초량동) | <NA>
9 | 숙박업(일반) | 스마일 모텔 | 부산광역시 동구 초량로13번길 54 (초량동) | <NA>

Last rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화
163 | 숙박업(일반) | 삼성장여관 | 부산광역시 동구 중앙대로 477-3 (범일동) | 051-646-8424
164 | 숙박업(일반) | 허브모텔 | 부산광역시 동구 범일로89번길 26 (범일동) | 051-647-0026
165 | 숙박업(일반) | 라온 | 부산광역시 동구 조방로38번길 8 (범일동) | 051-647-1776
166 | 숙박업(일반) | 하운드호텔 범일 | 부산광역시 동구 조방로34번길 5 (범일동) | 051-647-6829
167 | 숙박업(일반) | 브라운도트호텔 범일점 | 부산광역시 동구 중앙대로 528 (범일동) | 051-791-0770
168 | 숙박업(일반) | 더웨이호텔 | 부산광역시 동구 중앙대로209번길 12 (초량동) | 051-852-3600
169 | 숙박업(일반) | 정원장 | 부산광역시 동구 범일일길 12-2 (범일동) | 051-868-1130
170 | 숙박업(일반) | 라마다앙코르부산역호텔 | 부산광역시 동구 중앙대로196번길 10, 부산역라마다앙코르호텔 (초량동) | 051-922-0000
171 | 숙박업(일반) | 부산숙박닷컴 게스트하우스 | 부산광역시 동구 초량중로 60 (초량동) | <NA>
172 | 숙박업(일반) | 모찌호스텔(Mozzi hostel) | 부산광역시 동구 중앙대로196번길 16-12, 5층 (초량동) | <NA>