Overview

Dataset statistics

Number of variables: 4
Number of observations: 177
Missing cells: 14
Missing cells (%): 2.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.7 KiB
Average record size in memory: 32.7 B
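
The missing-value percentages in this report can be checked from the raw counts. The following is a minimal sketch, assuming the overall figure is taken over every cell (observations times variables) and the per-column figure over the observations only; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 4
n_observations = 177
missing_cells = 14

# Assumption: the overall percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# All 14 missing cells belong to one column (소재지전화), so the per-column
# percentage is computed over the number of observations instead.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing overall")
print(f"{column_missing_pct:.1f}% missing in the affected column")
```

Under these assumptions the two results round to the 2.0% and 7.9% shown in the report.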

Variable types

Categorical: 1
Text: 3

Dataset

Description: 부산광역시_동구_숙박업현황_20200121
Author: 부산광역시 동구
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15028641

Alerts

Imbalance: 업종명 is highly imbalanced (66.4%)
Missing: 소재지전화 has 14 (7.9%) missing values
Unique: 업소명 has unique values
Unique: 업소소재지(도로명) has unique values
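
The 66.4% imbalance figure for 업종명 can be reproduced from the two category counts (166 and 11). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum for the number of classes; that assumption matches the reported value, but the function name `imbalance_score` is hypothetical and this is not presented as the profiler's own code.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts of 숙박업(일반) and 숙박업(생활) taken from this report.
score = imbalance_score([166, 11])
print(f"{score:.1%}")
```

A balanced column, for example `imbalance_score([50, 50])`, scores 0.0 under the same definition.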

Reproduction

Analysis started: 2023-12-10 16:48:24.127194
Analysis finished: 2023-12-10 16:48:24.553518
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

숙박업(일반): 166
숙박업(생활): 11

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 166 | 93.8%
숙박업(생활) | 11 | 6.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot. 숙박업(일반): 166 (93.8%); 숙박업(생활): 11 (6.2%)]

업소명
Text

UNIQUE 

Distinct: 177
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 39
Median length: 21
Mean length: 5.7062147
Min length: 2

Characters and Unicode

Total characters: 1010
Distinct characters: 226
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
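
As an illustration of how such character properties can be read per code point, here is a small sketch using Python's standard `unicodedata` module. It counts general categories only (scripts and blocks are not exposed by that module), and the helper name `category_counts` is hypothetical; the input string is one of the 업소명 values from the sample rows.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# 'Lu' = Uppercase Letter, 'Ll' = Lowercase Letter, 'Lo' = Other Letter
# (Hangul syllables fall under 'Lo').
counts = category_counts("Inn부산")
print(counts)
```

Summing such counters over every value of a column gives a category table of the kind shown in this report.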

Unique

Unique: 177
Unique (%): 100.0%

Sample

1st row: 한길여인숙
2nd row: 진주여인숙
3rd row: 경남여인숙
4th row: 본역여인숙
5th row: 낙원여인숙

Value | Count | Frequency (%)
부산역 | 5 | 2.4%
게스트하우스 | 4 | 1.9%
레지던스 | 2 | 1.0%
탑모텔 | 2 | 1.0%
(value missing in source) | 2 | 1.0%
모텔 | 2 | 1.0%
호텔 | 2 | 1.0%
브라운 | 1 | 0.5%
삼일모텔 | 1 | 0.5%
미라벨모텔 | 1 | 0.5%
Other values (184) | 184 | 89.3%

Most occurring characters

ValueCountFrequency (%)
86
 
8.5%
52
 
5.1%
48
 
4.8%
43
 
4.3%
38
 
3.8%
34
 
3.4%
29
 
2.9%
29
 
2.9%
19
 
1.9%
19
 
1.9%
Other values (216) 613
60.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 876 | 86.7%
Uppercase Letter | 34 | 3.4%
Lowercase Letter | 32 | 3.2%
Space Separator | 29 | 2.9%
Open Punctuation | 12 | 1.2%
Close Punctuation | 12 | 1.2%
Decimal Number | 11 | 1.1%
Other Punctuation | 2 | 0.2%
Letter Number | 1 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
86
 
9.8%
52
 
5.9%
48
 
5.5%
43
 
4.9%
38
 
4.3%
34
 
3.9%
29
 
3.3%
19
 
2.2%
19
 
2.2%
18
 
2.1%
Other values (179) 490
55.9%
Uppercase Letter
ValueCountFrequency (%)
O 4
11.8%
C 4
11.8%
H 4
11.8%
T 3
8.8%
B 3
8.8%
E 3
8.8%
L 2
 
5.9%
S 2
 
5.9%
G 2
 
5.9%
K 2
 
5.9%
Other values (5) 5
14.7%
Lowercase Letter
ValueCountFrequency (%)
o 6
18.8%
s 4
12.5%
t 4
12.5%
n 4
12.5%
e 3
9.4%
u 3
9.4%
i 2
 
6.2%
z 2
 
6.2%
a 2
 
6.2%
l 1
 
3.1%
Decimal Number
ValueCountFrequency (%)
6 4
36.4%
3 2
18.2%
2 2
18.2%
9 2
18.2%
7 1
 
9.1%
Space Separator
ValueCountFrequency (%)
29
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 876 | 86.7%
Common | 67 | 6.6%
Latin | 67 | 6.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
86
 
9.8%
52
 
5.9%
48
 
5.5%
43
 
4.9%
38
 
4.3%
34
 
3.9%
29
 
3.3%
19
 
2.2%
19
 
2.2%
18
 
2.1%
Other values (179) 490
55.9%
Latin
ValueCountFrequency (%)
o 6
 
9.0%
O 4
 
6.0%
s 4
 
6.0%
C 4
 
6.0%
t 4
 
6.0%
n 4
 
6.0%
H 4
 
6.0%
e 3
 
4.5%
u 3
 
4.5%
T 3
 
4.5%
Other values (17) 28
41.8%
Common
ValueCountFrequency (%)
29
43.3%
( 12
17.9%
) 12
17.9%
6 4
 
6.0%
3 2
 
3.0%
2 2
 
3.0%
9 2
 
3.0%
. 2
 
3.0%
- 1
 
1.5%
7 1
 
1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 876 | 86.7%
ASCII | 133 | 13.2%
Number Forms | 1 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
86
 
9.8%
52
 
5.9%
48
 
5.5%
43
 
4.9%
38
 
4.3%
34
 
3.9%
29
 
3.3%
19
 
2.2%
19
 
2.2%
18
 
2.1%
Other values (179) 490
55.9%
ASCII
ValueCountFrequency (%)
29
21.8%
( 12
 
9.0%
) 12
 
9.0%
o 6
 
4.5%
6 4
 
3.0%
O 4
 
3.0%
s 4
 
3.0%
C 4
 
3.0%
t 4
 
3.0%
n 4
 
3.0%
Other values (26) 50
37.6%
Number Forms
ValueCountFrequency (%)
1
100.0%

업소소재지(도로명)
Text

UNIQUE

Distinct: 177
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 50
Median length: 36
Mean length: 26.943503
Min length: 20

Characters and Unicode

Total characters: 4769
Distinct characters: 66
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 177
Unique (%): 100.0%

Sample

1st row: 부산광역시 동구 초량중로80번길 3 (초량동)
2nd row: 부산광역시 동구 중앙대로229번길 10-2 (초량동)
3rd row: 부산광역시 동구 수정로 10 (수정동)
4th row: 부산광역시 동구 대영로243번길 52 (초량동)
5th row: 부산광역시 동구 자성공원로 1-3 (범일동)

Value | Count | Frequency (%)
부산광역시 | 177 | 19.6%
동구 | 177 | 19.6%
초량동 | 108 | 12.0%
범일동 | 50 | 5.5%
수정동 | 14 | 1.6%
중앙대로196번길 | 12 | 1.3%
7 | 10 | 1.1%
대영로243번길 | 10 | 1.1%
중앙대로 | 10 | 1.1%
초량로13번길 | 10 | 1.1%
Other values (202) | 324 | 35.9%

Most occurring characters

ValueCountFrequency (%)
725
 
15.2%
355
 
7.4%
1 193
 
4.0%
182
 
3.8%
178
 
3.7%
178
 
3.7%
177
 
3.7%
177
 
3.7%
177
 
3.7%
) 177
 
3.7%
Other values (56) 2250
47.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2728 | 57.2%
Decimal Number | 834 | 17.5%
Space Separator | 725 | 15.2%
Close Punctuation | 177 | 3.7%
Open Punctuation | 177 | 3.7%
Dash Punctuation | 89 | 1.9%
Other Punctuation | 33 | 0.7%
Math Symbol | 6 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
355
13.0%
182
 
6.7%
178
 
6.5%
178
 
6.5%
177
 
6.5%
177
 
6.5%
177
 
6.5%
170
 
6.2%
143
 
5.2%
139
 
5.1%
Other values (40) 852
31.2%
Decimal Number
ValueCountFrequency (%)
1 193
23.1%
2 140
16.8%
3 103
12.4%
9 74
 
8.9%
4 66
 
7.9%
6 60
 
7.2%
7 58
 
7.0%
0 54
 
6.5%
5 45
 
5.4%
8 41
 
4.9%
Space Separator
ValueCountFrequency (%)
725
100.0%
Close Punctuation
ValueCountFrequency (%)
) 177
100.0%
Open Punctuation
ValueCountFrequency (%)
( 177
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 89
100.0%
Other Punctuation
ValueCountFrequency (%)
, 33
100.0%
Math Symbol
ValueCountFrequency (%)
~ 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2728 | 57.2%
Common | 2041 | 42.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
355
13.0%
182
 
6.7%
178
 
6.5%
178
 
6.5%
177
 
6.5%
177
 
6.5%
177
 
6.5%
170
 
6.2%
143
 
5.2%
139
 
5.1%
Other values (40) 852
31.2%
Common
ValueCountFrequency (%)
725
35.5%
1 193
 
9.5%
) 177
 
8.7%
( 177
 
8.7%
2 140
 
6.9%
3 103
 
5.0%
- 89
 
4.4%
9 74
 
3.6%
4 66
 
3.2%
6 60
 
2.9%
Other values (6) 237
 
11.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2728 | 57.2%
ASCII | 2041 | 42.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
725
35.5%
1 193
 
9.5%
) 177
 
8.7%
( 177
 
8.7%
2 140
 
6.9%
3 103
 
5.0%
- 89
 
4.4%
9 74
 
3.6%
4 66
 
3.2%
6 60
 
2.9%
Other values (6) 237
 
11.6%
Hangul
ValueCountFrequency (%)
355
13.0%
182
 
6.7%
178
 
6.5%
178
 
6.5%
177
 
6.5%
177
 
6.5%
177
 
6.5%
170
 
6.2%
143
 
5.2%
139
 
5.1%
Other values (40) 852
31.2%

소재지전화
Text

MISSING 

Distinct: 163
Distinct (%): 100.0%
Missing: 14
Missing (%): 7.9%
Memory size: 1.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1956
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 163
Unique (%): 100.0%

Sample

1st row: 051-464-2372
2nd row: 051-467-0977
3rd row: 051-468-7329
4th row: 051-467-4277
5th row: 051-646-0929

Value | Count | Frequency (%)
051-467-2313 | 1 | 0.6%
051-645-6717 | 1 | 0.6%
051-466-5595 | 1 | 0.6%
051-441-8409 | 1 | 0.6%
051-466-9597 | 1 | 0.6%
051-463-7365 | 1 | 0.6%
051-464-6121 | 1 | 0.6%
051-442-5504 | 1 | 0.6%
051-467-1344 | 1 | 0.6%
051-468-5178 | 1 | 0.6%
Other values (153) | 153 | 93.9%

Most occurring characters

Value | Count | Frequency (%)
- | 326 | 16.7%
0 | 256 | 13.1%
5 | 255 | 13.0%
1 | 254 | 13.0%
4 | 217 | 11.1%
6 | 216 | 11.0%
7 | 104 | 5.3%
3 | 104 | 5.3%
8 | 93 | 4.8%
2 | 86 | 4.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1630 | 83.3%
Dash Punctuation | 326 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 256
15.7%
5 255
15.6%
1 254
15.6%
4 217
13.3%
6 216
13.3%
7 104
6.4%
3 104
6.4%
8 93
 
5.7%
2 86
 
5.3%
9 45
 
2.8%
Dash Punctuation
ValueCountFrequency (%)
- 326
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1956
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 326
16.7%
0 256
13.1%
5 255
13.0%
1 254
13.0%
4 217
11.1%
6 216
11.0%
7 104
 
5.3%
3 104
 
5.3%
8 93
 
4.8%
2 86
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1956
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 326
16.7%
0 256
13.1%
5 255
13.0%
1 254
13.0%
4 217
11.1%
6 216
11.0%
7 104
 
5.3%
3 104
 
5.3%
8 93
 
4.8%
2 86
 
4.4%

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
0 | 숙박업(일반) | 한길여인숙 | 부산광역시 동구 초량중로80번길 3 (초량동) | 051-464-2372
1 | 숙박업(일반) | 진주여인숙 | 부산광역시 동구 중앙대로229번길 10-2 (초량동) | 051-467-0977
2 | 숙박업(일반) | 경남여인숙 | 부산광역시 동구 수정로 10 (수정동) | 051-468-7329
3 | 숙박업(일반) | 본역여인숙 | 부산광역시 동구 대영로243번길 52 (초량동) | 051-467-4277
4 | 숙박업(일반) | 낙원여인숙 | 부산광역시 동구 자성공원로 1-3 (범일동) | 051-646-0929
5 | 숙박업(일반) | 삼오여관 | 부산광역시 동구 범일일길 13-1 (범일동) | 051-646-5626
6 | 숙박업(일반) | 초량솔로몬 리빙텔 | 부산광역시 동구 고관로33번길 9 (수정동) | 051-467-7660
7 | 숙박업(일반) | Inn부산 | 부산광역시 동구 대영로239번길 45 (초량동) | 051-467-4875
8 | 숙박업(일반) | 황금여인숙 | 부산광역시 동구 초량로13번길 60 (초량동) | 051-467-8751
9 | 숙박업(일반) | 물레 | 부산광역시 동구 대영로243번길 10 (초량동) | 051-468-5994

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
167 | 숙박업(생활) | 리젠시빌 | 부산광역시 동구 조방로49번길 23-11 (범일동) | 051-635-1818
168 | 숙박업(생활) | 영광숙박 | 부산광역시 동구 범곡로28번길 9 (범일동) | 051-644-3255
169 | 숙박업(생활) | 민트 파라다이스 | 부산광역시 동구 대영로239번길 20 (초량동) | <NA>
170 | 숙박업(생활) | 파밀리에게스트하우스 | 부산광역시 동구 중앙대로214번길 3-4, 4,5층 (초량동) | 051-461-0080
171 | 숙박업(생활) | 케이게스트하우스부산역(K-GuestHouse Busan Station) | 부산광역시 동구 중앙대로236번길 7-5, 2,3층 (초량동) | <NA>
172 | 숙박업(생활) | 소호스텔 | 부산광역시 동구 중앙대로226번길 3-2 (초량동, 4-6층, 7층 일부) | 051-465-9030
173 | 숙박업(생활) | 싱글싱글 | 부산광역시 동구 초량중로 18-1, 1-4층 (초량동) | 051-468-5978
174 | 숙박업(생활) | 온팍스 레지던스 | 부산광역시 동구 중앙대로196번길 16-12, 3~4층 (초량동) | 051-468-1537
175 | 숙박업(생활) | 워라밸 게스트하우스 | 부산광역시 동구 초량중로 11, 2층 (초량동) | 051-463-1555
176 | 숙박업(생활) | 부산역 오름 레지던스 | 부산광역시 동구 중앙대로180번길 16-8, 지하1층일부,지상1층일부,2~20층 (초량동) | <NA>