Overview

Dataset statistics

Number of variables: 4
Number of observations: 167
Missing cells: 24
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 32.8 B
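
The missing-cell percentage follows from the raw counts above. A minimal sketch of that arithmetic, assuming the percentage is missing cells divided by the total number of cells (variables times observations); the variable names are illustrative and not part of the report:

```python
# Counts taken from the dataset statistics in this report.
n_variables = 4
n_observations = 167
missing_cells = 24

# Assumption: every cell (row x column) counts toward the denominator.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 3.6% shown in the statistics.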

Variable types

Categorical: 1
Text: 3

Dataset

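Description: 부산광역시_동구_숙박업현황_20230117 (Busan Metropolitan City, Dong-gu, lodging business status, 2023-01-17)
Author: 부산광역시 동구 (Dong-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15028641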

Alerts

업종명 (business type) is highly imbalanced (60.5%) [Imbalance]
소재지전화 (site telephone number) has 24 (14.4%) missing values [Missing]
업소명 (business name) has unique values [Unique]
영업소 주소(도로명) (business address, road name) has unique values [Unique]
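
The 60.5% imbalance figure can be reproduced from the two class counts reported for 업종명 (154 and 13). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. This formula is consistent with the reported value, but `imbalance_score` is an illustrative name, not an API of the profiling tool.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    Assumption: this mirrors how the report derives its imbalance figure.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts for 업종명 as reported: 154 and 13.
score = imbalance_score([154, 13])
print(f"{score:.1%}")
```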

Reproduction

Analysis started: 2023-12-10 16:48:09.545398
Analysis finished: 2023-12-10 16:48:10.027865
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
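
The reported duration is the difference between the two timestamps above. A small sketch of that check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 16:48:09.545398")
finished = datetime.fromisoformat("2023-12-10 16:48:10.027865")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```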

Variables

업종명 (business type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

숙박업(일반) (lodging business, general): 154
숙박업(생활) (lodging business, residential): 13

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 154 | 92.2%
숙박업(생활) | 13 | 7.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot. 숙박업(일반): 154 (92.2%); 숙박업(생활): 13 (7.8%)]

업소명 (business name)
Text

UNIQUE

Distinct: 167
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 21
Median length: 19
Mean length: 5.6646707
Min length: 2

Characters and Unicode

Total characters: 946
Distinct characters: 224
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
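
As an illustration of those character properties, the sketch below counts the Unicode general category of each character in one business name from this dataset's sample ("Inn부산"). It uses Python's standard `unicodedata` module, which returns two-letter category codes: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Lo` is Other Letter (which includes Hangul syllables), `Nd` is Decimal Number, and `Zs` is Space Separator. The helper name `category_counts` is illustrative and is not how the report itself is generated.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# "Inn부산" is one of the business names in the sample rows.
counts = category_counts("Inn부산")
print(dict(counts))
```

Summing such counts over every value in a column would give per-category totals like those in the tables that follow. Script and block information is not exposed by `unicodedata` and would need another source.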

Unique

Unique: 167
Unique (%): 100.0%

Sample

1st row: 한길여인숙
2nd row: 진주여인숙
3rd row: 경남여인숙
4th row: 본역여인숙
5th row: 낙원여인숙

Value | Count | Frequency (%)
부산역 | 5 | 2.5%
게스트하우스 | 4 | 2.0%
모텔 | 4 | 2.0%
호텔 | 3 | 1.5%
탑모텔 | 2 | 1.0%
레지던스 | 2 | 1.0%
프린스 | 2 | 1.0%
장춘모텔 | 1 | 0.5%
힐모텔 | 1 | 0.5%
유호텔 | 1 | 0.5%
Other values (172) | 172 | 87.3%

Most occurring characters

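Value | Count | Frequency (%)
(Hangul, not preserved) | 87 | 9.2%
(Hangul, not preserved) | 47 | 5.0%
(Hangul, not preserved) | 47 | 5.0%
(Hangul, not preserved) | 39 | 4.1%
(Hangul, not preserved) | 37 | 3.9%
(Hangul, not preserved) | 31 | 3.3%
(space) | 30 | 3.2%
(Hangul, not preserved) | 25 | 2.6%
(Hangul, not preserved) | 19 | 2.0%
(Hangul, not preserved) | 18 | 1.9%
Other values (214) | 566 | 59.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 836 | 88.4%
Uppercase Letter | 33 | 3.5%
Space Separator | 30 | 3.2%
Lowercase Letter | 14 | 1.5%
Open Punctuation | 10 | 1.1%
Close Punctuation | 10 | 1.1%
Decimal Number | 10 | 1.1%
Other Punctuation | 2 | 0.2%
Letter Number | 1 | 0.1%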

Most frequent character per category

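Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 87 | 10.4%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 39 | 4.7%
(Hangul, not preserved) | 37 | 4.4%
(Hangul, not preserved) | 31 | 3.7%
(Hangul, not preserved) | 25 | 3.0%
(Hangul, not preserved) | 19 | 2.3%
(Hangul, not preserved) | 18 | 2.2%
(Hangul, not preserved) | 15 | 1.8%
Other values (180) | 471 | 56.3%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 15.2%
C | 4 | 12.1%
O | 4 | 12.1%
T | 4 | 12.1%
H | 3 | 9.1%
M | 2 | 6.1%
L | 2 | 6.1%
K | 2 | 6.1%
I | 1 | 3.0%
R | 1 | 3.0%
Other values (5) | 5 | 15.2%

Lowercase Letter
Value | Count | Frequency (%)
o | 4 | 28.6%
n | 2 | 14.3%
z | 2 | 14.3%
l | 1 | 7.1%
e | 1 | 7.1%
t | 1 | 7.1%
s | 1 | 7.1%
h | 1 | 7.1%
i | 1 | 7.1%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 30.0%
6 | 3 | 30.0%
7 | 2 | 20.0%
9 | 1 | 10.0%
3 | 1 | 10.0%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%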

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 836 | 88.4%
Common | 62 | 6.6%
Latin | 48 | 5.1%

Most frequent character per script

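Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 87 | 10.4%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 39 | 4.7%
(Hangul, not preserved) | 37 | 4.4%
(Hangul, not preserved) | 31 | 3.7%
(Hangul, not preserved) | 25 | 3.0%
(Hangul, not preserved) | 19 | 2.3%
(Hangul, not preserved) | 18 | 2.2%
(Hangul, not preserved) | 15 | 1.8%
Other values (180) | 471 | 56.3%

Latin
Value | Count | Frequency (%)
E | 5 | 10.4%
o | 4 | 8.3%
C | 4 | 8.3%
O | 4 | 8.3%
T | 4 | 8.3%
H | 3 | 6.2%
M | 2 | 4.2%
L | 2 | 4.2%
n | 2 | 4.2%
z | 2 | 4.2%
Other values (15) | 16 | 33.3%

Common
Value | Count | Frequency (%)
(space) | 30 | 48.4%
( | 10 | 16.1%
) | 10 | 16.1%
2 | 3 | 4.8%
6 | 3 | 4.8%
. | 2 | 3.2%
7 | 2 | 3.2%
9 | 1 | 1.6%
3 | 1 | 1.6%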

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 836 | 88.4%
ASCII | 109 | 11.5%
Number Forms | 1 | 0.1%

Most frequent character per block

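Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 87 | 10.4%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 47 | 5.6%
(Hangul, not preserved) | 39 | 4.7%
(Hangul, not preserved) | 37 | 4.4%
(Hangul, not preserved) | 31 | 3.7%
(Hangul, not preserved) | 25 | 3.0%
(Hangul, not preserved) | 19 | 2.3%
(Hangul, not preserved) | 18 | 2.2%
(Hangul, not preserved) | 15 | 1.8%
Other values (180) | 471 | 56.3%

ASCII
Value | Count | Frequency (%)
(space) | 30 | 27.5%
( | 10 | 9.2%
) | 10 | 9.2%
E | 5 | 4.6%
o | 4 | 3.7%
C | 4 | 3.7%
O | 4 | 3.7%
T | 4 | 3.7%
2 | 3 | 2.8%
H | 3 | 2.8%
Other values (23) | 32 | 29.4%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%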

영업소 주소(도로명) (business address, road name)
Text

UNIQUE

Distinct: 167
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 50
Median length: 36
Mean length: 27.401198
Min length: 20

Characters and Unicode

Total characters: 4576
Distinct characters: 76
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 167
Unique (%): 100.0%

Sample

1st row: 부산광역시 동구 초량중로80번길 3 (초량동)
2nd row: 부산광역시 동구 중앙대로229번길 10-2 (초량동)
3rd row: 부산광역시 동구 수정로 10 (수정동)
4th row: 부산광역시 동구 대영로243번길 52 (초량동)
5th row: 부산광역시 동구 자성공원로 1-3 (범일동)

Value | Count | Frequency (%)
부산광역시 | 167 | 19.4%
동구 | 167 | 19.4%
초량동 | 109 | 12.7%
범일동 | 41 | 4.8%
중앙대로196번길 | 12 | 1.4%
수정동 | 12 | 1.4%
중앙대로 | 11 | 1.3%
대영로243번길 | 11 | 1.3%
초량로13번길 | 10 | 1.2%
7 | 9 | 1.0%
Other values (200) | 311 | 36.2%

Most occurring characters

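Value | Count | Frequency (%)
(space) | 693 | 15.1%
(Hangul, not preserved) | 337 | 7.4%
1 | 189 | 4.1%
(Hangul, not preserved) | 174 | 3.8%
(Hangul, not preserved) | 168 | 3.7%
(Hangul, not preserved) | 168 | 3.7%
(Hangul, not preserved) | 168 | 3.7%
(Hangul, not preserved) | 167 | 3.6%
(Hangul, not preserved) | 167 | 3.6%
) | 167 | 3.6%
Other values (66) | 2178 | 47.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2608 | 57.0%
Decimal Number | 806 | 17.6%
Space Separator | 693 | 15.1%
Close Punctuation | 167 | 3.6%
Open Punctuation | 167 | 3.6%
Dash Punctuation | 86 | 1.9%
Other Punctuation | 37 | 0.8%
Math Symbol | 7 | 0.2%
Uppercase Letter | 5 | 0.1%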

Most frequent character per category

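Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 337 | 12.9%
(Hangul, not preserved) | 174 | 6.7%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 161 | 6.2%
(Hangul, not preserved) | 139 | 5.3%
(Hangul, not preserved) | 139 | 5.3%
Other values (47) | 820 | 31.4%

Decimal Number
Value | Count | Frequency (%)
1 | 189 | 23.4%
2 | 137 | 17.0%
3 | 97 | 12.0%
9 | 71 | 8.8%
4 | 62 | 7.7%
6 | 59 | 7.3%
7 | 58 | 7.2%
0 | 54 | 6.7%
5 | 42 | 5.2%
8 | 37 | 4.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 2 | 40.0%
B | 2 | 40.0%
G | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 693 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 167 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 167 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 86 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 37 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 7 | 100.0%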

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2608 | 57.0%
Common | 1963 | 42.9%
Latin | 5 | 0.1%

Most frequent character per script

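Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 337 | 12.9%
(Hangul, not preserved) | 174 | 6.7%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 161 | 6.2%
(Hangul, not preserved) | 139 | 5.3%
(Hangul, not preserved) | 139 | 5.3%
Other values (47) | 820 | 31.4%

Common
Value | Count | Frequency (%)
(space) | 693 | 35.3%
1 | 189 | 9.6%
) | 167 | 8.5%
( | 167 | 8.5%
2 | 137 | 7.0%
3 | 97 | 4.9%
- | 86 | 4.4%
9 | 71 | 3.6%
4 | 62 | 3.2%
6 | 59 | 3.0%
Other values (6) | 235 | 12.0%

Latin
Value | Count | Frequency (%)
A | 2 | 40.0%
B | 2 | 40.0%
G | 1 | 20.0%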

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2608 | 57.0%
ASCII | 1968 | 43.0%

Most frequent character per block

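ASCII
Value | Count | Frequency (%)
(space) | 693 | 35.2%
1 | 189 | 9.6%
) | 167 | 8.5%
( | 167 | 8.5%
2 | 137 | 7.0%
3 | 97 | 4.9%
- | 86 | 4.4%
9 | 71 | 3.6%
4 | 62 | 3.2%
6 | 59 | 3.0%
Other values (9) | 240 | 12.2%

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 337 | 12.9%
(Hangul, not preserved) | 174 | 6.7%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 168 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 167 | 6.4%
(Hangul, not preserved) | 161 | 6.2%
(Hangul, not preserved) | 139 | 5.3%
(Hangul, not preserved) | 139 | 5.3%
Other values (47) | 820 | 31.4%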

소재지전화 (site telephone number)
Text

MISSING

Distinct: 143
Distinct (%): 100.0%
Missing: 24
Missing (%): 14.4%
Memory size: 1.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.006993
Min length: 12

Characters and Unicode

Total characters: 1717
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
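
With only two character categories (digits and dashes) and lengths of 12 to 13, the phone values look uniformly formatted. The sketch below is a hedged format check, assuming a pattern of a 2 to 3 digit area code, a 3 to 4 digit exchange, and a 4 digit line number. That pattern is an assumption inferred from sample values such as 051-464-2372, not a rule stated by the dataset, and `looks_like_phone` is an illustrative helper.

```python
import re

# Assumed pattern: area code, exchange, line number, separated by dashes.
PHONE_RE = re.compile(r"\d{2,3}-\d{3,4}-\d{4}")

def looks_like_phone(value):
    """Return True when `value` is a string matching the assumed pattern."""
    return value is not None and PHONE_RE.fullmatch(value) is not None

# Two sample values from this column, plus a missing entry.
samples = ["051-464-2372", "051-467-0977", None]
print([looks_like_phone(v) for v in samples])
```

Missing entries are treated as non-matching here. Whether to flag or skip them is a design choice left to the reader.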

Unique

Unique: 143
Unique (%): 100.0%

Sample

1st row: 051-464-2372
2nd row: 051-467-0977
3rd row: 051-468-7329
4th row: 051-467-4277
5th row: 051-646-0929

Value | Count | Frequency (%)
051-464-2372 | 1 | 0.7%
051-467-2338 | 1 | 0.7%
051-645-2115 | 1 | 0.7%
051-442-5504 | 1 | 0.7%
051-467-1344 | 1 | 0.7%
051-468-5178 | 1 | 0.7%
051-466-5595 | 1 | 0.7%
051-442-2868 | 1 | 0.7%
051-462-1638 | 1 | 0.7%
051-632-6005 | 1 | 0.7%
Other values (133) | 133 | 93.0%

Most occurring characters

Value | Count | Frequency (%)
- | 286 | 16.7%
0 | 227 | 13.2%
5 | 224 | 13.0%
1 | 223 | 13.0%
4 | 193 | 11.2%
6 | 182 | 10.6%
3 | 94 | 5.5%
7 | 90 | 5.2%
2 | 79 | 4.6%
8 | 79 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1431 | 83.3%
Dash Punctuation | 286 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 227 | 15.9%
5 | 224 | 15.7%
1 | 223 | 15.6%
4 | 193 | 13.5%
6 | 182 | 12.7%
3 | 94 | 6.6%
7 | 90 | 6.3%
2 | 79 | 5.5%
8 | 79 | 5.5%
9 | 40 | 2.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 286 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1717 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 286 | 16.7%
0 | 227 | 13.2%
5 | 224 | 13.0%
1 | 223 | 13.0%
4 | 193 | 11.2%
6 | 182 | 10.6%
3 | 94 | 5.5%
7 | 90 | 5.2%
2 | 79 | 4.6%
8 | 79 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1717 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 286 | 16.7%
0 | 227 | 13.2%
5 | 224 | 13.0%
1 | 223 | 13.0%
4 | 193 | 11.2%
6 | 182 | 10.6%
3 | 94 | 5.5%
7 | 90 | 5.2%
2 | 79 | 4.6%
8 | 79 | 4.6%

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]
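
The per-column nullity that these figures visualize amounts to counting missing entries in each column. A minimal sketch over four rows from this dataset's sample (two of which have no phone number), using plain dictionaries with `None` for a missing cell. The helper name `missing_per_column` is illustrative and not part of the report.

```python
# Four rows from the sample table; None marks a missing phone number.
rows = [
    {"업소명": "낙원여인숙", "소재지전화": "051-646-0929"},
    {"업소명": "삼오여관", "소재지전화": None},
    {"업소명": "초량솔로몬 리빙텔", "소재지전화": None},
    {"업소명": "Inn부산", "소재지전화": "051-467-4875"},
]

def missing_per_column(records):
    """Count None values per column, assuming all records share the same keys."""
    columns = records[0].keys()
    return {col: sum(r[col] is None for r in records) for col in columns}

print(missing_per_column(rows))
```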

Sample

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화
0 | 숙박업(일반) | 한길여인숙 | 부산광역시 동구 초량중로80번길 3 (초량동) | 051-464-2372
1 | 숙박업(일반) | 진주여인숙 | 부산광역시 동구 중앙대로229번길 10-2 (초량동) | 051-467-0977
2 | 숙박업(일반) | 경남여인숙 | 부산광역시 동구 수정로 10 (수정동) | 051-468-7329
3 | 숙박업(일반) | 본역여인숙 | 부산광역시 동구 대영로243번길 52 (초량동) | 051-467-4277
4 | 숙박업(일반) | 낙원여인숙 | 부산광역시 동구 자성공원로 1-3 (범일동) | 051-646-0929
5 | 숙박업(일반) | 삼오여관 | 부산광역시 동구 범일일길 13-1 (범일동) | <NA>
6 | 숙박업(일반) | 초량솔로몬 리빙텔 | 부산광역시 동구 고관로33번길 9 (수정동) | <NA>
7 | 숙박업(일반) | Inn부산 | 부산광역시 동구 대영로239번길 45 (초량동) | 051-467-4875
8 | 숙박업(일반) | 황금여인숙 | 부산광역시 동구 초량로13번길 60 (초량동) | 051-467-8751
9 | 숙박업(일반) | 물레 | 부산광역시 동구 대영로243번길 10 (초량동) | 051-468-5994

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화
157 | 숙박업(생활) | 민트 파라다이스 | 부산광역시 동구 대영로239번길 20 (초량동) | <NA>
158 | 숙박업(생활) | 파밀리에게스트하우스 | 부산광역시 동구 중앙대로214번길 3-4, 4,5층 (초량동) | 051-461-0080
159 | 숙박업(생활) | 소호스텔 | 부산광역시 동구 중앙대로226번길 3-2 (초량동, 4-6층, 7층 일부) | 051-465-9030
160 | 숙박업(생활) | 싱글싱글 | 부산광역시 동구 초량중로 18-1, 1-4층 (초량동) | 051-468-5978
161 | 숙박업(생활) | 온팍스 레지던스 | 부산광역시 동구 중앙대로196번길 16-12, 3층 (초량동) | 051-468-1537
162 | 숙박업(생활) | 워라밸 게스트하우스 | 부산광역시 동구 초량중로 11, 2층 (초량동) | 051-463-1555
163 | 숙박업(생활) | 부산역 오름 레지던스 | 부산광역시 동구 중앙대로180번길 16-8, 지하1층일부,지상1층일부,2~20층 (초량동) | 051-231-7772
164 | 숙박업(생활) | 마리나레지던스호텔 | 부산광역시 동구 대영로243번길 73-5, 조이팰리스 2~7층 (초량동) | <NA>
165 | 숙박업(생활) | 르컬렉티브 부산역 | 부산광역시 동구 충장대로 160, A,B동 일부호 (초량동) | <NA>
166 | 숙박업(생활) | 스테이G7 | 부산광역시 동구 충장대로 160, 협성마리나G7 A,B동 일부호 (초량동) | <NA>