Overview

Dataset statistics

Number of variables: 5
Number of observations: 171
Missing cells: 6
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 40.8 B
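The missing-cell percentage follows from the other figures in this list. A minimal sketch of the arithmetic, assuming the percentage is taken over all cells (variables times observations):

```python
# Figures copied from the Dataset statistics above.
n_variables = 5
n_observations = 171
missing_cells = 6

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells, in percent

print(f"{total_cells} cells, {missing_pct:.1f}% missing")  # 855 cells, 0.7% missing
```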

Variable types

Categorical: 1
Text: 4

Dataset

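Description: Daegu Metropolitan City Dong-gu public hygiene information, 2023-04-06 (대구광역시_동구_공중위생정보_20230406)
Author: Dong-gu, Daegu Metropolitan City (대구광역시 동구)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=3055360&dataSetDetailId=30553601bd64ce4b8127&provdMethod=FILE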

Alerts

업종명 is highly imbalanced (80.9%) [Imbalance]
소재지전화 has 6 (3.5%) missing values [Missing]
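The 80.9% imbalance figure can be reproduced from the two class counts of 업종명 (166 and 5) with an entropy-based score. The sketch below uses one minus the normalized Shannon entropy. That the profiler uses exactly this formula is an assumption, and the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 업종명 as listed in this report.
score = imbalance_score([166, 5])
print(f"{score:.1%}")  # 80.9%
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one value approaches 1.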

Reproduction

Analysis started: 2024-04-19 05:36:45.426522
Analysis finished: 2024-04-19 05:36:45.780000
Duration: 0.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
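A report like this one could be regenerated roughly as follows. This is a sketch, not the original run: the CSV path, the `cp949` encoding, the report title, and the function name `build_report` are all assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (paths and encoding are assumptions)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # dtype=str keeps phone numbers such as 053-984-7667 as text.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)
    report = ProfileReport(df, title="Dong-gu lodging businesses")  # hypothetical title
    report.to_file(html_path)
    return report
```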

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
숙박업(일반): 166
숙박업(생활): 5

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%
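Distinct and Unique measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A minimal sketch using the value counts reported for 업종명 (the variable names are illustrative):

```python
from collections import Counter

# Value counts for 업종명 as reported above.
value_counts = Counter({"숙박업(일반)": 166, "숙박업(생활)": 5})
n_rows = sum(value_counts.values())

distinct = len(value_counts)                              # number of different values
unique = sum(1 for c in value_counts.values() if c == 1)  # values seen exactly once

print(distinct, f"{100 * distinct / n_rows:.1f}%", unique)  # 2 1.2% 0
```

Both values occur more than once, so the column has two distinct values but no unique ones.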

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 166 | 97.1%
숙박업(생활) | 5 | 2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Pie chart of the common values: 숙박업(일반) 166 (97.1%), 숙박업(생활) 5 (2.9%).

업소명
Text

Distinct: 168
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 34
Median length: 25
Mean length: 5.7953216
Min length: 1

Characters and Unicode

Total characters: 991
Distinct characters: 237
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
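As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one value from this report. The helper name `category_counts` is illustrative, and the report's labels correspond to the two-letter codes (`Lo` is Other Letter, `Ll` is Lowercase Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# "엠(m)모텔" is one of the values listed in this report.
counts = category_counts("엠(m)모텔")
print(counts)
```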

Unique

Unique: 165
Unique (%): 96.5%

Sample

1st row: 장원장여관
2nd row: 신화여관
3rd row: 영빈모텔
4th row: 구일여인숙
5th row: 동덕여인숙
Value | Count | Frequency (%)
호텔 | 5 | 2.5%
hotel | 4 | 2.0%
엠(m)모텔 | 2 | 1.0%
동대구역점 | 2 | 1.0%
하운드 | 2 | 1.0%
22 | 2 | 1.0%
대구 | 2 | 1.0%
앤모텔 | 2 | 1.0%
황금모텔 | 2 | 1.0%
체리쉬호텔 | 1 | 0.5%
Other values (175) | 175 | 87.9%

Most occurring characters

ValueCountFrequency (%)
108
 
10.9%
64
 
6.5%
47
 
4.7%
33
 
3.3%
28
 
2.8%
26
 
2.6%
25
 
2.5%
25
 
2.5%
22
 
2.2%
( 20
 
2.0%
Other values (227) 593
59.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 791 | 79.8%
Uppercase Letter | 78 | 7.9%
Lowercase Letter | 37 | 3.7%
Space Separator | 28 | 2.8%
Open Punctuation | 20 | 2.0%
Close Punctuation | 20 | 2.0%
Decimal Number | 16 | 1.6%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
108
 
13.7%
64
 
8.1%
47
 
5.9%
33
 
4.2%
26
 
3.3%
25
 
3.2%
25
 
3.2%
22
 
2.8%
13
 
1.6%
10
 
1.3%
Other values (184) 418
52.8%
Uppercase Letter
ValueCountFrequency (%)
T 9
11.5%
O 8
10.3%
S 7
9.0%
A 7
9.0%
H 6
 
7.7%
L 6
 
7.7%
M 6
 
7.7%
Y 5
 
6.4%
E 5
 
6.4%
K 4
 
5.1%
Other values (10) 15
19.2%
Lowercase Letter
ValueCountFrequency (%)
e 5
13.5%
o 5
13.5%
t 4
10.8%
l 4
10.8%
n 3
8.1%
i 3
8.1%
g 2
 
5.4%
m 2
 
5.4%
d 2
 
5.4%
u 2
 
5.4%
Other values (5) 5
13.5%
Decimal Number
ValueCountFrequency (%)
2 11
68.8%
0 2
 
12.5%
1 2
 
12.5%
7 1
 
6.2%
Space Separator
ValueCountFrequency (%)
28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 791 | 79.8%
Latin | 115 | 11.6%
Common | 85 | 8.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
108
 
13.7%
64
 
8.1%
47
 
5.9%
33
 
4.2%
26
 
3.3%
25
 
3.2%
25
 
3.2%
22
 
2.8%
13
 
1.6%
10
 
1.3%
Other values (184) 418
52.8%
Latin
ValueCountFrequency (%)
T 9
 
7.8%
O 8
 
7.0%
S 7
 
6.1%
A 7
 
6.1%
H 6
 
5.2%
L 6
 
5.2%
M 6
 
5.2%
e 5
 
4.3%
Y 5
 
4.3%
o 5
 
4.3%
Other values (25) 51
44.3%
Common
ValueCountFrequency (%)
28
32.9%
( 20
23.5%
) 20
23.5%
2 11
 
12.9%
0 2
 
2.4%
1 2
 
2.4%
7 1
 
1.2%
. 1
 
1.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 791 | 79.8%
ASCII | 200 | 20.2%
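A block is a contiguous range of code points, so a block tally can be sketched by range checks on `ord()`. The helper below is a simplification that covers only the two blocks seen in this report. The ranges are the standard Basic Latin and Hangul Syllables blocks, and the assumption is that the report's "ASCII" and "Hangul" labels correspond to them.

```python
def block_of(ch):
    """Classify a character into the two blocks that occur in this report (simplified)."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"    # Basic Latin, U+0000..U+007F
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"   # Hangul Syllables, U+AC00..U+D7AF
    return "Other"

# One business name taken from the sample rows of this report.
name = "제이비관광호텔(JB TOURIST HOTEL)"
tally = {}
for ch in name:
    block = block_of(ch)
    tally[block] = tally.get(block, 0) + 1
print(tally)  # {'Hangul': 7, 'ASCII': 18}
```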

Most frequent character per block

Hangul
ValueCountFrequency (%)
108
 
13.7%
64
 
8.1%
47
 
5.9%
33
 
4.2%
26
 
3.3%
25
 
3.2%
25
 
3.2%
22
 
2.8%
13
 
1.6%
10
 
1.3%
Other values (184) 418
52.8%
ASCII
ValueCountFrequency (%)
28
 
14.0%
( 20
 
10.0%
) 20
 
10.0%
2 11
 
5.5%
T 9
 
4.5%
O 8
 
4.0%
S 7
 
3.5%
A 7
 
3.5%
H 6
 
3.0%
L 6
 
3.0%
Other values (33) 78
39.0%

영업소 주소(도로명)
Text

Distinct: 170
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 60
Median length: 34
Mean length: 24.836257
Min length: 20

Characters and Unicode

Total characters: 4247
Distinct characters: 105
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
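The length and character figures for this variable are linked: with no missing values, the mean length equals the total character count divided by the number of observations. A short check using the reported numbers:

```python
n_observations = 171       # from Dataset statistics
total_characters = 4247    # from Characters and Unicode for this variable

mean_length = total_characters / n_observations
print(f"{mean_length:.6f}")  # 24.836257
```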

Unique

Unique: 169
Unique (%): 98.8%

Sample

1st row: 대구광역시 동구 아양로 294 (입석동)
2nd row: 대구광역시 동구 아양로38길 5-2 (효목동)
3rd row: 대구광역시 동구 동부로32길 28 (신천동)
4th row: 대구광역시 동구 송라로32길 15-1 (신암동)
5th row: 대구광역시 동구 아양로 75-2 (신암동)
Value | Count | Frequency (%)
대구광역시 | 171 | 19.4%
동구 | 171 | 19.4%
신천동 | 51 | 5.8%
신암동 | 27 | 3.1%
효목동 | 23 | 2.6%
동부로26길 | 16 | 1.8%
용수동 | 13 | 1.5%
팔공산로185길 | 12 | 1.4%
동부로30길 | 10 | 1.1%
신암남로 | 10 | 1.1%
Other values (212) | 379 | 42.9%
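The table above counts words rather than whole addresses. A sketch of that kind of tally on the five sample rows, assuming words are split on whitespace and stripped of surrounding parentheses and commas (the profiler's actual tokenization may differ):

```python
from collections import Counter

# First five road-name addresses from the Sample list above.
addresses = [
    "대구광역시 동구 아양로 294 (입석동)",
    "대구광역시 동구 아양로38길 5-2 (효목동)",
    "대구광역시 동구 동부로32길 28 (신천동)",
    "대구광역시 동구 송라로32길 15-1 (신암동)",
    "대구광역시 동구 아양로 75-2 (신암동)",
]

# Assumption: split on whitespace, then trim surrounding punctuation.
words = Counter(
    token.strip("(),")
    for address in addresses
    for token in address.split()
)
print(words.most_common(3))
```

The city and district names appear in every row, which is why 대구광역시 and 동구 each reach 171 in the full table.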

Most occurring characters

ValueCountFrequency (%)
713
16.8%
427
 
10.1%
350
 
8.2%
185
 
4.4%
( 171
 
4.0%
171
 
4.0%
171
 
4.0%
171
 
4.0%
) 171
 
4.0%
171
 
4.0%
Other values (95) 1546
36.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2447 | 57.6%
Space Separator | 713 | 16.8%
Decimal Number | 670 | 15.8%
Open Punctuation | 171 | 4.0%
Close Punctuation | 171 | 4.0%
Dash Punctuation | 52 | 1.2%
Other Punctuation | 19 | 0.4%
Math Symbol | 4 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
427
17.4%
350
14.3%
185
 
7.6%
171
 
7.0%
171
 
7.0%
171
 
7.0%
171
 
7.0%
104
 
4.3%
99
 
4.0%
54
 
2.2%
Other values (79) 544
22.2%
Decimal Number
ValueCountFrequency (%)
1 137
20.4%
2 123
18.4%
3 84
12.5%
6 67
10.0%
5 61
9.1%
8 56
8.4%
4 46
 
6.9%
0 43
 
6.4%
7 36
 
5.4%
9 17
 
2.5%
Space Separator
ValueCountFrequency (%)
713
100.0%
Open Punctuation
ValueCountFrequency (%)
( 171
100.0%
Close Punctuation
ValueCountFrequency (%)
) 171
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 52
100.0%
Other Punctuation
ValueCountFrequency (%)
, 19
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2447 | 57.6%
Common | 1800 | 42.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
427
17.4%
350
14.3%
185
 
7.6%
171
 
7.0%
171
 
7.0%
171
 
7.0%
171
 
7.0%
104
 
4.3%
99
 
4.0%
54
 
2.2%
Other values (79) 544
22.2%
Common
ValueCountFrequency (%)
713
39.6%
( 171
 
9.5%
) 171
 
9.5%
1 137
 
7.6%
2 123
 
6.8%
3 84
 
4.7%
6 67
 
3.7%
5 61
 
3.4%
8 56
 
3.1%
- 52
 
2.9%
Other values (6) 165
 
9.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2447 | 57.6%
ASCII | 1800 | 42.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
713
39.6%
( 171
 
9.5%
) 171
 
9.5%
1 137
 
7.6%
2 123
 
6.8%
3 84
 
4.7%
6 67
 
3.7%
5 61
 
3.4%
8 56
 
3.1%
- 52
 
2.9%
Other values (6) 165
 
9.2%
Hangul
ValueCountFrequency (%)
427
17.4%
350
14.3%
185
 
7.6%
171
 
7.0%
171
 
7.0%
171
 
7.0%
171
 
7.0%
104
 
4.3%
99
 
4.0%
54
 
2.2%
Other values (79) 544
22.2%

영업소 주소(지번)
Text

Distinct: 170
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 42
Median length: 27
Mean length: 19.491228
Min length: 17

Characters and Unicode

Total characters: 3333
Distinct characters: 65
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 169
Unique (%): 98.8%

Sample

1st row: 대구광역시 동구 입석동 999-1
2nd row: 대구광역시 동구 효목동 960-22
3rd row: 대구광역시 동구 신천동 382-4
4th row: 대구광역시 동구 신암동 165-7
5th row: 대구광역시 동구 신암동 603-165
Value | Count | Frequency (%)
대구광역시 | 171 | 24.3%
동구 | 171 | 24.3%
신천동 | 53 | 7.5%
신암동 | 29 | 4.1%
효목동 | 24 | 3.4%
용수동 | 13 | 1.8%
상매동 | 9 | 1.3%
입석동 | 7 | 1.0%
중대동 | 6 | 0.9%
지묘동 | 5 | 0.7%
Other values (193) | 215 | 30.6%

Most occurring characters

ValueCountFrequency (%)
702
21.1%
343
 
10.3%
343
 
10.3%
178
 
5.3%
171
 
5.1%
171
 
5.1%
171
 
5.1%
- 157
 
4.7%
3 117
 
3.5%
1 110
 
3.3%
Other values (55) 870
26.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1747 | 52.4%
Decimal Number | 725 | 21.8%
Space Separator | 702 | 21.1%
Dash Punctuation | 157 | 4.7%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
343
19.6%
343
19.6%
178
10.2%
171
9.8%
171
9.8%
171
9.8%
86
 
4.9%
53
 
3.0%
29
 
1.7%
24
 
1.4%
Other values (42) 178
10.2%
Decimal Number
ValueCountFrequency (%)
3 117
16.1%
1 110
15.2%
2 100
13.8%
6 70
9.7%
0 67
9.2%
5 65
9.0%
9 57
7.9%
4 54
7.4%
7 52
7.2%
8 33
 
4.6%
Space Separator
ValueCountFrequency (%)
702
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 157
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1747 | 52.4%
Common | 1586 | 47.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
343
19.6%
343
19.6%
178
10.2%
171
9.8%
171
9.8%
171
9.8%
86
 
4.9%
53
 
3.0%
29
 
1.7%
24
 
1.4%
Other values (42) 178
10.2%
Common
ValueCountFrequency (%)
702
44.3%
- 157
 
9.9%
3 117
 
7.4%
1 110
 
6.9%
2 100
 
6.3%
6 70
 
4.4%
0 67
 
4.2%
5 65
 
4.1%
9 57
 
3.6%
4 54
 
3.4%
Other values (3) 87
 
5.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1747 | 52.4%
ASCII | 1586 | 47.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
702
44.3%
- 157
 
9.9%
3 117
 
7.4%
1 110
 
6.9%
2 100
 
6.3%
6 70
 
4.4%
0 67
 
4.2%
5 65
 
4.1%
9 57
 
3.6%
4 54
 
3.4%
Other values (3) 87
 
5.5%
Hangul
ValueCountFrequency (%)
343
19.6%
343
19.6%
178
10.2%
171
9.8%
171
9.8%
171
9.8%
86
 
4.9%
53
 
3.0%
29
 
1.7%
24
 
1.4%
Other values (42) 178
10.2%

소재지전화
Text

MISSING 

Distinct: 164
Distinct (%): 99.4%
Missing: 6
Missing (%): 3.5%
Memory size: 1.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1980
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 163
Unique (%): 98.8%

Sample

1st row: 053-984-7667
2nd row: 053-742-1730
3rd row: 053-755-9244
4th row: 053-955-8906
5th row: 053-941-9379
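Every value of this variable is 12 characters long, which suggests a fixed phone-number layout. The sketch below checks the sample rows against a pattern inferred from them (the `053` area code and the 3-3-4 digit grouping are assumptions drawn from the samples, not a documented rule) and confirms that the reported character total matches 12 characters per non-missing row.

```python
import re

# Assumption: the 053-NNN-NNNN layout is inferred from the sample rows only.
PHONE_RE = re.compile(r"053-\d{3}-\d{4}")

samples = ["053-984-7667", "053-742-1730", "053-755-9244", "053-955-8906", "053-941-9379"]
all_valid = all(PHONE_RE.fullmatch(s) for s in samples)

# Consistency check against the report: 171 rows minus 6 missing, 12 characters each.
non_missing = 171 - 6
total_characters = non_missing * 12
print(all_valid, total_characters)  # True 1980
```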
Value | Count | Frequency (%)
053-986-8201 | 2 | 1.2%
053-746-5401 | 1 | 0.6%
053-942-9475 | 1 | 0.6%
053-984-7667 | 1 | 0.6%
053-958-6373 | 1 | 0.6%
053-953-4531 | 1 | 0.6%
053-752-3358 | 1 | 0.6%
053-952-7454 | 1 | 0.6%
053-951-2364 | 1 | 0.6%
053-742-3337 | 1 | 0.6%
Other values (154) | 154 | 93.3%

Most occurring characters

ValueCountFrequency (%)
- 330
16.7%
5 320
16.2%
3 286
14.4%
0 268
13.5%
9 156
7.9%
4 120
 
6.1%
7 115
 
5.8%
8 113
 
5.7%
2 107
 
5.4%
6 85
 
4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1650 | 83.3%
Dash Punctuation | 330 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 320
19.4%
3 286
17.3%
0 268
16.2%
9 156
9.5%
4 120
 
7.3%
7 115
 
7.0%
8 113
 
6.8%
2 107
 
6.5%
6 85
 
5.2%
1 80
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 330
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1980 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 330
16.7%
5 320
16.2%
3 286
14.4%
0 268
13.5%
9 156
7.9%
4 120
 
6.1%
7 115
 
5.8%
8 113
 
5.7%
2 107
 
5.4%
6 85
 
4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1980 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 330
16.7%
5 320
16.2%
3 286
14.4%
0 268
13.5%
9 156
7.9%
4 120
 
6.1%
7 115
 
5.8%
8 113
 
5.7%
2 107
 
5.4%
6 85
 
4.3%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
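The per-column nullity count behind such plots can be sketched in a few lines. The rows below are rows 161 to 165 from the Sample section of this report, reduced to two columns, with `None` standing in for `<NA>`. The variable names are illustrative.

```python
# Rows 161-165 of the sample rows; None stands in for <NA>.
rows = [
    {"업소명": "아마레호텔", "소재지전화": "053-953-2580"},
    {"업소명": "대구 메리어트 호텔", "소재지전화": None},
    {"업소명": "제이비관광호텔(JB TOURIST HOTEL)", "소재지전화": "053-964-2000"},
    {"업소명": "제이비한옥호텔(JBHANOKHOTEL)", "소재지전화": None},
    {"업소명": "호텔골든캐프", "소재지전화": None},
]

columns = ["업소명", "소재지전화"]
# Count missing entries per column (True sums as 1).
missing_per_column = {col: sum(row[col] is None for row in rows) for col in columns}
print(missing_per_column)  # {'업소명': 0, '소재지전화': 3}
```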

Sample

Row | 업종명 | 업소명 | 영업소 주소(도로명) | 영업소 주소(지번) | 소재지전화
0 | 숙박업(일반) | 장원장여관 | 대구광역시 동구 아양로 294 (입석동) | 대구광역시 동구 입석동 999-1 | 053-984-7667
1 | 숙박업(일반) | 신화여관 | 대구광역시 동구 아양로38길 5-2 (효목동) | 대구광역시 동구 효목동 960-22 | 053-742-1730
2 | 숙박업(일반) | 영빈모텔 | 대구광역시 동구 동부로32길 28 (신천동) | 대구광역시 동구 신천동 382-4 | 053-755-9244
3 | 숙박업(일반) | 구일여인숙 | 대구광역시 동구 송라로32길 15-1 (신암동) | 대구광역시 동구 신암동 165-7 | 053-955-8906
4 | 숙박업(일반) | 동덕여인숙 | 대구광역시 동구 아양로 75-2 (신암동) | 대구광역시 동구 신암동 603-165 | 053-941-9379
5 | 숙박업(일반) | 서울모텔 | 대구광역시 동구 동부로28길 15 (신천동) | 대구광역시 동구 신천동 334-16 | 053-743-7613
6 | 숙박업(일반) | 청수여관 | 대구광역시 동구 반야월로 174 (신기동) | 대구광역시 동구 신기동 15-2 | 053-962-5859
7 | 숙박업(일반) | 신모텔 | 대구광역시 동구 신암남로 109 (신암동) | 대구광역시 동구 신암동 259-42 | 053-955-5277
8 | 숙박업(일반) | 현대장여관 | 대구광역시 동구 동부로 65 (신천동) | 대구광역시 동구 신천동 23-1 | 053-752-3200
9 | 숙박업(일반) | 동양장여관 | 대구광역시 동구 동부로30길 6 (신천동) | 대구광역시 동구 신천동 330-7 | 053-755-2429
Row | 업종명 | 업소명 | 영업소 주소(도로명) | 영업소 주소(지번) | 소재지전화
161 | 숙박업(일반) | 아마레호텔 | 대구광역시 동구 율암로 156-15, 아마레 호텔 (상매동) | 대구광역시 동구 상매동 506-2 아마레 호텔 | 053-953-2580
162 | 숙박업(일반) | 대구 메리어트 호텔 | 대구광역시 동구 동부로26길 6, 대구 메리어트 호텔 및 서비스드 레지던스 3층일부, 6-11층 (신천동) | 대구광역시 동구 신천동 326-1 | <NA>
163 | 숙박업(일반) | 제이비관광호텔(JB TOURIST HOTEL) | 대구광역시 동구 율암로 162, 1~7층층 (상매동) | 대구광역시 동구 상매동 506-6 | 053-964-2000
164 | 숙박업(일반) | 제이비한옥호텔(JBHANOKHOTEL) | 대구광역시 동구 율암로 156-13, 1~3층 (상매동) | 대구광역시 동구 상매동 506-3 | <NA>
165 | 숙박업(일반) | 호텔골든캐프 | 대구광역시 동구 율암로 156-28, 호텔골든캐프 (상매동) | 대구광역시 동구 상매동 505 | <NA>
166 | 숙박업(생활) | 애플호텔펜션 | 대구광역시 동구 팔공산로185길 33-6 (용수동) | 대구광역시 동구 용수동 67-25 | 053-981-8009
167 | 숙박업(생활) |  | 대구광역시 동구 팔공로 525 (지묘동) | 대구광역시 동구 지묘동 85-1 | 053-984-0033
168 | 숙박업(생활) | 팔공펜션 | 대구광역시 동구 팔공산로185길 35 (용수동) | 대구광역시 동구 용수동 67-28 | 053-981-6688
169 | 숙박업(생활) | 스파펜션링스 | 대구광역시 동구 팔공산로185길 39 (용수동) | 대구광역시 동구 용수동 59-18 | 053-981-3321
170 | 숙박업(생활) | 와이컬렉션 by UH FLAT 대구 | 대구광역시 동구 동부로26길 6, 대구 메리어트 호텔 및 서비스드 레지던스 12~23층 41개호 (신천동) | 대구광역시 동구 신천동 326-1 대구 메리어트 호텔 및 서비스드 레지던스 | 053-746-2288
170숙박업(생활)와이컬렉션 by UH FLAT 대구대구광역시 동구 동부로26길 6, 대구 메리어트 호텔 및 서비스드 레지던스 12~23층 41개호 (신천동)대구광역시 동구 신천동 326-1 대구 메리어트 호텔 및 서비스드 레지던스053-746-2288