Overview

Dataset statistics

Number of variables: 4
Number of observations: 292
Missing cells: 31
Missing cells (%): 2.7%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 9.3 KiB
Average record size in memory: 32.5 B
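
The two percentages above can be re-derived from the counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (observations × variables) and the duplicate percentage over all observations; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 292
missing_cells = 31
duplicate_rows = 1

total_cells = n_variables * n_observations  # every cell in the table

# Assumption: missing % is relative to all cells, duplicate % to all rows.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions the rounded values match the reported 2.7% and 0.3%.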

Variable types

Text: 3
Categorical: 1

Dataset

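Description: 대구광역시 북구_노래연습장업_20230904 (Buk-gu, Daegu Metropolitan City: karaoke room businesses, 2023-09-04)
Author: 대구광역시 북구 (Buk-gu, Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15006329&dataSetDetailId=150063291ad7355bff5ac_201909241126&provdMethod=FILE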

Alerts

Dataset has 1 (0.3%) duplicate rows [Duplicates]
데이터기준일자 is highly imbalanced (78.5%) [Imbalance]
상호명 has 10 (3.4%) missing values [Missing]
영업소도로명소재지 has 11 (3.8%) missing values [Missing]
영업소지번소재지 has 10 (3.4%) missing values [Missing]
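
The 78.5% imbalance figure for 데이터기준일자 is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; it is an inference from the reported numbers, not something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes))."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    # Entropy in bits; zero counts contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts for 데이터기준일자 from the report: 282 x "2023-09-04", 10 x "<NA>".
score = imbalance_score([282, 10])
print(f"{score:.1%}")
```

A perfectly balanced column would score 0, and a column dominated by one value approaches 1.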

Reproduction

Analysis started: 2024-04-22 00:15:45.943606
Analysis finished: 2024-04-22 00:15:46.455127
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
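
The duration is simply the difference between the two timestamps. A short sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-22 00:15:45.943606")
finished = datetime.fromisoformat("2024-04-22 00:15:46.455127")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```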

Variables

상호명 (business name)
Text

MISSING 

Distinct: 242
Distinct (%): 85.8%
Missing: 10
Missing (%): 3.4%
Memory size: 2.4 KiB

Length

Max length: 17
Median length: 15
Mean length: 7.5886525
Min length: 2
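
The mean length is consistent with dividing the total character count reported under Characters and Unicode (2140) by the number of non-missing values (292 observations minus 10 missing). A minimal sketch under that assumption; the variable names are illustrative and not part of the report.

```python
# Figures copied from the report for 상호명.
n_observations = 292
n_missing = 10
total_characters = 2140

# Assumption: missing values are excluded before averaging.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

print(round(mean_length, 7))
```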

Characters and Unicode

Total characters: 2140
Distinct characters: 276
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 212
Unique (%): 75.2%

Sample

1st row: 한곡땡겨노래연습장
2nd row: 청춘노래연습장
3rd row: 은하노래연습장
4th row: 한마당노래연습장
5th row: 산림노래연습장

Value | Count | Frequency (%)
노래연습장 | 9 | 3.0%
축제노래연습장 | 4 | 1.3%
제우스노래연습장 | 4 | 1.3%
에쿠스노래연습장 | 3 | 1.0%
쇼노래연습장 | 3 | 1.0%
스타노래연습장 | 3 | 1.0%
에이스노래연습장 | 3 | 1.0%
파도노래연습장 | 3 | 1.0%
청솔노래연습장 | 3 | 1.0%
토마토노래연습장 | 2 | 0.7%
Other values (239) | 262 | 87.6%

Most occurring characters

ValueCountFrequency (%)
274
12.8%
274
12.8%
265
 
12.4%
264
 
12.3%
263
 
12.3%
33
 
1.5%
32
 
1.5%
28
 
1.3%
21
 
1.0%
17
 
0.8%
Other values (266) 669
31.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2078 | 97.1%
Uppercase Letter | 18 | 0.8%
Space Separator | 17 | 0.8%
Close Punctuation | 9 | 0.4%
Open Punctuation | 9 | 0.4%
Decimal Number | 7 | 0.3%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
274
13.2%
274
13.2%
265
12.8%
264
12.7%
263
12.7%
33
 
1.6%
32
 
1.5%
28
 
1.3%
21
 
1.0%
13
 
0.6%
Other values (251) 611
29.4%
Uppercase Letter
ValueCountFrequency (%)
M 5
27.8%
S 3
16.7%
I 2
 
11.1%
V 2
 
11.1%
O 2
 
11.1%
P 2
 
11.1%
B 1
 
5.6%
K 1
 
5.6%
Decimal Number
ValueCountFrequency (%)
2 5
71.4%
1 1
 
14.3%
3 1
 
14.3%
Space Separator
ValueCountFrequency (%)
17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2078 | 97.1%
Common | 44 | 2.1%
Latin | 18 | 0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
274
13.2%
274
13.2%
265
12.8%
264
12.7%
263
12.7%
33
 
1.6%
32
 
1.5%
28
 
1.3%
21
 
1.0%
13
 
0.6%
Other values (251) 611
29.4%
Latin
ValueCountFrequency (%)
M 5
27.8%
S 3
16.7%
I 2
 
11.1%
V 2
 
11.1%
O 2
 
11.1%
P 2
 
11.1%
B 1
 
5.6%
K 1
 
5.6%
Common
ValueCountFrequency (%)
17
38.6%
) 9
20.5%
( 9
20.5%
2 5
 
11.4%
. 2
 
4.5%
1 1
 
2.3%
3 1
 
2.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2078 | 97.1%
ASCII | 62 | 2.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
274
13.2%
274
13.2%
265
12.8%
264
12.7%
263
12.7%
33
 
1.6%
32
 
1.5%
28
 
1.3%
21
 
1.0%
13
 
0.6%
Other values (251) 611
29.4%
ASCII
ValueCountFrequency (%)
17
27.4%
) 9
14.5%
( 9
14.5%
M 5
 
8.1%
2 5
 
8.1%
S 3
 
4.8%
I 2
 
3.2%
V 2
 
3.2%
O 2
 
3.2%
P 2
 
3.2%
Other values (5) 6
 
9.7%

영업소도로명소재지 (business road-name address)
Text

MISSING

Distinct: 278
Distinct (%): 98.9%
Missing: 11
Missing (%): 3.8%
Memory size: 2.4 KiB

Length

Max length: 48
Median length: 37
Mean length: 25.227758
Min length: 20

Characters and Unicode

Total characters: 7089
Distinct characters: 107
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
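
To make the category breakdown concrete, the sketch below counts Unicode general categories for one of the sample addresses using Python's standard `unicodedata` module. It is an illustration of the idea, not the profiler's own code; `category_counts` is a hypothetical helper. The two-letter codes map to the names used in this report: `Lo` is Other Letter, `Zs` Space Separator, `Nd` Decimal Number, `Po` Other Punctuation, `Ps` Open Punctuation, and `Pe` Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample road-name addresses shown in this report.
sample = "대구광역시 북구 동북로 252, 2층 (복현동)"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates every text variable here.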

Unique

Unique: 275
Unique (%): 97.9%

Sample

1st row: 대구광역시 북구 태전로 53 (태전동)
2nd row: 대구광역시 북구 검단동로4길 6 (검단동)
3rd row: 대구광역시 북구 동북로 297 (복현동)
4th row: 대구광역시 북구 동북로 252, 2층 (복현동)
5th row: 대구광역시 북구 동북로 143 (산격동)

Value | Count | Frequency (%)
대구광역시 | 281 | 18.7%
북구 | 281 | 18.7%
태전동 | 57 | 3.8%
산격동 | 41 | 2.7%
2층 | 29 | 1.9%
구암동 | 28 | 1.9%
지하1층 | 28 | 1.9%
복현동 | 18 | 1.2%
칠곡중앙대로 | 16 | 1.1%
동천동 | 16 | 1.1%
Other values (327) | 707 | 47.1%

Most occurring characters

ValueCountFrequency (%)
1221
17.2%
628
 
8.9%
363
 
5.1%
355
 
5.0%
299
 
4.2%
281
 
4.0%
281
 
4.0%
281
 
4.0%
281
 
4.0%
) 281
 
4.0%
Other values (97) 2818
39.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4160 | 58.7%
Space Separator | 1221 | 17.2%
Decimal Number | 1000 | 14.1%
Close Punctuation | 281 | 4.0%
Open Punctuation | 281 | 4.0%
Other Punctuation | 81 | 1.1%
Dash Punctuation | 64 | 0.9%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
628
15.1%
363
 
8.7%
355
 
8.5%
299
 
7.2%
281
 
6.8%
281
 
6.8%
281
 
6.8%
281
 
6.8%
120
 
2.9%
88
 
2.1%
Other values (81) 1183
28.4%
Decimal Number
ValueCountFrequency (%)
1 240
24.0%
2 153
15.3%
3 127
12.7%
6 89
 
8.9%
4 77
 
7.7%
5 71
 
7.1%
8 69
 
6.9%
0 63
 
6.3%
7 61
 
6.1%
9 50
 
5.0%
Space Separator
ValueCountFrequency (%)
1221
100.0%
Close Punctuation
ValueCountFrequency (%)
) 281
100.0%
Open Punctuation
ValueCountFrequency (%)
( 281
100.0%
Other Punctuation
ValueCountFrequency (%)
, 81
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 64
100.0%
Uppercase Letter
ValueCountFrequency (%)
B 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4160 | 58.7%
Common | 2928 | 41.3%
Latin | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
628
15.1%
363
 
8.7%
355
 
8.5%
299
 
7.2%
281
 
6.8%
281
 
6.8%
281
 
6.8%
281
 
6.8%
120
 
2.9%
88
 
2.1%
Other values (81) 1183
28.4%
Common
ValueCountFrequency (%)
1221
41.7%
) 281
 
9.6%
( 281
 
9.6%
1 240
 
8.2%
2 153
 
5.2%
3 127
 
4.3%
6 89
 
3.0%
, 81
 
2.8%
4 77
 
2.6%
5 71
 
2.4%
Other values (5) 307
 
10.5%
Latin
ValueCountFrequency (%)
B 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4160 | 58.7%
ASCII | 2929 | 41.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1221
41.7%
) 281
 
9.6%
( 281
 
9.6%
1 240
 
8.2%
2 153
 
5.2%
3 127
 
4.3%
6 89
 
3.0%
, 81
 
2.8%
4 77
 
2.6%
5 71
 
2.4%
Other values (6) 308
 
10.5%
Hangul
ValueCountFrequency (%)
628
15.1%
363
 
8.7%
355
 
8.5%
299
 
7.2%
281
 
6.8%
281
 
6.8%
281
 
6.8%
281
 
6.8%
120
 
2.9%
88
 
2.1%
Other values (81) 1183
28.4%

영업소지번소재지 (business lot-number address)
Text

MISSING

Distinct: 279
Distinct (%): 98.9%
Missing: 10
Missing (%): 3.4%
Memory size: 2.4 KiB

Length

Max length: 37
Median length: 29
Mean length: 21.780142
Min length: 17

Characters and Unicode

Total characters: 6142
Distinct characters: 80
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 276
Unique (%): 97.9%

Sample

1st row: 대구광역시 북구 태전동 254-30
2nd row: 대구광역시 북구 검단동 1266-25
3rd row: 대구광역시 북구 복현동 364-1
4th row: 대구광역시 북구 복현동 427-4
5th row: 대구광역시 북구 산격동 1269-20 (지상2층)

Value | Count | Frequency (%)
대구광역시 | 282 | 22.9%
북구 | 282 | 22.9%
태전동 | 57 | 4.6%
지하1층 | 46 | 3.7%
산격동 | 41 | 3.3%
지상2층 | 29 | 2.4%
구암동 | 28 | 2.3%
복현동 | 18 | 1.5%
동천동 | 16 | 1.3%
읍내동 | 14 | 1.1%
Other values (314) | 421 | 34.1%

Most occurring characters

ValueCountFrequency (%)
1232
20.1%
592
 
9.6%
1 331
 
5.4%
314
 
5.1%
294
 
4.8%
282
 
4.6%
282
 
4.6%
282
 
4.6%
282
 
4.6%
- 245
 
4.0%
Other values (70) 2006
32.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3146 | 51.2%
Decimal Number | 1391 | 22.6%
Space Separator | 1232 | 20.1%
Dash Punctuation | 245 | 4.0%
Open Punctuation | 62 | 1.0%
Close Punctuation | 62 | 1.0%
Uppercase Letter | 3 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
592
18.8%
314
10.0%
294
9.3%
282
9.0%
282
9.0%
282
9.0%
282
9.0%
94
 
3.0%
84
 
2.7%
57
 
1.8%
Other values (54) 583
18.5%
Decimal Number
ValueCountFrequency (%)
1 331
23.8%
2 189
13.6%
3 146
10.5%
4 123
 
8.8%
9 116
 
8.3%
7 102
 
7.3%
8 100
 
7.2%
6 97
 
7.0%
5 95
 
6.8%
0 92
 
6.6%
Space Separator
ValueCountFrequency (%)
1232
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 245
100.0%
Open Punctuation
ValueCountFrequency (%)
( 62
100.0%
Close Punctuation
ValueCountFrequency (%)
) 62
100.0%
Uppercase Letter
ValueCountFrequency (%)
B 3
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3146 | 51.2%
Common | 2993 | 48.7%
Latin | 3 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
592
18.8%
314
10.0%
294
9.3%
282
9.0%
282
9.0%
282
9.0%
282
9.0%
94
 
3.0%
84
 
2.7%
57
 
1.8%
Other values (54) 583
18.5%
Common
ValueCountFrequency (%)
1232
41.2%
1 331
 
11.1%
- 245
 
8.2%
2 189
 
6.3%
3 146
 
4.9%
4 123
 
4.1%
9 116
 
3.9%
7 102
 
3.4%
8 100
 
3.3%
6 97
 
3.2%
Other values (5) 312
 
10.4%
Latin
ValueCountFrequency (%)
B 3
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3146 | 51.2%
ASCII | 2996 | 48.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1232
41.1%
1 331
 
11.0%
- 245
 
8.2%
2 189
 
6.3%
3 146
 
4.9%
4 123
 
4.1%
9 116
 
3.9%
7 102
 
3.4%
8 100
 
3.3%
6 97
 
3.2%
Other values (6) 315
 
10.5%
Hangul
ValueCountFrequency (%)
592
18.8%
314
10.0%
294
9.3%
282
9.0%
282
9.0%
282
9.0%
282
9.0%
94
 
3.0%
84
 
2.7%
57
 
1.8%
Other values (54) 583
18.5%

데이터기준일자 (data reference date)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

2023-09-04: 282
<NA>: 10

Length

Max length: 10
Median length: 10
Mean length: 9.7945205
Min length: 4
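
A minimum length of 4 together with a missing count of 0 suggests that the ten empty rows reached the profiler as the literal string "<NA>" (four characters) rather than as true missing values. That reading is an inference from the numbers, not something the report states. Under that assumption, the length statistics can be reproduced from the value counts:

```python
# Value counts for 데이터기준일자 as shown in the report.
# Assumption: "<NA>" is an ordinary 4-character string here, not a null.
value_counts = {"2023-09-04": 282, "<NA>": 10}

total = sum(value_counts.values())
min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

# Weighted mean of string lengths over all 292 rows.
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total

print(min_length, max_length, round(mean_length, 7))
```

If this reading is right, converting the placeholder back to a real null before profiling would raise the reported missing-cell count by ten.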

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-09-04
2nd row: 2023-09-04
3rd row: 2023-09-04
4th row: 2023-09-04
5th row: 2023-09-04

Common Values

Value | Count | Frequency (%)
2023-09-04 | 282 | 96.6%
<NA> | 10 | 3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-09-04 | 282 | 96.6%
na | 10 | 3.4%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
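
As an illustration of nullity correlation, the sketch below computes the Pearson correlation between the missing-value indicators of 상호명 and 영업소도로명소재지. It rests on two assumptions: that the heatmap is a Pearson correlation of per-column missing indicators, and that rows 282 to 291 are empty in both columns (as the last rows of the sample suggest). The report does not say which row holds the eleventh missing road-name address, so row 100 is a purely hypothetical placeholder; `pearson` is likewise an illustrative helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


n_rows = 292

# 1 marks a missing value. Assumption: rows 282-291 are empty in both columns.
name_missing = [1 if i >= 282 else 0 for i in range(n_rows)]

# Assumption: the 11th missing road-name address is on hypothetical row 100.
road_missing = [1 if (i >= 282 or i == 100) else 0 for i in range(n_rows)]

r = pearson(name_missing, road_missing)
print(round(r, 3))
```

A value close to 1 means the two columns are almost always missing together, which is what a block of fully empty rows produces.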

Sample

Index | 상호명 | 영업소도로명소재지 | 영업소지번소재지 | 데이터기준일자
0 | 한곡땡겨노래연습장 | 대구광역시 북구 태전로 53 (태전동) | 대구광역시 북구 태전동 254-30 | 2023-09-04
1 | 청춘노래연습장 | 대구광역시 북구 검단동로4길 6 (검단동) | 대구광역시 북구 검단동 1266-25 | 2023-09-04
2 | 은하노래연습장 | 대구광역시 북구 동북로 297 (복현동) | 대구광역시 북구 복현동 364-1 | 2023-09-04
3 | 한마당노래연습장 | 대구광역시 북구 동북로 252, 2층 (복현동) | 대구광역시 북구 복현동 427-4 | 2023-09-04
4 | 산림노래연습장 | 대구광역시 북구 동북로 143 (산격동) | 대구광역시 북구 산격동 1269-20 (지상2층) | 2023-09-04
5 | 만남노래연습장 | 대구광역시 북구 동북로 149-2, 3층 (산격동) | 대구광역시 북구 산격동 1270 | 2023-09-04
6 | 풍년노래연습장 | 대구광역시 북구 오봉로1길 51 (노원동2가) | 대구광역시 북구 노원동2가 289-1 | 2023-09-04
7 | 딩동댕노래연습장 | 대구광역시 북구 연암로 177 (산격동) | 대구광역시 북구 산격동 750-3 | 2023-09-04
8 | 궁전노래연습장 | 대구광역시 북구 구암로 103 (읍내동) | 대구광역시 북구 읍내동 1381-14 | 2023-09-04
9 | 테크닉노래방 | 대구광역시 북구 노원로 90, 1층 (노원동3가) | 대구광역시 북구 노원동3가 1038-3 | 2023-09-04

Index | 상호명 | 영업소도로명소재지 | 영업소지번소재지 | 데이터기준일자
282 | <NA> | <NA> | <NA> | <NA>
283 | <NA> | <NA> | <NA> | <NA>
284 | <NA> | <NA> | <NA> | <NA>
285 | <NA> | <NA> | <NA> | <NA>
286 | <NA> | <NA> | <NA> | <NA>
287 | <NA> | <NA> | <NA> | <NA>
288 | <NA> | <NA> | <NA> | <NA>
289 | <NA> | <NA> | <NA> | <NA>
290 | <NA> | <NA> | <NA> | <NA>
291 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 상호명 | 영업소도로명소재지 | 영업소지번소재지 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 10
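
The Overview reports 1 duplicate row while this table shows a single row pattern occurring 10 times. One reading, offered here as an assumption, is that the Overview counts distinct duplicated row patterns rather than the number of repeated rows. The sketch below shows that kind of grouping on a small excerpt: three rows copied from the sample plus the ten empty rows, with `None` standing in for `<NA>`.

```python
from collections import Counter

NA = None  # stand-in for the report's <NA>

# Excerpt only: three rows copied from the sample, plus the ten empty rows.
rows = [
    ("한곡땡겨노래연습장", "대구광역시 북구 태전로 53 (태전동)",
     "대구광역시 북구 태전동 254-30", "2023-09-04"),
    ("청춘노래연습장", "대구광역시 북구 검단동로4길 6 (검단동)",
     "대구광역시 북구 검단동 1266-25", "2023-09-04"),
    ("은하노래연습장", "대구광역시 북구 동북로 297 (복현동)",
     "대구광역시 북구 복현동 364-1", "2023-09-04"),
] + [(NA, NA, NA, NA)] * 10

# Group identical rows and keep only the patterns that occur more than once.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

On this excerpt the only duplicated pattern is the all-empty row, which matches the single entry in the table above.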