Overview

Dataset statistics

Number of variables: 6
Number of observations: 909
Missing cells: 38
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 42.7 KiB
Average record size in memory: 48.1 B
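
The two derived figures in these statistics follow from the raw counts. The sketch below recomputes them, assuming the reported 42.7 KiB is itself a rounded value, so the results only agree to the precision shown. Variable names are illustrative.

```python
# Figures taken from the dataset statistics in this report.
n_variables = 6
n_observations = 909
missing_cells = 38
total_size_kib = 42.7  # already rounded in the report

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record: total size in bytes divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```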

Variable types

Text: 3
DateTime: 1
Categorical: 2

Dataset

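Description: Status data for designated tobacco retail outlets in Dong-gu, Daegu Metropolitan City. It includes fields such as business name, address, designation date, telephone number, corporate classification, and operating status.
Author: 대구광역시 동구 (Dong-gu, Daegu Metropolitan City)
URL: https://www.data.go.kr/data/15035585/fileData.do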

Alerts

Constant: 영업구분 (operating status) has the constant value "정상영업"
Imbalance: 법인구분 (corporate classification) is highly imbalanced (60.1%)
Missing: 업소명 (business name) has 37 (4.1%) missing values
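
The 60.1% imbalance figure is consistent with one minus the normalized Shannon entropy of the class counts (개인 837, 법인 72, as listed in the 법인구분 section). The sketch below assumes that definition. It is not taken from the ydata-profiling source, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    if len(counts) < 2:
        return 0.0  # a single class leaves nothing to compare
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Maximum entropy is reached when all classes are equally frequent.
    return 1 - entropy / math.log2(len(counts))

score = imbalance_score([837, 72])  # 개인, 법인
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1.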

Reproduction

Analysis started: 2024-03-16 04:15:56.700171
Analysis finished: 2024-03-16 04:15:57.801212
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

MISSING 

Distinct: 836
Distinct (%): 95.9%
Missing: 37
Missing (%): 4.1%
Memory size: 7.2 KiB

Length

Max length: 22
Median length: 17
Mean length: 8.6192661
Min length: 2

Characters and Unicode

Total characters: 7516
Distinct characters: 478
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
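
As an illustration of the category counts reported below, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch, not the profiler's own code. It covers categories only, since `unicodedata` does not expose scripts or blocks, and `category_counts` is an illustrative name.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# One of the sample values of this variable.
sample = "지에스(GS)25 용계푸르지오점"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map to the names used in the report: `Lo` is Other Letter, `Lu` Uppercase Letter, `Nd` Decimal Number, `Zs` Space Separator, and `Ps`/`Pe` Open/Close Punctuation.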

Unique

Unique: 817
Unique (%): 93.7%
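
The Distinct and Unique figures measure different things. The sketch below assumes that "distinct" is the number of different non-missing values and "unique" is the number of values that occur exactly once. The telephone-number variable in this report fits that reading: 354 distinct values, of which 10 repeat and 344 are unique. The toy list is hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing values (None)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                               # different values
    unique = sum(1 for n in counts.values() if n == 1)   # values seen exactly once
    return distinct, unique

names = ["씨유", "씨유", "홈마트", "대백마트", None]  # hypothetical toy data
print(distinct_and_unique(names))
```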

Sample

1st row: (주)진성기산
2nd row: 주식회사 그린힐
3rd row: 오 굿모닝 부동산컨설팅
4th row: 지에스(GS)25 용계푸르지오점
5th row: 지에스(GS)25 이시아더샵접
Value | Count | Frequency (%)
지에스(gs)25 | 64 | 5.1%
씨유 | 60 | 4.8%
세븐일레븐 | 52 | 4.2%
이마트24 | 28 | 2.2%
주식회사 | 15 | 1.2%
주)코리아세븐 | 12 | 1.0%
나이스마트 | 10 | 0.8%
대백마트 | 9 | 0.7%
홈마트 | 8 | 0.6%
gs25 | 7 | 0.6%
Other values (894) | 983 | 78.8%

Most occurring characters

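Value | Count | Frequency (%)
[illegible] | 395 | 5.3%
(space) | 377 | 5.0%
[illegible] | 253 | 3.4%
[illegible] | 231 | 3.1%
[illegible] | 217 | 2.9%
[illegible] | 214 | 2.8%
[illegible] | 157 | 2.1%
[illegible] | 156 | 2.1%
) | 155 | 2.1%
( | 152 | 2.0%
Other values (468) | 5209 | 69.3%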

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6263 | 83.3%
Space Separator | 377 | 5.0%
Decimal Number | 296 | 3.9%
Uppercase Letter | 249 | 3.3%
Close Punctuation | 155 | 2.1%
Open Punctuation | 152 | 2.0%
Lowercase Letter | 10 | 0.1%
Other Punctuation | 9 | 0.1%
Dash Punctuation | 4 | 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 395 | 6.3%
[illegible] | 253 | 4.0%
[illegible] | 231 | 3.7%
[illegible] | 217 | 3.5%
[illegible] | 214 | 3.4%
[illegible] | 157 | 2.5%
[illegible] | 156 | 2.5%
[illegible] | 136 | 2.2%
[illegible] | 128 | 2.0%
[illegible] | 128 | 2.0%
Other values (423) | 4248 | 67.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 95 | 38.2%
G | 90 | 36.1%
C | 13 | 5.2%
U | 7 | 2.8%
O | 7 | 2.8%
K | 6 | 2.4%
E | 5 | 2.0%
R | 4 | 1.6%
L | 3 | 1.2%
D | 3 | 1.2%
Other values (10) | 16 | 6.4%

Decimal Number
Value | Count | Frequency (%)
2 | 135 | 45.6%
5 | 96 | 32.4%
4 | 36 | 12.2%
1 | 12 | 4.1%
3 | 5 | 1.7%
6 | 5 | 1.7%
7 | 3 | 1.0%
0 | 3 | 1.0%
9 | 1 | 0.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 20.0%
r | 2 | 20.0%
y | 1 | 10.0%
i | 1 | 10.0%
n | 1 | 10.0%
k | 1 | 10.0%
t | 1 | 10.0%
h | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
. | 6 | 66.7%
& | 2 | 22.2%
? | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 377 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 155 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 152 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Math Symbol
Value | Count | Frequency (%)
[illegible] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6263 | 83.3%
Common | 994 | 13.2%
Latin | 259 | 3.4%

Most frequent character per script

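Hangul
Value | Count | Frequency (%)
[illegible] | 395 | 6.3%
[illegible] | 253 | 4.0%
[illegible] | 231 | 3.7%
[illegible] | 217 | 3.5%
[illegible] | 214 | 3.4%
[illegible] | 157 | 2.5%
[illegible] | 156 | 2.5%
[illegible] | 136 | 2.2%
[illegible] | 128 | 2.0%
[illegible] | 128 | 2.0%
Other values (423) | 4248 | 67.8%

Latin
Value | Count | Frequency (%)
S | 95 | 36.7%
G | 90 | 34.7%
C | 13 | 5.0%
U | 7 | 2.7%
O | 7 | 2.7%
K | 6 | 2.3%
E | 5 | 1.9%
R | 4 | 1.5%
L | 3 | 1.2%
D | 3 | 1.2%
Other values (18) | 26 | 10.0%

Common
Value | Count | Frequency (%)
(space) | 377 | 37.9%
) | 155 | 15.6%
( | 152 | 15.3%
2 | 135 | 13.6%
5 | 96 | 9.7%
4 | 36 | 3.6%
1 | 12 | 1.2%
. | 6 | 0.6%
3 | 5 | 0.5%
6 | 5 | 0.5%
Other values (7) | 15 | 1.5%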

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6260 | 83.3%
ASCII | 1252 | 16.7%
Compat Jamo | 3 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

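Hangul
Value | Count | Frequency (%)
[illegible] | 395 | 6.3%
[illegible] | 253 | 4.0%
[illegible] | 231 | 3.7%
[illegible] | 217 | 3.5%
[illegible] | 214 | 3.4%
[illegible] | 157 | 2.5%
[illegible] | 156 | 2.5%
[illegible] | 136 | 2.2%
[illegible] | 128 | 2.0%
[illegible] | 128 | 2.0%
Other values (421) | 4245 | 67.8%

ASCII
Value | Count | Frequency (%)
(space) | 377 | 30.1%
) | 155 | 12.4%
( | 152 | 12.1%
2 | 135 | 10.8%
5 | 96 | 7.7%
S | 95 | 7.6%
G | 90 | 7.2%
4 | 36 | 2.9%
C | 13 | 1.0%
1 | 12 | 1.0%
Other values (34) | 91 | 7.3%

Compat Jamo
Value | Count | Frequency (%)
[illegible] | 2 | 66.7%
[illegible] | 1 | 33.3%

Math Operators
Value | Count | Frequency (%)
[illegible] | 1 | 100.0%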

업소도로명주소
Text

Distinct: 904
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB

Length

Max length: 58
Median length: 53
Mean length: 28.9956
Min length: 20

Characters and Unicode

Total characters: 26357
Distinct characters: 314
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 899
Unique (%): 98.9%

Sample

1st row: 대구광역시 동구 효동로 79. 2층 (효목동)
2nd row: 대구광역시 동구 과학로4길 3 (각산동)
3rd row: 대구광역시 동구 화랑로9길 63-1. 1층 (신천동)
4th row: 대구광역시 동구 화랑로108길 42. 상가101호 (용계동. 용계역푸르지오아츠베르1단지)
5th row: 대구광역시 동구 팔공로51길 10. 상가3동 101호.102호 (봉무동. 이시아폴리스 더샵 3차)
Value | Count | Frequency (%)
대구광역시 | 909 | 16.8%
동구 | 909 | 16.8%
1층 | 222 | 4.1%
신암동 | 147 | 2.7%
신천동 | 126 | 2.3%
효목동 | 75 | 1.4%
방촌동 | 59 | 1.1%
율하동 | 53 | 1.0%
신서동 | 50 | 0.9%
동촌로 | 39 | 0.7%
Other values (1085) | 2826 | 52.2%

Most occurring characters

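Value | Count | Frequency (%)
(space) | 4571 | 17.3%
[illegible] | 2334 | 8.9%
[illegible] | 1891 | 7.2%
1 | 1254 | 4.8%
[illegible] | 1009 | 3.8%
[illegible] | 946 | 3.6%
[illegible] | 939 | 3.6%
[illegible] | 934 | 3.5%
( | 912 | 3.5%
) | 912 | 3.5%
Other values (304) | 10655 | 40.4%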

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 14889 | 56.5%
Space Separator | 4571 | 17.3%
Decimal Number | 4255 | 16.1%
Open Punctuation | 912 | 3.5%
Close Punctuation | 912 | 3.5%
Other Punctuation | 656 | 2.5%
Dash Punctuation | 111 | 0.4%
Uppercase Letter | 35 | 0.1%
Lowercase Letter | 9 | < 0.1%
Math Symbol | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 2334 | 15.7%
[illegible] | 1891 | 12.7%
[illegible] | 1009 | 6.8%
[illegible] | 946 | 6.4%
[illegible] | 939 | 6.3%
[illegible] | 934 | 6.3%
[illegible] | 909 | 6.1%
[illegible] | 484 | 3.3%
[illegible] | 428 | 2.9%
[illegible] | 297 | 2.0%
Other values (269) | 4718 | 31.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 8 | 22.9%
H | 6 | 17.1%
A | 5 | 14.3%
D | 4 | 11.4%
L | 4 | 11.4%
C | 3 | 8.6%
K | 1 | 2.9%
T | 1 | 2.9%
F | 1 | 2.9%
G | 1 | 2.9%

Decimal Number
Value | Count | Frequency (%)
1 | 1254 | 29.5%
2 | 566 | 13.3%
0 | 494 | 11.6%
3 | 393 | 9.2%
5 | 354 | 8.3%
4 | 331 | 7.8%
6 | 270 | 6.3%
7 | 216 | 5.1%
9 | 195 | 4.6%
8 | 182 | 4.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 33.3%
a | 2 | 22.2%
n | 1 | 11.1%
s | 1 | 11.1%
d | 1 | 11.1%
i | 1 | 11.1%

Other Punctuation
Value | Count | Frequency (%)
. | 654 | 99.7%
/ | 2 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 4571 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 912 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 912 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 111 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[illegible] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 14889 | 56.5%
Common | 11424 | 43.3%
Latin | 44 | 0.2%

Most frequent character per script

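Hangul
Value | Count | Frequency (%)
[illegible] | 2334 | 15.7%
[illegible] | 1891 | 12.7%
[illegible] | 1009 | 6.8%
[illegible] | 946 | 6.4%
[illegible] | 939 | 6.3%
[illegible] | 934 | 6.3%
[illegible] | 909 | 6.1%
[illegible] | 484 | 3.3%
[illegible] | 428 | 2.9%
[illegible] | 297 | 2.0%
Other values (269) | 4718 | 31.7%

Common
Value | Count | Frequency (%)
(space) | 4571 | 40.0%
1 | 1254 | 11.0%
( | 912 | 8.0%
) | 912 | 8.0%
. | 654 | 5.7%
2 | 566 | 5.0%
0 | 494 | 4.3%
3 | 393 | 3.4%
5 | 354 | 3.1%
4 | 331 | 2.9%
Other values (8) | 983 | 8.6%

Latin
Value | Count | Frequency (%)
B | 8 | 18.2%
H | 6 | 13.6%
A | 5 | 11.4%
D | 4 | 9.1%
L | 4 | 9.1%
e | 3 | 6.8%
C | 3 | 6.8%
a | 2 | 4.5%
K | 1 | 2.3%
T | 1 | 2.3%
Other values (7) | 7 | 15.9%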

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 14889 | 56.5%
ASCII | 11467 | 43.5%
CJK Compat | 1 | < 0.1%

Most frequent character per block

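ASCII
Value | Count | Frequency (%)
(space) | 4571 | 39.9%
1 | 1254 | 10.9%
( | 912 | 8.0%
) | 912 | 8.0%
. | 654 | 5.7%
2 | 566 | 4.9%
0 | 494 | 4.3%
3 | 393 | 3.4%
5 | 354 | 3.1%
4 | 331 | 2.9%
Other values (24) | 1026 | 8.9%

Hangul
Value | Count | Frequency (%)
[illegible] | 2334 | 15.7%
[illegible] | 1891 | 12.7%
[illegible] | 1009 | 6.8%
[illegible] | 946 | 6.4%
[illegible] | 939 | 6.3%
[illegible] | 934 | 6.3%
[illegible] | 909 | 6.1%
[illegible] | 484 | 3.3%
[illegible] | 428 | 2.9%
[illegible] | 297 | 2.0%
Other values (269) | 4718 | 31.7%

CJK Compat
Value | Count | Frequency (%)
[illegible] | 1 | 100.0%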

업소전화번호
Text

Distinct: 354
Distinct (%): 39.0%
Missing: 1
Missing (%): 0.1%
Memory size: 7.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 10896
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 344
Unique (%): 37.9%

Sample

1st row: 053-000-0000
2nd row: 053-000-0000
3rd row: 053-000-0000
4th row: 053-000-0000
5th row: 053-000-0000
Value | Count | Frequency (%)
053-000-0000 | 541 | 59.6%
053-742-2631 | 5 | 0.6%
053-750-4482 | 3 | 0.3%
053-961-0448 | 3 | 0.3%
053-985-1495 | 2 | 0.2%
053-961-2776 | 2 | 0.2%
053-941-0019 | 2 | 0.2%
053-665-1052 | 2 | 0.2%
053-751-4511 | 2 | 0.2%
053-985-0111 | 2 | 0.2%
Other values (344) | 344 | 37.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 4912 | 45.1%
- | 1816 | 16.7%
5 | 1183 | 10.9%
3 | 1109 | 10.2%
9 | 406 | 3.7%
8 | 279 | 2.6%
4 | 276 | 2.5%
1 | 266 | 2.4%
2 | 255 | 2.3%
6 | 203 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 9080 | 83.3%
Dash Punctuation | 1816 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4912 | 54.1%
5 | 1183 | 13.0%
3 | 1109 | 12.2%
9 | 406 | 4.5%
8 | 279 | 3.1%
4 | 276 | 3.0%
1 | 266 | 2.9%
2 | 255 | 2.8%
6 | 203 | 2.2%
7 | 191 | 2.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 1816 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10896 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4912 | 45.1%
- | 1816 | 16.7%
5 | 1183 | 10.9%
3 | 1109 | 10.2%
9 | 406 | 3.7%
8 | 279 | 2.6%
4 | 276 | 2.5%
1 | 266 | 2.4%
2 | 255 | 2.3%
6 | 203 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10896 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4912 | 45.1%
- | 1816 | 16.7%
5 | 1183 | 10.9%
3 | 1109 | 10.2%
9 | 406 | 3.7%
8 | 279 | 2.6%
4 | 276 | 2.5%
1 | 266 | 2.4%
2 | 255 | 2.3%
6 | 203 | 1.9%

지정일자
DateTime

Distinct: 774
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB
Minimum: 1962-01-01 00:00:00
Maximum: 2024-02-29 00:00:00
Histogram with fixed size bins (bins=50)
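
To illustrate what fixed-size binning means for a date column, the sketch below splits the reported range (minimum 1962-01-01, maximum 2024-02-29) into 50 equal-width intervals and assigns a date to one of them. It illustrates the idea only and is not the plotting code used by the report. `fixed_bin_edges` and `bin_index` are illustrative names.

```python
from datetime import datetime

def fixed_bin_edges(start, end, bins=50):
    """Return bins + 1 equally spaced edges covering [start, end]."""
    width = (end - start) / bins  # a timedelta
    edges = [start + i * width for i in range(bins)]
    edges.append(end)  # use the exact maximum as the last edge
    return edges

def bin_index(value, start, end, bins=50):
    """Return the 0-based bin containing value; the maximum falls in the last bin."""
    if not (start <= value <= end):
        raise ValueError("value lies outside the histogram range")
    width = (end - start) / bins
    return min(int((value - start) / width), bins - 1)

start = datetime(1962, 1, 1)
end = datetime(2024, 2, 29)
edges = fixed_bin_edges(start, end)
print(len(edges), edges[1] - edges[0])
print(bin_index(datetime(2000, 6, 9), start, end))
```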

법인구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB
개인: 837
법인: 72

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 법인
2nd row: 법인
3rd row: 개인
4th row: 개인
5th row: 개인

Common Values

Value | Count | Frequency (%)
개인 | 837 | 92.1%
법인 | 72 | 7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개인 | 837 | 92.1%
법인 | 72 | 7.9%

영업구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 KiB
정상영업: 909

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상영업
2nd row: 정상영업
3rd row: 정상영업
4th row: 정상영업
5th row: 정상영업

Common Values

Value | Count | Frequency (%)
정상영업 | 909 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
정상영업 | 909 | 100.0%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
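
Nullity correlation can be read as the Pearson correlation between the present/missing indicators of two columns. The sketch below assumes that reading and uses hypothetical toy columns, not the dataset itself. `nullity_correlation` is an illustrative name.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined without variation.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: None marks a missing cell.
names = ["a", None, "c", None, "e"]
phones = ["1", None, "3", "4", "5"]
r = nullity_correlation(names, phones)
print(r)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.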

Sample

Index | 업소명 | 업소도로명주소 | 업소전화번호 | 지정일자 | 법인구분 | 영업구분
0 | (주)진성기산 | 대구광역시 동구 효동로 79. 2층 (효목동) | 053-000-0000 | 2024-02-29 | 법인 | 정상영업
1 | 주식회사 그린힐 | 대구광역시 동구 과학로4길 3 (각산동) | 053-000-0000 | 2024-02-29 | 법인 | 정상영업
2 | 오 굿모닝 부동산컨설팅 | 대구광역시 동구 화랑로9길 63-1. 1층 (신천동) | 053-000-0000 | 2024-02-27 | 개인 | 정상영업
3 | 지에스(GS)25 용계푸르지오점 | 대구광역시 동구 화랑로108길 42. 상가101호 (용계동. 용계역푸르지오아츠베르1단지) | 053-000-0000 | 2024-02-26 | 개인 | 정상영업
4 | 지에스(GS)25 이시아더샵접 | 대구광역시 동구 팔공로51길 10. 상가3동 101호.102호 (봉무동. 이시아폴리스 더샵 3차) | 053-000-0000 | 2024-02-22 | 개인 | 정상영업
5 | 씨유 방촌점 | 대구광역시 동구 동촌로 267. 1층 (방촌동) | 053-000-0000 | 2024-02-22 | 개인 | 정상영업
6 | 지에스(GS)25 대구동촌점 | 대구광역시 동구 동촌로 175 (방촌동) | 053-000-0000 | 2024-02-21 | 개인 | 정상영업
7 | 씨유 동구 방천로점(대구동구지역자활센터) | 대구광역시 동구 방천로 34. 1층 (불로동) | 053-986-0826 | 2024-02-21 | 법인 | 정상영업
8 | 롯데쇼핑 (주) 롯데마트 대구율하점 | 대구광역시 동구 안심로 80. 지하 1층 마트 (율하동) | 053-607-2504 | 2024-02-20 | 법인 | 정상영업
9 | 더(the)짬뽕47 | 대구광역시 동구 장등로 47. 1층 (신천동) | 053-000-0000 | 2024-02-14 | 개인 | 정상영업

Index | 업소명 | 업소도로명주소 | 업소전화번호 | 지정일자 | 법인구분 | 영업구분
899 | 언덕슈퍼 | 대구광역시 동구 아양로34길 26 (신암동) | 053-942-6111 | 1987-04-30 | 개인 | 정상영업
900 | 명진마트 명진정보 | 대구광역시 동구 동대구로 596 (신암동) | 053-941-1291 | 1981-11-18 | 개인 | 정상영업
901 | 동대구윤업사 | 대구광역시 동구 아양로 164-1 (신암동) | 053-941-3559 | 1964-01-01 | 개인 | 정상영업
902 | <NA> | 대구광역시 동구 동대구로99길 18 (신암동) | 053-958-1501 | 1984-12-27 | 개인 | 정상영업
903 | <NA> | 대구광역시 동구 동북로 417-1 (신암동) | 053-942-4188 | 1980-12-22 | 개인 | 정상영업
904 | <NA> | 대구광역시 동구 아양로49길 77. 113동 1호 (신암동.보성2차상가) | 053-957-9424 | 1988-12-26 | 개인 | 정상영업
905 | 한빛의료기 | 대구광역시 동구 화랑로 105 (효목동) | 053-742-2922 | 2000-06-09 | 개인 | 정상영업
906 | 대구축산업협동조합 | 대구광역시 동구 동북로 296 (신암동) | 053-950-1272 | 2000-03-30 | 법인 | 정상영업
907 | <NA> | 대구광역시 동구 반야월북로11길 42 (각산동) | 053-962-1484 | 2000-03-17 | 개인 | 정상영업
908 | 영남슈퍼 | 대구광역시 동구 아양로15길 90-48 (신암동) | 053-942-5282 | 2000-02-10 | 개인 | 정상영업