Overview

Dataset statistics

Number of variables: 3
Number of observations: 560
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 0.7%
Total size in memory: 13.3 KiB
Average record size in memory: 24.2 B

Variable types

Text: 2
DateTime: 1

Dataset

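Description: Status of tobacco retail businesses in Jung-gu, Daegu Metropolitan City. (Provides the business name (업소명) and the business road-name address (업소도로명주소).)
Author: 대구광역시 중구 (Jung-gu, Daegu Metropolitan City)
URL: https://www.data.go.kr/data/15015694/fileData.do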

Alerts

데이터 기준일자 (data reference date) has a constant value [Constant]
Dataset has 4 (0.7%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2023-12-11 23:16:51.551092
Analysis finished: 2023-12-11 23:16:51.956595
Duration: 0.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

Distinct: 525
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 29
Median length: 22
Mean length: 6.8589286
Min length: 1
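
The length summary can be reproduced with standard-library Python. The sketch below is illustrative only: it runs on the five sample values of this variable listed in the report rather than on the full 560-row column, and the helper name `length_summary` is hypothetical (it is not part of ydata-profiling). It also cross-checks the reported mean against the reported total of 3841 characters over 560 observations.

```python
import statistics
import unicodedata

# Five sample values of the first text variable, copied from this report.
samples = ["대왕종합상사", "뽕산베이프", "씨유 대신태왕점", "경북상회", "씨유 남산e편한점"]


def length_summary(values):
    """Return max/median/mean/min character lengths (hypothetical helper)."""
    # Normalise to NFC so each precomposed Hangul syllable counts as one character.
    lengths = [len(unicodedata.normalize("NFC", v)) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


summary = length_summary(samples)

# Cross-check of the reported mean: total characters / number of observations.
reported_mean = round(3841 / 560, 7)
```

Note that the five-row figures describe only the sample, so they are not expected to match the full-column statistics.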

Characters and Unicode

Total characters: 3841
Distinct characters: 384
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
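
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories for one business name from the sample rows. It uses Python's standard `unicodedata` module, which exposes the general category but not the script or block, so those two breakdowns would need another data source. The helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced for this example, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category (hypothetical helper)."""
    counts = Counter()
    for ch in unicodedata.normalize("NFC", text):
        code = unicodedata.category(ch)
        # Fall back to the raw two-letter code for categories not listed above.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# One business name taken from the sample rows of this report.
counts = category_counts("미루바움 커피(MIRUBAUM COFFEE)")
```

Hangul syllables fall under "Other Letter", which is why that category dominates both text variables.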

Unique

Unique: 507
Unique (%): 90.5%
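
The report gives both a "Distinct" and a "Unique" count. Our reading, which is an assumption rather than something the report states, is that "Distinct" counts different values while "Unique" counts values that occur exactly once; that would explain why 525 distinct values include only 507 unique ones. A minimal sketch under that assumption, with a hypothetical helper and a toy list:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Toy data for illustration only; not the real column.
toy = ["씨유", "씨유", "복지마트", "복지마트", "경북상회", "대성슈퍼"]
n_distinct, n_unique = distinct_and_unique(toy)
```

Here the toy list has four distinct values, of which two occur exactly once.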

Sample

1st row: 대왕종합상사
2nd row: 뽕산베이프
3rd row: 씨유 대신태왕점
4th row: 경북상회
5th row: 씨유 남산e편한점

| Value | Count | Frequency (%) |
| 세븐일레븐 | 27 | 3.7% |
| 씨유 | 22 | 3.0% |
| 이마트24 | 17 | 2.3% |
|  | 15 | 2.1% |
| 주)코리아세븐 | 12 | 1.7% |
| 지에스(gs)25 | 6 | 0.8% |
| 위드미 | 6 | 0.8% |
| 홈마트 | 4 | 0.6% |
| gs25 | 4 | 0.6% |
| 대구 | 4 | 0.6% |
| Other values (556) | 609 | 83.9% |

Most occurring characters

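| Value | Count | Frequency (%) |
|  | 182 | 4.7% |
| (space) | 168 | 4.4% |
|  | 167 | 4.3% |
|  | 118 | 3.1% |
|  | 105 | 2.7% |
|  | 99 | 2.6% |
|  | 80 | 2.1% |
|  | 73 | 1.9% |
|  | 70 | 1.8% |
|  | 62 | 1.6% |
| Other values (374) | 2717 | 70.7% |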

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 3297 | 85.8% |
| Space Separator | 168 | 4.4% |
| Uppercase Letter | 123 | 3.2% |
| Decimal Number | 118 | 3.1% |
| Close Punctuation | 58 | 1.5% |
| Open Punctuation | 58 | 1.5% |
| Other Punctuation | 9 | 0.2% |
| Lowercase Letter | 9 | 0.2% |
| Dash Punctuation | 1 | < 0.1% |

Most frequent character per category

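Other Letter
| Value | Count | Frequency (%) |
|  | 182 | 5.5% |
|  | 167 | 5.1% |
|  | 118 | 3.6% |
|  | 105 | 3.2% |
|  | 99 | 3.0% |
|  | 80 | 2.4% |
|  | 73 | 2.2% |
|  | 70 | 2.1% |
|  | 62 | 1.9% |
|  | 61 | 1.9% |
| Other values (334) | 2280 | 69.2% |

Uppercase Letter
| Value | Count | Frequency (%) |
| S | 28 | 22.8% |
| G | 24 | 19.5% |
| C | 10 | 8.1% |
| A | 7 | 5.7% |
| O | 7 | 5.7% |
| U | 6 | 4.9% |
| T | 6 | 4.9% |
| M | 5 | 4.1% |
| Y | 4 | 3.3% |
| K | 4 | 3.3% |
| Other values (12) | 22 | 17.9% |

Decimal Number
| Value | Count | Frequency (%) |
| 2 | 59 | 50.0% |
| 4 | 28 | 23.7% |
| 5 | 27 | 22.9% |
| 3 | 2 | 1.7% |
| 8 | 1 | 0.8% |
| 1 | 1 | 0.8% |

Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2 | 22.2% |
| o | 2 | 22.2% |
| g | 2 | 22.2% |
| y | 1 | 11.1% |
| m | 1 | 11.1% |
| a | 1 | 11.1% |

Other Punctuation
| Value | Count | Frequency (%) |
| & | 5 | 55.6% |
| . | 4 | 44.4% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 168 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 58 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 58 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1 | 100.0% |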

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 3297 | 85.8% |
| Common | 412 | 10.7% |
| Latin | 132 | 3.4% |

Most frequent character per script

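Hangul
| Value | Count | Frequency (%) |
|  | 182 | 5.5% |
|  | 167 | 5.1% |
|  | 118 | 3.6% |
|  | 105 | 3.2% |
|  | 99 | 3.0% |
|  | 80 | 2.4% |
|  | 73 | 2.2% |
|  | 70 | 2.1% |
|  | 62 | 1.9% |
|  | 61 | 1.9% |
| Other values (334) | 2280 | 69.2% |

Latin
| Value | Count | Frequency (%) |
| S | 28 | 21.2% |
| G | 24 | 18.2% |
| C | 10 | 7.6% |
| A | 7 | 5.3% |
| O | 7 | 5.3% |
| U | 6 | 4.5% |
| T | 6 | 4.5% |
| M | 5 | 3.8% |
| Y | 4 | 3.0% |
| K | 4 | 3.0% |
| Other values (18) | 31 | 23.5% |

Common
| Value | Count | Frequency (%) |
| (space) | 168 | 40.8% |
| 2 | 59 | 14.3% |
| ) | 58 | 14.1% |
| ( | 58 | 14.1% |
| 4 | 28 | 6.8% |
| 5 | 27 | 6.6% |
| & | 5 | 1.2% |
| . | 4 | 1.0% |
| 3 | 2 | 0.5% |
| - | 1 | 0.2% |
| Other values (2) | 2 | 0.5% |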

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 3297 | 85.8% |
| ASCII | 544 | 14.2% |

Most frequent character per block

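Hangul
| Value | Count | Frequency (%) |
|  | 182 | 5.5% |
|  | 167 | 5.1% |
|  | 118 | 3.6% |
|  | 105 | 3.2% |
|  | 99 | 3.0% |
|  | 80 | 2.4% |
|  | 73 | 2.2% |
|  | 70 | 2.1% |
|  | 62 | 1.9% |
|  | 61 | 1.9% |
| Other values (334) | 2280 | 69.2% |

ASCII
| Value | Count | Frequency (%) |
| (space) | 168 | 30.9% |
| 2 | 59 | 10.8% |
| ) | 58 | 10.7% |
| ( | 58 | 10.7% |
| S | 28 | 5.1% |
| 4 | 28 | 5.1% |
| 5 | 27 | 5.0% |
| G | 24 | 4.4% |
| C | 10 | 1.8% |
| A | 7 | 1.3% |
| Other values (30) | 77 | 14.2% |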

업소도로명주소
Text

Distinct: 530
Distinct (%): 94.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 65
Median length: 51
Mean length: 27.860714
Min length: 1

Characters and Unicode

Total characters: 15602
Distinct characters: 232
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 525
Unique (%): 93.8%

Sample

1st row: 대구광역시 중구 북성로 66 (북성로2가)
2nd row: 대구광역시 중구 달구벌대로 2161. 1층 (삼덕동2가)
3rd row: 대구광역시 중구 달구벌대로393길 62. 서문목욕탕 1층 상가동 1-1호 (대신동)
4th row: 대구광역시 중구 큰장로26길 24. 1층 102호 (대신동)
5th row: 대구광역시 중구 달구벌대로 2020. 단지내상가 401동 106호 (남산동. 남산e편한세상)

| Value | Count | Frequency (%) |
| 대구광역시 | 532 | 17.6% |
| 중구 | 531 | 17.5% |
| 남산동 | 90 | 3.0% |
| 1층 | 89 | 2.9% |
| 달구벌대로 | 59 | 1.9% |
| 대신동 | 52 | 1.7% |
| 국채보상로 | 42 | 1.4% |
| 대봉동 | 34 | 1.1% |
| 중앙대로 | 33 | 1.1% |
| 태평로 | 32 | 1.1% |
| Other values (720) | 1532 | 50.6% |

Most occurring characters

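| Value | Count | Frequency (%) |
| (space) | 2738 | 17.5% |
|  | 1168 | 7.5% |
|  | 794 | 5.1% |
|  | 685 | 4.4% |
| 1 | 674 | 4.3% |
|  | 592 | 3.8% |
|  | 585 | 3.7% |
|  | 554 | 3.6% |
|  | 543 | 3.5% |
|  | 535 | 3.4% |
| Other values (222) | 6734 | 43.2% |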

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 8858 | 56.8% |
| Space Separator | 2738 | 17.5% |
| Decimal Number | 2536 | 16.3% |
| Close Punctuation | 535 | 3.4% |
| Open Punctuation | 535 | 3.4% |
| Other Punctuation | 274 | 1.8% |
| Dash Punctuation | 102 | 0.7% |
| Uppercase Letter | 21 | 0.1% |
| Lowercase Letter | 3 | < 0.1% |

Most frequent character per category

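Other Letter
| Value | Count | Frequency (%) |
|  | 1168 | 13.2% |
|  | 794 | 9.0% |
|  | 685 | 7.7% |
|  | 592 | 6.7% |
|  | 585 | 6.6% |
|  | 554 | 6.3% |
|  | 543 | 6.1% |
|  | 535 | 6.0% |
|  | 247 | 2.8% |
|  | 207 | 2.3% |
| Other values (194) | 2948 | 33.3% |

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 674 | 26.6% |
| 2 | 433 | 17.1% |
| 3 | 241 | 9.5% |
| 4 | 219 | 8.6% |
| 0 | 219 | 8.6% |
| 5 | 187 | 7.4% |
| 6 | 176 | 6.9% |
| 7 | 146 | 5.8% |
| 8 | 136 | 5.4% |
| 9 | 105 | 4.1% |

Uppercase Letter
| Value | Count | Frequency (%) |
| E | 6 | 28.6% |
| C | 6 | 28.6% |
| B | 2 | 9.5% |
| F | 1 | 4.8% |
| H | 1 | 4.8% |
| W | 1 | 4.8% |
| S | 1 | 4.8% |
| T | 1 | 4.8% |
| K | 1 | 4.8% |
| D | 1 | 4.8% |

Other Punctuation
| Value | Count | Frequency (%) |
| . | 272 | 99.3% |
| · | 2 | 0.7% |

Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2 | 66.7% |
| c | 1 | 33.3% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 2738 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 535 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 535 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 102 | 100.0% |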

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 8858 | 56.8% |
| Common | 6720 | 43.1% |
| Latin | 24 | 0.2% |

Most frequent character per script

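Hangul
| Value | Count | Frequency (%) |
|  | 1168 | 13.2% |
|  | 794 | 9.0% |
|  | 685 | 7.7% |
|  | 592 | 6.7% |
|  | 585 | 6.6% |
|  | 554 | 6.3% |
|  | 543 | 6.1% |
|  | 535 | 6.0% |
|  | 247 | 2.8% |
|  | 207 | 2.3% |
| Other values (194) | 2948 | 33.3% |

Common
| Value | Count | Frequency (%) |
| (space) | 2738 | 40.7% |
| 1 | 674 | 10.0% |
| ) | 535 | 8.0% |
| ( | 535 | 8.0% |
| 2 | 433 | 6.4% |
| . | 272 | 4.0% |
| 3 | 241 | 3.6% |
| 4 | 219 | 3.3% |
| 0 | 219 | 3.3% |
| 5 | 187 | 2.8% |
| Other values (6) | 667 | 9.9% |

Latin
| Value | Count | Frequency (%) |
| E | 6 | 25.0% |
| C | 6 | 25.0% |
| B | 2 | 8.3% |
| e | 2 | 8.3% |
| F | 1 | 4.2% |
| H | 1 | 4.2% |
| c | 1 | 4.2% |
| W | 1 | 4.2% |
| S | 1 | 4.2% |
| T | 1 | 4.2% |
| Other values (2) | 2 | 8.3% |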

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 8858 | 56.8% |
| ASCII | 6742 | 43.2% |
| None | 2 | < 0.1% |

Most frequent character per block

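ASCII
| Value | Count | Frequency (%) |
| (space) | 2738 | 40.6% |
| 1 | 674 | 10.0% |
| ) | 535 | 7.9% |
| ( | 535 | 7.9% |
| 2 | 433 | 6.4% |
| . | 272 | 4.0% |
| 3 | 241 | 3.6% |
| 4 | 219 | 3.2% |
| 0 | 219 | 3.2% |
| 5 | 187 | 2.8% |
| Other values (17) | 689 | 10.2% |

Hangul
| Value | Count | Frequency (%) |
|  | 1168 | 13.2% |
|  | 794 | 9.0% |
|  | 685 | 7.7% |
|  | 592 | 6.7% |
|  | 585 | 6.6% |
|  | 554 | 6.3% |
|  | 543 | 6.1% |
|  | 535 | 6.0% |
|  | 247 | 2.8% |
|  | 207 | 2.3% |
| Other values (194) | 2948 | 33.3% |

None
| Value | Count | Frequency (%) |
| · | 2 | 100.0% |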

데이터 기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB
Minimum: 2020-10-16 00:00:00
Maximum: 2020-10-16 00:00:00
Histogram with fixed size bins (bins=1)
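
The sample rows show this column as strings of the form `2020.10.16.`, while the summary lists a minimum and maximum of 2020-10-16 00:00:00. A minimal sketch of how such strings could be parsed and checked for constancy follows; the format string and the helper name `parse_reference_date` are assumptions made for this example, not something the report specifies.

```python
from datetime import datetime

# Three raw values as they appear in the sample rows of this report.
raw_dates = ["2020.10.16.", "2020.10.16.", "2020.10.16."]


def parse_reference_date(text):
    """Parse a 'YYYY.MM.DD.' string, trailing dot included (hypothetical helper)."""
    return datetime.strptime(text, "%Y.%m.%d.")


parsed = [parse_reference_date(d) for d in raw_dates]

# A column is constant when it holds exactly one distinct value.
is_constant = len(set(parsed)) == 1
```

With a single distinct value the minimum and maximum coincide, which matches the one-bin histogram.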

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
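
The profile reports zero missing cells, yet sample row 554 shows no business name and both text variables have a minimum length of 1. One possible explanation, which is an assumption and not something the report states, is that such cells hold a whitespace-only string rather than a null. The sketch below counts the two cases separately per column; `nullity_by_column` is a hypothetical helper and the single-space name in the second row is assumed for illustration.

```python
def nullity_by_column(rows, columns):
    """Per column, count nulls (None) and whitespace-only strings separately."""
    summary = {name: {"null": 0, "blank": 0} for name in columns}
    for row in rows:
        for name, cell in zip(columns, row):
            if cell is None:
                summary[name]["null"] += 1
            elif isinstance(cell, str) and cell.strip() == "":
                summary[name]["blank"] += 1
    return summary


columns = ["업소명", "업소도로명주소", "데이터 기준일자"]
rows = [
    ("금호자전거", "대구광역시 중구 남산로4길 82-1 (남산동)", "2020.10.16."),
    # Assumed for illustration: the name cell holds a single space, not a null.
    (" ", "대구광역시 중구 남성로 24 (남성로)", "2020.10.16."),
    ("롯데담배", "대구광역시 중구 중앙대로58길 20 (남산동)", "2020.10.16."),
]
summary = nullity_by_column(rows, columns)
```

Under this assumption a null-only check would report nothing missing, while the blank count would flag the empty name.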

Sample

|  | 업소명 | 업소도로명주소 | 데이터 기준일자 |
| 0 | 대왕종합상사 | 대구광역시 중구 북성로 66 (북성로2가) | 2020.10.16. |
| 1 | 뽕산베이프 | 대구광역시 중구 달구벌대로 2161. 1층 (삼덕동2가) | 2020.10.16. |
| 2 | 씨유 대신태왕점 | 대구광역시 중구 달구벌대로393길 62. 서문목욕탕 1층 상가동 1-1호 (대신동) | 2020.10.16. |
| 3 | 경북상회 | 대구광역시 중구 큰장로26길 24. 1층 102호 (대신동) | 2020.10.16. |
| 4 | 씨유 남산e편한점 | 대구광역시 중구 달구벌대로 2020. 단지내상가 401동 106호 (남산동. 남산e편한세상) | 2020.10.16. |
| 5 | 씨유 대봉더샵센트럴파크점 | 대구광역시 중구 대봉로 226. 1층 101호. 102호. 103(103호의 절반)호 (대봉동. 대봉화성그린빌아파트) | 2020.10.16. |
| 6 | (주)제네스 나인(이마트24반월당제네스) | 대구광역시 중구 중앙대로 312. 반월당 아너스 제네스타워 오피스텔 상가동 101.102호 (남산동) | 2020.10.16. |
| 7 | 데이앤데이 중앙로점 | 대구광역시 중구 중앙대로 432-13. 1층. 2층 (포정동) | 2020.10.16. |
| 8 | 이마트24 삼덕청아람점 | 대구광역시 중구 달구벌대로447길 66 (삼덕동3가) | 2020.10.16. |
| 9 | 미루바움 커피(MIRUBAUM COFFEE) | 대구광역시 중구 달구벌대로450길 32. 지상1층 (대봉동) | 2020.10.16. |

|  | 업소명 | 업소도로명주소 | 데이터 기준일자 |
| 550 | 시엔에스(C&S) | 대구광역시 중구 태평로 161 (태평로1가. 대구역지하상가15호.22호) | 2020.10.16. |
| 551 | 대일식품 | 대구광역시 중구 동성로 80-3 (북성로1가) | 2020.10.16. |
| 552 | 경인식품 | 대구광역시 중구 태평로 225 (동인동1가.22-2호) | 2020.10.16. |
| 553 | 금호자전거 | 대구광역시 중구 남산로4길 82-1 (남산동) | 2020.10.16. |
| 554 |  | 대구광역시 중구 남성로 24 (남성로) | 2020.10.16. |
| 555 | 롯데담배 | 대구광역시 중구 중앙대로58길 20 (남산동) | 2020.10.16. |
| 556 | 구멍가게 | 대구광역시 중구 달구벌대로 1929-12 (대신동) | 2020.10.16. |
| 557 | 대성슈퍼 | 대구광역시 중구 달성공원로6길 1 (대신동) | 2020.10.16. |
| 558 | 유니온호텔 | 대구광역시 중구 태평로 117 (태평로2가.유니온호텔) | 2020.10.16. |
| 559 | 대광식육점 | 대구광역시 중구 동덕로36길 58 (동인동4가) | 2020.10.16. |

Duplicate rows

Most frequently occurring

|  | 업소명 | 업소도로명주소 | 데이터 기준일자 | # duplicates |
| 0 |  |  | 2020.10.16. | 3 |
| 1 |  | 대구광역시 중구 북성로 51-3 (북성로2가) | 2020.10.16. | 2 |
| 2 | 복지마트 | 대구광역시 중구 남산로8길 34 (남산동) | 2020.10.16. | 2 |
| 3 | 승차권판매소 |  | 2020.10.16. | 2 |
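
The overview's figure of 4 duplicate rows appears to count distinct duplicated rows, since the table lists four of them, occurring 3, 2, 2 and 2 times. A minimal sketch of that grouping follows, using a hypothetical helper `duplicate_rows` and three rows taken from this report's tables (the repeated row is repeated here on purpose). It also recomputes the reported 0.7% as 4 out of 560.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows only; the first and third are identical.
rows = [
    ("복지마트", "대구광역시 중구 남산로8길 34 (남산동)", "2020.10.16."),
    ("대성슈퍼", "대구광역시 중구 달성공원로6길 1 (대신동)", "2020.10.16."),
    ("복지마트", "대구광역시 중구 남산로8길 34 (남산동)", "2020.10.16."),
]
dupes = duplicate_rows(rows)

# Share reported in the overview: 4 duplicated rows out of 560 observations.
reported_share = round(100 * 4 / 560, 1)
```

Rows are compared as whole tuples, so two records count as duplicates only when every column matches.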