Overview

Dataset statistics

Number of variables: 6
Number of observations: 642
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.2 KiB
Average record size in memory: 48.2 B

Variable types

Categorical: 1
Text: 3
DateTime: 2

Dataset

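Description: Data on tobacco retailers in Daedeok-gu, Daejeon Metropolitan City, with fields including license category (민원구분), business name, lot-number address, road-name address, and designation date.
URL: https://www.data.go.kr/data/3060369/fileData.do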

Alerts

데이터기준일자 has constant value "2023-06-08" (Constant)
민원구분 is highly imbalanced (58.0%) (Imbalance)
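
The 58.0% imbalance figure is consistent with one minus the normalised Shannon entropy of the three class counts listed for 민원구분 under Variables (537, 103 and 2). The sketch below assumes that definition. The function name `imbalance_score` is illustrative and is not taken from the ydata-profiling API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    distribution and k is the number of classes (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no imbalance information here
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Class counts reported for 민원구분 in this report
score = imbalance_score([537, 103, 2])
print(f"{score:.1%}")  # 58.0%
```

A score of 0 would mean perfectly balanced classes, and a score close to 1 would mean that almost all rows fall into one class.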

Reproduction

Analysis started: 2023-12-12 13:55:39.709423
Analysis finished: 2023-12-12 13:55:40.320986
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

민원구분
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

일반소매인 (general retailer): 537
구내소매인 (on-premises retailer): 103
자동판매기 (vending machine): 2

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
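
In this report, Distinct appears to count the different values in a column, while Unique counts the values that occur exactly once. That reading is an assumption, but it matches the figures here: three distinct categories, none of which occurs only once. A minimal sketch, with the column rebuilt from the reported counts and an illustrative helper name:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Column rebuilt from the counts in this report: 537 + 103 + 2 = 642 rows
column = ["일반소매인"] * 537 + ["구내소매인"] * 103 + ["자동판매기"] * 2
distinct, unique = distinct_and_unique(column)
print(distinct, unique)  # 3 0
```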

Sample

1st row: 일반소매인
2nd row: 일반소매인
3rd row: 일반소매인
4th row: 일반소매인
5th row: 일반소매인

Common Values

Value | Count | Frequency (%)
일반소매인 | 537 | 83.6%
구내소매인 | 103 | 16.0%
자동판매기 | 2 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: bar chart of the value counts listed under Common Values]

업소명
Text

Distinct: 622
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 19
Median length: 17
Mean length: 7.2507788
Min length: 1

Characters and Unicode

Total characters: 4655
Distinct characters: 429
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
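
As a rough illustration of the idea, Python's standard `unicodedata` module exposes the general category of each code point. The two-letter codes correspond to the category names used in this report, for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Nd` for Decimal Number and `Zs` for Space Separator. The helper name below is illustrative, and the standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Nd', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)

# A business name taken from the sample rows of this report
sample = "GS25 송촌스타점"
counts = category_counts(sample)
for code, n in sorted(counts.items()):
    print(code, n)
```

For this name the Hangul syllables fall under `Lo`, the letters `G` and `S` under `Lu`, the digits under `Nd`, and the space under `Zs`.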

Unique

Unique: 613
Unique (%): 95.5%

Sample

1st row: GS25 송촌스타점
2nd row: 지에스25 대덕비래점
3rd row: 대한식당
4th row: 제이마트
5th row: 세븐일레븐 대전중리본점

Words

Value | Count | Frequency (%)
세븐일레븐 | 32 | 3.6%
씨유 | 31 | 3.5%
지에스25 | 17 | 1.9%
이마트24 | 14 | 1.6%
gs25 | 12 | 1.4%
미니스톱 | 11 | 1.2%
주)코리아세븐 | 10 | 1.1%
주식회사 | 6 | 0.7%
매점 | 4 | 0.5%
지에스(gs)25 | 4 | 0.5%
Other values (686) | 741 | 84.0%
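
The lowercase entry `gs25` and the token `주)코리아세븐` in the word counts suggest that names are lowercased, split on whitespace, and trimmed of leading and trailing punctuation before counting. That procedure is inferred from the table and is not confirmed by the report. The function name `tokenize` and the input `(주)코리아세븐` mentioned in the note below are illustrative assumptions.

```python
import string
from collections import Counter

def tokenize(name):
    """Lowercase, split on whitespace, and trim punctuation from token edges."""
    tokens = (tok.strip(string.punctuation) for tok in name.lower().split())
    return [tok for tok in tokens if tok]

# Three business names from the sample rows of this report
names = ["GS25 송촌스타점", "지에스25 대덕비래점", "세븐일레븐 대전중리본점"]
word_counts = Counter(tok for name in names for tok in tokenize(name))
total = sum(word_counts.values())
for word, n in word_counts.most_common(3):
    print(word, n, f"{n / total:.1%}")
```

Under this assumption a name written as `(주)코리아세븐` would lose only its leading parenthesis, which matches the form that appears in the table.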

Most occurring characters

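Value | Count | Frequency (%)
(space) | 263 | 5.6%
(Hangul, glyph lost) | 200 | 4.3%
(Hangul, glyph lost) | 192 | 4.1%
(Hangul, glyph lost) | 143 | 3.1%
(Hangul, glyph lost) | 133 | 2.9%
(Hangul, glyph lost) | 129 | 2.8%
(Hangul, glyph lost) | 78 | 1.7%
(Hangul, glyph lost) | 76 | 1.6%
(Hangul, glyph lost) | 75 | 1.6%
2 | 73 | 1.6%
Other values (419) | 3293 | 70.7%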

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3993 | 85.8%
Space Separator | 263 | 5.6%
Decimal Number | 169 | 3.6%
Uppercase Letter | 102 | 2.2%
Open Punctuation | 55 | 1.2%
Close Punctuation | 55 | 1.2%
Lowercase Letter | 9 | 0.2%
Other Punctuation | 7 | 0.2%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

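Other Letter
Value | Count | Frequency (%)
(Hangul, glyph lost) | 200 | 5.0%
(Hangul, glyph lost) | 192 | 4.8%
(Hangul, glyph lost) | 143 | 3.6%
(Hangul, glyph lost) | 133 | 3.3%
(Hangul, glyph lost) | 129 | 3.2%
(Hangul, glyph lost) | 78 | 2.0%
(Hangul, glyph lost) | 76 | 1.9%
(Hangul, glyph lost) | 75 | 1.9%
(Hangul, glyph lost) | 73 | 1.8%
(Hangul, glyph lost) | 72 | 1.8%
Other values (375) | 2822 | 70.7%

Uppercase Letter
Value | Count | Frequency (%)
G | 27 | 26.5%
S | 27 | 26.5%
C | 8 | 7.8%
E | 4 | 3.9%
D | 4 | 3.9%
K | 4 | 3.9%
O | 4 | 3.9%
U | 3 | 2.9%
I | 3 | 2.9%
J | 3 | 2.9%
Other values (12) | 15 | 14.7%

Decimal Number
Value | Count | Frequency (%)
2 | 73 | 43.2%
5 | 47 | 27.8%
4 | 27 | 16.0%
1 | 9 | 5.3%
0 | 9 | 5.3%
3 | 2 | 1.2%
7 | 1 | 0.6%
9 | 1 | 0.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 33.3%
h | 1 | 11.1%
w | 1 | 11.1%
s | 1 | 11.1%
u | 1 | 11.1%
p | 1 | 11.1%
r | 1 | 11.1%

Other Punctuation
Value | Count | Frequency (%)
. | 4 | 57.1%
& | 2 | 28.6%
/ | 1 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 263 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 55 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 55 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%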

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3993 | 85.8%
Common | 551 | 11.8%
Latin | 111 | 2.4%

Most frequent character per script

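Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 200 | 5.0%
(Hangul, glyph lost) | 192 | 4.8%
(Hangul, glyph lost) | 143 | 3.6%
(Hangul, glyph lost) | 133 | 3.3%
(Hangul, glyph lost) | 129 | 3.2%
(Hangul, glyph lost) | 78 | 2.0%
(Hangul, glyph lost) | 76 | 1.9%
(Hangul, glyph lost) | 75 | 1.9%
(Hangul, glyph lost) | 73 | 1.8%
(Hangul, glyph lost) | 72 | 1.8%
Other values (375) | 2822 | 70.7%

Latin
Value | Count | Frequency (%)
G | 27 | 24.3%
S | 27 | 24.3%
C | 8 | 7.2%
E | 4 | 3.6%
D | 4 | 3.6%
K | 4 | 3.6%
O | 4 | 3.6%
U | 3 | 2.7%
I | 3 | 2.7%
e | 3 | 2.7%
Other values (19) | 24 | 21.6%

Common
Value | Count | Frequency (%)
(space) | 263 | 47.7%
2 | 73 | 13.2%
( | 55 | 10.0%
) | 55 | 10.0%
5 | 47 | 8.5%
4 | 27 | 4.9%
1 | 9 | 1.6%
0 | 9 | 1.6%
. | 4 | 0.7%
& | 2 | 0.4%
Other values (5) | 7 | 1.3%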

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3993 | 85.8%
ASCII | 662 | 14.2%

Most frequent character per block

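ASCII
Value | Count | Frequency (%)
(space) | 263 | 39.7%
2 | 73 | 11.0%
( | 55 | 8.3%
) | 55 | 8.3%
5 | 47 | 7.1%
4 | 27 | 4.1%
G | 27 | 4.1%
S | 27 | 4.1%
1 | 9 | 1.4%
0 | 9 | 1.4%
Other values (34) | 70 | 10.6%

Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 200 | 5.0%
(Hangul, glyph lost) | 192 | 4.8%
(Hangul, glyph lost) | 143 | 3.6%
(Hangul, glyph lost) | 133 | 3.3%
(Hangul, glyph lost) | 129 | 3.2%
(Hangul, glyph lost) | 78 | 2.0%
(Hangul, glyph lost) | 76 | 1.9%
(Hangul, glyph lost) | 75 | 1.9%
(Hangul, glyph lost) | 73 | 1.8%
(Hangul, glyph lost) | 72 | 1.8%
Other values (375) | 2822 | 70.7%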

업소지번주소
Text

Distinct: 636
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 44
Median length: 39
Mean length: 24.057632
Min length: 7

Characters and Unicode

Total characters: 15445
Distinct characters: 198
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 631
Unique (%): 98.3%

Sample

1st row: 대전광역시 대덕구 송촌동 467-14
2nd row: 대전광역시 대덕구 비래동 144-1
3rd row: 대전광역시 대덕구 읍내동 100 대한통운
4th row: 대전광역시 대덕구 석봉동 575-5
5th row: 대전광역시 대덕구 중리동 376-5

Words

Value | Count | Frequency (%)
대전광역시 | 637 | 19.4%
대덕구 | 636 | 19.4%
(token lost) | 110 | 3.4%
오정동 | 106 | 3.2%
중리동 | 95 | 2.9%
1호 | 58 | 1.8%
비래동 | 56 | 1.7%
신탄진동 | 52 | 1.6%
석봉동 | 48 | 1.5%
송촌동 | 48 | 1.5%
Other values (705) | 1433 | 43.7%

Most occurring characters

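Value | Count | Frequency (%)
(space) | 3369 | 21.8%
(Hangul, glyph lost) | 1335 | 8.6%
1 | 697 | 4.5%
(Hangul, glyph lost) | 669 | 4.3%
(Hangul, glyph lost) | 655 | 4.2%
(Hangul, glyph lost) | 644 | 4.2%
(Hangul, glyph lost) | 641 | 4.2%
(Hangul, glyph lost) | 639 | 4.1%
(Hangul, glyph lost) | 638 | 4.1%
(Hangul, glyph lost) | 637 | 4.1%
Other values (188) | 5521 | 35.7%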

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9104 | 58.9%
Space Separator | 3369 | 21.8%
Decimal Number | 2776 | 18.0%
Dash Punctuation | 152 | 1.0%
Uppercase Letter | 24 | 0.2%
Close Punctuation | 7 | < 0.1%
Open Punctuation | 7 | < 0.1%
Other Punctuation | 4 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

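Other Letter
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1335 | 14.7%
(Hangul, glyph lost) | 669 | 7.3%
(Hangul, glyph lost) | 655 | 7.2%
(Hangul, glyph lost) | 644 | 7.1%
(Hangul, glyph lost) | 641 | 7.0%
(Hangul, glyph lost) | 639 | 7.0%
(Hangul, glyph lost) | 638 | 7.0%
(Hangul, glyph lost) | 637 | 7.0%
(Hangul, glyph lost) | 568 | 6.2%
(Hangul, glyph lost) | 476 | 5.2%
Other values (158) | 2202 | 24.2%

Uppercase Letter
Value | Count | Frequency (%)
K | 4 | 16.7%
B | 4 | 16.7%
G | 3 | 12.5%
T | 3 | 12.5%
L | 2 | 8.3%
S | 2 | 8.3%
A | 1 | 4.2%
O | 1 | 4.2%
W | 1 | 4.2%
E | 1 | 4.2%
Other values (2) | 2 | 8.3%

Decimal Number
Value | Count | Frequency (%)
1 | 697 | 25.1%
2 | 367 | 13.2%
4 | 321 | 11.6%
3 | 281 | 10.1%
0 | 218 | 7.9%
5 | 203 | 7.3%
6 | 183 | 6.6%
8 | 183 | 6.6%
7 | 170 | 6.1%
9 | 153 | 5.5%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 50.0%
& | 2 | 50.0%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 50.0%
e | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3369 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 152 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%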

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9104 | 58.9%
Common | 6315 | 40.9%
Latin | 26 | 0.2%

Most frequent character per script

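Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1335 | 14.7%
(Hangul, glyph lost) | 669 | 7.3%
(Hangul, glyph lost) | 655 | 7.2%
(Hangul, glyph lost) | 644 | 7.1%
(Hangul, glyph lost) | 641 | 7.0%
(Hangul, glyph lost) | 639 | 7.0%
(Hangul, glyph lost) | 638 | 7.0%
(Hangul, glyph lost) | 637 | 7.0%
(Hangul, glyph lost) | 568 | 6.2%
(Hangul, glyph lost) | 476 | 5.2%
Other values (158) | 2202 | 24.2%

Common
Value | Count | Frequency (%)
(space) | 3369 | 53.3%
1 | 697 | 11.0%
2 | 367 | 5.8%
4 | 321 | 5.1%
3 | 281 | 4.4%
0 | 218 | 3.5%
5 | 203 | 3.2%
6 | 183 | 2.9%
8 | 183 | 2.9%
7 | 170 | 2.7%
Other values (6) | 323 | 5.1%

Latin
Value | Count | Frequency (%)
K | 4 | 15.4%
B | 4 | 15.4%
G | 3 | 11.5%
T | 3 | 11.5%
L | 2 | 7.7%
S | 2 | 7.7%
s | 1 | 3.8%
A | 1 | 3.8%
O | 1 | 3.8%
W | 1 | 3.8%
Other values (4) | 4 | 15.4%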

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9104 | 58.9%
ASCII | 6341 | 41.1%

Most frequent character per block

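ASCII
Value | Count | Frequency (%)
(space) | 3369 | 53.1%
1 | 697 | 11.0%
2 | 367 | 5.8%
4 | 321 | 5.1%
3 | 281 | 4.4%
0 | 218 | 3.4%
5 | 203 | 3.2%
6 | 183 | 2.9%
8 | 183 | 2.9%
7 | 170 | 2.7%
Other values (20) | 349 | 5.5%

Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1335 | 14.7%
(Hangul, glyph lost) | 669 | 7.3%
(Hangul, glyph lost) | 655 | 7.2%
(Hangul, glyph lost) | 644 | 7.1%
(Hangul, glyph lost) | 641 | 7.0%
(Hangul, glyph lost) | 639 | 7.0%
(Hangul, glyph lost) | 638 | 7.0%
(Hangul, glyph lost) | 637 | 7.0%
(Hangul, glyph lost) | 568 | 6.2%
(Hangul, glyph lost) | 476 | 5.2%
Other values (158) | 2202 | 24.2%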

업소도로명주소
Text

Distinct: 525
Distinct (%): 81.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 57
Median length: 50
Mean length: 24.308411
Min length: 1

Characters and Unicode

Total characters: 15606
Distinct characters: 217
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 522
Unique (%): 81.3%

Sample

1st row: 대전광역시 대덕구 송촌북로12번길 1 (송촌동)
2nd row: 대전광역시 대덕구 비래동로32번길 36. 1층 (비래동)
3rd row: 대전광역시 대덕구 신탄진로 1. 대한통운 사무동 지하1층 (읍내동)
4th row: 대전광역시 대덕구 대덕대로1585번길 14. 716-5 (석봉동)
5th row: 대전광역시 대덕구 중리로 63. 1층 (중리동)

Words

Value | Count | Frequency (%)
대전광역시 | 525 | 17.7%
대덕구 | 525 | 17.7%
1층 | 109 | 3.7%
오정동 | 84 | 2.8%
중리동 | 79 | 2.7%
비래동 | 44 | 1.5%
신탄진동 | 44 | 1.5%
송촌동 | 40 | 1.3%
석봉동 | 38 | 1.3%
대화동 | 28 | 0.9%
Other values (642) | 1450 | 48.9%

Most occurring characters

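Value | Count | Frequency (%)
(space) | 2703 | 17.3%
(Hangul, glyph lost) | 1281 | 8.2%
1 | 689 | 4.4%
(Hangul, glyph lost) | 646 | 4.1%
(Hangul, glyph lost) | 618 | 4.0%
(Hangul, glyph lost) | 569 | 3.6%
) | 533 | 3.4%
( | 533 | 3.4%
(Hangul, glyph lost) | 532 | 3.4%
(Hangul, glyph lost) | 527 | 3.4%
Other values (207) | 6975 | 44.7%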

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9088 | 58.2%
Space Separator | 2703 | 17.3%
Decimal Number | 2416 | 15.5%
Close Punctuation | 533 | 3.4%
Open Punctuation | 533 | 3.4%
Other Punctuation | 268 | 1.7%
Dash Punctuation | 39 | 0.2%
Uppercase Letter | 24 | 0.2%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

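Other Letter
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1281 | 14.1%
(Hangul, glyph lost) | 646 | 7.1%
(Hangul, glyph lost) | 618 | 6.8%
(Hangul, glyph lost) | 569 | 6.3%
(Hangul, glyph lost) | 532 | 5.9%
(Hangul, glyph lost) | 527 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 506 | 5.6%
(Hangul, glyph lost) | 307 | 3.4%
Other values (181) | 3052 | 33.6%

Decimal Number
Value | Count | Frequency (%)
1 | 689 | 28.5%
2 | 265 | 11.0%
3 | 217 | 9.0%
4 | 215 | 8.9%
0 | 205 | 8.5%
7 | 193 | 8.0%
5 | 188 | 7.8%
6 | 183 | 7.6%
8 | 144 | 6.0%
9 | 117 | 4.8%

Uppercase Letter
Value | Count | Frequency (%)
B | 6 | 25.0%
K | 5 | 20.8%
G | 3 | 12.5%
T | 3 | 12.5%
S | 3 | 12.5%
A | 2 | 8.3%
D | 1 | 4.2%
L | 1 | 4.2%

Other Punctuation
Value | Count | Frequency (%)
. | 265 | 98.9%
& | 3 | 1.1%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 50.0%
e | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2703 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 533 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 533 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 39 | 100.0%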

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9088 | 58.2%
Common | 6492 | 41.6%
Latin | 26 | 0.2%

Most frequent character per script

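Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1281 | 14.1%
(Hangul, glyph lost) | 646 | 7.1%
(Hangul, glyph lost) | 618 | 6.8%
(Hangul, glyph lost) | 569 | 6.3%
(Hangul, glyph lost) | 532 | 5.9%
(Hangul, glyph lost) | 527 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 506 | 5.6%
(Hangul, glyph lost) | 307 | 3.4%
Other values (181) | 3052 | 33.6%

Common
Value | Count | Frequency (%)
(space) | 2703 | 41.6%
1 | 689 | 10.6%
) | 533 | 8.2%
( | 533 | 8.2%
2 | 265 | 4.1%
. | 265 | 4.1%
3 | 217 | 3.3%
4 | 215 | 3.3%
0 | 205 | 3.2%
7 | 193 | 3.0%
Other values (6) | 674 | 10.4%

Latin
Value | Count | Frequency (%)
B | 6 | 23.1%
K | 5 | 19.2%
G | 3 | 11.5%
T | 3 | 11.5%
S | 3 | 11.5%
A | 2 | 7.7%
s | 1 | 3.8%
D | 1 | 3.8%
L | 1 | 3.8%
e | 1 | 3.8%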

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9088 | 58.2%
ASCII | 6518 | 41.8%

Most frequent character per block

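ASCII
Value | Count | Frequency (%)
(space) | 2703 | 41.5%
1 | 689 | 10.6%
) | 533 | 8.2%
( | 533 | 8.2%
2 | 265 | 4.1%
. | 265 | 4.1%
3 | 217 | 3.3%
4 | 215 | 3.3%
0 | 205 | 3.1%
7 | 193 | 3.0%
Other values (16) | 700 | 10.7%

Hangul
Value | Count | Frequency (%)
(Hangul, glyph lost) | 1281 | 14.1%
(Hangul, glyph lost) | 646 | 7.1%
(Hangul, glyph lost) | 618 | 6.8%
(Hangul, glyph lost) | 569 | 6.3%
(Hangul, glyph lost) | 532 | 5.9%
(Hangul, glyph lost) | 527 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 525 | 5.8%
(Hangul, glyph lost) | 506 | 5.6%
(Hangul, glyph lost) | 307 | 3.4%
Other values (181) | 3052 | 33.6%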

지정일자
Date

Distinct: 560
Distinct (%): 87.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB
Minimum: 1971-09-30 00:00:00
Maximum: 2022-05-11 00:00:00
Histogram with fixed size bins (bins=50)
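
As a sketch of what fixed-size binning means for a date column, the snippet below assigns a date to one of 50 equal-width bins between the minimum and maximum reported for this variable. The report's own plotting code is not shown, so the function name and the rule that the maximum falls in the last bin are assumptions.

```python
from datetime import date

def bin_index(d, start, end, bins=50):
    """Index of the equal-width bin containing date d, for bins spanning [start, end]."""
    span = (end - start).days
    if span == 0:
        return 0  # a constant column collapses into a single bin
    idx = (d - start).days * bins // span
    return min(idx, bins - 1)  # the maximum is placed in the last bin

# Minimum and maximum reported for this variable
start, end = date(1971, 9, 30), date(2022, 5, 11)
first_bin = bin_index(start, start, end)
last_bin = bin_index(end, start, end)
print(first_bin, last_bin)  # 0 49
```

With roughly 50 years between the two dates, each bin covers about one year of designations.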

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB
Minimum: 2023-06-08 00:00:00
Maximum: 2023-06-08 00:00:00
Histogram with fixed size bins (bins=1)

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
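
The report counts zero missing cells, yet the last rows of the sample show empty business names and addresses, and two text columns report a minimum length of 1. One possible explanation, stated here as an assumption, is that blanks are stored as whitespace strings rather than nulls. The sketch below counts such cells in a list of row dictionaries. The helper name is illustrative and the second row is a hypothetical stand-in.

```python
def count_blank_cells(rows, columns):
    """Count cells that are None or contain only whitespace, per column."""
    blanks = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None or str(value).strip() == "":
                blanks[col] += 1
    return blanks

rows = [
    # Values taken from the last sample row of this report
    {"업소명": "왔다상회", "업소도로명주소": "대전광역시 대덕구 석봉북로9번길 4 (석봉동)"},
    # Hypothetical row in which blanks are stored as a single space
    {"업소명": " ", "업소도로명주소": " "},
]
blank_counts = count_blank_cells(rows, ["업소명", "업소도로명주소"])
print(blank_counts)
```

If a check like this finds blank cells in the source file, converting them to nulls before profiling would make them appear in the nullity plots.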

Sample

First rows

Row | 민원구분 | 업소명 | 업소지번주소 | 업소도로명주소 | 지정일자 | 데이터기준일자
0 | 일반소매인 | GS25 송촌스타점 | 대전광역시 대덕구 송촌동 467-14 | 대전광역시 대덕구 송촌북로12번길 1 (송촌동) | 2022-05-11 | 2023-06-08
1 | 일반소매인 | 지에스25 대덕비래점 | 대전광역시 대덕구 비래동 144-1 | 대전광역시 대덕구 비래동로32번길 36. 1층 (비래동) | 2022-05-03 | 2023-06-08
2 | 일반소매인 | 대한식당 | 대전광역시 대덕구 읍내동 100 대한통운 | 대전광역시 대덕구 신탄진로 1. 대한통운 사무동 지하1층 (읍내동) | 2022-04-18 | 2023-06-08
3 | 일반소매인 | 제이마트 | 대전광역시 대덕구 석봉동 575-5 | 대전광역시 대덕구 대덕대로1585번길 14. 716-5 (석봉동) | 2022-04-05 | 2023-06-08
4 | 일반소매인 | 세븐일레븐 대전중리본점 | 대전광역시 대덕구 중리동 376-5 | 대전광역시 대덕구 중리로 63. 1층 (중리동) | 2022-03-28 | 2023-06-08
5 | 일반소매인 | 세븐일레븐 대전송촌나드리점 | 대전광역시 대덕구 송촌동 496-6 | 대전광역시 대덕구 선비마을로23번길 39. 1층 (송촌동) | 2022-03-22 | 2023-06-08
6 | 일반소매인 | 클로즈유닉 | 대전광역시 대덕구 중리동 152-10 | 대전광역시 대덕구 중리동로40번길 21. 1층 (중리동) | 2022-03-07 | 2023-06-08
7 | 구내소매인 | GS25 비래학사점 | 대전광역시 대덕구 비래동 149-28 | 대전광역시 대덕구 우암로 430. 1층 (비래동) | 2022-02-28 | 2023-06-08
8 | 일반소매인 | 씨유 대전동일스위트점 | 대전광역시 대덕구 신탄진동 771 대전 동일스위트 리버스카이 1단지 | 대전광역시 대덕구 대청로 43. B동 1층 104호 (신탄진동. 대전 동일스위트 리버스카이 1단지) | 2022-01-27 | 2023-06-08
9 | 일반소매인 | 세븐일레븐 대전신탄대로점 | 대전광역시 대덕구 신탄진동 117-1 | 대전광역시 대덕구 신탄진로844번길 2 (신탄진동) | 2022-01-21 | 2023-06-08

Last rows

Row | 민원구분 | 업소명 | 업소지번주소 | 업소도로명주소 | 지정일자 | 데이터기준일자
632 | 일반소매인 | (blank) | (blank) | (blank) | 1985-05-25 | 2023-06-08
633 | 일반소매인 | (blank) | 대전광역시 대덕구 오정동 196번지 1 호 | (blank) | 1984-06-22 | 2023-06-08
634 | 일반소매인 | (blank) | 대전광역시 대덕구 읍내동 240번지 58 호 | 대전광역시 대덕구 대전로1375번길 25 (읍내동) | 1983-09-20 | 2023-06-08
635 | 구내소매인 | (blank) | 대전광역시 대덕구 석봉동 388호 | (blank) | 1981-06-05 | 2023-06-08
636 | 일반소매인 | (blank) | 대전광역시 대덕구 법동 77번지 8 호 | (blank) | 1981-05-14 | 2023-06-08
637 | 일반소매인 | (blank) | 대전광역시 대덕구 삼정동 258호 | (blank) | 1980-12-27 | 2023-06-08
638 | 일반소매인 | (blank) | 대전광역시 대덕구 오정동 66호 | (blank) | 1980-12-13 | 2023-06-08
639 | 일반소매인 | (blank) | 대전광역시 대덕구 오정동 263번지 4 호 | (blank) | 1980-12-13 | 2023-06-08
640 | 일반소매인 | (blank) | 대전광역시 대덕구 석봉동 176호 | (blank) | 1977-03-07 | 2023-06-08
641 | 일반소매인 | 왔다상회 | 대전광역시 대덕구 석봉동 306번지 34호 | 대전광역시 대덕구 석봉북로9번길 4 (석봉동) | 1971-09-30 | 2023-06-08