Overview

Dataset statistics

Number of variables: 5
Number of observations: 245
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 KiB
Average record size in memory: 41.5 B
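
The two memory figures above can be cross-checked against each other. The sketch below multiplies the reported average record size by the number of observations and converts the result to KiB; the variable names are illustrative, and the small gap to the reported total comes from rounding in the report.

```python
# Figures copied from the "Dataset statistics" block above.
n_observations = 245
avg_record_bytes = 41.5  # reported average record size, in bytes

# Total size implied by the per-record average, converted to KiB.
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024

print(f"{total_bytes:.1f} B = {total_kib:.1f} KiB")
```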

Variable types

Numeric: 1
Categorical: 1
Text: 2
DateTime: 1

Dataset

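Description: Provides information on the status of tobacco retail businesses in Uiwang City (의왕시): business name, license date, license cancellation date, business status, road-name postal code, road-name address, and location postal code.
URL: https://www.data.go.kr/data/15092429/fileData.do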

Alerts

민원구분 (civil petition category) is highly imbalanced (62.1%) [Imbalance]
번호 (number) has unique values [Unique]
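
The 62.1% imbalance figure is consistent with one minus the normalized Shannon entropy of the class shares (227 and 18 out of 245). The sketch below reproduces the number under that assumption; that the profiling tool computes its score exactly this way is assumed here, and `imbalance_score` is an illustrative name, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch: one class has nothing to balance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts of 민원구분 as reported in this document.
score = imbalance_score([227, 18])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 and a column dominated by one class approaches 1, which is why a 92.7% / 7.3% split is flagged.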

Reproduction

Analysis started: 2023-12-12 03:43:16.850961
Analysis finished: 2023-12-12 03:43:17.699946
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (number)
Real number (ℝ)

UNIQUE

Distinct: 245
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 123
Minimum: 1
Maximum: 245
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13.2
Q1: 62
Median: 123
Q3: 184
95-th percentile: 232.8
Maximum: 245
Range: 244
Interquartile range (IQR): 122
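
These quantiles can be reproduced with the standard library if the column is taken to be the integers 1 to 245. That is an assumption, but it fits the reported minimum, maximum, distinct count, and strict monotonicity. The sketch uses linear interpolation between the observed minimum and maximum (`method="inclusive"`).

```python
import statistics

# Assumed contents of 번호: the integers 1..245.
values = list(range(1, 246))

# method="inclusive" interpolates linearly between the smallest and largest value.
twentiles = statistics.quantiles(values, n=20, method="inclusive")
quartiles = statistics.quantiles(values, n=4, method="inclusive")

p5, p95 = twentiles[0], twentiles[-1]
q1, median, q3 = quartiles

print("5%:", p5, "Q1:", q1, "median:", median, "Q3:", q3, "95%:", p95)
print("range:", max(values) - min(values), "IQR:", q3 - q1)
```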

Descriptive statistics

Standard deviation: 70.869599
Coefficient of variation (CV): 0.5761756
Kurtosis: -1.2
Mean: 123
Median Absolute Deviation (MAD): 61
Skewness: 0
Sum: 30135
Variance: 5022.5
Monotonicity: Strictly increasing
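
The descriptive statistics follow from the same assumption that the column holds the integers 1 to 245. The sketch below recomputes them with the standard library; note that the reported variance matches the sample variance (division by n - 1), not the population variance.

```python
import math
import statistics

values = list(range(1, 246))  # assumed contents of 번호: the integers 1..245

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 6), round(cv, 7), mad)
```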
Histogram with fixed size bins (bins=50)
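
As an illustration of what fixed-size binning means, the sketch below splits the range between the minimum and maximum into 50 equal-width intervals and counts the values in each. It is a plain-Python stand-in, not the plotting code used to build the report, and `fixed_width_histogram` is a hypothetical helper name.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    if width == 0:
        # All values are identical: place everything in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

# Assumed contents of 번호: the integers 1..245.
counts = fixed_width_histogram(list(range(1, 246)), bins=50)
print(len(counts), sum(counts))
```

Because the values are evenly spaced, every bin holds roughly the same number of observations, which matches the flat shape implied by a skewness of 0 and a kurtosis of -1.2.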
Common values
Value | Count | Frequency (%)
1 | 1 | 0.4%
155 | 1 | 0.4%
157 | 1 | 0.4%
158 | 1 | 0.4%
159 | 1 | 0.4%
160 | 1 | 0.4%
161 | 1 | 0.4%
162 | 1 | 0.4%
163 | 1 | 0.4%
164 | 1 | 0.4%
Other values (235) | 235 | 95.9%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.4%
2 | 1 | 0.4%
3 | 1 | 0.4%
4 | 1 | 0.4%
5 | 1 | 0.4%
6 | 1 | 0.4%
7 | 1 | 0.4%
8 | 1 | 0.4%
9 | 1 | 0.4%
10 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
245 | 1 | 0.4%
244 | 1 | 0.4%
243 | 1 | 0.4%
242 | 1 | 0.4%
241 | 1 | 0.4%
240 | 1 | 0.4%
239 | 1 | 0.4%
238 | 1 | 0.4%
237 | 1 | 0.4%
236 | 1 | 0.4%

민원구분 (civil petition category)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

제7조의3제2항에따른경우: 227
제7조의3제3항에따른경우: 18

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제7조의3제2항에따른경우
2nd row: 제7조의3제2항에따른경우
3rd row: 제7조의3제2항에따른경우
4th row: 제7조의3제2항에따른경우
5th row: 제7조의3제2항에따른경우

Common Values

Value | Count | Frequency (%)
제7조의3제2항에따른경우 | 227 | 92.7%
제7조의3제3항에따른경우 | 18 | 7.3%
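
A frequency table of this kind can be derived from value counts. The sketch below rebuilds the column from the two reported counts (the row order is not known and does not matter for counting) and computes each value's share of the 245 observations.

```python
from collections import Counter

# Rebuild the column from the reported counts; the labels are copied from the report.
column = ["제7조의3제2항에따른경우"] * 227 + ["제7조의3제3항에따른경우"] * 18

counts = Counter(column)
total = sum(counts.values())

# One (value, count, percentage) row per distinct value, most frequent first.
table = [
    (value, count, round(100 * count / total, 1))
    for value, count in counts.most_common()
]

for value, count, pct in table:
    print(f"{value} | {count} | {pct}%")
```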

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
제7조의3제2항에따른경우 | 227 | 92.7%
제7조의3제3항에따른경우 | 18 | 7.3%

업소명 (business name)
Text

Distinct: 244
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 23
Median length: 16
Mean length: 9.2285714
Min length: 2

Characters and Unicode

Total characters: 2261
Distinct characters: 291
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
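
As an illustration of the character properties mentioned above, the sketch below counts the Unicode general categories in one of the sample values using Python's standard `unicodedata` module. The mapping from two-letter category codes to the long names used in this report is written out by hand and covers only the categories that appear here. Scripts and blocks are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "So": "Other Symbol",
    "Sm": "Math Symbol",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

# Third sample value of the business-name column.
counts = category_counts("이마트24 SMART지지대주유소점")
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter" because they have no upper or lower case, which explains why that category dominates both text variables.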

Unique

Unique: 243
Unique (%): 99.2%
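
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. With 245 rows, 244 distinct values, and 243 unique values, exactly one name must appear twice. The sketch below shows the distinction on hypothetical toy data; `distinct_and_unique` is an illustrative helper, not part of any library.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy column: "B" is repeated, so it is distinct but not unique.
toy = distinct_and_unique(["A", "B", "B", "C"])

# Hypothetical column shaped like this variable: 243 one-off names plus one name used twice.
shaped = distinct_and_unique([f"name{i}" for i in range(243)] + ["dup", "dup"])

print(toy, shaped)
```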

Sample

1st row: 씨유 의왕청계점
2nd row: 씨스페이스24 백운호수점
3rd row: 이마트24 SMART지지대주유소점
4th row: 비비(BB)마켓
5th row: 지에스25 의왕퍼스트힐점
Value | Count | Frequency (%)
세븐일레븐 | 19 | 5.3%
씨유 | 16 | 4.5%
지에스(gs)25 | 11 | 3.1%
주)코리아세븐 | 10 | 2.8%
gs25 | 10 | 2.8%
이마트24 | 5 | 1.4%
지에스25 | 5 | 1.4%
주식회사 | 4 | 1.1%
의왕점 | 3 | 0.8%
의왕청계점 | 3 | 0.8%
Other values (259) | 272 | 76.0%

Most occurring characters

ValueCountFrequency (%)
127
 
5.6%
119
 
5.3%
119
 
5.3%
114
 
5.0%
57
 
2.5%
2 55
 
2.4%
52
 
2.3%
) 49
 
2.2%
5 48
 
2.1%
( 48
 
2.1%
Other values (281) 1473
65.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1807 | 79.9%
Decimal Number | 122 | 5.4%
Space Separator | 114 | 5.0%
Uppercase Letter | 88 | 3.9%
Close Punctuation | 49 | 2.2%
Open Punctuation | 48 | 2.1%
Lowercase Letter | 29 | 1.3%
Other Symbol | 3 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
127
 
7.0%
119
 
6.6%
119
 
6.6%
57
 
3.2%
52
 
2.9%
47
 
2.6%
43
 
2.4%
40
 
2.2%
39
 
2.2%
38
 
2.1%
Other values (243) 1126
62.3%
Uppercase Letter
ValueCountFrequency (%)
S 28
31.8%
G 28
31.8%
C 8
 
9.1%
U 6
 
6.8%
I 3
 
3.4%
T 3
 
3.4%
L 2
 
2.3%
B 2
 
2.3%
M 2
 
2.3%
K 1
 
1.1%
Other values (5) 5
 
5.7%
Lowercase Letter
ValueCountFrequency (%)
s 10
34.5%
g 9
31.0%
e 2
 
6.9%
t 2
 
6.9%
i 1
 
3.4%
f 1
 
3.4%
l 1
 
3.4%
k 1
 
3.4%
r 1
 
3.4%
a 1
 
3.4%
Decimal Number
ValueCountFrequency (%)
2 55
45.1%
5 48
39.3%
4 10
 
8.2%
9 3
 
2.5%
3 2
 
1.6%
1 2
 
1.6%
7 1
 
0.8%
6 1
 
0.8%
Space Separator
ValueCountFrequency (%)
114
100.0%
Close Punctuation
ValueCountFrequency (%)
) 49
100.0%
Open Punctuation
ValueCountFrequency (%)
( 48
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1810 | 80.1%
Common | 334 | 14.8%
Latin | 117 | 5.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
127
 
7.0%
119
 
6.6%
119
 
6.6%
57
 
3.1%
52
 
2.9%
47
 
2.6%
43
 
2.4%
40
 
2.2%
39
 
2.2%
38
 
2.1%
Other values (244) 1129
62.4%
Latin
ValueCountFrequency (%)
S 28
23.9%
G 28
23.9%
s 10
 
8.5%
g 9
 
7.7%
C 8
 
6.8%
U 6
 
5.1%
I 3
 
2.6%
T 3
 
2.6%
L 2
 
1.7%
B 2
 
1.7%
Other values (15) 18
15.4%
Common
ValueCountFrequency (%)
114
34.1%
2 55
16.5%
) 49
14.7%
5 48
14.4%
( 48
14.4%
4 10
 
3.0%
9 3
 
0.9%
3 2
 
0.6%
1 2
 
0.6%
7 1
 
0.3%
Other values (2) 2
 
0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1807 | 79.9%
ASCII | 451 | 19.9%
None | 3 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
127
 
7.0%
119
 
6.6%
119
 
6.6%
57
 
3.2%
52
 
2.9%
47
 
2.6%
43
 
2.4%
40
 
2.2%
39
 
2.2%
38
 
2.1%
Other values (243) 1126
62.3%
ASCII
ValueCountFrequency (%)
114
25.3%
2 55
12.2%
) 49
10.9%
5 48
10.6%
( 48
10.6%
S 28
 
6.2%
G 28
 
6.2%
s 10
 
2.2%
4 10
 
2.2%
g 9
 
2.0%
Other values (27) 52
11.5%
None
ValueCountFrequency (%)
3
100.0%

업소도로명주소 (business road-name address)
Text

Distinct: 223
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 60
Median length: 48
Mean length: 28.979592
Min length: 1

Characters and Unicode

Total characters: 7100
Distinct characters: 270
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
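
The reported mean lengths of the two text variables are simply their total character counts divided by the 245 observations. The sketch below performs that check with the figures from this report.

```python
n_rows = 245  # number of observations reported in the overview

# Total characters reported for each text variable.
total_characters = {"업소명": 2261, "업소도로명주소": 7100}

# Mean length = total characters / number of rows.
mean_lengths = {name: total / n_rows for name, total in total_characters.items()}

for name, mean_length in mean_lengths.items():
    print(name, round(mean_length, 6))
```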

Unique

Unique: 220
Unique (%): 89.8%

Sample

1st row: 경기도 의왕시 청계1로 29. 휴먼시아청계마을아파트 상가동 101호 (청계동)
2nd row: 경기도 의왕시 백운로 489 (학의동)
3rd row: 경기도 의왕시 경수대로 112. 고천주유소 1층 (왕곡동)
4th row: 경기도 의왕시 백운로 370. 1층 (학의동)
5th row: 경기도 의왕시 신장승길 29. 1층 104호 (삼동)
Value | Count | Frequency (%)
경기도 | 224 | 15.1%
의왕시 | 224 | 15.1%
1층 | 51 | 3.4%
삼동 | 43 | 2.9%
내손동 | 41 | 2.8%
오전동 | 38 | 2.6%
101호 | 27 | 1.8%
포일동 | 27 | 1.8%
상가동 | 21 | 1.4%
오전로 | 12 | 0.8%
Other values (428) | 776 | 52.3%

Most occurring characters

ValueCountFrequency (%)
1299
 
18.3%
1 375
 
5.3%
277
 
3.9%
257
 
3.6%
256
 
3.6%
241
 
3.4%
238
 
3.4%
234
 
3.3%
( 230
 
3.2%
. 230
 
3.2%
Other values (260) 3463
48.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4053 | 57.1%
Space Separator | 1299 | 18.3%
Decimal Number | 1002 | 14.1%
Other Punctuation | 233 | 3.3%
Open Punctuation | 230 | 3.2%
Close Punctuation | 230 | 3.2%
Dash Punctuation | 26 | 0.4%
Uppercase Letter | 26 | 0.4%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
277
 
6.8%
257
 
6.3%
256
 
6.3%
241
 
5.9%
238
 
5.9%
234
 
5.8%
229
 
5.7%
159
 
3.9%
121
 
3.0%
80
 
2.0%
Other values (233) 1961
48.4%
Decimal Number
ValueCountFrequency (%)
1 375
37.4%
0 133
 
13.3%
2 119
 
11.9%
3 76
 
7.6%
4 68
 
6.8%
5 57
 
5.7%
6 49
 
4.9%
7 46
 
4.6%
8 40
 
4.0%
9 39
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
B 8
30.8%
I 4
15.4%
T 3
 
11.5%
A 3
 
11.5%
H 2
 
7.7%
N 2
 
7.7%
S 1
 
3.8%
G 1
 
3.8%
C 1
 
3.8%
D 1
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 230
98.7%
/ 3
 
1.3%
Space Separator
ValueCountFrequency (%)
1299
100.0%
Open Punctuation
ValueCountFrequency (%)
( 230
100.0%
Close Punctuation
ValueCountFrequency (%)
) 230
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4053 | 57.1%
Common | 3021 | 42.5%
Latin | 26 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
277
 
6.8%
257
 
6.3%
256
 
6.3%
241
 
5.9%
238
 
5.9%
234
 
5.8%
229
 
5.7%
159
 
3.9%
121
 
3.0%
80
 
2.0%
Other values (233) 1961
48.4%
Common
ValueCountFrequency (%)
1299
43.0%
1 375
 
12.4%
( 230
 
7.6%
. 230
 
7.6%
) 230
 
7.6%
0 133
 
4.4%
2 119
 
3.9%
3 76
 
2.5%
4 68
 
2.3%
5 57
 
1.9%
Other values (7) 204
 
6.8%
Latin
ValueCountFrequency (%)
B 8
30.8%
I 4
15.4%
T 3
 
11.5%
A 3
 
11.5%
H 2
 
7.7%
N 2
 
7.7%
S 1
 
3.8%
G 1
 
3.8%
C 1
 
3.8%
D 1
 
3.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4053 | 57.1%
ASCII | 3047 | 42.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1299
42.6%
1 375
 
12.3%
( 230
 
7.5%
. 230
 
7.5%
) 230
 
7.5%
0 133
 
4.4%
2 119
 
3.9%
3 76
 
2.5%
4 68
 
2.2%
5 57
 
1.9%
Other values (17) 230
 
7.5%
Hangul
ValueCountFrequency (%)
277
 
6.8%
257
 
6.3%
256
 
6.3%
241
 
5.9%
238
 
5.9%
234
 
5.8%
229
 
5.7%
159
 
3.9%
121
 
3.0%
80
 
2.0%
Other values (233) 1961
48.4%

지정일자 (designation date)
DateTime

Distinct: 227
Distinct (%): 92.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
Minimum: 1989-01-01 00:00:00
Maximum: 2023-03-02 00:00:00
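
To put the minimum and maximum in perspective, the sketch below parses the two timestamps with the standard library and measures the span between them. The format string is inferred from how the values are printed in this report.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"  # layout of the Minimum/Maximum strings in the report

earliest = datetime.strptime("1989-01-01 00:00:00", FORMAT)
latest = datetime.strptime("2023-03-02 00:00:00", FORMAT)

# Subtracting two datetimes yields a timedelta.
span = latest - earliest
print(f"{span.days} days, about {span.days / 365.25:.1f} years")
```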
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 번호 | 민원구분
번호 | 1.000 | 0.275
민원구분 | 0.275 | 1.000

Variable | 번호 | 민원구분
번호 | 1.000 | 0.198
민원구분 | 0.198 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 번호 | 민원구분 | 업소명 | 업소도로명주소 | 지정일자
0 | 1 | 제7조의3제2항에따른경우 | 씨유 의왕청계점 | 경기도 의왕시 청계1로 29. 휴먼시아청계마을아파트 상가동 101호 (청계동) | 2023-03-02
1 | 2 | 제7조의3제2항에따른경우 | 씨스페이스24 백운호수점 | 경기도 의왕시 백운로 489 (학의동) | 2022-12-08
2 | 3 | 제7조의3제2항에따른경우 | 이마트24 SMART지지대주유소점 | 경기도 의왕시 경수대로 112. 고천주유소 1층 (왕곡동) | 2022-12-01
3 | 4 | 제7조의3제2항에따른경우 | 비비(BB)마켓 | 경기도 의왕시 백운로 370. 1층 (학의동) | 2022-11-21
4 | 5 | 제7조의3제2항에따른경우 | 지에스25 의왕퍼스트힐점 | 경기도 의왕시 신장승길 29. 1층 104호 (삼동) | 2022-10-31
5 | 6 | 제7조의3제2항에따른경우 | (주)코리아세븐 의왕오전신안점 | 경기도 의왕시 원골로 8. 한마음프라자 상가동 105. 106호 (오전동) | 2022-10-12
6 | 7 | 제7조의3제2항에따른경우 | (주)코리아세븐 내손갈미점 | 경기도 의왕시 갈미1로 52. 1층 (내손동) | 2022-10-12
7 | 8 | 제7조의3제2항에따른경우 | (주)코리아세븐 의왕대우점 | 경기도 의왕시 부곡복지관길 31. 대우이안아파트 101호 (삼동) | 2022-09-30
8 | 9 | 제7조의3제2항에따른경우 | (주)코리아세븐 청계마을점 | 경기도 의왕시 청계1로 11. 휴먼시아 청계마을아파트 102호. 103호 (청계동) | 2022-09-27
9 | 10 | 제7조의3제2항에따른경우 | (주)코리아세븐 의왕부곡중앙점 | 경기도 의왕시 부곡시장길 50. 1층 (삼동) | 2022-09-21

Last rows
(index) | 번호 | 민원구분 | 업소명 | 업소도로명주소 | 지정일자
235 | 236 | 제7조의3제2항에따른경우 | 대풍상회 | 경기도 의왕시 내손로 76 (내손동.보우상가지층) | 1999-08-24
236 | 237 | 제7조의3제2항에따른경우 | 풍광상회 | | 1999-07-30
237 | 238 | 제7조의3제2항에따른경우 | 고천타일 | | 1999-05-04
238 | 239 | 제7조의3제2항에따른경우 | 한전구내식당 | 경기도 의왕시 학의로 486 (내손동) | 1998-11-21
239 | 240 | 제7조의3제2항에따른경우 | 포일전기통신센타 | 경기도 의왕시 복지로 86-1 (내손동) | 1998-11-03
240 | 241 | 제7조의3제2항에따른경우 | 의왕농협 청계연쇄점 | 경기도 의왕시 안양판교로 238 (청계동) | 1998-07-15
241 | 242 | 제7조의3제2항에따른경우 | 의왕농협하나로마트 부곡점 | | 1997-05-12
242 | 243 | 제7조의3제2항에따른경우 | 현대슈퍼 | 경기도 의왕시 장승길 6-1 (삼동) | 1994-01-14
243 | 244 | 제7조의3제2항에따른경우 | 계요병원매점 | 경기도 의왕시 오전로 15 (왕곡동) | 1989-01-01
244 | 245 | 제7조의3제2항에따른경우 | 오봉슈퍼 | 경기도 의왕시 창말윗길 3-5 (이동) | 1989-05-31