Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 1558
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
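
The two derived figures above can be checked by hand. The sketch below is only an arithmetic illustration: it assumes that the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes. The variable names are made up for the example.

```python
# Values copied from the "Dataset statistics" block of the report.
n_variables = 5
n_observations = 10_000
missing_cells = 1_558
total_size_kib = 468.8

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values the report states (3.1% and 48.0 B).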

Variable types

Text: 2
Categorical: 3

Dataset

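Description: Status of no-smoking-area facilities recorded in the No-Smoking Area and Smoking Facility Management System of Chungju-si, Chungcheongbuk-do. Columns: business name (업소명), business category (업태명), business type (업종명), street address of the location (소재지 도로명주소), and data reference date (데이터기준일).
URL: https://www.data.go.kr/data/15121428/fileData.do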

Alerts

데이터기준일 has constant value "2023-08-30" (Constant)
업태명 is highly overall correlated with 업종명 (High correlation)
업종명 is highly overall correlated with 업태명 (High correlation)
소재지 도로명주소 has 1558 (15.6%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 13:39:26.509469
Analysis finished: 2023-12-12 13:39:28.177842
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

Distinct: 9403
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 43
Median length: 39
Mean length: 7.4004
Min length: 1

Characters and Unicode

Total characters: 74004
Distinct characters: 1044
Distinct categories: 11
Distinct scripts: 5
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
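
As a rough illustration of how such per-character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The sample strings come from this report, and `category_counts` is a hypothetical helper. Note that `unicodedata` exposes the general category (for example `Lo` for "Other Letter") but not the script or block properties.

```python
import unicodedata
from collections import Counter

# A few values of 업소명 that appear in this report.
samples = ["안용운씨 집앞", "향기나는치과의원", "놀자PC"]


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
# "Lo" = Other Letter, "Zs" = Space Separator, "Lu" = Uppercase Letter
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables in this report.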

Unique

Unique: 8806
Unique (%): 88.1%
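
The report lists "Distinct" and "Unique" separately. A minimal sketch of the difference, assuming that "distinct" counts the different non-missing values and "unique" counts the values that occur exactly once (`distinct_and_unique` is a hypothetical helper, not part of the profiler):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)  # skip missing entries
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


values = ["송가네", "하늘주막", "송가네", "명랑핫도그", None]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)
```

Here "송가네" occurs twice, so it counts as distinct but not as unique.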

Sample

1st row: 안용운씨 집앞
2nd row: 향기나는치과의원
3rd row: 충주시동량보건지소
4th row: 하늘주막
5th row: 송가네
Value   Count   Frequency (%)
(not shown)   293   2.4%
입구   126   1.0%
공장   124   1.0%
맞은편   99   0.8%
제2종근린생활시설   37   0.3%
(not shown)   37   0.3%
제1종근린생활시설   36   0.3%
대소원면   35   0.3%
건너편   34   0.3%
충주공장   30   0.2%
Other values (9923)   11525   93.1%
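
A word-frequency table of this kind can be approximated by splitting each value on whitespace and counting the tokens. The sketch below assumes plain whitespace tokenization, which may differ from what the profiler actually does. `word_frequencies` is a hypothetical helper, and the input values are business names taken from the sample rows of this report.

```python
from collections import Counter


def word_frequencies(values):
    """Return (word, count, share in %) tuples, most frequent first."""
    words = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        words.update(value.split())  # assumption: split on whitespace only
    total = sum(words.values())
    return [(w, n, 100 * n / total) for w, n in words.most_common()]


rows = word_frequencies(["대흥레미콘 옆", "유개형 승강장 옆", "송가네"])
for word, count, pct in rows:
    print(f"{word}   {count}   {pct:.1f}%")
```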

Most occurring characters

ValueCountFrequency (%)
2379
 
3.2%
1853
 
2.5%
1460
 
2.0%
1257
 
1.7%
1255
 
1.7%
1151
 
1.6%
) 961
 
1.3%
( 961
 
1.3%
948
 
1.3%
832
 
1.1%
Other values (1034) 60947
82.4%

Most occurring categories

Value   Count   Frequency (%)
Other Letter   64628   87.3%
Space Separator   2379   3.2%
Decimal Number   2229   3.0%
Uppercase Letter   1567   2.1%
Close Punctuation   963   1.3%
Open Punctuation   963   1.3%
Lowercase Letter   859   1.2%
Other Punctuation   189   0.3%
Connector Punctuation   127   0.2%
Dash Punctuation   94   0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1853
 
2.9%
1460
 
2.3%
1257
 
1.9%
1255
 
1.9%
1151
 
1.8%
948
 
1.5%
832
 
1.3%
808
 
1.3%
801
 
1.2%
759
 
1.2%
Other values (953) 53504
82.8%
Uppercase Letter
Value   Count   Frequency (%)
C   282   18.0%
P   205   13.1%
S   103   6.6%
E   99   6.3%
G   98   6.3%
A   79   5.0%
B   69   4.4%
O   66   4.2%
N   64   4.1%
T   54   3.4%
Other values (16)   448   28.6%
Lowercase Letter
Value   Count   Frequency (%)
e   118   13.7%
a   81   9.4%
o   74   8.6%
i   59   6.9%
c   58   6.8%
n   56   6.5%
l   45   5.2%
s   43   5.0%
t   36   4.2%
f   35   4.1%
Other values (16)   254   29.6%
Decimal Number
Value   Count   Frequency (%)
1   567   25.4%
2   418   18.8%
0   304   13.6%
5   188   8.4%
3   181   8.1%
4   149   6.7%
9   117   5.2%
6   108   4.8%
8   105   4.7%
7   92   4.1%
Other Punctuation
Value   Count   Frequency (%)
&   78   41.3%
.   58   30.7%
,   34   18.0%
!   6   3.2%
#   5   2.6%
:   2   1.1%
(not shown)   2   1.1%
/   2   1.1%
@   1   0.5%
·   1   0.5%
Close Punctuation
Value   Count   Frequency (%)
)   961   99.8%
]   2   0.2%
Open Punctuation
Value   Count   Frequency (%)
(   961   99.8%
[   2   0.2%
Other Symbol
Value   Count   Frequency (%)
(not shown)   5   83.3%
°   1   16.7%
Space Separator
Value   Count   Frequency (%)
(space)   2379   100.0%
Connector Punctuation
Value   Count   Frequency (%)
_   127   100.0%
Dash Punctuation
Value   Count   Frequency (%)
-   94   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul   64612   87.3%
Common   6945   9.4%
Latin   2426   3.3%
Han   18   < 0.1%
Hiragana   3   < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1853
 
2.9%
1460
 
2.3%
1257
 
1.9%
1255
 
1.9%
1151
 
1.8%
948
 
1.5%
832
 
1.3%
808
 
1.3%
801
 
1.2%
759
 
1.2%
Other values (935) 53488
82.8%
Latin
Value   Count   Frequency (%)
C   282   11.6%
P   205   8.5%
e   118   4.9%
S   103   4.2%
E   99   4.1%
G   98   4.0%
a   81   3.3%
A   79   3.3%
o   74   3.1%
B   69   2.8%
Other values (42)   1218   50.2%
Common
Value   Count   Frequency (%)
(space)   2379   34.3%
)   961   13.8%
(   961   13.8%
1   567   8.2%
2   418   6.0%
0   304   4.4%
5   188   2.7%
3   181   2.6%
4   149   2.1%
_   127   1.8%
Other values (18)   710   10.2%
Han
ValueCountFrequency (%)
2
 
11.1%
2
 
11.1%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
1
 
5.6%
Other values (6) 6
33.3%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value   Count   Frequency (%)
Hangul   64607   87.3%
ASCII   9367   12.7%
CJK   17   < 0.1%
None   9   < 0.1%
Hiragana   3   < 0.1%
CJK Compat Ideographs   1   < 0.1%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
(space)   2379   25.4%
)   961   10.3%
(   961   10.3%
1   567   6.1%
2   418   4.5%
0   304   3.2%
C   282   3.0%
P   205   2.2%
5   188   2.0%
3   181   1.9%
Other values (67)   2921   31.2%
Hangul
ValueCountFrequency (%)
1853
 
2.9%
1460
 
2.3%
1257
 
1.9%
1255
 
1.9%
1151
 
1.8%
948
 
1.5%
832
 
1.3%
808
 
1.3%
801
 
1.2%
759
 
1.2%
Other values (934) 53483
82.8%
None
ValueCountFrequency (%)
5
55.6%
2
 
22.2%
· 1
 
11.1%
° 1
 
11.1%
CJK
ValueCountFrequency (%)
2
 
11.8%
2
 
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (5) 5
29.4%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

업태명
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
음식점   5567
버스정류소   1084
사무용건축물+공장 및 복합건축물   774
학원   422
사회복지시설   339
Other values (27)   1814

Length

Max length: 17
Median length: 3
Mean length: 5.0605
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 버스정류소
2nd row: 의료기관+보건소등
3rd row: 의료기관+보건소등
4th row: 음식점
5th row: 음식점

Common Values

Value   Count   Frequency (%)
음식점   5567   55.7%
버스정류소   1084   10.8%
사무용건축물+공장 및 복합건축물   774   7.7%
학원   422   4.2%
사회복지시설   339   3.4%
게임제공업소   251   2.5%
의료기관+보건소등   244   2.4%
유치원+초중고등학교   185   1.8%
유치원-어린이집경계10m   162   1.6%
청사   134   1.3%
Other values (22)   838   8.4%

Length

[Figure: Histogram of lengths of the category]
Value   Count   Frequency (%)
음식점   5567   47.8%
버스정류소   1084   9.3%
(not shown)   777   6.7%
사무용건축물+공장   774   6.7%
복합건축물   774   6.7%
학원   422   3.6%
사회복지시설   339   2.9%
게임제공업소   251   2.2%
의료기관+보건소등   244   2.1%
유치원+초중고등학교   185   1.6%
Other values (28)   1218   10.5%

업종명
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
음식점   5567
버스정류소   1084
사무용건축물+공장 및 복합건축물   774
학원   422
사회복지시설   339
Other values (27)   1814

Length

Max length: 17
Median length: 3
Mean length: 5.0605
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 버스정류소
2nd row: 의료기관+보건소등
3rd row: 의료기관+보건소등
4th row: 음식점
5th row: 음식점

Common Values

Value   Count   Frequency (%)
음식점   5567   55.7%
버스정류소   1084   10.8%
사무용건축물+공장 및 복합건축물   774   7.7%
학원   422   4.2%
사회복지시설   339   3.4%
게임제공업소   251   2.5%
의료기관+보건소등   244   2.4%
유치원+초중고등학교   185   1.8%
유치원-어린이집경계10m   162   1.6%
청사   134   1.3%
Other values (22)   838   8.4%

Length

[Figure: Histogram of lengths of the category]
Value   Count   Frequency (%)
음식점   5567   47.8%
버스정류소   1084   9.3%
(not shown)   777   6.7%
사무용건축물+공장   774   6.7%
복합건축물   774   6.7%
학원   422   3.6%
사회복지시설   339   2.9%
게임제공업소   251   2.2%
의료기관+보건소등   244   2.1%
유치원+초중고등학교   185   1.6%
Other values (28)   1218   10.5%

소재지 도로명주소
Text

Distinct: 7019
Distinct (%): 83.1%
Missing: 1558
Missing (%): 15.6%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 44
Mean length: 19.961384
Min length: 11

Characters and Unicode

Total characters: 168514
Distinct characters: 409
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6025
Unique (%): 71.4%

Sample

1st row: 충청북도 충주시 남산로 83
2nd row: 충청북도 충주시 동량면 조돈안길 1
3rd row: 충청북도 충주시 대현6길 23
4th row: 충청북도 충주시 예성로 323-1
5th row: 충청북도 충주시 갱고개로 154
Value   Count   Frequency (%)
충청북도   8442   22.0%
충주시   8442   22.0%
1층   1456   3.8%
대소원면   366   1.0%
2층   327   0.9%
예성로   274   0.7%
중앙탑면   268   0.7%
수안보면   244   0.6%
주덕읍   223   0.6%
앙성면   192   0.5%
Other values (3497)   18118   47.2%

Most occurring characters

ValueCountFrequency (%)
34782
20.6%
17268
 
10.2%
8888
 
5.3%
8585
 
5.1%
8575
 
5.1%
8547
 
5.1%
8462
 
5.0%
1 8176
 
4.9%
2 4401
 
2.6%
4394
 
2.6%
Other values (399) 56436
33.5%

Most occurring categories

Value   Count   Frequency (%)
Other Letter   102038   60.6%
Space Separator   34782   20.6%
Decimal Number   28309   16.8%
Other Punctuation   2025   1.2%
Dash Punctuation   1256   0.7%
Uppercase Letter   87   0.1%
Lowercase Letter   10   < 0.1%
Math Symbol   7   < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
17268
16.9%
8888
 
8.7%
8585
 
8.4%
8575
 
8.4%
8547
 
8.4%
8462
 
8.3%
4394
 
4.3%
4230
 
4.1%
2034
 
2.0%
1746
 
1.7%
Other values (365) 29309
28.7%
Decimal Number
Value   Count   Frequency (%)
1   8176   28.9%
2   4401   15.5%
3   3010   10.6%
4   2375   8.4%
0   2044   7.2%
5   2032   7.2%
6   1683   5.9%
7   1618   5.7%
9   1543   5.5%
8   1427   5.0%
Uppercase Letter
Value   Count   Frequency (%)
A   44   50.6%
B   22   25.3%
C   7   8.0%
G   4   4.6%
K   3   3.4%
F   2   2.3%
S   2   2.3%
L   1   1.1%
R   1   1.1%
T   1   1.1%
Lowercase Letter
Value   Count   Frequency (%)
l   2   20.0%
c   2   20.0%
e   2   20.0%
i   1   10.0%
v   1   10.0%
u   1   10.0%
h   1   10.0%
Other Punctuation
Value   Count   Frequency (%)
,   2014   99.5%
@   7   0.3%
·   2   0.1%
.   2   0.1%
Space Separator
Value   Count   Frequency (%)
(space)   34782   100.0%
Dash Punctuation
Value   Count   Frequency (%)
-   1256   100.0%
Math Symbol
Value   Count   Frequency (%)
~   7   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul   102038   60.6%
Common   66379   39.4%
Latin   97   0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
17268
16.9%
8888
 
8.7%
8585
 
8.4%
8575
 
8.4%
8547
 
8.4%
8462
 
8.3%
4394
 
4.3%
4230
 
4.1%
2034
 
2.0%
1746
 
1.7%
Other values (365) 29309
28.7%
Common
Value   Count   Frequency (%)
(space)   34782   52.4%
1   8176   12.3%
2   4401   6.6%
3   3010   4.5%
4   2375   3.6%
0   2044   3.1%
5   2032   3.1%
,   2014   3.0%
6   1683   2.5%
7   1618   2.4%
Other values (7)   4244   6.4%
Latin
Value   Count   Frequency (%)
A   44   45.4%
B   22   22.7%
C   7   7.2%
G   4   4.1%
K   3   3.1%
l   2   2.1%
F   2   2.1%
c   2   2.1%
e   2   2.1%
S   2   2.1%
Other values (7)   7   7.2%

Most occurring blocks

Value   Count   Frequency (%)
Hangul   102037   60.6%
ASCII   66474   39.4%
None   2   < 0.1%
Compat Jamo   1   < 0.1%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
(space)   34782   52.3%
1   8176   12.3%
2   4401   6.6%
3   3010   4.5%
4   2375   3.6%
0   2044   3.1%
5   2032   3.1%
,   2014   3.0%
6   1683   2.5%
7   1618   2.4%
Other values (23)   4339   6.5%
Hangul
ValueCountFrequency (%)
17268
16.9%
8888
 
8.7%
8585
 
8.4%
8575
 
8.4%
8547
 
8.4%
8462
 
8.3%
4394
 
4.3%
4230
 
4.1%
2034
 
2.0%
1746
 
1.7%
Other values (364) 29308
28.7%
None
ValueCountFrequency (%)
· 2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

데이터기준일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
2023-08-30   10000

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-30
2nd row: 2023-08-30
3rd row: 2023-08-30
4th row: 2023-08-30
5th row: 2023-08-30

Common Values

Value   Count   Frequency (%)
2023-08-30   10000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value   Count   Frequency (%)
2023-08-30   10000   100.0%

Correlations

Variable   업태명   업종명
업태명   1.000   1.000
업종명   1.000   1.000
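
A coefficient of 1.000 between two categorical columns means that each value of one column determines the value of the other. One common association measure for such pairs is Cramér's V. The sketch below uses the plain, uncorrected form as an illustration. Whether the report used exactly this statistic is an assumption, and `cramers_v` is a hypothetical helper rather than a library function.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed pair counts
    row = Counter(xs)             # marginal counts of the first variable
    col = Counter(ys)             # marginal counts of the second variable
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    if k == 0:
        return float("nan")  # undefined when a variable has a single value
    return sqrt(chi2 / (n * k))


# Identical columns, as 업태명 and 업종명 appear to be in this dataset.
same = ["음식점", "음식점", "학원", "청사"]
v_same = cramers_v(same, same)

# Two columns with no association at all.
v_indep = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_same, v_indep)
```

Identical columns give the maximum value of 1, which is consistent with the High correlation alerts raised for 업태명 and 업종명.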

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
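
The per-column nullity behind these plots can also be tabulated directly. The following is a minimal sketch, assuming the rows are available as dictionaries with `None` marking a missing value. `missing_by_column` is a hypothetical helper, and the rows are a small excerpt of values shown in this report.

```python
def missing_by_column(rows, columns):
    """Return {column: (missing count, missing share in %)}."""
    total = len(rows)
    summary = {}
    for column in columns:
        missing = sum(1 for row in rows if row.get(column) is None)
        pct = 100 * missing / total if total else 0.0
        summary[column] = (missing, pct)
    return summary


rows = [
    {"업소명": "안용운씨 집앞", "소재지 도로명주소": None},
    {"업소명": "향기나는치과의원", "소재지 도로명주소": "충청북도 충주시 남산로 83"},
    {"업소명": "하늘주막", "소재지 도로명주소": "충청북도 충주시 대현6길 23"},
    {"업소명": "유개형 승강장 옆", "소재지 도로명주소": None},
]
summary = missing_by_column(rows, ["업소명", "소재지 도로명주소"])
for column, (count, pct) in summary.items():
    print(f"{column}: {count} missing ({pct:.1f}%)")
```

In the sample rows of this report, the entries without an address are bus stops (버스정류소) and one factory site.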

Sample

(index)   업소명   업태명   업종명   소재지 도로명주소   데이터기준일
6380   안용운씨 집앞   버스정류소   버스정류소   <NA>   2023-08-30
317   향기나는치과의원   의료기관+보건소등   의료기관+보건소등   충청북도 충주시 남산로 83   2023-08-30
432   충주시동량보건지소   의료기관+보건소등   의료기관+보건소등   충청북도 충주시 동량면 조돈안길 1   2023-08-30
4621   하늘주막   음식점   음식점   충청북도 충주시 대현6길 23   2023-08-30
4777   송가네   음식점   음식점   충청북도 충주시 예성로 323-1   2023-08-30
4903   명랑핫도그   음식점   음식점   충청북도 충주시 갱고개로 154   2023-08-30
6025   뭉치포차   음식점   음식점   충청북도 충주시 중앙탑면 원앙길 8-9 1층 103호   2023-08-30
6620   유개형 승강장 옆   버스정류소   버스정류소   <NA>   2023-08-30
7589   건설경영연수원건너편   버스정류소   버스정류소   <NA>   2023-08-30
6214   중앙탑초등학교   절대정화구역   절대정화구역   충청북도 충주시 중앙탑면 첨단산업로 670   2023-08-30

(index)   업소명   업태명   업종명   소재지 도로명주소   데이터기준일
6835   덕해 최정일씨 집 앞   버스정류소   버스정류소   <NA>   2023-08-30
2776   올바른푸드   음식점   음식점   충청북도 충주시 야현1길 26   2023-08-30
7914   충주카리타스노인복지센터   사회복지시설   사회복지시설   충청북도 충주시 봉현로 251   2023-08-30
6647   대흥레미콘 옆   버스정류소   버스정류소   <NA>   2023-08-30
3413   간이역호암역점   음식점   음식점   충청북도 충주시 예성로 28   2023-08-30
1267   엄정면 원곡리 69-1 공장 ((유한)세기산업)   사무용건축물+공장 및 복합건축물   사무용건축물+공장 및 복합건축물   <NA>   2023-08-30
8796   놀자PC   게임제공업소   게임제공업소   충청북도 충주시 연수상가1길 11, 2층   2023-08-30
10920   또아다방   음식점   음식점   충청북도 충주시 주덕읍 신양로 94-1   2023-08-30
8033   jh빌딩   사무용건축물+공장 및 복합건축물   사무용건축물+공장 및 복합건축물   충청북도 충주시 금봉2길 18-7   2023-08-30
230   연세88마취통증의학과의원   의료기관+보건소등   의료기관+보건소등   충청북도 충주시 봉계1길 62, 401호   2023-08-30