Overview

Dataset statistics

Number of variables: 7
Number of observations: 1111
Missing cells: 334
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 63.1 KiB
Average record size in memory: 58.1 B
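
As a quick sanity check, the missing-cell percentage can be recomputed from the counts in this block. This is a minimal sketch that uses only numbers copied from the report; the variable names are illustrative and not part of the profiling tool.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 7
n_observations = 1111
missing_cells = 334

# Every (row, column) pair counts as one cell.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The alerts attribute all 334 missing values to the 전화번호 column, so the dataset-level percentage is driven entirely by that one variable.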

Variable types

Numeric: 1
Categorical: 2
Text: 3
DateTime: 1

Dataset

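Description: Provides the status of facilities subject to mandatory disinfection in Sejong Special Self-Governing City (세종특별자치시). The data consists of the category (lodging, food, transport, meal service, schools, etc.), location address, phone number, and reference year.
Author: 세종특별자치시 (Sejong Special Self-Governing City)
URL: https://www.data.go.kr/data/15069380/fileData.do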

Alerts

등록기준일 has constant value "" (Constant)
구분 is highly overall correlated with 연번 and 1 other field (High correlation)
기준년도 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 구분 and 1 other field (High correlation)
기준년도 is highly imbalanced (97.3%) (Imbalance)
전화번호 has 334 (30.1%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-16 15:22:36.583613
Analysis finished: 2023-12-16 15:22:44.201253
Duration: 7.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1111
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 556
Minimum: 1
Maximum: 1111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 56.5
Q1: 278.5
Median: 556
Q3: 833.5
95-th percentile: 1055.5
Maximum: 1111
Range: 1110
Interquartile range (IQR): 555

Descriptive statistics

Standard deviation: 320.86238
Coefficient of variation (CV): 0.57709061
Kurtosis: -1.2
Mean: 556
Median Absolute Deviation (MAD): 278
Skewness: 0
Sum: 617716
Variance: 102952.67
Monotonicity: Strictly increasing
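
The figures for 연번 are what one would expect from a plain row counter. The sketch below assumes the column holds the integers 1 through 1111 with no gaps (an assumption consistent with the reported distinct count, minimum, maximum, sum, and monotonicity, but not stated explicitly in the report) and recomputes several statistics with Python's standard library.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..1111.
values = list(range(1, 1112))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
# "inclusive" interpolates linearly between the min and max of the data.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(total, mean, round(variance, 2), round(std, 5), round(cv, 8))
print(median, mad, q1, q3, q3 - q1)
```

Under that assumption the sample variance and the inclusive quartiles agree with the values listed above, which suggests the report uses the n - 1 denominator and linear interpolation for quantiles.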
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.1%
740 | 1 | 0.1%
746 | 1 | 0.1%
745 | 1 | 0.1%
744 | 1 | 0.1%
743 | 1 | 0.1%
742 | 1 | 0.1%
741 | 1 | 0.1%
739 | 1 | 0.1%
731 | 1 | 0.1%
Other values (1101) | 1101 | 99.1%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
1111 | 1 | 0.1%
1110 | 1 | 0.1%
1109 | 1 | 0.1%
1108 | 1 | 0.1%
1107 | 1 | 0.1%
1106 | 1 | 0.1%
1105 | 1 | 0.1%
1104 | 1 | 0.1%
1103 | 1 | 0.1%
1102 | 1 | 0.1%

구분
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB
사무실+복합용도 건축물: 251
급식: 205
식품: 163
공동주택: 156
어린이집 및 유치원: 152
Other values (8): 184

Length

Max length: 12
Median length: 10
Mean length: 5.7020702
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박
2nd row: 숙박
3rd row: 숙박
4th row: 숙박
5th row: 숙박

Common Values

Value | Count | Frequency (%)
사무실+복합용도 건축물 | 251 | 22.6%
급식 | 205 | 18.5%
식품 | 163 | 14.7%
공동주택 | 156 | 14.0%
어린이집 및 유치원 | 152 | 13.7%
학교 | 103 | 9.3%
숙박 | 31 | 2.8%
병원 | 14 | 1.3%
운수 | 13 | 1.2%
마트+시장 | 11 | 1.0%
Other values (3) | 12 | 1.1%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
사무실+복합용도 | 251 | 15.1%
건축물 | 251 | 15.1%
급식 | 205 | 12.3%
식품 | 163 | 9.8%
공동주택 | 156 | 9.4%
어린이집 | 152 | 9.1%
및 | 152 | 9.1%
유치원 | 152 | 9.1%
학교 | 103 | 6.2%
숙박 | 31 | 1.9%
Other values (6) | 50 | 3.0%

업소명(시설명)
Text

Distinct: 940
Distinct (%): 84.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 27
Median length: 22
Mean length: 7.2574257
Min length: 2

Characters and Unicode

Total characters: 8063
Distinct characters: 505
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
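
To illustrate what these character properties look like in practice, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiling tool's own implementation; the helper name `category_counts` is hypothetical, and the two input strings are sample values copied from the 소재지 and 전화번호 columns of this report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lo', 'Nd') in a string."""
    # Normalise to NFC so each Hangul syllable is a single code point.
    normalized = unicodedata.normalize("NFC", text)
    return Counter(unicodedata.category(ch) for ch in normalized)


address = "세종특별자치시 조치원읍 새내9길 15"
phone = "044-867-0190"

# Lo = Other Letter, Nd = Decimal Number,
# Zs = Space Separator, Pd = Dash Punctuation
print(category_counts(address))
print(category_counts(phone))
```

The two-letter codes map to the category names used in the tables of this report, for example `Lo` for "Other Letter" and `Nd` for "Decimal Number".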

Unique

Unique: 776
Unique (%): 69.8%

Sample

1st row: 해동장여관
2nd row: 당산파크장여관
3rd row: 동광파크여관
4th row: 로얄모텔
5th row: 수정파크여관
Value | Count | Frequency (%)
세종점 | 24 | 1.7%
가락마을 | 20 | 1.4%
세종 | 19 | 1.3%
도램마을 | 15 | 1.0%
1단지 | 14 | 1.0%
6단지 | 13 | 0.9%
2단지 | 13 | 0.9%
3단지 | 13 | 0.9%
새뜸마을 | 12 | 0.8%
5단지 | 12 | 0.8%
Other values (941) | 1279 | 89.2%

Most occurring characters

ValueCountFrequency (%)
327
 
4.1%
252
 
3.1%
243
 
3.0%
232
 
2.9%
224
 
2.8%
223
 
2.8%
204
 
2.5%
173
 
2.1%
165
 
2.0%
155
 
1.9%
Other values (495) 5865
72.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7228 | 89.6%
Space Separator | 327 | 4.1%
Decimal Number | 221 | 2.7%
Uppercase Letter | 110 | 1.4%
Close Punctuation | 57 | 0.7%
Open Punctuation | 57 | 0.7%
Lowercase Letter | 26 | 0.3%
Other Punctuation | 14 | 0.2%
Letter Number | 11 | 0.1%
Other Symbol | 9 | 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
252
 
3.5%
243
 
3.4%
232
 
3.2%
224
 
3.1%
223
 
3.1%
204
 
2.8%
173
 
2.4%
165
 
2.3%
155
 
2.1%
155
 
2.1%
Other values (440) 5202
72.0%
Uppercase Letter
ValueCountFrequency (%)
A 13
11.8%
T 13
11.8%
K 10
 
9.1%
B 9
 
8.2%
G 7
 
6.4%
C 7
 
6.4%
R 7
 
6.4%
S 7
 
6.4%
D 5
 
4.5%
N 5
 
4.5%
Other values (12) 27
24.5%
Decimal Number
ValueCountFrequency (%)
1 69
31.2%
2 38
17.2%
3 22
 
10.0%
5 18
 
8.1%
4 16
 
7.2%
6 15
 
6.8%
7 11
 
5.0%
0 11
 
5.0%
9 11
 
5.0%
8 10
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
e 6
23.1%
l 6
23.1%
i 3
11.5%
a 3
11.5%
n 2
 
7.7%
v 2
 
7.7%
h 1
 
3.8%
r 1
 
3.8%
s 1
 
3.8%
d 1
 
3.8%
Letter Number
ValueCountFrequency (%)
5
45.5%
3
27.3%
2
 
18.2%
1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
& 12
85.7%
· 1
 
7.1%
/ 1
 
7.1%
Space Separator
ValueCountFrequency (%)
327
100.0%
Close Punctuation
ValueCountFrequency (%)
) 57
100.0%
Open Punctuation
ValueCountFrequency (%)
( 57
100.0%
Other Symbol
ValueCountFrequency (%)
9
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7237 | 89.8%
Common | 679 | 8.4%
Latin | 147 | 1.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
252
 
3.5%
243
 
3.4%
232
 
3.2%
224
 
3.1%
223
 
3.1%
204
 
2.8%
173
 
2.4%
165
 
2.3%
155
 
2.1%
155
 
2.1%
Other values (441) 5211
72.0%
Latin
ValueCountFrequency (%)
A 13
 
8.8%
T 13
 
8.8%
K 10
 
6.8%
B 9
 
6.1%
G 7
 
4.8%
C 7
 
4.8%
R 7
 
4.8%
S 7
 
4.8%
e 6
 
4.1%
l 6
 
4.1%
Other values (26) 62
42.2%
Common
ValueCountFrequency (%)
327
48.2%
1 69
 
10.2%
) 57
 
8.4%
( 57
 
8.4%
2 38
 
5.6%
3 22
 
3.2%
5 18
 
2.7%
4 16
 
2.4%
6 15
 
2.2%
& 12
 
1.8%
Other values (8) 48
 
7.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7228 | 89.6%
ASCII | 814 | 10.1%
Number Forms | 11 | 0.1%
None | 10 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
327
40.2%
1 69
 
8.5%
) 57
 
7.0%
( 57
 
7.0%
2 38
 
4.7%
3 22
 
2.7%
5 18
 
2.2%
4 16
 
2.0%
6 15
 
1.8%
A 13
 
1.6%
Other values (39) 182
22.4%
Hangul
ValueCountFrequency (%)
252
 
3.5%
243
 
3.4%
232
 
3.2%
224
 
3.1%
223
 
3.1%
204
 
2.8%
173
 
2.4%
165
 
2.3%
155
 
2.1%
155
 
2.1%
Other values (440) 5202
72.0%
None
ValueCountFrequency (%)
9
90.0%
· 1
 
10.0%
Number Forms
ValueCountFrequency (%)
5
45.5%
3
27.3%
2
 
18.2%
1
 
9.1%

소재지
Text

Distinct: 977
Distinct (%): 87.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 34
Median length: 30
Mean length: 17.828983
Min length: 13

Characters and Unicode

Total characters: 19808
Distinct characters: 225
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 860
Unique (%): 77.4%

Sample

1st row: 세종특별자치시 조치원읍 새내9길 15
2nd row: 세종특별자치시 연기면 당산말길 3
3rd row: 세종특별자치시 전의면 상교동1길 13
4th row: 세종특별자치시 조치원읍 안터1길 1
5th row: 세종특별자치시 연서면 대첩로 256
Value | Count | Frequency (%)
세종특별자치시 | 1111 | 30.0%
조치원읍 | 84 | 2.3%
한누리대로 | 77 | 2.1%
시청대로 | 43 | 1.2%
관리동 | 37 | 1.0%
장군면 | 33 | 0.9%
남세종로 | 33 | 0.9%
달빛1로 | 32 | 0.9%
마음로 | 31 | 0.8%
연서면 | 30 | 0.8%
Other values (766) | 2195 | 59.2%

Most occurring characters

ValueCountFrequency (%)
3195
16.1%
1213
 
6.1%
1182
 
6.0%
1180
 
6.0%
1157
 
5.8%
1112
 
5.6%
1111
 
5.6%
1111
 
5.6%
903
 
4.6%
1 810
 
4.1%
Other values (215) 6834
34.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13141 | 66.3%
Decimal Number | 3313 | 16.7%
Space Separator | 3195 | 16.1%
Dash Punctuation | 155 | 0.8%
Math Symbol | 2 | < 0.1%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1213
 
9.2%
1182
 
9.0%
1180
 
9.0%
1157
 
8.8%
1112
 
8.5%
1111
 
8.5%
1111
 
8.5%
903
 
6.9%
179
 
1.4%
176
 
1.3%
Other values (200) 3817
29.0%
Decimal Number
ValueCountFrequency (%)
1 810
24.4%
2 452
13.6%
3 396
12.0%
4 281
 
8.5%
5 280
 
8.5%
7 240
 
7.2%
0 240
 
7.2%
6 235
 
7.1%
9 200
 
6.0%
8 179
 
5.4%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
H 1
50.0%
Space Separator
ValueCountFrequency (%)
3195
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 155
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13141 | 66.3%
Common | 6665 | 33.6%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1213
 
9.2%
1182
 
9.0%
1180
 
9.0%
1157
 
8.8%
1112
 
8.5%
1111
 
8.5%
1111
 
8.5%
903
 
6.9%
179
 
1.4%
176
 
1.3%
Other values (200) 3817
29.0%
Common
ValueCountFrequency (%)
3195
47.9%
1 810
 
12.2%
2 452
 
6.8%
3 396
 
5.9%
4 281
 
4.2%
5 280
 
4.2%
7 240
 
3.6%
0 240
 
3.6%
6 235
 
3.5%
9 200
 
3.0%
Other values (3) 336
 
5.0%
Latin
ValueCountFrequency (%)
L 1
50.0%
H 1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13141 | 66.3%
ASCII | 6667 | 33.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3195
47.9%
1 810
 
12.1%
2 452
 
6.8%
3 396
 
5.9%
4 281
 
4.2%
5 280
 
4.2%
7 240
 
3.6%
0 240
 
3.6%
6 235
 
3.5%
9 200
 
3.0%
Other values (5) 338
 
5.1%
Hangul
ValueCountFrequency (%)
1213
 
9.2%
1182
 
9.0%
1180
 
9.0%
1157
 
8.8%
1112
 
8.5%
1111
 
8.5%
1111
 
8.5%
903
 
6.9%
179
 
1.4%
176
 
1.3%
Other values (200) 3817
29.0%

전화번호
Text

MISSING 

Distinct: 734
Distinct (%): 94.5%
Missing: 334
Missing (%): 30.1%
Memory size: 8.8 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.220077
Min length: 8

Characters and Unicode

Total characters: 8718
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 691
Unique (%): 88.9%

Sample

1st row: 044-867-0190
2nd row: 044-862-7337
3rd row: 044-862-5100
4th row: 044-865-1566
5th row: 044-867-9440
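
The percentages reported for 전화번호 use two different denominators, which is easy to miss. The sketch below is inferred from the figures in this section rather than taken from the tool's documentation: it assumes the missing percentage is computed over all rows, while the distinct and unique percentages and the mean length are computed over non-missing rows only.

```python
# Figures copied from the 전화번호 section of this report.
n_observations = 1111
missing = 334
distinct = 734
unique = 691
total_characters = 8718

# Rows that actually contain a phone number.
non_missing = n_observations - missing

missing_pct = 100 * missing / n_observations  # share of all rows
distinct_pct = 100 * distinct / non_missing   # share of non-missing rows
unique_pct = 100 * unique / non_missing       # share of non-missing rows
mean_length = total_characters / non_missing  # characters per non-missing value

print(non_missing)
print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}%")
print(f"{mean_length:.6f}")
```

Under this assumption all four derived values agree with the report, so 94.5% distinct means 734 distinct values among the 777 rows that have a phone number, not among all 1111 rows.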
Value | Count | Frequency (%)
044 | 3 | 0.4%
044-417-1501 | 2 | 0.3%
862-1081 | 2 | 0.3%
044-715-7005 | 2 | 0.3%
044-998-3131 | 2 | 0.3%
044-865-0066 | 2 | 0.3%
860-3701 | 2 | 0.3%
044-863-7200 | 2 | 0.3%
903-1001 | 2 | 0.3%
410-0500 | 2 | 0.3%
Other values (725) | 759 | 97.3%

Most occurring characters

Value | Count | Frequency (%)
4 | 1517 | 17.4%
0 | 1458 | 16.7%
- | 1394 | 16.0%
8 | 859 | 9.9%
6 | 837 | 9.6%
2 | 490 | 5.6%
9 | 479 | 5.5%
1 | 454 | 5.2%
7 | 451 | 5.2%
3 | 408 | 4.7%
Other values (2) | 371 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 7319 | 84.0%
Dash Punctuation | 1394 | 16.0%
Space Separator | 5 | 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1517
20.7%
0 1458
19.9%
8 859
11.7%
6 837
11.4%
2 490
 
6.7%
9 479
 
6.5%
1 454
 
6.2%
7 451
 
6.2%
3 408
 
5.6%
5 366
 
5.0%
Dash Punctuation
ValueCountFrequency (%)
- 1394
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8718
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 1517
17.4%
0 1458
16.7%
- 1394
16.0%
8 859
9.9%
6 837
9.6%
2 490
 
5.6%
9 479
 
5.5%
1 454
 
5.2%
7 451
 
5.2%
3 408
 
4.7%
Other values (2) 371
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 1517
17.4%
0 1458
16.7%
- 1394
16.0%
8 859
9.9%
6 837
9.6%
2 490
 
5.6%
9 479
 
5.5%
1 454
 
5.2%
7 451
 
5.2%
3 408
 
4.7%
Other values (2) 371
 
4.3%

기준년도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB
2023: 1108
<NA>: 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 1108 | 99.7%
<NA> | 3 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023 | 1108 | 99.7%
na | 3 | 0.3%
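
The 97.3% imbalance alert for 기준년도 can be reproduced from the two counts above. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for that number of classes. That definition matches the reported figure, but it is an assumption about how the tool scores imbalance, and the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H(p) / log2(k): 0 for balanced classes, near 1 when one dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts from the table above: 1108 rows with "2023", 3 rows with "<NA>".
score = imbalance_score([1108, 3])
print(f"{100 * score:.1f}%")
```

Note that the three `<NA>` entries are counted as a category here rather than as missing values, which is why the variable shows two distinct values and zero missing.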

등록기준일
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB
Minimum: 2023-12-15 00:00:00
Maximum: 2023-12-15 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

variable | 연번 | 구분
연번 | 1.000 | 0.919
구분 | 0.919 | 1.000

variable | 구분 | 기준년도
구분 | 1.000 | 1.000
기준년도 | 1.000 | 1.000

variable | 연번 | 구분 | 기준년도
연번 | 1.000 | 0.717 | 1.000
구분 | 0.717 | 1.000 | 1.000
기준년도 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | 연번 | 구분 | 업소명(시설명) | 소재지 | 전화번호 | 기준년도 | 등록기준일
0 | 1 | 숙박 | 해동장여관 | 세종특별자치시 조치원읍 새내9길 15 | 044-867-0190 | 2023 | 2023-12-15
1 | 2 | 숙박 | 당산파크장여관 | 세종특별자치시 연기면 당산말길 3 | 044-862-7337 | 2023 | 2023-12-15
2 | 3 | 숙박 | 동광파크여관 | 세종특별자치시 전의면 상교동1길 13 | 044-862-5100 | 2023 | 2023-12-15
3 | 4 | 숙박 | 로얄모텔 | 세종특별자치시 조치원읍 안터1길 1 | 044-865-1566 | 2023 | 2023-12-15
4 | 5 | 숙박 | 수정파크여관 | 세종특별자치시 연서면 대첩로 256 | 044-867-9440 | 2023 | 2023-12-15
5 | 6 | 숙박 | 궁전파크여관 | 세종특별자치시 조치원읍 안터길 14 | 044-866-1321 | 2023 | 2023-12-15
6 | 7 | 숙박 | 커플링모텔 | 세종특별자치시 금남면 용포로 119-53 | 044-866-2055 | 2023 | 2023-12-15
7 | 8 | 숙박 | 썸모텔 | 세종특별자치시 부강면 부강외천로 210 | 044-275-1700 | 2023 | 2023-12-15
8 | 9 | 숙박 | 로얄파크장 | 세종특별자치시 부강면 시장길 35 | 044-275-2653 | 2023 | 2023-12-15
9 | 10 | 숙박 | 세종그랜드모텔 | 세종특별자치시 부강면 청연로 77 | 044-275-1001 | 2023 | 2023-12-15

Last rows

index | 연번 | 구분 | 업소명(시설명) | 소재지 | 전화번호 | 기준년도 | 등록기준일
1101 | 1102 | 공동주택 | 새나루마을2단지 | 세종특별자치시 집현서로 15 | 044-715-5993 | 2023 | 2023-12-15
1102 | 1103 | 공동주택 | 새나루마을1단지 | 세종특별자치시 남세종로 160 | 044-864-6608 | 2023 | 2023-12-15
1103 | 1104 | 공동주택 | 새나루마을11단지 | 세종특별자치시 집현서2로 84 | 044-868-2208 | 2023 | 2023-12-15
1104 | 1105 | 공동주택 | 새나루마을9단지 | 세종특별자치시 시청대로 642 | 044-863-9955 | 2023 | 2023-12-15
1105 | 1106 | 공동주택 | 새샘마을4단지 | 세종특별자치시 소담3로 21 | 044-868-7401 | 2023 | 2023-12-15
1106 | 1107 | 공동주택 | 새나루마을8단지 | 세종특별자치시 집현서로 16 | 044-868-8723 | 2023 | 2023-12-15
1107 | 1108 | 공동주택 | 한뜰마을4단지 | 세종특별자치시 다솜1로 9 | 044-862-2983 | 2023 | 2023-12-15
1108 | 1109 | 공동주택 | 새나루마을10단지 | 세종특별자치시 시청대로 643 | <NA> | 2023 | 2023-12-15
1109 | 1110 | 공동주택 | 한뜰마을5단지 | 세종특별자치시 다솜로 7 | 044-868-0774 | 2023 | 2023-12-15
1110 | 1111 | 공동주택 | 수루배마을9단지 | 세종특별자치시 시청대로 547 | <NA> | 2023 | 2023-12-15