Overview

Dataset statistics

Number of variables: 5
Number of observations: 484
Missing cells: 86
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.5 KiB
Average record size in memory: 41.3 B
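The derived figures above can be checked from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB means 1024 bytes; the variable names are illustrative only.

```python
# Raw counts copied from the dataset statistics above.
n_vars = 5
n_obs = 484
missing_cells = 86
total_kib = 19.5

total_cells = n_obs * n_vars                       # every row/column intersection
missing_pct = 100 * missing_cells / total_cells    # share of cells that are empty
avg_record_bytes = total_kib * 1024 / n_obs        # assumes 1 KiB = 1024 B

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the 3.6% and 41.3 B shown in the report.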

Variable types

Numeric: 1
Categorical: 1
Text: 3

Dataset

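Description: Status data on facilities subject to mandatory disinfection, provided by Pocheon-si, Gyeonggi-do. Fields: 연번 (serial number), 구분 (category), 시설명 (facility name), 도로명주소 (road-name address), 전화번호 (phone number), etc.
Author: Pocheon-si, Gyeonggi-do (경기도 포천시)
URL: https://www.data.go.kr/data/15002413/fileData.do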

Alerts

연번 is highly overall correlated with 구분 (High correlation)
구분 is highly overall correlated with 연번 (High correlation)
전화번호 has 86 (17.8%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-13 00:53:57.931745
Analysis finished: 2023-12-13 00:53:58.468072
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
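A report of this kind can be regenerated with ydata-profiling. The sketch below is not the original script: the function name `build_report`, the file paths, the report title, and the `cp949` encoding are all assumptions made for illustration (Korean public CSV files are often in that encoding, but this file may be UTF-8).

```python
def build_report(csv_path: str, out_path: str = "report.html", encoding: str = "cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so that defining the function needs no third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; change it if the file is UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is hypothetical; the original report title is not shown.
    report = ProfileReport(df, title="Facilities subject to mandatory disinfection")
    report.to_file(out_path)
    return report
```

Calling `build_report("facilities.csv")` with a locally downloaded copy of the dataset (hypothetical file name) would write `report.html` next to the script.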

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 484
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 242.5
Minimum: 1
Maximum: 484
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 25.15
Q1: 121.75
Median: 242.5
Q3: 363.25
95-th percentile: 459.85
Maximum: 484
Range: 483
Interquartile range (IQR): 241.5

Descriptive statistics

Standard deviation: 139.86303
Coefficient of variation (CV): 0.57675476
Kurtosis: -1.2
Mean: 242.5
Median Absolute Deviation (MAD): 121
Skewness: 0
Sum: 117370
Variance: 19561.667
Monotonicity: Strictly increasing
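Because 연번 is unique, strictly increasing, and runs from 1 to 484, its statistics can be reproduced from the plain integer sequence. The sketch below assumes the column is exactly 1..484, that the variance uses the sample (n - 1) denominator, and that quantiles use linear interpolation between data points; it relies only on the Python standard library.

```python
import statistics

# Assumption: the serial number column is exactly 1, 2, ..., 484.
values = [float(i) for i in range(1, 485)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation

# "inclusive" interpolates linearly between the minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
print(p5, q1, median, q3, p95, q3 - q1, mad)
```

Under these assumptions the computed values agree with the quantile and descriptive statistics listed above.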
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.2%
305 | 1 | 0.2%
333 | 1 | 0.2%
332 | 1 | 0.2%
331 | 1 | 0.2%
330 | 1 | 0.2%
329 | 1 | 0.2%
328 | 1 | 0.2%
327 | 1 | 0.2%
326 | 1 | 0.2%
Other values (474) | 474 | 97.9%
Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%
Value | Count | Frequency (%)
484 | 1 | 0.2%
483 | 1 | 0.2%
482 | 1 | 0.2%
481 | 1 | 0.2%
480 | 1 | 0.2%
479 | 1 | 0.2%
478 | 1 | 0.2%
477 | 1 | 0.2%
476 | 1 | 0.2%
475 | 1 | 0.2%

구분
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

식품접객업소: 100
숙박업소: 99
집단급식소: 71
초중고등학교: 55
어린이집 유치원: 52
Other values (12): 107

Length

Max length: 10
Median length: 8
Mean length: 5.5764463
Min length: 2

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: 숙박업소
2nd row: 숙박업소
3rd row: 숙박업소
4th row: 숙박업소
5th row: 숙박업소

Common Values

Value | Count | Frequency (%)
식품접객업소 | 100 | 20.7%
숙박업소 | 99 | 20.5%
집단급식소 | 71 | 14.7%
초중고등학교 | 55 | 11.4%
어린이집 유치원 | 52 | 10.7%
복합용도 건축물 등 | 31 | 6.4%
운송시설 | 26 | 5.4%
공동주택 | 19 | 3.9%
병원 | 10 | 2.1%
위탁급식소 | 9 | 1.9%
Other values (7) | 12 | 2.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
식품접객업소 | 100 | 16.7%
숙박업소 | 99 | 16.6%
집단급식소 | 71 | 11.9%
초중고등학교 | 55 | 9.2%
어린이집 | 52 | 8.7%
유치원 | 52 | 8.7%
복합용도 | 31 | 5.2%
건축물 | 31 | 5.2%
 | 31 | 5.2%
운송시설 | 26 | 4.3%
Other values (10) | 50 | 8.4%

시설명
Text

Distinct: 477
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 28
Median length: 19
Mean length: 6.5082645
Min length: 1

Characters and Unicode

Total characters: 3150
Distinct characters: 447
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 470
Unique (%): 97.1%

Sample

1st row: 미시간모텔
2nd row: 꿈의궁전
3rd row: 베어스타운리조트
4th row: 샤론모텔
5th row: 그린파크
Value | Count | Frequency (%)
호텔 | 8 | 1.4%
모텔 | 5 | 0.9%
클럽하우스 | 4 | 0.7%
아도니스 | 3 | 0.5%
꼬마등대어린이집 | 2 | 0.4%
한내 | 2 | 0.4%
참밸리 | 2 | 0.4%
드림 | 2 | 0.4%
어린이집 | 2 | 0.4%
대식당 | 2 | 0.4%
Other values (514) | 525 | 94.3%

Most occurring characters

ValueCountFrequency (%)
106
 
3.4%
78
 
2.5%
74
 
2.3%
74
 
2.3%
68
 
2.2%
63
 
2.0%
63
 
2.0%
60
 
1.9%
60
 
1.9%
) 57
 
1.8%
Other values (437) 2447
77.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2853
90.6%
Space Separator 74
 
2.3%
Close Punctuation 57
 
1.8%
Open Punctuation 57
 
1.8%
Uppercase Letter 51
 
1.6%
Decimal Number 28
 
0.9%
Lowercase Letter 24
 
0.8%
Other Punctuation 4
 
0.1%
Other Symbol 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
106
 
3.7%
78
 
2.7%
74
 
2.6%
68
 
2.4%
63
 
2.2%
63
 
2.2%
60
 
2.1%
60
 
2.1%
49
 
1.7%
47
 
1.6%
Other values (391) 2185
76.6%
Uppercase Letter
ValueCountFrequency (%)
O 7
13.7%
C 5
9.8%
T 5
9.8%
D 4
 
7.8%
A 4
 
7.8%
R 4
 
7.8%
L 3
 
5.9%
F 3
 
5.9%
J 3
 
5.9%
E 2
 
3.9%
Other values (8) 11
21.6%
Decimal Number
ValueCountFrequency (%)
2 8
28.6%
1 8
28.6%
0 3
 
10.7%
5 2
 
7.1%
9 2
 
7.1%
3 1
 
3.6%
4 1
 
3.6%
6 1
 
3.6%
8 1
 
3.6%
7 1
 
3.6%
Lowercase Letter
ValueCountFrequency (%)
t 5
20.8%
e 5
20.8%
k 2
 
8.3%
a 2
 
8.3%
r 2
 
8.3%
n 2
 
8.3%
s 2
 
8.3%
u 2
 
8.3%
j 1
 
4.2%
h 1
 
4.2%
Other Punctuation
ValueCountFrequency (%)
& 1
25.0%
. 1
25.0%
' 1
25.0%
/ 1
25.0%
Space Separator
ValueCountFrequency (%)
74
100.0%
Close Punctuation
ValueCountFrequency (%)
) 57
100.0%
Open Punctuation
ValueCountFrequency (%)
( 57
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2855
90.6%
Common 220
 
7.0%
Latin 75
 
2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
106
 
3.7%
78
 
2.7%
74
 
2.6%
68
 
2.4%
63
 
2.2%
63
 
2.2%
60
 
2.1%
60
 
2.1%
49
 
1.7%
47
 
1.6%
Other values (392) 2187
76.6%
Latin
ValueCountFrequency (%)
O 7
 
9.3%
t 5
 
6.7%
e 5
 
6.7%
C 5
 
6.7%
T 5
 
6.7%
D 4
 
5.3%
A 4
 
5.3%
R 4
 
5.3%
L 3
 
4.0%
F 3
 
4.0%
Other values (18) 30
40.0%
Common
ValueCountFrequency (%)
74
33.6%
) 57
25.9%
( 57
25.9%
2 8
 
3.6%
1 8
 
3.6%
0 3
 
1.4%
5 2
 
0.9%
9 2
 
0.9%
3 1
 
0.5%
4 1
 
0.5%
Other values (7) 7
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2853
90.6%
ASCII 295
 
9.4%
None 2
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
106
 
3.7%
78
 
2.7%
74
 
2.6%
68
 
2.4%
63
 
2.2%
63
 
2.2%
60
 
2.1%
60
 
2.1%
49
 
1.7%
47
 
1.6%
Other values (391) 2185
76.6%
ASCII
ValueCountFrequency (%)
74
25.1%
) 57
19.3%
( 57
19.3%
2 8
 
2.7%
1 8
 
2.7%
O 7
 
2.4%
t 5
 
1.7%
e 5
 
1.7%
C 5
 
1.7%
T 5
 
1.7%
Other values (35) 64
21.7%
None
ValueCountFrequency (%)
2
100.0%

도로명주소
Text

Distinct: 452
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 46
Median length: 38
Mean length: 21.975207
Min length: 14

Characters and Unicode

Total characters: 10636
Distinct characters: 214
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 423
Unique (%): 87.4%

Sample

1st row: 경기도 포천시 중앙로 102-14
2nd row: 경기도 포천시 영북면 운천안2길 6
3rd row: 경기도 포천시 내촌면 금강로2536번길 27
4th row: 경기도 포천시 영북면 운천로24번길 3
5th row: 경기도 포천시 중앙로78번길 27
Value | Count | Frequency (%)
경기도 | 484 | 19.6%
포천시 | 483 | 19.6%
소흘읍 | 136 | 5.5%
신북면 | 59 | 2.4%
호국로 | 55 | 2.2%
화동로 | 31 | 1.3%
영북면 | 29 | 1.2%
일동면 | 29 | 1.2%
내촌면 | 28 | 1.1%
군내면 | 28 | 1.1%
Other values (583) | 1102 | 44.7%

Most occurring characters

ValueCountFrequency (%)
2084
19.6%
512
 
4.8%
511
 
4.8%
495
 
4.7%
486
 
4.6%
485
 
4.6%
484
 
4.6%
433
 
4.1%
1 397
 
3.7%
2 308
 
2.9%
Other values (204) 4441
41.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6336
59.6%
Space Separator 2084
 
19.6%
Decimal Number 1961
 
18.4%
Dash Punctuation 124
 
1.2%
Other Punctuation 67
 
0.6%
Open Punctuation 22
 
0.2%
Close Punctuation 20
 
0.2%
Uppercase Letter 12
 
0.1%
Math Symbol 10
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
512
 
8.1%
511
 
8.1%
495
 
7.8%
486
 
7.7%
485
 
7.7%
484
 
7.6%
433
 
6.8%
259
 
4.1%
183
 
2.9%
150
 
2.4%
Other values (183) 2338
36.9%
Decimal Number
ValueCountFrequency (%)
1 397
20.2%
2 308
15.7%
3 191
9.7%
5 178
9.1%
4 155
 
7.9%
7 153
 
7.8%
0 150
 
7.6%
8 149
 
7.6%
6 149
 
7.6%
9 131
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
B 5
41.7%
A 4
33.3%
C 2
 
16.7%
E 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 66
98.5%
. 1
 
1.5%
Space Separator
ValueCountFrequency (%)
2084
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 124
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20
100.0%
Math Symbol
ValueCountFrequency (%)
~ 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6336
59.6%
Common 4288
40.3%
Latin 12
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
512
 
8.1%
511
 
8.1%
495
 
7.8%
486
 
7.7%
485
 
7.7%
484
 
7.6%
433
 
6.8%
259
 
4.1%
183
 
2.9%
150
 
2.4%
Other values (183) 2338
36.9%
Common
ValueCountFrequency (%)
2084
48.6%
1 397
 
9.3%
2 308
 
7.2%
3 191
 
4.5%
5 178
 
4.2%
4 155
 
3.6%
7 153
 
3.6%
0 150
 
3.5%
8 149
 
3.5%
6 149
 
3.5%
Other values (7) 374
 
8.7%
Latin
ValueCountFrequency (%)
B 5
41.7%
A 4
33.3%
C 2
 
16.7%
E 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6336
59.6%
ASCII 4300
40.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2084
48.5%
1 397
 
9.2%
2 308
 
7.2%
3 191
 
4.4%
5 178
 
4.1%
4 155
 
3.6%
7 153
 
3.6%
0 150
 
3.5%
8 149
 
3.5%
6 149
 
3.5%
Other values (11) 386
 
9.0%
Hangul
ValueCountFrequency (%)
512
 
8.1%
511
 
8.1%
495
 
7.8%
486
 
7.7%
485
 
7.7%
484
 
7.6%
433
 
6.8%
259
 
4.1%
183
 
2.9%
150
 
2.4%
Other values (183) 2338
36.9%

전화번호
Text

MISSING 

Distinct: 379
Distinct (%): 95.2%
Missing: 86
Missing (%): 17.8%
Memory size: 3.9 KiB
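The two percentages above use different denominators. A short sketch of the arithmetic, assuming the missing share is taken over all 484 rows while the distinct share is taken over the non-missing values only (an inference from the numbers, not something the report states):

```python
# Counts copied from the 전화번호 summary above.
n_obs = 484
missing = 86
distinct = 379

present = n_obs - missing                      # rows that have a phone number
missing_pct = 100 * missing / n_obs            # denominator: all rows
distinct_pct = 100 * distinct / present        # denominator: non-missing rows
distinct_pct_all_rows = 100 * distinct / n_obs # alternative reading, for comparison

print(f"present={present}, missing={missing_pct:.1f}%")
print(f"distinct over present={distinct_pct:.1f}%")
print(f"distinct over all rows={distinct_pct_all_rows:.1f}%")
```

Only the non-missing denominator reproduces the reported 95.2%; dividing by all rows gives a noticeably lower figure.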

Length

Max length: 13
Median length: 12
Mean length: 11.997487
Min length: 9

Characters and Unicode

Total characters: 4775
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 361
Unique (%): 90.7%

Sample

1st row: 031-535-3603
2nd row: 031-531-0508
3rd row: 031-532-2534
4th row: 031-534-7756
5th row: 031-534-9097
Value | Count | Frequency (%)
031-532-2534 | 3 | 0.7%
070-4008-6006 | 2 | 0.5%
031-544-2313 | 2 | 0.5%
031-539 | 2 | 0.5%
031-539-1114 | 2 | 0.5%
1899-2010 | 2 | 0.5%
031-541-1010 | 2 | 0.5%
031-539-9114 | 2 | 0.5%
031-530-9100 | 2 | 0.5%
031-538-3483 | 2 | 0.5%
Other values (376) | 385 | 94.8%
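The tokens above mix several shapes (three-part numbers, the eight-digit 1899-2010, and the fragment 031-539), which suggests some entries may need cleaning before use. The sketch below flags values that do not match two assumed patterns. The regular expression, the helper name `is_well_formed`, and the space-containing sample value are hypothetical and are not taken from the dataset.

```python
import re

# Assumed shapes: "NNN-NNN(N)-NNNN" style numbers or "NNNN-NNNN" nationwide numbers.
PHONE_RE = re.compile(r"(?:\d{2,3}-\d{3,4}-\d{4}|\d{4}-\d{4})")

def is_well_formed(phone: str) -> bool:
    """Return True if the trimmed value matches one of the assumed patterns."""
    return PHONE_RE.fullmatch(phone.strip()) is not None

samples = [
    "031-535-3603",    # from the report
    "070-4008-6006",   # from the report
    "1899-2010",       # from the report
    "031-539 1114",    # hypothetical value with a space instead of a hyphen
]
flags = {s: is_well_formed(s) for s in samples}
for value, ok in flags.items():
    print(f"{value!r}: {'ok' if ok else 'needs review'}")
```

A check like this only reports format mismatches; whether a flagged value is actually wrong still needs a manual look.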

Most occurring characters

Value | Count | Frequency (%)
3 | 869 | 18.2%
- | 792 | 16.6%
0 | 684 | 14.3%
1 | 681 | 14.3%
5 | 573 | 12.0%
4 | 337 | 7.1%
2 | 226 | 4.7%
6 | 174 | 3.6%
8 | 166 | 3.5%
7 | 136 | 2.8%
Other values (2) | 137 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3975 | 83.2%
Dash Punctuation | 792 | 16.6%
Space Separator | 8 | 0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 869
21.9%
0 684
17.2%
1 681
17.1%
5 573
14.4%
4 337
 
8.5%
2 226
 
5.7%
6 174
 
4.4%
8 166
 
4.2%
7 136
 
3.4%
9 129
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 792
100.0%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4775
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 869
18.2%
- 792
16.6%
0 684
14.3%
1 681
14.3%
5 573
12.0%
4 337
 
7.1%
2 226
 
4.7%
6 174
 
3.6%
8 166
 
3.5%
7 136
 
2.8%
Other values (2) 137
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4775
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 869
18.2%
- 792
16.6%
0 684
14.3%
1 681
14.3%
5 573
12.0%
4 337
 
7.1%
2 226
 
4.7%
6 174
 
3.6%
8 166
 
3.5%
7 136
 
2.8%
Other values (2) 137
 
2.9%

Interactions


Correlations

 | 연번 | 구분
연번 | 1.000 | 0.934
구분 | 0.934 | 1.000

 | 연번 | 구분
연번 | 1.000 | 0.728
구분 | 0.728 | 1.000
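The report does not say which association measure produced each matrix. One plausible reason for a high value between 연번 and 구분 is that the rows appear to be grouped by category, so ranges of serial numbers line up with categories. The sketch below illustrates that effect on toy data (not the real dataset) using Cramér's V on a binned serial number; both the choice of Cramér's V and the binning are assumptions.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (without bias correction) for two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: serial numbers 1..12 cut into three equal-width bins.
serial = list(range(1, 13))
bins = [(s - 1) // 4 for s in serial]

# Rows grouped by category: each bin contains exactly one category.
grouped = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
v_grouped = cramers_v(bins, grouped)

# Categories interleaved: every bin contains a mix of categories.
interleaved = ["A", "B", "C"] * 4
v_interleaved = cramers_v(bins, interleaved)

print(round(v_grouped, 3), round(v_interleaved, 3))
```

With grouped rows the association is maximal, while interleaving the same categories drops it sharply, which is consistent with a serial number that merely reflects the file's sort order.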

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 구분 | 시설명 | 도로명주소 | 전화번호
0 | 1 | 숙박업소 | 미시간모텔 | 경기도 포천시 중앙로 102-14 | 031-535-3603
1 | 2 | 숙박업소 | 꿈의궁전 | 경기도 포천시 영북면 운천안2길 6 | 031-531-0508
2 | 3 | 숙박업소 | 베어스타운리조트 | 경기도 포천시 내촌면 금강로2536번길 27 | 031-532-2534
3 | 4 | 숙박업소 | 샤론모텔 | 경기도 포천시 영북면 운천로24번길 3 | <NA>
4 | 5 | 숙박업소 | 그린파크 | 경기도 포천시 중앙로78번길 27 | 031-534-7756
5 | 6 | 숙박업소 | 라스베가스 | 경기도 포천시 신북면 호국로 2447 | 031-534-9097
6 | 7 | 숙박업소 | 하이원 | 경기도 포천시 소흘읍 거친봉이길 15-1 | 031-543-1050
7 | 8 | 관광숙박업소 | 베어스타운콘도 | 경기도 포천시 내촌면 금강로2536번길 27 | 031-532-2534
8 | 9 | 숙박업소 | 유토피아 | 경기도 포천시 중앙로149번길 10-3 | 031-531-0031
9 | 10 | 숙박업소 | 숲속의궁전 | 경기도 포천시 내촌면 금강로 2119-18 | 031-531-6423

(index) | 연번 | 구분 | 시설명 | 도로명주소 | 전화번호
474 | 475 | 공동주택 | 세창선단마을 | 경기도 포천시 선마로 22 | 031-541-6217
475 | 476 | 공동주택 | 연봉마을(영화) | 경기도 포천시 소흘읍 봉솔로 9 | 031-541-8147
476 | 477 | 공동주택 | 송천마을(주공2) | 경기도 포천시 소흘읍 태봉로 227 | 031-541-0425
477 | 478 | 공동주택 | 태봉마을(주공3) | 경기도 포천시 소흘읍 태봉로 153 | 031-541-7625
478 | 479 | 공동주택 | 휴먼시아아파트 | 경기도 포천시 왕방로118번길 7 | 031-533-8425
479 | 480 | 공동주택 | 고은마을(주공4) | 경기도 포천시 소흘읍 태봉로 124 | 031-543-0346
480 | 481 | 공동주택 | 추산마을(주공5) | 경기도 포천시 소흘읍 태봉로 83 | 031-544-0466
481 | 482 | 공동주택 | 포천아이파크1차 | 경기도 포천시 군내면 호국로 1536-10 | 031-536-9172
482 | 483 | 공동주택 | 포천아이파크2차 | 경기도 포천시 호국로 1536-22 | 031-536-9066
483 | 484 | 공동주택 | 포천 용정 행복주택 | 경기도 포천시 용정경제로1길 14 | 031-536-6710