Overview

Dataset statistics

Number of variables: 8
Number of observations: 236
Missing cells: 135
Missing cells (%): 7.2%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 15.3 KiB
Average record size in memory: 66.6 B
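
As a quick sanity check, the percentages in this table can be recomputed from the raw counts. The sketch below is plain Python with variable names chosen only for illustration. It also shows that 135 missing cells equals 27 rows times 5 columns, under the assumption (suggested by the variable sections, not stated by the report) that the three categorical columns treat `<NA>` as a category rather than as a missing value.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 236
missing_cells = 135
duplicate_rows = 1

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 27 empty rows, counted only in the 5 columns flagged "Missing".
empty_rows = 27
columns_flagged_missing = 5
cells_from_empty_rows = empty_rows * columns_flagged_missing

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Cells explained by empty rows: {cells_from_empty_rows}")
```

If the categorical columns also counted their 27 `<NA>` entries as missing, the total would be 27 × 8 = 216 rather than 135.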

Variable types

Categorical: 3
Text: 2
Numeric: 2
DateTime: 1

Dataset

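Description: Provides the status of rural (farming and fishing village) guesthouses for tourists visiting Boryeong-si, Chungcheongnam-do for leisure, including province name, city/county name, location, and guesthouse name.
Author: 충청남도 보령시 (Boryeong-si, Chungcheongnam-do)
URL: https://www.data.go.kr/data/15113712/fileData.do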

Alerts

데이터기준일 has constant value "" (Constant)
Dataset has 1 (0.4%) duplicate rows (Duplicates)
시군명 is highly overall correlated with 위도 and 3 other fields (High correlation)
읍면동 is highly overall correlated with 위도 and 3 other fields (High correlation)
시도명 is highly overall correlated with 위도 and 3 other fields (High correlation)
위도 is highly overall correlated with 시도명 and 2 other fields (High correlation)
경도 is highly overall correlated with 시도명 and 2 other fields (High correlation)
도로명주소 has 27 (11.4%) missing values (Missing)
민박업소명 has 27 (11.4%) missing values (Missing)
위도 has 27 (11.4%) missing values (Missing)
경도 has 27 (11.4%) missing values (Missing)
데이터기준일 has 27 (11.4%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 14:42:10.263428
Analysis finished: 2023-12-12 14:42:11.620255
Duration: 1.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
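
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-12-12 14:42:10.263428")
finished = datetime.fromisoformat("2023-12-12 14:42:11.620255")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```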

Variables

시도명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
충청남도 | 209
<NA> | 27

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value | Count | Frequency (%)
충청남도 | 209 | 88.6%
<NA> | 27 | 11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
충청남도 | 209 | 88.6%
na | 27 | 11.4%

시군명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
보령시 | 209
<NA> | 27

Length

Max length: 4
Median length: 3
Mean length: 3.1144068
Min length: 3
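
The mean length of 3.1144068 is consistent with the missing marker being measured as the four-character string `<NA>`, which would also explain the maximum length of 4. The following plain-Python check is a sketch under that assumption; the variable names are illustrative.

```python
# Category counts from the 시군명 summary.
counts = {"보령시": 209, "<NA>": 27}

total = sum(counts.values())
# Weighted mean of string lengths, treating "<NA>" as a literal string.
mean_length = sum(len(value) * n for value, n in counts.items()) / total
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(f"Mean length: {mean_length:.7f}")
print(f"Max length: {max_length}, min length: {min_length}")
```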

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보령시
2nd row: 보령시
3rd row: 보령시
4th row: 보령시
5th row: 보령시

Common Values

Value | Count | Frequency (%)
보령시 | 209 | 88.6%
<NA> | 27 | 11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
보령시 | 209 | 88.6%
na | 27 | 11.4%

읍면동
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
웅천읍 | 52
대천5동 | 48
오천면 | 29
<NA> | 27
성주면 | 22
Other values (9) | 58

Length

Max length: 11
Median length: 3
Mean length: 3.8813559
Min length: 3

Unique

Unique: 3
Unique (%): 1.3%

Sample

1st row: 남포면
2nd row: 오천면
3rd row: 오천면
4th row: 웅천읍
5th row: 웅천읍

Common Values

Value | Count | Frequency (%)
웅천읍 | 52 | 22.0%
대천5동 | 48 | 20.3%
오천면 | 29 | 12.3%
<NA> | 27 | 11.4%
성주면 | 22 | 9.3%
남포면 | 16 | 6.8%
오천면(원산도출장소) | 16 | 6.8%
천북면 | 14 | 5.9%
웅천읍 | 5 | 2.1%
미산면 | 2 | 0.8%
Other values (4) | 5 | 2.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
웅천읍 | 57 | 24.2%
대천5동 | 48 | 20.3%
오천면 | 29 | 12.3%
na | 27 | 11.4%
성주면 | 22 | 9.3%
남포면 | 16 | 6.8%
오천면(원산도출장소 | 16 | 6.8%
천북면 | 14 | 5.9%
미산면 | 2 | 0.8%
청소면 | 2 | 0.8%
Other values (3) | 3 | 1.3%
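
The Common Values table lists 웅천읍 twice (52 and 5), while the table just above counts 57 for the same token. One plausible explanation, which the report itself does not confirm, is that the two labels differ only by invisible whitespace. Under that assumption, a minimal sketch of merging such labels before counting looks like this; the sample list and the `normalize_label` helper are hypothetical.

```python
from collections import Counter


def normalize_label(label: str) -> str:
    """Collapse runs of whitespace and trim both ends of a label."""
    return " ".join(label.split())


# Hypothetical raw labels: the second spelling carries a trailing space.
raw_labels = ["웅천읍"] * 52 + ["웅천읍 "] * 5 + ["대천5동"] * 48

before = Counter(raw_labels)
after = Counter(normalize_label(label) for label in raw_labels)

print(len(before), "distinct labels before normalization")
print(len(after), "distinct labels after normalization")
print(after["웅천읍"], "rows labelled 웅천읍 after normalization")
```

If the real cause is something else, such as a different Unicode normalization form, the same pattern applies with a different `normalize_label` body.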

도로명주소
Text

MISSING 

Distinct: 209
Distinct (%): 100.0%
Missing: 27
Missing (%): 11.4%
Memory size: 2.0 KiB

Length

Max length: 34
Median length: 31
Mean length: 23.344498
Min length: 16

Characters and Unicode

Total characters: 4879
Distinct characters: 131
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 209
Unique (%): 100.0%

Sample

1st row: 충청남도 보령시 남포면 용두욕장길 43
2nd row: 충청남도 보령시 오천면 소도길 11-6
3rd row: 충청남도 보령시 오천면 소도길 7
4th row: 충청남도 보령시 웅천읍 열린바다로 579
5th row: 충청남도 보령시 웅천읍 열린바다로 599-13

Value | Count | Frequency (%)
충청남도 | 209 | 20.5%
보령시 | 209 | 20.5%
웅천읍 | 57 | 5.6%
오천면 | 45 | 4.4%
열린바다로 | 25 | 2.5%
성주면 | 22 | 2.2%
심원계곡로 | 18 | 1.8%
남포면 | 16 | 1.6%
천북면 | 14 | 1.4%
간드리2길 | 14 | 1.4%
Other values (284) | 391 | 38.3%

Most occurring characters

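Value | Count | Frequency (%)
(space) | 811 | 16.6%
(not preserved) | 244 | 5.0%
(not preserved) | 228 | 4.7%
(not preserved) | 220 | 4.5%
(not preserved) | 215 | 4.4%
(not preserved) | 211 | 4.3%
(not preserved) | 210 | 4.3%
(not preserved) | 210 | 4.3%
1 | 168 | 3.4%
2 | 133 | 2.7%
Other values (121) | 2229 | 45.7%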

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3027 | 62.0%
Decimal Number | 824 | 16.9%
Space Separator | 811 | 16.6%
Dash Punctuation | 100 | 2.0%
Open Punctuation | 42 | 0.9%
Close Punctuation | 42 | 0.9%
Other Punctuation | 23 | 0.5%
Uppercase Letter | 10 | 0.2%

Most frequent character per category

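Other Letter
Value | Count | Frequency (%)
(not preserved) | 244 | 8.1%
(not preserved) | 228 | 7.5%
(not preserved) | 220 | 7.3%
(not preserved) | 215 | 7.1%
(not preserved) | 211 | 7.0%
(not preserved) | 210 | 6.9%
(not preserved) | 210 | 6.9%
(not preserved) | 131 | 4.3%
(not preserved) | 126 | 4.2%
(not preserved) | 104 | 3.4%
Other values (103) | 1128 | 37.3%

Decimal Number
Value | Count | Frequency (%)
1 | 168 | 20.4%
2 | 133 | 16.1%
4 | 85 | 10.3%
3 | 75 | 9.1%
9 | 68 | 8.3%
6 | 67 | 8.1%
5 | 66 | 8.0%
7 | 66 | 8.0%
8 | 49 | 5.9%
0 | 47 | 5.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 4 | 40.0%
C | 3 | 30.0%
A | 3 | 30.0%

Space Separator
Value | Count | Frequency (%)
(space) | 811 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 100 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 42 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 42 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 23 | 100.0%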

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3027 | 62.0%
Common | 1842 | 37.8%
Latin | 10 | 0.2%

Most frequent character per script

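Hangul
Value | Count | Frequency (%)
(not preserved) | 244 | 8.1%
(not preserved) | 228 | 7.5%
(not preserved) | 220 | 7.3%
(not preserved) | 215 | 7.1%
(not preserved) | 211 | 7.0%
(not preserved) | 210 | 6.9%
(not preserved) | 210 | 6.9%
(not preserved) | 131 | 4.3%
(not preserved) | 126 | 4.2%
(not preserved) | 104 | 3.4%
Other values (103) | 1128 | 37.3%

Common
Value | Count | Frequency (%)
(space) | 811 | 44.0%
1 | 168 | 9.1%
2 | 133 | 7.2%
- | 100 | 5.4%
4 | 85 | 4.6%
3 | 75 | 4.1%
9 | 68 | 3.7%
6 | 67 | 3.6%
5 | 66 | 3.6%
7 | 66 | 3.6%
Other values (5) | 203 | 11.0%

Latin
Value | Count | Frequency (%)
B | 4 | 40.0%
C | 3 | 30.0%
A | 3 | 30.0%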

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3027 | 62.0%
ASCII | 1852 | 38.0%

Most frequent character per block

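ASCII
Value | Count | Frequency (%)
(space) | 811 | 43.8%
1 | 168 | 9.1%
2 | 133 | 7.2%
- | 100 | 5.4%
4 | 85 | 4.6%
3 | 75 | 4.0%
9 | 68 | 3.7%
6 | 67 | 3.6%
5 | 66 | 3.6%
7 | 66 | 3.6%
Other values (8) | 213 | 11.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 244 | 8.1%
(not preserved) | 228 | 7.5%
(not preserved) | 220 | 7.3%
(not preserved) | 215 | 7.1%
(not preserved) | 211 | 7.0%
(not preserved) | 210 | 6.9%
(not preserved) | 210 | 6.9%
(not preserved) | 131 | 4.3%
(not preserved) | 126 | 4.2%
(not preserved) | 104 | 3.4%
Other values (103) | 1128 | 37.3%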

민박업소명
Text

MISSING 

Distinct: 209
Distinct (%): 100.0%
Missing: 27
Missing (%): 11.4%
Memory size: 2.0 KiB

Length

Max length: 18
Median length: 13
Mean length: 5.2488038
Min length: 2

Characters and Unicode

Total characters: 1097
Distinct characters: 269
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 209
Unique (%): 100.0%

Sample

1st row: 은행나무
2nd row: 바위섬
3rd row: 바위섬소도
4th row: 독산해변
5th row: 삼도정

Value | Count | Frequency (%)
무창포 | 2 | 0.9%
히든빌리지 | 2 | 0.9%
비체하우스펜션 | 2 | 0.9%
브로콜리 | 1 | 0.5%
나무숲펜션 | 1 | 0.5%
베리굿 | 1 | 0.5%
우농서원 | 1 | 0.5%
가고파민박 | 1 | 0.5%
대일민박 | 1 | 0.5%
놀자 | 1 | 0.5%
Other values (201) | 201 | 93.9%

Most occurring characters

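Value | Count | Frequency (%)
(not preserved) | 54 | 4.9%
(not preserved) | 51 | 4.6%
(not preserved) | 50 | 4.6%
(not preserved) | 49 | 4.5%
(not preserved) | 29 | 2.6%
(not preserved) | 23 | 2.1%
(not preserved) | 20 | 1.8%
(not preserved) | 20 | 1.8%
(not preserved) | 18 | 1.6%
(not preserved) | 15 | 1.4%
Other values (259) | 768 | 70.0%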

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1044 | 95.2%
Uppercase Letter | 24 | 2.2%
Decimal Number | 13 | 1.2%
Space Separator | 5 | 0.5%
Lowercase Letter | 4 | 0.4%
Close Punctuation | 3 | 0.3%
Open Punctuation | 3 | 0.3%
Other Punctuation | 1 | 0.1%

Most frequent character per category

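Other Letter
Value | Count | Frequency (%)
(not preserved) | 54 | 5.2%
(not preserved) | 51 | 4.9%
(not preserved) | 50 | 4.8%
(not preserved) | 49 | 4.7%
(not preserved) | 29 | 2.8%
(not preserved) | 23 | 2.2%
(not preserved) | 20 | 1.9%
(not preserved) | 20 | 1.9%
(not preserved) | 18 | 1.7%
(not preserved) | 15 | 1.4%
Other values (235) | 715 | 68.5%

Uppercase Letter
Value | Count | Frequency (%)
A | 9 | 37.5%
B | 6 | 25.0%
L | 2 | 8.3%
R | 2 | 8.3%
V | 2 | 8.3%
C | 1 | 4.2%
J | 1 | 4.2%
P | 1 | 4.2%

Decimal Number
Value | Count | Frequency (%)
2 | 4 | 30.8%
5 | 2 | 15.4%
1 | 2 | 15.4%
8 | 1 | 7.7%
7 | 1 | 7.7%
0 | 1 | 7.7%
6 | 1 | 7.7%
4 | 1 | 7.7%

Lowercase Letter
Value | Count | Frequency (%)
m | 1 | 25.0%
e | 1 | 25.0%
r | 1 | 25.0%
a | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%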

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1044 | 95.2%
Latin | 28 | 2.6%
Common | 25 | 2.3%

Most frequent character per script

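Hangul
Value | Count | Frequency (%)
(not preserved) | 54 | 5.2%
(not preserved) | 51 | 4.9%
(not preserved) | 50 | 4.8%
(not preserved) | 49 | 4.7%
(not preserved) | 29 | 2.8%
(not preserved) | 23 | 2.2%
(not preserved) | 20 | 1.9%
(not preserved) | 20 | 1.9%
(not preserved) | 18 | 1.7%
(not preserved) | 15 | 1.4%
Other values (235) | 715 | 68.5%

Latin
Value | Count | Frequency (%)
A | 9 | 32.1%
B | 6 | 21.4%
L | 2 | 7.1%
R | 2 | 7.1%
V | 2 | 7.1%
m | 1 | 3.6%
e | 1 | 3.6%
r | 1 | 3.6%
a | 1 | 3.6%
C | 1 | 3.6%
Other values (2) | 2 | 7.1%

Common
Value | Count | Frequency (%)
(space) | 5 | 20.0%
2 | 4 | 16.0%
) | 3 | 12.0%
( | 3 | 12.0%
5 | 2 | 8.0%
1 | 2 | 8.0%
8 | 1 | 4.0%
7 | 1 | 4.0%
0 | 1 | 4.0%
& | 1 | 4.0%
Other values (2) | 2 | 8.0%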

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1044 | 95.2%
ASCII | 53 | 4.8%

Most frequent character per block

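Hangul
Value | Count | Frequency (%)
(not preserved) | 54 | 5.2%
(not preserved) | 51 | 4.9%
(not preserved) | 50 | 4.8%
(not preserved) | 49 | 4.7%
(not preserved) | 29 | 2.8%
(not preserved) | 23 | 2.2%
(not preserved) | 20 | 1.9%
(not preserved) | 20 | 1.9%
(not preserved) | 18 | 1.7%
(not preserved) | 15 | 1.4%
Other values (235) | 715 | 68.5%

ASCII
Value | Count | Frequency (%)
A | 9 | 17.0%
B | 6 | 11.3%
(space) | 5 | 9.4%
2 | 4 | 7.5%
) | 3 | 5.7%
( | 3 | 5.7%
5 | 2 | 3.8%
L | 2 | 3.8%
R | 2 | 3.8%
V | 2 | 3.8%
Other values (14) | 15 | 28.3%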

위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 200
Distinct (%): 95.7%
Missing: 27
Missing (%): 11.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.321587
Minimum: 36.220204
Maximum: 36.500965
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 36.220204
5-th percentile: 36.225194
Q1: 36.251826
Median: 36.328243
Q3: 36.349568
95-th percentile: 36.443449
Maximum: 36.500965
Range: 0.28076102
Interquartile range (IQR): 0.09774181

Descriptive statistics

Standard deviation: 0.068177161
Coefficient of variation (CV): 0.0018770424
Kurtosis: -0.51742835
Mean: 36.321587
Median Absolute Deviation (MAD): 0.04472058
Skewness: 0.38178991
Sum: 7591.2118
Variance: 0.0046481252
Monotonicity: Not monotonic
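
Several of these statistics are derived from others, so they can be cross-checked against each other. The sketch below uses the standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared) with the rounded 위도 values reported here; the variable names are illustrative, and small differences in the last digits are expected because the inputs are rounded.

```python
# Values copied from the 위도 statistics.
minimum, maximum = 36.220204, 36.500965
q1, q3 = 36.251826, 36.349568
mean, std = 36.321587, 0.068177161

value_range = maximum - minimum  # reported: 0.28076102
iqr = q3 - q1                    # reported: 0.09774181
cv = std / mean                  # reported: 0.0018770424
variance = std ** 2              # reported: 0.0046481252

print(f"Range:    {value_range:.6f}")
print(f"IQR:      {iqr:.6f}")
print(f"CV:       {cv:.8f}")
print(f"Variance: {variance:.8f}")
```

The same relationships hold for the 경도 statistics.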
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
36.33134992 | 3 | 1.3%
36.32911442 | 2 | 0.8%
36.32426918 | 2 | 0.8%
36.34733776 | 2 | 0.8%
36.32771382 | 2 | 0.8%
36.32968381 | 2 | 0.8%
36.33166882 | 2 | 0.8%
36.23010217 | 2 | 0.8%
36.27457392 | 1 | 0.4%
36.23925981 | 1 | 0.4%
Other values (190) | 190 | 80.5%
(Missing) | 27 | 11.4%

Minimum 10 values
Value | Count | Frequency (%)
36.22020445 | 1 | 0.4%
36.22042458 | 1 | 0.4%
36.22114771 | 1 | 0.4%
36.2212595 | 1 | 0.4%
36.22154139 | 1 | 0.4%
36.22185062 | 1 | 0.4%
36.22201575 | 1 | 0.4%
36.22294209 | 1 | 0.4%
36.22306319 | 1 | 0.4%
36.22366605 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
36.50096547 | 1 | 0.4%
36.50014266 | 1 | 0.4%
36.46761259 | 1 | 0.4%
36.46700264 | 1 | 0.4%
36.4589815 | 1 | 0.4%
36.454745 | 1 | 0.4%
36.45387077 | 1 | 0.4%
36.45037802 | 1 | 0.4%
36.446219 | 1 | 0.4%
36.44540746 | 1 | 0.4%

경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 200
Distinct (%): 95.7%
Missing: 27
Missing (%): 11.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.52461
Minimum: 126.0757
Maximum: 126.69169
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 126.0757
5-th percentile: 126.35578
Q1: 126.50556
Median: 126.53124
Q3: 126.54793
95-th percentile: 126.66953
Maximum: 126.69169
Range: 0.6159896
Interquartile range (IQR): 0.0423679

Descriptive statistics

Standard deviation: 0.086098491
Coefficient of variation (CV): 0.00068048811
Kurtosis: 3.5547956
Mean: 126.52461
Median Absolute Deviation (MAD): 0.0212366
Skewness: -0.94266003
Sum: 26443.643
Variance: 0.0074129501
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
126.5243043 | 3 | 1.3%
126.5278392 | 2 | 0.8%
126.5049028 | 2 | 0.8%
126.6696165 | 2 | 0.8%
126.5232536 | 2 | 0.8%
126.527117 | 2 | 0.8%
126.5242983 | 2 | 0.8%
126.5316056 | 2 | 0.8%
126.2690219 | 1 | 0.4%
126.533479 | 1 | 0.4%
Other values (190) | 190 | 80.5%
(Missing) | 27 | 11.4%

Minimum 10 values
Value | Count | Frequency (%)
126.075701 | 1 | 0.4%
126.2685055 | 1 | 0.4%
126.2690219 | 1 | 0.4%
126.3510403 | 1 | 0.4%
126.3523913 | 1 | 0.4%
126.3525337 | 1 | 0.4%
126.3525918 | 1 | 0.4%
126.3527528 | 1 | 0.4%
126.3535098 | 1 | 0.4%
126.3536153 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
126.6916906 | 1 | 0.4%
126.6913751 | 1 | 0.4%
126.6811154 | 1 | 0.4%
126.6807078 | 1 | 0.4%
126.6756898 | 1 | 0.4%
126.6729573 | 1 | 0.4%
126.6719897 | 1 | 0.4%
126.6713918 | 1 | 0.4%
126.6702197 | 1 | 0.4%
126.6696165 | 2 | 0.8%

데이터기준일
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.5%
Missing: 27
Missing (%): 11.4%
Memory size: 2.0 KiB
Minimum: 2023-11-07 00:00:00
Maximum: 2023-11-07 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Figures: four interaction plots, not preserved in this text version]

Correlations

 | 읍면동 | 위도 | 경도
읍면동 | 1.000 | 0.884 | 0.886
위도 | 0.884 | 1.000 | 0.798
경도 | 0.886 | 0.798 | 1.000

 | 시군명 | 읍면동 | 시도명
시군명 | 1.000 | 1.000 | 1.000
읍면동 | 1.000 | 1.000 | 1.000
시도명 | 1.000 | 1.000 | 1.000

 | 위도 | 경도 | 시도명 | 시군명 | 읍면동
위도 | 1.000 | -0.365 | 1.000 | 1.000 | 0.631
경도 | -0.365 | 1.000 | 1.000 | 1.000 | 0.667
시도명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시군명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
읍면동 | 0.631 | 0.667 | 1.000 | 1.000 | 1.000
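
The report does not state which association measure fills each matrix. As one illustration of why two columns that are either both populated or both `<NA>` can score exactly 1.000, the sketch below computes Cramér's V, a common measure for pairs of categorical variables, on a contingency table with the same 209/27 split as 시도명 and 시군명. The choice of Cramér's V and the `cramers_v` helper are assumptions made for this example, not something the report confirms.

```python
import math


def cramers_v(table):
    """Cramér's V (without bias correction) for a contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Rows: 시도명 in {충청남도, <NA>}; columns: 시군명 in {보령시, <NA>}.
# The 27 empty rows line up exactly, so the off-diagonal cells are zero.
aligned = [[209, 0],
           [0, 27]]
# A hypothetical table with no association at all, for contrast.
independent = [[10, 10],
               [10, 10]]

print(round(cramers_v(aligned), 3))
print(round(cramers_v(independent), 3))
```

On this reading, the perfect scores reflect the shared block of 27 empty rows rather than any informative relationship between province and city names.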

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 시도명 | 시군명 | 읍면동 | 도로명주소 | 민박업소명 | 위도 | 경도 | 데이터기준일
0 | 충청남도 | 보령시 | 남포면 | 충청남도 보령시 남포면 용두욕장길 43 | 은행나무 | 36.265527 | 126.548857 | 2023-11-07
1 | 충청남도 | 보령시 | 오천면 | 충청남도 보령시 오천면 소도길 11-6 | 바위섬 | 36.398669 | 126.432118 | 2023-11-07
2 | 충청남도 | 보령시 | 오천면 | 충청남도 보령시 오천면 소도길 7 | 바위섬소도 | 36.398243 | 126.431433 | 2023-11-07
3 | 충청남도 | 보령시 | 웅천읍 | 충청남도 보령시 웅천읍 열린바다로 579 | 독산해변 | 36.22377 | 126.532113 | 2023-11-07
4 | 충청남도 | 보령시 | 웅천읍 | 충청남도 보령시 웅천읍 열린바다로 599-13 | 삼도정 | 36.221851 | 126.531242 | 2023-11-07
5 | 충청남도 | 보령시 | 웅천읍 | 충청남도 보령시 웅천읍 열린바다로 305 | 새서울 | 36.243726 | 126.537289 | 2023-11-07
6 | 충청남도 | 보령시 | 웅천읍 | 충청남도 보령시 웅천읍 열린바다로 315-35 | 자연 | 36.243821 | 126.537703 | 2023-11-07
7 | 충청남도 | 보령시 | 웅천읍 | 충청남도 보령시 웅천읍 열린바다로 599-32 | 전망대타운 독채펜션 | 36.22126 | 126.532156 | 2023-11-07
8 | 충청남도 | 보령시 | 천북면 | 충청남도 보령시 천북면 염생이길 154, 158 | 해나루 | 36.453871 | 126.487398 | 2023-11-07
9 | 충청남도 | 보령시 | 청라면 | 충청남도 보령시 청라면 오서산길 150-32 | 정촌유기농원 | 36.42582 | 126.67199 | 2023-11-07

Index | 시도명 | 시군명 | 읍면동 | 도로명주소 | 민박업소명 | 위도 | 경도 | 데이터기준일
226 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
227 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
228 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
229 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
230 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
231 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
232 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
233 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
234 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
235 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 시도명 | 시군명 | 읍면동 | 도로명주소 | 민박업소명 | 위도 | 경도 | 데이터기준일 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 27
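
The only duplicated row is entirely `<NA>`, and its count of 27 matches the 27 missing values reported for each column. A minimal sketch of filtering such rows out before profiling follows, in plain Python with `None` standing in for `<NA>`. The `rows` list and the `is_empty_row` helper are hypothetical; with pandas, the equivalent call would be `DataFrame.dropna(how="all")`.

```python
def is_empty_row(row: dict) -> bool:
    """Return True when every field in the row is missing (None)."""
    return all(value is None for value in row.values())


columns = ["시도명", "시군명", "읍면동", "도로명주소",
           "민박업소명", "위도", "경도", "데이터기준일"]

# Two illustrative records taken from the sample, plus three empty rows.
rows = [
    dict(zip(columns, ["충청남도", "보령시", "남포면",
                       "충청남도 보령시 남포면 용두욕장길 43",
                       "은행나무", 36.265527, 126.548857, "2023-11-07"])),
    dict(zip(columns, ["충청남도", "보령시", "오천면",
                       "충청남도 보령시 오천면 소도길 11-6",
                       "바위섬", 36.398669, 126.432118, "2023-11-07"])),
] + [dict.fromkeys(columns) for _ in range(3)]

cleaned = [row for row in rows if not is_empty_row(row)]
print(len(rows), "rows before,", len(cleaned), "rows after")
```

Applied to the full dataset, this kind of filter would be expected to leave 209 rows, since 236 − 27 = 209.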