Overview

Dataset statistics

Number of variables: 4
Number of observations: 1840
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 59.4 KiB
Average record size in memory: 33.1 B

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

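Description: 농어촌민박 (rural homestay lodging): a house in a rural or semi-rural area in which the resident actually lives. It is limited to one detached house with a total floor area under 230 square meters, and it must be registered with the mayor or county governor before operating.
URL: https://www.data.go.kr/data/15019878/fileData.do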

Alerts

연번 is highly overall correlated with 시군명 (High correlation)
시군명 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 16:07:16.602400
Analysis finished: 2023-12-12 16:07:17.269440
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
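
For readers who want to regenerate a report of this kind, the following is a minimal sketch, assuming the CSV from the URL above has been saved locally. The `cp949` encoding, the report title, and the file name in the usage comment are assumptions for illustration; none of them is recorded in this report.

```python
def build_report(csv_path, encoding="cp949"):
    """Profile a CSV with ydata-profiling and return the report object."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    return ProfileReport(df, title="Dataset profile")  # title is illustrative


# Usage (hypothetical local file name):
# build_report("rural_lodging.csv").to_file("report.html")
```

Default settings may not reproduce this report exactly; the original run's settings are in the `config.json` listed above.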

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1840
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 920.5
Minimum: 1
Maximum: 1840
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 92.95
Q1: 460.75
Median: 920.5
Q3: 1380.25
95-th percentile: 1748.05
Maximum: 1840
Range: 1839
Interquartile range (IQR): 919.5
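
The fractional quantiles above are what linear interpolation between neighbouring ranks produces when 연번 is the gap-free sequence 1 to 1840. The sketch below assumes that interpolation rule (the default in NumPy and pandas); the report does not state which method was used, and `quantile` is a helper written for this illustration.

```python
def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


serial_numbers = list(range(1, 1841))  # 연번 runs from 1 to 1840 with no gaps
q1 = quantile(serial_numbers, 0.25)
q3 = quantile(serial_numbers, 0.75)
print(q1, q3, q3 - q1)  # 460.75 1380.25 919.5
```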

Descriptive statistics

Standard deviation: 531.30657
Coefficient of variation (CV): 0.57719344
Kurtosis: -1.2
Mean: 920.5
Median Absolute Deviation (MAD): 460
Skewness: 0
Sum: 1693720
Variance: 282286.67
Monotonicity: Strictly increasing
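
Because 연번 is simply 1 to 1840, most of these figures can be checked with the standard library. The sketch below assumes the variance is the sample variance (dividing by n - 1), which is the reading that matches the reported 282286.67.

```python
import statistics

serial_numbers = list(range(1, 1841))  # 연번 runs from 1 to 1840 with no gaps

total = sum(serial_numbers)                      # n(n+1)/2 for n = 1840
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)   # sample variance (n - 1 divisor)
std_dev = statistics.stdev(serial_numbers)
cv = std_dev / mean                              # coefficient of variation
mad = statistics.median(abs(x - mean) for x in serial_numbers)

print(total, mean, round(variance, 2), round(std_dev, 5), mad)
```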
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
1238 | 1 | 0.1%
1236 | 1 | 0.1%
1235 | 1 | 0.1%
1234 | 1 | 0.1%
1233 | 1 | 0.1%
1232 | 1 | 0.1%
1231 | 1 | 0.1%
1230 | 1 | 0.1%
1229 | 1 | 0.1%
Other values (1830) | 1830 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1840 | 1 | 0.1%
1839 | 1 | 0.1%
1838 | 1 | 0.1%
1837 | 1 | 0.1%
1836 | 1 | 0.1%
1835 | 1 | 0.1%
1834 | 1 | 0.1%
1833 | 1 | 0.1%
1832 | 1 | 0.1%
1831 | 1 | 0.1%

시군명 (city/county name)
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 14.5 KiB

태안군: 923
보령시: 186
공주시: 128
논산시: 88
서천군: 87
Other values (10): 428

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 천안시
2nd row: 천안시
3rd row: 천안시
4th row: 천안시
5th row: 천안시

Common Values

Value | Count | Frequency (%)
태안군 | 923 | 50.2%
보령시 | 186 | 10.1%
공주시 | 128 | 7.0%
논산시 | 88 | 4.8%
서천군 | 87 | 4.7%
서산시 | 86 | 4.7%
금산군 | 64 | 3.5%
천안시 | 54 | 2.9%
당진시 | 50 | 2.7%
아산시 | 47 | 2.6%
Other values (5) | 127 | 6.9%
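
As a quick consistency check on the table above, the percentages can be recomputed from the counts. This is a small sketch using only the counts shown in this report; the dictionary name `counts` is illustrative.

```python
# Counts copied from the Common Values table for 시군명.
counts = {
    "태안군": 923, "보령시": 186, "공주시": 128, "논산시": 88, "서천군": 87,
    "서산시": 86, "금산군": 64, "천안시": 54, "당진시": 50, "아산시": 47,
    "Other values (5)": 127,
}
total = sum(counts.values())  # should equal the 1840 observations

# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The counts sum to 1840, and 태안군 alone accounts for roughly half of all registered lodgings.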

Length

Histogram of lengths of the category

민박명 (lodging name)
Text

Distinct: 1726
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Memory size: 14.5 KiB

Length

Max length: 22
Median length: 17
Mean length: 5.2755435
Min length: 1

Characters and Unicode

Total characters: 9707
Distinct characters: 615
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
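
To illustrate how such category counts can be derived, here is a minimal sketch using Python's standard `unicodedata` module. It is not necessarily how ydata-profiling computes them; `category_counts` is a helper written for this example, and the input is the five sample values listed in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    counts = Counter()
    for text in values:
        for char in text:
            # Two-letter codes: "Lo" = Other Letter, "Zs" = Space Separator,
            # "Nd" = Decimal Number, and so on.
            counts[unicodedata.category(char)] += 1
    return counts


sample = ["갈재산장", "하얀집", "여울목펜션", "숲속의쉼터", "동천골"]
print(category_counts(sample))
```

Hangul syllables fall under "Other Letter" (`Lo`), which is why that category dominates the tables that follow.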

Unique

Unique: 1638
Unique (%): 89.0%

Sample

1st row: 갈재산장
2nd row: 하얀집
3rd row: 여울목펜션
4th row: 숲속의쉼터
5th row: 동천골

Value | Count | Frequency (%)
민박 | 69 | 3.5%
펜션 | 18 | 0.9%
서천휴리조트펜션 | 8 | 0.4%
해변민박 | 8 | 0.4%
쉴만한물가 | 4 | 0.2%
통나무 | 4 | 0.2%
힐링 | 4 | 0.2%
송림민박 | 4 | 0.2%
더마레풀빌라 | 4 | 0.2%
한옥민박 | 4 | 0.2%
Other values (1729) | 1863 | 93.6%

Most occurring characters

ValueCountFrequency (%)
544
 
5.6%
544
 
5.6%
536
 
5.5%
483
 
5.0%
453
 
4.7%
177
 
1.8%
173
 
1.8%
157
 
1.6%
137
 
1.4%
135
 
1.4%
Other values (605) 6368
65.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8947 | 92.2%
Space Separator | 536 | 5.5%
Decimal Number | 84 | 0.9%
Uppercase Letter | 54 | 0.6%
Lowercase Letter | 51 | 0.5%
Letter Number | 17 | 0.2%
Other Punctuation | 6 | 0.1%
Open Punctuation | 4 | < 0.1%
Close Punctuation | 4 | < 0.1%
Dash Punctuation | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
544
 
6.1%
544
 
6.1%
483
 
5.4%
453
 
5.1%
177
 
2.0%
173
 
1.9%
157
 
1.8%
137
 
1.5%
135
 
1.5%
129
 
1.4%
Other values (551) 6015
67.2%
Uppercase Letter
ValueCountFrequency (%)
A 13
24.1%
B 6
11.1%
S 4
 
7.4%
E 4
 
7.4%
N 4
 
7.4%
O 3
 
5.6%
L 3
 
5.6%
J 3
 
5.6%
K 2
 
3.7%
T 2
 
3.7%
Other values (7) 10
18.5%
Lowercase Letter
ValueCountFrequency (%)
n 9
17.6%
o 7
13.7%
g 7
13.7%
d 5
9.8%
a 4
7.8%
e 4
7.8%
i 4
7.8%
l 2
 
3.9%
s 2
 
3.9%
k 2
 
3.9%
Other values (5) 5
9.8%
Decimal Number
ValueCountFrequency (%)
2 20
23.8%
1 16
19.0%
3 11
13.1%
5 9
10.7%
6 8
 
9.5%
4 5
 
6.0%
8 5
 
6.0%
9 4
 
4.8%
7 3
 
3.6%
0 3
 
3.6%
Letter Number
ValueCountFrequency (%)
8
47.1%
7
41.2%
1
 
5.9%
1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
. 3
50.0%
& 1
 
16.7%
, 1
 
16.7%
' 1
 
16.7%
Space Separator
ValueCountFrequency (%)
536
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8947 | 92.2%
Common | 638 | 6.6%
Latin | 122 | 1.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
544
 
6.1%
544
 
6.1%
483
 
5.4%
453
 
5.1%
177
 
2.0%
173
 
1.9%
157
 
1.8%
137
 
1.5%
135
 
1.5%
129
 
1.4%
Other values (551) 6015
67.2%
Latin
ValueCountFrequency (%)
A 13
 
10.7%
n 9
 
7.4%
8
 
6.6%
o 7
 
5.7%
g 7
 
5.7%
7
 
5.7%
B 6
 
4.9%
d 5
 
4.1%
S 4
 
3.3%
E 4
 
3.3%
Other values (26) 52
42.6%
Common
ValueCountFrequency (%)
536
84.0%
2 20
 
3.1%
1 16
 
2.5%
3 11
 
1.7%
5 9
 
1.4%
6 8
 
1.3%
4 5
 
0.8%
8 5
 
0.8%
( 4
 
0.6%
) 4
 
0.6%
Other values (8) 20
 
3.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8947 | 92.2%
ASCII | 743 | 7.7%
Number Forms | 17 | 0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
544
 
6.1%
544
 
6.1%
483
 
5.4%
453
 
5.1%
177
 
2.0%
173
 
1.9%
157
 
1.8%
137
 
1.5%
135
 
1.5%
129
 
1.4%
Other values (551) 6015
67.2%
ASCII
ValueCountFrequency (%)
536
72.1%
2 20
 
2.7%
1 16
 
2.2%
A 13
 
1.7%
3 11
 
1.5%
5 9
 
1.2%
n 9
 
1.2%
6 8
 
1.1%
o 7
 
0.9%
g 7
 
0.9%
Other values (40) 107
 
14.4%
Number Forms
ValueCountFrequency (%)
8
47.1%
7
41.2%
1
 
5.9%
1
 
5.9%

주소 (address)
Text

Distinct: 1815
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 14.5 KiB

Length

Max length: 32
Median length: 25
Mean length: 18.458696
Min length: 7

Characters and Unicode

Total characters: 33964
Distinct characters: 329
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1791
Unique (%): 97.3%

Sample

1st row: 천안시 동남구 광덕면 해수길 348
2nd row: 천안시 동남구 목천읍 덕전1길 121
3rd row: 천안시 동남구 목천읍 삼방로 778-22
4th row: 천안시 동남구 성남면 약수로 15
5th row: 천안시 동남구 광덕면 죽계2길 30

Value | Count | Frequency (%)
서천군 | 87 | 2.3%
천안시 | 54 | 1.4%
당진시 | 50 | 1.3%
연무읍 | 47 | 1.2%
동남구 | 47 | 1.2%
서면 | 41 | 1.1%
석문면 | 37 | 1.0%
웅천읍 | 37 | 1.0%
송악면 | 35 | 0.9%
팔봉면 | 33 | 0.9%
Other values (2296) | 3345 | 87.7%

Most occurring characters

ValueCountFrequency (%)
1 2367
 
7.0%
2159
 
6.4%
- 1813
 
5.3%
1639
 
4.8%
2 1604
 
4.7%
3 1310
 
3.9%
1141
 
3.4%
4 1122
 
3.3%
1090
 
3.2%
( 1029
 
3.0%
Other values (319) 18690
55.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16304 | 48.0%
Decimal Number | 11537 | 34.0%
Space Separator | 2159 | 6.4%
Dash Punctuation | 1813 | 5.3%
Open Punctuation | 1029 | 3.0%
Close Punctuation | 1026 | 3.0%
Other Punctuation | 83 | 0.2%
Uppercase Letter | 13 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1639
 
10.1%
1141
 
7.0%
1090
 
6.7%
796
 
4.9%
711
 
4.4%
603
 
3.7%
384
 
2.4%
309
 
1.9%
299
 
1.8%
298
 
1.8%
Other values (298) 9034
55.4%
Decimal Number
ValueCountFrequency (%)
1 2367
20.5%
2 1604
13.9%
3 1310
11.4%
4 1122
9.7%
5 975
8.5%
7 933
 
8.1%
6 918
 
8.0%
9 799
 
6.9%
0 779
 
6.8%
8 730
 
6.3%
Other Punctuation
ValueCountFrequency (%)
, 64
77.1%
/ 15
 
18.1%
. 3
 
3.6%
? 1
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
B 7
53.8%
A 4
30.8%
C 2
 
15.4%
Space Separator
ValueCountFrequency (%)
2159
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1813
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1029
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1026
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 17647 | 52.0%
Hangul | 16304 | 48.0%
Latin | 13 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1639
 
10.1%
1141
 
7.0%
1090
 
6.7%
796
 
4.9%
711
 
4.4%
603
 
3.7%
384
 
2.4%
309
 
1.9%
299
 
1.8%
298
 
1.8%
Other values (298) 9034
55.4%
Common
ValueCountFrequency (%)
1 2367
13.4%
2159
12.2%
- 1813
10.3%
2 1604
9.1%
3 1310
 
7.4%
4 1122
 
6.4%
( 1029
 
5.8%
) 1026
 
5.8%
5 975
 
5.5%
7 933
 
5.3%
Other values (8) 3309
18.8%
Latin
ValueCountFrequency (%)
B 7
53.8%
A 4
30.8%
C 2
 
15.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17660 | 52.0%
Hangul | 16304 | 48.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2367
13.4%
2159
12.2%
- 1813
10.3%
2 1604
9.1%
3 1310
 
7.4%
4 1122
 
6.4%
( 1029
 
5.8%
) 1026
 
5.8%
5 975
 
5.5%
7 933
 
5.3%
Other values (11) 3322
18.8%
Hangul
ValueCountFrequency (%)
1639
 
10.1%
1141
 
7.0%
1090
 
6.7%
796
 
4.9%
711
 
4.4%
603
 
3.7%
384
 
2.4%
309
 
1.9%
299
 
1.8%
298
 
1.8%
Other values (298) 9034
55.4%

Interactions


Correlations

 | 연번 | 시군명
연번 | 1.000 | 0.942
시군명 | 0.942 | 1.000

 | 연번 | 시군명
연번 | 1.000 | 0.718
시군명 | 0.718 | 1.000
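
The report does not say which coefficient sits behind these matrices. One common measure of association between two categorical variables is Cramér's V, and a numeric column such as 연번 can be binned before applying it. The sketch below is an uncorrected Cramér's V on toy data, offered as an illustration under that assumption rather than as ydata-profiling's method. It shows why a file sorted by region produces a strong association between serial number and region.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    n = len(xs)
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data (not from the dataset): rows are sorted by region, so the
# serial-number bins coincide exactly with the regions.
regions = ["A"] * 4 + ["B"] * 4
serial_bins = ["low"] * 4 + ["high"] * 4
print(cramers_v(serial_bins, regions))  # 1.0
```

In this dataset the sample rows show the same pattern: the first serial numbers all belong to 천안시 and the last ones to 태안군, so the correlation mainly reflects how the file is ordered.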

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 시군명 | 민박명 | 주소
0 | 1 | 천안시 | 갈재산장 | 천안시 동남구 광덕면 해수길 348
1 | 2 | 천안시 | 하얀집 | 천안시 동남구 목천읍 덕전1길 121
2 | 3 | 천안시 | 여울목펜션 | 천안시 동남구 목천읍 삼방로 778-22
3 | 4 | 천안시 | 숲속의쉼터 | 천안시 동남구 성남면 약수로 15
4 | 5 | 천안시 | 동천골 | 천안시 동남구 광덕면 죽계2길 30
5 | 6 | 천안시 | 산울림펜션 | 천안시 동남구 목천읍 덕전1길 171
6 | 7 | 천안시 | 유왕별서 | 천안시 동남구 목천읍 유왕골1길 181
7 | 8 | 천안시 | 천안휴펜션 | 천안시 동남구 광덕면 휴암1길 17
8 | 9 | 천안시 | 은퇴농장 | 천안시 동남구 병천면 봉항4길 120
9 | 10 | 천안시 | 우리농원 | 천안시 동남구 병천면 봉항4길 97

Last rows

(index) | 연번 | 시군명 | 민박명 | 주소
1830 | 1831 | 태안군 | 산해로 | 안면읍샛별길165-27(신야리781-14)
1831 | 1832 | 태안군 | 티파니펜션 | 안면읍꽃지2길41-7(승언리174-1)
1832 | 1833 | 태안군 | 꽃지바다스토리 | 안면읍꽃지2길58-2(승언리211)
1833 | 1834 | 태안군 | 메리엘펜션 | 안면읍안면대로2986-4(승언리1297-6)
1834 | 1835 | 태안군 | 바다정원 | 안면읍해안관광로58(승언리339-295)
1835 | 1836 | 태안군 | 일번지민박 | 원북면신두해변길201-61(신두리357-18)
1836 | 1837 | 태안군 | 해룡펜션 | 소원면만리포3길3-13(의항리979-241)
1837 | 1838 | 태안군 | 로그하우스 | 안면읍황도로523(황도리8-34)
1838 | 1839 | 태안군 | 대명 | 안면읍삼봉길221(창기리1303-9)
1839 | 1840 | 태안군 | 모모네민박 | 원북면옥파로1010-1(황촌리800-45)