Overview

Dataset statistics

Number of variables: 4
Number of observations: 3364
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 0.1%
Total size in memory: 108.5 KiB
Average record size in memory: 33.0 B
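
The percentages and the average record size follow directly from the raw counts. The sketch below recomputes them, assuming that 1 KiB is 1024 bytes and that the report rounds to one decimal place; the variable and function names are illustrative and do not come from the profiling tool.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 4
n_observations = 3364
missing_cells = 0
duplicate_rows = 2
total_size_kib = 108.5  # assumption: 1 KiB = 1024 bytes


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(f"Missing cells (%): {missing_pct}%")
print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record_bytes} B")
```

Two duplicate rows out of 3364 is about 0.06%, which is why the rounded figure reads 0.1%.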

Variable types

Categorical: 1
Text: 2
Numeric: 1

Dataset

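Description: Data on rural guesthouses (농어촌민박) in Jeollanam-do as of August 2023, providing the city/county name, address, guesthouse name, and number of rooms.
Author: Jeollanam-do (전라남도)
URL: https://www.data.go.kr/data/15074651/fileData.do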

Alerts

Dataset has 2 (0.1%) duplicate rows (Duplicates)
객실수 has 126 (3.7%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 03:11:44.359313
Analysis finished: 2023-12-12 03:11:45.229734
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Categorical

Distinct: 22
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 26.4 KiB
Value | Count
여수시 | 646
신안군 | 349
완도군 | 276
담양군 | 257
구례군 | 246
Other values (17) | 1590

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 목포시
2nd row: 목포시
3rd row: 목포시
4th row: 목포시
5th row: 목포시

Common Values

Value | Count | Frequency (%)
여수시 | 646 | 19.2%
신안군 | 349 | 10.4%
완도군 | 276 | 8.2%
담양군 | 257 | 7.6%
구례군 | 246 | 7.3%
순천시 | 205 | 6.1%
광양시 | 200 | 5.9%
고흥군 | 175 | 5.2%
강진군 | 140 | 4.2%
해남군 | 119 | 3.5%
Other values (12) | 751 | 22.3%
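
A table of this shape can be built by counting values, keeping the most frequent ones, and folding the remainder into a single row. The following is a minimal sketch of that idea, not the profiling tool's own implementation; `common_values` and the demo input are hypothetical. The last lines cross-check the aggregated row against the counts reported above.

```python
from collections import Counter


def common_values(values, top=10):
    """Return (label, count, percent) rows for the `top` most frequent values,
    plus one aggregated "Other values (k)" row when anything is left over."""
    counts = Counter(values)
    total = sum(counts.values())
    head = counts.most_common(top)
    rows = [(label, n, round(100 * n / total, 1)) for label, n in head]
    rest = total - sum(n for _, n in head)
    if rest:
        label = f"Other values ({len(counts) - len(head)})"
        rows.append((label, rest, round(100 * rest / total, 1)))
    return rows


# Tiny illustrative input, not the real column.
demo_rows = common_values(["여수시"] * 3 + ["신안군"] * 2 + ["완도군"], top=2)

# Cross-check of the aggregated row using the counts reported above.
reported_top10 = [646, 349, 276, 257, 246, 205, 200, 175, 140, 119]
other_count = 3364 - sum(reported_top10)
other_pct = round(100 * other_count / 3364, 1)
```

With the reported counts, the ten listed values sum to 2613, leaving 751 observations (22.3%) spread over the remaining 12 of the 22 distinct values.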

Length

[Figure: Histogram of lengths of the category]

주소 (address)
Text

Distinct: 3288
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 26.4 KiB

Length

Max length: 39
Median length: 35
Mean length: 21.893876
Min length: 17

Characters and Unicode

Total characters: 73651
Distinct characters: 424
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
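
As an illustration of how such character properties can be read, the sketch below counts the Unicode general category of each character in the first sample address from this report, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in the tables: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, `Pd` is Dash Punctuation, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample address from this report.
address = "전라남도 목포시 달리길 484-5 (달동)"
counts = category_counts(address)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter because the script has no upper or lower case, which is why that category dominates the tables for this variable.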

Unique

Unique: 3219
Unique (%): 95.7%

Sample

1st row: 전라남도 목포시 달리길 484-5 (달동)
2nd row: 전라남도 목포시 달리길 497(달동)
3rd row: 전라남도 목포시 외달도길 21-11 (달동)
4th row: 전라남도 목포시 외달도길 21-22 (달동)
5th row: 전라남도 목포시 외달도길 21-24 (달동)

Value | Count | Frequency (%)
전라남도 | 3364 | 19.8%
여수시 | 646 | 3.8%
신안군 | 349 | 2.1%
완도군 | 276 | 1.6%
담양군 | 257 | 1.5%
구례군 | 246 | 1.4%
돌산읍 | 240 | 1.4%
순천시 | 205 | 1.2%
광양시 | 200 | 1.2%
고흥군 | 175 | 1.0%
Other values (3705) | 11046 | 65.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 13640 | 18.5%
 | 4289 | 5.8%
 | 3727 | 5.1%
 | 3520 | 4.8%
 | 3409 | 4.6%
 | 2685 | 3.6%
1 | 2478 | 3.4%
 | 2361 | 3.2%
 | 2314 | 3.1%
- | 1705 | 2.3%
Other values (414) | 33523 | 45.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46451 | 63.1%
Space Separator | 13640 | 18.5%
Decimal Number | 11503 | 15.6%
Dash Punctuation | 1705 | 2.3%
Other Punctuation | 177 | 0.2%
Close Punctuation | 82 | 0.1%
Open Punctuation | 82 | 0.1%
Uppercase Letter | 11 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 4289 | 9.2%
 | 3727 | 8.0%
 | 3520 | 7.6%
 | 3409 | 7.3%
 | 2685 | 5.8%
 | 2361 | 5.1%
 | 2314 | 5.0%
 | 1151 | 2.5%
 | 1099 | 2.4%
 | 1016 | 2.2%
Other values (389) | 20880 | 45.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2478 | 21.5%
2 | 1518 | 13.2%
3 | 1347 | 11.7%
4 | 1077 | 9.4%
5 | 1035 | 9.0%
7 | 948 | 8.2%
6 | 915 | 8.0%
8 | 768 | 6.7%
9 | 719 | 6.3%
0 | 698 | 6.1%

Uppercase Letter
Value | Count | Frequency (%)
E | 2 | 18.2%
P | 1 | 9.1%
M | 1 | 9.1%
C | 1 | 9.1%
A | 1 | 9.1%
D | 1 | 9.1%
F | 1 | 9.1%
S | 1 | 9.1%
G | 1 | 9.1%
B | 1 | 9.1%

Space Separator
Value | Count | Frequency (%)
(space) | 13640 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1705 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 177 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 82 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 82 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 46451 | 63.1%
Common | 27189 | 36.9%
Latin | 11 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 4289 | 9.2%
 | 3727 | 8.0%
 | 3520 | 7.6%
 | 3409 | 7.3%
 | 2685 | 5.8%
 | 2361 | 5.1%
 | 2314 | 5.0%
 | 1151 | 2.5%
 | 1099 | 2.4%
 | 1016 | 2.2%
Other values (389) | 20880 | 45.0%

Common
Value | Count | Frequency (%)
(space) | 13640 | 50.2%
1 | 2478 | 9.1%
- | 1705 | 6.3%
2 | 1518 | 5.6%
3 | 1347 | 5.0%
4 | 1077 | 4.0%
5 | 1035 | 3.8%
7 | 948 | 3.5%
6 | 915 | 3.4%
8 | 768 | 2.8%
Other values (5) | 1758 | 6.5%

Latin
Value | Count | Frequency (%)
E | 2 | 18.2%
P | 1 | 9.1%
M | 1 | 9.1%
C | 1 | 9.1%
A | 1 | 9.1%
D | 1 | 9.1%
F | 1 | 9.1%
S | 1 | 9.1%
G | 1 | 9.1%
B | 1 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 46451 | 63.1%
ASCII | 27200 | 36.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 13640 | 50.1%
1 | 2478 | 9.1%
- | 1705 | 6.3%
2 | 1518 | 5.6%
3 | 1347 | 5.0%
4 | 1077 | 4.0%
5 | 1035 | 3.8%
7 | 948 | 3.5%
6 | 915 | 3.4%
8 | 768 | 2.8%
Other values (15) | 1769 | 6.5%

Hangul
Value | Count | Frequency (%)
 | 4289 | 9.2%
 | 3727 | 8.0%
 | 3520 | 7.6%
 | 3409 | 7.3%
 | 2685 | 5.8%
 | 2361 | 5.1%
 | 2314 | 5.0%
 | 1151 | 2.5%
 | 1099 | 2.4%
 | 1016 | 2.2%
Other values (389) | 20880 | 45.0%

민박명 (guesthouse name)
Text

Distinct: 3139
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Memory size: 26.4 KiB

Length

Max length: 22
Median length: 18
Mean length: 5.2856718
Min length: 1

Characters and Unicode

Total characters: 17781
Distinct characters: 710
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2989
Unique (%): 88.9%

Sample

1st row: 달리도 민박
2nd row: 회자촌
3rd row: 달성민박
4th row: 현대민박
5th row: 영자네민박

Value | Count | Frequency (%)
민박 | 142 | 3.5%
펜션 | 66 | 1.6%
여수 | 22 | 0.5%
풀빌라 | 19 | 0.5%
한옥민박 | 17 | 0.4%
스테이 | 16 | 0.4%
 | 15 | 0.4%
바다민박 | 11 | 0.3%
한옥 | 8 | 0.2%
하우스 | 7 | 0.2%
Other values (3289) | 3732 | 92.0%

Most occurring characters

Value | Count | Frequency (%)
 | 1488 | 8.4%
 | 1484 | 8.3%
(space) | 694 | 3.9%
 | 484 | 2.7%
 | 455 | 2.6%
 | 278 | 1.6%
 | 243 | 1.4%
 | 228 | 1.3%
 | 205 | 1.2%
 | 191 | 1.1%
Other values (700) | 12031 | 67.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16403 | 92.3%
Space Separator | 694 | 3.9%
Uppercase Letter | 195 | 1.1%
Lowercase Letter | 193 | 1.1%
Decimal Number | 167 | 0.9%
Open Punctuation | 41 | 0.2%
Close Punctuation | 41 | 0.2%
Other Punctuation | 36 | 0.2%
Dash Punctuation | 8 | < 0.1%
Letter Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1488 | 9.1%
 | 1484 | 9.0%
 | 484 | 3.0%
 | 455 | 2.8%
 | 278 | 1.7%
 | 243 | 1.5%
 | 228 | 1.4%
 | 205 | 1.2%
 | 191 | 1.2%
 | 189 | 1.2%
Other values (630) | 11158 | 68.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 26 | 13.3%
O | 19 | 9.7%
H | 15 | 7.7%
S | 15 | 7.7%
B | 14 | 7.2%
T | 14 | 7.2%
Y | 11 | 5.6%
D | 11 | 5.6%
E | 10 | 5.1%
L | 8 | 4.1%
Other values (14) | 52 | 26.7%

Lowercase Letter
Value | Count | Frequency (%)
s | 27 | 14.0%
e | 24 | 12.4%
o | 23 | 11.9%
a | 19 | 9.8%
y | 12 | 6.2%
h | 11 | 5.7%
n | 10 | 5.2%
u | 10 | 5.2%
t | 9 | 4.7%
l | 8 | 4.1%
Other values (10) | 40 | 20.7%

Decimal Number
Value | Count | Frequency (%)
1 | 46 | 27.5%
2 | 32 | 19.2%
3 | 18 | 10.8%
0 | 15 | 9.0%
5 | 13 | 7.8%
7 | 13 | 7.8%
9 | 10 | 6.0%
4 | 10 | 6.0%
8 | 7 | 4.2%
6 | 3 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
& | 13 | 36.1%
. | 9 | 25.0%
, | 6 | 16.7%
' | 4 | 11.1%
· | 1 | 2.8%
! | 1 | 2.8%
# | 1 | 2.8%
 | 1 | 2.8%

Open Punctuation
Value | Count | Frequency (%)
( | 39 | 95.1%
[ | 2 | 4.9%

Close Punctuation
Value | Count | Frequency (%)
) | 39 | 95.1%
] | 2 | 4.9%

Letter Number
Value | Count | Frequency (%)
 | 2 | 66.7%
 | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 694 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 16388 | 92.2%
Common | 987 | 5.6%
Latin | 391 | 2.2%
Han | 15 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1488 | 9.1%
 | 1484 | 9.1%
 | 484 | 3.0%
 | 455 | 2.8%
 | 278 | 1.7%
 | 243 | 1.5%
 | 228 | 1.4%
 | 205 | 1.3%
 | 191 | 1.2%
 | 189 | 1.2%
Other values (617) | 11143 | 68.0%

Latin
Value | Count | Frequency (%)
s | 27 | 6.9%
A | 26 | 6.6%
e | 24 | 6.1%
o | 23 | 5.9%
a | 19 | 4.9%
O | 19 | 4.9%
H | 15 | 3.8%
S | 15 | 3.8%
B | 14 | 3.6%
T | 14 | 3.6%
Other values (36) | 195 | 49.9%

Common
Value | Count | Frequency (%)
(space) | 694 | 70.3%
1 | 46 | 4.7%
( | 39 | 4.0%
) | 39 | 4.0%
2 | 32 | 3.2%
3 | 18 | 1.8%
0 | 15 | 1.5%
& | 13 | 1.3%
5 | 13 | 1.3%
7 | 13 | 1.3%
Other values (14) | 65 | 6.6%

Han
Value | Count | Frequency (%)
 | 3 | 20.0%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
Other values (3) | 3 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 16387 | 92.2%
ASCII | 1373 | 7.7%
CJK | 15 | 0.1%
Number Forms | 3 | < 0.1%
None | 2 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1488 | 9.1%
 | 1484 | 9.1%
 | 484 | 3.0%
 | 455 | 2.8%
 | 278 | 1.7%
 | 243 | 1.5%
 | 228 | 1.4%
 | 205 | 1.3%
 | 191 | 1.2%
 | 189 | 1.2%
Other values (616) | 11142 | 68.0%

ASCII
Value | Count | Frequency (%)
(space) | 694 | 50.5%
1 | 46 | 3.4%
( | 39 | 2.8%
) | 39 | 2.8%
2 | 32 | 2.3%
s | 27 | 2.0%
A | 26 | 1.9%
e | 24 | 1.7%
o | 23 | 1.7%
a | 19 | 1.4%
Other values (56) | 404 | 29.4%

CJK
Value | Count | Frequency (%)
 | 3 | 20.0%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
 | 1 | 6.7%
Other values (3) | 3 | 20.0%

Number Forms
Value | Count | Frequency (%)
 | 2 | 66.7%
 | 1 | 33.3%

None
Value | Count | Frequency (%)
· | 1 | 50.0%
 | 1 | 50.0%

Compat Jamo
Value | Count | Frequency (%)
 | 1 | 100.0%

객실수 (number of rooms)
Real number (ℝ)

ZEROS 

Distinct: 15
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9586801
Minimum: 0
Maximum: 44
Zeros: 126
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 29.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 44
Range: 44
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9563522
Coefficient of variation (CV): 0.66122463
Kurtosis: 59.126755
Mean: 2.9586801
Median Absolute Deviation (MAD): 1
Skewness: 3.5938841
Sum: 9953
Variance: 3.8273139
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=15)]

Value | Count | Frequency (%)
2 | 937 | 27.9%
3 | 627 | 18.6%
1 | 582 | 17.3%
4 | 466 | 13.9%
5 | 285 | 8.5%
6 | 178 | 5.3%
0 | 126 | 3.7%
7 | 119 | 3.5%
8 | 31 | 0.9%
9 | 4 | 0.1%
Other values (5) | 9 | 0.3%

Value | Count | Frequency (%)
0 | 126 | 3.7%
1 | 582 | 17.3%
2 | 937 | 27.9%
3 | 627 | 18.6%
4 | 466 | 13.9%
5 | 285 | 8.5%
6 | 178 | 5.3%
7 | 119 | 3.5%
8 | 31 | 0.9%
9 | 4 | 0.1%

Value | Count | Frequency (%)
44 | 1 | < 0.1%
20 | 1 | < 0.1%
13 | 2 | 0.1%
12 | 1 | < 0.1%
10 | 4 | 0.1%
9 | 4 | 0.1%
8 | 31 | 0.9%
7 | 119 | 3.5%
6 | 178 | 5.3%
5 | 285 | 8.5%
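
Because the value tables for 객실수 between them list all 15 distinct values with their counts, the summary statistics can be recomputed from the counts alone. The sketch below does this with Python's `statistics` module. It assumes the variance uses the sample (n - 1) denominator and that quartiles use linear interpolation over the inclusive range; both assumptions reproduce the reported figures, but they are inferred from the numbers rather than stated in the report.

```python
import statistics

# Value counts for 객실수 as listed in this report (value -> count).
value_counts = {0: 126, 1: 582, 2: 937, 3: 627, 4: 466, 5: 285, 6: 178,
                7: 119, 8: 31, 9: 4, 10: 4, 12: 1, 13: 2, 20: 1, 44: 1}

# Expand the counts back into a sorted list of observations.
rooms = [value for value, count in sorted(value_counts.items())
         for _ in range(count)]

n = len(rooms)
total = sum(rooms)
mean = statistics.mean(rooms)
variance = statistics.variance(rooms)  # sample variance (n - 1 denominator)
std = statistics.stdev(rooms)
cv = std / mean                        # coefficient of variation
q1, median, q3 = statistics.quantiles(rooms, n=4, method="inclusive")
iqr = q3 - q1
mad = statistics.median(abs(x - median) for x in rooms)
zeros_pct = round(100 * value_counts[0] / n, 1)

print(f"n={n} sum={total} mean={mean:.7f} variance={variance:.7f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr} MAD={mad} zeros={zeros_pct}%")
```

The large gap between the 95th percentile (6) and the maximum (44) comes from a handful of records with ten or more rooms, which also explains the high skewness and kurtosis.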

Interactions


Correlations

 | 시군명 | 객실수
시군명 | 1.000 | 0.276
객실수 | 0.276 | 1.000

 | 객실수 | 시군명
객실수 | 1.000 | 0.140
시군명 | 0.140 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

 | 시군명 | 주소 | 민박명 | 객실수
0 | 목포시 | 전라남도 목포시 달리길 484-5 (달동) | 달리도 민박 | 1
1 | 목포시 | 전라남도 목포시 달리길 497(달동) | 회자촌 | 1
2 | 목포시 | 전라남도 목포시 외달도길 21-11 (달동) | 달성민박 | 3
3 | 목포시 | 전라남도 목포시 외달도길 21-22 (달동) | 현대민박 | 2
4 | 목포시 | 전라남도 목포시 외달도길 21-24 (달동) | 영자네민박 | 3
5 | 목포시 | 전라남도 목포시 외달도길 33 (달동) | 오복민박 | 2
6 | 여수시 | 전라남도 여수시 남면 건너물길 2 | 다모아민박 | 1
7 | 여수시 | 전라남도 여수시 남면 금오로 1139 | 써니아일랜드민박 | 3
8 | 여수시 | 전라남도 여수시 남면 금오로 1339-45 | 포시즌 펜션 | 6
9 | 여수시 | 전라남도 여수시 남면 금오로 392-10 | 해돋이펜션민박 | 2

 | 시군명 | 주소 | 민박명 | 객실수
3354 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 1291-1 | 심촌 | 3
3355 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 1477 | 해솔 | 1
3356 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 1498 | 모래미 | 4
3357 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 154-6 | 스텔라 | 3
3358 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 201-4 | 바닷가 | 5
3359 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 21-2 | 우리민박 | 4
3360 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 23-2 | 게스트민박 | 4
3361 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 31-1 | 신영 | 4
3362 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 39-39 | 큰바다 | 2
3363 | 신안군 | 전라남도 신안군 흑산면 흑산일주로 45-6 | 해당화 | 4

Duplicate rows

Most frequently occurring

 | 시군명 | 주소 | 민박명 | 객실수 | # duplicates
0 | 신안군 | 전라남도 신안군 비금면 비금북부길 781-2 | 명가한옥 | 2 | 2
1 | 여수시 | 전라남도 여수시 화양면 참새미길 134 | 수 민박 | 2 | 2
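
Duplicate rows like these can be found by counting whole records. The sketch below is one simple way to do it with the standard library, not the profiling tool's own method; `duplicate_rows` is a hypothetical helper, and the input reuses one duplicated record and one unique record from this report, with the field split read from the flattened text.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative records: one duplicated row and one unique row.
records = [
    ("신안군", "전라남도 신안군 비금면 비금북부길 781-2", "명가한옥", 2),
    ("신안군", "전라남도 신안군 비금면 비금북부길 781-2", "명가한옥", 2),
    ("목포시", "전라남도 목포시 달리길 497(달동)", "회자촌", 1),
]

dupes = duplicate_rows(records)
surplus = sum(n - 1 for n in dupes.values())  # rows beyond the first copy

for row, n in dupes.items():
    print(row, n)
```

In this dataset both duplicated records occur twice, so there are two distinct duplicated records and two surplus rows; either reading matches the reported duplicate count of 2.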