Overview

Dataset statistics

Number of variables: 5
Number of observations: 6548
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 412
Duplicate rows (%): 6.3%
Total size in memory: 255.9 KiB
Average record size in memory: 40.0 B
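
The derived figures in this table follow from the raw counts. A minimal Python sketch of the arithmetic, assuming 1 KiB = 1024 bytes and that the duplicate percentage uses the number of observations as its denominator (the variable names are illustrative, not taken from the report):

```python
# Values copied from the Dataset statistics table.
n_rows = 6548
n_duplicates = 412
total_size_kib = 255.9

# Share of duplicate rows in percent (assumes the row count is the denominator).
duplicate_pct = n_duplicates / n_rows * 100

# Average bytes per record (assumes 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 6.3%
print(f"{avg_record_bytes:.1f} B")  # 40.0 B
```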

Variable types

Categorical: 3
Text: 2

Dataset

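Description: The number of rural (farming and fishing village) guesthouses (minbak) operating in Gangwon Special Self-Governing Province (강원특별자치도), classified by city/county, with the guesthouse name, guesthouse location, and normal-operation status additionally recorded.
URL: https://www.data.go.kr/data/3045496/fileData.do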

Alerts

시도명 has constant value "강원특별자치도" (Constant)
Dataset has 412 (6.3%) duplicate rows (Duplicates)
영업상태 is highly imbalanced (99.6%) (Imbalance)
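
The 99.6% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. That definition is an assumption here, since the report itself does not state the formula. A minimal sketch using the 영업상태 counts listed in the Variables section (정상 = 6546, 휴업 = 2); the function name is hypothetical:

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0  # convention chosen for this sketch: a single class scores 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts for 영업상태 as reported: 정상 = 6546, 휴업 = 2.
score = imbalance_score([6546, 2])
print(f"{score:.1%}")  # 99.6%
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as a single class dominates.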

Reproduction

Analysis started: 2023-12-12 08:57:47.991101
Analysis finished: 2023-12-12 08:57:49.138619
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
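
The reported duration can be recovered from the two timestamps. A small sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 08:57:47.991101")
finished = datetime.fromisoformat("2023-12-12 08:57:49.138619")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 1.15 seconds
```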

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB
강원특별자치도: 6548

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원특별자치도
2nd row: 강원특별자치도
3rd row: 강원특별자치도
4th row: 강원특별자치도
5th row: 강원특별자치도

Common Values

Value | Count | Frequency (%)
강원특별자치도 | 6548 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 강원특별자치도, 6548 (100.0%)]

시군구명
Categorical

Distinct: 18
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB
강릉시: 825
삼척시: 810
고성군: 628
평창군: 556
홍천군: 533
Other values (13): 3196

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 춘천시
2nd row: 춘천시
3rd row: 춘천시
4th row: 춘천시
5th row: 춘천시

Common Values

Value | Count | Frequency (%)
강릉시 | 825 | 12.6%
삼척시 | 810 | 12.4%
고성군 | 628 | 9.6%
평창군 | 556 | 8.5%
홍천군 | 533 | 8.1%
인제군 | 521 | 8.0%
양양군 | 480 | 7.3%
춘천시 | 476 | 7.3%
영월군 | 370 | 5.7%
정선군 | 301 | 4.6%
Other values (8) | 1048 | 16.0%
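
The frequency column and the "Other values" row can be cross-checked against the total of 6548 observations. A short sketch, assuming each percentage is the count divided by the number of rows, rounded to one decimal place:

```python
n_rows = 6548

# Top ten 시군구명 counts copied from the Common Values table.
top10 = {
    "강릉시": 825, "삼척시": 810, "고성군": 628, "평창군": 556, "홍천군": 533,
    "인제군": 521, "양양군": 480, "춘천시": 476, "영월군": 370, "정선군": 301,
}

# Rows not covered by the top ten fall into "Other values".
other = n_rows - sum(top10.values())

# Percentage of all rows for each listed value.
percentages = {name: round(count / n_rows * 100, 1) for name, count in top10.items()}

print(other, round(other / n_rows * 100, 1))  # 1048 16.0
print(percentages["강릉시"])                   # 12.6
```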

Length

[Figure: histogram of lengths of the category]

업소명
Text

Distinct: 5694
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB

Length

Max length: 25
Median length: 20
Mean length: 5.3819487
Min length: 1

Characters and Unicode

Total characters: 35241
Distinct characters: 855
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
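
As an illustration of how such character properties can be read, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. The five strings are the sample values listed for this variable; using them as a stand-in for the whole column is an assumption made only to keep the example small.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
names = ["푸른마트 4층", "미래하우스", "팔도민박", "연화펜션", "정 팜스테이"]

# unicodedata.category returns a two-letter code per character:
# "Lo" = Other Letter, "Zs" = Space Separator, "Nd" = Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

print(category_counts["Lo"], category_counts["Zs"], category_counts["Nd"])  # 23 2 1
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow.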

Unique

Unique: 5065
Unique (%): 77.4%

Sample

1st row: 푸른마트 4층
2nd row: 미래하우스
3rd row: 팔도민박
4th row: 연화펜션
5th row: 정 팜스테이

Value | Count | Frequency (%)
민박 | 305 | 3.9%
펜션 | 165 | 2.1%
하우스 | 29 | 0.4%
스테이 | 21 | 0.3%
고향민박 | 16 | 0.2%
숲속의 | 14 | 0.2%
(illegible) | 14 | 0.2%
하얀집 | 14 | 0.2%
해변민박 | 14 | 0.2%
(illegible) | 13 | 0.2%
Other values (5808) | 7168 | 92.2%

Most occurring characters

ValueCountFrequency (%)
1995
 
5.7%
1989
 
5.6%
1672
 
4.7%
1532
 
4.3%
1449
 
4.1%
780
 
2.2%
655
 
1.9%
652
 
1.9%
532
 
1.5%
419
 
1.2%
Other values (845) 23566
66.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 31855 | 90.4%
Space Separator | 1672 | 4.7%
Decimal Number | 606 | 1.7%
Uppercase Letter | 445 | 1.3%
Lowercase Letter | 437 | 1.2%
Close Punctuation | 65 | 0.2%
Open Punctuation | 65 | 0.2%
Other Punctuation | 60 | 0.2%
Letter Number | 21 | 0.1%
Dash Punctuation | 13 | < 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1995
 
6.3%
1989
 
6.2%
1532
 
4.8%
1449
 
4.5%
780
 
2.4%
655
 
2.1%
652
 
2.0%
532
 
1.7%
419
 
1.3%
410
 
1.3%
Other values (769) 21442
67.3%
Uppercase Letter
ValueCountFrequency (%)
A 54
 
12.1%
S 45
 
10.1%
B 45
 
10.1%
O 29
 
6.5%
H 28
 
6.3%
T 24
 
5.4%
E 23
 
5.2%
Y 18
 
4.0%
C 16
 
3.6%
U 16
 
3.6%
Other values (14) 147
33.0%
Lowercase Letter
ValueCountFrequency (%)
e 58
13.3%
o 52
11.9%
a 48
11.0%
s 40
 
9.2%
i 25
 
5.7%
l 22
 
5.0%
n 21
 
4.8%
p 20
 
4.6%
h 19
 
4.3%
t 19
 
4.3%
Other values (13) 113
25.9%
Decimal Number
ValueCountFrequency (%)
2 145
23.9%
1 121
20.0%
0 69
11.4%
3 56
 
9.2%
5 43
 
7.1%
8 43
 
7.1%
7 35
 
5.8%
4 34
 
5.6%
9 30
 
5.0%
6 30
 
5.0%
Other Punctuation
ValueCountFrequency (%)
& 21
35.0%
. 10
16.7%
? 6
 
10.0%
; 5
 
8.3%
, 5
 
8.3%
" 4
 
6.7%
# 4
 
6.7%
' 3
 
5.0%
1
 
1.7%
· 1
 
1.7%
Letter Number
ValueCountFrequency (%)
12
57.1%
7
33.3%
2
 
9.5%
Space Separator
ValueCountFrequency (%)
1672
100.0%
Close Punctuation
ValueCountFrequency (%)
) 65
100.0%
Open Punctuation
ValueCountFrequency (%)
( 65
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 31849 | 90.4%
Common | 2483 | 7.0%
Latin | 903 | 2.6%
Han | 6 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1995
 
6.3%
1989
 
6.2%
1532
 
4.8%
1449
 
4.5%
780
 
2.4%
655
 
2.1%
652
 
2.0%
532
 
1.7%
419
 
1.3%
410
 
1.3%
Other values (765) 21436
67.3%
Latin
ValueCountFrequency (%)
e 58
 
6.4%
A 54
 
6.0%
o 52
 
5.8%
a 48
 
5.3%
S 45
 
5.0%
B 45
 
5.0%
s 40
 
4.4%
O 29
 
3.2%
H 28
 
3.1%
i 25
 
2.8%
Other values (40) 479
53.0%
Common
ValueCountFrequency (%)
1672
67.3%
2 145
 
5.8%
1 121
 
4.9%
0 69
 
2.8%
) 65
 
2.6%
( 65
 
2.6%
3 56
 
2.3%
5 43
 
1.7%
8 43
 
1.7%
7 35
 
1.4%
Other values (16) 169
 
6.8%
Han
ValueCountFrequency (%)
3
50.0%
1
 
16.7%
1
 
16.7%
1
 
16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 31849 | 90.4%
ASCII | 3362 | 9.5%
Number Forms | 21 | 0.1%
CJK | 6 | < 0.1%
None | 2 | < 0.1%
Misc Symbols | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1995
 
6.3%
1989
 
6.2%
1532
 
4.8%
1449
 
4.5%
780
 
2.4%
655
 
2.1%
652
 
2.0%
532
 
1.7%
419
 
1.3%
410
 
1.3%
Other values (765) 21436
67.3%
ASCII
ValueCountFrequency (%)
1672
49.7%
2 145
 
4.3%
1 121
 
3.6%
0 69
 
2.1%
) 65
 
1.9%
( 65
 
1.9%
e 58
 
1.7%
3 56
 
1.7%
A 54
 
1.6%
o 52
 
1.5%
Other values (60) 1005
29.9%
Number Forms
ValueCountFrequency (%)
12
57.1%
7
33.3%
2
 
9.5%
CJK
ValueCountFrequency (%)
3
50.0%
1
 
16.7%
1
 
16.7%
1
 
16.7%
None
ValueCountFrequency (%)
1
50.0%
· 1
50.0%
Misc Symbols
ValueCountFrequency (%)
1
100.0%

소재지도로명주소
Text

Distinct: 6005
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB

Length

Max length: 42
Median length: 39
Mean length: 25.802535
Min length: 20

Characters and Unicode

Total characters: 168955
Distinct characters: 573
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
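
The mean length reported for this variable matches the total character count divided by the number of observations. A brief check, assuming the mean is taken over all 6548 rows:

```python
# Values copied from the report for this text variable.
n_rows = 6548
total_characters = 168955

# Mean string length = total characters / number of rows.
mean_length = total_characters / n_rows
print(round(mean_length, 6))  # 25.802535
```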

Unique

Unique: 5480
Unique (%): 83.7%

Sample

1st row: 강원특별자치도 춘천시 남산면 강촌로 116
2nd row: 강원특별자치도 춘천시 남산면 문의골길 3
3rd row: 강원특별자치도 춘천시 남산면 강촌구곡길 19
4th row: 강원특별자치도 춘천시 남산면 방하로 777, 가동
5th row: 강원특별자치도 춘천시 사북면 원평길 54

Value | Count | Frequency (%)
강원특별자치도 | 6543 | 19.6%
강릉시 | 825 | 2.5%
삼척시 | 810 | 2.4%
고성군 | 628 | 1.9%
평창군 | 556 | 1.7%
홍천군 | 533 | 1.6%
인제군 | 521 | 1.6%
근덕면 | 500 | 1.5%
양양군 | 480 | 1.4%
춘천시 | 476 | 1.4%
Other values (5972) | 21571 | 64.5%

Most occurring characters

ValueCountFrequency (%)
26895
 
15.9%
7959
 
4.7%
7405
 
4.4%
6939
 
4.1%
6667
 
3.9%
6664
 
3.9%
6549
 
3.9%
6543
 
3.9%
5061
 
3.0%
1 4997
 
3.0%
Other values (563) 83276
49.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 113411 | 67.1%
Space Separator | 26895 | 15.9%
Decimal Number | 23921 | 14.2%
Dash Punctuation | 3009 | 1.8%
Other Punctuation | 738 | 0.4%
Close Punctuation | 456 | 0.3%
Open Punctuation | 456 | 0.3%
Uppercase Letter | 46 | < 0.1%
Lowercase Letter | 23 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7959
 
7.0%
7405
 
6.5%
6939
 
6.1%
6667
 
5.9%
6664
 
5.9%
6549
 
5.8%
6543
 
5.8%
5061
 
4.5%
4157
 
3.7%
4126
 
3.6%
Other values (519) 51341
45.3%
Uppercase Letter
ValueCountFrequency (%)
A 9
19.6%
S 6
13.0%
B 6
13.0%
C 3
 
6.5%
Y 3
 
6.5%
R 3
 
6.5%
O 3
 
6.5%
E 2
 
4.3%
G 2
 
4.3%
J 2
 
4.3%
Other values (5) 7
15.2%
Lowercase Letter
ValueCountFrequency (%)
o 3
13.0%
l 3
13.0%
u 3
13.0%
e 2
8.7%
p 2
8.7%
x 2
8.7%
d 2
8.7%
a 1
 
4.3%
b 1
 
4.3%
t 1
 
4.3%
Other values (3) 3
13.0%
Decimal Number
ValueCountFrequency (%)
1 4997
20.9%
2 3494
14.6%
3 2517
10.5%
4 2271
9.5%
5 2068
8.6%
6 1983
 
8.3%
7 1822
 
7.6%
0 1637
 
6.8%
8 1627
 
6.8%
9 1505
 
6.3%
Other Punctuation
ValueCountFrequency (%)
, 734
99.5%
. 4
 
0.5%
Space Separator
ValueCountFrequency (%)
26895
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3009
100.0%
Close Punctuation
ValueCountFrequency (%)
) 456
100.0%
Open Punctuation
ValueCountFrequency (%)
( 456
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 113410 | 67.1%
Common | 55475 | 32.8%
Latin | 69 | < 0.1%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7959
 
7.0%
7405
 
6.5%
6939
 
6.1%
6667
 
5.9%
6664
 
5.9%
6549
 
5.8%
6543
 
5.8%
5061
 
4.5%
4157
 
3.7%
4126
 
3.6%
Other values (518) 51340
45.3%
Latin
ValueCountFrequency (%)
A 9
 
13.0%
S 6
 
8.7%
B 6
 
8.7%
o 3
 
4.3%
C 3
 
4.3%
l 3
 
4.3%
Y 3
 
4.3%
R 3
 
4.3%
O 3
 
4.3%
u 3
 
4.3%
Other values (18) 27
39.1%
Common
ValueCountFrequency (%)
26895
48.5%
1 4997
 
9.0%
2 3494
 
6.3%
- 3009
 
5.4%
3 2517
 
4.5%
4 2271
 
4.1%
5 2068
 
3.7%
6 1983
 
3.6%
7 1822
 
3.3%
0 1637
 
3.0%
Other values (6) 4782
 
8.6%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 113410 | 67.1%
ASCII | 55544 | 32.9%
CJK | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26895
48.4%
1 4997
 
9.0%
2 3494
 
6.3%
- 3009
 
5.4%
3 2517
 
4.5%
4 2271
 
4.1%
5 2068
 
3.7%
6 1983
 
3.6%
7 1822
 
3.3%
0 1637
 
2.9%
Other values (34) 4851
 
8.7%
Hangul
ValueCountFrequency (%)
7959
 
7.0%
7405
 
6.5%
6939
 
6.1%
6667
 
5.9%
6664
 
5.9%
6549
 
5.8%
6543
 
5.8%
5061
 
4.5%
4157
 
3.7%
4126
 
3.6%
Other values (518) 51340
45.3%
CJK
ValueCountFrequency (%)
1
100.0%

영업상태
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB
정상: 6546
휴업: 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상
2nd row: 정상
3rd row: 정상
4th row: 정상
5th row: 정상

Common Values

Value | Count | Frequency (%)
정상 | 6546 | > 99.9%
휴업 | 2 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 정상, 6546 (> 99.9%); 휴업, 2 (< 0.1%)]

Correlations

[Figure: correlation heatmap]
Variable | 시군구명 | 영업상태
시군구명 | 1.000 | 0.000
영업상태 | 0.000 | 1.000
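
The matrix reports an association of 0.000 between 시군구명 and 영업상태. For two categorical columns, one widely used measure is Cramér's V. Whether the report used exactly this statistic, or a bias-corrected variant of it, is an assumption. The sketch below implements the plain (uncorrected) form, and the contingency tables are hypothetical toy counts rather than the dataset's own.

```python
import math

def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical tables: rows and columns are category levels of two variables.
independent = [[10, 20], [20, 40]]  # proportions identical across rows
perfect = [[10, 0], [0, 10]]        # each row maps to exactly one column

print(cramers_v(independent))  # 0.0
print(cramers_v(perfect))      # 1.0
```

With only 2 of 6548 rows in the minority class of 영업상태, there is very little variation for any association measure to pick up, which fits the near-zero value.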

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 시도명 | 시군구명 | 업소명 | 소재지도로명주소 | 영업상태
0 | 강원특별자치도 | 춘천시 | 푸른마트 4층 | 강원특별자치도 춘천시 남산면 강촌로 116 | 정상
1 | 강원특별자치도 | 춘천시 | 미래하우스 | 강원특별자치도 춘천시 남산면 문의골길 3 | 정상
2 | 강원특별자치도 | 춘천시 | 팔도민박 | 강원특별자치도 춘천시 남산면 강촌구곡길 19 | 정상
3 | 강원특별자치도 | 춘천시 | 연화펜션 | 강원특별자치도 춘천시 남산면 방하로 777, 가동 | 정상
4 | 강원특별자치도 | 춘천시 | 정 팜스테이 | 강원특별자치도 춘천시 사북면 원평길 54 | 정상
5 | 강원특별자치도 | 춘천시 | 연화민박 | 강원특별자치도 춘천시 남산면 방하로 777, 나동 | 정상
6 | 강원특별자치도 | 춘천시 | 오월당 | 강원특별자치도 춘천시 서면 납실길 214 | 정상
7 | 강원특별자치도 | 춘천시 | 갓골가든 | 강원특별자치도 춘천시 사북면 화악지암길 765 | 정상
8 | 강원특별자치도 | 춘천시 | 삼팔선의봄 | 강원특별자치도 춘천시 사북면 말고개길 14-13 | 정상
9 | 강원특별자치도 | 춘천시 | 강가에서민박 | 강원특별자치도 춘천시 사북면 춘화로 372 | 정상

Last rows

Index | 시도명 | 시군구명 | 업소명 | 소재지도로명주소 | 영업상태
6538 | 강원특별자치도 | 양양군 | 마나긔 하우스 | 강원특별자치도 양양군 강현면 회룡1길 161-2 | 정상
6539 | 강원특별자치도 | 양양군 | 고꼴집 | 강원특별자치도 양양군 현남면 화상해안길 287 | 정상
6540 | 강원특별자치도 | 양양군 | 송이로1010 | 강원특별자치도 양양군 현북면 송이로 1010 | 정상
6541 | 강원특별자치도 | 양양군 | 물치항펜션 | 강원특별자치도 양양군 강현면 동해대로 3595-1 | 정상
6542 | 강원특별자치도 | 양양군 | 랑스 | 강원특별자치도 양양군 양양읍 동해신묘길 16-2 | 정상
6543 | 강원특별자치도 | 양양군 | 법수치리 120 | 강원특별자치도 양양군 현북면 법수치길 913 | 정상
6544 | 강원특별자치도 | 양양군 | 화이트펜션 | 강원특별자치도 양양군 현남면 동해대로 952 | 정상
6545 | 강원특별자치도 | 양양군 | 증바우 민박 | 강원특별자치도 양양군 강현면 정암2길 55 | 정상
6546 | 강원특별자치도 | 양양군 | 사인삼각 | 강원특별자치도 양양군 현남면 황태골로 69-18 | 정상
6547 | 강원특별자치도 | 양양군 | 남애 영희와 윤아저씨 | 강원특별자치도 양양군 현남면 미륭마을길 1-7 | 정상

Duplicate rows

Most frequently occurring

Index | 시도명 | 시군구명 | 업소명 | 소재지도로명주소 | 영업상태 | # duplicates
0 | 강원특별자치도 | 고성군 | 까사텔아야 | 강원특별자치도 고성군 토성면 아야진해변길 19 | 정상 | 2
1 | 강원특별자치도 | 고성군 | 수성민박 | 강원특별자치도 고성군 거진읍 반암길 10, 수성민박 | 정상 | 2
2 | 강원특별자치도 | 고성군 | 예쁜연못집 민박 | 강원특별자치도 고성군 간성읍 소똥령마을길 48, 예쁜연못집 | 정상 | 2
3 | 강원특별자치도 | 고성군 | 전원민박 | 강원특별자치도 고성군 죽왕면 삼포민박촌2길 31, 전원가든 | 정상 | 2
4 | 강원특별자치도 | 고성군 | 천진슈퍼 민박 | 강원특별자치도 고성군 토성면 토성로 153 | 정상 | 2
5 | 강원특별자치도 | 삼척시 | #끝집민박 | 강원특별자치도 삼척시 원덕읍 갈남길 69-26, 골든민박 | 정상 | 2
6 | 강원특별자치도 | 삼척시 | 103HOUSE | 강원특별자치도 삼척시 원덕읍 임원항구로 15-103, 모텔하우스 | 정상 | 2
7 | 강원특별자치도 | 삼척시 | 1박2일 | 강원특별자치도 삼척시 가곡면 덕풍길 45-141 | 정상 | 2
8 | 강원특별자치도 | 삼척시 | 2087펜션 | 강원특별자치도 삼척시 근덕면 삼척로 2087 | 정상 | 2
9 | 강원특별자치도 | 삼척시 | FoRest(숲&휴식) | 강원특별자치도 삼척시 하장면 중봉당골길 297 | 정상 | 2
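
Duplicate groups like those listed here can be found by counting identical rows. The sketch below uses only the standard library. The three-row list is a hypothetical excerpt assembled from rows shown in this report, and the report does not state whether its headline figure of 412 counts duplicate groups or surplus copies, so both are computed.

```python
from collections import Counter

# Hypothetical excerpt: one record appearing twice, one appearing once.
rows = [
    ("강원특별자치도", "고성군", "까사텔아야", "강원특별자치도 고성군 토성면 아야진해변길 19", "정상"),
    ("강원특별자치도", "고성군", "까사텔아야", "강원특별자치도 고성군 토성면 아야진해변길 19", "정상"),
    ("강원특별자치도", "춘천시", "미래하우스", "강원특별자치도 춘천시 남산면 문의골길 3", "정상"),
]

# Count how often each full row (all five columns) occurs.
counts = Counter(rows)

# Keep only rows that occur more than once; the value plays the role of "# duplicates".
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

# Surplus copies: rows that would be dropped if one copy per group were kept.
extra_copies = sum(n - 1 for n in duplicate_groups.values())

print(len(duplicate_groups), extra_copies)  # 1 1
```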