Overview

Dataset statistics

Number of variables: 6
Number of observations: 41
Missing cells: 36
Missing cells (%): 14.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 51.2 B
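
The missing-cell percentage can be checked by hand. The sketch below only re-derives the figures printed in this report; it does not read the dataset, and the variable names are illustrative.

```python
# Figures copied from the report; nothing here reads the actual dataset.
n_variables = 6
n_observations = 41
missing_cells = 36

total_cells = n_variables * n_observations        # 246 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of all cells

# The alerts attribute all 36 missing cells to a single column, so that
# column's own rate is measured against the 41 observations only.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{missing_pct:.1f}%")         # 14.6%
print(f"{column_missing_pct:.1f}%")  # 87.8%
```

The two results match the 14.6% reported here and the 87.8% reported for the first column.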

Variable types

Text: 3
Unsupported: 2
Categorical: 1

Dataset

Description: 업종별관광숙박업등록현황20146 (registered tourist accommodation businesses by business type, June 2014)
Author: 전라북도 (Jeollabuk-do)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202319

Alerts

Unnamed: 5 is highly imbalanced (75.3%) [Imbalance]
관광숙박업등록현황(2014년6월) has 36 (87.8%) missing values [Missing]
Unnamed: 1 has unique values [Unique]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-03-14 02:39:39.227684
Analysis finished: 2024-03-14 02:39:39.609328
Duration: 0.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관광숙박업등록현황(2014년6월)
Text

Distinct: 5
Distinct (%): 100.0%
Missing: 36
Missing (%): 87.8%
Memory size: 460.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.6
Min length: 3
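
These length figures follow directly from the five non-missing values listed in the Sample section for this variable. A minimal sketch, assuming length is counted in Unicode code points (which is what Python's `len` returns for a string):

```python
from statistics import mean, median

# The five non-missing values of this column, as listed in the report's sample.
values = ["업종별", "관광호텔", "가족호텔", "호스텔", "휴양콘도"]

lengths = [len(v) for v in values]  # code points per value

print(max(lengths), median(lengths), mean(lengths), min(lengths))
print(sum(lengths))  # total characters across all values
```

The total of the lengths is also the "Total characters" figure reported for this variable.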

Characters and Unicode

Total characters: 18
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
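
As a rough illustration of those character properties, the standard-library `unicodedata` module exposes the general category of each code point. The sketch below is not the profiler's own implementation; it covers categories only, because the standard library has no script or block lookup. The values are the five listed in this variable's Sample section.

```python
import unicodedata
from collections import Counter

# The five non-missing values of this column, as listed in the report's sample.
values = ["업종별", "관광호텔", "가족호텔", "호스텔", "휴양콘도"]

# Normalise to NFC so every Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", "".join(values))

# General category per character; "Lo" is the code for Other Letter.
category_counts = Counter(unicodedata.category(ch) for ch in text)
char_counts = Counter(text)

print(len(text), len(char_counts))  # total and distinct characters
print(dict(category_counts))
print(char_counts.most_common(2))
```

Every character falls in the single category `Lo`, which agrees with "Distinct categories: 1" for this variable.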

Unique

Unique: 5
Unique (%): 100.0%

Sample

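1st row: 업종별 ("by business type")
2nd row: 관광호텔 ("tourist hotel")
3rd row: 가족호텔 ("family hotel")
4th row: 호스텔 ("hostel")
5th row: 휴양콘도 ("resort condominium")

Value | Count | Frequency (%)
업종별 | 1 | 20.0%
관광호텔 | 1 | 20.0%
가족호텔 | 1 | 20.0%
호스텔 | 1 | 20.0%
휴양콘도 | 1 | 20.0%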

Most occurring characters

Value | Count | Frequency (%)
호 | 3 | 16.7%
텔 | 3 | 16.7%
8 further characters (one occurrence each) | 1 | 5.6%
Other values (4) | 4 | 22.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
호 | 3 | 16.7%
텔 | 3 | 16.7%
8 further characters (one occurrence each) | 1 | 5.6%
Other values (4) | 4 | 22.2%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
호 | 3 | 16.7%
텔 | 3 | 16.7%
8 further characters (one occurrence each) | 1 | 5.6%
Other values (4) | 4 | 22.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
호 | 3 | 16.7%
텔 | 3 | 16.7%
8 further characters (one occurrence each) | 1 | 5.6%
Other values (4) | 4 | 22.2%

Unnamed: 1
Text

UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 12
Median length: 10
Mean length: 7.6341463
Min length: 4

Characters and Unicode

Total characters: 313
Distinct characters: 103
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 100.0%

Sample

1st row: 상 호 명 ("business name")
2nd row: 풍남관광호텔
3rd row: 전주코아호텔
4th row: ㈜호텔르윈
5th row: 전주호텔한성

Value | Count | Frequency (%)
(value lost) | 1 | 2.2%
베스트웨스턴군산호텔 | 1 | 2.2%
내장산관광호텔 | 1 | 2.2%
지리산구룡관광호텔 | 1 | 2.2%
스위트관광호텔 | 1 | 2.2%
대둔산관광호텔 | 1 | 2.2%
호텔티롤 | 1 | 2.2%
선운산관광호텔 | 1 | 2.2%
채석강스타힐스 | 1 | 2.2%
호텔 | 1 | 2.2%
Other values (35) | 35 | 77.8%

Most occurring characters

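Value | Count | Frequency (%)
(character lost) | 33 | 10.5%
(character lost) | 32 | 10.2%
(character lost) | 18 | 5.8%
(character lost) | 18 | 5.8%
(character lost) | 15 | 4.8%
(character lost) | 14 | 4.5%
(character lost) | 13 | 4.2%
(character lost) | 8 | 2.6%
(character lost) | 8 | 2.6%
(character lost) | 5 | 1.6%
Other values (93) | 149 | 47.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 308 | 98.4%
Space Separator | 4 | 1.3%
Other Symbol | 1 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 33 | 10.7%
(character lost) | 32 | 10.4%
(character lost) | 18 | 5.8%
(character lost) | 18 | 5.8%
(character lost) | 15 | 4.9%
(character lost) | 14 | 4.5%
(character lost) | 13 | 4.2%
(character lost) | 8 | 2.6%
(character lost) | 8 | 2.6%
(character lost) | 5 | 1.6%
Other values (91) | 144 | 46.8%
Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%
Other Symbol
Value | Count | Frequency (%)
㈜ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 309 | 98.7%
Common | 4 | 1.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character lost) | 33 | 10.7%
(character lost) | 32 | 10.4%
(character lost) | 18 | 5.8%
(character lost) | 18 | 5.8%
(character lost) | 15 | 4.9%
(character lost) | 14 | 4.5%
(character lost) | 13 | 4.2%
(character lost) | 8 | 2.6%
(character lost) | 8 | 2.6%
(character lost) | 5 | 1.6%
Other values (92) | 145 | 46.9%
Common
Value | Count | Frequency (%)
(space) | 4 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 308 | 98.4%
ASCII | 4 | 1.3%
None | 1 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character lost) | 33 | 10.7%
(character lost) | 32 | 10.4%
(character lost) | 18 | 5.8%
(character lost) | 18 | 5.8%
(character lost) | 15 | 4.9%
(character lost) | 14 | 4.5%
(character lost) | 13 | 4.2%
(character lost) | 8 | 2.6%
(character lost) | 8 | 2.6%
(character lost) | 5 | 1.6%
Other values (91) | 144 | 46.8%
ASCII
Value | Count | Frequency (%)
(space) | 4 | 100.0%
None
Value | Count | Frequency (%)
㈜ | 1 | 100.0%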

Unnamed: 2
Text

Distinct: 39
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 19
Median length: 16
Mean length: 14.02439
Min length: 3

Characters and Unicode

Total characters: 575
Distinct characters: 93
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 92.7%

Sample

1st row: 소재지 ("location")
2nd row: 전주시 완산구 객사2길 45-7
3rd row: 전주시 완산구 팔달로 262-2
4th row: 전주시 완산구 기린대로 85
5th row: 전주시 완산구 전주객사5길 44-5

Value | Count | Frequency (%)
군산시 | 11 | 7.5%
전주시 | 10 | 6.8%
완산구 | 7 | 4.8%
남원시 | 7 | 4.8%
무주군 | 5 | 3.4%
변산면 | 3 | 2.1%
설천면 | 3 | 2.1%
덕진구 | 3 | 2.1%
부안군 | 3 | 2.1%
만선로 | 3 | 2.1%
Other values (82) | 91 | 62.3%

Most occurring characters

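Value | Count | Frequency (%)
(space) | 105 | 18.3%
(character lost) | 31 | 5.4%
(character lost) | 30 | 5.2%
5 | 24 | 4.2%
(character lost) | 21 | 3.7%
1 | 21 | 3.7%
(character lost) | 20 | 3.5%
(character lost) | 20 | 3.5%
(character lost) | 20 | 3.5%
4 | 17 | 3.0%
Other values (83) | 266 | 46.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 349 | 60.7%
Decimal Number | 112 | 19.5%
Space Separator | 105 | 18.3%
Dash Punctuation | 9 | 1.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character lost) | 31 | 8.9%
(character lost) | 30 | 8.6%
(character lost) | 21 | 6.0%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 15 | 4.3%
(character lost) | 13 | 3.7%
(character lost) | 11 | 3.2%
(character lost) | 11 | 3.2%
Other values (71) | 157 | 45.0%
Decimal Number
Value | Count | Frequency (%)
5 | 24 | 21.4%
1 | 21 | 18.8%
4 | 17 | 15.2%
2 | 14 | 12.5%
3 | 9 | 8.0%
7 | 7 | 6.2%
6 | 7 | 6.2%
8 | 6 | 5.4%
0 | 5 | 4.5%
9 | 2 | 1.8%
Space Separator
Value | Count | Frequency (%)
(space) | 105 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 349 | 60.7%
Common | 226 | 39.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character lost) | 31 | 8.9%
(character lost) | 30 | 8.6%
(character lost) | 21 | 6.0%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 15 | 4.3%
(character lost) | 13 | 3.7%
(character lost) | 11 | 3.2%
(character lost) | 11 | 3.2%
Other values (71) | 157 | 45.0%
Common
Value | Count | Frequency (%)
(space) | 105 | 46.5%
5 | 24 | 10.6%
1 | 21 | 9.3%
4 | 17 | 7.5%
2 | 14 | 6.2%
3 | 9 | 4.0%
- | 9 | 4.0%
7 | 7 | 3.1%
6 | 7 | 3.1%
8 | 6 | 2.7%
Other values (2) | 7 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 349 | 60.7%
ASCII | 226 | 39.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 105 | 46.5%
5 | 24 | 10.6%
1 | 21 | 9.3%
4 | 17 | 7.5%
2 | 14 | 6.2%
3 | 9 | 4.0%
- | 9 | 4.0%
7 | 7 | 3.1%
6 | 7 | 3.1%
8 | 6 | 2.7%
Other values (2) | 7 | 3.1%
Hangul
Value | Count | Frequency (%)
(character lost) | 31 | 8.9%
(character lost) | 30 | 8.6%
(character lost) | 21 | 6.0%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 20 | 5.7%
(character lost) | 15 | 4.3%
(character lost) | 13 | 3.7%
(character lost) | 11 | 3.2%
(character lost) | 11 | 3.2%
Other values (71) | 157 | 45.0%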

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Unnamed: 4
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Unnamed: 5
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Value | Count
- | 38
비 고 | 1
(구)전주코아리베라 | 1
(2006.8.18부터~공사중단) | 1

Length

Max length: 18
Median length: 1
Mean length: 1.6829268
Min length: 1

Unique

Unique: 3
Unique (%): 7.3%

Sample

1st row: 비 고 ("remarks")
2nd row: -
3rd row: -
4th row: (구)전주코아리베라 ("formerly 전주코아리베라")
5th row: -

Common Values

Value | Count | Frequency (%)
- | 38 | 92.7%
비 고 | 1 | 2.4%
(구)전주코아리베라 | 1 | 2.4%
(2006.8.18부터~공사중단) | 1 | 2.4%
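
The 75.3% imbalance reported for this variable is consistent with a normalised-entropy score computed from the counts above: one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. That formula is an assumption checked against the reported figure, not something stated in this report, and `imbalance_score` is an illustrative name.

```python
from math import log2

# Value counts copied from the Common Values table.
counts = {"-": 38, "비 고": 1, "(구)전주코아리베라": 1, "(2006.8.18부터~공사중단)": 1}


def imbalance_score(class_counts):
    """Return 1 - H/H_max: 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0  # a single class has no distribution to be imbalanced
    entropy = -sum((c / total) * log2(c / total) for c in class_counts if c > 0)
    return 1 - entropy / log2(n_classes)


score = imbalance_score(list(counts.values()))
print(f"{100 * score:.1f}%")  # 75.3%
```

A perfectly balanced column would score 0%. Here 38 of the 41 rows hold the placeholder `-`, which is what pushes the score up.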

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
- | 38 | 90.5%
비 | 1 | 2.4%
고 | 1 | 2.4%
구)전주코아리베라 | 1 | 2.4%
2006.8.18부터~공사중단 | 1 | 2.4%

Correlations

Variable | 관광숙박업등록현황(2014년6월) | Unnamed: 1 | Unnamed: 2 | Unnamed: 5
관광숙박업등록현황(2014년6월) | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 5 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 관광숙박업등록현황(2014년6월) | Unnamed: 1 | Unnamed: 2, Unnamed: 3, Unnamed: 4, Unnamed: 5 (run together in the source)
0 | 업종별 | 상 호 명 | 소재지등급객실수비 고
1 | 관광호텔 | 풍남관광호텔 | 전주시 완산구 객사2길 45-7263-
2 | <NA> | 전주코아호텔 | 전주시 완산구 팔달로 262-2특2111-
3 | <NA> | ㈜호텔르윈 | 전주시 완산구 기린대로 85특2166(구)전주코아리베라
4 | <NA> | 전주호텔한성 | 전주시 완산구 전주객사5길 44-5240-
5 | <NA> | 째즈어라운드호텔 | 전주시 덕진구 정언신로 182135-
6 | <NA> | 화이트관광호텔 | 전주시 덕진구 전주천동로 501235-
7 | <NA> | 궁관광호텔 | 전주시 덕진구 용산1길 17-4330-
8 | <NA> | 전주한옥태조궁관광호텔 | 전주시 완산구 전라감영로 40330-
9 | <NA> | 전주관광호텔 | 전주시 완산구 객사5길 44-5331-

Index | 관광숙박업등록현황(2014년6월) | Unnamed: 1 | Unnamed: 2, Unnamed: 3, Unnamed: 4, Unnamed: 5 (run together in the source)
31 | <NA> | 무주덕유산리조트국민호텔 | 무주군 설천면 만선로 185-418-
32 | <NA> | 대명리조트변산가족호텔 | 부안군 변산면 변산해변로 51-504-
33 | <NA> | 모항해나루가족호텔 | 부안군 변산면 모항해변길 73-112-
34 | 호스텔 | 소리울호스텔 | 전주시 완산구 팔달로 144-4-17-
35 | 휴양콘도 | 켄싱턴리조트지리산남원 | 남원시 소리길 66-156-
36 | <NA> | 중앙하이츠콘도 | 남원시 장승안길 2-9-153-
37 | <NA> | 지리산토비스콘도 | 남원시 산내면 산내원천길 4-5-60-
38 | <NA> | 일성지리산콘도 | 남원시 산내면 천왕봉로 626-25-167-
39 | <NA> | 일성무주콘도 | 무주군 무풍면 라제통문로 455-121-
40 | <NA> | 무주토비스콘도 | 무주군 무풍면 구천동로 350-106-
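
The sample suggests two cleaning steps before re-profiling: row 0 holds the real column names, and the `<NA>` entries in the first column look like a business type written once per group and left blank on the rows below it. Both readings are assumptions drawn from the sample rows, not statements made by the report. A minimal standard-library sketch, using only the first two columns of the first five rows (`forward_fill` is an illustrative helper):

```python
# First two columns of the first five sample rows; None stands for <NA>.
rows = [
    ("업종별", "상 호 명"),  # row 0: the header row embedded in the data
    ("관광호텔", "풍남관광호텔"),
    (None, "전주코아호텔"),
    (None, "㈜호텔르윈"),
    (None, "전주호텔한성"),
]


def forward_fill(values):
    """Replace each None with the most recent non-None value before it."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Promote row 0 to column names, then fill the business-type column downward.
header, *body = rows
categories = forward_fill([category for category, _ in body])
cleaned = [
    dict(zip(header, (category, name)))
    for category, (_, name) in zip(categories, body)
]

for record in cleaned:
    print(record)
```

With pandas, the same fill is available as `Series.ffill()`. Whether a forward fill is appropriate depends on how the original spreadsheet was laid out.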