Overview

Dataset statistics

Number of variables: 4
Number of observations: 216
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 32.6 B

Variable types

Categorical: 2
Text: 2

Dataset

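Description: Accommodation facilities in Gimcheon-si, Gyeongsangbuk-do. Provides the facility name and address for tourist accommodation businesses, rural guesthouses, lodging businesses, and urban lodging businesses for foreign tourists.
Author: Gimcheon-si, Gyeongsangbuk-do (경상북도 김천시)
URL: https://www.data.go.kr/data/3033907/fileData.do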

Alerts

데이터기준일 has constant value "2023-10-04" (Constant)
시설종류 is highly imbalanced (60.2%) (Imbalance)
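
The 60.2% imbalance figure is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (it is inferred from the numbers in this report, not quoted from the ydata-profiling source) and uses the seven 시설종류 counts reported for that variable. The function name `imbalance_score` is illustrative.

```python
import math

# Category counts for 시설종류, copied from this report (7 categories, 216 rows).
counts = [151, 57, 4, 1, 1, 1, 1]


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single-class column is reported as 0 here.
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts if n > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0, so a value around 60% reflects the fact that two categories account for 208 of the 216 rows.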

Reproduction

Analysis started: 2023-12-12 03:49:49.277338
Analysis finished: 2023-12-12 03:49:49.759163
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
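
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-12 03:49:49.277338")
finished = datetime.fromisoformat("2023-12-12 03:49:49.759163")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```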

Variables

시설종류
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

숙박업(일반): 151
농어촌민박: 57
숙박업(생활): 4
관광숙박업: 1
외국인관광 도시민박업: 1
Other values (2): 2

Length

Max length: 11
Median length: 7
Mean length: 6.4722222
Min length: 5
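
The length statistics can be recomputed from the category counts reported for this variable. The sketch below assumes that length means the number of Unicode code points, which is what Python's `len` returns for these precomposed Hangul strings.

```python
from statistics import mean, median

# Category counts for 시설종류, copied from this report.
counts = {
    "숙박업(일반)": 151,
    "농어촌민박": 57,
    "숙박업(생활)": 4,
    "관광숙박업": 1,
    "외국인관광 도시민박업": 1,
    "외국인도시민박": 1,
    "한옥체험업": 1,
}

# Expand the counts into one length per row (216 rows in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 7))
print("Min length:", min(lengths))
```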

Unique

Unique: 4
Unique (%): 1.9%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 151 | 69.9%
농어촌민박 | 57 | 26.4%
숙박업(생활) | 4 | 1.9%
관광숙박업 | 1 | 0.5%
외국인관광 도시민박업 | 1 | 0.5%
외국인도시민박 | 1 | 0.5%
한옥체험업 | 1 | 0.5%
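
The frequency column, and the Distinct and Unique figures for this variable, all follow from the category counts. A minimal sketch, assuming "distinct" means the number of different values and "unique" means values that occur exactly once:

```python
# Category counts for 시설종류, copied from this report.
counts = {
    "숙박업(일반)": 151,
    "농어촌민박": 57,
    "숙박업(생활)": 4,
    "관광숙박업": 1,
    "외국인관광 도시민박업": 1,
    "외국인도시민박": 1,
    "한옥체험업": 1,
}
n_rows = sum(counts.values())

# Relative frequency of each category, formatted as in the report.
frequencies = {value: f"{n / n_rows:.1%}" for value, n in counts.items()}
for value, n in counts.items():
    print(value, n, frequencies[value])

distinct = len(counts)                               # different values
unique = sum(1 for n in counts.values() if n == 1)   # values seen exactly once
print(f"Distinct: {distinct} ({distinct / n_rows:.1%})")
print(f"Unique: {unique} ({unique / n_rows:.1%})")
```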

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value | Count | Frequency (%)
숙박업(일반 | 151 | 69.6%
농어촌민박 | 57 | 26.3%
숙박업(생활 | 4 | 1.8%
관광숙박업 | 1 | 0.5%
외국인관광 | 1 | 0.5%
도시민박업 | 1 | 0.5%
외국인도시민박 | 1 | 0.5%
한옥체험업 | 1 | 0.5%
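
The word-level table for 시설종류 differs slightly from the value-level one: the two-word category is split in two, and the closing parenthesis is missing from entries such as `숙박업(일반`. That pattern is consistent with splitting on whitespace and then stripping punctuation from both ends of each token. The sketch below reproduces the counts under that assumption; it is not taken from the profiling tool's source.

```python
import string
from collections import Counter

# Category counts for 시설종류, copied from this report.
counts = {
    "숙박업(일반)": 151,
    "농어촌민박": 57,
    "숙박업(생활)": 4,
    "관광숙박업": 1,
    "외국인관광 도시민박업": 1,
    "외국인도시민박": 1,
    "한옥체험업": 1,
}

words = Counter()
for value, n in counts.items():
    for token in value.lower().split():
        # Strip ASCII punctuation from the ends only; an inner "(" is kept.
        token = token.strip(string.punctuation)
        if token:
            words[token] += n

total = sum(words.values())
for token, n in words.most_common():
    print(token, n, f"{n / total:.1%}")
```

Because one row contributes two tokens, the denominator becomes 217 rather than 216, which explains why the top share drops from 69.9% to 69.6%.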

시설명
Text

Distinct: 213
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 19
Median length: 13
Mean length: 5.1851852
Min length: 2

Characters and Unicode

Total characters: 1120
Distinct characters: 274
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
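
To illustrate what "character properties" means here, the sketch below looks up the Unicode general category of a few characters that appear in this report, using Python's standard `unicodedata` module. The two-letter codes map to the category names used in the tables: `Lo` is Other Letter, `Ll` Lowercase Letter, `Lu` Uppercase Letter, `Nd` Decimal Number, `Zs` Space Separator, `Ps` Open Punctuation, `Pe` Close Punctuation, `Sm` Math Symbol, and `Pd` Dash Punctuation.

```python
import unicodedata
from collections import Counter

# Sample 시설명 values copied from this report.
names = ["영남여관", "흥남장여관", "반도하숙", "구미여인숙", "명동파크"]

# General category of every character in the sample names.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)
print(categories)

# A few non-Hangul characters that the report lists for this column.
for ch in ["s", "B", "1", " ", "(", ")", "+", "-"]:
    print(repr(ch), unicodedata.category(ch))
```

Hangul syllables are caseless letters, which is why they fall under Other Letter rather than Lowercase or Uppercase Letter.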

Unique

Unique: 210
Unique (%): 97.2%

Sample

1st row: 영남여관
2nd row: 흥남장여관
3rd row: 반도하숙
4th row: 구미여인숙
5th row: 명동파크

Value | Count | Frequency (%)
모텔 | 5 | 2.1%
민박 | 3 | 1.3%
브라운도트 | 2 | 0.8%
house | 2 | 0.8%
(blank) | 2 | 0.8%
호텔로제니아 | 2 | 0.8%
황토민박 | 2 | 0.8%
김천파크관광호텔 | 1 | 0.4%
뉴비앤비모텔 | 1 | 0.4%
사명대사공원건강문화원 | 1 | 0.4%
Other values (218) | 218 | 91.2%

Most occurring characters

ValueCountFrequency (%)
106
 
9.5%
94
 
8.4%
31
 
2.8%
29
 
2.6%
29
 
2.6%
29
 
2.6%
28
 
2.5%
25
 
2.2%
23
 
2.1%
19
 
1.7%
Other values (264) 707
63.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1035 | 92.4%
Lowercase Letter | 27 | 2.4%
Space Separator | 23 | 2.1%
Decimal Number | 16 | 1.4%
Uppercase Letter | 11 | 1.0%
Close Punctuation | 3 | 0.3%
Open Punctuation | 3 | 0.3%
Math Symbol | 1 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
106
 
10.2%
94
 
9.1%
31
 
3.0%
29
 
2.8%
29
 
2.8%
29
 
2.8%
28
 
2.7%
25
 
2.4%
19
 
1.8%
14
 
1.4%
Other values (233) 631
61.0%
Lowercase Letter
ValueCountFrequency (%)
s 5
18.5%
e 4
14.8%
u 4
14.8%
o 4
14.8%
t 2
 
7.4%
h 2
 
7.4%
i 1
 
3.7%
q 1
 
3.7%
a 1
 
3.7%
m 1
 
3.7%
Other values (2) 2
 
7.4%
Uppercase Letter
ValueCountFrequency (%)
B 2
18.2%
E 2
18.2%
J 1
9.1%
G 1
9.1%
S 1
9.1%
X 1
9.1%
A 1
9.1%
H 1
9.1%
C 1
9.1%
Decimal Number
ValueCountFrequency (%)
1 8
50.0%
3 2
 
12.5%
2 2
 
12.5%
4 2
 
12.5%
7 2
 
12.5%
Space Separator
ValueCountFrequency (%)
23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1035 | 92.4%
Common | 47 | 4.2%
Latin | 38 | 3.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
106
 
10.2%
94
 
9.1%
31
 
3.0%
29
 
2.8%
29
 
2.8%
29
 
2.8%
28
 
2.7%
25
 
2.4%
19
 
1.8%
14
 
1.4%
Other values (233) 631
61.0%
Latin
ValueCountFrequency (%)
s 5
13.2%
e 4
 
10.5%
u 4
 
10.5%
o 4
 
10.5%
B 2
 
5.3%
t 2
 
5.3%
E 2
 
5.3%
h 2
 
5.3%
J 1
 
2.6%
i 1
 
2.6%
Other values (11) 11
28.9%
Common
ValueCountFrequency (%)
23
48.9%
1 8
 
17.0%
) 3
 
6.4%
( 3
 
6.4%
3 2
 
4.3%
2 2
 
4.3%
4 2
 
4.3%
7 2
 
4.3%
+ 1
 
2.1%
- 1
 
2.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1035 | 92.4%
ASCII | 85 | 7.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
106
 
10.2%
94
 
9.1%
31
 
3.0%
29
 
2.8%
29
 
2.8%
29
 
2.8%
28
 
2.7%
25
 
2.4%
19
 
1.8%
14
 
1.4%
Other values (233) 631
61.0%
ASCII
ValueCountFrequency (%)
23
27.1%
1 8
 
9.4%
s 5
 
5.9%
e 4
 
4.7%
u 4
 
4.7%
o 4
 
4.7%
) 3
 
3.5%
( 3
 
3.5%
3 2
 
2.4%
2 2
 
2.4%
Other values (21) 27
31.8%

주소
Text

Distinct: 211
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 34
Median length: 28
Mean length: 22.185185
Min length: 9

Characters and Unicode

Total characters: 4792
Distinct characters: 145
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 206
Unique (%): 95.4%

Sample

1st row: 경상북도 김천시 평화길 112-3 (평화동)
2nd row: 경상북도 김천시 평화중앙6길 17 (평화동)
3rd row: 경상북도 김천시 평화중앙7길 37 (평화동)
4th row: 경상북도 김천시 평화중앙2길 17-3 (평화동)
5th row: 경상북도 김천시 부곡시장2길 27 (부곡동)

Value | Count | Frequency (%)
경상북도 | 214 | 19.9%
김천시 | 214 | 19.9%
평화동 | 31 | 2.9%
대항면 | 30 | 2.8%
부곡동 | 27 | 2.5%
남면 | 26 | 2.4%
증산면 | 20 | 1.9%
수도길 | 14 | 1.3%
감호동 | 14 | 1.3%
아포읍 | 12 | 1.1%
Other values (285) | 475 | 44.1%

Most occurring characters

ValueCountFrequency (%)
861
18.0%
228
 
4.8%
224
 
4.7%
223
 
4.7%
220
 
4.6%
217
 
4.5%
216
 
4.5%
214
 
4.5%
1 153
 
3.2%
151
 
3.2%
Other values (135) 2085
43.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2898 | 60.5%
Space Separator | 861 | 18.0%
Decimal Number | 735 | 15.3%
Close Punctuation | 98 | 2.0%
Open Punctuation | 98 | 2.0%
Dash Punctuation | 79 | 1.6%
Uppercase Letter | 14 | 0.3%
Other Punctuation | 5 | 0.1%
Math Symbol | 4 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
228
 
7.9%
224
 
7.7%
223
 
7.7%
220
 
7.6%
217
 
7.5%
216
 
7.5%
214
 
7.4%
151
 
5.2%
104
 
3.6%
104
 
3.6%
Other values (112) 997
34.4%
Decimal Number
ValueCountFrequency (%)
1 153
20.8%
2 102
13.9%
3 90
12.2%
7 65
8.8%
4 61
 
8.3%
6 57
 
7.8%
8 57
 
7.8%
0 55
 
7.5%
5 49
 
6.7%
9 46
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
E 4
28.6%
C 2
14.3%
A 2
14.3%
P 2
14.3%
L 2
14.3%
R 2
14.3%
Math Symbol
ValueCountFrequency (%)
> 2
50.0%
< 2
50.0%
Space Separator
ValueCountFrequency (%)
861
100.0%
Close Punctuation
ValueCountFrequency (%)
) 98
100.0%
Open Punctuation
ValueCountFrequency (%)
( 98
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 79
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2898 | 60.5%
Common | 1880 | 39.2%
Latin | 14 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
228
 
7.9%
224
 
7.7%
223
 
7.7%
220
 
7.6%
217
 
7.5%
216
 
7.5%
214
 
7.4%
151
 
5.2%
104
 
3.6%
104
 
3.6%
Other values (112) 997
34.4%
Common
ValueCountFrequency (%)
861
45.8%
1 153
 
8.1%
2 102
 
5.4%
) 98
 
5.2%
( 98
 
5.2%
3 90
 
4.8%
- 79
 
4.2%
7 65
 
3.5%
4 61
 
3.2%
6 57
 
3.0%
Other values (7) 216
 
11.5%
Latin
ValueCountFrequency (%)
E 4
28.6%
C 2
14.3%
A 2
14.3%
P 2
14.3%
L 2
14.3%
R 2
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2898 | 60.5%
ASCII | 1894 | 39.5%
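
The block split can be approximated by code point range. The sketch below is an illustration, not the profiling tool's implementation: it treats U+0000 to U+007F as ASCII and the Hangul Syllables block (U+AC00 to U+D7AF) as "Hangul", and applies that to the first sample address from this report. The helper name `block_of` is hypothetical.

```python
from collections import Counter


def block_of(ch):
    """Classify a character into a coarse Unicode block by code point range."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:  # Hangul Syllables block
        return "Hangul"
    return "Other"


# First sample 주소 value, copied from this report.
address = "경상북도 김천시 평화길 112-3 (평화동)"
blocks = Counter(block_of(ch) for ch in address)
print(blocks)
```

Spaces, digits, hyphens, and parentheses all land in ASCII, which is why that block accounts for such a large share of an otherwise Korean-language column.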

Most frequent character per block

ASCII
ValueCountFrequency (%)
861
45.5%
1 153
 
8.1%
2 102
 
5.4%
) 98
 
5.2%
( 98
 
5.2%
3 90
 
4.8%
- 79
 
4.2%
7 65
 
3.4%
4 61
 
3.2%
6 57
 
3.0%
Other values (13) 230
 
12.1%
Hangul
ValueCountFrequency (%)
228
 
7.9%
224
 
7.7%
223
 
7.7%
220
 
7.6%
217
 
7.5%
216
 
7.5%
214
 
7.4%
151
 
5.2%
104
 
3.6%
104
 
3.6%
Other values (112) 997
34.4%

데이터기준일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

2023-10-04: 216

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-10-04
2nd row: 2023-10-04
3rd row: 2023-10-04
4th row: 2023-10-04
5th row: 2023-10-04

Common Values

Value | Count | Frequency (%)
2023-10-04 | 216 | 100.0%
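
Although 데이터기준일 is typed as Categorical, every value is a 10-character ISO 8601 date. The sketch below shows one way to confirm that the column is constant and to parse the single value as a date; the list of values is reconstructed from the count above rather than read from the source file.

```python
from datetime import date

# All 216 rows carry the same value, per the Common Values table.
values = ["2023-10-04"] * 216

distinct_values = set(values)
is_constant = len(distinct_values) == 1

# Parse the single value as a calendar date rather than keeping it as text.
reference_date = date.fromisoformat(values[0])

print(is_constant, reference_date.isoformat())
print(f"Distinct (%): {len(distinct_values) / len(values):.1%}")
```

A constant column carries no information for comparing rows, so it is usually kept as dataset-level metadata instead of a per-row feature.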

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value | Count | Frequency (%)
2023-10-04 | 216 | 100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
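
The nullity plots summarize a per-column count of missing cells. A minimal sketch of that count over the first three sample rows of this report, assuming that only `None` counts as missing; the helper name `is_missing` is hypothetical.

```python
# Column names and the first three sample rows, copied from this report.
columns = ["시설종류", "시설명", "주소", "데이터기준일"]
rows = [
    ("숙박업(일반)", "영남여관", "경상북도 김천시 평화길 112-3 (평화동)", "2023-10-04"),
    ("숙박업(일반)", "흥남장여관", "경상북도 김천시 평화중앙6길 17 (평화동)", "2023-10-04"),
    ("숙박업(일반)", "반도하숙", "경상북도 김천시 평화중앙7길 37 (평화동)", "2023-10-04"),
]


def is_missing(cell):
    # Assumption: only None is treated as a missing cell.
    return cell is None


missing_per_column = {
    name: sum(is_missing(row[i]) for row in rows) for i, name in enumerate(columns)
}
total_cells = len(rows) * len(columns)
missing_cells = sum(missing_per_column.values())

print(missing_per_column)
print(f"Missing cells (%): {missing_cells / total_cells:.1%}")
```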

Sample

Index | 시설종류 | 시설명 | 주소 | 데이터기준일
0 | 숙박업(일반) | 영남여관 | 경상북도 김천시 평화길 112-3 (평화동) | 2023-10-04
1 | 숙박업(일반) | 흥남장여관 | 경상북도 김천시 평화중앙6길 17 (평화동) | 2023-10-04
2 | 숙박업(일반) | 반도하숙 | 경상북도 김천시 평화중앙7길 37 (평화동) | 2023-10-04
3 | 숙박업(일반) | 구미여인숙 | 경상북도 김천시 평화중앙2길 17-3 (평화동) | 2023-10-04
4 | 숙박업(일반) | 명동파크 | 경상북도 김천시 부곡시장2길 27 (부곡동) | 2023-10-04
5 | 숙박업(일반) | 청운장 | 경상북도 김천시 아랫장터2길 67 (감호동) | 2023-10-04
6 | 숙박업(일반) | 서림장여관 | 경상북도 김천시 자래밭골길 15 (황금동) | 2023-10-04
7 | 숙박업(일반) | 동원여인숙 | 경상북도 김천시 평화시장2길 5 (평화동) | 2023-10-04
8 | 숙박업(일반) | 월드모텔 | 경상북도 김천시 아랫장터길 76 (용두동) | 2023-10-04
9 | 숙박업(일반) | 하이파크장 | 경상북도 김천시 부곡시장1길 38 (부곡동) | 2023-10-04

Index | 시설종류 | 시설명 | 주소 | 데이터기준일
206 | 농어촌민박 | 삼도봉민박 | 경상북도 김천시 부항면 대야길 545 | 2023-10-04
207 | 농어촌민박 | 숲실산방 | 경상북도 김천시 부항면 파천2길 470 | 2023-10-04
208 | 농어촌민박 | 여관식민박 | 경상북도 김천시 남면 오봉로 492-1 | 2023-10-04
209 | 농어촌민박 | 수도산민박 | 경상북도 김천시 증산면 수도길 983 | 2023-10-04
210 | 농어촌민박 | 상록수민박 | 경상북도 김천시 증산면 수도길 208 | 2023-10-04
211 | 농어촌민박 | 대창민박 | 경상북도 김천시 증산면 수도길 1175 | 2023-10-04
212 | 농어촌민박 | 넝쿨장미집 | 경상북도 김천시 증산면 수도길 139 | 2023-10-04
213 | 농어촌민박 | 단지봉민박 | 경상북도 김천시 증산면 수도길 55 | 2023-10-04
214 | 농어촌민박 | 돌집민박 | 경상북도 김천시 대항면 북암길 95 | 2023-10-04
215 | 농어촌민박 | 운수민박 | 경상북도 김천시 대항면 북암길 87 | 2023-10-04
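
The sample rows show two address shapes: urban addresses end with the district in parentheses, such as (평화동), while rural addresses carry the district (면 or 읍) as the third token. The sketch below extracts a district name under those two assumptions. The helper `district` is hypothetical and is not part of the dataset or the profiling tool; an address that fits neither shape would return its third token, which may be a road name.

```python
import re

# Matches a trailing parenthesized group such as "(평화동)".
DISTRICT_IN_PARENS = re.compile(r"\(([^()]+)\)\s*$")


def district(address):
    """Return the district from a 주소 value, or None if it cannot be found."""
    match = DISTRICT_IN_PARENS.search(address)
    if match:
        # Keep only the part before a comma, in case extra detail follows.
        return match.group(1).split(",")[0].strip()
    parts = address.split()
    # Assumption: province, city, then district as the third token.
    return parts[2] if len(parts) > 2 else None


# Sample addresses copied from this report.
samples = [
    "경상북도 김천시 평화길 112-3 (평화동)",
    "경상북도 김천시 부항면 대야길 545",
    "경상북도 김천시 남면 오봉로 492-1",
]
for address in samples:
    print(address, "->", district(address))
```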