Overview

Dataset statistics

Number of variables: 5
Number of observations: 173
Missing cells: 37
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 41.8 B
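
The percentage and per-record figures in these statistics follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (rows times columns) and that the record size is total memory divided by the row count; the variable names are illustrative only:

```python
n_variables = 5
n_observations = 173
missing_cells = 37

# Every row contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")

# 7.1 KiB is itself a rounded figure, so this only approximates
# the reported 41.8 B per record.
approx_record_bytes = 7.1 * 1024 / n_observations
print(f"Approximate record size: {approx_record_bytes:.1f} B")
```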

Variable types

Categorical: 1
Text: 3
Numeric: 1

Dataset

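Description: Data on the current status of lodging establishments in Seosan City (서산시). The fields are 업종명 (business type), 업소명 (business name), 업소소재지 (business address), 소재지전화 (business phone number), and 객실수 (number of rooms).
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=445&beforeMenuCd=DOM_000000201001001000&publicdatapk=15069198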

Alerts

업종명 is highly imbalanced (65.8%) [Imbalance]
소재지전화 has 37 (21.4%) missing values [Missing]
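
Both alert figures can be recomputed from counts given elsewhere in this report. The sketch below assumes the imbalance score is one minus the normalized Shannon entropy of the class counts; that assumption reproduces 65.8% for the 업종명 counts of 162 and 11. The function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    0.0 means a perfectly even split, values near 1.0 mean one class dominates.
    With fewer than two classes the score is defined as 0.0 in this sketch.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

score = imbalance_score([162, 11])    # 숙박업(일반), 숙박업(생활)
missing_share = 37 / 173              # missing 소재지전화 values over all rows

print(f"Imbalance: {score:.1%}")
print(f"Missing: {missing_share:.1%}")
```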

Reproduction

Analysis started: 2024-01-09 21:36:46.049242
Analysis finished: 2024-01-09 21:36:46.465616
Duration: 0.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
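
A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact configuration used here (that lives in config.json): the function name, file names, and report title are placeholders, and the CSV encoding is an assumption that may need adjusting for the source file.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined even when
    # pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Seosan lodging establishments")
    report.to_file(html_path)
    return report

# Hypothetical usage; both file names are placeholders:
# build_report("seosan_lodging.csv", "report.html")
```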

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
숙박업(일반): 162
숙박업(생활): 11

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 162 | 93.6%
숙박업(생활) | 11 | 6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
숙박업(일반) | 162 | 93.6%
숙박업(생활) | 11 | 6.4%

업소명
Text

Distinct: 172
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 19
Median length: 18
Mean length: 5.2890173
Min length: 2

Characters and Unicode

Total characters: 915
Distinct characters: 219
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
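
As an illustration of these character properties, the sketch below counts Unicode general categories for one 업소명 value that appears in this report, using Python's standard `unicodedata` module. It covers general categories only (scripts and blocks are not exposed by that module), and the helper name and the short label mapping are illustrative.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the labels used in this report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}

def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("넘버(no)25호텔")
for code, n in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the counts for this variable.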

Unique

Unique: 171
Unique (%): 98.8%

Sample

1st row: 한일여인숙
2nd row: 서인여인숙
3rd row: 화정여인숙
4th row: 부흥여인숙
5th row: 대지여인숙
Value | Count | Frequency (%)
모텔 | 4 | 2.1%
한일여인숙 | 2 | 1.1%
서산호수공원점 | 2 | 1.1%
산수휴양림 | 1 | 0.5%
라온무인텔호텔 | 1 | 0.5%
백제의미소 | 1 | 0.5%
모텔첼로 | 1 | 0.5%
브라운도트호텔 | 1 | 0.5%
블루스카이 | 1 | 0.5%
넘버(no)25호텔 | 1 | 0.5%
Other values (173) | 173 | 92.0%

Most occurring characters

ValueCountFrequency (%)
102
 
11.1%
64
 
7.0%
40
 
4.4%
35
 
3.8%
33
 
3.6%
27
 
3.0%
22
 
2.4%
20
 
2.2%
19
 
2.1%
17
 
1.9%
Other values (209) 536
58.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 855
93.4%
Space Separator 15
 
1.6%
Decimal Number 10
 
1.1%
Uppercase Letter 10
 
1.1%
Lowercase Letter 10
 
1.1%
Open Punctuation 6
 
0.7%
Close Punctuation 6
 
0.7%
Other Punctuation 3
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
102
 
11.9%
64
 
7.5%
40
 
4.7%
35
 
4.1%
33
 
3.9%
27
 
3.2%
22
 
2.6%
20
 
2.3%
19
 
2.2%
17
 
2.0%
Other values (187) 476
55.7%
Uppercase Letter
ValueCountFrequency (%)
O 2
20.0%
H 2
20.0%
J 1
10.0%
T 1
10.0%
E 1
10.0%
N 1
10.0%
L 1
10.0%
C 1
10.0%
Lowercase Letter
ValueCountFrequency (%)
t 2
20.0%
o 2
20.0%
l 2
20.0%
e 2
20.0%
s 1
10.0%
i 1
10.0%
Decimal Number
ValueCountFrequency (%)
4 3
30.0%
5 3
30.0%
2 2
20.0%
6 2
20.0%
Space Separator
ValueCountFrequency (%)
15
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 855
93.4%
Common 40
 
4.4%
Latin 20
 
2.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
102
 
11.9%
64
 
7.5%
40
 
4.7%
35
 
4.1%
33
 
3.9%
27
 
3.2%
22
 
2.6%
20
 
2.3%
19
 
2.2%
17
 
2.0%
Other values (187) 476
55.7%
Latin
ValueCountFrequency (%)
O 2
10.0%
t 2
10.0%
H 2
10.0%
o 2
10.0%
l 2
10.0%
e 2
10.0%
J 1
 
5.0%
T 1
 
5.0%
E 1
 
5.0%
N 1
 
5.0%
Other values (4) 4
20.0%
Common
ValueCountFrequency (%)
15
37.5%
( 6
 
15.0%
) 6
 
15.0%
4 3
 
7.5%
5 3
 
7.5%
. 3
 
7.5%
2 2
 
5.0%
6 2
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 855
93.4%
ASCII 60
 
6.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
102
 
11.9%
64
 
7.5%
40
 
4.7%
35
 
4.1%
33
 
3.9%
27
 
3.2%
22
 
2.6%
20
 
2.3%
19
 
2.2%
17
 
2.0%
Other values (187) 476
55.7%
ASCII
ValueCountFrequency (%)
15
25.0%
( 6
 
10.0%
) 6
 
10.0%
4 3
 
5.0%
5 3
 
5.0%
. 3
 
5.0%
O 2
 
3.3%
2 2
 
3.3%
t 2
 
3.3%
H 2
 
3.3%
Other values (12) 16
26.7%

업소소재지(도로명)
Text

Distinct: 172
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 36
Median length: 32
Mean length: 24.456647
Min length: 19

Characters and Unicode

Total characters: 4231
Distinct characters: 124
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 171
Unique (%): 98.8%

Sample

1st row: 충청남도 서산시 시장4길 5-1 (동문동)
2nd row: 충청남도 서산시 고운로 195 (동문동)
3rd row: 충청남도 서산시 연당3길 12 (동문동)
4th row: 충청남도 서산시 번화2로 44-1 (동문동)
5th row: 충청남도 서산시 고운로 198 (동문동)
Value | Count | Frequency (%)
충청남도 | 173 | 18.8%
서산시 | 173 | 18.8%
동문동 | 43 | 4.7%
읍내동 | 43 | 4.7%
대산읍 | 29 | 3.1%
읍내1로 | 11 | 1.2%
동헌로 | 11 | 1.2%
충의로 | 11 | 1.2%
시장6로 | 10 | 1.1%
읍내2로 | 9 | 1.0%
Other values (263) | 408 | 44.3%

Most occurring characters

ValueCountFrequency (%)
748
17.7%
213
 
5.0%
199
 
4.7%
185
 
4.4%
1 178
 
4.2%
176
 
4.2%
175
 
4.1%
174
 
4.1%
174
 
4.1%
171
 
4.0%
Other values (114) 1838
43.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2383
56.3%
Space Separator 748
 
17.7%
Decimal Number 679
 
16.0%
Open Punctuation 117
 
2.8%
Close Punctuation 117
 
2.8%
Dash Punctuation 83
 
2.0%
Other Punctuation 57
 
1.3%
Math Symbol 40
 
0.9%
Uppercase Letter 7
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
213
 
8.9%
199
 
8.4%
185
 
7.8%
176
 
7.4%
175
 
7.3%
174
 
7.3%
174
 
7.3%
171
 
7.2%
124
 
5.2%
103
 
4.3%
Other values (94) 689
28.9%
Decimal Number
ValueCountFrequency (%)
1 178
26.2%
2 112
16.5%
3 76
11.2%
4 66
 
9.7%
6 51
 
7.5%
5 49
 
7.2%
9 45
 
6.6%
7 39
 
5.7%
0 35
 
5.2%
8 28
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
B 3
42.9%
A 2
28.6%
D 1
 
14.3%
C 1
 
14.3%
Space Separator
ValueCountFrequency (%)
748
100.0%
Open Punctuation
ValueCountFrequency (%)
( 117
100.0%
Close Punctuation
ValueCountFrequency (%)
) 117
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 83
100.0%
Other Punctuation
ValueCountFrequency (%)
, 57
100.0%
Math Symbol
ValueCountFrequency (%)
~ 40
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2383
56.3%
Common 1841
43.5%
Latin 7
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
213
 
8.9%
199
 
8.4%
185
 
7.8%
176
 
7.4%
175
 
7.3%
174
 
7.3%
174
 
7.3%
171
 
7.2%
124
 
5.2%
103
 
4.3%
Other values (94) 689
28.9%
Common
ValueCountFrequency (%)
748
40.6%
1 178
 
9.7%
( 117
 
6.4%
) 117
 
6.4%
2 112
 
6.1%
- 83
 
4.5%
3 76
 
4.1%
4 66
 
3.6%
, 57
 
3.1%
6 51
 
2.8%
Other values (6) 236
 
12.8%
Latin
ValueCountFrequency (%)
B 3
42.9%
A 2
28.6%
D 1
 
14.3%
C 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2383
56.3%
ASCII 1848
43.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
748
40.5%
1 178
 
9.6%
( 117
 
6.3%
) 117
 
6.3%
2 112
 
6.1%
- 83
 
4.5%
3 76
 
4.1%
4 66
 
3.6%
, 57
 
3.1%
6 51
 
2.8%
Other values (10) 243
 
13.1%
Hangul
ValueCountFrequency (%)
213
 
8.9%
199
 
8.4%
185
 
7.8%
176
 
7.4%
175
 
7.3%
174
 
7.3%
174
 
7.3%
171
 
7.2%
124
 
5.2%
103
 
4.3%
Other values (94) 689
28.9%

소재지전화
Text

MISSING 

Distinct: 134
Distinct (%): 98.5%
Missing: 37
Missing (%): 21.4%
Memory size: 1.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 11.977941
Min length: 9

Characters and Unicode

Total characters: 1629
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 132
Unique (%): 97.1%
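
For this variable the percentages use two different denominators. The figures in this report are consistent with the missing share being taken over all 173 rows, while the distinct share, unique share, and mean length are taken over the 136 non-missing values. A short check of that reading (variable names are illustrative):

```python
n_rows = 173
n_missing = 37
n_present = n_rows - n_missing  # rows that actually carry a phone number

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * 134 / n_present   # 134 distinct values
unique_pct = 100 * 132 / n_present     # 132 values occurring exactly once
mean_length = 1629 / n_present         # 1629 total characters

print(n_present)
print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}%")
print(f"{mean_length:.6f}")
```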

Sample

1st row: 041-665-0505
2nd row: 041-665-2618
3rd row: 041-662-1535
4th row: 041-665-3066
5th row: 041-665-3682
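
The sample rows share an area-code, exchange, and line-number layout separated by hyphens. The sketch below checks values against that layout; the pattern is an assumption inferred from the five sample rows only, and the reported minimum length of 9 suggests that some values in the column do not follow it. The helper name is hypothetical.

```python
import re

# Assumed layout: area code starting with 0, then 3-4 digits, then 4 digits.
LANDLINE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

def is_landline(value):
    """Return True if value is a string in the assumed landline layout."""
    return isinstance(value, str) and LANDLINE.fullmatch(value) is not None

samples = ["041-665-0505", "041-665-2618", "041-662-1535",
           "041-665-3066", "041-665-3682"]
print(all(is_landline(s) for s in samples))
```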
Value | Count | Frequency (%)
041-664-9933 | 2 | 1.5%
041-668-7822 | 2 | 1.5%
041-665-0505 | 1 | 0.7%
041-669-9449 | 1 | 0.7%
041-665-4842 | 1 | 0.7%
041-667-7774 | 1 | 0.7%
041-688-8488 | 1 | 0.7%
041-667-7474 | 1 | 0.7%
041-663-9831 | 1 | 0.7%
041-668-6555 | 1 | 0.7%
Other values (124) | 124 | 91.2%

Most occurring characters

ValueCountFrequency (%)
6 300
18.4%
- 271
16.6%
0 223
13.7%
1 205
12.6%
4 187
11.5%
8 102
 
6.3%
5 93
 
5.7%
3 76
 
4.7%
9 63
 
3.9%
2 63
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1358
83.4%
Dash Punctuation 271
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 300
22.1%
0 223
16.4%
1 205
15.1%
4 187
13.8%
8 102
 
7.5%
5 93
 
6.8%
3 76
 
5.6%
9 63
 
4.6%
2 63
 
4.6%
7 46
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 271
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1629
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 300
18.4%
- 271
16.6%
0 223
13.7%
1 205
12.6%
4 187
11.5%
8 102
 
6.3%
5 93
 
5.7%
3 76
 
4.7%
9 63
 
3.9%
2 63
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1629
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 300
18.4%
- 271
16.6%
0 223
13.7%
1 205
12.6%
4 187
11.5%
8 102
 
6.3%
5 93
 
5.7%
3 76
 
4.7%
9 63
 
3.9%
2 63
 
3.9%

객실수
Real number (ℝ)

Distinct: 45
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.445087
Minimum: 1
Maximum: 115
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 12
median: 19
Q3: 29
95-th percentile: 46
Maximum: 115
Range: 114
Interquartile range (IQR): 17
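
The range and IQR follow directly from the quantiles. The sketch below also applies the common 1.5 × IQR rule for flagging unusually large values; that rule is not something this report computes, it is added here only to put the maximum of 115 in context. The list of large values is the set of six largest room counts reported for this variable.

```python
minimum, q1, median, q3, maximum = 1, 12, 19, 29, 115

value_range = maximum - minimum
iqr = q3 - q1

# 1.5 * IQR fence above Q3 (an illustrative rule, not part of the report).
upper_fence = q3 + 1.5 * iqr

# The six largest room counts listed for this variable.
largest_values = [115, 70, 69, 56, 53, 51]
above_fence = [v for v in largest_values if v > upper_fence]

print(value_range, iqr, upper_fence)
print(above_fence)
```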

Descriptive statistics

Standard deviation: 14.20177
Coefficient of variation (CV): 0.63273404
Kurtosis: 10.345285
Mean: 22.445087
Median Absolute Deviation (MAD): 8
Skewness: 2.244798
Sum: 3883
Variance: 201.69028
Monotonicity: Not monotonic
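
Several of these statistics are derived from one another. A short consistency check using only the sum, the row count, and the standard deviation reported here (variable names are illustrative):

```python
total_rooms = 3883
n_observations = 173
std = 14.20177

mean = total_rooms / n_observations
cv = std / mean          # coefficient of variation
variance = std ** 2

print(f"mean={mean:.6f} cv={cv:.4f} variance={variance:.2f}")
```

The positive skewness (2.24) matches a mean of about 22.4 sitting above the median of 19: a few large establishments pull the average upward.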
Histogram with fixed size bins (bins=45)
Value | Count | Frequency (%)
19 | 16 | 9.2%
12 | 10 | 5.8%
10 | 10 | 5.8%
18 | 10 | 5.8%
36 | 9 | 5.2%
17 | 7 | 4.0%
29 | 6 | 3.5%
7 | 6 | 3.5%
22 | 6 | 3.5%
35 | 6 | 3.5%
Other values (35) | 87 | 50.3%

Value | Count | Frequency (%)
1 | 1 | 0.6%
4 | 1 | 0.6%
5 | 3 | 1.7%
6 | 3 | 1.7%
7 | 6 | 3.5%
8 | 5 | 2.9%
9 | 4 | 2.3%
10 | 10 | 5.8%
11 | 2 | 1.2%
12 | 10 | 5.8%

Value | Count | Frequency (%)
115 | 1 | 0.6%
70 | 1 | 0.6%
69 | 1 | 0.6%
56 | 1 | 0.6%
53 | 1 | 0.6%
51 | 1 | 0.6%
46 | 4 | 2.3%
44 | 1 | 0.6%
40 | 3 | 1.7%
38 | 4 | 2.3%

Correlations

 | 업종명 | 객실수
업종명 | 1.000 | 0.169
객실수 | 0.169 | 1.000

 | 객실수 | 업종명
객실수 | 1.000 | 0.124
업종명 | 0.124 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | 객실수
0 | 숙박업(일반) | 한일여인숙 | 충청남도 서산시 시장4길 5-1 (동문동) | 041-665-0505 | 14
1 | 숙박업(일반) | 서인여인숙 | 충청남도 서산시 고운로 195 (동문동) | 041-665-2618 | 8
2 | 숙박업(일반) | 화정여인숙 | 충청남도 서산시 연당3길 12 (동문동) | 041-662-1535 | 12
3 | 숙박업(일반) | 부흥여인숙 | 충청남도 서산시 번화2로 44-1 (동문동) | 041-665-3066 | 10
4 | 숙박업(일반) | 대지여인숙 | 충청남도 서산시 고운로 198 (동문동) | 041-665-3682 | 7
5 | 숙박업(일반) | 양지여인숙 | 충청남도 서산시 번화2로 50-6 (동문동) | <NA> | 7
6 | 숙박업(일반) | 한일여인숙 | 충청남도 서산시 대산읍 죽엽로 10 | 041-663-7082 | 5
7 | 숙박업(일반) | 중앙여관 | 충청남도 서산시 고운로 166-10 (동문동) | 041-665-8001 | 20
8 | 숙박업(일반) | 수정여인숙 | 충청남도 서산시 대산읍 삼길포2길 32-10 | 041-665-8973 | 6
9 | 숙박업(일반) | 금성장여관 | 충청남도 서산시 시장6로 30-9 (동문동) | 041-665-2558 | 19

 | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | 객실수
163 | 숙박업(생활) | 펜션파인씨 | 충청남도 서산시 음암면 음암로 34, 1~2층 | 041-669-6777 | 9
164 | 숙박업(생활) | 노을에반하다 | 충청남도 서산시 읍내2로 19, 1~8층 (읍내동) | 041-664-6270 | 13
165 | 숙박업(생활) | 천송펜션 | 충청남도 서산시 팔봉면 서해로 2608, 1~3층 | 041-664-3319 | 9
166 | 숙박업(생활) | 나폴리펜션 | 충청남도 서산시 안견로 465 (갈산동) | 041-669-2988 | 7
167 | 숙박업(생활) | 리젠트모텔 | 충청남도 서산시 남부순환로 523 (양대동) | <NA> | 36
168 | 숙박업(생활) | 썬셋모텔펜션 | 충청남도 서산시 음암면 운암로 698, 2층 | 041-669-7933 | 13
169 | 숙박업(생활) | 산수휴양림 자연의소리 | 충청남도 서산시 대산읍 구진천로 10-9 | 041-688-4210 | 11
170 | 숙박업(생활) | 서산바닷가펜션 | 충청남도 서산시 음암면 상홍검동길 34-22, 토크무인텔 2층 | <NA> | 4
171 | 숙박업(생활) | 굴포펜션 | 충청남도 서산시 지곡면 왕산길 6-16, 1~4동 | <NA> | 1
172 | 숙박업(생활) | 만조스테이 | 충청남도 서산시 팔봉면 호리영살길 122 | <NA> | 6