Overview

Dataset statistics

Number of variables: 6
Number of observations: 179
Missing cells: 39
Missing cells (%): 3.6%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 8.7 KiB
Average record size in memory: 49.7 B
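
The percentages in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the usual definitions (missing cells over all cells, duplicate rows over all rows); the variable names are illustrative and not part of the report:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 179
missing_cells = 39
duplicate_rows = 1

# Assumption: the missing-cell percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: the duplicate percentage is taken over the number of rows.
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(total_cells, missing_pct, duplicate_pct)
```

Under these assumptions the results agree with the reported 3.6% missing cells and 0.6% duplicate rows.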

Variable types

Categorical: 2
Text: 3
Numeric: 1

Dataset

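Description: Data on lodging establishments in Seosan-si. The fields are 업종명 (business category), 업소명 (establishment name), 업소소재지 (establishment address), 소재지전화 (phone number), and 객실수 (number of rooms).
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=445&beforeMenuCd=DOM_000000201001001000&publicdatapk=15069198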

Alerts

데이터기준일 has constant value "2021-10-27" (Constant)
Dataset has 1 (0.6%) duplicate rows (Duplicates)
업종명 is highly imbalanced (73.7%) (Imbalance)
소재지전화 has 39 (21.8%) missing values (Missing)
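
The 73.7% imbalance figure for 업종명 can be reproduced from the two category counts (171 and 8). The sketch below assumes the score is one minus the normalised Shannon entropy of the class shares; that formula is an assumption that happens to match the reported value, not something the report itself states, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts for 업종명: 숙박업(일반) = 171, 숙박업(생활) = 8.
score = imbalance_score([171, 8])
print(f"{score:.1%}")
```

A score of 0 would mean perfectly balanced classes and a score near 1 a column dominated by one class, which is why a 95.5% / 4.5% split triggers the alert.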

Reproduction

Analysis started: 2024-01-09 21:36:39.181906
Analysis finished: 2024-01-09 21:36:39.651981
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

숙박업(일반): 171
숙박업(생활): 8

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 171 | 95.5%
숙박업(생활) | 8 | 4.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
숙박업(일반 | 171 | 95.5%
숙박업(생활 | 8 | 4.5%

업소명
Text

Distinct: 177
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.8882682
Min length: 2

Characters and Unicode

Total characters: 875
Distinct characters: 204
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
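
As a consistency check on the Length and Characters figures, the mean length of a Text variable should equal its total character count divided by its number of non-missing values. A minimal sketch using the totals reported for the three Text variables in this report; the dictionary layout is illustrative, and the non-missing count for 소재지전화 is taken as 179 - 39 = 140:

```python
# (total characters, number of non-missing values) as reported for each Text variable.
text_stats = {
    "업소명": (875, 179),
    "업소소재지(도로명)": (4469, 179),
    "소재지전화": (1680, 179 - 39),  # 39 values are missing
}

# Mean length = total characters / non-missing values.
mean_lengths = {name: total / n for name, (total, n) in text_stats.items()}

for name, mean in mean_lengths.items():
    print(name, round(mean, 7))
```

The results can be compared with the reported mean lengths of 4.8882682, 24.96648, and 12.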

Unique

Unique: 175
Unique (%): 97.8%

Sample

1st row: 한일여인숙
2nd row: 서인여인숙
3rd row: 화정여인숙
4th row: 신화여인숙
5th row: 부흥여인숙

Value | Count | Frequency (%)
모텔 | 4 | 2.2%
한일여인숙 | 2 | 1.1%
뉴그린모텔 | 2 | 1.1%
천송펜션 | 1 | 0.5%
중왕펜션 | 1 | 0.5%
블루스카이 | 1 | 0.5%
화정여인숙 | 1 | 0.5%
부석모텔 | 1 | 0.5%
프로포즈모텔 | 1 | 0.5%
청수파크 | 1 | 0.5%
Other values (171) | 171 | 91.9%

Most occurring characters

ValueCountFrequency (%)
105
 
12.0%
73
 
8.3%
48
 
5.5%
39
 
4.5%
29
 
3.3%
25
 
2.9%
24
 
2.7%
24
 
2.7%
17
 
1.9%
15
 
1.7%
Other values (194) 476
54.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 861 | 98.4%
Space Separator | 7 | 0.8%
Other Punctuation | 2 | 0.2%
Open Punctuation | 2 | 0.2%
Close Punctuation | 2 | 0.2%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
105
 
12.2%
73
 
8.5%
48
 
5.6%
39
 
4.5%
29
 
3.4%
25
 
2.9%
24
 
2.8%
24
 
2.8%
17
 
2.0%
15
 
1.7%
Other values (189) 462
53.7%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
J 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 861 | 98.4%
Common | 13 | 1.5%
Latin | 1 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
105
 
12.2%
73
 
8.5%
48
 
5.6%
39
 
4.5%
29
 
3.4%
25
 
2.9%
24
 
2.8%
24
 
2.8%
17
 
2.0%
15
 
1.7%
Other values (189) 462
53.7%
Common
ValueCountFrequency (%)
7
53.8%
. 2
 
15.4%
( 2
 
15.4%
) 2
 
15.4%
Latin
ValueCountFrequency (%)
J 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 861 | 98.4%
ASCII | 14 | 1.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
105
 
12.2%
73
 
8.5%
48
 
5.6%
39
 
4.5%
29
 
3.4%
25
 
2.9%
24
 
2.8%
24
 
2.8%
17
 
2.0%
15
 
1.7%
Other values (189) 462
53.7%
ASCII
ValueCountFrequency (%)
7
50.0%
. 2
 
14.3%
( 2
 
14.3%
) 2
 
14.3%
J 1
 
7.1%

업소소재지(도로명)
Text

Distinct: 178
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 64
Median length: 44
Mean length: 24.96648
Min length: 19

Characters and Unicode

Total characters: 4469
Distinct characters: 128
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
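
To illustrate what a per-category character count looks like, the sketch below tallies Unicode general categories for the first sample address in this report, using Python's standard `unicodedata` module. It is only an illustration of the idea and makes no claim about how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# First sample address shown in this report.
address = "충청남도 서산시 시장4길 5-1 (동문동)"

# unicodedata.category returns two-letter general-category codes,
# e.g. "Lo" (Other Letter), "Zs" (Space Separator), "Nd" (Decimal Number),
# "Pd" (Dash Punctuation), "Ps"/"Pe" (Open/Close Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in address)

for code, count in category_counts.most_common():
    print(code, count)
```

Hangul syllables fall under "Lo", which is why Other Letter dominates the category tables for the Korean text columns.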

Unique

Unique: 177
Unique (%): 98.9%

Sample

1st row: 충청남도 서산시 시장4길 5-1 (동문동)
2nd row: 충청남도 서산시 고운로 195 (동문동)
3rd row: 충청남도 서산시 연당3길 12 (동문동)
4th row: 충청남도 서산시 번화2로 44-1 (동문동)
5th row: 충청남도 서산시 고운로 198 (동문동)

Value | Count | Frequency (%)
충청남도 | 179 | 18.7%
서산시 | 179 | 18.7%
읍내동 | 44 | 4.6%
동문동 | 43 | 4.5%
대산읍 | 29 | 3.0%
동헌로 | 12 | 1.3%
읍내1로 | 11 | 1.1%
충의로 | 11 | 1.1%
시장6로 | 10 | 1.0%
시장4길 | 9 | 0.9%
Other values (274) | 431 | 45.0%

Most occurring characters

ValueCountFrequency (%)
779
 
17.4%
221
 
4.9%
205
 
4.6%
1 200
 
4.5%
191
 
4.3%
182
 
4.1%
182
 
4.1%
181
 
4.1%
180
 
4.0%
180
 
4.0%
Other values (118) 1968
44.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2477 | 55.4%
Space Separator | 779 | 17.4%
Decimal Number | 741 | 16.6%
Close Punctuation | 126 | 2.8%
Open Punctuation | 126 | 2.8%
Dash Punctuation | 93 | 2.1%
Other Punctuation | 75 | 1.7%
Math Symbol | 43 | 1.0%
Uppercase Letter | 9 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
221
 
8.9%
205
 
8.3%
191
 
7.7%
182
 
7.3%
182
 
7.3%
181
 
7.3%
180
 
7.3%
180
 
7.3%
125
 
5.0%
104
 
4.2%
Other values (98) 726
29.3%
Decimal Number
ValueCountFrequency (%)
1 200
27.0%
2 127
17.1%
3 82
11.1%
4 68
 
9.2%
6 56
 
7.6%
5 49
 
6.6%
9 47
 
6.3%
0 42
 
5.7%
7 40
 
5.4%
8 30
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
B 4
44.4%
A 3
33.3%
D 1
 
11.1%
C 1
 
11.1%
Space Separator
ValueCountFrequency (%)
779
100.0%
Close Punctuation
ValueCountFrequency (%)
) 126
100.0%
Open Punctuation
ValueCountFrequency (%)
( 126
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 93
100.0%
Other Punctuation
ValueCountFrequency (%)
, 75
100.0%
Math Symbol
ValueCountFrequency (%)
~ 43
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2477 | 55.4%
Common | 1983 | 44.4%
Latin | 9 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
221
 
8.9%
205
 
8.3%
191
 
7.7%
182
 
7.3%
182
 
7.3%
181
 
7.3%
180
 
7.3%
180
 
7.3%
125
 
5.0%
104
 
4.2%
Other values (98) 726
29.3%
Common
ValueCountFrequency (%)
779
39.3%
1 200
 
10.1%
2 127
 
6.4%
) 126
 
6.4%
( 126
 
6.4%
- 93
 
4.7%
3 82
 
4.1%
, 75
 
3.8%
4 68
 
3.4%
6 56
 
2.8%
Other values (6) 251
 
12.7%
Latin
ValueCountFrequency (%)
B 4
44.4%
A 3
33.3%
D 1
 
11.1%
C 1
 
11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2477 | 55.4%
ASCII | 1992 | 44.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
779
39.1%
1 200
 
10.0%
2 127
 
6.4%
) 126
 
6.3%
( 126
 
6.3%
- 93
 
4.7%
3 82
 
4.1%
, 75
 
3.8%
4 68
 
3.4%
6 56
 
2.8%
Other values (10) 260
 
13.1%
Hangul
ValueCountFrequency (%)
221
 
8.9%
205
 
8.3%
191
 
7.7%
182
 
7.3%
182
 
7.3%
181
 
7.3%
180
 
7.3%
180
 
7.3%
125
 
5.0%
104
 
4.2%
Other values (98) 726
29.3%

소재지전화
Text

MISSING 

Distinct: 137
Distinct (%): 97.9%
Missing: 39
Missing (%): 21.8%
Memory size: 1.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1680
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 134
Unique (%): 95.7%

Sample

1st row: 041-665-0505
2nd row: 041-665-2618
3rd row: 041-662-1535
4th row: 041-665-3368
5th row: 041-665-3066

Value | Count | Frequency (%)
041-669-9449 | 2 | 1.4%
041-664-9933 | 2 | 1.4%
041-668-7822 | 2 | 1.4%
041-669-2988 | 1 | 0.7%
041-663-5048 | 1 | 0.7%
041-688-8488 | 1 | 0.7%
041-665-3106 | 1 | 0.7%
041-668-5858 | 1 | 0.7%
041-666-2255 | 1 | 0.7%
041-664-1797 | 1 | 0.7%
Other values (127) | 127 | 90.7%

Most occurring characters

Value | Count | Frequency (%)
6 | 314 | 18.7%
- | 280 | 16.7%
0 | 222 | 13.2%
1 | 208 | 12.4%
4 | 196 | 11.7%
5 | 107 | 6.4%
8 | 102 | 6.1%
3 | 74 | 4.4%
9 | 64 | 3.8%
2 | 64 | 3.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1400 | 83.3%
Dash Punctuation | 280 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 314
22.4%
0 222
15.9%
1 208
14.9%
4 196
14.0%
5 107
 
7.6%
8 102
 
7.3%
3 74
 
5.3%
9 64
 
4.6%
2 64
 
4.6%
7 49
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 280
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1680
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 314
18.7%
- 280
16.7%
0 222
13.2%
1 208
12.4%
4 196
11.7%
5 107
 
6.4%
8 102
 
6.1%
3 74
 
4.4%
9 64
 
3.8%
2 64
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1680
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 314
18.7%
- 280
16.7%
0 222
13.2%
1 208
12.4%
4 196
11.7%
5 107
 
6.4%
8 102
 
6.1%
3 74
 
4.4%
9 64
 
3.8%
2 64
 
3.8%

객실수
Real number (ℝ)

Distinct: 45
Distinct (%): 25.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.256983
Minimum: 4
Maximum: 194
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 4
5-th percentile: 7
Q1: 12
median: 19
Q3: 29
95-th percentile: 46
Maximum: 194
Range: 190
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 19.0151
Coefficient of variation (CV): 0.8176082
Kurtosis: 38.591079
Mean: 23.256983
Median Absolute Deviation (MAD): 8
Skewness: 4.9496223
Sum: 4163
Variance: 361.57404
Monotonicity: Not monotonic
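
Several of the 객실수 statistics are simple functions of others, so they can be cross-checked by hand. A minimal sketch using the standard definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = max - min, IQR = Q3 - Q1); the inputs are copied from this report and the variable names are illustrative:

```python
# Figures copied from the 객실수 statistics in this report.
n = 179
total = 4163
std = 19.0151
minimum, maximum = 4, 194
q1, q3 = 12, 29

mean = total / n                 # reported as 23.256983
variance = std ** 2              # reported as 361.57404
cv = std / mean                  # reported as 0.8176082
value_range = maximum - minimum  # reported as 190
iqr = q3 - q1                    # reported as 17

print(round(mean, 6), round(variance, 3), round(cv, 4), value_range, iqr)
```

Small differences in the last digits are expected because the standard deviation is itself a rounded value.
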
[Figure: Histogram with fixed size bins (bins=45)]

Value | Count | Frequency (%)
19 | 17 | 9.5%
12 | 12 | 6.7%
18 | 11 | 6.1%
10 | 11 | 6.1%
36 | 10 | 5.6%
17 | 7 | 3.9%
7 | 7 | 3.9%
8 | 7 | 3.9%
29 | 6 | 3.4%
15 | 6 | 3.4%
Other values (35) | 85 | 47.5%

Value | Count | Frequency (%)
4 | 1 | 0.6%
5 | 3 | 1.7%
6 | 2 | 1.1%
7 | 7 | 3.9%
8 | 7 | 3.9%
9 | 4 | 2.2%
10 | 11 | 6.1%
11 | 1 | 0.6%
12 | 12 | 6.7%
13 | 4 | 2.2%

Value | Count | Frequency (%)
194 | 1 | 0.6%
115 | 1 | 0.6%
70 | 1 | 0.6%
69 | 1 | 0.6%
56 | 1 | 0.6%
53 | 1 | 0.6%
51 | 1 | 0.6%
46 | 4 | 2.2%
44 | 1 | 0.6%
40 | 3 | 1.7%

데이터기준일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

2021-10-27: 179

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-10-27
2nd row: 2021-10-27
3rd row: 2021-10-27
4th row: 2021-10-27
5th row: 2021-10-27

Common Values

Value | Count | Frequency (%)
2021-10-27 | 179 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
2021-10-27 | 179 | 100.0%

Interactions


Correlations

Variable | 업종명 | 객실수
업종명 | 1.000 | 0.000
객실수 | 0.000 | 1.000

Variable | 객실수 | 업종명
객실수 | 1.000 | 0.000
업종명 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | 객실수 | 데이터기준일
0 | 숙박업(일반) | 한일여인숙 | 충청남도 서산시 시장4길 5-1 (동문동) | 041-665-0505 | 14 | 2021-10-27
1 | 숙박업(일반) | 서인여인숙 | 충청남도 서산시 고운로 195 (동문동) | 041-665-2618 | 8 | 2021-10-27
2 | 숙박업(일반) | 화정여인숙 | 충청남도 서산시 연당3길 12 (동문동) | 041-662-1535 | 12 | 2021-10-27
3 | 숙박업(일반) | 신화여인숙 | 충청남도 서산시 번화2로 44-1 (동문동) | 041-665-3368 | 12 | 2021-10-27
4 | 숙박업(일반) | 부흥여인숙 | 충청남도 서산시 고운로 198 (동문동) | 041-665-3066 | 10 | 2021-10-27
5 | 숙박업(일반) | 대지여인숙 | 충청남도 서산시 번화2로 50-6 (동문동) | 041-665-3682 | 7 | 2021-10-27
6 | 숙박업(일반) | 양지여인숙 | 충청남도 서산시 대산읍 죽엽로 10 | <NA> | 7 | 2021-10-27
7 | 숙박업(일반) | 여로여인숙 | 충청남도 서산시 고운로 166-10 (동문동) | 041-665-2505 | 8 | 2021-10-27
8 | 숙박업(일반) | 한일여인숙 | 충청남도 서산시 대산읍 삼길포2길 32-10 | 041-663-7082 | 5 | 2021-10-27
9 | 숙박업(일반) | 중앙여관 | 충청남도 서산시 시장6로 30-9 (동문동) | 041-665-8001 | 20 | 2021-10-27

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | 객실수 | 데이터기준일
169 | 숙박업(일반) | 자자호텔 | 충청남도 서산시 대산읍 구진천로 10-9 | 041-665-5200 | 46 | 2021-10-27
170 | 숙박업(일반) | 톡무인텔 | 충청남도 서산시 음암면 상홍검동길 34-22, 토크무인텔 2층 | 041-664-2400 | 23 | 2021-10-27
171 | 숙박업(생활) | 서산펜션 해뜨는비치 | 충청남도 서산시 지곡면 왕산길 6-16, 1~4동 | 041-669-3890 | 20 | 2021-10-27
172 | 숙박업(생활) | 펜션파인씨 | 충청남도 서산시 팔봉면 호리영살길 122 | 041-669-6777 | 9 | 2021-10-27
173 | 숙박업(생활) | 중왕펜션 | 충청남도 서산시 지곡면 왕산길 16-4 (-6(1,2동),-9(1,2동)-10,-11-12,-13(1,2동),14) | 041-664-6270 | 13 | 2021-10-27
174 | 숙박업(생활) | 천송펜션 | 충청남도 서산시 팔봉면 정자동길 161-38 (가동(1,2층),나동(1,2층)) | 041-664-3319 | 9 | 2021-10-27
175 | 숙박업(생활) | 나폴리펜션 | 충청남도 서산시 팔봉면 호리1길 132-10, A동, B동(201,202,301,302호) | 041-669-2988 | 7 | 2021-10-27
176 | 숙박업(생활) | 리젠트모텔 | 충청남도 서산시 동헌로 89, 1~6층 (읍내동) | <NA> | 36 | 2021-10-27
177 | 숙박업(생활) | 바닷가펜션 | 충청남도 서산시 지곡면 왕산길 6-10, 1~2동 | <NA> | 4 | 2021-10-27
178 | 숙박업(생활) | 썬셋모텔펜션 | 충청남도 서산시 부석면 창리2길 27, 2~3층 | 041-669-7933 | 13 | 2021-10-27

Duplicate rows

Most frequently occurring

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | 객실수 | 데이터기준일 | # duplicates
0 | 숙박업(일반) | 뉴그린모텔 | 충청남도 서산시 지곡면 왕산이로 6 | 041-669-9449 | 19 | 2021-10-27 | 2
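
One way to find fully duplicated rows like the one listed here is to count identical records. The sketch below uses only the Python standard library on a three-row excerpt built from values in this report; the excerpt and the variable names are illustrative, and the report does not say how the profiling tool detects duplicates.

```python
from collections import Counter

# Rows as (업종명, 업소명, 업소소재지(도로명), 소재지전화, 객실수, 데이터기준일).
# The 뉴그린모텔 row is the duplicate reported here; it is listed twice on purpose.
rows = [
    ("숙박업(일반)", "한일여인숙", "충청남도 서산시 시장4길 5-1 (동문동)", "041-665-0505", 14, "2021-10-27"),
    ("숙박업(일반)", "뉴그린모텔", "충청남도 서산시 지곡면 왕산이로 6", "041-669-9449", 19, "2021-10-27"),
    ("숙박업(일반)", "뉴그린모텔", "충청남도 서산시 지곡면 왕산이로 6", "041-669-9449", 19, "2021-10-27"),
]

# Tuples are hashable, so identical rows collapse onto the same Counter key.
row_counts = Counter(rows)
duplicates = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicates.items():
    print(row[1], count)
```

Note that two rows count as duplicates only when every field matches; the two 한일여인숙 entries in the sample share a name but differ in address, so they are distinct records.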