Overview

Dataset statistics

Number of variables: 5
Number of observations: 275
Missing cells: 3
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 42.5 B
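
The two derived figures in this block follow from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is total memory divided by the number of observations. The variable names are illustrative, and because the 11.4 KiB input is already rounded, the result only approximates the reported 42.5 B.

```python
# Counts copied from the Dataset statistics block.
n_variables = 5
n_observations = 275
missing_cells = 3
total_memory_kib = 11.4  # already rounded in the report

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.1f} B")
```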

Variable types

Categorical: 1
Text: 2
Numeric: 2

Dataset

Description: Business name, location, floor area, and number of rooms for lodging establishments (hotels, inns, motels, guesthouses, pensions, etc.) operating in Jinju City.
URL: https://www.data.go.kr/data/3066707/fileData.do

Alerts

High correlation: 영업장면적 is highly overall correlated with 객실수 and 1 other field
High correlation: 객실수 is highly overall correlated with 영업장면적 and 1 other field
High correlation: 업종명 is highly overall correlated with 영업장면적 and 1 other field
Imbalance: 업종명 is highly imbalanced (89.0%)
Missing: 영업장면적 has 3 (1.1%) missing values
Zeros: 영업장면적 has 4 (1.5%) zeros
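
The 89.0% imbalance figure for 업종명 is consistent with an entropy-based score. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum (log2 of the number of classes). The function name `imbalance_score` is illustrative and is not taken from the report; the class counts (271 and 4) come from the 업종명 section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (assumed definition)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no meaningful entropy ratio
    # Shannon entropy in bits; zero counts contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

score = imbalance_score([271, 4])
print(f"Imbalance: {score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.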

Reproduction

Analysis started: 2023-12-12 13:02:58.256788
Analysis finished: 2023-12-12 13:02:59.159891
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명 (business category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
숙박업(일반) | 271
숙박업(생활) | 4

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%
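
Distinct and Unique measure different things, which is why 업종명 reports 2 distinct values but 0 unique ones. The sketch below assumes Distinct is the number of different values and Unique is the number of values that occur exactly once. The list is rebuilt from the reported counts (271 and 4), not from the original file.

```python
from collections import Counter

# Rebuild the column from the counts reported for 업종명.
values = ["숙박업(일반)"] * 271 + ["숙박업(생활)"] * 4
counts = Counter(values)

distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

distinct_pct = 100 * distinct / len(values)
unique_pct = 100 * unique / len(values)

print(f"Distinct: {distinct} ({distinct_pct:.1f}%)")
print(f"Unique: {unique} ({unique_pct:.1f}%)")
```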

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 271 | 98.5%
숙박업(생활) | 4 | 1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of value counts: 숙박업(일반) 271 (98.5%), 숙박업(생활) 4 (1.5%)]

업소명 (business name)
Text

Distinct: 268
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 27
Median length: 18
Mean length: 5.2
Min length: 2

Characters and Unicode

Total characters: 1430
Distinct characters: 292
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
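
As an illustration of the character properties mentioned here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is illustrative, and the sample string is one business name from the report's sample rows. The standard library exposes the general category but not the script or block, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the two-letter Unicode general category of each character."""
    return Counter(unicodedata.category(ch) for ch in text)

name = "에스티알엘(Strl)호텔"
counts = category_counts(name)

# Lo = Other Letter (Hangul syllables), Lu = Uppercase Letter,
# Ll = Lowercase Letter, Ps = Open Punctuation, Pe = Close Punctuation.
for category, count in sorted(counts.items()):
    print(category, count)
```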

Unique

Unique: 261
Unique (%): 94.9%

Sample

1st row: 진주스토리
2nd row: 진성모텔
3rd row: 휴모텔
4th row: 진양장여관
5th row: 삼학여관

Value | Count | Frequency (%)
모텔 | 4 | 1.4%
제이모텔 | 2 | 0.7%
휴모텔 | 2 | 0.7%
진주성점 | 2 | 0.7%
코지모텔 | 2 | 0.7%
드림모텔 | 2 | 0.7%
동경모텔 | 2 | 0.7%
썸모텔 | 2 | 0.7%
헤라모텔 | 2 | 0.7%
리버빌 | 1 | 0.3%
Other values (270) | 270 | 92.8%

Most occurring characters

Value | Count | Frequency (%)
[?] | 183 | 12.8%
[?] | 133 | 9.3%
[?] | 65 | 4.5%
[?] | 62 | 4.3%
[?] | 52 | 3.6%
[?] | 52 | 3.6%
[?] | 31 | 2.2%
[?] | 23 | 1.6%
) | 21 | 1.5%
( | 21 | 1.5%
Other values (282) | 787 | 55.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1286 | 89.9%
Uppercase Letter | 55 | 3.8%
Close Punctuation | 21 | 1.5%
Open Punctuation | 21 | 1.5%
Lowercase Letter | 19 | 1.3%
Space Separator | 16 | 1.1%
Decimal Number | 8 | 0.6%
Other Punctuation | 3 | 0.2%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 183 | 14.2%
[?] | 133 | 10.3%
[?] | 65 | 5.1%
[?] | 62 | 4.8%
[?] | 52 | 4.0%
[?] | 52 | 4.0%
[?] | 31 | 2.4%
[?] | 23 | 1.8%
[?] | 16 | 1.2%
[?] | 15 | 1.2%
Other values (240) | 654 | 50.9%

Uppercase Letter
Value | Count | Frequency (%)
E | 6 | 10.9%
T | 6 | 10.9%
S | 4 | 7.3%
A | 4 | 7.3%
L | 4 | 7.3%
O | 4 | 7.3%
H | 4 | 7.3%
M | 4 | 7.3%
I | 3 | 5.5%
G | 3 | 5.5%
Other values (9) | 13 | 23.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 5 | 26.3%
o | 3 | 15.8%
t | 2 | 10.5%
l | 2 | 10.5%
r | 2 | 10.5%
h | 1 | 5.3%
s | 1 | 5.3%
p | 1 | 5.3%
u | 1 | 5.3%
v | 1 | 5.3%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 25.0%
9 | 2 | 25.0%
5 | 1 | 12.5%
8 | 1 | 12.5%
4 | 1 | 12.5%
7 | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 33.3%
' | 1 | 33.3%
. | 1 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 21 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 21 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 16 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1286 | 89.9%
Latin | 74 | 5.2%
Common | 70 | 4.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 183 | 14.2%
[?] | 133 | 10.3%
[?] | 65 | 5.1%
[?] | 62 | 4.8%
[?] | 52 | 4.0%
[?] | 52 | 4.0%
[?] | 31 | 2.4%
[?] | 23 | 1.8%
[?] | 16 | 1.2%
[?] | 15 | 1.2%
Other values (240) | 654 | 50.9%

Latin
Value | Count | Frequency (%)
E | 6 | 8.1%
T | 6 | 8.1%
e | 5 | 6.8%
S | 4 | 5.4%
A | 4 | 5.4%
L | 4 | 5.4%
O | 4 | 5.4%
H | 4 | 5.4%
M | 4 | 5.4%
I | 3 | 4.1%
Other values (19) | 30 | 40.5%

Common
Value | Count | Frequency (%)
) | 21 | 30.0%
( | 21 | 30.0%
(space) | 16 | 22.9%
2 | 2 | 2.9%
9 | 2 | 2.9%
& | 1 | 1.4%
' | 1 | 1.4%
5 | 1 | 1.4%
+ | 1 | 1.4%
8 | 1 | 1.4%
Other values (3) | 3 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1286 | 89.9%
ASCII | 144 | 10.1%

Most frequent character per block

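Hangul
Value | Count | Frequency (%)
[?] | 183 | 14.2%
[?] | 133 | 10.3%
[?] | 65 | 5.1%
[?] | 62 | 4.8%
[?] | 52 | 4.0%
[?] | 52 | 4.0%
[?] | 31 | 2.4%
[?] | 23 | 1.8%
[?] | 16 | 1.2%
[?] | 15 | 1.2%
Other values (240) | 654 | 50.9%

ASCII
Value | Count | Frequency (%)
) | 21 | 14.6%
( | 21 | 14.6%
(space) | 16 | 11.1%
E | 6 | 4.2%
T | 6 | 4.2%
e | 5 | 3.5%
S | 4 | 2.8%
A | 4 | 2.8%
L | 4 | 2.8%
O | 4 | 2.8%
Other values (32) | 53 | 36.8%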

영업소 주소(도로명) (business address, road name)
Text

Distinct: 274
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 50
Median length: 44
Mean length: 27.232727
Min length: 20

Characters and Unicode

Total characters: 7489
Distinct characters: 127
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 273
Unique (%): 99.3%

Sample

1st row: 경상남도 진주시 비봉로54번길 8 (계동)
2nd row: 경상남도 진주시 진주대로1032번길 11 (동성동)
3rd row: 경상남도 진주시 진주대로1040번길 10 (동성동)
4th row: 경상남도 진주시 진주대로891번길 41 (강남동)
5th row: 경상남도 진주시 진주대로879번길 14-16 (강남동)

Value | Count | Frequency (%)
경상남도 | 275 | 18.9%
진주시 | 275 | 18.9%
장대동 | 38 | 2.6%
상평동 | 32 | 2.2%
봉곡동 | 32 | 2.2%
상대동 | 22 | 1.5%
논개길 | 18 | 1.2%
강남동 | 18 | 1.2%
남강로 | 17 | 1.2%
옥봉동 | 17 | 1.2%
Other values (363) | 710 | 48.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1180 | 15.8%
[?] | 347 | 4.6%
[?] | 346 | 4.6%
[?] | 337 | 4.5%
[?] | 327 | 4.4%
1 | 319 | 4.3%
[?] | 290 | 3.9%
[?] | 284 | 3.8%
[?] | 275 | 3.7%
[?] | 275 | 3.7%
Other values (117) | 3509 | 46.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4229 | 56.5%
Decimal Number | 1294 | 17.3%
Space Separator | 1180 | 15.8%
Open Punctuation | 269 | 3.6%
Close Punctuation | 269 | 3.6%
Other Punctuation | 129 | 1.7%
Dash Punctuation | 97 | 1.3%
Math Symbol | 20 | 0.3%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 347 | 8.2%
[?] | 346 | 8.2%
[?] | 337 | 8.0%
[?] | 327 | 7.7%
[?] | 290 | 6.9%
[?] | 284 | 6.7%
[?] | 275 | 6.5%
[?] | 275 | 6.5%
[?] | 256 | 6.1%
[?] | 196 | 4.6%
Other values (97) | 1296 | 30.6%

Decimal Number
Value | Count | Frequency (%)
1 | 319 | 24.7%
2 | 135 | 10.4%
5 | 132 | 10.2%
6 | 130 | 10.0%
3 | 127 | 9.8%
4 | 99 | 7.7%
0 | 98 | 7.6%
9 | 97 | 7.5%
7 | 88 | 6.8%
8 | 69 | 5.3%

Other Punctuation
Value | Count | Frequency (%)
, | 127 | 98.4%
/ | 1 | 0.8%
. | 1 | 0.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1180 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 269 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 269 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 97 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4229 | 56.5%
Common | 3258 | 43.5%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 347 | 8.2%
[?] | 346 | 8.2%
[?] | 337 | 8.0%
[?] | 327 | 7.7%
[?] | 290 | 6.9%
[?] | 284 | 6.7%
[?] | 275 | 6.5%
[?] | 275 | 6.5%
[?] | 256 | 6.1%
[?] | 196 | 4.6%
Other values (97) | 1296 | 30.6%

Common
Value | Count | Frequency (%)
(space) | 1180 | 36.2%
1 | 319 | 9.8%
( | 269 | 8.3%
) | 269 | 8.3%
2 | 135 | 4.1%
5 | 132 | 4.1%
6 | 130 | 4.0%
, | 127 | 3.9%
3 | 127 | 3.9%
4 | 99 | 3.0%
Other values (8) | 471 | 14.5%

Latin
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4229 | 56.5%
ASCII | 3260 | 43.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1180 | 36.2%
1 | 319 | 9.8%
( | 269 | 8.3%
) | 269 | 8.3%
2 | 135 | 4.1%
5 | 132 | 4.0%
6 | 130 | 4.0%
, | 127 | 3.9%
3 | 127 | 3.9%
4 | 99 | 3.0%
Other values (10) | 473 | 14.5%

Hangul
Value | Count | Frequency (%)
[?] | 347 | 8.2%
[?] | 346 | 8.2%
[?] | 337 | 8.0%
[?] | 327 | 7.7%
[?] | 290 | 6.9%
[?] | 284 | 6.7%
[?] | 275 | 6.5%
[?] | 275 | 6.5%
[?] | 256 | 6.1%
[?] | 196 | 4.6%
Other values (97) | 1296 | 30.6%

영업장면적 (business floor area)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 261
Distinct (%): 96.0%
Missing: 3
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 669.74125
Minimum: 0
Maximum: 5954
Zeros: 4
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 146.021
Q1: 320.6025
median: 455.115
Q3: 740.78
95-th percentile: 1710.736
Maximum: 5954
Range: 5954
Interquartile range (IQR): 420.1775
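
Range and IQR are derived from the other quantile statistics. The short sketch below recomputes them from the reported values for 영업장면적, using the usual definitions (range = maximum − minimum, IQR = Q3 − Q1). The variable names are illustrative.

```python
# Quantile statistics reported for 영업장면적.
minimum, maximum = 0.0, 5954.0
q1, q3 = 320.6025, 740.78

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(f"Range: {value_range:g}")
print(f"IQR: {iqr:.4f}")
```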

Descriptive statistics

Standard deviation: 770.48003
Coefficient of variation (CV): 1.1504145
Kurtosis: 22.021852
Mean: 669.74125
Median Absolute Deviation (MAD): 177.725
Skewness: 4.2082047
Sum: 182169.62
Variance: 593639.47
Monotonicity: Not monotonic
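
Several of these descriptive statistics are tied together by simple identities. The sketch below checks them against the reported values, assuming CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of non-missing observations (275 rows minus 3 missing). Because the inputs are rounded as printed in the report, the results match only to within rounding.

```python
# Values reported for 영업장면적.
mean = 669.74125
std = 770.48003
n_valid = 275 - 3  # observations minus missing values

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance from the standard deviation
total = mean * n_valid  # sum over the non-missing values

print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.2f}")
print(f"Sum: {total:.2f}")
```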
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
0.0 | 4 | 1.5%
812.39 | 3 | 1.1%
446.8 | 2 | 0.7%
652.18 | 2 | 0.7%
594.0 | 2 | 0.7%
331.2 | 2 | 0.7%
408.05 | 2 | 0.7%
575.78 | 2 | 0.7%
2223.26 | 1 | 0.4%
518.24 | 1 | 0.4%
Other values (251) | 251 | 91.3%
(Missing) | 3 | 1.1%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 4 | 1.5%
37.69 | 1 | 0.4%
69.38 | 1 | 0.4%
76.48 | 1 | 0.4%
90.05 | 1 | 0.4%
108.63 | 1 | 0.4%
111.38 | 1 | 0.4%
116.2 | 1 | 0.4%
119.5 | 1 | 0.4%
139.37 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
5954.0 | 1 | 0.4%
5803.8 | 1 | 0.4%
4892.52 | 1 | 0.4%
4875.33 | 1 | 0.4%
4115.42 | 1 | 0.4%
2800.95 | 1 | 0.4%
2800.52 | 1 | 0.4%
2436.56 | 1 | 0.4%
2322.78 | 1 | 0.4%
2306.0 | 1 | 0.4%

객실수 (number of rooms)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 42
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.047273
Minimum: 5
Maximum: 128
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 5
5-th percentile: 9
Q1: 13
median: 17
Q3: 23
95-th percentile: 35
Maximum: 128
Range: 123
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 13.741937
Coefficient of variation (CV): 0.68547661
Kurtosis: 25.265249
Mean: 20.047273
Median Absolute Deviation (MAD): 4
Skewness: 4.3261304
Sum: 5513
Variance: 188.84082
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=42)]

Value | Count | Frequency (%)
19 | 25 | 9.1%
18 | 23 | 8.4%
14 | 22 | 8.0%
15 | 21 | 7.6%
12 | 19 | 6.9%
11 | 13 | 4.7%
16 | 13 | 4.7%
13 | 12 | 4.4%
20 | 12 | 4.4%
17 | 11 | 4.0%
Other values (32) | 104 | 37.8%

Minimum 10 values
Value | Count | Frequency (%)
5 | 3 | 1.1%
7 | 3 | 1.1%
8 | 7 | 2.5%
9 | 3 | 1.1%
10 | 11 | 4.0%
11 | 13 | 4.7%
12 | 19 | 6.9%
13 | 12 | 4.4%
14 | 22 | 8.0%
15 | 21 | 7.6%

Maximum 10 values
Value | Count | Frequency (%)
128 | 1 | 0.4%
110 | 1 | 0.4%
95 | 1 | 0.4%
83 | 1 | 0.4%
76 | 1 | 0.4%
72 | 1 | 0.4%
57 | 1 | 0.4%
53 | 1 | 0.4%
43 | 1 | 0.4%
38 | 1 | 0.4%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation plot]
Variable | 업종명 | 영업장면적 | 객실수
업종명 | 1.000 | 0.817 | 0.974
영업장면적 | 0.817 | 1.000 | 0.915
객실수 | 0.974 | 0.915 | 1.000

[Figure: correlation plot]
Variable | 영업장면적 | 객실수 | 업종명
영업장면적 | 1.000 | 0.895 | 0.630
객실수 | 0.895 | 1.000 | 0.848
업종명 | 0.630 | 0.848 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows
Row | 업종명 | 업소명 | 영업소 주소(도로명) | 영업장면적 | 객실수
0 | 숙박업(일반) | 진주스토리 | 경상남도 진주시 비봉로54번길 8 (계동) | 272.0 | 36
1 | 숙박업(일반) | 진성모텔 | 경상남도 진주시 진주대로1032번길 11 (동성동) | 454.92 | 12
2 | 숙박업(일반) | 휴모텔 | 경상남도 진주시 진주대로1040번길 10 (동성동) | 139.37 | 15
3 | 숙박업(일반) | 진양장여관 | 경상남도 진주시 진주대로891번길 41 (강남동) | 147.11 | 13
4 | 숙박업(일반) | 삼학여관 | 경상남도 진주시 진주대로879번길 14-16 (강남동) | 158.68 | 8
5 | 숙박업(일반) | 백만장여관 | 경상남도 진주시 장대로10번길 10 (장대동) | 116.2 | 11
6 | 숙박업(일반) | 삼성여관 | 경상남도 진주시 장대로6번길 5-1 (장대동) | 69.38 | 11
7 | 숙박업(일반) | 유명장여관 | 경상남도 진주시 진양호로564번길 12 (장대동) | 400.59 | 19
8 | 숙박업(일반) | 선화장여관 | 경상남도 진주시 장대로15번길 5-2 (장대동) | 206.28 | 7
9 | 숙박업(일반) | 이화장 여관 | 경상남도 진주시 장대로6번길 5 (장대동) | 157.69 | 11

Last rows
Row | 업종명 | 업소명 | 영업소 주소(도로명) | 영업장면적 | 객실수
265 | 숙박업(일반) | 호텔보보 | 경상남도 진주시 돗골로20번길 11 (상평동) | 511.87 | 17
266 | 숙박업(일반) | 호텔로그인 | 경상남도 진주시 정촌면 화개천로54번길 33-16, 1~3층 | 2800.95 | 28
267 | 숙박업(일반) | 에포케호텔 | 경상남도 진주시 정촌면 화개천로 132-19, 1,2,3층 | 2223.26 | 23
268 | 숙박업(일반) | 히트모텔 | 경상남도 진주시 논개길 49, 1층~5층 (장대동) | 459.05 | 20
269 | 숙박업(일반) | 에스티알엘(Strl)호텔 | 경상남도 진주시 순환로 506, 1층~7층 (평거동) | 1729.26 | 35
270 | 숙박업(일반) | 호텔테라 | 경상남도 진주시 정촌면 화개천로54번길 81, 1층~3층 | 1207.14 | 10
271 | 숙박업(생활) | 주식회사라온스테이 | 경상남도 진주시 영천강로 164, 라온스테이페를라1차 95객실 3층~14층 (충무공동) | 2193.31 | 95
272 | 숙박업(생활) | 더클라우드 | 경상남도 진주시 논개길 47-2 (장대동, 1층~6층) | 724.35 | 19
273 | 숙박업(생활) | 뉴라온스테이 | 경상남도 진주시 영천강로 166, 라온 스테이 인 페를라 2차 3-14층 (충무공동) | 4115.42 | 110
274 | 숙박업(생활) | 골든튤립호텔남강 | 경상남도 진주시 남강로673번길 16, 골든튤립남강 4층-16층 (동성동) | 4875.33 | 128