Overview

Dataset statistics

Number of variables: 6
Number of observations: 56
Missing cells: 1
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 52.4 B
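
The missing-cell percentage follows directly from the counts above. A minimal sketch, assuming a "cell" is one row/column entry so that the table holds variables × observations cells (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 56
missing_cells = 1

# Assumption: one cell per (row, column) pair.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 336
print(f"{missing_pct:.1f}%")  # 0.3%
```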

Variable types

Categorical: 1
Text: 3
Numeric: 2

Dataset

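Description: 부산광역시남구_숙박업현황_20230413 (Busan Metropolitan City Nam-gu, lodging business status, 2023-04-13)
Author: 부산광역시 남구 (Nam-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15055782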

Alerts

업종명 is highly imbalanced (77.8%) [Imbalance]
소재지전화 has 1 (1.8%) missing values [Missing]
영업소 주소(도로명) has unique values [Unique]
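
The 77.8% figure in the imbalance alert can be reproduced from the two class counts reported for 업종명 (54 and 2). The sketch below assumes the score is one minus the normalized Shannon entropy of the class frequencies; the report itself does not state the formula, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits and k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 업종명: 숙박업(일반) = 54, 숙박업(생활) = 2.
score = imbalance_score([54, 2])
print(f"{score:.1%}")  # 77.8%
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.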

Reproduction

Analysis started: 2023-12-10 16:52:05.942757
Analysis finished: 2023-12-10 16:52:07.436019
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명 (business type name)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Value | Count
숙박업(일반) | 54
숙박업(생활) | 2

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 54 | 96.4%
숙박업(생활) | 2 | 3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts: 숙박업(일반) 54 (96.4%), 숙박업(생활) 2 (3.6%)

업소명 (establishment name)
Text

Distinct: 54
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 49
Median length: 16
Mean length: 6.5892857
Min length: 3

Characters and Unicode

Total characters: 369
Distinct characters: 128
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
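
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two names come from the sample rows of this dataset, `category_counts` is a hypothetical helper, and the standard library exposes only the general category (for example `Lo` for Other Letter), not the script or block.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count the Unicode general category of every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two establishment names taken from the sample rows of this report.
names = ["지앤지(G&G)관광호텔", "호텔 온나(ONNA)"]
counts = category_counts(names)

print(counts["Lo"])  # 11  Other Letter (Hangul syllables)
print(counts["Lu"])  # 6   Uppercase Letter
print(counts["Zs"])  # 1   Space Separator
print(counts["Po"])  # 1   Other Punctuation ("&")
```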

Unique

Unique: 52
Unique (%): 92.9%

Sample

1st row: 선샤인 하우스
2nd row: 풍조장여관
3rd row: 범일 여인숙
4th row: 남일 여인숙
5th row: 문현 모텔

Value | Count | Frequency (%)
호텔 | 7 | 8.0%
여관 | 4 | 4.6%
모텔 | 4 | 4.6%
낙원모텔 | 2 | 2.3%
브이모텔 | 2 | 2.3%
여인숙 | 2 | 2.3%
하우스 | 2 | 2.3%
뮤트호텔 | 1 | 1.1%
아젤리아 | 1 | 1.1%
호텔브라운도트대연 | 1 | 1.1%
Other values (61) | 61 | 70.1%

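Most occurring characters

Value | Count | Frequency (%)
(character missing) | 36 | 9.8%
(space) | 31 | 8.4%
(character missing) | 18 | 4.9%
(character missing) | 18 | 4.9%
(character missing) | 10 | 2.7%
(character missing) | 9 | 2.4%
(character missing) | 8 | 2.2%
( | 8 | 2.2%
) | 8 | 2.2%
(character missing) | 7 | 1.9%
Other values (118) | 216 | 58.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 267 | 72.4%
Uppercase Letter | 49 | 13.3%
Space Separator | 31 | 8.4%
Open Punctuation | 8 | 2.2%
Close Punctuation | 8 | 2.2%
Decimal Number | 5 | 1.4%
Other Punctuation | 1 | 0.3%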

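Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character missing) | 36 | 13.5%
(character missing) | 18 | 6.7%
(character missing) | 18 | 6.7%
(character missing) | 10 | 3.7%
(character missing) | 9 | 3.4%
(character missing) | 8 | 3.0%
(character missing) | 7 | 2.6%
(character missing) | 6 | 2.2%
(character missing) | 5 | 1.9%
(character missing) | 5 | 1.9%
Other values (90) | 145 | 54.3%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 10.2%
N | 5 | 10.2%
S | 4 | 8.2%
G | 4 | 8.2%
T | 4 | 8.2%
O | 4 | 8.2%
U | 3 | 6.1%
A | 3 | 6.1%
H | 3 | 6.1%
I | 2 | 4.1%
Other values (10) | 12 | 24.5%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 40.0%
5 | 1 | 20.0%
9 | 1 | 20.0%
1 | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 31 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 8 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%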

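Most occurring scripts

Value | Count | Frequency (%)
Hangul | 267 | 72.4%
Common | 53 | 14.4%
Latin | 49 | 13.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character missing) | 36 | 13.5%
(character missing) | 18 | 6.7%
(character missing) | 18 | 6.7%
(character missing) | 10 | 3.7%
(character missing) | 9 | 3.4%
(character missing) | 8 | 3.0%
(character missing) | 7 | 2.6%
(character missing) | 6 | 2.2%
(character missing) | 5 | 1.9%
(character missing) | 5 | 1.9%
Other values (90) | 145 | 54.3%

Latin
Value | Count | Frequency (%)
E | 5 | 10.2%
N | 5 | 10.2%
S | 4 | 8.2%
G | 4 | 8.2%
T | 4 | 8.2%
O | 4 | 8.2%
U | 3 | 6.1%
A | 3 | 6.1%
H | 3 | 6.1%
I | 2 | 4.1%
Other values (10) | 12 | 24.5%

Common
Value | Count | Frequency (%)
(space) | 31 | 58.5%
( | 8 | 15.1%
) | 8 | 15.1%
2 | 2 | 3.8%
& | 1 | 1.9%
5 | 1 | 1.9%
9 | 1 | 1.9%
1 | 1 | 1.9%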

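Most occurring blocks

Value | Count | Frequency (%)
Hangul | 267 | 72.4%
ASCII | 102 | 27.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character missing) | 36 | 13.5%
(character missing) | 18 | 6.7%
(character missing) | 18 | 6.7%
(character missing) | 10 | 3.7%
(character missing) | 9 | 3.4%
(character missing) | 8 | 3.0%
(character missing) | 7 | 2.6%
(character missing) | 6 | 2.2%
(character missing) | 5 | 1.9%
(character missing) | 5 | 1.9%
Other values (90) | 145 | 54.3%

ASCII
Value | Count | Frequency (%)
(space) | 31 | 30.4%
( | 8 | 7.8%
) | 8 | 7.8%
E | 5 | 4.9%
N | 5 | 4.9%
S | 4 | 3.9%
G | 4 | 3.9%
T | 4 | 3.9%
O | 4 | 3.9%
U | 3 | 2.9%
Other values (18) | 26 | 25.5%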

영업소 주소(도로명) (business address, road name)
Text

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 40
Median length: 28
Mean length: 25.053571
Min length: 21

Characters and Unicode

Total characters: 1403
Distinct characters: 59
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 부산광역시 남구 전포대로 106-2 (문현동)
2nd row: 부산광역시 남구 수영로 159 (대연동)
3rd row: 부산광역시 남구 남동천로 56-6 (문현동)
4th row: 부산광역시 남구 못골로 92-1 (대연동)
5th row: 부산광역시 남구 지게골로10번길 5 (문현동)

Value | Count | Frequency (%)
부산광역시 | 56 | 19.9%
남구 | 56 | 19.9%
대연동 | 28 | 9.9%
문현동 | 14 | 5.0%
용호동 | 10 | 3.5%
용호로 | 9 | 3.2%
유엔평화로4번길 | 9 | 3.2%
수영로13번길 | 6 | 2.1%
못골로12번길 | 4 | 1.4%
5 | 4 | 1.4%
Other values (68) | 86 | 30.5%

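Most occurring characters

Value | Count | Frequency (%)
(space) | 226 | 16.1%
(character missing) | 59 | 4.2%
(character missing) | 58 | 4.1%
(character missing) | 57 | 4.1%
( | 56 | 4.0%
(character missing) | 56 | 4.0%
(character missing) | 56 | 4.0%
(character missing) | 56 | 4.0%
(character missing) | 56 | 4.0%
(character missing) | 56 | 4.0%
Other values (49) | 667 | 47.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 837 | 59.7%
Space Separator | 226 | 16.1%
Decimal Number | 205 | 14.6%
Open Punctuation | 56 | 4.0%
Close Punctuation | 56 | 4.0%
Dash Punctuation | 16 | 1.1%
Other Punctuation | 6 | 0.4%
Math Symbol | 1 | 0.1%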

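Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character missing) | 59 | 7.0%
(character missing) | 58 | 6.9%
(character missing) | 57 | 6.8%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 34 | 4.1%
Other values (33) | 293 | 35.0%

Decimal Number
Value | Count | Frequency (%)
1 | 51 | 24.9%
2 | 27 | 13.2%
3 | 24 | 11.7%
5 | 19 | 9.3%
4 | 18 | 8.8%
0 | 18 | 8.8%
6 | 16 | 7.8%
9 | 14 | 6.8%
8 | 10 | 4.9%
7 | 8 | 3.9%

Space Separator
Value | Count | Frequency (%)
(space) | 226 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 56 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 56 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 16 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 6 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%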

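Most occurring scripts

Value | Count | Frequency (%)
Hangul | 837 | 59.7%
Common | 566 | 40.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character missing) | 59 | 7.0%
(character missing) | 58 | 6.9%
(character missing) | 57 | 6.8%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 34 | 4.1%
Other values (33) | 293 | 35.0%

Common
Value | Count | Frequency (%)
(space) | 226 | 39.9%
( | 56 | 9.9%
) | 56 | 9.9%
1 | 51 | 9.0%
2 | 27 | 4.8%
3 | 24 | 4.2%
5 | 19 | 3.4%
4 | 18 | 3.2%
0 | 18 | 3.2%
- | 16 | 2.8%
Other values (6) | 55 | 9.7%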

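Most occurring blocks

Value | Count | Frequency (%)
Hangul | 837 | 59.7%
ASCII | 566 | 40.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 226 | 39.9%
( | 56 | 9.9%
) | 56 | 9.9%
1 | 51 | 9.0%
2 | 27 | 4.8%
3 | 24 | 4.2%
5 | 19 | 3.4%
4 | 18 | 3.2%
0 | 18 | 3.2%
- | 16 | 2.8%
Other values (6) | 55 | 9.7%

Hangul
Value | Count | Frequency (%)
(character missing) | 59 | 7.0%
(character missing) | 58 | 6.9%
(character missing) | 57 | 6.8%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 56 | 6.7%
(character missing) | 34 | 4.1%
Other values (33) | 293 | 35.0%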

우편번호 (postal code)
Real number (ℝ)

Distinct: 21
Distinct (%): 37.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48482.304
Minimum: 48400
Maximum: 48587
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 48400
5th percentile: 48402
Q1: 48445
Median: 48492
Q3: 48509.5
95th percentile: 48568
Maximum: 48587
Range: 187
Interquartile range (IQR): 64.5

Descriptive statistics

Standard deviation: 56.772567
Coefficient of variation (CV): 0.0011709957
Kurtosis: -1.0624852
Mean: 48482.304
Median Absolute Deviation (MAD): 47
Skewness: 0.19891795
Sum: 2715009
Variance: 3223.1244
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
48492 | 11 | 19.6%
48445 | 7 | 12.5%
48415 | 6 | 10.7%
48402 | 5 | 8.9%
48567 | 5 | 8.9%
48548 | 3 | 5.4%
48453 | 3 | 5.4%
48496 | 2 | 3.6%
48568 | 2 | 3.6%
48411 | 1 | 1.8%
Other values (11) | 11 | 19.6%

Minimum 10 values

Value | Count | Frequency (%)
48400 | 1 | 1.8%
48402 | 5 | 8.9%
48411 | 1 | 1.8%
48415 | 6 | 10.7%
48445 | 7 | 12.5%
48453 | 3 | 5.4%
48476 | 1 | 1.8%
48485 | 1 | 1.8%
48492 | 11 | 19.6%
48493 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
48587 | 1 | 1.8%
48579 | 1 | 1.8%
48568 | 2 | 3.6%
48567 | 5 | 8.9%
48556 | 1 | 1.8%
48548 | 3 | 5.4%
48523 | 1 | 1.8%
48505 | 1 | 1.8%
48504 | 1 | 1.8%
48497 | 1 | 1.8%

소재지전화 (business location phone number)
Text

MISSING

Distinct: 54
Distinct (%): 98.2%
Missing: 1
Missing (%): 1.8%
Memory size: 580.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.018182
Min length: 12

Characters and Unicode

Total characters: 661
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 96.4%

Sample

1st row: 051-633-9297
2nd row: 051-639-0011
3rd row: 051-633-7174
4th row: 051-627-1144
5th row: 051-646-1745

Value | Count | Frequency (%)
051-627-1144 | 2 | 3.6%
051-612-8314 | 1 | 1.8%
051-637-7775 | 1 | 1.8%
051-645-0068 | 1 | 1.8%
051-633-3417 | 1 | 1.8%
051-627-7451 | 1 | 1.8%
051-626-8400 | 1 | 1.8%
051-642-2923 | 1 | 1.8%
051-625-5464 | 1 | 1.8%
051-637-6819 | 1 | 1.8%
Other values (44) | 44 | 80.0%

Most occurring characters

Value | Count | Frequency (%)
- | 110 | 16.6%
1 | 96 | 14.5%
0 | 91 | 13.8%
6 | 91 | 13.8%
5 | 73 | 11.0%
2 | 51 | 7.7%
4 | 47 | 7.1%
3 | 34 | 5.1%
7 | 30 | 4.5%
8 | 22 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 551 | 83.4%
Dash Punctuation | 110 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 96 | 17.4%
0 | 91 | 16.5%
6 | 91 | 16.5%
5 | 73 | 13.2%
2 | 51 | 9.3%
4 | 47 | 8.5%
3 | 34 | 6.2%
7 | 30 | 5.4%
8 | 22 | 4.0%
9 | 16 | 2.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 110 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 661 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 110 | 16.6%
1 | 96 | 14.5%
0 | 91 | 13.8%
6 | 91 | 13.8%
5 | 73 | 11.0%
2 | 51 | 7.7%
4 | 47 | 7.1%
3 | 34 | 5.1%
7 | 30 | 4.5%
8 | 22 | 3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 661 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 110 | 16.6%
1 | 96 | 14.5%
0 | 91 | 13.8%
6 | 91 | 13.8%
5 | 73 | 11.0%
2 | 51 | 7.7%
4 | 47 | 7.1%
3 | 34 | 5.1%
7 | 30 | 4.5%
8 | 22 | 3.3%

객실수 (number of rooms)
Real number (ℝ)

Distinct: 32
Distinct (%): 57.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.428571
Minimum: 9
Maximum: 281
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 9
5th percentile: 9.75
Q1: 16
Median: 23
Q3: 35.25
95th percentile: 49.5
Maximum: 281
Range: 272
Interquartile range (IQR): 19.25

Descriptive statistics

Standard deviation: 36.019619
Coefficient of variation (CV): 1.1837434
Kurtosis: 44.349025
Mean: 30.428571
Median Absolute Deviation (MAD): 9
Skewness: 6.3194493
Sum: 1704
Variance: 1297.413
Monotonicity: Not monotonic
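
Several of the 객실수 statistics are simple functions of one another, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, variance, and IQR from the reported figures. It also applies Tukey's 1.5 × IQR fence, which is an added assumption and not something the report computes, to show why the maximum of 281 stands apart from the rest of the distribution.

```python
# Figures copied from the 객실수 statistics in this report.
mean, std = 30.428571, 36.019619
q1, q3 = 16, 35.25
maximum = 281

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
iqr = q3 - q1          # interquartile range

# Assumption: Tukey's rule, a common outlier heuristic not used by the report itself.
upper_fence = q3 + 1.5 * iqr

print(round(cv, 4))           # 1.1837
print(round(variance, 1))     # 1297.4
print(iqr)                    # 19.25
print(upper_fence)            # 64.125
print(maximum > upper_fence)  # True
```

One very large property is consistent with the high skewness (6.32) and kurtosis (44.35) reported above.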

Histogram with fixed size bins (bins=32)

Common values

Value | Count | Frequency (%)
22 | 4 | 7.1%
9 | 3 | 5.4%
35 | 3 | 5.4%
14 | 3 | 5.4%
19 | 3 | 5.4%
20 | 3 | 5.4%
38 | 2 | 3.6%
40 | 2 | 3.6%
41 | 2 | 3.6%
28 | 2 | 3.6%
Other values (22) | 29 | 51.8%

Minimum 10 values

Value | Count | Frequency (%)
9 | 3 | 5.4%
10 | 2 | 3.6%
12 | 2 | 3.6%
13 | 1 | 1.8%
14 | 3 | 5.4%
15 | 2 | 3.6%
16 | 2 | 3.6%
18 | 1 | 1.8%
19 | 3 | 5.4%
20 | 3 | 5.4%

Maximum 10 values

Value | Count | Frequency (%)
281 | 1 | 1.8%
52 | 1 | 1.8%
51 | 1 | 1.8%
49 | 1 | 1.8%
45 | 1 | 1.8%
42 | 1 | 1.8%
41 | 2 | 3.6%
40 | 2 | 3.6%
38 | 2 | 3.6%
36 | 2 | 3.6%

Interactions


Correlations

Variable | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
업종명 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
업소명 | 1.000 | 1.000 | 1.000 | 0.942 | 0.991 | 1.000
영업소 주소(도로명) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
우편번호 | 0.000 | 0.942 | 1.000 | 1.000 | 1.000 | 0.000
소재지전화 | 0.000 | 0.991 | 1.000 | 1.000 | 1.000 | 1.000
객실수 | 0.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000

Variable | 우편번호 | 객실수 | 업종명
우편번호 | 1.000 | -0.107 | 0.000
객실수 | -0.107 | 1.000 | 0.000
업종명 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
0 | 숙박업(일반) | 선샤인 하우스 | 부산광역시 남구 전포대로 106-2 (문현동) | 48411 | 051-633-9297 | 9
1 | 숙박업(일반) | 풍조장여관 | 부산광역시 남구 수영로 159 (대연동) | 48453 | 051-639-0011 | 12
2 | 숙박업(일반) | 범일 여인숙 | 부산광역시 남구 남동천로 56-6 (문현동) | 48402 | 051-633-7174 | 15
3 | 숙박업(일반) | 남일 여인숙 | 부산광역시 남구 못골로 92-1 (대연동) | 48445 | 051-627-1144 | 14
4 | 숙박업(일반) | 문현 모텔 | 부산광역시 남구 지게골로10번길 5 (문현동) | 48476 | 051-646-1745 | 10
5 | 숙박업(일반) | 루이모텔 | 부산광역시 남구 유엔평화로4번길 74, 모텔루이 (대연동) | 48492 | 051-634-0011 | 14
6 | 숙박업(일반) | 성운장 | 부산광역시 남구 용호로 202 (용호동) | 48568 | 051-622-6688 | 19
7 | 숙박업(일반) | 은하모텔 | 부산광역시 남구 용호로 98 (용호동) | 48523 | 051-623-9663 | 19
8 | 숙박업(일반) | 오륙도여관 | 부산광역시 남구 신선로319번길 5 (용당동) | 48548 | 051-626-4276 | 9
9 | 숙박업(일반) | 낙원모텔 | 부산광역시 남구 수영로219번길 12-4 (대연동) | 48445 | 051-626-8440 | 18

Last rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
46 | 숙박업(일반) | 호텔 온나(ONNA) | 부산광역시 남구 유엔평화로4번길 5 (대연동) | 48493 | 051-623-7995 | 38
47 | 숙박업(일반) | 버킹엄 모텔 | 부산광역시 남구 용호로 170 (용호동) | 48567 | 051-622-6466 | 30
48 | 숙박업(일반) | 브이모텔 | 부산광역시 남구 자유평화로60번길 85 (문현동) | 48402 | 051-645-0154 | 27
49 | 숙박업(일반) | 시카고모텔 | 부산광역시 남구 동명로132번길 21 (용호동) | 48567 | 051-621-0002 | 38
50 | 숙박업(일반) | 보브호텔 | 부산광역시 남구 자유평화로60번길 90 (문현동) | 48402 | 051-633-0131 | 36
51 | 숙박업(일반) | 르이데아호텔 | 부산광역시 남구 유엔평화로 35 (대연동) | 48505 | 051-611-6643 | 52
52 | 숙박업(일반) | 지앤지(G&G)관광호텔 | 부산광역시 남구 유엔평화로4번길 34 (대연동) | 48492 | 051-626-7723 | 41
53 | 숙박업(일반) | 아바니 센트럴 부산 | 부산광역시 남구 전포대로 133, 1,3,5,19~35,37층 (문현동) | 48400 | 051-791-5800 | 281
54 | 숙박업(생활) | 우성빌 | 부산광역시 남구 못골로 94-1 (대연동) | 48445 | 051-627-1144 | 9
55 | 숙박업(생활) | 장미하우스 | 부산광역시 남구 수영로250번길 11-5 (대연동) | 48497 | <NA> | 16
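
The 소재지전화 statistics report one phone number occurring twice (051-627-1144), and the sample rows show which establishments share it. A minimal sketch of how such repeats could be located, using a few (name, phone) pairs copied from the rows above; `rows` and `shared` are illustrative names, and `None` stands in for `<NA>`:

```python
from collections import Counter

# (업소명, 소재지전화) pairs copied from the sample rows; None represents <NA>.
rows = [
    ("남일 여인숙", "051-627-1144"),
    ("낙원모텔", "051-626-8440"),
    ("우성빌", "051-627-1144"),
    ("장미하우스", None),
]

# Count non-missing phone numbers, then keep those used by more than one establishment.
phone_counts = Counter(phone for _, phone in rows if phone is not None)
shared = {
    phone: [name for name, p in rows if p == phone]
    for phone, count in phone_counts.items()
    if count > 1
}

print(shared)  # {'051-627-1144': ['남일 여인숙', '우성빌']}
```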