Overview

Dataset statistics

Number of variables: 6
Number of observations: 60
Missing cells: 1
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 52.2 B
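
The missing-cell percentage above can be reproduced from the other figures in the same list. The following is a minimal sketch, assuming the percentage is the number of missing cells divided by rows times columns; the function name is hypothetical and is not part of ydata-profiling.

```python
def missing_cell_percentage(n_missing, n_rows, n_columns):
    """Share of missing cells among all cells, in percent."""
    return 100.0 * n_missing / (n_rows * n_columns)


# Figures taken from the dataset statistics: 1 missing cell, 60 rows, 6 columns.
pct = missing_cell_percentage(1, 60, 6)
print(f"{pct:.1f}%")  # 0.3%
```

With 360 cells in total, a single missing cell gives roughly 0.28%, which rounds to the reported 0.3%.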

Variable types

Categorical: 1
Text: 3
Numeric: 2

Dataset

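Description: 부산광역시남구_숙박업현황_20210813 (lodging businesses in Nam-gu, Busan Metropolitan City, as of 2021-08-13)
Author: 부산광역시 남구 (Nam-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15055782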

Alerts

업종명 (business type) is highly imbalanced (78.9%) [Imbalance]
소재지전화 (phone number) has 1 (1.7%) missing value [Missing]
영업소 주소(도로명) (business address, road name) has unique values [Unique]
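
The 78.9% imbalance figure is consistent with a normalized-entropy score computed from the two category counts reported for 업종명 (58 and 2). The sketch below assumes the score is one minus the Shannon entropy divided by the maximum possible entropy for that number of classes. It is an illustration of that formula, not a copy of the library's code, and the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """1 - H / H_max, where H is the Shannon entropy of the class shares.

    Returns 0.0 for perfectly balanced classes and approaches 1.0 as a
    single class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts reported for 업종명: 58 and 2.
score = imbalance_score([58, 2])
print(f"{score:.1%}")  # 78.9%
```

A 50/50 split would score 0%, so a value near 79% reflects how strongly one category dominates the column.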

Reproduction

Analysis started: 2023-12-10 16:52:13.848953
Analysis finished: 2023-12-10 16:52:15.435050
Duration: 1.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명 (business type)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

숙박업(일반): 58
숙박업(생활): 2

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 58 | 96.7%
숙박업(생활) | 2 | 3.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 숙박업(일반) 58 (96.7%), 숙박업(생활) 2 (3.3%)]

업소명 (business name)
Text

Distinct: 58
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

Length

Max length: 49
Median length: 14
Mean length: 6.2166667
Min length: 3

Characters and Unicode

Total characters: 373
Distinct characters: 129
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
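
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories for one business name that appears in this dataset, using Python's standard `unicodedata` module. The two-letter codes map to the category names used in this report (`Lo` is Other Letter, `Lu` is Uppercase Letter, `Ps` and `Pe` are Open and Close Punctuation, `Po` is Other Punctuation). The helper name is hypothetical, and the report may compute its counts differently.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One business name taken from the sample rows of this report.
name = "지앤지(G&G)관광호텔"
counts = category_counts(name)
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for the Korean text columns.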

Unique

Unique: 56
Unique (%): 93.3%

Sample

1st row: 우암여관
2nd row: 선샤인 하우스
3rd row: 풍조장여관
4th row: 범일 여인숙
5th row: 남일 여인숙

Value | Count | Frequency (%)
호텔 | 5 | 5.6%
모텔 | 5 | 5.6%
여관 | 4 | 4.5%
낙원모텔 | 2 | 2.2%
하우스 | 2 | 2.2%
브이모텔 | 2 | 2.2%
여인숙 | 2 | 2.2%
블루 | 1 | 1.1%
t | 1 | 1.1%
뮤트호텔 | 1 | 1.1%
Other values (64) | 64 | 71.9%

Most occurring characters

ValueCountFrequency (%)
36
 
9.7%
29
 
7.8%
21
 
5.6%
15
 
4.0%
12
 
3.2%
11
 
2.9%
11
 
2.9%
10
 
2.7%
( 7
 
1.9%
) 7
 
1.9%
Other values (119) 214
57.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 275 | 73.7%
Uppercase Letter | 49 | 13.1%
Space Separator | 29 | 7.8%
Open Punctuation | 7 | 1.9%
Close Punctuation | 7 | 1.9%
Decimal Number | 5 | 1.3%
Other Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
36
 
13.1%
21
 
7.6%
15
 
5.5%
12
 
4.4%
11
 
4.0%
11
 
4.0%
10
 
3.6%
6
 
2.2%
6
 
2.2%
5
 
1.8%
Other values (91) 142
51.6%
Uppercase Letter
ValueCountFrequency (%)
N 5
10.2%
E 5
10.2%
S 4
 
8.2%
T 4
 
8.2%
G 4
 
8.2%
O 4
 
8.2%
U 3
 
6.1%
H 3
 
6.1%
A 3
 
6.1%
I 2
 
4.1%
Other values (10) 12
24.5%
Decimal Number
ValueCountFrequency (%)
2 2
40.0%
5 1
20.0%
1 1
20.0%
9 1
20.0%
Space Separator
ValueCountFrequency (%)
29
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 275 | 73.7%
Common | 49 | 13.1%
Latin | 49 | 13.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
36
 
13.1%
21
 
7.6%
15
 
5.5%
12
 
4.4%
11
 
4.0%
11
 
4.0%
10
 
3.6%
6
 
2.2%
6
 
2.2%
5
 
1.8%
Other values (91) 142
51.6%
Latin
ValueCountFrequency (%)
N 5
10.2%
E 5
10.2%
S 4
 
8.2%
T 4
 
8.2%
G 4
 
8.2%
O 4
 
8.2%
U 3
 
6.1%
H 3
 
6.1%
A 3
 
6.1%
I 2
 
4.1%
Other values (10) 12
24.5%
Common
ValueCountFrequency (%)
29
59.2%
( 7
 
14.3%
) 7
 
14.3%
2 2
 
4.1%
& 1
 
2.0%
5 1
 
2.0%
1 1
 
2.0%
9 1
 
2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 275 | 73.7%
ASCII | 98 | 26.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
36
 
13.1%
21
 
7.6%
15
 
5.5%
12
 
4.4%
11
 
4.0%
11
 
4.0%
10
 
3.6%
6
 
2.2%
6
 
2.2%
5
 
1.8%
Other values (91) 142
51.6%
ASCII
ValueCountFrequency (%)
29
29.6%
( 7
 
7.1%
) 7
 
7.1%
N 5
 
5.1%
E 5
 
5.1%
S 4
 
4.1%
T 4
 
4.1%
G 4
 
4.1%
O 4
 
4.1%
U 3
 
3.1%
Other values (18) 26
26.5%

영업소 주소(도로명) (business address, road name)
Text

Distinct: 60
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 612.0 B

Length

Max length: 40
Median length: 28
Mean length: 25.066667
Min length: 21

Characters and Unicode

Total characters: 1504
Distinct characters: 61
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 60
Unique (%): 100.0%

Sample

1st row: 부산광역시 남구 우암번영로14번길 14 (우암동)
2nd row: 부산광역시 남구 전포대로 106-2 (문현동)
3rd row: 부산광역시 남구 수영로 159 (대연동)
4th row: 부산광역시 남구 남동천로 56-6 (문현동)
5th row: 부산광역시 남구 못골로 92-1 (대연동)

Value | Count | Frequency (%)
부산광역시 | 60 | 19.9%
남구 | 60 | 19.9%
대연동 | 28 | 9.3%
문현동 | 15 | 5.0%
용호동 | 11 | 3.6%
유엔평화로4번길 | 9 | 3.0%
용호로 | 9 | 3.0%
수영로13번길 | 7 | 2.3%
못골로12번길 | 4 | 1.3%
5 | 4 | 1.3%
Other values (72) | 95 | 31.5%

Most occurring characters

ValueCountFrequency (%)
242
 
16.1%
64
 
4.3%
62
 
4.1%
62
 
4.1%
60
 
4.0%
60
 
4.0%
60
 
4.0%
60
 
4.0%
60
 
4.0%
60
 
4.0%
Other values (51) 714
47.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 897 | 59.6%
Space Separator | 242 | 16.1%
Decimal Number | 221 | 14.7%
Close Punctuation | 60 | 4.0%
Open Punctuation | 60 | 4.0%
Dash Punctuation | 17 | 1.1%
Other Punctuation | 6 | 0.4%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
64
 
7.1%
62
 
6.9%
62
 
6.9%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
38
 
4.2%
Other values (35) 311
34.7%
Decimal Number
ValueCountFrequency (%)
1 57
25.8%
2 28
12.7%
3 26
11.8%
4 20
 
9.0%
5 20
 
9.0%
0 18
 
8.1%
6 17
 
7.7%
9 16
 
7.2%
8 11
 
5.0%
7 8
 
3.6%
Space Separator
ValueCountFrequency (%)
242
100.0%
Close Punctuation
ValueCountFrequency (%)
) 60
100.0%
Open Punctuation
ValueCountFrequency (%)
( 60
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 17
100.0%
Other Punctuation
ValueCountFrequency (%)
, 6
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 897 | 59.6%
Common | 607 | 40.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
64
 
7.1%
62
 
6.9%
62
 
6.9%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
38
 
4.2%
Other values (35) 311
34.7%
Common
ValueCountFrequency (%)
242
39.9%
) 60
 
9.9%
( 60
 
9.9%
1 57
 
9.4%
2 28
 
4.6%
3 26
 
4.3%
4 20
 
3.3%
5 20
 
3.3%
0 18
 
3.0%
6 17
 
2.8%
Other values (6) 59
 
9.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 897 | 59.6%
ASCII | 607 | 40.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
242
39.9%
) 60
 
9.9%
( 60
 
9.9%
1 57
 
9.4%
2 28
 
4.6%
3 26
 
4.3%
4 20
 
3.3%
5 20
 
3.3%
0 18
 
3.0%
6 17
 
2.8%
Other values (6) 59
 
9.7%
Hangul
ValueCountFrequency (%)
64
 
7.1%
62
 
6.9%
62
 
6.9%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
60
 
6.7%
38
 
4.2%
Other values (35) 311
34.7%

우편번호 (postal code)
Real number (ℝ)

Distinct: 22
Distinct (%): 36.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48483.767
Minimum: 48400
Maximum: 48587
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 672.0 B

Quantile statistics

Minimum: 48400
5-th percentile: 48402
Q1: 48445
Median: 48492
Q3: 48529.25
95-th percentile: 48568
Maximum: 48587
Range: 187
Interquartile range (IQR): 84.25

Descriptive statistics

Standard deviation: 57.384923
Coefficient of variation (CV): 0.0011835904
Kurtosis: -1.1365747
Mean: 48483.767
Median Absolute Deviation (MAD): 47
Skewness: 0.16519359
Sum: 2909026
Variance: 3293.0294
Monotonicity: Not monotonic
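
Several of the figures above are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, variance, coefficient of variation, IQR and range for 우편번호 from the reported sum, standard deviation, quartiles and extremes. It assumes the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean); the variable names are illustrative only.

```python
# Values copied from the report for 우편번호 (60 observations).
n = 60
total = 2909026
std = 57.384923
q1, q3 = 48445, 48529.25
minimum, maximum = 48400, 48587

mean = total / n               # reported as 48483.767
variance = std ** 2            # reported as 3293.0294
cv = std / mean                # reported as 0.0011835904
iqr = q3 - q1                  # reported as 84.25
value_range = maximum - minimum  # reported as 187

print(round(mean, 3), round(variance, 4), iqr, value_range)
```

The very small CV indicates that the postal codes are tightly clustered relative to their magnitude, as expected for values confined to a single district.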
[Figure: Histogram with fixed size bins (bins=22)]

Common values

Value | Count | Frequency (%)
48492 | 11 | 18.3%
48415 | 7 | 11.7%
48445 | 7 | 11.7%
48567 | 6 | 10.0%
48402 | 5 | 8.3%
48453 | 3 | 5.0%
48548 | 3 | 5.0%
48568 | 2 | 3.3%
48556 | 2 | 3.3%
48496 | 2 | 3.3%
Other values (12) | 12 | 20.0%

Minimum 10 values

Value | Count | Frequency (%)
48400 | 1 | 1.7%
48402 | 5 | 8.3%
48411 | 1 | 1.7%
48415 | 7 | 11.7%
48445 | 7 | 11.7%
48453 | 3 | 5.0%
48476 | 1 | 1.7%
48479 | 1 | 1.7%
48485 | 1 | 1.7%
48492 | 11 | 18.3%

Maximum 10 values

Value | Count | Frequency (%)
48587 | 1 | 1.7%
48579 | 1 | 1.7%
48568 | 2 | 3.3%
48567 | 6 | 10.0%
48556 | 2 | 3.3%
48548 | 3 | 5.0%
48523 | 1 | 1.7%
48505 | 1 | 1.7%
48504 | 1 | 1.7%
48497 | 1 | 1.7%

소재지전화 (phone number)
Text

MISSING 

Distinct: 58
Distinct (%): 98.3%
Missing: 1
Missing (%): 1.7%
Memory size: 612.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.016949
Min length: 12

Characters and Unicode

Total characters: 709
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 96.6%
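
The report distinguishes "Distinct" (number of different values) from "Unique" (number of values that occur exactly once). For this column the two differ because one phone number, 051-627-1144, is shared by two rows. The sketch below shows the distinction on a small subset of the sample rows; the helper name is hypothetical, and it assumes missing values are excluded before counting, which matches the percentages reported here (58/59 and 57/59).

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (non-missing count, distinct count, count of values seen once)."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)
    unique = sum(1 for c in freq.values() if c == 1)
    return len(present), distinct, unique


# Phone numbers from sample rows 3, 4, 57, 58 and 59; None stands for <NA>.
phones = ["051-633-7174", "051-627-1144", "051-791-5800", "051-627-1144", None]
n_present, n_distinct, n_unique = distinct_and_unique(phones)
print(n_present, n_distinct, n_unique)  # 4 3 2
```

On the full column the same logic gives 59 non-missing values, 58 distinct and 57 unique.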

Sample

1st row: 051-646-3975
2nd row: 051-633-9297
3rd row: 051-639-0011
4th row: 051-633-7174
5th row: 051-627-1144

Value | Count | Frequency (%)
051-627-1144 | 2 | 3.4%
051-611-6882 | 1 | 1.7%
051-612-8314 | 1 | 1.7%
051-637-7775 | 1 | 1.7%
051-644-8480 | 1 | 1.7%
051-611-6097 | 1 | 1.7%
051-633-3417 | 1 | 1.7%
051-627-7451 | 1 | 1.7%
051-626-8400 | 1 | 1.7%
051-642-2923 | 1 | 1.7%
Other values (48) | 48 | 81.4%

Most occurring characters

Value | Count | Frequency (%)
- | 118 | 16.6%
1 | 100 | 14.1%
6 | 97 | 13.7%
0 | 95 | 13.4%
5 | 81 | 11.4%
2 | 52 | 7.3%
4 | 51 | 7.2%
3 | 39 | 5.5%
7 | 34 | 4.8%
8 | 24 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 591 | 83.4%
Dash Punctuation | 118 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 100
16.9%
6 97
16.4%
0 95
16.1%
5 81
13.7%
2 52
8.8%
4 51
8.6%
3 39
 
6.6%
7 34
 
5.8%
8 24
 
4.1%
9 18
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 118
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 709
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 118
16.6%
1 100
14.1%
6 97
13.7%
0 95
13.4%
5 81
11.4%
2 52
7.3%
4 51
7.2%
3 39
 
5.5%
7 34
 
4.8%
8 24
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 118
16.6%
1 100
14.1%
6 97
13.7%
0 95
13.4%
5 81
11.4%
2 52
7.3%
4 51
7.2%
3 39
 
5.5%
7 34
 
4.8%
8 24
 
3.4%

객실수 (number of rooms)
Real number (ℝ)

Distinct: 33
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.466667
Minimum: 9
Maximum: 281
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 672.0 B

Quantile statistics

Minimum: 9
5-th percentile: 9
Q1: 15.75
Median: 22.5
Q3: 35
95-th percentile: 49.1
Maximum: 281
Range: 272
Interquartile range (IQR): 19.25
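
The fractional 95th percentile (49.1) is consistent with linear interpolation between neighbouring sorted values. The sketch below reproduces it from the largest room counts listed in this report (281, 52, 51, 49, 45, each occurring once) and the 60 observations. It assumes the linear interpolation rule that pandas uses by default; the function name and the approach of working from the upper tail only are illustrative choices, not the report's own method.

```python
def percentile_from_largest(n, largest_desc, q):
    """Linearly interpolated q-quantile of n sorted values, given only the
    largest values in descending order (largest_desc[0] is the maximum)."""
    pos = q * (n - 1)          # fractional 0-based rank in ascending order
    lo = int(pos)              # rank just below the target position
    frac = pos - lo            # how far to move towards the next rank
    hi = min(lo + 1, n - 1)
    # Ascending rank r corresponds to index n - 1 - r in the descending list.
    i_lo, i_hi = n - 1 - lo, n - 1 - hi
    if i_lo >= len(largest_desc):
        raise ValueError("not enough of the upper tail was supplied")
    lower, upper = largest_desc[i_lo], largest_desc[i_hi]
    return lower + frac * (upper - lower)


# Largest 객실수 values from the report, in descending order.
p95 = percentile_from_largest(60, [281, 52, 51, 49, 45], 0.95)
print(round(p95, 1))  # 49.1
```

The target position 0.95 × 59 = 56.05 falls between the 57th and 58th sorted values (49 and 51), giving 49 + 0.05 × 2 = 49.1.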

Descriptive statistics

Standard deviation: 35.006279
Coefficient of variation (CV): 1.1879959
Kurtosis: 46.891103
Mean: 29.466667
Median Absolute Deviation (MAD): 9
Skewness: 6.4794394
Sum: 1768
Variance: 1225.4395
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=33)]

Common values

Value | Count | Frequency (%)
9 | 4 | 6.7%
22 | 4 | 6.7%
35 | 3 | 5.0%
14 | 3 | 5.0%
19 | 3 | 5.0%
20 | 3 | 5.0%
12 | 3 | 5.0%
26 | 2 | 3.3%
40 | 2 | 3.3%
41 | 2 | 3.3%
Other values (23) | 31 | 51.7%

Minimum 10 values

Value | Count | Frequency (%)
9 | 4 | 6.7%
10 | 2 | 3.3%
12 | 3 | 5.0%
13 | 1 | 1.7%
14 | 3 | 5.0%
15 | 2 | 3.3%
16 | 2 | 3.3%
17 | 1 | 1.7%
18 | 1 | 1.7%
19 | 3 | 5.0%

Maximum 10 values

Value | Count | Frequency (%)
281 | 1 | 1.7%
52 | 1 | 1.7%
51 | 1 | 1.7%
49 | 1 | 1.7%
45 | 1 | 1.7%
42 | 1 | 1.7%
41 | 2 | 3.3%
40 | 2 | 3.3%
38 | 2 | 3.3%
36 | 2 | 3.3%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

Variable | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
업종명 | 1.000 | 1.000 | 1.000 | 0.056 | 0.000 | 0.000
업소명 | 1.000 | 1.000 | 1.000 | 0.940 | 0.992 | 1.000
영업소 주소(도로명) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
우편번호 | 0.056 | 0.940 | 1.000 | 1.000 | 1.000 | 0.000
소재지전화 | 0.000 | 0.992 | 1.000 | 1.000 | 1.000 | 1.000
객실수 | 0.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000

Variable | 우편번호 | 객실수 | 업종명
우편번호 | 1.000 | -0.134 | 0.046
객실수 | -0.134 | 1.000 | 0.000
업종명 | 0.046 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
0 | 숙박업(일반) | 우암여관 | 부산광역시 남구 우암번영로14번길 14 (우암동) | 48479 | 051-646-3975 | 9
1 | 숙박업(일반) | 선샤인 하우스 | 부산광역시 남구 전포대로 106-2 (문현동) | 48411 | 051-633-9297 | 9
2 | 숙박업(일반) | 풍조장여관 | 부산광역시 남구 수영로 159 (대연동) | 48453 | 051-639-0011 | 12
3 | 숙박업(일반) | 범일 여인숙 | 부산광역시 남구 남동천로 56-6 (문현동) | 48402 | 051-633-7174 | 15
4 | 숙박업(일반) | 남일 여인숙 | 부산광역시 남구 못골로 92-1 (대연동) | 48445 | 051-627-1144 | 14
5 | 숙박업(일반) | 문현 모텔 | 부산광역시 남구 지게골로10번길 5 (문현동) | 48476 | 051-646-1745 | 10
6 | 숙박업(일반) | 루이모텔 | 부산광역시 남구 유엔평화로4번길 74, 모텔루이 (대연동) | 48492 | 051-634-0011 | 14
7 | 숙박업(일반) | 성운장 | 부산광역시 남구 용호로 202 (용호동) | 48568 | 051-622-6688 | 19
8 | 숙박업(일반) | 은하모텔 | 부산광역시 남구 용호로 98 (용호동) | 48523 | 051-623-9663 | 19
9 | 숙박업(일반) | 오륙도여관 | 부산광역시 남구 신선로319번길 5 (용당동) | 48548 | 051-626-4276 | 9

Last rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호 | 소재지전화 | 객실수
50 | 숙박업(일반) | 호텔 온나(ONNA) | 부산광역시 남구 유엔평화로4번길 5 (대연동) | 48493 | 051-623-7995 | 38
51 | 숙박업(일반) | 버킹엄 모텔 | 부산광역시 남구 용호로 170 (용호동) | 48567 | 051-622-6466 | 30
52 | 숙박업(일반) | 브이모텔 | 부산광역시 남구 자유평화로60번길 85 (문현동) | 48402 | 051-645-0154 | 27
53 | 숙박업(일반) | 시카고모텔 | 부산광역시 남구 동명로132번길 21 (용호동) | 48567 | 051-621-0002 | 38
54 | 숙박업(일반) | 보브호텔 | 부산광역시 남구 자유평화로60번길 90 (문현동) | 48402 | 051-633-0131 | 36
55 | 숙박업(일반) | 이데아호텔 | 부산광역시 남구 유엔평화로 35 (대연동) | 48505 | 051-611-6643 | 52
56 | 숙박업(일반) | 지앤지(G&G)관광호텔 | 부산광역시 남구 유엔평화로4번길 34 (대연동) | 48492 | 051-626-7723 | 41
57 | 숙박업(일반) | 아바니 센트럴 부산 | 부산광역시 남구 전포대로 133, 1,3,5,19~35,37층 (문현동) | 48400 | 051-791-5800 | 281
58 | 숙박업(생활) | 우성빌 | 부산광역시 남구 못골로 94-1 (대연동) | 48445 | 051-627-1144 | 9
59 | 숙박업(생활) | 장미하우스 | 부산광역시 남구 수영로250번길 11-5 (대연동) | 48497 | <NA> | 16