Overview

Dataset statistics

Number of variables: 6
Number of observations: 250
Missing cells: 119
Missing cells (%): 7.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.1 KiB
Average record size in memory: 49.5 B
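
The derived figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Raw counts copied from the dataset statistics above.
n_variables = 6
n_observations = 250
missing_cells = 119
total_size_kib = 12.1  # already rounded in the report

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The last figure comes out slightly above the reported 49.5 B because the 12.1 KiB input is itself a rounded value.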

Variable types

Text: 3
Numeric: 1
Categorical: 2

Dataset

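Description: Provides information on the status of travel businesses registered in Busanjin-gu, Busan Metropolitan City. It includes the business type, business name, location, and related details.
Author: Busanjin-gu, Busan Metropolitan City
URL: https://www.data.go.kr/data/15025550/fileData.do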

Alerts

[High correlation] 분류 is highly overall correlated with Unnamed: 5
[High correlation] Unnamed: 5 is highly overall correlated with 우편번호 and 1 other fields
[High correlation] 우편번호 is highly overall correlated with Unnamed: 5
[Imbalance] Unnamed: 5 is highly imbalanced (88.2%)
[Missing] 전화번호 has 119 (47.6%) missing values
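
The two percentages in the alerts can be reproduced from counts given elsewhere in this report. The sketch below assumes the imbalance figure is one minus the Shannon entropy of the class shares, normalized by its maximum; that reading matches the reported 88.2%, but the function name and formula here are an illustration, not a quotation of the tool's code:

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    max_entropy = math.log2(len(counts))
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

# Class counts for "Unnamed: 5" as listed in its variable section:
# 246 rows shown as <NA> and 4 rows with the value 휴업.
unnamed5_score = imbalance_score([246, 4])

# Missing share for 전화번호: 119 missing out of 250 observations.
phone_missing_pct = 100 * 119 / 250

print(f"imbalance: {unnamed5_score:.1%}")
print(f"missing: {phone_missing_pct:.1f}%")
```

A score near 100% means almost every row falls in a single class, which is the case here.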

Reproduction

Analysis started: 2023-12-12 14:56:12.443576
Analysis finished: 2023-12-12 14:56:13.144808
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호
Text

Distinct: 225
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 21
Median length: 17
Mean length: 7.976
Min length: 2

Characters and Unicode

Total characters: 1994
Distinct characters: 310
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
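
As a rough illustration of the category breakdown, the sketch below counts Unicode general categories with Python's standard `unicodedata` module over the five sample rows listed for 상호. It is not the report's own implementation; the helper name `category_counts` is hypothetical. The standard library does not expose script or block names, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample rows listed for 상호 in this report.
sample = [
    "(주)신세기고속관광",
    "동아여행사",
    "아름다운 동서여행(주)",
    "세븐투어(주)",
    "남성관광여행사",
]

counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the names used in the report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Zs` is Space Separator.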

Unique

Unique: 200
Unique (%): 80.0%

Sample

1st row: (주)신세기고속관광
2nd row: 동아여행사
3rd row: 아름다운 동서여행(주)
4th row: 세븐투어(주)
5th row: 남성관광여행사
Value | Count | Frequency (%)
주식회사 | 9 | 3.0%
투어 | 3 | 1.0%
tour | 3 | 1.0%
여행사 | 3 | 1.0%
the | 3 | 1.0%
엔젤여행사 | 2 | 0.7%
성운투어여행사 | 2 | 0.7%
라이프t | 2 | 0.7%
하나프리미어여행(주 | 2 | 0.7%
한날애커뮤니케이션 | 2 | 0.7%
Other values (244) | 266 | 89.6%

Most occurring characters

ValueCountFrequency (%)
135
 
6.8%
( 134
 
6.7%
) 134
 
6.7%
94
 
4.7%
94
 
4.7%
87
 
4.4%
84
 
4.2%
72
 
3.6%
47
 
2.4%
41
 
2.1%
Other values (300) 1072
53.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1561 | 78.3%
Open Punctuation | 134 | 6.7%
Close Punctuation | 134 | 6.7%
Uppercase Letter | 70 | 3.5%
Space Separator | 47 | 2.4%
Lowercase Letter | 43 | 2.2%
Other Symbol | 2 | 0.1%
Decimal Number | 2 | 0.1%
Other Punctuation | 1 | 0.1%
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
135
 
8.6%
94
 
6.0%
94
 
6.0%
87
 
5.6%
84
 
5.4%
72
 
4.6%
41
 
2.6%
31
 
2.0%
23
 
1.5%
22
 
1.4%
Other values (256) 878
56.2%
Uppercase Letter
ValueCountFrequency (%)
T 14
20.0%
D 5
 
7.1%
E 5
 
7.1%
R 5
 
7.1%
O 5
 
7.1%
H 5
 
7.1%
A 4
 
5.7%
P 3
 
4.3%
U 3
 
4.3%
I 3
 
4.3%
Other values (10) 18
25.7%
Lowercase Letter
ValueCountFrequency (%)
o 7
16.3%
r 7
16.3%
u 4
9.3%
i 4
9.3%
a 4
9.3%
p 3
7.0%
n 2
 
4.7%
t 2
 
4.7%
e 2
 
4.7%
b 1
 
2.3%
Other values (7) 7
16.3%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 134
100.0%
Close Punctuation
ValueCountFrequency (%)
) 134
100.0%
Space Separator
ValueCountFrequency (%)
47
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1556 | 78.0%
Common | 318 | 15.9%
Latin | 113 | 5.7%
Han | 7 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
135
 
8.7%
94
 
6.0%
94
 
6.0%
87
 
5.6%
84
 
5.4%
72
 
4.6%
41
 
2.6%
31
 
2.0%
23
 
1.5%
22
 
1.4%
Other values (250) 873
56.1%
Latin
ValueCountFrequency (%)
T 14
 
12.4%
o 7
 
6.2%
r 7
 
6.2%
D 5
 
4.4%
E 5
 
4.4%
R 5
 
4.4%
O 5
 
4.4%
H 5
 
4.4%
u 4
 
3.5%
i 4
 
3.5%
Other values (27) 52
46.0%
Han
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
Common
ValueCountFrequency (%)
( 134
42.1%
) 134
42.1%
47
 
14.8%
1 1
 
0.3%
2 1
 
0.3%
. 1
 
0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1554 | 77.9%
ASCII | 431 | 21.6%
CJK | 7 | 0.4%
None | 2 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
135
 
8.7%
94
 
6.0%
94
 
6.0%
87
 
5.6%
84
 
5.4%
72
 
4.6%
41
 
2.6%
31
 
2.0%
23
 
1.5%
22
 
1.4%
Other values (249) 871
56.0%
ASCII
ValueCountFrequency (%)
( 134
31.1%
) 134
31.1%
47
 
10.9%
T 14
 
3.2%
o 7
 
1.6%
r 7
 
1.6%
D 5
 
1.2%
E 5
 
1.2%
R 5
 
1.2%
O 5
 
1.2%
Other values (33) 68
15.8%
None
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 76
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49534.192
Minimum: 47102
Maximum: 614845
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 47102
5-th percentile: 47181.5
Q1: 47220
Median: 47257
Q3: 47295
95-th percentile: 47366
Maximum: 614845
Range: 567743
Interquartile range (IQR): 75

Descriptive statistics

Standard deviation: 35897.031
Coefficient of variation (CV): 0.72469196
Kurtosis: 249.99865
Mean: 49534.192
Median Absolute Deviation (MAD): 38
Skewness: 15.811324
Sum: 12383548
Variance: 1.2885968 × 10^9
Monotonicity: Not monotonic
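
Several of these statistics can be cross-checked against one another, and doing so shows how strongly the single six-digit maximum (614845, possibly a legacy postal code format) drives the mean and spread. The sketch below uses only values printed in this section. The 1.5 × IQR fence is Tukey's common rule of thumb, an assumption here rather than something the report applies.

```python
# Summary values copied from the 우편번호 statistics in this report.
n = 250
total = 12383548
minimum, q1, median, q3, maximum = 47102, 47220, 47257, 47295, 614845
std = 35897.031

mean = total / n              # should match the reported mean
value_range = maximum - minimum
iqr = q3 - q1
cv = std / mean               # coefficient of variation
variance = std ** 2

# Tukey's upper fence (assumption: not part of the report itself).
upper_fence = q3 + 1.5 * iqr
max_is_outlier = maximum > upper_fence

print(mean, value_range, iqr)
print(round(cv, 4), f"{variance:.4e}")
print(upper_fence, max_is_outlier)
```

Because the median (47257) and IQR (75) are barely affected by that one value, they describe the typical postal code far better than the mean and standard deviation do.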
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
47257 | 23 | 9.2%
47209 | 13 | 5.2%
47295 | 12 | 4.8%
47216 | 12 | 4.8%
47366 | 11 | 4.4%
47247 | 10 | 4.0%
47284 | 7 | 2.8%
47353 | 6 | 2.4%
47217 | 6 | 2.4%
47289 | 6 | 2.4%
Other values (66) | 144 | 57.6%

Minimum 10 values
Value | Count | Frequency (%)
47102 | 1 | 0.4%
47103 | 1 | 0.4%
47109 | 1 | 0.4%
47113 | 1 | 0.4%
47123 | 1 | 0.4%
47124 | 1 | 0.4%
47126 | 1 | 0.4%
47146 | 3 | 1.2%
47148 | 1 | 0.4%
47175 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
614845 | 1 | 0.4%
47373 | 1 | 0.4%
47369 | 3 | 1.2%
47366 | 11 | 4.4%
47365 | 4 | 1.6%
47362 | 2 | 0.8%
47360 | 4 | 1.6%
47358 | 2 | 0.8%
47357 | 3 | 1.2%
47353 | 6 | 2.4%

소재지(도로명)
Text

Distinct: 217
Distinct (%): 86.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 49
Median length: 41
Mean length: 35.42
Min length: 23

Characters and Unicode

Total characters: 8855
Distinct characters: 205
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 189
Unique (%): 75.6%

Sample

1st row: 부산광역시 부산진구 범일로 150-1 (범천동)
2nd row: 부산광역시 부산진구 연수로 17 (양정동)
3rd row: 부산광역시 부산진구 양지로 54 (양정동, 진리관 1층)
4th row: 부산광역시 부산진구 자유평화로 11 (범천동, 누리엔 지하1층)
5th row: 부산광역시 부산진구 연수로 29 (양정동, 상가 6동 1층)
Value | Count | Frequency (%)
부산광역시 | 250 | 14.8%
부산진구 | 250 | 14.8%
부전동 | 86 | 5.1%
양정동 | 55 | 3.3%
중앙대로 | 49 | 2.9%
범천동 | 43 | 2.5%
전포동 | 33 | 2.0%
2층 | 25 | 1.5%
1층 | 23 | 1.4%
27 | 22 | 1.3%
Other values (407) | 856 | 50.6%

Most occurring characters

ValueCountFrequency (%)
1442
 
16.3%
616
 
7.0%
505
 
5.7%
1 319
 
3.6%
292
 
3.3%
, 284
 
3.2%
267
 
3.0%
256
 
2.9%
255
 
2.9%
254
 
2.9%
Other values (195) 4365
49.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5263 | 59.4%
Space Separator | 1442 | 16.3%
Decimal Number | 1302 | 14.7%
Other Punctuation | 285 | 3.2%
Close Punctuation | 250 | 2.8%
Open Punctuation | 250 | 2.8%
Uppercase Letter | 35 | 0.4%
Dash Punctuation | 27 | 0.3%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
616
 
11.7%
505
 
9.6%
292
 
5.5%
267
 
5.1%
256
 
4.9%
255
 
4.8%
254
 
4.8%
254
 
4.8%
250
 
4.8%
163
 
3.1%
Other values (161) 2151
40.9%
Uppercase Letter
ValueCountFrequency (%)
L 4
11.4%
E 4
11.4%
A 4
11.4%
T 3
8.6%
H 3
8.6%
B 2
 
5.7%
R 2
 
5.7%
K 2
 
5.7%
U 2
 
5.7%
M 2
 
5.7%
Other values (7) 7
20.0%
Decimal Number
ValueCountFrequency (%)
1 319
24.5%
2 162
12.4%
0 149
11.4%
3 132
10.1%
9 110
 
8.4%
4 108
 
8.3%
7 94
 
7.2%
6 92
 
7.1%
5 73
 
5.6%
8 63
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 284
99.6%
& 1
 
0.4%
Space Separator
ValueCountFrequency (%)
1442
100.0%
Close Punctuation
ValueCountFrequency (%)
) 250
100.0%
Open Punctuation
ValueCountFrequency (%)
( 250
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 27
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5263 | 59.4%
Common | 3557 | 40.2%
Latin | 35 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
616
 
11.7%
505
 
9.6%
292
 
5.5%
267
 
5.1%
256
 
4.9%
255
 
4.8%
254
 
4.8%
254
 
4.8%
250
 
4.8%
163
 
3.1%
Other values (161) 2151
40.9%
Common
ValueCountFrequency (%)
1442
40.5%
1 319
 
9.0%
, 284
 
8.0%
) 250
 
7.0%
( 250
 
7.0%
2 162
 
4.6%
0 149
 
4.2%
3 132
 
3.7%
9 110
 
3.1%
4 108
 
3.0%
Other values (7) 351
 
9.9%
Latin
ValueCountFrequency (%)
L 4
11.4%
E 4
11.4%
A 4
11.4%
T 3
8.6%
H 3
8.6%
B 2
 
5.7%
R 2
 
5.7%
K 2
 
5.7%
U 2
 
5.7%
M 2
 
5.7%
Other values (7) 7
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5263 | 59.4%
ASCII | 3592 | 40.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1442
40.1%
1 319
 
8.9%
, 284
 
7.9%
) 250
 
7.0%
( 250
 
7.0%
2 162
 
4.5%
0 149
 
4.1%
3 132
 
3.7%
9 110
 
3.1%
4 108
 
3.0%
Other values (24) 386
 
10.7%
Hangul
ValueCountFrequency (%)
616
 
11.7%
505
 
9.6%
292
 
5.5%
267
 
5.1%
256
 
4.9%
255
 
4.8%
254
 
4.8%
254
 
4.8%
250
 
4.8%
163
 
3.1%
Other values (161) 2151
40.9%

전화번호
Text

MISSING 

Distinct: 117
Distinct (%): 89.3%
Missing: 119
Missing (%): 47.6%
Memory size: 2.1 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.389313
Min length: 8

Characters and Unicode

Total characters: 1492
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 78.6%
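
For this column the percentages and the mean length appear to be computed over the non-missing rows only, not over all 250 observations. The sketch below checks that reading against the reported figures; the variable names are illustrative.

```python
n_observations = 250
n_missing = 119
n_present = n_observations - n_missing  # rows that have a phone number

# Counts copied from this section of the report.
distinct, unique, total_characters = 117, 103, 1492

distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_characters / n_present

print(n_present)
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")
print(f"mean length: {mean_length:.6f}")
```

Dividing by 250 instead would give 46.8% distinct, which does not match the report, so the non-missing denominator is the consistent interpretation.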

Sample

1st row: 645-6363
2nd row: 865-8300
3rd row: 051-868-9970
4th row: 051-868-1037
5th row: 070-8908-1642
Value | Count | Frequency (%)
051-513-0874 | 2 | 1.5%
070-8228-2891 | 2 | 1.5%
051-757-7166 | 2 | 1.5%
0517833070 | 2 | 1.5%
865-8300 | 2 | 1.5%
051-931-2525 | 2 | 1.5%
1800-5911 | 2 | 1.5%
645-6363 | 2 | 1.5%
051-806-0086 | 2 | 1.5%
051-462-5525 | 2 | 1.5%
Other values (107) | 111 | 84.7%

Most occurring characters

ValueCountFrequency (%)
0 255
17.1%
- 223
14.9%
5 201
13.5%
1 190
12.7%
8 141
9.5%
6 98
 
6.6%
2 96
 
6.4%
7 93
 
6.2%
3 71
 
4.8%
4 68
 
4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1269 | 85.1%
Dash Punctuation | 223 | 14.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 255
20.1%
5 201
15.8%
1 190
15.0%
8 141
11.1%
6 98
 
7.7%
2 96
 
7.6%
7 93
 
7.3%
3 71
 
5.6%
4 68
 
5.4%
9 56
 
4.4%
Dash Punctuation
ValueCountFrequency (%)
- 223
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1492
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 255
17.1%
- 223
14.9%
5 201
13.5%
1 190
12.7%
8 141
9.5%
6 98
 
6.6%
2 96
 
6.4%
7 93
 
6.2%
3 71
 
4.8%
4 68
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1492
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 255
17.1%
- 223
14.9%
5 201
13.5%
1 190
12.7%
8 141
9.5%
6 98
 
6.6%
2 96
 
6.4%
7 93
 
6.2%
3 71
 
4.8%
4 68
 
4.6%

분류
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
국내외여행업: 143
종합여행업: 59
국내여행업: 48

Length

Max length: 6
Median length: 6
Mean length: 5.572
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국내여행업
2nd row: 국내여행업
3rd row: 국내여행업
4th row: 국내여행업
5th row: 국내여행업

Common Values

Value | Count | Frequency (%)
국내외여행업 | 143 | 57.2%
종합여행업 | 59 | 23.6%
국내여행업 | 48 | 19.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
국내외여행업 | 143 | 57.2%
종합여행업 | 59 | 23.6%
국내여행업 | 48 | 19.2%

Unnamed: 5
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
<NA>: 246
휴업: 4

Length

Max length: 4
Median length: 4
Mean length: 3.968
Min length: 2
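
The length statistics for this column are consistent with the placeholder `<NA>` being measured as a literal four-character string, not as a missing value, which would also explain why the column reports 0 missing. The sketch below reproduces the mean length under that assumption and applies the same calculation to 분류; the helper name `mean_length` is hypothetical.

```python
def mean_length(value_counts):
    """Mean string length, weighted by how often each value occurs."""
    total = sum(value_counts.values())
    weighted = sum(len(value) * count for value, count in value_counts.items())
    return weighted / total

# Counts as reported; "<NA>" is treated as a literal 4-character string (assumption).
unnamed5 = {"<NA>": 246, "휴업": 4}
category = {"국내외여행업": 143, "종합여행업": 59, "국내여행업": 48}

unnamed5_mean = mean_length(unnamed5)
category_mean = mean_length(category)

print(f"Unnamed: 5 mean length: {unnamed5_mean:.3f}")
print(f"분류 mean length: {category_mean:.3f}")
```

If the 246 placeholder rows were recognized as missing, the column would contain a single observed value (휴업), so the imbalance and correlation alerts raised for it should be read with that in mind.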

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 246 | 98.4%
휴업 | 4 | 1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 246 | 98.4%
휴업 | 4 | 1.6%

Interactions


Correlations

Variable | 우편번호 | 분류
우편번호 | 1.000 | 0.000
분류 | 0.000 | 1.000

Variable | 분류 | Unnamed: 5
분류 | 1.000 | 1.000
Unnamed: 5 | 1.000 | 1.000

Variable | 우편번호 | 분류 | Unnamed: 5
우편번호 | 1.000 | 0.000 | 1.000
분류 | 0.000 | 1.000 | 1.000
Unnamed: 5 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 상호 | 우편번호 | 소재지(도로명) | 전화번호 | 분류 | Unnamed: 5
0 | (주)신세기고속관광 | 47365 | 부산광역시 부산진구 범일로 150-1 (범천동) | 645-6363 | 국내여행업 | <NA>
1 | 동아여행사 | 47216 | 부산광역시 부산진구 연수로 17 (양정동) | 865-8300 | 국내여행업 | <NA>
2 | 아름다운 동서여행(주) | 47230 | 부산광역시 부산진구 양지로 54 (양정동, 진리관 1층) | 051-868-9970 | 국내여행업 | <NA>
3 | 세븐투어(주) | 47366 | 부산광역시 부산진구 자유평화로 11 (범천동, 누리엔 지하1층) | <NA> | 국내여행업 | <NA>
4 | 남성관광여행사 | 47216 | 부산광역시 부산진구 연수로 29 (양정동, 상가 6동 1층) | 051-868-1037 | 국내여행업 | <NA>
5 | (주)호야여행사 | 47253 | 부산광역시 부산진구 새싹로28번길 14 (부전동) | 070-8908-1642 | 국내여행업 | <NA>
6 | (주)한성여행사 | 47213 | 부산광역시 부산진구 동평로 418 (양정동) | 051-863-0025 | 국내여행업 | <NA>
7 | 오네트윅스(주) | 47246 | 부산광역시 부산진구 서전로 11, 301호 (부전동, 덕명빌딩) | 051-462-5525 | 국내여행업 | <NA>
8 | 신부산고속투어(주) | 47216 | 부산광역시 부산진구 연수로 31, 201호 (양정동) | <NA> | 국내여행업 | <NA>
9 | (주)여행담 | 47250 | 부산광역시 부산진구 중앙대로 795-1, 2층 (부전동) | <NA> | 국내여행업 | <NA>

Last rows

Index | 상호 | 우편번호 | 소재지(도로명) | 전화번호 | 분류 | Unnamed: 5
240 | 에덴투어 | 47126 | 부산광역시 부산진구 성지로83번길 42 (초읍동) | <NA> | 종합여행업 | <NA>
241 | 폴리워크 | 47229 | 부산광역시 부산진구 양지로5번길 8, 2층 (양정동) | <NA> | 종합여행업 | <NA>
242 | (주)더트래블 | 47257 | 부산광역시 부산진구 서면문화로 27, 유원 골든타워 오피스텔 724호 (부전동) | <NA> | 종합여행업 | <NA>
243 | THE 여행가자 | 47256 | 부산광역시 부산진구 서면문화로 26, 1층 (부전동) | <NA> | 종합여행업 | <NA>
244 | 쿨투어 | 47343 | 부산광역시 부산진구 신암로 66, 309호 (범천동, 범천엘에이치아파트) | <NA> | 종합여행업 | <NA>
245 | 비욘드투어 | 47209 | 부산광역시 부산진구 중앙대로989번길 54, 지하 1층 (양정동) | <NA> | 종합여행업 | <NA>
246 | 레서트 | 47289 | 부산광역시 부산진구 서면로 10, 2915호 (부전동) | <NA> | 종합여행업 | <NA>
247 | 케이투어 | 47257 | 부산광역시 부산진구 서면문화로 27, 유원 골든타워 오피스텔 921호 (부전동) | <NA> | 종합여행업 | <NA>
248 | ㈜항성국제여행사 | 47369 | 부산광역시 부산진구 신암로 9, 5층 40호 (범천동) | <NA> | 종합여행업 | <NA>
249 | ㈜글로브임펙트 | 47290 | 부산광역시 부산진구 부전로 29, KT&G 부산진지사 2층 (부전동) | <NA> | 종합여행업 | <NA>