Overview

Dataset statistics

Number of variables: 6
Number of observations: 355
Missing cells: 184
Missing cells (%): 8.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.8 KiB
Average record size in memory: 48.4 B
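
The missing-cell percentage can be checked from the counts above. This is a small sketch, assuming the percentage is the number of missing cells divided by the total number of cells (variables times observations); the variable names are illustrative only.

```python
# Counts copied from the Dataset statistics above.
n_variables = 6
n_observations = 355
missing_cells = 184

total_cells = n_variables * n_observations        # 2130 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{missing_pct:.1f}%")
```

Under that assumption the result rounds to the 8.6% shown in the report.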

Variable types

Categorical: 3
Text: 3

Dataset

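Description: Data on general travel agencies (comprehensive travel agencies) located in Jeju Special Self-Governing Province, providing information such as business name, location, and contact number.
URL: https://www.data.go.kr/data/15056289/fileData.do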

Alerts

High correlation: 데이터기준일자 is highly overall correlated with 구분 and 1 other field
High correlation: 구분 is highly overall correlated with 비고 and 1 other field
High correlation: 비고 is highly overall correlated with 구분 and 1 other field
Imbalance: 구분 is highly imbalanced (97.2%)
Imbalance: 비고 is highly imbalanced (92.6%)
Imbalance: 데이터기준일자 is highly imbalanced (97.2%)
Missing: 연락처 has 182 (51.3%) missing values
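
The report does not say how the imbalance percentages are derived. The sketch below assumes the score is one minus the Shannon entropy of the value counts, normalized by the maximum entropy for that number of classes. That formula is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class distribution and k is the number of classes (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance; treat as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts taken from the per-variable tables in this report.
print(f"{imbalance_score([354, 1]):.1%}")     # 구분 and 데이터기준일자
print(f"{imbalance_score([350, 4, 1]):.1%}")  # 비고
```

With these counts the sketch yields 97.2% and 92.6%, which agrees with the alerts above.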

Reproduction

Analysis started: 2023-12-12 05:07:48.476698
Analysis finished: 2023-12-12 05:07:49.366455
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB

Value counts: 종합여행업 354, <NA> 1

Length

Max length: 5
Median length: 5
Mean length: 4.9971831
Min length: 4
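
These length statistics are consistent with the single missing entry being counted as the 4-character string `<NA>`. The sketch below assumes that lengths are measured on the string form of each value, and recomputes them from the value counts.

```python
# Value counts for 구분; the missing entry is assumed to be stored as the
# literal string "<NA>" (an assumption inferred from Min length = 4).
value_counts = {"종합여행업": 354, "<NA>": 1}

total = sum(value_counts.values())
mean_length = sum(len(v) * n for v, n in value_counts.items()) / total
min_length = min(len(v) for v in value_counts)
max_length = max(len(v) for v in value_counts)

print(round(mean_length, 7), min_length, max_length)
```

The weighted mean is (354 × 5 + 1 × 4) / 355, which rounds to the reported 4.9971831.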

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 종합여행업
2nd row: 종합여행업
3rd row: 종합여행업
4th row: 종합여행업
5th row: 종합여행업

Common Values

Value | Count | Frequency (%)
종합여행업 | 354 | 99.7%
<NA> | 1 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values chart]
Value | Count | Frequency (%)
종합여행업 | 354 | 99.7%
na | 1 | 0.3%

상호명
Text

Distinct: 354
Distinct (%): 100.0%
Missing: 1
Missing (%): 0.3%
Memory size: 2.9 KiB

Length

Max length: 19
Median length: 15
Mean length: 8.9971751
Min length: 3

Characters and Unicode

Total characters: 3185
Distinct characters: 336
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
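
As an illustration of those character properties, the following sketch uses Python's standard `unicodedata` module to count general categories in two business names from the sample rows. The helper name `category_counts` is hypothetical. Category codes map to the labels used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two business names taken from the sample rows of this report.
print(category_counts("(주)제주몬"))
print(category_counts("주식회사 카카오"))
```

Summing such counts over every value in a column gives totals of the kind listed under "Most occurring categories".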

Unique

Unique: 354
Unique (%): 100.0%

Sample

1st row: 주식회사 카카오
2nd row: (주)해여여행사
3rd row: (주)제주몬
4th row: 주식회사 제주오라투어
5th row: 주식회사 헤이스타즈

Value | Count | Frequency (%)
주식회사 | 143 | 27.3%
투어 | 6 | 1.1%
유한회사 | 3 | 0.6%
(not recovered) | 3 | 0.6%
여행사 | 2 | 0.4%
제주 | 2 | 0.4%
주)리앤최투어 | 1 | 0.2%
신화한국여행사 | 1 | 0.2%
골프부킹황제 | 1 | 0.2%
주)투어웨이 | 1 | 0.2%
Other values (360) | 360 | 68.8%

Most occurring characters

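Value | Count | Frequency (%)
(not recovered) | 358 | 11.2%
(not recovered) | 242 | 7.6%
(space) | 170 | 5.3%
(not recovered) | 150 | 4.7%
( | 149 | 4.7%
) | 149 | 4.7%
(not recovered) | 146 | 4.6%
(not recovered) | 126 | 4.0%
(not recovered) | 112 | 3.5%
(not recovered) | 110 | 3.5%
Other values (326) | 1473 | 46.2%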

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2682 | 84.2%
Space Separator | 170 | 5.3%
Open Punctuation | 149 | 4.7%
Close Punctuation | 149 | 4.7%
Lowercase Letter | 14 | 0.4%
Other Symbol | 10 | 0.3%
Uppercase Letter | 6 | 0.2%
Decimal Number | 4 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

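Other Letter
Value | Count | Frequency (%)
(not recovered) | 358 | 13.3%
(not recovered) | 242 | 9.0%
(not recovered) | 150 | 5.6%
(not recovered) | 146 | 5.4%
(not recovered) | 126 | 4.7%
(not recovered) | 112 | 4.2%
(not recovered) | 110 | 4.1%
(not recovered) | 109 | 4.1%
(not recovered) | 109 | 4.1%
(not recovered) | 64 | 2.4%
Other values (302) | 1156 | 43.1%

Lowercase Letter
Value | Count | Frequency (%)
r | 3 | 21.4%
o | 3 | 21.4%
e | 2 | 14.3%
v | 1 | 7.1%
a | 1 | 7.1%
u | 1 | 7.1%
c | 1 | 7.1%
s | 1 | 7.1%
i | 1 | 7.1%

Uppercase Letter
Value | Count | Frequency (%)
K | 1 | 16.7%
G | 1 | 16.7%
A | 1 | 16.7%
O | 1 | 16.7%
T | 1 | 16.7%
D | 1 | 16.7%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 25.0%
2 | 1 | 25.0%
4 | 1 | 25.0%
8 | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 170 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 149 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 149 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not recovered) | 10 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%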

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2692 | 84.5%
Common | 473 | 14.9%
Latin | 20 | 0.6%

Most frequent character per script

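Hangul
Value | Count | Frequency (%)
(not recovered) | 358 | 13.3%
(not recovered) | 242 | 9.0%
(not recovered) | 150 | 5.6%
(not recovered) | 146 | 5.4%
(not recovered) | 126 | 4.7%
(not recovered) | 112 | 4.2%
(not recovered) | 110 | 4.1%
(not recovered) | 109 | 4.0%
(not recovered) | 109 | 4.0%
(not recovered) | 64 | 2.4%
Other values (303) | 1166 | 43.3%

Latin
Value | Count | Frequency (%)
r | 3 | 15.0%
o | 3 | 15.0%
e | 2 | 10.0%
v | 1 | 5.0%
K | 1 | 5.0%
a | 1 | 5.0%
G | 1 | 5.0%
A | 1 | 5.0%
O | 1 | 5.0%
T | 1 | 5.0%
Other values (5) | 5 | 25.0%

Common
Value | Count | Frequency (%)
(space) | 170 | 35.9%
( | 149 | 31.5%
) | 149 | 31.5%
- | 1 | 0.2%
3 | 1 | 0.2%
2 | 1 | 0.2%
4 | 1 | 0.2%
8 | 1 | 0.2%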

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2682 | 84.2%
ASCII | 493 | 15.5%
None | 10 | 0.3%

Most frequent character per block

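Hangul
Value | Count | Frequency (%)
(not recovered) | 358 | 13.3%
(not recovered) | 242 | 9.0%
(not recovered) | 150 | 5.6%
(not recovered) | 146 | 5.4%
(not recovered) | 126 | 4.7%
(not recovered) | 112 | 4.2%
(not recovered) | 110 | 4.1%
(not recovered) | 109 | 4.1%
(not recovered) | 109 | 4.1%
(not recovered) | 64 | 2.4%
Other values (302) | 1156 | 43.1%

ASCII
Value | Count | Frequency (%)
(space) | 170 | 34.5%
( | 149 | 30.2%
) | 149 | 30.2%
r | 3 | 0.6%
o | 3 | 0.6%
e | 2 | 0.4%
- | 1 | 0.2%
3 | 1 | 0.2%
2 | 1 | 0.2%
4 | 1 | 0.2%
Other values (13) | 13 | 2.6%

None
Value | Count | Frequency (%)
(not recovered) | 10 | 100.0%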

소재지
Text

Distinct: 294
Distinct (%): 83.1%
Missing: 1
Missing (%): 0.3%
Memory size: 2.9 KiB

Length

Max length: 37
Median length: 31
Mean length: 23.596045
Min length: 18

Characters and Unicode

Total characters: 8353
Distinct characters: 208
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 264
Unique (%): 74.6%

Sample

1st row: 제주특별자치도 제주시 첨단로 242
2nd row: 제주특별자치도 제주시 신대로 70 4층
3rd row: 제주특별자치도 제주시 월성로4길 76-2 비동
4th row: 제주특별자치도 제주시 연동13길 5 도호빌딩 2층
5th row: 제주특별자치도 제주시 중앙로 217 제주벤처마루 3층

Value | Count | Frequency (%)
제주특별자치도 | 354 | 20.6%
제주시 | 334 | 19.4%
2층 | 73 | 4.2%
1층 | 39 | 2.3%
3층 | 34 | 2.0%
연북로 | 29 | 1.7%
도령로 | 28 | 1.6%
92 | 22 | 1.3%
진현빌딩 | 22 | 1.3%
서귀포시 | 20 | 1.2%
Other values (423) | 767 | 44.5%

Most occurring characters

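Value | Count | Frequency (%)
(space) | 1543 | 18.5%
(not recovered) | 707 | 8.5%
(not recovered) | 696 | 8.3%
(not recovered) | 390 | 4.7%
(not recovered) | 355 | 4.2%
(not recovered) | 354 | 4.2%
(not recovered) | 354 | 4.2%
(not recovered) | 354 | 4.2%
(not recovered) | 354 | 4.2%
(not recovered) | 284 | 3.4%
Other values (198) | 2962 | 35.5%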

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5513 | 66.0%
Space Separator | 1543 | 18.5%
Decimal Number | 1228 | 14.7%
Dash Punctuation | 56 | 0.7%
Uppercase Letter | 13 | 0.2%

Most frequent character per category

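Other Letter
Value | Count | Frequency (%)
(not recovered) | 707 | 12.8%
(not recovered) | 696 | 12.6%
(not recovered) | 390 | 7.1%
(not recovered) | 355 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 284 | 5.2%
(not recovered) | 172 | 3.1%
Other values (181) | 1493 | 27.1%

Decimal Number
Value | Count | Frequency (%)
1 | 267 | 21.7%
2 | 257 | 20.9%
3 | 133 | 10.8%
5 | 103 | 8.4%
4 | 101 | 8.2%
6 | 87 | 7.1%
7 | 84 | 6.8%
9 | 77 | 6.3%
0 | 68 | 5.5%
8 | 51 | 4.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 4 | 30.8%
B | 4 | 30.8%
C | 3 | 23.1%
J | 1 | 7.7%
I | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 1543 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 56 | 100.0%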

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5513 | 66.0%
Common | 2827 | 33.8%
Latin | 13 | 0.2%

Most frequent character per script

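Hangul
Value | Count | Frequency (%)
(not recovered) | 707 | 12.8%
(not recovered) | 696 | 12.6%
(not recovered) | 390 | 7.1%
(not recovered) | 355 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 284 | 5.2%
(not recovered) | 172 | 3.1%
Other values (181) | 1493 | 27.1%

Common
Value | Count | Frequency (%)
(space) | 1543 | 54.6%
1 | 267 | 9.4%
2 | 257 | 9.1%
3 | 133 | 4.7%
5 | 103 | 3.6%
4 | 101 | 3.6%
6 | 87 | 3.1%
7 | 84 | 3.0%
9 | 77 | 2.7%
0 | 68 | 2.4%
Other values (2) | 107 | 3.8%

Latin
Value | Count | Frequency (%)
S | 4 | 30.8%
B | 4 | 30.8%
C | 3 | 23.1%
J | 1 | 7.7%
I | 1 | 7.7%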

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5513 | 66.0%
ASCII | 2840 | 34.0%

Most frequent character per block

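ASCII
Value | Count | Frequency (%)
(space) | 1543 | 54.3%
1 | 267 | 9.4%
2 | 257 | 9.0%
3 | 133 | 4.7%
5 | 103 | 3.6%
4 | 101 | 3.6%
6 | 87 | 3.1%
7 | 84 | 3.0%
9 | 77 | 2.7%
0 | 68 | 2.4%
Other values (7) | 120 | 4.2%

Hangul
Value | Count | Frequency (%)
(not recovered) | 707 | 12.8%
(not recovered) | 696 | 12.6%
(not recovered) | 390 | 7.1%
(not recovered) | 355 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 354 | 6.4%
(not recovered) | 284 | 5.2%
(not recovered) | 172 | 3.1%
Other values (181) | 1493 | 27.1%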

연락처
Text

Alerts: Missing

Distinct: 172
Distinct (%): 99.4%
Missing: 182
Missing (%): 51.3%
Memory size: 2.9 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.99422
Min length: 9

Characters and Unicode

Total characters: 2075
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 171
Unique (%): 98.8%
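
The difference between "Distinct" and "Unique" can be sketched as follows, assuming that distinct counts the different values present, unique counts the values that occur exactly once, and both percentages are taken over non-missing rows. The helper `distinct_and_unique` and the `demo` list are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing (None) entries."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once
    return distinct, unique

# Hypothetical mini-column: one repeated number, two singletons, one missing.
demo = ["064-742-8889", "064-742-8889", "1899-1326", "1800-7584", None]
print(distinct_and_unique(demo))

# Percentages relative to the 355 - 182 = 173 non-missing rows of 연락처.
non_missing = 355 - 182
print(f"{172 / non_missing:.1%}", f"{171 / non_missing:.1%}")
```

Under these assumptions, the one number that appears twice accounts for the gap between 172 distinct and 171 unique values, and the two percentages come out as 99.4% and 98.8%.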

Sample

1st row: 1899-1326
2nd row: 1800-7584
3rd row: 1544-5899
4th row: 070-7780-2300
5th row: 070-7773-9902

Value | Count | Frequency (%)
064-742-8889 | 2 | 1.2%
064-729-8251 | 1 | 0.6%
064-725-0108 | 1 | 0.6%
064-740-7800 | 1 | 0.6%
064-738-2888 | 1 | 0.6%
064-735-5511 | 1 | 0.6%
064-733-4271 | 1 | 0.6%
064-731-7900 | 1 | 0.6%
064-731-4112 | 1 | 0.6%
064-730-1100 | 1 | 0.6%
Other values (162) | 162 | 93.6%
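
The sample rows and frequent values of 연락처 show three shapes of number, whose lengths (9, 12, and 13 characters) line up with the reported minimum, median, and maximum. The sketch below classifies a value by shape. The patterns and labels are assumptions inferred from the visible values only, and other formats may exist in the full column.

```python
import re

# Hypothetical patterns inferred from values visible in this report.
PATTERNS = {
    "4-4 digits": re.compile(r"\d{4}-\d{4}"),         # e.g. 1899-1326
    "070 prefix": re.compile(r"070-\d{4}-\d{4}"),     # e.g. 070-7780-2300
    "064 prefix": re.compile(r"064-\d{3}-\d{4}"),     # e.g. 064-742-8889
}

def classify(number):
    """Return the label of the first pattern that matches the whole string."""
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(number):
            return label
    return "other"

for value in ["1899-1326", "070-7780-2300", "064-742-8889"]:
    print(value, len(value), classify(value))
```

A tally of these labels over the column would show how the formats are distributed, and any value labelled "other" would be a candidate for manual review.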

Most occurring characters

Value | Count | Frequency (%)
- | 343 | 16.5%
0 | 313 | 15.1%
4 | 289 | 13.9%
7 | 245 | 11.8%
6 | 230 | 11.1%
1 | 138 | 6.7%
8 | 126 | 6.1%
2 | 122 | 5.9%
5 | 95 | 4.6%
9 | 87 | 4.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1732 | 83.5%
Dash Punctuation | 343 | 16.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 313 | 18.1%
4 | 289 | 16.7%
7 | 245 | 14.1%
6 | 230 | 13.3%
1 | 138 | 8.0%
8 | 126 | 7.3%
2 | 122 | 7.0%
5 | 95 | 5.5%
9 | 87 | 5.0%
3 | 87 | 5.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 343 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2075 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 343 | 16.5%
0 | 313 | 15.1%
4 | 289 | 13.9%
7 | 245 | 11.8%
6 | 230 | 11.1%
1 | 138 | 6.7%
8 | 126 | 6.1%
2 | 122 | 5.9%
5 | 95 | 4.6%
9 | 87 | 4.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2075 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 343 | 16.5%
0 | 313 | 15.1%
4 | 289 | 13.9%
7 | 245 | 11.8%
6 | 230 | 11.1%
1 | 138 | 6.7%
8 | 126 | 6.1%
2 | 122 | 5.9%
5 | 95 | 4.6%
9 | 87 | 4.2%

비고
Categorical

Alerts: High correlation, Imbalance

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB

Value counts: <NA> 350, 휴업중 4, 전입 1

Length

Max length: 4
Median length: 4
Mean length: 3.9830986
Min length: 2

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 350 | 98.6%
휴업중 | 4 | 1.1%
전입 | 1 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values chart]
Value | Count | Frequency (%)
na | 350 | 98.6%
휴업중 | 4 | 1.1%
전입 | 1 | 0.3%

데이터기준일자
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB

Value counts: 2023-03-31 354, <NA> 1

Length

Max length: 10
Median length: 10
Mean length: 9.9830986
Min length: 4

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 2023-03-31
2nd row: 2023-03-31
3rd row: 2023-03-31
4th row: 2023-03-31
5th row: 2023-03-31

Common Values

Value | Count | Frequency (%)
2023-03-31 | 354 | 99.7%
<NA> | 1 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values chart]
Value | Count | Frequency (%)
2023-03-31 | 354 | 99.7%
na | 1 | 0.3%

Correlations

[Figure: correlation heatmap]
Variable | 비고
비고 | 1.000

[Figure: correlation heatmap]
Variable | 데이터기준일자 | 구분 | 비고
데이터기준일자 | 1.000 | 1.000 | 1.000
구분 | 1.000 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | 구분 | 비고 | 데이터기준일자
구분 | 1.000 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000

Missing values

[Figure: missing-value count per column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
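
Nullity correlation can be illustrated with a short sketch. It assumes the measure is the Pearson correlation between the missing-value indicators of two columns; the report does not state the formula, and the helper name and toy columns below are hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined if a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns, not taken from the dataset.
phone = ["064-000-0000", None, None, "070-0000-0000"]
note = ["a", None, None, "b"]    # missing in exactly the same rows as phone
name = ["x", "y", None, "z"]     # missing in only one of those rows
print(nullity_correlation(phone, note))
print(nullity_correlation(phone, name))
```

A value near 1 means two columns tend to be missing together, a value near 0 means their gaps are unrelated, and a negative value means one tends to be present when the other is missing.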

Sample

index | 구분 | 상호명 | 소재지 | 연락처 | 비고 | 데이터기준일자
0 | 종합여행업 | 주식회사 카카오 | 제주특별자치도 제주시 첨단로 242 | 1899-1326 | <NA> | 2023-03-31
1 | 종합여행업 | (주)해여여행사 | 제주특별자치도 제주시 신대로 70 4층 | 1800-7584 | <NA> | 2023-03-31
2 | 종합여행업 | (주)제주몬 | 제주특별자치도 제주시 월성로4길 76-2 비동 | 1544-5899 | <NA> | 2023-03-31
3 | 종합여행업 | 주식회사 제주오라투어 | 제주특별자치도 제주시 연동13길 5 도호빌딩 2층 | 070-7780-2300 | <NA> | 2023-03-31
4 | 종합여행업 | 주식회사 헤이스타즈 | 제주특별자치도 제주시 중앙로 217 제주벤처마루 3층 | 070-7773-9902 | <NA> | 2023-03-31
5 | 종합여행업 | 주식회사 로드투어 | 제주특별자치도 제주시 서광로27길 2 | 070-7723-2684 | <NA> | 2023-03-31
6 | 종합여행업 | (주)신정중한국제여행사 | 제주특별자치도 제주시 신대로 70 4층 | 070-4523-8779 | <NA> | 2023-03-31
7 | 종합여행업 | (주)대경미디어 | 제주특별자치도 제주시 도령로 128 삼무파크빌 | 070-4333-1613 | <NA> | 2023-03-31
8 | 종합여행업 | 주식회사 엘에스컴퍼니 | 제주특별자치도 제주시 청사로 11 4층 | 070-4263-1069 | <NA> | 2023-03-31
9 | 종합여행업 | 인제주여행사 | 제주특별자치도 제주시 용담로 17 1층 | 070-4105-3580 | <NA> | 2023-03-31

index | 구분 | 상호명 | 소재지 | 연락처 | 비고 | 데이터기준일자
345 | 종합여행업 | 주식회사 더제인투어 | 제주특별자치도 제주시 광양6길 2 모던테라스 | <NA> | <NA> | 2023-03-31
346 | 종합여행업 | (주)더존관광개발 | 제주특별자치도 제주시 고마로9길 15 2층 | <NA> | <NA> | 2023-03-31
347 | 종합여행업 | (유)아주관광 | 제주특별자치도 제주시 고마로 21 | <NA> | <NA> | 2023-03-31
348 | 종합여행업 | 루카스 트레블 | 제주특별자치도 제주시 고마로 115 3층 | <NA> | <NA> | 2023-03-31
349 | 종합여행업 | 한국해양방송 주식회사 | 제주특별자치도 서귀포시 표선면 번영로 2524 | <NA> | <NA> | 2023-03-31
350 | 종합여행업 | 쿰다투어 | 제주특별자치도 서귀포시 표선면 번영로 2454 | <NA> | <NA> | 2023-03-31
351 | 종합여행업 | 화웨이㈜ | 제주특별자치도 서귀포시 이어도로 67 | <NA> | <NA> | 2023-03-31
352 | 종합여행업 | 주식회사 여행원 | 제주특별자치도 서귀포시 안덕면 화순로142번길 23 | <NA> | 전입 | 2023-03-31
353 | 종합여행업 | ㈜지유투어 | 제주특별자치도 서귀포시 신서귀로 97번길 51 | <NA> | <NA> | 2023-03-31
354 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>