Overview

Dataset statistics

Number of variables: 6
Number of observations: 562
Missing cells: 16
Missing cells (%): 0.5%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 27.0 KiB
Average record size in memory: 49.2 B
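
The derived figures above can be cross-checked from the raw counts. The following is a minimal sketch, assuming the report rounds to one decimal place and that percentages of cells are taken over variables × observations; the variable names are illustrative only.

```python
# Cross-check the derived figures in "Dataset statistics" from the raw counts.
n_vars = 6
n_obs = 562
missing_cells = 16
duplicate_rows = 1
total_bytes = 27.0 * 1024  # 27.0 KiB expressed in bytes

# Assumption: missing-cell percentage is relative to all cells (variables x observations).
missing_pct = 100 * missing_cells / (n_vars * n_obs)
duplicate_pct = 100 * duplicate_rows / n_obs
avg_record_bytes = total_bytes / n_obs

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

The 16 missing cells are also consistent with the per-variable Missing counts reported further down: 4 each for four variables, and 0 for the two categorical variables, which list <NA> as a category instead.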

Variable types

Numeric: 1
Categorical: 2
Text: 2
DateTime: 1

Dataset

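Description: Registration status of travel businesses located in Gwangju Metropolitan City (광주광역시) as of July 31, 2023. Travel businesses fall into three categories: domestic travel (국내여행업), domestic and overseas travel (국내외여행업), and general travel (종합여행업).
URL: https://www.data.go.kr/data/15107985/fileData.do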

Alerts

데이터기준일자 has constant value "" (Constant)
Dataset has 1 (0.2%) duplicate rows (Duplicates)
업종중분류 is highly overall correlated with 연번 and 1 other field (High correlation)
지역 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 지역 and 1 other field (High correlation)
지역 is highly imbalanced (93.9%) (Imbalance)
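
The 93.9% imbalance figure for 지역 can be reproduced from the category counts listed under that variable (558 and 4). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes; that this is the formula the profiler uses is an assumption, although it does reproduce the reported value. The function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts for 지역 as listed in this report: 광주광역시 = 558, <NA> = 4.
score = imbalance_score([558, 4])
print(f"{score:.1%}")  # prints 93.9%
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single category dominates.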

Reproduction

Analysis started: 2023-12-12 04:20:14.349720
Analysis finished: 2023-12-12 04:20:15.427326
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 558
Distinct (%): 100.0%
Missing: 4
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 279.5
Minimum: 1
Maximum: 558
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 28.85
Q1: 140.25
median: 279.5
Q3: 418.75
95-th percentile: 530.15
Maximum: 558
Range: 557
Interquartile range (IQR): 278.5

Descriptive statistics

Standard deviation: 161.225
Coefficient of variation (CV): 0.57683362
Kurtosis: -1.2
Mean: 279.5
Median Absolute Deviation (MAD): 139.5
Skewness: 0
Sum: 155961
Variance: 25993.5
Monotonicity: Strictly increasing
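
These figures are what one would expect from a plain running serial number. The sketch below assumes the 558 non-missing 연번 values are exactly the integers 1 to 558 (consistent with the reported minimum, maximum, distinct count, sum and strict monotonicity, but not stated outright in the report) and recomputes the statistics with Python's standard library. It also assumes sample variance (n - 1 denominator) and linearly interpolated quantiles.

```python
import statistics

# Assumption: the non-missing 연번 values are the integers 1..558.
values = list(range(1, 559))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean  # coefficient of variation

# method="inclusive" interpolates linearly between the observed data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 3), round(cv, 8))
print(q1, median, q3, iqr, mad)
```

Under these assumptions the recomputed values agree with the Sum, Mean, Variance, Standard deviation, CV, Q1, median, Q3, IQR and MAD entries above.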
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
376 | 1 | 0.2%
370 | 1 | 0.2%
371 | 1 | 0.2%
372 | 1 | 0.2%
373 | 1 | 0.2%
374 | 1 | 0.2%
375 | 1 | 0.2%
377 | 1 | 0.2%
385 | 1 | 0.2%
378 | 1 | 0.2%
Other values (548) | 548 | 97.5%
(Missing) | 4 | 0.7%

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Value | Count | Frequency (%)
558 | 1 | 0.2%
557 | 1 | 0.2%
556 | 1 | 0.2%
555 | 1 | 0.2%
554 | 1 | 0.2%
553 | 1 | 0.2%
552 | 1 | 0.2%
551 | 1 | 0.2%
550 | 1 | 0.2%
549 | 1 | 0.2%

지역
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.9928826
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광주광역시
2nd row: 광주광역시
3rd row: 광주광역시
4th row: 광주광역시
5th row: 광주광역시

Common Values

Value | Count | Frequency (%)
광주광역시 | 558 | 99.3%
<NA> | 4 | 0.7%

Length

Histogram of lengths of the category

업종중분류
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.5035587
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종합여행업
2nd row: 종합여행업
3rd row: 종합여행업
4th row: 종합여행업
5th row: 종합여행업

Common Values

Value | Count | Frequency (%)
국내외여행업 | 287 | 51.1%
종합여행업 | 157 | 27.9%
국내여행업 | 114 | 20.3%
<NA> | 4 | 0.7%

Length

Histogram of lengths of the category

업체명
Text

Distinct: 471
Distinct (%): 84.4%
Missing: 4
Missing (%): 0.7%
Memory size: 4.5 KiB

Length

Max length: 16
Median length: 14
Mean length: 8.5035842
Min length: 2

Characters and Unicode

Total characters: 4745
Distinct characters: 360
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
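
To illustrate what such a character-property breakdown looks like, the sketch below counts Unicode general categories with Python's standard `unicodedata` module over the five sample values shown for this variable. It is only an illustration on those five strings, not a reproduction of the report's totals, and whether the profiler uses this module internally is an assumption.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable in the report.
names = [
    "(합)아시아 관광 여행사",
    "주)그린여행사",
    "주식회사 퍼니허니문",
    "(주)광주도시여행청",
    "(유)금호관광여행사",
]

# unicodedata.category returns two-letter codes:
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation, Zs = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for code, count in category_counts.most_common():
    print(code, count)
```

Names written like 주)그린여행사, with a closing parenthesis but no opening one, are one way the Close Punctuation count can end up higher than the Open Punctuation count.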

Unique

Unique: 386
Unique (%): 69.2%

Sample

1st row: (합)아시아 관광 여행사
2nd row: 주)그린여행사
3rd row: 주식회사 퍼니허니문
4th row: (주)광주도시여행청
5th row: (유)금호관광여행사
Value | Count | Frequency (%)
주식회사 | 74 | 10.0%
유한회사 | 31 | 4.2%
여행사 | 22 | 3.0%
투어 | 9 | 1.2%
 | 4 | 0.5%
세계로 | 4 | 0.5%
tour | 3 | 0.4%
여행 | 3 | 0.4%
협동조합 | 3 | 0.4%
주)임해관광 | 3 | 0.4%
Other values (489) | 583 | 78.9%

Most occurring characters

ValueCountFrequency (%)
389
 
8.2%
) 350
 
7.4%
( 347
 
7.3%
282
 
5.9%
222
 
4.7%
220
 
4.6%
186
 
3.9%
185
 
3.9%
181
 
3.8%
107
 
2.3%
Other values (350) 2276
48.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3803 | 80.1%
Close Punctuation | 350 | 7.4%
Open Punctuation | 347 | 7.3%
Space Separator | 181 | 3.8%
Uppercase Letter | 29 | 0.6%
Lowercase Letter | 20 | 0.4%
Decimal Number | 8 | 0.2%
Other Punctuation | 4 | 0.1%
Other Symbol | 2 | < 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
389
 
10.2%
282
 
7.4%
222
 
5.8%
220
 
5.8%
186
 
4.9%
185
 
4.9%
107
 
2.8%
102
 
2.7%
93
 
2.4%
92
 
2.4%
Other values (314) 1925
50.6%
Uppercase Letter
ValueCountFrequency (%)
T 4
13.8%
C 4
13.8%
S 3
10.3%
O 3
10.3%
K 3
10.3%
A 2
 
6.9%
B 2
 
6.9%
R 1
 
3.4%
I 1
 
3.4%
U 1
 
3.4%
Other values (5) 5
17.2%
Lowercase Letter
ValueCountFrequency (%)
o 5
25.0%
u 4
20.0%
r 4
20.0%
g 2
 
10.0%
t 1
 
5.0%
f 1
 
5.0%
a 1
 
5.0%
e 1
 
5.0%
l 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 3
37.5%
8 2
25.0%
4 1
 
12.5%
3 1
 
12.5%
5 1
 
12.5%
Other Punctuation
ValueCountFrequency (%)
& 2
50.0%
. 2
50.0%
Close Punctuation
ValueCountFrequency (%)
) 350
100.0%
Open Punctuation
ValueCountFrequency (%)
( 347
100.0%
Space Separator
ValueCountFrequency (%)
181
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3799 | 80.1%
Common | 891 | 18.8%
Latin | 49 | 1.0%
Han | 6 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
389
 
10.2%
282
 
7.4%
222
 
5.8%
220
 
5.8%
186
 
4.9%
185
 
4.9%
107
 
2.8%
102
 
2.7%
93
 
2.4%
92
 
2.4%
Other values (310) 1921
50.6%
Latin
ValueCountFrequency (%)
o 5
 
10.2%
u 4
 
8.2%
T 4
 
8.2%
C 4
 
8.2%
r 4
 
8.2%
S 3
 
6.1%
O 3
 
6.1%
K 3
 
6.1%
g 2
 
4.1%
A 2
 
4.1%
Other values (14) 15
30.6%
Common
ValueCountFrequency (%)
) 350
39.3%
( 347
38.9%
181
20.3%
1 3
 
0.3%
8 2
 
0.2%
& 2
 
0.2%
. 2
 
0.2%
- 1
 
0.1%
4 1
 
0.1%
3 1
 
0.1%
Han
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3797 | 80.0%
ASCII | 940 | 19.8%
CJK | 4 | 0.1%
None | 2 | < 0.1%
CJK Compat Ideographs | 2 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
389
 
10.2%
282
 
7.4%
222
 
5.8%
220
 
5.8%
186
 
4.9%
185
 
4.9%
107
 
2.8%
102
 
2.7%
93
 
2.4%
92
 
2.4%
Other values (309) 1919
50.5%
ASCII
ValueCountFrequency (%)
) 350
37.2%
( 347
36.9%
181
19.3%
o 5
 
0.5%
u 4
 
0.4%
T 4
 
0.4%
C 4
 
0.4%
r 4
 
0.4%
S 3
 
0.3%
O 3
 
0.3%
Other values (25) 35
 
3.7%
None
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%

소재지도로명주소
Text

Distinct: 431
Distinct (%): 77.2%
Missing: 4
Missing (%): 0.7%
Memory size: 4.5 KiB

Length

Max length: 70
Median length: 50
Mean length: 31.623656
Min length: 19

Characters and Unicode

Total characters: 17646
Distinct characters: 322
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 331
Unique (%): 59.3%

Sample

1st row: 광주광역시 동구 천변우로 339, 1708호 (수기동)
2nd row: 광주광역시 동구 문화전당로26번길 지하 7, 문화전당역 104-A호 (광산동)
3rd row: 광주광역시 동구 서석로 1, 2층 (불로동)
4th row: 광주광역시 동구 서석로85번길 24, 1층 (궁동)
5th row: 광주광역시 동구 천변우로 339, 제일오피스텔빌딩 101동 905호 (수기동)
Value | Count | Frequency (%)
광주광역시 | 558 | 15.6%
서구 | 182 | 5.1%
동구 | 135 | 3.8%
북구 | 100 | 2.8%
광산구 | 83 | 2.3%
2층 | 80 | 2.2%
치평동 | 78 | 2.2%
남구 | 58 | 1.6%
1층 | 57 | 1.6%
3층 | 37 | 1.0%
Other values (870) | 2199 | 61.6%
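
A word-frequency table of this kind can be sketched by splitting each address on whitespace and counting the resulting tokens. The example below does this for the five sample addresses listed above; that the profiler tokenises by plain whitespace splitting is an assumption, and the counts here cover only those five rows.

```python
from collections import Counter

# The five sample addresses listed for this variable in the report.
addresses = [
    "광주광역시 동구 천변우로 339, 1708호 (수기동)",
    "광주광역시 동구 문화전당로26번길 지하 7, 문화전당역 104-A호 (광산동)",
    "광주광역시 동구 서석로 1, 2층 (불로동)",
    "광주광역시 동구 서석로85번길 24, 1층 (궁동)",
    "광주광역시 동구 천변우로 339, 제일오피스텔빌딩 101동 905호 (수기동)",
]

# Assumption: tokens are produced by splitting on whitespace, punctuation kept.
token_counts = Counter(token for addr in addresses for token in addr.split())

for token, count in token_counts.most_common(5):
    print(token, count)
```

Because punctuation stays attached to tokens under this approach, "339," and "(수기동)" are counted as written, which matches the way floor tokens such as 2층 appear as standalone entries in the table.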

Most occurring characters

ValueCountFrequency (%)
3010
 
17.1%
1236
 
7.0%
735
 
4.2%
1 666
 
3.8%
606
 
3.4%
601
 
3.4%
595
 
3.4%
585
 
3.3%
566
 
3.2%
) 562
 
3.2%
Other values (312) 8484
48.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9872 | 55.9%
Space Separator | 3010 | 17.1%
Decimal Number | 2917 | 16.5%
Close Punctuation | 562 | 3.2%
Open Punctuation | 562 | 3.2%
Other Punctuation | 498 | 2.8%
Dash Punctuation | 127 | 0.7%
Uppercase Letter | 75 | 0.4%
Lowercase Letter | 23 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1236
 
12.5%
735
 
7.4%
606
 
6.1%
601
 
6.1%
595
 
6.0%
585
 
5.9%
566
 
5.7%
255
 
2.6%
241
 
2.4%
233
 
2.4%
Other values (267) 4219
42.7%
Uppercase Letter
ValueCountFrequency (%)
B 8
10.7%
S 8
10.7%
E 6
 
8.0%
A 6
 
8.0%
O 6
 
8.0%
Y 6
 
8.0%
C 6
 
8.0%
J 5
 
6.7%
H 5
 
6.7%
P 4
 
5.3%
Other values (8) 15
20.0%
Lowercase Letter
ValueCountFrequency (%)
y 4
17.4%
t 3
13.0%
e 3
13.0%
a 3
13.0%
n 3
13.0%
i 2
8.7%
s 1
 
4.3%
h 1
 
4.3%
d 1
 
4.3%
u 1
 
4.3%
Decimal Number
ValueCountFrequency (%)
1 666
22.8%
2 465
15.9%
3 338
11.6%
0 306
10.5%
4 264
 
9.1%
5 213
 
7.3%
6 181
 
6.2%
7 169
 
5.8%
8 165
 
5.7%
9 150
 
5.1%
Other Punctuation
ValueCountFrequency (%)
, 497
99.8%
. 1
 
0.2%
Space Separator
ValueCountFrequency (%)
3010
100.0%
Close Punctuation
ValueCountFrequency (%)
) 562
100.0%
Open Punctuation
ValueCountFrequency (%)
( 562
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 127
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9872 | 55.9%
Common | 7676 | 43.5%
Latin | 98 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1236
 
12.5%
735
 
7.4%
606
 
6.1%
601
 
6.1%
595
 
6.0%
585
 
5.9%
566
 
5.7%
255
 
2.6%
241
 
2.4%
233
 
2.4%
Other values (267) 4219
42.7%
Latin
ValueCountFrequency (%)
B 8
 
8.2%
S 8
 
8.2%
E 6
 
6.1%
A 6
 
6.1%
O 6
 
6.1%
Y 6
 
6.1%
C 6
 
6.1%
J 5
 
5.1%
H 5
 
5.1%
P 4
 
4.1%
Other values (19) 38
38.8%
Common
ValueCountFrequency (%)
3010
39.2%
1 666
 
8.7%
) 562
 
7.3%
( 562
 
7.3%
, 497
 
6.5%
2 465
 
6.1%
3 338
 
4.4%
0 306
 
4.0%
4 264
 
3.4%
5 213
 
2.8%
Other values (6) 793
 
10.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9872 | 55.9%
ASCII | 7774 | 44.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3010
38.7%
1 666
 
8.6%
) 562
 
7.2%
( 562
 
7.2%
, 497
 
6.4%
2 465
 
6.0%
3 338
 
4.3%
0 306
 
3.9%
4 264
 
3.4%
5 213
 
2.7%
Other values (35) 891
 
11.5%
Hangul
ValueCountFrequency (%)
1236
 
12.5%
735
 
7.4%
606
 
6.1%
601
 
6.1%
595
 
6.0%
585
 
5.9%
566
 
5.7%
255
 
2.6%
241
 
2.4%
233
 
2.4%
Other values (267) 4219
42.7%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 4
Missing (%): 0.7%
Memory size: 4.5 KiB
Minimum: 2023-07-31 00:00:00
Maximum: 2023-07-31 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

 | 연번 | 업종중분류
연번 | 1.000 | 0.967
업종중분류 | 0.967 | 1.000

 | 업종중분류 | 지역
업종중분류 | 1.000 | 1.000
지역 | 1.000 | 1.000

 | 연번 | 지역 | 업종중분류
연번 | 1.000 | 1.000 | 0.965
지역 | 1.000 | 1.000 | 1.000
업종중분류 | 0.965 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
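
As a rough illustration of nullity correlation, the sketch below computes the Pearson correlation between 0/1 missingness indicators. It assumes that this is how the heatmap is derived, and it uses toy indicator columns shaped like this dataset (558 present rows followed by 4 missing rows); the third column is hypothetical and exists only for contrast.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Missingness indicators: 1 = missing, 0 = present.
col_a_missing = [0] * 558 + [1] * 4  # missing in the last four rows
col_b_missing = [0] * 558 + [1] * 4  # missing in the same four rows
col_c_missing = [1] * 4 + [0] * 558  # hypothetical: missing in different rows

same_rows = pearson(col_a_missing, col_b_missing)
different_rows = pearson(col_a_missing, col_c_missing)
print(round(same_rows, 3), round(different_rows, 3))
```

Columns that are missing in exactly the same rows correlate at 1, which is the pattern to expect here given that the four trailing rows of the sample are <NA> in every column.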

Sample

Index | 연번 | 지역 | 업종중분류 | 업체명 | 소재지도로명주소 | 데이터기준일자
0 | 1 | 광주광역시 | 종합여행업 | (합)아시아 관광 여행사 | 광주광역시 동구 천변우로 339, 1708호 (수기동) | 2023-07-31
1 | 2 | 광주광역시 | 종합여행업 | 주)그린여행사 | 광주광역시 동구 문화전당로26번길 지하 7, 문화전당역 104-A호 (광산동) | 2023-07-31
2 | 3 | 광주광역시 | 종합여행업 | 주식회사 퍼니허니문 | 광주광역시 동구 서석로 1, 2층 (불로동) | 2023-07-31
3 | 4 | 광주광역시 | 종합여행업 | (주)광주도시여행청 | 광주광역시 동구 서석로85번길 24, 1층 (궁동) | 2023-07-31
4 | 5 | 광주광역시 | 종합여행업 | (유)금호관광여행사 | 광주광역시 동구 천변우로 339, 제일오피스텔빌딩 101동 905호 (수기동) | 2023-07-31
5 | 6 | 광주광역시 | 종합여행업 | 유한회사 현진여행사 | 광주광역시 동구 천변우로 361-1, 2층 (수기동) | 2023-07-31
6 | 7 | 광주광역시 | 종합여행업 | 남도마실길 | 광주광역시 동구 구성로 187-2, 403호 (금남로5가) | 2023-07-31
7 | 8 | 광주광역시 | 종합여행업 | 동명문화트래블 협동조합 | 광주광역시 동구 중앙로252번길 11-1 (동명동) | 2023-07-31
8 | 9 | 광주광역시 | 종합여행업 | (주)여행가는날 | 광주광역시 동구 문화전당로23번길 26, 문화전당역 오펠리움 1314호 (금동) | 2023-07-31
9 | 10 | 광주광역시 | 종합여행업 | (주)월드항공여행사 | 광주광역시 동구 제봉로82번길 13-2 (서석동) | 2023-07-31

Index | 연번 | 지역 | 업종중분류 | 업체명 | 소재지도로명주소 | 데이터기준일자
552 | 553 | 광주광역시 | 국내여행업 | 주식회사 아리수여행사 | 광주광역시 광산구 광산로 111 (송정동) | 2023-07-31
553 | 554 | 광주광역시 | 국내여행업 | 주식회사 자이언트투어 | 광주광역시 광산구 풍영로200번길 59, 301호 (장덕동, 한도빌딩) | 2023-07-31
554 | 555 | 광주광역시 | 국내여행업 | 주식회사 포플레이 | 광주광역시 광산구 앰코로 35, 폭스존 217호 (쌍암동) | 2023-07-31
555 | 556 | 광주광역시 | 국내여행업 | 케이솔루션투어 | 광주광역시 광산구 산월로3번길 13 (월계동,1층) | 2023-07-31
556 | 557 | 광주광역시 | 국내여행업 | 투어나우 | 광주광역시 광산구 앰코로 35, 폭스존 129호 (쌍암동) | 2023-07-31
557 | 558 | 광주광역시 | 국내여행업 | 투어디자인(주) | 광주광역시 광산구 임방울대로826번길 7-36 (쌍암동) | 2023-07-31
558 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
559 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
560 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
561 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 연번 | 지역 | 업종중분류 | 업체명 | 소재지도로명주소 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4