Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
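
The average record size follows from the total in-memory size divided by the number of observations. A minimal check of that arithmetic, assuming KiB means 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_kib = 742.2
n_observations = 10_000

# Assumption: KiB is the binary unit, 1 KiB = 1024 bytes.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # 76.0 B
```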

Variable types

Numeric: 2
Categorical: 5
DateTime: 1

Dataset

Description: Field-measured data on visitors walking Jeju's trails, built under the Public Data New Deal project.
Author: Jeju Tourism Organization (제주관광공사)
URL: https://www.data.go.kr/data/15096602/fileData.do

Alerts

측정연도 has constant value "2021" (Constant)
고유번호 is highly overall correlated with 측정월 (High correlation)
측정일 is highly overall correlated with 측정월 (High correlation)
측정월 is highly overall correlated with 고유번호 and 측정일 (High correlation)
위치정보 is highly imbalanced (68.7%) (Imbalance)
고유번호 has unique values (Unique)
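
The 68.7% imbalance figure for 위치정보 can be reproduced from the category counts listed later in the report. The sketch below assumes the score is one minus the normalized Shannon entropy, 1 - H / log2(k) for k categories; that definition is an assumption that happens to match the reported value, not something the report states. The report shows only a combined count of 10 for the two rarest categories, so the 5/5 split here is also an assumption.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts for the 12 categories of 위치정보, copied from the Common Values table.
# The final two entries are a hypothetical split of "Other values (2): 10".
location_counts = [6976, 2690, 125, 64, 56, 31, 24, 9, 9, 6, 5, 5]

score = imbalance_score(location_counts)
print(f"{score:.1%}")
```

A perfectly uniform column scores 0 under this definition, and a column dominated by one category approaches 1.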

Reproduction

Analysis started: 2023-12-12 05:08:04.466180
Analysis finished: 2023-12-12 05:08:05.654684
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
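
The reported duration is simply the difference between the two analysis timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 05:08:04.466180")
finished = datetime.fromisoformat("2023-12-12 05:08:05.654684")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 1.19 seconds
```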

Variables

고유번호 (unique ID)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47931.907
Minimum: 20
Maximum: 95271
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20
5-th percentile: 4430.95
Q1: 23971.25
Median: 48320
Q3: 71849.75
95-th percentile: 90638.75
Maximum: 95271
Range: 95251
Interquartile range (IQR): 47878.5

Descriptive statistics

Standard deviation: 27600.6
Coefficient of variation (CV): 0.57582938
Kurtosis: -1.2015474
Mean: 47931.907
Median Absolute Deviation (MAD): 23968.5
Skewness: -0.023960028
Sum: 4.7931907 × 10^8
Variance: 7.6179312 × 10^8
Monotonicity: Not monotonic
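
Several of the figures above are derived from others: CV is the standard deviation divided by the mean, variance is the standard deviation squared, the sum is the mean times the number of observations, and the range and IQR are differences of quantiles. The following sketch checks those relationships against the reported (rounded) values for 고유번호; the variable names are illustrative.

```python
# Reported values for 고유번호, copied from the report.
n = 10_000
mean = 47931.907
std = 27600.6
minimum, maximum = 20, 95271
q1, q3 = 23971.25, 71849.75

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the rounded standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum
iqr = q3 - q1

print(f"CV       = {cv:.5f}")
print(f"Variance = {variance:.4e}")
print(f"Sum      = {total:.4e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
```

Because the inputs are rounded, the recomputed CV and variance agree with the report only to about five or six significant digits.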
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
17858 | 1 | < 0.1%
41381 | 1 | < 0.1%
76054 | 1 | < 0.1%
77313 | 1 | < 0.1%
39731 | 1 | < 0.1%
26716 | 1 | < 0.1%
71848 | 1 | < 0.1%
2404 | 1 | < 0.1%
73851 | 1 | < 0.1%
32884 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
20 | 1 | < 0.1%
21 | 1 | < 0.1%
36 | 1 | < 0.1%
52 | 1 | < 0.1%
54 | 1 | < 0.1%
57 | 1 | < 0.1%
63 | 1 | < 0.1%
79 | 1 | < 0.1%
86 | 1 | < 0.1%
101 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
95271 | 1 | < 0.1%
95257 | 1 | < 0.1%
95254 | 1 | < 0.1%
95246 | 1 | < 0.1%
95242 | 1 | < 0.1%
95238 | 1 | < 0.1%
95234 | 1 | < 0.1%
95216 | 1 | < 0.1%
95181 | 1 | < 0.1%
95170 | 1 | < 0.1%

위치정보 (location)
Categorical

IMBALANCE

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

금오름: 6976
새별오름: 2690
사라봉: 125
산방산: 64
궷물오름: 56
Other values (7): 89

Length

Max length: 6
Median length: 3
Mean length: 3.2865
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 금오름
2nd row: 금오름
3rd row: 금오름
4th row: 금오름
5th row: 새별오름

Common Values

Value | Count | Frequency (%)
금오름 | 6976 | 69.8%
새별오름 | 2690 | 26.9%
사라봉 | 125 | 1.2%
산방산 | 64 | 0.6%
궷물오름 | 56 | 0.6%
도두봉 | 31 | 0.3%
물영아리오름 | 24 | 0.2%
광이오름 | 9 | 0.1%
바리메오름 | 9 | 0.1%
민오름 | 6 | 0.1%
Other values (2) | 10 | 0.1%

Length

[Figure: histogram of lengths of the category]

측정연도 (measurement year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

2021: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of common values]

측정월 (measurement month)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

10: 4272
9: 2325
11: 1792
8: 1520
7: 91

Length

Max length: 2
Median length: 2
Mean length: 1.6064
Min length: 1
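
The length statistics for a categorical column describe the character lengths of its string values, weighted by how often each value occurs. The sketch below recomputes them for 측정월 from the value counts shown above; `length_stats` is a hypothetical helper, not a ydata-profiling function.

```python
from statistics import median


def length_stats(value_counts):
    """Summarize string lengths, weighting each value by its count."""
    lengths = []
    for value, count in value_counts.items():
        lengths.extend([len(value)] * count)
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": sum(lengths) / len(lengths),
        "min": min(lengths),
    }


# Value counts for 측정월, copied from the report (values are strings).
month_counts = {"10": 4272, "9": 2325, "11": 1792, "8": 1520, "7": 91}

month_lengths = length_stats(month_counts)
print(month_lengths)
```

Two-character months ("10" and "11") account for 6064 of the 10000 rows, which is where the mean length of 1.6064 comes from.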

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 11
3rd row: 10
4th row: 10
5th row: 8

Common Values

Value | Count | Frequency (%)
10 | 4272 | 42.7%
9 | 2325 | 23.2%
11 | 1792 | 17.9%
8 | 1520 | 15.2%
7 | 91 | 0.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of common values]

측정일 (measurement day)
Real number (ℝ)

HIGH CORRELATION

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.2212
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 7
Median: 13
Q3: 25
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 9.091619
Coefficient of variation (CV): 0.59729975
Kurtosis: -1.3927108
Mean: 15.2212
Median Absolute Deviation (MAD): 8
Skewness: 0.12469644
Sum: 152212
Variance: 82.657536
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
9 | 837 | 8.4%
26 | 668 | 6.7%
2 | 663 | 6.6%
7 | 632 | 6.3%
25 | 615 | 6.2%
3 | 594 | 5.9%
10 | 555 | 5.5%
28 | 466 | 4.7%
29 | 461 | 4.6%
12 | 437 | 4.4%
Other values (21) | 4072 | 40.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 98 | 1.0%
2 | 663 | 6.6%
3 | 594 | 5.9%
4 | 91 | 0.9%
5 | 163 | 1.6%
6 | 278 | 2.8%
7 | 632 | 6.3%
8 | 155 | 1.6%
9 | 837 | 8.4%
10 | 555 | 5.5%

Maximum 10 values

Value | Count | Frequency (%)
31 | 70 | 0.7%
30 | 150 | 1.5%
29 | 461 | 4.6%
28 | 466 | 4.7%
27 | 166 | 1.7%
26 | 668 | 6.7%
25 | 615 | 6.2%
24 | 270 | 2.7%
23 | 319 | 3.2%
22 | 102 | 1.0%
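
The quantile statistics for 측정일 can be cross-checked against the smallest and largest value counts above by accumulating counts from either tail. The sketch below uses a nearest-rank percentile definition, which is an assumption: the profiling tool may interpolate, although with this many tied values both approaches give the same result here. `quantile_from_tail` is a hypothetical helper.

```python
def quantile_from_tail(rows, n, percent, descending=False):
    """Locate a percentile using only one tail of a frequency table.

    rows       -- (value, count) pairs ordered from the tail inward
    n          -- total number of observations
    percent    -- integer percentile, e.g. 25 for Q1
    descending -- True when rows start at the maximum
    Returns the value, or None if the visible rows do not reach it.
    """
    # Nearest-rank position (1-based), via integer ceiling division.
    target = (percent * n + 99) // 100
    passed = 0  # observations already walked over from this tail
    for value, count in rows:
        if descending:
            hi = n - passed
            lo = hi - count
        else:
            lo = passed
            hi = passed + count
        # Sorted ranks lo+1 .. hi all hold this value.
        if lo < target <= hi:
            return value
        passed += count
    return None


# (value, count) rows copied from the smallest and largest value tables.
smallest = [(1, 98), (2, 663), (3, 594), (4, 91), (5, 163),
            (6, 278), (7, 632), (8, 155), (9, 837), (10, 555)]
largest = [(31, 70), (30, 150), (29, 461), (28, 466), (27, 166),
           (26, 668), (25, 615), (24, 270), (23, 319), (22, 102)]
N = 10_000

p5 = quantile_from_tail(smallest, N, 5)
q1 = quantile_from_tail(smallest, N, 25)
q3 = quantile_from_tail(largest, N, 75, descending=True)
p95 = quantile_from_tail(largest, N, 95, descending=True)
mid = quantile_from_tail(smallest, N, 50)

print(p5, q1, q3, p95, mid)  # 2 7 25 29 None
```

The median cannot be confirmed this way because the counts for the middle days are not listed in the report, so the helper returns None for it.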

측정시간 (measurement time)
DateTime

Distinct: 547
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-12-12 08:59:00
Maximum: 2023-12-12 18:08:00
[Figure: histogram with fixed size bins (bins=50)]

연령대 (age group)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

청년 (young adult): 5387
중장년 (middle-aged): 3025
어르신 (senior): 795
아동 (child): 793

Length

Max length: 3
Median length: 2
Mean length: 2.382
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청년
2nd row: 청년
3rd row: 청년
4th row: 청년
5th row: 청년

Common Values

Value | Count | Frequency (%)
청년 | 5387 | 53.9%
중장년 | 3025 | 30.2%
어르신 | 795 | 8.0%
아동 | 793 | 7.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of common values]

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
5794 
4206 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 5794 | 57.9%
 | 4206 | 42.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of common values]

Interactions

[Figures: four pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

 | 고유번호 | 위치정보 | 측정월 | 측정일 | 연령대 | 성별
고유번호 | 1.000 | 0.679 | 0.966 | 0.917 | 0.235 | 0.047
위치정보 | 0.679 | 1.000 | 0.487 | 0.475 | 0.253 | 0.051
측정월 | 0.966 | 0.487 | 1.000 | 0.880 | 0.086 | 0.027
측정일 | 0.917 | 0.475 | 0.880 | 1.000 | 0.160 | 0.000
연령대 | 0.235 | 0.253 | 0.086 | 0.160 | 1.000 | 0.104
성별 | 0.047 | 0.051 | 0.027 | 0.000 | 0.104 | 1.000

[Figure: correlation heatmap]

 | 측정월 | 성별 | 위치정보 | 연령대
측정월 | 1.000 | 0.033 | 0.295 | 0.070
성별 | 0.033 | 1.000 | 0.039 | 0.069
위치정보 | 0.295 | 0.039 | 1.000 | 0.120
연령대 | 0.070 | 0.069 | 0.120 | 1.000

[Figure: correlation heatmap]

 | 고유번호 | 측정일 | 위치정보 | 측정월 | 연령대 | 성별
고유번호 | 1.000 | -0.266 | 0.367 | 0.745 | 0.142 | 0.036
측정일 | -0.266 | 1.000 | 0.235 | 0.535 | 0.103 | 0.000
위치정보 | 0.367 | 0.235 | 1.000 | 0.295 | 0.120 | 0.039
측정월 | 0.745 | 0.535 | 0.295 | 1.000 | 0.070 | 0.033
연령대 | 0.142 | 0.103 | 0.120 | 0.070 | 1.000 | 0.069
성별 | 0.036 | 0.000 | 0.039 | 0.033 | 0.069 | 1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

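Index | 고유번호 | 위치정보 | 측정연도 | 측정월 | 측정일 | 측정시간 | 연령대 | 성별
17857 | 17858 | 금오름 | 2021 | 9 | 12 | 17:51 | 청년 |
58078 | 58079 | 금오름 | 2021 | 11 | 13 | 14:38 | 청년 |
27076 | 27077 | 금오름 | 2021 | 10 | 2 | 16:22 | 청년 |
31779 | 31780 | 금오름 | 2021 | 10 | 9 | 15:17 | 청년 |
72727 | 72728 | 새별오름 | 2021 | 8 | 31 | 10:10 | 청년 |
88904 | 88905 | 새별오름 | 2021 | 10 | 9 | 12:13 | 어르신 |
37249 | 37250 | 금오름 | 2021 | 10 | 17 | 10:23 | 중장년 |
86370 | 86371 | 새별오름 | 2021 | 10 | 3 | 17:20 | 청년 |
84652 | 84653 | 새별오름 | 2021 | 10 | 3 | 12:40 | 중장년 |
30745 | 30746 | 금오름 | 2021 | 10 | 8 | 09:58 | 청년 |

Index | 고유번호 | 위치정보 | 측정연도 | 측정월 | 측정일 | 측정시간 | 연령대 | 성별
89963 | 89964 | 새별오름 | 2021 | 10 | 9 | 14:12 | 중장년 |
34353 | 34354 | 금오름 | 2021 | 10 | 10 | 15:22 | 중장년 |
10350 | 10351 | 금오름 | 2021 | 9 | 6 | 10:29 | 중장년 |
4076 | 4077 | 금오름 | 2021 | 8 | 28 | 09:53 | 청년 |
56460 | 56461 | 금오름 | 2021 | 11 | 9 | 13:14 | 중장년 |
56089 | 56090 | 금오름 | 2021 | 11 | 7 | 16:51 | 청년 |
3617 | 3618 | 금오름 | 2021 | 8 | 27 | 11:21 | 중장년 |
3247 | 3248 | 금오름 | 2021 | 8 | 26 | 13:15 | 아동 |
67841 | 67842 | 도두봉 | 2021 | 8 | 3 | 16:10 | 어르신 |
85222 | 85223 | 새별오름 | 2021 | 10 | 3 | 14:31 | 아동 |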