Overview

Dataset statistics

Number of variables: 6
Number of observations: 170
Missing cells: 232
Missing cells (%): 22.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 53.8 B
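
The missing-cell figures can be cross-checked against the per-column counts listed under Alerts. The following is a minimal sketch that only re-does the arithmetic on numbers copied from this report; it does not read the dataset itself.

```python
# Figures copied from this report (overview statistics and "Missing" alerts).
n_variables = 6
n_observations = 170
missing_per_column = {"2018": 36, "2019": 38, "2020": 50, "2021": 54, "2022": 54}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, f"{missing_pct:.1f}%")

# Per-column missing share, relative to the number of observations.
for column, n_missing in missing_per_column.items():
    print(column, f"{100 * n_missing / n_observations:.1f}%")
```

The per-column counts add up to the 232 missing cells reported above, and dividing by the 1020 cells in the table gives the reported 22.7%.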

Variable types

Text: 1
Numeric: 5

Dataset

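Description: Data on the consumption of domestic petroleum products in the Jeju region, compiled by industry (agriculture, forestry and fisheries; mining; food and tobacco; textile products; wood; paper and printing; chemical products; ceramics; iron and steel; non-ferrous metals; machinery assembly; transport equipment; other manufacturing; construction; other energy; power generation; petroleum refining; gas manufacturing; rail; road; shipping; aviation; commercial; residential; public; other) and by product. Unit: volume (KL)
URL: https://www.data.go.kr/data/15121148/fileData.do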

Alerts

2018 is highly overall correlated with 2019 and 3 other fields (High correlation)
2019 is highly overall correlated with 2018 and 3 other fields (High correlation)
2020 is highly overall correlated with 2018 and 3 other fields (High correlation)
2021 is highly overall correlated with 2018 and 3 other fields (High correlation)
2022 is highly overall correlated with 2018 and 3 other fields (High correlation)
2018 has 36 (21.2%) missing values (Missing)
2019 has 38 (22.4%) missing values (Missing)
2020 has 50 (29.4%) missing values (Missing)
2021 has 54 (31.8%) missing values (Missing)
2022 has 54 (31.8%) missing values (Missing)
시군구_산업_제품 has unique values (Unique)
2022 has 2 (1.2%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 03:14:11.247281
Analysis finished: 2023-12-12 03:14:14.825006
Duration: 3.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
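
The reported duration follows directly from the two timestamps. A small sketch of that subtraction, using only the values shown above:

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 03:14:11.247281", FMT)
finished = datetime.strptime("2023-12-12 03:14:14.825006", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```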

Variables

시군구_산업_제품
Text

Distinct: 170
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 23
Median length: 19
Mean length: 17.994118
Min length: 11

Characters and Unicode

Total characters: 3059
Distinct characters: 94
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 170
Unique (%): 100.0%

Sample

1st row: 제주제주시_농림수산업_무연보통휘발유
2nd row: 제주제주시_농림수산업_실내등유
3rd row: 제주제주시_농림수산업_경유(0.05%)
4th row: 제주제주시_농림수산업_경유(0.001%)
5th row: 제주제주시_농림수산업_경질중유(2.0%)
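
The column name 시군구_산업_제품 (city/county/district, industry, product) and the sample rows suggest that each value packs three fields separated by underscores. A minimal sketch of splitting such a key follows; it assumes every value contains exactly two separating underscores, and the field names `region`, `industry` and `product` are hypothetical labels chosen for this example.

```python
from typing import NamedTuple


class Key(NamedTuple):
    region: str    # hypothetical field name for the 시군구 part
    industry: str  # hypothetical field name for the 산업 part
    product: str   # hypothetical field name for the 제품 part


def split_key(value: str) -> Key:
    # maxsplit=2 keeps any further underscore inside the product part.
    # A value with fewer than two underscores raises ValueError on unpacking.
    region, industry, product = value.split("_", 2)
    return Key(region, industry, product)


key = split_key("제주제주시_농림수산업_경유(0.05%)")
print(key)
```

Splitting the key this way would allow grouping by region, industry or product instead of treating every row as a unique label.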
Value  Count  Frequency (%)
제주제주시_농림수산업_무연보통휘발유  1  0.6%
제주서귀포시_요업_경유(0.001  1  0.6%
제주서귀포시_식품.담배업_중유(0.3  1  0.6%
제주서귀포시_건설업_실내등유  1  0.6%
제주서귀포시_식품.담배업_프로판  1  0.6%
제주서귀포시_제지.인쇄업_중유(0.3  1  0.6%
제주서귀포시_화학제품업_경유(0.001  1  0.6%
제주서귀포시_화학제품업_중유(0.3  1  0.6%
제주서귀포시_화학제품업_부생연료유(중유형  1  0.6%
제주서귀포시_요업_경유(0.05  1  0.6%
Other values (160)  160  94.1%

Most occurring characters

Value  Count  Frequency (%)
_  340  11.1%
(Hangul character)  295  9.6%
(Hangul character)  268  8.8%
(Hangul character)  170  5.6%
(Hangul character)  162  5.3%
0  150  4.9%
)  106  3.5%
(  106  3.5%
(Hangul character)  99  3.2%
.  98  3.2%
Other values (84)  1265  41.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  2063  67.4%
Connector Punctuation  340  11.1%
Decimal Number  240  7.8%
Other Punctuation  185  6.0%
Close Punctuation  106  3.5%
Open Punctuation  106  3.5%
Uppercase Letter  16  0.5%
Dash Punctuation  3  0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(Hangul character)  295  14.3%
(Hangul character)  268  13.0%
(Hangul character)  170  8.2%
(Hangul character)  162  7.9%
(Hangul character)  99  4.8%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  54  2.6%
(Hangul character)  51  2.5%
Other values (67)  748  36.3%

Decimal Number
Value  Count  Frequency (%)
0  150  62.5%
3  34  14.2%
1  30  12.5%
5  18  7.5%
2  6  2.5%
4  2  0.8%

Uppercase Letter
Value  Count  Frequency (%)
C  4  25.0%
J  3  18.8%
E  3  18.8%
T  3  18.8%
A  3  18.8%

Other Punctuation
Value  Count  Frequency (%)
.  98  53.0%
%  87  47.0%

Connector Punctuation
Value  Count  Frequency (%)
_  340  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  106  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  106  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  2063  67.4%
Common  980  32.0%
Latin  16  0.5%

Most frequent character per script

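Hangul
Value  Count  Frequency (%)
(Hangul character)  295  14.3%
(Hangul character)  268  13.0%
(Hangul character)  170  8.2%
(Hangul character)  162  7.9%
(Hangul character)  99  4.8%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  54  2.6%
(Hangul character)  51  2.5%
Other values (67)  748  36.3%

Common
Value  Count  Frequency (%)
_  340  34.7%
0  150  15.3%
)  106  10.8%
(  106  10.8%
.  98  10.0%
%  87  8.9%
3  34  3.5%
1  30  3.1%
5  18  1.8%
2  6  0.6%
Other values (2)  5  0.5%

Latin
Value  Count  Frequency (%)
C  4  25.0%
J  3  18.8%
E  3  18.8%
T  3  18.8%
A  3  18.8%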

Most occurring blocks

Value  Count  Frequency (%)
Hangul  2063  67.4%
ASCII  996  32.6%

Most frequent character per block

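ASCII
Value  Count  Frequency (%)
_  340  34.1%
0  150  15.1%
)  106  10.6%
(  106  10.6%
.  98  9.8%
%  87  8.7%
3  34  3.4%
1  30  3.0%
5  18  1.8%
2  6  0.6%
Other values (7)  21  2.1%

Hangul
Value  Count  Frequency (%)
(Hangul character)  295  14.3%
(Hangul character)  268  13.0%
(Hangul character)  170  8.2%
(Hangul character)  162  7.9%
(Hangul character)  99  4.8%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  72  3.5%
(Hangul character)  54  2.6%
(Hangul character)  51  2.5%
Other values (67)  748  36.3%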

2018
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 123
Distinct (%): 91.8%
Missing: 36
Missing (%): 21.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 11633.552
Minimum: 1
Maximum: 234970
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 64.25
median: 231
Q3: 2207
95-th percentile: 71270.2
Maximum: 234970
Range: 234969
Interquartile range (IQR): 2142.75

Descriptive statistics

Standard deviation: 36069.906
Coefficient of variation (CV): 3.1005066
Kurtosis: 20.596257
Mean: 11633.552
Median Absolute Deviation (MAD): 223
Skewness: 4.320929
Sum: 1558896
Variance: 1.3010381 × 10^9
Monotonicity: Not monotonic
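
Several of the 2018 figures are linked by simple identities: the mean is the sum over the non-missing count, the coefficient of variation is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below re-derives them from the rounded values printed in this report, so small differences in the last digits are expected.

```python
# Values copied from the 2018 variable section of this report.
n_observations = 170
n_missing = 36
n_distinct = 123
total = 1558896
mean = 11633.552
std = 36069.906

n_valid = n_observations - n_missing      # rows with a 2018 value
mean_from_sum = total / n_valid           # compare with the reported mean
cv = std / mean                           # coefficient of variation
variance = std ** 2                       # compare with 1.3010381 x 10^9
distinct_pct = 100 * n_distinct / n_valid

print(n_valid, round(mean_from_sum, 3), round(cv, 4))
print(f"{variance:.4e}", round(distinct_pct, 1))
```

Note that Distinct (%) is taken relative to the 134 non-missing values rather than all 170 rows, which is why 123 distinct values correspond to 91.8%.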
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
5  3  1.8%
1  3  1.8%
4  2  1.2%
24  2  1.2%
8  2  1.2%
103  2  1.2%
31  2  1.2%
73  2  1.2%
11  2  1.2%
1966  1  0.6%
Other values (113)  113  66.5%
(Missing)  36  21.2%

Value  Count  Frequency (%)
1  3  1.8%
2  1  0.6%
4  2  1.2%
5  3  1.8%
6  1  0.6%
7  1  0.6%
8  2  1.2%
11  2  1.2%
15  1  0.6%
19  1  0.6%

Value  Count  Frequency (%)
234970  1  0.6%
220221  1  0.6%
145062  1  0.6%
125013  1  0.6%
113003  1  0.6%
104688  1  0.6%
85526  1  0.6%
63594  1  0.6%
51607  1  0.6%
51530  1  0.6%

2019
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 126
Distinct (%): 95.5%
Missing: 38
Missing (%): 22.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 11699.242
Minimum: 1
Maximum: 237208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.55
Q1: 77.25
median: 256.5
Q3: 1230
95-th percentile: 74975.1
Maximum: 237208
Range: 237207
Interquartile range (IQR): 1152.75

Descriptive statistics

Standard deviation: 36514.158
Coefficient of variation (CV): 3.1210703
Kurtosis: 22.254391
Mean: 11699.242
Median Absolute Deviation (MAD): 243
Skewness: 4.4411155
Sum: 1544300
Variance: 1.3332838 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
15  3  1.8%
841  2  1.2%
180  2  1.2%
5  2  1.2%
7  2  1.2%
38  1  0.6%
1055  1  0.6%
1  1  0.6%
2339  1  0.6%
120  1  0.6%
Other values (116)  116  68.2%
(Missing)  38  22.4%

Value  Count  Frequency (%)
1  1  0.6%
3  1  0.6%
5  2  1.2%
7  2  1.2%
9  1  0.6%
10  1  0.6%
11  1  0.6%
13  1  0.6%
14  1  0.6%
15  3  1.8%

Value  Count  Frequency (%)
237208  1  0.6%
235463  1  0.6%
130506  1  0.6%
128276  1  0.6%
104456  1  0.6%
84218  1  0.6%
82762  1  0.6%
68604  1  0.6%
62434  1  0.6%
54394  1  0.6%

2020
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 112
Distinct (%): 93.3%
Missing: 50
Missing (%): 29.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 9723.25
Minimum: 0
Maximum: 206460
Zeros: 1
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 70.5
median: 359
Q3: 1338.5
95-th percentile: 54330.9
Maximum: 206460
Range: 206460
Interquartile range (IQR): 1268

Descriptive statistics

Standard deviation: 29337.836
Coefficient of variation (CV): 3.0172871
Kurtosis: 22.746301
Mean: 9723.25
Median Absolute Deviation (MAD): 346.5
Skewness: 4.4687861
Sum: 1166790
Variance: 8.6070865 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
3  3  1.8%
9  2  1.2%
47  2  1.2%
4  2  1.2%
39  2  1.2%
26  2  1.2%
20  2  1.2%
453  1  0.6%
2022  1  0.6%
17  1  0.6%
Other values (102)  102  60.0%
(Missing)  50  29.4%

Value  Count  Frequency (%)
0  1  0.6%
2  1  0.6%
3  3  1.8%
4  2  1.2%
6  1  0.6%
8  1  0.6%
9  2  1.2%
11  1  0.6%
12  1  0.6%
13  1  0.6%

Value  Count  Frequency (%)
206460  1  0.6%
150996  1  0.6%
124014  1  0.6%
99795  1  0.6%
73296  1  0.6%
55773  1  0.6%
54255  1  0.6%
50013  1  0.6%
46111  1  0.6%
42473  1  0.6%

2021
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 109
Distinct (%): 94.0%
Missing: 54
Missing (%): 31.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 10706.853
Minimum: 0
Maximum: 210893
Zeros: 1
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 55.75
median: 354
Q3: 1707.5
95-th percentile: 62610.25
Maximum: 210893
Range: 210893
Interquartile range (IQR): 1651.75

Descriptive statistics

Standard deviation: 32691.817
Coefficient of variation (CV): 3.0533543
Kurtosis: 21.296367
Mean: 10706.853
Median Absolute Deviation (MAD): 335
Skewness: 4.3939928
Sum: 1241995
Variance: 1.0687549 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
31  3  1.8%
2  3  1.8%
46  2  1.2%
341  2  1.2%
4  2  1.2%
0  1  0.6%
93  1  0.6%
127  1  0.6%
27  1  0.6%
280  1  0.6%
Other values (99)  99  58.2%
(Missing)  54  31.8%

Value  Count  Frequency (%)
0  1  0.6%
2  3  1.8%
3  1  0.6%
4  2  1.2%
7  1  0.6%
8  1  0.6%
12  1  0.6%
14  1  0.6%
15  1  0.6%
17  1  0.6%

Value  Count  Frequency (%)
210893  1  0.6%
191807  1  0.6%
136477  1  0.6%
102054  1  0.6%
70785  1  0.6%
63178  1  0.6%
62421  1  0.6%
55750  1  0.6%
47443  1  0.6%
45679  1  0.6%

2022
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 108
Distinct (%): 93.1%
Missing: 54
Missing (%): 31.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 11012.052
Minimum: 0
Maximum: 211065
Zeros: 2
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.75
Q1: 53.25
median: 368
Q3: 1715.5
95-th percentile: 59414
Maximum: 211065
Range: 211065
Interquartile range (IQR): 1662.25

Descriptive statistics

Standard deviation: 32818.091
Coefficient of variation (CV): 2.9801977
Kurtosis: 20.353258
Mean: 11012.052
Median Absolute Deviation (MAD): 346.5
Skewness: 4.2868297
Sum: 1277398
Variance: 1.0770271 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
2  3  1.8%
0  2  1.2%
61  2  1.2%
13  2  1.2%
7  2  1.2%
45  2  1.2%
520  2  1.2%
475  1  0.6%
21  1  0.6%
146  1  0.6%
Other values (98)  98  57.6%
(Missing)  54  31.8%

Value  Count  Frequency (%)
0  2  1.2%
1  1  0.6%
2  3  1.8%
3  1  0.6%
7  2  1.2%
10  1  0.6%
12  1  0.6%
13  2  1.2%
15  1  0.6%
21  1  0.6%

Value  Count  Frequency (%)
211065  1  0.6%
186782  1  0.6%
144475  1  0.6%
97085  1  0.6%
70155  1  0.6%
63398  1  0.6%
58086  1  0.6%
54977  1  0.6%
53382  1  0.6%
49471  1  0.6%

Interactions

[Figures: 25 pairwise interaction plots among the variables 2018, 2019, 2020, 2021 and 2022]

Correlations

      2018   2019   2020   2021   2022
2018  1.000  0.934  0.936  0.987  0.893
2019  0.934  1.000  0.917  0.977  0.889
2020  0.936  0.917  1.000  0.958  0.992
2021  0.987  0.977  0.958  1.000  0.955
2022  0.893  0.889  0.992  0.955  1.000

      2018   2019   2020   2021   2022
2018  1.000  0.960  0.870  0.792  0.812
2019  0.960  1.000  0.885  0.797  0.857
2020  0.870  0.885  1.000  0.896  0.912
2021  0.792  0.797  0.896  1.000  0.902
2022  0.812  0.857  0.912  0.902  1.000
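
The report does not state which coefficient each matrix uses. Purely as an illustration of how a correlation between two year columns can be computed when both contain missing values, here is a minimal sketch of a Pearson correlation over pairwise-complete observations. The function name `pairwise_pearson` and the example data are hypothetical and are not taken from the dataset.

```python
import math
from typing import Optional, Sequence


def pairwise_pearson(xs: Sequence[Optional[float]],
                     ys: Sequence[Optional[float]]) -> float:
    """Pearson correlation using only rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


# Made-up columns with gaps; None marks a missing value.
a = [1.0, 2.0, None, 4.0, 5.0]
b = [2.0, 4.0, 6.0, None, 10.0]
r = pairwise_pearson(a, b)
print(round(r, 3))
```

With roughly 20% to 30% of each year column missing, the set of rows entering each pair differs, so coefficients in the same matrix are not necessarily computed on the same observations.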

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
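
One way to read nullity correlation is as an ordinary correlation between presence indicators (1 if a value is present, 0 if it is missing). The sketch below follows that reading with the standard library only. The function name `nullity_correlation` and the two example columns are hypothetical, and the exact computation behind the report's heatmap is not shown in this document.

```python
import math
from typing import Optional, Sequence


def nullity_correlation(xs: Sequence[Optional[float]],
                        ys: Sequence[Optional[float]]) -> Optional[float]:
    """Correlation between the presence indicators of two equal-length columns."""
    a = [0.0 if x is None else 1.0 for x in xs]
    b = [0.0 if y is None else 1.0 for y in ys]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is fully present or fully missing has no nullity variation.
        return None
    return cov / math.sqrt(var_a * var_b)


# Made-up columns; None marks a missing value.
col_a = [313, 546, 746, None, None, 27]
col_b = [345, 525, None, None, None, 4]
nc = nullity_correlation(col_a, col_b)
print(round(nc, 3))
```

A value near 1 means the two columns tend to be missing on the same rows, a value near -1 means one tends to be present exactly when the other is missing, and a value near 0 means the missingness patterns are unrelated.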

Sample

시군구_산업_제품  2018  2019  2020  2021  2022
0  제주제주시_농림수산업_무연보통휘발유  54881107312313345
1  제주제주시_농림수산업_실내등유  66151105518546525
2  제주제주시_농림수산업_경유(0.05%)  20722829437746<NA>
3  제주제주시_농림수산업_경유(0.001%)  25561309227011491080
4  제주제주시_농림수산업_경질중유(2.0%)  1087<NA><NA><NA>
5  제주제주시_농림수산업_경질중유(0.3%)  53<NA><NA><NA>
6  제주제주시_농림수산업_중유(1.0%)  7<NA><NA><NA><NA>
7  제주제주시_농림수산업_중유(0.5%)  116<NA><NA><NA><NA>
8  제주제주시_농림수산업_중유(0.3%)  734664396225112
9  제주제주시_농림수산업_부생연료유(등유형)  <NA><NA><NA>27417114

시군구_산업_제품  2018  2019  2020  2021  2022
160  제주서귀포시_가정_용제원료  891187
161  제주서귀포시_가정_프로판  1651017419173631782518683
162  제주서귀포시_가정_부탄  220221188177181
163  제주서귀포시_공공_무연보통휘발유  18099803759
164  제주서귀포시_공공_실내등유  316276270257147
165  제주서귀포시_공공_경유(0.05%)  3806623822242221375213738
166  제주서귀포시_공공_경유(0.001%)  1108841828341378
167  제주서귀포시_공공_경질중유(0.3%)  242287765<NA><NA>
168  제주서귀포시_공공_중유(0.3%)  4357911712
169  제주서귀포시_공공_프로판  150162160164177