Overview

Dataset statistics

Number of variables: 8
Number of observations: 23
Missing cells: 8
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 KiB
Average record size in memory: 77.7 B
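
The missing-cell percentage follows from the counts above: 8 missing cells out of 8 × 23 = 184 cells. A minimal check, with variable names chosen here for illustration only:

```python
# Counts taken from the dataset statistics above.
n_variables = 8
n_observations = 23
missing_cells = 8

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells}/{total_cells} cells missing = {missing_pct:.1f}%")
```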

Variable types

Numeric: 4
Categorical: 4

Dataset

Description: Annual transport volumes of petroleum products (gasoline, Bunker C oil, etc.).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15068399/fileData.do

Alerts

[High correlation] 유류기타(톤) is highly overall correlated with 연도 and 4 other fields
[High correlation] 등유(톤) is highly overall correlated with 연도 and 4 other fields
[High correlation] JP8제트유(톤) is highly overall correlated with 연도 and 4 other fields
[High correlation] 경유(톤) is highly overall correlated with 연도 and 4 other fields
[High correlation] 연도 is highly overall correlated with 휘발유(톤) and 5 other fields
[High correlation] 휘발유(톤) is highly overall correlated with 연도 and 1 other field
[High correlation] 방카씨유(톤) is highly overall correlated with 연도 and 5 other fields
[Imbalance] 경유(톤) is highly imbalanced (61.7%)
[Imbalance] 등유(톤) is highly imbalanced (61.7%)
[Imbalance] JP8제트유(톤) is highly imbalanced (56.3%)
[Imbalance] 유류기타(톤) is highly imbalanced (61.7%)
[Missing] 휘발유(톤) has 3 (13.0%) missing values
[Missing] 방카씨유(톤) has 1 (4.3%) missing values
[Missing] 기타석유류(톤) has 4 (17.4%) missing values
[Unique] 연도 has unique values
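
The imbalance percentages in the alerts are consistent with a score of one minus the normalised Shannon entropy of the category counts. The sketch below is an assumption about how the figure is derived, not a quotation of the profiler's source; the function name `imbalance` is hypothetical. The counts come from the categorical variable sections of this report.

```python
import math

def imbalance(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for category counts."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    max_entropy = math.log(len(counts))
    # A single-category column has zero maximum entropy; treat it as balanced.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

# 경유(톤), 등유(톤), 유류기타(톤): "<NA>" appears 20 times, three other values once each.
score_four_categories = imbalance([20, 1, 1, 1])
# JP8제트유(톤): "<NA>" appears 19 times, four other values once each.
score_five_categories = imbalance([19, 1, 1, 1, 1])

print(f"{score_four_categories:.1%}", f"{score_five_categories:.1%}")
```

A score near 100% means one category dominates; a score of 0% means all categories are equally frequent.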

Reproduction

Analysis started: 2023-12-12 09:56:46.905489
Analysis finished: 2023-12-12 09:56:49.412681
Duration: 2.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007
Minimum: 1996
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1996
5-th percentile: 1997.1
Q1: 2001.5
Median: 2007
Q3: 2012.5
95-th percentile: 2016.9
Maximum: 2018
Range: 22
Interquartile range (IQR): 11
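
The fractional percentiles (1997.1, 2001.5) match linear interpolation between neighbouring sorted values, which is the default quantile scheme in NumPy and pandas. A minimal sketch, assuming the 23 yearly values 1996 to 2018; the helper name `quantile` is illustrative:

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)   # fractional index into the sorted data
    lower = int(pos)
    frac = pos - lower
    if lower + 1 >= len(sorted_values):
        return float(sorted_values[-1])
    return sorted_values[lower] + frac * (sorted_values[lower + 1] - sorted_values[lower])

years = list(range(1996, 2019))          # 1996 .. 2018, already sorted
q05, q1, q3, q95 = (quantile(years, q) for q in (0.05, 0.25, 0.75, 0.95))
iqr = q3 - q1

print(round(q05, 1), q1, q3, round(q95, 1), iqr)
```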

Descriptive statistics

Standard deviation: 6.78233
Coefficient of variation (CV): 0.0033793373
Kurtosis: -1.2
Mean: 2007
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 46161
Variance: 46
Monotonicity: Strictly increasing
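
These figures can be reproduced with the standard library alone. The sketch assumes the column holds the consecutive years 1996 to 2018 and that variance and standard deviation use the sample (n - 1) convention, which is what matches the reported values:

```python
import statistics

years = list(range(1996, 2019))          # 23 consecutive years

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)    # sample variance, divides by n - 1
std = statistics.stdev(years)
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median of the distances from the median.
median = statistics.median(years)
mad = statistics.median([abs(y - median) for y in years])

print(total, mean, variance, round(std, 5), round(cv, 10), mad)
```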
Histogram with fixed size bins (bins=23)

Common values

Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
Other values (13) | 13 | 56.5%

Minimum 10 values

Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
1998 | 1 | 4.3%
1999 | 1 | 4.3%
2000 | 1 | 4.3%
2001 | 1 | 4.3%
2002 | 1 | 4.3%
2003 | 1 | 4.3%
2004 | 1 | 4.3%
2005 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
2010 | 1 | 4.3%
2009 | 1 | 4.3%

휘발유(톤)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 3
Missing (%): 13.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 188807.7
Minimum: 45759
Maximum: 1013668
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 45759
5-th percentile: 47158.35
Q1: 55640
Median: 80877.5
Q3: 180647
95-th percentile: 882608.85
Maximum: 1013668
Range: 967909
Interquartile range (IQR): 125007

Descriptive statistics

Standard deviation: 266560.49
Coefficient of variation (CV): 1.4118094
Kurtosis: 6.3189103
Mean: 188807.7
Median Absolute Deviation (MAD): 34382
Skewness: 2.6548571
Sum: 3776154
Variance: 7.1054497 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)
Common values

Value | Count | Frequency (%)
47232 | 1 | 4.3%
56680 | 1 | 4.3%
52520 | 1 | 4.3%
61080 | 1 | 4.3%
63696 | 1 | 4.3%
58080 | 1 | 4.3%
58184 | 1 | 4.3%
50040 | 1 | 4.3%
48252 | 1 | 4.3%
1013668 | 1 | 4.3%
Other values (10) | 10 | 43.5%
(Missing) | 3 | 13.0%

Minimum 10 values

Value | Count | Frequency (%)
45759 | 1 | 4.3%
47232 | 1 | 4.3%
48252 | 1 | 4.3%
50040 | 1 | 4.3%
52520 | 1 | 4.3%
56680 | 1 | 4.3%
58080 | 1 | 4.3%
58184 | 1 | 4.3%
61080 | 1 | 4.3%
63696 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
1013668 | 1 | 4.3%
875711 | 1 | 4.3%
230935 | 1 | 4.3%
202898 | 1 | 4.3%
199790 | 1 | 4.3%
174266 | 1 | 4.3%
157580 | 1 | 4.3%
144083 | 1 | 4.3%
137641 | 1 | 4.3%
98059 | 1 | 4.3%
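
Because 휘발유(톤) has exactly 20 non-missing values, the ten smallest and ten largest values listed for this variable together make up the whole column. A short sketch that rebuilds the summary figures from them (the variable names are illustrative):

```python
import statistics

# Ten smallest and ten largest non-missing values of 휘발유(톤), as listed in this report.
smallest = [45759, 47232, 48252, 50040, 52520, 56680, 58080, 58184, 61080, 63696]
largest = [1013668, 875711, 230935, 202898, 199790, 174266, 157580, 144083, 137641, 98059]
gasoline_tonnes = sorted(smallest + largest)

total = sum(gasoline_tonnes)
mean = statistics.mean(gasoline_tonnes)
median = statistics.median(gasoline_tonnes)
value_range = gasoline_tonnes[-1] - gasoline_tonnes[0]

print(total, round(mean, 1), median, value_range)
```

The mean is more than twice the median, which reflects the two very large early values (1013668 and 875711) and agrees with the strong positive skewness reported for this variable.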

방카씨유(톤)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 1043530.5
Minimum: 208593
Maximum: 2149866
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 208593
5-th percentile: 270259.05
Q1: 349869.75
Median: 1142674.5
Q3: 1482707.2
95-th percentile: 1936289.3
Maximum: 2149866
Range: 1941273
Interquartile range (IQR): 1132837.5

Descriptive statistics

Standard deviation: 611864.47
Coefficient of variation (CV): 0.58634074
Kurtosis: -1.2301925
Mean: 1043530.5
Median Absolute Deviation (MAD): 457528.5
Skewness: 0.019953152
Sum: 22957672
Variance: 3.7437813 × 10^11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=22)
Common values

Value | Count | Frequency (%)
2149866 | 1 | 4.3%
1044421 | 1 | 4.3%
208593 | 1 | 4.3%
281907 | 1 | 4.3%
317524 | 1 | 4.3%
282703 | 1 | 4.3%
269646 | 1 | 4.3%
303264 | 1 | 4.3%
446907 | 1 | 4.3%
693944 | 1 | 4.3%
Other values (12) | 12 | 52.2%

Minimum 10 values

Value | Count | Frequency (%)
208593 | 1 | 4.3%
269646 | 1 | 4.3%
281907 | 1 | 4.3%
282703 | 1 | 4.3%
303264 | 1 | 4.3%
317524 | 1 | 4.3%
446907 | 1 | 4.3%
693944 | 1 | 4.3%
860830 | 1 | 4.3%
1044421 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
2149866 | 1 | 4.3%
1944228 | 1 | 4.3%
1785454 | 1 | 4.3%
1601901 | 1 | 4.3%
1598505 | 1 | 4.3%
1496719 | 1 | 4.3%
1440672 | 1 | 4.3%
1375462 | 1 | 4.3%
1336867 | 1 | 4.3%
1232910 | 1 | 4.3%

기타석유류(톤)
Real number (ℝ)

MISSING 

Distinct: 19
Distinct (%): 100.0%
Missing: 4
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 904885.16
Minimum: 554584
Maximum: 1463963
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 554584
5-th percentile: 609961
Q1: 792224.5
Median: 907574
Q3: 1021275.5
95-th percentile: 1154538.5
Maximum: 1463963
Range: 909379
Interquartile range (IQR): 229051

Descriptive statistics

Standard deviation: 208154.63
Coefficient of variation (CV): 0.23003431
Kurtosis: 1.7753266
Mean: 904885.16
Median Absolute Deviation (MAD): 126541
Skewness: 0.71761834
Sum: 17192818
Variance: 4.3328351 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)
Common values

Value | Count | Frequency (%)
892788 | 1 | 4.3%
554584 | 1 | 4.3%
702409 | 1 | 4.3%
805802 | 1 | 4.3%
843212 | 1 | 4.3%
888670 | 1 | 4.3%
907574 | 1 | 4.3%
913466 | 1 | 4.3%
919645 | 1 | 4.3%
1463963 | 1 | 4.3%
Other values (9) | 9 | 39.1%
(Missing) | 4 | 17.4%

Minimum 10 values

Value | Count | Frequency (%)
554584 | 1 | 4.3%
616114 | 1 | 4.3%
661100 | 1 | 4.3%
702409 | 1 | 4.3%
778647 | 1 | 4.3%
805802 | 1 | 4.3%
843212 | 1 | 4.3%
888670 | 1 | 4.3%
892788 | 1 | 4.3%
907574 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
1463963 | 1 | 4.3%
1120158 | 1 | 4.3%
1072677 | 1 | 4.3%
1070655 | 1 | 4.3%
1034115 | 1 | 4.3%
1008436 | 1 | 4.3%
938803 | 1 | 4.3%
919645 | 1 | 4.3%
913466 | 1 | 4.3%
907574 | 1 | 4.3%

경유(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value | Count
<NA> | 20
339814 | 1
216148 | 1
66002 | 1

Length

Max length: 6
Median length: 4
Mean length: 4.2173913
Min length: 4

Unique

Unique: 3
Unique (%): 13.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 20 | 87.0%
339814 | 1 | 4.3%
216148 | 1 | 4.3%
66002 | 1 | 4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 20 | 87.0%
339814 | 1 | 4.3%
216148 | 1 | 4.3%
66002 | 1 | 4.3%

등유(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value | Count
<NA> | 20
28974 | 1
20760 | 1
7334 | 1

Length

Max length: 5
Median length: 4
Mean length: 4.0869565
Min length: 4

Unique

Unique: 3
Unique (%): 13.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 20 | 87.0%
28974 | 1 | 4.3%
20760 | 1 | 4.3%
7334 | 1 | 4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 20 | 87.0%
28974 | 1 | 4.3%
20760 | 1 | 4.3%
7334 | 1 | 4.3%

JP8제트유(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value | Count
<NA> | 19
264887 | 1
208657 | 1
224641 | 1
190636 | 1

Length

Max length: 6
Median length: 4
Mean length: 4.3478261
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
264887 | 1 | 4.3%
208657 | 1 | 4.3%
224641 | 1 | 4.3%
190636 | 1 | 4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 19 | 82.6%
264887 | 1 | 4.3%
208657 | 1 | 4.3%
224641 | 1 | 4.3%
190636 | 1 | 4.3%

유류기타(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value | Count
<NA> | 20
7929 | 1
7880 | 1
1857 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 3
Unique (%): 13.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 20 | 87.0%
7929 | 1 | 4.3%
7880 | 1 | 4.3%
1857 | 1 | 4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 20 | 87.0%
7929 | 1 | 4.3%
7880 | 1 | 4.3%
1857 | 1 | 4.3%

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

Variable | 연도 | 휘발유(톤) | 방카씨유(톤) | 기타석유류(톤) | 경유(톤) | 등유(톤) | JP8제트유(톤) | 유류기타(톤)
연도 | 1.000 | 0.875 | 0.704 | 0.594 | 1.000 | 1.000 | 1.000 | 1.000
휘발유(톤) | 0.875 | 1.000 | 0.907 | 0.723 | NaN | NaN | NaN | NaN
방카씨유(톤) | 0.704 | 0.907 | 1.000 | 0.615 | NaN | NaN | NaN | NaN
기타석유류(톤) | 0.594 | 0.723 | 0.615 | 1.000 | NaN | NaN | NaN | NaN
경유(톤) | 1.000 | NaN | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000
등유(톤) | 1.000 | NaN | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000
JP8제트유(톤) | 1.000 | NaN | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000
유류기타(톤) | 1.000 | NaN | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000

Variable | 유류기타(톤) | 등유(톤) | JP8제트유(톤) | 경유(톤)
유류기타(톤) | 1.000 | 1.000 | 1.000 | 1.000
등유(톤) | 1.000 | 1.000 | 1.000 | 1.000
JP8제트유(톤) | 1.000 | 1.000 | 1.000 | 1.000
경유(톤) | 1.000 | 1.000 | 1.000 | 1.000

Variable | 연도 | 휘발유(톤) | 방카씨유(톤) | 기타석유류(톤) | 경유(톤) | 등유(톤) | JP8제트유(톤) | 유류기타(톤)
연도 | 1.000 | -0.785 | -0.988 | -0.305 | 1.000 | 1.000 | 1.000 | 1.000
휘발유(톤) | -0.785 | 1.000 | 0.773 | 0.046 | NaN | NaN | NaN | NaN
방카씨유(톤) | -0.988 | 0.773 | 1.000 | 0.302 | 1.000 | 1.000 | 1.000 | 1.000
기타석유류(톤) | -0.305 | 0.046 | 0.302 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
경유(톤) | 1.000 | NaN | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
등유(톤) | 1.000 | NaN | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
JP8제트유(톤) | 1.000 | NaN | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
유류기타(톤) | 1.000 | NaN | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
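
Nullity correlation can be sketched as the Pearson correlation between two 0/1 missing-value indicators. The example below assumes that the 3 missing values of 휘발유(톤) fall in 2016 to 2018 and the 4 missing values of 기타석유류(톤) fall in 2015 to 2018, which is what the last rows of the Sample section and the per-variable missing counts indicate. The helper `pearson` is written out here for illustration and is not the profiler's own code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# One indicator per observation (1996 .. 2018): 1 = value missing, 0 = value present.
gasoline_missing = [0] * 20 + [1] * 3         # 휘발유(톤): missing in the last 3 years
other_petroleum_missing = [0] * 19 + [1] * 4  # 기타석유류(톤): missing in the last 4 years

nullity_corr = pearson(gasoline_missing, other_petroleum_missing)
print(round(nullity_corr, 3))
```

A value close to 1 means the two columns tend to be missing in the same rows; here both go missing together at the end of the series.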

Sample

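First rows

Index | 연도 | 휘발유(톤) | 방카씨유(톤) | 기타석유류(톤) | 경유(톤) | 등유(톤) | JP8제트유(톤) | 유류기타(톤)
0 | 1996 | 1013668 | 2149866 | 1463963 | <NA> | <NA> | <NA> | <NA>
1 | 1997 | 875711 | 1944228 | 892788 | <NA> | <NA> | <NA> | <NA>
2 | 1998 | 199790 | 1601901 | 616114 | <NA> | <NA> | <NA> | <NA>
3 | 1999 | 230935 | 1785454 | 661100 | <NA> | <NA> | <NA> | <NA>
4 | 2000 | 202898 | 1598505 | 778647 | <NA> | <NA> | <NA> | <NA>
5 | 2001 | 157580 | 1496719 | 938803 | <NA> | <NA> | <NA> | <NA>
6 | 2002 | 174266 | 1440672 | 1034115 | <NA> | <NA> | <NA> | <NA>
7 | 2003 | 144083 | 1375462 | 1120158 | <NA> | <NA> | <NA> | <NA>
8 | 2004 | 137641 | 1336867 | 1072677 | <NA> | <NA> | <NA> | <NA>
9 | 2005 | 98059 | 1232910 | 1070655 | <NA> | <NA> | <NA> | <NA>

Last rows

Index | 연도 | 휘발유(톤) | 방카씨유(톤) | 기타석유류(톤) | 경유(톤) | 등유(톤) | JP8제트유(톤) | 유류기타(톤)
13 | 2009 | 50040 | 860830 | 907574 | <NA> | <NA> | <NA> | <NA>
14 | 2010 | 58184 | 693944 | 888670 | <NA> | <NA> | <NA> | <NA>
15 | 2011 | 58080 | 446907 | 843212 | <NA> | <NA> | <NA> | <NA>
16 | 2012 | 63696 | 303264 | 805802 | <NA> | <NA> | <NA> | <NA>
17 | 2013 | 61080 | 269646 | 702409 | <NA> | <NA> | <NA> | <NA>
18 | 2014 | 52520 | 282703 | 554584 | <NA> | <NA> | <NA> | <NA>
19 | 2015 | 56680 | 317524 | <NA> | 339814 | 28974 | 264887 | 7929
20 | 2016 | <NA> | 281907 | <NA> | 216148 | 20760 | 208657 | 7880
21 | 2017 | <NA> | 208593 | <NA> | 66002 | 7334 | 224641 | 1857
22 | 2018 | <NA> | <NA> | <NA> | <NA> | <NA> | 190636 | <NA>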
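
The four columns typed as Categorical report 0 missing values yet list "<NA>" as their most frequent category, which suggests the placeholder was read as a literal string rather than as a missing value. The sketch below shows one way such cells could be coerced to numbers before profiling. It is an illustration under that assumption, not part of the original analysis; `to_number` is a hypothetical helper, and the rows are the last four sample rows restricted to the affected columns.

```python
def to_number(cell):
    """Map the "<NA>" placeholder to None and any other cell to an integer."""
    return None if cell == "<NA>" else int(cell)

# Columns: 연도, 경유(톤), 등유(톤), JP8제트유(톤), 유류기타(톤)
rows = [
    ("2015", "339814", "28974", "264887", "7929"),
    ("2016", "216148", "20760", "208657", "7880"),
    ("2017", "66002", "7334", "224641", "1857"),
    ("2018", "<NA>", "<NA>", "190636", "<NA>"),
]

parsed = [tuple(to_number(cell) for cell in row) for row in rows]
missing_per_row = [sum(value is None for value in row) for row in parsed]

for row, n_missing in zip(parsed, missing_per_row):
    print(row, "missing:", n_missing)
```

With the placeholder mapped to a real missing value, these columns would be profiled as sparse numeric variables and their missing counts would reflect the 19 or 20 empty years.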