Overview

Dataset statistics

Number of variables: 9
Number of observations: 23
Missing cells: 31
Missing cells (%): 15.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 86.7 B
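
The missing-cell percentage follows from the counts above: 31 missing cells out of 9 × 23 = 207 cells. The 31 also matches the per-column counts listed under Alerts (4 + 4 + 23). A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Figures copied from the dataset statistics above.
n_variables = 9
n_observations = 23
missing_cells = 31

total_cells = n_variables * n_observations        # 9 * 23 = 207
missing_pct = 100 * missing_cells / total_cells   # about 14.98

print(f"{missing_pct:.1f}%")  # prints "15.0%"
```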

Variable types

Numeric: 3
Unsupported: 1
Categorical: 5

Dataset

Description: Coal transport volumes by year (domestic, imported, etc.).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15068398/fileData.do

Alerts

High correlation: 발전용유연탄(톤) is highly overall correlated with 연도 and 4 other fields
High correlation: 시멘트용유연탄(톤) is highly overall correlated with 연도 and 4 other fields
High correlation: 경석(톤) is highly overall correlated with 연도 and 4 other fields
High correlation: 민수용무연탄(톤) is highly overall correlated with 연도 and 4 other fields
High correlation: 발전용무연탄(톤) is highly overall correlated with 연도 and 4 other fields
High correlation: 연도 is highly overall correlated with 국내(톤) and 6 other fields
High correlation: 국내(톤) is highly overall correlated with 연도 and 1 other field
High correlation: 수입(톤) is highly overall correlated with 연도 and 1 other field
Imbalance: 발전용무연탄(톤) is highly imbalanced (56.3%)
Imbalance: 민수용무연탄(톤) is highly imbalanced (56.3%)
Imbalance: 발전용유연탄(톤) is highly imbalanced (61.7%)
Imbalance: 시멘트용유연탄(톤) is highly imbalanced (56.3%)
Imbalance: 경석(톤) is highly imbalanced (56.3%)
Missing: 국내(톤) has 4 (17.4%) missing values
Missing: 수입(톤) has 4 (17.4%) missing values
Missing: 수출(톤) has 23 (100.0%) missing values
Unique: 연도 has unique values
Unsupported: 수출(톤) is an unsupported type, check if it needs cleaning or further analysis
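
The imbalance percentages above are consistent with one minus the normalized Shannon entropy of each column's value counts. That formula is an assumption made here because it reproduces the reported figures; the report itself does not state how the score is derived. The helper name `imbalance_score` is illustrative.

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the value
    counts and k is the number of distinct values (assumed formula)."""
    counts = Counter(values)
    k = len(counts)
    if k <= 1:
        # Degenerate case; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1 - entropy / math.log2(k)


# 발전용무연탄(톤): 19 "<NA>" strings plus 4 values that each occur once.
five_classes = ["<NA>"] * 19 + ["465006", "599807", "201061", "143170"]
# 발전용유연탄(톤): 20 "<NA>" strings plus 3 values that each occur once.
four_classes = ["<NA>"] * 20 + ["878350", "527629", "260037"]

print(f"{imbalance_score(five_classes):.1%}")  # prints "56.3%"
print(f"{imbalance_score(four_classes):.1%}")  # prints "61.7%"
```

Both outputs match the alerts, which suggests the score is driven almost entirely by the dominant "<NA>" value rather than by the tonnage figures themselves.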

Reproduction

Analysis started: 2023-12-11 23:55:41.925752
Analysis finished: 2023-12-11 23:55:43.397683
Duration: 1.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 [year]
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007
Minimum: 1996
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1996
5-th percentile: 1997.1
Q1: 2001.5
Median: 2007
Q3: 2012.5
95-th percentile: 2016.9
Maximum: 2018
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.78233
Coefficient of variation (CV): 0.0033793373
Kurtosis: -1.2
Mean: 2007
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 46161
Variance: 46
Monotonicity: Strictly increasing
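
Because 연도 is simply the 23 consecutive years from 1996 to 2018, the descriptive statistics above can be checked with the Python standard library. This is a sketch for verification only; it assumes the report uses the sample (n - 1) variance, which is the reading that yields 46 (the population variance would be 44).

```python
import statistics

years = list(range(1996, 2019))  # 23 consecutive years, 1996..2018

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)          # square root of the sample variance
cv = std / mean                        # coefficient of variation
median = statistics.median(years)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(y - median) for y in years)

print(total, mean, variance, round(std, 5), round(cv, 10), mad)
```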
[Figure: histogram with fixed-size bins (bins=23)]

Common values
Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
Other values (13) | 13 | 56.5%

Smallest 10 values
Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
1998 | 1 | 4.3%
1999 | 1 | 4.3%
2000 | 1 | 4.3%
2001 | 1 | 4.3%
2002 | 1 | 4.3%
2003 | 1 | 4.3%
2004 | 1 | 4.3%
2005 | 1 | 4.3%

Largest 10 values
Value | Count | Frequency (%)
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
2010 | 1 | 4.3%
2009 | 1 | 4.3%

국내(톤) [domestic, tonnes]
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 19
Distinct (%): 100.0%
Missing: 4
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3596087.5
Minimum: 944080
Maximum: 5027562
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 944080
5-th percentile: 1209546.7
Q1: 3049241
Median: 4089392
Q3: 4345787
95-th percentile: 4807981.8
Maximum: 5027562
Range: 4083482
Interquartile range (IQR): 1296546

Descriptive statistics

Standard deviation: 1292809.2
Coefficient of variation (CV): 0.35950438
Kurtosis: -0.20825126
Mean: 3596087.5
Median Absolute Deviation (MAD): 552753
Skewness: -1.1005371
Sum: 68325663
Variance: 1.6713556 × 10^12
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=19)]

Common values
Value | Count | Frequency (%)
4281399 | 1 | 4.3%
944080 | 1 | 4.3%
1239043 | 1 | 4.3%
1741369 | 1 | 4.3%
1536453 | 1 | 4.3%
2728292 | 1 | 4.3%
3370190 | 1 | 4.3%
4136569 | 1 | 4.3%
4070265 | 1 | 4.3%
4783584 | 1 | 4.3%
Other values (9) | 9 | 39.1%
(Missing) | 4 | 17.4%

Smallest 10 values
Value | Count | Frequency (%)
944080 | 1 | 4.3%
1239043 | 1 | 4.3%
1536453 | 1 | 4.3%
1741369 | 1 | 4.3%
2728292 | 1 | 4.3%
3370190 | 1 | 4.3%
4025969 | 1 | 4.3%
4055006 | 1 | 4.3%
4070265 | 1 | 4.3%
4089392 | 1 | 4.3%

Largest 10 values
Value | Count | Frequency (%)
5027562 | 1 | 4.3%
4783584 | 1 | 4.3%
4737617 | 1 | 4.3%
4642145 | 1 | 4.3%
4407588 | 1 | 4.3%
4283986 | 1 | 4.3%
4281399 | 1 | 4.3%
4225154 | 1 | 4.3%
4136569 | 1 | 4.3%
4089392 | 1 | 4.3%

수입(톤) [imports, tonnes]
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 19
Distinct (%): 100.0%
Missing: 4
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2822966.3
Minimum: 2281731
Maximum: 3758397
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 2281731
5-th percentile: 2319359.1
Q1: 2422263
Median: 2808051
Q3: 3065482
95-th percentile: 3473354.4
Maximum: 3758397
Range: 1476666
Interquartile range (IQR): 643219

Descriptive statistics

Standard deviation: 441680.83
Coefficient of variation (CV): 0.15645983
Kurtosis: -0.72786665
Mean: 2822966.3
Median Absolute Deviation (MAD): 366160
Skewness: 0.50170113
Sum: 53636359
Variance: 1.9508195 × 10^11
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=19)]

Common values
Value | Count | Frequency (%)
2987572 | 1 | 4.3%
3344409 | 1 | 4.3%
3345638 | 1 | 4.3%
3134260 | 1 | 4.3%
3758397 | 1 | 4.3%
3441683 | 1 | 4.3%
2996704 | 1 | 4.3%
2940398 | 1 | 4.3%
2808051 | 1 | 4.3%
2979948 | 1 | 4.3%
Other values (9) | 9 | 39.1%
(Missing) | 4 | 17.4%

Smallest 10 values
Value | Count | Frequency (%)
2281731 | 1 | 4.3%
2323540 | 1 | 4.3%
2340728 | 1 | 4.3%
2351645 | 1 | 4.3%
2402635 | 1 | 4.3%
2441891 | 1 | 4.3%
2472467 | 1 | 4.3%
2576597 | 1 | 4.3%
2708065 | 1 | 4.3%
2808051 | 1 | 4.3%

Largest 10 values
Value | Count | Frequency (%)
3758397 | 1 | 4.3%
3441683 | 1 | 4.3%
3345638 | 1 | 4.3%
3344409 | 1 | 4.3%
3134260 | 1 | 4.3%
2996704 | 1 | 4.3%
2987572 | 1 | 4.3%
2979948 | 1 | 4.3%
2940398 | 1 | 4.3%
2808051 | 1 | 4.3%

수출(톤) [exports, tonnes]
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

발전용무연탄(톤) [anthracite for power generation, tonnes]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.3478261
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
465006 | 1 | 4.3%
599807 | 1 | 4.3%
201061 | 1 | 4.3%
143170 | 1 | 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

민수용무연탄(톤) [anthracite for household use, tonnes]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.3478261
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
453166 | 1 | 4.3%
369718 | 1 | 4.3%
382752 | 1 | 4.3%
290613 | 1 | 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

발전용유연탄(톤) [bituminous coal for power generation, tonnes]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.2608696
Min length: 4

Unique

Unique: 3
Unique (%): 13.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 20 | 87.0%
878350 | 1 | 4.3%
527629 | 1 | 4.3%
260037 | 1 | 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시멘트용유연탄(톤) [bituminous coal for cement, tonnes]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 7
Median length: 4
Mean length: 4.5217391
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
1958518 | 1 | 4.3%
1753819 | 1 | 4.3%
1794689 | 1 | 4.3%
1571366 | 1 | 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

경석(톤) [coal gangue, tonnes]
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.173913
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
65105 | 1 | 4.3%
56938 | 1 | 4.3%
85957 | 1 | 4.3%
23745 | 1 | 4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation plot]

Variable | 연도 | 국내(톤) | 수입(톤) | 발전용무연탄(톤) | 민수용무연탄(톤) | 발전용유연탄(톤) | 시멘트용유연탄(톤) | 경석(톤)
연도 | 1.000 | 0.366 | 0.640 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
국내(톤) | 0.366 | 1.000 | 0.461 | NaN | NaN | NaN | NaN | NaN
수입(톤) | 0.640 | 0.461 | 1.000 | NaN | NaN | NaN | NaN | NaN
발전용무연탄(톤) | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
민수용무연탄(톤) | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
발전용유연탄(톤) | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시멘트용유연탄(톤) | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
경석(톤) | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation plot]

Variable | 발전용유연탄(톤) | 시멘트용유연탄(톤) | 경석(톤) | 민수용무연탄(톤) | 발전용무연탄(톤)
발전용유연탄(톤) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시멘트용유연탄(톤) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
경석(톤) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
민수용무연탄(톤) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
발전용무연탄(톤) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation plot]

Variable | 연도 | 국내(톤) | 수입(톤) | 발전용무연탄(톤) | 민수용무연탄(톤) | 발전용유연탄(톤) | 시멘트용유연탄(톤) | 경석(톤)
연도 | 1.000 | -0.742 | 0.596 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
국내(톤) | -0.742 | 1.000 | -0.651 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
수입(톤) | 0.596 | -0.651 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
발전용무연탄(톤) | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
민수용무연탄(톤) | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
발전용유연탄(톤) | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시멘트용유연탄(톤) | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
경석(톤) | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

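First 10 rows
Index | 연도 | 국내(톤) | 수입(톤) | 수출(톤) | 발전용무연탄(톤) | 민수용무연탄(톤) | 발전용유연탄(톤) | 시멘트용유연탄(톤) | 경석(톤)
0 | 1996 | 4783584 | 2979948 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 1997 | 4281399 | 2987572 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 1998 | 4225154 | 2323540 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 1999 | 4055006 | 2402635 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 2000 | 4642145 | 2472467 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 2001 | 4737617 | 2441891 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 2002 | 4089392 | 2576597 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 2003 | 4407588 | 2708065 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 2004 | 4025969 | 2351645 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 2005 | 4283986 | 2281731 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last 10 rows
Index | 연도 | 국내(톤) | 수입(톤) | 수출(톤) | 발전용무연탄(톤) | 민수용무연탄(톤) | 발전용유연탄(톤) | 시멘트용유연탄(톤) | 경석(톤)
13 | 2009 | 3370190 | 2996704 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
14 | 2010 | 2728292 | 3441683 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15 | 2011 | 1536453 | 3758397 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
16 | 2012 | 1741369 | 3134260 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
17 | 2013 | 1239043 | 3345638 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
18 | 2014 | 944080 | 3344409 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
19 | 2015 | <NA> | <NA> | <NA> | 465006 | 453166 | 878350 | 1958518 | 65105
20 | 2016 | <NA> | <NA> | <NA> | 599807 | 369718 | 527629 | 1753819 | 56938
21 | 2017 | <NA> | <NA> | <NA> | 201061 | 382752 | 260037 | 1794689 | 85957
22 | 2018 | <NA> | <NA> | <NA> | 143170 | 290613 | <NA> | 1571366 | 23745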
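
The five end-use columns are typed as Categorical with 0 missing values, and their mean lengths (for example 100/23 ≈ 4.35) are what you would get if "<NA>" were stored as a 4-character string. Under that assumption, a minimal cleaning sketch converts those cells to real missing values and the rest to integers. The helper name `to_tonnes` and the dictionary layout are illustrative, not part of the report or of ydata-profiling.

```python
def to_tonnes(cell):
    """Convert a raw cell to integer tonnes; "<NA>" and empty cells become None."""
    if cell is None:
        return None
    text = str(cell).strip()
    if text in ("", "<NA>"):
        return None
    return int(text)


# End-use columns for the last four rows of the sample, in report order:
# 발전용무연탄, 민수용무연탄, 발전용유연탄, 시멘트용유연탄, 경석 (all in tonnes).
raw_rows = {
    2015: ["465006", "453166", "878350", "1958518", "65105"],
    2016: ["599807", "369718", "527629", "1753819", "56938"],
    2017: ["201061", "382752", "260037", "1794689", "85957"],
    2018: ["143170", "290613", "<NA>", "1571366", "23745"],
}

cleaned = {year: [to_tonnes(c) for c in cells] for year, cells in raw_rows.items()}

for year, values in cleaned.items():
    print(year, values)
```

After a conversion like this, re-running the profile should classify these columns as numeric and count the "<NA>" cells as missing, which would also change the imbalance and correlation alerts.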