Overview

Dataset statistics

Number of variables: 7
Number of observations: 23
Missing cells: 27
Missing cells (%): 16.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.5 KiB
Average record size in memory: 68.7 B
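
The missing-cell percentage follows from the other figures: 7 variables times 23 observations gives 161 cells, of which 27 are missing. A minimal Python sketch of that arithmetic, using the per-column missing counts listed in the Alerts section (the variable names here are illustrative, not part of the report):

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 7
n_observations = 23

# Per-column missing counts as listed under Alerts.
missing_by_column = {
    "시멘트(톤-키로)": 4,
    "수입(톤-키로)": 23,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 161
print(missing_cells)          # 27
print(f"{missing_pct:.1f}%")  # 16.8%
```

Note that the "<NA>" entries in 수출(톤-키로) and 포대(톤-키로) do not contribute to this total, since the report counts them as category values rather than missing cells.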

Variable types

Numeric: 4
Categorical: 2
Unsupported: 1

Dataset

Description: Annual cement (bulk, clinker, etc.) transport performance.
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15068512/fileData.do

Alerts

High correlation: 연도 (year) is highly overall correlated with 시멘트(톤-키로) (cement, tonne-km) and 3 other fields
High correlation: 시멘트(톤-키로) is highly overall correlated with 연도 and 1 other field
High correlation: 벌크(톤-키로) (bulk, tonne-km) is highly overall correlated with 연도 and 2 other fields
High correlation: 크링카(톤-키로) (clinker, tonne-km) is highly overall correlated with 연도 and 3 other fields
High correlation: 포대(톤-키로) (bagged, tonne-km) is highly overall correlated with 연도 and 2 other fields
Imbalance: 수출(톤-키로) (export, tonne-km) is highly imbalanced (74.2%)
Imbalance: 포대(톤-키로) is highly imbalanced (56.3%)
Missing: 시멘트(톤-키로) has 4 (17.4%) missing values
Missing: 수입(톤-키로) (import, tonne-km) has 23 (100.0%) missing values
Unique: 연도 has unique values
Unique: 벌크(톤-키로) has unique values
Unique: 크링카(톤-키로) has unique values
Unsupported: 수입(톤-키로) is an unsupported type; check whether it needs cleaning or further analysis
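
The two imbalance percentages can be reproduced from the class counts in the Common Values tables. The sketch below assumes the score is one minus the normalised Shannon entropy, 1 - H / log2(k); this formula is an assumption that happens to match the reported figures, not something stated in the report itself, and `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


export_counts = [22, 1]           # 수출(톤-키로): "<NA>" 22 times, one value
bagged_counts = [19, 1, 1, 1, 1]  # 포대(톤-키로): "<NA>" 19 times, four values

print(f"{imbalance_score(export_counts):.1%}")  # 74.2%
print(f"{imbalance_score(bagged_counts):.1%}")  # 56.3%
```

Under this reading, a score near 100% means one class dominates; here the dominant class in both columns is the literal "<NA>" marker.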

Reproduction

Analysis started: 2023-12-12 08:48:35.614080
Analysis finished: 2023-12-12 08:48:37.887624
Duration: 2.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007
Minimum: 1996
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1996
5-th percentile: 1997.1
Q1: 2001.5
Median: 2007
Q3: 2012.5
95-th percentile: 2016.9
Maximum: 2018
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.78233
Coefficient of variation (CV): 0.0033793373
Kurtosis: -1.2
Mean: 2007
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 46161
Variance: 46
Monotonicity: Strictly increasing
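
Because 연도 is simply the 23 consecutive years 1996 to 2018, these statistics can be checked with the Python standard library. The sketch assumes the report uses the sample (n - 1) variance and linearly interpolated quartiles, which is what the reported values are consistent with.

```python
import statistics

years = list(range(1996, 2019))  # 23 consecutive years, 1996..2018

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance (divides by n - 1)
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation
median = statistics.median(years)
mad = statistics.median([abs(y - median) for y in years])
q1, q2, q3 = statistics.quantiles(years, n=4, method="inclusive")

print(sum(years), mean, variance)  # 46161 2007 46
print(round(std, 5))               # 6.78233
print(q1, q3, q3 - q1)             # 2001.5 2012.5 11.0
print(mad)                         # 6
```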
Histogram with fixed size bins (bins=23)
Common values
Value              Count  Frequency (%)
1996               1      4.3%
1997               1      4.3%
2018               1      4.3%
2017               1      4.3%
2016               1      4.3%
2015               1      4.3%
2014               1      4.3%
2013               1      4.3%
2012               1      4.3%
2011               1      4.3%
Other values (13)  13     56.5%

Minimum 10 values
Value  Count  Frequency (%)
1996   1      4.3%
1997   1      4.3%
1998   1      4.3%
1999   1      4.3%
2000   1      4.3%
2001   1      4.3%
2002   1      4.3%
2003   1      4.3%
2004   1      4.3%
2005   1      4.3%

Maximum 10 values
Value  Count  Frequency (%)
2018   1      4.3%
2017   1      4.3%
2016   1      4.3%
2015   1      4.3%
2014   1      4.3%
2013   1      4.3%
2012   1      4.3%
2011   1      4.3%
2010   1      4.3%
2009   1      4.3%

시멘트(톤-키로)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 100.0%
Missing: 4
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1678403 × 10^8
Minimum: 61051192
Maximum: 7.4815364 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 61051192
5-th percentile: 66297550
Q1: 70946856
Median: 1.0292396 × 10^8
Q3: 2.9457767 × 10^8
95-th percentile: 6.2949624 × 10^8
Maximum: 7.4815364 × 10^8
Range: 6.8710245 × 10^8
Interquartile range (IQR): 2.2363081 × 10^8

Descriptive statistics

Standard deviation: 1.9802002 × 10^8
Coefficient of variation (CV): 0.91344377
Kurtosis: 2.0029239
Mean: 2.1678403 × 10^8
Median Absolute Deviation (MAD): 41872773
Skewness: 1.5245405
Sum: 4.1188966 × 10^9
Variance: 3.9211929 × 10^16
Monotonicity: Not monotonic
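
The mean (about 2.17 × 10^8) is roughly twice the median (about 1.03 × 10^8), which is what a positive skewness of 1.52 suggests: a few large early values pull the mean up. A short sketch recomputing the headline figures from the 19 non-missing values listed in this section's value tables, assuming the sample (n - 1) standard deviation:

```python
import statistics

# The 19 non-missing 시멘트(톤-키로) values, in ascending order,
# copied from the value tables in this section.
cement = [
    61051191.5, 66880478.5, 67974131.9, 68484852.3, 69146601.4,
    72747110.7, 74289362.0, 74443358.3, 76686659.5, 102923964.5,
    182939095.2, 262859275.0, 284585217.3, 294473429.2, 294681909.9,
    320597490.2, 379666718.1, 616312083.8, 748153644.5,
]

total = sum(cement)
mean = statistics.fmean(cement)
median = statistics.median(cement)
std = statistics.stdev(cement)  # sample standard deviation (n - 1)
cv = std / mean                 # coefficient of variation

print(f"sum={total:.7e} mean={mean:.7e} median={median:.7e}")
print(f"std={std:.7e} cv={cv:.6f}")
print(mean > median)  # True: the mean sits above the median (right skew)
```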
Histogram with fixed size bins (bins=19)
Common values
Value             Count  Frequency (%)
616312083.8       1      4.3%
61051191.5        1      4.3%
69146601.4        1      4.3%
66880478.5        1      4.3%
67974131.9        1      4.3%
68484852.3        1      4.3%
72747110.7        1      4.3%
74443358.3        1      4.3%
74289362.0        1      4.3%
748153644.5       1      4.3%
Other values (9)  9      39.1%
(Missing)         4      17.4%

Minimum 10 values
Value        Count  Frequency (%)
61051191.5   1      4.3%
66880478.5   1      4.3%
67974131.9   1      4.3%
68484852.3   1      4.3%
69146601.4   1      4.3%
72747110.7   1      4.3%
74289362.0   1      4.3%
74443358.3   1      4.3%
76686659.5   1      4.3%
102923964.5  1      4.3%

Maximum 10 values
Value        Count  Frequency (%)
748153644.5  1      4.3%
616312083.8  1      4.3%
379666718.1  1      4.3%
320597490.2  1      4.3%
294681909.9  1      4.3%
294473429.2  1      4.3%
284585217.3  1      4.3%
262859275.0  1      4.3%
182939095.2  1      4.3%
102923964.5  1      4.3%

벌크(톤-키로)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9130975 × 10^9
Minimum: 2.3736904 × 10^9
Maximum: 3.4669025 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 2.3736904 × 10^9
5-th percentile: 2.5124557 × 10^9
Q1: 2.704202 × 10^9
Median: 2.8516761 × 10^9
Q3: 3.1283922 × 10^9
95-th percentile: 3.4177633 × 10^9
Maximum: 3.4669025 × 10^9
Range: 1.0932122 × 10^9
Interquartile range (IQR): 4.2419015 × 10^8

Descriptive statistics

Standard deviation: 2.989485 × 10^8
Coefficient of variation (CV): 0.10262221
Kurtosis: -0.77430949
Mean: 2.9130975 × 10^9
Median Absolute Deviation (MAD): 2.0730918 × 10^8
Skewness: 0.21129403
Sum: 6.7001242 × 10^10
Variance: 8.9370208 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Common values
Value              Count  Frequency (%)
3083892167.0       1      4.3%
3212071881.5       1      4.3%
2373690372.4       1      4.3%
2505434675.3       1      4.3%
2575645126.0       1      4.3%
2780367096.4       1      4.3%
2588892710.0       1      4.3%
2851676092.0       1      4.3%
2686931013.0       1      4.3%
2755416628.5       1      4.3%
Other values (13)  13     56.5%

Minimum 10 values
Value         Count  Frequency (%)
2373690372.4  1      4.3%
2505434675.3  1      4.3%
2575645126.0  1      4.3%
2588892710.0  1      4.3%
2682484486.1  1      4.3%
2686931013.0  1      4.3%
2721473011.7  1      4.3%
2728206999.1  1      4.3%
2755416628.5  1      4.3%
2780367096.4  1      4.3%

Maximum 10 values
Value         Count  Frequency (%)
3466902547.8  1      4.3%
3433174405.9  1      4.3%
3279063185.1  1      4.3%
3257805230.5  1      4.3%
3212071881.5  1      4.3%
3172892167.3  1      4.3%
3083892167.0  1      4.3%
3058985267.6  1      4.3%
3037888176.2  1      4.3%
2984131374.1  1      4.3%

크링카(톤-키로)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4168861 × 10^8
Minimum: 4236436.2
Maximum: 3.9734015 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 4236436.2
5-th percentile: 23134286
Q1: 51840466
Median: 1.5471002 × 10^8
Q3: 1.9325081 × 10^8
95-th percentile: 3.1073003 × 10^8
Maximum: 3.9734015 × 10^8
Range: 3.9310371 × 10^8
Interquartile range (IQR): 1.4141035 × 10^8

Descriptive statistics

Standard deviation: 1.0047479 × 10^8
Coefficient of variation (CV): 0.70912399
Kurtosis: 0.37085332
Mean: 1.4168861 × 10^8
Median Absolute Deviation (MAD): 83306848
Skewness: 0.70788669
Sum: 3.2588379 × 10^9
Variance: 1.0095183 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Common values
Value              Count  Frequency (%)
317966847.7        1      4.3%
397340146.1        1      4.3%
4236436.2          1      4.3%
22672766.5         1      4.3%
27287961.3         1      4.3%
42981350.1         1      4.3%
42340169.2         1      4.3%
71403171.8         1      4.3%
68969944.6         1      4.3%
57030758.6         1      4.3%
Other values (13)  13     56.5%

Minimum 10 values
Value       Count  Frequency (%)
4236436.2   1      4.3%
22672766.5  1      4.3%
27287961.3  1      4.3%
42340169.2  1      4.3%
42981350.1  1      4.3%
46650172.6  1      4.3%
57030758.6  1      4.3%
68969944.6  1      4.3%
71403171.8  1      4.3%
95161101.8  1      4.3%

Maximum 10 values
Value        Count  Frequency (%)
397340146.1  1      4.3%
317966847.7  1      4.3%
245598673.2  1      4.3%
208625018.7  1      4.3%
207337494.4  1      4.3%
195669962.9  1      4.3%
190831664.8  1      4.3%
188452067.3  1      4.3%
188384790.9  1      4.3%
182209252.0  1      4.3%

수출(톤-키로)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value      Count
<NA>       22
1559534.1  1

Length

Max length: 9
Median length: 4
Mean length: 4.2173913
Min length: 4

Unique

Unique: 1
Unique (%): 4.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 1559534.1
4th row: <NA>
5th row: <NA>

Common Values

Value      Count  Frequency (%)
<NA>       22     95.7%
1559534.1  1      4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common value counts]

수입(톤-키로)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

포대(톤-키로)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Value       Count
<NA>        19
56542904.9  1
44875286.7  1
46301522.6  1
47403530.8  1

Length

Max length: 10
Median length: 4
Mean length: 5.0434783
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        19     82.6%
56542904.9  1      4.3%
44875286.7  1      4.3%
46301522.6  1      4.3%
47403530.8  1      4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common value counts]
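
Both 수출(톤-키로) and 포대(톤-키로) are typed as Categorical with zero missing values, yet "<NA>" is their most common value and their minimum length is 4. That pattern suggests the missing-value marker was read as a literal string. A minimal cleaning sketch under that assumption follows; `parse_tonne_km` is a hypothetical helper, not part of the report or of any library.

```python
from typing import Optional


def parse_tonne_km(raw: str) -> Optional[float]:
    """Hypothetical helper: map the "<NA>" marker (or an empty string)
    to None and parse anything else as a float."""
    text = raw.strip()
    if text in {"<NA>", ""}:
        return None
    return float(text)


# 포대(톤-키로) for rows 18-22 of the Sample section (years 2014-2018).
raw_bagged = ["<NA>", "56542904.9", "44875286.7", "46301522.6", "47403530.8"]

bagged = [parse_tonne_km(value) for value in raw_bagged]
numeric = [value for value in bagged if value is not None]

print(bagged)
print(len(numeric), min(numeric), max(numeric))
```

After such a conversion the two columns would presumably be profiled as numeric with genuine missing values, rather than as imbalanced categoricals.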

Interactions

[Figures: 16 interaction plots, one per pair of the four numeric variables]

Correlations

Variable         연도    시멘트(톤-키로)  벌크(톤-키로)  크링카(톤-키로)  포대(톤-키로)
연도             1.000   0.768            0.564          0.803            1.000
시멘트(톤-키로)  0.768   1.000            0.369          0.835            NaN
벌크(톤-키로)    0.564   0.369            1.000          0.757            1.000
크링카(톤-키로)  0.803   0.835            0.757          1.000            NaN
포대(톤-키로)    1.000   NaN              1.000          NaN              1.000

Variable       수출(톤-키로)  포대(톤-키로)
수출(톤-키로)  1.000          NaN
포대(톤-키로)  NaN            1.000

Variable         연도     시멘트(톤-키로)  벌크(톤-키로)  크링카(톤-키로)  수출(톤-키로)  포대(톤-키로)
연도             1.000    -0.986           -0.614         -0.870           NaN            1.000
시멘트(톤-키로)  -0.986   1.000            0.430          0.779            NaN            0.000
벌크(톤-키로)    -0.614   0.430            1.000          0.743            NaN            1.000
크링카(톤-키로)  -0.870   0.779            0.743          1.000            NaN            1.000
수출(톤-키로)    NaN      NaN              NaN            NaN              1.000          0.000
포대(톤-키로)    1.000    0.000            1.000          1.000            0.000          1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
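
As a rough text stand-in for the two figures, the sketch below computes nullity by column and prints a small nullity matrix for rows 18 to 22 of the Sample section, with None standing for <NA>. It covers only those five rows, so the counts are not the dataset totals.

```python
columns = [
    "연도", "시멘트(톤-키로)", "벌크(톤-키로)", "크링카(톤-키로)",
    "수출(톤-키로)", "수입(톤-키로)", "포대(톤-키로)",
]

# Rows 18-22 of the Sample section; None stands for <NA>.
rows = [
    (2014, 61051191.5, 2588892710.0, 42340169.2, None, None, None),
    (2015, None, 2780367096.4, 42981350.1, None, None, 56542904.9),
    (2016, None, 2575645126.0, 27287961.3, None, None, 44875286.7),
    (2017, None, 2505434675.3, 22672766.5, None, None, 46301522.6),
    (2018, None, 2373690372.4, 4236436.2, None, None, 47403530.8),
]

# Nullity by column: number of missing cells per column.
null_counts = {
    name: sum(row[i] is None for row in rows)
    for i, name in enumerate(columns)
}

# Nullity matrix: '#' marks a present cell, '.' a missing one.
for row in rows:
    print(row[0], "".join("." if cell is None else "#" for cell in row))

# Rows in which both 시멘트 and 포대 are present.
i = columns.index("시멘트(톤-키로)")
j = columns.index("포대(톤-키로)")
both_present = sum(r[i] is not None and r[j] is not None for r in rows)
print(null_counts)
print(both_present)  # 0
```

In these rows 시멘트(톤-키로) and 포대(톤-키로) are never present together, which may be one reason some correlation cells for that pair are reported as NaN.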

Sample

First rows
    연도  시멘트(톤-키로)  벌크(톤-키로)  크링카(톤-키로)  수출(톤-키로)  수입(톤-키로)  포대(톤-키로)
0   1996  748153644.5      3083892167.0   317966847.7      <NA>           <NA>           <NA>
1   1997  616312083.8      3212071881.5   397340146.1      <NA>           <NA>           <NA>
2   1998  379666718.1      2682484486.1   195669962.9      1559534.1      <NA>           <NA>
3   1999  320597490.2      2721473011.7   190831664.8      <NA>           <NA>           <NA>
4   2000  294681909.9      3058985267.6   188384790.9      <NA>           <NA>           <NA>
5   2001  284585217.3      3172892167.3   182209252.0      <NA>           <NA>           <NA>
6   2002  294473429.2      3433174405.9   154710019.5      <NA>           <NA>           <NA>
7   2003  262859275.0      3466902547.8   245598673.2      <NA>           <NA>           <NA>
8   2004  182939095.2      3257805230.5   188452067.3      <NA>           <NA>           <NA>
9   2005  102923964.5      2816097642.2   148020819.0      <NA>           <NA>           <NA>

Last rows
    연도  시멘트(톤-키로)  벌크(톤-키로)  크링카(톤-키로)  수출(톤-키로)  수입(톤-키로)  포대(톤-키로)
13  2009  72747110.7       2948120056.9   95161101.8       <NA>           <NA>           <NA>
14  2010  68484852.3       2728206999.1   46650172.6       <NA>           <NA>           <NA>
15  2011  67974131.9       2755416628.5   57030758.6       <NA>           <NA>           <NA>
16  2012  66880478.5       2686931013.0   68969944.6       <NA>           <NA>           <NA>
17  2013  69146601.4       2851676092.0   71403171.8       <NA>           <NA>           <NA>
18  2014  61051191.5       2588892710.0   42340169.2       <NA>           <NA>           <NA>
19  2015  <NA>             2780367096.4   42981350.1       <NA>           <NA>           56542904.9
20  2016  <NA>             2575645126.0   27287961.3       <NA>           <NA>           44875286.7
21  2017  <NA>             2505434675.3   22672766.5       <NA>           <NA>           46301522.6
22  2018  <NA>             2373690372.4   4236436.2        <NA>           <NA>           47403530.8