Overview

Dataset statistics

Number of variables: 6
Number of observations: 23
Missing cells: 4
Missing cells (%): 2.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 KiB
Average record size in memory: 59.7 B
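
The percentage and per-record figures above follow from simple arithmetic on the counts. The sketch below re-derives them, assuming that a "cell" is one value of one variable for one observation and that the report rounds to one decimal place. The variable names are illustrative and are not part of the report.

```python
# Counts taken from the dataset statistics above.
n_variables = 6
n_observations = 23
missing_cells = 4

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The report lists 59.7 B per record; multiplying back by the row count
# gives a total that can be compared with the rounded "1.3 KiB" figure.
avg_record_bytes = 59.7
approx_total_kib = avg_record_bytes * n_observations / 1024

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Approximate total size: {approx_total_kib:.1f} KiB")
```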

Variable types

Numeric: 4
Categorical: 2

Dataset

Description: Annual rail transport volumes of cement (bulk, clinker, etc.).
Author: 한국철도공사 (Korea Railroad Corporation, KORAIL)
URL: https://www.data.go.kr/data/15068396/fileData.do

Alerts

High correlation: 연도 is highly overall correlated with 시멘트(톤) and 3 other fields
High correlation: 시멘트(톤) is highly overall correlated with 연도 and 1 other fields
High correlation: 벌크(톤) is highly overall correlated with 연도 and 2 other fields
High correlation: 크링카(톤) is highly overall correlated with 연도 and 3 other fields
High correlation: 포대(톤) is highly overall correlated with 연도 and 2 other fields
Imbalance: 수출(톤) is highly imbalanced (74.2%)
Imbalance: 포대(톤) is highly imbalanced (56.3%)
Missing: 시멘트(톤) has 4 (17.4%) missing values
Unique: 연도 has unique values
Unique: 벌크(톤) has unique values
Unique: 크링카(톤) has unique values
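
The two imbalance percentages in the alerts can be reproduced from the category counts listed later in the report (22 and 1 for 수출(톤); 19, 1, 1, 1, 1 for 포대(톤)). The sketch below assumes the score is one minus the Shannon entropy of the category distribution divided by its maximum possible value, log2 of the number of categories. That formula is an assumption, not something the report states, although it matches both figures. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts, where k is the number of categories."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # with a single category there is nothing to be imbalanced against
    # Shannon entropy in bits; empty categories contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


export_counts = [22, 1]            # 수출(톤): "<NA>" x 22, "4803" x 1
bagged_counts = [19, 1, 1, 1, 1]   # 포대(톤): "<NA>" x 19, four single values

print(f"수출(톤): {100 * imbalance_score(export_counts):.1f}%")
print(f"포대(톤): {100 * imbalance_score(bagged_counts):.1f}%")
```

A score of 0% would mean all categories are equally frequent, and a score near 100% means one category dominates.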

Reproduction

Analysis started: 2023-12-12 23:03:57.635645
Analysis finished: 2023-12-12 23:03:59.706045
Duration: 2.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
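
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a minimal sketch, not the configuration actually used (that is in config.json): the function name `build_report`, the report title, the file paths, and the `encoding` default are all assumptions. The libraries are imported inside the function so the definition loads even where they are not installed.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily: both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the title is illustrative only.
    profile = ProfileReport(df, title="Cement transport by year")
    profile.to_file(output_path)
    return output_path
```

For example, `build_report("cement_transport.csv")` would write `report.html` next to the script, where `cement_transport.csv` stands in for whatever name the file downloaded from the URL above has.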

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007
Minimum: 1996
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1996
5-th percentile: 1997.1
Q1: 2001.5
Median: 2007
Q3: 2012.5
95-th percentile: 2016.9
Maximum: 2018
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.78233
Coefficient of variation (CV): 0.0033793373
Kurtosis: -1.2
Mean: 2007
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 46161
Variance: 46
Monotonicity: Strictly increasing

[Figure: Histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
Other values (13) | 13 | 56.5%

Minimum 10 values

Value | Count | Frequency (%)
1996 | 1 | 4.3%
1997 | 1 | 4.3%
1998 | 1 | 4.3%
1999 | 1 | 4.3%
2000 | 1 | 4.3%
2001 | 1 | 4.3%
2002 | 1 | 4.3%
2003 | 1 | 4.3%
2004 | 1 | 4.3%
2005 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
2018 | 1 | 4.3%
2017 | 1 | 4.3%
2016 | 1 | 4.3%
2015 | 1 | 4.3%
2014 | 1 | 4.3%
2013 | 1 | 4.3%
2012 | 1 | 4.3%
2011 | 1 | 4.3%
2010 | 1 | 4.3%
2009 | 1 | 4.3%
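
The quantile and descriptive statistics reported for 연도 can be re-derived with the Python standard library, because the column holds the 23 consecutive years 1996 to 2018. The sketch assumes linearly interpolated percentiles (the `inclusive` method of `statistics.quantiles`) and the sample standard deviation with an n - 1 denominator. Both are assumptions about how the report computes its figures, though they match the numbers shown.

```python
import statistics

years = list(range(1996, 2019))  # 23 consecutive years, 1996..2018

# Linearly interpolated cut points: quartiles and twentieths (5% steps).
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
twentieths = statistics.quantiles(years, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

mean = statistics.mean(years)
std = statistics.stdev(years)        # sample standard deviation (n - 1)
variance = statistics.variance(years)
cv = std / mean                      # coefficient of variation
mad = statistics.median(abs(y - median) for y in years)
strictly_increasing = all(a < b for a, b in zip(years, years[1:]))

print(f"5%={p5}, Q1={q1}, median={median}, Q3={q3}, 95%={p95}")
print(f"IQR={q3 - q1}, range={max(years) - min(years)}, sum={sum(years)}")
print(f"std={std:.5f}, variance={variance}, CV={cv:.10f}, MAD={mad}")
print(f"strictly increasing: {strictly_increasing}")
```

The very small CV reflects the fact that 연도 is a calendar year: its spread (about 6.8 years) is tiny relative to its mean of 2007.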

시멘트(톤) (cement, tons)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 100.0%
Missing: 4
Missing (%): 17.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 996247.16
Minimum: 293618
Maximum: 3098098
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 293618
5-th percentile: 314509.7
Q1: 339790
Median: 524763
Q3: 1402909
95-th percentile: 2707312.6
Maximum: 3098098
Range: 2804480
Interquartile range (IQR): 1063119

Descriptive statistics

Standard deviation: 837172.7
Coefficient of variation (CV): 0.84032632
Kurtosis: 1.0700651
Mean: 996247.16
Median Absolute Deviation (MAD): 231145
Skewness: 1.2657099
Sum: 18928696
Variance: 7.0085814 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=19)]

Common values

Value | Count | Frequency (%)
2663892 | 1 | 4.3%
293618 | 1 | 4.3%
316831 | 1 | 4.3%
320567 | 1 | 4.3%
327606 | 1 | 4.3%
330909 | 1 | 4.3%
348671 | 1 | 4.3%
368986 | 1 | 4.3%
371316 | 1 | 4.3%
3098098 | 1 | 4.3%
Other values (9) | 9 | 39.1%
(Missing) | 4 | 17.4%

Minimum 10 values

Value | Count | Frequency (%)
293618 | 1 | 4.3%
316831 | 1 | 4.3%
320567 | 1 | 4.3%
327606 | 1 | 4.3%
330909 | 1 | 4.3%
348671 | 1 | 4.3%
368986 | 1 | 4.3%
371316 | 1 | 4.3%
383302 | 1 | 4.3%
524763 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
3098098 | 1 | 4.3%
2663892 | 1 | 4.3%
1706844 | 1 | 4.3%
1512046 | 1 | 4.3%
1415018 | 1 | 4.3%
1390800 | 1 | 4.3%
1337608 | 1 | 4.3%
1295641 | 1 | 4.3%
922180 | 1 | 4.3%
524763 | 1 | 4.3%
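
Because 시멘트(톤) has only 19 observed values, the minimum and maximum tables above list all of them between them (they overlap at 524763). That makes it possible to check how missing values enter the statistics. The sketch assumes that missing entries are excluded before computing distinct counts, means, and quantiles, and that only the missing percentage uses all 23 rows. Those conventions are inferred from the figures, not stated in the report. The order of the list is arbitrary and does not affect the results.

```python
import statistics

# 19 observed values read off the tables above; None marks the 4 missing rows.
cement = [
    3098098, 2663892, 1706844, 1512046, 1415018, 1390800, 1337608,
    1295641, 922180, 524763, 383302, 371316, 368986, 348671,
    330909, 327606, 320567, 316831, 293618,
    None, None, None, None,
]

observed = [v for v in cement if v is not None]

# Missing share is relative to all rows; distinct share is relative to observed rows.
missing_pct = 100 * (len(cement) - len(observed)) / len(cement)
distinct_pct = 100 * len(set(observed)) / len(observed)

mean = sum(observed) / len(observed)
q1, median, q3 = statistics.quantiles(observed, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(observed) - min(observed)
cv = statistics.stdev(observed) / mean  # sample standard deviation over mean

print(f"Missing (%): {missing_pct:.1f}%, Distinct (%): {distinct_pct:.1f}%")
print(f"Mean: {mean:.2f}, Q1: {q1}, median: {median}, Q3: {q3}")
print(f"IQR: {iqr}, range: {value_range}, CV: {cv:.4f}")
```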

벌크(톤) (bulk, tons)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14578542
Minimum: 12136076
Maximum: 17034903
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 12136076
5-th percentile: 12907935
Q1: 13713446
Median: 14413495
Q3: 15463336
95-th percentile: 16711669
Maximum: 17034903
Range: 4898827
Interquartile range (IQR): 1749890

Descriptive statistics

Standard deviation: 1264376.2
Coefficient of variation (CV): 0.086728581
Kurtosis: -0.47593184
Mean: 14578542
Median Absolute Deviation (MAD): 803167
Skewness: 0.20880197
Sum: 3.3530646 × 10^8
Variance: 1.5986472 × 10^12
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
14506523 | 1 | 4.3%
15954529 | 1 | 4.3%
12136076 | 1 | 4.3%
12882083 | 1 | 4.3%
13140603 | 1 | 4.3%
14413495 | 1 | 4.3%
13644768 | 1 | 4.3%
14242162 | 1 | 4.3%
13970151 | 1 | 4.3%
14081680 | 1 | 4.3%
Other values (13) | 13 | 56.5%

Minimum 10 values

Value | Count | Frequency (%)
12136076 | 1 | 4.3%
12882083 | 1 | 4.3%
13140603 | 1 | 4.3%
13335846 | 1 | 4.3%
13455219 | 1 | 4.3%
13644768 | 1 | 4.3%
13782123 | 1 | 4.3%
13970151 | 1 | 4.3%
14081680 | 1 | 4.3%
14242162 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
17034903 | 1 | 4.3%
16770594 | 1 | 4.3%
16181340 | 1 | 4.3%
15954529 | 1 | 4.3%
15822987 | 1 | 4.3%
15710009 | 1 | 4.3%
15216662 | 1 | 4.3%
15101880 | 1 | 4.3%
15088791 | 1 | 4.3%
14590832 | 1 | 4.3%

크링카(톤) (clinker, tons)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701845.35
Minimum: 20548
Maximum: 1975391
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 20548
5-th percentile: 106740
Q1: 241882
Median: 848816
Q3: 1013967
95-th percentile: 1449083.9
Maximum: 1975391
Range: 1954843
Interquartile range (IQR): 772085

Descriptive statistics

Standard deviation: 503775.06
Coefficient of variation (CV): 0.71778642
Kurtosis: 0.11160207
Mean: 701845.35
Median Absolute Deviation (MAD): 399410
Skewness: 0.59178147
Sum: 16142443
Variance: 2.5378931 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
1479651 | 1 | 4.3%
1975391 | 1 | 4.3%
20548 | 1 | 4.3%
104350 | 1 | 4.3%
128250 | 1 | 4.3%
197807 | 1 | 4.3%
187463 | 1 | 4.3%
287801 | 1 | 4.3%
311727 | 1 | 4.3%
266458 | 1 | 4.3%
Other values (13) | 13 | 56.5%

Minimum 10 values

Value | Count | Frequency (%)
20548 | 1 | 4.3%
104350 | 1 | 4.3%
128250 | 1 | 4.3%
187463 | 1 | 4.3%
197807 | 1 | 4.3%
217306 | 1 | 4.3%
266458 | 1 | 4.3%
287801 | 1 | 4.3%
311727 | 1 | 4.3%
449406 | 1 | 4.3%

Maximum 10 values

Value | Count | Frequency (%)
1975391 | 1 | 4.3%
1479651 | 1 | 4.3%
1173980 | 1 | 4.3%
1120148 | 1 | 4.3%
1017899 | 1 | 4.3%
1016861 | 1 | 4.3%
1011073 | 1 | 4.3%
970713 | 1 | 4.3%
894887 | 1 | 4.3%
868790 | 1 | 4.3%

수출(톤) (export, tons)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: <NA> (22), 4803 (1)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 4.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 4803
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 22 | 95.7%
4803 | 1 | 4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the category counts listed under Common Values]

포대(톤) (bagged, tons)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: <NA> (19), 279282 (1), 218988 (1), 236161 (1), 239643 (1)

Length

Max length: 6
Median length: 4
Mean length: 4.3478261
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
279282 | 1 | 4.3%
218988 | 1 | 4.3%
236161 | 1 | 4.3%
239643 | 1 | 4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the category counts listed under Common Values]
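
The categorical summary for 포대(톤) is consistent with the column being profiled as text, with "<NA>" counted as an ordinary four-character category. That would explain why Missing is 0 and the minimum length is 4. The sketch below checks this reading against the reported figures. Treating "<NA>" as a literal string is an inference from the numbers, not something the report states.

```python
from collections import Counter
import statistics

# 19 placeholder strings plus the four six-digit values from Common Values.
bagged = ["<NA>"] * 19 + ["279282", "218988", "236161", "239643"]

counts = Counter(bagged)
n_rows = len(bagged)

distinct = len(counts)                                # number of different categories
unique = sum(1 for c in counts.values() if c == 1)    # categories that occur exactly once
lengths = [len(value) for value in bagged]            # string length of every entry

print(f"Distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n_rows:.1f}%)")
print(f"Length: min={min(lengths)}, median={statistics.median(lengths)}, "
      f"mean={statistics.mean(lengths):.7f}, max={max(lengths)}")
```

Under this reading, "Distinct" counts every category including the placeholder, while "Unique" counts only categories that appear once.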

Interactions

[Figures: 16 interaction plots, generated with Matplotlib v3.7.2; images not reproduced]

Correlations

[Figure: correlation heatmap]

Variable | 연도 | 시멘트(톤) | 벌크(톤) | 크링카(톤) | 포대(톤)
연도 | 1.000 | 0.699 | 0.000 | 0.767 | 1.000
시멘트(톤) | 0.699 | 1.000 | 0.396 | 0.850 | NaN
벌크(톤) | 0.000 | 0.396 | 1.000 | 0.206 | 1.000
크링카(톤) | 0.767 | 0.850 | 0.206 | 1.000 | NaN
포대(톤) | 1.000 | NaN | 1.000 | NaN | 1.000

[Figure: correlation heatmap]

Variable | 포대(톤) | 수출(톤)
포대(톤) | 1.000 | NaN
수출(톤) | NaN | 1.000

[Figure: correlation heatmap]

Variable | 연도 | 시멘트(톤) | 벌크(톤) | 크링카(톤) | 수출(톤) | 포대(톤)
연도 | 1.000 | -0.995 | -0.500 | -0.867 | NaN | 1.000
시멘트(톤) | -0.995 | 1.000 | 0.232 | 0.753 | NaN | 0.000
벌크(톤) | -0.500 | 0.232 | 1.000 | 0.621 | NaN | 1.000
크링카(톤) | -0.867 | 0.753 | 0.621 | 1.000 | NaN | 1.000
수출(톤) | NaN | NaN | NaN | NaN | 1.000 | 0.000
포대(톤) | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

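First rows

Index | 연도 | 시멘트(톤) | 벌크(톤) | 크링카(톤) | 수출(톤) | 포대(톤)
0 | 1996 | 3098098 | 14506523 | 1479651 | <NA> | <NA>
1 | 1997 | 2663892 | 15954529 | 1975391 | <NA> | <NA>
2 | 1998 | 1706844 | 13335846 | 1011073 | 4803 | <NA>
3 | 1999 | 1512046 | 13455219 | 1016861 | <NA> | <NA>
4 | 2000 | 1390800 | 15101880 | 868790 | <NA> | <NA>
5 | 2001 | 1337608 | 15710009 | 894887 | <NA> | <NA>
6 | 2002 | 1415018 | 16770594 | 740215 | <NA> | <NA>
7 | 2003 | 1295641 | 17034903 | 1173980 | <NA> | <NA>
8 | 2004 | 922180 | 15822987 | 970713 | <NA> | <NA>
9 | 2005 | 524763 | 13782123 | 852903 | <NA> | <NA>

Last rows

Index | 연도 | 시멘트(톤) | 벌크(톤) | 크링카(톤) | 수출(톤) | 포대(톤)
13 | 2009 | 348671 | 15216662 | 449406 | <NA> | <NA>
14 | 2010 | 330909 | 14243201 | 217306 | <NA> | <NA>
15 | 2011 | 327606 | 14081680 | 266458 | <NA> | <NA>
16 | 2012 | 320567 | 13970151 | 311727 | <NA> | <NA>
17 | 2013 | 316831 | 14242162 | 287801 | <NA> | <NA>
18 | 2014 | 293618 | 13644768 | 187463 | <NA> | <NA>
19 | 2015 | <NA> | 14413495 | 197807 | <NA> | 279282
20 | 2016 | <NA> | 13140603 | 128250 | <NA> | 218988
21 | 2017 | <NA> | 12882083 | 104350 | <NA> | 236161
22 | 2018 | <NA> | 12136076 | 20548 | <NA> | 239643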