Overview

Dataset statistics

Number of variables: 9
Number of observations: 23
Missing cells: 9
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 86.7 B
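
The missing-cell percentage follows directly from the counts above. The short check below assumes the percentage is taken over all cells (variables × observations); the variable names are only illustrative.

```python
# Counts copied from the dataset statistics above.
n_variables = 9
n_observations = 23
missing_cells = 9

# Total number of cells in the table.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

All 9 missing cells come from a single column, 중량(톤), which is why that column alone shows 39.1% missing while the dataset-wide figure is only 4.3%.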

Variable types

Numeric: 3
Categorical: 6

Dataset

Description: Annual container transport performance (coal, clinker, slag, etc.).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15068393/fileData.do

Alerts

High correlation: 기타중량품(톤) is highly overall correlated with 연도 and 6 other fields
High correlation: 광재(톤) is highly overall correlated with 연도 and 6 other fields
High correlation: 석탄류(톤) is highly overall correlated with 연도 and 6 other fields
High correlation: 크링카(톤) is highly overall correlated with 연도 and 6 other fields
High correlation: 페로니켈(톤) is highly overall correlated with 연도 and 6 other fields
High correlation: 연도 is highly overall correlated with 중량(톤) and 5 other fields
High correlation: 일반(톤) is highly overall correlated with 중량(톤) and 5 other fields
High correlation: 중량(톤) is highly overall correlated with 연도 and 6 other fields
Imbalance: 석탄류(톤) is highly imbalanced (56.3%)
Imbalance: 크링카(톤) is highly imbalanced (56.3%)
Imbalance: 광재(톤) is highly imbalanced (61.7%)
Imbalance: 기타중량품(톤) is highly imbalanced (56.3%)
Imbalance: 페로니켈(톤) is highly imbalanced (56.3%)
Imbalance: JR컨테이너(톤) is highly imbalanced (74.2%)
Missing: 중량(톤) has 9 (39.1%) missing values
Unique: 연도 has unique values
Unique: 일반(톤) has unique values
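
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is inferred from the reported numbers, not taken from the ydata-profiling source; the function name `imbalance_score` and the handling of a single-class column are assumptions.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column gets score 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts as listed in the variable sections of this report.
coal = imbalance_score([19, 1, 1, 1, 1])   # 석탄류(톤): "<NA>" x19 plus 4 singletons
slag = imbalance_score([20, 1, 1, 1])      # 광재(톤)
jr = imbalance_score([22, 1])              # JR컨테이너(톤)

print(f"{coal:.1%} {slag:.1%} {jr:.1%}")
```

Under this reading, the alerts mostly reflect how often the literal `<NA>` value occurs in each column rather than anything about the tonnage figures themselves.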

Reproduction

Analysis started: 2023-12-12 02:48:24.874246
Analysis finished: 2023-12-12 02:48:26.937790
Duration: 2.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
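
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path, the `cp949` encoding, the report title, and the helper name `build_report` are all assumptions, and `config.json` settings are not reproduced.

```python
def build_report(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, output_html and encoding are hypothetical defaults; adjust
    them to the file actually downloaded from the dataset URL.
    """
    # Imported lazily so this module loads even where the libraries
    # are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Container transport performance by year")
    profile.to_file(output_html)
    return profile
```

Calling `build_report("transport.csv")` (a hypothetical file name) would write `report.html` to the working directory.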

Variables

연도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2007
Minimum: 1996
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 1996
5-th percentile: 1997.1
Q1: 2001.5
Median: 2007
Q3: 2012.5
95-th percentile: 2016.9
Maximum: 2018
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.78233
Coefficient of variation (CV): 0.0033793373
Kurtosis: -1.2
Mean: 2007
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 46161
Variance: 46
Monotonicity: Strictly increasing
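
Because 연도 holds 23 distinct integers between 1996 and 2018, its values must be the consecutive years, so the descriptive statistics can be re-derived with the standard library. The sketch assumes the sample (n − 1) variance and the median-based MAD, which is what the reported figures match.

```python
import statistics

# 23 consecutive years: 1996 through 2018.
years = list(range(1996, 2019))

mean = statistics.mean(years)
variance = statistics.variance(years)      # sample variance, n - 1 denominator
std = statistics.stdev(years)              # square root of the sample variance
cv = std / mean                            # coefficient of variation
median = statistics.median(years)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median([abs(y - median) for y in years])

print(mean, variance, round(std, 5), round(cv, 10), mad, sum(years))
```

The very small CV simply reflects that the spread of 23 years is tiny relative to a mean near 2007.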
Histogram with fixed size bins (bins=23)

Common values

Value              Count  Frequency (%)
1996               1      4.3%
1997               1      4.3%
2018               1      4.3%
2017               1      4.3%
2016               1      4.3%
2015               1      4.3%
2014               1      4.3%
2013               1      4.3%
2012               1      4.3%
2011               1      4.3%
Other values (13)  13     56.5%

Minimum 10 values

Value  Count  Frequency (%)
1996   1      4.3%
1997   1      4.3%
1998   1      4.3%
1999   1      4.3%
2000   1      4.3%
2001   1      4.3%
2002   1      4.3%
2003   1      4.3%
2004   1      4.3%
2005   1      4.3%

Maximum 10 values

Value  Count  Frequency (%)
2018   1      4.3%
2017   1      4.3%
2016   1      4.3%
2015   1      4.3%
2014   1      4.3%
2013   1      4.3%
2012   1      4.3%
2011   1      4.3%
2010   1      4.3%
2009   1      4.3%

일반(톤)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 23
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9255274.9
Minimum: 5822301
Maximum: 12443420
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 5822301
5-th percentile: 6406676.6
Q1: 7922342.5
Median: 8753001
Q3: 10819512
95-th percentile: 12084244
Maximum: 12443420
Range: 6621119
Interquartile range (IQR): 2897169.5

Descriptive statistics

Standard deviation: 1909447.9
Coefficient of variation (CV): 0.20630915
Kurtosis: -0.91592793
Mean: 9255274.9
Median Absolute Deviation (MAD): 1194589
Skewness: 0.1467903
Sum: 2.1287132 × 10^8
Variance: 3.6459913 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)

Common values

Value              Count  Frequency (%)
5822301            1      4.3%
6350040            1      4.3%
8680582            1      4.3%
7817933            1      4.3%
8026752            1      4.3%
9341755            1      4.3%
10386279           1      4.3%
11852929           1      4.3%
12109946           1      4.3%
11678460           1      4.3%
Other values (13)  13     56.5%

Minimum 10 values

Value    Count  Frequency (%)
5822301  1      4.3%
6350040  1      4.3%
6916406  1      4.3%
7648361  1      4.3%
7773795  1      4.3%
7817933  1      4.3%
8026752  1      4.3%
8154003  1      4.3%
8511304  1      4.3%
8680582  1      4.3%

Maximum 10 values

Value     Count  Frequency (%)
12443420  1      4.3%
12109946  1      4.3%
11852929  1      4.3%
11728968  1      4.3%
11678460  1      4.3%
11252745  1      4.3%
10386279  1      4.3%
10034028  1      4.3%
9947590   1      4.3%
9341755   1      4.3%

중량(톤)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 9
Missing (%): 39.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 849965.86
Minimum: 245571
Maximum: 1185355
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 245571
5-th percentile: 253270.25
Q1: 770089.5
Median: 950750
Q3: 1099130.5
95-th percentile: 1155006.5
Maximum: 1185355
Range: 939784
Interquartile range (IQR): 329041
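
The quantiles above match linear interpolation between order statistics of the 14 non-missing values. The sketch below re-derives them; the helper `percentile` is hypothetical, and the input list is assembled from the minimum and maximum value tables of this variable.

```python
import math

# The 14 non-missing 중량(톤) values, taken from the value tables in this report.
weights = sorted([
    245571, 257416, 284424, 760247, 799617, 934111, 944693,
    956807, 1069251, 1096654, 1099956, 1126755, 1138665, 1185355,
])

def percentile(sorted_values, p):
    """Linearly interpolate the p-quantile (0 <= p <= 1) of a sorted list."""
    pos = p * (len(sorted_values) - 1)          # fractional index
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

p5 = percentile(weights, 0.05)
q1 = percentile(weights, 0.25)
median = percentile(weights, 0.50)
q3 = percentile(weights, 0.75)
iqr = q3 - q1

print(p5, q1, median, q3, iqr)
```

Missing values are excluded before the quantiles are taken, so these figures describe only the 14 years for which 중량(톤) was recorded.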

Descriptive statistics

Standard deviation: 341735.46
Coefficient of variation (CV): 0.40205787
Kurtosis: -0.29887476
Mean: 849965.86
Median Absolute Deviation (MAD): 163569
Skewness: -1.0877602
Sum: 11899522
Variance: 1.1678312 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common values

Value             Count  Frequency (%)
956807            1      4.3%
1069251           1      4.3%
1126755           1      4.3%
1185355           1      4.3%
799617            1      4.3%
934111            1      4.3%
1099956           1      4.3%
1138665           1      4.3%
1096654           1      4.3%
944693            1      4.3%
Other values (4)  4      17.4%
(Missing)         9      39.1%

Minimum 10 values

Value    Count  Frequency (%)
245571   1      4.3%
257416   1      4.3%
284424   1      4.3%
760247   1      4.3%
799617   1      4.3%
934111   1      4.3%
944693   1      4.3%
956807   1      4.3%
1069251  1      4.3%
1096654  1      4.3%

Maximum 10 values

Value    Count  Frequency (%)
1185355  1      4.3%
1138665  1      4.3%
1126755  1      4.3%
1099956  1      4.3%
1096654  1      4.3%
1069251  1      4.3%
956807   1      4.3%
944693   1      4.3%
934111   1      4.3%
799617   1      4.3%

석탄류(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (19), 83200 (1), 82050 (1), 60500 (1), 343568 (1)

Length

Max length: 6
Median length: 4
Mean length: 4.2173913
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count  Frequency (%)
<NA>    19     82.6%
83200   1      4.3%
82050   1      4.3%
60500   1      4.3%
343568  1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
na      19     82.6%
83200   1      4.3%
82050   1      4.3%
60500   1      4.3%
343568  1      4.3%
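
The length statistics (minimum 4, mean 4.2173913) together with "Missing: 0" suggest that `<NA>` is stored as a literal four-character string, which would explain why this tonnage column was profiled as Categorical. The sketch below checks that reading and shows one way to recover numbers; the row order and the helper `to_number` are assumptions for illustration.

```python
# 석탄류(톤) as it appears in this report: 19 literal "<NA>" strings
# followed by four numeric strings.
raw = ["<NA>"] * 19 + ["83200", "82050", "60500", "343568"]

# Mean string length, counting "<NA>" as a 4-character value.
mean_length = sum(len(v) for v in raw) / len(raw)

def to_number(value, na_token="<NA>"):
    """Map the placeholder token to None and everything else to int."""
    return None if value == na_token else int(value)

parsed = [to_number(v) for v in raw]
n_missing = sum(v is None for v in parsed)
observed = [v for v in parsed if v is not None]

print(round(mean_length, 7), n_missing, observed)
```

With the placeholder treated as missing, the column would have 19 missing values and four observations, and a profiler would be expected to classify it as numeric.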

크링카(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (19), 24200 (1), 20450 (1), 34550 (1), 14550 (1)

Length

Max length: 5
Median length: 4
Mean length: 4.173913
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   19     82.6%
24200  1      4.3%
20450  1      4.3%
34550  1      4.3%
14550  1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     19     82.6%
24200  1      4.3%
20450  1      4.3%
34550  1      4.3%
14550  1      4.3%

광재(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (20), 25350 (1), 20450 (1), 1550 (1)

Length

Max length: 5
Median length: 4
Mean length: 4.0869565
Min length: 4

Unique

Unique: 3
Unique (%): 13.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   20     87.0%
25350  1      4.3%
20450  1      4.3%
1550   1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     20     87.0%
25350  1      4.3%
20450  1      4.3%
1550   1      4.3%

기타중량품(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (19), 53300 (1), 58250 (1), 246500 (1), 27300 (1)

Length

Max length: 6
Median length: 4
Mean length: 4.2173913
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count  Frequency (%)
<NA>    19     82.6%
53300   1      4.3%
58250   1      4.3%
246500  1      4.3%
27300   1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
na      19     82.6%
53300   1      4.3%
58250   1      4.3%
246500  1      4.3%
27300   1      4.3%

페로니켈(톤)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (19), 28892 (1), 46004 (1), 47058 (1), 48051 (1)

Length

Max length: 5
Median length: 4
Mean length: 4.173913
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   19     82.6%
28892  1      4.3%
46004  1      4.3%
47058  1      4.3%
48051  1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     19     82.6%
28892  1      4.3%
46004  1      4.3%
47058  1      4.3%
48051  1      4.3%

JR컨테이너(톤)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Top values: <NA> (22), 150 (1)

Length

Max length: 4
Median length: 4
Mean length: 3.9565217
Min length: 3

Unique

Unique: 1
Unique (%): 4.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   22     95.7%
150    1      4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     22     95.7%
150    1      4.3%

Interactions

[Figures: 9 interaction plots, images not preserved]

Correlations

                연도     일반(톤)  중량(톤)  석탄류(톤)  크링카(톤)  광재(톤)  기타중량품(톤)  페로니켈(톤)
연도            1.000   0.722    0.000    1.000     1.000     1.000    1.000         1.000
일반(톤)        0.722   1.000    0.888    1.000     1.000     1.000    1.000         1.000
중량(톤)        0.000   0.888    1.000    1.000     1.000     NaN      1.000         1.000
석탄류(톤)      1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000
크링카(톤)      1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000
광재(톤)        1.000   1.000    NaN      1.000     1.000     1.000    1.000         1.000
기타중량품(톤)  1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000
페로니켈(톤)    1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000

                기타중량품(톤)  광재(톤)  JR컨테이너(톤)  석탄류(톤)  크링카(톤)  페로니켈(톤)
기타중량품(톤)  1.000          1.000    NaN            1.000     1.000     1.000
광재(톤)        1.000          1.000    NaN            1.000     1.000     1.000
JR컨테이너(톤)  NaN            NaN      1.000          NaN       NaN       NaN
석탄류(톤)      1.000          1.000    NaN            1.000     1.000     1.000
크링카(톤)      1.000          1.000    NaN            1.000     1.000     1.000
페로니켈(톤)    1.000          1.000    NaN            1.000     1.000     1.000

                연도     일반(톤)  중량(톤)  석탄류(톤)  크링카(톤)  광재(톤)  기타중량품(톤)  페로니켈(톤)  JR컨테이너(톤)
연도            1.000   0.482    -0.618   1.000     1.000     1.000    1.000         1.000        NaN
일반(톤)        0.482   1.000    0.960    1.000     1.000     1.000    1.000         1.000        NaN
중량(톤)        -0.618  0.960    1.000    1.000     1.000     1.000    1.000         1.000        NaN
석탄류(톤)      1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000        NaN
크링카(톤)      1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000        NaN
광재(톤)        1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000        NaN
기타중량품(톤)  1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000        NaN
페로니켈(톤)    1.000   1.000    1.000    1.000     1.000     1.000    1.000         1.000        NaN
JR컨테이너(톤)  NaN     NaN      NaN      NaN       NaN       NaN      NaN           NaN          1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index  연도  일반(톤)   중량(톤)  석탄류(톤)  크링카(톤)  광재(톤)  기타중량품(톤)  페로니켈(톤)  JR컨테이너(톤)
0      1996  5822301   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
1      1997  6350040   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
2      1998  6916406   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
3      1999  7648361   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
4      2000  8715518   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
5      2001  7773795   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
6      2002  8154003   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
7      2003  8753001   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
8      2004  8925206   <NA>     <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
9      2005  10034028  956807   <NA>      <NA>      <NA>     <NA>          <NA>         <NA>

Last rows

index  연도  일반(톤)   중량(톤)  석탄류(톤)  크링카(톤)  광재(톤)  기타중량품(톤)  페로니켈(톤)  JR컨테이너(톤)
13     2009  8511304   799617   <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
14     2010  9947590   934111   <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
15     2011  11678460  1099956  <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
16     2012  12109946  1138665  <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
17     2013  11852929  1096654  <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
18     2014  10386279  944693   <NA>      <NA>      <NA>     <NA>          <NA>         <NA>
19     2015  9341755   284424   83200     24200     25350    53300         28892        150
20     2016  8026752   245571   82050     20450     20450    58250         46004        <NA>
21     2017  7817933   257416   60500     34550     1550     246500        47058        <NA>
22     2018  8680582   760247   343568    14550     <NA>     27300         48051        <NA>