Overview

Dataset statistics

Number of variables: 6
Number of observations: 29
Missing cells: 29
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.5 KiB
Average record size in memory: 54.6 B
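
The missing-cell percentage follows directly from the counts in this section. A minimal sketch that recomputes it from the reported figures; the variable names are illustrative and are not part of the report or of ydata-profiling:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 6
n_observations = 29
missing_cells = 29

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```

The 29 missing cells match the two Missing alerts reported for this dataset: 3 in 임율(원) plus 26 in 비고.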

Variable types

Text: 3
DateTime: 1
Numeric: 1
Categorical: 1

Dataset

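Description: Summary table of rail freight rates, covering general freight, container freight, and charges for the use of other ancillary facilities
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15028069/fileData.do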

Alerts

임율(%) (rate, %) is highly imbalanced (67.8%) [Imbalance]
임율(원) (rate, KRW) has 3 (10.3%) missing values [Missing]
비고 (remarks) has 26 (89.7%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 05:26:40.221191
Analysis finished: 2023-12-12 05:26:41.312628
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (category)
Text

Distinct: 15
Distinct (%): 51.7%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 24
Median length: 10
Mean length: 13.448276
Min length: 4

Characters and Unicode

Total characters: 390
Distinct characters: 52
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
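
To make the category counts concrete, the sketch below classifies the characters of one 구분 fragment that appears in this report, using Python's standard `unicodedata` module. It only illustrates what the Unicode general categories mean; it is not taken from ydata-profiling's own implementation.

```python
import unicodedata
from collections import Counter

# A fragment of a 구분 value that appears in this report.
text = "하차장사용료(1㎡당)"

# General category of each code point, e.g. "Lo" = Other Letter,
# "Ps"/"Pe" = Open/Close Punctuation, "Nd" = Decimal Number, "So" = Other Symbol.
categories = Counter(unicodedata.category(ch) for ch in text)

for ch in text:
    print(repr(ch), unicodedata.category(ch), unicodedata.name(ch, "?"))
print(dict(categories))
```

The Hangul syllables fall under Other Letter, while the square-metre sign is an Other Symbol, which is why this variable reports a separate Other Symbol category.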

Unique

Unique: 11
Unique (%): 37.9%

Sample

1st row: 일반화물 임율
2nd row: 컨테이너화물
3rd row: 컨테이너화물
4th row: 컨테이너화물
5th row: 컨테이너화물

Value | Count | Frequency (%)
하차장사용료(1㎡당 | 12 | 26.7%
장기사용료(1개월마다 | 6 | 13.3%
일시사용료(1일마다 | 6 | 13.3%
컨테이너화물 | 4 | 8.9%
사용료 | 2 | 4.4%
탁송변경료-탁송취소 | 2 | 4.4%
운반료 | 1 | 2.2%
구내 | 1 | 2.2%
기관차 | 1 | 2.2%
선로유치료 | 1 | 2.2%
Other values (9) | 9 | 20.0%

Most occurring characters

ValueCountFrequency (%)
35
 
9.0%
28
 
7.2%
27
 
6.9%
( 24
 
6.2%
1 24
 
6.2%
) 24
 
6.2%
18
 
4.6%
16
 
4.1%
16
 
4.1%
13
 
3.3%
Other values (42) 165
42.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 288 | 73.8%
Open Punctuation | 24 | 6.2%
Decimal Number | 24 | 6.2%
Close Punctuation | 24 | 6.2%
Space Separator | 16 | 4.1%
Other Symbol | 12 | 3.1%
Dash Punctuation | 2 | 0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
12.2%
28
 
9.7%
27
 
9.4%
18
 
6.2%
16
 
5.6%
13
 
4.5%
12
 
4.2%
12
 
4.2%
12
 
4.2%
12
 
4.2%
Other values (36) 103
35.8%
Open Punctuation
Value | Count | Frequency (%)
( | 24 | 100.0%
Decimal Number
Value | Count | Frequency (%)
1 | 24 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 24 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 16 | 100.0%
Other Symbol
Value | Count | Frequency (%)
㎡ | 12 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 288 | 73.8%
Common | 102 | 26.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
12.2%
28
 
9.7%
27
 
9.4%
18
 
6.2%
16
 
5.6%
13
 
4.5%
12
 
4.2%
12
 
4.2%
12
 
4.2%
12
 
4.2%
Other values (36) 103
35.8%
Common
Value | Count | Frequency (%)
( | 24 | 23.5%
1 | 24 | 23.5%
) | 24 | 23.5%
(space) | 16 | 15.7%
㎡ | 12 | 11.8%
- | 2 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 288 | 73.8%
ASCII | 90 | 23.1%
CJK Compat | 12 | 3.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
35
 
12.2%
28
 
9.7%
27
 
9.4%
18
 
6.2%
16
 
5.6%
13
 
4.5%
12
 
4.2%
12
 
4.2%
12
 
4.2%
12
 
4.2%
Other values (36) 103
35.8%
ASCII
Value | Count | Frequency (%)
( | 24 | 26.7%
1 | 24 | 26.7%
) | 24 | 26.7%
(space) | 16 | 17.8%
- | 2 | 2.2%
CJK Compat
Value | Count | Frequency (%)
㎡ | 12 | 100.0%

내용 (description)
Text

Distinct: 22
Distinct (%): 75.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 16
Median length: 11
Mean length: 10.344828
Min length: 3

Characters and Unicode

Total characters: 300
Distinct characters: 65
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 51.7%

Sample

1st row: 1톤 1km마다
2nd row: 영컨테이너 20피트 1km마다
3rd row: 영컨테이너 40피트 1km마다
4th row: 영컨테이너 45피트 1km마다
5th row: 공컨테이너

Value | Count | Frequency (%)
1km마다 | 4 | 9.3%
영컨테이너 | 3 | 7.0%
1시간마다 | 3 | 7.0%
화물헛간(을지 | 2 | 4.7%
1톤 | 2 | 4.7%
화물헛간(갑지 | 2 | 4.7%
화물헛간(특지 | 2 | 4.7%
1량 | 2 | 4.7%
야적하치장(을지-철도cy포함 | 2 | 4.7%
야적하치장(갑지-철도cy포함 | 2 | 4.7%
Other values (18) | 19 | 44.2%

Most occurring characters

ValueCountFrequency (%)
1 19
 
6.3%
( 16
 
5.3%
) 16
 
5.3%
14
 
4.7%
12
 
4.0%
10
 
3.3%
8
 
2.7%
8
 
2.7%
8
 
2.7%
7
 
2.3%
Other values (55) 182
60.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 201 | 67.0%
Decimal Number | 25 | 8.3%
Open Punctuation | 16 | 5.3%
Close Punctuation | 16 | 5.3%
Space Separator | 14 | 4.7%
Uppercase Letter | 12 | 4.0%
Lowercase Letter | 10 | 3.3%
Dash Punctuation | 6 | 2.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12
 
6.0%
10
 
5.0%
8
 
4.0%
8
 
4.0%
8
 
4.0%
7
 
3.5%
7
 
3.5%
6
 
3.0%
6
 
3.0%
6
 
3.0%
Other values (42) 123
61.2%
Decimal Number
Value | Count | Frequency (%)
1 | 19 | 76.0%
4 | 2 | 8.0%
0 | 2 | 8.0%
2 | 1 | 4.0%
5 | 1 | 4.0%
Uppercase Letter
Value | Count | Frequency (%)
Y | 6 | 50.0%
C | 6 | 50.0%
Lowercase Letter
Value | Count | Frequency (%)
m | 5 | 50.0%
k | 5 | 50.0%
Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 16 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 14 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 201 | 67.0%
Common | 77 | 25.7%
Latin | 22 | 7.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12
 
6.0%
10
 
5.0%
8
 
4.0%
8
 
4.0%
8
 
4.0%
7
 
3.5%
7
 
3.5%
6
 
3.0%
6
 
3.0%
6
 
3.0%
Other values (42) 123
61.2%
Common
Value | Count | Frequency (%)
1 | 19 | 24.7%
( | 16 | 20.8%
) | 16 | 20.8%
(space) | 14 | 18.2%
- | 6 | 7.8%
4 | 2 | 2.6%
0 | 2 | 2.6%
2 | 1 | 1.3%
5 | 1 | 1.3%
Latin
Value | Count | Frequency (%)
Y | 6 | 27.3%
C | 6 | 27.3%
m | 5 | 22.7%
k | 5 | 22.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 201 | 67.0%
ASCII | 99 | 33.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 19 | 19.2%
( | 16 | 16.2%
) | 16 | 16.2%
(space) | 14 | 14.1%
Y | 6 | 6.1%
C | 6 | 6.1%
- | 6 | 6.1%
m | 5 | 5.1%
k | 5 | 5.1%
4 | 2 | 2.0%
Other values (3) | 4 | 4.0%
Hangul
ValueCountFrequency (%)
12
 
6.0%
10
 
5.0%
8
 
4.0%
8
 
4.0%
8
 
4.0%
7
 
3.5%
7
 
3.5%
6
 
3.0%
6
 
3.0%
6
 
3.0%
Other values (42) 123
61.2%

적용일 (effective date)
DateTime

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B
Minimum: 2013-10-01 00:00:00
Maximum: 2017-01-01 00:00:00
Histogram with fixed size bins (bins=2)

임율(원) (rate, KRW)
Real number (ℝ)

MISSING

Distinct: 23
Distinct (%): 88.5%
Missing: 3
Missing (%): 10.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 159830.88
Minimum: 45.9
Maximum: 3798600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.0 B

Quantile statistics

Minimum: 45.9
5-th percentile: 88.75
Q1: 253.5
Median: 818
Q3: 3318
95-th percentile: 177600
Maximum: 3798600
Range: 3798554.1
Interquartile range (IQR): 3064.5

Descriptive statistics

Standard deviation: 743497.37
Coefficient of variation (CV): 4.6517754
Kurtosis: 25.790496
Mean: 159830.88
Median Absolute Deviation (MAD): 694
Skewness: 5.0704064
Sum: 4155602.9
Variance: 5.5278833 × 10^11
Monotonicity: Not monotonic
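
Several of these statistics are simple functions of the others, which makes them easy to sanity-check. A minimal sketch that recomputes the range, IQR, coefficient of variation, variance and mean from the values reported for 임율(원); the variable names are illustrative, and the inputs are the rounded figures from this report, so results agree only up to rounding:

```python
import math

# Values copied from the statistics reported for 임율(원).
minimum, maximum = 45.9, 3798600.0
q1, q3 = 253.5, 3318.0
mean, std = 159830.88, 743497.37
total, n_missing, n_observations = 4155602.9, 3, 29

n_valid = n_observations - n_missing   # non-missing values
value_range = maximum - minimum        # Range
iqr = q3 - q1                          # Interquartile range
cv = std / mean                        # Coefficient of variation
variance = std ** 2                    # Variance
mean_check = total / n_valid           # Sum divided by non-missing count

print(f"n_valid={n_valid} range={value_range:.1f} iqr={iqr}")
print(f"cv={cv:.7f} variance={variance:.7e} mean={mean_check:.2f}")
```

The mean is more than 190 times the median (818), which is consistent with the reported skewness and with a single very large value (3798600) dominating the sum.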
Histogram with fixed size bins (bins=23)
Common values
Value | Count | Frequency (%)
124.0 | 2 | 6.9%
26700.0 | 2 | 6.9%
513.0 | 2 | 6.9%
1366.0 | 1 | 3.4%
3798600.0 | 1 | 3.4%
226400.0 | 1 | 3.4%
29200.0 | 1 | 3.4%
31200.0 | 1 | 3.4%
300.0 | 1 | 3.4%
153.0 | 1 | 3.4%
Other values (13) | 13 | 44.8%
(Missing) | 3 | 10.3%

Minimum 10 values
Value | Count | Frequency (%)
45.9 | 1 | 3.4%
77.0 | 1 | 3.4%
124.0 | 2 | 6.9%
153.0 | 1 | 3.4%
173.0 | 1 | 3.4%
238.0 | 1 | 3.4%
300.0 | 1 | 3.4%
309.0 | 1 | 3.4%
513.0 | 2 | 6.9%
516.0 | 1 | 3.4%

Maximum 10 values
Value | Count | Frequency (%)
3798600.0 | 1 | 3.4%
226400.0 | 1 | 3.4%
31200.0 | 1 | 3.4%
29200.0 | 1 | 3.4%
26700.0 | 2 | 6.9%
3521.0 | 1 | 3.4%
2709.0 | 1 | 3.4%
1892.0 | 1 | 3.4%
1647.0 | 1 | 3.4%
1366.0 | 1 | 3.4%

임율(%) (rate, %)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.7931034
Min length: 2

Unique

Unique: 3
Unique (%): 10.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 74

Common Values

Value | Count | Frequency (%)
<NA> | 26 | 89.7%
74 | 1 | 3.4%
10 | 1 | 3.4%
80 | 1 | 3.4%
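
The 67.8% imbalance score reported for this variable can be reproduced from these counts. The sketch below assumes the score is one minus the Shannon entropy of the category distribution, normalised by the maximum entropy for the number of categories. That formula is an assumption, not something stated in the report, although it does yield the reported figure:

```python
import math

# Category counts from the Common Values table for 임율(%).
counts = {"<NA>": 26, "74": 1, "10": 1, "80": 1}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy in bits, normalised by the maximum entropy log2(k)
# (assumed definition of the imbalance score).
entropy = -sum(p * math.log2(p) for p in probabilities)
max_entropy = math.log2(len(counts))
imbalance = 1 - entropy / max_entropy

print(f"Imbalance: {imbalance:.1%}")
```

A score near 100% means almost all rows share one value; here the "<NA>" category accounts for 26 of the 29 rows.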

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 26 | 89.7%
74 | 1 | 3.4%
10 | 1 | 3.4%
80 | 1 | 3.4%

비고 (remarks)
Text

MISSING

Distinct: 3
Distinct (%): 100.0%
Missing: 26
Missing (%): 89.7%
Memory size: 364.0 B

Length

Max length: 20
Median length: 9
Mean length: 12.666667
Min length: 9

Characters and Unicode

Total characters: 38
Distinct characters: 26
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 규격별 영컨테이너 임율(원)의 74%
2nd row: 전세운임의 10%
3rd row: 최저운임의 80%

Value | Count | Frequency (%)
규격별 | 1 | 12.5%
영컨테이너 | 1 | 12.5%
임율(원)의 | 1 | 12.5%
74 | 1 | 12.5%
전세운임의 | 1 | 12.5%
10 | 1 | 12.5%
최저운임의 | 1 | 12.5%
80 | 1 | 12.5%

Most occurring characters

ValueCountFrequency (%)
5
 
13.2%
3
 
7.9%
3
 
7.9%
% 3
 
7.9%
0 2
 
5.3%
2
 
5.3%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1 1
 
2.6%
Other values (16) 16
42.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 22 | 57.9%
Decimal Number | 6 | 15.8%
Space Separator | 5 | 13.2%
Other Punctuation | 3 | 7.9%
Close Punctuation | 1 | 2.6%
Open Punctuation | 1 | 2.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3
13.6%
3
13.6%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (7) 7
31.8%
Decimal Number
Value | Count | Frequency (%)
0 | 2 | 33.3%
1 | 1 | 16.7%
4 | 1 | 16.7%
7 | 1 | 16.7%
8 | 1 | 16.7%
Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
% | 3 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 22 | 57.9%
Common | 16 | 42.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3
13.6%
3
13.6%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (7) 7
31.8%
Common
Value | Count | Frequency (%)
(space) | 5 | 31.2%
% | 3 | 18.8%
0 | 2 | 12.5%
1 | 1 | 6.2%
4 | 1 | 6.2%
7 | 1 | 6.2%
) | 1 | 6.2%
( | 1 | 6.2%
8 | 1 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 22 | 57.9%
ASCII | 16 | 42.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 5 | 31.2%
% | 3 | 18.8%
0 | 2 | 12.5%
1 | 1 | 6.2%
4 | 1 | 6.2%
7 | 1 | 6.2%
) | 1 | 6.2%
( | 1 | 6.2%
8 | 1 | 6.2%
Hangul
ValueCountFrequency (%)
3
13.6%
3
13.6%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (7) 7
31.8%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 구분 | 내용 | 적용일 | 임율(원) | 임율(%) | 비고
구분 | 1.000 | 0.949 | 1.000 | 1.000 | 1.000 | 1.000
내용 | 0.949 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
적용일 | 1.000 | 1.000 | 1.000 | 0.000 | NaN | NaN
임율(원) | 1.000 | 1.000 | 0.000 | 1.000 | NaN | NaN
임율(%) | 1.000 | 1.000 | NaN | NaN | 1.000 | 1.000
비고 | 1.000 | 1.000 | NaN | NaN | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 임율(원) | 임율(%)
임율(원) | 1.000 | 0.000
임율(%) | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 구분 | 내용 | 적용일 | 임율(원) | 임율(%) | 비고
0 | 일반화물 임율 | 1톤 1km마다 | 2013-10-01 | 45.9 | <NA> | <NA>
1 | 컨테이너화물 | 영컨테이너 20피트 1km마다 | 2017-01-01 | 516.0 | <NA> | <NA>
2 | 컨테이너화물 | 영컨테이너 40피트 1km마다 | 2017-01-01 | 800.0 | <NA> | <NA>
3 | 컨테이너화물 | 영컨테이너 45피트 1km마다 | 2017-01-01 | 946.0 | <NA> | <NA>
4 | 컨테이너화물 | 공컨테이너 | 2017-01-01 | <NA> | 74 | 규격별 영컨테이너 임율(원)의 74%
5 | 하차장사용료(1㎡당) 일시사용료(1일마다) | 화물헛간(특지) | 2017-01-01 | 309.0 | <NA> | <NA>
6 | 하차장사용료(1㎡당) 일시사용료(1일마다) | 화물헛간(갑지) | 2017-01-01 | 238.0 | <NA> | <NA>
7 | 하차장사용료(1㎡당) 일시사용료(1일마다) | 화물헛간(을지) | 2017-01-01 | 124.0 | <NA> | <NA>
8 | 하차장사용료(1㎡당) 일시사용료(1일마다) | 야적하치장(특지-철도CY포함) | 2017-01-01 | 173.0 | <NA> | <NA>
9 | 하차장사용료(1㎡당) 일시사용료(1일마다) | 야적하치장(갑지-철도CY포함) | 2017-01-01 | 124.0 | <NA> | <NA>

Index | 구분 | 내용 | 적용일 | 임율(원) | 임율(%) | 비고
19 | 탁송변경료-탁송취소 | 일반화물(1량당) | 2017-01-01 | 26700.0 | <NA> | <NA>
20 | 탁송변경료-탁송취소 | 전세열차(1열차당) | 2017-01-01 | <NA> | 10 | 전세운임의 10%
21 | 탁송변경료 | 착역변경 및 발역송환(1량당) | 2017-01-01 | 26700.0 | <NA> | <NA>
22 | 대리호송인료 | 운임계산거리 1km당 | 2017-01-01 | 300.0 | <NA> | <NA>
23 | 화차계중기 사용료 | 1차마다 | 2017-01-01 | 31200.0 | <NA> | <NA>
24 | 화차전용료 | 1일 1차당 | 2017-01-01 | 29200.0 | <NA> | <NA>
25 | 선로유치료 | 1량 1시간마다 | 2017-01-01 | 513.0 | <NA> | <NA>
26 | 기관차 사용료 | 시간당 | 2017-01-01 | 226400.0 | <NA> | <NA>
27 | 구내 운반료 | 1건당 | 2017-01-01 | <NA> | 80 | 최저운임의 80%
28 | 최저운임 | 전세열차(열차당) | 2017-01-01 | 3798600.0 | <NA> | <NA>
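
Rows with a value in 임율(%) have no 임율(원) of their own; the 비고 note says which base rate the percentage applies to. For row 4 (공컨테이너, empty container) the note reads "74% of the loaded-container rate for each size". A minimal sketch of that derivation, using the loaded-container rates from rows 1 to 3; the dictionary keys, the constant name and the rounding to two decimals are illustrative assumptions, not rules stated in the dataset:

```python
# Loaded-container rates per km (임율(원)) from rows 1-3 of the sample.
loaded_rate_per_km = {"20ft": 516.0, "40ft": 800.0, "45ft": 946.0}

# Row 4: 공컨테이너 is charged at 74% of the loaded rate for the same size.
EMPTY_CONTAINER_PCT = 74

# Rounding to two decimals is an assumption made for display only.
empty_rate_per_km = {
    size: round(rate * EMPTY_CONTAINER_PCT / 100, 2)
    for size, rate in loaded_rate_per_km.items()
}

for size, rate in empty_rate_per_km.items():
    print(f"{size}: {rate} KRW per km")
```

The other two percentage rows (10% of the charter fare, 80% of the minimum fare) could be handled the same way once the applicable base rate is identified.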