Overview

Dataset statistics

Number of variables: 13
Number of observations: 1621
Missing cells: 1580
Missing cells (%): 7.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 171.1 KiB
Average record size in memory: 108.1 B
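
As a quick sanity check on the figures above, the missing-cell percentage can be recomputed from the counts. The sketch below uses only numbers that appear in this report; the per-column breakdown is taken from the Missing entries of the individual variables.

```python
# Figures copied from the dataset statistics and the per-variable "Missing" entries.
n_variables = 13
n_observations = 1621
missing_cells = 1580
missing_by_column = {"Unnamed: 9": 1575, "20": 1, "13.1": 4}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)                        # 21073
print(round(missing_pct, 1))              # 7.5
print(sum(missing_by_column.values()))    # 1580
```

Almost all missing cells come from a single column, so the 7.5% overall figure says little about the other twelve variables.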

Variable types

Categorical: 5
Numeric: 3
Text: 5

Dataset

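Description: WTO bound (concession) tariff rates and basic tariff rates for Korean agricultural and livestock products
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220217000000002067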

Alerts

2014 has constant value "" (Constant)
1 is highly overall correlated with Unnamed: 10 (High correlation)
0 is highly overall correlated with Unnamed: 8 and 1 other fields (High correlation)
Unnamed: 8 is highly overall correlated with 0 and 2 other fields (High correlation)
Unnamed: 10 is highly overall correlated with 1 and 2 other fields (High correlation)
Unnamed: 11 is highly overall correlated with Unnamed: 8 and 1 other fields (High correlation)
Unnamed: 12 is highly overall correlated with Unnamed: 11 (High correlation)
Unnamed: 8 is highly imbalanced (71.2%) (Imbalance)
Unnamed: 10 is highly imbalanced (91.7%) (Imbalance)
Unnamed: 11 is highly imbalanced (61.7%) (Imbalance)
Unnamed: 12 is highly imbalanced (65.8%) (Imbalance)
Unnamed: 9 has 1575 (97.2%) missing values (Missing)
1 has unique values (Unique)
0101.21.1000 has unique values (Unique)
0 has 106 (6.5%) zeros (Zeros)
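
The variable names in these alerts (2014, 1, 0101.21.1000, 0, Unnamed: 8 and so on) look like the values of a first data record, not column labels, which suggests the source file has no header row and its first record was consumed as the header. The sketch below shows one way to read such a file with explicit names. The two records are taken from the sample rows of this report; the comma-separated layout and the column names in `COLUMNS` are hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical column names; the source file does not supply any.
COLUMNS = ["year", "seq", "hs_code", "name_ko", "name_en"] + [
    f"col_{i}" for i in range(5, 13)
]

# In-memory stand-in for the downloaded file (assumed to be comma-separated).
raw = io.StringIO(
    "2014,1,0101.21.1000,말(번식용/농가사육용),"
    "Horses: Pure-bred breeding anmials(For farm breeding),0,20,13.1,,,,,\n"
    "2014,2,0101.21.9000,말(번식용/기타),"
    "Horses: Pure-bred breeding anmials(Other),8.0,20,13.1,,,,,\n"
)

# csv.reader treats every line as data, so the first record is not lost.
rows = [dict(zip(COLUMNS, record)) for record in csv.reader(raw)]

print(len(rows))             # 2
print(rows[0]["hs_code"])    # 0101.21.1000
```

Read this way, the dataset would have 1622 records, not 1621, and the profile would show meaningful column names.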

Reproduction

Analysis started: 2023-12-11 03:44:13.747963
Analysis finished: 2023-12-11 03:44:16.595637
Duration: 2.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

2014
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
2014
1621

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2014
2nd row: 2014
3rd row: 2014
4th row: 2014
5th row: 2014

Common Values

ValueCountFrequency (%)
2014 1621
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2014 1621
100.0%

1
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1621
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 812
Minimum: 2
Maximum: 1622
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 83
Q1: 407
median: 812
Q3: 1217
95-th percentile: 1541
Maximum: 1622
Range: 1620
Interquartile range (IQR): 810

Descriptive statistics

Standard deviation: 468.08671
Coefficient of variation (CV): 0.57646146
Kurtosis: -1.2
Mean: 812
Median Absolute Deviation (MAD): 405
Skewness: 0
Sum: 1316252
Variance: 219105.17
Monotonicity: Strictly increasing
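
These statistics are what a plain row counter would produce. The sketch below assumes the column holds the consecutive integers 2 to 1622 (an inference from Minimum 2, Maximum 1622, 1621 distinct values and strict monotonicity, not something the report states) and recomputes the figures with the standard library.

```python
import statistics

# Assumption: the column is the run of consecutive integers 2..1622.
values = list(range(2, 1623))

print(len(values))                             # 1621
print(sum(values))                             # 1316252
print(statistics.mean(values))                 # 812
print(round(statistics.variance(values), 2))   # 219105.17 (sample variance)
print(round(statistics.stdev(values), 5))      # 468.08671
```

If the assumption holds, the column carries no information beyond row order and could be dropped or used as an index.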
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2 1
 
0.1%
1080 1
 
0.1%
1090 1
 
0.1%
1089 1
 
0.1%
1088 1
 
0.1%
1087 1
 
0.1%
1086 1
 
0.1%
1085 1
 
0.1%
1084 1
 
0.1%
1083 1
 
0.1%
Other values (1611) 1611
99.4%
ValueCountFrequency (%)
2 1
0.1%
3 1
0.1%
4 1
0.1%
5 1
0.1%
6 1
0.1%
7 1
0.1%
8 1
0.1%
9 1
0.1%
10 1
0.1%
11 1
0.1%
ValueCountFrequency (%)
1622 1
0.1%
1621 1
0.1%
1620 1
0.1%
1619 1
0.1%
1618 1
0.1%
1617 1
0.1%
1616 1
0.1%
1615 1
0.1%
1614 1
0.1%
1613 1
0.1%

0101.21.1000
Text

UNIQUE 

Distinct: 1621
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 19452
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
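
To illustrate what the category counts mean, the sketch below tallies Unicode general categories for one code from this column using Python's standard `unicodedata` module. The helper name `category_counts` is illustrative; `Nd` and `Po` are the standard category codes for "Decimal Number" and "Other Punctuation".

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("0101.21.1000")
print(counts["Nd"])   # 10 digits  -> "Decimal Number"
print(counts["Po"])   # 2 periods  -> "Other Punctuation"
```

With 1621 values of this shape, the per-value counts scale to 16210 digits and 3242 periods, for 19452 characters in total.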

Unique

Unique: 1621
Unique (%): 100.0%

Sample

1st row: 0101.21.9000
2nd row: 0101.29.1000
3rd row: 0101.29.9000
4th row: 0101.30.1000
5th row: 0101.30.9000
ValueCountFrequency (%)
0101.21.9000 1
 
0.1%
1702.30.2000 1
 
0.1%
1802.00.1000 1
 
0.1%
1801.00.2000 1
 
0.1%
1801.00.1000 1
 
0.1%
1704.90.9000 1
 
0.1%
1704.90.2090 1
 
0.1%
1704.90.2020 1
 
0.1%
1704.90.2010 1
 
0.1%
1704.90.1000 1
 
0.1%
Other values (1611) 1611
99.4%

Most occurring characters

ValueCountFrequency (%)
0 8064
41.5%
. 3242
16.7%
1 2514
 
12.9%
2 1622
 
8.3%
9 1508
 
7.8%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16210
83.3%
Other Punctuation 3242
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8064
49.7%
1 2514
 
15.5%
2 1622
 
10.0%
9 1508
 
9.3%
3 624
 
3.8%
4 485
 
3.0%
5 466
 
2.9%
6 342
 
2.1%
7 330
 
2.0%
8 255
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 3242
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19452
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8064
41.5%
. 3242
16.7%
1 2514
 
12.9%
2 1622
 
8.3%
9 1508
 
7.8%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19452
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8064
41.5%
. 3242
16.7%
1 2514
 
12.9%
2 1622
 
8.3%
9 1508
 
7.8%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

말(번식용/농가사육용)
Text

Distinct: 1600
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 56
Median length: 48
Mean length: 12.58174
Min length: 1

Characters and Unicode

Total characters: 20395
Distinct characters: 596
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1586
Unique (%): 97.8%

Sample

1st row: 말(번식용/기타)
2nd row: 말(기타/경주말)
3rd row: 말(기타/기타)
4th row: 당나귀(번식용)
5th row: 당나귀(기타)
ValueCountFrequency (%)
207
 
5.8%
또는 95
 
2.6%
기타 94
 
2.6%
63
 
1.8%
34
 
0.9%
종자 29
 
0.8%
안한 28
 
0.8%
분획물 22
 
0.6%
도메스티쿠스종에 21
 
0.6%
함유한 19
 
0.5%
Other values (2066) 2986
83.0%

Most occurring characters

ValueCountFrequency (%)
2005
 
9.8%
( 1198
 
5.9%
) 1196
 
5.9%
707
 
3.5%
623
 
3.1%
/ 504
 
2.5%
310
 
1.5%
306
 
1.5%
303
 
1.5%
302
 
1.5%
Other values (586) 12941
63.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 14826
72.7%
Space Separator 2005
 
9.8%
Open Punctuation 1198
 
5.9%
Close Punctuation 1196
 
5.9%
Other Punctuation 694
 
3.4%
Decimal Number 343
 
1.7%
Dash Punctuation 74
 
0.4%
Lowercase Letter 57
 
0.3%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
707
 
4.8%
623
 
4.2%
310
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
243
 
1.6%
Other values (560) 11236
75.8%
Decimal Number
ValueCountFrequency (%)
0 102
29.7%
1 63
18.4%
2 41
12.0%
5 35
 
10.2%
6 22
 
6.4%
8 20
 
5.8%
4 20
 
5.8%
9 18
 
5.2%
3 17
 
5.0%
7 5
 
1.5%
Other Punctuation
ValueCountFrequency (%)
/ 504
72.6%
, 117
 
16.9%
· 31
 
4.5%
. 29
 
4.2%
% 11
 
1.6%
: 1
 
0.1%
? 1
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
g 25
43.9%
k 16
28.1%
m 16
28.1%
Math Symbol
ValueCountFrequency (%)
< 1
50.0%
> 1
50.0%
Space Separator
ValueCountFrequency (%)
2005
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1198
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1196
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 74
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 14816
72.6%
Common 5512
 
27.0%
Latin 57
 
0.3%
Han 10
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
707
 
4.8%
623
 
4.2%
310
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
243
 
1.6%
Other values (550) 11226
75.8%
Common
ValueCountFrequency (%)
2005
36.4%
( 1198
21.7%
) 1196
21.7%
/ 504
 
9.1%
, 117
 
2.1%
0 102
 
1.9%
- 74
 
1.3%
1 63
 
1.1%
2 41
 
0.7%
5 35
 
0.6%
Other values (13) 177
 
3.2%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Latin
ValueCountFrequency (%)
g 25
43.9%
k 16
28.1%
m 16
28.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 14816
72.6%
ASCII 5538
 
27.2%
None 31
 
0.2%
CJK 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2005
36.2%
( 1198
21.6%
) 1196
21.6%
/ 504
 
9.1%
, 117
 
2.1%
0 102
 
1.8%
- 74
 
1.3%
1 63
 
1.1%
2 41
 
0.7%
5 35
 
0.6%
Other values (15) 203
 
3.7%
Hangul
ValueCountFrequency (%)
707
 
4.8%
623
 
4.2%
310
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
243
 
1.6%
Other values (550) 11226
75.8%
None
ValueCountFrequency (%)
· 31
100.0%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

Horses: Pure-bred breeding anmials(For farm breeding)
Text

Distinct: 1590
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 150
Median length: 82
Mean length: 31.969155
Min length: 4

Characters and Unicode

Total characters: 51822
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1576
Unique (%): 97.2%

Sample

1st row: Horses: Pure-bred breeding anmials(Other)
2nd row: Horses: Other(Horses for racing)
3rd row: Horses: Other(Other)
4th row: Asses: Pure-bred breeding animals
5th row: Asses: Other
ValueCountFrequency (%)
or 367
 
5.5%
of 342
 
5.1%
other 311
 
4.6%
and 242
 
3.6%
meat 118
 
1.8%
chilled 80
 
1.2%
preserved 73
 
1.1%
the 63
 
0.9%
offal 57
 
0.8%
by 56
 
0.8%
Other values (1890) 5008
74.6%

Most occurring characters

ValueCountFrequency (%)
e 5735
 
11.1%
5104
 
9.8%
r 3916
 
7.6%
o 3192
 
6.2%
a 3131
 
6.0%
s 2962
 
5.7%
t 2938
 
5.7%
i 2603
 
5.0%
n 2417
 
4.7%
d 1867
 
3.6%
Other values (65) 17957
34.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40902
78.9%
Space Separator 5104
 
9.8%
Uppercase Letter 2545
 
4.9%
Open Punctuation 1089
 
2.1%
Close Punctuation 1088
 
2.1%
Other Punctuation 736
 
1.4%
Decimal Number 249
 
0.5%
Dash Punctuation 104
 
0.2%
Other Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5735
14.0%
r 3916
 
9.6%
o 3192
 
7.8%
a 3131
 
7.7%
s 2962
 
7.2%
t 2938
 
7.2%
i 2603
 
6.4%
n 2417
 
5.9%
d 1867
 
4.6%
l 1776
 
4.3%
Other values (16) 10365
25.3%
Uppercase Letter
ValueCountFrequency (%)
O 717
28.2%
C 239
 
9.4%
S 212
 
8.3%
M 170
 
6.7%
P 168
 
6.6%
F 126
 
5.0%
B 121
 
4.8%
R 110
 
4.3%
G 106
 
4.2%
L 89
 
3.5%
Other values (16) 487
19.1%
Decimal Number
ValueCountFrequency (%)
0 75
30.1%
5 40
16.1%
1 39
15.7%
2 30
 
12.0%
4 18
 
7.2%
9 15
 
6.0%
8 13
 
5.2%
3 10
 
4.0%
6 6
 
2.4%
7 3
 
1.2%
Other Punctuation
ValueCountFrequency (%)
/ 334
45.4%
, 223
30.3%
. 79
 
10.7%
: 40
 
5.4%
% 37
 
5.0%
' 16
 
2.2%
; 7
 
1.0%
Other Symbol
ValueCountFrequency (%)
° 4
80.0%
1
 
20.0%
Space Separator
ValueCountFrequency (%)
5104
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1089
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1088
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 104
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43447
83.8%
Common 8375
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5735
13.2%
r 3916
 
9.0%
o 3192
 
7.3%
a 3131
 
7.2%
s 2962
 
6.8%
t 2938
 
6.8%
i 2603
 
6.0%
n 2417
 
5.6%
d 1867
 
4.3%
l 1776
 
4.1%
Other values (42) 12910
29.7%
Common
ValueCountFrequency (%)
5104
60.9%
( 1089
 
13.0%
) 1088
 
13.0%
/ 334
 
4.0%
, 223
 
2.7%
- 104
 
1.2%
. 79
 
0.9%
0 75
 
0.9%
5 40
 
0.5%
: 40
 
0.5%
Other values (13) 199
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51817
> 99.9%
None 4
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5735
 
11.1%
5104
 
9.9%
r 3916
 
7.6%
o 3192
 
6.2%
a 3131
 
6.0%
s 2962
 
5.7%
t 2938
 
5.7%
i 2603
 
5.0%
n 2417
 
4.7%
d 1867
 
3.6%
Other values (63) 17952
34.6%
None
ValueCountFrequency (%)
° 4
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%

0
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 27
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.595928
Minimum: 0
Maximum: 50
Zeros: 106
Zeros (%): 6.5%
Negative: 0
Negative (%): 0.0%
Memory size: 14.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
median: 8
Q3: 27
95-th percentile: 45
Maximum: 50
Range: 50
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.686267
Coefficient of variation (CV): 0.82467617
Kurtosis: -0.51322946
Mean: 16.595928
Median Absolute Deviation (MAD): 8
Skewness: 0.74881973
Sum: 26902
Variance: 187.3139
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
8.0 456
28.1%
30.0 203
12.5%
5.0 142
 
8.8%
20.0 128
 
7.9%
3.0 122
 
7.5%
0.0 106
 
6.5%
27.0 102
 
6.3%
45.0 58
 
3.6%
18.0 54
 
3.3%
40.0 46
 
2.8%
Other values (17) 204
12.6%
ValueCountFrequency (%)
0.0 106
6.5%
1.0 2
 
0.1%
1.8 4
 
0.2%
2.0 28
 
1.7%
3.0 122
7.5%
4.0 1
 
0.1%
4.2 3
 
0.2%
5.0 142
8.8%
5.4 6
 
0.4%
7.0 1
 
0.1%
ValueCountFrequency (%)
50.0 46
 
2.8%
45.0 58
 
3.6%
40.0 46
 
2.8%
36.0 37
 
2.3%
32.8 1
 
0.1%
30.0 203
12.5%
27.0 102
6.3%
25.0 20
 
1.2%
24.0 2
 
0.1%
22.5 38
 
2.3%

20
Text

Distinct: 85
Distinct (%): 5.2%
Missing: 1
Missing (%): 0.1%
Memory size: 12.8 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.5006173
Min length: 1

Characters and Unicode

Total characters: 4051
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 1.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20
ValueCountFrequency (%)
30 283
17.5%
20 235
14.5%
10 143
 
8.8%
60 98
 
6.0%
40 74
 
4.6%
50 66
 
4.1%
100 59
 
3.6%
25 52
 
3.2%
59.2 48
 
3.0%
35 40
 
2.5%
Other values (75) 522
32.2%

Most occurring characters

ValueCountFrequency (%)
0 1118
27.6%
2 494
12.2%
3 460
11.4%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3716
91.7%
Other Punctuation 319
 
7.9%
Dash Punctuation 16
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1118
30.1%
2 494
13.3%
3 460
12.4%
5 399
 
10.7%
1 339
 
9.1%
4 256
 
6.9%
6 197
 
5.3%
9 182
 
4.9%
7 144
 
3.9%
8 127
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 319
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4051
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1118
27.6%
2 494
12.2%
3 460
11.4%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4051
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1118
27.6%
2 494
12.2%
3 460
11.4%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%
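
This column is profiled as Text even though nearly all of its characters are digits and decimal points; the 16 dash characters are the likely reason. A minimal sketch of a numeric conversion follows, assuming (this is not confirmed by the report) that a lone "-" marks "no rate". The function name `parse_rate` is illustrative.

```python
def parse_rate(cell):
    """Convert a rate cell to float; a lone '-' or an empty cell counts as missing.

    Any other non-numeric form raises ValueError so that it gets noticed.
    """
    text = cell.strip()
    if text in ("", "-"):
        return None
    return float(text)


print(parse_rate("59.2"))   # 59.2
print(parse_rate("-"))      # None
```

Letting unexpected forms raise an error is a deliberate choice here: if a dash turns out to be part of a range or a sign, the failure points to the offending cell instead of silently dropping it.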

13.1
Text

Distinct: 101
Distinct (%): 6.2%
Missing: 4
Missing (%): 0.2%
Memory size: 12.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 2.7340754
Min length: 1

Characters and Unicode

Total characters: 4421
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 1.2%

Sample

1st row: 13.1
2nd row: 13.1
3rd row: 13.1
4th row: 13.1
5th row: 13.1
ValueCountFrequency (%)
27 168
 
10.4%
18 125
 
7.7%
19.7 123
 
7.6%
54 121
 
7.5%
13.1 109
 
6.7%
45 93
 
5.8%
36 71
 
4.4%
6.6 57
 
3.5%
22.5 55
 
3.4%
30 54
 
3.3%
Other values (91) 641
39.6%

Most occurring characters

ValueCountFrequency (%)
. 632
14.3%
1 607
13.7%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 391
8.8%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3773
85.3%
Other Punctuation 632
 
14.3%
Dash Punctuation 16
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 607
16.1%
2 512
13.6%
5 501
13.3%
7 398
10.5%
3 391
10.4%
4 384
10.2%
6 285
7.6%
9 233
 
6.2%
0 232
 
6.1%
8 230
 
6.1%
Other Punctuation
ValueCountFrequency (%)
. 632
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4421
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 632
14.3%
1 607
13.7%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 391
8.8%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4421
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 632
14.3%
1 607
13.7%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 391
8.8%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

Unnamed: 8
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 17
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>
1349 
20
 
73
-
 
45
5
 
43
8
 
24
Other values (12)
 
87

Length

Max length: 4
Median length: 4
Mean length: 3.5823566
Min length: 1

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1349
83.2%
20 73
 
4.5%
- 45
 
2.8%
5 43
 
2.7%
8 24
 
1.5%
50 20
 
1.2%
40 16
 
1.0%
3 14
 
0.9%
30 13
 
0.8%
0 11
 
0.7%
Other values (7) 13
 
0.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 1349
83.2%
20 73
 
4.5%
45
 
2.8%
5 43
 
2.7%
8 24
 
1.5%
50 20
 
1.2%
40 16
 
1.0%
3 14
 
0.9%
30 13
 
0.8%
0 11
 
0.7%
Other values (7) 13
 
0.8%

Unnamed: 9
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 13.0%
Missing: 1575
Missing (%): 97.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.543478
Minimum: 5
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.4 KiB

Quantile statistics

Minimum: 5
5-th percentile: 15
Q1: 15
median: 15
Q3: 20
95-th percentile: 40
Maximum: 40
Range: 35
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 9.2031553
Coefficient of variation (CV): 0.44798428
Kurtosis: 0.10630103
Mean: 20.543478
Median Absolute Deviation (MAD): 5
Skewness: 1.017925
Sum: 945
Variance: 84.698068
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
15 22
 
1.4%
20 12
 
0.7%
35 5
 
0.3%
40 4
 
0.2%
5 2
 
0.1%
30 1
 
0.1%
(Missing) 1575
97.2%
ValueCountFrequency (%)
5 2
 
0.1%
15 22
1.4%
20 12
0.7%
30 1
 
0.1%
35 5
 
0.3%
40 4
 
0.2%
ValueCountFrequency (%)
40 4
 
0.2%
35 5
 
0.3%
30 1
 
0.1%
20 12
0.7%
15 22
1.4%
5 2
 
0.1%

Unnamed: 10
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 15
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>
1557 
할당0 (12월말)
 
41
조정45%
 
5
할당25 (6월말)
 
4
할당1 (6월말)
 
3
Other values (10)
 
11

Length

Max length: 15
Median length: 4
Mean length: 4.2264035
Min length: 4

Unique

Unique: 9
Unique (%): 0.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1557
96.1%
할당0 (12월말) 41
 
2.5%
조정45% 5
 
0.3%
할당25 (6월말) 4
 
0.2%
할당1 (6월말) 3
 
0.2%
할당5 (6월말) 2
 
0.1%
조정40%,1625원/kg 1
 
0.1%
조정40%,1,625원/kg 1
 
0.1%
할당10/0 (12월말) 1
 
0.1%
할당4 (12월말) 1
 
0.1%
Other values (5) 5
 
0.3%
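
The non-missing values in this column are short free-text notes, and two of them ("조정40%,1625원/kg" and "조정40%,1,625원/kg") differ only in the thousands separator. The sketch below shows one possible way to normalise such notes. The reading of the prefixes (할당 as quota tariff, 조정 as adjustment tariff) is an assumption, and the regular expressions were written only against the values listed above, so other forms may not parse.

```python
import re

# Assumed readings of the Korean prefixes.
KINDS = {"할당": "quota", "조정": "adjustment"}
RATE = re.compile(r"^(할당|조정)\s*([\d./]+)")
SPECIFIC = re.compile(r"([\d,]+)원/kg")   # specific duty in won per kg


def parse_note(note):
    """Return (kind, rate text, won-per-kg duty or None), or None if unrecognised."""
    head = RATE.match(note)
    if head is None:
        return None
    duty = SPECIFIC.search(note)
    won_per_kg = int(duty.group(1).replace(",", "")) if duty else None
    return KINDS[head.group(1)], head.group(2), won_per_kg


print(parse_note("할당0 (12월말)"))       # ('quota', '0', None)
print(parse_note("조정40%,1,625원/kg"))   # ('adjustment', '40', 1625)
print(parse_note("조정40%,1625원/kg"))    # ('adjustment', '40', 1625)
```

After normalisation the two 1625 won/kg spellings collapse into one value, which would reduce the distinct count reported for this column.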

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 1557
92.8%
12월말 44
 
2.6%
할당0 41
 
2.4%
6월말 10
 
0.6%
조정45 5
 
0.3%
할당25 4
 
0.2%
할당1 3
 
0.2%
할당5 3
 
0.2%
할당4 2
 
0.1%
206원/kg 1
 
0.1%
Other values (7) 7
 
0.4%

Unnamed: 11
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>
1330 
TC
 
72
TM
 
51
BM
 
51
BC
 
49
Other values (3)
 
68

Length

Max length: 4
Median length: 4
Mean length: 3.6409624
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1330
82.0%
TC 72
 
4.4%
TM 51
 
3.1%
BM 51
 
3.1%
BC 49
 
3.0%
BX 40
 
2.5%
ST 16
 
1.0%
TX 12
 
0.7%
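
The 61.7% imbalance reported for this variable in the alerts can be reproduced from the counts above with an entropy-based score: one minus the Shannon entropy of the category distribution divided by its maximum, log2 of the number of categories. This is a sketch of one common definition that happens to match the reported figure, not a statement of how the profiling tool computes it.

```python
import math

# Value counts as listed above; "<NA>" is counted as a category of its own.
counts = {"<NA>": 1330, "TC": 72, "TM": 51, "BM": 51,
          "BC": 49, "BX": 40, "ST": 16, "TX": 12}


def imbalance(counts):
    """1 - H / H_max, where H is the Shannon entropy (base 2) of the counts."""
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1 - entropy / math.log2(len(counts))


print(f"{imbalance(counts):.1%}")   # 61.7%
```

A score of 0 corresponds to perfectly even categories and 1 to a single category holding every row; here the dominant "<NA>" value drives the score up.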

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 1330
82.0%
tc 72
 
4.4%
tm 51
 
3.1%
bm 51
 
3.1%
bc 49
 
3.0%
bx 40
 
2.5%
st 16
 
1.0%
tx 12
 
0.7%

Unnamed: 12
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>
1330 
95.1
197 
97.7
 
39
96.7
 
22
-
 
16
Other values (2)
 
17

Length

Max length: 4
Median length: 4
Mean length: 3.961752
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1330
82.0%
95.1 197
 
12.2%
97.7 39
 
2.4%
96.7 22
 
1.4%
- 16
 
1.0%
1.1 14
 
0.9%
96.1 3
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 1330
82.0%
95.1 197
 
12.2%
97.7 39
 
2.4%
96.7 22
 
1.4%
16
 
1.0%
1.1 14
 
0.9%
96.1 3
 
0.2%

Interactions

[Figures: nine interaction plots; images not included]

Correlations

Variable | 1 | 0 | 20 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
1 | 1.000 | 0.765 | 0.868 | 0.783 | 0.912 | 0.844 | 0.672 | 0.680
0 | 0.765 | 1.000 | 0.912 | 0.893 | 0.920 | 0.892 | 0.677 | 0.777
20 | 0.868 | 0.912 | 1.000 | 0.984 | 0.832 | 0.856 | 0.980 | 0.974
Unnamed: 8 | 0.783 | 0.893 | 0.984 | 1.000 | NaN | 0.758 | 0.792 | 0.725
Unnamed: 9 | 0.912 | 0.920 | 0.832 | NaN | 1.000 | NaN | NaN | NaN
Unnamed: 10 | 0.844 | 0.892 | 0.856 | 0.758 | NaN | 1.000 | 0.266 | 0.000
Unnamed: 11 | 0.672 | 0.677 | 0.980 | 0.792 | NaN | 0.266 | 1.000 | 0.730
Unnamed: 12 | 0.680 | 0.777 | 0.974 | 0.725 | NaN | 0.000 | 0.730 | 1.000

Variable | Unnamed: 11 | Unnamed: 8 | Unnamed: 12 | Unnamed: 10
Unnamed: 11 | 1.000 | 0.511 | 0.545 | 0.240
Unnamed: 8 | 0.511 | 1.000 | 0.465 | 0.567
Unnamed: 12 | 0.545 | 0.465 | 1.000 | 0.000
Unnamed: 10 | 0.240 | 0.567 | 0.000 | 1.000

Variable | 1 | 0 | Unnamed: 9 | Unnamed: 8 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
1 | 1.000 | -0.219 | -0.174 | 0.446 | 0.540 | 0.421 | 0.441
0 | -0.219 | 1.000 | -0.052 | 0.674 | 0.539 | 0.377 | 0.473
Unnamed: 9 | -0.174 | -0.052 | 1.000 | 0.000 | NaN | 0.000 | 0.000
Unnamed: 8 | 0.446 | 0.674 | 0.000 | 1.000 | 0.567 | 0.511 | 0.465
Unnamed: 10 | 0.540 | 0.539 | NaN | 0.567 | 1.000 | 0.240 | 0.000
Unnamed: 11 | 0.421 | 0.377 | 0.000 | 0.511 | 0.240 | 1.000 | 0.545
Unnamed: 12 | 0.441 | 0.473 | 0.000 | 0.465 | 0.000 | 0.545 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

(index) | 2014 | 1 | 0101.21.1000 | 말(번식용/농가사육용) | Horses: Pure-bred breeding anmials(For farm breeding) | 0 | 20 | 13.1 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
0 | 2014 | 2 | 0101.21.9000 | 말(번식용/기타) | Horses: Pure-bred breeding anmials(Other) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 2014 | 3 | 0101.29.1000 | 말(기타/경주말) | Horses: Other(Horses for racing) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 2014 | 4 | 0101.29.9000 | 말(기타/기타) | Horses: Other(Other) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 2014 | 5 | 0101.30.1000 | 당나귀(번식용) | Asses: Pure-bred breeding animals | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 2014 | 6 | 0101.30.9000 | 당나귀(기타) | Asses: Other | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 2014 | 7 | 0101.90.0000 | 기타(기타) | Other | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 2014 | 8 | 0102.21.1000 | 축우(번식용/젖소) | Cattle: Pure-bred breeding animals(For milk) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1
7 | 2014 | 9 | 0102.21.2000 | 축우(번식용/육우) | Cattle: Pure-bred breeding animals(For meat) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1
8 | 2014 | 10 | 0102.21.9000 | 축우(번식용/기타) | Cattle: Pure-bred breeding animals(Other) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1
9 | 2014 | 11 | 0102.29.1000 | 축우(기타/젖소) | Cattle: Other(For milk) | 20.0 | 44.5 | 40 | - | <NA> | <NA> | BX | 1.1

Last rows

(index) | 2014 | 1 | 0101.21.1000 | 말(번식용/농가사육용) | Horses: Pure-bred breeding anmials(For farm breeding) | 0 | 20 | 13.1 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
1611 | 2014 | 1613 | 5203.00.0000 | 면(카드 또는 코움한 것) | Cotton, carded or combed | 0.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1612 | 2014 | 1614 | 5301.10.0000 | 생아마 또는 침지아마 | Flax, raw or retted | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1613 | 2014 | 1615 | 5301.21.0000 | 아마(쇄경 또는 타마한 것) | Flax(broken or scutched) | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1614 | 2014 | 1616 | 5301.29.0000 | 아마(기타) | Other flax | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1615 | 2014 | 1617 | 5301.30.1000 | 아마의 토우 | Flax tow | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1616 | 2014 | 1618 | 5301.30.2000 | 아마의 웨이스트 | Flax waste | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1617 | 2014 | 1619 | 5302.10.0000 | 생대마 또는 침지대마 | True hemp, raw or retted | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1618 | 2014 | 1620 | 5302.90.1000 | 쇄경, 탐, 핵클 또는 기타의 방법으로 가공한 대마 | True hemp, broken, scutched, hackled or other wise processed | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1619 | 2014 | 1621 | 5302.90.2010 | 대마의 토우 | Tow of true hemp | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1620 | 2014 | 1622 | 5302.90.2020 | 대마의 웨이스트 | Waste of true hemp | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>