Overview

Dataset statistics

Number of variables: 12
Number of observations: 1746
Missing cells: 2501
Missing cells (%): 11.9%
Duplicate rows: 6
Duplicate rows (%): 0.3%
Total size in memory: 168.9 KiB
Average record size in memory: 99.1 B
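
The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, using only the figures reported in this block (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 12
n_observations = 1746
missing_cells = 2501
duplicate_rows = 6
total_size_kib = 168.9

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")        # 11.9%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")     # 0.3%
print(f"Average record size: {avg_record_bytes:.1f} B")  # 99.1 B
```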

Variable types

Numeric: 3
Text: 5
Categorical: 4

Dataset

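Description: WTO concession tariff rates and basic tariff rates for South Korea's agricultural and livestock products
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220217000000002067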

Alerts

Dataset has 6 (0.3%) duplicate rows [Duplicates]
일련 번호 is highly overall correlated with 탄력관세 [High correlation]
기본세율 is highly overall correlated with 시장접근세율 and 1 other field [High correlation]
시장접근세율 is highly overall correlated with 기본세율 and 2 other fields [High correlation]
탄력관세 is highly overall correlated with 일련 번호 and 2 other fields [High correlation]
개방구분 is highly overall correlated with 시장접근세율 and 1 other field [High correlation]
개방연도 is highly overall correlated with 개방구분 [High correlation]
시장접근세율 is highly imbalanced (72.8%) [Imbalance]
탄력관세 is highly imbalanced (86.8%) [Imbalance]
개방구분 is highly imbalanced (63.7%) [Imbalance]
개방연도 is highly imbalanced (67.6%) [Imbalance]
일련 번호 has 124 (7.1%) missing values [Missing]
H S K has 124 (7.1%) missing values [Missing]
한 글 품 명 has 65 (3.7%) missing values [Missing]
영 문 품 명 has 111 (6.4%) missing values [Missing]
기본세율 has 124 (7.1%) missing values [Missing]
양허기준세율 has 125 (7.2%) missing values [Missing]
2014양허관세 has 128 (7.3%) missing values [Missing]
국제협력관세 has 1700 (97.4%) missing values [Missing]
기본세율 has 107 (6.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-11 03:44:34.432352
Analysis finished: 2023-12-11 03:44:37.281181
Duration: 2.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일련 번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 1622
Distinct (%): 100.0%
Missing: 124
Missing (%): 7.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 811.5
Minimum: 1
Maximum: 1622
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 82.05
Q1: 406.25
Median: 811.5
Q3: 1216.75
95-th percentile: 1540.95
Maximum: 1622
Range: 1621
Interquartile range (IQR): 810.5

Descriptive statistics

Standard deviation: 468.37538
Coefficient of variation (CV): 0.57717238
Kurtosis: -1.2
Mean: 811.5
Median Absolute Deviation (MAD): 405.5
Skewness: 0
Sum: 1316253
Variance: 219375.5
Monotonicity: Strictly increasing
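
These figures are what a plain running index produces. The sketch below assumes the non-missing values of 일련 번호 are exactly the integers 1 to 1622 (suggested by Minimum, Maximum, Distinct and the strictly increasing order, but not stated outright in the report) and recomputes the statistics with the sample (n - 1) variance and linearly interpolated percentiles. The helper `percentile` is written for illustration only.

```python
import math

# Assumption: the non-missing serial numbers are exactly 1..1622.
values = list(range(1, 1623))
n = len(values)

total = sum(values)
mean = total / n
variance = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance
std = math.sqrt(variance)
cv = std / mean

def percentile(sorted_values, q):
    """Linearly interpolated percentile of sorted data, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

print(total, mean, variance)       # sum, mean, variance
print(round(std, 5), round(cv, 8))
print(percentile(values, 0.25), percentile(values, 0.75))
```

Under that assumption the recomputed sum, mean, variance and quartiles agree with the values listed above, which supports reading this column as a row counter rather than a measured quantity.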
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1080 1
 
0.1%
1090 1
 
0.1%
1089 1
 
0.1%
1088 1
 
0.1%
1087 1
 
0.1%
1086 1
 
0.1%
1085 1
 
0.1%
1084 1
 
0.1%
1083 1
 
0.1%
1082 1
 
0.1%
Other values (1612) 1612
92.3%
(Missing) 124
 
7.1%
ValueCountFrequency (%)
1 1
0.1%
2 1
0.1%
3 1
0.1%
4 1
0.1%
5 1
0.1%
6 1
0.1%
7 1
0.1%
8 1
0.1%
9 1
0.1%
10 1
0.1%
ValueCountFrequency (%)
1622 1
0.1%
1621 1
0.1%
1620 1
0.1%
1619 1
0.1%
1618 1
0.1%
1617 1
0.1%
1616 1
0.1%
1615 1
0.1%
1614 1
0.1%
1613 1
0.1%

H S K
Text

MISSING 

Distinct: 1622
Distinct (%): 100.0%
Missing: 124
Missing (%): 7.1%
Memory size: 13.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 19464
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1622
Unique (%): 100.0%

Sample

1st row: 0101.21.1000
2nd row: 0101.21.9000
3rd row: 0101.29.1000
4th row: 0101.29.9000
5th row: 0101.30.1000
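
Every sampled code has the same 12-character shape: ten digits grouped 4-2-4 and separated by dots. A minimal sketch of splitting such a code into its Harmonized System levels follows. The function name `split_hsk` and the dictionary keys are hypothetical, and the pattern is inferred from the sample rows rather than taken from an official specification.

```python
import re

# Pattern inferred from the sample rows: four digits, two digits, four digits.
HSK_PATTERN = re.compile(r"(\d{4})\.(\d{2})\.(\d{4})")

def split_hsk(code: str) -> dict:
    """Split a code such as '0101.21.1000' into its HS levels."""
    match = HSK_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unexpected HSK format: {code!r}")
    heading, sub, national = match.groups()
    return {
        "chapter": heading[:2],           # 2-digit HS chapter
        "heading": heading,               # 4-digit HS heading
        "subheading": heading + sub,      # 6-digit HS subheading
        "hsk": heading + sub + national,  # full 10-digit code without dots
    }

print(split_hsk("0101.21.1000"))
```

Keeping the levels as strings preserves the leading zero of chapters such as 01, which would be lost if the code were parsed as a number.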
ValueCountFrequency (%)
0102.29.9000 1
 
0.1%
1803.20.0000 1
 
0.1%
1703.10.1000 1
 
0.1%
1801.00.1000 1
 
0.1%
1704.90.9000 1
 
0.1%
1704.90.2090 1
 
0.1%
1704.90.2020 1
 
0.1%
1704.90.2010 1
 
0.1%
1704.90.1000 1
 
0.1%
1704.10.0000 1
 
0.1%
Other values (1612) 1612
99.4%

Most occurring characters

ValueCountFrequency (%)
0 8069
41.5%
. 3244
16.7%
1 2518
 
12.9%
2 1623
 
8.3%
9 1508
 
7.7%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16220
83.3%
Other Punctuation 3244
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8069
49.7%
1 2518
 
15.5%
2 1623
 
10.0%
9 1508
 
9.3%
3 624
 
3.8%
4 485
 
3.0%
5 466
 
2.9%
6 342
 
2.1%
7 330
 
2.0%
8 255
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 3244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19464
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8069
41.5%
. 3244
16.7%
1 2518
 
12.9%
2 1623
 
8.3%
9 1508
 
7.7%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19464
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8069
41.5%
. 3244
16.7%
1 2518
 
12.9%
2 1623
 
8.3%
9 1508
 
7.7%
3 624
 
3.2%
4 485
 
2.5%
5 466
 
2.4%
6 342
 
1.8%
7 330
 
1.7%

한 글 품 명
Text

MISSING 

Distinct: 1650
Distinct (%): 98.2%
Missing: 65
Missing (%): 3.7%
Memory size: 13.8 KiB

Length

Max length: 56
Median length: 47
Mean length: 12.149911
Min length: 1

Characters and Unicode

Total characters: 20424
Distinct characters: 596
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4

Unique

Unique: 1627
Unique (%): 96.8%

Sample

1st row: 말(번식용/농가사육용)
2nd row: 말(번식용/기타)
3rd row: 말(기타/경주말)
4th row: 말(기타/기타)
5th row: 당나귀(번식용)
ValueCountFrequency (%)
207
 
5.7%
또는 96
 
2.7%
기타 95
 
2.6%
63
 
1.7%
34
 
0.9%
종자 29
 
0.8%
안한 28
 
0.8%
분획물 22
 
0.6%
도메스티쿠스종에 21
 
0.6%
동물의 19
 
0.5%
Other values (2072) 3008
83.0%

Most occurring characters

ValueCountFrequency (%)
2006
 
9.8%
( 1199
 
5.9%
) 1198
 
5.9%
709
 
3.5%
624
 
3.1%
/ 506
 
2.5%
311
 
1.5%
306
 
1.5%
303
 
1.5%
302
 
1.5%
Other values (586) 12960
63.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 14849
72.7%
Space Separator 2006
 
9.8%
Open Punctuation 1199
 
5.9%
Close Punctuation 1198
 
5.9%
Other Punctuation 696
 
3.4%
Decimal Number 343
 
1.7%
Dash Punctuation 74
 
0.4%
Lowercase Letter 57
 
0.3%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
709
 
4.8%
624
 
4.2%
311
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
244
 
1.6%
Other values (560) 11254
75.8%
Decimal Number
ValueCountFrequency (%)
0 102
29.7%
1 63
18.4%
2 41
12.0%
5 35
 
10.2%
6 22
 
6.4%
8 20
 
5.8%
4 20
 
5.8%
9 18
 
5.2%
3 17
 
5.0%
7 5
 
1.5%
Other Punctuation
ValueCountFrequency (%)
/ 506
72.7%
, 117
 
16.8%
· 31
 
4.5%
. 29
 
4.2%
% 11
 
1.6%
: 1
 
0.1%
? 1
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
g 25
43.9%
k 16
28.1%
m 16
28.1%
Math Symbol
ValueCountFrequency (%)
> 1
50.0%
< 1
50.0%
Space Separator
ValueCountFrequency (%)
2006
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1199
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1198
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 74
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 14839
72.7%
Common 5518
 
27.0%
Latin 57
 
0.3%
Han 10
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
709
 
4.8%
624
 
4.2%
311
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
244
 
1.6%
Other values (550) 11244
75.8%
Common
ValueCountFrequency (%)
2006
36.4%
( 1199
21.7%
) 1198
21.7%
/ 506
 
9.2%
, 117
 
2.1%
0 102
 
1.8%
- 74
 
1.3%
1 63
 
1.1%
2 41
 
0.7%
5 35
 
0.6%
Other values (13) 177
 
3.2%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Latin
ValueCountFrequency (%)
g 25
43.9%
k 16
28.1%
m 16
28.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 14839
72.7%
ASCII 5544
 
27.1%
None 31
 
0.2%
CJK 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2006
36.2%
( 1199
21.6%
) 1198
21.6%
/ 506
 
9.1%
, 117
 
2.1%
0 102
 
1.8%
- 74
 
1.3%
1 63
 
1.1%
2 41
 
0.7%
5 35
 
0.6%
Other values (15) 203
 
3.7%
Hangul
ValueCountFrequency (%)
709
 
4.8%
624
 
4.2%
311
 
2.1%
306
 
2.1%
303
 
2.0%
302
 
2.0%
282
 
1.9%
264
 
1.8%
250
 
1.7%
244
 
1.6%
Other values (550) 11244
75.8%
None
ValueCountFrequency (%)
· 31
100.0%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

영 문 품 명
Text

MISSING 

Distinct: 1597
Distinct (%): 97.7%
Missing: 111
Missing (%): 6.4%
Memory size: 13.8 KiB

Length

Max length: 150
Median length: 81
Mean length: 31.721101
Min length: 4

Characters and Unicode

Total characters: 51864
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 1580
Unique (%): 96.6%

Sample

1st row: Horses: Pure-bred breeding anmials(For farm breeding)
2nd row: Horses: Pure-bred breeding anmials(Other)
3rd row: Horses: Other(Horses for racing)
4th row: Horses: Other(Other)
5th row: Asses: Pure-bred breeding animals
ValueCountFrequency (%)
or 367
 
5.5%
of 342
 
5.1%
other 311
 
4.6%
and 242
 
3.6%
meat 118
 
1.8%
chilled 80
 
1.2%
preserved 73
 
1.1%
the 63
 
0.9%
offal 57
 
0.8%
by 56
 
0.8%
Other values (1892) 5014
74.6%

Most occurring characters

ValueCountFrequency (%)
e 5742
 
11.1%
5098
 
9.8%
r 3923
 
7.6%
o 3194
 
6.2%
a 3134
 
6.0%
s 2965
 
5.7%
t 2938
 
5.7%
i 2606
 
5.0%
n 2420
 
4.7%
d 1870
 
3.6%
Other values (65) 17974
34.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40943
78.9%
Space Separator 5098
 
9.8%
Uppercase Letter 2548
 
4.9%
Open Punctuation 1090
 
2.1%
Close Punctuation 1089
 
2.1%
Other Punctuation 737
 
1.4%
Decimal Number 249
 
0.5%
Dash Punctuation 105
 
0.2%
Other Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5742
14.0%
r 3923
 
9.6%
o 3194
 
7.8%
a 3134
 
7.7%
s 2965
 
7.2%
t 2938
 
7.2%
i 2606
 
6.4%
n 2420
 
5.9%
d 1870
 
4.6%
l 1777
 
4.3%
Other values (16) 10374
25.3%
Uppercase Letter
ValueCountFrequency (%)
O 717
28.1%
C 239
 
9.4%
S 212
 
8.3%
M 170
 
6.7%
P 169
 
6.6%
F 127
 
5.0%
B 121
 
4.7%
R 110
 
4.3%
G 106
 
4.2%
L 89
 
3.5%
Other values (16) 488
19.2%
Decimal Number
ValueCountFrequency (%)
0 75
30.1%
5 40
16.1%
1 39
15.7%
2 30
 
12.0%
4 18
 
7.2%
9 15
 
6.0%
8 13
 
5.2%
3 10
 
4.0%
6 6
 
2.4%
7 3
 
1.2%
Other Punctuation
ValueCountFrequency (%)
/ 334
45.3%
, 223
30.3%
. 79
 
10.7%
: 41
 
5.6%
% 37
 
5.0%
' 16
 
2.2%
; 7
 
0.9%
Other Symbol
ValueCountFrequency (%)
° 4
80.0%
1
 
20.0%
Space Separator
ValueCountFrequency (%)
5098
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1090
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1089
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 105
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43491
83.9%
Common 8373
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5742
13.2%
r 3923
 
9.0%
o 3194
 
7.3%
a 3134
 
7.2%
s 2965
 
6.8%
t 2938
 
6.8%
i 2606
 
6.0%
n 2420
 
5.6%
d 1870
 
4.3%
l 1777
 
4.1%
Other values (42) 12922
29.7%
Common
ValueCountFrequency (%)
5098
60.9%
( 1090
 
13.0%
) 1089
 
13.0%
/ 334
 
4.0%
, 223
 
2.7%
- 105
 
1.3%
. 79
 
0.9%
0 75
 
0.9%
: 41
 
0.5%
5 40
 
0.5%
Other values (13) 199
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51859
> 99.9%
None 4
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5742
 
11.1%
5098
 
9.8%
r 3923
 
7.6%
o 3194
 
6.2%
a 3134
 
6.0%
s 2965
 
5.7%
t 2938
 
5.7%
i 2606
 
5.0%
n 2420
 
4.7%
d 1870
 
3.6%
Other values (63) 17969
34.6%
None
ValueCountFrequency (%)
° 4
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%

기본세율
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 27
Distinct (%): 1.7%
Missing: 124
Missing (%): 7.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.585697
Minimum: 0
Maximum: 50
Zeros: 107
Zeros (%): 6.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 8
Q3: 27
95-th percentile: 45
Maximum: 50
Range: 50
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.688249
Coefficient of variation (CV): 0.82530441
Kurtosis: -0.51263131
Mean: 16.585697
Median Absolute Deviation (MAD): 8
Skewness: 0.74917662
Sum: 26902
Variance: 187.36815
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
8.0 456
26.1%
30.0 203
11.6%
5.0 142
 
8.1%
20.0 128
 
7.3%
3.0 122
 
7.0%
0.0 107
 
6.1%
27.0 102
 
5.8%
45.0 58
 
3.3%
18.0 54
 
3.1%
50.0 46
 
2.6%
Other values (17) 204
11.7%
(Missing) 124
 
7.1%
ValueCountFrequency (%)
0.0 107
6.1%
1.0 2
 
0.1%
1.8 4
 
0.2%
2.0 28
 
1.6%
3.0 122
7.0%
4.0 1
 
0.1%
4.2 3
 
0.2%
5.0 142
8.1%
5.4 6
 
0.3%
7.0 1
 
0.1%
ValueCountFrequency (%)
50.0 46
 
2.6%
45.0 58
 
3.3%
40.0 46
 
2.6%
36.0 37
 
2.1%
32.8 1
 
0.1%
30.0 203
11.6%
27.0 102
5.8%
25.0 20
 
1.1%
24.0 2
 
0.1%
22.5 38
 
2.2%

양허기준세율
Text

MISSING 

Distinct: 85
Distinct (%): 5.2%
Missing: 125
Missing (%): 7.2%
Memory size: 13.8 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.5003085
Min length: 1

Characters and Unicode

Total characters: 4053
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 16
Unique (%): 1.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20
ValueCountFrequency (%)
30 283
17.5%
20 236
14.6%
10 143
 
8.8%
60 98
 
6.0%
40 74
 
4.6%
50 66
 
4.1%
100 59
 
3.6%
25 52
 
3.2%
59.2 48
 
3.0%
35 40
 
2.5%
Other values (75) 522
32.2%

Most occurring characters

ValueCountFrequency (%)
0 1119
27.6%
2 495
12.2%
3 460
11.3%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3718
91.7%
Other Punctuation 319
 
7.9%
Dash Punctuation 16
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1119
30.1%
2 495
13.3%
3 460
12.4%
5 399
 
10.7%
1 339
 
9.1%
4 256
 
6.9%
6 197
 
5.3%
9 182
 
4.9%
7 144
 
3.9%
8 127
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 319
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%
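
The 16 dash characters are the likely reason this rate column was profiled as Text rather than as a number. A minimal sketch of a numeric conversion follows, assuming a stand-alone "-" marks "no rate" (the report does not say what the dash means). The helper `parse_rate` is hypothetical.

```python
from typing import Optional

def parse_rate(raw: Optional[str]) -> Optional[float]:
    """Convert a tariff-rate string to float.

    Assumption: a missing value, an empty string, or a lone '-' all mean
    'no rate' and are returned as None.
    """
    if raw is None:
        return None
    text = raw.strip()
    if text in ("", "-"):
        return None
    return float(text)

print([parse_rate(v) for v in ["20", "59.2", "-", None]])
```

If the dash turns out to carry a distinct meaning (for example "not applicable" as opposed to "unknown"), it would be safer to keep it in a separate flag column before converting.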

Most occurring scripts

ValueCountFrequency (%)
Common 4053
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1119
27.6%
2 495
12.2%
3 460
11.3%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4053
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1119
27.6%
2 495
12.2%
3 460
11.3%
5 399
 
9.8%
1 339
 
8.4%
. 319
 
7.9%
4 256
 
6.3%
6 197
 
4.9%
9 182
 
4.5%
7 144
 
3.6%
Other values (2) 143
 
3.5%

2014양허관세
Text

MISSING 

Distinct: 101
Distinct (%): 6.2%
Missing: 128
Missing (%): 7.3%
Memory size: 13.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 2.7348578
Min length: 1

Characters and Unicode

Total characters: 4425
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 19
Unique (%): 1.2%

Sample

1st row: 13.1
2nd row: 13.1
3rd row: 13.1
4th row: 13.1
5th row: 13.1
ValueCountFrequency (%)
27 168
 
10.4%
18 125
 
7.7%
19.7 123
 
7.6%
54 121
 
7.5%
13.1 110
 
6.8%
45 93
 
5.7%
36 71
 
4.4%
6.6 57
 
3.5%
22.5 55
 
3.4%
30 54
 
3.3%
Other values (91) 641
39.6%

Most occurring characters

ValueCountFrequency (%)
. 633
14.3%
1 609
13.8%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 392
8.9%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3776
85.3%
Other Punctuation 633
 
14.3%
Dash Punctuation 16
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 609
16.1%
2 512
13.6%
5 501
13.3%
7 398
10.5%
3 392
10.4%
4 384
10.2%
6 285
7.5%
9 233
 
6.2%
0 232
 
6.1%
8 230
 
6.1%
Other Punctuation
ValueCountFrequency (%)
. 633
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4425
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 633
14.3%
1 609
13.8%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 392
8.9%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4425
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 633
14.3%
1 609
13.8%
2 512
11.6%
5 501
11.3%
7 398
9.0%
3 392
8.9%
4 384
8.7%
6 285
6.4%
9 233
 
5.3%
0 232
 
5.2%
Other values (2) 246
 
5.6%

시장접근세율
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 17
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

<NA>: 1474
20: 73
-: 45
5: 43
8: 24
Other values (12): 87

Length

Max length: 4
Median length: 4
Mean length: 3.6122566
Min length: 1

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1474
84.4%
20 73
 
4.2%
- 45
 
2.6%
5 43
 
2.5%
8 24
 
1.4%
50 20
 
1.1%
40 16
 
0.9%
3 14
 
0.8%
30 13
 
0.7%
0 11
 
0.6%
Other values (7) 13
 
0.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 1474
84.4%
20 73
 
4.2%
45
 
2.6%
5 43
 
2.5%
8 24
 
1.4%
50 20
 
1.1%
40 16
 
0.9%
3 14
 
0.8%
30 13
 
0.7%
0 11
 
0.6%
Other values (7) 13
 
0.7%

국제협력관세
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 13.0%
Missing: 1700
Missing (%): 97.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.543478
Minimum: 5
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 KiB

Quantile statistics

Minimum: 5
5-th percentile: 15
Q1: 15
Median: 15
Q3: 20
95-th percentile: 40
Maximum: 40
Range: 35
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 9.2031553
Coefficient of variation (CV): 0.44798428
Kurtosis: 0.10630103
Mean: 20.543478
Median Absolute Deviation (MAD): 5
Skewness: 1.017925
Sum: 945
Variance: 84.698068
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
15 22
 
1.3%
20 12
 
0.7%
35 5
 
0.3%
40 4
 
0.2%
5 2
 
0.1%
30 1
 
0.1%
(Missing) 1700
97.4%
ValueCountFrequency (%)
5 2
 
0.1%
15 22
1.3%
20 12
0.7%
30 1
 
0.1%
35 5
 
0.3%
40 4
 
0.2%
ValueCountFrequency (%)
40 4
 
0.2%
35 5
 
0.3%
30 1
 
0.1%
20 12
0.7%
15 22
1.3%
5 2
 
0.1%
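
With only 46 non-missing values, the summary statistics for 국제협력관세 can be rebuilt directly from the frequency table. The sketch below copies the six value/count pairs from the table and assumes the report uses the sample (n - 1) variance. The variable names are illustrative.

```python
import math

# Value -> count, copied from the frequency table for 국제협력관세.
freq = {15: 22, 20: 12, 35: 5, 40: 4, 5: 2, 30: 1}
n_missing = 1700

n = sum(freq.values())                        # non-missing rows
total = sum(v * c for v, c in freq.items())   # "Sum" in the report
mean = total / n
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)
missing_pct = 100 * n_missing / (n + n_missing)

print(n, total, round(mean, 6))
print(round(variance, 6), round(std, 7), round(missing_pct, 1))
```

Because 97.4% of the rows are missing, these statistics describe a small subset of products and should not be read as representative of the whole table.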

탄력관세
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 18
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

<NA>: 1624
(12월말): 44
할당0: 41
(6월말): 10
조정45%: 5
Other values (13): 22

Length

Max length: 9
Median length: 4
Mean length: 4.0446735
Min length: 3

Unique

Unique: 8
Unique (%): 0.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1624
93.0%
(12월말) 44
 
2.5%
할당0 41
 
2.3%
(6월말) 10
 
0.6%
조정45% 5
 
0.3%
할당25 4
 
0.2%
할당1 3
 
0.2%
할당5 3
 
0.2%
조정40%, 2
 
0.1%
할당4 2
 
0.1%
Other values (8) 8
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 1624
93.0%
12월말 44
 
2.5%
할당0 41
 
2.3%
6월말 10
 
0.6%
조정45 5
 
0.3%
할당25 4
 
0.2%
할당1 3
 
0.2%
할당5 3
 
0.2%
할당4 2
 
0.1%
조정40 2
 
0.1%
Other values (8) 8
 
0.5%

개방구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

<NA>: 1455
TC: 72
TM: 51
BM: 51
BC: 49
Other values (3): 68

Length

Max length: 4
Median length: 4
Mean length: 3.6666667
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1455
83.3%
TC 72
 
4.1%
TM 51
 
2.9%
BM 51
 
2.9%
BC 49
 
2.8%
BX 40
 
2.3%
ST 16
 
0.9%
TX 12
 
0.7%
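
The "highly imbalanced (63.7%)" alert for 개방구분 can be approximated from these counts. The sketch below assumes the imbalance score is one minus the ratio of the Shannon entropy to its maximum for the same number of classes, with <NA> counted as a class. That definition is an assumption made here, not something stated in the report, and `imbalance_score` is an illustrative helper.

```python
import math

# Counts copied from the "Common Values" table for 개방구분 (including <NA>).
counts = {"<NA>": 1455, "TC": 72, "TM": 51, "BM": 51,
          "BC": 49, "BX": 40, "ST": 16, "TX": 12}

def imbalance_score(class_counts):
    """1 - entropy / max entropy: 0 for a uniform split, 1 for a single class."""
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log(k)

score = imbalance_score(list(counts.values()))
print(f"{100 * score:.1f}%")
```

Under this definition the score is driven almost entirely by the <NA> class, so the alert says more about sparsity than about the split among TC, TM, BM and the other codes.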

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 1455
83.3%
tc 72
 
4.1%
tm 51
 
2.9%
bm 51
 
2.9%
bc 49
 
2.8%
bx 40
 
2.3%
st 16
 
0.9%
tx 12
 
0.7%

개방연도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

<NA>: 1455
95.1: 197
97.7: 39
96.7: 22
-: 16
Other values (2): 17

Length

Max length: 4
Median length: 4
Mean length: 3.9644903
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

ValueCountFrequency (%)
<NA> 1455
83.3%
95.1 197
 
11.3%
97.7 39
 
2.2%
96.7 22
 
1.3%
- 16
 
0.9%
1.1 14
 
0.8%
96.1 3
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 1455
83.3%
95.1 197
 
11.3%
97.7 39
 
2.2%
96.7 22
 
1.3%
16
 
0.9%
1.1 14
 
0.8%
96.1 3
 
0.2%

Interactions

[Figure: nine interaction plots; images not reproduced in this text version]

Correlations

Variable | 일련 번호 | 기본세율 | 양허기준세율 | 시장접근세율 | 국제협력관세 | 탄력관세 | 개방구분 | 개방연도
일련 번호 | 1.000 | 0.764 | 0.868 | 0.782 | 0.912 | 0.826 | 0.674 | 0.680
기본세율 | 0.764 | 1.000 | 0.912 | 0.893 | 0.920 | 0.701 | 0.677 | 0.777
양허기준세율 | 0.868 | 0.912 | 1.000 | 0.984 | 0.832 | 0.861 | 0.980 | 0.974
시장접근세율 | 0.782 | 0.893 | 0.984 | 1.000 | NaN | 0.758 | 0.792 | 0.725
국제협력관세 | 0.912 | 0.920 | 0.832 | NaN | 1.000 | NaN | NaN | NaN
탄력관세 | 0.826 | 0.701 | 0.861 | 0.758 | NaN | 1.000 | 0.266 | 0.000
개방구분 | 0.674 | 0.677 | 0.980 | 0.792 | NaN | 0.266 | 1.000 | 0.730
개방연도 | 0.680 | 0.777 | 0.974 | 0.725 | NaN | 0.000 | 0.730 | 1.000

Variable | 개방연도 | 개방구분 | 탄력관세 | 시장접근세율
개방연도 | 1.000 | 0.545 | 0.000 | 0.465
개방구분 | 0.545 | 1.000 | 0.240 | 0.511
탄력관세 | 0.000 | 0.240 | 1.000 | 0.567
시장접근세율 | 0.465 | 0.511 | 0.567 | 1.000

Variable | 일련 번호 | 기본세율 | 국제협력관세 | 시장접근세율 | 탄력관세 | 개방구분 | 개방연도
일련 번호 | 1.000 | -0.216 | -0.174 | 0.446 | 0.556 | 0.423 | 0.441
기본세율 | -0.216 | 1.000 | -0.052 | 0.674 | 0.553 | 0.377 | 0.473
국제협력관세 | -0.174 | -0.052 | 1.000 | 0.000 | NaN | 0.000 | 0.000
시장접근세율 | 0.446 | 0.674 | 0.000 | 1.000 | 0.567 | 0.511 | 0.465
탄력관세 | 0.556 | 0.553 | NaN | 0.567 | 1.000 | 0.240 | 0.000
개방구분 | 0.423 | 0.377 | 0.000 | 0.511 | 0.240 | 1.000 | 0.545
개방연도 | 0.441 | 0.473 | 0.000 | 0.465 | 0.000 | 0.545 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 일련 번호 | H S K | 한 글 품 명 | 영 문 품 명 | 기본세율 | 양허기준세율 | 2014양허관세 | 시장접근세율 | 국제협력관세 | 탄력관세 | 개방구분 | 개방연도
0 | 1 | 0101.21.1000 | 말(번식용/농가사육용) | Horses: Pure-bred breeding anmials(For farm breeding) | 0.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 2 | 0101.21.9000 | 말(번식용/기타) | Horses: Pure-bred breeding anmials(Other) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 3 | 0101.29.1000 | 말(기타/경주말) | Horses: Other(Horses for racing) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 4 | 0101.29.9000 | 말(기타/기타) | Horses: Other(Other) | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 5 | 0101.30.1000 | 당나귀(번식용) | Asses: Pure-bred breeding animals | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 6 | 0101.30.9000 | 당나귀(기타) | Asses: Other | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 7 | 0101.90.0000 | 기타(기타) | Other | 8.0 | 20 | 13.1 | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 8 | 0102.21.1000 | 축우(번식용/젖소) | Cattle: Pure-bred breeding animals(For milk) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1
8 | 9 | 0102.21.2000 | 축우(번식용/육우) | Cattle: Pure-bred breeding animals(For meat) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1
9 | 10 | 0102.21.9000 | 축우(번식용/기타) | Cattle: Pure-bred breeding animals(Other) | 0.0 | 99 | 89.1 | 0 | <NA> | <NA> | TM | 95.1

Index | 일련 번호 | H S K | 한 글 품 명 | 영 문 품 명 | 기본세율 | 양허기준세율 | 2014양허관세 | 시장접근세율 | 국제협력관세 | 탄력관세 | 개방구분 | 개방연도
1736 | 1613 | 5203.00.0000 | 면(카드 또는 코움한 것) | Cotton, carded or combed | 0.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1737 | 1614 | 5301.10.0000 | 생아마 또는 침지아마 | Flax, raw or retted | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1738 | 1615 | 5301.21.0000 | 아마(쇄경 또는 타마한 것) | Flax(broken or scutched) | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1739 | 1616 | 5301.29.0000 | 아마(기타) | Other flax | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1740 | 1617 | 5301.30.1000 | 아마의 토우 | Flax tow | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1741 | 1618 | 5301.30.2000 | 아마의 웨이스트 | Flax waste | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1742 | 1619 | 5302.10.0000 | 생대마 또는 침지대마 | True hemp, raw or retted | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1743 | 1620 | 5302.90.1000 | 쇄경, 탐, 핵클 또는 기타의 방법으로 가공한 대마 | True hemp, broken, scutched, hackled or other wise processed | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1744 | 1621 | 5302.90.2010 | 대마의 토우 | Tow of true hemp | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>
1745 | 1622 | 5302.90.2020 | 대마의 웨이스트 | Waste of true hemp | 2.0 | 10 | 2 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

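Index | 일련 번호 | H S K | 한 글 품 명 | 영 문 품 명 | 기본세율 | 양허기준세율 | 2014양허관세 | 시장접근세율 | 국제협력관세 | 탄력관세 | 개방구분 | 개방연도 | # duplicates
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | (12월말) | <NA> | <NA> | 41
5 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | (6월말) | <NA> | <NA> | 9
1 | <NA> | <NA> | /유연처리 안한 것) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
0 | <NA> | <NA> | (밀폐용기의 것) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
2 | <NA> | <NA> | 안한 것) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
3 | <NA> | <NA> | 전중량의 100분의 10을 초과) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2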
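
A listing like this one can be produced by counting identical rows and keeping those that occur more than once. The sketch below is one way to do that with the standard library. It is not the profiler's own implementation, and `duplicate_groups` is a hypothetical helper. The toy input reuses two of the duplicated patterns from this listing plus one unique row from the sample, each written as a 12-field tuple with `None` standing in for <NA>.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return (row, count) pairs for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]

NA = None
# Nine empty fields, then 탄력관세, 개방구분, 개방연도.
dec_note = (NA,) * 9 + ("(12월말)", NA, NA)
jun_note = (NA,) * 9 + ("(6월말)", NA, NA)
unique_row = (1, "0101.21.1000", "말(번식용/농가사육용)",
              "Horses: Pure-bred breeding anmials(For farm breeding)",
              0.0, "20", "13.1", NA, NA, NA, NA, NA)

rows = [dec_note] * 41 + [jun_note] * 9 + [unique_row]
for row, n in duplicate_groups(rows):
    print(n, row[9])
```

All six duplicated patterns lack both a serial number and an HSK code, so they are better reviewed as a group before any rows are dropped.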