Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 19958
Missing cells (%): 22.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 830.1 KiB
Average record size in memory: 85.0 B
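
The derived figures above can be cross-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that the average record size is the total memory divided by the number of observations. Both readings are assumptions, though they agree with the reported values.

```python
# Figures copied from the dataset statistics above.
n_variables = 9
n_observations = 10_000
missing_cells = 19_958
total_size_kib = 830.1

# Assumed definition: share of missing cells among all cells.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumed definition: total memory divided by the row count (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```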

Variable types

Numeric: 3
Categorical: 3
Unsupported: 1
DateTime: 2

Dataset

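Description: Korea Customs Service tariff schedule (as of 2019); the basic tariff rates are revised on a 5-year cycle (update scheduled for 2022)
Author: Korea Customs Service (관세청)
URL: https://www.data.go.kr/data/15051181/fileData.do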

Alerts

적용만료일 has constant value "" (Constant)
적용국가구분 is highly overall correlated with 관세율구분 (High correlation)
관세율구분 is highly overall correlated with 적용국가구분 (High correlation)
관세율 is highly overall correlated with 단위당세액 and 1 other fields (High correlation)
단위당세액 is highly overall correlated with 관세율 (High correlation)
용도세율구분 is highly overall correlated with 관세율 (High correlation)
용도세율구분 is highly imbalanced (97.6%) (Imbalance)
단위당세액 has 9958 (99.6%) missing values (Missing)
기준가격 has 10000 (100.0%) missing values (Missing)
관세율 is highly skewed (γ1 = 20.12887725) (Skewed)
기준가격 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
관세율 has 6509 (65.1%) zeros (Zeros)
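
The imbalance figure for 용도세율구분 can be reproduced from its category counts (9962, 34 and 4, as listed in that variable's section). The sketch assumes the imbalance score is one minus the Shannon entropy of the class frequencies, normalised by the logarithm of the number of classes. That definition is an assumption about the profiler, although it matches the 97.6% reported here; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalised Shannon entropy of the class counts (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no imbalance information
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Category counts of 용도세율구분: "<NA>", "A", "C".
score = imbalance_score([9962, 34, 4])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single category dominates.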

Reproduction

Analysis started: 2023-12-12 07:58:23.506898
Analysis finished: 2023-12-12 07:58:25.414536
Duration: 1.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
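
A report like this one could be regenerated along the following lines. This is a minimal sketch, not the original script: the function name `profile_csv`, the file paths, the report title and the `cp949` encoding are all hypothetical, and the contents of config.json are not reproduced. It relies on `ProfileReport` and `to_file` from ydata-profiling and passes the dataset metadata shown above.

```python
def profile_csv(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the function can be defined without the dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to the actual file.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Profiling report",  # placeholder title
        dataset={
            "description": "Korea Customs Service tariff schedule (as of 2019)",
            "author": "관세청",
            "url": "https://www.data.go.kr/data/15051181/fileData.do",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `profile_csv("tariff.csv", "report.html")`, where both file names are placeholders, would write the HTML report next to the script.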

Variables

품목번호 (item number)
Real number (ℝ)

Distinct: 6988
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.1712039 × 10^9
Minimum: 1.01211 × 10^8
Maximum: 9.706002 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.01211 × 10^8
5-th percentile: 4.02919 × 10^8
Q1: 2.9152325 × 10^9
Median: 5.109505 × 10^9
Q3: 8.4091 × 10^9
95-th percentile: 9.023009 × 10^9
Maximum: 9.706002 × 10^9
Range: 9.604791 × 10^9
Interquartile range (IQR): 5.4938675 × 10^9

Descriptive statistics

Standard deviation: 2.8011458 × 10^9
Coefficient of variation (CV): 0.54168156
Kurtosis: -1.2703053
Mean: 5.1712039 × 10^9
Median Absolute Deviation (MAD): 2.284003 × 10^9
Skewness: -0.079352406
Sum: 5.1712039 × 10^13
Variance: 7.846418 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
3402903000 | 5 | 0.1%
5603121000 | 5 | 0.1%
303990000 | 5 | 0.1%
5311002000 | 5 | 0.1%
3904500000 | 5 | 0.1%
5006004000 | 4 | < 0.1%
703901000 | 4 | < 0.1%
6103420000 | 4 | < 0.1%
3702322000 | 4 | < 0.1%
1212212020 | 4 | < 0.1%
Other values (6978) | 9955 | 99.6%

Value | Count | Frequency (%)
101211000 | 1 | < 0.1%
101219000 | 1 | < 0.1%
101291000 | 2 | < 0.1%
101299000 | 1 | < 0.1%
102391000 | 1 | < 0.1%
102392000 | 1 | < 0.1%
102909020 | 1 | < 0.1%
103100000 | 1 | < 0.1%
103910000 | 1 | < 0.1%
104101000 | 1 | < 0.1%

Value | Count | Frequency (%)
9706002000 | 1 | < 0.1%
9706001000 | 1 | < 0.1%
9705000000 | 1 | < 0.1%
9704001000 | 1 | < 0.1%
9703002000 | 1 | < 0.1%
9703001000 | 1 | < 0.1%
9701103000 | 1 | < 0.1%
9701101000 | 1 | < 0.1%
9620000000 | 2 | < 0.1%
9619009090 | 1 | < 0.1%
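
품목번호 is profiled as a real number, but its minimum (101211000, nine digits) and maximum (9706002000, ten digits) suggest a ten-digit code whose leading zero was dropped during numeric parsing. The sketch below restores the padding. The ten-digit width and the 4-2-4 grouping are assumptions based on the usual HS-style layout, not something the report states, and `format_item_code` is a hypothetical helper.

```python
def format_item_code(code, width=10):
    """Zero-pad a numeric item code and group it as 4-2-4 digits (assumed layout)."""
    digits = str(int(code)).zfill(width)
    if len(digits) != width:
        raise ValueError(f"code {code!r} does not fit in {width} digits")
    return f"{digits[:4]}.{digits[4:6]}.{digits[6:]}"

# Smallest and largest values from the tables above.
print(format_item_code(101211000))
print(format_item_code(9706002000))
```

Treating the column as a string identifier rather than a number would also make statistics such as the mean and variance, which have no meaning for a code, disappear from the profile.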

관세율구분 (tariff rate classification)
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
A: 1280
FAU1: 1278
FAS1: 1255
FCEHN1: 1249
FCECR1: 1246
Other values (36): 3692

Length

Max length: 6
Median length: 5
Mean length: 3.8262
Min length: 1

Unique

Unique: 13
Unique (%): 0.1%

Sample

1st row: FAU1
2nd row: FAU1
3rd row: FCENI1
4th row: FAS1
5th row: FCECR1

Common Values

Value | Count | Frequency (%)
A | 1280 | 12.8%
FAU1 | 1278 | 12.8%
FAS1 | 1255 | 12.6%
FCEHN1 | 1249 | 12.5%
FCECR1 | 1246 | 12.5%
FCA1 | 1237 | 12.4%
C | 1096 | 11.0%
FCENI1 | 731 | 7.3%
E1 | 285 | 2.9%
E2 | 110 | 1.1%
Other values (31) | 233 | 2.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
a | 1280 | 12.8%
fau1 | 1278 | 12.8%
fas1 | 1255 | 12.6%
fcehn1 | 1249 | 12.5%
fcecr1 | 1246 | 12.5%
fca1 | 1237 | 12.4%
c | 1096 | 11.0%
fceni1 | 731 | 7.3%
e1 | 285 | 2.9%
e2 | 110 | 1.1%
Other values (31) | 233 | 2.3%

관세율 (tariff rate)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 179
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.25299
Minimum: 0
Maximum: 887.4
Zeros: 6509
Zeros (%): 65.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 6.5
95-th percentile: 18
Maximum: 887.4
Range: 887.4
Interquartile range (IQR): 6.5

Descriptive statistics

Standard deviation: 26.818785
Coefficient of variation (CV): 5.1054323
Kurtosis: 500.32266
Mean: 5.25299
Median Absolute Deviation (MAD): 0
Skewness: 20.128877
Sum: 52529.9
Variance: 719.24722
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.0 | 6509 | 65.1%
8.0 | 892 | 8.9%
13.0 | 398 | 4.0%
5.0 | 194 | 1.9%
6.5 | 169 | 1.7%
10.0 | 98 | 1.0%
16.0 | 88 | 0.9%
3.0 | 80 | 0.8%
30.0 | 70 | 0.7%
5.6 | 70 | 0.7%
Other values (169) | 1432 | 14.3%

Value | Count | Frequency (%)
0.0 | 6509 | 65.1%
0.5 | 2 | < 0.1%
0.6 | 1 | < 0.1%
0.7 | 1 | < 0.1%
0.8 | 3 | < 0.1%
0.9 | 1 | < 0.1%
1.0 | 21 | 0.2%
1.1 | 1 | < 0.1%
1.2 | 2 | < 0.1%
1.3 | 7 | 0.1%

Value | Count | Frequency (%)
887.4 | 1 | < 0.1%
800.3 | 1 | < 0.1%
776.4 | 1 | < 0.1%
754.3 | 2 | < 0.1%
630.0 | 1 | < 0.1%
551.2 | 2 | < 0.1%
486.0 | 1 | < 0.1%
426.8 | 1 | < 0.1%
389.6 | 1 | < 0.1%
385.0 | 1 | < 0.1%
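
The descriptive statistics reported for 관세율 are mutually consistent, which the following sketch checks using only the numbers printed in this section: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations (10000, with no missing values).

```python
# Values copied from the 관세율 statistics above.
mean = 5.25299
std = 26.818785
n = 10_000

cv = std / mean        # compare with the reported CV of 5.1054323
variance = std ** 2    # compare with the reported variance of 719.24722
total = mean * n       # compare with the reported sum of 52529.9

print(f"CV={cv:.7f} variance={variance:.5f} sum={total:.1f}")
```

With a median of 0, a third quartile of 6.5 and a maximum of 887.4, the large CV reflects a few very high rates on top of a mass of zeros, which is also what the skewness alert points at.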

단위당세액 (duty amount per unit)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 34
Distinct (%): 81.0%
Missing: 9958
Missing (%): 99.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1664.7095
Minimum: 9
Maximum: 10552
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 9
5-th percentile: 21.25
Q1: 268.75
Median: 784
Q3: 1586
95-th percentile: 6210
Maximum: 10552
Range: 10543
Interquartile range (IQR): 1317.25

Descriptive statistics

Standard deviation: 2354.0649
Coefficient of variation (CV): 1.4140995
Kurtosis: 4.4651851
Mean: 1664.7095
Median Absolute Deviation (MAD): 593.1
Skewness: 2.1647616
Sum: 69917.8
Variance: 5541621.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value | Count | Frequency (%)
744.0 | 3 | < 0.1%
1586.0 | 3 | < 0.1%
974.0 | 3 | < 0.1%
6210.0 | 2 | < 0.1%
35.0 | 2 | < 0.1%
2110.0 | 1 | < 0.1%
720.0 | 1 | < 0.1%
5591.0 | 1 | < 0.1%
1778.0 | 1 | < 0.1%
1625.0 | 1 | < 0.1%
Other values (24) | 24 | 0.2%
(Missing) | 9958 | 99.6%

Value | Count | Frequency (%)
9.0 | 1 | < 0.1%
18.0 | 1 | < 0.1%
21.0 | 1 | < 0.1%
26.0 | 1 | < 0.1%
35.0 | 2 | < 0.1%
100.0 | 1 | < 0.1%
182.0 | 1 | < 0.1%
199.8 | 1 | < 0.1%
226.0 | 1 | < 0.1%
260.0 | 1 | < 0.1%

Value | Count | Frequency (%)
10552.0 | 1 | < 0.1%
6660.0 | 1 | < 0.1%
6210.0 | 2 | < 0.1%
5827.0 | 1 | < 0.1%
5591.0 | 1 | < 0.1%
4098.0 | 1 | < 0.1%
2110.0 | 1 | < 0.1%
1778.0 | 1 | < 0.1%
1625.0 | 1 | < 0.1%
1586.0 | 3 | < 0.1%
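
Because 단위당세액 is missing in 9958 of 10000 rows, its percentages and moments rest on only 42 values. The sketch below assumes that "Distinct (%)" and the mean are computed over non-missing values only; under that assumption it reproduces the reported 81.0% and 1664.7095.

```python
# Values copied from the 단위당세액 statistics above.
n_rows = 10_000
n_missing = 9_958
n_distinct = 34
value_sum = 69_917.8

n_present = n_rows - n_missing                  # rows that carry a value
distinct_pct = 100 * n_distinct / n_present     # assumed denominator: non-missing rows
mean = value_sum / n_present                    # assumed denominator: non-missing rows

print(f"present={n_present} distinct={distinct_pct:.1f}% mean={mean:.4f}")
```

With so few observations, the correlation flagged between 단위당세액 and 관세율 is likewise based on a very small sample.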

기준가격 (reference price)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

적용국가구분 (applicable country classification)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2: 7527
1: 2473

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 7527 | 75.3%
1 | 2473 | 24.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 7527 | 75.3%
1 | 2473 | 24.7%

용도세율구분 (end-use tariff rate classification)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
<NA>: 9962
A: 34
C: 4

Length

Max length: 4
Median length: 4
Mean length: 3.9886
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9962 | 99.6%
A | 34 | 0.3%
C | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9962 | 99.6%
a | 34 | 0.3%
c | 4 | < 0.1%

적용개시일 (effective start date)
Date

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-01-01 00:00:00
Maximum: 2020-12-01 00:00:00
Histogram with fixed size bins (bins=2)

적용만료일 (effective expiry date)
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-12-31 00:00:00
Maximum: 2020-12-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Interaction plots: 9 images, not reproduced here]

Correlations

(variable) | 품목번호 | 관세율구분 | 관세율 | 단위당세액 | 적용국가구분 | 용도세율구분 | 적용개시일
품목번호 | 1.000 | 0.334 | 0.159 | 0.426 | 0.065 | 0.000 | 0.015
관세율구분 | 0.334 | 1.000 | 0.167 | 0.000 | 1.000 | 0.399 | 0.526
관세율 | 0.159 | 0.167 | 1.000 | 0.735 | 0.029 | NaN | 0.000
단위당세액 | 0.426 | 0.000 | 0.735 | 1.000 | 0.000 | NaN | NaN
적용국가구분 | 0.065 | 1.000 | 0.029 | 0.000 | 1.000 | 0.000 | 0.000
용도세율구분 | 0.000 | 0.399 | NaN | NaN | 0.000 | 1.000 | NaN
적용개시일 | 0.015 | 0.526 | 0.000 | NaN | 0.000 | NaN | 1.000

(variable) | 용도세율구분 | 적용국가구분 | 관세율구분
용도세율구분 | 1.000 | 0.000 | 0.322
적용국가구분 | 0.000 | 1.000 | 0.998
관세율구분 | 0.322 | 0.998 | 1.000

(variable) | 품목번호 | 관세율 | 단위당세액 | 관세율구분 | 적용국가구분 | 용도세율구분
품목번호 | 1.000 | -0.240 | -0.365 | 0.121 | 0.050 | 0.000
관세율 | -0.240 | 1.000 | 0.516 | 0.058 | 0.022 | 1.000
단위당세액 | -0.365 | 0.516 | 1.000 | 0.000 | 0.000 | 0.000
관세율구분 | 0.121 | 0.058 | 0.000 | 1.000 | 0.998 | 0.322
적용국가구분 | 0.050 | 0.022 | 0.000 | 0.998 | 1.000 | 0.000
용도세율구분 | 0.000 | 1.000 | 0.000 | 0.322 | 0.000 | 1.000
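
The near-perfect association between 관세율구분 and 적용국가구분 (0.998 to 1.000 in the matrices above) is what a categorical association measure returns when one variable almost fully determines the other. The report does not name the measures behind these matrices, so the sketch below uses plain (uncorrected) Cramér's V as an illustrative stand-in, applied to made-up toy data rather than to the actual columns.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    k = min(len(x_counts), len(y_counts))
    if k < 2:
        return 0.0  # undefined for a constant variable; 0.0 is a convention chosen here
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy data (made up): each code maps to exactly one group.
codes = ["x", "x", "y", "y", "z", "z"]
groups = ["1", "1", "1", "1", "2", "2"]
print(cramers_v(codes, groups))
```

When every code belongs to exactly one group, as in the toy data, the statistic reaches its maximum; unrelated variables score near zero.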

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 품목번호 | 관세율구분 | 관세율 | 단위당세액 | 기준가격 | 적용국가구분 | 용도세율구분 | 적용개시일 | 적용만료일
48699 | 7315900000 | FAU1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
50819 | 8517621000 | FAU1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
93428 | 4407910000 | FCENI1 | 3.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
28875 | 504001031 | FAS1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
74309 | 8471501000 | FCECR1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
69025 | 3802100000 | FCECR1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
77532 | 714102090 | FCEHN1 | 776.4 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
92231 | 3403119000 | FCENI1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
68478 | 3006105010 | FCECR1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
91515 | 2929903000 | FCENI1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31

(index) | 품목번호 | 관세율구분 | 관세율 | 단위당세액 | 기준가격 | 적용국가구분 | 용도세율구분 | 적용개시일 | 적용만료일
40607 | 210930000 | FAU1 | 6.7 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
47115 | 5601210000 | FAU1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
64706 | 9404290000 | FCA1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
64901 | 9613909000 | FCA1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
50654 | 8504409099 | FAU1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
91562 | 2931320000 | FCENI1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
29087 | 712320000 | FAS1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
58770 | 5106109000 | FCA1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
56924 | 3304912000 | FCA1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31
84660 | 7502109000 | FCEHN1 | 0.0 | <NA> | <NA> | 2 | <NA> | 2020-01-01 | 2020-12-31