Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 4416
Missing cells (%): 5.5%
Duplicate rows: 1142
Duplicate rows (%): 11.4%
Total size in memory: 703.1 KiB
Average record size in memory: 72.0 B
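
The two percentages follow directly from the counts. A minimal sketch of the arithmetic, assuming the missing share is taken over all cells (variables × observations) and the duplicate share over all rows; the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics.
n_variables = 8
n_observations = 10_000
missing_cells = 4_416
duplicate_rows = 1_142

# Assumption: missing share is over all cells, duplicate share over all rows.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```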

Variable types

Categorical: 3
Text: 4
DateTime: 1

Dataset

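Description: Customs clearance problem cases for agri-food products collected by KATI (https://www.kati.net/index.do), including importing country, exporting country, quarantine station name, rejection category and reason, and year and month of occurrence.
URL: https://www.data.go.kr/data/15071796/fileData.do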

Alerts

Dataset has 1142 (11.4%) duplicate rows [Duplicates]
수입국 is highly overall correlated with 조치사항 [High correlation]
조치사항 is highly overall correlated with 수입국 [High correlation]
구분 is highly imbalanced (82.8%) [Imbalance]
검역소 has 4416 (44.2%) missing values [Missing]
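
The report does not say how the 82.8% imbalance figure for 구분 is derived. One definition that reproduces it from the two value counts (9744 and 256) is one minus the Shannon entropy normalized by its maximum. The sketch below assumes that definition; the function name is hypothetical:

```python
import math

# Value counts for 구분, copied from the report.
counts = {"경쟁국산": 9744, "한국산": 256}

def imbalance_score(value_counts):
    """Return 1 - H / log2(k): H is the Shannon entropy in bits, k the class count.

    Assumed definition: 0 means perfectly balanced, values near 1 mean one class dominates.
    """
    total = sum(value_counts.values())
    k = len(value_counts)
    if k < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in value_counts.values() if c > 0
    )
    return 1 - entropy / math.log2(k)

score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```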

Reproduction

Analysis started: 2023-12-12 23:43:29.179033
Analysis finished: 2023-12-12 23:43:30.331749
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (classification)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

경쟁국산: 9744
한국산: 256

Length

Max length: 4
Median length: 4
Mean length: 3.9744
Min length: 3
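
The mean length is the count-weighted average of the value lengths. A short sketch, assuming length is measured in Unicode code points (what Python's `len` returns for these precomposed Hangul strings):

```python
# Value counts for 구분, copied from the report.
counts = {"경쟁국산": 9744, "한국산": 256}

# Mean length = total characters across all rows / number of rows.
total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(mean_length)
```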

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경쟁국산
2nd row: 한국산
3rd row: 경쟁국산
4th row: 경쟁국산
5th row: 경쟁국산

Common Values

Value | Count | Frequency (%)
경쟁국산 | 9744 | 97.4%
한국산 | 256 | 2.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 구분]

수입국 (importing country)
Categorical

HIGH CORRELATION

Distinct: 44
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

미국: 4363
중국: 1368
일본: 644
필리핀: 590
대만: 317
Other values (39): 2718

Length

Max length: 8
Median length: 2
Mean length: 2.4027
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 프랑스
2nd row: 미국
3rd row: 아일랜드
4th row: 미국
5th row: 미국

Common Values

Value | Count | Frequency (%)
미국 | 4363 | 43.6%
중국 | 1368 | 13.7%
일본 | 644 | 6.4%
필리핀 | 590 | 5.9%
대만 | 317 | 3.2%
캐나다 | 295 | 2.9%
네덜란드 | 289 | 2.9%
독일 | 283 | 2.8%
스페인 | 170 | 1.7%
벨기에 | 162 | 1.6%
Other values (34) | 1519 | 15.2%

Length

[Figure: Histogram of lengths of the category]

원산지 (country of origin)
Text

Distinct: 149
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 3.007
Min length: 2

Characters and Unicode

Total characters: 30070
Distinct characters: 158
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
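
As a rough illustration of those character properties, the sketch below tallies the Unicode general category of each character with Python's standard `unicodedata` module. The first two strings appear in the report; the third is a made-up value added only to show a space and a dash. The standard library has no script lookup, so the character-name prefix is used as a stand-in; that is an assumption of this sketch, not the report's method.

```python
import unicodedata
from collections import Counter

# "대한민국" and "멕시코" appear in the report; "예시 값-가" is a hypothetical value.
values = ["대한민국", "멕시코", "예시 값-가"]

# General category per character:
# "Lo" = Other Letter, "Zs" = Space Separator, "Pd" = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for value in values for ch in value)

# Stand-in for script detection: Hangul syllables have names starting with "HANGUL".
hangul = sum(
    unicodedata.name(ch, "").startswith("HANGUL") for value in values for ch in value
)

print(dict(categories))
print("Hangul characters:", hangul)
```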

Unique

Unique: 18
Unique (%): 0.2%

Sample

1st row: 스페인
2nd row: 대한민국
3rd row: 아일랜드
4th row: 중국
5th row: 멕시코

Value | Count | Frequency (%)
멕시코 | 1162 | 11.6%
미국 | 877 | 8.7%
인도 | 650 | 6.5%
중국 | 605 | 6.0%
일본 | 487 | 4.8%
필리핀 | 466 | 4.6%
베트남 | 436 | 4.3%
튀르키예 | 397 | 3.9%
캐나다 | 394 | 3.9%
대한민국 | 256 | 2.5%
Other values (143) | 4322 | 43.0%

Most occurring characters

ValueCountFrequency (%)
2095
 
7.0%
1650
 
5.5%
1276
 
4.2%
1162
 
3.9%
1052
 
3.5%
1021
 
3.4%
1015
 
3.4%
945
 
3.1%
898
 
3.0%
866
 
2.9%
Other values (148) 18090
60.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30015
99.8%
Space Separator 52
 
0.2%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2095
 
7.0%
1650
 
5.5%
1276
 
4.3%
1162
 
3.9%
1052
 
3.5%
1021
 
3.4%
1015
 
3.4%
945
 
3.1%
898
 
3.0%
866
 
2.9%
Other values (146) 18035
60.1%
Space Separator
ValueCountFrequency (%)
52
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30015
99.8%
Common 55
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2095
 
7.0%
1650
 
5.5%
1276
 
4.3%
1162
 
3.9%
1052
 
3.5%
1021
 
3.4%
1015
 
3.4%
945
 
3.1%
898
 
3.0%
866
 
2.9%
Other values (146) 18035
60.1%
Common
ValueCountFrequency (%)
52
94.5%
- 3
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30015
99.8%
ASCII 55
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2095
 
7.0%
1650
 
5.5%
1276
 
4.3%
1162
 
3.9%
1052
 
3.5%
1021
 
3.4%
1015
 
3.4%
945
 
3.1%
898
 
3.0%
866
 
2.9%
Other values (146) 18035
60.1%
ASCII
ValueCountFrequency (%)
52
94.5%
- 3
 
5.5%

발생일자 (date of occurrence)
DateTime

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2023-06-01 00:00:00

[Figure: Histogram with fixed size bins (bins=18)]
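
The 18 distinct values and the 18 bins are consistent with one value per calendar month between the minimum and the maximum. A small check, assuming the dates are month-granular (the sample rows show values such as 2022-07):

```python
from datetime import date

# Minimum and maximum of the date column, copied from the report.
first = date(2022, 1, 1)
last = date(2023, 6, 1)

# Number of calendar months from first to last, inclusive.
months = (last.year - first.year) * 12 + (last.month - first.month) + 1

print(months)
```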

품목 (item)
Text

Distinct: 855
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 88
Median length: 69
Mean length: 18.1872
Min length: 1

Characters and Unicode

Total characters: 181872
Distinct characters: 571
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 267
Unique (%): 2.7%

Sample

1st row: 조제품 기타
2nd row: 팽이버섯(신선/냉장)
3rd row: 닭고기(설육/간장제외/냉동)
4th row: 기타버섯(신선/냉장)
5th row: 콘 칩
ValueCountFrequency (%)
기타 2020
 
5.7%
1808
 
5.1%
이외 1018
 
2.9%
않은 716
 
2.0%
또는 644
 
1.8%
제외 592
 
1.7%
조제품 565
 
1.6%
과실 536
 
1.5%
486
 
1.4%
448
 
1.3%
Other values (1460) 26517
75.0%

Most occurring characters

ValueCountFrequency (%)
25350
 
13.9%
( 8103
 
4.5%
) 8102
 
4.5%
6364
 
3.5%
5986
 
3.3%
, 5069
 
2.8%
4063
 
2.2%
3738
 
2.1%
/ 3246
 
1.8%
3189
 
1.8%
Other values (561) 108662
59.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 126308
69.4%
Space Separator 25350
 
13.9%
Other Punctuation 8510
 
4.7%
Open Punctuation 8111
 
4.5%
Close Punctuation 8110
 
4.5%
Lowercase Letter 3547
 
2.0%
Decimal Number 1809
 
1.0%
Uppercase Letter 126
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6364
 
5.0%
5986
 
4.7%
4063
 
3.2%
3738
 
3.0%
3189
 
2.5%
2916
 
2.3%
2816
 
2.2%
2357
 
1.9%
2172
 
1.7%
2065
 
1.6%
Other values (509) 90642
71.8%
Lowercase Letter
ValueCountFrequency (%)
e 427
12.0%
a 424
12.0%
l 338
9.5%
n 321
9.0%
o 217
 
6.1%
r 212
 
6.0%
u 209
 
5.9%
t 205
 
5.8%
d 203
 
5.7%
m 196
 
5.5%
Other values (14) 795
22.4%
Decimal Number
ValueCountFrequency (%)
0 821
45.4%
2 394
21.8%
9 258
 
14.3%
1 182
 
10.1%
6 73
 
4.0%
3 32
 
1.8%
5 28
 
1.5%
8 12
 
0.7%
7 7
 
0.4%
4 2
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
B 52
41.3%
L 30
23.8%
P 20
 
15.9%
C 11
 
8.7%
T 10
 
7.9%
D 2
 
1.6%
O 1
 
0.8%
Other Punctuation
ValueCountFrequency (%)
, 5069
59.6%
/ 3246
38.1%
· 126
 
1.5%
. 58
 
0.7%
% 11
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 8103
99.9%
[ 8
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 8102
99.9%
] 8
 
0.1%
Space Separator
ValueCountFrequency (%)
25350
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 126297
69.4%
Common 51891
28.5%
Latin 3673
 
2.0%
Han 11
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6364
 
5.0%
5986
 
4.7%
4063
 
3.2%
3738
 
3.0%
3189
 
2.5%
2916
 
2.3%
2816
 
2.2%
2357
 
1.9%
2172
 
1.7%
2065
 
1.6%
Other values (503) 90631
71.8%
Latin
ValueCountFrequency (%)
e 427
11.6%
a 424
11.5%
l 338
 
9.2%
n 321
 
8.7%
o 217
 
5.9%
r 212
 
5.8%
u 209
 
5.7%
t 205
 
5.6%
d 203
 
5.5%
m 196
 
5.3%
Other values (21) 921
25.1%
Common
ValueCountFrequency (%)
25350
48.9%
( 8103
 
15.6%
) 8102
 
15.6%
, 5069
 
9.8%
/ 3246
 
6.3%
0 821
 
1.6%
2 394
 
0.8%
9 258
 
0.5%
1 182
 
0.4%
· 126
 
0.2%
Other values (11) 240
 
0.5%
Han
ValueCountFrequency (%)
4
36.4%
2
18.2%
2
18.2%
1
 
9.1%
1
 
9.1%
1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 124859
68.7%
ASCII 55438
30.5%
Compat Jamo 1438
 
0.8%
None 126
 
0.1%
CJK 8
 
< 0.1%
CJK Compat Ideographs 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
25350
45.7%
( 8103
 
14.6%
) 8102
 
14.6%
, 5069
 
9.1%
/ 3246
 
5.9%
0 821
 
1.5%
e 427
 
0.8%
a 424
 
0.8%
2 394
 
0.7%
l 338
 
0.6%
Other values (41) 3164
 
5.7%
Hangul
ValueCountFrequency (%)
6364
 
5.1%
5986
 
4.8%
4063
 
3.3%
3738
 
3.0%
3189
 
2.6%
2916
 
2.3%
2816
 
2.3%
2357
 
1.9%
2172
 
1.7%
2065
 
1.7%
Other values (502) 89193
71.4%
Compat Jamo
ValueCountFrequency (%)
1438
100.0%
None
ValueCountFrequency (%)
· 126
100.0%
CJK
ValueCountFrequency (%)
4
50.0%
2
25.0%
1
 
12.5%
1
 
12.5%
CJK Compat Ideographs
ValueCountFrequency (%)
2
66.7%
1
33.3%

검역소 (quarantine station)
Text

MISSING

Distinct: 98
Distinct (%): 1.8%
Missing: 4416
Missing (%): 44.2%
Memory size: 156.2 KiB

Length

Max length: 49
Median length: 8
Mean length: 6.2682665
Min length: 1
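
For this column the reported distinct percentage and mean length only come out when the denominator is the number of non-missing rows (10000 − 4416 = 5584) rather than all rows. A sketch of that reading, which is an inference from the figures and not something the report states:

```python
# Figures for 검역소, copied from the report.
n_rows = 10_000
n_missing = 4_416
n_distinct = 98
total_chars = 35_002

n_present = n_rows - n_missing  # rows with a non-missing value

missing_pct = 100 * n_missing / n_rows       # share of all rows
distinct_pct = 100 * n_distinct / n_present  # assumption: share of non-missing rows
mean_length = total_chars / n_present        # assumption: averaged over non-missing rows

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct, mean length {mean_length:.7f}")
```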

Characters and Unicode

Total characters: 35002
Distinct characters: 157
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 0.6%

Sample

1st row: WCID 검역소
2nd row: SWID 검역소
3rd row: 광저우
4th row: 상해
5th row: WCID 검역소

Value | Count | Frequency (%)
검역소 | 3759 | 39.8%
swid | 1108 | 11.7%
seid | 833 | 8.8%
wcid | 793 | 8.4%
neid | 611 | 6.5%
nbid | 414 | 4.4%
상해 | 163 | 1.7%
심천 | 160 | 1.7%
광저우 | 153 | 1.6%
천진 | 137 | 1.5%
Other values (97) | 1303 | 13.8%
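
The entries in this table are words rather than whole cell values: 검역소 is counted once for every "WCID 검역소", "SWID 검역소" and so on, and the codes appear in lower case. A minimal sketch of such a word count on the five sample values, assuming lowercasing followed by whitespace splitting:

```python
from collections import Counter

# The five sample values of 검역소 listed in the report.
sample = ["WCID 검역소", "SWID 검역소", "광저우", "상해", "WCID 검역소"]

# Assumed tokenization: lowercase each value, then split on whitespace.
words = Counter(token for value in sample for token in value.lower().split())

for word, count in words.most_common():
    print(word, count)
```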

Most occurring characters

ValueCountFrequency (%)
3852
11.0%
3761
10.7%
I 3759
10.7%
D 3759
10.7%
3759
10.7%
3759
10.7%
S 1941
 
5.5%
W 1901
 
5.4%
E 1444
 
4.1%
N 1025
 
2.9%
Other values (147) 6042
17.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 15958
45.6%
Uppercase Letter 15046
43.0%
Space Separator 3852
 
11.0%
Lowercase Letter 116
 
0.3%
Decimal Number 28
 
0.1%
Other Punctuation 1
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3761
23.6%
3759
23.6%
3759
23.6%
297
 
1.9%
256
 
1.6%
243
 
1.5%
201
 
1.3%
179
 
1.1%
174
 
1.1%
164
 
1.0%
Other values (115) 3165
19.8%
Lowercase Letter
ValueCountFrequency (%)
e 19
16.4%
o 14
12.1%
r 12
10.3%
n 10
 
8.6%
z 8
 
6.9%
s 8
 
6.9%
l 6
 
5.2%
h 5
 
4.3%
g 5
 
4.3%
d 4
 
3.4%
Other values (8) 25
21.6%
Uppercase Letter
ValueCountFrequency (%)
I 3759
25.0%
D 3759
25.0%
S 1941
12.9%
W 1901
12.6%
E 1444
 
9.6%
N 1025
 
6.8%
C 795
 
5.3%
B 414
 
2.8%
R 6
 
< 0.1%
O 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3852
100.0%
Decimal Number
ValueCountFrequency (%)
2 28
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 15958
45.6%
Latin 15162
43.3%
Common 3882
 
11.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3761
23.6%
3759
23.6%
3759
23.6%
297
 
1.9%
256
 
1.6%
243
 
1.5%
201
 
1.3%
179
 
1.1%
174
 
1.1%
164
 
1.0%
Other values (115) 3165
19.8%
Latin
ValueCountFrequency (%)
I 3759
24.8%
D 3759
24.8%
S 1941
12.8%
W 1901
12.5%
E 1444
 
9.5%
N 1025
 
6.8%
C 795
 
5.2%
B 414
 
2.7%
e 19
 
0.1%
o 14
 
0.1%
Other values (18) 91
 
0.6%
Common
ValueCountFrequency (%)
3852
99.2%
2 28
 
0.7%
/ 1
 
< 0.1%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19044
54.4%
Hangul 15958
45.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3852
20.2%
I 3759
19.7%
D 3759
19.7%
S 1941
10.2%
W 1901
10.0%
E 1444
 
7.6%
N 1025
 
5.4%
C 795
 
4.2%
B 414
 
2.2%
2 28
 
0.1%
Other values (22) 126
 
0.7%
Hangul
ValueCountFrequency (%)
3761
23.6%
3759
23.6%
3759
23.6%
297
 
1.9%
256
 
1.6%
243
 
1.5%
201
 
1.3%
179
 
1.1%
174
 
1.1%
164
 
1.0%
Other values (115) 3165
19.8%

문제사유 (reason for the problem)
Text

Distinct: 2118
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 529
Median length: 282
Mean length: 45.5874
Min length: 5

Characters and Unicode

Total characters: 455874
Distinct characters: 671
Distinct categories: 15
Distinct scripts: 5
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1613
Unique (%): 16.1%

Sample

1st row: 위생(미생물)/식중독균 살모넬라균 검출
2nd row: 위생(미생물)/리스테리아균 검출
3rd row: 위생(미생물) / 살모넬라 티피뮤리움 검출
4th row: 성분(잔류농약)/살충제 화학물질 검출
5th row: 라벨링/인공 색소 성분 미표기
ValueCountFrequency (%)
검출 3975
 
4.2%
3378
 
3.6%
1797
 
1.9%
성분(식품첨가물 1625
 
1.7%
않은 1530
 
1.6%
미표기 947
 
1.0%
의거 937
 
1.0%
건강 895
 
0.9%
판매 894
 
0.9%
식품 804
 
0.8%
Other values (3578) 77838
82.3%

Most occurring characters

ValueCountFrequency (%)
86059
 
18.9%
/ 11371
 
2.5%
, 8190
 
1.8%
7483
 
1.6%
7190
 
1.6%
) 6730
 
1.5%
( 6719
 
1.5%
6246
 
1.4%
6219
 
1.4%
5978
 
1.3%
Other values (661) 303689
66.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 289321
63.5%
Space Separator 86073
 
18.9%
Other Punctuation 23657
 
5.2%
Lowercase Letter 21826
 
4.8%
Decimal Number 13446
 
2.9%
Close Punctuation 6859
 
1.5%
Open Punctuation 6847
 
1.5%
Uppercase Letter 4098
 
0.9%
Math Symbol 1637
 
0.4%
Dash Punctuation 1336
 
0.3%
Other values (5) 774
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7483
 
2.6%
7190
 
2.5%
6246
 
2.2%
6219
 
2.1%
5978
 
2.1%
5964
 
2.1%
5868
 
2.0%
5356
 
1.9%
5168
 
1.8%
4957
 
1.7%
Other values (556) 228892
79.1%
Lowercase Letter
ValueCountFrequency (%)
o 2576
11.8%
g 2029
9.3%
i 2017
9.2%
t 1946
8.9%
n 1906
8.7%
d 1813
8.3%
a 1428
 
6.5%
m 1369
 
6.3%
r 1325
 
6.1%
p 1198
 
5.5%
Other values (17) 4219
19.3%
Uppercase Letter
ValueCountFrequency (%)
A 1263
30.8%
F 623
15.2%
D 622
15.2%
B 585
14.3%
G 377
 
9.2%
S 122
 
3.0%
C 93
 
2.3%
E 88
 
2.1%
T 41
 
1.0%
P 36
 
0.9%
Other values (15) 248
 
6.1%
Other Punctuation
ValueCountFrequency (%)
/ 11371
48.1%
, 8190
34.6%
. 3481
 
14.7%
: 246
 
1.0%
161
 
0.7%
' 106
 
0.4%
? 57
 
0.2%
· 16
 
0.1%
" 12
 
0.1%
% 10
 
< 0.1%
Other values (3) 7
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 3332
24.8%
0 2883
21.4%
2 2289
17.0%
3 951
 
7.1%
5 945
 
7.0%
8 792
 
5.9%
7 792
 
5.9%
4 645
 
4.8%
6 524
 
3.9%
9 293
 
2.2%
Math Symbol
ValueCountFrequency (%)
> 671
41.0%
< 671
41.0%
241
 
14.7%
× 36
 
2.2%
= 9
 
0.5%
± 5
 
0.3%
+ 2
 
0.1%
~ 2
 
0.1%
Other Number
ValueCountFrequency (%)
18
42.9%
18
42.9%
3
 
7.1%
³ 2
 
4.8%
1
 
2.4%
Close Punctuation
ValueCountFrequency (%)
) 6730
98.1%
121
 
1.8%
4
 
0.1%
] 4
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 6719
98.1%
120
 
1.8%
[ 4
 
0.1%
4
 
0.1%
Other Symbol
ValueCountFrequency (%)
482
66.5%
241
33.2%
2
 
0.3%
Space Separator
ValueCountFrequency (%)
86059
> 99.9%
  14
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1336
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 289307
63.5%
Common 140629
30.8%
Latin 25554
 
5.6%
Greek 372
 
0.1%
Han 12
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7483
 
2.6%
7190
 
2.5%
6246
 
2.2%
6219
 
2.1%
5978
 
2.1%
5964
 
2.1%
5868
 
2.0%
5356
 
1.9%
5168
 
1.8%
4957
 
1.7%
Other values (543) 228878
79.1%
Common
ValueCountFrequency (%)
86059
61.2%
/ 11371
 
8.1%
, 8190
 
5.8%
) 6730
 
4.8%
( 6719
 
4.8%
. 3481
 
2.5%
1 3332
 
2.4%
0 2883
 
2.1%
2 2289
 
1.6%
- 1336
 
1.0%
Other values (43) 8239
 
5.9%
Latin
ValueCountFrequency (%)
o 2576
 
10.1%
g 2029
 
7.9%
i 2017
 
7.9%
t 1946
 
7.6%
n 1906
 
7.5%
d 1813
 
7.1%
a 1428
 
5.6%
m 1369
 
5.4%
r 1325
 
5.2%
A 1263
 
4.9%
Other values (42) 7882
30.8%
Han
ValueCountFrequency (%)
1
8.3%
1
8.3%
貿 1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
Other values (2) 2
16.7%
Greek
ValueCountFrequency (%)
μ 372
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 289307
63.5%
ASCII 164685
36.1%
None 859
 
0.2%
CJK Compat 723
 
0.2%
Math Operators 241
 
0.1%
Enclosed Alphanum 39
 
< 0.1%
CJK 12
 
< 0.1%
Punctuation 6
 
< 0.1%
Letterlike Symbols 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
86059
52.3%
/ 11371
 
6.9%
, 8190
 
5.0%
) 6730
 
4.1%
( 6719
 
4.1%
. 3481
 
2.1%
1 3332
 
2.0%
0 2883
 
1.8%
o 2576
 
1.6%
2 2289
 
1.4%
Other values (73) 31055
 
18.9%
Hangul
ValueCountFrequency (%)
7483
 
2.6%
7190
 
2.5%
6246
 
2.2%
6219
 
2.1%
5978
 
2.1%
5964
 
2.1%
5868
 
2.0%
5356
 
1.9%
5168
 
1.8%
4957
 
1.7%
Other values (543) 228878
79.1%
CJK Compat
ValueCountFrequency (%)
482
66.7%
241
33.3%
None
ValueCountFrequency (%)
μ 372
43.3%
161
18.7%
121
 
14.1%
120
 
14.0%
× 36
 
4.2%
· 16
 
1.9%
  14
 
1.6%
± 5
 
0.6%
4
 
0.5%
4
 
0.5%
Other values (4) 6
 
0.7%
Math Operators
ValueCountFrequency (%)
241
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
18
46.2%
18
46.2%
3
 
7.7%
Punctuation
ValueCountFrequency (%)
3
50.0%
3
50.0%
Letterlike Symbols
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
8.3%
1
8.3%
貿 1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
Other values (2) 2
16.7%

조치사항 (action taken)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

통관거부: 4004
리콜: 2380
폐기 또는 반송: 2069
기타: 949
압류: 291
Other values (3): 307

Length

Max length: 8
Median length: 4
Mean length: 4.0422
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 리콜
2nd row: 리콜
3rd row: 기타
4th row: 통관거부
5th row: 통관거부

Common Values

Value | Count | Frequency (%)
통관거부 | 4004 | 40.0%
리콜 | 2380 | 23.8%
폐기 또는 반송 | 2069 | 20.7%
기타 | 949 | 9.5%
압류 | 291 | 2.9%
반송 | 200 | 2.0%
폐기 | 104 | 1.0%
소각 | 3 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
통관거부 | 4004 | 28.3%
리콜 | 2380 | 16.8%
반송 | 2269 | 16.0%
폐기 | 2173 | 15.4%
또는 | 2069 | 14.6%
기타 | 949 | 6.7%
압류 | 291 | 2.1%
소각 | 3 | < 0.1%
< 0.1%

Correlations

Variable | 구분 | 수입국 | 발생일자 | 검역소 | 조치사항
구분 | 1.000 | 0.205 | 0.470 | 0.326 | 0.096
수입국 | 0.205 | 1.000 | 0.321 | 1.000 | 0.889
발생일자 | 0.470 | 0.321 | 1.000 | 0.525 | 0.226
검역소 | 0.326 | 1.000 | 0.525 | 1.000 | 0.990
조치사항 | 0.096 | 0.889 | 0.226 | 0.990 | 1.000

Variable | 구분 | 조치사항 | 수입국
구분 | 1.000 | 0.072 | 0.163
조치사항 | 0.072 | 1.000 | 0.590
수입국 | 0.163 | 0.590 | 1.000

Variable | 구분 | 수입국 | 조치사항
구분 | 1.000 | 0.163 | 0.072
수입국 | 0.163 | 1.000 | 0.590
조치사항 | 0.072 | 0.590 | 1.000
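
The report does not state which coefficient each matrix uses. For pairs of categorical variables, one common association measure is Cramér's V. The sketch below is the plain, uncorrected form applied to hypothetical contingency tables; it is not claimed to be the report's exact computation, and the function name is illustrative.

```python
import math

def cramers_v(table):
    """Cramér's V (uncorrected) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 and n > 0 else 0.0

# Hypothetical counts: rows are values of one variable, columns of another.
perfectly_associated = [[50, 0], [0, 50]]
independent = [[25, 25], [25, 25]]

print(cramers_v(perfectly_associated))
print(cramers_v(independent))
```

Values near 1 indicate that one variable almost determines the other, values near 0 that they are close to independent.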

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
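
The per-column nullity that these plots visualize can be tallied by hand. The sketch below uses the 수입국 and 검역소 values of the first ten rows listed under Sample, with `<NA>` written as `None`; the other columns are left out for brevity.

```python
# 수입국 and 검역소 of the first ten rows under Sample; <NA> is written as None.
rows = [
    {"수입국": "프랑스", "검역소": None},
    {"수입국": "미국", "검역소": None},
    {"수입국": "아일랜드", "검역소": None},
    {"수입국": "미국", "검역소": "WCID 검역소"},
    {"수입국": "미국", "검역소": "SWID 검역소"},
    {"수입국": "중국", "검역소": "광저우"},
    {"수입국": "미국", "검역소": None},
    {"수입국": "중국", "검역소": "상해"},
    {"수입국": "오스트리아", "검역소": None},
    {"수입국": "리투아니아", "검역소": None},
]

# Count missing entries per column.
null_counts = {
    column: sum(row[column] is None for row in rows) for column in rows[0]
}

for column, n in null_counts.items():
    print(f"{column}: {n} of {len(rows)} missing")
```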

Sample

Index | 구분 | 수입국 | 원산지 | 발생일자 | 품목 | 검역소 | 문제사유 | 조치사항
11839 | 경쟁국산 | 프랑스 | 스페인 | 2022-07 | 조제품 기타 | <NA> | 위생(미생물)/식중독균 살모넬라균 검출 | 리콜
6200 | 한국산 | 미국 | 대한민국 | 2022-11 | 팽이버섯(신선/냉장) | <NA> | 위생(미생물)/리스테리아균 검출 | 리콜
3278 | 경쟁국산 | 아일랜드 | 아일랜드 | 2023-01 | 닭고기(설육/간장제외/냉동) | <NA> | 위생(미생물) / 살모넬라 티피뮤리움 검출 | 기타
19984 | 경쟁국산 | 미국 | 중국 | 2022-03 | 기타버섯(신선/냉장) | WCID 검역소 | 성분(잔류농약)/살충제 화학물질 검출 | 통관거부
12823 | 경쟁국산 | 미국 | 멕시코 | 2022-07 | 콘 칩 | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부
8416 | 경쟁국산 | 중국 | 태국 | 2022-09 | 코코넛(신선, 건조) | 광저우 | 위생(미생물)/곰팡이 기준치 초과 | 폐기 또는 반송
18584 | 경쟁국산 | 미국 | 미국 | 2022-04 | 초콜릿과 초콜릿과자(다른 것으로 속을 채운 것/블록 모양ㆍ슬래브 모양ㆍ막대 모양의 것) | <NA> | 위생(미생물)/살모넬라균 검출 가능성 | 리콜
9973 | 경쟁국산 | 중국 | 이탈리아 | 2022-08 | 돼지고기(식용설육/족/냉동) | 상해 | 기타/검역검증 승인을 얻지 못함, <식품안전법>제92조의거, 수입 식품 해당국가 검역당국에 검역허가증명서를 받아야 함 | 폐기 또는 반송
8297 | 경쟁국산 | 오스트리아 | 프랑스 | 2022-09 | 가금류의 고기(육 또는 식용설육분.조분) | <NA> | 위생(미생물)/식중독균 캄필로박터균 검출 | 기타
13764 | 경쟁국산 | 리투아니아 | 미국 | 2022-06 | 기타어류(냉동) | <NA> | 성분(식품첨가물 및 유해물질)/수은 검출 | 압류

Index | 구분 | 수입국 | 원산지 | 발생일자 | 품목 | 검역소 | 문제사유 | 조치사항
13324 | 경쟁국산 | 중국 | 베트남 | 2022-06 | 기타(건조한 어류/염장했는지에 상관없으며 훈제한 것은 제외) | 난닝 | 성분(식품첨가물 및 유해물질)/식품첨가물 인산 및 인산염 사용량 초과 | 폐기 또는 반송
20024 | 경쟁국산 | 미국 | 말레이시아 | 2022-03 | 식물성 산물(식용) | NBID 검역소 | 성분(식품첨가물 및 유해물질)/승인 받지 못한 신약 포함 | 통관거부
9785 | 경쟁국산 | 호주 | 중국 | 2022-08 | 낙화생(탈각한 것)(기타) | <NA> | 위생(미생물)/기준치 이상의 아플라톡신 검출 | 통관거부
3727 | 경쟁국산 | 미국 | 러시아 | 2023-01 | 식빵(bread) | NBID 검역소 | 라벨링/비영양성 감미료인 사카린이 함유되어있지만, 라벨의 첨가제목록에 포함되어있지 않음 | 통관거부
22369 | 경쟁국산 | 대만 | 중국 | 2022-01 | 대추(건조) | <NA> | 성분(잔류농약)/ 잔류농약 프로파자이트 0.07 ppm 검출 | 폐기 또는 반송
14047 | 경쟁국산 | 미국 | 미국 | 2022-06 | 치즈(기타) | <NA> | 위생(미생물)/리스테리아균 검출 가능성 | 리콜
19920 | 경쟁국산 | 미국 | 태국 | 2022-03 | 배추속채소(양배추ㆍ꽃양배추ㆍ구경양배추ㆍ케일 외 기타/신선ㆍ냉장한 것) | WCID 검역소 | 위생(미생물)/독성물질 살모넬라균 검출 | 통관거부
2645 | 경쟁국산 | 미국 | 미국 | 2023-02 | 초콜릿과 초콜릿과자(기타) | <NA> | 라벨링/라벨에 표기되지 않은 성분(콩) 검출 | 리콜
6468 | 경쟁국산 | 미국 | 콜롬비아 | 2022-11 | 개사료 | NBID 검역소 | 위생(미생물)/독성물질 살모넬라균 검출 | 통관거부
14509 | 경쟁국산 | 미국 | 캐나다 | 2022-06 | 올리브(냉동하지 않은 것/조제 및 보존처리/식초나 초산으로 처리한 것 제외) | NBID 검역소 | 서류미비/규정에 따른 조건 하에 제조되고 있다는 것을 입증하는 서류 미제출 | 통관거부

Duplicate rows

Most frequently occurring

Index | 구분 | 수입국 | 원산지 | 발생일자 | 품목 | 검역소 | 문제사유 | 조치사항 | # duplicates
152 | 경쟁국산 | 미국 | 멕시코 | 2022-04 | 콘 칩 | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 114
203 | 경쟁국산 | 미국 | 멕시코 | 2022-08 | 콘 칩 | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 63
291 | 경쟁국산 | 미국 | 미국 | 2022-06 | 치즈(기타) | <NA> | 위생(미생물)/리스테리아균 검출 가능성 | 리콜 | 48
148 | 경쟁국산 | 미국 | 멕시코 | 2022-04 | 치즈(가공/갈았거나 분상의 것 제외) | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 45
323 | 경쟁국산 | 미국 | 미국 | 2023-02 | 베이커리 제품(빵, 건빵, 파이와 케이크, 비스킷, 쿠키와 크래커, 쌀과자 외 기타) | <NA> | 위생(미생물)/리스테리아균 검출 | 리콜 | 43
193 | 경쟁국산 | 미국 | 멕시코 | 2022-07 | 콘 칩 | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 34
153 | 경쟁국산 | 미국 | 멕시코 | 2022-04 | 콘 칩 | SWID 검역소 | 라벨링/필수 라벨 정보가 영어로 표기되지 않음 | 통관거부 | 31
267 | 경쟁국산 | 미국 | 미국 | 2022-01 | 과실 샐러드 | <NA> | 위생(미생물)/리스테리아 균 검출 가능성 | 리콜 | 31
140 | 경쟁국산 | 미국 | 멕시코 | 2022-04 | 곡물제조식료품(기타) | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 28
162 | 경쟁국산 | 미국 | 멕시코 | 2022-05 | 콘 칩 | SWID 검역소 | 라벨링/인공 색소 성분 미표기 | 통관거부 | 28
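
A table like this can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below assumes that "# duplicates" is the size of each group; the rows are hypothetical tuples of (수입국, 원산지, 발생일자, 품목) and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical rows, each a tuple of (수입국, 원산지, 발생일자, 품목).
rows = [
    ("미국", "멕시코", "2022-04", "콘 칩"),
    ("미국", "멕시코", "2022-04", "콘 칩"),
    ("미국", "멕시코", "2022-04", "콘 칩"),
    ("미국", "미국", "2022-06", "치즈(기타)"),
    ("미국", "미국", "2022-06", "치즈(기타)"),
    ("중국", "태국", "2022-09", "코코넛(신선, 건조)"),
]

# Group identical rows and keep only the groups that occur more than once.
group_sizes = Counter(rows)
duplicated = {row: n for row, n in group_sizes.items() if n > 1}

# Most frequently occurring groups first.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(n, row)
```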