Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B

Variable types

Text: 3
Categorical: 2

Dataset

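Description: Information showing which packaging-status codes in the 2013 standard codes for agricultural, livestock, and fishery products have the same meaning as the packaging-status codes in the standard codes enacted or revised in 2015.
Author: Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries (농림수산식품교육문화정보원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220210000000001768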

Alerts

UPDT_DE is highly imbalanced (93.6%) (alert type: Imbalance)
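The 93.6% figure is consistent with an imbalance score defined as one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how the score is derived, not a copy of the profiler's code; the function name `imbalance_score` is hypothetical, and the counts (9925 and 75) are taken from the UPDT_DE section of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts reported for UPDT_DE: 9925 rows of one value, 75 of the other.
score = imbalance_score([9925, 75])
print(f"{score:.1%}")  # 93.6%
```

A perfectly balanced column scores 0, and the score approaches 1 as a single value dominates.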

Reproduction

Analysis started: 2024-04-21 01:00:35.197911
Analysis finished: 2024-04-21 01:00:36.382445
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

STD_FRMLC_NEW_CODE
Text

Distinct: 57
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 30000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
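As a rough illustration of the category breakdown, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The sample values are codes that appear in this report (with the ZZ codes written in uppercase, as the character tables suggest); the helper name `category_counts` is hypothetical. Scripts and blocks are not exposed by the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Nd', 'Lu') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample_codes = ["714", "703", "117", "1ZZ", "7ZZ"]
counts = category_counts(sample_codes)
print(counts)  # 'Nd' is Decimal Number, 'Lu' is Uppercase Letter
```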

Unique

Unique: 1
Unique (%): < 0.1%
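The report distinguishes "Distinct" (the number of different values) from "Unique" (values that occur exactly once). A minimal sketch of that distinction, assuming this reading of the two terms and using a small toy list rather than the real column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Toy data for illustration only, not the full 10000-row column.
toy = ["714", "703", "117", "117", "117", "7ZZ"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)  # 4 3
```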

Sample

1st row: 714
2nd row: 703
3rd row: 117
4th row: 117
5th row: 117

Value | Count | Frequency (%)
1zz | 487 | 4.9%
108 | 475 | 4.8%
7zz | 456 | 4.6%
701 | 447 | 4.5%
703 | 445 | 4.5%
106 | 247 | 2.5%
101 | 246 | 2.5%
110 | 244 | 2.4%
114 | 241 | 2.4%
715 | 238 | 2.4%
Other values (47) | 6474 | 64.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 10380 | 34.6%
7 | 5287 | 17.6%
0 | 5198 | 17.3%
Z | 2154 | 7.2%
3 | 1609 | 5.4%
8 | 1235 | 4.1%
4 | 995 | 3.3%
2 | 977 | 3.3%
5 | 954 | 3.2%
6 | 712 | 2.4%
Other values (2) | 499 | 1.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 27824 | 92.7%
Uppercase Letter | 2176 | 7.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 10380 | 37.3%
7 | 5287 | 19.0%
0 | 5198 | 18.7%
3 | 1609 | 5.8%
8 | 1235 | 4.4%
4 | 995 | 3.6%
2 | 977 | 3.5%
5 | 954 | 3.4%
6 | 712 | 2.6%
9 | 477 | 1.7%
Uppercase Letter
Value | Count | Frequency (%)
Z | 2154 | 99.0%
A | 22 | 1.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 27824 | 92.7%
Latin | 2176 | 7.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 10380 | 37.3%
7 | 5287 | 19.0%
0 | 5198 | 18.7%
3 | 1609 | 5.8%
8 | 1235 | 4.4%
4 | 995 | 3.6%
2 | 977 | 3.5%
5 | 954 | 3.4%
6 | 712 | 2.6%
9 | 477 | 1.7%
Latin
Value | Count | Frequency (%)
Z | 2154 | 99.0%
A | 22 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 10380 | 34.6%
7 | 5287 | 17.6%
0 | 5198 | 17.3%
Z | 2154 | 7.2%
3 | 1609 | 5.4%
8 | 1235 | 4.1%
4 | 995 | 3.3%
2 | 977 | 3.3%
5 | 954 | 3.2%
6 | 712 | 2.4%
Other values (2) | 499 | 1.7%

STD_FRMLC_NEW_NM
Categorical

Distinct: 35
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

기타: 1077
상자: 752
봉지: 683
PP대: 509
그물망: 490
Other values (30): 6489

Length

Max length: 6
Median length: 5
Mean length: 2.1192
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row: D/M
3rd row:
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
기타 | 1077 | 10.8%
상자 | 752 | 7.5%
봉지 | 683 | 6.8%
PP대 | 509 | 5.1%
그물망 | 490 | 4.9%
 | 464 | 4.6%
D/M | 445 | 4.5%
 | 440 | 4.4%
 | 430 | 4.3%
 | 419 | 4.2%
Other values (25) | 4291 | 42.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
기타 | 1077 | 10.8%
상자 | 752 | 7.5%
봉지 | 683 | 6.8%
pp대 | 509 | 5.1%
그물망 | 490 | 4.9%
 | 464 | 4.6%
d/m | 445 | 4.5%
 | 440 | 4.4%
 | 430 | 4.3%
 | 419 | 4.2%
Other values (25) | 4291 | 42.9%

STD_FRMLC_CODE
Text

Distinct: 64
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 30000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 714
2nd row: 703
3rd row: 117
4th row: 117
5th row: 117

Value | Count | Frequency (%)
106 | 247 | 2.5%
1zz | 246 | 2.5%
101 | 246 | 2.5%
110 | 244 | 2.4%
109 | 244 | 2.4%
7zz | 241 | 2.4%
114 | 241 | 2.4%
100 | 241 | 2.4%
715 | 238 | 2.4%
112 | 237 | 2.4%
Other values (54) | 7575 | 75.8%

Most occurring characters

Value | Count | Frequency (%)
1 | 10374 | 34.6%
0 | 6017 | 20.1%
7 | 5287 | 17.6%
3 | 1609 | 5.4%
Z | 1114 | 3.7%
4 | 995 | 3.3%
8 | 991 | 3.3%
2 | 977 | 3.3%
5 | 954 | 3.2%
6 | 939 | 3.1%
Other values (2) | 743 | 2.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 28864 | 96.2%
Uppercase Letter | 1136 | 3.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 10374 | 35.9%
0 | 6017 | 20.8%
7 | 5287 | 18.3%
3 | 1609 | 5.6%
4 | 995 | 3.4%
8 | 991 | 3.4%
2 | 977 | 3.4%
5 | 954 | 3.3%
6 | 939 | 3.3%
9 | 721 | 2.5%
Uppercase Letter
Value | Count | Frequency (%)
Z | 1114 | 98.1%
A | 22 | 1.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 28864 | 96.2%
Latin | 1136 | 3.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 10374 | 35.9%
0 | 6017 | 20.8%
7 | 5287 | 18.3%
3 | 1609 | 5.6%
4 | 995 | 3.4%
8 | 991 | 3.4%
2 | 977 | 3.4%
5 | 954 | 3.3%
6 | 939 | 3.3%
9 | 721 | 2.5%
Latin
Value | Count | Frequency (%)
Z | 1114 | 98.1%
A | 22 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 10374 | 34.6%
0 | 6017 | 20.1%
7 | 5287 | 17.6%
3 | 1609 | 5.4%
Z | 1114 | 3.7%
4 | 995 | 3.3%
8 | 991 | 3.3%
2 | 977 | 3.3%
5 | 954 | 3.2%
6 | 939 | 3.1%
Other values (2) | 743 | 2.5%

STD_FRMLC_NM
Text

Distinct: 9556
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 22
Mean length: 8.978
Min length: 1

Characters and Unicode

Total characters: 89780
Distinct characters: 95
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9133
Unique (%): 91.3%

Sample

1st row: ton 코 3미
2nd row: g D/M 1방
3rd row: ton 축 500개이상
4th row: l 축 80내
5th row: ton 축 35내

Value | Count | Frequency (%)
g | 2088 | 7.7%
kg | 2071 | 7.6%
ton | 2046 | 7.5%
l | 969 | 3.6%
ml | 807 | 3.0%
기타 | 727 | 2.7%
상자 | 525 | 1.9%
pp대 | 509 | 1.9%
그물망 | 490 | 1.8%
 | 475 | 1.7%
Other values (181) | 16524 | 60.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 17231 | 19.2%
0 | 5499 | 6.1%
g | 4112 | 4.6%
 | 4090 | 4.6%
1 | 3957 | 4.4%
5 | 2347 | 2.6%
2 | 2312 | 2.6%
P | 2196 | 2.4%
 | 2153 | 2.4%
k | 2067 | 2.3%
Other values (85) | 43816 | 48.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 24882 | 27.7%
Decimal Number | 20356 | 22.7%
Space Separator | 17231 | 19.2%
Lowercase Letter | 16514 | 18.4%
Uppercase Letter | 5932 | 6.6%
Other Punctuation | 1557 | 1.7%
Open Punctuation | 1134 | 1.3%
Close Punctuation | 1059 | 1.2%
Math Symbol | 570 | 0.6%
Dash Punctuation | 545 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 4090 | 16.4%
 | 2153 | 8.7%
 | 1473 | 5.9%
 | 1126 | 4.5%
 | 1017 | 4.1%
 | 990 | 4.0%
 | 971 | 3.9%
 | 727 | 2.9%
 | 683 | 2.7%
 | 683 | 2.7%
Other values (45) | 10969 | 44.1%
Uppercase Letter
Value | Count | Frequency (%)
P | 2196 | 37.0%
B | 498 | 8.4%
T | 456 | 7.7%
S | 396 | 6.7%
M | 389 | 6.6%
E | 272 | 4.6%
X | 271 | 4.6%
O | 271 | 4.6%
C | 227 | 3.8%
D | 224 | 3.8%
Other values (5) | 732 | 12.3%
Decimal Number
Value | Count | Frequency (%)
0 | 5499 | 27.0%
1 | 3957 | 19.4%
5 | 2347 | 11.5%
2 | 2312 | 11.4%
3 | 1802 | 8.9%
4 | 1291 | 6.3%
8 | 972 | 4.8%
7 | 906 | 4.5%
6 | 675 | 3.3%
9 | 595 | 2.9%
Lowercase Letter
Value | Count | Frequency (%)
g | 4112 | 24.9%
k | 2067 | 12.5%
n | 2046 | 12.4%
o | 2046 | 12.4%
t | 2044 | 12.4%
m | 1906 | 11.5%
l | 1723 | 10.4%
c | 570 | 3.5%
Other Punctuation
Value | Count | Frequency (%)
/ | 901 | 57.9%
. | 656 | 42.1%
Space Separator
Value | Count | Frequency (%)
(space) | 17231 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1134 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1059 | 100.0%
Math Symbol
Value | Count | Frequency (%)
× | 570 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 545 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 42452 | 47.3%
Hangul | 24882 | 27.7%
Latin | 22446 | 25.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 4090 | 16.4%
 | 2153 | 8.7%
 | 1473 | 5.9%
 | 1126 | 4.5%
 | 1017 | 4.1%
 | 990 | 4.0%
 | 971 | 3.9%
 | 727 | 2.9%
 | 683 | 2.7%
 | 683 | 2.7%
Other values (45) | 10969 | 44.1%
Latin
Value | Count | Frequency (%)
g | 4112 | 18.3%
P | 2196 | 9.8%
k | 2067 | 9.2%
n | 2046 | 9.1%
o | 2046 | 9.1%
t | 2044 | 9.1%
m | 1906 | 8.5%
l | 1723 | 7.7%
c | 570 | 2.5%
B | 498 | 2.2%
Other values (13) | 3238 | 14.4%
Common
Value | Count | Frequency (%)
(space) | 17231 | 40.6%
0 | 5499 | 13.0%
1 | 3957 | 9.3%
5 | 2347 | 5.5%
2 | 2312 | 5.4%
3 | 1802 | 4.2%
4 | 1291 | 3.0%
( | 1134 | 2.7%
) | 1059 | 2.5%
8 | 972 | 2.3%
Other values (7) | 4848 | 11.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 64328 | 71.7%
Hangul | 24882 | 27.7%
None | 570 | 0.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 17231 | 26.8%
0 | 5499 | 8.5%
g | 4112 | 6.4%
1 | 3957 | 6.2%
5 | 2347 | 3.6%
2 | 2312 | 3.6%
P | 2196 | 3.4%
k | 2067 | 3.2%
n | 2046 | 3.2%
o | 2046 | 3.2%
Other values (29) | 20515 | 31.9%
Hangul
Value | Count | Frequency (%)
 | 4090 | 16.4%
 | 2153 | 8.7%
 | 1473 | 5.9%
 | 1126 | 4.5%
 | 1017 | 4.1%
 | 990 | 4.0%
 | 971 | 3.9%
 | 727 | 2.9%
 | 683 | 2.7%
 | 683 | 2.7%
Other values (45) | 10969 | 44.1%
None
Value | Count | Frequency (%)
× | 570 | 100.0%

UPDT_DE
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

20220127: 9925
뿌리): 75

Length

Max length: 8
Median length: 8
Mean length: 7.9625
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20220127
2nd row: 20220127
3rd row: 20220127
4th row: 20220127
5th row: 20220127

Common Values

Value | Count | Frequency (%)
20220127 | 9925 | 99.2%
뿌리) | 75 | 0.8%
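UPDT_DE mostly holds eight-digit values such as 20220127, while 75 rows hold the text 뿌리), which does not look like a date. A minimal sketch for flagging such rows follows; it assumes the column is meant to contain YYYYMMDD dates, which the report does not state, and the helper name `is_yyyymmdd` is hypothetical.

```python
from datetime import datetime

def is_yyyymmdd(value):
    """Return True if value is an 8-digit ASCII string that parses as a calendar date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

# The two distinct values reported for UPDT_DE.
values = ["20220127", "뿌리)", "20220127"]
bad = [v for v in values if not is_yyyymmdd(v)]
print(bad)  # ['뿌리)']
```

Rows flagged this way are worth inspecting in the source file, since a non-date value in a date column can indicate shifted fields.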

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20220127 | 9925 | 99.2%
뿌리 | 75 | 0.8%

Correlations

 | STD_FRMLC_NEW_CODE | STD_FRMLC_NEW_NM | STD_FRMLC_CODE | UPDT_DE
STD_FRMLC_NEW_CODE | 1.000 | 1.000 | 1.000 | 0.072
STD_FRMLC_NEW_NM | 1.000 | 1.000 | 1.000 | 0.046
STD_FRMLC_CODE | 1.000 | 1.000 | 1.000 | 0.070
UPDT_DE | 0.072 | 0.046 | 0.070 | 1.000

 | UPDT_DE | STD_FRMLC_NEW_NM
UPDT_DE | 1.000 | 0.038
STD_FRMLC_NEW_NM | 0.038 | 1.000

 | STD_FRMLC_NEW_NM | UPDT_DE
STD_FRMLC_NEW_NM | 1.000 | 0.038
UPDT_DE | 0.038 | 1.000
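The report does not say which association measure produced these matrices. For pairs of categorical columns, Cramér's V is a common choice, and the sketch below assumes it purely for illustration. It is the plain, uncorrected form written with the standard library only; `cramers_v` is a hypothetical helper and the toy columns are not the real data.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy columns: each code maps to exactly one name, so the association is perfect.
codes = ["701", "701", "703", "703", "705", "705"]
names = ["상자", "상자", "D/M", "D/M", "그물망", "그물망"]
v_perfect = cramers_v(codes, names)

# Toy columns with no association at all.
v_none = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(round(v_perfect, 3), round(v_none, 3))  # 1.0 0.0
```

Under this reading, a value of 1.000 between a code column and a name column is what a one-to-one code-to-name mapping would produce.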

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
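The per-column counts behind a nullity chart can be sketched as follows. This is an illustration only: `missing_by_column` is a hypothetical helper, `None` stands in for a missing cell, and the third row contains a made-up gap to show the counting, whereas this report lists zero missing cells.

```python
def missing_by_column(rows, columns):
    """Count None values per column; None stands in for a missing cell."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

columns = ["STD_FRMLC_NEW_CODE", "STD_FRMLC_NEW_NM", "UPDT_DE"]
rows = [
    {"STD_FRMLC_NEW_CODE": "703", "STD_FRMLC_NEW_NM": "D/M", "UPDT_DE": "20220127"},
    {"STD_FRMLC_NEW_CODE": "701", "STD_FRMLC_NEW_NM": "상자", "UPDT_DE": "20220127"},
    # Hypothetical gap, not present in the profiled dataset.
    {"STD_FRMLC_NEW_CODE": "705", "STD_FRMLC_NEW_NM": None, "UPDT_DE": "20220127"},
]
missing = missing_by_column(rows, columns)
print(missing)
```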

Sample

 | STD_FRMLC_NEW_CODE | STD_FRMLC_NEW_NM | STD_FRMLC_CODE | STD_FRMLC_NM | UPDT_DE
14870 | 714 | | 714 | ton 코 3미 | 20220127
11466 | 703 | D/M | 703 | g D/M 1방 | 20220127
6735 | 117 | | 117 | ton 축 500개이상 | 20220127
6620 | 117 | | 117 | l 축 80내 | 20220127
6732 | 117 | | 117 | ton 축 35내 | 20220127
14953 | 714 | | 714 | ton 코 80내 | 20220127
10868 | 702 | PAN(펜) | 702 | PAN(펜) L | 20220127
14262 | 711 | | 711 | g 각 170내 | 20220127
2495 | 107 | 파렛트 | 107 | g 파렛트 22개 | 20220127
11484 | 703 | D/M | 713 | 깡 400내 | 20220127

 | STD_FRMLC_NEW_CODE | STD_FRMLC_NEW_NM | STD_FRMLC_CODE | STD_FRMLC_NM | UPDT_DE
5533 | 114 | | 114 | ml 채 18개 | 20220127
1546 | 104 | PP대 | 104 | ton PP대 22개 | 20220127
749 | 102 | P-BOX | 102 | ml P-BOX 5개 | 20220127
10050 | 701 | 상자 | 706 | ton C/T(B/T) 20미 | 20220127
10367 | 701 | 상자 | 706 | C/T(B/T) 5통 | 20220127
3910 | 110 | 접시용기 | 110 | ml 접시용기 19개 | 20220127
12228 | 705 | 그물망 | 705 | g 그물망 9통 | 20220127
10703 | 702 | PAN(펜) | 702 | ton PAN(펜) 3방 | 20220127
11688 | 704 | PP대 | 704 | PP대 6미 | 20220127
5748 | 115 | | 115 | 속 40내 | 20220127