Overview

Dataset statistics

Number of variables: 13
Number of observations: 276
Missing cells: 130
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 29.8 KiB
Average record size in memory: 110.5 B
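
The missing-cell percentage above can be checked by hand from the other figures in this report. The following is a minimal sketch, not the profiler's own code; the variable names are illustrative, and the split of 65 + 65 missing values is taken from the GPC_BRIK_CODE and GPC_BRIK_NM alerts.

```python
# Figures copied from the dataset statistics and the alerts list.
n_variables = 13
n_observations = 276
missing_per_column = {"GPC_BRIK_CODE": 65, "GPC_BRIK_NM": 65}

# Total cells is rows times columns; missing cells is the sum over columns.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, round(missing_pct, 1))
```

The two columns with missing values account for all 130 missing cells, which is why the overall rate is low even though each of those columns is 23.6% empty.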

Variable types

Numeric: 4
Categorical: 5
Text: 4

Dataset

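Description: 499 standard item codes for agricultural, livestock, and fishery products that are actually traded at wholesale markets were pre-selected; for each of these 499 standard item codes, this dataset gives the equivalent international standard code.
Author: 농림수산식품교육문화정보원
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220210000000001774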

Alerts

GPC_SEGM_CODE has constant value "50000000" (Constant)
GPC_SEGM_NM has constant value "Food/Beverage/Tobacco" (Constant)
UPDT_DE has constant value "20151203" (Constant)
CATGORY_CODE is highly overall correlated with CATGORY_NM and 1 other field (High correlation)
GPC_FAMY_CODE is highly overall correlated with GPC_CLAS_CODE and 3 other fields (High correlation)
GPC_CLAS_CODE is highly overall correlated with GPC_FAMY_CODE and 3 other fields (High correlation)
GPC_BRIK_CODE is highly overall correlated with GPC_FAMY_CODE and 1 other field (High correlation)
CATGORY_NM is highly overall correlated with CATGORY_CODE and 3 other fields (High correlation)
GPC_FAMY_NM is highly overall correlated with CATGORY_CODE and 3 other fields (High correlation)
GPC_BRIK_CODE has 65 (23.6%) missing values (Missing)
GPC_BRIK_NM has 65 (23.6%) missing values (Missing)
STD_PRDLST_CODE has unique values (Unique)
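
The Constant, Missing, and Unique alerts follow from simple per-column counts. The sketch below shows one way such flags could be derived; `column_alerts` is a hypothetical helper written for illustration, and it does not reproduce the thresholds or logic that ydata-profiling actually applies.

```python
def column_alerts(values):
    """Return illustrative alert names for one column (None marks a missing cell)."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    alerts = []
    if len(distinct) == 1:
        alerts.append("Constant")   # a single distinct value in the column
    if len(present) < len(values):
        alerts.append("Missing")    # at least one empty cell
    if len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")     # every row carries a different value
    return alerts


# Small inputs modelled on values that appear in this report.
print(column_alerts(["50000000"] * 5))
print(column_alerts(["0101", "0104", "0201"]))
print(column_alerts([10000211, None, 10006336, 10006336]))
```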

Reproduction

Analysis started: 2023-12-11 03:47:21.722744
Analysis finished: 2023-12-11 03:47:25.122138
Duration: 3.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

CATGORY_CODE
Real number (ℝ)

HIGH CORRELATION 

Distinct31
Distinct (%)11.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean21.224638
Minimum1
Maximum91
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum1
5-th percentile4.75
Q110
median13
Q319
95-th percentile72
Maximum91
Range90
Interquartile range (IQR)9

Descriptive statistics

Standard deviation: 22.204838
Coefficient of variation (CV): 1.0461822
Kurtosis: 1.5988677
Mean: 21.224638
Median Absolute Deviation (MAD): 6
Skewness: 1.7430078
Sum: 5858
Variance: 493.05481
Monotonicity: Increasing
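
Several of the CATGORY_CODE statistics are derived from one another, so they can be cross-checked. The sketch below recomputes them from the rounded figures printed in this report; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Inputs copied from the CATGORY_CODE quantile and descriptive statistics.
minimum, maximum = 1, 91
q1, q3 = 10, 19
mean, std = 21.224638, 22.204838
total, n = 5858, 276

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
mean_from_sum = total / n         # Mean recomputed from Sum and row count

print(value_range, iqr, round(cv, 4), round(variance, 2), round(mean_from_sum, 4))
```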
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
6 47
17.0%
10 41
14.9%
13 26
 
9.4%
19 25
 
9.1%
14 16
 
5.8%
17 16
 
5.8%
12 12
 
4.3%
61 10
 
3.6%
71 10
 
3.6%
16 7
 
2.5%
Other values (21) 66
23.9%
ValueCountFrequency (%)
1 2
 
0.7%
2 1
 
0.4%
3 6
 
2.2%
4 5
 
1.8%
5 3
 
1.1%
6 47
17.0%
9 4
 
1.4%
10 41
14.9%
11 5
 
1.8%
12 12
 
4.3%
ValueCountFrequency (%)
91 3
 
1.1%
81 5
1.8%
73 3
 
1.1%
72 4
 
1.4%
71 10
3.6%
64 3
 
1.1%
63 2
 
0.7%
62 4
 
1.4%
61 10
3.6%
47 2
 
0.7%

CATGORY_NM
Categorical

HIGH CORRELATION 

Distinct31
Distinct (%)11.2%
Missing0
Missing (%)0.0%
Memory size2.3 KiB
과실류
47 
엽경채류
41 
양채류
26 
약용작물류
25 
산채류
16 
Other values (26)
121 

Length

Max length6
Median length5
Mean length3.7246377
Min length2

Unique

Unique3 ?
Unique (%)1.1%

Sample

1st row미곡류
2nd row미곡류
3rd row맥류
4th row두류
5th row두류

Common Values

ValueCountFrequency (%)
과실류 47
17.0%
엽경채류 41
14.9%
양채류 26
 
9.4%
약용작물류 25
 
9.1%
산채류 16
 
5.8%
버섯류 16
 
5.8%
조미채소류 12
 
4.3%
내수면어류 10
 
3.6%
해면어류 10
 
3.6%
특용작물류 7
 
2.5%
Other values (21) 66
23.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
과실류 47
17.0%
엽경채류 41
14.9%
양채류 26
 
9.4%
약용작물류 25
 
9.1%
산채류 16
 
5.8%
버섯류 16
 
5.8%
조미채소류 12
 
4.3%
내수면어류 10
 
3.6%
해면어류 10
 
3.6%
특용작물류 7
 
2.5%
Other values (21) 66
23.9%

STD_PRDLST_CODE
Text

UNIQUE 

Distinct276
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size2.3 KiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters1104
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
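
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few strings taken from different columns of this report; `category_counts` and the name mapping are illustrative helpers, not part of the profiler.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["0101", "0104", "찹쌀", "Beans (Winged)"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Digits fall under Decimal Number and Hangul syllables under Other Letter, which matches how the code and name columns are broken down in the tables of this report.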

Unique

Unique276 ?
Unique (%)100.0%

Sample

1st row0101
2nd row0104
3rd row0201
4th row0301
5th row0302
ValueCountFrequency (%)
0101 1
 
0.4%
1705 1
 
0.4%
1610 1
 
0.4%
1615 1
 
0.4%
1701 1
 
0.4%
1702 1
 
0.4%
1704 1
 
0.4%
1603 1
 
0.4%
1707 1
 
0.4%
1602 1
 
0.4%
Other values (266) 266
96.4%

Most occurring characters

ValueCountFrequency (%)
1 295
26.7%
0 241
21.8%
6 101
 
9.1%
2 99
 
9.0%
3 92
 
8.3%
4 75
 
6.8%
7 58
 
5.3%
9 56
 
5.1%
5 47
 
4.3%
8 33
 
3.0%
Other values (5) 7
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1097
99.4%
Uppercase Letter 7
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 295
26.9%
0 241
22.0%
6 101
 
9.2%
2 99
 
9.0%
3 92
 
8.4%
4 75
 
6.8%
7 58
 
5.3%
9 56
 
5.1%
5 47
 
4.3%
8 33
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
O 2
28.6%
B 2
28.6%
N 1
14.3%
D 1
14.3%
V 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1097
99.4%
Latin 7
 
0.6%

Most frequent character per script

Common
ValueCountFrequency (%)
1 295
26.9%
0 241
22.0%
6 101
 
9.2%
2 99
 
9.0%
3 92
 
8.4%
4 75
 
6.8%
7 58
 
5.3%
9 56
 
5.1%
5 47
 
4.3%
8 33
 
3.0%
Latin
ValueCountFrequency (%)
O 2
28.6%
B 2
28.6%
N 1
14.3%
D 1
14.3%
V 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1104
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 295
26.7%
0 241
21.8%
6 101
 
9.1%
2 99
 
9.0%
3 92
 
8.3%
4 75
 
6.8%
7 58
 
5.3%
9 56
 
5.1%
5 47
 
4.3%
8 33
 
3.0%
Other values (5) 7
 
0.6%

STD_PRDLST_NM
Text

Distinct275
Distinct (%)99.6%
Missing0
Missing (%)0.0%
Memory size2.3 KiB

Length

Max length12
Median length11
Mean length3.1485507
Min length1

Characters and Unicode

Total characters869
Distinct characters272
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique274 ?
Unique (%)99.3%

Sample

1st row
2nd row찹쌀
3rd row보리
4th row
5th row
ValueCountFrequency (%)
민들레 2
 
0.7%
팽이버섯 1
 
0.4%
호박씨 1
 
0.4%
수세미 1
 
0.4%
느타리버섯 1
 
0.4%
양송이 1
 
0.4%
표고버섯 1
 
0.4%
땅콩 1
 
0.4%
목이 1
 
0.4%
사보래(사보이양배추 1
 
0.4%
Other values (265) 265
96.0%

Most occurring characters

ValueCountFrequency (%)
42
 
4.8%
26
 
3.0%
22
 
2.5%
16
 
1.8%
16
 
1.8%
16
 
1.8%
16
 
1.8%
15
 
1.7%
15
 
1.7%
( 14
 
1.6%
Other values (262) 671
77.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 840
96.7%
Open Punctuation 14
 
1.6%
Close Punctuation 14
 
1.6%
Other Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
42
 
5.0%
26
 
3.1%
22
 
2.6%
16
 
1.9%
16
 
1.9%
16
 
1.9%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
Other values (259) 642
76.4%
Open Punctuation
ValueCountFrequency (%)
( 14
100.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Other Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 840
96.7%
Common 29
 
3.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
42
 
5.0%
26
 
3.1%
22
 
2.6%
16
 
1.9%
16
 
1.9%
16
 
1.9%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
Other values (259) 642
76.4%
Common
ValueCountFrequency (%)
( 14
48.3%
) 14
48.3%
1
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 840
96.7%
ASCII 28
 
3.2%
None 1
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
42
 
5.0%
26
 
3.1%
22
 
2.6%
16
 
1.9%
16
 
1.9%
16
 
1.9%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
Other values (259) 642
76.4%
ASCII
ValueCountFrequency (%)
( 14
50.0%
) 14
50.0%
None
ValueCountFrequency (%)
1
100.0%

GPC_SEGM_CODE
Categorical

CONSTANT 

Distinct1
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size2.3 KiB
50000000
276 

Length

Max length8
Median length8
Mean length8
Min length8

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row50000000
2nd row50000000
3rd row50000000
4th row50000000
5th row50000000

Common Values

ValueCountFrequency (%)
50000000 276
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
50000000 276
100.0%

GPC_SEGM_NM
Categorical

CONSTANT 

Distinct1
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size2.3 KiB
Food/Beverage/Tobacco
276 

Length

Max length21
Median length21
Mean length21
Min length21

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFood/Beverage/Tobacco
2nd rowFood/Beverage/Tobacco
3rd rowFood/Beverage/Tobacco
4th rowFood/Beverage/Tobacco
5th rowFood/Beverage/Tobacco

Common Values

ValueCountFrequency (%)
Food/Beverage/Tobacco 276
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
food/beverage/tobacco 276
100.0%

GPC_FAMY_CODE
Real number (ℝ)

HIGH CORRELATION 

Distinct16
Distinct (%)5.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean50844420
Minimum50100000
Maximum93030000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum50100000
5-th percentile50100000
Q150120001
median50260000
Q350260000
95-th percentile50342500
Maximum93030000
Range42930000
Interquartile range (IQR)139999.25

Descriptive statistics

Standard deviation: 5125540.5
Coefficient of variation (CV): 0.10080832
Kurtosis: 65.18544
Mean: 50844420
Median Absolute Deviation (MAD): 10000
Skewness: 8.1669724
Sum: 1.403306 × 10^10
Variance: 2.6271166 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
50260000 120
43.5%
50250000 46
 
16.7%
50120000 40
 
14.5%
50100000 29
 
10.5%
50350000 10
 
3.6%
50220000 7
 
2.5%
50340000 4
 
1.4%
93030000 4
 
1.4%
50310000 3
 
1.1%
50320000 3
 
1.1%
Other values (6) 10
 
3.6%
ValueCountFrequency (%)
50100000 29
 
10.5%
50120000 40
 
14.5%
50120001 1
 
0.4%
50130000 1
 
0.4%
50150000 1
 
0.4%
50190000 3
 
1.1%
50220000 7
 
2.5%
50250000 46
 
16.7%
50260000 120
43.5%
50290000 1
 
0.4%
ValueCountFrequency (%)
93030000 4
 
1.4%
50350000 10
 
3.6%
50340000 4
 
1.4%
50330000 3
 
1.1%
50320000 3
 
1.1%
50310000 3
 
1.1%
50290000 1
 
0.4%
50260000 120
43.5%
50250000 46
 
16.7%
50220000 7
 
2.5%

GPC_FAMY_NM
Categorical

HIGH CORRELATION 

Distinct14
Distinct (%)5.1%
Missing0
Missing (%)0.0%
Memory size2.3 KiB
Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh)
121 
Fruits ? Unprepared/Unprocessed (Fresh)
46 
Seafood
41 
Fruits/Vegetables/Nuts/Seeds Prepared/Processed
29 
Leaf Vegetables ? Unprepared/Unprocessed (Fresh)
 
10
Other values (9)
29 

Length

Max length54
Median length50
Mean length41.713768
Min length7

Unique

Unique2 ?
Unique (%)0.7%

Sample

1st rowCereal/Grain/Pulse Products
2nd rowCereal/Grain/Pulse Products
3rd rowCereal/Grain/Pulse Products
4th rowVegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh)
5th rowVegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh)

Common Values

ValueCountFrequency (%)
Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) 121
43.8%
Fruits ? Unprepared/Unprocessed (Fresh) 46
 
16.7%
Seafood 41
 
14.9%
Fruits/Vegetables/Nuts/Seeds Prepared/Processed 29
 
10.5%
Leaf Vegetables ? Unprepared/Unprocessed (Fresh) 10
 
3.6%
Cereal/Grain/Pulse Products 7
 
2.5%
Nuts/Seeds ? Unprepared/Unprocessed (Shelf Stable) 4
 
1.4%
Live Plants (Genus A thru G) 4
 
1.4%
Fruits ? Unprepared/Unprocessed (Shelf Stable) 3
 
1.1%
Vegetables ? Unprepared/Unprocessed (Shelf Stable) 3
 
1.1%
Other values (4) 8
 
2.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
190
16.3%
unprepared/unprocessed 190
16.3%
fresh 180
15.4%
vegetables 134
11.5%
leaf 131
11.2%
non 121
10.4%
fruits 49
 
4.2%
seafood 41
 
3.5%
fruits/vegetables/nuts/seeds 29
 
2.5%
prepared/processed 29
 
2.5%
Other values (16) 74
 
6.3%

GPC_CLAS_CODE
Real number (ℝ)

HIGH CORRELATION 

Distinct57
Distinct (%)20.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean50845567
Minimum50101800
Maximum93037100
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum50101800
5-th percentile50102100
Q150122250
median50260100
Q350261300
95-th percentile50342600
Maximum93037100
Range42935300
Interquartile range (IQR)139050

Descriptive statistics

Standard deviation: 5125825.3
Coefficient of variation (CV): 0.10081165
Kurtosis: 65.185825
Mean: 50845567
Median Absolute Deviation (MAD): 9050
Skewness: 8.167008
Sum: 1.4033377 × 10^10
Variance: 2.6274085 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
50102100 27
 
9.8%
50260100 24
 
8.7%
50121500 23
 
8.3%
50261300 21
 
7.6%
50261100 19
 
6.9%
50261700 16
 
5.8%
50121700 11
 
4.0%
50250600 9
 
3.3%
50251000 9
 
3.3%
50251900 8
 
2.9%
Other values (47) 109
39.5%
ValueCountFrequency (%)
50101800 2
 
0.7%
50102100 27
9.8%
50121500 23
8.3%
50121501 1
 
0.4%
50121700 11
4.0%
50121900 2
 
0.7%
50122000 1
 
0.4%
50122100 2
 
0.7%
50122300 1
 
0.4%
50132500 1
 
0.4%
ValueCountFrequency (%)
93037100 1
 
0.4%
93033300 2
0.7%
93030500 1
 
0.4%
50350500 1
 
0.4%
50350400 3
1.1%
50350200 3
1.1%
50350100 3
1.1%
50340100 4
1.4%
50330100 3
1.1%
50320100 3
1.1%

GPC_CLAS_NM
Text

Distinct56
Distinct (%)20.3%
Missing0
Missing (%)0.0%
Memory size2.3 KiB

Length

Max length54
Median length42
Mean length19.833333
Min length5

Characters and Unicode

Total characters5474
Distinct characters47
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23 ?
Unique (%)8.3%

Sample

1st rowGrains/Flour
2nd rowGrains/Flour
3rd rowGrains/Flour
4th rowBeans (With Pods)
5th rowBeans (With Pods)
ValueCountFrequency (%)
vegetables 95
15.5%
80
 
13.1%
unprepared/unprocessed 50
 
8.2%
prepared/processed 34
 
5.5%
fish 26
 
4.2%
fruit 24
 
3.9%
root/tuber 24
 
3.9%
herbs 21
 
3.4%
brassica 19
 
3.1%
fungi 16
 
2.6%
Other values (65) 224
36.5%

Most occurring characters

ValueCountFrequency (%)
e 866
15.8%
s 494
 
9.0%
r 434
 
7.9%
338
 
6.2%
a 309
 
5.6%
t 232
 
4.2%
l 213
 
3.9%
d 202
 
3.7%
p 195
 
3.6%
o 187
 
3.4%
Other values (37) 2004
36.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4207
76.9%
Uppercase Letter 668
 
12.2%
Space Separator 338
 
6.2%
Other Punctuation 215
 
3.9%
Close Punctuation 23
 
0.4%
Open Punctuation 23
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 866
20.6%
s 494
11.7%
r 434
10.3%
a 309
 
7.3%
t 232
 
5.5%
l 213
 
5.1%
d 202
 
4.8%
p 195
 
4.6%
o 187
 
4.4%
i 183
 
4.3%
Other values (12) 892
21.2%
Uppercase Letter
ValueCountFrequency (%)
U 100
15.0%
V 96
14.4%
P 96
14.4%
F 88
13.2%
S 76
11.4%
B 42
6.3%
T 24
 
3.6%
H 24
 
3.6%
R 24
 
3.6%
C 18
 
2.7%
Other values (10) 80
12.0%
Other Punctuation
ValueCountFrequency (%)
/ 137
63.7%
? 78
36.3%
Space Separator
ValueCountFrequency (%)
338
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4875
89.1%
Common 599
 
10.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 866
17.8%
s 494
 
10.1%
r 434
 
8.9%
a 309
 
6.3%
t 232
 
4.8%
l 213
 
4.4%
d 202
 
4.1%
p 195
 
4.0%
o 187
 
3.8%
i 183
 
3.8%
Other values (32) 1560
32.0%
Common
ValueCountFrequency (%)
338
56.4%
/ 137
22.9%
? 78
 
13.0%
) 23
 
3.8%
( 23
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5474
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 866
15.8%
s 494
 
9.0%
r 434
 
7.9%
338
 
6.2%
a 309
 
5.6%
t 232
 
4.2%
l 213
 
3.9%
d 202
 
3.7%
p 195
 
3.6%
o 187
 
3.4%
Other values (37) 2004
36.6%

GPC_BRIK_CODE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct123
Distinct (%)58.3%
Missing65
Missing (%)23.6%
Infinite0
Infinite (%)0.0%
Mean10194313
Minimum10000003
Maximum50261700
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.6 KiB

Quantile statistics

Minimum10000003
5-th percentile10000008
Q110000272
median10005917
Q310006114
95-th percentile10006354
Maximum50261700
Range40261697
Interquartile range (IQR)5842

Descriptive statistics

Standard deviation: 2771489.2
Coefficient of variation (CV): 0.27186621
Kurtosis: 210.99952
Mean: 10194313
Median Absolute Deviation (MAD): 424
Skewness: 14.525814
Sum: 2.1510001 × 10^9
Variance: 7.6811526 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10000272 25
 
9.1%
10000282 19
 
6.9%
10000019 11
 
4.0%
10006260 7
 
2.5%
10000211 5
 
1.8%
10000281 4
 
1.4%
10000008 4
 
1.4%
10000007 3
 
1.1%
10000003 3
 
1.1%
10000006 3
 
1.1%
Other values (113) 127
46.0%
(Missing) 65
23.6%
ValueCountFrequency (%)
10000003 3
 
1.1%
10000006 3
 
1.1%
10000007 3
 
1.1%
10000008 4
 
1.4%
10000016 1
 
0.4%
10000017 1
 
0.4%
10000019 11
4.0%
10000146 1
 
0.4%
10000149 1
 
0.4%
10000203 1
 
0.4%
ValueCountFrequency (%)
50261700 1
0.4%
10006632 1
0.4%
10006594 2
0.7%
10006566 1
0.4%
10006441 1
0.4%
10006417 1
0.4%
10006364 1
0.4%
10006363 2
0.7%
10006362 1
0.4%
10006345 1
0.4%

GPC_BRIK_NM
Text

MISSING 

Distinct120
Distinct (%)56.9%
Missing65
Missing (%)23.6%
Memory size2.3 KiB

Length

Max length60
Median length46
Mean length26.720379
Min length4

Characters and Unicode

Total characters5638
Distinct characters57
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique95 ?
Unique (%)45.0%

Sample

1st rowGrains/Cereal ? Not Ready to Eat ? (Shelf Stable)
2nd rowGrains/Cereal ? Not Ready to Eat ? (Shelf Stable)
3rd rowGrains/Cereal ? Not Ready to Eat ? (Shelf Stable)
4th rowBeans (Winged)
5th rowPeas
ValueCountFrequency (%)
102
 
15.7%
unprepared/unprocessed 49
 
7.5%
shelf 45
 
6.9%
stable 45
 
6.9%
perishable 41
 
6.3%
prepared/processed 34
 
5.2%
vegetables 30
 
4.6%
fish 26
 
4.0%
shellfish 13
 
2.0%
nuts/seeds 9
 
1.4%
Other values (157) 256
39.4%

Most occurring characters

ValueCountFrequency (%)
e 805
 
14.3%
440
 
7.8%
s 437
 
7.8%
r 427
 
7.6%
a 342
 
6.1%
l 236
 
4.2%
p 217
 
3.8%
d 207
 
3.7%
o 198
 
3.5%
t 184
 
3.3%
Other values (47) 2145
38.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4129
73.2%
Uppercase Letter 643
 
11.4%
Space Separator 440
 
7.8%
Other Punctuation 209
 
3.7%
Open Punctuation 108
 
1.9%
Close Punctuation 108
 
1.9%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 805
19.5%
s 437
10.6%
r 427
10.3%
a 342
 
8.3%
l 236
 
5.7%
p 217
 
5.3%
d 207
 
5.0%
o 198
 
4.8%
t 184
 
4.5%
h 181
 
4.4%
Other values (16) 895
21.7%
Uppercase Letter
ValueCountFrequency (%)
S 143
22.2%
P 135
21.0%
U 98
15.2%
F 43
 
6.7%
V 31
 
4.8%
C 30
 
4.7%
R 20
 
3.1%
G 18
 
2.8%
N 16
 
2.5%
M 15
 
2.3%
Other values (13) 94
14.6%
Other Punctuation
ValueCountFrequency (%)
/ 106
50.7%
? 101
48.3%
, 1
 
0.5%
' 1
 
0.5%
Space Separator
ValueCountFrequency (%)
440
100.0%
Open Punctuation
ValueCountFrequency (%)
( 108
100.0%
Close Punctuation
ValueCountFrequency (%)
) 108
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4772
84.6%
Common 866
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 805
16.9%
s 437
 
9.2%
r 427
 
8.9%
a 342
 
7.2%
l 236
 
4.9%
p 217
 
4.5%
d 207
 
4.3%
o 198
 
4.1%
t 184
 
3.9%
h 181
 
3.8%
Other values (39) 1538
32.2%
Common
ValueCountFrequency (%)
440
50.8%
( 108
 
12.5%
) 108
 
12.5%
/ 106
 
12.2%
? 101
 
11.7%
- 1
 
0.1%
, 1
 
0.1%
' 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5638
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 805
 
14.3%
440
 
7.8%
s 437
 
7.8%
r 427
 
7.6%
a 342
 
6.1%
l 236
 
4.2%
p 217
 
3.8%
d 207
 
3.7%
o 198
 
3.5%
t 184
 
3.3%
Other values (47) 2145
38.0%

UPDT_DE
Categorical

CONSTANT 

Distinct1
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size2.3 KiB
20151203
276 

Length

Max length8
Median length8
Mean length8
Min length8

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row20151203
2nd row20151203
3rd row20151203
4th row20151203
5th row20151203

Common Values

ValueCountFrequency (%)
20151203 276
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
20151203 276
100.0%

Interactions

[Figure: 16 interaction plots]

Correlations

              | CATGORY_CODE | CATGORY_NM | GPC_FAMY_CODE | GPC_FAMY_NM | GPC_CLAS_CODE | GPC_CLAS_NM | GPC_BRIK_CODE
CATGORY_CODE  | 1.000 | 1.000 | 0.496 | 0.894 | 0.181 | 0.946 | 0.000
CATGORY_NM    | 1.000 | 1.000 | 0.847 | 0.951 | 0.822 | 0.969 | 0.000
GPC_FAMY_CODE | 0.496 | 0.847 | 1.000 | 1.000 | 0.980 | 1.000 | 0.000
GPC_FAMY_NM   | 0.894 | 0.951 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
GPC_CLAS_CODE | 0.181 | 0.822 | 0.980 | 1.000 | 1.000 | 1.000 | 0.000
GPC_CLAS_NM   | 0.946 | 0.969 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
GPC_BRIK_CODE | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000

            | GPC_FAMY_NM | CATGORY_NM
GPC_FAMY_NM | 1.000 | 0.668
CATGORY_NM  | 0.668 | 1.000

              | CATGORY_CODE | GPC_FAMY_CODE | GPC_CLAS_CODE | GPC_BRIK_CODE | CATGORY_NM | GPC_FAMY_NM
CATGORY_CODE  | 1.000 | -0.374 | -0.324 | -0.331 | 0.958 | 0.599
GPC_FAMY_CODE | -0.374 | 1.000 | 0.953 | 0.557 | 0.693 | 0.978
GPC_CLAS_CODE | -0.324 | 0.953 | 1.000 | 0.535 | 0.693 | 0.978
GPC_BRIK_CODE | -0.331 | 0.557 | 0.535 | 1.000 | 0.000 | 0.000
CATGORY_NM    | 0.958 | 0.693 | 0.693 | 0.000 | 1.000 | 0.668
GPC_FAMY_NM   | 0.599 | 0.978 | 0.978 | 0.000 | 0.668 | 1.000
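
This report does not say which association measure underlies each matrix. As an illustration of how a value of 1.000 between a code column and its name column can arise, the sketch below computes the uncorrected Cramér's V, a common measure for two categorical variables. The choice of measure and the `cramers_v` helper are assumptions made for this example; the input is the CATGORY_CODE and CATGORY_NM values of the first ten sample rows.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # not defined for a constant column; treated as 0 here
    chi2 = 0.0
    for x, row_count in row_totals.items():
        for y, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# CATGORY_CODE and CATGORY_NM from the first ten rows of the Sample section.
codes = [1, 1, 2, 3, 3, 3, 3, 3, 3, 4]
names = ["미곡류", "미곡류", "맥류", "두류", "두류", "두류", "두류", "두류", "두류", "잡곡류"]
v_codes_names = cramers_v(codes, names)
print(round(v_codes_names, 3))
```

Because each code maps to exactly one name in these rows, the measure reaches its maximum, consistent with the 1.000 entry for this pair.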

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
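
One way to make nullity correlation concrete is to turn each column into a 0/1 missing indicator and correlate the indicators. The sketch below does this with a hand-written Pearson correlation on the GPC_BRIK_CODE and GPC_BRIK_NM values from the first ten sample rows. It is an illustration under that assumption, not the profiler's implementation, and ten rows cannot confirm that the two columns are missing together across all 276 rows.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# None stands for the <NA> cells shown in the first ten sample rows.
brik_code = [10000211, 10000211, 10000315, 10006336, None,
             None, 10005984, None, None, 10006147]
brik_nm = ["Grains/Cereal", "Grains/Cereal", "Grains/Cereal", "Beans (Winged)", None,
           None, "Peas", None, None, "Sweetcorn"]

# 1 where the cell is missing, 0 where it is present.
code_is_null = [int(v is None) for v in brik_code]
nm_is_null = [int(v is None) for v in brik_nm]

nullity_corr = pearson(code_is_null, nm_is_null)
print(round(nullity_corr, 3))
```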

Sample

(index) | CATGORY_CODE | CATGORY_NM | STD_PRDLST_CODE | STD_PRDLST_NM | GPC_SEGM_CODE | GPC_SEGM_NM | GPC_FAMY_CODE | GPC_FAMY_NM | GPC_CLAS_CODE | GPC_CLAS_NM | GPC_BRIK_CODE | GPC_BRIK_NM | UPDT_DE
0 | 1 | 미곡류 | 0101 |  | 50000000 | Food/Beverage/Tobacco | 50220000 | Cereal/Grain/Pulse Products | 50221000 | Grains/Flour | 10000211 | Grains/Cereal ? Not Ready to Eat ? (Shelf Stable) | 20151203
1 | 1 | 미곡류 | 0104 | 찹쌀 | 50000000 | Food/Beverage/Tobacco | 50220000 | Cereal/Grain/Pulse Products | 50221000 | Grains/Flour | 10000211 | Grains/Cereal ? Not Ready to Eat ? (Shelf Stable) | 20151203
2 | 2 | 맥류 | 0201 | 보리 | 50000000 | Food/Beverage/Tobacco | 50220000 | Cereal/Grain/Pulse Products | 50221000 | Grains/Flour | 10000315 | Grains/Cereal ? Not Ready to Eat ? (Shelf Stable) | 20151203
3 | 3 | 두류 | 0301 |  | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261400 | Beans (With Pods) | 10006336 | Beans (Winged) | 20151203
4 | 3 | 두류 | 0302 |  | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261400 | Beans (With Pods) | <NA> | <NA> | 20151203
5 | 3 | 두류 | 0303 | 녹두 | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261400 | Beans (With Pods) | <NA> | <NA> | 20151203
6 | 3 | 두류 | 0304 | 완두 | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261500 | Peas (With Pods) | 10005984 | Peas | 20151203
7 | 3 | 두류 | 0305 | 강낭콩 | 50000000 | Food/Beverage/Tobacco | 50290000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261400 | Beans (With Pods) | <NA> | <NA> | 20151203
8 | 3 | 두류 | 0306 | 동부 | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261400 | Beans (With Pods) | <NA> | <NA> | 20151203
9 | 4 | 잡곡류 | 0401 | 옥수수 | 50000000 | Food/Beverage/Tobacco | 50260000 | Vegetables (Non Leaf) ? Unprepared/Unprocessed (Fresh) | 50261000 | Other Vegetables | 10006147 | Sweetcorn | 20151203

(index) | CATGORY_CODE | CATGORY_NM | STD_PRDLST_CODE | STD_PRDLST_NM | GPC_SEGM_CODE | GPC_SEGM_NM | GPC_FAMY_CODE | GPC_FAMY_NM | GPC_CLAS_CODE | GPC_CLAS_NM | GPC_BRIK_CODE | GPC_BRIK_NM | UPDT_DE
266 | 73 | 내수면갑각류 | 7302 | 민물게류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121700 | Shellfish Unprepared/Unprocessed | 10000019 | Shellfish ? Unprepared/Unprocessed (Perishable) | 20151203
267 | 73 | 내수면갑각류 | 7303 | 민물새우류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121700 | Shellfish Unprepared/Unprocessed | 10000019 | Shellfish ? Unprepared/Unprocessed (Perishable) | 20151203
268 | 81 | 해조류 | 8102 | 갈래곰보류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121500 | Fish ? Unprepared/Unprocessed | 10000281 | Fish ? Unprepared/Unprocessed (Frozen) | 20151203
269 | 81 | 해조류 | 8103 | 김류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121500 | Fish ? Unprepared/Unprocessed | 10000281 | Fish ? Unprepared/Unprocessed (Frozen) | 20151203
270 | 81 | 해조류 | 8104 | 꼬시래기류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121900 | Fish ? Prepared/Processed | 10000017 | Fish ? Prepared/Processed (Frozen) | 20151203
271 | 81 | 해조류 | 8106 | 도박류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121500 | Fish ? Unprepared/Unprocessed | 10000281 | Fish ? Unprepared/Unprocessed (Frozen) | 20151203
272 | 81 | 해조류 | 8112 | 청각류 | 50000000 | Food/Beverage/Tobacco | 50120000 | Seafood | 50121500 | Fish ? Unprepared/Unprocessed | 10000281 | Fish ? Unprepared/Unprocessed (Frozen) | 20151203
273 | 91 | 농림가공 | 9104 | 절임식품 | 50000000 | Food/Beverage/Tobacco | 50190000 | Prepared/Preserved Foods | 50193100 | Vegetable Based Products / Meals | 10000289 | Vegetable Based Products / Meals ? Ready to Eat (Perishable) | 20151203
274 | 91 | 농림가공 | 9105 | 유지 | 50000000 | Food/Beverage/Tobacco | 50150000 | Oils/Fats Edible | 50151500 | Oils Edible | <NA> | <NA> | 20151203
275 | 91 | 농림가공 | 9107 | 곡물제조 | 50000000 | Food/Beverage/Tobacco | 50190000 | Prepared/Preserved Foods | 50193200 | Grain Based Products / Meals | <NA> | <NA> | 20151203
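
In most of the sample rows, the family code shares its first two digits with the segment code, and the class code shares its first four digits with the family code. The sketch below checks that prefix pattern. The rule is inferred from these sample rows rather than stated by the report, and `gpc_prefix_consistent` is a hypothetical helper.

```python
def gpc_prefix_consistent(segment_code, family_code, class_code):
    """Check the prefix pattern seen in the sample rows (an inferred rule)."""
    segment, family, klass = str(segment_code), str(family_code), str(class_code)
    return family[:2] == segment[:2] and klass[:4] == family[:4]


# (index, GPC_SEGM_CODE, GPC_FAMY_CODE, GPC_CLAS_CODE) from selected sample rows.
rows = [
    (0, 50000000, 50220000, 50221000),
    (6, 50000000, 50260000, 50261500),
    (7, 50000000, 50290000, 50261400),
    (274, 50000000, 50150000, 50151500),
]

inconsistent = [idx for idx, seg, fam, cls in rows
                if not gpc_prefix_consistent(seg, fam, cls)]
print(inconsistent)
```

Row 7 is the only sample row that breaks the pattern: its family code 50290000 does not match the class code 50261400, while its family name matches the rows that carry 50260000. A check of this kind could help flag such rows for review in the source data.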