Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 14
Duplicate rows (%): 0.1%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B

Variable types

Numeric: 3
Categorical: 2
Text: 1
DateTime: 2

Dataset

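Description: Certification status by item for each certification number. Columns: 인증번호 (certification number), 인증종류명 (certification type), 인증농가 (certified farm), 인증품목명 (certified item name), 재배(작업장)면적(사육두수) (cultivation or workplace area, or head of livestock), 생산(수입)계획량 (planned production or import volume), 인증기간(시작일) (certification period start date), 인증기간(종료일) (certification period end date), 원재료인증구분 (raw-material certification category)
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001679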

Alerts

Dataset has 14 (0.1%) duplicate rows [Duplicates]
인증종류명 is highly overall correlated with 원재료인증구분 [High correlation]
원재료인증구분 is highly overall correlated with 재배(작업장)면적(사육두수) and 1 other field [High correlation]
재배(작업장)면적(사육두수) is highly overall correlated with 원재료인증구분 [High correlation]
원재료인증구분 is highly imbalanced (79.5%) [Imbalance]
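
The 79.5% imbalance figure can be reproduced from the category counts listed for 원재료인증구분 later in this report. The sketch below assumes the score is one minus the Shannon entropy divided by its maximum possible value, log(number of classes). That definition is inferred from the numbers, not stated in the report, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Category counts copied from the 원재료인증구분 "Common Values" table of this report.
counts = [9128, 340, 320, 188, 17, 6, 1]


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy and k the class count.

    Assumption: this is the definition behind the alert; it is inferred from
    the reported figure rather than quoted from the report.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has no meaningful entropy ratio
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))


score = imbalance_score(counts)
print(f"{score:.1%}")  # 79.5%
```

A score near 0 would mean the classes are evenly populated; here a single value (`<NA>`, 9128 of 10000 rows) dominates, which pushes the score toward 1.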

Reproduction

Analysis started: 2024-01-05 22:16:55.517979
Analysis finished: 2024-01-05 22:17:06.653499
Duration: 11.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
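
A report of this kind is typically generated by passing a pandas DataFrame to `ProfileReport` and writing the result with `to_file`. The sketch below is a minimal illustration, not the exact configuration used for this run: the helper name `build_profile`, the file names, the title, and the encoding default are all assumptions.

```python
def build_profile(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV export and write an HTML report (hypothetical helper).

    Assumptions: the data is available as a CSV file and its encoding is
    known; neither is stated in this report.
    """
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Certification status by item")
    report.to_file(html_path)
    return report


# Example call; both file names are placeholders:
# build_profile("certifications.csv", "report.html")
```

The `config.json` download mentioned above holds the settings actually used, so it is the better starting point when an exact reproduction is needed.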

Variables

인증번호
Real number (ℝ)

Distinct: 5899
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12332571
Minimum: 1300011
Maximum: 15303294
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1300011
5-th percentile: 7300013
Q1: 10502365
median: 13100463
Q3: 14600049
95-th percentile: 15104397
Maximum: 15303294
Range: 14003283
Interquartile range (IQR): 4097684.2

Descriptive statistics

Standard deviation: 2741180.1
Coefficient of variation (CV): 0.22227158
Kurtosis: 3.3764244
Mean: 12332571
Median Absolute Deviation (MAD): 2000619
Skewness: -1.5373969
Sum: 1.2332571 × 10^11
Variance: 7.5140685 × 10^12
Monotonicity: Not monotonic
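
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes the coefficient of variation, variance, range, and interquartile range from the reported values for 인증번호. Small differences are expected because the report prints rounded inputs.

```python
# Values copied from the 인증번호 statistics in this report.
mean, std = 12332571, 2741180.1
minimum, maximum = 1300011, 15303294
q1, q3 = 10502365, 14600049

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range from the rounded quartiles

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.7e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
```

The recomputed IQR is 4097684, against the reported 4097684.2, because the quartiles are shown without decimals.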
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
12100774 | 45 | 0.4%
13100914 | 35 | 0.4%
13100603 | 30 | 0.3%
13100916 | 23 | 0.2%
4100031 | 22 | 0.2%
13100641 | 20 | 0.2%
14100057 | 20 | 0.2%
10100687 | 18 | 0.2%
10303755 | 16 | 0.2%
14303158 | 15 | 0.1%
Other values (5889) | 9756 | 97.6%

Value | Count | Frequency (%)
1300011 | 1 | < 0.1%
1300015 | 3 | < 0.1%
1300016 | 1 | < 0.1%
1300019 | 2 | < 0.1%
1300021 | 4 | < 0.1%
1300022 | 3 | < 0.1%
1301135 | 10 | 0.1%
1301136 | 8 | 0.1%
1301138 | 8 | 0.1%
1301140 | 1 | < 0.1%

Value | Count | Frequency (%)
15303294 | 1 | < 0.1%
15303278 | 2 | < 0.1%
15303241 | 1 | < 0.1%
15303240 | 6 | 0.1%
15303235 | 2 | < 0.1%
15303217 | 1 | < 0.1%
15303191 | 1 | < 0.1%
15303184 | 1 | < 0.1%
15303160 | 3 | < 0.1%
15303159 | 1 | < 0.1%

인증종류명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

유기농산물: 4660
무농약농산물: 3928
취급자: 872
무항생제축산물: 422
유기가공식품: 92
Other values (3): 26

Length

Max length: 9
Median length: 8
Mean length: 5.3185
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유기농산물
2nd row: 유기농산물
3rd row: 무농약농산물
4th row: 유기농산물
5th row: 유기농산물

Common Values

Value | Count | Frequency (%)
유기농산물 | 4660 | 46.6%
무농약농산물 | 3928 | 39.3%
취급자 | 872 | 8.7%
무항생제축산물 | 422 | 4.2%
유기가공식품 | 92 | 0.9%
무농약원료가공식품 | 14 | 0.1%
유기축산물 | 9 | 0.1%
비식용유기가공품 | 3 | < 0.1%

Length

Histogram of lengths of the category


인증품목명
Text

Distinct: 618
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 12
Mean length: 2.7054
Min length: 1

Characters and Unicode

Total characters: 27054
Distinct characters: 395
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 2.0%

Sample

1st row:
2nd row: 뉴그린
3rd row:
4th row: 단호박
5th row:

Value | Count | Frequency (%)
 | 2042 | 20.4%
한우(식육 | 270 | 2.7%
찰벼 | 266 | 2.7%
 | 233 | 2.3%
감자 | 175 | 1.7%
블루베리 | 147 | 1.5%
돼지(식육 | 141 | 1.4%
들깨 | 129 | 1.3%
 | 120 | 1.2%
 | 112 | 1.1%
Other values (618) | 6394 | 63.8%

Most occurring characters

Value | Count | Frequency (%)
 | 2381 | 8.8%
 | 880 | 3.3%
 | 858 | 3.2%
( | 754 | 2.8%
) | 754 | 2.8%
 | 754 | 2.8%
 | 648 | 2.4%
 | 641 | 2.4%
 | 558 | 2.1%
 | 488 | 1.8%
Other values (385) | 18338 | 67.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25500 | 94.3%
Open Punctuation | 754 | 2.8%
Close Punctuation | 754 | 2.8%
Space Separator | 29 | 0.1%
Decimal Number | 6 | < 0.1%
Other Punctuation | 4 | < 0.1%
Uppercase Letter | 4 | < 0.1%
Lowercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2381 | 9.3%
 | 880 | 3.5%
 | 858 | 3.4%
 | 754 | 3.0%
 | 648 | 2.5%
 | 641 | 2.5%
 | 558 | 2.2%
 | 488 | 1.9%
 | 413 | 1.6%
 | 401 | 1.6%
Other values (371) | 17478 | 68.5%

Uppercase Letter
Value | Count | Frequency (%)
O | 1 | 25.0%
E | 1 | 25.0%
Y | 1 | 25.0%
R | 1 | 25.0%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 33.3%
t | 1 | 33.3%
a | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
% | 3 | 75.0%
. | 1 | 25.0%

Decimal Number
Value | Count | Frequency (%)
0 | 3 | 50.0%
7 | 3 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 754 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 754 | 100.0%

Space Separator
Value | Count | Frequency (%)
 | 29 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25500 | 94.3%
Common | 1547 | 5.7%
Latin | 7 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2381 | 9.3%
 | 880 | 3.5%
 | 858 | 3.4%
 | 754 | 3.0%
 | 648 | 2.5%
 | 641 | 2.5%
 | 558 | 2.2%
 | 488 | 1.9%
 | 413 | 1.6%
 | 401 | 1.6%
Other values (371) | 17478 | 68.5%

Common
Value | Count | Frequency (%)
( | 754 | 48.7%
) | 754 | 48.7%
 | 29 | 1.9%
% | 3 | 0.2%
0 | 3 | 0.2%
7 | 3 | 0.2%
. | 1 | 0.1%

Latin
Value | Count | Frequency (%)
s | 1 | 14.3%
t | 1 | 14.3%
a | 1 | 14.3%
O | 1 | 14.3%
E | 1 | 14.3%
Y | 1 | 14.3%
R | 1 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25500 | 94.3%
ASCII | 1554 | 5.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2381 | 9.3%
 | 880 | 3.5%
 | 858 | 3.4%
 | 754 | 3.0%
 | 648 | 2.5%
 | 641 | 2.5%
 | 558 | 2.2%
 | 488 | 1.9%
 | 413 | 1.6%
 | 401 | 1.6%
Other values (371) | 17478 | 68.5%

ASCII
Value | Count | Frequency (%)
( | 754 | 48.5%
) | 754 | 48.5%
 | 29 | 1.9%
% | 3 | 0.2%
0 | 3 | 0.2%
7 | 3 | 0.2%
s | 1 | 0.1%
. | 1 | 0.1%
t | 1 | 0.1%
a | 1 | 0.1%
Other values (4) | 4 | 0.3%

재배(작업장)면적(사육두수)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4937
Distinct (%): 49.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6897.3075
Minimum: 0.7
Maximum: 828672
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.7
5-th percentile: 41
Q1: 309
median: 1652
Q3: 5450.25
95-th percentile: 29316.95
Maximum: 828672
Range: 828671.3
Interquartile range (IQR): 5141.25

Descriptive statistics

Standard deviation: 21408.631
Coefficient of variation (CV): 3.1039113
Kurtosis: 413.08746
Mean: 6897.3075
Median Absolute Deviation (MAD): 1529
Skewness: 15.15809
Sum: 68973075
Variance: 4.5832947 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100.0 | 232 | 2.3%
200.0 | 182 | 1.8%
300.0 | 168 | 1.7%
50.0 | 144 | 1.4%
1000.0 | 139 | 1.4%
500.0 | 124 | 1.2%
330.0 | 106 | 1.1%
150.0 | 104 | 1.0%
660.0 | 93 | 0.9%
600.0 | 88 | 0.9%
Other values (4927) | 8620 | 86.2%

Value | Count | Frequency (%)
0.7 | 3 | < 0.1%
1.0 | 5 | 0.1%
1.1 | 1 | < 0.1%
2.0 | 5 | 0.1%
2.1 | 1 | < 0.1%
2.2 | 1 | < 0.1%
2.5 | 1 | < 0.1%
3.0 | 3 | < 0.1%
5.0 | 14 | 0.1%
5.5 | 1 | < 0.1%

Value | Count | Frequency (%)
828672.0 | 1 | < 0.1%
700000.0 | 1 | < 0.1%
570000.0 | 1 | < 0.1%
326016.0 | 1 | < 0.1%
311040.0 | 1 | < 0.1%
290400.0 | 1 | < 0.1%
273597.0 | 1 | < 0.1%
270521.0 | 1 | < 0.1%
270000.0 | 1 | < 0.1%
267840.0 | 1 | < 0.1%

생산(수입)계획량
Real number (ℝ)

Distinct: 2653
Distinct (%): 26.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39646.452
Minimum: 0
Maximum: 11000000
Zeros: 77
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 500
median: 2000
Q3: 7180
95-th percentile: 91059.65
Maximum: 11000000
Range: 11000000
Interquartile range (IQR): 6680

Descriptive statistics

Standard deviation: 276815.45
Coefficient of variation (CV): 6.9820989
Kurtosis: 467.65051
Mean: 39646.452
Median Absolute Deviation (MAD): 1878
Skewness: 17.882906
Sum: 3.9646452 × 10^8
Variance: 7.6626792 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1000.0 | 425 | 4.2%
500.0 | 306 | 3.1%
2000.0 | 296 | 3.0%
100.0 | 287 | 2.9%
200.0 | 255 | 2.5%
3000.0 | 221 | 2.2%
300.0 | 214 | 2.1%
50.0 | 207 | 2.1%
5000.0 | 163 | 1.6%
1500.0 | 155 | 1.6%
Other values (2643) | 7471 | 74.7%

Value | Count | Frequency (%)
0.0 | 77 | 0.8%
0.1 | 4 | < 0.1%
0.2 | 1 | < 0.1%
0.5 | 1 | < 0.1%
1.0 | 42 | 0.4%
1.5 | 1 | < 0.1%
2.0 | 16 | 0.2%
2.5 | 1 | < 0.1%
3.0 | 10 | 0.1%
3.6 | 1 | < 0.1%

Value | Count | Frequency (%)
11000000.0 | 1 | < 0.1%
8409600.0 | 1 | < 0.1%
6748000.0 | 1 | < 0.1%
6000000.0 | 1 | < 0.1%
5000000.0 | 2 | < 0.1%
4970000.0 | 1 | < 0.1%
4800000.0 | 2 | < 0.1%
4200000.0 | 1 | < 0.1%
3615000.0 | 1 | < 0.1%
3467500.0 | 1 | < 0.1%

인증기간(시작일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-09-22 00:00:00
Maximum: 2022-09-21 00:00:00
Histogram with fixed size bins (bins=50)

인증기간(종료일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-09-21 00:00:00
Maximum: 2023-09-20 00:00:00
Histogram with fixed size bins (bins=50)

원재료인증구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 9128
무농약농산물: 340
유기농산물: 320
무항생제축산물: 188
유기가공식품: 17
Other values (2): 7

Length

Max length: 7
Median length: 4
Mean length: 4.1603
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9128 | 91.3%
무농약농산물 | 340 | 3.4%
유기농산물 | 320 | 3.2%
무항생제축산물 | 188 | 1.9%
유기가공식품 | 17 | 0.2%
유기축산물 | 6 | 0.1%
취급자 | 1 | < 0.1%

Length

Histogram of lengths of the category


Correlations

 | 인증번호 | 인증종류명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 원재료인증구분
인증번호 | 1.000 | 0.404 | 0.000 | 0.057 | 0.371
인증종류명 | 0.404 | 1.000 | 0.214 | 0.204 | NaN
재배(작업장)면적(사육두수) | 0.000 | 0.214 | 1.000 | 0.394 | NaN
생산(수입)계획량 | 0.057 | 0.204 | 0.394 | 1.000 | 0.000
원재료인증구분 | 0.371 | NaN | NaN | 0.000 | 1.000

 | 인증종류명 | 원재료인증구분
인증종류명 | 1.000 | 1.000
원재료인증구분 | 1.000 | 1.000

 | 인증번호 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증종류명 | 원재료인증구분
인증번호 | 1.000 | 0.389 | 0.095 | 0.211 | 0.194
재배(작업장)면적(사육두수) | 0.389 | 1.000 | 0.485 | 0.116 | 1.000
생산(수입)계획량 | 0.095 | 0.485 | 1.000 | 0.101 | 0.000
인증종류명 | 0.211 | 0.116 | 0.101 | 1.000 | 1.000
원재료인증구분 | 0.194 | 1.000 | 0.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분
79082 | 15100996 | 유기농산물 |  | 2652.0 | 1600.0 | 2022-09-15 | 2023-09-14 | <NA>
52420 | 13100676 | 유기농산물 | 뉴그린 | 1740.0 | 7000.0 | 2021-11-09 | 2022-11-08 | <NA>
48470 | 12304456 | 무농약농산물 |  | 3882.0 | 2300.0 | 2022-07-07 | 2023-07-06 | <NA>
53446 | 13100850 | 유기농산물 | 단호박 | 2820.0 | 8200.0 | 2022-04-28 | 2023-04-27 | <NA>
78902 | 15100955 | 유기농산물 |  | 7961.3 | 3600.0 | 2022-09-10 | 2023-09-09 | <NA>
11970 | 10303433 | 무농약농산물 | 찰벼 | 7200.0 | 4650.0 | 2022-08-27 | 2023-08-26 | <NA>
982 | 1600371 | 취급자 | 육우(식육) | 185.0 | 5000.0 | 2022-06-24 | 2023-06-23 | 무항생제축산물
72602 | 14304015 | 무농약농산물 | 얼갈이배추 | 40.0 | 200.0 | 2022-09-11 | 2023-09-10 | <NA>
90251 | 15103202 | 유기농산물 |  | 2949.0 | 1770.0 | 2021-10-23 | 2022-10-22 | <NA>
93432 | 15103791 | 유기농산물 | 흑미 | 4259.0 | 2800.0 | 2021-10-28 | 2022-10-27 | <NA>

 | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분
30408 | 11100420 | 유기농산물 | 약콩 | 2729.0 | 2.0 | 2021-10-23 | 2022-10-22 | <NA>
62397 | 13600329 | 취급자 | 한우(식육) | 465.0 | 300000.0 | 2022-02-02 | 2023-02-01 | 무항생제축산물
94137 | 15103944 | 유기농산물 | 차조 | 8219.0 | 2466.0 | 2022-04-29 | 2023-04-28 | <NA>
77857 | 15100735 | 유기농산물 | 토란 | 2223.0 | 3334.0 | 2022-09-02 | 2023-09-01 | <NA>
46101 | 12303455 | 무농약농산물 | 호두 | 10.0 | 10.0 | 2022-04-15 | 2023-04-14 | <NA>
26322 | 10600380 | 취급자 | 한우(식육) | 96.0 | 10000.0 | 2021-11-20 | 2022-11-19 | 무항생제축산물
20183 | 10305496 | 무농약농산물 | 감자 | 1800.0 | 4380.0 | 2022-06-08 | 2023-06-07 | <NA>
80979 | 15101318 | 유기농산물 | 귀리 | 34663.4 | 17325.0 | 2021-09-30 | 2022-09-29 | <NA>
2110 | 3300086 | 무농약농산물 | 오이 | 200.0 | 800.0 | 2022-06-04 | 2023-06-03 | <NA>
26669 | 10600463 | 취급자 | 잡곡류 | 2341.7 | 150000.0 | 2021-12-20 | 2022-12-19 | 유기농산물
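
In the sample rows, each 인증기간(종료일) falls one day short of a year after the matching 인증기간(시작일). The sketch below checks that observation on four date pairs copied from the sample. It only describes those rows; whether every record in the dataset follows the same rule is an assumption the report does not confirm.

```python
from datetime import date

# (start, end) pairs copied from four of the sample rows in this report.
periods = [
    (date(2022, 9, 15), date(2023, 9, 14)),
    (date(2021, 11, 9), date(2022, 11, 8)),
    (date(2022, 7, 7), date(2023, 7, 6)),
    (date(2022, 2, 2), date(2023, 2, 1)),
]

# Length of each certification period in days.
spans = [(end - start).days for start, end in periods]
print(spans)  # [364, 364, 364, 364]
```

This matches the reported ranges of the two date columns, whose minimum and maximum values are also 364 days apart.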

Duplicate rows

Most frequently occurring

 | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분 | # duplicates
8 | 15101079 | 유기농산물 |  | 2975.0 | 1339.0 | 2022-09-18 | 2023-09-17 | <NA> | 5
5 | 14100764 | 유기농산물 |  | 4000.0 | 2600.0 | 2022-08-31 | 2023-08-30 | <NA> | 3
7 | 15100477 | 유기농산물 |  | 1921.0 | 870.0 | 2022-08-27 | 2023-08-26 | <NA> | 3
0 | 4100031 | 유기농산물 | 셀러리 | 300.0 | 1000.0 | 2022-02-28 | 2023-02-27 | <NA> | 2
1 | 12100774 | 유기농산물 | 브로코리(녹색꽃양배추) | 1653.0 | 3835.0 | 2022-05-25 | 2023-05-24 | <NA> | 2
2 | 13100002 | 유기농산물 |  | 9900.0 | 2400.0 | 2022-08-12 | 2023-08-11 | <NA> | 2
3 | 13302205 | 무농약농산물 |  | 4000.0 | 2770.0 | 2021-10-10 | 2022-10-09 | <NA> | 2
4 | 14100071 | 유기농산물 |  | 4000.0 | 2600.0 | 2022-08-31 | 2023-08-30 | <NA> | 2
6 | 14302032 | 무농약농산물 |  | 4000.0 | 2600.0 | 2022-08-31 | 2023-08-30 | <NA> | 2
9 | 15102660 | 유기농산물 |  | 10000.0 | 1200.0 | 2022-07-06 | 2023-07-05 | <NA> | 2
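
A "# duplicates" column like the one above can be produced by counting how often each full row occurs and keeping the rows seen more than once. The sketch below does this with the standard library on a tiny list built from two rows of the table plus one non-duplicated sample row. It illustrates the counting idea only and is not necessarily how the profiling tool computes it.

```python
from collections import Counter

# Rows as tuples of all column values; the item name is empty where the table shows none.
row_a = (15101079, "유기농산물", "", 2975.0, 1339.0, "2022-09-18", "2023-09-17", "<NA>")
row_b = (14100764, "유기농산물", "", 4000.0, 2600.0, "2022-08-31", "2023-08-30", "<NA>")
row_c = (15100996, "유기농산물", "", 2652.0, 1600.0, "2022-09-15", "2023-09-14", "<NA>")

# Repeat the rows as often as the table reports them (5 and 3); row_c appears once.
rows = [row_a] * 5 + [row_b] * 3 + [row_c]

# Count identical rows, keep those occurring more than once, most frequent first.
counts = Counter(rows)
duplicates = sorted(
    ((row, n) for row, n in counts.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, n in duplicates:
    print(row[0], n)
```

Rows only count as duplicates when every column matches, which is why entries such as 14100764 and 14100071 appear separately even though their area, volume, and dates are identical.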