Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): 0.1%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
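
The two derived figures in this block follow from the raw counts. A minimal sketch, using only the numbers reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the Dataset statistics block.
n_rows = 10_000
n_duplicates = 12
total_kib = 732.4  # "Total size in memory", in KiB

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Average bytes per record: convert KiB to bytes, divide by row count.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

The duplicate share is 0.12% before rounding, which the report displays as 0.1%.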

Variable types

Numeric: 3
Categorical: 2
Text: 1
DateTime: 2

Dataset

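Description: Certification information for eco-friendly certified agricultural and livestock products (pesticide-free produce, organic agricultural products, organic livestock products, and antibiotic-free livestock products), including certification number, certification type, certified item, cultivation area, planned production volume, certification period, and raw-material certification category.
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20181019000000000977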

Alerts

Dataset has 12 (0.1%) duplicate rows [Duplicates]
원재료인증구분 is highly overall correlated with 재배(작업장)면적(사육두수) and 1 other field [High correlation]
인증종류명 is highly overall correlated with 원재료인증구분 [High correlation]
재배(작업장)면적(사육두수) is highly overall correlated with 생산(수입)계획량 and 1 other field [High correlation]
생산(수입)계획량 is highly overall correlated with 재배(작업장)면적(사육두수) [High correlation]
원재료인증구분 is highly imbalanced (80.1%) [Imbalance]
재배(작업장)면적(사육두수) is highly skewed (γ1 = 36.27564784) [Skewed]
생산(수입)계획량 is highly skewed (γ1 = 99.99280585) [Skewed]

Reproduction

Analysis started: 2024-03-23 07:39:24.837158
Analysis finished: 2024-03-23 07:39:30.081043
Duration: 5.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

인증번호
Real number (ℝ)

Distinct: 5826
Distinct (%): 58.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12231165
Minimum: 1300002
Maximum: 15104675
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1300002
5-th percentile: 6600357
Q1: 10500070
median: 12501870
Q3: 14304122
95-th percentile: 15103592
Maximum: 15104675
Range: 13804673
Interquartile range (IQR): 3804053

Descriptive statistics

Standard deviation: 2758211.9
Coefficient of variation (CV): 0.22550688
Kurtosis: 3.2147544
Mean: 12231165
Median Absolute Deviation (MAD): 1999120
Skewness: -1.5149327
Sum: 1.2231165 × 10^11
Variance: 7.6077328 × 10^12
Monotonicity: Not monotonic
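
Several of these figures are derived from others in the same block. A short sketch, assuming only the values displayed above, recomputes the range, the coefficient of variation, and the variance. The inputs are rounded as shown in the report, so agreement is expected only to display precision.

```python
# Figures copied from the quantile and descriptive statistics above.
minimum, maximum = 1_300_002, 15_104_675
mean, std = 12_231_165, 2_758_211.9

value_range = maximum - minimum   # spread between the extremes
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation

print(value_range)
print(round(cv, 6))
print(f"{variance:.4e}")
```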
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
13100914 | 36 | 0.4%
12100774 | 36 | 0.4%
4100031 | 27 | 0.3%
10100687 | 27 | 0.3%
13301925 | 25 | 0.2%
13100449 | 23 | 0.2%
13100916 | 21 | 0.2%
13100641 | 20 | 0.2%
12303591 | 19 | 0.2%
14100071 | 18 | 0.2%
Other values (5816) | 9748 | 97.5%

Minimum 10 values
Value | Count | Frequency (%)
1300002 | 2 | < 0.1%
1300011 | 4 | < 0.1%
1300013 | 1 | < 0.1%
1300015 | 1 | < 0.1%
1300016 | 1 | < 0.1%
1300018 | 2 | < 0.1%
1300019 | 2 | < 0.1%
1300020 | 3 | < 0.1%
1300022 | 2 | < 0.1%
1300023 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
15104675 | 1 | < 0.1%
15104670 | 8 | 0.1%
15104656 | 1 | < 0.1%
15104654 | 1 | < 0.1%
15104652 | 2 | < 0.1%
15104646 | 1 | < 0.1%
15104643 | 1 | < 0.1%
15104637 | 1 | < 0.1%
15104632 | 1 | < 0.1%
15104626 | 1 | < 0.1%

인증종류명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
유기농산물 | 4665
무농약농산물 | 4006
취급자 | 833
무항생제축산물 | 380
유기가공식품 | 94
Other values (3) | 22

Length

Max length: 9
Median length: 8
Mean length: 5.3245
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 유기농산물
2nd row: 유기농산물
3rd row: 무농약농산물
4th row: 무농약농산물
5th row: 무농약농산물

Common Values

Value | Count | Frequency (%)
유기농산물 | 4665 | 46.7%
무농약농산물 | 4006 | 40.1%
취급자 | 833 | 8.3%
무항생제축산물 | 380 | 3.8%
유기가공식품 | 94 | 0.9%
무농약원료가공식품 | 12 | 0.1%
유기축산물 | 9 | 0.1%
비식용유기가공품 | 1 | < 0.1%

Length

Histogram of lengths of the category

인증품목명
Text

Distinct: 614
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 16
Mean length: 2.6498
Min length: 1

Characters and Unicode

Total characters: 26498
Distinct characters: 395
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 179
Unique (%): 1.8%

Sample

1st row:
2nd row: 바질
3rd row: 양파
4th row: 아로니아
5th row: 찰벼

Value | Count | Frequency (%)
 | 2020 | 20.2%
찰벼 | 283 | 2.8%
한우(식육 | 251 | 2.5%
감자 | 220 | 2.2%
 | 210 | 2.1%
 | 204 | 2.0%
돼지(식육 | 136 | 1.4%
양파 | 134 | 1.3%
블루베리 | 131 | 1.3%
 | 116 | 1.2%
Other values (614) | 6318 | 63.0%

Most occurring characters

Value | Count | Frequency (%)
 | 2371 | 8.9%
 | 831 | 3.1%
 | 817 | 3.1%
) | 715 | 2.7%
( | 715 | 2.7%
 | 708 | 2.7%
 | 606 | 2.3%
 | 562 | 2.1%
 | 524 | 2.0%
 | 470 | 1.8%
Other values (385) | 18179 | 68.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25005 | 94.4%
Close Punctuation | 715 | 2.7%
Open Punctuation | 715 | 2.7%
Lowercase Letter | 30 | 0.1%
Space Separator | 23 | 0.1%
Uppercase Letter | 4 | < 0.1%
Other Punctuation | 3 | < 0.1%
Decimal Number | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2371 | 9.5%
 | 831 | 3.3%
 | 817 | 3.3%
 | 708 | 2.8%
 | 606 | 2.4%
 | 562 | 2.2%
 | 524 | 2.1%
 | 470 | 1.9%
 | 408 | 1.6%
 | 407 | 1.6%
Other values (366) | 17301 | 69.2%

Lowercase Letter
Value | Count | Frequency (%)
a | 8 | 26.7%
s | 4 | 13.3%
t | 4 | 13.3%
h | 4 | 13.3%
e | 2 | 6.7%
w | 2 | 6.7%
n | 2 | 6.7%
r | 2 | 6.7%
o | 2 | 6.7%

Decimal Number
Value | Count | Frequency (%)
7 | 1 | 33.3%
0 | 1 | 33.3%
1 | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 66.7%
% | 1 | 33.3%

Uppercase Letter
Value | Count | Frequency (%)
O | 2 | 50.0%
K | 2 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 715 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 715 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25005 | 94.4%
Common | 1459 | 5.5%
Latin | 34 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2371 | 9.5%
 | 831 | 3.3%
 | 817 | 3.3%
 | 708 | 2.8%
 | 606 | 2.4%
 | 562 | 2.2%
 | 524 | 2.1%
 | 470 | 1.9%
 | 408 | 1.6%
 | 407 | 1.6%
Other values (366) | 17301 | 69.2%

Latin
Value | Count | Frequency (%)
a | 8 | 23.5%
s | 4 | 11.8%
t | 4 | 11.8%
h | 4 | 11.8%
O | 2 | 5.9%
K | 2 | 5.9%
e | 2 | 5.9%
w | 2 | 5.9%
n | 2 | 5.9%
r | 2 | 5.9%

Common
Value | Count | Frequency (%)
) | 715 | 49.0%
( | 715 | 49.0%
(space) | 23 | 1.6%
. | 2 | 0.1%
7 | 1 | 0.1%
0 | 1 | 0.1%
% | 1 | 0.1%
1 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25005 | 94.4%
ASCII | 1493 | 5.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2371 | 9.5%
 | 831 | 3.3%
 | 817 | 3.3%
 | 708 | 2.8%
 | 606 | 2.4%
 | 562 | 2.2%
 | 524 | 2.1%
 | 470 | 1.9%
 | 408 | 1.6%
 | 407 | 1.6%
Other values (366) | 17301 | 69.2%

ASCII
Value | Count | Frequency (%)
) | 715 | 47.9%
( | 715 | 47.9%
(space) | 23 | 1.5%
a | 8 | 0.5%
s | 4 | 0.3%
t | 4 | 0.3%
h | 4 | 0.3%
. | 2 | 0.1%
O | 2 | 0.1%
K | 2 | 0.1%
Other values (9) | 14 | 0.9%

재배(작업장)면적(사육두수)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 4707
Distinct (%): 47.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6950.0883
Minimum: 1
Maximum: 1756032
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.95
Q1: 330
median: 1691.5
Q3: 5800
95-th percentile: 29299.05
Maximum: 1756032
Range: 1756031
Interquartile range (IQR): 5470

Descriptive statistics

Standard deviation: 25877.247
Coefficient of variation (CV): 3.7232976
Kurtosis: 2189.9361
Mean: 6950.0883
Median Absolute Deviation (MAD): 1559.5
Skewness: 36.275648
Sum: 69500883
Variance: 6.6963193 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
100 | 253 | 2.5%
200 | 157 | 1.6%
300 | 152 | 1.5%
50 | 136 | 1.4%
330 | 132 | 1.3%
1000 | 127 | 1.3%
500 | 119 | 1.2%
150 | 96 | 1.0%
600 | 95 | 0.9%
660 | 89 | 0.9%
Other values (4697) | 8644 | 86.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 5 | 0.1%
2 | 6 | 0.1%
3 | 10 | 0.1%
5 | 12 | 0.1%
6 | 6 | 0.1%
7 | 4 | < 0.1%
8 | 5 | 0.1%
9 | 2 | < 0.1%
10 | 63 | 0.6%
11 | 12 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1756032 | 1 | < 0.1%
755000 | 1 | < 0.1%
483840 | 1 | < 0.1%
400000 | 1 | < 0.1%
385000 | 1 | < 0.1%
360000 | 1 | < 0.1%
341092 | 1 | < 0.1%
269625 | 1 | < 0.1%
258308 | 1 | < 0.1%
209664 | 1 | < 0.1%

생산(수입)계획량
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2553
Distinct (%): 25.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1017330.1
Minimum: 0
Maximum: 9.6808062 × 10^9
Zeros: 77
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 500
median: 2100
Q3: 7485
95-th percentile: 98075
Maximum: 9.6808062 × 10^9
Range: 9.6808062 × 10^9
Interquartile range (IQR): 6985

Descriptive statistics

Standard deviation: 96809896
Coefficient of variation (CV): 95.160754
Kurtosis: 9999.0384
Mean: 1017330.1
Median Absolute Deviation (MAD): 1940
Skewness: 99.992806
Sum: 1.0173301 × 10^10
Variance: 9.372156 × 10^15
Monotonicity: Not monotonic
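
The mean, standard deviation, and skewness of this variable appear to be driven almost entirely by its single largest value (9680806200). The sketch below uses only figures from this report to compute that value's share of the column sum. It then shows that a hypothetical sample with one dominant value and zeros everywhere else has a skewness of √n = 100 for n = 10000, close to the reported 99.992806. It assumes the reported skewness is the bias-adjusted Fisher-Pearson estimator (the one pandas' `Series.skew` computes); the function name and the synthetic sample are illustrative only.

```python
import math

n = 10_000                 # number of observations
largest = 9_680_806_200    # maximum value of the variable
total = 1.0173301e10       # reported Sum (rounded as displayed)

# Share of the column total contributed by the single largest value.
share = largest / total

def adjusted_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (the G1 estimator)."""
    k = len(values)
    mu = math.fsum(values) / k
    m2 = math.fsum((v - mu) ** 2 for v in values) / k
    m3 = math.fsum((v - mu) ** 3 for v in values) / k
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(k * (k - 1)) / (k - 2)

# Hypothetical sample: one dominant value, every other entry zero.
synthetic = [0.0] * (n - 1) + [float(largest)]
skew = adjusted_skewness(synthetic)

print(f"{share:.1%}")
print(round(skew, 3))
```

Because one record carries roughly 95% of the total, the median (2100) and IQR (6985) are likely better summaries of a typical record than the mean.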
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1000 | 473 | 4.7%
2000 | 311 | 3.1%
100 | 304 | 3.0%
500 | 298 | 3.0%
200 | 269 | 2.7%
3000 | 222 | 2.2%
300 | 211 | 2.1%
50 | 199 | 2.0%
5000 | 175 | 1.8%
1500 | 161 | 1.6%
Other values (2543) | 7377 | 73.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 77 | 0.8%
1 | 32 | 0.3%
2 | 14 | 0.1%
3 | 12 | 0.1%
4 | 2 | < 0.1%
5 | 27 | 0.3%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 2 | < 0.1%
10 | 104 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
9680806200 | 1 | < 0.1%
49000000 | 1 | < 0.1%
25500000 | 1 | < 0.1%
24318000 | 1 | < 0.1%
13687500 | 1 | < 0.1%
11200000 | 1 | < 0.1%
8409600 | 1 | < 0.1%
6000000 | 1 | < 0.1%
5300000 | 1 | < 0.1%
5100000 | 1 | < 0.1%

인증기간(시작일)
DateTime

Distinct: 363
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-03-25 00:00:00
Maximum: 2022-03-24 00:00:00
Histogram with fixed size bins (bins=50)

인증기간(종료일)
DateTime

Distinct: 363
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-03-24 00:00:00
Maximum: 2023-03-23 00:00:00
Histogram with fixed size bins (bins=50)

원재료인증구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9167
무농약농산물 | 332
유기농산물 | 283
무항생제축산물 | 191
유기가공식품 | 14
Other values (2) | 13

Length

Max length: 7
Median length: 4
Mean length: 4.1555
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9167 | 91.7%
무농약농산물 | 332 | 3.3%
유기농산물 | 283 | 2.8%
무항생제축산물 | 191 | 1.9%
유기가공식품 | 14 | 0.1%
유기축산물 | 10 | 0.1%
취급자 | 3 | < 0.1%

Length

Histogram of lengths of the category
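
The 80.1% imbalance reported for this variable can be reproduced from the category counts listed under Common Values. The sketch below assumes the score is one minus the Shannon entropy of the category distribution divided by its maximum possible value, log2 of the number of categories. That definition is an assumption here, chosen because it reproduces the reported figure; the function name is illustrative.

```python
import math

# Category counts copied from the Common Values table for 원재료인증구분.
counts = [9167, 332, 283, 191, 14, 10, 3]

def imbalance_score(counts):
    """1 minus Shannon entropy normalised by its maximum, log2(k)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean all seven categories are equally frequent, and a score near 1 means a single category dominates. Here that category is `<NA>`, with 91.7% of the rows.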


Correlations

Variable | 인증번호 | 인증종류명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 원재료인증구분
인증번호 | 1.000 | 0.408 | 0.000 | 0.000 | 0.210
인증종류명 | 0.408 | 1.000 | 0.084 | 0.027 | NaN
재배(작업장)면적(사육두수) | 0.000 | 0.084 | 1.000 | 0.000 | NaN
생산(수입)계획량 | 0.000 | 0.027 | 0.000 | 1.000 | 0.000
원재료인증구분 | 0.210 | NaN | NaN | 0.000 | 1.000

Variable | 원재료인증구분 | 인증종류명
원재료인증구분 | 1.000 | 1.000
인증종류명 | 1.000 | 1.000

Variable | 인증번호 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증종류명 | 원재료인증구분
인증번호 | 1.000 | 0.390 | 0.110 | 0.214 | 0.106
재배(작업장)면적(사육두수) | 0.390 | 1.000 | 0.516 | 0.051 | 1.000
생산(수입)계획량 | 0.110 | 0.516 | 1.000 | 0.020 | 0.000
인증종류명 | 0.214 | 0.051 | 0.020 | 1.000 | 1.000
원재료인증구분 | 0.106 | 1.000 | 0.000 | 1.000 | 1.000
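
The "High correlation" alerts near the top of the report are consistent with applying a cut-off to the last of the three matrices above. The sketch below assumes a cut-off of 0.5; the report itself does not state the threshold, and the variable names in the code are illustrative.

```python
# Last correlation matrix above, copied as displayed.
names = ["인증번호", "재배(작업장)면적(사육두수)", "생산(수입)계획량",
         "인증종류명", "원재료인증구분"]
matrix = [
    [1.000, 0.390, 0.110, 0.214, 0.106],
    [0.390, 1.000, 0.516, 0.051, 1.000],
    [0.110, 0.516, 1.000, 0.020, 0.000],
    [0.214, 0.051, 0.020, 1.000, 1.000],
    [0.106, 1.000, 0.000, 1.000, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off for "highly correlated"

# For each variable, collect the other variables at or above the cut-off.
flagged = {
    name: [names[j] for j, r in enumerate(row) if j != i and r >= THRESHOLD]
    for i, (name, row) in enumerate(zip(names, matrix))
}

for name, partners in flagged.items():
    if partners:
        print(name, "->", ", ".join(partners))
```

Under that assumption the flagged pairs match the four alerts: 인증번호 has no partner, while each of the other four variables has at least one.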

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

인증번호인증종류명인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분
6545014100197유기농산물297921002021-09-292022-09-28<NA>
4260612100760유기농산물바질20402021-05-142022-05-13<NA>
2344910306269무농약농산물양파84817202022-02-222023-02-21<NA>
5692013302093무농약농산물아로니아10401002021-09-162022-09-15<NA>
4584812303098무농약농산물찰벼450227002021-10-162022-10-15<NA>
5612713301894무농약농산물651245002021-08-072022-08-06<NA>
3002311100002유기농산물고추5003002021-07-112022-07-10<NA>
3118211100468유기농산물할라피뇨(일반)33010002022-01-102023-01-09<NA>
5875013303318무농약농산물66035002021-05-212022-05-20<NA>
3384011100923유기농산물곤드레나물33001002021-06-072022-06-06<NA>
인증번호인증종류명인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분
2942410600983취급자한우(식육)2850002021-10-262022-10-25무항생제축산물
3545711303389무농약농산물찹쌀(일반)519926002021-10-202022-10-19<NA>
4522312302856무농약농산물붉은팥63859502021-08-222022-08-21<NA>
7546514501121무항생제축산물오리(식육)1260002772002021-12-142022-12-13<NA>
2459210306630무농약농산물바실184016702021-10-222022-10-21<NA>
5311013100697유기농산물깔라만시1881002022-01-092023-01-08<NA>
35965100001유기농산물쑥갓1501502021-09-102022-09-09<NA>
3534811303349무농약농산물3191502021-10-162022-10-15<NA>
1600310304338무농약농산물실파3504002021-11-282022-11-27<NA>
1996010305181무농약농산물녹두49104402021-05-152022-05-14<NA>

Duplicate rows

Most frequently occurring

인증번호인증종류명인증품목명재배(작업장)면적(사육두수)생산(수입)계획량인증기간(시작일)인증기간(종료일)원재료인증구분# duplicates
915101079유기농산물297513392021-09-182022-09-17<NA>6
110303451무농약농산물198312002021-08-272022-08-26<NA>3
210303454무농약농산물198312002021-08-272022-08-26<NA>3
04100031유기농산물얼갈이배추30010002022-02-282023-02-27<NA>2
311100804유기농산물귀리30000300002021-07-112022-07-10<NA>2
412100978유기농산물324018002021-09-272022-09-26<NA>2
514100071유기농산물400028002021-08-312022-08-30<NA>2
614100071유기농산물800056002021-08-312022-08-30<NA>2
714100169유기농산물396727702021-08-272022-08-26<NA>2
815100613유기농산물300021002021-10-072022-10-06<NA>2