Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 17
Duplicate rows (%): 0.2%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
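
The derived figures in this block follow from the raw counts. As a minimal sketch (the variable names are illustrative, and it assumes 1 KiB = 1024 B), the duplicate-row percentage and the average record size can be rechecked like this:

```python
# Figures copied from the dataset statistics above.
n_rows = 10_000
n_duplicate_rows = 17
total_size_kib = 732.4

# Share of duplicate rows, in percent.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size: total bytes (assuming 1 KiB = 1024 B) per row.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 0.2%
print(f"{avg_record_bytes:.1f} B")  # 75.0 B
```

The exact duplicate share is 0.17%, which the report rounds to 0.2%.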

Variable types

Numeric: 3
Categorical: 2
Text: 1
DateTime: 2

Dataset

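Description: Certification information for agricultural and livestock products that hold eco-friendly certification (pesticide-free agricultural products, organic agricultural products, organic livestock products, and antibiotic-free livestock products), including the certification number, certification type, certified item, cultivation area, planned production volume, certification period, and raw-material certification category.
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20181019000000000977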

Alerts

Duplicates: Dataset has 17 (0.2%) duplicate rows
High correlation: 원재료인증구분 is highly overall correlated with 재배(작업장)면적(사육두수) and 1 other field
High correlation: 인증종류명 is highly overall correlated with 원재료인증구분
High correlation: 재배(작업장)면적(사육두수) is highly overall correlated with 원재료인증구분
Imbalance: 원재료인증구분 is highly imbalanced (80.9%)
Skewed: 생산(수입)계획량 is highly skewed (γ1 = 97.7927222)
Zeros: 생산(수입)계획량 has 101 (1.0%) zeros

Reproduction

Analysis started: 2024-03-23 07:39:57.070178
Analysis finished: 2024-03-23 07:40:02.031962
Duration: 4.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
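
The reported duration is the difference between the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative) recovers it:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-03-23 07:39:57.070178")
finished = datetime.fromisoformat("2024-03-23 07:40:02.031962")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")  # 4.96 seconds
```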

Variables

인증번호
Real number (ℝ)

Distinct: 5902
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12270640
Minimum: 1300011
Maximum: 15105992
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1300011
5-th percentile: 6600390.7
Q1: 10501556
Median: 12600366
Q3: 14501168
95-th percentile: 15104197
Maximum: 15105992
Range: 13805981
Interquartile range (IQR): 3999611

Descriptive statistics

Standard deviation: 2774133.7
Coefficient of variation (CV): 0.22607898
Kurtosis: 3.2553535
Mean: 12270640
Median Absolute Deviation (MAD): 2000065.5
Skewness: -1.5327212
Sum: 1.227064 × 10¹¹
Variance: 7.695818 × 10¹²
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
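
Several of the statistics for 인증번호 are simple functions of the others. The sketch below recomputes them from the values as displayed in the report, so small rounding gaps are expected. The variable names are illustrative.

```python
# Values copied from the 인증번호 statistics as displayed (already rounded).
mean = 12270640
std = 2774133.7
minimum, maximum = 1300011, 15105992
q1, q3 = 10501556, 14501168

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.6e}")
print(f"Range    = {value_range}")
# The report shows 3999611; the gap of 1 presumably comes from the
# quartiles being rounded for display.
print(f"IQR      = {iqr}")
```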

Common values

Value | Count | Frequency (%)
12100774 | 38 | 0.4%
4100031 | 25 | 0.2%
13100916 | 25 | 0.2%
10100687 | 24 | 0.2%
13100914 | 24 | 0.2%
13100449 | 23 | 0.2%
14100057 | 21 | 0.2%
13100641 | 20 | 0.2%
14303158 | 16 | 0.2%
13100697 | 16 | 0.2%
Other values (5892) | 9768 | 97.7%

Minimum 10 values

Value | Count | Frequency (%)
1300011 | 1 | < 0.1%
1300014 | 1 | < 0.1%
1300016 | 1 | < 0.1%
1300018 | 2 | < 0.1%
1300019 | 2 | < 0.1%
1300020 | 1 | < 0.1%
1300021 | 2 | < 0.1%
1300022 | 2 | < 0.1%
1300024 | 2 | < 0.1%
1301135 | 8 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
15105992 | 1 | < 0.1%
15105991 | 1 | < 0.1%
15105988 | 1 | < 0.1%
15105987 | 1 | < 0.1%
15105978 | 1 | < 0.1%
15105967 | 2 | < 0.1%
15105961 | 1 | < 0.1%
15105956 | 1 | < 0.1%
15105955 | 1 | < 0.1%
15105948 | 1 | < 0.1%

인증종류명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

유기농산물: 4825
무농약농산물: 3785
취급자: 856
무항생제축산물: 418
유기가공식품: 86
Other values (3): 30

Length

Max length: 9
Median length: 8
Mean length: 5.3073
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유기농산물
2nd row: 유기농산물
3rd row: 유기농산물
4th row: 유기농산물
5th row: 유기농산물

Common Values

Value | Count | Frequency (%)
유기농산물 | 4825 | 48.2%
무농약농산물 | 3785 | 37.9%
취급자 | 856 | 8.6%
무항생제축산물 | 418 | 4.2%
유기가공식품 | 86 | 0.9%
무농약원료가공식품 | 18 | 0.2%
유기축산물 | 10 | 0.1%
비식용유기가공품 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

인증품목명
Text

Distinct: 617
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 15
Mean length: 2.6588
Min length: 1

Characters and Unicode

Total characters: 26588
Distinct characters: 397
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180
Unique (%): 1.8%

Sample

1st row:
2nd row: 쪽파
3rd row:
4th row: 홍고추
5th row:

Value | Count | Frequency (%)
 | 2128 | 21.2%
찰벼 | 300 | 3.0%
한우(식육 | 261 | 2.6%
 | 233 | 2.3%
감자 | 178 | 1.8%
돼지(식육 | 137 | 1.4%
양파 | 131 | 1.3%
블루베리 | 115 | 1.1%
대파 | 115 | 1.1%
 | 106 | 1.1%
Other values (615) | 6329 | 63.1%

Most occurring characters

Value | Count | Frequency (%)
 | 2470 | 9.3%
 | 855 | 3.2%
 | 802 | 3.0%
 | 740 | 2.8%
( | 730 | 2.7%
) | 730 | 2.7%
 | 662 | 2.5%
 | 618 | 2.3%
 | 554 | 2.1%
 | 451 | 1.7%
Other values (387) | 17976 | 67.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25029 | 94.1%
Open Punctuation | 730 | 2.7%
Close Punctuation | 730 | 2.7%
Lowercase Letter | 54 | 0.2%
Space Separator | 33 | 0.1%
Uppercase Letter | 9 | < 0.1%
Other Punctuation | 2 | < 0.1%
Decimal Number | 1 | < 0.1%
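
The category names above are Unicode general categories. As a hedged illustration (this is not necessarily how the profiling tool computes them), Python's standard `unicodedata` module returns the two-letter codes behind those names: `Lo` for Other Letter, `Ps` for Open Punctuation, `Pe` for Close Punctuation. The three item names are taken from elsewhere in this report.

```python
import unicodedata
from collections import Counter

# Item names that appear elsewhere in this report.
items = ["쪽파", "홍고추", "브로코리(녹색꽃양배추)"]

# Normalize to NFC so each Hangul syllable is one code point,
# then tally the general category of every character.
category_counts = Counter(
    unicodedata.category(ch)
    for item in items
    for ch in unicodedata.normalize("NFC", item)
)

for code, count in category_counts.most_common():
    print(code, count)
```

For these three strings the tally is 15 characters in `Lo` and one each in `Ps` and `Pe`.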

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 2470 | 9.9%
 | 855 | 3.4%
 | 802 | 3.2%
 | 740 | 3.0%
 | 662 | 2.6%
 | 618 | 2.5%
 | 554 | 2.2%
 | 451 | 1.8%
 | 407 | 1.6%
 | 385 | 1.5%
Other values (368) | 17085 | 68.3%

Lowercase Letter

Value | Count | Frequency (%)
a | 14 | 25.9%
h | 8 | 14.8%
s | 6 | 11.1%
t | 6 | 11.1%
e | 4 | 7.4%
o | 4 | 7.4%
r | 4 | 7.4%
w | 4 | 7.4%
n | 4 | 7.4%

Uppercase Letter

Value | Count | Frequency (%)
K | 4 | 44.4%
O | 2 | 22.2%
E | 1 | 11.1%
Y | 1 | 11.1%
R | 1 | 11.1%

Open Punctuation

Value | Count | Frequency (%)
( | 730 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 730 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 33 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
. | 2 | 100.0%

Decimal Number

Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25029 | 94.1%
Common | 1496 | 5.6%
Latin | 63 | 0.2%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 2470 | 9.9%
 | 855 | 3.4%
 | 802 | 3.2%
 | 740 | 3.0%
 | 662 | 2.6%
 | 618 | 2.5%
 | 554 | 2.2%
 | 451 | 1.8%
 | 407 | 1.6%
 | 385 | 1.5%
Other values (368) | 17085 | 68.3%

Latin

Value | Count | Frequency (%)
a | 14 | 22.2%
h | 8 | 12.7%
s | 6 | 9.5%
t | 6 | 9.5%
e | 4 | 6.3%
o | 4 | 6.3%
r | 4 | 6.3%
w | 4 | 6.3%
n | 4 | 6.3%
K | 4 | 6.3%
Other values (4) | 5 | 7.9%

Common

Value | Count | Frequency (%)
( | 730 | 48.8%
) | 730 | 48.8%
(space) | 33 | 2.2%
. | 2 | 0.1%
4 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25029 | 94.1%
ASCII | 1559 | 5.9%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
 | 2470 | 9.9%
 | 855 | 3.4%
 | 802 | 3.2%
 | 740 | 3.0%
 | 662 | 2.6%
 | 618 | 2.5%
 | 554 | 2.2%
 | 451 | 1.8%
 | 407 | 1.6%
 | 385 | 1.5%
Other values (368) | 17085 | 68.3%

ASCII

Value | Count | Frequency (%)
( | 730 | 46.8%
) | 730 | 46.8%
(space) | 33 | 2.1%
a | 14 | 0.9%
h | 8 | 0.5%
s | 6 | 0.4%
t | 6 | 0.4%
e | 4 | 0.3%
o | 4 | 0.3%
r | 4 | 0.3%
Other values (9) | 20 | 1.3%

재배(작업장)면적(사육두수)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4971
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7074.596
Minimum: 0.1
Maximum: 760000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 40
Q1: 311.75
Median: 1653
Q3: 5523.5
95-th percentile: 29436.95
Maximum: 760000
Range: 759999.9
Interquartile range (IQR): 5211.75

Descriptive statistics

Standard deviation: 24581.272
Coefficient of variation (CV): 3.4745831
Kurtosis: 396.96855
Mean: 7074.596
Median Absolute Deviation (MAD): 1530
Skewness: 16.423905
Sum: 70745960
Variance: 6.0423892 × 10⁸
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
100.0 | 247 | 2.5%
300.0 | 183 | 1.8%
200.0 | 171 | 1.7%
50.0 | 130 | 1.3%
1000.0 | 128 | 1.3%
500.0 | 127 | 1.3%
150.0 | 99 | 1.0%
330.0 | 97 | 1.0%
400.0 | 88 | 0.9%
20.0 | 82 | 0.8%
Other values (4961) | 8648 | 86.5%

Minimum 10 values

Value | Count | Frequency (%)
0.1 | 1 | < 0.1%
1.0 | 10 | 0.1%
2.0 | 11 | 0.1%
3.0 | 3 | < 0.1%
3.6 | 1 | < 0.1%
4.0 | 3 | < 0.1%
5.0 | 14 | 0.1%
6.0 | 4 | < 0.1%
6.8 | 1 | < 0.1%
7.4 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
760000.0 | 1 | < 0.1%
723883.0 | 1 | < 0.1%
720000.0 | 1 | < 0.1%
660000.0 | 1 | < 0.1%
640000.0 | 1 | < 0.1%
620000.0 | 1 | < 0.1%
390000.0 | 1 | < 0.1%
360000.0 | 1 | < 0.1%
350000.0 | 1 | < 0.1%
347941.1 | 1 | < 0.1%

생산(수입)계획량
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 2567
Distinct (%): 25.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77878.845
Minimum: 0
Maximum: 3.29 × 10⁸
Zeros: 101
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 500
Median: 2100
Q3: 7542.25
95-th percentile: 95002
Maximum: 3.29 × 10⁸
Range: 3.29 × 10⁸
Interquartile range (IQR): 7042.25

Descriptive statistics

Standard deviation: 3314799.6
Coefficient of variation (CV): 42.563543
Kurtosis: 9699.0488
Mean: 77878.845
Median Absolute Deviation (MAD): 1970
Skewness: 97.792722
Sum: 7.7878845 × 10⁸
Variance: 1.0987896 × 10¹³
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1000.0 | 414 | 4.1%
100.0 | 353 | 3.5%
2000.0 | 312 | 3.1%
500.0 | 279 | 2.8%
200.0 | 267 | 2.7%
300.0 | 211 | 2.1%
3000.0 | 181 | 1.8%
50.0 | 180 | 1.8%
5000.0 | 155 | 1.6%
1500.0 | 153 | 1.5%
Other values (2557) | 7495 | 75.0%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 101 | 1.0%
0.1 | 1 | < 0.1%
0.4 | 2 | < 0.1%
0.5 | 1 | < 0.1%
1.0 | 49 | 0.5%
1.4 | 1 | < 0.1%
1.5 | 1 | < 0.1%
2.0 | 7 | 0.1%
2.5 | 1 | < 0.1%
3.0 | 12 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
329000000.0 | 1 | < 0.1%
24900000.0 | 1 | < 0.1%
11000000.0 | 1 | < 0.1%
10560000.0 | 1 | < 0.1%
10000000.0 | 1 | < 0.1%
8994222.0 | 1 | < 0.1%
7846460.0 | 1 | < 0.1%
7800000.0 | 1 | < 0.1%
7500000.0 | 1 | < 0.1%
6000000.0 | 1 | < 0.1%
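
The extreme skew of 생산(수입)계획량 is largely driven by its single largest value. The sketch below uses only figures displayed in the report (the sum is shown to eight significant digits, so the results are approximate) to estimate how much of the total that one record carries and what the mean would be without it. The variable names are illustrative.

```python
# Figures copied from the 생산(수입)계획량 statistics as displayed.
total = 7.7878845e8      # Sum
n_rows = 10_000          # Number of observations
largest = 329_000_000.0  # Maximum (a single record)

mean = total / n_rows
share_of_largest = largest / total
mean_without_largest = (total - largest) / (n_rows - 1)

print(f"mean                 = {mean:.3f}")
print(f"share of largest row = {share_of_largest:.1%}")
print(f"mean without it      = {mean_without_largest:.1f}")
```

Under these figures, one record accounts for roughly 42% of the column total, and dropping it lowers the mean from about 77,879 to about 44,983, still far above the median of 2100.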

인증기간(시작일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-03-22 00:00:00
Maximum: 2023-03-21 00:00:00

Histogram with fixed size bins (bins=50)

인증기간(종료일)
DateTime

Distinct: 362
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-03-21 00:00:00
Maximum: 2024-03-20 00:00:00

Histogram with fixed size bins (bins=50)
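
The two date columns share the same number of distinct values, and their minima and maxima are offset by one year minus one day. That suggests each certificate runs for one year, although the report does not state this as a rule. The sketch below checks the pattern against three start/end pairs taken from the report's sample rows; the helper `expected_end` is a hypothetical name introduced here.

```python
from datetime import date, timedelta

def expected_end(start: date) -> date:
    """End date under the assumed rule: one year after start, minus one day.

    Note: date.replace raises ValueError for a 29 February start date;
    none of the pairs below hits that case.
    """
    return start.replace(year=start.year + 1) - timedelta(days=1)

# (start, end) pairs copied from the report's sample rows.
pairs = [
    (date(2022, 10, 25), date(2023, 10, 24)),
    (date(2022, 4, 4), date(2023, 4, 3)),
    (date(2023, 3, 10), date(2024, 3, 9)),
]

matches = [expected_end(start) == end for start, end in pairs]
print(matches)  # [True, True, True]
```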

원재료인증구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 9144
무농약농산물: 343
유기농산물: 296
무항생제축산물: 188
유기가공식품: 18
Other values (3): 11

Length

Max length: 9
Median length: 4
Mean length: 4.1608
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9144 | 91.4%
무농약농산물 | 343 | 3.4%
유기농산물 | 296 | 3.0%
무항생제축산물 | 188 | 1.9%
유기가공식품 | 18 | 0.2%
유기축산물 | 7 | 0.1%
무농약원료가공식품 | 3 | < 0.1%
비식용유기가공품 | 1 | < 0.1%
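
The 80.9% imbalance reported for 원재료인증구분 can be reproduced from these counts. The sketch assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for eight classes. The report text does not state the formula, so treat that definition as an assumption that happens to match the displayed value.

```python
import math

# Class counts copied from the Common Values table above (<NA> included).
counts = [9144, 343, 296, 188, 18, 7, 3, 1]
total = sum(counts)  # 10000

# Shannon entropy of the class distribution, in bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts)

# Assumed definition: 1 - entropy / maximum possible entropy.
imbalance = 1 - entropy / math.log2(len(counts))

print(f"{imbalance:.1%}")
```

A score of 0 would mean perfectly balanced classes and 1 would mean a single class, so a value near 0.81 reflects the 91.4% share held by `<NA>`.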

Length

Histogram of lengths of the category

Common Values (Plot)

Interactions


Correlations

Variable | 인증번호 | 인증종류명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 원재료인증구분
인증번호 | 1.000 | 0.423 | 0.010 | 0.000 | 0.275
인증종류명 | 0.423 | 1.000 | 0.242 | 0.026 | NaN
재배(작업장)면적(사육두수) | 0.010 | 0.242 | 1.000 | 0.000 | NaN
생산(수입)계획량 | 0.000 | 0.026 | 0.000 | 1.000 | 0.000
원재료인증구분 | 0.275 | NaN | NaN | 0.000 | 1.000

Variable | 원재료인증구분 | 인증종류명
원재료인증구분 | 1.000 | 1.000
인증종류명 | 1.000 | 1.000

Variable | 인증번호 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증종류명 | 원재료인증구분
인증번호 | 1.000 | 0.381 | 0.085 | 0.223 | 0.148
재배(작업장)면적(사육두수) | 0.381 | 1.000 | 0.497 | 0.083 | 1.000
생산(수입)계획량 | 0.085 | 0.497 | 1.000 | 0.019 | 0.000
인증종류명 | 0.223 | 0.083 | 0.019 | 1.000 | 1.000
원재료인증구분 | 0.148 | 1.000 | 0.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분
85651 | 15101907 | 유기농산물 |  | 4841.0 | 2661.0 | 2022-10-25 | 2023-10-24 | <NA>
31476 | 11100570 | 유기농산물 | 쪽파 | 700.0 | 1700.0 | 2022-04-04 | 2023-04-03 | <NA>
65050 | 14100169 | 유기농산물 |  | 5583.0 | 4190.0 | 2022-08-27 | 2023-08-26 | <NA>
66854 | 14100480 | 유기농산물 | 홍고추 | 300.0 | 400.0 | 2022-07-04 | 2023-07-03 | <NA>
98692 | 15105469 | 유기농산물 |  | 67052.0 | 43780.0 | 2022-09-22 | 2023-09-21 | <NA>
92285 | 15103569 | 유기농산물 | 찰벼 | 1050.0 | 720.0 | 2022-10-13 | 2023-10-12 | <NA>
69753 | 14302266 | 무농약농산물 |  | 1077.0 | 900.0 | 2022-09-27 | 2023-09-26 | <NA>
28904 | 10601055 | 취급자 | 조미채소류 | 752.4 | 10000.0 | 2022-04-12 | 2023-04-11 | 무농약농산물
99173 | 15105677 | 유기농산물 |  | 26400.0 | 18510.0 | 2022-10-04 | 2023-10-03 | <NA>
44089 | 12100978 | 유기농산물 |  | 2039.0 | 1200.0 | 2022-09-27 | 2023-09-26 | <NA>

Index | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분
17017 | 10304667 | 무농약농산물 | 양배추 | 70.0 | 200.0 | 2022-04-06 | 2023-04-05 | <NA>
12621 | 10303662 | 무농약농산물 |  | 2511.3 | 1750.0 | 2022-09-09 | 2023-09-08 | <NA>
79872 | 15100996 | 유기농산물 |  | 4919.7 | 2900.0 | 2022-09-15 | 2023-09-14 | <NA>
53154 | 13100659 | 유기농산물 | 딸기 | 3900.0 | 10000.0 | 2022-11-01 | 2023-10-31 | <NA>
19611 | 10305336 | 무농약농산물 | 들깨 | 5286.0 | 525.0 | 2022-06-19 | 2023-06-18 | <NA>
68733 | 14301997 | 무농약농산물 |  | 9024.0 | 5460.0 | 2022-06-26 | 2023-06-25 | <NA>
8935 | 10100852 | 유기농산물 | 적근대 | 300.0 | 700.0 | 2022-04-06 | 2023-04-05 | <NA>
76157 | 14600099 | 취급자 | 과일과채류 | 140.0 | 750.0 | 2023-03-10 | 2024-03-09 | 유기농산물
16095 | 10304506 | 무농약농산물 | 실파 | 662.0 | 3900.0 | 2023-02-25 | 2024-02-24 | <NA>
98780 | 15105511 | 유기농산물 |  | 2820.0 | 1270.0 | 2022-09-27 | 2023-09-26 | <NA>

Duplicate rows

Most frequently occurring

Index | 인증번호 | 인증종류명 | 인증품목명 | 재배(작업장)면적(사육두수) | 생산(수입)계획량 | 인증기간(시작일) | 인증기간(종료일) | 원재료인증구분 | # duplicates
6 | 14100169 | 유기농산물 |  | 3967.0 | 2980.0 | 2022-08-27 | 2023-08-26 | <NA> | 4
9 | 14302009 | 무농약농산물 | 찰벼 | 3967.0 | 2980.0 | 2022-09-28 | 2023-09-27 | <NA> | 3
10 | 15100477 | 유기농산물 |  | 1921.0 | 870.0 | 2022-08-27 | 2023-08-26 | <NA> | 3
15 | 15103595 | 유기농산물 |  | 3000.0 | 1740.0 | 2022-10-15 | 2023-10-14 | <NA> | 3
0 | 4100031 | 유기농산물 | 옥수수 | 300.0 | 100.0 | 2023-01-15 | 2024-01-14 | <NA> | 2
1 | 11100804 | 유기농산물 | 귀리 | 30000.0 | 30000.0 | 2022-07-11 | 2023-07-10 | <NA> | 2
2 | 12100774 | 유기농산물 | 브로코리(녹색꽃양배추) | 1653.0 | 3835.0 | 2022-05-25 | 2023-05-24 | <NA> | 2
3 | 12100979 | 유기농산물 |  | 2579.0 | 1430.0 | 2022-09-27 | 2023-09-26 | <NA> | 2
4 | 13100002 | 유기농산물 |  | 9900.0 | 2400.0 | 2022-08-12 | 2023-08-11 | <NA> | 2
5 | 14100071 | 유기농산물 |  | 4000.0 | 2600.0 | 2022-08-31 | 2023-08-30 | <NA> | 2
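
A duplicate-row table like this one can be built by counting identical records. The sketch below does so with the standard library on a small hypothetical subset of rows: four columns only, with row multiplicities invented for illustration. It reports two different tallies because the report text does not say which one its "17 duplicate rows" figure refers to.

```python
from collections import Counter

# Hypothetical toy rows: (인증번호, 인증종류명, area, planned volume).
rows = [
    (14100169, "유기농산물", 3967.0, 2980.0),
    (14100169, "유기농산물", 3967.0, 2980.0),
    (14100169, "유기농산물", 3967.0, 2980.0),
    (4100031, "유기농산물", 300.0, 100.0),
    (4100031, "유기농산물", 300.0, 100.0),
    (11100804, "유기농산물", 30000.0, 30000.0),
]

# Count how often each full record occurs; keep those seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

n_groups = len(duplicated)                        # distinct duplicated records
n_extra = sum(n - 1 for n in duplicated.values()) # copies beyond the first

print(n_groups, n_extra)  # 2 3
```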