Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
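
The average record size follows from the two figures above it. A minimal arithmetic check, assuming the report divides total memory by the number of observations and that 1 KiB is 1024 bytes:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 644.5
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes, and the average is total / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```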

Variable types

Categorical: 4
Numeric: 2
Text: 1

Dataset

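Description: Planned inspection volumes for summer and autumn grains (하추곡검사계획량), managed by the National Agricultural Products Quality Management Service (국립농산물품질관리원). Fields: year, item name, administrative district, work category, quantity unit, quantity.
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20170915000000000843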

Alerts

실적구분명 has constant value "계획" (Constant)
품목명 is highly overall correlated with 업무구분명 (High correlation)
업무구분명 is highly overall correlated with 품목명 and 1 other field (High correlation)
수량단위 is highly overall correlated with 업무구분명 (High correlation)
수량단위 is highly imbalanced (64.0%) (Imbalance)

Reproduction

Analysis started: 2024-03-23 07:31:07.730253
Analysis finished: 2024-03-23 07:31:10.247932
Duration: 2.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
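
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption matching how the timestamps are printed above:

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-23 07:31:07.730253", FMT)
finished = datetime.strptime("2024-03-23 07:31:10.247932", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```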

Variables

실적구분명
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
계획 | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 계획
2nd row: 계획
3rd row: 계획
4th row: 계획
5th row: 계획

Common Values

Value | Count | Frequency (%)
계획 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
계획 | 10000 | 100.0%

년도
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.9325
Minimum: 1998
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 1999
Q1: 2003
Median: 2008
Q3: 2015
95-th percentile: 2020
Maximum: 2022
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.8804708
Coefficient of variation (CV): 0.0034249387
Kurtosis: -1.1849791
Mean: 2008.9325
Median Absolute Deviation (MAD): 6
Skewness: 0.21763479
Sum: 20089325
Variance: 47.340878
Monotonicity: Not monotonic
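
Several of the 년도 statistics above are derived from one another. The sketch below recomputes them from the reported mean, standard deviation, and quantiles, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations):

```python
# Values copied from the 년도 statistics above.
n = 10_000
mean, std = 2008.9325, 6.8804708
minimum, q1, q3, maximum = 1998, 2003, 2015, 2022

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance from the standard deviation
total = mean * n           # sum of all values
value_range = maximum - minimum
iqr = q3 - q1              # interquartile range

print(f"CV={cv:.10f} variance={variance:.6f} sum={total:.0f}")
print(f"range={value_range} IQR={iqr}")
```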
[Figure: histogram with fixed size bins (bins=25)]

Value | Count | Frequency (%)
2001 | 567 | 5.7%
2003 | 559 | 5.6%
2002 | 530 | 5.3%
2004 | 525 | 5.2%
2006 | 510 | 5.1%
2016 | 508 | 5.1%
2008 | 495 | 5.0%
2009 | 470 | 4.7%
2019 | 466 | 4.7%
2007 | 454 | 4.5%
Other values (15) | 4916 | 49.2%
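
The "Other values" row is the remainder after the ten most frequent years. A short check, assuming the frequencies are counts divided by the 10000 observations:

```python
# Ten most frequent 년도 values and their counts, copied from the table above.
top10 = {
    2001: 567, 2003: 559, 2002: 530, 2004: 525, 2006: 510,
    2016: 508, 2008: 495, 2009: 470, 2019: 466, 2007: 454,
}
n = 10_000

# Everything not in the top ten is grouped under "Other values".
other_count = n - sum(top10.values())
other_share = f"{100 * other_count / n:.1f}%"
print(other_count, other_share)
```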

Value | Count | Frequency (%)
1998 | 257 | 2.6%
1999 | 441 | 4.4%
2000 | 443 | 4.4%
2001 | 567 | 5.7%
2002 | 530 | 5.3%
2003 | 559 | 5.6%
2004 | 525 | 5.2%
2005 | 424 | 4.2%
2006 | 510 | 5.1%
2007 | 454 | 4.5%

Value | Count | Frequency (%)
2022 | 67 | 0.7%
2021 | 351 | 3.5%
2020 | 371 | 3.7%
2019 | 466 | 4.7%
2018 | 299 | 3.0%
2017 | 342 | 3.4%
2016 | 508 | 5.1%
2015 | 349 | 3.5%
2014 | 297 | 3.0%
2013 | 297 | 3.0%

품목명
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
벼종자 | 6188
<NA> | 1649
겉보리종자 | 762
(blank) | 542
맥주보리종자 | 367
Other values (5) | 492

Length

Max length: 6
Median length: 3
Mean length: 3.3689
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 벼종자
3rd row: <NA>
4th row: 벼종자
5th row: (blank)

Common Values

Value | Count | Frequency (%)
벼종자 | 6188 | 61.9%
<NA> | 1649 | 16.5%
겉보리종자 | 762 | 7.6%
(blank) | 542 | 5.4%
맥주보리종자 | 367 | 3.7%
옥수수종자 | 244 | 2.4%
밀종자 | 225 | 2.2%
팥종자 | 12 | 0.1%
콩나물콩 | 8 | 0.1%
녹두종자 | 3 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
벼종자 | 6188 | 61.9%
na | 1649 | 16.5%
겉보리종자 | 762 | 7.6%
(blank) | 542 | 5.4%
맥주보리종자 | 367 | 3.7%
옥수수종자 | 244 | 2.4%
밀종자 | 225 | 2.2%
팥종자 | 12 | 0.1%
콩나물콩 | 8 | 0.1%
녹두종자 | 3 | < 0.1%

행정구역명
Text

Distinct: 257
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 8
Mean length: 7.9292
Min length: 6

Characters and Unicode

Total characters: 79292
Distinct characters: 134
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
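
As an illustration of the character properties mentioned above, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. It runs only on the five sample rows listed further down in this section, not on the full column, so the totals differ from the report's; `Lo` is the "Other Letter" category and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter

# The five sample rows shown for this variable in the report.
samples = [
    "전라북도 군산시",
    "경상북도 칠곡군",
    "경상북도 안동시",
    "전라북도 장수군",
    "경상북도 의성군",
]

# Tally the general category of every character across the samples.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)
print(dict(category_counts))
```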

Unique

Unique: 43
Unique (%): 0.4%

Sample

1st row: 전라북도 군산시
2nd row: 경상북도 칠곡군
3rd row: 경상북도 안동시
4th row: 전라북도 장수군
5th row: 경상북도 의성군

Value | Count | Frequency (%)
전라남도 | 1815 | 9.1%
경상남도 | 1430 | 7.1%
경상북도 | 1383 | 6.9%
전라북도 | 1105 | 5.5%
충청남도 | 934 | 4.7%
강원도 | 905 | 4.5%
경기도 | 810 | 4.0%
충청북도 | 651 | 3.2%
광주광역시 | 189 | 0.9%
인천광역시 | 161 | 0.8%
Other values (254) | 10658 | 53.2%
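
Most occurring characters

Value | Count | Frequency (%)
(space) | 10042 | 12.7%
(glyph missing) | 9292 | 11.7%
(glyph missing) | 5618 | 7.1%
(glyph missing) | 4701 | 5.9%
(glyph missing) | 4591 | 5.8%
(glyph missing) | 3826 | 4.8%
(glyph missing) | 3295 | 4.2%
(glyph missing) | 3136 | 4.0%
(glyph missing) | 2920 | 3.7%
(glyph missing) | 2891 | 3.6%
Other values (124) | 28980 | 36.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 69250 | 87.3%
Space Separator | 10042 | 12.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph missing) | 9292 | 13.4%
(glyph missing) | 5618 | 8.1%
(glyph missing) | 4701 | 6.8%
(glyph missing) | 4591 | 6.6%
(glyph missing) | 3826 | 5.5%
(glyph missing) | 3295 | 4.8%
(glyph missing) | 3136 | 4.5%
(glyph missing) | 2920 | 4.2%
(glyph missing) | 2891 | 4.2%
(glyph missing) | 1935 | 2.8%
Other values (123) | 27045 | 39.1%

Space Separator
Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 69250 | 87.3%
Common | 10042 | 12.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph missing) | 9292 | 13.4%
(glyph missing) | 5618 | 8.1%
(glyph missing) | 4701 | 6.8%
(glyph missing) | 4591 | 6.6%
(glyph missing) | 3826 | 5.5%
(glyph missing) | 3295 | 4.8%
(glyph missing) | 3136 | 4.5%
(glyph missing) | 2920 | 4.2%
(glyph missing) | 2891 | 4.2%
(glyph missing) | 1935 | 2.8%
Other values (123) | 27045 | 39.1%

Common
Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 69238 | 87.3%
ASCII | 10042 | 12.7%
Compat Jamo | 12 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Hangul
Value | Count | Frequency (%)
(glyph missing) | 9292 | 13.4%
(glyph missing) | 5618 | 8.1%
(glyph missing) | 4701 | 6.8%
(glyph missing) | 4591 | 6.6%
(glyph missing) | 3826 | 5.5%
(glyph missing) | 3295 | 4.8%
(glyph missing) | 3136 | 4.5%
(glyph missing) | 2920 | 4.2%
(glyph missing) | 2891 | 4.2%
(glyph missing) | 1935 | 2.8%
Other values (122) | 27033 | 39.0%

Compat Jamo
Value | Count | Frequency (%)
(glyph missing) | 12 | 100.0%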

업무구분명
Categorical

HIGH CORRELATION

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
공공비축벼 검사(산물) | 2108
하곡검사 포대벼검사(40kg) | 1988
공공비축벼 포대벼검사(40kg),(800kg) | 1720
공공비축벼 포대벼검사(40kg) | 1709
잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 1301
Other values (12) | 1174

Length

Max length: 38
Median length: 32
Mean length: 18.9332
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하곡검사 포대벼검사(40kg)
2nd row: 공공비축벼 포대벼검사(40kg)
3rd row: 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 )
4th row: 공공비축벼 검사(산물)
5th row: 콩검사(일반콩)(20kg),(40kg),(800kg),(1000kg)

Common Values

Value | Count | Frequency (%)
공공비축벼 검사(산물) | 2108 | 21.1%
하곡검사 포대벼검사(40kg) | 1988 | 19.9%
공공비축벼 포대벼검사(40kg),(800kg) | 1720 | 17.2%
공공비축벼 포대벼검사(40kg) | 1709 | 17.1%
잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 1301 | 13.0%
농협시가매입(시장격리곡) | 290 | 2.9%
공공비축벼 피해벼 | 236 | 2.4%
콩검사(일반콩)(20kg),(40kg),(800kg),(1000kg) | 167 | 1.7%
공공비축벼 포대벼검사(800kg) | 130 | 1.3%
하곡검사 (산물) | 116 | 1.2%
Other values (7) | 235 | 2.4%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
공공비축벼 | 5903 | 25.4%
포대벼검사(40kg | 3697 | 15.9%
검사(산물 | 2108 | 9.1%
하곡검사 | 2104 | 9.0%
포대벼검사(40kg),(800kg | 1720 | 7.4%
잡곡검사(콩 | 1312 | 5.6%
이전 | 1312 | 5.6%
(blank) | 1312 | 5.6%
옥수수)(40kg | 1301 | 5.6%
2019.01 | 1301 | 5.6%
Other values (15) | 1190 | 5.1%

수량단위
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
40kg/대 | 7531
Kg | 1845
800kg/톤백 | 248
30Kg/대 | 177
20Kg/대 | 80
Other values (4) | 119

Length

Max length: 8
Median length: 6
Mean length: 5.2896
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 40kg/대
2nd row: 40kg/대
3rd row: 40kg/대
4th row: Kg
5th row: 40kg/대

Common Values

Value | Count | Frequency (%)
40kg/대 | 7531 | 75.3%
Kg | 1845 | 18.4%
800kg/톤백 | 248 | 2.5%
30Kg/대 | 177 | 1.8%
20Kg/대 | 80 | 0.8%
Ton | 80 | 0.8%
600kg/대 | 21 | 0.2%
20kg/대 | 17 | 0.2%
4kg/대 | 1 | < 0.1%
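
The 64.0% in the Imbalance alert for 수량단위 is consistent with a normalized-entropy score, 1 - H / log(k), computed over the nine class counts above. The sketch below is one reading of that figure under this assumption; `imbalance_score` is a hypothetical helper and not necessarily the routine ydata-profiling runs internally.

```python
import math

# Class counts for 수량단위, copied from the Common Values table above.
counts = {
    "40kg/대": 7531, "Kg": 1845, "800kg/톤백": 248, "30Kg/대": 177,
    "20Kg/대": 80, "Ton": 80, "600kg/대": 21, "20kg/대": 17, "4kg/대": 1,
}

def imbalance_score(class_counts):
    """Hypothetical helper: 1 - (Shannon entropy / maximum entropy).

    Gives 0 for a perfectly uniform distribution and approaches 1 as a
    single class dominates.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k <= 1:
        return 0.0  # undefined for a single class; 0.0 chosen here by convention
    entropy = -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log(k)

score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

Because the score is a ratio of entropies, the logarithm base does not affect the result.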

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
40kg/대 | 7531 | 75.3%
kg | 1845 | 18.4%
800kg/톤백 | 248 | 2.5%
30kg/대 | 177 | 1.8%
20kg/대 | 97 | 1.0%
ton | 80 | 0.8%
600kg/대 | 21 | 0.2%
4kg/대 | 1 | < 0.1%
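
The lowercased table differs from the Common Values table only in that "20Kg/대" (80) and "20kg/대" (17) collapse into one entry of 97, which suggests the raw column mixes letter case. A minimal sketch of that merge, assuming plain lowercasing is the only normalization applied:

```python
from collections import Counter

# Raw 수량단위 counts, copied from the Common Values table.
raw_counts = {
    "40kg/대": 7531, "Kg": 1845, "800kg/톤백": 248, "30Kg/대": 177,
    "20Kg/대": 80, "Ton": 80, "600kg/대": 21, "20kg/대": 17, "4kg/대": 1,
}

# Merge categories that differ only by letter case.
merged = Counter()
for unit, count in raw_counts.items():
    merged[unit.lower()] += count

print(len(raw_counts), "->", len(merged), "distinct values")
print(merged["20kg/대"])
```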

수량
Real number (ℝ)

Distinct: 7822
Distinct (%): 78.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 432189.74
Minimum: 2
Maximum: 22567000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 95
Q1: 1629.5
Median: 19112
Q3: 134776
95-th percentile: 2631238
Maximum: 22567000
Range: 22566998
Interquartile range (IQR): 133146.5

Descriptive statistics

Standard deviation: 1375819.4
Coefficient of variation (CV): 3.183369
Kurtosis: 43.591171
Mean: 432189.74
Median Absolute Deviation (MAD): 18922
Skewness: 5.6372468
Sum: 4.3218974 × 10^9
Variance: 1.8928791 × 10^12
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
500 | 31 | 0.3%
15000 | 26 | 0.3%
5000 | 26 | 0.3%
2000 | 22 | 0.2%
3000 | 22 | 0.2%
125 | 22 | 0.2%
100 | 22 | 0.2%
1000 | 22 | 0.2%
10000 | 20 | 0.2%
4000 | 19 | 0.2%
Other values (7812) | 9768 | 97.7%

Value | Count | Frequency (%)
2 | 4 | < 0.1%
3 | 11 | 0.1%
4 | 5 | 0.1%
5 | 7 | 0.1%
6 | 7 | 0.1%
7 | 7 | 0.1%
8 | 3 | < 0.1%
9 | 5 | 0.1%
10 | 15 | 0.1%
11 | 9 | 0.1%

Value | Count | Frequency (%)
22567000 | 1 | < 0.1%
20633800 | 1 | < 0.1%
18893280 | 1 | < 0.1%
18170720 | 1 | < 0.1%
17480560 | 1 | < 0.1%
16436840 | 1 | < 0.1%
15230640 | 1 | < 0.1%
13845360 | 1 | < 0.1%
12851880 | 1 | < 0.1%
12579960 | 1 | < 0.1%

Interactions

[Figures: four interaction plots, not reproduced in this text version]

Correlations

[Figure: correlation heatmap]

 | 년도 | 품목명 | 업무구분명 | 수량단위 | 수량
년도 | 1.000 | 0.390 | 0.672 | 0.374 | 0.188
품목명 | 0.390 | 1.000 | 0.838 | 0.284 | 0.104
업무구분명 | 0.672 | 0.838 | 1.000 | 0.902 | 0.273
수량단위 | 0.374 | 0.284 | 0.902 | 1.000 | 0.385
수량 | 0.188 | 0.104 | 0.273 | 0.385 | 1.000

[Figure: correlation heatmap]

 | 업무구분명 | 수량단위 | 품목명
업무구분명 | 1.000 | 0.654 | 0.547
수량단위 | 0.654 | 1.000 | 0.143
품목명 | 0.547 | 0.143 | 1.000

[Figure: correlation heatmap]

 | 년도 | 수량 | 품목명 | 업무구분명 | 수량단위
년도 | 1.000 | -0.028 | 0.186 | 0.340 | 0.181
수량 | -0.028 | 1.000 | 0.047 | 0.109 | 0.186
품목명 | 0.186 | 0.047 | 1.000 | 0.547 | 0.143
업무구분명 | 0.340 | 0.109 | 0.547 | 1.000 | 0.654
수량단위 | 0.181 | 0.186 | 0.143 | 0.654 | 1.000
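
The High correlation alerts in the Overview line up with the last matrix above if pairs are flagged at an absolute correlation of 0.5 or more. The sketch below applies that cutoff; the 0.5 threshold is an assumption chosen because it reproduces the alerts, not a value stated in this report.

```python
from itertools import combinations

# Last correlation matrix above, in its printed column order.
columns = ["년도", "수량", "품목명", "업무구분명", "수량단위"]
matrix = [
    [1.000, -0.028, 0.186, 0.340, 0.181],
    [-0.028, 1.000, 0.047, 0.109, 0.186],
    [0.186, 0.047, 1.000, 0.547, 0.143],
    [0.340, 0.109, 0.547, 1.000, 0.654],
    [0.181, 0.186, 0.143, 0.654, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Visit each unordered pair once and keep those at or above the cutoff.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]
for a, b, value in high_pairs:
    print(f"{a} <-> {b}: {value:.3f}")
```

Under this cutoff 업무구분명 appears in both flagged pairs, which matches the alert that it is correlated with 품목명 and one other field.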

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

(index) | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
1359 | 계획 | 2000 | <NA> | 전라북도 군산시 | 하곡검사 포대벼검사(40kg) | 40kg/대 | 132120
131 | 계획 | 1998 | 벼종자 | 경상북도 칠곡군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 82739
3245 | 계획 | 2003 | <NA> | 경상북도 안동시 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 5685
2419 | 계획 | 2002 | 벼종자 | 전라북도 장수군 | 공공비축벼 검사(산물) | Kg | 1772800
11089 | 계획 | 2019 | (blank) | 경상북도 의성군 | 콩검사(일반콩)(20kg),(40kg),(800kg),(1000kg) | 40kg/대 | 12000
3005 | 계획 | 2003 | 벼종자 | 광주광역시 북구 | 공공비축벼 검사(산물) | Kg | 1045000
2485 | 계획 | 2002 | 옥수수종자 | 경기도 포천시 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 1364
4306 | 계획 | 2005 | 벼종자 | 충청남도 홍성군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 194238
1327 | 계획 | 2000 | <NA> | 전라남도 담양군 | 하곡검사 포대벼검사(40kg) | 40kg/대 | 37935
8687 | 계획 | 2014 | 벼종자 | 충청남도 부여군 | 공공비축벼 검사(산물) | 40kg/대 | 42384

(index) | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
7483 | 계획 | 2011 | 벼종자 | 경기도 가평군 | 공공비축벼 포대벼검사(40kg),(800kg) | 40kg/대 | 22000
11124 | 계획 | 2019 | (blank) | 전라북도 정읍시 | 콩검사(일반콩)(20kg),(40kg),(800kg),(1000kg) | Ton | 701
8713 | 계획 | 2014 | 벼종자 | 충청북도 옥천군 | 공공비축벼 포대벼검사(40kg),(800kg) | 40kg/대 | 81149
3532 | 계획 | 2004 | 벼종자 | 강원도 속초시 | 공공비축벼 검사(산물) | Kg | 698520
103 | 계획 | 1998 | 벼종자 | 경상북도 군위군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 70893
5971 | 계획 | 2008 | 벼종자 | 광주광역시 북구 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 38316
10284 | 계획 | 2018 | 벼종자 | 강원도 동해시 | 공공비축벼 포대벼검사(40kg),(800kg) | 40kg/대 | 929
1771 | 계획 | 2001 | 벼종자 | 전라남도 진도군 | 농협시가매입(시장격리곡) | 40kg/대 | 122820
9981 | 계획 | 2017 | 벼종자 | 경상북도 영덕군 | 공공비축벼 검사(산물) | 40kg/대 | 40426
10777 | 계획 | 2019 | 벼종자 | 경상북도 김천시 | 공공비축벼 검사(산물) | 40kg/대 | 26000
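
The sample rows show that 수량 is expressed in different units from row to row (40kg/대, Kg, Ton), so raw quantities are not directly comparable. The sketch below converts a few sample rows to kilograms. `kg_per_unit` is a hypothetical helper, and it rests on assumptions the report does not state: that "Nkg/..." means one counted unit holds N kilograms, and that "Ton" is a metric ton of 1000 kg.

```python
import re

def kg_per_unit(unit: str) -> float:
    """Hypothetical helper: kilograms represented by one counted unit."""
    normalized = unit.strip().lower()
    if normalized == "kg":
        return 1.0
    if normalized == "ton":
        return 1000.0  # assumption: metric ton
    # Assumption: "40kg/대", "800kg/톤백" etc. mean N kg per bag or container.
    match = re.fullmatch(r"(\d+)kg/.+", normalized)
    if match:
        return float(match.group(1))
    raise ValueError(f"unrecognized unit: {unit!r}")

# (수량단위, 수량) pairs taken from three of the sample rows above.
rows = [("40kg/대", 132120), ("Kg", 1772800), ("Ton", 701)]

quantities_kg = [qty * kg_per_unit(unit) for unit, qty in rows]
for (unit, qty), kg in zip(rows, quantities_kg):
    print(f"{qty} x {unit} = {kg:.0f} kg")
```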