Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
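
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function name is hypothetical and not part of the profiling tool):

```python
KIB = 1024  # bytes per KiB (assumption: binary kibibytes, as the unit label suggests)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row."""
    return total_kib * KIB / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(644.5, 10_000)
print(f"{avg_bytes:.1f} B")
```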

Variable types

Categorical: 4
Numeric: 2
Text: 1

Dataset

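Description: Planned summer and autumn grain inspection volumes (하추곡검사계획량) managed by the National Agricultural Products Quality Management Service (year, item name, administrative district, task category, quantity unit, quantity)
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20170915000000000843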

Alerts

실적구분명 has constant value "계획" (Constant)
품목명 is highly overall correlated with 업무구분명 (High correlation)
업무구분명 is highly overall correlated with 품목명 and 1 other field (High correlation)
수량단위 is highly overall correlated with 업무구분명 (High correlation)
수량단위 is highly imbalanced (68.3%) (Imbalance)

Reproduction

Analysis started: 2024-03-23 07:30:45.526898
Analysis finished: 2024-03-23 07:30:47.750821
Duration: 2.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

실적구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
계획 | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 계획
2nd row: 계획
3rd row: 계획
4th row: 계획
5th row: 계획

Common Values

Value | Count | Frequency (%)
계획 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
계획 | 10000 | 100.0%

년도
Real number (ℝ)

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2008.099
Minimum: 1998
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 1999
Q1: 2003
Median: 2007
Q3: 2014
95-th percentile: 2019
Maximum: 2020
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.3107513
Coefficient of variation (CV): 0.0031426495
Kurtosis: -1.1405005
Mean: 2008.099
Median Absolute Deviation (MAD): 5
Skewness: 0.23257113
Sum: 20080990
Variance: 39.825582
Monotonicity: Not monotonic
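
Several of the 년도 figures above are derived from a handful of base values. The sketch below recomputes them from the reported mean, standard deviation, and quantiles. It is only a consistency check on the published numbers, not the profiler's own code, and the variable names are illustrative.

```python
n_rows = 10_000
mean, std = 2008.099, 6.3107513
minimum, q1, q3, maximum = 1998, 2003, 2014, 2020

derived = {
    "range": maximum - minimum,   # Maximum - Minimum
    "iqr": q3 - q1,               # Q3 - Q1
    "cv": std / mean,             # coefficient of variation
    "variance": std ** 2,         # square of the standard deviation
    "sum": mean * n_rows,         # mean times number of observations
}

for name, value in derived.items():
    print(f"{name}: {value:.10g}")
```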
[Figure: Histogram with fixed size bins (bins=23)]

Common values

Value | Count | Frequency (%)
2001 | 590 | 5.9%
2003 | 588 | 5.9%
2004 | 582 | 5.8%
2002 | 579 | 5.8%
2006 | 549 | 5.5%
2016 | 531 | 5.3%
2008 | 529 | 5.3%
2009 | 511 | 5.1%
2019 | 507 | 5.1%
2007 | 488 | 4.9%
Other values (13) | 4546 | 45.5%

Minimum 10 values

Value | Count | Frequency (%)
1998 | 273 | 2.7%
1999 | 463 | 4.6%
2000 | 476 | 4.8%
2001 | 590 | 5.9%
2002 | 579 | 5.8%
2003 | 588 | 5.9%
2004 | 582 | 5.8%
2005 | 452 | 4.5%
2006 | 549 | 5.5%
2007 | 488 | 4.9%

Maximum 10 values

Value | Count | Frequency (%)
2020 | 111 | 1.1%
2019 | 507 | 5.1%
2018 | 334 | 3.3%
2017 | 367 | 3.7%
2016 | 531 | 5.3%
2015 | 391 | 3.9%
2014 | 312 | 3.1%
2013 | 319 | 3.2%
2012 | 264 | 2.6%
2011 | 370 | 3.7%

품목명
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
 | 6012
<NA> | 851
겉보리 | 832
쌀보리 | 819
콩(일반) | 486
Other values (9) | 1000

Length

Max length: 5
Median length: 1
Mean length: 1.9632
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row: 쌀보리
5th row: <NA>

Common Values

Value | Count | Frequency (%)
 | 6012 | 60.1%
<NA> | 851 | 8.5%
겉보리 | 832 | 8.3%
쌀보리 | 819 | 8.2%
콩(일반) | 486 | 4.9%
맥주보리 | 399 | 4.0%
옥수수 | 247 | 2.5%
 | 224 | 2.2%
양파 | 44 | 0.4%
마늘 | 33 | 0.3%
Other values (4) | 53 | 0.5%
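
The report lists 0 missing cells for 품목명, yet `<NA>` appears as a category value with 851 rows, which suggests it is stored as a literal string rather than a true null. A minimal sketch of how such placeholder strings could be counted, assuming a hypothetical placeholder list and using only the value counts that are legible above:

```python
# Hypothetical set of strings to treat as "missing in disguise" (an assumption).
PLACEHOLDERS = {"<na>", "na", "n/a", "null", "none"}


def is_placeholder(value: str) -> bool:
    """Return True if the string looks like a stand-in for a missing value."""
    return value.strip().lower() in PLACEHOLDERS


# Legible value counts copied from the Common Values table.
observed = {
    "<NA>": 851, "겉보리": 832, "쌀보리": 819, "콩(일반)": 486,
    "맥주보리": 399, "옥수수": 247, "양파": 44, "마늘": 33,
}
n_rows = 10_000

hidden_missing = sum(c for v, c in observed.items() if is_placeholder(v))
print(hidden_missing, f"{hidden_missing / n_rows:.1%}")
```

Converting such strings to real nulls before profiling would let the missing-value statistics reflect them.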

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
 | 6012 | 60.1%
na | 851 | 8.5%
겉보리 | 832 | 8.3%
쌀보리 | 819 | 8.2%
콩(일반 | 486 | 4.9%
맥주보리 | 399 | 4.0%
옥수수 | 247 | 2.5%
 | 224 | 2.2%
양파 | 44 | 0.4%
마늘 | 33 | 0.3%
Other values (4) | 53 | 0.5%

행정구역명
Text

Distinct: 254
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 8
Mean length: 7.9386
Min length: 6

Characters and Unicode

Total characters: 79386
Distinct characters: 134
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 0.3%

Sample

1st row: 경상남도 의령군
2nd row: 강원도 횡성군
3rd row: 경기도 포천시
4th row: 경상남도 고성군
5th row: 강원도 삼척시

Words

Value | Count | Frequency (%)
전라남도 | 1815 | 9.1%
경상남도 | 1475 | 7.4%
경상북도 | 1379 | 6.9%
전라북도 | 1058 | 5.3%
충청남도 | 927 | 4.6%
강원도 | 892 | 4.5%
경기도 | 817 | 4.1%
충청북도 | 630 | 3.1%
광주광역시 | 185 | 0.9%
인천광역시 | 162 | 0.8%
Other values (251) | 10701 | 53.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 10042 | 12.6%
 | 9256 | 11.7%
 | 5591 | 7.0%
 | 4704 | 5.9%
 | 4632 | 5.8%
 | 3878 | 4.9%
 | 3246 | 4.1%
 | 3101 | 3.9%
 | 2931 | 3.7%
 | 2873 | 3.6%
Other values (124) | 29132 | 36.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 69344 | 87.4%
Space Separator | 10042 | 12.6%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 9256 | 13.3%
 | 5591 | 8.1%
 | 4704 | 6.8%
 | 4632 | 6.7%
 | 3878 | 5.6%
 | 3246 | 4.7%
 | 3101 | 4.5%
 | 2931 | 4.2%
 | 2873 | 4.1%
 | 1890 | 2.7%
Other values (123) | 27242 | 39.3%

Space Separator

Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 69344 | 87.4%
Common | 10042 | 12.6%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 9256 | 13.3%
 | 5591 | 8.1%
 | 4704 | 6.8%
 | 4632 | 6.7%
 | 3878 | 5.6%
 | 3246 | 4.7%
 | 3101 | 4.5%
 | 2931 | 4.2%
 | 2873 | 4.1%
 | 1890 | 2.7%
Other values (123) | 27242 | 39.3%

Common

Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 69328 | 87.3%
ASCII | 10042 | 12.6%
Compat Jamo | 16 | < 0.1%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 10042 | 100.0%

Hangul

Value | Count | Frequency (%)
 | 9256 | 13.4%
 | 5591 | 8.1%
 | 4704 | 6.8%
 | 4632 | 6.7%
 | 3878 | 5.6%
 | 3246 | 4.7%
 | 3101 | 4.5%
 | 2931 | 4.2%
 | 2873 | 4.1%
 | 1890 | 2.7%
Other values (122) | 27226 | 39.3%

Compat Jamo

Value | Count | Frequency (%)
 | 16 | 100.0%

업무구분명
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
하곡검사 포대벼검사(40kg) | 2154
공공비축벼 검사(산물) | 2036
공공비축벼 포대벼검사(40kg) | 1865
공공비축벼 포대벼검사(40kg),(800kg) | 1655
잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 1375
Other values (10) | 915

Length

Max length: 33
Median length: 31
Mean length: 18.8652
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공공비축벼 검사(산물)
2nd row: 공공비축벼 포대벼검사(40kg)
3rd row: 공공비축벼 포대벼검사(800kg)
4th row: 하곡검사 (산물)
5th row: 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 )

Common Values

Value | Count | Frequency (%)
하곡검사 포대벼검사(40kg) | 2154 | 21.5%
공공비축벼 검사(산물) | 2036 | 20.4%
공공비축벼 포대벼검사(40kg) | 1865 | 18.6%
공공비축벼 포대벼검사(40kg),(800kg) | 1655 | 16.6%
잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 1375 | 13.8%
농협시가매입(시장격리곡) | 226 | 2.3%
공공비축벼 포대벼검사(800kg) | 142 | 1.4%
공공비축벼 피해벼 | 127 | 1.3%
하곡검사 (산물) | 121 | 1.2%
비축농산물 | 103 | 1.0%
Other values (5) | 196 | 2.0%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
공공비축벼 | 5825 | 24.6%
포대벼검사(40kg | 4019 | 17.0%
하곡검사 | 2275 | 9.6%
검사(산물 | 2036 | 8.6%
포대벼검사(40kg),(800kg | 1655 | 7.0%
잡곡검사(콩 | 1386 | 5.9%
이전 | 1386 | 5.9%
 | 1386 | 5.9%
옥수수)(40kg | 1375 | 5.8%
2019.01 | 1375 | 5.8%
Other values (13) | 932 | 3.9%

수량단위
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
40kg/대 | 7659
Kg | 1849
800kg/톤백 | 242
20Kg/대 | 102
30Kg/대 | 87
Other values (5) | 61

Length

Max length: 8
Median length: 6
Mean length: 5.2942
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Kg
2nd row: 40kg/대
3rd row: 800kg/톤백
4th row: Kg
5th row: 40kg/대

Common Values

Value | Count | Frequency (%)
40kg/대 | 7659 | 76.6%
Kg | 1849 | 18.5%
800kg/톤백 | 242 | 2.4%
20Kg/대 | 102 | 1.0%
30Kg/대 | 87 | 0.9%
Ton | 49 | 0.5%
20kg/대 | 8 | 0.1%
600kg/대 | 2 | < 0.1%
4kg/대 | 1 | < 0.1%
10Kg/대 | 1 | < 0.1%
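
The 68.3% imbalance figure in the alert for 수량단위 is consistent with one minus the Shannon entropy of the value counts, normalized by the logarithm of the number of classes. The sketch below is a reconstruction under that assumption, not the profiler's own code. The function name is hypothetical, and the 0.0 result for a single class is a convention of this sketch.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - H(p) / log(k): 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    n_classes = len(probs)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(n_classes)


# Counts copied from the Common Values table for 수량단위.
unit_counts = [7659, 1849, 242, 102, 87, 49, 8, 2, 1, 1]
score = imbalance_score(unit_counts)
print(f"{score:.1%}")
```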

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
40kg/대 | 7659 | 76.6%
kg | 1849 | 18.5%
800kg/톤백 | 242 | 2.4%
20kg/대 | 110 | 1.1%
30kg/대 | 87 | 0.9%
ton | 49 | 0.5%
600kg/대 | 2 | < 0.1%
4kg/대 | 1 | < 0.1%
10kg/대 | 1 | < 0.1%
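
The two frequency tables for 수량단위 differ only in letter case: `20Kg/대` (102 rows) and `20kg/대` (8 rows) are separate categories in the raw data but collapse to a single 110-row entry once lowercased. A minimal sketch of that normalization, assuming case and surrounding whitespace are the only differences (the helper name is hypothetical):

```python
from collections import Counter

# Raw category counts copied from the Common Values table.
raw_counts = {
    "40kg/대": 7659, "Kg": 1849, "800kg/톤백": 242, "20Kg/대": 102,
    "30Kg/대": 87, "Ton": 49, "20kg/대": 8, "600kg/대": 2,
    "4kg/대": 1, "10Kg/대": 1,
}


def normalize_unit(label: str) -> str:
    """Lowercase and trim a unit label so case variants compare equal."""
    return label.strip().lower()


merged = Counter()
for label, count in raw_counts.items():
    merged[normalize_unit(label)] += count

for label, count in merged.most_common():
    print(label, count)
```

Applying such a step before profiling would reduce the 10 distinct labels to 9.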

수량
Real number (ℝ)

Distinct: 7858
Distinct (%): 78.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 438379.76
Minimum: 1
Maximum: 22567000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 90
Q1: 1568.5
Median: 19062
Q3: 138629.5
95-th percentile: 2686406
Maximum: 22567000
Range: 22566999
Interquartile range (IQR): 137061

Descriptive statistics

Standard deviation: 1382186.7
Coefficient of variation (CV): 3.1529437
Kurtosis: 40.479218
Mean: 438379.76
Median Absolute Deviation (MAD): 18876
Skewness: 5.4895519
Sum: 4.3837976 × 10^9
Variance: 1.9104401 × 10^12
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
5000 | 25 | 0.2%
500 | 25 | 0.2%
15000 | 21 | 0.2%
10000 | 19 | 0.2%
3000 | 18 | 0.2%
375 | 18 | 0.2%
50 | 17 | 0.2%
100 | 17 | 0.2%
2500 | 16 | 0.2%
200 | 15 | 0.1%
Other values (7848) | 9809 | 98.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 3 | < 0.1%
3 | 8 | 0.1%
4 | 4 | < 0.1%
5 | 5 | 0.1%
6 | 7 | 0.1%
7 | 6 | 0.1%
8 | 5 | 0.1%
9 | 5 | 0.1%
10 | 13 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
22567000 | 1 | < 0.1%
18893280 | 1 | < 0.1%
18170720 | 1 | < 0.1%
17142560 | 1 | < 0.1%
16436840 | 1 | < 0.1%
16234920 | 1 | < 0.1%
15230640 | 1 | < 0.1%
13845360 | 1 | < 0.1%
13354040 | 1 | < 0.1%
13240960 | 1 | < 0.1%

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

 | 년도 | 품목명 | 업무구분명 | 수량단위 | 수량
년도 | 1.000 | 0.454 | 0.683 | 0.490 | 0.225
품목명 | 0.454 | 1.000 | 0.857 | 0.670 | 0.128
업무구분명 | 0.683 | 0.857 | 1.000 | 0.898 | 0.308
수량단위 | 0.490 | 0.670 | 0.898 | 1.000 | 0.507
수량 | 0.225 | 0.128 | 0.308 | 0.507 | 1.000

[Figure: correlation heatmap]

 | 업무구분명 | 수량단위 | 품목명
업무구분명 | 1.000 | 0.610 | 0.529
수량단위 | 0.610 | 1.000 | 0.354
품목명 | 0.529 | 0.354 | 1.000

[Figure: correlation heatmap]

 | 년도 | 수량 | 품목명 | 업무구분명 | 수량단위
년도 | 1.000 | -0.019 | 0.204 | 0.334 | 0.170
수량 | -0.019 | 1.000 | 0.053 | 0.120 | 0.177
품목명 | 0.204 | 0.053 | 1.000 | 0.529 | 0.354
업무구분명 | 0.334 | 0.120 | 0.529 | 1.000 | 0.610
수량단위 | 0.170 | 0.177 | 0.354 | 0.610 | 1.000
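
The High correlation alerts can be cross-checked against the first correlation matrix by listing the pairs whose coefficient exceeds a cutoff. The sketch below uses a cutoff of 0.8. That value is an assumption, chosen only because it reproduces the two alerted pairs, and it is not taken from the report's configuration. The function name is hypothetical.

```python
from itertools import combinations

columns = ["년도", "품목명", "업무구분명", "수량단위", "수량"]
# Values copied from the first (5 x 5) correlation matrix.
matrix = [
    [1.000, 0.454, 0.683, 0.490, 0.225],
    [0.454, 1.000, 0.857, 0.670, 0.128],
    [0.683, 0.857, 1.000, 0.898, 0.308],
    [0.490, 0.670, 0.898, 1.000, 0.507],
    [0.225, 0.128, 0.308, 0.507, 1.000],
]
THRESHOLD = 0.8  # assumed cutoff, not read from config.json


def high_pairs(names, corr, threshold):
    """Return (a, b, value) for every off-diagonal pair with |value| >= threshold."""
    return [
        (names[i], names[j], corr[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(corr[i][j]) >= threshold
    ]


for a, b, value in high_pairs(columns, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {value:.3f}")
```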

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

 | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
6716 | 계획 | 2006 | | 경상남도 의령군 | 공공비축벼 검사(산물) | Kg | 417160
8382 | 계획 | 2003 | | 강원도 횡성군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 32575
4536 | 계획 | 2009 | | 경기도 포천시 | 공공비축벼 포대벼검사(800kg) | 800kg/톤백 | 193
9188 | 계획 | 2002 | 쌀보리 | 경상남도 고성군 | 하곡검사 (산물) | Kg | 470840
7488 | 계획 | 2004 | <NA> | 강원도 삼척시 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 827
2535 | 계획 | 2014 | 콩나물콩 | 충청남도 서산시 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 3
11244 | 계획 | 1998 | | 전라남도 강진군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 371777
10375 | 계획 | 2000 | | 전라남도 화순군 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 248083
3522 | 계획 | 2012 | | 경상북도 칠곡군 | 공공비축벼 포대벼검사(40kg),(800kg) | Kg | 1881000
164 | 계획 | 2019 | | 전라남도 목포시 | 공공비축벼 피해벼 | 30Kg/대 | 167

Last rows

 | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
10981 | 계획 | 1999 | | 경기도 용인시 | 공공비축벼 검사(산물) | Kg | 2642000
354 | 계획 | 2019 | 마늘 | 충남지원 공주사무소 | 비축농산물 | 20Kg/대 | 7033
4820 | 계획 | 2009 | 맥주보리 | 경상남도 사천시 | 하곡검사 포대벼검사(40kg) | 40kg/대 | 43044
11255 | 계획 | 1998 | | 경기도 김포시 | 공공비축벼 검사(산물) | Kg | 860000
6391 | 계획 | 2006 | <NA> | 경상북도 영주시 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 1493
8304 | 계획 | 2003 | | 대구광역시 동구 | 공공비축벼 포대벼검사(40kg) | 40kg/대 | 8385
2289 | 계획 | 2015 | | 경기도 김포시 | 공공비축벼 포대벼검사(40kg),(800kg) | 40kg/대 | 25000
5219 | 계획 | 2008 | <NA> | 경상북도 청송군 | 잡곡검사(콩, 옥수수)(40kg) (2019.01 이전 ) | 40kg/대 | 836
3852 | 계획 | 2011 | 쌀보리 | 전라남도 보성군 | 하곡검사 (산물) | 40kg/대 | 40000
399 | 계획 | 2019 | | 경상북도 영천시 | 공공비축벼 검사(산물) | 40kg/대 | 12148