Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
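
The average record size follows from the total size and the row count. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function name is illustrative, not part of the report):

```python
# Values copied from the "Dataset statistics" table above.
total_size_kib = 644.5
n_observations = 10_000

def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows

avg = average_record_size_bytes(total_size_kib, n_observations)
print(f"{avg:.1f} B")
```

Rounded to one decimal, the result agrees with the 66.0 B reported above.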

Variable types

Categorical: 4
Numeric: 2
Text: 1

Dataset

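Description: Planned inspection volumes for summer and autumn grains (하추곡검사계획량), managed by the National Agricultural Products Quality Management Service. Fields: year (년도), item name (품목명), administrative district (행정구역), task category (업무구분), quantity unit (수량단위), quantity (수량).
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20170915000000000843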

Alerts

실적구분명 has constant value "계획" (Constant)
품목명 is highly overall correlated with 업무구분명 (High correlation)
업무구분명 is highly overall correlated with 품목명 and 1 other field (High correlation)
수량단위 is highly overall correlated with 업무구분명 (High correlation)
수량단위 is highly imbalanced (67.7%) (Imbalance)

Reproduction

Analysis started: 2024-03-23 07:31:18.156847
Analysis finished: 2024-03-23 07:31:20.773097
Duration: 2.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
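
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Timestamp layout assumed from the two values shown above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-23 07:31:18.156847", FMT)
finished = datetime.strptime("2024-03-23 07:31:20.773097", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```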

Variables

실적구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
계획: 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 계획
2nd row: 계획
3rd row: 계획
4th row: 계획
5th row: 계획

Common Values

Value | Count | Frequency (%)
계획 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
계획 | 10000 | 100.0%

년도
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.9002
Minimum: 1998
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 1999
Q1: 2003
Median: 2009
Q3: 2016
95-th percentile: 2022
Maximum: 2022
Range: 24
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.3163784
Coefficient of variation (CV): 0.00364017
Kurtosis: -1.2641457
Mean: 2009.9002
Median Absolute Deviation (MAD): 6
Skewness: 0.12928688
Sum: 20099002
Variance: 53.529393
Monotonicity: Not monotonic
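
Several of these figures are derived from one another. The sketch below recomputes the coefficient of variation, variance, IQR and range for 년도 from the reported mean, standard deviation, quartiles and extremes. It uses the textbook definitions and is a consistency check, not the profiler's own code:

```python
# Values copied from the 년도 statistics above.
mean = 2009.9002
std = 7.3163784
q1, q3 = 2003, 2016
minimum, maximum = 1998, 2022

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range}")
```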
[Figure: histogram with fixed size bins (bins=25)]
Value | Count | Frequency (%)
2022 | 600 | 6.0%
2016 | 505 | 5.1%
2002 | 504 | 5.0%
2003 | 499 | 5.0%
2001 | 495 | 5.0%
2004 | 489 | 4.9%
2019 | 470 | 4.7%
2006 | 456 | 4.6%
2009 | 443 | 4.4%
2008 | 441 | 4.4%
Other values (15) | 5098 | 51.0%

Minimum 10 values
Value | Count | Frequency (%)
1998 | 227 | 2.3%
1999 | 408 | 4.1%
2000 | 400 | 4.0%
2001 | 495 | 5.0%
2002 | 504 | 5.0%
2003 | 499 | 5.0%
2004 | 489 | 4.9%
2005 | 379 | 3.8%
2006 | 456 | 4.6%
2007 | 440 | 4.4%

Maximum 10 values
Value | Count | Frequency (%)
2022 | 600 | 6.0%
2021 | 322 | 3.2%
2020 | 408 | 4.1%
2019 | 470 | 4.7%
2018 | 277 | 2.8%
2017 | 374 | 3.7%
2016 | 505 | 5.1%
2015 | 378 | 3.8%
2014 | 276 | 2.8%
2013 | 259 | 2.6%

품목명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
벼종자: 6306
<NA>: 1506
겉보리종자: 704
(blank): 542
맥주보리종자: 333
Other values (6): 609

Length

Max length: 7
Median length: 3
Mean length: 3.3737
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 벼종자
2nd row: 벼종자
3rd row: 벼종자
4th row: (blank)
5th row: <NA>

Common Values

Value | Count | Frequency (%)
벼종자 | 6306 | 63.1%
<NA> | 1506 | 15.1%
겉보리종자 | 704 | 7.0%
(blank) | 542 | 5.4%
맥주보리종자 | 333 | 3.3%
밀종자 | 242 | 2.4%
옥수수종자 | 237 | 2.4%
벼(품종혼합) | 106 | 1.1%
팥종자 | 14 | 0.1%
콩나물콩 | 6 | 0.1%
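
The report counts 0 missing cells for 품목명, yet two of the listed values look like placeholders: the literal string <NA> and a blank label. A rough sketch of how many rows they cover, assuming (this is not stated in the report) that both stand for "no item name" and that the blank label is whitespace only:

```python
N_ROWS = 10_000

# Counts copied from the Common Values table above; " " stands in for the
# blank label, whose exact character is an assumption.
common_values = {
    "벼종자": 6306,
    "<NA>": 1506,
    "겉보리종자": 704,
    " ": 542,
    "맥주보리종자": 333,
    "밀종자": 242,
    "옥수수종자": 237,
    "벼(품종혼합)": 106,
    "팥종자": 14,
    "콩나물콩": 6,
}

def is_placeholder(label: str) -> bool:
    """Hypothetical rule: whitespace-only labels and the literal '<NA>'."""
    return label.strip() == "" or label == "<NA>"

placeholder_rows = sum(n for label, n in common_values.items() if is_placeholder(label))
share = placeholder_rows / N_ROWS
print(placeholder_rows, f"{share:.1%}")
```

If that assumption holds, roughly a fifth of the rows carry no usable item name even though the Missing count is 0.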

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
벼종자 | 6306 | 63.1%
na | 1506 | 15.1%
겉보리종자 | 704 | 7.0%
(blank) | 542 | 5.4%
맥주보리종자 | 333 | 3.3%
밀종자 | 242 | 2.4%
옥수수종자 | 237 | 2.4%
벼(품종혼합 | 106 | 1.1%
팥종자 | 14 | 0.1%
콩나물콩 | 6 | 0.1%

행정구역명
Text

Distinct: 249
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 8
Mean length: 7.9223
Min length: 6

Characters and Unicode

Total characters: 79223
Distinct characters: 134
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 0.4%

Sample

1st row: 전라남도 무안군
2nd row: 경상남도 하동군
3rd row: 충청북도 옥천군
4th row: 경상북도 군위군
5th row: 광주광역시 남구
Value | Count | Frequency (%)
전라남도 | 1798 | 9.0%
경상남도 | 1470 | 7.3%
경상북도 | 1409 | 7.0%
전라북도 | 1087 | 5.4%
충청남도 | 923 | 4.6%
강원도 | 914 | 4.6%
경기도 | 798 | 4.0%
충청북도 | 673 | 3.4%
광주광역시 | 190 | 0.9%
인천광역시 | 157 | 0.8%
Other values (248) | 10617 | 53.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 10037 | 12.7%
[?] | 9327 | 11.8%
[?] | 5682 | 7.2%
[?] | 4645 | 5.9%
[?] | 4607 | 5.8%
[?] | 3874 | 4.9%
[?] | 3323 | 4.2%
[?] | 3080 | 3.9%
[?] | 2952 | 3.7%
[?] | 2885 | 3.6%
Other values (124) | 28811 | 36.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 69186 | 87.3%
Space Separator | 10037 | 12.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 9327 | 13.5%
[?] | 5682 | 8.2%
[?] | 4645 | 6.7%
[?] | 4607 | 6.7%
[?] | 3874 | 5.6%
[?] | 3323 | 4.8%
[?] | 3080 | 4.5%
[?] | 2952 | 4.3%
[?] | 2885 | 4.2%
[?] | 1939 | 2.8%
Other values (123) | 26872 | 38.8%
Space Separator
Value | Count | Frequency (%)
(space) | 10037 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 69186 | 87.3%
Common | 10037 | 12.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 9327 | 13.5%
[?] | 5682 | 8.2%
[?] | 4645 | 6.7%
[?] | 4607 | 6.7%
[?] | 3874 | 5.6%
[?] | 3323 | 4.8%
[?] | 3080 | 4.5%
[?] | 2952 | 4.3%
[?] | 2885 | 4.2%
[?] | 1939 | 2.8%
Other values (123) | 26872 | 38.8%
Common
Value | Count | Frequency (%)
(space) | 10037 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 69175 | 87.3%
ASCII | 10037 | 12.7%
Compat Jamo | 11 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 10037 | 100.0%
Hangul
Value | Count | Frequency (%)
[?] | 9327 | 13.5%
[?] | 5682 | 8.2%
[?] | 4645 | 6.7%
[?] | 4607 | 6.7%
[?] | 3874 | 5.6%
[?] | 3323 | 4.8%
[?] | 3080 | 4.5%
[?] | 2952 | 4.3%
[?] | 2885 | 4.2%
[?] | 1939 | 2.8%
Other values (122) | 26861 | 38.8%
Compat Jamo
Value | Count | Frequency (%)
[?] | 11 | 100.0%

업무구분명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
공공비축미_산물벼: 2020
하곡(보리)_40kg: 1832
공공비축 포대벼_40kg: 1585
공공비축미_포대벼: 1574
잡곡(콩/옥수수)(2019.01 이전)_40kg: 1214
Other values (12): 1775

Length

Max length: 29
Median length: 27
Mean length: 12.5737
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공공비축미_산물벼
2nd row: 공공비축미_산물벼
3rd row: 공공비축 포대벼_40kg
4th row: 콩(일반콩)_20kg,40kg,800kg,1000kg
5th row: 하곡(보리)_40kg

Common Values

Value | Count | Frequency (%)
공공비축미_산물벼 | 2020 | 20.2%
하곡(보리)_40kg | 1832 | 18.3%
공공비축 포대벼_40kg | 1585 | 15.8%
공공비축미_포대벼 | 1574 | 15.7%
잡곡(콩/옥수수)(2019.01 이전)_40kg | 1214 | 12.1%
공공비축미_인수벼 | 500 | 5.0%
공공비축미_시장격리곡 | 403 | 4.0%
공공비축미_피해벼 | 241 | 2.4%
콩(일반콩)_20kg,40kg,800kg,1000kg | 197 | 2.0%
공공비축 포대벼_800kg | 123 | 1.2%
Other values (7) | 311 | 3.1%

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
공공비축미_산물벼 | 2020 | 15.6%
하곡(보리)_40kg | 1832 | 14.2%
공공비축 | 1708 | 13.2%
포대벼_40kg | 1585 | 12.3%
공공비축미_포대벼 | 1574 | 12.2%
잡곡(콩/옥수수)(2019.01 | 1214 | 9.4%
이전)_40kg | 1214 | 9.4%
공공비축미_인수벼 | 500 | 3.9%
공공비축미_시장격리곡 | 403 | 3.1%
공공비축미_피해벼 | 241 | 1.9%
Other values (11) | 638 | 4.9%

수량단위
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
40kg/대: 7639
Kg: 1757
800kg/톤백: 216
30Kg/대: 170
1000kg: 103
Other values (6): 115

Length

Max length: 8
Median length: 6
Mean length: 5.3423
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Kg
2nd row: Kg
3rd row: 40kg/대
4th row: 40kg/대
5th row: 40kg/대

Common Values

Value | Count | Frequency (%)
40kg/대 | 7639 | 76.4%
Kg | 1757 | 17.6%
800kg/톤백 | 216 | 2.2%
30Kg/대 | 170 | 1.7%
1000kg | 103 | 1.0%
20Kg/대 | 72 | 0.7%
600kg/대 | 20 | 0.2%
20kg/대 | 20 | 0.2%
80kg/대 | 1 | < 0.1%
10Kg/대 | 1 | < 0.1%
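
The 67.7% imbalance reported for 수량단위 can be approximated from these counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2(k). That formula is an assumption about how the profiler defines imbalance, and the count of the eleventh category is inferred from the 10000-row total rather than read from the table:

```python
import math

# Category counts for 수량단위 from the Common Values table above.
# The table lists 10 of the 11 distinct values; the last count (1) is
# inferred from the 10000-row total and is an assumption.
counts = [7639, 1757, 216, 170, 103, 72, 20, 20, 1, 1, 1]

def imbalance_score(class_counts):
    """One minus the Shannon entropy normalized by its maximum, log2(k)."""
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1.0 - entropy / math.log2(k)

score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean perfectly balanced classes and 1 a single dominant class. Under these assumptions the counts give about 0.677, in line with the alert.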

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
40kg/대 | 7639 | 76.4%
kg | 1757 | 17.6%
800kg/톤백 | 216 | 2.2%
30kg/대 | 170 | 1.7%
1000kg | 103 | 1.0%
20kg/대 | 92 | 0.9%
600kg/대 | 20 | 0.2%
80kg/대 | 1 | < 0.1%
10kg/대 | 1 | < 0.1%
4kg/대 | 1 | < 0.1%
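
In the lowercase table above, 20kg/대 has 92 rows, which is the sum of 20Kg/대 (72) and 20kg/대 (20) in the Common Values table. This suggests that some unit labels differ only by letter case. A minimal sketch of merging such variants (the helper name is hypothetical, and whether case differences are meaningful in the source data is an assumption):

```python
from collections import Counter

# Counts copied from the 수량단위 Common Values table.
raw_counts = {
    "40kg/대": 7639,
    "Kg": 1757,
    "800kg/톤백": 216,
    "30Kg/대": 170,
    "1000kg": 103,
    "20Kg/대": 72,
    "600kg/대": 20,
    "20kg/대": 20,
    "80kg/대": 1,
    "10Kg/대": 1,
}

def merge_case_variants(counts):
    """Sum the counts of labels that are identical after lowercasing."""
    merged = Counter()
    for label, n in counts.items():
        merged[label.lower()] += n
    return merged

merged = merge_case_variants(raw_counts)
print(merged["20kg/대"], len(raw_counts), len(merged))
```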

수량
Real number (ℝ)

Distinct: 7750
Distinct (%): 77.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 415541.67
Minimum: 1
Maximum: 20928600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 100
Q1: 1843.75
Median: 19699
Q3: 123694.5
95-th percentile: 2524125.8
Maximum: 20928600
Range: 20928599
Interquartile range (IQR): 121850.75

Descriptive statistics

Standard deviation: 1355499.7
Coefficient of variation (CV): 3.2620067
Kurtosis: 41.870467
Mean: 415541.67
Median Absolute Deviation (MAD): 19459
Skewness: 5.6685891
Sum: 4.1554167 × 10^9
Variance: 1.8373795 × 10^12
Monotonicity: Not monotonic
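
The positive skewness of 수량 also shows in how far the mean sits above the median. The sketch below computes that ratio and checks the reported sum against mean × count. It is a rough sanity check on the published figures, not a substitute for inspecting the raw values:

```python
# Values copied from the 수량 statistics above.
mean = 415541.67
median = 19699
n_rows = 10_000
reported_sum = 4.1554167e9   # the Sum shown above, i.e. 4.1554167 × 10^9

# A mean many times larger than the median points to a long right tail.
mean_to_median = mean / median

# The sum should equal the mean times the number of observations.
recomputed_sum = mean * n_rows

print(f"mean/median = {mean_to_median:.1f}")
print(f"sum difference = {abs(recomputed_sum - reported_sum):.3f}")
```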
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
5000 | 31 | 0.3%
15000 | 30 | 0.3%
500 | 23 | 0.2%
100 | 23 | 0.2%
10000 | 23 | 0.2%
2000 | 23 | 0.2%
3000 | 22 | 0.2%
1000 | 22 | 0.2%
4000 | 21 | 0.2%
200 | 21 | 0.2%
Other values (7740) | 9761 | 97.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 6 | 0.1%
3 | 8 | 0.1%
4 | 5 | 0.1%
5 | 8 | 0.1%
6 | 7 | 0.1%
7 | 4 | < 0.1%
8 | 4 | < 0.1%
9 | 3 | < 0.1%
10 | 12 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
20928600 | 1 | < 0.1%
18893280 | 1 | < 0.1%
17142560 | 1 | < 0.1%
17098440 | 1 | < 0.1%
16234920 | 1 | < 0.1%
15554560 | 1 | < 0.1%
15024840 | 1 | < 0.1%
13975840 | 1 | < 0.1%
13354040 | 1 | < 0.1%
12643870 | 1 | < 0.1%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 년도 | 품목명 | 업무구분명 | 수량단위 | 수량
년도 | 1.000 | 0.549 | 0.678 | 0.338 | 0.190
품목명 | 0.549 | 1.000 | 0.847 | 0.326 | 0.135
업무구분명 | 0.678 | 0.847 | 1.000 | 0.881 | 0.280
수량단위 | 0.338 | 0.326 | 0.881 | 1.000 | 0.394
수량 | 0.190 | 0.135 | 0.280 | 0.394 | 1.000

[Figure: correlation heatmap]
Variable | 업무구분명 | 수량단위 | 품목명
업무구분명 | 1.000 | 0.586 | 0.542
수량단위 | 0.586 | 1.000 | 0.154
품목명 | 0.542 | 0.154 | 1.000

[Figure: correlation heatmap]
Variable | 년도 | 수량 | 품목명 | 업무구분명 | 수량단위
년도 | 1.000 | -0.006 | 0.194 | 0.345 | 0.152
수량 | -0.006 | 1.000 | 0.042 | 0.112 | 0.179
품목명 | 0.194 | 0.042 | 1.000 | 0.542 | 0.154
업무구분명 | 0.345 | 0.112 | 0.542 | 1.000 | 0.586
수량단위 | 0.152 | 0.179 | 0.154 | 0.586 | 1.000
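
The High correlation alerts in the Alerts section match the last matrix above if pairs are flagged at an absolute coefficient of 0.5 or more. Both the 0.5 threshold and the choice of that matrix as the source of the alerts are assumptions here. A short sketch that lists the pairs crossing such a threshold:

```python
from itertools import combinations

# Last correlation matrix shown above (row and column order as printed).
names = ["년도", "수량", "품목명", "업무구분명", "수량단위"]
matrix = [
    [1.000, -0.006, 0.194, 0.345, 0.152],
    [-0.006, 1.000, 0.042, 0.112, 0.179],
    [0.194, 0.042, 1.000, 0.542, 0.154],
    [0.345, 0.112, 0.542, 1.000, 0.586],
    [0.152, 0.179, 0.154, 0.586, 1.000],
]

THRESHOLD = 0.5  # assumed alert threshold, not stated in the report

def high_correlation_pairs(labels, values, threshold):
    """Return (a, b, coefficient) for each pair with |coefficient| >= threshold."""
    return [
        (labels[i], labels[j], values[i][j])
        for i, j in combinations(range(len(labels)), 2)
        if abs(values[i][j]) >= threshold
    ]

pairs = high_correlation_pairs(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(a, b, r)
```

Only 품목명/업무구분명 and 업무구분명/수량단위 cross the threshold, which matches the three alerts. The first matrix above would flag additional pairs at the same threshold, for example 년도/업무구분명 at 0.678.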

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

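First rows
Index | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
2374 | 계획 | 2002 | 벼종자 | 전라남도 무안군 | 공공비축미_산물벼 | Kg | 5062000
1072 | 계획 | 2000 | 벼종자 | 경상남도 하동군 | 공공비축미_산물벼 | Kg | 3354480
4923 | 계획 | 2006 | 벼종자 | 충청북도 옥천군 | 공공비축 포대벼_40kg | 40kg/대 | 44069
12412 | 계획 | 2021 | (blank) | 경상북도 군위군 | 콩(일반콩)_20kg,40kg,800kg,1000kg | 40kg/대 | 860
7750 | 계획 | 2011 | <NA> | 광주광역시 남구 | 하곡(보리)_40kg | 40kg/대 | 3550
7383 | 계획 | 2011 | 겉보리종자 | 경상북도 영천시 | 하곡(보리)_40kg | 40kg/대 | 520
10631 | 계획 | 2018 | 벼종자 | 인천광역시 옹진군 | 공공비축미_인수벼 | 40kg/대 | 35497
2008 | 계획 | 2001 | <NA> | 전라남도 완도군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 2171
8031 | 계획 | 2012 | 벼종자 | 전라북도 임실군 | 공공비축미_포대벼 | 40kg/대 | 58136
2736 | 계획 | 2003 | 겉보리종자 | 경상북도 상주시 | 하곡(보리)_40kg | 40kg/대 | 163

Last rows
Index | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
8460 | 계획 | 2014 | 벼종자 | 강원도 삼척시 | 공공비축미_산물벼 | 40kg/대 | 9254
10518 | 계획 | 2018 | 벼종자 | 경기도 평택시 | 공공비축미_산물벼 | 40kg/대 | 40500
3693 | 계획 | 2004 | 벼종자 | 인천광역시 옹진군 | 공공비축 포대벼_40kg | 40kg/대 | 103377
8870 | 계획 | 2015 | 벼종자 | 경기도 안성시 | 공공비축미_인수벼 | 40kg/대 | 50200
1902 | 계획 | 2001 | 옥수수종자 | 강원도 홍천군 | 공공비축 포대벼_40kg | 40kg/대 | 10656
9224 | 계획 | 2015 | (blank) | 강원도 영월군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 11683
1354 | 계획 | 2000 | <NA> | 전라남도 해남군 | 하곡(보리)_산물 | Kg | 642440
9748 | 계획 | 2016 | 벼종자 | 전라북도 김제시 | 공공비축미_애프터 | 40kg/대 | 21080
12487 | 계획 | 2022 | 벼(품종혼합) | 강원도 양양군 | 공공비축미_시장격리곡 | 40kg/대 | 18220
10656 | 계획 | 2018 | 벼종자 | 전라남도 영광군 | 공공비축미_산물벼 | 40kg/대 | 11068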