Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 57
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
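
The two derived figures above can be cross-checked by hand. The following is a minimal sketch, assuming the missing-cell percentage is taken over all rows × columns and that 1 KiB = 1024 bytes; the function names are illustrative and are not part of the profiling tool.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def avg_record_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total size in KiB (1 KiB = 1024 bytes)."""
    return total_kib * 1024 / n_rows


# Values copied from the dataset statistics above.
pct = missing_cell_pct(57, 10_000, 7)
rec = avg_record_bytes(644.5, 10_000)
print(f"{pct:.1f}%  {rec:.1f} B")
```

Under these assumptions the unrounded missing share is about 0.08%, which the report displays as 0.1%.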

Variable types

Categorical: 3
Numeric: 2
Text: 2

Dataset

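Description: Summer and autumn grain inspection results managed by the National Agricultural Products Quality Management Service (category name, year, item, administrative district, task category, quantity unit, quantity)
Author: 국립농산물품질관리원 (National Agricultural Products Quality Management Service)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001693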

Alerts

실적구분명 has constant value "실적" (Constant)
업무구분명 is highly overall correlated with 수량단위 (High correlation)
수량단위 is highly overall correlated with 업무구분명 (High correlation)
수량단위 is highly imbalanced (62.9%) (Imbalance)
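
The 62.9% imbalance figure for 수량단위 is consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is an assumption about how the score is defined, not the tool's own code; `imbalance_score` is a hypothetical name, and the counts are the four 수량단위 frequencies listed in this report.

```python
import math


def imbalance_score(counts):
    """1 - H / H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))


# Counts for 40kg/대, 800kg/대, 20Kg/대, 10Kg/대.
score = imbalance_score([8091, 1852, 55, 2])
print(f"{score:.1%}")
```

With these counts the score comes out at roughly 0.629, matching the alert.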

Reproduction

Analysis started: 2023-12-11 03:47:52.141948
Analysis finished: 2023-12-11 03:47:53.503929
Duration: 1.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
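
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-11 03:47:52.141948")
finished = datetime.fromisoformat("2023-12-11 03:47:53.503929")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```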

Variables

실적구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
실적 | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 실적
2nd row: 실적
3rd row: 실적
4th row: 실적
5th row: 실적

Common Values

Value | Count | Frequency (%)
실적 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
실적 | 10000 | 100.0%

년도
Real number (ℝ)

Distinct: 26
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.9858
Minimum: 1998
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 1999
Q1: 2004
median: 2009
Q3: 2016
95-th percentile: 2022
Maximum: 2023
Range: 25
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.1570469
Coefficient of variation (CV): 0.003560745
Kurtosis: -1.1738237
Mean: 2009.9858
Median Absolute Deviation (MAD): 6
Skewness: 0.079804308
Sum: 20099858
Variance: 51.223321
Monotonicity: Not monotonic
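
Two of the descriptive statistics for 년도 are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short consistency check, using the values printed above (the variable names are illustrative):

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


# Values copied from the descriptive statistics of 년도.
std_year = 7.1570469
mean_year = 2009.9858

cv_year = coefficient_of_variation(std_year, mean_year)
variance_year = std_year ** 2
print(f"CV={cv_year:.9f}  variance={variance_year:.6f}")
```

The CV is tiny because calendar years sit far from zero, so it says little about the spread of this column.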
Histogram with fixed size bins (bins=26)
Common values

Value | Count | Frequency (%)
2022 | 572 | 5.7%
2006 | 517 | 5.2%
2004 | 495 | 5.0%
2009 | 485 | 4.9%
2008 | 463 | 4.6%
2016 | 461 | 4.6%
2007 | 438 | 4.4%
2002 | 418 | 4.2%
2013 | 407 | 4.1%
2003 | 398 | 4.0%
Other values (16) | 5346 | 53.5%

Minimum 10 values

Value | Count | Frequency (%)
1998 | 308 | 3.1%
1999 | 381 | 3.8%
2000 | 390 | 3.9%
2001 | 376 | 3.8%
2002 | 418 | 4.2%
2003 | 398 | 4.0%
2004 | 495 | 5.0%
2005 | 389 | 3.9%
2006 | 517 | 5.2%
2007 | 438 | 4.4%

Maximum 10 values

Value | Count | Frequency (%)
2023 | 1 | < 0.1%
2022 | 572 | 5.7%
2021 | 328 | 3.3%
2020 | 333 | 3.3%
2019 | 389 | 3.9%
2018 | 339 | 3.4%
2017 | 378 | 3.8%
2016 | 461 | 4.6%
2015 | 389 | 3.9%
2014 | 315 | 3.1%

품목명
Text

Distinct: 89
Distinct (%): 0.9%
Missing: 57
Missing (%): 0.6%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 4.0307754
Min length: 1

Characters and Unicode

Total characters: 40078
Distinct characters: 96
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: 옥수수종자
2nd row: 일미벼
3rd row: 동진1호벼
4th row: 논콩(소립종)
5th row: 황금누리벼

Value | Count | Frequency (%)
삼광벼 | 601 | 6.0%
미확인품종 | 566 | 5.7%
추청벼 | 459 | 4.6%
일품벼 | 456 | 4.6%
새누리벼 | 453 | 4.6%
동진1호벼 | 430 | 4.3%
일미벼 | 400 | 4.0%
겉보리종자 | 369 | 3.7%
쌀보리종자 | 365 | 3.7%
남평벼 | 364 | 3.7%
Other values (79) | 5480 | 55.1%

Most occurring characters

ValueCountFrequency (%)
6154
 
15.4%
2633
 
6.6%
1627
 
4.1%
1584
 
4.0%
1383
 
3.5%
( 1233
 
3.1%
) 1233
 
3.1%
1232
 
3.1%
1186
 
3.0%
1154
 
2.9%
Other values (86) 20659
51.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 36855
92.0%
Open Punctuation 1233
 
3.1%
Close Punctuation 1233
 
3.1%
Decimal Number 757
 
1.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6154
 
16.7%
2633
 
7.1%
1627
 
4.4%
1584
 
4.3%
1383
 
3.8%
1232
 
3.3%
1186
 
3.2%
1154
 
3.1%
1142
 
3.1%
1053
 
2.9%
Other values (82) 17707
48.0%
Decimal Number
ValueCountFrequency (%)
1 444
58.7%
2 313
41.3%
Open Punctuation
ValueCountFrequency (%)
( 1233
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1233
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 36855
92.0%
Common 3223
 
8.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6154
 
16.7%
2633
 
7.1%
1627
 
4.4%
1584
 
4.3%
1383
 
3.8%
1232
 
3.3%
1186
 
3.2%
1154
 
3.1%
1142
 
3.1%
1053
 
2.9%
Other values (82) 17707
48.0%
Common
ValueCountFrequency (%)
( 1233
38.3%
) 1233
38.3%
1 444
 
13.8%
2 313
 
9.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 36855
92.0%
ASCII 3223
 
8.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6154
 
16.7%
2633
 
7.1%
1627
 
4.4%
1584
 
4.3%
1383
 
3.8%
1232
 
3.3%
1186
 
3.2%
1154
 
3.1%
1142
 
3.1%
1053
 
2.9%
Other values (82) 17707
48.0%
ASCII
ValueCountFrequency (%)
( 1233
38.3%
) 1233
38.3%
1 444
 
13.8%
2 313
 
9.7%

행정구역명
Text

Distinct: 187
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 8
Mean length: 8.2606
Min length: 7

Characters and Unicode

Total characters: 82606
Distinct characters: 127
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 경상북도 영양군
2nd row: 충청남도 아산시
3rd row: 경상북도 영덕군
4th row: 전라남도 영광군
5th row: 충청남도 천안시

Value | Count | Frequency (%)
전라남도 | 1852 | 9.3%
경상북도 | 1459 | 7.3%
경상남도 | 1318 | 6.6%
전라북도 | 1079 | 5.4%
충청남도 | 1048 | 5.3%
강원특별자치도 | 908 | 4.6%
경기도 | 705 | 3.5%
충청북도 | 703 | 3.5%
인천광역시 | 169 | 0.8%
광주광역시 | 169 | 0.8%
Other values (179) | 10500 | 52.7%

Most occurring characters

ValueCountFrequency (%)
10000
 
12.1%
9322
 
11.3%
5822
 
7.0%
4592
 
5.6%
4591
 
5.6%
3657
 
4.4%
3339
 
4.0%
3104
 
3.8%
2931
 
3.5%
2859
 
3.5%
Other values (117) 32389
39.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 72606
87.9%
Space Separator 10000
 
12.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9322
 
12.8%
5822
 
8.0%
4592
 
6.3%
4591
 
6.3%
3657
 
5.0%
3339
 
4.6%
3104
 
4.3%
2931
 
4.0%
2859
 
3.9%
2070
 
2.9%
Other values (116) 30319
41.8%
Space Separator
ValueCountFrequency (%)
10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 72606
87.9%
Common 10000
 
12.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9322
 
12.8%
5822
 
8.0%
4592
 
6.3%
4591
 
6.3%
3657
 
5.0%
3339
 
4.6%
3104
 
4.3%
2931
 
4.0%
2859
 
3.9%
2070
 
2.9%
Other values (116) 30319
41.8%
Common
ValueCountFrequency (%)
10000
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 72606
87.9%
ASCII 10000
 
12.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10000
100.0%
Hangul
ValueCountFrequency (%)
9322
 
12.8%
5822
 
8.0%
4592
 
6.3%
4591
 
6.3%
3657
 
5.0%
3339
 
4.6%
3104
 
4.3%
2931
 
4.0%
2859
 
3.9%
2070
 
2.9%
Other values (116) 30319
41.8%

업무구분명
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
공공비축 포대벼_40kg | 3535
공공비축미_산물벼 | 2173
공공비축 포대벼_800kg | 1699
잡곡(콩/옥수수)(2019.01 이전)_40kg | 1018
하곡(보리)_40kg | 961
Other values (14) | 614

Length

Max length: 27
Median length: 26
Mean length: 13.3773
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 잡곡(콩/옥수수)(2019.01 이전)_40kg
2nd row: 공공비축미_산물벼
3rd row: 공공비축 포대벼_40kg
4th row: 잡곡(콩/옥수수)(2019.01 이전)_40kg
5th row: 공공비축 포대벼_800kg

Common Values

Value | Count | Frequency (%)
공공비축 포대벼_40kg | 3535 | 35.4%
공공비축미_산물벼 | 2173 | 21.7%
공공비축 포대벼_800kg | 1699 | 17.0%
잡곡(콩/옥수수)(2019.01 이전)_40kg | 1018 | 10.2%
하곡(보리)_40kg | 961 | 9.6%
시장격리곡(농협시가매입)_800kg | 153 | 1.5%
공공비축미_시장격리곡 | 88 | 0.9%
인수벼 | 79 | 0.8%
콩(일반콩)_40kg | 68 | 0.7%
비축농산물 | 57 | 0.6%
Other values (9) | 169 | 1.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
공공비축 | 5234 | 32.2%
포대벼_40kg | 3535 | 21.7%
공공비축미_산물벼 | 2173 | 13.4%
포대벼_800kg | 1699 | 10.5%
잡곡(콩/옥수수)(2019.01 | 1018 | 6.3%
이전)_40kg | 1018 | 6.3%
하곡(보리)_40kg | 961 | 5.9%
시장격리곡(농협시가매입)_800kg | 153 | 0.9%
공공비축미_시장격리곡 | 88 | 0.5%
인수벼 | 79 | 0.5%
Other values (13) | 298 | 1.8%

수량단위
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
40kg/대 | 8091
800kg/대 | 1852
20Kg/대 | 55
10Kg/대 | 2

Length

Max length: 7
Median length: 6
Mean length: 6.1852
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40kg/대
2nd row: 40kg/대
3rd row: 40kg/대
4th row: 40kg/대
5th row: 800kg/대

Common Values

Value | Count | Frequency (%)
40kg/대 | 8091 | 80.9%
800kg/대 | 1852 | 18.5%
20Kg/대 | 55 | 0.5%
10Kg/대 | 2 | < 0.1%
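
The four 수량단위 labels mix "kg" and "Kg", so any downstream comparison of unit sizes would need a normalization step. The following is one possible sketch, assuming every label has the form "<number>kg/대" in either case; `unit_weight_kg` is a hypothetical helper, not something the dataset or the profiling tool provides.

```python
import re

# Matches labels such as "40kg/대" or "20Kg/대", ignoring the case of "kg".
_UNIT_RE = re.compile(r"^(\d+)\s*kg/대$", re.IGNORECASE)


def unit_weight_kg(label: str) -> int:
    """Extract the per-unit weight in kilograms from a 수량단위 label."""
    match = _UNIT_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized unit label: {label!r}")
    return int(match.group(1))


# The four labels listed under Common Values.
labels = ["40kg/대", "800kg/대", "20Kg/대", "10Kg/대"]
weights = {label: unit_weight_kg(label) for label in labels}
print(weights)
```

Raising on unknown labels is a deliberate choice here: a silently skipped unit would distort any total computed from 수량.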

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
40kg/대 | 8091 | 80.9%
800kg/대 | 1852 | 18.5%
20kg/대 | 55 | 0.5%
10kg/대 | 2 | < 0.1%

수량
Real number (ℝ)

Distinct: 6757
Distinct (%): 67.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122664.92
Minimum: 1
Maximum: 8813640
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 29
Q1: 480
median: 3375
Q3: 24977.5
95-th percentile: 522572
Maximum: 8813640
Range: 8813639
Interquartile range (IQR): 24497.5
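
The IQR for 수량 is Q3 minus Q1, and the same two quartiles give a conventional Tukey upper fence (Q3 + 1.5 × IQR). The fence is not part of the report; it is included here only as an illustrative sketch of how heavy the right tail is, using the quartiles printed above.

```python
def iqr(q1: float, q3: float) -> float:
    """Interquartile range."""
    return q3 - q1


def tukey_upper_fence(q1: float, q3: float, k: float = 1.5) -> float:
    """Upper outlier fence: Q3 + k * IQR (k = 1.5 is the usual convention)."""
    return q3 + k * (q3 - q1)


# Quartiles of 수량 from the quantile statistics.
q1, q3 = 480.0, 24977.5
spread = iqr(q1, q3)
fence = tukey_upper_fence(q1, q3)
print(spread, fence)
```

Since the 95th percentile (522572) lies well above this fence, at least 5% of the rows would be flagged by the rule, which fits the large skewness reported for this column.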

Descriptive statistics

Standard deviation: 549651.43
Coefficient of variation (CV): 4.480918
Kurtosis: 76.216149
Mean: 122664.92
Median Absolute Deviation (MAD): 3313
Skewness: 7.8936288
Sum: 1.2266492 × 10^9
Variance: 3.021167 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
20.0 | 27 | 0.3%
5.0 | 27 | 0.3%
3.0 | 25 | 0.2%
9.0 | 25 | 0.2%
15.0 | 24 | 0.2%
16.0 | 23 | 0.2%
8.0 | 22 | 0.2%
10.0 | 22 | 0.2%
18.0 | 21 | 0.2%
100.0 | 21 | 0.2%
Other values (6747) | 9763 | 97.6%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 18 | 0.2%
2.0 | 21 | 0.2%
3.0 | 25 | 0.2%
4.0 | 17 | 0.2%
5.0 | 27 | 0.3%
6.0 | 19 | 0.2%
7.0 | 21 | 0.2%
8.0 | 22 | 0.2%
9.0 | 25 | 0.2%
10.0 | 22 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
8813640.0 | 1 | < 0.1%
8745280.0 | 1 | < 0.1%
8300440.0 | 1 | < 0.1%
7955640.0 | 1 | < 0.1%
7564800.0 | 1 | < 0.1%
7338040.0 | 1 | < 0.1%
6918200.0 | 1 | < 0.1%
6719600.0 | 1 | < 0.1%
6696160.0 | 1 | < 0.1%
6417480.0 | 1 | < 0.1%

Interactions


Correlations

Variable | 년도 | 품목명 | 업무구분명 | 수량단위 | 수량
년도 | 1.000 | 0.852 | 0.600 | 0.424 | 0.262
품목명 | 0.852 | 1.000 | 0.905 | 0.492 | 0.294
업무구분명 | 0.600 | 0.905 | 1.000 | 0.935 | 0.291
수량단위 | 0.424 | 0.492 | 0.935 | 1.000 | 0.074
수량 | 0.262 | 0.294 | 0.291 | 0.074 | 1.000

Variable | 업무구분명 | 수량단위
업무구분명 | 1.000 | 0.816
수량단위 | 0.816 | 1.000

Variable | 년도 | 수량 | 업무구분명 | 수량단위
년도 | 1.000 | -0.275 | 0.275 | 0.270
수량 | -0.275 | 1.000 | 0.113 | 0.045
업무구분명 | 0.275 | 0.113 | 1.000 | 0.816
수량단위 | 0.270 | 0.045 | 0.816 | 1.000
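
The report does not name the measure behind each matrix. For a pair of categorical columns such as 업무구분명 and 수량단위, a common choice is Cramér's V, and the sketch below assumes that is the kind of association behind the 0.816 value. It implements the plain, uncorrected statistic on a contingency table; the profiling tool may apply a bias correction, and the tables used here are hypothetical, not taken from the dataset.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical 2x2 tables: one category fully determines the other, or not at all.
perfect = cramers_v([[50, 0], [0, 50]])
independent = cramers_v([[25, 25], [25, 25]])
print(perfect, independent)
```

A high value for this pair is plausible on inspection, because most 업무구분명 labels end in a suffix such as "_40kg" or "_800kg" that mirrors the unit.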

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
5537 | 실적 | 2003 | 옥수수종자 | 경상북도 영양군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 924.0
6511 | 실적 | 2004 | 일미벼 | 충청남도 아산시 | 공공비축미_산물벼 | 40kg/대 | 2906920.0
14913 | 실적 | 2013 | 동진1호벼 | 경상북도 영덕군 | 공공비축 포대벼_40kg | 40kg/대 | 578.0
3802 | 실적 | 2002 | 논콩(소립종) | 전라남도 영광군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 16.0
16610 | 실적 | 2014 | 황금누리벼 | 충청남도 천안시 | 공공비축 포대벼_800kg | 800kg/대 | 152.0
7844 | 실적 | 2006 | 남평벼 | 경상남도 함안군 | 공공비축미_산물벼 | 40kg/대 | 1137040.0
14756 | 실적 | 2012 | 황금누리벼 | 전라남도 장흥군 | 공공비축 포대벼_800kg | 800kg/대 | 1333.0
1293 | 실적 | 1999 | 미확인품종 | 충청북도 괴산군 | 공공비축 포대벼_40kg | 40kg/대 | 14591.0
10612 | 실적 | 2008 | 맥주보리종자 | 전라남도 영암군 | 하곡(보리)_40kg | 40kg/대 | 64630.0
16952 | 실적 | 2015 | 새누리벼 | 충청남도 부여군 | 공공비축 포대벼_40kg | 40kg/대 | 33615.0

(index) | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
14926 | 실적 | 2013 | 동진1호벼 | 경상북도 포항시 | 공공비축미_산물벼 | 40kg/대 | 36267.0
17106 | 실적 | 2015 | 운광벼 | 강원특별자치도 강릉시 | 공공비축 포대벼_40kg | 40kg/대 | 572.0
18153 | 실적 | 2016 | 신동진벼 | 전라북도 임실군 | 공공비축미_산물벼 | 40kg/대 | 9052.0
482 | 실적 | 1998 | 일미벼 | 전라남도 함평군 | 공공비축 포대벼_40kg | 40kg/대 | 15506.0
403 | 실적 | 1998 | 설레미 | 충청남도 부여군 | 공공비축 포대벼_40kg | 40kg/대 | 1575.0
3730 | 실적 | 2002 | 논콩(대립종) | 전라남도 나주시 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 248.0
6870 | 실적 | 2005 | 남평벼 | 경상남도 창원시 | 공공비축 포대벼_40kg | 40kg/대 | 35982.0
6635 | 실적 | 2004 | 주남벼 | 전라남도 함평군 | 공공비축미_산물벼 | 40kg/대 | 90400.0
11815 | 실적 | 2009 | 새추청벼 | 경상북도 영덕군 | 공공비축미_산물벼 | 40kg/대 | 10200.0
8234 | 실적 | 2006 | 동진1호벼 | 경상북도 칠곡군 | 공공비축 포대벼_40kg | 40kg/대 | 6099.0