Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 71
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
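
The two derived figures above follow from the raw counts. The sketch below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 7
n_observations = 10_000
missing_cells = 71
total_size_kib = 644.5

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (0.1% and 66.0 B).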

Variable types

Categorical: 3
Numeric: 2
Text: 2

Dataset

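Description: Summer and autumn grain inspection results managed by the National Agricultural Products Quality Management Service (category name, year, item, administrative district, task category, quantity unit, quantity)
Author: National Agricultural Products Quality Management Service (국립농산물품질관리원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001693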

Alerts

실적구분명 has constant value "실적" (Constant)
업무구분명 is highly overall correlated with 수량단위 (High correlation)
수량단위 is highly overall correlated with 업무구분명 (High correlation)
수량단위 is highly imbalanced (63.8%) (Imbalance)
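
The 63.8% imbalance figure can be reproduced from the 수량단위 category counts listed later in the report. The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution; that formula is an assumption that happens to match the reported value, and the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform distribution, 1 for a single dominant class.

    Assumed formula; it reproduces the value shown in the alert.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts for 수량단위, copied from the report.
unit_counts = {"40kg/대": 8204, "800kg/대": 1725, "20Kg/대": 69, "10Kg/대": 2}
score = imbalance_score(list(unit_counts.values()))
print(f"{score:.1%}")
```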

Reproduction

Analysis started: 2023-12-11 03:47:35.470697
Analysis finished: 2023-12-11 03:47:36.713800
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

실적구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
실적 | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 실적
2nd row: 실적
3rd row: 실적
4th row: 실적
5th row: 실적

Common Values

Value | Count | Frequency (%)
실적 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
실적 | 10000 | 100.0%

년도
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.8664
Minimum: 1998
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 1999
Q1: 2005
Median: 2009
Q3: 2016
95-th percentile: 2021
Maximum: 2022
Range: 24
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.6917433
Coefficient of variation (CV): 0.0033294468
Kurtosis: -1.0764808
Mean: 2009.8664
Median Absolute Deviation (MAD): 5
Skewness: 0.024451604
Sum: 20098664
Variance: 44.779429
Monotonicity: Not monotonic
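
Several of the descriptive statistics for 년도 are related by simple identities. As a rough consistency check, the sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared) and the sum (mean times the number of observations) from the rounded values printed above; the variable names are illustrative.

```python
# Values copied from the 년도 statistics in the report.
n_observations = 10_000
mean = 2009.8664
std = 6.6917433

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_observations    # sum of all values

print(f"CV:       {cv:.10f}")
print(f"Variance: {variance:.6f}")
print(f"Sum:      {total:.0f}")
```

The results agree with the reported figures up to rounding of the inputs.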
[Figure: histogram with fixed size bins (bins=25)]

Common values

Value | Count | Frequency (%)
2006 | 556 | 5.6%
2008 | 547 | 5.5%
2004 | 547 | 5.5%
2009 | 525 | 5.2%
2007 | 522 | 5.2%
2016 | 515 | 5.1%
2013 | 490 | 4.9%
2019 | 450 | 4.5%
2005 | 424 | 4.2%
2017 | 416 | 4.2%
Other values (15) | 5008 | 50.1%

Minimum 10 values

Value | Count | Frequency (%)
1998 | 304 | 3.0%
1999 | 345 | 3.5%
2000 | 365 | 3.6%
2001 | 265 | 2.6%
2002 | 334 | 3.3%
2003 | 336 | 3.4%
2004 | 547 | 5.5%
2005 | 424 | 4.2%
2006 | 556 | 5.6%
2007 | 522 | 5.2%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 168 | 1.7%
2021 | 347 | 3.5%
2020 | 351 | 3.5%
2019 | 450 | 4.5%
2018 | 348 | 3.5%
2017 | 416 | 4.2%
2016 | 515 | 5.1%
2015 | 360 | 3.6%
2014 | 373 | 3.7%
2013 | 490 | 4.9%

품목명
Text

Distinct: 89
Distinct (%): 0.9%
Missing: 71
Missing (%): 0.7%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 3.9814684
Min length: 1

Characters and Unicode

Total characters: 39532
Distinct characters: 99
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
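
The mean length reported for a Text variable is consistent with dividing the total character count by the number of non-missing rows. The sketch below checks this for 품목명 (71 missing values) and 행정구역명 (none), using figures copied from the report; excluding missing rows from the denominator is an assumption that matches the reported values, and the helper name is illustrative.

```python
# Figures copied from the report for the two Text variables.
text_stats = {
    "품목명": {"total_chars": 39532, "n_rows": 10_000, "n_missing": 71},
    "행정구역명": {"total_chars": 79085, "n_rows": 10_000, "n_missing": 0},
}

def mean_length(stats):
    """Average characters per non-missing value (assumed definition)."""
    non_missing = stats["n_rows"] - stats["n_missing"]
    return stats["total_chars"] / non_missing

for name, stats in text_stats.items():
    print(name, round(mean_length(stats), 7))
```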

Unique

Unique: 15
Unique (%): 0.2%

Sample

1st row: 동진1호벼
2nd row: 호품벼
3rd row: 논콩(대립종)
4th row: (empty)
5th row: 새누리벼

Words

Value | Count | Frequency (%)
삼광벼 | 649 | 6.5%
새누리벼 | 486 | 4.9%
추청벼 | 483 | 4.9%
동진1호벼 | 482 | 4.9%
일품벼 | 468 | 4.7%
일미벼 | 435 | 4.4%
쌀보리종자 | 408 | 4.1%
남평벼 | 398 | 4.0%
겉보리종자 | 396 | 4.0%
기타(2군 | 359 | 3.6%
Other values (79) | 5365 | 54.0%

Most occurring characters

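Value | Count | Frequency (%)
(not extracted) | 6444 | 16.3%
(not extracted) | 2227 | 5.6%
(not extracted) | 1785 | 4.5%
(not extracted) | 1416 | 3.6%
) | 1329 | 3.4%
( | 1329 | 3.4%
(not extracted) | 1275 | 3.2%
(not extracted) | 1257 | 3.2%
(not extracted) | 1210 | 3.1%
(not extracted) | 1117 | 2.8%
Other values (89) | 20143 | 51.0%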

Most occurring categories

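Value | Count | Frequency (%)
Other Letter | 36017 | 91.1%
Close Punctuation | 1329 | 3.4%
Open Punctuation | 1329 | 3.4%
Decimal Number | 857 | 2.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not extracted) | 6444 | 17.9%
(not extracted) | 2227 | 6.2%
(not extracted) | 1785 | 5.0%
(not extracted) | 1416 | 3.9%
(not extracted) | 1275 | 3.5%
(not extracted) | 1257 | 3.5%
(not extracted) | 1210 | 3.4%
(not extracted) | 1117 | 3.1%
(not extracted) | 1058 | 2.9%
(not extracted) | 990 | 2.7%
Other values (85) | 17238 | 47.9%
Decimal Number
Value | Count | Frequency (%)
1 | 494 | 57.6%
2 | 363 | 42.4%
Close Punctuation
Value | Count | Frequency (%)
) | 1329 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1329 | 100.0%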

Most occurring scripts

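Value | Count | Frequency (%)
Hangul | 36017 | 91.1%
Common | 3515 | 8.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not extracted) | 6444 | 17.9%
(not extracted) | 2227 | 6.2%
(not extracted) | 1785 | 5.0%
(not extracted) | 1416 | 3.9%
(not extracted) | 1275 | 3.5%
(not extracted) | 1257 | 3.5%
(not extracted) | 1210 | 3.4%
(not extracted) | 1117 | 3.1%
(not extracted) | 1058 | 2.9%
(not extracted) | 990 | 2.7%
Other values (85) | 17238 | 47.9%
Common
Value | Count | Frequency (%)
) | 1329 | 37.8%
( | 1329 | 37.8%
1 | 494 | 14.1%
2 | 363 | 10.3%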

Most occurring blocks

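Value | Count | Frequency (%)
Hangul | 36017 | 91.1%
ASCII | 3515 | 8.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not extracted) | 6444 | 17.9%
(not extracted) | 2227 | 6.2%
(not extracted) | 1785 | 5.0%
(not extracted) | 1416 | 3.9%
(not extracted) | 1275 | 3.5%
(not extracted) | 1257 | 3.5%
(not extracted) | 1210 | 3.4%
(not extracted) | 1117 | 3.1%
(not extracted) | 1058 | 2.9%
(not extracted) | 990 | 2.7%
Other values (85) | 17238 | 47.9%
ASCII
Value | Count | Frequency (%)
) | 1329 | 37.8%
( | 1329 | 37.8%
1 | 494 | 14.1%
2 | 363 | 10.3%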

행정구역명
Text

Distinct: 185
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 8
Mean length: 7.9085
Min length: 7

Characters and Unicode

Total characters: 79085
Distinct characters: 127
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 충청북도 충주시
2nd row: 전라남도 광양시
3rd row: 경상북도 고령군
4th row: 경기도 가평군
5th row: 전라북도 부안군

Words

Value | Count | Frequency (%)
전라남도 | 1875 | 9.4%
경상북도 | 1470 | 7.4%
경상남도 | 1339 | 6.7%
충청남도 | 1074 | 5.4%
전라북도 | 1027 | 5.2%
강원도 | 918 | 4.6%
충청북도 | 712 | 3.6%
경기도 | 634 | 3.2%
광주광역시 | 178 | 0.9%
인천광역시 | 152 | 0.8%
Other values (179) | 10530 | 52.9%

Most occurring characters

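Value | Count | Frequency (%)
(space) | 10000 | 12.6%
(not extracted) | 9315 | 11.8%
(not extracted) | 5816 | 7.4%
(not extracted) | 4656 | 5.9%
(not extracted) | 4581 | 5.8%
(not extracted) | 3625 | 4.6%
(not extracted) | 3317 | 4.2%
(not extracted) | 3078 | 3.9%
(not extracted) | 2902 | 3.7%
(not extracted) | 2887 | 3.7%
Other values (117) | 28908 | 36.6%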

Most occurring categories

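Value | Count | Frequency (%)
Other Letter | 69085 | 87.4%
Space Separator | 10000 | 12.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not extracted) | 9315 | 13.5%
(not extracted) | 5816 | 8.4%
(not extracted) | 4656 | 6.7%
(not extracted) | 4581 | 6.6%
(not extracted) | 3625 | 5.2%
(not extracted) | 3317 | 4.8%
(not extracted) | 3078 | 4.5%
(not extracted) | 2902 | 4.2%
(not extracted) | 2887 | 4.2%
(not extracted) | 2109 | 3.1%
Other values (116) | 26799 | 38.8%
Space Separator
Value | Count | Frequency (%)
(space) | 10000 | 100.0%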

Most occurring scripts

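Value | Count | Frequency (%)
Hangul | 69085 | 87.4%
Common | 10000 | 12.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not extracted) | 9315 | 13.5%
(not extracted) | 5816 | 8.4%
(not extracted) | 4656 | 6.7%
(not extracted) | 4581 | 6.6%
(not extracted) | 3625 | 5.2%
(not extracted) | 3317 | 4.8%
(not extracted) | 3078 | 4.5%
(not extracted) | 2902 | 4.2%
(not extracted) | 2887 | 4.2%
(not extracted) | 2109 | 3.1%
Other values (116) | 26799 | 38.8%
Common
Value | Count | Frequency (%)
(space) | 10000 | 100.0%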

Most occurring blocks

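Value | Count | Frequency (%)
Hangul | 69085 | 87.4%
ASCII | 10000 | 12.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 10000 | 100.0%
Hangul
Value | Count | Frequency (%)
(not extracted) | 9315 | 13.5%
(not extracted) | 5816 | 8.4%
(not extracted) | 4656 | 6.7%
(not extracted) | 4581 | 6.6%
(not extracted) | 3625 | 5.2%
(not extracted) | 3317 | 4.8%
(not extracted) | 3078 | 4.5%
(not extracted) | 2902 | 4.2%
(not extracted) | 2887 | 4.2%
(not extracted) | 2109 | 3.1%
Other values (116) | 26799 | 38.8%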

업무구분명
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
공공비축 포대벼_40kg | 3607
공공비축 산물벼 | 2128
공공비축 포대벼_800kg | 1575
잡곡(콩/옥수수)(2019.01 이전)_40kg | 1111
하곡(보리)_40kg | 1060
Other values (13) | 519

Length

Max length: 27
Median length: 26
Mean length: 13.2757
Min length: 3

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 공공비축 산물벼
2nd row: 공공비축 포대벼_40kg
3rd row: 잡곡(콩/옥수수)(2019.01 이전)_40kg
4th row: 콩(일반콩)_40kg
5th row: 공공비축 포대벼_800kg

Common Values

Value | Count | Frequency (%)
공공비축 포대벼_40kg | 3607 | 36.1%
공공비축 산물벼 | 2128 | 21.3%
공공비축 포대벼_800kg | 1575 | 15.8%
잡곡(콩/옥수수)(2019.01 이전)_40kg | 1111 | 11.1%
하곡(보리)_40kg | 1060 | 10.6%
시장격리곡(농협시가매입)_800kg | 150 | 1.5%
인수벼 | 89 | 0.9%
비축농산물 | 71 | 0.7%
시장격리곡(농협시가매입)_40kg | 66 | 0.7%
콩(일반콩)_40kg | 53 | 0.5%
Other values (8) | 90 | 0.9%

Length

[Figure: histogram of lengths of the category]

Words

Value | Count | Frequency (%)
공공비축 | 7310 | 39.7%
포대벼_40kg | 3607 | 19.6%
산물벼 | 2128 | 11.5%
포대벼_800kg | 1575 | 8.5%
잡곡(콩/옥수수)(2019.01 | 1111 | 6.0%
이전)_40kg | 1111 | 6.0%
하곡(보리)_40kg | 1060 | 5.8%
시장격리곡(농협시가매입)_800kg | 150 | 0.8%
인수벼 | 89 | 0.5%
비축농산물 | 71 | 0.4%
Other values (12) | 217 | 1.2%

수량단위
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
40kg/대 | 8204
800kg/대 | 1725
20Kg/대 | 69
10Kg/대 | 2

Length

Max length: 7
Median length: 6
Mean length: 6.1725
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40kg/대
2nd row: 40kg/대
3rd row: 40kg/대
4th row: 40kg/대
5th row: 800kg/대

Common Values

Value | Count | Frequency (%)
40kg/대 | 8204 | 82.0%
800kg/대 | 1725 | 17.2%
20Kg/대 | 69 | 0.7%
10Kg/대 | 2 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
40kg/대 | 8204 | 82.0%
800kg/대 | 1725 | 17.2%
20kg/대 | 69 | 0.7%
10kg/대 | 2 | < 0.1%

수량
Real number (ℝ)

Distinct: 6759
Distinct (%): 67.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100743.62
Minimum: 1
Maximum: 12658640
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 27
Q1: 444.75
Median: 3451.5
Q3: 24262.75
95-th percentile: 419142
Maximum: 12658640
Range: 12658639
Interquartile range (IQR): 23818
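
The range and interquartile range follow directly from the quantiles listed above. The sketch below recomputes them and also compares the mean (100743.62, from the variable summary) with the median as a rough indicator of how right-skewed 수량 is; the dictionary and variable names are illustrative.

```python
# Quantiles for 수량, copied from the report.
quantiles = {
    "min": 1,
    "q1": 444.75,
    "median": 3451.5,
    "q3": 24262.75,
    "max": 12658640,
}
mean = 100743.62  # from the variable summary

iqr = quantiles["q3"] - quantiles["q1"]
value_range = quantiles["max"] - quantiles["min"]
mean_to_median = mean / quantiles["median"]

print(f"IQR:   {iqr}")
print(f"Range: {value_range}")
print(f"Mean / median: {mean_to_median:.1f}")
```

A mean far above the median is in line with the large positive skewness the report gives for this variable.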

Descriptive statistics

Standard deviation: 474041.05
Coefficient of variation (CV): 4.7054199
Kurtosis: 135.79146
Mean: 100743.62
Median Absolute Deviation (MAD): 3390.5
Skewness: 9.8419566
Sum: 1.0074362 × 10^9
Variance: 2.2471491 × 10^11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
5.0 | 35 | 0.4%
20.0 | 29 | 0.3%
10.0 | 29 | 0.3%
8.0 | 29 | 0.3%
12.0 | 27 | 0.3%
1.0 | 22 | 0.2%
4.0 | 22 | 0.2%
30.0 | 22 | 0.2%
50.0 | 22 | 0.2%
2.0 | 22 | 0.2%
Other values (6749) | 9741 | 97.4%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 22 | 0.2%
2.0 | 22 | 0.2%
3.0 | 17 | 0.2%
4.0 | 22 | 0.2%
5.0 | 35 | 0.4%
6.0 | 18 | 0.2%
7.0 | 18 | 0.2%
8.0 | 29 | 0.3%
9.0 | 21 | 0.2%
10.0 | 29 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
12658640.0 | 1 | < 0.1%
8745280.0 | 1 | < 0.1%
8146880.0 | 1 | < 0.1%
7292520.0 | 1 | < 0.1%
7179160.0 | 1 | < 0.1%
6916680.0 | 1 | < 0.1%
6417480.0 | 1 | < 0.1%
6338000.0 | 1 | < 0.1%
6127560.0 | 1 | < 0.1%
6001160.0 | 1 | < 0.1%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation plot]

 | 년도 | 품목명 | 업무구분명 | 수량단위 | 수량
년도 | 1.000 | 0.846 | 0.577 | 0.420 | 0.186
품목명 | 0.846 | 1.000 | 0.925 | 0.451 | 0.309
업무구분명 | 0.577 | 0.925 | 1.000 | 0.936 | 0.226
수량단위 | 0.420 | 0.451 | 0.936 | 1.000 | 0.064
수량 | 0.186 | 0.309 | 0.226 | 0.064 | 1.000

[Figure: correlation plot]

 | 업무구분명 | 수량단위
업무구분명 | 1.000 | 0.816
수량단위 | 0.816 | 1.000

[Figure: correlation plot]

 | 년도 | 수량 | 업무구분명 | 수량단위
년도 | 1.000 | -0.238 | 0.265 | 0.267
수량 | -0.238 | 1.000 | 0.096 | 0.029
업무구분명 | 0.265 | 0.096 | 1.000 | 0.816
수량단위 | 0.267 | 0.029 | 0.816 | 1.000
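
The last matrix above (년도, 수량, 업무구분명, 수량단위) appears to be the one behind the two "High correlation" alerts. The sketch below scans its upper triangle for pairs whose absolute coefficient reaches a cut-off. The cut-off of 0.5 is an assumption, since the threshold is not stated anywhere in this report, and `high_pairs` is an illustrative helper rather than part of ydata-profiling.

```python
from itertools import combinations

# Last correlation matrix from the report.
labels = ["년도", "수량", "업무구분명", "수량단위"]
matrix = [
    [1.000, -0.238, 0.265, 0.267],
    [-0.238, 1.000, 0.096, 0.029],
    [0.265, 0.096, 1.000, 0.816],
    [0.267, 0.029, 0.816, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated in the report

def high_pairs(labels, matrix, threshold):
    """Return (name_a, name_b, coefficient) for each pair at or above the threshold."""
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs

for a, b, coef in high_pairs(labels, matrix, THRESHOLD):
    print(f"{a} and {b}: {coef}")
```

Under that assumed cut-off only 업무구분명 and 수량단위 qualify, which is the pair named in the alerts.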

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
9254 | 실적 | 2008 | 동진1호벼 | 충청북도 충주시 | 공공비축 산물벼 | 40kg/대 | 95960.0
13371 | 실적 | 2012 | 호품벼 | 전라남도 광양시 | 공공비축 포대벼_40kg | 40kg/대 | 29889.0
5656 | 실적 | 2005 | 논콩(대립종) | 경상북도 고령군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 88.0
21467 | 실적 | 2021 | (empty) | 경기도 가평군 | 콩(일반콩)_40kg | 40kg/대 | 57.0
13025 | 실적 | 2012 | 새누리벼 | 전라북도 부안군 | 공공비축 포대벼_800kg | 800kg/대 | 4851.0
7910 | 실적 | 2007 | 논콩(대립종) | 충청남도 서천군 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 6.0
10689 | 실적 | 2009 | 온누리벼 | 충청남도 서산시 | 공공비축 포대벼_800kg | 800kg/대 | 5.0
21870 | 실적 | 2022 | 일품벼 | 경상북도 의성군 | 시장격리곡(농협시가매입)_800kg | 800kg/대 | 220.0
9430 | 실적 | 2008 | 신동진벼 | 전라북도 익산시 | 공공비축 산물벼 | 40kg/대 | 230720.0
11721 | 실적 | 2010 | 추청벼 | 경기도 수원시 | 공공비축 포대벼_40kg | 40kg/대 | 189.0

Last rows

 | 실적구분명 | 년도 | 품목명 | 행정구역명 | 업무구분명 | 수량단위 | 수량
15029 | 실적 | 2014 | 추청벼 | 경기도 성남시 | 공공비축 포대벼_40kg | 40kg/대 | 640.0
21591 | 실적 | 2022 | 벼(품종혼합) | 경상남도 거제시 | 인수벼 | 40kg/대 | 36697.0
13881 | 실적 | 2013 | 오대벼 | 강원도 강릉시 | 공공비축 포대벼_800kg | 800kg/대 | 67.0
18099 | 실적 | 2017 | (empty) | 경상북도 구미시 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 42.0
21071 | 실적 | 2021 | 새청무벼 | 경상남도 진주시 | 공공비축 포대벼_40kg | 40kg/대 | 12896.0
16539 | 실적 | 2016 | 새누리벼 | 전라남도 나주시 | 공공비축 포대벼_40kg | 40kg/대 | 213260.0
14318 | 실적 | 2013 | 호품벼 | 충청북도 진천군 | 공공비축 산물벼 | 40kg/대 | 2779.0
3390 | 실적 | 2002 | 밭콩(대립종) | 전라남도 목포시 | 잡곡(콩/옥수수)(2019.01 이전)_40kg | 40kg/대 | 20.0
21285 | 실적 | 2021 | 오대벼 | 강원도 영월군 | 공공비축 산물벼 | 40kg/대 | 4308.0
21110 | 실적 | 2021 | 새청무벼 | 전라남도 무안군 | 공공비축 포대벼_40kg | 40kg/대 | 3281.0