Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 3330 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 253.8 KiB |
Average record size in memory | 78.0 B |
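The average record size follows from the two figures above: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the "Dataset statistics" table above.
total_kib = 253.8        # total size in memory, in KiB
n_observations = 3330    # number of rows

# Assumption: 1 KiB = 1024 bytes.
avg_bytes = total_kib * 1024 / n_observations
print(f"{avg_bytes:.1f} B per record")
```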
Variable types
Categorical | 4 |
---|---|
Numeric | 5 |
Dataset
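Description | Regional supply status of green manure crop seeds (녹비작물 종자 지역별 공급 현황정보) |
---|---|
Author | Ministry of Agriculture, Food and Rural Affairs (농림축산식품부) |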
URL | https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220216000000001997 |
SIGNGU_NM has a high cardinality: 153 distinct values | High cardinality |
AR is highly correlated with VOLM | High correlation |
VOLM is highly correlated with AR | High correlation |
AR has 1503 (45.1%) zeros | Zeros |
VOLM has 1508 (45.3%) zeros | Zeros |
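The two "Zeros" alerts can be checked against the observation count. The sketch below only redoes the percentage arithmetic from the counts reported above; the helper name `share` is hypothetical and not part of pandas-profiling.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total

n_observations = 3330                 # from "Dataset statistics"
zeros = {"AR": 1503, "VOLM": 1508}    # zero counts from the alerts above

for name, count in zeros.items():
    print(f"{name}: {share(count, n_observations):.1f}% zeros")
```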
Reproduction
Analysis started | 2022-08-12 14:44:20.781359 |
---|---|
Analysis finished | 2022-08-12 14:44:28.090369 |
Duration | 7.31 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
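A report like this one can be regenerated along the following lines. This is a sketch under assumptions, not the exact invocation used here: the CSV file name is hypothetical, reading YEAR as a string is a guess based on its categorical type in the report, and the original config.json is not applied. It needs pandas and pandas-profiling installed.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imports are kept inside the function so the module loads
    # even where the optional dependencies are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Assumption: YEAR is read as text so it is profiled as categorical.
    df = pd.read_csv(csv_path, dtype={"YEAR": str})
    report = ProfileReport(df, title="Green manure crop seed supply by region")
    report.to_file(html_path)
    return report

# Hypothetical usage (the file name is an assumption):
# build_report("green_manure_seed_supply.csv")
```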
YEAR
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.1 KiB |
Most frequent values (bar chart): 2013, 2014, 2015, 2016, 2017
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2013 |
---|---|
2nd row | 2013 |
3rd row | 2013 |
4th row | 2013 |
5th row | 2013 |
Common Values
Value | Count | Frequency (%) |
2013 | 790 | 23.7% |
2014 | 740 | 22.2% |
2015 | 655 | 19.7% |
2016 | 650 | 19.5% |
2017 | 495 | 14.9% |
CTRD_NM
Categorical
Distinct | 16 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.1 KiB |
Most frequent values (bar chart): 경상북도, 전라남도, 강원도, 경상남도, 경기도, Other values (11)
Length
Max length | 7 |
---|---|
Median length | 4 |
Mean length | 3.896396396 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 강원도 |
---|---|
2nd row | 경상북도 |
3rd row | 충청남도 |
4th row | 경상남도 |
5th row | 전라남도 |
Common Values
Value | Count | Frequency (%) |
경상북도 | 505 | 15.2% |
전라남도 | 495 | 14.9% |
강원도 | 415 | 12.5% |
경상남도 | 410 | 12.3% |
경기도 | 340 | 10.2% |
충청남도 | 325 | 9.8% |
전라북도 | 310 | 9.3% |
충청북도 | 260 | 7.8% |
대전광역시 | 50 | 1.5% |
제주특별자치도 | 45 | 1.4% |
Other values (6) | 175 | 5.3% |
CTRD_CODE
Real number (ℝ≥0)
Distinct | 16 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6433153.153 |
Minimum | 5690000 |
---|---|
Maximum | 6500000 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 29.4 KiB |
Quantile statistics
Minimum | 5690000 |
---|---|
5-th percentile | 6300000 |
Q1 | 6420000 |
median | 6450000 |
Q3 | 6470000 |
95-th percentile | 6480000 |
Maximum | 6500000 |
Range | 810000 |
Interquartile range (IQR) | 50000 |
Descriptive statistics
Standard deviation | 78503.3705 |
---|---|
Coefficient of variation (CV) | 0.01220293822 |
Kurtosis | 58.18409364 |
Mean | 6433153.153 |
Median Absolute Deviation (MAD) | 20000 |
Skewness | -6.706491034 |
Sum | 2.14224 × 10^10 |
Variance | 6162779181 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
6470000 | 505 | 15.2% |
6460000 | 495 | 14.9% |
6420000 | 415 | 12.5% |
6480000 | 410 | 12.3% |
6410000 | 340 | 10.2% |
6440000 | 325 | 9.8% |
6450000 | 310 | 9.3% |
6430000 | 260 | 7.8% |
6300000 | 50 | 1.5% |
6500000 | 45 | 1.4% |
Other values (6) | 175 | 5.3% |
Value | Count | Frequency (%) |
5690000 | 25 | 0.8% |
6260000 | 25 | 0.8% |
6270000 | 10 | 0.3% |
6280000 | 45 | 1.4% |
6290000 | 35 | 1.1% |
6300000 | 50 | 1.5% |
6310000 | 35 | 1.1% |
6410000 | 340 | 10.2% |
6420000 | 415 | 12.5% |
6430000 | 260 | 7.8% |
Value | Count | Frequency (%) |
6500000 | 45 | 1.4% |
6480000 | 410 | 12.3% |
6470000 | 505 | 15.2% |
6460000 | 495 | 14.9% |
6450000 | 310 | 9.3% |
6440000 | 325 | 9.8% |
6430000 | 260 | 7.8% |
6420000 | 415 | 12.5% |
6410000 | 340 | 10.2% |
6310000 | 35 | 1.1% |
SIGNGU_NM
Categorical
Distinct | 153 |
---|---|
Distinct (%) | 4.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.1 KiB |
고성군 | 45 |
---|---|
중구 | 30 |
포천시 | 25 |
연천군 | 25 |
안성시 | 25 |
Other values (148) |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 2.984984985 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 철원군 |
---|---|
2nd row | 청송군 |
3rd row | 청양군 |
4th row | 남해군 |
5th row | 강진군 |
Common Values
Value | Count | Frequency (%) |
고성군 | 45 | 1.4% |
중구 | 30 | 0.9% |
포천시 | 25 | 0.8% |
연천군 | 25 | 0.8% |
안성시 | 25 | 0.8% |
청도군 | 25 | 0.8% |
봉화군 | 25 | 0.8% |
화성시 | 25 | 0.8% |
울진군 | 25 | 0.8% |
인제군 | 25 | 0.8% |
Other values (143) | 3055 | 91.7% |
Length
Value | Count | Frequency (%) |
고성군 | 45 | 1.4% |
중구 | 30 | 0.9% |
장수군 | 25 | 0.8% |
철원군 | 25 | 0.8% |
여수시 | 25 | 0.8% |
구미시 | 25 | 0.8% |
평택시 | 25 | 0.8% |
태백시 | 25 | 0.8% |
정읍시 | 25 | 0.8% |
공주시 | 25 | 0.8% |
Other values (143) | 3055 | 91.7% |
SIGNGU_CODE
Real number (ℝ≥0)
Distinct | 158 |
---|---|
Distinct (%) | 4.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4810773.348 |
Minimum | 3360000 |
---|---|
Maximum | 9999010 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 29.4 KiB |
Quantile statistics
Minimum | 3360000 |
---|---|
5-th percentile | 3690000 |
Q1 | 4350000 |
median | 4800000 |
Q3 | 5180000 |
95-th percentile | 5670000 |
Maximum | 9999010 |
Range | 6639010 |
Interquartile range (IQR) | 830000 |
Descriptive statistics
Standard deviation | 725007.1478 |
---|---|
Coefficient of variation (CV) | 0.1507049065 |
Kurtosis | 17.93432279 |
Mean | 4810773.348 |
Median Absolute Deviation (MAD) | 400000 |
Skewness | 2.668954566 |
Sum | 1.601987525 × 10^10 |
Variance | 5.256353644 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4300000 | 25 | 0.8% |
4170000 | 25 | 0.8% |
5460000 | 25 | 0.8% |
5240000 | 25 | 0.8% |
5530000 | 25 | 0.8% |
5250000 | 25 | 0.8% |
4620000 | 25 | 0.8% |
5190000 | 25 | 0.8% |
5600000 | 25 | 0.8% |
5700000 | 25 | 0.8% |
Other values (148) | 3080 | 92.5% |
Value | Count | Frequency (%) |
3360000 | 10 | 0.3% |
3400000 | 15 | 0.5% |
3480000 | 10 | 0.3% |
3490000 | 5 | 0.2% |
3550000 | 5 | 0.2% |
3570000 | 25 | 0.8% |
3580000 | 10 | 0.3% |
3590000 | 5 | 0.2% |
3610000 | 5 | 0.2% |
3620000 | 10 | 0.3% |
Value | Count | Frequency (%) |
9999010 | 25 | 0.8% |
6520000 | 20 | 0.6% |
6510000 | 25 | 0.8% |
5710000 | 25 | 0.8% |
5700000 | 25 | 0.8% |
5680000 | 25 | 0.8% |
5670000 | 25 | 0.8% |
5600000 | 25 | 0.8% |
5590000 | 15 | 0.5% |
5580000 | 10 | 0.3% |
FRMHS_CO
Real number (ℝ≥0)
Distinct | 479 |
---|---|
Distinct (%) | 14.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 869.1171171 |
Minimum | 1 |
---|---|
Maximum | 9102 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 29.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 5 |
Q1 | 58 |
median | 269 |
Q3 | 1039 |
95-th percentile | 4038 |
Maximum | 9102 |
Range | 9101 |
Interquartile range (IQR) | 981 |
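The range and interquartile range in this table follow directly from the quantiles. A short sketch of that arithmetic, using the FRMHS_CO figures reported above (variable names are illustrative):

```python
# Quantile statistics for FRMHS_CO, copied from the table above.
minimum, q1, q3, maximum = 1, 58, 1039, 9102

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of rows

print(f"Range = {value_range}, IQR = {iqr}")
```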
Descriptive statistics
Standard deviation | 1453.908488 |
---|---|
Coefficient of variation (CV) | 1.672856809 |
Kurtosis | 9.698317504 |
Mean | 869.1171171 |
Median Absolute Deviation (MAD) | 253 |
Skewness | 2.938884766 |
Sum | 2894160 |
Variance | 2113849.89 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 45 | 1.4% |
5 | 40 | 1.2% |
3 | 40 | 1.2% |
2 | 35 | 1.1% |
10 | 35 | 1.1% |
19 | 30 | 0.9% |
16 | 30 | 0.9% |
58 | 30 | 0.9% |
57 | 30 | 0.9% |
6 | 30 | 0.9% |
Other values (469) | 2985 | 89.6% |
Value | Count | Frequency (%) |
1 | 45 | 1.4% |
2 | 35 | 1.1% |
3 | 40 | 1.2% |
4 | 25 | 0.8% |
5 | 40 | 1.2% |
6 | 30 | 0.9% |
7 | 10 | 0.3% |
8 | 10 | 0.3% |
9 | 25 | 0.8% |
10 | 35 | 1.1% |
Value | Count | Frequency (%) |
9102 | 5 | 0.2% |
8557 | 5 | 0.2% |
8476 | 5 | 0.2% |
8404 | 5 | 0.2% |
8048 | 5 | 0.2% |
7882 | 5 | 0.2% |
7755 | 5 | 0.2% |
7574 | 5 | 0.2% |
7211 | 5 | 0.2% |
7082 | 5 | 0.2% |
PRDLST_NM
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.1 KiB |
Most frequent values (bar chart): 호밀, 헤어리베치, 녹비(청)보리, 들목새, 자운영
Length
Max length | 7 |
---|---|
Median length | 5 |
Mean length | 4 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 호밀 |
---|---|
2nd row | 호밀 |
3rd row | 호밀 |
4th row | 호밀 |
5th row | 호밀 |
Common Values
Value | Count | Frequency (%) |
호밀 | 666 | 20.0% |
헤어리베치 | 666 | 20.0% |
녹비(청)보리 | 666 | 20.0% |
들목새 | 666 | 20.0% |
자운영 | 666 | 20.0% |
AR
Real number (ℝ≥0)
Distinct | 1814 |
---|---|
Distinct (%) | 54.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 404502.1218 |
Minimum | 0 |
---|---|
Maximum | 15995095 |
Zeros | 1503 |
Zeros (%) | 45.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 29.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 6093.5 |
Q3 | 174472.5 |
95-th percentile | 2066391.2 |
Maximum | 15995095 |
Range | 15995095 |
Interquartile range (IQR) | 174472.5 |
Descriptive statistics
Standard deviation | 1292497.283 |
---|---|
Coefficient of variation (CV) | 3.195279366 |
Kurtosis | 48.41242495 |
Mean | 404502.1218 |
Median Absolute Deviation (MAD) | 6093.5 |
Skewness | 6.172817172 |
Sum | 1346992065 |
Variance | 1.670549227 × 10^12 |
Monotonicity | Not monotonic |
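Several of the descriptive statistics above are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks these against the values reported for the variable with 1503 zeros (AR); it is a consistency check only, not how the profiler computes them.

```python
import math

n_observations = 3330    # from "Dataset statistics"
mean = 404502.1218       # reported mean
std = 1292497.283        # reported standard deviation

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_observations    # sum of all values

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.6e}")
print(f"Sum      = {total:.0f}")

# Compare with the reported figures, allowing for rounding in the inputs.
assert math.isclose(cv, 3.195279366, rel_tol=1e-6)
assert math.isclose(variance, 1.670549227e12, rel_tol=1e-6)
assert math.isclose(total, 1346992065, rel_tol=1e-6)
```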
Value | Count | Frequency (%) |
0 | 1503 | 45.1% |
11388 | 2 | 0.1% |
22359 | 2 | 0.1% |
1292 | 2 | 0.1% |
1577 | 2 | 0.1% |
2000 | 2 | 0.1% |
4193 | 2 | 0.1% |
4112.9 | 2 | 0.1% |
10760 | 2 | 0.1% |
11664 | 2 | 0.1% |
Other values (1804) | 1809 | 54.3% |
Value | Count | Frequency (%) |
0 | 1503 | 45.1% |
208 | 1 | < 0.1% |
536 | 1 | < 0.1% |
585 | 1 | < 0.1% |
900 | 1 | < 0.1% |
1000 | 2 | 0.1% |
1060 | 1 | < 0.1% |
1065 | 1 | < 0.1% |
1091 | 1 | < 0.1% |
1170 | 1 | < 0.1% |
Value | Count | Frequency (%) |
15995095 | 1 | < 0.1% |
15802421.7 | 1 | < 0.1% |
15108395.2 | 1 | < 0.1% |
13614790.8 | 1 | < 0.1% |
13249691 | 1 | < 0.1% |
13094147.7 | 1 | < 0.1% |
12991182.2 | 1 | < 0.1% |
12337307.4 | 1 | < 0.1% |
11841065.7 | 1 | < 0.1% |
11143780.2 | 1 | < 0.1% |
VOLM
Real number (ℝ≥0)
Distinct | 763 |
---|---|
Distinct (%) | 22.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 334.090991 |
Minimum | 0 |
---|---|
Maximum | 18672 |
Zeros | 1508 |
Zeros (%) | 45.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 29.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 5 |
Q3 | 140 |
95-th percentile | 1633.55 |
Maximum | 18672 |
Range | 18672 |
Interquartile range (IQR) | 140 |
Descriptive statistics
Standard deviation | 1203.840473 |
---|---|
Coefficient of variation (CV) | 3.603331144 |
Kurtosis | 82.10911534 |
Mean | 334.090991 |
Median Absolute Deviation (MAD) | 5 |
Skewness | 7.878045575 |
Sum | 1112523 |
Variance | 1449231.884 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 1508 | 45.3% |
1 | 41 | 1.2% |
4 | 38 | 1.1% |
2 | 35 | 1.1% |
5 | 34 | 1.0% |
3 | 26 | 0.8% |
8 | 25 | 0.8% |
12 | 22 | 0.7% |
7 | 21 | 0.6% |
6 | 20 | 0.6% |
Other values (753) | 1560 | 46.8% |
Value | Count | Frequency (%) |
0 | 1508 | 45.3% |
1 | 41 | 1.2% |
2 | 35 | 1.1% |
3 | 26 | 0.8% |
4 | 38 | 1.1% |
5 | 34 | 1.0% |
6 | 20 | 0.6% |
7 | 21 | 0.6% |
8 | 25 | 0.8% |
9 | 15 | 0.5% |
Value | Count | Frequency (%) |
18672 | 1 | < 0.1% |
18392 | 1 | < 0.1% |
17379 | 1 | < 0.1% |
14873 | 1 | < 0.1% |
14625 | 1 | < 0.1% |
12896 | 1 | < 0.1% |
12753 | 1 | < 0.1% |
12622 | 1 | < 0.1% |
12483 | 1 | < 0.1% |
11774 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
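The calculation described above can be sketched in plain Python. The function name `pearson_r` is illustrative, and the sample values are the AR and VOLM columns of the ten "First rows" in this report, so the result describes only those ten rows, not the full dataset.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# AR and VOLM values copied from the ten "First rows" of this report.
ar = [313131.9, 2403573.0, 1294832.2, 167248.5, 1083821.0,
      4120646.0, 348545.5, 233356.9, 454769.8, 2628165.0]
volm = [253, 1944, 1051, 133, 875, 3282, 276, 180, 362, 2101]

r = pearson_r(ar, volm)
print(f"r(AR, VOLM) on the first ten rows = {r:.4f}")
```

Shifting or rescaling either input by a positive factor, for example `pearson_r([2 * x + 5 for x in ar], volm)`, leaves the result unchanged up to floating-point error, which is the invariance mentioned above.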
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
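A minimal sketch of that definition: rank both variables, then apply the Pearson formula to the ranks. The helper names are illustrative, and giving tied values their average rank is an assumption (it is the usual convention, but the text above does not specify tie handling).

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: rho is 1, Pearson's r is below 1.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"rho = {spearman_rho(x, y):.3f}, r = {pearson_r(x, y):.3f}")
```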
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
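The pair-counting definition above corresponds to the variant usually called tau-a. A small sketch follows; the function name is illustrative, and library implementations often use the tie-adjusted tau-b, which can give different values when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 is a tie; it counts toward neither group
    return (concordant - discordant) / len(pairs)

# Two of the ten pairs are swapped: 8 concordant, 2 discordant.
print(kendall_tau_a([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```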
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
Index | YEAR | CTRD_NM | CTRD_CODE | SIGNGU_NM | SIGNGU_CODE | FRMHS_CO | PRDLST_NM | AR | VOLM |
---|---|---|---|---|---|---|---|---|---|
0 | 2013 | 강원도 | 6420000 | 철원군 | 4300000 | 204 | 호밀 | 313131.9 | 253 |
1 | 2013 | 경상북도 | 6470000 | 청송군 | 5160000 | 1181 | 호밀 | 2403573.0 | 1944 |
2 | 2013 | 충청남도 | 6440000 | 청양군 | 4590000 | 820 | 호밀 | 1294832.2 | 1051 |
3 | 2013 | 경상남도 | 6480000 | 남해군 | 5430000 | 1352 | 호밀 | 167248.5 | 133 |
4 | 2013 | 전라남도 | 6460000 | 강진군 | 4920000 | 8476 | 호밀 | 1083821.0 | 875 |
5 | 2013 | 강원도 | 6420000 | 화천군 | 4310000 | 1454 | 호밀 | 4120646.0 | 3282 |
6 | 2013 | 경상북도 | 6470000 | 영양군 | 5170000 | 454 | 호밀 | 348545.5 | 276 |
7 | 2013 | 전라북도 | 6450000 | 부안군 | 4790000 | 908 | 호밀 | 233356.9 | 180 |
8 | 2013 | 충청남도 | 6440000 | 홍성군 | 4600000 | 343 | 호밀 | 454769.8 | 362 |
9 | 2013 | 경상남도 | 6480000 | 하동군 | 5440000 | 4440 | 호밀 | 2628165.0 | 2101 |
Last rows
Index | YEAR | CTRD_NM | CTRD_CODE | SIGNGU_NM | SIGNGU_CODE | FRMHS_CO | PRDLST_NM | AR | VOLM |
---|---|---|---|---|---|---|---|---|---|
3320 | 2016 | 전라북도 | 6450000 | 임실군 | 4760000 | 6 | 자운영 | 0.0 | 0 |
3321 | 2016 | 경상남도 | 6480000 | 창녕군 | 5410000 | 476 | 자운영 | 10719.0 | 1 |
3322 | 2016 | 강원도 | 6420000 | 정선군 | 4290000 | 250 | 자운영 | 0.0 | 0 |
3323 | 2016 | 전라남도 | 6460000 | 화순군 | 4900000 | 1146 | 자운영 | 562825.5 | 125 |
3324 | 2016 | 경상북도 | 6470000 | 의성군 | 5150000 | 27 | 자운영 | 0.0 | 0 |
3325 | 2016 | 전라북도 | 6450000 | 순창군 | 4770000 | 312 | 자운영 | 5368.5 | 2 |
3326 | 2016 | 경상남도 | 6480000 | 고성군 | 5420000 | 148 | 자운영 | 0.0 | 0 |
3327 | 2016 | 충청북도 | 6430000 | 단양군 | 4480000 | 46 | 자운영 | 0.0 | 0 |
3328 | 2016 | 전라남도 | 6460000 | 장흥군 | 4910000 | 1051 | 자운영 | 1079907.7 | 209 |
3329 | 2016 | 강원도 | 6420000 | 철원군 | 4300000 | 58 | 자운영 | 0.0 | 0 |