Dataset statistics
Number of variables | 5 |
---|---|
Number of observations | 120 |
Missing cells | 5 |
Missing cells (%) | 0.8% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 5.2 KiB |
Average record size in memory | 44.1 B |
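The missing-cell percentage in the table above can be checked from the counts it lists: missing cells divided by the total number of cells (observations times variables). A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics table.
n_variables = 5
n_observations = 120
n_missing_cells = 5

# Total cells = rows x columns; the percentage is taken over all cells.
n_cells = n_observations * n_variables
missing_cells_pct = 100 * n_missing_cells / n_cells

print(f"{n_cells} cells, {missing_cells_pct:.1f}% missing")
```

The per-variable figure differs because its denominator is the 120 observations of a single column rather than all 600 cells.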
Variable types
Categorical | 3 |
---|---|
Numeric | 2 |
Dataset
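Description | Lookup service for the production status by sericulture type, and the silkworm production and sales status, of all sericulture households nationwide |
---|---|
Author | 농림축산식품부 (Ministry of Agriculture, Food and Rural Affairs) |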
URL | https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220215000000001937 |
PRDCTN_FRMHS_CO has 5 (4.2%) missing values | Missing |
PRDCTN_FRMHS_CO has 15 (12.5%) zeros | Zeros |
Reproduction
Analysis started | 2022-08-12 14:42:04.370348 |
---|---|
Analysis finished | 2022-08-12 14:42:05.947881 |
Duration | 1.58 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
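The reported duration is the difference between the two timestamps above. A small sketch using only the Python standard library, with the timestamps copied from the table:

```python
from datetime import datetime

# Timestamps copied from the reproduction table.
started = datetime.fromisoformat("2022-08-12 14:42:04.370348")
finished = datetime.fromisoformat("2022-08-12 14:42:05.947881")

# Subtracting two datetimes yields a timedelta.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```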
YEAR
Categorical
Distinct | 2 |
---|---|
Distinct (%) | 1.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.1 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2013 |
---|---|
2nd row | 2013 |
3rd row | 2013 |
4th row | 2013 |
5th row | 2013 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
2013 | 60 | 50.0% |
2014 | 60 | 50.0% |
CTPRVN
Categorical
Distinct | 13 |
---|---|
Distinct (%) | 10.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.1 KiB |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 4.25 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 제주광역시 |
---|---|
2nd row | 제주광역시 |
3rd row | 제주광역시 |
4th row | 제주광역시 |
5th row | 제주광역시 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
대구광역시 | 10 | 8.3% |
광주광역시 | 10 | 8.3% |
세종자치시 | 10 | 8.3% |
경기도 | 10 | 8.3% |
강원도 | 10 | 8.3% |
충청북도 | 10 | 8.3% |
충청남도 | 10 | 8.3% |
전라북도 | 10 | 8.3% |
전라남도 | 10 | 8.3% |
경상북도 | 10 | 8.3% |
Other values (3) | 20 | 16.7% |
SE
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 4.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.1 KiB |
Length
Max length | 6 |
---|---|
Median length | 5 |
Mean length | 4.8 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 3년이하 |
---|---|
2nd row | 3~5년 |
3rd row | 6~10년 |
4th row | 11~20년 |
5th row | 21년이상 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
3년이하 | 24 | 20.0% |
3~5년 | 24 | 20.0% |
6~10년 | 24 | 20.0% |
11~20년 | 24 | 20.0% |
21년이상 | 24 | 20.0% |
PRDCTN_FRMHS_CO
Real number (ℝ≥0)
Distinct | 71 |
---|---|
Distinct (%) | 61.7% |
Missing | 5 |
Missing (%) | 4.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 112.6869565 |
Minimum | 0 |
---|---|
Maximum | 1600 |
Zeros | 15 |
Zeros (%) | 12.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 5 |
median | 43 |
Q3 | 97.5 |
95-th percentile | 372 |
Maximum | 1600 |
Range | 1600 |
Interquartile range (IQR) | 92.5 |
Descriptive statistics
Standard deviation | 238.8734292 |
---|---|
Coefficient of variation (CV) | 2.119796617 |
Kurtosis | 24.97959279 |
Mean | 112.6869565 |
Median Absolute Deviation (MAD) | 42 |
Skewness | 4.657113986 |
Sum | 12959 |
Variance | 57060.51518 |
Monotonicity | Not monotonic |
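Several of the PRDCTN_FRMHS_CO statistics above are derived from one another, which makes a quick consistency check possible. The sketch below recomputes them from values copied out of the tables; the variable names are illustrative, and the mean is assumed to be taken over non-missing observations only.

```python
import math

# Values copied from the PRDCTN_FRMHS_CO tables above.
n_observations = 120
n_missing = 5
total = 12959
std = 238.8734292
q1, q3 = 5, 97.5
minimum, maximum = 0, 1600

# Assumption: the mean uses only the non-missing observations.
n_valid = n_observations - n_missing
mean = total / n_valid

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(mean, cv, variance, iqr, value_range)
```

Each recomputed value agrees with the reported one to within rounding. A CV above 2 together with a skewness of about 4.66 points to a strongly right-skewed distribution, dominated by a few large values such as 1600 and 1565.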
Common Values
Value | Count | Frequency (%) |
---|---|---|
0 | 15 | 12.5% |
1 | 8 | 6.7% |
5 | 5 | 4.2% |
2 | 4 | 3.3% |
43 | 3 | 2.5% |
92 | 3 | 2.5% |
13 | 3 | 2.5% |
10 | 2 | 1.7% |
41 | 2 | 1.7% |
28 | 2 | 1.7% |
Other values (61) | 68 | 56.7% |
(Missing) | 5 | 4.2% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 15 | 12.5% |
1 | 8 | 6.7% |
2 | 4 | 3.3% |
5 | 5 | 4.2% |
7 | 2 | 1.7% |
8 | 1 | 0.8% |
9 | 1 | 0.8% |
10 | 2 | 1.7% |
11 | 1 | 0.8% |
12 | 1 | 0.8% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
1600 | 1 | 0.8% |
1565 | 1 | 0.8% |
868 | 1 | 0.8% |
760 | 1 | 0.8% |
426 | 1 | 0.8% |
414 | 1 | 0.8% |
354 | 1 | 0.8% |
352 | 1 | 0.8% |
329 | 1 | 0.8% |
309 | 1 | 0.8% |
CTPRVN_CD
Real number (ℝ≥0)
Distinct | 12 |
---|---|
Distinct (%) | 10.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 41.5 |
Minimum | 27 |
---|---|
Maximum | 50 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.2 KiB |
Quantile statistics
Minimum | 27 |
---|---|
5-th percentile | 27 |
Q1 | 39.75 |
median | 43.5 |
Q3 | 46.25 |
95-th percentile | 50 |
Maximum | 50 |
Range | 23 |
Interquartile range (IQR) | 6.5 |
Descriptive statistics
Standard deviation | 6.999399734 |
---|---|
Coefficient of variation (CV) | 0.1686602346 |
Kurtosis | -0.2028529704 |
Mean | 41.5 |
Median Absolute Deviation (MAD) | 3 |
Skewness | -0.9868816487 |
Sum | 4980 |
Variance | 48.99159664 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
---|---|---|
50 | 10 | 8.3% |
27 | 10 | 8.3% |
29 | 10 | 8.3% |
36 | 10 | 8.3% |
41 | 10 | 8.3% |
42 | 10 | 8.3% |
43 | 10 | 8.3% |
44 | 10 | 8.3% |
45 | 10 | 8.3% |
46 | 10 | 8.3% |
Other values (2) | 20 | 16.7% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
27 | 10 | 8.3% |
29 | 10 | 8.3% |
36 | 10 | 8.3% |
41 | 10 | 8.3% |
42 | 10 | 8.3% |
43 | 10 | 8.3% |
44 | 10 | 8.3% |
45 | 10 | 8.3% |
46 | 10 | 8.3% |
47 | 10 | 8.3% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
50 | 10 | 8.3% |
48 | 10 | 8.3% |
47 | 10 | 8.3% |
46 | 10 | 8.3% |
45 | 10 | 8.3% |
44 | 10 | 8.3% |
43 | 10 | 8.3% |
42 | 10 | 8.3% |
41 | 10 | 8.3% |
36 | 10 | 8.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
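The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; `pearson_r` is a hypothetical helper name, and the sample data are made up for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up sample data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)

# Shifting and positively rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_shifted)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative number flips the sign of r.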
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
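Put differently, ρ is Pearson's r applied to ranks. The sketch below illustrates this under the assumption that tied values receive the average of their ranks; the helper names and sample data are hypothetical and not taken from the report.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ reaches 1 while r stays below 1, which is the sense in which ρ captures nonlinear monotonic association.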
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
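The pair-counting definition above translates directly into code. The following sketch implements it literally (the variant usually called tau-a, which applies no tie correction); statistical libraries often use a tie-adjusted variant, so their results may differ when ties are present. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The product is positive when both variables order the pair the same way.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in x or y; the pair counts toward neither total.
    return (concordant - discordant) / len(pairs)

# Made-up sample: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, giving τ = (5 - 1) / 6.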
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
Index | YEAR | CTPRVN | SE | PRDCTN_FRMHS_CO | CTPRVN_CD |
---|---|---|---|---|---|
0 | 2013 | 제주광역시 | 3년이하 | 0 | 50 |
1 | 2013 | 제주광역시 | 3~5년 | 10 | 50 |
2 | 2013 | 제주광역시 | 6~10년 | 1 | 50 |
3 | 2013 | 제주광역시 | 11~20년 | 0 | 50 |
4 | 2013 | 제주광역시 | 21년이상 | 0 | 50 |
5 | 2013 | 대구광역시 | 3년이하 | 13 | 27 |
6 | 2013 | 대구광역시 | 3~5년 | 0 | 27 |
7 | 2013 | 대구광역시 | 6~10년 | 0 | 27 |
8 | 2013 | 대구광역시 | 11~20년 | 0 | 27 |
9 | 2013 | 대구광역시 | 21년이상 | 1 | 27 |
Last rows
Index | YEAR | CTPRVN | SE | PRDCTN_FRMHS_CO | CTPRVN_CD |
---|---|---|---|---|---|
110 | 2014 | 경상북도 | 3년이하 | 83 | 47 |
111 | 2014 | 경상북도 | 3~5년 | 168 | 47 |
112 | 2014 | 경상북도 | 6~10년 | 122 | 47 |
113 | 2014 | 경상북도 | 11~20년 | 91 | 47 |
114 | 2014 | 경상북도 | 21년이상 | 188 | 47 |
115 | 2014 | 경상남도 | 3년이하 | 63 | 48 |
116 | 2014 | 경상남도 | 3~5년 | 129 | 48 |
117 | 2014 | 경상남도 | 6~10년 | 65 | 48 |
118 | 2014 | 경상남도 | 11~20년 | 41 | 48 |
119 | 2014 | 경상남도 | 21년이상 | 41 | 48 |