Overview

Dataset statistics

Number of variables: 7
Number of observations: 2896
Missing cells: 3275
Missing cells (%): 16.2%
Duplicate rows: 113
Duplicate rows (%): 3.9%
Total size in memory: 169.8 KiB
Average record size in memory: 60.0 B
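
The percentages and the average record size follow from the raw counts in this block. A minimal sketch of that arithmetic, assuming missing cells are measured against rows × columns and duplicates against the row count (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 7
n_observations = 2896
missing_cells = 3275
duplicate_rows = 113
total_size_kib = 169.8

# Missing cells are counted over every cell in the table (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Duplicate rows are counted against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size: total memory (KiB -> bytes) divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicated, "
      f"{avg_record_bytes:.1f} B per record")
```

Rounded to one decimal place, the three results agree with the reported 16.2%, 3.9%, and 60.0 B.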

Variable types

Numeric: 4
Categorical: 2
Text: 1

Dataset

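Description: Statistics on the number of farm households, rearing volume, and production volume by city/county (시군) for functional sericulture products (silkworm cocoons, male pupae, fresh silkworms, cordyceps, silkworm excrement, mulberries, dried silkworms, other)
Author: 농림축산식품부 (Ministry of Agriculture, Food and Rural Affairs)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20181023000000001004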

Alerts

Dataset has 113 (3.9%) duplicate rows [Duplicates]
농가수(호) is highly overall correlated with 사육량(상자) and 1 other field [High correlation]
사육량(상자) is highly overall correlated with 농가수(호) and 1 other field [High correlation]
생산 량(kg) is highly overall correlated with 농가수(호) and 1 other field [High correlation]
시군 has 776 (26.8%) missing values [Missing]
농가수(호) has 783 (27.0%) missing values [Missing]
사육량(상자) has 932 (32.2%) missing values [Missing]
생산 량(kg) has 784 (27.1%) missing values [Missing]
농가수(호) has 1239 (42.8%) zeros [Zeros]
사육량(상자) has 1516 (52.3%) zeros [Zeros]
생산 량(kg) has 1240 (42.8%) zeros [Zeros]
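
The per-column counts in these alerts can be cross-checked against the dataset-level totals. The sketch below is only a consistency check on the published numbers; the dictionary names are illustrative, and the row count of 2896 comes from the Overview.

```python
n_observations = 2896

# Counts copied from the Missing and Zeros alerts.
missing_by_column = {"시군": 776, "농가수(호)": 783, "사육량(상자)": 932, "생산 량(kg)": 784}
zeros_by_column = {"농가수(호)": 1239, "사육량(상자)": 1516, "생산 량(kg)": 1240}

# The four columns with missing values should account for every missing cell.
total_missing = sum(missing_by_column.values())

# Shares are expressed relative to all rows in the dataset.
missing_share = {name: 100 * count / n_observations for name, count in missing_by_column.items()}
zero_share = {name: 100 * count / n_observations for name, count in zeros_by_column.items()}

print(total_missing)
for name, share in missing_share.items():
    print(f"{name}: {share:.1f}% missing")
for name, share in zero_share.items():
    print(f"{name}: {share:.1f}% zeros")
```

The column totals add up to the 3275 missing cells reported in the Overview, which suggests the remaining three columns have no missing values.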

Reproduction

Analysis started: 2023-12-11 03:25:51.038026
Analysis finished: 2023-12-11 03:25:53.842995
Duration: 2.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.5815
Minimum: 2008
Maximum: 2014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.6 KiB

Quantile statistics

Minimum: 2008
5th percentile: 2008
Q1: 2012
Median: 2013
Q3: 2014
95th percentile: 2014
Maximum: 2014
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.7622767
Coefficient of variation (CV): 0.00087563001
Kurtosis: 0.7568985
Mean: 2012.5815
Median Absolute Deviation (MAD): 1
Skewness: -1.3794733
Sum: 5828436
Variance: 3.1056193
Monotonicity: Decreasing
[Figure: Histogram with fixed size bins (bins=7)]

Common values

Value | Count | Frequency (%)
2014 | 1083 | 37.4%
2013 | 1048 | 36.2%
2012 | 153 | 5.3%
2011 | 153 | 5.3%
2010 | 153 | 5.3%
2009 | 153 | 5.3%
2008 | 153 | 5.3%

Minimum values

Value | Count | Frequency (%)
2008 | 153 | 5.3%
2009 | 153 | 5.3%
2010 | 153 | 5.3%
2011 | 153 | 5.3%
2012 | 153 | 5.3%
2013 | 1048 | 36.2%
2014 | 1083 | 37.4%

Maximum values

Value | Count | Frequency (%)
2014 | 1083 | 37.4%
2013 | 1048 | 36.2%
2012 | 153 | 5.3%
2011 | 153 | 5.3%
2010 | 153 | 5.3%
2009 | 153 | 5.3%
2008 | 153 | 5.3%
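
Because 연도 takes only seven values, its descriptive statistics can be rebuilt from the frequency table alone. The sketch below does that with the standard library; it assumes the reported variance is the sample variance (n - 1 in the denominator), which is an inference from the numbers rather than something the report states.

```python
import math

# Year -> count, copied from the frequency table for 연도.
year_counts = {2008: 153, 2009: 153, 2010: 153, 2011: 153, 2012: 153,
               2013: 1048, 2014: 1083}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (year - mean) ** 2 for year, count in year_counts.items()) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total, round(mean, 4), round(variance, 4), round(std, 4))
```

The tiny coefficient of variation is expected here: calendar years sit around 2012 while their spread is under two years, so the ratio says little about the data.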

시도 (province / metropolitan city)
Categorical

Distinct: 20
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 22.8 KiB

Value | Count
전라남도 | 405
경상북도 | 359
경상남도 | 342
충청남도 | 315
전라북도 | 297
Other values (15) | 1178

Length

Max length: 7
Median length: 4
Mean length: 4.0901243
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상남도
2nd row: 경상남도
3rd row: 경상남도
4th row: 경상남도
5th row: 경상남도

Common Values

Value | Count | Frequency (%)
전라남도 | 405 | 14.0%
경상북도 | 359 | 12.4%
경상남도 | 342 | 11.8%
충청남도 | 315 | 10.9%
전라북도 | 297 | 10.3%
충청북도 | 252 | 8.7%
강원도 | 225 | 7.8%
경기도 | 195 | 6.7%
광주광역시 | 81 | 2.8%
대구광역시 | 63 | 2.2%
Other values (10) | 362 | 12.5%

Length

[Figure: Histogram of lengths of the category]

시군 (city / county)
Text

Alerts: MISSING

Distinct: 192
Distinct (%): 9.1%
Missing: 776
Missing (%): 26.8%
Memory size: 22.8 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.5490566
Min length: 2

Characters and Unicode

Total characters: 5404
Distinct characters: 100
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 함안
2nd row: 창녕
3rd row: 고성
4th row: 남해
5th row: 하동

Value | Count | Frequency (%)
고성 | 27 | 1.3%
영동군 | 18 | 0.8%
청주시 | 18 | 0.8%
평창 | 18 | 0.8%
보은군 | 18 | 0.8%
제천시 | 18 | 0.8%
충주시 | 18 | 0.8%
홍성군 | 18 | 0.8%
인제 | 18 | 0.8%
영월 | 18 | 0.8%
Other values (182) | 1931 | 91.1%

Most occurring characters

Count | Frequency (%)
706 | 13.1%
444 | 8.2%
282 | 5.2%
263 | 4.9%
216 | 4.0%
207 | 3.8%
158 | 2.9%
139 | 2.6%
130 | 2.4%
117 | 2.2%
2742 | 50.7% (Other values: 90)

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5404 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5404 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5404 | 100.0%

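누에구분 (silkworm product type)
Categorical

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 22.8 KiB

Value | Count
잠분 (silkworm excrement) | 696
수번데기 (male pupae) | 327
생누에 (fresh silkworms) | 315
건조누에 (dried silkworms) | 315
누에고치 (silkworm cocoons) | 315
Other values (5) | 928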

Length

Max length: 4
Median length: 3
Mean length: 2.9872238
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수번데기
2nd row: 수번데기
3rd row: 수번데기
4th row: 수번데기
5th row: 수번데기

Common Values

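Value | Count | Frequency (%)
잠분 (silkworm excrement) | 696 | 24.0%
수번데기 (male pupae) | 327 | 11.3%
생누에 (fresh silkworms) | 315 | 10.9%
건조누에 (dried silkworms) | 315 | 10.9%
누에고치 (silkworm cocoons) | 315 | 10.9%
뽕잎 (mulberry leaves) | 205 | 7.1%
오디 (mulberries) | 204 | 7.0%
기타 (other) | 204 | 7.0%
동충하초 (cordyceps) | 202 | 7.0%
동중하초 (likely a misspelling of 동충하초) | 113 | 3.9%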
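
The two near-identical labels 동충하초 (202) and 동중하초 (113) together give 315, the same count as several other product types, which hints that they are one category split by a spelling variant. A minimal sketch of how such labels could be merged before analysis; the correction map and the helper name `normalize_labels` are hypothetical and rest on the assumption that 동중하초 is a typo.

```python
from collections import Counter

# Category counts copied from the Common Values table for 누에구분.
raw_counts = Counter({
    "잠분": 696, "수번데기": 327, "생누에": 315, "건조누에": 315, "누에고치": 315,
    "뽕잎": 205, "오디": 204, "기타": 204, "동충하초": 202, "동중하초": 113,
})

# Hypothetical correction map: assumes 동중하초 is a misspelling of 동충하초.
CORRECTIONS = {"동중하초": "동충하초"}


def normalize_labels(counts, corrections):
    """Merge counts of labels that map to the same corrected label."""
    merged = Counter()
    for label, count in counts.items():
        merged[corrections.get(label, label)] += count
    return merged


cleaned = normalize_labels(raw_counts, CORRECTIONS)
print(cleaned["동충하초"], len(cleaned))
```

After merging, the variable would have 9 distinct values instead of 10, and the total row count stays at 2896.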

Length

[Figure: Histogram of lengths of the category]

농가수(호) (number of farm households)
Real number (ℝ)

Alerts: HIGH CORRELATION, MISSING, ZEROS

Distinct: 140
Distinct (%): 6.6%
Missing: 783
Missing (%): 27.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.416943
Minimum: 0
Maximum: 3431
Zeros: 1239
Zeros (%): 42.8%
Negative: 0
Negative (%): 0.0%
Memory size: 25.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 3
95th percentile: 68
Maximum: 3431
Range: 3431
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 168.96937
Coefficient of variation (CV): 7.5375743
Kurtosis: 292.02177
Mean: 22.416943
Median Absolute Deviation (MAD): 0
Skewness: 15.948923
Sum: 47367
Variance: 28550.649
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 1239 | 42.8%
1 | 219 | 7.6%
2 | 120 | 4.1%
3 | 61 | 2.1%
4 | 49 | 1.7%
5 | 42 | 1.5%
6 | 23 | 0.8%
10 | 21 | 0.7%
7 | 19 | 0.7%
8 | 17 | 0.6%
Other values (130) | 303 | 10.5%
(Missing) | 783 | 27.0%

Minimum values

Value | Count | Frequency (%)
0 | 1239 | 42.8%
1 | 219 | 7.6%
2 | 120 | 4.1%
3 | 61 | 2.1%
4 | 49 | 1.7%
5 | 42 | 1.5%
6 | 23 | 0.8%
7 | 19 | 0.7%
8 | 17 | 0.6%
9 | 10 | 0.3%

Maximum values

Value | Count | Frequency (%)
3431 | 1 | < 0.1%
3424 | 1 | < 0.1%
3326 | 1 | < 0.1%
3102 | 1 | < 0.1%
2184 | 1 | < 0.1%
1006 | 2 | 0.1%
991 | 1 | < 0.1%
987 | 1 | < 0.1%
953 | 1 | < 0.1%
939 | 1 | < 0.1%
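
The percentages for 농가수(호) do not all share one denominator, which matters when reading a column that is 27% missing. The sketch below reproduces the reported figures under the assumption that the mean and Distinct (%) are computed over non-missing rows while Zeros (%) is computed over all rows; that reading is inferred from the numbers, not stated in the report.

```python
# Figures copied from the 농가수(호) statistics and the Overview.
n_rows = 2896
missing = 783
zeros = 1239
distinct = 140
total = 47367
std = 168.96937

# Rows that actually carry a value.
n_valid = n_rows - missing

# Mean and distinct share: relative to non-missing rows (assumed).
mean = total / n_valid
distinct_pct = 100 * distinct / n_valid

# Zero share: relative to all rows, missing ones included (assumed).
zeros_pct = 100 * zeros / n_rows

# Coefficient of variation: spread relative to the mean.
cv = std / mean

print(n_valid, round(mean, 6), round(distinct_pct, 1), round(zeros_pct, 1), round(cv, 4))
```

Among rows that have a value, zeros make up 1239 of 2113, or roughly 59%, so the column is even more zero-heavy than the 42.8% headline suggests.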

사육량(상자) (rearing volume, boxes)
Real number (ℝ)

Alerts: HIGH CORRELATION, MISSING, ZEROS

Distinct: 183
Distinct (%): 9.3%
Missing: 932
Missing (%): 32.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.862525
Minimum: 0
Maximum: 7093
Zeros: 1516
Zeros (%): 52.3%
Negative: 0
Negative (%): 0.0%
Memory size: 25.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 131
Maximum: 7093
Range: 7093
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 382.09593
Coefficient of variation (CV): 6.719644
Kurtosis: 183.93365
Mean: 56.862525
Median Absolute Deviation (MAD): 0
Skewness: 12.398197
Sum: 111678
Variance: 145997.3
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 1516 | 52.3%
20 | 22 | 0.8%
1 | 21 | 0.7%
5 | 19 | 0.7%
2 | 14 | 0.5%
10 | 14 | 0.5%
6 | 12 | 0.4%
30 | 12 | 0.4%
4 | 12 | 0.4%
15 | 10 | 0.3%
Other values (173) | 312 | 10.8%
(Missing) | 932 | 32.2%

Minimum values

Value | Count | Frequency (%)
0 | 1516 | 52.3%
1 | 21 | 0.7%
2 | 14 | 0.5%
3 | 10 | 0.3%
4 | 12 | 0.4%
5 | 19 | 0.7%
6 | 12 | 0.4%
7 | 2 | 0.1%
8 | 5 | 0.2%
9 | 4 | 0.1%

Maximum values

Value | Count | Frequency (%)
7093 | 1 | < 0.1%
6641 | 1 | < 0.1%
6267 | 1 | < 0.1%
5315 | 1 | < 0.1%
5209 | 1 | < 0.1%
3592 | 1 | < 0.1%
3533 | 1 | < 0.1%
2500 | 1 | < 0.1%
2344 | 1 | < 0.1%
2310 | 1 | < 0.1%

생산 량(kg) (production volume, kg)
Real number (ℝ)

Alerts: HIGH CORRELATION, MISSING, ZEROS

Distinct: 486
Distinct (%): 23.0%
Missing: 784
Missing (%): 27.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 21559.89
Minimum: 0
Maximum: 4789166
Zeros: 1240
Zeros (%): 42.8%
Negative: 0
Negative (%): 0.0%
Memory size: 25.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 600
95th percentile: 34643.5
Maximum: 4789166
Range: 4789166
Interquartile range (IQR): 600

Descriptive statistics

Standard deviation: 216168.26
Coefficient of variation (CV): 10.026408
Kurtosis: 309.01814
Mean: 21559.89
Median Absolute Deviation (MAD): 0
Skewness: 16.667578
Sum: 45534488
Variance: 4.6728718 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.0 | 1240 | 42.8%
1000.0 | 20 | 0.7%
200.0 | 19 | 0.7%
300.0 | 18 | 0.6%
100.0 | 17 | 0.6%
1500.0 | 14 | 0.5%
500.0 | 14 | 0.5%
20.0 | 13 | 0.4%
50.0 | 13 | 0.4%
2000.0 | 11 | 0.4%
Other values (476) | 733 | 25.3%
(Missing) | 784 | 27.1%

Minimum values

Value | Count | Frequency (%)
0.0 | 1240 | 42.8%
1.0 | 3 | 0.1%
5.0 | 3 | 0.1%
6.0 | 1 | < 0.1%
8.0 | 2 | 0.1%
9.0 | 1 | < 0.1%
10.0 | 8 | 0.3%
13.0 | 3 | 0.1%
14.0 | 2 | 0.1%
15.0 | 9 | 0.3%

Maximum values

Value | Count | Frequency (%)
4789166.0 | 1 | < 0.1%
4206435.0 | 1 | < 0.1%
4057359.0 | 1 | < 0.1%
3970360.0 | 1 | < 0.1%
2911360.0 | 1 | < 0.1%
2000000.0 | 1 | < 0.1%
1680000.0 | 1 | < 0.1%
1184000.0 | 1 | < 0.1%
1132838.0 | 1 | < 0.1%
1130290.0 | 1 | < 0.1%

Correlations

Variable | 연도 | 시도 | 누에구분 | 농가수(호) | 사육량(상자) | 생산 량(kg)
연도 | 1.000 | 0.529 | 0.456 | 0.041 | 0.109 | 0.042
시도 | 0.529 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
누에구분 | 0.456 | 0.000 | 1.000 | 0.246 | 0.167 | 0.123
농가수(호) | 0.041 | 0.000 | 0.246 | 1.000 | 0.396 | 0.901
사육량(상자) | 0.109 | 0.000 | 0.167 | 0.396 | 1.000 | 0.000
생산 량(kg) | 0.042 | 0.000 | 0.123 | 0.901 | 0.000 | 1.000

Variable | 시도 | 누에구분
시도 | 1.000 | 0.000
누에구분 | 0.000 | 1.000

Variable | 연도 | 농가수(호) | 사육량(상자) | 생산 량(kg) | 시도 | 누에구분
연도 | 1.000 | 0.201 | 0.178 | 0.233 | 0.249 | 0.235
농가수(호) | 0.201 | 1.000 | 0.667 | 0.965 | 0.000 | 0.104
사육량(상자) | 0.178 | 0.667 | 1.000 | 0.612 | 0.000 | 0.076
생산 량(kg) | 0.233 | 0.965 | 0.612 | 1.000 | 0.000 | 0.059
시도 | 0.249 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
누에구분 | 0.235 | 0.104 | 0.076 | 0.059 | 0.000 | 1.000
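
The High correlation alerts name 농가수(호), 사육량(상자), and 생산 량(kg) as correlated with one another, which matches the largest off-diagonal entries of the last matrix in this section. The sketch below shows one way such pairs could be picked out; the 0.5 cut-off and the helper `high_correlation_pairs` are assumptions for illustration, not settings read from this report.

```python
# Upper-triangle coefficients copied from the last correlation matrix (numeric variables only).
COEFFICIENTS = {
    ("연도", "농가수(호)"): 0.201,
    ("연도", "사육량(상자)"): 0.178,
    ("연도", "생산 량(kg)"): 0.233,
    ("농가수(호)", "사육량(상자)"): 0.667,
    ("농가수(호)", "생산 량(kg)"): 0.965,
    ("사육량(상자)", "생산 량(kg)"): 0.612,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_correlation_pairs(coefficients, threshold):
    """Return variable pairs whose absolute coefficient reaches the threshold."""
    return [pair for pair, value in coefficients.items() if abs(value) >= threshold]


flagged = high_correlation_pairs(COEFFICIENTS, THRESHOLD)
for left, right in flagged:
    print(f"{left} <-> {right}: {COEFFICIENTS[(left, right)]}")
```

With this cut-off, each of the three production fields is paired with the other two, consistent with the wording "and 1 other field" in the alerts.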

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]
[Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]
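
Nullity correlation can be read as the ordinary Pearson correlation between the "is missing" indicators of two columns. The sketch below computes it with the standard library on small hypothetical columns (the values are made up for illustration and are not rows from this dataset); the function name `nullity_correlation` is likewise illustrative.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if value is None else 0.0 for value in col_a]
    b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing value.
households = [None, None, 1, 0, None, 3]
production = [None, None, 100.0, 0.0, None, 250.0]
boxes = [None, None, 5, None, None, 12]

same_pattern = nullity_correlation(households, production)
partial_overlap = nullity_correlation(households, boxes)
print(round(same_pattern, 3), round(partial_overlap, 3))
```

A value of 1 means two columns are always missing together; values between 0 and 1 mean their gaps overlap only partly.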

Sample

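First rows

Index | 연도 | 시도 | 시군 | 누에구분 | 농가수(호) | 사육량(상자) | 생산 량(kg)
0 | 2014 | 경상남도 | 함안 | 수번데기 | <NA> | <NA> | <NA>
1 | 2014 | 경상남도 | 창녕 | 수번데기 | <NA> | <NA> | <NA>
2 | 2014 | 경상남도 | 고성 | 수번데기 | <NA> | <NA> | <NA>
3 | 2014 | 경상남도 | 남해 | 수번데기 | <NA> | <NA> | <NA>
4 | 2014 | 경상남도 | 하동 | 수번데기 | <NA> | <NA> | <NA>
5 | 2014 | 경상남도 | 산청 | 수번데기 | <NA> | <NA> | <NA>
6 | 2014 | 경상남도 | 함양 | 수번데기 | <NA> | <NA> | <NA>
7 | 2014 | 경상남도 | 거창 | 수번데기 | 15100.0 [three values fused in the source]
8 | 2014 | 경상남도 | 합천 | 수번데기 | <NA> | <NA> | <NA>
9 | 2014 | 제주광역시 | 제주시 | 잠분 | <NA> | <NA> | <NA>

Last rows

Index | 연도 | 시도 | 시군 | 누에구분 | 농가수(호) | 사육량(상자) | 생산 량(kg)
2886 | 2008 | 강원도 | <NA> | 잠분 | 0 | 0 | 0.0
2887 | 2008 | 충청북도 | <NA> | 잠분 | 0 | 0 | 0.0
2888 | 2008 | 충청남도 | <NA> | 잠분 | 0 | 0 | 0.0
2889 | 2008 | 전라북도 | <NA> | 잠분 | 0 | 0 | 0.0
2890 | 2008 | 전라남도 | <NA> | 잠분 | 0 | 0 | 0.0
2891 | 2008 | 경상북도 | <NA> | 잠분 | 0 | 0 | 0.0
2892 | 2008 | 경상남도 | <NA> | 잠분 | 0 | 0 | 0.0
2893 | 2008 | 제주특별자치도 | <NA> | 잠분 | 0 | 0 | 0.0
2894 | 2008 | 세종특별자치시 | <NA> | 잠분 | 0 | 0 | 0.0
2895 | 2008 | 서울특별시 | <NA> | 뽕잎 | 0 | 0 | 0.0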

Duplicate rows

Most frequently occurring

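Index | 연도 | 시도 | 시군 | 누에구분 | 농가수(호) | 사육량(상자) | 생산 량(kg) | # duplicates
0 | 2014 | 강원도 | 고성 | 잠분 | <NA> | <NA> | <NA> | 4
1 | 2014 | 강원도 | 삼척 | 잠분 | <NA> | <NA> | <NA> | 4
9 | 2014 | 강원도 | 홍천 | 잠분 | <NA> | <NA> | <NA> | 4
22 | 2014 | 경상남도 | 거제 | 잠분 | <NA> | <NA> | <NA> | 4
26 | 2014 | 경상남도 | 남해 | 잠분 | <NA> | <NA> | <NA> | 4
27 | 2014 | 경상남도 | 사천 | 잠분 | <NA> | <NA> | <NA> | 4
28 | 2014 | 경상남도 | 양산 | 잠분 | <NA> | <NA> | <NA> | 4
32 | 2014 | 경상남도 | 창원 | 잠분 | <NA> | <NA> | <NA> | 4
35 | 2014 | 경상남도 | 함안 | 잠분 | <NA> | <NA> | <NA> | 4
37 | 2014 | 경상남도 | 합천 | 잠분 | <NA> | <NA> | <NA> | 4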
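
A table like this can be produced by grouping identical rows and counting each group. The sketch below does so with the standard library on a few hypothetical rows shaped like this dataset (the rows are illustrative, and `None` stands in for `<NA>`). It reports both the number of duplicated groups and the number of surplus rows, since the report does not say which of the two the "Duplicate rows" figure counts.

```python
from collections import Counter

# Hypothetical rows: (연도, 시도, 시군, 누에구분, 농가수(호), 사육량(상자), 생산 량(kg)).
rows = [
    (2014, "강원도", "고성", "잠분", None, None, None),
    (2014, "강원도", "고성", "잠분", None, None, None),
    (2014, "강원도", "고성", "잠분", None, None, None),
    (2014, "강원도", "고성", "잠분", None, None, None),
    (2014, "강원도", "삼척", "잠분", None, None, None),
    (2014, "강원도", "삼척", "잠분", None, None, None),
    (2014, "강원도", "홍천", "오디", 2, 0, 300.0),
]

# Count how often each distinct row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated_groups = [(row, count) for row, count in occurrences.most_common() if count > 1]

# Rows that would be dropped if one copy of each group were kept.
surplus_rows = sum(count - 1 for _, count in duplicated_groups)

for row, count in duplicated_groups:
    print(row[:4], "# duplicates:", count)
print(len(duplicated_groups), "duplicated groups,", surplus_rows, "surplus rows")
```

In this dataset the most frequent duplicates are all 2014 잠분 rows with every measure missing, so they carry no information beyond the region name.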