Overview

Dataset statistics

Number of variables: 7
Number of observations: 121
Missing cells: 30
Missing cells (%): 3.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 58.1 B
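
The percentage and per-record figures follow from the counts above. A minimal sketch of the arithmetic, assuming missing cells are divided by the total cell count (variables × observations) and memory is divided by the number of rows; the variable names are illustrative only:

```python
# Counts copied from the Dataset statistics above.
n_variables = 7
n_observations = 121
missing_cells = 30
total_size_kib = 6.9  # already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. record size: {avg_record_bytes:.1f} B")
```

The record size computed this way lands slightly above the reported 58.1 B because the 6.9 KiB input is itself a rounded figure.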

Variable types

Categorical: 5
Text: 1
Numeric: 1

Dataset

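Description: Statistics on sericulture, including production by sericulture type, scale of sericulture operations, and sericulture farm households
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20181023000000001003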

Alerts

51~60세 is highly overall correlated with 40세미만 and 3 other fields (High correlation)
40세미만 is highly overall correlated with 51~60세 and 1 other field (High correlation)
41~50세 is highly overall correlated with 51~60세 and 1 other field (High correlation)
61~70세 is highly overall correlated with 51~60세 and 1 other field (High correlation)
71세이상 is highly overall correlated with 51~60세 and 1 other field (High correlation)
40세미만 is highly imbalanced (55.5%) (Imbalance)
51~60세 has 30 (24.8%) missing values (Missing)
시군 has unique values (Unique)
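
The 55.5% imbalance figure for 40세미만 can be reproduced from the category counts reported for that variable. The sketch below assumes the score is one minus the Shannon entropy of the category distribution divided by the maximum entropy for that number of categories. This normalisation matches the reported value here, but it is an inference from the numbers rather than a statement about the library's internals; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for category counts (0 = balanced, 1 = single class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# 40세미만: "<NA>" 87, blank 13, "2" 6, "1" 5, "3" 2, plus 8 values seen once each
# (13 distinct values, 121 rows in total).
counts_under_40 = [87, 13, 6, 5, 2] + [1] * 8
score = imbalance_score(counts_under_40)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, so a value above one half signals that a single category (here `<NA>`) dominates.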

Reproduction

Analysis started: 2023-12-11 03:39:22.907617
Analysis finished: 2023-12-11 03:39:23.797420
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
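
A report of this kind is typically generated with a few lines of Python. The following is a sketch only: the CSV path, output path, and report title are hypothetical, and the third-party imports (`pandas`, `ydata_profiling`) are kept inside the function so the snippet can be loaded without them installed.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report to html_path."""
    # Third-party imports are local so that defining the function has no dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(
        df,
        title="Sericulture farm households by age group",  # hypothetical title
        dataset={
            # Metadata shown in the "Dataset" block of the overview.
            "description": "Sericulture statistics (production, scale, farm households)",
            "author": "농림축산식품부",
            "url": "https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20181023000000001003",
        },
    )
    profile.to_file(html_path)
    return profile
```

A call such as `build_report("sericulture.csv", "report.html")` would then write the report; both file names are placeholders, not the names used for this run.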

Variables

시도 (province or metropolitan city)
Categorical

Distinct: 12
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
전라남도  18
경상북도  18
충청남도  15
경상남도  15
전라북도  14
Other values (7)  41

Length

Max length: 5
Median length: 4
Mean length: 3.8595041
Min length: 3

Unique

Unique: 2
Unique (%): 1.7%
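
"Distinct" and "Unique" measure different things: distinct counts every different value, while unique counts only the values that occur exactly once. A small sketch using the counts from the Common Values table for 시도; the two values behind "Other values (2)" are taken to be 대구광역시 and 세종자치시, which appear in the sample rows but not in the table (an inference from the report, not something it states).

```python
from collections import Counter

sido_counts = Counter({
    "전라남도": 18, "경상북도": 18, "충청남도": 15, "경상남도": 15,
    "전라북도": 14, "강원도": 12, "충청북도": 12, "경기도": 11,
    "제주광역시": 2, "광주광역시": 2,
    "대구광역시": 1, "세종자치시": 1,  # inferred singletons
})

n_rows = sum(sido_counts.values())
distinct = len(sido_counts)                                # different values
unique = sum(1 for c in sido_counts.values() if c == 1)    # values seen once

print(n_rows, distinct, unique)
print(f"distinct: {100 * distinct / n_rows:.1f}%, unique: {100 * unique / n_rows:.1f}%")
```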

Sample

1st row: 제주광역시
2nd row: 제주광역시
3rd row: 대구광역시
4th row: 광주광역시
5th row: 광주광역시

Common Values

Value  Count  Frequency (%)
전라남도  18  14.9%
경상북도  18  14.9%
충청남도  15  12.4%
경상남도  15  12.4%
전라북도  14  11.6%
강원도  12  9.9%
충청북도  12  9.9%
경기도  11  9.1%
제주광역시  2  1.7%
광주광역시  2  1.7%
Other values (2)  2  1.7%

Length

Histogram of lengths of the category

시군 (city or county)
Text

UNIQUE

Distinct: 121
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.553719
Min length: 2

Characters and Unicode

Total characters: 309
Distinct characters: 100
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
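
As a rough illustration of these character properties, Python's standard `unicodedata` module can report the general category of each code point. The sketch below runs it over the five sample values listed for 시군; using the first word of the character name as a stand-in for the script is a simplifying assumption, not how a profiler necessarily determines scripts.

```python
import unicodedata

samples = ["제주시", "서귀포시", "달성군", "서구", "광산구"]  # sample rows for 시군

# General category of every character ("Lo" means Letter, other).
categories = {unicodedata.category(ch) for name in samples for ch in name}
# First word of the official character name, used here as a crude script label.
scripts = {unicodedata.name(ch).split()[0] for name in samples for ch in name}

print(categories)
print(scripts)

# Mean length is total characters divided by the number of values.
total_characters, n_values = 309, 121
print(round(total_characters / n_values, 6))
```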

Unique

Unique: 121
Unique (%): 100.0%

Sample

1st row: 제주시
2nd row: 서귀포시
3rd row: 달성군
4th row: 서구
5th row: 광산구

Value  Count  Frequency (%)
제주시  1  0.8%
김제  1  0.8%
경주  1  0.8%
포항  1  0.8%
신안군  1  0.8%
장성군  1  0.8%
영광군  1  0.8%
함평군  1  0.8%
무안군  1  0.8%
영암군  1  0.8%
Other values (111)  111  91.7%

Most occurring characters

Value  Count  Frequency (%)
43  13.9%
23  7.4%
17  5.5%
16  5.2%
12  3.9%
11  3.6%
10  3.2%
8  2.6%
7  2.3%
6  1.9%
Other values (90)  156  50.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  309  100.0%

Most frequent character per category

Other Letter: same counts as under "Most occurring characters".

Most occurring scripts

Value  Count  Frequency (%)
Hangul  309  100.0%

Most frequent character per script

Hangul: same counts as under "Most occurring characters".

Most occurring blocks

Value  Count  Frequency (%)
Hangul  309  100.0%

Most frequent character per block

Hangul: same counts as under "Most occurring characters".

40세미만 (under 40)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 13
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
<NA>  87
(blank)  13
2  6
1  5
3  2
Other values (8)  8

Length

Max length: 4
Median length: 4
Mean length: 3.2975207
Min length: 1

Unique

Unique: 8
Unique (%): 6.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  87  71.9%
(blank)  13  10.7%
2  6  5.0%
1  5  4.1%
3  2  1.7%
8  1  0.8%
10  1  0.8%
6  1  0.8%
23  1  0.8%
14  1  0.8%
Other values (3)  3  2.5%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
na  87  80.6%
2  6  5.6%
1  5  4.6%
3  2  1.9%
8  1  0.9%
10  1  0.9%
6  1  0.9%
23  1  0.9%
14  1  0.9%
9  1  0.9%
Other values (2)  2  1.9%

41~50세 (ages 41 to 50)
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
<NA>  54
1  16
2  8
4  8
(blank)  5
Other values (17)  30

Length

Max length: 4
Median length: 3
Mean length: 2.5454545
Min length: 1

Unique

Unique: 10
Unique (%): 8.3%

Sample

1st row: 2
2nd row: <NA>
3rd row: <NA>
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
<NA>  54  44.6%
1  16  13.2%
2  8  6.6%
4  8  6.6%
(blank)  5  4.1%
3  5  4.1%
7  4  3.3%
11  3  2.5%
41  2  1.7%
10  2  1.7%
Other values (12)  14  11.6%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
na  54  46.6%
1  16  13.8%
2  8  6.9%
4  8  6.9%
3  5  4.3%
7  4  3.4%
11  3  2.6%
6  2  1.7%
12  2  1.7%
10  2  1.7%
Other values (11)  12  10.3%

51~60세 (ages 51 to 60)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 37
Distinct (%): 40.7%
Missing: 30
Missing (%): 24.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.373626
Minimum: 1
Maximum: 304
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 5
Q3: 20
95th percentile: 106.5
Maximum: 304
Range: 303
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 43.572328
Coefficient of variation (CV): 2.0386025
Kurtosis: 21.510629
Mean: 21.373626
Median Absolute Deviation (MAD): 4
Skewness: 4.1977425
Sum: 1945
Variance: 1898.5477
Monotonicity: Not monotonic
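
Several of the figures for 51~60세 are derived from one another, which makes them easy to cross-check. The sketch below recomputes them from the reported sum, standard deviation, and quartiles, assuming the statistics are taken over the non-missing rows only (121 rows minus 30 missing); variable names are illustrative.

```python
# Inputs copied from the 51~60세 statistics above.
n_total, n_missing = 121, 30
total, std = 1945, 43.572328
q1, q3 = 2, 20
minimum, maximum = 1, 304
n_distinct = 37

n_valid = n_total - n_missing             # rows that actually hold a number
mean = total / n_valid                    # Mean
cv = std / mean                           # Coefficient of variation
variance = std ** 2                       # Variance
iqr = q3 - q1                             # Interquartile range
value_range = maximum - minimum           # Range
missing_pct = 100 * n_missing / n_total   # Missing (%), over all rows
distinct_pct = 100 * n_distinct / n_valid # Distinct (%), over non-missing rows

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
print(f"missing={missing_pct:.1f}% distinct={distinct_pct:.1f}%")
```

Note the two different denominators: Missing (%) is relative to all 121 rows, whereas Distinct (%) only matches the reported 40.7% when divided by the 91 non-missing rows.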
Histogram with fixed size bins (bins=37)

Value  Count  Frequency (%)
1  16  13.2%
2  10  8.3%
3  9  7.4%
4  6  5.0%
5  6  5.0%
11  4  3.3%
6  3  2.5%
13  3  2.5%
7  3  2.5%
20  2  1.7%
Other values (27)  29  24.0%
(Missing)  30  24.8%

Value  Count  Frequency (%)
1  16  13.2%
2  10  8.3%
3  9  7.4%
4  6  5.0%
5  6  5.0%
6  3  2.5%
7  3  2.5%
9  1  0.8%
10  1  0.8%
11  4  3.3%

Value  Count  Frequency (%)
304  1  0.8%
181  1  0.8%
135  1  0.8%
132  1  0.8%
115  1  0.8%
98  1  0.8%
55  2  1.7%
53  1  0.8%
52  1  0.8%
50  1  0.8%

61~70세 (ages 61 to 70)
Categorical

HIGH CORRELATION

Distinct: 40
Distinct (%): 33.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
<NA>  29
1  17
2  10
3  8
4  5
Other values (35)  52

Length

Max length: 4
Median length: 3
Mean length: 2.0661157
Min length: 1

Unique

Unique: 24
Unique (%): 19.8%

Sample

1st row: 5
2nd row: <NA>
3rd row: 10
4th row: <NA>
5th row: 18

Common Values

Value  Count  Frequency (%)
<NA>  29  24.0%
1  17  14.0%
2  10  8.3%
3  8  6.6%
4  5  4.1%
7  5  4.1%
5  3  2.5%
6  3  2.5%
23  3  2.5%
18  2  1.7%
Other values (30)  36  29.8%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
na  29  24.2%
1  17  14.2%
2  10  8.3%
3  8  6.7%
4  5  4.2%
7  5  4.2%
5  3  2.5%
6  3  2.5%
23  3  2.5%
8  2  1.7%
Other values (29)  35  29.2%

71세이상 (71 and over)
Categorical

HIGH CORRELATION

Distinct: 29
Distinct (%): 24.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
<NA>  52
1  16
2  7
5  5
3  5
Other values (24)  36

Length

Max length: 4
Median length: 3
Mean length: 2.4958678
Min length: 1

Unique

Unique: 17
Unique (%): 14.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 8

Common Values

Value  Count  Frequency (%)
<NA>  52  43.0%
1  16  13.2%
2  7  5.8%
5  5  4.1%
3  5  4.1%
13  4  3.3%
8  4  3.3%
4  3  2.5%
27  2  1.7%
6  2  1.7%
Other values (19)  21  17.4%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
na  52  43.3%
1  16  13.3%
2  7  5.8%
5  5  4.2%
3  5  4.2%
13  4  3.3%
8  4  3.3%
4  3  2.5%
9  2  1.7%
7  2  1.7%
Other values (18)  20  16.7%

Interactions


Correlations

Variable  시도  40세미만  41~50세  51~60세  61~70세  71세이상
시도  1.000  0.764  0.000  0.000  0.000  0.000
40세미만  0.764  1.000  0.967  0.943  0.872  0.788
41~50세  0.000  0.967  1.000  0.972  0.959  0.913
51~60세  0.000  0.943  0.972  1.000  0.996  0.932
61~70세  0.000  0.872  0.959  0.996  1.000  0.965
71세이상  0.000  0.788  0.913  0.932  0.965  1.000
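
To read a matrix like the first one above programmatically, it can help to list the variable pairs whose coefficient exceeds some cutoff. This is a sketch only: the 0.9 threshold is an arbitrary choice for illustration and is not the rule the report uses to raise its High correlation alerts, and `strong_pairs` is a hypothetical helper.

```python
from itertools import combinations

names = ["시도", "40세미만", "41~50세", "51~60세", "61~70세", "71세이상"]
matrix = [
    [1.000, 0.764, 0.000, 0.000, 0.000, 0.000],
    [0.764, 1.000, 0.967, 0.943, 0.872, 0.788],
    [0.000, 0.967, 1.000, 0.972, 0.959, 0.913],
    [0.000, 0.943, 0.972, 1.000, 0.996, 0.932],
    [0.000, 0.872, 0.959, 0.996, 1.000, 0.965],
    [0.000, 0.788, 0.913, 0.932, 0.965, 1.000],
]

def strong_pairs(names, matrix, threshold):
    """Return (name_a, name_b, value) for off-diagonal cells >= threshold, strongest first."""
    pairs = [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if matrix[i][j] >= threshold
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)

pairs = strong_pairs(names, matrix, threshold=0.9)  # 0.9 is an assumed cutoff
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Only the upper triangle is scanned, so each pair is reported once; this relies on the matrix being symmetric, which holds for the values shown.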
Variable  41~50세  시도  40세미만  61~70세  71세이상
41~50세  1.000  0.000  0.649  0.482  0.449
시도  0.000  1.000  0.450  0.000  0.000
40세미만  0.649  0.450  1.000  0.270  0.257
61~70세  0.482  0.000  0.270  1.000  0.513
71세이상  0.449  0.000  0.257  0.513  1.000
Variable  51~60세  시도  40세미만  41~50세  61~70세  71세이상
51~60세  1.000  0.000  0.754  0.747  0.694  0.578
시도  0.000  1.000  0.450  0.000  0.000  0.000
40세미만  0.754  0.450  1.000  0.649  0.270  0.257
41~50세  0.747  0.000  0.649  1.000  0.482  0.449
61~70세  0.694  0.000  0.270  0.482  1.000  0.513
71세이상  0.578  0.000  0.257  0.449  0.513  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
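
The report counts missing cells only in 51~60세, yet the four categorical age columns list `<NA>` (and in two cases a blank value) among their most common values with Missing: 0. One reading is that those placeholders are stored as ordinary text. The sketch below estimates how many cells would count as missing if `<NA>` and blank strings were treated as nulls; that treatment is an assumption, and the counts are limited to what the value tables show, so they are lower bounds.

```python
# Assumed placeholder strings for "no data"; blank is assumed to be empty or whitespace.
PLACEHOLDERS = {"<NA>", ""}

# Placeholder counts as listed in each variable's value table.
reported = {
    "40세미만": {"<NA>": 87, "": 13},
    "41~50세": {"<NA>": 54, "": 5},
    "61~70세": {"<NA>": 29},
    "71세이상": {"<NA>": 52},
}
n_rows = 121

def effective_missing(value_counts, n_rows):
    """Return (count, percent) of rows whose value is a placeholder string."""
    count = sum(c for value, c in value_counts.items() if value.strip() in PLACEHOLDERS)
    return count, 100 * count / n_rows

for column, value_counts in reported.items():
    count, pct = effective_missing(value_counts, n_rows)
    print(f"{column}: {count} of {n_rows} ({pct:.1f}%)")
```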

Sample

Index  시도  시군  40세미만  41~50세  51~60세  61~70세  71세이상
0  제주광역시  제주시  <NA>  2  3  5  <NA>
1  제주광역시  서귀포시  <NA>  <NA>  1  <NA>  <NA>
2  대구광역시  달성군  <NA>  <NA>  4  10  <NA>
3  광주광역시  서구  <NA>  1  1  <NA>  <NA>
4  광주광역시  광산구  <NA>  2  20  18  8
5  세종자치시  세종시  <NA><NA>21<NA>
6  경기도  양평  <NA>191156313
7  경기도  화성  <NA>113<NA>
8  경기도  포천  <NA>22<NA>1
9  경기도  남양주  82223137
Index  시도  시군  40세미만  41~50세  51~60세  61~70세  71세이상
111  경상남도  거제시  <NA>  <NA>  <NA>  <NA>  <NA>
112  경상남도  의령군  <NA><NA>226
113  경상남도  함안군  <NA>  <NA>  <NA>  <NA>  <NA>
114  경상남도  창녕군  <NA>22<NA><NA>
115  경상남도  고성군  <NA>  <NA>  4  <NA>  <NA>
116  경상남도  하동군  <NA>430101
117  경상남도  산청군  514314215
118  경상남도  함양군  <NA>1219209
119  경상남도  거창군  <NA><NA>412<NA>
120  경상남도  합천군  <NA><NA>273