Overview

Dataset statistics

Number of variables: 7
Number of observations: 112
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 2.7%
Total size in memory: 6.6 KiB
Average record size in memory: 60.2 B

Variable types

Categorical: 5
Numeric: 2

Dataset

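Description: Data on the status of reserve seed for government-supplied seed varieties from the Korea Seed and Variety Service (국립종자원). It provides fields such as 년산 (production year), 지원명 (branch office name), 작물명 (crop name), 품종명 (variety name), 원종구분 (seed class), 예비종자량 (reserve seed quantity), and 사용량 (quantity used).
URL: https://www.data.go.kr/data/15066256/fileData.do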

Alerts

원종구분 has constant value "보급종" (Constant)
Dataset has 3 (2.7%) duplicate rows (Duplicates)
예비종자량 is highly overall correlated with 사용량 (High correlation)
사용량 is highly overall correlated with 예비종자량 (High correlation)
지원명 is highly overall correlated with 품종명 (High correlation)
작물명 is highly overall correlated with 품종명 (High correlation)
품종명 is highly overall correlated with 지원명 and 1 other field (High correlation)
사용량 has 2 (1.8%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 13:05:08.389079
Analysis finished: 2023-12-12 13:05:09.254416
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
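
The reported duration follows from the two timestamps. A minimal sketch, using only the Python standard library and the values copied from this block, recomputes it; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
started = datetime.fromisoformat("2023-12-12 13:05:08.389079")
finished = datetime.fromisoformat("2023-12-12 13:05:09.254416")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration figure in the report.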

Variables

년산
Categorical

Distinct: 3
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

2020: 55
2021: 42
2022: 15

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   55     49.1%
2021   42     37.5%
2022   15     13.4%
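
The Frequency (%) column is each count divided by the number of observations. The sketch below rebuilds the table from the counts listed for 년산; the helper name `frequency_table` is hypothetical and is not part of ydata-profiling.

```python
from collections import Counter

# Counts of each 년산 value as listed in the report.
year_counts = Counter({"2020": 55, "2021": 42, "2022": 15})
total = sum(year_counts.values())  # number of observations


def frequency_table(counts, total):
    """Return (value, count, percent) rows, most common value first."""
    return [
        (value, n, round(100 * n / total, 1))
        for value, n in counts.most_common()
    ]


rows = frequency_table(year_counts, total)
for value, n, pct in rows:
    print(f"{value}  {n}  {pct}%")
```

The same division by 112 applies to the Common Values tables of the other variables.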

Length

Histogram of lengths of the category

지원명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

전남지원: 29
경남지원: 27
전북지원: 20
강원지원: 12
경북지원: 10
Other values (3): 14

Length

Max length: 7
Median length: 4
Mean length: 4.0535714
Min length: 4
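
The length statistics can be recomputed from the per-value row counts that the report lists for 지원명. This is a sketch with the standard library only; it assumes the length of a value is its number of characters, which is what Python's `len` returns for a string.

```python
from statistics import mean, median

# Row counts per 지원명 value, as listed for this variable in the report.
branch_counts = {
    "전남지원": 29, "경남지원": 27, "전북지원": 20, "강원지원": 12,
    "경북지원": 10, "충북지원": 6, "충남지원": 6, "경기종자관리소": 2,
}

# Expand to one string length per row of the dataset.
lengths = [len(name) for name, n in branch_counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(length_stats)
```

Only the two 경기종자관리소 rows are longer than four characters, which is why the mean sits just above 4.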

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충북지원
2nd row: 충북지원
3rd row: 충남지원
4th row: 충남지원
5th row: 충남지원

Common Values

Value           Count  Frequency (%)
전남지원        29     25.9%
경남지원        27     24.1%
전북지원        20     17.9%
강원지원        12     10.7%
경북지원        10     8.9%
충북지원        6      5.4%
충남지원        6      5.4%
경기종자관리소  2      1.8%

Length

Histogram of lengths of the category

작물명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

74
보리: 15
11
10
호밀: 2

Length

Max length: 2
Median length: 1
Mean length: 1.1517857
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value  Count  Frequency (%)
       74     66.1%
보리   15     13.4%
       11     9.8%
       10     8.9%
호밀   2      1.8%

Length

Histogram of lengths of the category

품종명
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

오대벼: 7
새청무: 6
추청벼: 6
삼광벼: 6
흰찰쌀보리: 5
Other values (36): 82

Length

Max length: 6
Median length: 3
Mean length: 3.4910714
Min length: 2

Unique

Unique: 14
Unique (%): 12.5%

Sample

1st row: 추청벼
2nd row: 진수미
3rd row: 삼광벼
4th row: 새누리벼
5th row: 미품벼

Common Values

Value              Count  Frequency (%)
오대벼             7      6.2%
새청무             6      5.4%
추청벼             6      5.4%
삼광벼             6      5.4%
흰찰쌀보리         5      4.5%
신동진벼           5      4.5%
태광콩             5      4.5%
해담쌀             5      4.5%
새일미벼           5      4.5%
해품벼             5      4.5%
Other values (31)  57     50.9%

Length

Histogram of lengths of the category

원종구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

보급종: 112

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보급종
2nd row: 보급종
3rd row: 보급종
4th row: 보급종
5th row: 보급종

Common Values

Value   Count  Frequency (%)
보급종  112    100.0%

Length

Histogram of lengths of the category

예비종자량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 27.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 376.5625
Minimum: 10
Maximum: 2500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 10
5-th percentile: 48.25
Q1: 100
median: 140
Q3: 405
95-th percentile: 1890
Maximum: 2500
Range: 2490
Interquartile range (IQR): 305

Descriptive statistics

Standard deviation: 524.72645
Coefficient of variation (CV): 1.3934644
Kurtosis: 6.6593722
Mean: 376.5625
Median Absolute Deviation (MAD): 60
Skewness: 2.6255533
Sum: 42175
Variance: 275337.85
Monotonicity: Not monotonic
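
Several of these figures are tied together by simple identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, and IQR = Q3 - Q1. The sketch below checks the reported 예비종자량 values against those identities. The dictionary names and the tolerance are illustrative choices, not anything taken from the report.

```python
import math

# Figures reported for 예비종자량, copied from the report.
n_obs = 112
reported = {
    "sum": 42175, "mean": 376.5625, "std": 524.72645,
    "variance": 275337.85, "cv": 1.3934644,
    "min": 10, "max": 2500, "q1": 100, "q3": 405,
    "range": 2490, "iqr": 305,
}

# Recompute each derived statistic from the more basic ones.
derived = {
    "mean": reported["sum"] / n_obs,
    "variance": reported["std"] ** 2,
    "cv": reported["std"] / reported["mean"],
    "range": reported["max"] - reported["min"],
    "iqr": reported["q3"] - reported["q1"],
}

# The reported values are rounded, so compare with a small relative tolerance.
for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]} match={ok}")
```

The same identities can be applied to the 사용량 figures by swapping in that variable's values.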
Histogram with fixed size bins (bins=31)

Common Values

Value              Count  Frequency (%)
100                35     31.2%
200                9      8.0%
300                7      6.2%
120                6      5.4%
40                 5      4.5%
500                5      4.5%
600                5      4.5%
80                 5      4.5%
1000               4      3.6%
2060               3      2.7%
Other values (21)  28     25.0%

Minimum 10 values

Value  Count  Frequency (%)
10     1      0.9%
40     5      4.5%
55     1      0.9%
70     1      0.9%
80     5      4.5%
100    35     31.2%
120    6      5.4%
140    3      2.7%
160    1      0.9%
180    1      0.9%

Maximum 10 values

Value  Count  Frequency (%)
2500   2      1.8%
2060   3      2.7%
2000   1      0.9%
1800   1      0.9%
1200   1      0.9%
1160   1      0.9%
1000   4      3.6%
660    1      0.9%
620    1      0.9%
600    5      4.5%

사용량
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 42
Distinct (%): 37.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 231.9375
Minimum: 0
Maximum: 2400
Zeros: 2
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 40
median: 100
Q3: 245
95-th percentile: 813
Maximum: 2400
Range: 2400
Interquartile range (IQR): 205

Descriptive statistics

Standard deviation: 398.33546
Coefficient of variation (CV): 1.7174259
Kurtosis: 14.688485
Mean: 231.9375
Median Absolute Deviation (MAD): 83.5
Skewness: 3.6886054
Sum: 25977
Variance: 158671.14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)

Common Values

Value              Count  Frequency (%)
100                16     14.3%
200                10     8.9%
80                 7      6.2%
40                 7      6.2%
20                 6      5.4%
2                  6      5.4%
300                5      4.5%
10                 4      3.6%
120                4      3.6%
140                4      3.6%
Other values (32)  43     38.4%

Minimum 10 values

Value  Count  Frequency (%)
0      2      1.8%
2      6      5.4%
5      2      1.8%
7      1      0.9%
10     4      3.6%
15     1      0.9%
18     1      0.9%
20     6      5.4%
35     1      0.9%
38     1      0.9%

Maximum 10 values

Value  Count  Frequency (%)
2400   1      0.9%
2000   1      0.9%
1900   1      0.9%
1800   1      0.9%
1200   1      0.9%
1000   1      0.9%
660    1      0.9%
620    1      0.9%
560    1      0.9%
500    2      1.8%

Correlations

            년산   지원명  작물명  품종명  예비종자량  사용량
년산        1.000  0.480   0.321   0.470   0.313       0.000
지원명      0.480  1.000   0.505   0.929   0.696       0.290
작물명      0.321  0.505   1.000   1.000   0.000       0.000
품종명      0.470  0.929   1.000   1.000   0.837       0.000
예비종자량  0.313  0.696   0.000   0.837   1.000       0.826
사용량      0.000  0.290   0.000   0.000   0.826       1.000

        년산   지원명  작물명  품종명
년산    1.000  0.339   0.253   0.207
지원명  0.339  1.000   0.335   0.568
작물명  0.253  0.335   1.000   0.815
품종명  0.207  0.568   0.815   1.000

            예비종자량  사용량  년산   지원명  작물명  품종명
예비종자량  1.000       0.602   0.203  0.297   0.000   0.417
사용량      0.602       1.000   0.000  0.157   0.000   0.000
년산        0.203       0.000   1.000  0.339   0.253   0.207
지원명      0.297       0.157   0.339  1.000   0.335   0.568
작물명      0.000       0.000   0.253  0.335   1.000   0.815
품종명      0.417       0.000   0.207  0.568   0.815   1.000
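
One way to read a correlation matrix like the first one in this section is to list the variable pairs whose coefficient reaches some cutoff. The sketch below does that with the standard library. The cutoff of 0.8 and the helper name `strong_pairs` are assumptions made for illustration; they are not the rule ydata-profiling used to raise the High correlation alerts, so the output need not coincide with the Alerts list.

```python
from itertools import combinations

# Variable order and values of the first (six-variable) matrix in this section.
names = ["년산", "지원명", "작물명", "품종명", "예비종자량", "사용량"]
matrix = [
    [1.000, 0.480, 0.321, 0.470, 0.313, 0.000],
    [0.480, 1.000, 0.505, 0.929, 0.696, 0.290],
    [0.321, 0.505, 1.000, 1.000, 0.000, 0.000],
    [0.470, 0.929, 1.000, 1.000, 0.837, 0.000],
    [0.313, 0.696, 0.000, 0.837, 1.000, 0.826],
    [0.000, 0.290, 0.000, 0.000, 0.826, 1.000],
]

THRESHOLD = 0.8  # hypothetical cutoff, chosen only for this example


def strong_pairs(names, matrix, threshold):
    """Return (name_a, name_b, value) for each distinct pair at or above threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if matrix[i][j] >= threshold:
            pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


pairs = strong_pairs(names, matrix, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} / {b}: {value:.3f}")
```

Because the matrix is symmetric, only the upper triangle is scanned, so each pair is reported once.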

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     년산  지원명    작물명  품종명      원종구분  예비종자량  사용량
0    2020  충북지원          추청벼      보급종    420         420
1    2020  충북지원          진수미      보급종    80          80
2    2020  충남지원          삼광벼      보급종    1800        1800
3    2020  충남지원          새누리벼    보급종    40          40
4    2020  충남지원          미품벼      보급종    140         140
5    2020  충남지원          새일미벼    보급종    300         300
6    2020  충남지원          친들벼      보급종    1200        1200
7    2020  전북지원          해담쌀      보급종    200         200
8    2020  전북지원          해품벼      보급종    200         200
9    2020  전북지원          동진찰벼    보급종    300         300

Last rows

     년산  지원명    작물명  품종명      원종구분  예비종자량  사용량
102  2022  전남지원          새청무      보급종    1000        40
103  2022  전남지원          새청무      보급종    1000        40
104  2022  전남지원          새청무      보급종    1000        240
105  2022  전남지원  보리    새쌀보리    보급종    100         100
106  2022  전남지원  보리    흰찰쌀보리  보급종    100         100
107  2022  전남지원          금강밀      보급종    100         100
108  2022  전남지원          새금강밀    보급종    400         400
109  2022  경남지원          조경밀      보급종    40          0
110  2022  경남지원          백강밀      보급종    300         0
111  2022  강원지원          오대벼      보급종    600         20

Duplicate rows

Most frequently occurring

   년산  지원명    작물명  품종명  원종구분  예비종자량  사용량  # duplicates
0  2020  강원지원          오대벼  보급종    600         60      2
1  2021  전남지원          태광콩  보급종    100         5       2
2  2022  전남지원          새청무  보급종    1000        40      2
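
A row counts as a duplicate when every column matches another row exactly. The sketch below shows one way to find such rows with `collections.Counter`, using four records taken from the end of the dataset. The 작물명 column is left out of the tuples as a simplification, and the helper name `duplicate_rows` is hypothetical.

```python
from collections import Counter

# Columns: 년산, 지원명, 품종명, 원종구분, 예비종자량, 사용량
# (작물명 is omitted in this sketch).
rows = [
    ("2022", "전남지원", "새청무", "보급종", 1000, 40),
    ("2022", "전남지원", "새청무", "보급종", 1000, 40),
    ("2022", "전남지원", "새청무", "보급종", 1000, 240),
    ("2022", "강원지원", "오대벼", "보급종", 600, 20),
]


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


duplicates = duplicate_rows(rows)
for row, n in duplicates.items():
    print(row, "occurs", n, "times")
```

The third 새청무 record differs only in 사용량 (240 instead of 40), so it is not reported as a duplicate.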