Overview

Dataset statistics

Number of variables: 8
Number of observations: 273
Missing cells: 2
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.7 KiB
Average record size in memory: 66.5 B
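
The derived figures above can be checked from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the record size is the total memory divided by the number of observations; the variable names are illustrative only.

```python
# Values copied from the dataset statistics above.
n_variables = 8
n_observations = 273
missing_cells = 2
total_size_kib = 17.7  # already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Because the 17.7 KiB total is itself rounded, the per-record figure computed this way lands within a fraction of a byte of the reported 66.5 B rather than matching it exactly.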

Variable types

Categorical: 6
Text: 1
Numeric: 1

Dataset

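Description: Distribution records of foundation seed for government-supplied seed, providing the production year (생산년도), branch office name (지원명), complex name (단지명), crop name (작물명), variety name (품종명), distribution date (배부일자), distribution quantity (배부량), place of origin (산지), and related information.
URL: https://www.data.go.kr/data/15066269/fileData.do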

Alerts

배부일자 has a high cardinality: 51 distinct values (High cardinality)
배부일자 is highly overall correlated with 생산년도 and 3 other fields (High correlation)
생산년도 is highly overall correlated with 배부량 and 5 other fields (High correlation)
지원명 is highly overall correlated with 생산년도 and 3 other fields (High correlation)
산지 is highly overall correlated with 생산년도 and 3 other fields (High correlation)
작물명 is highly overall correlated with 생산년도 and 2 other fields (High correlation)
품종명 is highly overall correlated with 생산년도 and 3 other fields (High correlation)
배부량 is highly overall correlated with 생산년도 (High correlation)
생산년도 is highly imbalanced (96.5%) (Imbalance)
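
The 96.5% imbalance figure for 생산년도 can be reproduced from its two value counts (272 and 1). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. The report does not state the formula, so this is an assumption that happens to match the reported number, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts for 생산년도 as listed in the report: "2022" x 272, "<NA>" x 1.
score = imbalance_score([272, 1])
print(f"{score:.1%}")
```

A score near 100% means almost all rows share one value, which is why the variable carries little information for the correlation alerts that mention it.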

Reproduction

Analysis started: 2023-12-12 05:22:05.405241
Analysis finished: 2023-12-12 05:22:06.770569
Duration: 1.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
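
The duration follows from the two timestamps. A minimal sketch using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2023-12-12 05:22:05.405241", FMT)
finished = datetime.strptime("2023-12-12 05:22:06.770569", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} s")
```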

Variables

생산년도
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
2022 | 272
<NA> | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 272 | 99.6%
<NA> | 1 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

지원명
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
전북지원 | 59
전남지원 | 47
충남지원 | 43
경북지원 | 33
경남지원 | 28
Other values (4) | 63

Length

Max length: 7
Median length: 4
Mean length: 4.2747253
Min length: 4
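
The mean length can be recovered from the value counts listed in this variable's Common Values table. The sketch below assumes that lengths are counted in Unicode code points and that the `<NA>` placeholder is counted as a four-character string; both assumptions are consistent with the reported minimum of 4 and maximum of 7.

```python
# Counts copied from the Common Values table for 지원명.
value_counts = {
    "전북지원": 59, "전남지원": 47, "충남지원": 43, "경북지원": 33, "경남지원": 28,
    "경기종자관리소": 25, "충북지원": 19, "강원지원": 18, "<NA>": 1,
}

n = sum(value_counts.values())
# Weighted mean of string lengths; len() counts code points in Python 3.
mean_length = sum(len(value) * count for value, count in value_counts.items()) / n
# Frequency of each value as a percentage of all rows.
share = {value: 100 * count / n for value, count in value_counts.items()}

print(n, round(mean_length, 7), round(share["전북지원"], 1))
```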

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 경기종자관리소
2nd row: 경기종자관리소
3rd row: 경기종자관리소
4th row: 경기종자관리소
5th row: 경기종자관리소

Common Values

Value | Count | Frequency (%)
전북지원 | 59 | 21.6%
전남지원 | 47 | 17.2%
충남지원 | 43 | 15.8%
경북지원 | 33 | 12.1%
경남지원 | 28 | 10.3%
경기종자관리소 | 25 | 9.2%
충북지원 | 19 | 7.0%
강원지원 | 18 | 6.6%
<NA> | 1 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

단지명
Text

Distinct: 215
Distinct (%): 79.0%
Missing: 1
Missing (%): 0.4%
Memory size: 2.3 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.6286765
Min length: 2

Characters and Unicode

Total characters: 715
Distinct characters: 165
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 172
Unique (%): 63.2%
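
The percentages and the mean length for this text variable appear to be computed over the 272 non-missing rows rather than all 273 observations. The sketch below checks that reading against the reported numbers; it is an inference from the figures, not something the report states.

```python
n_observations = 273
missing = 1
non_missing = n_observations - missing

# Figures copied from the statistics above.
distinct, unique, total_chars = 215, 172, 715

distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
mean_length = total_chars / non_missing

print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique, mean length {mean_length:.7f}")
```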

Sample

1st row: 경기도종자관리소(
2nd row: 고문단지
3rd row: 구창단지
4th row: 당거단지
5th row: 동고단지

Value | Count | Frequency (%)
대죽 | 4 | 1.4%
공음 | 4 | 1.4%
남산 | 4 | 1.4%
해창 | 4 | 1.4%
삭선 | 3 | 1.1%
황룡위탁영농 | 3 | 1.1%
고수 | 3 | 1.1%
관촌 | 3 | 1.1%
왕태 | 3 | 1.1%
계화 | 3 | 1.1%
Other values (215) | 250 | 88.0%

Most occurring characters

ValueCountFrequency (%)
35
 
4.9%
30
 
4.2%
27
 
3.8%
24
 
3.4%
17
 
2.4%
17
 
2.4%
16
 
2.2%
16
 
2.2%
13
 
1.8%
12
 
1.7%
Other values (155) 508
71.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 676 | 94.5%
Space Separator | 24 | 3.4%
Decimal Number | 6 | 0.8%
Open Punctuation | 5 | 0.7%
Close Punctuation | 4 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
5.2%
30
 
4.4%
27
 
4.0%
17
 
2.5%
17
 
2.5%
16
 
2.4%
16
 
2.4%
13
 
1.9%
12
 
1.8%
12
 
1.8%
Other values (150) 481
71.2%
Decimal Number
Value | Count | Frequency (%)
2 | 3 | 50.0%
1 | 3 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 676 | 94.5%
Common | 39 | 5.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
5.2%
30
 
4.4%
27
 
4.0%
17
 
2.5%
17
 
2.5%
16
 
2.4%
16
 
2.4%
13
 
1.9%
12
 
1.8%
12
 
1.8%
Other values (150) 481
71.2%
Common
Value | Count | Frequency (%)
(space) | 24 | 61.5%
( | 5 | 12.8%
) | 4 | 10.3%
2 | 3 | 7.7%
1 | 3 | 7.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 676 | 94.5%
ASCII | 39 | 5.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
35
 
5.2%
30
 
4.4%
27
 
4.0%
17
 
2.5%
17
 
2.5%
16
 
2.4%
16
 
2.4%
13
 
1.9%
12
 
1.8%
12
 
1.8%
Other values (150) 481
71.2%
ASCII
Value | Count | Frequency (%)
(space) | 24 | 61.5%
( | 5 | 12.8%
) | 4 | 10.3%
2 | 3 | 7.7%
1 | 3 | 7.7%

작물명
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
163 
56 
29 
보리
18 
 
4
Other values (2)
 
3

Length

Max length: 4
Median length: 1
Mean length: 1.0842491
Min length: 1

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
163
59.7%
56
 
20.5%
29
 
10.6%
보리 18
 
6.6%
4
 
1.5%
호밀 2
 
0.7%
<NA> 1
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

품종명
Categorical

HIGH CORRELATION

Distinct: 47
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
대원콩 | 31
삼광벼 | 23
새청무 | 21
신동진벼 | 19
새금강밀 | 15
Other values (42) | 164

Length

Max length: 6
Median length: 3
Mean length: 3.3919414
Min length: 2

Unique

Unique: 17
Unique (%): 6.2%

Sample

1st row: 알찬미
2nd row: 삼광벼
3rd row: 고시히카리
4th row: 추청벼
5th row: 고시히카리

Common Values

Value | Count | Frequency (%)
대원콩 | 31 | 11.4%
삼광벼 | 23 | 8.4%
새청무 | 21 | 7.7%
신동진벼 | 19 | 7.0%
새금강밀 | 15 | 5.5%
참드림 | 12 | 4.4%
일품벼 | 12 | 4.4%
추청벼 | 10 | 3.7%
영호진미 | 9 | 3.3%
선풍콩 | 9 | 3.3%
Other values (37) | 112 | 41.0%

Length

Histogram of lengths of the category

배부일자
Categorical

HIGH CARDINALITY, HIGH CORRELATION

Distinct: 51
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
2022-04-12 | 30
2022-04-07 | 26
2022-04-15 | 21
2021-10-15 | 16
2022-04-14 | 15
Other values (46) | 165

Length

Max length: 10
Median length: 10
Mean length: 9.978022
Min length: 4

Unique

Unique: 20
Unique (%): 7.3%

Sample

1st row: 2022-03-31
2nd row: 2022-03-29
3rd row: 2022-03-25
4th row: 2022-03-25
5th row: 2022-03-31

Common Values

Value | Count | Frequency (%)
2022-04-12 | 30 | 11.0%
2022-04-07 | 26 | 9.5%
2022-04-15 | 21 | 7.7%
2021-10-15 | 16 | 5.9%
2022-04-14 | 15 | 5.5%
2022-06-10 | 13 | 4.8%
2022-03-29 | 11 | 4.0%
2022-04-04 | 10 | 3.7%
2022-04-01 | 10 | 3.7%
2022-03-25 | 10 | 3.7%
Other values (41) | 111 | 40.7%
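
배부일자 is profiled as a categorical string, which is what triggers the high-cardinality alert. If the values are needed as dates, they can be parsed before analysis. The following is a minimal sketch, assuming ISO `YYYY-MM-DD` strings and treating anything unparseable (such as a `<NA>` placeholder) as missing; `parse_date` is a hypothetical helper and the sample list holds only a few values copied from the table above.

```python
from datetime import date

def parse_date(raw):
    """Return a date for 'YYYY-MM-DD' strings, or None for anything else."""
    try:
        return date.fromisoformat(raw)
    except ValueError:
        return None

# A handful of values from the Common Values table, plus the placeholder.
samples = ["2022-04-12", "2021-10-15", "2022-06-10", "<NA>"]
parsed = [parse_date(s) for s in samples]
valid = [d for d in parsed if d is not None]

# Earliest and latest among these sampled values only, not the whole column.
print(min(valid), max(valid))
```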

Length

Histogram of lengths of the category

배부량
Real number (ℝ)

HIGH CORRELATION

Distinct: 193
Distinct (%): 71.0%
Missing: 1
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1541.1305
Minimum: 15
Maximum: 7920
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 15
5-th percentile: 371
Q1: 825
Median: 1276.5
Q3: 1876.25
95-th percentile: 3782
Maximum: 7920
Range: 7905
Interquartile range (IQR): 1051.25
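
The range and IQR follow directly from the quantiles above. The sketch below recomputes them and, as an illustration only, adds the conventional 1.5 × IQR fences; the report itself does not apply that rule, so the fence values are an added assumption.

```python
# Quantiles copied from the table above.
minimum, q1, q3, maximum = 15.0, 825.0, 1876.25, 7920.0

value_range = maximum - minimum
iqr = q3 - q1

# Conventional 1.5 x IQR fences (not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, lower_fence, upper_fence)
```

The reported 95-th percentile (3782) lies above the upper fence computed here, which is consistent with the long right tail indicated by the skewness.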

Descriptive statistics

Standard deviation: 1179.5388
Coefficient of variation (CV): 0.76537245
Kurtosis: 7.3831368
Mean: 1541.1305
Median Absolute Deviation (MAD): 523.5
Skewness: 2.3069007
Sum: 419187.5
Variance: 1391311.9
Monotonicity: Not monotonic
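
Several of these statistics are tied together by simple identities, which makes a quick consistency check possible. The sketch below assumes the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation) and recovers the number of non-missing rows from the sum and the mean.

```python
import math

# Figures copied from the descriptive statistics above (rounded by the report).
mean = 1541.1305
std = 1179.5388
total = 419187.5
variance = 1391311.9

cv = std / mean                       # coefficient of variation
variance_from_std = std ** 2          # should agree with the reported variance
n_non_missing = round(total / mean)   # rows contributing to the sum

print(round(cv, 6), math.isclose(variance_from_std, variance, rel_tol=1e-6), n_non_missing)
```

The recovered count of 272 matches the 273 observations minus the one missing value reported for this variable.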
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
600.0 | 7 | 2.6%
1100.0 | 7 | 2.6%
1000.0 | 6 | 2.2%
960.0 | 5 | 1.8%
1250.0 | 4 | 1.5%
1800.0 | 4 | 1.5%
800.0 | 4 | 1.5%
1050.0 | 4 | 1.5%
1560.0 | 4 | 1.5%
1450.0 | 3 | 1.1%
Other values (183) | 224 | 82.1%

Minimum 10 values

Value | Count | Frequency (%)
15.0 | 1 | 0.4%
65.0 | 1 | 0.4%
150.0 | 1 | 0.4%
190.0 | 1 | 0.4%
196.0 | 1 | 0.4%
200.0 | 1 | 0.4%
242.0 | 1 | 0.4%
250.0 | 1 | 0.4%
267.0 | 1 | 0.4%
276.0 | 1 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
7920.0 | 1 | 0.4%
7680.0 | 1 | 0.4%
6240.0 | 1 | 0.4%
5940.0 | 1 | 0.4%
5544.0 | 1 | 0.4%
5400.0 | 2 | 0.7%
4960.0 | 1 | 0.4%
4500.0 | 1 | 0.4%
4410.0 | 1 | 0.4%
4400.0 | 1 | 0.4%

산지
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value | Count
전남 종자관리소 | 43
경북 농업자원관리원 | 39
전북 종자사업소 | 34
경남 농업자원관리원 | 33
경기 종자관리소(생산) | 25
Other values (11) | 99

Length

Max length: 13
Median length: 12
Mean length: 9.1355311
Min length: 4

Unique

Unique: 4
Unique (%): 1.5%

Sample

1st row: 경기 종자관리소(생산)
2nd row: 경기 종자관리소(생산)
3rd row: 경기 종자관리소(생산)
4th row: 경기 종자관리소(생산)
5th row: 경기 종자관리소(생산)

Common Values

Value | Count | Frequency (%)
전남 종자관리소 | 43 | 15.8%
경북 농업자원관리원 | 39 | 14.3%
전북 종자사업소 | 34 | 12.5%
경남 농업자원관리원 | 33 | 12.1%
경기 종자관리소(생산) | 25 | 9.2%
충남 종자관리소 논산분소 | 22 | 8.1%
보급종 격상 | 19 | 7.0%
강원 농산물원종장 | 19 | 7.0%
충북 농산사업소 | 14 | 5.1%
충남 종자관리소 | 14 | 5.1%
Other values (6) | 11 | 4.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
종자관리소 | 79 | 14.1%
농업자원관리원 | 72 | 12.9%
전남 | 43 | 7.7%
경북 | 39 | 7.0%
충남 | 36 | 6.4%
전북 | 34 | 6.1%
종자사업소 | 34 | 6.1%
경남 | 33 | 5.9%
경기 | 27 | 4.8%
종자관리소(생산 | 25 | 4.5%
Other values (13) | 137 | 24.5%

Interactions


Correlations

Variable | 지원명 | 작물명 | 품종명 | 배부일자 | 배부량 | 산지
지원명 | 1.000 | 0.228 | 0.914 | 0.969 | 0.184 | 0.971
작물명 | 0.228 | 1.000 | 1.000 | 0.966 | 0.597 | 0.739
품종명 | 0.914 | 1.000 | 1.000 | 0.947 | 0.629 | 0.928
배부일자 | 0.969 | 0.966 | 0.947 | 1.000 | 0.716 | 0.955
배부량 | 0.184 | 0.597 | 0.629 | 0.716 | 1.000 | 0.282
산지 | 0.971 | 0.739 | 0.928 | 0.955 | 0.282 | 1.000

Variable | 배부일자 | 생산년도 | 지원명 | 산지 | 작물명 | 품종명
배부일자 | 1.000 | 1.000 | 0.746 | 0.619 | 0.751 | 0.437
생산년도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지원명 | 0.746 | 1.000 | 1.000 | 0.868 | 0.127 | 0.603
산지 | 0.619 | 1.000 | 0.868 | 1.000 | 0.450 | 0.542
작물명 | 0.751 | 1.000 | 0.127 | 0.450 | 1.000 | 0.922
품종명 | 0.437 | 1.000 | 0.603 | 0.542 | 0.922 | 1.000

Variable | 배부량 | 생산년도 | 지원명 | 작물명 | 품종명 | 배부일자 | 산지
배부량 | 1.000 | 1.000 | 0.090 | 0.344 | 0.259 | 0.320 | 0.116
생산년도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지원명 | 0.090 | 1.000 | 1.000 | 0.127 | 0.603 | 0.746 | 0.868
작물명 | 0.344 | 1.000 | 0.127 | 1.000 | 0.922 | 0.751 | 0.450
품종명 | 0.259 | 1.000 | 0.603 | 0.922 | 1.000 | 0.437 | 0.542
배부일자 | 0.320 | 1.000 | 0.746 | 0.751 | 0.437 | 1.000 | 0.619
산지 | 0.116 | 1.000 | 0.868 | 0.450 | 0.542 | 0.619 | 1.000
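
To read a matrix like the first one in this section programmatically, one option is to list the variable pairs whose coefficient meets a chosen cutoff. The sketch below uses a hypothetical cutoff of 0.9 purely for illustration. The report does not say which correlation measure each matrix uses or which threshold drives its alerts, so this does not attempt to reproduce the alert list.

```python
from itertools import combinations

# First correlation matrix of this section, copied row by row.
labels = ["지원명", "작물명", "품종명", "배부일자", "배부량", "산지"]
matrix = [
    [1.000, 0.228, 0.914, 0.969, 0.184, 0.971],
    [0.228, 1.000, 1.000, 0.966, 0.597, 0.739],
    [0.914, 1.000, 1.000, 0.947, 0.629, 0.928],
    [0.969, 0.966, 0.947, 1.000, 0.716, 0.955],
    [0.184, 0.597, 0.629, 0.716, 1.000, 0.282],
    [0.971, 0.739, 0.928, 0.955, 0.282, 1.000],
]
THRESHOLD = 0.9  # hypothetical cutoff, not taken from the report

# Sanity check: the matrix should be symmetric.
is_symmetric = all(
    matrix[i][j] == matrix[j][i] for i, j in combinations(range(len(labels)), 2)
)

# Keep each unordered pair once (upper triangle only).
strong_pairs = [
    (labels[i], labels[j], matrix[i][j])
    for i, j in combinations(range(len(labels)), 2)
    if matrix[i][j] >= THRESHOLD
]

for a, b, value in strong_pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```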

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

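First rows

Index | 생산년도 | 지원명 | 단지명 | 작물명 | 품종명 | 배부일자 | 배부량 | 산지
0 | 2022 | 경기종자관리소 | 경기도종자관리소( | | 알찬미 | 2022-03-31 | 200.0 | 경기 종자관리소(생산)
1 | 2022 | 경기종자관리소 | 고문단지 | | 삼광벼 | 2022-03-29 | 1089.0 | 경기 종자관리소(생산)
2 | 2022 | 경기종자관리소 | 구창단지 | | 고시히카리 | 2022-03-25 | 1573.0 | 경기 종자관리소(생산)
3 | 2022 | 경기종자관리소 | 당거단지 | | 추청벼 | 2022-03-25 | 1467.0 | 경기 종자관리소(생산)
4 | 2022 | 경기종자관리소 | 동고단지 | | 고시히카리 | 2022-03-31 | 1627.0 | 경기 종자관리소(생산)
5 | 2022 | 경기종자관리소 | 방축단지 | | 참드림 | 2022-03-25 | 974.0 | 보급종 격상
6 | 2022 | 경기종자관리소 | 방축단지 | | 참드림 | 2022-03-25 | 910.0 | 경기 종자관리소(생산)
7 | 2022 | 경기종자관리소 | 석봉단지 | | 삼광벼 | 2022-03-31 | 1563.0 | 경기 종자관리소(생산)
8 | 2022 | 경기종자관리소 | 송대단지 | | 참드림 | 2022-03-24 | 1732.0 | 경기 종자관리소(생산)
9 | 2022 | 경기종자관리소 | 숙성단지 | | 참드림 | 2022-03-25 | 1122.0 | 경기 종자관리소(생산)

Last rows

Index | 생산년도 | 지원명 | 단지명 | 작물명 | 품종명 | 배부일자 | 배부량 | 산지
263 | 2022 | 강원지원 | 잠곡 | | 대원콩 | 2022-05-26 | 1140.0 | 강원 농산물원종장
264 | 2022 | 강원지원 | 좌운 | | 대원콩 | 2022-05-23 | 1140.0 | 강원 농산물원종장
265 | 2022 | 강원지원 | 좌운 | | 삼광벼 | 2022-04-06 | 1760.0 | 강원 농산물원종장
266 | 2022 | 강원지원 | 주천 | | 청아콩 | 2022-06-13 | 960.0 | 강원 농산물원종장
267 | 2022 | 강원지원 | 철원 | | 오대벼 | 2022-03-08 | 1787.0 | 강원 농산물원종장
268 | 2022 | 강원지원 | 풍암 | | 오대벼 | 2022-03-29 | 550.0 | 강원 농산물원종장
269 | 2022 | 강원지원 | 학수 | | 운광벼 | 2022-03-22 | 1100.0 | 강원 농산물원종장
270 | 2022 | 강원지원 | 화지 | | 오대벼 | 2022-03-08 | 1760.0 | 강원 농산물원종장
271 | 2022 | 강원지원 | 후동 | | 삼광벼 | 2022-04-05 | 1760.0 | 강원 농산물원종장
272 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>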
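
The final row (index 272) is empty in every column, which accounts for the single `<NA>` entry counted under most variables. A row like that can be dropped when loading the file. The sketch below uses only the standard library and a hypothetical three-column excerpt held in memory; how the source CSV actually encodes the empty row is not shown in the report, so both empty fields and a literal `<NA>` token are treated as missing here.

```python
import csv
import io

# Hypothetical excerpt: three of the eight columns, two real rows from the
# sample above, and a trailing row assumed to be stored as empty fields.
raw = io.StringIO(
    "생산년도,지원명,배부량\n"
    "2022,강원지원,1760.0\n"
    "2022,강원지원,1760.0\n"
    ",,\n"
)

NA_TOKENS = {"", "<NA>"}  # assumed markers for a missing value

rows = list(csv.DictReader(raw))
# Keep a row unless every one of its fields is a missing-value marker.
kept = [
    row for row in rows
    if not all((value or "").strip() in NA_TOKENS for value in row.values())
]

print(len(rows), len(kept))
```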