Overview

Dataset statistics

Number of variables: 10
Number of observations: 298
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.9 KiB
Average record size in memory: 85.4 B

Variable types

Categorical: 5
Numeric: 4
DateTime: 1

Dataset

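Description: National-guarantee field inspection application data from the Korea Seed and Variety Service (국립종자원). It provides the crop name (작물명), seed-production stage (채종단계), inspection round (차수), application area (신청면적), number of locations (소재지수), number of parcels (필지수), and number of varieties (품종수), among other fields.
URL: https://www.data.go.kr/data/15119579/fileData.do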

Alerts

데이터 추출일자 has constant value "" (Constant)
신청면적 is highly overall correlated with 필지수 (High correlation)
필지수 is highly overall correlated with 신청면적 and 품종수 (High correlation)
품종수 is highly overall correlated with 필지수 (High correlation)
작물명 is highly overall correlated with 차수 (High correlation)
차수 is highly overall correlated with 작물명 (High correlation)
차수 is highly imbalanced (71.2%) (Imbalance)
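
The 71.2% imbalance figure for 차수 is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below recomputes it from the counts given in this report (283 rows of 1차 and 15 rows of 2차). The function name `imbalance_score` is hypothetical, and the formula is an assumption about how the alert is derived, not a confirmed implementation detail of the profiler.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # assumption: treat a single class or empty input as "no imbalance"
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for 차수 copied from the report: 1차 = 283, 2차 = 15
score = imbalance_score([283, 15])
print(f"{score:.1%}")
```

A 95% / 5% split therefore scores well above the midpoint, because entropy falls quickly as one class starts to dominate.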

Reproduction

Analysis started: 2023-12-12 07:59:30.411209
Analysis finished: 2023-12-12 07:59:33.034583
Duration: 2.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
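
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used (that is stored in config.json): the function name `build_report`, the file paths, the `cp949` encoding, and the decision to read 년산 as text and parse 데이터 추출일자 as a date are all assumptions made for illustration.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(
        csv_path,
        encoding=encoding,                # assumption: common for Korean public data files
        dtype={"년산": str},              # assumption: keep the crop year categorical
        parse_dates=["데이터 추출일자"],  # assumption: let the extraction date be typed as a date
    )
    profile = ProfileReport(
        df,
        title="Field inspection applications",  # hypothetical title
        dataset={"url": "https://www.data.go.kr/data/15119579/fileData.do"},
    )
    profile.to_file(html_path)
    return profile
```

Calling `build_report("field_inspection.csv")` (a hypothetical file name) would write `report.html` next to the script.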

Variables

년산
Categorical

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count  Frequency (%)
2021   106    35.6%
2022   100    33.6%
2023   92     30.9%

Length

Histogram of lengths of the category

검사지원명
Categorical

Distinct: 10
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.0536913
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충북지원
2nd row: 충북지원
3rd row: 충북지원
4th row: 충북지원
5th row: 충북지원

Common Values

Value         Count  Frequency (%)
전북지원      43     14.4%
경북지원      38     12.8%
충남지원      36     12.1%
강원지원      36     12.1%
경남지원      35     11.7%
전남지원      34     11.4%
충북지원      29     9.7%
제주지원      27     9.1%
동부지원      12     4.0%
수도권현장팀  8      2.7%

Length

Histogram of lengths of the category

작물명
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 8
Median length: 1
Mean length: 2.1342282
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row: 겉보리
5th row: 겉보리

Common Values

Value             Count  Frequency (%)
                  52     17.4%
                  48     16.1%
겉보리            37     12.4%
                  34     11.4%
쌀보리            30     10.1%
                  24     8.1%
맥주보리          23     7.7%
봄감자            20     6.7%
호밀              10     3.4%
청보리(사료용)    7      2.3%
Other values (3)  13     4.4%

Length

Histogram of lengths of the category

채종단계
Categorical

Distinct: 4
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.7114094
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 원원종포
2nd row: 원종포
3rd row: 채종포1세대
4th row: 원원종포
5th row: 원종포

Common Values

Value        Count  Frequency (%)
원원종포     137    46.0%
원종포       136    45.6%
채종포1세대  16     5.4%
채종포2세대  9      3.0%

Length

Histogram of lengths of the category

차수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1차
2nd row: 1차
3rd row: 1차
4th row: 1차
5th row: 1차

Common Values

Value  Count  Frequency (%)
1차    283    95.0%
2차    15     5.0%

Length

Histogram of lengths of the category

신청면적
Real number (ℝ)

HIGH CORRELATION 

Distinct: 132
Distinct (%): 44.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 430.58993
Minimum: 1
Maximum: 4350
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 24.25
Median: 85.9
Q3: 570
95-th percentile: 1521.99
Maximum: 4350
Range: 4349
Interquartile range (IQR): 545.75

Descriptive statistics

Standard deviation: 750.74422
Coefficient of variation (CV): 1.7435248
Kurtosis: 10.931064
Mean: 430.58993
Median Absolute Deviation (MAD): 79.4
Skewness: 3.0519529
Sum: 128315.8
Variance: 563616.89
Monotonicity: Not monotonic
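
Several of the 신청면적 statistics above follow directly from one another. The sketch below rederives the mean, coefficient of variation, variance, interquartile range, and range from the other reported values, as a consistency check. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the 신청면적 statistics in this report
n = 298
total = 128315.8
std = 750.74422
q1, q3 = 24.25, 570.0
minimum, maximum = 1.0, 4350.0

mean = total / n                 # arithmetic mean = sum / number of observations
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = squared standard deviation
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min

print(round(mean, 5), round(cv, 4), round(variance, 2), iqr, value_range)
```

A CV above 1 together with a median (85.9) far below the mean points to a strongly right-skewed distribution, which agrees with the reported skewness of about 3.05.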
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
5.0                 17     5.7%
10.0                15     5.0%
50.0                12     4.0%
300.0               8      2.7%
70.0                8      2.7%
100.0               6      2.0%
33.0                6      2.0%
250.0               5      1.7%
60.0                5      1.7%
1000.0              5      1.7%
Other values (122)  211    70.8%

Minimum 10 values

Value  Count  Frequency (%)
1.0    1      0.3%
2.0    5      1.7%
3.0    2      0.7%
4.0    4      1.3%
5.0    17     5.7%
6.0    3      1.0%
7.0    2      0.7%
8.0    2      0.7%
9.0    1      0.3%
10.0   15     5.0%

Maximum 10 values

Value   Count  Frequency (%)
4350.0  2      0.7%
4163.0  2      0.7%
3760.0  2      0.7%
2915.4  1      0.3%
2908.0  1      0.3%
2807.3  1      0.3%
2506.5  1      0.3%
2366.5  1      0.3%
2217.0  1      0.3%
2051.0  1      0.3%

소재지수
Real number (ℝ)

Distinct: 10
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4765101
Minimum: 1
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 19
Range: 18
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.9680406
Coefficient of variation (CV): 1.3329002
Kurtosis: 47.854413
Mean: 1.4765101
Median Absolute Deviation (MAD): 0
Skewness: 6.6870771
Sum: 440
Variance: 3.8731837
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common Values

Value  Count  Frequency (%)
1      245    82.2%
2      35     11.7%
3      8      2.7%
4      3      1.0%
14     2      0.7%
5      1      0.3%
6      1      0.3%
19     1      0.3%
16     1      0.3%
15     1      0.3%

Minimum 10 values

Value  Count  Frequency (%)
1      245    82.2%
2      35     11.7%
3      8      2.7%
4      3      1.0%
5      1      0.3%
6      1      0.3%
14     2      0.7%
15     1      0.3%
16     1      0.3%
19     1      0.3%

Maximum 10 values

Value  Count  Frequency (%)
19     1      0.3%
16     1      0.3%
15     1      0.3%
14     2      0.7%
6      1      0.3%
5      1      0.3%
4      3      1.0%
3      8      2.7%
2      35     11.7%
1      245    82.2%

필지수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 43
Distinct (%): 14.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.4060403
Minimum: 1
Maximum: 148
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 9
95-th percentile: 47.6
Maximum: 148
Range: 147
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 16.473207
Coefficient of variation (CV): 1.7513435
Kurtosis: 20.743275
Mean: 9.4060403
Median Absolute Deviation (MAD): 2
Skewness: 3.8746672
Sum: 2803
Variance: 271.36656
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)

Common Values

Value              Count  Frequency (%)
1                  89     29.9%
2                  48     16.1%
3                  19     6.4%
4                  19     6.4%
6                  17     5.7%
9                  15     5.0%
5                  9      3.0%
12                 6      2.0%
7                  6      2.0%
8                  6      2.0%
Other values (33)  64     21.5%

Minimum 10 values

Value  Count  Frequency (%)
1      89     29.9%
2      48     16.1%
3      19     6.4%
4      19     6.4%
5      9      3.0%
6      17     5.7%
7      6      2.0%
8      6      2.0%
9      15     5.0%
10     5      1.7%

Maximum 10 values

Value  Count  Frequency (%)
148    1      0.3%
85     1      0.3%
75     1      0.3%
68     2      0.7%
66     1      0.3%
62     1      0.3%
61     1      0.3%
60     2      0.7%
58     1      0.3%
55     1      0.3%

품종수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2516779
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 10.15
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.3293533
Coefficient of variation (CV): 1.0238878
Kurtosis: 2.446232
Mean: 3.2516779
Median Absolute Deviation (MAD): 1
Skewness: 1.7439552
Sum: 969
Variance: 11.084593
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Common Values

Value             Count  Frequency (%)
1                 135    45.3%
2                 57     19.1%
3                 19     6.4%
4                 17     5.7%
6                 17     5.7%
8                 10     3.4%
9                 10     3.4%
5                 8      2.7%
10                6      2.0%
7                 4      1.3%
Other values (6)  15     5.0%

Minimum 10 values

Value  Count  Frequency (%)
1      135    45.3%
2      57     19.1%
3      19     6.4%
4      17     5.7%
5      8      2.7%
6      17     5.7%
7      4      1.3%
8      10     3.4%
9      10     3.4%
10     6      2.0%

Maximum 10 values

Value  Count  Frequency (%)
16     2      0.7%
15     1      0.3%
14     3      1.0%
13     2      0.7%
12     3      1.0%
11     4      1.3%
10     6      2.0%
9      10     3.4%
8      10     3.4%
7      4      1.3%

데이터 추출일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB
Minimum: 2023-08-22 00:00:00
Maximum: 2023-08-22 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

Correlations

            년산   검사지원명  작물명  채종단계  차수   신청면적  소재지수  필지수  품종수
년산        1.000  0.108       0.000   0.000     0.000  0.000     0.000     0.000   0.072
검사지원명  0.108  1.000       0.686   0.301     0.604  0.527     0.274     0.532   0.726
작물명      0.000  0.686       1.000   0.375     0.707  0.500     0.239     0.563   0.678
채종단계    0.000  0.301       0.375   1.000     0.000  0.637     0.473     0.594   0.000
차수        0.000  0.604       0.707   0.000     1.000  0.264     0.000     0.254   0.395
신청면적    0.000  0.527       0.500   0.637     0.264  1.000     0.774     0.787   0.616
소재지수    0.000  0.274       0.239   0.473     0.000  0.774     1.000     0.653   0.243
필지수      0.000  0.532       0.563   0.594     0.254  0.787     0.653     1.000   0.632
품종수      0.072  0.726       0.678   0.000     0.395  0.616     0.243     0.632   1.000

            채종단계  작물명  년산   차수   검사지원명
채종단계    1.000     0.223   0.000  0.000  0.182
작물명      0.223     1.000   0.000  0.659  0.367
년산        0.000     0.000   1.000  0.000  0.063
차수        0.000     0.659   0.000  1.000  0.461
검사지원명  0.182     0.367   0.063  0.461  1.000

            신청면적  소재지수  필지수  품종수  년산   검사지원명  작물명  채종단계  차수
신청면적    1.000     0.321     0.792   0.435   0.000  0.271       0.239   0.463     0.260
소재지수    0.321     1.000     0.333   0.222   0.000  0.146       0.118   0.322     0.000
필지수      0.792     0.333     1.000   0.682   0.000  0.303       0.300   0.451     0.270
품종수      0.435     0.222     0.682   1.000   0.099  0.306       0.347   0.000     0.317
년산        0.000     0.000     0.000   0.099   1.000  0.063       0.000   0.000     0.000
검사지원명  0.271     0.146     0.303   0.306   0.063  1.000       0.367   0.182     0.461
작물명      0.239     0.118     0.300   0.347   0.000  0.367       1.000   0.223     0.659
채종단계    0.463     0.322     0.451   0.000   0.000  0.182       0.223   1.000     0.000
차수        0.260     0.000     0.270   0.317   0.000  0.461       0.659   0.000     1.000
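
The high-correlation alerts in the Overview line up with the third correlation matrix above: applying a cutoff of 0.5 to its off-diagonal entries returns exactly the pairs the alerts name. The sketch below shows that filter. The 0.5 threshold and the helper name `high_correlation_pairs` are assumptions chosen because they reproduce the alerts, not settings read from config.json.

```python
# Column order and values copied from the third correlation matrix in this report
columns = ["신청면적", "소재지수", "필지수", "품종수", "년산",
           "검사지원명", "작물명", "채종단계", "차수"]
matrix = [
    [1.000, 0.321, 0.792, 0.435, 0.000, 0.271, 0.239, 0.463, 0.260],
    [0.321, 1.000, 0.333, 0.222, 0.000, 0.146, 0.118, 0.322, 0.000],
    [0.792, 0.333, 1.000, 0.682, 0.000, 0.303, 0.300, 0.451, 0.270],
    [0.435, 0.222, 0.682, 1.000, 0.099, 0.306, 0.347, 0.000, 0.317],
    [0.000, 0.000, 0.000, 0.099, 1.000, 0.063, 0.000, 0.000, 0.000],
    [0.271, 0.146, 0.303, 0.306, 0.063, 1.000, 0.367, 0.182, 0.461],
    [0.239, 0.118, 0.300, 0.347, 0.000, 0.367, 1.000, 0.223, 0.659],
    [0.463, 0.322, 0.451, 0.000, 0.000, 0.182, 0.223, 1.000, 0.000],
    [0.260, 0.000, 0.270, 0.317, 0.000, 0.461, 0.659, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumption: cutoff that reproduces the alerts in this report


def high_correlation_pairs(columns, matrix, threshold):
    """Return (col_a, col_b, value) for each off-diagonal pair at or above threshold."""
    pairs = []
    for i in range(len(columns)):
        # Upper triangle only: the matrix is symmetric, so each pair appears once.
        for j in range(i + 1, len(columns)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


for a, b, value in high_correlation_pairs(columns, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {value:.3f}")
```

Running the same filter on the first matrix would flag more pairs (for example 신청면적 and 소재지수 at 0.774), which suggests the alerts were not computed from that matrix.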

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

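First rows

Row | 년산 | 검사지원명 | 작물명 | 채종단계 | 차수 | 신청면적 | 소재지수 | 필지수 | 품종수 | 데이터 추출일자
0 | 2021 | 충북지원 |  | 원원종포 | 1차 | 14.0 | 1 | 8 | 8 | 2023-08-22
1 | 2021 | 충북지원 |  | 원종포 | 1차 | 300.0 | 1 | 14 | 8 | 2023-08-22
2 | 2021 | 충북지원 |  | 채종포1세대 | 1차 | 857.0 | 1 | 19 | 4 | 2023-08-22
3 | 2021 | 충북지원 | 겉보리 | 원원종포 | 1차 | 5.0 | 1 | 1 | 1 | 2023-08-22
4 | 2021 | 충북지원 | 겉보리 | 원종포 | 1차 | 70.0 | 1 | 2 | 1 | 2023-08-22
5 | 2021 | 충북지원 |  | 원원종포 | 1차 | 33.0 | 1 | 3 | 3 | 2023-08-22
6 | 2021 | 충북지원 |  | 원종포 | 1차 | 694.0 | 1 | 24 | 3 | 2023-08-22
7 | 2021 | 충북지원 |  | 채종포1세대 | 1차 | 12.0 | 1 | 1 | 1 | 2023-08-22
8 | 2021 | 충북지원 |  | 원원종포 | 1차 | 1.0 | 1 | 1 | 1 | 2023-08-22
9 | 2021 | 충북지원 |  | 원종포 | 1차 | 10.0 | 1 | 1 | 1 | 2023-08-22

Last rows

Row | 년산 | 검사지원명 | 작물명 | 채종단계 | 차수 | 신청면적 | 소재지수 | 필지수 | 품종수 | 데이터 추출일자
288 | 2023 | 수도권현장팀 | 쌀보리 | 원원종포 | 1차 | 9.0 | 1 | 1 | 1 | 2023-08-22
289 | 2023 | 수도권현장팀 | 쌀보리 | 원종포 | 1차 | 100.0 | 1 | 4 | 1 | 2023-08-22
290 | 2023 | 수도권현장팀 |  | 원원종포 | 1차 | 32.0 | 1 | 5 | 5 | 2023-08-22
291 | 2023 | 수도권현장팀 |  | 원종포 | 1차 | 506.0 | 3 | 5 | 5 | 2023-08-22
292 | 2023 | 수도권현장팀 |  | 원원종포 | 1차 | 4.0 | 1 | 1 | 1 | 2023-08-22
293 | 2023 | 수도권현장팀 |  | 원종포 | 1차 | 50.0 | 1 | 1 | 1 | 2023-08-22
294 | 2023 | 동부지원 | 봄감자 | 원원종포 | 1차 | 520.0 | 1 | 19 | 13 | 2023-08-22
295 | 2023 | 동부지원 | 봄감자 | 원종포 | 1차 | 3760.0 | 1 | 61 | 11 | 2023-08-22
296 | 2023 | 동부지원 | 봄감자 | 원원종포 | 2차 | 520.0 | 1 | 19 | 13 | 2023-08-22
297 | 2023 | 동부지원 | 봄감자 | 원종포 | 2차 | 3760.0 | 1 | 62 | 11 | 2023-08-22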