Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 830.1 KiB
Average record size in memory: 85.0 B
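
The average record size appears to be the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 B and using only the two figures reported above (the function name is illustrative, not part of the profiler):

```python
# Figures copied from the "Dataset statistics" panel.
TOTAL_SIZE_KIB = 830.1
N_OBSERVATIONS = 10_000


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")  # prints "85.0 B"
```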

Variable types

Categorical: 4
Numeric: 4
Text: 1

Dataset

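Description: Seed specification records for cleaned, warehoused government-supplied seed (정부보급종). The dataset provides information such as the production year (년산), branch name (지원명), department name (부서명), seed specification number (종자규격번호), crop name (작물명), variety name (품종명), length (길이), width (폭), and thickness (두께).
URL: https://www.data.go.kr/data/15066323/fileData.do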

Alerts

부서명 is highly overall correlated with 지원명 (High correlation)
지원명 is highly overall correlated with 부서명 (High correlation)
폭 is highly overall correlated with 두께 (High correlation)
두께 is highly overall correlated with 폭 and 1 other field (High correlation)
작물명 is highly overall correlated with 두께 (High correlation)

Reproduction

Analysis started: 2023-12-12 06:44:42.649115
Analysis finished: 2023-12-12 06:44:45.796237
Duration: 3.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
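
A report of this kind can be regenerated along the following lines. This is a sketch under stated assumptions, not the configuration actually used: the CSV path, output file name, and report title are hypothetical, and the real file may need an explicit encoding when read.

```python
from pathlib import Path


def build_profile(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report; returns the output path.

    Third-party imports are deferred so that defining this function does not
    require pandas or ydata-profiling to be installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Hypothetical input: the CSV downloaded from the dataset URL.
    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Seed specification profile")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_profile("seed_specs.csv")` (file name assumed) would write `report.html` next to the script.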

Variables

년산
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2022
4th row: 2021
5th row: 2022

Common Values

Value  Count  Frequency (%)
2020   3387   33.9%
2021   3362   33.6%
2022   3251   32.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]
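
The frequency and distinct percentages for 년산 follow directly from the counts. A small sketch, assuming the percentages are rounded to one decimal place as displayed:

```python
# Counts copied from the 년산 "Common Values" table.
counts = {"2020": 3387, "2021": 3362, "2022": 3251}
total = sum(counts.values())  # number of observations

# Share of each value, rounded to one decimal place as in the report.
freq = {year: round(100 * n / total, 1) for year, n in counts.items()}

# Three distinct values out of 10000 rows is 0.03%, shown as "< 0.1%".
distinct_pct = 100 * len(counts) / total

print(freq, distinct_pct)
```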

지원명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 4
Mean length: 4.3429
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전남지원
2nd row: 경북지원
3rd row: 경북지원
4th row: 전남지원
5th row: 전북지원

Common Values

Value           Count  Frequency (%)
전북지원        2070   20.7%
경남지원        1626   16.3%
전남지원        1456   14.6%
경기종자관리소  1143   11.4%
충남지원        1036   10.4%
경북지원        1035   10.3%
강원지원        911    9.1%
충북지원        723    7.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]
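
The mean length of 4.3429 reported for 지원명 can be checked against the value counts: every branch name has four characters except 경기종자관리소, which has seven. A short sketch, assuming length is counted in Unicode code points of the precomposed Hangul syllables:

```python
# Counts copied from the 지원명 "Common Values" table.
counts = {
    "전북지원": 2070,
    "경남지원": 1626,
    "전남지원": 1456,
    "경기종자관리소": 1143,
    "충남지원": 1036,
    "경북지원": 1035,
    "강원지원": 911,
    "충북지원": 723,
}

total = sum(counts.values())

# Count-weighted mean of the string lengths.
mean_length = sum(len(name) * n for name, n in counts.items()) / total

print(total, mean_length)
```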

부서명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.2948
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영암
2nd row: <NA>
3rd row: <NA>
4th row: 함평
5th row: 익산

Common Values

Value  Count  Frequency (%)
<NA>   6474   64.7%
정읍   1226   12.3%
영암   1103   11.0%
익산   844    8.4%
함평   353    3.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]
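
The 부서명 panel reports no missing values while listing `<NA>` as the most common category, and the mean length of 3.2948 only works out if `<NA>` is counted as a four-character string. That suggests, though the report does not confirm it, that the placeholder is stored as literal text. A sketch of how such placeholders could be mapped to real nulls before profiling; the placeholder set and the `normalise` helper are assumptions for illustration:

```python
PLACEHOLDERS = {"<NA>", "na", ""}  # assumed spellings of "no department"


def normalise(value):
    """Return None for placeholder strings, otherwise the stripped value."""
    if value is None:
        return None
    text = value.strip()
    return None if text in PLACEHOLDERS else text


# Counts copied from the 부서명 "Common Values" table.
counts = {"<NA>": 6474, "정읍": 1226, "영암": 1103, "익산": 844, "함평": 353}
total = sum(counts.values())

# Mean length with "<NA>" counted as a 4-character string.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

# Rows that would count as missing once placeholders become nulls.
missing = sum(n for value, n in counts.items() if normalise(value) is None)

print(mean_length, missing / total)
```

Under this reading, roughly 64.7% of rows would have no department recorded.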

종자규격번호
Real number (ℝ)

Distinct: 100
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.6843
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5
Q1: 26
Median: 51
Q3: 75
95th percentile: 95
Maximum: 100
Range: 99
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 28.776648
Coefficient of variation (CV): 0.56776255
Kurtosis: -1.1906824
Mean: 50.6843
Median Absolute Deviation (MAD): 25
Skewness: -0.017124126
Sum: 506843
Variance: 828.09544
Monotonicity: Not monotonic
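
Several of the figures for 종자규격번호 are derived from others in the same panel. The sketch below recomputes them from the reported values; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the quantile and descriptive statistics above.
mean, std = 50.6843, 28.776648
minimum, maximum = 1, 100
q1, q3 = 26, 75
n = 10_000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum equals mean times row count

print(value_range, iqr, round(cv, 5), round(variance, 2), round(total))
```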
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
80                 116    1.2%
22                 115    1.1%
38                 113    1.1%
12                 113    1.1%
66                 111    1.1%
90                 111    1.1%
70                 111    1.1%
32                 111    1.1%
73                 109    1.1%
25                 109    1.1%
Other values (90)  8881   88.8%

Minimum 10 values

Value  Count  Frequency (%)
1      108    1.1%
2      98     1.0%
3      103    1.0%
4      104    1.0%
5      106    1.1%
6      100    1.0%
7      105    1.1%
8      90     0.9%
9      92     0.9%
10     91     0.9%

Maximum 10 values

Value  Count  Frequency (%)
100    98     1.0%
99     100    1.0%
98     102    1.0%
97     89     0.9%
96     106    1.1%
95     92     0.9%
94     98     1.0%
93     96     1.0%
92     92     0.9%
91     101    1.0%

작물명
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.1655
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보리
2nd row: 보리
3rd row:
4th row:
5th row:

Common Values

Value  Count  Frequency (%)
       5404   54.0%
       2053   20.5%
보리   1438   14.4%
       639    6.4%
       249    2.5%
호밀   217    2.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

품종명
Text

Distinct: 58
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.5286
Min length: 2

Characters and Unicode

Total characters: 35286
Distinct characters: 74
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 새쌀보리
2nd row: 영양보리
3rd row: 삼광벼
4th row: 새금강밀
5th row: 선풍콩

Value              Count  Frequency (%)
삼광벼             651    6.5%
대원콩             605    6.0%
해담쌀             384    3.8%
추청벼             375    3.8%
대찬콩             347    3.5%
신동진벼           332    3.3%
새일미벼           289    2.9%
선풍콩             287    2.9%
흰찰쌀보리         286    2.9%
백옥찰벼           270    2.7%
Other values (48)  6174   61.7%

Most occurring characters

Value              Count  Frequency (%)
                   3889   11.0%
                   2197   6.2%
                   2053   5.8%
                   1521   4.3%
                   1293   3.7%
                   1187   3.4%
                   1160   3.3%
                   1103   3.1%
                   1019   2.9%
                   893    2.5%
Other values (64)  18971  53.8%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    34917  99.0%
Decimal Number  369    1.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   3889   11.1%
                   2197   6.3%
                   2053   5.9%
                   1521   4.4%
                   1293   3.7%
                   1187   3.4%
                   1160   3.3%
                   1103   3.2%
                   1019   2.9%
                   893    2.6%
Other values (63)  18602  53.3%

Decimal Number
Value  Count  Frequency (%)
1      369    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  34917  99.0%
Common  369    1.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   3889   11.1%
                   2197   6.3%
                   2053   5.9%
                   1521   4.4%
                   1293   3.7%
                   1187   3.4%
                   1160   3.3%
                   1103   3.2%
                   1019   2.9%
                   893    2.6%
Other values (63)  18602  53.3%

Common
Value  Count  Frequency (%)
1      369    100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  34917  99.0%
ASCII   369    1.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   3889   11.1%
                   2197   6.3%
                   2053   5.9%
                   1521   4.4%
                   1293   3.7%
                   1187   3.4%
                   1160   3.3%
                   1103   3.2%
                   1019   2.9%
                   893    2.6%
Other values (63)  18602  53.3%

ASCII
Value  Count  Frequency (%)
1      369    100.0%

길이
Real number (ℝ)

Distinct: 485
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.164665
Minimum: 2.53
Maximum: 9.96
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.53
5th percentile: 5.86
Q1: 6.73
Median: 7.13
Q3: 7.6325
95th percentile: 8.48
Maximum: 9.96
Range: 7.43
Interquartile range (IQR): 0.9025

Descriptive statistics

Standard deviation: 0.77528807
Coefficient of variation (CV): 0.10820995
Kurtosis: 0.6552613
Mean: 7.164665
Median Absolute Deviation (MAD): 0.45
Skewness: -0.033925203
Sum: 71646.65
Variance: 0.60107159
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
7.1                 119    1.2%
6.87                94     0.9%
7.2                 94     0.9%
7.11                91     0.9%
7.14                88     0.9%
6.88                87     0.9%
6.8                 82     0.8%
6.98                82     0.8%
7.13                80     0.8%
6.83                74     0.7%
Other values (475)  9109   91.1%

Minimum 10 values

Value  Count  Frequency (%)
2.53   1      < 0.1%
3.88   1      < 0.1%
4.02   1      < 0.1%
4.03   1      < 0.1%
4.1    1      < 0.1%
4.13   1      < 0.1%
4.16   1      < 0.1%
4.22   1      < 0.1%
4.31   1      < 0.1%
4.44   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
9.96   2      < 0.1%
9.92   1      < 0.1%
9.86   3      < 0.1%
9.82   1      < 0.1%
9.73   1      < 0.1%
9.7    1      < 0.1%
9.66   1      < 0.1%
9.65   1      < 0.1%
9.57   2      < 0.1%
9.54   1      < 0.1%


폭
Real number (ℝ)

HIGH CORRELATION 

Distinct: 633
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.079742
Minimum: 0.1
Maximum: 42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.1
5th percentile: 2.66
Q1: 3.09
Median: 3.3
Q3: 3.7
95th percentile: 7.89
Maximum: 42
Range: 41.9
Interquartile range (IQR): 0.61

Descriptive statistics

Standard deviation: 1.7904609
Coefficient of variation (CV): 0.4388662
Kurtosis: 21.4378
Mean: 4.079742
Median Absolute Deviation (MAD): 0.26
Skewness: 2.347353
Sum: 40797.42
Variance: 3.2057502
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
3.2                 189    1.9%
3.3                 173    1.7%
3.1                 150    1.5%
3.25                140    1.4%
3.26                137    1.4%
3.32                133    1.3%
3.12                130    1.3%
3.14                129    1.3%
3.21                127    1.3%
3.18                126    1.3%
Other values (623)  8566   85.7%

Minimum 10 values

Value  Count  Frequency (%)
0.1    1      < 0.1%
1.53   2      < 0.1%
1.54   1      < 0.1%
1.76   1      < 0.1%
1.77   1      < 0.1%
1.79   2      < 0.1%
1.8    1      < 0.1%
1.86   1      < 0.1%
1.87   1      < 0.1%
1.89   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
42.0   1      < 0.1%
23.3   1      < 0.1%
9.19   1      < 0.1%
9.04   1      < 0.1%
9.02   1      < 0.1%
8.98   2      < 0.1%
8.96   1      < 0.1%
8.95   1      < 0.1%
8.94   1      < 0.1%
8.93   1      < 0.1%
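
The two largest width (폭) values, 42.0 and 23.3, sit well apart from the next value of 9.19, and the minimum of 0.1 is similarly isolated. One common way to screen for such values is Tukey's fences built from the quartiles. The sketch below applies them to the reported Q1 and Q3; the candidate list is a hand-picked subset of values from the tables above, and the wider multiplier `k=10` is an illustrative assumption, not a recommended setting.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k interquartile ranges from the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag(values, q1, q3, k=1.5):
    """Keep only the values that fall outside the fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


Q1, Q3 = 3.09, 3.7  # quartiles reported for the width column
candidates = [0.1, 1.53, 3.3, 7.89, 9.19, 23.3, 42.0]

standard = flag(candidates, Q1, Q3)      # conventional k = 1.5
loose = flag(candidates, Q1, Q3, k=10)   # much wider fences

print(standard)
print(loose)
```

With `k=1.5` the upper fence is about 4.6, below the 95th percentile of 7.89, so more than 5% of rows would be flagged. That is consistent with a distribution containing a second cluster of wide seeds, in which case fences computed per crop may be more useful than pooled ones. Only 23.3 and 42.0 remain flagged with the wider fences.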

두께
Real number (ℝ)

HIGH CORRELATION 

Distinct: 616
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.264636
Minimum: 0
Maximum: 8.97
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 2.01
Q1: 2.2
Median: 2.33
Q3: 3.14
95th percentile: 7.05
Maximum: 8.97
Range: 8.97
Interquartile range (IQR): 0.94

Descriptive statistics

Standard deviation: 1.7801278
Coefficient of variation (CV): 0.54527605
Kurtosis: 0.34625098
Mean: 3.264636
Median Absolute Deviation (MAD): 0.21
Skewness: 1.4014222
Sum: 32646.36
Variance: 3.1688551
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
2.3                 283    2.8%
2.21                220    2.2%
2.2                 210    2.1%
2.25                202    2.0%
2.24                194    1.9%
2.28                194    1.9%
2.31                183    1.8%
2.26                180    1.8%
2.27                179    1.8%
2.23                179    1.8%
Other values (606)  7976   79.8%

Minimum 10 values

Value  Count  Frequency (%)
0.0    1      < 0.1%
1.28   1      < 0.1%
1.32   1      < 0.1%
1.35   4      < 0.1%
1.36   1      < 0.1%
1.39   1      < 0.1%
1.41   1      < 0.1%
1.44   1      < 0.1%
1.45   2      < 0.1%
1.46   2      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
8.97   2      < 0.1%
8.86   1      < 0.1%
8.77   1      < 0.1%
8.72   1      < 0.1%
8.71   1      < 0.1%
8.7    1      < 0.1%
8.67   2      < 0.1%
8.65   3      < 0.1%
8.63   1      < 0.1%
8.61   1      < 0.1%

Interactions

[Figures: 16 interaction plots, one for each ordered pair of the four numeric variables]

Correlations

[Figure: correlation heatmap]

              년산   지원명  부서명  종자규격번호  작물명  품종명  길이   폭     두께
년산          1.000  0.163   0.191   0.000         0.127   0.499   0.137  0.070  0.186
지원명        0.163  1.000   1.000   0.000         0.379   0.939   0.331  0.208  0.389
부서명        0.191  1.000   1.000   0.000         0.494   0.890   0.305  0.391  0.522
종자규격번호  0.000  0.000   0.000   1.000         0.000   0.000   0.032  0.034  0.000
작물명        0.127  0.379   0.494   0.000         1.000   1.000   0.505  0.640  0.801
품종명        0.499  0.939   0.890   0.000         1.000   1.000   0.764  0.799  0.885
길이          0.137  0.331   0.305   0.032         0.505   0.764   1.000  0.478  0.608
폭            0.070  0.208   0.391   0.034         0.640   0.799   0.478  1.000  0.838
두께          0.186  0.389   0.522   0.000         0.801   0.885   0.608  0.838  1.000
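
The pairs named in the Alerts section can be picked out of the nine-variable matrix by thresholding its upper triangle. The sketch below does this with a cutoff of 0.8 and with 품종명 excluded. Both choices are assumptions made for illustration: the report does not state the profiler's configured threshold, and 품종명 is left out only because no alert mentions it. The fourth-from-last variable name is taken to be 폭 (width), as listed in the dataset description.

```python
NAMES = ["년산", "지원명", "부서명", "종자규격번호", "작물명", "품종명", "길이", "폭", "두께"]

# Correlation values copied row by row from the nine-variable matrix.
MATRIX = [
    [1.000, 0.163, 0.191, 0.000, 0.127, 0.499, 0.137, 0.070, 0.186],
    [0.163, 1.000, 1.000, 0.000, 0.379, 0.939, 0.331, 0.208, 0.389],
    [0.191, 1.000, 1.000, 0.000, 0.494, 0.890, 0.305, 0.391, 0.522],
    [0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.032, 0.034, 0.000],
    [0.127, 0.379, 0.494, 0.000, 1.000, 1.000, 0.505, 0.640, 0.801],
    [0.499, 0.939, 0.890, 0.000, 1.000, 1.000, 0.764, 0.799, 0.885],
    [0.137, 0.331, 0.305, 0.032, 0.505, 0.764, 1.000, 0.478, 0.608],
    [0.070, 0.208, 0.391, 0.034, 0.640, 0.799, 0.478, 1.000, 0.838],
    [0.186, 0.389, 0.522, 0.000, 0.801, 0.885, 0.608, 0.838, 1.000],
]


def high_pairs(names, matrix, threshold=0.8, exclude=()):
    """List (a, b, value) for each unordered pair at or above the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if names[i] in exclude or names[j] in exclude:
                continue
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


# Hypothetical settings: cutoff 0.8, text variable excluded.
pairs = high_pairs(NAMES, MATRIX, threshold=0.8, exclude={"품종명"})
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Under these assumptions the result is the three pairs behind the alerts: 지원명 with 부서명, 작물명 with 두께, and 폭 with 두께.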
[Figure: correlation heatmap]

        부서명  지원명  년산   작물명
부서명  1.000   1.000   0.182  0.340
지원명  1.000   1.000   0.104  0.221
년산    0.182   0.104   1.000  0.052
작물명  0.340   0.221   0.052  1.000
[Figure: correlation heatmap]

              종자규격번호  길이    폭      두께    년산   지원명  부서명  작물명
종자규격번호  1.000         -0.002  -0.008  -0.003  0.000  0.000   0.000   0.000
길이          -0.002        1.000   0.400   0.383   0.082  0.165   0.199   0.295
폭            -0.008        0.400   1.000   0.740   0.053  0.129   0.161   0.500
두께          -0.003        0.383   0.740   1.000   0.112  0.197   0.255   0.588
년산          0.000         0.082   0.053   0.112   1.000  0.104   0.182   0.052
지원명        0.000         0.165   0.129   0.197   0.104  1.000   1.000   0.221
부서명        0.000         0.199   0.161   0.255   0.182  1.000   1.000   0.340
작물명        0.000         0.295   0.500   0.588   0.052  0.221   0.340   1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  년산  지원명    부서명  종자규격번호  작물명  품종명    길이  폭    두께
12656  2021  전남지원  영암    56            보리    새쌀보리  5.2   3.14  1.86
13805  2021  경북지원  <NA>    6             보리    영양보리  8.64  3.54  2.83
21410  2022  경북지원  <NA>    87                    삼광벼    7.11  3.31  2.31
13305  2021  전남지원  함평    6                     새금강밀  6.42  3.68  2.92
19671  2022  전북지원  익산    72                    선풍콩    8.0   7.5   6.3
14767  2021  경남지원  <NA>    68                    현품벼    6.89  3.11  2.16
5306   2020  경북지원  <NA>    10                    삼광벼    6.81  2.95  2.04
21714  2022  경북지원  <NA>    15                    대원콩    7.71  7.18  6.75
5481   2020  경북지원  <NA>    82                    백옥찰벼  6.41  2.45  1.78
17016  2022  충북지원  <NA>    17                    삼광벼    6.59  3.08  2.2

Index  년산  지원명    부서명  종자규격번호  작물명  품종명    길이  폭    두께
20840  2022  전남지원  영암    41                    태광콩    7.4   6.69  5.72
7040   2020  경남지원  <NA>    41                    조경밀    6.87  3.56  3.14
10029  2021  충남지원  <NA>    41                    미품벼    6.87  2.83  2.3
9223   2021  충북지원  <NA>    24                    오대벼    7.48  3.4   2.65
2016   2020  전북지원  익산    17                    해담쌀    6.8   3.1   2.2
5302   2020  경북지원  <NA>    6                     삼광벼    6.68  3.11  2.32
19053  2022  전북지원  익산    54                    신동진벼  8.6   3.4   2.1
10645  2021  전북지원  익산    6                     해담쌀    7.27  3.11  2.32
7903   2020  강원지원  <NA>    4                     청아콩    8.2   7.82  6.56
20911  2022  전남지원  영암    12                    선풍콩    8.57  7.54  6.38