Overview

Dataset statistics

Number of variables: 5
Number of observations: 395
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 17
Duplicate rows (%): 4.3%
Total size in memory: 15.9 KiB
Average record size in memory: 41.3 B

Variable types

Categorical: 1
Text: 3
Numeric: 1

Dataset

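Description: The Agricultural Research and Extension Services, an agency directly under Gyeongsangnam-do, runs variety-breeding programs for food crops, horticultural crops, flowers, fruit trees, and other crops. Each year, the elite lines that have been bred are reviewed internally by the Gyeongsangnam-do Seed Committee, filed with the Korea Seed and Variety Service, and registered as varieties after passing the cultivation trial. As of 2023, about 395 varieties have been registered.
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15091169/fileData.do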

Alerts

Dataset has 17 (4.3%) duplicate rows [Duplicates]
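
As a rough illustration of what this alert measures, the sketch below groups identical rows and recomputes the reported share (17 of 395). It is a minimal standard-library example, not ydata-profiling's own implementation: the helper name `duplicate_groups` is hypothetical, the toy rows borrow a few values from this report's sample and duplicate-rows listings, and the profiler's exact counting rule is treated as an assumption.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows (crop, variety, line, year); illustrative subset only.
rows = [
    ("장미", "실키", "경남R-38", 2015),
    ("장미", "실키", "경남R-38", 2015),
    ("장미", "셀리나", "경남R-39", 2015),
    ("장미", "셀리나", "경남R-39", 2015),
    ("장미", "하니", "경남R-9호", 2006),
]
groups = duplicate_groups(rows)
print(groups)

# Share stated in the alert: 17 out of 395 observations, in percent.
duplicate_share = round(17 / 395 * 100, 1)
print(duplicate_share)
```

Grouping on the full row tuple means two records only count as duplicates when every column matches, which is also how the duplicate-rows listing at the end of this report reads.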

Reproduction

Analysis started: 2024-03-14 10:19:39.583885
Analysis finished: 2024-03-14 10:19:40.879158
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
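
A report like this one is typically produced by loading the source file into pandas and passing the frame to `ProfileReport`. The sketch below shows that flow under stated assumptions: the function name `build_profile`, the file paths, the report title, and the default encoding are all hypothetical (Korean public-data CSV files are often encoded as CP949, which would need `encoding="cp949"`), and the original run's `config.json` settings are not reproduced.

```python
def build_profile(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile
```

A call such as `build_profile("varieties.csv", "report.html")` would then write the HTML report; both file names are placeholders.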

Variables

작물명 (crop name)
Categorical

Distinct: 17
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count
국화 | 165
장미 | 79
거베라 | 56
호접란 | 25
버 섯 | 15
Other values (12) | 55

Length

Max length: 4
Median length: 2
Mean length: 2.3468354
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 버 섯
2nd row: 장미
3rd row: 장미
4th row: 장미
5th row: 버 섯

Common Values

Value | Count | Frequency (%)
국화 | 165 | 41.8%
장미 | 79 | 20.0%
거베라 | 56 | 14.2%
호접란 | 25 | 6.3%
버 섯 | 15 | 3.8%
파프리카 | 9 | 2.3%
나리 | 7 | 1.8%
단감 | 6 | 1.5%
양 파 | 6 | 1.5%
단 감 | 5 | 1.3%
Other values (7) | 22 | 5.6%
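
The table lists both 단감 and 단 감, and some labels (버 섯, 양 파) contain an internal space, so inconsistent spacing appears to split at least one crop across two categories. The sketch below shows one possible clean-up, assuming that whitespace inside a crop name is never meaningful; the helper name `normalize_label` is hypothetical, and only the ten named rows of the table are used.

```python
from collections import Counter

# Counts copied from the ten named rows of the Common Values table.
observed = {
    "국화": 165, "장미": 79, "거베라": 56, "호접란": 25, "버 섯": 15,
    "파프리카": 9, "나리": 7, "단감": 6, "양 파": 6, "단 감": 5,
}


def normalize_label(label):
    """Drop all whitespace so that '단 감' and '단감' become one label."""
    return "".join(label.split())


merged = Counter()
for label, count in observed.items():
    merged[normalize_label(label)] += count

print(merged["단감"])               # combined count for both spellings
print(len(observed), len(merged))   # label count before and after merging
```

Applying a normalisation like this before profiling would lower the distinct count for the column and shift the frequency shares, so it is worth deciding on before interpreting the category statistics.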

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
국화 | 165 | 38.9%
장미 | 79 | 18.6%
거베라 | 56 | 13.2%
호접란 | 25 | 5.9%
(not recovered) | 15 | 3.5%
(not recovered) | 15 | 3.5%
파프리카 | 9 | 2.1%
나리 | 7 | 1.7%
(not recovered) | 6 | 1.4%
(not recovered) | 6 | 1.4%
Other values (11) | 41 | 9.7%

품 종 명 (variety name)
Text

Distinct: 371
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 15
Median length: 12
Mean length: 3.9721519
Min length: 2

Characters and Unicode

Total characters: 1569
Distinct characters: 244
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
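
To make the category counts in this section concrete, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is an illustration, not the profiler's own code: the helper name `category_counts` is hypothetical, the input is only the five sample values shown for this variable, and the long names in `CATEGORY_NAMES` cover just the categories that occur for this variable. Scripts and blocks are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur for this variable.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values listed for this variable.
sample = ["큰느타리1호", "템테이션", "니나", "레드템", "새송이1호"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" (`Lo`) because they have no upper or lower case, which is why that category dominates the tables for the Korean text columns.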

Unique

Unique: 348
Unique (%): 88.1%

Sample

1st row: 큰느타리1호
2nd row: 템테이션
3rd row: 니나
4th row: 레드템
5th row: 새송이1호

Value | Count | Frequency (%)
핑크아이 | 3 | 0.7%
daon | 3 | 0.7%
new | 3 | 0.7%
옐로티 | 2 | 0.5%
브라보지엔 | 2 | 0.5%
레드하모니 | 2 | 0.5%
마나 | 2 | 0.5%
가야와인 | 2 | 0.5%
라비아 | 2 | 0.5%
레드샤인 | 2 | 0.5%
Other values (364) | 379 | 94.3%

Most occurring characters

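Value | Count | Frequency (%)
(not recovered) | 67 | 4.3%
(not recovered) | 50 | 3.2%
(not recovered) | 48 | 3.1%
(not recovered) | 46 | 2.9%
(not recovered) | 43 | 2.7%
(not recovered) | 41 | 2.6%
(not recovered) | 37 | 2.4%
(not recovered) | 37 | 2.4%
(not recovered) | 35 | 2.2%
(not recovered) | 32 | 2.0%
Other values (234) | 1133 | 72.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1512 | 96.4%
Lowercase Letter | 30 | 1.9%
Decimal Number | 14 | 0.9%
Space Separator | 7 | 0.4%
Uppercase Letter | 6 | 0.4%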

Most frequent character per category

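Other Letter
Value | Count | Frequency (%)
(not recovered) | 67 | 4.4%
(not recovered) | 50 | 3.3%
(not recovered) | 48 | 3.2%
(not recovered) | 46 | 3.0%
(not recovered) | 43 | 2.8%
(not recovered) | 41 | 2.7%
(not recovered) | 37 | 2.4%
(not recovered) | 37 | 2.4%
(not recovered) | 35 | 2.3%
(not recovered) | 32 | 2.1%
Other values (215) | 1076 | 71.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 6 | 20.0%
o | 5 | 16.7%
a | 4 | 13.3%
n | 4 | 13.3%
w | 4 | 13.3%
l | 2 | 6.7%
r | 2 | 6.7%
g | 1 | 3.3%
y | 1 | 3.3%
d | 1 | 3.3%

Decimal Number
Value | Count | Frequency (%)
3 | 4 | 28.6%
1 | 4 | 28.6%
5 | 3 | 21.4%
7 | 1 | 7.1%
2 | 1 | 7.1%
6 | 1 | 7.1%

Uppercase Letter
Value | Count | Frequency (%)
D | 3 | 50.0%
N | 3 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%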

Most occurring scripts

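Value | Count | Frequency (%)
Hangul | 1512 | 96.4%
Latin | 36 | 2.3%
Common | 21 | 1.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recovered) | 67 | 4.4%
(not recovered) | 50 | 3.3%
(not recovered) | 48 | 3.2%
(not recovered) | 46 | 3.0%
(not recovered) | 43 | 2.8%
(not recovered) | 41 | 2.7%
(not recovered) | 37 | 2.4%
(not recovered) | 37 | 2.4%
(not recovered) | 35 | 2.3%
(not recovered) | 32 | 2.1%
Other values (215) | 1076 | 71.2%

Latin
Value | Count | Frequency (%)
e | 6 | 16.7%
o | 5 | 13.9%
a | 4 | 11.1%
n | 4 | 11.1%
w | 4 | 11.1%
D | 3 | 8.3%
N | 3 | 8.3%
l | 2 | 5.6%
r | 2 | 5.6%
g | 1 | 2.8%
Other values (2) | 2 | 5.6%

Common
Value | Count | Frequency (%)
(space) | 7 | 33.3%
3 | 4 | 19.0%
1 | 4 | 19.0%
5 | 3 | 14.3%
7 | 1 | 4.8%
2 | 1 | 4.8%
6 | 1 | 4.8%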

Most occurring blocks

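Value | Count | Frequency (%)
Hangul | 1512 | 96.4%
ASCII | 57 | 3.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not recovered) | 67 | 4.4%
(not recovered) | 50 | 3.3%
(not recovered) | 48 | 3.2%
(not recovered) | 46 | 3.0%
(not recovered) | 43 | 2.8%
(not recovered) | 41 | 2.7%
(not recovered) | 37 | 2.4%
(not recovered) | 37 | 2.4%
(not recovered) | 35 | 2.3%
(not recovered) | 32 | 2.1%
Other values (215) | 1076 | 71.2%

ASCII
Value | Count | Frequency (%)
(space) | 7 | 12.3%
e | 6 | 10.5%
o | 5 | 8.8%
a | 4 | 7.0%
n | 4 | 7.0%
w | 4 | 7.0%
3 | 4 | 7.0%
1 | 4 | 7.0%
D | 3 | 5.3%
N | 3 | 5.3%
Other values (9) | 13 | 22.8%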

계 통 명 (breeding line name)
Text

Distinct: 374
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 105
Median length: 42
Mean length: 8.1037975
Min length: 2

Characters and Unicode

Total characters: 3201
Distinct characters: 79
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 353
Unique (%): 89.4%

Sample

1st row: 큰느타리1호
2nd row: 창원R-1호
3rd row: 경남R-2호
4th row: 경남R-3호
5th row: A8B8

Value | Count | Frequency (%)
경남 | 50 | 10.9%
daon | 3 | 0.7%
new | 3 | 0.7%
경남r-38 | 2 | 0.4%
경남r-45 | 2 | 0.4%
경남rs-50 | 2 | 0.4%
경남rs-51 | 2 | 0.4%
경남교rs-64 | 2 | 0.4%
경남교p-20 | 2 | 0.4%
경남r-43 | 2 | 0.4%
Other values (373) | 388 | 84.7%

Most occurring characters

Value | Count | Frequency (%)
- | 410 | 12.8%
(not recovered) | 337 | 10.5%
(not recovered) | 337 | 10.5%
C | 172 | 5.4%
2 | 155 | 4.8%
1 | 155 | 4.8%
(not recovered) | 146 | 4.6%
4 | 116 | 3.6%
3 | 115 | 3.6%
(not recovered) | 110 | 3.4%
Other values (69) | 1148 | 35.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1024 | 32.0%
Other Letter | 994 | 31.1%
Uppercase Letter | 605 | 18.9%
Dash Punctuation | 410 | 12.8%
Space Separator | 66 | 2.1%
Lowercase Letter | 31 | 1.0%
Math Symbol | 29 | 0.9%
Open Punctuation | 15 | 0.5%
Close Punctuation | 15 | 0.5%
Other Punctuation | 12 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 337 | 33.9%
(not recovered) | 337 | 33.9%
(not recovered) | 146 | 14.7%
(not recovered) | 110 | 11.1%
(not recovered) | 8 | 0.8%
(not recovered) | 8 | 0.8%
(not recovered) | 6 | 0.6%
(not recovered) | 6 | 0.6%
(not recovered) | 4 | 0.4%
(not recovered) | 3 | 0.3%
Other values (21) | 29 | 2.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 172 | 28.4%
S | 109 | 18.0%
P | 101 | 16.7%
R | 96 | 15.9%
G | 58 | 9.6%
N | 18 | 3.0%
K | 17 | 2.8%
D | 11 | 1.8%
A | 8 | 1.3%
L | 7 | 1.2%
Other values (6) | 8 | 1.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 6 | 19.4%
o | 5 | 16.1%
w | 4 | 12.9%
a | 4 | 12.9%
n | 4 | 12.9%
r | 2 | 6.5%
l | 2 | 6.5%
d | 1 | 3.2%
y | 1 | 3.2%
g | 1 | 3.2%

Decimal Number
Value | Count | Frequency (%)
2 | 155 | 15.1%
1 | 155 | 15.1%
4 | 116 | 11.3%
3 | 115 | 11.2%
5 | 109 | 10.6%
6 | 90 | 8.8%
0 | 88 | 8.6%
7 | 83 | 8.1%
8 | 64 | 6.2%
9 | 49 | 4.8%

Open Punctuation
Value | Count | Frequency (%)
( | 11 | 73.3%
[ | 2 | 13.3%
{ | 2 | 13.3%

Close Punctuation
Value | Count | Frequency (%)
) | 11 | 73.3%
] | 2 | 13.3%
} | 2 | 13.3%

Other Punctuation
Value | Count | Frequency (%)
* | 6 | 50.0%
/ | 6 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 410 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 66 | 100.0%

Math Symbol
Value | Count | Frequency (%)
× | 29 | 100.0%

Most occurring scripts

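Value | Count | Frequency (%)
Common | 1571 | 49.1%
Hangul | 994 | 31.1%
Latin | 636 | 19.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recovered) | 337 | 33.9%
(not recovered) | 337 | 33.9%
(not recovered) | 146 | 14.7%
(not recovered) | 110 | 11.1%
(not recovered) | 8 | 0.8%
(not recovered) | 8 | 0.8%
(not recovered) | 6 | 0.6%
(not recovered) | 6 | 0.6%
(not recovered) | 4 | 0.4%
(not recovered) | 3 | 0.3%
Other values (21) | 29 | 2.9%

Latin
Value | Count | Frequency (%)
C | 172 | 27.0%
S | 109 | 17.1%
P | 101 | 15.9%
R | 96 | 15.1%
G | 58 | 9.1%
N | 18 | 2.8%
K | 17 | 2.7%
D | 11 | 1.7%
A | 8 | 1.3%
L | 7 | 1.1%
Other values (17) | 39 | 6.1%

Common
Value | Count | Frequency (%)
- | 410 | 26.1%
2 | 155 | 9.9%
1 | 155 | 9.9%
4 | 116 | 7.4%
3 | 115 | 7.3%
5 | 109 | 6.9%
6 | 90 | 5.7%
0 | 88 | 5.6%
7 | 83 | 5.3%
(space) | 66 | 4.2%
Other values (11) | 184 | 11.7%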

Most occurring blocks

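Value | Count | Frequency (%)
ASCII | 2178 | 68.0%
Hangul | 994 | 31.1%
None | 29 | 0.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 410 | 18.8%
C | 172 | 7.9%
2 | 155 | 7.1%
1 | 155 | 7.1%
4 | 116 | 5.3%
3 | 115 | 5.3%
S | 109 | 5.0%
5 | 109 | 5.0%
P | 101 | 4.6%
R | 96 | 4.4%
Other values (37) | 640 | 29.4%

Hangul
Value | Count | Frequency (%)
(not recovered) | 337 | 33.9%
(not recovered) | 337 | 33.9%
(not recovered) | 146 | 14.7%
(not recovered) | 110 | 11.1%
(not recovered) | 8 | 0.8%
(not recovered) | 8 | 0.8%
(not recovered) | 6 | 0.6%
(not recovered) | 6 | 0.6%
(not recovered) | 4 | 0.4%
(not recovered) | 3 | 0.3%
Other values (21) | 29 | 2.9%

None
Value | Count | Frequency (%)
× | 29 | 100.0%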

등록년도 (registration year)
Real number (ℝ)

Distinct: 21
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.8861
Minimum: 1998
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1998
5th percentile: 2007
Q1: 2011
Median: 2015
Q3: 2019
95th percentile: 2023
Maximum: 2023
Range: 25
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.0171972
Coefficient of variation (CV): 0.0024900649
Kurtosis: -0.6910556
Mean: 2014.8861
Median Absolute Deviation (MAD): 4
Skewness: -0.21808128
Sum: 795880
Variance: 25.172268
Monotonicity: Increasing
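
Several of these figures follow directly from one another, which gives a quick consistency check on the report. The sketch below recomputes the derived statistics from the reported values only, using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1). The variable names are illustrative, and the tolerances reflect the rounding of the printed values.

```python
import math

# Values as printed in this report.
mean = 2014.8861
std = 5.0171972
minimum, maximum = 1998, 2023
q1, q3 = 2011, 2019
total, n = 795880, 395

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
mean_from_sum = total / n        # mean recomputed from sum and row count

print(cv, variance, value_range, iqr, mean_from_sum)
```

The very small CV is expected for calendar years: the spread of about five years is tiny relative to a mean near 2015, so the CV says little about this column.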
[Figure: Histogram with fixed size bins (bins=21)]

Common values
Value | Count | Frequency (%)
2017 | 32 | 8.1%
2016 | 32 | 8.1%
2022 | 27 | 6.8%
2019 | 27 | 6.8%
2011 | 26 | 6.6%
2015 | 25 | 6.3%
2007 | 24 | 6.1%
2012 | 23 | 5.8%
2023 | 22 | 5.6%
2013 | 22 | 5.6%
Other values (11) | 135 | 34.2%

Minimum 10 values
Value | Count | Frequency (%)
1998 | 1 | 0.3%
2003 | 3 | 0.8%
2005 | 5 | 1.3%
2006 | 3 | 0.8%
2007 | 24 | 6.1%
2008 | 10 | 2.5%
2009 | 22 | 5.6%
2010 | 18 | 4.6%
2011 | 26 | 6.6%
2012 | 23 | 5.8%

Maximum 10 values
Value | Count | Frequency (%)
2023 | 22 | 5.6%
2022 | 27 | 6.8%
2021 | 12 | 3.0%
2020 | 21 | 5.3%
2019 | 27 | 6.8%
2018 | 20 | 5.1%
2017 | 32 | 8.1%
2016 | 32 | 8.1%
2015 | 25 | 6.3%
2014 | 20 | 5.1%

주 요 특 성 (main characteristics)
Text

Distinct: 297
Distinct (%): 75.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 42
Median length: 24
Mean length: 15.860759
Min length: 8

Characters and Unicode

Total characters: 6265
Distinct characters: 249
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 241
Unique (%): 61.0%

Sample

1st row: 조직이 치밀하며 식감이 우수
2nd row: 적색, 스탠다드, 가시적음
3rd row: 아이보리색, 스탠다드, 수세강, 저온신정성
4th row: 적색, 스탠다드, 측지발생적음
5th row: 수확소요일이 2-3일 단축

Value | Count | Frequency (%)
분화용 | 84 | 5.6%
홑꽃 | 77 | 5.1%
스프레이국화 | 60 | 4.0%
절화용 | 57 | 3.8%
황색 | 44 | 2.9%
스프레이 | 44 | 2.9%
반겹꽃 | 43 | 2.8%
겹꽃 | 40 | 2.6%
스탠다드 | 39 | 2.6%
화심 | 33 | 2.2%
Other values (310) | 992 | 65.6%

Most occurring characters

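Value | Count | Frequency (%)
(space) | 1140 | 18.2%
(not recovered) | 407 | 6.5%
(not recovered) | 382 | 6.1%
, | 340 | 5.4%
(not recovered) | 173 | 2.8%
(not recovered) | 164 | 2.6%
(not recovered) | 162 | 2.6%
(not recovered) | 154 | 2.5%
(not recovered) | 113 | 1.8%
(not recovered) | 112 | 1.8%
Other values (239) | 3118 | 49.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4738 | 75.6%
Space Separator | 1140 | 18.2%
Other Punctuation | 349 | 5.6%
Decimal Number | 18 | 0.3%
Uppercase Letter | 8 | 0.1%
Close Punctuation | 5 | 0.1%
Open Punctuation | 5 | 0.1%
Math Symbol | 1 | < 0.1%
Dash Punctuation | 1 | < 0.1%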

Most frequent character per category

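Other Letter
Value | Count | Frequency (%)
(not recovered) | 407 | 8.6%
(not recovered) | 382 | 8.1%
(not recovered) | 173 | 3.7%
(not recovered) | 164 | 3.5%
(not recovered) | 162 | 3.4%
(not recovered) | 154 | 3.3%
(not recovered) | 113 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 111 | 2.3%
Other values (222) | 2848 | 60.1%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 33.3%
1 | 4 | 22.2%
3 | 3 | 16.7%
5 | 2 | 11.1%
2 | 2 | 11.1%
6 | 1 | 5.6%

Other Punctuation
Value | Count | Frequency (%)
, | 340 | 97.4%
% | 4 | 1.1%
/ | 4 | 1.1%
. | 1 | 0.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 4 | 50.0%
L | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1140 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Math Symbol
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%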

Most occurring scripts

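Value | Count | Frequency (%)
Hangul | 4738 | 75.6%
Common | 1519 | 24.2%
Latin | 8 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recovered) | 407 | 8.6%
(not recovered) | 382 | 8.1%
(not recovered) | 173 | 3.7%
(not recovered) | 164 | 3.5%
(not recovered) | 162 | 3.4%
(not recovered) | 154 | 3.3%
(not recovered) | 113 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 111 | 2.3%
Other values (222) | 2848 | 60.1%

Common
Value | Count | Frequency (%)
(space) | 1140 | 75.0%
, | 340 | 22.4%
0 | 6 | 0.4%
) | 5 | 0.3%
( | 5 | 0.3%
1 | 4 | 0.3%
% | 4 | 0.3%
/ | 4 | 0.3%
3 | 3 | 0.2%
5 | 2 | 0.1%
Other values (5) | 6 | 0.4%

Latin
Value | Count | Frequency (%)
A | 4 | 50.0%
L | 4 | 50.0%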

Most occurring blocks

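Value | Count | Frequency (%)
Hangul | 4738 | 75.6%
ASCII | 1526 | 24.4%
Arrows | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1140 | 74.7%
, | 340 | 22.3%
0 | 6 | 0.4%
) | 5 | 0.3%
( | 5 | 0.3%
A | 4 | 0.3%
1 | 4 | 0.3%
L | 4 | 0.3%
% | 4 | 0.3%
/ | 4 | 0.3%
Other values (6) | 10 | 0.7%

Hangul
Value | Count | Frequency (%)
(not recovered) | 407 | 8.6%
(not recovered) | 382 | 8.1%
(not recovered) | 173 | 3.7%
(not recovered) | 164 | 3.5%
(not recovered) | 162 | 3.4%
(not recovered) | 154 | 3.3%
(not recovered) | 113 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 112 | 2.4%
(not recovered) | 111 | 2.3%
Other values (222) | 2848 | 60.1%

Arrows
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%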

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

 | 작물명 | 등록년도
작물명 | 1.000 | 0.468
등록년도 | 0.468 | 1.000

[Figure: correlation heatmap]

 | 등록년도 | 작물명
등록년도 | 1.000 | 0.199
작물명 | 0.199 | 1.000

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성
0 | 버 섯 | 큰느타리1호 | 큰느타리1호 | 1998 | 조직이 치밀하며 식감이 우수
1 | 장미 | 템테이션 | 창원R-1호 | 2003 | 적색, 스탠다드, 가시적음
2 | 장미 | 니나 | 경남R-2호 | 2003 | 아이보리색, 스탠다드, 수세강, 저온신정성
3 | 장미 | 레드템 | 경남R-3호 | 2003 | 적색, 스탠다드, 측지발생적음
4 | 버 섯 | 새송이1호 | A8B8 | 2005 | 수확소요일이 2-3일 단축
5 | 장미 | 사브리나 | 경남R-4호 | 2005 | 연핑크, 스탠다드, 화색우수, 수세강
6 | 장미 | 고우니 | 경남R-5호 | 2005 | 핑크복색, 스탠다드, 연중재배가능
7 | 장미 | 오렌지뷰티 | 경남R-6호 | 2005 | 오렌지색, 스탠다드, 병충해강함
8 | 장미 | 태양 | 경남R-8호 | 2005 | 적색, 스탠다드, 수세강함
9 | 장미 | 하니 | 경남R-9호 | 2006 | 핑크복색,스탠다드, 재배용이

Last rows

Row | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성
385 | 국화 | 핑크빔 | 경남교CD-7 | 2023 | 분홍색 겹꽃 스탠다드
386 | 거베라 | 크림쿠키 | 경남교G-67호 | 2023 | 연황색, 녹심, 반겹꽃, 미니
387 | 거베라 | 오텀 | 경남교G-68호 | 2023 | 연주황색, 갈심, 반겹꽃, 대륜
388 | (not recovered) | 아람 | 경남2호 | 2023 | 브랜드쌀용, 고품질, 내병성
389 | 멜론 | 새로이 | 새로이 | 2023 | 수경재배에 적합, 고당도
390 | 파프리카 | New Daon red | New Daon red | 2023 | 다수확형, 품질 우수
391 | 파프리카 | New Daon yellow | New Daon yellow | 2023 | 다수확형, 품질 우수
392 | 파프리카 | New Daon orange | New Daon orange | 2023 | 다수확형, 품질 우수
393 | 단감 | 썬스위트 | 단연10-1-80 | 2023 | 중생종, 완전단감, 대과, 생리장애 적음
394 | 단감 | 달님 | 단연13-1-120 | 2023 | 중생종, 대과, 정형과, 고당도

Duplicate rows

Most frequently occurring

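Row | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성 | # duplicates
0 | 국화 | 가야선셋 | 경남CP-20호 | 2011 | 오렌지색 갈색화심 분화용 | 2
1 | 국화 | 가야와인 | 경남CP-21호 | 2011 | 적색 갈색심 조기개화성 분화용 | 2
2 | 국화 | 수미 | 경남CD-1호 | 2011 | 백색 겹꽃 추국 스탠다드 국화 | 2
3 | 장미 | 라비아 | 경남R-42 | 2016 | 오렌지색, 스프레이, 수량많음 | 2
4 | 장미 | 레드샤인 | 경남R-40 | 2016 | 진한핑크색, 스탠다드, 가시적음 | 2
5 | 장미 | 레드하모니 | 경남R-45 | 2017 | 연핑크색, 화색우수, | 2
6 | 장미 | 베리타르트 | 경남RS-51 | 2019 | 복색 스프레이, 화형 우수, 향기 있음 | 2
7 | 장미 | 브라보지엔 | 경남R-47 | 2018 | 아이보리, 스탠다드, 다수성 | 2
8 | 장미 | 셀리나 | 경남R-39 | 2015 | 적색, 스탠다드, 가시많음, 병충해강함 | 2
9 | 장미 | 실키 | 경남R-38 | 2015 | 핑크색, 스탠다드, 가시적음 | 2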