Overview

Dataset statistics

Number of variables: 5
Number of observations: 334
Missing cells: 7
Missing cells (%): 0.4%
Duplicate rows: 17
Duplicate rows (%): 5.1%
Total size in memory: 13.5 KiB
Average record size in memory: 41.4 B
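
The percentages and the average record size follow from the raw counts in this table. The sketch below is a hand check using only those figures; it assumes the missing-cell percentage is taken over all cells (variables × observations), which matches the reported value, and it is not ydata-profiling's own code.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 334
missing_cells = 7
duplicate_rows = 17
total_size_bytes = 13.5 * 1024  # 13.5 KiB expressed in bytes

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the 0.4%, 5.1% and 41.4 B shown in the table.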

Variable types

Categorical: 1
Text: 3
Numeric: 1

Dataset

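Description: The Agricultural Research and Extension Services (농업기술원), an agency directly under Gyeongsangnam-do, carries out variety-breeding projects for food crops, horticultural crops, flowers, fruit trees, and other crops.
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15091169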

Alerts

Dataset has 17 (5.1%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-10 23:09:16.896159
Analysis finished: 2023-12-10 23:09:17.512336
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
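
The reported duration is the difference between the two timestamps. A minimal check with the standard library, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-10 23:09:16.896159")
finished = datetime.fromisoformat("2023-12-10 23:09:17.512336")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```

The difference is 0.616177 seconds, which the report rounds to 0.62.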

Variables

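작물명 (crop name)
Categorical

Distinct: 14
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Value | Count
국화 (chrysanthemum) | 134
장미 (rose) | 69
거베라 (gerbera) | 53
호접란 (phalaenopsis) | 25
버 섯 (mushroom) | 15
Other values (9) | 38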

Length

Max length: 4
Median length: 2
Mean length: 2.3982036
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 버 섯
2nd row: 장미
3rd row: 장미
4th row: 장미
5th row: 버 섯

Common Values

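Value | Count | Frequency (%)
국화 (chrysanthemum) | 134 | 40.1%
장미 (rose) | 69 | 20.7%
거베라 (gerbera) | 53 | 15.9%
호접란 (phalaenopsis) | 25 | 7.5%
버 섯 (mushroom) | 15 | 4.5%
나리 (lily) | 7 | 2.1%
양 파 (onion) | 6 | 1.8%
파프리카 (paprika) | 6 | 1.8%
단 감 (sweet persimmon) | 5 | 1.5%
카네이션 (carnation) | 5 | 1.5%
Other values (4) | 9 | 2.7%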
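
The frequency column is each count divided by the 334 observations. The sketch below recomputes it from the counts in this table. The variable names are illustrative only, and the code is a hand check rather than the profiler's implementation.

```python
# Counts copied from the "Common Values" table; 334 is the number of observations.
counts = {
    "국화": 134, "장미": 69, "거베라": 53, "호접란": 25, "버 섯": 15,
    "나리": 7, "양 파": 6, "파프리카": 6, "단 감": 5, "카네이션": 5,
    "Other values (4)": 9,
}
n_observations = 334

# Frequency (%) is the share of all rows, formatted to one decimal place.
frequencies = {
    value: f"{100 * count / n_observations:.1f}%"
    for value, count in counts.items()
}

for value, count in counts.items():
    print(value, count, frequencies[value])
```

The counts sum to 334, so the table accounts for every row of the column.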

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
국화 | 134 | 36.9%
장미 | 69 | 19.0%
거베라 | 53 | 14.6%
호접란 | 25 | 6.9%
(not shown) | 15 | 4.1%
(not shown) | 15 | 4.1%
나리 | 7 | 1.9%
(not shown) | 6 | 1.7%
파프리카 | 6 | 1.7%
(not shown) | 6 | 1.7%
Other values (8) | 27 | 7.4%

품 종 명 (variety name)
Text

Distinct: 309
Distinct (%): 92.8%
Missing: 1
Missing (%): 0.3%
Memory size: 2.7 KiB

Length

Max length: 7
Median length: 6
Mean length: 3.9039039
Min length: 2

Characters and Unicode

Total characters: 1300
Distinct characters: 216
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
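
As a rough illustration of how such properties can be read off, the sketch below applies Python's standard `unicodedata` module to the five sample values listed for this variable. It is an assumption-laden hand check, not ydata-profiling's implementation. The general-category codes `Lo` and `Nd` correspond to the report's "Other Letter" and "Decimal Number" labels.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
sample = ["큰느타리1호", "템테이션", "니나", "레드템", "새송이1호"]

# Count the Unicode general category of every character in the sample.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

On these five values the Hangul syllables fall under `Lo` and the two digits under `Nd`, mirroring the split the report shows for the full column.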

Unique

Unique: 286
Unique (%): 85.9%

Sample

1st row: 큰느타리1호
2nd row: 템테이션
3rd row: 니나
4th row: 레드템
5th row: 새송이1호

Value | Count | Frequency (%)
핑크아이 | 3 | 0.9%
레드샤인 | 2 | 0.6%
가야와인 | 2 | 0.6%
셀리나 | 2 | 0.6%
실키 | 2 | 0.6%
아모르파티 | 2 | 0.6%
피치타르트 | 2 | 0.6%
에그타르트 | 2 | 0.6%
핑크펄 | 2 | 0.6%
레드하모니 | 2 | 0.6%
Other values (301) | 314 | 93.7%

Most occurring characters

Top ten characters by count (characters not shown): 60 (4.6%), 42 (3.2%), 40 (3.1%), 38 (2.9%), 37 (2.8%), 36 (2.8%), 36 (2.8%), 30 (2.3%), 29 (2.2%), 28 (2.2%)
Other values (206): 924 (71.1%)

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1285 | 98.8%
Decimal Number | 13 | 1.0%
Space Separator | 2 | 0.2%

Most frequent character per category

Other Letter
Top ten characters by count (characters not shown): 60 (4.7%), 42 (3.3%), 40 (3.1%), 38 (3.0%), 37 (2.9%), 36 (2.8%), 36 (2.8%), 30 (2.3%), 29 (2.3%), 28 (2.2%)
Other values (200): 909 (70.7%)

Decimal Number
Value | Count | Frequency (%)
3 | 5 | 38.5%
1 | 3 | 23.1%
2 | 2 | 15.4%
5 | 2 | 15.4%
7 | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1285 | 98.8%
Common | 15 | 1.2%

Most frequent character per script

Hangul
Top ten characters by count (characters not shown): 60 (4.7%), 42 (3.3%), 40 (3.1%), 38 (3.0%), 37 (2.9%), 36 (2.8%), 36 (2.8%), 30 (2.3%), 29 (2.3%), 28 (2.2%)
Other values (200): 909 (70.7%)

Common
Value | Count | Frequency (%)
3 | 5 | 33.3%
1 | 3 | 20.0%
(space) | 2 | 13.3%
2 | 2 | 13.3%
5 | 2 | 13.3%
7 | 1 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1285 | 98.8%
ASCII | 15 | 1.2%

Most frequent character per block

Hangul
Top ten characters by count (characters not shown): 60 (4.7%), 42 (3.3%), 40 (3.1%), 38 (3.0%), 37 (2.9%), 36 (2.8%), 36 (2.8%), 30 (2.3%), 29 (2.3%), 28 (2.2%)
Other values (200): 909 (70.7%)

ASCII
Value | Count | Frequency (%)
3 | 5 | 33.3%
1 | 3 | 20.0%
(space) | 2 | 13.3%
2 | 2 | 13.3%
5 | 2 | 13.3%
7 | 1 | 6.7%

계 통 명 (breeding line name)
Text

Distinct: 312
Distinct (%): 94.0%
Missing: 2
Missing (%): 0.6%
Memory size: 2.7 KiB

Length

Max length: 37
Median length: 35
Mean length: 7.6385542
Min length: 1

Characters and Unicode

Total characters: 2536
Distinct characters: 57
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 292
Unique (%): 88.0%

Sample

1st row: -
2nd row: 창원R-1호
3rd row: 경남R-2호
4th row: 경남R-3호
5th row: A8B8

Value | Count | Frequency (%)
경남 | 50 | 12.9%
경남cp-21호 | 2 | 0.5%
경남r-53 | 2 | 0.5%
경남r-44 | 2 | 0.5%
경남r-47 | 2 | 0.5%
경남r-39 | 2 | 0.5%
경남r-38 | 2 | 0.5%
경남cd-1호 | 2 | 0.5%
경남rs-48 | 2 | 0.5%
경남rs-49 | 2 | 0.5%
Other values (308) | 319 | 82.4%

Most occurring characters

Value | Count | Frequency (%)
- | 331 | 13.1%
(not shown) | 292 | 11.5%
(not shown) | 292 | 11.5%
C | 141 | 5.6%
2 | 133 | 5.2%
1 | 118 | 4.7%
(not shown) | 113 | 4.5%
3 | 107 | 4.2%
4 | 103 | 4.1%
(not shown) | 98 | 3.9%
Other values (47) | 808 | 31.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 830 | 32.7%
Decimal Number | 794 | 31.3%
Uppercase Letter | 484 | 19.1%
Dash Punctuation | 331 | 13.1%
Space Separator | 55 | 2.2%
Math Symbol | 21 | 0.8%
Other Punctuation | 8 | 0.3%
Open Punctuation | 6 | 0.2%
Close Punctuation | 6 | 0.2%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Top ten characters by count (characters not shown): 292 (35.2%), 292 (35.2%), 113 (13.6%), 98 (11.8%), 4 (0.5%), 4 (0.5%), 3 (0.4%), 3 (0.4%), 2 (0.2%), 2 (0.2%)
Other values (14): 17 (2.0%)

Uppercase Letter
Value | Count | Frequency (%)
C | 141 | 29.1%
S | 88 | 18.2%
P | 86 | 17.8%
R | 76 | 15.7%
G | 54 | 11.2%
A | 8 | 1.7%
L | 7 | 1.4%
K | 7 | 1.4%
D | 6 | 1.2%
N | 5 | 1.0%
Other values (5) | 6 | 1.2%

Decimal Number
Value | Count | Frequency (%)
2 | 133 | 16.8%
1 | 118 | 14.9%
3 | 107 | 13.5%
4 | 103 | 13.0%
5 | 96 | 12.1%
0 | 57 | 7.2%
6 | 56 | 7.1%
7 | 46 | 5.8%
8 | 41 | 5.2%
9 | 37 | 4.7%

Other Punctuation
Value | Count | Frequency (%)
/ | 6 | 75.0%
* | 2 | 25.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 331 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 55 | 100.0%

Math Symbol
Value | Count | Frequency (%)
× | 21 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
x | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1221 | 48.1%
Hangul | 830 | 32.7%
Latin | 485 | 19.1%

Most frequent character per script

Hangul
Top ten characters by count (characters not shown): 292 (35.2%), 292 (35.2%), 113 (13.6%), 98 (11.8%), 4 (0.5%), 4 (0.5%), 3 (0.4%), 3 (0.4%), 2 (0.2%), 2 (0.2%)
Other values (14): 17 (2.0%)

Common
Value | Count | Frequency (%)
- | 331 | 27.1%
2 | 133 | 10.9%
1 | 118 | 9.7%
3 | 107 | 8.8%
4 | 103 | 8.4%
5 | 96 | 7.9%
0 | 57 | 4.7%
6 | 56 | 4.6%
(space) | 55 | 4.5%
7 | 46 | 3.8%
Other values (7) | 119 | 9.7%

Latin
Value | Count | Frequency (%)
C | 141 | 29.1%
S | 88 | 18.1%
P | 86 | 17.7%
R | 76 | 15.7%
G | 54 | 11.1%
A | 8 | 1.6%
L | 7 | 1.4%
K | 7 | 1.4%
D | 6 | 1.2%
N | 5 | 1.0%
Other values (6) | 7 | 1.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1685 | 66.4%
Hangul | 830 | 32.7%
None | 21 | 0.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 331 | 19.6%
C | 141 | 8.4%
2 | 133 | 7.9%
1 | 118 | 7.0%
3 | 107 | 6.4%
4 | 103 | 6.1%
5 | 96 | 5.7%
S | 88 | 5.2%
P | 86 | 5.1%
R | 76 | 4.5%
Other values (22) | 406 | 24.1%

Hangul
Top ten characters by count (characters not shown): 292 (35.2%), 292 (35.2%), 113 (13.6%), 98 (11.8%), 4 (0.5%), 4 (0.5%), 3 (0.4%), 3 (0.4%), 2 (0.2%), 2 (0.2%)
Other values (14): 17 (2.0%)

None
Value | Count | Frequency (%)
× | 21 | 100.0%

등록년도 (registration year)
Real number (ℝ)

Distinct: 18
Distinct (%): 5.4%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.5181
Minimum: 1998
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 1998
5-th percentile: 2007
Q1: 2010
Median: 2014
Q3: 2017
95-th percentile: 2020
Maximum: 2020
Range: 22
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.2507833
Coefficient of variation (CV): 0.0021111225
Kurtosis: -0.49900572
Mean: 2013.5181
Median Absolute Deviation (MAD): 3
Skewness: -0.39112156
Sum: 668488
Variance: 18.069159
Monotonicity: Increasing
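
Several of these figures are derived from the others, which gives a quick consistency check. The sketch below uses only the values printed in the quantile and descriptive statistics; the tolerance is an assumption chosen to absorb the rounding of the printed numbers.

```python
import math

# Values copied from the quantile and descriptive statistics of 등록년도.
mean = 2013.5181
std = 4.2507833
total = 668488
q1, q3 = 2010, 2017
minimum, maximum = 1998, 2020

cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance is the squared standard deviation
iqr = q3 - q1                         # interquartile range
value_range = maximum - minimum       # range
n_non_missing = round(total / mean)   # sum / mean recovers the non-missing count

print(cv, variance, iqr, value_range, n_non_missing)
```

The recovered count of 332 equals the 334 observations minus the 2 missing values reported for this variable.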
[Figure: histogram with fixed size bins (bins=18)]

Most frequent values
Value | Count | Frequency (%)
2016 | 32 | 9.6%
2017 | 32 | 9.6%
2019 | 27 | 8.1%
2011 | 26 | 7.8%
2015 | 25 | 7.5%
2007 | 24 | 7.2%
2012 | 23 | 6.9%
2009 | 22 | 6.6%
2013 | 22 | 6.6%
2014 | 20 | 6.0%
Other values (8) | 79 | 23.7%

Ten smallest values
Value | Count | Frequency (%)
1998 | 1 | 0.3%
2003 | 3 | 0.9%
2005 | 5 | 1.5%
2006 | 3 | 0.9%
2007 | 24 | 7.2%
2008 | 10 | 3.0%
2009 | 22 | 6.6%
2010 | 18 | 5.4%
2011 | 26 | 7.8%
2012 | 23 | 6.9%

Ten largest values
Value | Count | Frequency (%)
2020 | 19 | 5.7%
2019 | 27 | 8.1%
2018 | 20 | 6.0%
2017 | 32 | 9.6%
2016 | 32 | 9.6%
2015 | 25 | 7.5%
2014 | 20 | 6.0%
2013 | 22 | 6.6%
2012 | 23 | 6.9%
2011 | 26 | 7.8%

주 요 특 성 (main characteristics)
Text

Distinct: 243
Distinct (%): 73.2%
Missing: 2
Missing (%): 0.6%
Memory size: 2.7 KiB

Length

Max length: 42
Median length: 24
Mean length: 16.009036
Min length: 9

Characters and Unicode

Total characters: 5315
Distinct characters: 228
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 194
Unique (%): 58.4%

Sample

1st row: 조직이 치밀하며 식감이 우수
2nd row: 적색, 스탠다드, 가시적음
3rd row: 아이보리색, 스탠다드, 수세강, 저온신정성
4th row: 적색, 스탠다드, 측지발생적음
5th row: 수확소요일이 2-3일 단축

Value | Count | Frequency (%)
홑꽃 | 76 | 6.0%
분화용 | 69 | 5.4%
스프레이국화 | 60 | 4.7%
절화용 | 57 | 4.5%
반겹꽃 | 40 | 3.1%
황색 | 36 | 2.8%
스탠다드 | 36 | 2.8%
화심 | 33 | 2.6%
겹꽃 | 29 | 2.3%
스프레이 | 27 | 2.1%
Other values (265) | 811 | 63.7%

Most occurring characters

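Value | Count | Frequency (%)
(space) | 961 | 18.1%
(not shown) | 352 | 6.6%
(not shown) | 346 | 6.5%
, | 283 | 5.3%
(not shown) | 158 | 3.0%
(not shown) | 142 | 2.7%
(not shown) | 142 | 2.7%
(not shown) | 133 | 2.5%
(not shown) | 95 | 1.8%
(not shown) | 88 | 1.7%
Other values (218) | 2615 | 49.2%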

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4036 | 75.9%
Space Separator | 961 | 18.1%
Other Punctuation | 291 | 5.5%
Decimal Number | 12 | 0.2%
Uppercase Letter | 8 | 0.2%
Open Punctuation | 3 | 0.1%
Close Punctuation | 3 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Top ten characters by count (characters not shown): 352 (8.7%), 346 (8.6%), 158 (3.9%), 142 (3.5%), 142 (3.5%), 133 (3.3%), 95 (2.4%), 88 (2.2%), 88 (2.2%), 87 (2.2%)
Other values (204): 2405 (59.6%)

Other Punctuation
Value | Count | Frequency (%)
, | 283 | 97.3%
% | 4 | 1.4%
/ | 3 | 1.0%
. | 1 | 0.3%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 50.0%
1 | 2 | 16.7%
3 | 2 | 16.7%
2 | 2 | 16.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 4 | 50.0%
L | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 961 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4036 | 75.9%
Common | 1271 | 23.9%
Latin | 8 | 0.2%

Most frequent character per script

Hangul
Top ten characters by count (characters not shown): 352 (8.7%), 346 (8.6%), 158 (3.9%), 142 (3.5%), 142 (3.5%), 133 (3.3%), 95 (2.4%), 88 (2.2%), 88 (2.2%), 87 (2.2%)
Other values (204): 2405 (59.6%)

Common
Value | Count | Frequency (%)
(space) | 961 | 75.6%
, | 283 | 22.3%
0 | 6 | 0.5%
% | 4 | 0.3%
( | 3 | 0.2%
/ | 3 | 0.2%
) | 3 | 0.2%
1 | 2 | 0.2%
3 | 2 | 0.2%
2 | 2 | 0.2%
Other values (2) | 2 | 0.2%

Latin
Value | Count | Frequency (%)
A | 4 | 50.0%
L | 4 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4036 | 75.9%
ASCII | 1279 | 24.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 961 | 75.1%
, | 283 | 22.1%
0 | 6 | 0.5%
A | 4 | 0.3%
L | 4 | 0.3%
% | 4 | 0.3%
( | 3 | 0.2%
/ | 3 | 0.2%
) | 3 | 0.2%
1 | 2 | 0.2%
Other values (4) | 6 | 0.5%

Hangul
Top ten characters by count (characters not shown): 352 (8.7%), 346 (8.6%), 158 (3.9%), 142 (3.5%), 142 (3.5%), 133 (3.3%), 95 (2.4%), 88 (2.2%), 88 (2.2%), 87 (2.2%)
Other values (204): 2405 (59.6%)

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
 | 작물명 | 등록년도
작물명 | 1.000 | 0.405
등록년도 | 0.405 | 1.000

[Figure: correlation heatmap]
 | 등록년도 | 작물명
등록년도 | 1.000 | 0.191
작물명 | 0.191 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성
0 | 버 섯 | 큰느타리1호 | - | 1998 | 조직이 치밀하며 식감이 우수
1 | 장미 | 템테이션 | 창원R-1호 | 2003 | 적색, 스탠다드, 가시적음
2 | 장미 | 니나 | 경남R-2호 | 2003 | 아이보리색, 스탠다드, 수세강, 저온신정성
3 | 장미 | 레드템 | 경남R-3호 | 2003 | 적색, 스탠다드, 측지발생적음
4 | 버 섯 | 새송이1호 | A8B8 | 2005 | 수확소요일이 2-3일 단축
5 | 장미 | 사브리나 | 경남R-4호 | 2005 | 연핑크, 스탠다드, 화색우수, 수세강
6 | 장미 | 고우니 | 경남R-5호 | 2005 | 핑크복색, 스탠다드, 연중재배가능
7 | 장미 | 오렌지뷰티 | 경남R-6호 | 2005 | 오렌지색, 스탠다드, 병충해강함
8 | 장미 | 태양 | 경남R-8호 | 2005 | 적색, 스탠다드, 수세강함
9 | 장미 | 하니 | 경남R-9호 | 2006 | 핑크복색,스탠다드, 재배용이

Last rows
Index | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성
324 | 국화 | 에르메스핑크 | 경남교CS-62 | 2020 | 분홍색 겹꽃 디스버드국화
325 | 국화 | 비너스핑크 | 경남교CS-63 | 2020 | 분홍색 홑꽃 녹심 스프레이국화
326 | 국화 | 큐피트그린 | 경남교CS-64 | 2020 | 녹색 폼폰 다화성 스프레이국화
327 | 국화 | 아레스퍼플 | 경남교CS-65 | 2020 | 자주색 폼폰 다화성 스프레이국화
328 | 장미 | 유포리아 | 경남교RS-53 | 2020 | 연한 그린색, 스프레이, 카네이션 화형, 생력형
329 | 장미 | 레리티 | 경남교RS-54 | 2020 | 연한 핑크색, 스프레이, 생육 균일, 생산성 우수
330 | 장미 | 미스틱 | 경남교RS-55 | 2020 | 연한 보라색, 스프레이, 웨딩용, 절화수명 김
331 | 감국 | 옥향 | 감국5호 | 2020 | 다수성, 병해충 저항성, 약용,식용
332 | <NA> | <NA> | <NA> | <NA> | <NA>
333 | <NA> | 계 332품종 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

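Index | 작물명 | 품 종 명 | 계 통 명 | 등록년도 | 주 요 특 성 | # duplicates
0 | 국화 | 가야선셋 | 경남CP-20호 | 2011 | 오렌지색 갈색화심 분화용 | 2
1 | 국화 | 가야와인 | 경남CP-21호 | 2011 | 적색 갈색심 조기개화성 분화용 | 2
2 | 국화 | 수미 | 경남CD-1호 | 2011 | 백색 겹꽃 추국 스탠다드 국화 | 2
3 | 장미 | 라비아 | 경남R-42 | 2016 | 오렌지색, 스프레이, 수량많음 | 2
4 | 장미 | 레드샤인 | 경남R-40 | 2016 | 진한핑크색, 스탠다드, 가시적음 | 2
5 | 장미 | 레드하모니 | 경남R-45 | 2017 | 연핑크색, 화색우수, | 2
6 | 장미 | 베리타르트 | 경남RS-51 | 2019 | 복색 스프레이, 화형 우수, 향기 있음 | 2
7 | 장미 | 브라보지엔 | 경남R-47 | 2018 | 아이보리, 스탠다드, 다수성 | 2
8 | 장미 | 셀리나 | 경남R-39 | 2015 | 적색, 스탠다드, 가시많음, 병충해강함 | 2
9 | 장미 | 실키 | 경남R-38 | 2015 | 핑크색, 스탠다드, 가시적음 | 2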
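
A "# duplicates" column of this kind can be reproduced by counting identical rows. The sketch below does so with `collections.Counter` on a hypothetical five-row excerpt assembled from rows shown in this report. It illustrates the idea only and makes no claim about how ydata-profiling computes its duplicate totals.

```python
from collections import Counter

# Hypothetical excerpt: two rows from the duplicate table (each entered twice)
# plus one row from the sample table that occurs once.
rows = [
    ("국화", "가야선셋", "경남CP-20호", 2011, "오렌지색 갈색화심 분화용"),
    ("장미", "실키", "경남R-38", 2015, "핑크색, 스탠다드, 가시적음"),
    ("국화", "가야선셋", "경남CP-20호", 2011, "오렌지색 갈색화심 분화용"),
    ("장미", "니나", "경남R-2호", 2003, "아이보리색, 스탠다드, 수세강, 저온신정성"),
    ("장미", "실키", "경남R-38", 2015, "핑크색, 스탠다드, 가시적음"),
]

# Rows are tuples, so they are hashable and can be counted directly.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicated.items():
    print(row[1], n)
```

Rows are compared on all five columns, so two entries count as duplicates only when every field matches exactly.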