Overview

Dataset statistics

Number of variables: 15
Number of observations: 65
Missing cells: 167
Missing cells (%): 17.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.9 KiB
Average record size in memory: 125.0 B
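
The summary figures can be cross-checked by hand. The sketch below assumes the per-column missing counts listed in the Alerts section (46, 10, 52 and 59, with every other column reporting 0) and recomputes the missing-cell total, its percentage, and the total size implied by the average record size. The variable names are illustrative only.

```python
# Per-column missing counts as listed in the report's Alerts section;
# the remaining 11 columns report 0 missing values.
missing_by_column = {
    "표현형태-2": 46,
    "표현형태-3": 10,
    "표현형태-4": 52,
    "표현형태-6": 59,
}
n_variables = 15
n_observations = 65

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

# Total size implied by the reported 125.0 B average record size.
total_kib = 125.0 * n_observations / 1024

print(missing_cells, f"{missing_pct:.1f}%", f"{total_kib:.1f} KiB")
```

The three values agree with the reported 167 missing cells, 17.1% and 7.9 KiB.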

Variable types

Numeric: 1
Text: 5
Categorical: 9

Dataset

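Description: Variety-characteristic data for the rice variety 골드아미2호, bred at the Advanced Radiation Technology Institute (첨단방사선연구소) of the Korea Atomic Energy Research Institute (한국원자력연구원). The data columns are 번호 (number), 형질 (trait), 표현형태-1 through 표현형태-9 (expression states 1 to 9), 출원품종 표현형태 (applicant variety expression state), 출원품종 실측치 (applicant variety measured value), 대조품종 표현형태 (control variety expression state), and 대조품종 실측치 (control variety measured value).
Author: 한국원자력연구원 (Korea Atomic Energy Research Institute)
URL: https://www.data.go.kr/data/15046071/fileData.do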

Alerts

표현형태-5 is highly overall correlated with 표현형태-7 and 6 other fields (High correlation)
표현형태-8 is highly overall correlated with 번호 and 4 other fields (High correlation)
표현형태-7 is highly overall correlated with 표현형태-1 and 7 other fields (High correlation)
대조품종 실측치 is highly overall correlated with 표현형태-1 and 6 other fields (High correlation)
대조품종 표현형태 is highly overall correlated with 표현형태-1 and 6 other fields (High correlation)
표현형태-9 is highly overall correlated with 표현형태-1 and 7 other fields (High correlation)
표현형태-1 is highly overall correlated with 표현형태-7 and 6 other fields (High correlation)
출원품종 표현형태 is highly overall correlated with 표현형태-1 and 6 other fields (High correlation)
출원품종 실측치 is highly overall correlated with 표현형태-1 and 6 other fields (High correlation)
번호 is highly overall correlated with 표현형태-8 (High correlation)
표현형태-8 is highly imbalanced (73.0%) (Imbalance)
표현형태-2 has 46 (70.8%) missing values (Missing)
표현형태-3 has 10 (15.4%) missing values (Missing)
표현형태-4 has 52 (80.0%) missing values (Missing)
표현형태-6 has 59 (90.8%) missing values (Missing)
번호 has unique values (Unique)
형질 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 21:54:52.463954
Analysis finished: 2023-12-12 21:54:54.135391
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
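
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the exact invocation used here: the CSV path, the `cp949` encoding, the report title, the description string, and the helper name `build_report` are all assumptions, and the original run's settings live in config.json.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to output_path."""
    # Imported inside the function so that defining it does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is a guess; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="골드아미2호 variety characteristics",  # hypothetical title
        dataset={
            "description": "Variety-characteristic data for 골드아미2호.",
            "author": "한국원자력연구원",
            "url": "https://www.data.go.kr/data/15046071/fileData.do",
        },
    )
    report.to_file(output_path)
    return report
```

Calling `build_report("rice_traits.csv")` (a hypothetical file name) would write `report.html` next to the script.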

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 65
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33
Minimum: 1
Maximum: 65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 717.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.2
Q1: 17
Median: 33
Q3: 49
95-th percentile: 61.8
Maximum: 65
Range: 64
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 18.90767
Coefficient of variation (CV): 0.57295971
Kurtosis: -1.2
Mean: 33
Median Absolute Deviation (MAD): 16
Skewness: 0
Sum: 2145
Variance: 357.5
Monotonicity: Strictly increasing
[Figure: histogram with fixed-size bins (bins=50)]
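
The quantile and descriptive statistics for 번호 can be reproduced with the standard library. The sketch assumes the column holds the integers 1 to 65 (consistent with the reported minimum, maximum, sum, and strict monotonicity) and that percentiles use linear interpolation between neighbouring ranks; `percentile` is a helper written for this example.

```python
import math
import statistics

# Assumed column contents: 1, 2, ..., 65.
values = list(range(1, 66))

def percentile(sorted_values, q):
    """Percentile q (0-100) with linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * q / 100
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(mean, variance, round(std_dev, 5), round(cv, 8), mad)
print(percentile(values, 5), percentile(values, 25), percentile(values, 95))
```

Under these assumptions the results match the table: variance 357.5, standard deviation 18.90767, MAD 16, and a 5th percentile of about 4.2.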
Common values
Value | Count | Frequency (%)
1 | 1 | 1.5%
50 | 1 | 1.5%
36 | 1 | 1.5%
37 | 1 | 1.5%
38 | 1 | 1.5%
39 | 1 | 1.5%
40 | 1 | 1.5%
41 | 1 | 1.5%
42 | 1 | 1.5%
43 | 1 | 1.5%
Other values (55) | 55 | 84.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.5%
2 | 1 | 1.5%
3 | 1 | 1.5%
4 | 1 | 1.5%
5 | 1 | 1.5%
6 | 1 | 1.5%
7 | 1 | 1.5%
8 | 1 | 1.5%
9 | 1 | 1.5%
10 | 1 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
65 | 1 | 1.5%
64 | 1 | 1.5%
63 | 1 | 1.5%
62 | 1 | 1.5%
61 | 1 | 1.5%
60 | 1 | 1.5%
59 | 1 | 1.5%
58 | 1 | 1.5%
57 | 1 | 1.5%
56 | 1 | 1.5%

형질
Text

UNIQUE 

Distinct: 65
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 35
Median length: 17
Mean length: 12.353846
Min length: 3

Characters and Unicode

Total characters: 803
Distinct characters: 121
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
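
As an illustration of these character properties, the sketch below counts Unicode general categories for one 형질 value (the third sample row) using Python's standard `unicodedata` module. It shows the kind of tally behind the "Most occurring categories" tables, not the report's own implementation.

```python
import unicodedata
from collections import Counter

# Sample value taken from the 형질 column (3rd sample row).
text = "잎 : 녹색정도 (잎색농도)"

# Two-letter general category per character, e.g. "Lo" = Other Letter,
# "Zs" = Space Separator, "Po" = Other Punctuation,
# "Ps"/"Pe" = Open/Close Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the column.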

Unique

Unique: 65
Unique (%): 100.0%

Sample

1st row: 초엽 : 안토시아닌 색소
2nd row: 제1엽 : 엽초색
3rd row: 잎 : 녹색정도 (잎색농도)
4th row: 잎 : 안토시아닌 색소
5th row: 잎 : 안토시아닌 색소분포
ValueCountFrequency (%)
58
22.8%
안토시아닌 16
 
6.3%
이삭 13
 
5.1%
색소 11
 
4.3%
외영 11
 
4.3%
8
 
3.1%
줄기 7
 
2.8%
길이 7
 
2.8%
7
 
2.8%
출수기 5
 
2.0%
Other values (71) 111
43.7%

Most occurring characters

ValueCountFrequency (%)
194
24.2%
: 58
 
7.2%
28
 
3.5%
22
 
2.7%
22
 
2.7%
18
 
2.2%
) 17
 
2.1%
17
 
2.1%
( 17
 
2.1%
16
 
2.0%
Other values (111) 394
49.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 509
63.4%
Space Separator 194
 
24.2%
Other Punctuation 60
 
7.5%
Close Punctuation 17
 
2.1%
Open Punctuation 17
 
2.1%
Decimal Number 5
 
0.6%
Dash Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
28
 
5.5%
22
 
4.3%
22
 
4.3%
18
 
3.5%
17
 
3.3%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (100) 322
63.3%
Decimal Number
ValueCountFrequency (%)
2 2
40.0%
1 1
20.0%
0 1
20.0%
5 1
20.0%
Other Punctuation
ValueCountFrequency (%)
: 58
96.7%
, 1
 
1.7%
% 1
 
1.7%
Space Separator
ValueCountFrequency (%)
194
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%
Open Punctuation
ValueCountFrequency (%)
( 17
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 509
63.4%
Common 294
36.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
28
 
5.5%
22
 
4.3%
22
 
4.3%
18
 
3.5%
17
 
3.3%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (100) 322
63.3%
Common
ValueCountFrequency (%)
194
66.0%
: 58
 
19.7%
) 17
 
5.8%
( 17
 
5.8%
2 2
 
0.7%
1 1
 
0.3%
, 1
 
0.3%
- 1
 
0.3%
% 1
 
0.3%
0 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 509
63.4%
ASCII 294
36.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
194
66.0%
: 58
 
19.7%
) 17
 
5.8%
( 17
 
5.8%
2 2
 
0.7%
1 1
 
0.3%
, 1
 
0.3%
- 1
 
0.3%
% 1
 
0.3%
0 1
 
0.3%
Hangul
ValueCountFrequency (%)
28
 
5.5%
22
 
4.3%
22
 
4.3%
18
 
3.5%
17
 
3.3%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (100) 322
63.3%

표현형태-1
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 36.9%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 8
Median length: 7
Mean length: 3.7230769
Min length: 1

Unique

Unique: 16
Unique (%): 24.6%

Sample

1st row: 없거나매우약하다
2nd row: 녹색
3rd row: <NA>
4th row: 없다
5th row: (blank)

Common Values

Value | Count | Frequency (%)
<NA> | 15 | 23.1%
없다 | 11 | 16.9%
없거나매우연하다 | 6 | 9.2%
직립 | 5 | 7.7%
없거나매우약하다 | 4 | 6.2%
황백색 | 3 | 4.6%
백색 | 3 | 4.6%
매우빠르다 | 2 | 3.1%
선단 | 1 | 1.5%
없다 | 1 | 1.5%
Other values (14) | 14 | 21.5%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
na 15
22.7%
없다 12
18.2%
없거나매우연하다 6
 
9.1%
직립 5
 
7.6%
없거나매우약하다 4
 
6.1%
황백색 3
 
4.5%
백색 3
 
4.5%
매우빠르다 2
 
3.0%
추출불량 1
 
1.5%
평범하다 1
 
1.5%
Other values (14) 14
21.2%

표현형태-2
Text

MISSING 

Distinct: 16
Distinct (%): 84.2%
Missing: 46
Missing (%): 70.8%
Memory size: 652.0 B

Length

Max length: 8
Median length: 6
Mean length: 3.4736842
Min length: 2

Characters and Unicode

Total characters: 66
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 78.9%

Sample

1st row: 옅다
2nd row: 자주색
3rd row: 가장자리
4th row: 뾰족하다
5th row: 녹색
ValueCountFrequency (%)
황갈색 4
18.2%
2
 
9.1%
옅다 1
 
4.5%
연노랑색 1
 
4.5%
10 1
 
4.5%
5 1
 
4.5%
중간 1
 
4.5%
담갈색 1
 
4.5%
단원형 1
 
4.5%
황갈색골 1
 
4.5%
Other values (8) 8
36.4%

Most occurring characters

ValueCountFrequency (%)
10
 
15.2%
6
 
9.1%
5
 
7.6%
3
 
4.5%
3
 
4.5%
2
 
3.0%
2
 
3.0%
2
 
3.0%
2
 
3.0%
2
 
3.0%
Other values (29) 29
43.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 57
86.4%
Decimal Number 4
 
6.1%
Space Separator 3
 
4.5%
Other Punctuation 1
 
1.5%
Math Symbol 1
 
1.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10
17.5%
6
 
10.5%
5
 
8.8%
3
 
5.3%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
1
 
1.8%
Other values (22) 22
38.6%
Decimal Number
ValueCountFrequency (%)
0 1
25.0%
1 1
25.0%
5 1
25.0%
2 1
25.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
% 1
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 57
86.4%
Common 9
 
13.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10
17.5%
6
 
10.5%
5
 
8.8%
3
 
5.3%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
1
 
1.8%
Other values (22) 22
38.6%
Common
ValueCountFrequency (%)
3
33.3%
% 1
 
11.1%
0 1
 
11.1%
1 1
 
11.1%
~ 1
 
11.1%
5 1
 
11.1%
2 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 57
86.4%
ASCII 9
 
13.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10
17.5%
6
 
10.5%
5
 
8.8%
3
 
5.3%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
2
 
3.5%
1
 
1.8%
Other values (22) 22
38.6%
ASCII
ValueCountFrequency (%)
3
33.3%
% 1
 
11.1%
0 1
 
11.1%
1 1
 
11.1%
~ 1
 
11.1%
5 1
 
11.1%
2 1
 
11.1%

표현형태-3
Text

MISSING 

Distinct: 30
Distinct (%): 54.5%
Missing: 10
Missing (%): 15.4%
Memory size: 652.0 B

Length

Max length: 9
Median length: 3
Mean length: 3.1090909
Min length: 1

Characters and Unicode

Total characters: 171
Distinct characters: 63
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): 41.8%

Sample

1st row: 진하다
2nd row: 옅은 자색
3rd row: 연하다
4th row: 얼룩
5th row: 연하다
ValueCountFrequency (%)
연하다 10
 
16.1%
짧다 6
 
9.7%
반직립 5
 
8.1%
갈색 4
 
6.5%
빠르다 3
 
4.8%
좁다 3
 
4.8%
약하다 2
 
3.2%
2
 
3.2%
중원형 1
 
1.6%
부분추출 1
 
1.6%
Other values (25) 25
40.3%

Most occurring characters

ValueCountFrequency (%)
30
17.5%
14
 
8.2%
10
 
5.8%
10
 
5.8%
8
 
4.7%
7
 
4.1%
6
 
3.5%
5
 
2.9%
5
 
2.9%
5
 
2.9%
Other values (53) 71
41.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 156
91.2%
Space Separator 8
 
4.7%
Decimal Number 5
 
2.9%
Math Symbol 1
 
0.6%
Other Punctuation 1
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
30
19.2%
14
 
9.0%
10
 
6.4%
10
 
6.4%
7
 
4.5%
6
 
3.8%
5
 
3.2%
5
 
3.2%
5
 
3.2%
3
 
1.9%
Other values (47) 61
39.1%
Decimal Number
ValueCountFrequency (%)
1 3
60.0%
5 1
 
20.0%
3 1
 
20.0%
Space Separator
ValueCountFrequency (%)
8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Other Punctuation
ValueCountFrequency (%)
% 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 156
91.2%
Common 15
 
8.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
30
19.2%
14
 
9.0%
10
 
6.4%
10
 
6.4%
7
 
4.5%
6
 
3.8%
5
 
3.2%
5
 
3.2%
5
 
3.2%
3
 
1.9%
Other values (47) 61
39.1%
Common
ValueCountFrequency (%)
8
53.3%
1 3
 
20.0%
5 1
 
6.7%
~ 1
 
6.7%
% 1
 
6.7%
3 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 156
91.2%
ASCII 15
 
8.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
30
19.2%
14
 
9.0%
10
 
6.4%
10
 
6.4%
7
 
4.5%
6
 
3.8%
5
 
3.2%
5
 
3.2%
5
 
3.2%
3
 
1.9%
Other values (47) 61
39.1%
ASCII
ValueCountFrequency (%)
8
53.3%
1 3
 
20.0%
5 1
 
6.7%
~ 1
 
6.7%
% 1
 
6.7%
3 1
 
6.7%

표현형태-4
Text

MISSING 

Distinct: 11
Distinct (%): 84.6%
Missing: 52
Missing (%): 80.0%
Memory size: 652.0 B

Length

Max length: 9
Median length: 5
Mean length: 3.9230769
Min length: 2

Characters and Unicode

Total characters: 51
Distinct characters: 22
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 69.2%

Sample

1st row: 자색
2nd row: 균등
3rd row: 옅은자주색
4th row: 옅은자색
5th row: 적갈색
ValueCountFrequency (%)
적갈색 3
16.7%
3
16.7%
자색 2
11.1%
균등 1
 
5.6%
옅은자주색 1
 
5.6%
옅은자색 1
 
5.6%
적색 1
 
5.6%
담자색 1
 
5.6%
자색점 1
 
5.6%
장원형 1
 
5.6%
Other values (3) 3
16.7%

Most occurring characters

ValueCountFrequency (%)
11
21.6%
6
11.8%
5
9.8%
4
 
7.8%
4
 
7.8%
3
 
5.9%
2
 
3.9%
~ 2
 
3.9%
1
 
2.0%
0 1
 
2.0%
Other values (12) 12
23.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 39
76.5%
Space Separator 5
 
9.8%
Decimal Number 4
 
7.8%
Math Symbol 2
 
3.9%
Other Punctuation 1
 
2.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
11
28.2%
6
15.4%
4
 
10.3%
4
 
10.3%
3
 
7.7%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (5) 5
12.8%
Decimal Number
ValueCountFrequency (%)
0 1
25.0%
2 1
25.0%
6 1
25.0%
1 1
25.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Other Punctuation
ValueCountFrequency (%)
% 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 39
76.5%
Common 12
 
23.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
11
28.2%
6
15.4%
4
 
10.3%
4
 
10.3%
3
 
7.7%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (5) 5
12.8%
Common
ValueCountFrequency (%)
5
41.7%
~ 2
 
16.7%
0 1
 
8.3%
2 1
 
8.3%
6 1
 
8.3%
1 1
 
8.3%
% 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 39
76.5%
ASCII 12
 
23.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
11
28.2%
6
15.4%
4
 
10.3%
4
 
10.3%
3
 
7.7%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (5) 5
12.8%
ASCII
ValueCountFrequency (%)
5
41.7%
~ 2
 
16.7%
0 1
 
8.3%
2 1
 
8.3%
6 1
 
8.3%
1 1
 
8.3%
% 1
 
8.3%

표현형태-5
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 9
Median length: 2
Mean length: 2.8153846
Min length: 2

Unique

Unique: 8
Unique (%): 12.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 중간
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
중간 | 30 | 46.2%
<NA> | 19 | 29.2%
자색 | 3 | 4.6%
담적색 | 3 | 4.6%
수평 | 2 | 3.1%
자주색 | 1 | 1.5%
보통개형 | 1 | 1.5%
전체 | 1 | 1.5%
굽음 | 1 | 1.5%
개형 | 1 | 1.5%
Other values (3) | 3 | 4.6%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
중간 30
44.1%
na 19
27.9%
자색 3
 
4.4%
담적색 3
 
4.4%
수평 2
 
2.9%
2
 
2.9%
자주색 1
 
1.5%
보통개형 1
 
1.5%
전체 1
 
1.5%
굽음 1
 
1.5%
Other values (5) 5
 
7.4%

표현형태-6
Text

MISSING 

Distinct: 3
Distinct (%): 50.0%
Missing: 59
Missing (%): 90.8%
Memory size: 652.0 B

Length

Max length: 9
Median length: 2
Mean length: 3.1666667
Min length: 2

Characters and Unicode

Total characters: 19
Distinct characters: 10
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 16.7%

Sample

1st row: 적색
2nd row: 흑색
3rd row: 적색
4th row: 흑색
5th row: 적색
ValueCountFrequency (%)
적색 3
33.3%
흑색 2
22.2%
2
22.2%
25 1
 
11.1%
30 1
 
11.1%

Most occurring characters

ValueCountFrequency (%)
5
26.3%
3
15.8%
3
15.8%
2
 
10.5%
2 1
 
5.3%
5 1
 
5.3%
~ 1
 
5.3%
3 1
 
5.3%
0 1
 
5.3%
% 1
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 10
52.6%
Decimal Number 4
 
21.1%
Space Separator 3
 
15.8%
Math Symbol 1
 
5.3%
Other Punctuation 1
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1
25.0%
5 1
25.0%
3 1
25.0%
0 1
25.0%
Other Letter
ValueCountFrequency (%)
5
50.0%
3
30.0%
2
 
20.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Other Punctuation
ValueCountFrequency (%)
% 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 10
52.6%
Common 9
47.4%

Most frequent character per script

Common
ValueCountFrequency (%)
3
33.3%
2 1
 
11.1%
5 1
 
11.1%
~ 1
 
11.1%
3 1
 
11.1%
0 1
 
11.1%
% 1
 
11.1%
Hangul
ValueCountFrequency (%)
5
50.0%
3
30.0%
2
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 10
52.6%
ASCII 9
47.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5
50.0%
3
30.0%
2
 
20.0%
ASCII
ValueCountFrequency (%)
3
33.3%
2 1
 
11.1%
5 1
 
11.1%
~ 1
 
11.1%
3 1
 
11.1%
0 1
 
11.1%
% 1
 
11.1%

표현형태-7
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 27.7%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 6
Median length: 5
Mean length: 3.4
Min length: 2

Unique

Unique: 10
Unique (%): 15.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 진하다
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 41.5%
진하다 | 9 | 13.8%
길다 | 6 | 9.2%
강하다 | 3 | 4.6%
넓다 | 3 | 4.6%
늦다 | 3 | 4.6%
뒤로휨 | 2 | 3.1%
담자색 | 2 | 3.1%
심하게 굽음 | 1 | 1.5%
완전개형 | 1 | 1.5%
Other values (8) | 8 | 12.3%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
na 27
39.7%
진하다 9
 
13.2%
길다 6
 
8.8%
강하다 3
 
4.4%
넓다 3
 
4.4%
늦다 3
 
4.4%
뒤로휨 2
 
2.9%
담자색 2
 
2.9%
2
 
2.9%
30 1
 
1.5%
Other values (10) 10
 
14.7%

표현형태-8
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.9076923
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 62 | 95.4%
자색 | 3 | 4.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
na 62
95.4%
자색 3
 
4.6%
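
The 73.0% imbalance reported for 표현형태-8 is consistent with a normalized-entropy score computed from the two class counts (62 and 3). That the report derives the figure this way is an assumption; the sketch below only shows that the formula reproduces the number. `imbalance_score` is a helper written for this example.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts for 표현형태-8 from its Common Values table: <NA> 62, 자색 3.
score = imbalance_score([62, 3])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0, and the score approaches 1 as a single value dominates.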

표현형태-9
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 8
Median length: 4
Mean length: 3.7230769
Min length: 2

Unique

Unique: 7
Unique (%): 10.8%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 있다
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 43 | 66.2%
있다 | 10 | 15.4%
매우진하다 | 3 | 4.6%
흑색 | 2 | 3.1%
포복 | 1 | 1.5%
장간 | 1 | 1.5%
매우길다 | 1 | 1.5%
매우강하다 | 1 | 1.5%
추출매우양호 | 1 | 1.5%
매우늦다 | 1 | 1.5%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
na 43
64.2%
있다 10
 
14.9%
매우진하다 3
 
4.5%
흑색 3
 
4.5%
포복 1
 
1.5%
장간 1
 
1.5%
매우길다 1
 
1.5%
매우강하다 1
 
1.5%
추출매우양호 1
 
1.5%
매우늦다 1
 
1.5%
Other values (2) 2
 
3.0%

출원품종 표현형태
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.4615385
Min length: 1

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 1
2nd row: 1
3rd row: 5
4th row: 1
5th row: <NA>

Common Values

Value | Count | Frequency (%)
1 | 28 | 43.1%
5 | 17 | 26.2%
<NA> | 10 | 15.4%
2 | 5 | 7.7%
3 | 4 | 6.2%
4 | 1 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
1 28
43.1%
5 17
26.2%
na 10
 
15.4%
2 5
 
7.7%
3 4
 
6.2%
4 1
 
1.5%

출원품종 실측치
Categorical

HIGH CORRELATION 

Distinct: 27
Distinct (%): 41.5%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 10
Median length: 5
Mean length: 4.0307692
Min length: 2

Unique

Unique: 22
Unique (%): 33.8%

Sample

1st row: 없거나 매우 약하다
2nd row: 녹색
3rd row: 중간
4th row: 없다
5th row: <NA>

Common Values

Value | Count | Frequency (%)
없다 | 12 | 18.5%
없거나 매우 약하다 | 10 | 15.4%
중간 | 9 | 13.8%
<NA> | 9 | 13.8%
반직립 | 3 | 4.6%
황백색 | 1 | 1.5%
뾰족하다 | 1 | 1.5%
무색 | 1 | 1.5%
직립 | 1 | 1.5%
8. 11 | 1 | 1.5%
Other values (17) | 17 | 26.2%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
없다 12
14.0%
매우 10
11.6%
약하다 10
11.6%
없거나 10
11.6%
중간 9
10.5%
na 9
10.5%
반직립 3
 
3.5%
2.74 1
 
1.2%
26.2 1
 
1.2%
3.21 1
 
1.2%
Other values (20) 20
23.3%

대조품종 표현형태
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.4153846
Min length: 1

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 1
2nd row: 1
3rd row: 5
4th row: 1
5th row: <NA>

Common Values

Value | Count | Frequency (%)
1 | 28 | 43.1%
5 | 19 | 29.2%
<NA> | 9 | 13.8%
2 | 4 | 6.2%
3 | 4 | 6.2%
4 | 1 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
1 28
43.1%
5 19
29.2%
na 9
 
13.8%
2 4
 
6.2%
3 4
 
6.2%
4 1
 
1.5%

대조품종 실측치
Categorical

HIGH CORRELATION 

Distinct: 27
Distinct (%): 41.5%
Missing: 0
Missing (%): 0.0%
Memory size: 652.0 B

Length

Max length: 10
Median length: 4
Mean length: 3.8461538
Min length: 2

Unique

Unique: 22
Unique (%): 33.8%

Sample

1st row: 없거나 매우 약하다
2nd row: 녹색
3rd row: 중간
4th row: 없다
5th row: <NA>

Common Values

Value | Count | Frequency (%)
없다 | 12 | 18.5%
중간 | 10 | 15.4%
없거나 매우 약하다 | 9 | 13.8%
<NA> | 9 | 13.8%
반직립 | 3 | 4.6%
황백색 | 1 | 1.5%
뾰족하다 | 1 | 1.5%
무색 | 1 | 1.5%
직립 | 1 | 1.5%
8. 8 | 1 | 1.5%
Other values (17) | 17 | 26.2%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
없다 12
14.3%
중간 10
11.9%
없거나 9
10.7%
매우 9
10.7%
약하다 9
10.7%
na 9
10.7%
반직립 3
 
3.6%
8 2
 
2.4%
단원형 1
 
1.2%
3.41 1
 
1.2%
Other values (19) 19
22.6%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 번호 | 형질 | 표현형태-1 | 표현형태-2 | 표현형태-3 | 표현형태-4 | 표현형태-5 | 표현형태-6 | 표현형태-7 | 표현형태-9 | 출원품종 표현형태 | 출원품종 실측치 | 대조품종 표현형태 | 대조품종 실측치
번호 | 1.000 | 1.000 | 0.201 | 0.685 | 0.729 | 0.654 | 0.168 | 0.000 | 0.524 | 0.764 | 0.200 | 0.527 | 0.281 | 0.558
형질 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
표현형태-1 | 0.201 | 1.000 | 1.000 | 0.956 | 0.993 | 0.886 | 0.880 | 0.842 | 0.982 | 1.000 | 0.964 | 0.982 | 0.936 | 0.980
표현형태-2 | 0.685 | 1.000 | 0.956 | 1.000 | 0.995 | 0.959 | 0.932 | 0.643 | 1.000 | 0.000 | 1.000 | 0.900 | 1.000 | 0.900
표현형태-3 | 0.729 | 1.000 | 0.993 | 0.995 | 1.000 | 0.919 | 0.920 | 0.898 | 0.990 | 1.000 | 0.948 | 0.953 | 0.934 | 0.951
표현형태-4 | 0.654 | 1.000 | 0.886 | 0.959 | 0.919 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
표현형태-5 | 0.168 | 1.000 | 0.880 | 0.932 | 0.920 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.921 | 0.996 | 0.923 | 0.996
표현형태-6 | 0.000 | 1.000 | 0.842 | 0.643 | 0.898 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 0.827 | 1.000 | 0.827 | 1.000
표현형태-7 | 0.524 | 1.000 | 0.982 | 1.000 | 0.990 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.981 | 0.982 | 0.979
표현형태-9 | 0.764 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
출원품종 표현형태 | 0.200 | 1.000 | 0.964 | 1.000 | 0.948 | 1.000 | 0.921 | 0.827 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000
출원품종 실측치 | 0.527 | 1.000 | 0.982 | 0.900 | 0.953 | 1.000 | 0.996 | 1.000 | 0.981 | 1.000 | 1.000 | 1.000 | 0.996 | 1.000
대조품종 표현형태 | 0.281 | 1.000 | 0.936 | 1.000 | 0.934 | 1.000 | 0.923 | 0.827 | 0.982 | 1.000 | 0.999 | 0.996 | 1.000 | 1.000
대조품종 실측치 | 0.558 | 1.000 | 0.980 | 0.900 | 0.951 | 1.000 | 0.996 | 1.000 | 0.979 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
[Figure: correlation heatmap]
Variable | 표현형태-5 | 표현형태-8 | 표현형태-7 | 대조품종 실측치 | 대조품종 표현형태 | 표현형태-9 | 표현형태-1 | 출원품종 표현형태 | 출원품종 실측치
표현형태-5 | 1.000 | 1.000 | 0.810 | 0.687 | 0.739 | 0.577 | 0.481 | 0.733 | 0.687
표현형태-8 | 1.000 | 1.000 | 1.000 | NaN | NaN | 1.000 | 1.000 | NaN | NaN
표현형태-7 | 0.810 | 1.000 | 1.000 | 0.678 | 0.727 | 1.000 | 0.856 | 0.759 | 0.687
대조품종 실측치 | 0.687 | NaN | 0.678 | 1.000 | 0.767 | 0.920 | 0.790 | 0.760 | 0.993
대조품종 표현형태 | 0.739 | NaN | 0.727 | 0.767 | 1.000 | 0.829 | 0.660 | 0.956 | 0.753
표현형태-9 | 0.577 | 1.000 | 1.000 | 0.920 | 0.829 | 1.000 | 0.957 | 0.829 | 0.920
표현형태-1 | 0.481 | 1.000 | 0.856 | 0.790 | 0.660 | 0.957 | 1.000 | 0.723 | 0.805
출원품종 표현형태 | 0.733 | NaN | 0.759 | 0.760 | 0.956 | 0.829 | 0.723 | 1.000 | 0.775
출원품종 실측치 | 0.687 | NaN | 0.687 | 0.993 | 0.753 | 0.920 | 0.805 | 0.775 | 1.000
[Figure: correlation heatmap]
Variable | 번호 | 표현형태-1 | 표현형태-5 | 표현형태-7 | 표현형태-8 | 표현형태-9 | 출원품종 표현형태 | 출원품종 실측치 | 대조품종 표현형태 | 대조품종 실측치
번호 | 1.000 | 0.000 | 0.000 | 0.164 | 1.000 | 0.282 | 0.056 | 0.150 | 0.100 | 0.167
표현형태-1 | 0.000 | 1.000 | 0.481 | 0.856 | 1.000 | 0.957 | 0.723 | 0.805 | 0.660 | 0.790
표현형태-5 | 0.000 | 0.481 | 1.000 | 0.810 | 1.000 | 0.577 | 0.733 | 0.687 | 0.739 | 0.687
표현형태-7 | 0.164 | 0.856 | 0.810 | 1.000 | 1.000 | 1.000 | 0.759 | 0.687 | 0.727 | 0.678
표현형태-8 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN
표현형태-9 | 0.282 | 0.957 | 0.577 | 1.000 | 1.000 | 1.000 | 0.829 | 0.920 | 0.829 | 0.920
출원품종 표현형태 | 0.056 | 0.723 | 0.733 | 0.759 | NaN | 0.829 | 1.000 | 0.775 | 0.956 | 0.760
출원품종 실측치 | 0.150 | 0.805 | 0.687 | 0.687 | NaN | 0.920 | 0.775 | 1.000 | 0.753 | 0.993
대조품종 표현형태 | 0.100 | 0.660 | 0.739 | 0.727 | NaN | 0.829 | 0.956 | 0.753 | 1.000 | 0.767
대조품종 실측치 | 0.167 | 0.790 | 0.687 | 0.678 | NaN | 0.920 | 0.760 | 0.993 | 0.767 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
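
Nullity correlation can be sketched as the Pearson correlation between 0/1 "value present" indicators of two columns. The columns below are hypothetical toy data, not values from this dataset, and `pearson` and `nullity` are helpers written for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column never changes state
    return cov / math.sqrt(var_x * var_y)

def nullity(column):
    """1 where a value is present, 0 where it is missing (None)."""
    return [0 if value is None else 1 for value in column]

# Hypothetical toy columns; None marks a missing cell.
col_a = ["x", None, "y", None, "z", None]
col_b = ["p", None, "q", None, "r", "s"]
col_c = [None, "u", None, "v", None, "w"]

r_ab = pearson(nullity(col_a), nullity(col_b))  # mostly present together
r_ac = pearson(nullity(col_a), nullity(col_c))  # present in opposite rows
print(round(r_ab, 3), round(r_ac, 3))
```

A value near 1 means two columns tend to be filled in or missing together; a value near -1 means one is present exactly where the other is missing.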

Sample

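(index) | 번호 | 형질 | 표현형태-1 | 표현형태-2 | 표현형태-3 | 표현형태-4 | 표현형태-5 | 표현형태-6 | 표현형태-7 | 표현형태-8 | 표현형태-9 | 출원품종 표현형태 | 출원품종 실측치 | 대조품종 표현형태 | 대조품종 실측치
0 | 1 | 초엽 : 안토시아닌 색소 | 없거나매우약하다 | 옅다 | 진하다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1 | 없거나 매우 약하다 | 1 | 없거나 매우 약하다
1 | 2 | 제1엽 : 엽초색 | 녹색 | 자주색 | 옅은 자색 | 자색 | <NA> | <NA> | <NA> | <NA> | <NA> | 1 | 녹색 | 1 | 녹색
2 | 3 | 잎 : 녹색정도 (잎색농도) | <NA> | <NA> | 연하다 | <NA> | 중간 | <NA> | 진하다 | <NA> | <NA> | 5 | 중간 | 5 | 중간
3 | 4 | 잎 : 안토시아닌 색소 | 없다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 있다 | 1 | 없다 | 1 | 없다
4 | 5 | 잎 : 안토시아닌 색소분포 | (blank) | 가장자리 | 얼룩 | 균등 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 6 | 잎집 : 안토시아닌 색소 | 없다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 있다 | 1 | 없다 | 1 | 없다
6 | 7 | 잎집 : 안토시아닌 색소농도 | 매우연하다 | <NA> | 연하다 | <NA> | 중간 | <NA> | 강하다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 8 | 잎몸 : 모용성 | 없거나매우약하다 | <NA> | 약하다 | <NA> | 중간 | <NA> | 강하다 | <NA> | <NA> | 1 | 없거나 매우 약하다 | 5 | 중간
8 | 9 | 잎 : 잎귀의 안토시아닌 색소 | 없다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 있다 | 1 | 없다 | 1 | 없다
9 | 10 | 잎 : 잎깃의 안토시아닌 색소 | 없다 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 있다 | 1 | 없다 | 1 | 없다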
번호형질표현형태-1표현형태-2표현형태-3표현형태-4표현형태-5표현형태-6표현형태-7표현형태-8표현형태-9출원품종 표현형태출원품종 실측치대조품종 표현형태대조품종 실측치
5556벼알 : 외영의 페놀반응없다<NA><NA><NA><NA><NA><NA><NA>있다1없다1없다
5657외영 페놀반응 정도<NA><NA>연하다<NA>중간<NA>진하다<NA><NA><NA><NA><NA><NA>
5758현미 : 길이<NA><NA>짧다<NA>중간<NA>길다<NA><NA>55.0755.34
5859현미 : 폭<NA><NA>좁다<NA>중간<NA>넓다<NA><NA>52.7452.82
5960현미 : 모양 (측면 관찰)원형단원형중원형장원형세장형<NA><NA><NA><NA>2단원형2단원형
6061현미 : 색백색담갈색얼룩진갈색짙은갈색담적색적색얼룩진자색자색암자색 / 흑색2담갈색2담갈색
6162배유 : 찰메성중간<NA><NA><NA><NA><NA><NA>3메성3메성
6263배유 : 아밀로스함량<5 ~ 10 %11 ~ 15 %16 ~ 20 %21 ~ 25 %25 ~ 30 %> 30 %<NA><NA>420.25418.6
6364알카리 붕괴도붕괴안됨<NA>조금붕괴됨<NA>중간<NA>완전히붕괴됨<NA><NA>1붕괴안됨1붕괴안됨
6465현미 : 향취성없거나매우약하다약하다강하다<NA><NA><NA><NA><NA><NA>1없거나 매우 약하다1없거나 매우 약하다