Overview

Dataset statistics

Number of variables: 10
Number of observations: 1396
Missing cells: 11
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 109.2 KiB
Average record size in memory: 80.1 B

Variable types

Text: 6
Categorical: 4

Dataset

Description: Daea Arboretum plant holdings (herbaceous plants)
Author: Jeollabuk-do
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202390

Alerts

Unnamed: 5 is highly overall correlated with Unnamed: 8 and 1 other field [High correlation]
Unnamed: 6 is highly overall correlated with Unnamed: 8 and 1 other field [High correlation]
Unnamed: 8 is highly overall correlated with Unnamed: 5 and 2 other fields [High correlation]
Unnamed: 9 is highly overall correlated with Unnamed: 5 and 2 other fields [High correlation]
Unnamed: 6 is highly imbalanced (61.5%) [Imbalance]
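
The report does not state how the 61.5% imbalance figure is derived. One definition that reproduces it from the counts listed for Unnamed: 6 under Common Values is one minus the Shannon entropy of the class shares, divided by the maximum entropy possible for that number of classes. The sketch below checks that assumption; `imbalance_score` is a hypothetical helper name, not a ydata-profiling function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    H is the Shannon entropy (base 2) of the class shares and H_max is
    log2(number of classes). 0.0 means perfectly balanced classes; values
    near 1.0 mean one class dominates. This definition is an assumption.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        raise ValueError("need at least two non-empty classes")
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(shares))


# Counts for Unnamed: 6 as listed under Common Values, in order:
# 수집, 자생, "자생,수집", <NA>, 보유경위, "수집,자생"
unnamed_6_counts = [932, 450, 10, 2, 1, 1]
score = imbalance_score(unnamed_6_counts)
print(f"{score:.1%}")  # prints 61.5%
```

Two of the six classes hold 99% of the rows, which is why the score is well above zero even though neither class alone holds an overwhelming majority.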

Reproduction

Analysis started: 2024-03-14 00:09:44.006580
Analysis finished: 2024-03-14 00:09:45.285854
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

대아수목원 보유 식물목록(초본) 117과 497속 871종 5아종 126변종 7품종 1교잡종 257재배종 126기타종 총 1,393종류
Text

Distinct: 1394
Distinct (%): 100.0%
Missing: 2
Missing (%): 0.1%
Memory size: 11.0 KiB

Length

Max length: 5
Median length: 3
Mean length: 3.2065997
Min length: 1

Characters and Unicode

Total characters: 4470
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
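
As a minimal illustration of that idea, Python's standard `unicodedata` module exposes the general category of each code point. The report's labels correspond to the two-letter codes `Nd` (Decimal Number), `Lo` (Other Letter), and `Cc` (Control). The sample values are illustrative, and the line break inside the header cell is an assumption that would explain the single Control character reported for this column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Illustrative values: a header cell followed by serial numbers.
# The "\n" in the header is assumed, not confirmed by the report.
values = ["일련\n번호", "1", "2", "3", "4"]
counts = category_counts(values)
print(dict(counts))
```

The standard library does not expose the script or block properties, so reproducing those parts of the report would need another data source.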

Unique

Unique: 1394
Unique (%): 100.0%

Sample

1st row: 일련 번호
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4
Value | Count | Frequency (%)
11 | 1 | 0.1%
939 | 1 | 0.1%
934 | 1 | 0.1%
933 | 1 | 0.1%
932 | 1 | 0.1%
931 | 1 | 0.1%
930 | 1 | 0.1%
929 | 1 | 0.1%
928 | 1 | 0.1%
927 | 1 | 0.1%
Other values (1385) | 1385 | 99.3%

Most occurring characters

ValueCountFrequency (%)
1 874
19.6%
2 480
10.7%
3 474
10.6%
4 379
8.5%
5 379
8.5%
6 379
8.5%
7 379
8.5%
8 379
8.5%
9 373
8.3%
0 369
8.3%
Other values (5) 5
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4465
99.9%
Other Letter 4
 
0.1%
Control 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 874
19.6%
2 480
10.8%
3 474
10.6%
4 379
8.5%
5 379
8.5%
6 379
8.5%
7 379
8.5%
8 379
8.5%
9 373
8.4%
0 369
8.3%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4466
99.9%
Hangul 4
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 874
19.6%
2 480
10.7%
3 474
10.6%
4 379
8.5%
5 379
8.5%
6 379
8.5%
7 379
8.5%
8 379
8.5%
9 373
8.4%
0 369
8.3%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4466
99.9%
Hangul 4
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 874
19.6%
2 480
10.7%
3 474
10.6%
4 379
8.5%
5 379
8.5%
6 379
8.5%
7 379
8.5%
8 379
8.5%
9 373
8.4%
0 369
8.3%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 1
Text

Distinct: 1314
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 60
Median length: 46
Mean length: 26.89255
Min length: 8

Characters and Unicode

Total characters: 37542
Distinct characters: 82
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 1276
Unique (%): 91.4%

Sample

1st row: 식물 유전자원명
2nd row: 학 명
3rd row: Lycopodium chinense Chirist.
4th row: Selaginella involvens (Sw.) Spring
5th row: Selaginella tamariscina (Beauv.) Spring
Value | Count | Frequency (%)
spp | 333 | 7.0%
iris | 182 | 3.8%
l | 159 | 3.3%
var | 131 | 2.8%
nakai | 80 | 1.7%
et | 66 | 1.4%
thunb | 46 | 1.0%
max | 43 | 0.9%
hemerocallis | 42 | 0.9%
japonica | 39 | 0.8%
Other values (2029) | 3639 | 76.4%

Most occurring characters

ValueCountFrequency (%)
3978
 
10.6%
a 3652
 
9.7%
i 2954
 
7.9%
s 2230
 
5.9%
e 2225
 
5.9%
r 2112
 
5.6%
n 1638
 
4.4%
o 1622
 
4.3%
l 1562
 
4.2%
u 1378
 
3.7%
Other values (72) 14191
37.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28133
74.9%
Space Separator 3978
 
10.6%
Uppercase Letter 2927
 
7.8%
Other Punctuation 1875
 
5.0%
Close Punctuation 190
 
0.5%
Open Punctuation 190
 
0.5%
Dash Punctuation 186
 
0.5%
Other Letter 22
 
0.1%
Final Punctuation 20
 
0.1%
Initial Punctuation 20
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3652
13.0%
i 2954
 
10.5%
s 2230
 
7.9%
e 2225
 
7.9%
r 2112
 
7.5%
n 1638
 
5.8%
o 1622
 
5.8%
l 1562
 
5.6%
u 1378
 
4.9%
t 1358
 
4.8%
Other values (16) 7402
26.3%
Uppercase Letter
ValueCountFrequency (%)
L 305
 
10.4%
S 240
 
8.2%
C 239
 
8.2%
A 215
 
7.3%
M 211
 
7.2%
P 205
 
7.0%
I 204
 
7.0%
H 193
 
6.6%
B 156
 
5.3%
T 139
 
4.7%
Other values (16) 820
28.0%
Other Letter
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (6) 6
27.3%
Other Punctuation
ValueCountFrequency (%)
. 1372
73.2%
" 407
 
21.7%
' 89
 
4.7%
? 3
 
0.2%
& 2
 
0.1%
* 1
 
0.1%
: 1
 
0.1%
Space Separator
ValueCountFrequency (%)
3978
100.0%
Close Punctuation
ValueCountFrequency (%)
) 190
100.0%
Open Punctuation
ValueCountFrequency (%)
( 190
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 186
100.0%
Final Punctuation
ValueCountFrequency (%)
20
100.0%
Initial Punctuation
ValueCountFrequency (%)
20
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31060
82.7%
Common 6460
 
17.2%
Hangul 22
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3652
 
11.8%
i 2954
 
9.5%
s 2230
 
7.2%
e 2225
 
7.2%
r 2112
 
6.8%
n 1638
 
5.3%
o 1622
 
5.2%
l 1562
 
5.0%
u 1378
 
4.4%
t 1358
 
4.4%
Other values (42) 10329
33.3%
Hangul
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (6) 6
27.3%
Common
ValueCountFrequency (%)
3978
61.6%
. 1372
 
21.2%
" 407
 
6.3%
) 190
 
2.9%
( 190
 
2.9%
- 186
 
2.9%
' 89
 
1.4%
20
 
0.3%
20
 
0.3%
? 3
 
< 0.1%
Other values (4) 5
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37480
99.8%
Punctuation 40
 
0.1%
Hangul 22
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3978
 
10.6%
a 3652
 
9.7%
i 2954
 
7.9%
s 2230
 
5.9%
e 2225
 
5.9%
r 2112
 
5.6%
n 1638
 
4.4%
o 1622
 
4.3%
l 1562
 
4.2%
u 1378
 
3.7%
Other values (54) 14129
37.7%
Punctuation
ValueCountFrequency (%)
20
50.0%
20
50.0%
Hangul
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (6) 6
27.3%

Unnamed: 2
Text

Distinct: 117
Distinct (%): 8.4%
Missing: 2
Missing (%): 0.1%
Memory size: 11.0 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.456241
Min length: 1

Characters and Unicode

Total characters: 3424
Distinct characters: 166
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 28
Unique (%): 2.0%

Sample

1st row: 과 명
2nd row: 석송
3rd row: 부처손
4th row: 부처손
5th row: 부처손
Value | Count | Frequency (%)
붓꽃 | 196 | 14.1%
백합 | 163 | 11.7%
국화 | 130 | 9.3%
(label missing in source) | 80 | 5.7%
사초 | 48 | 3.4%
미나리아재비 | 46 | 3.3%
천남성 | 43 | 3.1%
꿀풀 | 39 | 2.8%
돌나물 | 34 | 2.4%
선인장 | 29 | 2.1%
Other values (108) | 587 | 42.1%

Most occurring characters

ValueCountFrequency (%)
253
 
7.4%
196
 
5.7%
165
 
4.8%
164
 
4.8%
163
 
4.8%
130
 
3.8%
113
 
3.3%
95
 
2.8%
91
 
2.7%
90
 
2.6%
Other values (156) 1964
57.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3422
99.9%
Space Separator 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
253
 
7.4%
196
 
5.7%
165
 
4.8%
164
 
4.8%
163
 
4.8%
130
 
3.8%
113
 
3.3%
95
 
2.8%
91
 
2.7%
90
 
2.6%
Other values (155) 1962
57.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3422
99.9%
Common 2
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
253
 
7.4%
196
 
5.7%
165
 
4.8%
164
 
4.8%
163
 
4.8%
130
 
3.8%
113
 
3.3%
95
 
2.8%
91
 
2.7%
90
 
2.6%
Other values (155) 1962
57.3%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3422
99.9%
ASCII 2
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
253
 
7.4%
196
 
5.7%
165
 
4.8%
164
 
4.8%
163
 
4.8%
130
 
3.8%
113
 
3.3%
95
 
2.8%
91
 
2.7%
90
 
2.6%
Other values (155) 1962
57.3%
ASCII
ValueCountFrequency (%)
2
100.0%

Unnamed: 3
Text

Distinct: 1174
Distinct (%): 84.2%
Missing: 2
Missing (%): 0.1%
Memory size: 11.0 KiB

Length

Max length: 16
Median length: 14
Mean length: 5.1499283
Min length: 1

Characters and Unicode

Total characters: 7179
Distinct characters: 549
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 1163
Unique (%): 83.4%

Sample

1st row: 국 명
2nd row: 다람쥐꼬리
3rd row: 바위손
4th row: 부처손
5th row: 구실사리
Value | Count | Frequency (%)
품종 | 219 | 12.4%
붓꽃 | 91 | 5.2%
붓꽃류 | 74 | 4.2%
독일 | 54 | 3.1%
원추리 | 34 | 1.9%
알리움 | 12 | 0.7%
백합 | 9 | 0.5%
필로덴드론 | 8 | 0.5%
드로세라 | 6 | 0.3%
튤립품종 | 6 | 0.3%
Other values (1207) | 1247 | 70.9%

Most occurring characters

ValueCountFrequency (%)
371
 
5.2%
283
 
3.9%
255
 
3.6%
233
 
3.2%
228
 
3.2%
180
 
2.5%
135
 
1.9%
115
 
1.6%
) 112
 
1.6%
( 112
 
1.6%
Other values (539) 5155
71.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6568
91.5%
Space Separator 371
 
5.2%
Close Punctuation 112
 
1.6%
Open Punctuation 112
 
1.6%
Other Punctuation 8
 
0.1%
Decimal Number 7
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
283
 
4.3%
255
 
3.9%
233
 
3.5%
228
 
3.5%
180
 
2.7%
135
 
2.1%
115
 
1.8%
111
 
1.7%
97
 
1.5%
97
 
1.5%
Other values (526) 4834
73.6%
Decimal Number
ValueCountFrequency (%)
1 1
14.3%
2 1
14.3%
3 1
14.3%
4 1
14.3%
5 1
14.3%
6 1
14.3%
7 1
14.3%
Other Punctuation
ValueCountFrequency (%)
, 5
62.5%
* 3
37.5%
Space Separator
ValueCountFrequency (%)
371
100.0%
Close Punctuation
ValueCountFrequency (%)
) 112
100.0%
Open Punctuation
ValueCountFrequency (%)
( 112
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6568
91.5%
Common 611
 
8.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
283
 
4.3%
255
 
3.9%
233
 
3.5%
228
 
3.5%
180
 
2.7%
135
 
2.1%
115
 
1.8%
111
 
1.7%
97
 
1.5%
97
 
1.5%
Other values (526) 4834
73.6%
Common
ValueCountFrequency (%)
371
60.7%
) 112
 
18.3%
( 112
 
18.3%
, 5
 
0.8%
* 3
 
0.5%
1 1
 
0.2%
2 1
 
0.2%
3 1
 
0.2%
4 1
 
0.2%
5 1
 
0.2%
Other values (3) 3
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6568
91.5%
ASCII 611
 
8.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
371
60.7%
) 112
 
18.3%
( 112
 
18.3%
, 5
 
0.8%
* 3
 
0.5%
1 1
 
0.2%
2 1
 
0.2%
3 1
 
0.2%
4 1
 
0.2%
5 1
 
0.2%
Other values (3) 3
 
0.5%
Hangul
ValueCountFrequency (%)
283
 
4.3%
255
 
3.9%
233
 
3.5%
228
 
3.5%
180
 
2.7%
135
 
2.1%
115
 
1.8%
111
 
1.7%
97
 
1.5%
97
 
1.5%
Other values (526) 4834
73.6%

Unnamed: 4
Text

Distinct: 95
Distinct (%): 6.8%
Missing: 3
Missing (%): 0.2%
Memory size: 11.0 KiB

Length

Max length: 5
Median length: 1
Mean length: 1.5757358
Min length: 1

Characters and Unicode

Total characters: 2195
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 57
Unique (%): 4.1%

Sample

1st row: 수량
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
Value | Count | Frequency (%)
1 | 672 | 48.2%
10 | 170 | 12.2%
3 | 69 | 5.0%
100 | 64 | 4.6%
25 | 51 | 3.7%
2 | 35 | 2.5%
15 | 33 | 2.4%
50 | 30 | 2.2%
5 | 29 | 2.1%
20 | 27 | 1.9%
Other values (85) | 213 | 15.3%

Most occurring characters

ValueCountFrequency (%)
1 1014
46.2%
0 562
25.6%
5 176
 
8.0%
2 169
 
7.7%
3 118
 
5.4%
, 32
 
1.5%
4 31
 
1.4%
8 26
 
1.2%
6 26
 
1.2%
7 24
 
1.1%
Other values (4) 17
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2159
98.4%
Other Punctuation 32
 
1.5%
Space Separator 2
 
0.1%
Other Letter 2
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1014
47.0%
0 562
26.0%
5 176
 
8.2%
2 169
 
7.8%
3 118
 
5.5%
4 31
 
1.4%
8 26
 
1.2%
6 26
 
1.2%
7 24
 
1.1%
9 13
 
0.6%
Other Letter
ValueCountFrequency (%)
1
50.0%
1
50.0%
Other Punctuation
ValueCountFrequency (%)
, 32
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2193
99.9%
Hangul 2
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1014
46.2%
0 562
25.6%
5 176
 
8.0%
2 169
 
7.7%
3 118
 
5.4%
, 32
 
1.5%
4 31
 
1.4%
8 26
 
1.2%
6 26
 
1.2%
7 24
 
1.1%
Other values (2) 15
 
0.7%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2193
99.9%
Hangul 2
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1014
46.2%
0 562
25.6%
5 176
 
8.0%
2 169
 
7.7%
3 118
 
5.4%
, 32
 
1.5%
4 31
 
1.4%
8 26
 
1.2%
6 26
 
1.2%
7 24
 
1.1%
Other values (2) 15
 
0.7%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%

Unnamed: 5
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
<NA> | 451
2003년 | 278
2002년 | 229
2011년 | 127
2004년 | 116
Other values (15) | 195

Length

Max length: 5
Median length: 5
Mean length: 4.6755014
Min length: 4

Unique

Unique: 5
Unique (%): 0.4%

Sample

1st row: 보유년월
2nd row: <NA>
3rd row: <NA>
4th row: 2001년
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 451 | 32.3%
2003년 | 278 | 19.9%
2002년 | 229 | 16.4%
2011년 | 127 | 9.1%
2004년 | 116 | 8.3%
2014년 | 55 | 3.9%
2000년 | 38 | 2.7%
2012년 | 29 | 2.1%
2013년 | 20 | 1.4%
1995년 | 15 | 1.1%
Other values (10) | 38 | 2.7%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
na | 451 | 32.3%
2003년 | 278 | 19.9%
2002년 | 229 | 16.4%
2011년 | 127 | 9.1%
2004년 | 116 | 8.3%
2014년 | 55 | 3.9%
2000년 | 38 | 2.7%
2012년 | 29 | 2.1%
2013년 | 20 | 1.4%
1995년 | 15 | 1.1%
Other values (10) | 38 | 2.7%

Unnamed: 6
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
수집 | 932
자생 | 450
자생,수집 | 10
<NA> | 2
보유경위 | 1

Length

Max length: 5
Median length: 2
Mean length: 2.027937
Min length: 2

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 보유경위
2nd row: <NA>
3rd row: 자생
4th row: 수집
5th row: 자생

Common Values

Value | Count | Frequency (%)
수집 | 932 | 66.8%
자생 | 450 | 32.2%
자생,수집 | 10 | 0.7%
<NA> | 2 | 0.1%
보유경위 | 1 | 0.1%
수집,자생 | 1 | 0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수집 | 932 | 66.8%
자생 | 450 | 32.2%
자생,수집 | 10 | 0.7%
na | 2 | 0.1%
보유경위 | 1 | 0.1%
수집,자생 | 1 | 0.1%

Unnamed: 7
Text

Distinct: 78
Distinct (%): 5.6%
Missing: 2
Missing (%): 0.1%
Memory size: 11.0 KiB

Length

Max length: 11
Median length: 2
Mean length: 2.561693
Min length: 2

Characters and Unicode

Total characters: 3571
Distinct characters: 98
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 41
Unique (%): 2.9%

Sample

1st row: 원산지
2nd row: 한국
3rd row: 한국
4th row: 한국
5th row: 한국
Value | Count | Frequency (%)
한국 | 904 | 64.8%
네덜란드 | 143 | 10.3%
독일 | 54 | 3.9%
유럽 | 48 | 3.4%
브라질 | 24 | 1.7%
북아메리카 | 22 | 1.6%
멕시코 | 19 | 1.4%
원예품종 | 16 | 1.1%
중국 | 12 | 0.9%
일본 | 11 | 0.8%
Other values (68) | 142 | 10.2%

Most occurring characters

ValueCountFrequency (%)
928
26.0%
904
25.3%
146
 
4.1%
144
 
4.0%
143
 
4.0%
143
 
4.0%
96
 
2.7%
71
 
2.0%
68
 
1.9%
64
 
1.8%
Other values (88) 864
24.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3552
99.5%
Other Punctuation 18
 
0.5%
Space Separator 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
928
26.1%
904
25.5%
146
 
4.1%
144
 
4.1%
143
 
4.0%
143
 
4.0%
96
 
2.7%
71
 
2.0%
68
 
1.9%
64
 
1.8%
Other values (86) 845
23.8%
Other Punctuation
ValueCountFrequency (%)
, 18
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3552
99.5%
Common 19
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
928
26.1%
904
25.5%
146
 
4.1%
144
 
4.1%
143
 
4.0%
143
 
4.0%
96
 
2.7%
71
 
2.0%
68
 
1.9%
64
 
1.8%
Other values (86) 845
23.8%
Common
ValueCountFrequency (%)
, 18
94.7%
1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3552
99.5%
ASCII 19
 
0.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
928
26.1%
904
25.5%
146
 
4.1%
144
 
4.1%
143
 
4.0%
143
 
4.0%
96
 
2.7%
71
 
2.0%
68
 
1.9%
64
 
1.8%
Other values (86) 845
23.8%
ASCII
ValueCountFrequency (%)
, 18
94.7%
1
 
5.3%

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
완주동상 | 553
프롬엔 | 251
태광식물원 | 123
천보식물원 | 107
대영농장 | 75
Other values (30) | 287

Length

Max length: 8
Median length: 6
Mean length: 3.993553
Min length: 2

Unique

Unique: 12
Unique (%): 0.9%

Sample

1st row: 산지
2nd row: <NA>
3rd row: 완주동상
4th row: 완주동상
5th row: 완주동상

Common Values

Value | Count | Frequency (%)
완주동상 | 553 | 39.6%
프롬엔 | 251 | 18.0%
태광식물원 | 123 | 8.8%
천보식물원 | 107 | 7.7%
대영농장 | 75 | 5.4%
지피식물원 | 45 | 3.2%
화양 | 43 | 3.1%
종자은행 | 38 | 2.7%
대한종묘원 | 32 | 2.3%
식물초록병원에서 | 20 | 1.4%
Other values (25) | 109 | 7.8%

Length

Figure: Histogram of lengths of the category

Unnamed: 9
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
자생 | 445
온실 | 307
묘포장 | 218
풍경뜰 | 158
표본수원 | 81
Other values (17) | 187

Length

Max length: 11
Median length: 2
Mean length: 2.6919771
Min length: 2

Unique

Unique: 5
Unique (%): 0.4%

Sample

1st row: 참고사항 (식재위치)
2nd row: <NA>
3rd row: 자생
4th row: 온실
5th row: 자생

Common Values

Value | Count | Frequency (%)
자생 | 445 | 31.9%
온실 | 307 | 22.0%
묘포장 | 218 | 15.6%
풍경뜰 | 158 | 11.3%
표본수원 | 81 | 5.8%
임도변 | 58 | 4.2%
아이리스원 | 54 | 3.9%
수생식물원 | 23 | 1.6%
열대식물원 | 15 | 1.1%
희귀식물원 | 8 | 0.6%
Other values (12) | 29 | 2.1%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
자생 | 445 | 31.8%
온실 | 307 | 21.9%
묘포장 | 218 | 15.6%
풍경뜰 | 158 | 11.3%
표본수원 | 81 | 5.8%
임도변 | 58 | 4.1%
아이리스원 | 54 | 3.9%
수생식물원 | 23 | 1.6%
열대식물원 | 15 | 1.1%
희귀식물원 | 8 | 0.6%
Other values (14) | 33 | 2.4%

Correlations

Variable | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
Unnamed: 4 | 1.000 | 0.898 | 0.931 | 0.794 | 0.917 | 0.951
Unnamed: 5 | 0.898 | 1.000 | 0.749 | 0.905 | 0.940 | 0.894
Unnamed: 6 | 0.931 | 0.749 | 1.000 | 0.799 | 0.883 | 0.900
Unnamed: 7 | 0.794 | 0.905 | 0.799 | 1.000 | 0.929 | 0.893
Unnamed: 8 | 0.917 | 0.940 | 0.883 | 0.929 | 1.000 | 0.936
Unnamed: 9 | 0.951 | 0.894 | 0.900 | 0.893 | 0.936 | 1.000

Variable | Unnamed: 9 | Unnamed: 6 | Unnamed: 5 | Unnamed: 8
Unnamed: 9 | 1.000 | 0.707 | 0.511 | 0.555
Unnamed: 6 | 0.707 | 1.000 | 0.489 | 0.645
Unnamed: 5 | 0.511 | 0.489 | 1.000 | 0.584
Unnamed: 8 | 0.555 | 0.645 | 0.584 | 1.000

Variable | Unnamed: 5 | Unnamed: 6 | Unnamed: 8 | Unnamed: 9
Unnamed: 5 | 1.000 | 0.489 | 0.584 | 0.511
Unnamed: 6 | 0.489 | 1.000 | 0.645 | 0.707
Unnamed: 8 | 0.584 | 0.645 | 1.000 | 0.555
Unnamed: 9 | 0.511 | 0.707 | 0.555 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
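
To make the two quantities behind these figures concrete, the sketch below counts missing cells per column and computes a nullity correlation as the Pearson correlation between the is-missing indicators of two columns. The rows and the column names `family` and `quantity` are hypothetical, and the report does not confirm that its heatmap uses exactly this formula.

```python
import math


def null_counts(rows, columns):
    """Number of missing (None) cells in each column."""
    return {col: sum(row[col] is None for row in rows) for col in columns}


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never or always missing
    return cov / math.sqrt(var_a * var_b)


# Hypothetical rows; None marks a missing cell.
rows = [
    {"family": "석송", "quantity": "1"},
    {"family": None, "quantity": None},
    {"family": "부처손", "quantity": "1"},
    {"family": None, "quantity": None},
    {"family": "속새", "quantity": None},
]
missing = null_counts(rows, ["family", "quantity"])
r = nullity_correlation(rows, "family", "quantity")
print(missing, round(r, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, which is what one would expect here for the title, header, and footnote rows.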

Sample

Index | 대아수목원 보유 식물목록(초본) 117과 497속 871종 5아종 126변종 7품종 1교잡종 257재배종 126기타종 총 1,393종류 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
0 | 일련 번호 | 식물 유전자원명 | <NA> | <NA> | 수량 | 보유년월 | 보유경위 | 원산지 | 산지 | 참고사항 (식재위치)
1 | <NA> | 학 명 | 과 명 | 국 명 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 1 | Lycopodium chinense Chirist. | 석송 | 다람쥐꼬리 | 1 | <NA> | 자생 | 한국 | 완주동상 | 자생
3 | 2 | Selaginella involvens (Sw.) Spring | 부처손 | 바위손 | 1 | 2001년 | 수집 | 한국 | 완주동상 | 온실
4 | 3 | Selaginella tamariscina (Beauv.) Spring | 부처손 | 부처손 | 1 | <NA> | 자생 | 한국 | 완주동상 | 자생
5 | 4 | Selaginella rossii (Bak.) Warb. | 부처손 | 구실사리 | 1 | <NA> | 자생 | 한국 | 완주동상 | 자생
6 | 5 | Equisetum arvense L. | 속새 | 쇠뜨기 | 1 | <NA> | 자생 | 한국 | 완주동상 | 자생
7 | 6 | Equisetum hyemale L. | 속새 | 속새 | 381 | 2002년 | 수집 | 한국 | 천보식물원 | 약용수원
8 | 7 | Botrychium ternatum (Thunb.) Sw. | 고사리삼 | 고사리삼 | 1 | <NA> | 자생 | 한국 | 완주동상 | 자생
9 | 8 | Osmunda japonica Thunb. | 고비 | 고비 | 200 | 2002년 | 수집 | 한국 | 대영농장 | 표본수원

Index | 대아수목원 보유 식물목록(초본) 117과 497속 871종 5아종 126변종 7품종 1교잡종 257재배종 126기타종 총 1,393종류 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
1386 | 1385 | Iris spp. "Burgemeister" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1387 | 1386 | Iris spp. "Edith Wolford" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1388 | 1387 | Iris spp. "Little Mary Sunshine" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1389 | 1388 | Iris spp. "Man From Rio" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1390 | 1389 | Iris spp. "Mocambo" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1391 | 1390 | Iris spp. "Night Edition Soortecht" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1392 | 1391 | Iris spp. "Silverado" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1393 | 1392 | Iris spp. "Tuxedo" | 붓꽃 | 독일 붓꽃류 품종 | 10 | 2014년 | 수집 | 독일 | 종자은행 | 아이리스원
1394 | 1393 | Hypericum x inodorum 'Rheingold' | 물레나물 | 이노도룸물레나물(품종) | 10 | 2014년 | 수집 | 유럽, 아시아 | 프롬앤 | 열대식물원
1395 | <NA> | * 미동정 식물로 학명이 불분명함. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
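
The sample rows suggest that the file's title line was read as the column header, which would explain the `Unnamed: N` column names and why the real header cells (일련 번호 "serial number", 학 명 "scientific name", 과 명 "family", 국 명 "Korean name", 수량 "quantity", and so on) appear as data in rows 0 and 1. The sketch below shows one way to handle such a layout with the standard `csv` module. It assumes a comma-separated file with one title line followed by a two-level header. `RAW` is a hypothetical shortened excerpt and `read_inventory` is a hypothetical helper, neither taken from the source.

```python
import csv
import io

# Hypothetical excerpt mirroring the layout implied by the sample rows:
# line 1 is a title, lines 2-3 form a two-level header, data starts on line 4.
RAW = '''대아수목원 보유 식물목록(초본),,,,,,,,,
일련 번호,식물 유전자원명,,,수량,보유년월,보유경위,원산지,산지,참고사항 (식재위치)
,학 명,과 명,국 명,,,,,,
1,Lycopodium chinense Chirist.,석송,다람쥐꼬리,1,,자생,한국,완주동상,자생
6,Equisetum hyemale L.,속새,속새,381,2002년,수집,한국,천보식물원,약용수원
'''


def read_inventory(text):
    """Skip the title line, merge the two header rows, and parse quantities."""
    reader = csv.reader(io.StringIO(text))
    next(reader)          # title line, not a header
    upper = next(reader)  # first header row
    lower = next(reader)  # second header row (sub-columns of 식물 유전자원명)
    # Prefer the more specific lower-level name where one exists.
    header = [(lo or up).strip() for up, lo in zip(upper, lower)]
    records = []
    for row in reader:
        record = dict(zip(header, row))
        # The profile shows commas in this column, presumably thousands separators.
        qty = record["수량"].replace(",", "").strip()
        record["수량"] = int(qty) if qty else None
        records.append(record)
    return header, records


header, records = read_inventory(RAW)
print(header)
print(records[1]["수량"])
```

Profiling the data after a step like this would let the serial number and quantity columns be typed as numeric instead of Text, and would remove the header cells from the value counts.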