Overview

Dataset statistics

Number of variables: 15
Number of observations: 1300
Missing cells: 6547
Missing cells (%): 33.6%
Duplicate rows: 2
Duplicate rows (%): 0.2%
Total size in memory: 158.8 KiB
Average record size in memory: 125.1 B
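
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, using only numbers printed in this report (the variable names are illustrative and not part of the profiling tool):

```python
# Figures copied from the dataset statistics and the per-variable summaries.
n_variables = 15
n_observations = 1300
missing_cells = 6547
total_size_kib = 158.8

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

# Five columns (Unnamed: 10 to Unnamed: 14) are reported as 100% missing.
cells_in_empty_columns = 5 * n_observations
remaining_missing = missing_cells - cells_in_empty_columns

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Missing cells outside the empty columns: {remaining_missing}")
```

Nearly all missing cells sit in the five empty trailing columns, so the 33.6% figure says little about the quality of the populated columns.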

Variable types

Text: 7
Categorical: 3
Unsupported: 5

Dataset

Description: Daea Arboretum plant holdings, woody plants (대아수목원식물보유현황목본식물)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202190

Alerts

Dataset has 2 (0.2%) duplicate rows [Duplicates]
Unnamed: 5 is highly overall correlated with Unnamed: 6 [High correlation]
Unnamed: 6 is highly overall correlated with Unnamed: 5 and 1 other field [High correlation]
Unnamed: 8 is highly overall correlated with Unnamed: 6 [High correlation]
Unnamed: 6 is highly imbalanced (74.9%) [Imbalance]
Unnamed: 10 has 1300 (100.0%) missing values [Missing]
Unnamed: 11 has 1300 (100.0%) missing values [Missing]
Unnamed: 12 has 1300 (100.0%) missing values [Missing]
Unnamed: 13 has 1300 (100.0%) missing values [Missing]
Unnamed: 14 has 1300 (100.0%) missing values [Missing]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
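
The Missing and Unsupported alerts for Unnamed: 10 through Unnamed: 14 point at columns that hold no values at all. One possible cleaning step is to drop such columns before any further analysis. The helper below is a hypothetical sketch over plain dictionaries, not part of ydata-profiling, and the two example rows are made up for illustration:

```python
def drop_empty_columns(rows, missing=(None, "")):
    """Return copies of the rows without columns whose values are all missing."""
    if not rows:
        return []
    # Keep a column as soon as one row has a non-missing value in it.
    keep = [key for key in rows[0]
            if any(row.get(key) not in missing for row in rows)]
    return [{key: row.get(key) for key in keep} for row in rows]


# Illustrative rows only; the real table has 1300 rows and 15 columns.
rows = [
    {"Unnamed: 9": "온실", "Unnamed: 10": None, "Unnamed: 11": None},
    {"Unnamed: 9": "표본수원", "Unnamed: 10": None, "Unnamed: 11": None},
]
cleaned = drop_empty_columns(rows)
print(cleaned)
```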

Reproduction

Analysis started: 2024-03-13 23:56:30.244245
Analysis finished: 2024-03-13 23:56:31.721239
Duration: 1.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

대아수목원 보유 식물목록(목본) 90과 250속 532종 1아종 91변종 55품종 3교잡종 608재배종 총 1,290종류
Text

Distinct: 1291
Distinct (%): 100.0%
Missing: 9
Missing (%): 0.7%
Memory size: 10.3 KiB

Length

Max length: 5
Median length: 3
Mean length: 3.1432998
Min length: 1

Characters and Unicode

Total characters: 4058
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
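
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers categories only, since the standard library does not expose script or block names. The sample values are the five sample rows of this variable; the header cell is written with a line break between its two words, which is an assumption based on the report counting one Control character and no Space Separator for this column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Nd', 'Lo') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Assumption: the header cell contains a newline rather than a plain space.
sample = ["일련\n번호", "1", "2", "3", "4"]
counts = category_counts(sample)

# 'Nd' = Decimal Number, 'Lo' = Other Letter, 'Cc' = Control
for code, n in counts.most_common():
    print(code, n)
```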

Unique

Unique: 1291
Unique (%): 100.0%

Sample

1st row: 일련 번호
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4

Value | Count | Frequency (%)
78 | 1 | 0.1%
886 | 1 | 0.1%
865 | 1 | 0.1%
864 | 1 | 0.1%
863 | 1 | 0.1%
862 | 1 | 0.1%
861 | 1 | 0.1%
860 | 1 | 0.1%
859 | 1 | 0.1%
866 | 1 | 0.1%
Other values (1282) | 1282 | 99.2%

Most occurring characters

ValueCountFrequency (%)
1 750
18.5%
2 450
11.1%
7 359
8.8%
8 359
8.8%
3 359
8.8%
4 359
8.8%
5 359
8.8%
6 359
8.8%
9 350
8.6%
0 349
8.6%
Other values (5) 5
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4053
99.9%
Other Letter 4
 
0.1%
Control 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 750
18.5%
2 450
11.1%
7 359
8.9%
8 359
8.9%
3 359
8.9%
4 359
8.9%
5 359
8.9%
6 359
8.9%
9 350
8.6%
0 349
8.6%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4054
99.9%
Hangul 4
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 750
18.5%
2 450
11.1%
7 359
8.9%
8 359
8.9%
3 359
8.9%
4 359
8.9%
5 359
8.9%
6 359
8.9%
9 350
8.6%
0 349
8.6%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4054
99.9%
Hangul 4
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 750
18.5%
2 450
11.1%
7 359
8.9%
8 359
8.9%
3 359
8.9%
4 359
8.9%
5 359
8.9%
6 359
8.9%
9 350
8.6%
0 349
8.6%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 1
Text

Distinct: 1034
Distinct (%): 80.0%
Missing: 7
Missing (%): 0.5%
Memory size: 10.3 KiB

Length

Max length: 54
Median length: 42
Mean length: 24.093581
Min length: 7

Characters and Unicode

Total characters: 31153
Distinct characters: 89
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 995
Unique (%): 77.0%

Sample

1st row: 수목유전자원명
2nd row: 학 명
3rd row: Cycas revoluta Thunb.
4th row: Zamia pumila L.
5th row: Ginkgo spp.

Value | Count | Frequency (%)
spp | 327 | 7.9%
rosa | 139 | 3.3%
japonica | 96 | 2.3%
var | 90 | 2.2%
hibiscus | 88 | 2.1%
camellia | 73 | 1.8%
syriacus | 70 | 1.7%
l | 61 | 1.5%
magnolia | 60 | 1.4%
et | 54 | 1.3%
Other values (1400) | 3093 | 74.5%
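
The word table above lists lowercase tokens with punctuation removed (for example `l` and `spp`). A comparable count can be sketched with the standard library as follows. The tokenizer is an assumption chosen to resemble that output, not the profiler's own code, and the input is limited to scientific names that appear in the report's sample rows:

```python
import re
from collections import Counter


def word_counts(values):
    """Lowercase each value, split it into word tokens, and count the tokens."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"\w+", value.lower()))
    return counts


names = [
    "Cycas revoluta Thunb.",
    "Zamia pumila L.",
    "Ginkgo spp.",
    "Ginkgo biloba L.",
]
counts = word_counts(names)
print(counts.most_common(3))
```

Tokens such as `spp`, `var` and `l` are rank markers and author abbreviations rather than taxa, so they would normally be filtered out before ranking genera or species.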

Most occurring characters

ValueCountFrequency (%)
3325
 
10.7%
a 3207
 
10.3%
i 2286
 
7.3%
s 1987
 
6.4%
e 1687
 
5.4%
r 1585
 
5.1%
o 1561
 
5.0%
n 1487
 
4.8%
u 1408
 
4.5%
l 1225
 
3.9%
Other values (79) 11395
36.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23386
75.1%
Space Separator 3325
 
10.7%
Uppercase Letter 2583
 
8.3%
Other Punctuation 1671
 
5.4%
Open Punctuation 64
 
0.2%
Close Punctuation 64
 
0.2%
Decimal Number 24
 
0.1%
Other Letter 22
 
0.1%
Dash Punctuation 12
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3207
13.7%
i 2286
 
9.8%
s 1987
 
8.5%
e 1687
 
7.2%
r 1585
 
6.8%
o 1561
 
6.7%
n 1487
 
6.4%
u 1408
 
6.0%
l 1225
 
5.2%
p 1181
 
5.1%
Other values (16) 5772
24.7%
Uppercase Letter
ValueCountFrequency (%)
R 285
11.0%
C 256
 
9.9%
M 212
 
8.2%
H 207
 
8.0%
P 206
 
8.0%
S 202
 
7.8%
L 162
 
6.3%
A 139
 
5.4%
T 137
 
5.3%
B 111
 
4.3%
Other values (16) 666
25.8%
Other Letter
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (8) 8
36.4%
Decimal Number
ValueCountFrequency (%)
6 7
29.2%
2 4
16.7%
1 4
16.7%
9 2
 
8.3%
8 2
 
8.3%
7 1
 
4.2%
4 1
 
4.2%
0 1
 
4.2%
5 1
 
4.2%
3 1
 
4.2%
Other Punctuation
ValueCountFrequency (%)
. 993
59.4%
' 639
38.2%
" 38
 
2.3%
* 1
 
0.1%
Space Separator
ValueCountFrequency (%)
3325
100.0%
Open Punctuation
ValueCountFrequency (%)
( 64
100.0%
Close Punctuation
ValueCountFrequency (%)
) 64
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25969
83.4%
Common 5162
 
16.6%
Hangul 22
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3207
 
12.3%
i 2286
 
8.8%
s 1987
 
7.7%
e 1687
 
6.5%
r 1585
 
6.1%
o 1561
 
6.0%
n 1487
 
5.7%
u 1408
 
5.4%
l 1225
 
4.7%
p 1181
 
4.5%
Other values (42) 8355
32.2%
Common
ValueCountFrequency (%)
3325
64.4%
. 993
 
19.2%
' 639
 
12.4%
( 64
 
1.2%
) 64
 
1.2%
" 38
 
0.7%
- 12
 
0.2%
6 7
 
0.1%
2 4
 
0.1%
1 4
 
0.1%
Other values (9) 12
 
0.2%
Hangul
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (8) 8
36.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31131
99.9%
Hangul 22
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3325
 
10.7%
a 3207
 
10.3%
i 2286
 
7.3%
s 1987
 
6.4%
e 1687
 
5.4%
r 1585
 
5.1%
o 1561
 
5.0%
n 1487
 
4.8%
u 1408
 
4.5%
l 1225
 
3.9%
Other values (61) 11373
36.5%
Hangul
ValueCountFrequency (%)
4
18.2%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (8) 8
36.4%

Unnamed: 2
Text

Distinct: 91
Distinct (%): 7.0%
Missing: 9
Missing (%): 0.7%
Memory size: 10.3 KiB

Length

Max length: 6
Median length: 5
Mean length: 2.9969016
Min length: 1

Characters and Unicode

Total characters: 3869
Distinct characters: 145
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 2.2%

Sample

1st row: 과 명
2nd row: 소철
3rd row: 소철
4th row: 은행나무
5th row: 은행나무

Value | Count | Frequency (%)
장미 | 238 | 18.4%
아욱 | 88 | 6.8%
차나무 | 79 | 6.1%
목련 | 63 | 4.9%
진달래 | 60 | 4.6%
미나리아재비 | 60 | 4.6%
단풍나무 | 51 | 3.9%
측백나무 | 44 | 3.4%
인동 | 37 | 2.9%
[missing label] | 37 | 2.9%
Other values (82) | 535 | 41.4%

Most occurring characters

ValueCountFrequency (%)
545
 
14.1%
481
 
12.4%
299
 
7.7%
238
 
6.2%
153
 
4.0%
88
 
2.3%
80
 
2.1%
79
 
2.0%
71
 
1.8%
68
 
1.8%
Other values (135) 1767
45.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3861
99.8%
Space Separator 8
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
545
 
14.1%
481
 
12.5%
299
 
7.7%
238
 
6.2%
153
 
4.0%
88
 
2.3%
80
 
2.1%
79
 
2.0%
71
 
1.8%
68
 
1.8%
Other values (134) 1759
45.6%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3861
99.8%
Common 8
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
545
 
14.1%
481
 
12.5%
299
 
7.7%
238
 
6.2%
153
 
4.0%
88
 
2.3%
80
 
2.1%
79
 
2.0%
71
 
1.8%
68
 
1.8%
Other values (134) 1759
45.6%
Common
ValueCountFrequency (%)
8
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3861
99.8%
ASCII 8
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
545
 
14.1%
481
 
12.5%
299
 
7.7%
238
 
6.2%
153
 
4.0%
88
 
2.3%
80
 
2.1%
79
 
2.0%
71
 
1.8%
68
 
1.8%
Other values (134) 1759
45.6%
ASCII
ValueCountFrequency (%)
8
100.0%

Unnamed: 3
Text

Distinct: 1106
Distinct (%): 85.7%
Missing: 9
Missing (%): 0.7%
Memory size: 10.3 KiB

Length

Max length: 21
Median length: 14
Mean length: 6.0945004
Min length: 1

Characters and Unicode

Total characters: 7868
Distinct characters: 517
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1084
Unique (%): 84.0%

Sample

1st row: 국 명
2nd row: 소철
3rd row: 멕시코소철
4th row: 왕방울은행나무*
5th row: 은행나무

Value | Count | Frequency (%)
동백나무(재배종 | 66 | 5.0%
목련(재배종 | 49 | 3.7%
철쭉류 | 18 | 1.4%
모란(재배종 | 12 | 0.9%
무궁화(품종 | 11 | 0.8%
수국(재배종 | 11 | 0.8%
단풍나무(재배종 | 6 | 0.5%
수국 | 6 | 0.5%
품종 | 5 | 0.4%
무궁화류 | 4 | 0.3%
Other values (1108) | 1141 | 85.9%

Most occurring characters

ValueCountFrequency (%)
590
 
7.5%
497
 
6.3%
* 332
 
4.2%
( 291
 
3.7%
) 291
 
3.7%
- 230
 
2.9%
214
 
2.7%
173
 
2.2%
166
 
2.1%
154
 
2.0%
Other values (507) 4930
62.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6624
84.2%
Other Punctuation 370
 
4.7%
Open Punctuation 291
 
3.7%
Close Punctuation 291
 
3.7%
Dash Punctuation 230
 
2.9%
Space Separator 46
 
0.6%
Decimal Number 16
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
590
 
8.9%
497
 
7.5%
214
 
3.2%
173
 
2.6%
166
 
2.5%
154
 
2.3%
145
 
2.2%
138
 
2.1%
122
 
1.8%
106
 
1.6%
Other values (493) 4319
65.2%
Decimal Number
ValueCountFrequency (%)
2 4
25.0%
9 3
18.8%
1 3
18.8%
8 3
18.8%
4 1
 
6.2%
0 1
 
6.2%
7 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
* 332
89.7%
' 37
 
10.0%
, 1
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 291
100.0%
Close Punctuation
ValueCountFrequency (%)
) 291
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 230
100.0%
Space Separator
ValueCountFrequency (%)
46
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6624
84.2%
Common 1244
 
15.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
590
 
8.9%
497
 
7.5%
214
 
3.2%
173
 
2.6%
166
 
2.5%
154
 
2.3%
145
 
2.2%
138
 
2.1%
122
 
1.8%
106
 
1.6%
Other values (493) 4319
65.2%
Common
ValueCountFrequency (%)
* 332
26.7%
( 291
23.4%
) 291
23.4%
- 230
18.5%
46
 
3.7%
' 37
 
3.0%
2 4
 
0.3%
9 3
 
0.2%
1 3
 
0.2%
8 3
 
0.2%
Other values (4) 4
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6624
84.2%
ASCII 1244
 
15.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
590
 
8.9%
497
 
7.5%
214
 
3.2%
173
 
2.6%
166
 
2.5%
154
 
2.3%
145
 
2.2%
138
 
2.1%
122
 
1.8%
106
 
1.6%
Other values (493) 4319
65.2%
ASCII
ValueCountFrequency (%)
* 332
26.7%
( 291
23.4%
) 291
23.4%
- 230
18.5%
46
 
3.7%
' 37
 
3.0%
2 4
 
0.3%
9 3
 
0.2%
1 3
 
0.2%
8 3
 
0.2%
Other values (4) 4
 
0.3%

Unnamed: 4
Text

Distinct: 133
Distinct (%): 10.3%
Missing: 5
Missing (%): 0.4%
Memory size: 10.3 KiB

Length

Max length: 7
Median length: 2
Mean length: 2.4254826
Min length: 2

Characters and Unicode

Total characters: 3141
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 6.3%

Sample

1st row: 수량
2nd row: 21
3rd row: 1
4th row: 2
5th row: 98

Value | Count | Frequency (%)
1 | 572 | 44.2%
2 | 134 | 10.3%
10 | 65 | 5.0%
3 | 64 | 4.9%
30 | 48 | 3.7%
20 | 40 | 3.1%
5 | 30 | 2.3%
4 | 29 | 2.2%
100 | 22 | 1.7%
28 | 16 | 1.2%
Other values (121) | 275 | 21.2%

Most occurring characters

ValueCountFrequency (%)
1252
39.9%
1 776
24.7%
0 290
 
9.2%
2 260
 
8.3%
3 160
 
5.1%
5 115
 
3.7%
4 90
 
2.9%
6 64
 
2.0%
8 47
 
1.5%
7 36
 
1.1%
Other values (4) 51
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1861
59.2%
Space Separator 1252
39.9%
Other Punctuation 26
 
0.8%
Other Letter 2
 
0.1%
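
The category counts above show 1252 space separators and 26 commas alongside the digits, which suggests that this quantity column is stored as padded text, some of it with thousands separators. A hedged sketch of converting such strings to integers follows; `parse_quantity` is a hypothetical helper, and the comma-separated input is an invented example rather than a value copied from the data:

```python
def parse_quantity(raw):
    """Convert a padded quantity string such as ' 21' to int, else return None."""
    if raw is None:
        return None
    text = raw.strip().replace(",", "")   # drop padding and thousands separators
    return int(text) if text.isdigit() else None


print(parse_quantity(" 21"))      # padded number as seen in the sample rows
print(parse_quantity(" 1,200"))   # invented example with a thousands separator
print(parse_quantity(" 수량"))     # the header cell is not a number
```

Header cells and other non-numeric text come back as `None`, so they can be filtered out before the column is treated as numeric.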

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 776
41.7%
0 290
 
15.6%
2 260
 
14.0%
3 160
 
8.6%
5 115
 
6.2%
4 90
 
4.8%
6 64
 
3.4%
8 47
 
2.5%
7 36
 
1.9%
9 23
 
1.2%
Other Letter
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
1252
100.0%
Other Punctuation
ValueCountFrequency (%)
, 26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3139
99.9%
Hangul 2
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1252
39.9%
1 776
24.7%
0 290
 
9.2%
2 260
 
8.3%
3 160
 
5.1%
5 115
 
3.7%
4 90
 
2.9%
6 64
 
2.0%
8 47
 
1.5%
7 36
 
1.1%
Other values (2) 49
 
1.6%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3139
99.9%
Hangul 2
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1252
39.9%
1 776
24.7%
0 290
 
9.2%
2 260
 
8.3%
3 160
 
5.1%
5 115
 
3.7%
4 90
 
2.9%
6 64
 
2.0%
8 47
 
1.5%
7 36
 
1.1%
Other values (2) 49
 
1.6%
Hangul
ValueCountFrequency (%)
1
50.0%
1
50.0%

Unnamed: 5
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Value | Count
2002년 | 537
2003년 | 177
<NA> | 163
2000년 | 138
1989년 | 63
Other values (17) | 222

Length

Max length: 5
Median length: 5
Mean length: 4.8738462
Min length: 4

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: 보유년월
2nd row: <NA>
3rd row: 1994년
4th row: 1994년
5th row: 2003년

Common Values

Value | Count | Frequency (%)
2002년 | 537 | 41.3%
2003년 | 177 | 13.6%
<NA> | 163 | 12.5%
2000년 | 138 | 10.6%
1989년 | 63 | 4.8%
2014년 | 44 | 3.4%
2001년 | 33 | 2.5%
2004년 | 28 | 2.2%
1994년 | 24 | 1.8%
1996년 | 18 | 1.4%
Other values (12) | 75 | 5.8%
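
The values in this column are years written with the Korean suffix 년 ("year"), mixed with a literal `<NA>` marker and the header text. If the column is to be used numerically, one option is a small parser like the hypothetical `parse_year` below. It is a sketch that assumes every valid entry is exactly four digits followed by 년:

```python
import re

YEAR_PATTERN = re.compile(r"^(\d{4})년$")


def parse_year(raw):
    """Return the year as an int for values like '1994년', otherwise None."""
    if raw is None:
        return None
    match = YEAR_PATTERN.match(raw.strip())
    return int(match.group(1)) if match else None


for value in ["1994년", "2003년", "<NA>", "보유년월"]:
    print(repr(value), "->", parse_year(value))
```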

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
2002년 | 537 | 41.3%
2003년 | 177 | 13.6%
na | 163 | 12.5%
2000년 | 138 | 10.6%
1989년 | 63 | 4.8%
2014년 | 44 | 3.4%
2001년 | 33 | 2.5%
2004년 | 28 | 2.2%
1994년 | 24 | 1.8%
1996년 | 18 | 1.4%
Other values (12) | 75 | 5.8%

Unnamed: 6
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Value | Count
수집 | 1126
자생 | 158
분양 (국립수목원) | 6
자생,수집 | 5
<NA> | 4

Length

Max length: 10
Median length: 2
Mean length: 2.0561538
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 보유경위
2nd row: <NA>
3rd row: 수집
4th row: 수집
5th row: 수집

Common Values

Value | Count | Frequency (%)
수집 | 1126 | 86.6%
자생 | 158 | 12.2%
분양 (국립수목원) | 6 | 0.5%
자생,수집 | 5 | 0.4%
<NA> | 4 | 0.3%
보유경위 | 1 | 0.1%
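
The IMBALANCE flag on this variable reports a score of 74.9%. One formula that reproduces this figure from the six counts listed above is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for that number of classes. The sketch below applies that formula; it is offered as an illustration of the idea and is not taken from the profiler's source:

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max: 0 for a uniform distribution, near 1 for a dominant class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts of the six distinct values of Unnamed: 6, copied from the report.
counts = [1126, 158, 6, 5, 4, 1]
score = imbalance_score(counts)
print(f"Imbalance: {score:.1%}")
```

A single category (수집) accounts for 1126 of 1300 rows, which is what drives the score toward 1.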

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수집 | 1126 | 86.2%
자생 | 158 | 12.1%
분양 | 6 | 0.5%
국립수목원 | 6 | 0.5%
자생,수집 | 5 | 0.4%
na | 4 | 0.3%
보유경위 | 1 | 0.1%

Unnamed: 7
Text

Distinct: 57
Distinct (%): 4.4%
Missing: 4
Missing (%): 0.3%
Memory size: 10.3 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.25
Min length: 2

Characters and Unicode

Total characters: 2916
Distinct characters: 70
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 2.6%

Sample

1st row: 원산지
2nd row: 한국
3rd row: 멕시코
4th row: 중국
5th row: 중국

Value | Count | Frequency (%)
한국 | 706 | 53.6%
일본 | 224 | 17.0%
중국 | 125 | 9.5%
프랑스 | 63 | 4.8%
유럽 | 44 | 3.3%
독일 | 40 | 3.0%
미국 | 26 | 2.0%
북아메리카 | 18 | 1.4%
아시아 | 12 | 0.9%
인도 | 8 | 0.6%
Other values (37) | 52 | 3.9%

Most occurring characters

ValueCountFrequency (%)
860
29.5%
706
24.2%
265
 
9.1%
225
 
7.7%
131
 
4.5%
68
 
2.3%
66
 
2.3%
64
 
2.2%
62
 
2.1%
46
 
1.6%
Other values (60) 423
14.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2869
98.4%
Space Separator 24
 
0.8%
Other Punctuation 23
 
0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
860
30.0%
706
24.6%
265
 
9.2%
225
 
7.8%
131
 
4.6%
68
 
2.4%
66
 
2.3%
64
 
2.2%
62
 
2.2%
46
 
1.6%
Other values (58) 376
13.1%
Space Separator
ValueCountFrequency (%)
24
100.0%
Other Punctuation
ValueCountFrequency (%)
, 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2869
98.4%
Common 47
 
1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
860
30.0%
706
24.6%
265
 
9.2%
225
 
7.8%
131
 
4.6%
68
 
2.4%
66
 
2.3%
64
 
2.2%
62
 
2.2%
46
 
1.6%
Other values (58) 376
13.1%
Common
ValueCountFrequency (%)
24
51.1%
, 23
48.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2869
98.4%
ASCII 47
 
1.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
860
30.0%
706
24.6%
265
 
9.2%
225
 
7.8%
131
 
4.6%
68
 
2.4%
66
 
2.3%
64
 
2.2%
62
 
2.2%
46
 
1.6%
Other values (58) 376
13.1%
ASCII
ValueCountFrequency (%)
24
51.1%
, 23
48.9%

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Value | Count
완주 | 591
천리포수목원 | 195
뉴코리아장미 | 128
전북산림 | 96
미림종묘 | 46
Other values (29) | 244

Length

Max length: 6
Median length: 2
Mean length: 3.4369231
Min length: 2

Unique

Unique: 9
Unique (%): 0.7%

Sample

1st row: 산지
2nd row: <NA>
3rd row: 완주
4th row: 완주
5th row: 전북산림

Common Values

Value | Count | Frequency (%)
완주 | 591 | 45.5%
천리포수목원 | 195 | 15.0%
뉴코리아장미 | 128 | 9.8%
전북산림 | 96 | 7.4%
미림종묘 | 46 | 3.5%
프롬앤 | 44 | 3.4%
프롬엔 | 42 | 3.2%
순창 | 33 | 2.5%
전원생활 | 23 | 1.8%
천보식물원 | 16 | 1.2%
Other values (24) | 86 | 6.6%
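
Two of the common values, 프롬앤 (44) and 프롬엔 (42), differ by a single syllable and may be spelling variants of the same source; the report itself does not confirm this. A simple way to surface such candidates for manual review is pairwise string similarity, sketched here with `difflib` from the standard library. The 0.6 threshold is an arbitrary choice for illustration:

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(labels, threshold=0.6):
    """Return label pairs whose similarity ratio reaches the threshold."""
    pairs = []
    for a, b in combinations(labels, 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs


# The ten most common values of Unnamed: 8, copied from the table above.
labels = ["완주", "천리포수목원", "뉴코리아장미", "전북산림", "미림종묘",
          "프롬앤", "프롬엔", "순창", "전원생활", "천보식물원"]
pairs = similar_pairs(labels)
print(pairs)
```

Any pair flagged this way is only a candidate; whether the two labels refer to the same source has to be checked against the original records.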

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
완주 | 591 | 45.3%
천리포수목원 | 195 | 15.0%
뉴코리아장미 | 128 | 9.8%
전북산림 | 96 | 7.4%
미림종묘 | 46 | 3.5%
프롬앤 | 44 | 3.4%
프롬엔 | 42 | 3.2%
순창 | 33 | 2.5%
전원생활 | 23 | 1.8%
천보식물원 | 16 | 1.2%
Other values (26) | 90 | 6.9%

Unnamed: 9
Text

Distinct: 99
Distinct (%): 7.6%
Missing: 4
Missing (%): 0.3%
Memory size: 10.3 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.2114198
Min length: 2

Characters and Unicode

Total characters: 5458
Distinct characters: 75
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 3.6%

Sample

1st row: 참고사항 (식재위치)
2nd row: 온실
3rd row: 온실
4th row: 테마정원, 표본수원
5th row: 분재원, 약용수원

Value | Count | Frequency (%)
표본수원 | 443 | 29.5%
온실 | 166 | 11.0%
자생 | 162 | 10.8%
장미원 | 117 | 7.8%
무궁화원 | 85 | 5.7%
동백원 | 60 | 4.0%
목련원 | 56 | 3.7%
관상수원 | 51 | 3.4%
묘포장 | 50 | 3.3%
약용수원 | 45 | 3.0%
Other values (27) | 269 | 17.9%

Most occurring characters

ValueCountFrequency (%)
1120
20.5%
650
 
11.9%
444
 
8.1%
444
 
8.1%
221
 
4.0%
, 207
 
3.8%
198
 
3.6%
178
 
3.3%
169
 
3.1%
166
 
3.0%
Other values (65) 1661
30.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5027
92.1%
Space Separator 221
 
4.0%
Other Punctuation 207
 
3.8%
Open Punctuation 1
 
< 0.1%
Control 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1120
22.3%
650
12.9%
444
 
8.8%
444
 
8.8%
198
 
3.9%
178
 
3.5%
169
 
3.4%
166
 
3.3%
162
 
3.2%
118
 
2.3%
Other values (60) 1378
27.4%
Space Separator
ValueCountFrequency (%)
221
100.0%
Other Punctuation
ValueCountFrequency (%)
, 207
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Control
ValueCountFrequency (%)
1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5027
92.1%
Common 431
 
7.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1120
22.3%
650
12.9%
444
 
8.8%
444
 
8.8%
198
 
3.9%
178
 
3.5%
169
 
3.4%
166
 
3.3%
162
 
3.2%
118
 
2.3%
Other values (60) 1378
27.4%
Common
ValueCountFrequency (%)
221
51.3%
, 207
48.0%
( 1
 
0.2%
1
 
0.2%
) 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5027
92.1%
ASCII 431
 
7.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1120
22.3%
650
12.9%
444
 
8.8%
444
 
8.8%
198
 
3.9%
178
 
3.5%
169
 
3.4%
166
 
3.3%
162
 
3.2%
118
 
2.3%
Other values (60) 1378
27.4%
ASCII
ValueCountFrequency (%)
221
51.3%
, 207
48.0%
( 1
 
0.2%
1
 
0.2%
) 1
 
0.2%

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1300
Missing (%): 100.0%
Memory size: 11.6 KiB

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1300
Missing (%): 100.0%
Memory size: 11.6 KiB

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1300
Missing (%): 100.0%
Memory size: 11.6 KiB

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1300
Missing (%): 100.0%
Memory size: 11.6 KiB

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1300
Missing (%): 100.0%
Memory size: 11.6 KiB

Correlations

Variable | Unnamed: 2 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9
Unnamed: 2 | 1.000 | 0.833 | 0.452 | 0.931 | 0.871 | 0.926
Unnamed: 5 | 0.833 | 1.000 | 0.937 | 0.810 | 0.909 | 0.945
Unnamed: 6 | 0.452 | 0.937 | 1.000 | 0.786 | 0.918 | 0.978
Unnamed: 7 | 0.931 | 0.810 | 0.786 | 1.000 | 0.908 | 0.780
Unnamed: 8 | 0.871 | 0.909 | 0.918 | 0.908 | 1.000 | 0.974
Unnamed: 9 | 0.926 | 0.945 | 0.978 | 0.780 | 0.974 | 1.000

Variable | Unnamed: 6 | Unnamed: 5 | Unnamed: 8
Unnamed: 6 | 1.000 | 0.808 | 0.722
Unnamed: 5 | 0.808 | 1.000 | 0.485
Unnamed: 8 | 0.722 | 0.485 | 1.000

Variable | Unnamed: 5 | Unnamed: 6 | Unnamed: 8
Unnamed: 5 | 1.000 | 0.808 | 0.485
Unnamed: 6 | 0.808 | 1.000 | 0.722
Unnamed: 8 | 0.485 | 0.722 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
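
To make the idea of nullity correlation concrete, the sketch below turns each column into a present/missing indicator and computes the Pearson correlation between two indicators. The function name and the tiny example columns are hypothetical; the heatmap in the report may be produced differently.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 'value is present' indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined without variation.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Made-up columns: the first two are missing in the same rows, the third in the others.
col_x = ["a", None, "b", None]
col_y = ["p", None, "q", None]
col_z = [None, "r", None, "s"]
empty = [None, None, None, None]

print(nullity_correlation(col_x, col_y))   # missing together
print(nullity_correlation(col_x, col_z))   # missing in opposite rows
print(nullity_correlation(col_x, empty))   # undefined for a fully empty column
```

Under this definition, columns that are missing in every row, such as Unnamed: 10 to Unnamed: 14 here, have no defined nullity correlation with any other column.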

Sample

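First rows

Index | 대아수목원 보유 식물목록(목본) 90과 250속 532종 1아종 91변종 55품종 3교잡종 608재배종 총 1,290종류 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14
0 | 일련 번호 | 수목유전자원명 | <NA> | <NA> | 수량 | 보유년월 | 보유경위 | 원산지 | 산지 | 참고사항 (식재위치) | <NA> | <NA> | <NA> | <NA> | <NA>
1 | <NA> | 학 명 | 과 명 | 국 명 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 1 | Cycas revoluta Thunb. | 소철 | 소철 | 21 | 1994년 | 수집 | 한국 | 완주 | 온실 | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 2 | Zamia pumila L. | 소철 | 멕시코소철 | 1 | 1994년 | 수집 | 멕시코 | 완주 | 온실 | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 3 | Ginkgo spp. | 은행나무 | 왕방울은행나무* | 2 | 2003년 | 수집 | 중국 | 전북산림 | 테마정원, 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 4 | Ginkgo biloba L. | 은행나무 | 은행나무 | 98 | 1989년 | 수집 | 중국 | 완주 | 분재원, 약용수원 | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 5 | Cephalotaxus koreana Nakai | 주목 | 개비자나무 | 1 | <NA> | 자생 | 한국 | 완주 | 자생 | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 6 | Cephalotaxus koreana var. nana Nak. | 주목 | 눈개비자나무 | 1 | <NA> | 자생 | 한국 | 완주 | 자생 | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 7 | Taxus spp. | 주목 | 황금주목(팔방성)* | 1 | 2003년 | 수집 | 한국 | 완주 | 온실, 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 8 | Taxus baccata 'Aurea' | 주목 | 황금주목 | 3 | 2002년 | 수집 | 한국 | 완주 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>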
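
The first two sample rows look like a two-level header that was read as data, while the sheet title ended up as the name of the first column. One way to recover usable column names is to take the sub-header where one exists and the top header otherwise. The sketch below does this for the ten populated columns; `merge_header_rows` is a hypothetical helper, and the English glosses in the comments are approximate translations.

```python
def merge_header_rows(top, sub):
    """Prefer the second-level header cell; fall back to the first-level one."""
    return [s if s is not None else t for t, s in zip(top, sub)]


# Row 0 and row 1 of the sample, limited to the ten populated columns.
top = ["일련 번호", "수목유전자원명", None, None, "수량",
       "보유년월", "보유경위", "원산지", "산지", "참고사항 (식재위치)"]
sub = [None, "학 명", "과 명", "국 명", None,
       None, None, None, None, None]

columns = merge_header_rows(top, sub)
# Approximate glosses: serial number, scientific name, family name, Korean name,
# quantity, acquisition date, acquisition method, origin, source,
# notes (planting location)
print(columns)
```

With the header rows promoted to column names and removed from the data, the header strings would no longer appear as values in the per-variable statistics.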
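
Last rows

Index | 대아수목원 보유 식물목록(목본) 90과 250속 532종 1아종 91변종 55품종 3교잡종 608재배종 총 1,290종류 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14
1290 | 1284 | Hydrangea mcarophylla "Alpengluchen" | 범의귀 | 수국 품종 '알펜그루헨' | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1291 | 1285 | Lonicera nitida "Lemon Beauty" | 인동 | 동청괴불나무 '레몬 뷰티' | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1292 | 1286 | Lonicera nitida "Maigun" | 인동 | 동청괴불나무 '마이준' | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1293 | 1287 | Lonicera pileata | 인동 | 필레아타괴불나무 | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1294 | 1288 | Rosa cannia | 장미 | 장미속류 | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1295 | 1289 | Rosa rubignosa | 장미 | 장미속류 | 10 | 2014년 | 수집 | 유럽 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1296 | 1290 | Stephandra incisa "Crispa" | 장미 | 국수나무 '크리스파' | 10 | 2014년 | 수집 | 유럽, 아시아 | 프롬앤 | 표본수원 | <NA> | <NA> | <NA> | <NA> | <NA>
1297 | <NA> | * 미동정 식물로 학명이 불분명함. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1298 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1299 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>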

Duplicate rows

Most frequently occurring

Index | 대아수목원 보유 식물목록(목본) 90과 250속 532종 1아종 91변종 55품종 3교잡종 608재배종 총 1,290종류 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 5 | 2012년 | 분양 (국립수목원) | 한국 | 제주도 | 산림생태체험관 | 2
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
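
The duplicate groups shown here can be found by counting identical rows. A minimal sketch with the standard library follows; the rows are shortened, made-up stand-ins for the real 15-column records:

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once, mapped to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Made-up three-column rows; None stands for a missing cell.
rows = [
    ("1", "Cycas revoluta Thunb.", "소철"),
    (None, None, None),
    ("2", "Zamia pumila L.", "소철"),
    (None, None, None),
]
dupes = duplicate_rows(rows)
print(dupes)
```

Fully empty rows, like the second duplicate group in the report, usually come from blank spreadsheet lines and can be dropped without losing information.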