Overview

Dataset statistics

Number of variables: 7
Number of observations: 701
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 0.4%
Total size in memory: 38.5 KiB
Average record size in memory: 56.2 B
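The two derived figures in this block follow from the raw counts. The short check below recomputes them as a sanity sketch; the variable names are illustrative, and it assumes 1 KiB = 1024 bytes.

```python
# Raw figures copied from the "Dataset statistics" block.
n_rows = 701
n_duplicate_rows = 3
total_size_kib = 38.5

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size in bytes (assumption: 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures the report shows (0.4% and 56.2 B).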

Variable types

Text: 2
Categorical: 5

Dataset

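Description: Data on rare, endangered, endemic, and naturalized plants of the Baekdudaegan region, providing plant names, scientific names, and classification information.
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15093672/fileData.do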

Alerts

Dataset has 3 (0.4%) duplicate rows [Duplicates]
1급멸종위기식물 분류 is highly overall correlated with 희귀식물 분류 [High correlation]
희귀식물 분류 is highly overall correlated with 1급멸종위기식물 분류 and 2 other fields [High correlation]
특산식물 분류 is highly overall correlated with 희귀식물 분류 and 1 other fields [High correlation]
2급멸종위기식물 분류 is highly overall correlated with 희귀식물 분류 and 1 other fields [High correlation]
1급멸종위기식물 분류 is highly imbalanced (91.0%) [Imbalance]
2급멸종위기식물 분류 is highly imbalanced (57.4%) [Imbalance]
귀화식물 분류 is highly imbalanced (53.6%) [Imbalance]
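The report does not say how the imbalance percentages are computed. They are consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k the number of classes. The sketch below is a reconstruction under that assumption, using the class counts listed in this report; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts taken from the "Common Values" tables of this report.
class_counts = {
    "1급멸종위기식물 분류": [693, 8],
    "2급멸종위기식물 분류": [640, 61],
    "귀화식물 분류": [632, 69],
}
scores = {name: imbalance_score(c) for name, c in class_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

A score of 0% means perfectly balanced classes; values near 100% mean one class dominates.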

Reproduction

Analysis started: 2023-12-12 22:00:08.612759
Analysis finished: 2023-12-12 22:00:09.167959
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
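A report of this kind can be regenerated with ydata-profiling's `ProfileReport`. The following is a minimal sketch, not the configuration actually used here: the CSV path, the output path, the report title, and the UTF-8 encoding are all assumptions.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch; paths are hypothetical)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is UTF-8 encoded; adjust `encoding` if it was saved differently.
    df = pd.read_csv(csv_path, encoding="utf-8")
    profile = ProfileReport(df, title="Overview")  # the title is a placeholder
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("plants.csv")` (a hypothetical file name) would write `report.html` next to the script.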

Variables

식물국문명 (Korean plant name)
Text

Distinct: 671
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.4835949
Min length: 2

Characters and Unicode

Total characters: 3143
Distinct characters: 381
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
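The category names used in this report (Other Letter, Other Punctuation, Space Separator, Open Punctuation, Close Punctuation) are Unicode general categories, abbreviated Lo, Po, Zs, Ps, and Pe. As a small illustration, the sketch below counts general categories with Python's standard `unicodedata` module over the five sample names listed in this section; the helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# The five sample values shown for this variable.
names = ["너도밤나무", "산비장이", "개회향", "백부자", "참좁쌀풀"]
counts = category_counts(names)
print(counts)  # every Hangul syllable falls in "Lo" (Other Letter)
```

The standard library does not expose script or block properties, so those breakdowns would need a third-party package.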

Unique

Unique: 643
Unique (%): 91.7%

Sample

1st row: 너도밤나무
2nd row: 산비장이
3rd row: 개회향
4th row: 백부자
5th row: 참좁쌀풀
Value | Count | Frequency (%)
흰바디나물 | 3 | 0.4%
섬초롱꽃 | 3 | 0.4%
흰바디 | 3 | 0.4%
섬쥐똥나무 | 2 | 0.3%
섬백리향 | 2 | 0.3%
흰솔나리 | 2 | 0.3%
고산구슬봉이 | 2 | 0.3%
털긴잎갈퀴 | 2 | 0.3%
흰등괴불 | 2 | 0.3%
기생꽃 | 2 | 0.3%
Other values (661) | 680 | 96.7%

Most occurring characters

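Value | Count | Frequency (%)
(not preserved) | 140 | 4.5%
(not preserved) | 121 | 3.8%
(not preserved) | 103 | 3.3%
(not preserved) | 71 | 2.3%
(not preserved) | 68 | 2.2%
(not preserved) | 68 | 2.2%
(not preserved) | 60 | 1.9%
(not preserved) | 55 | 1.7%
(not preserved) | 46 | 1.5%
(not preserved) | 45 | 1.4%
Other values (371) | 2366 | 75.3%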

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3135 | 99.7%
Other Punctuation | 4 | 0.1%
Space Separator | 2 | 0.1%
Close Punctuation | 1 | < 0.1%
Open Punctuation | 1 | < 0.1%

Most frequent character per category

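Other Letter
Value | Count | Frequency (%)
(not preserved) | 140 | 4.5%
(not preserved) | 121 | 3.9%
(not preserved) | 103 | 3.3%
(not preserved) | 71 | 2.3%
(not preserved) | 68 | 2.2%
(not preserved) | 68 | 2.2%
(not preserved) | 60 | 1.9%
(not preserved) | 55 | 1.8%
(not preserved) | 46 | 1.5%
(not preserved) | 45 | 1.4%
Other values (367) | 2358 | 75.2%

Other Punctuation
Value | Count | Frequency (%)
' | 4 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%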

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3135 | 99.7%
Common | 8 | 0.3%

Most frequent character per script

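Hangul
Value | Count | Frequency (%)
(not preserved) | 140 | 4.5%
(not preserved) | 121 | 3.9%
(not preserved) | 103 | 3.3%
(not preserved) | 71 | 2.3%
(not preserved) | 68 | 2.2%
(not preserved) | 68 | 2.2%
(not preserved) | 60 | 1.9%
(not preserved) | 55 | 1.8%
(not preserved) | 46 | 1.5%
(not preserved) | 45 | 1.4%
Other values (367) | 2358 | 75.2%

Common
Value | Count | Frequency (%)
' | 4 | 50.0%
(space) | 2 | 25.0%
) | 1 | 12.5%
( | 1 | 12.5%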

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3135 | 99.7%
ASCII | 8 | 0.3%

Most frequent character per block

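Hangul
Value | Count | Frequency (%)
(not preserved) | 140 | 4.5%
(not preserved) | 121 | 3.9%
(not preserved) | 103 | 3.3%
(not preserved) | 71 | 2.3%
(not preserved) | 68 | 2.2%
(not preserved) | 68 | 2.2%
(not preserved) | 60 | 1.9%
(not preserved) | 55 | 1.8%
(not preserved) | 46 | 1.5%
(not preserved) | 45 | 1.4%
Other values (367) | 2358 | 75.2%

ASCII
Value | Count | Frequency (%)
' | 4 | 50.0%
(space) | 2 | 25.0%
) | 1 | 12.5%
( | 1 | 12.5%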

식물학명 (scientific name)
Text

Distinct: 610
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 71
Median length: 53
Mean length: 31.673324
Min length: 12

Characters and Unicode

Total characters: 22203
Distinct characters: 60
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 535
Unique (%): 76.3%

Sample

1st row: Fagus engleriana Seemen ex Diels
2nd row: Serratula coronata var. insularis (Iljin) Kitam. for. insularis
3rd row: Ligusticum tachiroei (Franch. & Sav.) M.Hiroe & Constance
4th row: Aconitum koreanum R.Raymund
5th row: Lysimachia coreana Nakai
Value | Count | Frequency (%)
nakai | 259 | 9.3%
var | 123 | 4.4%
(blank) | 76 | 2.7%
l | 72 | 2.6%
h.lev | 39 | 1.4%
maxim | 39 | 1.4%
ohwi | 36 | 1.3%
ex | 32 | 1.1%
makino | 22 | 0.8%
saussurea | 21 | 0.8%
Other values (1065) | 2073 | 74.2%

Most occurring characters

Value | Count | Frequency (%)
a | 2738 | 12.3%
(space) | 2091 | 9.4%
i | 1981 | 8.9%
e | 1298 | 5.8%
s | 1118 | 5.0%
r | 1102 | 5.0%
o | 1055 | 4.8%
n | 1026 | 4.6%
u | 928 | 4.2%
. | 833 | 3.8%
Other values (50) | 8033 | 36.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 16843 | 75.9%
Space Separator | 2091 | 9.4%
Uppercase Letter | 1992 | 9.0%
Other Punctuation | 917 | 4.1%
Open Punctuation | 176 | 0.8%
Close Punctuation | 176 | 0.8%
Dash Punctuation | 8 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 2738 | 16.3%
i | 1981 | 11.8%
e | 1298 | 7.7%
s | 1118 | 6.6%
r | 1102 | 6.5%
o | 1055 | 6.3%
n | 1026 | 6.1%
u | 928 | 5.5%
l | 805 | 4.8%
t | 666 | 4.0%
Other values (16) | 4126 | 24.5%

Uppercase Letter
Value | Count | Frequency (%)
N | 276 | 13.9%
L | 204 | 10.2%
S | 172 | 8.6%
C | 133 | 6.7%
M | 122 | 6.1%
H | 121 | 6.1%
P | 115 | 5.8%
K | 110 | 5.5%
A | 107 | 5.4%
T | 90 | 4.5%
Other values (16) | 542 | 27.2%

Other Punctuation
Value | Count | Frequency (%)
. | 833 | 90.8%
& | 76 | 8.3%
, | 4 | 0.4%
' | 4 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 2091 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 176 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 176 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 18835 | 84.8%
Common | 3368 | 15.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 2738 | 14.5%
i | 1981 | 10.5%
e | 1298 | 6.9%
s | 1118 | 5.9%
r | 1102 | 5.9%
o | 1055 | 5.6%
n | 1026 | 5.4%
u | 928 | 4.9%
l | 805 | 4.3%
t | 666 | 3.5%
Other values (42) | 6118 | 32.5%

Common
Value | Count | Frequency (%)
(space) | 2091 | 62.1%
. | 833 | 24.7%
( | 176 | 5.2%
) | 176 | 5.2%
& | 76 | 2.3%
- | 8 | 0.2%
, | 4 | 0.1%
' | 4 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 22203 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 2738 | 12.3%
(space) | 2091 | 9.4%
i | 1981 | 8.9%
e | 1298 | 5.8%
s | 1118 | 5.0%
r | 1102 | 5.0%
o | 1055 | 4.8%
n | 1026 | 4.6%
u | 928 | 4.2%
. | 833 | 3.8%
Other values (50) | 8033 | 36.2%

희귀식물 분류 (rare plant classification)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

<NA>: 443
희귀식물: 258

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 희귀식물
2nd row: <NA>
3rd row: 희귀식물
4th row: 희귀식물
5th row: 희귀식물

Common Values

Value | Count | Frequency (%)
<NA> | 443 | 63.2%
희귀식물 | 258 | 36.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 443 | 63.2%
희귀식물 | 258 | 36.8%

1급멸종위기식물 분류 (Class I endangered plant classification)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

<NA>: 693
1급멸종위기식물: 8

Length

Max length: 8
Median length: 4
Mean length: 4.0456491
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 693 | 98.9%
1급멸종위기식물 | 8 | 1.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 693 | 98.9%
1급멸종위기식물 | 8 | 1.1%

2급멸종위기식물 분류 (Class II endangered plant classification)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

<NA>: 640
2급멸종위기식물: 61

Length

Max length: 8
Median length: 4
Mean length: 4.3480742
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 2급멸종위기식물
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 640 | 91.3%
2급멸종위기식물 | 61 | 8.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 640 | 91.3%
2급멸종위기식물 | 61 | 8.7%

특산식물 분류 (endemic plant classification)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

특산식물: 421
<NA>: 280

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 특산식물
2nd row: 특산식물
3rd row: <NA>
4th row: <NA>
5th row: 특산식물

Common Values

Value | Count | Frequency (%)
특산식물 | 421 | 60.1%
<NA> | 280 | 39.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
특산식물 | 421 | 60.1%
na | 280 | 39.9%

귀화식물 분류 (naturalized plant classification)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

<NA>: 632
귀화식물: 69

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 632 | 90.2%
귀화식물 | 69 | 9.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 632 | 90.2%
귀화식물 | 69 | 9.8%

Correlations

Variable | 1급멸종위기식물 분류 | 귀화식물 분류 | 희귀식물 분류 | 특산식물 분류 | 2급멸종위기식물 분류
1급멸종위기식물 분류 | 1.000 | NaN | 1.000 | NaN | NaN
귀화식물 분류 | NaN | 1.000 | NaN | NaN | NaN
희귀식물 분류 | 1.000 | NaN | 1.000 | 1.000 | 1.000
특산식물 분류 | NaN | NaN | 1.000 | 1.000 | 1.000
2급멸종위기식물 분류 | NaN | NaN | 1.000 | 1.000 | 1.000

Variable | 희귀식물 분류 | 1급멸종위기식물 분류 | 2급멸종위기식물 분류 | 특산식물 분류 | 귀화식물 분류
희귀식물 분류 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
1급멸종위기식물 분류 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
2급멸종위기식물 분류 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000
특산식물 분류 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000
귀화식물 분류 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
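
The report does not state which association measure produced these matrices. For pairs of categorical variables a common choice is Cramér's V, which is 0 for independent variables and 1 when one variable fully determines the other. The sketch below implements the plain (not bias-corrected) version on hypothetical toy columns, so it is an illustration of the measure rather than a reproduction of the values above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    k = min(len(x_totals), len(y_totals))
    if n == 0 or k < 2:
        return float("nan")  # undefined when either variable is constant
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical toy columns, not rows from the dataset.
a = ["x", "x", "y", "y"]
b = ["p", "p", "q", "q"]  # fully determined by `a`
c = ["p", "q", "p", "q"]  # independent of `a`
print(cramers_v(a, b), cramers_v(a, c))
```

The NaN branch mirrors the undefined case: the statistic cannot be computed when a column has only one observed value.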

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]
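
Every variable reports 0 missing cells, yet `<NA>` is the most common value in four of the five categorical columns and is counted with a length of 4 characters. That suggests the placeholder was stored as literal text. The sketch below, assuming that reading is correct, turns the placeholder back into a true missing value before profiling; the helper name is hypothetical and the rows are abbreviated from the Sample table.

```python
PLACEHOLDER = "<NA>"  # assumption: missing entries were stored as this literal string

def to_missing(row, placeholder=PLACEHOLDER):
    """Return a copy of `row` with placeholder strings replaced by None."""
    return {key: (None if value == placeholder else value) for key, value in row.items()}

# Two rows abbreviated to three columns each.
rows = [
    {"식물국문명": "너도밤나무", "희귀식물 분류": "희귀식물", "귀화식물 분류": "<NA>"},
    {"식물국문명": "기생초", "희귀식물 분류": "<NA>", "귀화식물 분류": "귀화식물"},
]
cleaned = [to_missing(r) for r in rows]
missing_cells = sum(value is None for r in cleaned for value in r.values())
print(missing_cells)
```

With real missing values in place, the missing-cell counts and nullity plots would reflect the gaps instead of reporting a complete dataset.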

Sample

First rows

Index | 식물국문명 | 식물학명 | 희귀식물 분류 | 1급멸종위기식물 분류 | 2급멸종위기식물 분류 | 특산식물 분류 | 귀화식물 분류
0 | 너도밤나무 | Fagus engleriana Seemen ex Diels | 희귀식물 | <NA> | <NA> | 특산식물 | <NA>
1 | 산비장이 | Serratula coronata var. insularis (Iljin) Kitam. for. insularis | <NA> | <NA> | <NA> | 특산식물 | <NA>
2 | 개회향 | Ligusticum tachiroei (Franch. & Sav.) M.Hiroe & Constance | 희귀식물 | <NA> | <NA> | <NA> | <NA>
3 | 백부자 | Aconitum koreanum R.Raymund | 희귀식물 | <NA> | 2급멸종위기식물 | <NA> | <NA>
4 | 참좁쌀풀 | Lysimachia coreana Nakai | 희귀식물 | <NA> | <NA> | 특산식물 | <NA>
5 | 섬피나무 | Tilia insularis Nakai | <NA> | <NA> | <NA> | 특산식물 | <NA>
6 | 섬거복꼬리 | Boehmeria taquetii Nakai | <NA> | <NA> | <NA> | 특산식물 | <NA>
7 | 기생초 | Coreopsis tinctoria Nutt. | <NA> | <NA> | <NA> | <NA> | 귀화식물
8 | 흰정향나무 | Syringa patula var. kamibayshii for. lactea (Nakai) K.Kim | <NA> | <NA> | <NA> | 특산식물 | <NA>
9 | 꽃잔대 | Adenophora koreana Kitam. | <NA> | <NA> | <NA> | 특산식물 | <NA>

Last rows

Index | 식물국문명 | 식물학명 | 희귀식물 분류 | 1급멸종위기식물 분류 | 2급멸종위기식물 분류 | 특산식물 분류 | 귀화식물 분류
691 | 두루미천남성 | Arisaema heterophyllum Blume | 희귀식물 | <NA> | <NA> | <NA> | <NA>
692 | 애기이삭사초 | Carex ochrochlamis Ohwi | <NA> | <NA> | <NA> | 특산식물 | <NA>
693 | 갈퀴아재비 | Asperula lasiantha Nakai | <NA> | <NA> | <NA> | 특산식물 | <NA>
694 | 박달목서 | Osmanthus insularis Koidz. | 희귀식물 | <NA> | 2급멸종위기식물 | <NA> | <NA>
695 | 흰등괴불 | Lonicera maximowiczii var. latifolia (Ohwi) Hara | <NA> | <NA> | <NA> | 특산식물 | <NA>
696 | 구상나무 | Abies koreana Wilson | 희귀식물 | <NA> | <NA> | 특산식물 | <NA>
697 | 주걱개망초 | Erigeron strigosus Muhl. | <NA> | <NA> | <NA> | <NA> | 귀화식물
698 | 한라분취 | Saussurea maximowiczii var. triceps (H.Lev. & Vaniot) Kitam. | <NA> | <NA> | <NA> | 특산식물 | <NA>
699 | 왕제비꽃 | Viola websteri Hemsl. | 희귀식물 | <NA> | 2급멸종위기식물 | <NA> | <NA>
700 | 강활 | ostericum praeteritum | <NA> | <NA> | <NA> | 특산식물 | <NA>

Duplicate rows

Most frequently occurring

Index | 식물국문명 | 식물학명 | 희귀식물 분류 | 1급멸종위기식물 분류 | 2급멸종위기식물 분류 | 특산식물 분류 | 귀화식물 분류 | # duplicates
0 | 개서나무 | Carpinus tschonoskii Maxim. var. tschonoskii | 희귀식물 | <NA> | <NA> | <NA> | <NA> | 2
1 | 서나무 | Carpinus laxiflora (Siebold & Zucc.) Blume var. laxiflora | <NA> | <NA> | <NA> | 특산식물 | <NA> | 2
2 | 털긴잎갈퀴 | Galium boreale var. koreanum Nakai | <NA> | <NA> | <NA> | 특산식물 | <NA> | 2
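
The table is consistent with three distinct records that each appear twice. As a minimal sketch of how such repeats can be found, the snippet below counts identical records with `collections.Counter`. The records are abbreviated to two fields, and the particular mix of repeated and unique rows is illustrative rather than taken from the file.

```python
from collections import Counter

# Abbreviated (Korean name, scientific name) pairs; the repetition pattern is illustrative.
records = [
    ("개서나무", "Carpinus tschonoskii Maxim. var. tschonoskii"),
    ("서나무", "Carpinus laxiflora (Siebold & Zucc.) Blume var. laxiflora"),
    ("개서나무", "Carpinus tschonoskii Maxim. var. tschonoskii"),
    ("구상나무", "Abies koreana Wilson"),
]

occurrences = Counter(records)
# Keep only records seen more than once, with their occurrence counts.
duplicates = {rec: n for rec, n in occurrences.items() if n > 1}
# Rows that would be dropped if each record were kept once.
extra_rows = sum(n - 1 for n in duplicates.values())
print(duplicates, extra_rows)
```

Comparing whole records, as here, only catches exact repeats; entries that differ in any field are treated as distinct.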