Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 30
Missing cells (%): 16.7%
Duplicate rows: 1
Duplicate rows (%): 3.3%
Total size in memory: 1.6 KiB
Average record size in memory: 53.4 B
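
The percentages above follow from the raw counts. The short sketch below redoes that arithmetic; the constant and helper names are illustrative, and the per-column missing counts are taken from the Alerts section of this report.

```python
# Figures copied from this report; the names are illustrative only.
N_VARIABLES = 6
N_OBSERVATIONS = 30
MISSING_PER_COLUMN = {"보유량": 5, "비고": 25}  # from the Alerts section
DUPLICATE_ROWS = 1


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = N_VARIABLES * N_OBSERVATIONS               # every cell in the table
missing_cells = sum(MISSING_PER_COLUMN.values())         # cells with no value
missing_pct = percent(missing_cells, total_cells)        # share of empty cells
duplicate_pct = percent(DUPLICATE_ROWS, N_OBSERVATIONS)  # share of repeated rows

print(missing_cells, missing_pct, duplicate_pct)
```

Note that the missing-cell percentage is taken over all 180 cells, while the duplicate percentage is taken over the 30 rows.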

Variable types

Categorical: 1
Text: 3
Numeric: 1
DateTime: 1

Dataset

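Description: Data on plants and animals registered as endangered species living in Jeju Special Self-Governing Province (제주특별자치도), providing information such as category (구분), Korean name (국명), scientific name (학명), and quantity held (보유량).
Author: Jeju Special Self-Governing Province (제주특별자치도)
URL: https://www.data.go.kr/data/15045333/fileData.do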

Alerts

데이터기준일자 (data reference date) has constant value "2023-12-31" [Constant]
Dataset has 1 (3.3%) duplicate rows [Duplicates]
보유량 (quantity held) has 5 (16.7%) missing values [Missing]
비고 (remarks) has 25 (83.3%) missing values [Missing]
보유량 (quantity held) has 1 (3.3%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-04 08:23:43.140291
Analysis finished: 2024-05-04 08:23:45.405118
Duration: 2.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
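
A report of this kind can be regenerated roughly as sketched below. This is only a minimal outline, not the configuration actually used: the CSV path, the encoding default, the report title, and the output file name are assumptions, and the function name `build_report` is hypothetical. The dataset metadata is copied from the Dataset section above.

```python
# Dataset metadata as shown in the Dataset section of this report.
DATASET_METADATA = {
    "description": "Endangered species holdings in Jeju Special Self-Governing Province",
    "author": "제주특별자치도",
    "url": "https://www.data.go.kr/data/15045333/fileData.do",
}


def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile the CSV and write an HTML report (hypothetical helper).

    The third-party imports are done lazily so that this module can be
    loaded even where pandas / ydata-profiling are not installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["데이터기준일자"])
    report = ProfileReport(df, title="Profiling report", dataset=DATASET_METADATA)
    report.to_file(output_path)
    return report
```

Calling `build_report("jeju_endangered_species.csv")` (a placeholder file name) would write `report.html` next to the script.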

Variables

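구분 (category)
Categorical

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value | Count
멸종위기 야생식물 2급 (endangered wild plant, Class II) | 15
멸종위기 야생식물 1급 (endangered wild plant, Class I) | 10
지정외 보유 수종 (non-designated species held) | 5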

Length

Max length: 12
Median length: 12
Mean length: 11.5
Min length: 9
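
These length figures can be checked from the three category labels and their counts (listed under Common Values). The sketch below assumes length means the number of Unicode code points, spaces included, which is what Python's `len` returns for these strings.

```python
from statistics import mean, median

# Category labels and row counts as listed under Common Values.
CATEGORY_COUNTS = {
    "멸종위기 야생식물 2급": 15,
    "멸종위기 야생식물 1급": 10,
    "지정외 보유 수종": 5,
}

# One length per row: repeat each label's length as often as the label occurs.
lengths = [len(label) for label, n in CATEGORY_COUNTS.items() for _ in range(n)]

max_length = max(lengths)
min_length = min(lengths)
median_length = median(lengths)
mean_length = mean(lengths)

print(max_length, median_length, mean_length, min_length)
```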

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 멸종위기 야생식물 1급
2nd row: 멸종위기 야생식물 1급
3rd row: 멸종위기 야생식물 1급
4th row: 멸종위기 야생식물 1급
5th row: 멸종위기 야생식물 1급

Common Values

Value | Count | Frequency (%)
멸종위기 야생식물 2급 | 15 | 50.0%
멸종위기 야생식물 1급 | 10 | 33.3%
지정외 보유 수종 | 5 | 16.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
멸종위기 | 25 | 27.8%
야생식물 | 25 | 27.8%
2급 | 15 | 16.7%
1급 | 10 | 11.1%
지정외 | 5 | 5.6%
보유 | 5 | 5.6%
수종 | 5 | 5.6%
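
The word-level counts are what one gets by splitting each category label on whitespace and weighting every token by the number of rows carrying that label. The sketch below reproduces the table under that assumption; the exact tokenizer the profiling tool uses is not shown in this report.

```python
from collections import Counter

# Category labels and row counts as listed under Common Values.
CATEGORY_COUNTS = {
    "멸종위기 야생식물 2급": 15,
    "멸종위기 야생식물 1급": 10,
    "지정외 보유 수종": 5,
}

word_counts = Counter()
for label, n_rows in CATEGORY_COUNTS.items():
    for token in label.split():        # whitespace tokenization (assumed)
        word_counts[token] += n_rows   # each row contributes the token once

total_words = sum(word_counts.values())
word_share = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(word, count, word_share[word])
```

The frequencies are shares of all 90 tokens, not of the 30 rows, which is why 멸종위기 shows 27.8% although it appears in 25 of 30 rows.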

국명 (Korean name)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 6
Median length: 5
Mean length: 3.5666667
Min length: 2

Characters and Unicode

Total characters: 107
Distinct characters: 65
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 93.3%
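
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. For this column 초령목 appears twice, so there are 29 distinct values but only 28 unique ones. A small sketch using just the last five names from the Sample section (not the full column) illustrates the difference.

```python
from collections import Counter

# Names from rows 25-29 of the Sample section; a subset, not the whole column.
NAMES = ["비자란", "백양더부살이", "백운란", "초령목", "초령목"]

frequency = Counter(NAMES)
distinct = len(frequency)                                # different values present
unique = sum(1 for n in frequency.values() if n == 1)    # values seen exactly once

print(distinct, unique)
```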

Sample

1st row: 금자란
2nd row: 나도풍란
3rd row: 만년콩
4th row: 암매
5th row: 제주고사리삼

Value | Count | Frequency (%)
초령목 | 2 | 6.7%
솔잎란 | 1 | 3.3%
백운란 | 1 | 3.3%
백양더부살이 | 1 | 3.3%
비자란 | 1 | 3.3%
파초일엽 | 1 | 3.3%
콩짜개란 | 1 | 3.3%
차걸이란 | 1 | 3.3%
지네발란 | 1 | 3.3%
죽절초 | 1 | 3.3%
Other values (19) | 19 | 63.3%

Most occurring characters

Value | Count | Frequency (%)
[?] | 13 | 12.1%
[?] | 6 | 5.6%
[?] | 4 | 3.7%
[?] | 4 | 3.7%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 2 | 1.9%
Other values (55) | 63 | 58.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 107 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 13 | 12.1%
[?] | 6 | 5.6%
[?] | 4 | 3.7%
[?] | 4 | 3.7%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 2 | 1.9%
Other values (55) | 63 | 58.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 107 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 13 | 12.1%
[?] | 6 | 5.6%
[?] | 4 | 3.7%
[?] | 4 | 3.7%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 2 | 1.9%
Other values (55) | 63 | 58.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 107 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 13 | 12.1%
[?] | 6 | 5.6%
[?] | 4 | 3.7%
[?] | 4 | 3.7%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 3 | 2.8%
[?] | 2 | 1.9%
Other values (55) | 63 | 58.9%

학명 (scientific name)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 31
Median length: 25
Mean length: 20.166667
Min length: 13

Characters and Unicode

Total characters: 605
Distinct characters: 42
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 93.3%

Sample

1st row: Gastrochilus matsuran
2nd row: Phalaenopsis japonica
3rd row: Euchresta japonica
4th row: Diapensia japponica var.obovata
5th row: Mankyua chejuense

Value | Count | Frequency (%)
japonica | 3 | 4.9%
cymbidium | 3 | 4.9%
magnolia | 2 | 3.3%
compressa | 2 | 3.3%
gastrochilus | 2 | 3.3%
japonicus | 2 | 3.3%
oberonia | 1 | 1.6%
septentrionalis | 1 | 1.6%
utricularia | 1 | 1.6%
uliginosa | 1 | 1.6%
Other values (43) | 43 | 70.5%

Most occurring characters

Value | Count | Frequency (%)
a | 73 | 12.1%
i | 55 | 9.1%
o | 43 | 7.1%
n | 42 | 6.9%
s | 38 | 6.3%
u | 36 | 6.0%
(space) | 33 | 5.5%
r | 33 | 5.5%
e | 31 | 5.1%
l | 27 | 4.5%
Other values (32) | 194 | 32.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 542 | 89.6%
Space Separator | 33 | 5.5%
Uppercase Letter | 29 | 4.8%
Other Punctuation | 1 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 73 | 13.5%
i | 55 | 10.1%
o | 43 | 7.9%
n | 42 | 7.7%
s | 38 | 7.0%
u | 36 | 6.6%
r | 33 | 6.1%
e | 31 | 5.7%
l | 27 | 5.0%
m | 27 | 5.0%
Other values (15) | 137 | 25.3%

Uppercase Letter
Value | Count | Frequency (%)
P | 4 | 13.8%
C | 4 | 13.8%
M | 3 | 10.3%
D | 2 | 6.9%
G | 2 | 6.9%
L | 2 | 6.9%
O | 2 | 6.9%
S | 2 | 6.9%
B | 2 | 6.9%
U | 1 | 3.4%
Other values (5) | 5 | 17.2%

Space Separator
Value | Count | Frequency (%)
(space) | 33 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 571 | 94.4%
Common | 34 | 5.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 73 | 12.8%
i | 55 | 9.6%
o | 43 | 7.5%
n | 42 | 7.4%
s | 38 | 6.7%
u | 36 | 6.3%
r | 33 | 5.8%
e | 31 | 5.4%
l | 27 | 4.7%
m | 27 | 4.7%
Other values (30) | 166 | 29.1%

Common
Value | Count | Frequency (%)
(space) | 33 | 97.1%
. | 1 | 2.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 605 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 73 | 12.1%
i | 55 | 9.1%
o | 43 | 7.1%
n | 42 | 6.9%
s | 38 | 6.3%
u | 36 | 6.0%
(space) | 33 | 5.5%
r | 33 | 5.5%
e | 31 | 5.1%
l | 27 | 4.5%
Other values (32) | 194 | 32.1%

보유량 (quantity held)
Real number (ℝ)

MISSING, ZEROS

Distinct: 21
Distinct (%): 84.0%
Missing: 5
Missing (%): 16.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 151.4
Minimum: 0
Maximum: 1300
Zeros: 1
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5th percentile: 1.2
Q1: 4
Median: 40
Q3: 220
95th percentile: 442
Maximum: 1300
Range: 1300
Interquartile range (IQR): 216
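
The quantiles above can be reproduced from the 25 non-missing 보유량 values. The list in the sketch is reconstructed from this report's extreme-value tables plus the sample row holding 45, and it sums to the reported 3785. Linear interpolation between neighbouring order statistics is an assumption about the method; it does match every figure listed. The helper name `percentile` is illustrative.

```python
# 25 non-missing 보유량 values, reconstructed from the report's tables.
HOLDINGS = [0, 1, 2, 3, 3, 4, 4, 5, 5, 8, 15, 30, 40, 45, 55,
            100, 100, 215, 220, 240, 250, 280, 410, 450, 1300]


def percentile(sorted_values, q):
    """Percentile for q in [0, 1] using linear interpolation (assumed method)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


data = sorted(HOLDINGS)
p5 = percentile(data, 0.05)
q1 = percentile(data, 0.25)
med = percentile(data, 0.50)
q3 = percentile(data, 0.75)
p95 = percentile(data, 0.95)
iqr = q3 - q1                      # spread of the middle half of the data
value_range = data[-1] - data[0]   # maximum minus minimum

print(p5, q1, med, q3, p95, iqr, value_range)
```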

Descriptive statistics

Standard deviation: 274.51624
Coefficient of variation (CV): 1.8131852
Kurtosis: 13.164724
Mean: 151.4
Median Absolute Deviation (MAD): 38
Skewness: 3.3345702
Sum: 3785
Variance: 75359.167
Monotonicity: Not monotonic
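
Several of these descriptive statistics are simple functions of one another. The sketch below recomputes them from the 25 non-missing values (reconstructed from this report's extreme-value tables and sample rows). It uses the sample (n - 1) variance, which matches the reported figure; skewness and kurtosis are left out.

```python
from statistics import mean, median, stdev, variance

# 25 non-missing 보유량 values, reconstructed from the report's tables.
HOLDINGS = [0, 1, 2, 3, 3, 4, 4, 5, 5, 8, 15, 30, 40, 45, 55,
            100, 100, 215, 220, 240, 250, 280, 410, 450, 1300]

total = sum(HOLDINGS)
avg = mean(HOLDINGS)
var = variance(HOLDINGS)   # sample variance, divisor n - 1
sd = stdev(HOLDINGS)       # square root of the sample variance
cv = sd / avg              # coefficient of variation

# Median absolute deviation: median distance from the median.
center = median(HOLDINGS)
mad = median([abs(v - center) for v in HOLDINGS])

print(total, avg, round(var, 3), round(sd, 5), round(cv, 7), mad)
```

A CV above 1 together with a MAD of 38 against a standard deviation of about 275 reflects how strongly the single value 1300 dominates the spread.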
Histogram with fixed size bins (bins=21)
Value | Count | Frequency (%)
100 | 2 | 6.7%
3 | 2 | 6.7%
4 | 2 | 6.7%
5 | 2 | 6.7%
8 | 1 | 3.3%
410 | 1 | 3.3%
40 | 1 | 3.3%
250 | 1 | 3.3%
0 | 1 | 3.3%
1300 | 1 | 3.3%
Other values (11) | 11 | 36.7%
(Missing) | 5 | 16.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | 3.3%
1 | 1 | 3.3%
2 | 1 | 3.3%
3 | 2 | 6.7%
4 | 2 | 6.7%
5 | 2 | 6.7%
8 | 1 | 3.3%
15 | 1 | 3.3%
30 | 1 | 3.3%
40 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
1300 | 1 | 3.3%
450 | 1 | 3.3%
410 | 1 | 3.3%
280 | 1 | 3.3%
250 | 1 | 3.3%
240 | 1 | 3.3%
220 | 1 | 3.3%
215 | 1 | 3.3%
100 | 2 | 6.7%
55 | 1 | 3.3%

비고 (remarks)
Text

MISSING

Distinct: 3
Distinct (%): 60.0%
Missing: 25
Missing (%): 83.3%
Memory size: 372.0 B

Length

Max length: 10
Median length: 2
Mean length: 4
Min length: 2

Characters and Unicode

Total characters: 20
Distinct characters: 13
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 40.0%

Sample

1st row: 난실 (orchid room)
2nd row: 난실
3rd row: 난실
4th row: 종자보관 (seed storage)
5th row: 배양실 35(종자) (culture room, 35 (seeds))

Value | Count | Frequency (%)
난실 | 3 | 50.0%
종자보관 | 1 | 16.7%
배양실 | 1 | 16.7%
35(종자 | 1 | 16.7%

Most occurring characters

Value | Count | Frequency (%)
실 | 4 | 20.0%
난 | 3 | 15.0%
종 | 2 | 10.0%
자 | 2 | 10.0%
보 | 1 | 5.0%
관 | 1 | 5.0%
배 | 1 | 5.0%
양 | 1 | 5.0%
(space) | 1 | 5.0%
3 | 1 | 5.0%
Other values (3) | 3 | 15.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15 | 75.0%
Decimal Number | 2 | 10.0%
Space Separator | 1 | 5.0%
Open Punctuation | 1 | 5.0%
Close Punctuation | 1 | 5.0%
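
Because only five rows of 비고 are non-missing, and all five appear in the Sample list, the category counts can be verified directly with Python's standard `unicodedata` module. In the sketch, `Lo`, `Nd`, `Zs`, `Ps`, and `Pe` are the Unicode general-category codes for Other Letter, Decimal Number, Space Separator, Open Punctuation, and Close Punctuation.

```python
import unicodedata
from collections import Counter

# The five non-missing 비고 values, as listed in the Sample section.
REMARKS = ["난실", "난실", "난실", "종자보관", "배양실 35(종자)"]

# Count every character, then count characters per Unicode general category.
char_counts = Counter(ch for text in REMARKS for ch in text)
category_counts = Counter(
    unicodedata.category(ch) for text in REMARKS for ch in text
)

total_characters = sum(char_counts.values())
distinct_characters = len(char_counts)

print(total_characters, distinct_characters, dict(category_counts))
```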

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
실 | 4 | 26.7%
난 | 3 | 20.0%
종 | 2 | 13.3%
자 | 2 | 13.3%
보 | 1 | 6.7%
관 | 1 | 6.7%
배 | 1 | 6.7%
양 | 1 | 6.7%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 50.0%
5 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15 | 75.0%
Common | 5 | 25.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
실 | 4 | 26.7%
난 | 3 | 20.0%
종 | 2 | 13.3%
자 | 2 | 13.3%
보 | 1 | 6.7%
관 | 1 | 6.7%
배 | 1 | 6.7%
양 | 1 | 6.7%

Common
Value | Count | Frequency (%)
(space) | 1 | 20.0%
3 | 1 | 20.0%
5 | 1 | 20.0%
( | 1 | 20.0%
) | 1 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15 | 75.0%
ASCII | 5 | 25.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
실 | 4 | 26.7%
난 | 3 | 20.0%
종 | 2 | 13.3%
자 | 2 | 13.3%
보 | 1 | 6.7%
관 | 1 | 6.7%
배 | 1 | 6.7%
양 | 1 | 6.7%

ASCII
Value | Count | Frequency (%)
(space) | 1 | 20.0%
3 | 1 | 20.0%
5 | 1 | 20.0%
( | 1 | 20.0%
) | 1 | 20.0%

데이터기준일자 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2023-12-31 00:00:00
Maximum: 2023-12-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 구분 | 국명 | 학명 | 보유량 | 비고
구분 | 1.000 | 1.000 | 1.000 | 0.263 | 1.000
국명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
학명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
보유량 | 0.263 | 1.000 | 1.000 | 1.000 | NaN
비고 | 1.000 | 1.000 | 1.000 | NaN | 1.000

Variable | 보유량 | 구분
보유량 | 1.000 | 0.177
구분 | 0.177 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | 구분 | 국명 | 학명 | 보유량 | 비고 | 데이터기준일자
0 | 멸종위기 야생식물 1급 | 금자란 | Gastrochilus matsuran | 1 | <NA> | 2023-12-31
1 | 멸종위기 야생식물 1급 | 나도풍란 | Phalaenopsis japonica | 5 | <NA> | 2023-12-31
2 | 멸종위기 야생식물 1급 | 만년콩 | Euchresta japonica | 2 | <NA> | 2023-12-31
3 | 멸종위기 야생식물 1급 | 암매 | Diapensia japponica var.obovata | <NA> | <NA> | 2023-12-31
4 | 멸종위기 야생식물 1급 | 제주고사리삼 | Mankyua chejuense | 30 | <NA> | 2023-12-31
5 | 멸종위기 야생식물 1급 | 죽백란 | Cymbidium lancifolium | <NA> | 난실 | 2023-12-31
6 | 멸종위기 야생식물 1급 | 탐라란 | Gastrochilus japonicus | 3 | <NA> | 2023-12-31
7 | 멸종위기 야생식물 1급 | 풍란 | Neofinetia falcata | 4 | 난실 | 2023-12-31
8 | 멸종위기 야생식물 1급 | 한라솜다리 | Leontopodium coreanum | 45 | <NA> | 2023-12-31
9 | 멸종위기 야생식물 1급 | 한란 | Cymbidium kanran | 15 | 난실 | 2023-12-31

Last rows

Index | 구분 | 국명 | 학명 | 보유량 | 비고 | 데이터기준일자
20 | 멸종위기 야생식물 2급 | 죽절초 | Sarcandra glabra | 1300 | <NA> | 2023-12-31
21 | 멸종위기 야생식물 2급 | 지네발란 | Pelatantheria scolopendrifolia | 3 | <NA> | 2023-12-31
22 | 멸종위기 야생식물 2급 | 차걸이란 | Oberonia japonica | 0 | <NA> | 2023-12-31
23 | 멸종위기 야생식물 2급 | 콩짜개란 | Bulbophyllum drymoglossum | 250 | <NA> | 2023-12-31
24 | 멸종위기 야생식물 2급 | 파초일엽 | Asplenium antiquum | 40 | <NA> | 2023-12-31
25 | 지정외 보유 수종 | 비자란 | hrixspermum japonicum | 410 | <NA> | 2023-12-31
26 | 지정외 보유 수종 | 백양더부살이 | Orobanche filicicola | <NA> | 배양실 35(종자) | 2023-12-31
27 | 지정외 보유 수종 | 백운란 | Kuhlhasseltia nakaiana | 4 | <NA> | 2023-12-31
28 | 지정외 보유 수종 | 초령목 | Magnolia compressa | 100 | <NA> | 2023-12-31
29 | 지정외 보유 수종 | 초령목 | Magnolia compressa | 100 | <NA> | 2023-12-31

Duplicate rows

Most frequently occurring

Index | 구분 | 국명 | 학명 | 보유량 | 비고 | 데이터기준일자 | # duplicates
0 | 지정외 보유 수종 | 초령목 | Magnolia compressa | 100 | <NA> | 2023-12-31 | 2
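
A duplicate here means a row whose every column matches another row. The sketch below shows one way to find such rows, using only rows 25-29 from the Sample section; `None` stands in for `<NA>`, and the helper name `find_duplicates` is illustrative rather than part of the profiling tool.

```python
from collections import Counter

# Rows 25-29 from the Sample section; None stands in for <NA>.
ROWS = [
    ("지정외 보유 수종", "비자란", "hrixspermum japonicum", 410, None, "2023-12-31"),
    ("지정외 보유 수종", "백양더부살이", "Orobanche filicicola", None, "배양실 35(종자)", "2023-12-31"),
    ("지정외 보유 수종", "백운란", "Kuhlhasseltia nakaiana", 4, None, "2023-12-31"),
    ("지정외 보유 수종", "초령목", "Magnolia compressa", 100, None, "2023-12-31"),
    ("지정외 보유 수종", "초령목", "Magnolia compressa", 100, None, "2023-12-31"),
]


def find_duplicates(rows):
    """Map each fully repeated row to the number of times it occurs."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


duplicates = find_duplicates(ROWS)
# Rows beyond the first occurrence, i.e. what de-duplication would drop.
extra_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row[1], row[2], n)
```

The table reports 2 occurrences of the 초령목 row, while the overview counts 1 duplicate row, which is the single extra copy beyond the first occurrence.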