Overview

Dataset statistics

Number of variables: 4
Number of observations: 1672
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 55.6 KiB
Average record size in memory: 34.1 B

Variable types

Numeric: 2
Text: 2

Dataset

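Description: Plant specimen holdings of Daea Arboretum, Jeonbuk Special Self-Governing Province (columns: 학명 = scientific name, 국명 = Korean name, 표본수 = specimen count). The providing institution can no longer generate this data.
Author: Jeonbuk Special Self-Governing Province (전북특별자치도)
URL: https://www.data.go.kr/data/15055682/fileData.do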

Alerts

번호 (record number) has unique values (alert type: Unique)

Reproduction

Analysis started: 2024-03-15 00:46:44.956728
Analysis finished: 2024-03-15 00:46:47.085858
Duration: 2.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
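A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path, the output file name, the report title and the CP949 encoding are all assumptions, and the third-party packages `pandas` and `ydata-profiling` must be installed.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module still loads where the third-party
    # packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the CSV export is CP949-encoded; change this if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Hypothetical title; the original report's settings live in config.json.
    report = ProfileReport(df, title="Daea Arboretum plant specimens")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("specimens.csv")` (a hypothetical file name) would write `report.html` next to the script.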

Variables

번호 (record number)
Real number (ℝ), UNIQUE

Distinct: 1672
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 836.5
Minimum: 1
Maximum: 1672
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 84.55
Q1: 418.75
Median: 836.5
Q3: 1254.25
95-th percentile: 1588.45
Maximum: 1672
Range: 1671
Interquartile range (IQR): 835.5

Descriptive statistics

Standard deviation: 482.80914
Coefficient of variation (CV): 0.57717769
Kurtosis: -1.2
Mean: 836.5
Median Absolute Deviation (MAD): 418
Skewness: 0
Sum: 1398628
Variance: 233104.67
Monotonicity: Strictly increasing
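The figures for 번호 are consistent with a plain running index. The sketch below assumes the column is exactly the sequence 1..1672 (suggested by Distinct = 1672, Minimum = 1, Maximum = 1672 and strict monotonicity) and recomputes the reported statistics with the standard library. The `percentile` helper is an illustrative linear-interpolation implementation, not the profiler's own code.

```python
import math
import statistics

# Assumption: 번호 is exactly 1, 2, ..., 1672.
values = list(range(1, 1673))


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) on pre-sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
p5, q1 = percentile(values, 0.05), percentile(values, 0.25)
q3, p95 = percentile(values, 0.75), percentile(values, 0.95)
iqr = q3 - q1

print(total, mean, round(variance, 2), round(std, 5), round(cv, 8))
print(round(p5, 2), q1, median, q3, round(p95, 2), iqr, mad)
```

The reported variance matches the sample formula with an n - 1 denominator; the population formula would give a smaller value.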
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.1%
1151 | 1 | 0.1%
1123 | 1 | 0.1%
1122 | 1 | 0.1%
1121 | 1 | 0.1%
1120 | 1 | 0.1%
1119 | 1 | 0.1%
1118 | 1 | 0.1%
1117 | 1 | 0.1%
1116 | 1 | 0.1%
Other values (1662) | 1662 | 99.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1672 | 1 | 0.1%
1671 | 1 | 0.1%
1670 | 1 | 0.1%
1669 | 1 | 0.1%
1668 | 1 | 0.1%
1667 | 1 | 0.1%
1666 | 1 | 0.1%
1665 | 1 | 0.1%
1664 | 1 | 0.1%
1663 | 1 | 0.1%

학명 (scientific name)
Text

Distinct: 1472
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Length

Max length: 62
Median length: 49
Mean length: 25.995813
Min length: 2

Characters and Unicode

Total characters: 43465
Distinct characters: 75
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
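
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in two sample values from this report. The two-letter codes correspond to the category names used in the tables: `Ll` Lowercase Letter, `Lu` Uppercase Letter, `Lo` Other Letter, `Zs` Space Separator, `Ps`/`Pe` Open/Close Punctuation, `Po` Other Punctuation. The helper name `category_counts` is illustrative, and this is not necessarily how the profiler computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of 학명 (Latin script plus punctuation and spaces).
latin = category_counts("Selaginella involvens (Sw.) Spring")
# First sample row of 국명 (Hangul syllables are all "Other Letter").
hangul = category_counts("바위손")

print(dict(latin))
print(dict(hangul))
```

This is why the Latin-script column is dominated by Lowercase Letter while the Hangul column is dominated by Other Letter.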

Unique

Unique: 1430
Unique (%): 85.5%

Sample

1st row: Selaginella involvens (Sw.) Spring
2nd row: Selaginella tamariscina (Beauv.) Spring
3rd row: Equisetum arvense L.
4th row: Equisetum hyemale L.
5th row: Botrychium ternatum (Thunb.) Sw.

Words
Value | Count | Frequency (%)
spp | 252 | 4.3%
l | 176 | 3.0%
var | 161 | 2.8%
rosa | 127 | 2.2%
japonica | 114 | 2.0%
et | 99 | 1.7%
thunb | 82 | 1.4%
hibiscus | 82 | 1.4%
nakai | 82 | 1.4%
syriacus | 69 | 1.2%
Other values (2212) | 4552 | 78.5%
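
Entries such as "l", "thunb" and "var" are author abbreviations and rank markers that remain once names are lowercased and punctuation is dropped. The sketch below shows one way such tokens could arise, applied to the five sample rows only. The tokenization rule (lowercase, then keep runs of letters) is an assumption for illustration; the report does not state the profiler's actual rule.

```python
import re
from collections import Counter

# The five sample rows of 학명 shown in this report.
sample_rows = [
    "Selaginella involvens (Sw.) Spring",
    "Selaginella tamariscina (Beauv.) Spring",
    "Equisetum arvense L.",
    "Equisetum hyemale L.",
    "Botrychium ternatum (Thunb.) Sw.",
]


def tokenize(text: str) -> list[str]:
    """Assumed rule: lowercase the text and keep runs of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


counts = Counter(tok for row in sample_rows for tok in tokenize(row))
print(counts.most_common(5))
```

On these rows the authority "L." becomes the token "l" and "(Thunb.)" becomes "thunb", the same forms that appear in the table.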

Most occurring characters

Value | Count | Frequency (%)
a | 4550 | 10.5%
(space) | 4499 | 10.4%
i | 3426 | 7.9%
s | 2583 | 5.9%
e | 2437 | 5.6%
r | 2240 | 5.2%
o | 2036 | 4.7%
n | 1992 | 4.6%
u | 1930 | 4.4%
l | 1760 | 4.0%
Other values (65) | 16012 | 36.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 32819 | 75.5%
Space Separator | 4499 | 10.4%
Uppercase Letter | 3673 | 8.5%
Other Punctuation | 2031 | 4.7%
Close Punctuation | 202 | 0.5%
Open Punctuation | 202 | 0.5%
Dash Punctuation | 16 | < 0.1%
Decimal Number | 10 | < 0.1%
Other Letter | 9 | < 0.1%
Modifier Symbol | 4 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 4550 | 13.9%
i | 3426 | 10.4%
s | 2583 | 7.9%
e | 2437 | 7.4%
r | 2240 | 6.8%
o | 2036 | 6.2%
n | 1992 | 6.1%
u | 1930 | 5.9%
l | 1760 | 5.4%
t | 1369 | 4.2%
Other values (16) | 8496 | 25.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 341 | 9.3%
L | 338 | 9.2%
S | 318 | 8.7%
P | 299 | 8.1%
M | 281 | 7.7%
H | 265 | 7.2%
R | 256 | 7.0%
A | 216 | 5.9%
T | 215 | 5.9%
B | 168 | 4.6%
Other values (16) | 976 | 26.6%

Other Letter
Value | Count | Frequency (%)
망 | 1 | 11.1%
고 | 1 | 11.1%
트 | 1 | 11.1%
럼 | 1 | 11.1%
펫 | 1 | 11.1%
호 | 1 | 11.1%
주 | 1 | 11.1%
매 | 1 | 11.1%
화 | 1 | 11.1%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 30.0%
8 | 2 | 20.0%
2 | 2 | 20.0%
4 | 1 | 10.0%
0 | 1 | 10.0%
9 | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1557 | 76.7%
' | 473 | 23.3%
& | 1 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 4499 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 202 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 202 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 16 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 36492 | 84.0%
Common | 6964 | 16.0%
Hangul | 9 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 4550 | 12.5%
i | 3426 | 9.4%
s | 2583 | 7.1%
e | 2437 | 6.7%
r | 2240 | 6.1%
o | 2036 | 5.6%
n | 1992 | 5.5%
u | 1930 | 5.3%
l | 1760 | 4.8%
t | 1369 | 3.8%
Other values (42) | 12169 | 33.3%

Common
Value | Count | Frequency (%)
(space) | 4499 | 64.6%
. | 1557 | 22.4%
' | 473 | 6.8%
) | 202 | 2.9%
( | 202 | 2.9%
- | 16 | 0.2%
` | 4 | 0.1%
1 | 3 | < 0.1%
8 | 2 | < 0.1%
2 | 2 | < 0.1%
Other values (4) | 4 | 0.1%

Hangul
Value | Count | Frequency (%)
망 | 1 | 11.1%
고 | 1 | 11.1%
트 | 1 | 11.1%
럼 | 1 | 11.1%
펫 | 1 | 11.1%
호 | 1 | 11.1%
주 | 1 | 11.1%
매 | 1 | 11.1%
화 | 1 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43456 | > 99.9%
Hangul | 9 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 4550 | 10.5%
(space) | 4499 | 10.4%
i | 3426 | 7.9%
s | 2583 | 5.9%
e | 2437 | 5.6%
r | 2240 | 5.2%
o | 2036 | 4.7%
n | 1992 | 4.6%
u | 1930 | 4.4%
l | 1760 | 4.1%
Other values (56) | 16003 | 36.8%

Hangul
Value | Count | Frequency (%)
망 | 1 | 11.1%
고 | 1 | 11.1%
트 | 1 | 11.1%
럼 | 1 | 11.1%
펫 | 1 | 11.1%
호 | 1 | 11.1%
주 | 1 | 11.1%
매 | 1 | 11.1%
화 | 1 | 11.1%

국명 (Korean name)
Text

Distinct: 1599
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 13.2 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.2446172
Min length: 1

Characters and Unicode

Total characters: 8769
Distinct characters: 620
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1595
Unique (%): 95.4%

Sample

1st row: 바위손
2nd row: 부처손
3rd row: 쇠뜨기
4th row: 속새
5th row: 고사리삼

Words
Value | Count | Frequency (%)
동백나무(재배종 | 55 | 3.3%
무궁화(재배종 | 11 | 0.7%
목련(재배종 | 9 | 0.5%
82 | 2 | 0.1%
아디안툼 | 2 | 0.1%
드로세라류 | 2 | 0.1%
클레로덴드룸(서양누리장 | 1 | 0.1%
층꽃나무 | 1 | 0.1%
새비나무 | 1 | 0.1%
털누리장나무 | 1 | 0.1%
Other values (1596) | 1596 | 94.9%

Most occurring characters

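Value | Count | Frequency (%)
(character not preserved) | 492 | 5.6%
(character not preserved) | 446 | 5.1%
- | 224 | 2.6%
(character not preserved) | 223 | 2.5%
( | 221 | 2.5%
) | 221 | 2.5%
(character not preserved) | 170 | 1.9%
(character not preserved) | 145 | 1.7%
(character not preserved) | 138 | 1.6%
(character not preserved) | 136 | 1.6%
Other values (610) | 6353 | 72.4%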

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8055 | 91.9%
Dash Punctuation | 224 | 2.6%
Open Punctuation | 221 | 2.5%
Close Punctuation | 221 | 2.5%
Decimal Number | 22 | 0.3%
Space Separator | 13 | 0.1%
Lowercase Letter | 7 | 0.1%
Other Punctuation | 5 | 0.1%
Uppercase Letter | 1 | < 0.1%

Most frequent character per category

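Other Letter
Value | Count | Frequency (%)
(character not preserved) | 492 | 6.1%
(character not preserved) | 446 | 5.5%
(character not preserved) | 223 | 2.8%
(character not preserved) | 170 | 2.1%
(character not preserved) | 145 | 1.8%
(character not preserved) | 138 | 1.7%
(character not preserved) | 136 | 1.7%
(character not preserved) | 104 | 1.3%
(character not preserved) | 104 | 1.3%
(character not preserved) | 103 | 1.3%
Other values (586) | 5994 | 74.4%

Decimal Number
Value | Count | Frequency (%)
2 | 6 | 27.3%
1 | 4 | 18.2%
8 | 3 | 13.6%
7 | 2 | 9.1%
9 | 2 | 9.1%
4 | 2 | 9.1%
5 | 1 | 4.5%
6 | 1 | 4.5%
3 | 1 | 4.5%

Lowercase Letter
Value | Count | Frequency (%)
a | 1 | 14.3%
r | 1 | 14.3%
e | 1 | 14.3%
c | 1 | 14.3%
i | 1 | 14.3%
n | 1 | 14.3%
o | 1 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 60.0%
. | 1 | 20.0%
' | 1 | 20.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 224 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 221 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 221 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 13 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 1 | 100.0%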

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8055 | 91.9%
Common | 706 | 8.1%
Latin | 8 | 0.1%

Most frequent character per script

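Hangul
Value | Count | Frequency (%)
(character not preserved) | 492 | 6.1%
(character not preserved) | 446 | 5.5%
(character not preserved) | 223 | 2.8%
(character not preserved) | 170 | 2.1%
(character not preserved) | 145 | 1.8%
(character not preserved) | 138 | 1.7%
(character not preserved) | 136 | 1.7%
(character not preserved) | 104 | 1.3%
(character not preserved) | 104 | 1.3%
(character not preserved) | 103 | 1.3%
Other values (586) | 5994 | 74.4%

Common
Value | Count | Frequency (%)
- | 224 | 31.7%
( | 221 | 31.3%
) | 221 | 31.3%
(space) | 13 | 1.8%
2 | 6 | 0.8%
1 | 4 | 0.6%
8 | 3 | 0.4%
, | 3 | 0.4%
7 | 2 | 0.3%
9 | 2 | 0.3%
Other values (6) | 7 | 1.0%

Latin
Value | Count | Frequency (%)
L | 1 | 12.5%
a | 1 | 12.5%
r | 1 | 12.5%
e | 1 | 12.5%
c | 1 | 12.5%
i | 1 | 12.5%
n | 1 | 12.5%
o | 1 | 12.5%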

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8055 | 91.9%
ASCII | 714 | 8.1%

Most frequent character per block

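Hangul
Value | Count | Frequency (%)
(character not preserved) | 492 | 6.1%
(character not preserved) | 446 | 5.5%
(character not preserved) | 223 | 2.8%
(character not preserved) | 170 | 2.1%
(character not preserved) | 145 | 1.8%
(character not preserved) | 138 | 1.7%
(character not preserved) | 136 | 1.7%
(character not preserved) | 104 | 1.3%
(character not preserved) | 104 | 1.3%
(character not preserved) | 103 | 1.3%
Other values (586) | 5994 | 74.4%

ASCII
Value | Count | Frequency (%)
- | 224 | 31.4%
( | 221 | 31.0%
) | 221 | 31.0%
(space) | 13 | 1.8%
2 | 6 | 0.8%
1 | 4 | 0.6%
8 | 3 | 0.4%
, | 3 | 0.4%
7 | 2 | 0.3%
9 | 2 | 0.3%
Other values (14) | 15 | 2.1%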

표본수 (specimen count)
Real number (ℝ)

Distinct: 15
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.6172249
Minimum: 1
Maximum: 15
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 8
Maximum: 15
Range: 14
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.2944556
Coefficient of variation (CV): 0.63431379
Kurtosis: 2.7363741
Mean: 3.6172249
Median Absolute Deviation (MAD): 1
Skewness: 1.4123652
Sum: 6048
Variance: 5.2645266
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values
Value | Count | Frequency (%)
2 | 462 | 27.6%
5 | 319 | 19.1%
1 | 236 | 14.1%
4 | 235 | 14.1%
3 | 207 | 12.4%
6 | 61 | 3.6%
8 | 37 | 2.2%
7 | 35 | 2.1%
9 | 34 | 2.0%
10 | 20 | 1.2%
Other values (5) | 26 | 1.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 236 | 14.1%
2 | 462 | 27.6%
3 | 207 | 12.4%
4 | 235 | 14.1%
5 | 319 | 19.1%
6 | 61 | 3.6%
7 | 35 | 2.1%
8 | 37 | 2.2%
9 | 34 | 2.0%
10 | 20 | 1.2%

Maximum 10 values
Value | Count | Frequency (%)
15 | 2 | 0.1%
14 | 4 | 0.2%
13 | 3 | 0.2%
12 | 7 | 0.4%
11 | 10 | 0.6%
10 | 20 | 1.2%
9 | 34 | 2.0%
8 | 37 | 2.2%
7 | 35 | 2.1%
6 | 61 | 3.6%
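
Because 표본수 takes only 15 distinct values, the value counts listed for it are enough to recompute the summary statistics without the raw data. The sketch below copies the counts from this report into a dictionary and derives the sum, mean, sample variance, standard deviation and median. The helper `value_at` is an illustrative name, and the n - 1 denominator is an assumption that happens to reproduce the reported variance.

```python
import math

# Value -> count pairs copied from the 표본수 frequency listings in this report.
freq = {1: 236, 2: 462, 3: 207, 4: 235, 5: 319, 6: 61, 7: 35, 8: 37,
        9: 34, 10: 20, 11: 10, 12: 7, 13: 3, 14: 4, 15: 2}

n = sum(freq.values())                          # number of observations
total = sum(v * c for v, c in freq.items())     # sum of all specimen counts
mean = total / n
# Sample variance (n - 1 denominator).
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)


def value_at(index: int) -> int:
    """Return the value at a 0-based position in the sorted, expanded data."""
    seen = 0
    for v in sorted(freq):
        seen += freq[v]
        if index < seen:
            return v
    raise IndexError(index)


# n is even, so the median averages the two middle observations.
median = (value_at(n // 2 - 1) + value_at(n // 2)) / 2

print(n, total, round(mean, 7), round(variance, 7), round(std, 7), median)
```

The counts sum to the 1672 observations, which confirms that the three listings together cover every value of the column.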

Interactions

[Figures: four pairwise interaction plots for 번호 and 표본수]

Correlations

[Figure: correlation heatmap]
Variable | 번호 | 표본수
번호 | 1.000 | 0.453
표본수 | 0.453 | 1.000

[Figure: correlation heatmap]
Variable | 번호 | 표본수
번호 | 1.000 | -0.093
표본수 | -0.093 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 번호 | 학명 | 국명 | 표본수
0 | 1 | Selaginella involvens (Sw.) Spring | 바위손 | 5
1 | 2 | Selaginella tamariscina (Beauv.) Spring | 부처손 | 5
2 | 3 | Equisetum arvense L. | 쇠뜨기 | 5
3 | 4 | Equisetum hyemale L. | 속새 | 9
4 | 5 | Botrychium ternatum (Thunb.) Sw. | 고사리삼 | 4
5 | 6 | Osmunda japonica Thunb. | 고비 | 1
6 | 7 | Adiantum caudatum L. | 아디안툼 코우다툼 | 3
7 | 8 | Adiantum hispidulum Swartz | 아디안툼 히스피둘룸 | 1
8 | 9 | Dennstaedtia wilfordii (Moore) Christ. | 황고사리 | 2
9 | 10 | Platycerium hilli Moore | 힐리박쥐난 | 2

Last rows
Index | 번호 | 학명 | 국명 | 표본수
1662 | 1663 | Cyathea fauriei (Christ) Copel. | 해고 | 3
1663 | 1664 | Sarracenia purpurea L. | 사라세니아퍼프레아 | 1
1664 | 1665 | Sarracenia rubra | 사라세니아루브라 | 3
1665 | 1666 | Callistemon lanceolatus (Sm.) DC. | 병솔꽃나무 | 3
1666 | 1667 | Psidium cattleianum Sabine | 스트로베리구아바 | 4
1667 | 1668 | Tibouchina semidecandra Cogn. | 티보치나 | 5
1668 | 1669 | Davidia involucrata | 손수건나무 | 5
1669 | 1670 | 망고 | 망고 | 3
1670 | 1671 | 트럼펫 | 트럼펫 | 2
1671 | 1672 | 호주매화 | 호주매화 | 3