Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 1833 |
Missing cells | 480 |
Missing cells (%) | 6.5% |
Duplicate rows | 1 |
Duplicate rows (%) | 0.1% |
Total size in memory | 57.4 KiB |
Average record size in memory | 32.1 B |
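The percentages and the average record size in the table above follow directly from the raw counts. A minimal sketch of that arithmetic, assuming "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 1833
missing_cells = 480
duplicate_rows = 1
total_size_kib = 57.4

# Assumption: the missing-cell percentage is relative to every cell in the frame.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the 6.5%, 0.1% and 32.1 B shown in the table.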
Variable types
Unsupported | 2 |
---|---|
Text | 2 |
Dataset
Description | Daea Arboretum plant specimen holdings (대아수목원식물표본보유현황) |
---|---|
Author | Jeollabuk-do (전라북도) |
URL | https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202191 |
Dataset has 1 (0.1%) duplicate rows | Duplicates |
Unnamed: 1 has 159 (8.7%) missing values | Missing |
Unnamed: 2 has 160 (8.7%) missing values | Missing |
Unnamed: 3 has 160 (8.7%) missing values | Missing |
대아수목원 식물표본 보유 현황 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
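The per-column missing counts can be cross-checked against the dataset-level total of 480 missing cells. A small sketch, assuming the count of 1 for the first column (taken from that column's own section of this report) belongs in the same total:

```python
n_observations = 1833

# Missing counts per column, as listed in the alerts and variable sections.
missing_by_column = {
    "대아수목원 식물표본 보유 현황": 1,
    "Unnamed: 1": 159,
    "Unnamed: 2": 160,
    "Unnamed: 3": 160,
}

total_missing = sum(missing_by_column.values())
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

print(total_missing)
for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
```

The sum matches the "Missing cells" figure, and each rounded percentage matches the value reported for its column.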
Reproduction
Analysis started | 2024-03-14 01:13:32.052173 |
---|---|
Analysis finished | 2024-03-14 01:13:32.563933 |
Duration | 0.51 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
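The reported duration is simply the difference between the two timestamps. A short sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-03-14 01:13:32.052173")
finished = datetime.fromisoformat("2024-03-14 01:13:32.563933")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```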
대아수목원 식물표본 보유 현황
Unsupported
Alerts: REJECTED, UNSUPPORTED
Missing | 1 |
---|---|
Missing (%) | 0.1% |
Memory size | 14.4 KiB |
Unnamed: 1
Text
Alerts: MISSING
Distinct | 1474 |
---|---|
Distinct (%) | 88.1% |
Missing | 159 |
Missing (%) | 8.7% |
Memory size | 14.4 KiB |
Length
Max length | 62 |
---|---|
Median length | 49 |
Mean length | 26.01374 |
Min length | 2 |
Characters and Unicode
Total characters | 43547 |
---|---|
Distinct characters | 84 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 1432 |
---|---|
Unique (%) | 85.5% |
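"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch of that distinction, assuming (as the figures here suggest) that both percentages are taken over non-missing values; the helper name and the sample list are hypothetical:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique, non_missing) for a list that may contain None."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)                                # different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return distinct, unique, len(present)

# Hypothetical mini-column: one repeated value, one missing value.
sample = ["Rosa spp.", "Rosa spp.", "Equisetum arvense L.", None, "Equisetum hyemale L."]
print(distinct_and_unique(sample))

# Cross-check of the report's figures for Unnamed: 1 (1833 rows, 159 missing).
non_missing = 1833 - 159
distinct_pct = round(100 * 1474 / non_missing, 1)
unique_pct = round(100 * 1432 / non_missing, 1)
print(distinct_pct, unique_pct)
```

With 1674 non-missing values as the denominator, the two percentages come out at the 88.1% and 85.5% shown above.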
Sample
1st row | 식 물 유 전 자 원 명 |
---|---|
2nd row | 학 명 |
3rd row | Selaginella involvens (Sw.) Spring |
4th row | Selaginella tamariscina (Beauv.) Spring |
5th row | Equisetum arvense L. |
Value | Count | Frequency (%) |
spp | 252 | 4.3% |
l | 176 | 3.0% |
var | 161 | 2.8% |
rosa | 127 | 2.2% |
japonica | 114 | 2.0% |
et | 99 | 1.7% |
nakai | 82 | 1.4% |
hibiscus | 82 | 1.4% |
thunb | 82 | 1.4% |
syriacus | 69 | 1.2% |
Other values (2220) | 4561 | 78.6% |
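The word table above lists lower-cased tokens without punctuation ("l", "var", "thunb"), which suggests the values were split into words before counting. A rough sketch of such a count, under the assumption that tokens are runs of word characters in the lower-cased string; the exact tokenization used by the profiler may differ:

```python
import re
from collections import Counter

def word_counts(values):
    """Count lower-cased word tokens across all strings (assumed tokenization)."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"\w+", value.lower()))
    return counts

# Three scientific names taken from the "Sample" rows of this column.
names = [
    "Selaginella involvens (Sw.) Spring",
    "Selaginella tamariscina (Beauv.) Spring",
    "Equisetum arvense L.",
]
counts = word_counts(names)
print(counts.most_common(3))
```

Under this tokenization, author abbreviations such as "L." and "(Sw.)" become the tokens "l" and "sw", which is consistent with "l", "thunb" and "nakai" ranking high in the table.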
Most occurring characters
Value | Count | Frequency (%) |
(space) | 4568 | 10.5% |
a | 4550 | 10.4% |
i | 3426 | 7.9% |
s | 2583 | 5.9% |
e | 2437 | 5.6% |
r | 2240 | 5.1% |
o | 2036 | 4.7% |
n | 1992 | 4.6% |
u | 1930 | 4.4% |
l | 1760 | 4.0% |
Other values (74) | 16025 | 36.8% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 32819 | 75.4% |
Space Separator | 4572 | 10.5% |
Uppercase Letter | 3673 | 8.4% |
Other Punctuation | 2031 | 4.7% |
Close Punctuation | 202 | 0.5% |
Open Punctuation | 202 | 0.5% |
Other Letter | 18 | < 0.1% |
Dash Punctuation | 16 | < 0.1% |
Decimal Number | 10 | < 0.1% |
Modifier Symbol | 4 | < 0.1% |
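The category names in this table are Unicode general categories. A minimal sketch of how such a tally can be produced with Python's standard `unicodedata` module; the mapping covers only the ten categories listed above, and the function name is illustrative:

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes mapped to the names used in the report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Sk": "Modifier Symbol",
}

def category_counts(text):
    """Count characters per Unicode general category, using long names where known."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

print(category_counts("Equisetum arvense L."))
print(category_counts("학 명"))
```

Hangul syllables fall under "Other Letter", which explains why the 18 Hangul characters in this column appear in that category.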
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 4550 | 13.9% |
i | 3426 | 10.4% |
s | 2583 | 7.9% |
e | 2437 | 7.4% |
r | 2240 | 6.8% |
o | 2036 | 6.2% |
n | 1992 | 6.1% |
u | 1930 | 5.9% |
l | 1760 | 5.4% |
t | 1369 | 4.2% |
Other values (16) | 8496 | 25.9% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 341 | 9.3% |
L | 338 | 9.2% |
S | 318 | 8.7% |
P | 299 | 8.1% |
M | 281 | 7.7% |
H | 265 | 7.2% |
R | 256 | 7.0% |
A | 216 | 5.9% |
T | 215 | 5.9% |
B | 168 | 4.6% |
Other values (16) | 976 | 26.6% |
Other Letter
Value | Count | Frequency (%) |
명 | 2 | 11.1% |
매 | 1 | 5.6% |
주 | 1 | 5.6% |
호 | 1 | 5.6% |
펫 | 1 | 5.6% |
럼 | 1 | 5.6% |
트 | 1 | 5.6% |
고 | 1 | 5.6% |
망 | 1 | 5.6% |
원 | 1 | 5.6% |
Other values (7) | 7 | 38.9% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3 | 30.0% |
2 | 2 | 20.0% |
8 | 2 | 20.0% |
9 | 1 | 10.0% |
4 | 1 | 10.0% |
0 | 1 | 10.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 1557 | 76.7% |
' | 473 | 23.3% |
& | 1 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 4568 | 99.9% |
(non-ASCII space) | 4 | 0.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 202 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 202 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 16 | 100.0% |
Modifier Symbol
Value | Count | Frequency (%) |
` | 4 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 36492 | 83.8% |
Common | 7037 | 16.2% |
Hangul | 18 | < 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 4550 | 12.5% |
i | 3426 | 9.4% |
s | 2583 | 7.1% |
e | 2437 | 6.7% |
r | 2240 | 6.1% |
o | 2036 | 5.6% |
n | 1992 | 5.5% |
u | 1930 | 5.3% |
l | 1760 | 4.8% |
t | 1369 | 3.8% |
Other values (42) | 12169 | 33.3% |
Hangul
Value | Count | Frequency (%) |
명 | 2 | 11.1% |
매 | 1 | 5.6% |
주 | 1 | 5.6% |
호 | 1 | 5.6% |
펫 | 1 | 5.6% |
럼 | 1 | 5.6% |
트 | 1 | 5.6% |
고 | 1 | 5.6% |
망 | 1 | 5.6% |
원 | 1 | 5.6% |
Other values (7) | 7 | 38.9% |
Common
Value | Count | Frequency (%) |
(space) | 4568 | 64.9% |
. | 1557 | 22.1% |
' | 473 | 6.7% |
) | 202 | 2.9% |
( | 202 | 2.9% |
- | 16 | 0.2% |
(non-ASCII space) | 4 | 0.1% |
` | 4 | 0.1% |
1 | 3 | < 0.1% |
2 | 2 | < 0.1% |
Other values (5) | 6 | 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 43525 | 99.9% |
Hangul | 18 | < 0.1% |
None | 4 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 4568 | 10.5% |
a | 4550 | 10.5% |
i | 3426 | 7.9% |
s | 2583 | 5.9% |
e | 2437 | 5.6% |
r | 2240 | 5.1% |
o | 2036 | 4.7% |
n | 1992 | 4.6% |
u | 1930 | 4.4% |
l | 1760 | 4.0% |
Other values (56) | 16003 | 36.8% |
None
Value | Count | Frequency (%) |
(non-ASCII space) | 4 | 100.0% |
Hangul
Value | Count | Frequency (%) |
명 | 2 | 11.1% |
매 | 1 | 5.6% |
주 | 1 | 5.6% |
호 | 1 | 5.6% |
펫 | 1 | 5.6% |
럼 | 1 | 5.6% |
트 | 1 | 5.6% |
고 | 1 | 5.6% |
망 | 1 | 5.6% |
원 | 1 | 5.6% |
Other values (7) | 7 | 38.9% |
Unnamed: 2
Text
Alerts: MISSING
Distinct | 1600 |
---|---|
Distinct (%) | 95.6% |
Missing | 160 |
Missing (%) | 8.7% |
Memory size | 14.4 KiB |
Value | Count | Frequency (%) |
동백나무(재배종 | 55 | 3.3% |
무궁화(재배종 | 11 | 0.7% |
목련(재배종 | 9 | 0.5% |
아디안툼 | 2 | 0.1% |
드로세라류 | 2 | 0.1% |
82 | 2 | 0.1% |
실새삼 | 1 | 0.1% |
풀협죽도 | 1 | 0.1% |
지면패랭이(꽃잔디 | 1 | 0.1% |
참꽃마리 | 1 | 0.1% |
Other values (1598) | 1598 | 94.9% |
Most occurring characters
Value | Count | Frequency (%) |
무 | 492 | 5.6% |
나 | 446 | 5.1% |
- | 224 | 2.6% |
리 | 223 | 2.5% |
( | 221 | 2.5% |
) | 221 | 2.5% |
미 | 170 | 1.9% |
장 | 145 | 1.7% |
이 | 138 | 1.6% |
화 | 136 | 1.5% |
Other values (611) | 6366 | 72.5% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 8058 | 91.8% |
Dash Punctuation | 224 | 2.6% |
Open Punctuation | 221 | 2.5% |
Close Punctuation | 221 | 2.5% |
Space Separator | 23 | 0.3% |
Decimal Number | 22 | 0.3% |
Lowercase Letter | 7 | 0.1% |
Other Punctuation | 5 | 0.1% |
Uppercase Letter | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
무 | 492 | 6.1% |
나 | 446 | 5.5% |
리 | 223 | 2.8% |
미 | 170 | 2.1% |
장 | 145 | 1.8% |
이 | 138 | 1.7% |
화 | 136 | 1.7% |
종 | 104 | 1.3% |
백 | 104 | 1.3% |
아 | 103 | 1.3% |
Other values (587) | 5997 | 74.4% |
Decimal Number
Value | Count | Frequency (%) |
2 | 6 | 27.3% |
1 | 4 | 18.2% |
8 | 3 | 13.6% |
7 | 2 | 9.1% |
4 | 2 | 9.1% |
9 | 2 | 9.1% |
3 | 1 | 4.5% |
5 | 1 | 4.5% |
6 | 1 | 4.5% |
Lowercase Letter
Value | Count | Frequency (%) |
r | 1 | 14.3% |
a | 1 | 14.3% |
e | 1 | 14.3% |
c | 1 | 14.3% |
i | 1 | 14.3% |
n | 1 | 14.3% |
o | 1 | 14.3% |
Other Punctuation
Value | Count | Frequency (%) |
, | 3 | 60.0% |
' | 1 | 20.0% |
. | 1 | 20.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 224 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 221 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 221 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 23 | 100.0% |
Uppercase Letter
Value | Count | Frequency (%) |
L | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 8058 | 91.8% |
Common | 716 | 8.2% |
Latin | 8 | 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
무 | 492 | 6.1% |
나 | 446 | 5.5% |
리 | 223 | 2.8% |
미 | 170 | 2.1% |
장 | 145 | 1.8% |
이 | 138 | 1.7% |
화 | 136 | 1.7% |
종 | 104 | 1.3% |
백 | 104 | 1.3% |
아 | 103 | 1.3% |
Other values (587) | 5997 | 74.4% |
Common
Value | Count | Frequency (%) |
- | 224 | 31.3% |
( | 221 | 30.9% |
) | 221 | 30.9% |
(space) | 23 | 3.2% |
2 | 6 | 0.8% |
1 | 4 | 0.6% |
8 | 3 | 0.4% |
, | 3 | 0.4% |
7 | 2 | 0.3% |
4 | 2 | 0.3% |
Other values (6) | 7 | 1.0% |
Latin
Value | Count | Frequency (%) |
r | 1 | 12.5% |
a | 1 | 12.5% |
e | 1 | 12.5% |
c | 1 | 12.5% |
i | 1 | 12.5% |
n | 1 | 12.5% |
L | 1 | 12.5% |
o | 1 | 12.5% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 8058 | 91.8% |
ASCII | 724 | 8.2% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
무 | 492 | 6.1% |
나 | 446 | 5.5% |
리 | 223 | 2.8% |
미 | 170 | 2.1% |
장 | 145 | 1.8% |
이 | 138 | 1.7% |
화 | 136 | 1.7% |
종 | 104 | 1.3% |
백 | 104 | 1.3% |
아 | 103 | 1.3% |
Other values (587) | 5997 | 74.4% |
ASCII
Value | Count | Frequency (%) |
- | 224 | 30.9% |
( | 221 | 30.5% |
) | 221 | 30.5% |
(space) | 23 | 3.2% |
2 | 6 | 0.8% |
1 | 4 | 0.6% |
8 | 3 | 0.4% |
, | 3 | 0.4% |
7 | 2 | 0.3% |
4 | 2 | 0.3% |
Other values (14) | 15 | 2.1% |
Unnamed: 3
Unsupported
Alerts: MISSING, REJECTED, UNSUPPORTED
Missing | 160 |
---|---|
Missing (%) | 8.7% |
Memory size | 14.4 KiB |
First rows
 | 대아수목원 식물표본 보유 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 |
---|---|---|---|---|
0 | 번호 | 식 물 유 전 자 원 명 | <NA> | 표본수 |
1 | NaN | 학 명 | 국 명 | NaN |
2 | Selaginellaceae 부처손科 | <NA> | <NA> | NaN |
3 | 1 | Selaginella involvens (Sw.) Spring | 바위손 | 5 |
4 | 2 | Selaginella tamariscina (Beauv.) Spring | 부처손 | 5 |
5 | Equisetaceae 속새科 | <NA> | <NA> | NaN |
6 | 3 | Equisetum arvense L. | 쇠뜨기 | 5 |
7 | 4 | Equisetum hyemale L. | 속새 | 9 |
8 | Ophioglossaceae 고사리삼科 | <NA> | <NA> | NaN |
9 | 5 | Botrychium ternatum (Thunb.) Sw. | 고사리삼 | 4 |
Last rows
 | 대아수목원 식물표본 보유 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 |
---|---|---|---|---|
1823 | 1666 | Callistemon lanceolatus (Sm.) DC. | 병솔꽃나무 | 3 |
1824 | 1667 | Psidium cattleianum Sabine | 스트로베리구아바 | 4 |
1825 | Meliaceae 산석류科 | <NA> | <NA> | NaN |
1826 | 1668 | Tibouchina semidecandra Cogn. | 티보치나 | 5 |
1827 | 닛사科 | <NA> | <NA> | NaN |
1828 | 1669 | Davidia involucrata | 손수건나무 | 5 |
1829 | 학명 미확인종류 | <NA> | <NA> | NaN |
1830 | 1670 | 망고 | 망고 | 3 |
1831 | 1671 | 트럼펫 | 트럼펫 | 2 |
1832 | 1672 | 호주매화 | 호주매화 | 3 |
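The sample rows show why two columns are flagged as unsupported: the sheet title became the first column name, the real header is spread over rows 0 and 1 (번호 = number, 학 명 = scientific name, 국 명 = Korean name, 표본수 = specimen count), and family headings such as "Selaginellaceae 부처손科" are interleaved with the numbered specimen rows. One possible way to tidy this layout, sketched with plain tuples; the function name, the output field names, and the rule for recognizing family rows are assumptions, not part of the report:

```python
def tidy(rows):
    """Turn raw (number, scientific, korean, count) rows into one record per specimen."""
    records = []
    family = None
    for number, scientific, korean, count in rows:
        if number is not None and str(number).isdigit():
            # Numbered row: a specimen entry, tagged with the last family heading seen.
            records.append({
                "number": int(number),
                "family": family,
                "scientific_name": scientific,
                "korean_name": korean,
                "specimens": int(count),
            })
        elif number is not None and scientific is None and korean is None:
            # Assumption: a row with only the first cell filled is a family heading.
            family = number
        # Anything else (the two header fragments) is skipped.
    return records

# First rows of the sample, with <NA>/NaN written as None.
raw_rows = [
    ("번호", "식 물 유 전 자 원 명", None, "표본수"),
    (None, "학 명", "국 명", None),
    ("Selaginellaceae 부처손科", None, None, None),
    (1, "Selaginella involvens (Sw.) Spring", "바위손", 5),
    (2, "Selaginella tamariscina (Beauv.) Spring", "부처손", 5),
    ("Equisetaceae 속새科", None, None, None),
    (3, "Equisetum arvense L.", "쇠뜨기", 5),
]

records = tidy(raw_rows)
for record in records:
    print(record)
```

After a step like this, the number and specimen-count columns hold only integers, so a profiler would be expected to treat them as numeric rather than unsupported.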
Most frequently occurring
 | Unnamed: 1 | Unnamed: 2 | # duplicates |
---|---|---|---|
0 | <NA> | <NA> | 159 |