Dataset statistics
Number of variables | 3 |
---|---|
Number of observations | 10000 |
Missing cells | 2734 |
Missing cells (%) | 9.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 312.5 KiB |
Average record size in memory | 32.0 B |
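The two derived figures in the table above can be checked by hand. The following is a minimal sketch that assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 3
n_observations = 10_000
missing_cells = 2_734
total_size_kib = 312.5

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported 9.1% and 32.0 B.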
Variable types
Text | 3 |
---|
Dataset
Description | Biological genetic information relating to DNA barcodes: the definition of DNA barcodes, research trends abroad and in Korea, and an explanation of why DNA barcodes are needed. |
---|---|
Author | National Institute of Biological Resources, Ministry of Environment (환경부 국립생물자원관) |
URL | https://www.data.go.kr/data/15067608/fileData.do |
Reproduction
Analysis started | 2023-12-12 09:19:47.544366 |
---|---|
Analysis finished | 2023-12-12 09:19:48.382278 |
Duration | 0.84 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
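The reported duration follows directly from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table above.
started = datetime.fromisoformat("2023-12-12 09:19:47.544366")
finished = datetime.fromisoformat("2023-12-12 09:19:48.382278")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals this gives the 0.84 seconds shown in the table.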
유전정보아이디 (genetic information ID)
Text
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 100000 |
---|---|
Distinct characters | 13 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | WBN0403419 |
---|---|
2nd row | WBN0362518 |
3rd row | WBN0369629 |
4th row | WBN0339461 |
5th row | WBN0364430 |
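The fixed length of 10 and the sample rows suggest that every identifier is the prefix `WBN` followed by seven digits. The pattern below is an assumption inferred from those figures, not a documented specification of the ID format; `ID_PATTERN` and `is_valid_id` are hypothetical names.

```python
import re

# Assumed format: literal "WBN" followed by exactly seven ASCII digits.
ID_PATTERN = re.compile(r"WBN[0-9]{7}")

# The five sample rows shown above.
sample_ids = ["WBN0403419", "WBN0362518", "WBN0369629", "WBN0339461", "WBN0364430"]


def is_valid_id(value: str) -> bool:
    """Return True when the whole string matches the assumed ID format."""
    return ID_PATTERN.fullmatch(value) is not None


# Collect any sample value that does not fit the assumed format.
invalid = [value for value in sample_ids if not is_valid_id(value)]
print(invalid)
```

A check like this could be run over the full column to confirm that no identifier deviates from the pattern.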
Value | Count | Frequency (%) |
wbn0403419 | 1 | < 0.1% |
wbn0378672 | 1 | < 0.1% |
wbn0355176 | 1 | < 0.1% |
wbn0351627 | 1 | < 0.1% |
wbn0377113 | 1 | < 0.1% |
wbn0388020 | 1 | < 0.1% |
wbn0386675 | 1 | < 0.1% |
wbn0338612 | 1 | < 0.1% |
wbn0401216 | 1 | < 0.1% |
wbn0369089 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 14637 | 14.6% |
3 | 13740 | 13.7% |
W | 10000 | 10.0% |
B | 10000 | 10.0% |
N | 10000 | 10.0% |
4 | 5864 | 5.9% |
6 | 5588 | 5.6% |
9 | 5557 | 5.6% |
7 | 5556 | 5.6% |
8 | 5447 | 5.4% |
Other values (3) | 13611 | 13.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 70000 | 70.0% |
Uppercase Letter | 30000 | 30.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 14637 | 20.9% |
3 | 13740 | 19.6% |
4 | 5864 | 8.4% |
6 | 5588 | 8.0% |
9 | 5557 | 7.9% |
7 | 5556 | 7.9% |
8 | 5447 | 7.8% |
5 | 5403 | 7.7% |
2 | 4157 | 5.9% |
1 | 4051 | 5.8% |
Uppercase Letter
Value | Count | Frequency (%) |
W | 10000 | 33.3% |
B | 10000 | 33.3% |
N | 10000 | 33.3% |
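The category labels in these tables correspond to Unicode general categories: "Decimal Number" is `Nd` and "Uppercase Letter" is `Lu`. As a rough sketch of how such counts can be derived (not necessarily how the profiling tool computes them), the standard `unicodedata` module can classify each character of the five sample IDs:

```python
import unicodedata
from collections import Counter

# The five sample rows of this column.
sample_ids = ["WBN0403419", "WBN0362518", "WBN0369629", "WBN0339461", "WBN0364430"]

# Count the Unicode general category of every character in every ID.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)
print(category_counts)
```

For these five IDs the split is 35 digits to 15 letters, the same 70/30 ratio as in the full column.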
Most occurring scripts
Value | Count | Frequency (%) |
Common | 70000 | 70.0% |
Latin | 30000 | 30.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 14637 | 20.9% |
3 | 13740 | 19.6% |
4 | 5864 | 8.4% |
6 | 5588 | 8.0% |
9 | 5557 | 7.9% |
7 | 5556 | 7.9% |
8 | 5447 | 7.8% |
5 | 5403 | 7.7% |
2 | 4157 | 5.9% |
1 | 4051 | 5.8% |
Latin
Value | Count | Frequency (%) |
W | 10000 | 33.3% |
B | 10000 | 33.3% |
N | 10000 | 33.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 14637 | 14.6% |
3 | 13740 | 13.7% |
W | 10000 | 10.0% |
B | 10000 | 10.0% |
N | 10000 | 10.0% |
4 | 5864 | 5.9% |
6 | 5588 | 5.6% |
9 | 5557 | 5.6% |
7 | 5556 | 5.6% |
8 | 5447 | 5.4% |
Other values (3) | 13611 | 13.6% |
학명 (scientific name)
Text
Distinct | 4776 |
---|---|
Distinct (%) | 47.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 125 |
---|---|
Median length | 75 |
Mean length | 32.2949 |
Min length | 5 |
Characters and Unicode
Total characters | 322949 |
---|---|
Distinct characters | 76 |
Distinct categories | 10 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 3072 |
---|---|
Unique (%) | 30.7% |
Sample
1st row | Micropsalliota pleurocystidiata Heinem. & Little Flower 1983 |
---|---|
2nd row | Agelena limbata Thorell, 1897 |
3rd row | Impatiens L. |
4th row | Chrysosplenium japonicum (Maxim.) Makino |
5th row | Chlorostoma lischkei Tapparone Canefri, 1874 |
Value | Count | Frequency (%) |
(blank) | 1126 | 2.6% |
l | 1068 | 2.5% |
et | 540 | 1.3% |
al | 537 | 1.3% |
ex | 450 | 1.1% |
nakai | 362 | 0.8% |
japonica | 314 | 0.7% |
a | 301 | 0.7% |
var | 271 | 0.6% |
h | 270 | 0.6% |
Other values (8284) | 37618 | 87.8% |
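The most frequent entries in this table (`l`, `et`, `al`, `ex`, `var`) look like fragments of author citations and rank markers such as "L.", "et al.", "ex" and "var." rather than taxon names. The report does not show how the words were split, so the tokenizer below is only an assumption: lowercase the name and keep runs of ASCII letters.

```python
import re
from collections import Counter

# The five sample rows of this column.
sample_names = [
    "Micropsalliota pleurocystidiata Heinem. & Little Flower 1983",
    "Agelena limbata Thorell, 1897",
    "Impatiens L.",
    "Chrysosplenium japonicum (Maxim.) Makino",
    "Chlorostoma lischkei Tapparone Canefri, 1874",
]


def tokenize(name: str) -> list[str]:
    """Assumed tokenizer: lowercase, then keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", name.lower())


# Count every token across the sample names.
word_counts = Counter(token for name in sample_names for token in tokenize(name))
print(word_counts.most_common(5))
```

Under this assumption "Impatiens L." contributes the token `l`, which would explain why a single letter ranks so high in the full column.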
Most occurring characters
Value | Count | Frequency (%) |
(space) | 32857 | 10.2% |
a | 28913 | 9.0% |
i | 22305 | 6.9% |
e | 18941 | 5.9% |
s | 15777 | 4.9% |
o | 15594 | 4.8% |
r | 15195 | 4.7% |
n | 14672 | 4.5% |
l | 13148 | 4.1% |
u | 12720 | 3.9% |
Other values (66) | 132827 | 41.1% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 223530 | 69.2% |
Space Separator | 32857 | 10.2% |
Uppercase Letter | 26693 | 8.3% |
Decimal Number | 18996 | 5.9% |
Other Punctuation | 13649 | 4.2% |
Open Punctuation | 3483 | 1.1% |
Close Punctuation | 3483 | 1.1% |
Dash Punctuation | 221 | 0.1% |
Math Symbol | 21 | < 0.1% |
Final Punctuation | 16 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
L | 2551 | 9.6% |
C | 2516 | 9.4% |
S | 2445 | 9.2% |
M | 2110 | 7.9% |
A | 1766 | 6.6% |
P | 1724 | 6.5% |
H | 1503 | 5.6% |
B | 1340 | 5.0% |
T | 1305 | 4.9% |
K | 1191 | 4.5% |
Other values (17) | 8242 | 30.9% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 28913 | 12.9% |
i | 22305 | 10.0% |
e | 18941 | 8.5% |
s | 15777 | 7.1% |
o | 15594 | 7.0% |
r | 15195 | 6.8% |
n | 14672 | 6.6% |
l | 13148 | 5.9% |
u | 12720 | 5.7% |
t | 10211 | 4.6% |
Other values (16) | 56054 | 25.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 4912 | 25.9% |
8 | 3178 | 16.7% |
9 | 2392 | 12.6% |
0 | 1850 | 9.7% |
7 | 1558 | 8.2% |
2 | 1543 | 8.1% |
5 | 991 | 5.2% |
6 | 903 | 4.8% |
3 | 860 | 4.5% |
4 | 809 | 4.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 8249 | 60.4% |
, | 3798 | 27.8% |
& | 1124 | 8.2% |
? | 468 | 3.4% |
' | 10 | 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 3479 | 99.9% |
[ | 4 | 0.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 3479 | 99.9% |
] | 4 | 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 32857 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 221 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
× | 21 | 100.0% |
Final Punctuation
Value | Count | Frequency (%) |
’ | 16 | 100.0% |
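The rarer categories above can be looked up character by character. This sketch assumes the math symbol in the table is U+00D7 (the multiplication sign, conventionally used to mark hybrids in botanical names) and that the final-punctuation character is U+2019; both code points are inferred from how the characters appear here.

```python
import unicodedata

# One representative character per punctuation or symbol category above.
# "\u00d7" and "\u2019" are assumed code points for the characters shown.
rare_characters = ["&", "(", "-", "\u00d7", "\u2019"]

# Map each character to its two-letter Unicode general category.
categories = {ch: unicodedata.category(ch) for ch in rare_characters}

for ch, code in categories.items():
    print(f"U+{ord(ch):04X} {code} {unicodedata.name(ch)}")
```

The two-letter codes map to the report's labels as follows: `Po` is Other Punctuation, `Ps` is Open Punctuation, `Pd` is Dash Punctuation, `Sm` is Math Symbol and `Pf` is Final Punctuation.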
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 250223 | 77.5% |
Common | 72726 | 22.5% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 28913 | 11.6% |
i | 22305 | 8.9% |
e | 18941 | 7.6% |
s | 15777 | 6.3% |
o | 15594 | 6.2% |
r | 15195 | 6.1% |
n | 14672 | 5.9% |
l | 13148 | 5.3% |
u | 12720 | 5.1% |
t | 10211 | 4.1% |
Other values (43) | 82747 | 33.1% |
Common
Value | Count | Frequency (%) |
(space) | 32857 | 45.2% |
. | 8249 | 11.3% |
1 | 4912 | 6.8% |
, | 3798 | 5.2% |
( | 3479 | 4.8% |
) | 3479 | 4.8% |
8 | 3178 | 4.4% |
9 | 2392 | 3.3% |
0 | 1850 | 2.5% |
7 | 1558 | 2.1% |
Other values (13) | 6974 | 9.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 322910 | > 99.9% |
None | 23 | < 0.1% |
Punctuation | 16 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 32857 | 10.2% |
a | 28913 | 9.0% |
i | 22305 | 6.9% |
e | 18941 | 5.9% |
s | 15777 | 4.9% |
o | 15594 | 4.8% |
r | 15195 | 4.7% |
n | 14672 | 4.5% |
l | 13148 | 4.1% |
u | 12720 | 3.9% |
Other values (63) | 132788 | 41.1% |
None
Value | Count | Frequency (%) |
× | 21 | 91.3% |
Ø | 2 | 8.7% |
Punctuation
Value | Count | Frequency (%) |
’ | 16 | 100.0% |
국명 (Korean name)
Text
MISSING
Distinct | 3092 |
---|---|
Distinct (%) | 42.6% |
Missing | 2734 |
Missing (%) | 27.3% |
Memory size | 156.2 KiB |
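The two percentages above use different denominators. The figures are consistent with the missing share being taken over all 10000 observations and the distinct share over the non-missing values only; the following sketch checks that reading, which is inferred from the numbers rather than stated in the report.

```python
# Figures copied from the table above and from the dataset overview.
n_observations = 10_000
missing = 2_734
distinct = 3_092

# Rows that actually carry a value in this column.
present = n_observations - missing

# Missing share: relative to all observations.
missing_pct = 100 * missing / n_observations

# Distinct share: assumed to be relative to non-missing values only.
distinct_pct = 100 * distinct / present

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
```

Dividing the distinct count by all 10000 rows instead would give 30.9%, which does not match the reported 42.6%.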
Value | Count | Frequency (%) |
밤고둥 | 159 | 2.2% |
구멍밤고둥 | 105 | 1.4% |
낫균속 | 84 | 1.2% |
극동갯강구 | 80 | 1.1% |
고랑딱개비 | 79 | 1.1% |
홍합 | 79 | 1.1% |
쇠살모사 | 76 | 1.0% |
가는몸참집게 | 62 | 0.9% |
갯장대 | 42 | 0.6% |
덧나무 | 42 | 0.6% |
Other values (3082) | 6458 | 88.9% |
Most occurring characters
Value | Count | Frequency (%) |
리 | 1118 | 3.3% |
나 | 943 | 2.8% |
무 | 788 | 2.3% |
이 | 769 | 2.3% |
속 | 734 | 2.2% |
고 | 678 | 2.0% |
개 | 632 | 1.9% |
미 | 470 | 1.4% |
구 | 464 | 1.4% |
사 | 460 | 1.4% |
Other values (686) | 26750 | 79.1% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 33802 | > 99.9% |
Uppercase Letter | 2 | < 0.1% |
Other Punctuation | 1 | < 0.1% |
Lowercase Letter | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
리 | 1118 | 3.3% |
나 | 943 | 2.8% |
무 | 788 | 2.3% |
이 | 769 | 2.3% |
속 | 734 | 2.2% |
고 | 678 | 2.0% |
개 | 632 | 1.9% |
미 | 470 | 1.4% |
구 | 464 | 1.4% |
사 | 460 | 1.4% |
Other values (682) | 26746 | 79.1% |
Uppercase Letter
Value | Count | Frequency (%) |
U | 1 | 50.0% |
K | 1 | 50.0% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1 | 100.0% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 33802 | > 99.9% |
Latin | 3 | < 0.1% |
Common | 1 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
리 | 1118 | 3.3% |
나 | 943 | 2.8% |
무 | 788 | 2.3% |
이 | 769 | 2.3% |
속 | 734 | 2.2% |
고 | 678 | 2.0% |
개 | 632 | 1.9% |
미 | 470 | 1.4% |
구 | 464 | 1.4% |
사 | 460 | 1.4% |
Other values (682) | 26746 | 79.1% |
Latin
Value | Count | Frequency (%) |
a | 1 | 33.3% |
U | 1 | 33.3% |
K | 1 | 33.3% |
Common
Value | Count | Frequency (%) |
/ | 1 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 33802 | > 99.9% |
ASCII | 4 | < 0.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
리 | 1118 | 3.3% |
나 | 943 | 2.8% |
무 | 788 | 2.3% |
이 | 769 | 2.3% |
속 | 734 | 2.2% |
고 | 678 | 2.0% |
개 | 632 | 1.9% |
미 | 470 | 1.4% |
구 | 464 | 1.4% |
사 | 460 | 1.4% |
Other values (682) | 26746 | 79.1% |
ASCII
Value | Count | Frequency (%) |
/ | 1 | 25.0% |
a | 1 | 25.0% |
U | 1 | 25.0% |
K | 1 | 25.0% |
(index) | 유전정보아이디 | 학명 | 국명 |
---|---|---|---|
62991 | WBN0403419 | Micropsalliota pleurocystidiata Heinem. & Little Flower 1983 | <NA> |
17404 | WBN0362518 | Agelena limbata Thorell, 1897 | 들풀거미 |
26325 | WBN0369629 | Impatiens L. | 물봉선속 |
833 | WBN0339461 | Chrysosplenium japonicum (Maxim.) Makino | 산괭이눈 |
24921 | WBN0364430 | Chlorostoma lischkei Tapparone Canefri, 1874 | 밤고둥 |
1889 | WBN0338910 | Cynanchum volubile (Maxim.) Hemsl. | 세포큰조롱 |
22828 | WBN0348845 | Maianthemum japonicum (A. Gray) La Frankie | 풀솜대 |
16881 | WBN0360619 | Leibnitzia anandria (L.) Turcz. | 솜나물 |
36011 | WBN0378668 | Eriocaulon truncatum Buch.-Ham. ex Mart. | <NA> |
18724 | WBN0347991 | Galium kinuta Nakai & H. Hara | 민둥갈퀴 |
(index) | 유전정보아이디 | 학명 | 국명 |
---|---|---|---|
63243 | WBN0402663 | Rikiosatoa grisea (Butler, 1878) | 두줄가지나방 |
10576 | WBN0350608 | Paraburkholderia caledonica Coenye et al. 2001 | <NA> |
47431 | WBN0392150 | Arabis gemmifera (Matsum.) Makino | 산장대 |
41121 | WBN0381411 | Peromyia Kieffer, 1894 | 어리애혹파리속 |
33542 | WBN0375844 | Spermacoce remota Lam. | <NA> |
28465 | WBN0366217 | Gloydius ussuriensis (Emelianov, 1929) | 쇠살모사 |
7834 | WBN0345319 | Sphingobium algicola Lee Y and Jeon CO. 2017 | <NA> |
12356 | WBN0355370 | Taraxacum formosanum Kitam. | 영도민들레 |
48436 | WBN0390003 | Orthocladius ulaanbaatus Sasa and Suzuki, 1997 | 울란바트깃깔따구 |
24055 | WBN0343215 | Forsythia ovata Nakai | 만리화 |