Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 10000 |
Missing cells | 2686 |
Missing cells (%) | 6.7% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 390.6 KiB |
Average record size in memory | 40.0 B |
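The percentage and average-size rows above follow from the raw counts. A minimal sketch of that arithmetic, assuming "Missing cells (%)" is missing cells divided by rows × variables and that 1 KiB is 1024 bytes; the function names are illustrative and are not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of empty cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def avg_record_bytes(total_kib: float, n_rows: int) -> float:
    """Average in-memory size of one row in bytes (assumes 1 KiB = 1024 B)."""
    return total_kib * 1024 / n_rows


# Figures taken from the "Dataset statistics" table.
print(f"{missing_cell_pct(2686, 10000, 4):.1f}%")  # 6.7%
print(f"{avg_record_bytes(390.6, 10000):.1f} B")   # 40.0 B
```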
Variable types
Text | 4 |
---|
Dataset
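Description | Provides the status of genetic information (genetic information management number, scientific name, Korean name, etc.) for native wild species, related to DNA barcode sequences produced by the National Institute of Biological Resources |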
---|---|
Author | National Institute of Biological Resources, Ministry of Environment (환경부 국립생물자원관) |
URL | https://www.data.go.kr/data/3070009/fileData.do |
Reproduction
Analysis started | 2023-12-12 16:32:35.548703 |
---|---|
Analysis finished | 2023-12-12 16:32:36.263375 |
Duration | 0.71 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
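The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values shown in the table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 16:32:35.548703", FMT)
finished = datetime.strptime("2023-12-12 16:32:36.263375", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.71 seconds
```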
유전정보아이디 (genetic information ID)
Text
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 100000 |
---|---|
Distinct characters | 13 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | WBN0368067 |
---|---|
2nd row | WBN0342171 |
3rd row | WBN0377694 |
4th row | WBN0340032 |
5th row | WBN0383118 |
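The length and character statistics (fixed length 10, 13 distinct characters, uppercase letters plus digits) are consistent with an identifier made of `WBN` followed by seven digits. The pattern below is inferred from this report only and is an assumption, not a documented specification of the identifier.

```python
import re

# Assumed format: literal "WBN" followed by exactly seven ASCII digits.
ID_PATTERN = re.compile(r"WBN[0-9]{7}")


def is_valid_id(value: str) -> bool:
    """Return True if the whole string matches the assumed ID format."""
    return ID_PATTERN.fullmatch(value) is not None


# The five sample rows shown above.
samples = ["WBN0368067", "WBN0342171", "WBN0377694", "WBN0340032", "WBN0383118"]
print(all(is_valid_id(s) for s in samples))  # True
```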
Value | Count | Frequency (%) |
wbn0368067 | 1 | < 0.1% |
wbn0382855 | 1 | < 0.1% |
wbn0361566 | 1 | < 0.1% |
wbn0397397 | 1 | < 0.1% |
wbn0346392 | 1 | < 0.1% |
wbn0379514 | 1 | < 0.1% |
wbn0345017 | 1 | < 0.1% |
wbn0368320 | 1 | < 0.1% |
wbn0400002 | 1 | < 0.1% |
wbn0358100 | 1 | < 0.1% |
Other values (9990) | 9990 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 14709 | 14.7% |
3 | 13904 | 13.9% |
W | 10000 | 10.0% |
B | 10000 | 10.0% |
N | 10000 | 10.0% |
4 | 5799 | 5.8% |
9 | 5695 | 5.7% |
8 | 5527 | 5.5% |
6 | 5493 | 5.5% |
7 | 5486 | 5.5% |
Other values (3) | 13387 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 70000 | 70.0% |
Uppercase Letter | 30000 | 30.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 14709 | 21.0% |
3 | 13904 | 19.9% |
4 | 5799 | 8.3% |
9 | 5695 | 8.1% |
8 | 5527 | 7.9% |
6 | 5493 | 7.8% |
7 | 5486 | 7.8% |
5 | 5381 | 7.7% |
1 | 4007 | 5.7% |
2 | 3999 | 5.7% |
Uppercase Letter
Value | Count | Frequency (%) |
W | 10000 | 33.3% |
B | 10000 | 33.3% |
N | 10000 | 33.3% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 70000 | 70.0% |
Latin | 30000 | 30.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 14709 | 21.0% |
3 | 13904 | 19.9% |
4 | 5799 | 8.3% |
9 | 5695 | 8.1% |
8 | 5527 | 7.9% |
6 | 5493 | 7.8% |
7 | 5486 | 7.8% |
5 | 5381 | 7.7% |
1 | 4007 | 5.7% |
2 | 3999 | 5.7% |
Latin
Value | Count | Frequency (%) |
W | 10000 | 33.3% |
B | 10000 | 33.3% |
N | 10000 | 33.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 100000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 14709 | 14.7% |
3 | 13904 | 13.9% |
W | 10000 | 10.0% |
B | 10000 | 10.0% |
N | 10000 | 10.0% |
4 | 5799 | 5.8% |
9 | 5695 | 5.7% |
8 | 5527 | 5.5% |
6 | 5493 | 5.5% |
7 | 5486 | 5.5% |
Other values (3) | 13387 |
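The category tables above group characters by Unicode general category. The sketch below uses the standard library's `unicodedata` as a stand-in, so it is not necessarily how ydata-profiling computes these tables; the `CATEGORY_NAMES` mapping is illustrative and covers only the two categories seen in this column.

```python
import unicodedata
from collections import Counter

# Readable names for the two category codes that occur in the ID column.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Nd": "Decimal Number"}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample IDs: 3 letters and 7 digits each.
ids = ["WBN0368067", "WBN0342171", "WBN0377694", "WBN0340032", "WBN0383118"]
print(category_counts(ids))
# Counter({'Decimal Number': 35, 'Uppercase Letter': 15})
```

With every ID holding 3 letters and 7 digits, 10000 rows give the 30000 and 70000 counts reported above.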
학명 (scientific name)
Text
Distinct | 4794 |
---|---|
Distinct (%) | 47.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 117 |
---|---|
Median length | 75 |
Mean length | 32.1174 |
Min length | 5 |
Characters and Unicode
Total characters | 321174 |
---|---|
Distinct characters | 77 |
Distinct categories | 10 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 3104 |
---|---|
Unique (%) | 31.0% |
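The mean length follows from the total character count divided by the number of observations. The same figures also allow a sanity check on the median: if the median really were 75, at least half of the values would have 75 or more characters, which puts a lower bound on the total. The sketch below does that arithmetic. It only shows that the reported median cannot be reconciled with the reported mean and total; why that is so is not something this report reveals.

```python
n = 10000             # observations, none missing in this column
total_chars = 321174  # "Total characters"
min_len, median_len = 5, 75

mean_len = total_chars / n
print(mean_len)  # 32.1174

# At least n // 2 values are >= the median; the rest are >= the minimum.
half = n // 2
lower_bound_total = half * median_len + (n - half) * min_len
print(lower_bound_total)                  # 400000
print(lower_bound_total <= total_chars)   # False: the figures disagree
```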
Sample
1st row | Asplenium incisum Thunb. |
---|---|
2nd row | Anthus gustavi Swinhoe, 1863 |
3rd row | Glomerella Spauld. & H. Schrenk 1903 |
4th row | Petrolisthes coccineus (Owen, 1839) |
5th row | Pagurus maculosus Komai & Imafuku, 1996 |
Value | Count | Frequency (%) |
l | 1053 | 2.5% |
(empty) | 1038 | 2.4% |
et | 506 | 1.2% |
al | 504 | 1.2% |
ex | 444 | 1.0% |
nakai | 375 | 0.9% |
japonica | 311 | 0.7% |
a | 297 | 0.7% |
var | 294 | 0.7% |
h | 286 | 0.7% |
Other values (8363) | 37354 |
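The table above lists lowercase tokens such as `l`, `et`, `al` and `ex`, which suggests that values are lowercased, split on whitespace, and stripped of surrounding punctuation (so an author abbreviation like "L." becomes `l`). The sketch below implements that reading. The tokenization rules are an assumption, and unlike the table it drops tokens that become empty after stripping.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercase whitespace-separated tokens, punctuation stripped."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:  # skip tokens that were punctuation only, e.g. "&"
                counts[word] += 1
    return counts


# The five sample rows shown above.
names = [
    "Asplenium incisum Thunb.",
    "Anthus gustavi Swinhoe, 1863",
    "Glomerella Spauld. & H. Schrenk 1903",
    "Petrolisthes coccineus (Owen, 1839)",
    "Pagurus maculosus Komai & Imafuku, 1996",
]
counts = word_counts(names)
print(counts["h"], counts["owen"], sum(counts.values()))  # 1 1 21
```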
Most occurring characters
Value | Count | Frequency (%) |
(space) | 32462 | 10.1% |
a | 28650 | 8.9% |
i | 22195 | 6.9% |
e | 18804 | 5.9% |
s | 15899 | 5.0% |
o | 15720 | 4.9% |
r | 15295 | 4.8% |
n | 14436 | 4.5% |
l | 13077 | 4.1% |
u | 12660 | 3.9% |
Other values (67) | 131976 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 222556 | 69.3% |
Space Separator | 32462 | 10.1% |
Uppercase Letter | 26618 | 8.3% |
Decimal Number | 18956 | 5.9% |
Other Punctuation | 13550 | 4.2% |
Open Punctuation | 3406 | 1.1% |
Close Punctuation | 3406 | 1.1% |
Dash Punctuation | 183 | 0.1% |
Math Symbol | 20 | < 0.1% |
Final Punctuation | 17 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 28650 | 12.9% |
i | 22195 | 10.0% |
e | 18804 | 8.4% |
s | 15899 | 7.1% |
o | 15720 | 7.1% |
r | 15295 | 6.9% |
n | 14436 | 6.5% |
l | 13077 | 5.9% |
u | 12660 | 5.7% |
t | 10225 | 4.6% |
Other values (17) | 55595 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 2655 | 10.0% |
L | 2518 | 9.5% |
S | 2253 | 8.5% |
M | 2003 | 7.5% |
P | 1751 | 6.6% |
A | 1741 | 6.5% |
H | 1578 | 5.9% |
T | 1360 | 5.1% |
B | 1336 | 5.0% |
K | 1186 | 4.5% |
Other values (17) | 8237 |
Decimal Number
Value | Count | Frequency (%) |
1 | 4936 | 26.0% |
8 | 3280 | 17.3% |
9 | 2275 | 12.0% |
0 | 1779 | 9.4% |
7 | 1581 | 8.3% |
2 | 1488 | 7.8% |
5 | 1011 | 5.3% |
6 | 981 | 5.2% |
3 | 848 | 4.5% |
4 | 777 | 4.1% |
Other Punctuation
Value | Count | Frequency (%) |
. | 8328 | 61.5% |
, | 3759 | 27.7% |
& | 1035 | 7.6% |
? | 421 | 3.1% |
' | 7 | 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 3403 | 99.9% |
[ | 3 | 0.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 3403 | 99.9% |
] | 3 | 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 32462 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 183 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
× | 20 | 100.0% |
Final Punctuation
Value | Count | Frequency (%) |
’ | 17 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 249174 | 77.6% |
Common | 72000 | 22.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 28650 | 11.5% |
i | 22195 | 8.9% |
e | 18804 | 7.5% |
s | 15899 | 6.4% |
o | 15720 | 6.3% |
r | 15295 | 6.1% |
n | 14436 | 5.8% |
l | 13077 | 5.2% |
u | 12660 | 5.1% |
t | 10225 | 4.1% |
Other values (44) | 82213 |
Common
Value | Count | Frequency (%) |
(space) | 32462 | 45.1% |
. | 8328 | 11.6% |
1 | 4936 | 6.9% |
, | 3759 | 5.2% |
( | 3403 | 4.7% |
) | 3403 | 4.7% |
8 | 3280 | 4.6% |
9 | 2275 | 3.2% |
0 | 1779 | 2.5% |
7 | 1581 | 2.2% |
Other values (13) | 6794 | 9.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 321134 | > 99.9% |
None | 23 | < 0.1% |
Punctuation | 17 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 32462 | 10.1% |
a | 28650 | 8.9% |
i | 22195 | 6.9% |
e | 18804 | 5.9% |
s | 15899 | 5.0% |
o | 15720 | 4.9% |
r | 15295 | 4.8% |
n | 14436 | 4.5% |
l | 13077 | 4.1% |
u | 12660 | 3.9% |
Other values (63) | 131936 |
None
Value | Count | Frequency (%) |
× | 20 | 87.0% |
Ø | 2 | 8.7% |
ø | 1 | 4.3% |
Punctuation
Value | Count | Frequency (%) |
’ | 17 | 100.0% |
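The block tables show that a handful of characters in this column fall outside ASCII (`×`, `Ø`, `ø`, `’`). A minimal sketch for locating such characters in a value with the standard library; the helper name `non_ascii_chars` is illustrative.

```python
import unicodedata


def non_ascii_chars(text: str) -> dict:
    """Map each non-ASCII character in text to its Unicode name."""
    return {ch: unicodedata.name(ch, "UNKNOWN") for ch in text if ord(ch) > 127}


# A hybrid name from the sample rows of this dataset.
print(non_ascii_chars("Petunia × hybrida (Hook.) Vilm."))
# {'×': 'MULTIPLICATION SIGN'}
```

A check like this is useful before matching names against external taxonomies, where the hybrid sign `×` is often written as a plain letter `x`.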
국명 (Korean name)
Text
MISSING
Distinct | 3057 |
---|---|
Distinct (%) | 41.8% |
Missing | 2686 |
Missing (%) | 26.9% |
Memory size | 156.2 KiB |
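The two percentages in this table use different denominators: "Missing (%)" is relative to all observations, while the reported "Distinct (%)" of 41.8% only matches if distinct values are divided by the non-missing observations. The sketch below reproduces both figures under that assumption.

```python
n_rows = 10000
missing = 2686
distinct = 3057

# Missing share is taken over all rows.
missing_pct = 100 * missing / n_rows
# Distinct share is taken over rows that actually have a value (assumption).
distinct_pct = 100 * distinct / (n_rows - missing)

print(f"{missing_pct:.1f}%")   # 26.9%
print(f"{distinct_pct:.1f}%")  # 41.8%
```

Dividing by all 10000 rows instead would give 30.6%, which does not match the table.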
Value | Count | Frequency (%) |
밤고둥 | 180 | 2.5% |
낫균속 | 103 | 1.4% |
구멍밤고둥 | 102 | 1.4% |
홍합 | 94 | 1.3% |
가는몸참집게 | 76 | 1.0% |
고랑딱개비 | 76 | 1.0% |
극동갯강구 | 74 | 1.0% |
쇠살모사 | 63 | 0.9% |
덧나무 | 52 | 0.7% |
무당거미 | 50 | 0.7% |
Other values (3047) | 6444 |
Most occurring characters
Value | Count | Frequency (%) |
리 | 1111 | 3.3% |
나 | 1044 | 3.1% |
무 | 844 | 2.5% |
속 | 811 | 2.4% |
이 | 791 | 2.3% |
고 | 723 | 2.1% |
개 | 629 | 1.9% |
미 | 478 | 1.4% |
구 | 454 | 1.3% |
사 | 454 | 1.3% |
Other values (678) | 26644 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 33979 | > 99.9% |
Other Punctuation | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
리 | 1111 | 3.3% |
나 | 1044 | 3.1% |
무 | 844 | 2.5% |
속 | 811 | 2.4% |
이 | 791 | 2.3% |
고 | 723 | 2.1% |
개 | 629 | 1.9% |
미 | 478 | 1.4% |
구 | 454 | 1.3% |
사 | 454 | 1.3% |
Other values (677) | 26640 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 4 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 33979 | > 99.9% |
Common | 4 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
리 | 1111 | 3.3% |
나 | 1044 | 3.1% |
무 | 844 | 2.5% |
속 | 811 | 2.4% |
이 | 791 | 2.3% |
고 | 723 | 2.1% |
개 | 629 | 1.9% |
미 | 478 | 1.4% |
구 | 454 | 1.3% |
사 | 454 | 1.3% |
Other values (677) | 26640 |
Common
Value | Count | Frequency (%) |
/ | 4 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 33979 | > 99.9% |
ASCII | 4 | < 0.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
리 | 1111 | 3.3% |
나 | 1044 | 3.1% |
무 | 844 | 2.5% |
속 | 811 | 2.4% |
이 | 791 | 2.3% |
고 | 723 | 2.1% |
개 | 629 | 1.9% |
미 | 478 | 1.4% |
구 | 454 | 1.3% |
사 | 454 | 1.3% |
Other values (677) | 26640 |
ASCII
Value | Count | Frequency (%) |
/ | 4 | 100.0% |
마커명 (marker name)
Text
Distinct | 55 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
rbcl | 1926 | 18.9% |
coi | 1835 | 18.0% |
its | 1636 | 16.0% |
16s | 1396 | 13.7% |
matk | 1130 | 11.1% |
trnh-psba | 422 | 4.1% |
cytb | 260 | 2.5% |
rrna | 213 | 2.1% |
trnl-f | 191 | 1.9% |
lsu | 135 | 1.3% |
Other values (46) | 1073 |
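The frequency table above is lowercase (`rbcl`, `coi`), while the raw values in the sample rows are mixed case (`rbcL`, `COI`, `Cytb`). A minimal sketch of case-insensitive counting over whole values, using the markers from the first ten sample rows of this report. The report's table appears to count whitespace-separated tokens rather than whole values, so a marker containing a space would be counted differently there.

```python
from collections import Counter

# Marker values from the first ten sample rows of this report.
markers = [
    "rbcL", "Cytb", "CHS-1", "COI", "COI",
    "rbcL", "rbcL", "rbcL", "trnH-psbA", "rbcL",
]

# casefold() makes the grouping case-insensitive.
marker_counts = Counter(m.casefold() for m in markers)
print(marker_counts.most_common(2))  # [('rbcl', 5), ('coi', 2)]
```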
Most occurring characters
Value | Count | Frequency (%) |
S | 3573 | 9.1% |
I | 3543 | 9.1% |
r | 2839 | 7.3% |
b | 2744 | 7.0% |
L | 2272 | 5.8% |
C | 2237 | 5.7% |
t | 2128 | 5.4% |
c | 1982 | 5.1% |
O | 1840 | 4.7% |
1 | 1719 | 4.4% |
Other values (46) | 14222 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 19330 | 49.4% |
Lowercase Letter | 14849 | 38.0% |
Decimal Number | 3765 | 9.6% |
Dash Punctuation | 922 | 2.4% |
Space Separator | 217 | 0.6% |
Other Punctuation | 16 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
S | 3573 | 18.5% |
I | 3543 | 18.3% |
L | 2272 | 11.8% |
C | 2237 | 11.6% |
O | 1840 | 9.5% |
T | 1688 | 8.7% |
K | 1202 | 6.2% |
A | 684 | 3.5% |
H | 588 | 3.0% |
R | 328 | 1.7% |
Other values (14) | 1375 | 7.1% |
Lowercase Letter
Value | Count | Frequency (%) |
r | 2839 | 19.1% |
b | 2744 | 18.5% |
t | 2128 | 14.3% |
c | 1982 | 13.3% |
a | 1202 | 8.1% |
m | 1152 | 7.8% |
n | 758 | 5.1% |
p | 681 | 4.6% |
s | 573 | 3.9% |
y | 280 | 1.9% |
Other values (11) | 510 | 3.4% |
Decimal Number
Value | Count | Frequency (%) |
1 | 1719 | 45.7% |
6 | 1410 | 37.5% |
2 | 361 | 9.6% |
8 | 213 | 5.7% |
3 | 29 | 0.8% |
4 | 20 | 0.5% |
5 | 8 | 0.2% |
9 | 5 | 0.1% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 922 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 217 | 100.0% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 16 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 34140 | 87.3% |
Common | 4920 | 12.6% |
Greek | 39 | 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
S | 3573 | 10.5% |
I | 3543 | 10.4% |
r | 2839 | 8.3% |
b | 2744 | 8.0% |
L | 2272 | 6.7% |
C | 2237 | 6.6% |
t | 2128 | 6.2% |
c | 1982 | 5.8% |
O | 1840 | 5.4% |
T | 1688 | 4.9% |
Other values (34) | 9294 |
Common
Value | Count | Frequency (%) |
1 | 1719 | 34.9% |
6 | 1410 | 28.7% |
- | 922 | 18.7% |
2 | 361 | 7.3% |
(space) | 217 | 4.4% |
8 | 213 | 4.3% |
3 | 29 | 0.6% |
4 | 20 | 0.4% |
/ | 16 | 0.3% |
5 | 8 | 0.2% |
Greek
Value | Count | Frequency (%) |
α | 39 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 39060 | 99.9% |
None | 39 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
S | 3573 | 9.1% |
I | 3543 | 9.1% |
r | 2839 | 7.3% |
b | 2744 | 7.0% |
L | 2272 | 5.8% |
C | 2237 | 5.7% |
t | 2128 | 5.4% |
c | 1982 | 5.1% |
O | 1840 | 4.7% |
1 | 1719 | 4.4% |
Other values (45) | 14183 |
None
Value | Count | Frequency (%) |
α | 39 | 100.0% |
Row index | 유전정보아이디 | 학명 | 국명 | 마커명 |
---|---|---|---|---|
30127 | WBN0368067 | Asplenium incisum Thunb. | 꼬리고사리 | rbcL |
2111 | WBN0342171 | Anthus gustavi Swinhoe, 1863 | 흰등밭종다리 | Cytb |
39921 | WBN0377694 | Glomerella Spauld. & H. Schrenk 1903 | 작은뿔껍질균속 | CHS-1 |
2272 | WBN0340032 | Petrolisthes coccineus (Owen, 1839) | 검붉은게붙이 | COI |
57483 | WBN0383118 | Pagurus maculosus Komai & Imafuku, 1996 | 가는몸참집게 | COI |
53234 | WBN0387239 | Polygonatum Mill. | 둥굴레속 | rbcL |
467 | WBN0337853 | Petunia × hybrida (Hook.) Vilm. | 페튜니아 | rbcL |
7881 | WBN0356460 | Potamogeton fryeri A. Benn. | 선가래 | rbcL |
45620 | WBN0401025 | Cardamine leucantha (Tausch) O. E. Schulz | 미나리냉이 | trnH-psbA |
47831 | WBN0395041 | Clematis ochotensis (Pall.) Poir. | 자주종덩굴 | rbcL |
Row index | 유전정보아이디 | 학명 | 국명 | 마커명 |
---|---|---|---|---|
63650 | WBN0403163 | Cylindromyia brassicaria (Fabricius, 1775) | 표주박기생파리 | COI |
47852 | WBN0395062 | Coriandrum sativum L. | 고수 | rbcL |
19620 | WBN0347780 | Solanum lycopersicum L. | 토마토 | trnH-psbA |
10779 | WBN0348976 | Asparagus cochinchinensis (Lour.) Merr. | 천문동 | ITS |
29079 | WBN0365442 | Mytilus unguiculatus Valenciennes, 1858 | 홍합 | 16S |
9701 | WBN0354896 | Aster meyendorffii (Regel & Maack) Voss | 개쑥부쟁이 | matK |
17755 | WBN0361442 | Sagina L. | 개미자리속 | trnL-F |
58179 | WBN0389533 | Modiolicola bifida Tanaka, 1961 | 진주담치속살이 | COI |
38161 | WBN0373770 | Gasteracantha kuhli C. L. Koch, 1837 | 가시거미 | 16S |
41303 | WBN0376532 | Dissotis rotundifolia (Sm.) Triana | <NA> | ITS |