Overview

Dataset statistics

Number of variables: 17
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 145.0 B

Variable types

Categorical: 11
Text: 6

Dataset

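Description: Provides the classification, genus name, species name, quantity, collection method, current location, and related information for the natural history materials (animals) held by the Jeju Folklore and Natural History Museum.
Author: Jeju Special Self-Governing Province (제주특별자치도)
URL: https://www.data.go.kr/data/15045471/fileData.do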

Alerts

기관코드 has constant value "JFNM" (Constant)
대분류 has constant value "동물" (Constant)
수량 has constant value "1" (Constant)
데이터기준일자 has constant value "2016-09-30" (Constant)
소분류 is highly overall correlated with 중분류 and 2 other fields (High correlation)
중분류 is highly overall correlated with 소분류 and 3 other fields (High correlation)
이명 is highly overall correlated with 사진분류 and 1 other field (High correlation)
사진분류 is highly overall correlated with 중분류 and 4 other fields (High correlation)
상태 is highly overall correlated with 중분류 and 4 other fields (High correlation)
현 상태 is highly overall correlated with 중분류 and 2 other fields (High correlation)
중분류 is highly imbalanced (74.0%) (Imbalance)
이명 is highly imbalanced (96.0%) (Imbalance)
사진분류 is highly imbalanced (79.4%) (Imbalance)
상태 is highly imbalanced (73.5%) (Imbalance)
현 상태 is highly imbalanced (86.0%) (Imbalance)
자료수집번호(통합) has unique values (Unique)
자료세부분류번호 has unique values (Unique)
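
The imbalance percentages in the alerts can be reproduced from the value counts listed later in the report. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. This formula is an assumption that happens to match the 중분류 and 상태 figures; the report itself does not state how the score is computed, and the function name is hypothetical.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (k = number of classes)."""
    k = len(counts)
    if k < 2:
        raise ValueError("the score needs at least two classes")
    total = sum(counts)
    # Shannon entropy in bits; zero counts contribute nothing.
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)

# Value counts taken from the 중분류 and 상태 sections of this report.
print(f"{imbalance_score([8923, 858, 107, 73, 39]):.1%}")  # 74.0%
print(f"{imbalance_score([8923, 1012, 61, 4]):.1%}")       # 73.5%
```

A score of 0% means perfectly balanced classes, and values near 100% mean a single class dominates, as 곤충 does here.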

Reproduction

Analysis started: 2023-12-11 23:07:22.464060
Analysis finished: 2023-12-11 23:07:24.054241
Duration: 1.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
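
The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 23:07:22.464060")
finished = datetime.fromisoformat("2023-12-11 23:07:24.054241")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.59 seconds
```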

Variables

기관코드
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
JFNM
10000 

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJFNM
2nd rowJFNM
3rd rowJFNM
4th rowJFNM
5th rowJFNM

Common Values

ValueCountFrequency (%)
JFNM 10000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
jfnm 10000
100.0%

대분류
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
동물
10000 

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row동물
2nd row동물
3rd row동물
4th row동물
5th row동물

Common Values

ValueCountFrequency (%)
동물 10000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
동물 10000
100.0%

중분류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
곤충
8923 
조류
 
858
포유류
 
107
양서류
 
73
파충류
 
39

Length

Max length3
Median length2
Mean length2.0219
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row포유류
2nd row곤충
3rd row파충류
4th row곤충
5th row양서류

Common Values

ValueCountFrequency (%)
곤충 8923
89.2%
조류 858
 
8.6%
포유류 107
 
1.1%
양서류 73
 
0.7%
파충류 39
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
곤충 8923
89.2%
조류 858
 
8.6%
포유류 107
 
1.1%
양서류 73
 
0.7%
파충류 39
 
0.4%

소분류
Categorical

HIGH CORRELATION 

Distinct49
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
딱정벌레목
3034 
나비목
1598 
나비목(나방류)
1174 
노린재목
987 
잠자리목
543 
Other values (44)
2664 

Length

Max length8
Median length5
Mean length4.4816
Min length2

Unique

Unique6 ?
Unique (%)0.1%

Sample

1st row식육목
2nd row딱정벌레목
3rd row뱀목
4th row노린재목
5th row개구리목

Common Values

ValueCountFrequency (%)
딱정벌레목 3034
30.3%
나비목 1598
16.0%
나비목(나방류) 1174
 
11.7%
노린재목 987
 
9.9%
잠자리목 543
 
5.4%
메뚜기목 500
 
5.0%
벌목 386
 
3.9%
참새목 343
 
3.4%
파리목 233
 
2.3%
나비목(나비류) 186
 
1.9%
Other values (39) 1016
 
10.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
딱정벌레목 3034
30.3%
나비목 1598
16.0%
나비목(나방류 1174
 
11.7%
노린재목 987
 
9.9%
잠자리목 543
 
5.4%
메뚜기목 500
 
5.0%
벌목 386
 
3.9%
참새목 343
 
3.4%
파리목 233
 
2.3%
나비목(나비류 186
 
1.9%
Other values (39) 1016
 
10.2%

세분류
Text

Distinct167
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length7
Median length4
Mean length4.0728
Min length2

Characters and Unicode

Total characters40728
Distinct characters199
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique27 ?
Unique (%)0.3%

Sample

1st row족제비과
2nd row해당없음
3rd row뱀과
4th row해당없음
5th row무당개구리과
ValueCountFrequency (%)
해당없음 6056
60.6%
잠자리과 269
 
2.7%
딱정벌레과 206
 
2.1%
노린재과 158
 
1.6%
흰나비과 157
 
1.6%
자나방과 143
 
1.4%
네발나비과 138
 
1.4%
박각시과 112
 
1.1%
사슴벌레과 105
 
1.1%
말벌과 102
 
1.0%
Other values (157) 2554
25.5%

Most occurring characters

ValueCountFrequency (%)
6118
15.0%
6056
14.9%
6056
14.9%
6056
14.9%
3937
9.7%
767
 
1.9%
716
 
1.8%
695
 
1.7%
519
 
1.3%
453
 
1.1%
Other values (189) 9355
23.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 40728
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6118
15.0%
6056
14.9%
6056
14.9%
6056
14.9%
3937
9.7%
767
 
1.9%
716
 
1.8%
695
 
1.7%
519
 
1.3%
453
 
1.1%
Other values (189) 9355
23.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 40728
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6118
15.0%
6056
14.9%
6056
14.9%
6056
14.9%
3937
9.7%
767
 
1.9%
716
 
1.8%
695
 
1.7%
519
 
1.3%
453
 
1.1%
Other values (189) 9355
23.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 40728
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6118
15.0%
6056
14.9%
6056
14.9%
6056
14.9%
3937
9.7%
767
 
1.9%
716
 
1.8%
695
 
1.7%
519
 
1.3%
453
 
1.1%
Other values (189) 9355
23.0%

속명
Text

Distinct442
Distinct (%)4.4%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length16
Median length4
Mean length5.8068
Min length3

Characters and Unicode

Total characters58068
Distinct characters59
Distinct categories7 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique99 ?
Unique (%)1.0%

Sample

1st rowMustela
2nd row해당없음
3rd rowSibynophis
4th row해당없음
5th rowBombina
ValueCountFrequency (%)
해당없음 6157
61.6%
eurema 106
 
1.1%
pantala 100
 
1.0%
vespa 91
 
0.9%
serrognathus 88
 
0.9%
sympetrum 68
 
0.7%
plautia 67
 
0.7%
anas 66
 
0.7%
polygonia 58
 
0.6%
bombina 58
 
0.6%
Other values (425) 3141
31.4%

Most occurring characters

ValueCountFrequency (%)
6157
 
10.6%
6157
 
10.6%
6157
 
10.6%
6157
 
10.6%
a 3961
 
6.8%
o 2751
 
4.7%
i 2514
 
4.3%
s 2459
 
4.2%
r 2142
 
3.7%
e 2040
 
3.5%
Other values (49) 17573
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29056
50.0%
Other Letter 24632
42.4%
Uppercase Letter 3842
 
6.6%
Space Separator 486
 
0.8%
Close Punctuation 24
 
< 0.1%
Open Punctuation 24
 
< 0.1%
Other Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3961
13.6%
o 2751
 
9.5%
i 2514
 
8.7%
s 2459
 
8.5%
r 2142
 
7.4%
e 2040
 
7.0%
t 1644
 
5.7%
u 1593
 
5.5%
l 1557
 
5.4%
n 1474
 
5.1%
Other values (14) 6921
23.8%
Uppercase Letter
ValueCountFrequency (%)
P 744
19.4%
C 492
12.8%
A 489
12.7%
S 448
11.7%
E 264
 
6.9%
B 196
 
5.1%
M 188
 
4.9%
D 141
 
3.7%
L 134
 
3.5%
H 111
 
2.9%
Other values (13) 635
16.5%
Other Letter
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
486
100.0%
Close Punctuation
ValueCountFrequency (%)
] 24
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 24
100.0%
Other Punctuation
ValueCountFrequency (%)
\ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32898
56.7%
Hangul 24632
42.4%
Common 538
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3961
 
12.0%
o 2751
 
8.4%
i 2514
 
7.6%
s 2459
 
7.5%
r 2142
 
6.5%
e 2040
 
6.2%
t 1644
 
5.0%
u 1593
 
4.8%
l 1557
 
4.7%
n 1474
 
4.5%
Other values (37) 10763
32.7%
Hangul
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Common
ValueCountFrequency (%)
486
90.3%
] 24
 
4.5%
[ 24
 
4.5%
\ 4
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33436
57.6%
Hangul 24632
42.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
ASCII
ValueCountFrequency (%)
a 3961
 
11.8%
o 2751
 
8.2%
i 2514
 
7.5%
s 2459
 
7.4%
r 2142
 
6.4%
e 2040
 
6.1%
t 1644
 
4.9%
u 1593
 
4.8%
l 1557
 
4.7%
n 1474
 
4.4%
Other values (41) 11301
33.8%

종명
Text

Distinct507
Distinct (%)5.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length21
Median length4
Mean length5.7308
Min length4

Characters and Unicode

Total characters57308
Distinct characters33
Distinct categories5 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique129 ?
Unique (%)1.3%

Sample

1st rowsibirica
2nd row해당없음
3rd rowchinensis
4th row해당없음
5th roworientalis
ValueCountFrequency (%)
해당없음 6157
61.5%
flavescens 100
 
1.0%
simillima 91
 
0.9%
japonica 90
 
0.9%
orientalis 89
 
0.9%
platymelus 86
 
0.9%
stali 67
 
0.7%
c-aureum 58
 
0.6%
servilia 57
 
0.6%
eroticum 54
 
0.5%
Other values (495) 3167
31.6%

Most occurring characters

ValueCountFrequency (%)
6157
10.7%
6157
10.7%
6157
10.7%
6157
10.7%
a 4263
 
7.4%
i 3846
 
6.7%
s 3053
 
5.3%
e 2439
 
4.3%
r 2358
 
4.1%
n 1973
 
3.4%
Other values (23) 14748
25.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32162
56.1%
Other Letter 24628
43.0%
Space Separator 459
 
0.8%
Dash Punctuation 58
 
0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4263
13.3%
i 3846
12.0%
s 3053
9.5%
e 2439
 
7.6%
r 2358
 
7.3%
n 1973
 
6.1%
u 1930
 
6.0%
t 1920
 
6.0%
l 1793
 
5.6%
c 1694
 
5.3%
Other values (16) 6893
21.4%
Other Letter
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
Space Separator
ValueCountFrequency (%)
459
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 58
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32163
56.1%
Hangul 24628
43.0%
Common 517
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4263
13.3%
i 3846
12.0%
s 3053
9.5%
e 2439
 
7.6%
r 2358
 
7.3%
n 1973
 
6.1%
u 1930
 
6.0%
t 1920
 
6.0%
l 1793
 
5.6%
c 1694
 
5.3%
Other values (17) 6894
21.4%
Hangul
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
Common
ValueCountFrequency (%)
459
88.8%
- 58
 
11.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32680
57.0%
Hangul 24628
43.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6157
25.0%
6157
25.0%
6157
25.0%
6157
25.0%
ASCII
ValueCountFrequency (%)
a 4263
13.0%
i 3846
11.8%
s 3053
9.3%
e 2439
 
7.5%
r 2358
 
7.2%
n 1973
 
6.0%
u 1930
 
5.9%
t 1920
 
5.9%
l 1793
 
5.5%
c 1694
 
5.2%
Other values (19) 7411
22.7%

명칭
Text

Distinct1258
Distinct (%)12.6%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length13
Median length11
Mean length5.7685
Min length1

Characters and Unicode

Total characters57685
Distinct characters477
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique329 ?
Unique (%)3.3%

Sample

1st row족제비
2nd row홍테무당벌레
3rd row비바리뱀
4th row애땅노린재
5th row무당개구리
ValueCountFrequency (%)
왕딱정벌레 112
 
1.1%
청동풍뎅이 112
 
1.1%
넓적사슴벌레 91
 
0.9%
황말벌 90
 
0.9%
왕빗살방아벌레 89
 
0.9%
애기뿔소똥구리 86
 
0.9%
주둥무늬차색풍뎅이 83
 
0.8%
호랑나비 83
 
0.8%
큰수중다리송장벌레 82
 
0.8%
멋쟁이딱정벌레 81
 
0.8%
Other values (1246) 9094
90.9%

Most occurring characters

ValueCountFrequency (%)
2790
 
4.8%
2247
 
3.9%
2073
 
3.6%
1784
 
3.1%
1779
 
3.1%
1577
 
2.7%
1481
 
2.6%
1223
 
2.1%
963
 
1.7%
909
 
1.6%
Other values (467) 40859
70.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 57642
99.9%
Space Separator 41
 
0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2790
 
4.8%
2247
 
3.9%
2073
 
3.6%
1784
 
3.1%
1779
 
3.1%
1577
 
2.7%
1481
 
2.6%
1223
 
2.1%
963
 
1.7%
909
 
1.6%
Other values (464) 40816
70.8%
Space Separator
ValueCountFrequency (%)
41
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 57642
99.9%
Common 43
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2790
 
4.8%
2247
 
3.9%
2073
 
3.6%
1784
 
3.1%
1779
 
3.1%
1577
 
2.7%
1481
 
2.6%
1223
 
2.1%
963
 
1.7%
909
 
1.6%
Other values (464) 40816
70.8%
Common
ValueCountFrequency (%)
41
95.3%
( 1
 
2.3%
) 1
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 57642
99.9%
ASCII 43
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2790
 
4.8%
2247
 
3.9%
2073
 
3.6%
1784
 
3.1%
1779
 
3.1%
1577
 
2.7%
1481
 
2.6%
1223
 
2.1%
963
 
1.7%
909
 
1.6%
Other values (464) 40816
70.8%
ASCII
ValueCountFrequency (%)
41
95.3%
( 1
 
2.3%
) 1
 
2.3%

이명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct25
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
해당없음
9830 
Oriental Fire-bellied toad
 
45
등줄메뚜기
 
17
때죽재주나방
 
16
Spot-billed Duck
 
14
Other values (20)
 
78

Length

Max length26
Median length4
Mean length4.1773
Min length2

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st row해당없음
2nd row해당없음
3rd row해당없음
4th row해당없음
5th rowOriental Fire-bellied toad

Common Values

ValueCountFrequency (%)
해당없음 9830
98.3%
Oriental Fire-bellied toad 45
 
0.4%
등줄메뚜기 17
 
0.2%
때죽재주나방 16
 
0.2%
Spot-billed Duck 14
 
0.1%
Ring-necked Pheasant 14
 
0.1%
Intermediate Egret 13
 
0.1%
Mallard 7
 
0.1%
곱추하늘나방 7
 
0.1%
Common Buzzard 4
 
< 0.1%
Other values (15) 33
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
해당없음 9830
97.0%
toad 45
 
0.4%
oriental 45
 
0.4%
fire-bellied 45
 
0.4%
등줄메뚜기 17
 
0.2%
때죽재주나방 16
 
0.2%
spot-billed 14
 
0.1%
duck 14
 
0.1%
pheasant 14
 
0.1%
ring-necked 14
 
0.1%
Other values (22) 85
 
0.8%

자료수집번호(통합)
Text

UNIQUE 

Distinct10000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length12
Median length9
Mean length9.2604
Min length6

Characters and Unicode

Total characters92604
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10000 ?
Unique (%)100.0%

Sample

1st rowJFNM-73
2nd rowJFNM-4646
3rd rowJFNM-1288
4th rowJFNM-12499
5th rowJFNM-1332
ValueCountFrequency (%)
jfnm-73 1
 
< 0.1%
jfnm-9751 1
 
< 0.1%
jfnm-12200 1
 
< 0.1%
jfnm-10182 1
 
< 0.1%
jfnm-12572 1
 
< 0.1%
jfnm-3498 1
 
< 0.1%
jfnm-13475 1
 
< 0.1%
jfnm-7613 1
 
< 0.1%
jfnm-4950 1
 
< 0.1%
jfnm-853 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
J 10000
10.8%
F 10000
10.8%
N 10000
10.8%
M 10000
10.8%
- 10000
10.8%
1 7571
8.2%
3 4587
 
5.0%
2 4020
 
4.3%
4 3994
 
4.3%
0 3824
 
4.1%
Other values (5) 18608
20.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 42604
46.0%
Uppercase Letter 40000
43.2%
Dash Punctuation 10000
 
10.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 7571
17.8%
3 4587
10.8%
2 4020
9.4%
4 3994
9.4%
0 3824
9.0%
9 3775
8.9%
8 3747
8.8%
5 3720
8.7%
6 3710
8.7%
7 3656
8.6%
Uppercase Letter
ValueCountFrequency (%)
J 10000
25.0%
F 10000
25.0%
N 10000
25.0%
M 10000
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 52604
56.8%
Latin 40000
43.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 10000
19.0%
1 7571
14.4%
3 4587
8.7%
2 4020
7.6%
4 3994
 
7.6%
0 3824
 
7.3%
9 3775
 
7.2%
8 3747
 
7.1%
5 3720
 
7.1%
6 3710
 
7.1%
Latin
ValueCountFrequency (%)
J 10000
25.0%
F 10000
25.0%
N 10000
25.0%
M 10000
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92604
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
J 10000
10.8%
F 10000
10.8%
N 10000
10.8%
M 10000
10.8%
- 10000
10.8%
1 7571
8.2%
3 4587
 
5.0%
2 4020
 
4.3%
4 3994
 
4.3%
0 3824
 
4.1%
Other values (5) 18608
20.1%

자료세부분류번호
Text

UNIQUE 

Distinct10000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length14
Median length12
Mean length11.9103
Min length9

Characters and Unicode

Total characters119103
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10000 ?
Unique (%)100.0%

Sample

1st rowJFNM-MM-73
2nd rowJFNM-IN-1834
3rd rowJFNM-RP-40
4th rowJFNM-IN-9687
5th rowJFNM-AM-29
ValueCountFrequency (%)
jfnm-mm-73 1
 
< 0.1%
jfnm-in-6939 1
 
< 0.1%
jfnm-in-9388 1
 
< 0.1%
jfnm-in-7370 1
 
< 0.1%
jfnm-in-9760 1
 
< 0.1%
jfnm-in-686 1
 
< 0.1%
jfnm-in-10663 1
 
< 0.1%
jfnm-in-4801 1
 
< 0.1%
jfnm-in-2138 1
 
< 0.1%
jfnm-av-717 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
- 20000
16.8%
N 18923
15.9%
M 10287
8.6%
J 10000
8.4%
F 10000
8.4%
I 8923
 
7.5%
1 5594
 
4.7%
2 3825
 
3.2%
3 3806
 
3.2%
4 3777
 
3.2%
Other values (10) 23968
20.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 60000
50.4%
Decimal Number 39103
32.8%
Dash Punctuation 20000
 
16.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5594
14.3%
2 3825
9.8%
3 3806
9.7%
4 3777
9.7%
5 3723
9.5%
8 3711
9.5%
0 3672
9.4%
9 3667
9.4%
6 3666
9.4%
7 3662
9.4%
Uppercase Letter
ValueCountFrequency (%)
N 18923
31.5%
M 10287
17.1%
J 10000
16.7%
F 10000
16.7%
I 8923
14.9%
A 931
 
1.6%
V 858
 
1.4%
R 39
 
0.1%
P 39
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 20000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60000
50.4%
Common 59103
49.6%

Most frequent character per script

Common
ValueCountFrequency (%)
- 20000
33.8%
1 5594
 
9.5%
2 3825
 
6.5%
3 3806
 
6.4%
4 3777
 
6.4%
5 3723
 
6.3%
8 3711
 
6.3%
0 3672
 
6.2%
9 3667
 
6.2%
6 3666
 
6.2%
Latin
ValueCountFrequency (%)
N 18923
31.5%
M 10287
17.1%
J 10000
16.7%
F 10000
16.7%
I 8923
14.9%
A 931
 
1.6%
V 858
 
1.4%
R 39
 
0.1%
P 39
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 119103
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 20000
16.8%
N 18923
15.9%
M 10287
8.6%
J 10000
8.4%
F 10000
8.4%
I 8923
 
7.5%
1 5594
 
4.7%
2 3825
 
3.2%
3 3806
 
3.2%
4 3777
 
3.2%
Other values (10) 23968
20.1%
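
The uppercase letter counts for 자료세부분류번호 can be cross-checked against the 중분류 value counts. The sketch below assumes every identifier has the form `JFNM-<code>-<number>` and that the two-letter code depends only on 중분류, using the pairings visible in the sample rows (MM, IN, RP, AM, AV). Extending those five examples to all 10000 rows is an assumption, and the function name is hypothetical.

```python
from collections import Counter

# 중분류 value counts from this report, paired with the code seen in the sample rows.
CLASS_CODES = {
    "곤충": ("IN", 8923),
    "조류": ("AV", 858),
    "포유류": ("MM", 107),
    "양서류": ("AM", 73),
    "파충류": ("RP", 39),
}

def expected_letter_counts(class_codes, prefix="JFNM"):
    """Count each uppercase letter across all identifiers, given rows per code."""
    letters = Counter()
    for code, rows in class_codes.values():
        for ch in prefix + code:
            letters[ch] += rows
    return letters

counts = expected_letter_counts(CLASS_CODES)
print(counts["N"], counts["M"], counts["I"], counts["A"], counts["V"])
# 18923 10287 8923 931 858
```

These totals agree with the Uppercase Letter table for this field, which supports reading the middle segment of the identifier as a class code.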

수량
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
1
10000 

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 10000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 10000
100.0%

사진분류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct8
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
곤충
8923 
조류
 
841
양서파충류
 
106
포유류
 
78
해산포유류
 
28
Other values (3)
 
24

Length

Max length5
Median length2
Mean length2.0498
Min length2

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row포유류
2nd row곤충
3rd row양서파충류
4th row곤충
5th row양서파충류

Common Values

ValueCountFrequency (%)
곤충 8923
89.2%
조류 841
 
8.4%
양서파충류 106
 
1.1%
포유류 78
 
0.8%
해산포유류 28
 
0.3%
동물 18
 
0.2%
해산피충류 5
 
0.1%
해산파충류 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
곤충 8923
89.2%
조류 841
 
8.4%
양서파충류 106
 
1.1%
포유류 78
 
0.8%
해산포유류 28
 
0.3%
동물 18
 
0.2%
해산피충류 5
 
< 0.1%
해산파충류 1
 
< 0.1%

상태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
건조
8923 
박제
1012 
액침
 
61
골격
 
4

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row박제
2nd row건조
3rd row액침
4th row건조
5th row액침

Common Values

ValueCountFrequency (%)
건조 8923
89.2%
박제 1012
 
10.1%
액침 61
 
0.6%
골격 4
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
건조 8923
89.2%
박제 1012
 
10.1%
액침 61
 
0.6%
골격 4
 
< 0.1%

수집방법
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
자체
6382 
구입
3300 
기증
 
317
기증
 
1

Length

Max length3
Median length2
Mean length2.0001
Min length2

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row기증
2nd row구입
3rd row자체
4th row자체
5th row자체

Common Values

ValueCountFrequency (%)
자체 6382
63.8%
구입 3300
33.0%
기증 317
 
3.2%
기증 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
자체 6382
63.8%
구입 3300
33.0%
기증 318
 
3.2%
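
수집방법 lists 기증 twice (317 and 1) with a maximum length of 3, which suggests the single-row value carries one extra whitespace character. The sketch below assumes that extra character is a trailing space; the function names are hypothetical. It shows how "Distinct" (number of different values) and "Unique" (values occurring exactly once) follow from the value counts, and how stripping whitespace merges the two variants.

```python
from collections import Counter

# Value counts from the Common Values table; the trailing space is an assumption.
raw_counts = {"자체": 6382, "구입": 3300, "기증": 317, "기증 ": 1}

def distinct_and_unique(value_counts):
    """Distinct = number of different values; Unique = values seen exactly once."""
    distinct = len(value_counts)
    unique = sum(1 for count in value_counts.values() if count == 1)
    return distinct, unique

def merge_whitespace_variants(value_counts):
    """Add up the counts of values that differ only by surrounding whitespace."""
    merged = Counter()
    for value, count in value_counts.items():
        merged[value.strip()] += count
    return merged

print(distinct_and_unique(raw_counts))  # (4, 1)
cleaned = merge_whitespace_variants(raw_counts)
print(cleaned["기증"], distinct_and_unique(cleaned))  # 318 (3, 0)
```

After merging, the field has three categories and 기증 accounts for 318 rows.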

현 상태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct17
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
동물수장고
9296 
전시
 
317
광식물수장고
 
81
전시(7)
 
44
해양수장고
 
34
Other values (12)
 
228

Length

Max length7
Median length5
Mean length4.9198
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row동물수장고
2nd row동물수장고
3rd row광식물수장고
4th row동물수장고
5th row광식물수장고

Common Values

ValueCountFrequency (%)
동물수장고 9296
93.0%
전시 317
 
3.2%
광식물수장고 81
 
0.8%
전시(7) 44
 
0.4%
해양수장고 34
 
0.3%
전시(10) 31
 
0.3%
전시(8) 31
 
0.3%
전시(3) 28
 
0.3%
전시(4) 26
 
0.3%
전시(2) 26
 
0.3%
Other values (7) 86
 
0.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
동물수장고 9296
93.0%
전시 317
 
3.2%
광식물수장고 81
 
0.8%
전시(7 44
 
0.4%
해양수장고 34
 
0.3%
전시(10 31
 
0.3%
전시(8 31
 
0.3%
전시(3 28
 
0.3%
전시(2 26
 
0.3%
전시(4 26
 
0.3%
Other values (7) 86
 
0.9%
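
The table above lists values such as "전시(7" without the closing parenthesis, which is what a word-level count produces when values are lowercased, split on whitespace, and stripped of leading and trailing punctuation. The sketch below is one tokenisation that reproduces those rows; whether the profiler does exactly this is an assumption, and the function name is hypothetical.

```python
import string
from collections import Counter

def word_counts(value_counts):
    """Count lowercase whitespace-separated tokens, trimming punctuation at both ends."""
    strip_chars = string.punctuation + string.whitespace
    words = Counter()
    for value, count in value_counts.items():
        for token in value.lower().split():
            token = token.strip(strip_chars)
            if token:
                words[token] += count
    return words

# A few values and counts taken from this report.
sample = {"전시": 317, "전시(7)": 44, "Oriental Fire-bellied toad": 45}
print(word_counts(sample))
```

Only punctuation at the ends of a token is removed, so the opening parenthesis in "전시(7" and the hyphen in "fire-bellied" survive.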

데이터기준일자
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
2016-09-30
10000 

Length

Max length10
Median length10
Mean length10
Min length10

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2016-09-30
2nd row2016-09-30
3rd row2016-09-30
4th row2016-09-30
5th row2016-09-30

Common Values

ValueCountFrequency (%)
2016-09-30 10000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2016-09-30 10000
100.0%

Correlations

 | 중분류 | 소분류 | 이명 | 사진분류 | 상태 | 수집방법 | 현 상태
중분류 | 1.000 | 1.000 | 0.793 | 0.926 | 0.774 | 0.198 | 0.764
소분류 | 1.000 | 1.000 | 0.877 | 0.971 | 0.928 | 0.526 | 0.815
이명 | 0.793 | 0.877 | 1.000 | 0.854 | 0.819 | 0.196 | 0.606
사진분류 | 0.926 | 0.971 | 0.854 | 1.000 | 0.956 | 0.357 | 0.807
상태 | 0.774 | 0.928 | 0.819 | 0.956 | 1.000 | 0.342 | 0.747
수집방법 | 0.198 | 0.526 | 0.196 | 0.357 | 0.342 | 1.000 | 0.263
현 상태 | 0.764 | 0.815 | 0.606 | 0.807 | 0.747 | 0.263 | 1.000

 | 사진분류 | 상태 | 현 상태 | 소분류 | 수집방법 | 이명 | 중분류
사진분류 | 1.000 | 0.723 | 0.506 | 0.817 | 0.166 | 0.553 | 0.880
상태 | 0.723 | 1.000 | 0.524 | 0.746 | 0.139 | 0.591 | 0.726
현 상태 | 0.506 | 0.524 | 1.000 | 0.357 | 0.149 | 0.219 | 0.526
소분류 | 0.817 | 0.746 | 0.357 | 1.000 | 0.290 | 0.380 | 0.998
수집방법 | 0.166 | 0.139 | 0.149 | 0.290 | 1.000 | 0.105 | 0.163
이명 | 0.553 | 0.591 | 0.219 | 0.380 | 0.105 | 1.000 | 0.477
중분류 | 0.880 | 0.726 | 0.526 | 0.998 | 0.163 | 0.477 | 1.000

 | 중분류 | 소분류 | 이명 | 사진분류 | 상태 | 수집방법 | 현 상태
중분류 | 1.000 | 0.998 | 0.477 | 0.880 | 0.726 | 0.163 | 0.526
소분류 | 0.998 | 1.000 | 0.380 | 0.817 | 0.746 | 0.290 | 0.357
이명 | 0.477 | 0.380 | 1.000 | 0.553 | 0.591 | 0.105 | 0.219
사진분류 | 0.880 | 0.817 | 0.553 | 1.000 | 0.723 | 0.166 | 0.506
상태 | 0.726 | 0.746 | 0.591 | 0.723 | 1.000 | 0.139 | 0.524
수집방법 | 0.163 | 0.290 | 0.105 | 0.166 | 0.139 | 1.000 | 0.149
현 상태 | 0.526 | 0.357 | 0.219 | 0.506 | 0.524 | 0.149 | 1.000
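
The report does not say which association measure fills these matrices. For pairs of categorical fields, Cramér's V is a common choice, so the sketch below shows the plain (uncorrected) form of that statistic as an illustration only. It is an assumption that the values above come from this measure or a bias-corrected variant of it, and the function name is hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    rows = Counter(xs)
    cols = Counter(ys)
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k == 0:
        return 0.0  # a constant field carries no association information
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))

# Toy data: 상태 is fully determined by 중분류, so the association is perfect.
print(cramers_v(["곤충", "곤충", "조류", "조류"], ["건조", "건조", "박제", "박제"]))  # 1.0
```

Values near 1 mean one field nearly determines the other, which is the pattern expected between hierarchical fields such as 중분류 and 소분류.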

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

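 | 기관코드 | 대분류 | 중분류 | 소분류 | 세분류 | 속명 | 종명 | 명칭 | 이명 | 자료수집번호(통합) | 자료세부분류번호 | 수량 | 사진분류 | 상태 | 수집방법 | 현 상태 | 데이터기준일자
72 | JFNM | 동물 | 포유류 | 식육목 | 족제비과 | Mustela | sibirica | 족제비 | 해당없음 | JFNM-73 | JFNM-MM-73 | 1 | 포유류 | 박제 | 기증 | 동물수장고 | 2016-09-30
12466 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 홍테무당벌레 | 해당없음 | JFNM-4646 | JFNM-IN-1834 | 1 | 곤충 | 건조 | 구입 | 동물수장고 | 2016-09-30
1290 | JFNM | 동물 | 파충류 | 뱀목 | 뱀과 | Sibynophis | chinensis | 비바리뱀 | 해당없음 | JFNM-1288 | JFNM-RP-40 | 1 | 양서파충류 | 액침 | 자체 | 광식물수장고 | 2016-09-30
7957 | JFNM | 동물 | 곤충 | 노린재목 | 해당없음 | 해당없음 | 해당없음 | 애땅노린재 | 해당없음 | JFNM-12499 | JFNM-IN-9687 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
1334 | JFNM | 동물 | 양서류 | 개구리목 | 무당개구리과 | Bombina | orientalis | 무당개구리 | Oriental Fire-bellied toad | JFNM-1332 | JFNM-AM-29 | 1 | 양서파충류 | 액침 | 자체 | 광식물수장고 | 2016-09-30
12772 | JFNM | 동물 | 곤충 | 나비목 | 해당없음 | 해당없음 | 해당없음 | 흰뱀눈나비 | 해당없음 | JFNM-7152 | JFNM-IN-4340 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
149 | JFNM | 동물 | 조류 | 참새목 | 까마귀과 | Garrulus | glandarius | 어치 | 해당없음 | JFNM-149 | JFNM-AV-13 | 1 | 조류 | 박제 | 구입 | 동물수장고 | 2016-09-30
2300 | JFNM | 동물 | 곤충 | 파리목 | 광대파리과 | Spheniscomyia | sexmaculatus | 광대파리 | 해당없음 | JFNM-3295 | JFNM-IN-483 | 1 | 곤충 | 건조 | 구입 | 동물수장고 | 2016-09-30
10942 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 콩알물땡땡이 | 해당없음 | JFNM-6854 | JFNM-IN-4042 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
954 | JFNM | 동물 | 조류 | 매목 | 매과 | Falco | peregrinus |  | Peregrine Falcon | JFNM-954 | JFNM-AV-818 | 1 | 조류 | 박제 | 자체 | 전시 | 2016-09-30

 | 기관코드 | 대분류 | 중분류 | 소분류 | 세분류 | 속명 | 종명 | 명칭 | 이명 | 자료수집번호(통합) | 자료세부분류번호 | 수량 | 사진분류 | 상태 | 수집방법 | 현 상태 | 데이터기준일자
12483 | JFNM | 동물 | 곤충 | 벌목 | 말벌과 | Vespa | simillima | 황말벌 | 해당없음 | JFNM-13877 | JFNM-IN-11065 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
6656 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 뽕나무하늘소 | 해당없음 | JFNM-13268 | JFNM-IN-10456 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
12524 | JFNM | 동물 | 곤충 | 벌목 | 말벌과 | Vespa | simillima | 황말벌 | 해당없음 | JFNM-6784 | JFNM-IN-3972 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
4257 | JFNM | 동물 | 곤충 | 잠자리목 | 잠자리과 | Pantala | flavescens | 된장잠자리 | 해당없음 | JFNM-10462 | JFNM-IN-7650 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
7734 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 애기물방개 | 해당없음 | JFNM-4959 | JFNM-IN-2147 | 1 | 곤충 | 건조 | 구입 | 동물수장고 | 2016-09-30
9006 | JFNM | 동물 | 곤충 | 나비목(나방류) | 해당없음 | 해당없음 | 해당없음 | 우단박각시 | 해당없음 | JFNM-9259 | JFNM-IN-6447 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
11292 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 큰수중다리송장벌레 | 해당없음 | JFNM-7309 | JFNM-IN-4497 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
8003 | JFNM | 동물 | 곤충 | 딱정벌레목 | 해당없음 | 해당없음 | 해당없음 | 애물땡땡이 | 해당없음 | JFNM-9606 | JFNM-IN-6794 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
4517 | JFNM | 동물 | 곤충 | 메뚜기목 | 메뚜기과 | Shirakiacris | shirakii | 등검은메뚜기 | 해당없음 | JFNM-11631 | JFNM-IN-8819 | 1 | 곤충 | 건조 | 자체 | 동물수장고 | 2016-09-30
4055 | JFNM | 동물 | 곤충 | 딱정벌레목 | 풍뎅이과 | Anomala | sieversi | 대마도줄풍뎅이 | 해당없음 | JFNM-6121 | JFNM-IN-3309 | 1 | 곤충 | 건조 | 구입 | 동물수장고 | 2016-09-30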