Overview

Dataset statistics

Number of variables: 6
Number of observations: 3899
Missing cells: 6218
Missing cells (%): 26.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 186.7 KiB
Average record size in memory: 49.0 B
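
The percentage and per-record figures follow from the counts listed here. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 6
n_observations = 3899
missing_cells = 6218
total_size_kib = 186.7

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values reported by the profiler.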

Variable types

Numeric: 1
Categorical: 1
Text: 4

Dataset

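Description: A list of specimen records obtained through plant pest and disease surveillance surveys and research projects, maintained for the comprehensive preservation, management, and use of the specimens.
Author: Ministry of Agriculture, Food and Rural Affairs, Animal and Plant Quarantine Agency (농림축산식품부 농림축산검역본부)
URL: https://www.data.go.kr/data/15107667/fileData.do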

Alerts

[variable name missing] has 1699 (43.6%) missing values (Missing)
[variable name missing] has 1698 (43.5%) missing values (Missing)
[variable name missing] has 1700 (43.6%) missing values (Missing)
채집일 has 1121 (28.8%) missing values (Missing)
번호 has unique values (Unique)
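
The four missing-value alerts can be cross-checked against the dataset-level count of missing cells. A small sketch, using only the numbers shown in the alerts and the dataset statistics:

```python
# Missing-value counts from the four "Missing" alerts, in the order listed.
missing_per_variable = [1699, 1698, 1700, 1121]
n_observations = 3899

total_missing = sum(missing_per_variable)
shares = [round(100 * m / n_observations, 1) for m in missing_per_variable]

print(total_missing)  # matches the "Missing cells" figure of the report
print(shares)
```

The sum equals the reported 6218 missing cells, which suggests that no other variable contributes missing values as the profiler counts them.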

Reproduction

Analysis started: 2023-12-12 07:36:45.058802
Analysis finished: 2023-12-12 07:36:46.146820
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 3899
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1999.8248
Minimum: 1
Maximum: 3951
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 247.9
Q1: 1027.5
Median: 2002
Q3: 2976.5
95-th percentile: 3756.1
Maximum: 3951
Range: 3950
Interquartile range (IQR): 1949
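
The range and interquartile range are simple differences of the quantiles listed here. A short sketch of the arithmetic, using the reported values:

```python
# Quantiles of 번호 as reported in this section.
minimum, q1, q3, maximum = 1, 1027.5, 2976.5, 3951

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range, iqr)
```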

Descriptive statistics

Standard deviation: 1129.3412
Coefficient of variation (CV): 0.56472007
Kurtosis: -1.1868309
Mean: 1999.8248
Median Absolute Deviation (MAD): 975
Skewness: -0.010492774
Sum: 7797317
Variance: 1275411.6
Monotonicity: Strictly increasing
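
Several of these statistics are derived from one another. The following sketch recomputes the mean, coefficient of variation, and variance from the reported sum, count, and standard deviation; small differences are expected because the inputs are already rounded:

```python
# Reported figures for 번호.
n = 3899
total = 7797317
std = 1129.3412

mean = total / n      # Mean = Sum / number of observations
cv = std / mean       # Coefficient of variation = std / mean
variance = std ** 2   # Variance = std squared

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.1f}")
```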
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
1                      1       < 0.1%
2673                   1       < 0.1%
2645                   1       < 0.1%
2646                   1       < 0.1%
2647                   1       < 0.1%
2648                   1       < 0.1%
2649                   1       < 0.1%
2650                   1       < 0.1%
2651                   1       < 0.1%
2652                   1       < 0.1%
Other values (3889)    3889    99.7%

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%

Value   Count   Frequency (%)
3951    1       < 0.1%
3950    1       < 0.1%
3949    1       < 0.1%
3948    1       < 0.1%
3947    1       < 0.1%
3946    1       < 0.1%
3945    1       < 0.1%
3944    1       < 0.1%
3943    1       < 0.1%
3942    1       < 0.1%


Categorical

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
Lepidoptera: 2291
<NA>: 839
Coleoptera: 443
Hemiptera: 181
Hymenoptera: 122
Other values (5): 23

Length

Max length: 11
Median length: 11
Mean length: 9.2680174
Min length: 4

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: Hemiptera
2nd row: Hemiptera
3rd row: Hemiptera
4th row: Hemiptera
5th row: Hemiptera

Common Values

Value         Count   Frequency (%)
Lepidoptera   2291    58.8%
<NA>          839     21.5%
Coleoptera    443     11.4%
Hemiptera     181     4.6%
Hymenoptera   122     3.1%
Diptera       15      0.4%
Orthoptera    5       0.1%
Odonata       1       < 0.1%
Neoptera      1       < 0.1%
Isoptera      1       < 0.1%
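
The "Distinct", "Unique", and "Mean length" figures for this variable can be reproduced from the value counts alone. The sketch below assumes that "Distinct" counts different values, that "Unique" counts values occurring exactly once, and that `<NA>` is stored as a literal four-character string (which would explain both "Missing: 0" and "Min length: 4"):

```python
from collections import Counter

# Value counts copied from the Common Values table.
counts = Counter({
    "Lepidoptera": 2291, "<NA>": 839, "Coleoptera": 443, "Hemiptera": 181,
    "Hymenoptera": 122, "Diptera": 15, "Orthoptera": 5,
    "Odonata": 1, "Neoptera": 1, "Isoptera": 1,
})

n = sum(counts.values())                              # number of observations
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
mean_length = sum(len(v) * c for v, c in counts.items()) / n
top_share = 100 * counts["Lepidoptera"] / n

print(n, distinct, unique, round(mean_length, 7), round(top_share, 1))
```

Under these assumptions the results agree with the report, so the 839 `<NA>` entries are likely placeholder strings rather than true nulls and may need converting before analysis.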

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count   Frequency (%)
lepidoptera   2291    58.8%
na            839     21.5%
coleoptera    443     11.4%
hemiptera     181     4.6%
hymenoptera   122     3.1%
diptera       15      0.4%
orthoptera    5       0.1%
odonata       1       < 0.1%
neoptera      1       < 0.1%
isoptera      1       < 0.1%


Text

MISSING 

Distinct: 83
Distinct (%): 3.8%
Missing: 1699
Missing (%): 43.6%
Memory size: 30.6 KiB

Length

Max length: 16
Median length: 15
Mean length: 10.383182
Min length: 7

Characters and Unicode

Total characters: 22843
Distinct characters: 38
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
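
As an illustration of how such per-character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings are chosen for illustration; this is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Illustrative input: family names that appear in this report's sample rows.
sample = ["Velliidae", "Lygaeidae", "Tingidae"]
cats = category_counts(sample)

# "Lu" is Uppercase Letter, "Ll" is Lowercase Letter.
print(dict(cats))
```

For these three names the tally is 3 uppercase and 23 lowercase letters, the same two categories the report lists for this variable.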

Unique

Unique: 23
Unique (%): 1.0%

Sample

1st row: Velliidae
2nd row: Lygaeidae
3rd row: Lygaeidae
4th row: Lygaeidae
5th row: Lygaeidae

Value               Count   Frequency (%)
geometridae         374     17.0%
noctuidae           340     15.5%
curculionidae       185     8.4%
tortricidae         181     8.2%
erebidae            154     7.0%
formicidae          120     5.5%
arctiidae           117     5.3%
cerambycidae        117     5.3%
pyralidae           103     4.7%
notodontidae        60      2.7%
Other values (73)   449     20.4%

Most occurring characters

Value               Count   Frequency (%)
e                   3424    15.0%
i                   3074    13.5%
a                   2707    11.9%
d                   2304    10.1%
r                   1757    7.7%
o                   1616    7.1%
t                   1287    5.6%
c                   1169    5.1%
m                   751     3.3%
u                   751     3.3%
Other values (28)   4003    17.5%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   20645   90.4%
Uppercase Letter   2198    9.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3424
16.6%
i 3074
14.9%
a 2707
13.1%
d 2304
11.2%
r 1757
8.5%
o 1616
7.8%
t 1287
 
6.2%
c 1169
 
5.7%
m 751
 
3.6%
u 751
 
3.6%
Other values (10) 1805
8.7%
Uppercase Letter
ValueCountFrequency (%)
N 414
18.8%
G 388
17.7%
C 378
17.2%
T 221
10.1%
P 176
8.0%
E 163
 
7.4%
A 125
 
5.7%
F 122
 
5.6%
L 54
 
2.5%
B 50
 
2.3%
Other values (8) 107
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 22843
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3424
15.0%
i 3074
13.5%
a 2707
11.9%
d 2304
10.1%
r 1757
7.7%
o 1616
7.1%
t 1287
 
5.6%
c 1169
 
5.1%
m 751
 
3.3%
u 751
 
3.3%
Other values (28) 4003
17.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22843
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3424
15.0%
i 3074
13.5%
a 2707
11.9%
d 2304
10.1%
r 1757
7.7%
o 1616
7.1%
t 1287
 
5.6%
c 1169
 
5.1%
m 751
 
3.3%
u 751
 
3.3%
Other values (28) 4003
17.5%


Text

MISSING 

Distinct: 538
Distinct (%): 24.4%
Missing: 1698
Missing (%): 43.5%
Memory size: 30.6 KiB

Length

Max length: 16
Median length: 14
Mean length: 8.7955475
Min length: 3

Characters and Unicode

Total characters: 19359
Distinct characters: 49
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 239
Unique (%): 10.9%

Sample

1st row: Microvelia
2nd row: Dimorphopterus
3rd row: Dimorphopterus
4th row: Dimorphopterus
5th row: Paradieuches

Value                Count   Frequency (%)
platypus             167     7.6%
monochamus           86      3.9%
athetis              58      2.6%
meteima              48      2.2%
papilio              41      1.9%
cydia                34      1.5%
zanclognatha         30      1.4%
chiasmia             26      1.2%
scopula              26      1.2%
tetramorium          24      1.1%
Other values (527)   1661    75.5%

Most occurring characters

Value               Count   Frequency (%)
a                   2577    13.3%
o                   1597    8.2%
i                   1569    8.1%
t                   1333    6.9%
s                   1198    6.2%
r                   1166    6.0%
e                   1121    5.8%
p                   886     4.6%
l                   835     4.3%
h                   831     4.3%
Other values (39)   6246    32.3%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   17159   88.6%
Uppercase Letter   2200    11.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2577
15.0%
o 1597
 
9.3%
i 1569
 
9.1%
t 1333
 
7.8%
s 1198
 
7.0%
r 1166
 
6.8%
e 1121
 
6.5%
p 886
 
5.2%
l 835
 
4.9%
h 831
 
4.8%
Other values (15) 4046
23.6%
Uppercase Letter
ValueCountFrequency (%)
P 405
18.4%
A 295
13.4%
M 230
10.5%
C 227
10.3%
S 208
9.5%
E 114
 
5.2%
T 99
 
4.5%
H 97
 
4.4%
O 68
 
3.1%
D 61
 
2.8%
Other values (14) 396
18.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19359
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2577
13.3%
o 1597
 
8.2%
i 1569
 
8.1%
t 1333
 
6.9%
s 1198
 
6.2%
r 1166
 
6.0%
e 1121
 
5.8%
p 886
 
4.6%
l 835
 
4.3%
h 831
 
4.3%
Other values (39) 6246
32.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19359
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2577
13.3%
o 1597
 
8.2%
i 1569
 
8.1%
t 1333
 
6.9%
s 1198
 
6.2%
r 1166
 
6.0%
e 1121
 
5.8%
p 886
 
4.6%
l 835
 
4.3%
h 831
 
4.3%
Other values (39) 6246
32.3%


Text

MISSING 

Distinct: 719
Distinct (%): 32.7%
Missing: 1700
Missing (%): 43.6%
Memory size: 30.6 KiB

Length

Max length: 21
Median length: 18
Mean length: 8.990905
Min length: 2

Characters and Unicode

Total characters: 19771
Distinct characters: 35
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 399
Unique (%): 18.1%

Sample

1st row: horvathi
2nd row: pallipes
3rd row: pallipes
4th row: pallipes
5th row: dissimils

Value                Count   Frequency (%)
koryoensis           160     7.2%
saltuarius           56      2.5%
mediorufa            48      2.2%
japonica             34      1.5%
alternatus           29      1.3%
japonicus            29      1.3%
tsushimae            24      1.1%
debilitata           24      1.1%
hebesata             23      1.0%
flava                22      1.0%
Other values (708)   1766    79.7%

Most occurring characters

Value               Count   Frequency (%)
a                   2957    15.0%
i                   2275    11.5%
s                   1849    9.4%
e                   1481    7.5%
r                   1357    6.9%
n                   1272    6.4%
t                   1263    6.4%
o                   1137    5.8%
l                   1083    5.5%
u                   992     5.0%
Other values (25)   4105    20.8%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    19684   99.6%
Space Separator     67      0.3%
Uppercase Letter    10      0.1%
Open Punctuation    4       < 0.1%
Close Punctuation   4       < 0.1%
Other Punctuation   2       < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2957
15.0%
i 2275
11.6%
s 1849
9.4%
e 1481
 
7.5%
r 1357
 
6.9%
n 1272
 
6.5%
t 1263
 
6.4%
o 1137
 
5.8%
l 1083
 
5.5%
u 992
 
5.0%
Other values (16) 4018
20.4%
Uppercase Letter
ValueCountFrequency (%)
F 4
40.0%
E 2
20.0%
D 2
20.0%
X 1
 
10.0%
R 1
 
10.0%
Space Separator
ValueCountFrequency (%)
67
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19694
99.6%
Common 77
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2957
15.0%
i 2275
11.6%
s 1849
9.4%
e 1481
 
7.5%
r 1357
 
6.9%
n 1272
 
6.5%
t 1263
 
6.4%
o 1137
 
5.8%
l 1083
 
5.5%
u 992
 
5.0%
Other values (21) 4028
20.5%
Common
ValueCountFrequency (%)
67
87.0%
( 4
 
5.2%
) 4
 
5.2%
. 2
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19771
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2957
15.0%
i 2275
11.5%
s 1849
9.4%
e 1481
 
7.5%
r 1357
 
6.9%
n 1272
 
6.4%
t 1263
 
6.4%
o 1137
 
5.8%
l 1083
 
5.5%
u 992
 
5.0%
Other values (25) 4105
20.8%

채집일
Text

MISSING 

Distinct: 464
Distinct (%): 16.7%
Missing: 1121
Missing (%): 28.8%
Memory size: 30.6 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 27780
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 8.3%

Sample

1st row: 2018-04-15
2nd row: 2018-09-29
3rd row: 2018-09-29
4th row: 2018-09-29
5th row: 2018-08-25

Value                Count   Frequency (%)
2010-08-19           109     3.9%
2016-07-09           103     3.7%
2011-01-01           97      3.5%
2016-05-28           85      3.1%
2010-08-20           84      3.0%
2016-04-23           84      3.0%
2016-04-10           80      2.9%
2010-08-22           74      2.7%
2010-09-07           67      2.4%
2016-09-17           57      2.1%
Other values (454)   1938    69.8%

Most occurring characters

Value   Count   Frequency (%)
0       8046    29.0%
-       5556    20.0%
2       4422    15.9%
1       4106    14.8%
9       1144    4.1%
8       1078    3.9%
6       1023    3.7%
7       976     3.5%
5       533     1.9%
4       493     1.8%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     22224   80.0%
Dash Punctuation   5556    20.0%
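
The character totals for 채집일 are consistent with every non-missing value being a 10-character `YYYY-MM-DD` string (8 digits and 2 dashes). A brief sketch of that check, assuming this format holds for all values as the length statistics suggest, with one sample value parsed using the standard library:

```python
from datetime import date

# Counts reported for 채집일.
n_observations = 3899
missing = 1121

non_missing = n_observations - missing
total_characters = non_missing * 10  # each value has length 10
dash_count = non_missing * 2         # two dashes per date
digit_count = non_missing * 8        # eight digits per date

# The first sample row parses as an ISO 8601 calendar date.
first = date.fromisoformat("2018-04-15")

print(non_missing, total_characters, dash_count, digit_count, first)
```

Because the values parse as ISO dates, the column could reasonably be converted to a date type before further analysis, although the profiler classified it as Text.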

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8046
36.2%
2 4422
19.9%
1 4106
18.5%
9 1144
 
5.1%
8 1078
 
4.9%
6 1023
 
4.6%
7 976
 
4.4%
5 533
 
2.4%
4 493
 
2.2%
3 403
 
1.8%
Dash Punctuation
ValueCountFrequency (%)
- 5556
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27780
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8046
29.0%
- 5556
20.0%
2 4422
15.9%
1 4106
14.8%
9 1144
 
4.1%
8 1078
 
3.9%
6 1023
 
3.7%
7 976
 
3.5%
5 533
 
1.9%
4 493
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27780
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8046
29.0%
- 5556
20.0%
2 4422
15.9%
1 4106
14.8%
9 1144
 
4.1%
8 1078
 
3.9%
6 1023
 
3.7%
7 976
 
3.5%
5 533
 
1.9%
4 493
 
1.8%

Interactions


Correlations

            번호     (unnamed)   (unnamed)
번호        1.000   0.777       0.934
(unnamed)   0.777   1.000       1.000
(unnamed)   0.934   1.000       1.000

            번호     (unnamed)
번호        1.000   0.353
(unnamed)   0.353   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
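
One common way to quantify nullity correlation is the Pearson correlation between per-column presence indicators (1 if a value is present, 0 if missing). The sketch below illustrates that idea on toy data; the function name `nullity_correlation`, the column labels, and the rows are hypothetical, and the report's heatmap may be computed differently.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators; None marks a missing value."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / sqrt(var_a * var_b)

# Hypothetical toy columns: the first two are missing in exactly the same rows.
family = ["Velliidae", None, "Lygaeidae", None]
genus = ["Microvelia", None, "Dimorphopterus", None]
collected = ["2018-04-15", "2018-09-29", None, None]

print(nullity_correlation(family, genus))      # columns missing together
print(nullity_correlation(family, collected))  # unrelated missingness
```

A value near 1 means two columns tend to be missing in the same rows, which is what the nearly identical missing counts (1699, 1698, 1700) of three variables in this report would suggest, though the report text alone does not confirm it.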

Sample

First rows

Index   번호    (unnamed)     (unnamed)         (unnamed)        (unnamed)     채집일
0       1      Hemiptera     Velliidae         Microvelia       horvathi      2018-04-15
1       2      Hemiptera     Lygaeidae         Dimorphopterus   pallipes      2018-09-29
2       3      Hemiptera     Lygaeidae         Dimorphopterus   pallipes      2018-09-29
3       4      Hemiptera     Lygaeidae         Dimorphopterus   pallipes      2018-09-29
4       5      Hemiptera     Lygaeidae         Paradieuches     dissimils     2018-08-25
5       6      Hemiptera     Lygaeidae         Paradieuches     dissimils     2018-08-25
6       7      Hemiptera     Tingidae          Stephanitis      pyrioides     2018-08-04
7       8      Hemiptera     Tingidae          Stephanitis      pyrioides     2018-08-04
8       9      Hemiptera     Beytidae          Yemma            exilis        2018-08-19
9       10     Hemiptera     Tingidae          Corythucha       marmorata     2018-08-10

Last rows

Index   번호    (unnamed)     (unnamed)         (unnamed)        (unnamed)     채집일
3889    3942   Lepidoptera   Noctuidae         Corgatha         dictaria      <NA>
3890    3943   Lepidoptera   Noctuidae         Mythimna         bani          <NA>
3891    3944   Lepidoptera   Noctuidae         Bryophilina      mollicula     <NA>
3892    3945   Lepidoptera   Noctuidae         Blasticorhinus   ussuriensis   <NA>
3893    3946   Coleoptera    Cerambycidae      Aegosoma         sinicum       2018-07-14
3894    3947   Lepidoptera   Sphingidae        Sphinx           morio         <NA>
3895    3948   Lepidoptera   Noctuidae         Mythimna         loreyi        <NA>
3896    3949   Lepidoptera   Noctuidae         Mythimna         loreyi        <NA>
3897    3950   Lepidoptera   Noctuidae         Mythimna         loreyi        <NA>
3898    3951   Isoptera      Rhinotermitidae   Coptotermes      gestroi       <NA>