Overview

Dataset statistics

Number of variables: 20
Number of observations: 10000
Missing cells: 11200
Missing cells (%): 5.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 172.0 B
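
The percentages and sizes in this block follow from the raw counts. As a quick check (a minimal sketch; the variable names are illustrative and not part of the report), the missing-cell share is the number of missing cells divided by variables × observations, and the total memory is roughly the average record size times the number of rows:

```python
# Values copied from the "Dataset statistics" block of the report.
n_variables = 20
n_observations = 10_000
missing_cells = 11_200
avg_record_bytes = 172.0

total_cells = n_variables * n_observations            # 200,000 cells in total
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
total_mib = avg_record_bytes * n_observations / 2**20 # bytes -> MiB

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both printed values agree with the figures reported above once rounded to one decimal place.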

Variable types

Numeric: 4
Boolean: 2
Text: 11
Categorical: 2
DateTime: 1

Dataset

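Description: Species-level distribution data compiled in the database of domestic bird distribution and target bird survey information, covering roughly 20,000 records for 375 species. The dataset contains the following columns: 자료번호 (record number), 보호종 여부 (protected species flag), 보호종 번호 (protected species number), 고유번호(원병오1993) (unique number, 원병오 1993), 국명(원병오1993) (Korean name, 원병오 1993), 국명(원병오2000) (Korean name, 원병오 2000), 학명(원병오1993) (scientific name, 원병오 1993), 학명(원병오2000) (scientific name, 원병오 2000), 영문명(원병오1993) (English name, 원병오 1993), 영문명(원병오2000) (English name, 원병오 2000), 적색자료서등재 여부 (Red Data Book listing flag), 관찰개체수 (number of individuals observed), 조사자 (surveyor), 조사자 소속기관 (surveyor's affiliation), 조사방법 (survey method), 조사지역 (survey area), 행정구역 (administrative district), 문헌명 (source publication title), 지역그리드번호 (regional grid number), 관찰시기 (observation period).
Author: Korea Institute of Science and Technology Information (한국과학기술정보연구원)
URL: https://www.data.go.kr/data/3033734/fileData.do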

Alerts

고유번호(원병오1993) is highly overall correlated with 보호종 여부 [High correlation]
지역그리드번호 is highly overall correlated with 조사방법 [High correlation]
보호종 여부 is highly overall correlated with 고유번호(원병오1993) [High correlation]
조사자 소속기관 is highly overall correlated with 조사방법 [High correlation]
조사방법 is highly overall correlated with 지역그리드번호 and 1 other field [High correlation]
적색자료서등재 여부 is highly imbalanced (84.9%) [Imbalance]
조사방법 is highly imbalanced (62.4%) [Imbalance]
보호종 번호 has 8883 (88.8%) missing values [Missing]
학명(원병오2000) has 177 (1.8%) missing values [Missing]
영문명(원병오2000) has 182 (1.8%) missing values [Missing]
조사자 has 1953 (19.5%) missing values [Missing]
관찰개체수 is highly skewed (γ1 = 29.640763) [Skewed]
자료번호 has unique values [Unique]
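
The imbalance percentage in these alerts can be reproduced from the class counts. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. That assumption is not stated in the report, but it reproduces the 84.9% shown for 적색자료서등재 여부 (9783 False, 217 True). The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no defined balance; treated as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts for 적색자료서등재 여부, taken from its value table in the report.
red_list_counts = [9783, 217]
score = imbalance_score(red_list_counts)
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes; a score near 1 means almost every row falls into one class.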

Reproduction

Analysis started: 2024-04-17 18:11:24.567641
Analysis finished: 2024-04-17 18:11:28.115181
Duration: 3.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
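
A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact command used for this report: the function name `build_report`, the CSV path, the title and the output file name are all hypothetical, and the file may need explicit encoding or dtype options when read. `ProfileReport` and `to_file` are part of the library's public API.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module loads even without the dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the data is available as a local CSV file.
    df = pd.read_csv(csv_path)

    profile = ProfileReport(
        df,
        title="Bird distribution survey",  # hypothetical title
        dataset={
            "description": "Species-level bird distribution records",
            "url": "https://www.data.go.kr/data/3033734/fileData.do",
        },
    )
    profile.to_file(out_path)  # writes a self-contained HTML report
```

Calling `build_report("birds.csv")` with a real file path would write `report.html` to the working directory.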

Variables

자료번호
Real number (ℝ)

UNIQUE 

Distinct10000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11810.158
Minimum1
Maximum23837
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum1
5-th percentile1104.7
Q15801.25
median11762
Q317796.5
95-th percentile22571.2
Maximum23837
Range23836
Interquartile range (IQR)11995.25

Descriptive statistics

Standard deviation6908.5575
Coefficient of variation (CV)0.58496739
Kurtosis-1.211244
Mean11810.158
Median Absolute Deviation (MAD)6002.5
Skewness0.012238001
Sum1.1810158 × 10⁸
Variance47728167
MonotonicityNot monotonic
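
Several of the statistics above are derived from others in the same block. As a worked check (a sketch using the rounded values printed in the report, so the last digits can differ slightly), the range, interquartile range, coefficient of variation and variance for 자료번호 can be recomputed as follows:

```python
# Rounded values copied from the report for 자료번호.
minimum, maximum = 1, 23837
q1, q3 = 5801.25, 17796.5
mean, std = 11810.158, 6908.5575

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 4), round(variance))
```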
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
6843                  1       < 0.1%
18075                 1       < 0.1%
147                   1       < 0.1%
8419                  1       < 0.1%
23466                 1       < 0.1%
4339                  1       < 0.1%
16112                 1       < 0.1%
4239                  1       < 0.1%
23207                 1       < 0.1%
4533                  1       < 0.1%
Other values (9990)   9990    99.9%

Minimum 10 values

Value   Count   Frequency (%)
1       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
9       1       < 0.1%
12      1       < 0.1%
14      1       < 0.1%
15      1       < 0.1%
16      1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
23837   1       < 0.1%
23833   1       < 0.1%
23832   1       < 0.1%
23828   1       < 0.1%
23824   1       < 0.1%
23823   1       < 0.1%
23822   1       < 0.1%
23821   1       < 0.1%
23820   1       < 0.1%
23818   1       < 0.1%

보호종 여부
Boolean

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size87.9 KiB
Value   Count   Frequency (%)
False   8883    88.8%
True    1117    11.2%

보호종 번호
Text

MISSING 

Distinct53
Distinct (%)4.7%
Missing8883
Missing (%)88.8%
Memory size156.2 KiB

Length

Max length12
Median length11
Mean length7.0850492
Min length3

Characters and Unicode

Total characters7914
Distinct characters16
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
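
The character properties mentioned here can be inspected with Python's standard `unicodedata` module. The sketch below uses the five sample rows shown for 보호종 번호. The word counts assume values are lowercased and split on whitespace, which is inferred from the lowercase tokens in the report's frequency table rather than stated in it.

```python
import unicodedata
from collections import Counter

# Sample rows for 보호종 번호, as listed in the report.
samples = ["PR35", "PR10", "NM201 PR9", "PR14", "PR19"]

# General category of every character:
# Lu = Uppercase Letter, Nd = Decimal Number, Zs = Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# Word frequencies (assumption: lowercase, then split on whitespace).
token_counts = Counter(
    token for value in samples for token in value.lower().split()
)

print(category_counts)
print(token_counts)
```

On these five rows the three categories found match the three reported for the full column: uppercase letters, decimal digits and the space separator.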

Unique

Unique8
Unique (%)0.7%

Sample

1st rowPR35
2nd rowPR10
3rd rowNM201 PR9
4th rowPR14
5th rowPR19
ValueCountFrequency (%)
nm323 237
 
15.1%
nm201 162
 
10.3%
pr6 145
 
9.2%
pr8 92
 
5.8%
pr2 65
 
4.1%
pr9 64
 
4.1%
nm205 55
 
3.5%
pr3 52
 
3.3%
pr36 46
 
2.9%
nm325 44
 
2.8%
Other values (46) 612
38.9%

Most occurring characters

ValueCountFrequency (%)
1556
19.7%
3 848
10.7%
N 813
10.3%
2 804
10.2%
P 761
9.6%
R 761
9.6%
M 674
8.5%
1 376
 
4.8%
0 299
 
3.8%
6 250
 
3.2%
Other values (6) 772
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3210
40.6%
Uppercase Letter 3148
39.8%
Space Separator 1556
19.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 848
26.4%
2 804
25.0%
1 376
11.7%
0 299
 
9.3%
6 250
 
7.8%
5 155
 
4.8%
9 139
 
4.3%
4 134
 
4.2%
8 114
 
3.6%
7 91
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
N 813
25.8%
P 761
24.2%
R 761
24.2%
M 674
21.4%
E 139
 
4.4%
Space Separator
ValueCountFrequency (%)
1556
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4766
60.2%
Latin 3148
39.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1556
32.6%
3 848
17.8%
2 804
16.9%
1 376
 
7.9%
0 299
 
6.3%
6 250
 
5.2%
5 155
 
3.3%
9 139
 
2.9%
4 134
 
2.8%
8 114
 
2.4%
Latin
ValueCountFrequency (%)
N 813
25.8%
P 761
24.2%
R 761
24.2%
M 674
21.4%
E 139
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1556
19.7%
3 848
10.7%
N 813
10.3%
2 804
10.2%
P 761
9.6%
R 761
9.6%
M 674
8.5%
1 376
 
4.8%
0 299
 
3.8%
6 250
 
3.2%
Other values (6) 772
9.8%

고유번호(원병오1993)
Real number (ℝ)

HIGH CORRELATION 

Distinct285
Distinct (%)2.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean189.3927
Minimum1
Maximum440
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum1
5-th percentile28
Q164
median163
Q3312
95-th percentile428
Maximum440
Range439
Interquartile range (IQR)248

Descriptive statistics

Standard deviation137.95616
Coefficient of variation (CV)0.72841328
Kurtosis-1.2789503
Mean189.3927
Median Absolute Deviation (MAD)108
Skewness0.40201942
Sum1893927
Variance19031.901
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
62 291
 
2.9%
37 261
 
2.6%
61 239
 
2.4%
32 208
 
2.1%
238 204
 
2.0%
379 195
 
1.9%
64 186
 
1.9%
434 185
 
1.8%
75 180
 
1.8%
213 179
 
1.8%
Other values (275) 7872
78.7%
ValueCountFrequency (%)
1 9
 
0.1%
2 2
 
< 0.1%
3 2
 
< 0.1%
5 162
1.6%
6 13
 
0.1%
7 53
 
0.5%
8 121
1.2%
9 4
 
< 0.1%
10 3
 
< 0.1%
18 30
 
0.3%
ValueCountFrequency (%)
440 21
 
0.2%
439 115
1.1%
438 26
 
0.3%
437 6
 
0.1%
434 185
1.8%
433 34
 
0.3%
432 104
1.0%
428 71
 
0.7%
427 40
 
0.4%
423 1
 
< 0.1%

국명(원병오1993)
Text

Distinct289
Distinct (%)2.9%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length9
Median length8
Mean length3.9647
Min length1

Characters and Unicode

Total characters39647
Distinct characters202
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique31
Unique (%)0.3%

Sample

1st row청둥오리
2nd row참새
3rd row노랑할미새
4th row알락오리
5th row민물도요
ValueCountFrequency (%)
흰뺨검둥오리 291
 
2.9%
왜가리 261
 
2.6%
청둥오리 239
 
2.4%
멧비둘기 204
 
2.0%
박새 195
 
1.9%
쇠오리 186
 
1.9%
까치 185
 
1.8%
흰죽지 180
 
1.8%
괭이갈매기 179
 
1.8%
중대백로 176
 
1.8%
Other values (279) 7904
79.0%

Most occurring characters

ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 39647
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 39647
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 39647
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

국명(원병오2000)
Text

Distinct289
Distinct (%)2.9%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length9
Median length8
Mean length3.9647
Min length1

Characters and Unicode

Total characters39647
Distinct characters202
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique31
Unique (%)0.3%

Sample

1st row청둥오리
2nd row참새
3rd row노랑할미새
4th row알락오리
5th row민물도요
ValueCountFrequency (%)
흰뺨검둥오리 291
 
2.9%
왜가리 261
 
2.6%
청둥오리 239
 
2.4%
멧비둘기 204
 
2.0%
박새 195
 
1.9%
쇠오리 186
 
1.9%
까치 185
 
1.8%
흰죽지 180
 
1.8%
괭이갈매기 179
 
1.8%
중대백로 176
 
1.8%
Other values (279) 7904
79.0%

Most occurring characters

ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 39647
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 39647
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 39647
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4204
 
10.6%
2149
 
5.4%
1964
 
5.0%
1814
 
4.6%
1019
 
2.6%
784
 
2.0%
717
 
1.8%
673
 
1.7%
672
 
1.7%
660
 
1.7%
Other values (192) 24991
63.0%

학명(원병오1993)
Text

Distinct289
Distinct (%)2.9%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length32
Median length25
Mean length16.3248
Min length9

Characters and Unicode

Total characters163248
Distinct characters48
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique31
Unique (%)0.3%

Sample

1st rowAnas platyrhynchos
2nd rowPasser montanus
3rd rowMotacilla cinerea
4th rowAnas strepera
5th rowCalidris alpina
ValueCountFrequency (%)
anas 1336
 
6.5%
larus 613
 
3.0%
egretta 430
 
2.1%
parus 423
 
2.1%
alba 391
 
1.9%
pica 370
 
1.8%
podiceps 353
 
1.7%
aythya 341
 
1.7%
cinerea 322
 
1.6%
emberiza 307
 
1.5%
Other values (386) 15512
76.0%

Most occurring characters

ValueCountFrequency (%)
a 20210
12.4%
s 14436
 
8.8%
r 12388
 
7.6%
i 11548
 
7.1%
10398
 
6.4%
u 10368
 
6.4%
e 10127
 
6.2%
n 9243
 
5.7%
o 8192
 
5.0%
l 8063
 
4.9%
Other values (38) 48275
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 142850
87.5%
Space Separator 10398
 
6.4%
Uppercase Letter 10000
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 20210
14.1%
s 14436
10.1%
r 12388
8.7%
i 11548
 
8.1%
u 10368
 
7.3%
e 10127
 
7.1%
n 9243
 
6.5%
o 8192
 
5.7%
l 8063
 
5.6%
c 7972
 
5.6%
Other values (16) 30303
21.2%
Uppercase Letter
ValueCountFrequency (%)
A 2589
25.9%
P 1825
18.2%
C 1091
10.9%
L 783
 
7.8%
E 783
 
7.8%
M 562
 
5.6%
T 454
 
4.5%
S 338
 
3.4%
H 301
 
3.0%
F 273
 
2.7%
Other values (11) 1001
 
10.0%
Space Separator
ValueCountFrequency (%)
10398
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 152850
93.6%
Common 10398
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 20210
13.2%
s 14436
 
9.4%
r 12388
 
8.1%
i 11548
 
7.6%
u 10368
 
6.8%
e 10127
 
6.6%
n 9243
 
6.0%
o 8192
 
5.4%
l 8063
 
5.3%
c 7972
 
5.2%
Other values (37) 40303
26.4%
Common
ValueCountFrequency (%)
10398
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 163248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 20210
12.4%
s 14436
 
8.8%
r 12388
 
7.6%
i 11548
 
7.1%
10398
 
6.4%
u 10368
 
6.4%
e 10127
 
6.2%
n 9243
 
5.7%
o 8192
 
5.0%
l 8063
 
4.9%
Other values (38) 48275
29.6%

학명(원병오2000)
Text

MISSING 

Distinct287
Distinct (%)2.9%
Missing177
Missing (%)1.8%
Memory size156.2 KiB

Length

Max length29
Median length23
Mean length16.28362
Min length9

Characters and Unicode

Total characters159954
Distinct characters47
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30
Unique (%)0.3%

Sample

1st rowAnas platyrhynchos
2nd rowPasser montanus
3rd rowMotacilla cinerea
4th rowAnas strepera
5th rowCalidris alpina
ValueCountFrequency (%)
anas 1336
 
6.7%
larus 613
 
3.1%
parus 423
 
2.1%
crecca 372
 
1.9%
pica 370
 
1.9%
aythya 341
 
1.7%
cinerea 322
 
1.6%
emberiza 307
 
1.5%
poecilorhyncha 291
 
1.5%
anser 286
 
1.4%
Other values (385) 15223
76.6%

Most occurring characters

ValueCountFrequency (%)
a 19721
12.3%
s 14226
 
8.9%
r 12351
 
7.7%
i 11227
 
7.0%
u 10671
 
6.7%
10061
 
6.3%
e 9404
 
5.9%
n 9245
 
5.8%
c 8467
 
5.3%
o 8003
 
5.0%
Other values (37) 46578
29.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 140070
87.6%
Space Separator 10061
 
6.3%
Uppercase Letter 9823
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19721
14.1%
s 14226
10.2%
r 12351
8.8%
i 11227
 
8.0%
u 10671
 
7.6%
e 9404
 
6.7%
n 9245
 
6.6%
c 8467
 
6.0%
o 8003
 
5.7%
l 7804
 
5.6%
Other values (16) 28951
20.7%
Uppercase Letter
ValueCountFrequency (%)
A 2589
26.4%
P 1663
16.9%
C 1089
11.1%
L 799
 
8.1%
T 616
 
6.3%
M 606
 
6.2%
E 514
 
5.2%
S 338
 
3.4%
F 273
 
2.8%
B 236
 
2.4%
Other values (10) 1100
11.2%
Space Separator
ValueCountFrequency (%)
10061
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 149893
93.7%
Common 10061
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19721
13.2%
s 14226
 
9.5%
r 12351
 
8.2%
i 11227
 
7.5%
u 10671
 
7.1%
e 9404
 
6.3%
n 9245
 
6.2%
c 8467
 
5.6%
o 8003
 
5.3%
l 7804
 
5.2%
Other values (36) 38774
25.9%
Common
ValueCountFrequency (%)
10061
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 159954
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19721
12.3%
s 14226
 
8.9%
r 12351
 
7.7%
i 11227
 
7.0%
u 10671
 
6.7%
10061
 
6.3%
e 9404
 
5.9%
n 9245
 
5.8%
c 8467
 
5.3%
o 8003
 
5.0%
Other values (37) 46578
29.1%

영문명(원병오1993)
Text

Distinct287
Distinct (%)2.9%
Missing5
Missing (%)< 0.1%
Memory size156.2 KiB

Length

Max length33
Median length24
Mean length14.602401
Min length4

Characters and Unicode

Total characters145951
Distinct characters55
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique31
Unique (%)0.3%

Sample

1st rowMallard
2nd rowEurasian Tree Sparrow
3rd rowGrey Wagtail
4th rowGadwall
5th rowDunlin
ValueCountFrequency (%)
common 1103
 
5.5%
gull 613
 
3.1%
eurasian 566
 
2.8%
tit 509
 
2.5%
egret 447
 
2.2%
duck 436
 
2.2%
great 431
 
2.1%
grey 391
 
1.9%
heron 361
 
1.8%
grebe 353
 
1.8%
Other values (309) 14883
74.1%

Most occurring characters

ValueCountFrequency (%)
e 15675
 
10.7%
a 11344
 
7.8%
r 10856
 
7.4%
10114
 
6.9%
l 9155
 
6.3%
o 8624
 
5.9%
n 7953
 
5.4%
t 7775
 
5.3%
i 7254
 
5.0%
d 5031
 
3.4%
Other values (45) 52170
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 113254
77.6%
Uppercase Letter 20141
 
13.8%
Space Separator 10114
 
6.9%
Dash Punctuation 2255
 
1.5%
Other Punctuation 161
 
0.1%
Open Punctuation 13
 
< 0.1%
Close Punctuation 13
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 15675
13.8%
a 11344
10.0%
r 10856
9.6%
l 9155
 
8.1%
o 8624
 
7.6%
n 7953
 
7.0%
t 7775
 
6.9%
i 7254
 
6.4%
d 5031
 
4.4%
u 4356
 
3.8%
Other values (15) 25231
22.3%
Uppercase Letter
ValueCountFrequency (%)
G 2787
13.8%
C 2177
10.8%
S 2008
10.0%
B 1864
9.3%
T 1509
 
7.5%
W 1218
 
6.0%
P 1215
 
6.0%
E 1214
 
6.0%
D 948
 
4.7%
M 876
 
4.3%
Other values (15) 4325
21.5%
Space Separator
ValueCountFrequency (%)
10114
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2255
100.0%
Other Punctuation
ValueCountFrequency (%)
' 161
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 133395
91.4%
Common 12556
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 15675
 
11.8%
a 11344
 
8.5%
r 10856
 
8.1%
l 9155
 
6.9%
o 8624
 
6.5%
n 7953
 
6.0%
t 7775
 
5.8%
i 7254
 
5.4%
d 5031
 
3.8%
u 4356
 
3.3%
Other values (40) 45372
34.0%
Common
ValueCountFrequency (%)
10114
80.6%
- 2255
 
18.0%
' 161
 
1.3%
( 13
 
0.1%
) 13
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 145951
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 15675
 
10.7%
a 11344
 
7.8%
r 10856
 
7.4%
10114
 
6.9%
l 9155
 
6.3%
o 8624
 
5.9%
n 7953
 
5.4%
t 7775
 
5.3%
i 7254
 
5.0%
d 5031
 
3.4%
Other values (45) 52170
35.7%

영문명(원병오2000)
Text

MISSING 

Distinct286
Distinct (%)2.9%
Missing182
Missing (%)1.8%
Memory size156.2 KiB

Length

Max length33
Median length24
Mean length14.87635
Min length4

Characters and Unicode

Total characters146056
Distinct characters55
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30
Unique (%)0.3%

Sample

1st rowMallard
2nd rowEurasian Tree Sparrow
3rd rowGrey Wagtail
4th rowGadwall
5th rowDunlin
ValueCountFrequency (%)
common 1103
 
5.6%
gull 613
 
3.1%
eurasian 575
 
2.9%
tit 509
 
2.6%
great 463
 
2.3%
duck 436
 
2.2%
grey 391
 
2.0%
heron 361
 
1.8%
grebe 353
 
1.8%
little 335
 
1.7%
Other values (310) 14663
74.0%

Most occurring characters

ValueCountFrequency (%)
e 15699
 
10.7%
a 11107
 
7.6%
r 10705
 
7.3%
10000
 
6.8%
l 9200
 
6.3%
o 8624
 
5.9%
n 8294
 
5.7%
t 7630
 
5.2%
i 7407
 
5.1%
d 5127
 
3.5%
Other values (45) 52263
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 113351
77.6%
Uppercase Letter 19942
 
13.7%
Space Separator 10000
 
6.8%
Dash Punctuation 2398
 
1.6%
Other Punctuation 161
 
0.1%
Open Punctuation 102
 
0.1%
Close Punctuation 102
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 15699
13.8%
a 11107
9.8%
r 10705
9.4%
l 9200
 
8.1%
o 8624
 
7.6%
n 8294
 
7.3%
t 7630
 
6.7%
i 7407
 
6.5%
d 5127
 
4.5%
u 4371
 
3.9%
Other values (15) 25187
22.2%
Uppercase Letter
ValueCountFrequency (%)
G 3005
15.1%
C 2177
10.9%
S 2008
10.1%
B 1910
9.6%
T 1509
 
7.6%
W 1217
 
6.1%
P 1168
 
5.9%
E 1047
 
5.3%
D 954
 
4.8%
M 876
 
4.4%
Other values (15) 4071
20.4%
Space Separator
ValueCountFrequency (%)
10000
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2398
100.0%
Other Punctuation
ValueCountFrequency (%)
' 161
100.0%
Open Punctuation
ValueCountFrequency (%)
( 102
100.0%
Close Punctuation
ValueCountFrequency (%)
) 102
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 133293
91.3%
Common 12763
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 15699
 
11.8%
a 11107
 
8.3%
r 10705
 
8.0%
l 9200
 
6.9%
o 8624
 
6.5%
n 8294
 
6.2%
t 7630
 
5.7%
i 7407
 
5.6%
d 5127
 
3.8%
u 4371
 
3.3%
Other values (40) 45129
33.9%
Common
ValueCountFrequency (%)
10000
78.4%
- 2398
 
18.8%
' 161
 
1.3%
( 102
 
0.8%
) 102
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 146056
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 15699
 
10.7%
a 11107
 
7.6%
r 10705
 
7.3%
10000
 
6.8%
l 9200
 
6.3%
o 8624
 
5.9%
n 8294
 
5.7%
t 7630
 
5.2%
i 7407
 
5.1%
d 5127
 
3.5%
Other values (45) 52263
35.8%

적색자료서등재 여부
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size87.9 KiB
Value   Count   Frequency (%)
False   9783    97.8%
True    217     2.2%

관찰개체수
Real number (ℝ)

SKEWED 

Distinct1093
Distinct (%)10.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean466.2438
Minimum0
Maximum253000
Zeros19
Zeros (%)0.2%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median9
Q358
95-th percentile1216.3
Maximum253000
Range253000
Interquartile range (IQR)56

Descriptive statistics

Standard deviation4434.0485
Coefficient of variation (CV)9.5101501
Kurtosis1273.0906
Mean466.2438
Median Absolute Deviation (MAD)8
Skewness29.640763
Sum4662438
Variance19660786
MonotonicityNot monotonic
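
The large gap between the mean (466.2438) and the median (9) is what a strongly right-skewed count variable looks like. The sketch below computes the bias-adjusted Fisher-Pearson skewness, assuming that this is the estimator behind the reported value (it is the one pandas' `Series.skew` uses). The `toy_counts` list is made-up illustrative data, not taken from the dataset.

```python
import math
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 for a sample of at least 3 values."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # population skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1   # small-sample adjustment

# Hypothetical counts: mostly small flocks plus a few very large ones.
toy_counts = [1, 1, 2, 2, 3, 5, 9, 58, 1200, 253000]

print(sample_skewness(toy_counts) > 0)                              # right tail dominates
print(statistics.fmean(toy_counts) > statistics.median(toy_counts)) # mean pulled upward
```

A symmetric sample such as `[1, 2, 3, 4, 5]` gives a skewness of zero, while a handful of very large counts pushes it strongly positive.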
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
1                     1541    15.4%
2                     1119    11.2%
3                     624     6.2%
4                     505     5.1%
5                     418     4.2%
6                     255     2.5%
7                     251     2.5%
8                     211     2.1%
10                    204     2.0%
9                     141     1.4%
Other values (1083)   4731    47.3%

Minimum 10 values

Value   Count   Frequency (%)
0       19      0.2%
1       1541    15.4%
2       1119    11.2%
3       624     6.2%
4       505     5.1%
5       418     4.2%
6       255     2.5%
7       251     2.5%
8       211     2.1%
9       141     1.4%

Maximum 10 values

Value    Count   Frequency (%)
253000   1       < 0.1%
110000   1       < 0.1%
105000   1       < 0.1%
102500   1       < 0.1%
100070   1       < 0.1%
100000   1       < 0.1%
92000    1       < 0.1%
88000    1       < 0.1%
80040    1       < 0.1%
78292    1       < 0.1%

조사자
Text

MISSING 

Distinct81
Distinct (%)1.0%
Missing1953
Missing (%)19.5%
Memory size156.2 KiB

Length

Max length22
Median length18
Mean length8.0511992
Min length3

Characters and Unicode

Total characters64788
Distinct characters116
Distinct categories8
Distinct scripts3
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row국립환경연구원 야생동물과
2nd row박찬열
3rd row윤무부, 권혁두
4th row전라남도 실태조사원
5th row원병오,외 2명
ValueCountFrequency (%)
국립환경연구원 1346
 
9.4%
야생동물과 1346
 
9.4%
백운기,외 1107
 
7.7%
9인 1107
 
7.7%
원병오 566
 
4.0%
김완병 520
 
3.6%
백운기 510
 
3.6%
이종남 499
 
3.5%
박행신 498
 
3.5%
함규황 453
 
3.2%
Other values (65) 6362
44.4%

Most occurring characters

ValueCountFrequency (%)
6267
 
9.7%
, 5331
 
8.2%
2434
 
3.8%
1905
 
2.9%
1731
 
2.7%
1722
 
2.7%
1653
 
2.6%
1617
 
2.5%
1527
 
2.4%
1526
 
2.4%
Other values (106) 39075
60.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 51718
79.8%
Space Separator 6267
 
9.7%
Other Punctuation 5331
 
8.2%
Decimal Number 1254
 
1.9%
Lowercase Letter 132
 
0.2%
Open Punctuation 32
 
< 0.1%
Close Punctuation 32
 
< 0.1%
Uppercase Letter 22
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2434
 
4.7%
1905
 
3.7%
1731
 
3.3%
1722
 
3.3%
1653
 
3.2%
1617
 
3.1%
1527
 
3.0%
1526
 
3.0%
1475
 
2.9%
1434
 
2.8%
Other values (93) 34694
67.1%
Lowercase Letter
ValueCountFrequency (%)
a 22
16.7%
d 22
16.7%
r 22
16.7%
h 22
16.7%
c 22
16.7%
i 22
16.7%
Decimal Number
ValueCountFrequency (%)
9 1107
88.3%
2 147
 
11.7%
Space Separator
ValueCountFrequency (%)
6267
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5331
100.0%
Open Punctuation
ValueCountFrequency (%)
( 32
100.0%
Close Punctuation
ValueCountFrequency (%)
) 32
100.0%
Uppercase Letter
ValueCountFrequency (%)
R 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 51718
79.8%
Common 12916
 
19.9%
Latin 154
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2434
 
4.7%
1905
 
3.7%
1731
 
3.3%
1722
 
3.3%
1653
 
3.2%
1617
 
3.1%
1527
 
3.0%
1526
 
3.0%
1475
 
2.9%
1434
 
2.8%
Other values (93) 34694
67.1%
Latin
ValueCountFrequency (%)
a 22
14.3%
d 22
14.3%
r 22
14.3%
h 22
14.3%
c 22
14.3%
i 22
14.3%
R 22
14.3%
Common
ValueCountFrequency (%)
6267
48.5%
, 5331
41.3%
9 1107
 
8.6%
2 147
 
1.1%
( 32
 
0.2%
) 32
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 51718
79.8%
ASCII 13070
 
20.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6267
47.9%
, 5331
40.8%
9 1107
 
8.5%
2 147
 
1.1%
( 32
 
0.2%
) 32
 
0.2%
a 22
 
0.2%
d 22
 
0.2%
r 22
 
0.2%
h 22
 
0.2%
Other values (3) 66
 
0.5%
Hangul
ValueCountFrequency (%)
2434
 
4.7%
1905
 
3.7%
1731
 
3.3%
1722
 
3.3%
1653
 
3.2%
1617
 
3.1%
1527
 
3.0%
1526
 
3.0%
1475
 
2.9%
1434
 
2.8%
Other values (93) 34694
67.1%

조사자 소속기관
Categorical

HIGH CORRELATION 

Distinct17
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
<NA>
6699 
한국자연보존협회
 
598
국립공원관리공단
 
488
한국조류연구소
 
429
자연보호중앙협의회
 
340
Other values (12)
1446 

Length

Max length25
Median length4
Mean length5.2662
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row<NA>
2nd row서울대학교
3rd row자연보호중앙협의회
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 6699
67.0%
한국자연보존협회 598
 
6.0%
국립공원관리공단 488
 
4.9%
한국조류연구소 429
 
4.3%
자연보호중앙협의회 340
 
3.4%
경남대학교 334
 
3.3%
서울대학교 283
 
2.8%
경북대학교 140
 
1.4%
국립중앙과학관 137
 
1.4%
제주도 민속자연사박물관, 제주대학교 과학교육과 121
 
1.2%
Other values (7) 431
 
4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 6699
63.4%
한국자연보존협회 598
 
5.7%
국립공원관리공단 488
 
4.6%
한국조류연구소 429
 
4.1%
자연보호중앙협의회 340
 
3.2%
경남대학교 334
 
3.2%
서울대학교 283
 
2.7%
제주대학교 195
 
1.8%
경북대학교 140
 
1.3%
국립중앙과학관 137
 
1.3%
Other values (10) 915
 
8.7%

조사방법
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct40
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
<NA>
7129 
석사학위논문
937 
국립공원 자연자원조사
 
319
생태계 조사
 
259
국립공원자연자원조사
 
137
Other values (35)
1219 

Length

Max length38
Median length4
Mean length5.7626
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row<NA>
2nd row석사학위논문
3rd row생물자원 현황 파악
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 7129
71.3%
석사학위논문 937
 
9.4%
국립공원 자연자원조사 319
 
3.2%
생태계 조사 259
 
2.6%
국립공원자연자원조사 137
 
1.4%
생물상 기초조사 137
 
1.4%
AWB보고 및 환경부전국월동수조류조사 103
 
1.0%
비무장지대 및 인접지역의 산림생태계 조사 102
 
1.0%
AWB보고 78
 
0.8%
생물자원 현황 파악 54
 
0.5%
Other values (30) 745
 
7.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 7129
54.3%
석사학위논문 937
 
7.1%
조사 661
 
5.0%
국립공원 319
 
2.4%
자연자원조사 319
 
2.4%
생태계 259
 
2.0%
256
 
2.0%
awb보고 181
 
1.4%
조류상 163
 
1.2%
일대의 155
 
1.2%
Other values (66) 2743
 
20.9%

조사지역
Text

Distinct482
Distinct (%)4.8%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length49
Median length36
Mean length11.7119
Min length2

Characters and Unicode

Total characters117119
Distinct characters295
Distinct categories10
Distinct scripts3
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique57
Unique (%)0.6%

Sample

1st row순천만 일대
2nd row서울시 도시 환경림, 월곡동
3rd row백령도,
4th row해창만
5th row백령도 화동갯벌 및 습지
ValueCountFrequency (%)
제주도 945
 
4.0%
일대 766
 
3.3%
낙동강하구 524
 
2.2%
천수만 507
 
2.2%
경기도 465
 
2.0%
지역 429
 
1.8%
양어장 418
 
1.8%
금강하구 387
 
1.7%
하도리 361
 
1.5%
강원도 350
 
1.5%
Other values (616) 18275
78.0%

Most occurring characters

ValueCountFrequency (%)
16461
 
14.1%
, 5162
 
4.4%
4643
 
4.0%
4431
 
3.8%
4317
 
3.7%
3637
 
3.1%
3226
 
2.8%
3190
 
2.7%
2931
 
2.5%
2331
 
2.0%
Other values (285) 66790
57.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 90247
77.1%
Space Separator 16461
 
14.1%
Other Punctuation 5672
 
4.8%
Open Punctuation 1294
 
1.1%
Close Punctuation 1294
 
1.1%
Dash Punctuation 1267
 
1.1%
Uppercase Letter 476
 
0.4%
Decimal Number 270
 
0.2%
Math Symbol 116
 
0.1%
Lowercase Letter 22
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4643
 
5.1%
4431
 
4.9%
4317
 
4.8%
3637
 
4.0%
3226
 
3.6%
3190
 
3.5%
2931
 
3.2%
2331
 
2.6%
2138
 
2.4%
2039
 
2.3%
Other values (263) 57364
63.6%
Decimal Number
ValueCountFrequency (%)
1 107
39.6%
2 65
24.1%
3 43
15.9%
5 30
 
11.1%
4 9
 
3.3%
0 8
 
3.0%
6 3
 
1.1%
7 2
 
0.7%
8 2
 
0.7%
9 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
, 5162
91.0%
. 394
 
6.9%
: 116
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
A 244
51.3%
B 225
47.3%
C 7
 
1.5%
Space Separator
ValueCountFrequency (%)
16461
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1294
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1294
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1267
100.0%
Math Symbol
ValueCountFrequency (%)
~ 116
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 90247
77.1%
Common 26374
 
22.5%
Latin 498
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4643
 
5.1%
4431
 
4.9%
4317
 
4.8%
3637
 
4.0%
3226
 
3.6%
3190
 
3.5%
2931
 
3.2%
2331
 
2.6%
2138
 
2.4%
2039
 
2.3%
Other values (263) 57364
63.6%
Common
ValueCountFrequency (%)
16461
62.4%
, 5162
 
19.6%
( 1294
 
4.9%
) 1294
 
4.9%
- 1267
 
4.8%
. 394
 
1.5%
~ 116
 
0.4%
: 116
 
0.4%
1 107
 
0.4%
2 65
 
0.2%
Other values (8) 98
 
0.4%
Latin
ValueCountFrequency (%)
A 244
49.0%
B 225
45.2%
m 22
 
4.4%
C 7
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 90247
77.1%
ASCII 26872
 
22.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16461
61.3%
, 5162
 
19.2%
( 1294
 
4.8%
) 1294
 
4.8%
- 1267
 
4.7%
. 394
 
1.5%
A 244
 
0.9%
B 225
 
0.8%
~ 116
 
0.4%
: 116
 
0.4%
Other values (12) 299
 
1.1%
Hangul
ValueCountFrequency (%)
4643
 
5.1%
4431
 
4.9%
4317
 
4.8%
3637
 
4.0%
3226
 
3.6%
3190
 
3.5%
2931
 
3.2%
2331
 
2.6%
2138
 
2.4%
2039
 
2.3%
Other values (263) 57364
63.6%

행정구역
Text

Distinct361
Distinct (%)3.6%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length72
Median length41
Mean length18.6633
Min length3

Characters and Unicode

Total characters186633
Distinct characters209
Distinct categories6 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 134
Unique (%): 1.3%

Sample

1st row: 전라남도 순천시 인안동, 별량면, 해룡면 일대
2nd row: 서울 성북구
3rd row: 인천광역시 옹진군 백령면
4th row: 전라남도 고흥군 포두면
5th row: 인천광역시 옹진군 백령면
Value | Count | Frequency (%)
충청남도 | 1789 | 4.0%
일대 | 1469 | 3.3%
 | 1408 | 3.1%
경기도 | 1393 | 3.1%
경상남도 | 1111 | 2.5%
전라북도 | 1106 | 2.5%
강원도 | 1012 | 2.3%
사하구 | 997 | 2.2%
강서구 | 978 | 2.2%
부산광역시 | 967 | 2.2%
Other values (497) | 32702 | 72.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 35140 | 18.8%
 | 10356 | 5.5%
 | 7808 | 4.2%
 | 7605 | 4.1%
 | 6281 | 3.4%
, | 5654 | 3.0%
 | 5097 | 2.7%
 | 3928 | 2.1%
 | 3600 | 1.9%
 | 3507 | 1.9%
Other values (199) | 97657 | 52.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 138346 | 74.1%
Space Separator | 35140 | 18.8%
Decimal Number | 5979 | 3.2%
Other Punctuation | 5654 | 3.0%
Uppercase Letter | 1102 | 0.6%
Dash Punctuation | 412 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 10356 | 7.5%
 | 7808 | 5.6%
 | 7605 | 5.5%
 | 6281 | 4.5%
 | 5097 | 3.7%
 | 3928 | 2.8%
 | 3600 | 2.6%
 | 3507 | 2.5%
 | 3314 | 2.4%
 | 3294 | 2.4%
Other values (184) | 83556 | 60.4%
Decimal Number
Value | Count | Frequency (%)
3 | 1286 | 21.5%
5 | 929 | 15.5%
1 | 906 | 15.2%
2 | 853 | 14.3%
0 | 466 | 7.8%
6 | 462 | 7.7%
8 | 418 | 7.0%
4 | 362 | 6.1%
7 | 232 | 3.9%
9 | 65 | 1.1%
Uppercase Letter
Value | Count | Frequency (%)
N | 551 | 50.0%
E | 551 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 35140 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 5654 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 412 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 138346 | 74.1%
Common | 47185 | 25.3%
Latin | 1102 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 10356 | 7.5%
 | 7808 | 5.6%
 | 7605 | 5.5%
 | 6281 | 4.5%
 | 5097 | 3.7%
 | 3928 | 2.8%
 | 3600 | 2.6%
 | 3507 | 2.5%
 | 3314 | 2.4%
 | 3294 | 2.4%
Other values (184) | 83556 | 60.4%
Common
Value | Count | Frequency (%)
(space) | 35140 | 74.5%
, | 5654 | 12.0%
3 | 1286 | 2.7%
5 | 929 | 2.0%
1 | 906 | 1.9%
2 | 853 | 1.8%
0 | 466 | 1.0%
6 | 462 | 1.0%
8 | 418 | 0.9%
- | 412 | 0.9%
Other values (3) | 659 | 1.4%
Latin
Value | Count | Frequency (%)
N | 551 | 50.0%
E | 551 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 138346 | 74.1%
ASCII | 48287 | 25.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 35140 | 72.8%
, | 5654 | 11.7%
3 | 1286 | 2.7%
5 | 929 | 1.9%
1 | 906 | 1.9%
2 | 853 | 1.8%
N | 551 | 1.1%
E | 551 | 1.1%
0 | 466 | 1.0%
6 | 462 | 1.0%
Other values (5) | 1489 | 3.1%
Hangul
Value | Count | Frequency (%)
 | 10356 | 7.5%
 | 7808 | 5.6%
 | 7605 | 5.5%
 | 6281 | 4.5%
 | 5097 | 3.7%
 | 3928 | 2.8%
 | 3600 | 2.6%
 | 3507 | 2.5%
 | 3314 | 2.4%
 | 3294 | 2.4%
Other values (184) | 83556 | 60.4%

문헌명
Text

Distinct: 102
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 112
Median length: 78
Mean length: 46.2704
Min length: 9

Characters and Unicode

Total characters: 462704
Distinct characters: 271
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 김진한.1998.한국에 도래하는 철새의 생태와 보호관리(특히 서해안에 도래하는 수조류에 대하여).경희대학교 대학원 박사학위논문
2nd row: 박찬열. 1994. 야생조류의 서식에 적합한 도시환경림 조성 및 관리방안 . 서울대학교 석사학위논문
3rd row: 윤무부, 권혁두. 1987. 백령도 및 대청.소청도의 조류조사. 자연실태종합조사 보고서. 자연보호중앙협의회.
4th row: 전라남도.2001.전남지방 겨울철새-서식실태 조사보고서.58-59
5th row: 원병오외 2명.1992.비무장지대 인접지역(백령도)의 조류. 비무장지대 인접지역(민통선 지역)의 자연생태계 조사보고서.환경부. 701-711
Value | Count | Frequency (%)
도래하는 | 2466 | 3.6%
환경부.1997.전국 | 1947 | 2.8%
동시센서스 | 1947 | 2.8%
겨울철새 | 1947 | 2.8%
 | 1266 | 1.8%
생태와 | 1265 | 1.8%
조류의 | 1171 | 1.7%
월동 | 1158 | 1.7%
백운기외 | 1107 | 1.6%
9인.2000.천연기념물 | 1107 | 1.6%
Other values (427) | 53903 | 77.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 60273 | 13.0%
. | 30385 | 6.6%
9 | 19891 | 4.3%
1 | 15202 | 3.3%
 | 12418 | 2.7%
 | 7620 | 1.6%
 | 7489 | 1.6%
 | 7403 | 1.6%
 | 7203 | 1.6%
 | 6837 | 1.5%
Other values (261) | 287983 | 62.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 283939 | 61.4%
Decimal Number | 68385 | 14.8%
Space Separator | 60273 | 13.0%
Other Punctuation | 34935 | 7.6%
Dash Punctuation | 4852 | 1.0%
Lowercase Letter | 3900 | 0.8%
Close Punctuation | 2712 | 0.6%
Open Punctuation | 2712 | 0.6%
Uppercase Letter | 879 | 0.2%
Math Symbol | 117 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 12418 | 4.4%
 | 7620 | 2.7%
 | 7489 | 2.6%
 | 7403 | 2.6%
 | 7203 | 2.5%
 | 6837 | 2.4%
 | 6722 | 2.4%
 | 6178 | 2.2%
 | 6165 | 2.2%
 | 5720 | 2.0%
Other values (222) | 210184 | 74.0%
Lowercase Letter
Value | Count | Frequency (%)
p | 1674 | 42.9%
o | 841 | 21.6%
l | 435 | 11.2%
v | 225 | 5.8%
u | 145 | 3.7%
n | 116 | 3.0%
s | 116 | 3.0%
g | 87 | 2.2%
y | 87 | 2.2%
c | 58 | 1.5%
Other values (4) | 116 | 3.0%
Decimal Number
Value | Count | Frequency (%)
9 | 19891 | 29.1%
1 | 15202 | 22.2%
0 | 5807 | 8.5%
2 | 5490 | 8.0%
7 | 5454 | 8.0%
3 | 4554 | 6.7%
8 | 4190 | 6.1%
4 | 3098 | 4.5%
6 | 2857 | 4.2%
5 | 1842 | 2.7%
Other Punctuation
Value | Count | Frequency (%)
. | 30385 | 87.0%
, | 3809 | 10.9%
: | 421 | 1.2%
' | 234 | 0.7%
· | 86 | 0.2%
Uppercase Letter
Value | Count | Frequency (%)
N | 406 | 46.2%
V | 181 | 20.6%
A | 117 | 13.3%
B | 117 | 13.3%
C | 58 | 6.6%
Space Separator
Value | Count | Frequency (%)
(space) | 60273 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 4852 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 2712 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 2712 | 100.0%
Math Symbol
Value | Count | Frequency (%)
~ | 117 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 283939 | 61.4%
Common | 173986 | 37.6%
Latin | 4779 | 1.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 12418 | 4.4%
 | 7620 | 2.7%
 | 7489 | 2.6%
 | 7403 | 2.6%
 | 7203 | 2.5%
 | 6837 | 2.4%
 | 6722 | 2.4%
 | 6178 | 2.2%
 | 6165 | 2.2%
 | 5720 | 2.0%
Other values (222) | 210184 | 74.0%
Common
Value | Count | Frequency (%)
(space) | 60273 | 34.6%
. | 30385 | 17.5%
9 | 19891 | 11.4%
1 | 15202 | 8.7%
0 | 5807 | 3.3%
2 | 5490 | 3.2%
7 | 5454 | 3.1%
- | 4852 | 2.8%
3 | 4554 | 2.6%
8 | 4190 | 2.4%
Other values (10) | 17888 | 10.3%
Latin
Value | Count | Frequency (%)
p | 1674 | 35.0%
o | 841 | 17.6%
l | 435 | 9.1%
N | 406 | 8.5%
v | 225 | 4.7%
V | 181 | 3.8%
u | 145 | 3.0%
A | 117 | 2.4%
B | 117 | 2.4%
n | 116 | 2.4%
Other values (9) | 522 | 10.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 283939 | 61.4%
ASCII | 178679 | 38.6%
None | 86 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 60273 | 33.7%
. | 30385 | 17.0%
9 | 19891 | 11.1%
1 | 15202 | 8.5%
0 | 5807 | 3.2%
2 | 5490 | 3.1%
7 | 5454 | 3.1%
- | 4852 | 2.7%
3 | 4554 | 2.5%
8 | 4190 | 2.3%
Other values (28) | 22581 | 12.6%
Hangul
Value | Count | Frequency (%)
 | 12418 | 4.4%
 | 7620 | 2.7%
 | 7489 | 2.6%
 | 7403 | 2.6%
 | 7203 | 2.5%
 | 6837 | 2.4%
 | 6722 | 2.4%
 | 6178 | 2.2%
 | 6165 | 2.2%
 | 5720 | 2.0%
Other values (222) | 210184 | 74.0%
None
Value | Count | Frequency (%)
· | 86 | 100.0%

지역그리드번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 157
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1402.527
Minimum: 48
Maximum: 3436
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 48
5-th percentile: 776
Q1: 998
Median: 1122
Q3: 2005
95-th percentile: 2300
Maximum: 3436
Range: 3388
Interquartile range (IQR): 1007

Descriptive statistics

Standard deviation: 577.79266
Coefficient of variation (CV): 0.41196545
Kurtosis: -0.43921625
Mean: 1402.527
Median Absolute Deviation (MAD): 287
Skewness: 0.59952628
Sum: 14025270
Variance: 333844.36
Monotonicity: Not monotonic
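
Several of the figures above are linked by simple identities. As a quick sanity check, the sketch below recomputes the derived statistics from the reported inputs, assuming the usual definitions: CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum, variance = standard deviation squared, and sum = mean × number of observations. The variable names are illustrative only.

```python
# Figures copied from the report for 지역그리드번호.
n = 10000            # number of observations
mean = 1402.527
std = 577.79266
q1, q3 = 998, 2005
minimum, maximum = 48, 3436

cv = std / mean                   # coefficient of variation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
variance = std ** 2               # variance from the standard deviation
total = mean * n                  # sum of all values

print(f"CV={cv:.8f} IQR={iqr} range={value_range}")
print(f"variance={variance:.2f} sum={total:.0f}")
```

Each recomputed value can be compared with the corresponding reported line; small differences in the last digit are rounding.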
Histogram with fixed size bins (bins=50)
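
For readers unfamiliar with fixed-size binning, the sketch below shows one way to map a value to one of 50 equal-width bins between the reported minimum (48) and maximum (3436). The function `bin_index` is a hypothetical helper, not part of the profiling tool, and placing the maximum in the last bin is an assumed convention.

```python
def bin_index(value: float, lo: float, hi: float, bins: int = 50) -> int:
    """Return the 0-based index of the equal-width bin that contains `value`."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if not lo <= value <= hi:
        raise ValueError("value lies outside [lo, hi]")
    width = (hi - lo) / bins
    # The maximum sits on the right edge, so clamp it into the last bin.
    return min(int((value - lo) / width), bins - 1)


lo, hi = 48, 3436  # minimum and maximum of 지역그리드번호 from the report
print(bin_index(2300, lo, hi))  # bin holding the most frequent value
```

With these bounds each bin is (3436 - 48) / 50 = 67.76 units wide.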
Value | Count | Frequency (%)
2300 | 967 | 9.7%
1098 | 520 | 5.2%
1015 | 519 | 5.2%
998 | 434 | 4.3%
1122 | 347 | 3.5%
2124 | 335 | 3.4%
2005 | 326 | 3.3%
835 | 282 | 2.8%
1016 | 272 | 2.7%
889 | 270 | 2.7%
Other values (147) | 5728 | 57.3%

Value | Count | Frequency (%)
48 | 15 | 0.1%
157 | 7 | 0.1%
158 | 36 | 0.4%
216 | 9 | 0.1%
418 | 136 | 1.4%
419 | 36 | 0.4%
447 | 5 | 0.1%
449 | 12 | 0.1%
505 | 2 | < 0.1%
507 | 11 | 0.1%

Value | Count | Frequency (%)
3436 | 7 | 0.1%
3378 | 63 | 0.6%
2519 | 1 | < 0.1%
2514 | 14 | 0.1%
2414 | 18 | 0.2%
2403 | 95 | 0.9%
2398 | 4 | < 0.1%
2392 | 6 | 0.1%
2300 | 967 | 9.7%
2280 | 23 | 0.2%

관찰시기
DateTime

Distinct: 531
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1972-09-30 00:00:00
Maximum: 1999-12-31 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 자료번호 | 보호종 여부 | 보호종 번호 | 고유번호(원병오1993) | 적색자료서등재 여부 | 관찰개체수 | 조사자 | 조사자 소속기관 | 조사방법 | 지역그리드번호
자료번호 | 1.000 | 0.154 | 0.639 | 0.455 | 0.087 | 0.025 | 0.863 | 0.250 | 0.172 | 0.392
보호종 여부 | 0.154 | 1.000 | NaN | 0.646 | 0.578 | 0.078 | 0.289 | 0.221 | 0.169 | 0.087
보호종 번호 | 0.639 | NaN | 1.000 | 1.000 | 1.000 | 0.327 | 0.851 | 0.673 | 0.492 | 0.692
고유번호(원병오1993) | 0.455 | 0.646 | 1.000 | 1.000 | 0.175 | 0.047 | 0.637 | 0.529 | 0.507 | 0.285
적색자료서등재 여부 | 0.087 | 0.578 | 1.000 | 0.175 | 1.000 | 0.213 | 0.102 | 0.143 | 0.103 | 0.067
관찰개체수 | 0.025 | 0.078 | 0.327 | 0.047 | 0.213 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
조사자 | 0.863 | 0.289 | 0.851 | 0.637 | 0.102 | 0.000 | 1.000 | 0.998 | 0.997 | 0.962
조사자 소속기관 | 0.250 | 0.221 | 0.673 | 0.529 | 0.143 | 0.000 | 0.998 | 1.000 | 0.966 | 0.780
조사방법 | 0.172 | 0.169 | 0.492 | 0.507 | 0.103 | 0.000 | 0.997 | 0.966 | 1.000 | 0.960
지역그리드번호 | 0.392 | 0.087 | 0.692 | 0.285 | 0.067 | 0.000 | 0.962 | 0.780 | 0.960 | 1.000

Variable | 적색자료서등재 여부 | 보호종 여부 | 조사방법 | 조사자 소속기관
적색자료서등재 여부 | 1.000 | 0.393 | 0.086 | 0.112
보호종 여부 | 0.393 | 1.000 | 0.141 | 0.173
조사방법 | 0.086 | 0.141 | 1.000 | 0.738
조사자 소속기관 | 0.112 | 0.173 | 0.738 | 1.000

Variable | 자료번호 | 고유번호(원병오1993) | 관찰개체수 | 지역그리드번호 | 보호종 여부 | 적색자료서등재 여부 | 조사자 소속기관 | 조사방법
자료번호 | 1.000 | 0.222 | -0.153 | 0.083 | 0.118 | 0.067 | 0.120 | 0.088
고유번호(원병오1993) | 0.222 | 1.000 | -0.265 | 0.113 | 0.501 | 0.134 | 0.238 | 0.199
관찰개체수 | -0.153 | -0.265 | 1.000 | -0.068 | 0.056 | 0.153 | 0.000 | 0.000
지역그리드번호 | 0.083 | 0.113 | -0.068 | 1.000 | 0.087 | 0.067 | 0.466 | 0.772
보호종 여부 | 0.118 | 0.501 | 0.056 | 0.087 | 1.000 | 0.393 | 0.173 | 0.141
적색자료서등재 여부 | 0.067 | 0.134 | 0.153 | 0.067 | 0.393 | 1.000 | 0.112 | 0.086
조사자 소속기관 | 0.120 | 0.238 | 0.000 | 0.466 | 0.173 | 0.112 | 1.000 | 0.738
조사방법 | 0.088 | 0.199 | 0.000 | 0.772 | 0.141 | 0.086 | 0.738 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
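
One common way to quantify nullity correlation is to turn each column into a 0/1 missing-value indicator and take the Pearson correlation of the two indicators. The sketch below does this with the standard library only. The function name and the toy columns are hypothetical, `None` stands in for a missing cell, and the report's own computation may differ.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is entirely present or entirely missing.
        return math.nan
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns, not taken from the dataset.
surveyor = ["A", None, "B", None]
affiliation = ["X", None, "Y", None]
method = [None, "M", None, "M"]

print(nullity_correlation(surveyor, affiliation))  # missing together
print(nullity_correlation(surveyor, method))       # missing in opposite rows
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is present where the other is missing, and columns with no missing cells have no defined nullity correlation.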

Sample

Index | 자료번호 | 보호종 여부 | 보호종 번호 | 고유번호(원병오1993) | 국명(원병오1993) | 국명(원병오2000) | 학명(원병오1993) | 학명(원병오2000) | 영문명(원병오1993) | 영문명(원병오2000) | 적색자료서등재 여부 | 관찰개체수 | 조사자 | 조사자 소속기관 | 조사방법 | 조사지역 | 행정구역 | 문헌명 | 지역그리드번호 | 관찰시기
6842 | 6843 | N | <NA> | 61 | 청둥오리 | 청둥오리 | Anas platyrhynchos | Anas platyrhynchos | Mallard | Mallard | N | 1000 | 국립환경연구원 야생동물과 | <NA> | <NA> | 순천만 일대 | 전라남도 순천시 인안동, 별량면, 해룡면 일대 | 김진한.1998.한국에 도래하는 철새의 생태와 보호관리(특히 서해안에 도래하는 수조류에 대하여).경희대학교 대학원 박사학위논문 | 1432 | 1997-12-10
22341 | 22342 | N | <NA> | 422 | 참새 | 참새 | Passer montanus | Passer montanus | Eurasian Tree Sparrow | Eurasian Tree Sparrow | N | 23 | 박찬열 | 서울대학교 | 석사학위논문 | 서울시 도시 환경림, 월곡동 | 서울 성북구 | 박찬열. 1994. 야생조류의 서식에 적합한 도시환경림 조성 및 관리방안 . 서울대학교 석사학위논문 | 1173 | 1992-05-01
17535 | 17536 | N | <NA> | 290 | 노랑할미새 | 노랑할미새 | Motacilla cinerea | Motacilla cinerea | Grey Wagtail | Grey Wagtail | N | 1 | 윤무부, 권혁두 | 자연보호중앙협의회 | 생물자원 현황 파악 | 백령도, | 인천광역시 옹진군 백령면 | 윤무부, 권혁두. 1987. 백령도 및 대청.소청도의 조류조사. 자연실태종합조사 보고서. 자연보호중앙협의회. | 418 | 1987-06-25
15139 | 15140 | N | <NA> | 68 | 알락오리 | 알락오리 | Anas strepera | Anas strepera | Gadwall | Gadwall | N | 2 | 전라남도 실태조사원 | <NA> | <NA> | 해창만 | 전라남도 고흥군 포두면 | 전라남도.2001.전남지방 겨울철새-서식실태 조사보고서.58-59 | 1377 | 1999-12-15
991 | 992 | N | <NA> | 164 | 민물도요 | 민물도요 | Calidris alpina | Calidris alpina | Dunlin | Dunlin | N | 8 | 원병오,외 2명 | <NA> | <NA> | 백령도 화동갯벌 및 습지 | 인천광역시 옹진군 백령면 | 원병오외 2명.1992.비무장지대 인접지역(백령도)의 조류. 비무장지대 인접지역(민통선 지역)의 자연생태계 조사보고서.환경부. 701-711 | 418 | 1991-09-15
15908 | 15909 | N | <NA> | 19 | 가마우지 | 가마우지 | Phalacrocorax filamentosus | Phalacrocorax capillatus | Japanese Cormorant | Japanese Cormorant | N | 405 | 박성근, 김학진, 정명숙 | 한국조류연구소 | <NA> | 낙동강하구, 하신일대 | 부산직할시 사하구 하신일대 35도 15분N, 129도 12분E | 경희대학교조사자료 | 2014 | 1994-02-19
20833 | 20834 | Y | PR35 | 189 | 알락꼬리마도요 | 알락꼬리마도요 | Numenius madagascariensis | Numenius madagascariensis | Far Eastern Curlew | Far Eastern Curlew | N | 1 | 이상근 | 경남대학교 | 석사학위논문 | 고성군 해안지역, | 경상남도 고성군 동해면 | 이상근. 1997. 경남대 교육대학원 석사학위논문. | 2011 | 1995-03-01
7235 | 7236 | Y | PR10 | 66 | 가창오리 | 가창오리 | Anas formosa | Anas formosa | Baikal Teal | Baikal Teal | Y | 102500 | <NA> | <NA> | <NA> | 아산만 | 경기도 평택시 현덕면 및 충청남도 아산시 영인면 일대 | 환경부.1997.전국 겨울철새 동시센서스 | 1122 | 1997-01-31
8732 | 8733 | Y | NM201 PR9 | 56 | 고니 | 고니 | Cygnus columbianus | Cygnus columbianus | Bewick's Swan | Bewick's Swan | N | 45 | 허위행,이종남 | <NA> | <NA> | 낙동강하구 | 부산광역시 사하구, 강서구 | 허위행,이종남,이인섭,우용태.1999.낙동강 하구의 조류상과 중요 습지로서의 평가.한국조류학회지.Vol6,No1.47-56 | 2300 | 1997-03-01
20482 | 20483 | N | <NA> | 375 | 쇠박새 | 쇠박새 | Parus palustris | Parus palustris | Marsh Tit | Marsh Tit | N | 3 | 우한정, 백남극 | 한국자연보존협회 | 지리산 북부지역 일대의 하계 조류상조사 | 지리산 , 한신계곡 | 경상남도 함양군, 전라북도 남원군 | 우한정, 백남극. 1993. 지리산 북부지대 일대의 하계 조류상. 한국자연보존협회 조사보고서. 제 31호, pp. 123-132. | 1543 | 1992-07-29

Index | 자료번호 | 보호종 여부 | 보호종 번호 | 고유번호(원병오1993) | 국명(원병오1993) | 국명(원병오2000) | 학명(원병오1993) | 학명(원병오2000) | 영문명(원병오1993) | 영문명(원병오2000) | 적색자료서등재 여부 | 관찰개체수 | 조사자 | 조사자 소속기관 | 조사방법 | 조사지역 | 행정구역 | 문헌명 | 지역그리드번호 | 관찰시기
1448 | 1449 | N | <NA> | 206 | 붉은부리갈매기 | 붉은부리갈매기 | Larus ridibundus | Larus ridibundus | Common Black-headed Gull | Common Black-headed Gull | N | 32 | 국립환경연구원 야생동물과 | <NA> | <NA> | 대구 화원유원지(흑두루미 도래지) | 경상북도 대구광역시 | 김진한.1998.한국에 도래하는 철새의 생태와 보호관리(특히 서해안에 도래하는 수조류에 대하여).경희대학교 대학원 박사학위논문 | 2003 | 1993-12-20
3159 | 3160 | N | <NA> | 439 | 까마귀 | 까마귀 | Corvus corone | Corvus corone | Carrion Crow | Carrion Crow | N | 2 | 국립환경연구원 야생동물과 | <NA> | <NA> | 강원도 계방산 | 강원도 홍천군 내면 창촌1리 | 국립환경연구원야생동물과 조사자료 | 1982 | 1995-09-26
16803 | 16804 | N | <NA> | 434 | 까치 | 까치 | Pica pica | Pica pica | Black-billed Magpie | Black-billed Magpie | N | 6 | 함규황 | 자연보호중앙협의회 | 생태계 조사 | 대청호 일대, 군북면 | 충북 옥천군 군북면 | 함규황. 1991. 대청호 호소 생태계 조사연구보고서. 자연보호중앙협의회. | 1533 | 1991-03-21
15065 | 15066 | N | <NA> | 290 | 알락할미새 | 알락할미새 | Motacilla alba leucopsis | Motacilla alba | White-faced Pied Wagtail | White(Pied) Wagtail | N | 4 | 백운기,외 9인 | <NA> | <NA> | 주남저수지 | 경상남도 창원시 동읍 | 백운기외 9인.2000.천연기념물 조류의 월동 실태조사.421-443 | 2124 | 1999-12-12
20027 | 20028 | N | <NA> | 8 | 뿔논병아리 | 뿔논병아리 | Podiceps cristatus | Podiceps cristatus | Great Crested Grebe | Great Crested Grebe | N | 3 | 배병옥 | 경남대학교 | 석사학위논문 | 거제도 해안지역, 거제군 남부면 저구리-갈곶리 | 경상남도 거제군 | 배병옥. 1991. 경남대 교육대학원 석사학위논문. | 2130 | 1989-11-26
4659 | 4660 | N | <NA> | 373 | 오목눈이 | 오목눈이 | Aegithalos caudatus | Aegithalos caudatus | Long-tailed Tit | Long-tailed Tit | N | 15 | 정명숙,원병오 | <NA> | <NA> | 경기도 시흥시 무지동 점말 구릉산지 일대 | 경기도 시흥시 | 정명숙,원병오.1999.고속도로 건설지역에 있어서의 농촌산림조류의 생태와 보호.한국조류학회지.vol6,No1.17-33 | 1059 | 1996-12-01
22430 | 22431 | N | <NA> | 61 | 청둥오리 | 청둥오리 | Anas platyrhynchos | Anas platyrhynchos | Mallard | Mallard | N | 4 | 함규황 | 자연보호중앙협의회 | 생태계 조사 | 대청호 일대, 동이면 | 충북 옥천군 동이면 | 함규황. 1991. 대청호 호소 생태계 조사연구보고서. 자연보호중앙협의회. | 1534 | 1990-10-26
4679 | 4680 | N | <NA> | 394 | 노랑턱멧새 | 노랑턱멧새 | Emberiza elegans | Emberiza elegans | Yellow-throated Bunting | Yellow-throated Bunting | N | 1 | 정명숙,원병오 | <NA> | <NA> | 경기도 시흥시 하중동 월대봉 | 경기도 시흥시 | 정명숙,원병오.1999.고속도로 건설지역에 있어서의 농촌산림조류의 생태와 보호.한국조류학회지.vol6,No1.17-33 | 1059 | 1996-12-01
5724 | 5725 | N | <NA> | 75 | 흰죽지 | 흰죽지 | Aythya ferina | Aythya ferina | Common Pochard | Common Pochard | N | 3 | 이종남 | <NA> | <NA> | 낙동강 대마등과 장자도 일원 | 부산광역시 사하구, 강서구 | 이종남.1998.낙동강 하구의 조류상에 관한 연구('96~'97)-표본구 A,B를 중심으로-.경성대학교조류연구소보2.53-73 | 2300 | 1996-05-01
11544 | 11545 | Y | PR6 | 50 | 큰기러기 | 큰기러기 | Anser fabalis | Anser fabalis | Bean Goose | Bean Goose | N | 300 | 산림청, 임업연 | <NA> | <NA> | 토교저수지 | 강원도 철원군 동송읍 이길리 | 산림청 임업연구원.1998.비무장지대 및 인접지역의 산림생태계 조사 4차 보고서 서부해안 및 도서지역.121-140 | 1340 | 1998-10-16