Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 5483
Missing cells (%): 5.0%
Duplicate rows: 6
Duplicate rows (%): 0.1%
Total size in memory: 966.8 KiB
Average record size in memory: 99.0 B
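
The percentages and the average record size follow from the raw counts. Below is a minimal sketch of that arithmetic, assuming missing cells are divided by variables × observations and that 1 KiB is 1024 bytes. The variable names are illustrative and do not come from the profiling tool.

```python
# Counts copied from the "Dataset statistics" block of this report.
n_vars = 11
n_obs = 10_000
missing_cells = 5_483
duplicate_rows = 6
total_kib = 966.8

# Share of all cells (variables x observations) that are missing.
missing_pct = 100 * missing_cells / (n_vars * n_obs)
# Share of observations flagged as duplicates.
duplicate_pct = 100 * duplicate_rows / n_obs
# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the figures reported in this block.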

Variable types

Text: 7
Numeric: 3
Categorical: 1

Dataset

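Description: 종코드 (species code), 국명 (Korean name), 학명 (scientific name), 서식지코드 (habitat code), 서식지명 (habitat name), 세부통계용명칭 (name used for detailed statistics), 출현년도 (year of occurrence), 원전 (source document), X좌표 (X coordinate), Y좌표 (Y coordinate), 서식지비고정보 (habitat remarks)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2200/S/1/datasetView.do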

Alerts

Dataset has 6 (0.1%) duplicate rows [Duplicates]
X좌표 is highly overall correlated with 서식지비고정보 [High correlation]
Y좌표 is highly overall correlated with 서식지비고정보 [High correlation]
서식지비고정보 is highly overall correlated with X좌표 and 1 other field [High correlation]
세부통계용명칭 has 1705 (17.1%) missing values [Missing]
X좌표 has 1885 (18.9%) missing values [Missing]
Y좌표 has 1885 (18.9%) missing values [Missing]

Reproduction

Analysis started: 2024-05-11 02:13:31.547094
Analysis finished: 2024-05-11 02:13:40.678697
Duration: 9.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
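
The duration is the difference between the two timestamps. A small check with Python's standard library, using the values copied from this section:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2024-05-11 02:13:31.547094")
finished = datetime.fromisoformat("2024-05-11 02:13:40.678697")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```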

Variables

종코드
Text

Distinct: 2636
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 50000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
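
As an illustration of these character properties, Python's standard unicodedata module exposes the general category of each code point. It does not expose script or block, so those two breakdowns are not reproduced here. The sketch below tallies categories for the five sample values of this variable. It is not the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["s2975", "s0226", "s1725", "s3855", "s0169"]

# Count the Unicode general category of every character.
# "Ll" is Lowercase Letter, "Nd" is Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```

Each value contributes one lowercase letter and four decimal digits, which is the same 20% / 80% split the report shows for the full column.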

Unique

Unique: 1196
Unique (%): 12.0%

Sample

1st row: s2975
2nd row: s0226
3rd row: s1725
4th row: s3855
5th row: s0169

Value | Count | Frequency (%)
s3918 | 35 | 0.4%
s0214 | 33 | 0.3%
s1261 | 32 | 0.3%
s4502 | 31 | 0.3%
s0712 | 31 | 0.3%
s1725 | 30 | 0.3%
s1978 | 30 | 0.3%
s4526 | 29 | 0.3%
s2078 | 28 | 0.3%
s3837 | 26 | 0.3%
Other values (2626) | 9695 | 97.0%

Most occurring characters

ValueCountFrequency (%)
s 10000
20.0%
2 5588
11.2%
1 5218
10.4%
3 5104
10.2%
0 5025
10.1%
4 4116
8.2%
5 3338
 
6.7%
8 3018
 
6.0%
9 2948
 
5.9%
7 2884
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40000
80.0%
Lowercase Letter 10000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 5588
14.0%
1 5218
13.0%
3 5104
12.8%
0 5025
12.6%
4 4116
10.3%
5 3338
8.3%
8 3018
7.5%
9 2948
7.4%
7 2884
7.2%
6 2761
6.9%
Lowercase Letter
ValueCountFrequency (%)
s 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 40000
80.0%
Latin 10000
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 5588
14.0%
1 5218
13.0%
3 5104
12.8%
0 5025
12.6%
4 4116
10.3%
5 3338
8.3%
8 3018
7.5%
9 2948
7.4%
7 2884
7.2%
6 2761
6.9%
Latin
ValueCountFrequency (%)
s 10000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 10000
20.0%
2 5588
11.2%
1 5218
10.4%
3 5104
10.2%
0 5025
10.1%
4 4116
8.2%
5 3338
 
6.7%
8 3018
 
6.0%
9 2948
 
5.9%
7 2884
 
5.8%

국명
Text

Distinct: 2642
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 11
Mean length: 4.3306
Min length: 1

Characters and Unicode

Total characters: 43306
Distinct characters: 626
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1200
Unique (%): 12.0%

Sample

1st row: 애기나리
2nd row: 개밀
3rd row: 멧비둘기
4th row: 질경이
5th row: 갈퀴덩굴

Value | Count | Frequency (%)
참새 | 35 | 0.4%
개망초 | 33 | 0.3%
닭의장풀 | 32 | 0.3%
멧비둘기 | 31 | 0.3%
까치 | 31 | 0.3%
환삼덩굴 | 31 | 0.3%
박새 | 30 | 0.3%
황새냉이 | 29 | 0.3%
뱀딸기 | 28 | 0.3%
무당벌레 | 26 | 0.3%
Other values (2620) | 9694 | 96.9%

Most occurring characters

ValueCountFrequency (%)
2315
 
5.3%
1768
 
4.1%
1537
 
3.5%
1104
 
2.5%
822
 
1.9%
775
 
1.8%
728
 
1.7%
661
 
1.5%
637
 
1.5%
634
 
1.5%
Other values (616) 32325
74.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 43262
99.9%
Space Separator 17
 
< 0.1%
Other Punctuation 16
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2315
 
5.4%
1768
 
4.1%
1537
 
3.6%
1104
 
2.6%
822
 
1.9%
775
 
1.8%
728
 
1.7%
661
 
1.5%
637
 
1.5%
634
 
1.5%
Other values (609) 32281
74.6%
Other Punctuation
ValueCountFrequency (%)
? 15
93.8%
1
 
6.2%
Space Separator
ValueCountFrequency (%)
17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Lowercase Letter
ValueCountFrequency (%)
f 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 43262
99.9%
Common 43
 
0.1%
Latin 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2315
 
5.4%
1768
 
4.1%
1537
 
3.6%
1104
 
2.6%
822
 
1.9%
775
 
1.8%
728
 
1.7%
661
 
1.5%
637
 
1.5%
634
 
1.5%
Other values (609) 32281
74.6%
Common
ValueCountFrequency (%)
17
39.5%
? 15
34.9%
) 4
 
9.3%
( 4
 
9.3%
- 2
 
4.7%
1
 
2.3%
Latin
ValueCountFrequency (%)
f 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 43262
99.9%
ASCII 43
 
0.1%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2315
 
5.4%
1768
 
4.1%
1537
 
3.6%
1104
 
2.6%
822
 
1.9%
775
 
1.8%
728
 
1.7%
661
 
1.5%
637
 
1.5%
634
 
1.5%
Other values (609) 32281
74.6%
ASCII
ValueCountFrequency (%)
17
39.5%
? 15
34.9%
) 4
 
9.3%
( 4
 
9.3%
- 2
 
4.7%
f 1
 
2.3%
None
ValueCountFrequency (%)
1
100.0%

학명
Text

Distinct: 3037
Distinct (%): 30.4%
Missing: 8
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 77
Median length: 61
Mean length: 26.818255
Min length: 7

Characters and Unicode

Total characters: 267968
Distinct characters: 71
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
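
The reported mean length appears to be the total character count divided by the number of non-missing values. The check below uses figures copied from this report for three Text variables. This reading is inferred from the numbers themselves, not from the tool's documentation.

```python
# (total characters, observations, missing values) copied from this report.
columns = {
    "국명": (43306, 10000, 0),
    "학명": (267968, 10000, 8),
    "세부통계용명칭": (30240, 10000, 1705),
}

# Mean length = total characters / number of non-missing values.
mean_length = {
    name: total_chars / (n_obs - n_missing)
    for name, (total_chars, n_obs, n_missing) in columns.items()
}

for name, value in mean_length.items():
    print(name, round(value, 7))
```

The results agree with the reported mean lengths (4.3306, 26.818255 and 3.6455696) to the precision shown in the report.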

Unique

Unique: 1555
Unique (%): 15.6%

Sample

1st row: Disporum smilacinum A. Gray
2nd row: Agropyron tsukushiense var. transiens (Hack.) Ohwi
3rd row: Streptopelia orientalis
4th row: Plantago asiatica L.
5th row: Galium spurium L.

Value | Count | Frequency (%)
l | 1507 | 4.5%
var | 734 | 2.2%
thunb | 427 | 1.3%
japonica | 417 | 1.3%
nakai | 362 | 1.1%
siebold | 310 | 0.9%
maxim | 279 | 0.8%
miq | 207 | 0.6%
makino | 191 | 0.6%
 | 188 | 0.6%
Other values (4523) | 28564 | 86.1%

Most occurring characters

ValueCountFrequency (%)
a 27231
 
10.2%
23801
 
8.9%
i 22119
 
8.3%
e 16806
 
6.3%
s 16087
 
6.0%
r 15770
 
5.9%
o 13298
 
5.0%
n 13201
 
4.9%
u 13052
 
4.9%
l 11647
 
4.3%
Other values (61) 94956
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 209847
78.3%
Space Separator 23801
 
8.9%
Uppercase Letter 21321
 
8.0%
Other Punctuation 6635
 
2.5%
Close Punctuation 3081
 
1.1%
Open Punctuation 3078
 
1.1%
Dash Punctuation 179
 
0.1%
Decimal Number 25
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 27231
13.0%
i 22119
10.5%
e 16806
 
8.0%
s 16087
 
7.7%
r 15770
 
7.5%
o 13298
 
6.3%
n 13201
 
6.3%
u 13052
 
6.2%
l 11647
 
5.6%
t 9940
 
4.7%
Other values (16) 50696
24.2%
Uppercase Letter
ValueCountFrequency (%)
L 2525
11.8%
S 2003
 
9.4%
P 1914
 
9.0%
C 1769
 
8.3%
M 1762
 
8.3%
A 1477
 
6.9%
B 1241
 
5.8%
T 1193
 
5.6%
H 932
 
4.4%
R 856
 
4.0%
Other values (16) 5649
26.5%
Other Punctuation
ValueCountFrequency (%)
. 6557
98.8%
: 38
 
0.6%
, 26
 
0.4%
& 11
 
0.2%
? 2
 
< 0.1%
; 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 14
56.0%
7 3
 
12.0%
8 3
 
12.0%
2 2
 
8.0%
9 2
 
8.0%
5 1
 
4.0%
Close Punctuation
ValueCountFrequency (%)
) 3079
99.9%
] 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 3076
99.9%
[ 2
 
0.1%
Space Separator
ValueCountFrequency (%)
23801
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 179
100.0%
Other Letter
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 231168
86.3%
Common 36799
 
13.7%
Hangul 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 27231
 
11.8%
i 22119
 
9.6%
e 16806
 
7.3%
s 16087
 
7.0%
r 15770
 
6.8%
o 13298
 
5.8%
n 13201
 
5.7%
u 13052
 
5.6%
l 11647
 
5.0%
t 9940
 
4.3%
Other values (42) 72017
31.2%
Common
ValueCountFrequency (%)
23801
64.7%
. 6557
 
17.8%
) 3079
 
8.4%
( 3076
 
8.4%
- 179
 
0.5%
: 38
 
0.1%
, 26
 
0.1%
1 14
 
< 0.1%
& 11
 
< 0.1%
7 3
 
< 0.1%
Other values (8) 15
 
< 0.1%
Hangul
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 267967
> 99.9%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 27231
 
10.2%
23801
 
8.9%
i 22119
 
8.3%
e 16806
 
6.3%
s 16087
 
6.0%
r 15770
 
5.9%
o 13298
 
5.0%
n 13201
 
4.9%
u 13052
 
4.9%
l 11647
 
4.3%
Other values (60) 94955
35.4%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

서식지코드
Text

Distinct: 244
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 50000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 0.3%

Sample

1st row: p0205
2nd row: p0222
3rd row: p0175
4th row: p0008
5th row: p0293

Value | Count | Frequency (%)
p0102 | 649 | 6.5%
p0058 | 561 | 5.6%
p0198 | 529 | 5.3%
p0050 | 335 | 3.4%
p0036 | 324 | 3.2%
p0243 | 295 | 2.9%
p0271 | 244 | 2.4%
p0279 | 237 | 2.4%
p0249 | 237 | 2.4%
p0148 | 218 | 2.2%
Other values (234) | 6371 | 63.7%

Most occurring characters

ValueCountFrequency (%)
0 15109
30.2%
p 10000
20.0%
2 4951
 
9.9%
1 4082
 
8.2%
3 3383
 
6.8%
5 2763
 
5.5%
9 2438
 
4.9%
8 2358
 
4.7%
4 1940
 
3.9%
6 1591
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40000
80.0%
Lowercase Letter 10000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 15109
37.8%
2 4951
 
12.4%
1 4082
 
10.2%
3 3383
 
8.5%
5 2763
 
6.9%
9 2438
 
6.1%
8 2358
 
5.9%
4 1940
 
4.9%
6 1591
 
4.0%
7 1385
 
3.5%
Lowercase Letter
ValueCountFrequency (%)
p 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 40000
80.0%
Latin 10000
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 15109
37.8%
2 4951
 
12.4%
1 4082
 
10.2%
3 3383
 
8.5%
5 2763
 
6.9%
9 2438
 
6.1%
8 2358
 
5.9%
4 1940
 
4.9%
6 1591
 
4.0%
7 1385
 
3.5%
Latin
ValueCountFrequency (%)
p 10000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 15109
30.2%
p 10000
20.0%
2 4951
 
9.9%
1 4082
 
8.2%
3 3383
 
6.8%
5 2763
 
5.5%
9 2438
 
4.9%
8 2358
 
4.7%
4 1940
 
3.9%
6 1591
 
3.2%

서식지명
Text

Distinct: 278
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 24
Mean length: 4.8547
Min length: 2

Characters and Unicode

Total characters: 48547
Distinct characters: 246
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 0.3%

Sample

1st row: 인왕산
2nd row: 중랑천
3rd row: 여의도샛강
4th row: 개포동 달터근린공원
5th row: 천왕산

Value | Count | Frequency (%)
생태경관보전지역 | 842 | 7.1%
북한산 | 649 | 5.5%
남산 | 627 | 5.3%
월드컵공원 | 529 | 4.5%
청계산 | 333 | 2.8%
길동생태공원 | 326 | 2.7%
관악산 | 325 | 2.7%
탄천 | 237 | 2.0%
헌인릉 | 237 | 2.0%
수락산 | 218 | 1.8%
Other values (321) | 7559 | 63.6%

Most occurring characters

ValueCountFrequency (%)
3600
 
7.4%
1882
 
3.9%
1723
 
3.5%
1638
 
3.4%
1358
 
2.8%
1354
 
2.8%
1307
 
2.7%
1278
 
2.6%
1207
 
2.5%
1130
 
2.3%
Other values (236) 32070
66.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 42922
88.4%
Space Separator 1882
 
3.9%
Decimal Number 1671
 
3.4%
Uppercase Letter 1510
 
3.1%
Math Symbol 358
 
0.7%
Dash Punctuation 54
 
0.1%
Open Punctuation 52
 
0.1%
Close Punctuation 52
 
0.1%
Other Punctuation 44
 
0.1%
Lowercase Letter 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3600
 
8.4%
1723
 
4.0%
1638
 
3.8%
1358
 
3.2%
1354
 
3.2%
1307
 
3.0%
1278
 
3.0%
1207
 
2.8%
1130
 
2.6%
1033
 
2.4%
Other values (208) 27294
63.6%
Decimal Number
ValueCountFrequency (%)
1 312
18.7%
2 266
15.9%
4 222
13.3%
5 219
13.1%
3 207
12.4%
6 134
8.0%
7 108
 
6.5%
8 84
 
5.0%
9 60
 
3.6%
0 59
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
C 317
21.0%
A 262
17.4%
H 236
15.6%
E 201
13.3%
G 157
10.4%
F 135
8.9%
B 115
 
7.6%
D 87
 
5.8%
Other Punctuation
ValueCountFrequency (%)
? 42
95.5%
/ 1
 
2.3%
. 1
 
2.3%
Lowercase Letter
ValueCountFrequency (%)
k 1
50.0%
m 1
50.0%
Space Separator
ValueCountFrequency (%)
1882
100.0%
Math Symbol
ValueCountFrequency (%)
~ 358
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 54
100.0%
Open Punctuation
ValueCountFrequency (%)
( 52
100.0%
Close Punctuation
ValueCountFrequency (%)
) 52
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 42922
88.4%
Common 4113
 
8.5%
Latin 1512
 
3.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3600
 
8.4%
1723
 
4.0%
1638
 
3.8%
1358
 
3.2%
1354
 
3.2%
1307
 
3.0%
1278
 
3.0%
1207
 
2.8%
1130
 
2.6%
1033
 
2.4%
Other values (208) 27294
63.6%
Common
ValueCountFrequency (%)
1882
45.8%
~ 358
 
8.7%
1 312
 
7.6%
2 266
 
6.5%
4 222
 
5.4%
5 219
 
5.3%
3 207
 
5.0%
6 134
 
3.3%
7 108
 
2.6%
8 84
 
2.0%
Other values (8) 321
 
7.8%
Latin
ValueCountFrequency (%)
C 317
21.0%
A 262
17.3%
H 236
15.6%
E 201
13.3%
G 157
10.4%
F 135
8.9%
B 115
 
7.6%
D 87
 
5.8%
k 1
 
0.1%
m 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 42922
88.4%
ASCII 5625
 
11.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3600
 
8.4%
1723
 
4.0%
1638
 
3.8%
1358
 
3.2%
1354
 
3.2%
1307
 
3.0%
1278
 
3.0%
1207
 
2.8%
1130
 
2.6%
1033
 
2.4%
Other values (208) 27294
63.6%
ASCII
ValueCountFrequency (%)
1882
33.5%
~ 358
 
6.4%
C 317
 
5.6%
1 312
 
5.5%
2 266
 
4.7%
A 262
 
4.7%
H 236
 
4.2%
4 222
 
3.9%
5 219
 
3.9%
3 207
 
3.7%
Other values (18) 1344
23.9%

세부통계용명칭
Text

MISSING 

Distinct: 61
Distinct (%): 0.7%
Missing: 1705
Missing (%): 17.1%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 8
Mean length: 3.6455696
Min length: 2

Characters and Unicode

Total characters: 30240
Distinct characters: 108
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 인왕산
2nd row: 중랑천
3rd row: 여의도샛강
4th row: 달터근린공원
5th row: 한강

Value | Count | Frequency (%)
한강 | 1150 | 12.7%
북한산 | 697 | 7.7%
국립공원 | 697 | 7.7%
남산 | 667 | 7.4%
월드컵공원 | 621 | 6.9%
청계산 | 427 | 4.7%
길동생태공원 | 335 | 3.7%
중랑천 | 325 | 3.6%
관악산 | 324 | 3.6%
청계천 | 289 | 3.2%
Other values (52) | 3524 | 38.9%

Most occurring characters

ValueCountFrequency (%)
3550
 
11.7%
1986
 
6.6%
1915
 
6.3%
1847
 
6.1%
1359
 
4.5%
1326
 
4.4%
761
 
2.5%
756
 
2.5%
756
 
2.5%
752
 
2.5%
Other values (98) 15232
50.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29479
97.5%
Space Separator 761
 
2.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3550
 
12.0%
1986
 
6.7%
1915
 
6.5%
1847
 
6.3%
1359
 
4.6%
1326
 
4.5%
756
 
2.6%
756
 
2.6%
752
 
2.6%
716
 
2.4%
Other values (97) 14516
49.2%
Space Separator
ValueCountFrequency (%)
761
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 29479
97.5%
Common 761
 
2.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3550
 
12.0%
1986
 
6.7%
1915
 
6.5%
1847
 
6.3%
1359
 
4.6%
1326
 
4.5%
756
 
2.6%
756
 
2.6%
752
 
2.6%
716
 
2.4%
Other values (97) 14516
49.2%
Common
ValueCountFrequency (%)
761
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 29479
97.5%
ASCII 761
 
2.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3550
 
12.0%
1986
 
6.7%
1915
 
6.5%
1847
 
6.3%
1359
 
4.6%
1326
 
4.5%
756
 
2.6%
756
 
2.6%
752
 
2.6%
716
 
2.4%
Other values (97) 14516
49.2%
ASCII
ValueCountFrequency (%)
761
100.0%

출현년도
Real number (ℝ)

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.5167
Minimum: 1948
Maximum: 2012
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1948
5-th percentile: 1994
Q1: 2001
Median: 2004
Q3: 2006
95-th percentile: 2009
Maximum: 2012
Range: 64
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.7497761
Coefficient of variation (CV): 0.0033706466
Kurtosis: 32.408135
Mean: 2002.5167
Median Absolute Deviation (MAD): 2
Skewness: -4.7330307
Sum: 20025167
Variance: 45.559477
Monotonicity: Not monotonic
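
Several of these statistics are derived from one another. A short sketch using the values reported for 출현년도, with the standard definitions CV = standard deviation / mean, variance = standard deviation squared, range = maximum − minimum and IQR = Q3 − Q1:

```python
# Values copied from the statistics reported for 출현년도.
mean = 2002.5167
std = 6.7497761
minimum, maximum = 1948, 2012
q1, q3 = 2001, 2006

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(cv, variance, value_range, iqr)
```

The derived values agree with the reported CV, variance, range and IQR to the precision shown in the report.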
Histogram with fixed size bins (bins=24)
Value | Count | Frequency (%)
2004 | 2271 | 22.7%
2007 | 1025 | 10.2%
2006 | 990 | 9.9%
2001 | 810 | 8.1%
2002 | 809 | 8.1%
2005 | 767 | 7.7%
2009 | 642 | 6.4%
2003 | 574 | 5.7%
1999 | 511 | 5.1%
1997 | 376 | 3.8%
Other values (14) | 1225 | 12.2%

Value | Count | Frequency (%)
1948 | 76 | 0.8%
1972 | 53 | 0.5%
1984 | 8 | 0.1%
1986 | 76 | 0.8%
1987 | 72 | 0.7%
1989 | 33 | 0.3%
1992 | 66 | 0.7%
1993 | 93 | 0.9%
1994 | 156 | 1.6%
1996 | 29 | 0.3%

Value | Count | Frequency (%)
2012 | 53 | 0.5%
2009 | 642 | 6.4%
2008 | 131 | 1.3%
2007 | 1025 | 10.2%
2006 | 990 | 9.9%
2005 | 767 | 7.7%
2004 | 2271 | 22.7%
2003 | 574 | 5.7%
2002 | 809 | 8.1%
2001 | 810 | 8.1%

원전
Text

Distinct: 95
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 46
Median length: 36
Mean length: 19.931
Min length: 6

Characters and Unicode

Total characters: 199310
Distinct characters: 200
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 산림생태계조사 연구보고서
2nd row: 2007년 한강생태계 조사
3rd row: 2007년 한강생태계 조사
4th row: 서울시 우수 생태계지역 정밀조사 연구
5th row: 서울시 도시숲(산림) 생태계 조사 학술 연구

Value | Count | Frequency (%)
서울시 | 2813 | 7.1%
한강생태계 | 1587 | 4.0%
 | 1531 | 3.9%
생물다양성 | 1510 | 3.8%
증진방안 | 1510 | 3.8%
비오톱유형별 | 1510 | 3.8%
조사 | 1482 | 3.8%
조사연구 | 1075 | 2.7%
2007년 | 1055 | 2.7%
연구 | 1025 | 2.6%
Other values (203) | 24369 | 61.7%

Most occurring characters

ValueCountFrequency (%)
30481
 
15.3%
8642
 
4.3%
6441
 
3.2%
5628
 
2.8%
5065
 
2.5%
4841
 
2.4%
4779
 
2.4%
0 4577
 
2.3%
4268
 
2.1%
4210
 
2.1%
Other values (190) 120378
60.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 153715
77.1%
Space Separator 30481
 
15.3%
Decimal Number 10430
 
5.2%
Open Punctuation 1745
 
0.9%
Close Punctuation 1745
 
0.9%
Other Punctuation 792
 
0.4%
Dash Punctuation 208
 
0.1%
Lowercase Letter 152
 
0.1%
Uppercase Letter 42
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8642
 
5.6%
6441
 
4.2%
5628
 
3.7%
5065
 
3.3%
4841
 
3.1%
4779
 
3.1%
4268
 
2.8%
4210
 
2.7%
3954
 
2.6%
3509
 
2.3%
Other values (168) 102378
66.6%
Decimal Number
ValueCountFrequency (%)
0 4577
43.9%
2 2578
24.7%
7 1421
 
13.6%
8 443
 
4.2%
4 426
 
4.1%
6 326
 
3.1%
3 285
 
2.7%
1 170
 
1.6%
5 132
 
1.3%
9 72
 
0.7%
Other Punctuation
ValueCountFrequency (%)
, 277
35.0%
? 233
29.4%
. 204
25.8%
: 72
 
9.1%
/ 6
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
I 21
50.0%
V 21
50.0%
Space Separator
ValueCountFrequency (%)
30481
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1745
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1745
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 208
100.0%
Lowercase Letter
ValueCountFrequency (%)
p 152
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 153715
77.1%
Common 45401
 
22.8%
Latin 194
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8642
 
5.6%
6441
 
4.2%
5628
 
3.7%
5065
 
3.3%
4841
 
3.1%
4779
 
3.1%
4268
 
2.8%
4210
 
2.7%
3954
 
2.6%
3509
 
2.3%
Other values (168) 102378
66.6%
Common
ValueCountFrequency (%)
30481
67.1%
0 4577
 
10.1%
2 2578
 
5.7%
( 1745
 
3.8%
) 1745
 
3.8%
7 1421
 
3.1%
8 443
 
1.0%
4 426
 
0.9%
6 326
 
0.7%
3 285
 
0.6%
Other values (9) 1374
 
3.0%
Latin
ValueCountFrequency (%)
p 152
78.4%
I 21
 
10.8%
V 21
 
10.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 153715
77.1%
ASCII 45595
 
22.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
30481
66.9%
0 4577
 
10.0%
2 2578
 
5.7%
( 1745
 
3.8%
) 1745
 
3.8%
7 1421
 
3.1%
8 443
 
1.0%
4 426
 
0.9%
6 326
 
0.7%
3 285
 
0.6%
Other values (12) 1568
 
3.4%
Hangul
ValueCountFrequency (%)
8642
 
5.6%
6441
 
4.2%
5628
 
3.7%
5065
 
3.3%
4841
 
3.1%
4779
 
3.1%
4268
 
2.8%
4210
 
2.7%
3954
 
2.6%
3509
 
2.3%
Other values (168) 102378
66.6%

X좌표
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 144
Distinct (%): 1.8%
Missing: 1885
Missing (%): 18.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 202748.8
Minimum: 182204.3
Maximum: 256839.46
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 182204.3
5-th percentile: 189443.6
Q1: 196429
Median: 199406.4
Q3: 207170.8
95-th percentile: 213686.3
Maximum: 256839.46
Range: 74635.157
Interquartile range (IQR): 10741.8

Descriptive statistics

Standard deviation: 10891.678
Coefficient of variation (CV): 0.053720059
Kurtosis: 6.7382043
Mean: 202748.8
Median Absolute Deviation (MAD): 6693.1
Skewness: 1.9060867
Sum: 1.6453065 × 10^9
Variance: 1.1862864 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
198426.6 | 649 | 6.5%
199375.0 | 561 | 5.6%
190107.9 | 529 | 5.3%
213611.0 | 335 | 3.4%
196429.0 | 324 | 3.2%
203920.1 | 295 | 2.9%
198658.9 | 244 | 2.4%
208246.3 | 237 | 2.4%
207170.8 | 237 | 2.4%
206693.8 | 218 | 2.2%
Other values (134) | 4486 | 44.9%
(Missing) | 1885 | 18.9%

Value | Count | Frequency (%)
182204.3 | 49 | 0.5%
182514.7 | 10 | 0.1%
182711.8 | 1 | < 0.1%
182726.2 | 60 | 0.6%
182949.6 | 5 | 0.1%
184424.4 | 24 | 0.2%
184641.9 | 11 | 0.1%
185160.6 | 4 | < 0.1%
185616.0 | 6 | 0.1%
186082.6 | 1 | < 0.1%

Value | Count | Frequency (%)
256839.4568 | 27 | 0.3%
254100.1354 | 10 | 0.1%
252811.7026 | 81 | 0.8%
246332.7914 | 25 | 0.2%
234793.9968 | 62 | 0.6%
233493.0421 | 40 | 0.4%
232023.3238 | 59 | 0.6%
214034.7 | 8 | 0.1%
213686.3 | 127 | 1.3%
213611.0 | 335 | 3.4%

Y좌표
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 145
Distinct (%): 1.8%
Missing: 1885
Missing (%): 18.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 456607.36
Minimum: 437272.4
Maximum: 608607.71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 437272.4
5-th percentile: 437564.6
Q1: 445458
Median: 451442.6
Q3: 453647.8
95-th percentile: 465056.8
Maximum: 608607.71
Range: 171335.31
Interquartile range (IQR): 8189.8

Descriptive statistics

Standard deviation: 30005.006
Coefficient of variation (CV): 0.065712927
Kurtosis: 14.95384
Mean: 456607.36
Median Absolute Deviation (MAD): 4871.5
Skewness: 3.9732704
Sum: 3.7053687 × 10^9
Variance: 9.003004 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
461398.5 | 649 | 6.5%
449747.8 | 561 | 5.6%
451442.6 | 529 | 5.3%
448693.0 | 335 | 3.4%
440098.8 | 324 | 3.2%
437272.4 | 295 | 2.9%
445458.0 | 244 | 2.4%
443827.0 | 237 | 2.4%
440246.8 | 237 | 2.4%
465056.8 | 218 | 2.2%
Other values (135) | 4486 | 44.9%
(Missing) | 1885 | 18.9%

Value | Count | Frequency (%)
437272.4 | 295 | 2.9%
437564.6 | 132 | 1.3%
437765.4 | 7 | 0.1%
440098.8 | 324 | 3.2%
440246.8 | 237 | 2.4%
440298.9 | 1 | < 0.1%
440690.2 | 16 | 0.2%
440774.0 | 5 | 0.1%
440980.5 | 41 | 0.4%
441372.6 | 115 | 1.1%

Value | Count | Frequency (%)
608607.7106 | 20 | 0.2%
598855.5575 | 59 | 0.6%
597511.3537 | 25 | 0.2%
591534.6811 | 51 | 0.5%
586144.1399 | 27 | 0.3%
585739.9074 | 81 | 0.8%
585505.296 | 10 | 0.1%
580078.2372 | 62 | 0.6%
578962.5137 | 40 | 0.4%
465056.8 | 218 | 2.2%

서식지비고정보
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

동경측지계: 7740
<NA>: 1657
세계측지계: 603

Length

Max length: 5
Median length: 5
Mean length: 4.8343
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 동경측지계
2nd row: 동경측지계
3rd row: 동경측지계
4th row: 동경측지계
5th row: 세계측지계

Common Values

Value | Count | Frequency (%)
동경측지계 | 7740 | 77.4%
<NA> | 1657 | 16.6%
세계측지계 | 603 | 6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
동경측지계 | 7740 | 77.4%
na | 1657 | 16.6%
세계측지계 | 603 | 6.0%

Interactions

[Nine interaction plots; images not preserved]

Correlations

 | 세부통계용명칭 | 출현년도 | 원전 | X좌표 | Y좌표 | 서식지비고정보
세부통계용명칭 | 1.000 | 0.784 | 0.992 | 0.987 | 0.993 | 0.903
출현년도 | 0.784 | 1.000 | 1.000 | 0.288 | 0.219 | 0.208
원전 | 0.992 | 1.000 | 1.000 | 0.933 | 0.948 | 0.946
X좌표 | 0.987 | 0.288 | 0.933 | 1.000 | 0.810 | 0.866
Y좌표 | 0.993 | 0.219 | 0.948 | 0.810 | 1.000 | 1.000
서식지비고정보 | 0.903 | 0.208 | 0.946 | 0.866 | 1.000 | 1.000

 | 출현년도 | X좌표 | Y좌표 | 서식지비고정보
출현년도 | 1.000 | 0.130 | 0.063 | 0.151
X좌표 | 0.130 | 1.000 | 0.020 | 0.899
Y좌표 | 0.063 | 0.020 | 1.000 | 1.000
서식지비고정보 | 0.151 | 0.899 | 1.000 | 1.000
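
The report does not state which coefficient produces these values, so the following is only an illustration of why Y좌표 and 서식지비고정보 can be so strongly associated. It computes a correlation ratio (eta), a technique chosen here for illustration, on the ten (Y좌표, 서식지비고정보) pairs from the first sample table of this report. In those rows, coordinates tagged 세계측지계 (world geodetic system) are far larger than those tagged 동경측지계 (Tokyo datum). The function name is hypothetical.

```python
# (Y좌표, 서식지비고정보) pairs transcribed from the first ten sample rows.
rows = [
    (453421.1, "동경측지계"),
    (453647.8, "동경측지계"),
    (446182.8, "동경측지계"),
    (442054.9, "동경측지계"),
    (585739.9074, "세계측지계"),
    (449747.8, "동경측지계"),
    (608607.7106, "세계측지계"),
    (443357.8, "동경측지계"),
    (440690.2, "동경측지계"),
    (448693.0, "동경측지계"),
]

def correlation_ratio(pairs):
    """Return eta = sqrt(between-group sum of squares / total sum of squares)."""
    values = [y for y, _ in pairs]
    grand_mean = sum(values) / len(values)
    groups = {}
    for y, label in pairs:
        groups.setdefault(label, []).append(y)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    return (ss_between / ss_total) ** 0.5

eta = correlation_ratio(rows)
print(round(eta, 3))
```

On these ten rows eta comes out close to 1, because the datum label alone almost fully determines the magnitude of the Y coordinate.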

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
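
One common way to compute a nullity correlation is the Pearson correlation between per-column missing-value indicators. The sketch below assumes that definition, which may differ from the tool's exact implementation. It uses the missing-value pattern of the ten rows in the second sample table of this report, where 세부통계용명칭, X좌표 and Y좌표 are missing in the same four rows.

```python
# 1 = value missing, 0 = value present, transcribed from the ten rows
# of the second sample table in this report.
missing = {
    "세부통계용명칭": [1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
    "X좌표": [1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
    "Y좌표": [1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
}

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

r = pearson(missing["세부통계용명칭"], missing["X좌표"])
print(round(r, 3))
```

In these ten rows the indicators are identical, so the correlation is 1. Across the full dataset the two columns have different missing counts (1705 and 1885), so the value there cannot be exactly 1.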

Sample

(index) | 종코드 | 국명 | 학명 | 서식지코드 | 서식지명 | 세부통계용명칭 | 출현년도 | 원전 | X좌표 | Y좌표 | 서식지비고정보
38552 | s2975 | 애기나리 | Disporum smilacinum A. Gray | p0205 | 인왕산 | 인왕산 | 1998 | 산림생태계조사 연구보고서 | 196252.9 | 453421.1 | 동경측지계
40028 | s0226 | 개밀 | Agropyron tsukushiense var. transiens (Hack.) Ohwi | p0222 | 중랑천 | 중랑천 | 2007 | 2007년 한강생태계 조사 | 206389.4 | 453647.8 | 동경측지계
31807 | s1725 | 멧비둘기 | Streptopelia orientalis | p0175 | 여의도샛강 | 여의도샛강 | 2006 | 2007년 한강생태계 조사 | 192687.5 | 446182.8 | 동경측지계
259 | s3855 | 질경이 | Plantago asiatica L. | p0008 | 개포동 달터근린공원 | 달터근린공원 | 2001 | 서울시 우수 생태계지역 정밀조사 연구 | 204367.0 | 442054.9 | 동경측지계
53369 | s0169 | 갈퀴덩굴 | Galium spurium L. | p0293 | 천왕산 | 한강 | 2007 | 서울시 도시숲(산림) 생태계 조사 학술 연구 | 252811.7026 | 585739.9074 | 세계측지계
11790 | s1353 | 도둑놈의갈고리 | Desmodium oxyphyllum DC. | p0058 | 남산 | 남산 | 1948 | 남산의 식물 | 199375.0 | 449747.8 | 동경측지계
53147 | s0419 | 고양이 | Felis catus | p0292 | A1 | 초안산 | 2004 | 서울시 비오톱유형별 생물다양성 증진방안 | 207230.5258 | 608607.7106 | 세계측지계
17939 | s4609 | 흰눈썹황금새 | Ficedula zanthopygia | p0094 | 보라매공원 | 보라매공원 | 2006 | 소규모 생물서식공간 생태계 모니터링 | 192713.3 | 443357.8 | 동경측지계
5973 | s3409 | 이스라지 | Prunus japonica Thunb. var. nakaii (Lev.) Rehder | p0042 | 구룡산 물박달나무군집 | 구룡산 | 2001 | 서울시 우수 생태계지역 정밀조사 연구 | 205028.7 | 440690.2 | 동경측지계
7522 | s0517 | 구슬무당거저리 | Ceropria induta (Wiedemann) | p0050 | 길동생태공원 | 길동생태공원 | 2004 | 2004년 운영결과보고서 | 213611.0 | 448693.0 | 동경측지계

(index) | 종코드 | 국명 | 학명 | 서식지코드 | 서식지명 | 세부통계용명칭 | 출현년도 | 원전 | X좌표 | Y좌표 | 서식지비고정보
55598 | s3617 | 조릿대 | Sasa borealis (Hack.) Makino | p0301 | B1 | <NA> | 2004 | 서울시 비오톱유형별 생물다양성 증진방안 | <NA> | <NA> | 세계측지계
11913 | s1739 | 명자꽃 | Chaenomeles lagenaria (Loisel) Koidz. | p0058 | 남산 | 남산 | 1986 | 남산공원의 자연환경실태 및 보존대책 pp.1-78 | 199375.0 | 449747.8 | 동경측지계
59428 | s2246 | 북쪽비단노린재 | Eurydema gebleri Kolenati | p0321 | C8 | <NA> | 2004 | 서울시 비오톱유형별 생물다양성 증진방안 | <NA> | <NA> | <NA>
31768 | s0735 | 깝작도요 | Actitis hypoleucos | p0175 | 여의도샛강 | 여의도샛강 | 2001 | 서울시 우수 생태계지역 정밀조사 연구 | 192687.5 | 446182.8 | 동경측지계
58286 | s1754 | 모메뚜기 | Tetrix japonica (Bolivar) | p0313 | C12 | <NA> | 2004 | 서울시 비오톱유형별 생물다양성 증진방안 | <NA> | <NA> | <NA>
5724 | s3561 | 점박이둥글노린재 | Eysarcoris guttiger (Thunberg) | p0038 | 광나루 | 한강 | 2006 | 2007년 한강생태계 조사 | 210903.0 | 450647.3 | 동경측지계
62671 | s4278 | 톱다리개미허리노린재 | Riptortus clavatus (Thunberg) | p0343 | G3 | <NA> | 2004 | 서울시 비오톱유형별 생물다양성 증진방안 | <NA> | <NA> | <NA>
51448 | s4805 | 산거울 | Carex humilis | p0279 | 헌인릉 오리나무군집 | 헌인릉 | 2009 | 헌인릉 생태경관보전지역 관리계획 수립 연구3차년도 | 207170.8 | 440246.8 | 동경측지계
14630 | s3846 | 진득찰 | Siegesbeckia glabrescens Makino | p0065 | 대모산 | 대모산 | 1997 | 산림생태계조사 연구보고서 | 206564.3 | 441372.6 | 동경측지계
10853 | s0043 | 가래나무 | Juglans mandshurica Maxim. | p0058 | 남산 | 남산 | 1987 | 남산의 식물상, 자연보호 59:36-48 | 199375.0 | 449747.8 | 동경측지계

Duplicate rows

Most frequently occurring

(index) | 종코드 | 국명 | 학명 | 서식지코드 | 서식지명 | 세부통계용명칭 | 출현년도 | 원전 | X좌표 | Y좌표 | 서식지비고정보 | # duplicates
4 | s4752 | 깔다구과류 | Chironomidae sp. | p0222 | 중랑천 | 중랑천 | 2007 | 2007년 한강생태계 조사 | 206389.4 | 453647.8 | 동경측지계 | 3
0 | s1754 | 모메뚜기 | Tetrix japonica (Bolivar) | p0222 | 중랑천 | 중랑천 | 2006 | 2007년 한강생태계 조사 | 206389.4 | 453647.8 | 동경측지계 | 2
1 | s1754 | 모메뚜기 | Tetrix japonica (Bolivar) | p0246 | 청계천 | 청계천 | 2007 | 2007년 한강생태계 조사 | 204171.6 | 451564.6 | 동경측지계 | 2
2 | s2011 | 방가지똥 | Sonchus oleraceus L. | p0246 | 청계천 | 청계천 | 2007 | 2007년 한강생태계 조사 | 204171.6 | 451564.6 | 동경측지계 | 2
3 | s4752 | 깔다구과류 | Chironomidae sp. | p0158 | 안양천 | 안양천 | 2006 | 2007년 한강생태계 조사 | 189443.6 | 447227.4 | 동경측지계 | 2
5 | s4752 | 깔다구과류 | Chironomidae sp. | p0298 | 백운천 | 항동수목원 | 2005 | 서울시 복개하천 복원 타당성 조사연구(2005) | 233493.0421 | 578962.5137 | 세계측지계 | 2
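
A table like this can be derived by counting identical rows and keeping those that occur more than once. The sketch below uses Python's collections.Counter on rows abbreviated to three columns (종코드, 서식지코드, 출현년도), a simplification for brevity. The repeat counts are taken from the table above.

```python
from collections import Counter

# Each tuple is (종코드, 서식지코드, 출현년도); rows are repeated according
# to the "# duplicates" column of the table above.
rows = (
    [("s4752", "p0222", 2007)] * 3
    + [("s1754", "p0222", 2006)] * 2
    + [("s1754", "p0246", 2007)] * 2
    + [("s2011", "p0246", 2007)] * 2
    + [("s4752", "p0158", 2006)] * 2
    + [("s4752", "p0298", 2005)] * 2
)

# Keep only the rows that occur more than once.
duplicated = {row: n for row, n in Counter(rows).items() if n > 1}

n_groups = len(duplicated)              # distinct duplicated rows
n_records = sum(duplicated.values())    # records belonging to those groups

print(n_groups, n_records)
```

The six distinct duplicated rows match the "6 duplicate rows" reported in the overview, which suggests the report counts distinct duplicated rows, not every repeated record.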