Overview

Dataset statistics

Number of variables: 6
Number of observations: 52
Missing cells: 51
Missing cells (%): 16.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.7 KiB
Average record size in memory: 52.5 B
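
The missing-cell percentage follows directly from the counts in this table. A minimal sketch of the arithmetic; the variable names are chosen here for illustration and do not come from the report:

```python
# Counts taken from the dataset statistics above.
n_variables = 6
n_observations = 52
missing_cells = 51

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```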

Variable types

Numeric: 1
Categorical: 2
DateTime: 1
Text: 2

Dataset

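Description: Research project management information from the National Institute of Ecology (국립생태원). This dataset covers research and development outputs presented at academic conferences, in areas such as flora and fauna, ecology, and nature.
Author: National Institute of Ecology (국립생태원)
URL: https://www.data.go.kr/data/15088006/fileData.do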

Alerts

Constant: 분야 has constant value ""
Constant: 국제표준자료번호 has constant value ""
Imbalance: 학술지구분 is highly imbalanced (60.9%)
Missing: 국제표준자료번호 has 51 (98.1%) missing values
Unique: 일련번호 has unique values
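
The 60.9% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class counts (48 and 4 for 학술지구분). The sketch below assumes that definition; the function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no balance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 학술지구분: value 4 occurs 48 times, value 2 occurs 4 times.
score = imbalance_score([48, 4])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced variable scores 0 and a variable dominated by one class approaches 1.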

Reproduction

Analysis started: 2023-12-12 14:10:04.604660
Analysis finished: 2023-12-12 14:10:05.086892
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일련번호 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 52
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.5
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 600.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.55
Q1: 13.75
Median: 26.5
Q3: 39.25
95-th percentile: 49.45
Maximum: 52
Range: 51
Interquartile range (IQR): 25.5

Descriptive statistics

Standard deviation: 15.154757
Coefficient of variation (CV): 0.57187763
Kurtosis: -1.2
Mean: 26.5
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1378
Variance: 229.66667
Monotonicity: Strictly increasing
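
Because 일련번호 takes each integer from 1 to 52 exactly once, the statistics above can be re-derived without the original file. A minimal sketch using only the Python standard library; it assumes the sample (n - 1) variance and linearly interpolated quantiles, which reproduce the reported values:

```python
import statistics

# 1..52, as implied by Minimum, Maximum and Distinct for this variable.
values = list(range(1, 53))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

# The inclusive method interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(mean, round(variance, 5), round(std_dev, 6), round(cv, 8))
print(q1, median, q3, iqr, mad, sum(values))
```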
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 1.9%
28 | 1 | 1.9%
30 | 1 | 1.9%
31 | 1 | 1.9%
32 | 1 | 1.9%
33 | 1 | 1.9%
34 | 1 | 1.9%
35 | 1 | 1.9%
36 | 1 | 1.9%
37 | 1 | 1.9%
Other values (42) | 42 | 80.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.9%
2 | 1 | 1.9%
3 | 1 | 1.9%
4 | 1 | 1.9%
5 | 1 | 1.9%
6 | 1 | 1.9%
7 | 1 | 1.9%
8 | 1 | 1.9%
9 | 1 | 1.9%
10 | 1 | 1.9%

Maximum 10 values
Value | Count | Frequency (%)
52 | 1 | 1.9%
51 | 1 | 1.9%
50 | 1 | 1.9%
49 | 1 | 1.9%
48 | 1 | 1.9%
47 | 1 | 1.9%
46 | 1 | 1.9%
45 | 1 | 1.9%
44 | 1 | 1.9%
43 | 1 | 1.9%

분야 (field)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B
Category counts: 학회 논문 발표 (52)

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 학회 논문 발표
2nd row: 학회 논문 발표
3rd row: 학회 논문 발표
4th row: 학회 논문 발표
5th row: 학회 논문 발표

Common Values

Value | Count | Frequency (%)
학회 논문 발표 | 52 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
학회 | 52 | 33.3%
논문 | 52 | 33.3%
발표 | 52 | 33.3%

학술지구분 (journal classification)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B
Category counts: 4 (48), 2 (4)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 48 | 92.3%
2 | 4 | 7.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
4 | 48 | 92.3%
2 | 4 | 7.7%

학술지 출판일 (journal publication date)
DateTime

Distinct: 32
Distinct (%): 61.5%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B
Minimum: 2014-10-16 00:00:00
Maximum: 2018-02-22 00:00:00
Histogram with fixed size bins (bins=32)

학술지명 (journal name)
Text

Distinct: 26
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

Length

Max length: 47
Median length: 40
Mean length: 14.25
Min length: 5

Characters and Unicode

Total characters: 741
Distinct characters: 88
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 32.7%

Sample

1st row: International Ethological conference 2015
2nd row: 한국응용생태공학회 학술대회
3rd row: 한국하천호수학회
4th row: 한국기후변화학회
5th row: 국내 학술 대회 논문집

Value | Count | Frequency (%)
국내 | 15 | 10.8%
대회 | 15 | 10.8%
논문집 | 15 | 10.8%
학술 | 15 | 10.8%
한국응용곤충학회 | 10 | 7.2%
2017 | 7 | 5.0%
한국생태환경과학협의회 | 5 | 3.6%
학술대회논문집 | 5 | 3.6%
한국하천호수학회 | 4 | 2.9%
(value missing in source) | 4 | 2.9%
Other values (30) | 44 | 31.7%

Most occurring characters

ValueCountFrequency (%)
94
 
12.7%
70
 
9.4%
66
 
8.9%
59
 
8.0%
39
 
5.3%
26
 
3.5%
24
 
3.2%
20
 
2.7%
20
 
2.7%
20
 
2.7%
Other values (78) 303
40.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 566 | 76.4%
Space Separator | 94 | 12.7%
Decimal Number | 44 | 5.9%
Lowercase Letter | 32 | 4.3%
Math Symbol | 2 | 0.3%
Uppercase Letter | 2 | 0.3%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
70
 
12.4%
66
 
11.7%
59
 
10.4%
39
 
6.9%
26
 
4.6%
24
 
4.2%
20
 
3.5%
20
 
3.5%
20
 
3.5%
19
 
3.4%
Other values (54) 203
35.9%

Lowercase Letter
Value | Count | Frequency (%)
n | 5 | 15.6%
o | 4 | 12.5%
e | 4 | 12.5%
c | 3 | 9.4%
a | 3 | 9.4%
l | 3 | 9.4%
t | 3 | 9.4%
i | 2 | 6.2%
r | 2 | 6.2%
g | 1 | 3.1%
Other values (2) | 2 | 6.2%

Decimal Number
Value | Count | Frequency (%)
1 | 11 | 25.0%
0 | 11 | 25.0%
2 | 11 | 25.0%
7 | 7 | 15.9%
5 | 2 | 4.5%
6 | 1 | 2.3%
4 | 1 | 2.3%

Uppercase Letter
Value | Count | Frequency (%)
E | 1 | 50.0%
I | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 94 | 100.0%

Math Symbol
Value | Count | Frequency (%)
| | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 566 | 76.4%
Common | 141 | 19.0%
Latin | 34 | 4.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
70
 
12.4%
66
 
11.7%
59
 
10.4%
39
 
6.9%
26
 
4.6%
24
 
4.2%
20
 
3.5%
20
 
3.5%
20
 
3.5%
19
 
3.4%
Other values (54) 203
35.9%
Latin
ValueCountFrequency (%)
n 5
14.7%
o 4
11.8%
e 4
11.8%
c 3
8.8%
a 3
8.8%
l 3
8.8%
t 3
8.8%
i 2
 
5.9%
r 2
 
5.9%
g 1
 
2.9%
Other values (4) 4
11.8%
Common
ValueCountFrequency (%)
94
66.7%
1 11
 
7.8%
0 11
 
7.8%
2 11
 
7.8%
7 7
 
5.0%
| 2
 
1.4%
5 2
 
1.4%
- 1
 
0.7%
6 1
 
0.7%
4 1
 
0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 566 | 76.4%
ASCII | 175 | 23.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
94
53.7%
1 11
 
6.3%
0 11
 
6.3%
2 11
 
6.3%
7 7
 
4.0%
n 5
 
2.9%
o 4
 
2.3%
e 4
 
2.3%
c 3
 
1.7%
a 3
 
1.7%
Other values (14) 22
 
12.6%
Hangul
ValueCountFrequency (%)
70
 
12.4%
66
 
11.7%
59
 
10.4%
39
 
6.9%
26
 
4.6%
24
 
4.2%
20
 
3.5%
20
 
3.5%
20
 
3.5%
19
 
3.4%
Other values (54) 203
35.9%

국제표준자료번호 (international standard number)
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 51
Missing (%): 98.1%
Memory size: 548.0 B

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 14
Distinct characters: 11
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
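
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in `ISSN 2005-8756`, the only non-missing value of this variable. It is a minimal reconstruction under that assumption, not the report's own code.

```python
import unicodedata
from collections import Counter

# The single non-missing value of 국제표준자료번호.
text = "ISSN 2005-8756"

# Two-letter general category codes: Nd = Decimal Number, Lu = Uppercase Letter,
# Zs = Space Separator, Pd = Dash Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text), len(set(text)))      # total and distinct characters
print(category_counts.most_common())  # categories ordered by frequency
```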

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: ISSN 2005-8756

Value | Count | Frequency (%)
issn | 1 | 50.0%
2005-8756 | 1 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
S | 2 | 14.3%
0 | 2 | 14.3%
5 | 2 | 14.3%
I | 1 | 7.1%
N | 1 | 7.1%
(space) | 1 | 7.1%
2 | 1 | 7.1%
- | 1 | 7.1%
8 | 1 | 7.1%
7 | 1 | 7.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8 | 57.1%
Uppercase Letter | 4 | 28.6%
Space Separator | 1 | 7.1%
Dash Punctuation | 1 | 7.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
25.0%
5 2
25.0%
2 1
12.5%
8 1
12.5%
7 1
12.5%
6 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 2
50.0%
I 1
25.0%
N 1
25.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10 | 71.4%
Latin | 4 | 28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
20.0%
5 2
20.0%
1
10.0%
2 1
10.0%
- 1
10.0%
8 1
10.0%
7 1
10.0%
6 1
10.0%
Latin
ValueCountFrequency (%)
S 2
50.0%
I 1
25.0%
N 1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2
14.3%
0 2
14.3%
5 2
14.3%
I 1
7.1%
N 1
7.1%
1
7.1%
2 1
7.1%
- 1
7.1%
8 1
7.1%
7 1
7.1%

Interactions


Correlations

Variable | 일련번호 | 학술지구분 | 학술지 출판일 | 학술지명
일련번호 | 1.000 | 0.000 | 0.807 | 0.582
학술지구분 | 0.000 | 1.000 | 0.934 | 0.390
학술지 출판일 | 0.807 | 0.934 | 1.000 | 0.976
학술지명 | 0.582 | 0.390 | 0.976 | 1.000

Variable | 일련번호 | 학술지구분
일련번호 | 1.000 | 0.000
학술지구분 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 일련번호 | 분야 | 학술지구분 | 학술지 출판일 | 학술지명 | 국제표준자료번호
0 | 1 | 학회 논문 발표 | 2 | 2015-08-10 | International Ethological conference 2015 | <NA>
1 | 2 | 학회 논문 발표 | 4 | 2017-04-28 | 한국응용생태공학회 학술대회 | <NA>
2 | 3 | 학회 논문 발표 | 4 | 2017-10-18 | 한국하천호수학회 | <NA>
3 | 4 | 학회 논문 발표 | 4 | 2017-06-15 | 한국기후변화학회 | <NA>
4 | 5 | 학회 논문 발표 | 4 | 2017-04-28 | 국내 학술 대회 논문집 | <NA>
5 | 6 | 학회 논문 발표 | 4 | 2017-04-21 | 한국환경영향평가학회 | <NA>
6 | 7 | 학회 논문 발표 | 4 | 2015-10-15 | 한국응용곤충학회 학술대회논문집 2015 한국응용곤충학회 임시총회 및 추계학술발표회 | <NA>
7 | 8 | 학회 논문 발표 | 2 | 2017-06-29 | 국내 학술 대회 논문집 | <NA>
8 | 9 | 학회 논문 발표 | 4 | 2014-10-16 | 한국응용곤충학회 학술대회논문집 2014 한국응용곤충학회 추계학술발표회 | <NA>
9 | 10 | 학회 논문 발표 | 4 | 2016-04-25 | 2016 한국응용곤충학회 정기총회 및 국제 심포지엄 | <NA>

Last rows
(index) | 일련번호 | 분야 | 학술지구분 | 학술지 출판일 | 학술지명 | 국제표준자료번호
42 | 43 | 학회 논문 발표 | 4 | 2017-10-18 | 한국하천호수학회 | <NA>
43 | 44 | 학회 논문 발표 | 4 | 2017-10-18 | 한국하천호수학회 | <NA>
44 | 45 | 학회 논문 발표 | 4 | 2017-10-18 | 한국하천호수학회 | <NA>
45 | 46 | 학회 논문 발표 | 4 | 2017-10-27 | 한국응용곤충학회 | <NA>
46 | 47 | 학회 논문 발표 | 4 | 2017-04-22 | 한국조류학회 | <NA>
47 | 48 | 학회 논문 발표 | 2 | 2017-09-13 | 국내 학술 대회 논문집 | <NA>
48 | 49 | 학회 논문 발표 | 4 | 2017-10-19 | 한국작물학회 | <NA>
49 | 50 | 학회 논문 발표 | 4 | 2017-08-09 | 한국생태학회 | <NA>
50 | 51 | 학회 논문 발표 | 4 | 2017-10-19 | 국내 학술 대회 논문집 | <NA>
51 | 52 | 학회 논문 발표 | 4 | 2016-08-10 | 한국생물과학협회 | <NA>