Overview

Dataset statistics

Number of variables: 9
Number of observations: 502
Missing cells: 4008
Missing cells (%): 88.7%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 35.9 KiB
Average record size in memory: 73.3 B
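
The missing-cell figures can be cross-checked against the per-column counts listed under Alerts, where 8 of the 9 columns each report 501 missing values. A minimal sketch of that arithmetic, using only numbers stated in this report:

```python
# Figures taken from the report: 9 variables, 502 observations,
# and 8 columns that each have 501 missing values.
n_variables = 9
n_observations = 502
columns_with_missing = 8
missing_per_column = 501

total_cells = n_variables * n_observations               # 4518 cells in total
missing_cells = columns_with_missing * missing_per_column
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)           # 4008
print(round(missing_pct, 1))   # 88.7
```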

Variable types

Categorical: 1
Text: 7
DateTime: 1

Dataset

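Description: Papers from agriculture, food, and forestry production R&D projects completed in 2018 (project number, program name, principal investigator, paper title, publication year, authors, journal name)
Author: 농림식품기술기획평가원 (Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry, IPET)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20191014000000001329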

Alerts

분류 has constant value "" (Constant)
과제번호 has constant value "" (Constant)
과제명 has constant value "" (Constant)
연구책임자 has constant value "" (Constant)
논문명 has constant value "" (Constant)
학술지 게재일자 has constant value "" (Constant)
저자 has constant value "" (Constant)
학술지명 has constant value "" (Constant)
Dataset has 1 (0.2%) duplicate rows (Duplicates)
번호 is highly imbalanced (97.9%) (Imbalance)
분류 has 501 (99.8%) missing values (Missing)
과제번호 has 501 (99.8%) missing values (Missing)
과제명 has 501 (99.8%) missing values (Missing)
연구책임자 has 501 (99.8%) missing values (Missing)
논문명 has 501 (99.8%) missing values (Missing)
학술지 게재일자 has 501 (99.8%) missing values (Missing)
저자 has 501 (99.8%) missing values (Missing)
학술지명 has 501 (99.8%) missing values (Missing)
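
The 97.9% imbalance figure for 번호 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class counts divided by the maximum possible entropy for that number of classes. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the reported number from the two class counts (501 and 1); `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy in bits and k the class count."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0  # a single class has no distribution to be imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts for 번호 as shown in the report: "<NA>" 501 times, "1" once.
score = imbalance_score([501, 1])
print(f"{score:.1%}")  # 97.9%
```

A score of 0% would mean perfectly balanced classes; values near 100% mean almost all rows fall into a single class.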

Reproduction

Analysis started: 2023-12-11 03:31:07.360400
Analysis finished: 2023-12-11 03:31:08.082304
Duration: 0.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value | Count
<NA> | 501
1 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.9940239
Min length: 1
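
These length statistics follow if the 501 `<NA>` entries in 번호 are counted as four-character strings alongside the single value `1`. That reading is an assumption, but it matches the report, which lists 0 missing values for this column. A short check:

```python
import statistics

# Assumed contents of 번호: one "1" and 501 literal "<NA>" strings.
values = ["1"] + ["<NA>"] * 501
lengths = [len(v) for v in values]

mean_length = sum(lengths) / len(lengths)   # 2005 / 502
median_length = statistics.median(lengths)

print(max(lengths), min(lengths))   # 4 1
print(median_length)                # 4.0
print(round(mean_length, 7))        # 3.9940239
```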

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 501 | 99.8%
1 | 1 | 0.2%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 501 | 99.8%
1 | 1 | 0.2%

분류
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
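
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below applies it to the one sample value of this variable; the two-letter codes map to the category names used in this report (`Lo` is Other Letter, `Zs` is Space Separator). It covers categories only, since the standard library has no script or block lookup.

```python
import unicodedata
from collections import Counter

text = "임산 공학"  # the sample value reported for 분류

# Count characters by Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in text)

print(len(text), len(set(text)))   # 5 5  (total and distinct characters)
print(dict(categories))            # {'Lo': 4, 'Zs': 1}
```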

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 임산 공학

Value | Count | Frequency (%)
임산 | 1 | 50.0%
공학 | 1 | 50.0%

Most occurring characters

ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4 | 80.0%
Space Separator | 1 | 20.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4 | 80.0%
Common | 1 | 20.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4 | 80.0%
ASCII | 1 | 20.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
ASCII
ValueCountFrequency (%)
1
100.0%

과제번호
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 118040-3

Value | Count | Frequency (%)
118040-3 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 2 | 25.0%
0 | 2 | 25.0%
8 | 1 | 12.5%
4 | 1 | 12.5%
- | 1 | 12.5%
3 | 1 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 7 | 87.5%
Dash Punctuation | 1 | 12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
28.6%
0 2
28.6%
8 1
14.3%
4 1
14.3%
3 1
14.3%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 1
12.5%
4 1
12.5%
- 1
12.5%
3 1
12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 1
12.5%
4 1
12.5%
- 1
12.5%
3 1
12.5%

과제명
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 49
Median length: 49
Mean length: 49
Min length: 49

Characters and Unicode

Total characters: 49
Distinct characters: 38
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 농업부산물로 제조된 화학펄프 및 나노셀룰로오스를 활용한 친환경 고강도 과일봉지 원지 개발

Value | Count | Frequency (%)
농업부산물로 | 1 | 9.1%
제조된 | 1 | 9.1%
화학펄프 | 1 | 9.1%
및 | 1 | 9.1%
나노셀룰로오스를 | 1 | 9.1%
활용한 | 1 | 9.1%
친환경 | 1 | 9.1%
고강도 | 1 | 9.1%
과일봉지 | 1 | 9.1%
원지 | 1 | 9.1%

Most occurring characters

ValueCountFrequency (%)
10
 
20.4%
2
 
4.1%
2
 
4.1%
1
 
2.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
Other values (28) 28
57.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39 | 79.6%
Space Separator | 10 | 20.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (27) 27
69.2%
Space Separator
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39 | 79.6%
Common | 10 | 20.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (27) 27
69.2%
Common
ValueCountFrequency (%)
10
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39 | 79.6%
ASCII | 10 | 20.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10
100.0%
Hangul
ValueCountFrequency (%)
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (27) 27
69.2%

연구책임자
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 이지영

Value | Count | Frequency (%)
이지영 | 1 | 100.0%

Most occurring characters

ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3 | 100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3 | 100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3 | 100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

논문명
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 37
Median length: 37
Mean length: 37
Min length: 37

Characters and Unicode

Total characters: 37
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 지력증강제 투입에 따른 농업부산물 유기충전제 적용 판지의 물성 평가

Value | Count | Frequency (%)
지력증강제 | 1 | 11.1%
투입에 | 1 | 11.1%
따른 | 1 | 11.1%
농업부산물 | 1 | 11.1%
유기충전제 | 1 | 11.1%
적용 | 1 | 11.1%
판지의 | 1 | 11.1%
물성 | 1 | 11.1%
평가 | 1 | 11.1%

Most occurring characters

ValueCountFrequency (%)
8
21.6%
2
 
5.4%
2
 
5.4%
2
 
5.4%
1
 
2.7%
1
 
2.7%
1
 
2.7%
1
 
2.7%
1
 
2.7%
1
 
2.7%
Other values (17) 17
45.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29 | 78.4%
Space Separator | 8 | 21.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (16) 16
55.2%
Space Separator
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29 | 78.4%
Common | 8 | 21.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (16) 16
55.2%
Common
ValueCountFrequency (%)
8
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29 | 78.4%
ASCII | 8 | 21.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8
100.0%
Hangul
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (16) 16
55.2%

학술지 게재일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB
Minimum: 2031-08-18 00:00:00
Maximum: 2031-08-18 00:00:00
Figure: Histogram with fixed size bins (bins=1)
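
The Sample section shows the raw value of this column as `18/08/31`, while the profile reports 2031-08-18. One plausible explanation, offered here as an assumption rather than a confirmed cause, is that the string was parsed day-first instead of year-first. Because the dataset description refers to projects completed in 2018, the year-first reading looks like the intended one. The sketch below shows both interpretations with the standard library:

```python
from datetime import datetime

raw = "18/08/31"  # raw value shown in the Sample section

# Year-first reading (yy/mm/dd): 31 August 2018.
year_first = datetime.strptime(raw, "%y/%m/%d")

# Day-first reading (dd/mm/yy): 18 August 2031, matching the profiled value.
day_first = datetime.strptime(raw, "%d/%m/%y")

print(year_first.date())  # 2018-08-31
print(day_first.date())   # 2031-08-18
```

Passing an explicit format when loading the column avoids leaving this choice to a parser's inference.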

저자
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 8
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 주저자 : 이지영

Value | Count | Frequency (%)
주저자 | 1 | 33.3%
: | 1 | 33.3%
이지영 | 1 | 33.3%

Most occurring characters

ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
: 1
11.1%
1
11.1%
1
11.1%
1
11.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6 | 66.7%
Space Separator | 2 | 22.2%
Other Punctuation | 1 | 11.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6 | 66.7%
Common | 3 | 33.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Common
ValueCountFrequency (%)
2
66.7%
: 1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6 | 66.7%
ASCII | 3 | 33.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2
66.7%
: 1
33.3%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

학술지명
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 501
Missing (%): 99.8%
Memory size: 4.1 KiB

Length

Max length: 79
Median length: 79
Mean length: 79
Min length: 79

Characters and Unicode

Total characters: 79
Distinct characters: 30
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 펄프 종이기술 = Journal of Korea Technical Association of the Pulp and Paper Industry

Value | Count | Frequency (%)
of | 2 | 14.3%
펄프 | 1 | 7.1%
종이기술 | 1 | 7.1%
= | 1 | 7.1%
journal | 1 | 7.1%
korea | 1 | 7.1%
technical | 1 | 7.1%
association | 1 | 7.1%
the | 1 | 7.1%
pulp | 1 | 7.1%
Other values (3) | 3 | 21.4%
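
The word frequencies above can be reproduced by lowercasing the sample value and splitting it on whitespace, which yields 14 tokens with `of` appearing twice. Whether the profiling tool tokenizes exactly this way is an assumption; the sketch only shows that this simple rule matches the reported percentages.

```python
from collections import Counter

text = "펄프 종이기술 = Journal of Korea Technical Association of the Pulp and Paper Industry"

# Assumed tokenization: lowercase, then split on whitespace.
tokens = text.lower().split()
counts = Counter(tokens)
total = len(tokens)

of_pct = round(100 * counts["of"] / total, 1)
single_pct = round(100 * 1 / total, 1)

print(total)        # 14
print(of_pct)       # 14.3
print(single_pct)   # 7.1
```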

Most occurring characters

Value | Count | Frequency (%)
(space) | 13 | 16.5%
o | 6 | 7.6%
a | 6 | 7.6%
n | 5 | 6.3%
e | 4 | 5.1%
r | 4 | 5.1%
i | 3 | 3.8%
l | 3 | 3.8%
t | 3 | 3.8%
s | 3 | 3.8%
Other values (20) | 29 | 36.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 52 | 65.8%
Space Separator | 13 | 16.5%
Uppercase Letter | 7 | 8.9%
Other Letter | 6 | 7.6%
Math Symbol | 1 | 1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 6
11.5%
a 6
11.5%
n 5
9.6%
e 4
 
7.7%
r 4
 
7.7%
i 3
 
5.8%
l 3
 
5.8%
t 3
 
5.8%
s 3
 
5.8%
c 3
 
5.8%
Other values (6) 12
23.1%
Uppercase Letter
ValueCountFrequency (%)
P 2
28.6%
A 1
14.3%
I 1
14.3%
T 1
14.3%
K 1
14.3%
J 1
14.3%
Other Letter
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Space Separator
ValueCountFrequency (%)
13
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 59 | 74.7%
Common | 14 | 17.7%
Hangul | 6 | 7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 6
 
10.2%
a 6
 
10.2%
n 5
 
8.5%
e 4
 
6.8%
r 4
 
6.8%
i 3
 
5.1%
l 3
 
5.1%
t 3
 
5.1%
s 3
 
5.1%
c 3
 
5.1%
Other values (12) 19
32.2%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Common
ValueCountFrequency (%)
13
92.9%
= 1
 
7.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 73 | 92.4%
Hangul | 6 | 7.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13
17.8%
o 6
 
8.2%
a 6
 
8.2%
n 5
 
6.8%
e 4
 
5.5%
r 4
 
5.5%
i 3
 
4.1%
l 3
 
4.1%
t 3
 
4.1%
s 3
 
4.1%
Other values (14) 23
31.5%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

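Index | 번호 | 분류 | 과제번호 | 과제명 | 연구책임자 | 논문명 | 학술지 게재일자 | 저자 | 학술지명
0 | 1 | 임산 공학 | 118040-3 | 농업부산물로 제조된 화학펄프 및 나노셀룰로오스를 활용한 친환경 고강도 과일봉지 원지 개발 | 이지영 | 지력증강제 투입에 따른 농업부산물 유기충전제 적용 판지의 물성 평가 | 18/08/31 | 주저자 : 이지영 | 펄프 종이기술 = Journal of Korea Technical Association of the Pulp and Paper Industry
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Index | 번호 | 분류 | 과제번호 | 과제명 | 연구책임자 | 논문명 | 학술지 게재일자 | 저자 | 학술지명
492 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
493 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
494 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
495 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
496 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
497 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
498 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
499 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
500 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
501 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>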

Duplicate rows

Most frequently occurring

Index | 번호 | 분류 | 과제번호 | 과제명 | 연구책임자 | 논문명 | 학술지 게재일자 | 저자 | 학술지명 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 501
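
The overview reports 1 duplicate row (0.2%), while this table shows a single all-`<NA>` row occurring 501 times. The two agree if the overview counts distinct row patterns that occur more than once, rather than the number of repeated rows. That reading is an assumption about how the tool counts duplicates; the sketch below reproduces both figures under it, using a hypothetical stand-in for the one populated row.

```python
from collections import Counter

NA = "<NA>"
N_COLUMNS = 9

# Hypothetical stand-in for the single populated row; only its distinctness matters.
populated_row = tuple(f"value_{i}" for i in range(N_COLUMNS))
rows = [populated_row] + [(NA,) * N_COLUMNS] * 501

# Keep only row patterns that occur more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

n_patterns = len(duplicated)
occurrences = duplicated[(NA,) * N_COLUMNS]
pct = round(100 * n_patterns / len(rows), 1)

print(n_patterns)    # 1
print(occurrences)   # 501
print(pct)           # 0.2
```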