Overview

Dataset statistics

Number of variables: 6
Number of observations: 3586
Missing cells: 7360
Missing cells (%): 34.2%
Duplicate rows: 97
Duplicate rows (%): 2.7%
Total size in memory: 168.2 KiB
Average record size in memory: 48.0 B
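
The percentages and the average record size follow from the raw counts. The sketch below reproduces that arithmetic using only figures stated in this report; the variable names are illustrative and not part of ydata-profiling.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 6
n_observations = 3586
missing_cells = 7360
duplicate_rows = 97
total_size_kib = 168.2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")          # Missing cells: 34.2%
print(f"Duplicate rows: {duplicate_pct:.1f}%")       # Duplicate rows: 2.7%
print(f"Average record: {avg_record_bytes:.1f} B")   # Average record: 48.0 B
```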

Variable types

Categorical: 1
Text: 5

Dataset

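Description: Input records for Korean medicine research reports from OASIS, the traditional medicine information portal. Each record has six fields: 부처명 (ministry), 사업명 (program), 대과제명 (parent project), 과제명 (project), 세부과제명 (sub-project), and 과제영문명 (English project title).
Author: 한국한의학연구원 (Korea Institute of Oriental Medicine)
URL: https://www.data.go.kr/data/15086073/fileData.do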

Alerts

Dataset has 97 (2.7%) duplicate rows [Duplicates]
사업명 has 617 (17.2%) missing values [Missing]
대과제명 has 2265 (63.2%) missing values [Missing]
세부과제명 has 2986 (83.3%) missing values [Missing]
과제영문명 has 1492 (41.6%) missing values [Missing]
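
The per-column counts in these alerts account for every missing cell reported for the dataset. The sketch below recomputes each percentage against the 3586 observations and sums the counts; the dictionary is simply the alert figures retyped, and the names are illustrative.

```python
# Missing-value counts taken from the alerts in this report.
n_observations = 3586
missing = {
    "사업명": 617,
    "대과제명": 2265,
    "세부과제명": 2986,
    "과제영문명": 1492,
}

# Share of observations with no value, per column, rounded as in the report.
missing_pct = {col: round(100 * n / n_observations, 1) for col, n in missing.items()}

# The two remaining columns (부처명, 과제명) report zero missing values,
# so this sum should equal the dataset-level "Missing cells" figure.
total_missing = sum(missing.values())

for col, pct in missing_pct.items():
    print(f"{col}: {pct}%")
print("Total missing cells:", total_missing)  # Total missing cells: 7360
```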

Reproduction

Analysis started: 2023-12-12 17:38:26.662345
Analysis finished: 2023-12-12 17:38:28.516779
Duration: 1.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

부처명
Categorical

Distinct: 31
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 28.1 KiB
보건복지부: 785
식품의약품안전청: 428
과학기술부: 379
교육과학기술부: 270
중소기업청: 234
Other values (26): 1490

Length

Max length: 9
Median length: 5
Mean length: 5.6508645
Min length: 2

Unique

Unique: 3
Unique (%): 0.1%
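
The report lists both a Distinct and a Unique count. A minimal sketch of the difference, assuming Distinct is the number of different values and Unique is the number of values that occur exactly once (this reading matches the figures here, but the function is illustrative and not the library's implementation):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)  # ignore missing values
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique

# The five sample rows shown for 부처명 in this report.
sample = ["미래창조과학부", "산업통상자원부", "미래창조과학부", "미래창조과학부", "교육과학기술부"]
print(distinct_and_unique(sample))  # (3, 2)
```

In the sample, three different ministries appear, but only two of them appear once.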

Sample

1st row: 미래창조과학부
2nd row: 산업통상자원부
3rd row: 미래창조과학부
4th row: 미래창조과학부
5th row: 교육과학기술부

Common Values

Value | Count | Frequency (%)
보건복지부 | 785 | 21.9%
식품의약품안전청 | 428 | 11.9%
과학기술부 | 379 | 10.6%
교육과학기술부 | 270 | 7.5%
중소기업청 | 234 | 6.5%
농촌진흥청 | 208 | 5.8%
교육인적자원부 | 204 | 5.7%
국무조정실 | 194 | 5.4%
기타 | 153 | 4.3%
미래창조과학부 | 131 | 3.7%
Other values (21) | 600 | 16.7%

Length

Figure: Histogram of lengths of the category

사업명
Text

MISSING 

Distinct: 438
Distinct (%): 14.8%
Missing: 617
Missing (%): 17.2%
Memory size: 28.1 KiB

Length

Max length: 36
Median length: 30
Mean length: 9.8255305
Min length: 4

Characters and Unicode

Total characters: 29172
Distinct characters: 275
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
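
As a rough illustration of how such per-category counts can be derived, the sketch below uses Python's standard `unicodedata` module to map each character to its general category. The name mapping covers only the categories listed for this variable, the input string is a made-up example, and this is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Hypothetical value mixing Latin, Hangul, digits, and punctuation.
example = category_counts(["R&D (2단계)"])
for name, n in example.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the Korean text columns.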

Unique

Unique: 219
Unique (%): 7.4%

Sample

1st row: 일반연구자지원
2nd row: 이공학개인기초연구지원
3rd row: 연구소기업전략육성
4th row: 바이오의료기술개발
5th row: 일반연구자지원
Value | Count | Frequency (%)
한국한의학연구원 | 293 | 8.2%
한방치료기술개발 | 163 | 4.6%
한방치료기술연구개발 | 117 | 3.3%
농림기술개발 | 112 | 3.1%
보건의료기술연구개발 | 101 | 2.8%
일반연구자지원 | 95 | 2.7%
산학연공동기술개발 | 93 | 2.6%
안전관리 | 81 | 2.3%
이공학개인기초연구지원 | 67 | 1.9%
(no value shown) | 66 | 1.8%
Other values (505) | 2387 | 66.8%

Most occurring characters

ValueCountFrequency (%)
1916
 
6.6%
1740
 
6.0%
1702
 
5.8%
1390
 
4.8%
1317
 
4.5%
1299
 
4.5%
1189
 
4.1%
1024
 
3.5%
942
 
3.2%
895
 
3.1%
Other values (265) 15758
54.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 27134 | 93.0%
Space Separator | 606 | 2.1%
Decimal Number | 353 | 1.2%
Uppercase Letter | 339 | 1.2%
Open Punctuation | 279 | 1.0%
Close Punctuation | 279 | 1.0%
Other Punctuation | 115 | 0.4%
Math Symbol | 56 | 0.2%
Lowercase Letter | 6 | < 0.1%
Dash Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1916
 
7.1%
1740
 
6.4%
1702
 
6.3%
1390
 
5.1%
1317
 
4.9%
1299
 
4.8%
1189
 
4.4%
1024
 
3.8%
942
 
3.5%
895
 
3.3%
Other values (221) 13720
50.6%
Uppercase Letter
ValueCountFrequency (%)
C 123
36.3%
R 85
25.1%
D 26
 
7.7%
S 18
 
5.3%
N 15
 
4.4%
E 14
 
4.1%
M 14
 
4.1%
T 13
 
3.8%
A 10
 
2.9%
F 10
 
2.9%
Other values (5) 11
 
3.2%
Decimal Number
ValueCountFrequency (%)
2 147
41.6%
1 133
37.7%
0 40
 
11.3%
9 16
 
4.5%
5 6
 
1.7%
4 4
 
1.1%
3 3
 
0.8%
6 3
 
0.8%
8 1
 
0.3%
Other Punctuation
ValueCountFrequency (%)
, 51
44.3%
· 30
26.1%
& 25
21.7%
. 5
 
4.3%
: 3
 
2.6%
/ 1
 
0.9%
Lowercase Letter
ValueCountFrequency (%)
o 2
33.3%
c 1
16.7%
d 1
16.7%
t 1
16.7%
s 1
16.7%
Math Symbol
ValueCountFrequency (%)
< 27
48.2%
> 27
48.2%
+ 2
 
3.6%
Open Punctuation
ValueCountFrequency (%)
( 270
96.8%
[ 9
 
3.2%
Close Punctuation
ValueCountFrequency (%)
) 270
96.8%
] 9
 
3.2%
Space Separator
ValueCountFrequency (%)
606
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 27134 | 93.0%
Common | 1693 | 5.8%
Latin | 345 | 1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1916
 
7.1%
1740
 
6.4%
1702
 
6.3%
1390
 
5.1%
1317
 
4.9%
1299
 
4.8%
1189
 
4.4%
1024
 
3.8%
942
 
3.5%
895
 
3.3%
Other values (221) 13720
50.6%
Common
ValueCountFrequency (%)
606
35.8%
( 270
15.9%
) 270
15.9%
2 147
 
8.7%
1 133
 
7.9%
, 51
 
3.0%
0 40
 
2.4%
· 30
 
1.8%
< 27
 
1.6%
> 27
 
1.6%
Other values (14) 92
 
5.4%
Latin
ValueCountFrequency (%)
C 123
35.7%
R 85
24.6%
D 26
 
7.5%
S 18
 
5.2%
N 15
 
4.3%
E 14
 
4.1%
M 14
 
4.1%
T 13
 
3.8%
A 10
 
2.9%
F 10
 
2.9%
Other values (10) 17
 
4.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 27134 | 93.0%
ASCII | 2008 | 6.9%
None | 30 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1916
 
7.1%
1740
 
6.4%
1702
 
6.3%
1390
 
5.1%
1317
 
4.9%
1299
 
4.8%
1189
 
4.4%
1024
 
3.8%
942
 
3.5%
895
 
3.3%
Other values (221) 13720
50.6%
ASCII
ValueCountFrequency (%)
606
30.2%
( 270
13.4%
) 270
13.4%
2 147
 
7.3%
1 133
 
6.6%
C 123
 
6.1%
R 85
 
4.2%
, 51
 
2.5%
0 40
 
2.0%
< 27
 
1.3%
Other values (33) 256
12.7%
None
ValueCountFrequency (%)
· 30
100.0%

대과제명
Text

MISSING 

Distinct: 723
Distinct (%): 54.7%
Missing: 2265
Missing (%): 63.2%
Memory size: 28.1 KiB

Length

Max length: 88
Median length: 68
Mean length: 17.063588
Min length: 2

Characters and Unicode

Total characters: 22541
Distinct characters: 532
Distinct categories: 11
Distinct scripts: 5
Distinct blocks: 6

Unique

Unique: 582
Unique (%): 44.1%

Sample

1st row: 미래산업융합기술연구
2nd row: 생리활성 해양천연물 기반 신의약 소재 개발 기초연구(신의약)
3rd row: 전통천연물 멀티오믹스 기반 맞춤형 바이오마커 기술 개발
4th row: 한방보험약사용현황에대한분석
5th row: 활혈화어지제를 이용한 항암제의 연구 및 개발
Value | Count | Frequency (%)
연구 | 111 | 5.1%
한방치료기술개발 | 65 | 3.0%
개발 | 53 | 2.4%
(no value shown) | 49 | 2.2%
한방치료기술연구개발사업 | 40 | 1.8%
한약재 | 34 | 1.6%
안전관리 | 33 | 1.5%
생약(한약 | 29 | 1.3%
평가기술 | 28 | 1.3%
과학화 | 28 | 1.3%
Other values (1037) | 1708 | 78.4%

Most occurring characters

ValueCountFrequency (%)
862
 
3.8%
857
 
3.8%
781
 
3.5%
679
 
3.0%
670
 
3.0%
658
 
2.9%
645
 
2.9%
624
 
2.8%
432
 
1.9%
430
 
1.9%
Other values (522) 15903
70.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 20285 | 90.0%
Lowercase Letter | 866 | 3.8%
Space Separator | 857 | 3.8%
Uppercase Letter | 266 | 1.2%
Decimal Number | 70 | 0.3%
Other Punctuation | 66 | 0.3%
Close Punctuation | 59 | 0.3%
Open Punctuation | 59 | 0.3%
Dash Punctuation | 10 | < 0.1%
Letter Number | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
862
 
4.2%
781
 
3.9%
679
 
3.3%
670
 
3.3%
658
 
3.2%
645
 
3.2%
624
 
3.1%
432
 
2.1%
430
 
2.1%
377
 
1.9%
Other values (450) 14127
69.6%
Lowercase Letter
ValueCountFrequency (%)
e 99
11.4%
i 83
 
9.6%
a 79
 
9.1%
o 79
 
9.1%
n 76
 
8.8%
r 59
 
6.8%
l 55
 
6.4%
t 51
 
5.9%
s 42
 
4.8%
c 33
 
3.8%
Other values (16) 210
24.2%
Uppercase Letter
ValueCountFrequency (%)
C 28
10.5%
D 27
10.2%
P 23
 
8.6%
M 21
 
7.9%
A 20
 
7.5%
T 19
 
7.1%
R 18
 
6.8%
S 18
 
6.8%
N 17
 
6.4%
I 16
 
6.0%
Other values (13) 59
22.2%
Decimal Number
ValueCountFrequency (%)
1 16
22.9%
2 14
20.0%
3 13
18.6%
9 7
10.0%
8 6
 
8.6%
0 6
 
8.6%
5 4
 
5.7%
4 2
 
2.9%
6 1
 
1.4%
7 1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 45
68.2%
· 10
 
15.2%
/ 6
 
9.1%
: 3
 
4.5%
& 1
 
1.5%
. 1
 
1.5%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
857
100.0%
Close Punctuation
ValueCountFrequency (%)
) 59
100.0%
Open Punctuation
ValueCountFrequency (%)
( 59
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 20280 | 90.0%
Latin | 1133 | 5.0%
Common | 1122 | 5.0%
Han | 5 | < 0.1%
Greek | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
862
 
4.3%
781
 
3.9%
679
 
3.3%
670
 
3.3%
658
 
3.2%
645
 
3.2%
624
 
3.1%
432
 
2.1%
430
 
2.1%
377
 
1.9%
Other values (445) 14122
69.6%
Latin
ValueCountFrequency (%)
e 99
 
8.7%
i 83
 
7.3%
a 79
 
7.0%
o 79
 
7.0%
n 76
 
6.7%
r 59
 
5.2%
l 55
 
4.9%
t 51
 
4.5%
s 42
 
3.7%
c 33
 
2.9%
Other values (40) 477
42.1%
Common
ValueCountFrequency (%)
857
76.4%
) 59
 
5.3%
( 59
 
5.3%
, 45
 
4.0%
1 16
 
1.4%
2 14
 
1.2%
3 13
 
1.2%
· 10
 
0.9%
- 10
 
0.9%
9 7
 
0.6%
Other values (11) 32
 
2.9%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Greek
ValueCountFrequency (%)
κ 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 20278 | 90.0%
ASCII | 2243 | 10.0%
None | 11 | < 0.1%
CJK | 5 | < 0.1%
Compat Jamo | 2 | < 0.1%
Number Forms | 2 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
862
 
4.3%
781
 
3.9%
679
 
3.3%
670
 
3.3%
658
 
3.2%
645
 
3.2%
624
 
3.1%
432
 
2.1%
430
 
2.1%
377
 
1.9%
Other values (444) 14120
69.6%
ASCII
ValueCountFrequency (%)
857
38.2%
e 99
 
4.4%
i 83
 
3.7%
a 79
 
3.5%
o 79
 
3.5%
n 76
 
3.4%
) 59
 
2.6%
( 59
 
2.6%
r 59
 
2.6%
l 55
 
2.5%
Other values (58) 738
32.9%
None
ValueCountFrequency (%)
· 10
90.9%
κ 1
 
9.1%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%

과제명
Text

Distinct: 3095
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Memory size: 28.1 KiB

Length

Max length: 116
Median length: 80
Mean length: 30.531511
Min length: 5

Characters and Unicode

Total characters: 109486
Distinct characters: 1019
Distinct categories: 13
Distinct scripts: 5
Distinct blocks: 8

Unique

Unique: 2751
Unique (%): 76.7%

Sample

1st row: 사상체질에 따른 Gut hormone profiling을 통한 식욕의 개체 차이에 대한 기전 연구
2nd row: 대구한의대학교 한방생명자원연구센터 지역혁신센터사업
3rd row: UV/Microwave를 활용한 Phase Ⅱ enzyme 조절 항염증 천연물 metabolite 도출 연구
4th row: 복합천연추출물을 이용한 비만 예방 및 개선용 건강기능식품 개발 및 사업화
5th row: 한약자원 향장 소재은행
Value | Count | Frequency (%)
(no value shown) | 1311 | 5.1%
연구 | 1244 | 4.8%
개발 | 1047 | 4.1%
이용한 | 435 | 1.7%
위한 | 376 | 1.5%
관한 | 361 | 1.4%
대한 | 248 | 1.0%
한약재 | 179 | 0.7%
의한 | 172 | 0.7%
통한 | 131 | 0.5%
Other values (8068) | 20324 | 78.7%

Most occurring characters

ValueCountFrequency (%)
22246
 
20.3%
3559
 
3.3%
2827
 
2.6%
2148
 
2.0%
2144
 
2.0%
1795
 
1.6%
1613
 
1.5%
1600
 
1.5%
1583
 
1.4%
1575
 
1.4%
Other values (1009) 68396
62.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 78912 | 72.1%
Space Separator | 22246 | 20.3%
Lowercase Letter | 4134 | 3.8%
Uppercase Letter | 1587 | 1.4%
Other Punctuation | 589 | 0.5%
Close Punctuation | 541 | 0.5%
Open Punctuation | 541 | 0.5%
Decimal Number | 502 | 0.5%
Dash Punctuation | 306 | 0.3%
Letter Number | 94 | 0.1%
Other values (3) | 34 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3559
 
4.5%
2827
 
3.6%
2148
 
2.7%
2144
 
2.7%
1795
 
2.3%
1613
 
2.0%
1600
 
2.0%
1583
 
2.0%
1575
 
2.0%
1399
 
1.8%
Other values (905) 58669
74.3%
Lowercase Letter
ValueCountFrequency (%)
e 438
10.6%
i 400
 
9.7%
o 384
 
9.3%
n 343
 
8.3%
a 321
 
7.8%
r 263
 
6.4%
l 253
 
6.1%
s 249
 
6.0%
t 241
 
5.8%
c 202
 
4.9%
Other values (21) 1040
25.2%
Uppercase Letter
ValueCountFrequency (%)
I 211
13.3%
D 142
 
8.9%
A 135
 
8.5%
P 115
 
7.2%
N 101
 
6.4%
C 98
 
6.2%
B 91
 
5.7%
M 89
 
5.6%
T 77
 
4.9%
S 70
 
4.4%
Other values (16) 458
28.9%
Other Punctuation
ValueCountFrequency (%)
, 352
59.8%
· 106
 
18.0%
/ 50
 
8.5%
: 34
 
5.8%
' 12
 
2.0%
. 11
 
1.9%
& 9
 
1.5%
; 7
 
1.2%
" 6
 
1.0%
# 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 106
21.1%
0 101
20.1%
2 93
18.5%
3 59
11.8%
9 45
9.0%
5 24
 
4.8%
4 23
 
4.6%
7 18
 
3.6%
6 17
 
3.4%
8 16
 
3.2%
Close Punctuation
ValueCountFrequency (%)
) 527
97.4%
6
 
1.1%
6
 
1.1%
] 1
 
0.2%
1
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 527
97.4%
6
 
1.1%
6
 
1.1%
[ 1
 
0.2%
1
 
0.2%
Letter Number
ValueCountFrequency (%)
37
39.4%
33
35.1%
19
20.2%
4
 
4.3%
1
 
1.1%
Math Symbol
ValueCountFrequency (%)
+ 12
66.7%
~ 3
 
16.7%
> 1
 
5.6%
< 1
 
5.6%
1
 
5.6%
Final Punctuation
ValueCountFrequency (%)
7
87.5%
1
 
12.5%
Initial Punctuation
ValueCountFrequency (%)
7
87.5%
1
 
12.5%
Space Separator
ValueCountFrequency (%)
22246
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 306
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 78224 | 71.4%
Common | 24759 | 22.6%
Latin | 5799 | 5.3%
Han | 688 | 0.6%
Greek | 16 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3559
 
4.5%
2827
 
3.6%
2148
 
2.7%
2144
 
2.7%
1795
 
2.3%
1613
 
2.1%
1600
 
2.0%
1583
 
2.0%
1575
 
2.0%
1399
 
1.8%
Other values (626) 57981
74.1%
Han
ValueCountFrequency (%)
25
 
3.6%
20
 
2.9%
20
 
2.9%
19
 
2.8%
17
 
2.5%
16
 
2.3%
16
 
2.3%
15
 
2.2%
12
 
1.7%
12
 
1.7%
Other values (269) 516
75.0%
Latin
ValueCountFrequency (%)
e 438
 
7.6%
i 400
 
6.9%
o 384
 
6.6%
n 343
 
5.9%
a 321
 
5.5%
r 263
 
4.5%
l 253
 
4.4%
s 249
 
4.3%
t 241
 
4.2%
I 211
 
3.6%
Other values (47) 2696
46.5%
Common
ValueCountFrequency (%)
22246
89.9%
) 527
 
2.1%
( 527
 
2.1%
, 352
 
1.4%
- 306
 
1.2%
· 106
 
0.4%
1 106
 
0.4%
0 101
 
0.4%
2 93
 
0.4%
3 59
 
0.2%
Other values (32) 336
 
1.4%
Greek
ValueCountFrequency (%)
κ 5
31.2%
β 4
25.0%
α 3
18.8%
δ 3
18.8%
γ 1
 
6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 78198 | 71.4%
ASCII | 30315 | 27.7%
CJK | 673 | 0.6%
None | 149 | 0.1%
Number Forms | 94 | 0.1%
Compat Jamo | 26 | < 0.1%
Punctuation | 16 | < 0.1%
CJK Compat Ideographs | 15 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
22246
73.4%
) 527
 
1.7%
( 527
 
1.7%
e 438
 
1.4%
i 400
 
1.3%
o 384
 
1.3%
, 352
 
1.2%
n 343
 
1.1%
a 321
 
1.1%
- 306
 
1.0%
Other values (72) 4471
 
14.7%
Hangul
ValueCountFrequency (%)
3559
 
4.6%
2827
 
3.6%
2148
 
2.7%
2144
 
2.7%
1795
 
2.3%
1613
 
2.1%
1600
 
2.0%
1583
 
2.0%
1575
 
2.0%
1399
 
1.8%
Other values (625) 57955
74.1%
None
ValueCountFrequency (%)
· 106
71.1%
6
 
4.0%
6
 
4.0%
6
 
4.0%
6
 
4.0%
κ 5
 
3.4%
β 4
 
2.7%
α 3
 
2.0%
δ 3
 
2.0%
γ 1
 
0.7%
Other values (3) 3
 
2.0%
Number Forms
ValueCountFrequency (%)
37
39.4%
33
35.1%
19
20.2%
4
 
4.3%
1
 
1.1%
Compat Jamo
ValueCountFrequency (%)
26
100.0%
CJK
ValueCountFrequency (%)
25
 
3.7%
20
 
3.0%
20
 
3.0%
19
 
2.8%
17
 
2.5%
16
 
2.4%
16
 
2.4%
15
 
2.2%
12
 
1.8%
12
 
1.8%
Other values (264) 501
74.4%
Punctuation
ValueCountFrequency (%)
7
43.8%
7
43.8%
1
 
6.2%
1
 
6.2%
CJK Compat Ideographs
ValueCountFrequency (%)
5
33.3%
4
26.7%
3
20.0%
2
 
13.3%
1
 
6.7%

세부과제명
Text

MISSING 

Distinct: 583
Distinct (%): 97.2%
Missing: 2986
Missing (%): 83.3%
Memory size: 28.1 KiB

Length

Max length: 399
Median length: 172.5
Mean length: 100.53
Min length: 7

Characters and Unicode

Total characters: 60318
Distinct characters: 718
Distinct categories: 14
Distinct scripts: 5
Distinct blocks: 8

Unique

Unique: 568
Unique (%): 94.7%

Sample

1st row: 비알콜성 지방간 질환 조절 기전구명 및 기능성 천연물 소재 탐색( 국립농업과학원 황유진)
2nd row: 유기물분변토 종류별 생리장해 발생 유형분석( 국립원예특작과학원 장인배)식물성유기물 종류별 생리장해 발생 유형분석( 국립원예특작과학원 유진)축산퇴비 종류별 생리장해 발생 유형분석( 국립원예특작과학원 장인배)
3rd row: 치매 치료제 한약물에 대한 독성 연구( 경희대학교 강철훈)한약물의 치매 치료 효능에 관한 임상 연구( 경희대학교 황의완)한약물의 신경세포보호 효과 및 작용기전 연구( 한국외국어대학교 권혁만)치매모델 동물의 행동 검사 및 전기생리적 특성 분석을 통한 한약물의 치매 치료 효능 연구( 충북대학교 조선영)치매환자의 뇌파(EEG와 ERP)분석과 표준화된 신경심리검사를 이용한 한약물의 치매치료 효능 연구( 고려대학교 김현택)
4th row: 강황 부산물의 기능성 물질 발굴 및 고부가 소재 개발( 원광대학교 없음)강황 부산물의 향장제품 소재화를 위한 기능성 탐색( 원광대학교 없음)
5th row: 인삼부산물을 이용한 기능성 가공식품 개발( 서울대학교 없음)인삼부산물의 기능성향상을 위한 전처리기술 개발( 서울대학교 없음)
Value | Count | Frequency (%)
(no value shown) | 603 | 5.3%
연구 | 404 | 3.5%
개발 | 225 | 2.0%
경희대학교 | 133 | 1.2%
관한 | 124 | 1.1%
대한 | 116 | 1.0%
위한 | 115 | 1.0%
이용한 | 114 | 1.0%
서울대학교 | 97 | 0.8%
없음 | 66 | 0.6%
Other values (5032) | 9433 | 82.5%

Most occurring characters

ValueCountFrequency (%)
10836
 
18.0%
) 1432
 
2.4%
( 1430
 
2.4%
1412
 
2.3%
1333
 
2.2%
1290
 
2.1%
1228
 
2.0%
1029
 
1.7%
994
 
1.6%
968
 
1.6%
Other values (708) 38366
63.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 42408 | 70.3%
Space Separator | 10836 | 18.0%
Lowercase Letter | 1687 | 2.8%
Close Punctuation | 1433 | 2.4%
Open Punctuation | 1431 | 2.4%
Other Punctuation | 1224 | 2.0%
Uppercase Letter | 715 | 1.2%
Decimal Number | 448 | 0.7%
Dash Punctuation | 77 | 0.1%
Other Symbol | 31 | 0.1%
Other values (4) | 28 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1412
 
3.3%
1333
 
3.1%
1290
 
3.0%
1228
 
2.9%
1029
 
2.4%
994
 
2.3%
968
 
2.3%
904
 
2.1%
668
 
1.6%
605
 
1.4%
Other values (614) 31977
75.4%
Lowercase Letter
ValueCountFrequency (%)
i 208
12.3%
o 170
 
10.1%
e 135
 
8.0%
n 131
 
7.8%
a 112
 
6.6%
c 108
 
6.4%
t 97
 
5.7%
r 96
 
5.7%
s 92
 
5.5%
l 81
 
4.8%
Other values (19) 457
27.1%
Uppercase Letter
ValueCountFrequency (%)
D 78
 
10.9%
A 71
 
9.9%
N 63
 
8.8%
C 57
 
8.0%
H 42
 
5.9%
S 41
 
5.7%
T 41
 
5.7%
P 39
 
5.5%
G 39
 
5.5%
B 37
 
5.2%
Other values (16) 207
29.0%
Other Punctuation
ValueCountFrequency (%)
/ 819
66.9%
! 150
 
12.3%
@ 150
 
12.3%
· 50
 
4.1%
, 38
 
3.1%
: 6
 
0.5%
. 6
 
0.5%
& 3
 
0.2%
" 1
 
0.1%
; 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 160
35.7%
2 114
25.4%
3 69
15.4%
0 45
 
10.0%
4 23
 
5.1%
5 9
 
2.0%
7 9
 
2.0%
8 8
 
1.8%
9 7
 
1.6%
6 4
 
0.9%
Letter Number
ValueCountFrequency (%)
6
31.6%
6
31.6%
4
21.1%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Math Symbol
ValueCountFrequency (%)
~ 4
57.1%
> 1
 
14.3%
< 1
 
14.3%
+ 1
 
14.3%
Close Punctuation
ValueCountFrequency (%)
) 1432
99.9%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1430
99.9%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
10836
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 77
100.0%
Other Symbol
ValueCountFrequency (%)
31
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 42343 | 70.2%
Common | 15458 | 25.6%
Latin | 2414 | 4.0%
Han | 96 | 0.2%
Greek | 7 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1412
 
3.3%
1333
 
3.1%
1290
 
3.0%
1228
 
2.9%
1029
 
2.4%
994
 
2.3%
968
 
2.3%
904
 
2.1%
668
 
1.6%
605
 
1.4%
Other values (557) 31912
75.4%
Latin
ValueCountFrequency (%)
i 208
 
8.6%
o 170
 
7.0%
e 135
 
5.6%
n 131
 
5.4%
a 112
 
4.6%
c 108
 
4.5%
t 97
 
4.0%
r 96
 
4.0%
s 92
 
3.8%
l 81
 
3.4%
Other values (48) 1184
49.0%
Han
ValueCountFrequency (%)
5
 
5.2%
4
 
4.2%
4
 
4.2%
4
 
4.2%
4
 
4.2%
4
 
4.2%
3
 
3.1%
3
 
3.1%
3
 
3.1%
2
 
2.1%
Other values (48) 60
62.5%
Common
ValueCountFrequency (%)
10836
70.1%
) 1432
 
9.3%
( 1430
 
9.3%
/ 819
 
5.3%
1 160
 
1.0%
! 150
 
1.0%
@ 150
 
1.0%
2 114
 
0.7%
- 77
 
0.5%
3 69
 
0.4%
Other values (22) 221
 
1.4%
Greek
ValueCountFrequency (%)
α 3
42.9%
β 3
42.9%
κ 1
 
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 42291 | 70.1%
ASCII | 17800 | 29.5%
CJK | 93 | 0.2%
None | 90 | 0.1%
Compat Jamo | 21 | < 0.1%
Number Forms | 19 | < 0.1%
CJK Compat Ideographs | 3 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10836
60.9%
) 1432
 
8.0%
( 1430
 
8.0%
/ 819
 
4.6%
i 208
 
1.2%
o 170
 
1.0%
1 160
 
0.9%
! 150
 
0.8%
@ 150
 
0.8%
e 135
 
0.8%
Other values (70) 2310
 
13.0%
Hangul
ValueCountFrequency (%)
1412
 
3.3%
1333
 
3.2%
1290
 
3.1%
1228
 
2.9%
1029
 
2.4%
994
 
2.4%
968
 
2.3%
904
 
2.1%
668
 
1.6%
605
 
1.4%
Other values (555) 31860
75.3%
None
ValueCountFrequency (%)
· 50
55.6%
31
34.4%
α 3
 
3.3%
β 3
 
3.3%
1
 
1.1%
1
 
1.1%
κ 1
 
1.1%
Compat Jamo
ValueCountFrequency (%)
21
100.0%
Number Forms
ValueCountFrequency (%)
6
31.6%
6
31.6%
4
21.1%
1
 
5.3%
1
 
5.3%
1
 
5.3%
CJK
ValueCountFrequency (%)
5
 
5.4%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
4
 
4.3%
3
 
3.2%
3
 
3.2%
3
 
3.2%
2
 
2.2%
Other values (46) 57
61.3%
CJK Compat Ideographs
ValueCountFrequency (%)
2
66.7%
1
33.3%
Punctuation
ValueCountFrequency (%)
1
100.0%

과제영문명
Text

MISSING 

Distinct: 1806
Distinct (%): 86.2%
Missing: 1492
Missing (%): 41.6%
Memory size: 28.1 KiB

Length

Max length: 255
Median length: 158
Mean length: 90.073066
Min length: 20

Characters and Unicode

Total characters: 188613
Distinct characters: 99
Distinct categories: 12
Distinct scripts: 5
Distinct blocks: 6

Unique

Unique: 1618
Unique (%): 77.3%

Sample

1st row: A study investigating the reason for differences in appetite patterns among individuals through gut hormone profiling according to Sasang constitution
2nd row: Anti-inflammatory Effect of Natural Product Metabolite via Phase Ⅱ Enzyme Regulation by using UV/Microwave
3rd row: Korea Cosmeceutical Material Bank
4th row: Investigation of Human Biomedical Signal on EEG/EMG by non-contact and focused Magnetic Field on acupoints
5th row: Neural mechanism of acupuncture in body awareness manipulation model
Value | Count | Frequency (%)
of | 2514 | 9.9%
and | 1102 | 4.3%
the | 1024 | 4.0%
on | 801 | 3.1%
for | 773 | 3.0%
development | 697 | 2.7%
study | 507 | 2.0%
medicine | 433 | 1.7%
in | 429 | 1.7%
herbal | 388 | 1.5%
Other values (3698) | 16806 | 66.0%

Most occurring characters

ValueCountFrequency (%)
23381
12.4%
e 17686
 
9.4%
i 13888
 
7.4%
n 13812
 
7.3%
o 13414
 
7.1%
a 12756
 
6.8%
t 12513
 
6.6%
r 9188
 
4.9%
s 8064
 
4.3%
c 7021
 
3.7%
Other values (89) 56890
30.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 153713 | 81.5%
Space Separator | 23381 | 12.4%
Uppercase Letter | 9662 | 5.1%
Dash Punctuation | 693 | 0.4%
Other Punctuation | 397 | 0.2%
Decimal Number | 257 | 0.1%
Open Punctuation | 217 | 0.1%
Close Punctuation | 216 | 0.1%
Letter Number | 60 | < 0.1%
Other Letter | 10 | < 0.1%
Other values (2) | 7 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 17686
11.5%
i 13888
 
9.0%
n 13812
 
9.0%
o 13414
 
8.7%
a 12756
 
8.3%
t 12513
 
8.1%
r 9188
 
6.0%
s 8064
 
5.2%
c 7021
 
4.6%
l 6579
 
4.3%
Other values (20) 38792
25.2%
Uppercase Letter
ValueCountFrequency (%)
S 1055
10.9%
D 1048
10.8%
M 872
 
9.0%
A 776
 
8.0%
P 690
 
7.1%
T 650
 
6.7%
C 597
 
6.2%
I 541
 
5.6%
R 464
 
4.8%
E 438
 
4.5%
Other values (16) 2531
26.2%
Decimal Number
ValueCountFrequency (%)
1 76
29.6%
0 39
15.2%
3 38
14.8%
2 33
12.8%
9 20
 
7.8%
5 15
 
5.8%
4 13
 
5.1%
6 9
 
3.5%
8 7
 
2.7%
7 7
 
2.7%
Other Punctuation
ValueCountFrequency (%)
, 154
38.8%
& 70
17.6%
' 45
 
11.3%
. 45
 
11.3%
/ 34
 
8.6%
: 33
 
8.3%
; 10
 
2.5%
" 4
 
1.0%
¡ 2
 
0.5%
Other Letter
ValueCountFrequency (%)
4
40.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
Letter Number
ValueCountFrequency (%)
24
40.0%
20
33.3%
13
21.7%
2
 
3.3%
1
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 211
97.2%
5
 
2.3%
[ 1
 
0.5%
Close Punctuation
ValueCountFrequency (%)
) 210
97.2%
5
 
2.3%
] 1
 
0.5%
Math Symbol
ValueCountFrequency (%)
+ 3
60.0%
> 1
 
20.0%
< 1
 
20.0%
Space Separator
ValueCountFrequency (%)
23381
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 693
100.0%
Currency Symbol
ValueCountFrequency (%)
¤ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 163426 | 86.6%
Common | 25168 | 13.3%
Greek | 9 | < 0.1%
Han | 6 | < 0.1%
Hangul | 4 | < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 17686
 
10.8%
i 13888
 
8.5%
n 13812
 
8.5%
o 13414
 
8.2%
a 12756
 
7.8%
t 12513
 
7.7%
r 9188
 
5.6%
s 8064
 
4.9%
c 7021
 
4.3%
l 6579
 
4.0%
Other values (47) 48505
29.7%
Common
ValueCountFrequency (%)
23381
92.9%
- 693
 
2.8%
( 211
 
0.8%
) 210
 
0.8%
, 154
 
0.6%
1 76
 
0.3%
& 70
 
0.3%
' 45
 
0.2%
. 45
 
0.2%
0 39
 
0.2%
Other values (21) 244
 
1.0%
Han
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Greek
ValueCountFrequency (%)
β 4
44.4%
κ 2
22.2%
α 2
22.2%
γ 1
 
11.1%
Hangul
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 188520 | > 99.9%
Number Forms | 60 | < 0.1%
None | 23 | < 0.1%
CJK | 5 | < 0.1%
Compat Jamo | 4 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23381
12.4%
e 17686
 
9.4%
i 13888
 
7.4%
n 13812
 
7.3%
o 13414
 
7.1%
a 12756
 
6.8%
t 12513
 
6.6%
r 9188
 
4.9%
s 8064
 
4.3%
c 7021
 
3.7%
Other values (69) 56797
30.1%
Number Forms
ValueCountFrequency (%)
24
40.0%
20
33.3%
13
21.7%
2
 
3.3%
1
 
1.7%
None
ValueCountFrequency (%)
5
21.7%
5
21.7%
β 4
17.4%
¤ 2
 
8.7%
κ 2
 
8.7%
¡ 2
 
8.7%
α 2
 
8.7%
γ 1
 
4.3%
Compat Jamo
ValueCountFrequency (%)
4
100.0%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
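
One common way to quantify nullity correlation is the Pearson correlation between per-column "value present" indicators. The sketch below assumes that definition; the function name and toy columns are illustrative and are not taken from the report or its plotting code.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators (1 = value present, 0 = missing).

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined without variation.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Toy columns: None marks a missing value.
present_together = nullity_correlation(["x", None, "y", None], ["p", None, "q", None])
never_together = nullity_correlation(["x", None, "y", None], [None, "p", None, "q"])
print(present_together, never_together)  # 1.0 -1.0
```

A value near 1 means two columns tend to be filled in or missing together; a value near -1 means one tends to be present when the other is missing.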

Sample

First rows

Row | 부처명 | 사업명 | 대과제명 | 과제명 | 세부과제명 | 과제영문명
0 | 미래창조과학부 | 일반연구자지원 | <NA> | 사상체질에 따른 Gut hormone profiling을 통한 식욕의 개체 차이에 대한 기전 연구 | <NA> | A study investigating the reason for differences in appetite patterns among individuals through gut hormone profiling according to Sasang constitution
1 | 산업통상자원부 | <NA> | <NA> | 대구한의대학교 한방생명자원연구센터 지역혁신센터사업 | <NA> | <NA>
2 | 미래창조과학부 | 이공학개인기초연구지원 | <NA> | UV/Microwave를 활용한 Phase Ⅱ enzyme 조절 항염증 천연물 metabolite 도출 연구 | <NA> | Anti-inflammatory Effect of Natural Product Metabolite via Phase Ⅱ Enzyme Regulation by using UV/Microwave
3 | 미래창조과학부 | 연구소기업전략육성 | <NA> | 복합천연추출물을 이용한 비만 예방 및 개선용 건강기능식품 개발 및 사업화 | <NA> | <NA>
4 | 교육과학기술부 | 바이오의료기술개발 | <NA> | 한약자원 향장 소재은행 | <NA> | Korea Cosmeceutical Material Bank
5 | 미래창조과학부 | 일반연구자지원 | <NA> | 한의학 경혈점 비접촉 자기장 집속 자극에 의한 생체신호 반응 조사 | <NA> | Investigation of Human Biomedical Signal on EEG/EMG by non-contact and focused Magnetic Field on acupoints
6 | 미래창조과학부 | 일반연구자지원 | <NA> | 신체자가인식 변형 모델을 통한 침 치료 뇌신경생리학적 작용기전 연구 | <NA> | Neural mechanism of acupuncture in body awareness manipulation model
7 | 농촌진흥청 | <NA> | <NA> | 비알콜성 지방간 질환 조절 기전구명 및 기능성 천연물 소재 탐색 | 비알콜성 지방간 질환 조절 기전구명 및 기능성 천연물 소재 탐색( 국립농업과학원 황유진) | Studies of natural material for improvement of nonalcoholic fatty liver diseases and molecular mechanism
8 | 교육과학기술부 | 일반연구자지원 | <NA> | 한의학적 아토피 피부염 동물모델 개발 및 한방제제의 효과와 그 기전 규명 | <NA> | Development of Oriental atopic dermatitis animal model and identification of effect and the mechanism of Oriental prescription
9 | 교육과학기술부 | 일반연구자지원 | <NA> | 한의학 칠정(七情)에 기반을 둔 핵심감정평가도구 제작과 신뢰도ㆍ타당도 연구 | <NA> | A study on The Development of The Core Emotion Assessment Instrument based on the seven emotion (七情) and Analysis of Reliability and Validity

Last rows

Row | 부처명 | 사업명 | 대과제명 | 과제명 | 세부과제명 | 과제영문명
3576 | 교육과학기술부 | 일반연구자지원 | <NA> | 흰민들레로부터 알쯔하이머 예방 및 항노화 활성 연구와 기능성 물질의 분리 및 작용 메카니즘 규명 | <NA> | Study on protective effects of Taraxacum coreanum against Alzheimer's disease and aging process with its mechanisms and isolation of active compounds
3577 | 과학기술정보통신부 | 바이오·의료기술개발 | <NA> | 암환자 통증 및 악액질 완화 양·한방 통합 치료기술 개발 | 1/항암제 통증 및 암성 악액질 완화 통합 치료기술 연구//!@ 2/암성 통증 완화 통합 치료기술 연구//!@ 3/암환자 대상 통합 치료기술 임상연구// | <NA>
3578 | 과학기술정보통신부 | 바이오·의료기술개발 | <NA> | 암성 악액질 예방 및 치료를 위한 양한방 통합관리체계 개발 | <NA> | <NA>
3579 | 과학기술정보통신부 | 중견연구자지원 | <NA> | 뇌혈관 노화의 분자해부학적 분석과 황화수소 제어에 의한 뇌혈류 개선 | <NA> | Molecular-anatomical analysis of cerebrovascular aging and improving cerebral blood flow using hydrogen sulfide
3580 | 과학기술정보통신부 | 중견연구자지원 | <NA> | 온병(溫病) 변증이론에 기반한 대장질환 치료약물 탐색 및 AMPK를 중심으로 한 기전 구명 | <NA> | Investigation of the novel therapeutic agent for colon diseases based on warm disease theory (Onbyung hak) and elucidation the functional role of AMPK signaling in colon diseases
3581 | 과학기술정보통신부 | 중견연구자지원 | <NA> | 염증성 장질환 모델에서 Tumor necrosis factor-alpha inhibitor 및 한약재 병행투여에 따른 면역기전 연구 | <NA> | The effects and immunological mechanisms of traditional Korean medicine combined with tumor necrosis factor-alpha inhibitor on inflammatory bowel disease animal model
3582 | 과학기술정보통신부 | 바이오·의료기술개발 | 암환자 통증 및 악액질 완화 양·한방 통합 치료기술 개발 | 암성 통증 완화 통합 치료기술 연구 | <NA> | <NA>
3583 | 농촌진흥청 | 연구개발성과실용화지원 | <NA> | 울금-커큐민 성분을 이용한 기능성 건강식품 개발 | <NA> | <NA>
3584 | 과학기술정보통신부 | 바이오·의료기술개발 | 암성 악액질 예방 및 치료를 위한 양한방 통합관리체계 개발 | 암성 악액질 관리체계 개발 및 한방 치료기술 검증 | <NA> | <NA>
3585 | 보건복지부 | 한의약선도기술개발 | <NA> | 의료기기중심 한의약 임상시험센터 | 1/한의약 다기관 임상시험 인프라 구축/원광대학교/이상관!@ 2/한의약 임상시험 교육 프로그램 개발 및 전문인력 양성/원광대학교/김성철!@ 3/한의약 임상시험 실시기술 개발/원광대학교/권영달!@ 4/한의약 임상시험 활성화 및 제품화 지원체계 구축/원광대학교/성강경 | <NA>
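
In rows 3577 and 3585, the 세부과제명 value appears to pack several sub-projects into one cell, separated by `!@`, each written as number/title/institution/researcher. The sketch below splits such a cell under that assumption. The layout is inferred from these two rows only, the function name is hypothetical, and older records that use the "title( institution researcher)" form are not handled.

```python
def parse_subprojects(cell):
    """Split a packed sub-project cell into (number, title, institution, researcher) tuples.

    Assumes entries are separated by "!@" and each entry looks like
    "N/title/institution/researcher". The title may itself contain "/",
    so the last two fields are split off from the right.
    """
    entries = []
    for chunk in cell.split("!@"):
        chunk = chunk.strip()
        if not chunk:
            continue
        number, rest = chunk.split("/", 1)
        title, institution, researcher = rest.rsplit("/", 2)
        entries.append((int(number), title, institution, researcher))
    return entries

# First two entries of row 3585 as shown in this report.
cell = (
    "1/한의약 다기관 임상시험 인프라 구축/원광대학교/이상관!@ "
    "2/한의약 임상시험 교육 프로그램 개발 및 전문인력 양성/원광대학교/김성철"
)
for entry in parse_subprojects(cell):
    print(entry)
```

Entries with empty institution and researcher fields, as in row 3577, come back with empty strings in those positions.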

Duplicate rows

Most frequently occurring

Row | 부처명 | 사업명 | 대과제명 | 과제명 | 세부과제명 | 과제영문명 | # duplicates
1 | 과학기술부 | 21C 프론티어연구개발사업 | 자생식물이용기술개발사업 | 인삼의 구조, 기능 유전체 및 프로테움 연구 | <NA> | <NA> | 3
5 | 과학기술부 | 21세기프론티어연구개발사업 | 자생식물이용기술개발 | 자생식물 유래 혈관염 치료 식품의약의 개발 | <NA> | Development of food medicines for the treatment of vascular inflammation from plant resources | 3
7 | 과학기술부 | 국가지정연구실사업 | 국가지정연구실사업 | 한의학 진단 및 치료를 위한 경혈 · 경락의 생물 물리학적 기전 연구 | <NA> | <NA> | 3
10 | 과학기술부 | 목적기초연구사업 | 지역대학우수과학자지원연구 | 배양 포유동물세포에서 DNA상해원에 의한 유전독성과 세포고사능변화에 미치는 홍삼사포닌성분들의 작용 기작규명 | <NA> | <NA> | 3
17 | 과학기술부 | 목적기초연구사업 | 특정기초연구 | 시상 하부파괴로 유발된 면역반응의 변화에 침자극이 미치는 영향 | <NA> | <NA> | 3
20 | 과학기술부 | 목적기초연구사업 | 특정기초연구 | 한방소갈약 인위적 복합물의 항돌연변이, 지질과 산화 및 항암효과에 관한 기전적 연구 | <NA> | <NA> | 3
42 | 국무조정실 | 한국한의학연구원 | 국제협력및연구관리지원 | 국제협력 및 연구관리 지원 | <NA> | <NA> | 3
43 | 국무조정실 | 한국한의학연구원 | 한약재검사사업 | 한약재 검사 사업 | <NA> | <NA> | 3
46 | 국무조정실 | 한국한의학연구원 | <NA> | 한의 학술 정보화 연구 | <NA> | <NA> | 3
51 | 농림부 | 농림기술개발 | 미삼의압출성형에의한세포벽수용화및재구성기법연구 | 미삼의 압출성형에 의한 세포벽 수용화 및 재구성 기법 연구 | <NA> | <NA> | 3
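
A per-row "# duplicates" count like the one in this table can be obtained by counting identical rows across all six columns. The sketch below shows one way to do that with the standard library. It is an illustration, not the profiler's implementation, and it makes no claim about how the headline figure of 97 duplicate rows is defined.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy data: None stands for <NA>. The first record is repeated three times.
repeated = ("국무조정실", "한국한의학연구원", None, "한의 학술 정보화 연구", None, None)
single = ("농촌진흥청", "연구개발성과실용화지원", None, "울금-커큐민 성분을 이용한 기능성 건강식품 개발", None, None)
rows = [repeated, single, repeated, repeated]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(n, row)
```

Rows are compared as whole tuples, so two records count as duplicates only when every field, including missing ones, matches.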