Overview

Dataset statistics

Number of variables: 14
Number of observations: 10000
Missing cells: 13603
Missing cells (%): 9.7%
Duplicate rows: 5
Duplicate rows (%): 0.1%
Total size in memory: 1.2 MiB
Average record size in memory: 122.0 B
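
The percentages and sizes above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over variables times observations and that MiB means 2**20 bytes; the helper name `pct` is illustrative and not part of the report.

```python
def pct(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Counts copied from the dataset statistics.
n_vars, n_obs = 14, 10_000
missing_cells = 13_603

# Assumption: the denominator is the total number of cells.
missing_pct = pct(missing_cells, n_vars * n_obs)

# 122.0 B per record times 10000 records, converted to MiB (2**20 bytes).
total_mib = 122.0 * n_obs / 2**20

print(f"{missing_pct:.1f}%")    # 9.7%
print(f"{total_mib:.1f} MiB")   # 1.2 MiB
```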

Variable types

Text: 6
Categorical: 5
DateTime: 2
Numeric: 1

Dataset

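Description: 1. KOICA-ODA project information and KF public diplomacy project information list lookup: retrieves the list of KOICA-ODA and KF public diplomacy projects by Korean country name or ISO country code (using the ISO country codes in item 다, Reference 1) and by Korean project name.
Author: Korea Foundation (한국국제교류재단)
URL: https://www.data.go.kr/data/15099253/fileData.do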

Alerts

Dataset has 5 (0.1%) duplicate rows [Duplicates]
사업유형명 is highly overall correlated with 사업유형코드 and 2 other fields [High correlation]
다년구분코드명 is highly overall correlated with 사업유형코드 and 2 other fields [High correlation]
다년구분코드 is highly overall correlated with 사업유형코드 and 2 other fields [High correlation]
사업유형코드 is highly overall correlated with 사업유형명 and 2 other fields [High correlation]
사업유형코드 is highly imbalanced (89.3%) [Imbalance]
사업유형명 is highly imbalanced (89.3%) [Imbalance]
다년구분코드 is highly imbalanced (60.4%) [Imbalance]
다년구분코드명 is highly imbalanced (60.4%) [Imbalance]
사업명(영문) has 6504 (65.0%) missing values [Missing]
사업시작일 has 2697 (27.0%) missing values [Missing]
사업종료일 has 2699 (27.0%) missing values [Missing]
수혜기관명 has 1638 (16.4%) missing values [Missing]
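
The two imbalance percentages in the alerts are consistent with an entropy-based score. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by log2 of the number of classes. This is inferred from the reported numbers and is not a statement about how ydata-profiling computes it internally. The function name `imbalance` is hypothetical, and the counts come from the Common Values tables of 사업유형코드 and 다년구분코드.

```python
import math

def imbalance(counts):
    """One minus normalized Shannon entropy (0 = balanced, near 1 = dominated by one class)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # score is undefined for a single class; 0.0 is a convention chosen here
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))

type_code = [9859, 141]                     # 사업유형코드 value counts
multi_year = [7426, 2208, 228, 83, 54, 1]   # 다년구분코드 value counts

print(f"{imbalance(type_code):.1%}")    # 89.3%
print(f"{imbalance(multi_year):.1%}")   # 60.4%
```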

Reproduction

Analysis started: 2023-12-12 22:47:29.132579
Analysis finished: 2023-12-12 22:47:32.098713
Duration: 2.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
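
The reported duration is the difference between the two timestamps above. A minimal check with the standard library; the format string is an assumption about how the timestamps are written:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2023-12-12 22:47:29.132579", FMT)
finished = datetime.strptime("2023-12-12 22:47:32.098713", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 2.97 seconds
```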

Variables

Distinct: 123
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 10
Mean length: 3.0236
Min length: 2

Characters and Unicode

Total characters: 30236
Distinct characters: 148
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
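
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It shows the idea only and is not the profiler's own code. The input is the five sample values reported for this variable, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

sample = ["대한민국", "폴란드", "인도네시아", "미국", "미국"]
counts = category_counts(sample)
print(dict(counts))  # {'Lo': 16}
```

`Lo` is the two-letter code for "Other Letter", the category that dominates this variable.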

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: 대한민국
2nd row: 폴란드
3rd row: 인도네시아
4th row: 미국
5th row: 미국

Value | Count | Frequency (%)
대한민국 | 2728 | 27.3%
미국 | 2435 | 24.3%
중국 | 545 | 5.5%
러시아 | 335 | 3.4%
일본 | 272 | 2.7%
독일 | 268 | 2.7%
베트남 | 242 | 2.4%
영국 | 226 | 2.3%
호주 | 159 | 1.6%
캐나다 | 157 | 1.6%
Other values (113) | 2633 | 26.3%

Most occurring characters

ValueCountFrequency (%)
6077
20.1%
2729
 
9.0%
2728
 
9.0%
2728
 
9.0%
2484
 
8.2%
1033
 
3.4%
687
 
2.3%
554
 
1.8%
545
 
1.8%
523
 
1.7%
Other values (138) 10148
33.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 30232 | > 99.9%
Uppercase Letter | 2 | < 0.1%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6077
20.1%
2729
 
9.0%
2728
 
9.0%
2728
 
9.0%
2484
 
8.2%
1033
 
3.4%
687
 
2.3%
554
 
1.8%
545
 
1.8%
523
 
1.7%
Other values (134) 10144
33.6%
Uppercase Letter
ValueCountFrequency (%)
R 1
50.0%
D 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 30232 | > 99.9%
Common | 2 | < 0.1%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6077
20.1%
2729
 
9.0%
2728
 
9.0%
2728
 
9.0%
2484
 
8.2%
1033
 
3.4%
687
 
2.3%
554
 
1.8%
545
 
1.8%
523
 
1.7%
Other values (134) 10144
33.6%
Common
ValueCountFrequency (%)
( 1
50.0%
) 1
50.0%
Latin
ValueCountFrequency (%)
R 1
50.0%
D 1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 30232 | > 99.9%
ASCII | 4 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6077
20.1%
2729
 
9.0%
2728
 
9.0%
2728
 
9.0%
2484
 
8.2%
1033
 
3.4%
687
 
2.3%
554
 
1.8%
545
 
1.8%
523
 
1.7%
Other values (134) 10144
33.6%
ASCII
ValueCountFrequency (%)
( 1
25.0%
) 1
25.0%
R 1
25.0%
D 1
25.0%
Distinct: 122
Distinct (%): 1.2%
Missing: 33
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 26
Mean length: 10.672218
Min length: 3
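
The mean length agrees with the total character count reported for this variable (106370) divided by the number of non-missing rows. The sketch assumes missing values are excluded from the denominator, which is what the numbers suggest.

```python
total_characters = 106_370      # "Total characters" reported for this variable
n_obs, n_missing = 10_000, 33   # observations and missing values

# Assumption: only non-missing rows contribute to the mean.
mean_length = total_characters / (n_obs - n_missing)
print(round(mean_length, 6))  # 10.672218
```

The same relation holds for the first text variable, which has no missing values: 30236 / 10000 = 3.0236.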

Characters and Unicode

Total characters: 106370
Distinct characters: 55
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: Korea
2nd row: Poland
3rd row: Indonesia
4th row: United States of America
5th row: United States of America

Value | Count | Frequency (%)
korea | 2728 | 15.4%
united | 2675 | 15.1%
of | 2436 | 13.7%
states | 2435 | 13.7%
america | 2435 | 13.7%
china | 545 | 3.1%
russia | 335 | 1.9%
japan | 272 | 1.5%
germany | 268 | 1.5%
vietnam | 242 | 1.4%
Other values (128) | 3346 | 18.9%
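
The table above counts words rather than whole cell values: "United States of America" contributes four tokens. The sketch below assumes tokens are produced by lowercasing and splitting on whitespace, which matches the lowercase entries in the table but is not confirmed to be the profiler's exact rule. `word_counts` is a hypothetical helper, applied here to the five sample rows.

```python
from collections import Counter

def word_counts(values):
    """Lowercase each string, split it on whitespace, and count the resulting tokens."""
    return Counter(tok for text in values for tok in text.lower().split())

sample = [
    "Korea", "Poland", "Indonesia",
    "United States of America", "United States of America",
]
counts = word_counts(sample)
print(counts["united"], counts["korea"])  # 2 1
```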

Most occurring characters

Value | Count | Frequency (%)
a | 13404 | 12.6%
e | 12065 | 11.3%
t | 8482 | 8.0%
i | 8304 | 7.8%
(space) | 7750 | 7.3%
r | 6658 | 6.3%
n | 6151 | 5.8%
o | 6134 | 5.8%
s | 3900 | 3.7%
d | 3764 | 3.5%
Other values (45) | 29758 | 28.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 83257 | 78.3%
Uppercase Letter | 15328 | 14.4%
Space Separator | 7750 | 7.3%
Other Punctuation | 32 | < 0.1%
Dash Punctuation | 3 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 13404
16.1%
e 12065
14.5%
t 8482
10.2%
i 8304
10.0%
r 6658
8.0%
n 6151
7.4%
o 6134
7.4%
s 3900
 
4.7%
d 3764
 
4.5%
m 3467
 
4.2%
Other values (17) 10928
13.1%
Uppercase Letter
ValueCountFrequency (%)
K 3045
19.9%
A 2760
18.0%
U 2753
18.0%
S 2732
17.8%
C 901
 
5.9%
R 427
 
2.8%
I 418
 
2.7%
G 306
 
2.0%
T 302
 
2.0%
J 298
 
1.9%
Other values (13) 1386
9.0%
Other Punctuation
ValueCountFrequency (%)
: 14
43.8%
' 13
40.6%
& 5
 
15.6%
Space Separator
ValueCountFrequency (%)
7750
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 98585 | 92.7%
Common | 7785 | 7.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 13404
13.6%
e 12065
12.2%
t 8482
 
8.6%
i 8304
 
8.4%
r 6658
 
6.8%
n 6151
 
6.2%
o 6134
 
6.2%
s 3900
 
4.0%
d 3764
 
3.8%
m 3467
 
3.5%
Other values (40) 26256
26.6%
Common
ValueCountFrequency (%)
7750
99.6%
: 14
 
0.2%
' 13
 
0.2%
& 5
 
0.1%
- 3
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 106357 | > 99.9%
None | 13 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 13404
12.6%
e 12065
11.3%
t 8482
 
8.0%
i 8304
 
7.8%
7750
 
7.3%
r 6658
 
6.3%
n 6151
 
5.8%
o 6134
 
5.8%
s 3900
 
3.7%
d 3764
 
3.5%
Other values (44) 29745
28.0%
None
ValueCountFrequency (%)
ô 13
100.0%
Distinct: 123
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 20000
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: KR
2nd row: PL
3rd row: ID
4th row: US
5th row: US

Value | Count | Frequency (%)
kr | 2728 | 27.3%
us | 2435 | 24.3%
cn | 545 | 5.5%
ru | 335 | 3.4%
jp | 272 | 2.7%
de | 268 | 2.7%
vn | 242 | 2.4%
gb | 226 | 2.3%
au | 159 | 1.6%
ca | 157 | 1.6%
Other values (113) | 2633 | 26.3%

Most occurring characters

Value | Count | Frequency (%)
R | 3498 | 17.5%
U | 3044 | 15.2%
K | 2945 | 14.7%
S | 2674 | 13.4%
N | 1099 | 5.5%
C | 906 | 4.5%
E | 619 | 3.1%
A | 568 | 2.8%
T | 466 | 2.3%
I | 464 | 2.3%
Other values (16) | 3717 | 18.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 20000 | 100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 3498
17.5%
U 3044
15.2%
K 2945
14.7%
S 2674
13.4%
N 1099
 
5.5%
C 906
 
4.5%
E 619
 
3.1%
A 568
 
2.8%
T 466
 
2.3%
I 464
 
2.3%
Other values (16) 3717
18.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 20000 | 100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 3498
17.5%
U 3044
15.2%
K 2945
14.7%
S 2674
13.4%
N 1099
 
5.5%
C 906
 
4.5%
E 619
 
3.1%
A 568
 
2.8%
T 466
 
2.3%
I 464
 
2.3%
Other values (16) 3717
18.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20000 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 3498
17.5%
U 3044
15.2%
K 2945
14.7%
S 2674
13.4%
N 1099
 
5.5%
C 906
 
4.5%
E 619
 
3.1%
A 568
 
2.8%
T 466
 
2.3%
I 464
 
2.3%
Other values (16) 3717
18.6%

대륙명
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 5
Mean length: 3.5327
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 아시아
2nd row: 유럽
3rd row: 아시아
4th row: 북아메리카
5th row: 북아메리카

Common Values

Value | Count | Frequency (%)
아시아 | 4732 | 47.3%
북아메리카 | 2777 | 27.8%
유럽 | 1921 | 19.2%
호주(오세아니아) | 188 | 1.9%
남아메리카 | 184 | 1.8%
아프리카 | 165 | 1.7%
<NA> | 33 | 0.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: frequency plot of the common values]

사업유형코드
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 9859 | 98.6%
2 | 141 | 1.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: frequency plot of the common values]

사업유형명
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.0423
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KF
2nd row: KF
3rd row: KOICA
4th row: KF
5th row: KF

Common Values

Value | Count | Frequency (%)
KF | 9859 | 98.6%
KOICA | 141 | 1.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: frequency plot of the common values]

Distinct: 7973
Distinct (%): 80.0%
Missing: 32
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 125
Median length: 92
Mean length: 23.931481
Min length: 3

Characters and Unicode

Total characters: 238549
Distinct characters: 919
Distinct categories: 17
Distinct scripts: 7
Distinct blocks: 10

Unique

Unique: 6915
Unique (%): 69.4%

Sample

1st row: 아태안보협력이사회(CSCAP)
2nd row: 2001년도 폴란드 바르샤바대 한국학 강좌 운영
3rd row: 인도네시아 파푸아 Boven Digoel 지역 의료서비스 개선 사업(2016-2018/2,817백만원/코린산업)
4th row: 미국 CFR
5th row: [차세대] CCGA

Value | Count | Frequency (%)
한국어 | 1129 | 2.6%
미국 | 1095 | 2.5%
한국학 | 873 | 2.0%
객원교수 | 836 | 1.9%
지원 | 814 | 1.9%
뉴스레터 | 366 | 0.8%
중국 | 305 | 0.7%
설치 | 281 | 0.6%
운영 | 280 | 0.6%
(value not preserved) | 275 | 0.6%
Other values (9746) | 37645 | 85.8%

Most occurring characters

ValueCountFrequency (%)
34277
 
14.4%
7715
 
3.2%
5755
 
2.4%
2 5163
 
2.2%
0 5087
 
2.1%
4575
 
1.9%
3833
 
1.6%
1 3677
 
1.5%
3326
 
1.4%
[ 3015
 
1.3%
Other values (909) 162126
68.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 138533 | 58.1%
Space Separator | 34277 | 14.4%
Decimal Number | 19732 | 8.3%
Lowercase Letter | 19244 | 8.1%
Uppercase Letter | 10476 | 4.4%
Close Punctuation | 5654 | 2.4%
Open Punctuation | 5652 | 2.4%
Dash Punctuation | 2295 | 1.0%
Other Punctuation | 1934 | 0.8%
Math Symbol | 540 | 0.2%
Other values (7) | 212 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7715
 
5.6%
5755
 
4.2%
4575
 
3.3%
3833
 
2.8%
3326
 
2.4%
2763
 
2.0%
2684
 
1.9%
2661
 
1.9%
2375
 
1.7%
2188
 
1.6%
Other values (809) 100658
72.7%
Lowercase Letter
ValueCountFrequency (%)
e 2309
12.0%
i 2079
10.8%
o 1812
9.4%
n 1758
9.1%
t 1657
8.6%
a 1639
8.5%
r 1455
 
7.6%
s 1229
 
6.4%
l 865
 
4.5%
c 612
 
3.2%
Other values (19) 3829
19.9%
Uppercase Letter
ValueCountFrequency (%)
S 1232
11.8%
C 1044
 
10.0%
I 952
 
9.1%
A 949
 
9.1%
K 752
 
7.2%
U 671
 
6.4%
T 561
 
5.4%
F 561
 
5.4%
E 556
 
5.3%
P 507
 
4.8%
Other values (16) 2691
25.7%
Other Punctuation
ValueCountFrequency (%)
/ 928
48.0%
, 351
 
18.1%
. 168
 
8.7%
: 131
 
6.8%
' 120
 
6.2%
" 116
 
6.0%
· 55
 
2.8%
& 47
 
2.4%
? 11
 
0.6%
! 6
 
0.3%
Decimal Number
ValueCountFrequency (%)
2 5163
26.2%
0 5087
25.8%
1 3677
18.6%
9 1600
 
8.1%
5 961
 
4.9%
3 723
 
3.7%
8 689
 
3.5%
6 666
 
3.4%
7 654
 
3.3%
4 512
 
2.6%
Open Punctuation
ValueCountFrequency (%)
[ 3015
53.3%
( 2596
45.9%
30
 
0.5%
11
 
0.2%
Close Punctuation
ValueCountFrequency (%)
] 3015
53.3%
) 2598
45.9%
30
 
0.5%
11
 
0.2%
Math Symbol
ValueCountFrequency (%)
< 265
49.1%
> 265
49.1%
~ 6
 
1.1%
+ 4
 
0.7%
Final Punctuation
ValueCountFrequency (%)
17
54.8%
14
45.2%
Initial Punctuation
ValueCountFrequency (%)
16
51.6%
15
48.4%
Letter Number
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
Space Separator
ValueCountFrequency (%)
34277
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2295
100.0%
Control
ValueCountFrequency (%)
102
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 40
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 138510 | 58.1%
Common | 70289 | 29.5%
Latin | 29724 | 12.5%
Han | 21 | < 0.1%
Hiragana | 3 | < 0.1%
Cyrillic | 1 | < 0.1%
Greek | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7715
 
5.6%
5755
 
4.2%
4575
 
3.3%
3833
 
2.8%
3326
 
2.4%
2763
 
2.0%
2684
 
1.9%
2661
 
1.9%
2375
 
1.7%
2188
 
1.6%
Other values (791) 100635
72.7%
Latin
ValueCountFrequency (%)
e 2309
 
7.8%
i 2079
 
7.0%
o 1812
 
6.1%
n 1758
 
5.9%
t 1657
 
5.6%
a 1639
 
5.5%
r 1455
 
4.9%
S 1232
 
4.1%
s 1229
 
4.1%
C 1044
 
3.5%
Other values (45) 13510
45.5%
Common
ValueCountFrequency (%)
34277
48.8%
2 5163
 
7.3%
0 5087
 
7.2%
1 3677
 
5.2%
[ 3015
 
4.3%
] 3015
 
4.3%
) 2598
 
3.7%
( 2596
 
3.7%
- 2295
 
3.3%
9 1600
 
2.3%
Other values (32) 6966
 
9.9%
Han
ValueCountFrequency (%)
5
23.8%
2
 
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (6) 6
28.6%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Cyrillic
ValueCountFrequency (%)
о 1
100.0%
Greek
ValueCountFrequency (%)
ο 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 138500 | 58.1%
ASCII | 99806 | 41.8%
None | 140 | 0.1%
Punctuation | 62 | < 0.1%
CJK | 21 | < 0.1%
Compat Jamo | 9 | < 0.1%
Number Forms | 6 | < 0.1%
Hiragana | 3 | < 0.1%
Cyrillic | 1 | < 0.1%
Katakana | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
34277
34.3%
2 5163
 
5.2%
0 5087
 
5.1%
1 3677
 
3.7%
[ 3015
 
3.0%
] 3015
 
3.0%
) 2598
 
2.6%
( 2596
 
2.6%
e 2309
 
2.3%
- 2295
 
2.3%
Other values (74) 35774
35.8%
Hangul
ValueCountFrequency (%)
7715
 
5.6%
5755
 
4.2%
4575
 
3.3%
3833
 
2.8%
3326
 
2.4%
2763
 
2.0%
2684
 
1.9%
2661
 
1.9%
2375
 
1.7%
2188
 
1.6%
Other values (788) 100625
72.7%
None
ValueCountFrequency (%)
· 55
39.3%
30
21.4%
30
21.4%
11
 
7.9%
11
 
7.9%
ο 1
 
0.7%
ô 1
 
0.7%
1
 
0.7%
Punctuation
ValueCountFrequency (%)
17
27.4%
16
25.8%
15
24.2%
14
22.6%
Compat Jamo
ValueCountFrequency (%)
7
77.8%
2
 
22.2%
CJK
ValueCountFrequency (%)
5
23.8%
2
 
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (6) 6
28.6%
Number Forms
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
Cyrillic
ValueCountFrequency (%)
о 1
100.0%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Katakana
ValueCountFrequency (%)
1
100.0%

사업명(영문)
Text

Alerts: Missing

Distinct: 2036
Distinct (%): 58.2%
Missing: 6504
Missing (%): 65.0%
Memory size: 156.2 KiB
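
For a column with this many gaps, the percentages use different denominators. The sketch below assumes that "Missing (%)" is relative to all observations, while "Distinct (%)" and "Unique (%)" are relative to non-missing rows only, an assumption the reported figures support. The unique count (1601) is the one reported later in this variable's section.

```python
n_obs, n_missing = 10_000, 6_504
n_distinct, n_unique = 2_036, 1_601   # distinct values; values that occur exactly once

n_present = n_obs - n_missing          # 3496 non-missing rows
missing_pct = 100 * n_missing / n_obs
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

print(f"{missing_pct:.1f} {distinct_pct:.1f} {unique_pct:.1f}")  # 65.0 58.2 45.8
```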

Length

Max length: 203
Median length: 151
Mean length: 45.851831
Min length: 3

Characters and Unicode

Total characters: 160298
Distinct characters: 96
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 5

Unique

Unique: 1601
Unique (%): 45.8%

Sample

1st row: 2001 Establishment of Professorships Program
2nd row: Medical Services Improvement of Local Community in Indonesia
3rd row: KAMS International Exchange Forum - [Edinburgh Expansion Strategy & Cases] Lecture
4th row: Concert celebrating 20th anniversary of the treaty of amity between Korea and Russia
5th row: Venezuela Writers Invitational Meeting

Value | Count | Frequency (%)
of | 1391 | 6.2%
program | 1094 | 4.9%
korean | 736 | 3.3%
the | 725 | 3.3%
for | 456 | 2.0%
and | 421 | 1.9%
visiting | 383 | 1.7%
staff | 357 | 1.6%
teaching | 356 | 1.6%
employment | 355 | 1.6%
Other values (2967) | 16001 | 71.8%

Most occurring characters

ValueCountFrequency (%)
18804
 
11.7%
e 12272
 
7.7%
o 11249
 
7.0%
r 10824
 
6.8%
a 9654
 
6.0%
n 8789
 
5.5%
i 8646
 
5.4%
t 8225
 
5.1%
s 8114
 
5.1%
l 4278
 
2.7%
Other values (86) 59443
37.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 111528 | 69.6%
Uppercase Letter | 20130 | 12.6%
Space Separator | 18804 | 11.7%
Decimal Number | 7668 | 4.8%
Other Punctuation | 973 | 0.6%
Dash Punctuation | 577 | 0.4%
Open Punctuation | 230 | 0.1%
Close Punctuation | 226 | 0.1%
Math Symbol | 59 | < 0.1%
Final Punctuation | 45 | < 0.1%
Other values (4) | 58 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 12272
11.0%
o 11249
10.1%
r 10824
9.7%
a 9654
 
8.7%
n 8789
 
7.9%
i 8646
 
7.8%
t 8225
 
7.4%
s 8114
 
7.3%
l 4278
 
3.8%
m 3754
 
3.4%
Other values (16) 25723
23.1%
Uppercase Letter
ValueCountFrequency (%)
P 2535
12.6%
S 2019
 
10.0%
E 1825
 
9.1%
K 1554
 
7.7%
A 1296
 
6.4%
T 1233
 
6.1%
C 1193
 
5.9%
N 999
 
5.0%
R 908
 
4.5%
F 856
 
4.3%
Other values (16) 5712
28.4%
Other Punctuation
ValueCountFrequency (%)
: 212
21.8%
, 201
20.7%
. 157
16.1%
' 148
15.2%
" 147
15.1%
& 75
 
7.7%
/ 17
 
1.7%
; 9
 
0.9%
? 5
 
0.5%
# 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 2648
34.5%
2 1780
23.2%
1 1184
15.4%
9 838
 
10.9%
8 247
 
3.2%
7 244
 
3.2%
6 195
 
2.5%
3 193
 
2.5%
5 191
 
2.5%
4 148
 
1.9%
Math Symbol
ValueCountFrequency (%)
< 25
42.4%
> 23
39.0%
| 6
 
10.2%
+ 4
 
6.8%
1
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 129
56.1%
[ 100
43.5%
1
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 125
55.3%
] 100
44.2%
1
 
0.4%
Letter Number
ValueCountFrequency (%)
3
42.9%
2
28.6%
2
28.6%
Dash Punctuation
ValueCountFrequency (%)
- 576
99.8%
1
 
0.2%
Initial Punctuation
ValueCountFrequency (%)
28
75.7%
9
 
24.3%
Final Punctuation
ValueCountFrequency (%)
26
57.8%
19
42.2%
Space Separator
ValueCountFrequency (%)
18804
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 13
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 131665 | 82.1%
Common | 28633 | 17.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 12272
 
9.3%
o 11249
 
8.5%
r 10824
 
8.2%
a 9654
 
7.3%
n 8789
 
6.7%
i 8646
 
6.6%
t 8225
 
6.2%
s 8114
 
6.2%
l 4278
 
3.2%
m 3754
 
2.9%
Other values (45) 45860
34.8%
Common
ValueCountFrequency (%)
18804
65.7%
0 2648
 
9.2%
2 1780
 
6.2%
1 1184
 
4.1%
9 838
 
2.9%
- 576
 
2.0%
8 247
 
0.9%
7 244
 
0.9%
: 212
 
0.7%
, 201
 
0.7%
Other values (31) 1899
 
6.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 160205 | 99.9%
Punctuation | 83 | 0.1%
Number Forms | 7 | < 0.1%
None | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
18804
 
11.7%
e 12272
 
7.7%
o 11249
 
7.0%
r 10824
 
6.8%
a 9654
 
6.0%
n 8789
 
5.5%
i 8646
 
5.4%
t 8225
 
5.1%
s 8114
 
5.1%
l 4278
 
2.7%
Other values (75) 59350
37.0%
Punctuation
ValueCountFrequency (%)
28
33.7%
26
31.3%
19
22.9%
9
 
10.8%
1
 
1.2%
Number Forms
ValueCountFrequency (%)
3
42.9%
2
28.6%
2
28.6%
None
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

사업시작일
Date

Alerts: Missing

Distinct: 2126
Distinct (%): 29.1%
Missing: 2697
Missing (%): 27.0%
Memory size: 156.2 KiB
Minimum: 1992-01-01 00:00:00
Maximum: 2024-05-31 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

사업종료일
Date

Alerts: Missing

Distinct: 2120
Distinct (%): 29.0%
Missing: 2699
Missing (%): 27.0%
Memory size: 156.2 KiB
Minimum: 1992-02-27 00:00:00
Maximum: 2024-07-31 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

다년구분코드
Categorical

Alerts: High correlation, Imbalance

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.0822
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: S
2nd row: S
3rd row: MC
4th row: S
5th row: S

Common Values

Value | Count | Frequency (%)
S | 7426 | 74.3%
M | 2208 | 22.1%
<NA> | 228 | 2.3%
MC | 83 | 0.8%
MN | 54 | 0.5%
SN | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: frequency plot of the common values]

다년구분코드명
Categorical

Alerts: High correlation, Imbalance

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0732
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 단년
2nd row: 단년
3rd row: 다년계속
4th row: 단년
5th row: 단년

Common Values

Value | Count | Frequency (%)
단년 | 7426 | 74.3%
다년 | 2208 | 22.1%
<NA> | 228 | 2.3%
다년계속 | 83 | 0.8%
다년신규 | 54 | 0.5%
단년신규 | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: frequency plot of the common values]

수혜기관명
Text

Alerts: Missing

Distinct: 2965
Distinct (%): 35.5%
Missing: 1638
Missing (%): 16.4%
Memory size: 156.2 KiB

Length

Max length: 100
Median length: 87
Mean length: 14.446424
Min length: 2

Characters and Unicode

Total characters: 120801
Distinct characters: 1157
Distinct categories: 18
Distinct scripts: 15
Distinct blocks: 18

Unique

Unique: 1608
Unique (%): 19.2%

Sample

1st row: 아시아태평양 안보협력이사회 한국위원회
2nd row: 바르샤바대
3rd row: PT.Tunas Sawaerma, ㈜코린산업
4th row: 주 미국 대한민국 대사관
5th row: Harvard University

Value | Count | Frequency (%)
of | 555 | 3.0%
university | 547 | 3.0%
한국국제교류재단 | 401 | 2.2%
(value not preserved) | 355 | 2.0%
대사관 | 334 | 1.8%
대한민국 | 319 | 1.8%
and | 195 | 1.1%
미국 | 173 | 1.0%
for | 165 | 0.9%
studies | 153 | 0.8%
Other values (3902) | 15005 | 82.4%

Most occurring characters

ValueCountFrequency (%)
9894
 
8.2%
i 4612
 
3.8%
e 4494
 
3.7%
4250
 
3.5%
n 3997
 
3.3%
a 3933
 
3.3%
t 3467
 
2.9%
r 3401
 
2.8%
3128
 
2.6%
o 3054
 
2.5%
Other values (1147) 76571
63.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 53330 | 44.1%
Lowercase Letter | 43011 | 35.6%
Uppercase Letter | 11072 | 9.2%
Space Separator | 9894 | 8.2%
Close Punctuation | 1107 | 0.9%
Open Punctuation | 1101 | 0.9%
Other Punctuation | 590 | 0.5%
Dash Punctuation | 405 | 0.3%
Decimal Number | 152 | 0.1%
Nonspacing Mark | 88 | 0.1%
Other values (8) | 51 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4250
 
8.0%
3128
 
5.9%
1856
 
3.5%
1833
 
3.4%
1644
 
3.1%
1244
 
2.3%
1168
 
2.2%
1033
 
1.9%
984
 
1.8%
972
 
1.8%
Other values (871) 35218
66.0%
Lowercase Letter
ValueCountFrequency (%)
i 4612
10.7%
e 4494
10.4%
n 3997
 
9.3%
a 3933
 
9.1%
t 3467
 
8.1%
r 3401
 
7.9%
o 3054
 
7.1%
s 2694
 
6.3%
l 1591
 
3.7%
u 1422
 
3.3%
Other values (125) 10346
24.1%
Uppercase Letter
ValueCountFrequency (%)
U 1203
 
10.9%
C 1070
 
9.7%
S 1069
 
9.7%
A 1029
 
9.3%
I 761
 
6.9%
L 576
 
5.2%
N 487
 
4.4%
E 465
 
4.2%
M 417
 
3.8%
K 384
 
3.5%
Other values (66) 3611
32.6%
Nonspacing Mark
ValueCountFrequency (%)
12
13.6%
̣ 9
10.2%
9
10.2%
7
 
8.0%
6
 
6.8%
6
 
6.8%
6
 
6.8%
̀ 5
 
5.7%
5
 
5.7%
4
 
4.5%
Other values (12) 19
21.6%
Other Punctuation
ValueCountFrequency (%)
, 303
51.4%
. 117
 
19.8%
/ 45
 
7.6%
& 30
 
5.1%
· 25
 
4.2%
' 24
 
4.1%
; 19
 
3.2%
" 13
 
2.2%
7
 
1.2%
: 4
 
0.7%
Other values (2) 3
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 39
25.7%
2 36
23.7%
7 24
15.8%
0 19
12.5%
3 12
 
7.9%
5 10
 
6.6%
8 7
 
4.6%
4 5
 
3.3%
Spacing Mark
ValueCountFrequency (%)
ि 8
29.6%
7
25.9%
6
22.2%
2
 
7.4%
2
 
7.4%
2
 
7.4%
Open Punctuation
ValueCountFrequency (%)
( 1095
99.5%
5
 
0.5%
[ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1106
99.9%
] 1
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 404
99.8%
1
 
0.2%
Math Symbol
ValueCountFrequency (%)
< 1
50.0%
> 1
50.0%
Final Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
9894
100.0%
Initial Punctuation
ValueCountFrequency (%)
6
100.0%
Other Symbol
ValueCountFrequency (%)
5
100.0%
Format
ValueCountFrequency (%)
4
100.0%
Modifier Letter
ValueCountFrequency (%)
3
100.0%
Control
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 52249 | 43.3%
Hangul | 51774 | 42.9%
Common | 13264 | 11.0%
Cyrillic | 1584 | 1.3%
Han | 777 | 0.6%
Arabic | 234 | 0.2%
Thai | 220 | 0.2%
Hebrew | 217 | 0.2%
Armenian | 199 | 0.2%
Sinhala | 74 | 0.1%
Other values (5) | 209 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4250
 
8.2%
3128
 
6.0%
1856
 
3.6%
1833
 
3.5%
1644
 
3.2%
1244
 
2.4%
1168
 
2.3%
1033
 
2.0%
984
 
1.9%
972
 
1.9%
Other values (586) 33662
65.0%
Han
ValueCountFrequency (%)
68
 
8.8%
63
 
8.1%
37
 
4.8%
30
 
3.9%
24
 
3.1%
22
 
2.8%
20
 
2.6%
20
 
2.6%
20
 
2.6%
19
 
2.4%
Other values (150) 454
58.4%
Latin
ValueCountFrequency (%)
i 4612
 
8.8%
e 4494
 
8.6%
n 3997
 
7.6%
a 3933
 
7.5%
t 3467
 
6.6%
r 3401
 
6.5%
o 3054
 
5.8%
s 2694
 
5.2%
l 1591
 
3.0%
u 1422
 
2.7%
Other values (102) 19584
37.5%
Cyrillic
ValueCountFrequency (%)
е 151
 
9.5%
и 143
 
9.0%
т 116
 
7.3%
н 113
 
7.1%
о 112
 
7.1%
а 103
 
6.5%
с 101
 
6.4%
р 77
 
4.9%
в 71
 
4.5%
к 71
 
4.5%
Other values (45) 526
33.2%
Common
ValueCountFrequency (%)
9894
74.6%
) 1106
 
8.3%
( 1095
 
8.3%
- 404
 
3.0%
, 303
 
2.3%
. 117
 
0.9%
/ 45
 
0.3%
1 39
 
0.3%
2 36
 
0.3%
& 30
 
0.2%
Other values (25) 195
 
1.5%
Thai
ValueCountFrequency (%)
28
 
12.7%
20
 
9.1%
16
 
7.3%
12
 
5.5%
12
 
5.5%
12
 
5.5%
11
 
5.0%
9
 
4.1%
9
 
4.1%
8
 
3.6%
Other values (25) 83
37.7%
Armenian
ValueCountFrequency (%)
Ա 56
28.1%
Ե 17
 
8.5%
Ն 16
 
8.0%
Ր 12
 
6.0%
Կ 12
 
6.0%
Լ 8
 
4.0%
Վ 8
 
4.0%
Ի 8
 
4.0%
Տ 8
 
4.0%
Ս 8
 
4.0%
Other values (18) 46
23.1%
Devanagari
ValueCountFrequency (%)
ि 8
11.9%
7
 
10.4%
7
 
10.4%
6
 
9.0%
6
 
9.0%
5
 
7.5%
3
 
4.5%
2
 
3.0%
2
 
3.0%
2
 
3.0%
Other values (15) 19
28.4%
Arabic
ValueCountFrequency (%)
ا 48
20.5%
ل 32
13.7%
ة 20
8.5%
م 18
 
7.7%
ن 17
 
7.3%
ع 16
 
6.8%
ي 15
 
6.4%
س 13
 
5.6%
ج 9
 
3.8%
ش 7
 
3.0%
Other values (13) 39
16.7%
Lao
ValueCountFrequency (%)
9
19.6%
5
 
10.9%
4
 
8.7%
3
 
6.5%
3
 
6.5%
2
 
4.3%
2
 
4.3%
2
 
4.3%
2
 
4.3%
1
 
2.2%
Other values (13) 13
28.3%
Hebrew
ValueCountFrequency (%)
י 43
19.8%
ר 21
9.7%
ב 19
8.8%
ו 18
8.3%
א 18
8.3%
ה 13
 
6.0%
ס 12
 
5.5%
ת 11
 
5.1%
ט 11
 
5.1%
נ 11
 
5.1%
Other values (11) 40
18.4%
Sinhala
ValueCountFrequency (%)
12
16.2%
6
 
8.1%
6
 
8.1%
6
 
8.1%
6
 
8.1%
6
 
8.1%
4
 
5.4%
4
 
5.4%
4
 
5.4%
2
 
2.7%
Other values (9) 18
24.3%
Georgian
ValueCountFrequency (%)
13
25.5%
6
11.8%
6
11.8%
4
 
7.8%
3
 
5.9%
3
 
5.9%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (6) 7
13.7%
Katakana
ValueCountFrequency (%)
8
33.3%
6
25.0%
6
25.0%
2
 
8.3%
2
 
8.3%
Inherited
ValueCountFrequency (%)
̣ 9
42.9%
̀ 5
23.8%
4
19.0%
́ 3
 
14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 65049 | 53.8%
Hangul | 51769 | 42.9%
Cyrillic | 1584 | 1.3%
CJK | 777 | 0.6%
None | 347 | 0.3%
Arabic | 234 | 0.2%
Thai | 220 | 0.2%
Hebrew | 217 | 0.2%
Armenian | 199 | 0.2%
Latin Ext Additional | 91 | 0.1%
Other values (8) | 314 | 0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9894
15.2%
i 4612
 
7.1%
e 4494
 
6.9%
n 3997
 
6.1%
a 3933
 
6.0%
t 3467
 
5.3%
r 3401
 
5.2%
o 3054
 
4.7%
s 2694
 
4.1%
l 1591
 
2.4%
Other values (68) 23912
36.8%
Hangul
ValueCountFrequency (%)
4250
 
8.2%
3128
 
6.0%
1856
 
3.6%
1833
 
3.5%
1644
 
3.2%
1244
 
2.4%
1168
 
2.3%
1033
 
2.0%
984
 
1.9%
972
 
1.9%
Other values (585) 33657
65.0%
Cyrillic
ValueCountFrequency (%)
е 151
 
9.5%
и 143
 
9.0%
т 116
 
7.3%
н 113
 
7.1%
о 112
 
7.1%
а 103
 
6.5%
с 101
 
6.4%
р 77
 
4.9%
в 71
 
4.5%
к 71
 
4.5%
Other values (45) 526
33.2%
CJK
ValueCountFrequency (%)
68
 
8.8%
63
 
8.1%
37
 
4.8%
30
 
3.9%
24
 
3.1%
22
 
2.8%
20
 
2.6%
20
 
2.6%
20
 
2.6%
19
 
2.4%
Other values (150) 454
58.4%
Armenian
ValueCountFrequency (%)
Ա 56
28.1%
Ե 17
 
8.5%
Ն 16
 
8.0%
Ր 12
 
6.0%
Կ 12
 
6.0%
Լ 8
 
4.0%
Վ 8
 
4.0%
Ի 8
 
4.0%
Տ 8
 
4.0%
Ս 8
 
4.0%
Other values (18) 46
23.1%
Arabic
ValueCountFrequency (%)
ا 48
20.5%
ل 32
13.7%
ة 20
8.5%
م 18
 
7.7%
ن 17
 
7.3%
ع 16
 
6.8%
ي 15
 
6.4%
س 13
 
5.6%
ج 9
 
3.8%
ش 7
 
3.0%
Other values (13) 39
16.7%
Hebrew
ValueCountFrequency (%)
י 43
19.8%
ר 21
9.7%
ב 19
8.8%
ו 18
8.3%
א 18
8.3%
ה 13
 
6.0%
ס 12
 
5.5%
ת 11
 
5.1%
ט 11
 
5.1%
נ 11
 
5.1%
Other values (11) 40
18.4%
Thai
ValueCountFrequency (%)
28
 
12.7%
20
 
9.1%
16
 
7.3%
12
 
5.5%
12
 
5.5%
12
 
5.5%
11
 
5.0%
9
 
4.1%
9
 
4.1%
8
 
3.6%
Other values (25) 83
37.7%
None
ValueCountFrequency (%)
á 27
 
7.8%
Đ 27
 
7.8%
· 25
 
7.2%
ä 24
 
6.9%
é 23
 
6.6%
à 22
 
6.3%
ö 14
 
4.0%
š 12
 
3.5%
ü 11
 
3.2%
ư 9
 
2.6%
Other values (39) 153
44.1%
Latin Ext Additional
ValueCountFrequency (%)
25
27.5%
24
26.4%
7
 
7.7%
7
 
7.7%
6
 
6.6%
6
 
6.6%
4
 
4.4%
3
 
3.3%
3
 
3.3%
3
 
3.3%
Other values (2) 3
 
3.3%
Georgian
ValueCountFrequency (%)
13
25.5%
6
11.8%
6
11.8%
4
 
7.8%
3
 
5.9%
3
 
5.9%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (6) 7
13.7%
Sinhala
ValueCountFrequency (%)
12
16.2%
6
 
8.1%
6
 
8.1%
6
 
8.1%
6
 
8.1%
6
 
8.1%
4
 
5.4%
4
 
5.4%
4
 
5.4%
2
 
2.7%
Other values (9) 18
24.3%
Diacriticals
ValueCountFrequency (%)
̣ 9
52.9%
̀ 5
29.4%
́ 3
 
17.6%
Lao
ValueCountFrequency (%)
9
19.6%
5
 
10.9%
4
 
8.7%
3
 
6.5%
3
 
6.5%
2
 
4.3%
2
 
4.3%
2
 
4.3%
2
 
4.3%
1
 
2.2%
Other values (13) 13
28.3%
Devanagari
ValueCountFrequency (%)
ि 8
11.9%
7
 
10.4%
7
 
10.4%
6
 
9.0%
6
 
9.0%
5
 
7.5%
3
 
4.5%
2
 
3.0%
2
 
3.0%
2
 
3.0%
Other values (15) 19
28.4%
Katakana
ValueCountFrequency (%)
8
23.5%
7
20.6%
6
17.6%
6
17.6%
3
 
8.8%
2
 
5.9%
2
 
5.9%
IPA Ext
ValueCountFrequency (%)
ə 6
100.0%
Punctuation
ValueCountFrequency (%)
6
31.6%
5
26.3%
4
21.1%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%

사업연도
Real number (ℝ)

Distinct: 33
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.5251
Minimum: 1992
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1992
5-th percentile: 1996
Q1: 2006
Median: 2011
Q3: 2017
95-th percentile: 2021
Maximum: 2024
Range: 32
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 7.5389027
Coefficient of variation (CV): 0.0037497183
Kurtosis: -0.45246085
Mean: 2010.5251
Median Absolute Deviation (MAD): 5
Skewness: -0.52605535
Sum: 20105251
Variance: 56.835053
Monotonicity: Not monotonic
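
Several of the figures above are derived from one another, so they can be cross-checked by hand. The sketch below is only an illustration: it recomputes the range, IQR, coefficient of variation, variance, and sum from the values reported for 사업연도, assuming the 10000 observations stated in the overview and no missing values. The variable names are hypothetical and not part of the profiling tool.

```python
# Values copied from the statistics reported for 사업연도.
minimum, maximum = 1992, 2024
q1, q3 = 2006, 2017
mean = 2010.5251
std = 7.5389027
n = 10000  # assumption: number of observations from the report overview

value_range = maximum - minimum  # distance between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum implied by the mean when nothing is missing

print(value_range, iqr, cv, variance, round(total))
```

A coefficient of variation this small is expected for calendar years: the spread (about 7.5 years) is tiny relative to values near 2010.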
Histogram with fixed size bins (bins=33)
| Value | Count | Frequency (%) |
| 2019 | 622 | 6.2% |
| 2010 | 561 | 5.6% |
| 2011 | 550 | 5.5% |
| 2009 | 518 | 5.2% |
| 2008 | 496 | 5.0% |
| 2015 | 492 | 4.9% |
| 2018 | 479 | 4.8% |
| 2007 | 473 | 4.7% |
| 2012 | 471 | 4.7% |
| 2020 | 450 | 4.5% |
| Other values (23) | 4888 | 48.9% |
| Value | Count | Frequency (%) |
| 1992 | 105 | 1.1% |
| 1993 | 94 | 0.9% |
| 1994 | 123 | 1.2% |
| 1995 | 162 | 1.6% |
| 1996 | 159 | 1.6% |
| 1997 | 160 | 1.6% |
| 1998 | 109 | 1.1% |
| 1999 | 114 | 1.1% |
| 2000 | 167 | 1.7% |
| 2001 | 184 | 1.8% |
| Value | Count | Frequency (%) |
| 2024 | 1 | < 0.1% |
| 2023 | 1 | < 0.1% |
| 2022 | 250 | 2.5% |
| 2021 | 427 | 4.3% |
| 2020 | 450 | 4.5% |
| 2019 | 622 | 6.2% |
| 2018 | 479 | 4.8% |
| 2017 | 424 | 4.2% |
| 2016 | 407 | 4.1% |
| 2015 | 492 | 4.9% |

Interactions

[Figure: interactions plot]

Correlations

|  | 대륙명 | 사업유형코드 | 사업유형명 | 다년구분코드 | 다년구분코드명 | 사업연도 |
| 대륙명 | 1.000 | 0.365 | 0.365 | 0.220 | 0.220 | 0.183 |
| 사업유형코드 | 0.365 | 1.000 | 1.000 | 1.000 | 1.000 | 0.361 |
| 사업유형명 | 0.365 | 1.000 | 1.000 | 1.000 | 1.000 | 0.361 |
| 다년구분코드 | 0.220 | 1.000 | 1.000 | 1.000 | 1.000 | 0.411 |
| 다년구분코드명 | 0.220 | 1.000 | 1.000 | 1.000 | 1.000 | 0.411 |
| 사업연도 | 0.183 | 0.361 | 0.361 | 0.411 | 0.411 | 1.000 |
|  | 대륙명 | 사업유형명 | 다년구분코드명 | 다년구분코드 | 사업유형코드 |
| 대륙명 | 1.000 | 0.263 | 0.150 | 0.150 | 0.263 |
| 사업유형명 | 0.263 | 1.000 | 1.000 | 1.000 | 0.996 |
| 다년구분코드명 | 0.150 | 1.000 | 1.000 | 1.000 | 1.000 |
| 다년구분코드 | 0.150 | 1.000 | 1.000 | 1.000 | 1.000 |
| 사업유형코드 | 0.263 | 0.996 | 1.000 | 1.000 | 1.000 |
|  | 사업연도 | 대륙명 | 사업유형코드 | 사업유형명 | 다년구분코드 | 다년구분코드명 |
| 사업연도 | 1.000 | 0.100 | 0.278 | 0.278 | 0.184 | 0.184 |
| 대륙명 | 0.100 | 1.000 | 0.263 | 0.263 | 0.150 | 0.150 |
| 사업유형코드 | 0.278 | 0.263 | 1.000 | 0.996 | 1.000 | 1.000 |
| 사업유형명 | 0.278 | 0.263 | 0.996 | 1.000 | 1.000 | 1.000 |
| 다년구분코드 | 0.184 | 0.150 | 1.000 | 1.000 | 1.000 | 1.000 |
| 다년구분코드명 | 0.184 | 0.150 | 1.000 | 1.000 | 1.000 | 1.000 |

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
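
As a rough illustration of nullity correlation, the sketch below converts each column into a present/missing indicator and computes the Pearson correlation between two indicators. This is an assumption about how such a measure can be computed, not a statement of how the profiling tool implements it; the function name and the toy columns are hypothetical.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present/missing indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]  # 1 = present, 0 = missing
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing cell.
start_dates = ["2014-01-01", None, "2016-02-19", None, "1998-01-01", "2020-07-01"]
end_dates = ["2014-12-31", None, "2019-02-18", None, "1998-12-31", "2020-12-31"]
recipients = ["A", "B", None, "C", None, "D"]

same_pattern = nullity_correlation(start_dates, end_dates)
mixed_pattern = nullity_correlation(start_dates, recipients)
print(same_pattern, mixed_pattern)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.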

Sample

|  | 국가명 | 국가영문명 | iso 2자리코드 | 대륙명 | 사업유형코드 | 사업유형명 | 사업명(국문) | 사업명(영문) | 사업시작일 | 사업종료일 | 다년구분코드 | 다년구분코드명 | 수혜기관명 | 사업연도 |
| 2371 | 대한민국 | Korea | KR | 아시아 | 1 | KF | 아태안보협력이사회(CSCAP) | <NA> | 2014-01-01 | 2014-12-31 | S | 단년 | 아시아태평양 안보협력이사회 한국위원회 | 2014 |
| 11247 | 폴란드 | Poland | PL | 유럽 | 1 | KF | 2001년도 폴란드 바르샤바대 한국학 강좌 운영 | 2001 Establishment of Professorships Program | <NA> | <NA> | S | 단년 | 바르샤바대 | 2001 |
| 9294 | 인도네시아 | Indonesia | ID | 아시아 | 2 | KOICA | 인도네시아 파푸아 Boven Digoel 지역 의료서비스 개선 사업(2016-2018/2,817백만원/코린산업) | Medical Services Improvement of Local Community in Indonesia | 2016-02-19 | 2019-02-18 | MC | 다년계속 | PT.Tunas Sawaerma, ㈜코린산업 | 2019 |
| 5453 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 CFR | <NA> | 1998-01-01 | 1998-12-31 | S | 단년 | 주 미국 대한민국 대사관 | 1998 |
| 7052 | 미국 | United States of America | US | 북아메리카 | 1 | KF | [차세대] CCGA | <NA> | 2020-07-01 | 2020-12-31 | S | 단년 | <NA> | 2020 |
| 5386 | 미국 | United States of America | US | 북아메리카 | 1 | KF | [지정기부]하버드대학 법대 동아시아 법률 연구 기금 설치(효성중공업) | <NA> | 1996-01-01 | 1996-12-31 | S | 단년 | Harvard University | 1996 |
| 11280 | 폴란드 | Poland | PL | 유럽 | 1 | KF | 제3차 GPDNet 총회 | <NA> | 2016-06-23 | 2016-06-26 | S | 단년 | Yunus Emre Institute | 2016 |
| 10925 | 태국 | Thailand | TH | 아시아 | 1 | KF | 태국 실라파건대 한국어 객원교수파견 | <NA> | 2009-10-20 | 2011-10-19 | M | 다년 | 실라파건(실파콘)대 | 2009 |
| 1094 | 대한민국 | Korea | KR | 아시아 | 1 | KF | 한·중동간 새시대와 새협력 | <NA> | 2000-10-28 | 2000-10-29 | S | 단년 | 한국중동학회 | 2000 |
| 4508 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 LACMA 한국실내 소규모 전시 및 프로그램 지원 | <NA> | <NA> | <NA> | S | 단년 | LA카운티미술관 | 2011 |
|  | 국가명 | 국가영문명 | iso 2자리코드 | 대륙명 | 사업유형코드 | 사업유형명 | 사업명(국문) | 사업명(영문) | 사업시작일 | 사업종료일 | 다년구분코드 | 다년구분코드명 | 수혜기관명 | 사업연도 |
| 9154 | 인도 | India | IN | 아시아 | 1 | KF | 인도 델리대 한국어객원교수파견(김도영) | 2013 Visiting Professors Program | 2013-01-01 | 2013-12-31 | M | 다년 | 델리대학교 | 2013 |
| 3815 | 러시아 | Russia | RU | 유럽 | 1 | KF | 생페테르부르그대 극동역사학과 프로그램 운영 | <NA> | <NA> | <NA> | S | 단년 | 상트페테르부르크국립대 | 2001 |
| 279 | 대한민국 | Korea | KR | 아시아 | 1 | KF | 2010년도 뉴스레터 영문 11월 | 2010 Newsletter English November | <NA> | <NA> | S | 단년 | 와우이미지 | 2010 |
| 9501 | 일본 | Japan | JP | 아시아 | 1 | KF | [해외]일본교육자 큐슈대 한국학워크숍 | 2010 Korean Studies Workshop for Japan Secondary School Educators | 2010-07-26 | 2010-07-29 | S | 단년 | 九州大學韓國硏究センタ- | 2010 |
| 2905 | 대한민국 | Korea | KR | 아시아 | 1 | KF | [학술교육] 2018 '알쓸신아' 국가별강좌시리즈 | ACH Lecture Series "Useful and Mysterious ASEAN" | 2018-03-22 | 2018-12-20 | S | 단년 | 동아대학교 | 2018 |
| 2617 | 대한민국 | Korea | KR | 아시아 | 1 | KF | 한국광고홍보학회 | <NA> | 2016-01-01 | 2016-12-31 | S | 단년 | 사단법인 한국광고홍보학회 | 2016 |
| 1437 | 대한민국 | Korea | KR | 아시아 | 1 | KF | 베트남현대미술 특강 | <NA> | 2007-10-25 | 2007-10-25 | S | 단년 | <NA> | 2007 |
| 10421 | 카자흐스탄 | Kazakhstan | KZ | 아시아 | 1 | KF | [유라시아] 2015-17 카자흐스탄 카자흐국제관계및세계언어대 한국어 객원교수 파견(장호종) | <NA> | 2015-09-01 | 2016-02-29 | M | 다년 | 카자흐 국제관계세계언어대 | 2016 |
| 10769 | 코트디부아르 | Côte D'Ivoire | CI | 아프리카 | 1 | KF | [아중동] 2021-22 코트디부아르 펠릭스우푸에부아니대 한국학 객원교수 파견(선미라) | <NA> | 2022-01-01 | 2022-08-15 | M | 다년 | Universite Felix Houphouet Boigny | 2022 |
| 2840 | 대한민국 | Korea | KR | 아시아 | 1 | KF | [공통경비] 평가(심의) 자문 및 진행비, 국내출장비 등 | <NA> | 2018-01-01 | 2018-12-31 | S | 단년 | <NA> | 2018 |

Duplicate rows

Most frequently occurring

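|  | 국가명 | 국가영문명 | iso 2자리코드 | 대륙명 | 사업유형코드 | 사업유형명 | 사업명(국문) | 사업명(영문) | 사업시작일 | 사업종료일 | 다년구분코드 | 다년구분코드명 | 수혜기관명 | 사업연도 | # duplicates |
| 0 | 대한민국 | Korea | KR | 아시아 | 1 | KF | [공통경비] 법률자문료 | <NA> | 2019-01-01 | 2019-12-31 | S | 단년 | <NA> | 2019 | 2 |
| 1 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 CSIS | <NA> | 2002-01-01 | 2002-12-31 | S | 단년 | 미국 국제전략문제연구소 | 2002 | 2 |
| 2 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 CSIS | <NA> | 2002-01-01 | 2002-12-31 | S | 단년 | 주 미국 대한민국 대사관 | 2002 | 2 |
| 3 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 CSIS | <NA> | <NA> | <NA> | S | 단년 | 미국 국제전략문제연구소 | 2004 | 2 |
| 4 | 미국 | United States of America | US | 북아메리카 | 1 | KF | 미국 CSIS | <NA> | <NA> | <NA> | S | 단년 | 주 미국 대한민국 대사관 | 2004 | 2 |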
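
A duplicate-row listing like the one above can be approximated by counting identical records. The following is a minimal sketch under stated assumptions: rows are represented as tuples, missing cells as None, and two rows count as duplicates only when every field matches. The function name and the sample records are hypothetical and heavily shortened; the profiling tool's own procedure may differ.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return each fully identical row that occurs more than once, with its count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical shortened records: (country, project name, recipient, year).
records = [
    ("미국", "미국 CSIS", "미국 국제전략문제연구소", 2002),
    ("미국", "미국 CSIS", "미국 국제전략문제연구소", 2002),
    ("미국", "미국 CSIS", "주 미국 대한민국 대사관", 2002),
    ("대한민국", "[공통경비] 법률자문료", None, 2019),
    ("대한민국", "[공통경비] 법률자문료", None, 2019),
]

dupes = duplicate_rows(records)
for row, n in dupes.items():
    print(n, row)
```

Rows that differ in any single field, such as the recipient organization, are treated as distinct records.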