gimi9 Pandas Profiling

Dataset statistics

Number of variables	14
Number of observations	10000
Missing cells	13603
Missing cells (%)	9.7%
Duplicate rows	5
Duplicate rows (%)	0.1%
Total size in memory	1.2 MiB
Average record size in memory	122.0 B
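
The overview figures can be cross-checked by hand. The sketch below sums the per-variable Missing counts listed in the variable sections of this report and divides by the number of cells. The dictionary is copied from those sections; variables that report 0 missing values are left out.

```python
# Per-variable "Missing" counts copied from the variable sections of this report.
missing_by_column = {
    "국가영문명": 33,
    "사업명(국문)": 32,
    "사업명(영문)": 6504,
    "사업시작일": 2697,
    "사업종료일": 2699,
    "수혜기관명": 1638,
}
n_rows, n_cols = 10_000, 14

# Missing cells = sum over columns; percentage is relative to rows * columns.
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# 5 duplicate rows out of 10000 is 0.05% before rounding (shown as 0.1% above).
duplicate_pct = 100 * 5 / n_rows

print(missing_cells)          # 13603
print(f"{missing_pct:.1f}%")  # 9.7%
```

Note that `대륙명` and `다년구분코드` list `<NA>` as an ordinary category with Missing 0, so those rows are not part of the 13603 missing cells.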

Variable types

Text	6
Categorical	5
DateTime	2
Numeric	1

Dataset

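Description	1. KOICA-ODA project information and KF public diplomacy project information list lookup: retrieves the list of KOICA-ODA and KF public diplomacy projects by Korean country name or ISO country code (see item 다, Reference 1, for ISO country codes) and by Korean project name.
Author	한국국제교류재단 (Korea Foundation)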
URL	https://www.data.go.kr/data/15099253/fileData.do

Alerts

Dataset has 5 (0.1%) duplicate rows	Duplicates
`사업유형명` is highly overall correlated with `사업유형코드` and 2 other fields	High correlation
`다년구분코드명` is highly overall correlated with `사업유형코드` and 2 other fields	High correlation
`다년구분코드` is highly overall correlated with `사업유형코드` and 2 other fields	High correlation
`사업유형코드` is highly overall correlated with `사업유형명` and 2 other fields	High correlation
`사업유형코드` is highly imbalanced (89.3%)	Imbalance
`사업유형명` is highly imbalanced (89.3%)	Imbalance
`다년구분코드` is highly imbalanced (60.4%)	Imbalance
`다년구분코드명` is highly imbalanced (60.4%)	Imbalance
`사업명(영문)` has 6504 (65.0%) missing values	Missing
`사업시작일` has 2697 (27.0%) missing values	Missing
`사업종료일` has 2699 (27.0%) missing values	Missing
`수혜기관명` has 1638 (16.4%) missing values	Missing
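
The imbalance percentages in the alerts above are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is an illustration under that assumption, not a copy of the profiler's code; the function name `imbalance` is hypothetical and the counts are taken from the Common Values tables of `사업유형코드` and `다년구분코드`.

```python
import math


def imbalance(counts):
    """Return 1 - H / log(k), with H the Shannon entropy of the class counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


type_code = imbalance([9859, 141])                    # 사업유형코드
multi_year = imbalance([7426, 2208, 228, 83, 54, 1])  # 다년구분코드

print(f"{type_code:.1%}", f"{multi_year:.1%}")  # 89.3% 60.4%
```

A score of 0% would mean perfectly balanced classes; values near 100% mean one class dominates.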

Reproduction

Analysis started	2023-12-12 22:47:29.132579
Analysis finished	2023-12-12 22:47:32.098713
Duration	2.97 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json
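
A report of this kind can be regenerated along the following lines. This is a minimal sketch: the CSV path, the encoding, and the helper name `build_report` are assumptions, and the settings stored in the original config.json are not reproduced here. It requires `pandas` and `ydata-profiling` to be installed.

```python
from pathlib import Path


def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so the helper can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is a guess; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="gimi9 Pandas Profiling")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("data.csv")` with a local copy of the file (the file name is hypothetical) would write `report.html` to the working directory.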

국가명
Text

Distinct	123
Distinct (%)	1.2%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	13
Median length	10
Mean length	3.0236
Min length	2
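
The mean length agrees with the Total characters figure divided by the number of non-missing rows. The sketch below checks this for every Text variable, using the totals and Missing counts reported in the respective variable sections; the helper name `mean_length` is hypothetical.

```python
def mean_length(total_characters, n_rows, n_missing):
    """Average string length over the non-missing values of a column."""
    return total_characters / (n_rows - n_missing)


N_ROWS = 10_000

# column -> (Total characters, Missing), as reported in this document
text_columns = {
    "국가명": (30236, 0),
    "국가영문명": (106370, 33),
    "사업명(국문)": (238549, 32),
    "사업명(영문)": (160298, 6504),
    "수혜기관명": (120801, 1638),
}

means = {
    name: mean_length(chars, N_ROWS, missing)
    for name, (chars, missing) in text_columns.items()
}
for name, value in means.items():
    print(f"{name}\t{value:.6f}")
```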

Characters and Unicode

Total characters	30236
Distinct characters	148
Distinct categories	4
Distinct scripts	3
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
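
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a handful of values that appear in this report (four `국가명` samples and one `대륙명` value). The standard library does not expose script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Values copied from this report; the selection is only for illustration.
values = ["대한민국", "폴란드", "인도네시아", "미국", "호주(오세아니아)"]

# Two-letter general category codes:
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for v in values for ch in v)

for code, count in categories.most_common():
    print(code, count)
```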

Unique

Unique	11
Unique (%)	0.1%

Sample

1st row	대한민국
2nd row	폴란드
3rd row	인도네시아
4th row	미국
5th row	미국

Value	Count	Frequency (%)
대한민국	2728	27.3%
미국	2435	24.3%
중국	545	5.5%
러시아	335	3.4%
일본	272	2.7%
독일	268	2.7%
베트남	242	2.4%
영국	226	2.3%
호주	159	1.6%
캐나다	157	1.6%
Other values (113)	2633	26.3%

Most occurring characters

Value	Count	Frequency (%)
국	6077	20.1%
민	2729	9.0%
대	2728	9.0%
한	2728	9.0%
미	2484	8.2%
아	1033	3.4%
스	687	2.3%
일	554	1.8%
중	545	1.8%
시	523	1.7%
Other values (138)	10148	33.6%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	30232	> 99.9%
Uppercase Letter	2	< 0.1%
Open Punctuation	1	< 0.1%
Close Punctuation	1	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
국	6077	20.1%
민	2729	9.0%
대	2728	9.0%
한	2728	9.0%
미	2484	8.2%
아	1033	3.4%
스	687	2.3%
일	554	1.8%
중	545	1.8%
시	523	1.7%
Other values (134)	10144	33.6%

Uppercase Letter

Value	Count	Frequency (%)
R	1	50.0%
D	1	50.0%

Open Punctuation

Value	Count	Frequency (%)
(	1	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	30232	> 99.9%
Common	2	< 0.1%
Latin	2	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
국	6077	20.1%
민	2729	9.0%
대	2728	9.0%
한	2728	9.0%
미	2484	8.2%
아	1033	3.4%
스	687	2.3%
일	554	1.8%
중	545	1.8%
시	523	1.7%
Other values (134)	10144	33.6%

Common

Value	Count	Frequency (%)
(	1	50.0%
)	1	50.0%

Latin

Value	Count	Frequency (%)
R	1	50.0%
D	1	50.0%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	30232	> 99.9%
ASCII	4	< 0.1%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
국	6077	20.1%
민	2729	9.0%
대	2728	9.0%
한	2728	9.0%
미	2484	8.2%
아	1033	3.4%
스	687	2.3%
일	554	1.8%
중	545	1.8%
시	523	1.7%
Other values (134)	10144	33.6%

ASCII

Value	Count	Frequency (%)
(	1	25.0%
)	1	25.0%
R	1	25.0%
D	1	25.0%

국가영문명
Text

Distinct	122
Distinct (%)	1.2%
Missing	33
Missing (%)	0.3%
Memory size	156.2 KiB

Length

Max length	28
Median length	26
Mean length	10.672218
Min length	3

Characters and Unicode

Total characters	106370
Distinct characters	55
Distinct categories	5
Distinct scripts	2
Distinct blocks	2

Unique

Unique	11
Unique (%)	0.1%

Sample

1st row	Korea
2nd row	Poland
3rd row	Indonesia
4th row	United States of America
5th row	United States of America

Value	Count	Frequency (%)
korea	2728	15.4%
united	2675	15.1%
of	2436	13.7%
states	2435	13.7%
america	2435	13.7%
china	545	3.1%
russia	335	1.9%
japan	272	1.5%
germany	268	1.5%
vietnam	242	1.4%
Other values (128)	3346	18.9%
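
The table above counts words rather than whole values: the entries are lowercase and a multi-word name such as "United States of America" contributes four rows. A minimal sketch of that kind of count, assuming lowercasing and whitespace splitting (the profiler's exact tokenization may differ), applied to the five sample rows:

```python
from collections import Counter

# The five sample rows of 국가영문명 shown above.
sample = [
    "Korea",
    "Poland",
    "Indonesia",
    "United States of America",
    "United States of America",
]

# Lowercase each value and split on whitespace before counting.
words = Counter(word for value in sample for word in value.lower().split())

for word, count in words.most_common(4):
    print(word, count)
```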

Most occurring characters

Value	Count	Frequency (%)
a	13404	12.6%
e	12065	11.3%
t	8482	8.0%
i	8304	7.8%
(space)	7750	7.3%
r	6658	6.3%
n	6151	5.8%
o	6134	5.8%
s	3900	3.7%
d	3764	3.5%
Other values (45)	29758	28.0%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	83257	78.3%
Uppercase Letter	15328	14.4%
Space Separator	7750	7.3%
Other Punctuation	32	< 0.1%
Dash Punctuation	3	< 0.1%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
a	13404	16.1%
e	12065	14.5%
t	8482	10.2%
i	8304	10.0%
r	6658	8.0%
n	6151	7.4%
o	6134	7.4%
s	3900	4.7%
d	3764	4.5%
m	3467	4.2%
Other values (17)	10928	13.1%

Uppercase Letter

Value	Count	Frequency (%)
K	3045	19.9%
A	2760	18.0%
U	2753	18.0%
S	2732	17.8%
C	901	5.9%
R	427	2.8%
I	418	2.7%
G	306	2.0%
T	302	2.0%
J	298	1.9%
Other values (13)	1386	9.0%

Other Punctuation

Value	Count	Frequency (%)
:	14	43.8%
'	13	40.6%
&	5	15.6%

Space Separator

Value	Count	Frequency (%)
(space)	7750	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	3	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	98585	92.7%
Common	7785	7.3%

Most frequent character per script

Latin

Value	Count	Frequency (%)
a	13404	13.6%
e	12065	12.2%
t	8482	8.6%
i	8304	8.4%
r	6658	6.8%
n	6151	6.2%
o	6134	6.2%
s	3900	4.0%
d	3764	3.8%
m	3467	3.5%
Other values (40)	26256	26.6%

Common

Value	Count	Frequency (%)
(space)	7750	99.6%
:	14	0.2%
'	13	0.2%
&	5	0.1%
-	3	< 0.1%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	106357	> 99.9%
None	13	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
a	13404	12.6%
e	12065	11.3%
t	8482	8.0%
i	8304	7.8%
(space)	7750	7.3%
r	6658	6.3%
n	6151	5.8%
o	6134	5.8%
s	3900	3.7%
d	3764	3.5%
Other values (44)	29745	28.0%

None

Value	Count	Frequency (%)
ô	13	100.0%

iso 2자리코드
Text

Distinct	123
Distinct (%)	1.2%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	2
Median length	2
Mean length	2
Min length	2

Characters and Unicode

Total characters	20000
Distinct characters	26
Distinct categories	1
Distinct scripts	1
Distinct blocks	1

Unique

Unique	11
Unique (%)	0.1%

Sample

1st row	KR
2nd row	PL
3rd row	ID
4th row	US
5th row	US

Value	Count	Frequency (%)
kr	2728	27.3%
us	2435	24.3%
cn	545	5.5%
ru	335	3.4%
jp	272	2.7%
de	268	2.7%
vn	242	2.4%
gb	226	2.3%
au	159	1.6%
ca	157	1.6%
Other values (113)	2633	26.3%

Most occurring characters

Value	Count	Frequency (%)
R	3498	17.5%
U	3044	15.2%
K	2945	14.7%
S	2674	13.4%
N	1099	5.5%
C	906	4.5%
E	619	3.1%
A	568	2.8%
T	466	2.3%
I	464	2.3%
Other values (16)	3717	18.6%

Most occurring categories

Value	Count	Frequency (%)
Uppercase Letter	20000	100.0%

Most frequent character per category

Uppercase Letter

Value	Count	Frequency (%)
R	3498	17.5%
U	3044	15.2%
K	2945	14.7%
S	2674	13.4%
N	1099	5.5%
C	906	4.5%
E	619	3.1%
A	568	2.8%
T	466	2.3%
I	464	2.3%
Other values (16)	3717	18.6%

Most occurring scripts

Value	Count	Frequency (%)
Latin	20000	100.0%

Most frequent character per script

Latin

Value	Count	Frequency (%)
R	3498	17.5%
U	3044	15.2%
K	2945	14.7%
S	2674	13.4%
N	1099	5.5%
C	906	4.5%
E	619	3.1%
A	568	2.8%
T	466	2.3%
I	464	2.3%
Other values (16)	3717	18.6%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	20000	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
R	3498	17.5%
U	3044	15.2%
K	2945	14.7%
S	2674	13.4%
N	1099	5.5%
C	906	4.5%
E	619	3.1%
A	568	2.8%
T	466	2.3%
I	464	2.3%
Other values (16)	3717	18.6%

대륙명
Categorical

Distinct	7
Distinct (%)	0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

아시아	4732
북아메리카	2777
유럽	1921
호주(오세아니아)	188
남아메리카	184
Other values (2)	198

Length

Max length	9
Median length	5
Mean length	3.5327
Min length	2

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	아시아
2nd row	유럽
3rd row	아시아
4th row	북아메리카
5th row	북아메리카

Common Values

Value	Count	Frequency (%)
아시아	4732	47.3%
북아메리카	2777	27.8%
유럽	1921	19.2%
호주(오세아니아)	188	1.9%
남아메리카	184	1.8%
아프리카	165	1.7%
<NA>	33	0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
아시아	4732	47.3%
북아메리카	2777	27.8%
유럽	1921	19.2%
호주(오세아니아)	188	1.9%
남아메리카	184	1.8%
아프리카	165	1.7%
na	33	0.3%

사업유형코드
Categorical

HIGH CORRELATION IMBALANCE

Distinct	2
Distinct (%)	< 0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

1	9859
2	141

Length

Max length	1
Median length	1
Mean length	1
Min length	1

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	1
2nd row	1
3rd row	2
4th row	1
5th row	1

Common Values

Value	Count	Frequency (%)
1	9859	98.6%
2	141	1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
1	9859	98.6%
2	141	1.4%

사업유형명
Categorical

HIGH CORRELATION IMBALANCE

Distinct	2
Distinct (%)	< 0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

KF	9859
KOICA	141

Length

Max length	5
Median length	2
Mean length	2.0423
Min length	2

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	KF
2nd row	KF
3rd row	KOICA
4th row	KF
5th row	KF

Common Values

Value	Count	Frequency (%)
KF	9859	98.6%
KOICA	141	1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
kf	9859	98.6%
koica	141	1.4%

사업명(국문)
Text

Distinct	7973
Distinct (%)	80.0%
Missing	32
Missing (%)	0.3%
Memory size	156.2 KiB

Length

Max length	125
Median length	92
Mean length	23.931481
Min length	3

Characters and Unicode

Total characters	238549
Distinct characters	919
Distinct categories	17
Distinct scripts	7
Distinct blocks	10

Unique

Unique	6915
Unique (%)	69.4%

Sample

1st row	아태안보협력이사회(CSCAP)
2nd row	2001년도 폴란드 바르샤바대 한국학 강좌 운영
3rd row	인도네시아 파푸아 Boven Digoel 지역 의료서비스 개선 사업(2016-2018/2,817백만원/코린산업)
4th row	미국 CFR
5th row	[차세대] CCGA

Value	Count	Frequency (%)
한국어	1129	2.6%
미국	1095	2.5%
한국학	873	2.0%
객원교수	836	1.9%
지원	814	1.9%
뉴스레터	366	0.8%
중국	305	0.7%
설치	281	0.6%
운영	280	0.6%
및	275	0.6%
Other values (9746)	37645	85.8%

Most occurring characters

Value	Count	Frequency (%)
(space)	34277	14.4%
국	7715	3.2%
한	5755	2.4%
2	5163	2.2%
0	5087	2.1%
대	4575	1.9%
원	3833	1.6%
1	3677	1.5%
교	3326	1.4%
[	3015	1.3%
Other values (909)	162126	68.0%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	138533	58.1%
Space Separator	34277	14.4%
Decimal Number	19732	8.3%
Lowercase Letter	19244	8.1%
Uppercase Letter	10476	4.4%
Close Punctuation	5654	2.4%
Open Punctuation	5652	2.4%
Dash Punctuation	2295	1.0%
Other Punctuation	1934	0.8%
Math Symbol	540	0.2%
Other values (7)	212	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
국	7715	5.6%
한	5755	4.2%
대	4575	3.3%
원	3833	2.8%
교	3326	2.4%
아	2763	2.0%
학	2684	1.9%
지	2661	1.9%
미	2375	1.7%
스	2188	1.6%
Other values (809)	100658	72.7%

Lowercase Letter

Value	Count	Frequency (%)
e	2309	12.0%
i	2079	10.8%
o	1812	9.4%
n	1758	9.1%
t	1657	8.6%
a	1639	8.5%
r	1455	7.6%
s	1229	6.4%
l	865	4.5%
c	612	3.2%
Other values (19)	3829	19.9%

Uppercase Letter

Value	Count	Frequency (%)
S	1232	11.8%
C	1044	10.0%
I	952	9.1%
A	949	9.1%
K	752	7.2%
U	671	6.4%
T	561	5.4%
F	561	5.4%
E	556	5.3%
P	507	4.8%
Other values (16)	2691	25.7%

Other Punctuation

Value	Count	Frequency (%)
/	928	48.0%
,	351	18.1%
.	168	8.7%
:	131	6.8%
'	120	6.2%
"	116	6.0%
·	55	2.8%
&	47	2.4%
?	11	0.6%
!	6	0.3%

Decimal Number

Value	Count	Frequency (%)
2	5163	26.2%
0	5087	25.8%
1	3677	18.6%
9	1600	8.1%
5	961	4.9%
3	723	3.7%
8	689	3.5%
6	666	3.4%
7	654	3.3%
4	512	2.6%

Open Punctuation

Value	Count	Frequency (%)
[	3015	53.3%
(	2596	45.9%
《	30	0.5%
「	11	0.2%

Close Punctuation

Value	Count	Frequency (%)
]	3015	53.3%
)	2598	45.9%
》	30	0.5%
」	11	0.2%

Math Symbol

Value	Count	Frequency (%)
<	265	49.1%
>	265	49.1%
~	6	1.1%
+	4	0.7%

Final Punctuation

Value	Count	Frequency (%)
’	17	54.8%
”	14	45.2%

Initial Punctuation

Value	Count	Frequency (%)
‘	16	51.6%
“	15	48.4%

Letter Number

Value	Count	Frequency (%)
Ⅱ	5	83.3%
Ⅲ	1	16.7%

Space Separator

Value	Count	Frequency (%)
(space)	34277	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	2295	100.0%

Control

Value	Count	Frequency (%)
(control character)	102	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	40	100.0%

Modifier Symbol

Value	Count	Frequency (%)
`	1	100.0%

Other Symbol

Value	Count	Frequency (%)
㈜	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	138510	58.1%
Common	70289	29.5%
Latin	29724	12.5%
Han	21	< 0.1%
Hiragana	3	< 0.1%
Cyrillic	1	< 0.1%
Greek	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
국	7715	5.6%
한	5755	4.2%
대	4575	3.3%
원	3833	2.8%
교	3326	2.4%
아	2763	2.0%
학	2684	1.9%
지	2661	1.9%
미	2375	1.7%
스	2188	1.6%
Other values (791)	100635	72.7%

Latin

Value	Count	Frequency (%)
e	2309	7.8%
i	2079	7.0%
o	1812	6.1%
n	1758	5.9%
t	1657	5.6%
a	1639	5.5%
r	1455	4.9%
S	1232	4.1%
s	1229	4.1%
C	1044	3.5%
Other values (45)	13510	45.5%

Common

Value	Count	Frequency (%)
(space)	34277	48.8%
2	5163	7.3%
0	5087	7.2%
1	3677	5.2%
[	3015	4.3%
]	3015	4.3%
)	2598	3.7%
(	2596	3.7%
-	2295	3.3%
9	1600	2.3%
Other values (32)	6966	9.9%

Han

Value	Count	Frequency (%)
展	5	23.8%
美	2	9.5%
化	1	4.8%
文	1	4.8%
濟	1	4.8%
百	1	4.8%
本	1	4.8%
日	1	4.8%
下	1	4.8%
和	1	4.8%
Other values (6)	6	28.6%

Hiragana

Value	Count	Frequency (%)
む	1	33.3%
を	1	33.3%
の	1	33.3%

Cyrillic

Value	Count	Frequency (%)
о	1	100.0%

Greek

Value	Count	Frequency (%)
ο	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	138500	58.1%
ASCII	99806	41.8%
None	140	0.1%
Punctuation	62	< 0.1%
CJK	21	< 0.1%
Compat Jamo	9	< 0.1%
Number Forms	6	< 0.1%
Hiragana	3	< 0.1%
Cyrillic	1	< 0.1%
Katakana	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	34277	34.3%
2	5163	5.2%
0	5087	5.1%
1	3677	3.7%
[	3015	3.0%
]	3015	3.0%
)	2598	2.6%
(	2596	2.6%
e	2309	2.3%
-	2295	2.3%
Other values (74)	35774	35.8%

Hangul

Value	Count	Frequency (%)
국	7715	5.6%
한	5755	4.2%
대	4575	3.3%
원	3833	2.8%
교	3326	2.4%
아	2763	2.0%
학	2684	1.9%
지	2661	1.9%
미	2375	1.7%
스	2188	1.6%
Other values (788)	100625	72.7%

None

Value	Count	Frequency (%)
·	55	39.3%
《	30	21.4%
》	30	21.4%
」	11	7.9%
「	11	7.9%
ο	1	0.7%
ô	1	0.7%
㈜	1	0.7%

Punctuation

Value	Count	Frequency (%)
’	17	27.4%
‘	16	25.8%
“	15	24.2%
”	14	22.6%

Compat Jamo

Value	Count	Frequency (%)
ㆍ	7	77.8%
ㅇ	2	22.2%

CJK

Value	Count	Frequency (%)
展	5	23.8%
美	2	9.5%
化	1	4.8%
文	1	4.8%
濟	1	4.8%
百	1	4.8%
本	1	4.8%
日	1	4.8%
下	1	4.8%
和	1	4.8%
Other values (6)	6	28.6%

Number Forms

Value	Count	Frequency (%)
Ⅱ	5	83.3%
Ⅲ	1	16.7%

Cyrillic

Value	Count	Frequency (%)
о	1	100.0%

Hiragana

Value	Count	Frequency (%)
む	1	33.3%
を	1	33.3%
の	1	33.3%

Katakana

Value	Count	Frequency (%)
・	1	100.0%

사업명(영문)
Text

MISSING

Distinct	2036
Distinct (%)	58.2%
Missing	6504
Missing (%)	65.0%
Memory size	156.2 KiB

Length

Max length	203
Median length	151
Mean length	45.851831
Min length	3

Characters and Unicode

Total characters	160298
Distinct characters	96
Distinct categories	14
Distinct scripts	2
Distinct blocks	5

Unique

Unique	1601
Unique (%)	45.8%

Sample

1st row	2001 Establishment of Professorships Program
2nd row	Medical Services Improvement of Local Community in Indonesia
3rd row	KAMS International Exchange Forum - [Edinburgh Expansion Strategy & Cases] Lecture
4th row	Concert celebrating 20th anniversary of the treaty of amity between Korea and Russia
5th row	Venezuela Writers Invitational Meeting

Value	Count	Frequency (%)
of	1391	6.2%
program	1094	4.9%
korean	736	3.3%
the	725	3.3%
for	456	2.0%
and	421	1.9%
visiting	383	1.7%
staff	357	1.6%
teaching	356	1.6%
employment	355	1.6%
Other values (2967)	16001	71.8%

Most occurring characters

Value	Count	Frequency (%)
(space)	18804	11.7%
e	12272	7.7%
o	11249	7.0%
r	10824	6.8%
a	9654	6.0%
n	8789	5.5%
i	8646	5.4%
t	8225	5.1%
s	8114	5.1%
l	4278	2.7%
Other values (86)	59443	37.1%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	111528	69.6%
Uppercase Letter	20130	12.6%
Space Separator	18804	11.7%
Decimal Number	7668	4.8%
Other Punctuation	973	0.6%
Dash Punctuation	577	0.4%
Open Punctuation	230	0.1%
Close Punctuation	226	0.1%
Math Symbol	59	< 0.1%
Final Punctuation	45	< 0.1%
Other values (4)	58	< 0.1%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
e	12272	11.0%
o	11249	10.1%
r	10824	9.7%
a	9654	8.7%
n	8789	7.9%
i	8646	7.8%
t	8225	7.4%
s	8114	7.3%
l	4278	3.8%
m	3754	3.4%
Other values (16)	25723	23.1%

Uppercase Letter

Value	Count	Frequency (%)
P	2535	12.6%
S	2019	10.0%
E	1825	9.1%
K	1554	7.7%
A	1296	6.4%
T	1233	6.1%
C	1193	5.9%
N	999	5.0%
R	908	4.5%
F	856	4.3%
Other values (16)	5712	28.4%

Other Punctuation

Value	Count	Frequency (%)
:	212	21.8%
,	201	20.7%
.	157	16.1%
'	148	15.2%
"	147	15.1%
&	75	7.7%
/	17	1.7%
;	9	0.9%
?	5	0.5%
#	1	0.1%

Decimal Number

Value	Count	Frequency (%)
0	2648	34.5%
2	1780	23.2%
1	1184	15.4%
9	838	10.9%
8	247	3.2%
7	244	3.2%
6	195	2.5%
3	193	2.5%
5	191	2.5%
4	148	1.9%

Math Symbol

Value	Count	Frequency (%)
<	25	42.4%
>	23	39.0%
|	6	10.2%
+	4	6.8%
∥	1	1.7%

Open Punctuation

Value	Count	Frequency (%)
(	129	56.1%
[	100	43.5%
《	1	0.4%

Close Punctuation

Value	Count	Frequency (%)
)	125	55.3%
]	100	44.2%
》	1	0.4%

Letter Number

Value	Count	Frequency (%)
Ⅰ	3	42.9%
Ⅲ	2	28.6%
Ⅱ	2	28.6%

Dash Punctuation

Value	Count	Frequency (%)
-	576	99.8%
–	1	0.2%

Initial Punctuation

Value	Count	Frequency (%)
“	28	75.7%
‘	9	24.3%

Final Punctuation

Value	Count	Frequency (%)
”	26	57.8%
’	19	42.2%

Space Separator

Value	Count	Frequency (%)
(space)	18804	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	13	100.0%

Modifier Symbol

Value	Count	Frequency (%)
`	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	131665	82.1%
Common	28633	17.9%

Most frequent character per script

Latin

Value	Count	Frequency (%)
e	12272	9.3%
o	11249	8.5%
r	10824	8.2%
a	9654	7.3%
n	8789	6.7%
i	8646	6.6%
t	8225	6.2%
s	8114	6.2%
l	4278	3.2%
m	3754	2.9%
Other values (45)	45860	34.8%

Common

Value	Count	Frequency (%)
(space)	18804	65.7%
0	2648	9.2%
2	1780	6.2%
1	1184	4.1%
9	838	2.9%
-	576	2.0%
8	247	0.9%
7	244	0.9%
:	212	0.7%
,	201	0.7%
Other values (31)	1899	6.6%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	160205	99.9%
Punctuation	83	0.1%
Number Forms	7	< 0.1%
None	2	< 0.1%
Math Operators	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	18804	11.7%
e	12272	7.7%
o	11249	7.0%
r	10824	6.8%
a	9654	6.0%
n	8789	5.5%
i	8646	5.4%
t	8225	5.1%
s	8114	5.1%
l	4278	2.7%
Other values (75)	59350	37.0%

Punctuation

Value	Count	Frequency (%)
“	28	33.7%
”	26	31.3%
’	19	22.9%
‘	9	10.8%
–	1	1.2%

Number Forms

Value	Count	Frequency (%)
Ⅰ	3	42.9%
Ⅲ	2	28.6%
Ⅱ	2	28.6%

None

Value	Count	Frequency (%)
《	1	50.0%
》	1	50.0%

Math Operators

Value	Count	Frequency (%)
∥	1	100.0%

사업시작일
Date

MISSING

Distinct	2126
Distinct (%)	29.1%
Missing	2697
Missing (%)	27.0%
Memory size	156.2 KiB

Minimum	1992-01-01 00:00:00
Maximum	2024-05-31 00:00:00

Histogram

Histogram with fixed size bins (bins=50)

사업종료일
Date

MISSING

Distinct	2120
Distinct (%)	29.0%
Missing	2699
Missing (%)	27.0%
Memory size	156.2 KiB

Minimum	1992-02-27 00:00:00
Maximum	2024-07-31 00:00:00

Histogram

Histogram with fixed size bins (bins=50)

다년구분코드
Categorical

HIGH CORRELATION IMBALANCE

Distinct	6
Distinct (%)	0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

S	7426
M	2208
<NA>	228
MC	83
MN	54

Length

Max length	4
Median length	1
Mean length	1.0822
Min length	1

Unique

Unique	1
Unique (%)	< 0.1%

Sample

1st row	S
2nd row	S
3rd row	MC
4th row	S
5th row	S

Common Values

Value	Count	Frequency (%)
S	7426	74.3%
M	2208	22.1%
<NA>	228	2.3%
MC	83	0.8%
MN	54	0.5%
SN	1	< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
s	7426	74.3%
m	2208	22.1%
na	228	2.3%
mc	83	0.8%
mn	54	0.5%
sn	1	< 0.1%

다년구분코드명
Categorical

HIGH CORRELATION IMBALANCE

Distinct	6
Distinct (%)	0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

단년	7426
다년	2208
<NA>	228
다년계속	83
다년신규	54

Length

Max length	4
Median length	2
Mean length	2.0732
Min length	2

Unique

Unique	1
Unique (%)	< 0.1%

Sample

1st row	단년
2nd row	단년
3rd row	다년계속
4th row	단년
5th row	단년

Common Values

Value	Count	Frequency (%)
단년	7426	74.3%
다년	2208	22.1%
<NA>	228	2.3%
다년계속	83	0.8%
다년신규	54	0.5%
단년신규	1	< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
단년	7426	74.3%
다년	2208	22.1%
na	228	2.3%
다년계속	83	0.8%
다년신규	54	0.5%
단년신규	1	< 0.1%

수혜기관명
Text

MISSING

Distinct	2965
Distinct (%)	35.5%
Missing	1638
Missing (%)	16.4%
Memory size	156.2 KiB

Length

Max length	100
Median length	87
Mean length	14.446424
Min length	2

Characters and Unicode

Total characters	120801
Distinct characters	1157
Distinct categories	18
Distinct scripts	15
Distinct blocks	18

Unique

Unique	1608
Unique (%)	19.2%

Sample

1st row	아시아태평양 안보협력이사회 한국위원회
2nd row	바르샤바대
3rd row	PT.Tunas Sawaerma, ㈜코린산업
4th row	주 미국 대한민국 대사관
5th row	Harvard University

Value	Count	Frequency (%)
of	555	3.0%
university	547	3.0%
한국국제교류재단	401	2.2%
주	355	2.0%
대사관	334	1.8%
대한민국	319	1.8%
and	195	1.1%
미국	173	1.0%
for	165	0.9%
studies	153	0.8%
Other values (3902)	15005	82.4%

Most occurring characters

Value	Count	Frequency (%)
(space)	9894	8.2%
i	4612	3.8%
e	4494	3.7%
대	4250	3.5%
n	3997	3.3%
a	3933	3.3%
t	3467	2.9%
r	3401	2.8%
국	3128	2.6%
o	3054	2.5%
Other values (1147)	76571	63.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	53330	44.1%
Lowercase Letter	43011	35.6%
Uppercase Letter	11072	9.2%
Space Separator	9894	8.2%
Close Punctuation	1107	0.9%
Open Punctuation	1101	0.9%
Other Punctuation	590	0.5%
Dash Punctuation	405	0.3%
Decimal Number	152	0.1%
Nonspacing Mark	88	0.1%
Other values (8)	51	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
대	4250	8.0%
국	3128	5.9%
한	1856	3.5%
학	1833	3.4%
교	1644	3.1%
아	1244	2.3%
사	1168	2.2%
스	1033	1.9%
관	984	1.8%
제	972	1.8%
Other values (871)	35218	66.0%

Lowercase Letter

Value	Count	Frequency (%)
i	4612	10.7%
e	4494	10.4%
n	3997	9.3%
a	3933	9.1%
t	3467	8.1%
r	3401	7.9%
o	3054	7.1%
s	2694	6.3%
l	1591	3.7%
u	1422	3.3%
Other values (125)	10346	24.1%

Uppercase Letter

Value	Count	Frequency (%)
U	1203	10.9%
C	1070	9.7%
S	1069	9.7%
A	1029	9.3%
I	761	6.9%
L	576	5.2%
N	487	4.4%
E	465	4.2%
M	417	3.8%
K	384	3.5%
Other values (66)	3611	32.6%

Nonspacing Mark

Value	Count	Frequency (%)
ั	12	13.6%
̣	9	10.2%
ิ	9	10.2%
्	7	8.0%
ි	6	6.8%
์	6	6.8%
්	6	6.8%
̀	5	5.7%
่	5	5.7%
ี	4	4.5%
Other values (12)	19	21.6%

Other Punctuation

Value	Count	Frequency (%)
,	303	51.4%
.	117	19.8%
/	45	7.6%
&	30	5.1%
·	25	4.2%
'	24	4.1%
;	19	3.2%
"	13	2.2%
・	7	1.2%
:	4	0.7%
Other values (2)	3	0.5%

Decimal Number

Value	Count	Frequency (%)
1	39	25.7%
2	36	23.7%
7	24	15.8%
0	19	12.5%
3	12	7.9%
5	10	6.6%
8	7	4.6%
4	5	3.3%

Spacing Mark

Value	Count	Frequency (%)
ि	8	29.6%
ा	7	25.9%
ා	6	22.2%
ी	2	7.4%
ැ	2	7.4%
ං	2	7.4%

Open Punctuation

Value	Count	Frequency (%)
(	1095	99.5%
„	5	0.5%
[	1	0.1%

Close Punctuation

Value	Count	Frequency (%)
)	1106	99.9%
]	1	0.1%

Dash Punctuation

Value	Count	Frequency (%)
-	404	99.8%
–	1	0.2%

Math Symbol

Value	Count	Frequency (%)
<	1	50.0%
>	1	50.0%

Final Punctuation

Value	Count	Frequency (%)
”	1	50.0%
’	1	50.0%

Space Separator

Value	Count	Frequency (%)
(space)	9894	100.0%

Initial Punctuation

Value	Count	Frequency (%)
“	6	100.0%

Other Symbol

Value	Count	Frequency (%)
㈜	5	100.0%

Format

Value	Count	Frequency (%)
‍	4	100.0%

Modifier Letter

Value	Count	Frequency (%)
ー	3	100.0%

Control

Value	Count	Frequency (%)
(control character)	2	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	52249	43.3%
Hangul	51774	42.9%
Common	13264	11.0%
Cyrillic	1584	1.3%
Han	777	0.6%
Arabic	234	0.2%
Thai	220	0.2%
Hebrew	217	0.2%
Armenian	199	0.2%
Sinhala	74	0.1%
Other values (5)	209	0.2%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
대	4250	8.2%
국	3128	6.0%
한	1856	3.6%
학	1833	3.5%
교	1644	3.2%
아	1244	2.4%
사	1168	2.3%
스	1033	2.0%
관	984	1.9%
제	972	1.9%
Other values (586)	33662	65.0%

Han

Value	Count	Frequency (%)
学	68	8.8%
大	63	8.1%
社	37	4.8%
国	30	3.9%
究	24	3.1%
學	22	2.8%
語	20	2.6%
京	20	2.6%
研	20	2.6%
出	19	2.4%
Other values (150)	454	58.4%

Latin

Value	Count	Frequency (%)
i	4612	8.8%
e	4494	8.6%
n	3997	7.6%
a	3933	7.5%
t	3467	6.6%
r	3401	6.5%
o	3054	5.8%
s	2694	5.2%
l	1591	3.0%
u	1422	2.7%
Other values (102)	19584	37.5%

Cyrillic

Value	Count	Frequency (%)
е	151	9.5%
и	143	9.0%
т	116	7.3%
н	113	7.1%
о	112	7.1%
а	103	6.5%
с	101	6.4%
р	77	4.9%
в	71	4.5%
к	71	4.5%
Other values (45)	526	33.2%

Common

Value	Count	Frequency (%)
(space)	9894	74.6%
)	1106	8.3%
(	1095	8.3%
-	404	3.0%
,	303	2.3%
.	117	0.9%
/	45	0.3%
1	39	0.3%
2	36	0.3%
&	30	0.2%
Other values (25)	195	1.5%

Thai

Value	Count	Frequency (%)
า	28	12.7%
ย	20	9.1%
ม	16	7.3%
ห	12	5.5%
ั	12	5.5%
ร	12	5.5%
ล	11	5.0%
ว	9	4.1%
ิ	9	4.1%
ท	8	3.6%
Other values (25)	83	37.7%

Armenian

Value	Count	Frequency (%)
Ա	56	28.1%
Ե	17	8.5%
Ն	16	8.0%
Ր	12	6.0%
Կ	12	6.0%
Լ	8	4.0%
Վ	8	4.0%
Ի	8	4.0%
Տ	8	4.0%
Ս	8	4.0%
Other values (18)	46	23.1%

Devanagari

Value	Count	Frequency (%)
ि	8	11.9%
ा	7	10.4%
्	7	10.4%
य	6	9.0%
व	6	9.0%
ल	5	7.5%
द	3	4.5%
ी	2	3.0%
ं	2	3.0%
क	2	3.0%
Other values (15)	19	28.4%

Arabic

Value	Count	Frequency (%)
ا	48	20.5%
ل	32	13.7%
ة	20	8.5%
م	18	7.7%
ن	17	7.3%
ع	16	6.8%
ي	15	6.4%
س	13	5.6%
ج	9	3.8%
ش	7	3.0%
Other values (13)	39	16.7%

Lao

Value	Count	Frequency (%)
າ	9	19.6%
ສ	5	10.9%
ະ	4	8.7%
ພ	3	6.5%
ນ	3	6.5%
ຸ	2	4.3%
ວ	2	4.3%
ົ	2	4.3%
ຫ	2	4.3%
ິ	1	2.2%
Other values (13)	13	28.3%

Hebrew

Value	Count	Frequency (%)
י	43	19.8%
ר	21	9.7%
ב	19	8.8%
ו	18	8.3%
א	18	8.3%
ה	13	6.0%
ס	12	5.5%
ת	11	5.1%
ט	11	5.1%
נ	11	5.1%
Other values (11)	40	18.4%

Sinhala

Value	Count	Frequency (%)
ය	12	16.2%
ා	6	8.1%
ව	6	8.1%
ි	6	8.1%
න	6	8.1%
්	6	8.1%
අ	4	5.4%
ශ	4	5.4%
ල	4	5.4%
ණ	2	2.7%
Other values (9)	18	24.3%

Georgian

Value	Count	Frequency (%)
ი	13	25.5%
ა	6	11.8%
ს	6	11.8%
ტ	4	7.8%
ე	3	5.9%
უ	3	5.9%
რ	3	5.9%
თ	2	3.9%
ც	2	3.9%
ლ	2	3.9%
Other values (6)	7	13.7%

Katakana

Value	Count	Frequency (%)
ン	8	33.3%
タ	6	25.0%
セ	6	25.0%
ク	2	8.3%
オ	2	8.3%

Inherited

Value	Count	Frequency (%)
̣	9	42.9%
̀	5	23.8%
‍	4	19.0%
́	3	14.3%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	65049	53.8%
Hangul	51769	42.9%
Cyrillic	1584	1.3%
CJK	777	0.6%
None	347	0.3%
Arabic	234	0.2%
Thai	220	0.2%
Hebrew	217	0.2%
Armenian	199	0.2%
Latin Ext Additional	91	0.1%
Other values (8)	314	0.3%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	9894	15.2%
i	4612	7.1%
e	4494	6.9%
n	3997	6.1%
a	3933	6.0%
t	3467	5.3%
r	3401	5.2%
o	3054	4.7%
s	2694	4.1%
l	1591	2.4%
Other values (68)	23912	36.8%

Hangul

Value	Count	Frequency (%)
대	4250	8.2%
국	3128	6.0%
한	1856	3.6%
학	1833	3.5%
교	1644	3.2%
아	1244	2.4%
사	1168	2.3%
스	1033	2.0%
관	984	1.9%
제	972	1.9%
Other values (585)	33657	65.0%

Cyrillic

Value	Count	Frequency (%)
е	151	9.5%
и	143	9.0%
т	116	7.3%
н	113	7.1%
о	112	7.1%
а	103	6.5%
с	101	6.4%
р	77	4.9%
в	71	4.5%
к	71	4.5%
Other values (45)	526	33.2%

CJK

Value	Count	Frequency (%)
学	68	8.8%
大	63	8.1%
社	37	4.8%
国	30	3.9%
究	24	3.1%
學	22	2.8%
語	20	2.6%
京	20	2.6%
研	20	2.6%
出	19	2.4%
Other values (150)	454	58.4%

Armenian

Value	Count	Frequency (%)
Ա	56	28.1%
Ե	17	8.5%
Ն	16	8.0%
Ր	12	6.0%
Կ	12	6.0%
Լ	8	4.0%
Վ	8	4.0%
Ի	8	4.0%
Տ	8	4.0%
Ս	8	4.0%
Other values (18)	46	23.1%

Arabic

Value	Count	Frequency (%)
ا	48	20.5%
ل	32	13.7%
ة	20	8.5%
م	18	7.7%
ن	17	7.3%
ع	16	6.8%
ي	15	6.4%
س	13	5.6%
ج	9	3.8%
ش	7	3.0%
Other values (13)	39	16.7%

Hebrew

Value	Count	Frequency (%)
י	43	19.8%
ר	21	9.7%
ב	19	8.8%
ו	18	8.3%
א	18	8.3%
ה	13	6.0%
ס	12	5.5%
ת	11	5.1%
ט	11	5.1%
נ	11	5.1%
Other values (11)	40	18.4%

Thai

Value	Count	Frequency (%)
า	28	12.7%
ย	20	9.1%
ม	16	7.3%
ห	12	5.5%
ั	12	5.5%
ร	12	5.5%
ล	11	5.0%
ว	9	4.1%
ิ	9	4.1%
ท	8	3.6%
Other values (25)	83	37.7%

None

Value	Count	Frequency (%)
á	27	7.8%
Đ	27	7.8%
·	25	7.2%
ä	24	6.9%
é	23	6.6%
à	22	6.3%
ö	14	4.0%
š	12	3.5%
ü	11	3.2%
ư	9	2.6%
Other values (39)	153	44.1%

Latin Ext Additional

Value	Count	Frequency (%)
ạ	25	27.5%
ọ	24	26.4%
ệ	7	7.7%
ộ	7	7.7%
ữ	6	6.6%
ờ	6	6.6%
ẵ	4	4.4%
ố	3	3.3%
ắ	3	3.3%
ứ	3	3.3%
Other values (2)	3	3.3%

Georgian

Value	Count	Frequency (%)
ი	13	25.5%
ა	6	11.8%
ს	6	11.8%
ტ	4	7.8%
ე	3	5.9%
უ	3	5.9%
რ	3	5.9%
თ	2	3.9%
ც	2	3.9%
ლ	2	3.9%
Other values (6)	7	13.7%

Sinhala

Value	Count	Frequency (%)
ය	12	16.2%
ා	6	8.1%
ව	6	8.1%
ි	6	8.1%
න	6	8.1%
්	6	8.1%
අ	4	5.4%
ශ	4	5.4%
ල	4	5.4%
ණ	2	2.7%
Other values (9)	18	24.3%

Diacriticals

Value	Count	Frequency (%)
̣	9	52.9%
̀	5	29.4%
́	3	17.6%

Lao

Value	Count	Frequency (%)
າ	9	19.6%
ສ	5	10.9%
ະ	4	8.7%
ພ	3	6.5%
ນ	3	6.5%
ຸ	2	4.3%
ວ	2	4.3%
ົ	2	4.3%
ຫ	2	4.3%
ິ	1	2.2%
Other values (13)	13	28.3%

Devanagari

Value	Count	Frequency (%)
ि	8	11.9%
ा	7	10.4%
्	7	10.4%
य	6	9.0%
व	6	9.0%
ल	5	7.5%
द	3	4.5%
ी	2	3.0%
ं	2	3.0%
क	2	3.0%
Other values (15)	19	28.4%

Katakana

Value	Count	Frequency (%)
ン	8	23.5%
・	7	20.6%
タ	6	17.6%
セ	6	17.6%
ー	3	8.8%
ク	2	5.9%
オ	2	5.9%

IPA Ext

Value	Count	Frequency (%)
ə	6	100.0%

Punctuation

Value	Count	Frequency (%)
“	6	31.6%
„	5	26.3%
‍	4	21.1%
…	1	5.3%
”	1	5.3%
’	1	5.3%
–	1	5.3%

사업연도
Real number (ℝ)

Distinct	33
Distinct (%)	0.3%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	2010.5251

Minimum	1992
Maximum	2024
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	166.0 KiB

Quantile statistics

Minimum	1992
5-th percentile	1996
Q1	2006
median	2011
Q3	2017
95-th percentile	2021
Maximum	2024
Range	32
Interquartile range (IQR)	11

Descriptive statistics

Standard deviation	7.5389027
Coefficient of variation (CV)	0.0037497183
Kurtosis	-0.45246085
Mean	2010.5251
Median Absolute Deviation (MAD)	5
Skewness	-0.52605535
Sum	20105251
Variance	56.835053
Monotonicity	Not monotonic
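
Several of the figures above follow from one another. The sketch below recomputes the derived statistics for `사업연도` from the reported mean, standard deviation, quartiles, and extremes; all inputs are copied from this section.

```python
n = 10_000
mean = 2010.5251
std = 7.5389027
q1, q3 = 2006, 2017
minimum, maximum = 1992, 2024

total = round(mean * n)          # Sum
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)

print(total, value_range, iqr)   # 20105251 32 11
print(f"{cv:.10f}", f"{variance:.4f}")
```

The range of 32 plus one equals the 33 distinct values, so every year from 1992 through 2024 occurs at least once.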

Histogram with fixed size bins (bins=33)

Value	Count	Frequency (%)
2019	622	6.2%
2010	561	5.6%
2011	550	5.5%
2009	518	5.2%
2008	496	5.0%
2015	492	4.9%
2018	479	4.8%
2007	473	4.7%
2012	471	4.7%
2020	450	4.5%
Other values (23)	4888	48.9%

Minimum 10 values

Value	Count	Frequency (%)
1992	105	1.1%
1993	94	0.9%
1994	123	1.2%
1995	162	1.6%
1996	159	1.6%
1997	160	1.6%
1998	109	1.1%
1999	114	1.1%
2000	167	1.7%
2001	184	1.8%

Maximum 10 values

Value	Count	Frequency (%)
2024	1	< 0.1%
2023	1	< 0.1%
2022	250	2.5%
2021	427	4.3%
2020	450	4.5%
2019	622	6.2%
2018	479	4.8%
2017	424	4.2%
2016	407	4.1%
2015	492	4.9%

Interactions

사업연도 × 사업연도 (plot not included)

Correlations

Correlation table (heatmap not included)

	대륙명	사업유형코드	사업유형명	다년구분코드	다년구분코드명	사업연도
대륙명	1.000	0.365	0.365	0.220	0.220	0.183
사업유형코드	0.365	1.000	1.000	1.000	1.000	0.361
사업유형명	0.365	1.000	1.000	1.000	1.000	0.361
다년구분코드	0.220	1.000	1.000	1.000	1.000	0.411
다년구분코드명	0.220	1.000	1.000	1.000	1.000	0.411
사업연도	0.183	0.361	0.361	0.411	0.411	1.000

Correlation table (heatmap not included)

	대륙명	사업유형명	다년구분코드명	다년구분코드	사업유형코드
대륙명	1.000	0.263	0.150	0.150	0.263
사업유형명	0.263	1.000	1.000	1.000	0.996
다년구분코드명	0.150	1.000	1.000	1.000	1.000
다년구분코드	0.150	1.000	1.000	1.000	1.000
사업유형코드	0.263	0.996	1.000	1.000	1.000

Correlation table (heatmap not included)

	사업연도	대륙명	사업유형코드	사업유형명	다년구분코드	다년구분코드명
사업연도	1.000	0.100	0.278	0.278	0.184	0.184
대륙명	0.100	1.000	0.263	0.263	0.150	0.150
사업유형코드	0.278	0.263	1.000	0.996	1.000	1.000
사업유형명	0.278	0.263	0.996	1.000	1.000	1.000
다년구분코드	0.184	0.150	1.000	1.000	1.000	1.000
다년구분코드명	0.184	0.150	1.000	1.000	1.000	1.000

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
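
As a rough illustration of nullity counts and nullity correlation, the sketch below marks each value as missing or present and computes the Pearson correlation between the resulting 0/1 indicators. It is an assumption that the report's heatmap is built this way; the helper names are hypothetical, and the three short columns are a hand-made excerpt in which `None` stands for `<NA>`.

```python
import math


def null_indicator(values):
    """Return 1 where a value is missing (None) and 0 where it is present."""
    return [1 if v is None else 0 for v in values]


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)


# Hand-made excerpt of three columns.
columns = {
    "사업시작일": ["2014-01-01", None, "2016-02-19", "1998-01-01", "2020-07-01"],
    "사업종료일": ["2014-12-31", None, "2019-02-18", "1998-12-31", "2020-12-31"],
    "수혜기관명": ["A", "B", "C", "D", None],
}

# Nullity by column: how many values are missing in each column.
missing_counts = {name: sum(null_indicator(vals)) for name, vals in columns.items()}

# Nullity correlation: do two columns tend to be missing in the same rows?
start_end = pearson(null_indicator(columns["사업시작일"]),
                    null_indicator(columns["사업종료일"]))
start_recipient = pearson(null_indicator(columns["사업시작일"]),
                          null_indicator(columns["수혜기관명"]))

print(missing_counts, start_end, start_recipient)
```

In the full dataset, 사업시작일 and 사업종료일 have nearly identical missing counts (2697 and 2699), so a strong positive nullity correlation between them would be expected.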

First rows

	국가명	국가영문명	iso 2자리코드	대륙명	사업유형코드	사업유형명	사업명(국문)	사업명(영문)	사업시작일	사업종료일	다년구분코드	다년구분코드명	수혜기관명	사업연도
2371	대한민국	Korea	KR	아시아	1	KF	아태안보협력이사회(CSCAP)	<NA>	2014-01-01	2014-12-31	S	단년	아시아태평양 안보협력이사회 한국위원회	2014
11247	폴란드	Poland	PL	유럽	1	KF	2001년도 폴란드 바르샤바대 한국학 강좌 운영	2001 Establishment of Professorships Program	<NA>	<NA>	S	단년	바르샤바대	2001
9294	인도네시아	Indonesia	ID	아시아	2	KOICA	인도네시아 파푸아 Boven Digoel 지역 의료서비스 개선 사업(2016-2018/2,817백만원/코린산업)	Medical Services Improvement of Local Community in Indonesia	2016-02-19	2019-02-18	MC	다년계속	PT.Tunas Sawaerma, ㈜코린산업	2019
5453	미국	United States of America	US	북아메리카	1	KF	미국 CFR	<NA>	1998-01-01	1998-12-31	S	단년	주 미국 대한민국 대사관	1998
7052	미국	United States of America	US	북아메리카	1	KF	[차세대] CCGA	<NA>	2020-07-01	2020-12-31	S	단년	<NA>	2020
5386	미국	United States of America	US	북아메리카	1	KF	[지정기부]하버드대학 법대 동아시아 법률 연구 기금 설치(효성중공업)	<NA>	1996-01-01	1996-12-31	S	단년	Harvard University	1996
11280	폴란드	Poland	PL	유럽	1	KF	제3차 GPDNet 총회	<NA>	2016-06-23	2016-06-26	S	단년	Yunus Emre Institute	2016
10925	태국	Thailand	TH	아시아	1	KF	태국 실라파건대 한국어 객원교수파견	<NA>	2009-10-20	2011-10-19	M	다년	실라파건(실파콘)대	2009
1094	대한민국	Korea	KR	아시아	1	KF	한·중동간 새시대와 새협력	<NA>	2000-10-28	2000-10-29	S	단년	한국중동학회	2000
4508	미국	United States of America	US	북아메리카	1	KF	미국 LACMA 한국실내 소규모 전시 및 프로그램 지원	<NA>	<NA>	<NA>	S	단년	LA카운티미술관	2011

Last rows

	국가명	국가영문명	iso 2자리코드	대륙명	사업유형코드	사업유형명	사업명(국문)	사업명(영문)	사업시작일	사업종료일	다년구분코드	다년구분코드명	수혜기관명	사업연도
9154	인도	India	IN	아시아	1	KF	인도 델리대 한국어객원교수파견(김도영)	2013 Visiting Professors Program	2013-01-01	2013-12-31	M	다년	델리대학교	2013
3815	러시아	Russia	RU	유럽	1	KF	생페테르부르그대 극동역사학과 프로그램 운영	<NA>	<NA>	<NA>	S	단년	상트페테르부르크국립대	2001
279	대한민국	Korea	KR	아시아	1	KF	2010년도 뉴스레터 영문 11월	2010 Newsletter English November	<NA>	<NA>	S	단년	와우이미지	2010
9501	일본	Japan	JP	아시아	1	KF	[해외]일본교육자 큐슈대 한국학워크숍	2010 Korean Studies Workshop for Japan Secondary School Educators	2010-07-26	2010-07-29	S	단년	九州大學韓國硏究センタ-	2010
2905	대한민국	Korea	KR	아시아	1	KF	[학술교육] 2018 '알쓸신아' 국가별강좌시리즈	ACH Lecture Series "Useful and Mysterious ASEAN"	2018-03-22	2018-12-20	S	단년	동아대학교	2018
2617	대한민국	Korea	KR	아시아	1	KF	한국광고홍보학회	<NA>	2016-01-01	2016-12-31	S	단년	사단법인 한국광고홍보학회	2016
1437	대한민국	Korea	KR	아시아	1	KF	베트남현대미술 특강	<NA>	2007-10-25	2007-10-25	S	단년	<NA>	2007
10421	카자흐스탄	Kazakhstan	KZ	아시아	1	KF	[유라시아] 2015-17 카자흐스탄 카자흐국제관계및세계언어대 한국어 객원교수 파견(장호종)	<NA>	2015-09-01	2016-02-29	M	다년	카자흐 국제관계세계언어대	2016
10769	코트디부아르	Côte D'Ivoire	CI	아프리카	1	KF	[아중동] 2021-22 코트디부아르 펠릭스우푸에부아니대 한국학 객원교수 파견(선미라)	<NA>	2022-01-01	2022-08-15	M	다년	Universite Felix Houphouet Boigny	2022
2840	대한민국	Korea	KR	아시아	1	KF	[공통경비] 평가(심의) 자문 및 진행비, 국내출장비 등	<NA>	2018-01-01	2018-12-31	S	단년	<NA>	2018

Most frequently occurring

	국가명	국가영문명	iso 2자리코드	대륙명	사업유형코드	사업유형명	사업명(국문)	사업명(영문)	사업시작일	사업종료일	다년구분코드	다년구분코드명	수혜기관명	사업연도	# duplicates
0	대한민국	Korea	KR	아시아	1	KF	[공통경비] 법률자문료	<NA>	2019-01-01	2019-12-31	S	단년	<NA>	2019	2
1	미국	United States of America	US	북아메리카	1	KF	미국 CSIS	<NA>	2002-01-01	2002-12-31	S	단년	미국 국제전략문제연구소	2002	2
2	미국	United States of America	US	북아메리카	1	KF	미국 CSIS	<NA>	2002-01-01	2002-12-31	S	단년	주 미국 대한민국 대사관	2002	2
3	미국	United States of America	US	북아메리카	1	KF	미국 CSIS	<NA>	<NA>	<NA>	S	단년	미국 국제전략문제연구소	2004	2
4	미국	United States of America	US	북아메리카	1	KF	미국 CSIS	<NA>	<NA>	<NA>	S	단년	주 미국 대한민국 대사관	2004	2
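
A duplicate count like the "# duplicates" column can be obtained by counting identical full rows. The sketch below is a minimal stdlib version on a hypothetical, shortened excerpt (only five of the fourteen columns are kept), not the profiling tool's own implementation. Two rows count as duplicates only when every field matches, which is why the CSIS rows with different 수혜기관명 values appear as separate entries in the table above.

```python
from collections import Counter

# Hypothetical excerpt: (국가명, 사업명(국문), 사업시작일, 수혜기관명, 사업연도).
rows = [
    ("미국", "미국 CSIS", "2002-01-01", "미국 국제전략문제연구소", 2002),
    ("미국", "미국 CSIS", "2002-01-01", "미국 국제전략문제연구소", 2002),
    ("미국", "미국 CSIS", "2002-01-01", "주 미국 대한민국 대사관", 2002),
    ("미국", "미국 CSIS", "2002-01-01", "주 미국 대한민국 대사관", 2002),
    ("대한민국", "한국광고홍보학회", "2016-01-01", "사단법인 한국광고홍보학회", 2016),
]

# Count how often each complete row occurs; tuples are hashable, so they
# can be used directly as Counter keys.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, count) for row, count in row_counts.most_common() if count > 1]

for row, count in duplicates:
    print(count, row)
```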
