Overview

Dataset statistics

Number of variables: 5
Number of observations: 979
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 15
Duplicate rows (%): 1.5%
Total size in memory: 39.3 KiB
Average record size in memory: 41.1 B
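The percentage and per-record figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and rounding to one decimal place (the variable names are illustrative and do not come from the profiling tool):

```python
# Raw figures copied from the dataset statistics in this report.
n_rows = 979
n_duplicate_rows = 15
total_size_kib = 39.3

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = round(100 * n_duplicate_rows / n_rows, 1)

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}")
print(f"Average record size: {avg_record_bytes} B")
```

Both results agree with the reported 1.5% and 41.1 B.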

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

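Description: Provides information on the various culture and arts exchange programs run by the Korea Foundation, including project year, project category, and project name.
Author: Korea Foundation (한국국제교류재단)
URL: https://www.data.go.kr/data/15116500/fileData.do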

Alerts

Dataset has 15 (1.5%) duplicate rows [Duplicates]
사업유형 (project type) is highly imbalanced (79.7%) [Imbalance]
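The 79.7% figure is consistent with an entropy-based imbalance score: 1 minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. The sketch below is an assumption about how such a score can be computed, not a copy of the tool's code. The function name `imbalance_score` is hypothetical; the counts 948 and 31 are the 사업유형 class counts reported later in this document.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. A single-class column is treated as balanced.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return 1 - entropy / math.log2(n_classes)


# Class counts for 사업유형: 단년 (single-year) and 다년 (multi-year).
score = imbalance_score([948, 31])
print(f"Imbalance: {100 * score:.1f}%")
```

Under this definition a 50/50 split scores 0, so the reported value reflects how far the 948-to-31 split is from balance.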

Reproduction

Analysis started: 2024-03-14 13:24:47.245110
Analysis finished: 2024-03-14 13:24:48.458631
Duration: 1.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업연도 (project year)
Real number (ℝ)

Distinct: 32
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.2247
Minimum: 1992
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.7 KiB

Quantile statistics

Minimum: 1992
5th percentile: 2000
Q1: 2007
Median: 2009
Q3: 2010
95th percentile: 2020
Maximum: 2023
Range: 31
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 5.283822
Coefficient of variation (CV): 0.0026297815
Kurtosis: 1.7781652
Mean: 2009.2247
Median Absolute Deviation (MAD): 2
Skewness: 0.0064962871
Sum: 1967031
Variance: 27.918775
Monotonicity: Decreasing
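Several of these statistics are derived from others, so they can be cross-checked. The sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared), the sum (mean times the number of observations), the IQR, and the range from the figures reported for 사업연도. The variable names are illustrative. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the 사업연도 statistics in this report.
n = 979
mean = 2009.2247
std = 5.283822
q1, q3 = 2007, 2010
minimum, maximum = 1992, 2023

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
total = mean * n             # sum of all values
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.10f} variance={variance:.6f} sum={total:.0f}")
print(f"IQR={iqr} range={value_range}")
```

The very small CV reflects that a spread of about five years is tiny relative to calendar-year values near 2009, not that the projects cluster unusually tightly.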
Histogram with fixed size bins (bins=32)

Common values

Value              Count  Frequency (%)
2010               178    18.2%
2009               160    16.3%
2008               98     10.0%
2007               88     9.0%
2006               76     7.8%
2011               48     4.9%
2005               45     4.6%
2012               31     3.2%
2013               22     2.2%
2015               21     2.1%
Other values (22)  212    21.7%
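The "Other values" row can be reconstructed from the ten most common years together with the totals reported for the variable (979 observations, 32 distinct values). A minimal sketch of that bookkeeping, with illustrative variable names:

```python
# Ten most common 사업연도 values and their counts, as listed in this report.
top_counts = {
    2010: 178, 2009: 160, 2008: 98, 2007: 88, 2006: 76,
    2011: 48, 2005: 45, 2012: 31, 2013: 22, 2015: 21,
}
n_rows = 979
n_distinct = 32

# Everything outside the top ten is collapsed into one "Other values" row.
other_distinct = n_distinct - len(top_counts)
other_count = n_rows - sum(top_counts.values())
other_pct = round(100 * other_count / n_rows, 1)

print(f"Other values ({other_distinct}) {other_count} {other_pct}%")
```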

Minimum 10 values

Value  Count  Frequency (%)
1992   6      0.6%
1993   3      0.3%
1994   6      0.6%
1995   6      0.6%
1996   5      0.5%
1997   10     1.0%
1998   3      0.3%
1999   5      0.5%
2000   9      0.9%
2001   8      0.8%

Maximum 10 values

Value  Count  Frequency (%)
2023   15     1.5%
2022   17     1.7%
2021   10     1.0%
2020   8      0.8%
2019   15     1.5%
2018   18     1.8%
2017   18     1.8%
2016   19     1.9%
2015   21     2.1%
2014   1      0.1%

사업유형 (project type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.8 KiB

단년 (single-year): 948
다년 (multi-year): 31

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단년
2nd row: 단년
3rd row: 단년
4th row: 단년
5th row: 단년

Common Values

Value  Count  Frequency (%)
단년   948    96.8%
다년   31     3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
단년   948    96.8%
다년   31     3.2%

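사업분류 (project category)
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.8 KiB

세계문화교류 (global cultural exchange): 718
전략지역 문화예술행사 개최 (hosting culture and arts events in strategic regions): 260
코리아페스티벌 (Korea Festival): 1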

Length

Max length: 14
Median length: 6
Mean length: 8.1256384
Min length: 6

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 세계문화교류
2nd row: 세계문화교류
3rd row: 세계문화교류
4th row: 세계문화교류
5th row: 세계문화교류

Common Values

Value                       Count  Frequency (%)
세계문화교류                718    73.3%
전략지역 문화예술행사 개최  260    26.6%
코리아페스티벌              1      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value           Count  Frequency (%)
세계문화교류    718    47.9%
전략지역        260    17.3%
문화예술행사    260    17.3%
개최            260    17.3%
코리아페스티벌  1      0.1%
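The last frequency table for 사업분류 differs from the Common Values table because it counts whitespace-separated words rather than whole category values: 전략지역 문화예술행사 개최 contributes three words per row, which changes the denominator. The sketch below reproduces those percentages under the assumption that values are split on whitespace and weighted by their row counts. The function name `word_frequencies` is hypothetical.

```python
from collections import Counter

# Category values of 사업분류 and their row counts, as listed in this report.
category_counts = {
    "세계문화교류": 718,
    "전략지역 문화예술행사 개최": 260,
    "코리아페스티벌": 1,
}


def word_frequencies(value_counts):
    """Split each value on whitespace and weight every word by the row count."""
    words = Counter()
    for value, n_rows in value_counts.items():
        for word in value.split():
            words[word] += n_rows
    return words


words = word_frequencies(category_counts)
total_words = sum(words.values())
shares = {w: round(100 * n / total_words, 1) for w, n in words.items()}

for word, n in words.most_common():
    print(word, n, f"{shares[word]}%")
```

With 1,499 words in total, 세계문화교류 drops from 73.3% of rows to 47.9% of words, matching the table.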

프로젝트명 (project name)
Text

Distinct: 877
Distinct (%): 89.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.8 KiB

Length

Max length: 66
Median length: 42
Mean length: 17.727273
Min length: 5

Characters and Unicode

Total characters: 17355
Distinct characters: 650
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
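
As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories for the five sample project names shown in this section, using Python's standard `unicodedata` module. It is a rough analogue of the category tables in this report, not the profiling tool's own implementation, and the function name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# The five sample values of 프로젝트명 shown in this report.
samples = [
    "전략적문화교류증진사업",
    "체험관 K-컬처 전시",
    "<한글 헬베티카 서밋>",
    "2023 코리아워크숍",
    "<모두의 어떤 차이>",
]


def category_counts(texts):
    """Count characters by two-letter Unicode general category (Lo, Zs, ...)."""
    return Counter(unicodedata.category(ch) for text in texts for ch in text)


counts = category_counts(samples)
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Lo (Other Letter), spaces under Zs (Space Separator), and the angle brackets that wrap some titles under Sm (Math Symbol), which is why a Math Symbol row appears in the category tables for a column of project names.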

Unique

Unique: 827
Unique (%): 84.5%

Sample

1st row: 전략적문화교류증진사업
2nd row: 체험관 K-컬처 전시
3rd row: <한글 헬베티카 서밋>
4th row: 2023 코리아워크숍
5th row: <모두의 어떤 차이>
ValueCountFrequency (%)
118
 
3.1%
예술인 51
 
1.3%
재외 50
 
1.3%
한국어교실 47
 
1.2%
문화행사 46
 
1.2%
순회공연 44
 
1.1%
공연 39
 
1.0%
지원 38
 
1.0%
기념 33
 
0.9%
사진전 33
 
0.9%
Other values (1870) 3366
87.1%

Most occurring characters

ValueCountFrequency (%)
2963
 
17.1%
408
 
2.4%
402
 
2.3%
335
 
1.9%
255
 
1.5%
245
 
1.4%
0 237
 
1.4%
212
 
1.2%
210
 
1.2%
202
 
1.2%
Other values (640) 11886
68.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       11830  68.2%
Space Separator    2963   17.1%
Decimal Number     772    4.4%
Lowercase Letter   713    4.1%
Uppercase Letter   425    2.4%
Dash Punctuation   191    1.1%
Other Punctuation  191    1.1%
Math Symbol        121    0.7%
Open Punctuation   55     0.3%
Close Punctuation  55     0.3%
Other values (4)   39     0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
408
 
3.4%
402
 
3.4%
335
 
2.8%
255
 
2.2%
245
 
2.1%
212
 
1.8%
210
 
1.8%
202
 
1.7%
196
 
1.7%
176
 
1.5%
Other values (553) 9189
77.7%
Lowercase Letter
ValueCountFrequency (%)
e 113
15.8%
o 80
11.2%
a 72
10.1%
r 60
8.4%
t 57
 
8.0%
n 44
 
6.2%
s 42
 
5.9%
i 41
 
5.8%
l 38
 
5.3%
h 20
 
2.8%
Other values (15) 146
20.5%
Uppercase Letter
ValueCountFrequency (%)
K 66
15.5%
F 62
14.6%
I 49
11.5%
A 37
 
8.7%
C 24
 
5.6%
S 23
 
5.4%
E 21
 
4.9%
N 19
 
4.5%
T 18
 
4.2%
R 13
 
3.1%
Other values (14) 93
21.9%
Decimal Number
ValueCountFrequency (%)
0 237
30.7%
2 182
23.6%
1 146
18.9%
5 41
 
5.3%
9 37
 
4.8%
3 36
 
4.7%
6 29
 
3.8%
8 27
 
3.5%
7 19
 
2.5%
4 18
 
2.3%
Other Punctuation
ValueCountFrequency (%)
, 59
30.9%
' 38
19.9%
: 37
19.4%
" 27
14.1%
& 11
 
5.8%
. 6
 
3.1%
· 5
 
2.6%
/ 5
 
2.6%
? 2
 
1.0%
! 1
 
0.5%
Math Symbol
ValueCountFrequency (%)
> 60
49.6%
< 60
49.6%
+ 1
 
0.8%
Open Punctuation
ValueCountFrequency (%)
30
54.5%
( 18
32.7%
7
 
12.7%
Close Punctuation
ValueCountFrequency (%)
30
54.5%
) 18
32.7%
7
 
12.7%
Initial Punctuation
ValueCountFrequency (%)
10
58.8%
7
41.2%
Final Punctuation
ValueCountFrequency (%)
9
56.2%
7
43.8%
Letter Number
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
2963
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 191
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  11818  68.1%
Common  4383   25.3%
Latin   1142   6.6%
Han     12     0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
408
 
3.5%
402
 
3.4%
335
 
2.8%
255
 
2.2%
245
 
2.1%
212
 
1.8%
210
 
1.8%
202
 
1.7%
196
 
1.7%
176
 
1.5%
Other values (546) 9177
77.7%
Latin
ValueCountFrequency (%)
e 113
 
9.9%
o 80
 
7.0%
a 72
 
6.3%
K 66
 
5.8%
F 62
 
5.4%
r 60
 
5.3%
t 57
 
5.0%
I 49
 
4.3%
n 44
 
3.9%
s 42
 
3.7%
Other values (41) 497
43.5%
Common
ValueCountFrequency (%)
2963
67.6%
0 237
 
5.4%
- 191
 
4.4%
2 182
 
4.2%
1 146
 
3.3%
> 60
 
1.4%
< 60
 
1.4%
, 59
 
1.3%
5 41
 
0.9%
' 38
 
0.9%
Other values (26) 406
 
9.3%
Han
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%

Most occurring blocks

Value         Count  Frequency (%)
Hangul        11812  68.1%
ASCII         5409   31.2%
None          79     0.5%
Punctuation   33     0.2%
CJK           12     0.1%
Compat Jamo   6      < 0.1%
Number Forms  4      < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2963
54.8%
0 237
 
4.4%
- 191
 
3.5%
2 182
 
3.4%
1 146
 
2.7%
e 113
 
2.1%
o 80
 
1.5%
a 72
 
1.3%
K 66
 
1.2%
F 62
 
1.1%
Other values (66) 1297
24.0%
Hangul
ValueCountFrequency (%)
408
 
3.5%
402
 
3.4%
335
 
2.8%
255
 
2.2%
245
 
2.1%
212
 
1.8%
210
 
1.8%
202
 
1.7%
196
 
1.7%
176
 
1.5%
Other values (545) 9171
77.6%
None
ValueCountFrequency (%)
30
38.0%
30
38.0%
7
 
8.9%
7
 
8.9%
· 5
 
6.3%
Punctuation
ValueCountFrequency (%)
10
30.3%
9
27.3%
7
21.2%
7
21.2%
Compat Jamo
ValueCountFrequency (%)
6
100.0%
CJK
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
Number Forms
ValueCountFrequency (%)
3
75.0%
1
 
25.0%

사업국가 (project country)
Text

Distinct: 76
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.8 KiB

Length

Max length: 13
Median length: 4
Mean length: 3.6271706
Min length: 2

Characters and Unicode

Total characters: 3551
Distinct characters: 120
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 2.1%

Sample

1st row: 몽골 (Mongolia)
2nd row: 글로벌 (Global)
3rd row: 스위스 (Switzerland)
4th row: 대한민국 (Republic of Korea)
5th row: 캐나다 (Canada)
Value                         Count  Frequency (%)
대한민국 (Republic of Korea)  517    52.7%
다국가 (multiple countries)   207    21.1%
이탈리아 (Italy)              16     1.6%
브라질 (Brazil)               11     1.1%
독일 (Germany)                10     1.0%
멕시코 (Mexico)               9      0.9%
체코 (Czech Republic)         8      0.8%
인도 (India)                  8      0.8%
러시아 (Russia)               7      0.7%
폴란드 (Poland)               7      0.7%
Other values (68)             181    18.5%

Most occurring characters

ValueCountFrequency (%)
746
21.0%
518
14.6%
517
14.6%
517
14.6%
222
 
6.3%
210
 
5.9%
63
 
1.8%
43
 
1.2%
40
 
1.1%
35
 
1.0%
Other values (110) 640
18.0%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     3549   99.9%
Space Separator  2      0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
746
21.0%
518
14.6%
517
14.6%
517
14.6%
222
 
6.3%
210
 
5.9%
63
 
1.8%
43
 
1.2%
40
 
1.1%
35
 
1.0%
Other values (109) 638
18.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3549   99.9%
Common  2      0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
746
21.0%
518
14.6%
517
14.6%
517
14.6%
222
 
6.3%
210
 
5.9%
63
 
1.8%
43
 
1.2%
40
 
1.1%
35
 
1.0%
Other values (109) 638
18.0%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3549   99.9%
ASCII   2      0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
746
21.0%
518
14.6%
517
14.6%
517
14.6%
222
 
6.3%
210
 
5.9%
63
 
1.8%
43
 
1.2%
40
 
1.1%
35
 
1.0%
Other values (109) 638
18.0%
ASCII
ValueCountFrequency (%)
2
100.0%

Interactions


Correlations

Variable | 사업연도 | 사업유형 | 사업분류 | 사업국가
사업연도 | 1.000 | 0.363 | 0.431 | 0.582
사업유형 | 0.363 | 1.000 | 0.060 | 0.247
사업분류 | 0.431 | 0.060 | 1.000 | 0.135
사업국가 | 0.582 | 0.247 | 0.135 | 1.000

Variable | 사업유형 | 사업분류
사업유형 | 1.000 | 0.099
사업분류 | 0.099 | 1.000

Variable | 사업연도 | 사업유형 | 사업분류
사업연도 | 1.000 | 0.278 | 0.285
사업유형 | 0.278 | 1.000 | 0.099
사업분류 | 0.285 | 0.099 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 사업연도 | 사업유형 | 사업분류 | 프로젝트명 | 사업국가
0 | 2023 | 단년 | 세계문화교류 | 전략적문화교류증진사업 | 몽골
1 | 2023 | 단년 | 세계문화교류 | 체험관 K-컬처 전시 | 글로벌
2 | 2023 | 단년 | 세계문화교류 | <한글 헬베티카 서밋> | 스위스
3 | 2023 | 단년 | 세계문화교류 | 2023 코리아워크숍 | 대한민국
4 | 2023 | 단년 | 세계문화교류 | <모두의 어떤 차이> | 캐나다
5 | 2023 | 단년 | 세계문화교류 | <모두의 어떤 차이> | 캐나다
6 | 2023 | 단년 | 세계문화교류 | 체험관 개관전 <창백한 푸른 점> | 대한민국
7 | 2023 | 단년 | 세계문화교류 | 포르투갈-마법으로 지은 찰나 | 포르투갈
8 | 2023 | 단년 | 세계문화교류 | 태평양도서국 대상 전시 | 다국가
9 | 2023 | 단년 | 세계문화교류 | 런웨이 싱가포르 서울 순회전 | 싱가포르

Last rows

Index | 사업연도 | 사업유형 | 사업분류 | 프로젝트명 | 사업국가
969 | 1994 | 단년 | 전략지역 문화예술행사 개최 | 한국문화 공개강좌 | 대한민국
970 | 1993 | 단년 | 세계문화교류 | 대전 Expo 1993 국제민속축제 | 다국가
971 | 1993 | 단년 | 전략지역 문화예술행사 개최 | 한국문화소개 프로그램 | 대한민국
972 | 1993 | 단년 | 전략지역 문화예술행사 개최 | 한,베 수교 1주년 기념 창작&전통공연 | 다국가
973 | 1992 | 단년 | 세계문화교류 | 포르투갈 포스터전 지원 | 포르투갈
974 | 1992 | 단년 | 세계문화교류 | 호주현대미술전 개최 | 호주
975 | 1992 | 단년 | 세계문화교류 | 우스이 다이칸 사진전 "세계의 얼굴들" | 일본
976 | 1992 | 단년 | 전략지역 문화예술행사 개최 | 전통예술단 미국 순회공연 | 대한민국
977 | 1992 | 단년 | 전략지역 문화예술행사 개최 | 한국문화 공개 강좌 | 대한민국
978 | 1992 | 단년 | 전략지역 문화예술행사 개최 | 전통예술단 남미 순회공연전통예술단 남미 순회공연 | 대한민국

Duplicate rows

Most frequently occurring

Index | 사업연도 | 사업유형 | 사업분류 | 프로젝트명 | 사업국가 | # duplicates
7 | 2010 | 단년 | 전략지역 문화예술행사 개최 | 재외 예술인 참여 문화행사 | 대한민국 | 14
6 | 2010 | 단년 | 전략지역 문화예술행사 개최 | 재외 예술인 참여 문화행사 | 다국가 | 7
2 | 2006 | 단년 | 세계문화교류 | 아린(AHRIN) 월례포럼 | 대한민국 | 5
8 | 2010 | 단년 | 전략지역 문화예술행사 개최 | 재외 예술인 참여 문화행사 | 대한민국 | 4
9 | 2010 | 단년 | 전략지역 문화예술행사 개최 | 재외 예술인 활용 문화행사 | 다국가 | 3
0 | 2004 | 단년 | 전략지역 문화예술행사 개최 | 한국예술단 아시아 순회공연 | 대한민국 | 2
1 | 2005 | 단년 | 세계문화교류 | 아린(AHRIN) 월례포럼 | 대한민국 | 2
3 | 2006 | 단년 | 세계문화교류 | 음악은 나의 인생 | 다국가 | 2
4 | 2007 | 단년 | 전략지역 문화예술행사 개최 | 중국 Team Korea Project 행사 | 대한민국 | 2
5 | 2009 | 단년 | 전략지역 문화예술행사 개최 | 재외 예술인 활용 문화행사 | 대한민국 | 2
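
The duplicate counts listed here sum to 43, which is more than the 15 duplicate rows reported in the overview. This suggests, as an inference from the numbers rather than something the report states, that the overview counts distinct duplicated row patterns and this table shows only the ten most frequent. The sketch below groups identical rows with a `Counter`, using the first six rows of the Sample section as input. The function name `duplicate_groups` is hypothetical.

```python
from collections import Counter

# First six sample rows from this report:
# (사업연도, 사업유형, 사업분류, 프로젝트명, 사업국가)
rows = [
    (2023, "단년", "세계문화교류", "전략적문화교류증진사업", "몽골"),
    (2023, "단년", "세계문화교류", "체험관 K-컬처 전시", "글로벌"),
    (2023, "단년", "세계문화교류", "<한글 헬베티카 서밋>", "스위스"),
    (2023, "단년", "세계문화교류", "2023 코리아워크숍", "대한민국"),
    (2023, "단년", "세계문화교류", "<모두의 어떤 차이>", "캐나다"),
    (2023, "단년", "세계문화교류", "<모두의 어떤 차이>", "캐나다"),
]


def duplicate_groups(records):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}


groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Rows 4 and 5 of the sample are identical, so they form one duplicate group with a count of 2.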