Overview

Dataset statistics

Number of variables: 5
Number of observations: 56
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 7.1%
Total size in memory: 2.5 KiB
Average record size in memory: 46.4 B

Variable types

Text: 1
Categorical: 3
Numeric: 1

Dataset

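Description: Publicity results (for example, press coverage, online publicity, and printed promotional materials) for the "Literary Events and Research" (문학행사 및 연구) support program in the literature category of the 2014, 2015, 2018, and 2019 Arts Promotion Fund (문예진흥기금) open-call projects.
Author: Arts Council Korea (한국문화예술위원회)
URL: https://www.data.go.kr/data/15076464/fileData.do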

Alerts

Dataset has 4 (7.1%) duplicate rows [Duplicates]
온라인홍보실적(건) is highly overall correlated with 홍보물홍보실적(건) [High correlation]
홍보물홍보실적(건) is highly overall correlated with 온라인홍보실적(건) [High correlation]
언론보도실적(건) has 25 (44.6%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 05:29:10.322517
Analysis finished: 2023-12-12 05:29:10.863003
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
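
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV file name, the report title, the output path, and the file encoding are all assumptions, and only the dataset metadata is taken from the Dataset section above.

```python
# Dataset metadata as listed in the "Dataset" section of this report.
DATASET_META = {
    "description": "Publicity results for the literature-category "
                   "support program (2014, 2015, 2018, 2019)",
    "author": "한국문화예술위원회",
    "url": "https://www.data.go.kr/data/15076464/fileData.do",
}


def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile the CSV at csv_path and write an HTML report to html_path."""
    # Imported lazily so this sketch loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a hypothetical choice, not the title of the original report.
    profile = ProfileReport(
        df, title="Literature publicity results", dataset=DATASET_META
    )
    profile.to_file(html_path)


# Hypothetical usage, assuming the file was downloaded from the URL above:
# build_report("literature_publicity.csv")
```

The timings in a regenerated report will differ from those listed here, and other ydata-profiling versions may produce slightly different sections.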

Variables

문학단체명
Text

Distinct: 28
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 280
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
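
As a rough illustration of these character properties, the sketch below counts Unicode general categories for two masked names of the form shown in this report, using Python's standard `unicodedata` module. It is not the profiler's own implementation. The two-letter codes correspond to the category names used in the tables below: `Po` is Other Punctuation, `Lo` is Other Letter, and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Masked organisation names as they appear in the Sample section.
for name in ("*국**회", "*B**회"):
    print(name, dict(category_counts(name)))
```

The standard library does not expose script or block properties, so reproducing those two breakdowns would need a third-party package.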

Unique

Unique: 19
Unique (%): 33.9%

Sample

1st row: *국**회
2nd row: *색**원
3rd row: *린**회
4th row: *랑**회
5th row: *국**회
Value               Count  Frequency (%)
국**회              20     35.7%
동**회              3      5.4%
린**회              2      3.6%
우**터              2      3.6%
디**원              2      3.6%
주**의              2      3.6%
국**관              2      3.6%
국**의              2      3.6%
b**회               2      3.6%
학**사              1      1.8%
Other values (18)   18     32.1%

Most occurring characters

Value               Count  Frequency (%)
*                   168    60.0%
                    33     11.8%
                    24     8.6%
                    4      1.4%
                    3      1.1%
                    3      1.1%
                    3      1.1%
                    3      1.1%
                    3      1.1%
                    3      1.1%
Other values (26)   33     11.8%

Most occurring categories

Value               Count  Frequency (%)
Other Punctuation   168    60.0%
Other Letter        110    39.3%
Uppercase Letter    2      0.7%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    33     30.0%
                    24     21.8%
                    4      3.6%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
Other values (24)   28     25.5%

Other Punctuation
Value               Count  Frequency (%)
*                   168    100.0%

Uppercase Letter
Value               Count  Frequency (%)
B                   2      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              168    60.0%
Hangul              110    39.3%
Latin               2      0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    33     30.0%
                    24     21.8%
                    4      3.6%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
Other values (24)   28     25.5%

Common
Value               Count  Frequency (%)
*                   168    100.0%

Latin
Value               Count  Frequency (%)
B                   2      100.0%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               170    60.7%
Hangul              110    39.3%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
*                   168    98.8%
B                   2      1.2%

Hangul
Value               Count  Frequency (%)
                    33     30.0%
                    24     21.8%
                    4      3.6%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
                    3      2.7%
Other values (24)   28     25.5%

사업연도
Categorical

Distinct: 4
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

2014: 22
2018: 12
2015: 11
2019: 11

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2014
2nd row: 2014
3rd row: 2014
4th row: 2014
5th row: 2014

Common Values

Value               Count  Frequency (%)
2014                22     39.3%
2018                12     21.4%
2015                11     19.6%
2019                11     19.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
2014                22     39.3%
2018                12     21.4%
2015                11     19.6%
2019                11     19.6%

언론보도실적(건)
Real number (ℝ)

ZEROS 

Distinct: 9
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6071429
Minimum: 0
Maximum: 11
Zeros: 25
Zeros (%): 44.6%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2.25
95-th percentile: 5.25
Maximum: 11
Range: 11
Interquartile range (IQR): 2.25

Descriptive statistics

Standard deviation: 2.2130076
Coefficient of variation (CV): 1.3769825
Kurtosis: 4.9823869
Mean: 1.6071429
Median Absolute Deviation (MAD): 1
Skewness: 1.9708613
Sum: 90
Variance: 4.8974026
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Values sorted by frequency
Value               Count  Frequency (%)
0                   25     44.6%
1                   11     19.6%
4                   6      10.7%
2                   6      10.7%
3                   3      5.4%
5                   2      3.6%
6                   1      1.8%
7                   1      1.8%
11                  1      1.8%

Values in ascending order
Value               Count  Frequency (%)
0                   25     44.6%
1                   11     19.6%
2                   6      10.7%
3                   3      5.4%
4                   6      10.7%
5                   2      3.6%
6                   1      1.8%
7                   1      1.8%
11                  1      1.8%

Values in descending order
Value               Count  Frequency (%)
11                  1      1.8%
7                   1      1.8%
6                   1      1.8%
5                   2      3.6%
4                   6      10.7%
3                   3      5.4%
2                   6      10.7%
1                   11     19.6%
0                   25     44.6%
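
The summary statistics for 언론보도실적(건) can be cross-checked from the frequency table alone. The sketch below rebuilds the 56 observations from the value counts and recomputes several figures with Python's standard `statistics` module. It assumes the reported variance uses the sample (n - 1) denominator and that the percentiles use linear interpolation between order statistics, both of which are consistent with the numbers above.

```python
import statistics

# Value -> count, copied from the frequency table for 언론보도실적(건).
FREQ = {0: 25, 1: 11, 2: 6, 3: 3, 4: 6, 5: 2, 6: 1, 7: 1, 11: 1}

# Expand the table back into a sorted list of observations.
values = sorted(v for v, n in FREQ.items() for _ in range(n))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
zeros_pct = 100 * FREQ[0] / len(values)

# method="inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

print(f"n={len(values)} sum={sum(values)} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.7f}")
print(f"Q1={q1} median={median} Q3={q3} p95={p95} zeros={zeros_pct:.1f}%")
```

Skewness and kurtosis are left out because the standard library has no functions for them.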

온라인홍보실적(건)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 4
Median length: 4
Mean length: 2.7678571
Min length: 1

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value               Count  Frequency (%)
<NA>                33     58.9%
0                   8      14.3%
1                   7      12.5%
2                   5      8.9%
4                   2      3.6%
5                   1      1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
<NA>                33     58.9%
0                   8      14.3%
1                   7      12.5%
2                   5      8.9%
4                   2      3.6%
5                   1      1.8%

홍보물홍보실적(건)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

<NA>: 33
1: 11
0: 10
2: 1
3: 1

Length

Max length: 4
Median length: 4
Mean length: 2.7678571
Min length: 1

Unique

Unique: 2
Unique (%): 3.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value               Count  Frequency (%)
<NA>                33     58.9%
1                   11     19.6%
0                   10     17.9%
2                   1      1.8%
3                   1      1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
<NA>                33     58.9%
1                   11     19.6%
0                   10     17.9%
2                   1      1.8%
3                   1      1.8%
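
Both 온라인홍보실적(건) and 홍보물홍보실적(건) are reported as Categorical with 0 missing values, yet 33 of their 56 cells hold the four-character text `<NA>`. The reported mean length of 2.7678571 is consistent with that placeholder being an ordinary string. The sketch below, built only from the Common Values table for 온라인홍보실적(건), shows one way such cells might be turned into real missing values before numeric analysis. The helper name `parse_count` is hypothetical.

```python
NA_TOKEN = "<NA>"  # literal placeholder text seen in the two count columns


def parse_count(cell):
    """Return the integer count, or None when the cell holds the placeholder."""
    cell = cell.strip()
    return None if cell == NA_TOKEN else int(cell)


# Value -> count, copied from the Common Values table for 온라인홍보실적(건).
ONLINE_FREQ = {"<NA>": 33, "0": 8, "1": 7, "2": 5, "4": 2, "5": 1}

cells = [value for value, n in ONLINE_FREQ.items() for _ in range(n)]
parsed = [parse_count(c) for c in cells]

missing = sum(p is None for p in parsed)
observed = [p for p in parsed if p is not None]
mean_length = sum(len(c) for c in cells) / len(cells)

print(f"missing: {missing}/{len(cells)} ({100 * missing / len(cells):.1f}%)")
print(f"mean string length: {mean_length:.7f}")
print(f"observed counts: n={len(observed)}, sum={sum(observed)}")
```

In pandas, `pd.to_numeric(column, errors="coerce")` would be a comparable step. After such a conversion the two columns would presumably be profiled as numeric with 58.9% missing rather than as categorical.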

Interactions

[Figure: interactions plot]

Correlations

                    문학단체명  사업연도  언론보도실적(건)  온라인홍보실적(건)  홍보물홍보실적(건)
문학단체명          1.000       0.000     0.630             0.840               0.886
사업연도            0.000       1.000     0.000             0.126               0.000
언론보도실적(건)    0.630       0.000     1.000             0.463               0.252
온라인홍보실적(건)  0.840       0.126     0.463             1.000               0.835
홍보물홍보실적(건)  0.886       0.000     0.252             0.835               1.000

                    홍보물홍보실적(건)  온라인홍보실적(건)  사업연도
홍보물홍보실적(건)  1.000               0.784               0.000
온라인홍보실적(건)  0.784               1.000               0.111
사업연도            0.000               0.111               1.000

                    언론보도실적(건)  사업연도  온라인홍보실적(건)  홍보물홍보실적(건)
언론보도실적(건)    1.000             0.000     0.310               0.114
사업연도            0.000             1.000     0.111               0.000
온라인홍보실적(건)  0.310             0.111     1.000               0.784
홍보물홍보실적(건)  0.114             0.000     0.784               1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
      문학단체명  사업연도  언론보도실적(건)  온라인홍보실적(건)  홍보물홍보실적(건)
0     *국**회     2014      0                 <NA>                <NA>
1     *색**원     2014      0                 <NA>                <NA>
2     *린**회     2014      0                 <NA>                <NA>
3     *랑**회     2014      0                 <NA>                <NA>
4     *국**회     2014      4                 <NA>                <NA>
5     *오**촌     2014      3                 <NA>                <NA>
6     *서**요     2014      0                 <NA>                <NA>
7     *국**회     2014      0                 <NA>                <NA>
8     *우**터     2014      6                 <NA>                <NA>
9     *국**회     2014      2                 <NA>                <NA>

Last rows
      문학단체명  사업연도  언론보도실적(건)  온라인홍보실적(건)  홍보물홍보실적(건)
46    *국**회     2019      1                 1                   1
47    *B**회      2019      1                 4                   2
48    *림**회     2019      0                 1                   1
49    *주**의     2019      5                 0                   0
50    *학**실     2019      1                 0                   0
51    *린**대     2019      0                 4                   1
52    *동**회     2019      11                2                   1
53    *국**의     2019      4                 2                   1
54    *디**원     2019      5                 5                   3
55    *지**션     2019      1                 0                   0

Duplicate rows

Most frequently occurring

      문학단체명  사업연도  언론보도실적(건)  온라인홍보실적(건)  홍보물홍보실적(건)  # duplicates
0     *국**회     2014      0                 <NA>                <NA>                4
1     *국**회     2015      0                 <NA>                <NA>                2
2     *국**회     2015      4                 <NA>                <NA>                2
3     *국**회     2018      1                 1                   1                   2
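
Duplicate groups like these can be found by counting identical rows. The sketch below applies that idea to the first ten rows from the Sample section only, so it finds just two of the four occurrences reported for the first group above. The helper name `duplicate_groups` is hypothetical.

```python
from collections import Counter

# First ten rows from the Sample section, as tuples of
# (문학단체명, 사업연도, 언론보도실적(건), 온라인홍보실적(건), 홍보물홍보실적(건)).
HEAD = [
    ("*국**회", "2014", 0, "<NA>", "<NA>"),
    ("*색**원", "2014", 0, "<NA>", "<NA>"),
    ("*린**회", "2014", 0, "<NA>", "<NA>"),
    ("*랑**회", "2014", 0, "<NA>", "<NA>"),
    ("*국**회", "2014", 4, "<NA>", "<NA>"),
    ("*오**촌", "2014", 3, "<NA>", "<NA>"),
    ("*서**요", "2014", 0, "<NA>", "<NA>"),
    ("*국**회", "2014", 0, "<NA>", "<NA>"),
    ("*우**터", "2014", 6, "<NA>", "<NA>"),
    ("*국**회", "2014", 2, "<NA>", "<NA>"),
]


def duplicate_groups(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


for row, n in duplicate_groups(HEAD).items():
    print(n, row)
```

Because the organisation names are masked, rows that look identical here may not refer to the same organisation, so dropping them as duplicates could discard real records.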