Overview

Dataset statistics

Number of variables: 3
Number of observations: 56
Missing cells: 33
Missing cells (%): 19.6%
Duplicate rows: 4
Duplicate rows (%): 7.1%
Total size in memory: 1.6 KiB
Average record size in memory: 28.4 B
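
The percentages above follow from the raw counts. As a minimal sketch (the variable and function names are illustrative, not part of the report), the arithmetic can be checked like this, assuming the missing-cell percentage is taken over all cells (variables × observations):

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 3
n_observations = 56
missing_cells = 33
duplicate_rows = 4


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations  # 168 cells in total
missing_cells_pct = percent(missing_cells, total_cells)
duplicate_rows_pct = percent(duplicate_rows, n_observations)
# The Alerts section attributes all 33 missing cells to a single column,
# so that column's own missing share is measured against 56 rows only.
missing_in_column_pct = percent(missing_cells, n_observations)

print(missing_cells_pct, duplicate_rows_pct, missing_in_column_pct)
```

The three results correspond to the 19.6% missing cells, the 7.1% duplicate rows, and the 58.9% missing values reported for the subsidy column.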

Variable types

Text: 1
Categorical: 1
Numeric: 1

Dataset

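Description: National treasury subsidies (unit: KRW) for the "Literary Events and Research" support program in the literature field, part of the Culture and Arts Promotion Fund (문예진흥기금) open-call projects for 2014, 2015, 2018, and 2019
Author: Arts Council Korea (한국문화예술위원회)
URL: https://www.data.go.kr/data/15076465/fileData.do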

Alerts

Dataset has 4 (7.1%) duplicate rows [Duplicates]
국고보조금(원) has 33 (58.9%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 21:17:48.689589
Analysis finished: 2023-12-12 21:17:49.013654
Duration: 0.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

문학단체명 (literary organization name)
Text

Distinct: 28
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 280
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
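
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on the masked names shown in the report's sample rows. It is not the profiler's own implementation. The `rough_script` helper is a hypothetical heuristic based on character names, because the standard library does not expose the Unicode Script property.

```python
import unicodedata
from collections import Counter

# Masked names as they appear in the report's sample rows.
sample = ["*국**회", "*색**원", "*린**회", "*랑**회", "*국**회"]

# Readable labels for the three general categories the report lists.
CATEGORY_LABELS = {
    "Po": "Other Punctuation",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_LABELS.get(code, code)] += 1
    return counts


def rough_script(ch):
    """Guess a script from the character name (heuristic, not the Script property)."""
    name = unicodedata.name(ch, "")
    if name.startswith("HANGUL"):
        return "Hangul"
    if name.startswith("LATIN"):
        return "Latin"
    return "Common"


counts = category_counts(sample)
scripts = Counter(rough_script(ch) for value in sample for ch in value)
print(counts)
print(scripts)
```

Each sampled name contains three asterisks and two Hangul syllables. That is consistent with the report's totals: 56 values of length 5 give 280 characters, of which 56 × 3 = 168 are asterisks.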

Unique

Unique: 19
Unique (%): 33.9%

Sample

1st row: *국**회
2nd row: *색**원
3rd row: *린**회
4th row: *랑**회
5th row: *국**회

Value               Count   Frequency (%)
국**회              20      35.7%
동**회              3       5.4%
린**회              2       3.6%
우**터              2       3.6%
디**원              2       3.6%
주**의              2       3.6%
국**관              2       3.6%
국**의              2       3.6%
b**회               2       3.6%
학**사              1       1.8%
Other values (18)   18      32.1%

Most occurring characters

Value               Count   Frequency (%)
*                   168     60.0%
[?]                 33      11.8%
[?]                 24      8.6%
[?]                 4       1.4%
[?]                 3       1.1%
[?]                 3       1.1%
[?]                 3       1.1%
[?]                 3       1.1%
[?]                 3       1.1%
[?]                 3       1.1%
Other values (26)   33      11.8%

Most occurring categories

Value               Count   Frequency (%)
Other Punctuation   168     60.0%
Other Letter        110     39.3%
Uppercase Letter    2       0.7%

Most frequent character per category

Other Letter

Value               Count   Frequency (%)
[?]                 33      30.0%
[?]                 24      21.8%
[?]                 4       3.6%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
Other values (24)   28      25.5%

Other Punctuation

Value               Count   Frequency (%)
*                   168     100.0%

Uppercase Letter

Value               Count   Frequency (%)
B                   2       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   168     60.0%
Hangul   110     39.3%
Latin    2       0.7%

Most frequent character per script

Hangul

Value               Count   Frequency (%)
[?]                 33      30.0%
[?]                 24      21.8%
[?]                 4       3.6%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
Other values (24)   28      25.5%

Common

Value               Count   Frequency (%)
*                   168     100.0%

Latin

Value               Count   Frequency (%)
B                   2       100.0%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    170     60.7%
Hangul   110     39.3%

Most frequent character per block

ASCII

Value               Count   Frequency (%)
*                   168     98.8%
B                   2       1.2%

Hangul

Value               Count   Frequency (%)
[?]                 33      30.0%
[?]                 24      21.8%
[?]                 4       3.6%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
[?]                 3       2.7%
Other values (24)   28      25.5%

사업연도 (project year)
Categorical

Distinct: 4
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Value   Count
2014    22
2018    12
2015    11
2019    11

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2014
2nd row: 2014
3rd row: 2014
4th row: 2014
5th row: 2014

Common Values

Value   Count   Frequency (%)
2014    22      39.3%
2018    12      21.4%
2015    11      19.6%
2019    11      19.6%
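
The frequency column is each count divided by the 56 observations. A minimal sketch with the standard library, assuming a column rebuilt from the reported counts (the original row order is not reproduced here):

```python
from collections import Counter

# Year counts as reported in the "Common Values" table.
reported = {"2014": 22, "2018": 12, "2015": 11, "2019": 11}

# Rebuild a column with those counts; row order is not preserved.
years = [year for year, n in reported.items() for _ in range(n)]

counts = Counter(years)
total = sum(counts.values())
# Share of each year, as a percentage rounded to one decimal place.
frequencies = {
    year: round(100 * n / total, 1) for year, n in counts.most_common()
}
print(total, frequencies)
```

Because every year occurs more than once, no value is unique, which matches the "Unique: 0" figure reported for this variable.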

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
2014    22      39.3%
2018    12      21.4%
2015    11      19.6%
2019    11      19.6%

국고보조금(원) (national treasury subsidy, KRW)
Real number (ℝ)

MISSING 

Distinct: 10
Distinct (%): 43.5%
Missing: 33
Missing (%): 58.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 8002522
Minimum: 6
Maximum: 12000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 6
5-th percentile: 1490000
Q1: 6000000
median: 8000000
Q3: 11000000
95-th percentile: 11995800
Maximum: 12000000
Range: 11999994
Interquartile range (IQR): 5000000

Descriptive statistics

Standard deviation: 3291003.7
Coefficient of variation (CV): 0.41124582
Kurtosis: 0.46275367
Mean: 8002522
Median Absolute Deviation (MAD): 2000000
Skewness: -0.81469836
Sum: 1.8405801 × 10^8
Variance: 1.0830706 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value       Count   Frequency (%)
6000000     5       8.9%
11000000    5       8.9%
9000000     3       5.4%
8000000     2       3.6%
12000000    2       3.6%
7000000     2       3.6%
11958000    1       1.8%
1100000     1       1.8%
6           1       1.8%
5000000     1       1.8%
(Missing)   33      58.9%

Minimum 10 values

Value       Count   Frequency (%)
6           1       1.8%
1100000     1       1.8%
5000000     1       1.8%
6000000     5       8.9%
7000000     2       3.6%
8000000     2       3.6%
9000000     3       5.4%
11000000    5       8.9%
11958000    1       1.8%
12000000    2       3.6%

Maximum 10 values

Value       Count   Frequency (%)
12000000    2       3.6%
11958000    1       1.8%
11000000    5       8.9%
9000000     3       5.4%
8000000     2       3.6%
7000000     2       3.6%
6000000     5       8.9%
5000000     1       1.8%
1100000     1       1.8%
6           1       1.8%
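
The value counts for this variable are enough to rebuild its 23 non-missing values and re-derive the quantile and descriptive statistics reported for it. The sketch below uses only Python's `statistics` module. It assumes linearly interpolated quantiles (the `inclusive` method) and a sample standard deviation with n - 1 in the denominator, both chosen because they reproduce the reported figures, not because the report states them.

```python
import statistics

# Value -> count for the 23 non-missing subsidies, from the value tables.
value_counts = {
    6: 1, 1_100_000: 1, 5_000_000: 1, 6_000_000: 5, 7_000_000: 2,
    8_000_000: 2, 9_000_000: 3, 11_000_000: 5, 11_958_000: 1, 12_000_000: 2,
}
values = sorted(v for v, n in value_counts.items() for _ in range(n))

mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)  # sample standard deviation (n - 1)

# Quartiles and the 5th/95th percentiles with linear interpolation.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
mad = statistics.median(abs(v - median) for v in values)
cv = stdev / mean  # coefficient of variation

print(len(values), sum(values), mean, median)
print(p5, q1, q3, p95, iqr)
print(round(stdev, 1), mad, round(cv, 8))
```

The minimum of 6 is several orders of magnitude below every other value, so it may be worth checking against the source file. Note that it still lies inside the usual 1.5 × IQR fence (6000000 - 7500000 is negative), so that rule alone would not flag it.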

Interactions


Correlations

                 문학단체명   사업연도   국고보조금(원)
문학단체명       1.000        0.000      0.000
사업연도         0.000        1.000      0.224
국고보조금(원)   0.000        0.224      1.000

                 국고보조금(원)   사업연도
국고보조금(원)   1.000            0.104
사업연도         0.104            1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     문학단체명   사업연도   국고보조금(원)
0    *국**회      2014       <NA>
1    *색**원      2014       <NA>
2    *린**회      2014       <NA>
3    *랑**회      2014       <NA>
4    *국**회      2014       <NA>
5    *오**촌      2014       <NA>
6    *서**요      2014       <NA>
7    *국**회      2014       <NA>
8    *우**터      2014       <NA>
9    *국**회      2014       <NA>

Last rows

     문학단체명   사업연도   국고보조금(원)
46   *국**회      2019       11000000
47   *B**회       2019       6
48   *림**회      2019       11000000
49   *주**의      2019       11000000
50   *학**실      2019       6000000
51   *린**대      2019       8000000
52   *동**회      2019       11000000
53   *국**의      2019       9000000
54   *디**원      2019       11000000
55   *지**션      2019       5000000

Duplicate rows

Most frequently occurring

     문학단체명   사업연도   국고보조금(원)   # duplicates
0    *국**회      2014       <NA>             7
1    *국**회      2015       <NA>             5
2    *국**회      2018       6000000          2
3    *국**회      2018       9000000          2
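
A table like this can be produced by grouping identical rows and counting how often each one occurs. The following is a minimal standard-library sketch on a handful of hypothetical rows, with `None` standing in for `<NA>`. It is not the profiler's own code, and a real data frame library may treat missing values differently when comparing rows.

```python
from collections import Counter

# Hypothetical rows (name, year, subsidy); None stands in for <NA>.
rows = [
    ("*국**회", 2014, None),
    ("*국**회", 2014, None),
    ("*색**원", 2014, None),
    ("*국**회", 2018, 6_000_000),
    ("*국**회", 2018, 6_000_000),
    ("*림**회", 2019, 11_000_000),
]


def duplicate_groups(records):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(records)
    return {row: n for row, n in counts.most_common() if n > 1}


groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

The report lists four duplicated row patterns, and 4 / 56 = 7.1% equals the "Duplicate rows (%)" figure in the overview. This suggests the overview counts distinct duplicated patterns, not every repeated occurrence.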