Overview

Dataset statistics

Number of variables: 3
Number of observations: 181
Missing cells: 74
Missing cells (%): 13.6%
Duplicate rows: 12
Duplicate rows (%): 6.6%
Total size in memory: 4.7 KiB
Average record size in memory: 26.7 B
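
The percentages above follow from the raw counts. The sketch below recomputes them from the reported figures only; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 3
n_observations = 181
missing_cells = 74
duplicate_rows = 12
total_size_kib = 4.7  # already rounded in the report

total_cells = n_variables * n_observations  # every cell of the table
missing_cells_pct = 100 * missing_cells / total_cells
duplicate_rows_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%):   {missing_cells_pct:.1f}%")
print(f"Duplicate rows (%):  {duplicate_rows_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The recomputed record size comes out slightly below the reported 26.7 B, presumably because the 4.7 KiB total is itself a rounded figure.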

Variable types

Text: 1
Numeric: 2

Dataset

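Description: Status of national treasury subsidies for the "Literary Magazine Publication" (문예지발간) support program in the literature category of the 2014-2019 Culture and Arts Promotion Fund open-call projects (unit: KRW)
Author: Arts Council Korea (한국문화예술위원회)
URL: https://www.data.go.kr/data/15076419/fileData.do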

Alerts

Dataset has 12 (6.6%) duplicate rows [Duplicates]
국고보조금(원) (national treasury subsidy, KRW) has 74 (40.9%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 19:53:26.745128
Analysis finished: 2023-12-12 19:53:27.419399
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

문학단체명 (literary organization name)
Text

Distinct: 62
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 905
Distinct characters: 68
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
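
As a rough illustration of these character properties, the sketch below looks up the Unicode general category of every character in the five sample values shown in this report, using Python's standard `unicodedata` module. It is not the profiler's own code, and it covers only the sample rows, not the full column.

```python
import unicodedata
from collections import Counter

# The five sample values shown in this report for the text variable.
sample_values = ["*제**부", "*국**회", "*1**학", "*학**네", "*학**상"]

# unicodedata.category returns a two-letter general category code:
# "Po" = Other Punctuation, "Lo" = Other Letter, "Nd" = Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_values for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```

The three codes that appear correspond to the three categories the report lists: Other Punctuation, Other Letter, and Decimal Number.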

Unique

Unique: 19
Unique (%): 10.5%

Sample

1st row: *제**부
2nd row: *국**회
3rd row: *1**학
4th row: *학**네
5th row: *학**상
Value              Count  Frequency (%)
국**회             45     24.9%
대**학             8      4.4%
제**부             5      2.8%
학**네             4      2.2%
학**상             4      2.2%
비**비             4      2.2%
학**사             4      2.2%
음**음             4      2.2%
년**작             3      1.7%
학**당             3      1.7%
Other values (52)  97     53.6%

Most occurring characters

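Value              Count  Frequency (%)
*                  543    60.0%
[not preserved]    54     6.0%
[not preserved]    48     5.3%
[not preserved]    41     4.5%
[not preserved]    17     1.9%
[not preserved]    11     1.2%
[not preserved]    10     1.1%
[not preserved]    9      1.0%
[not preserved]    9      1.0%
[not preserved]    8      0.9%
Other values (58)  155    17.1%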

Most occurring categories

Value              Count  Frequency (%)
Other Punctuation  543    60.0%
Other Letter       360    39.8%
Decimal Number     2      0.2%

Most frequent character per category

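Other Letter
Value              Count  Frequency (%)
[not preserved]    54     15.0%
[not preserved]    48     13.3%
[not preserved]    41     11.4%
[not preserved]    17     4.7%
[not preserved]    11     3.1%
[not preserved]    10     2.8%
[not preserved]    9      2.5%
[not preserved]    9      2.5%
[not preserved]    8      2.2%
[not preserved]    7      1.9%
Other values (56)  146    40.6%

Other Punctuation
Value  Count  Frequency (%)
*      543    100.0%

Decimal Number
Value  Count  Frequency (%)
1      2      100.0%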

Most occurring scripts

Value   Count  Frequency (%)
Common  545    60.2%
Hangul  360    39.8%

Most frequent character per script

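Hangul
Value              Count  Frequency (%)
[not preserved]    54     15.0%
[not preserved]    48     13.3%
[not preserved]    41     11.4%
[not preserved]    17     4.7%
[not preserved]    11     3.1%
[not preserved]    10     2.8%
[not preserved]    9      2.5%
[not preserved]    9      2.5%
[not preserved]    8      2.2%
[not preserved]    7      1.9%
Other values (56)  146    40.6%

Common
Value  Count  Frequency (%)
*      543    99.6%
1      2      0.4%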

Most occurring blocks

Value   Count  Frequency (%)
ASCII   545    60.2%
Hangul  360    39.8%

Most frequent character per block

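ASCII
Value  Count  Frequency (%)
*      543    99.6%
1      2      0.4%

Hangul
Value              Count  Frequency (%)
[not preserved]    54     15.0%
[not preserved]    48     13.3%
[not preserved]    41     11.4%
[not preserved]    17     4.7%
[not preserved]    11     3.1%
[not preserved]    10     2.8%
[not preserved]    9      2.5%
[not preserved]    9      2.5%
[not preserved]    8      2.2%
[not preserved]    7      1.9%
Other values (56)  146    40.6%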

사업연도 (project year)
Real number (ℝ)

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.7624
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2014
Median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.0395867
Coefficient of variation (CV): 0.0010113173
Kurtosis: -1.5893876
Mean: 2016.7624
Median Absolute Deviation (MAD): 1
Skewness: -0.35284142
Sum: 365034
Variance: 4.1599141
Monotonicity: Increasing
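
Because 사업연도 has only six distinct values, most of these statistics can be recomputed from the value counts listed for this variable. The sketch below does so with Python's `statistics` module. It assumes the report uses the sample (n - 1) variance, which is consistent with the figures shown.

```python
import statistics

# Value counts for 사업연도, copied from this variable's frequency table.
year_counts = {2014: 51, 2015: 14, 2016: 6, 2017: 13, 2018: 50, 2019: 47}

# Expand the counts back into one value per observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

total = sum(years)
mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(years)      # sample standard deviation
cv = std_dev / mean                    # coefficient of variation
value_range = max(years) - min(years)

print(f"n={len(years)} sum={total} mean={mean:.4f} median={median}")
print(f"variance={variance:.7f} std={std_dev:.7f} cv={cv:.10f} range={value_range}")
```

The very small coefficient of variation is expected: the standard deviation of about two years is divided by a mean of roughly 2017, so the ratio says little about how the years are spread.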
Histogram with fixed size bins (bins=6)
Common values (most frequent first)
Value  Count  Frequency (%)
2014   51     28.2%
2018   50     27.6%
2019   47     26.0%
2015   14     7.7%
2017   13     7.2%
2016   6      3.3%

Extreme values (smallest first)
Value  Count  Frequency (%)
2014   51     28.2%
2015   14     7.7%
2016   6      3.3%
2017   13     7.2%
2018   50     27.6%
2019   47     26.0%

Extreme values (largest first)
Value  Count  Frequency (%)
2019   47     26.0%
2018   50     27.6%
2017   13     7.2%
2016   6      3.3%
2015   14     7.7%
2014   51     28.2%

국고보조금(원) (national treasury subsidy, KRW)
Real number (ℝ)

Alert: Missing

Distinct: 26
Distinct (%): 24.3%
Missing: 74
Missing (%): 40.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 16297196
Minimum: 800000
Maximum: 54000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 800000
5-th percentile: 2610000
Q1: 12000000
Median: 14000000
Q3: 18000000
95-th percentile: 40000000
Maximum: 54000000
Range: 53200000
Interquartile range (IQR): 6000000

Descriptive statistics

Standard deviation: 9872789.5
Coefficient of variation (CV): 0.60579681
Kurtosis: 3.3174966
Mean: 16297196
Median Absolute Deviation (MAD): 2000000
Skewness: 1.6458557
Sum: 1.7438 × 10^9
Variance: 9.7471973 × 10^13
Monotonicity: Not monotonic
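
Several of the figures for 국고보조금(원) are linked by simple arithmetic. The sketch below checks those relationships using only numbers reported for this variable. The variable names are illustrative, and because the inputs are rounded in the report, the results are approximate.

```python
# Figures copied from the report for 국고보조금(원).
n_observations = 181
n_missing = 74
n_distinct = 26
mean = 16_297_196
std_dev = 9_872_789.5
minimum, maximum = 800_000, 54_000_000
q1, q3 = 12_000_000, 18_000_000

n_present = n_observations - n_missing       # non-missing observations
distinct_pct = 100 * n_distinct / n_present  # share among non-missing values
missing_pct = 100 * n_missing / n_observations
approx_sum = mean * n_present                # mean is rounded, so approximate
variance = std_dev ** 2
cv = std_dev / mean                          # coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1

print(f"present={n_present} distinct%={distinct_pct:.1f} missing%={missing_pct:.1f}")
print(f"sum~{approx_sum:.4e} variance~{variance:.7e} cv={cv:.8f}")
print(f"range={value_range} iqr={iqr}")
```

The reported Distinct (%) of 24.3% matches 26 distinct values out of the 107 non-missing observations rather than out of all 181 rows, which suggests the percentage is taken over non-missing values only.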
Histogram with fixed size bins (bins=26)
Common values (most frequent first)
Value              Count  Frequency (%)
12000000           21     11.6%
14000000           18     9.9%
18000000           10     5.5%
16000000           9      5.0%
15000000           7      3.9%
8000000            6      3.3%
40000000           4      2.2%
24000000           4      2.2%
7000000            3      1.7%
20000000           3      1.7%
Other values (16)  22     12.2%
(Missing)          74     40.9%

Extreme values (smallest first)
Value     Count  Frequency (%)
800000    1      0.6%
1000000   1      0.6%
1400000   1      0.6%
1500000   1      0.6%
1800000   2      1.1%
4500000   1      0.6%
6000000   1      0.6%
7000000   3      1.7%
8000000   6      3.3%
9000000   2      1.1%

Extreme values (largest first)
Value     Count  Frequency (%)
54000000  1      0.6%
48000000  2      1.1%
40000000  4      2.2%
36000000  1      0.6%
32000000  3      1.7%
30000000  1      0.6%
27000000  1      0.6%
24000000  4      2.2%
23000000  1      0.6%
20000000  3      1.7%

Interactions

[Four interaction plots; images not preserved]

Correlations

                문학단체명  사업연도  국고보조금(원)
문학단체명      1.000       0.000     0.000
사업연도        0.000       1.000     0.299
국고보조금(원)  0.000       0.299     1.000

                사업연도  국고보조금(원)
사업연도        1.000     -0.132
국고보조금(원)  -0.132    1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     문학단체명  사업연도  국고보조금(원)
0    *제**부     2014      <NA>
1    *국**회     2014      <NA>
2    *1**학      2014      <NA>
3    *학**네     2014      <NA>
4    *학**상     2014      <NA>
5    *음**사     2014      <NA>
6    *천**학     2014      <NA>
7    *행**사     2014      <NA>
8    *년**작     2014      <NA>
9    *간**선     2014      <NA>

Last rows

     문학단체명  사업연도  국고보조금(원)
171  *서**망     2019      12000000
172  *와**시     2019      12000000
173  *국**회     2019      12000000
174  *시**아     2019      <NA>
175  *국**회     2019      23000000
176  *행**사     2019      18000000
177  *국**연     2019      32000000
178  *대**학     2019      40000000
179  *국**회     2019      16000000
180  *천**학     2019      1500000

Duplicate rows

Most frequently occurring

   문학단체명  사업연도  국고보조금(원)  # duplicates
0  *국**회     2014      <NA>            10
2  *국**회     2016      <NA>            5
3  *국**회     2017      12000000        4
6  *국**회     2018      14000000        4
8  *국**회     2019      12000000        4
4  *국**회     2017      24000000        3
1  *국**회     2015      <NA>            2
5  *국**회     2018      8000000         2
7  *국**회     2018      40000000        2
9  *국**회     2019      32000000        2
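
A table like this can be approximated by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below shows the idea on a few hypothetical toy rows. It assumes that "# duplicates" is the number of occurrences of each repeated row, which the report does not state explicitly, and it is not the profiler's own implementation.

```python
from collections import Counter

# Hypothetical toy rows (organization, year, subsidy); None stands for <NA>.
# They imitate the shape of the data and are for illustration only.
rows = [
    ("*국**회", 2014, None),
    ("*국**회", 2014, None),
    ("*국**회", 2014, None),
    ("*국**회", 2017, 12_000_000),
    ("*국**회", 2017, 12_000_000),
    ("*대**학", 2019, 40_000_000),
]

# Count how often each full row occurs, then keep rows seen more than once.
row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# List duplicated rows from most to least frequent.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
```

Masked names make this kind of check less reliable: different organizations can share the same masked form, so rows that look identical here may not refer to the same grant.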