Overview

Dataset statistics

Number of variables: 5
Number of observations: 1429
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 229
Duplicate rows (%): 16.0%
Total size in memory: 58.7 KiB
Average record size in memory: 42.1 B

Variable types

Text: 1
Numeric: 2
Categorical: 2

Dataset

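Description: Outstanding-work results for the "Literary Magazine Publication" (문예지발간) support program in the literature category of the 2014-2019 Arts Promotion Fund (문예진흥기금) open-call projects (e.g., author age group, genre, and number of outstanding works published or awarded)
Author: 한국문화예술위원회 (Arts Council Korea)
URL: https://www.data.go.kr/data/15076423/fileData.do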

Alerts

Dataset has 229 (16.0%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2023-12-12 07:08:29.688645
Analysis finished: 2023-12-12 07:08:30.587568
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
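
The report does not include the script that generated it. As a minimal sketch, a comparable report could be produced with ydata-profiling roughly as follows, assuming the CSV behind the URL above has been saved locally. The function name `build_report`, the file names, the report title, and the `cp949` default encoding are all assumptions for illustration and are not taken from the report.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the result as an HTML report.

    `csv_path`, `html_path`, and the default `encoding` are placeholders
    (assumptions); adjust them to the actual download.
    """
    # Imported inside the function so that it can be defined without the
    # third-party packages installed; both are required to actually run it.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return report
```

Calling `build_report("data.csv")` (a hypothetical file name) would write `report.html` next to the script. The exact numbers may differ from this report if the library version or configuration differs.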

Variables

문학단체명 (literary organization name)
Text

Distinct: 62
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 7145
Distinct characters: 68
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
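
To make the category and script counts concrete, here is a minimal sketch using Python's standard `unicodedata` module on two of the masked names from this variable's sample rows. The helper names `category_counts` and `is_hangul` are illustrative, and the name-based Hangul check is a rough stand-in for a real script lookup, which the standard library does not provide.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


def is_hangul(ch):
    """Rough script check based on the character's Unicode name."""
    return unicodedata.name(ch, "").startswith("HANGUL")


# Two of the masked names shown in the report's sample rows.
sample = ["*제**부", "*국**회"]
counts = category_counts(sample)
hangul_total = sum(is_hangul(ch) for value in sample for ch in value)

print(counts)        # "Po" = Other Punctuation, "Lo" = Other Letter
print(hangul_total)
```

The two-letter codes map to the category names used in the report: `Po` is Other Punctuation, `Lo` is Other Letter, and `Nd` is Decimal Number.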

Unique

Unique: 9
Unique (%): 0.6%

Sample

1st row: *제**부
2nd row: *국**회
3rd row: *국**회
4th row: *국**회
5th row: *국**회

Value              Count  Frequency (%)
국**회             783    54.8%
학**네             51     3.6%
대**학             50     3.5%
국**학             38     2.7%
조**학             32     2.2%
비**비             26     1.8%
요**사             26     1.8%
학**사             20     1.4%
학**상             19     1.3%
동**론             18     1.3%
Other values (52)  366    25.6%

Most occurring characters

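Value              Count  Frequency (%)
*                  4287   60.0%
[unreadable]       841    11.8%
[unreadable]       798    11.2%
[unreadable]       281    3.9%
[unreadable]       97     1.4%
[unreadable]       60     0.8%
[unreadable]       52     0.7%
[unreadable]       51     0.7%
[unreadable]       39     0.5%
[unreadable]       33     0.5%
Other values (58)  606    8.5%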

Most occurring categories

Value              Count  Frequency (%)
Other Punctuation  4287   60.0%
Other Letter       2850   39.9%
Decimal Number     8      0.1%

Most frequent character per category

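Other Letter

Value              Count  Frequency (%)
[unreadable]       841    29.5%
[unreadable]       798    28.0%
[unreadable]       281    9.9%
[unreadable]       97     3.4%
[unreadable]       60     2.1%
[unreadable]       52     1.8%
[unreadable]       51     1.8%
[unreadable]       39     1.4%
[unreadable]       33     1.2%
[unreadable]       32     1.1%
Other values (56)  566    19.9%

Other Punctuation

Value  Count  Frequency (%)
*      4287   100.0%

Decimal Number

Value  Count  Frequency (%)
1      8      100.0%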

Most occurring scripts

Value   Count  Frequency (%)
Common  4295   60.1%
Hangul  2850   39.9%

Most frequent character per script

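Hangul

Value              Count  Frequency (%)
[unreadable]       841    29.5%
[unreadable]       798    28.0%
[unreadable]       281    9.9%
[unreadable]       97     3.4%
[unreadable]       60     2.1%
[unreadable]       52     1.8%
[unreadable]       51     1.8%
[unreadable]       39     1.4%
[unreadable]       33     1.2%
[unreadable]       32     1.1%
Other values (56)  566    19.9%

Common

Value  Count  Frequency (%)
*      4287   99.8%
1      8      0.2%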

Most occurring blocks

Value   Count  Frequency (%)
ASCII   4295   60.1%
Hangul  2850   39.9%

Most frequent character per block

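ASCII

Value  Count  Frequency (%)
*      4287   99.8%
1      8      0.2%

Hangul

Value              Count  Frequency (%)
[unreadable]       841    29.5%
[unreadable]       798    28.0%
[unreadable]       281    9.9%
[unreadable]       97     3.4%
[unreadable]       60     2.1%
[unreadable]       52     1.8%
[unreadable]       51     1.8%
[unreadable]       39     1.4%
[unreadable]       33     1.2%
[unreadable]       32     1.1%
Other values (56)  566    19.9%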

사업연도 (project year)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.0252
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.7 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2014
Median: 2016
Q3: 2018
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.9662939
Coefficient of variation (CV): 0.00097533201
Kurtosis: -1.5693346
Mean: 2016.0252
Median Absolute Deviation (MAD): 2
Skewness: 0.26314313
Sum: 2880900
Variance: 3.8663117
Monotonicity: Increasing
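
Several of these figures can be re-derived from the value counts the report lists for 사업연도. The sketch below uses only Python's standard `statistics` module and assumes the report uses the sample (n - 1) variance, which is what reproduces the printed values; the variable names are illustrative.

```python
import statistics

# Value counts for 사업연도 as listed in the report.
year_counts = {2014: 570, 2015: 140, 2016: 80, 2017: 178, 2018: 245, 2019: 216}

# Expand the frequency table back into one value per observation.
years = [year for year, n in sorted(year_counts.items()) for _ in range(n)]

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)           # sample standard deviation
cv = std / mean                         # coefficient of variation
median = statistics.median(years)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(y - median) for y in years)

print(len(years), sum(years))
print(round(mean, 4), round(variance, 4), round(std, 4), round(cv, 6), mad)
```

The coefficient of variation is tiny because the standard deviation (about 2 years) is divided by a mean near 2016. For a calendar-year column this ratio carries little meaning.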
[Figure: Histogram with fixed size bins (bins=6)]

Sorted by count:

Value  Count  Frequency (%)
2014   570    39.9%
2018   245    17.1%
2019   216    15.1%
2017   178    12.5%
2015   140    9.8%
2016   80     5.6%

Sorted by value, ascending:

Value  Count  Frequency (%)
2014   570    39.9%
2015   140    9.8%
2016   80     5.6%
2017   178    12.5%
2018   245    17.1%
2019   216    15.1%

Sorted by value, descending:

Value  Count  Frequency (%)
2019   216    15.1%
2018   245    17.1%
2017   178    12.5%
2016   80     5.6%
2015   140    9.8%
2014   570    39.9%

작가연령대 (author age group)
Categorical

Distinct: 7
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 KiB

20대 미만 (under 20): 790
50대 (50s): 184
40대 (40s): 136
30대 (30s): 105
60대 (60s): 102
Other values (2): 112

Length

Max length: 6
Median length: 6
Mean length: 4.7214836
Min length: 3
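
These length statistics can be re-derived from the value counts the report lists for this variable. A minimal sketch, assuming length means the number of Unicode code points (what Python's `len` returns for precomposed Hangul), so the space in "20대 미만" counts as one character:

```python
import statistics

# Value counts for 작가연령대 as listed in the report.
age_counts = {
    "20대 미만": 790, "50대": 184, "40대": 136, "30대": 105,
    "60대": 102, "20대": 82, "60대 이상": 30,
}

# One string length per observation.
lengths = [len(value) for value, n in age_counts.items() for _ in range(n)]

max_len, min_len = max(lengths), min(lengths)
median_len = statistics.median(lengths)
mean_len = statistics.mean(lengths)

print(max_len, min_len, median_len, round(mean_len, 7))
```

More than half of the observations are six-character values ("20대 미만" and "60대 이상"), which is why the median length equals the maximum.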

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40대 (40s)
2nd row: 20대 미만 (under 20)
3rd row: 20대 미만 (under 20)
4th row: 20대 미만 (under 20)
5th row: 20대 미만 (under 20)

Common Values

Value                       Count  Frequency (%)
20대 미만 (under 20)         790    55.3%
50대 (50s)                  184    12.9%
40대 (40s)                  136    9.5%
30대 (30s)                  105    7.3%
60대 (60s)                  102    7.1%
20대 (20s)                  82     5.7%
60대 이상 (60s and older)    30     2.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value               Count  Frequency (%)
20대 (20s)          872    38.8%
미만 (under)         790    35.1%
50대 (50s)          184    8.2%
40대 (40s)          136    6.0%
60대 (60s)          132    5.9%
30대 (30s)          105    4.7%
이상 (and older)     30     1.3%
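
The counts in the table above are consistent with splitting each category value on whitespace and counting the resulting words: "20대" appears in both "20대 미만" and "20대", and "60대" appears in both "60대" and "60대 이상". A minimal sketch of that tally, using the value counts listed for this variable; the whitespace tokenization is an assumption inferred from the numbers.

```python
from collections import Counter

# Value counts for 작가연령대 as listed in the report.
age_counts = {
    "20대 미만": 790, "50대": 184, "40대": 136, "30대": 105,
    "60대": 102, "20대": 82, "60대 이상": 30,
}

# Split each value on whitespace and weight every word by the value's count.
word_counts = Counter()
for value, n in age_counts.items():
    for word in value.split():
        word_counts[word] += n

total = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word}\t{n}\t{n / total:.1%}")
```

The percentages here are shares of all words (2249), not of observations (1429), which is why they differ from the value-level table.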

실적분야 (genre of the work)
Categorical

Distinct: 41
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 KiB

[unreadable]: 361
수필 (essay): 247
소설 (fiction): 207
동시 (children's poetry): 123
시조 (sijo): 84
Other values (36): 407

Length

Max length: 8
Median length: 2
Mean length: 1.973408
Min length: 1

Unique

Unique: 16
Unique (%): 1.1%

Sample

1st row: [unreadable]
2nd row: [unreadable]
3rd row: 시조 (sijo)
4th row: 수필 (essay)
5th row: 희곡 (drama)

Common Values

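Value                     Count  Frequency (%)
[unreadable]              361    25.3%
수필 (essay)               247    17.3%
소설 (fiction)             207    14.5%
동시 (children's poetry)   123    8.6%
시조 (sijo)                84     5.9%
희곡 (drama)               61     4.3%
평론 (criticism)           60     4.2%
동화 (children's story)    54     3.8%
<NA>                      49     3.4%
에세이 (essay)             43     3.0%
Other values (31)         140    9.8%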

Length

[Figure: Histogram of lengths of the category]

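Value                     Count  Frequency (%)
[unreadable]              361    25.3%
수필 (essay)               247    17.3%
소설 (fiction)             207    14.5%
동시 (children's poetry)   123    8.6%
시조 (sijo)                84     5.9%
희곡 (drama)               61     4.3%
평론 (criticism)           60     4.2%
동화 (children's story)    54     3.8%
na                        49     3.4%
에세이 (essay)             43     3.0%
Other values (31)         140    9.8%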

우수작품수록및수상건수(건) (number of outstanding works published or awarded)
Real number (ℝ)

Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1770469
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 11
Range: 10
Interquartile range (IQR): 0
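
These quantiles can be re-derived from the value counts the report lists for this variable (values 1 through 11). A minimal sketch, assuming linear interpolation between ranks, which is the pandas default; the report text does not say which interpolation was used, and `percentile` is an illustrative helper rather than a library function.

```python
import math


def percentile(sorted_values, q):
    """Return the q-th percentile (0-100) by linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Value counts for the count variable as listed in the report.
count_freq = {1: 1297, 2: 90, 3: 16, 4: 6, 5: 11, 6: 2, 8: 2, 9: 1, 10: 3, 11: 1}
values = [v for v, n in sorted(count_freq.items()) for _ in range(n)]

q1, q3 = percentile(values, 25), percentile(values, 75)
p95 = percentile(values, 95)
value_range = values[-1] - values[0]

print(percentile(values, 5), q1, percentile(values, 50), q3, p95)
print("IQR:", q3 - q1, "Range:", value_range)
```

Because about 90.8% of the observations equal 1, every quantile up to Q3 is 1 and the IQR collapses to 0; only the upper tail reveals the larger counts.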

Descriptive statistics

Standard deviation: 0.79132514
Coefficient of variation (CV): 0.67229704
Kurtosis: 68.01452
Mean: 1.1770469
Median Absolute Deviation (MAD): 0
Skewness: 7.4101114
Sum: 1682
Variance: 0.62619547
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Sorted by count:

Value  Count  Frequency (%)
1      1297   90.8%
2      90     6.3%
3      16     1.1%
5      11     0.8%
4      6      0.4%
10     3      0.2%
8      2      0.1%
6      2      0.1%
9      1      0.1%
11     1      0.1%

Sorted by value, ascending:

Value  Count  Frequency (%)
1      1297   90.8%
2      90     6.3%
3      16     1.1%
4      6      0.4%
5      11     0.8%
6      2      0.1%
8      2      0.1%
9      1      0.1%
10     3      0.2%
11     1      0.1%

Sorted by value, descending:

Value  Count  Frequency (%)
11     1      0.1%
10     3      0.2%
9      1      0.1%
8      2      0.1%
6      2      0.1%
5      11     0.8%
4      6      0.4%
3      16     1.1%
2      90     6.3%
1      1297   90.8%

Interactions

[Figures: four interaction plots]

Correlations

Column labels: Organization = 문학단체명, Year = 사업연도, Age group = 작가연령대, Genre = 실적분야, Count = 우수작품수록및수상건수(건).

              Organization  Year   Age group  Genre  Count
Organization  1.000         0.731  0.598      0.719  0.453
Year          0.731         1.000  0.274      0.604  0.160
Age group     0.598         0.274  1.000      0.440  0.219
Genre         0.719         0.604  0.440      1.000  0.679
Count         0.453         0.160  0.219      0.679  1.000

           Genre  Age group
Genre      1.000  0.193
Age group  0.193  1.000

           Year    Count   Age group  Genre
Year       1.000   -0.245  0.143      0.276
Count      -0.245  1.000   0.059      0.411
Age group  0.143   0.059   1.000      0.193
Genre      0.276   0.411   0.193      1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

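Column labels: Organization = 문학단체명, Year = 사업연도, Age group = 작가연령대, Genre = 실적분야, Count = 우수작품수록및수상건수(건).

First rows

Index  Organization  Year  Age group             Genre                             Count
0      *제**부       2014  40대 (40s)            [unreadable]                      4
1      *국**회       2014  20대 미만 (under 20)   [unreadable]                      1
2      *국**회       2014  20대 미만 (under 20)   시조 (sijo)                        1
3      *국**회       2014  20대 미만 (under 20)   수필 (essay)                       1
4      *국**회       2014  20대 미만 (under 20)   희곡 (drama)                       1
5      *국**회       2014  50대 (50s)            [unreadable]                      1
6      *국**회       2014  40대 (40s)            소설 (fiction)                     1
7      *국**회       2014  50대 (50s)            수필 (essay)                       1
8      *국**회       2014  40대 (40s)            동시 (children's poetry)           1
9      *국**회       2014  50대 (50s)            [unreadable]                      1

Last rows

Index  Organization  Year  Age group             Genre                                   Count
1419   *국**회       2019  20대 미만 (under 20)   동시조 (children's sijo)                 1
1420   *국**회       2019  20대 미만 (under 20)   장편동화 (full-length children's story)  1
1421   *국**회       2019  20대 미만 (under 20)   동화 (children's story)                  1
1422   *국**회       2019  20대 미만 (under 20)   동시 (children's poetry)                 1
1423   *국**회       2019  20대 미만 (under 20)   동시 (children's poetry)                 1
1424   *국**회       2019  20대 미만 (under 20)   동시 (children's poetry)                 1
1425   *국**회       2019  20대 미만 (under 20)   동시 (children's poetry)                 1
1426   *천**학       2019  20대 (20s)            소설 (fiction)                           1
1427   *천**학       2019  60대 (60s)            [unreadable]                            1
1428   *천**학       2019  50대 (50s)            [unreadable]                            1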

Duplicate rows

Most frequently occurring

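Column labels: Organization = 문학단체명, Year = 사업연도, Age group = 작가연령대, Genre = 실적분야, Count = 우수작품수록및수상건수(건).

Index  Organization  Year  Age group             Genre                      Count  # duplicates
48     *국**회       2014  50대 (50s)            수필 (essay)                2      36
26     *국**회       2014  20대 미만 (under 20)   수필 (essay)                1      28
23     *국**회       2014  20대 미만 (under 20)   리뷰 (review)               1      27
88     *국**회       2017  20대 미만 (under 20)   희곡 (drama)                1      25
82     *국**회       2017  20대 미만 (under 20)   수필 (essay)                1      22
99     *국**회       2018  20대 미만 (under 20)   동시 (children's poetry)    1      21
78     *국**회       2017  20대 미만 (under 20)   동시 (children's poetry)    1      20
117    *국**회       2019  20대 미만 (under 20)   동시 (children's poetry)    1      19
115    *국**회       2018  60대 (60s)            수필 (essay)                1      18
36     *국**회       2014  20대 미만 (under 20)   희곡 (drama)                1      17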
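
A table like the one above can be built by grouping identical rows and counting how often each occurs. The sketch below does this with the standard library on a few hypothetical rows in the same column order. The rows are made up for illustration, `duplicate_groups` is an illustrative helper, and treating "# duplicates" as the number of occurrences per group is an assumption, since the report text does not define it.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, occurrences) pairs for rows that appear more than once,
    most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


# Hypothetical rows: (organization, year, age group, genre, count).
rows = [
    ("*국**회", 2014, "50대", "수필", 2),
    ("*국**회", 2014, "50대", "수필", 2),
    ("*국**회", 2014, "50대", "수필", 2),
    ("*국**회", 2014, "20대 미만", "수필", 1),
    ("*국**회", 2014, "20대 미만", "수필", 1),
    ("*천**학", 2019, "20대", "소설", 1),
]

groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)
```

Duplicates are plausible in this dataset because each row carries only a masked organization name, a year, an age bracket, a genre, and a small count, so distinct works can easily produce identical rows. Whether they should be dropped depends on whether a row represents one work or one aggregated record.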