Overview

Dataset statistics

Number of variables: 6
Number of observations: 700
Missing cells: 101
Missing cells (%): 2.4%
Duplicate rows: 38
Duplicate rows (%): 5.4%
Total size in memory: 34.3 KiB
Average record size in memory: 50.2 B
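
The derived figures above follow from the raw counts. The following is a minimal sketch that recomputes them, assuming the missing-cell percentage is taken over all cells (variables × observations), the duplicate percentage over all observations, and 1 KiB = 1024 bytes. The variable names are illustrative only.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 6
n_observations = 700
missing_cells = 101
duplicate_rows = 38
total_size_kib = 34.3

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 2.4%, 5.4%, and 50.2 B.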

Variable types

Categorical: 2
Numeric: 2
DateTime: 2

Dataset

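Description: Creative output records (e.g., creative field, residency start date, residency end date) from the "집필공간운영" (writing space operation) support program in the literature category of the 2014-2019 문예진흥기금 (Culture and Arts Promotion Fund) open-call projects
Author: 한국문화예술위원회 (Arts Council Korea)
URL: https://www.data.go.kr/data/15076483/fileData.do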

Alerts

Dataset has 38 (5.4%) duplicate rows [Duplicates]
입주시작일 has 10 (1.4%) missing values [Missing]
입주종료일 has 11 (1.6%) missing values [Missing]
창작실적수(건) has 80 (11.4%) missing values [Missing]

Reproduction

Analysis started: 2024-04-17 16:40:34.420874
Analysis finished: 2024-04-17 16:40:35.116356
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
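
A report of this kind can be regenerated with ydata-profiling. The sketch below is a rough outline only: the function name `build_report`, the CSV path, the report title, and the choice to parse the two date columns at load time are all assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source is a CSV whose date columns are named as in the
    # Variables section of this report.
    df = pd.read_csv(csv_path, parse_dates=["입주시작일", "입주종료일"])

    # Assumption: default profiling settings; the original run may have differed.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
```

Calling `build_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` to the working directory.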

Variables

문학단체명
Categorical

Distinct: 7
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Value | Count
*지**단 | 222
*을**집 | 156
*1**학 | 134
*악**원 | 84
*날**날 | 45
Other values (2) | 59

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: *악**원
2nd row: *악**원
3rd row: *악**원
4th row: *악**원
5th row: *악**원

Common Values

Value | Count | Frequency (%)
*지**단 | 222 | 31.7%
*을**집 | 156 | 22.3%
*1**학 | 134 | 19.1%
*악**원 | 84 | 12.0%
*날**날 | 45 | 6.4%
*버**집 | 32 | 4.6%
*산**꽃 | 27 | 3.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
지**단 | 222 | 31.7%
을**집 | 156 | 22.3%
1**학 | 134 | 19.1%
악**원 | 84 | 12.0%
날**날 | 45 | 6.4%
버**집 | 32 | 4.6%
산**꽃 | 27 | 3.9%

사업연도
Real number (ℝ)

Distinct: 6
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.4843
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2015
Median: 2015
Q3: 2016
95-th percentile: 2017
Maximum: 2019
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1097353
Coefficient of variation (CV): 0.00055060477
Kurtosis: -0.75460619
Mean: 2015.4843
Median Absolute Deviation (MAD): 1
Skewness: 0.26962537
Sum: 1410839
Variance: 1.2315124
Monotonicity: Increasing
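
Several of these statistics can be rebuilt from the value counts reported for 사업연도 in this section. The sketch below uses only the Python standard library and assumes the variance is the sample variance (n - 1 denominator) and the quartiles use linear interpolation over the inclusive range. Both assumptions are consistent with the figures shown here, but they are not confirmed by the report itself.

```python
import math
import statistics

# Value counts for 사업연도, copied from the frequency table in this section.
year_counts = {2014: 156, 2015: 219, 2016: 168, 2017: 148, 2018: 5, 2019: 4}

# Expand the counts back into one value per observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean  # coefficient of variation = standard deviation / mean

# Quartiles with linear interpolation over the inclusive range.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
iqr = q3 - q1

print(len(years), sum(years))
print(round(mean, 4), round(variance, 7), round(std_dev, 7))
print(cv, q1, median, q3, iqr)
```

The very small CV reflects that the year values sit near 2015 while varying by only about one year, so the ratio of spread to mean is close to zero.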
[Figure: Histogram with fixed size bins (bins=6)]

Sorted by frequency:
Value | Count | Frequency (%)
2015 | 219 | 31.3%
2016 | 168 | 24.0%
2014 | 156 | 22.3%
2017 | 148 | 21.1%
2018 | 5 | 0.7%
2019 | 4 | 0.6%

Sorted by value, ascending:
Value | Count | Frequency (%)
2014 | 156 | 22.3%
2015 | 219 | 31.3%
2016 | 168 | 24.0%
2017 | 148 | 21.1%
2018 | 5 | 0.7%
2019 | 4 | 0.6%

Sorted by value, descending:
Value | Count | Frequency (%)
2019 | 4 | 0.6%
2018 | 5 | 0.7%
2017 | 148 | 21.1%
2016 | 168 | 24.0%
2015 | 219 | 31.3%
2014 | 156 | 22.3%

창작실적분야
Categorical

Distinct: 40
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Value | Count
소설 | 288
(label not shown) | 156
아동문학 | 65
희곡 | 58
동화 | 25
Other values (35) | 108

Length

Max length: 7
Median length: 2
Mean length: 2.1328571
Min length: 1

Unique

Unique: 17
Unique (%): 2.4%

Sample

1st row: 소설
2nd row: 번역|소설
3rd row: 동화
4th row: 소설
5th row: 소설

Common Values

Value | Count | Frequency (%)
소설 | 288 | 41.1%
(label not shown) | 156 | 22.3%
아동문학 | 65 | 9.3%
희곡 | 58 | 8.3%
동화 | 25 | 3.6%
평론 | 19 | 2.7%
번역 | 12 | 1.7%
<NA> | 9 | 1.3%
산문 | 6 | 0.9%
시조 | 6 | 0.9%
Other values (30) | 56 | 8.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
소설 | 288 | 41.1%
(label not shown) | 156 | 22.3%
아동문학 | 65 | 9.3%
희곡 | 58 | 8.3%
동화 | 25 | 3.6%
평론 | 19 | 2.7%
번역 | 12 | 1.7%
na | 9 | 1.3%
시나리오 | 6 | 0.9%
산문 | 6 | 0.9%
Other values (30) | 56 | 8.0%

입주시작일
Date

MISSING 

Distinct: 186
Distinct (%): 27.0%
Missing: 10
Missing (%): 1.4%
Memory size: 5.6 KiB
Minimum: 2013-12-16 00:00:00
Maximum: 2017-12-07 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

입주종료일
Date

MISSING 

Distinct: 193
Distinct (%): 28.0%
Missing: 11
Missing (%): 1.6%
Memory size: 5.6 KiB
Minimum: 2014-01-28 00:00:00
Maximum: 2018-10-31 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

창작실적수(건)
Real number (ℝ)

MISSING 

Distinct: 44
Distinct (%): 7.1%
Missing: 80
Missing (%): 11.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.1209677
Minimum: 1
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 21.1
Maximum: 125
Range: 124
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 10.82389
Coefficient of variation (CV): 2.1136414
Kurtosis: 36.165572
Mean: 5.1209677
Median Absolute Deviation (MAD): 1
Skewness: 5.1786163
Sum: 3175
Variance: 117.15659
Monotonicity: Not monotonic
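
The figures for 창작실적수(건) are internally consistent once the 80 missing rows are excluded from the mean. A short check, assuming that "Distinct (%)" is taken over non-missing rows while "Missing (%)" is taken over all 700 rows (an inference from the numbers, not something the report states):

```python
# Figures copied from the 창작실적수(건) statistics above.
n_rows = 700
n_missing = 80
n_distinct = 44
total = 3175        # "Sum"
std_dev = 10.82389  # "Standard deviation"

n_present = n_rows - n_missing  # rows with a recorded value
mean = total / n_present        # missing rows are excluded from the mean
cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2

# Assumption: distinct share is over non-missing rows, missing share over all rows.
distinct_pct = 100 * n_distinct / n_present
missing_pct = 100 * n_missing / n_rows

print(n_present, round(mean, 7), round(cv, 4), round(variance, 3))
print(f"{distinct_pct:.1f}% distinct, {missing_pct:.1f}% missing")
```

A mean of about 5.12 against a median of 2 matches the strong right skew reported (skewness 5.18): a small number of large counts, up to 125, pulls the mean upward.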
[Figure: Histogram with fixed size bins (bins=44)]

Sorted by frequency:
Value | Count | Frequency (%)
1 | 272 | 38.9%
2 | 141 | 20.1%
3 | 62 | 8.9%
4 | 20 | 2.9%
5 | 15 | 2.1%
7 | 13 | 1.9%
10 | 9 | 1.3%
11 | 7 | 1.0%
6 | 7 | 1.0%
9 | 6 | 0.9%
Other values (34) | 68 | 9.7%
(Missing) | 80 | 11.4%

Sorted by value, ascending:
Value | Count | Frequency (%)
1 | 272 | 38.9%
2 | 141 | 20.1%
3 | 62 | 8.9%
4 | 20 | 2.9%
5 | 15 | 2.1%
6 | 7 | 1.0%
7 | 13 | 1.9%
8 | 6 | 0.9%
9 | 6 | 0.9%
10 | 9 | 1.3%

Sorted by value, descending:
Value | Count | Frequency (%)
125 | 1 | 0.1%
74 | 1 | 0.1%
69 | 1 | 0.1%
68 | 1 | 0.1%
62 | 1 | 0.1%
61 | 1 | 0.1%
60 | 2 | 0.3%
53 | 1 | 0.1%
52 | 1 | 0.1%
50 | 1 | 0.1%

Interactions

[Figures: four interaction plots]

Correlations

Variable | 문학단체명 | 사업연도 | 창작실적분야 | 창작실적수(건)
문학단체명 | 1.000 | 0.389 | 0.564 | 0.179
사업연도 | 0.389 | 1.000 | 0.242 | 0.063
창작실적분야 | 0.564 | 0.242 | 1.000 | 0.399
창작실적수(건) | 0.179 | 0.063 | 0.399 | 1.000

Variable | 창작실적분야 | 문학단체명
창작실적분야 | 1.000 | 0.262
문학단체명 | 0.262 | 1.000

Variable | 사업연도 | 창작실적수(건) | 문학단체명 | 창작실적분야
사업연도 | 1.000 | 0.061 | 0.248 | 0.109
창작실적수(건) | 0.061 | 1.000 | 0.063 | 0.170
문학단체명 | 0.248 | 0.063 | 1.000 | 0.262
창작실적분야 | 0.109 | 0.170 | 0.262 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
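
One common way to quantify nullity correlation is the Pearson correlation between per-column presence indicators (1 where a cell is filled, 0 where it is missing). The sketch below illustrates that idea on a small hypothetical pair of columns. Both the toy data and the assumption that the heatmap is computed this way are illustrative, not taken from the report.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy columns (None marks a missing cell); not real dataset rows.
start_date = ["2014-01-02", None, "2014-07-01", None, "2014-05-01", "2014-09-02"]
end_date = ["2014-12-30", None, "2014-08-30", None, "2014-08-01", None]

# 1 where the cell is present, 0 where it is missing.
start_present = [int(v is not None) for v in start_date]
end_present = [int(v is not None) for v in end_date]

nullity_corr = pearson(start_present, end_present)
print(round(nullity_corr, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.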

Sample

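First 10 rows:
Index | 문학단체명 | 사업연도 | 창작실적분야 | 입주시작일 | 입주종료일 | 창작실적수(건)
0 | *악**원 | 2014 | 소설 | 2014-01-02 | 2014-12-30 | <NA>
1 | *악**원 | 2014 | 번역|소설 | 2014-05-01 | 2014-06-30 | 1
2 | *악**원 | 2014 | 동화 | 2014-07-01 | 2014-08-30 | <NA>
3 | *악**원 | 2014 | 소설 | 2014-09-02 | 2014-12-30 | <NA>
4 | *악**원 | 2014 | 소설 | 2014-05-01 | 2014-08-01 | <NA>
5 | *악**원 | 2014 | 수필|소설 | 2014-05-01 | 2014-07-30 | <NA>
6 | *악**원 | 2014 | 소설 | 2014-01-02 | 2014-12-30 | <NA>
7 | *악**원 | 2014 | (not shown) | 2014-10-01 | 2014-12-30 | <NA>
8 | *악**원 | 2014 | 희곡 | 2014-04-05 | 2014-05-03 | <NA>
9 | *악**원 | 2014 | 드라마 | 2014-05-05 | 2014-06-30 | <NA>

Last 10 rows:
Index | 문학단체명 | 사업연도 | 창작실적분야 | 입주시작일 | 입주종료일 | 창작실적수(건)
690 | *을**집 | 2017 | (not shown) | 2017-01-03 | 2017-02-27 | 7
691 | *버**집 | 2018 | <NA> | <NA> | <NA> | <NA>
692 | *날**날 | 2018 | <NA> | <NA> | <NA> | <NA>
693 | *을**집 | 2018 | <NA> | <NA> | <NA> | <NA>
694 | *1**학 | 2018 | <NA> | <NA> | <NA> | <NA>
695 | *지**단 | 2018 | <NA> | <NA> | <NA> | <NA>
696 | *을**집 | 2019 | <NA> | <NA> | <NA> | <NA>
697 | *지**단 | 2019 | <NA> | <NA> | <NA> | <NA>
698 | *버**집 | 2019 | <NA> | <NA> | <NA> | <NA>
699 | *악**원 | 2019 | <NA> | <NA> | <NA> | <NA>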

Duplicate rows

Most frequently occurring

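Index | 문학단체명 | 사업연도 | 창작실적분야 | 입주시작일 | 입주종료일 | 창작실적수(건) | # duplicates
0 | *1**학 | 2014 | 소설 | 2014-12-15 | 2015-01-31 | 1 | 4
8 | *1**학 | 2016 | 소설 | 2016-05-02 | 2016-07-17 | 1 | 4
9 | *1**학 | 2016 | 소설 | 2016-08-01 | 2016-12-31 | 1 | 4
16 | *1**학 | 2017 | 소설 | 2017-02-01 | 2017-04-17 | 1 | 4
21 | *산**꽃 | 2015 | 소설 | 2015-11-01 | 2015-12-31 | 1 | 4
36 | *지**단 | 2017 | 소설 | 2017-11-01 | 2017-11-30 | 1 | 4
11 | *1**학 | 2016 | (not shown) | 2015-12-16 | 2016-01-30 | 1 | 3
29 | *을**집 | 2017 | 소설 | 2017-03-01 | 2017-05-31 | 2 | 3
30 | *을**집 | 2017 | (not shown) | 2017-12-01 | 2017-12-31 | <NA> | 3
1 | *1**학 | 2014 | 장편소설 | 2014-02-03 | 2014-04-24 | 1 | 2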