Overview

Dataset statistics

Number of variables: 13
Number of observations: 181
Missing cells: 564
Missing cells (%): 24.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.2 KiB
Average record size in memory: 108.7 B
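
The missing-cell percentage follows directly from the counts in this table. The short sketch below recomputes it and cross-checks the total against the per-column missing counts listed under Alerts. The variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 13
n_observations = 181
missing_cells = 564

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
print(f"{total_cells} cells, {missing_pct:.1f}% missing")

# Per-column missing counts copied from the Alerts section:
# two date columns (65 each), four 발간형태_* flags (84 each),
# 종이책_총발행호수(호) (14) and 웹진누적방문수(명) (84).
per_column_missing = [65, 65, 84, 84, 84, 84, 14, 84]
print(sum(per_column_missing) == missing_cells)
```

Note that 전자책_총발행호수(호) reports 0 missing values because its 84 `<NA>` entries are stored as a category label, so they do not contribute to the 564 missing cells.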

Variable types

Text: 1
Numeric: 3
DateTime: 2
Categorical: 3
Boolean: 4

Dataset

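Description: Overview of the "literary magazine publication" (문예지발간) support program in the literature category of the 2014-2019 Arts and Culture Promotion Fund (문예진흥기금) open-call projects (e.g., magazine field, publication frequency, publication format)
Author: Arts Council Korea (한국문화예술위원회)
URL: https://www.data.go.kr/data/15076418/fileData.do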

Alerts

발간형태_기타 has constant value "" (Constant)
발간형태_전자책 is highly overall correlated with 전자책_총발행호수(호) (High correlation)
발간형태_웹진 is highly overall correlated with 전자책_총발행호수(호) (High correlation)
전자책_총발행호수(호) is highly overall correlated with 발간형태_전자책 and 1 other field (High correlation)
발간형태_전자책 is highly imbalanced (58.9%) (Imbalance)
발간형태_웹진 is highly imbalanced (80.1%) (Imbalance)
전자책_총발행호수(호) is highly imbalanced (54.6%) (Imbalance)
사업시작일 has 65 (35.9%) missing values (Missing)
사업종료일 has 65 (35.9%) missing values (Missing)
발간형태_종이책 has 84 (46.4%) missing values (Missing)
발간형태_전자책 has 84 (46.4%) missing values (Missing)
발간형태_웹진 has 84 (46.4%) missing values (Missing)
발간형태_기타 has 84 (46.4%) missing values (Missing)
종이책_총발행호수(호) has 14 (7.7%) missing values (Missing)
웹진누적방문수(명) has 92 (50.8%) zeros (Zeros)
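
The report does not say how the imbalance percentages are derived. They are, however, consistent with one minus the Shannon entropy of the value counts, normalized by the maximum entropy for that number of classes. The sketch below works under that assumption; `imbalance_score` is a hypothetical helper name, and the counts are taken from the per-variable tables later in this report (missing values excluded for the Boolean columns).

```python
import math

def imbalance_score(counts):
    """1 - H(counts) / log2(number of classes); 0 = balanced, 1 = one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# 발간형태_전자책: False 89, True 8
ebook_flag = imbalance_score([89, 8])
# 발간형태_웹진: False 94, True 3
webzine_flag = imbalance_score([94, 3])
# 전자책_총발행호수(호): "0" 93, "<NA>" 84, and four values seen once each
ebook_issues = imbalance_score([93, 84, 1, 1, 1, 1])

for name, score in [("발간형태_전자책", ebook_flag),
                    ("발간형태_웹진", webzine_flag),
                    ("전자책_총발행호수(호)", ebook_issues)]:
    print(f"{name}: {100 * score:.1f}%")
```

Under this definition the three scores round to the 58.9%, 80.1% and 54.6% shown in the alerts.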

Reproduction

Analysis started: 2023-12-12 22:32:15.709803
Analysis finished: 2023-12-12 22:32:17.939794
Duration: 2.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

문학단체명
Text

Distinct: 62
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 905
Distinct characters: 68
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 10.5%

Sample

1st row: *제**부
2nd row: *국**회
3rd row: *1**학
4th row: *학**네
5th row: *학**상
Value  Count  Frequency (%)
국**회  45  24.9%
대**학  8  4.4%
제**부  5  2.8%
학**네  4  2.2%
학**상  4  2.2%
비**비  4  2.2%
학**사  4  2.2%
음**음  4  2.2%
년**작  3  1.7%
학**당  3  1.7%
Other values (52)  97  53.6%

Most occurring characters

Value  Count  Frequency (%)
*  543  60.0%
[?]  54  6.0%
[?]  48  5.3%
[?]  41  4.5%
[?]  17  1.9%
[?]  11  1.2%
[?]  10  1.1%
[?]  9  1.0%
[?]  9  1.0%
[?]  8  0.9%
Other values (58)  155  17.1%

Most occurring categories

Value  Count  Frequency (%)
Other Punctuation  543  60.0%
Other Letter  360  39.8%
Decimal Number  2  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  54  15.0%
[?]  48  13.3%
[?]  41  11.4%
[?]  17  4.7%
[?]  11  3.1%
[?]  10  2.8%
[?]  9  2.5%
[?]  9  2.5%
[?]  8  2.2%
[?]  7  1.9%
Other values (56)  146  40.6%

Other Punctuation
Value  Count  Frequency (%)
*  543  100.0%

Decimal Number
Value  Count  Frequency (%)
1  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  545  60.2%
Hangul  360  39.8%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  54  15.0%
[?]  48  13.3%
[?]  41  11.4%
[?]  17  4.7%
[?]  11  3.1%
[?]  10  2.8%
[?]  9  2.5%
[?]  9  2.5%
[?]  8  2.2%
[?]  7  1.9%
Other values (56)  146  40.6%

Common
Value  Count  Frequency (%)
*  543  99.6%
1  2  0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  545  60.2%
Hangul  360  39.8%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*  543  99.6%
1  2  0.4%

Hangul
Value  Count  Frequency (%)
[?]  54  15.0%
[?]  48  13.3%
[?]  41  11.4%
[?]  17  4.7%
[?]  11  3.1%
[?]  10  2.8%
[?]  9  2.5%
[?]  9  2.5%
[?]  8  2.2%
[?]  7  1.9%
Other values (56)  146  40.6%

사업연도
Real number (ℝ)

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.7624
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2014
Median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.0395867
Coefficient of variation (CV): 0.0010113173
Kurtosis: -1.5893876
Mean: 2016.7624
Median Absolute Deviation (MAD): 1
Skewness: -0.35284142
Sum: 365034
Variance: 4.1599141
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
2014  51  28.2%
2018  50  27.6%
2019  47  26.0%
2015  14  7.7%
2017  13  7.2%
2016  6  3.3%

Value  Count  Frequency (%)
2014  51  28.2%
2015  14  7.7%
2016  6  3.3%
2017  13  7.2%
2018  50  27.6%
2019  47  26.0%

Value  Count  Frequency (%)
2019  47  26.0%
2018  50  27.6%
2017  13  7.2%
2016  6  3.3%
2015  14  7.7%
2014  51  28.2%
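
The descriptive statistics for 사업연도 can be recomputed from the frequency table alone. The sketch below expands the counts into a list and uses Python's `statistics` module. It assumes the reported variance is the sample variance (denominator n - 1), which is what reproduces the reported figure.

```python
import statistics

# Value counts copied from the 사업연도 frequency table.
year_counts = {2014: 51, 2015: 14, 2016: 6, 2017: 13, 2018: 50, 2019: 47}
years = [year for year, count in year_counts.items() for _ in range(count)]

n = len(years)
total = sum(years)
mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)          # square root of the sample variance
cv = std / mean                        # coefficient of variation

print(n, total, median)
print(f"mean={mean:.4f} variance={variance:.7f} std={std:.7f} cv={cv:.10f}")
```

The coefficient of variation is tiny because the values are calendar years: a spread of about two years is divided by a mean near 2017, so the CV says little about this variable.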

사업시작일
Date

MISSING 

Distinct: 16
Distinct (%): 13.8%
Missing: 65
Missing (%): 35.9%
Memory size: 1.5 KiB
Minimum: 2016-01-01 00:00:00
Maximum: 2019-05-01 00:00:00
Histogram with fixed size bins (bins=16)

사업종료일
Date

MISSING 

Distinct: 11
Distinct (%): 9.5%
Missing: 65
Missing (%): 35.9%
Memory size: 1.5 KiB
Minimum: 2016-12-31 00:00:00
Maximum: 2020-02-29 00:00:00
Histogram with fixed size bins (bins=11)
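
The Distinct (%) figures for columns with missing values do not match a division by all 181 rows. They are consistent with dividing by the number of non-missing rows instead. The sketch below checks that reading against four columns of this report. `distinct_pct` is an illustrative helper, and the denominator choice is inferred from the numbers rather than stated in the report.

```python
N_OBSERVATIONS = 181  # from the Dataset statistics table

def distinct_pct(distinct, missing, n_rows=N_OBSERVATIONS):
    """Distinct values as a percentage of the non-missing rows."""
    non_missing = n_rows - missing
    return 100 * distinct / non_missing

# (distinct, missing) pairs copied from the per-variable summaries.
columns = {
    "사업시작일": (16, 65),
    "사업종료일": (11, 65),
    "발간형태_종이책": (2, 84),
    "종이책_총발행호수(호)": (117, 14),
}

for name, (distinct, missing) in columns.items():
    print(f"{name}: {distinct_pct(distinct, missing):.1f}%")
```

Missing (%), by contrast, uses all rows as the denominator: 65 / 181 gives the 35.9% shown for both date columns.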

문예지분야
Categorical

Distinct: 8
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
종합: 102
시(조): 44
수필: 9
소설: 7
평론: 6
Other values (3): 13

Length

Max length: 4
Median length: 2
Mean length: 2.5856354
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종합
2nd row: 종합
3rd row: 종합
4th row: 종합
5th row: 종합

Common Values

Value  Count  Frequency (%)
종합  102  56.4%
시(조)  44  24.3%
수필  9  5.0%
소설  7  3.9%
평론  6  3.3%
<NA>  6  3.3%
희곡  4  2.2%
아동문학  3  1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the category frequencies listed under Common Values]

발간주기
Categorical

Distinct: 5
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
계간: 107
월간: 30
반년간: 19
격월간: 15
미뷴류: 10

Length

Max length: 3
Median length: 2
Mean length: 2.2430939
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 격월간
2nd row: 월간
3rd row: 계간
4th row: 미뷴류
5th row: 월간

Common Values

Value  Count  Frequency (%)
계간  107  59.1%
월간  30  16.6%
반년간  19  10.5%
격월간  15  8.3%
미뷴류  10  5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the category frequencies listed under Common Values]

발간형태_종이책
Boolean

MISSING 

Distinct: 2
Distinct (%): 2.1%
Missing: 84
Missing (%): 46.4%
Memory size: 494.0 B
Value  Count  Frequency (%)
True  73  40.3%
False  24  13.3%
(Missing)  84  46.4%

발간형태_전자책
Boolean

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 2.1%
Missing: 84
Missing (%): 46.4%
Memory size: 494.0 B
Value  Count  Frequency (%)
False  89  49.2%
True  8  4.4%
(Missing)  84  46.4%

발간형태_웹진
Boolean

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 2.1%
Missing: 84
Missing (%): 46.4%
Memory size: 494.0 B
Value  Count  Frequency (%)
False  94  51.9%
True  3  1.7%
(Missing)  84  46.4%

발간형태_기타
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 1.0%
Missing: 84
Missing (%): 46.4%
Memory size: 494.0 B
Value  Count  Frequency (%)
False  97  53.6%
(Missing)  84  46.4%

종이책_총발행호수(호)
Real number (ℝ)

MISSING 

Distinct: 117
Distinct (%): 70.1%
Missing: 14
Missing (%): 7.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 137.95808
Minimum: 2
Maximum: 1500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 21.5
Median: 66
Q3: 140
95-th percentile: 582.4
Maximum: 1500
Range: 1498
Interquartile range (IQR): 118.5

Descriptive statistics

Standard deviation: 204.22977
Coefficient of variation (CV): 1.4803755
Kurtosis: 12.605664
Mean: 137.95808
Median Absolute Deviation (MAD): 52
Skewness: 3.0097317
Sum: 23039
Variance: 41709.799
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
4  16  8.8%
12  6  3.3%
80  4  2.2%
21  3  1.7%
6  3  1.7%
58  3  1.7%
52  2  1.1%
62  2  1.1%
134  2  1.1%
144  2  1.1%
Other values (107)  124  68.5%
(Missing)  14  7.7%

Value  Count  Frequency (%)
2  2  1.1%
3  1  0.6%
4  16  8.8%
6  3  1.7%
7  1  0.6%
9  1  0.6%
10  1  0.6%
11  1  0.6%
12  6  3.3%
13  1  0.6%

Value  Count  Frequency (%)
1500  1  0.6%
781  1  0.6%
769  1  0.6%
732  1  0.6%
611  1  0.6%
599  1  0.6%
592  1  0.6%
586  2  1.1%
574  1  0.6%
566  1  0.6%

전자책_총발행호수(호)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
0: 93
<NA>: 84
101: 1
62: 1
31: 1

Length

Max length: 4
Median length: 1
Mean length: 2.4143646
Min length: 1

Unique

Unique: 4
Unique (%): 2.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
0  93  51.4%
<NA>  84  46.4%
101  1  0.6%
62  1  0.6%
31  1  0.6%
6  1  0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the category frequencies listed under Common Values]

웹진누적방문수(명)
Real number (ℝ)

MISSING  ZEROS 

Distinct: 6
Distinct (%): 6.2%
Missing: 84
Missing (%): 46.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 110558.14
Minimum: 0
Maximum: 10000000
Zeros: 92
Zeros (%): 50.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 120
Maximum: 10000000
Range: 10000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1015979.4
Coefficient of variation (CV): 9.189548
Kurtosis: 96.449961
Mean: 110558.14
Median Absolute Deviation (MAD): 0
Skewness: 9.8084579
Sum: 10724140
Variance: 1.0322141 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0  92  50.8%
10000000  1  0.6%
265268  1  0.6%
2400  1  0.6%
455872  1  0.6%
600  1  0.6%
(Missing)  84  46.4%

Value  Count  Frequency (%)
0  92  50.8%
600  1  0.6%
2400  1  0.6%
265268  1  0.6%
455872  1  0.6%
10000000  1  0.6%

Value  Count  Frequency (%)
10000000  1  0.6%
455872  1  0.6%
265268  1  0.6%
2400  1  0.6%
600  1  0.6%
0  92  50.8%
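
The gap between the mean (110558.14) and the median (0) of 웹진누적방문수(명) comes from a handful of large values, chiefly a single entry of 10,000,000. The sketch below rebuilds the 97 non-missing values from the frequency table and shows how much the mean depends on that one record. The variable names are illustrative.

```python
import statistics

# Value counts copied from the 웹진누적방문수(명) frequency table (missing rows excluded).
visit_counts = {0: 92, 600: 1, 2400: 1, 265268: 1, 455872: 1, 10_000_000: 1}
visits = [value for value, count in visit_counts.items() for _ in range(count)]

n = len(visits)
total = sum(visits)
mean = total / n
median = statistics.median(visits)

# Effect of the single largest value on the mean.
largest = max(visits)
mean_without_largest = (total - largest) / (n - 1)
share_of_largest = largest / total

print(f"n={n} sum={total} mean={mean:.2f} median={median}")
print(f"mean without largest value: {mean_without_largest:.3f}")
print(f"share of total held by largest value: {share_of_largest:.1%}")
```

For a column this skewed, the median and the count of non-zero entries describe the data more faithfully than the mean or standard deviation.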

Interactions


Correlations

문학단체명  사업연도  사업시작일  사업종료일  문예지분야  발간주기  발간형태_종이책  발간형태_전자책  발간형태_웹진  종이책_총발행호수(호)  전자책_총발행호수(호)  웹진누적방문수(명)
문학단체명  1.000  0.000  0.000  0.000  0.000  0.784  0.434  0.602  0.718  0.432  0.000  0.000
사업연도  0.000  1.000  1.000  0.985  0.265  0.283  0.420  0.000  0.000  0.000  0.056  0.000
사업시작일  0.000  1.000  1.000  0.886  0.323  0.154  0.423  0.306  0.000  0.000  0.000  0.000
사업종료일  0.000  0.985  0.886  1.000  0.572  0.000  0.291  0.000  0.000  0.000  0.000  0.000
문예지분야  0.000  0.265  0.323  0.572  1.000  0.247  0.292  0.000  0.000  0.000  0.000  0.000
발간주기  0.784  0.283  0.154  0.000  0.247  1.000  0.119  0.067  0.000  0.493  0.000  0.000
발간형태_종이책  0.434  0.420  0.423  0.291  0.292  0.119  1.000  0.124  0.000  0.000  0.000  0.000
발간형태_전자책  0.602  0.000  0.306  0.000  0.000  0.067  0.124  1.000  0.684  0.225  0.558  0.000
발간형태_웹진  0.718  0.000  0.000  0.000  0.000  0.000  0.000  0.684  1.000  0.370  0.671  0.000
종이책_총발행호수(호)  0.432  0.000  0.000  0.000  0.000  0.493  0.000  0.225  0.370  1.000  0.205  0.000
전자책_총발행호수(호)  0.000  0.056  0.000  0.000  0.000  0.000  0.000  0.558  0.671  0.205  1.000  0.000
웹진누적방문수(명)  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.000

발간형태_웹진  발간주기  발간형태_전자책  발간형태_종이책  전자책_총발행호수(호)  문예지분야
발간형태_웹진  1.000  0.000  0.479  0.000  0.790  0.000
발간주기  0.000  1.000  0.078  0.142  0.000  0.159
발간형태_전자책  0.479  0.078  1.000  0.078  0.664  0.000
발간형태_종이책  0.000  0.142  0.078  1.000  0.000  0.303
전자책_총발행호수(호)  0.790  0.000  0.664  0.000  1.000  0.000
문예지분야  0.000  0.159  0.000  0.303  0.000  1.000

사업연도  종이책_총발행호수(호)  웹진누적방문수(명)  문예지분야  발간주기  발간형태_종이책  발간형태_전자책  발간형태_웹진  전자책_총발행호수(호)
사업연도  1.000  0.298  0.050  0.141  0.106  0.276  0.000  0.000  0.064
종이책_총발행호수(호)  0.298  1.000  0.123  0.000  0.342  0.000  0.233  0.385  0.128
웹진누적방문수(명)  0.050  0.123  1.000  0.000  0.000  0.000  0.000  0.000  0.000
문예지분야  0.141  0.000  0.000  1.000  0.159  0.303  0.000  0.000  0.000
발간주기  0.106  0.342  0.000  0.159  1.000  0.142  0.078  0.000  0.000
발간형태_종이책  0.276  0.000  0.000  0.303  0.142  1.000  0.078  0.000  0.000
발간형태_전자책  0.000  0.233  0.000  0.000  0.078  0.078  1.000  0.479  0.664
발간형태_웹진  0.000  0.385  0.000  0.000  0.000  0.000  0.479  1.000  0.790
전자책_총발행호수(호)  0.064  0.128  0.000  0.000  0.000  0.000  0.664  0.790  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
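
As a rough illustration of nullity correlation, the sketch below encodes each cell as 1 (missing) or 0 (present) and computes the Pearson correlation between two such indicator columns. It is not the report's own computation: it uses only the twenty rows listed in the Sample section (rows 0-9 and 171-180), so the values will differ from the heatmap, and `pearson` is a hand-written helper rather than a library call.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Missingness indicators (1 = <NA>) for sample rows 0-9 followed by rows 171-180.
start_date_missing = [1] * 10 + [0] * 10      # 사업시작일: missing in all 2014 rows shown
paper_flag_missing = [1] * 10 + [0] * 10      # 발간형태_종이책: same pattern
paper_issues_missing = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0] + [0] * 10  # 종이책_총발행호수(호): only row 3

dates_vs_flags = pearson(start_date_missing, paper_flag_missing)
dates_vs_issues = pearson(start_date_missing, paper_issues_missing)
print(f"사업시작일 vs 발간형태_종이책: {dates_vs_flags:.3f}")
print(f"사업시작일 vs 종이책_총발행호수(호): {dates_vs_issues:.3f}")
```

In this subset the start date and the publication-format flag are always missing together, so their nullity correlation is 1, while the issue count is missing in only one row and correlates weakly.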

Sample

문학단체명  사업연도  사업시작일  사업종료일  문예지분야  발간주기  발간형태_종이책  발간형태_전자책  발간형태_웹진  발간형태_기타  종이책_총발행호수(호)  전자책_총발행호수(호)  웹진누적방문수(명)
0  *제**부  2014  <NA>  <NA>  종합  격월간  <NA>  <NA>  <NA>  <NA>  123  <NA>  <NA>
1  *국**회  2014  <NA>  <NA>  종합  월간  <NA>  <NA>  <NA>  <NA>  550  <NA>  <NA>
2  *1**학  2014  <NA>  <NA>  종합  계간  <NA>  <NA>  <NA>  <NA>  4  <NA>  <NA>
3  *학**네  2014  <NA>  <NA>  종합  미뷴류  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
4  *학**상  2014  <NA>  <NA>  종합  월간  <NA>  <NA>  <NA>  <NA>  12  <NA>  <NA>
5  *음**사  2014  <NA>  <NA>  종합  계간  <NA>  <NA>  <NA>  <NA>  4  <NA>  <NA>
6  *천**학  2014  <NA>  <NA>  종합  계간  <NA>  <NA>  <NA>  <NA>  116  <NA>  <NA>
7  *행**사  2014  <NA>  <NA>  종합  계간  <NA>  <NA>  <NA>  <NA>  13  <NA>  <NA>
8  *년**작  2014  <NA>  <NA>  시(조)  계간  <NA>  <NA>  <NA>  <NA>  51  <NA>  <NA>
9  *간**선  2014  <NA>  <NA>  종합  계간  <NA>  <NA>  <NA>  <NA>  42  <NA>  <NA>

문학단체명  사업연도  사업시작일  사업종료일  문예지분야  발간주기  발간형태_종이책  발간형태_전자책  발간형태_웹진  발간형태_기타  종이책_총발행호수(호)  전자책_총발행호수(호)  웹진누적방문수(명)
171  *서**망  2019  2019-03-01  2019-12-31  시(조)  계간  Y  N  N  N  81  0  0
172  *와**시  2019  2019-03-01  2019-12-31  시(조)  계간  Y  N  N  N  110  0  0
173  *국**회  2019  2019-03-01  2020-02-29  소설  계간  Y  N  N  N  66  0  0
174  *시**아  2019  2019-04-01  2019-12-31  종합  계간  Y  Y  N  N  55  0  0
175  *국**회  2019  2019-05-01  2019-12-31  수필  월간  Y  N  N  N  298  0  0
176  *행**사  2019  2019-01-01  2019-12-31  종합  격월간  Y  N  N  N  27  0  0
177  *국**연  2019  2019-01-01  2019-12-31  시(조)  월간  Y  N  N  N  1500  0  0
178  *대**학  2019  2019-01-01  2019-12-31  종합  월간  Y  N  N  N  781  0  0
179  *국**회  2019  2019-01-01  2019-12-31  종합  격월간  Y  Y  Y  N  437  6  600
180  *천**학  2019  2019-01-01  2019-12-31  종합  계간  Y  N  N  N  134  0  0