Overview

Dataset statistics

Number of variables: 8
Number of observations: 181
Missing cells: 288
Missing cells (%): 19.9%
Duplicate rows: 5
Duplicate rows (%): 2.8%
Total size in memory: 12.3 KiB
Average record size in memory: 69.7 B
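
The two percentages above follow from the counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (observations × variables) and the duplicate share over all observations. The per-column missing counts are the ones listed under Alerts, and the helper name `percent` is only illustrative:

```python
# Counts copied from the report.
n_variables = 8
n_observations = 181
missing_per_column = [71, 72, 71, 74]  # the four columns flagged as Missing
duplicate_rows = 5


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


missing_cells = sum(missing_per_column)
total_cells = n_variables * n_observations

print(missing_cells)                            # 288
print(percent(missing_cells, total_cells))      # 19.9
print(percent(duplicate_rows, n_observations))  # 2.8
```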

Variable types

Text: 1
Numeric: 5
Categorical: 2

Dataset

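Description: Annual publication records (e.g., publication frequency, issues per year, annual production cost) of literary magazines funded through the "문예지발간" (literary magazine publication) support program in the literature category of the 2014-2019 문예진흥기금 (Culture and Arts Promotion Fund) open-call projects
Author: 한국문화예술위원회 (Arts Council Korea)
URL: https://www.data.go.kr/data/15076420/fileData.do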

Alerts

Dataset has 5 (2.8%) duplicate rows [Duplicates]
종이책_연간발간횟수(회) is highly overall correlated with 종이책_연간제작비총액(원) and 1 other field [High correlation]
종이책_연간제작비총액(원) is highly overall correlated with 종이책_연간발간횟수(회) [High correlation]
전자책웹진_연간발간횟수(회) is highly overall correlated with 전자책웹진_연간제작비총액(원) and 1 other field [High correlation]
전자책웹진_연간제작비총액(원) is highly overall correlated with 전자책웹진_연간발간횟수(회) and 1 other field [High correlation]
종이책_주기 is highly overall correlated with 종이책_연간발간횟수(회) [High correlation]
전자책웹진_주기 is highly overall correlated with 전자책웹진_연간발간횟수(회) and 1 other field [High correlation]
종이책_연간발간횟수(회) has 71 (39.2%) missing values [Missing]
종이책_연간제작비총액(원) has 72 (39.8%) missing values [Missing]
전자책웹진_연간발간횟수(회) has 71 (39.2%) missing values [Missing]
전자책웹진_연간제작비총액(원) has 74 (40.9%) missing values [Missing]
전자책웹진_연간발간횟수(회) has 96 (53.0%) zeros [Zeros]
전자책웹진_연간제작비총액(원) has 96 (53.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 14:09:37.743219
Analysis finished: 2023-12-12 14:09:41.375577
Duration: 3.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

문학단체명 (name of the literary organization, masked)
Text

Distinct: 62
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 905
Distinct characters: 68
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
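
As a rough illustration of how such character properties can be read off, the sketch below uses Python's standard `unicodedata` module on the five sample values shown in this report. It is not the profiler's own implementation. The standard library exposes general categories but not scripts, so the Hangul check by character name is an assumption made for this example:

```python
import unicodedata
from collections import Counter

# First five sample values of the text variable, copied from the report.
samples = ["*제**부", "*국**회", "*1**학", "*학**네", "*학**상"]

categories = Counter()
hangul = 0
for value in samples:
    # Normalize so that each Hangul syllable is a single code point.
    for ch in unicodedata.normalize("NFC", value):
        # Two-letter general category, e.g. "Po" (Other Punctuation),
        # "Lo" (Other Letter), "Nd" (Decimal Number).
        categories[unicodedata.category(ch)] += 1
        # Rough script check (assumption): Hangul code points have
        # names that start with "HANGUL".
        if unicodedata.name(ch, "").startswith("HANGUL"):
            hangul += 1

print(dict(categories))
print(hangul)
```

On these five values the three categories that appear are the same three the report lists: Other Punctuation, Other Letter, and Decimal Number.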

Unique

Unique: 19
Unique (%): 10.5%

Sample

1st row: *제**부
2nd row: *국**회
3rd row: *1**학
4th row: *학**네
5th row: *학**상

Value | Count | Frequency (%)
국**회 | 45 | 24.9%
대**학 | 8 | 4.4%
제**부 | 5 | 2.8%
학**네 | 4 | 2.2%
학**상 | 4 | 2.2%
비**비 | 4 | 2.2%
학**사 | 4 | 2.2%
음**음 | 4 | 2.2%
년**작 | 3 | 1.7%
학**당 | 3 | 1.7%
Other values (52) | 97 | 53.6%

Most occurring characters

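Value | Count | Frequency (%)
* | 543 | 60.0%
[character not preserved] | 54 | 6.0%
[character not preserved] | 48 | 5.3%
[character not preserved] | 41 | 4.5%
[character not preserved] | 17 | 1.9%
[character not preserved] | 11 | 1.2%
[character not preserved] | 10 | 1.1%
[character not preserved] | 9 | 1.0%
[character not preserved] | 9 | 1.0%
[character not preserved] | 8 | 0.9%
Other values (58) | 155 | 17.1%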

Most occurring categories

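Value | Count | Frequency (%)
Other Punctuation | 543 | 60.0%
Other Letter | 360 | 39.8%
Decimal Number | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character not preserved] | 54 | 15.0%
[character not preserved] | 48 | 13.3%
[character not preserved] | 41 | 11.4%
[character not preserved] | 17 | 4.7%
[character not preserved] | 11 | 3.1%
[character not preserved] | 10 | 2.8%
[character not preserved] | 9 | 2.5%
[character not preserved] | 9 | 2.5%
[character not preserved] | 8 | 2.2%
[character not preserved] | 7 | 1.9%
Other values (56) | 146 | 40.6%

Other Punctuation
Value | Count | Frequency (%)
* | 543 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 100.0%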

Most occurring scripts

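Value | Count | Frequency (%)
Common | 545 | 60.2%
Hangul | 360 | 39.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[character not preserved] | 54 | 15.0%
[character not preserved] | 48 | 13.3%
[character not preserved] | 41 | 11.4%
[character not preserved] | 17 | 4.7%
[character not preserved] | 11 | 3.1%
[character not preserved] | 10 | 2.8%
[character not preserved] | 9 | 2.5%
[character not preserved] | 9 | 2.5%
[character not preserved] | 8 | 2.2%
[character not preserved] | 7 | 1.9%
Other values (56) | 146 | 40.6%

Common
Value | Count | Frequency (%)
* | 543 | 99.6%
1 | 2 | 0.4%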

Most occurring blocks

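Value | Count | Frequency (%)
ASCII | 545 | 60.2%
Hangul | 360 | 39.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
* | 543 | 99.6%
1 | 2 | 0.4%

Hangul
Value | Count | Frequency (%)
[character not preserved] | 54 | 15.0%
[character not preserved] | 48 | 13.3%
[character not preserved] | 41 | 11.4%
[character not preserved] | 17 | 4.7%
[character not preserved] | 11 | 3.1%
[character not preserved] | 10 | 2.8%
[character not preserved] | 9 | 2.5%
[character not preserved] | 9 | 2.5%
[character not preserved] | 8 | 2.2%
[character not preserved] | 7 | 1.9%
Other values (56) | 146 | 40.6%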

사업연도 (program year)
Real number (ℝ)

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.7624
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2014
median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.0395867
Coefficient of variation (CV): 0.0010113173
Kurtosis: -1.5893876
Mean: 2016.7624
Median Absolute Deviation (MAD): 1
Skewness: -0.35284142
Sum: 365034
Variance: 4.1599141
Monotonicity: Increasing
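
Several of these figures can be re-derived from the value counts for this variable. A minimal sketch, assuming the reported variance and standard deviation use the sample (n - 1) denominator, which is what Python's `statistics.variance` and `statistics.stdev` compute. The results agree with the values above to the displayed precision:

```python
import statistics

# Value counts for 사업연도, copied from the report's frequency table.
counts = {2014: 51, 2015: 14, 2016: 6, 2017: 13, 2018: 50, 2019: 47}

# Expand the table back into one value per observation.
years = [year for year, n in counts.items() for _ in range(n)]

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)          # sample standard deviation
cv = std / mean                        # coefficient of variation

print(len(years), sum(years))  # 181 365034
print(round(mean, 4))
print(round(variance, 7))
print(round(std, 7))
print(cv)
```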
[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
2014 | 51 | 28.2%
2018 | 50 | 27.6%
2019 | 47 | 26.0%
2015 | 14 | 7.7%
2017 | 13 | 7.2%
2016 | 6 | 3.3%

Extreme values (smallest)
Value | Count | Frequency (%)
2014 | 51 | 28.2%
2015 | 14 | 7.7%
2016 | 6 | 3.3%
2017 | 13 | 7.2%
2018 | 50 | 27.6%
2019 | 47 | 26.0%

Extreme values (largest)
Value | Count | Frequency (%)
2019 | 47 | 26.0%
2018 | 50 | 27.6%
2017 | 13 | 7.2%
2016 | 6 | 3.3%
2015 | 14 | 7.7%
2014 | 51 | 28.2%

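종이책_주기 (print edition, publication frequency)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
<NA> | 71
계간 (quarterly) | 65
월간 (monthly) | 17
반년간 (semiannual) | 16
격월간 (bimonthly) | 12

Length

Max length: 4
Median length: 3
Mean length: 2.9392265
Min length: 2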
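
These length statistics are consistent with the missing marker being profiled as the literal four-character string "<NA>". That would also explain why this variable reports 0 missing values although 71 rows show <NA>. This is an inference from the numbers, not something the report states. A minimal sketch of the check, assuming lengths are counted in characters:

```python
import statistics

# Category counts for 종이책_주기, copied from the report. The missing
# marker is treated here as the literal four-character string "<NA>".
counts = {"<NA>": 71, "계간": 65, "월간": 17, "반년간": 16, "격월간": 12}

# One length per observation.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print(max(lengths), min(lengths))          # 4 2
print(statistics.median(lengths))          # 3
print(round(statistics.mean(lengths), 7))  # 2.9392265
```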

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 71 | 39.2%
계간 (quarterly) | 65 | 35.9%
월간 (monthly) | 17 | 9.4%
반년간 (semiannual) | 16 | 8.8%
격월간 (bimonthly) | 12 | 6.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

종이책_연간발간횟수(회) (print edition, issues per year)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 6
Distinct (%): 5.5%
Missing: 71
Missing (%): 39.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.0181818
Minimum: 2
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 4
median: 4
Q3: 5.5
95-th percentile: 12
Maximum: 12
Range: 10
Interquartile range (IQR): 1.5
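
The non-integer Q3 of 5.5 comes from interpolating between two neighbouring order statistics (a 4 and a 6). A minimal sketch that rebuilds the 110 recorded values from the report's value counts and computes the quartiles with linear interpolation. It uses `statistics.quantiles` with `method="inclusive"`, an assumption that happens to reproduce the figures above:

```python
import statistics

# Value counts for 종이책_연간발간횟수(회), copied from the report
# (the 71 missing rows are excluded).
counts = {2: 16, 3: 2, 4: 64, 6: 11, 8: 3, 12: 14}
issues = sorted(v for v, n in counts.items() for _ in range(n))

# method="inclusive" interpolates linearly between the two nearest
# order statistics, treating the data as the full population.
q1, median, q3 = statistics.quantiles(issues, n=4, method="inclusive")

print(len(issues), sum(issues))  # 110 552
print(q1, median, q3)            # 4.0 4.0 5.5
print(q3 - q1)                   # 1.5
```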

Descriptive statistics

Standard deviation: 2.936583
Coefficient of variation (CV): 0.58518864
Kurtosis: 1.4704865
Mean: 5.0181818
Median Absolute Deviation (MAD): 0
Skewness: 1.5979992
Sum: 552
Variance: 8.6235196
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
4 | 64 | 35.4%
2 | 16 | 8.8%
12 | 14 | 7.7%
6 | 11 | 6.1%
8 | 3 | 1.7%
3 | 2 | 1.1%
(Missing) | 71 | 39.2%

Extreme values (smallest)
Value | Count | Frequency (%)
2 | 16 | 8.8%
3 | 2 | 1.1%
4 | 64 | 35.4%
6 | 11 | 6.1%
8 | 3 | 1.7%
12 | 14 | 7.7%

Extreme values (largest)
Value | Count | Frequency (%)
12 | 14 | 7.7%
8 | 3 | 1.7%
6 | 11 | 6.1%
4 | 64 | 35.4%
3 | 2 | 1.1%
2 | 16 | 8.8%

종이책_연간제작비총액(원) (print edition, total annual production cost in KRW)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 92
Distinct (%): 84.4%
Missing: 72
Missing (%): 39.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 63559351
Minimum: 7800000
Maximum: 2.7009535 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 7800000
5-th percentile: 18584800
Q1: 28000000
median: 40000000
Q3: 80000000
95-th percentile: 1.8744 × 10^8
Maximum: 2.7009535 × 10^8
Range: 2.6229535 × 10^8
Interquartile range (IQR): 52000000

Descriptive statistics

Standard deviation: 55805942
Coefficient of variation (CV): 0.87801309
Kurtosis: 3.0232201
Mean: 63559351
Median Absolute Deviation (MAD): 15600000
Skewness: 1.8224337
Sum: 6.9279693 × 10^9
Variance: 3.1143032 × 10^15
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
40000000 | 3 | 1.7%
29400000 | 2 | 1.1%
28000000 | 2 | 1.1%
26000000 | 2 | 1.1%
40500000 | 2 | 1.1%
112400000 | 2 | 1.1%
38400000 | 2 | 1.1%
44000000 | 2 | 1.1%
34800000 | 2 | 1.1%
96000000 | 2 | 1.1%
Other values (82) | 88 | 48.6%
(Missing) | 72 | 39.8%

Extreme values (smallest)
Value | Count | Frequency (%)
7800000 | 1 | 0.6%
12000000 | 1 | 0.6%
13000000 | 1 | 0.6%
16600000 | 1 | 0.6%
17400000 | 1 | 0.6%
18308000 | 1 | 0.6%
19000000 | 1 | 0.6%
19744000 | 1 | 0.6%
20000000 | 1 | 0.6%
21000000 | 2 | 1.1%

Extreme values (largest)
Value | Count | Frequency (%)
270095352 | 1 | 0.6%
254876832 | 1 | 0.6%
247841796 | 1 | 0.6%
193808000 | 1 | 0.6%
190000000 | 1 | 0.6%
188400000 | 1 | 0.6%
186000000 | 1 | 0.6%
184000000 | 1 | 0.6%
182400000 | 1 | 0.6%
150000000 | 1 | 0.6%

전자책웹진_주기 (e-book/webzine, publication frequency)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
미분류 (unclassified) | 96
<NA> | 71
계간 (quarterly) | 10
월간 (monthly) | 2
반년간 (semiannual) | 1

Length

Max length: 4
Median length: 3
Mean length: 3.3259669
Min length: 2

Unique

Unique: 2
Unique (%): 1.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
미분류 (unclassified) | 96 | 53.0%
<NA> | 71 | 39.2%
계간 (quarterly) | 10 | 5.5%
월간 (monthly) | 2 | 1.1%
반년간 (semiannual) | 1 | 0.6%
격월간 (bimonthly) | 1 | 0.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

전자책웹진_연간발간횟수(회) (e-book/webzine, issues per year)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 6
Distinct (%): 5.5%
Missing: 71
Missing (%): 39.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.64545455
Minimum: 0
Maximum: 12
Zeros: 96
Zeros (%): 53.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 4
Maximum: 12
Range: 12
Interquartile range (IQR): 0
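
The percentages reported for this variable do not all share a denominator. The figures are consistent with Missing (%) and Zeros (%) being taken over all 181 rows, and Distinct (%) over the 110 recorded values. This reading is inferred from the numbers rather than stated by the report. A minimal sketch of the arithmetic, with `pct` as an illustrative helper:

```python
# Counts for 전자책웹진_연간발간횟수(회), copied from the report.
n_rows = 181
n_missing = 71
n_zeros = 96
n_distinct = 6

n_present = n_rows - n_missing  # rows with a recorded value


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(pct(n_missing, n_rows))      # 39.2  Missing (%), over all rows
print(pct(n_zeros, n_rows))        # 53.0  Zeros (%), over all rows
print(pct(n_distinct, n_present))  # 5.5   Distinct (%), over recorded values
print(pct(n_zeros, n_present))     # 87.3  zeros among recorded values
```

Because roughly 87% of the recorded values are zero, the quartiles all fall on 0, and only the upper tail (95-th percentile and maximum) is non-zero.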

Descriptive statistics

Standard deviation: 1.9981017
Coefficient of variation (CV): 3.0956505
Kurtosis: 18.356677
Mean: 0.64545455
Median Absolute Deviation (MAD): 0
Skewness: 3.9952305
Sum: 71
Variance: 3.9924103
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
0 | 96 | 53.0%
4 | 9 | 5.0%
12 | 2 | 1.1%
3 | 1 | 0.6%
2 | 1 | 0.6%
6 | 1 | 0.6%
(Missing) | 71 | 39.2%

Extreme values (smallest)
Value | Count | Frequency (%)
0 | 96 | 53.0%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 9 | 5.0%
6 | 1 | 0.6%
12 | 2 | 1.1%

Extreme values (largest)
Value | Count | Frequency (%)
12 | 2 | 1.1%
6 | 1 | 0.6%
4 | 9 | 5.0%
3 | 1 | 0.6%
2 | 1 | 0.6%
0 | 96 | 53.0%

전자책웹진_연간제작비총액(원) (e-book/webzine, total annual production cost in KRW)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 11
Distinct (%): 10.3%
Missing: 74
Missing (%): 40.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4220484.1
Minimum: 0
Maximum: 2.478418 × 10^8
Zeros: 96
Zeros (%): 53.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 15282000
Maximum: 2.478418 × 10^8
Range: 2.478418 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 25381183
Coefficient of variation (CV): 6.0138086
Kurtosis: 82.078935
Mean: 4220484.1
Median Absolute Deviation (MAD): 0
Skewness: 8.6993272
Sum: 4.515918 × 10^8
Variance: 6.4420447 × 10^14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=11)]

Common values
Value | Count | Frequency (%)
0 | 96 | 53.0%
5440000 | 2 | 1.1%
63600000 | 1 | 0.6%
247841796 | 1 | 0.6%
450000 | 1 | 0.6%
42000000 | 1 | 0.6%
19500000 | 1 | 0.6%
1200000 | 1 | 0.6%
20000000 | 1 | 0.6%
46000000 | 1 | 0.6%
(Missing) | 74 | 40.9%

Extreme values (smallest)
Value | Count | Frequency (%)
0 | 96 | 53.0%
120000 | 1 | 0.6%
450000 | 1 | 0.6%
1200000 | 1 | 0.6%
5440000 | 2 | 1.1%
19500000 | 1 | 0.6%
20000000 | 1 | 0.6%
42000000 | 1 | 0.6%
46000000 | 1 | 0.6%
63600000 | 1 | 0.6%

Extreme values (largest)
Value | Count | Frequency (%)
247841796 | 1 | 0.6%
63600000 | 1 | 0.6%
46000000 | 1 | 0.6%
42000000 | 1 | 0.6%
20000000 | 1 | 0.6%
19500000 | 1 | 0.6%
5440000 | 2 | 1.1%
1200000 | 1 | 0.6%
450000 | 1 | 0.6%
120000 | 1 | 0.6%

Correlations

Variable | 문학단체명 | 사업연도 | 종이책_주기 | 종이책_연간발간횟수(회) | 종이책_연간제작비총액(원) | 전자책웹진_주기 | 전자책웹진_연간발간횟수(회) | 전자책웹진_연간제작비총액(원)
문학단체명 | 1.000 | 0.000 | 0.852 | 0.620 | 0.701 | 0.000 | 0.187 | 0.545
사업연도 | 0.000 | 1.000 | 0.000 | 0.000 | 0.242 | 0.063 | 0.000 | 0.000
종이책_주기 | 0.852 | 0.000 | 1.000 | 0.999 | 0.611 | 0.230 | 0.250 | 0.000
종이책_연간발간횟수(회) | 0.620 | 0.000 | 0.999 | 1.000 | 0.625 | 0.198 | 0.701 | 0.000
종이책_연간제작비총액(원) | 0.701 | 0.242 | 0.611 | 0.625 | 1.000 | 0.355 | 0.352 | 0.368
전자책웹진_주기 | 0.000 | 0.063 | 0.230 | 0.198 | 0.355 | 1.000 | 1.000 | 0.921
전자책웹진_연간발간횟수(회) | 0.187 | 0.000 | 0.250 | 0.701 | 0.352 | 1.000 | 1.000 | 0.698
전자책웹진_연간제작비총액(원) | 0.545 | 0.000 | 0.000 | 0.000 | 0.368 | 0.921 | 0.698 | 1.000

Variable | 종이책_주기 | 전자책웹진_주기
종이책_주기 | 1.000 | 0.187
전자책웹진_주기 | 0.187 | 1.000

Variable | 사업연도 | 종이책_연간발간횟수(회) | 종이책_연간제작비총액(원) | 전자책웹진_연간발간횟수(회) | 전자책웹진_연간제작비총액(원) | 종이책_주기 | 전자책웹진_주기
사업연도 | 1.000 | 0.011 | 0.033 | -0.039 | -0.096 | 0.000 | 0.043
종이책_연간발간횟수(회) | 0.011 | 1.000 | 0.697 | -0.036 | -0.075 | 0.954 | 0.114
종이책_연간제작비총액(원) | 0.033 | 0.697 | 1.000 | 0.195 | 0.197 | 0.435 | 0.210
전자책웹진_연간발간횟수(회) | -0.039 | -0.036 | 0.195 | 1.000 | 0.998 | 0.160 | 0.995
전자책웹진_연간제작비총액(원) | -0.096 | -0.075 | 0.197 | 0.998 | 1.000 | 0.000 | 0.628
종이책_주기 | 0.000 | 0.954 | 0.435 | 0.160 | 0.000 | 1.000 | 0.187
전자책웹진_주기 | 0.043 | 0.114 | 0.210 | 0.995 | 0.628 | 0.187 | 1.000
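
The "High correlation" alerts near the top of the report can be cross-checked against the last matrix above. The sketch below lists, for each variable, the other variables whose absolute coefficient reaches a cutoff. The cutoff of 0.5 is an assumption chosen because it reproduces the six alerts, not a value stated in this report, and `high_correlations` is an illustrative helper rather than part of any library:

```python
# Last correlation matrix from the report (values copied verbatim).
names = [
    "사업연도",
    "종이책_연간발간횟수(회)",
    "종이책_연간제작비총액(원)",
    "전자책웹진_연간발간횟수(회)",
    "전자책웹진_연간제작비총액(원)",
    "종이책_주기",
    "전자책웹진_주기",
]
matrix = [
    [1.000, 0.011, 0.033, -0.039, -0.096, 0.000, 0.043],
    [0.011, 1.000, 0.697, -0.036, -0.075, 0.954, 0.114],
    [0.033, 0.697, 1.000, 0.195, 0.197, 0.435, 0.210],
    [-0.039, -0.036, 0.195, 1.000, 0.998, 0.160, 0.995],
    [-0.096, -0.075, 0.197, 0.998, 1.000, 0.000, 0.628],
    [0.000, 0.954, 0.435, 0.160, 0.000, 1.000, 0.187],
    [0.043, 0.114, 0.210, 0.995, 0.628, 0.187, 1.000],
]

THRESHOLD = 0.5  # assumption: picked because it reproduces the alerts


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |r| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = high_correlations(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

With this cutoff, six variables are flagged and 사업연도 is not, which matches the alert list.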

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

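First rows

Index | 문학단체명 | 사업연도 | 종이책_주기 | 종이책_연간발간횟수(회) | 종이책_연간제작비총액(원) | 전자책웹진_주기 | 전자책웹진_연간발간횟수(회) | 전자책웹진_연간제작비총액(원)
0 | *제**부 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | *국**회 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | *1**학 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | *학**네 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | *학**상 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | *음**사 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | *천**학 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | *행**사 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | *년**작 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | *간**선 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

Index | 문학단체명 | 사업연도 | 종이책_주기 | 종이책_연간발간횟수(회) | 종이책_연간제작비총액(원) | 전자책웹진_주기 | 전자책웹진_연간발간횟수(회) | 전자책웹진_연간제작비총액(원)
171 | *서**망 | 2019 | 계간 | 4 | 44000000 | 미분류 | 0 | 0
172 | *와**시 | 2019 | 계간 | 4 | 19744000 | 미분류 | 0 | 0
173 | *국**회 | 2019 | 계간 | 4 | 26000000 | 미분류 | 0 | 0
174 | *시**아 | 2019 | 계간 | 4 | 68800000 | 미분류 | 0 | 0
175 | *국**회 | 2019 | 월간 | 8 | 66400000 | 미분류 | 0 | 0
176 | *행**사 | 2019 | 격월간 | 6 | 127920000 | 미분류 | 0 | 0
177 | *국**연 | 2019 | 월간 | 12 | 96000000 | 미분류 | 0 | 0
178 | *대**학 | 2019 | 월간 | 12 | 188400000 | 미분류 | 0 | 0
179 | *국**회 | 2019 | 격월간 | 6 | 34000000 | 격월간 | 6 | <NA>
180 | *천**학 | 2019 | 계간 | 4 | 28000000 | 미분류 | 0 | 0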

Duplicate rows

Most frequently occurring

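Index | 문학단체명 | 사업연도 | 종이책_주기 | 종이책_연간발간횟수(회) | 종이책_연간제작비총액(원) | 전자책웹진_주기 | 전자책웹진_연간발간횟수(회) | 전자책웹진_연간제작비총액(원) | # duplicates
0 | *국**회 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 10
2 | *국**회 | 2016 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 5
1 | *국**회 | 2015 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
3 | *대**학 | 2014 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
4 | *대**학 | 2015 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2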
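
A table like this can be built by counting how often each complete row occurs and keeping the rows seen more than once. A minimal sketch using only the standard library. The input is a small illustrative list shaped like the rows above, not the full dataset: the first three counts match the table, and the single *제**부 row is added only to show a non-duplicate:

```python
from collections import Counter

# None stands in for <NA>; each row is (name, year, six remaining fields).
NA6 = (None,) * 6
rows = (
    [("*국**회", 2014) + NA6] * 10
    + [("*국**회", 2016) + NA6] * 5
    + [("*대**학", 2014) + NA6] * 2
    + [("*제**부", 2014) + NA6]  # occurs once, so it is not a duplicate
)

occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

# Most frequently occurring first.
for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(row[0], row[1], n)
```

Under this reading, the "Duplicate rows: 5" figure in the overview is consistent with the number of distinct duplicated rows (the five lines of the table) rather than the total number of repeated records.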