Overview

Dataset statistics

Number of variables: 8
Number of observations: 181
Missing cells: 142
Missing cells (%): 9.8%
Duplicate rows: 3
Duplicate rows (%): 1.7%
Total size in memory: 12.7 KiB
Average record size in memory: 71.7 B
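
The percentages and sizes above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only.

```python
# Cross-check the overview figures from the raw counts reported above.
n_variables = 8
n_observations = 181
missing_cells = 142
duplicate_rows = 3
avg_record_bytes = 71.7  # as reported, already rounded to one decimal

total_cells = n_variables * n_observations             # 1448 cells in total
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows
total_kib = avg_record_bytes * n_observations / 1024   # bytes converted to KiB

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size: {total_kib:.1f} KiB")
```

Under these assumptions the three printed values agree with the reported 9.8%, 1.7% and 12.7 KiB.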

Variable types

Text: 1
Numeric: 5
Categorical: 2

Dataset

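Description: Sales figures for literary magazines funded through the "문예지발간" (literary magazine publication) support program, part of the literature category of the 문예진흥기금 (Culture and Arts Promotion Fund) open-call projects, 2014-2019 (e.g., total annual print sales, total annual e-book sales, total annual webzine sales)
Author: 한국문화예술위원회 (Arts Council Korea)
URL: https://www.data.go.kr/data/15076414/fileData.do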

Alerts

Dataset has 3 (1.7%) duplicate rows [Duplicates]
웹진_작년_연간총판매량(부) is highly overall correlated with 전자책_작년_연간총판매량(부) and 1 other field [High correlation]
웹진_금년_연간총판매량(부) is highly overall correlated with 전자책_작년_연간총판매량(부) and 1 other field [High correlation]
종이책_작년_연간총판매량(부) is highly overall correlated with 종이책_금년_연간총판매량(부) [High correlation]
종이책_금년_연간총판매량(부) is highly overall correlated with 종이책_작년_연간총판매량(부) [High correlation]
전자책_작년_연간총판매량(부) is highly overall correlated with 전자책_금년_연간총판매량(부) and 2 other fields [High correlation]
전자책_금년_연간총판매량(부) is highly overall correlated with 전자책_작년_연간총판매량(부) [High correlation]
웹진_작년_연간총판매량(부) is highly imbalanced (52.5%) [Imbalance]
웹진_금년_연간총판매량(부) is highly imbalanced (55.6%) [Imbalance]
전자책_작년_연간총판매량(부) has 71 (39.2%) missing values [Missing]
전자책_금년_연간총판매량(부) has 71 (39.2%) missing values [Missing]
종이책_작년_연간총판매량(부) has 49 (27.1%) zeros [Zeros]
종이책_금년_연간총판매량(부) has 51 (28.2%) zeros [Zeros]
전자책_작년_연간총판매량(부) has 104 (57.5%) zeros [Zeros]
전자책_금년_연간총판매량(부) has 104 (57.5%) zeros [Zeros]
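
The two imbalance percentages in the alerts are consistent with a score of one minus the normalised Shannon entropy of the class counts. That the profiler uses exactly this formula is an assumption inferred from the numbers, and `imbalance_score` is a hypothetical helper name. The counts come from the Common Values tables of the two 웹진 columns.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Class counts as listed for the two webzine columns ("0", "<NA>", then singletons).
prev_year_counts = [107, 71, 1, 1, 1]
this_year_counts = [106, 71, 1, 1, 1, 1]

print(f"{100 * imbalance_score(prev_year_counts):.1f}%")  # 52.5%
print(f"{100 * imbalance_score(this_year_counts):.1f}%")  # 55.6%
```

A score of 0 would mean perfectly balanced classes and 1 a single dominant class, so both columns sit roughly in the middle because two values ("0" and "<NA>") share almost all rows.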

Reproduction

Analysis started: 2023-12-12 21:14:18.760119
Analysis finished: 2023-12-12 21:14:22.234236
Duration: 3.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
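
A report of this kind can be regenerated roughly as follows. This is a minimal sketch: `build_report`, the file paths, the report title and the `cp949` encoding are all assumptions (the encoding is only a common choice for Korean public-data CSV files), and the original `config.json` settings are not reproduced. `ProfileReport` and `to_file` are the ydata-profiling entry points.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report (paths and encoding are assumptions)."""
    # Imported lazily so that defining the function does not require the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Literary magazine sales (hypothetical title)")
    report.to_file(html_path)
    return report
```

Calling `build_report("sales.csv")` with a locally downloaded copy of the dataset would write `report.html`; the file name here is hypothetical.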

Variables

문학단체명
Text

Distinct: 62
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 905
Distinct characters: 68
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
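
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using three of the masked names from this variable's sample rows. How the profiler itself derives its category, script and block tables is not shown in the report, so this is only an approximation of the idea.

```python
import unicodedata
from collections import Counter

# Three masked organisation names taken from the sample rows of this variable.
names = ["*제**부", "*국**회", "*1**학"]

# General category per character: Po = Other Punctuation,
# Lo = Other Letter, Nd = Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

Each masked name contributes three asterisks and two visible characters, which matches the report's totals of 543 asterisks (3 × 181) and 362 other characters (2 × 181).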

Unique

Unique: 19
Unique (%): 10.5%

Sample

1st row: *제**부
2nd row: *국**회
3rd row: *1**학
4th row: *학**네
5th row: *학**상

Value | Count | Frequency (%)
국**회 | 45 | 24.9%
대**학 | 8 | 4.4%
제**부 | 5 | 2.8%
학**네 | 4 | 2.2%
학**상 | 4 | 2.2%
비**비 | 4 | 2.2%
학**사 | 4 | 2.2%
음**음 | 4 | 2.2%
년**작 | 3 | 1.7%
학**당 | 3 | 1.7%
Other values (52) | 97 | 53.6%

Most occurring characters

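Value | Count | Frequency (%)
* | 543 | 60.0%
(character not preserved) | 54 | 6.0%
(character not preserved) | 48 | 5.3%
(character not preserved) | 41 | 4.5%
(character not preserved) | 17 | 1.9%
(character not preserved) | 11 | 1.2%
(character not preserved) | 10 | 1.1%
(character not preserved) | 9 | 1.0%
(character not preserved) | 9 | 1.0%
(character not preserved) | 8 | 0.9%
Other values (58) | 155 | 17.1%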

Most occurring categories

Value | Count | Frequency (%)
Other Punctuation | 543 | 60.0%
Other Letter | 360 | 39.8%
Decimal Number | 2 | 0.2%

Most frequent character per category

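Other Letter
Value | Count | Frequency (%)
(character not preserved) | 54 | 15.0%
(character not preserved) | 48 | 13.3%
(character not preserved) | 41 | 11.4%
(character not preserved) | 17 | 4.7%
(character not preserved) | 11 | 3.1%
(character not preserved) | 10 | 2.8%
(character not preserved) | 9 | 2.5%
(character not preserved) | 9 | 2.5%
(character not preserved) | 8 | 2.2%
(character not preserved) | 7 | 1.9%
Other values (56) | 146 | 40.6%

Other Punctuation
Value | Count | Frequency (%)
* | 543 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 100.0%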

Most occurring scripts

Value | Count | Frequency (%)
Common | 545 | 60.2%
Hangul | 360 | 39.8%

Most frequent character per script

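Hangul
Value | Count | Frequency (%)
(character not preserved) | 54 | 15.0%
(character not preserved) | 48 | 13.3%
(character not preserved) | 41 | 11.4%
(character not preserved) | 17 | 4.7%
(character not preserved) | 11 | 3.1%
(character not preserved) | 10 | 2.8%
(character not preserved) | 9 | 2.5%
(character not preserved) | 9 | 2.5%
(character not preserved) | 8 | 2.2%
(character not preserved) | 7 | 1.9%
Other values (56) | 146 | 40.6%

Common
Value | Count | Frequency (%)
* | 543 | 99.6%
1 | 2 | 0.4%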

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 545 | 60.2%
Hangul | 360 | 39.8%

Most frequent character per block

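ASCII
Value | Count | Frequency (%)
* | 543 | 99.6%
1 | 2 | 0.4%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 54 | 15.0%
(character not preserved) | 48 | 13.3%
(character not preserved) | 41 | 11.4%
(character not preserved) | 17 | 4.7%
(character not preserved) | 11 | 3.1%
(character not preserved) | 10 | 2.8%
(character not preserved) | 9 | 2.5%
(character not preserved) | 9 | 2.5%
(character not preserved) | 8 | 2.2%
(character not preserved) | 7 | 1.9%
Other values (56) | 146 | 40.6%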

사업연도
Real number (ℝ)

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.7624
Minimum: 2014
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2014
Q1: 2014
Median: 2018
Q3: 2019
95-th percentile: 2019
Maximum: 2019
Range: 5
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.0395867
Coefficient of variation (CV): 0.0010113173
Kurtosis: -1.5893876
Mean: 2016.7624
Median Absolute Deviation (MAD): 1
Skewness: -0.35284142
Sum: 365034
Variance: 4.1599141
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
2014 | 51 | 28.2%
2018 | 50 | 27.6%
2019 | 47 | 26.0%
2015 | 14 | 7.7%
2017 | 13 | 7.2%
2016 | 6 | 3.3%

Minimum 10 values

Value | Count | Frequency (%)
2014 | 51 | 28.2%
2015 | 14 | 7.7%
2016 | 6 | 3.3%
2017 | 13 | 7.2%
2018 | 50 | 27.6%
2019 | 47 | 26.0%

Maximum 10 values

Value | Count | Frequency (%)
2019 | 47 | 26.0%
2018 | 50 | 27.6%
2017 | 13 | 7.2%
2016 | 6 | 3.3%
2015 | 14 | 7.7%
2014 | 51 | 28.2%
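
The descriptive statistics for 사업연도 can be reproduced from its frequency table alone. The sketch below rebuilds the 181 values from the counts and uses the standard `statistics` module. It assumes the reported variance is the sample variance (divisor n − 1), which is what the figures suggest.

```python
import statistics

# Year -> number of rows, taken from the 사업연도 frequency table.
year_counts = {2014: 51, 2015: 14, 2016: 6, 2017: 13, 2018: 50, 2019: 47}

# Expand the counts back into one value per observation.
years = [year for year, n in year_counts.items() for _ in range(n)]

total = sum(years)
mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)  # sample variance, divisor n - 1
std_dev = statistics.stdev(years)
cv = std_dev / mean                    # coefficient of variation

print(len(years), total, median)
print(f"{mean:.4f} {variance:.7f} {std_dev:.7f} {cv:.10f}")
```

The results line up with the reported sum (365034), mean (2016.7624), median (2018), variance (4.1599141) and standard deviation (2.0395867).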

종이책_작년_연간총판매량(부)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 77
Distinct (%): 42.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79762.552
Minimum: 0
Maximum: 14000000
Zeros: 49
Zeros (%): 27.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 716
Q3: 1835
95-th percentile: 14400
Maximum: 14000000
Range: 14000000
Interquartile range (IQR): 1835

Descriptive statistics

Standard deviation: 1040442.7
Coefficient of variation (CV): 13.04425
Kurtosis: 180.99197
Mean: 79762.552
Median Absolute Deviation (MAD): 716
Skewness: 13.453179
Sum: 14437022
Variance: 1.082521 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 49 | 27.1%
1200 | 12 | 6.6%
800 | 8 | 4.4%
600 | 6 | 3.3%
3600 | 5 | 2.8%
500 | 4 | 2.2%
2400 | 4 | 2.2%
400 | 3 | 1.7%
300 | 3 | 1.7%
1000 | 3 | 1.7%
Other values (67) | 84 | 46.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 49 | 27.1%
10 | 1 | 0.6%
50 | 1 | 0.6%
80 | 2 | 1.1%
90 | 1 | 0.6%
100 | 2 | 1.1%
120 | 1 | 0.6%
130 | 1 | 0.6%
200 | 2 | 1.1%
240 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
14000000 | 1 | 0.6%
30000 | 1 | 0.6%
24000 | 2 | 1.1%
22508 | 1 | 0.6%
18600 | 1 | 0.6%
18000 | 1 | 0.6%
16161 | 1 | 0.6%
15000 | 1 | 0.6%
14400 | 1 | 0.6%
12000 | 2 | 1.1%
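
The gap between the mean (79762.552) and the median (716) of 종이책_작년_연간총판매량(부) comes almost entirely from the single maximum of 14,000,000. The sketch below uses only figures reported above to show how the mean changes when that one row is set aside. Whether the value is a data-entry error cannot be told from the report, so this is a sensitivity check rather than a correction.

```python
# Figures reported for 종이책_작년_연간총판매량(부).
n_rows = 181
total_sales = 14_437_022
max_value = 14_000_000
reported_median = 716

mean_all = total_sales / n_rows
mean_without_max = (total_sales - max_value) / (n_rows - 1)
share_of_max = 100 * max_value / total_sales  # how much of the sum one row carries

print(f"Mean, all rows:        {mean_all:.3f}")
print(f"Mean, maximum removed: {mean_without_max:.1f}")
print(f"Share of sum from max: {share_of_max:.1f}%")
```

With the maximum excluded, the mean falls from about 79762.6 to 2427.9, far closer to the median, which is why the median and the quartiles are the safer summaries for this column.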

종이책_금년_연간총판매량(부)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 90
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90774.591
Minimum: 0
Maximum: 16000000
Zeros: 51
Zeros (%): 28.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 800
Q3: 1950
95-th percentile: 12000
Maximum: 16000000
Range: 16000000
Interquartile range (IQR): 1950

Descriptive statistics

Standard deviation: 1189102.4
Coefficient of variation (CV): 13.099507
Kurtosis: 180.99418
Mean: 90774.591
Median Absolute Deviation (MAD): 800
Skewness: 13.453302
Sum: 16430201
Variance: 1.4139645 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 51 | 28.2%
1200 | 7 | 3.9%
400 | 5 | 2.8%
800 | 5 | 2.8%
3600 | 5 | 2.8%
1600 | 4 | 2.2%
1000 | 4 | 2.2%
480 | 3 | 1.7%
2400 | 3 | 1.7%
1800 | 3 | 1.7%
Other values (80) | 91 | 50.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 51 | 28.2%
5 | 1 | 0.6%
100 | 2 | 1.1%
106 | 1 | 0.6%
120 | 2 | 1.1%
150 | 1 | 0.6%
152 | 1 | 0.6%
160 | 1 | 0.6%
200 | 1 | 0.6%
240 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
16000000 | 1 | 0.6%
27600 | 1 | 0.6%
24000 | 1 | 0.6%
22000 | 1 | 0.6%
21363 | 1 | 0.6%
20600 | 1 | 0.6%
18600 | 1 | 0.6%
18000 | 1 | 0.6%
16200 | 1 | 0.6%
12000 | 2 | 1.1%

전자책_작년_연간총판매량(부)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 7
Distinct (%): 6.4%
Missing: 71
Missing (%): 39.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 298.92727
Minimum: 0
Maximum: 23045
Zeros: 104
Zeros (%): 57.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 97.35
Maximum: 23045
Range: 23045
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2297.0635
Coefficient of variation (CV): 7.6843556
Kurtosis: 90.905569
Mean: 298.92727
Median Absolute Deviation (MAD): 0
Skewness: 9.3244027
Sum: 32882
Variance: 5276500.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
0 | 104 | 57.5%
7200 | 1 | 0.6%
888 | 1 | 0.6%
23045 | 1 | 0.6%
772 | 1 | 0.6%
800 | 1 | 0.6%
177 | 1 | 0.6%
(Missing) | 71 | 39.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 104 | 57.5%
177 | 1 | 0.6%
772 | 1 | 0.6%
800 | 1 | 0.6%
888 | 1 | 0.6%
7200 | 1 | 0.6%
23045 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
23045 | 1 | 0.6%
7200 | 1 | 0.6%
888 | 1 | 0.6%
800 | 1 | 0.6%
772 | 1 | 0.6%
177 | 1 | 0.6%
0 | 104 | 57.5%

전자책_금년_연간총판매량(부)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 7
Distinct (%): 6.4%
Missing: 71
Missing (%): 39.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.2
Minimum: 0
Maximum: 7200
Zeros: 104
Zeros (%): 57.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 121
Maximum: 7200
Range: 7200
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 696.40849
Coefficient of variation (CV): 7.5532374
Kurtosis: 102.12799
Mean: 92.2
Median Absolute Deviation (MAD): 0
Skewness: 9.9576724
Sum: 10142
Variance: 484984.79
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
0 | 104 | 57.5%
7200 | 1 | 0.6%
772 | 1 | 0.6%
220 | 1 | 0.6%
755 | 1 | 0.6%
720 | 1 | 0.6%
475 | 1 | 0.6%
(Missing) | 71 | 39.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 104 | 57.5%
220 | 1 | 0.6%
475 | 1 | 0.6%
720 | 1 | 0.6%
755 | 1 | 0.6%
772 | 1 | 0.6%
7200 | 1 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
7200 | 1 | 0.6%
772 | 1 | 0.6%
755 | 1 | 0.6%
720 | 1 | 0.6%
475 | 1 | 0.6%
220 | 1 | 0.6%
0 | 104 | 57.5%

웹진_작년_연간총판매량(부)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
0 | 107
<NA> | 71
177 | 1
7909 | 1
3930 | 1

Length

Max length: 4
Median length: 1
Mean length: 2.2209945
Min length: 1

Unique

Unique: 3
Unique (%): 1.7%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
0 | 107 | 59.1%
<NA> | 71 | 39.2%
177 | 1 | 0.6%
7909 | 1 | 0.6%
3930 | 1 | 0.6%
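
This column is typed as Categorical with 0 missing values because "<NA>" appears to be stored as a literal four-character string: the reported mean length of 2.2209945 only works out if the 71 "<NA>" entries count as text. The sketch below checks that arithmetic and shows one way to turn such strings into integers or `None`. `parse_sales` is a hypothetical helper, not part of the profiler.

```python
from typing import Optional


def parse_sales(raw: str) -> Optional[int]:
    """Convert a sales string to int, treating "" and the literal "<NA>" as missing."""
    text = raw.strip()
    if text in {"", "<NA>"}:
        return None
    return int(text)


# Value -> count, from the Common Values table of 웹진_작년_연간총판매량(부).
value_counts = {"0": 107, "<NA>": 71, "177": 1, "7909": 1, "3930": 1}

# Mean string length, weighting each value by how often it occurs.
n_rows = sum(value_counts.values())
mean_length = sum(len(v) * n for v, n in value_counts.items()) / n_rows

parsed = {v: parse_sales(v) for v in value_counts}
n_missing = sum(n for v, n in value_counts.items() if parsed[v] is None)

print(f"{mean_length:.7f}", n_missing)
```

After this kind of cleaning the column would show 71 missing values and could be profiled as numeric, like the 전자책 columns. In pandas, `pd.to_numeric(..., errors="coerce")` offers a comparable conversion.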

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the common values; same counts as the Common Values table above]

웹진_금년_연간총판매량(부)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
0 | 106
<NA> | 71
35934 | 1
9533 | 1
6392 | 1

Length

Max length: 5
Median length: 1
Mean length: 2.2430939
Min length: 1

Unique

Unique: 4
Unique (%): 2.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
0 | 106 | 58.6%
<NA> | 71 | 39.2%
35934 | 1 | 0.6%
9533 | 1 | 0.6%
6392 | 1 | 0.6%
600 | 1 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the common values; same counts as the Common Values table above]

Interactions

[Figures: pairwise interaction scatter plots of the numeric variables (25 panels)]

Correlations

[Figure: correlation heatmap]

Variable | 문학단체명 | 사업연도 | 종이책_작년_연간총판매량(부) | 종이책_금년_연간총판매량(부) | 전자책_작년_연간총판매량(부) | 전자책_금년_연간총판매량(부) | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부)
문학단체명 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.582 | 0.554 | 0.000
사업연도 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
종이책_작년_연간총판매량(부) | 0.000 | 0.000 | 1.000 | 0.699 | 0.000 | 0.000 | 0.000 | 0.000
종이책_금년_연간총판매량(부) | 0.000 | 0.000 | 0.699 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
전자책_작년_연간총판매량(부) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.940 | 0.668 | 0.708
전자책_금년_연간총판매량(부) | 0.582 | 0.000 | 0.000 | 0.000 | 0.940 | 1.000 | 0.479 | 0.529
웹진_작년_연간총판매량(부) | 0.554 | 0.000 | 0.000 | 0.000 | 0.668 | 0.479 | 1.000 | 1.000
웹진_금년_연간총판매량(부) | 0.000 | 0.000 | 0.000 | 0.000 | 0.708 | 0.529 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부)
웹진_작년_연간총판매량(부) | 1.000 | 0.995
웹진_금년_연간총판매량(부) | 0.995 | 1.000

[Figure: correlation heatmap]

Variable | 사업연도 | 종이책_작년_연간총판매량(부) | 종이책_금년_연간총판매량(부) | 전자책_작년_연간총판매량(부) | 전자책_금년_연간총판매량(부) | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부)
사업연도 | 1.000 | 0.384 | 0.315 | 0.055 | 0.058 | 0.000 | 0.000
종이책_작년_연간총판매량(부) | 0.384 | 1.000 | 0.900 | 0.073 | 0.078 | 0.000 | 0.000
종이책_금년_연간총판매량(부) | 0.315 | 0.900 | 1.000 | 0.042 | 0.050 | 0.000 | 0.000
전자책_작년_연간총판매량(부) | 0.055 | 0.073 | 0.042 | 1.000 | 0.999 | 0.694 | 0.687
전자책_금년_연간총판매량(부) | 0.058 | 0.078 | 0.050 | 0.999 | 1.000 | 0.474 | 0.464
웹진_작년_연간총판매량(부) | 0.000 | 0.000 | 0.000 | 0.694 | 0.474 | 1.000 | 0.995
웹진_금년_연간총판매량(부) | 0.000 | 0.000 | 0.000 | 0.687 | 0.464 | 0.995 | 1.000

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

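First rows

Index | 문학단체명 | 사업연도 | 종이책_작년_연간총판매량(부) | 종이책_금년_연간총판매량(부) | 전자책_작년_연간총판매량(부) | 전자책_금년_연간총판매량(부) | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부)
0 | *제**부 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA>
1 | *국**회 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA>
2 | *1**학 | 2014 | 1200 | 1200 | <NA> | <NA> | <NA> | <NA>
3 | *학**네 | 2014 | 3435 | 7104 | <NA> | <NA> | <NA> | <NA>
4 | *학**상 | 2014 | 11033 | 11159 | <NA> | <NA> | <NA> | <NA>
5 | *음**사 | 2014 | 4000 | 4000 | <NA> | <NA> | <NA> | <NA>
6 | *천**학 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA>
7 | *행**사 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA>
8 | *년**작 | 2014 | 1200 | 1200 | <NA> | <NA> | <NA> | <NA>
9 | *간**선 | 2014 | 80400 (작년 and 금년 values fused; split not recoverable) | (fused with previous column) | <NA> | <NA> | <NA> | <NA>

Last rows

Index | 문학단체명 | 사업연도 | 종이책_작년_연간총판매량(부) | 종이책_금년_연간총판매량(부) | 전자책_작년_연간총판매량(부) | 전자책_금년_연간총판매량(부) | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부)
171 | *서**망 | 2019 | 800 | 800 | 0 | 0 | 0 | 0
172 | *와**시 | 2019 | 1200 | 1600 | 0 | 0 | 0 | 0
173 | *국**회 | 2019 | 800 | 500 | 0 | 0 | 0 | 0
174 | *시**아 | 2019 | 1200 | 1200 | 0 | 0 | 0 | 0
175 | *국**회 | 2019 | 3600 | 3600 | 0 | 0 | 0 | 0
176 | *행**사 | 2019 | 16161 | 9952 | 0 | 0 | 0 | 0
177 | *국**연 | 2019 | 18600 | 18600 | 0 | 0 | 0 | 0
178 | *대**학 | 2019 | 5502500 (작년 and 금년 values fused; split not recoverable) | (fused with previous column) | 0 | 0 | 0 | 0
179 | *국**회 | 2019 | 480480 (작년 and 금년 values fused; split not recoverable) | (fused with previous column) | 0 | 0 | 0 | 600
180 | *천**학 | 2019 | 1200 | 1200 | 0 | 0 | 0 | 0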

Duplicate rows

Most frequently occurring

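Index | 문학단체명 | 사업연도 | 종이책_작년_연간총판매량(부) | 종이책_금년_연간총판매량(부) | 전자책_작년_연간총판매량(부) | 전자책_금년_연간총판매량(부) | 웹진_작년_연간총판매량(부) | 웹진_금년_연간총판매량(부) | # duplicates
0 | *국**회 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA> | 8
1 | *국**회 | 2017 | 0 | 0 | 0 | 0 | 0 | 0 | 2
2 | *대**학 | 2014 | 0 | 0 | <NA> | <NA> | <NA> | <NA> | 2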
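
The overview reports 3 duplicate rows while the table above lists occurrence counts of 8, 2 and 2, so the headline figure appears to count distinct repeated rows rather than every repeated occurrence. The sketch below illustrates the two ways of counting with `collections.Counter`. The rows are made-up toy data, not taken from the dataset.

```python
from collections import Counter

# Hypothetical toy rows: (organisation, year, print sales last year, print sales this year).
rows = [
    ("A", 2014, 0, 0),
    ("A", 2014, 0, 0),
    ("A", 2014, 0, 0),
    ("B", 2017, 5, 5),
    ("B", 2017, 5, 5),
    ("C", 2019, 1, 2),
]

occurrences = Counter(rows)

# Keep only rows that occur more than once, with their occurrence counts.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

n_distinct_duplicated = len(duplicated)       # distinct rows that repeat
n_rows_in_groups = sum(duplicated.values())   # all occurrences of those rows

print(n_distinct_duplicated, n_rows_in_groups)  # 2 5
```

Applied to the table above, the first measure would give 3 and the second 12 (8 + 2 + 2), which is worth keeping in mind before deciding how many rows to drop.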