Overview

Dataset statistics

Number of variables 4
Number of observations 91
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 3.2 KiB
Average record size in memory 36.5 B

Variable types

Numeric 3
Categorical 1

Dataset

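Description This dataset contains PPA contract status by generation source, as managed in the PPA system of 한국전력공사 (Korea Electric Power Corporation, KEPCO). The PPA system has since been migrated to another system, so please use this data as a historical record.
Author 한국전력공사 (Korea Electric Power Corporation, KEPCO)
URL https://www.data.go.kr/data/15039559/fileData.do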

Alerts

사업자수 is highly overall correlated with 설비용량 (High correlation)
설비용량 is highly overall correlated with 사업자수 (High correlation)
사업자수 has 3 (3.3%) zeros (Zeros)
설비용량 has 3 (3.3%) zeros (Zeros)

Reproduction

Analysis started 2023-12-12 01:36:53.777670
Analysis finished 2023-12-12 01:36:55.660133
Duration 1.88 seconds
Software version ydata-profiling v4.5.1
Download configuration config.json
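The reported duration follows from the two timestamps. A minimal sketch of that arithmetic, using only the Python standard library (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" and "Analysis finished" fields.
started = datetime.fromisoformat("2023-12-12 01:36:53.777670")
finished = datetime.fromisoformat("2023-12-12 01:36:55.660133")

# Elapsed wall-clock time in seconds, rounded the way the report displays it.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```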

Variables

년도 (year)
Real number (ℝ)

Distinct 15
Distinct (%) 16.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2014.5714
Minimum 2006
Maximum 2020
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 951.0 B

Quantile statistics

Minimum 2006
5-th percentile 2008
Q1 2012
Median 2015
Q3 2018
95-th percentile 2020
Maximum 2020
Range 14
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 3.7032804
Coefficient of variation (CV) 0.0018382473
Kurtosis -0.71122285
Mean 2014.5714
Median Absolute Deviation (MAD) 3
Skewness -0.35676479
Sum 183326
Variance 13.714286
Monotonicity Increasing
Histogram with fixed size bins (bins=15)
Value Count Frequency (%)
2012 8 8.8%
2013 8 8.8%
2014 8 8.8%
2015 8 8.8%
2016 8 8.8%
2017 8 8.8%
2018 8 8.8%
2019 8 8.8%
2020 8 8.8%
2011 6 6.6%
Other values (5) 13 14.3%

Value Count Frequency (%)
2006 1 1.1%
2007 3 3.3%
2008 3 3.3%
2009 3 3.3%
2010 3 3.3%
2011 6 6.6%
2012 8 8.8%
2013 8 8.8%
2014 8 8.8%
2015 8 8.8%

Value Count Frequency (%)
2020 8 8.8%
2019 8 8.8%
2018 8 8.8%
2017 8 8.8%
2016 8 8.8%
2015 8 8.8%
2014 8 8.8%
2013 8 8.8%
2012 8 8.8%
2011 6 6.6%
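Because the frequency tables cover all 15 distinct years, the descriptive statistics for 년도 can be rebuilt from the counts alone. The sketch below does that with the Python standard library. It assumes the report uses the sample (n - 1) variance, which is consistent with the printed value of 13.714286; the variable names are illustrative.

```python
import statistics

# Observations per year, read off the frequency tables for 년도.
year_counts = {
    2006: 1, 2007: 3, 2008: 3, 2009: 3, 2010: 3, 2011: 6,
    2012: 8, 2013: 8, 2014: 8, 2015: 8, 2016: 8,
    2017: 8, 2018: 8, 2019: 8, 2020: 8,
}

# Expand the counts back into one value per observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

n_observations = len(years)
total = sum(years)
mean = statistics.mean(years)
median = statistics.median(years)
variance = statistics.variance(years)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(years)       # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

print(n_observations, total, median)
print(round(mean, 4), round(variance, 6), round(std_dev, 7))
```

The coefficient of variation is tiny here simply because the values are calendar years: the spread of a few years is small relative to a mean near 2014.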

발전원 (generation source)
Categorical

Distinct 8
Distinct (%) 8.8%
Missing 0
Missing (%) 0.0%
Memory size 860.0 B
태양광 15
바이오 14
소수력 14
연료전지 10
폐기물 10
Other values (3) 28

Length

Max length 10
Median length 3
Mean length 3.8901099
Min length 2

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 태양광
2nd row 바이오
3rd row 소수력
4th row 태양광
5th row 바이오

Common Values

Value Count Frequency (%)
태양광 (solar) 15 16.5%
바이오 (bio) 14 15.4%
소수력 (small hydro) 14 15.4%
연료전지 (fuel cell) 10 11.0%
폐기물 (waste) 10 11.0%
풍력 (wind) 10 11.0%
매립지가스(LFG) (landfill gas) 9 9.9%
해양에너지 (ocean energy) 9 9.9%
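The frequency percentages and the length statistics for 발전원 both follow from the eight category counts. A minimal sketch, assuming lengths are counted in characters of the original Korean labels (which matches the reported maximum of 10 for 매립지가스(LFG)); the variable names are illustrative:

```python
import statistics

# Category counts from the Common Values table (original labels, no glosses).
source_counts = {
    "태양광": 15, "바이오": 14, "소수력": 14, "연료전지": 10,
    "폐기물": 10, "풍력": 10, "매립지가스(LFG)": 9, "해양에너지": 9,
}

n_observations = sum(source_counts.values())

# Share of each category, rounded to one decimal as in the report.
percentages = {
    name: round(100 * count / n_observations, 1)
    for name, count in source_counts.items()
}

# One label length per observation, so common categories weigh more.
lengths = [len(name) for name, count in source_counts.items() for _ in range(count)]
mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)

print(percentages)
print(min(lengths), median_length, max(lengths), round(mean_length, 7))
```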

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the category frequencies listed under Common Values]

사업자수 (number of operators)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 36
Distinct (%) 39.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2430.4505
Minimum 0
Maximum 64142
Zeros 3
Zeros (%) 3.3%
Negative 0
Negative (%) 0.0%
Memory size 951.0 B

Quantile statistics

Minimum 0
5-th percentile 1
Q1 1
Median 4
Q3 16
95-th percentile 17365
Maximum 64142
Range 64142
Interquartile range (IQR) 15

Descriptive statistics

Standard deviation 9266.8983
Coefficient of variation (CV) 3.8128314
Kurtosis 26.139191
Mean 2430.4505
Median Absolute Deviation (MAD) 3
Skewness 4.8715471
Sum 221171
Variance 85875403
Monotonicity Not monotonic
Histogram with fixed size bins (bins=36)
Value Count Frequency (%)
1 26 28.6%
2 9 9.9%
6 5 5.5%
9 5 5.5%
4 4 4.4%
3 4 4.4%
0 3 3.3%
7 2 2.2%
16 2 2.2%
18 2 2.2%
Other values (26) 29 31.9%

Value Count Frequency (%)
0 3 3.3%
1 26 28.6%
2 9 9.9%
3 4 4.4%
4 4 4.4%
5 2 2.2%
6 5 5.5%
7 2 2.2%
8 2 2.2%
9 5 5.5%

Value Count Frequency (%)
64142 1 1.1%
42268 1 1.1%
32522 1 1.1%
23625 1 1.1%
19001 1 1.1%
15729 1 1.1%
9997 1 1.1%
4887 1 1.1%
3012 1 1.1%
2007 1 1.1%
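The 95-th percentile of 사업자수 (17365) is not itself an observed value. It is consistent with linear interpolation between the two observations that straddle the target rank, which is the default behaviour of quantile functions in pandas and NumPy. The sketch below reproduces it from the ten largest values alone. That the report uses this interpolation rule is an assumption, and `quantile_from_top` is a hypothetical helper written for this illustration.

```python
def quantile_from_top(q, n, top_desc):
    """Linear-interpolation quantile of an n-value sample, given only its
    largest values in descending order."""
    pos = q * (n - 1)            # fractional 0-based rank in ascending order
    lo = int(pos)                # rank just below the target position
    hi = min(lo + 1, n - 1)      # rank just above it
    frac = pos - lo
    # Ascending rank i corresponds to index n - 1 - i in the descending list.
    lo_idx, hi_idx = n - 1 - lo, n - 1 - hi
    if lo_idx >= len(top_desc):
        raise ValueError("quantile falls outside the listed top values")
    lo_val, hi_val = top_desc[lo_idx], top_desc[hi_idx]
    return lo_val + frac * (hi_val - lo_val)


n_observations = 91
# Ten largest 사업자수 values, as listed in the table of maximum values.
top_operators = [64142, 42268, 32522, 23625, 19001, 15729, 9997, 4887, 3012, 2007]

p95 = quantile_from_top(0.95, n_observations, top_operators)
print(p95)
```

With 91 observations the target rank is 0.95 × 90 = 85.5, halfway between the sixth-largest value (15729) and the fifth-largest (19001).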

설비용량 (installed capacity)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 63
Distinct (%) 69.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 296511.66
Minimum 0
Maximum 8255319.6
Zeros 3
Zeros (%) 3.3%
Negative 0
Negative (%) 0.0%
Memory size 951.0 B

Quantile statistics

Minimum 0
5-th percentile 79
Q1 119.7
Median 993
Q3 4236.5
95-th percentile 1873272.8
Maximum 8255319.6
Range 8255319.6
Interquartile range (IQR) 4116.8

Descriptive statistics

Standard deviation 1178424.6
Coefficient of variation (CV) 3.9742943
Kurtosis 28.043368
Mean 296511.66
Median Absolute Deviation (MAD) 894
Skewness 5.0720984
Sum 26982561
Variance 1.3886845 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
99.0 9 9.9%
110.0 7 7.7%
300.0 6 6.6%
197.4 3 3.3%
0.0 3 3.3%
2992.0 3 3.3%
1620.0 2 2.2%
60.0 2 2.2%
125.4 2 2.2%
2819681.2 1 1.1%
Other values (53) 53 58.2%

Value Count Frequency (%)
0.0 3 3.3%
60.0 2 2.2%
98.0 1 1.1%
99.0 9 9.9%
110.0 7 7.7%
114.0 1 1.1%
125.4 2 2.2%
133.0 1 1.1%
133.4 1 1.1%
188.4 1 1.1%

Value Count Frequency (%)
8255319.57 1 1.1%
5545233.82 1 1.1%
4193210.05 1 1.1%
2819681.2 1 1.1%
2095860.19 1 1.1%
1650685.42 1 1.1%
1008165.23 1 1.1%
495519.78 1 1.1%
304195.85 1 1.1%
196646.8 1 1.1%

Interactions

[Figures: nine interaction scatter plots, one per ordered pair of the numeric variables 년도, 사업자수 and 설비용량]

Correlations

년도 발전원 사업자수 설비용량
년도 1.000 0.000 0.000 0.000
발전원 0.000 1.000 0.000 0.000
사업자수 0.000 0.000 1.000 0.998
설비용량 0.000 0.000 0.998 1.000

년도 사업자수 설비용량 발전원
년도 1.000 0.165 0.122 0.000
사업자수 0.165 1.000 0.939 0.000
설비용량 0.122 0.939 1.000 0.000
발전원 0.000 0.000 0.000 1.000
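To give a feel for why 사업자수 and 설비용량 are flagged as highly correlated, the sketch below computes a Pearson coefficient over the ten largest values of each variable. Pairing the two lists by rank is an assumption made for illustration only: the report does not show which rows these values come from, apart from the largest pair (64142 and 8255319.57), which appears together in the last rows of the sample. The result is therefore not a reproduction of either matrix above, and `pearson` is a plain textbook implementation, not the profiler's own code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Ten largest values of each variable, copied from the maximum-value tables.
# ASSUMPTION: the i-th largest 사업자수 belongs to the same row as the
# i-th largest 설비용량.
operators = [64142, 42268, 32522, 23625, 19001, 15729, 9997, 4887, 3012, 2007]
capacity = [8255319.57, 5545233.82, 4193210.05, 2819681.2, 2095860.19,
            1650685.42, 1008165.23, 495519.78, 304195.85, 196646.8]

r = pearson(operators, capacity)
print(round(r, 3))
```

Under that pairing the two variables grow almost proportionally, so a coefficient close to 1 is expected.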

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

년도 발전원 사업자수 설비용량
02006태양광4265.0
12007바이오160.0
22007소수력21620.0
32007태양광1328551.56
42008바이오160.0
52008소수력21620.0
62008태양광58236908.15
72009바이오2114.0
82009소수력00.0
92009태양광119492329.4

년도 발전원 사업자수 설비용량
812019풍력5188.4
822019해양에너지00.0
832020매립지가스(LFG)199.0
842020바이오177416.0
852020소수력334686.0
862020연료전지94060.0
872020태양광641428255319.57
882020폐기물82822.0
892020풍력6209.4
902020해양에너지00.0