Overview

Dataset statistics

Number of variables: 6
Number of observations: 86
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 51.5 B

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

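Description: Data from 한국남부발전(주) (Korea Southern Power Co., Ltd.) on its renewable energy generation facilities by energy source. It provides fields such as energy source (에너지원), category (구분), plant name (발전소명), capacity (용량), and completion year (준공년도).
Author: 한국남부발전(주) (Korea Southern Power Co., Ltd.)
URL: https://www.data.go.kr/data/15003687/fileData.do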

Alerts

용량(MW) is highly overall correlated with 에너지원 and 1 other field (High correlation)
준공년도 is highly overall correlated with 비 고 (High correlation)
에너지원 is highly overall correlated with 용량(MW) (High correlation)
구분 is highly overall correlated with 용량(MW) (High correlation)
비 고 is highly overall correlated with 준공년도 (High correlation)
비 고 is highly imbalanced (50.3%) (Imbalance)
발전소명 has unique values (Unique)

Reproduction

Analysis started: 2024-04-06 08:24:29.622551
Analysis finished: 2024-04-06 08:24:31.853525
Duration: 2.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

에너지원 (energy source)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
태양광 | 58
풍 력 | 11
바이오 | 6
연료전지 | 5
ESS | 3

Length

Max length: 4
Median length: 3
Mean length: 3.0581395
Min length: 3
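The length statistics can be cross-checked against the value counts listed under Common Values. The Python sketch below is an illustration only: it assumes "length" means the number of Unicode code points in each value (so the space inside 풍 력 counts as one character), which is what Python's `len` returns for these strings.

```python
import statistics

# Value counts for 에너지원, copied from the report's Common Values table.
counts = {"태양광": 58, "풍 력": 11, "바이오": 6, "연료전지": 5, "ESS": 3, "소수력": 3}

# One length per observation; len() counts code points (assumed definition).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)

print(f"max={max(lengths)} median={median_length} "
      f"mean={mean_length:.7f} min={min(lengths)}")
```

Only 연료전지 (5 rows) has four characters, so the mean is (81 × 3 + 5 × 4) / 86 = 263 / 86.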

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 풍 력
2nd row: 풍 력
3rd row: 풍 력
4th row: 풍 력
5th row: 풍 력

Common Values

Value | Count | Frequency (%)
태양광 (solar) | 58 | 67.4%
풍 력 (wind) | 11 | 12.8%
바이오 (bio) | 6 | 7.0%
연료전지 (fuel cell) | 5 | 5.8%
ESS | 3 | 3.5%
소수력 (small hydro) | 3 | 3.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
태양광 | 58 | 59.8%
풍 | 11 | 11.3%
력 | 11 | 11.3%
바이오 | 6 | 6.2%
연료전지 | 5 | 5.2%
ess | 3 | 3.1%
소수력 | 3 | 3.1%
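The percentages in this table are taken over 97 entries rather than the 86 rows. That is what results if each value is lowercased and split on whitespace before counting, so that 풍 력 contributes two tokens and ESS becomes ess. The Python sketch below works under that assumption; it is a reconstruction from the numbers in the report, not a description of the profiler's internals.

```python
from collections import Counter

# Row-level counts for 에너지원, from the Common Values table.
row_counts = {"태양광": 58, "풍 력": 11, "바이오": 6, "연료전지": 5, "ESS": 3, "소수력": 3}

# Assumption: the plot table counts lowercased, whitespace-separated tokens.
token_counts = Counter()
for value, n in row_counts.items():
    for token in value.lower().split():
        token_counts[token] += n

n_rows = sum(row_counts.values())      # denominator for Common Values
n_tokens = sum(token_counts.values())  # denominator for the plot table

row_pct = {v: round(100 * n / n_rows, 1) for v, n in row_counts.items()}
token_pct = {t: round(100 * n / n_tokens, 1) for t, n in token_counts.items()}

print(n_rows, row_pct)
print(n_tokens, token_pct)
```

Under this reading, 태양광 keeps its count of 58 but its share drops from 58/86 to 58/97 because the denominator grows by the 11 extra tokens.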

구분 (category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
자체 | 73
SPC | 13

Length

Max length: 3
Median length: 2
Mean length: 2.1511628
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자체
2nd row: 자체
3rd row: 자체
4th row: 자체
5th row: SPC

Common Values

Value | Count | Frequency (%)
자체 (in-house) | 73 | 84.9%
SPC | 13 | 15.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
자체 | 73 | 84.9%
spc | 13 | 15.1%

발전소명 (plant name)
Text

UNIQUE

Distinct: 86
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Length

Max length: 19
Median length: 15
Mean length: 9.372093
Min length: 3

Characters and Unicode

Total characters: 806
Distinct characters: 143
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
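As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in the first sample value of 발전소명. The two-letter codes are the Unicode general category abbreviations; the mapping to long names covers only the nine categories this report lists. The standard library does not expose script or block properties, so those are not shown.

```python
import unicodedata
from collections import Counter

# Long names for the general categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Nd": "Decimal Number", "Zs": "Space Separator",
    "Po": "Other Punctuation", "Pe": "Close Punctuation",
    "Ps": "Open Punctuation", "Lu": "Uppercase Letter",
    "Sm": "Math Symbol", "Ll": "Lowercase Letter",
}

name = "한경풍력 1단계(1~4호기)"  # first sample row of 발전소명

# Count characters per Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in name)

for code, count in categories.most_common():
    print(f"{CATEGORY_NAMES.get(code, code)}: {count}")
```

Note that the tilde is classified as a Math Symbol (Sm), which is why `~` appears under that heading in the tables that follow.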

Unique

Unique: 86
Unique (%): 100.0%

Sample

1st row: 한경풍력 1단계(1~4호기)
2nd row: 한경풍력 2단계(5~9호기)
3rd row: 성산풍력 1단계(1~6호기)
4th row: 성산풍력 2단계(7~10호기)
5th row: 태백풍력(1~9호기)

Value | Count | Frequency (%)
태양광 | 11 | 7.4%
신인천 | 10 | 6.7%
연료전지 | 5 | 3.4%
2 | 4 | 2.7%
1 | 4 | 2.7%
삼척 | 4 | 2.7%
하동화력태양광 | 4 | 2.7%
소내태양광 | 4 | 2.7%
주차장 | 3 | 2.0%
성산풍력 | 3 | 2.0%
Other values (87) | 97 | 65.1%

Most occurring characters

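Value | Count | Frequency (%)
(space) | 63 | 7.8%
[character not preserved] | 39 | 4.8%
[character not preserved] | 39 | 4.8%
[character not preserved] | 38 | 4.7%
# | 26 | 3.2%
[character not preserved] | 25 | 3.1%
1 | 23 | 2.9%
[character not preserved] | 21 | 2.6%
) | 19 | 2.4%
[character not preserved] | 19 | 2.4%
Other values (133) | 494 | 61.3%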

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 580 | 72.0%
Decimal Number | 65 | 8.1%
Space Separator | 63 | 7.8%
Other Punctuation | 35 | 4.3%
Close Punctuation | 19 | 2.4%
Open Punctuation | 19 | 2.4%
Uppercase Letter | 13 | 1.6%
Math Symbol | 11 | 1.4%
Lowercase Letter | 1 | 0.1%

Most frequent character per category

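Other Letter
Value | Count | Frequency (%)
[character not preserved] | 39 | 6.7%
[character not preserved] | 39 | 6.7%
[character not preserved] | 38 | 6.6%
[character not preserved] | 25 | 4.3%
[character not preserved] | 21 | 3.6%
[character not preserved] | 19 | 3.3%
[character not preserved] | 15 | 2.6%
[character not preserved] | 15 | 2.6%
[character not preserved] | 14 | 2.4%
[character not preserved] | 14 | 2.4%
Other values (108) | 341 | 58.8%

Decimal Number
Value | Count | Frequency (%)
1 | 23 | 35.4%
2 | 15 | 23.1%
5 | 6 | 9.2%
4 | 6 | 9.2%
3 | 5 | 7.7%
6 | 3 | 4.6%
8 | 2 | 3.1%
9 | 2 | 3.1%
7 | 2 | 3.1%
0 | 1 | 1.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 6 | 46.2%
E | 3 | 23.1%
A | 1 | 7.7%
B | 1 | 7.7%
K | 1 | 7.7%
H | 1 | 7.7%

Other Punctuation
Value | Count | Frequency (%)
# | 26 | 74.3%
, | 4 | 11.4%
% | 3 | 8.6%
' | 2 | 5.7%

Space Separator
Value | Count | Frequency (%)
(space) | 63 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 19 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 19 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 11 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
n | 1 | 100.0%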

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 580 | 72.0%
Common | 212 | 26.3%
Latin | 14 | 1.7%

Most frequent character per script

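Hangul
Value | Count | Frequency (%)
[character not preserved] | 39 | 6.7%
[character not preserved] | 39 | 6.7%
[character not preserved] | 38 | 6.6%
[character not preserved] | 25 | 4.3%
[character not preserved] | 21 | 3.6%
[character not preserved] | 19 | 3.3%
[character not preserved] | 15 | 2.6%
[character not preserved] | 15 | 2.6%
[character not preserved] | 14 | 2.4%
[character not preserved] | 14 | 2.4%
Other values (108) | 341 | 58.8%

Common
Value | Count | Frequency (%)
(space) | 63 | 29.7%
# | 26 | 12.3%
1 | 23 | 10.8%
) | 19 | 9.0%
( | 19 | 9.0%
2 | 15 | 7.1%
~ | 11 | 5.2%
5 | 6 | 2.8%
4 | 6 | 2.8%
3 | 5 | 2.4%
Other values (8) | 19 | 9.0%

Latin
Value | Count | Frequency (%)
S | 6 | 42.9%
E | 3 | 21.4%
A | 1 | 7.1%
B | 1 | 7.1%
K | 1 | 7.1%
n | 1 | 7.1%
H | 1 | 7.1%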

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 580 | 72.0%
ASCII | 226 | 28.0%

Most frequent character per block

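ASCII
Value | Count | Frequency (%)
(space) | 63 | 27.9%
# | 26 | 11.5%
1 | 23 | 10.2%
) | 19 | 8.4%
( | 19 | 8.4%
2 | 15 | 6.6%
~ | 11 | 4.9%
5 | 6 | 2.7%
4 | 6 | 2.7%
S | 6 | 2.7%
Other values (15) | 32 | 14.2%

Hangul
Value | Count | Frequency (%)
[character not preserved] | 39 | 6.7%
[character not preserved] | 39 | 6.7%
[character not preserved] | 38 | 6.6%
[character not preserved] | 25 | 4.3%
[character not preserved] | 21 | 3.6%
[character not preserved] | 19 | 3.3%
[character not preserved] | 15 | 2.6%
[character not preserved] | 15 | 2.6%
[character not preserved] | 14 | 2.4%
[character not preserved] | 14 | 2.4%
Other values (108) | 341 | 58.8%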

용량(MW) (capacity, MW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 70
Distinct (%): 81.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.409022
Minimum: 0.006
Maximum: 102.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 906.0 B

Quantile statistics

Minimum: 0.006
5-th percentile: 0.064
Q1: 0.49975
Median: 0.998
Q3: 14.965
95-th percentile: 79.11
Maximum: 102.2
Range: 102.194
Interquartile range (IQR): 14.46525

Descriptive statistics

Standard deviation: 25.983885
Coefficient of variation (CV): 1.9377912
Kurtosis: 4.6221563
Mean: 13.409022
Median Absolute Deviation (MAD): 0.885
Skewness: 2.3424407
Sum: 1153.1759
Variance: 675.1623
Monotonicity: Not monotonic
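Several of these figures are derived from the others, which gives a quick consistency check on the report. The Python sketch below recomputes the mean, coefficient of variation, variance, IQR, and range for 용량(MW) from the reported sum, standard deviation, quartiles, and extremes. It uses the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared) and takes the rounded values printed in the report as inputs, so agreement is only expected to a few decimal places.

```python
import math

# Inputs copied from the report for 용량(MW).
n_obs = 86
total = 1153.1759
std = 25.983885
q1, q3 = 0.49975, 14.965
minimum, maximum = 0.006, 102.2

# Derived quantities, using standard definitions.
mean = total / n_obs
cv = std / mean
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f} "
      f"iqr={iqr:.5f} range={value_range:.3f}")
```

A CV close to 2 together with a median (0.998) far below the mean (13.4) reflects the strong right skew: most plants are around 1 MW, while a few are 50 to 100 MW.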
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0.998 | 5 | 5.8%
0.999 | 4 | 4.7%
0.997 | 3 | 3.5%
0.995 | 2 | 2.3%
50.0 | 2 | 2.3%
100.0 | 2 | 2.3%
15.0 | 2 | 2.3%
0.2 | 2 | 2.3%
60.0 | 2 | 2.3%
20.24 | 2 | 2.3%
Other values (60) | 60 | 69.8%

Minimum 10 values
Value | Count | Frequency (%)
0.006 | 1 | 1.2%
0.04 | 1 | 1.2%
0.048 | 1 | 1.2%
0.055 | 1 | 1.2%
0.06 | 1 | 1.2%
0.076 | 1 | 1.2%
0.09 | 1 | 1.2%
0.093 | 1 | 1.2%
0.098 | 1 | 1.2%
0.111 | 1 | 1.2%

Maximum 10 values
Value | Count | Frequency (%)
102.2 | 1 | 1.2%
100.0 | 2 | 2.3%
98.0 | 1 | 1.2%
79.48 | 1 | 1.2%
78.0 | 1 | 1.2%
60.0 | 2 | 2.3%
51.7 | 1 | 1.2%
50.0 | 2 | 2.3%
32.2 | 1 | 1.2%
30.0 | 1 | 1.2%

준공년도 (completion year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.593
Minimum: 2004
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 906.0 B

Quantile statistics

Minimum: 2004
5-th percentile: 2009
Q1: 2012
Median: 2018
Q3: 2020
95-th percentile: 2023
Maximum: 2024
Range: 20
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.6963763
Coefficient of variation (CV): 0.0023288667
Kurtosis: -0.74558693
Mean: 2016.593
Median Absolute Deviation (MAD): 3
Skewness: -0.5688802
Sum: 173427
Variance: 22.055951
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=19)]

Common values
Value | Count | Frequency (%)
2021 | 11 | 12.8%
2018 | 11 | 12.8%
2020 | 9 | 10.5%
2019 | 8 | 9.3%
2011 | 8 | 9.3%
2017 | 7 | 8.1%
2012 | 6 | 7.0%
2023 | 5 | 5.8%
2010 | 5 | 5.8%
2022 | 3 | 3.5%
Other values (9) | 13 | 15.1%

Minimum 10 values
Value | Count | Frequency (%)
2004 | 1 | 1.2%
2007 | 1 | 1.2%
2008 | 2 | 2.3%
2009 | 2 | 2.3%
2010 | 5 | 5.8%
2011 | 8 | 9.3%
2012 | 6 | 7.0%
2013 | 1 | 1.2%
2014 | 2 | 2.3%
2015 | 2 | 2.3%

Maximum 10 values
Value | Count | Frequency (%)
2024 | 1 | 1.2%
2023 | 5 | 5.8%
2022 | 3 | 3.5%
2021 | 11 | 12.8%
2020 | 9 | 10.5%
2019 | 8 | 9.3%
2018 | 11 | 12.8%
2017 | 7 | 8.1%
2016 | 1 | 1.2%
2015 | 2 | 2.3%

비 고 (remarks)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 5
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
RPS | 66
RPA | 11
FIT | 5
<NA> | 3
자가용 | 1

Length

Max length: 4
Median length: 3
Mean length: 3.0348837
Min length: 3

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row: FIT
2nd row: FIT
3rd row: FIT
4th row: RPS
5th row: RPS

Common Values

Value | Count | Frequency (%)
RPS | 66 | 76.7%
RPA | 11 | 12.8%
FIT | 5 | 5.8%
<NA> | 3 | 3.5%
자가용 (for own use) | 1 | 1.2%
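The 50.3% imbalance figure quoted in the Alerts section can be reproduced from these counts. The Python sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2 of the number of classes. That formula matches the reported number, but it is a reconstruction from the figures here, not a quotation of the profiler's definition.

```python
import math

# Counts for 비 고: RPS, RPA, FIT, <NA>, 자가용.
counts = [66, 11, 5, 3, 1]

n_obs = sum(counts)
probs = [c / n_obs for c in counts]

# Shannon entropy in bits, and its maximum for this many classes.
entropy = -sum(p * math.log2(p) for p in probs)
max_entropy = math.log2(len(counts))

# Assumed definition: 0 = perfectly balanced, 1 = a single class.
imbalance = 1 - entropy / max_entropy

print(f"entropy={entropy:.4f} bits, imbalance={100 * imbalance:.1f}%")
```

A score near 0.5 means the distribution carries about half the entropy of five equally likely classes, driven by RPS accounting for 66 of the 86 rows.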

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
rps | 66 | 76.7%
rpa | 11 | 12.8%
fit | 5 | 5.8%
na | 3 | 3.5%
자가용 | 1 | 1.2%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

 | 에너지원 | 구분 | 발전소명 | 용량(MW) | 준공년도 | 비 고
에너지원 | 1.000 | 0.683 | 1.000 | 0.760 | 0.344 | 0.421
구분 | 0.683 | 1.000 | 1.000 | 0.827 | 0.000 | 0.132
발전소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
용량(MW) | 0.760 | 0.827 | 1.000 | 1.000 | 0.383 | 0.000
준공년도 | 0.344 | 0.000 | 1.000 | 0.383 | 1.000 | 0.765
비 고 | 0.421 | 0.132 | 1.000 | 0.000 | 0.765 | 1.000

 | 에너지원 | 구분 | 비 고
에너지원 | 1.000 | 0.489 | 0.352
구분 | 0.489 | 1.000 | 0.084
비 고 | 0.352 | 0.084 | 1.000

 | 용량(MW) | 준공년도 | 에너지원 | 구분 | 비 고
용량(MW) | 1.000 | 0.228 | 0.548 | 0.623 | 0.000
준공년도 | 0.228 | 1.000 | 0.183 | 0.000 | 0.595
에너지원 | 0.548 | 0.183 | 1.000 | 0.489 | 0.352
구분 | 0.623 | 0.000 | 0.489 | 1.000 | 0.084
비 고 | 0.000 | 0.595 | 0.352 | 0.084 | 1.000
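The high-correlation alerts at the top of the report line up with the five-variable matrix directly above. The Python sketch below lists the variable pairs whose coefficient is at least 0.5. The 0.5 cut-off is an assumption chosen because it reproduces the alerts; the report itself does not state the threshold.

```python
from itertools import combinations

names = ["용량(MW)", "준공년도", "에너지원", "구분", "비 고"]

# Coefficients copied from the five-variable matrix in the report.
matrix = [
    [1.000, 0.228, 0.548, 0.623, 0.000],
    [0.228, 1.000, 0.183, 0.000, 0.595],
    [0.548, 0.183, 1.000, 0.489, 0.352],
    [0.623, 0.000, 0.489, 1.000, 0.084],
    [0.000, 0.595, 0.352, 0.084, 1.000],
]

THRESHOLD = 0.5  # assumed alert cut-off, not stated in the report

# Walk the upper triangle so each pair is reported once.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, coef in high_pairs:
    print(f"{left} <-> {right}: {coef:.3f}")
```

With this threshold the pairs are 용량(MW) with 에너지원, 용량(MW) with 구분, and 준공년도 with 비 고, the same relationships the alerts list; 에너지원 with 구분 (0.489) falls just below it.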

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

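First rows
 | 에너지원 | 구분 | 발전소명 | 용량(MW) | 준공년도 | 비 고
0 | 풍 력 | 자체 | 한경풍력 1단계(1~4호기) | 6.0 | 2004 | FIT
1 | 풍 력 | 자체 | 한경풍력 2단계(5~9호기) | 15.0 | 2007 | FIT
2 | 풍 력 | 자체 | 성산풍력 1단계(1~6호기) | 12.0 | 2009 | FIT
3 | 풍 력 | 자체 | 성산풍력 2단계(7~10호기) | 8.0 | 2010 | RPS
4 | 풍 력 | SPC | 태백풍력(1~9호기) | 18.0 | 2012 | RPS
5 | 풍 력 | SPC | 창죽풍력(1~8호기) | 16.0 | 2013 | RPS
6 | 풍 력 | SPC | 평창풍력(1~15호기) | 30.0 | 2016 | RPS
7 | 풍 력 | SPC | 정암풍력(1~14호기) | 32.2 | 2018 | RPS
8 | 풍 력 | SPC | 한국해상풍력 | 60.0 | 2019 | RPS
9 | 풍 력 | SPC | 귀네미풍력 | 19.8 | 2020 | RPS

Last rows
 | 에너지원 | 구분 | 발전소명 | 용량(MW) | 준공년도 | 비 고
76 | 연료전지 | 자체 | 영월#1 연료전지 | 15.0 | 2021 | RPS
77 | 소수력 | 자체 | 남제주 소수력 | 0.09 | 2011 | 자가용
78 | 소수력 | 자체 | 행원 소수력 | 0.06 | 2010 | RPS
79 | 소수력 | 자체 | 삼척 소수력 | 3.332 | 2017 | RPS
80 | 바이오 | 자체 | 하동혼소#1~4(3%) | 60.0 | 2012 | RPS
81 | 바이오 | 자체 | 남제주바이오중유 #1(전소) | 100.0 | 2014 | RPS
82 | 바이오 | 자체 | 하동혼소#5,6(5%) | 50.0 | 2014 | RPS
83 | 바이오 | 자체 | 삼척혼소#1,2 | 102.2 | 2018 | RPS
84 | 바이오 | 자체 | 남제주바이오중유#2(전소) | 100.0 | 2019 | RPS
85 | 바이오 | 자체 | 하동혼소#7,8(5%) | 50.0 | 2023 | RPS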