Overview

Dataset statistics

Number of variables: 5
Number of observations: 71
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.0 KiB
Average record size in memory: 43.9 B

Variable types

Categorical: 2
Text: 1
Numeric: 2

Dataset

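Description: Data on the renewable energy generation facilities of Korea Hydro & Nuclear Power (KHNP). It covers the capacity, location, and completion year of the renewable facilities that KHNP manages, including solar, wind, and small hydro.
URL: https://www.data.go.kr/data/3070081/fileData.do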

Alerts

[High correlation] 설비용량 (MW) is highly overall correlated with 위치
[High correlation] 준공연도 is highly overall correlated with 구분 and 1 other field
[High correlation] 구분 is highly overall correlated with 준공연도 and 1 other field
[High correlation] 위치 is highly overall correlated with 설비용량 (MW) and 2 other fields
[Unique] 사업명 has unique values

Reproduction

Analysis started: 2023-12-12 06:40:45.297006
Analysis finished: 2023-12-12 06:40:46.485718
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
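
The reported duration can be checked against the two timestamps. The sketch below is only an illustration using Python's standard library; the timestamp format string is an assumption based on how the values are printed in this report.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 06:40:45.297006", FMT)
finished = datetime.strptime("2023-12-12 06:40:46.485718", FMT)

# Elapsed wall-clock time in seconds between start and finish.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the duration shown above.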

Variables

구분 (category)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

[Figure: bar chart of category counts for 태양광, 소수력, 일반수력, and 풍력]

Length

Max length: 4
Median length: 3
Mean length: 3.084507
Min length: 2

Unique

Unique: 1
Unique (%): 1.4%

Sample

1st row: 태양광
2nd row: 태양광
3rd row: 태양광
4th row: 태양광
5th row: 태양광

Common Values

Value | Count | Frequency (%)
태양광 (solar) | 55 | 77.5%
소수력 (small hydro) | 8 | 11.3%
일반수력 (conventional hydro) | 7 | 9.9%
풍력 (wind) | 1 | 1.4%
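
The frequency column is each count divided by the 71 observations. A minimal sketch of that arithmetic follows; the helper name `frequency_table` is hypothetical and not part of ydata-profiling.

```python
# Counts copied from the Common Values table for 구분.
counts = {"태양광": 55, "소수력": 8, "일반수력": 7, "풍력": 1}


def frequency_table(counts):
    """Return each value's share of all observations, in percent (one decimal)."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


freq = frequency_table(counts)
for value, pct in freq.items():
    print(f"{value}\t{counts[value]}\t{pct}%")
```

The same division gives the Distinct (%) figure: 4 distinct values out of 71 rows is about 5.6%.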

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 태양광 (55, 77.5%), 소수력 (8, 11.3%), 일반수력 (7, 9.9%), and 풍력 (1, 1.4%)]

사업명 (project name)
Text

UNIQUE

Distinct: 71
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

Length

Max length: 10
Median length: 9
Mean length: 5.0704225
Min length: 2

Characters and Unicode

Total characters: 360
Distinct characters: 112
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 100.0%

Sample

1st row: 한빛솔라파크#1
2nd row: 한빛솔라파크#2
3rd row: 한빛솔라파크#3
4th row: 예천#1
5th row: 예천#2

Value | Count | Frequency (%)
제주 | 2 | 2.5%
고리 | 2 | 2.5%
2호 | 2 | 2.5%
한동태양광 | 1 | 1.3%
녹동산단4 | 1 | 1.3%
녹동산단3 | 1 | 1.3%
녹동산단2 | 1 | 1.3%
월성3발주차장 | 1 | 1.3%
월성자재창고 | 1 | 1.3%
청평유휴부지#2 | 1 | 1.3%
Other values (66) | 66 | 83.5%

Most occurring characters

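Value | Count | Frequency (%)
(not preserved) | 23 | 6.4%
(not preserved) | 19 | 5.3%
# | 18 | 5.0%
(not preserved) | 18 | 5.0%
(not preserved) | 10 | 2.8%
2 | 10 | 2.8%
(not preserved) | 9 | 2.5%
(space) | 8 | 2.2%
(not preserved) | 7 | 1.9%
(not preserved) | 7 | 1.9%
Other values (102) | 231 | 64.2%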

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 306 | 85.0%
Decimal Number | 24 | 6.7%
Other Punctuation | 18 | 5.0%
Space Separator | 8 | 2.2%
Open Punctuation | 2 | 0.6%
Close Punctuation | 2 | 0.6%

Most frequent character per category

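Other Letter

Value | Count | Frequency (%)
(not preserved) | 23 | 7.5%
(not preserved) | 19 | 6.2%
(not preserved) | 18 | 5.9%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 2.9%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
Other values (92) | 192 | 62.7%

Decimal Number

Value | Count | Frequency (%)
2 | 10 | 41.7%
1 | 7 | 29.2%
3 | 3 | 12.5%
4 | 2 | 8.3%
5 | 1 | 4.2%
6 | 1 | 4.2%

Other Punctuation

Value | Count | Frequency (%)
# | 18 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 8 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 2 | 100.0%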

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 306 | 85.0%
Common | 54 | 15.0%

Most frequent character per script

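Hangul

Value | Count | Frequency (%)
(not preserved) | 23 | 7.5%
(not preserved) | 19 | 6.2%
(not preserved) | 18 | 5.9%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 2.9%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
Other values (92) | 192 | 62.7%

Common

Value | Count | Frequency (%)
# | 18 | 33.3%
2 | 10 | 18.5%
(space) | 8 | 14.8%
1 | 7 | 13.0%
3 | 3 | 5.6%
4 | 2 | 3.7%
( | 2 | 3.7%
) | 2 | 3.7%
5 | 1 | 1.9%
6 | 1 | 1.9%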

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 306 | 85.0%
ASCII | 54 | 15.0%

Most frequent character per block

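Hangul

Value | Count | Frequency (%)
(not preserved) | 23 | 7.5%
(not preserved) | 19 | 6.2%
(not preserved) | 18 | 5.9%
(not preserved) | 10 | 3.3%
(not preserved) | 9 | 2.9%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
(not preserved) | 7 | 2.3%
Other values (92) | 192 | 62.7%

ASCII

Value | Count | Frequency (%)
# | 18 | 33.3%
2 | 10 | 18.5%
(space) | 8 | 14.8%
1 | 7 | 13.0%
3 | 3 | 5.6%
4 | 2 | 3.7%
( | 2 | 3.7%
) | 2 | 3.7%
5 | 1 | 1.9%
6 | 1 | 1.9%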

설비용량 (MW) (installed capacity in MW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 68
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.5340282
Minimum: 0.046
Maximum: 140.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B

Quantile statistics

Minimum: 0.046
5-th percentile: 0.093
Q1: 0.518
Median: 0.939
Q3: 1.785
95-th percentile: 72.1
Maximum: 140.1
Range: 140.054
Interquartile range (IQR): 1.267

Descriptive statistics

Standard deviation: 27.676907
Coefficient of variation (CV): 2.9029605
Kurtosis: 12.032604
Mean: 9.5340282
Median Absolute Deviation (MAD): 0.511
Skewness: 3.5284822
Sum: 676.916
Variance: 766.0112
Monotonicity: Not monotonic
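
Several of these figures are derived from one another, so they can be cross-checked without the raw data. The sketch below is a hedged illustration using the standard definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum). The tolerance values are assumptions chosen to absorb the rounding in the printed numbers.

```python
import math

# Values copied from the statistics reported for 설비용량 (MW).
n = 71
total = 676.916
mean = 9.5340282
std = 27.676907
variance = 766.0112
cv = 2.9029605
q1, q3 = 0.518, 1.785
minimum, maximum = 0.046, 140.1
iqr = 1.267
value_range = 140.054

# Each entry compares a reported figure with the value recomputed from the others.
checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "iqr = q3 - q1": math.isclose(q3 - q1, iqr, abs_tol=1e-9),
    "range = max - min": math.isclose(maximum - minimum, value_range, abs_tol=1e-9),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

The mean (about 9.53 MW) sits far above the median (0.939 MW), which fits the strong positive skewness: a handful of large hydro plants dominate the total capacity.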

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value | Count | Frequency (%)
0.999 | 3 | 4.2%
0.498 | 2 | 2.8%
1.25 | 1 | 1.4%
0.288 | 1 | 1.4%
0.409 | 1 | 1.4%
1.82 | 1 | 1.4%
0.84 | 1 | 1.4%
0.278 | 1 | 1.4%
0.312 | 1 | 1.4%
0.154 | 1 | 1.4%
Other values (58) | 58 | 81.7%

Smallest values

Value | Count | Frequency (%)
0.046 | 1 | 1.4%
0.05 | 1 | 1.4%
0.073 | 1 | 1.4%
0.091 | 1 | 1.4%
0.095 | 1 | 1.4%
0.098 | 1 | 1.4%
0.154 | 1 | 1.4%
0.194 | 1 | 1.4%
0.245 | 1 | 1.4%
0.278 | 1 | 1.4%

Largest values

Value | Count | Frequency (%)
140.1 | 1 | 1.4%
120.0 | 1 | 1.4%
108.0 | 1 | 1.4%
82.0 | 1 | 1.4%
62.2 | 1 | 1.4%
48.0 | 1 | 1.4%
34.8 | 1 | 1.4%
10.947 | 1 | 1.4%
5.146 | 1 | 1.4%
4.5 | 1 | 1.4%

준공연도 (completion year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.8732
Minimum: 1935
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 771.0 B

Quantile statistics

Minimum: 1935
5-th percentile: 1951
Q1: 2011.5
Median: 2019
Q3: 2021
95-th percentile: 2022
Maximum: 2022
Range: 87
Interquartile range (IQR): 9.5

Descriptive statistics

Standard deviation: 21.948596
Coefficient of variation (CV): 0.010920388
Kurtosis: 4.0742914
Mean: 2009.8732
Median Absolute Deviation (MAD): 2
Skewness: -2.2775213
Sum: 142701
Variance: 481.74085
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]

Most frequent values

Value | Count | Frequency (%)
2021 | 17 | 23.9%
2020 | 9 | 12.7%
2022 | 9 | 12.7%
2018 | 6 | 8.5%
2019 | 6 | 8.5%
2012 | 3 | 4.2%
2017 | 3 | 4.2%
2008 | 2 | 2.8%
2011 | 2 | 2.8%
2007 | 1 | 1.4%
Other values (13) | 13 | 18.3%

Smallest values

Value | Count | Frequency (%)
1935 | 1 | 1.4%
1943 | 1 | 1.4%
1944 | 1 | 1.4%
1945 | 1 | 1.4%
1957 | 1 | 1.4%
1964 | 1 | 1.4%
1967 | 1 | 1.4%
1972 | 1 | 1.4%
1978 | 1 | 1.4%
1990 | 1 | 1.4%

Largest values

Value | Count | Frequency (%)
2022 | 9 | 12.7%
2021 | 17 | 23.9%
2020 | 9 | 12.7%
2019 | 6 | 8.5%
2018 | 6 | 8.5%
2017 | 3 | 4.2%
2012 | 3 | 4.2%
2011 | 2 | 2.8%
2010 | 1 | 1.4%
2008 | 2 | 2.8%

위치 (location)
Categorical

HIGH CORRELATION

Distinct: 24
Distinct (%): 33.8%
Missing: 0
Missing (%): 0.0%
Memory size: 700.0 B

[Figure: bar chart of category counts for 경북 경주시, 제주도 서귀포시, 제주도 제주시, 전남 영광군, 경기도 가평군, and other values (19)]

Length

Max length: 9
Median length: 8
Mean length: 6.7746479
Min length: 6

Unique

Unique: 11
Unique (%): 15.5%

Sample

1st row: 전남 영광군
2nd row: 전남 영광군
3rd row: 전남 영광군
4th row: 경북 예천군
5th row: 경북 예천군

Common Values

Value | Count | Frequency (%)
경북 경주시 | 10 | 14.1%
제주도 서귀포시 | 10 | 14.1%
제주도 제주시 | 8 | 11.3%
전남 영광군 | 7 | 9.9%
경기도 가평군 | 6 | 8.5%
경북 예천군 | 3 | 4.2%
강원도 춘천시 | 3 | 4.2%
부산광역시 기장군 | 3 | 4.2%
경기도 연천군 | 2 | 2.8%
경북 청송군 | 2 | 2.8%
Other values (14) | 17 | 23.9%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
제주도 | 18 | 12.7%
경북 | 15 | 10.6%
전남 | 10 | 7.0%
경기도 | 10 | 7.0%
경주시 | 10 | 7.0%
서귀포시 | 10 | 7.0%
제주시 | 8 | 5.6%
영광군 | 7 | 4.9%
강원도 | 7 | 4.9%
가평군 | 6 | 4.2%
Other values (24) | 41 | 28.9%

Interactions

[Figures: four interaction plots]

Correlations

 | 구분 | 사업명 | 설비용량 (MW) | 준공연도 | 위치
구분 | 1.000 | 1.000 | 0.838 | 0.826 | 0.918
사업명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
설비용량 (MW) | 0.838 | 1.000 | 1.000 | 0.920 | 0.964
준공연도 | 0.826 | 1.000 | 0.920 | 1.000 | 0.956
위치 | 0.918 | 1.000 | 0.964 | 0.956 | 1.000

 | 위치 | 구분
위치 | 1.000 | 0.568
구분 | 0.568 | 1.000

 | 설비용량 (MW) | 준공연도 | 구분 | 위치
설비용량 (MW) | 1.000 | -0.336 | 0.494 | 0.631
준공연도 | -0.336 | 1.000 | 0.649 | 0.729
구분 | 0.494 | 0.649 | 1.000 | 0.568
위치 | 0.631 | 0.729 | 0.568 | 1.000
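
The high-correlation alerts at the top of the report appear to follow from the last matrix above (the one covering 설비용량 (MW), 준공연도, 구분, and 위치). The sketch below is an illustration, not ydata-profiling's own code: it assumes a cutoff of 0.5 on the absolute coefficient, and the function name `high_correlations` is hypothetical. Under that assumption it lists the same pairs as the Alerts section.

```python
columns = ["설비용량 (MW)", "준공연도", "구분", "위치"]

# Coefficients copied from the four-variable correlation matrix.
matrix = [
    [1.000, -0.336, 0.494, 0.631],
    [-0.336, 1.000, 0.649, 0.729],
    [0.494, 0.649, 1.000, 0.568],
    [0.631, 0.729, 0.568, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated anywhere in this report


def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns whose |coefficient| meets the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = high_correlations(columns, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name} is highly correlated with: {', '.join(partners)}")
```

Note that 설비용량 (MW) and 구분 fall just short of the assumed cutoff at 0.494, which is consistent with the alert for 설비용량 (MW) naming only 위치.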

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

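First rows

index | 구분 | 사업명 | 설비용량 (MW) | 준공연도 | 위치
0 | 태양광 | 한빛솔라파크#1 | 1.25 | 2007 | 전남 영광군
1 | 태양광 | 한빛솔라파크#2 | 1.75 | 2008 | 전남 영광군
2 | 태양광 | 한빛솔라파크#3 | 10.947 | 2012 | 전남 영광군
3 | 태양광 | 예천#1 | 1.386 | 2012 | 경북 예천군
4 | 태양광 | 예천#2 | 0.629 | 2012 | 경북 예천군
5 | 태양광 | 고리 #1 | 5.146 | 2017 | 부산광역시 기장군
6 | 태양광 | 농가참여형 | 0.073 | 2017 | 경기도 가평군
7 | 태양광 | 수력교육훈련센터 | 0.091 | 2017 | 경기도 가평군
8 | 태양광 | 청평양수운동장 | 0.095 | 2018 | 경기도 가평군
9 | 태양광 | 청송양수옥상 | 0.046 | 2018 | 경북 청송군

Last rows

index | 구분 | 사업명 | 설비용량 (MW) | 준공연도 | 위치
61 | 일반수력 | 섬진강 | 34.8 | 1945 | 전북 정읍시
62 | 일반수력 | 강릉 | 82.0 | 1990 | 강원도 강릉시
63 | 소수력 | 안흥 | 0.48 | 1978 | 강원도 횡성군
64 | 소수력 | 보성강 | 4.5 | 1935 | 전남 보성군
65 | 소수력 | 괴산 | 2.8 | 1957 | 충북 괴산군
66 | 소수력 | 토평 | 0.05 | 2011 | 경기도 구리시
67 | 소수력 | 무주 | 0.4 | 2003 | 전북 무주군
68 | 소수력 | 양양 | 1.4 | 2004 | 강원도 양양군
69 | 소수력 | 산청 | 1.0 | 2010 | 경남 산청군
70 | 소수력 | 예천 | 0.9 | 2011 | 경북 예천군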