Overview

Dataset statistics

Number of variables: 5
Number of observations: 93
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 KiB
Average record size in memory: 43.4 B
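
As a quick consistency check on the two memory figures above, the following sketch multiplies the reported average record size by the number of observations. It assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" block.
n_observations = 93
avg_record_bytes = 43.4  # reported average record size, in bytes

# Total size implied by the average record size.
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024  # assumption: 1 KiB = 1024 bytes

print(f"{total_bytes:.1f} B = {total_kib:.1f} KiB")
```

The result rounds to the 3.9 KiB reported as the total size in memory.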

Variable types

Categorical: 2
Text: 1
Numeric: 2

Dataset

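Description: Provides annual operating-hour information for Korea East-West Power (한국동서발전). Each record consists of the fields 사업소 (plant), 중분류 (sub-category), 호기 (unit), 용량(kW) (capacity in kW), and 발전시간(HH) (generation hours).
URL: https://www.data.go.kr/data/15064443/fileData.do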

Alerts

용량(kW) is highly overall correlated with 사업소 and 1 other field (High correlation)
발전시간(HH) is highly overall correlated with 사업소 (High correlation)
사업소 is highly overall correlated with 용량(kW) and 1 other field (High correlation)
호기 is highly overall correlated with 용량(kW) (High correlation)
발전시간(HH) has 2 (2.2%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 12:47:39.578279
Analysis finished: 2023-12-12 12:47:40.408012
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업소
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Top categories (count): 태양광 (49), 울산복합 (12), 당진 (10), 일산복합 (8), 연료전지 (7), Other values (4) (7)

Length

Max length: 5
Median length: 3
Mean length: 3.2043011
Min length: 2

Unique

Unique: 2
Unique (%): 2.2%

Sample

1st row: 당진
2nd row: 당진
3rd row: 당진
4th row: 당진
5th row: 당진

Common Values

Value | Count | Frequency (%)
태양광 | 49 | 52.7%
울산복합 | 12 | 12.9%
당진 | 10 | 10.8%
일산복합 | 8 | 8.6%
연료전지 | 7 | 7.5%
울산기력 | 3 | 3.2%
동해 | 2 | 2.2%
풍력 | 1 | 1.1%
바이오매스 | 1 | 1.1%
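
The frequency column, along with the Distinct and Unique figures reported for 사업소, can be recomputed from the counts alone. The sketch below assumes that "Unique" means categories that occur exactly once, which is consistent with the numbers shown; the dictionary and variable names are illustrative.

```python
# Counts copied from the Common Values table for 사업소.
counts = {
    "태양광": 49, "울산복합": 12, "당진": 10, "일산복합": 8, "연료전지": 7,
    "울산기력": 3, "동해": 2, "풍력": 1, "바이오매스": 1,
}

n_rows = sum(counts.values())                        # total observations
distinct = len(counts)                               # number of different categories
unique = sum(1 for c in counts.values() if c == 1)   # categories seen exactly once

# Relative frequency of each category, as a percentage of all rows.
for value, count in counts.items():
    print(f"{value}: {count} ({100 * count / n_rows:.1f}%)")

print(n_rows, distinct, unique)
```

The totals come out to 93 rows, 9 distinct categories, and 2 unique categories, matching the summary for this variable.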

Length

Histogram of lengths of the category

중분류
Text

Distinct: 67
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Length

Max length: 17
Median length: 13
Mean length: 7.3870968
Min length: 2

Characters and Unicode

Total characters: 687
Distinct characters: 134
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 62.4%

Sample

1st row: 당진
2nd row: 당진
3rd row: 당진
4th row: 당진
5th row: 당진

Words

Value | Count | Frequency (%)
태양광 | 24 | 18.6%
당진 | 10 | 7.8%
일산cc1 | 5 | 3.9%
동해 | 4 | 3.1%
발전설비 | 4 | 3.1%
일산cc2 | 3 | 2.3%
울산cc1 | 3 | 2.3%
울산cc2 | 3 | 2.3%
울산cc3 | 3 | 2.3%
울산cc4 | 3 | 2.3%
Other values (63) | 67 | 51.9%

Most occurring characters

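Value | Count | Frequency (%)
(Hangul character) | 55 | 8.0%
(Hangul character) | 55 | 8.0%
(Hangul character) | 49 | 7.1%
C | 42 | 6.1%
(space) | 37 | 5.4%
(Hangul character) | 34 | 4.9%
(Hangul character) | 23 | 3.3%
(Hangul character) | 18 | 2.6%
(Hangul character) | 18 | 2.6%
1 | 13 | 1.9%
Other values (124) | 343 | 49.9%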

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 559 | 81.4%
Uppercase Letter | 48 | 7.0%
Space Separator | 37 | 5.4%
Decimal Number | 33 | 4.8%
Close Punctuation | 4 | 0.6%
Open Punctuation | 4 | 0.6%
Other Punctuation | 1 | 0.1%
Connector Punctuation | 1 | 0.1%

Most frequent character per category

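Other Letter
Value | Count | Frequency (%)
(Hangul character) | 55 | 9.8%
(Hangul character) | 55 | 9.8%
(Hangul character) | 49 | 8.8%
(Hangul character) | 34 | 6.1%
(Hangul character) | 23 | 4.1%
(Hangul character) | 18 | 3.2%
(Hangul character) | 18 | 3.2%
(Hangul character) | 12 | 2.1%
(Hangul character) | 12 | 2.1%
(Hangul character) | 11 | 2.0%
Other values (111) | 272 | 48.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 42 | 87.5%
S | 4 | 8.3%
E | 1 | 2.1%
M | 1 | 2.1%

Decimal Number
Value | Count | Frequency (%)
1 | 13 | 39.4%
2 | 11 | 33.3%
4 | 5 | 15.2%
3 | 4 | 12.1%

Space Separator
Value | Count | Frequency (%)
(space) | 37 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
# | 1 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%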

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 559 | 81.4%
Common | 80 | 11.6%
Latin | 48 | 7.0%

Most frequent character per script

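Hangul
Value | Count | Frequency (%)
(Hangul character) | 55 | 9.8%
(Hangul character) | 55 | 9.8%
(Hangul character) | 49 | 8.8%
(Hangul character) | 34 | 6.1%
(Hangul character) | 23 | 4.1%
(Hangul character) | 18 | 3.2%
(Hangul character) | 18 | 3.2%
(Hangul character) | 12 | 2.1%
(Hangul character) | 12 | 2.1%
(Hangul character) | 11 | 2.0%
Other values (111) | 272 | 48.7%

Common
Value | Count | Frequency (%)
(space) | 37 | 46.2%
1 | 13 | 16.2%
2 | 11 | 13.8%
4 | 5 | 6.2%
) | 4 | 5.0%
( | 4 | 5.0%
3 | 4 | 5.0%
# | 1 | 1.2%
_ | 1 | 1.2%

Latin
Value | Count | Frequency (%)
C | 42 | 87.5%
S | 4 | 8.3%
E | 1 | 2.1%
M | 1 | 2.1%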

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 559 | 81.4%
ASCII | 128 | 18.6%

Most frequent character per block

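Hangul
Value | Count | Frequency (%)
(Hangul character) | 55 | 9.8%
(Hangul character) | 55 | 9.8%
(Hangul character) | 49 | 8.8%
(Hangul character) | 34 | 6.1%
(Hangul character) | 23 | 4.1%
(Hangul character) | 18 | 3.2%
(Hangul character) | 18 | 3.2%
(Hangul character) | 12 | 2.1%
(Hangul character) | 12 | 2.1%
(Hangul character) | 11 | 2.0%
Other values (111) | 272 | 48.7%

ASCII
Value | Count | Frequency (%)
C | 42 | 32.8%
(space) | 37 | 28.9%
1 | 13 | 10.2%
2 | 11 | 8.6%
4 | 5 | 3.9%
) | 4 | 3.1%
( | 4 | 3.1%
S | 4 | 3.1%
3 | 4 | 3.1%
# | 1 | 0.8%
Other values (3) | 3 | 2.3%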

호기
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 34.4%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Top categories (count): 1 (48), CG3 (2), CG2 (2), 4 (2), 5 (2), Other values (27) (37)

Length

Max length: 4
Median length: 1
Mean length: 1.5913978
Min length: 1

Unique

Unique: 17
Unique (%): 18.3%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Common Values

Value | Count | Frequency (%)
1 | 48 | 51.6%
CG3 | 2 | 2.2%
CG2 | 2 | 2.2%
4 | 2 | 2.2%
5 | 2 | 2.2%
6 | 2 | 2.2%
CG5 | 2 | 2.2%
CS1 | 2 | 2.2%
CG1 | 2 | 2.2%
CG4 | 2 | 2.2%
Other values (22) | 27 | 29.0%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
1 | 48 | 51.6%
cg1 | 2 | 2.2%
cg3 | 2 | 2.2%
f4 | 2 | 2.2%
s1 | 2 | 2.2%
cg6 | 2 | 2.2%
cs2 | 2 | 2.2%
cg4 | 2 | 2.2%
2 | 2 | 2.2%
cs1 | 2 | 2.2%
Other values (22) | 27 | 29.0%

용량(kW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 61
Distinct (%): 65.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 115776.34
Minimum: 88
Maximum: 1020000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 88
5-th percentile: 198.4
Q1: 693
Median: 2800
Q3: 150000
95-th percentile: 500000
Maximum: 1020000
Range: 1019912
Interquartile range (IQR): 149307
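
The range and interquartile range listed for 용량(kW) follow directly from the other quantiles. A minimal sketch of that arithmetic, using the values as reported (variable names are illustrative):

```python
# Quantiles of 용량(kW), copied from the report.
minimum, q1, q3, maximum = 88, 693, 150000, 1020000

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)  # 1019912 149307
```

Because the IQR is almost as large as Q3 itself, the lower quartile of units is tiny compared with the upper quartile, which fits the strongly right-skewed capacity distribution.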

Descriptive statistics

Standard deviation: 206546.59
Coefficient of variation (CV): 1.7840138
Kurtosis: 6.7393116
Mean: 115776.34
Median Absolute Deviation (MAD): 2510.45
Skewness: 2.4392689
Sum: 10767199
Variance: 4.2661493 × 10^10
Monotonicity: Not monotonic
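
Two of these figures are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below recomputes both from the rounded values in the report, so the last digits may differ slightly from the values the profiler computed at full precision.

```python
import math

# Rounded statistics for 용량(kW), copied from the report.
mean = 115776.34
std = 206546.59

cv = std / mean       # coefficient of variation: relative spread
variance = std ** 2   # variance is the squared standard deviation

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
```

A CV well above 1 indicates that the spread of capacities is larger than the typical capacity itself.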
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
100000.0 | 10 | 10.8%
500000.0 | 8 | 8.6%
150000.0 | 6 | 6.5%
200000.0 | 3 | 3.2%
400000.0 | 3 | 3.2%
15000.0 | 2 | 2.2%
999.0 | 2 | 2.2%
1020000.0 | 2 | 2.2%
4200.0 | 2 | 2.2%
998.0 | 2 | 2.2%
Other values (51) | 53 | 57.0%

Minimum 10 values

Value | Count | Frequency (%)
88.0 | 1 | 1.1%
95.0 | 1 | 1.1%
102.0 | 1 | 1.1%
109.0 | 1 | 1.1%
190.0 | 1 | 1.1%
204.0 | 1 | 1.1%
218.0 | 1 | 1.1%
289.55 | 1 | 1.1%
299.0 | 1 | 1.1%
326.0 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
1020000.0 | 2 | 2.2%
500000.0 | 8 | 8.6%
400000.0 | 3 | 3.2%
298700.0 | 1 | 1.1%
286600.0 | 2 | 2.2%
200000.0 | 3 | 3.2%
150000.0 | 6 | 6.5%
100000.0 | 10 | 10.8%
30000.0 | 1 | 1.1%
24952.0 | 1 | 1.1%

발전시간(HH)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 91
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4308.7634
Minimum: 0
Maximum: 8760
Zeros: 2
Zeros (%): 2.2%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5-th percentile: 371.8
Q1: 4017
Median: 4387
Q3: 4857
95-th percentile: 7982
Maximum: 8760
Range: 8760
Interquartile range (IQR): 840
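
The maximum of 8760 equals the number of hours in a non-leap year (24 × 365), so a unit at the maximum was running for the entire year. The sketch below expresses a few of the reported statistics as a share of the year; it assumes that 발전시간(HH) counts hours over one non-leap calendar year, which the report itself does not state.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assuming a non-leap year

# Statistics for 발전시간(HH), copied from the report.
stats = {"mean": 4308.7634, "median": 4387, "maximum": 8760}

# Fraction of the year each statistic corresponds to.
shares = {name: hours / HOURS_PER_YEAR for name, hours in stats.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, the typical unit was generating for roughly half of the year.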

Descriptive statistics

Standard deviation: 2104.6308
Coefficient of variation (CV): 0.48845356
Kurtosis: 0.2504147
Mean: 4308.7634
Median Absolute Deviation (MAD): 470
Skewness: -0.12580461
Sum: 400715
Variance: 4429471
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2 | 2.2%
8760 | 2 | 2.2%
174 | 1 | 1.1%
4884 | 1 | 1.1%
4516 | 1 | 1.1%
4164 | 1 | 1.1%
4317 | 1 | 1.1%
4136 | 1 | 1.1%
4405 | 1 | 1.1%
4315 | 1 | 1.1%
Other values (81) | 81 | 87.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2 | 2.2%
174 | 1 | 1.1%
176 | 1 | 1.1%
346 | 1 | 1.1%
389 | 1 | 1.1%
394 | 1 | 1.1%
460 | 1 | 1.1%
477 | 1 | 1.1%
615 | 1 | 1.1%
633 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
8760 | 2 | 2.2%
8725 | 1 | 1.1%
8675 | 1 | 1.1%
8657 | 1 | 1.1%
7532 | 1 | 1.1%
7460 | 1 | 1.1%
7447 | 1 | 1.1%
7406 | 1 | 1.1%
7366 | 1 | 1.1%
7321 | 1 | 1.1%

Interactions


Correlations

 | 사업소 | 중분류 | 호기 | 용량(kW) | 발전시간(HH)
사업소 | 1.000 | 1.000 | 0.285 | 0.904 | 0.936
중분류 | 1.000 | 1.000 | 0.000 | 0.804 | 0.866
호기 | 0.285 | 0.000 | 1.000 | 0.938 | 0.624
용량(kW) | 0.904 | 0.804 | 0.938 | 1.000 | 0.709
발전시간(HH) | 0.936 | 0.866 | 0.624 | 0.709 | 1.000

 | 사업소 | 호기
사업소 | 1.000 | 0.071
호기 | 0.071 | 1.000

 | 용량(kW) | 발전시간(HH) | 사업소 | 호기
용량(kW) | 1.000 | 0.111 | 0.703 | 0.635
발전시간(HH) | 0.111 | 1.000 | 0.593 | 0.239
사업소 | 0.703 | 0.593 | 1.000 | 0.071
호기 | 0.635 | 0.239 | 0.071 | 1.000
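
To make the first (5 × 5) correlation matrix easier to read, the sketch below checks that it is symmetric and lists the variable pairs from strongest to weakest. The values are copied from the report; the report does not say which correlation measure each matrix uses, so the code makes no assumption about that and simply ranks the numbers as given.

```python
# Labels and values of the first correlation matrix, copied from the report.
labels = ["사업소", "중분류", "호기", "용량(kW)", "발전시간(HH)"]
matrix = [
    [1.000, 1.000, 0.285, 0.904, 0.936],
    [1.000, 1.000, 0.000, 0.804, 0.866],
    [0.285, 0.000, 1.000, 0.938, 0.624],
    [0.904, 0.804, 0.938, 1.000, 0.709],
    [0.936, 0.866, 0.624, 0.709, 1.000],
]

n = len(labels)

# A correlation matrix should equal its own transpose.
is_symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

# Collect each off-diagonal pair once (upper triangle) and sort by strength.
pairs = sorted(
    ((matrix[i][j], labels[i], labels[j]) for i in range(n) for j in range(i + 1, n)),
    reverse=True,
)

print("symmetric:", is_symmetric)
for value, a, b in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

The strongest pair is 사업소 and 중분류 at 1.000, and the weakest is 중분류 and 호기 at 0.000.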

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

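Index | 사업소 | 중분류 | 호기 | 용량(kW) | 발전시간(HH)
0 | 당진 | 당진 | 1 | 500000.0 | 0
1 | 당진 | 당진 | 2 | 500000.0 | 8675
2 | 당진 | 당진 | 3 | 500000.0 | 7532
3 | 당진 | 당진 | 4 | 500000.0 | 477
4 | 당진 | 당진 | 5 | 500000.0 | 7366
5 | 당진 | 당진 | 6 | 500000.0 | 6814
6 | 당진 | 당진 | 7 | 500000.0 | 5183
7 | 당진 | 당진 | 8 | 500000.0 | 6605
8 | 당진 | 당진 | 9 | 1020000.0 | 6235
9 | 당진 | 당진 | 10 | 1020000.0 | 7460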
사업소중분류호기용량(kW)발전시간(HH)
83태양광황금물류센터태양광11100.04448
84풍력영광지산풍력13000.05858
85바이오매스동해바이오매스130000.06867
86연료전지동해 북평레포츠 연료전지14200.06218
87연료전지동해연료전지115000.08725
88연료전지울산수소연료전지11000.06292
89연료전지울산연료전지12800.00
90연료전지울산연료전지214200.08760
91연료전지일산연료전지4F45280.08657
92연료전지호남연료전지115000.08760