Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 478.5 KiB
Average record size in memory: 49.0 B

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

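Description: Results of pre-use inspections of solar power generation facilities for the most recent five years (2017 to 2022), provided by the Korea Electrical Safety Corporation (한국전기안전공사). The data covers year, work category, facility category, prime mover type, capacity, and quantity.
URL: https://www.data.go.kr/data/15103232/fileData.do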

Alerts

Constant: 발전기종류 has constant value "태양광"
High correlation: 연도 is highly overall correlated with 용도
High correlation: 용도 is highly overall correlated with 연도
Imbalance: 용도 is highly imbalanced (64.2%)

Reproduction

Analysis started: 2023-12-12 19:57:47.967403
Analysis finished: 2023-12-12 19:57:48.537097
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.676
Minimum: 2017
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2017
5-th percentile: 2017
Q1: 2019
Median: 2020
Q3: 2021
95-th percentile: 2022
Maximum: 2022
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5255349
Coefficient of variation (CV): 0.00075533645
Kurtosis: -0.96948827
Mean: 2019.676
Median Absolute Deviation (MAD): 1
Skewness: -0.19483586
Sum: 20196760
Variance: 2.3272567
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
2020  2199  22.0%
2021  2163  21.6%
2019  1935  19.4%
2018  1366  13.7%
2022  1255  12.6%
2017  1082  10.8%

Value  Count  Frequency (%)
2017  1082  10.8%
2018  1366  13.7%
2019  1935  19.4%
2020  2199  22.0%
2021  2163  21.6%
2022  1255  12.6%

Value  Count  Frequency (%)
2022  1255  12.6%
2021  2163  21.6%
2020  2199  22.0%
2019  1935  19.4%
2018  1366  13.7%
2017  1082  10.8%
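
The descriptive statistics for 연도 can be cross-checked against the value counts above. The following is a minimal sketch, not the profiler's own code: it assumes the variance uses the sample (n - 1) denominator, which is the choice that reproduces the reported figures. The name `year_counts` is introduced here for illustration only.

```python
import math

# Value counts for 연도, copied from the frequency table in this report.
year_counts = {2017: 1082, 2018: 1366, 2019: 1935,
               2020: 2199, 2021: 2163, 2022: 1255}

n = sum(year_counts.values())                            # number of observations
total = sum(year * c for year, c in year_counts.items())  # "Sum" in the report
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(c * (year - mean) ** 2 for year, c in year_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                          # coefficient of variation

print(n, total, mean)
print(round(variance, 7), round(std, 7))
```

Under that assumption the recomputed sum, mean, variance, standard deviation, and coefficient of variation agree with the values listed in the report to the displayed precision.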

용도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

자가용(발전)  9320
사업용(발전)  680

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자가용(발전)
2nd row: 자가용(발전)
3rd row: 자가용(발전)
4th row: 자가용(발전)
5th row: 자가용(발전)

Common Values

Value  Count  Frequency (%)
자가용(발전)  9320  93.2%
사업용(발전)  680  6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
자가용(발전  9320  93.2%
사업용(발전  680  6.8%
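
The 64.2% figure in the imbalance alert for 용도 can be reproduced from the two class counts. The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for the number of classes; that assumption matches the reported value. The helper name `imbalance_score` and its handling of a single-class input are illustrative choices, not taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes)).

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. Assumed formula, see the lead-in above.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # With fewer than two classes the normalizer log2(k) is zero;
        # returning 0.0 here is an arbitrary convention for this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 자가용(발전) and 사업용(발전) from the Common Values table.
score = imbalance_score([9320, 680])
print(f"{score:.1%}")  # prints 64.2%
```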

발전기종류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

태양광  10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 태양광
2nd row: 태양광
3rd row: 태양광
4th row: 태양광
5th row: 태양광

Common Values

Value  Count  Frequency (%)
태양광  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
태양광  10000  100.0%

발전기용량
Text

Distinct: 7133
Distinct (%): 71.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 5.2963
Min length: 1

Characters and Unicode

Total characters: 52963
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
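
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two-letter codes `Nd` and `Po` correspond to the "Decimal Number" and "Other Punctuation" categories named in this report. The helper `category_counts` is a hypothetical name, and the input is only the five sample values shown for this variable, not the full column.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample values listed for 발전기용량 in this report.
sample = ["46.698", "67.23", "371.25", "259.875", "20.48"]
counts = category_counts(sample)

print(counts["Nd"])  # decimal digits: 24
print(counts["Po"])  # other punctuation (the decimal points): 5
```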

Unique

Unique: 5208
Unique (%): 52.1%

Sample

1st row: 46.698
2nd row: 67.23
3rd row: 371.25
4th row: 259.875
5th row: 20.48

Value  Count  Frequency (%)
17.64  6  0.1%
16  6  0.1%
9.6  6  0.1%
32.64  6  0.1%
72  6  0.1%
51  6  0.1%
46.08  6  0.1%
25.5  6  0.1%
43.2  6  0.1%
496.4  6  0.1%
Other values (7123)  9940  99.4%

Most occurring characters

Value  Count  Frequency (%)
.  9456  17.9%
2  5315  10.0%
1  5117  9.7%
9  5115  9.7%
5  5061  9.6%
4  4890  9.2%
8  4274  8.1%
6  3994  7.5%
3  3478  6.6%
7  3358  6.3%
Other values (2)  2905  5.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  43170  81.5%
Other Punctuation  9793  18.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2  5315  12.3%
1  5117  11.9%
9  5115  11.8%
5  5061  11.7%
4  4890  11.3%
8  4274  9.9%
6  3994  9.3%
3  3478  8.1%
7  3358  7.8%
0  2568  5.9%

Other Punctuation
Value  Count  Frequency (%)
.  9456  96.6%
,  337  3.4%

Most occurring scripts

Value  Count  Frequency (%)
Common  52963  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
.  9456  17.9%
2  5315  10.0%
1  5117  9.7%
9  5115  9.7%
5  5061  9.6%
4  4890  9.2%
8  4274  8.1%
6  3994  7.5%
3  3478  6.6%
7  3358  6.3%
Other values (2)  2905  5.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  52963  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
.  9456  17.9%
2  5315  10.0%
1  5117  9.7%
9  5115  9.7%
5  5061  9.6%
4  4890  9.2%
8  4274  8.1%
6  3994  7.5%
3  3478  6.6%
7  3358  6.3%
Other values (2)  2905  5.5%
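
The character tables for 발전기용량 list 337 commas alongside digits and decimal points, which may be why the column was typed as Text rather than as a number. The sketch below shows one way such strings could be converted, under the assumption (not confirmed by the report) that every comma is a thousands separator. `parse_number` is a hypothetical helper, and "1,234.5" is an invented input used only to exercise the comma case.

```python
def parse_number(text):
    """Convert a numeric string to float, dropping commas.

    Assumption: commas are thousands separators, never decimal marks.
    Raises ValueError if the remaining text is not a valid number.
    """
    return float(text.strip().replace(",", ""))

# Values taken from the Sample rows of this report.
print(parse_number("46.698"))   # 46.698
print(parse_number("20.48"))    # 20.48

# Invented example with a thousands separator (not from the dataset).
print(parse_number("1,234.5"))  # 1234.5
```

Converting through a checked helper like this, rather than silently coercing, lets malformed entries surface as errors before the column is treated as numeric.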

건수
Text

Distinct: 161
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 1
Mean length: 1.0965
Min length: 1

Characters and Unicode

Total characters: 10965
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 86
Unique (%): 0.9%

Sample

1st row: 1
2nd row: 3
3rd row: 1
4th row: 1
5th row: 10

Value  Count  Frequency (%)
1  5177  51.8%
2  1870  18.7%
3  754  7.5%
4  444  4.4%
5  286  2.9%
6  216  2.2%
7  141  1.4%
9  119  1.2%
8  112  1.1%
10  86  0.9%
Other values (151)  795  8.0%

Most occurring characters

Value  Count  Frequency (%)
1  5837  53.2%
2  2118  19.3%
3  955  8.7%
4  563  5.1%
5  394  3.6%
6  335  3.1%
7  229  2.1%
9  197  1.8%
8  186  1.7%
0  150  1.4%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  10964  > 99.9%
Other Punctuation  1  < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  5837  53.2%
2  2118  19.3%
3  955  8.7%
4  563  5.1%
5  394  3.6%
6  335  3.1%
7  229  2.1%
9  197  1.8%
8  186  1.7%
0  150  1.4%

Other Punctuation
Value  Count  Frequency (%)
,  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  10965  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  5837  53.2%
2  2118  19.3%
3  955  8.7%
4  563  5.1%
5  394  3.6%
6  335  3.1%
7  229  2.1%
9  197  1.8%
8  186  1.7%
0  150  1.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10965  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  5837  53.2%
2  2118  19.3%
3  955  8.7%
4  563  5.1%
5  394  3.6%
6  335  3.1%
7  229  2.1%
9  197  1.8%
8  186  1.7%
0  150  1.4%

Interactions


Correlations

      연도   용도
연도  1.000  0.588
용도  0.588  1.000

      연도   용도
연도  1.000  0.713
용도  0.713  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       연도  용도  발전기종류  발전기용량건수
19033  2022  자가용(발전)  태양광  46.6981
6049  2020  자가용(발전)  태양광  67.233
7830  2020  자가용(발전)  태양광  371.251
3108  2021  자가용(발전)  태양광  259.8751
16101  2017  자가용(발전)  태양광  20.4810
18424  2022  자가용(발전)  태양광  87.754
16450  2017  자가용(발전)  태양광  45.2256
19084  2022  자가용(발전)  태양광  71.551
9139  2019  자가용(발전)  태양광  13.865
6681  2020  자가용(발전)  태양광  100.7161

       연도  용도  발전기종류  발전기용량건수
9562  2019  자가용(발전)  태양광  25.6151
18127  2022  자가용(발전)  태양광  17.6753
7954  2020  자가용(발전)  태양광  401.662
18488  2022  자가용(발전)  태양광  16.522
4506  2020  자가용(발전)  태양광  5.257
17759  2017  자가용(발전)  태양광  998.644
11046  2019  자가용(발전)  태양광  148.743
18807  2022  자가용(발전)  태양광  17.681
6964  2020  자가용(발전)  태양광  151.7451
2796  2021  자가용(발전)  태양광  185.21