Overview

Dataset statistics

Number of variables: 15
Number of observations: 7555
Missing cells: 21198
Missing cells (%): 18.7%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 937.1 KiB
Average record size in memory: 127.0 B
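
The percentages above follow from the raw counts. The sketch below is only a consistency check on the reported figures; the variable names are illustrative and it assumes "missing cells (%)" is taken over all variables × observations.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 15
n_observations = 7555
missing_cells = 21198
total_memory_kib = 937.1

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both derived values round to the figures printed in the report (18.7% and 127.0 B).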

Variable types

Text: 2
Numeric: 5
Categorical: 7
DateTime: 1

Dataset

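Description: Solar power plant electricity business permit information (provision standard); original title: 태양광발전소 전기사업허가정보(제공표준)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=VI0D9IY634MNRGJITBI527985650&infSeq=1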

Alerts

Dataset has 1 (< 0.1%) duplicate row [Duplicates]
설치상세위치구분명 is highly overall correlated with 가동상태구분명 and 2 other fields [High correlation]
주파수(Hz) is highly overall correlated with 위도 and 5 other fields [High correlation]
세부용도 is highly overall correlated with 설치상세위치구분명 and 3 other fields [High correlation]
허가기관 is highly overall correlated with 위도 and 6 other fields [High correlation]
데이터기준일자 is highly overall correlated with 위도 and 4 other fields [High correlation]
위도 is highly overall correlated with 주파수(Hz) and 2 other fields [High correlation]
경도 is highly overall correlated with 주파수(Hz) and 2 other fields [High correlation]
설비용량(KW) is highly overall correlated with 설치면적(㎥) [High correlation]
설치연도 is highly overall correlated with 허가기관 and 1 other field [High correlation]
설치면적(㎥) is highly overall correlated with 설비용량(KW) and 1 other field [High correlation]
가동상태구분명 is highly overall correlated with 설치상세위치구분명 [High correlation]
공급전압(V) is highly overall correlated with 허가기관 [High correlation]
설치상세위치구분명 is highly imbalanced (71.8%) [Imbalance]
가동상태구분명 is highly imbalanced (73.0%) [Imbalance]
공급전압(V) is highly imbalanced (89.2%) [Imbalance]
주파수(Hz) is highly imbalanced (99.8%) [Imbalance]
소재지지번주소 has 1726 (22.8%) missing values [Missing]
위도 has 5495 (72.7%) missing values [Missing]
경도 has 5495 (72.7%) missing values [Missing]
허가일자 has 1724 (22.8%) missing values [Missing]
설치면적(㎥) has 6758 (89.5%) missing values [Missing]
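
The imbalance percentages in the alerts can be reproduced from the category counts listed later in the report. The sketch below uses one common definition, one minus the Shannon entropy normalised by its maximum. That this matches the profiler's exact computation is an assumption; it does agree with the reported 99.8% for 주파수(Hz) and 73.0% for 가동상태구분명. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the category counts.

    0.0 means perfectly balanced categories; values near 1.0 mean one
    category dominates. This definition is an assumption, not taken from
    the report itself.
    """
    total = sum(counts)
    n_categories = len(counts)
    if n_categories < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_categories)

# Category counts copied from the "Common Values" tables of the report.
frequency_hz = imbalance_score([7554, 1])        # 주파수(Hz): 60, 63
operating_status = imbalance_score([6934, 598, 23])  # 가동상태구분명

print(f"주파수(Hz): {frequency_hz:.1%}")
print(f"가동상태구분명: {operating_status:.1%}")
```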

Reproduction

Analysis started: 2024-05-10 21:07:23.600135
Analysis finished: 2024-05-10 21:07:35.854041
Duration: 12.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
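
As a quick check, the reported duration is simply the difference between the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-10 21:07:23.600135")
finished = datetime.fromisoformat("2024-05-10 21:07:35.854041")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```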

Variables

태양광발전시설명 (solar power facility name)
Text

Distinct: 7074
Distinct (%): 93.6%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 20
Median length: 18
Mean length: 10.559365
Min length: 2

Characters and Unicode

Total characters: 79776
Distinct characters: 685
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6763
Unique (%): 89.5%

Sample

1st row: (KCC)케이씨씨 여주태양광제3발전소
2nd row: (유)답곡발전
3rd row: (유)레나다
4th row: (유)방장 태양광발전소
5th row: (유)백석 태양광발전소
ValueCountFrequency (%)
태양광발전소 4318
33.3%
발전소 159
 
1.2%
2호 76
 
0.6%
1호 74
 
0.6%
태양광 54
 
0.4%
주식회사 39
 
0.3%
35
 
0.3%
3호 29
 
0.2%
그린드림 20
 
0.2%
2호기 16
 
0.1%
Other values (7043) 8139
62.8%

Most occurring characters

ValueCountFrequency (%)
7273
 
9.1%
7206
 
9.0%
7186
 
9.0%
6924
 
8.7%
6897
 
8.6%
6885
 
8.6%
5414
 
6.8%
2627
 
3.3%
1 1207
 
1.5%
2 1137
 
1.4%
Other values (675) 27020
33.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 68497
85.9%
Space Separator 5414
 
6.8%
Decimal Number 3848
 
4.8%
Uppercase Letter 752
 
0.9%
Other Punctuation 413
 
0.5%
Dash Punctuation 195
 
0.2%
Lowercase Letter 191
 
0.2%
Other Symbol 169
 
0.2%
Open Punctuation 156
 
0.2%
Close Punctuation 137
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7273
 
10.6%
7206
 
10.5%
7186
 
10.5%
6924
 
10.1%
6897
 
10.1%
6885
 
10.1%
2627
 
3.8%
631
 
0.9%
574
 
0.8%
552
 
0.8%
Other values (607) 21742
31.7%
Uppercase Letter
ValueCountFrequency (%)
S 97
12.9%
K 84
11.2%
C 70
 
9.3%
O 65
 
8.6%
E 57
 
7.6%
P 50
 
6.6%
J 44
 
5.9%
H 44
 
5.9%
F 34
 
4.5%
B 25
 
3.3%
Other values (13) 182
24.2%
Lowercase Letter
ValueCountFrequency (%)
p 25
13.1%
o 25
13.1%
e 22
11.5%
k 18
9.4%
a 17
8.9%
c 16
8.4%
r 11
 
5.8%
n 11
 
5.8%
l 9
 
4.7%
m 9
 
4.7%
Other values (10) 28
14.7%
Decimal Number
ValueCountFrequency (%)
1 1207
31.4%
2 1137
29.5%
3 600
15.6%
4 288
 
7.5%
5 204
 
5.3%
6 124
 
3.2%
7 81
 
2.1%
0 75
 
1.9%
8 70
 
1.8%
9 62
 
1.6%
Other Punctuation
ValueCountFrequency (%)
* 389
94.2%
. 7
 
1.7%
& 7
 
1.7%
; 7
 
1.7%
? 2
 
0.5%
: 1
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 155
99.4%
[ 1
 
0.6%
Other Symbol
ValueCountFrequency (%)
90
53.3%
79
46.7%
Letter Number
ValueCountFrequency (%)
2
50.0%
2
50.0%
Space Separator
ValueCountFrequency (%)
5414
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 195
100.0%
Close Punctuation
ValueCountFrequency (%)
) 137
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 68587
86.0%
Common 10242
 
12.8%
Latin 947
 
1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7273
 
10.6%
7206
 
10.5%
7186
 
10.5%
6924
 
10.1%
6897
 
10.1%
6885
 
10.0%
2627
 
3.8%
631
 
0.9%
574
 
0.8%
552
 
0.8%
Other values (608) 21832
31.8%
Latin
ValueCountFrequency (%)
S 97
 
10.2%
K 84
 
8.9%
C 70
 
7.4%
O 65
 
6.9%
E 57
 
6.0%
P 50
 
5.3%
J 44
 
4.6%
H 44
 
4.6%
F 34
 
3.6%
B 25
 
2.6%
Other values (35) 377
39.8%
Common
ValueCountFrequency (%)
5414
52.9%
1 1207
 
11.8%
2 1137
 
11.1%
3 600
 
5.9%
* 389
 
3.8%
4 288
 
2.8%
5 204
 
2.0%
- 195
 
1.9%
( 155
 
1.5%
) 137
 
1.3%
Other values (12) 516
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 68497
85.9%
ASCII 11106
 
13.9%
None 90
 
0.1%
Geometric Shapes 79
 
0.1%
Number Forms 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7273
 
10.6%
7206
 
10.5%
7186
 
10.5%
6924
 
10.1%
6897
 
10.1%
6885
 
10.1%
2627
 
3.8%
631
 
0.9%
574
 
0.8%
552
 
0.8%
Other values (607) 21742
31.7%
ASCII
ValueCountFrequency (%)
5414
48.7%
1 1207
 
10.9%
2 1137
 
10.2%
3 600
 
5.4%
* 389
 
3.5%
4 288
 
2.6%
5 204
 
1.8%
- 195
 
1.8%
( 155
 
1.4%
) 137
 
1.2%
Other values (54) 1380
 
12.4%
None
ValueCountFrequency (%)
90
100.0%
Geometric Shapes
ValueCountFrequency (%)
79
100.0%
Number Forms
ValueCountFrequency (%)
2
50.0%
2
50.0%

소재지지번주소 (lot-number address of the site)
Text

MISSING

Distinct: 3199
Distinct (%): 54.9%
Missing: 1726
Missing (%): 22.8%
Memory size: 59.2 KiB

Length

Max length: 96
Median length: 88
Mean length: 20.862755
Min length: 10

Characters and Unicode

Total characters: 121609
Distinct characters: 368
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
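
The summary figures for this variable are mutually consistent if the distinct percentage and the mean length are computed over non-missing rows only. The sketch below checks that reading against the reported numbers; the denominators are an inference from the figures, not something the report states.

```python
# Figures copied from the report for 소재지지번주소.
n_rows = 7555
n_missing = 1726
n_distinct = 3199
total_characters = 121609

n_present = n_rows - n_missing  # rows that actually carry an address

missing_pct = 100 * n_missing / n_rows          # share of all rows
distinct_pct = 100 * n_distinct / n_present     # assumed: share of non-missing rows
mean_length = total_characters / n_present      # assumed: average over non-missing rows

print(f"missing: {missing_pct:.1f}%")
print(f"distinct: {distinct_pct:.1f}%")
print(f"mean length: {mean_length:.6f}")
```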

Unique

Unique: 2517
Unique (%): 43.2%

Sample

1st row: 경기도 여주시 가남읍 본두리 36-104
2nd row: 경기도 연천군 신서면 답곡리 1176
3rd row: 경기도 안성시 일죽면 가리 67-5
4th row: 경기도 여주시 점동면 처리 569-4
5th row: 경기도 연천군 미산면 백석리 283-2, 283-3
ValueCountFrequency (%)
경기도 5829
 
20.1%
화성시 1824
 
6.3%
안성시 1229
 
4.2%
여주시 715
 
2.5%
연천군 686
 
2.4%
일죽면 408
 
1.4%
장안면 381
 
1.3%
파주시 380
 
1.3%
팔탄면 238
 
0.8%
안산시 225
 
0.8%
Other values (3974) 17018
58.8%

Most occurring characters

ValueCountFrequency (%)
23111
19.0%
6072
 
5.0%
5904
 
4.9%
5838
 
4.8%
5318
 
4.4%
4964
 
4.1%
1 4024
 
3.3%
3898
 
3.2%
- 3477
 
2.9%
3315
 
2.7%
Other values (358) 55688
45.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 72761
59.8%
Space Separator 23111
 
19.0%
Decimal Number 19713
 
16.2%
Dash Punctuation 3477
 
2.9%
Other Punctuation 1260
 
1.0%
Open Punctuation 586
 
0.5%
Close Punctuation 585
 
0.5%
Uppercase Letter 92
 
0.1%
Lowercase Letter 18
 
< 0.1%
Letter Number 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6072
 
8.3%
5904
 
8.1%
5838
 
8.0%
5318
 
7.3%
4964
 
6.8%
3898
 
5.4%
3315
 
4.6%
2098
 
2.9%
2058
 
2.8%
1787
 
2.5%
Other values (318) 31509
43.3%
Uppercase Letter
ValueCountFrequency (%)
A 35
38.0%
B 22
23.9%
C 15
16.3%
F 4
 
4.3%
E 4
 
4.3%
D 3
 
3.3%
P 2
 
2.2%
K 2
 
2.2%
S 2
 
2.2%
G 1
 
1.1%
Other values (2) 2
 
2.2%
Decimal Number
ValueCountFrequency (%)
1 4024
20.4%
2 2644
13.4%
3 2224
11.3%
5 1888
9.6%
4 1783
9.0%
6 1775
9.0%
7 1535
 
7.8%
8 1327
 
6.7%
0 1294
 
6.6%
9 1219
 
6.2%
Lowercase Letter
ValueCountFrequency (%)
m 4
22.2%
i 2
11.1%
u 2
11.1%
l 2
11.1%
t 2
11.1%
a 2
11.1%
c 2
11.1%
e 2
11.1%
Other Punctuation
ValueCountFrequency (%)
, 1256
99.7%
. 4
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 415
70.8%
[ 171
29.2%
Close Punctuation
ValueCountFrequency (%)
) 415
70.9%
] 170
29.1%
Space Separator
ValueCountFrequency (%)
23111
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3477
100.0%
Letter Number
ValueCountFrequency (%)
5
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 72761
59.8%
Common 48733
40.1%
Latin 115
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6072
 
8.3%
5904
 
8.1%
5838
 
8.0%
5318
 
7.3%
4964
 
6.8%
3898
 
5.4%
3315
 
4.6%
2098
 
2.9%
2058
 
2.8%
1787
 
2.5%
Other values (318) 31509
43.3%
Latin
ValueCountFrequency (%)
A 35
30.4%
B 22
19.1%
C 15
13.0%
5
 
4.3%
F 4
 
3.5%
E 4
 
3.5%
m 4
 
3.5%
D 3
 
2.6%
P 2
 
1.7%
i 2
 
1.7%
Other values (11) 19
16.5%
Common
ValueCountFrequency (%)
23111
47.4%
1 4024
 
8.3%
- 3477
 
7.1%
2 2644
 
5.4%
3 2224
 
4.6%
5 1888
 
3.9%
4 1783
 
3.7%
6 1775
 
3.6%
7 1535
 
3.1%
8 1327
 
2.7%
Other values (9) 4945
 
10.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 72761
59.8%
ASCII 48843
40.2%
Number Forms 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23111
47.3%
1 4024
 
8.2%
- 3477
 
7.1%
2 2644
 
5.4%
3 2224
 
4.6%
5 1888
 
3.9%
4 1783
 
3.7%
6 1775
 
3.6%
7 1535
 
3.1%
8 1327
 
2.7%
Other values (29) 5055
 
10.3%
Hangul
ValueCountFrequency (%)
6072
 
8.3%
5904
 
8.1%
5838
 
8.0%
5318
 
7.3%
4964
 
6.8%
3898
 
5.4%
3315
 
4.6%
2098
 
2.9%
2058
 
2.8%
1787
 
2.5%
Other values (318) 31509
43.3%
Number Forms
ValueCountFrequency (%)
5
100.0%

위도 (latitude)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 1369
Distinct (%): 66.5%
Missing: 5495
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.616188
Minimum: 37.10886
Maximum: 38.229445
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.5 KiB

Quantile statistics

Minimum: 37.10886
5-th percentile: 37.207497
Q1: 37.313808
Median: 37.398065
Q3: 38.04082
95-th percentile: 38.164766
Maximum: 38.229445
Range: 1.1205844
Interquartile range (IQR): 0.7270126

Descriptive statistics

Standard deviation: 0.35644481
Coefficient of variation (CV): 0.0094758353
Kurtosis: -1.5340085
Mean: 37.616188
Median Absolute Deviation (MAD): 0.2001154
Skewness: 0.39580733
Sum: 77489.348
Variance: 0.1270529
Monotonicity: Not monotonic
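
Several of the statistics above are derived from the others. As a sketch, using the rounded values printed in the report for 위도 (so agreement holds only up to rounding):

```python
# Values copied from the quantile and descriptive statistics for 위도.
minimum, maximum = 37.10886, 38.229445
q1, q3 = 37.313808, 38.04082
mean = 37.616188
std_dev = 0.35644481

value_range = maximum - minimum      # Range = Maximum - Minimum
iqr = q3 - q1                        # Interquartile range = Q3 - Q1
cv = std_dev / mean                  # Coefficient of variation = std / mean
variance = std_dev ** 2              # Variance = std squared

print(f"range={value_range:.6f} iqr={iqr:.6f} cv={cv:.8f} variance={variance:.6f}")
```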
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.36602272 22
 
0.3%
37.36069764 22
 
0.3%
37.36859828 21
 
0.3%
37.37020378 20
 
0.3%
37.23241913 17
 
0.2%
37.30421583 17
 
0.2%
38.09722202 12
 
0.2%
38.0110627 12
 
0.2%
37.30182901 11
 
0.1%
37.29631685 11
 
0.1%
Other values (1359) 1895
 
25.1%
(Missing) 5495
72.7%
ValueCountFrequency (%)
37.10886028 1
< 0.1%
37.1300037358 1
< 0.1%
37.1301641 1
< 0.1%
37.13135603 1
< 0.1%
37.13693436 1
< 0.1%
37.14706512 1
< 0.1%
37.14707175 1
< 0.1%
37.14727884 1
< 0.1%
37.14945125 1
< 0.1%
37.1515582 1
< 0.1%
ValueCountFrequency (%)
38.22944471 2
 
< 0.1%
38.22397553 1
 
< 0.1%
38.2235265 5
0.1%
38.22333612 3
< 0.1%
38.21952381 7
0.1%
38.2192444 1
 
< 0.1%
38.21480985 2
 
< 0.1%
38.21430774 2
 
< 0.1%
38.2142626 2
 
< 0.1%
38.21275755 1
 
< 0.1%

경도 (longitude)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 1372
Distinct (%): 66.6%
Missing: 5495
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.22151
Minimum: 126.39353
Maximum: 129.143
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.5 KiB

Quantile statistics

Minimum: 126.39353
5-th percentile: 126.74306
Q1: 126.97649
Median: 127.09509
Q3: 127.58474
95-th percentile: 127.70082
Maximum: 129.143
Range: 2.7494655
Interquartile range (IQR): 0.608248

Descriptive statistics

Standard deviation: 0.33339234
Coefficient of variation (CV): 0.0026205659
Kurtosis: -0.86641138
Mean: 127.22151
Median Absolute Deviation (MAD): 0.25091715
Skewness: 0.22292501
Sum: 262076.31
Variance: 0.11115045
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
127.6838115 22
 
0.3%
127.6254943 22
 
0.3%
127.6269978 21
 
0.3%
127.6429986 20
 
0.3%
127.5401471 17
 
0.2%
127.703228 17
 
0.2%
127.033518 12
 
0.2%
126.9392213 12
 
0.2%
126.8053324 11
 
0.1%
126.8910319 11
 
0.1%
Other values (1362) 1895
 
25.1%
(Missing) 5495
72.7%
ValueCountFrequency (%)
126.3935345 1
< 0.1%
126.5450416 1
< 0.1%
126.5641192 1
< 0.1%
126.5653166 1
< 0.1%
126.5662561 1
< 0.1%
126.5705782 1
< 0.1%
126.5731623 1
< 0.1%
126.5745515 1
< 0.1%
126.5753536 1
< 0.1%
126.5754495 1
< 0.1%
ValueCountFrequency (%)
129.1429999679 1
< 0.1%
127.7458615 1
< 0.1%
127.7455014 1
< 0.1%
127.7361128 1
< 0.1%
127.7353643 1
< 0.1%
127.7278636 1
< 0.1%
127.7278055 1
< 0.1%
127.7277429 1
< 0.1%
127.7276767 1
< 0.1%
127.7276089 1
< 0.1%

설치상세위치구분명 (detailed installation location category)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.6485771
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: 기타
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 6205 | 82.1%
옥상 | 718 | 9.5%
기타 | 591 | 7.8%
옥외 | 17 | 0.2%
주차장 | 13 | 0.2%
옥상+기타 | 5 | 0.1%
건물일체형 | 3 | < 0.1%
주차장+옥 | 2 | < 0.1%
보일러실 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

가동상태구분명 (operating status category)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9939113
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상가동
2nd row: 정상가동
3rd row: 가동중단
4th row: 정상가동
5th row: 정상가동

Common Values

Value | Count | Frequency (%)
정상가동 | 6934 | 91.8%
가동중단 | 598 | 7.9%
폐기 | 23 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

설비용량(KW) (installed capacity, kW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 2182
Distinct (%): 28.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134.3581
Minimum: 0
Maximum: 2999
Zeros: 5
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 66.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 16.623
Q1: 45.025
Median: 98.18
Q3: 99.68
95-th percentile: 472.106
Maximum: 2999
Range: 2999
Interquartile range (IQR): 54.655

Descriptive statistics

Standard deviation: 265.25355
Coefficient of variation (CV): 1.9742282
Kurtosis: 52.352244
Mean: 134.3581
Median Absolute Deviation (MAD): 23.3
Skewness: 6.5832447
Sum: 1015075.5
Variance: 70359.447
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
99.0 423
 
5.6%
99.9 223
 
3.0%
99.36 186
 
2.5%
99.96 170
 
2.3%
99.84 148
 
2.0%
98.28 133
 
1.8%
99.6 117
 
1.5%
99.45 106
 
1.4%
99.65 106
 
1.4%
97.92 94
 
1.2%
Other values (2172) 5849
77.4%
ValueCountFrequency (%)
0.0 5
0.1%
0.04 1
 
< 0.1%
0.3 1
 
< 0.1%
0.32 1
 
< 0.1%
0.44 1
 
< 0.1%
0.56 1
 
< 0.1%
0.7 1
 
< 0.1%
0.8 1
 
< 0.1%
0.88 2
 
< 0.1%
2.0 4
0.1%
ValueCountFrequency (%)
2999.0 2
< 0.1%
2996.0 2
< 0.1%
2995.0 1
 
< 0.1%
2994.0 4
0.1%
2993.0 2
< 0.1%
2991.0 2
< 0.1%
2990.0 1
 
< 0.1%
2981.0 1
 
< 0.1%
2970.0 1
 
< 0.1%
2667.0 2
< 0.1%

공급전압(V) (supply voltage, V)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.0472535
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 380
2nd row: 22900
3rd row: 380
4th row: 380
5th row: 22900

Common Values

Value | Count | Frequency (%)
380 | 7293 | 96.5%
22900 | 177 | 2.3%
220 | 82 | 1.1%
330 | 2 | < 0.1%
220380 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

주파수(Hz) (frequency, Hz)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 60
2nd row: 60
3rd row: 60
4th row: 60
5th row: 60

Common Values

Value | Count | Frequency (%)
60 | 7554 | > 99.9%
63 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

설치연도 (installation year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 24
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.297
Minimum: 1905
Maximum: 2099
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.5 KiB

Quantile statistics

Minimum: 1905
5-th percentile: 2014
Q1: 2018
Median: 2020
Q3: 2021
95-th percentile: 2023
Maximum: 2099
Range: 194
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.5461993
Coefficient of variation (CV): 0.0032418209
Kurtosis: 232.57071
Mean: 2019.297
Median Absolute Deviation (MAD): 2
Skewness: -13.633907
Sum: 15255789
Variance: 42.852725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
2021 1525
20.2%
2020 1139
15.1%
2022 1046
13.8%
2019 973
12.9%
2018 894
11.8%
2023 679
9.0%
2017 462
 
6.1%
2015 206
 
2.7%
2014 204
 
2.7%
2016 165
 
2.2%
Other values (14) 262
 
3.5%
ValueCountFrequency (%)
1905 3
 
< 0.1%
1910 19
 
0.3%
2005 1
 
< 0.1%
2006 3
 
< 0.1%
2007 3
 
< 0.1%
2008 10
 
0.1%
2009 15
 
0.2%
2010 13
 
0.2%
2011 18
 
0.2%
2012 70
0.9%
ValueCountFrequency (%)
2099 1
 
< 0.1%
2025 20
 
0.3%
2024 17
 
0.2%
2023 679
9.0%
2022 1046
13.8%
2021 1525
20.2%
2020 1139
15.1%
2019 973
12.9%
2018 894
11.8%
2017 462
 
6.1%
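
The extreme values above include installation years (1905, 1910, 2099) that lie far outside the 2006 to 2023 span of the permit dates reported for 허가일자, so they look like entry errors or placeholders rather than real years. A minimal sketch of a range check follows; the plausibility window of 2000 to 2030 and the function name are hypothetical choices, not something the report defines.

```python
# Year counts copied from the minimum/maximum value tables for 설치연도
# (only the extreme entries are listed here, not the full distribution).
year_counts = {1905: 3, 1910: 19, 2005: 1, 2006: 3, 2024: 17, 2025: 20, 2099: 1}

# Hypothetical plausibility window; adjust to the domain knowledge available.
MIN_YEAR, MAX_YEAR = 2000, 2030

def flag_implausible_years(counts, lowest, highest):
    """Return the subset of year -> count entries that fall outside [lowest, highest]."""
    return {year: n for year, n in counts.items() if not lowest <= year <= highest}

suspect = flag_implausible_years(year_counts, MIN_YEAR, MAX_YEAR)
suspect_rows = sum(suspect.values())
print(suspect, suspect_rows)
```

With this window, 23 rows would be flagged for review; whether they should be corrected, dropped, or kept depends on the source records.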

세부용도 (detailed use)
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.6023825
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: 태양광발전시설
3rd row: <NA>
4th row: <NA>
5th row: 발전사업용

Common Values

Value | Count | Frequency (%)
발전사업용 | 3042 | 40.3%
<NA> | 2978 | 39.4%
사업용 | 672 | 8.9%
태양광발전시설 | 665 | 8.8%
전기사업용 | 191 | 2.5%
태양광 | 6 | 0.1%
바이오가스 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

허가일자 (permit date)
Date

MISSING

Distinct: 1664
Distinct (%): 28.5%
Missing: 1724
Missing (%): 22.8%
Memory size: 59.2 KiB
Minimum: 2006-02-16 00:00:00
Maximum: 2023-11-14 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

허가기관 (permitting authority)
Categorical

HIGH CORRELATION

Distinct: 25
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 6.2946393
Min length: 3

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 경기도 안성시청
4th row: <NA>
5th row: 경기도

Common Values

Value | Count | Frequency (%)
<NA> | 3013 | 39.9%
경기도 화성시청 | 1873 | 24.8%
경기도 안성시청 | 1245 | 16.5%
경기도 용인시청 | 300 | 4.0%
경기도 남양주시청 | 213 | 2.8%
경기도 안산시 | 194 | 2.6%
경기도 시흥시청 | 191 | 2.5%
경기도 | 130 | 1.7%
경기도 수원시청 | 83 | 1.1%
경기도 오산시 | 47 | 0.6%
Other values (15) | 266 | 3.5%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
경기도 | 4448 | 37.3%
na | 3013 | 25.3%
화성시청 | 1873 | 15.7%
안성시청 | 1245 | 10.4%
용인시청 | 300 | 2.5%
남양주시청 | 213 | 1.8%
안산시 | 194 | 1.6%
시흥시청 | 191 | 1.6%
수원시청 | 83 | 0.7%
경상남도 | 51 | 0.4%
Other values (18) | 316 | 2.6%

설치면적(㎥) (installation area; the column name uses ㎥, although an area would normally be given in ㎡)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 626
Distinct (%): 78.5%
Missing: 6758
Missing (%): 89.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 975.71805
Minimum: 15.2
Maximum: 90444
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.5 KiB

Quantile statistics

Minimum: 15.2
5-th percentile: 94.696
Q1: 221.76
Median: 458
Q3: 650
95-th percentile: 2970.4
Maximum: 90444
Range: 90428.8
Interquartile range (IQR): 428.24

Descriptive statistics

Standard deviation: 3679.4512
Coefficient of variation (CV): 3.7710188
Kurtosis: 447.11264
Mean: 975.71805
Median Absolute Deviation (MAD): 218
Skewness: 19.237193
Sum: 777647.29
Variance: 13538361
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
500.0 9
 
0.1%
475.251 8
 
0.1%
490.0 6
 
0.1%
483.0 6
 
0.1%
481.0 5
 
0.1%
162.0 5
 
0.1%
481.1 5
 
0.1%
458.0 5
 
0.1%
510.0 5
 
0.1%
186.0 4
 
0.1%
Other values (616) 739
 
9.8%
(Missing) 6758
89.5%
ValueCountFrequency (%)
15.2 1
< 0.1%
21.6 2
< 0.1%
49.3 1
< 0.1%
60.0 1
< 0.1%
60.71 1
< 0.1%
62.413 1
< 0.1%
63.0 2
< 0.1%
64.272 1
< 0.1%
65.0 1
< 0.1%
75.7 1
< 0.1%
ValueCountFrequency (%)
90444.0 1
 
< 0.1%
31693.0 1
 
< 0.1%
19335.0 1
 
< 0.1%
16664.0 1
 
< 0.1%
9903.56 1
 
< 0.1%
9700.0 1
 
< 0.1%
8840.0 1
 
< 0.1%
8512.0 3
< 0.1%
8247.0 1
 
< 0.1%
8000.0 1
 
< 0.1%

데이터기준일자 (data reference date)
Categorical

HIGH CORRELATION

Distinct: 27
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 2023-11-15
2nd row: 2023-11-30
3rd row: 2023-07-06
4th row: 2023-11-15
5th row: 2023-10-31

Common Values

Value | Count | Frequency (%)
2023-06-11 | 1873 | 24.8%
2023-07-06 | 1245 | 16.5%
2023-11-15 | 723 | 9.6%
2023-11-30 | 665 | 8.8%
2024-01-18 | 604 | 8.0%
2022-12-12 | 518 | 6.9%
2023-06-26 | 317 | 4.2%
2022-11-21 | 275 | 3.6%
2022-11-17 | 230 | 3.0%
2023-01-26 | 221 | 2.9%
Other values (17) | 884 | 11.7%

Length

[Figure: histogram of lengths of the category]

Interactions

[Figures: 25 pairwise interaction plots; the images are not preserved in this text version]

Correlations

[Figure: correlation heatmap]

Variable | 위도 | 경도 | 설치상세위치구분명 | 가동상태구분명 | 설비용량(KW) | 공급전압(V) | 주파수(Hz) | 설치연도 | 세부용도 | 허가기관 | 설치면적(㎥) | 데이터기준일자
위도 | 1.000 | 0.740 | 0.728 | 0.245 | 0.127 | 0.215 | NaN | 0.474 | 0.719 | 0.913 | 0.000 | 0.896
경도 | 0.740 | 1.000 | 0.402 | 0.222 | 0.092 | 0.251 | NaN | 0.394 | 0.691 | 0.891 | 0.000 | 0.949
설치상세위치구분명 | 0.728 | 0.402 | 1.000 | 0.787 | 0.315 | 0.203 | NaN | 0.173 | 0.781 | 0.830 | 0.112 | 0.761
가동상태구분명 | 0.245 | 0.222 | 0.787 | 1.000 | 0.040 | 0.000 | 0.000 | 0.166 | 0.371 | 0.727 | 0.000 | 0.738
설비용량(KW) | 0.127 | 0.092 | 0.315 | 0.040 | 1.000 | 0.770 | 0.000 | 0.405 | 0.211 | 0.609 | 0.403 | 0.655
공급전압(V) | 0.215 | 0.251 | 0.203 | 0.000 | 0.770 | 1.000 | 0.000 | 0.226 | 0.264 | 0.789 | 0.295 | 0.670
주파수(Hz) | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | NaN | NaN | NaN | 0.000
설치연도 | 0.474 | 0.394 | 0.173 | 0.166 | 0.405 | 0.226 | 0.000 | 1.000 | 0.266 | 0.895 | 0.026 | 0.860
세부용도 | 0.719 | 0.691 | 0.781 | 0.371 | 0.211 | 0.264 | NaN | 0.266 | 1.000 | 0.955 | 0.247 | 0.974
허가기관 | 0.913 | 0.891 | 0.830 | 0.727 | 0.609 | 0.789 | NaN | 0.895 | 0.955 | 1.000 | 0.451 | 0.998
설치면적(㎥) | 0.000 | 0.000 | 0.112 | 0.000 | 0.403 | 0.295 | NaN | 0.026 | 0.247 | 0.451 | 1.000 | 0.252
데이터기준일자 | 0.896 | 0.949 | 0.761 | 0.738 | 0.655 | 0.670 | 0.000 | 0.860 | 0.974 | 0.998 | 0.252 | 1.000

[Figure: correlation heatmap]

Variable | 공급전압(V) | 설치상세위치구분명 | 가동상태구분명 | 주파수(Hz) | 세부용도 | 허가기관 | 데이터기준일자
공급전압(V) | 1.000 | 0.131 | 0.000 | 0.000 | 0.113 | 0.528 | 0.400
설치상세위치구분명 | 0.131 | 1.000 | 0.704 | 1.000 | 0.583 | 0.463 | 0.477
가동상태구분명 | 0.000 | 0.704 | 1.000 | 0.000 | 0.165 | 0.460 | 0.473
주파수(Hz) | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000
세부용도 | 0.113 | 0.583 | 0.165 | 1.000 | 1.000 | 0.864 | 0.893
허가기관 | 0.528 | 0.463 | 0.460 | 1.000 | 0.864 | 1.000 | 0.972
데이터기준일자 | 0.400 | 0.477 | 0.473 | 0.000 | 0.893 | 0.972 | 1.000

[Figure: correlation heatmap]

Variable | 위도 | 경도 | 설비용량(KW) | 설치연도 | 설치면적(㎥) | 설치상세위치구분명 | 가동상태구분명 | 공급전압(V) | 주파수(Hz) | 세부용도 | 허가기관 | 데이터기준일자
위도 | 1.000 | -0.282 | 0.124 | 0.366 | -0.034 | 0.321 | 0.150 | 0.131 | 1.000 | 0.484 | 0.718 | 0.665
경도 | -0.282 | 1.000 | -0.009 | -0.063 | -0.121 | 0.279 | 0.094 | 0.107 | 1.000 | 0.312 | 0.706 | 0.682
설비용량(KW) | 0.124 | -0.009 | 1.000 | 0.109 | 0.818 | 0.199 | 0.023 | 0.426 | 0.000 | 0.113 | 0.272 | 0.302
설치연도 | 0.366 | -0.063 | 0.109 | 1.000 | -0.031 | 0.080 | 0.149 | 0.172 | 0.000 | 0.114 | 0.636 | 0.641
설치면적(㎥) | -0.034 | -0.121 | 0.818 | -0.031 | 1.000 | 0.071 | 0.000 | 0.233 | 1.000 | 0.094 | 0.253 | 0.141
설치상세위치구분명 | 0.321 | 0.279 | 0.199 | 0.080 | 0.071 | 1.000 | 0.704 | 0.131 | 1.000 | 0.583 | 0.463 | 0.477
가동상태구분명 | 0.150 | 0.094 | 0.023 | 0.149 | 0.000 | 0.704 | 1.000 | 0.000 | 0.000 | 0.165 | 0.460 | 0.473
공급전압(V) | 0.131 | 0.107 | 0.426 | 0.172 | 0.233 | 0.131 | 0.000 | 1.000 | 0.000 | 0.113 | 0.528 | 0.400
주파수(Hz) | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000
세부용도 | 0.484 | 0.312 | 0.113 | 0.114 | 0.094 | 0.583 | 0.165 | 0.113 | 1.000 | 1.000 | 0.864 | 0.893
허가기관 | 0.718 | 0.706 | 0.272 | 0.636 | 0.253 | 0.463 | 0.460 | 0.528 | 1.000 | 0.864 | 1.000 | 0.972
데이터기준일자 | 0.665 | 0.682 | 0.302 | 0.641 | 0.141 | 0.477 | 0.473 | 0.400 | 0.000 | 0.893 | 0.972 | 1.000
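
The extracted text does not say which coefficient each matrix uses. For pairs of categorical variables, one widely used association measure is Cramér's V, and the sketch below shows the plain (uncorrected) form on toy data. It is an illustration of the idea only: the data values are hypothetical, and the profiler may apply a bias correction or a different coefficient altogether.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy data: identical labels give 1.0, independent labels give 0.0.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A nearly constant column such as 주파수(Hz), where 7554 of 7555 rows share one value, makes any such coefficient unstable, which may explain the 0.000, 1.000 and NaN entries in its rows.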

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
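
Nullity correlation can be sketched as the Pearson correlation between presence indicators (1 if a value is present, 0 if missing). The example below uses the first five rows of the sample table from this report; the function name is illustrative, and treating nullity correlation as a plain Pearson correlation of indicators is an assumption about how the heatmap is built.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns (None = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a fully present or fully missing column has no variation
    return cov / math.sqrt(var_a * var_b)

# 위도, 경도 and 허가일자 for sample rows 0-4 (None stands for <NA>).
latitude = [37.242997, 38.166532, None, 37.212598, None]
longitude = [127.593146, 127.068037, None, 127.669858, None]
permit_date = [None, "2022-11-02", "2018-10-10", None, None]

lat_lon = nullity_correlation(latitude, longitude)
lat_permit = nullity_correlation(latitude, permit_date)
print(lat_lon, lat_permit)
```

In these five rows 위도 and 경도 are always missing together, which is in line with both columns reporting exactly 5495 missing values.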

Sample

First rows

Index | 태양광발전시설명 | 소재지지번주소 | 위도 | 경도 | 설치상세위치구분명 | 가동상태구분명 | 설비용량(KW) | 공급전압(V) | 주파수(Hz) | 설치연도 | 세부용도 | 허가일자 | 허가기관 | 설치면적(㎥) | 데이터기준일자
0 | (KCC)케이씨씨 여주태양광제3발전소 | 경기도 여주시 가남읍 본두리 36-104 | 37.242997 | 127.593146 | <NA> | 정상가동 | 829.4 | 380 | 60 | 2019 | <NA> | <NA> | <NA> | <NA> | 2023-11-15
1 | (유)답곡발전 | 경기도 연천군 신서면 답곡리 1176 | 38.166532 | 127.068037 | 기타 | 정상가동 | 595.69 | 22900 | 60 | 2023 | 태양광발전시설 | 2022-11-02 | <NA> | <NA> | 2023-11-30
2 | (유)레나다 | 경기도 안성시 일죽면 가리 67-5 | <NA> | <NA> | <NA> | 가동중단 | 99.4 | 380 | 60 | 2018 | <NA> | 2018-10-10 | 경기도 안성시청 | <NA> | 2023-07-06
3 | (유)방장 태양광발전소 | 경기도 여주시 점동면 처리 569-4 | 37.212598 | 127.669858 | <NA> | 정상가동 | 999.0 | 380 | 60 | 2022 | <NA> | <NA> | <NA> | <NA> | 2023-11-15
4 | (유)백석 태양광발전소 | 경기도 연천군 미산면 백석리 283-2, 283-3 | <NA> | <NA> | <NA> | 정상가동 | 2000.0 | 22900 | 60 | 2019 | 발전사업용 | <NA> | 경기도 | <NA> | 2023-10-31
5 | (유)신흥 태양광발전소 | 경기도 안성시 일죽면 신흥리 산7-2 | <NA> | <NA> | <NA> | 정상가동 | 1498.0 | 22900 | 60 | 2018 | 발전사업용 | <NA> | 경기도 | <NA> | 2023-10-31
6 | (유)신흥2호 태양광발전소 | 경기도 안성시 일죽면 신흥리 25,26,27 | <NA> | <NA> | <NA> | 가동중단 | 498.96 | 380 | 60 | 2018 | <NA> | 2017-11-08 | 경기도 안성시청 | <NA> | 2023-07-06
7 | (유)여주케이원솔라 태양광발전소 | 경기도 여주시 점동면 처리 569-4 | 37.212598 | 127.669858 | <NA> | 정상가동 | 499.5 | 380 | 60 | 2022 | <NA> | <NA> | <NA> | <NA> | 2023-11-15
8 | (유)행죽태양광발전소 | 경기도 이천시 설성면 639 | <NA> | <NA> | <NA> | 정상가동 | 1786.0 | 22900 | 60 | 2018 | 발전사업용 | <NA> | 경기도 | <NA> | 2023-10-31
9 | (주)가산 태양광발전소 | 경기도 안양시 만안구 박달동 613-7 | 37.407671 | 126.892718 | <NA> | 정상가동 | 150.75 | 380 | 60 | 2017 | <NA> | 2017-06-26 | 경기도 안양시청 | <NA> | 2023-02-06

Last rows

Index | 태양광발전시설명 | 소재지지번주소 | 위도 | 경도 | 설치상세위치구분명 | 가동상태구분명 | 설비용량(KW) | 공급전압(V) | 주파수(Hz) | 설치연도 | 세부용도 | 허가일자 | 허가기관 | 설치면적(㎥) | 데이터기준일자
7545 | 희종 태양광발전소 | <NA> | <NA> | <NA> | <NA> | 정상가동 | 97.65 | 380 | 60 | 2017 | <NA> | <NA> | <NA> | <NA> | 2024-01-18
7546 | 희중 태양광발전소 | 경기도 안성시 삼죽면 덕산리 372-1 | <NA> | <NA> | <NA> | 정상가동 | 199.43 | 380 | 60 | 2019 | 사업용 | 2019-08-01 | 경기도 안성시청 | <NA> | 2023-07-06
7547 | 희중 태양광발전소 | <NA> | <NA> | <NA> | <NA> | 정상가동 | 30.0 | 380 | 60 | 2014 | <NA> | <NA> | <NA> | <NA> | 2024-01-18
7548 | 희천 태양광발전소 | 경기도 안성시 미양면 구수리 201-1 | <NA> | <NA> | <NA> | 정상가동 | 18.06 | 220 | 60 | 2021 | 사업용 | 2021-03-26 | 경기도 안성시청 | <NA> | 2023-07-06
7549 | 히즈시스템1호태양광발전소 | 경기도 화성시 송산면 지화 | <NA> | <NA> | <NA> | 정상가동 | 89.6 | 380 | 60 | 2021 | 발전사업용 | 2021-06-02 | 경기도 화성시청 | <NA> | 2023-06-11
7550 | 히즈시스템2호태양광발전소 | 경기도 화성시 송산면 지화 | <NA> | <NA> | <NA> | 정상가동 | 80.0 | 380 | 60 | 2021 | 발전사업용 | 2021-06-02 | 경기도 화성시청 | <NA> | 2023-06-11
7551 | 힐CREATIVE 태양광발전소 | 경기도 파주시 조리읍 능안리 1038-7 | <NA> | <NA> | <NA> | 정상가동 | 19.88 | 380 | 60 | 2020 | <NA> | <NA> | <NA> | <NA> | 2024-01-18
7552 | 힐링 태양광발전소 | <NA> | 37.151558 | 127.061273 | 옥상 | 정상가동 | 16.0 | 380 | 60 | 2020 | 발전사업용 | 2020-02-07 | 경기도 오산시 | 83.0 | 2023-11-03
7553 | 힘찬에너지 태양광발전소 | 경기도 안성시 미양면 마산리 644-48 | <NA> | <NA> | <NA> | 가동중단 | 74.7 | 380 | 60 | 2020 | <NA> | 2020-10-19 | 경기도 안성시청 | <NA> | 2023-07-06
7554 | 힘찬에셋 | <NA> | <NA> | <NA> | <NA> | 정상가동 | 76.44 | 380 | 60 | 2016 | <NA> | 2016-05-25 | <NA> | <NA> | 2022-12-12

Duplicate rows

Most frequently occurring

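Index | 태양광발전시설명 | 소재지지번주소 | 위도 | 경도 | 설치상세위치구분명 | 가동상태구분명 | 설비용량(KW) | 공급전압(V) | 주파수(Hz) | 설치연도 | 세부용도 | 허가일자 | 허가기관 | 설치면적(㎥) | 데이터기준일자 | # duplicates
0 | 성은에너지 태양광발전소 | <NA> | <NA> | <NA> | <NA> | 정상가동 | 99.0 | 380 | 60 | 2015 | 발전사업용 | <NA> | <NA> | <NA> | 2022-11-21 | 2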
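
A duplicate row is one whose values match another row in every column. The sketch below counts exact duplicates with the standard library. The rows are abbreviated to a handful of columns for readability (an assumption made for illustration; a real check would compare all 15 columns), with values taken from the tables in this report.

```python
from collections import Counter

# Abbreviated rows: (태양광발전시설명, 가동상태구분명, 설비용량(KW), 공급전압(V),
# 주파수(Hz), 설치연도, 세부용도, 데이터기준일자). None stands for <NA>.
rows = [
    ("성은에너지 태양광발전소", "정상가동", 99.0, 380, 60, 2015, "발전사업용", "2022-11-21"),
    ("성은에너지 태양광발전소", "정상가동", 99.0, 380, 60, 2015, "발전사업용", "2022-11-21"),
    ("힘찬에셋", "정상가동", 76.44, 380, 60, 2016, None, "2022-12-12"),
]

# Tuples are hashable, so identical rows collapse onto the same Counter key.
occurrences = Counter(rows)
duplicates = {row: count for row, count in occurrences.items() if count > 1}

for row, count in duplicates.items():
    print(count, row[0])
```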