Overview

Dataset statistics

Number of variables: 8
Number of observations: 1614
Missing cells: 392
Missing cells (%): 3.0%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 104.2 KiB
Average record size in memory: 66.1 B
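
The derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming that missing cells are expressed as a share of all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only:

```python
# Raw figures copied from the dataset statistics above.
n_variables = 8
n_observations = 1614
missing_cells = 392
duplicate_rows = 1
total_size_kib = 104.2

# Share of missing cells over the full variables x observations grid.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Share of rows flagged as duplicates.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average bytes per record, taking 1 KiB = 1024 B (assumption).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 3.0%
print(f"{duplicate_pct:.1f}%")      # 0.1%
print(f"{avg_record_bytes:.1f} B")  # 66.1 B
```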

Variable types

Categorical: 3
Text: 2
Numeric: 1
DateTime: 2

Dataset

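Description: Provides data on the status of solar power generation permits in Jangsu-gun (장수군), Jeonbuk State (전북특별자치도): permitting authority (허가자), plant name (발전소명), site location (사업장소), installed capacity (설치용량), supply voltage (공급전압), frequency (주파수), permit date (허가일), and business start date (사업개시일).
Author: Jangsu-gun, Jeonbuk State (전북특별자치도 장수군)
URL: https://www.data.go.kr/data/15042049/fileData.do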

Alerts

Dataset has 1 (0.1%) duplicate row [Duplicates]
설치용량(kW) is highly overall correlated with 허가자 [High correlation]
허가자 is highly overall correlated with 설치용량(kW) and 1 other field [High correlation]
공급전압(V) is highly overall correlated with 허가자 [High correlation]
공급전압(V) is highly imbalanced (81.8%) [Imbalance]
주파수(Hz) is highly imbalanced (96.9%) [Imbalance]
사업개시일 has 392 (24.3%) missing values [Missing]
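
The report does not state how the imbalance percentage is computed. One definition that reproduces both reported figures is one minus the normalized Shannon entropy of the value counts. The sketch below assumes that definition; the function name `imbalance_score` is illustrative and is not taken from the profiling tool:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts as listed in the Common Values tables of this report.
voltage_counts = [1471, 91, 21, 20, 7, 1, 1, 1, 1]  # 공급전압(V)
frequency_counts = [1604, 5, 4, 1]                  # 주파수(Hz)

print(f"{imbalance_score(voltage_counts):.1%}")    # 81.8%
print(f"{imbalance_score(frequency_counts):.1%}")  # 96.9%
```

A score near 100% means that almost all rows share a single value, which is the case for 주파수(Hz), where 1604 of 1614 rows hold 60.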

Reproduction

Analysis started: 2024-04-06 08:46:22.795898
Analysis finished: 2024-04-06 08:46:25.185471
Duration: 2.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

허가자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Value | Count
장수군 | 1391
(blank) | 223

Length

Max length: 3
Median length: 3
Mean length: 2.7236679
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: 장수군
5th row: (blank)

Common Values

Value | Count | Frequency (%)
장수군 | 1391 | 86.2%
(blank) | 223 | 13.8%
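
The reported mean length for 허가자 can be cross-checked against these counts. A small sketch, assuming the blank value is a single character (the reported minimum length is 1):

```python
# Value counts from the Common Values table, keyed by string length.
counts_by_length = {
    3: 1391,  # "장수군" is three characters long
    1: 223,   # assumption: the blank value is one character long
}

total = sum(counts_by_length.values())
mean_length = sum(length * n for length, n in counts_by_length.items()) / total

print(total)                  # 1614
print(round(mean_length, 7))  # 2.7236679
```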

Length

Histogram of lengths of the category

Common Values (Plot)

발전소명
Text

Distinct: 1549
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Length

Max length: 20
Median length: 18
Mean length: 10.197026
Min length: 4

Characters and Unicode

Total characters: 16458
Distinct characters: 389
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1494
Unique (%): 92.6%

Sample

1st row: 동화댐소수력발전소
2nd row: 동화에너지
3rd row: ㈜토탈에너지
4th row: 먹골태양광발전소
5th row: (유)유성에너지태양광발전소
Value | Count | Frequency (%)
태양광발전소 | 1128 | 39.2%
발전소 | 48 | 1.7%
태양광 | 27 | 0.9%
장수 | 7 | 0.2%
한마음 | 7 | 0.2%
2호 | 5 | 0.2%
하월 | 5 | 0.2%
행복 | 5 | 0.2%
1호 | 4 | 0.1%
봉황대 | 4 | 0.1%
Other values (1543) | 1636 | 56.9%

Most occurring characters

ValueCountFrequency (%)
1584
 
9.6%
1580
 
9.6%
1564
 
9.5%
1563
 
9.5%
1558
 
9.5%
1546
 
9.4%
1389
 
8.4%
691
 
4.2%
1 273
 
1.7%
2 245
 
1.5%
Other values (379) 4465
27.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 14095 | 85.6%
Space Separator | 1389 | 8.4%
Decimal Number | 852 | 5.2%
Uppercase Letter | 62 | 0.4%
Lowercase Letter | 29 | 0.2%
Open Punctuation | 13 | 0.1%
Close Punctuation | 13 | 0.1%
Other Symbol | 3 | < 0.1%
Dash Punctuation | 1 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1584
11.2%
1580
11.2%
1564
11.1%
1563
11.1%
1558
11.1%
1546
11.0%
691
 
4.9%
192
 
1.4%
183
 
1.3%
88
 
0.6%
Other values (338) 3546
25.2%
Uppercase Letter
ValueCountFrequency (%)
S 19
30.6%
J 9
14.5%
K 9
14.5%
H 5
 
8.1%
D 3
 
4.8%
P 3
 
4.8%
G 3
 
4.8%
Y 2
 
3.2%
A 2
 
3.2%
B 2
 
3.2%
Other values (5) 5
 
8.1%
Decimal Number
ValueCountFrequency (%)
1 273
32.0%
2 245
28.8%
3 119
14.0%
5 56
 
6.6%
4 56
 
6.6%
6 31
 
3.6%
7 26
 
3.1%
8 19
 
2.2%
0 15
 
1.8%
9 12
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
e 6
20.7%
t 4
13.8%
c 3
10.3%
a 3
10.3%
h 3
10.3%
s 3
10.3%
n 2
 
6.9%
p 2
 
6.9%
y 2
 
6.9%
u 1
 
3.4%
Space Separator
ValueCountFrequency (%)
1389
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 14098 | 85.7%
Common | 2269 | 13.8%
Latin | 91 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1584
11.2%
1580
11.2%
1564
11.1%
1563
11.1%
1558
11.1%
1546
11.0%
691
 
4.9%
192
 
1.4%
183
 
1.3%
88
 
0.6%
Other values (339) 3549
25.2%
Latin
ValueCountFrequency (%)
S 19
20.9%
J 9
 
9.9%
K 9
 
9.9%
e 6
 
6.6%
H 5
 
5.5%
t 4
 
4.4%
c 3
 
3.3%
a 3
 
3.3%
h 3
 
3.3%
s 3
 
3.3%
Other values (15) 27
29.7%
Common
ValueCountFrequency (%)
1389
61.2%
1 273
 
12.0%
2 245
 
10.8%
3 119
 
5.2%
5 56
 
2.5%
4 56
 
2.5%
6 31
 
1.4%
7 26
 
1.1%
8 19
 
0.8%
0 15
 
0.7%
Other values (5) 40
 
1.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 14095 | 85.6%
ASCII | 2360 | 14.3%
None | 3 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1584
11.2%
1580
11.2%
1564
11.1%
1563
11.1%
1558
11.1%
1546
11.0%
691
 
4.9%
192
 
1.4%
183
 
1.3%
88
 
0.6%
Other values (338) 3546
25.2%
ASCII
ValueCountFrequency (%)
1389
58.9%
1 273
 
11.6%
2 245
 
10.4%
3 119
 
5.0%
5 56
 
2.4%
4 56
 
2.4%
6 31
 
1.3%
7 26
 
1.1%
8 19
 
0.8%
S 19
 
0.8%
Other values (30) 127
 
5.4%
None
ValueCountFrequency (%)
3
100.0%
사업장소(발전소 위치)
Text

Distinct: 1476
Distinct (%): 91.4%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Length

Max length: 168
Median length: 90
Mean length: 32.180297
Min length: 21

Characters and Unicode

Total characters: 51939
Distinct characters: 133
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1396
Unique (%): 86.5%

Sample

1st row: 전북특별자치도 장수군 번암면 죽림리 일원
2nd row: 전북특별자치도 장수군 산서면 오산리 산50
3rd row: 전북특별자치도 장수군 산서면 오산리 874 외 3건
4th row: 전북특별자치도 장수군 산서면 동화리 603-8,9,10,11,27번지
5th row: 전북특별자치도 장수군 장수읍 대성리728-6번지
Value | Count | Frequency (%)
전북특별자치도 | 1616 | 16.5%
장수군 | 1616 | 16.5%
장수읍 | 406 | 4.1%
계북면 | 303 | 3.1%
건물상부 | 277 | 2.8%
산서면 | 269 | 2.7%
천천면 | 253 | 2.6%
장계면 | 177 | 1.8%
원촌리 | 153 | 1.6%
계남면 | 141 | 1.4%
Other values (2316) | 4581 | 46.8%

Most occurring characters

ValueCountFrequency (%)
8217
 
15.8%
2272
 
4.4%
1 2235
 
4.3%
2054
 
4.0%
- 2030
 
3.9%
1922
 
3.7%
1661
 
3.2%
1617
 
3.1%
1616
 
3.1%
1616
 
3.1%
Other values (123) 26699
51.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 28199 | 54.3%
Decimal Number | 11456 | 22.1%
Space Separator | 8217 | 15.8%
Dash Punctuation | 2030 | 3.9%
Other Punctuation | 1305 | 2.5%
Close Punctuation | 365 | 0.7%
Open Punctuation | 365 | 0.7%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2272
 
8.1%
2054
 
7.3%
1922
 
6.8%
1661
 
5.9%
1617
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
Other values (106) 10593
37.6%
Decimal Number
ValueCountFrequency (%)
1 2235
19.5%
2 1419
12.4%
3 1156
10.1%
5 1126
9.8%
4 1038
9.1%
7 992
8.7%
0 900
7.9%
9 888
 
7.8%
6 884
 
7.7%
8 818
 
7.1%
Other Punctuation
ValueCountFrequency (%)
, 1293
99.1%
. 12
 
0.9%
Space Separator
ValueCountFrequency (%)
8217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2030
100.0%
Close Punctuation
ValueCountFrequency (%)
) 365
100.0%
Open Punctuation
ValueCountFrequency (%)
( 365
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 28199 | 54.3%
Common | 23740 | 45.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2272
 
8.1%
2054
 
7.3%
1922
 
6.8%
1661
 
5.9%
1617
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
Other values (106) 10593
37.6%
Common
ValueCountFrequency (%)
8217
34.6%
1 2235
 
9.4%
- 2030
 
8.6%
2 1419
 
6.0%
, 1293
 
5.4%
3 1156
 
4.9%
5 1126
 
4.7%
4 1038
 
4.4%
7 992
 
4.2%
0 900
 
3.8%
Other values (7) 3334
14.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 28199 | 54.3%
ASCII | 23740 | 45.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8217
34.6%
1 2235
 
9.4%
- 2030
 
8.6%
2 1419
 
6.0%
, 1293
 
5.4%
3 1156
 
4.9%
5 1126
 
4.7%
4 1038
 
4.4%
7 992
 
4.2%
0 900
 
3.8%
Other values (7) 3334
14.0%
Hangul
ValueCountFrequency (%)
2272
 
8.1%
2054
 
7.3%
1922
 
6.8%
1661
 
5.9%
1617
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
1616
 
5.7%
Other values (106) 10593
37.6%

설치용량(kW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 364
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 169.24048
Minimum: 9.45
Maximum: 2993.76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 9.45
5th percentile: 29.75
Q1: 98.56
Median: 99.36
Q3: 99.84
95th percentile: 764.7
Maximum: 2993.76
Range: 2984.31
Interquartile range (IQR): 1.28

Descriptive statistics

Standard deviation: 237.40312
Coefficient of variation (CV): 1.4027561
Kurtosis: 31.207567
Mean: 169.24048
Median Absolute Deviation (MAD): 0.54
Skewness: 4.4866122
Sum: 273154.14
Variance: 56360.241
Monotonicity: Not monotonic
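
Several of the statistics reported for 설치용량(kW) are simple functions of the others. The sketch below recomputes them from the rounded figures in this report, so small differences in the last digits are expected; the variable names are illustrative:

```python
# Figures copied from the statistics reported for 설치용량(kW).
n = 1614
total_kw = 273154.14
std = 237.40312
q1, q3 = 98.56, 99.84
minimum, maximum = 9.45, 2993.76

mean = total_kw / n              # Mean = Sum / number of observations
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance = standard deviation squared
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(round(mean, 5))         # 169.24048
print(round(variance, 3))     # 56360.241
print(round(iqr, 2))          # 1.28
print(round(value_range, 2))  # 2984.31
print(round(cv, 5))
```

The interquartile range (1.28) is very small relative to the standard deviation (237.4), which is consistent with the large skewness and kurtosis reported above: most plants sit just under 100 kW, with a long tail of much larger installations.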
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
99.0 | 195 | 12.1%
99.6 | 78 | 4.8%
99.4 | 57 | 3.5%
99.9 | 56 | 3.5%
99.75 | 51 | 3.2%
99.96 | 50 | 3.1%
99.28 | 48 | 3.0%
99.45 | 47 | 2.9%
99.36 | 46 | 2.9%
98.28 | 42 | 2.6%
Other values (354) | 944 | 58.5%

Smallest 10 values

Value | Count | Frequency (%)
9.45 | 1 | 0.1%
9.9 | 2 | 0.1%
10.2 | 1 | 0.1%
12.04 | 1 | 0.1%
12.1 | 1 | 0.1%
12.98 | 1 | 0.1%
14.25 | 1 | 0.1%
14.28 | 1 | 0.1%
14.4 | 1 | 0.1%
14.5 | 2 | 0.1%

Largest 10 values

Value | Count | Frequency (%)
2993.76 | 1 | 0.1%
2984.85 | 1 | 0.1%
1992.0 | 1 | 0.1%
1192.78 | 1 | 0.1%
1000.0 | 8 | 0.5%
999.8 | 2 | 0.1%
999.77 | 1 | 0.1%
999.6 | 2 | 0.1%
999.53 | 1 | 0.1%
999.45 | 6 | 0.4%

공급전압(V)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Value | Count
380 | 1471
22900 | 91
220 | 21
220/380 | 20
<NA> | 7
Other values (4) | 4

Length

Max length: 7
Median length: 3
Mean length: 3.1716233
Min length: 3

Unique

Unique: 4
Unique (%): 0.2%

Sample

1st row: 3300
2nd row: <NA>
3rd row: <NA>
4th row: 380
5th row: <NA>

Common Values

Value | Count | Frequency (%)
380 | 1471 | 91.1%
22900 | 91 | 5.6%
220 | 21 | 1.3%
220/380 | 20 | 1.2%
<NA> | 7 | 0.4%
3300 | 1 | 0.1%
22,900 | 1 | 0.1%
220/280 | 1 | 0.1%
280 | 1 | 0.1%
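
The table mixes formatting variants with what appear to be placeholder entries: `<NA>` is counted as a category even though the variable reports 0 missing values. A hedged sketch of how these counts could be normalized before further analysis. It assumes that `<NA>` is a stringified missing value and that `22,900` is a formatting variant of `22900`; neither assumption is confirmed by the report, and `normalize_voltage` is an illustrative helper:

```python
from collections import Counter

# Value counts as listed in the Common Values table above.
raw_counts = Counter({
    "380": 1471, "22900": 91, "220": 21, "220/380": 20, "<NA>": 7,
    "3300": 1, "22,900": 1, "220/280": 1, "280": 1,
})

def normalize_voltage(value):
    """Return a cleaned label, or None for a missing-value placeholder."""
    if value == "<NA>":            # assumption: stringified missing value
        return None
    return value.replace(",", "")  # assumption: "22,900" means "22900"

clean_counts = Counter()
for value, n in raw_counts.items():
    clean_counts[normalize_voltage(value)] += n

print(clean_counts["22900"])  # 92
print(clean_counts[None])     # 7
print(len(clean_counts))      # 8
```

The entries `220/280` and `280` are left untouched because the report gives no basis for correcting them.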

Length

Histogram of lengths of the category

Common Values (Plot)


주파수(Hz)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Value | Count
60 | 1604
600 | 5
380 | 4
601 | 1

Length

Max length: 3
Median length: 2
Mean length: 2.0061958
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 60
2nd row: 60
3rd row: 60
4th row: 60
5th row: 60

Common Values

Value | Count | Frequency (%)
60 | 1604 | 99.4%
600 | 5 | 0.3%
380 | 4 | 0.2%
601 | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


허가일
Date

Distinct: 479
Distinct (%): 29.7%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB
Minimum: 2002-11-25 00:00:00
Maximum: 2024-02-20 00:00:00
Histogram with fixed size bins (bins=50)

사업개시일
Date

MISSING 

Distinct: 371
Distinct (%): 30.4%
Missing: 392
Missing (%): 24.3%
Memory size: 12.7 KiB
Minimum: 1900-07-19 00:00:00
Maximum: 2024-01-16 00:00:00
Histogram with fixed size bins (bins=50)
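
The percentages for 사업개시일 can be recomputed from the counts, and the reported minimum invites a sanity check against the permit dates. A minimal sketch, assuming the distinct percentage is taken over non-missing values only (this reproduces the reported 30.4%) and that the preceding date variable, whose minimum is 2002-11-25, holds the permit dates:

```python
from datetime import date

n_observations = 1614
missing = 392
distinct = 371

missing_pct = 100 * missing / n_observations
# Assumption: the distinct percentage is computed over non-missing values.
distinct_pct = 100 * distinct / (n_observations - missing)

print(f"{missing_pct:.1f}%")   # 24.3%
print(f"{distinct_pct:.1f}%")  # 30.4%

earliest_start = date(1900, 7, 19)    # Minimum reported for 사업개시일
earliest_permit = date(2002, 11, 25)  # Minimum reported for the permit dates
gap_years = (earliest_permit - earliest_start).days / 365.25

print(earliest_start < earliest_permit)  # True
print(round(gap_years))                  # 102
```

A business start date roughly a century before the earliest permit is more likely a data-entry or parsing artifact than a real event, so it is worth inspecting before the dates are used.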

Interactions


Correlations

Variable | 허가자 | 설치용량(kW) | 공급전압(V) | 주파수(Hz)
허가자 | 1.000 | 0.907 | 0.738 | 0.000
설치용량(kW) | 0.907 | 1.000 | 0.656 | 0.000
공급전압(V) | 0.738 | 0.656 | 1.000 | 0.000
주파수(Hz) | 0.000 | 0.000 | 0.000 | 1.000

Variable | 허가자 | 주파수(Hz) | 공급전압(V)
허가자 | 1.000 | 0.000 | 0.565
주파수(Hz) | 0.000 | 1.000 | 0.000
공급전압(V) | 0.565 | 0.000 | 1.000

Variable | 설치용량(kW) | 허가자 | 공급전압(V) | 주파수(Hz)
설치용량(kW) | 1.000 | 0.727 | 0.439 | 0.000
허가자 | 0.727 | 1.000 | 0.565 | 0.000
공급전압(V) | 0.439 | 0.565 | 1.000 | 0.000
주파수(Hz) | 0.000 | 0.000 | 0.000 | 1.000
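
The report does not name the coefficient behind each matrix. For pairs of categorical variables, one commonly used association measure is Cramér's V. The sketch below shows the plain, uncorrected form on hypothetical toy data; it is not claimed to be the method used here and does not reproduce the values above:

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy data, not taken from the dataset.
authority = ["A", "A", "B", "B"]
voltage_same = ["380", "380", "22900", "22900"]   # determined by authority
voltage_mixed = ["380", "22900", "380", "22900"]  # independent of authority

print(cramers_v(authority, voltage_same))   # 1.0
print(cramers_v(authority, voltage_mixed))  # 0.0
```

A value of 1 means that one variable fully determines the other, and 0 means no association.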

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Row | 허가자 | 발전소명 | 사업장소(발전소 위치) | 설치용량(kW) | 공급전압(V) | 주파수(Hz) | 허가일 | 사업개시일
0 | (blank) | 동화댐소수력발전소 | 전북특별자치도 장수군 번암면 죽림리 일원 | 1000.0 | 3300 | 60 | 2002-11-25 | 2000-05-01
1 | (blank) | 동화에너지 | 전북특별자치도 장수군 산서면 오산리 산50 | 1000.0 | <NA> | 60 | 2006-09-19 | 2007-12-31
2 | (blank) | ㈜토탈에너지 | 전북특별자치도 장수군 산서면 오산리 874 외 3건 | 1192.78 | <NA> | 60 | 2007-07-04 | 2008-05-21
3 | 장수군 | 먹골태양광발전소 | 전북특별자치도 장수군 산서면 동화리 603-8,9,10,11,27번지 | 99.82 | 380 | 60 | 2009-03-05 | <NA>
4 | (blank) | (유)유성에너지태양광발전소 | 전북특별자치도 장수군 장수읍 대성리728-6번지 | 977.04 | <NA> | 60 | 2009-03-06 | 2009-11-12
5 | (blank) | (유)유성에너지수산태양광발전소 | 전북특별자치도 장수군 장수읍 식천리 산27-6번지 | 486.6 | <NA> | 60 | 2009-05-15 | 2009-11-12
6 | 장수군 | 수태양광발전소 | 전북특별자치도 장수군 장수읍 송천리 1090-50 | 99.0 | <NA> | 60 | 2009-09-07 | 2010-01-07
7 | 장수군 | 먹골2차태양광발전소 | 전북특별자치도 장수군 산서면 동화리 603-14,15 | 48.18 | 380 | 60 | 2010-01-08 | 2010-09-02
8 | (blank) | 한국농어촌공사용림소수력발전소 | 전북특별자치도 장수군 장수읍 덕산리 720-1,2,3,721-2,3 | 600.0 | <NA> | 60 | 2010-08-06 | 2011-06-20
9 | 장수군 | 동부태양광발전소 | 전북특별자치도 장수군 장수읍 노곡리 산123-34, 909번지 | 99.82 | <NA> | 60 | 2011-03-16 | 2011-04-14
허가자발전소명사업장소(발전소 위치)설치용량(kW)공급전압(V)주파수(Hz)허가일사업개시일
1604장수군열매태양광(2)전북특별자치도 장수군 천천면 장판리 23-1699.6380602024-01-23<NA>
1605장수군하성오 태양광발전소전북특별자치도 장수군 천천면 봉덕리 601-3번지 주1 (건물상부)277.2380602024-01-23<NA>
1606장수군전진 태양광발전소전북특별자치도 장수군 천천면 남양리 1018-199.0380602024-02-19<NA>
1607장수군해오름1호발전소전북특별자치도 장수군 장수읍 용계리 579-16(건물상부)99.2380602024-02-19<NA>
1608장수군해오름2호발전소전북특별자치도 장수군 장수읍 용계리 579-16(건물상부)99.2380602024-02-19<NA>
1609장수군해오름3호발전소전북특별자치도 장수군 장수읍 용계리 579-16(건물상부)49.6380602024-02-19<NA>
1610장수군재홍 태양광발전소전북특별자치도 장수군 장수읍 두산리 713-27 주1, 주2(건물상부)64.96380602024-02-19<NA>
1611장수군재홍1 태양광발전소전북특별자치도 장수군 장수읍 두산리 713-44(건물상부)29.58380602024-02-19<NA>
1612장수군하윤수2호 태양광발전소전북특별자치도 장수군 천천면 봉덕리 601-3, 601-4199.2380602024-02-20<NA>
1613장수군하윤수3호 태양광발전소전북특별자치도 장수군 천천면 봉덕리 61199.6380602024-02-20<NA>

Duplicate rows

Most frequently occurring

허가자발전소명사업장소(발전소 위치)설치용량(kW)공급전압(V)주파수(Hz)허가일사업개시일# duplicates
0장수군영구태양광발전소전북특별자치도 장수군 장수읍 송천리 1090-33, 1090-40, 1090-4199.0380602013-12-232017-04-212