Overview

Dataset statistics

Number of variables: 8
Number of observations: 2148
Missing cells: 577
Missing cells (%): 3.4%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 136.5 KiB
Average record size in memory: 65.1 B
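
The percentages and the average record size follow directly from the counts. A minimal sanity check in plain Python, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics and Alerts sections.
n_variables = 8
n_observations = 2148
missing_cells = 577
total_size_kib = 136.5

total_cells = n_variables * n_observations            # 17184 cells
missing_cells_pct = 100 * missing_cells / total_cells

# All 577 missing cells are in one column (사업개시일), so the per-column
# rate is taken over the number of observations instead.
missing_column_pct = 100 * missing_cells / n_observations

avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_cells_pct:.1f}%")    # 3.4%
print(f"{missing_column_pct:.1f}%")   # 26.9%
print(f"{avg_record_bytes:.1f} B")    # 65.1 B
```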

Variable types

Categorical: 2
Text: 2
DateTime: 3
Numeric: 1

Dataset

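Description: Data on solar power plants located in Jeju Special Self-Governing Province. It provides the administrative city (행정시), eup/myeon/dong district (읍면동), permit date (허가일자), business name (상호), installed capacity in kW (설비용량(KW)), status (상태), and business start date (사업개시일).
URL: https://www.data.go.kr/data/3082724/fileData.do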

Alerts

Constant: 데이터기준일자 has the constant value "2022-06-30"
Duplicates: Dataset has 1 (< 0.1%) duplicate row
Missing: 사업개시일 has 577 (26.9%) missing values
Skewed: 설비용량(KW) is highly skewed (γ1 = 23.96636617)

Reproduction

Analysis started: 2023-12-12 19:46:32.184664
Analysis finished: 2023-12-12 19:46:33.167818
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
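
A report of this kind can be regenerated with ydata-profiling. The sketch below is only one plausible way to do it, not the configuration actually used here (that is in config.json). The function name, the report title, the `cp949` default encoding, and the choice to parse the three date columns are all assumptions.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch under assumptions)."""
    # Imported lazily so the function can be defined even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: these three columns hold dates, as this report suggests.
    # The encoding default is a guess for a Korean public-data CSV.
    date_columns = ["허가일자", "사업개시일", "데이터기준일자"]
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=date_columns)

    profile = ProfileReport(df, title="Jeju power plant permits")
    profile.to_file(out_path)
    return profile
```

Calling `build_report("plants.csv")` with a hypothetical local copy of the file would write `report.html` next to the script.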

Variables

행정시
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

제주시: 1238
서귀포시: 910

Length

Max length: 4
Median length: 3
Mean length: 3.4236499
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제주시
2nd row: 제주시
3rd row: 제주시
4th row: 제주시
5th row: 제주시

Common Values

Value | Count | Frequency (%)
제주시 | 1238 | 57.6%
서귀포시 | 910 | 42.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

읍면동
Text

Distinct: 57
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0200186
Min length: 2

Characters and Unicode

Total characters: 6487
Distinct characters: 68
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.6%

Sample

1st row: 구좌읍
2nd row: 회천동
3rd row: 한경면
4th row: 한경면
5th row: 구좌읍

Value | Count | Frequency (%)
한림읍 | 346 | 16.1%
한경면 | 260 | 12.1%
대정읍 | 218 | 10.1%
구좌읍 | 213 | 9.9%
표선면 | 183 | 8.5%
성산읍 | 177 | 8.2%
남원읍 | 172 | 8.0%
애월읍 | 155 | 7.2%
조천읍 | 120 | 5.6%
안덕면 | 81 | 3.8%
Other values (47) | 223 | 10.4%

Most occurring characters

Value | Count | Frequency (%)
[?] | 1401 | 21.6%
[?] | 606 | 9.3%
[?] | 525 | 8.1%
[?] | 346 | 5.3%
[?] | 260 | 4.0%
[?] | 229 | 3.5%
[?] | 225 | 3.5%
[?] | 219 | 3.4%
[?] | 213 | 3.3%
[?] | 213 | 3.3%
Other values (58) | 2250 | 34.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6487 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 1401 | 21.6%
[?] | 606 | 9.3%
[?] | 525 | 8.1%
[?] | 346 | 5.3%
[?] | 260 | 4.0%
[?] | 229 | 3.5%
[?] | 225 | 3.5%
[?] | 219 | 3.4%
[?] | 213 | 3.3%
[?] | 213 | 3.3%
Other values (58) | 2250 | 34.7%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6487 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 1401 | 21.6%
[?] | 606 | 9.3%
[?] | 525 | 8.1%
[?] | 346 | 5.3%
[?] | 260 | 4.0%
[?] | 229 | 3.5%
[?] | 225 | 3.5%
[?] | 219 | 3.4%
[?] | 213 | 3.3%
[?] | 213 | 3.3%
Other values (58) | 2250 | 34.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6487 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 1401 | 21.6%
[?] | 606 | 9.3%
[?] | 525 | 8.1%
[?] | 346 | 5.3%
[?] | 260 | 4.0%
[?] | 229 | 3.5%
[?] | 225 | 3.5%
[?] | 219 | 3.4%
[?] | 213 | 3.3%
[?] | 213 | 3.3%
Other values (58) | 2250 | 34.7%

허가일자
Date

Distinct: 516
Distinct (%): 24.0%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB
Minimum: 1998-06-10 00:00:00
Maximum: 2022-06-21 00:00:00

[Figure: histogram with fixed size bins (bins=50)]

상호
Text

Distinct: 2064
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 27
Median length: 24
Mean length: 10.272346
Min length: 2

Characters and Unicode

Total characters: 22065
Distinct characters: 467
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1991
Unique (%): 92.7%

Sample

1st row: 행원풍력발전단지(1차)
2nd row: 파낙스에너지(주)
3rd row: 한경풍력 1단계
4th row: 신창풍력발전단지
5th row: 제주월정풍력발전소

Value | Count | Frequency (%)
태양광발전소 | 893 | 26.8%
발전소 | 45 | 1.4%
주식회사 | 34 | 1.0%
2호 | 24 | 0.7%
1호 | 20 | 0.6%
태양광 | 9 | 0.3%
3호 | 8 | 0.2%
대명솔라 | 7 | 0.2%
한태연 | 6 | 0.2%
행원2018 | 6 | 0.2%
Other values (2089) | 2281 | 68.4%

Most occurring characters

Value | Count | Frequency (%)
[?] | 1957 | 8.9%
[?] | 1940 | 8.8%
[?] | 1927 | 8.7%
[?] | 1881 | 8.5%
[?] | 1856 | 8.4%
[?] | 1835 | 8.3%
(space) | 1185 | 5.4%
[?] | 703 | 3.2%
[?] | 385 | 1.7%
1 | 351 | 1.6%
Other values (457) | 8045 | 36.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 19335 | 87.6%
Space Separator | 1185 | 5.4%
Decimal Number | 1084 | 4.9%
Close Punctuation | 156 | 0.7%
Open Punctuation | 154 | 0.7%
Uppercase Letter | 98 | 0.4%
Lowercase Letter | 27 | 0.1%
Dash Punctuation | 21 | 0.1%
Other Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 1957 | 10.1%
[?] | 1940 | 10.0%
[?] | 1927 | 10.0%
[?] | 1881 | 9.7%
[?] | 1856 | 9.6%
[?] | 1835 | 9.5%
[?] | 703 | 3.6%
[?] | 385 | 2.0%
[?] | 258 | 1.3%
[?] | 255 | 1.3%
Other values (404) | 6338 | 32.8%

Uppercase Letter
Value | Count | Frequency (%)
J | 14 | 14.3%
K | 13 | 13.3%
S | 13 | 13.3%
L | 7 | 7.1%
C | 6 | 6.1%
G | 5 | 5.1%
H | 5 | 5.1%
M | 5 | 5.1%
E | 5 | 5.1%
V | 5 | 5.1%
Other values (10) | 20 | 20.4%

Lowercase Letter
Value | Count | Frequency (%)
a | 3 | 11.1%
r | 3 | 11.1%
e | 3 | 11.1%
k | 3 | 11.1%
o | 2 | 7.4%
m | 2 | 7.4%
d | 2 | 7.4%
c | 2 | 7.4%
s | 2 | 7.4%
l | 1 | 3.7%
Other values (4) | 4 | 14.8%

Decimal Number
Value | Count | Frequency (%)
1 | 351 | 32.4%
2 | 324 | 29.9%
3 | 128 | 11.8%
4 | 67 | 6.2%
5 | 54 | 5.0%
8 | 38 | 3.5%
6 | 38 | 3.5%
0 | 34 | 3.1%
9 | 26 | 2.4%
7 | 24 | 2.2%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 20.0%
' | 1 | 20.0%
. | 1 | 20.0%
: | 1 | 20.0%
[?] | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1185 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 156 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 154 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 19335 | 87.6%
Common | 2605 | 11.8%
Latin | 125 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 1957 | 10.1%
[?] | 1940 | 10.0%
[?] | 1927 | 10.0%
[?] | 1881 | 9.7%
[?] | 1856 | 9.6%
[?] | 1835 | 9.5%
[?] | 703 | 3.6%
[?] | 385 | 2.0%
[?] | 258 | 1.3%
[?] | 255 | 1.3%
Other values (404) | 6338 | 32.8%

Latin
Value | Count | Frequency (%)
J | 14 | 11.2%
K | 13 | 10.4%
S | 13 | 10.4%
L | 7 | 5.6%
C | 6 | 4.8%
G | 5 | 4.0%
H | 5 | 4.0%
M | 5 | 4.0%
E | 5 | 4.0%
V | 5 | 4.0%
Other values (24) | 47 | 37.6%

Common
Value | Count | Frequency (%)
(space) | 1185 | 45.5%
1 | 351 | 13.5%
2 | 324 | 12.4%
) | 156 | 6.0%
( | 154 | 5.9%
3 | 128 | 4.9%
4 | 67 | 2.6%
5 | 54 | 2.1%
8 | 38 | 1.5%
6 | 38 | 1.5%
Other values (9) | 110 | 4.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 19335 | 87.6%
ASCII | 2729 | 12.4%
Punctuation | 1 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 1957 | 10.1%
[?] | 1940 | 10.0%
[?] | 1927 | 10.0%
[?] | 1881 | 9.7%
[?] | 1856 | 9.6%
[?] | 1835 | 9.5%
[?] | 703 | 3.6%
[?] | 385 | 2.0%
[?] | 258 | 1.3%
[?] | 255 | 1.3%
Other values (404) | 6338 | 32.8%

ASCII
Value | Count | Frequency (%)
(space) | 1185 | 43.4%
1 | 351 | 12.9%
2 | 324 | 11.9%
) | 156 | 5.7%
( | 154 | 5.6%
3 | 128 | 4.7%
4 | 67 | 2.5%
5 | 54 | 2.0%
8 | 38 | 1.4%
6 | 38 | 1.4%
Other values (42) | 234 | 8.6%

Punctuation
Value | Count | Frequency (%)
[?] | 1 | 100.0%

설비용량(KW)
Real number (ℝ)

SKEWED 

Distinct: 859
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 544.52271
Minimum: 3
Maximum: 100000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 51.8245
Q1: 99
Median: 167.22
Q3: 496.4
95-th percentile: 999
Maximum: 100000
Range: 99997
Interquartile range (IQR): 397.4
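
The range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them and adds Tukey's 1.5 × IQR upper fence, which is not part of the report and is shown only as one common rule of thumb for reading such a long right tail.

```python
# Quantile statistics copied from this report for 설비용량(KW).
minimum, q1, q3, maximum = 3.0, 99.0, 496.4, 100000.0
p95 = 999.0

value_range = maximum - minimum
iqr = q3 - q1

# Tukey's rule of thumb (an addition, not a figure from the report):
# values above Q3 + 1.5 * IQR are flagged as potential outliers.
upper_fence = q3 + 1.5 * iqr

print(value_range)             # 99997.0
print(round(iqr, 1))           # 397.4
print(round(upper_fence, 1))   # 1092.5
print(p95 < upper_fence < maximum)  # True
```

Under this rule the 95-th percentile (999) is still inside the fence, while the maximum (100000) lies far outside it.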

Descriptive statistics

Standard deviation: 2826.8331
Coefficient of variation (CV): 5.1913961
Kurtosis: 748.72706
Mean: 544.52271
Median Absolute Deviation (MAD): 84.96
Skewness: 23.966366
Sum: 1169634.8
Variance: 7990985.3
Monotonicity: Not monotonic
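
Several of these figures are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A short check in plain Python, using the rounded values printed in this report, so agreement is only expected up to rounding:

```python
# Descriptive statistics copied from this report for 설비용량(KW).
n = 2148
mean = 544.52271
std = 2826.8331

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # the reported "Sum"

print(round(cv, 4))      # 5.1914
print(round(variance))   # 7990985
print(round(total, 1))   # 1169634.8
```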
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
99.0 | 181 | 8.4%
99.36 | 74 | 3.4%
99.28 | 51 | 2.4%
98.0 | 51 | 2.4%
99.16 | 40 | 1.9%
99.6 | 38 | 1.8%
99.06 | 33 | 1.5%
98.28 | 27 | 1.3%
99.9 | 26 | 1.2%
98.8 | 24 | 1.1%
Other values (849) | 1603 | 74.6%

Minimum 10 values

Value | Count | Frequency (%)
3.0 | 1 | < 0.1%
5.02 | 1 | < 0.1%
5.59 | 1 | < 0.1%
9.9 | 1 | < 0.1%
9.95 | 2 | 0.1%
10.0 | 1 | < 0.1%
12.0 | 1 | < 0.1%
12.18 | 1 | < 0.1%
15.0 | 2 | 0.1%
15.84 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
100000.0 | 1 | < 0.1%
33000.0 | 1 | < 0.1%
30000.0 | 4 | 0.2%
25200.0 | 1 | < 0.1%
21000.0 | 1 | < 0.1%
20000.0 | 1 | < 0.1%
15000.0 | 2 | 0.1%
12000.0 | 1 | < 0.1%
11000.0 | 1 | < 0.1%
8150.0 | 1 | < 0.1%

상태
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

사업개시 (business started): 1568
인허가 (permitted): 480
공사진행 (construction in progress): 100

Length

Max length: 4
Median length: 4
Mean length: 3.7765363
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사업개시
2nd row: 사업개시
3rd row: 사업개시
4th row: 사업개시
5th row: 사업개시

Common Values

Value | Count | Frequency (%)
사업개시 | 1568 | 73.0%
인허가 | 480 | 22.3%
공사진행 | 100 | 4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

사업개시일
Date

MISSING 

Distinct: 612
Distinct (%): 39.0%
Missing: 577
Missing (%): 26.9%
Memory size: 16.9 KiB
Minimum: 2000-04-05 00:00:00
Maximum: 2022-06-07 00:00:00

[Figure: histogram with fixed size bins (bins=50)]

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB
Minimum: 2022-06-30 00:00:00
Maximum: 2022-06-30 00:00:00

[Figure: histogram with fixed size bins (bins=1)]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 행정시 | 읍면동 | 설비용량(KW) | 상태
행정시 | 1.000 | 0.999 | 0.000 | 0.058
읍면동 | 0.999 | 1.000 | 0.000 | 0.254
설비용량(KW) | 0.000 | 0.000 | 1.000 | 0.113
상태 | 0.058 | 0.254 | 0.113 | 1.000

[Figure: correlation heatmap]

Variable | 상태 | 행정시
상태 | 1.000 | 0.096
행정시 | 0.096 | 1.000

[Figure: correlation heatmap]

Variable | 설비용량(KW) | 행정시 | 상태
설비용량(KW) | 1.000 | 0.000 | 0.085
행정시 | 0.000 | 1.000 | 0.096
상태 | 0.085 | 0.096 | 1.000
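
The matrices above do not say which coefficient each one uses. For a pair of categorical columns such as 상태 and 행정시, one common association measure is Cramér's V, computed from the chi-squared statistic of the contingency table. Whether this report uses it is an assumption. The sketch below is a plain-Python version without bias correction, run on made-up counts rather than the dataset.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as rows.

    Assumes at least two rows and two columns, and no all-zero row or column.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Made-up counts, not taken from the dataset.
print(cramers_v([[10, 0], [0, 10]]))  # 1.0 (perfect association)
print(cramers_v([[5, 5], [5, 5]]))    # 0.0 (no association)
```

A value near 1 means one column almost determines the other. A value near 0 means the two are close to independent.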

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | 행정시 | 읍면동 | 허가일자 | 상호 | 설비용량(KW) | 상태 | 사업개시일 | 데이터기준일자
0 | 제주시 | 구좌읍 | 1998-06-10 | 행원풍력발전단지(1차) | 3480.0 | 사업개시 | 2000-04-05 | 2022-06-30
1 | 제주시 | 회천동 | 2002-12-09 | 파낙스에너지(주) | 1000.0 | 사업개시 | 2002-12-18 | 2022-06-30
2 | 제주시 | 한경면 | 2003-04-14 | 한경풍력 1단계 | 6000.0 | 사업개시 | 2004-02-28 | 2022-06-30
3 | 제주시 | 한경면 | 2005-04-04 | 신창풍력발전단지 | 1700.0 | 사업개시 | 2006-03-03 | 2022-06-30
4 | 제주시 | 구좌읍 | 2005-08-30 | 제주월정풍력발전소 | 1500.0 | 사업개시 | 2006-07-20 | 2022-06-30
5 | 제주시 | 한경면 | 2006-03-03 | 오복산전 | 59.87 | 사업개시 | 2006-07-11 | 2022-06-30
6 | 제주시 | 한경면 | 2006-08-11 | 탐라해상풍력발전소 | 30000.0 | 공사진행 | 2017-09-16 | 2022-06-30
7 | 제주시 | 한경면 | 2006-08-24 | 오복산전 3호 | 29.52 | 사업개시 | 2006-09-25 | 2022-06-30
8 | 제주시 | 조천읍 | 2007-01-05 | 민솔라에너지 | 90.72 | 사업개시 | 2007-04-04 | 2022-06-30
9 | 서귀포시 | 성산읍 | 2007-02-21 | 제주에너지 주식회사 | 200.0 | 사업개시 | 2008-09-29 | 2022-06-30

Last rows

Index | 행정시 | 읍면동 | 허가일자 | 상호 | 설비용량(KW) | 상태 | 사업개시일 | 데이터기준일자
2138 | 제주시 | 한림읍 | 2022-05-20 | 구좌조천축산영농조합법인 | 990.0 | 인허가 | <NA> | 2022-06-30
2139 | 제주시 | 구좌읍 | 2022-06-08 | 태광3호 태양광발전소 | 99.99 | 인허가 | <NA> | 2022-06-30
2140 | 제주시 | 구좌읍 | 2022-06-08 | 윤슬 태양광발전소 | 99.99 | 인허가 | <NA> | 2022-06-30
2141 | 서귀포시 | 표선면 | 2022-06-08 | 비제이1호 태양광발전소 | 454.72 | 인허가 | <NA> | 2022-06-30
2142 | 제주시 | 한경면 | 2022-06-08 | 성도 태양광발전소 | 99.6 | 인허가 | <NA> | 2022-06-30
2143 | 제주시 | 한경면 | 2022-06-08 | 민재 태양광발전소 | 99.6 | 인허가 | <NA> | 2022-06-30
2144 | 제주시 | 한림읍 | 2022-06-08 | 효일4호 태양광발전소 | 99.0 | 인허가 | <NA> | 2022-06-30
2145 | 제주시 | 한림읍 | 2022-06-08 | 효일3호 태양광발전소 | 99.0 | 인허가 | <NA> | 2022-06-30
2146 | 제주시 | 한림읍 | 2022-06-08 | 효일5호 태양광발전소 | 99.0 | 인허가 | <NA> | 2022-06-30
2147 | 서귀포시 | 성산읍 | 2022-06-21 | 한라2호소수력발전소 | 210.0 | 인허가 | <NA> | 2022-06-30

Duplicate rows

Most frequently occurring

Index | 행정시 | 읍면동 | 허가일자 | 상호 | 설비용량(KW) | 상태 | 사업개시일 | 데이터기준일자 | # duplicates
0 | 서귀포시 | 안덕면 | 2007-12-14 | 번내태양광발전주식회사 | 90.72 | 사업개시 | 2008-05-13 | 2022-06-30 | 2
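
Duplicate rows are rows whose values match in every column. A minimal plain-Python sketch of the idea follows. The first tuple is the row from the table above, listed twice to mimic the duplicate, and the third is another row from the Sample section. The variable names are illustrative, and this is not the profiler's own implementation.

```python
from collections import Counter

# One tuple per row, columns in the same order as the table above.
duplicated_row = ("서귀포시", "안덕면", "2007-12-14", "번내태양광발전주식회사",
                  90.72, "사업개시", "2008-05-13", "2022-06-30")
other_row = ("제주시", "한경면", "2006-03-03", "오복산전",
             59.87, "사업개시", "2006-07-11", "2022-06-30")
rows = [duplicated_row, duplicated_row, other_row]

# Count identical rows and keep only those seen more than once.
counts = Counter(rows)
duplicates = {row: count for row, count in counts.items() if count > 1}

print(len(duplicates))             # 1 distinct duplicated row
print(duplicates[duplicated_row])  # 2 occurrences, as in "# duplicates"
```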