Overview

Dataset statistics

Number of variables: 6
Number of observations: 248
Missing cells: 30
Missing cells (%): 2.0%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 12.0 KiB
Average record size in memory: 49.5 B
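The percentages above follow from the raw counts. A minimal Python sketch of that arithmetic, assuming (the report does not state this) that the missing-cell percentage is taken over all cells and the duplicate percentage over all rows. The memory figure is itself rounded, so the last value is only approximate.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

n_variables = 6
n_observations = 248
missing_cells = 30
duplicate_rows = 1

total_cells = n_variables * n_observations           # 6 * 248 cells
missing_pct = percent(missing_cells, total_cells)    # share of empty cells
duplicate_pct = percent(duplicate_rows, n_observations)

# 12.0 KiB is a rounded figure, so this only approximates the reported size.
avg_record_bytes = round(12.0 * 1024 / n_observations, 1)

print(total_cells, missing_pct, duplicate_pct, avg_record_bytes)
```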

Variable types

Text: 3
Numeric: 1
DateTime: 2

Dataset

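Description: Solar power plant information for Jeungpyeong-gun, Chungcheongbuk-do, including permit date, plant name, installed capacity, site address, business start date, energy source, and status
URL: https://www.data.go.kr/data/3037949/fileData.do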

Alerts

데이터기준일자 has constant value "2023-06-21" (Constant)
Dataset has 1 (0.4%) duplicate rows (Duplicates)
발전소명 has 5 (2.0%) missing values (Missing)
설치장소 has 5 (2.0%) missing values (Missing)
설비용량(kW) has 5 (2.0%) missing values (Missing)
허가일자 has 5 (2.0%) missing values (Missing)
사업개시일자 has 5 (2.0%) missing values (Missing)
데이터기준일자 has 5 (2.0%) missing values (Missing)
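Every column reports exactly 5 missing values, and the Sample section shows five trailing rows that are entirely <NA>. The sketch below uses hypothetical helpers (`drop_empty_rows`, `count_missing`) and a shortened stand-in for the data to show how removing such rows would clear all six Missing alerts at once. It is an illustration, not part of the profiling run.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[str], ...]

def drop_empty_rows(rows: List[Row]) -> List[Row]:
    """Keep only rows that hold at least one non-missing value."""
    return [row for row in rows if any(value is not None for value in row)]

def count_missing(rows: List[Row]) -> int:
    """Count missing cells across all rows."""
    return sum(value is None for row in rows for value in row)

# Two records copied from the Sample section, followed by five all-missing rows.
rows: List[Row] = [
    ("대원전기교육원", "충청북도 증평군 증평읍 연탄리 510-14",
     "20.5", "2008-02-29", "2008-05-28", "2023-06-21"),
    ("율리태양광발전소", "충청북도 증평군 증평읍 율리휴양로 307",
     "30.0", "2009-09-28", "2010-12-29", "2023-06-21"),
] + [(None,) * 6] * 5

missing_before = count_missing(rows)  # 5 rows x 6 columns
cleaned = drop_empty_rows(rows)
missing_after = count_missing(cleaned)
print(missing_before, len(cleaned), missing_after)
```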

Reproduction

Analysis started: 2023-12-12 14:37:48.373512
Analysis finished: 2023-12-12 14:37:49.227521
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

발전소명 (power plant name)
Text

MISSING 

Distinct: 232
Distinct (%): 95.5%
Missing: 5
Missing (%): 2.0%
Memory size: 2.1 KiB

Length

Max length: 25
Median length: 23
Mean length: 10.794239
Min length: 7

Characters and Unicode

Total characters: 2623
Distinct characters: 241
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
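Python's standard `unicodedata` module exposes the general category of each code point. As a rough sketch (not necessarily how the report computes its figures), the categories of one sampled name can be tallied as follows. The two-letter codes correspond to the names used in this report, for example `Lo` = Other Letter, `Nd` = Decimal Number, `So` = Other Symbol. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Tally the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# One of the names listed in this variable's Sample rows.
name = "충북태양광발전㈜15호발전소"
counts = category_counts(name)

for code, n in counts.most_common():
    print(code, n)
```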

Unique

Unique: 223
Unique (%): 91.8%

Sample

1st row: 대원전기교육원
2nd row: 한국철강(주) GETWATT Solar 발전소
3rd row: 율리태양광발전소
4th row: 창도태양광발전소
5th row: 충북태양광발전㈜15호발전소
Value  Count  Frequency (%)
태양광발전소  201  43.4%
도안  4  0.9%
덕상  2  0.4%
김덕원  2  0.4%
제이디쏠라파워  2  0.4%
발전소  2  0.4%
숫고개  2  0.4%
2호  2  0.4%
행복  2  0.4%
용강  2  0.4%
Other values (238)  242  52.3%
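The counts in this table sum to 463, which equals the 243 non-missing names plus the 220 space characters reported for this variable, so the values appear to be whitespace-separated tokens rather than whole names. A minimal sketch of that kind of tally, assuming plain whitespace splitting (the report does not document its tokenizer) and using a hypothetical `token_frequencies` helper:

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

def token_frequencies(names: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Split each name on whitespace and return (count, percent) per token."""
    tokens = [token for name in names for token in name.split()]
    counts = Counter(tokens)
    total = len(tokens)
    return {tok: (n, round(100 * n / total, 1)) for tok, n in counts.items()}

# Three names taken from the Sample section of this report.
names = ["다원 태양광발전소", "럭키 태양광발전소", "율리태양광발전소"]
freqs = token_frequencies(names)
print(freqs["태양광발전소"])

# The same percentage formula applied to the table's top row.
top_row_pct = round(100 * 201 / 463, 1)
```

Note that a name written without a space, such as 율리태양광발전소, stays a single token and does not add to the 태양광발전소 count.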

Most occurring characters

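Value  Count  Frequency (%)
[unrecoverable]  248  9.5%
[unrecoverable]  245  9.3%
[unrecoverable]  245  9.3%
[unrecoverable]  242  9.2%
[unrecoverable]  242  9.2%
[unrecoverable]  240  9.1%
(space)  220  8.4%
[unrecoverable]  48  1.8%
2  35  1.3%
1  24  0.9%
Other values (231)  834  31.8%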

Most occurring categories

Value  Count  Frequency (%)
Other Letter  2287  87.2%
Space Separator  220  8.4%
Decimal Number  86  3.3%
Uppercase Letter  17  0.6%
Open Punctuation  4  0.2%
Close Punctuation  4  0.2%
Lowercase Letter  4  0.2%
Other Symbol  1  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[unrecoverable]  248  10.8%
[unrecoverable]  245  10.7%
[unrecoverable]  245  10.7%
[unrecoverable]  242  10.6%
[unrecoverable]  242  10.6%
[unrecoverable]  240  10.5%
[unrecoverable]  48  2.1%
[unrecoverable]  24  1.0%
[unrecoverable]  23  1.0%
[unrecoverable]  23  1.0%
Other values (205)  707  30.9%
Uppercase Letter
Value  Count  Frequency (%)
T  3  17.6%
B  3  17.6%
Y  2  11.8%
S  2  11.8%
J  1  5.9%
A  1  5.9%
W  1  5.9%
E  1  5.9%
G  1  5.9%
C  1  5.9%
Decimal Number
Value  Count  Frequency (%)
2  35  40.7%
1  24  27.9%
3  17  19.8%
5  4  4.7%
4  3  3.5%
6  2  2.3%
9  1  1.2%
Lowercase Letter
Value  Count  Frequency (%)
a  1  25.0%
r  1  25.0%
l  1  25.0%
o  1  25.0%
Space Separator
Value  Count  Frequency (%)
(space)  220  100.0%
Open Punctuation
Value  Count  Frequency (%)
(  4  100.0%
Close Punctuation
Value  Count  Frequency (%)
)  4  100.0%
Other Symbol
Value  Count  Frequency (%)
[unrecoverable]  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  2288  87.2%
Common  314  12.0%
Latin  21  0.8%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[unrecoverable]  248  10.8%
[unrecoverable]  245  10.7%
[unrecoverable]  245  10.7%
[unrecoverable]  242  10.6%
[unrecoverable]  242  10.6%
[unrecoverable]  240  10.5%
[unrecoverable]  48  2.1%
[unrecoverable]  24  1.0%
[unrecoverable]  23  1.0%
[unrecoverable]  23  1.0%
Other values (206)  708  30.9%
Latin
Value  Count  Frequency (%)
T  3  14.3%
B  3  14.3%
Y  2  9.5%
S  2  9.5%
J  1  4.8%
a  1  4.8%
r  1  4.8%
l  1  4.8%
o  1  4.8%
A  1  4.8%
Other values (5)  5  23.8%
Common
Value  Count  Frequency (%)
(space)  220  70.1%
2  35  11.1%
1  24  7.6%
3  17  5.4%
(  4  1.3%
)  4  1.3%
5  4  1.3%
4  3  1.0%
6  2  0.6%
9  1  0.3%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  2287  87.2%
ASCII  335  12.8%
None  1  < 0.1%

Most frequent character per block

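Hangul
Value  Count  Frequency (%)
[unrecoverable]  248  10.8%
[unrecoverable]  245  10.7%
[unrecoverable]  245  10.7%
[unrecoverable]  242  10.6%
[unrecoverable]  242  10.6%
[unrecoverable]  240  10.5%
[unrecoverable]  48  2.1%
[unrecoverable]  24  1.0%
[unrecoverable]  23  1.0%
[unrecoverable]  23  1.0%
Other values (205)  707  30.9%
ASCII
Value  Count  Frequency (%)
(space)  220  65.7%
2  35  10.4%
1  24  7.2%
3  17  5.1%
(  4  1.2%
)  4  1.2%
5  4  1.2%
4  3  0.9%
T  3  0.9%
B  3  0.9%
Other values (15)  18  5.4%
None
Value  Count  Frequency (%)
[unrecoverable]  1  100.0%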

설치장소 (installation site)
Text

MISSING 

Distinct: 203
Distinct (%): 83.5%
Missing: 5
Missing (%): 2.0%
Memory size: 2.1 KiB

Length

Max length: 45
Median length: 37
Mean length: 23.753086
Min length: 18

Characters and Unicode

Total characters: 5772
Distinct characters: 167
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 179
Unique (%): 73.7%

Sample

1st row: 충청북도 증평군 증평읍 연탄리 510-14
2nd row: 충청북도 증평군 증평읍 증평산단로 80
3rd row: 충청북도 증평군 증평읍 율리휴양로 307
4th row: 충청북도 증평군 증평읍 삼보로1길 36-4
5th row: 충청북도 증평군 증평읍 삼보로 40-60
Value  Count  Frequency (%)
충청북도  243  18.5%
증평군  243  18.5%
증평읍  141  10.7%
도안면  102  7.7%
송정리  31  2.4%
용강리  31  2.4%
광덕리  18  1.4%
덕상리  12  0.9%
석곡리  12  0.9%
연탄리  11  0.8%
Other values (312)  473  35.9%

Most occurring characters

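Value  Count  Frequency (%)
(space)  1074  18.6%
[unrecoverable]  397  6.9%
[unrecoverable]  391  6.8%
[unrecoverable]  346  6.0%
[unrecoverable]  245  4.2%
[unrecoverable]  245  4.2%
[unrecoverable]  243  4.2%
[unrecoverable]  243  4.2%
1  194  3.4%
[unrecoverable]  141  2.4%
Other values (157)  2253  39.0%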

Most occurring categories

Value  Count  Frequency (%)
Other Letter  3588  62.2%
Space Separator  1074  18.6%
Decimal Number  922  16.0%
Dash Punctuation  77  1.3%
Other Punctuation  75  1.3%
Close Punctuation  16  0.3%
Open Punctuation  16  0.3%
Uppercase Letter  4  0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[unrecoverable]  397  11.1%
[unrecoverable]  391  10.9%
[unrecoverable]  346  9.6%
[unrecoverable]  245  6.8%
[unrecoverable]  245  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  141  3.9%
[unrecoverable]  133  3.7%
[unrecoverable]  107  3.0%
Other values (140)  1097  30.6%
Decimal Number
Value  Count  Frequency (%)
1  194  21.0%
3  113  12.3%
2  111  12.0%
8  99  10.7%
4  84  9.1%
9  81  8.8%
7  68  7.4%
5  65  7.0%
0  57  6.2%
6  50  5.4%
Uppercase Letter
Value  Count  Frequency (%)
K  2  50.0%
S  2  50.0%
Space Separator
Value  Count  Frequency (%)
(space)  1074  100.0%
Dash Punctuation
Value  Count  Frequency (%)
-  77  100.0%
Other Punctuation
Value  Count  Frequency (%)
,  75  100.0%
Close Punctuation
Value  Count  Frequency (%)
)  16  100.0%
Open Punctuation
Value  Count  Frequency (%)
(  16  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  3588  62.2%
Common  2180  37.8%
Latin  4  0.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[unrecoverable]  397  11.1%
[unrecoverable]  391  10.9%
[unrecoverable]  346  9.6%
[unrecoverable]  245  6.8%
[unrecoverable]  245  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  141  3.9%
[unrecoverable]  133  3.7%
[unrecoverable]  107  3.0%
Other values (140)  1097  30.6%
Common
Value  Count  Frequency (%)
(space)  1074  49.3%
1  194  8.9%
3  113  5.2%
2  111  5.1%
8  99  4.5%
4  84  3.9%
9  81  3.7%
-  77  3.5%
,  75  3.4%
7  68  3.1%
Other values (5)  204  9.4%
Latin
Value  Count  Frequency (%)
K  2  50.0%
S  2  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  3588  62.2%
ASCII  2184  37.8%

Most frequent character per block

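ASCII
Value  Count  Frequency (%)
(space)  1074  49.2%
1  194  8.9%
3  113  5.2%
2  111  5.1%
8  99  4.5%
4  84  3.8%
9  81  3.7%
-  77  3.5%
,  75  3.4%
7  68  3.1%
Other values (7)  208  9.5%
Hangul
Value  Count  Frequency (%)
[unrecoverable]  397  11.1%
[unrecoverable]  391  10.9%
[unrecoverable]  346  9.6%
[unrecoverable]  245  6.8%
[unrecoverable]  245  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  243  6.8%
[unrecoverable]  141  3.9%
[unrecoverable]  133  3.7%
[unrecoverable]  107  3.0%
Other values (140)  1097  30.6%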

설비용량(kW) (installed capacity, kW)
Real number (ℝ)

MISSING 

Distinct: 129
Distinct (%): 53.1%
Missing: 5
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.97778
Minimum: 9.33
Maximum: 2570.25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 9.33
5-th percentile: 19.365
Q1: 29.83
Median: 99
Q3: 99.9
95-th percentile: 495
Maximum: 2570.25
Range: 2560.92
Interquartile range (IQR): 70.07

Descriptive statistics

Standard deviation: 217.55746
Coefficient of variation (CV): 1.7133507
Kurtosis: 67.727311
Mean: 126.97778
Median Absolute Deviation (MAD): 49.5
Skewness: 6.9130868
Sum: 30855.6
Variance: 47331.25
Monotonicity: Not monotonic
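Several of the figures reported for this variable are derived from others, so they can be cross-checked. The sketch below recomputes range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × non-missing count). Small differences are expected because the inputs are rounded.

```python
import math

# Values copied from the report for this variable.
minimum, maximum = 9.33, 2570.25
q1, q3 = 29.83, 99.9
mean, std = 126.97778, 217.55746
n_valid = 248 - 5  # observations minus missing values

value_range = maximum - minimum   # reported: 2560.92
iqr = q3 - q1                     # reported: 70.07
cv = std / mean                   # reported: 1.7133507
variance = std ** 2               # reported: 47331.25
total = mean * n_valid            # reported: 30855.6

for label, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                     ("variance", variance), ("sum", total)]:
    print(f"{label}: {value:.4f}")
```

With a CV above 1 and a maximum of 2570.25 against a median of 99, the distribution is strongly right-skewed, which matches the reported skewness of about 6.9.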
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
99.9  19  7.7%
99.0  15  6.0%
19.8  13  5.2%
99.2  10  4.0%
99.6  8  3.2%
99.28  7  2.8%
99.63  6  2.4%
19.92  6  2.4%
99.96  5  2.0%
30.0  5  2.0%
Other values (119)  149  60.1%
(Missing)  5  2.0%
Minimum 10 values
Value  Count  Frequency (%)
9.33  1  0.4%
9.99  1  0.4%
11.04  1  0.4%
14.11  1  0.4%
14.94  1  0.4%
17.55  1  0.4%
17.6  1  0.4%
17.8  1  0.4%
18.0  1  0.4%
18.72  1  0.4%
Maximum 10 values
Value  Count  Frequency (%)
2570.25  1  0.4%
993.22  1  0.4%
976.5  1  0.4%
882.9  1  0.4%
723.52  1  0.4%
510.56  1  0.4%
499.14  1  0.4%
498.96  3  1.2%
498.0  1  0.4%
497.0  1  0.4%

허가일자 (permit date)
Date

MISSING 

Distinct: 142
Distinct (%): 58.4%
Missing: 5
Missing (%): 2.0%
Memory size: 2.1 KiB
Minimum: 2008-02-29 00:00:00
Maximum: 2022-12-15 00:00:00
Histogram with fixed size bins (bins=50)

사업개시일자 (business start date)
Text

MISSING 

Distinct: 141
Distinct (%): 58.0%
Missing: 5
Missing (%): 2.0%
Memory size: 2.1 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2430
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 42.0%

Sample

1st row: 2008-05-28
2nd row: 2012-04-10
3rd row: 2010-12-29
4th row: 2012-07-23
5th row: 2012-07-26
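Every non-missing value in this column is exactly 10 characters long and uses only digits and hyphens, which suggests YYYY-MM-DD dates that were classified as Text rather than Date. A minimal sketch of parsing them with the standard library, assuming that format holds for every row (only the samples and frequent values listed in this report have been checked). `parse_iso_date` is a hypothetical helper name.

```python
from datetime import date, datetime
from typing import Optional

def parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; pass missing values through as None."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%d").date()

# Row 0 of the Sample section: 허가일자 (permit) and 사업개시일자 (start).
permit_date = parse_iso_date("2008-02-29")
start_date = parse_iso_date("2008-05-28")

# Days between permit and business start for that plant.
lead_days = (start_date - permit_date).days
print(permit_date, start_date, lead_days)
```

Once parsed, the column can be compared directly with 허가일자, for example to check that no plant starts operating before its permit date.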
Value  Count  Frequency (%)
2019-11-08  15  6.2%
2020-06-18  8  3.3%
2019-04-03  7  2.9%
2020-08-21  6  2.5%
2021-01-14  5  2.1%
2020-11-02  5  2.1%
2019-10-15  5  2.1%
2020-05-06  5  2.1%
2020-02-24  5  2.1%
2019-08-17  4  1.6%
Other values (131)  178  73.3%

Most occurring characters

Value  Count  Frequency (%)
0  578  23.8%
2  508  20.9%
-  486  20.0%
1  392  16.1%
9  95  3.9%
8  93  3.8%
3  64  2.6%
4  58  2.4%
6  55  2.3%
7  53  2.2%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  1944  80.0%
Dash Punctuation  486  20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  578  29.7%
2  508  26.1%
1  392  20.2%
9  95  4.9%
8  93  4.8%
3  64  3.3%
4  58  3.0%
6  55  2.8%
7  53  2.7%
5  48  2.5%
Dash Punctuation
Value  Count  Frequency (%)
-  486  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  2430  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  578  23.8%
2  508  20.9%
-  486  20.0%
1  392  16.1%
9  95  3.9%
8  93  3.8%
3  64  2.6%
4  58  2.4%
6  55  2.3%
7  53  2.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2430  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  578  23.8%
2  508  20.9%
-  486  20.0%
1  392  16.1%
9  95  3.9%
8  93  3.8%
3  64  2.6%
4  58  2.4%
6  55  2.3%
7  53  2.2%

데이터기준일자 (data reference date)
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.4%
Missing: 5
Missing (%): 2.0%
Memory size: 2.1 KiB
Minimum: 2023-06-21 00:00:00
Maximum: 2023-06-21 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

발전소명설치장소설비용량(kW)허가일자사업개시일자데이터기준일자
0대원전기교육원충청북도 증평군 증평읍 연탄리 510-1420.52008-02-292008-05-282023-06-21
1한국철강(주) GETWATT Solar 발전소충청북도 증평군 증평읍 증평산단로 80510.562009-06-112012-04-102023-06-21
2율리태양광발전소충청북도 증평군 증평읍 율리휴양로 30730.02009-09-282010-12-292023-06-21
3창도태양광발전소충청북도 증평군 증평읍 삼보로1길 36-411.042012-03-132012-07-232023-06-21
4충북태양광발전㈜15호발전소충청북도 증평군 증평읍 삼보로 40-6060.02012-06-262012-07-262023-06-21
5미래씨앤엘주식회사 태양광발전소충청북도 증평군 증평읍 증평산단로 13, 미래씨앤엘(주)199.922012-09-192012-11-292023-06-21
6홍태양광발전소충청북도 증평군 증평읍 연정길 6321.02012-09-262013-01-162023-06-21
7충북증평태양광발전소충청북도 증평군 증평읍 미암로 82-41, SK아이이테크놀로지993.222012-09-262013-03-042023-06-21
8하나태양광발전소충청북도 증평군 증평읍 초중로 8020.02013-03-042013-08-212023-06-21
9신라태양광발전소충청북도 증평군 증평읍 중앙로 64-62, (주)신라식품250.02013-05-082013-08-232023-06-21
발전소명설치장소설비용량(kW)허가일자사업개시일자데이터기준일자
238다원 태양광발전소충청북도 증평군 증평읍 송산로5길 15-119.642022-11-162023-03-202023-06-21
239럭키 태양광발전소충청북도 증평군 증평읍 하평로 4919.622022-11-242023-04-122023-06-21
240외룡동회마을2 태양광발전소충청북도 증평군 증평읍 용강리 115-199.52022-12-022023-03-202023-06-21
241은행정 태양광발전소충청북도 증평군 도안면 은행정길 626.192022-12-082023-05-122023-06-21
242유진2 태양광발전소충청북도 증평군 증평읍 용강리 6649.52022-12-152023-03-202023-06-21
243<NA><NA><NA><NA><NA><NA>
244<NA><NA><NA><NA><NA><NA>
245<NA><NA><NA><NA><NA><NA>
246<NA><NA><NA><NA><NA><NA>
247<NA><NA><NA><NA><NA><NA>

Duplicate rows

Most frequently occurring

Index  발전소명  설치장소  설비용량(kW)  허가일자  사업개시일자  데이터기준일자  # duplicates
0  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>  5
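The table lists one distinct row (all <NA>) that occurs 5 times, while the Overview reports 1 duplicate row. That is consistent with counting distinct duplicated rows rather than surplus copies, although the report itself does not say which definition it uses. A small sketch of both counts on stand-in data, with a hypothetical `duplicate_summary` helper:

```python
from collections import Counter
from typing import Dict, Hashable, Iterable

def duplicate_summary(rows: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Stand-in data: two distinct records plus five all-missing rows.
rows = [("A", "x"), ("B", "y")] + [(None, None)] * 5

summary = duplicate_summary(rows)
distinct_duplicated = len(summary)                      # distinct repeated rows
surplus_copies = sum(n - 1 for n in summary.values())   # copies beyond the first
print(summary, distinct_duplicated, surplus_copies)
```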