Overview

Dataset statistics

Number of variables: 7
Number of observations: 2524
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 143.1 KiB
Average record size in memory: 58.1 B
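
The derived figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB is 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics above.
n_rows = 2524
n_duplicates = 1
total_kib = 143.1

# Average record size: total bytes divided by the number of observations.
avg_record_bytes = total_kib * 1024 / n_rows

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

print(f"{avg_record_bytes:.1f} B per record")
print(f"{duplicate_pct:.3f}% duplicate rows")
```

The first figure rounds to the reported 58.1 B, and the second is below the 0.1% threshold shown in the overview.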

Variable types

Text: 2
Categorical: 3
Numeric: 1
DateTime: 1

Dataset

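Description: The solar power permit dataset for Yeongam-gun, Jeollanam-do is a data file containing the solar facility name, address, frequency, capacity, and related information.
URL: https://www.data.go.kr/data/15113476/fileData.do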

Alerts

주파수 (frequency) has constant value "60" [Constant]
데이터기준일자 (data reference date) has constant value "2023-03-31" [Constant]
Dataset has 1 (< 0.1%) duplicate row [Duplicates]
가동상태명 (operating status) is highly overall correlated with 용량 (capacity) and 1 other field [High correlation]
설치연도 (installation year) is highly overall correlated with 가동상태명 [High correlation]
용량 is highly overall correlated with 가동상태명 [High correlation]

Reproduction

Analysis started: 2023-12-12 06:11:36.224821
Analysis finished: 2023-12-12 06:11:37.075083
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명 (facility name)
Text

Distinct: 2332
Distinct (%): 92.4%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB

Length

Max length: 19
Median length: 18
Mean length: 4.2369255
Min length: 1

Characters and Unicode

Total characters: 10694
Distinct characters: 426
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
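
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The facility names are taken from the sample rows of this report; the function name `category_counts` is illustrative and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts

# Facility names copied from the sample rows of this report.
names = ["H.K산업", "㈜해미푸드", "(유)광진전력", "덕인4"]
counts = category_counts(names)

# Two-letter codes: Lo = Other Letter, Lu = Uppercase Letter,
# Nd = Decimal Number, So = Other Symbol, Po = Other Punctuation,
# Ps / Pe = Open / Close Punctuation.
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this column.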

Unique

Unique: 2189
Unique (%): 86.7%

Sample

1st row: 서리
2nd row: 무현
3rd row: 지은이
4th row: 서호
5th row: 영암정수

Value | Count | Frequency (%)
2호 | 69 | 2.4%
1호 | 51 | 1.8%
태양광발전소 | 35 | 1.2%
3호 | 33 | 1.2%
수상 | 13 | 0.5%
4호 | 13 | 0.5%
제1 | 12 | 0.4%
은곡리 | 12 | 0.4%
신포리 | 8 | 0.3%
5호 | 7 | 0.2%
Other values (2205) | 2598 | 91.1%
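
Entries such as "2호" and "1호" suggest that the counts above are taken over whitespace-separated tokens rather than whole names. A minimal sketch under that assumption (the names below are hypothetical and `word_frequencies` is an illustrative helper, not the tool's API):

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and count the resulting tokens."""
    counts = Counter()
    for text in values:
        counts.update(text.split())
    return counts

# Hypothetical facility names, for illustration only.
names = ["은곡리 1호", "은곡리 2호", "신포리 2호", "수상 태양광발전소"]
freq = word_frequencies(names)
total = sum(freq.values())

for word, n in freq.most_common(3):
    print(word, n, f"{100 * n / total:.1f}%")
```

The percentage is each token's share of all tokens, which is why the shares in the table are small even for the most frequent words.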

Most occurring characters

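Value | Count | Frequency (%)
(unrecovered) | 1020 | 9.5%
(space) | 390 | 3.6%
1 | 360 | 3.4%
2 | 354 | 3.3%
(unrecovered) | 275 | 2.6%
(unrecovered) | 253 | 2.4%
3 | 194 | 1.8%
(unrecovered) | 181 | 1.7%
(unrecovered) | 180 | 1.7%
(unrecovered) | 170 | 1.6%
Other values (416) | 7317 | 68.4%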

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8569 | 80.1%
Decimal Number | 1185 | 11.1%
Space Separator | 390 | 3.6%
Open Punctuation | 151 | 1.4%
Close Punctuation | 151 | 1.4%
Uppercase Letter | 108 | 1.0%
Other Symbol | 89 | 0.8%
Lowercase Letter | 26 | 0.2%
Dash Punctuation | 19 | 0.2%
Other Punctuation | 5 | < 0.1%

Most frequent character per category

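Category | Value | Count | Frequency (%)
Other Letter | (unrecovered) | 1020 | 11.9%
Other Letter | (unrecovered) | 275 | 3.2%
Other Letter | (unrecovered) | 253 | 3.0%
Other Letter | (unrecovered) | 181 | 2.1%
Other Letter | (unrecovered) | 180 | 2.1%
Other Letter | (unrecovered) | 170 | 2.0%
Other Letter | (unrecovered) | 170 | 2.0%
Other Letter | (unrecovered) | 152 | 1.8%
Other Letter | (unrecovered) | 132 | 1.5%
Other Letter | (unrecovered) | 130 | 1.5%
Other Letter | Other values (369) | 5906 | 68.9%
Uppercase Letter | S | 21 | 19.4%
Uppercase Letter | J | 12 | 11.1%
Uppercase Letter | H | 11 | 10.2%
Uppercase Letter | K | 7 | 6.5%
Uppercase Letter | Y | 6 | 5.6%
Uppercase Letter | G | 6 | 5.6%
Uppercase Letter | M | 6 | 5.6%
Uppercase Letter | I | 5 | 4.6%
Uppercase Letter | E | 5 | 4.6%
Uppercase Letter | F | 4 | 3.7%
Uppercase Letter | Other values (10) | 25 | 23.1%
Decimal Number | 1 | 360 | 30.4%
Decimal Number | 2 | 354 | 29.9%
Decimal Number | 3 | 194 | 16.4%
Decimal Number | 4 | 98 | 8.3%
Decimal Number | 5 | 68 | 5.7%
Decimal Number | 6 | 42 | 3.5%
Decimal Number | 7 | 29 | 2.4%
Decimal Number | 8 | 19 | 1.6%
Decimal Number | 9 | 11 | 0.9%
Decimal Number | 0 | 10 | 0.8%
Lowercase Letter | o | 6 | 23.1%
Lowercase Letter | c | 4 | 15.4%
Lowercase Letter | p | 4 | 15.4%
Lowercase Letter | k | 4 | 15.4%
Lowercase Letter | e | 4 | 15.4%
Lowercase Letter | d | 1 | 3.8%
Lowercase Letter | r | 1 | 3.8%
Lowercase Letter | a | 1 | 3.8%
Lowercase Letter | l | 1 | 3.8%
Other Punctuation | . | 3 | 60.0%
Other Punctuation | & | 2 | 40.0%
Space Separator | (space) | 390 | 100.0%
Open Punctuation | ( | 151 | 100.0%
Close Punctuation | ) | 151 | 100.0%
Other Symbol | (unrecovered) | 89 | 100.0%
Dash Punctuation | - | 19 | 100.0%
Letter Number | (unrecovered) | 1 | 100.0%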

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8658 | 81.0%
Common | 1901 | 17.8%
Latin | 135 | 1.3%

Most frequent character per script

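Script | Value | Count | Frequency (%)
Hangul | (unrecovered) | 1020 | 11.8%
Hangul | (unrecovered) | 275 | 3.2%
Hangul | (unrecovered) | 253 | 2.9%
Hangul | (unrecovered) | 181 | 2.1%
Hangul | (unrecovered) | 180 | 2.1%
Hangul | (unrecovered) | 170 | 2.0%
Hangul | (unrecovered) | 170 | 2.0%
Hangul | (unrecovered) | 152 | 1.8%
Hangul | (unrecovered) | 132 | 1.5%
Hangul | (unrecovered) | 130 | 1.5%
Hangul | Other values (370) | 5995 | 69.2%
Latin | S | 21 | 15.6%
Latin | J | 12 | 8.9%
Latin | H | 11 | 8.1%
Latin | K | 7 | 5.2%
Latin | Y | 6 | 4.4%
Latin | G | 6 | 4.4%
Latin | M | 6 | 4.4%
Latin | o | 6 | 4.4%
Latin | I | 5 | 3.7%
Latin | E | 5 | 3.7%
Latin | Other values (20) | 50 | 37.0%
Common | (space) | 390 | 20.5%
Common | 1 | 360 | 18.9%
Common | 2 | 354 | 18.6%
Common | 3 | 194 | 10.2%
Common | ( | 151 | 7.9%
Common | ) | 151 | 7.9%
Common | 4 | 98 | 5.2%
Common | 5 | 68 | 3.6%
Common | 6 | 42 | 2.2%
Common | 7 | 29 | 1.5%
Common | Other values (6) | 64 | 3.4%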

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8569 | 80.1%
ASCII | 2035 | 19.0%
None | 89 | 0.8%
Number Forms | 1 | < 0.1%

Most frequent character per block

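Block | Value | Count | Frequency (%)
Hangul | (unrecovered) | 1020 | 11.9%
Hangul | (unrecovered) | 275 | 3.2%
Hangul | (unrecovered) | 253 | 3.0%
Hangul | (unrecovered) | 181 | 2.1%
Hangul | (unrecovered) | 180 | 2.1%
Hangul | (unrecovered) | 170 | 2.0%
Hangul | (unrecovered) | 170 | 2.0%
Hangul | (unrecovered) | 152 | 1.8%
Hangul | (unrecovered) | 132 | 1.5%
Hangul | (unrecovered) | 130 | 1.5%
Hangul | Other values (369) | 5906 | 68.9%
ASCII | (space) | 390 | 19.2%
ASCII | 1 | 360 | 17.7%
ASCII | 2 | 354 | 17.4%
ASCII | 3 | 194 | 9.5%
ASCII | ( | 151 | 7.4%
ASCII | ) | 151 | 7.4%
ASCII | 4 | 98 | 4.8%
ASCII | 5 | 68 | 3.3%
ASCII | 6 | 42 | 2.1%
ASCII | 7 | 29 | 1.4%
ASCII | Other values (35) | 198 | 9.7%
None | (unrecovered) | 89 | 100.0%
Number Forms | (unrecovered) | 1 | 100.0%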

소재지지번주소 (lot-number address)
Text

Distinct: 1944
Distinct (%): 77.0%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB

Length

Max length: 94
Median length: 89
Mean length: 24.849842
Min length: 17

Characters and Unicode

Total characters: 62721
Distinct characters: 126
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1647
Unique (%): 65.3%

Sample

1st row: 전라남도 영암군도포면 영호리 572-5
2nd row: 전라남도 영암군도포면 구학리 470
3rd row: 전라남도 영암군도포면 구학리 388
4th row: 전라남도 영암군서호면 성재리 498
5th row: 전라남도 영암군영암읍 회문리 372

Value | Count | Frequency (%)
전라남도 | 2524 | 24.9%
영암군삼호읍 | 545 | 5.4%
영암군시종면 | 456 | 4.5%
영암군학산면 | 378 | 3.7%
영암군신북면 | 344 | 3.4%
동호리 | 291 | 2.9%
은곡리 | 269 | 2.7%
영암군미암면 | 210 | 2.1%
영암군도포면 | 169 | 1.7%
영암군군서면 | 125 | 1.2%
Other values (2038) | 4819 | 47.6%

Most occurring characters

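Value | Count | Frequency (%)
(space) | 7651 | 12.2%
1 | 3770 | 6.0%
- | 3149 | 5.0%
(unrecovered) | 2816 | 4.5%
(unrecovered) | 2731 | 4.4%
(unrecovered) | 2648 | 4.2%
(unrecovered) | 2628 | 4.2%
(unrecovered) | 2614 | 4.2%
(unrecovered) | 2534 | 4.0%
(unrecovered) | 2526 | 4.0%
Other values (116) | 29654 | 47.3%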

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33137 | 52.8%
Decimal Number | 17104 | 27.3%
Space Separator | 7651 | 12.2%
Dash Punctuation | 3149 | 5.0%
Other Punctuation | 1672 | 2.7%
Close Punctuation | 4 | < 0.1%
Open Punctuation | 4 | < 0.1%

Most frequent character per category

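Category | Value | Count | Frequency (%)
Other Letter | (unrecovered) | 2816 | 8.5%
Other Letter | (unrecovered) | 2731 | 8.2%
Other Letter | (unrecovered) | 2648 | 8.0%
Other Letter | (unrecovered) | 2628 | 7.9%
Other Letter | (unrecovered) | 2614 | 7.9%
Other Letter | (unrecovered) | 2534 | 7.6%
Other Letter | (unrecovered) | 2526 | 7.6%
Other Letter | (unrecovered) | 2524 | 7.6%
Other Letter | (unrecovered) | 1923 | 5.8%
Other Letter | (unrecovered) | 992 | 3.0%
Other Letter | Other values (101) | 9201 | 27.8%
Decimal Number | 1 | 3770 | 22.0%
Decimal Number | 2 | 1862 | 10.9%
Decimal Number | 6 | 1741 | 10.2%
Decimal Number | 3 | 1670 | 9.8%
Decimal Number | 8 | 1526 | 8.9%
Decimal Number | 5 | 1464 | 8.6%
Decimal Number | 4 | 1450 | 8.5%
Decimal Number | 7 | 1347 | 7.9%
Decimal Number | 9 | 1256 | 7.3%
Decimal Number | 0 | 1018 | 6.0%
Space Separator | (space) | 7651 | 100.0%
Dash Punctuation | - | 3149 | 100.0%
Other Punctuation | , | 1672 | 100.0%
Close Punctuation | ) | 4 | 100.0%
Open Punctuation | ( | 4 | 100.0%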

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 33137 | 52.8%
Common | 29584 | 47.2%

Most frequent character per script

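Script | Value | Count | Frequency (%)
Hangul | (unrecovered) | 2816 | 8.5%
Hangul | (unrecovered) | 2731 | 8.2%
Hangul | (unrecovered) | 2648 | 8.0%
Hangul | (unrecovered) | 2628 | 7.9%
Hangul | (unrecovered) | 2614 | 7.9%
Hangul | (unrecovered) | 2534 | 7.6%
Hangul | (unrecovered) | 2526 | 7.6%
Hangul | (unrecovered) | 2524 | 7.6%
Hangul | (unrecovered) | 1923 | 5.8%
Hangul | (unrecovered) | 992 | 3.0%
Hangul | Other values (101) | 9201 | 27.8%
Common | (space) | 7651 | 25.9%
Common | 1 | 3770 | 12.7%
Common | - | 3149 | 10.6%
Common | 2 | 1862 | 6.3%
Common | 6 | 1741 | 5.9%
Common | , | 1672 | 5.7%
Common | 3 | 1670 | 5.6%
Common | 8 | 1526 | 5.2%
Common | 5 | 1464 | 4.9%
Common | 4 | 1450 | 4.9%
Common | Other values (5) | 3629 | 12.3%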

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 33137 | 52.8%
ASCII | 29584 | 47.2%

Most frequent character per block

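Block | Value | Count | Frequency (%)
ASCII | (space) | 7651 | 25.9%
ASCII | 1 | 3770 | 12.7%
ASCII | - | 3149 | 10.6%
ASCII | 2 | 1862 | 6.3%
ASCII | 6 | 1741 | 5.9%
ASCII | , | 1672 | 5.7%
ASCII | 3 | 1670 | 5.6%
ASCII | 8 | 1526 | 5.2%
ASCII | 5 | 1464 | 4.9%
ASCII | 4 | 1450 | 4.9%
ASCII | Other values (5) | 3629 | 12.3%
Hangul | (unrecovered) | 2816 | 8.5%
Hangul | (unrecovered) | 2731 | 8.2%
Hangul | (unrecovered) | 2648 | 8.0%
Hangul | (unrecovered) | 2628 | 7.9%
Hangul | (unrecovered) | 2614 | 7.9%
Hangul | (unrecovered) | 2534 | 7.6%
Hangul | (unrecovered) | 2526 | 7.6%
Hangul | (unrecovered) | 2524 | 7.6%
Hangul | (unrecovered) | 1923 | 5.8%
Hangul | (unrecovered) | 992 | 3.0%
Hangul | Other values (101) | 9201 | 27.8%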

가동상태명 (operating status)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB
정상가동 (normal operation): 1559
<NA>: 965

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상가동
2nd row: 정상가동
3rd row: 정상가동
4th row: 정상가동
5th row: 정상가동

Common Values

Value | Count | Frequency (%)
정상가동 | 1559 | 61.8%
<NA> | 965 | 38.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
정상가동 | 1559 | 61.8%
na | 965 | 38.2%

용량 (capacity)
Real number (ℝ)

HIGH CORRELATION

Distinct: 594
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 207.35545
Minimum: 1
Maximum: 1721.92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 40.30675
Q1: 99
Median: 99.36
Q3: 198.32
95-th percentile: 990
Maximum: 1721.92
Range: 1720.92
Interquartile range (IQR): 99.32
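
The range and interquartile range above are simple differences of the listed quantiles. A short sketch that recomputes them from the reported values (variable names are illustrative):

```python
# Values copied from the quantile statistics of 용량 above.
minimum, q1, q3, maximum = 1.0, 99.0, 198.32, 1721.92

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(round(value_range, 2), round(iqr, 2))
```

The IQR is small relative to the range, which is consistent with most values clustering near 99 and a long right tail.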

Descriptive statistics

Standard deviation: 243.27894
Coefficient of variation (CV): 1.1732459
Kurtosis: 4.5545307
Mean: 207.35545
Median Absolute Deviation (MAD): 0.6
Skewness: 2.2767491
Sum: 523365.16
Variance: 59184.643
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
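
Several of the descriptive statistics for 용량 are related by simple identities, which gives a quick consistency check on the reported numbers. A minimal sketch using only values printed in this report (variable names are illustrative):

```python
import math

# Values copied from the statistics of 용량 in this report.
n = 2524
total = 523365.16
mean = 207.35545
std = 243.27894

recomputed_mean = total / n  # mean is the sum divided by the count
cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation

print(round(recomputed_mean, 5), round(cv, 5), round(variance, 2))
```

Up to rounding of the printed inputs, the results agree with the reported mean, CV, and variance.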

Common values

Value | Count | Frequency (%)
99.0 | 284 | 11.3%
99.2 | 225 | 8.9%
99.51 | 132 | 5.2%
99.36 | 96 | 3.8%
97.2 | 75 | 3.0%
99.9 | 63 | 2.5%
99.28 | 46 | 1.8%
99.96 | 40 | 1.6%
99.45 | 36 | 1.4%
99.54 | 36 | 1.4%
Other values (584) | 1491 | 59.1%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 1 | < 0.1%
9.68 | 1 | < 0.1%
10.56 | 1 | < 0.1%
13.44 | 1 | < 0.1%
13.5 | 1 | < 0.1%
14.08 | 3 | 0.1%
14.4 | 1 | < 0.1%
14.445 | 1 | < 0.1%
14.52 | 1 | < 0.1%
14.76 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1721.92 | 1 | < 0.1%
999.92 | 3 | 0.1%
999.81 | 1 | < 0.1%
999.79 | 1 | < 0.1%
999.6 | 6 | 0.2%
999.585 | 1 | < 0.1%
999.58 | 1 | < 0.1%
999.32 | 1 | < 0.1%
999.0 | 12 | 0.5%
998.97 | 1 | < 0.1%

주파수 (frequency)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB
60: 2524

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60
2nd row: 60
3rd row: 60
4th row: 60
5th row: 60

Common Values

Value | Count | Frequency (%)
60 | 2524 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
60 | 2524 | 100.0%

설치연도 (installation year)
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB
<NA>: 965
2020: 548
2022: 251
2019: 235
2021: 136
Other values (17): 389

Length

Max length: 4
Median length: 4
Mean length: 3.9940571
Min length: 1

Unique

Unique: 7
Unique (%): 0.3%

Sample

1st row: 2015
2nd row: 2015
3rd row: 2015
4th row: 2015
5th row: 2014

Common Values

Value | Count | Frequency (%)
<NA> | 965 | 38.2%
2020 | 548 | 21.7%
2022 | 251 | 9.9%
2019 | 235 | 9.3%
2021 | 136 | 5.4%
2015 | 129 | 5.1%
2018 | 126 | 5.0%
2017 | 42 | 1.7%
2016 | 22 | 0.9%
2008 | 18 | 0.7%
Other values (12) | 52 | 2.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 965 | 38.2%
2020 | 548 | 21.7%
2022 | 251 | 9.9%
2019 | 235 | 9.3%
2021 | 136 | 5.4%
2015 | 129 | 5.1%
2018 | 126 | 5.0%
2017 | 42 | 1.7%
2016 | 22 | 0.9%
2008 | 18 | 0.7%
Other values (11) | 51 | 2.0%

데이터기준일자 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.8 KiB
Minimum: 2023-03-31 00:00:00
Maximum: 2023-03-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 용량 | 설치연도
용량 | 1.000 | 0.468
설치연도 | 0.468 | 1.000

Variable | 가동상태명 | 설치연도
가동상태명 | 1.000 | 1.000
설치연도 | 1.000 | 1.000

Variable | 용량 | 가동상태명 | 설치연도
용량 | 1.000 | 1.000 | 0.230
가동상태명 | 1.000 | 1.000 | 1.000
설치연도 | 0.230 | 1.000 | 1.000
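
The report output does not say which association measure underlies each matrix. Purely as an illustration, Cramér's V is one common measure for two categorical columns, and the sketch below shows how a value of 1.000 can arise when one column fully determines the other. The uncorrected formula, the toy data, and the function name `cramers_v` are all assumptions, not taken from the report.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long categorical sequences.

    Assumes each sequence contains at least two distinct categories.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, r in row_totals.items():
        for y, c in col_totals.items():
            expected = r * c / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Toy data: the status marker is "<NA>" exactly when the year is "<NA>".
status = ["정상가동", "정상가동", "정상가동", "<NA>", "<NA>"]
year = ["2015", "2020", "2020", "<NA>", "<NA>"]
v_dependent = cramers_v(status, year)

# Toy data with no association between the two columns.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(round(v_dependent, 3), round(v_independent, 3))
```

In this dataset both 가동상태명 and 설치연도 list 965 `<NA>` entries, so a pattern like the first toy example is one plausible reason for the 1.000 entries.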

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
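
Although the overview reports 0 missing cells, 가동상태명 and 설치연도 each list 965 `<NA>` entries, which suggests those markers are stored as literal text rather than as true nulls. A minimal sketch, under that assumption, of counting such sentinel strings per column (the `SENTINELS` set and the helper name are illustrative; the rows are abbreviated from the sample in this report):

```python
SENTINELS = {"<NA>", ""}  # assumed markers for missing values

def sentinel_counts(rows, columns):
    """Count cells per column whose text equals a missing-value sentinel."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name in columns:
            if row[name].strip() in SENTINELS:
                counts[name] += 1
    return counts

# Abbreviated rows based on the sample section of this report.
rows = [
    {"시설명": "서리", "가동상태명": "정상가동", "설치연도": "2015"},
    {"시설명": "다인", "가동상태명": "<NA>", "설치연도": "<NA>"},
    {"시설명": "다정", "가동상태명": "<NA>", "설치연도": "<NA>"},
]
missing = sentinel_counts(rows, ["시설명", "가동상태명", "설치연도"])
print(missing)
```

Counting sentinels this way would expose the 965 incomplete records that a plain null check does not see.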

Sample

First rows

Index | 시설명 | 소재지지번주소 | 가동상태명 | 용량 | 주파수 | 설치연도 | 데이터기준일자
0 | 서리 | 전라남도 영암군도포면 영호리 572-5 | 정상가동 | 99.0 | 60 | 2015 | 2023-03-31
1 | 무현 | 전라남도 영암군도포면 구학리 470 | 정상가동 | 90.0 | 60 | 2015 | 2023-03-31
2 | 지은이 | 전라남도 영암군도포면 구학리 388 | 정상가동 | 60.0 | 60 | 2015 | 2023-03-31
3 | 서호 | 전라남도 영암군서호면 성재리 498 | 정상가동 | 50.0 | 60 | 2015 | 2023-03-31
4 | 영암정수 | 전라남도 영암군영암읍 회문리 372 | 정상가동 | 29.89 | 60 | 2014 | 2023-03-31
5 | H.K산업 | 전라남도 영암군삼호읍 난전리 1713-74 | 정상가동 | 55.0 | 60 | 2014 | 2023-03-31
6 | 덕산 | 전라남도 영암군시종면 봉소리 1145 | 정상가동 | 99.0 | 60 | 2015 | 2023-03-31
7 | 우신투자개발 | 전라남도 영암군삼호읍 난전리 1692-5 | 정상가동 | 99.0 | 60 | 2015 | 2023-03-31
8 | 일출 | 전라남도 영암군도포면 구학리 1692-2 | 정상가동 | 90.0 | 60 | 2015 | 2023-03-31
9 | SF | 전라남도 영암군시종면 구산리 2189-4 | 정상가동 | 90.0 | 60 | 2017 | 2023-03-31

Last rows

Index | 시설명 | 소재지지번주소 | 가동상태명 | 용량 | 주파수 | 설치연도 | 데이터기준일자
2514 | 다인 | 전라남도 영암군시종면 신학리 2466-1,553-1 | <NA> | 99.0 | 60 | <NA> | 2023-03-31
2515 | 다정 | 전라남도 영암군시종면 신학리 2466-1,553-1 | <NA> | 49.5 | 60 | <NA> | 2023-03-31
2516 | 금광 | 전라남도 영암군삼호읍 난전리 1698-2,1698-3,1698-5 | <NA> | 982.8 | 60 | <NA> | 2023-03-31
2517 | 태정 | 전라남도 영암군시종면 구산리 340-21 | <NA> | 13.44 | 60 | <NA> | 2023-03-31
2518 | 서진 | 전라남도 영암군도포면 수산리 1125-1 | <NA> | 99.5 | 60 | <NA> | 2023-03-31
2519 | ㈜해미푸드 | 전라남도 영암군영암읍 망호리 1167 | <NA> | 195.0 | 60 | <NA> | 2023-03-31
2520 | 목포문화방송 삼호송신소 제4호태양광 | 전라남도 영암군삼호읍 산호리 1365,1365-5 | <NA> | 499.38 | 60 | <NA> | 2023-03-31
2521 | 신성 | 전라남도 영암군도포면 덕화리 308 | <NA> | 447.485 | 60 | <NA> | 2023-03-31
2522 | (유)광진전력 | 전라남도 영암군군서면 성양리 840-1 | 정상가동 | 170.2 | 60 | 2009 | 2023-03-31
2523 | 덕인4 | 전라남도 영암군신북면 모산리리 130-3,130-5 | <NA> | 449.82 | 60 | <NA> | 2023-03-31

Duplicate rows

Most frequently occurring

Index | 시설명 | 소재지지번주소 | 가동상태명 | 용량 | 주파수 | 설치연도 | 데이터기준일자 | # duplicates
0 | 영암농업협동조합 | 전라남도 영암군영암읍 망호리 1168,1169 | <NA> | 98.94 | 60 | <NA> | 2023-03-31 | 2
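
The overview counts 1 duplicate row, while the entry above shows that record occurring 2 times. A minimal sketch of how such a count can be obtained by treating each row as a tuple (the helper name `duplicate_rows` is illustrative, and the second record is taken from the sample rows of this report):

```python
from collections import Counter

def duplicate_rows(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

duplicated = ("영암농업협동조합", "전라남도 영암군영암읍 망호리 1168,1169",
              "<NA>", 98.94, "60", "<NA>", "2023-03-31")
other = ("서리", "전라남도 영암군도포면 영호리 572-5",
         "정상가동", 99.0, "60", "2015", "2023-03-31")

rows = [duplicated, other, duplicated]
dups = duplicate_rows(rows)

for row, n in dups.items():
    print(row[0], n)
```

Rows are compared on all seven columns, so two facilities at the same address with different capacities would not be flagged.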