Overview

Dataset statistics

Number of variables: 8
Number of observations: 1437
Missing cells: 1721
Missing cells (%): 15.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 94.2 KiB
Average record size in memory: 67.1 B
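
The derived figures in this panel can be cross-checked from the raw counts. The sketch below is only a consistency check, not the profiler's own code; it assumes that the two date columns flagged in the Alerts section (사업개시일 and 허가취소일) are the only columns with missing values, which agrees with the per-variable panels.

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 8
n_observations = 1437
missing_by_column = {"사업개시일": 434, "허가취소일": 1287}  # from the Alerts section

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

total_bytes = 94.2 * 1024              # 94.2 KiB expressed in bytes
avg_record_bytes = total_bytes / n_observations

print(missing_cells)                   # total missing cells
print(round(missing_pct, 1))           # share of all cells, in percent
print(round(avg_record_bytes, 1))      # bytes per row
```

The three printed values should agree with "Missing cells", "Missing cells (%)" and "Average record size in memory" above.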

Variable types

Numeric: 3
Text: 1
Categorical: 1
DateTime: 3

Dataset

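Description: Status of solar power installations in Asan-si, Chungcheongnam-do (충청남도 아산시). The data include the installation year, installed capacity, plant address, installation location, initial permit date, and business start date.
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=400&beforeMenuCd=DOM_000000201001001000&publicdatapk=15034102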

Alerts

순번 is highly overall correlated with 년도 (High correlation)
년도 is highly overall correlated with 순번 (High correlation)
위치 is highly imbalanced (74.4%) (Imbalance)
사업개시일 has 434 (30.2%) missing values (Missing)
허가취소일 has 1287 (89.6%) missing values (Missing)
순번 has unique values (Unique)

Reproduction

Analysis started: 2024-01-09 22:26:00.640591
Analysis finished: 2024-01-09 22:26:01.789959
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1437
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 719
Minimum: 1
Maximum: 1437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 72.8
Q1: 360
Median: 719
Q3: 1078
95-th percentile: 1365.2
Maximum: 1437
Range: 1436
Interquartile range (IQR): 718

Descriptive statistics

Standard deviation: 414.97048
Coefficient of variation (CV): 0.57714949
Kurtosis: -1.2
Mean: 719
Median Absolute Deviation (MAD): 359
Skewness: 0
Sum: 1033203
Variance: 172200.5
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.1%
945 | 1 | 0.1%
965 | 1 | 0.1%
964 | 1 | 0.1%
963 | 1 | 0.1%
962 | 1 | 0.1%
961 | 1 | 0.1%
960 | 1 | 0.1%
959 | 1 | 0.1%
958 | 1 | 0.1%
Other values (1427) | 1427 | 99.3%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
1437 | 1 | 0.1%
1436 | 1 | 0.1%
1435 | 1 | 0.1%
1434 | 1 | 0.1%
1433 | 1 | 0.1%
1432 | 1 | 0.1%
1431 | 1 | 0.1%
1430 | 1 | 0.1%
1429 | 1 | 0.1%
1428 | 1 | 0.1%
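
The statistics for 순번 are what one would expect from a plain row counter. As a minimal sketch, assuming 순번 takes exactly the values 1 to 1437 (consistent with the minimum, maximum, distinct count and strict monotonicity reported above), the sum, variance, standard deviation and coefficient of variation can be reproduced with the Python standard library and compared against the closed forms for consecutive integers.

```python
import math
import statistics

n = 1437
seq = list(range(1, n + 1))          # assumption: 순번 is exactly 1, 2, ..., n

total = sum(seq)
mean = statistics.fmean(seq)
variance = statistics.variance(seq)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation

# Closed forms for the integers 1..n
closed_sum = n * (n + 1) // 2
closed_variance = n * (n + 1) / 12

print(total, mean, variance)
print(round(std, 5), round(cv, 6))
```

The report's variance matches the sample (n - 1) form, and a kurtosis of -1.2 with zero skewness is the signature of a uniform spread, so 순번 carries no information beyond row order.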

년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.7871
Minimum: 2014
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.8 KiB

Quantile statistics

Minimum: 2014
5-th percentile: 2015
Q1: 2018
Median: 2020
Q3: 2022
95-th percentile: 2023
Maximum: 2023
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.385372
Coefficient of variation (CV): 0.0011810017
Kurtosis: -0.19125862
Mean: 2019.7871
Median Absolute Deviation (MAD): 2
Skewness: -0.62971332
Sum: 2902434
Variance: 5.6899995
Monotonicity: Increasing
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
2020 | 315 | 21.9%
2021 | 213 | 14.8%
2023 | 212 | 14.8%
2019 | 179 | 12.5%
2022 | 157 | 10.9%
2017 | 116 | 8.1%
2018 | 95 | 6.6%
2014 | 58 | 4.0%
2016 | 56 | 3.9%
2015 | 36 | 2.5%

Value | Count | Frequency (%)
2014 | 58 | 4.0%
2015 | 36 | 2.5%
2016 | 56 | 3.9%
2017 | 116 | 8.1%
2018 | 95 | 6.6%
2019 | 179 | 12.5%
2020 | 315 | 21.9%
2021 | 213 | 14.8%
2022 | 157 | 10.9%
2023 | 212 | 14.8%

Value | Count | Frequency (%)
2023 | 212 | 14.8%
2022 | 157 | 10.9%
2021 | 213 | 14.8%
2020 | 315 | 21.9%
2019 | 179 | 12.5%
2018 | 95 | 6.6%
2017 | 116 | 8.1%
2016 | 56 | 3.9%
2015 | 36 | 2.5%
2014 | 58 | 4.0%

용량(KW)
Real number (ℝ)

Distinct: 570
Distinct (%): 39.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 117.87769
Minimum: 9.36
Maximum: 2261.7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.8 KiB

Quantile statistics

Minimum: 9.36
5-th percentile: 19.5
Q1: 65.49
Median: 98.88
Q3: 99.84
95-th percentile: 373.08
Maximum: 2261.7
Range: 2252.34
Interquartile range (IQR): 34.35

Descriptive statistics

Standard deviation: 130.11136
Coefficient of variation (CV): 1.1037827
Kurtosis: 61.911523
Mean: 117.87769
Median Absolute Deviation (MAD): 7.43
Skewness: 5.7385185
Sum: 169390.24
Variance: 16928.965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
99.2 | 60 | 4.2%
99.96 | 51 | 3.5%
99.0 | 51 | 3.5%
99.9 | 44 | 3.1%
99.84 | 44 | 3.1%
99.45 | 34 | 2.4%
99.6 | 30 | 2.1%
98.44 | 29 | 2.0%
97.2 | 27 | 1.9%
98.28 | 23 | 1.6%
Other values (560) | 1044 | 72.7%

Value | Count | Frequency (%)
9.36 | 1 | 0.1%
9.38 | 1 | 0.1%
10.13 | 1 | 0.1%
13.5 | 1 | 0.1%
13.86 | 1 | 0.1%
14.4 | 1 | 0.1%
14.6 | 1 | 0.1%
14.88 | 1 | 0.1%
15.0 | 3 | 0.2%
15.21 | 2 | 0.1%

Value | Count | Frequency (%)
2261.7 | 1 | 0.1%
999.6 | 1 | 0.1%
999.0 | 1 | 0.1%
997.5 | 1 | 0.1%
965.7 | 1 | 0.1%
950.82 | 1 | 0.1%
943.36 | 1 | 0.1%
881.64 | 1 | 0.1%
856.8 | 1 | 0.1%
802.14 | 1 | 0.1%
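
The summary figures for 용량(KW) are mutually consistent, which can be checked with a few lines of arithmetic. This is a sketch that only re-derives values from numbers printed in the report (the variable names are illustrative); it does not touch the underlying data.

```python
# Figures copied from the 용량(KW) panels of the report.
n = 1437
total_kw = 169390.24
std = 130.11136
q1, q3 = 65.49, 99.84
minimum, maximum = 9.36, 2261.7

mean = total_kw / n            # Sum / number of observations
cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
variance = std ** 2

print(round(mean, 5), round(cv, 5), round(iqr, 2), round(value_range, 2), round(variance, 2))
```

With a mean of about 117.9 kW against a median of 98.88 kW, and a skewness of 5.74, the distribution is strongly right-skewed: most installations sit just under 100 kW and a small number of large plants pull the mean upward.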

설치장소
Text

Distinct: 950
Distinct (%): 66.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB

Length

Max length: 55
Median length: 50
Mean length: 23.207376
Min length: 15

Characters and Unicode

Total characters: 33349
Distinct characters: 182
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 721
Unique (%): 50.2%

Sample

1st row: 충청남도 아산시 인주면 도흥리 109-5외 2필지
2nd row: 충청남도 아산시 도고면 도고산로 309
3rd row: 충청남도 아산시 도고면 도산리 78
4th row: 충청남도 아산시 염치읍 산양리 132-4
5th row: 충청남도 아산시 신창면 신정호길 268
Value | Count | Frequency (%)
충청남도 | 1437 | 18.8%
아산시 | 1437 | 18.8%
음봉면 | 253 | 3.3%
선장면 | 214 | 2.8%
도고면 | 196 | 2.6%
둔포면 | 183 | 2.4%
영인면 | 125 | 1.6%
신창면 | 111 | 1.5%
인주면 | 110 | 1.4%
염치읍 | 94 | 1.2%
Other values (1288) | 3477 | 45.5%

Most occurring characters

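Value | Count | Frequency (%)
(space) | 6204 | 18.6%
(Hangul, not recovered) | 1697 | 5.1%
(Hangul, not recovered) | 1693 | 5.1%
(Hangul, not recovered) | 1515 | 4.5%
(Hangul, not recovered) | 1450 | 4.3%
(Hangul, not recovered) | 1449 | 4.3%
(Hangul, not recovered) | 1438 | 4.3%
(Hangul, not recovered) | 1438 | 4.3%
(Hangul, not recovered) | 1296 | 3.9%
1 | 1217 | 3.6%
Other values (172) | 13952 | 41.8%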

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18968 | 56.9%
Decimal Number | 6482 | 19.4%
Space Separator | 6204 | 18.6%
Dash Punctuation | 1162 | 3.5%
Other Punctuation | 367 | 1.1%
Close Punctuation | 75 | 0.2%
Open Punctuation | 74 | 0.2%
Uppercase Letter | 10 | < 0.1%
Control | 4 | < 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

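Other Letter
Value | Count | Frequency (%)
(Hangul, not recovered) | 1697 | 8.9%
(Hangul, not recovered) | 1693 | 8.9%
(Hangul, not recovered) | 1515 | 8.0%
(Hangul, not recovered) | 1450 | 7.6%
(Hangul, not recovered) | 1449 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1296 | 6.8%
(Hangul, not recovered) | 1215 | 6.4%
(Hangul, not recovered) | 379 | 2.0%
Other values (150) | 5398 | 28.5%

Decimal Number
Value | Count | Frequency (%)
1 | 1217 | 18.8%
2 | 950 | 14.7%
3 | 735 | 11.3%
5 | 622 | 9.6%
7 | 613 | 9.5%
4 | 603 | 9.3%
6 | 512 | 7.9%
9 | 447 | 6.9%
0 | 395 | 6.1%
8 | 388 | 6.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 5 | 50.0%
B | 3 | 30.0%
D | 1 | 10.0%
C | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
, | 364 | 99.2%
. | 3 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 6204 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1162 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 75 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 74 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 4 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%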

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18968 | 56.9%
Common | 14371 | 43.1%
Latin | 10 | < 0.1%

Most frequent character per script

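Hangul
Value | Count | Frequency (%)
(Hangul, not recovered) | 1697 | 8.9%
(Hangul, not recovered) | 1693 | 8.9%
(Hangul, not recovered) | 1515 | 8.0%
(Hangul, not recovered) | 1450 | 7.6%
(Hangul, not recovered) | 1449 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1296 | 6.8%
(Hangul, not recovered) | 1215 | 6.4%
(Hangul, not recovered) | 379 | 2.0%
Other values (150) | 5398 | 28.5%

Common
Value | Count | Frequency (%)
(space) | 6204 | 43.2%
1 | 1217 | 8.5%
- | 1162 | 8.1%
2 | 950 | 6.6%
3 | 735 | 5.1%
5 | 622 | 4.3%
7 | 613 | 4.3%
4 | 603 | 4.2%
6 | 512 | 3.6%
9 | 447 | 3.1%
Other values (8) | 1306 | 9.1%

Latin
Value | Count | Frequency (%)
A | 5 | 50.0%
B | 3 | 30.0%
D | 1 | 10.0%
C | 1 | 10.0%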

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18968 | 56.9%
ASCII | 14381 | 43.1%

Most frequent character per block

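ASCII
Value | Count | Frequency (%)
(space) | 6204 | 43.1%
1 | 1217 | 8.5%
- | 1162 | 8.1%
2 | 950 | 6.6%
3 | 735 | 5.1%
5 | 622 | 4.3%
7 | 613 | 4.3%
4 | 603 | 4.2%
6 | 512 | 3.6%
9 | 447 | 3.1%
Other values (12) | 1316 | 9.2%

Hangul
Value | Count | Frequency (%)
(Hangul, not recovered) | 1697 | 8.9%
(Hangul, not recovered) | 1693 | 8.9%
(Hangul, not recovered) | 1515 | 8.0%
(Hangul, not recovered) | 1450 | 7.6%
(Hangul, not recovered) | 1449 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1438 | 7.6%
(Hangul, not recovered) | 1296 | 6.8%
(Hangul, not recovered) | 1215 | 6.4%
(Hangul, not recovered) | 379 | 2.0%
Other values (150) | 5398 | 28.5%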

위치
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB
건물 위: 1122
토지 위: 299
건물 위, 토지 위: 6
건물 위: 3
건물, 토지 위: 2
Other values (5): 5

Length

Max length: 10
Median length: 4
Mean length: 4.045929
Min length: 4

Unique

Unique: 5
Unique (%): 0.3%

Sample

1st row: 건물 위
2nd row: 건물 위
3rd row: 건물 위
4th row: 건물 위
5th row: 건물 위

Common Values

Value | Count | Frequency (%)
건물 위 | 1122 | 78.1%
토지 위 | 299 | 20.8%
건물 위, 토지 위 | 6 | 0.4%
건물 위 | 3 | 0.2%
건물, 토지 위 | 2 | 0.1%
건물/주차장 | 1 | 0.1%
건물 및 토지 위 | 1 | 0.1%
건물 위, 토지 위 | 1 | 0.1%
토지 위 | 1 | 0.1%
건물 위,토지 위 | 1 | 0.1%
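
The 74.4% imbalance figure reported for 위치 can be reproduced from the ten category counts above. The sketch below assumes the score is defined as one minus the Shannon entropy of the category distribution divided by its maximum possible value, log2 of the number of categories. That definition is an assumption made here because it matches the reported figure; the report itself does not state the formula.

```python
import math

# Category counts for 위치, copied from the Common Values table.
counts = [1122, 299, 6, 3, 2, 1, 1, 1, 1, 1]
total = sum(counts)

# Shannon entropy of the observed distribution, in bits.
entropy_bits = -sum((c / total) * math.log2(c / total) for c in counts)

# Entropy of a perfectly balanced distribution over the same categories.
max_entropy_bits = math.log2(len(counts))

# Assumed definition: 0 means perfectly balanced, 1 means a single category.
imbalance = 1 - entropy_bits / max_entropy_bits
print(f"{imbalance:.1%}")
```

Under this definition the score is driven mostly by the two dominant categories: the eight rare spellings add little entropy, yet they raise the category count and therefore the maximum against which the entropy is normalised.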

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
위 | 1443 | 49.9%
건물 | 1136 | 39.3%
토지 | 310 | 10.7%
건물/주차장 | 1 | < 0.1%
및 | 1 | < 0.1%
위,토지 | 1 | < 0.1%

최초허가일
Date

Distinct: 505
Distinct (%): 35.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB
Minimum: 2014-02-17 00:00:00
Maximum: 2023-11-28 00:00:00
Histogram with fixed size bins (bins=50)

사업개시일
Date

MISSING 

Distinct: 446
Distinct (%): 44.5%
Missing: 434
Missing (%): 30.2%
Memory size: 11.4 KiB
Minimum: 2014-05-08 00:00:00
Maximum: 2023-11-07 00:00:00
Histogram with fixed size bins (bins=50)

허가취소일
Date

MISSING 

Distinct: 68
Distinct (%): 45.3%
Missing: 1287
Missing (%): 89.6%
Memory size: 11.4 KiB
Minimum: 2015-10-15 00:00:00
Maximum: 2023-12-04 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figures: 9 interaction plots]

Correlations

 | 순번 | 년도 | 용량(KW) | 위치 | 허가취소일
순번 | 1.000 | 0.934 | 0.170 | 0.426 | 0.980
년도 | 0.934 | 1.000 | 0.141 | 0.334 | 0.993
용량(KW) | 0.170 | 0.141 | 1.000 | 0.075 | 0.729
위치 | 0.426 | 0.334 | 0.075 | 1.000 | 0.859
허가취소일 | 0.980 | 0.993 | 0.729 | 0.859 | 1.000

 | 순번 | 년도 | 용량(KW) | 위치
순번 | 1.000 | 0.989 | 0.013 | 0.142
년도 | 0.989 | 1.000 | 0.035 | 0.145
용량(KW) | 0.013 | 0.035 | 1.000 | 0.039
위치 | 0.142 | 0.145 | 0.039 | 1.000

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Heatmap: the correlation heatmap measures nullity correlation, that is, how strongly the presence or absence of one variable affects the presence of another.
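
To make the idea of nullity correlation concrete, the sketch below computes it by hand for the two columns that contain missing values, using only the first ten rows listed in the Sample section. It is an illustration of the measure (a Pearson correlation between missing-value indicators), not a reproduction of the heatmap: ten rows are far too few to estimate the value for the full dataset, and the helper name `pearson` is introduced here for the example.

```python
import math

# Missing-value indicators (1 = <NA>) for sample rows with index 0-9.
start_missing = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]    # 사업개시일
cancel_missing = [0, 1, 1, 1, 0, 1, 1, 1, 0, 1]   # 허가취소일


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


nullity_corr = pearson(start_missing, cancel_missing)
print(round(nullity_corr, 3))
```

A negative value means that, in these rows, a record missing 사업개시일 tends to have 허가취소일 filled in, and vice versa. In this sample the pattern is that permits cancelled before business started have a cancellation date but no start date.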

Sample

 | 순번 | 년도 | 용량(KW) | 설치장소 | 위치 | 최초허가일 | 사업개시일 | 허가취소일
0 | 1 | 2014 | 159.9 | 충청남도 아산시 인주면 도흥리 109-5외 2필지 | 건물 위 | 2014-02-17 | <NA> | 2022-02-11
1 | 2 | 2014 | 49.14 | 충청남도 아산시 도고면 도고산로 309 | 건물 위 | 2014-02-20 | 2014-07-01 | <NA>
2 | 3 | 2014 | 18.72 | 충청남도 아산시 도고면 도산리 78 | 건물 위 | 2014-02-20 | 2014-07-01 | <NA>
3 | 4 | 2014 | 20.0 | 충청남도 아산시 염치읍 산양리 132-4 | 건물 위 | 2014-02-25 | 2014-05-08 | <NA>
4 | 5 | 2014 | 30.0 | 충청남도 아산시 신창면 신정호길 268 | 건물 위 | 2014-02-25 | 2015-01-08 | 2019-02-27
5 | 6 | 2014 | 99.0 | 충청남도 아산시 영인면 신봉길 271-36 | 건물 위 | 2014-02-25 | 2014-09-18 | <NA>
6 | 7 | 2014 | 30.0 | 충청남도 아산시 배방읍 호서로 67-6 | 건물 위 | 2014-03-18 | 2014-07-25 | <NA>
7 | 8 | 2014 | 81.0 | 충청남도 아산시 선장면 선장로 49 | 건물 위 | 2014-03-24 | 2014-12-10 | <NA>
8 | 9 | 2014 | 99.4 | 충청남도 아산시 방축동 276-1 | 토지 위 | 2014-03-25 | <NA> | 2017-02-07
9 | 10 | 2014 | 19.0 | 충청남도 아산시 둔포면 신남동길 67 | 건물 위 | 2014-03-27 | 2014-06-16 | <NA>

 | 순번 | 년도 | 용량(KW) | 설치장소 | 위치 | 최초허가일 | 사업개시일 | 허가취소일
1427 | 1428 | 2023 | 27.0 | 충청남도 아산시 둔포면 염작리 186-1 | 토지 위 | 2023-11-07 | <NA> | <NA>
1428 | 1429 | 2023 | 300.0 | 충청남도 아산시 음봉면 삼거리 78-2, 1동,2동 | 건물 위 | 2023-11-07 | <NA> | <NA>
1429 | 1430 | 2023 | 99.6 | 충청남도 아산시 둔포면 운교리 96-3, 96-4 | 건물 위 | 2023-11-13 | <NA> | <NA>
1430 | 1431 | 2023 | 63.6 | 충청남도 아산시 둔포면 운교리 150, 151-4 | 건물 위 | 2023-11-21 | <NA> | <NA>
1431 | 1432 | 2023 | 56.33 | 충청남도 아산시 둔포면 염작리 87-7 | 건물 위 | 2023-11-23 | <NA> | <NA>
1432 | 1433 | 2023 | 99.56 | 충청남도 아산시 둔포면 신남리 749-3 A동 | 건물 위 | 2023-11-23 | <NA> | <NA>
1433 | 1434 | 2023 | 99.56 | 충청남도 아산시 둔포면 신남리 749-3 A동 | 건물 위 | 2023-11-23 | <NA> | <NA>
1434 | 1435 | 2023 | 99.96 | 충청남도 아산시 음봉면 산동리 950-125 | 건물 위 | 2023-11-24 | <NA> | <NA>
1435 | 1436 | 2023 | 61.88 | 충청남도 아산시 음봉면 산동리 950-125 | 건물 위 | 2023-11-24 | <NA> | <NA>
1436 | 1437 | 2023 | 19.2 | 충청남도 아산시 배방읍 북수리 1172 | 건물 위 | 2023-11-28 | <NA> | <NA>