Overview

Dataset statistics

Number of variables: 5
Number of observations: 68
Missing cells: 2
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 45.9 B

Variable types

Numeric: 3
Text: 1
Categorical: 1

Dataset

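Description: Streetlight installation status data from the Safe Streetlight (안심가로등) project of Korea Hydro & Nuclear Power Co., Ltd. (한국수력원자력(주)). It provides information such as the installation year and installation region of each set of streetlights.
URL: https://www.data.go.kr/data/15117759/fileData.do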

Alerts

태양광 is highly overall correlated with 총 설치본수 [High correlation]
총 설치본수 is highly overall correlated with 태양광 [High correlation]
하이브리드 is highly imbalanced (64.6%) [Imbalance]
태양광 has 1 (1.5%) missing values [Missing]
총 설치본수 has 1 (1.5%) missing values [Missing]
설치지역 has unique values [Unique]

Reproduction

Analysis started: 2023-12-13 00:35:04.339674
Analysis finished: 2023-12-13 00:35:05.413069
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
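
A report of this kind could plausibly be regenerated along the following lines. This is a minimal sketch under stated assumptions, not the original run: the CSV path, the `cp949` encoding, and the report title are hypothetical (none of them appears in the report), and the options stored in config.json are unknown. `ProfileReport`, its `dataset` metadata argument, and `to_file` belong to ydata-profiling's public API.

```python
from pathlib import Path


def build_profile(csv_path: str, output_html: str = "report.html",
                  encoding: str = "cp949") -> Path:
    """Profile a CSV file and write an HTML report.

    The default encoding is an assumption (common for Korean public data),
    not something stated in the report.
    """
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(
        df,
        title="KHNP Safe Streetlight installations",  # hypothetical title
        dataset={
            "description": "Streetlight installation status data (KHNP Safe Streetlight project)",
            "url": "https://www.data.go.kr/data/15117759/fileData.do",
        },
    )
    out = Path(output_html)
    profile.to_file(out)
    return out
```

Calling `build_profile("streetlights.csv")` (file name hypothetical) would write `report.html` to the working directory.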

Variables

설치년도 (installation year)
Real number (ℝ)

Distinct: 9
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.8529
Minimum: 2014
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 2014
5th percentile: 2015
Q1: 2017
Median: 2019
Q3: 2021
95th percentile: 2022
Maximum: 2022
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.4018942
Coefficient of variation (CV): 0.0011897321
Kurtosis: -1.2181933
Mean: 2018.8529
Median Absolute Deviation (MAD): 2
Skewness: -0.22112614
Sum: 137282
Variance: 5.7690957
Monotonicity: Increasing
[Figure: Histogram with fixed size bins (bins=9)]
Common values
Value | Count | Frequency (%)
2022 | 12 | 17.6%
2021 | 11 | 16.2%
2016 | 8 | 11.8%
2018 | 8 | 11.8%
2019 | 8 | 11.8%
2017 | 7 | 10.3%
2020 | 7 | 10.3%
2015 | 6 | 8.8%
2014 | 1 | 1.5%

Minimum values
Value | Count | Frequency (%)
2014 | 1 | 1.5%
2015 | 6 | 8.8%
2016 | 8 | 11.8%
2017 | 7 | 10.3%
2018 | 8 | 11.8%
2019 | 8 | 11.8%
2020 | 7 | 10.3%
2021 | 11 | 16.2%
2022 | 12 | 17.6%

Maximum values
Value | Count | Frequency (%)
2022 | 12 | 17.6%
2021 | 11 | 16.2%
2020 | 7 | 10.3%
2019 | 8 | 11.8%
2018 | 8 | 11.8%
2017 | 7 | 10.3%
2016 | 8 | 11.8%
2015 | 6 | 8.8%
2014 | 1 | 1.5%
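
The descriptive statistics for 설치년도 can be cross-checked from its value counts alone. The sketch below expands the counts into the 68 observations and recomputes the sum, mean, variance, standard deviation, coefficient of variation, median, and range using only the Python standard library. It assumes the reported variance uses the sample (n - 1) denominator, which is what reproduces the figure above.

```python
import statistics

# Value counts for 설치년도, copied from the frequency table above.
year_counts = {2014: 1, 2015: 6, 2016: 8, 2017: 7, 2018: 8,
               2019: 8, 2020: 7, 2021: 11, 2022: 12}

# Expand the counts back into the individual observations.
years = [year for year, n in year_counts.items() for _ in range(n)]

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(years)      # square root of the sample variance
cv = std_dev / mean                    # coefficient of variation
median = statistics.median(years)
value_range = max(years) - min(years)

print(len(years), total, round(mean, 4), round(variance, 4),
      round(std_dev, 4), median, value_range)
```

Because the mean is around 2019 while the spread is only a few years, the coefficient of variation is tiny (about 0.0012). For a calendar-year column that ratio carries little meaning.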

설치지역 (installation region)
Text

UNIQUE 

Distinct: 68
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 78
Median length: 29
Mean length: 17.808824
Min length: 9

Characters and Unicode

Total characters: 1211
Distinct characters: 183
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
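
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the five 설치지역 values listed under Sample for this variable. It is not the profiler's own implementation, only the same idea in miniature. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Zs` Space Separator, `Po` Other Punctuation, `Nd` Decimal Number, `Ps` Open Punctuation, and `Pe` Close Punctuation.

```python
import unicodedata
from collections import Counter

# First five 설치지역 values, as listed in the Sample block of this variable.
samples = [
    "서울시 서대문구 홍제동",
    "경북 경주시 석장동",
    "경북 영덕군 영덕읍",
    "대전 유성구 자운동",
    "부산 서구 남부민동",
]


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Aggregate the per-string counts over all sample values.
totals = Counter()
for value in samples:
    totals.update(category_counts(value))

print(totals)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.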

Unique

Unique: 68
Unique (%): 100.0%

Sample

1st row: 서울시 서대문구 홍제동
2nd row: 경북 경주시 석장동
3rd row: 경북 영덕군 영덕읍
4th row: 대전 유성구 자운동
5th row: 부산 서구 남부민동
Value | Count | Frequency (%)
경북 | 21 | 7.6%
경주시 | 12 | 4.3%
강원도 | 8 | 2.9%
전남 | 7 | 2.5%
전북 | 7 | 2.5%
충북 | 4 | 1.4%
일대 | 4 | 1.4%
고창군 | 3 | 1.1%
경남 | 3 | 1.1%
충남 | 3 | 1.1%
Other values (194) | 206 | 74.1%

Most occurring characters

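Value | Count | Frequency (%)
(space) | 210 | 17.3%
, | 55 | 4.5%
(not preserved) | 46 | 3.8%
(not preserved) | 39 | 3.2%
(not preserved) | 38 | 3.1%
(not preserved) | 32 | 2.6%
(not preserved) | 31 | 2.6%
(not preserved) | 31 | 2.6%
(not preserved) | 27 | 2.2%
(not preserved) | 26 | 2.1%
Other values (173) | 676 | 55.8%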

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 930 | 76.8%
Space Separator | 210 | 17.3%
Other Punctuation | 55 | 4.5%
Decimal Number | 10 | 0.8%
Open Punctuation | 3 | 0.2%
Close Punctuation | 3 | 0.2%

Most frequent character per category

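Other Letter
Value | Count | Frequency (%)
(not preserved) | 46 | 4.9%
(not preserved) | 39 | 4.2%
(not preserved) | 38 | 4.1%
(not preserved) | 32 | 3.4%
(not preserved) | 31 | 3.3%
(not preserved) | 31 | 3.3%
(not preserved) | 27 | 2.9%
(not preserved) | 26 | 2.8%
(not preserved) | 25 | 2.7%
(not preserved) | 24 | 2.6%
Other values (165) | 611 | 65.7%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 40.0%
4 | 3 | 30.0%
2 | 2 | 20.0%
9 | 1 | 10.0%

Space Separator
Value | Count | Frequency (%)
(space) | 210 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 55 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%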

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 930 | 76.8%
Common | 281 | 23.2%

Most frequent character per script

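Hangul
Value | Count | Frequency (%)
(not preserved) | 46 | 4.9%
(not preserved) | 39 | 4.2%
(not preserved) | 38 | 4.1%
(not preserved) | 32 | 3.4%
(not preserved) | 31 | 3.3%
(not preserved) | 31 | 3.3%
(not preserved) | 27 | 2.9%
(not preserved) | 26 | 2.8%
(not preserved) | 25 | 2.7%
(not preserved) | 24 | 2.6%
Other values (165) | 611 | 65.7%

Common
Value | Count | Frequency (%)
(space) | 210 | 74.7%
, | 55 | 19.6%
1 | 4 | 1.4%
( | 3 | 1.1%
) | 3 | 1.1%
4 | 3 | 1.1%
2 | 2 | 0.7%
9 | 1 | 0.4%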

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 930 | 76.8%
ASCII | 281 | 23.2%

Most frequent character per block

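ASCII
Value | Count | Frequency (%)
(space) | 210 | 74.7%
, | 55 | 19.6%
1 | 4 | 1.4%
( | 3 | 1.1%
) | 3 | 1.1%
4 | 3 | 1.1%
2 | 2 | 0.7%
9 | 1 | 0.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 46 | 4.9%
(not preserved) | 39 | 4.2%
(not preserved) | 38 | 4.1%
(not preserved) | 32 | 3.4%
(not preserved) | 31 | 3.3%
(not preserved) | 31 | 3.3%
(not preserved) | 27 | 2.9%
(not preserved) | 26 | 2.8%
(not preserved) | 25 | 2.7%
(not preserved) | 24 | 2.6%
Other values (165) | 611 | 65.7%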

태양광 (solar)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 24
Distinct (%): 35.8%
Missing: 1
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.820896
Minimum: 2
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 2
5th percentile: 15.1
Q1: 37
Median: 42
Q3: 50
95th percentile: 69.7
Maximum: 100
Range: 98
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 17.278403
Coefficient of variation (CV): 0.41315238
Kurtosis: 2.7662978
Mean: 41.820896
Median Absolute Deviation (MAD): 6
Skewness: 0.85606269
Sum: 2802
Variance: 298.54319
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=24)]
Common values
Value | Count | Frequency (%)
42 | 13 | 19.1%
50 | 13 | 19.1%
37 | 7 | 10.3%
40 | 7 | 10.3%
38 | 4 | 5.9%
21 | 3 | 4.4%
9 | 2 | 2.9%
20 | 2 | 2.9%
22 | 1 | 1.5%
2 | 1 | 1.5%
Other values (14) | 14 | 20.6%

Minimum 10 values
Value | Count | Frequency (%)
2 | 1 | 1.5%
9 | 2 | 2.9%
13 | 1 | 1.5%
20 | 2 | 2.9%
21 | 3 | 4.4%
22 | 1 | 1.5%
25 | 1 | 1.5%
28 | 1 | 1.5%
31 | 1 | 1.5%
36 | 1 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
100 | 1 | 1.5%
92 | 1 | 1.5%
90 | 1 | 1.5%
70 | 1 | 1.5%
69 | 1 | 1.5%
66 | 1 | 1.5%
61 | 1 | 1.5%
50 | 13 | 19.1%
48 | 1 | 1.5%
42 | 13 | 19.1%

하이브리드 (hybrid)
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.5882353
Min length: 1

Unique

Unique: 2
Unique (%): 2.9%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 58 | 85.3%
1 | 4 | 5.9%
10 | 2 | 2.9%
9 | 2 | 2.9%
6 | 1 | 1.5%
4 | 1 | 1.5%
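
The 64.6% imbalance reported for 하이브리드 in the Alerts is consistent with one minus the normalised Shannon entropy of these class counts. The report does not state its formula, so the definition in the sketch below is an assumption; it evaluates to roughly 0.646 on the counts above. Note that the literal string `<NA>` is counted as a category of its own here, which is also why the column reports 0 missing values.

```python
import math

# Value counts for 하이브리드 from the Common Values table.
# "<NA>" is a literal string category here, not a true missing value.
counts = {"<NA>": 58, "1": 4, "10": 2, "9": 2, "6": 1, "4": 1}


def imbalance_score(class_counts):
    """Return 1 - H / log(k) for k classes with Shannon entropy H.

    The score is 0 for a uniform distribution and approaches 1 as a single
    class dominates. This definition is assumed, not taken from the report.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2 or total == 0:
        return 0.0  # arbitrary convention for the degenerate case
    entropy = -sum((c / total) * math.log(c / total)
                   for c in class_counts if c > 0)
    return 1.0 - entropy / math.log(k)


score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```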

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

총 설치본수 (total units installed)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 25
Distinct (%): 37.3%
Missing: 1
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.597015
Minimum: 2
Maximum: 110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 2
5th percentile: 20
Q1: 37
Median: 42
Q3: 50
95th percentile: 76.7
Maximum: 110
Range: 108
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 17.70641
Coefficient of variation (CV): 0.41567256
Kurtosis: 3.8281437
Mean: 42.597015
Median Absolute Deviation (MAD): 6
Skewness: 1.1648938
Sum: 2854
Variance: 313.51696
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=25)]
Common values
Value | Count | Frequency (%)
50 | 12 | 17.6%
42 | 11 | 16.2%
37 | 8 | 11.8%
40 | 7 | 10.3%
38 | 4 | 5.9%
9 | 2 | 2.9%
20 | 2 | 2.9%
21 | 2 | 2.9%
43 | 2 | 2.9%
22 | 2 | 2.9%
Other values (15) | 15 | 22.1%

Minimum 10 values
Value | Count | Frequency (%)
2 | 1 | 1.5%
9 | 2 | 2.9%
20 | 2 | 2.9%
21 | 2 | 2.9%
22 | 2 | 2.9%
25 | 1 | 1.5%
28 | 1 | 1.5%
35 | 1 | 1.5%
36 | 1 | 1.5%
37 | 8 | 11.8%

Maximum 10 values
Value | Count | Frequency (%)
110 | 1 | 1.5%
92 | 1 | 1.5%
90 | 1 | 1.5%
80 | 1 | 1.5%
69 | 1 | 1.5%
66 | 1 | 1.5%
61 | 1 | 1.5%
51 | 1 | 1.5%
50 | 12 | 17.6%
48 | 1 | 1.5%

Interactions

[Figure: interaction plots (9 panels)]

Correlations

Variable | 설치년도 | 설치지역 | 태양광 | 하이브리드 | 총 설치본수
설치년도 | 1.000 | 1.000 | 0.705 | 0.814 | 0.560
설치지역 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
태양광 | 0.705 | 1.000 | 1.000 | 0.765 | 0.941
하이브리드 | 0.814 | 1.000 | 0.765 | 1.000 | 0.484
총 설치본수 | 0.560 | 1.000 | 0.941 | 0.484 | 1.000

Variable | 설치년도 | 태양광 | 총 설치본수 | 하이브리드
설치년도 | 1.000 | -0.163 | -0.141 | 0.358
태양광 | -0.163 | 1.000 | 0.996 | 0.285
총 설치본수 | -0.141 | 0.996 | 1.000 | 0.112
하이브리드 | 0.358 | 0.285 | 0.112 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

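First rows
Index | 설치년도 | 설치지역 | 태양광 | 하이브리드 | 총 설치본수
0 | 2014 | 서울시 서대문구 홍제동 | 37 | <NA> | 37
1 | 2015 | 경북 경주시 석장동 | 66 | <NA> | 66
2 | 2015 | 경북 영덕군 영덕읍 | 69 | <NA> | 69
3 | 2015 | 대전 유성구 자운동 | 9 | <NA> | 9
4 | 2015 | 부산 서구 남부민동 | 36 | <NA> | 36
5 | 2015 | 서울 금천구 시흥동 | 25 | <NA> | 25
6 | 2015 | 전북 고창군 고창읍 | 48 | <NA> | 48
7 | 2016 | 강원도 횡성군 횡성읍 | 40 | <NA> | 40
8 | 2016 | 경기도 가평군 청평면 | 38 | <NA> | 38
9 | 2016 | 경북 경주시 북군동 | 21 | <NA> | 21

Last rows
Index | 설치년도 | 설치지역 | 태양광 | 하이브리드 | 총 설치본수
58 | 2022 | 경북 울진군 화성리 | 40 | <NA> | 40
59 | 2022 | 경북 청송군 파천면 | 40 | <NA> | 40
60 | 2022 | 경북 경주시 현곡면, 외동읍, 경주여고 일대 | 50 | 1 | 51
61 | 2022 | 경남 창원시 성산구 불모산동 | 40 | <NA> | 40
62 | 2022 | 전북 김제시 금산면 | 50 | <NA> | 50
63 | 2022 | 전북 무주군 적상면 | 40 | <NA> | 40
64 | 2022 | 전남 해남군 삼마도 | 20 | <NA> | 20
65 | 2022 | 충북 영동군 둔전리 | 20 | <NA> | 20
66 | 2022 | 부산광역시 12개 학교(삼성중,부산중,개성고,대상초,가람중,덕문고,반여중,부산여자상고,부산센텀여고,대양고,학산여고,혜화여고) | 28 | 9 | 37
67 | 2022 | 충남 예산군 예화여고 일대 | 2 | <NA> | 2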
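
In the rows shown, 총 설치본수 equals 태양광 plus 하이브리드 whenever `<NA>` is read as "no hybrid units". The column sums in this report agree with that reading: 2854 - 2802 = 52, which is also the sum of the non-`<NA>` 하이브리드 values. The sketch below checks this on the last ten sample rows using only the standard library. Treating `<NA>` as zero is an assumption about the data's meaning, and the helper names are hypothetical.

```python
from typing import Optional

NA_TOKEN = "<NA>"  # literal string shown in the 하이브리드 column


def parse_count(raw: str) -> Optional[int]:
    """Convert a cell to int, mapping the '<NA>' token (or an empty cell) to None."""
    raw = raw.strip()
    if raw in ("", NA_TOKEN):
        return None
    return int(raw)


def is_consistent(solar: str, hybrid: str, total: str) -> bool:
    """Check total == solar + hybrid, reading a missing part as 0 (assumption)."""
    t = parse_count(total)
    if t is None:
        return False  # a row without a total cannot be checked
    s = parse_count(solar) or 0
    h = parse_count(hybrid) or 0
    return s + h == t


# (태양광, 하이브리드, 총 설치본수) for rows 58-67 of the sample.
rows = [
    ("40", "<NA>", "40"), ("40", "<NA>", "40"), ("50", "1", "51"),
    ("40", "<NA>", "40"), ("50", "<NA>", "50"), ("40", "<NA>", "40"),
    ("20", "<NA>", "20"), ("20", "<NA>", "20"), ("28", "9", "37"),
    ("2", "<NA>", "2"),
]

mismatches = [row for row in rows if not is_consistent(*row)]
print(f"{len(rows) - len(mismatches)} of {len(rows)} rows are consistent")
```

If the relationship holds across the full dataset, it would also explain the high correlation between 태양광 and 총 설치본수: hybrid units are absent in 58 of 68 rows, so the two columns are usually identical.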