Overview

Dataset statistics

Number of variables: 6
Number of observations: 56
Missing cells: 42
Missing cells (%): 12.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 52.4 B
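
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming that "missing cells (%)" is the share of empty cells in the full table and that the average record size is total memory divided by the number of rows (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 56
missing_cells = 42

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = 52.4
total_kib = avg_record_bytes * n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approximate total size: {total_kib:.1f} KiB")
```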

Variable types

Numeric: 2
Categorical: 1
Text: 3

Dataset

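Description: Status of new and renewable energy deployment in the private sector (residential) in Incheon Metropolitan City (인천광역시). The dataset provides the energy source type, capacity, and deployment results.
Author: Incheon Metropolitan City (인천광역시)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15102414&srcSe=7661IVAWM27C61E190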

Alerts

연도 (year) is highly overall correlated with 예산액(천원) (budget, thousand KRW) [High correlation]
예산액(천원) is highly overall correlated with 연도 and 1 other field [High correlation]
에너지원 (energy source) is highly overall correlated with 예산액(천원) [High correlation]
예산액(천원) has 42 (75.0%) missing values [Missing]

Reproduction

Analysis started: 2024-03-18 04:59:56.658733
Analysis finished: 2024-03-18 04:59:58.871550
Duration: 2.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 14
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.5
Minimum: 2009
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2012
Median: 2015.5
Q3: 2019
95-th percentile: 2022
Maximum: 2022
Range: 13
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.0676104
Coefficient of variation (CV): 0.0020181644
Kurtosis: -1.2126431
Mean: 2015.5
Median Absolute Deviation (MAD): 3.5
Skewness: 0
Sum: 112868
Variance: 16.545455
Monotonicity: Decreasing
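
Several of these figures can be re-derived with the Python standard library. The sketch below assumes the column holds each year from 2009 to 2022 exactly four times, as the frequency tables for this variable indicate, and that the reported variance and standard deviation use the sample (n - 1) denominator. The list `years` is a reconstruction for illustration, not the original data file.

```python
import statistics

# Assumption: 14 distinct years x 4 rows each = 56 observations.
years = [year for year in range(2009, 2023) for _ in range(4)]

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(years)       # sample standard deviation
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(years)
# Median absolute deviation around the median.
mad = statistics.median([abs(y - median) for y in years])

print(f"sum={sum(years)} mean={mean} median={median}")
print(f"variance={variance:.6f} std={std_dev:.7f} cv={cv:.10f} mad={mad}")
```

The very small coefficient of variation reflects that the years sit in a narrow band around a large mean, so it says little about spread for a calendar-year column.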
Histogram with fixed size bins (bins=14)
Common values
Value | Count | Frequency (%)
2022 | 4 | 7.1%
2021 | 4 | 7.1%
2020 | 4 | 7.1%
2019 | 4 | 7.1%
2018 | 4 | 7.1%
2017 | 4 | 7.1%
2016 | 4 | 7.1%
2015 | 4 | 7.1%
2014 | 4 | 7.1%
2013 | 4 | 7.1%
Other values (4) | 16 | 28.6%

Lowest values (ascending)
Value | Count | Frequency (%)
2009 | 4 | 7.1%
2010 | 4 | 7.1%
2011 | 4 | 7.1%
2012 | 4 | 7.1%
2013 | 4 | 7.1%
2014 | 4 | 7.1%
2015 | 4 | 7.1%
2016 | 4 | 7.1%
2017 | 4 | 7.1%
2018 | 4 | 7.1%

Highest values (descending)
Value | Count | Frequency (%)
2022 | 4 | 7.1%
2021 | 4 | 7.1%
2020 | 4 | 7.1%
2019 | 4 | 7.1%
2018 | 4 | 7.1%
2017 | 4 | 7.1%
2016 | 4 | 7.1%
2015 | 4 | 7.1%
2014 | 4 | 7.1%
2013 | 4 | 7.1%

에너지원 (energy source)
Categorical

HIGH CORRELATION 

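Distinct: 4
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Category counts:
태양광(킬로와트) (solar photovoltaic, kW): 14
태양열(제곱미터) (solar thermal, m²): 14
지열(킬로와트) (geothermal, kW): 14
연료전지(킬로와트) (fuel cell, kW): 14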

Length

Max length: 10
Median length: 9.5
Mean length: 9
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 태양광(킬로와트)
2nd row: 태양열(제곱미터)
3rd row: 지열(킬로와트)
4th row: 연료전지(킬로와트)
5th row: 태양광(킬로와트)

Common Values

Value | Count | Frequency (%)
태양광(킬로와트) | 14 | 25.0%
태양열(제곱미터) | 14 | 25.0%
지열(킬로와트) | 14 | 25.0%
연료전지(킬로와트) | 14 | 25.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
태양광(킬로와트) | 14 | 25.0%
태양열(제곱미터) | 14 | 25.0%
지열(킬로와트) | 14 | 25.0%
연료전지(킬로와트) | 14 | 25.0%

가 구 수 (number of households)
Text

Distinct: 36
Distinct (%): 64.3%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.0535714
Min length: 1

Characters and Unicode

Total characters: 115
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
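
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count characters by general category. It runs on the five sample values listed for this variable and assumes a blank cell is stored as a single space character. The mapping `CATEGORY_NAMES` is a hypothetical helper covering only the two categories that appear in this report. It is not the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter

# Readable names for the two general categories seen in this report.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Zs": "Space Separator"}

# Sample values for illustration. Assumption: a blank cell is stored as a
# single space character.
values = ["677", "1", "12", " ", "794"]

chars = "".join(values)
by_category = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for ch in chars
)
by_char = Counter(chars)

print(by_category)              # counts per Unicode general category
print(by_char.most_common(3))   # most frequent characters
```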

Unique

Unique: 30
Unique (%): 53.6%

Sample

1st row: 677
2nd row: 1
3rd row: 12
4th row: (blank)
5th row: 794
Value | Count | Frequency (%)
1 | 3 | 7.3%
6 | 2 | 4.9%
2 | 2 | 4.9%
101 | 2 | 4.9%
3 | 2 | 4.9%
10 | 1 | 2.4%
105 | 1 | 2.4%
37 | 1 | 2.4%
18 | 1 | 2.4%
13 | 1 | 2.4%
Other values (25) | 25 | 61.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 30 | 26.1%
1 | 24 | 20.9%
2 | 15 | 13.0%
6 | 8 | 7.0%
7 | 8 | 7.0%
0 | 7 | 6.1%
4 | 7 | 6.1%
3 | 6 | 5.2%
5 | 5 | 4.3%
9 | 4 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 85 | 73.9%
Space Separator | 30 | 26.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 24 | 28.2%
2 | 15 | 17.6%
6 | 8 | 9.4%
7 | 8 | 9.4%
0 | 7 | 8.2%
4 | 7 | 8.2%
3 | 6 | 7.1%
5 | 5 | 5.9%
9 | 4 | 4.7%
8 | 1 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 115 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 30 | 26.1%
1 | 24 | 20.9%
2 | 15 | 13.0%
6 | 8 | 7.0%
7 | 8 | 7.0%
0 | 7 | 6.1%
4 | 7 | 6.1%
3 | 6 | 5.2%
5 | 5 | 4.3%
9 | 4 | 3.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 115 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 30 | 26.1%
1 | 24 | 20.9%
2 | 15 | 13.0%
6 | 8 | 7.0%
7 | 8 | 7.0%
0 | 7 | 6.1%
4 | 7 | 6.1%
3 | 6 | 5.2%
5 | 5 | 4.3%
9 | 4 | 3.5%

용 량 (capacity)
Text

Distinct: 40
Distinct (%): 71.4%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.6428571
Min length: 1

Characters and Unicode

Total characters: 148
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 66.1%

Sample

1st row: 2031
2nd row: 6
3rd row: 210
4th row: (blank)
5th row: 2382
Value | Count | Frequency (%)
210 | 2 | 4.9%
6 | 2 | 4.9%
315 | 1 | 2.4%
639 | 1 | 2.4%
277 | 1 | 2.4%
1484 | 1 | 2.4%
10 | 1 | 2.4%
3 | 1 | 2.4%
393 | 1 | 2.4%
268 | 1 | 2.4%
Other values (29) | 29 | 70.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 30 | 20.3%
1 | 25 | 16.9%
3 | 20 | 13.5%
2 | 16 | 10.8%
0 | 11 | 7.4%
6 | 11 | 7.4%
8 | 9 | 6.1%
7 | 8 | 5.4%
5 | 8 | 5.4%
4 | 6 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 118 | 79.7%
Space Separator | 30 | 20.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 25 | 21.2%
3 | 20 | 16.9%
2 | 16 | 13.6%
0 | 11 | 9.3%
6 | 11 | 9.3%
8 | 9 | 7.6%
7 | 8 | 6.8%
5 | 8 | 6.8%
4 | 6 | 5.1%
9 | 4 | 3.4%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 148 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 30 | 20.3%
1 | 25 | 16.9%
3 | 20 | 13.5%
2 | 16 | 10.8%
0 | 11 | 7.4%
6 | 11 | 7.4%
8 | 9 | 6.1%
7 | 8 | 5.4%
5 | 8 | 5.4%
4 | 6 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 148 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 30 | 20.3%
1 | 25 | 16.9%
3 | 20 | 13.5%
2 | 16 | 10.8%
0 | 11 | 7.4%
6 | 11 | 7.4%
8 | 9 | 6.1%
7 | 8 | 5.4%
5 | 8 | 5.4%
4 | 6 | 4.1%

예산액(천원) (budget, thousand KRW)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 7
Distinct (%): 50.0%
Missing: 42
Missing (%): 75.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 492071.43
Minimum: 400000
Maximum: 830000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 400000
5-th percentile: 400000
Q1: 423000
Median: 450000
Q3: 503750
95-th percentile: 680500
Maximum: 830000
Range: 430000
Interquartile range (IQR): 80750

Descriptive statistics

Standard deviation: 117093.3
Coefficient of variation (CV): 0.23795996
Kurtosis: 4.9998824
Mean: 492071.43
Median Absolute Deviation (MAD): 43000
Skewness: 2.1197136
Sum: 6889000
Variance: 1.3710841 × 10^10
Monotonicity: Not monotonic
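
The quantile and descriptive figures for this variable are computed over the 14 non-missing values only. The sketch below rebuilds those values from the variable's frequency table and recomputes the statistics, assuming linearly interpolated quantiles and a sample (n - 1) standard deviation. The helper `quantile` is a hypothetical illustration, not the profiling tool's code.

```python
import math
import statistics

# Non-missing values rebuilt from the frequency table (value: count).
# The 42 missing rows are left out.
counts = {400000: 3, 414000: 1, 450000: 5, 485000: 1,
          510000: 1, 600000: 2, 830000: 1}
budget = sorted(v for v, n in counts.items() for _ in range(n))

def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

q1, med, q3 = (quantile(budget, q) for q in (0.25, 0.5, 0.75))
p95 = quantile(budget, 0.95)
iqr = q3 - q1
mean = statistics.mean(budget)
std_dev = statistics.stdev(budget)   # sample standard deviation

print(f"Q1={q1} median={med} Q3={q3} IQR={iqr} P95={p95:.0f}")
print(f"sum={sum(budget)} mean={mean:.2f} std={std_dev:.1f}")
```

Because three quarters of the rows have no budget value, these statistics describe only the rows where a budget was recorded.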
Histogram with fixed size bins (bins=7)
Common values
Value | Count | Frequency (%)
450000 | 5 | 8.9%
400000 | 3 | 5.4%
600000 | 2 | 3.6%
485000 | 1 | 1.8%
510000 | 1 | 1.8%
830000 | 1 | 1.8%
414000 | 1 | 1.8%
(Missing) | 42 | 75.0%

Lowest values (ascending)
Value | Count | Frequency (%)
400000 | 3 | 5.4%
414000 | 1 | 1.8%
450000 | 5 | 8.9%
485000 | 1 | 1.8%
510000 | 1 | 1.8%
600000 | 2 | 3.6%
830000 | 1 | 1.8%

Highest values (descending)
Value | Count | Frequency (%)
830000 | 1 | 1.8%
600000 | 2 | 3.6%
510000 | 1 | 1.8%
485000 | 1 | 1.8%
450000 | 5 | 8.9%
414000 | 1 | 1.8%
400000 | 3 | 5.4%

보조금(천원) (subsidy, thousand KRW)
Text

Distinct: 42
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 6
Median length: 5
Mean length: 4.375
Min length: 2

Characters and Unicode

Total characters: 245
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 73.2%

Sample

1st row: 440950
2nd row: 800
3rd row: 28320
4th row: (blank)
5th row: 436940
Value | Count | Frequency (%)
2000 | 1 | 2.4%
161580 | 1 | 2.4%
410280 | 1 | 2.4%
5320 | 1 | 2.4%
189200 | 1 | 2.4%
41200 | 1 | 2.4%
219600 | 1 | 2.4%
290400 | 1 | 2.4%
7500 | 1 | 2.4%
127100 | 1 | 2.4%
Other values (31) | 31 | 75.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 75 | 30.6%
(space) | 30 | 12.2%
1 | 28 | 11.4%
2 | 23 | 9.4%
5 | 17 | 6.9%
7 | 15 | 6.1%
8 | 14 | 5.7%
4 | 12 | 4.9%
9 | 12 | 4.9%
3 | 12 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 215 | 87.8%
Space Separator | 30 | 12.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 75 | 34.9%
1 | 28 | 13.0%
2 | 23 | 10.7%
5 | 17 | 7.9%
7 | 15 | 7.0%
8 | 14 | 6.5%
4 | 12 | 5.6%
9 | 12 | 5.6%
3 | 12 | 5.6%
6 | 7 | 3.3%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 245 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 75 | 30.6%
(space) | 30 | 12.2%
1 | 28 | 11.4%
2 | 23 | 9.4%
5 | 17 | 6.9%
7 | 15 | 6.1%
8 | 14 | 5.7%
4 | 12 | 4.9%
9 | 12 | 4.9%
3 | 12 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 245 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 75 | 30.6%
(space) | 30 | 12.2%
1 | 28 | 11.4%
2 | 23 | 9.4%
5 | 17 | 6.9%
7 | 15 | 6.1%
8 | 14 | 5.7%
4 | 12 | 4.9%
9 | 12 | 4.9%
3 | 12 | 4.9%

Interactions

[Figure: four interaction plots; images not preserved]

Correlations

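Variable | 연도 | 에너지원 | 가 구 수 | 용 량 | 예산액(천원) | 보조금(천원)
연도 | 1.000 | 0.000 | 0.131 | 0.292 | 0.764 | 0.452
에너지원 | 0.000 | 1.000 | 0.878 | 0.927 | NaN | 0.878
가 구 수 | 0.131 | 0.878 | 1.000 | 0.999 | 1.000 | 1.000
용 량 | 0.292 | 0.927 | 0.999 | 1.000 | 1.000 | 1.000
예산액(천원) | 0.764 | NaN | 1.000 | 1.000 | 1.000 | 1.000
보조금(천원) | 0.452 | 0.878 | 1.000 | 1.000 | 1.000 | 1.000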
Variable | 연도 | 예산액(천원) | 에너지원
연도 | 1.000 | 0.852 | 0.000
예산액(천원) | 0.852 | 1.000 | 1.000
에너지원 | 0.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

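First rows
Index | 연도 | 에너지원 | 가 구 수 | 용 량 | 예산액(천원) | 보조금(천원)
0 | 2022 | 태양광(킬로와트) | 677 | 2031 | 485000 | 440950
1 | 2022 | 태양열(제곱미터) | 1 | 6 | <NA> | 800
2 | 2022 | 지열(킬로와트) | 12 | 210 | <NA> | 28320
3 | 2022 | 연료전지(킬로와트) | (blank) | (blank) | <NA> | (blank)
4 | 2021 | 태양광(킬로와트) | 794 | 2382 | 510000 | 436940
5 | 2021 | 태양열(제곱미터) | (blank) | (blank) | <NA> | (blank)
6 | 2021 | 지열(킬로와트) | 6 | 105 | <NA> | 11700
7 | 2021 | 연료전지(킬로와트) | (blank) | (blank) | <NA> | (blank)
8 | 2020 | 태양광(킬로와트) | 907 | 2721 | 600000 | 548040
9 | 2020 | 태양열(제곱미터) | 1 | 6 | <NA> | 500

Last rows
Index | 연도 | 에너지원 | 가 구 수 | 용 량 | 예산액(천원) | 보조금(천원)
46 | 2011 | 지열(킬로와트) | 37 | 639 | <NA> | 111000
47 | 2011 | 연료전지(킬로와트) | (blank) | (blank) | <NA> | (blank)
48 | 2010 | 태양광(킬로와트) | 105 | 315 | 400000 | 177860
49 | 2010 | 태양열(제곱미터) | 52 | 1484 | <NA> | 198000
50 | 2010 | 지열(킬로와트) | (blank) | (blank) | <NA> | (blank)
51 | 2010 | 연료전지(킬로와트) | (blank) | (blank) | <NA> | (blank)
52 | 2009 | 태양광(킬로와트) | 101 | 277 | 400000 | 198700
53 | 2009 | 태양열(제곱미터) | 42 | 1201 | <NA> | 197500
54 | 2009 | 지열(킬로와트) | (blank) | (blank) | <NA> | (blank)
55 | 2009 | 연료전지(킬로와트) | (blank) | (blank) | <NA> | (blank)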
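
The columns 가 구 수, 용 량 and 보조금(천원) are profiled as Text even though their non-blank values are whole numbers. The blank cells appear to be what keeps them from being read as numeric. A minimal sketch of one way to convert them before analysis, assuming blank cells contain only whitespace (the helper `parse_count` and the row tuples are illustrative, not part of the dataset or the profiling tool):

```python
from typing import Optional

def parse_count(cell: str) -> Optional[int]:
    """Return the integer in `cell`, or None when the cell is blank."""
    text = cell.strip()
    return int(text) if text else None

# First four sample rows as raw strings: (가 구 수, 용 량, 보조금(천원)).
# Assumption: blank cells hold only whitespace.
raw_rows = [
    ("677", "2031", "440950"),
    ("1", "6", "800"),
    ("12", "210", "28320"),
    (" ", " ", " "),
]

parsed = [tuple(parse_count(cell) for cell in row) for row in raw_rows]
for row in parsed:
    print(row)
```

After such a conversion the blank rows would count as missing values, so the missing-cell totals reported in the overview would rise accordingly.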