Overview

Dataset statistics

Number of variables: 6
Number of observations: 996
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 50.7 KiB
Average record size in memory: 52.1 B
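
As a quick consistency check, the average record size should follow from dividing the total in-memory size by the number of observations. The sketch below assumes 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures taken from the dataset statistics above.
total_size_kib = 50.7   # "Total size in memory"
n_observations = 996    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # compare with "Average record size in memory"
```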

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

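Description: One of the statistical indices that Gimhae City developed to assess urban conditions on a statistical basis. It consists of the statistical year (통계연도), province or metropolitan city name (시도명), city/county/district name (시군구명), energy use per unit of building floor area, that is, per square meter (건물 단위면적(1제곱미터)당 에너지 사용량(toe)), total floor area in square meters (연면적(제곱미터)), and energy use in toe (에너지 사용량(toe)). Because the index is centered on Gimhae City, information for other regions may be missing owing to difficulties in collecting and processing the data.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110111/fileData.do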

Alerts

건물 단위면적(1제곱미터)당 에너지 사용량(toe) is highly overall correlated with 연면적(제곱미터) and 1 other field (High correlation)
연면적(제곱미터) is highly overall correlated with 건물 단위면적(1제곱미터)당 에너지 사용량(toe) and 1 other field (High correlation)
에너지 사용량(toe) is highly overall correlated with 건물 단위면적(1제곱미터)당 에너지 사용량(toe) and 1 other field (High correlation)
연면적(제곱미터) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 06:32:44.801870
Analysis finished: 2023-12-12 06:32:46.873316
Duration: 2.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통계연도
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

2018: 249
2019: 249
2020: 249
2021: 249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value  Count  Frequency (%)
2018   249    25.0%
2019   249    25.0%
2020   249    25.0%
2021   249    25.0%

Length

Histogram of lengths of the category


시도명
Categorical

Distinct: 16
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

경기도: 168
서울특별시: 100
경상북도: 96
전라남도: 88
경상남도: 88
Other values (11): 456

Length

Max length: 7
Median length: 5
Mean length: 4.0803213
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  168  16.9%
서울특별시  100  10.0%
경상북도  96  9.6%
전라남도  88  8.8%
경상남도  88  8.8%
강원도  72  7.2%
부산광역시  64  6.4%
충청남도  64  6.4%
전라북도  60  6.0%
충청북도  56  5.6%
Other values (6)  140  14.1%
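
The Frequency (%) column is consistent with each count divided by the total number of observations. A minimal sketch of that arithmetic, using the counts from the Common Values table for 시도명 (the dictionary and variable names are illustrative):

```python
# Counts copied from the Common Values table for 시도명.
counts = {
    "경기도": 168, "서울특별시": 100, "경상북도": 96, "전라남도": 88,
    "경상남도": 88, "강원도": 72, "부산광역시": 64, "충청남도": 64,
    "전라북도": 60, "충청북도": 56, "Other values (6)": 140,
}

total = sum(counts.values())  # should equal the number of observations
freq_pct = {name: round(100 * c / total, 1) for name, c in counts.items()}

for name, pct in freq_pct.items():
    print(f"{name}: {counts[name]} ({pct}%)")
```

The counts add up to the 996 observations reported in the overview, so the table accounts for every row.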

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 227
Distinct (%): 22.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.4698795
Min length: 2

Characters and Unicode

Total characters: 3456
Distinct characters: 142
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
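
To illustrate what such character properties look like, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the five 시군구명 values from the Sample listing for this variable, plus one two-word name added purely for illustration (an assumption, not taken from the report). It is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Five names from the Sample listing, plus one illustrative two-word
# name (an assumption, not from the report) so that a space appears.
names = ["종로구", "중구", "용산구", "성동구", "광진구", "창원시 의창구"]

category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

# "Lo" is the general category "Other Letter"; "Zs" is "Space Separator".
print(category_counts)
```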

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value  Count  Frequency (%)
동구  24  2.1%
중구  24  2.1%
서구  20  1.8%
남구  20  1.8%
북구  20  1.8%
창원시  20  1.8%
청주시  16  1.4%
수원시  16  1.4%
고양시  12  1.1%
용인시  12  1.1%
Other values (226)  940  83.6%

Most occurring characters

Value               Count  Frequency (%)
                    424    12.3%
                    396    11.5%
                    340    9.8%
                    128    3.7%
                    96     2.8%
                    92     2.7%
                    92     2.7%
                    88     2.5%
                    84     2.4%
                    80     2.3%
Other values (132)  1636   47.3%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     3328   96.3%
Space Separator  128    3.7%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
                    424    12.7%
                    396    11.9%
                    340    10.2%
                    96     2.9%
                    92     2.8%
                    92     2.8%
                    88     2.6%
                    84     2.5%
                    80     2.4%
                    72     2.2%
Other values (131)  1564   47.0%

Space Separator

Value  Count  Frequency (%)
       128    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3328   96.3%
Common  128    3.7%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
                    424    12.7%
                    396    11.9%
                    340    10.2%
                    96     2.9%
                    92     2.8%
                    92     2.8%
                    88     2.6%
                    84     2.5%
                    80     2.4%
                    72     2.2%
Other values (131)  1564   47.0%

Common

Value  Count  Frequency (%)
       128    100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3328   96.3%
ASCII   128    3.7%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
                    424    12.7%
                    396    11.9%
                    340    10.2%
                    96     2.9%
                    92     2.8%
                    92     2.8%
                    88     2.6%
                    84     2.5%
                    80     2.4%
                    72     2.2%
Other values (131)  1564   47.0%

ASCII

Value  Count  Frequency (%)
       128    100.0%

건물 단위면적(1제곱미터)당 에너지 사용량(toe)
Real number (ℝ)

HIGH CORRELATION

Distinct: 13
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0092811245
Minimum: 0.005
Maximum: 0.017
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 0.005
5-th percentile: 0.005
Q1: 0.007
median: 0.009
Q3: 0.011
95-th percentile: 0.014
Maximum: 0.017
Range: 0.012
Interquartile range (IQR): 0.004

Descriptive statistics

Standard deviation: 0.0027625065
Coefficient of variation (CV): 0.29764783
Kurtosis: -0.42818583
Mean: 0.0092811245
Median Absolute Deviation (MAD): 0.002
Skewness: 0.42752146
Sum: 9.244
Variance: 7.6314424 × 10⁻⁶
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Common Values

Value             Count  Frequency (%)
0.01              136    13.7%
0.009             130    13.1%
0.008             123    12.3%
0.007             116    11.6%
0.006             107    10.7%
0.011             106    10.6%
0.005             74     7.4%
0.012             65     6.5%
0.013             53     5.3%
0.014             43     4.3%
Other values (3)  43     4.3%

Minimum 10 values

Value  Count  Frequency (%)
0.005  74     7.4%
0.006  107    10.7%
0.007  116    11.6%
0.008  123    12.3%
0.009  130    13.1%
0.01   136    13.7%
0.011  106    10.6%
0.012  65     6.5%
0.013  53     5.3%
0.014  43     4.3%

Maximum 10 values

Value  Count  Frequency (%)
0.017  5      0.5%
0.016  14     1.4%
0.015  24     2.4%
0.014  43     4.3%
0.013  53     5.3%
0.012  65     6.5%
0.011  106    10.6%
0.01   136    13.7%
0.009  130    13.1%
0.008  123    12.3%
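
Because this variable takes only 13 distinct values, its summary statistics can be recomputed from the value counts alone. The sketch below does so with Python's standard library, assuming the variance is the sample variance (n - 1 denominator); the names are illustrative.

```python
import math
import statistics

# Value counts copied from the frequency tables for this variable.
value_counts = {
    0.005: 74, 0.006: 107, 0.007: 116, 0.008: 123, 0.009: 130,
    0.010: 136, 0.011: 106, 0.012: 65, 0.013: 53, 0.014: 43,
    0.015: 24, 0.016: 14, 0.017: 5,
}

# Expand the counts back into one value per observation.
values = [v for v, c in value_counts.items() for _ in range(c)]

n = len(values)
total = math.fsum(values)
mean = total / n
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation

print(n, round(total, 3), mean, variance, std, cv)
```

Under that assumption the recomputed sum, mean, variance, and coefficient of variation agree with the reported figures to the displayed precision.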

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 996
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12927683
Minimum: 509919
Maximum: 62269977
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 509919
5-th percentile: 1730911.5
Q1: 3852547.5
median: 11439896
Q3: 19084204
95-th percentile: 31270046
Maximum: 62269977
Range: 61760058
Interquartile range (IQR): 15231656

Descriptive statistics

Standard deviation: 10455373
Coefficient of variation (CV): 0.80875843
Kurtosis: 2.5112187
Mean: 12927683
Median Absolute Deviation (MAD): 7599823.5
Skewness: 1.297469
Sum: 1.2875973 × 10¹⁰
Variance: 1.0931483 × 10¹⁴
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
16007114            1      0.1%
7458095             1      0.1%
3962813             1      0.1%
3243740             1      0.1%
1768318             1      0.1%
6693948             1      0.1%
5368543             1      0.1%
4483054             1      0.1%
21624502            1      0.1%
20724846            1      0.1%
Other values (986)  986    99.0%

Minimum 10 values

Value    Count  Frequency (%)
509919   1      0.1%
522219   1      0.1%
528993   1      0.1%
539377   1      0.1%
952802   1      0.1%
963159   1      0.1%
967947   1      0.1%
975259   1      0.1%
1199315  1      0.1%
1225660  1      0.1%

Maximum 10 values

Value     Count  Frequency (%)
62269977  1      0.1%
60619673  1      0.1%
59879779  1      0.1%
59742288  1      0.1%
59732823  1      0.1%
58438134  1      0.1%
56980287  1      0.1%
50872820  1      0.1%
49677832  1      0.1%
48871036  1      0.1%

에너지 사용량(toe)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 993
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 135276.2
Minimum: 3464
Maximum: 750566
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 3464
5-th percentile: 9553
Q1: 25956.25
median: 105428.5
Q3: 212215
95-th percentile: 362118.5
Maximum: 750566
Range: 747102
Interquartile range (IQR): 186258.75

Descriptive statistics

Standard deviation: 123017.17
Coefficient of variation (CV): 0.90937776
Kurtosis: 1.9485004
Mean: 135276.2
Median Absolute Deviation (MAD): 82840
Skewness: 1.2465769
Sum: 1.347351 × 10⁸
Variance: 1.5133224 × 10¹⁰
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
18717               2      0.2%
24487               2      0.2%
149552              2      0.2%
54379               1      0.1%
23234               1      0.1%
9362                1      0.1%
50748               1      0.1%
41087               1      0.1%
37006               1      0.1%
230403              1      0.1%
Other values (983)  983    98.7%

Minimum 10 values

Value  Count  Frequency (%)
3464   1      0.1%
3622   1      0.1%
3698   1      0.1%
3853   1      0.1%
5744   1      0.1%
5885   1      0.1%
6272   1      0.1%
6555   1      0.1%
6925   1      0.1%
6955   1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
750566  1      0.1%
714604  1      0.1%
696833  1      0.1%
685653  1      0.1%
597466  1      0.1%
545535  1      0.1%
541467  1      0.1%
540952  1      0.1%
535759  1      0.1%
521795  1      0.1%

Interactions


Correlations

Columns, in order: 통계연도 / 시도명 / 건물 단위면적(1제곱미터)당 에너지 사용량(toe) / 연면적(제곱미터) / 에너지 사용량(toe)
통계연도  1.000  0.000  0.043  0.000  0.000
시도명  0.000  1.000  0.708  0.549  0.618
건물 단위면적(1제곱미터)당 에너지 사용량(toe)  0.043  0.708  1.000  0.648  0.576
연면적(제곱미터)  0.000  0.549  0.648  1.000  0.847
에너지 사용량(toe)  0.000  0.618  0.576  0.847  1.000

Columns, in order: 통계연도 / 시도명
통계연도  1.000  0.000
시도명  0.000  1.000

Columns, in order: 건물 단위면적(1제곱미터)당 에너지 사용량(toe) / 연면적(제곱미터) / 에너지 사용량(toe) / 통계연도 / 시도명
건물 단위면적(1제곱미터)당 에너지 사용량(toe)  1.000  0.671  0.798  0.023  0.365
연면적(제곱미터)  0.671  1.000  0.977  0.000  0.250
에너지 사용량(toe)  0.798  0.977  1.000  0.000  0.310
통계연도  0.023  0.000  0.000  1.000  0.000
시도명  0.365  0.250  0.310  0.000  1.000
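
The report does not say here which coefficient each matrix uses. As a generic illustration of how a linear correlation between two numeric columns is computed, the sketch below evaluates a Pearson coefficient on the first ten rows of the Sample section (연면적(제곱미터) and 에너지 사용량(toe)). It is an assumption-laden example on ten rows and is not expected to reproduce the values above, which cover all 996 observations.

```python
import math

# 연면적(제곱미터) and 에너지 사용량(toe) for the first ten sample rows.
area = [16007114, 19244993, 15368069, 16660028, 16078054,
        17451922, 20542693, 19749411, 12248821, 15177814]
energy = [267934, 336130, 230760, 245691, 265540,
          254933, 229146, 301555, 201200, 195268]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(area, energy)
print(f"Pearson r on ten sample rows: {r:.3f}")
```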

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

   통계연도  시도명  시군구명  건물 단위면적(1제곱미터)당 에너지 사용량(toe)  연면적(제곱미터)  에너지 사용량(toe)
0  2018  서울특별시  종로구  0.017  16007114  267934
1  2018  서울특별시  중구  0.017  19244993  336130
2  2018  서울특별시  용산구  0.015  15368069  230760
3  2018  서울특별시  성동구  0.015  16660028  245691
4  2018  서울특별시  광진구  0.017  16078054  265540
5  2018  서울특별시  동대문구  0.015  17451922  254933
6  2018  서울특별시  중랑구  0.011  20542693  229146
7  2018  서울특별시  성북구  0.015  19749411  301555
8  2018  서울특별시  강북구  0.016  12248821  201200
9  2018  서울특별시  도봉구  0.013  15177814  195268

Last rows

     통계연도  시도명  시군구명  건물 단위면적(1제곱미터)당 에너지 사용량(toe)  연면적(제곱미터)  에너지 사용량(toe)
986  2021  경상남도  창녕군  0.005  4365668  22755
987  2021  경상남도  고성군  0.007  3501838  25964
988  2021  경상남도  남해군  0.005  3062799  14509
989  2021  경상남도  하동군  0.006  3053398  19091
990  2021  경상남도  산청군  0.005  2085909  10936
991  2021  경상남도  함양군  0.006  2383767  14540
992  2021  경상남도  거창군  0.008  3744466  29180
993  2021  경상남도  합천군  0.005  2916872  13390
994  2021  제주특별자치도  제주시  0.007  32119074  225297
995  2021  제주특별자치도  서귀포시  0.005  16020229  86151
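
The sample rows are consistent with the per-area column being the energy use divided by the floor area, rounded to three decimals, which would also help explain the high-correlation alerts. The sketch below checks that relationship on a few sample rows. The relationship itself is an assumption inferred from the column names and values, not something the report states, and the tuple layout is illustrative.

```python
# (시군구명, reported toe per m², 연면적 in m², 에너지 사용량 in toe),
# as read from the sample rows.
rows = [
    ("종로구", 0.017, 16007114, 267934),
    ("중구", 0.017, 19244993, 336130),
    ("용산구", 0.015, 15368069, 230760),
    ("창녕군", 0.005, 4365668, 22755),
    ("제주시", 0.007, 32119074, 225297),
    ("서귀포시", 0.005, 16020229, 86151),
]

for name, reported, area_m2, energy_toe in rows:
    # Assumption: per-area figure = energy / area, rounded to 3 decimals.
    derived = round(energy_toe / area_m2, 3)
    print(name, reported, derived, derived == reported)
```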