Overview

Dataset statistics

Number of variables: 6
Number of observations: 1140
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 58.0 KiB
Average record size in memory: 52.1 B

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

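Description: One of the statistical indices developed by Gimhae City to assess the state of the city from statistics. It consists of the statistical year (통계연도), province or metropolitan city name (시도명), city/county/district name (시군구명), household waste discharged per resident (kg/day), household-type waste generated (tonnes/day), and registered resident population (persons). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in collecting and processing the data.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110176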
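
The column definitions suggest that the per-resident figure is the daily waste total divided by the registered population. The report does not state this formula, so the sketch below treats it as an assumption and checks it against three first-row records from the Sample section; the function name is hypothetical.

```python
def per_capita_kg_per_day(waste_tonnes_per_day: float, population: int) -> float:
    """Convert a daily waste total in tonnes to kilograms per resident per day.

    Assumed relationship (not stated in the report): 1 tonne = 1000 kg,
    divided evenly over the registered population.
    """
    return waste_tonnes_per_day * 1000.0 / population


# (시군구명, 생활계폐기물발생량(톤_일), 주민등록인구수(명), reported per-resident value)
# copied from the first rows shown in the Sample section.
sample_rows = [
    ("종로구", 406.0, 152737, 2.7),
    ("중구", 499.2, 125249, 4.0),
    ("용산구", 262.1, 230241, 1.1),
]

for name, tonnes, population, reported in sample_rows:
    derived = per_capita_kg_per_day(tonnes, population)
    print(f"{name}: derived {derived:.2f} kg/day, reported {reported}")
```

For these three rows the derived values round to the reported one-decimal figures, which is consistent with the assumed formula but does not prove it for every row.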

Alerts

생활계폐기물발생량(톤_일) is highly correlated overall with 주민등록인구수(명) (High correlation)
주민등록인구수(명) is highly correlated overall with 생활계폐기물발생량(톤_일) (High correlation)
주민등록인구수(명) has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 00:59:54.841779
Analysis finished: 2023-12-11 00:59:56.288467
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Value | Count
2016 | 228
2017 | 228
2018 | 228
2019 | 228
2020 | 228

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value | Count | Frequency (%)
2016 | 228 | 20.0%
2017 | 228 | 20.0%
2018 | 228 | 20.0%
2019 | 228 | 20.0%
2020 | 228 | 20.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 통계연도]

시도명
Categorical

Distinct: 16
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Value | Count
경기도 | 155
서울특별시 | 125
경상북도 | 115
전라남도 | 110
강원도 | 90
Other values (11) | 545

Length

Max length: 7
Median length: 5
Mean length: 4.1359649
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 155 | 13.6%
서울특별시 | 125 | 11.0%
경상북도 | 115 | 10.1%
전라남도 | 110 | 9.6%
강원도 | 90 | 7.9%
경상남도 | 90 | 7.9%
부산광역시 | 80 | 7.0%
충청남도 | 75 | 6.6%
전라북도 | 70 | 6.1%
충청북도 | 55 | 4.8%
Other values (6) | 175 | 15.4%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 206
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.9307018
Min length: 2

Characters and Unicode

Total characters: 3341
Distinct characters: 132
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
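
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on the five sample values of 시군구명 listed in this report. This is not the profiler's own implementation. The standard library exposes the general category but not the script property, so the check on the character name prefix is only a stand-in for a script lookup.

```python
import unicodedata
from collections import Counter

# The five sample values of 시군구명 shown in this report.
names = ["종로구", "중구", "용산구", "성동구", "광진구"]
chars = [ch for name in names for ch in name]

# General category of every character; "Lo" means "Letter, other",
# which the report displays as "Other Letter".
category_counts = Counter(unicodedata.category(ch) for ch in chars)

# Frequency of each individual character.
char_counts = Counter(chars)

# Stand-in for a script test: every character name starts with "HANGUL SYLLABLE".
all_hangul = all(
    unicodedata.name(ch).startswith("HANGUL SYLLABLE") for ch in chars
)

print(category_counts)
print(char_counts.most_common(3))
print(all_hangul)
```

On the full column the same counting would produce the character, category, and script tables that follow, assuming the column contains only Hangul syllables as the report indicates.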

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value | Count | Frequency (%)
동구 | 30 | 2.6%
중구 | 30 | 2.6%
서구 | 25 | 2.2%
남구 | 22 | 1.9%
북구 | 20 | 1.8%
고성군 | 10 | 0.9%
강서구 | 10 | 0.9%
완주군 | 5 | 0.4%
무주군 | 5 | 0.4%
진안군 | 5 | 0.4%
Other values (196) | 978 | 85.8%

Most occurring characters

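Value | Count | Frequency (%)
(character missing) | 425 | 12.7%
(character missing) | 390 | 11.7%
(character missing) | 370 | 11.1%
(character missing) | 110 | 3.3%
(character missing) | 100 | 3.0%
(character missing) | 90 | 2.7%
(character missing) | 90 | 2.7%
(character missing) | 85 | 2.5%
(character missing) | 80 | 2.4%
(character missing) | 65 | 1.9%
Other values (122) | 1536 | 46.0%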

Most occurring categories

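Value | Count | Frequency (%)
Other Letter | 3341 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(character missing) | 425 | 12.7%
(character missing) | 390 | 11.7%
(character missing) | 370 | 11.1%
(character missing) | 110 | 3.3%
(character missing) | 100 | 3.0%
(character missing) | 90 | 2.7%
(character missing) | 90 | 2.7%
(character missing) | 85 | 2.5%
(character missing) | 80 | 2.4%
(character missing) | 65 | 1.9%
Other values (122) | 1536 | 46.0%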

Most occurring scripts

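Value | Count | Frequency (%)
Hangul | 3341 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character missing) | 425 | 12.7%
(character missing) | 390 | 11.7%
(character missing) | 370 | 11.1%
(character missing) | 110 | 3.3%
(character missing) | 100 | 3.0%
(character missing) | 90 | 2.7%
(character missing) | 90 | 2.7%
(character missing) | 85 | 2.5%
(character missing) | 80 | 2.4%
(character missing) | 65 | 1.9%
Other values (122) | 1536 | 46.0%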

Most occurring blocks

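Value | Count | Frequency (%)
Hangul | 3341 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(character missing) | 425 | 12.7%
(character missing) | 390 | 11.7%
(character missing) | 370 | 11.1%
(character missing) | 110 | 3.3%
(character missing) | 100 | 3.0%
(character missing) | 90 | 2.7%
(character missing) | 90 | 2.7%
(character missing) | 85 | 2.5%
(character missing) | 80 | 2.4%
(character missing) | 65 | 1.9%
Other values (122) | 1536 | 46.0%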

주민 1인당 생활폐기물 배출량(킬로그램_일)
Real number (ℝ)

Distinct: 38
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2033333
Minimum: 0.2
Maximum: 7.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 0.6
Q1: 0.9
Median: 1.1
Q3: 1.4
95-th percentile: 2.1
Maximum: 7.2
Range: 7
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.53978448
Coefficient of variation (CV): 0.44857436
Kurtosis: 25.197675
Mean: 1.2033333
Median Absolute Deviation (MAD): 0.2
Skewness: 3.4437673
Sum: 1371.8
Variance: 0.29136728
Monotonicity: Not monotonic
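
Several of the descriptive statistics above are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The following sketch only re-derives those three figures from the values printed in this report; it does not recompute anything from the raw data.

```python
# Values copied from this report for 주민 1인당 생활폐기물 배출량(킬로그램_일).
n_observations = 1140
total = 1371.8
standard_deviation = 0.53978448

# Mean = sum / number of observations.
mean = total / n_observations

# Coefficient of variation = standard deviation / mean.
coefficient_of_variation = standard_deviation / mean

# Variance = standard deviation squared.
variance = standard_deviation ** 2

print(f"mean     = {mean:.7f}")
print(f"CV       = {coefficient_of_variation:.8f}")
print(f"variance = {variance:.8f}")
```

The derived figures agree with the reported mean, CV, and variance to within rounding of the printed digits.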
[Figure: Histogram with fixed size bins (bins=38)]

Common values

Value | Count | Frequency (%)
0.9 | 155 | 13.6%
1.0 | 133 | 11.7%
1.1 | 120 | 10.5%
0.8 | 120 | 10.5%
1.2 | 117 | 10.3%
1.3 | 79 | 6.9%
0.7 | 57 | 5.0%
1.5 | 56 | 4.9%
1.4 | 45 | 3.9%
1.6 | 43 | 3.8%
Other values (28) | 215 | 18.9%

Minimum 10 values

Value | Count | Frequency (%)
0.2 | 2 | 0.2%
0.3 | 3 | 0.3%
0.4 | 5 | 0.4%
0.5 | 13 | 1.1%
0.6 | 35 | 3.1%
0.7 | 57 | 5.0%
0.8 | 120 | 10.5%
0.9 | 155 | 13.6%
1.0 | 133 | 11.7%
1.1 | 120 | 10.5%

Maximum 10 values

Value | Count | Frequency (%)
7.2 | 1 | 0.1%
6.4 | 1 | 0.1%
4.9 | 1 | 0.1%
4.4 | 1 | 0.1%
4.1 | 1 | 0.1%
4.0 | 1 | 0.1%
3.5 | 1 | 0.1%
3.4 | 1 | 0.1%
3.3 | 2 | 0.2%
3.1 | 1 | 0.1%

생활계폐기물발생량(톤_일)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1015
Distinct (%): 89.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 246.20044
Minimum: 10.3
Maximum: 1526.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 10.3
5-th percentile: 27
Q1: 66.175
Median: 185.35
Q3: 352.775
95-th percentile: 708.715
Maximum: 1526.9
Range: 1516.6
Interquartile range (IQR): 286.6
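
To show how quartiles, IQR, and range relate, here is a small sketch using Python's `statistics.quantiles`. It runs on the ten first-row values of 생활계폐기물발생량(톤_일) from the Sample section, not on the full column, so its output is not expected to match the figures above. The `method="inclusive"` setting interpolates linearly between order statistics; that the report uses the same interpolation is an assumption. The helper name is hypothetical.

```python
import statistics


def quartile_summary(values):
    """Return quartiles, IQR, and range for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,                      # interquartile range
        "range": max(values) - min(values),  # maximum minus minimum
    }


# 생활계폐기물발생량(톤_일) for the ten first rows shown in the Sample section.
first_rows_tonnes = [406.0, 499.2, 262.1, 257.8, 326.3,
                     426.6, 387.9, 358.3, 230.0, 315.1]
summary = quartile_summary(first_rows_tonnes)
print(summary)

# Consistency check on the figures reported for the full column.
reported_iqr = 352.775 - 66.175    # Q3 - Q1
reported_range = 1526.9 - 10.3     # Maximum - Minimum
print(round(reported_iqr, 3), round(reported_range, 1))
```

The last two lines confirm that the reported IQR and Range follow from the reported quartiles and extremes.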

Descriptive statistics

Standard deviation: 235.16803
Coefficient of variation (CV): 0.95518934
Kurtosis: 4.4274714
Mean: 246.20044
Median Absolute Deviation (MAD): 129.15
Skewness: 1.8492637
Sum: 280668.5
Variance: 55304.004
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
41.0 | 3 | 0.3%
59.5 | 3 | 0.3%
89.2 | 3 | 0.3%
27.0 | 3 | 0.3%
39.0 | 3 | 0.3%
270.3 | 3 | 0.3%
46.2 | 3 | 0.3%
15.1 | 3 | 0.3%
29.1 | 3 | 0.3%
55.7 | 3 | 0.3%
Other values (1005) | 1110 | 97.4%

Minimum 10 values

Value | Count | Frequency (%)
10.3 | 1 | 0.1%
10.7 | 1 | 0.1%
12.7 | 1 | 0.1%
13.0 | 1 | 0.1%
13.2 | 1 | 0.1%
14.4 | 1 | 0.1%
14.5 | 1 | 0.1%
14.8 | 2 | 0.2%
14.9 | 1 | 0.1%
15.1 | 3 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
1526.9 | 1 | 0.1%
1391.2 | 1 | 0.1%
1301.5 | 1 | 0.1%
1292.7 | 1 | 0.1%
1252.2 | 1 | 0.1%
1245.7 | 1 | 0.1%
1241.0 | 1 | 0.1%
1232.9 | 1 | 0.1%
1226.2 | 1 | 0.1%
1224.2 | 1 | 0.1%

주민등록인구수(명)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1140
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 225829.84
Minimum: 9077
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 9077
5-th percentile: 27217.45
Q1: 53266.25
Median: 149942.5
Q3: 341482.75
95-th percentile: 651769.7
Maximum: 1202628
Range: 1193551
Interquartile range (IQR): 288216.5

Descriptive statistics

Standard deviation: 221011
Coefficient of variation (CV): 0.97866162
Kurtosis: 3.2049859
Mean: 225829.84
Median Absolute Deviation (MAD): 108178
Skewness: 1.6590682
Sum: 2.5744602 × 10^8
Variance: 4.8845863 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
152737 | 1 | 0.1%
942724 | 1 | 0.1%
94768 | 1 | 0.1%
513027 | 1 | 0.1%
316552 | 1 | 0.1%
829996 | 1 | 0.1%
567044 | 1 | 0.1%
451868 | 1 | 0.1%
1194465 | 1 | 0.1%
1066351 | 1 | 0.1%
Other values (1130) | 1130 | 99.1%

Minimum 10 values

Value | Count | Frequency (%)
9077 | 1 | 0.1%
9617 | 1 | 0.1%
9832 | 1 | 0.1%
9975 | 1 | 0.1%
10001 | 1 | 0.1%
16692 | 1 | 0.1%
16993 | 1 | 0.1%
17356 | 1 | 0.1%
17479 | 1 | 0.1%
17713 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1202628 | 1 | 0.1%
1201166 | 1 | 0.1%
1194465 | 1 | 0.1%
1194041 | 1 | 0.1%
1186078 | 1 | 0.1%
1079216 | 1 | 0.1%
1074176 | 1 | 0.1%
1066351 | 1 | 0.1%
1063907 | 1 | 0.1%
1059609 | 1 | 0.1%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

(variable) | 통계연도 | 시도명 | 주민 1인당 생활폐기물 배출량(킬로그램_일) | 생활계폐기물발생량(톤_일) | 주민등록인구수(명)
통계연도 | 1.000 | 0.000 | 0.142 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.393 | 0.576 | 0.604
주민 1인당 생활폐기물 배출량(킬로그램_일) | 0.142 | 0.393 | 1.000 | 0.547 | 0.165
생활계폐기물발생량(톤_일) | 0.000 | 0.576 | 0.547 | 1.000 | 0.891
주민등록인구수(명) | 0.000 | 0.604 | 0.165 | 0.891 | 1.000

[Figure: correlation heatmap]

(variable) | 시도명 | 통계연도
시도명 | 1.000 | 0.000
통계연도 | 0.000 | 1.000

[Figure: correlation heatmap]

(variable) | 주민 1인당 생활폐기물 배출량(킬로그램_일) | 생활계폐기물발생량(톤_일) | 주민등록인구수(명) | 통계연도 | 시도명
주민 1인당 생활폐기물 배출량(킬로그램_일) | 1.000 | 0.088 | -0.227 | 0.084 | 0.172
생활계폐기물발생량(톤_일) | 0.088 | 1.000 | 0.935 | 0.000 | 0.267
주민등록인구수(명) | -0.227 | 0.935 | 1.000 | 0.000 | 0.287
통계연도 | 0.084 | 0.000 | 0.000 | 1.000 | 0.000
시도명 | 0.172 | 0.267 | 0.287 | 0.000 | 1.000
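
The strong association between 생활계폐기물발생량(톤_일) and 주민등록인구수(명) can be explored with a rank correlation. The sketch below computes Spearman's coefficient in plain Python on the ten last-row records from the Sample section. Which coefficient each matrix above reports is not stated in this text, so this is an illustration of one common measure rather than a reproduction of the report's numbers; the helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Last ten rows of the Sample section (indices 1130 to 1139).
waste_tonnes = [68.6, 30.5, 63.7, 41.2, 40.6, 67.1, 100.6, 64.6, 947.6, 372.8]
population = [61301, 51361, 42958, 44785, 34857, 39080, 61502, 44006, 492466, 182169]

rho = spearman(waste_tonnes, population)
print(f"Spearman rho on the ten sample rows: {rho:.3f}")
```

Ten rows are far too few to reproduce the full-dataset coefficients; the point is only to show the mechanics of ranking and correlating.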

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

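First rows

(index) | 통계연도 | 시도명 | 시군구명 | 주민 1인당 생활폐기물 배출량(킬로그램_일) | 생활계폐기물발생량(톤_일) | 주민등록인구수(명)
0 | 2016 | 서울특별시 | 종로구 | 2.7 | 406.0 | 152737
1 | 2016 | 서울특별시 | 중구 | 4.0 | 499.2 | 125249
2 | 2016 | 서울특별시 | 용산구 | 1.1 | 262.1 | 230241
3 | 2016 | 서울특별시 | 성동구 | 0.9 | 257.8 | 299259
4 | 2016 | 서울특별시 | 광진구 | 0.9 | 326.3 | 357215
5 | 2016 | 서울특별시 | 동대문구 | 1.2 | 426.6 | 355069
6 | 2016 | 서울특별시 | 중랑구 | 0.9 | 387.9 | 411005
7 | 2016 | 서울특별시 | 성북구 | 0.8 | 358.3 | 450355
8 | 2016 | 서울특별시 | 강북구 | 0.7 | 230.0 | 327195
9 | 2016 | 서울특별시 | 도봉구 | 0.9 | 315.1 | 348220

Last rows

(index) | 통계연도 | 시도명 | 시군구명 | 주민 1인당 생활폐기물 배출량(킬로그램_일) | 생활계폐기물발생량(톤_일) | 주민등록인구수(명)
1130 | 2020 | 경상남도 | 창녕군 | 1.1 | 68.6 | 61301
1131 | 2020 | 경상남도 | 고성군 | 0.6 | 30.5 | 51361
1132 | 2020 | 경상남도 | 남해군 | 1.5 | 63.7 | 42958
1133 | 2020 | 경상남도 | 하동군 | 0.9 | 41.2 | 44785
1134 | 2020 | 경상남도 | 산청군 | 1.2 | 40.6 | 34857
1135 | 2020 | 경상남도 | 함양군 | 1.7 | 67.1 | 39080
1136 | 2020 | 경상남도 | 거창군 | 1.6 | 100.6 | 61502
1137 | 2020 | 경상남도 | 합천군 | 1.5 | 64.6 | 44006
1138 | 2020 | 제주특별자치도 | 제주시 | 1.9 | 947.6 | 492466
1139 | 2020 | 제주특별자치도 | 서귀포시 | 2.0 | 372.8 | 182169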