Overview

Dataset statistics

Number of variables: 5
Number of observations: 1560
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 65.6 KiB
Average record size in memory: 43.1 B
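
The average record size appears to be the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, using only the two figures reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 65.6
n_observations = 1560

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 43.1 B shown in the report.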

Variable types

Numeric: 3
Categorical: 1
Text: 1

Dataset

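Description: One of the statistical indices developed by Gimhae City to assess the state of the city on a statistical basis. It consists of the statistics year (통계연도), province or metropolitan city name (시도명), city/county/district name (시군구명), share of the national population in percent (전국대비인구비율(퍼센트)), and total population in persons (총인구수(명)). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110126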

Alerts

전국대비인구비율(퍼센트) is highly overall correlated with 총인구수(명) (High correlation)
총인구수(명) is highly overall correlated with 전국대비인구비율(퍼센트) (High correlation)
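
These two alerts are expected, because the population share is derived from the population count. As a rough illustration, the sketch below computes a plain Pearson coefficient over the 20 rows listed in the Sample section of this report. Pearson is an assumption made for the example; the report's "overall" correlation may be computed differently, and 20 rows are not the full dataset.

```python
import math

# (전국대비인구비율(퍼센트), 총인구수(명)) pairs copied from the Sample section.
sample = [
    (0.30, 152737), (0.24, 125249), (0.45, 230241), (0.58, 299259),
    (0.69, 357215), (0.69, 355069), (0.80, 411005), (0.87, 450355),
    (0.63, 327195), (0.67, 348220), (0.12, 60129), (0.10, 50478),
    (0.08, 42266), (0.08, 43449), (0.07, 34360), (0.07, 38310),
    (0.12, 61073), (0.08, 42935), (0.95, 493096), (0.36, 183663),
]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(sample)
print(f"Pearson r on the sample rows: {r:.4f}")
```

Since one column is essentially a rescaled copy of the other, keeping both in a model adds little information.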

Reproduction

Analysis started: 2023-12-11 00:01:00.370615
Analysis finished: 2023-12-11 00:01:01.871670
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
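
A report of this kind can be regenerated along the following lines. This is a sketch under stated assumptions: the CSV path, file encoding, and output file name are hypothetical, and the exact configuration used for this report (the config.json above) is not reproduced here. pandas and ydata-profiling must be installed for `build_report` to run.

```python
# Column names as they appear in this report.
EXPECTED_COLUMNS = [
    "통계연도",
    "시도명",
    "시군구명",
    "전국대비인구비율(퍼센트)",
    "총인구수(명)",
]


def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a local CSV copy of the dataset and write an HTML report.

    csv_path, html_path and encoding are placeholders chosen for this sketch.
    """
    # Imported lazily so that this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    profile = ProfileReport(df, title="Overview")
    profile.to_file(html_path)
    return profile
```

Public-data CSV files from Korean portals are sometimes encoded as CP949 rather than UTF-8, so the `encoding` argument may need adjusting.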

Variables

통계연도
Real number (ℝ)

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.5
Minimum: 2016
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 2016
5-th percentile: 2016
Q1: 2017
Median: 2018.5
Q3: 2020
95-th percentile: 2021
Maximum: 2021
Range: 5
Interquartile range (IQR): 3
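
The quantiles above can be reproduced from the value counts in this section (six years, 260 rows each). The sketch assumes linear interpolation between order statistics, which is the default in NumPy and pandas; whether the profiler uses exactly this rule is an assumption.

```python
import math

# 통계연도 holds six years with 260 rows each; the list is already sorted.
years = [year for year in range(2016, 2022) for _ in range(260)]


def quantile(sorted_values, q):
    """Linear-interpolation quantile of pre-sorted data, with 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


p05 = quantile(years, 0.05)
q1 = quantile(years, 0.25)
median = quantile(years, 0.50)
q3 = quantile(years, 0.75)
p95 = quantile(years, 0.95)

iqr = q3 - q1                        # interquartile range
value_range = years[-1] - years[0]   # maximum minus minimum
print(p05, q1, median, q3, p95, iqr, value_range)
```

The median falls between the last 2018 row and the first 2019 row, which is why it is 2018.5 rather than a whole year.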

Descriptive statistics

Standard deviation: 1.7083728
Coefficient of variation (CV): 0.00084635758
Kurtosis: -1.2687907
Mean: 2018.5
Median Absolute Deviation (MAD): 1.5
Skewness: 0
Sum: 3148860
Variance: 2.9185375
Monotonicity: Increasing
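
Most of these figures follow directly from the value counts (260 rows for each year from 2016 to 2021). The sketch below recomputes them with the Python standard library, assuming the sample (n - 1) form of the variance, which matches the reported value; kurtosis is left out because its exact estimator is not stated in the report.

```python
import statistics

# 통계연도: six years, 260 rows each.
years = [year for year in range(2016, 2022) for _ in range(260)]

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance, n - 1 denominator
std = statistics.stdev(years)           # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(years)
mad = statistics.median(abs(y - median) for y in years)  # median absolute deviation
total = sum(years)

print(mean, variance, std, cv, mad, total)
```

The coefficient of variation is tiny because the spread of about 1.7 years is being compared with a mean near 2018, so it says little about this variable.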
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
2016  260  16.7%
2017  260  16.7%
2018  260  16.7%
2019  260  16.7%
2020  260  16.7%
2021  260  16.7%

Value  Count  Frequency (%)
2021  260  16.7%
2020  260  16.7%
2019  260  16.7%
2018  260  16.7%
2017  260  16.7%
2016  260  16.7%

시도명
Categorical

Distinct: 16
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.3 KiB

경기도  288
서울특별시  150
경상북도  150
경상남도  138
전라남도  132
Other values (11)  702

Length

Max length: 7
Median length: 5
Mean length: 4.0538462
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  288  18.5%
서울특별시  150  9.6%
경상북도  150  9.6%
경상남도  138  8.8%
전라남도  132  8.5%
강원도  108  6.9%
충청남도  102  6.5%
부산광역시  96  6.2%
전라북도  96  6.2%
충청북도  90  5.8%
Other values (6)  210  13.5%
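
The frequency column is each count divided by the 1560 observations. The sketch below recomputes it from the counts in the table. It also derives the number of rows per year for each region, on the assumption (not stated in the report) that every region contributes the same set of districts in each of the six years.

```python
# Counts copied from the Common Values table above.
counts = {
    "경기도": 288,
    "서울특별시": 150,
    "경상북도": 150,
    "경상남도": 138,
    "전라남도": 132,
    "강원도": 108,
    "충청남도": 102,
    "부산광역시": 96,
    "전라북도": 96,
    "충청북도": 90,
    "Other values (6)": 210,
}

n_rows = sum(counts.values())
percent = {name: round(100 * count / n_rows, 1) for name, count in counts.items()}

# Assumption: six statistics years (2016 to 2021) with identical coverage each year.
N_YEARS = 6
rows_per_year = {name: count // N_YEARS for name, count in counts.items()}

for name in counts:
    print(name, counts[name], f"{percent[name]}%", rows_per_year[name])
```

Under that assumption 서울특별시 contributes 25 rows per year, which is consistent with its 25 districts.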

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 236
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.3 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9551282
Min length: 2

Characters and Unicode

Total characters: 4610
Distinct characters: 141
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
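
As an illustration of these character properties, the sketch below inspects the five sample values shown for this variable using Python's standard `unicodedata` module. The module exposes the general category but not the script, so the first word of each character name is used here as a rough stand-in for the script; that proxy is an assumption of this example, not how the report derives it.

```python
import unicodedata

# Sample values shown for this variable in the report.
sample = ["종로구", "중구", "용산구", "성동구", "광진구"]

# General category of every character ("Lo" is "Letter, other").
categories = {unicodedata.category(ch) for name in sample for ch in name}

# First word of the character name, e.g. "HANGUL" in "HANGUL SYLLABLE JONG".
name_prefixes = {unicodedata.name(ch).split()[0] for name in sample for ch in name}

# Mean length is total characters divided by the number of rows.
mean_length = 4610 / 1560

print(categories, name_prefixes, round(mean_length, 7))
```

A single category and a single script across all 4610 characters suggests the column holds clean Hangul names with no digits, spaces, or Latin text.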

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value  Count  Frequency (%)
중구  36  2.3%
동구  36  2.3%
남구  32  2.1%
북구  30  1.9%
서구  30  1.9%
강서구  12  0.8%
고성군  12  0.8%
남원시  6  0.4%
덕진구  6  0.4%
군산시  6  0.4%
Other values (226)  1354  86.8%

Most occurring characters

Value  Count  Frequency (%)
636  13.8%
510  11.1%
468  10.2%
132  2.9%
132  2.9%
120  2.6%
120  2.6%
114  2.5%
114  2.5%
96  2.1%
Other values (131)  2168  47.0%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4610  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4610  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4610  100.0%

전국대비인구비율(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 160
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.45030128
Minimum: 0.02
Maximum: 2.32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 0.02
5-th percentile: 0.05
Q1: 0.12
Median: 0.365
Q3: 0.66
95-th percentile: 1.162
Maximum: 2.32
Range: 2.3
Interquartile range (IQR): 0.54

Descriptive statistics

Standard deviation: 0.40618325
Coefficient of variation (CV): 0.90202553
Kurtosis: 3.4631733
Mean: 0.45030128
Median Absolute Deviation (MAD): 0.26
Skewness: 1.6166616
Sum: 702.47
Variance: 0.16498484
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0.08  68  4.4%
0.05  62  4.0%
0.06  54  3.5%
0.1  51  3.3%
0.09  51  3.3%
0.07  43  2.8%
0.12  42  2.7%
0.13  32  2.1%
0.16  24  1.5%
0.22  24  1.5%
Other values (150)  1109  71.1%

Value  Count  Frequency (%)
0.02  6  0.4%
0.03  6  0.4%
0.04  16  1.0%
0.05  62  4.0%
0.06  54  3.5%
0.07  43  2.8%
0.08  68  4.4%
0.09  51  3.3%
0.1  51  3.3%
0.11  22  1.4%

Value  Count  Frequency (%)
2.32  2  0.1%
2.31  1  0.1%
2.3  1  0.1%
2.29  2  0.1%
2.09  2  0.1%
2.08  1  0.1%
2.07  1  0.1%
2.06  2  0.1%
2.04  2  0.1%
2.03  1  0.1%

총인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1559
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 233180.19
Minimum: 8867
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.8 KiB

Quantile statistics

Minimum: 8867
5-th percentile: 27534.5
Q1: 62314.25
Median: 188423.5
Q3: 342149.25
95-th percentile: 602504.2
Maximum: 1202628
Range: 1193761
Interquartile range (IQR): 279835

Descriptive statistics

Standard deviation: 210243.68
Coefficient of variation (CV): 0.90163614
Kurtosis: 3.4716959
Mean: 233180.19
Median Absolute Deviation (MAD): 134374.5
Skewness: 1.6183189
Sum: 3.6376109 × 10^8
Variance: 4.4202406 × 10^10
Monotonicity: Not monotonic
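
The summary figures for 전국대비인구비율(퍼센트) and 총인구수(명) are internally consistent, which can be checked without the raw data. The sketch below verifies four identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1) against the reported values. The 1e-6 relative tolerance is an arbitrary choice that allows for rounding in the report.

```python
import math

# Figures copied from the two variable sections of this report.
reported = {
    "전국대비인구비율(퍼센트)": dict(
        n=1560, total=702.47, mean=0.45030128, std=0.40618325,
        variance=0.16498484, cv=0.90202553, q1=0.12, q3=0.66, iqr=0.54,
    ),
    "총인구수(명)": dict(
        n=1560, total=3.6376109e8, mean=233180.19, std=210243.68,
        variance=4.4202406e10, cv=0.90163614, q1=62314.25, q3=342149.25,
        iqr=279835,
    ),
}


def check(stats, rel_tol=1e-6):
    """Return a dict of identity name -> whether it holds within rel_tol."""
    return {
        "mean == sum / n": math.isclose(
            stats["total"] / stats["n"], stats["mean"], rel_tol=rel_tol),
        "variance == std ** 2": math.isclose(
            stats["std"] ** 2, stats["variance"], rel_tol=rel_tol),
        "cv == std / mean": math.isclose(
            stats["std"] / stats["mean"], stats["cv"], rel_tol=rel_tol),
        "iqr == q3 - q1": math.isclose(
            stats["q3"] - stats["q1"], stats["iqr"], rel_tol=rel_tol),
    }


results = {name: check(stats) for name, stats in reported.items()}
for name, outcome in results.items():
    print(name, outcome)
```

The two variables have nearly identical CV, skewness, and kurtosis, as expected when one is approximately a rescaled version of the other.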
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
122499  2  0.1%
152737  1  0.1%
45204  1  0.1%
394702  1  0.1%
342837  1  0.1%
346682  1  0.1%
293556  1  0.1%
230040  1  0.1%
125240  1  0.1%
149384  1  0.1%
Other values (1549)  1549  99.3%

Value  Count  Frequency (%)
8867  1  0.1%
9077  1  0.1%
9617  1  0.1%
9832  1  0.1%
9975  1  0.1%
10001  1  0.1%
16320  1  0.1%
16692  1  0.1%
16993  1  0.1%
17356  1  0.1%

Value  Count  Frequency (%)
1202628  1  0.1%
1201166  1  0.1%
1194465  1  0.1%
1194041  1  0.1%
1186078  1  0.1%
1183714  1  0.1%
1079353  1  0.1%
1079216  1  0.1%
1077508  1  0.1%
1074176  1  0.1%

Interactions


Correlations

                          통계연도  시도명  전국대비인구비율(퍼센트)  총인구수(명)
통계연도                  1.000  0.000  0.000  0.000
시도명                    0.000  1.000  0.572  0.571
전국대비인구비율(퍼센트)  0.000  0.572  1.000  1.000
총인구수(명)              0.000  0.571  1.000  1.000

                          통계연도  전국대비인구비율(퍼센트)  총인구수(명)  시도명
통계연도                  1.000  -0.008  -0.008  0.000
전국대비인구비율(퍼센트)  -0.008  1.000  1.000  0.264
총인구수(명)              -0.008  1.000  1.000  0.264
시도명                    0.000  0.264  0.264  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      통계연도  시도명  시군구명  전국대비인구비율(퍼센트)  총인구수(명)
0  2016  서울특별시  종로구  0.3  152737
1  2016  서울특별시  중구  0.24  125249
2  2016  서울특별시  용산구  0.45  230241
3  2016  서울특별시  성동구  0.58  299259
4  2016  서울특별시  광진구  0.69  357215
5  2016  서울특별시  동대문구  0.69  355069
6  2016  서울특별시  중랑구  0.8  411005
7  2016  서울특별시  성북구  0.87  450355
8  2016  서울특별시  강북구  0.63  327195
9  2016  서울특별시  도봉구  0.67  348220

      통계연도  시도명  시군구명  전국대비인구비율(퍼센트)  총인구수(명)
1550  2021  경상남도  창녕군  0.12  60129
1551  2021  경상남도  고성군  0.1  50478
1552  2021  경상남도  남해군  0.08  42266
1553  2021  경상남도  하동군  0.08  43449
1554  2021  경상남도  산청군  0.07  34360
1555  2021  경상남도  함양군  0.07  38310
1556  2021  경상남도  거창군  0.12  61073
1557  2021  경상남도  합천군  0.08  42935
1558  2021  제주특별자치도  제주시  0.95  493096
1559  2021  제주특별자치도  서귀포시  0.36  183663