Overview

Dataset statistics

Number of variables: 6
Number of observations: 1300
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 66.1 KiB
Average record size in memory: 52.1 B
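
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that arithmetic against the figures reported above. The variable names are illustrative, and the 1024-bytes-per-KiB conversion is the standard definition rather than something the report states.

```python
# Figures copied from the Dataset statistics block.
total_size_kib = 66.1   # Total size in memory (KiB)
n_observations = 1300   # Number of observations

# Assumption: average record size = total bytes / number of rows.
BYTES_PER_KIB = 1024
avg_record_bytes = total_size_kib * BYTES_PER_KIB / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 52.1 B shown in the report.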

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

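Description: One of the statistical indices developed by Gimhae City to assess the state of the city on a statistical basis. It consists of statistics year (통계연도), province name (시도명), city/county/district name (시군구명), population per household in persons (세대당 인구(명)), total population in persons (총인구수(명)), and number of households (세대수(가구)). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110117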

Alerts

세대당 인구(명) is highly correlated overall with 총인구수(명) and 1 other field (High correlation)
총인구수(명) is highly correlated overall with 세대당 인구(명) and 1 other field (High correlation)
세대수(가구) is highly correlated overall with 세대당 인구(명) and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-10 23:14:10.887012
Analysis finished: 2023-12-10 23:14:12.733037
Duration: 1.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

2017: 260
2018: 260
2019: 260
2020: 260
2021: 260

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2018
3rd row: 2019
4th row: 2020
5th row: 2021

Common Values

Value  Count  Frequency (%)
2017   260    20.0%
2018   260    20.0%
2019   260    20.0%
2020   260    20.0%
2021   260    20.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시도명
Categorical

Distinct: 16
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

경기도: 240
경상북도: 125
서울특별시: 125
경상남도: 115
전라남도: 110
Other values (11): 585

Length

Max length: 7
Median length: 5
Mean length: 4.0538462
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value             Count  Frequency (%)
경기도            240    18.5%
경상북도          125    9.6%
서울특별시        125    9.6%
경상남도          115    8.8%
전라남도          110    8.5%
강원도            90     6.9%
충청남도          85     6.5%
전라북도          80     6.2%
부산광역시        80     6.2%
충청북도          75     5.8%
Other values (6)  175    13.5%

Length

[Figure: Histogram of lengths of the category]
시군구명
Text

Distinct: 236
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9561538
Min length: 2

Characters and Unicode

Total characters: 3843
Distinct characters: 141
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강릉시
2nd row: 강릉시
3rd row: 강릉시
4th row: 강릉시
5th row: 강릉시

Common Values

Value               Count  Frequency (%)
동구                30     2.3%
중구                30     2.3%
남구                26     2.0%
북구                25     1.9%
서구                25     1.9%
강서구              10     0.8%
고성군              10     0.8%
논산시              5      0.4%
순천시              5      0.4%
부여군              5      0.4%
Other values (226)  1129   86.8%

Most occurring characters

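Value               Count  Frequency (%)
(missing)           530    13.8%
(missing)           425    11.1%
(missing)           390    10.1%
(missing)           110    2.9%
(missing)           110    2.9%
(missing)           100    2.6%
(missing)           100    2.6%
(missing)           95     2.5%
(missing)           95     2.5%
(missing)           80     2.1%
Other values (131)  1808   47.0%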

Most occurring categories

Value         Count  Frequency (%)
Other Letter  3843   100.0%

Most frequent character per category

Other Letter
Same counts as under Most occurring characters, since all 3843 characters belong to this category.

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3843   100.0%

Most frequent character per script

Hangul
Same counts as under Most occurring characters, since all 3843 characters belong to this script.

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3843   100.0%

Most frequent character per block

Hangul
Same counts as under Most occurring characters, since all 3843 characters belong to this block.

세대당 인구(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 109
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1997462
Minimum: 1.69
Maximum: 2.84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 1.69
5th percentile: 1.85
Q1: 2.02
Median: 2.2
Q3: 2.37
95th percentile: 2.57
Maximum: 2.84
Range: 1.15
Interquartile range (IQR): 0.35

Descriptive statistics

Standard deviation: 0.22755489
Coefficient of variation (CV): 0.10344598
Kurtosis: -0.6688143
Mean: 2.1997462
Median Absolute Deviation (MAD): 0.18
Skewness: 0.11526977
Sum: 2859.67
Variance: 0.051781229
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
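
Several of the statistics for 세대당 인구(명) follow from the others by their usual definitions. As a minimal sketch, assuming those standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), the reported values can be cross-checked as follows. The variable names are illustrative only.

```python
# Values copied from the 세대당 인구(명) statistics in this report.
minimum, maximum = 1.69, 2.84
q1, q3 = 2.02, 2.37
mean = 2.1997462
std_dev = 0.22755489

value_range = maximum - minimum  # compare with reported Range: 1.15
iqr = q3 - q1                    # compare with reported IQR: 0.35
cv = std_dev / mean              # compare with reported CV: 0.10344598
variance = std_dev ** 2          # compare with reported Variance: 0.051781229

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 6))
```

Small differences in the last digits are expected, because the inputs are themselves rounded in the report.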
Common values

Value              Count  Frequency (%)
2.1                24     1.8%
2.05               24     1.8%
2.07               23     1.8%
2.26               23     1.8%
2.31               23     1.8%
2.29               22     1.7%
2.0                22     1.7%
2.08               22     1.7%
2.16               22     1.7%
2.35               22     1.7%
Other values (99)  1073   82.5%

Minimum 10 values

Value  Count  Frequency (%)
1.69   3      0.2%
1.71   2      0.2%
1.73   1      0.1%
1.74   3      0.2%
1.75   5      0.4%
1.76   3      0.2%
1.77   4      0.3%
1.78   6      0.5%
1.79   6      0.5%
1.8    7      0.5%

Maximum 10 values

Value  Count  Frequency (%)
2.84   1      0.1%
2.81   1      0.1%
2.79   1      0.1%
2.77   1      0.1%
2.76   1      0.1%
2.75   3      0.2%
2.74   4      0.3%
2.72   1      0.1%
2.71   4      0.3%
2.7    3      0.2%

총인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1299
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 233181.26
Minimum: 8867
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 8867
5th percentile: 27516.1
Q1: 62205.25
Median: 186957
Q3: 343145.25
95th percentile: 603183
Maximum: 1202628
Range: 1193761
Interquartile range (IQR): 280940

Descriptive statistics

Standard deviation: 210655.59
Coefficient of variation (CV): 0.90339845
Kurtosis: 3.4788296
Mean: 233181.26
Median Absolute Deviation (MAD): 133194
Skewness: 1.6212127
Sum: 3.0313564 × 10^8
Variance: 4.4375777 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
122499               2      0.2%
213952               1      0.1%
193827               1      0.1%
49715                1      0.1%
50240                1      0.1%
192899               1      0.1%
194720               1      0.1%
203209               1      0.1%
209404               1      0.1%
216782               1      0.1%
Other values (1289)  1289   99.2%

Minimum 10 values

Value  Count  Frequency (%)
8867   1      0.1%
9077   1      0.1%
9617   1      0.1%
9832   1      0.1%
9975   1      0.1%
16320  1      0.1%
16692  1      0.1%
16993  1      0.1%
17356  1      0.1%
17479  1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
1202628  1      0.1%
1201166  1      0.1%
1194465  1      0.1%
1186078  1      0.1%
1183714  1      0.1%
1079353  1      0.1%
1079216  1      0.1%
1077508  1      0.1%
1074176  1      0.1%
1066351  1      0.1%

세대수(가구)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1296
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100898.14
Minimum: 5258
Maximum: 517822
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 5258
5th percentile: 14540.95
Q1: 29133.5
Median: 81870.5
Q3: 143060.5
95th percentile: 264723.65
Maximum: 517822
Range: 512564
Interquartile range (IQR): 113927

Descriptive statistics

Standard deviation: 86782.529
Coefficient of variation (CV): 0.86010036
Kurtosis: 3.295955
Mean: 100898.14
Median Absolute Deviation (MAD): 55048
Skewness: 1.5817569
Sum: 1.3116758 × 10^8
Variance: 7.5312073 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
61502                2      0.2%
23160                2      0.2%
16705                2      0.2%
75841                2      0.2%
95963                1      0.1%
84548                1      0.1%
24132                1      0.1%
89194                1      0.1%
87404                1      0.1%
87148                1      0.1%
Other values (1286)  1286   98.9%

Minimum 10 values

Value  Count  Frequency (%)
5258   1      0.1%
5312   1      0.1%
5426   1      0.1%
5435   1      0.1%
5490   1      0.1%
8827   1      0.1%
8913   1      0.1%
8942   1      0.1%
9055   1      0.1%
9127   1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
517822  1      0.1%
506950  1      0.1%
498836  1      0.1%
492939  1      0.1%
483558  1      0.1%
456852  1      0.1%
451940  1      0.1%
448574  1      0.1%
442097  1      0.1%
434028  1      0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

                 통계연도  시도명  세대당 인구(명)  총인구수(명)  세대수(가구)
통계연도         1.000     0.000   0.377            0.000         0.000
시도명           0.000     1.000   0.464            0.567         0.577
세대당 인구(명)  0.377     0.464   1.000            0.678         0.662
총인구수(명)     0.000     0.567   0.678            1.000         0.993
세대수(가구)     0.000     0.577   0.662            0.993         1.000

[Figure: correlation heatmap]

          통계연도  시도명
통계연도  1.000     0.000
시도명    0.000     1.000

[Figure: correlation heatmap]

                 세대당 인구(명)  총인구수(명)  세대수(가구)  통계연도  시도명
세대당 인구(명)  1.000            0.760         0.713         0.166     0.204
총인구수(명)     0.760            1.000         0.997         0.000     0.262
세대수(가구)     0.713            0.997         1.000         0.000     0.268
통계연도         0.166            0.000         0.000         1.000     0.000
시도명           0.204            0.262         0.268         0.000     1.000
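
The report does not say which coefficient each matrix uses. Purely as an illustration, the sketch below computes a Pearson correlation between 총인구수(명) and 세대수(가구) on the ten rows for 강릉시 and 고성군 listed in the Sample section. Pearson is an assumption here, the `pearson` helper is hypothetical, and ten rows are not expected to reproduce the coefficients reported for the full 1300-row dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# 총인구수(명) and 세대수(가구) for 강릉시 and 고성군, 2017-2021 (from the Sample section).
population = [213952, 212957, 213442, 213321, 212965,
              30029, 28144, 27260, 26757, 27249]
households = [95963, 96859, 99086, 101424, 102879,
              15519, 14763, 14445, 14546, 15064]

r = pearson(population, households)
print(f"Pearson r on 10 sample rows: {r:.3f}")
```

A value close to 1 on this subset is at least consistent with the High correlation alert raised for these two variables.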

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

      통계연도  시도명  시군구명  세대당 인구(명)  총인구수(명)  세대수(가구)
0     2017      강원도  강릉시    2.23             213952        95963
1     2018      강원도  강릉시    2.2              212957        96859
2     2019      강원도  강릉시    2.15             213442        99086
3     2020      강원도  강릉시    2.1              213321        101424
4     2021      강원도  강릉시    2.07             212965        102879
5     2017      강원도  고성군    1.93             30029         15519
6     2018      강원도  고성군    1.91             28144         14763
7     2019      강원도  고성군    1.89             27260         14445
8     2020      강원도  고성군    1.84             26757         14546
9     2021      강원도  고성군    1.81             27249         15064

Last rows

      통계연도  시도명          시군구명  세대당 인구(명)  총인구수(명)  세대수(가구)
1290  2017      제주특별자치도  제주시    2.41             478700        198454
1291  2018      제주특별자치도  제주시    2.37             485946        204621
1292  2019      제주특별자치도  제주시    2.34             489405        209439
1293  2020      제주특별자치도  제주시    2.28             492466        216202
1294  2021      제주특별자치도  제주시    2.24             493096        219978
1295  2017      제주특별자치도  서귀포시  2.24             178383        79749
1296  2018      제주특별자치도  서귀포시  2.2              181245        82483
1297  2019      제주특별자치도  서귀포시  2.17             181584        83716
1298  2020      제주특별자치도  서귀포시  2.12             182169        85831
1299  2021      제주특별자치도  서귀포시  2.1              183663        87551
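
The sample rows suggest that 세대당 인구(명) is 총인구수(명) divided by 세대수(가구), shown to two decimal places. The report does not state this, so treat it as an assumption. A minimal check on the five 강릉시 rows:

```python
# (총인구수(명), 세대수(가구), reported 세대당 인구(명)) for 강릉시, 2017-2021.
rows = [
    (213952, 95963, 2.23),
    (212957, 96859, 2.20),
    (213442, 99086, 2.15),
    (213321, 101424, 2.10),
    (212965, 102879, 2.07),
]

# Assumption: population per household = total population / number of households.
ratios = [population / households for population, households, _ in rows]

for (population, households, reported), ratio in zip(rows, ratios):
    print(f"{population} / {households} = {ratio:.4f} (reported {reported})")
```

Each computed ratio lies within 0.005 of the reported figure, which is what two-decimal rounding would allow.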