Overview

Dataset statistics

Number of variables: 6
Number of observations: 1510
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 76.8 KiB
Average record size in memory: 52.1 B
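
The average record size follows from the total size and the row count. The sketch below recomputes it from the figures in the dataset statistics, assuming 1 KiB = 1024 B; the function names are illustrative and are not part of ydata-profiling.

```python
# Figures copied from the dataset statistics.
N_VARIABLES = 6
N_OBSERVATIONS = 1510
TOTAL_SIZE_KIB = 76.8


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (assuming 1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed in percent."""
    return 100.0 * part / whole


avg_record_size = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
missing_pct = percent(0, N_VARIABLES * N_OBSERVATIONS)

print(f"Average record size: {avg_record_size:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```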

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

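Description: One of the statistical indices developed by Gimhae-si (김해시) to assess the state of the city from statistics. It consists of the columns 통계연도 (statistical year), 시도명 (province or metropolitan city name), 시군구명 (city, county, or district name), 1인가구비율(퍼센트) (single-person household ratio, percent), 일반가구수(가구) (number of general households), and 1인가구수(가구) (number of single-person households). Because the index is centered on Gimhae-si, information for other regions may be missing owing to difficulties in collecting and processing the data.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110115/fileData.do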

Alerts

일반가구수(가구) is highly overall correlated with 1인가구수(가구) (High correlation)
1인가구수(가구) is highly overall correlated with 일반가구수(가구) (High correlation)

Reproduction

Analysis started: 2023-12-12 11:01:07.108510
Analysis finished: 2023-12-12 11:01:09.446608
Duration: 2.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
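
A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The sketch below is a minimal version under stated assumptions: the CSV file name, the output file name, and the `encoding` default are hypothetical, and the `config.json` referenced above is not used. pandas and ydata-profiling are imported inside the function so the module still loads where they are not installed.

```python
from pathlib import Path


def build_profile_report(csv_path: str, html_path: str, encoding: str = "utf-8") -> Path:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily: both packages are third-party and may be absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
    return Path(html_path)


# Example call (hypothetical file names; use the CSV downloaded from the dataset URL):
# build_profile_report("single_person_households.csv", "report.html")
```

Files published on Korean open-data portals are sometimes encoded as CP949 rather than UTF-8; if the column names come out garbled, passing `encoding="cp949"` is worth trying.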

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 11.9 KiB

Value  Count
2021   302
2017   302
2018   302
2019   302
2020   302

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value  Count  Frequency (%)
2021   302    20.0%
2017   302    20.0%
2018   302    20.0%
2019   302    20.0%
2020   302    20.0%
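
A common-values table like this one is a frequency count over the column. The sketch below rebuilds the 통계연도 column from the counts shown (302 rows for each of the five years) and tabulates it with `collections.Counter`. The helper name `common_values` is illustrative, and this is not necessarily how ydata-profiling computes the table internally.

```python
from collections import Counter

# Column rebuilt from the reported counts: 302 rows for each year.
years = [year for year in ("2017", "2018", "2019", "2020", "2021") for _ in range(302)]


def common_values(values, top=10):
    """Return (value, count, percent) triples, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 1))
        for value, count in counts.most_common(top)
    ]


table = common_values(years)
for value, count, pct in table:
    print(f"{value}  {count}  {pct}%")
```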

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 통계연도; each of the five years has 302 rows (20.0%)]

시도명
Categorical

Distinct: 17
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.9 KiB

Value              Count
경기도             255
경상북도           140
경상남도           130
전라남도           125
서울특별시         125
Other values (12)  735

Length

Max length: 7
Median length: 5
Mean length: 4.1258278
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value             Count  Frequency (%)
경기도            255    16.9%
경상북도          140    9.3%
경상남도          130    8.6%
전라남도          125    8.3%
서울특별시        125    8.3%
강원도            105    7.0%
충청남도          100    6.6%
전라북도          95     6.3%
부산광역시        95     6.3%
충청북도          90     6.0%
Other values (7)  250    16.6%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 239
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Memory size: 11.9 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.8231788
Min length: 2

Characters and Unicode

Total characters: 4263
Distinct characters: 142
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value               Count  Frequency (%)
동부                70     4.6%
읍부                70     4.6%
면부                70     4.6%
동구                30     2.0%
중구                30     2.0%
남구                26     1.7%
서구                25     1.7%
북구                25     1.7%
고성군              10     0.7%
강서구              10     0.7%
Other values (229)  1144   75.8%

Most occurring characters

Value               Count  Frequency (%)
                    530    12.4%
                    425    10.0%
                    390    9.1%
                    240    5.6%
                    170    4.0%
                    110    2.6%
                    110    2.6%
                    100    2.3%
                    95     2.2%
                    95     2.2%
Other values (132)  1998   46.9%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  4263   100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    530    12.4%
                    425    10.0%
                    390    9.1%
                    240    5.6%
                    170    4.0%
                    110    2.6%
                    110    2.6%
                    100    2.3%
                    95     2.2%
                    95     2.2%
Other values (132)  1998   46.9%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4263   100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    530    12.4%
                    425    10.0%
                    390    9.1%
                    240    5.6%
                    170    4.0%
                    110    2.6%
                    110    2.6%
                    100    2.3%
                    95     2.2%
                    95     2.2%
Other values (132)  1998   46.9%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  4263   100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    530    12.4%
                    425    10.0%
                    390    9.1%
                    240    5.6%
                    170    4.0%
                    110    2.6%
                    110    2.6%
                    100    2.3%
                    95     2.2%
                    95     2.2%
Other values (132)  1998   46.9%

1인가구비율(퍼센트)
Real number (ℝ)

Distinct: 255
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.128675
Minimum: 17
Maximum: 53.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.4 KiB

Quantile statistics

Minimum: 17
5-th percentile: 22.8
Q1: 28.6
Median: 32.5
Q3: 35.7
95-th percentile: 40.2
Maximum: 53.9
Range: 36.9
Interquartile range (IQR): 7.1

Descriptive statistics

Standard deviation: 5.3694833
Coefficient of variation (CV): 0.16712433
Kurtosis: 0.24997285
Mean: 32.128675
Median Absolute Deviation (MAD): 3.5
Skewness: -0.068097588
Sum: 48514.3
Variance: 28.831351
Monotonicity: Not monotonic
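
Several of these figures are derived from others: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks that the reported values for this variable are mutually consistent. It uses only the numbers printed in the report, and the relative tolerance of 1e-5 is an arbitrary choice made to absorb rounding in the printed values.

```python
import math

# Values copied from the statistics reported for the single-person household ratio.
n = 1510
mean, std = 32.128675, 5.3694833
minimum, q1, q3, maximum = 17.0, 28.6, 35.7, 53.9

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}
reported = {
    "range": 36.9,
    "iqr": 7.1,
    "cv": 0.16712433,
    "variance": 28.831351,
    "sum": 48514.3,
}

# Compare each derived value with the reported one, within the tolerance.
checks = {
    name: math.isclose(value, reported[name], rel_tol=1e-5)
    for name, value in derived.items()
}
for name, ok in checks.items():
    print(f"{name}: derived={derived[name]:.6f} reported={reported[name]} consistent={ok}")
```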
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
35.2                20     1.3%
32.1                19     1.3%
35.0                18     1.2%
30.7                18     1.2%
34.0                16     1.1%
36.4                16     1.1%
29.9                15     1.0%
32.9                15     1.0%
33.6                15     1.0%
33.8                13     0.9%
Other values (245)  1345   89.1%

Minimum 10 values

Value  Count  Frequency (%)
17.0   1      0.1%
17.2   1      0.1%
17.7   1      0.1%
17.8   1      0.1%
18.2   1      0.1%
18.3   2      0.1%
18.4   1      0.1%
18.5   2      0.1%
18.6   1      0.1%
19.0   1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
53.9   1      0.1%
51.9   1      0.1%
51.7   1      0.1%
49.5   1      0.1%
48.8   1      0.1%
48.6   1      0.1%
47.6   1      0.1%
46.9   1      0.1%
46.3   1      0.1%
46.0   1      0.1%

일반가구수(가구)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1504
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 129868.84
Minimum: 4061
Maximum: 4389800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.4 KiB

Quantile statistics

Minimum: 4061
5-th percentile: 11856.05
Q1: 26601
Median: 81726
Q3: 152717.25
95-th percentile: 362014.7
Maximum: 4389800
Range: 4385739
Interquartile range (IQR): 126116.25

Descriptive statistics

Standard deviation: 271618.97
Coefficient of variation (CV): 2.0914868
Kurtosis: 152.1009
Mean: 129868.84
Median Absolute Deviation (MAD): 58951
Skewness: 10.977772
Sum: 1.9610195 × 10^8
Variance: 7.3776865 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
11723                2      0.1%
111810               2      0.1%
14044                2      0.1%
128012               2      0.1%
19431                2      0.1%
11518                2      0.1%
4082640              1      0.1%
531700               1      0.1%
113075               1      0.1%
92620                1      0.1%
Other values (1494)  1494   98.9%

Minimum 10 values

Value  Count  Frequency (%)
4061   1      0.1%
4072   1      0.1%
4116   1      0.1%
4135   1      0.1%
4145   1      0.1%
6915   1      0.1%
6959   1      0.1%
7072   1      0.1%
7539   1      0.1%
7565   1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
4389800  1      0.1%
4235667  1      0.1%
4082640  1      0.1%
3939290  1      0.1%
3806189  1      0.1%
1361719  1      0.1%
1338453  1      0.1%
1314852  1      0.1%
1302500  1      0.1%
1295613  1      0.1%

1인가구수(가구)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1494
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39151.972
Minimum: 1641
Maximum: 1258890
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.4 KiB

Quantile statistics

Minimum: 1641
5-th percentile: 4078.35
Q1: 9194.5
Median: 23992.5
Q3: 44816.75
95-th percentile: 108280
Maximum: 1258890
Range: 1257249
Interquartile range (IQR): 35622.25

Descriptive statistics

Standard deviation: 74399.309
Coefficient of variation (CV): 1.9002698
Kurtosis: 133.10049
Mean: 39151.972
Median Absolute Deviation (MAD): 16208.5
Skewness: 9.9928506
Sum: 59119478
Variance: 5.5352571 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
108280               2      0.1%
3944                 2      0.1%
15563                2      0.1%
34072                2      0.1%
4250                 2      0.1%
4768                 2      0.1%
53948                2      0.1%
4981                 2      0.1%
8326                 2      0.1%
6100                 2      0.1%
Other values (1484)  1490   98.7%

Minimum 10 values

Value  Count  Frequency (%)
1641   1      0.1%
1661   1      0.1%
1734   1      0.1%
1739   1      0.1%
1850   1      0.1%
2057   1      0.1%
2064   1      0.1%
2086   1      0.1%
2231   1      0.1%
2352   1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
1258890  1      0.1%
1147357  1      0.1%
1054072  1      0.1%
969930   1      0.1%
910375   1      0.1%
467033   1      0.1%
436804   1      0.1%
406575   1      0.1%
388766   1      0.1%
374549   1      0.1%

Interactions

[Figures: nine interaction plots (images not reproduced)]

Correlations

                     통계연도  시도명  1인가구비율(퍼센트)  일반가구수(가구)  1인가구수(가구)
통계연도             1.000     0.000   0.325                0.000             0.000
시도명               0.000     1.000   0.564                0.223             0.202
1인가구비율(퍼센트)  0.325     0.564   1.000                0.081             0.156
일반가구수(가구)     0.000     0.223   0.081                1.000             0.936
1인가구수(가구)      0.000     0.202   0.156                0.936             1.000

          통계연도  시도명
통계연도  1.000     0.000
시도명    0.000     1.000

                     1인가구비율(퍼센트)  일반가구수(가구)  1인가구수(가구)  통계연도  시도명
1인가구비율(퍼센트)  1.000                -0.423            -0.286           0.141     0.258
일반가구수(가구)     -0.423               1.000             0.985            0.000     0.106
1인가구수(가구)      -0.286               0.985             1.000            0.000     0.092
통계연도             0.141                0.000             0.000            1.000     0.000
시도명               0.258                0.106             0.092            0.000     1.000
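
The strong association between 일반가구수(가구) and 1인가구수(가구) can be illustrated on a small scale. The sketch below computes Pearson and Spearman coefficients, using only the standard library, for the ten Seoul rows listed in the report's Sample section. Two caveats apply: this extract does not state which coefficient each matrix above uses, so the choice of these two is an assumption, and ten rows from one city will not reproduce coefficients computed on all 1510 rows.

```python
from math import sqrt

# (general households, single-person households) for the ten Seoul districts in the sample.
households = [63984, 56116, 96989, 122756, 153962, 153246, 167260, 181827, 130329, 128439]
single = [27308, 24544, 39270, 44946, 66140, 65290, 60487, 64985, 48428, 37853]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; assumes no ties, which holds for this sample."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


r = pearson(households, single)
rho = spearman(households, single)
print(f"Pearson r = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
```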

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

   통계연도  시도명      시군구명  1인가구비율(퍼센트)  일반가구수(가구)  1인가구수(가구)
0  2021      서울특별시  종로구    42.7                 63984             27308
1  2021      서울특별시  중구      43.7                 56116             24544
2  2021      서울특별시  용산구    40.5                 96989             39270
3  2021      서울특별시  성동구    36.6                 122756            44946
4  2021      서울특별시  광진구    43.0                 153962            66140
5  2021      서울특별시  동대문구  42.6                 153246            65290
6  2021      서울특별시  중랑구    36.2                 167260            60487
7  2021      서울특별시  성북구    35.7                 181827            64985
8  2021      서울특별시  강북구    37.2                 130329            48428
9  2021      서울특별시  도봉구    29.5                 128439            37853

Last rows

      통계연도  시도명          시군구명  1인가구비율(퍼센트)  일반가구수(가구)  1인가구수(가구)
1500  2020      경상남도        하동군    36.4                 19435             7072
1501  2020      경상남도        산청군    38.6                 15611             6032
1502  2020      경상남도        함양군    38.5                 17361             6679
1503  2020      경상남도        거창군    34.0                 25874             8787
1504  2020      경상남도        합천군    40.1                 20139             8080
1505  2020      제주특별자치도  동부      31.0                 188016            58231
1506  2020      제주특별자치도  읍부      30.6                 60060             18382
1507  2020      제주특별자치도  면부      35.0                 14992             5242
1508  2020      제주특별자치도  제주시    30.8                 191619            59038
1509  2020      제주특별자치도  서귀포시  31.9                 71449             22817
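
The three numeric columns appear to be related by 1인가구비율(퍼센트) = 100 × 1인가구수(가구) / 일반가구수(가구), rounded to one decimal place. The sketch below tests that assumption on the first ten sample rows. The formula is inferred from the column names, not stated by the publisher, and the tolerance of 0.05 reflects the one-decimal rounding of the reported ratio.

```python
# First ten sample rows: (district, reported ratio in percent,
# general households, single-person households).
rows = [
    ("종로구", 42.7, 63984, 27308),
    ("중구", 43.7, 56116, 24544),
    ("용산구", 40.5, 96989, 39270),
    ("성동구", 36.6, 122756, 44946),
    ("광진구", 43.0, 153962, 66140),
    ("동대문구", 42.6, 153246, 65290),
    ("중랑구", 36.2, 167260, 60487),
    ("성북구", 35.7, 181827, 64985),
    ("강북구", 37.2, 130329, 48428),
    ("도봉구", 29.5, 128439, 37853),
]


def single_person_ratio(single: int, total: int) -> float:
    """Single-person households as a percentage of general households."""
    return 100.0 * single / total


# Collect rows whose recomputed ratio differs from the reported one
# by more than the rounding tolerance.
mismatches = []
for name, reported, total, single in rows:
    computed = single_person_ratio(single, total)
    if abs(computed - reported) > 0.05:
        mismatches.append((name, reported, round(computed, 3)))

print("rows checked:", len(rows))
print("mismatches:", mismatches)
```

A non-empty `mismatches` list on the full file would point either to a different definition of the ratio or to inconsistent source rows.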