Overview

Dataset statistics

Number of variables: 6
Number of observations: 1304
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 66.3 KiB
Average record size in memory: 52.1 B
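
The average record size follows from the two figures above it. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and that the profiler divides total memory by the number of observations (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table.
total_kib = 66.3   # "Total size in memory", in KiB
n_rows = 1304      # "Number of observations"

# Assumption: 1 KiB = 1024 bytes; bytes per record = total bytes / rows.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")  # prints 52.1 B
```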

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

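Description: One of the statistical indices developed by Gimhae City to assess urban conditions on a statistical basis. It consists of the statistical year (통계연도), province or metropolitan city name (시도명), city/county/district name (시군구명), registered resident population in persons (주민등록인구수(명)), youth population in persons (청년인구수(명)), and youth population ratio in percent (청년인구비율(퍼센트)). Because the index is centered on Gimhae City, information for other regions may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110128/fileData.do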

Alerts

주민등록인구수(명) is highly overall correlated with 청년인구수(명) and 1 other field. [High correlation]
청년인구수(명) is highly overall correlated with 주민등록인구수(명) and 1 other field. [High correlation]
청년인구비율(퍼센트) is highly overall correlated with 주민등록인구수(명) and 1 other field. [High correlation]

Reproduction

Analysis started: 2023-12-12 19:22:15.284685
Analysis finished: 2023-12-12 19:22:16.892399
Duration: 1.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Value  Count
2020  262
2021  262
2017  260
2018  260
2019  260

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2020  262  20.1%
2021  262  20.1%
2017  260  19.9%
2018  260  19.9%
2019  260  19.9%
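
The frequency column can be rechecked from the counts alone. A short sketch, assuming each percentage is the count divided by the total number of rows and rounded to one decimal place (the dictionary simply restates the table):

```python
# Counts per 통계연도 value, copied from the Common Values table.
counts = {"2020": 262, "2021": 262, "2017": 260, "2018": 260, "2019": 260}

# The counts should add up to the number of observations (1304).
total = sum(counts.values())

# Share of each year in percent, rounded to one decimal place.
shares = {year: round(100 * c / total, 1) for year, c in counts.items()}
print(total, shares)
```

The five years are almost evenly represented, with two more rows in each of 2020 and 2021 than in the earlier years.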

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of 통계연도 value frequencies; the counts are those listed under Common Values]

시도명
Categorical

Distinct: 16
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Value  Count
경기도  242
서울특별시  125
경상북도  125
경상남도  115
전라남도  110
Other values (11)  587

Length

Max length: 7
Median length: 5
Mean length: 4.0506135
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  242  18.6%
서울특별시  125  9.6%
경상북도  125  9.6%
경상남도  115  8.8%
전라남도  110  8.4%
강원도  92  7.1%
충청남도  85  6.5%
부산광역시  80  6.1%
전라북도  80  6.1%
충청북도  75  5.8%
Other values (6)  175  13.4%

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 238
Distinct (%): 18.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.3 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9624233
Min length: 2

Characters and Unicode

Total characters: 3863
Distinct characters: 143
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Common Values

Value  Count  Frequency (%)
중구  30  2.3%
동구  30  2.3%
남구  26  2.0%
북구  25  1.9%
서구  25  1.9%
강서구  10  0.8%
고성군  10  0.8%
남원시  5  0.4%
덕진구  5  0.4%
군산시  5  0.4%
Other values (228)  1133  86.9%

Most occurring characters

[Value column: the characters themselves were not preserved in this extract]
Count  Frequency (%)
530  13.7%
425  11.0%
390  10.1%
110  2.8%
110  2.8%
102  2.6%
100  2.6%
95  2.5%
95  2.5%
80  2.1%
Other values (133)  1826  47.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3863
100.0%

Most frequent character per category

Other Letter
Counts and frequencies are identical to those under Most occurring characters, since all 3863 characters are Other Letter.

Most occurring scripts

ValueCountFrequency (%)
Hangul 3863
100.0%

Most frequent character per script

Hangul
Counts and frequencies are identical to those under Most occurring characters, since all 3863 characters are in the Hangul script.

Most occurring blocks

ValueCountFrequency (%)
Hangul 3863
100.0%

Most frequent character per block

Hangul
Counts and frequencies are identical to those under Most occurring characters, since all 3863 characters are in the Hangul block.

주민등록인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1300
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 232465.98
Minimum: 0
Maximum: 1202628
Zeros: 4
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 27250.65
Q1: 61498
Median: 185626.5
Q3: 342149.25
95-th percentile: 600979.7
Maximum: 1202628
Range: 1202628
Interquartile range (IQR): 280651.25

Descriptive statistics

Standard deviation: 210727.21
Coefficient of variation (CV): 0.90648622
Kurtosis: 3.4766892
Mean: 232465.98
Median Absolute Deviation (MAD): 132368
Skewness: 1.6206137
Sum: 3.0313564 × 10^8
Variance: 4.4405956 × 10^10
Monotonicity: Not monotonic
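
Several of the figures for 주민등록인구수(명) are tied together by simple identities. The sketch below rechecks them from the rounded values printed in the report, so it compares with a tolerance; it assumes the sum and variance carry powers of ten (10^8 and 10^10) and uses the standard definitions CV = standard deviation / mean and IQR = Q3 - Q1.

```python
import math

# Rounded values as printed in the report for 주민등록인구수(명).
mean = 232465.98
std = 210727.21
q1, q3 = 61498, 342149.25
total_sum = 3.0313564e8   # assumed reading of the printed sum
n = 1304                  # number of observations

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
mean_from_sum = total_sum / n  # mean recovered from sum and row count

# Inputs are rounded, so compare against the reported figures with a tolerance.
checks = {
    "cv": math.isclose(cv, 0.90648622, rel_tol=1e-5),
    "variance": math.isclose(variance, 4.4405956e10, rel_tol=1e-5),
    "iqr": math.isclose(iqr, 280651.25),
    "mean": math.isclose(mean_from_sum, mean, rel_tol=1e-5),
}
print(checks)
```

A CV close to 1 together with a positive skewness of about 1.62 fits a distribution in which a few very large districts sit far above the median.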
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
0  4  0.3%
122499  2  0.2%
239413  1  0.1%
351888  1  0.1%
654915  1  0.1%
94353  1  0.1%
537307  1  0.1%
298599  1  0.1%
818383  1  0.1%
310614  1  0.1%
Other values (1290)  1290  98.9%

Minimum 10 values

Value  Count  Frequency (%)
0  4  0.3%
8867  1  0.1%
9077  1  0.1%
9617  1  0.1%
9832  1  0.1%
9975  1  0.1%
16320  1  0.1%
16692  1  0.1%
16993  1  0.1%
17356  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
1202628  1  0.1%
1201166  1  0.1%
1194465  1  0.1%
1186078  1  0.1%
1183714  1  0.1%
1079353  1  0.1%
1079216  1  0.1%
1077508  1  0.1%
1074176  1  0.1%
1066351  1  0.1%

청년인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1293
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74633.215
Minimum: 0
Maximum: 455915
Zeros: 4
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5474.95
Q1: 13265.25
Median: 56782
Q3: 110755.75
95-th percentile: 212657.8
Maximum: 455915
Range: 455915
Interquartile range (IQR): 97490.5

Descriptive statistics

Standard deviation: 73831.876
Coefficient of variation (CV): 0.98926296
Kurtosis: 3.4422303
Mean: 74633.215
Median Absolute Deviation (MAD): 45437.5
Skewness: 1.6142643
Sum: 97321713
Variance: 5.4511458 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
0  4  0.3%
7803  2  0.2%
165962  2  0.2%
54549  2  0.2%
84277  2  0.2%
14040  2  0.2%
6917  2  0.2%
4060  2  0.2%
8121  2  0.2%
52905  1  0.1%
Other values (1283)  1283  98.4%

Minimum 10 values

Value  Count  Frequency (%)
0  4  0.3%
1820  1  0.1%
1919  1  0.1%
2161  1  0.1%
2322  1  0.1%
2417  1  0.1%
2503  1  0.1%
2669  1  0.1%
2896  1  0.1%
3053  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
455915  1  0.1%
450021  1  0.1%
438636  1  0.1%
426148  1  0.1%
419635  1  0.1%
353301  1  0.1%
352999  1  0.1%
351058  1  0.1%
348651  1  0.1%
348574  1  0.1%

청년인구비율(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 246
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.389724
Minimum: 0
Maximum: 42.9
Zeros: 4
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 17.715
Q1: 23.1
Median: 30
Q3: 33.2
95-th percentile: 36.5
Maximum: 42.9
Range: 42.9
Interquartile range (IQR): 10.1

Descriptive statistics

Standard deviation: 6.264668
Coefficient of variation (CV): 0.22066674
Kurtosis: 0.16925837
Mean: 28.389724
Median Absolute Deviation (MAD): 4.1
Skewness: -0.6333418
Sum: 37020.2
Variance: 39.246065
Monotonicity: Not monotonic
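
The four zero values stand out against the rest of the 청년인구비율(퍼센트) distribution. As a rough illustration, the sketch below applies the common 1.5 × IQR fence rule to the quartiles reported above. The rule is a convention chosen here for illustration, not something the report itself applies, and reading the zeros as placeholder or missing entries is only an assumption.

```python
# Quartiles of 청년인구비율(퍼센트) as reported in the quantile statistics.
q1, q3 = 23.1, 33.2
iqr = q3 - q1

# Conventional 1.5 x IQR fences (an illustrative choice, not from the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Five smallest distinct values and the maximum, taken from the report.
smallest_values = [0.0, 13.3, 13.7, 14.3, 14.6]
largest_value = 42.9

below = [v for v in smallest_values if v < lower_fence]
above = largest_value > upper_fence
print(f"fences: {lower_fence:.2f} to {upper_fence:.2f}")
print("below lower fence:", below, "| maximum above upper fence:", above)
```

Under this rule only the zero values fall outside the fences, so the four zero rows may deserve a separate look before any modeling.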
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
33.2  19  1.5%
33.4  19  1.5%
33.6  17  1.3%
32.5  17  1.3%
31.2  16  1.2%
32.8  15  1.2%
33.0  13  1.0%
32.4  13  1.0%
34.5  13  1.0%
31.4  13  1.0%
Other values (236)  1149  88.1%

Minimum 10 values

Value  Count  Frequency (%)
0.0  4  0.3%
13.3  1  0.1%
13.7  1  0.1%
14.3  1  0.1%
14.6  1  0.1%
14.7  2  0.2%
14.8  1  0.1%
15.2  1  0.1%
15.3  3  0.2%
15.4  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
42.9  1  0.1%
42.8  1  0.1%
42.7  2  0.2%
42.1  1  0.1%
40.7  1  0.1%
40.3  2  0.2%
39.9  1  0.1%
39.7  1  0.1%
39.2  1  0.1%
39.0  1  0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  통계연도  시도명  주민등록인구수(명)  청년인구수(명)  청년인구비율(퍼센트)
통계연도  1.000  0.000  0.000  0.000  0.189
시도명  0.000  1.000  0.567  0.556  0.663
주민등록인구수(명)  0.000  0.567  1.000  0.939  0.595
청년인구수(명)  0.000  0.556  0.939  1.000  0.604
청년인구비율(퍼센트)  0.189  0.663  0.595  0.604  1.000

[Figure: correlation heatmap]

Variable  시도명  통계연도
시도명  1.000  0.000
통계연도  0.000  1.000

[Figure: correlation heatmap]

Variable  주민등록인구수(명)  청년인구수(명)  청년인구비율(퍼센트)  통계연도  시도명
주민등록인구수(명)  1.000  0.994  0.797  0.000  0.261
청년인구수(명)  0.994  1.000  0.849  0.000  0.266
청년인구비율(퍼센트)  0.797  0.849  1.000  0.116  0.294
통계연도  0.000  0.000  0.116  1.000  0.000
시도명  0.261  0.266  0.294  0.000  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  통계연도  시도명  시군구명  주민등록인구수(명)  청년인구수(명)  청년인구비율(퍼센트)
0  2017  서울특별시  종로구  154770  52905  34.2
1  2017  서울특별시  중구  125709  43247  34.4
2  2017  서울특별시  용산구  229161  80153  35.0
3  2017  서울특별시  성동구  304808  111895  36.7
4  2017  서울특별시  광진구  357703  139365  39.0
5  2017  서울특별시  동대문구  350647  123264  35.2
6  2017  서울특별시  중랑구  408226  140679  34.5
7  2017  서울특별시  성북구  444055  151328  34.1
8  2017  서울특별시  강북구  324479  104922  32.3
9  2017  서울특별시  도봉구  344166  112234  32.6

Last rows

Index  통계연도  시도명  시군구명  주민등록인구수(명)  청년인구수(명)  청년인구비율(퍼센트)
1294  2021  경상남도  창녕군  60129  11208  18.6
1295  2021  경상남도  고성군  50478  9208  18.2
1296  2021  경상남도  남해군  42266  6838  16.2
1297  2021  경상남도  하동군  43449  6917  15.9
1298  2021  경상남도  산청군  34360  5249  15.3
1299  2021  경상남도  함양군  38310  6796  17.7
1300  2021  경상남도  거창군  61073  13584  22.2
1301  2021  경상남도  합천군  42935  6271  14.6
1302  2021  제주특별자치도  제주시  493096  149570  30.3
1303  2021  제주특별자치도  서귀포시  183663  46397  25.3
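
The sample rows suggest that 청년인구비율(퍼센트) is 청년인구수(명) divided by 주민등록인구수(명), expressed in percent and rounded to one decimal place. That relationship is an inference from the numbers, not something the dataset description states. A small sketch that rechecks it on a few of the sample rows (the tuple layout is illustrative):

```python
# (시군구명, 주민등록인구수(명), 청년인구수(명), 청년인구비율(퍼센트))
# Values copied from the first and last sample rows of the report.
rows = [
    ("종로구", 154770, 52905, 34.2),
    ("중구", 125709, 43247, 34.4),
    ("용산구", 229161, 80153, 35.0),
    ("창녕군", 60129, 11208, 18.6),
    ("제주시", 493096, 149570, 30.3),
]

# Assumed formula: ratio = 100 * youth / total, rounded to one decimal place.
mismatches = [
    name
    for name, total, youth, reported in rows
    if round(100 * youth / total, 1) != reported
]
print("rows that do not match the assumed formula:", mismatches)
```

If the formula holds across the full file, the ratio column is fully determined by the two population columns, which would be consistent with the high correlation alerts among the three numeric variables.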