Overview

Dataset statistics

Number of variables: 6
Number of observations: 2610
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 132.7 KiB
Average record size in memory: 52.1 B
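
The average record size follows from the total memory footprint and the row count. A minimal sketch, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" block above.
total_kib = 132.7   # total size in memory, in KiB
n_rows = 2610       # number of observations
n_cols = 6          # number of variables

bytes_per_row = total_kib * 1024 / n_rows   # assumes 1 KiB = 1024 bytes
total_cells = n_rows * n_cols               # cells checked for missing values

print(f"average record size: {bytes_per_row:.1f} B")
print(f"total cells: {total_cells}")
```

The result rounds to the 52.1 B shown in the report, which suggests the two memory figures are consistent with each other.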

Variable types

Numeric: 4
Categorical: 1
Text: 1

Dataset

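Description: One of the statistical indices developed by Gimhae City to assess urban conditions on a statistical basis. It consists of the statistics year (통계연도), province name (시도명), city/county/district name (시군구명), school-age population (ages 6 to 21) ratio (%), total population (persons), and total school-age population (ages 6 to 21) (persons). Because the index is centered on Gimhae City, information for regions other than Gimhae City may be missing owing to difficulties in data collection and processing.
Author: 경상남도 김해시 (Gimhae City, Gyeongsangnam-do)
URL: https://www.data.go.kr/data/15110187/fileData.do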

Alerts

학령인구(6-21세) 비율(퍼센트) is highly overall correlated with 총인구수(명) and 1 other field (High correlation)
총인구수(명) is highly overall correlated with 학령인구(6-21세) 비율(퍼센트) and 1 other field (High correlation)
총 학령인구(6-21세)(명) is highly overall correlated with 학령인구(6-21세) 비율(퍼센트) and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-12 01:32:55.515996
Analysis finished: 2023-12-12 01:32:58.371891
Duration: 2.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
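
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out:

```python
from datetime import datetime

# Assumed layout of the timestamps in the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 01:32:55.515996", FMT)
finished = datetime.strptime("2023-12-12 01:32:58.371891", FMT)

elapsed = (finished - started).total_seconds()
print(f"{elapsed:.2f} seconds")
```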

Variables

통계연도
Real number (ℝ)

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.4893
Minimum: 2012
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.1 KiB

Quantile statistics

Minimum: 2012
5-th percentile: 2012
Q1: 2014
Median: 2016
Q3: 2019
95-th percentile: 2021
Maximum: 2021
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.8733453
Coefficient of variation (CV): 0.0014249247
Kurtosis: -1.2253071
Mean: 2016.4893
Median Absolute Deviation (MAD): 2
Skewness: 0.0059544183
Sum: 5263037
Variance: 8.2561133
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=10)]
Common values
Value | Count | Frequency (%)
2014 | 263 | 10.1%
2015 | 263 | 10.1%
2013 | 262 | 10.0%
2012 | 262 | 10.0%
2019 | 260 | 10.0%
2017 | 260 | 10.0%
2018 | 260 | 10.0%
2016 | 260 | 10.0%
2020 | 260 | 10.0%
2021 | 260 | 10.0%

Minimum 10 values
Value | Count | Frequency (%)
2012 | 262 | 10.0%
2013 | 262 | 10.0%
2014 | 263 | 10.1%
2015 | 263 | 10.1%
2016 | 260 | 10.0%
2017 | 260 | 10.0%
2018 | 260 | 10.0%
2019 | 260 | 10.0%
2020 | 260 | 10.0%
2021 | 260 | 10.0%

Maximum 10 values
Value | Count | Frequency (%)
2021 | 260 | 10.0%
2020 | 260 | 10.0%
2019 | 260 | 10.0%
2018 | 260 | 10.0%
2017 | 260 | 10.0%
2016 | 260 | 10.0%
2015 | 263 | 10.1%
2014 | 263 | 10.1%
2013 | 262 | 10.0%
2012 | 262 | 10.0%
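
Because 통계연도 has only ten distinct values, its frequency table is enough to recompute most of the descriptive statistics. The sketch below assumes the sample variance (divisor n - 1), which is the convention that reproduces the reported figure; the variable names are illustrative:

```python
from math import sqrt
from statistics import median

# Year -> row count, copied from the 통계연도 frequency table.
counts = {2012: 262, 2013: 262, 2014: 263, 2015: 263, 2016: 260,
          2017: 260, 2018: 260, 2019: 260, 2020: 260, 2021: 260}

n = sum(counts.values())
total = sum(year * c for year, c in counts.items())
mean = total / n

# Sample variance with divisor n - 1 (an assumption that matches the report).
variance = sum(c * (year - mean) ** 2 for year, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean

# Expand to one value per row for the order statistics.
values = [year for year, c in counts.items() for _ in range(c)]
med = median(values)
mad = median(abs(v - med) for v in values)

print(n, total, round(mean, 4), round(variance, 7), round(std, 7), med, mad)
```

Using the population variance (divisor n) instead would give a slightly smaller number, so the comparison also indicates which convention the report uses.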

시도명
Categorical

Distinct: 16
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 20.5 KiB

Value | Count
경기도 | 492
경상북도 | 250
서울특별시 | 250
경상남도 | 230
전라남도 | 220
Other values (11) | 1168

Length

Max length: 7
Median length: 5
Mean length: 4.0490421
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대전광역시
2nd row: 경상북도
3rd row: 부산광역시
4th row: 경상남도
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
경기도 | 492 | 18.9%
경상북도 | 250 | 9.6%
서울특별시 | 250 | 9.6%
경상남도 | 230 | 8.8%
전라남도 | 220 | 8.4%
강원도 | 180 | 6.9%
충청남도 | 170 | 6.5%
부산광역시 | 160 | 6.1%
전라북도 | 160 | 6.1%
충청북도 | 148 | 5.7%
Other values (6) | 350 | 13.4%
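
The frequency column is each count divided by the number of observations. A short sketch that recomputes the percentages from the counts listed for 시도명; the dictionary is transcribed by hand and the rounding to one decimal is an assumption about how the report formats the values:

```python
# Province -> row count, copied from the 시도명 common-values table.
counts = {
    "경기도": 492, "경상북도": 250, "서울특별시": 250, "경상남도": 230,
    "전라남도": 220, "강원도": 180, "충청남도": 170, "부산광역시": 160,
    "전라북도": 160, "충청북도": 148, "Other values (6)": 350,
}

n_rows = sum(counts.values())   # should equal the number of observations
shares = {name: round(100 * c / n_rows, 1) for name, c in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The counts sum to 2610, so the table accounts for every row of the dataset.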

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 241
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.5 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9685824
Min length: 2

Characters and Unicode

Total characters: 7748
Distinct characters: 144
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 서구
2nd row: 칠곡군
3rd row: 기장군
4th row: 김해시
5th row: 영도구
Value | Count | Frequency (%)
동구 | 60 | 2.3%
중구 | 60 | 2.3%
남구 | 56 | 2.1%
서구 | 50 | 1.9%
북구 | 50 | 1.9%
강서구 | 20 | 0.8%
고성군 | 20 | 0.8%
진도군 | 10 | 0.4%
서초구 | 10 | 0.4%
철원군 | 10 | 0.4%
Other values (231) | 2264 | 86.7%

Most occurring characters

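Value | Count | Frequency (%)
(glyph lost) | 1068 | 13.8%
(glyph lost) | 853 | 11.0%
(glyph lost) | 779 | 10.1%
(glyph lost) | 220 | 2.8%
(glyph lost) | 220 | 2.8%
(glyph lost) | 200 | 2.6%
(glyph lost) | 200 | 2.6%
(glyph lost) | 190 | 2.5%
(glyph lost) | 190 | 2.5%
(glyph lost) | 158 | 2.0%
Other values (134) | 3670 | 47.4%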

Most occurring categories

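Value | Count | Frequency (%)
Other Letter | 7728 | 99.7%
Close Punctuation | 10 | 0.1%
Open Punctuation | 10 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 1068 | 13.8%
(glyph lost) | 853 | 11.0%
(glyph lost) | 779 | 10.1%
(glyph lost) | 220 | 2.8%
(glyph lost) | 220 | 2.8%
(glyph lost) | 200 | 2.6%
(glyph lost) | 200 | 2.6%
(glyph lost) | 190 | 2.5%
(glyph lost) | 190 | 2.5%
(glyph lost) | 158 | 2.0%
Other values (132) | 3650 | 47.2%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%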

Most occurring scripts

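Value | Count | Frequency (%)
Hangul | 7728 | 99.7%
Common | 20 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1068 | 13.8%
(glyph lost) | 853 | 11.0%
(glyph lost) | 779 | 10.1%
(glyph lost) | 220 | 2.8%
(glyph lost) | 220 | 2.8%
(glyph lost) | 200 | 2.6%
(glyph lost) | 200 | 2.6%
(glyph lost) | 190 | 2.5%
(glyph lost) | 190 | 2.5%
(glyph lost) | 158 | 2.0%
Other values (132) | 3650 | 47.2%

Common
Value | Count | Frequency (%)
) | 10 | 50.0%
( | 10 | 50.0%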

Most occurring blocks

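Value | Count | Frequency (%)
Hangul | 7728 | 99.7%
ASCII | 20 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1068 | 13.8%
(glyph lost) | 853 | 11.0%
(glyph lost) | 779 | 10.1%
(glyph lost) | 220 | 2.8%
(glyph lost) | 220 | 2.8%
(glyph lost) | 200 | 2.6%
(glyph lost) | 200 | 2.6%
(glyph lost) | 190 | 2.5%
(glyph lost) | 190 | 2.5%
(glyph lost) | 158 | 2.0%
Other values (132) | 3650 | 47.2%

ASCII
Value | Count | Frequency (%)
) | 10 | 50.0%
( | 10 | 50.0%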

학령인구(6-21세) 비율(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1115
Distinct (%): 42.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.640444
Minimum: 6.33
Maximum: 25.93
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.1 KiB

Quantile statistics

Minimum: 6.33
5-th percentile: 10.46
Q1: 13.25
Median: 15.675
Q3: 17.95
95-th percentile: 20.94
Maximum: 25.93
Range: 19.6
Interquartile range (IQR): 4.7
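
The range and interquartile range are simple differences of the quantiles listed above. A minimal sketch with the values typed in by hand; rounding to two decimals is only there to suppress floating-point noise:

```python
# Quantiles of 학령인구(6-21세) 비율(퍼센트), copied from the block above.
minimum, q1, q3, maximum = 6.33, 13.25, 17.95, 25.93

value_range = round(maximum - minimum, 2)   # spread of the whole column
iqr = round(q3 - q1, 2)                     # spread of the middle 50%

print(f"range = {value_range}, IQR = {iqr}")
```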

Descriptive statistics

Standard deviation: 3.2142181
Coefficient of variation (CV): 0.20550683
Kurtosis: -0.37541794
Mean: 15.640444
Median Absolute Deviation (MAD): 2.355
Skewness: 0.061559688
Sum: 40821.56
Variance: 10.331198
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
14.26 | 10 | 0.4%
14.98 | 10 | 0.4%
17.46 | 8 | 0.3%
12.34 | 8 | 0.3%
14.04 | 7 | 0.3%
16.01 | 7 | 0.3%
13.06 | 7 | 0.3%
12.56 | 7 | 0.3%
15.9 | 7 | 0.3%
13.75 | 7 | 0.3%
Other values (1105) | 2532 | 97.0%

Minimum 10 values
Value | Count | Frequency (%)
6.33 | 1 | < 0.1%
6.73 | 1 | < 0.1%
6.84 | 1 | < 0.1%
7.08 | 1 | < 0.1%
7.2 | 1 | < 0.1%
7.39 | 1 | < 0.1%
7.69 | 1 | < 0.1%
7.78 | 1 | < 0.1%
7.87 | 1 | < 0.1%
7.97 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
25.93 | 1 | < 0.1%
25.61 | 1 | < 0.1%
25.14 | 1 | < 0.1%
25.12 | 1 | < 0.1%
24.83 | 1 | < 0.1%
24.8 | 1 | < 0.1%
24.46 | 1 | < 0.1%
24.29 | 1 | < 0.1%
23.96 | 1 | < 0.1%
23.92 | 1 | < 0.1%

총인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2604
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 232758.07
Minimum: 8867
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.1 KiB

Quantile statistics

Minimum: 8867
5-th percentile: 27877.8
Q1: 62527
Median: 186634.5
Q3: 340392.5
95-th percentile: 599002.15
Maximum: 1202628
Range: 1193761
Interquartile range (IQR): 277865.5

Descriptive statistics

Standard deviation: 208000.28
Coefficient of variation (CV): 0.89363293
Kurtosis: 3.3906253
Mean: 232758.07
Median Absolute Deviation (MAD): 131634
Skewness: 1.5952235
Sum: 6.0749856 × 10^8
Variance: 4.3264115 × 10^10
Monotonicity: Not monotonic
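
Several of these figures are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the row count. A sketch of that arithmetic using the rounded values from the report, so agreement is only expected to about seven significant digits:

```python
# Rounded figures copied from the 총인구수(명) statistics.
n = 2610
mean = 232758.07
std = 208000.28

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance is the squared standard deviation
total = mean * n        # sum of the column

print(f"CV = {cv:.6f}")
print(f"variance = {variance:.6e}")
print(f"sum = {total:.6e}")
```

This also confirms that the exponents in the report are 10 to the power 8 for the sum and 10 to the power 10 for the variance.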
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
67667 | 2 | 0.1%
207840 | 2 | 0.1%
122499 | 2 | 0.1%
40241 | 2 | 0.1%
30948 | 2 | 0.1%
29209 | 2 | 0.1%
481222 | 1 | < 0.1%
62448 | 1 | < 0.1%
49314 | 1 | < 0.1%
20455 | 1 | < 0.1%
Other values (2594) | 2594 | 99.4%

Minimum 10 values
Value | Count | Frequency (%)
8867 | 1 | < 0.1%
9077 | 1 | < 0.1%
9617 | 1 | < 0.1%
9832 | 1 | < 0.1%
9975 | 1 | < 0.1%
10001 | 1 | < 0.1%
10153 | 1 | < 0.1%
10264 | 1 | < 0.1%
10524 | 1 | < 0.1%
10673 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1202628 | 1 | < 0.1%
1201166 | 1 | < 0.1%
1194465 | 1 | < 0.1%
1194041 | 1 | < 0.1%
1186078 | 1 | < 0.1%
1184624 | 1 | < 0.1%
1183714 | 1 | < 0.1%
1174228 | 1 | < 0.1%
1148157 | 1 | < 0.1%
1120258 | 1 | < 0.1%

총 학령인구(6-21세)(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2562
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39725.864
Minimum: 785
Maximum: 239869
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.1 KiB

Quantile statistics

Minimum: 785
5-th percentile: 3370.45
Q1: 8548
Median: 30391.5
Q3: 57284.25
95-th percentile: 111469.85
Maximum: 239869
Range: 239084
Interquartile range (IQR): 48736.25

Descriptive statistics

Standard deviation: 38768.042
Coefficient of variation (CV): 0.97588922
Kurtosis: 3.8067608
Mean: 39725.864
Median Absolute Deviation (MAD): 23017
Skewness: 1.7002402
Sum: 1.036845 × 10^8
Variance: 1.5029611 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
5531 | 3 | 0.1%
8548 | 2 | 0.1%
7161 | 2 | 0.1%
3797 | 2 | 0.1%
6989 | 2 | 0.1%
14998 | 2 | 0.1%
12111 | 2 | 0.1%
4448 | 2 | 0.1%
3195 | 2 | 0.1%
2946 | 2 | 0.1%
Other values (2552) | 2589 | 99.2%

Minimum 10 values
Value | Count | Frequency (%)
785 | 1 | < 0.1%
812 | 1 | < 0.1%
886 | 1 | < 0.1%
940 | 1 | < 0.1%
979 | 1 | < 0.1%
1039 | 1 | < 0.1%
1111 | 1 | < 0.1%
1154 | 1 | < 0.1%
1228 | 1 | < 0.1%
1313 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
239869 | 1 | < 0.1%
238154 | 1 | < 0.1%
234196 | 1 | < 0.1%
226910 | 1 | < 0.1%
221846 | 1 | < 0.1%
220542 | 1 | < 0.1%
214748 | 1 | < 0.1%
213752 | 1 | < 0.1%
211032 | 1 | < 0.1%
209367 | 1 | < 0.1%

Interactions

[Figure: 16 interaction plots; images not preserved]

Correlations

 | 통계연도 | 시도명 | 학령인구(6-21세) 비율(퍼센트) | 총인구수(명) | 총 학령인구(6-21세)(명)
통계연도 | 1.000 | 0.000 | 0.310 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.441 | 0.575 | 0.526
학령인구(6-21세) 비율(퍼센트) | 0.310 | 0.441 | 1.000 | 0.564 | 0.625
총인구수(명) | 0.000 | 0.575 | 0.564 | 1.000 | 0.959
총 학령인구(6-21세)(명) | 0.000 | 0.526 | 0.625 | 0.959 | 1.000

 | 통계연도 | 학령인구(6-21세) 비율(퍼센트) | 총인구수(명) | 총 학령인구(6-21세)(명) | 시도명
통계연도 | 1.000 | -0.394 | -0.008 | -0.086 | 0.000
학령인구(6-21세) 비율(퍼센트) | -0.394 | 1.000 | 0.617 | 0.729 | 0.189
총인구수(명) | -0.008 | 0.617 | 1.000 | 0.986 | 0.267
총 학령인구(6-21세)(명) | -0.086 | 0.729 | 0.986 | 1.000 | 0.236
시도명 | 0.000 | 0.189 | 0.267 | 0.236 | 1.000
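
The three "High correlation" alerts can be related to the second matrix by flagging every pair whose absolute coefficient reaches a cut-off. The sketch below assumes a cut-off of 0.5; the report does not state the threshold it used, so treat that value and the helper names as hypothetical. Only the numeric columns are listed, since every coefficient involving 시도명 in that matrix is below 0.3.

```python
# Assumed cut-off; the report itself does not state the threshold.
THRESHOLD = 0.5

# Upper triangle of the second correlation matrix (numeric columns only).
pairs = {
    ("통계연도", "학령인구(6-21세) 비율(퍼센트)"): -0.394,
    ("통계연도", "총인구수(명)"): -0.008,
    ("통계연도", "총 학령인구(6-21세)(명)"): -0.086,
    ("학령인구(6-21세) 비율(퍼센트)", "총인구수(명)"): 0.617,
    ("학령인구(6-21세) 비율(퍼센트)", "총 학령인구(6-21세)(명)"): 0.729,
    ("총인구수(명)", "총 학령인구(6-21세)(명)"): 0.986,
}

# Map each column to the columns it is strongly correlated with.
flagged = {}
for (a, b), r in pairs.items():
    if abs(r) >= THRESHOLD:
        flagged.setdefault(a, []).append(b)
        flagged.setdefault(b, []).append(a)

for column, partners in flagged.items():
    print(column, "->", partners)
```

Under this assumption each of the three flagged columns has two partners, which is consistent with the wording of the alerts, and 통계연도 is not flagged.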

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

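First rows
Index | 통계연도 | 시도명 | 시군구명 | 학령인구(6-21세) 비율(퍼센트) | 총인구수(명) | 총 학령인구(6-21세)(명)
0 | 2019 | 대전광역시 | 서구 | 17.42 | 481222 | 83822
1 | 2014 | 경상북도 | 칠곡군 | 18.09 | 122058 | 22083
2 | 2017 | 부산광역시 | 기장군 | 16.81 | 161651 | 27172
3 | 2013 | 경상남도 | 김해시 | 22.31 | 522049 | 116446
4 | 2017 | 부산광역시 | 영도구 | 12.25 | 123521 | 15130
5 | 2018 | 경기도 | 안양시 | 16.33 | 576831 | 94224
6 | 2019 | 부산광역시 | 금정구 | 13.32 | 239062 | 31839
7 | 2016 | 경기도 | 부천시 | 16.49 | 851380 | 140368
8 | 2015 | 경기도 | 용인시 | 20.48 | 975746 | 199800
9 | 2017 | 인천광역시 | 연수구 | 19.39 | 335142 | 64987

Last rows
Index | 통계연도 | 시도명 | 시군구명 | 학령인구(6-21세) 비율(퍼센트) | 총인구수(명) | 총 학령인구(6-21세)(명)
2600 | 2018 | 충청남도 | 서북구 | 18.59 | 388097 | 72157
2601 | 2015 | 전라북도 | 완산구 | 20.76 | 365605 | 75885
2602 | 2019 | 경상남도 | 진해구 | 18.02 | 193622 | 34899
2603 | 2021 | 경상북도 | 영덕군 | 9.07 | 35314 | 3204
2604 | 2012 | 인천광역시 | 서구 | 21.44 | 469887 | 100738
2605 | 2020 | 경상남도 | 산청군 | 9.34 | 34857 | 3255
2606 | 2019 | 서울특별시 | 노원구 | 16.65 | 532905 | 88708
2607 | 2020 | 부산광역시 | 강서구 | 16.03 | 137957 | 22108
2608 | 2018 | 충청남도 | 아산시 | 18.3 | 312822 | 57260
2609 | 2019 | 충청남도 | 홍성군 | 15.7 | 100423 | 15770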
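
The ratio column appears to be the school-age population divided by the total population, expressed as a percentage. A quick check on a few of the sample rows, assuming the published ratio is rounded to two decimals; the tuple layout is illustrative only:

```python
# (시군구명, 통계연도, reported ratio %, 총인구수(명), 총 학령인구(6-21세)(명))
rows = [
    ("서구", 2019, 17.42, 481222, 83822),
    ("김해시", 2013, 22.31, 522049, 116446),
    ("영덕군", 2021, 9.07, 35314, 3204),
    ("아산시", 2018, 18.3, 312822, 57260),
    ("홍성군", 2019, 15.7, 100423, 15770),
]

# Recompute the ratio from the two population columns.
computed = [round(100 * school_age / population, 2)
            for _, _, _, population, school_age in rows]

for (name, year, reported, _, _), value in zip(rows, computed):
    print(f"{name} {year}: reported {reported}, computed {value}")
```

This relationship also explains why the ratio and the two population counts move together in the correlation tables.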