Overview

Dataset statistics

Number of variables: 6
Number of observations: 719
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.6 KiB
Average record size in memory: 52.2 B

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

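Description: One of the statistical indices developed by Gimhae City to assess the state of the city from statistical data. It consists of the statistical year, province name, city/county/district name, higher-education employment rate (%), number of graduates (persons), and number of employed persons. Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: 경상남도 김해시 (Gimhae City, Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110179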

Alerts

졸업자 수(명) is highly overall correlated with 취업자 수(명) (High correlation)
취업자 수(명) is highly overall correlated with 졸업자 수(명) (High correlation)
고등교육_취업률(퍼센트) has 11 (1.5%) zeros (Zeros)
취업자 수(명) has 11 (1.5%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-11 00:20:48.137398
Analysis finished: 2023-12-11 00:20:49.630093
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
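
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact procedure used for this report: the CSV path, the report title, and the function name build_profile_report are assumptions, and the configuration in config.json is not reproduced here.

```python
from pathlib import Path


def build_profile_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report; returns the output path."""
    # Imported lazily so this module loads even where the packages are missing.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the dataset is available as a CSV export (hypothetical path).
    df = pd.read_csv(csv_path)

    # The title is illustrative; default profiling settings are used.
    report = ProfileReport(df, title="Higher-education employment rate")
    report.to_file(html_path)
    return Path(html_path)
```

Calling build_profile_report("employment.csv") would write report.html next to the script, provided pandas and ydata-profiling are installed.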

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB
2016: 146
2017: 145
2019: 143
2018: 143
2020: 142

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2017
3rd row: 2020
4th row: 2020
5th row: 2017

Common Values

Value   Count   Frequency (%)
2016    146     20.3%
2017    145     20.2%
2019    143     19.9%
2018    143     19.9%
2020    142     19.7%
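
The Frequency (%) column is each count divided by the number of observations. The sketch below recomputes it from the counts above; the variable names are illustrative, and rounding to one decimal place is an assumption that happens to match the reported figures.

```python
# Counts per statistical year, as reported under Common Values.
counts = {"2016": 146, "2017": 145, "2019": 143, "2018": 143, "2020": 142}

# The counts add up to the number of observations in the dataset.
total = sum(counts.values())

# Share of each year as a percentage, rounded to one decimal place.
frequencies = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, pct in frequencies.items():
    print(f"{year}: {counts[year]} ({pct}%)")
```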

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시도명
Categorical

Distinct: 17
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB
경기도: 126
서울특별시: 95
충청남도: 55
경상북도: 55
전라남도: 50
Other values (12): 338

Length

Max length: 7
Median length: 5
Mean length: 4.1849791
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청북도
2nd row: 경상북도
3rd row: 전라남도
4th row: 경상남도
5th row: 부산광역시

Common Values

Value              Count   Frequency (%)
경기도             126     17.5%
서울특별시         95      13.2%
충청남도           55      7.6%
경상북도           55      7.6%
전라남도           50      7.0%
부산광역시         45      6.3%
강원도             41      5.7%
경상남도           40      5.6%
전라북도           37      5.1%
충청북도           35      4.9%
Other values (7)   140     19.5%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 134
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.930459
Min length: 2

Characters and Unicode

Total characters: 2107
Distinct characters: 109
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 음성군
2nd row: 칠곡군
3rd row: 나주시
4th row: 진주시
5th row: 부산진구

Value                Count   Frequency (%)
동구                 25      3.5%
남구                 20      2.8%
중구                 15      2.1%
북구                 15      2.1%
서구                 10      1.4%
서대문구             5       0.7%
부천시               5       0.7%
성남시               5       0.7%
무안군               5       0.7%
보령시               5       0.7%
Other values (124)   609     84.7%

Most occurring characters

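Value               Count   Frequency (%)
(not recovered)     334     15.9%
(not recovered)     265     12.6%
(not recovered)     140     6.6%
(not recovered)     85      4.0%
(not recovered)     70      3.3%
(not recovered)     66      3.1%
(not recovered)     65      3.1%
(not recovered)     57      2.7%
(not recovered)     50      2.4%
(not recovered)     37      1.8%
Other values (99)   938     44.5%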

Most occurring categories

Value          Count   Frequency (%)
Other Letter   2107    100.0%

Most frequent character per category

Other Letter: the same ten characters and counts as listed under "Most occurring characters".

Most occurring scripts

Value    Count   Frequency (%)
Hangul   2107    100.0%

Most frequent character per script

Hangul: the same ten characters and counts as listed under "Most occurring characters".

Most occurring blocks

Value    Count   Frequency (%)
Hangul   2107    100.0%

Most frequent character per block

Hangul: the same ten characters and counts as listed under "Most occurring characters".

고등교육_취업률(퍼센트)
Real number (ℝ)

ZEROS 

Distinct: 606
Distinct (%): 84.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.010376
Minimum: 0
Maximum: 91.84
Zeros: 11
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 57.475
Q1: 64.31
Median: 67.84
Q3: 71.14
95-th percentile: 79.186
Maximum: 91.84
Range: 91.84
Interquartile range (IQR): 6.83
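
The Range and IQR rows follow directly from the other quantiles. The sketch below recomputes them and, as an illustration only, adds Tukey's 1.5 × IQR fences; the fences are an assumption introduced here and are not something the report itself computes.

```python
# Quantile statistics reported for 고등교육_취업률(퍼센트).
minimum, q1, median, q3, maximum = 0.0, 64.31, 67.84, 71.14, 91.84

value_range = maximum - minimum  # spread of the whole column
iqr = q3 - q1                    # spread of the middle half of the values

# Assumption: Tukey's rule, used here only to illustrate how far the
# extremes sit from the bulk of the distribution.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(round(value_range, 2), round(iqr, 2))
print(round(lower_fence, 3), round(upper_fence, 3))
```

Under this rule both the minimum (0) and the maximum (91.84) fall outside the fences, which is consistent with the zeros flagged in the alerts.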

Descriptive statistics

Standard deviation: 10.671489
Coefficient of variation (CV): 0.1592513
Kurtosis: 22.249617
Mean: 67.010376
Median Absolute Deviation (MAD): 3.43
Skewness: -3.8558166
Sum: 48180.46
Variance: 113.88068
Monotonicity: Not monotonic
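
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks the reported values against each other; the variable names are illustrative.

```python
# Values reported for 고등교육_취업률(퍼센트).
n_observations = 719
total_sum = 48180.46
std_dev = 10.671489

mean = total_sum / n_observations  # arithmetic mean
cv = std_dev / mean                # coefficient of variation (unitless)
variance = std_dev ** 2            # variance is the squared standard deviation

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.5f}")
```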
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
0.0                  11      1.5%
68.15                4       0.6%
70.13                4       0.6%
66.91                3       0.4%
68.85                3       0.4%
65.59                3       0.4%
67.29                3       0.4%
66.32                3       0.4%
67.94                3       0.4%
70.02                3       0.4%
Other values (596)   679     94.4%

Minimum 10 values

Value   Count   Frequency (%)
0.0     11      1.5%
22.22   1       0.1%
33.33   1       0.1%
37.31   1       0.1%
40.57   1       0.1%
42.86   1       0.1%
44.53   1       0.1%
45.43   1       0.1%
48.65   1       0.1%
50.0    1       0.1%

Maximum 10 values

Value   Count   Frequency (%)
91.84   1       0.1%
88.42   1       0.1%
88.03   1       0.1%
87.64   1       0.1%
85.56   1       0.1%
84.96   1       0.1%
84.38   2       0.3%
83.95   1       0.1%
83.81   1       0.1%
83.1    1       0.1%

졸업자 수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 680
Distinct (%): 94.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3914.3074
Minimum: 1
Maximum: 20700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 199.7
Q1: 754.5
Median: 2379
Q3: 6320
95-th percentile: 11905.1
Maximum: 20700
Range: 20699
Interquartile range (IQR): 5565.5

Descriptive statistics

Standard deviation: 4043.5095
Coefficient of variation (CV): 1.0330077
Kurtosis: 1.9036494
Mean: 3914.3074
Median Absolute Deviation (MAD): 1902
Skewness: 1.4585215
Sum: 2814387
Variance: 16349969
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
570                  3       0.4%
477                  3       0.4%
7606                 2       0.3%
1639                 2       0.3%
2594                 2       0.3%
1227                 2       0.3%
282                  2       0.3%
603                  2       0.3%
7498                 2       0.3%
458                  2       0.3%
Other values (670)   697     96.9%

Minimum 10 values

Value   Count   Frequency (%)
1       1       0.1%
6       1       0.1%
8       1       0.1%
9       1       0.1%
11      1       0.1%
13      1       0.1%
15      1       0.1%
19      1       0.1%
30      1       0.1%
66      1       0.1%

Maximum 10 values

Value   Count   Frequency (%)
20700   1       0.1%
20458   1       0.1%
18855   1       0.1%
18787   1       0.1%
18300   1       0.1%
18113   1       0.1%
18104   1       0.1%
17454   1       0.1%
17324   1       0.1%
17221   1       0.1%

취업자 수(명)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 648
Distinct (%): 90.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2302.9138
Minimum: 0
Maximum: 11632
Zeros: 11
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 81
Q1: 473
Median: 1440
Q3: 3549
95-th percentile: 6969.3
Maximum: 11632
Range: 11632
Interquartile range (IQR): 3076

Descriptive statistics

Standard deviation: 2319.844
Coefficient of variation (CV): 1.0073517
Kurtosis: 1.5308908
Mean: 2302.9138
Median Absolute Deviation (MAD): 1139
Skewness: 1.3808656
Sum: 1655795
Variance: 5381676.3
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
0                    11      1.5%
1322                 4       0.6%
1502                 3       0.4%
1593                 3       0.4%
1843                 3       0.4%
225                  3       0.4%
875                  3       0.4%
181                  3       0.4%
376                  3       0.4%
699                  2       0.3%
Other values (638)   681     94.7%

Minimum 10 values

Value   Count   Frequency (%)
0       11      1.5%
2       2       0.3%
17      1       0.1%
26      1       0.1%
33      1       0.1%
36      1       0.1%
37      1       0.1%
41      1       0.1%
46      2       0.3%
47      1       0.1%

Maximum 10 values

Value   Count   Frequency (%)
11632   1       0.1%
11332   1       0.1%
10407   1       0.1%
10346   1       0.1%
10335   1       0.1%
10051   1       0.1%
9915    1       0.1%
9657    1       0.1%
9597    1       0.1%
9583    1       0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

                          통계연도   시도명   고등교육_취업률(퍼센트)   졸업자 수(명)   취업자 수(명)
통계연도                  1.000      0.000    0.064                     0.000          0.000
시도명                    0.000      1.000    0.423                     0.569          0.536
고등교육_취업률(퍼센트)   0.064      0.423    1.000                     0.268          0.276
졸업자 수(명)             0.000      0.569    0.268                     1.000          0.987
취업자 수(명)             0.000      0.536    0.276                     0.987          1.000

[Figure: correlation heatmap]

           통계연도   시도명
통계연도   1.000      0.000
시도명     0.000      1.000

[Figure: correlation heatmap]

                          고등교육_취업률(퍼센트)   졸업자 수(명)   취업자 수(명)   통계연도   시도명
고등교육_취업률(퍼센트)   1.000                     -0.190         -0.136         0.037      0.182
졸업자 수(명)             -0.190                    1.000          0.997          0.000      0.261
취업자 수(명)             -0.136                    0.997          1.000          0.000      0.240
통계연도                  0.037                     0.000          0.000          1.000      0.000
시도명                    0.182                     0.261          0.240          0.000      1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

      통계연도   시도명           시군구명   고등교육_취업률(퍼센트)   졸업자 수(명)   취업자 수(명)
0     2019       충청북도         음성군     69.45                     2320           1448
1     2017       경상북도         칠곡군     64.59                     850            498
2     2020       전라남도         나주시     67.22                     2485           1530
3     2020       경상남도         진주시     58.97                     7276           3916
4     2017       부산광역시       부산진구   65.79                     7154           4384
5     2016       부산광역시       남구       62.86                     10197          5823
6     2018       서울특별시       송파구     71.28                     550            335
7     2020       부산광역시       사상구     62.02                     7118           3969
8     2019       충청남도         서산시     71.36                     1747           1104
9     2018       제주특별자치도   제주시     67.38                     4732           2735

Last rows

      통계연도   시도명       시군구명   고등교육_취업률(퍼센트)   졸업자 수(명)   취업자 수(명)
709   2019       충청북도     옥천군     65.92                     383            205
710   2019       경기도       오산시     68.85                     3053           1863
711   2016       충청남도     홍성군     70.13                     2796           1843
712   2020       경기도       안산시     64.22                     6943           3828
713   2017       인천광역시   강화군     62.59                     330            179
714   2019       전라북도     정읍시     74.49                     605            400
715   2018       인천광역시   연수구     67.95                     3063           1910
716   2016       경기도       양평군     0.0                       180            0
717   2020       경상남도     사천시     70.77                     226            138
718   2018       충청남도     금산군     63.66                     1668           953
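
The two alerts (high correlation between graduates and employed persons, and zeros in the employed count) can be illustrated on the last ten sample rows. This is a sketch under stated assumptions: the fused digits of each row are split as rate, graduates, employed, which is an interpretation of the extracted text, and a plain Pearson coefficient is used for illustration because the report does not state which coefficient its matrices use. The helper pearson is hypothetical and written out so the block needs only the standard library.

```python
import math

# (시군구명, 고등교육_취업률(퍼센트), 졸업자 수(명), 취업자 수(명))
# Assumption: values split from the fused sample rows 709-718.
rows = [
    ("옥천군", 65.92, 383, 205),
    ("오산시", 68.85, 3053, 1863),
    ("홍성군", 70.13, 2796, 1843),
    ("안산시", 64.22, 6943, 3828),
    ("강화군", 62.59, 330, 179),
    ("정읍시", 74.49, 605, 400),
    ("연수구", 67.95, 3063, 1910),
    ("양평군", 0.0, 180, 0),
    ("사천시", 70.77, 226, 138),
    ("금산군", 63.66, 1668, 953),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


graduates = [row[2] for row in rows]
employed = [row[3] for row in rows]
r_value = pearson(graduates, employed)

# Districts whose employed count is zero (these also show a 0.0 rate).
zero_rows = [row[0] for row in rows if row[3] == 0]

print(f"Pearson r on the sample rows: {r_value:.3f}")
print("Rows with zero employed:", zero_rows)
```

On so few rows the coefficient is only indicative, but it points the same way as the High correlation alert, and the single zero row shows how a zero employed count coincides with a zero rate.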