Overview

Dataset statistics

Number of variables: 6
Number of observations: 1245
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 63.4 KiB
Average record size in memory: 52.1 B
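
As a sanity check on the memory figures above, the average record size can be reproduced from the total size and the row count. This is a minimal sketch that assumes 1 KiB = 1024 bytes and a plain division; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics block above.
total_size_kib = 63.4     # "Total size in memory", in KiB
n_observations = 1245     # "Number of observations"

# Assumption: 1 KiB = 1024 bytes, and the average is total bytes / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the reported 52.1 B after rounding to one decimal place.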

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

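Description: One of the statistical indices developed by Gimhae City (김해시) to assess the state of the city on a statistical basis. It consists of the columns 통계연도 (statistical year), 시도명 (province name), 시군구명 (city/county/district name), 직장_총가입자수(명) (total workplace subscribers, persons), 지역_가입자수(명) (regional subscribers, persons), and 건강보험적용인구(명) (population covered by health insurance, persons). Because the index is centered on Gimhae City, information for areas outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: 경상남도 김해시 (Gimhae City, Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110147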

Alerts

직장_총가입자수(명) is highly correlated overall with 지역_가입자수(명) and 1 other field (High correlation)
지역_가입자수(명) is highly correlated overall with 직장_총가입자수(명) and 1 other field (High correlation)
건강보험적용인구(명) is highly correlated overall with 직장_총가입자수(명) and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-11 00:02:21.436716
Analysis finished: 2023-12-11 00:02:22.974698
Duration: 1.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
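
The reported duration follows from the two timestamps above. A minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, inferred from the values shown.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from the two values in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-11 00:02:21.436716", FMT)
finished = datetime.strptime("2023-12-11 00:02:22.974698", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```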

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value    Count
2016     249
2017     249
2018     249
2019     249
2020     249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value    Count    Frequency (%)
2016     249      20.0%
2017     249      20.0%
2018     249      20.0%
2019     249      20.0%
2020     249      20.0%
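
The percentages for 통계연도 can be recomputed from the counts and the total number of observations. The sketch below assumes that each percentage is the count divided by the 1245 observations, rounded to one decimal place; the dictionary and variable names are illustrative.

```python
n_observations = 1245  # "Number of observations" from the Overview

# Counts per year, copied from the Common Values table.
year_counts = {"2016": 249, "2017": 249, "2018": 249, "2019": 249, "2020": 249}

# Every row belongs to exactly one year, so the counts must add up.
counts_cover_all_rows = sum(year_counts.values()) == n_observations

# Frequency (%) per value and Distinct (%) for the column.
frequency_pct = {year: round(100 * count / n_observations, 1)
                 for year, count in year_counts.items()}
distinct_pct = round(100 * len(year_counts) / n_observations, 1)

print(counts_cover_all_rows, frequency_pct, distinct_pct)
```

Each year holds 249 of the 1245 rows, so the panel is balanced across the five years.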

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시도명
Categorical

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value                Count
경기도               210
서울특별시           125
경상북도             120
전라남도             110
경상남도             110
Other values (11)    570

Length

Max length: 7
Median length: 5
Mean length: 4.0803213
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value               Count    Frequency (%)
경기도              210      16.9%
서울특별시          125      10.0%
경상북도            120      9.6%
전라남도            110      8.8%
경상남도            110      8.8%
강원도              90       7.2%
부산광역시          80       6.4%
충청남도            80       6.4%
전라북도            75       6.0%
충청북도            70       5.6%
Other values (6)    175      14.1%
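
The "Other values (6)" row is the remainder after the ten listed provinces. A minimal sketch of that bookkeeping, assuming the bucket is the total minus the listed counts; the names are illustrative.

```python
n_observations = 1245  # total rows
n_distinct = 16        # "Distinct" for 시도명

# The ten values listed individually in the Common Values table.
top_counts = {
    "경기도": 210, "서울특별시": 125, "경상북도": 120, "전라남도": 110,
    "경상남도": 110, "강원도": 90, "부산광역시": 80, "충청남도": 80,
    "전라북도": 75, "충청북도": 70,
}

# Whatever is not listed individually falls into the "Other values" bucket.
other_count = n_observations - sum(top_counts.values())
other_distinct = n_distinct - len(top_counts)
other_pct = round(100 * other_count / n_observations, 1)

print(f"Other values ({other_distinct}): {other_count} rows, {other_pct}%")
```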

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 227
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.3381526
Min length: 2

Characters and Unicode

Total characters: 4156
Distinct characters: 141
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
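
To illustrate the character properties mentioned above, the sketch below looks up the Unicode general category of each character in the five sample values of 시군구명 with Python's standard `unicodedata` module. That module has no script lookup, so the Hangul check relies on the character name prefix, which is an approximation chosen for this example and not the report's method. The last line reproduces the mean length from the reported totals.

```python
import unicodedata

# The five sample values listed for 시군구명 in this report.
sample_names = ["종로구", "중구", "용산구", "성동구", "광진구"]

# General category of every character; "Lo" means "Letter, other",
# which the report displays as "Other Letter".
categories = {unicodedata.category(ch) for name in sample_names for ch in name}

# Approximate script check via the character name (assumption: precomposed
# Hangul syllables all have names starting with "HANGUL SYLLABLE").
all_hangul = all(
    unicodedata.name(ch).startswith("HANGUL SYLLABLE")
    for name in sample_names
    for ch in name
)

# Mean length = total characters / number of observations.
mean_length = 4156 / 1245

print(categories, all_hangul, round(mean_length, 7))
```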

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value                 Count    Frequency (%)
동구                  30       2.4%
중구                  30       2.4%
서구                  25       2.0%
남구                  22       1.8%
북구                  20       1.6%
고성군                10       0.8%
강서구                10       0.8%
목포시                5        0.4%
순창군                5        0.4%
임실군                5        0.4%
Other values (217)    1083     87.0%

Most occurring characters

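Value                 Count    Frequency (%)
[character lost]      530      12.8%
[character lost]      495      11.9%
[character lost]      425      10.2%
[character lost]      120      2.9%
[character lost]      115      2.8%
[character lost]      115      2.8%
[character lost]      110      2.6%
[character lost]      105      2.5%
[character lost]      100      2.4%
[character lost]      90       2.2%
Other values (131)    1951     46.9%

Most occurring categories

Value           Count    Frequency (%)
Other Letter    4156     100.0%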

Most frequent character per category

Other Letter
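Value                 Count    Frequency (%)
[character lost]      530      12.8%
[character lost]      495      11.9%
[character lost]      425      10.2%
[character lost]      120      2.9%
[character lost]      115      2.8%
[character lost]      115      2.8%
[character lost]      110      2.6%
[character lost]      105      2.5%
[character lost]      100      2.4%
[character lost]      90       2.2%
Other values (131)    1951     46.9%

Most occurring scripts

Value     Count    Frequency (%)
Hangul    4156     100.0%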

Most frequent character per script

Hangul
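Value                 Count    Frequency (%)
[character lost]      530      12.8%
[character lost]      495      11.9%
[character lost]      425      10.2%
[character lost]      120      2.9%
[character lost]      115      2.8%
[character lost]      115      2.8%
[character lost]      110      2.6%
[character lost]      105      2.5%
[character lost]      100      2.4%
[character lost]      90       2.2%
Other values (131)    1951     46.9%

Most occurring blocks

Value     Count    Frequency (%)
Hangul    4156     100.0%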

Most frequent character per block

Hangul
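Value                 Count    Frequency (%)
[character lost]      530      12.8%
[character lost]      495      11.9%
[character lost]      425      10.2%
[character lost]      120      2.9%
[character lost]      115      2.8%
[character lost]      115      2.8%
[character lost]      110      2.6%
[character lost]      105      2.5%
[character lost]      100      2.4%
[character lost]      90       2.2%
Other values (131)    1951     46.9%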

직장_총가입자수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1243
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 147559.08
Minimum: 5338
Maximum: 718450
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 5338
5-th percentile: 17339.8
Q1: 37385
Median: 122613
Q3: 229650
95-th percentile: 376726.2
Maximum: 718450
Range: 713112
Interquartile range (IQR): 192265
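
Range and IQR are derived quantities: range = maximum - minimum and IQR = Q3 - Q1. The sketch below recomputes both for all three numeric variables in this report from their listed quantiles and compares them with the reported figures; the dictionary layout is illustrative.

```python
# (minimum, Q1, Q3, maximum, reported range, reported IQR) per numeric variable,
# copied from the Quantile statistics blocks of this report.
quantiles = {
    "직장_총가입자수(명)": (5338, 37385, 229650, 718450, 713112, 192265),
    "지역_가입자수(명)": (3551, 18927, 80561, 239384, 235833, 61634),
    "건강보험적용인구(명)": (8889, 56429, 312536, 901163, 892274, 256107),
}

derived = {}
for name, (minimum, q1, q3, maximum, rep_range, rep_iqr) in quantiles.items():
    # Range spans the whole distribution; IQR spans the middle 50%.
    derived[name] = (maximum - minimum, q3 - q1)
    matches = derived[name] == (rep_range, rep_iqr)
    print(name, derived[name], "matches report:", matches)
```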

Descriptive statistics

Standard deviation: 122919.97
Coefficient of variation (CV): 0.83302209
Kurtosis: 0.82045947
Mean: 147559.08
Median Absolute Deviation (MAD): 89927
Skewness: 0.9981267
Sum: 1.8371105 × 10^8
Variance: 1.510932 × 10^10
Monotonicity: Not monotonic
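
Several of the descriptive statistics above are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks them for 직장_총가입자수(명) using the rounded values printed in the report, so agreement is only expected up to rounding; the tolerance is an assumption chosen for this example.

```python
import math

# Rounded values as printed in the report for 직장_총가입자수(명).
mean = 147559.08
std = 122919.97
n_observations = 1245

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
total = mean * n_observations   # sum from the mean and the row count

# Compare against the reported figures with a loose relative tolerance,
# because the inputs are themselves rounded.
print(math.isclose(cv, 0.83302209, rel_tol=1e-6))
print(math.isclose(variance, 1.510932e10, rel_tol=1e-6))
print(math.isclose(total, 1.8371105e8, rel_tol=1e-6))
```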

[Figure: Histogram with fixed size bins (bins=50)]

Value                  Count    Frequency (%)
244401                 2        0.2%
52879                  2        0.2%
233455                 1        0.1%
60897                  1        0.1%
378259                 1        0.1%
238142                 1        0.1%
597373                 1        0.1%
258816                 1        0.1%
177426                 1        0.1%
306461                 1        0.1%
Other values (1233)    1233     99.0%

Minimum 10 values

Value    Count    Frequency (%)
5338     1        0.1%
5640     1        0.1%
5811     1        0.1%
5813     1        0.1%
5836     1        0.1%
10079    1        0.1%
10203    1        0.1%
10436    1        0.1%
10450    1        0.1%
10576    1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
718450    1        0.1%
687429    1        0.1%
635250    1        0.1%
610178    1        0.1%
608928    1        0.1%
605863    1        0.1%
597373    1        0.1%
586743    1        0.1%
578754    1        0.1%
533602    1        0.1%

지역_가입자수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1236
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56450.523
Minimum: 3551
Maximum: 239384
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 3551
5-th percentile: 8848.6
Q1: 18927
Median: 49027
Q3: 80561
95-th percentile: 137750.8
Maximum: 239384
Range: 235833
Interquartile range (IQR): 61634

Descriptive statistics

Standard deviation: 42592.99
Coefficient of variation (CV): 0.75451897
Kurtosis: 0.88692151
Mean: 56450.523
Median Absolute Deviation (MAD): 30299
Skewness: 0.99351714
Sum: 70280901
Variance: 1.8141628 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value                  Count    Frequency (%)
8818                   2        0.2%
13107                  2        0.2%
70344                  2        0.2%
10277                  2        0.2%
16330                  2        0.2%
34854                  2        0.2%
53922                  2        0.2%
95900                  2        0.2%
12822                  2        0.2%
19265                  1        0.1%
Other values (1226)    1226     98.5%

Minimum 10 values

Value    Count    Frequency (%)
3551     1        0.1%
3709     1        0.1%
3768     1        0.1%
3777     1        0.1%
3792     1        0.1%
5735     1        0.1%
5743     1        0.1%
5794     1        0.1%
5826     1        0.1%
6007     1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
239384    1        0.1%
238714    1        0.1%
238516    1        0.1%
238107    1        0.1%
237866    1        0.1%
213071    1        0.1%
209138    1        0.1%
206106    1        0.1%
205388    1        0.1%
204900    1        0.1%

건강보험적용인구(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1243
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 204009.6
Minimum: 8889
Maximum: 901163
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 8889
5-th percentile: 26523.2
Q1: 56429
Median: 171844
Q3: 312536
95-th percentile: 501459.2
Maximum: 901163
Range: 892274
Interquartile range (IQR): 256107

Descriptive statistics

Standard deviation: 164277.79
Coefficient of variation (CV): 0.8052454
Kurtosis: 0.6641566
Mean: 204009.6
Median Absolute Deviation (MAD): 121362
Skewness: 0.96003428
Sum: 2.5399196 × 10^8
Variance: 2.6987194 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Value                  Count    Frequency (%)
45929                  2        0.2%
36885                  2        0.2%
159749                 1        0.1%
217835                 1        0.1%
523319                 1        0.1%
316779                 1        0.1%
836757                 1        0.1%
322658                 1        0.1%
244607                 1        0.1%
439432                 1        0.1%
Other values (1233)    1233     99.0%

Minimum 10 values

Value    Count    Frequency (%)
8889     1        0.1%
9408     1        0.1%
9522     1        0.1%
9603     1        0.1%
9613     1        0.1%
15822    1        0.1%
15997    1        0.1%
16262    1        0.1%
16311    1        0.1%
16457    1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
901163    1        0.1%
861137    1        0.1%
848285    1        0.1%
846794    1        0.1%
844379    1        0.1%
836757    1        0.1%
825457    1        0.1%
794322    1        0.1%
725268    1        0.1%
708077    1        0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

                        통계연도    시도명    직장_총가입자수(명)    지역_가입자수(명)    건강보험적용인구(명)
통계연도                1.000       0.000     0.000                  0.000                0.000
시도명                  0.000       1.000     0.562                  0.632                0.592
직장_총가입자수(명)     0.000       0.562     1.000                  0.968                0.979
지역_가입자수(명)       0.000       0.632     0.968                  1.000                0.967
건강보험적용인구(명)    0.000       0.592     0.979                  0.967                1.000

[Figure: correlation heatmap]

            통계연도    시도명
통계연도    1.000       0.000
시도명      0.000       1.000

[Figure: correlation heatmap]

                        직장_총가입자수(명)    지역_가입자수(명)    건강보험적용인구(명)    통계연도    시도명
직장_총가입자수(명)     1.000                  0.977                0.998                   0.000       0.258
지역_가입자수(명)       0.977                  1.000                0.987                   0.000       0.307
건강보험적용인구(명)    0.998                  0.987                1.000                   0.000       0.278
통계연도                0.000                  0.000                0.000                   1.000       0.000
시도명                  0.258                  0.307                0.278                   0.000       1.000
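
To give a feel for what a coefficient near 1 means here, the sketch below computes a plain Pearson correlation between 직장_총가입자수(명) and 지역_가입자수(명) on the ten first rows listed in the Sample section. This is an illustration only: the report does not state which coefficient each matrix uses, and ten rows are not expected to reproduce figures computed on all 1245 observations. The `pearson` helper is written for this example.

```python
import math

# First ten sample rows (index 0-9), taken from the Sample section.
workplace = [102486, 85285, 155257, 210365, 248736,
             232427, 264463, 304871, 202998, 237630]   # 직장_총가입자수(명)
regional = [50604, 44640, 69645, 84550, 105825,
            111541, 132491, 132152, 107857, 100477]    # 지역_가입자수(명)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(workplace, regional)
print(f"Pearson r on 10 rows: {r:.3f}")
```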

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     통계연도    시도명        시군구명    직장_총가입자수(명)    지역_가입자수(명)    건강보험적용인구(명)
0    2016        서울특별시    종로구      102486                 50604                153090
1    2016        서울특별시    중구        85285                  44640                129925
2    2016        서울특별시    용산구      155257                 69645                224902
3    2016        서울특별시    성동구      210365                 84550                294915
4    2016        서울특별시    광진구      248736                 105825               354561
5    2016        서울특별시    동대문구    232427                 111541               343968
6    2016        서울특별시    중랑구      264463                 132491               396954
7    2016        서울특별시    성북구      304871                 132152               437023
8    2016        서울특별시    강북구      202998                 107857               310855
9    2016        서울특별시    도봉구      237630                 100477               338107

Last rows

        통계연도    시도명            시군구명          직장_총가입자수(명)    지역_가입자수(명)    건강보험적용인구(명)
1235    2020        경상남도          고성군            31623                  18195                49818
1236    2020        경상남도          남해군            25774                  15790                41564
1237    2020        경상남도          하동군            26768                  16284                43052
1238    2020        경상남도          산청군            21334                  12405                33739
1239    2020        경상남도          함양군            24880                  12812                37692
1240    2020        경상남도          거창군            39956                  19304                59260
1241    2020        경상남도          합천군            28013                  14125                42138
1242    2020        제주특별자치도    제주시            335773                 152559               488332
1243    2020        제주특별자치도    서귀포시          113300                 67389                180689
1244    2020        경기도            고양시일산서구    227325                 74896                302221
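
In the twenty sample rows shown, 건강보험적용인구(명) equals 직장_총가입자수(명) + 지역_가입자수(명), which would help explain the high-correlation alerts. The sketch below checks that identity on the sample rows and on the reported column means. It does not verify the identity for every row of the dataset; the tuple layout and the 0.01 tolerance (to allow for rounding in the printed means) are assumptions of this example.

```python
# (직장_총가입자수, 지역_가입자수, 건강보험적용인구) for the 10 first and 10 last rows.
rows = [
    (102486, 50604, 153090), (85285, 44640, 129925), (155257, 69645, 224902),
    (210365, 84550, 294915), (248736, 105825, 354561), (232427, 111541, 343968),
    (264463, 132491, 396954), (304871, 132152, 437023), (202998, 107857, 310855),
    (237630, 100477, 338107),
    (31623, 18195, 49818), (25774, 15790, 41564), (26768, 16284, 43052),
    (21334, 12405, 33739), (24880, 12812, 37692), (39956, 19304, 59260),
    (28013, 14125, 42138), (335773, 152559, 488332), (113300, 67389, 180689),
    (227325, 74896, 302221),
]

# Rows where the covered population is NOT the sum of the two subscriber counts.
mismatches = [row for row in rows if row[0] + row[1] != row[2]]

# The same identity on the reported means (printed values are rounded).
mean_gap = abs((147559.08 + 56450.523) - 204009.6)

print(len(mismatches), mean_gap < 0.01)
```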