Overview

Dataset statistics

Number of variables: 8
Number of observations: 1247
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 85.4 KiB
Average record size in memory: 70.1 B
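
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, assuming the report uses 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the "Dataset statistics" block.
total_size_kib = 85.4
n_observations = 1247

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "70.1 B"
```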

Variable types

Categorical: 2
Text: 1
Numeric: 5

Dataset

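Description: One of the statistical indices that Gimhae City developed to assess the state of the city from statistics. It consists of the columns 통계연도 (statistical year), 시도명 (province or metropolitan city name), 시군구명 (city, county, or district name), 일반건강검진율(퍼센트) (general health screening rate, percent), 정상_에이(명) (normal A, persons), 정상_비(경계)(명) (normal B, borderline, persons), 질환의심자수(명) (persons with a suspected disease), and 유질환자수(명) (persons with an existing disease). Because the index is centered on Gimhae City, information for regions other than Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110159/fileData.do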

Alerts

일반건강검진율(퍼센트) is highly overall correlated with 정상_에이(명) and 3 other fields (High correlation)
정상_에이(명) is highly overall correlated with 일반건강검진율(퍼센트) and 3 other fields (High correlation)
정상_비(경계)(명) is highly overall correlated with 일반건강검진율(퍼센트) and 3 other fields (High correlation)
질환의심자수(명) is highly overall correlated with 일반건강검진율(퍼센트) and 3 other fields (High correlation)
유질환자수(명) is highly overall correlated with 일반건강검진율(퍼센트) and 3 other fields (High correlation)

Reproduction

Analysis started: 2023-12-12 20:43:42.267440
Analysis finished: 2023-12-12 20:43:45.927784
Duration: 3.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
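
The reported duration is the difference between the two timestamps. A small sketch that recomputes it with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-12 20:43:42.267440", FMT)
finished = datetime.strptime("2023-12-12 20:43:45.927784", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "3.66 seconds"
```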

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value | Count
2016 | 251
2017 | 249
2018 | 249
2019 | 249
2020 | 249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value | Count | Frequency (%)
2016 | 251 | 20.1%
2017 | 249 | 20.0%
2018 | 249 | 20.0%
2019 | 249 | 20.0%
2020 | 249 | 20.0%
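
The Frequency (%) and Distinct (%) figures can be recomputed from the counts alone. A minimal sketch, assuming percentages are rounded to one decimal place as displayed:

```python
from collections import Counter

# Counts per 통계연도, copied from the Common Values table.
counts = Counter({"2016": 251, "2017": 249, "2018": 249, "2019": 249, "2020": 249})

n_rows = sum(counts.values())  # should equal the number of observations
freq_pct = {year: round(100 * c / n_rows, 1) for year, c in counts.items()}
distinct_pct = round(100 * len(counts) / n_rows, 1)

print(n_rows, freq_pct, distinct_pct)
```

The counts sum to 1247, which matches the number of observations in the overview, so no rows fall outside these five years.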

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시도명
Categorical

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value | Count
경기도 | 212
서울특별시 | 125
경상북도 | 120
전라남도 | 110
경상남도 | 110
Other values (11) | 570

Length

Max length: 7
Median length: 5
Mean length: 4.0785886
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 212 | 17.0%
서울특별시 | 125 | 10.0%
경상북도 | 120 | 9.6%
전라남도 | 110 | 8.8%
경상남도 | 110 | 8.8%
강원도 | 90 | 7.2%
부산광역시 | 80 | 6.4%
충청남도 | 80 | 6.4%
전라북도 | 75 | 6.0%
충청북도 | 70 | 5.6%
Other values (6) | 175 | 14.0%

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 230
Distinct (%): 18.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.3680834
Min length: 2

Characters and Unicode

Total characters: 4200
Distinct characters: 143
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
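
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories. The first list holds the five sample rows shown for this variable. The value containing a space is a hypothetical example and is not confirmed by the report. The standard module exposes categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)

# The five sample rows of the variable: Hangul syllables only.
sample = ["종로구", "중구", "용산구", "성동구", "광진구"]
sample_counts = category_counts(sample)

# Hypothetical value containing a space (assumption, for illustration only).
with_space = category_counts(["창원시 의창구"])

# "Lo" is the code for Other Letter, "Zs" for Space Separator.
print(sample_counts, with_space)

# Mean length = total characters / number of observations.
mean_length = 4200 / 1247
print(round(mean_length, 7))
```

In the report, the two categories are Other Letter (4175 characters) and Space Separator (25 characters), which together account for all 4200 characters.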

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value | Count | Frequency (%)
동구 | 30 | 2.4%
중구 | 30 | 2.4%
창원시 | 25 | 2.0%
서구 | 25 | 2.0%
북구 | 20 | 1.6%
남구 | 20 | 1.6%
고성군 | 10 | 0.8%
강서구 | 10 | 0.8%
장수군 | 5 | 0.4%
진안군 | 5 | 0.4%
Other values (221) | 1092 | 85.8%

Most occurring characters

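Value | Count | Frequency (%)
[lost] | 533 | 12.7%
[lost] | 497 | 11.8%
[lost] | 425 | 10.1%
[lost] | 120 | 2.9%
[lost] | 117 | 2.8%
[lost] | 115 | 2.7%
[lost] | 110 | 2.6%
[lost] | 105 | 2.5%
[lost] | 100 | 2.4%
[lost] | 91 | 2.2%
Other values (133) | 1987 | 47.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4175 | 99.4%
Space Separator | 25 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost] | 533 | 12.8%
[lost] | 497 | 11.9%
[lost] | 425 | 10.2%
[lost] | 120 | 2.9%
[lost] | 117 | 2.8%
[lost] | 115 | 2.8%
[lost] | 110 | 2.6%
[lost] | 105 | 2.5%
[lost] | 100 | 2.4%
[lost] | 91 | 2.2%
Other values (132) | 1962 | 47.0%

Space Separator
Value | Count | Frequency (%)
(space) | 25 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4175 | 99.4%
Common | 25 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[lost] | 533 | 12.8%
[lost] | 497 | 11.9%
[lost] | 425 | 10.2%
[lost] | 120 | 2.9%
[lost] | 117 | 2.8%
[lost] | 115 | 2.8%
[lost] | 110 | 2.6%
[lost] | 105 | 2.5%
[lost] | 100 | 2.4%
[lost] | 91 | 2.2%
Other values (132) | 1962 | 47.0%

Common
Value | Count | Frequency (%)
(space) | 25 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4175 | 99.4%
ASCII | 25 | 0.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[lost] | 533 | 12.8%
[lost] | 497 | 11.9%
[lost] | 425 | 10.2%
[lost] | 120 | 2.9%
[lost] | 117 | 2.8%
[lost] | 115 | 2.8%
[lost] | 110 | 2.6%
[lost] | 105 | 2.5%
[lost] | 100 | 2.4%
[lost] | 91 | 2.2%
Other values (132) | 1962 | 47.0%

ASCII
Value | Count | Frequency (%)
(space) | 25 | 100.0%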

일반건강검진율(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 935
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.468348
Minimum: 27.01
Maximum: 55.69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 27.01
5-th percentile: 30.918
Q1: 36.555
median: 41.25
Q3: 44.465
95-th percentile: 48.83
Maximum: 55.69
Range: 28.68
Interquartile range (IQR): 7.91

Descriptive statistics

Standard deviation: 5.4301387
Coefficient of variation (CV): 0.13418237
Kurtosis: -0.65323072
Mean: 40.468348
Median Absolute Deviation (MAD): 3.85
Skewness: -0.21664462
Sum: 50464.03
Variance: 29.486407
Monotonicity: Not monotonic
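
Several of these statistics are derived from others in the same block, which gives a quick consistency check. The sketch below recomputes them from the reported values for 일반건강검진율(퍼센트). It assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Exactly how the report computes them is not stated here.

```python
import math

# Values copied from the report for 일반건강검진율(퍼센트).
n = 1247
total = 50464.03
mean, std = 40.468348, 5.4301387
minimum, maximum = 27.01, 55.69
q1, q3 = 36.555, 44.465

value_range = maximum - minimum   # reported: 28.68
iqr = q3 - q1                     # reported: 7.91
cv = std / mean                   # reported: 0.13418237
variance = std ** 2               # reported: 29.486407
mean_from_sum = total / n         # reported mean: 40.468348

for label, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                     ("variance", variance), ("mean", mean_from_sum)]:
    print(f"{label}: {value:.6f}")
```

All five recomputed values agree with the reported ones to the displayed precision.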
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
43.62 | 5 | 0.4%
42.31 | 4 | 0.3%
43.36 | 4 | 0.3%
33.61 | 4 | 0.3%
43.76 | 4 | 0.3%
44.2 | 4 | 0.3%
45.05 | 4 | 0.3%
40.22 | 4 | 0.3%
39.5 | 4 | 0.3%
43.73 | 4 | 0.3%
Other values (925) | 1206 | 96.7%

Minimum 10 values

Value | Count | Frequency (%)
27.01 | 1 | 0.1%
27.89 | 1 | 0.1%
28.08 | 1 | 0.1%
28.26 | 1 | 0.1%
28.32 | 1 | 0.1%
28.4 | 1 | 0.1%
28.49 | 1 | 0.1%
28.71 | 1 | 0.1%
28.76 | 1 | 0.1%
28.78 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
55.69 | 1 | 0.1%
54.07 | 1 | 0.1%
53.41 | 1 | 0.1%
53.34 | 1 | 0.1%
52.88 | 1 | 0.1%
52.53 | 1 | 0.1%
52.06 | 1 | 0.1%
51.96 | 1 | 0.1%
51.83 | 1 | 0.1%
51.72 | 1 | 0.1%

정상_에이(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1178
Distinct (%): 94.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6046.5237
Minimum: 121
Maximum: 37419
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 121
5-th percentile: 510.6
Q1: 1292.5
median: 4381
Q3: 9061
95-th percentile: 16958.4
Maximum: 37419
Range: 37298
Interquartile range (IQR): 7768.5

Descriptive statistics

Standard deviation: 5707.7918
Coefficient of variation (CV): 0.94397907
Kurtosis: 2.3094934
Mean: 6046.5237
Median Absolute Deviation (MAD): 3459
Skewness: 1.3690821
Sum: 7540015
Variance: 32578887
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1828 | 3 | 0.2%
664 | 3 | 0.2%
12726 | 3 | 0.2%
394 | 2 | 0.2%
1195 | 2 | 0.2%
1070 | 2 | 0.2%
1393 | 2 | 0.2%
489 | 2 | 0.2%
10726 | 2 | 0.2%
11685 | 2 | 0.2%
Other values (1168) | 1224 | 98.2%

Minimum 10 values

Value | Count | Frequency (%)
121 | 1 | 0.1%
146 | 1 | 0.1%
188 | 2 | 0.2%
193 | 1 | 0.1%
213 | 1 | 0.1%
216 | 1 | 0.1%
270 | 1 | 0.1%
275 | 1 | 0.1%
287 | 1 | 0.1%
313 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
37419 | 1 | 0.1%
33876 | 1 | 0.1%
33783 | 1 | 0.1%
33006 | 1 | 0.1%
31173 | 1 | 0.1%
29189 | 1 | 0.1%
27003 | 1 | 0.1%
26104 | 1 | 0.1%
24804 | 1 | 0.1%
24750 | 1 | 0.1%

정상_비(경계)(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1224
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19241.528
Minimum: 619
Maximum: 92260
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 619
5-th percentile: 2228.4
Q1: 4635.5
median: 16788
Q3: 29534
95-th percentile: 48016.9
Maximum: 92260
Range: 91641
Interquartile range (IQR): 24898.5

Descriptive statistics

Standard deviation: 15777.397
Coefficient of variation (CV): 0.81996594
Kurtosis: 0.60831184
Mean: 19241.528
Median Absolute Deviation (MAD): 12307
Skewness: 0.90307706
Sum: 23994185
Variance: 2.4892627 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
10413 | 2 | 0.2%
4092 | 2 | 0.2%
2480 | 2 | 0.2%
3622 | 2 | 0.2%
3214 | 2 | 0.2%
24016 | 2 | 0.2%
3838 | 2 | 0.2%
24961 | 2 | 0.2%
2499 | 2 | 0.2%
33612 | 2 | 0.2%
Other values (1214) | 1227 | 98.4%

Minimum 10 values

Value | Count | Frequency (%)
619 | 1 | 0.1%
727 | 1 | 0.1%
760 | 1 | 0.1%
761 | 1 | 0.1%
768 | 1 | 0.1%
1082 | 1 | 0.1%
1223 | 1 | 0.1%
1366 | 1 | 0.1%
1405 | 1 | 0.1%
1414 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
92260 | 1 | 0.1%
85727 | 1 | 0.1%
84819 | 1 | 0.1%
82526 | 1 | 0.1%
82122 | 1 | 0.1%
78212 | 1 | 0.1%
71831 | 1 | 0.1%
70754 | 1 | 0.1%
66520 | 1 | 0.1%
65198 | 1 | 0.1%

질환의심자수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1224
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19797.673
Minimum: 765
Maximum: 92774
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 765
5-th percentile: 2617
Q1: 5412
median: 17630
Q3: 30199.5
95-th percentile: 47206.8
Maximum: 92774
Range: 92009
Interquartile range (IQR): 24787.5

Descriptive statistics

Standard deviation: 15678.111
Coefficient of variation (CV): 0.79191686
Kurtosis: 0.74985844
Mean: 19797.673
Median Absolute Deviation (MAD): 12286
Skewness: 0.92454399
Sum: 24687698
Variance: 2.4580316 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
2617 | 3 | 0.2%
20263 | 2 | 0.2%
4330 | 2 | 0.2%
2679 | 2 | 0.2%
7082 | 2 | 0.2%
3290 | 2 | 0.2%
2575 | 2 | 0.2%
3565 | 2 | 0.2%
4968 | 2 | 0.2%
4029 | 2 | 0.2%
Other values (1214) | 1226 | 98.3%

Minimum 10 values

Value | Count | Frequency (%)
765 | 1 | 0.1%
913 | 1 | 0.1%
960 | 1 | 0.1%
1032 | 1 | 0.1%
1046 | 1 | 0.1%
1322 | 1 | 0.1%
1438 | 1 | 0.1%
1571 | 1 | 0.1%
1603 | 1 | 0.1%
1768 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
92774 | 1 | 0.1%
90862 | 1 | 0.1%
84002 | 1 | 0.1%
83204 | 1 | 0.1%
77856 | 1 | 0.1%
76062 | 1 | 0.1%
75452 | 1 | 0.1%
74555 | 1 | 0.1%
72509 | 1 | 0.1%
64431 | 1 | 0.1%

유질환자수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1214
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13466.2
Minimum: 673
Maximum: 64446
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 673
5-th percentile: 2672
Q1: 5095
median: 11106
Q3: 19555.5
95-th percentile: 31735.4
Maximum: 64446
Range: 63773
Interquartile range (IQR): 14460.5

Descriptive statistics

Standard deviation: 9738.0228
Coefficient of variation (CV): 0.72314558
Kurtosis: 1.4791214
Mean: 13466.2
Median Absolute Deviation (MAD): 6824
Skewness: 1.0983246
Sum: 16792351
Variance: 94829087
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
3115 | 2 | 0.2%
4762 | 2 | 0.2%
10724 | 2 | 0.2%
4528 | 2 | 0.2%
5662 | 2 | 0.2%
16149 | 2 | 0.2%
27212 | 2 | 0.2%
4012 | 2 | 0.2%
6901 | 2 | 0.2%
20406 | 2 | 0.2%
Other values (1204) | 1227 | 98.4%

Minimum 10 values

Value | Count | Frequency (%)
673 | 1 | 0.1%
678 | 1 | 0.1%
775 | 1 | 0.1%
837 | 1 | 0.1%
981 | 1 | 0.1%
1414 | 1 | 0.1%
1453 | 1 | 0.1%
1529 | 1 | 0.1%
1629 | 1 | 0.1%
1633 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
64446 | 1 | 0.1%
60520 | 1 | 0.1%
60463 | 1 | 0.1%
51854 | 1 | 0.1%
50456 | 1 | 0.1%
49045 | 1 | 0.1%
48041 | 1 | 0.1%
47756 | 1 | 0.1%
46005 | 1 | 0.1%
42248 | 1 | 0.1%

Interactions

[Figure: pairwise interaction plots of the numeric variables (25 panels)]

Correlations

[Figure: correlation heatmap]

Variable | 통계연도 | 시도명 | 일반건강검진율(퍼센트) | 정상_에이(명) | 정상_비(경계)(명) | 질환의심자수(명) | 유질환자수(명)
통계연도 | 1.000 | 0.000 | 0.418 | 0.352 | 0.000 | 0.000 | 0.154
시도명 | 0.000 | 1.000 | 0.466 | 0.475 | 0.552 | 0.549 | 0.538
일반건강검진율(퍼센트) | 0.418 | 0.466 | 1.000 | 0.689 | 0.692 | 0.651 | 0.611
정상_에이(명) | 0.352 | 0.475 | 0.689 | 1.000 | 0.937 | 0.895 | 0.920
정상_비(경계)(명) | 0.000 | 0.552 | 0.692 | 0.937 | 1.000 | 0.979 | 0.920
질환의심자수(명) | 0.000 | 0.549 | 0.651 | 0.895 | 0.979 | 1.000 | 0.927
유질환자수(명) | 0.154 | 0.538 | 0.611 | 0.920 | 0.920 | 0.927 | 1.000

[Figure: correlation heatmap]

Variable | 통계연도 | 시도명
통계연도 | 1.000 | 0.000
시도명 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 일반건강검진율(퍼센트) | 정상_에이(명) | 정상_비(경계)(명) | 질환의심자수(명) | 유질환자수(명) | 통계연도 | 시도명
일반건강검진율(퍼센트) | 1.000 | 0.793 | 0.746 | 0.674 | 0.617 | 0.186 | 0.202
정상_에이(명) | 0.793 | 1.000 | 0.951 | 0.930 | 0.950 | 0.153 | 0.207
정상_비(경계)(명) | 0.746 | 0.951 | 1.000 | 0.991 | 0.955 | 0.000 | 0.252
질환의심자수(명) | 0.674 | 0.930 | 0.991 | 1.000 | 0.958 | 0.000 | 0.250
유질환자수(명) | 0.617 | 0.950 | 0.955 | 0.958 | 1.000 | 0.064 | 0.243
통계연도 | 0.186 | 0.153 | 0.000 | 0.000 | 0.064 | 1.000 | 0.000
시도명 | 0.202 | 0.207 | 0.252 | 0.250 | 0.243 | 0.000 | 1.000
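
The High correlation alerts can be related to the first correlation matrix by listing, for each variable, the other variables whose coefficient exceeds a cutoff. The sketch below is illustrative only. The cutoff of 0.6 is a hypothetical value, chosen because it reproduces the alert counts in this report (each of the five numeric variables has four partners). The threshold the tool actually applied is not stated in the report, and `high_corr_partners` is a helper made up for this example.

```python
names = ["통계연도", "시도명", "일반건강검진율(퍼센트)", "정상_에이(명)",
         "정상_비(경계)(명)", "질환의심자수(명)", "유질환자수(명)"]

# First correlation matrix of the report, rows and columns in `names` order.
matrix = [
    [1.000, 0.000, 0.418, 0.352, 0.000, 0.000, 0.154],
    [0.000, 1.000, 0.466, 0.475, 0.552, 0.549, 0.538],
    [0.418, 0.466, 1.000, 0.689, 0.692, 0.651, 0.611],
    [0.352, 0.475, 0.689, 1.000, 0.937, 0.895, 0.920],
    [0.000, 0.552, 0.692, 0.937, 1.000, 0.979, 0.920],
    [0.000, 0.549, 0.651, 0.895, 0.979, 1.000, 0.927],
    [0.154, 0.538, 0.611, 0.920, 0.920, 0.927, 1.000],
]

THRESHOLD = 0.6  # hypothetical cutoff, not taken from the report

def high_corr_partners(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            result[name] = partners
    return result

alerts = high_corr_partners(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name}: {partners[0]} and {len(partners) - 1} other fields")
```

With this cutoff, 통계연도 and 시도명 get no entry, which is consistent with the Alerts section listing only the five numeric variables.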

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

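First rows

Index | 통계연도 | 시도명 | 시군구명 | 일반건강검진율(퍼센트) | 정상_에이(명) | 정상_비(경계)(명) | 질환의심자수(명) | 유질환자수(명)
0 | 2016 | 서울특별시 | 종로구 | 41.77 | 2645 | 11837 | 12569 | 7620
1 | 2016 | 서울특별시 | 중구 | 40.43 | 2341 | 10135 | 11284 | 7096
2 | 2016 | 서울특별시 | 용산구 | 44.58 | 4799 | 18441 | 17782 | 11106
3 | 2016 | 서울특별시 | 성동구 | 41.29 | 5628 | 24137 | 26150 | 16170
4 | 2016 | 서울특별시 | 광진구 | 44.29 | 7747 | 31947 | 31739 | 18187
5 | 2016 | 서울특별시 | 동대문구 | 38.95 | 6200 | 27585 | 32344 | 20604
6 | 2016 | 서울특별시 | 중랑구 | 39.51 | 7722 | 33491 | 37897 | 25198
7 | 2016 | 서울특별시 | 성북구 | 39.97 | 7690 | 34864 | 38297 | 25622
8 | 2016 | 서울특별시 | 강북구 | 36.74 | 4812 | 24655 | 28650 | 22089
9 | 2016 | 서울특별시 | 도봉구 | 37.24 | 5481 | 27497 | 32745 | 22835

Last rows

Index | 통계연도 | 시도명 | 시군구명 | 일반건강검진율(퍼센트) | 정상_에이(명) | 정상_비(경계)(명) | 질환의심자수(명) | 유질환자수(명)
1237 | 2020 | 경상남도 | 창녕군 | 35.67 | 1676 | 4351 | 5610 | 5261
1238 | 2020 | 경상남도 | 고성군 | 34.26 | 1175 | 3606 | 4471 | 4702
1239 | 2020 | 경상남도 | 남해군 | 31.16 | 828 | 2548 | 3240 | 4218
1240 | 2020 | 경상남도 | 하동군 | 35.08 | 893 | 2868 | 3539 | 3422
1241 | 2020 | 경상남도 | 산청군 | 33.42 | 886 | 2128 | 2878 | 3126
1242 | 2020 | 경상남도 | 함양군 | 33.79 | 777 | 2590 | 3202 | 3396
1243 | 2020 | 경상남도 | 거창군 | 38.35 | 1605 | 4760 | 5104 | 5129
1244 | 2020 | 경상남도 | 합천군 | 33.94 | 1109 | 2826 | 3328 | 4331
1245 | 2020 | 제주특별자치도 | 제주시 | 38.16 | 13081 | 34667 | 48547 | 28829
1246 | 2020 | 제주특별자치도 | 서귀포시 | 35.58 | 4454 | 11322 | 16386 | 12176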