Overview

Dataset statistics

Number of variables: 6
Number of observations: 1247
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 63.5 KiB
Average record size in memory: 52.1 B
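
The average record size follows from the total memory size and the number of observations. The short check below is a sketch that assumes 1 KiB = 1024 bytes and that the average is simply total size divided by row count; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 1247
total_size_kib = 63.5

# Assumption: 1 KiB = 1024 bytes, average = total bytes / number of rows.
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

# Total number of cells that were checked for missing values.
total_cells = n_variables * n_observations

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {total_cells}")
```

Under these assumptions the result rounds to the 52.1 B reported above.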

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

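Description: One of the statistical indices developed by Gimhae City to assess the state of the city on a statistical basis. It consists of the statistical year (통계연도), province name (시도명), city/county/district name (시군구명), eligible population in persons (대상인원(명)), screened population in persons (수검인원(명)), and cancer screening rate in percent (암검진수검률(퍼센트)). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110154/fileData.do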

Alerts

대상인원(명) is highly correlated overall with 수검인원(명) (High correlation)
수검인원(명) is highly correlated overall with 대상인원(명) (High correlation)

Reproduction

Analysis started: 2023-12-12 06:57:13.291330
Analysis finished: 2023-12-12 06:57:14.734112
Duration: 1.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
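
A report of this kind is typically generated from a pandas DataFrame. The following is a minimal sketch, not the command actually used for this report: the file names, the encoding, and the report title are assumptions, and the imports are deferred so the function can be defined without the third-party packages installed.

```python
def build_profile(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Deferred imports: pandas and ydata-profiling are third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is hypothetical; the original report title is not known.
    report = ProfileReport(df, title="Cancer screening rate profile")
    report.to_file(html_path)


# Hypothetical usage; neither file name comes from the report:
# build_profile("cancer_screening.csv", "report.html", encoding="cp949")
```

The encoding is left as a parameter because the encoding of the source file is not stated in the report.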

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value  Count
2016  251
2017  249
2018  249
2019  249
2020  249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value  Count  Frequency (%)
2016  251  20.1%
2017  249  20.0%
2018  249  20.0%
2019  249  20.0%
2020  249  20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as listed under Common Values]

시도명
Categorical

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value  Count
경기도  212
서울특별시  125
경상북도  120
전라남도  110
경상남도  110
Other values (11)  570

Length

Max length: 7
Median length: 5
Mean length: 4.0785886
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  212  17.0%
서울특별시  125  10.0%
경상북도  120  9.6%
전라남도  110  8.8%
경상남도  110  8.8%
강원도  90  7.2%
부산광역시  80  6.4%
충청남도  80  6.4%
전라북도  75  6.0%
충청북도  70  5.6%
Other values (6)  175  14.0%

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 231
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.3680834
Min length: 2

Characters and Unicode

Total characters: 4200
Distinct characters: 143
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 0.3%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value  Count  Frequency (%)
동구  30  2.4%
중구  30  2.4%
창원시  25  2.0%
서구  25  2.0%
북구  20  1.6%
남구  20  1.6%
고성군  10  0.8%
강서구  10  0.8%
무주군  5  0.4%
진안군  5  0.4%
Other values (222)  1092  85.8%

Most occurring characters

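Value  Count  Frequency (%)
(glyph lost)  533  12.7%
(glyph lost)  493  11.7%
(glyph lost)  429  10.2%
(glyph lost)  120  2.9%
(glyph lost)  117  2.8%
(glyph lost)  115  2.7%
(glyph lost)  110  2.6%
(glyph lost)  105  2.5%
(glyph lost)  100  2.4%
(glyph lost)  91  2.2%
Other values (133)  1987  47.3%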

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4175  99.4%
Space Separator  25  0.6%

Most frequent character per category

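Other Letter
Value  Count  Frequency (%)
(glyph lost)  533  12.8%
(glyph lost)  493  11.8%
(glyph lost)  429  10.3%
(glyph lost)  120  2.9%
(glyph lost)  117  2.8%
(glyph lost)  115  2.8%
(glyph lost)  110  2.6%
(glyph lost)  105  2.5%
(glyph lost)  100  2.4%
(glyph lost)  91  2.2%
Other values (132)  1962  47.0%
Space Separator
Value  Count  Frequency (%)
(space)  25  100.0%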

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4175  99.4%
Common  25  0.6%

Most frequent character per script

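Hangul
Value  Count  Frequency (%)
(glyph lost)  533  12.8%
(glyph lost)  493  11.8%
(glyph lost)  429  10.3%
(glyph lost)  120  2.9%
(glyph lost)  117  2.8%
(glyph lost)  115  2.8%
(glyph lost)  110  2.6%
(glyph lost)  105  2.5%
(glyph lost)  100  2.4%
(glyph lost)  91  2.2%
Other values (132)  1962  47.0%
Common
Value  Count  Frequency (%)
(space)  25  100.0%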

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4175  99.4%
ASCII  25  0.6%

Most frequent character per block

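Hangul
Value  Count  Frequency (%)
(glyph lost)  533  12.8%
(glyph lost)  493  11.8%
(glyph lost)  429  10.3%
(glyph lost)  120  2.9%
(glyph lost)  117  2.8%
(glyph lost)  115  2.8%
(glyph lost)  110  2.6%
(glyph lost)  105  2.5%
(glyph lost)  100  2.4%
(glyph lost)  91  2.2%
Other values (132)  1962  47.0%
ASCII
Value  Count  Frequency (%)
(space)  25  100.0%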

대상인원(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1234
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90359.488
Minimum: 5063
Maximum: 381090
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 5063
5th percentile: 15021.3
Q1: 31254.5
Median: 77885
Q3: 134868
95th percentile: 209276.6
Maximum: 381090
Range: 376027
Interquartile range (IQR): 103613.5

Descriptive statistics

Standard deviation: 66693.186
Coefficient of variation (CV): 0.73808725
Kurtosis: 0.43864653
Mean: 90359.488
Median Absolute Deviation (MAD): 50169
Skewness: 0.87731996
Sum: 1.1267828 × 10^8
Variance: 4.4479811 × 10^9
Monotonicity: Not monotonic
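
Several of the figures above are derived from others, so they can be cross-checked with basic arithmetic. The sketch below uses the rounded values printed in this report for 대상인원(명) and assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count); small differences are expected because the inputs are rounded.

```python
import math

# Rounded figures copied from the report for 대상인원(명).
n = 1247
mean = 90359.488
std = 66693.186
q1, q3 = 31254.5, 134868
minimum, maximum = 5063, 381090

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50%
cv = std / mean                   # relative dispersion (unitless)
variance = std ** 2               # squared standard deviation
total = mean * n                  # sum recovered from the mean

print(f"range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"CV       = {cv:.8f}")
print(f"variance = {variance:.7e}")
print(f"sum      = {total:.7e}")

# Compare against the reported values with a loose relative tolerance.
assert math.isclose(cv, 0.73808725, rel_tol=1e-5)
assert math.isclose(variance, 4.4479811e9, rel_tol=1e-5)
assert math.isclose(total, 1.1267828e8, rel_tol=1e-5)
```

A CV of roughly 0.74 means the standard deviation is about three quarters of the mean, which is consistent with the positive skewness reported above.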
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
27534  2  0.2%
19776  2  0.2%
45045  2  0.2%
121321  2  0.2%
42702  2  0.2%
18681  2  0.2%
31742  2  0.2%
21883  2  0.2%
126759  2  0.2%
5361  2  0.2%
Other values (1224)  1227  98.4%

Minimum 10 values

Value  Count  Frequency (%)
5063  1  0.1%
5305  1  0.1%
5361  2  0.2%
5411  1  0.1%
9747  1  0.1%
9912  1  0.1%
9981  1  0.1%
10044  1  0.1%
10059  1  0.1%
10143  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
381090  1  0.1%
372436  1  0.1%
371574  1  0.1%
371372  1  0.1%
309343  1  0.1%
294929  1  0.1%
290105  1  0.1%
289805  1  0.1%
287134  1  0.1%
286360  1  0.1%

수검인원(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1236
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46778.204
Minimum: 1957
Maximum: 223191
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 1957
5th percentile: 7819.9
Q1: 15661
Median: 39212
Q3: 71660.5
95th percentile: 111608.1
Maximum: 223191
Range: 221234
Interquartile range (IQR): 55999.5

Descriptive statistics

Standard deviation: 35002.132
Coefficient of variation (CV): 0.74825729
Kurtosis: 0.72004472
Mean: 46778.204
Median Absolute Deviation (MAD): 25667
Skewness: 0.92995426
Sum: 58332420
Variance: 1.2251492 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
14893  2  0.2%
11498  2  0.2%
26053  2  0.2%
43242  2  0.2%
82576  2  0.2%
10129  2  0.2%
12837  2  0.2%
21116  2  0.2%
8037  2  0.2%
7305  2  0.2%
Other values (1226)  1227  98.4%

Minimum 10 values

Value  Count  Frequency (%)
1957  1  0.1%
2229  1  0.1%
2677  1  0.1%
2720  1  0.1%
2857  1  0.1%
4377  1  0.1%
4557  1  0.1%
4576  1  0.1%
4585  1  0.1%
4664  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
223191  1  0.1%
210318  1  0.1%
189354  1  0.1%
185331  1  0.1%
163754  1  0.1%
162481  1  0.1%
154184  1  0.1%
152597  1  0.1%
150940  1  0.1%
146923  1  0.1%

암검진수검률(퍼센트)
Real number (ℝ)

Distinct: 838
Distinct (%): 67.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.536592
Minimum: 38.65
Maximum: 64.04
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 38.65
5th percentile: 44.923
Q1: 48.575
Median: 51.61
Q3: 54.555
95th percentile: 58.041
Maximum: 64.04
Range: 25.39
Interquartile range (IQR): 5.98

Descriptive statistics

Standard deviation: 4.0569495
Coefficient of variation (CV): 0.078719786
Kurtosis: -0.32687305
Mean: 51.536592
Median Absolute Deviation (MAD): 3.01
Skewness: -0.024171567
Sum: 64266.13
Variance: 16.458839
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
48.56  6  0.5%
52.83  5  0.4%
48.3  4  0.3%
47.3  4  0.3%
49.73  4  0.3%
54.64  4  0.3%
49.49  4  0.3%
53.12  4  0.3%
49.3  4  0.3%
49.64  4  0.3%
Other values (828)  1204  96.6%

Minimum 10 values

Value  Count  Frequency (%)
38.65  1  0.1%
40.07  1  0.1%
40.87  1  0.1%
40.88  1  0.1%
41.13  1  0.1%
41.52  1  0.1%
41.59  1  0.1%
41.63  1  0.1%
41.75  1  0.1%
41.79  1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
64.04  1  0.1%
62.95  1  0.1%
61.78  1  0.1%
61.62  1  0.1%
61.45  1  0.1%
61.26  1  0.1%
61.12  1  0.1%
61.02  1  0.1%
60.87  1  0.1%
60.86  1  0.1%

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable  통계연도  시도명  대상인원(명)  수검인원(명)  암검진수검률(퍼센트)
통계연도  1.000  0.000  0.000  0.000  0.643
시도명  0.000  1.000  0.587  0.549  0.396
대상인원(명)  0.000  0.587  1.000  0.973  0.052
수검인원(명)  0.000  0.549  0.973  1.000  0.193
암검진수검률(퍼센트)  0.643  0.396  0.052  0.193  1.000

[Figure: correlation heatmap]

Variable  통계연도  시도명
통계연도  1.000  0.000
시도명  0.000  1.000

[Figure: correlation heatmap]

Variable  대상인원(명)  수검인원(명)  암검진수검률(퍼센트)  통계연도  시도명
대상인원(명)  1.000  0.995  0.072  0.000  0.275
수검인원(명)  0.995  1.000  0.158  0.000  0.250
암검진수검률(퍼센트)  0.072  0.158  1.000  0.320  0.166
통계연도  0.000  0.000  0.320  1.000  0.000
시도명  0.275  0.250  0.166  0.000  1.000
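
The strong association between 대상인원(명) and 수검인원(명) (0.973 and 0.995 in the matrices above) can be illustrated on a small scale. The sketch below computes a plain Pearson coefficient on the ten first rows shown in the Sample section only; the report's coefficients were computed on all 1247 rows and the source does not state which correlation measures it used, so the number produced here is illustrative and will not match the matrices.

```python
import math

# 대상인원(명) and 수검인원(명) for the ten first sample rows of the report.
target = [68583, 58506, 97967, 126249, 150977,
          154880, 179048, 189284, 143332, 153526]
screened = [28028, 25138, 40294, 57105, 69117,
            70704, 87142, 87995, 71262, 76032]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


r = pearson(target, screened)
print(f"Pearson r on 10 sample rows: {r:.3f}")
```

Because the screened count is by construction a fraction of the eligible count, and that fraction varies within a fairly narrow band, a coefficient close to 1 is expected.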

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  통계연도  시도명  시군구명  대상인원(명)  수검인원(명)  암검진수검률(퍼센트)
0  2016  서울특별시  종로구  68583  28028  40.87
1  2016  서울특별시  중구  58506  25138  42.97
2  2016  서울특별시  용산구  97967  40294  41.13
3  2016  서울특별시  성동구  126249  57105  45.23
4  2016  서울특별시  광진구  150977  69117  45.78
5  2016  서울특별시  동대문구  154880  70704  45.65
6  2016  서울특별시  중랑구  179048  87142  48.67
7  2016  서울특별시  성북구  189284  87995  46.49
8  2016  서울특별시  강북구  143332  71262  49.72
9  2016  서울특별시  도봉구  153526  76032  49.52

Last rows

Index  통계연도  시도명  시군구명  대상인원(명)  수검인원(명)  암검진수검률(퍼센트)
1237  2020  경상남도  창녕군  33600  14423  42.93
1238  2020  경상남도  고성군  27591  12915  46.81
1239  2020  경상남도  남해군  24073  10930  45.4
1240  2020  경상남도  하동군  24419  11321  46.36
1241  2020  경상남도  산청군  20104  8898  44.26
1242  2020  경상남도  함양군  21872  10161  46.46
1243  2020  경상남도  거창군  31994  15722  49.14
1244  2020  경상남도  합천군  26464  11681  44.14
1245  2020  제주특별자치도  제주시  207206  96220  46.44
1246  2020  제주특별자치도  서귀포시  84172  38483  45.72
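
The sample rows suggest that 암검진수검률(퍼센트) is the screened count divided by the eligible count, expressed as a percentage. The source does not state the formula, so the sketch below treats it as an assumption and checks it against a handful of rows copied from the sample; the tuple layout and the helper name are illustrative.

```python
# (시군구명, 대상인원(명), 수검인원(명), reported 암검진수검률(퍼센트))
rows = [
    ("종로구", 68583, 28028, 40.87),
    ("중구", 58506, 25138, 42.97),
    ("용산구", 97967, 40294, 41.13),
    ("남해군", 24073, 10930, 45.4),
    ("제주시", 207206, 96220, 46.44),
    ("서귀포시", 84172, 38483, 45.72),
]


def screening_rate(eligible, screened):
    """Assumed formula: screened / eligible, as a percentage."""
    return screened / eligible * 100


# Largest gap between the recomputed and the reported rate.
max_diff = max(abs(screening_rate(e, s) - rate) for _, e, s, rate in rows)

for name, eligible, screened, rate in rows:
    print(f"{name}: computed {screening_rate(eligible, screened):.2f}, reported {rate}")
print(f"largest difference: {max_diff:.4f}")
```

For these rows the recomputed rate agrees with the reported one to within rounding, which supports the assumed formula without proving that it holds for every row.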