Overview

Dataset statistics

Number of variables: 6
Number of observations: 1400
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 71.2 KiB
Average record size in memory: 52.1 B
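
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming the report uses binary units (1 KiB = 1024 B); the variable names are illustrative:

```python
# Figures copied from the Dataset statistics above.
total_size_kib = 71.2    # Total size in memory
n_observations = 1400    # Number of observations

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # 52.1 B
```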

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

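Description: One of the statistical indices developed by Gimhae City (김해시) to assess the state of the city from statistics. It consists of the statistical year (통계연도), province name (시도명), district name (시군구명), total deaths in persons (총사망자수(명)), deaths from cancer in persons (암으로인한 사망자수(명)), and cancer death rate in percent (암으로 인한 사망률(퍼센트)). Because the index is centered on Gimhae City, information for other regions may be missing owing to difficulties in collecting and processing the data.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110160/fileData.do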

Alerts

총사망자수(명) is highly overall correlated with 암으로인한 사망자수(명) [High correlation]
암으로인한 사망자수(명) is highly overall correlated with 총사망자수(명) [High correlation]
총사망자수(명) has 100 (7.1%) zeros [Zeros]
암으로인한 사망자수(명) has 100 (7.1%) zeros [Zeros]
암으로 인한 사망률(퍼센트) has 100 (7.1%) zeros [Zeros]
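
All three numeric columns report exactly 100 zeros, and the dataset description warns that data for regions outside 김해시 may be missing, so the zeros may be placeholders rather than true counts. That reading is an assumption, not something the report states. A minimal sketch of flagging such rows, using four rows copied from the end of the Sample section (the helper `is_all_zero` is hypothetical):

```python
# Rows copied from the Sample section:
# (시군구명, 총사망자수(명), 암으로인한 사망자수(명), 암으로 인한 사망률(퍼센트))
rows = [
    ("제주시", 2715, 736, 151.2),
    ("서귀포시", 1237, 336, 186.2),
    ("북제주군", 0, 0, 0.0),
    ("남제주군", 0, 0, 0.0),
]

def is_all_zero(row):
    """Return True when every numeric field of the row is zero."""
    return all(value == 0 for value in row[1:])

# Assumption: an all-zero row is a placeholder, not a real measurement.
placeholders = [row[0] for row in rows if is_all_zero(row)]
print(placeholders)  # ['북제주군', '남제주군']

# Share of zeros reported in the alerts: 100 of 1400 rows.
zero_share = 100 / 1400 * 100
print(f"{zero_share:.1f}%")  # 7.1%
```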

Reproduction

Analysis started: 2023-12-12 21:07:46.619076
Analysis finished: 2023-12-12 21:07:48.007752
Duration: 1.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
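
The reported duration is the difference between the two timestamps. A short sketch of that calculation with the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 21:07:46.619076")
finished = datetime.fromisoformat("2023-12-12 21:07:48.007752")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 1.39 seconds
```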

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.1 KiB

Value  Count
2016   280
2017   280
2018   280
2019   280
2020   280

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value  Count  Frequency (%)
2016   280    20.0%
2017   280    20.0%
2018   280    20.0%
2019   280    20.0%
2020   280    20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; each of the five years accounts for 280 rows (20.0%)]

시도명
Categorical

Distinct: 16
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.1 KiB

Value  Count
경기도  285
경상남도  140
서울특별시  125
경상북도  125
전라남도  110
Other values (11)  615

Length

Max length: 5
Median length: 4
Mean length: 3.9857143
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  285  20.4%
경상남도  140  10.0%
서울특별시  125  8.9%
경상북도  125  8.9%
전라남도  110  7.9%
충청남도  95  6.8%
강원도  90  6.4%
부산광역시  80  5.7%
충청북도  80  5.7%
전라북도  80  5.7%
Other values (6)  190  13.6%

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 255
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 11.1 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9714286
Min length: 2

Characters and Unicode

Total characters: 4160
Distinct characters: 142
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
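
The mean length reported for this text variable is consistent with the total character count divided by the number of rows. A quick check of that arithmetic, with illustrative variable names:

```python
# Figures copied from the report.
total_characters = 4160   # Total characters
n_observations = 1400     # Number of observations

mean_length = total_characters / n_observations
print(f"{mean_length:.7f}")  # 2.9714286
```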

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value  Count  Frequency (%)
동구  30  2.1%
중구  30  2.1%
남구  30  2.1%
북구  25  1.8%
서구  25  1.8%
고성군  10  0.7%
강서구  10  0.7%
남원시  5  0.4%
완산구  5  0.4%
덕진구  5  0.4%
Other values (245)  1225  87.5%

Most occurring characters

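Value  Count  Frequency (%)
[missing]  565  13.6%
[missing]  475  11.4%
[missing]  405  9.7%
[missing]  125  3.0%
[missing]  120  2.9%
[missing]  115  2.8%
[missing]  100  2.4%
[missing]  100  2.4%
[missing]  100  2.4%
[missing]  80  1.9%
Other values (132)  1975  47.5%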

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4160  100.0%

Most frequent character per category

Other Letter: every character falls in this category, so the counts are the same as under Most occurring characters.

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4160  100.0%

Most frequent character per script

Hangul: every character is in this script, so the counts are the same as under Most occurring characters.

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4160  100.0%

Most frequent character per block

Hangul: every character is in this block, so the counts are the same as under Most occurring characters.

총사망자수(명)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 988
Distinct (%): 70.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1192.1586
Minimum: 0
Maximum: 5627
Zeros: 100
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 571
Median: 1072
Q3: 1601
95-th percentile: 2703.6
Maximum: 5627
Range: 5627
Interquartile range (IQR): 1030

Descriptive statistics

Standard deviation: 893.61007
Coefficient of variation (CV): 0.74957316
Kurtosis: 3.551097
Mean: 1192.1586
Median Absolute Deviation (MAD): 513.5
Skewness: 1.4810222
Sum: 1669022
Variance: 798538.96
Monotonicity: Not monotonic
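
Several of the statistics above can be derived from one another. The sketch below recomputes them from the rounded figures printed in the report, so small differences in the last digits are expected; the variable names are illustrative:

```python
# Figures copied from the 총사망자수(명) statistics above.
mean = 1192.1586
std = 893.61007
n = 1400
q1, q3 = 571, 1601
minimum, maximum = 0, 5627

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"{cv:.4f}")        # 0.7496
print(f"{variance:.0f}")  # 798539
print(iqr, value_range)   # 1030 5627
print(round(total))       # 1669022
```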
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0                   100    7.1%
994                 5      0.4%
1144                4      0.3%
725                 4      0.3%
1214                4      0.3%
1094                4      0.3%
1192                4      0.3%
1145                4      0.3%
1110                4      0.3%
897                 4      0.3%
Other values (978)  1263   90.2%

Minimum 10 values

Value  Count  Frequency (%)
0      100    7.1%
63     1      0.1%
64     1      0.1%
67     1      0.1%
70     1      0.1%
76     1      0.1%
146    1      0.1%
167    1      0.1%
173    1      0.1%
174    1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
5627   1      0.1%
5615   1      0.1%
5473   1      0.1%
5362   1      0.1%
5294   1      0.1%
5119   1      0.1%
4993   1      0.1%
4939   1      0.1%
4913   1      0.1%
4857   1      0.1%

암으로인한 사망자수(명)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 619
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 326.05571
Minimum: 0
Maximum: 1593
Zeros: 100
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 142
Median: 283
Q3: 442
95-th percentile: 761.05
Maximum: 1593
Range: 1593
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 255.28376
Coefficient of variation (CV): 0.7829452
Kurtosis: 3.2623255
Mean: 326.05571
Median Absolute Deviation (MAD): 149
Skewness: 1.4680083
Sum: 456478
Variance: 65169.797
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0                   100    7.1%
173                 9      0.6%
134                 8      0.6%
123                 8      0.6%
98                  8      0.6%
205                 7      0.5%
128                 7      0.5%
319                 7      0.5%
168                 7      0.5%
178                 6      0.4%
Other values (609)  1233   88.1%

Minimum 10 values

Value  Count  Frequency (%)
0      100    7.1%
19     1      0.1%
21     1      0.1%
22     2      0.1%
29     1      0.1%
36     1      0.1%
37     1      0.1%
38     1      0.1%
39     2      0.1%
41     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1593   1      0.1%
1513   1      0.1%
1506   1      0.1%
1485   1      0.1%
1472   1      0.1%
1456   1      0.1%
1452   1      0.1%
1376   1      0.1%
1359   1      0.1%
1345   1      0.1%

암으로 인한 사망률(퍼센트)
Real number (ℝ)

Alerts: Zeros

Distinct: 1001
Distinct (%): 71.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186.3105
Minimum: 0
Maximum: 494.3
Zeros: 100
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 128.975
Median: 167.5
Q3: 252.8
95-th percentile: 349.22
Maximum: 494.3
Range: 494.3
Interquartile range (IQR): 123.825

Descriptive statistics

Standard deviation: 94.610769
Coefficient of variation (CV): 0.50781233
Kurtosis: -0.089841444
Mean: 186.3105
Median Absolute Deviation (MAD): 53.1
Skewness: 0.25197663
Sum: 260834.7
Variance: 8951.1977
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
0.0                 100    7.1%
129.9               5      0.4%
141.5               5      0.4%
132.8               4      0.3%
136.5               4      0.3%
140.0               4      0.3%
106.5               4      0.3%
130.4               4      0.3%
132.6               4      0.3%
123.2               3      0.2%
Other values (991)  1263   90.2%

Minimum 10 values

Value  Count  Frequency (%)
0.0    100    7.1%
68.6   1      0.1%
71.1   1      0.1%
78.1   1      0.1%
80.4   1      0.1%
81.1   1      0.1%
81.7   1      0.1%
83.4   2      0.1%
84.2   1      0.1%
86.9   1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
494.3  1      0.1%
474.0  1      0.1%
464.4  1      0.1%
462.9  1      0.1%
456.9  1      0.1%
442.2  1      0.1%
432.9  1      0.1%
431.6  2      0.1%
426.7  1      0.1%
416.3  1      0.1%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  통계연도  시도명  총사망자수(명)  암으로인한 사망자수(명)  암으로 인한 사망률(퍼센트)
통계연도  1.000  0.000  0.000  0.000  0.000
시도명  0.000  1.000  0.509  0.518  0.579
총사망자수(명)  0.000  0.509  1.000  0.987  0.681
암으로인한 사망자수(명)  0.000  0.518  0.987  1.000  0.671
암으로 인한 사망률(퍼센트)  0.000  0.579  0.681  0.671  1.000

[Figure: correlation heatmap]

Variable  통계연도  시도명
통계연도  1.000  0.000
시도명  0.000  1.000

[Figure: correlation heatmap]

Variable  총사망자수(명)  암으로인한 사망자수(명)  암으로 인한 사망률(퍼센트)  통계연도  시도명
총사망자수(명)  1.000  0.993  -0.286  0.000  0.226
암으로인한 사망자수(명)  0.993  1.000  -0.298  0.000  0.231
암으로 인한 사망률(퍼센트)  -0.286  -0.298  1.000  0.000  0.269
통계연도  0.000  0.000  0.000  1.000  0.000
시도명  0.226  0.231  0.269  0.000  1.000
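
The report does not say which coefficient each matrix uses. As an illustration only, the sketch below computes a plain Pearson correlation between 총사망자수(명) and 암으로인한 사망자수(명) for the ten first rows listed in the Sample section. Pearson is a choice made here, and ten rows will not reproduce the full-data values of 0.987 and 0.993; the helper `pearson` is hypothetical.

```python
from math import sqrt

# 총사망자수(명) and 암으로인한 사망자수(명) for the first ten Sample rows.
total_deaths = [823, 669, 1137, 1304, 1344, 1921, 2013, 2139, 1845, 1819]
cancer_deaths = [216, 188, 341, 386, 391, 585, 593, 642, 562, 572]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(total_deaths, cancer_deaths)
print(round(r, 3))
```

A value close to 1 on this small subset agrees in direction with the High correlation alert, but it is not a substitute for the figures in the matrices.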

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

First rows

Index  통계연도  시도명  시군구명  총사망자수(명)  암으로인한 사망자수(명)  암으로 인한 사망률(퍼센트)
0  2016  서울특별시  종로구  823  216  145.6
1  2016  서울특별시  중구  669  188  155.2
2  2016  서울특별시  용산구  1137  341  153.7
3  2016  서울특별시  성동구  1304  386  131.8
4  2016  서울특별시  광진구  1344  391  110.7
5  2016  서울특별시  동대문구  1921  585  166.8
6  2016  서울특별시  중랑구  2013  593  145.9
7  2016  서울특별시  성북구  2139  642  143.6
8  2016  서울특별시  강북구  1845  562  173.9
9  2016  서울특별시  도봉구  1819  572  165.6

Last rows

Index  통계연도  시도명  시군구명  총사망자수(명)  암으로인한 사망자수(명)  암으로 인한 사망률(퍼센트)
1390  2020  경상남도  남해군  710  167  388.3
1391  2020  경상남도  하동군  624  148  326.1
1392  2020  경상남도  산청군  521  123  353.0
1393  2020  경상남도  함양군  558  134  343.2
1394  2020  경상남도  거창군  713  199  323.7
1395  2020  경상남도  합천군  735  148  335.1
1396  2020  제주도  제주시  2715  736  151.2
1397  2020  제주도  서귀포시  1237  336  186.2
1398  2020  제주도  북제주군  0  0  0.0
1399  2020  제주도  남제주군  0  0  0.0
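
The column 암으로 인한 사망률(퍼센트) is labelled as a percentage, yet its values run up to 494.3, so it cannot be the share of deaths caused by cancer. The sketch below checks this for the first Sample row (서울특별시 종로구, 2016). The per-100,000 reading in the second half is an assumption made here for illustration; the report does not state the unit's denominator.

```python
# First Sample row (서울특별시 종로구, 2016).
total_deaths = 823     # 총사망자수(명)
cancer_deaths = 216    # 암으로인한 사망자수(명)
reported_rate = 145.6  # 암으로 인한 사망률(퍼센트)

# Cancer deaths as a share of all deaths, in percent.
share_pct = cancer_deaths / total_deaths * 100
print(f"{share_pct:.1f}%")  # 26.2%

# Hypothetical reading: if the rate were deaths per 100,000 residents,
# this is the population it would imply for the district.
implied_population = cancer_deaths / reported_rate * 100_000
print(round(implied_population))
```

The computed share is far from the reported 145.6, which suggests the rate uses a different denominator than total deaths.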