Overview

Dataset statistics

Number of variables: 6
Number of observations: 420
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.5 KiB
Average record size in memory: 52.3 B

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

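Description: One of the statistical indices developed by Gimhae City (김해시) to assess urban conditions from statistics. Its fields are the statistical year (통계연도), province name (시도명), city/county/district name (시군구명), crimes per 1,000 residents (인구 천명당 범죄발생 건수(건)), number of crimes (범죄 수(건)), and total population (총인구수(명)). Because the index is centered on Gimhae City, information for areas outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110167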
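
The description lists a per-1,000 rate alongside the raw crime count and the population. A minimal sketch of how the three fields appear to relate, assuming the rate is simply crimes divided by population times 1,000 (the source does not state the formula). The rows are copied from the Sample section of this report; the function name `crimes_per_thousand` is illustrative only.

```python
# Rows copied from the Sample section of this report:
# (시군구명, 범죄 수(건), 총인구수(명), reported 인구 천명당 범죄발생 건수(건))
SAMPLE_ROWS = [
    ("서울특별시", 341925, 9930616, 34.43),
    ("부산광역시", 133966, 3498529, 38.29),
    ("수원시", 50728, 1194041, 42.48),
]


def crimes_per_thousand(crimes: int, population: int) -> float:
    """Crimes per 1,000 residents (assumed definition of the rate column)."""
    return crimes / population * 1000


# Compare the recomputed rate with the value reported in the dataset.
differences = []
for name, crimes, population, reported in SAMPLE_ROWS:
    computed = crimes_per_thousand(crimes, population)
    differences.append(abs(computed - reported))
    print(f"{name}: computed {computed:.2f}, reported {reported}")
```

For these three rows the recomputed rate agrees with the reported one to within rounding, which supports the assumed definition without proving it for every row.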

Alerts

범죄 수(건) is highly overall correlated with 총인구수(명) and 1 other field (High correlation)
총인구수(명) is highly overall correlated with 범죄 수(건) and 1 other field (High correlation)
시도명 is highly overall correlated with 범죄 수(건) and 1 other field (High correlation)
총인구수(명) has unique values (Unique)
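
The first two alerts pair 범죄 수(건) with 총인구수(명). The sketch below illustrates why raw counts track population so closely while the per-1,000 rate need not. It uses only the first ten rows shown in the Sample section, and it uses Pearson's r purely as an illustration: the report does not state which coefficient triggered the alert, so the numbers printed here are not the report's values.

```python
from math import sqrt

# First ten rows of the Sample section:
# (범죄 수(건), 총인구수(명), 인구 천명당 범죄발생 건수(건))
ROWS = [
    (341925, 9930616, 34.43),
    (133966, 3498529, 38.29),
    (86094, 2484557, 34.65),
    (103459, 2943069, 35.15),
    (54287, 1469214, 36.95),
    (49033, 1514370, 32.38),
    (40330, 1172304, 34.4),
    (4661, 243048, 19.18),
    (50728, 1194041, 42.48),
    (37638, 974580, 38.62),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


crimes = [row[0] for row in ROWS]
population = [row[1] for row in ROWS]
rate = [row[2] for row in ROWS]

r_counts = pearson(crimes, population)  # raw count vs. population
r_rate = pearson(rate, population)      # per-1,000 rate vs. population

print(f"count vs population: {r_counts:.3f}")
print(f"rate  vs population: {r_rate:.3f}")
```

Because the count is roughly the rate times the population, and the rate varies far less than the population does, the count inherits most of the population's variation.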

Reproduction

Analysis started: 2023-12-11 00:36:06.410209
Analysis finished: 2023-12-11 00:36:07.691871
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
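
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used for this report (that lives in config.json, which is not reproduced here). The function name `profile_csv`, the report title, and any file names passed to it are assumptions; `pandas` and `ydata-profiling` must be installed separately.

```python
from pathlib import Path


def profile_csv(csv_path: str, html_path: str = "report.html",
                encoding: str = "utf-8") -> Path:
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Third-party imports are deferred so that defining this function
    # does not fail on machines where the packages are missing.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset profile")  # title is arbitrary
    report.to_file(html_path)
    return Path(html_path)
```

A call might look like `profile_csv("crime_index.csv")`, where the file name is hypothetical. If the downloaded CSV is not UTF-8, a different `encoding` value (for example `"cp949"`) may be needed.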

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value | Count
2016 | 84
2017 | 84
2018 | 84
2019 | 84
2020 | 84

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value | Count | Frequency (%)
2016 | 84 | 20.0%
2017 | 84 | 20.0%
2018 | 84 | 20.0%
2019 | 84 | 20.0%
2020 | 84 | 20.0%

Length

Histogram of lengths of the category


시도명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value | Count
경기도 | 140
경상북도 | 50
경상남도 | 40
강원도 | 35
충청남도 | 35
Other values (12) | 120

Length

Max length: 7
Median length: 5
Mean length: 3.7738095
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 부산광역시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 광주광역시

Common Values

Value | Count | Frequency (%)
경기도 | 140 | 33.3%
경상북도 | 50 | 11.9%
경상남도 | 40 | 9.5%
강원도 | 35 | 8.3%
충청남도 | 35 | 8.3%
전라북도 | 30 | 7.1%
전라남도 | 25 | 6.0%
충청북도 | 15 | 3.6%
제주특별자치도 | 10 | 2.4%
부산광역시 | 5 | 1.2%
Other values (7) | 35 | 8.3%

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 84
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.2619048
Min length: 3

Characters and Unicode

Total characters: 1370
Distinct characters: 84
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 부산광역시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 광주광역시

Common Values

Value | Count | Frequency (%)
서울특별시 | 5 | 1.2%
당진시 | 5 | 1.2%
순천시 | 5 | 1.2%
여수시 | 5 | 1.2%
목포시 | 5 | 1.2%
김제시 | 5 | 1.2%
남원시 | 5 | 1.2%
정읍시 | 5 | 1.2%
익산시 | 5 | 1.2%
군산시 | 5 | 1.2%
Other values (74) | 370 | 88.1%

Most occurring characters

Value | Count | Frequency (%)
 | 425 | 31.0%
 | 85 | 6.2%
 | 65 | 4.7%
 | 55 | 4.0%
 | 50 | 3.6%
 | 35 | 2.6%
 | 30 | 2.2%
 | 30 | 2.2%
 | 25 | 1.8%
 | 20 | 1.5%
Other values (74) | 550 | 40.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter | 1370 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul | 1370 | 100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul | 1370 | 100.0%

인구 천명당 범죄발생 건수(건)
Real number (ℝ)

Distinct: 387
Distinct (%): 92.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.942786
Minimum: 7.01
Maximum: 58.42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 KiB

Quantile statistics

Minimum: 7.01
5-th percentile: 21.8115
Q1: 27.24
Median: 30.77
Q3: 35.0475
95-th percentile: 45.5935
Maximum: 58.42
Range: 51.41
Interquartile range (IQR): 7.8075

Descriptive statistics

Standard deviation: 7.2049912
Coefficient of variation (CV): 0.22555926
Kurtosis: 1.6470428
Mean: 31.942786
Median Absolute Deviation (MAD): 3.86
Skewness: 0.84252347
Sum: 13415.97
Variance: 51.911897
Monotonicity: Not monotonic
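
Several of the statistics listed above are derived from the others. The sketch below recomputes them from the reported values as a consistency check; it assumes the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), which the report itself does not spell out.

```python
from math import isclose

# Values copied from the quantile and descriptive statistics of the
# crimes-per-1,000 variable.
minimum, maximum = 7.01, 58.42
q1, q3 = 27.24, 35.0475
mean, std = 31.942786, 7.2049912

value_range = maximum - minimum  # reported Range: 51.41
iqr = q3 - q1                    # reported IQR: 7.8075
cv = std / mean                  # reported CV: 0.22555926
variance = std ** 2              # reported Variance: 51.911897

# Each recomputed value should match the report up to rounding.
checks = {
    "range": isclose(value_range, 51.41, rel_tol=1e-6),
    "iqr": isclose(iqr, 7.8075, rel_tol=1e-6),
    "cv": isclose(cv, 0.22555926, rel_tol=1e-6),
    "variance": isclose(variance, 51.911897, rel_tol=1e-6),
}
print(checks)
```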
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
27.08 | 3 | 0.7%
29.0 | 3 | 0.7%
35.43 | 2 | 0.5%
34.12 | 2 | 0.5%
35.45 | 2 | 0.5%
30.16 | 2 | 0.5%
34.83 | 2 | 0.5%
31.42 | 2 | 0.5%
30.38 | 2 | 0.5%
24.88 | 2 | 0.5%
Other values (377) | 398 | 94.8%

Minimum 10 values

Value | Count | Frequency (%)
7.01 | 1 | 0.2%
16.41 | 1 | 0.2%
17.37 | 1 | 0.2%
17.57 | 1 | 0.2%
17.9 | 1 | 0.2%
18.16 | 1 | 0.2%
19.18 | 1 | 0.2%
19.28 | 1 | 0.2%
19.31 | 1 | 0.2%
19.32 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
58.42 | 1 | 0.2%
57.5 | 1 | 0.2%
57.45 | 1 | 0.2%
56.88 | 1 | 0.2%
55.02 | 1 | 0.2%
54.51 | 1 | 0.2%
53.57 | 1 | 0.2%
52.83 | 1 | 0.2%
51.98 | 1 | 0.2%
49.51 | 1 | 0.2%

범죄 수(건)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 416
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18372.017
Minimum: 732
Maximum: 341925
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 KiB

Quantile statistics

Minimum: 732
5-th percentile: 2149
Q1: 3960.5
Median: 8209.5
Q3: 17940.5
95-th percentile: 50905.95
Maximum: 341925
Range: 341193
Interquartile range (IQR): 13980

Descriptive statistics

Standard deviation: 37787.621
Coefficient of variation (CV): 2.0568031
Kurtosis: 43.54363
Mean: 18372.017
Median Absolute Deviation (MAD): 5010.5
Skewness: 6.1131888
Sum: 7716247
Variance: 1.4279043 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1777 | 2 | 0.5%
8129 | 2 | 0.5%
3125 | 2 | 0.5%
2975 | 2 | 0.5%
341925 | 1 | 0.2%
5898 | 1 | 0.2%
9777 | 1 | 0.2%
21734 | 1 | 0.2%
11182 | 1 | 0.2%
6313 | 1 | 0.2%
Other values (406) | 406 | 96.7%

Minimum 10 values

Value | Count | Frequency (%)
732 | 1 | 0.2%
1090 | 1 | 0.2%
1162 | 1 | 0.2%
1221 | 1 | 0.2%
1233 | 1 | 0.2%
1365 | 1 | 0.2%
1481 | 1 | 0.2%
1526 | 1 | 0.2%
1633 | 1 | 0.2%
1679 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
341925 | 1 | 0.2%
319046 | 1 | 0.2%
306661 | 1 | 0.2%
305909 | 1 | 0.2%
290816 | 1 | 0.2%
133966 | 1 | 0.2%
120095 | 1 | 0.2%
119267 | 1 | 0.2%
118854 | 1 | 0.2%
115727 | 1 | 0.2%

총인구수(명)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 420
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 572177.96
Minimum: 42719
Maximum: 9930616
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 KiB

Quantile statistics

Minimum: 42719
5-th percentile: 81402.05
Q1: 135193.75
Median: 268995
Q3: 530099.5
95-th percentile: 1502834.1
Maximum: 9930616
Range: 9887897
Interquartile range (IQR): 394905.75

Descriptive statistics

Standard deviation: 1169489.7
Coefficient of variation (CV): 2.0439264
Kurtosis: 44.482369
Mean: 572177.96
Median Absolute Deviation (MAD): 160117.5
Skewness: 6.1915997
Sum: 2.4031474 × 10^8
Variance: 1.3677061 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
9930616 | 1 | 0.2%
829996 | 1 | 0.2%
281291 | 1 | 0.2%
111083 | 1 | 0.2%
148379 | 1 | 0.2%
222314 | 1 | 0.2%
372654 | 1 | 0.2%
815396 | 1 | 0.2%
437221 | 1 | 0.2%
183405 | 1 | 0.2%
Other values (410) | 410 | 97.6%

Minimum 10 values

Value | Count | Frequency (%)
42719 | 1 | 0.2%
43866 | 1 | 0.2%
44858 | 1 | 0.2%
45888 | 1 | 0.2%
47070 | 1 | 0.2%
57527 | 1 | 0.2%
58142 | 1 | 0.2%
58289 | 1 | 0.2%
63231 | 1 | 0.2%
63778 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
9930616 | 1 | 0.2%
9857426 | 1 | 0.2%
9765623 | 1 | 0.2%
9729107 | 1 | 0.2%
9668465 | 1 | 0.2%
3498529 | 1 | 0.2%
3470653 | 1 | 0.2%
3441453 | 1 | 0.2%
3413841 | 1 | 0.2%
3391946 | 1 | 0.2%

Interactions

[Figure: nine interaction plots, one for each ordered pair of the three numeric variables]

Correlations

Variable | 통계연도 | 시도명 | 시군구명 | 인구 천명당 범죄발생 건수(건) | 범죄 수(건) | 총인구수(명)
통계연도 | 1.000 | 0.000 | 0.000 | 0.321 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 1.000 | 0.553 | 0.941 | 0.974
시군구명 | 0.000 | 1.000 | 1.000 | 0.804 | 0.971 | 0.999
인구 천명당 범죄발생 건수(건) | 0.321 | 0.553 | 0.804 | 1.000 | 0.135 | 0.000
범죄 수(건) | 0.000 | 0.941 | 0.971 | 0.135 | 1.000 | 0.927
총인구수(명) | 0.000 | 0.974 | 0.999 | 0.000 | 0.927 | 1.000

Variable | 통계연도 | 시도명
통계연도 | 1.000 | 0.000
시도명 | 0.000 | 1.000

Variable | 인구 천명당 범죄발생 건수(건) | 범죄 수(건) | 총인구수(명) | 통계연도 | 시도명
인구 천명당 범죄발생 건수(건) | 1.000 | 0.325 | 0.106 | 0.138 | 0.249
범죄 수(건) | 0.325 | 1.000 | 0.968 | 0.000 | 0.792
총인구수(명) | 0.106 | 0.968 | 1.000 | 0.000 | 0.914
통계연도 | 0.138 | 0.000 | 0.000 | 1.000 | 0.000
시도명 | 0.249 | 0.792 | 0.914 | 0.000 | 1.000
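
The report does not name the association measures behind these matrices. As one possible illustration of how a value of 0.000 between 통계연도 and 시도명 can arise, the sketch below implements the uncorrected Cramér's V, a common measure for pairs of categorical variables. This is an assumed stand-in, not necessarily the computation ydata-profiling performed (its implementation may differ, for instance by applying a bias correction), and the toy data is invented for the example.

```python
from collections import Counter
from math import isclose, sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table,
    # including cells that were never observed.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy balanced panel: every region appears exactly once in every year.
years = [year for year in (2016, 2017, 2018) for _ in ("A", "B")]
regions = [region for _ in (2016, 2017, 2018) for region in ("A", "B")]

v_balanced = cramers_v(years, regions)    # no association between the two
v_identical = cramers_v(regions, regions)  # a variable against itself
print(v_balanced, v_identical)
```

In a balanced panel, knowing the year says nothing about the region, so the statistic is zero. The dataset profiled here has the same shape: each year contributes the same 84 rows' worth of regions.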

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

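First rows

Index | 통계연도 | 시도명 | 시군구명 | 인구 천명당 범죄발생 건수(건) | 범죄 수(건) | 총인구수(명)
0 | 2016 | 서울특별시 | 서울특별시 | 34.43 | 341925 | 9930616
1 | 2016 | 부산광역시 | 부산광역시 | 38.29 | 133966 | 3498529
2 | 2016 | 대구광역시 | 대구광역시 | 34.65 | 86094 | 2484557
3 | 2016 | 인천광역시 | 인천광역시 | 35.15 | 103459 | 2943069
4 | 2016 | 광주광역시 | 광주광역시 | 36.95 | 54287 | 1469214
5 | 2016 | 대전광역시 | 대전광역시 | 32.38 | 49033 | 1514370
6 | 2016 | 울산광역시 | 울산광역시 | 34.4 | 40330 | 1172304
7 | 2016 | 세종특별자치시 | 세종특별자치시 | 19.18 | 4661 | 243048
8 | 2016 | 경기도 | 수원시 | 42.48 | 50728 | 1194041
9 | 2016 | 경기도 | 성남시 | 38.62 | 37638 | 974580

Last rows

Index | 통계연도 | 시도명 | 시군구명 | 인구 천명당 범죄발생 건수(건) | 범죄 수(건) | 총인구수(명)
410 | 2020 | 경상남도 | 창원시 | 29.44 | 30522 | 1036738
411 | 2020 | 경상남도 | 진주시 | 29.15 | 10146 | 348096
412 | 2020 | 경상남도 | 통영시 | 43.43 | 5572 | 128293
413 | 2020 | 경상남도 | 사천시 | 33.58 | 3731 | 111105
414 | 2020 | 경상남도 | 김해시 | 33.2 | 18004 | 542338
415 | 2020 | 경상남도 | 밀양시 | 26.68 | 2797 | 104831
416 | 2020 | 경상남도 | 거제시 | 33.05 | 8121 | 245754
417 | 2020 | 경상남도 | 양산시 | 28.15 | 9915 | 352229
418 | 2020 | 제주특별자치도 | 제주시 | 41.59 | 20484 | 492466
419 | 2020 | 제주특별자치도 | 서귀포시 | 44.84 | 8168 | 182169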