Overview

Dataset statistics

Number of variables: 3
Number of observations: 1240
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.4 KiB
Average record size in memory: 25.1 B

Variable types

Text: 1
Categorical: 1
Numeric: 1

Dataset

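Description: Victim counts for homicide and violent crime in 2022 at the 31 police stations under the Seoul Metropolitan Police Agency, broken down by victim sex and age group. The 구분 column gives the station and crime type (살인, homicide, or 폭력, violence); the 연령 column gives the sex and age band, from male/female aged 6 or under through male/female of unknown age.
Author: Korean National Police Agency, Seoul Metropolitan Police Agency
URL: https://www.data.go.kr/data/3075830/fileData.do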

Alerts

피해자 수 (number of victims) has 602 (48.5%) zeros [Zeros]
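
The headline figures can be cross-checked with a little arithmetic. The sketch below assumes the table is a full cross of the 31 stations, the 2 crime types, and the 20 distinct 연령 values this report lists for that column; the variable names are illustrative only.

```python
# Cross-check of the headline numbers, using values copied from this report.
stations = 31            # police stations named in the description
crime_types = 2          # homicide (살인) and violence (폭력)
age_sex_categories = 20  # distinct 연령 values reported for that column

rows = stations * crime_types * age_sex_categories
zero_share = 602 / rows              # zeros flagged in the alert
bytes_per_row = 30.4 * 1024 / rows   # total size in KiB converted to bytes per record

print(rows)
print(f"{zero_share:.1%}")
print(f"{bytes_per_row:.1f} B")
```

Under that assumption the product reproduces the 1240 observations, and the two ratios round to the 48.5% and 25.1 B shown in the report.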

Reproduction

Analysis started: 2024-03-14 16:31:46.225033
Analysis finished: 2024-03-14 16:31:48.497321
Duration: 2.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
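
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2024-03-14 16:31:46.225033")
finished = datetime.fromisoformat("2024-03-14 16:31:48.497321")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```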

Variables

구분
Text

Distinct: 62
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.8 KiB

Length

Max length: 6
Median length: 5
Mean length: 5.1290323
Min length: 5

Characters and Unicode

Total characters: 6360
Distinct characters: 48
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
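
As a rough illustration of how such properties are read, the sketch below applies Python's standard `unicodedata` module to the first sample value of 구분. The Hangul range check is a hand-written stand-in (an assumption of this example), because the standard library exposes general categories but not script or block names.

```python
import unicodedata
from collections import Counter

value = "중부 살인"  # first sample row of 구분

# General category per character: "Lo" is Other Letter, "Zs" is Space Separator.
categories = Counter(unicodedata.category(ch) for ch in value)

def in_hangul_syllables_block(ch: str) -> bool:
    """True if ch lies in the Hangul Syllables block, U+AC00..U+D7AF."""
    return 0xAC00 <= ord(ch) <= 0xD7AF

hangul_count = sum(in_hangul_syllables_block(ch) for ch in value)

print(dict(categories))
print(hangul_count, "Hangul characters,", len(value) - hangul_count, "other")
```

Every value in the column has exactly one space, which is why the Space Separator, Common, and ASCII totals in the tables that follow all equal the 1240 observations.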

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중부 살인
2nd row: 중부 살인
3rd row: 중부 살인
4th row: 중부 살인
5th row: 중부 살인
Value              Count  Frequency (%)
살인               620    25.0%
폭력               620    25.0%
종암               40     1.6%
중랑               40     1.6%
수서               40     1.6%
강남               40     1.6%
관악               40     1.6%
강서               40     1.6%
강동               40     1.6%
중부               40     1.6%
Other values (23)  920    37.1%
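
The word counts above follow from the table's layout: each 구분 value is a station name and a crime type separated by a space. The sketch below rebuilds that pattern for two stations only (an illustrative subset, not the real file) and counts whitespace-separated words.

```python
from collections import Counter

crime_types = ["살인", "폭력"]
age_sex_categories = 20        # distinct 연령 values per station and crime type
stations = ["중부", "수서"]    # illustrative subset of the 31 stations

# One 구분 value per (station, crime type, 연령 category) row.
values = [
    f"{station} {crime}"
    for station in stations
    for crime in crime_types
    for _ in range(age_sex_categories)
]

words = Counter(word for value in values for word in value.split())
print(words)
```

Each station appears 2 x 20 = 40 times, matching the table. With all 31 stations, each crime type would appear 31 x 20 = 620 times.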

Most occurring characters

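Value              Count  Frequency (%)
(space)            1240   19.5%
(not preserved)    620    9.7%
(not preserved)    620    9.7%
(not preserved)    620    9.7%
(not preserved)    620    9.7%
(not preserved)    200    3.1%
(not preserved)    160    2.5%
(not preserved)    160    2.5%
(not preserved)    120    1.9%
(not preserved)    120    1.9%
Other values (38)  1880   29.6%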

Most occurring categories

Value            Count  Frequency (%)
Other Letter     5120   80.5%
Space Separator  1240   19.5%

Most frequent character per category

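Other Letter

Value              Count  Frequency (%)
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    200    3.9%
(not preserved)    160    3.1%
(not preserved)    160    3.1%
(not preserved)    120    2.3%
(not preserved)    120    2.3%
(not preserved)    80     1.6%
Other values (37)  1800   35.2%

Space Separator

Value    Count  Frequency (%)
(space)  1240   100.0%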

Most occurring scripts

Value   Count  Frequency (%)
Hangul  5120   80.5%
Common  1240   19.5%

Most frequent character per script

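Hangul

Value              Count  Frequency (%)
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    200    3.9%
(not preserved)    160    3.1%
(not preserved)    160    3.1%
(not preserved)    120    2.3%
(not preserved)    120    2.3%
(not preserved)    80     1.6%
Other values (37)  1800   35.2%

Common

Value    Count  Frequency (%)
(space)  1240   100.0%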

Most occurring blocks

Value   Count  Frequency (%)
Hangul  5120   80.5%
ASCII   1240   19.5%

Most frequent character per block

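ASCII

Value    Count  Frequency (%)
(space)  1240   100.0%

Hangul

Value              Count  Frequency (%)
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    620    12.1%
(not preserved)    200    3.9%
(not preserved)    160    3.1%
(not preserved)    160    3.1%
(not preserved)    120    2.3%
(not preserved)    120    2.3%
(not preserved)    80     1.6%
Other values (37)  1800   35.2%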

연령
Categorical

Distinct: 20
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.8 KiB

남자6세 이하: 62
남자12세이하: 62
남자15세이하: 62
남자20세이하: 62
남자30세이하: 62
Other values (15): 930

Length

Max length: 7
Median length: 7
Mean length: 6.7
Min length: 4
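
These length figures can be reproduced from the category labels themselves. The sketch below uses the ten male labels listed in this report and assumes the ten female labels follow the same pattern with 여자 in place of 남자, so the statistics over all 20 labels are unchanged.

```python
from statistics import mean, median

# The ten male 연령 labels as they appear in this report (note the space in the first).
male_labels = [
    "남자6세 이하", "남자12세이하", "남자15세이하", "남자20세이하", "남자30세이하",
    "남자40세이하", "남자50세이하", "남자60세이하", "남자60세초과", "남자미상",
]

# len() counts Unicode code points, so each Hangul syllable counts as one character.
lengths = [len(label) for label in male_labels]

print(max(lengths), median(lengths), mean(lengths), min(lengths))
```

Nine labels have 7 characters and 남자미상 has 4, giving a mean of (9 x 7 + 4) / 10 = 6.7.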

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자6세 이하
2nd row: 남자12세이하
3rd row: 남자15세이하
4th row: 남자20세이하
5th row: 남자30세이하

Common Values

Value              Count  Frequency (%)
남자6세 이하       62     5.0%
남자12세이하       62     5.0%
남자15세이하       62     5.0%
남자20세이하       62     5.0%
남자30세이하       62     5.0%
남자40세이하       62     5.0%
남자50세이하       62     5.0%
남자60세이하       62     5.0%
남자60세초과       62     5.0%
남자미상           62     5.0%
Other values (10)  620    50.0%

Length

[Figure: histogram of lengths of the category]

Value              Count  Frequency (%)
이하               124    9.1%
남자6세            62     4.5%
여자6세            62     4.5%
여자60세초과       62     4.5%
여자60세이하       62     4.5%
여자50세이하       62     4.5%
여자40세이하       62     4.5%
여자30세이하       62     4.5%
여자20세이하       62     4.5%
여자15세이하       62     4.5%
Other values (11)  682    50.0%

피해자 수
Real number (ℝ)

ZEROS 

Distinct: 200
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.172581
Minimum: 0
Maximum: 489
Zeros: 602
Zeros (%): 48.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 37.25
95-th percentile: 181
Maximum: 489
Range: 489
Interquartile range (IQR): 37.25
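
A fractional quartile such as 37.25 on integer counts is what linear interpolation between neighbouring sorted values produces. The sketch below assumes the profiler uses that rule (the default in pandas; this report does not state the method), and the `quantile` helper is written for this example rather than taken from any library.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)   # fractional index into the sorted data
    lower = int(pos)
    frac = pos - lower
    if lower + 1 >= len(sorted_values):
        return float(sorted_values[lower])
    return sorted_values[lower] + frac * (sorted_values[lower + 1] - sorted_values[lower])

toy = [1, 2, 3, 4]                # toy data, not the real column
print(quantile(toy, 0.75))        # index 2.25 lies a quarter of the way from 3 to 4

# For this report's 1240 observations, Q3 sits at this fractional index:
q3_position = 0.75 * (1240 - 1)
print(q3_position)

# Range and IQR follow from the reported quantiles.
value_range = 489 - 0
iqr = 37.25 - 0
```

With Q1 equal to 0, the IQR coincides with Q3, and with a minimum of 0 the range coincides with the maximum.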

Descriptive statistics

Standard deviation: 66.146163
Coefficient of variation (CV): 1.9356502
Kurtosis: 7.0841032
Mean: 34.172581
Median Absolute Deviation (MAD): 1
Skewness: 2.4617503
Sum: 42374
Variance: 4375.3148
Monotonicity: Not monotonic
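
Several of these statistics are tied together by simple identities, which makes a quick consistency check possible. The sketch below uses only numbers copied from this report; the tolerances allow for the rounding of the displayed values.

```python
from math import isclose

# Values copied from this report.
n = 1240
total = 42374
std = 66.146163

mean = total / n          # mean = sum / number of observations
cv = std / mean           # coefficient of variation = standard deviation / mean
variance = std ** 2       # variance = standard deviation squared

print(round(mean, 6), round(cv, 7), round(variance, 4))

consistent = (
    isclose(mean, 34.172581, abs_tol=1e-5)
    and isclose(cv, 1.9356502, abs_tol=1e-5)
    and isclose(variance, 4375.3148, abs_tol=1e-3)
)
print(consistent)
```

A CV near 1.94 together with a median of 1 and a mean of about 34 reflects the strong right skew: nearly half the cells are zero while a few reach into the hundreds.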
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                   602    48.5%
1                   132    10.6%
2                   47     3.8%
3                   14     1.1%
8                   14     1.1%
4                   13     1.0%
7                   11     0.9%
5                   11     0.9%
9                   7      0.6%
14                  7      0.6%
Other values (190)  382    30.8%

Minimum 10 values

Value  Count  Frequency (%)
0      602    48.5%
1      132    10.6%
2      47     3.8%
3      14     1.1%
4      13     1.0%
5      11     0.9%
6      3      0.2%
7      11     0.9%
8      14     1.1%
9      7      0.6%

Maximum 10 values

Value  Count  Frequency (%)
489    1      0.1%
454    1      0.1%
397    1      0.1%
396    1      0.1%
323    1      0.1%
319    1      0.1%
313    1      0.1%
312    1      0.1%
305    1      0.1%
298    1      0.1%

Interactions


Correlations

           구분   연령   피해자 수
구분       1.000  0.000  0.630
연령       0.000  1.000  0.472
피해자 수  0.630  0.472  1.000

           피해자 수  연령
피해자 수  1.000      0.206
연령       0.206      1.000
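
The 0.000 between 구분 and 연령 is what a fully crossed design would give under a chi-square based association measure such as Cramér's V. The report does not name the measure behind each matrix, so treating it as Cramér's V is an assumption here, and the function below is a plain, uncorrected textbook version written for this example, not the profiler's implementation.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy labels: every group is paired with every band exactly once (fully crossed).
groups = ["a", "b", "c"]
bands = ["x", "y"]
crossed_groups = [g for g in groups for _ in bands]
crossed_bands = [b for _ in groups for b in bands]

print(cramers_v(crossed_groups, crossed_bands))          # no association
print(cramers_v(["a", "b", "a", "b"], ["a", "b", "a", "b"]))  # perfect association
```

In a fully crossed table every observed cell count equals its expected count, so the chi-square statistic and the resulting V are both zero.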

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  구분       연령          피해자 수
0      중부 살인  남자6세 이하  0
1      중부 살인  남자12세이하  0
2      중부 살인  남자15세이하  0
3      중부 살인  남자20세이하  0
4      중부 살인  남자30세이하  0
5      중부 살인  남자40세이하  0
6      중부 살인  남자50세이하  0
7      중부 살인  남자60세이하  0
8      중부 살인  남자60세초과  0
9      중부 살인  남자미상      0

Last rows

Index  구분       연령          피해자 수
1230   수서 폭력  여자6세 이하  0
1231   수서 폭력  여자12세이하  1
1232   수서 폭력  여자15세이하  6
1233   수서 폭력  여자20세이하  20
1234   수서 폭력  여자30세이하  83
1235   수서 폭력  여자40세이하  85
1236   수서 폭력  여자50세이하  88
1237   수서 폭력  여자60세이하  68
1238   수서 폭력  여자60세초과  62
1239   수서 폭력  여자미상      1
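
Both text columns pack two attributes into one string, and the 6-and-under labels contain a space that the other age labels lack. The sketch below is one possible way to split them for analysis; the function names and the choice to strip whitespace from 연령 are assumptions of this example, not part of the dataset or the profiler.

```python
import re

# Sex prefix (남자 = male, 여자 = female) followed by the age band.
AGE_PATTERN = re.compile(r"^(남자|여자)(.+)$")

def split_group(value: str) -> tuple[str, str]:
    """Split a 구분 value such as '수서 폭력' into (station, crime type)."""
    station, crime = value.split()
    return station, crime

def split_age(value: str) -> tuple[str, str]:
    """Split a 연령 value into (sex, age band), ignoring inconsistent spaces."""
    compact = "".join(value.split())   # '여자6세 이하' becomes '여자6세이하'
    match = AGE_PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unexpected 연령 value: {value!r}")
    return match.group(1), match.group(2)

print(split_group("수서 폭력"))
print(split_age("여자6세 이하"))
print(split_age("남자미상"))
```

Splitting the columns this way would let station, crime type, sex, and age band be grouped independently.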