Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 31 |
Missing cells | 27 |
Missing cells (%) | 14.5% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 1.7 KiB |
Average record size in memory | 57.3 B |
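The missing-cell percentage and the total size follow from the other figures in this table. The sketch below is a quick arithmetic check that uses only the values quoted above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 6
n_observations = 31
missing_cells = 27
avg_record_bytes = 57.3

total_cells = n_variables * n_observations        # 6 * 31 = 186 cells
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# Approximate total size, converted from bytes to KiB (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_observations / 1024

print(f"{missing_pct:.1f}%")   # 14.5%
print(f"{total_kib:.1f} KiB")  # 1.7 KiB
```

The 27 missing cells are the 22 missing values in 강도 plus the 5 in 강간-추행 reported in the alerts.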
Variable types
Text | 1 |
---|---|
Categorical | 1 |
Numeric | 4 |
Dataset
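Description | Status of the five major crimes committed by juveniles in 2021, broken down by police station under the Seoul Metropolitan Police Agency; provides counts for murder, robbery, indecent assault, and other offenses. |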
---|---|
Author | Korean National Police Agency, Seoul Metropolitan Police Agency (경찰청 서울특별시경찰청) |
URL | https://www.data.go.kr/data/3075889/fileData.do |
Alerts
강간-추행 is highly overall correlated with 폭력 | High correlation |
---|---|
절도 is highly overall correlated with 폭력 | High correlation |
폭력 is highly overall correlated with 강간-추행 and 1 other field | High correlation |
살인 is highly imbalanced (79.4%) | Imbalance |
강도 has 22 (71.0%) missing values | Missing |
강간-추행 has 5 (16.1%) missing values | Missing |
구분 has unique values | Unique |
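The 79.4% imbalance figure for 살인 is consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an assumption made here for illustration; the report itself does not state how the score is computed. A minimal sketch, using the two category counts listed for 살인 (30 rows of `<NA>` and 1 row of `1`):

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # with a single class the normalization is undefined; report 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# 살인 has two categories: "<NA>" (30 rows) and "1" (1 row).
score = imbalance_score([30, 1])
print(f"{score:.1%}")  # 79.4%
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one value approaches 1.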
Reproduction
Analysis started | 2023-12-12 08:49:35.427610 |
---|---|
Analysis finished | 2023-12-12 08:49:38.016502 |
Duration | 2.59 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
구분 (category: police station name)
Text
UNIQUE
Distinct | 31 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 380.0 B |
Value | Count | Frequency (%) |
중부 | 1 | 3.2% |
중랑 | 1 | 3.2% |
도봉 | 1 | 3.2% |
은평 | 1 | 3.2% |
방배 | 1 | 3.2% |
노원 | 1 | 3.2% |
송파 | 1 | 3.2% |
양천 | 1 | 3.2% |
서초 | 1 | 3.2% |
구로 | 1 | 3.2% |
Other values (21) | 21 | 67.7% |
Most occurring characters
Value | Count | Frequency (%) |
서 | 5 | 7.6% |
동 | 4 | 6.1% |
강 | 4 | 6.1% |
대 | 3 | 4.5% |
문 | 3 | 4.5% |
성 | 2 | 3.0% |
천 | 2 | 3.0% |
부 | 2 | 3.0% |
포 | 2 | 3.0% |
북 | 2 | 3.0% |
Other values (33) | 37 | 56.1% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 66 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 66 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 66 | 100.0% |
Most frequent character per category, script, and block
All 66 characters fall in a single category (Other Letter), script (Hangul), and block (Hangul), so each per-group table is identical to the Most occurring characters table above.
살인 (murder)
Categorical
IMBALANCE
Distinct | 2 |
---|---|
Distinct (%) | 6.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 380.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9032258 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 3.2% |
Sample
1st row | 1 |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 30 | 96.8% |
1 | 1 | 3.2% |
강도 (robbery)
Real number (ℝ)
MISSING
Distinct | 6 |
---|---|
Distinct (%) | 66.7% |
Missing | 22 |
Missing (%) | 71.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.4444444 |
Minimum | 1 |
---|---|
Maximum | 8 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 2 |
Q3 | 5 |
95-th percentile | 7.6 |
Maximum | 8 |
Range | 7 |
Interquartile range (IQR) | 3 |
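The nine non-missing 강도 values can be recovered from the frequency table for this variable: 2 occurs three times, 1 twice, and 3, 5, 7, and 8 once each. Assuming percentiles are taken by linear interpolation between order statistics (an assumption, though it reproduces every figure above), the quantiles can be sketched as follows. The helper `percentile` and the name `robbery` are illustrative.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)              # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)  # clamp at the last element
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Non-missing 강도 values, reconstructed from the value counts in this report.
robbery = [1, 1, 2, 2, 2, 3, 5, 7, 8]

q1 = percentile(robbery, 0.25)
q3 = percentile(robbery, 0.75)
iqr = q3 - q1
p95 = percentile(robbery, 0.95)
print(q1, q3, iqr, round(p95, 2))  # 2.0 5.0 3.0 7.6
```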
Descriptive statistics
Standard deviation | 2.6034166 |
---|---|
Coefficient of variation (CV) | 0.75583061 |
Kurtosis | -0.60139363 |
Mean | 3.4444444 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 0.95555393 |
Sum | 31 |
Variance | 6.7777778 |
Monotonicity | Not monotonic |
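Several of the descriptive statistics above can be checked with Python's standard `statistics` module, again using the nine 강도 values reconstructed from the value counts in this report. The sketch assumes the sample (n - 1) variance, which matches the reported 6.7777778; skewness and kurtosis are left out because their bias corrections are not stated in the report.

```python
import statistics

# Non-missing 강도 values, reconstructed from the value counts in this report.
robbery = [1, 1, 2, 2, 2, 3, 5, 7, 8]

mean = statistics.mean(robbery)
variance = statistics.variance(robbery)   # sample variance, n - 1 denominator
std = statistics.stdev(robbery)           # square root of the sample variance
cv = std / mean                           # coefficient of variation
median = statistics.median(robbery)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(x - median) for x in robbery)

print(sum(robbery), round(mean, 7), round(variance, 7), round(std, 7), mad)
```

Because the coefficient of variation is a ratio of standard deviation to mean, it allows the spread of 강도 to be compared with that of the much larger 절도 and 폭력 counts.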
Common values
Value | Count | Frequency (%) |
2 | 3 | 9.7% |
1 | 2 | 6.5% |
7 | 1 | 3.2% |
5 | 1 | 3.2% |
3 | 1 | 3.2% |
8 | 1 | 3.2% |
(Missing) | 22 | 71.0% |
Minimum values
Value | Count | Frequency (%) |
1 | 2 | 6.5% |
2 | 3 | 9.7% |
3 | 1 | 3.2% |
5 | 1 | 3.2% |
7 | 1 | 3.2% |
8 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
8 | 1 | 3.2% |
7 | 1 | 3.2% |
5 | 1 | 3.2% |
3 | 1 | 3.2% |
2 | 3 | 9.7% |
1 | 2 | 6.5% |
강간-추행 (rape and indecent assault)
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 14 |
---|---|
Distinct (%) | 53.8% |
Missing | 5 |
Missing (%) | 16.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.5769231 |
Minimum | 1 |
---|---|
Maximum | 15 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1.25 |
Q1 | 3.25 |
median | 6 |
Q3 | 9 |
95-th percentile | 13.5 |
Maximum | 15 |
Range | 14 |
Interquartile range (IQR) | 5.75 |
Descriptive statistics
Standard deviation | 3.8902244 |
---|---|
Coefficient of variation (CV) | 0.59149611 |
Kurtosis | -0.44779632 |
Mean | 6.5769231 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.51903492 |
Sum | 171 |
Variance | 15.133846 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
6 | 4 | 12.9% |
3 | 3 | 9.7% |
5 | 3 | 9.7% |
1 | 2 | 6.5% |
7 | 2 | 6.5% |
11 | 2 | 6.5% |
2 | 2 | 6.5% |
9 | 2 | 6.5% |
4 | 1 | 3.2% |
8 | 1 | 3.2% |
Other values (4) | 4 | 12.9% |
(Missing) | 5 | 16.1% |
Minimum values
Value | Count | Frequency (%) |
1 | 2 | 6.5% |
2 | 2 | 6.5% |
3 | 3 | 9.7% |
4 | 1 | 3.2% |
5 | 3 | 9.7% |
6 | 4 | 12.9% |
7 | 2 | 6.5% |
8 | 1 | 3.2% |
9 | 2 | 6.5% |
10 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
15 | 1 | 3.2% |
14 | 1 | 3.2% |
12 | 1 | 3.2% |
11 | 2 | 6.5% |
10 | 1 | 3.2% |
9 | 2 | 6.5% |
8 | 1 | 3.2% |
7 | 2 | 6.5% |
6 | 4 | 12.9% |
5 | 3 | 9.7% |
절도 (theft)
Real number (ℝ)
HIGH CORRELATION
Distinct | 25 |
---|---|
Distinct (%) | 80.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 53.83871 |
Minimum | 4 |
---|---|
Maximum | 134 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 4 |
---|---|
5-th percentile | 11.5 |
Q1 | 36.5 |
median | 52 |
Q3 | 65 |
95-th percentile | 111 |
Maximum | 134 |
Range | 130 |
Interquartile range (IQR) | 28.5 |
Descriptive statistics
Standard deviation | 30.21379 |
---|---|
Coefficient of variation (CV) | 0.56119083 |
Kurtosis | 0.91403273 |
Mean | 53.83871 |
Median Absolute Deviation (MAD) | 15 |
Skewness | 0.70357378 |
Sum | 1669 |
Variance | 912.87312 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
48 | 3 | 9.7% |
62 | 3 | 9.7% |
13 | 2 | 6.5% |
60 | 2 | 6.5% |
37 | 1 | 3.2% |
39 | 1 | 3.2% |
101 | 1 | 3.2% |
88 | 1 | 3.2% |
20 | 1 | 3.2% |
121 | 1 | 3.2% |
Other values (15) | 15 | 48.4% |
Minimum values
Value | Count | Frequency (%) |
4 | 1 | 3.2% |
10 | 1 | 3.2% |
13 | 2 | 6.5% |
20 | 1 | 3.2% |
22 | 1 | 3.2% |
30 | 1 | 3.2% |
36 | 1 | 3.2% |
37 | 1 | 3.2% |
39 | 1 | 3.2% |
45 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
134 | 1 | 3.2% |
121 | 1 | 3.2% |
101 | 1 | 3.2% |
88 | 1 | 3.2% |
77 | 1 | 3.2% |
72 | 1 | 3.2% |
67 | 1 | 3.2% |
66 | 1 | 3.2% |
64 | 1 | 3.2% |
62 | 3 | 9.7% |
폭력 (violence)
Real number (ℝ)
HIGH CORRELATION
Distinct | 28 |
---|---|
Distinct (%) | 90.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 65.935484 |
Minimum | 2 |
---|---|
Maximum | 147 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 9 |
Q1 | 30 |
median | 70 |
Q3 | 94.5 |
95-th percentile | 124.5 |
Maximum | 147 |
Range | 145 |
Interquartile range (IQR) | 64.5 |
Descriptive statistics
Standard deviation | 38.746127 |
---|---|
Coefficient of variation (CV) | 0.58763696 |
Kurtosis | -0.61133482 |
Mean | 65.935484 |
Median Absolute Deviation (MAD) | 27 |
Skewness | 0.10805597 |
Sum | 2044 |
Variance | 1501.2624 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
102 | 2 | 6.5% |
20 | 2 | 6.5% |
94 | 2 | 6.5% |
24 | 1 | 3.2% |
147 | 1 | 3.2% |
39 | 1 | 3.2% |
90 | 1 | 3.2% |
18 | 1 | 3.2% |
97 | 1 | 3.2% |
74 | 1 | 3.2% |
Other values (18) | 18 | 58.1% |
Minimum values
Value | Count | Frequency (%) |
2 | 1 | 3.2% |
4 | 1 | 3.2% |
14 | 1 | 3.2% |
18 | 1 | 3.2% |
20 | 2 | 6.5% |
24 | 1 | 3.2% |
29 | 1 | 3.2% |
31 | 1 | 3.2% |
39 | 1 | 3.2% |
51 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
147 | 1 | 3.2% |
144 | 1 | 3.2% |
105 | 1 | 3.2% |
102 | 2 | 6.5% |
97 | 1 | 3.2% |
96 | 1 | 3.2% |
95 | 1 | 3.2% |
94 | 2 | 6.5% |
90 | 1 | 3.2% |
85 | 1 | 3.2% |
Correlations
Variable | 구분 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|
구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
강도 | 1.000 | 1.000 | 0.839 | 0.457 | 0.573 |
강간-추행 | 1.000 | 0.839 | 1.000 | 0.819 | 0.296 |
절도 | 1.000 | 0.457 | 0.819 | 1.000 | 0.699 |
폭력 | 1.000 | 0.573 | 0.296 | 0.699 | 1.000 |
Variable | 강도 | 강간-추행 | 절도 | 폭력 | 살인 |
---|---|---|---|---|---|
강도 | 1.000 | -0.057 | 0.462 | 0.366 | NaN |
강간-추행 | -0.057 | 1.000 | 0.444 | 0.702 | 0.000 |
절도 | 0.462 | 0.444 | 1.000 | 0.688 | NaN |
폭력 | 0.366 | 0.702 | 0.688 | 1.000 | NaN |
살인 | NaN | 0.000 | NaN | NaN | 1.000 |
First rows
Row | 구분 | 살인 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|---|
0 | 중부 | 1 | 1 | <NA> | 37 | 29 |
1 | 종로 | <NA> | <NA> | 4 | 4 | 14 |
2 | 남대문 | <NA> | <NA> | 1 | 10 | 2 |
3 | 서대문 | <NA> | <NA> | 6 | 64 | 66 |
4 | 혜화 | <NA> | <NA> | <NA> | 13 | 4 |
5 | 용산 | <NA> | <NA> | <NA> | 52 | 20 |
6 | 성북 | <NA> | <NA> | 3 | 22 | 69 |
7 | 동대문 | <NA> | 7 | 7 | 30 | 70 |
8 | 마포 | <NA> | 2 | 11 | 72 | 85 |
9 | 영등포 | <NA> | 5 | 2 | 62 | 52 |
Last rows
Row | 구분 | 살인 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|---|
21 | 종암 | <NA> | 1 | <NA> | 20 | 31 |
22 | 구로 | <NA> | <NA> | 9 | 48 | 95 |
23 | 서초 | <NA> | 2 | 3 | 60 | 74 |
24 | 양천 | <NA> | <NA> | 9 | 88 | 94 |
25 | 송파 | <NA> | <NA> | 14 | 101 | 97 |
26 | 노원 | <NA> | <NA> | 15 | 62 | 102 |
27 | 방배 | <NA> | <NA> | 1 | 13 | 18 |
28 | 은평 | <NA> | <NA> | 6 | 62 | 90 |
29 | 도봉 | <NA> | <NA> | 6 | 39 | 39 |
30 | 수서 | <NA> | <NA> | 11 | 48 | 147 |
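The correlation matrices in this report do not state which coefficient they use. As an illustration only, the sketch below computes a Spearman rank correlation between 절도 and 폭력 from the 20 sample rows shown above (rows 0 to 9 and 21 to 30). Because 11 of the 31 rows are not shown, the result is not expected to match the reported matrices. The helper names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# 절도 and 폭력 from the 20 sample rows shown in this report.
theft = [37, 4, 10, 64, 13, 52, 22, 30, 72, 62,
         20, 48, 60, 88, 101, 62, 13, 62, 39, 48]
violence = [29, 14, 2, 66, 4, 20, 69, 70, 85, 52,
            31, 95, 74, 94, 97, 102, 18, 90, 39, 147]

rho = spearman(theft, violence)
print(round(rho, 3))
```

Ranking before correlating makes the measure less sensitive to extreme counts such as the 147 recorded for 수서.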