Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 31 |
Missing cells | 22 |
Missing cells (%) | 11.8% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 1.7 KiB |
Average record size in memory | 57.3 B |
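The percentages and sizes in this table follow from the raw counts, so they can be cross-checked with a few lines of arithmetic. The sketch below only reuses figures from the table and assumes 1 KiB = 1024 B; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 31
missing_cells = 22
avg_record_bytes = 57.3

total_cells = n_variables * n_observations        # every variable/observation pair
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
total_kib = avg_record_bytes * n_observations / 1024

print(f"Total cells:   {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size:    {total_kib:.1f} KiB")
```

Both derived values round to the figures the report shows (11.8% and 1.7 KiB).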
Variable types
Text | 1 |
---|---|
Categorical | 1 |
Numeric | 4 |
Dataset
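Description | Arrest statistics by offense type for the five major juvenile crimes committed in Seoul in 2022, maintained by the Seoul Metropolitan Police Agency and broken down by each of the 31 police stations. |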
---|---|
URL | https://www.data.go.kr/data/15114278/fileData.do |
Alerts
강간-추행 is highly overall correlated with 절도 and 1 other field | High correlation |
절도 is highly overall correlated with 강간-추행 and 1 other field | High correlation |
폭력 is highly overall correlated with 강간-추행 and 1 other field | High correlation |
살인 is highly imbalanced (79.4%) | Imbalance |
강도 has 19 (61.3%) missing values | Missing |
강간-추행 has 2 (6.5%) missing values | Missing |
폭력 has 1 (3.2%) missing values | Missing |
구분 has unique values | Unique |
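The 79.4% imbalance reported for 살인 can be reproduced as one minus the normalised Shannon entropy of the class counts (30 rows of `<NA>`, 1 row of `1`). The function below is a minimal sketch of that calculation, not code taken from the profiling library; `imbalance_score` is a hypothetical name, and treating a single-class column as 0 is an assumption.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the counts."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as not imbalanced
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 살인: 30 rows of <NA> and 1 row with the value 1.
print(f"{100 * imbalance_score([30, 1]):.1f}%")
```

A perfectly balanced column scores 0, and the score approaches 1 as one class dominates.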
Reproduction
Analysis started | 2023-12-12 06:49:04.456341 |
---|---|
Analysis finished | 2023-12-12 06:49:06.608111 |
Duration | 2.15 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
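A report like this one could be regenerated roughly as follows. This is a sketch under assumptions: the CSV is taken to be a local download of the file at the URL above, the `cp949` encoding is a guess that is common for Korean public-data files, and the function name, output path, and title are hypothetical. The original run also used a `config.json` that is not reproduced here.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so the function can be defined without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="Seoul juvenile crime arrests, 2022")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("arrests_2022.csv")` with a hypothetical file name would write `report.html` to the working directory.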
구분
Text
UNIQUE
 
Distinct | 31 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 380.0 B |
Value | Count | Frequency (%) |
중부 | 1 | 3.2% |
중랑 | 1 | 3.2% |
도봉 | 1 | 3.2% |
은평 | 1 | 3.2% |
방배 | 1 | 3.2% |
노원 | 1 | 3.2% |
송파 | 1 | 3.2% |
양천 | 1 | 3.2% |
서초 | 1 | 3.2% |
구로 | 1 | 3.2% |
Other values (21) | 21 | 67.7% |
Most occurring characters
Value | Count | Frequency (%) |
서 | 5 | 7.6% |
동 | 4 | 6.1% |
강 | 4 | 6.1% |
대 | 3 | 4.5% |
문 | 3 | 4.5% |
성 | 2 | 3.0% |
천 | 2 | 3.0% |
부 | 2 | 3.0% |
포 | 2 | 3.0% |
북 | 2 | 3.0% |
Other values (33) | 37 | 56.1% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 66 | 100.0% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
서 | 5 | 7.6% |
동 | 4 | 6.1% |
강 | 4 | 6.1% |
대 | 3 | 4.5% |
문 | 3 | 4.5% |
성 | 2 | 3.0% |
천 | 2 | 3.0% |
부 | 2 | 3.0% |
포 | 2 | 3.0% |
북 | 2 | 3.0% |
Other values (33) | 37 | 56.1% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 66 | 100.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
서 | 5 | 7.6% |
동 | 4 | 6.1% |
강 | 4 | 6.1% |
대 | 3 | 4.5% |
문 | 3 | 4.5% |
성 | 2 | 3.0% |
천 | 2 | 3.0% |
부 | 2 | 3.0% |
포 | 2 | 3.0% |
북 | 2 | 3.0% |
Other values (33) | 37 | 56.1% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 66 | 100.0% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
서 | 5 | 7.6% |
동 | 4 | 6.1% |
강 | 4 | 6.1% |
대 | 3 | 4.5% |
문 | 3 | 4.5% |
성 | 2 | 3.0% |
천 | 2 | 3.0% |
부 | 2 | 3.0% |
포 | 2 | 3.0% |
북 | 2 | 3.0% |
Other values (33) | 37 | 56.1% |
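The character, category, and block breakdowns above can be illustrated with the Python standard library. The sketch below uses only five of the station names from the value table, so its counts will not match the report's totals over all 31 names. The standard library has no script property, so the Hangul Syllables code-point range (U+AC00 to U+D7A3) stands in for the script and block checks.

```python
import unicodedata
from collections import Counter

# A few station names from the value table (not the full list of 31).
names = ["중부", "중랑", "도봉", "은평", "방배"]

# Character frequencies across all names.
chars = Counter("".join(names))

# Unicode general category per character occurrence; "Lo" means "Letter, other",
# which the report displays as "Other Letter".
categories = Counter(unicodedata.category(ch) for ch in chars.elements())

# Block check: every character lies in the Hangul Syllables range.
all_hangul = all(0xAC00 <= ord(ch) <= 0xD7A3 for ch in chars)

print(chars.most_common(3))
print(dict(categories))
print(all_hangul)
```

Because every character falls in a single category and a single block, those tables in the report each collapse to one row.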
살인
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 6.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 380.0 B |
<NA> | 30 |
---|---|
1 | 1 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9032258 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 3.2% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 30 | 96.8% |
1 | 1 | 3.2% |
강도
Real number (ℝ)
MISSING
 
Distinct | 6 |
---|---|
Distinct (%) | 50.0% |
Missing | 19 |
Missing (%) | 61.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.4166667 |
Minimum | 1 |
---|---|
Maximum | 8 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1.75 |
median | 2.5 |
Q3 | 4.5 |
95-th percentile | 8 |
Maximum | 8 |
Range | 7 |
Interquartile range (IQR) | 2.75 |
Descriptive statistics
Standard deviation | 2.5746433 |
---|---|
Coefficient of variation (CV) | 0.75355412 |
Kurtosis | -0.29767993 |
Mean | 3.4166667 |
Median Absolute Deviation (MAD) | 1.5 |
Skewness | 1.0162407 |
Sum | 41 |
Variance | 6.6287879 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
2 | 3 | 9.7% |
1 | 3 | 9.7% |
3 | 2 | 6.5% |
8 | 2 | 6.5% |
4 | 1 | 3.2% |
6 | 1 | 3.2% |
(Missing) | 19 | 61.3% |
Minimum values
Value | Count | Frequency (%) |
1 | 3 | 9.7% |
2 | 3 | 9.7% |
3 | 2 | 6.5% |
4 | 1 | 3.2% |
6 | 1 | 3.2% |
8 | 2 | 6.5% |
Maximum values
Value | Count | Frequency (%) |
8 | 2 | 6.5% |
6 | 1 | 3.2% |
4 | 1 | 3.2% |
3 | 2 | 6.5% |
2 | 3 | 9.7% |
1 | 3 | 9.7% |
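The quantile and descriptive statistics for 강도 can be recomputed from its value counts (1×3, 2×3, 3×2, 4×1, 6×1, 8×2) alone. The sketch below uses only the standard library. It assumes the report uses the sample variance (n − 1 denominator) and linearly interpolated quartiles, which is what `statistics.quantiles(..., method="inclusive")` computes; both assumptions are consistent with the reported figures.

```python
import statistics

# Value counts for 강도 as reported for this variable: {value: count}.
value_counts = {1: 3, 2: 3, 3: 2, 4: 1, 6: 1, 8: 2}
values = sorted(v for v, c in value_counts.items() for _ in range(c))

n = len(values)                          # non-missing observations (31 - 19)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
mad = statistics.median(abs(v - median) for v in values)

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={q3 - q1} MAD={mad}")
```

The same approach applies to the other numeric columns once their full value counts are available; the report truncates those tables to the ten most common values.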
강간-추행
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 13 |
---|---|
Distinct (%) | 44.8% |
Missing | 2 |
Missing (%) | 6.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.3448276 |
Minimum | 1 |
---|---|
Maximum | 15 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 4 |
median | 6 |
Q3 | 9 |
95-th percentile | 12.2 |
Maximum | 15 |
Range | 14 |
Interquartile range (IQR) | 5 |
Descriptive statistics
Standard deviation | 3.7059398 |
---|---|
Coefficient of variation (CV) | 0.58408835 |
Kurtosis | -0.42043917 |
Mean | 6.3448276 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.35347023 |
Sum | 184 |
Variance | 13.73399 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
2 | 3 | 9.7% |
1 | 3 | 9.7% |
10 | 3 | 9.7% |
4 | 3 | 9.7% |
7 | 3 | 9.7% |
9 | 3 | 9.7% |
5 | 3 | 9.7% |
8 | 2 | 6.5% |
6 | 2 | 6.5% |
3 | 1 | 3.2% |
Other values (3) | 3 | 9.7% |
(Missing) | 2 | 6.5% |
Minimum values
Value | Count | Frequency (%) |
1 | 3 | 9.7% |
2 | 3 | 9.7% |
3 | 1 | 3.2% |
4 | 3 | 9.7% |
5 | 3 | 9.7% |
6 | 2 | 6.5% |
7 | 3 | 9.7% |
8 | 2 | 6.5% |
9 | 3 | 9.7% |
10 | 3 | 9.7% |
Maximum values
Value | Count | Frequency (%) |
15 | 1 | 3.2% |
13 | 1 | 3.2% |
11 | 1 | 3.2% |
10 | 3 | 9.7% |
9 | 3 | 9.7% |
8 | 2 | 6.5% |
7 | 3 | 9.7% |
6 | 2 | 6.5% |
5 | 3 | 9.7% |
4 | 3 | 9.7% |
절도
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 27 |
---|---|
Distinct (%) | 87.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 60.548387 |
Minimum | 8 |
---|---|
Maximum | 132 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 8 |
---|---|
5-th percentile | 12.5 |
Q1 | 26 |
median | 58 |
Q3 | 95 |
95-th percentile | 125.5 |
Maximum | 132 |
Range | 124 |
Interquartile range (IQR) | 69 |
Descriptive statistics
Standard deviation | 37.596843 |
---|---|
Coefficient of variation (CV) | 0.6209388 |
Kurtosis | -1.0669205 |
Mean | 60.548387 |
Median Absolute Deviation (MAD) | 32 |
Skewness | 0.40547893 |
Sum | 1877 |
Variance | 1413.5226 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
26 | 3 | 9.7% |
34 | 2 | 6.5% |
95 | 2 | 6.5% |
19 | 1 | 3.2% |
128 | 1 | 3.2% |
105 | 1 | 3.2% |
98 | 1 | 3.2% |
44 | 1 | 3.2% |
16 | 1 | 3.2% |
111 | 1 | 3.2% |
Other values (17) | 17 | 54.8% |
Minimum values
Value | Count | Frequency (%) |
8 | 1 | 3.2% |
9 | 1 | 3.2% |
16 | 1 | 3.2% |
19 | 1 | 3.2% |
21 | 1 | 3.2% |
25 | 1 | 3.2% |
26 | 3 | 9.7% |
34 | 2 | 6.5% |
39 | 1 | 3.2% |
43 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
132 | 1 | 3.2% |
128 | 1 | 3.2% |
123 | 1 | 3.2% |
111 | 1 | 3.2% |
105 | 1 | 3.2% |
98 | 1 | 3.2% |
97 | 1 | 3.2% |
95 | 2 | 6.5% |
87 | 1 | 3.2% |
76 | 1 | 3.2% |
폭력
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 27 |
---|---|
Distinct (%) | 90.0% |
Missing | 1 |
Missing (%) | 3.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 75.066667 |
Minimum | 7 |
---|---|
Maximum | 207 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 411.0 B |
Quantile statistics
Minimum | 7 |
---|---|
5-th percentile | 17.45 |
Q1 | 34.5 |
median | 65.5 |
Q3 | 102.25 |
95-th percentile | 139.3 |
Maximum | 207 |
Range | 200 |
Interquartile range (IQR) | 67.75 |
Descriptive statistics
Standard deviation | 46.056886 |
---|---|
Coefficient of variation (CV) | 0.61354644 |
Kurtosis | 0.68722893 |
Mean | 75.066667 |
Median Absolute Deviation (MAD) | 36 |
Skewness | 0.75061351 |
Sum | 2252 |
Variance | 2121.2368 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
65 | 3 | 9.7% |
27 | 2 | 6.5% |
68 | 1 | 3.2% |
66 | 1 | 3.2% |
115 | 1 | 3.2% |
17 | 1 | 3.2% |
207 | 1 | 3.2% |
121 | 1 | 3.2% |
91 | 1 | 3.2% |
142 | 1 | 3.2% |
Other values (17) | 17 | 54.8% |
Minimum values
Value | Count | Frequency (%) |
7 | 1 | 3.2% |
17 | 1 | 3.2% |
18 | 1 | 3.2% |
26 | 1 | 3.2% |
27 | 2 | 6.5% |
28 | 1 | 3.2% |
31 | 1 | 3.2% |
45 | 1 | 3.2% |
51 | 1 | 3.2% |
52 | 1 | 3.2% |
Maximum values
Value | Count | Frequency (%) |
207 | 1 | 3.2% |
142 | 1 | 3.2% |
136 | 1 | 3.2% |
129 | 1 | 3.2% |
121 | 1 | 3.2% |
117 | 1 | 3.2% |
115 | 1 | 3.2% |
103 | 1 | 3.2% |
100 | 1 | 3.2% |
99 | 1 | 3.2% |
Correlations
Variable | 구분 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|
구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
강도 | 1.000 | 1.000 | 0.777 | 0.768 | 0.930 |
강간-추행 | 1.000 | 0.777 | 1.000 | 0.852 | 0.678 |
절도 | 1.000 | 0.768 | 0.852 | 1.000 | 0.744 |
폭력 | 1.000 | 0.930 | 0.678 | 0.744 | 1.000 |
Variable | 강도 | 강간-추행 | 절도 | 폭력 | 살인 |
---|---|---|---|---|---|
강도 | 1.000 | 0.028 | 0.235 | 0.399 | 0.000 |
강간-추행 | 0.028 | 1.000 | 0.660 | 0.654 | NaN |
절도 | 0.235 | 0.660 | 1.000 | 0.747 | NaN |
폭력 | 0.399 | 0.654 | 0.747 | 1.000 | NaN |
살인 | 0.000 | NaN | NaN | NaN | 1.000 |
First rows
Index | 구분 | 살인 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|---|
0 | 중부 | <NA> | <NA> | 2 | 19 | 31 |
1 | 종로 | <NA> | 4 | <NA> | 21 | 18 |
2 | 남대문 | <NA> | <NA> | 1 | 9 | <NA> |
3 | 서대문 | <NA> | 2 | 10 | 47 | 88 |
4 | 혜화 | <NA> | <NA> | 1 | 8 | 7 |
5 | 용산 | <NA> | <NA> | 4 | 26 | 26 |
6 | 성북 | <NA> | <NA> | 8 | 34 | 51 |
7 | 동대문 | <NA> | 1 | 7 | 43 | 52 |
8 | 마포 | <NA> | <NA> | 9 | 95 | 99 |
9 | 영등포 | <NA> | 6 | 3 | 70 | 84 |
Last rows
Index | 구분 | 살인 | 강도 | 강간-추행 | 절도 | 폭력 |
---|---|---|---|---|---|---|
21 | 종암 | <NA> | <NA> | 5 | 25 | 27 |
22 | 구로 | <NA> | 8 | 13 | 64 | 142 |
23 | 서초 | <NA> | 1 | 7 | 97 | 65 |
24 | 양천 | <NA> | <NA> | 6 | 87 | 91 |
25 | 송파 | <NA> | 8 | 8 | 123 | 121 |
26 | 노원 | <NA> | 2 | 15 | 111 | 207 |
27 | 방배 | <NA> | <NA> | 2 | 16 | 17 |
28 | 은평 | <NA> | <NA> | 5 | 44 | 115 |
29 | 도봉 | <NA> | <NA> | 9 | 98 | 65 |
30 | 수서 | <NA> | <NA> | <NA> | 105 | 66 |