Overview

Dataset statistics

Number of variables: 7
Number of observations: 213
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.8 KiB
Average record size in memory: 56.6 B

Variable types

Text: 4
Categorical: 3

Dataset

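Description: Crime occurrence and arrest status (totals, district prosecutors' offices, regional police agencies, etc.) [범죄의 발생 검거상황(총계, 지검, 지경 등)]
Author: Supreme Prosecutors' Office (대검찰청)
URL: https://www.data.go.kr/data/2200084/fileData.do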

Alerts

검거인원 여 (arrested persons, female) is highly overall correlated with 검거인원 미상 (arrested persons, sex unknown) and 1 other field: High correlation
검거인원 미상 is highly overall correlated with 검거인원 여 and 1 other field: High correlation
법인 (corporations) is highly overall correlated with 검거인원 여 and 1 other field: High correlation
검거인원 여 is highly imbalanced (52.5%): Imbalance
검거인원 미상 is highly imbalanced (63.7%): Imbalance
법인 is highly imbalanced (62.0%): Imbalance
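
The imbalance percentages in the alerts above can be reproduced from the category counts listed in the per-variable sections of this report. The sketch below assumes the score is one minus the Shannon entropy of the category frequencies divided by log2 of the number of distinct categories. The report does not state this formula, so treat it as an assumption; it does yield 52.5%, 63.7%, and 62.0% for the three flagged columns. The name `imbalance_score` and the count lists are illustrative.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of
    the category frequencies and k is the number of distinct categories.
    0 means perfectly balanced; values near 1 mean one category dominates."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(k)


# Counts read from the Common Values tables of this report. Values that the
# tables lump into "Other values" are filled in from the Unique figures
# (a unique value occurs exactly once).
columns = {
    # 142, 9, 7, 6, 3, 3, seven values seen twice, 29 unique -> 42 distinct
    "검거인원 여": [142, 9, 7, 6, 3, 3] + [2] * 7 + [1] * 29,
    # 164, 17, 5, 4, 4, 3, 3, 13 unique -> 20 distinct
    "검거인원 미상": [164, 17, 5, 4, 4, 3, 3] + [1] * 13,
    # 159, 12, 5, 2, 35 unique -> 39 distinct
    "법인": [159, 12, 5, 2] + [1] * 35,
}

scores = {name: imbalance_score(counts) for name, counts in columns.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

Each count list sums to the 213 observations, which is a quick check that the reconstruction from the tables is complete.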

Reproduction

Analysis started: 2023-12-12 18:08:45.909821
Analysis finished: 2023-12-12 18:08:46.821499
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
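
As a small consistency check, the reported duration follows from the two timestamps above. This is only a sketch of the arithmetic using the standard library; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-12 18:08:45.909821", FMT)
finished = datetime.strptime("2023-12-12 18:08:46.821499", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```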

Variables

2017년 (year 2017; holds the offence or statute name)
Text

Distinct: 211
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 32
Median length: 21
Mean length: 7.4319249
Min length: 2

Characters and Unicode

Total characters: 1583
Distinct characters: 245
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 209
Unique (%): 98.1%
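
The percentages and the mean length above are simple ratios against the 213 observations. A minimal sketch of that arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
n_observations = 213     # Number of observations (Overview)
total_characters = 1583  # Total characters for this variable
n_distinct = 211         # Distinct values
n_unique = 209           # Values that occur exactly once

mean_length = total_characters / n_observations
distinct_pct = 100 * n_distinct / n_observations
unique_pct = 100 * n_unique / n_observations

print(f"Mean length:  {mean_length:.7f}")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
```

The gap between 211 distinct and 209 unique values corresponds to the two values that occur twice each (209 + 2 * 2 = 213).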

Sample

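1st row: 절도 (theft)
2nd row: 불법사용 (unlawful use)
3rd row: 침입절도 (theft by breaking and entering)
4th row: 장물 (stolen property)
5th row: 사기 (fraud)

Value | Count | Frequency (%)
공갈 (extortion) | 2 | 0.9%
협박 (intimidation) | 2 | 0.9%
경범죄처벌법 (Punishment of Minor Offenses Act) | 1 | 0.5%
저작권법 (Copyright Act) | 1 | 0.5%
선박직원법 (Ship Personnel Act) | 1 | 0.5%
변호사법 (Attorney-at-Law Act) | 1 | 0.5%
절도 (theft) | 1 | 0.5%
보조금관리에관한법률 (Subsidy Management Act) | 1 | 0.5%
도시및주거환경정비법 (Act on the Improvement of Urban Areas and Residential Environments) | 1 | 0.5%
독점규제및공정거래에관한법률 (Monopoly Regulation and Fair Trade Act) | 1 | 0.5%
Other values (204) | 204 | 94.4%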

Most occurring characters

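Value | Count | Frequency (%)
(character lost) | 117 | 7.4%
(character lost) | 69 | 4.4%
(character lost) | 43 | 2.7%
(character lost) | 42 | 2.7%
(character lost) | 35 | 2.2%
(character lost) | 32 | 2.0%
(character lost) | 31 | 2.0%
(character lost) | 28 | 1.8%
(character lost) | 28 | 1.8%
(character lost) | 27 | 1.7%
Other values (235) | 1131 | 71.4%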

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1536 | 97.0%
Other Punctuation | 18 | 1.1%
Open Punctuation | 13 | 0.8%
Close Punctuation | 13 | 0.8%
Space Separator | 3 | 0.2%

Most frequent character per category

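Other Letter
Value | Count | Frequency (%)
(character lost) | 117 | 7.6%
(character lost) | 69 | 4.5%
(character lost) | 43 | 2.8%
(character lost) | 42 | 2.7%
(character lost) | 35 | 2.3%
(character lost) | 32 | 2.1%
(character lost) | 31 | 2.0%
(character lost) | 28 | 1.8%
(character lost) | 28 | 1.8%
(character lost) | 27 | 1.8%
Other values (229) | 1084 | 70.6%

Other Punctuation
Value | Count | Frequency (%)
, | 10 | 55.6%
· | 5 | 27.8%
/ | 3 | 16.7%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%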

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1536 | 97.0%
Common | 47 | 3.0%

Most frequent character per script

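Hangul
Value | Count | Frequency (%)
(character lost) | 117 | 7.6%
(character lost) | 69 | 4.5%
(character lost) | 43 | 2.8%
(character lost) | 42 | 2.7%
(character lost) | 35 | 2.3%
(character lost) | 32 | 2.1%
(character lost) | 31 | 2.0%
(character lost) | 28 | 1.8%
(character lost) | 28 | 1.8%
(character lost) | 27 | 1.8%
Other values (229) | 1084 | 70.6%

Common
Value | Count | Frequency (%)
( | 13 | 27.7%
) | 13 | 27.7%
, | 10 | 21.3%
· | 5 | 10.6%
/ | 3 | 6.4%
(space) | 3 | 6.4%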

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1536 | 97.0%
ASCII | 42 | 2.7%
None | 5 | 0.3%

Most frequent character per block

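Hangul
Value | Count | Frequency (%)
(character lost) | 117 | 7.6%
(character lost) | 69 | 4.5%
(character lost) | 43 | 2.8%
(character lost) | 42 | 2.7%
(character lost) | 35 | 2.3%
(character lost) | 32 | 2.1%
(character lost) | 31 | 2.0%
(character lost) | 28 | 1.8%
(character lost) | 28 | 1.8%
(character lost) | 27 | 1.8%
Other values (229) | 1084 | 70.6%

ASCII
Value | Count | Frequency (%)
( | 13 | 31.0%
) | 13 | 31.0%
, | 10 | 23.8%
/ | 3 | 7.1%
(space) | 3 | 7.1%

None
Value | Count | Frequency (%)
· | 5 | 100.0%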

발생건수 (number of cases occurred)
Text

Distinct: 73
Distinct (%): 34.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.5399061
Min length: 1

Characters and Unicode

Total characters: 541
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 61
Unique (%): 28.6%

Sample

1st row: 276
2nd row: -
3rd row: 2
4th row: 4
5th row: 102

Value | Count | Frequency (%)
(blank) | 102 | 47.9%
1 | 16 | 7.5%
2 | 8 | 3.8%
4 | 5 | 2.3%
3 | 5 | 2.3%
5 | 3 | 1.4%
7 | 3 | 1.4%
24 | 2 | 0.9%
13 | 2 | 0.9%
17 | 2 | 0.9%
Other values (63) | 65 | 30.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 204 | 37.7%
- | 102 | 18.9%
1 | 51 | 9.4%
2 | 41 | 7.6%
3 | 31 | 5.7%
4 | 27 | 5.0%
7 | 18 | 3.3%
6 | 18 | 3.3%
5 | 16 | 3.0%
9 | 15 | 2.8%
Other values (2) | 18 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 235 | 43.4%
Space Separator | 204 | 37.7%
Dash Punctuation | 102 | 18.9%
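
The character counts above hint at how the empty-cell placeholder is stored: 204 spaces is exactly twice the 102 dashes, which is consistent with each placeholder being written as " - " (a dash padded with one space on each side). That would also explain why count columns are typed as Text or Categorical and why no cells are reported missing. The sketch below assumes "-" means "no recorded count" and converts such cells to `None`; `parse_count` is a hypothetical helper, not part of ydata-profiling.

```python
from typing import Optional


def parse_count(cell: str) -> Optional[int]:
    """Convert a raw count cell to an int, or None for the '-' placeholder.

    Assumption: '-' (possibly padded with spaces) marks a cell with no
    recorded value; everything else is a plain non-negative integer.
    """
    text = cell.strip()
    if text == "-":
        return None
    return int(text)


# First five sample rows of this variable, with the padded placeholder.
sample_cells = ["276", " - ", "2", "4", "102"]
parsed = [parse_count(cell) for cell in sample_cells]
print(parsed)
```

Whether a placeholder should become a missing value or a zero depends on how the publisher uses "-", which this report does not say.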

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 51 | 21.7%
2 | 41 | 17.4%
3 | 31 | 13.2%
4 | 27 | 11.5%
7 | 18 | 7.7%
6 | 18 | 7.7%
5 | 16 | 6.8%
9 | 15 | 6.4%
0 | 10 | 4.3%
8 | 8 | 3.4%

Space Separator
Value | Count | Frequency (%)
(space) | 204 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 102 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 541 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 204 | 37.7%
- | 102 | 18.9%
1 | 51 | 9.4%
2 | 41 | 7.6%
3 | 31 | 5.7%
4 | 27 | 5.0%
7 | 18 | 3.3%
6 | 18 | 3.3%
5 | 16 | 3.0%
9 | 15 | 2.8%
Other values (2) | 18 | 3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 541 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 204 | 37.7%
- | 102 | 18.9%
1 | 51 | 9.4%
2 | 41 | 7.6%
3 | 31 | 5.7%
4 | 27 | 5.0%
7 | 18 | 3.3%
6 | 18 | 3.3%
5 | 16 | 3.0%
9 | 15 | 2.8%
Other values (2) | 18 | 3.3%

검거건수 (number of cases cleared by arrest)
Text

Distinct: 69
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.5258216
Min length: 1

Characters and Unicode

Total characters: 538
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 55
Unique (%): 25.8%

Sample

1st row: 255
2nd row: -
3rd row: 2
4th row: 4
5th row: 97

Value | Count | Frequency (%)
(blank) | 101 | 47.4%
1 | 19 | 8.9%
3 | 7 | 3.3%
4 | 5 | 2.3%
2 | 4 | 1.9%
7 | 4 | 1.9%
5 | 3 | 1.4%
13 | 3 | 1.4%
53 | 2 | 0.9%
17 | 2 | 0.9%
Other values (59) | 63 | 29.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 202 | 37.5%
- | 101 | 18.8%
1 | 56 | 10.4%
4 | 32 | 5.9%
3 | 31 | 5.8%
2 | 28 | 5.2%
7 | 20 | 3.7%
6 | 19 | 3.5%
5 | 14 | 2.6%
0 | 13 | 2.4%
Other values (2) | 22 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 235 | 43.7%
Space Separator | 202 | 37.5%
Dash Punctuation | 101 | 18.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 56 | 23.8%
4 | 32 | 13.6%
3 | 31 | 13.2%
2 | 28 | 11.9%
7 | 20 | 8.5%
6 | 19 | 8.1%
5 | 14 | 6.0%
0 | 13 | 5.5%
9 | 12 | 5.1%
8 | 10 | 4.3%

Space Separator
Value | Count | Frequency (%)
(space) | 202 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 101 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 538 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 202 | 37.5%
- | 101 | 18.8%
1 | 56 | 10.4%
4 | 32 | 5.9%
3 | 31 | 5.8%
2 | 28 | 5.2%
7 | 20 | 3.7%
6 | 19 | 3.5%
5 | 14 | 2.6%
0 | 13 | 2.4%
Other values (2) | 22 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 538 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 202 | 37.5%
- | 101 | 18.8%
1 | 56 | 10.4%
4 | 32 | 5.9%
3 | 31 | 5.8%
2 | 28 | 5.2%
7 | 20 | 3.7%
6 | 19 | 3.5%
5 | 14 | 2.6%
0 | 13 | 2.4%
Other values (2) | 22 | 4.1%

검거인원 남 (arrested persons, male)
Text

Distinct: 68
Distinct (%): 31.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.5868545
Min length: 1

Characters and Unicode

Total characters: 551
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 55
Unique (%): 25.8%

Sample

1st row: 175
2nd row: -
3rd row: -
4th row: 8
5th row: 65

Value | Count | Frequency (%)
(blank) | 112 | 52.6%
1 | 13 | 6.1%
3 | 6 | 2.8%
4 | 4 | 1.9%
2 | 4 | 1.9%
18 | 3 | 1.4%
7 | 3 | 1.4%
14 | 3 | 1.4%
26 | 2 | 0.9%
65 | 2 | 0.9%
Other values (58) | 61 | 28.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 224 | 40.7%
- | 112 | 20.3%
1 | 41 | 7.4%
2 | 31 | 5.6%
4 | 27 | 4.9%
3 | 22 | 4.0%
8 | 22 | 4.0%
7 | 18 | 3.3%
6 | 16 | 2.9%
5 | 16 | 2.9%
Other values (2) | 22 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Space Separator | 224 | 40.7%
Decimal Number | 215 | 39.0%
Dash Punctuation | 112 | 20.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 41 | 19.1%
2 | 31 | 14.4%
4 | 27 | 12.6%
3 | 22 | 10.2%
8 | 22 | 10.2%
7 | 18 | 8.4%
6 | 16 | 7.4%
5 | 16 | 7.4%
0 | 12 | 5.6%
9 | 10 | 4.7%

Space Separator
Value | Count | Frequency (%)
(space) | 224 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 112 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 551 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 224 | 40.7%
- | 112 | 20.3%
1 | 41 | 7.4%
2 | 31 | 5.6%
4 | 27 | 4.9%
3 | 22 | 4.0%
8 | 22 | 4.0%
7 | 18 | 3.3%
6 | 16 | 2.9%
5 | 16 | 2.9%
Other values (2) | 22 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 551 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 224 | 40.7%
- | 112 | 20.3%
1 | 41 | 7.4%
2 | 31 | 5.6%
4 | 27 | 4.9%
3 | 22 | 4.0%
8 | 22 | 4.0%
7 | 18 | 3.3%
6 | 16 | 2.9%
5 | 16 | 2.9%
Other values (2) | 22 | 4.0%

검거인원 여
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 42
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Top categories (count): - (142), 1 (9), 2 (7), 3 (6), 5 (3), other values (37): 46

Length

Max length: 4
Median length: 3
Mean length: 2.6103286
Min length: 1

Unique

Unique: 29
Unique (%): 13.6%

Sample

1st row: 47
2nd row: -
3rd row: -
4th row: 3
5th row: 17

Common Values

Value | Count | Frequency (%)
- | 142 | 66.7%
1 | 9 | 4.2%
2 | 7 | 3.3%
3 | 6 | 2.8%
5 | 3 | 1.4%
26 | 3 | 1.4%
38 | 2 | 0.9%
11 | 2 | 0.9%
9 | 2 | 0.9%
17 | 2 | 0.9%
Other values (32) | 35 | 16.4%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
(blank) | 142 | 66.7%
1 | 9 | 4.2%
2 | 7 | 3.3%
3 | 6 | 2.8%
5 | 3 | 1.4%
26 | 3 | 1.4%
34 | 2 | 0.9%
47 | 2 | 0.9%
30 | 2 | 0.9%
17 | 2 | 0.9%
Other values (32) | 35 | 16.4%

검거인원 미상
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 20
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Top categories (count): - (164), 1 (17), 2 (5), 6 (4), 4 (4), other values (15): 19

Length

Max length: 3
Median length: 3
Mean length: 2.629108
Min length: 1

Unique

Unique: 13
Unique (%): 6.1%

Sample

1st row: 28
2nd row: -
3rd row: -
4th row: 1
5th row: 6

Common Values

Value | Count | Frequency (%)
- | 164 | 77.0%
1 | 17 | 8.0%
2 | 5 | 2.3%
6 | 4 | 1.9%
4 | 4 | 1.9%
28 | 3 | 1.4%
3 | 3 | 1.4%
125 | 1 | 0.5%
7 | 1 | 0.5%
12 | 1 | 0.5%
Other values (10) | 10 | 4.7%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
(blank) | 164 | 77.0%
1 | 17 | 8.0%
2 | 5 | 2.3%
6 | 4 | 1.9%
4 | 4 | 1.9%
28 | 3 | 1.4%
3 | 3 | 1.4%
43 | 1 | 0.5%
147 | 1 | 0.5%
241 | 1 | 0.5%
Other values (10) | 10 | 4.7%

법인
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 39
Distinct (%): 18.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Top categories (count): - (159), 1 (12), 2 (5), 3 (2), 293 (1), other values (34): 34

Length

Max length: 4
Median length: 3
Mean length: 2.7042254
Min length: 1

Unique

Unique: 35
Unique (%): 16.4%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: 4

Common Values

Value | Count | Frequency (%)
- | 159 | 74.6%
1 | 12 | 5.6%
2 | 5 | 2.3%
3 | 2 | 0.9%
293 | 1 | 0.5%
6 | 1 | 0.5%
5 | 1 | 0.5%
37 | 1 | 0.5%
41 | 1 | 0.5%
14 | 1 | 0.5%
Other values (29) | 29 | 13.6%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
(blank) | 159 | 74.6%
1 | 12 | 5.6%
2 | 5 | 2.3%
3 | 2 | 0.9%
26 | 1 | 0.5%
47 | 1 | 0.5%
339 | 1 | 0.5%
32 | 1 | 0.5%
46 | 1 | 0.5%
271 | 1 | 0.5%
Other values (29) | 29 | 13.6%

Correlations

[Figure: correlation heatmap]

 | 발생건수 | 검거건수 | 검거인원 남 | 검거인원 여 | 검거인원 미상 | 법인
발생건수 | 1.000 | 1.000 | 0.999 | 0.999 | 0.999 | 0.999
검거건수 | 1.000 | 1.000 | 0.999 | 0.999 | 0.999 | 0.999
검거인원 남 | 0.999 | 0.999 | 1.000 | 0.999 | 0.999 | 0.999
검거인원 여 | 0.999 | 0.999 | 0.999 | 1.000 | 0.991 | 0.995
검거인원 미상 | 0.999 | 0.999 | 0.999 | 0.991 | 1.000 | 0.986
법인 | 0.999 | 0.999 | 0.999 | 0.995 | 0.986 | 1.000

[Figure: correlation heatmap]

 | 법인 | 검거인원 미상 | 검거인원 여
법인 | 1.000 | 0.768 | 0.836
검거인원 미상 | 0.768 | 1.000 | 0.813
검거인원 여 | 0.836 | 0.813 | 1.000

[Figure: correlation heatmap]

 | 검거인원 여 | 검거인원 미상 | 법인
검거인원 여 | 1.000 | 0.813 | 0.836
검거인원 미상 | 0.813 | 1.000 | 0.768
법인 | 0.836 | 0.768 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

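Column glosses: 발생건수 = cases occurred; 검거건수 = cases cleared by arrest; 검거인원 남 / 여 / 미상 = arrested persons, male / female / sex unknown; 법인 = corporations. Where a run of digits could be split into columns in more than one way, it is kept fused in square brackets rather than guessed.

First rows

Index | 2017년 | 발생건수 | 검거건수 | 검거인원 남 | 검거인원 여 | 검거인원 미상 | 법인
0 | 절도 | 276 | 255 | 175 | 47 | 28 | -
1 | 불법사용 | - | - | - | - | - | -
2 | 침입절도 | 2 | 2 | - | - | - | -
3 | 장물 | 4 | 4 | 8 | 3 | 1 | -
4 | 사기 | 102 | 97 | 65 | 17 | 6 | 4
5 | 컴퓨터등사용사기 | [131373: four fused values] | - | -
6 | 부당이득 | - | - | 1 | - | - | -
7 | 편의시설부정이용 | [161669: four fused values] | - | -
8 | 전기통신금융사기피해금환급에관한특별법 | - | - | - | - | - | -
9 | 보험사기방지특별법 | - | - | - | - | - | -

Last rows

Index | 2017년 | 발생건수 | 검거건수 | 검거인원 남 | 검거인원 여 | 검거인원 미상 | 법인
203 | 폐기물관리법 | [389386372264169: six fused values]
204 | 풍속영업의규제에관한법률 | - | - | - | - | - | -
205 | 하천법 | 3 | 3 | 3 | - | - | -
206 | 학교보건법 | - | - | - | - | - | -
207 | 학원의설립운영및과외교습에관한법률 | - | - | - | - | - | -
208 | 화물자동차운수사업법 | 1 | 1 | 1 | - | - | -
209 | 화재로인한재해보상과보험가입에관한법률 | 1 | - | - | - | - | -
210 | 화재예방·소방시설설치유지및안전관리에관한법률 | [2426269: four fused values] | - | 8
211 | 화학물질관리법 | [36236232026: four fused values] | - | 253
212 | 기타특별법 | [24752465237843140791: six fused values]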