Overview

Dataset statistics

Number of variables: 6
Number of observations: 407
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 57
Duplicate rows (%): 14.0%
Total size in memory: 20.0 KiB
Average record size in memory: 50.3 B
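
The duplicate percentage is the duplicate count divided by the number of observations (57 / 407). The report does not spell out whether 57 counts distinct repeated row patterns or the rows involved, so the sketch below computes both on hypothetical toy rows. `duplicate_summary` and the toy data are illustrative names, not part of the report.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (distinct row patterns seen more than once, rows covered by them)."""
    counts = Counter(rows)
    repeated = {row: c for row, c in counts.items() if c > 1}
    return len(repeated), sum(repeated.values())


# Hypothetical toy rows (나이대, 피해금액, 피해년월); not taken from the dataset.
rows = [
    (60, 200000, "2022년03월"),
    (60, 200000, "2022년03월"),
    (50, 100000, "2022년03월"),
    (50, 100000, "2022년03월"),
    (50, 100000, "2022년03월"),
    (70, 310000, "2022년01월"),
]
n_patterns, n_rows = duplicate_summary(rows)
print(n_patterns, n_rows)

# Percentage check using the two figures reported above.
reported_pct = round(57 / 407 * 100, 1)
print(reported_pct)
```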

Variable types

Numeric: 2
Categorical: 4

Dataset

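Description: Information on the handling of financial fraud cases at Korea Post (우체국), broken down by fraud type. The dataset has the following columns: 나이대 (age group), 피해자성별 (victim gender), 피해금액 (loss amount), 피해년월 (year and month of the loss), 사기유형 (fraud type), 사칭기관 (impersonated institution).
URL: https://www.data.go.kr/data/15021794/fileData.do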

Alerts

Dataset has 57 (14.0%) duplicate rows [Duplicates]
사기유형 is highly overall correlated with 사칭기관 [High correlation]
사칭기관 is highly overall correlated with 사기유형 [High correlation]
사기유형 is highly imbalanced (71.7%) [Imbalance]
사칭기관 is highly imbalanced (75.0%) [Imbalance]
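
The report does not state how the imbalance percentages are derived. One formula that is consistent with both alerts is 1 minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. The sketch below assumes that definition; `imbalance_score` is an illustrative name, and the counts are copied from the Common Values tables of the two variables further down in this report.

```python
from math import log2


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    if len(counts) < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(len(counts))


# Category counts from the Common Values tables (each sums to 407).
fraud_type_counts = [359, 16, 14, 7, 5, 3, 3]        # 사기유형
institution_counts = [358, 21, 15, 4, 3, 3, 1, 1, 1]  # 사칭기관

print(f"{imbalance_score(fraud_type_counts):.1%}")
print(f"{imbalance_score(institution_counts):.1%}")
```

Under this assumed definition a perfectly even split scores 0, so the two values above reflect how strongly a single category dominates each column.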

Reproduction

Analysis started: 2023-12-12 18:57:11.261726
Analysis finished: 2023-12-12 18:57:12.710857
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

나이대
Real number (ℝ)

Distinct: 7
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.921376
Minimum: 20
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 20
5th percentile: 40
Q1: 50
Median: 60
Q3: 60
95th percentile: 70
Maximum: 80
Range: 60
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 10.808394
Coefficient of variation (CV): 0.19327841
Kurtosis: 2.7927291
Mean: 55.921376
Median Absolute Deviation (MAD): 10
Skewness: -1.1566836
Sum: 22760
Variance: 116.82139
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]
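Sorted by count:
Value | Count | Frequency (%)
60 | 180 | 44.2%
50 | 132 | 32.4%
70 | 55 | 13.5%
40 | 16 | 3.9%
20 | 15 | 3.7%
80 | 6 | 1.5%
30 | 3 | 0.7%

Sorted by value, ascending:
Value | Count | Frequency (%)
20 | 15 | 3.7%
30 | 3 | 0.7%
40 | 16 | 3.9%
50 | 132 | 32.4%
60 | 180 | 44.2%
70 | 55 | 13.5%
80 | 6 | 1.5%

Sorted by value, descending:
Value | Count | Frequency (%)
80 | 6 | 1.5%
70 | 55 | 13.5%
60 | 180 | 44.2%
50 | 132 | 32.4%
40 | 16 | 3.9%
30 | 3 | 0.7%
20 | 15 | 3.7%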
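
Because 나이대 takes only seven distinct values, the frequency table above is enough to recompute most of the statistics reported for it. The sketch below expands the table back into 407 observations and uses Python's `statistics` module. It assumes the sample (n - 1) variance and linearly interpolated quantiles, which is what the reported figures appear to correspond to.

```python
import statistics

# Value -> count, copied from the frequency table for 나이대.
age_counts = {20: 15, 30: 3, 40: 16, 50: 132, 60: 180, 70: 55, 80: 6}

# Expand the table back into one value per observation.
ages = sorted(value for value, count in age_counts.items() for _ in range(count))

n = len(ages)
total = sum(ages)
mean = statistics.fmean(ages)
variance = statistics.variance(ages)  # sample variance (n - 1 denominator)
std = statistics.stdev(ages)
cv = std / mean                       # coefficient of variation
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")
mad = statistics.median(abs(a - median) for a in ages)  # median absolute deviation

print(n, total, mean, variance, std, cv)
print(q1, median, q3, q3 - q1, mad)
```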

피해자성별
Categorical

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Category counts: (blank) 206; (blank) 200; <NA> 1

Length

Max length: 4
Median length: 1
Mean length: 1.007371
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(blank) | 206 | 50.6%
(blank) | 200 | 49.1%
<NA> | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Counts per word (values split into tokens):
Value | Count | Frequency (%)
(blank) | 206 | 50.6%
(blank) | 200 | 49.1%
na | 1 | 0.2%

피해금액
Real number (ℝ)

Distinct: 165
Distinct (%): 40.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7329135.4
Minimum: 93000
Maximum: 4.6 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 93000
5th percentile: 100000
Q1: 200000
Median: 1750000
Q3: 6035670
95th percentile: 29228000
Maximum: 4.6 × 10^8
Range: 4.59907 × 10^8
Interquartile range (IQR): 5835670

Descriptive statistics

Standard deviation: 26505833
Coefficient of variation (CV): 3.6165021
Kurtosis: 212.03022
Mean: 7329135.4
Median Absolute Deviation (MAD): 1610000
Skewness: 13.024073
Sum: 2.9829581 × 10^9
Variance: 7.0255921 × 10^14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Sorted by count (top 10):
Value | Count | Frequency (%)
200000 | 61 | 15.0%
100000 | 43 | 10.6%
150000 | 23 | 5.7%
2000000 | 13 | 3.2%
10000000 | 12 | 2.9%
6000000 | 12 | 2.9%
1000000 | 11 | 2.7%
3000000 | 10 | 2.5%
500000 | 9 | 2.2%
9000000 | 8 | 2.0%
Other values (155) | 205 | 50.4%

Sorted by value, ascending (10 smallest):
Value | Count | Frequency (%)
93000 | 1 | 0.2%
100000 | 43 | 10.6%
120000 | 1 | 0.2%
140000 | 1 | 0.2%
149375 | 1 | 0.2%
150000 | 23 | 5.7%
152000 | 1 | 0.2%
180000 | 1 | 0.2%
190000 | 2 | 0.5%
200000 | 61 | 15.0%

Sorted by value, descending (10 largest):
Value | Count | Frequency (%)
460000000 | 1 | 0.2%
119002000 | 1 | 0.2%
118000000 | 1 | 0.2%
89000000 | 2 | 0.5%
73000000 | 2 | 0.5%
70370000 | 1 | 0.2%
62732400 | 1 | 0.2%
60000000 | 1 | 0.2%
47000000 | 1 | 0.2%
44900000 | 2 | 0.5%
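
Several of the 피해금액 figures are derived from others, which makes them easy to cross-check. The sketch below assumes the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, sum = mean × number of observations) and uses only numbers reported in this section. The variable names are illustrative.

```python
# Primary figures as reported for 피해금액.
n_observations = 407
minimum, maximum = 93_000, 460_000_000
q1, q3 = 200_000, 6_035_670
mean, std = 7_329_135.4, 26_505_833

value_range = maximum - minimum   # reported as 4.59907 × 10^8
iqr = q3 - q1                     # reported as 5835670
cv = std / mean                   # reported as 3.6165021
total = mean * n_observations     # reported as 2.9829581 × 10^9

print(value_range, iqr, round(cv, 7), total)
```

A CV well above 1, together with the large gap between the median (1750000) and the mean, is in line with the strong right skew reported for this column.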

피해년월
Categorical

Distinct: 6
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Top categories: 2022년03월 92; 2022년05월 75; 2022년01월 69; 2022년02월 61; 2022년06월 60

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022년06월
2nd row: 2022년06월
3rd row: 2022년06월
4th row: 2022년06월
5th row: 2022년06월

Common Values

Value | Count | Frequency (%)
2022년03월 | 92 | 22.6%
2022년05월 | 75 | 18.4%
2022년01월 | 69 | 17.0%
2022년02월 | 61 | 15.0%
2022년06월 | 60 | 14.7%
2022년04월 | 50 | 12.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Counts per word (values split into tokens):
Value | Count | Frequency (%)
2022년03월 | 92 | 22.6%
2022년05월 | 75 | 18.4%
2022년01월 | 69 | 17.0%
2022년02월 | 61 | 15.0%
2022년06월 | 60 | 14.7%
2022년04월 | 50 | 12.3%
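
피해년월 is profiled as a categorical column because its values are 8-character strings such as 2022년03월 (a four-digit year followed by 년, then a two-digit month followed by 월). A minimal sketch of turning those labels into (year, month) pairs so that the counts can be read in chronological order follows. `parse_year_month` is an illustrative helper, and the counts are copied from the table above.

```python
import re

# Four-digit year + 년, two-digit month + 월.
YEAR_MONTH = re.compile(r"^(\d{4})년(\d{2})월$")


def parse_year_month(label):
    """Convert a label like '2022년03월' into a (year, month) tuple."""
    match = YEAR_MONTH.match(label)
    if match is None:
        raise ValueError(f"unexpected 피해년월 value: {label!r}")
    return int(match.group(1)), int(match.group(2))


# Counts copied from the Common Values table for 피해년월.
month_counts = {
    "2022년03월": 92,
    "2022년05월": 75,
    "2022년01월": 69,
    "2022년02월": 61,
    "2022년06월": 60,
    "2022년04월": 50,
}

chronological = sorted(month_counts.items(), key=lambda item: parse_year_month(item[0]))
for label, count in chronological:
    print(parse_year_month(label), count)
```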

사기유형
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Top categories: 지인사칭(메신저피싱) 359; 기존대출 상환 16; 사건연루조사 14; 기타 7; 대출실행 후 상환 5; Other values (2) 6

Length

Max length: 19
Median length: 11
Mean length: 10.572482
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사건연루조사
2nd row: 사건연루조사
3rd row: 대출수수료 요구(보증료. 공탁 등)
4th row: 기타
5th row: 지인사칭(메신저피싱)

Common Values

Value | Count | Frequency (%)
지인사칭(메신저피싱) (acquaintance impersonation, messenger phishing) | 359 | 88.2%
기존대출 상환 (repayment of an existing loan) | 16 | 3.9%
사건연루조사 (investigation of involvement in a case) | 14 | 3.4%
기타 (other) | 7 | 1.7%
대출실행 후 상환 (repayment after loan execution) | 5 | 1.2%
대출수수료 요구(보증료. 공탁 등) (loan fee demand: guarantee fee, deposit, etc.) | 3 | 0.7%
개인정보유출방지. 보안강화 (personal data leak prevention, security hardening) | 3 | 0.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Counts per word (values split into tokens):
Value | Count | Frequency (%)
지인사칭(메신저피싱 | 359 | 80.7%
상환 | 21 | 4.7%
기존대출 | 16 | 3.6%
사건연루조사 | 14 | 3.1%
기타 | 7 | 1.6%
대출실행 | 5 | 1.1%
(blank) | 5 | 1.1%
대출수수료 | 3 | 0.7%
요구(보증료 | 3 | 0.7%
공탁 | 3 | 0.7%
Other values (3) | 9 | 2.0%

사칭기관
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Top categories: 개인 358; 시중은행 21; 경찰. 검찰. 법원 15; 기타 4; 저축은행 3; Other values (4) 6

Length

Max length: 15
Median length: 2
Mean length: 2.5356265
Min length: 2

Unique

Unique: 3
Unique (%): 0.7%

Sample

1st row: 경찰. 검찰. 법원
2nd row: 경찰. 검찰. 법원
3rd row: 저축은행
4th row: 저축은행
5th row: 개인

Common Values

Value | Count | Frequency (%)
개인 (individual) | 358 | 88.0%
시중은행 (commercial bank) | 21 | 5.2%
경찰. 검찰. 법원 (police, prosecution, courts) | 15 | 3.7%
기타 (other) | 4 | 1.0%
저축은행 (savings bank) | 3 | 0.7%
할부금융(카드사 및 캐피탈) (installment finance: card companies and capital firms) | 3 | 0.7%
금융지주회사 (financial holding company) | 1 | 0.2%
대부업체 (private lending company) | 1 | 0.2%
기타 공공기관 (other public institution) | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Counts per word (values split into tokens):
Value | Count | Frequency (%)
개인 | 358 | 80.6%
시중은행 | 21 | 4.7%
경찰 | 15 | 3.4%
검찰 | 15 | 3.4%
법원 | 15 | 3.4%
기타 | 5 | 1.1%
저축은행 | 3 | 0.7%
할부금융(카드사 | 3 | 0.7%
(blank) | 3 | 0.7%
캐피탈 | 3 | 0.7%
Other values (3) | 3 | 0.7%

Interactions

[Figures: 4 interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 나이대 | 피해자성별 | 피해금액 | 피해년월 | 사기유형 | 사칭기관
나이대 | 1.000 | 0.186 | 0.422 | 0.180 | 0.811 | 0.589
피해자성별 | 0.186 | 1.000 | 0.207 | 0.352 | 0.057 | 0.000
피해금액 | 0.422 | 0.207 | 1.000 | 0.100 | 0.322 | 0.383
피해년월 | 0.180 | 0.352 | 0.100 | 1.000 | 0.172 | 0.187
사기유형 | 0.811 | 0.057 | 0.322 | 0.172 | 1.000 | 0.822
사칭기관 | 0.589 | 0.000 | 0.383 | 0.187 | 0.822 | 1.000

[Figure: correlation heatmap]
Variable | 피해자성별 | 사칭기관 | 피해년월 | 사기유형
피해자성별 | 1.000 | 0.000 | 0.252 | 0.061
사칭기관 | 0.000 | 1.000 | 0.093 | 0.625
피해년월 | 0.252 | 0.093 | 1.000 | 0.103
사기유형 | 0.061 | 0.625 | 0.103 | 1.000

[Figure: correlation heatmap]
Variable | 나이대 | 피해금액 | 피해자성별 | 피해년월 | 사기유형 | 사칭기관
나이대 | 1.000 | -0.112 | 0.197 | 0.107 | 0.408 | 0.361
피해금액 | -0.112 | 1.000 | 0.137 | 0.063 | 0.226 | 0.253
피해자성별 | 0.197 | 0.137 | 1.000 | 0.252 | 0.061 | 0.000
피해년월 | 0.107 | 0.063 | 0.252 | 1.000 | 0.103 | 0.093
사기유형 | 0.408 | 0.226 | 0.061 | 0.103 | 1.000 | 0.625
사칭기관 | 0.361 | 0.253 | 0.000 | 0.093 | 0.625 | 1.000
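
Several of the coefficients above relate two categorical columns, for example 사기유형 and 사칭기관. This extract does not say which measure each matrix uses. Cramér's V is one common choice for categorical pairs, and the sketch below shows its plain (uncorrected) form on a hypothetical contingency table. The counts are invented for illustration, so the result is not expected to match any coefficient in the report, and `cramers_v` is an illustrative name.

```python
from math import sqrt


def cramers_v(table):
    """Plain Cramér's V for a contingency table given as a list of rows of counts.

    Assumes at least two rows and two columns, each with a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))


# Hypothetical counts: rows are two fraud types, columns are three institutions.
toy_table = [
    [30, 2, 1],
    [3, 20, 2],
]
print(round(cramers_v(toy_table), 3))
```

V is 0 when the two columns are independent and 1 when each category of one column determines the category of the other, which is why a dominant pairing such as one fraud type with one institution pushes the coefficient up.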

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

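First 10 rows:
Row | 나이대 | 피해자성별 | 피해금액 | 피해년월 | 사기유형 | 사칭기관
0 | 20 | (blank) | 4000000 | 2022년06월 | 사건연루조사 | 경찰. 검찰. 법원
1 | 20 | (blank) | 6500000 | 2022년06월 | 사건연루조사 | 경찰. 검찰. 법원
2 | 20 | (blank) | 3000000 | 2022년06월 | 대출수수료 요구(보증료. 공탁 등) | 저축은행
3 | 20 | (blank) | 31000000 | 2022년06월 | 기타 | 저축은행
4 | 40 | (blank) | 6010000 | 2022년06월 | 지인사칭(메신저피싱) | 개인
5 | 40 | (blank) | 140000 | 2022년06월 | 지인사칭(메신저피싱) | 개인
6 | 40 | (blank) | 500000 | 2022년06월 | 지인사칭(메신저피싱) | 개인
7 | 50 | (blank) | 3000000 | 2022년06월 | 지인사칭(메신저피싱) | 개인
8 | 50 | (blank) | 3000000 | 2022년06월 | 지인사칭(메신저피싱) | 개인
9 | 50 | (blank) | 3000000 | 2022년06월 | 지인사칭(메신저피싱) | 개인

Last 10 rows:
Row | 나이대 | 피해자성별 | 피해금액 | 피해년월 | 사기유형 | 사칭기관
397 | 70 | (blank) | 100000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
398 | 70 | (blank) | 1000000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
399 | 70 | (blank) | 310000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
400 | 70 | (blank) | 6035051 | 2022년01월 | 지인사칭(메신저피싱) | 개인
401 | 70 | (blank) | 6000000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
402 | 70 | (blank) | 100000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
403 | 70 | (blank) | 6000000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
404 | 70 | (blank) | 1000000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
405 | 70 | (blank) | 6020000 | 2022년01월 | 지인사칭(메신저피싱) | 개인
406 | 70 | (blank) | 1250000 | 2022년01월 | 지인사칭(메신저피싱) | 개인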

Duplicate rows

Most frequently occurring

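Row | 나이대 | 피해자성별 | 피해금액 | 피해년월 | 사기유형 | 사칭기관 | # duplicates
29 | 60 | (blank) | 200000 | 2022년03월 | 지인사칭(메신저피싱) | 개인 | 6
4 | 50 | (blank) | 100000 | 2022년03월 | 지인사칭(메신저피싱) | 개인 | 5
23 | 60 | (blank) | 100000 | 2022년02월 | 지인사칭(메신저피싱) | 개인 | 5
24 | 60 | (blank) | 100000 | 2022년03월 | 지인사칭(메신저피싱) | 개인 | 5
41 | 60 | (blank) | 200000 | 2022년03월 | 지인사칭(메신저피싱) | 개인 | 5
42 | 60 | (blank) | 200000 | 2022년04월 | 지인사칭(메신저피싱) | 개인 | 5
8 | 50 | (blank) | 200000 | 2022년02월 | 지인사칭(메신저피싱) | 개인 | 4
16 | 50 | (blank) | 200000 | 2022년02월 | 지인사칭(메신저피싱) | 개인 | 4
3 | 40 | (blank) | 29240000 | 2022년05월 | 기존대출 상환 | 시중은행 | 3
7 | 50 | (blank) | 200000 | 2022년01월 | 지인사칭(메신저피싱) | 개인 | 3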