Overview

Dataset statistics

Number of variables             7
Number of observations          187
Missing cells                   0
Missing cells (%)               0.0%
Duplicate rows                  23
Duplicate rows (%)              12.3%
Total size in memory            11.1 KiB
Average record size in memory   60.7 B
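
The percentage and per-record figures above follow from the raw counts. The sketch below redoes that arithmetic; the variable names are chosen here for illustration and are not part of the report. Because the report rounds the total size to 11.1 KiB, the per-record size can only be bracketed, not reproduced exactly.

```python
n_rows = 187        # Number of observations
n_duplicates = 23   # Duplicate rows
reported_avg = 60.7 # Average record size in memory (bytes), as reported

# Share of duplicate rows, rounded the way the report displays it.
duplicate_pct = round(100 * n_duplicates / n_rows, 1)

# "11.1 KiB" is a rounded figure, so the true total lies in [11.05, 11.15) KiB.
low_bytes_per_row = 11.05 * 1024 / n_rows
high_bytes_per_row = 11.15 * 1024 / n_rows

print(duplicate_pct)
print(low_bytes_per_row <= reported_avg < high_bytes_per_row)
```

The reported 60.7 B falls inside the bracket, so the two size figures are consistent with each other.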

Variable types

Categorical   6
Numeric       1

Dataset

Description   Sample
Author        한국인터넷진흥원 (Korea Internet & Security Agency, KISA)
URL           https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KIS00000000000000022

Alerts

발신번호 has constant value "***********" (Constant)
수신번호 has constant value "***********" (Constant)
스팸내용 has constant value "컨텐츠 비공개" (Constant)
Dataset has 23 (12.3%) duplicate rows (Duplicates)
수신월 is highly overall correlated with 수신시분초 and 2 other fields (High correlation)
수신일 is highly overall correlated with 수신시분초 and 2 other fields (High correlation)
수신년도 is highly overall correlated with 수신시분초 and 2 other fields (High correlation)
수신시분초 is highly overall correlated with 수신년도 and 2 other fields (High correlation)

Reproduction

Analysis started         2023-12-10 06:46:19.473153
Analysis finished        2023-12-10 06:46:20.007976
Duration                 0.53 seconds
Software version         ydata-profiling v4.5.1
Download configuration   config.json
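
The reported duration can be checked against the two timestamps. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-10 06:46:19.473153")
finished = datetime.fromisoformat("2023-12-10 06:46:20.007976")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```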

Variables

수신년도 (year received)
Categorical

HIGH CORRELATION

Distinct        2
Distinct (%)    1.1%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value   Count
2018    151
2017    36

Length

Max length      4
Median length   4
Mean length     4
Min length      4

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   2017
2nd row   2017
3rd row   2017
4th row   2017
5th row   2017

Common Values

Value   Count   Frequency (%)
2018    151     80.7%
2017    36      19.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

수신월 (month received)
Categorical

HIGH CORRELATION

Distinct        3
Distinct (%)    1.6%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value   Count
2       99
1       52
11      36

Length

Max length      2
Median length   1
Mean length     1.1925134
Min length      1

Unique

Unique          0
Unique (%)      0.0%
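
The mean length of 1.1925134 is the count-weighted average of the string lengths of the three month values. A short sketch of that calculation, using the value counts reported for this variable (the dictionary name is illustrative):

```python
# Month values (stored as strings) and their row counts, from the report.
value_counts = {"2": 99, "1": 52, "11": 36}

total_rows = sum(value_counts.values())

# Weighted mean of string lengths: (99*1 + 52*1 + 36*2) / 187.
mean_length = sum(len(value) * count for value, count in value_counts.items()) / total_rows

print(total_rows, round(mean_length, 7))
```

Only the 36 rows with the two-character value "11" pull the mean above 1, which is also why the median length stays at 1.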

Sample

1st row   11
2nd row   11
3rd row   11
4th row   11
5th row   11

Common Values

Value   Count   Frequency (%)
2       99      52.9%
1       52      27.8%
11      36      19.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

수신일 (day received)
Categorical

HIGH CORRELATION

Distinct        3
Distinct (%)    1.6%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value   Count
24      99
28      52
23      36

Length

Max length      2
Median length   2
Mean length     2
Min length      2

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   23
2nd row   23
3rd row   23
4th row   23
5th row   23

Common Values

Value   Count   Frequency (%)
24      99      52.9%
28      52      27.8%
23      36      19.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

수신시분초 (time received: hour, minute, second)
Real number (ℝ)

HIGH CORRELATION

Distinct        145
Distinct (%)    77.5%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            116984.49
Minimum         100
Maximum         233100
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     1.8 KiB

Quantile statistics

Minimum                     100
5-th percentile             72020
Q1                          95500
Median                      113000
Q3                          132400
95-th percentile            182800
Maximum                     233100
Range                       233000
Interquartile range (IQR)   36900

Descriptive statistics

Standard deviation                38267.901
Coefficient of variation (CV)     0.32711944
Kurtosis                          2.0455243
Mean                              116984.49
Median Absolute Deviation (MAD)   18500
Skewness                          0.035261028
Sum                               21876100
Variance                          1.4644323 × 10^9
Monotonicity                      Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
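
Several of the quantile and descriptive statistics above are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the other reported figures. Variable names are illustrative, and small differences in the last digit are expected because the inputs are themselves rounded.

```python
import math

n_rows = 187
total = 21876100          # Sum
std_dev = 38267.901       # Standard deviation
minimum, maximum = 100, 233100
q1, q3 = 95500, 132400

mean = total / n_rows                 # Mean = Sum / number of observations
cv = std_dev / mean                   # Coefficient of variation = std / mean
variance = std_dev ** 2               # Variance = std squared
value_range = maximum - minimum       # Range = Maximum - Minimum
iqr = q3 - q1                         # IQR = Q3 - Q1

print(round(mean, 2), value_range, iqr)
print(math.isclose(cv, 0.32711944, rel_tol=1e-6))
print(math.isclose(variance, 1.4644323e9, rel_tol=1e-6))
```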
Most frequent values

Value                Count   Frequency (%)
182800               6       3.2%
110100               3       1.6%
102100               3       1.6%
120100               3       1.6%
130000               3       1.6%
113000               2       1.1%
103600               2       1.1%
100                  2       1.1%
150600               2       1.1%
121500               2       1.1%
Other values (135)   159     85.0%

Smallest values

Value    Count   Frequency (%)
100      2       1.1%
1200     1       0.5%
2200     1       0.5%
2400     1       0.5%
32100    1       0.5%
35700    1       0.5%
52200    1       0.5%
70300    1       0.5%
70400    1       0.5%
75800    1       0.5%

Largest values

Value    Count   Frequency (%)
233100   1       0.5%
230400   1       0.5%
225500   1       0.5%
205000   1       0.5%
201500   1       0.5%
195600   1       0.5%
182800   6       3.2%
182700   1       0.5%
182500   2       1.1%
181800   1       0.5%
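
The report treats 수신시분초 as a plain number, but the column name and the value range (100 to 233100) suggest a clock time packed into an integer as HHMMSS with leading zeros dropped. That reading is an assumption, not something the report states. Under it, a value can be decoded as sketched below; `parse_hhmmss` is a hypothetical helper name.

```python
from datetime import time

def parse_hhmmss(value: int) -> time:
    """Decode an integer such as 182800 as 18:28:00 (assumed HHMMSS encoding)."""
    text = f"{value:06d}"  # restore dropped leading zeros, e.g. 100 -> "000100"
    return time(int(text[:2]), int(text[2:4]), int(text[4:]))

# Minimum, most frequent, and maximum values from the tables above.
for raw in (100, 182800, 233100):
    print(raw, parse_hhmmss(raw).isoformat())
```

If the assumption holds, statistics such as the mean and standard deviation of the raw integers are not meaningful as times, because the encoding is not linear in seconds.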

발신번호 (sender number)
Categorical

CONSTANT

Distinct        1
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value         Count
***********   187

Length

Max length      11
Median length   11
Mean length     11
Min length      11

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   ***********
2nd row   ***********
3rd row   ***********
4th row   ***********
5th row   ***********

Common Values

Value         Count   Frequency (%)
***********   187     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

수신번호 (recipient number)
Categorical

CONSTANT

Distinct        1
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value         Count
***********   187

Length

Max length      11
Median length   11
Mean length     11
Min length      11

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   ***********
2nd row   ***********
3rd row   ***********
4th row   ***********
5th row   ***********

Common Values

Value         Count   Frequency (%)
***********   187     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

스팸내용 (spam content)
Categorical

CONSTANT

Distinct        1
Distinct (%)    0.5%
Missing         0
Missing (%)     0.0%
Memory size     1.6 KiB

Value                                      Count
컨텐츠 비공개 ("content not disclosed")    187

Length

Max length      7
Median length   7
Mean length     7
Min length      7

Unique

Unique          0
Unique (%)      0.0%

Sample

1st row   컨텐츠 비공개
2nd row   컨텐츠 비공개
3rd row   컨텐츠 비공개
4th row   컨텐츠 비공개
5th row   컨텐츠 비공개

Common Values

Value           Count   Frequency (%)
컨텐츠 비공개   187     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Value    Count   Frequency (%)
컨텐츠   187     50.0%
비공개   187     50.0%

Interactions

[Figure: interactions plot]

Correlations

             수신년도   수신월   수신일   수신시분초
수신년도     1.000      1.000    1.000    0.844
수신월       1.000      1.000    1.000    0.756
수신일       1.000      1.000    1.000    0.756
수신시분초   0.844      0.756    0.756    1.000

             수신월   수신일   수신년도
수신월       1.000    1.000    0.997
수신일       1.000    1.000    0.997
수신년도     0.997    0.997    1.000

             수신시분초   수신년도   수신월   수신일
수신시분초   1.000        0.661      0.617    0.617
수신년도     0.661        1.000      0.997    0.997
수신월       0.617        0.997      1.000    1.000
수신일       0.617        0.997      1.000    1.000
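
The near-perfect association among 수신년도, 수신월, and 수신일 is what one would expect if the 187 rows fall on only three calendar dates. The value counts suggest 2017-11-23 (36 rows), 2018-1-28 (52 rows), and 2018-2-24 (99 rows); that pairing of counts to dates is inferred from the tables, not stated by the report. The sketch below computes the plain, uncorrected Cramér's V for such data. The function names are hypothetical, and the report's own figures may come from a different or bias-corrected statistic, which could account for values such as 0.997.

```python
from collections import Counter
from math import sqrt

# Inferred (year, month, day) combinations and their row counts.
date_counts = {
    ("2017", "11", "23"): 36,
    ("2018", "1", "28"): 52,
    ("2018", "2", "24"): 99,
}

def pair_table(i, j):
    """Build a contingency table of counts for fields i and j of each date key."""
    table = Counter()
    for key, count in date_counts.items():
        table[(key[i], key[j])] += count
    return table

def cramers_v(pair_counts):
    """Uncorrected Cramér's V from a mapping of (a, b) -> count."""
    n = sum(pair_counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (a, b), count in pair_counts.items():
        row_totals[a] += count
        col_totals[b] += count
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))

v_month_day = cramers_v(pair_table(1, 2))
v_year_month = cramers_v(pair_table(0, 1))
print(round(v_month_day, 3), round(v_year_month, 3))
```

Because each month value maps to exactly one day value and one year, the uncorrected statistic reaches its maximum for both pairs. The correlation alerts therefore reflect how few dates are in the sample rather than an interesting relationship.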

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First 10 rows

      수신년도   수신월   수신일   수신시분초   발신번호      수신번호      스팸내용
0     2017       11       23       32100        ***********   ***********   컨텐츠 비공개
1     2017       11       23       82400        ***********   ***********   컨텐츠 비공개
2     2017       11       23       91400        ***********   ***********   컨텐츠 비공개
3     2017       11       23       92200        ***********   ***********   컨텐츠 비공개
4     2017       11       23       92300        ***********   ***********   컨텐츠 비공개
5     2017       11       23       92500        ***********   ***********   컨텐츠 비공개
6     2017       11       23       92500        ***********   ***********   컨텐츠 비공개
7     2017       11       23       93100        ***********   ***********   컨텐츠 비공개
8     2017       11       23       93200        ***********   ***********   컨텐츠 비공개
9     2017       11       23       93800        ***********   ***********   컨텐츠 비공개

Last 10 rows

      수신년도   수신월   수신일   수신시분초   발신번호      수신번호      스팸내용
177   2018       2        24       131300       ***********   ***********   컨텐츠 비공개
178   2018       2        24       131700       ***********   ***********   컨텐츠 비공개
179   2018       2        24       131900       ***********   ***********   컨텐츠 비공개
180   2018       2        24       132000       ***********   ***********   컨텐츠 비공개
181   2018       2        24       132200       ***********   ***********   컨텐츠 비공개
182   2018       2        24       133000       ***********   ***********   컨텐츠 비공개
183   2018       2        24       133400       ***********   ***********   컨텐츠 비공개
184   2018       2        24       133600       ***********   ***********   컨텐츠 비공개
185   2018       2        24       134000       ***********   ***********   컨텐츠 비공개
186   2018       2        24       135200       ***********   ***********   컨텐츠 비공개

Duplicate rows

Most frequently occurring

     수신년도   수신월   수신일   수신시분초   발신번호      수신번호      스팸내용        # duplicates
3    2017       11       23       182800       ***********   ***********   컨텐츠 비공개   6
17   2018       2        24       110100       ***********   ***********   컨텐츠 비공개   3
19   2018       2        24       120100       ***********   ***********   컨텐츠 비공개   3
21   2018       2        24       130000       ***********   ***********   컨텐츠 비공개   3
0    2017       11       23       92500        ***********   ***********   컨텐츠 비공개   2
1    2017       11       23       171900       ***********   ***********   컨텐츠 비공개   2
2    2017       11       23       182500       ***********   ***********   컨텐츠 비공개   2
4    2018       1        28       115000       ***********   ***********   컨텐츠 비공개   2
5    2018       1        28       122000       ***********   ***********   컨텐츠 비공개   2
6    2018       1        28       140000       ***********   ***********   컨텐츠 비공개   2
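
Since the sender, recipient, and content columns are constant, a row is a duplicate whenever its date and time repeat. The sketch below shows one way to find such groups with the standard library, using only the first ten sample rows shown earlier in the report. It is an illustration of the idea, not the profiling tool's own procedure, and the variable names are hypothetical.

```python
from collections import Counter

# 수신시분초 values of the first ten sample rows (all dated 2017-11-23).
times = [32100, 82400, 91400, 92200, 92300, 92500, 92500, 93100, 93200, 93800]
rows = [("2017", "11", "23", t) for t in times]

# Count identical rows, then keep only the groups that occur more than once.
row_counts = Counter(rows)
duplicate_groups = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicate_groups.items():
    print(row, n)
```

On these ten rows the only repeated entry is 92500, which matches the row with 2 duplicates for 2017-11-23 in the table above.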