Overview

Dataset statistics

Number of variables: 4
Number of observations: 200
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 53
Duplicate rows (%): 26.5%
Total size in memory: 7.0 KiB
Average record size in memory: 35.7 B
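
The percentage and size figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the report derives them by simple division and rounds to one decimal place; the variable names are illustrative and not taken from the tool:

```python
# Figures copied from the dataset statistics above.
n_observations = 200
n_duplicate_rows = 53
avg_record_bytes = 35.7  # "Average record size in memory", in bytes

# Duplicate rows (%) is the duplicate count divided by the row count.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Total size in memory, rebuilt from the average record size
# (1 KiB = 1024 bytes).
total_kib = avg_record_bytes * n_observations / 1024

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Both results agree with the reported 26.5% and 7.0 KiB once rounded.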

Variable types

Categorical: 3
Numeric: 1

Alerts

YEAR has constant value "2020" (Constant)
MONTH has constant value "1" (Constant)
Dataset has 53 (26.5%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-10 06:15:06.002030
Analysis finished: 2023-12-10 06:15:06.367964
Duration: 0.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

YEAR
Categorical

Constant

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value counts:
2020  200

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   200    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 2020 accounts for all 200 rows (100.0%)]

MONTH
Categorical

Constant

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value counts:
1  200

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      200    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 1 accounts for all 200 rows (100.0%)]

SPAM_STTEMNT_VALUE
Real number (ℝ)

Distinct: 15
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.98
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 8
Q3: 11
95th percentile: 99
Maximum: 99
Range: 98
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 25.947562
Coefficient of variation (CV): 1.732147
Kurtosis: 6.6531816
Mean: 14.98
Median Absolute Deviation (MAD): 3
Skewness: 2.8857326
Sum: 2996
Variance: 673.27598
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=15)]

Common values

Value             Count  Frequency (%)
5                 21     10.5%
8                 19     9.5%
99                17     8.5%
6                 17     8.5%
14                16     8.0%
10                15     7.5%
1                 14     7.0%
4                 14     7.0%
2                 13     6.5%
11                12     6.0%
Other values (5)  42     21.0%

Minimum 10 values

Value  Count  Frequency (%)
1      14     7.0%
2      13     6.5%
3      10     5.0%
4      14     7.0%
5      21     10.5%
6      17     8.5%
7      9      4.5%
8      19     9.5%
9      8      4.0%
10     15     7.5%

Maximum 10 values

Value  Count  Frequency (%)
99     17     8.5%
14     16     8.0%
13     7      3.5%
12     8      4.0%
11     12     6.0%
10     15     7.5%
9      8      4.0%
8      19     9.5%
7      9      4.5%
6      17     8.5%
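
The value tables for SPAM_STTEMNT_VALUE together list all 15 distinct values with their counts, so the descriptive statistics can be cross-checked from them. A minimal sketch using Python's `statistics` module, assuming the report uses the sample (n - 1) estimators for variance and standard deviation and takes MAD as the median of absolute deviations from the median; the report does not state which estimators it uses:

```python
import statistics

# Value -> count for SPAM_STTEMNT_VALUE, assembled from the value tables above.
counts = {1: 14, 2: 13, 3: 10, 4: 14, 5: 21, 6: 17, 7: 9, 8: 19,
          9: 8, 10: 15, 11: 12, 12: 8, 13: 7, 14: 16, 99: 17}

# Expand the frequency table back into the individual observations.
values = [v for v, n in counts.items() for _ in range(n)]

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
# Median absolute deviation from the median (assumed definition).
mad = statistics.median(abs(v - median) for v in values)

print(len(values), sum(values))
print(mean, median, mad)
print(round(variance, 5), round(std, 6), round(cv, 6))
```

Under these assumptions the results agree with the reported figures: 200 observations summing to 2996, mean 14.98, median 8, MAD 3, variance 673.27598, standard deviation 25.947562 and CV 1.732147. The 17 occurrences of 99 sit far above the other values (1 to 14), which accounts for the large standard deviation relative to the IQR of 7.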

GRAD
Categorical

Distinct: 12
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value counts:
E1                21
C1                21
C2                20
D9                19
C7                19
Other values (7)  100

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D9
2nd row: E1
3rd row: D9
4th row: D5
5th row: A2

Common Values

Value             Count  Frequency (%)
E1                21     10.5%
C1                21     10.5%
C2                20     10.0%
D9                19     9.5%
C7                19     9.5%
D6                19     9.5%
B1                18     9.0%
A2                17     8.5%
D5                13     6.5%
B2                13     6.5%
Other values (2)  20     10.0%

Length

[Figure: Histogram of lengths of the category]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

                    SPAM_STTEMNT_VALUE  GRAD
SPAM_STTEMNT_VALUE  1.000               0.000
GRAD                0.000               1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

     YEAR  MONTH  SPAM_STTEMNT_VALUE  GRAD
0    2020  1      12                  D9
1    2020  1      7                   E1
2    2020  1      5                   D9
3    2020  1      99                  D5
4    2020  1      11                  A2
5    2020  1      3                   D9
6    2020  1      9                   A2
7    2020  1      10                  C1
8    2020  1      1                   E1
9    2020  1      11                  C7

Last rows

     YEAR  MONTH  SPAM_STTEMNT_VALUE  GRAD
190  2020  1      4                   E1
191  2020  1      3                   D9
192  2020  1      4                   D8
193  2020  1      9                   A1
194  2020  1      6                   A2
195  2020  1      6                   C2
196  2020  1      10                  D6
197  2020  1      1                   A2
198  2020  1      6                   D8
199  2020  1      7                   C2

Duplicate rows

Most frequently occurring

    YEAR  MONTH  SPAM_STTEMNT_VALUE  GRAD  # duplicates
15  2020  1      5                   B1    5
27  2020  1      8                   C7    5
19  2020  1      5                   E1    4
28  2020  1      8                   D6    4
34  2020  1      10                  C1    4
1   2020  1      1                   C7    3
3   2020  1      1                   E1    3
6   2020  1      2                   C1    3
11  2020  1      4                   C1    3
12  2020  1      4                   C2    3
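
A table like the one above can be built by counting identical rows. A minimal sketch using only the standard library; the rows below are made-up illustrative data rather than the profiled dataset, and the report itself does not state whether its figure of 53 counts distinct repeated rows or every row inside such groups, so both counts are shown:

```python
from collections import Counter

# Hypothetical rows as (YEAR, MONTH, SPAM_STTEMNT_VALUE, GRAD) tuples;
# illustrative only, not the actual dataset.
rows = [
    ("2020", "1", 5, "B1"),
    ("2020", "1", 5, "B1"),
    ("2020", "1", 8, "C7"),
    ("2020", "1", 5, "B1"),
    ("2020", "1", 8, "C7"),
    ("2020", "1", 12, "D9"),
]

# Count how often each distinct row occurs, keeping those seen more than once.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

# Most frequently occurring duplicates first, as in the table above.
for row, n in sorted(duplicates.items(), key=lambda item: item[1], reverse=True):
    print(row, n)

n_duplicate_groups = len(duplicates)         # distinct rows that repeat
n_rows_in_groups = sum(duplicates.values())  # rows belonging to those groups
```

With YEAR and MONTH constant, a duplicate here simply means two rows that share the same SPAM_STTEMNT_VALUE and GRAD pair.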