Overview

Dataset statistics

Number of variables: 9
Number of observations: 200
Missing cells: 200
Missing cells (%): 11.1%
Duplicate rows: 10
Duplicate rows (%): 5.0%
Total size in memory: 15.6 KiB
Average record size in memory: 79.7 B
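
The percentages above follow from the raw counts. As a minimal sketch (the helper name `pct` is hypothetical, not part of the profiling tool), the missing-cell share is the missing cells divided by variables times observations, and the duplicate share is the duplicate rows divided by observations:

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_vars, n_obs = 9, 200           # counts taken from the statistics above
total_cells = n_vars * n_obs     # 1800 cells in the whole table

print(pct(200, total_cells))     # missing cells (%)  -> 11.1
print(pct(10, n_obs))            # duplicate rows (%) -> 5.0
```

All 200 missing cells come from the single column SHARE_INFO, which is empty in every row.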

Variable types

Categorical: 6
Unsupported: 1
Numeric: 2

Alerts

YEAR has constant value "2020" (Constant)
MONTH has constant value "2" (Constant)
DATE has constant value "4" (Constant)
DAYS has constant value "TUE" (Constant)
TEL_NO has constant value "**********" (Constant)
Dataset has 10 (5.0%) duplicate rows (Duplicates)
SHARE_CNT is highly overall correlated with SAFE_CNT (High correlation)
SAFE_CNT is highly overall correlated with SHARE_CNT (High correlation)
SHARE_INFO has 200 (100.0%) missing values (Missing)
SHARE_INFO is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
SHARE_CNT has 140 (70.0%) zeros (Zeros)
SAFE_CNT has 139 (69.5%) zeros (Zeros)
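
The Constant and Missing alerts can be reproduced with a simple check. The sketch below is an illustration, not the profiling tool's own logic: it uses three rows copied from the Sample section (TEL_NO is omitted because it is masked in the report), treats `None` as missing, and the function names are hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def distinct_values(rows: list[Row], column: str) -> set:
    """Distinct non-missing values of one column (None means missing)."""
    return {row[column] for row in rows if row[column] is not None}


def constant_columns(rows: list[Row]) -> list[str]:
    """Columns that hold exactly one distinct non-missing value."""
    return [c for c in rows[0] if len(distinct_values(rows, c)) == 1]


def all_missing_columns(rows: list[Row]) -> list[str]:
    """Columns in which every value is missing."""
    return [c for c in rows[0] if not distinct_values(rows, c)]


# Rows 0, 6 and 198 from the Sample section of this report.
base = {"YEAR": "2020", "MONTH": "2", "DATE": "4", "DAYS": "TUE", "SHARE_INFO": None}
rows = [
    {**base, "TIMES": "2322", "SHARE_CNT": 100, "SAFE_CNT": 25641},
    {**base, "TIMES": "2323", "SHARE_CNT": 25, "SAFE_CNT": 164},
    {**base, "TIMES": "2325", "SHARE_CNT": 68, "SAFE_CNT": 93269},
]

print(constant_columns(rows))      # ['YEAR', 'MONTH', 'DATE', 'DAYS']
print(all_missing_columns(rows))   # ['SHARE_INFO']
```

Columns flagged this way carry no information for modelling and are usual candidates for removal before further analysis.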

Reproduction

Analysis started: 2023-12-10 06:42:18.107327
Analysis finished: 2023-12-10 06:42:19.170699
Duration: 1.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

YEAR
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

2020: 200

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   200    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

MONTH
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

2: 200

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      200    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

DATE
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

4: 200

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value  Count  Frequency (%)
4      200    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

TIMES
Categorical

Distinct: 4
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

2323: 101
2324: 92
2322: 5
2325: 2

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2322
2nd row: 2322
3rd row: 2322
4th row: 2322
5th row: 2322

Common Values

Value  Count  Frequency (%)
2323   101    50.5%
2324   92     46.0%
2322   5      2.5%
2325   2      1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

DAYS
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

TUE: 200

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TUE
2nd row: TUE
3rd row: TUE
4th row: TUE
5th row: TUE

Common Values

Value  Count  Frequency (%)
TUE    200    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

TEL_NO
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

**********: 200

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: **********
2nd row: **********
3rd row: **********
4th row: **********
5th row: **********

Common Values

Value       Count  Frequency (%)
**********  200    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

SHARE_INFO
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 200
Missing (%): 100.0%
Memory size: 1.9 KiB

SHARE_CNT
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 20
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.05
Minimum: 0
Maximum: 100
Zeros: 140
Zeros (%): 70.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 68
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 68

Descriptive statistics

Standard deviation: 41.13915
Coefficient of variation (CV): 1.6422814
Kurtosis: -0.64917467
Mean: 25.05
Median Absolute Deviation (MAD): 0
Skewness: 1.122706
Sum: 5010
Variance: 1692.4296
Monotonicity: Not monotonic
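
Several of these figures are derived from the others, which gives a quick consistency check. The sketch below assumes the usual definitions (sum = mean × n, CV = standard deviation / mean, variance = standard deviation squared); the function name `derived_stats` is hypothetical.

```python
def derived_stats(mean: float, std: float, n: int) -> dict[str, float]:
    """Recompute sum, CV and variance from mean, standard deviation and count."""
    return {
        "sum": mean * n,          # total of all observations
        "cv": std / mean,         # coefficient of variation
        "variance": std ** 2,     # variance is the squared standard deviation
    }


# SHARE_CNT values reported above: mean 25.05, std 41.13915, 200 observations.
stats = derived_stats(mean=25.05, std=41.13915, n=200)

print(round(stats["sum"]))          # 5010
print(round(stats["cv"], 4))        # 1.6423
print(round(stats["variance"], 1))  # 1692.4
```

A CV above 1 together with a median of 0 reflects how strongly the 140 zero values dominate the column.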

[Figure: histogram with fixed size bins (bins=20)]

Common values

Value              Count  Frequency (%)
0                  140    70.0%
100                21     10.5%
99                 9      4.5%
68                 6      3.0%
97                 4      2.0%
98                 4      2.0%
71                 2      1.0%
93                 2      1.0%
90                 1      0.5%
4                  1      0.5%
Other values (10)  10     5.0%

Minimum 10 values

Value  Count  Frequency (%)
0      140    70.0%
4      1      0.5%
5      1      0.5%
16     1      0.5%
20     1      0.5%
22     1      0.5%
23     1      0.5%
25     1      0.5%
64     1      0.5%
68     6      3.0%

Maximum 10 values

Value  Count  Frequency (%)
100    21     10.5%
99     9      4.5%
98     4      2.0%
97     4      2.0%
93     2      1.0%
90     1      0.5%
84     1      0.5%
81     1      0.5%
71     2      1.0%
69     1      0.5%

SAFE_CNT
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 41
Distinct (%): 20.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8044.305
Minimum: 0
Maximum: 93269
Zeros: 139
Zeros (%): 69.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 623.75
95-th percentile: 53933.9
Maximum: 93269
Range: 93269
Interquartile range (IQR): 623.75

Descriptive statistics

Standard deviation: 20044.355
Coefficient of variation (CV): 2.4917448
Kurtosis: 7.8626277
Mean: 8044.305
Median Absolute Deviation (MAD): 0
Skewness: 2.8697331
Sum: 1608861
Variance: 4.0177617 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=41)]

Common values

Value              Count  Frequency (%)
0                  139    69.5%
93269              5      2.5%
28882              4      2.0%
51054              4      2.0%
10295              3      1.5%
1236               3      1.5%
25641              2      1.0%
2124               2      1.0%
23257              2      1.0%
965                2      1.0%
Other values (31)  34     17.0%

Minimum 10 values

Value  Count  Frequency (%)
0      139    69.5%
1      1      0.5%
5      1      0.5%
9      1      0.5%
57     1      0.5%
145    1      0.5%
164    1      0.5%
176    1      0.5%
263    1      0.5%
459    1      0.5%

Maximum 10 values

Value  Count  Frequency (%)
93269  5      2.5%
65599  2      1.0%
65359  2      1.0%
54255  1      0.5%
53917  1      0.5%
52265  1      0.5%
51054  4      2.0%
49464  1      0.5%
41120  1      0.5%
30077  1      0.5%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

           TIMES  SHARE_CNT  SAFE_CNT
TIMES      1.000  0.062      0.545
SHARE_CNT  0.062  1.000      0.740
SAFE_CNT   0.545  0.740      1.000

[Figure: correlation heatmap]

           SHARE_CNT  SAFE_CNT  TIMES
SHARE_CNT  1.000      0.962     0.083
SAFE_CNT   0.962      1.000     0.265
TIMES      0.083      0.265     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     YEAR  MONTH  DATE  TIMES  DAYS  TEL_NO      SHARE_INFO  SHARE_CNT  SAFE_CNT
0    2020  2      4     2322   TUE   **********  <NA>        100        25641
1    2020  2      4     2322   TUE   **********  <NA>        100        23257
2    2020  2      4     2322   TUE   **********  <NA>        99         65359
3    2020  2      4     2322   TUE   **********  <NA>        0          0
4    2020  2      4     2322   TUE   **********  <NA>        0          0
5    2020  2      4     2323   TUE   **********  <NA>        0          0
6    2020  2      4     2323   TUE   **********  <NA>        25         164
7    2020  2      4     2323   TUE   **********  <NA>        0          0
8    2020  2      4     2323   TUE   **********  <NA>        0          0
9    2020  2      4     2323   TUE   **********  <NA>        68         93269

Last rows

     YEAR  MONTH  DATE  TIMES  DAYS  TEL_NO      SHARE_INFO  SHARE_CNT  SAFE_CNT
190  2020  2      4     2324   TUE   **********  <NA>        0          0
191  2020  2      4     2324   TUE   **********  <NA>        0          0
192  2020  2      4     2324   TUE   **********  <NA>        0          0
193  2020  2      4     2324   TUE   **********  <NA>        20         176
194  2020  2      4     2324   TUE   **********  <NA>        98         965
195  2020  2      4     2324   TUE   **********  <NA>        0          0
196  2020  2      4     2324   TUE   **********  <NA>        0          0
197  2020  2      4     2324   TUE   **********  <NA>        0          0
198  2020  2      4     2325   TUE   **********  <NA>        68         93269
199  2020  2      4     2325   TUE   **********  <NA>        0          0

Duplicate rows

Most frequently occurring

   YEAR  MONTH  DATE  TIMES  DAYS  TEL_NO      SHARE_CNT  SAFE_CNT  # duplicates
1  2020  2      4     2323   TUE   **********  0          0         70
7  2020  2      4     2324   TUE   **********  0          0         66
2  2020  2      4     2323   TUE   **********  68         93269     3
4  2020  2      4     2323   TUE   **********  97         51054     3
5  2020  2      4     2323   TUE   **********  99         10295     3
6  2020  2      4     2323   TUE   **********  100        28882     3
0  2020  2      4     2322   TUE   **********  0          0         2
3  2020  2      4     2323   TUE   **********  93         6654      2
8  2020  2      4     2324   TUE   **********  100        1236      2
9  2020  2      4     2324   TUE   **********  100        2124      2
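
The table lists 10 distinct rows that occur more than once, which appears to be what the "Duplicate rows: 10" figure in the overview counts. A minimal sketch of how such a table can be built, assuming each row is reduced to a tuple of its non-constant columns (TIMES, SHARE_CNT, SAFE_CNT); the input list is a small illustrative sample, not the full dataset, and `duplicate_groups` is a hypothetical helper:

```python
from collections import Counter

Row = tuple[str, int, int]  # (TIMES, SHARE_CNT, SAFE_CNT)


def duplicate_groups(rows: list[Row]) -> list[tuple[Row, int]]:
    """Return rows that occur more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: item[1], reverse=True)


# Illustrative sample only; values are taken from rows shown in this report.
rows = [
    ("2322", 0, 0),
    ("2322", 0, 0),
    ("2323", 68, 93269),
    ("2323", 25, 164),
    ("2323", 68, 93269),
    ("2323", 68, 93269),
]

for row, n in duplicate_groups(rows):
    print(row, n)
# ('2323', 68, 93269) 3
# ('2322', 0, 0) 2
```

Note that the SHARE_INFO column does not appear in the table above, so the comparison in the report seems to exclude the unsupported column.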