Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 44.3 B

Variable types

Numeric: 2
Categorical: 1
DateTime: 2

Alerts

anl_base_dt has constant value "" [Constant]
seq is highly overall correlated with species_master_seq and 1 other field [High correlation]
species_master_seq is highly overall correlated with seq and 1 other field [High correlation]
loan_count is highly overall correlated with seq and 1 other field [High correlation]
loan_count is highly imbalanced (80.6%) [Imbalance]
seq has unique values [Unique]
species_master_seq has unique values [Unique]
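
The 80.6% imbalance figure for loan_count can be reproduced from its two category counts (97 and 3). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. The report does not state its formula, so `imbalance_score` is an illustrative helper, not the profiler's own code.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = single class)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits; zero-count classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# loan_count: value 4 occurs 97 times, value 1 occurs 3 times.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # prints 80.6%
```

Under this assumption a perfectly balanced two-class column scores 0%, so 80.6% reflects how strongly the value 4 dominates.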

Reproduction

Analysis started: 2023-12-10 10:20:41.421744
Analysis finished: 2023-12-10 10:20:42.559608
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

seq
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9579647 × 10^8
Minimum: 5.9558642 × 10^8
Maximum: 5.9580302 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5.9558642 × 10^8
5-th percentile: 5.9580292 × 10^8
Q1: 5.9580294 × 10^8
median: 5.9580297 × 10^8
Q3: 5.9580299 × 10^8
95-th percentile: 5.9580301 × 10^8
Maximum: 5.9580302 × 10^8
Range: 216598
Interquartile range (IQR): 50.5

Descriptive statistics

Standard deviation: 37126.539
Coefficient of variation (CV): 6.231413 × 10^-5
Kurtosis: 29.897737
Mean: 5.9579647 × 10^8
Median Absolute Deviation (MAD): 25.5
Skewness: -5.5946443
Sum: 5.9579647 × 10^10
Variance: 1.3783799 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
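
Several of the descriptive statistics for seq are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded values printed in the report, so agreement is only expected to the displayed precision. The variable names are illustrative.

```python
import math

# Values copied from the seq statistics in the report (already rounded there).
n = 100                # number of observations
mean = 5.9579647e8     # reported mean
std = 37126.539        # reported standard deviation

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the squared standard deviation
total = n * mean       # sum of all values

print(f"CV       = {cv:.6e}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")

# Compare against the figures shown in the report, allowing for rounding.
assert math.isclose(cv, 6.231413e-5, rel_tol=1e-6)
assert math.isclose(variance, 1.3783799e9, rel_tol=1e-6)
```

The tiny CV says that, relative to its magnitude, seq barely varies. The large negative skewness comes from the three values near 595586420 that sit far below the rest.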
Common values

Value              Count  Frequency (%)
595802919          1      1.0%
595802983          1      1.0%
595802993          1      1.0%
595802992          1      1.0%
595802991          1      1.0%
595802990          1      1.0%
595802989          1      1.0%
595802988          1      1.0%
595802987          1      1.0%
595802986          1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value      Count  Frequency (%)
595586420  1      1.0%
595586421  1      1.0%
595586422  1      1.0%
595802919  1      1.0%
595802921  1      1.0%
595802922  1      1.0%
595802923  1      1.0%
595802924  1      1.0%
595802925  1      1.0%
595802927  1      1.0%

Maximum 10 values

Value      Count  Frequency (%)
595803018  1      1.0%
595803017  1      1.0%
595803016  1      1.0%
595803015  1      1.0%
595803014  1      1.0%
595803013  1      1.0%
595803012  1      1.0%
595803011  1      1.0%
595803010  1      1.0%
595803009  1      1.0%

species_master_seq
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2056507.1
Minimum: 1916668
Maximum: 6351576
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1916668
5-th percentile: 1917151.1
Q1: 1920863.2
median: 1924120.5
Q3: 1926871.2
95-th percentile: 1929703.2
Maximum: 6351576
Range: 4434908
Interquartile range (IQR): 6008

Descriptive statistics

Standard deviation: 759154.12
Coefficient of variation (CV): 0.36914734
Kurtosis: 29.89597
Mean: 2056507.1
Median Absolute Deviation (MAD): 3223
Skewness: 5.5944045
Sum: 2.0565071 × 10^8
Variance: 5.7631497 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1916668            1      1.0%
1926222            1      1.0%
1926568            1      1.0%
1926564            1      1.0%
1926560            1      1.0%
1926558            1      1.0%
1926555            1      1.0%
1926553            1      1.0%
1926549            1      1.0%
1926388            1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value    Count  Frequency (%)
1916668  1      1.0%
1916676  1      1.0%
1916794  1      1.0%
1916820  1      1.0%
1917134  1      1.0%
1917152  1      1.0%
1917410  1      1.0%
1917760  1      1.0%
1917765  1      1.0%
1917913  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
6351576  1      1.0%
6351548  1      1.0%
6351506  1      1.0%
1929714  1      1.0%
1929707  1      1.0%
1929703  1      1.0%
1929699  1      1.0%
1929687  1      1.0%
1929665  1      1.0%
1929409  1      1.0%
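
The quantile statistics and extreme values for species_master_seq suggest that three observations sit far from the rest, which would explain the high skewness and kurtosis. One common way to check this is Tukey's 1.5 × IQR rule. The report itself does not apply that rule, so the sketch below is an outside technique fed with the reported quartiles and a handful of the listed extreme values.

```python
# Quartiles and extremes copied from the species_master_seq section of the report.
q1 = 1920863.2
q3 = 1926871.2
minimum = 1916668
maximum = 6351576

value_range = maximum - minimum   # reported Range: 4434908
iqr = q3 - q1                     # reported IQR: 6008

# Tukey fences: values outside [lower, upper] are flagged as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# A few of the smallest and largest values listed in the report.
extremes = [1916668, 1916676, 1916794, 1929699, 1929703, 1929707,
            1929714, 6351506, 6351548, 6351576]
outliers = [v for v in extremes if v < lower or v > upper]

print("range:", value_range)
print("fences:", round(lower, 1), round(upper, 1))
print("outliers:", outliers)
```

With these inputs only the three values above 6.35 million fall outside the fences. The listed minimum stays inside them.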

loan_count
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 4 (97), 1 (3)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 1
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value  Count  Frequency (%)
4      97     97.0%
1      3      3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
4      97     97.0%
1      3      3.0%

anl_dt
Date

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-02-01 00:00:00
Maximum: 2021-03-01 00:00:00
Histogram with fixed size bins (bins=2)

anl_base_dt
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-12-03 00:00:00
Maximum: 2021-12-03 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[4 interaction plots, images omitted]

Correlations

                    seq    species_master_seq  loan_count  anl_dt
seq                 1.000  0.919               0.919       0.919
species_master_seq  0.919  1.000               0.963       0.963
loan_count          0.919  0.963               1.000       0.963
anl_dt              0.919  0.963               0.963       1.000

                    seq    species_master_seq  loan_count
seq                 1.000  0.825               0.826
species_master_seq  0.825  1.000               0.826
loan_count          0.826  0.826               1.000
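
The report does not name the coefficient behind each matrix. One reading consistent with the 0.825 entry for seq and species_master_seq is a Spearman rank correlation, under an assumption that cannot be fully verified from the rows shown: the three rows with the smallest seq values carry the three largest species_master_seq values, and the remaining 97 rows increase together. The sketch below builds those hypothetical ranks and applies the tie-free Spearman formula. `spearman_from_ranks` is an illustrative helper.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rho for two tie-free rank lists: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


n = 100
# Rows ordered by seq, so seq ranks are simply 1..100.
seq_ranks = list(range(1, n + 1))
# ASSUMPTION: the 3 lowest-seq rows hold the 3 highest species_master_seq values
# (ranks 98..100); the other 97 rows are perfectly monotone (ranks 1..97).
species_ranks = [98, 99, 100] + list(range(1, 98))

rho = spearman_from_ranks(seq_ranks, species_ranks)
print(round(rho, 3))  # prints 0.825
```

If this reading is right, the correlation falls short of 1 only because of the three out-of-pattern rows. It remains a sketch under the stated assumption rather than a statement about how the profiler computed the matrix.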

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

    seq        species_master_seq  loan_count  anl_dt   anl_base_dt
0   595802919  1916668             4           2021-02  2021-12-03
1   595586420  6351506             1           2021-03  2021-12-03
2   595802921  1916676             4           2021-02  2021-12-03
3   595802922  1916794             4           2021-02  2021-12-03
4   595802923  1916820             4           2021-02  2021-12-03
5   595802924  1917134             4           2021-02  2021-12-03
6   595802925  1917152             4           2021-02  2021-12-03
7   595586421  6351548             1           2021-03  2021-12-03
8   595802927  1917410             4           2021-02  2021-12-03
9   595802928  1917760             4           2021-02  2021-12-03

Last rows

    seq        species_master_seq  loan_count  anl_dt   anl_base_dt
90  595803009  1928958             4           2021-02  2021-12-03
91  595803010  1929037             4           2021-02  2021-12-03
92  595803011  1929358             4           2021-02  2021-12-03
93  595803012  1929409             4           2021-02  2021-12-03
94  595803013  1929665             4           2021-02  2021-12-03
95  595803014  1929687             4           2021-02  2021-12-03
96  595803015  1929699             4           2021-02  2021-12-03
97  595803016  1929703             4           2021-02  2021-12-03
98  595803017  1929707             4           2021-02  2021-12-03
99  595803018  1929714             4           2021-02  2021-12-03