Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 91.3 B

Variable types

Categorical: 5
Numeric: 5

Alerts

tms is highly overall correlated with race_day (High correlation)
race_day is highly overall correlated with tms and 1 other field (High correlation)
tak is highly overall correlated with starting and 1 other field (High correlation)
starting is highly overall correlated with tak (High correlation)
eclnt is highly overall correlated with tak (High correlation)
stnd_year is highly overall correlated with race_day (High correlation)
stnd_year is highly imbalanced (80.6%) (Imbalance)
repr is highly imbalanced (52.3%) (Imbalance)
race_day has unique values (Unique)
tak has 4 (4.0%) zeros (Zeros)
starting has 9 (9.0%) zeros (Zeros)
eclnt has 9 (9.0%) zeros (Zeros)
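
The imbalance percentages in the alerts above are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the category frequencies and k is the number of categories. The sketch below is an assumption about how such a figure can be derived, not a copy of the profiler's code. `imbalance_score` is a hypothetical helper name, and the counts are the ones this report lists for stnd_year (97, 3) and repr (85, 9, 6).

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts (hypothetical helper)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits; empty categories contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

stnd_year_score = imbalance_score([97, 3])    # counts for 2019 and 2021
repr_score = imbalance_score([85, 9, 6])      # counts for 0, 1 and 2

print(f"stnd_year: {stnd_year_score:.1%}")
print(f"repr:      {repr_score:.1%}")
```

A score near 0 means the categories are evenly spread; a score near 1 means a single category dominates.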

Reproduction

Analysis started: 2023-12-10 10:00:15.419189
Analysis finished: 2023-12-10 10:00:21.460938
Duration: 6.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
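
A report of this kind can be regenerated with the `ProfileReport` class of ydata-profiling. The sketch below is a minimal example under stated assumptions: the input file name, the report title and the `build_report` helper are all hypothetical, because this report does not say where the data came from or which settings were used.

```python
from pathlib import Path

def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch; paths are hypothetical)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is an assumption; the original title is not preserved here.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("races.csv")` would write `report.html` next to the script, where `races.csv` stands in for the unknown source file.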

Variables

stnd_year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 2019 (97), 2021 (3)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2021
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value  Count  Frequency (%)
2019   97     97.0%
2021   3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

tms
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.16
Minimum: 10
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 17
Median: 34.5
Q3: 43
95-th percentile: 50
Maximum: 51
Range: 41
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 13.652039
Coefficient of variation (CV): 0.43812707
Kurtosis: -1.4564971
Mean: 31.16
Median Absolute Deviation (MAD): 12.5
Skewness: -0.21943898
Sum: 3116
Variance: 186.37818
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=34)]
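
Several of the tms figures above are derived from others: the coefficient of variation is the standard deviation divided by the mean, the range is maximum minus minimum, the IQR is Q3 minus Q1, and the variance is the square of the standard deviation. The short check below recomputes them from the values printed in this report. It is only an illustration of those relationships, not the profiler's implementation.

```python
import math

# Values copied from the tms statistics in this report.
mean, std, variance = 31.16, 13.652039, 186.37818
minimum, q1, q3, maximum = 10, 17, 43, 51

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV = {cv:.8f}")
print(f"range = {value_range}, IQR = {iqr}")
print(f"sqrt(variance) = {math.sqrt(variance):.6f}")
```

The same relationships hold for the other numeric variables, so they make a quick sanity check when copying figures out of the report.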
Common values

Value              Count  Frequency (%)
11                 5      5.0%
12                 4      4.0%
43                 3      3.0%
19                 3      3.0%
16                 3      3.0%
15                 3      3.0%
14                 3      3.0%
13                 3      3.0%
10                 3      3.0%
40                 3      3.0%
Other values (24)  67     67.0%

Smallest values

Value  Count  Frequency (%)
10     3      3.0%
11     5      5.0%
12     4      4.0%
13     3      3.0%
14     3      3.0%
15     3      3.0%
16     3      3.0%
17     3      3.0%
18     3      3.0%
19     3      3.0%

Largest values

Value  Count  Frequency (%)
51     3      3.0%
50     3      3.0%
49     3      3.0%
48     3      3.0%
47     3      3.0%
46     3      3.0%
45     2      2.0%
44     3      3.0%
43     3      3.0%
42     3      3.0%

day_ord
Categorical

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 1 (35), 3 (34), 2 (31)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 3
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
1      35     35.0%
3      34     34.0%
2      31     31.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

race_day
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20191386
Minimum: 20190308
Maximum: 20210319
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190308
5-th percentile: 20190317
Q1: 20190504
Median: 20190904
Q3: 20191108
95-th percentile: 20191227
Maximum: 20210319
Range: 20011
Interquartile range (IQR): 604.5

Descriptive statistics

Standard deviation: 3360.1838
Coefficient of variation (CV): 0.0001664167
Kurtosis: 29.334365
Mean: 20191386
Median Absolute Deviation (MAD): 262
Skewness: 5.5177837
Sum: 2.0191386 × 10^9
Variance: 11290835
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common values

Value              Count  Frequency (%)
20190726           1      1.0%
20191019           1      1.0%
20190324           1      1.0%
20190323           1      1.0%
20190322           1      1.0%
20190317           1      1.0%
20190316           1      1.0%
20190315           1      1.0%
20190310           1      1.0%
20190309           1      1.0%
Other values (90)  90     90.0%

Smallest values

Value     Count  Frequency (%)
20190308  1      1.0%
20190309  1      1.0%
20190310  1      1.0%
20190315  1      1.0%
20190316  1      1.0%
20190317  1      1.0%
20190322  1      1.0%
20190323  1      1.0%
20190324  1      1.0%
20190329  1      1.0%

Largest values

Value     Count  Frequency (%)
20210319  1      1.0%
20210314  1      1.0%
20210313  1      1.0%
20191229  1      1.0%
20191228  1      1.0%
20191227  1      1.0%
20191222  1      1.0%
20191221  1      1.0%
20191220  1      1.0%
20191215  1      1.0%
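
The race_day values look like calendar dates stored as YYYYMMDD integers. That reading is an assumption based on values such as 20190308 and on the leading four digits matching stnd_year. Under that assumption, statistics computed on the raw integers (mean, range, standard deviation) are not calendar quantities. The sketch below parses the reported minimum and maximum and compares the numeric range with the elapsed number of days; `parse_race_day` is a hypothetical helper name.

```python
from datetime import datetime

def parse_race_day(value):
    """Convert an integer such as 20190308 to a date (assumes YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_race_day(20190308)   # reported minimum
last = parse_race_day(20210319)    # reported maximum

numeric_range = 20210319 - 20190308    # what the report lists as "Range"
calendar_days = (last - first).days    # elapsed days between the two dates

print(numeric_range, calendar_days)
```

Converting the column to a date type before profiling would likely give more meaningful summaries than treating it as a real number.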

tak
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 18
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.22
Minimum: 0
Maximum: 17
Zeros: 4
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4.75
Median: 7
Q3: 10
95-th percentile: 13.05
Maximum: 17
Range: 17
Interquartile range (IQR): 5.25

Descriptive statistics

Standard deviation: 3.9043617
Coefficient of variation (CV): 0.54077032
Kurtosis: -0.52451253
Mean: 7.22
Median Absolute Deviation (MAD): 3
Skewness: 0.10128905
Sum: 722
Variance: 15.24404
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=18)]
Common values

Value             Count  Frequency (%)
6                 14     14.0%
9                 13     13.0%
10                12     12.0%
2                 10     10.0%
8                 7      7.0%
5                 7      7.0%
3                 5      5.0%
13                5      5.0%
7                 5      5.0%
12                4      4.0%
Other values (8)  18     18.0%

Smallest values

Value  Count  Frequency (%)
0      4      4.0%
1      2      2.0%
2      10     10.0%
3      5      5.0%
4      4      4.0%
5      7      7.0%
6      14     14.0%
7      5      5.0%
8      7      7.0%
9      13     13.0%

Largest values

Value  Count  Frequency (%)
17     1      1.0%
16     1      1.0%
15     1      1.0%
14     2      2.0%
13     5      5.0%
12     4      4.0%
11     3      3.0%
10     12     12.0%
9      13     13.0%
8      7      7.0%

rora
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 0 (60), 1 (25), 2 (10), 3 (3), 4 (2)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 2
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      60     60.0%
1      25     25.0%
2      10     10.0%
3      3      3.0%
4      2      2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

repr
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 0 (85), 1 (9), 2 (6)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      85     85.0%
1      9      9.0%
2      6      6.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

starting
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.83
Minimum: 0
Maximum: 9
Zeros: 9
Zeros (%): 9.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.2920812
Coefficient of variation (CV): 0.59845463
Kurtosis: -0.96278276
Mean: 3.83
Median Absolute Deviation (MAD): 2
Skewness: 0.0074873425
Sum: 383
Variance: 5.2536364
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=10)]
Common values

Value  Count  Frequency (%)
2      18     18.0%
4      16     16.0%
5      14     14.0%
6      12     12.0%
7      12     12.0%
0      9      9.0%
3      8      8.0%
1      8      8.0%
8      2      2.0%
9      1      1.0%

Smallest values

Value  Count  Frequency (%)
0      9      9.0%
1      8      8.0%
2      18     18.0%
3      8      8.0%
4      16     16.0%
5      14     14.0%
6      12     12.0%
7      12     12.0%
8      2      2.0%
9      1      1.0%

Largest values

Value  Count  Frequency (%)
9      1      1.0%
8      2      2.0%
7      12     12.0%
6      12     12.0%
5      14     14.0%
4      16     16.0%
3      8      8.0%
2      18     18.0%
1      8      8.0%
0      9      9.0%

eclnt
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.4
Minimum: 0
Maximum: 8
Zeros: 9
Zeros (%): 9.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 5
95-th percentile: 8
Maximum: 8
Range: 8
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.1177461
Coefficient of variation (CV): 0.6228665
Kurtosis: -0.34396676
Mean: 3.4
Median Absolute Deviation (MAD): 1
Skewness: 0.37635367
Sum: 340
Variance: 4.4848485
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]
Common values

Value  Count  Frequency (%)
3      18     18.0%
4      18     18.0%
2      17     17.0%
5      13     13.0%
1      10     10.0%
0      9      9.0%
8      6      6.0%
6      6      6.0%
7      3      3.0%

Smallest values

Value  Count  Frequency (%)
0      9      9.0%
1      10     10.0%
2      17     17.0%
3      18     18.0%
4      18     18.0%
5      13     13.0%
6      6      6.0%
7      3      3.0%
8      6      6.0%

Largest values

Value  Count  Frequency (%)
8      6      6.0%
7      3      3.0%
6      6      6.0%
5      13     13.0%
4      18     18.0%
3      18     18.0%
2      17     17.0%
1      10     10.0%
0      9      9.0%

get_eclet
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Category counts: 0 (48), 1 (28), 2 (22), 5 (1), 4 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 2
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      48     48.0%
1      28     28.0%
2      22     22.0%
5      1      1.0%
4      1      1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: 25 interaction plots]

Correlations

[Figure: correlation heatmap]

           stnd_year  tms    day_ord  race_day  tak    rora   repr   starting  eclnt  get_eclet
stnd_year  1.000      0.258  0.000    0.963     0.000  0.000  0.091  0.000     0.260  0.000
tms        0.258      1.000  0.000    0.276     0.209  0.000  0.203  0.238     0.081  0.283
day_ord    0.000      0.000  1.000    0.000     0.543  0.000  0.000  0.426     0.483  0.407
race_day   0.963      0.276  0.000    1.000     0.000  0.000  0.091  0.000     0.258  0.000
tak        0.000      0.209  0.543    0.000     1.000  0.000  0.000  0.833     0.604  0.432
rora       0.000      0.000  0.000    0.000     0.000  1.000  0.324  0.000     0.149  0.000
repr       0.091      0.203  0.000    0.091     0.000  0.324  1.000  0.218     0.000  0.260
starting   0.000      0.238  0.426    0.000     0.833  0.000  0.218  1.000     0.262  0.000
eclnt      0.260      0.081  0.483    0.258     0.604  0.149  0.000  0.262     1.000  0.076
get_eclet  0.000      0.283  0.407    0.000     0.432  0.000  0.260  0.000     0.076  1.000

[Figure: correlation heatmap]

           stnd_year  rora   day_ord  repr   get_eclet
stnd_year  1.000      0.000  0.000    0.150  0.000
rora       0.000      1.000  0.000    0.256  0.000
day_ord    0.000      0.000  1.000    0.000  0.333
repr       0.150      0.256  0.000    1.000  0.200
get_eclet  0.000      0.000  0.333    0.200  1.000

[Figure: correlation heatmap]

           tms     race_day  tak     starting  eclnt   stnd_year  day_ord  rora   repr   get_eclet
tms        1.000   0.844     -0.245  -0.371    0.082   0.246      0.000    0.000  0.083  0.161
race_day   0.844   1.000     -0.383  -0.457    -0.062  0.826      0.000    0.000  0.150  0.000
tak        -0.245  -0.383    1.000   0.790     0.685   0.000      0.371    0.000  0.000  0.186
starting   -0.371  -0.457    0.790   1.000     0.348   0.000      0.271    0.000  0.124  0.000
eclnt      0.082   -0.062    0.685   0.348     1.000   0.249      0.234    0.079  0.000  0.031
stnd_year  0.246   0.826     0.000   0.000     0.249   1.000      0.000    0.000  0.150  0.000
day_ord    0.000   0.000     0.371   0.271     0.234   0.000      1.000    0.000  0.000  0.333
rora       0.000   0.000     0.000   0.000     0.079   0.000      0.000    1.000  0.256  0.000
repr       0.083   0.150     0.000   0.124     0.000   0.150      0.000    0.256  1.000  0.200
get_eclet  0.161   0.000     0.186   0.000     0.031   0.000      0.333    0.000  0.200  1.000
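
The High correlation alerts in the Overview can be cross-checked against the third correlation matrix in this section. The sketch below lists every variable pair whose absolute coefficient reaches a cutoff. The cutoff of 0.5 is an assumption, since the report does not state the threshold it used, and `high_pairs` is a hypothetical helper name. The values are copied from the matrix.

```python
# Column order and values copied from the third correlation matrix in this report.
names = ["tms", "race_day", "tak", "starting", "eclnt",
         "stnd_year", "day_ord", "rora", "repr", "get_eclet"]
matrix = [
    [1.000, 0.844, -0.245, -0.371, 0.082, 0.246, 0.000, 0.000, 0.083, 0.161],
    [0.844, 1.000, -0.383, -0.457, -0.062, 0.826, 0.000, 0.000, 0.150, 0.000],
    [-0.245, -0.383, 1.000, 0.790, 0.685, 0.000, 0.371, 0.000, 0.000, 0.186],
    [-0.371, -0.457, 0.790, 1.000, 0.348, 0.000, 0.271, 0.000, 0.124, 0.000],
    [0.082, -0.062, 0.685, 0.348, 1.000, 0.249, 0.234, 0.079, 0.000, 0.031],
    [0.246, 0.826, 0.000, 0.000, 0.249, 1.000, 0.000, 0.000, 0.150, 0.000],
    [0.000, 0.000, 0.371, 0.271, 0.234, 0.000, 1.000, 0.000, 0.000, 0.333],
    [0.000, 0.000, 0.000, 0.000, 0.079, 0.000, 0.000, 1.000, 0.256, 0.000],
    [0.083, 0.150, 0.000, 0.124, 0.000, 0.150, 0.000, 0.256, 1.000, 0.200],
    [0.161, 0.000, 0.186, 0.000, 0.031, 0.000, 0.333, 0.000, 0.200, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

def high_pairs(names, matrix, threshold):
    """Return (a, b, r) for each pair above the diagonal with |r| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

pairs = high_pairs(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```

With this cutoff the selected pairs match the variables named in the alerts: tms with race_day, race_day with stnd_year, and tak with both starting and eclnt.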

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    stnd_year  tms  day_ord  race_day  tak  rora  repr  starting  eclnt  get_eclet
0   2019       30   1        20190726  9    0     0     4         3      2
1   2021       12   1        20210319  5    1     0     4         2      0
2   2019       30   3        20190728  3    2     0     2         3      0
3   2019       31   1        20190802  11   0     0     3         8      0
4   2019       31   2        20190803  9    1     0     6         3      1
5   2019       31   3        20190804  6    0     0     4         2      0
6   2019       32   1        20190809  13   0     0     4         8      1
7   2021       11   3        20210314  2    0     2     2         0      0
8   2019       32   3        20190811  6    1     0     3         4      0
9   2019       33   1        20190816  10   1     0     5         6      1

Last rows

    stnd_year  tms  day_ord  race_day  tak  rora  repr  starting  eclnt  get_eclet
90  2019       18   1        20190503  9    0     0     7         2      0
91  2019       18   2        20190504  10   0     0     6         3      1
92  2019       18   3        20190505  6    0     0     5         1      0
93  2019       19   1        20190510  10   0     1     2         4      5
94  2019       19   2        20190511  9    0     0     2         3      4
95  2019       19   3        20190512  0    0     0     0         0      0
96  2019       20   1        20190517  12   2     1     8         6      1
97  2019       20   2        20190518  10   0     2     7         4      1
98  2019       20   3        20190519  8    0     0     6         2      0
99  2019       21   1        20190524  8    1     0     5         4      0