Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 KiB
Average record size in memory: 55.3 B

Variable types

Categorical: 1
Numeric: 5

Alerts

base_month is highly overall correlated with ticket_dt and 2 other fields (High correlation)
ticket_dt is highly overall correlated with base_month and 2 other fields (High correlation)
show_dt is highly overall correlated with base_month and 2 other fields (High correlation)
base_year is highly overall correlated with base_month and 2 other fields (High correlation)
base_year is highly imbalanced (80.6%) (Imbalance)
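
The 80.6% imbalance figure for base_year is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below applies that formula to the value counts reported for base_year (97 and 3). The formula and the function name imbalance_score are assumptions made for illustration; the report itself does not state how the value is derived.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)

# base_year value counts reported in this document: 2017 -> 97, 2020 -> 3
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.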

Reproduction

Analysis started: 2023-12-10 09:46:30.177750
Analysis finished: 2023-12-10 09:46:35.640674
Duration: 5.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

base_year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
2017   97
2020   3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2020
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2017   97     97.0%
2020   3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of base_year, 2017 (97, 97.0%) and 2020 (3, 3.0%)]

base_month
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1.95
Q1: 4
Median: 5
Q3: 6
95th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5109031
Coefficient of variation (CV): 0.3284572
Kurtosis: -0.11388064
Mean: 4.6
Median Absolute Deviation (MAD): 1
Skewness: -0.45542182
Sum: 460
Variance: 2.2828283
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Most frequent values

Value  Count  Frequency (%)
5      30     30.0%
4      18     18.0%
6      18     18.0%
3      17     17.0%
7      10     10.0%
1      5      5.0%
2      2      2.0%

Smallest values

Value  Count  Frequency (%)
1      5      5.0%
2      2      2.0%
3      17     17.0%
4      18     18.0%
5      30     30.0%
6      18     18.0%
7      10     10.0%

Largest values

Value  Count  Frequency (%)
7      10     10.0%
6      18     18.0%
5      30     30.0%
4      18     18.0%
3      17     17.0%
2      2      2.0%
1      5      5.0%
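
Because base_month has only 7 distinct values and all of them are listed with their counts, the summary statistics for this variable can be rebuilt from the frequency table alone. The sketch below does this with the Python standard library. It assumes the profiler reports the sample variance (n - 1 denominator) and linearly interpolated percentiles; both assumptions are inferred from the reported numbers, not stated in the report.

```python
import statistics

# Value counts for base_month as listed in this report: value -> count
counts = {1: 5, 2: 2, 3: 17, 4: 18, 5: 30, 6: 18, 7: 10}

# Expand the counts back into the 100 individual observations, sorted ascending
values = sorted(v for v, n in counts.items() for _ in range(n))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# First of the 19 cut points that split the data into 20 groups = 5th percentile
p05 = statistics.quantiles(values, n=20, method="inclusive")[0]

print(mean, round(variance, 7), round(std, 7), round(cv, 7), p05)
```

If the assumptions hold, the printed values should agree with the Mean, Variance, Standard deviation, Coefficient of variation, and 5th percentile reported for base_month.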

base_day
Real number (ℝ)

Distinct: 26
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.93
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 8
Median: 20
Q3: 27
95th percentile: 30.05
Maximum: 31
Range: 30
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 10.011362
Coefficient of variation (CV): 0.55835818
Kurtosis: -1.3666987
Mean: 17.93
Median Absolute Deviation (MAD): 10
Skewness: -0.28559762
Sum: 1793
Variance: 100.22737
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=26)]

Most frequent values

Value              Count  Frequency (%)
30                 13     13.0%
8                  7      7.0%
2                  7      7.0%
26                 6      6.0%
10                 6      6.0%
19                 6      6.0%
6                  5      5.0%
31                 5      5.0%
20                 5      5.0%
27                 5      5.0%
Other values (16)  35     35.0%

Smallest values

Value  Count  Frequency (%)
1      3      3.0%
2      7      7.0%
3      2      2.0%
5      2      2.0%
6      5      5.0%
7      2      2.0%
8      7      7.0%
10     6      6.0%
12     1      1.0%
13     1      1.0%

Largest values

Value  Count  Frequency (%)
31     5      5.0%
30     13     13.0%
29     1      1.0%
28     3      3.0%
27     5      5.0%
26     6      6.0%
25     2      2.0%
24     3      3.0%
23     3      3.0%
22     4      4.0%

ticket_dt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 77
Distinct (%): 77.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20171357
Minimum: 20170119
Maximum: 20200113
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20170119
5th percentile: 20170224
Q1: 20170386
Median: 20170504
Q3: 20170610
95th percentile: 20170719
Maximum: 20200113
Range: 29994
Interquartile range (IQR): 224.75

Descriptive statistics

Standard deviation: 5084.341
Coefficient of variation (CV): 0.00025205746
Kurtosis: 29.846614
Mean: 20171357
Median Absolute Deviation (MAD): 107.5
Skewness: 5.5877016
Sum: 2.0171357 × 10^9
Variance: 25850523
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value              Count  Frequency (%)
20170629           4      4.0%
20170428           4      4.0%
20170525           3      3.0%
20170329           3      3.0%
20170510           2      2.0%
20170421           2      2.0%
20170531           2      2.0%
20170405           2      2.0%
20170324           2      2.0%
20170522           2      2.0%
Other values (67)  74     74.0%

Smallest values

Value     Count  Frequency (%)
20170119  1      1.0%
20170120  1      1.0%
20170214  1      1.0%
20170220  1      1.0%
20170221  1      1.0%
20170224  1      1.0%
20170227  1      1.0%
20170228  1      1.0%
20170302  1      1.0%
20170310  1      1.0%

Largest values

Value     Count  Frequency (%)
20200113  1      1.0%
20200112  1      1.0%
20200110  1      1.0%
20170727  1      1.0%
20170725  1      1.0%
20170719  2      2.0%
20170718  1      1.0%
20170714  1      1.0%
20170713  1      1.0%
20170707  1      1.0%
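
The ticket_dt values look like dates encoded as YYYYMMDD integers, yet the report profiles them as real numbers. That makes several statistics hard to interpret: the reported Range of 29994 is a difference between integers, not a number of days, and the reported Q1 of 20170386 is not a calendar date. The sketch below parses the reported Minimum and Maximum as dates to show the difference. The YYYYMMDD interpretation and the helper names are assumptions for illustration.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 20170119 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def is_valid_yyyymmdd(value):
    """Return True if the integer can be read as a real calendar date."""
    try:
        parse_yyyymmdd(value)
        return True
    except ValueError:
        return False

ticket_min = parse_yyyymmdd(20170119)  # reported Minimum
ticket_max = parse_yyyymmdd(20200113)  # reported Maximum

numeric_range = 20200113 - 20170119              # integer difference, as in the report
calendar_range = (ticket_max - ticket_min).days  # elapsed days between the two dates

print(numeric_range, calendar_range, is_valid_yyyymmdd(20170386))
```

Converting such columns to a date type before profiling would likely give more meaningful quantiles and ranges; the same reasoning applies to show_dt.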

show_dt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 41
Distinct (%): 41.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20171378
Minimum: 20170119
Maximum: 20200112
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20170119
5th percentile: 20170302
Q1: 20170408
Median: 20170506
Q3: 20170618
95th percentile: 20170727
Maximum: 20200112
Range: 29993
Interquartile range (IQR): 210

Descriptive statistics

Standard deviation: 5080.3881
Coefficient of variation (CV): 0.00025186123
Kurtosis: 29.848516
Mean: 20171378
Median Absolute Deviation (MAD): 104
Skewness: 5.5879574
Sum: 2.0171378 × 10^9
Variance: 25810343
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=41)]

Most frequent values

Value              Count  Frequency (%)
20170630           9      9.0%
20170408           7      7.0%
20170526           6      6.0%
20170302           5      5.0%
20170506           5      5.0%
20170322           4      4.0%
20170727           4      4.0%
20170530           4      4.0%
20170719           4      4.0%
20170610           3      3.0%
Other values (31)  49     49.0%

Smallest values

Value     Count  Frequency (%)
20170119  1      1.0%
20170120  1      1.0%
20170214  1      1.0%
20170224  1      1.0%
20170302  5      5.0%
20170310  1      1.0%
20170321  1      1.0%
20170322  4      4.0%
20170323  3      3.0%
20170324  1      1.0%

Largest values

Value     Count  Frequency (%)
20200112  1      1.0%
20200110  2      2.0%
20170727  4      4.0%
20170719  4      4.0%
20170707  2      2.0%
20170630  9      9.0%
20170629  1      1.0%
20170627  1      1.0%
20170624  1      1.0%
20170616  3      3.0%

ticket_cnt
Real number (ℝ)

Distinct: 53
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.25
Minimum: 1
Maximum: 320
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 10
Median: 22.5
Q3: 50
95th percentile: 121.95
Maximum: 320
Range: 319
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 49.814885
Coefficient of variation (CV): 1.269169
Kurtosis: 11.797986
Mean: 39.25
Median Absolute Deviation (MAD): 16.5
Skewness: 3.0086036
Sum: 3925
Variance: 2481.5227
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value              Count  Frequency (%)
10                 9      9.0%
30                 6      6.0%
50                 5      5.0%
3                  4      4.0%
20                 4      4.0%
2                  4      4.0%
14                 3      3.0%
7                  3      3.0%
40                 3      3.0%
1                  3      3.0%
Other values (43)  56     56.0%

Smallest values

Value  Count  Frequency (%)
1      3      3.0%
2      4      4.0%
3      4      4.0%
4      2      2.0%
5      2      2.0%
6      2      2.0%
7      3      3.0%
8      1      1.0%
10     9      9.0%
12     2      2.0%

Largest values

Value  Count  Frequency (%)
320    1      1.0%
221    1      1.0%
202    1      1.0%
167    1      1.0%
140    1      1.0%
121    1      1.0%
108    1      1.0%
100    1      1.0%
99     1      1.0%
97     1      1.0%
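
ticket_cnt is strongly right-skewed: the mean (39.25) is well above the median (22.5) and the skewness is about 3. One conventional way to quantify the long tail is Tukey's upper fence, Q3 + 1.5 × IQR. The sketch below applies it to the quartiles and largest values reported above. The fence rule is a common convention chosen here for illustration and is not something the profiler reports.

```python
# Quartiles and largest values for ticket_cnt as reported in this document
q1, q3 = 10, 50
largest_values = [320, 221, 202, 167, 140, 121, 108, 100, 99, 97]

iqr = q3 - q1                 # matches the reported interquartile range of 40
upper_fence = q3 + 1.5 * iqr  # conventional Tukey upper fence

# Values above the fence are candidates for closer inspection, not errors as such
flagged = [v for v in largest_values if v > upper_fence]

print(upper_fence, flagged)
```

Because the listed values are the largest in the column and the smallest of them falls below the fence, no other observation can exceed it.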

Interactions

[Figure: grid of 25 pairwise interaction plots between variables]

Correlations

[Figure: correlation heatmap]

            base_year  base_month  base_day  ticket_dt  show_dt  ticket_cnt
base_year   1.000      0.693       0.755     0.963      0.963    0.000
base_month  0.693      1.000       0.741     0.785      0.785    0.217
base_day    0.755      0.741       1.000     0.754      0.754    0.251
ticket_dt   0.963      0.785       0.754     1.000      0.963    0.000
show_dt     0.963      0.785       0.754     0.963      1.000    0.000
ticket_cnt  0.000      0.217       0.251     0.000      0.000    1.000

[Figure: correlation heatmap]

            base_month  base_day  ticket_dt  show_dt  ticket_cnt  base_year
base_month  1.000       0.227     0.789      0.803    -0.129      0.730
base_day    0.227       1.000     0.324      0.344    -0.059      0.397
ticket_dt   0.789       0.324     1.000      0.979    -0.145      0.826
show_dt     0.803       0.344     0.979      1.000    -0.172      0.826
ticket_cnt  -0.129      -0.059    -0.145     -0.172   1.000       0.000
base_year   0.730       0.397     0.826      0.826    0.000       1.000
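
The High correlation alerts at the top of the report can be cross-checked against the second correlation table. The sketch below lists, for each variable, the other variables whose absolute correlation reaches a cut-off. The cut-off of 0.5 is an assumption: the report does not state it, but with this value the result matches the four alerts, each naming one field "and 2 other fields".

```python
# Second correlation table from this report (rows and columns in the same order)
names = ["base_month", "base_day", "ticket_dt", "show_dt", "ticket_cnt", "base_year"]
matrix = [
    [1.000, 0.227, 0.789, 0.803, -0.129, 0.730],
    [0.227, 1.000, 0.324, 0.344, -0.059, 0.397],
    [0.789, 0.324, 1.000, 0.979, -0.145, 0.826],
    [0.803, 0.344, 0.979, 1.000, -0.172, 0.826],
    [-0.129, -0.059, -0.145, -0.172, 1.000, 0.000],
    [0.730, 0.397, 0.826, 0.826, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated anywhere in the report

def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables with |correlation| >= threshold."""
    result = {}
    for i, row in enumerate(matrix):
        partners = [names[j] for j, r in enumerate(row)
                    if j != i and abs(r) >= threshold]
        if partners:
            result[names[i]] = partners
    return result

pairs = highly_correlated(names, matrix, THRESHOLD)
for var, partners in pairs.items():
    print(f"{var}: {', '.join(partners)}")
```

Under this assumption base_day and ticket_cnt have no partner above the cut-off, which agrees with the absence of alerts for those two variables.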

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

    base_year  base_month  base_day  ticket_dt  show_dt   ticket_cnt
0   2017       1           19        20170119   20170119  83
1   2020       1           10        20200110   20200110  40
2   2017       1           20        20170120   20170120  108
3   2017       2           14        20170214   20170214  7
4   2017       2           24        20170224   20170224  202
5   2017       3           2         20170220   20170302  7
6   2017       3           2         20170221   20170302  20
7   2020       1           10        20200113   20200110  24
8   2017       3           2         20170227   20170302  71
9   2017       3           2         20170228   20170302  5

Last rows

    base_year  base_month  base_day  ticket_dt  show_dt   ticket_cnt
90  2017       7           7         20170629   20170707  8
91  2017       7           7         20170707   20170707  14
92  2017       7           19        20170713   20170719  50
93  2017       7           19        20170714   20170719  50
94  2017       7           19        20170718   20170719  43
95  2017       7           19        20170719   20170719  16
96  2017       7           27        20170629   20170727  10
97  2017       7           27        20170719   20170727  14
98  2017       7           27        20170725   20170727  10
99  2017       7           27        20170727   20170727  72
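
In the sample rows, base_year, base_month, and base_day appear to be the year, month, and day of show_dt, which would explain the high correlations among these fields. The sketch below checks that relationship on a few rows copied from the sample and also computes the number of days between ticket_dt and show_dt. The YYYYMMDD interpretation and the helper names to_date and check_row are assumptions for illustration.

```python
from datetime import datetime

# A few rows copied from the sample tables:
# (base_year, base_month, base_day, ticket_dt, show_dt, ticket_cnt)
rows = [
    (2017, 1, 19, 20170119, 20170119, 83),
    (2017, 3, 2, 20170220, 20170302, 7),
    (2020, 1, 10, 20200113, 20200110, 24),
    (2017, 7, 27, 20170629, 20170727, 10),
]

def to_date(value):
    """Read a YYYYMMDD integer as a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def check_row(row):
    """Return (base fields match show_dt, days from ticket_dt to show_dt)."""
    year, month, day, ticket_dt, show_dt, _ = row
    show = to_date(show_dt)
    matches = (show.year, show.month, show.day) == (year, month, day)
    lead_days = (show - to_date(ticket_dt)).days
    return matches, lead_days

results = [check_row(r) for r in rows]
for row, (matches, lead_days) in zip(rows, results):
    print(row[4], matches, lead_days)
```

A negative lead time, as in the 2020 row where ticket_dt falls after show_dt, may be worth a closer look in the full dataset.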