Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 43.3 B

Variable types

Numeric: 2
Text: 2
Categorical: 1

Alerts

wine_id is highly overall correlated with wine_pc and 1 other field (High correlation)
wine_pc is highly overall correlated with wine_id (High correlation)
wine_cat is highly overall correlated with wine_id (High correlation)
wine_id has unique values (Unique)
wine_nm has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 10:02:11.264815
Analysis finished: 2023-12-10 10:02:13.372613
Duration: 2.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
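
A report of this kind can be regenerated with ydata-profiling's `ProfileReport`. The sketch below is an assumption about the workflow, not the configuration actually used for this run: the input path `wines.csv`, the report title, and the output filename are all hypothetical.

```python
def build_profile(csv_path="wines.csv", out_path="wine_report.html"):
    """Profile a CSV file and write an HTML report (both paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Wine dataset profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_profile()` requires pandas and ydata-profiling to be installed and a CSV with the five columns listed in this report.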

Variables

wine_id
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 995.68
Minimum: 1
Maximum: 31521
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.95
Q1: 27.75
Median: 53.5
Q3: 78.25
95-th percentile: 98.05
Maximum: 31521
Range: 31520
Interquartile range (IQR): 50.5

Descriptive statistics

Standard deviation: 5395.2192
Coefficient of variation (CV): 5.4186276
Kurtosis: 29.895978
Mean: 995.68
Median Absolute Deviation (MAD): 25.5
Skewness: 5.5944057
Sum: 99568
Variance: 29108390
Monotonicity: Not monotonic
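
Several of the wine_id figures above follow from one another. As a quick consistency check, the sketch below recomputes the derived statistics from the reported inputs only; the variable names are illustrative and no raw data is used.

```python
# Values copied from the wine_id statistics in this report.
n = 100
total = 99568
minimum, maximum = 1, 31521
q1, q3 = 27.75, 78.25
std = 5395.2192

mean = total / n                 # sum divided by number of observations
value_range = maximum - minimum  # maximum minus minimum
iqr = q3 - q1                    # interquartile range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

print(f"mean={mean:.2f} range={value_range} IQR={iqr} CV={cv:.4f} variance={variance:.0f}")
```

The very large CV relative to the IQR reflects the three identifiers above 31,000, which dominate the standard deviation while leaving the quartiles untouched.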
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
9      1      1.0%
10     1      1.0%
11     1      1.0%
12     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
31521  1      1.0%
31520  1      1.0%
31519  1      1.0%
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%

wine_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 73
Median length: 47
Mean length: 33.73
Min length: 10

Characters and Unicode

Total characters: 3373
Distinct characters: 64
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 1000 Stories Bourbon Barrel Aged Gold Rush Red 2016
2nd row: Kir-Yianni Akakies Sparkling Rose 2019
3rd row: 13 Celsius Sauvignon Blanc 2017
4th row: 14 Hands Riesling 2016
5th row: 14 Hands Chardonnay 2015

Words

Value               Count  Frequency (%)
2016                30     5.2%
2017                23     4.0%
a                   23     4.0%
pinot               21     3.7%
2015                18     3.1%
noir                17     3.0%
vineyard            15     2.6%
to                  15     2.6%
hands               12     2.1%
red                 12     2.1%
Other values (177)  388    67.6%
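
The word table above lists lowercase tokens, which suggests the names were lowercased and split before counting. The sketch below assumes plain whitespace splitting (the report does not state its tokenization rule) and runs on the five sample rows only, so its counts will not match the full-dataset figures.

```python
from collections import Counter

# The five sample wine_nm rows shown in this report.
sample_names = [
    "1000 Stories Bourbon Barrel Aged Gold Rush Red 2016",
    "Kir-Yianni Akakies Sparkling Rose 2019",
    "13 Celsius Sauvignon Blanc 2017",
    "14 Hands Riesling 2016",
    "14 Hands Chardonnay 2015",
]


def word_counts(texts):
    """Lowercase each string, split on whitespace, and count the tokens."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


counts = word_counts(sample_names)
total = sum(counts.values())

# Print the three most frequent tokens with their share of all tokens.
for word, count in counts.most_common(3):
    print(f"{word:<10}{count:>3}{count / total:>8.1%}")
```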

Most occurring characters

Value              Count  Frequency (%)
(space)            474    14.1%
a                  263    7.8%
e                  225    6.7%
i                  194    5.8%
n                  189    5.6%
r                  175    5.2%
o                  163    4.8%
1                  122    3.6%
l                  117    3.5%
0                  104    3.1%
Other values (54)  1347   39.9%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   1974   58.5%
Decimal Number     475    14.1%
Space Separator    474    14.1%
Uppercase Letter   421    12.5%
Other Punctuation  19     0.6%
Dash Punctuation   6      0.2%
Open Punctuation   2      0.1%
Close Punctuation  2      0.1%
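
The category names above correspond to Unicode general categories. As an illustration of how such a tally can be produced, the sketch below uses Python's standard `unicodedata` module on a single sample name; the mapping from two-letter codes to long names covers only the eight categories seen in this column, and this is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in wine_nm.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text):
    """Count characters by Unicode general category, using long names where known."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


sample_counts = category_counts("Kir-Yianni Akakies Sparkling Rose 2019")
for name, count in sample_counts.most_common():
    print(f"{name:<18}{count:>3}")
```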

Most frequent character per category

Uppercase Letter

Value              Count  Frequency (%)
A                  74     17.6%
R                  41     9.7%
P                  37     8.8%
C                  36     8.6%
V                  32     7.6%
N                  30     7.1%
B                  27     6.4%
G                  21     5.0%
S                  19     4.5%
H                  17     4.0%
Other values (15)  87     20.7%

Lowercase Letter

Value              Count  Frequency (%)
a                  263    13.3%
e                  225    11.4%
i                  194    9.8%
n                  189    9.6%
r                  175    8.9%
o                  163    8.3%
l                  117    5.9%
d                  99     5.0%
t                  92     4.7%
s                  76     3.9%
Other values (13)  381    19.3%

Decimal Number

Value  Count  Frequency (%)
1      122    25.7%
0      104    21.9%
2      98     20.6%
6      31     6.5%
7      30     6.3%
4      29     6.1%
5      20     4.2%
9      17     3.6%
3      15     3.2%
8      9      1.9%

Other Punctuation

Value  Count  Frequency (%)
.      16     84.2%
'      3      15.8%

Space Separator

Value    Count  Frequency (%)
(space)  474    100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      6      100.0%

Open Punctuation

Value  Count  Frequency (%)
(      2      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   2395   71.0%
Common  978    29.0%

Most frequent character per script

Latin

Value              Count  Frequency (%)
a                  263    11.0%
e                  225    9.4%
i                  194    8.1%
n                  189    7.9%
r                  175    7.3%
o                  163    6.8%
l                  117    4.9%
d                  99     4.1%
t                  92     3.8%
s                  76     3.2%
Other values (38)  802    33.5%

Common

Value             Count  Frequency (%)
(space)           474    48.5%
1                 122    12.5%
0                 104    10.6%
2                 98     10.0%
6                 31     3.2%
7                 30     3.1%
4                 29     3.0%
5                 20     2.0%
9                 17     1.7%
.                 16     1.6%
Other values (6)  37     3.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3373   100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
(space)            474    14.1%
a                  263    7.8%
e                  225    6.7%
i                  194    5.8%
n                  189    5.6%
r                  175    5.2%
o                  163    4.8%
1                  122    3.6%
l                  117    3.5%
0                  104    3.1%
Other values (54)  1347   39.9%

wine_region_nm
Text

Distinct: 72
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 74
Median length: 49.5
Mean length: 39.84
Min length: 6

Characters and Unicode

Total characters: 3984
Distinct characters: 52
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): 51.0%

Sample

1st row: Other Red Blends from California
2nd row: Greece
3rd row: Sauvignon Blanc from Marlborough, New Zealand
4th row: Riesling from Washington
5th row: Chardonnay from Washington

Words

Value              Count  Frequency (%)
from               97     17.5%
california         39     7.0%
blends             25     4.5%
valley             25     4.5%
red                22     4.0%
other              21     3.8%
pinot              20     3.6%
noir               17     3.1%
washington         15     2.7%
coast              13     2.3%
Other values (78)  261    47.0%

Most occurring characters

Value              Count  Frequency (%)
(space)            455    11.4%
a                  364    9.1%
o                  337    8.5%
r                  305    7.7%
n                  276    6.9%
i                  245    6.1%
e                  239    6.0%
l                  223    5.6%
t                  166    4.2%
f                  139    3.5%
Other values (42)  1235   31.0%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   2948   74.0%
Uppercase Letter   469    11.8%
Space Separator    455    11.4%
Other Punctuation  105    2.6%
Dash Punctuation   7      0.2%

Most frequent character per category

Lowercase Letter

Value              Count  Frequency (%)
a                  364    12.3%
o                  337    11.4%
r                  305    10.3%
n                  276    9.4%
i                  245    8.3%
e                  239    8.1%
l                  223    7.6%
t                  166    5.6%
f                  139    4.7%
m                  127    4.3%
Other values (16)  527    17.9%

Uppercase Letter

Value              Count  Frequency (%)
C                  94     20.0%
B                  43     9.2%
S                  40     8.5%
R                  37     7.9%
V                  35     7.5%
O                  33     7.0%
A                  32     6.8%
N                  29     6.2%
P                  25     5.3%
W                  25     5.3%
Other values (11)  76     16.2%

Other Punctuation

Value  Count  Frequency (%)
,      98     93.3%
/      5      4.8%
.      2      1.9%

Space Separator

Value    Count  Frequency (%)
(space)  455    100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      7      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   3417   85.8%
Common  567    14.2%

Most frequent character per script

Latin

Value              Count  Frequency (%)
a                  364    10.7%
o                  337    9.9%
r                  305    8.9%
n                  276    8.1%
i                  245    7.2%
e                  239    7.0%
l                  223    6.5%
t                  166    4.9%
f                  139    4.1%
m                  127    3.7%
Other values (37)  996    29.1%

Common

Value    Count  Frequency (%)
(space)  455    80.2%
,        98     17.3%
-        7      1.2%
/        5      0.9%
.        2      0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3979   99.9%
None   5      0.1%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
(space)            455    11.4%
a                  364    9.1%
o                  337    8.5%
r                  305    7.7%
n                  276    6.9%
i                  245    6.2%
e                  239    6.0%
l                  223    5.6%
t                  166    4.2%
f                  139    3.5%
Other values (41)  1230   30.9%

None

Value  Count  Frequency (%)
é      5      100.0%

wine_cat
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Red Wine: 66
White Wine: 27
Pink and Rosé: 4
Sparkling & Champagne: 2
Sparkling Rose Wine: 1

Length

Max length: 21
Median length: 8
Mean length: 9.11
Min length: 8

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: Red Wine
2nd row: Sparkling Rose Wine
3rd row: White Wine
4th row: White Wine
5th row: White Wine

Common Values

Value                  Count  Frequency (%)
Red Wine               66     66.0%
White Wine             27     27.0%
Pink and Rosé          4      4.0%
Sparkling & Champagne  2      2.0%
Sparkling Rose Wine    1      1.0%
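
The wine_cat summary figures can be reproduced from the category counts alone. The sketch below is a hand calculation under that assumption, not the profiling tool's own code; it derives the distinct count, the number of values occurring exactly once, the percentages, and the mean label length.

```python
# Category counts copied from the Common Values table.
# "\u00e9" is the precomposed letter e with acute accent, a single code point.
category_counts = {
    "Red Wine": 66,
    "White Wine": 27,
    "Pink and Ros\u00e9": 4,
    "Sparkling & Champagne": 2,
    "Sparkling Rose Wine": 1,
}

n_rows = sum(category_counts.values())
distinct = len(category_counts)
# "Unique" counts the values that appear exactly once in the column.
unique = sum(1 for count in category_counts.values() if count == 1)
percentages = {
    label: round(100 * count / n_rows, 1)
    for label, count in category_counts.items()
}
# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(label) * count for label, count in category_counts.items()) / n_rows

print(distinct, unique, percentages["Red Wine"], mean_length)
```

The weighted mean length comes to 9.11, which agrees with the Length statistics reported for this variable.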

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Words

Value      Count  Frequency (%)
wine       94     45.4%
red        66     31.9%
white      27     13.0%
pink       4      1.9%
and        4      1.9%
rosé       4      1.9%
sparkling  3      1.4%
&          2      1.0%
champagne  2      1.0%
rose       1      0.5%

wine_pc
Real number (ℝ)

HIGH CORRELATION 

Distinct: 52
Distinct (%): 52.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.7808
Minimum: 8.99
Maximum: 749.97
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 8.99
5-th percentile: 9.99
Q1: 12.99
Median: 20
Q3: 37.7425
95-th percentile: 121.4905
Maximum: 749.97
Range: 740.98
Interquartile range (IQR): 24.7525

Descriptive statistics

Standard deviation: 108.80992
Coefficient of variation (CV): 2.1857808
Kurtosis: 24.187631
Mean: 49.7808
Median Absolute Deviation (MAD): 9.01
Skewness: 4.8134892
Sum: 4978.08
Variance: 11839.598
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
10.99              8      8.0%
29.99              7      7.0%
11.99              5      5.0%
17.99              4      4.0%
16.99              4      4.0%
9.99               4      4.0%
12.99              4      4.0%
21.99              3      3.0%
19.99              3      3.0%
18.99              3      3.0%
Other values (42)  55     55.0%

Minimum 10 values

Value  Count  Frequency (%)
8.99   2      2.0%
9.99   4      4.0%
10.99  8      8.0%
11.99  5      5.0%
12.0   3      3.0%
12.99  4      4.0%
13.99  1      1.0%
14.0   1      1.0%
14.99  3      3.0%
15.0   1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
749.97  1      1.0%
529.97  1      1.0%
479.98  1      1.0%
459.97  1      1.0%
150.0   1      1.0%
119.99  1      1.0%
109.99  1      1.0%
80.0    1      1.0%
76.99   1      1.0%
64.99   1      1.0%
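
The high skewness and kurtosis of wine_pc come from a handful of expensive bottles. One common way to make that concrete is the 1.5 x IQR upper fence (Tukey's rule); this is an illustrative technique chosen here, not something the report itself computes. The sketch uses only the quartiles and the ten largest values listed above.

```python
# Quartiles and the ten largest values copied from the wine_pc statistics.
q1, q3 = 12.99, 37.7425
largest_values = [749.97, 529.97, 479.98, 459.97, 150.0,
                  119.99, 109.99, 80.0, 76.99, 64.99]

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # Tukey's upper fence for flagging high values

# Each listed value occurs once, so this is also a count of observations.
flagged = [value for value in largest_values if value > upper_fence]

print(f"upper fence = {upper_fence:.2f}, flagged = {len(flagged)}")
```

Because every value outside the ten largest is below 64.99, the flagged list covers all observations above the fence.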

Interactions

[Figures: four interaction plots]

Correlations

                wine_id  wine_nm  wine_region_nm  wine_cat  wine_pc
wine_id         1.000    1.000    1.000           0.453     0.202
wine_nm         1.000    1.000    1.000           1.000     1.000
wine_region_nm  1.000    1.000    1.000           1.000     0.000
wine_cat        0.453    1.000    1.000           1.000     0.000
wine_pc         0.202    1.000    0.000           0.000     1.000

          wine_id  wine_pc  wine_cat
wine_id   1.000    0.665    0.541
wine_pc   0.665    1.000    0.000
wine_cat  0.541    0.000    1.000
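
The High correlation alerts in the Overview are consistent with thresholding the three-variable matrix above. The sketch below assumes a cutoff of 0.5; that value is an assumption made for illustration, and the function name is hypothetical.

```python
# Three-variable correlation matrix copied from this report.
names = ["wine_id", "wine_pc", "wine_cat"]
matrix = [
    [1.000, 0.665, 0.541],
    [0.665, 1.000, 0.000],
    [0.541, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not confirmed by the report


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables whose |correlation| meets the threshold."""
    flagged = {}
    for i, row_name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[row_name] = partners
    return flagged


alerts = high_correlations(names, matrix, THRESHOLD)
for variable, partners in alerts.items():
    print(f"{variable} is highly correlated with {', '.join(partners)}")
```

Under this assumption wine_id is flagged against both wine_pc and wine_cat, while wine_pc and wine_cat are each flagged only against wine_id, matching the three alerts listed in the Overview.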

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | wine_id | wine_nm | wine_region_nm | wine_cat | wine_pc
0 | 1 | 1000 Stories Bourbon Barrel Aged Gold Rush Red 2016 | Other Red Blends from California | Red Wine | 17.99
1 | 31519 | Kir-Yianni Akakies Sparkling Rose 2019 | Greece | Sparkling Rose Wine | 21.99
2 | 3 | 13 Celsius Sauvignon Blanc 2017 | Sauvignon Blanc from Marlborough, New Zealand | White Wine | 16.99
3 | 4 | 14 Hands Riesling 2016 | Riesling from Washington | White Wine | 9.99
4 | 5 | 14 Hands Chardonnay 2015 | Chardonnay from Washington | White Wine | 9.99
5 | 6 | 14 Hands Cabernet Sauvignon 2016 | Cabernet Sauvignon from Columbia Valley, Washington | Red Wine | 12.0
6 | 7 | 14 Hands Moscato 2015 | Muscat from Columbia Valley, Washington | White Wine | 8.99
7 | 31520 | Mira do O Druida Encruzado Reserva Dao Branco 2019 | Portugal | White Wine | 29.99
8 | 9 | 14 Hands Hot to Trot White Blend 2016 | Other White Blends from Washington | White Wine | 12.0
9 | 10 | 14 Hands Rose 2017 | Rosé from Washington | Pink and Rosé | 11.99

Last rows

(index) | wine_id | wine_nm | wine_region_nm | wine_cat | wine_pc
90 | 91 | Abreu Vineyard Madrona Ranch 2004 | Bordeaux Red Blends from Napa Valley, California | Red Wine | 459.97
91 | 92 | Abreu Vineyard Madrona Ranch 2013 | Bordeaux Red Blends from Napa Valley, California | Red Wine | 749.97
92 | 93 | Acacia Carneros Pinot Noir 2016 | Pinot Noir from Carneros, California | Red Wine | 21.99
93 | 94 | Acacia Carneros Chardonnay 2016 | Chardonnay from Carneros, California | White Wine | 18.99
94 | 95 | Accordini Igino Le Bessole Amarone della Valpolicella 2013 | Other Red Blends from Valpolicella, Veneto, Italy | Red Wine | 64.99
95 | 96 | Accordini Igino Ripasso Valpolicella 2010 | Other Red Blends from Valpolicella, Veneto, Italy | Red Wine | 22.0
96 | 97 | Achaval-Ferrer Mendoza Cabernet Sauvignon 2016 | Cabernet Sauvignon from Mendoza, Argentina | Red Wine | 28.99
97 | 98 | Achaval-Ferrer Mendoza Malbec 2017 | Malbec from Mendoza, Argentina | Red Wine | 28.99
98 | 99 | Achaval-Ferrer Quimera 2014 | Bordeaux Red Blends from Mendoza, Argentina | Red Wine | 36.99
99 | 100 | Acinum Valpolicella Ripasso 2015 | Other Red Blends from Valpolicella, Veneto, Italy | Red Wine | 22.99