Overview

Dataset statistics

Number of variables: 22
Number of observations: 100
Missing cells: 95
Missing cells (%): 4.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.4 KiB
Average record size in memory: 178.3 B
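
The derived figures above follow from the raw counts. The sketch below is a minimal illustration, assuming that missing cells (%) is the missing-cell count divided by variables times observations, and that average record size is total memory divided by observations. The function names are hypothetical and are not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_vars * n_obs)


def avg_record_size_bytes(total_kib: float, n_obs: int) -> float:
    """Average bytes per row, given the total in-memory size in KiB."""
    return total_kib * 1024 / n_obs


# Counts taken from the dataset statistics in this report.
print(f"{missing_cell_pct(95, 22, 100):.1f}%")  # 4.3%
print(f"{avg_record_size_bytes(17.4, 100):.1f} B")
```

The second figure comes out close to, but not exactly, the reported 178.3 B, presumably because the 17.4 KiB total shown in the report is itself rounded.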

Variable types

Text: 2
Boolean: 18
Categorical: 2

Alerts

SICHUAN_FOOD_PREFER_AT has constant value "" [Constant]
MCDONALD_PREFER_AT has constant value "" [Constant]
MCCAIN_PREFER_AT has constant value "" [Constant]
BKNG_PREFER_AT has constant value "" [Constant]
FLMM_PREFER_AT has constant value "" [Constant]
DISH_PREFER_AT has constant value "" [Constant]
PUDDING_PREFER_AT has constant value "" [Constant]
FRUIT_PREFER_AT has constant value "" [Constant]
HOPT_PREFER_AT has constant value "" [Constant]
DESSERT_PREFER_AT has constant value "" [Constant]
HGSCHL_BELO_AT is highly imbalanced (63.4%) [Imbalance]
HGSCHL_AT is highly imbalanced (91.9%) [Imbalance]
UNIV_AT is highly imbalanced (91.9%) [Imbalance]
MSKUL_AT is highly imbalanced (91.9%) [Imbalance]
PIZZAHUT_PREFER_AT is highly imbalanced (91.9%) [Imbalance]
KFC_PREFER_AT is highly imbalanced (85.9%) [Imbalance]
STARBUCKS_PREFER_AT is highly imbalanced (71.4%) [Imbalance]
SHOPNG_BRAND_INCLN_CN has 95 (95.0%) missing values [Missing]
USER_KEY_NM has unique values [Unique]
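
The imbalance percentages in these alerts are consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The sketch below is a reconstruction under that assumption, not ydata-profiling's own code. `imbalance_score` is an illustrative name, and the counts come from the variable sections of this report.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H(p) / log2(k) for k classes.

    0 means perfectly balanced; the score approaches 1 as one class dominates.
    A single class has no defined score; returning 0.0 here is an assumption.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# (False count, True count) per variable, as listed in this report.
examples = {
    "HGSCHL_BELO_AT": [93, 7],
    "HGSCHL_AT": [99, 1],
    "KFC_PREFER_AT": [98, 2],
    "STARBUCKS_PREFER_AT": [95, 5],
    "UNIV_ABOVE_AT": [80, 20],
}
for name, counts in examples.items():
    print(f"{name}: {imbalance_score(counts):.1%}")
```

Under this formula the 93/7, 99/1, 98/2 and 95/5 splits give 63.4%, 91.9%, 85.9% and 71.4%, matching the alerts. The 80/20 split of UNIV_ABOVE_AT scores about 27.8%, which would explain why it raises no alert if the alert threshold sits well above that value.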

Reproduction

Analysis started: 2023-12-10 10:17:22.972523
Analysis finished: 2023-12-10 10:17:23.199661
Duration: 0.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

USER_KEY_NM
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 3600
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
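
As an illustration of that idea, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is a minimal example rather than the profiler's implementation, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Nd', 'Ll', 'Pd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of USER_KEY_NM from this report.
sample = "33b0a01b-cf93-4974-93c5-f5b07604222e"
print(category_counts(sample))
```

The two-letter codes map to the category names used in this report: `Nd` is Decimal Number, `Ll` is Lowercase Letter, `Pd` is Dash Punctuation, `Sm` is Math Symbol and `Lo` is Other Letter. The standard library's `unicodedata` module does not expose script or block properties, so those breakdowns would need another data source.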

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 33b0a01b-cf93-4974-93c5-f5b07604222e
2nd row: 943747c3-8a28-47bf-9385-a4bb0ff91490
3rd row: 2d948766-b939-438c-b826-46d2d81bb10b
4th row: 592f1460-82e2-44b5-9ace-01ea0e769d3d
5th row: 28c36b8c-7ca7-4482-8f2c-7f3d32876b26

Value                                 Count  Frequency (%)
33b0a01b-cf93-4974-93c5-f5b07604222e  1      1.0%
914d528e-87a3-43c5-a9a4-95ba7db2d0a0  1      1.0%
5e12e094-90fb-44ff-88fc-15a90cd2de74  1      1.0%
08902951-d8d1-45ea-859f-ffa4dfdbfe51  1      1.0%
e248a439-59cf-41e7-aae0-bd9d645c7df8  1      1.0%
9b846e0e-8c17-4b01-9183-12becc45fbc1  1      1.0%
faf3f958-dff9-48c8-8afc-9665307717ae  1      1.0%
65e593a2-516d-4d73-b26f-30eb8c6c041b  1      1.0%
0b0f408c-4ba5-4fee-8517-bb43cf589fe2  1      1.0%
2de013b3-b719-4c2f-a3cc-e3e7ea870b76  1      1.0%
Other values (90)                     90     90.0%

Most occurring characters

Value             Count  Frequency (%)
-                 400    11.1%
4                 311    8.6%
8                 218    6.1%
9                 211    5.9%
5                 210    5.8%
7                 207    5.8%
2                 203    5.6%
b                 196    5.4%
a                 195    5.4%
0                 194    5.4%
Other values (7)  1255   34.9%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    2101   58.4%
Lowercase Letter  1099   30.5%
Dash Punctuation  400    11.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      311    14.8%
8      218    10.4%
9      211    10.0%
5      210    10.0%
7      207    9.9%
2      203    9.7%
0      194    9.2%
1      189    9.0%
3      188    8.9%
6      170    8.1%

Lowercase Letter
Value  Count  Frequency (%)
b      196    17.8%
a      195    17.7%
f      183    16.7%
e      180    16.4%
d      173    15.7%
c      172    15.7%

Dash Punctuation
Value  Count  Frequency (%)
-      400    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2501   69.5%
Latin   1099   30.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
-      400    16.0%
4      311    12.4%
8      218    8.7%
9      211    8.4%
5      210    8.4%
7      207    8.3%
2      203    8.1%
0      194    7.8%
1      189    7.6%
3      188    7.5%

Latin
Value  Count  Frequency (%)
b      196    17.8%
a      195    17.7%
f      183    16.7%
e      180    16.4%
d      173    15.7%
c      172    15.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3600   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
-                 400    11.1%
4                 311    8.6%
8                 218    6.1%
9                 211    5.9%
5                 210    5.8%
7                 207    5.8%
2                 203    5.6%
b                 196    5.4%
a                 195    5.4%
0                 194    5.4%
Other values (7)  1255   34.9%

HGSCHL_BELO_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  93     93.0%
True   7      7.0%

HGSCHL_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  99     99.0%
True   1      1.0%

UNIV_ABOVE_AT
Boolean

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  80     80.0%
True   20     20.0%

UNIV_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  99     99.0%
True   1      1.0%

MSKUL_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  99     99.0%
True   1      1.0%

ASSETS_IDEX_CD
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      51     51.0%
0      49     49.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

ASSETS_IDEX_NM
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

Value  Count  Frequency (%)
       51     51.0%
       49     49.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

SHOPNG_BRAND_INCLN_CN
Text

MISSING 

Distinct: 5
Distinct (%): 100.0%
Missing: 95
Missing (%): 95.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 3
Mean length: 4
Min length: 3
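
These length statistics can be checked against the five non-missing values listed in this variable's Sample. The sketch below assumes length is measured in Unicode code points, which is what Python's `len` returns for a `str`. It is an illustrative check, not the profiler's code.

```python
import statistics

# The five non-missing values of SHOPNG_BRAND_INCLN_CN shown in this report.
values = ["浪莎|龙啸|晨光", "长寿花", "肯德基", "海飞丝", "周黑鸭"]
lengths = [len(v) for v in values]  # code points per value

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

The first value has eight code points because the two `|` separators count as characters; the other four values have three each.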

Characters and Unicode

Total characters: 20
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: 浪莎|龙啸|晨光
2nd row: 长寿花
3rd row: 肯德基
4th row: 海飞丝
5th row: 周黑鸭

Value           Count  Frequency (%)
浪莎|龙啸|晨光  1      20.0%
长寿花          1      20.0%
肯德基          1      20.0%
海飞丝          1      20.0%
周黑鸭          1      20.0%

Most occurring characters

Value             Count  Frequency (%)
|                 2      10.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
                  1      5.0%
Other values (9)  9      45.0%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  18     90.0%
Math Symbol   2      10.0%

Most frequent character per category

Other Letter
Value             Count  Frequency (%)
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
Other values (8)  8      44.4%

Math Symbol
Value  Count  Frequency (%)
|      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Han     18     90.0%
Common  2      10.0%

Most frequent character per script

Han
Value             Count  Frequency (%)
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
Other values (8)  8      44.4%

Common
Value  Count  Frequency (%)
|      2      100.0%

Most occurring blocks

Value  Count  Frequency (%)
CJK    18     90.0%
ASCII  2      10.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
|      2      100.0%

CJK
Value             Count  Frequency (%)
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
                  1      5.6%
Other values (8)  8      44.4%

PIZZAHUT_PREFER_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  99     99.0%
True   1      1.0%

SICHUAN_FOOD_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

KFC_PREFER_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  98     98.0%
True   2      2.0%

MCDONALD_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

MCCAIN_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

STARBUCKS_PREFER_AT
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  95     95.0%
True   5      5.0%

BKNG_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

FLMM_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

DISH_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

PUDDING_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

FRUIT_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

HOPT_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

DESSERT_PREFER_AT
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value  Count  Frequency (%)
False  100    100.0%

Sample

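First rows

, USER_KEY_NM, HGSCHL_BELO_AT, HGSCHL_AT, UNIV_ABOVE_AT, UNIV_AT, MSKUL_AT, ASSETS_IDEX_CD, ASSETS_IDEX_NM, SHOPNG_BRAND_INCLN_CN, PIZZAHUT_PREFER_AT, SICHUAN_FOOD_PREFER_AT, KFC_PREFER_AT, MCDONALD_PREFER_AT, MCCAIN_PREFER_AT, STARBUCKS_PREFER_AT, BKNG_PREFER_AT, FLMM_PREFER_AT, DISH_PREFER_AT, PUDDING_PREFER_AT, FRUIT_PREFER_AT, HOPT_PREFER_AT, DESSERT_PREFER_AT
0, 33b0a01b-cf93-4974-93c5-f5b07604222e, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
1, 943747c3-8a28-47bf-9385-a4bb0ff91490, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, Y, N, N, N, N, N, N, N
2, 2d948766-b939-438c-b826-46d2d81bb10b, N, N, Y, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
3, 592f1460-82e2-44b5-9ace-01ea0e769d3d, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
4, 28c36b8c-7ca7-4482-8f2c-7f3d32876b26, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
5, d437b922-6ce6-4d8e-b629-9ae755014d52, N, N, Y, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
6, bb1fc97f-1573-4c10-88f3-a0185a328889, N, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
7, 87355c7f-c5a2-4d45-9f47-4ff67263b754, N, N, Y, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
8, 1d09e2e0-54dc-4abd-9d2b-76e9d5ca54ca, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
9, 800cdbc0-50d3-46ed-a61f-53617bdb704f, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N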
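
Last rows

, USER_KEY_NM, HGSCHL_BELO_AT, HGSCHL_AT, UNIV_ABOVE_AT, UNIV_AT, MSKUL_AT, ASSETS_IDEX_CD, ASSETS_IDEX_NM, SHOPNG_BRAND_INCLN_CN, PIZZAHUT_PREFER_AT, SICHUAN_FOOD_PREFER_AT, KFC_PREFER_AT, MCDONALD_PREFER_AT, MCCAIN_PREFER_AT, STARBUCKS_PREFER_AT, BKNG_PREFER_AT, FLMM_PREFER_AT, DISH_PREFER_AT, PUDDING_PREFER_AT, FRUIT_PREFER_AT, HOPT_PREFER_AT, DESSERT_PREFER_AT
90, d2504d36-1fd8-4656-b555-658e53b71024, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
91, 45f7df45-d9ea-46c5-a5e9-4fecafc8ccb2, N, N, N, N, N, 0, , 海飞丝, N, N, N, N, N, N, N, N, N, N, N, N, N
92, bc9e2923-2260-47e7-882b-87b05d992a06, N, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
93, 9c8582de-8113-4cd4-a2b0-7442bf8a19cb, N, N, Y, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
94, 80f25522-f7c1-45ad-8c02-8071764dd96c, N, N, N, N, N, 0, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
95, 96dd553f-b60c-4ca6-b5b8-9aff2b60483f, Y, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
96, bfca9790-4b6e-4b84-9eb6-2e405ffefee5, N, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
97, 91a53aab-a772-40a8-9194-e7f2a0273231, N, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
98, 58a83404-782c-43a4-af21-2ef42e07e634, N, N, N, N, N, 1, , <NA>, N, N, N, N, N, N, N, N, N, N, N, N, N
99, 23960606-02f2-4670-9591-75f754c35f62, N, N, N, N, N, 0, , 周黑鸭, N, N, N, N, N, N, N, N, N, N, N, N, N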