Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 69.3 B

Variable types

Numeric: 3
Categorical: 5

Alerts

visit_area_gugun_klang_nm is highly overall correlated with seq_no and 5 other fields (High correlation)
crtfc_str_nm is highly overall correlated with seq_no and 5 other fields (High correlation)
visit_area_addr is highly overall correlated with seq_no and 5 other fields (High correlation)
seq_no is highly overall correlated with cstmr_id and 4 other fields (High correlation)
goods_online_sle_dt is highly overall correlated with str_visit_dt and 5 other fields (High correlation)
str_visit_dt is highly overall correlated with goods_online_sle_dt and 5 other fields (High correlation)
cstmr_id is highly overall correlated with seq_no and 3 other fields (High correlation)
cstmr_visit_co is highly overall correlated with seq_no and 6 other fields (High correlation)
cstmr_visit_co is highly imbalanced (64.8%) (Imbalance)
seq_no has unique values (Unique)
str_visit_dt has unique values (Unique)
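
The 64.8% imbalance figure for cstmr_visit_co is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the value counts this report lists for that variable (90, 7 and 3). The sketch below reproduces the number under that assumption. The function name imbalance_score is illustrative and is not taken from the report or from the ydata-profiling API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced, 1 means a single class dominates entirely."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts of cstmr_visit_co as listed in this report: "3", "2", "44".
counts = [90, 7, 3]
score = imbalance_score(counts)
print(f"{score:.1%}")  # prints 64.8%
```

A score this high mostly reflects that 90 of the 100 rows share a single value.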

Reproduction

Analysis started: 2023-12-10 09:51:40.344582
Analysis finished: 2023-12-10 09:51:43.528337
Duration: 3.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

seq_no
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 441.82
Minimum: 1
Maximum: 13059
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.95
Q1: 27.75
Median: 53.5
Q3: 78.25
95-th percentile: 98.05
Maximum: 13059
Range: 13058
Interquartile range (IQR): 50.5

Descriptive statistics

Standard deviation: 2230.0765
Coefficient of variation (CV): 5.0474774
Kurtosis: 29.887261
Mean: 441.82
Median Absolute Deviation (MAD): 25.5
Skewness: 5.5932224
Sum: 44182
Variance: 4973241
Monotonicity: Not monotonic
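
The gap between the mean (441.82) and the median (53.5) suggests a few extreme values. One common way to check, though not the only one, is Tukey's 1.5 × IQR rule, sketched here with the quartiles reported above and five of the largest seq_no values listed in this report. The rule is an illustrative choice and is not something the report itself applies.

```python
# Quartiles of seq_no as reported in the quantile statistics.
q1, q3 = 27.75, 78.25
iqr = q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Five of the largest seq_no values listed in the report.
largest = [13059, 13058, 13057, 100, 99]
outliers = [v for v in largest if v < lower_fence or v > upper_fence]

print(iqr, upper_fence, outliers)  # prints 50.5 154.0 [13059, 13058, 13057]
```

Under this rule the three values above 13000 are flagged, which would account for the large standard deviation and skewness.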
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%
11 | 1 | 1.0%
12 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
13059 | 1 | 1.0%
13058 | 1 | 1.0%
13057 | 1 | 1.0%
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%

cstmr_id
Categorical

HIGH CORRELATION 

Distinct: 36
Distinct (%): 36.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
601363384001 | 3
601370148001 | 3
601370236001 | 3
601353398001 | 3
601372470001 | 3
Other values (31) | 85

Length

Max length: 12
Median length: 12
Mean length: 11.82
Min length: 6

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 601353091001
2nd row: nankai
3rd row: 601353398001
4th row: 601353398001
5th row: 601353398001

Common Values

Value | Count | Frequency (%)
601363384001 | 3 | 3.0%
601370148001 | 3 | 3.0%
601370236001 | 3 | 3.0%
601353398001 | 3 | 3.0%
601372470001 | 3 | 3.0%
601356277001 | 3 | 3.0%
601356543001 | 3 | 3.0%
601356831001 | 3 | 3.0%
601356864001 | 3 | 3.0%
601357175001 | 3 | 3.0%
Other values (26) | 70 | 70.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
601363384001 | 3 | 3.0%
601372041001 | 3 | 3.0%
601362459001 | 3 | 3.0%
nankai | 3 | 3.0%
601370938001 | 3 | 3.0%
601366768001 | 3 | 3.0%
601366963001 | 3 | 3.0%
601369074001 | 3 | 3.0%
601368097001 | 3 | 3.0%
601368128001 | 3 | 3.0%
Other values (26) | 70 | 70.0%

cstmr_visit_co
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
3 | 90
2 | 7
44 | 3

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 44
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 90 | 90.0%
2 | 7 | 7.0%
44 | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
3 | 90 | 90.0%
2 | 7 | 7.0%
44 | 3 | 3.0%

visit_area_addr
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
京都市左京区田中上柳町25-3 | 34
京都市中京区壬生賀陽御所町3番地20 | 32
京都市左京区上高野東山55 | 31
大阪府泉南市泉南郡田尻町空港中1 | 3

Length

Max length: 18
Median length: 16
Mean length: 15.37
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 京都市左京区田中上柳町25-3
2nd row: 大阪府泉南市泉南郡田尻町空港中1
3rd row: 京都市左京区田中上柳町25-3
4th row: 京都市中京区壬生賀陽御所町3番地20
5th row: 京都市左京区上高野東山55

Common Values

Value | Count | Frequency (%)
京都市左京区田中上柳町25-3 | 34 | 34.0%
京都市中京区壬生賀陽御所町3番地20 | 32 | 32.0%
京都市左京区上高野東山55 | 31 | 31.0%
大阪府泉南市泉南郡田尻町空港中1 | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
京都市左京区田中上柳町25-3 | 34 | 34.0%
京都市中京区壬生賀陽御所町3番地20 | 32 | 32.0%
京都市左京区上高野東山55 | 31 | 31.0%
大阪府泉南市泉南郡田尻町空港中1 | 3 | 3.0%

visit_area_gugun_klang_nm
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
교토부 교토시 사쿄구 | 65
교토부 교토시 나카교구 | 32
오사카부 센난시 센난군 | 3

Length

Max length: 12
Median length: 11
Mean length: 11.35
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교토부 교토시 사쿄구
2nd row: 오사카부 센난시 센난군
3rd row: 교토부 교토시 사쿄구
4th row: 교토부 교토시 나카교구
5th row: 교토부 교토시 사쿄구

Common Values

Value | Count | Frequency (%)
교토부 교토시 사쿄구 | 65 | 65.0%
교토부 교토시 나카교구 | 32 | 32.0%
오사카부 센난시 센난군 | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
교토부 | 97 | 32.3%
교토시 | 97 | 32.3%
사쿄구 | 65 | 21.7%
나카교구 | 32 | 10.7%
오사카부 | 3 | 1.0%
센난시 | 3 | 1.0%
센난군 | 3 | 1.0%

crtfc_str_nm
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
叡山電鉄 | 34
京福電気鉄道(鋼索係事務所) | 32
京都八瀬 瑠璃光院 | 31
n・e・s・t関西空港店 | 3

Length

Max length: 14
Median length: 12
Mean length: 8.99
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 叡山電鉄
2nd row: n・e・s・t関西空港店
3rd row: 叡山電鉄
4th row: 京福電気鉄道(鋼索係事務所)
5th row: 京都八瀬 瑠璃光院

Common Values

Value | Count | Frequency (%)
叡山電鉄 | 34 | 34.0%
京福電気鉄道(鋼索係事務所) | 32 | 32.0%
京都八瀬 瑠璃光院 | 31 | 31.0%
n・e・s・t関西空港店 | 3 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
叡山電鉄 | 34 | 26.0%
京福電気鉄道(鋼索係事務所) | 32 | 24.4%
京都八瀬 | 31 | 23.7%
瑠璃光院 | 31 | 23.7%
n・e・s・t関西空港店 | 3 | 2.3%

goods_online_sle_dt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 38
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0201098 × 10^13
Minimum: 2.0200323 × 10^13
Maximum: 2.0201129 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2.0200323 × 10^13
5-th percentile: 2.0201107 × 10^13
Q1: 2.0201119 × 10^13
Median: 2.0201122 × 10^13
Q3: 2.0201126 × 10^13
95-th percentile: 2.0201129 × 10^13
Maximum: 2.0201129 × 10^13
Range: 8.0599147 × 10^8
Interquartile range (IQR): 7106105

Descriptive statistics

Standard deviation: 1.3681937 × 10^8
Coefficient of variation (CV): 6.7728682 × 10^-6
Kurtosis: 29.763069
Mean: 2.0201098 × 10^13
Median Absolute Deviation (MAD): 4015088.5
Skewness: -5.5764023
Sum: 2.0201098 × 10^15
Variance: 1.8719541 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)

Common values
Value | Count | Frequency (%)
20201126142137 | 3 | 3.0%
20201125130833 | 3 | 3.0%
20201107162024 | 3 | 3.0%
20201115093330 | 3 | 3.0%
20201129154728 | 3 | 3.0%
20201122083130 | 3 | 3.0%
20201124164212 | 3 | 3.0%
20201129051422 | 3 | 3.0%
20201107165743 | 3 | 3.0%
20201129084316 | 3 | 3.0%
Other values (28) | 70 | 70.0%

Minimum 10 values
Value | Count | Frequency (%)
20200323163255 | 1 | 1.0%
20200323163347 | 1 | 1.0%
20200327085236 | 1 | 1.0%
20201107000014 | 1 | 1.0%
20201107000936 | 2 | 2.0%
20201107162024 | 3 | 3.0%
20201107165743 | 3 | 3.0%
20201115084721 | 3 | 3.0%
20201115093330 | 3 | 3.0%
20201115102122 | 3 | 3.0%

Maximum 10 values
Value | Count | Frequency (%)
20201129154728 | 3 | 3.0%
20201129113023 | 3 | 3.0%
20201129105711 | 3 | 3.0%
20201129084316 | 3 | 3.0%
20201129051422 | 3 | 3.0%
20201128174335 | 3 | 3.0%
20201128153016 | 3 | 3.0%
20201128134322 | 2 | 2.0%
20201126170421 | 3 | 3.0%
20201126142137 | 3 | 3.0%

str_visit_dt
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0201098 × 10^13
Minimum: 2.0200323 × 10^13
Maximum: 2.0201129 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2.0200323 × 10^13
5-th percentile: 2.0201107 × 10^13
Q1: 2.0201119 × 10^13
Median: 2.0201122 × 10^13
Q3: 2.0201126 × 10^13
95-th percentile: 2.0201129 × 10^13
Maximum: 2.0201129 × 10^13
Range: 8.0603949 × 10^8
Interquartile range (IQR): 7003040.2

Descriptive statistics

Standard deviation: 1.3682629 × 10^8
Coefficient of variation (CV): 6.7732104 × 10^-6
Kurtosis: 29.763359
Mean: 2.0201098 × 10^13
Median Absolute Deviation (MAD): 4007443
Skewness: -5.5764406
Sum: 2.0201098 × 10^15
Variance: 1.8721433 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
20201107160042 | 1 | 1.0%
20201107182138 | 1 | 1.0%
20201122185818 | 1 | 1.0%
20201122172931 | 1 | 1.0%
20201122165700 | 1 | 1.0%
20201124180939 | 1 | 1.0%
20201124170647 | 1 | 1.0%
20201124164250 | 1 | 1.0%
20201129203114 | 1 | 1.0%
20201129192358 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
20200323163628 | 1 | 1.0%
20200323163643 | 1 | 1.0%
20200327100113 | 1 | 1.0%
20201107160042 | 1 | 1.0%
20201107162039 | 1 | 1.0%
20201107165929 | 1 | 1.0%
20201107175053 | 1 | 1.0%
20201107175349 | 1 | 1.0%
20201107182106 | 1 | 1.0%
20201107182138 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
20201129203114 | 1 | 1.0%
20201129192358 | 1 | 1.0%
20201129190148 | 1 | 1.0%
20201129184748 | 1 | 1.0%
20201129184430 | 1 | 1.0%
20201129183159 | 1 | 1.0%
20201129180222 | 1 | 1.0%
20201129180137 | 1 | 1.0%
20201129172741 | 1 | 1.0%
20201129172458 | 1 | 1.0%
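
goods_online_sle_dt and str_visit_dt are profiled as real numbers, but their values look like timestamps packed as YYYYMMDDHHMMSS digits. That reading is an assumption based on the sample values, not something the report states. Under it, a minimal sketch of decoding such values before computing any statistics (the helper name parse_packed_timestamp is hypothetical):

```python
from datetime import datetime

def parse_packed_timestamp(value):
    """Decode a number such as 20201107160042 as YYYYMMDDHHMMSS.
    Assumes the 14-digit packed layout; raises ValueError otherwise."""
    return datetime.strptime(str(int(value)), "%Y%m%d%H%M%S")

# goods_online_sle_dt and str_visit_dt of the first sample row in this report.
sale = parse_packed_timestamp(20201107000014)
visit = parse_packed_timestamp(20201107160042)
elapsed = visit - sale

print(sale.isoformat(), visit.isoformat(), elapsed)
# prints 2020-11-07T00:00:14 2020-11-07T16:00:42 16:00:28
```

If the assumption holds, differences between the packed numbers (such as the reported range and IQR) are not durations, so those statistics would need to be recomputed on the decoded values.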

Interactions

[9 interaction plots; images not included in this text version]

Correlations

 | seq_no | cstmr_id | cstmr_visit_co | visit_area_addr | visit_area_gugun_klang_nm | crtfc_str_nm | goods_online_sle_dt | str_visit_dt
seq_no | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.919 | 0.919
cstmr_id | 1.000 | 1.000 | 1.000 | 0.260 | 0.768 | 0.260 | 1.000 | 1.000
cstmr_visit_co | 1.000 | 1.000 | 1.000 | 0.671 | 0.942 | 0.671 | 1.000 | 1.000
visit_area_addr | 1.000 | 0.260 | 0.671 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
visit_area_gugun_klang_nm | 1.000 | 0.768 | 0.942 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
crtfc_str_nm | 1.000 | 0.260 | 0.671 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
goods_online_sle_dt | 0.919 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.692
str_visit_dt | 0.919 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.692 | 1.000

 | visit_area_gugun_klang_nm | cstmr_visit_co | crtfc_str_nm | visit_area_addr | cstmr_id
visit_area_gugun_klang_nm | 1.000 | 0.704 | 0.995 | 0.995 | 0.410
cstmr_visit_co | 0.704 | 1.000 | 0.697 | 0.697 | 0.812
crtfc_str_nm | 0.995 | 0.697 | 1.000 | 1.000 | 0.084
visit_area_addr | 0.995 | 0.697 | 1.000 | 1.000 | 0.084
cstmr_id | 0.410 | 0.812 | 0.084 | 0.084 | 1.000

 | seq_no | goods_online_sle_dt | str_visit_dt | cstmr_id | cstmr_visit_co | visit_area_addr | visit_area_gugun_klang_nm | crtfc_str_nm
seq_no | 1.000 | -0.085 | -0.092 | 0.808 | 0.995 | 0.990 | 0.995 | 0.990
goods_online_sle_dt | -0.085 | 1.000 | 0.979 | 0.808 | 0.995 | 0.990 | 0.995 | 0.990
str_visit_dt | -0.092 | 0.979 | 1.000 | 0.808 | 0.995 | 0.990 | 0.995 | 0.990
cstmr_id | 0.808 | 0.808 | 0.808 | 1.000 | 0.812 | 0.084 | 0.410 | 0.084
cstmr_visit_co | 0.995 | 0.995 | 0.995 | 0.812 | 1.000 | 0.697 | 0.704 | 0.697
visit_area_addr | 0.990 | 0.990 | 0.990 | 0.084 | 0.697 | 1.000 | 0.995 | 1.000
visit_area_gugun_klang_nm | 0.995 | 0.995 | 0.995 | 0.410 | 0.704 | 0.995 | 1.000 | 0.995
crtfc_str_nm | 0.990 | 0.990 | 0.990 | 0.084 | 0.697 | 1.000 | 0.995 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
index | seq_no | cstmr_id | cstmr_visit_co | visit_area_addr | visit_area_gugun_klang_nm | crtfc_str_nm | goods_online_sle_dt | str_visit_dt
0 | 1 | 601353091001 | 2 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201107000014 | 20201107160042
1 | 13057 | nankai | 44 | 大阪府泉南市泉南郡田尻町空港中1 | 오사카부 센난시 센난군 | n・e・s・t関西空港店 | 20200323163347 | 20200323163628
2 | 3 | 601353398001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201122140131 | 20201122164123
3 | 4 | 601353398001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201122140131 | 20201122171416
4 | 5 | 601353398001 | 3 | 京都市左京区上高野東山55 | 교토부 교토시 사쿄구 | 京都八瀬 瑠璃光院 | 20201122140131 | 20201122183215
5 | 6 | 601355962001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201107000936 | 20201107175349
6 | 7 | 601355962001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201107000936 | 20201107182106
7 | 13058 | nankai | 44 | 大阪府泉南市泉南郡田尻町空港中1 | 오사카부 센난시 센난군 | n・e・s・t関西空港店 | 20200323163255 | 20200323163643
8 | 9 | 601356277001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201126170421 | 20201126171610
9 | 10 | 601356277001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201126170421 | 20201126180033

Last rows
index | seq_no | cstmr_id | cstmr_visit_co | visit_area_addr | visit_area_gugun_klang_nm | crtfc_str_nm | goods_online_sle_dt | str_visit_dt
90 | 91 | 601371180001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201128174335 | 20201128182007
91 | 92 | 601371180001 | 3 | 京都市左京区上高野東山55 | 교토부 교토시 사쿄구 | 京都八瀬 瑠璃光院 | 20201128174335 | 20201128192305
92 | 93 | 601372041001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201129113023 | 20201129172458
93 | 94 | 601372041001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201129113023 | 20201129180222
94 | 95 | 601372041001 | 3 | 京都市左京区上高野東山55 | 교토부 교토시 사쿄구 | 京都八瀬 瑠璃光院 | 20201129113023 | 20201129184430
95 | 96 | 601372470001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201115102122 | 20201115160100
96 | 97 | 601372470001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201115102122 | 20201115162705
97 | 98 | 601372470001 | 3 | 京都市左京区上高野東山55 | 교토부 교토시 사쿄구 | 京都八瀬 瑠璃光院 | 20201115102122 | 20201115182237
98 | 99 | 601372976001 | 3 | 京都市左京区田中上柳町25-3 | 교토부 교토시 사쿄구 | 叡山電鉄 | 20201122142250 | 20201122154934
99 | 100 | 601372976001 | 3 | 京都市中京区壬生賀陽御所町3番地20 | 교토부 교토시 나카교구 | 京福電気鉄道(鋼索係事務所) | 20201122142250 | 20201122161319