Overview

Dataset statistics

Number of variables: 33
Number of observations: 100
Missing cells: 1563
Missing cells (%): 47.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.5 KiB
Average record size in memory: 281.3 B
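The missing-cell percentage follows directly from the three counts above. A minimal sketch of the arithmetic, with variable names chosen only for illustration:

```python
# Values copied from the "Dataset statistics" block.
n_variables = 33
n_observations = 100
missing_cells = 1563

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Nearly half of all cells are empty, which is expected here because most patients are prescribed only a few of the eight drug classes.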

Variable types

Text: 17
Numeric: 7
Categorical: 9

Dataset

Description: Prescription drug codes for patients with diabetes, along with the first and last prescription dates. Codes by drug class:
sulfonylurea (RxNorm codes: 1597772, 1597758, 1597773, 19101729, 21133671, 19059797)
sulfonylurea+metformin (42953698, 42953917, 42953740)
meglitinide (19023425, 19023424, 19023426, 42962884, 19107111, 19107110, 1502829)
metformin (19106521, 40164929, 40164946, 40164897, 40164894, 40164925)
TZD (1525221, 19079293, 42960773)
DPP4i (19125041, 40239218, 43013911, 43013924, 42960599, 42961500)
DPP4i-MET (40164922, 42708088, 42708090, 42708086)
Insulin (46234044, 35782236, 35779361, 41348914, 35786039, 36809748, 42920572, 46234044, 41370419, 41349142, 46234044, 35782557, 35159339, 35781503, 35781503, 46234044, 46234044, 586875, 35781503, 35781503, 46234044, 41348508, 40717097, 35779506, 40755064, 42921713)
Author: 가톨릭대학교 서울성모병원 (The Catholic University of Korea, Seoul St. Mary's Hospital)
URL: http://cmcdata.net/data/dataset/diabetes_pre

Alerts

SU-MET_f_prcd is highly imbalanced (64.1%) [Imbalance]
SU-MET_l_prcd is highly imbalanced (71.9%) [Imbalance]
Meg_f_prcd is highly imbalanced (64.3%) [Imbalance]
Meg_l_prcd is highly imbalanced (70.0%) [Imbalance]
TZD_f_prcd is highly imbalanced (70.1%) [Imbalance]
TZD_l_prcd is highly imbalanced (72.1%) [Imbalance]
DPP4i-MET_f_prcd is highly imbalanced (62.6%) [Imbalance]
DPP4i-MET_l_prcd is highly imbalanced (71.9%) [Imbalance]
SU_f_date has 62 (62.0%) missing values [Missing]
SU_f_prcd has 62 (62.0%) missing values [Missing]
SU_l_date has 73 (73.0%) missing values [Missing]
SU-MET_f_date has 90 (90.0%) missing values [Missing]
SU-MET_l_date has 92 (92.0%) missing values [Missing]
Meg_f_date has 85 (85.0%) missing values [Missing]
Meg_l_date has 88 (88.0%) missing values [Missing]
Met_f_date has 34 (34.0%) missing values [Missing]
Met_f_prcd has 34 (34.0%) missing values [Missing]
Met_l_date has 45 (45.0%) missing values [Missing]
Met_l_prcd has 45 (45.0%) missing values [Missing]
TZD_f_date has 90 (90.0%) missing values [Missing]
TZD_l_date has 91 (91.0%) missing values [Missing]
DPP4i_f_date has 54 (54.0%) missing values [Missing]
DPP4i_f_prcd has 54 (54.0%) missing values [Missing]
DPP4i_l_date has 64 (64.0%) missing values [Missing]
DPP4i_l_prcd has 64 (64.0%) missing values [Missing]
DPP4i-MET_f_date has 87 (87.0%) missing values [Missing]
DPP4i-MET_l_date has 91 (91.0%) missing values [Missing]
Insul_f_date has 62 (62.0%) missing values [Missing]
Insul_f_prcd has 62 (62.0%) missing values [Missing]
Insul_l_date has 67 (67.0%) missing values [Missing]
Insul_l_prcd has 67 (67.0%) missing values [Missing]
RID has unique values [Unique]
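The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the category shares. This is an inference checked only against the counts listed in this report (for example SU-MET_f_prcd with 90/5/5 rows), not a statement of the profiler's documented method. The function name `imbalance` below is hypothetical:

```python
import math

def imbalance(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Category counts as listed in this report; the literal "<NA>" string is one category.
su_met_f = [90, 5, 5]          # SU-MET_f_prcd
su_met_l = [92, 7, 1]          # SU-MET_l_prcd
meg_f = [85, 5, 5, 2, 2, 1]    # Meg_f_prcd

for name, counts in [("SU-MET_f_prcd", su_met_f), ("SU-MET_l_prcd", su_met_l), ("Meg_f_prcd", meg_f)]:
    print(f"{name}: {100 * imbalance(counts):.1f}%")
```

A score near 100% means one category dominates. In these columns the dominant category is the `<NA>` placeholder, so the imbalance alerts largely restate the missingness of the matching date columns.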

Reproduction

Analysis started: 2023-10-08 18:57:22.504108
Analysis finished: 2023-10-08 18:57:23.959959
Duration: 1.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000001
2nd row: R0000002
3rd row: R0000003
4th row: R0000004
5th row: R0000005

Value | Count | Frequency (%)
r0000001 | 1 | 1.0%
r0000063 | 1 | 1.0%
r0000074 | 1 | 1.0%
r0000073 | 1 | 1.0%
r0000072 | 1 | 1.0%
r0000071 | 1 | 1.0%
r0000070 | 1 | 1.0%
r0000069 | 1 | 1.0%
r0000068 | 1 | 1.0%
r0000067 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

ValueCountFrequency (%)
0 519
64.9%
R 100
 
12.5%
1 21
 
2.6%
3 20
 
2.5%
4 20
 
2.5%
5 20
 
2.5%
6 20
 
2.5%
7 20
 
2.5%
8 20
 
2.5%
9 20
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 700
87.5%
Uppercase Letter 100
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 519
74.1%
1 21
 
3.0%
3 20
 
2.9%
4 20
 
2.9%
5 20
 
2.9%
6 20
 
2.9%
7 20
 
2.9%
8 20
 
2.9%
9 20
 
2.9%
2 20
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
R 100
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 700
87.5%
Latin 100
 
12.5%

Most frequent character per script

Common
ValueCountFrequency (%)
0 519
74.1%
1 21
 
3.0%
3 20
 
2.9%
4 20
 
2.9%
5 20
 
2.9%
6 20
 
2.9%
7 20
 
2.9%
8 20
 
2.9%
9 20
 
2.9%
2 20
 
2.9%
Latin
ValueCountFrequency (%)
R 100
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 519
64.9%
R 100
 
12.5%
1 21
 
2.6%
3 20
 
2.5%
4 20
 
2.5%
5 20
 
2.5%
6 20
 
2.5%
7 20
 
2.5%
8 20
 
2.5%
9 20
 
2.5%

SU_f_date
Text

MISSING 

Distinct: 29
Distinct (%): 76.3%
Missing: 62
Missing (%): 62.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 228
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22
Unique (%): 57.9%

Sample

1st row: Sep-09
2nd row: Aug-14
3rd row: Feb-12
4th row: Nov-14
5th row: Aug-15
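The date columns are profiled as Text because values such as Sep-09 are stored as strings. A minimal sketch of converting them to real dates, assuming the two-digit suffix is a year (2009, 2014, and so on) rather than a day of the month, and assuming English month abbreviations under the default locale. The helper name `parse_month_year` is hypothetical:

```python
from datetime import datetime

def parse_month_year(value):
    """Parse a 'Mon-YY' string such as 'Sep-09'; return None for missing values."""
    if value is None or value in ("", "<NA>"):
        return None
    # %b = abbreviated month name, %y = two-digit year; the day defaults to 1.
    return datetime.strptime(value, "%b-%y").date()

# Sample rows from SU_f_date, plus one missing value.
samples = ["Sep-09", "Aug-14", "Feb-12", "Nov-14", "Aug-15", None]
parsed = [parse_month_year(s) for s in samples]

for raw, day in zip(samples, parsed):
    print(raw, "->", day)
```

Once parsed, the first and last prescription dates can be compared or subtracted, which is not possible with the text form.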
Value | Count | Frequency (%)
sep-09 | 3 | 7.9%
jul-09 | 3 | 7.9%
mar-14 | 2 | 5.3%
jan-12 | 2 | 5.3%
dec-18 | 2 | 5.3%
mar-19 | 2 | 5.3%
feb-10 | 2 | 5.3%
dec-14 | 1 | 2.6%
dec-15 | 1 | 2.6%
oct-17 | 1 | 2.6%
Other values (19) | 19 | 50.0%

Most occurring characters

ValueCountFrequency (%)
- 38
16.7%
1 31
 
13.6%
e 12
 
5.3%
0 11
 
4.8%
9 10
 
4.4%
u 10
 
4.4%
J 8
 
3.5%
c 8
 
3.5%
p 7
 
3.1%
a 7
 
3.1%
Other values (22) 86
37.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 76
33.3%
Lowercase Letter 76
33.3%
Dash Punctuation 38
16.7%
Uppercase Letter 38
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 12
15.8%
u 10
13.2%
c 8
10.5%
p 7
9.2%
a 7
9.2%
r 7
9.2%
g 4
 
5.3%
t 4
 
5.3%
n 4
 
5.3%
l 4
 
5.3%
Other values (3) 9
11.8%
Decimal Number
ValueCountFrequency (%)
1 31
40.8%
0 11
 
14.5%
9 10
 
13.2%
8 6
 
7.9%
4 6
 
7.9%
2 3
 
3.9%
5 3
 
3.9%
3 3
 
3.9%
7 2
 
2.6%
6 1
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
J 8
21.1%
A 6
15.8%
S 5
13.2%
M 5
13.2%
O 4
10.5%
D 4
10.5%
N 3
 
7.9%
F 3
 
7.9%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 114
50.0%
Latin 114
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 12
 
10.5%
u 10
 
8.8%
J 8
 
7.0%
c 8
 
7.0%
p 7
 
6.1%
a 7
 
6.1%
r 7
 
6.1%
A 6
 
5.3%
S 5
 
4.4%
M 5
 
4.4%
Other values (11) 39
34.2%
Common
ValueCountFrequency (%)
- 38
33.3%
1 31
27.2%
0 11
 
9.6%
9 10
 
8.8%
8 6
 
5.3%
4 6
 
5.3%
2 3
 
2.6%
5 3
 
2.6%
3 3
 
2.6%
7 2
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 228
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 38
16.7%
1 31
 
13.6%
e 12
 
5.3%
0 11
 
4.8%
9 10
 
4.4%
u 10
 
4.4%
J 8
 
3.5%
c 8
 
3.5%
p 7
 
3.1%
a 7
 
3.1%
Other values (22) 86
37.7%

SU_f_prcd
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 15.8%
Missing: 62
Missing (%): 62.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10155377
Minimum: 1597758
Maximum: 21133671
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1597758
5-th percentile: 1597772
Q1: 1597772
Median: 1597773
Q3: 19101729
95-th percentile: 21133671
Maximum: 21133671
Range: 19535913
Interquartile range (IQR): 17503957

Descriptive statistics

Standard deviation: 9163680.1
Coefficient of variation (CV): 0.9023476
Kurtosis: -2.0736725
Mean: 10155377
Median Absolute Deviation (MAD): 8
Skewness: 0.12532045
Sum: 3.8590433 × 10^8
Variance: 8.3973033 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
1597772 | 16 | 16.0%
19101729 | 12 | 12.0%
21133671 | 5 | 5.0%
1597773 | 3 | 3.0%
19059797 | 1 | 1.0%
1597758 | 1 | 1.0%
(Missing) | 62 | 62.0%

Value | Count | Frequency (%)
1597758 | 1 | 1.0%
1597772 | 16 | 16.0%
1597773 | 3 | 3.0%
19059797 | 1 | 1.0%
19101729 | 12 | 12.0%
21133671 | 5 | 5.0%

Value | Count | Frequency (%)
21133671 | 5 | 5.0%
19101729 | 12 | 12.0%
19059797 | 1 | 1.0%
1597773 | 3 | 3.0%
1597772 | 16 | 16.0%
1597758 | 1 | 1.0%
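The summary statistics for SU_f_prcd can be rebuilt from its frequency table. The sketch below expands the six value counts into 38 observations and recomputes the sum, mean, median, and median absolute deviation. Because the values are drug codes (identifiers), these figures describe how the codes are distributed as numbers and carry no clinical meaning; treating the column as categorical would likely be more informative.

```python
import statistics

# Value -> count, copied from the SU_f_prcd frequency table (38 non-missing rows).
counts = {1597772: 16, 19101729: 12, 21133671: 5, 1597773: 3, 19059797: 1, 1597758: 1}

# Expand the table back into one entry per observation.
values = [value for value, n in counts.items() for _ in range(n)]

total = sum(values)
mean = total / len(values)
median = statistics.median(values)
# MAD: median of the absolute deviations from the median.
mad = statistics.median(abs(v - median) for v in values)

print(len(values), total, round(mean), median, mad)
```

The tiny MAD next to a standard deviation in the millions reflects two tight clusters of codes (around 1.6 million and around 19 to 21 million), not genuine numeric spread.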

SU_l_date
Text

MISSING 

Distinct: 21
Distinct (%): 77.8%
Missing: 73
Missing (%): 73.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 162
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 59.3%

Sample

1st row: Mar-16
2nd row: Jun-19
3rd row: Jan-15
4th row: Nov-16
5th row: Jul-14

Value | Count | Frequency (%)
jul-18 | 3 | 11.1%
apr-19 | 2 | 7.4%
mar-17 | 2 | 7.4%
dec-18 | 2 | 7.4%
jun-19 | 2 | 7.4%
mar-19 | 1 | 3.7%
apr-13 | 1 | 3.7%
dec-12 | 1 | 3.7%
feb-19 | 1 | 3.7%
jun-17 | 1 | 3.7%
Other values (11) | 11 | 40.7%

Most occurring characters

ValueCountFrequency (%)
1 28
17.3%
- 27
16.7%
J 9
 
5.6%
u 8
 
4.9%
a 7
 
4.3%
r 7
 
4.3%
9 6
 
3.7%
e 6
 
3.7%
8 6
 
3.7%
n 5
 
3.1%
Other values (20) 53
32.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 54
33.3%
Lowercase Letter 54
33.3%
Dash Punctuation 27
16.7%
Uppercase Letter 27
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 8
14.8%
a 7
13.0%
r 7
13.0%
e 6
11.1%
n 5
9.3%
l 4
7.4%
p 3
 
5.6%
b 3
 
5.6%
v 3
 
5.6%
o 3
 
5.6%
Other values (3) 5
9.3%
Decimal Number
ValueCountFrequency (%)
1 28
51.9%
9 6
 
11.1%
8 6
 
11.1%
3 3
 
5.6%
7 3
 
5.6%
5 2
 
3.7%
6 2
 
3.7%
4 2
 
3.7%
0 1
 
1.9%
2 1
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
J 9
33.3%
M 5
18.5%
A 4
14.8%
F 3
 
11.1%
N 3
 
11.1%
D 3
 
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 81
50.0%
Latin 81
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
J 9
 
11.1%
u 8
 
9.9%
a 7
 
8.6%
r 7
 
8.6%
e 6
 
7.4%
n 5
 
6.2%
M 5
 
6.2%
A 4
 
4.9%
l 4
 
4.9%
p 3
 
3.7%
Other values (9) 23
28.4%
Common
ValueCountFrequency (%)
1 28
34.6%
- 27
33.3%
9 6
 
7.4%
8 6
 
7.4%
3 3
 
3.7%
7 3
 
3.7%
5 2
 
2.5%
6 2
 
2.5%
4 2
 
2.5%
0 1
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 162
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 28
17.3%
- 27
16.7%
J 9
 
5.6%
u 8
 
4.9%
a 7
 
4.3%
r 7
 
4.3%
9 6
 
3.7%
e 6
 
3.7%
8 6
 
3.7%
n 5
 
3.1%
Other values (20) 53
32.7%

SU_l_prcd
Categorical

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.96
Min length: 4

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: <NA>
2nd row: 19101729
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 73 | 73.0%
1597772 | 9 | 9.0%
19101729 | 7 | 7.0%
21133671 | 7 | 7.0%
1597773 | 3 | 3.0%
19059797 | 1 | 1.0%
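SU_l_prcd reports 0 missing values, yet 73 of its 100 rows hold the value `<NA>`. The length statistics suggest that `<NA>` is being counted as a literal four-character string: the reported mean length of 4.96 is reproduced only if those 73 rows contribute 4 characters each. A short check, using the counts from the Common Values table:

```python
# Value -> count, copied from the SU_l_prcd "Common Values" table.
common_values = {
    "<NA>": 73,
    "1597772": 9,
    "19101729": 7,
    "21133671": 7,
    "1597773": 3,
    "19059797": 1,
}

n_rows = sum(common_values.values())
# Weighted mean of string lengths, counting "<NA>" as an ordinary 4-character string.
mean_length = sum(len(value) * n for value, n in common_values.items()) / n_rows
# Rows that are effectively missing even though the profile reports none.
effectively_missing = common_values.get("<NA>", 0)

print(n_rows, mean_length, effectively_missing)
```

The 73 placeholder rows match the 73 missing values of SU_l_date, so this column is probably best read as 73% missing. That reading is an interpretation of the report, not something the report states.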

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 73 | 73.0%
1597772 | 9 | 9.0%
19101729 | 7 | 7.0%
21133671 | 7 | 7.0%
1597773 | 3 | 3.0%
19059797 | 1 | 1.0%

SU-MET_f_date
Text

MISSING 

Distinct: 9
Distinct (%): 90.0%
Missing: 90
Missing (%): 90.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 60
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 80.0%

Sample

1st row: Feb-14
2nd row: Jun-16
3rd row: Feb-13
4th row: Nov-17
5th row: Jul-16

Value | Count | Frequency (%)
apr-13 | 2 | 20.0%
feb-14 | 1 | 10.0%
jun-16 | 1 | 10.0%
feb-13 | 1 | 10.0%
nov-17 | 1 | 10.0%
jul-16 | 1 | 10.0%
dec-14 | 1 | 10.0%
jan-18 | 1 | 10.0%
jun-14 | 1 | 10.0%

Most occurring characters

ValueCountFrequency (%)
- 10
16.7%
1 10
16.7%
J 4
 
6.7%
u 3
 
5.0%
3 3
 
5.0%
n 3
 
5.0%
e 3
 
5.0%
4 3
 
5.0%
6 2
 
3.3%
p 2
 
3.3%
Other values (13) 17
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20
33.3%
Lowercase Letter 20
33.3%
Dash Punctuation 10
16.7%
Uppercase Letter 10
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 3
15.0%
n 3
15.0%
e 3
15.0%
p 2
10.0%
b 2
10.0%
r 2
10.0%
o 1
 
5.0%
v 1
 
5.0%
l 1
 
5.0%
c 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 10
50.0%
3 3
 
15.0%
4 3
 
15.0%
6 2
 
10.0%
7 1
 
5.0%
8 1
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
J 4
40.0%
A 2
20.0%
F 2
20.0%
N 1
 
10.0%
D 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30
50.0%
Latin 30
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
J 4
13.3%
u 3
10.0%
n 3
10.0%
e 3
10.0%
p 2
 
6.7%
A 2
 
6.7%
b 2
 
6.7%
F 2
 
6.7%
r 2
 
6.7%
N 1
 
3.3%
Other values (6) 6
20.0%
Common
ValueCountFrequency (%)
- 10
33.3%
1 10
33.3%
3 3
 
10.0%
4 3
 
10.0%
6 2
 
6.7%
7 1
 
3.3%
8 1
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 10
16.7%
1 10
16.7%
J 4
 
6.7%
u 3
 
5.0%
3 3
 
5.0%
n 3
 
5.0%
e 3
 
5.0%
4 3
 
5.0%
6 2
 
3.3%
p 2
 
3.3%
Other values (13) 17
28.3%

SU-MET_f_prcd
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 90 | 90.0%
42953917 | 5 | 5.0%
42953740 | 5 | 5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 90 | 90.0%
42953917 | 5 | 5.0%
42953740 | 5 | 5.0%

SU-MET_l_date
Text

MISSING 

Distinct: 7
Distinct (%): 87.5%
Missing: 92
Missing (%): 92.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 48
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 75.0%

Sample

1st row: Feb-18
2nd row: Aug-13
3rd row: Feb-18
4th row: Sep-16
5th row: Dec-15

Value | Count | Frequency (%)
feb-18 | 2 | 25.0%
aug-13 | 1 | 12.5%
sep-16 | 1 | 12.5%
dec-15 | 1 | 12.5%
dec-18 | 1 | 12.5%
may-18 | 1 | 12.5%
jun-19 | 1 | 12.5%

Most occurring characters

ValueCountFrequency (%)
- 8
16.7%
1 8
16.7%
e 5
10.4%
8 4
 
8.3%
F 2
 
4.2%
b 2
 
4.2%
u 2
 
4.2%
D 2
 
4.2%
c 2
 
4.2%
5 1
 
2.1%
Other values (12) 12
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
33.3%
Lowercase Letter 16
33.3%
Dash Punctuation 8
16.7%
Uppercase Letter 8
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
31.2%
b 2
 
12.5%
u 2
 
12.5%
c 2
 
12.5%
n 1
 
6.2%
y 1
 
6.2%
a 1
 
6.2%
p 1
 
6.2%
g 1
 
6.2%
Decimal Number
ValueCountFrequency (%)
1 8
50.0%
8 4
25.0%
5 1
 
6.2%
6 1
 
6.2%
3 1
 
6.2%
9 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
F 2
25.0%
D 2
25.0%
J 1
12.5%
M 1
12.5%
S 1
12.5%
A 1
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24
50.0%
Latin 24
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
20.8%
F 2
 
8.3%
b 2
 
8.3%
u 2
 
8.3%
D 2
 
8.3%
c 2
 
8.3%
n 1
 
4.2%
J 1
 
4.2%
y 1
 
4.2%
a 1
 
4.2%
Other values (5) 5
20.8%
Common
ValueCountFrequency (%)
- 8
33.3%
1 8
33.3%
8 4
16.7%
5 1
 
4.2%
6 1
 
4.2%
3 1
 
4.2%
9 1
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 8
16.7%
1 8
16.7%
e 5
10.4%
8 4
 
8.3%
F 2
 
4.2%
b 2
 
4.2%
u 2
 
4.2%
D 2
 
4.2%
c 2
 
4.2%
5 1
 
2.1%
Other values (12) 12
25.0%

SU-MET_l_prcd
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.32
Min length: 4

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 92 | 92.0%
42953740 | 7 | 7.0%
42953917 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 92 | 92.0%
42953740 | 7 | 7.0%
42953917 | 1 | 1.0%

Meg_f_date
Text

MISSING 

Distinct: 10
Distinct (%): 66.7%
Missing: 85
Missing (%): 85.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 90
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 40.0%

Sample

1st row: Jul-09
2nd row: Mar-10
3rd row: Aug-14
4th row: Feb-12
5th row: May-12

Value | Count | Frequency (%)
jul-09 | 3 | 20.0%
apr-10 | 2 | 13.3%
aug-09 | 2 | 13.3%
sep-09 | 2 | 13.3%
mar-10 | 1 | 6.7%
aug-14 | 1 | 6.7%
feb-12 | 1 | 6.7%
may-12 | 1 | 6.7%
nov-10 | 1 | 6.7%
dec-11 | 1 | 6.7%

Most occurring characters

ValueCountFrequency (%)
- 15
16.7%
0 11
12.2%
1 9
 
10.0%
9 7
 
7.8%
u 6
 
6.7%
A 5
 
5.6%
e 4
 
4.4%
p 4
 
4.4%
g 3
 
3.3%
J 3
 
3.3%
Other values (15) 23
25.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 30
33.3%
Lowercase Letter 30
33.3%
Dash Punctuation 15
16.7%
Uppercase Letter 15
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 6
20.0%
e 4
13.3%
p 4
13.3%
g 3
10.0%
r 3
10.0%
l 3
10.0%
a 2
 
6.7%
b 1
 
3.3%
y 1
 
3.3%
o 1
 
3.3%
Other values (2) 2
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
A 5
33.3%
J 3
20.0%
S 2
 
13.3%
M 2
 
13.3%
F 1
 
6.7%
N 1
 
6.7%
D 1
 
6.7%
Decimal Number
ValueCountFrequency (%)
0 11
36.7%
1 9
30.0%
9 7
23.3%
2 2
 
6.7%
4 1
 
3.3%
Dash Punctuation
ValueCountFrequency (%)
- 15
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 45
50.0%
Latin 45
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 6
13.3%
A 5
11.1%
e 4
 
8.9%
p 4
 
8.9%
g 3
 
6.7%
J 3
 
6.7%
r 3
 
6.7%
l 3
 
6.7%
S 2
 
4.4%
M 2
 
4.4%
Other values (9) 10
22.2%
Common
ValueCountFrequency (%)
- 15
33.3%
0 11
24.4%
1 9
20.0%
9 7
15.6%
2 2
 
4.4%
4 1
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 15
16.7%
0 11
12.2%
1 9
 
10.0%
9 7
 
7.8%
u 6
 
6.7%
A 5
 
5.6%
e 4
 
4.4%
p 4
 
4.4%
g 3
 
3.3%
J 3
 
3.3%
Other values (15) 23
25.6%

Meg_f_prcd
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.58
Min length: 4

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 85 | 85.0%
19107111 | 5 | 5.0%
19107110 | 5 | 5.0%
42962884 | 2 | 2.0%
1502829 | 2 | 2.0%
19023425 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 85 | 85.0%
19107111 | 5 | 5.0%
19107110 | 5 | 5.0%
42962884 | 2 | 2.0%
1502829 | 2 | 2.0%
19023425 | 1 | 1.0%

Meg_l_date
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 88
Missing (%): 88.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 72
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 100.0%

Sample

1st row: Mar-10
2nd row: Oct-14
3rd row: Jun-14
4th row: Nov-12
5th row: Jul-15

Value | Count | Frequency (%)
mar-10 | 1 | 8.3%
oct-14 | 1 | 8.3%
jun-14 | 1 | 8.3%
nov-12 | 1 | 8.3%
jul-15 | 1 | 8.3%
feb-15 | 1 | 8.3%
oct-11 | 1 | 8.3%
oct-10 | 1 | 8.3%
aug-12 | 1 | 8.3%
jun-15 | 1 | 8.3%
Other values (2) | 2 | 16.7%

Most occurring characters

ValueCountFrequency (%)
1 13
18.1%
- 12
16.7%
J 4
 
5.6%
u 4
 
5.6%
n 3
 
4.2%
2 3
 
4.2%
O 3
 
4.2%
c 3
 
4.2%
t 3
 
4.2%
5 3
 
4.2%
Other values (15) 21
29.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 24
33.3%
Lowercase Letter 24
33.3%
Dash Punctuation 12
16.7%
Uppercase Letter 12
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 4
16.7%
n 3
12.5%
c 3
12.5%
t 3
12.5%
b 2
8.3%
e 2
8.3%
a 2
8.3%
g 1
 
4.2%
l 1
 
4.2%
v 1
 
4.2%
Other values (2) 2
8.3%
Decimal Number
ValueCountFrequency (%)
1 13
54.2%
2 3
 
12.5%
5 3
 
12.5%
4 2
 
8.3%
0 2
 
8.3%
3 1
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
J 4
33.3%
O 3
25.0%
F 2
16.7%
A 1
 
8.3%
M 1
 
8.3%
N 1
 
8.3%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36
50.0%
Latin 36
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
J 4
11.1%
u 4
11.1%
n 3
 
8.3%
O 3
 
8.3%
c 3
 
8.3%
t 3
 
8.3%
b 2
 
5.6%
e 2
 
5.6%
a 2
 
5.6%
F 2
 
5.6%
Other values (8) 8
22.2%
Common
ValueCountFrequency (%)
1 13
36.1%
- 12
33.3%
2 3
 
8.3%
5 3
 
8.3%
4 2
 
5.6%
0 2
 
5.6%
3 1
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 13
18.1%
- 12
16.7%
J 4
 
5.6%
u 4
 
5.6%
n 3
 
4.2%
2 3
 
4.2%
O 3
 
4.2%
c 3
 
4.2%
t 3
 
4.2%
5 3
 
4.2%
Other values (15) 21
29.2%

Meg_l_prcd
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.43
Min length: 4

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 88 | 88.0%
1502829 | 5 | 5.0%
19107110 | 3 | 3.0%
19107111 | 2 | 2.0%
42962884 | 1 | 1.0%
19023425 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 88 | 88.0%
1502829 | 5 | 5.0%
19107110 | 3 | 3.0%
19107111 | 2 | 2.0%
42962884 | 1 | 1.0%
19023425 | 1 | 1.0%

Met_f_date
Text

MISSING 

Distinct: 41
Distinct (%): 62.1%
Missing: 34
Missing (%): 34.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 396
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 48.5%

Sample

1st row: Apr-17
2nd row: Feb-14
3rd row: Oct-11
4th row: Jul-10
5th row: Jul-14

Value | Count | Frequency (%)
jul-09 | 9 | 13.6%
sep-09 | 5 | 7.6%
oct-09 | 5 | 7.6%
aug-09 | 4 | 6.1%
apr-10 | 3 | 4.5%
mar-14 | 2 | 3.0%
jul-10 | 2 | 3.0%
jan-12 | 2 | 3.0%
oct-14 | 2 | 3.0%
nov-16 | 1 | 1.5%
Other values (31) | 31 | 47.0%

Most occurring characters

ValueCountFrequency (%)
- 66
16.7%
1 45
 
11.4%
0 35
 
8.8%
9 26
 
6.6%
u 25
 
6.3%
J 23
 
5.8%
l 15
 
3.8%
e 14
 
3.5%
p 13
 
3.3%
A 13
 
3.3%
Other values (22) 121
30.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 132
33.3%
Lowercase Letter 132
33.3%
Dash Punctuation 66
16.7%
Uppercase Letter 66
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 25
18.9%
l 15
11.4%
e 14
10.6%
p 13
9.8%
c 13
9.8%
a 10
 
7.6%
t 9
 
6.8%
r 9
 
6.8%
n 8
 
6.1%
g 7
 
5.3%
Other values (4) 9
 
6.8%
Decimal Number
ValueCountFrequency (%)
1 45
34.1%
0 35
26.5%
9 26
19.7%
4 10
 
7.6%
2 6
 
4.5%
5 3
 
2.3%
6 3
 
2.3%
7 2
 
1.5%
8 2
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
J 23
34.8%
A 13
19.7%
O 9
 
13.6%
S 7
 
10.6%
M 5
 
7.6%
D 4
 
6.1%
F 3
 
4.5%
N 2
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 66
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 198
50.0%
Latin 198
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 25
12.6%
J 23
11.6%
l 15
 
7.6%
e 14
 
7.1%
p 13
 
6.6%
A 13
 
6.6%
c 13
 
6.6%
a 10
 
5.1%
t 9
 
4.5%
r 9
 
4.5%
Other values (12) 54
27.3%
Common
ValueCountFrequency (%)
- 66
33.3%
1 45
22.7%
0 35
17.7%
9 26
 
13.1%
4 10
 
5.1%
2 6
 
3.0%
5 3
 
1.5%
6 3
 
1.5%
7 2
 
1.0%
8 2
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 396
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 66
16.7%
1 45
 
11.4%
0 35
 
8.8%
9 26
 
6.6%
u 25
 
6.3%
J 23
 
5.8%
l 15
 
3.8%
e 14
 
3.5%
p 13
 
3.3%
A 13
 
3.3%
Other values (22) 121
30.6%

Met_f_prcd
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 9.1%
Missing: 34
Missing (%): 34.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37293325
Minimum: 19106521
Maximum: 40164946
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 19106521
5-th percentile: 19106521
Q1: 40164897
Median: 40164929
Q3: 40164929
95-th percentile: 40164946
Maximum: 40164946
Range: 21058425
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 7282081.1
Coefficient of variation (CV): 0.195265
Kurtosis: 2.7875244
Mean: 37293325
Median Absolute Deviation (MAD): 17
Skewness: -2.1688585
Sum: 2.4613595 × 10^9
Variance: 5.3028705 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
40164929 | 20 | 20.0%
40164946 | 14 | 14.0%
40164925 | 12 | 12.0%
40164897 | 10 | 10.0%
19106521 | 9 | 9.0%
40164894 | 1 | 1.0%
(Missing) | 34 | 34.0%

Value | Count | Frequency (%)
19106521 | 9 | 9.0%
40164894 | 1 | 1.0%
40164897 | 10 | 10.0%
40164925 | 12 | 12.0%
40164929 | 20 | 20.0%
40164946 | 14 | 14.0%

Value | Count | Frequency (%)
40164946 | 14 | 14.0%
40164929 | 20 | 20.0%
40164925 | 12 | 12.0%
40164897 | 10 | 10.0%
40164894 | 1 | 1.0%
19106521 | 9 | 9.0%

Met_l_date
Text

MISSING 

Distinct: 37
Distinct (%): 67.3%
Missing: 45
Missing (%): 45.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 330
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 50.9%

Sample

1st row: Feb-18
2nd row: Feb-15
3rd row: Mar-12
4th row: Aug-15
5th row: Nov-11

Value | Count | Frequency (%)
apr-19 | 5 | 9.1%
may-19 | 4 | 7.3%
feb-15 | 3 | 5.5%
mar-12 | 3 | 5.5%
aug-15 | 3 | 5.5%
jun-19 | 3 | 5.5%
feb-18 | 2 | 3.6%
feb-19 | 2 | 3.6%
mar-19 | 2 | 3.6%
oct-13 | 1 | 1.8%
Other values (27) | 27 | 49.1%

Most occurring characters

ValueCountFrequency (%)
1 61
18.5%
- 55
16.7%
9 17
 
5.2%
e 15
 
4.5%
u 13
 
3.9%
a 13
 
3.9%
J 12
 
3.6%
r 12
 
3.6%
A 11
 
3.3%
p 10
 
3.0%
Other values (23) 111
33.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 110
33.3%
Lowercase Letter 110
33.3%
Dash Punctuation 55
16.7%
Uppercase Letter 55
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 15
13.6%
u 13
11.8%
a 13
11.8%
r 12
10.9%
p 10
9.1%
n 9
8.2%
b 8
7.3%
c 7
6.4%
y 5
 
4.5%
g 4
 
3.6%
Other values (4) 14
12.7%
Decimal Number
ValueCountFrequency (%)
1 61
55.5%
9 17
 
15.5%
5 9
 
8.2%
2 4
 
3.6%
8 4
 
3.6%
0 3
 
2.7%
7 3
 
2.7%
6 3
 
2.7%
4 3
 
2.7%
3 3
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
J 12
21.8%
A 11
20.0%
M 10
18.2%
F 8
14.5%
D 4
 
7.3%
N 4
 
7.3%
O 3
 
5.5%
S 3
 
5.5%
Dash Punctuation
ValueCountFrequency (%)
- 55
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 165
50.0%
Latin 165
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 15
 
9.1%
u 13
 
7.9%
a 13
 
7.9%
J 12
 
7.3%
r 12
 
7.3%
A 11
 
6.7%
p 10
 
6.1%
M 10
 
6.1%
n 9
 
5.5%
F 8
 
4.8%
Other values (12) 52
31.5%
Common
ValueCountFrequency (%)
1 61
37.0%
- 55
33.3%
9 17
 
10.3%
5 9
 
5.5%
2 4
 
2.4%
8 4
 
2.4%
0 3
 
1.8%
7 3
 
1.8%
6 3
 
1.8%
4 3
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 330
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 61
18.5%
- 55
16.7%
9 17
 
5.2%
e 15
 
4.5%
u 13
 
3.9%
a 13
 
3.9%
J 12
 
3.6%
r 12
 
3.6%
A 11
 
3.3%
p 10
 
3.0%
Other values (23) 111
33.6%

Met_l_prcd
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 10.9%
Missing: 45
Missing (%): 45.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38633410
Minimum: 19106521
Maximum: 40164946
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 19106521
5-th percentile: 19106521
Q1: 40164925
Median: 40164929
Q3: 40164946
95-th percentile: 40164946
Maximum: 40164946
Range: 21058425
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 5519025.8
Coefficient of variation (CV): 0.1428563
Kurtosis: 9.8044907
Mean: 38633410
Median Absolute Deviation (MAD): 17
Skewness: -3.3836476
Sum: 2.1248375 × 10^9
Variance: 3.0459646 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
40164929 | 20 | 20.0%
40164946 | 19 | 19.0%
40164897 | 6 | 6.0%
40164925 | 5 | 5.0%
19106521 | 4 | 4.0%
40164894 | 1 | 1.0%
(Missing) | 45 | 45.0%

Value | Count | Frequency (%)
19106521 | 4 | 4.0%
40164894 | 1 | 1.0%
40164897 | 6 | 6.0%
40164925 | 5 | 5.0%
40164929 | 20 | 20.0%
40164946 | 19 | 19.0%

Value | Count | Frequency (%)
40164946 | 19 | 19.0%
40164929 | 20 | 20.0%
40164925 | 5 | 5.0%
40164897 | 6 | 6.0%
40164894 | 1 | 1.0%
19106521 | 4 | 4.0%

TZD_f_date
Text

MISSING 

Distinct9
Distinct (%)90.0%
Missing90
Missing (%)90.0%
Memory size932.0 B
2023-10-09T03:57:39.686539image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters60
Distinct characters26
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8
Unique (%)80.0%

Sample

1st rowJun-16
2nd rowAug-17
3rd rowJun-16
4th rowMar-19
5th rowDec-17
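
The date columns are profiled as text because the values are stored as strings such as Jun-16. A minimal sketch of converting them to dates follows, assuming the format is abbreviated English month followed by a two-digit year (an interpretation, not something the report states); `parse_month_year` is a hypothetical helper name. Note that `%b` depends on the active locale.

```python
from datetime import date, datetime


def parse_month_year(text: str) -> date:
    """Parse a 'Mon-YY' string such as 'Jun-16' into the first day of that month."""
    return datetime.strptime(text, "%b-%y").date()


# The five sample rows shown for TZD_f_date.
samples = ["Jun-16", "Aug-17", "Jun-16", "Mar-19", "Dec-17"]
parsed = [parse_month_year(s) for s in samples]

print(min(parsed), max(parsed))
```

Once parsed, the first and last prescription dates can be compared or subtracted directly, which is not possible with the raw strings.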
ValueCountFrequency (%)
jun-16 2
20.0%
aug-17 1
10.0%
mar-19 1
10.0%
dec-17 1
10.0%
jan-14 1
10.0%
nov-17 1
10.0%
sep-15 1
10.0%
mar-16 1
10.0%
feb-12 1
10.0%

Most occurring characters

ValueCountFrequency (%)
- 10
16.7%
1 10
16.7%
J 3
 
5.0%
a 3
 
5.0%
n 3
 
5.0%
6 3
 
5.0%
e 3
 
5.0%
u 3
 
5.0%
7 3
 
5.0%
r 2
 
3.3%
Other values (16) 17
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20
33.3%
Lowercase Letter 20
33.3%
Dash Punctuation 10
16.7%
Uppercase Letter 10
16.7%
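
The category totals above can be reproduced with Python's `unicodedata` module. This is a sketch under one assumption: the ten non-missing values are taken from the value table, with capitalisation restored to match the sample rows. The general category codes `Nd`, `Ll`, `Pd`, and `Lu` correspond to Decimal Number, Lowercase Letter, Dash Punctuation, and Uppercase Letter.

```python
import unicodedata
from collections import Counter

# The ten non-missing TZD_f_date values from the value table; capitalisation
# follows the sample rows (an assumption for the rows not shown there).
values = ["Jun-16", "Jun-16", "Aug-17", "Mar-19", "Dec-17",
          "Jan-14", "Nov-17", "Sep-15", "Mar-16", "Feb-12"]

# Tally the Unicode general category of every character.
category_counts = Counter(unicodedata.category(ch) for v in values for ch in v)

print(category_counts)
```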

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
15.0%
n 3
15.0%
e 3
15.0%
u 3
15.0%
r 2
10.0%
o 1
 
5.0%
b 1
 
5.0%
p 1
 
5.0%
v 1
 
5.0%
c 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 10
50.0%
6 3
 
15.0%
7 3
 
15.0%
5 1
 
5.0%
4 1
 
5.0%
9 1
 
5.0%
2 1
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
J 3
30.0%
M 2
20.0%
F 1
 
10.0%
S 1
 
10.0%
D 1
 
10.0%
N 1
 
10.0%
A 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30
50.0%
Latin 30
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
J 3
 
10.0%
a 3
 
10.0%
n 3
 
10.0%
e 3
 
10.0%
u 3
 
10.0%
r 2
 
6.7%
M 2
 
6.7%
o 1
 
3.3%
F 1
 
3.3%
b 1
 
3.3%
Other values (8) 8
26.7%
Common
ValueCountFrequency (%)
- 10
33.3%
1 10
33.3%
6 3
 
10.0%
7 3
 
10.0%
5 1
 
3.3%
4 1
 
3.3%
9 1
 
3.3%
2 1
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 10
16.7%
1 10
16.7%
J 3
 
5.0%
a 3
 
5.0%
n 3
 
5.0%
6 3
 
5.0%
e 3
 
5.0%
u 3
 
5.0%
7 3
 
5.0%
r 2
 
3.3%
Other values (16) 17
28.3%

TZD_f_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA> 90
42960773 6
1525221 3
19079293 1

Length

Max length8
Median length4
Mean length4.37
Min length4
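
The length figures suggest that missing entries in this column are stored as the four-character string <NA>, which would also explain why the report shows 0 missing values here. A small sketch that recomputes the mean length from the value counts under that assumption (the variable names are illustrative):

```python
# Value counts for TZD_f_prcd as listed in the report; "<NA>" is treated as an
# ordinary 4-character string, which is an assumption about how it was stored.
counts = {"<NA>": 90, "42960773": 6, "1525221": 3, "19079293": 1}

total_rows = sum(counts.values())
mean_length = sum(len(value) * n for value, n in counts.items()) / total_rows

print(total_rows, mean_length)
```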

Unique

Unique1
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row42960773

Common Values

Value     Count  Frequency (%)
<NA>      90     90.0%
42960773  6      6.0%
1525221   3      3.0%
19079293  1      1.0%
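
The Alerts section flags TZD_f_prcd as highly imbalanced (70.1%). One formula that reproduces this figure from the counts above is one minus the normalised Shannon entropy of the class shares. The sketch below assumes that definition; the report itself does not state how the score is computed, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((n / total) * math.log2(n / total) for n in counts if n > 0)
    return 1 - entropy / math.log2(k)


# Counts for <NA>, 42960773, 1525221, 19079293.
tzd_f_prcd_counts = [90, 6, 3, 1]
print(f"{imbalance_score(tzd_f_prcd_counts):.1%}")
```

A score of 0 would mean all classes are equally frequent, and a score near 1 means a single class dominates. Here the dominant class is the <NA> placeholder.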

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        90     90.0%
42960773  6      6.0%
1525221   3      3.0%
19079293  1      1.0%

TZD_l_date
Text

MISSING 

Distinct8
Distinct (%)88.9%
Missing91
Missing (%)91.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters54
Distinct characters19
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7
Unique (%)77.8%

Sample

1st rowMar-19
2nd rowMay-19
3rd rowMay-18
4th rowJun-19
5th rowApr-19
ValueCountFrequency (%)
apr-19 2
22.2%
mar-19 1
11.1%
may-19 1
11.1%
may-18 1
11.1%
jun-19 1
11.1%
feb-18 1
11.1%
mar-16 1
11.1%
aug-14 1
11.1%

Most occurring characters

ValueCountFrequency (%)
- 9
16.7%
1 9
16.7%
9 5
9.3%
r 4
 
7.4%
M 4
 
7.4%
a 4
 
7.4%
A 3
 
5.6%
p 2
 
3.7%
u 2
 
3.7%
8 2
 
3.7%
Other values (9) 10
18.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
33.3%
Lowercase Letter 18
33.3%
Dash Punctuation 9
16.7%
Uppercase Letter 9
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4
22.2%
a 4
22.2%
p 2
11.1%
u 2
11.1%
y 2
11.1%
n 1
 
5.6%
e 1
 
5.6%
b 1
 
5.6%
g 1
 
5.6%
Decimal Number
ValueCountFrequency (%)
1 9
50.0%
9 5
27.8%
8 2
 
11.1%
6 1
 
5.6%
4 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
M 4
44.4%
A 3
33.3%
J 1
 
11.1%
F 1
 
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27
50.0%
Latin 27
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4
14.8%
M 4
14.8%
a 4
14.8%
A 3
11.1%
p 2
7.4%
u 2
7.4%
y 2
7.4%
J 1
 
3.7%
n 1
 
3.7%
F 1
 
3.7%
Other values (3) 3
11.1%
Common
ValueCountFrequency (%)
- 9
33.3%
1 9
33.3%
9 5
18.5%
8 2
 
7.4%
6 1
 
3.7%
4 1
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9
16.7%
1 9
16.7%
9 5
9.3%
r 4
 
7.4%
M 4
 
7.4%
a 4
 
7.4%
A 3
 
5.6%
p 2
 
3.7%
u 2
 
3.7%
8 2
 
3.7%
Other values (9) 10
18.5%

TZD_l_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA> 91
42960773 5
1525221 3
19079293 1

Length

Max length8
Median length4
Mean length4.33
Min length4

Unique

Unique1
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row19079293

Common Values

Value     Count  Frequency (%)
<NA>      91     91.0%
42960773  5      5.0%
1525221   3      3.0%
19079293  1      1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        91     91.0%
42960773  5      5.0%
1525221   3      3.0%
19079293  1      1.0%

DPP4i_f_date
Text

MISSING 

Distinct39
Distinct (%)84.8%
Missing54
Missing (%)54.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters276
Distinct characters33
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique33
Unique (%)71.7%

Sample

1st rowFeb-14
2nd rowDec-17
3rd rowNov-10
4th rowNov-16
5th rowMar-14
ValueCountFrequency (%)
jul-09 3
 
6.5%
oct-09 2
 
4.3%
may-19 2
 
4.3%
dec-18 2
 
4.3%
aug-14 2
 
4.3%
dec-13 2
 
4.3%
sep-09 1
 
2.2%
jan-12 1
 
2.2%
mar-17 1
 
2.2%
jun-14 1
 
2.2%
Other values (29) 29
63.0%

Most occurring characters

ValueCountFrequency (%)
- 46
16.7%
1 43
15.6%
u 15
 
5.4%
c 15
 
5.4%
e 14
 
5.1%
J 12
 
4.3%
0 9
 
3.3%
9 8
 
2.9%
4 8
 
2.9%
O 8
 
2.9%
Other values (23) 98
35.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 92
33.3%
Lowercase Letter 92
33.3%
Dash Punctuation 46
16.7%
Uppercase Letter 46
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 15
16.3%
c 15
16.3%
e 14
15.2%
t 8
8.7%
n 6
 
6.5%
l 6
 
6.5%
a 5
 
5.4%
p 4
 
4.3%
g 4
 
4.3%
b 4
 
4.3%
Other values (4) 11
12.0%
Decimal Number
ValueCountFrequency (%)
1 43
46.7%
0 9
 
9.8%
9 8
 
8.7%
4 8
 
8.7%
8 6
 
6.5%
6 5
 
5.4%
3 4
 
4.3%
7 4
 
4.3%
5 4
 
4.3%
2 1
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
J 12
26.1%
O 8
17.4%
D 7
15.2%
A 5
10.9%
M 4
 
8.7%
F 4
 
8.7%
S 3
 
6.5%
N 3
 
6.5%
Dash Punctuation
ValueCountFrequency (%)
- 46
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 138
50.0%
Latin 138
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 15
 
10.9%
c 15
 
10.9%
e 14
 
10.1%
J 12
 
8.7%
O 8
 
5.8%
t 8
 
5.8%
D 7
 
5.1%
n 6
 
4.3%
l 6
 
4.3%
a 5
 
3.6%
Other values (12) 42
30.4%
Common
ValueCountFrequency (%)
- 46
33.3%
1 43
31.2%
0 9
 
6.5%
9 8
 
5.8%
4 8
 
5.8%
8 6
 
4.3%
6 5
 
3.6%
3 4
 
2.9%
7 4
 
2.9%
5 4
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 46
16.7%
1 43
15.6%
u 15
 
5.4%
c 15
 
5.4%
e 14
 
5.1%
J 12
 
4.3%
0 9
 
3.3%
9 8
 
2.9%
4 8
 
2.9%
O 8
 
2.9%
Other values (23) 98
35.5%

DPP4i_f_prcd
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)13.0%
Missing54
Missing (%)54.0%
Infinite0
Infinite (%)0.0%
Mean34428458
Minimum19125041
Maximum43013924
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum19125041
5-th percentile19125041
Q119125041
median40239218
Q342961500
95-th percentile43013924
Maximum43013924
Range23888883
Interquartile range (IQR)23836459

Descriptive statistics

Standard deviation10821404
Coefficient of variation (CV)0.31431568
Kurtosis-1.4813267
Mean34428458
Median Absolute Deviation (MAD)2748487.5
Skewness-0.73212073
Sum1.5837091 × 10^9
Variance1.1710279 × 10^14
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)

Value      Count  Frequency (%)
19125041   15     15.0%
40239218   13     13.0%
42961500   7      7.0%
43013924   4      4.0%
43013911   4      4.0%
42960599   3      3.0%
(Missing)  54     54.0%

Value      Count  Frequency (%)
19125041   15     15.0%
40239218   13     13.0%
42960599   3      3.0%
42961500   7      7.0%
43013911   4      4.0%
43013924   4      4.0%

Value      Count  Frequency (%)
43013924   4      4.0%
43013911   4      4.0%
42961500   7      7.0%
42960599   3      3.0%
40239218   13     13.0%
19125041   15     15.0%
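
The Median Absolute Deviation (MAD) reported for DPP4i_f_prcd is the median of the absolute differences between each value and the median. A short sketch that recomputes it from the value counts listed for this variable, using only the standard library (variable names are illustrative):

```python
import statistics

# Value counts for DPP4i_f_prcd, transcribed from the report (54 rows are missing).
dpp4i_f_prcd_counts = {
    19125041: 15,
    40239218: 13,
    42961500: 7,
    43013924: 4,
    43013911: 4,
    42960599: 3,
}

# Expand the counts into the 46 non-missing observations.
values = [code for code, n in dpp4i_f_prcd_counts.items() for _ in range(n)]

median = statistics.median(values)
# MAD: median of the absolute deviations from the median.
mad = statistics.median(abs(v - median) for v in values)

print(median, mad)
```

With an even number of observations the median is the average of the two middle values, which is why the MAD ends in .5.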

DPP4i_l_date
Text

MISSING 

Distinct24
Distinct (%)66.7%
Missing64
Missing (%)64.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters216
Distinct characters31
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique17
Unique (%)47.2%

Sample

1st rowFeb-15
2nd rowMar-19
3rd rowAug-15
4th rowMay-19
5th rowJul-14
ValueCountFrequency (%)
apr-19 4
 
11.1%
mar-19 4
 
11.1%
dec-18 3
 
8.3%
aug-15 2
 
5.6%
aug-18 2
 
5.6%
feb-19 2
 
5.6%
dec-15 2
 
5.6%
jul-18 1
 
2.8%
dec-14 1
 
2.8%
may-17 1
 
2.8%
Other values (14) 14
38.9%

Most occurring characters

ValueCountFrequency (%)
1 37
17.1%
- 36
16.7%
9 12
 
5.6%
e 11
 
5.1%
r 10
 
4.6%
c 9
 
4.2%
u 8
 
3.7%
A 8
 
3.7%
a 8
 
3.7%
M 8
 
3.7%
Other values (21) 69
31.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 72
33.3%
Lowercase Letter 72
33.3%
Dash Punctuation 36
16.7%
Uppercase Letter 36
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11
15.3%
r 10
13.9%
c 9
12.5%
u 8
11.1%
a 8
11.1%
p 5
6.9%
g 4
 
5.6%
v 3
 
4.2%
o 3
 
4.2%
l 3
 
4.2%
Other values (4) 8
11.1%
Decimal Number
ValueCountFrequency (%)
1 37
51.4%
9 12
 
16.7%
8 7
 
9.7%
5 6
 
8.3%
7 5
 
6.9%
4 3
 
4.2%
2 1
 
1.4%
0 1
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
A 8
22.2%
M 8
22.2%
D 7
19.4%
J 4
11.1%
N 3
 
8.3%
F 3
 
8.3%
O 2
 
5.6%
S 1
 
2.8%
Dash Punctuation
ValueCountFrequency (%)
- 36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 108
50.0%
Latin 108
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11
 
10.2%
r 10
 
9.3%
c 9
 
8.3%
u 8
 
7.4%
A 8
 
7.4%
a 8
 
7.4%
M 8
 
7.4%
D 7
 
6.5%
p 5
 
4.6%
g 4
 
3.7%
Other values (12) 30
27.8%
Common
ValueCountFrequency (%)
1 37
34.3%
- 36
33.3%
9 12
 
11.1%
8 7
 
6.5%
5 6
 
5.6%
7 5
 
4.6%
4 3
 
2.8%
2 1
 
0.9%
0 1
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 216
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 37
17.1%
- 36
16.7%
9 12
 
5.6%
e 11
 
5.1%
r 10
 
4.6%
c 9
 
4.2%
u 8
 
3.7%
A 8
 
3.7%
a 8
 
3.7%
M 8
 
3.7%
Other values (21) 69
31.9%

DPP4i_l_prcd
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)16.7%
Missing64
Missing (%)64.0%
Infinite0
Infinite (%)0.0%
Mean35007634
Minimum19125041
Maximum43013924
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum19125041
5-th percentile19125041
Q119125041
median40239218
Q342961500
95-th percentile43013924
Maximum43013924
Range23888883
Interquartile range (IQR)23836459

Descriptive statistics

Standard deviation10742649
Coefficient of variation (CV)0.3068659
Kurtosis-1.3105721
Mean35007634
Median Absolute Deviation (MAD)2748487.5
Skewness-0.84582401
Sum1.2602748 × 10^9
Variance1.1540451 × 10^14
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)

Value      Count  Frequency (%)
19125041   11     11.0%
40239218   9      9.0%
42960599   5      5.0%
42961500   4      4.0%
43013924   4      4.0%
43013911   3      3.0%
(Missing)  64     64.0%

Value      Count  Frequency (%)
19125041   11     11.0%
40239218   9      9.0%
42960599   5      5.0%
42961500   4      4.0%
43013911   3      3.0%
43013924   4      4.0%

Value      Count  Frequency (%)
43013924   4      4.0%
43013911   3      3.0%
42961500   4      4.0%
42960599   5      5.0%
40239218   9      9.0%
19125041   11     11.0%

DPP4i-MET_f_date
Text

MISSING 

Distinct11
Distinct (%)84.6%
Missing87
Missing (%)87.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters78
Distinct characters25
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9
Unique (%)69.2%

Sample

1st rowFeb-17
2nd rowMar-19
3rd rowNov-12
4th rowNov-18
5th rowDec-16
ValueCountFrequency (%)
dec-16 2
15.4%
oct-15 2
15.4%
feb-17 1
7.7%
mar-19 1
7.7%
nov-12 1
7.7%
nov-18 1
7.7%
jan-17 1
7.7%
jul-18 1
7.7%
mar-17 1
7.7%
mar-11 1
7.7%

Most occurring characters

ValueCountFrequency (%)
1 14
17.9%
- 13
16.7%
c 4
 
5.1%
e 4
 
5.1%
a 4
 
5.1%
M 3
 
3.8%
r 3
 
3.8%
7 3
 
3.8%
J 2
 
2.6%
8 2
 
2.6%
Other values (15) 26
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26
33.3%
Lowercase Letter 26
33.3%
Dash Punctuation 13
16.7%
Uppercase Letter 13
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 4
15.4%
e 4
15.4%
a 4
15.4%
r 3
11.5%
v 2
7.7%
o 2
7.7%
b 2
7.7%
t 2
7.7%
n 1
 
3.8%
u 1
 
3.8%
Decimal Number
ValueCountFrequency (%)
1 14
53.8%
7 3
 
11.5%
8 2
 
7.7%
2 2
 
7.7%
5 2
 
7.7%
6 2
 
7.7%
9 1
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
M 3
23.1%
J 2
15.4%
N 2
15.4%
D 2
15.4%
F 2
15.4%
O 2
15.4%
Dash Punctuation
ValueCountFrequency (%)
- 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 39
50.0%
Latin 39
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 4
 
10.3%
e 4
 
10.3%
a 4
 
10.3%
M 3
 
7.7%
r 3
 
7.7%
J 2
 
5.1%
v 2
 
5.1%
o 2
 
5.1%
N 2
 
5.1%
D 2
 
5.1%
Other values (7) 11
28.2%
Common
ValueCountFrequency (%)
1 14
35.9%
- 13
33.3%
7 3
 
7.7%
8 2
 
5.1%
2 2
 
5.1%
5 2
 
5.1%
6 2
 
5.1%
9 1
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 14
17.9%
- 13
16.7%
c 4
 
5.1%
e 4
 
5.1%
a 4
 
5.1%
M 3
 
3.8%
r 3
 
3.8%
7 3
 
3.8%
J 2
 
2.6%
8 2
 
2.6%
Other values (15) 26
33.3%

DPP4i-MET_f_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA> 87
40164922 6
42708090 5
42708088 2

Length

Max length8
Median length4
Mean length4.52
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row42708090
4th row<NA>
5th row<NA>

Common Values

Value     Count  Frequency (%)
<NA>      87     87.0%
40164922  6      6.0%
42708090  5      5.0%
42708088  2      2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        87     87.0%
40164922  6      6.0%
42708090  5      5.0%
42708088  2      2.0%

DPP4i-MET_l_date
Text

MISSING 

Distinct8
Distinct (%)88.9%
Missing91
Missing (%)91.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters54
Distinct characters23
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7
Unique (%)77.8%

Sample

1st rowJun-19
2nd rowNov-13
3rd rowJun-19
4th rowMar-19
5th rowNov-18
ValueCountFrequency (%)
jun-19 2
22.2%
nov-13 1
11.1%
mar-19 1
11.1%
nov-18 1
11.1%
apr-19 1
11.1%
jul-18 1
11.1%
oct-16 1
11.1%
aug-12 1
11.1%

Most occurring characters

ValueCountFrequency (%)
- 9
16.7%
1 9
16.7%
u 4
 
7.4%
9 4
 
7.4%
J 3
 
5.6%
A 2
 
3.7%
8 2
 
3.7%
r 2
 
3.7%
v 2
 
3.7%
o 2
 
3.7%
Other values (13) 15
27.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
33.3%
Lowercase Letter 18
33.3%
Dash Punctuation 9
16.7%
Uppercase Letter 9
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 4
22.2%
r 2
11.1%
v 2
11.1%
o 2
11.1%
n 2
11.1%
g 1
 
5.6%
t 1
 
5.6%
c 1
 
5.6%
a 1
 
5.6%
l 1
 
5.6%
Decimal Number
ValueCountFrequency (%)
1 9
50.0%
9 4
22.2%
8 2
 
11.1%
6 1
 
5.6%
3 1
 
5.6%
2 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
J 3
33.3%
A 2
22.2%
N 2
22.2%
O 1
 
11.1%
M 1
 
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27
50.0%
Latin 27
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 4
14.8%
J 3
11.1%
A 2
 
7.4%
r 2
 
7.4%
v 2
 
7.4%
o 2
 
7.4%
N 2
 
7.4%
n 2
 
7.4%
g 1
 
3.7%
O 1
 
3.7%
Other values (6) 6
22.2%
Common
ValueCountFrequency (%)
- 9
33.3%
1 9
33.3%
9 4
14.8%
8 2
 
7.4%
6 1
 
3.7%
3 1
 
3.7%
2 1
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9
16.7%
1 9
16.7%
u 4
 
7.4%
9 4
 
7.4%
J 3
 
5.6%
A 2
 
3.7%
8 2
 
3.7%
r 2
 
3.7%
v 2
 
3.7%
o 2
 
3.7%
Other values (13) 15
27.8%

DPP4i-MET_l_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA> 91
42708090 4
40164922 4
42708088 1

Length

Max length8
Median length4
Mean length4.36
Min length4

Unique

Unique1
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row42708090
4th row<NA>
5th row<NA>

Common Values

Value     Count  Frequency (%)
<NA>      91     91.0%
42708090  4      4.0%
40164922  4      4.0%
42708088  1      1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        91     91.0%
42708090  4      4.0%
40164922  4      4.0%
42708088  1      1.0%

Insul_f_date
Text

MISSING 

Distinct33
Distinct (%)86.8%
Missing62
Missing (%)62.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters228
Distinct characters33
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique28
Unique (%)73.7%

Sample

1st rowOct-11
2nd rowJul-14
3rd rowNov-09
4th rowJun-18
5th rowAug-09
ValueCountFrequency (%)
oct-11 2
 
5.3%
aug-09 2
 
5.3%
apr-10 2
 
5.3%
may-11 2
 
5.3%
jul-09 2
 
5.3%
mar-16 1
 
2.6%
aug-13 1
 
2.6%
jun-17 1
 
2.6%
may-15 1
 
2.6%
oct-13 1
 
2.6%
Other values (23) 23
60.5%

Most occurring characters

ValueCountFrequency (%)
- 38
16.7%
1 34
14.9%
0 13
 
5.7%
u 12
 
5.3%
9 9
 
3.9%
e 8
 
3.5%
J 8
 
3.5%
c 8
 
3.5%
A 8
 
3.5%
a 8
 
3.5%
Other values (23) 82
36.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 76
33.3%
Lowercase Letter 76
33.3%
Dash Punctuation 38
16.7%
Uppercase Letter 38
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 12
15.8%
e 8
10.5%
c 8
10.5%
a 8
10.5%
y 6
7.9%
g 5
6.6%
l 5
6.6%
p 5
6.6%
t 5
6.6%
r 4
 
5.3%
Other values (4) 10
13.2%
Decimal Number
ValueCountFrequency (%)
1 34
44.7%
0 13
 
17.1%
9 9
 
11.8%
8 4
 
5.3%
4 4
 
5.3%
3 4
 
5.3%
7 3
 
3.9%
6 2
 
2.6%
5 2
 
2.6%
2 1
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
J 8
21.1%
A 8
21.1%
M 7
18.4%
O 5
13.2%
D 3
 
7.9%
F 3
 
7.9%
S 2
 
5.3%
N 2
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 114
50.0%
Latin 114
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 12
 
10.5%
e 8
 
7.0%
J 8
 
7.0%
c 8
 
7.0%
A 8
 
7.0%
a 8
 
7.0%
M 7
 
6.1%
y 6
 
5.3%
O 5
 
4.4%
g 5
 
4.4%
Other values (12) 39
34.2%
Common
ValueCountFrequency (%)
- 38
33.3%
1 34
29.8%
0 13
 
11.4%
9 9
 
7.9%
8 4
 
3.5%
4 4
 
3.5%
3 4
 
3.5%
7 3
 
2.6%
6 2
 
1.8%
5 2
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 228
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 38
16.7%
1 34
14.9%
0 13
 
5.7%
u 12
 
5.3%
9 9
 
3.9%
e 8
 
3.5%
J 8
 
3.5%
c 8
 
3.5%
A 8
 
3.5%
a 8
 
3.5%
Other values (23) 82
36.0%

Insul_f_prcd
Real number (ℝ)

MISSING 

Distinct8
Distinct (%)21.1%
Missing62
Missing (%)62.0%
Infinite0
Infinite (%)0.0%
Mean36295127
Minimum586875
Maximum41348914
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum586875
5-th percentile35686358
Q135779361
median35781503
Q339768735
95-th percentile41348914
Maximum41348914
Range40762039
Interquartile range (IQR)3989374

Descriptive statistics

Standard deviation6401141
Coefficient of variation (CV)0.17636365
Kurtosis27.628311
Mean36295127
Median Absolute Deviation (MAD)2142
Skewness-4.8300189
Sum1.3792148 × 10^9
Variance4.0974605 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)

Value      Count  Frequency (%)
35781503   13     13.0%
35779361   10     10.0%
41348914   6      6.0%
40755064   4      4.0%
36809748   2      2.0%
35782236   1      1.0%
586875     1      1.0%
35159339   1      1.0%
(Missing)  62     62.0%

Value      Count  Frequency (%)
586875     1      1.0%
35159339   1      1.0%
35779361   10     10.0%
35781503   13     13.0%
35782236   1      1.0%
36809748   2      2.0%
40755064   4      4.0%
41348914   6      6.0%

Value      Count  Frequency (%)
41348914   6      6.0%
40755064   4      4.0%
36809748   2      2.0%
35782236   1      1.0%
35781503   13     13.0%
35779361   10     10.0%
35159339   1      1.0%
586875     1      1.0%

Insul_l_date
Text

MISSING 

Distinct24
Distinct (%)72.7%
Missing67
Missing (%)67.0%
Memory size932.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters198
Distinct characters33
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18
Unique (%)54.5%

Sample

1st rowMar-12
2nd rowSep-11
3rd rowJun-19
4th rowMay-19
5th rowApr-12
ValueCountFrequency (%)
may-19 3
 
9.1%
jun-19 3
 
9.1%
apr-19 3
 
9.1%
aug-15 2
 
6.1%
oct-13 2
 
6.1%
jul-18 2
 
6.1%
dec-16 1
 
3.0%
nov-10 1
 
3.0%
jan-15 1
 
3.0%
feb-19 1
 
3.0%
Other values (14) 14
42.4%

Most occurring characters

ValueCountFrequency (%)
- 33
16.7%
1 33
16.7%
9 11
 
5.6%
u 9
 
4.5%
A 9
 
4.5%
J 8
 
4.0%
a 8
 
4.0%
r 8
 
4.0%
p 7
 
3.5%
M 6
 
3.0%
Other values (23) 66
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 66
33.3%
Lowercase Letter 66
33.3%
Dash Punctuation 33
16.7%
Uppercase Letter 33
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 9
13.6%
a 8
12.1%
r 8
12.1%
p 7
10.6%
n 5
7.6%
c 5
7.6%
e 5
7.6%
y 4
6.1%
v 3
 
4.5%
g 3
 
4.5%
Other values (4) 9
13.6%
Decimal Number
ValueCountFrequency (%)
1 33
50.0%
9 11
 
16.7%
5 5
 
7.6%
0 4
 
6.1%
2 3
 
4.5%
8 3
 
4.5%
3 3
 
4.5%
7 2
 
3.0%
6 1
 
1.5%
4 1
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
A 9
27.3%
J 8
24.2%
M 6
18.2%
D 3
 
9.1%
N 3
 
9.1%
O 2
 
6.1%
S 1
 
3.0%
F 1
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 33
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 99
50.0%
Latin 99
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 9
 
9.1%
A 9
 
9.1%
J 8
 
8.1%
a 8
 
8.1%
r 8
 
8.1%
p 7
 
7.1%
M 6
 
6.1%
n 5
 
5.1%
c 5
 
5.1%
e 5
 
5.1%
Other values (12) 29
29.3%
Common
ValueCountFrequency (%)
- 33
33.3%
1 33
33.3%
9 11
 
11.1%
5 5
 
5.1%
0 4
 
4.0%
2 3
 
3.0%
8 3
 
3.0%
3 3
 
3.0%
7 2
 
2.0%
6 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 198
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 33
16.7%
1 33
16.7%
9 11
 
5.6%
u 9
 
4.5%
A 9
 
4.5%
J 8
 
4.0%
a 8
 
4.0%
r 8
 
4.0%
p 7
 
3.5%
M 6
 
3.0%
Other values (23) 66
33.3%

Insul_l_prcd
Real number (ℝ)

MISSING 

Distinct10
Distinct (%)30.3%
Missing67
Missing (%)67.0%
Infinite0
Infinite (%)0.0%
Mean36090901
Minimum586875
Maximum42920572
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum586875
5-th percentile35159339
Q135779361
median35781503
Q336809748
95-th percentile41977714
Maximum42920572
Range42333697
Interquartile range (IQR)1030387

Descriptive statistics

Standard deviation6895649.4
Coefficient of variation (CV)0.19106338
Kurtosis23.336394
Mean36090901
Median Absolute Deviation (MAD)622164
Skewness-4.3923422
Sum1.1909997 × 10^9
Variance4.7549981 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)

Value      Count  Frequency (%)
35781503   12     12.0%
35159339   6      6.0%
41349142   4      4.0%
35779361   3      3.0%
42920572   2      2.0%
36809748   2      2.0%
40755064   1      1.0%
41348914   1      1.0%
35779506   1      1.0%
586875     1      1.0%
(Missing)  67     67.0%

Value      Count  Frequency (%)
586875     1      1.0%
35159339   6      6.0%
35779361   3      3.0%
35779506   1      1.0%
35781503   12     12.0%
36809748   2      2.0%
40755064   1      1.0%
41348914   1      1.0%
41349142   4      4.0%
42920572   2      2.0%

Value      Count  Frequency (%)
42920572   2      2.0%
41349142   4      4.0%
41348914   1      1.0%
40755064   1      1.0%
36809748   2      2.0%
35781503   12     12.0%
35779506   1      1.0%
35779361   3      3.0%
35159339   6      6.0%
586875     1      1.0%

Sample

,RID,SU_f_date,SU_f_prcd,SU_l_date,SU_l_prcd,SU-MET_f_date,SU-MET_f_prcd,SU-MET_l_date,SU-MET_l_prcd,Meg_f_date,Meg_f_prcd,Meg_l_date,Meg_l_prcd,Met_f_date,Met_f_prcd,Met_l_date,Met_l_prcd,TZD_f_date,TZD_f_prcd,TZD_l_date,TZD_l_prcd,DPP4i_f_date,DPP4i_f_prcd,DPP4i_l_date,DPP4i_l_prcd,DPP4i-MET_f_date,DPP4i-MET_f_prcd,DPP4i-MET_l_date,DPP4i-MET_l_prcd,Insul_f_date,Insul_f_prcd,Insul_l_date,Insul_l_prcd
0,R0000001,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Apr-17,19106521,Feb-18,19106521,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
1,R0000002,Sep-09,19101729,Mar-16,19101729,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
2,R0000003,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Feb-14,40164946,Feb-15,40164946,<NA>,<NA>,<NA>,<NA>,Feb-14,40239218,Feb-15,40239218,Feb-17,42708090,Jun-19,42708090,<NA>,<NA>,<NA>,<NA>
3,R0000004,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Oct-11,40164929,Mar-12,40164929,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Oct-11,41348914,Mar-12,35159339
4,R0000005,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jun-16,42960773,Mar-19,19079293,Dec-17,40239218,Mar-19,40239218,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
5,R0000006,Aug-14,19101729,Jun-19,21133671,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jul-10,40164929,Aug-15,40164897,<NA>,<NA>,<NA>,<NA>,Nov-10,19125041,Aug-15,19125041,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
6,R0000007,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
7,R0000008,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jul-14,40164946,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jul-14,35779361,<NA>,<NA>
8,R0000009,Feb-12,19101729,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Aug-09,19106521,Nov-11,19106521,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
9,R0000010,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jul-09,19107111,Mar-10,19107111,Jul-09,40164929,May-19,40164946,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>

,RID,SU_f_date,SU_f_prcd,SU_l_date,SU_l_prcd,SU-MET_f_date,SU-MET_f_prcd,SU-MET_l_date,SU-MET_l_prcd,Meg_f_date,Meg_f_prcd,Meg_l_date,Meg_l_prcd,Met_f_date,Met_f_prcd,Met_l_date,Met_l_prcd,TZD_f_date,TZD_f_prcd,TZD_l_date,TZD_l_prcd,DPP4i_f_date,DPP4i_f_prcd,DPP4i_l_date,DPP4i_l_prcd,DPP4i-MET_f_date,DPP4i-MET_f_prcd,DPP4i-MET_l_date,DPP4i-MET_l_prcd,Insul_f_date,Insul_f_prcd,Insul_l_date,Insul_l_prcd
90,R0000091,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jul-09,40164925,Oct-11,40164946,<NA>,<NA>,<NA>,<NA>,Jul-09,19125041,Oct-11,19125041,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
91,R0000092,Sep-14,19101729,Nov-14,19101729,Jun-14,42953917,Jun-19,42953740,<NA>,<NA>,<NA>,<NA>,Jun-14,40164929,Dec-16,40164929,Mar-16,19079293,<NA>,<NA>,Jun-14,40239218,May-17,42960599,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
92,R0000093,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,May-19,40164929,May-19,40164929,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
93,R0000094,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
94,R0000095,Jun-13,1597772,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Feb-12,40164922,Aug-12,40164922,<NA>,<NA>,<NA>,<NA>
95,R0000096,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
96,R0000097,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Jan-15,40164929,Feb-19,40164929,<NA>,<NA>,<NA>,<NA>,Mar-17,43013911,Feb-19,43013911,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
97,R0000098,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Sep-09,1502829,Feb-12,1502829,Sep-09,40164946,Nov-17,40164946,Feb-12,1525221,Aug-14,1525221,Aug-14,40239218,Nov-17,40239218,<NA>,<NA>,<NA>,<NA>,Dec-09,35781503,Dec-09,35781503
98,R0000099,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Aug-09,19107111,<NA>,<NA>,Dec-09,40164946,Jul-11,40164946,<NA>,<NA>,<NA>,<NA>,Nov-11,19125041,Mar-19,43013911,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
99,R0000100,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,Oct-12,40164946,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>
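
Apart from RID, every column name in the sample follows the same three-part pattern: a drug class, then f or l, then date or prcd. The sketch below splits a name into those parts. It assumes f and l stand for the first and last prescription, in line with the dataset description, and that prcd holds the prescribed drug code; `split_column` and `ColumnParts` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class ColumnParts(NamedTuple):
    drug_class: str  # e.g. "SU", "DPP4i-MET", "Insul"
    position: str    # "first" or "last" (assumed meaning of f / l)
    kind: str        # "date" or "prcd"


def split_column(name: str) -> ColumnParts:
    """Split a column name such as 'DPP4i-MET_l_prcd' into its three parts."""
    # rsplit keeps any hyphenated drug class (e.g. 'SU-MET') in one piece.
    drug_class, position, kind = name.rsplit("_", 2)
    return ColumnParts(drug_class, {"f": "first", "l": "last"}[position], kind)


print(split_column("DPP4i-MET_l_prcd"))
print(split_column("SU_f_date"))
```

Splitting the names this way makes it straightforward to group the 32 prescription columns by drug class or to pair each first date with its last date.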