Overview

Dataset statistics

Number of variables: 33
Number of observations: 100
Missing cells: 1545
Missing cells (%): 46.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.5 KiB
Average record size in memory: 281.3 B
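
As a quick arithmetic check on the figures above, the missing-cell percentage can be recomputed from the variable and observation counts. This is a minimal sketch that only uses numbers copied from this overview; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 33
n_observations = 100
missing_cells = 1545

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

The result agrees with the reported 46.8%.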

Variable types

Numeric: 6
DateTime: 16
Categorical: 11

Dataset

Description: Prescribed drug codes for patients with diabetes, together with the first and last prescription dates. sulfonylurea (RxNorm codes: 1597772, 1597758, 1597773, 19101729, 21133671, 19059797), sulfonylurea+metformin (42953698, 42953917, 42953740), meglitinide (19023425, 19023424, 19023426, 42962884, 19107111, 19107110, 1502829), metformin (19106521, 40164929, 40164946, 40164897, 40164894, 40164925), TZD (1525221, 19079293, 42960773), DPP4i (19125041, 40239218, 43013911, 43013924, 42960599, 42961500), DPP4i-MET (40164922, 42708088, 42708090, 42708086), Insulin (46234044, 35782236, 35779361, 41348914, 35786039, 36809748, 42920572, 46234044, 41370419, 41349142, 46234044, 35782557, 35159339, 35781503, 35781503, 46234044, 46234044, 586875, 35781503, 35781503, 46234044, 41348508, 40717097, 35779506, 40755064, 42921713)
Author: The Catholic University of Korea, Eunpyeong St. Mary's Hospital (가톨릭대학교 은평성모병원)
URL: http://cmcdata.net/data/dataset/diabetes_pre-eunpyeong

Alerts

Meg_f_date has constant value ""Constant
Meg_l_date has constant value ""Constant
SU-MET_f_prcd is highly imbalanced (67.1%)Imbalance
SU-MET_l_prcd is highly imbalanced (65.9%)Imbalance
Meg_f_prcd is highly imbalanced (91.9%)Imbalance
Meg_l_prcd is highly imbalanced (91.9%)Imbalance
TZD_f_prcd is highly imbalanced (87.9%)Imbalance
TZD_l_prcd is highly imbalanced (89.8%)Imbalance
DPP4i-MET_f_prcd is highly imbalanced (56.5%)Imbalance
DPP4i-MET_l_prcd is highly imbalanced (70.8%)Imbalance
SU_f_date has 62 (62.0%) missing valuesMissing
SU_f_prcd has 62 (62.0%) missing valuesMissing
SU_l_date has 65 (65.0%) missing valuesMissing
SU-MET_f_date has 89 (89.0%) missing valuesMissing
SU-MET_l_date has 90 (90.0%) missing valuesMissing
Meg_f_date has 99 (99.0%) missing valuesMissing
Meg_l_date has 99 (99.0%) missing valuesMissing
Met_f_date has 55 (55.0%) missing valuesMissing
Met_f_prcd has 55 (55.0%) missing valuesMissing
Met_l_date has 58 (58.0%) missing valuesMissing
Met_l_prcd has 58 (58.0%) missing valuesMissing
TZD_f_date has 97 (97.0%) missing valuesMissing
TZD_l_date has 98 (98.0%) missing valuesMissing
DPP4i_f_date has 65 (65.0%) missing valuesMissing
DPP4i_l_date has 67 (67.0%) missing valuesMissing
DPP4i-MET_f_date has 86 (86.0%) missing valuesMissing
DPP4i-MET_l_date has 90 (90.0%) missing valuesMissing
Insul_f_date has 61 (61.0%) missing valuesMissing
Insul_f_prcd has 61 (61.0%) missing valuesMissing
Insul_l_date has 64 (64.0%) missing valuesMissing
Insul_l_prcd has 64 (64.0%) missing valuesMissing
RID has unique valuesUnique
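
The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of the category counts. Whether the profiling software computes exactly this is an assumption here. The sketch below applies that formula to counts taken from the Common Values tables later in this report, where `<NA>` is counted as a category; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # A single class has no defined normalized entropy; returning 0.0 is an
        # arbitrary convention chosen for this sketch.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts copied from this report (including the "<NA>" category).
su_met_f_prcd = [89, 6, 3, 2]
meg_f_prcd = [99, 1]
tzd_f_prcd = [97, 1, 1, 1]

for name, counts in [("SU-MET_f_prcd", su_met_f_prcd),
                     ("Meg_f_prcd", meg_f_prcd),
                     ("TZD_f_prcd", tzd_f_prcd)]:
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

Under this assumption the three scores round to 67.1%, 91.9%, and 87.9%, the values shown in the alerts.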

Reproduction

Analysis started: 2023-10-08 18:56:32.444150
Analysis finished: 2023-10-08 18:56:33.046086
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Real number (ℝ)

UNIQUE 

Distinct100
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean50.5
Minimum1
Maximum100
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum1
5-th percentile5.95
Q125.75
median50.5
Q375.25
95-th percentile95.05
Maximum100
Range99
Interquartile range (IQR)49.5

Descriptive statistics

Standard deviation29.011492
Coefficient of variation (CV)0.57448499
Kurtosis-1.2
Mean50.5
Median Absolute Deviation (MAD)25
Skewness0
Sum5050
Variance841.66667
MonotonicityStrictly increasing
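
Because RID runs from 1 to 100 with no gaps, its statistics can be reproduced with the standard library. The sketch below assumes the sample variance (n - 1 denominator) and linear interpolation between order statistics for percentiles; both assumptions match the figures reported above, and the `percentile` helper is illustrative.

```python
import statistics

rid = list(range(1, 101))  # RID is unique and strictly increasing from 1 to 100

def percentile(sorted_values, q):
    """Percentile by linear interpolation between neighbouring order statistics."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

mean = statistics.mean(rid)
variance = statistics.variance(rid)   # sample variance, n - 1 denominator
std = statistics.stdev(rid)
cv = std / mean                       # coefficient of variation
q1 = percentile(rid, 0.25)
q3 = percentile(rid, 0.75)

print(mean, round(variance, 5), round(std, 6), round(cv, 8))
print(round(percentile(rid, 0.05), 2), q1, q3, round(percentile(rid, 0.95), 2), q3 - q1)
```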
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

SU_f_date
Date

MISSING 

Distinct36
Distinct (%)94.7%
Missing62
Missing (%)62.0%
Memory size932.0 B
Minimum2015-09-04 00:00:00
Maximum2020-04-10 00:00:00
Histogram with fixed size bins (bins=36)

SU_f_prcd
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)15.8%
Missing62
Missing (%)62.0%
Infinite0
Infinite (%)0.0%
Mean15757033
Minimum1597758
Maximum21133671
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum1597758
5-th percentile1597769.9
Q119070280
median19101729
Q321133671
95-th percentile21133671
Maximum21133671
Range19535913
Interquartile range (IQR)2063391

Descriptive statistics

Standard deviation8044345.8
Coefficient of variation (CV)0.51052414
Kurtosis-0.4049996
Mean15757033
Median Absolute Deviation (MAD)2031942
Skewness-1.2437383
Sum5.9876726 × 10^8
Variance6.4711499 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
21133671 | 15 | 15.0%
19101729 | 13 | 13.0%
1597772 | 5 | 5.0%
1597773 | 2 | 2.0%
1597758 | 2 | 2.0%
19059797 | 1 | 1.0%
(Missing) | 62 | 62.0%

Value | Count | Frequency (%)
1597758 | 2 | 2.0%
1597772 | 5 | 5.0%
1597773 | 2 | 2.0%
19059797 | 1 | 1.0%
19101729 | 13 | 13.0%
21133671 | 15 | 15.0%

Value | Count | Frequency (%)
21133671 | 15 | 15.0%
19101729 | 13 | 13.0%
19059797 | 1 | 1.0%
1597773 | 2 | 2.0%
1597772 | 5 | 5.0%
1597758 | 2 | 2.0%

SU_l_date
Date

MISSING 

Distinct34
Distinct (%)97.1%
Missing65
Missing (%)65.0%
Memory size932.0 B
Minimum2015-12-21 00:00:00
Maximum2020-04-27 00:00:00
Histogram with fixed size bins (bins=34)

SU_l_prcd
Categorical

Distinct7
Distinct (%)7.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
65 
21133671
16 
19101729
11 
19059797
 
3
DGMPD4
 
2
Other values (2)
 
3

Length

Max length8
Median length4
Mean length5.3
Min length4

Unique

Unique1 ?
Unique (%)1.0%

Sample

1st row<NA>
2nd row21133671
3rd row19101729
4th row<NA>
5th row21133671

Common Values

ValueCountFrequency (%)
<NA> 65
65.0%
21133671 16
 
16.0%
19101729 11
 
11.0%
19059797 3
 
3.0%
DGMPD4 2
 
2.0%
DGMPD2 2
 
2.0%
DGMPD3 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 65
65.0%
21133671 16
 
16.0%
19101729 11
 
11.0%
19059797 3
 
3.0%
dgmpd4 2
 
2.0%
dgmpd2 2
 
2.0%
dgmpd3 1
 
1.0%
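
SU_l_prcd is typed as Categorical with 0 missing values even though 65 of its entries are the placeholder `<NA>`, and it mixes numeric codes with values such as DGMPD4 whose meaning this report does not state. A minimal sketch of how the column could be split before analysis, assuming `<NA>` should count as missing; the `classify` helper and its labels are hypothetical.

```python
from collections import Counter

# Value counts copied from the SU_l_prcd Common Values table above.
su_l_prcd_counts = {
    "<NA>": 65,
    "21133671": 16,
    "19101729": 11,
    "19059797": 3,
    "DGMPD4": 2,
    "DGMPD2": 2,
    "DGMPD3": 1,
}

def classify(value):
    """Label a raw code as missing, numeric, or non-numeric."""
    if value is None or value == "<NA>":
        return "missing"
    if value.isdigit():
        return "numeric"
    return "non-numeric"

summary = Counter()
for value, count in su_l_prcd_counts.items():
    summary[classify(value)] += count

print(dict(summary))
```

Treating the placeholder as missing would put SU_l_prcd at 65 missing values, in line with the 65 missing values reported for SU_l_date.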

SU-MET_f_date
Date

MISSING 

Distinct11
Distinct (%)100.0%
Missing89
Missing (%)89.0%
Memory size932.0 B
Minimum2015-09-16 00:00:00
Maximum2019-12-12 00:00:00
Histogram with fixed size bins (bins=11)

SU-MET_f_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
89 
42953740
 
6
42953917
 
3
42953698
 
2

Length

Max length8
Median length4
Mean length4.44
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 89
89.0%
42953740 6
 
6.0%
42953917 3
 
3.0%
42953698 2
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 89
89.0%
42953740 6
 
6.0%
42953917 3
 
3.0%
42953698 2
 
2.0%

SU-MET_l_date
Date

MISSING 

Distinct10
Distinct (%)100.0%
Missing90
Missing (%)90.0%
Memory size932.0 B
Minimum2016-04-07 00:00:00
Maximum2020-04-13 00:00:00
Histogram with fixed size bins (bins=10)

SU-MET_l_prcd
Categorical

IMBALANCE 

Distinct3
Distinct (%)3.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
90 
42953740
 
8
42953917
 
2

Length

Max length8
Median length4
Mean length4.4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 90
90.0%
42953740 8
 
8.0%
42953917 2
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 90
90.0%
42953740 8
 
8.0%
42953917 2
 
2.0%

Meg_f_date
Date

CONSTANT  MISSING 

Distinct1
Distinct (%)100.0%
Missing99
Missing (%)99.0%
Memory size932.0 B
Minimum2015-09-22 00:00:00
Maximum2015-09-22 00:00:00
Histogram with fixed size bins (bins=1)

Meg_f_prcd
Categorical

IMBALANCE 

Distinct2
Distinct (%)2.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
99 
19023426
 
1

Length

Max length8
Median length4
Mean length4.04
Min length4

Unique

Unique1 ?
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 99
99.0%
19023426 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 99
99.0%
19023426 1
 
1.0%

Meg_l_date
Date

CONSTANT  MISSING 

Distinct1
Distinct (%)100.0%
Missing99
Missing (%)99.0%
Memory size932.0 B
Minimum2016-06-13 00:00:00
Maximum2016-06-13 00:00:00
Histogram with fixed size bins (bins=1)

Meg_l_prcd
Categorical

IMBALANCE 

Distinct2
Distinct (%)2.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
99 
19023426
 
1

Length

Max length8
Median length4
Mean length4.04
Min length4

Unique

Unique1 ?
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 99
99.0%
19023426 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 99
99.0%
19023426 1
 
1.0%

Met_f_date
Date

MISSING 

Distinct43
Distinct (%)95.6%
Missing55
Missing (%)55.0%
Memory size932.0 B
Minimum2015-09-07 00:00:00
Maximum2020-04-10 00:00:00
Histogram with fixed size bins (bins=43)

Met_f_prcd
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)13.3%
Missing55
Missing (%)55.0%
Infinite0
Infinite (%)0.0%
Mean37825104
Minimum19106521
Maximum40164946
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum19106521
5-th percentile19106521
Q140164925
median40164929
Q340164929
95-th percentile40164946
Maximum40164946
Range21058425
Interquartile range (IQR)4

Descriptive statistics

Standard deviation6692800.8
Coefficient of variation (CV)0.17694071
Kurtosis4.769103
Mean37825104
Median Absolute Deviation (MAD)4
Skewness-2.5610449
Sum1.7021297 × 10^9
Variance4.4793582 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
40164929 | 15 | 15.0%
40164925 | 10 | 10.0%
40164946 | 9 | 9.0%
19106521 | 5 | 5.0%
40164897 | 5 | 5.0%
40164894 | 1 | 1.0%
(Missing) | 55 | 55.0%

Value | Count | Frequency (%)
19106521 | 5 | 5.0%
40164894 | 1 | 1.0%
40164897 | 5 | 5.0%
40164925 | 10 | 10.0%
40164929 | 15 | 15.0%
40164946 | 9 | 9.0%

Value | Count | Frequency (%)
40164946 | 9 | 9.0%
40164929 | 15 | 15.0%
40164925 | 10 | 10.0%
40164897 | 5 | 5.0%
40164894 | 1 | 1.0%
19106521 | 5 | 5.0%

Met_l_date
Date

MISSING 

Distinct42
Distinct (%)100.0%
Missing58
Missing (%)58.0%
Memory size932.0 B
Minimum2015-11-09 00:00:00
Maximum2020-04-27 00:00:00
Histogram with fixed size bins (bins=42)

Met_l_prcd
Real number (ℝ)

MISSING 

Distinct6
Distinct (%)14.3%
Missing58
Missing (%)58.0%
Infinite0
Infinite (%)0.0%
Mean38159362
Minimum19106521
Maximum40164946
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum19106521
5-th percentile19106521
Q140164904
median40164929
Q340164929
95-th percentile40164946
Maximum40164946
Range21058425
Interquartile range (IQR)25

Descriptive statistics

Standard deviation6256488.7
Coefficient of variation (CV)0.16395685
Kurtosis6.4923583
Mean38159362
Median Absolute Deviation (MAD)4
Skewness-2.8609726
Sum1.6026932 × 10^9
Variance3.9143651 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
40164929 | 20 | 20.0%
40164925 | 6 | 6.0%
40164946 | 5 | 5.0%
19106521 | 4 | 4.0%
40164897 | 4 | 4.0%
40164894 | 3 | 3.0%
(Missing) | 58 | 58.0%

Value | Count | Frequency (%)
19106521 | 4 | 4.0%
40164894 | 3 | 3.0%
40164897 | 4 | 4.0%
40164925 | 6 | 6.0%
40164929 | 20 | 20.0%
40164946 | 5 | 5.0%

Value | Count | Frequency (%)
40164946 | 5 | 5.0%
40164929 | 20 | 20.0%
40164925 | 6 | 6.0%
40164897 | 4 | 4.0%
40164894 | 3 | 3.0%
19106521 | 4 | 4.0%

TZD_f_date
Date

MISSING 

Distinct3
Distinct (%)100.0%
Missing97
Missing (%)97.0%
Memory size932.0 B
Minimum2015-09-25 00:00:00
Maximum2017-10-26 00:00:00
Histogram with fixed size bins (bins=3)

TZD_f_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
97 
19079293
 
1
1525221
 
1
42960773
 
1

Length

Max length8
Median length4
Mean length4.11
Min length4

Unique

Unique3 ?
Unique (%)3.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row19079293

Common Values

ValueCountFrequency (%)
<NA> 97
97.0%
19079293 1
 
1.0%
1525221 1
 
1.0%
42960773 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 97
97.0%
19079293 1
 
1.0%
1525221 1
 
1.0%
42960773 1
 
1.0%

TZD_l_date
Date

MISSING 

Distinct2
Distinct (%)100.0%
Missing98
Missing (%)98.0%
Memory size932.0 B
Minimum2016-03-29 00:00:00
Maximum2018-02-20 00:00:00
Histogram with fixed size bins (bins=2)

TZD_l_prcd
Categorical

IMBALANCE 

Distinct3
Distinct (%)3.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
98 
19079293
 
1
1525221
 
1

Length

Max length8
Median length4
Mean length4.07
Min length4

Unique

Unique2 ?
Unique (%)2.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row19079293

Common Values

ValueCountFrequency (%)
<NA> 98
98.0%
19079293 1
 
1.0%
1525221 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 98
98.0%
19079293 1
 
1.0%
1525221 1
 
1.0%

DPP4i_f_date
Date

MISSING 

Distinct35
Distinct (%)100.0%
Missing65
Missing (%)65.0%
Memory size932.0 B
Minimum2015-09-03 00:00:00
Maximum2020-03-23 00:00:00
Histogram with fixed size bins (bins=35)

DPP4i_f_prcd
Categorical

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
65 
42961500
20 
40239218
12 
19125041
 
3

Length

Max length8
Median length4
Mean length5.4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row40239218
4th row<NA>
5th row40239218

Common Values

ValueCountFrequency (%)
<NA> 65
65.0%
42961500 20
 
20.0%
40239218 12
 
12.0%
19125041 3
 
3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 65
65.0%
42961500 20
 
20.0%
40239218 12
 
12.0%
19125041 3
 
3.0%

DPP4i_l_date
Date

MISSING 

Distinct31
Distinct (%)93.9%
Missing67
Missing (%)67.0%
Memory size932.0 B
Minimum2016-04-07 00:00:00
Maximum2020-05-02 00:00:00
Histogram with fixed size bins (bins=31)

DPP4i_l_prcd
Categorical

Distinct6
Distinct (%)6.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
67 
42961500
14 
40239218
11 
42960599
 
4
19125041
 
3

Length

Max length8
Median length4
Mean length5.32
Min length4

Unique

Unique1 ?
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row43013924
4th row<NA>
5th row40239218

Common Values

ValueCountFrequency (%)
<NA> 67
67.0%
42961500 14
 
14.0%
40239218 11
 
11.0%
42960599 4
 
4.0%
19125041 3
 
3.0%
43013924 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 67
67.0%
42961500 14
 
14.0%
40239218 11
 
11.0%
42960599 4
 
4.0%
19125041 3
 
3.0%
43013924 1
 
1.0%

DPP4i-MET_f_date
Date

MISSING 

Distinct14
Distinct (%)100.0%
Missing86
Missing (%)86.0%
Memory size932.0 B
Minimum2015-09-17 00:00:00
Maximum2019-10-31 00:00:00
Histogram with fixed size bins (bins=14)

DPP4i-MET_f_prcd
Categorical

IMBALANCE 

Distinct3
Distinct (%)3.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
86 
40164922
11 
42708088
 
3

Length

Max length8
Median length4
Mean length4.56
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 86
86.0%
40164922 11
 
11.0%
42708088 3
 
3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 86
86.0%
40164922 11
 
11.0%
42708088 3
 
3.0%

DPP4i-MET_l_date
Date

MISSING 

Distinct10
Distinct (%)100.0%
Missing90
Missing (%)90.0%
Memory size932.0 B
Minimum2015-10-21 00:00:00
Maximum2019-02-27 00:00:00
Histogram with fixed size bins (bins=10)

DPP4i-MET_l_prcd
Categorical

IMBALANCE 

Distinct4
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Memory size932.0 B
<NA>
90 
40164922
 
7
42708090
 
2
42708088
 
1

Length

Max length8
Median length4
Mean length4.4
Min length4

Unique

Unique1 ?
Unique (%)1.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 90
90.0%
40164922 7
 
7.0%
42708090 2
 
2.0%
42708088 1
 
1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 90
90.0%
40164922 7
 
7.0%
42708090 2
 
2.0%
42708088 1
 
1.0%

Insul_f_date
Date

MISSING 

Distinct37
Distinct (%)94.9%
Missing61
Missing (%)61.0%
Memory size932.0 B
Minimum2015-09-03 00:00:00
Maximum2020-04-17 00:00:00
Histogram with fixed size bins (bins=37)

Insul_f_prcd
Real number (ℝ)

MISSING 

Distinct7
Distinct (%)17.9%
Missing61
Missing (%)61.0%
Infinite0
Infinite (%)0.0%
Mean37877863
Minimum35779361
Maximum42921713
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum35779361
5-th percentile35779361
Q135781503
median35781503
Q340755064
95-th percentile42921713
Maximum42921713
Range7142352
Interquartile range (IQR)4973561

Descriptive statistics

Standard deviation2779463.9
Coefficient of variation (CV)0.073379639
Kurtosis-1.3899984
Mean37877863
Median Absolute Deviation (MAD)2142
Skewness0.69917761
Sum1.4772367 × 10^9
Variance7.7254198 × 10^12
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
35781503 | 17 | 17.0%
35779361 | 6 | 6.0%
40755064 | 5 | 5.0%
41348914 | 5 | 5.0%
42921713 | 3 | 3.0%
36809748 | 2 | 2.0%
41370419 | 1 | 1.0%
(Missing) | 61 | 61.0%

Value | Count | Frequency (%)
35779361 | 6 | 6.0%
35781503 | 17 | 17.0%
36809748 | 2 | 2.0%
40755064 | 5 | 5.0%
41348914 | 5 | 5.0%
41370419 | 1 | 1.0%
42921713 | 3 | 3.0%

Value | Count | Frequency (%)
42921713 | 3 | 3.0%
41370419 | 1 | 1.0%
41348914 | 5 | 5.0%
40755064 | 5 | 5.0%
36809748 | 2 | 2.0%
35781503 | 17 | 17.0%
35779361 | 6 | 6.0%

Insul_l_date
Date

MISSING 

Distinct35
Distinct (%)97.2%
Missing64
Missing (%)64.0%
Memory size932.0 B
Minimum2015-10-19 00:00:00
Maximum2020-05-04 00:00:00
Histogram with fixed size bins (bins=35)

Insul_l_prcd
Real number (ℝ)

MISSING 

Distinct9
Distinct (%)25.0%
Missing64
Missing (%)64.0%
Infinite0
Infinite (%)0.0%
Mean38829948
Minimum35159339
Maximum42921713
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.0 KiB

Quantile statistics

Minimum35159339
5-th percentile35624356
Q135781503
median36809748
Q342920572
95-th percentile42921713
Maximum42921713
Range7762374
Interquartile range (IQR)7139069

Descriptive statistics

Standard deviation3309731.5
Coefficient of variation (CV)0.085236568
Kurtosis-1.8952458
Mean38829948
Median Absolute Deviation (MAD)1650409
Skewness0.22043654
Sum1.3978781 × 10^9
Variance1.0954322 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
35781503 | 13 | 13.0%
42921713 | 8 | 8.0%
40755064 | 4 | 4.0%
42920572 | 3 | 3.0%
35779361 | 2 | 2.0%
35159339 | 2 | 2.0%
36809748 | 2 | 2.0%
41348914 | 1 | 1.0%
40717097 | 1 | 1.0%
(Missing) | 64 | 64.0%

Value | Count | Frequency (%)
35159339 | 2 | 2.0%
35779361 | 2 | 2.0%
35781503 | 13 | 13.0%
36809748 | 2 | 2.0%
40717097 | 1 | 1.0%
40755064 | 4 | 4.0%
41348914 | 1 | 1.0%
42920572 | 3 | 3.0%
42921713 | 8 | 8.0%

Value | Count | Frequency (%)
42921713 | 8 | 8.0%
42920572 | 3 | 3.0%
41348914 | 1 | 1.0%
40755064 | 4 | 4.0%
40717097 | 1 | 1.0%
36809748 | 2 | 2.0%
35781503 | 13 | 13.0%
35779361 | 2 | 2.0%
35159339 | 2 | 2.0%

Sample

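(index) | RID | SU_f_date | SU_f_prcd | SU_l_date | SU_l_prcd | SU-MET_f_date | SU-MET_f_prcd | SU-MET_l_date | SU-MET_l_prcd | Meg_f_date | Meg_f_prcd | Meg_l_date | Meg_l_prcd | Met_f_date | Met_f_prcd | Met_l_date | Met_l_prcd | TZD_f_date | TZD_f_prcd | TZD_l_date | TZD_l_prcd | DPP4i_f_date | DPP4i_f_prcd | DPP4i_l_date | DPP4i_l_prcd | DPP4i-MET_f_date | DPP4i-MET_f_prcd | DPP4i-MET_l_date | DPP4i-MET_l_prcd | Insul_f_date | Insul_f_prcd | Insul_l_date | Insul_l_prcd
0 | 1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2019-10-16 | 19106521 | 2020-02-14 | 19106521 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2019-09-16 | 40755064 | 2020-02-14 | 40755064
1 | 2 | 2015-09-10 | 21133671 | 2019-02-26 | 21133671 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-01-21 | 40164897 | 2018-02-27 | 40164897 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 3 | 2017-01-31 | 19101729 | 2017-04-04 | 19101729 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-12-29 | 40239218 | 2019-08-30 | 43013924 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-08-19 | 40164946 | 2019-03-13 | 40164929 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-01-14 | 35781503 | 2016-01-14 | 35781503
4 | 5 | 2017-12-22 | 19101729 | 2019-05-03 | 21133671 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2015-10-12 | 40164946 | 2019-05-03 | 40164929 | 2017-10-26 | 19079293 | 2018-02-20 | 19079293 | 2015-10-12 | 40239218 | 2017-10-26 | 40239218 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2015-11-16 | 40239218 | 2020-04-09 | 40239218 | <NA> | <NA> | <NA> | <NA> | 2015-11-09 | 35781503 | 2020-04-09 | 42921713
6 | 7 | 2020-03-16 | 19059797 | 2020-04-10 | 19059797 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2020-03-23 | 19125041 | 2020-04-10 | 19125041 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 8 | 2015-09-25 | 21133671 | 2019-03-13 | 21133671 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2015-09-25 | 1525221 | 2016-03-29 | 1525221 | 2015-09-25 | 42961500 | 2020-03-24 | 42960599 | <NA> | <NA> | <NA> | <NA> | 2016-11-08 | 35779361 | 2020-03-24 | 42920572
8 | 9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 10 | 2019-09-04 | 21133671 | 2020-01-23 | 21133671 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>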

(index) | RID | SU_f_date | SU_f_prcd | SU_l_date | SU_l_prcd | SU-MET_f_date | SU-MET_f_prcd | SU-MET_l_date | SU-MET_l_prcd | Meg_f_date | Meg_f_prcd | Meg_l_date | Meg_l_prcd | Met_f_date | Met_f_prcd | Met_l_date | Met_l_prcd | TZD_f_date | TZD_f_prcd | TZD_l_date | TZD_l_prcd | DPP4i_f_date | DPP4i_f_prcd | DPP4i_l_date | DPP4i_l_prcd | DPP4i-MET_f_date | DPP4i-MET_f_prcd | DPP4i-MET_l_date | DPP4i-MET_l_prcd | Insul_f_date | Insul_f_prcd | Insul_l_date | Insul_l_prcd
90 | 91 | 2016-03-09 | 1597758 | 2016-04-06 | DGMPD3 | 2015-09-16 | 42953740 | 2016-08-10 | 42953740 | <NA> | <NA> | <NA> | <NA> | 2017-02-20 | 40164929 | 2020-03-16 | 40164929 | <NA> | <NA> | <NA> | <NA> | 2015-09-16 | 40239218 | 2020-03-16 | 42960599 | <NA> | <NA> | <NA> | <NA> | 2017-02-06 | 36809748 | 2020-03-16 | 42921713
91 | 92 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
92 | 93 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2020-03-26 | 40164929 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2020-03-26 | 42921713 | 2020-04-25 | 42921713
93 | 94 | <NA> | <NA> | <NA> | <NA> | 2019-12-12 | 42953740 | 2020-01-08 | 42953740 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2019-12-12 | 42961500 | 2020-01-08 | 42961500 | <NA> | <NA> | <NA> | <NA> | 2019-12-23 | 35781503 | 2019-12-23 | 35781503
94 | 95 | <NA> | <NA> | <NA> | <NA> | 2016-05-23 | 42953698 | 2017-08-18 | 42953740 | <NA> | <NA> | <NA> | <NA> | 2017-08-28 | 40164929 | 2017-08-30 | 40164929 | <NA> | <NA> | <NA> | <NA> | 2017-08-26 | 42961500 | 2017-08-30 | 42961500 | <NA> | <NA> | <NA> | <NA> | 2015-09-07 | 40755064 | 2017-08-30 | 36809748
95 | 96 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
96 | 97 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-04-18 | 40164929 | 2016-05-23 | 40164929 | <NA> | <NA> | <NA> | <NA> | 2016-04-18 | 40239218 | 2016-05-23 | 40239218 | <NA> | <NA> | <NA> | <NA> | 2016-04-18 | 35779361 | 2016-05-23 | 35779361
97 | 98 | 2019-07-01 | 1597758 | 2020-03-26 | DGMPD4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2019-07-01 | 40164897 | 2020-03-26 | 40164897 | <NA> | <NA> | <NA> | <NA> | 2019-07-01 | 42961500 | 2020-03-26 | 42961500 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
98 | 99 | 2015-09-22 | 1597772 | 2020-04-27 | DGMPD2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2015-09-22 | 40164929 | 2020-04-27 | 40164929 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
99 | 100 | 2015-12-17 | 19101729 | 2017-02-04 | 19101729 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2016-01-27 | 42960773 | <NA> | <NA> | 2015-09-30 | 42961500 | 2017-02-17 | 40239218 | 2015-10-21 | 40164922 | 2015-10-21 | 40164922 | 2015-09-30 | 35779361 | 2017-02-20 | 35781503
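
Each drug class carries a first (`_f_date`) and last (`_l_date`) prescription date, so one natural derived quantity is the span between them. A minimal sketch, assuming ISO-formatted date strings and treating `<NA>` or `None` as missing; the helper name `days_between` is hypothetical, and the example values are the SU dates of RID 2 and the Insul dates of RID 4 from the sample rows.

```python
from datetime import date

def days_between(first, last):
    """Days from first to last prescription date, or None if either is missing."""
    missing = (None, "<NA>")
    if first in missing or last in missing:
        return None
    return (date.fromisoformat(last) - date.fromisoformat(first)).days

# RID 2, sulfonylurea: SU_f_date and SU_l_date from the sample rows.
su_span = days_between("2015-09-10", "2019-02-26")

# RID 4, insulin: first and last dates coincide, so the span is zero days.
insulin_span = days_between("2016-01-14", "2016-01-14")

# A patient with no prescription in a class yields None rather than an error.
no_span = days_between("<NA>", "<NA>")

print(su_span, insulin_span, no_span)
```

A span of zero days indicates a single prescription date rather than missing data, so the two cases are worth keeping apart in any downstream summary.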