Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 4
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 54.3 B
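
The two derived figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 100
missing_cells = 4
total_size_bytes = 5.3 * 1024  # 5.3 KiB, assuming 1 KiB = 1024 bytes

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average memory footprint of one row.
avg_record_bytes = total_size_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the reported 0.7% and 54.3 B.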

Variable types

Numeric: 5
DateTime: 1

Dataset

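Description: Blood test data from patients with diabetes, covering the tests that can be used to assess adverse drug effects. The test items include AST (GOT), ALT (GPT), BUN, and creatinine, the main items for assessing adverse effects in diabetes such as hepatotoxicity and nephrotoxicity.
- AST (aspartate aminotransferase; GOT, glutamic oxaloacetic transaminase) and ALT (alanine aminotransferase; GPT, glutamic pyruvic transaminase): aminotransferases that reflect hepatocellular injury; they are basic liver function test items.
- BUN (blood urea nitrogen): an item that can be used to assess hepatocellular injury or kidney function.
- Creatinine: produced in muscle from creatine; because it is little affected by factors other than kidney function, it is useful for assessing renal function.
Author: The Catholic University of Korea, Eunpyeong St. Mary's Hospital
URL: http://cmcdata.net/data/dataset/diabetes_sideeffects-eunpyeong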

Alerts

BUN_VAL is highly overall correlated with Cr_VAL (High correlation)
Cr_VAL is highly overall correlated with BUN_VAL (High correlation)
AST_VAL is highly overall correlated with ALT_VAL (High correlation)
ALT_VAL is highly overall correlated with AST_VAL (High correlation)
AST_VAL has 2 (2.0%) missing values (Missing)
ALT_VAL has 2 (2.0%) missing values (Missing)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:57:29.106924
Analysis finished: 2023-10-08 18:57:35.157148
Duration: 6.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
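
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from this block:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2023-10-08 18:57:29.106924")
finished = datetime.fromisoformat("2023-10-08 18:57:35.157148")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```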

Variables

RID
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
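
Because RID runs from 1 to 100 without gaps, its quantiles can be checked by hand. The sketch below assumes linear interpolation between neighbouring order statistics; the report does not state its interpolation rule, but this one reproduces the figures above. The `percentile` helper is hypothetical, written for this illustration:

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between order statistics (q in 0..100)."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

rid = list(range(1, 101))  # RID takes each integer from 1 to 100 exactly once

p5 = percentile(rid, 5)
q1 = percentile(rid, 25)
median = percentile(rid, 50)
q3 = percentile(rid, 75)
p95 = percentile(rid, 95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```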

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
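
The descriptive statistics for RID can be recomputed from the sequence 1 to 100. This sketch assumes the sample (n − 1) variance and standard deviation, which is the choice consistent with the reported 841.66667 (the population variance would be 833.25):

```python
import statistics

rid = list(range(1, 101))  # RID takes each integer from 1 to 100 exactly once

total = sum(rid)
mean = statistics.mean(rid)
variance = statistics.variance(rid)   # sample variance (n - 1 denominator)
std = statistics.stdev(rid)           # sample standard deviation
cv = std / mean                       # coefficient of variation

# Median absolute deviation: median distance from the median.
center = statistics.median(rid)
mad = statistics.median([abs(x - center) for x in rid])

print(total, mean, round(variance, 5), round(std, 6), round(cv, 8), mad)
```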
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum 10 values
Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Maximum 10 values
Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

BUN/Cr_DATE
DateTime

Distinct: 91
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-10-01 00:00:00
Maximum: 2020-01-31 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

BUN_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 84
Distinct (%): 84.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.07
Minimum: 7.6
Maximum: 121.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 7.6
5th percentile: 9.86
Q1: 13.3
Median: 16.65
Q3: 22.675
95th percentile: 35.655
Maximum: 121.6
Range: 114
Interquartile range (IQR): 9.375

Descriptive statistics

Standard deviation: 13.680654
Coefficient of variation (CV): 0.68164695
Kurtosis: 31.387032
Mean: 20.07
Median Absolute Deviation (MAD): 4.35
Skewness: 4.7435752
Sum: 2007
Variance: 187.1603
Monotonicity: Not monotonic
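
Several BUN_VAL figures are derived from others in the same block. A short sketch of those relationships, using only the summary values reported above (the dictionary keys are illustrative names, not report fields):

```python
# Summary values copied from the BUN_VAL block.
bun = {
    "min": 7.6,
    "max": 121.6,
    "q1": 13.3,
    "q3": 22.675,
    "mean": 20.07,
    "std": 13.680654,
    "sum": 2007,
    "count": 100,
}

value_range = bun["max"] - bun["min"]   # Range
iqr = bun["q3"] - bun["q1"]             # Interquartile range
cv = bun["std"] / bun["mean"]           # Coefficient of variation
variance = bun["std"] ** 2              # Variance is the squared standard deviation
mean_from_sum = bun["sum"] / bun["count"]

print(round(value_range, 3), round(iqr, 3), round(cv, 6), round(variance, 3), mean_from_sum)
```

The wide gap between the range (114) and the IQR (9.375) is in line with the high skewness and kurtosis reported for this variable.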
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value              Count  Frequency (%)
15.3               3      3.0%
14.4               3      3.0%
12.3               2      2.0%
18.4               2      2.0%
17.5               2      2.0%
16.9               2      2.0%
14.1               2      2.0%
16.6               2      2.0%
24.8               2      2.0%
13.3               2      2.0%
Other values (74)  78     78.0%

Minimum 10 values
Value  Count  Frequency (%)
7.6    1      1.0%
8.5    1      1.0%
9.0    2      2.0%
9.1    1      1.0%
9.9    2      2.0%
10.2   1      1.0%
10.5   1      1.0%
10.7   1      1.0%
11.2   1      1.0%
11.4   2      2.0%

Maximum 10 values
Value  Count  Frequency (%)
121.6  1      1.0%
68.0   1      1.0%
44.4   1      1.0%
36.8   1      1.0%
36.7   1      1.0%
35.6   1      1.0%
35.3   1      1.0%
34.7   1      1.0%
33.1   1      1.0%
31.8   1      1.0%

Cr_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 65
Distinct (%): 65.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2227
Minimum: 0.48
Maximum: 9.07
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.48
5th percentile: 0.559
Q1: 0.72
Median: 0.9
Q3: 1.0725
95th percentile: 2.733
Maximum: 9.07
Range: 8.59
Interquartile range (IQR): 0.3525

Descriptive statistics

Standard deviation: 1.325764
Coefficient of variation (CV): 1.0842921
Kurtosis: 21.848524
Mean: 1.2227
Median Absolute Deviation (MAD): 0.18
Skewness: 4.4874165
Sum: 122.27
Variance: 1.7576502
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value              Count  Frequency (%)
0.92               4      4.0%
1.03               4      4.0%
0.95               4      4.0%
0.72               4      4.0%
1.0                3      3.0%
0.58               3      3.0%
0.87               3      3.0%
0.76               3      3.0%
0.89               3      3.0%
0.96               2      2.0%
Other values (55)  67     67.0%

Minimum 10 values
Value  Count  Frequency (%)
0.48   1      1.0%
0.49   1      1.0%
0.52   1      1.0%
0.53   1      1.0%
0.54   1      1.0%
0.56   2      2.0%
0.58   3      3.0%
0.59   2      2.0%
0.61   2      2.0%
0.62   1      1.0%

Maximum 10 values
Value  Count  Frequency (%)
9.07   1      1.0%
8.32   1      1.0%
6.49   1      1.0%
3.63   1      1.0%
3.55   1      1.0%
2.69   2      2.0%
2.51   1      1.0%
2.11   1      1.0%
1.73   1      1.0%
1.58   1      1.0%

AST_VAL
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 36
Distinct (%): 36.7%
Missing: 2
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.112245
Minimum: 12
Maximum: 564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 12
5th percentile: 14.7
Q1: 18
Median: 23
Q3: 31.75
95th percentile: 71.6
Maximum: 564
Range: 552
Interquartile range (IQR): 13.75

Descriptive statistics

Standard deviation: 56.328543
Coefficient of variation (CV): 1.7011394
Kurtosis: 83.472625
Mean: 33.112245
Median Absolute Deviation (MAD): 6
Skewness: 8.8318612
Sum: 3245
Variance: 3172.9048
Monotonicity: Not monotonic
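
AST_VAL has missing values, so its percentages use two different denominators. The sketch below assumes that Missing (%) is taken over all 100 rows while Distinct (%) and Mean are taken over the 98 non-missing rows; the report does not say so explicitly, but the figures above are consistent with it:

```python
# Counts copied from the AST_VAL block.
n_rows = 100
n_missing = 2
n_distinct = 36
value_sum = 3245

n_present = n_rows - n_missing                # rows with an observed AST value

missing_pct = 100 * n_missing / n_rows        # share of all rows
distinct_pct = 100 * n_distinct / n_present   # share of non-missing rows
mean = value_sum / n_present                  # mean over non-missing rows

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct, mean {mean:.6f}")
```

Dividing the distinct count by all 100 rows would give 36.0% rather than the reported 36.7%.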
[Figure: histogram with fixed-size bins (bins=36)]

Common values
Value              Count  Frequency (%)
16                 7      7.0%
25                 7      7.0%
19                 6      6.0%
15                 6      6.0%
20                 5      5.0%
21                 5      5.0%
27                 5      5.0%
18                 5      5.0%
23                 4      4.0%
12                 4      4.0%
Other values (26)  44     44.0%

Minimum 10 values
Value  Count  Frequency (%)
12     4      4.0%
13     1      1.0%
15     6      6.0%
16     7      7.0%
17     4      4.0%
18     5      5.0%
19     6      6.0%
20     5      5.0%
21     5      5.0%
22     3      3.0%

Maximum 10 values
Value  Count  Frequency (%)
564    1      1.0%
89     1      1.0%
86     1      1.0%
75     2      2.0%
71     1      1.0%
61     1      1.0%
52     1      1.0%
50     1      1.0%
47     2      2.0%
42     1      1.0%

ALT_VAL
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 41
Distinct (%): 41.8%
Missing: 2
Missing (%): 2.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.285714
Minimum: 6
Maximum: 375
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 6
5th percentile: 11.85
Q1: 18
Median: 24.5
Q3: 35.5
95th percentile: 63
Maximum: 375
Range: 369
Interquartile range (IQR): 17.5

Descriptive statistics

Standard deviation: 39.187627
Coefficient of variation (CV): 1.1773107
Kurtosis: 60.537129
Mean: 33.285714
Median Absolute Deviation (MAD): 7.5
Skewness: 7.0752103
Sum: 3262
Variance: 1535.6701
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=41)]

Common values
Value              Count  Frequency (%)
21                 7      7.0%
19                 5      5.0%
31                 5      5.0%
24                 4      4.0%
14                 4      4.0%
17                 4      4.0%
18                 4      4.0%
29                 3      3.0%
61                 3      3.0%
45                 3      3.0%
Other values (31)  56     56.0%

Minimum 10 values
Value  Count  Frequency (%)
6      1      1.0%
10     2      2.0%
11     2      2.0%
12     3      3.0%
13     3      3.0%
14     4      4.0%
15     3      3.0%
16     1      1.0%
17     4      4.0%
18     4      4.0%

Maximum 10 values
Value  Count  Frequency (%)
375    1      1.0%
99     1      1.0%
98     1      1.0%
68     1      1.0%
63     3      3.0%
61     3      3.0%
59     2      2.0%
53     1      1.0%
52     1      1.0%
51     2      2.0%

Interactions

[Figure: pairwise interaction plots of the numeric variables]

Correlations

[Figure: correlation heatmap]
             RID    BUN/Cr_DATE  BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL
RID          1.000  0.726        0.266    0.214   0.131    0.000
BUN/Cr_DATE  0.726  1.000        0.975    0.970   0.000    0.000
BUN_VAL      0.266  0.975        1.000    0.948   0.000    0.000
Cr_VAL       0.214  0.970        0.948    1.000   0.000    0.000
AST_VAL      0.131  0.000        0.000    0.000   1.000    0.779
ALT_VAL      0.000  0.000        0.000    0.000   0.779    1.000

[Figure: correlation heatmap]
         RID     BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL
RID      1.000   0.063    -0.068  -0.011   0.216
BUN_VAL  0.063   1.000    0.711   -0.078   -0.179
Cr_VAL   -0.068  0.711    1.000   -0.032   -0.043
AST_VAL  -0.011  -0.078   -0.032  1.000    0.652
ALT_VAL  0.216   -0.179   -0.043  0.652    1.000
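
The "High correlation" alerts can be related to the first matrix by thresholding its off-diagonal entries. The sketch below is an illustration only: the cutoff of 0.5 is an assumption (the report does not print the threshold it used), and the matrix is restricted to the four lab-value columns, with coefficients copied from the first correlation matrix:

```python
THRESHOLD = 0.5  # hypothetical cutoff, not taken from the report

columns = ["BUN_VAL", "Cr_VAL", "AST_VAL", "ALT_VAL"]
# Coefficients copied from the first correlation matrix (lab-value columns only).
matrix = [
    [1.000, 0.948, 0.000, 0.000],
    [0.948, 1.000, 0.000, 0.000],
    [0.000, 0.000, 1.000, 0.779],
    [0.000, 0.000, 0.779, 1.000],
]

# Walk the upper triangle so each pair is reported once.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Under that assumed cutoff the two flagged pairs are BUN_VAL with Cr_VAL and AST_VAL with ALT_VAL, the same pairs named in the Alerts section.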

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index  RID  BUN/Cr_DATE          BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL
0      1    2019-08-19T00:00:00  29.4     1.4     23       11
1      2    2016-12-27T00:00:00  15.3     0.95    89       99
2      3    2019-07-23T00:00:00  12.0     0.56    24       13
3      4    2017-10-31T00:00:00  12.7     0.87    21       27
4      5    2019-11-01T00:00:00  35.3     1.49    25       14
5      6    2017-04-14T00:00:00  9.1      0.75    71       44
6      7    2019-04-16T00:00:00  30.3     2.51    25       18
7      8    2019-06-20T00:00:00  22.9     1.03    32       22
8      9    2016-09-06T00:00:00  14.6     0.84    18       12
9      10   2019-05-21T00:00:00  25.4     1.47    36       31

Last rows
Index  RID  BUN/Cr_DATE          BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL
90     91   2019-08-19T00:00:00  18.2     0.73    18       21
91     92   2016-04-04T00:00:00  31.8     1.03    <NA>     <NA>
92     93   2019-04-26T00:00:00  14.7     0.72    20       20
93     94   2020-01-30T00:00:00  29.9     3.63    38       27
94     95   2019-05-23T00:00:00  121.6    3.55    15       28
95     96   2019-05-22T00:00:00  30.6     0.86    35       45
96     97   2019-10-10T00:00:00  8.5      0.48    12       15
97     98   2019-11-05T00:00:00  13.4     0.66    41       26
98     99   2018-07-10T00:00:00  17.5     0.92    31       53
99     100  2018-08-16T00:00:00  14.1     0.76    47       52
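
Sample rows like these mix ISO timestamps, decimals, and the `<NA>` marker for missing lab values. A minimal sketch of turning two of the rows (RID 92 and RID 95) into typed records; the `parse_number` and `parse_row` helpers are hypothetical and written only for this illustration:

```python
from datetime import datetime

FIELDS = ["RID", "BUN/Cr_DATE", "BUN_VAL", "Cr_VAL", "AST_VAL", "ALT_VAL"]

# Two rows from the sample, kept as text the way they appear in the report.
raw_rows = [
    ("92", "2016-04-04T00:00:00", "31.8", "1.03", "<NA>", "<NA>"),
    ("95", "2019-05-23T00:00:00", "121.6", "3.55", "15", "28"),
]

def parse_number(text):
    """Return a float, or None when the cell holds the <NA> marker."""
    return None if text == "<NA>" else float(text)

def parse_row(row):
    """Convert one row of strings into a dict with typed values."""
    record = dict(zip(FIELDS, row))
    record["RID"] = int(record["RID"])
    record["BUN/Cr_DATE"] = datetime.fromisoformat(record["BUN/Cr_DATE"])
    for name in FIELDS[2:]:
        record[name] = parse_number(record[name])
    return record

records = [parse_row(row) for row in raw_rows]

# Count missing cells per column across the parsed rows.
missing_by_column = {
    name: sum(1 for record in records if record[name] is None) for name in FIELDS
}
print(missing_by_column)
```

In RID 92 both AST_VAL and ALT_VAL are missing in the same row.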