Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 62.3 B

Variable types

Text: 1
DateTime: 1
Numeric: 5

Dataset

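Description: Contains blood test data from patients with diabetes that can be used to assess liver and kidney function. The test items are BUN, Creatinine, AST (GOT), ALT (GPT), and MDRD-eGFR.
- AST (aspartate aminotransferase; GOT, glutamic oxaloacetic transaminase) and ALT (alanine aminotransferase; GPT, glutamic pyruvate transaminase): aminotransferases that reflect hepatocyte damage; these are basic liver function test items.
- BUN (blood urea nitrogen): an item that can be used to assess hepatocyte damage or kidney function.
- Creatinine: produced in muscle from creatine; because it is little affected by factors other than kidney function, it is useful for assessing renal function.
- MDRD-eGFR (Modification of Diet in Renal Disease Study estimated glomerular filtration rate): a value indicating how well the kidneys are functioning, obtained by measuring the blood creatinine level and applying the MDRD formula to the result.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital
URL: http://cmcdata.net/data/dataset/diabetes_lab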
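
The description says MDRD-eGFR is computed from blood creatinine with the MDRD formula but does not state which variant was used. As a minimal sketch, assuming the widely published four-variable, IDMS-traceable form (coefficient 175), the calculation could look like the following. The function name and its parameters are hypothetical, and age and sex are inputs that do not appear as columns in this dataset, so values in MDRD_VAL cannot be reproduced from the report alone.

```python
def mdrd_egfr(scr_mg_dl: float, age_years: float, female: bool, black: bool = False) -> float:
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    Assumption: IDMS-traceable variant (coefficient 175). The dataset's
    actual variant is not stated in the report.
    """
    if scr_mg_dl <= 0 or age_years <= 0:
        raise ValueError("serum creatinine and age must be positive")
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742  # sex adjustment factor
    if black:
        egfr *= 1.212  # race adjustment factor from the original equation
    return egfr


# Hypothetical patient: creatinine 1.0 mg/dL, 50 years old, male.
print(round(mdrd_egfr(1.0, 50, female=False), 1))
```

Because creatinine enters with a negative exponent, a higher Cr_VAL gives a lower estimate, which is consistent with the negative Cr_VAL and MDRD_VAL coefficient reported under Correlations.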

Alerts

Cr_VAL is highly overall correlated with MDRD_VAL (High correlation)
AST_VAL is highly overall correlated with ALT_VAL (High correlation)
ALT_VAL is highly overall correlated with AST_VAL (High correlation)
MDRD_VAL is highly overall correlated with Cr_VAL (High correlation)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:57:13.256801
Analysis finished: 2023-10-08 18:57:21.459808
Duration: 8.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
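
The character counts reported for RID can be reproduced with the standard library. This is a sketch under the assumption that the identifiers run sequentially from R0000001 to R0000100, which the sample rows and the 100 distinct values suggest but the report does not state outright.

```python
import unicodedata
from collections import Counter

# Assumption: RIDs are sequential, R0000001 through R0000100.
rids = [f"R{i:07d}" for i in range(1, 101)]

# Count every character across all identifiers.
char_counts = Counter("".join(rids))

# Count Unicode general categories: "Nd" is Decimal Number, "Lu" is Uppercase Letter.
category_counts = Counter(unicodedata.category(ch) for rid in rids for ch in rid)

total_chars = sum(char_counts.values())
print(total_chars, len(char_counts))   # total and distinct characters
print(char_counts.most_common(3))      # most occurring characters
print(dict(category_counts))           # characters per category
```

Under that assumption the totals match the report: 800 characters, 11 distinct characters, 519 occurrences of "0", and a split of 700 decimal digits to 100 uppercase letters.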

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000001
2nd row: R0000002
3rd row: R0000003
4th row: R0000004
5th row: R0000005

Value              Count  Frequency (%)
r0000001           1      1.0%
r0000063           1      1.0%
r0000074           1      1.0%
r0000073           1      1.0%
r0000072           1      1.0%
r0000071           1      1.0%
r0000070           1      1.0%
r0000069           1      1.0%
r0000068           1      1.0%
r0000067           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      519    64.9%
R      100    12.5%
1      21     2.6%
3      20     2.5%
4      20     2.5%
5      20     2.5%
6      20     2.5%
7      20     2.5%
8      20     2.5%
9      20     2.5%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      519    74.1%
1      21     3.0%
3      20     2.9%
4      20     2.9%
5      20     2.9%
6      20     2.9%
7      20     2.9%
8      20     2.9%
9      20     2.9%
2      20     2.9%

Uppercase Letter
Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      519    74.1%
1      21     3.0%
3      20     2.9%
4      20     2.9%
5      20     2.9%
6      20     2.9%
7      20     2.9%
8      20     2.9%
9      20     2.9%
2      20     2.9%

Latin
Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      519    64.9%
R      100    12.5%
1      21     2.6%
3      20     2.5%
4      20     2.5%
5      20     2.5%
6      20     2.5%
7      20     2.5%
8      20     2.5%
9      20     2.5%

BUN/Cr_DATE
DateTime

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2009-06-01 00:00:00
Maximum: 2019-05-01 00:00:00
Histogram with fixed size bins (bins=50)

BUN_VAL
Real number (ℝ)

Distinct: 80
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.849
Minimum: 5.7
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5.7
5-th percentile: 9.18
Q1: 11.8
Median: 15.25
Q3: 19.375
95-th percentile: 26.12
Maximum: 84
Range: 78.3
Interquartile range (IQR): 7.575

Descriptive statistics

Standard deviation: 9.129954
Coefficient of variation (CV): 0.54186919
Kurtosis: 29.572725
Mean: 16.849
Median Absolute Deviation (MAD): 3.7
Skewness: 4.4307295
Sum: 1684.9
Variance: 83.35606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
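
Several of the BUN_VAL statistics above are derived from one another, so they can be cross-checked with plain arithmetic. The sketch below uses only the figures printed in this report; the variable names are illustrative, and the relations are the standard definitions (CV as standard deviation over mean, IQR as Q3 minus Q1, variance as the squared standard deviation).

```python
import math

# Values copied from the BUN_VAL section of the report.
n = 100
mean, std = 16.849, 9.129954
q1, q3 = 11.8, 19.375
minimum, maximum = 5.7, 84.0

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance from the standard deviation
total = mean * n                 # sum of all observations

# Compare against the reported figures, allowing for rounding in the report.
print(math.isclose(cv, 0.54186919, abs_tol=1e-6))
print(math.isclose(iqr, 7.575, abs_tol=1e-9))
print(math.isclose(value_range, 78.3, abs_tol=1e-9))
print(math.isclose(variance, 83.35606, abs_tol=1e-4))
print(math.isclose(total, 1684.9, abs_tol=1e-9))
```

The same checks apply to the other numeric variables by substituting their reported values.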
Value              Count  Frequency (%)
17.9               3      3.0%
11.0               3      3.0%
12.1               3      3.0%
11.9               3      3.0%
17.7               2      2.0%
14.3               2      2.0%
14.7               2      2.0%
10.8               2      2.0%
21.1               2      2.0%
11.8               2      2.0%
Other values (70)  76     76.0%

Value  Count  Frequency (%)
5.7    1      1.0%
7.2    1      1.0%
7.6    1      1.0%
7.8    1      1.0%
8.8    1      1.0%
9.2    1      1.0%
9.4    1      1.0%
9.5    2      2.0%
9.8    2      2.0%
9.9    2      2.0%

Value  Count  Frequency (%)
84.0   1      1.0%
42.8   1      1.0%
37.3   1      1.0%
31.1   1      1.0%
30.3   1      1.0%
25.9   1      1.0%
24.9   1      1.0%
24.8   1      1.0%
24.7   1      1.0%
23.6   1      1.0%

Cr_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9646
Minimum: 0.48
Maximum: 5.91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.48
5-th percentile: 0.58
Q1: 0.72
Median: 0.87
Q3: 1.0125
95-th percentile: 1.4415
Maximum: 5.91
Range: 5.43
Interquartile range (IQR): 0.2925

Descriptive statistics

Standard deviation: 0.58045162
Coefficient of variation (CV): 0.6017537
Kurtosis: 54.033165
Mean: 0.9646
Median Absolute Deviation (MAD): 0.15
Skewness: 6.587708
Sum: 96.46
Variance: 0.33692408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0.69               5      5.0%
0.79               4      4.0%
1.05               3      3.0%
0.92               3      3.0%
0.87               3      3.0%
1.01               3      3.0%
0.78               3      3.0%
0.84               3      3.0%
0.9                3      3.0%
0.83               3      3.0%
Other values (48)  67     67.0%

Value  Count  Frequency (%)
0.48   1      1.0%
0.5    1      1.0%
0.51   1      1.0%
0.56   1      1.0%
0.58   2      2.0%
0.59   1      1.0%
0.6    1      1.0%
0.61   2      2.0%
0.62   1      1.0%
0.65   1      1.0%

Value  Count  Frequency (%)
5.91   1      1.0%
2.22   1      1.0%
2.14   1      1.0%
1.98   1      1.0%
1.47   1      1.0%
1.44   1      1.0%
1.43   1      1.0%
1.41   1      1.0%
1.31   1      1.0%
1.27   1      1.0%

AST_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.59
Minimum: 12
Maximum: 196
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 12
5-th percentile: 15
Q1: 18
Median: 22
Q3: 29
95-th percentile: 64.5
Maximum: 196
Range: 184
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 23.3403
Coefficient of variation (CV): 0.81637985
Kurtosis: 28.546768
Mean: 28.59
Median Absolute Deviation (MAD): 5
Skewness: 4.7253139
Sum: 2859
Variance: 544.7696
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value              Count  Frequency (%)
20                 7      7.0%
18                 7      7.0%
15                 7      7.0%
19                 7      7.0%
17                 6      6.0%
21                 5      5.0%
28                 5      5.0%
22                 5      5.0%
24                 4      4.0%
36                 4      4.0%
Other values (24)  43     43.0%

Value  Count  Frequency (%)
12     1      1.0%
14     3      3.0%
15     7      7.0%
16     4      4.0%
17     6      6.0%
18     7      7.0%
19     7      7.0%
20     7      7.0%
21     5      5.0%
22     5      5.0%

Value  Count  Frequency (%)
196    1      1.0%
119    1      1.0%
85     1      1.0%
75     1      1.0%
74     1      1.0%
64     1      1.0%
55     1      1.0%
53     1      1.0%
50     1      1.0%
45     1      1.0%

ALT_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.83
Minimum: 7
Maximum: 218
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 7
5-th percentile: 13
Q1: 18
Median: 24
Q3: 37.25
95-th percentile: 91.25
Maximum: 218
Range: 211
Interquartile range (IQR): 19.25

Descriptive statistics

Standard deviation: 31.136407
Coefficient of variation (CV): 0.8939537
Kurtosis: 15.066775
Mean: 34.83
Median Absolute Deviation (MAD): 8
Skewness: 3.3896001
Sum: 3483
Variance: 969.47586
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
18                 7      7.0%
20                 6      6.0%
19                 6      6.0%
16                 5      5.0%
13                 4      4.0%
21                 4      4.0%
29                 4      4.0%
28                 3      3.0%
24                 3      3.0%
14                 3      3.0%
Other values (41)  55     55.0%

Value  Count  Frequency (%)
7      1      1.0%
9      1      1.0%
11     1      1.0%
12     1      1.0%
13     4      4.0%
14     3      3.0%
15     2      2.0%
16     5      5.0%
17     2      2.0%
18     7      7.0%

Value  Count  Frequency (%)
218    1      1.0%
175    1      1.0%
100    1      1.0%
96     2      2.0%
91     1      1.0%
90     1      1.0%
72     1      1.0%
71     1      1.0%
67     1      1.0%
66     1      1.0%

MDRD_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.9759
Minimum: 9.62
Maximum: 146.34
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 9.62
5-th percentile: 36.9555
Q1: 70.7375
Median: 82.88
Q3: 94.6825
95-th percentile: 121.599
Maximum: 146.34
Range: 136.72
Interquartile range (IQR): 23.945

Descriptive statistics

Standard deviation: 22.999817
Coefficient of variation (CV): 0.28056803
Kurtosis: 1.215002
Mean: 81.9759
Median Absolute Deviation (MAD): 12.165
Skewness: -0.34966167
Sum: 8197.59
Variance: 528.99157
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
88.8               2      2.0%
85.66              2      2.0%
78.84              2      2.0%
98.58              1      1.0%
80.17              1      1.0%
90.22              1      1.0%
64.94              1      1.0%
9.62               1      1.0%
62.76              1      1.0%
52.28              1      1.0%
Other values (87)  87     87.0%

Value  Count  Frequency (%)
9.62   1      1.0%
22.53  1      1.0%
24.78  1      1.0%
34.76  1      1.0%
36.49  1      1.0%
36.98  1      1.0%
46.45  1      1.0%
47.08  1      1.0%
52.28  1      1.0%
53.48  1      1.0%

Value   Count  Frequency (%)
146.34  1      1.0%
131.06  1      1.0%
126.28  1      1.0%
125.67  1      1.0%
122.91  1      1.0%
121.53  1      1.0%
116.96  1      1.0%
114.62  1      1.0%
109.69  1      1.0%
108.71  1      1.0%

Interactions

[Figures: 25 interaction plots (Matplotlib v3.7.2)]

Correlations

             RID    BUN/Cr_DATE  BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL  MDRD_VAL
RID          1.000  1.000        1.000    1.000   1.000    1.000    1.000
BUN/Cr_DATE  1.000  1.000        0.000    0.000   0.490    0.575    0.636
BUN_VAL      1.000  0.000        1.000    0.826   0.093    0.000    0.667
Cr_VAL       1.000  0.000        0.826    1.000   0.000    0.000    0.900
AST_VAL      1.000  0.490        0.093    0.000   1.000    0.879    0.000
ALT_VAL      1.000  0.575        0.000    0.000   0.879    1.000    0.000
MDRD_VAL     1.000  0.636        0.667    0.900   0.000    0.000    1.000

          BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL  MDRD_VAL
BUN_VAL   1.000    0.372   -0.192   -0.188   -0.358
Cr_VAL    0.372    1.000   0.190    0.191    -0.746
AST_VAL   -0.192   0.190   1.000    0.736    -0.070
ALT_VAL   -0.188   0.191   0.736    1.000    0.030
MDRD_VAL  -0.358   -0.746  -0.070   0.030    1.000
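
To read the strongest relationships off the five-variable matrix, one option is to scan the upper triangle for coefficients whose absolute value exceeds a cutoff. The sketch below is illustrative only: the cutoff of 0.7 and the helper name strong_pairs are assumptions made for this example, not settings taken from ydata-profiling, and the report does not say which coefficient this matrix contains.

```python
names = ["BUN_VAL", "Cr_VAL", "AST_VAL", "ALT_VAL", "MDRD_VAL"]

# Coefficients copied from the five-variable matrix in this report.
matrix = [
    [1.000, 0.372, -0.192, -0.188, -0.358],
    [0.372, 1.000, 0.190, 0.191, -0.746],
    [-0.192, 0.190, 1.000, 0.736, -0.070],
    [-0.188, 0.191, 0.736, 1.000, 0.030],
    [-0.358, -0.746, -0.070, 0.030, 1.000],
]

CUTOFF = 0.7  # hypothetical threshold chosen for illustration


def strong_pairs(names, matrix, cutoff):
    """Return (a, b, r) for each off-diagonal pair with |r| >= cutoff."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            r = matrix[i][j]
            if abs(r) >= cutoff:
                pairs.append((names[i], names[j], r))
    # Strongest relationship first, regardless of sign.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


result = strong_pairs(names, matrix, CUTOFF)
for a, b, r in result:
    print(f"{a} vs {b}: {r:+.3f}")
```

With this cutoff the two pairs returned are Cr_VAL with MDRD_VAL and AST_VAL with ALT_VAL, the same pairs named in the Alerts section.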

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    RID       BUN/Cr_DATE  BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL  MDRD_VAL
0   R0000001  2009-09      23.1     0.59    20       14       98.58
1   R0000002  2011-12      22.0     0.79    39       34       70.76
2   R0000003  2009-12      17.9     0.92    20       14       84.8
3   R0000004  2017-06      15.4     0.87    85       71       99.86
4   R0000005  2009-07      24.7     1.01    18       21       54.51
5   R0000006  2015-07      13.3     0.78    15       16       73.01
6   R0000007  2017-08      11.4     0.84    24       41       103.39
7   R0000008  2015-12      15.1     0.75    119      175      79.94
8   R0000009  2010-08      12.0     0.94    36       72       88.44
9   R0000010  2015-11      9.9      0.72    17       23       80.55

    RID       BUN/Cr_DATE  BUN_VAL  Cr_VAL  AST_VAL  ALT_VAL  MDRD_VAL
90  R0000091  2016-04      12.8     0.72    31       67       121.53
91  R0000092  2009-09      14.1     0.71    18       18       80.92
92  R0000093  2015-11      16.6     0.66    16       13       88.8
93  R0000094  2013-06      18.0     1.07    17       28       81.15
94  R0000095  2019-02      13.6     0.69    19       28       85.93
95  R0000096  2017-01      18.5     0.58    15       21       103.08
96  R0000097  2014-11      20.3     0.62    18       44       114.62
97  R0000098  2016-02      25.9     0.78    22       18       101.19
98  R0000099  2015-10      13.2     0.5     26       24       126.28
99  R0000100  2013-10      12.1     0.86    20       22       91.99