Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 7.7 KiB |
Average record size in memory | 79.3 B |
Variable types
Text | 1 |
---|---|
DateTime | 2 |
Numeric | 6 |
Dataset
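Description | Contains laboratory data from blood tests performed on patients with diabetes, which can be used to assess associations with comorbidities. The test items are HbA1c, TG, HDL, and LDL, which allow evaluation of nephropathy, retinopathy, myocardial infarction, cataract, and vascular disease. - HbA1c (glycated hemoglobin): hemoglobin in red blood cells to which glucose has bound. Whereas a standard blood glucose test shows only the glucose level at the time of testing, HbA1c reflects the average blood glucose over three months. - Total Cholesterol (TC): all of the cholesterol in the blood. - Triglyceride (TG): measures the amount of triglycerides in the blood. Why blood triglycerides rise is not clear, but elevated levels are associated with an increased risk of progression to cardiovascular disease. - LDL (Low Density Lipoprotein Cholesterol): low-density lipoprotein cholesterol, also called "bad" cholesterol. It makes up most of the body's cholesterol, and high levels raise the risk of heart disease and stroke. - HDL (High Density Lipoprotein Cholesterol): high-density lipoprotein cholesterol, also called "good" cholesterol, which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke. |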
---|---|
Author | The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원) |
URL | http://cmcdata.net/data/dataset/diabetes_coexlab |
Reproduction
Analysis started | 2023-10-08 18:56:11.458644 |
---|---|
Analysis finished | 2023-10-08 18:56:18.776097 |
Duration | 7.32 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
RID
Text
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
R0000001 | 1 | 1.0% |
R0000063 | 1 | 1.0% |
R0000074 | 1 | 1.0% |
R0000073 | 1 | 1.0% |
R0000072 | 1 | 1.0% |
R0000071 | 1 | 1.0% |
R0000070 | 1 | 1.0% |
R0000069 | 1 | 1.0% |
R0000068 | 1 | 1.0% |
R0000067 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 519 | 64.9% |
R | 100 | 12.5% |
1 | 21 | 2.6% |
3 | 20 | 2.5% |
4 | 20 | 2.5% |
5 | 20 | 2.5% |
6 | 20 | 2.5% |
7 | 20 | 2.5% |
8 | 20 | 2.5% |
9 | 20 | 2.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 700 | 87.5% |
Uppercase Letter | 100 | 12.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 519 | 74.1% |
1 | 21 | 3.0% |
3 | 20 | 2.9% |
4 | 20 | 2.9% |
5 | 20 | 2.9% |
6 | 20 | 2.9% |
7 | 20 | 2.9% |
8 | 20 | 2.9% |
9 | 20 | 2.9% |
2 | 20 | 2.9% |
Uppercase Letter
Value | Count | Frequency (%) |
R | 100 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 700 | 87.5% |
Latin | 100 | 12.5% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 519 | 74.1% |
1 | 21 | 3.0% |
3 | 20 | 2.9% |
4 | 20 | 2.9% |
5 | 20 | 2.9% |
6 | 20 | 2.9% |
7 | 20 | 2.9% |
8 | 20 | 2.9% |
9 | 20 | 2.9% |
2 | 20 | 2.9% |
Latin
Value | Count | Frequency (%) |
R | 100 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 800 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 519 | 64.9% |
R | 100 | 12.5% |
1 | 21 | 2.6% |
3 | 20 | 2.5% |
4 | 20 | 2.5% |
5 | 20 | 2.5% |
6 | 20 | 2.5% |
7 | 20 | 2.5% |
8 | 20 | 2.5% |
9 | 20 | 2.5% |
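The character counts in the RID tables can be cross-checked with a short script. This is a minimal sketch that assumes the identifiers run consecutively from R0000001 to R0000100, which the first and last sample rows suggest but the report does not state outright; the names `rids`, `counts`, and `total` are illustrative.

```python
from collections import Counter

# Assumption: the 100 identifiers are R0000001 ... R0000100 with no gaps.
rids = [f"R{i:07d}" for i in range(1, 101)]

# Tally every character across all identifiers.
counts = Counter("".join(rids))
total = sum(counts.values())

# Show the three most frequent characters with their share of all characters.
for char, n in counts.most_common(3):
    print(f"{char} | {n} | {100 * n / total:.1f}%")
```

Under that assumption the tally gives 519 zeros, 100 occurrences of R, and 21 ones out of 800 characters, in line with the counts reported above.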
A1C_DATE
Date
Distinct | 63 |
---|---|
Distinct (%) | 63.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2009-06-01 00:00:00 |
---|---|
Maximum | 2019-05-01 00:00:00 |
A1C_VAL
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 50 |
---|---|
Distinct (%) | 50.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7.963 |
Minimum | 5.5 |
---|---|
Maximum | 17.6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 5.5 |
---|---|
5-th percentile | 5.795 |
Q1 | 6.475 |
median | 7.35 |
Q3 | 8.925 |
95-th percentile | 12.105 |
Maximum | 17.6 |
Range | 12.1 |
Interquartile range (IQR) | 2.45 |
Descriptive statistics
Standard deviation | 2.1123808 |
---|---|
Coefficient of variation (CV) | 0.26527449 |
Kurtosis | 3.4973665 |
Mean | 7.963 |
Median Absolute Deviation (MAD) | 1.1 |
Skewness | 1.5920068 |
Sum | 796.3 |
Variance | 4.4621525 |
Monotonicity | Not monotonic |
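Several of the A1C_VAL statistics follow directly from others in the same tables. The sketch below recomputes them from the reported figures as a consistency check; it only uses numbers copied from this report, and the variable names are illustrative. Because the inputs are already rounded, agreement is expected only up to the last printed digit.

```python
# Figures copied from the A1C_VAL tables in this report.
n = 100
total = 796.3            # Sum
minimum, maximum = 5.5, 17.6
q1, q3 = 6.475, 8.925
std = 2.1123808          # Standard deviation

mean = total / n                 # reported Mean: 7.963
value_range = maximum - minimum  # reported Range: 12.1
iqr = q3 - q1                    # reported IQR: 2.45
cv = std / mean                  # reported CV: 0.26527449
variance = std ** 2              # reported Variance: 4.4621525

print(f"mean={mean:.3f} range={value_range:.1f} iqr={iqr:.2f}")
print(f"cv={cv:.6f} variance={variance:.5f}")
```

The same relations (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) apply to every numeric variable in the report.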
Value | Count | Frequency (%) |
7.0 | 6 | 6.0% |
6.1 | 5 | 5.0% |
6.2 | 4 | 4.0% |
8.0 | 4 | 4.0% |
6.8 | 4 | 4.0% |
6.5 | 4 | 4.0% |
7.4 | 3 | 3.0% |
5.9 | 3 | 3.0% |
7.7 | 3 | 3.0% |
7.6 | 3 | 3.0% |
Other values (40) | 61 | 61.0% |
Value | Count | Frequency (%) |
5.5 | 2 | 2.0% |
5.6 | 2 | 2.0% |
5.7 | 1 | 1.0% |
5.8 | 1 | 1.0% |
5.9 | 3 | 3.0% |
6.0 | 2 | 2.0% |
6.1 | 5 | 5.0% |
6.2 | 4 | 4.0% |
6.3 | 2 | 2.0% |
6.4 | 3 | 3.0% |
Value | Count | Frequency (%) |
17.6 | 1 | 1.0% |
13.0 | 1 | 1.0% |
12.8 | 1 | 1.0% |
12.3 | 1 | 1.0% |
12.2 | 1 | 1.0% |
12.1 | 1 | 1.0% |
11.9 | 1 | 1.0% |
11.6 | 1 | 1.0% |
11.3 | 1 | 1.0% |
10.9 | 3 | 3.0% |
A1C_VAL_C
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 9.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 8.02 |
Minimum | 6 |
---|---|
Maximum | 18 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 6 |
---|---|
5-th percentile | 6 |
Q1 | 6.75 |
median | 7 |
Q3 | 9 |
95-th percentile | 12 |
Maximum | 18 |
Range | 12 |
Interquartile range (IQR) | 2.25 |
Descriptive statistics
Standard deviation | 2.1224152 |
---|---|
Coefficient of variation (CV) | 0.2646403 |
Kurtosis | 4.0078624 |
Mean | 8.02 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 1.6551006 |
Sum | 802 |
Variance | 4.5046465 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
7 | 28 | 28.0% |
6 | 25 | 25.0% |
8 | 17 | 17.0% |
9 | 10 | 10.0% |
10 | 6 | 6.0% |
11 | 6 | 6.0% |
12 | 5 | 5.0% |
13 | 2 | 2.0% |
18 | 1 | 1.0% |
Value | Count | Frequency (%) |
6 | 25 | 25.0% |
7 | 28 | 28.0% |
8 | 17 | 17.0% |
9 | 10 | 10.0% |
10 | 6 | 6.0% |
11 | 6 | 6.0% |
12 | 5 | 5.0% |
13 | 2 | 2.0% |
18 | 1 | 1.0% |
Value | Count | Frequency (%) |
18 | 1 | 1.0% |
13 | 2 | 2.0% |
12 | 5 | 5.0% |
11 | 6 | 6.0% |
10 | 6 | 6.0% |
9 | 10 | 10.0% |
8 | 17 | 17.0% |
7 | 28 | 28.0% |
6 | 25 | 25.0% |
TC/TG/HDL/LDL_DATE
Date
Distinct | 64 |
---|---|
Distinct (%) | 64.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2009-06-01 00:00:00 |
---|---|
Maximum | 2019-05-01 00:00:00 |
TC_VAL
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 75 |
---|---|
Distinct (%) | 75.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 173.99 |
Minimum | 74 |
---|---|
Maximum | 286 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 74 |
---|---|
5-th percentile | 101.9 |
Q1 | 141.75 |
median | 166.5 |
Q3 | 204 |
95-th percentile | 247.15 |
Maximum | 286 |
Range | 212 |
Interquartile range (IQR) | 62.25 |
Descriptive statistics
Standard deviation | 45.512457 |
---|---|
Coefficient of variation (CV) | 0.26158088 |
Kurtosis | -0.50474156 |
Mean | 173.99 |
Median Absolute Deviation (MAD) | 31.5 |
Skewness | 0.17659274 |
Sum | 17399 |
Variance | 2071.3837 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
164 | 4 | 4.0% |
225 | 3 | 3.0% |
151 | 3 | 3.0% |
169 | 2 | 2.0% |
133 | 2 | 2.0% |
214 | 2 | 2.0% |
165 | 2 | 2.0% |
156 | 2 | 2.0% |
186 | 2 | 2.0% |
174 | 2 | 2.0% |
Other values (65) | 76 | 76.0% |
Value | Count | Frequency (%) |
74 | 1 | 1.0% |
90 | 2 | 2.0% |
97 | 1 | 1.0% |
100 | 1 | 1.0% |
102 | 1 | 1.0% |
108 | 1 | 1.0% |
110 | 1 | 1.0% |
111 | 1 | 1.0% |
115 | 1 | 1.0% |
116 | 1 | 1.0% |
Value | Count | Frequency (%) |
286 | 1 | 1.0% |
270 | 1 | 1.0% |
263 | 1 | 1.0% |
257 | 1 | 1.0% |
250 | 1 | 1.0% |
247 | 1 | 1.0% |
244 | 1 | 1.0% |
243 | 2 | 2.0% |
242 | 1 | 1.0% |
238 | 1 | 1.0% |
TG_VAL
Real number (ℝ)
Distinct | 83 |
---|---|
Distinct (%) | 83.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 173.39 |
Minimum | 34 |
---|---|
Maximum | 813 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 34 |
---|---|
5-th percentile | 51.7 |
Q1 | 85.5 |
median | 134 |
Q3 | 206.5 |
95-th percentile | 495.2 |
Maximum | 813 |
Range | 779 |
Interquartile range (IQR) | 121 |
Descriptive statistics
Standard deviation | 144.3631 |
---|---|
Coefficient of variation (CV) | 0.83259185 |
Kurtosis | 5.7712173 |
Mean | 173.39 |
Median Absolute Deviation (MAD) | 58.5 |
Skewness | 2.2561405 |
Sum | 17339 |
Variance | 20840.705 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
95 | 3 | 3.0% |
98 | 3 | 3.0% |
97 | 3 | 3.0% |
52 | 2 | 2.0% |
102 | 2 | 2.0% |
77 | 2 | 2.0% |
34 | 2 | 2.0% |
58 | 2 | 2.0% |
164 | 2 | 2.0% |
283 | 2 | 2.0% |
Other values (73) | 77 | 77.0% |
Value | Count | Frequency (%) |
34 | 2 | 2.0% |
37 | 1 | 1.0% |
43 | 1 | 1.0% |
46 | 1 | 1.0% |
52 | 2 | 2.0% |
53 | 2 | 2.0% |
58 | 2 | 2.0% |
62 | 1 | 1.0% |
64 | 1 | 1.0% |
65 | 1 | 1.0% |
Value | Count | Frequency (%) |
813 | 1 | 1.0% |
712 | 1 | 1.0% |
620 | 1 | 1.0% |
555 | 1 | 1.0% |
518 | 1 | 1.0% |
494 | 1 | 1.0% |
457 | 1 | 1.0% |
451 | 1 | 1.0% |
389 | 1 | 1.0% |
328 | 1 | 1.0% |
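TG_VAL is strongly right-skewed (skewness 2.26, maximum 813 against a median of 134). One common way to see which values drive that tail is Tukey's 1.5 × IQR rule. The report itself does not apply this rule; the sketch below is an illustration that uses only the quartiles and the ten largest values listed above, and the names `upper_fence` and `flagged` are hypothetical.

```python
# Quartiles copied from the TG_VAL quantile statistics.
q1, q3 = 85.5, 206.5
iqr = q3 - q1  # 121, as reported

# Tukey fences: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Ten largest TG_VAL values from the report (each occurs once).
largest = [813, 712, 620, 555, 518, 494, 457, 451, 389, 328]
flagged = [v for v in largest if v > upper_fence]

print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"values above the upper fence: {flagged}")
```

Since the smallest of the ten listed values (328) already falls below the upper fence, no other observation can exceed it, so the flagged list covers the whole dataset. The lower fence is negative, so no low values are flagged.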
HDL_VAL
Real number (ℝ)
Distinct | 42 |
---|---|
Distinct (%) | 42.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 45.83 |
Minimum | 23 |
---|---|
Maximum | 89 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 23 |
---|---|
5-th percentile | 26.95 |
Q1 | 37 |
median | 43 |
Q3 | 52 |
95-th percentile | 68 |
Maximum | 89 |
Range | 66 |
Interquartile range (IQR) | 15 |
Descriptive statistics
Standard deviation | 13.378752 |
---|---|
Coefficient of variation (CV) | 0.29192128 |
Kurtosis | 1.2671043 |
Mean | 45.83 |
Median Absolute Deviation (MAD) | 8 |
Skewness | 0.97429211 |
Sum | 4583 |
Variance | 178.99101 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
41 | 5 | 5.0% |
33 | 5 | 5.0% |
42 | 5 | 5.0% |
40 | 5 | 5.0% |
43 | 4 | 4.0% |
47 | 4 | 4.0% |
32 | 4 | 4.0% |
60 | 4 | 4.0% |
45 | 4 | 4.0% |
37 | 4 | 4.0% |
Other values (32) | 56 | 56.0% |
Value | Count | Frequency (%) |
23 | 1 | 1.0% |
24 | 1 | 1.0% |
25 | 1 | 1.0% |
26 | 2 | 2.0% |
27 | 1 | 1.0% |
30 | 2 | 2.0% |
31 | 1 | 1.0% |
32 | 4 | 4.0% |
33 | 5 | 5.0% |
34 | 3 | 3.0% |
Value | Count | Frequency (%) |
89 | 1 | 1.0% |
86 | 2 | 2.0% |
79 | 1 | 1.0% |
68 | 2 | 2.0% |
67 | 1 | 1.0% |
64 | 2 | 2.0% |
63 | 1 | 1.0% |
61 | 2 | 2.0% |
60 | 4 | 4.0% |
59 | 2 | 2.0% |
LDL_VAL
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 72 |
---|---|
Distinct (%) | 72.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 97 |
Minimum | 25 |
---|---|
Maximum | 189 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 25 |
---|---|
5-th percentile | 45.65 |
Q1 | 66.75 |
median | 89 |
Q3 | 123.5 |
95-th percentile | 165.1 |
Maximum | 189 |
Range | 164 |
Interquartile range (IQR) | 56.75 |
Descriptive statistics
Standard deviation | 38.654648 |
---|---|
Coefficient of variation (CV) | 0.39850153 |
Kurtosis | -0.71311439 |
Mean | 97 |
Median Absolute Deviation (MAD) | 30 |
Skewness | 0.35482674 |
Sum | 9700 |
Variance | 1494.1818 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
123 | 3 | 3.0% |
97 | 3 | 3.0% |
89 | 3 | 3.0% |
85 | 3 | 3.0% |
78 | 3 | 3.0% |
46 | 3 | 3.0% |
71 | 3 | 3.0% |
100 | 2 | 2.0% |
60 | 2 | 2.0% |
81 | 2 | 2.0% |
Other values (62) | 73 | 73.0% |
Value | Count | Frequency (%) |
25 | 1 | 1.0% |
28 | 1 | 1.0% |
37 | 1 | 1.0% |
39 | 2 | 2.0% |
46 | 3 | 3.0% |
47 | 1 | 1.0% |
48 | 1 | 1.0% |
51 | 1 | 1.0% |
52 | 1 | 1.0% |
54 | 1 | 1.0% |
Value | Count | Frequency (%) |
189 | 1 | 1.0% |
178 | 1 | 1.0% |
176 | 1 | 1.0% |
170 | 1 | 1.0% |
167 | 1 | 1.0% |
165 | 1 | 1.0% |
163 | 1 | 1.0% |
157 | 1 | 1.0% |
153 | 1 | 1.0% |
152 | 2 | 2.0% |
Variable | RID | A1C_DATE | A1C_VAL | A1C_VAL_C | TC/TG/HDL/LDL_DATE | TC_VAL | TG_VAL | HDL_VAL | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
RID | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
A1C_DATE | 1.000 | 1.000 | 0.779 | 0.813 | 0.999 | 0.756 | 0.706 | 0.000 | 0.554 |
A1C_VAL | 1.000 | 0.779 | 1.000 | 0.928 | 0.890 | 0.115 | 0.378 | 0.336 | 0.000 |
A1C_VAL_C | 1.000 | 0.813 | 0.928 | 1.000 | 0.900 | 0.000 | 0.406 | 0.362 | 0.192 |
TC/TG/HDL/LDL_DATE | 1.000 | 0.999 | 0.890 | 0.900 | 1.000 | 0.516 | 0.000 | 0.000 | 0.000 |
TC_VAL | 1.000 | 0.756 | 0.115 | 0.000 | 0.516 | 1.000 | 0.170 | 0.336 | 0.888 |
TG_VAL | 1.000 | 0.706 | 0.378 | 0.406 | 0.000 | 0.170 | 1.000 | 0.000 | 0.000 |
HDL_VAL | 1.000 | 0.000 | 0.336 | 0.362 | 0.000 | 0.336 | 0.000 | 1.000 | 0.058 |
LDL_VAL | 1.000 | 0.554 | 0.000 | 0.192 | 0.000 | 0.888 | 0.000 | 0.058 | 1.000 |
Variable | A1C_VAL | A1C_VAL_C | TC_VAL | TG_VAL | HDL_VAL | LDL_VAL |
---|---|---|---|---|---|---|
A1C_VAL | 1.000 | 0.978 | 0.202 | 0.217 | -0.110 | 0.213 |
A1C_VAL_C | 0.978 | 1.000 | 0.200 | 0.236 | -0.123 | 0.219 |
TC_VAL | 0.202 | 0.200 | 1.000 | 0.346 | 0.195 | 0.863 |
TG_VAL | 0.217 | 0.236 | 0.346 | 1.000 | -0.392 | 0.156 |
HDL_VAL | -0.110 | -0.123 | 0.195 | -0.392 | 1.000 | 0.071 |
LDL_VAL | 0.213 | 0.219 | 0.863 | 0.156 | 0.071 | 1.000 |
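The strongest off-diagonal relationship among the lipid values is between TC_VAL and LDL_VAL (0.863 in the matrix above). The report does not say which coefficient this matrix uses, so the sketch below assumes a Pearson correlation purely for illustration and computes it on the 20 sample rows printed in this report (rows 0 to 9 and 90 to 99). With a fifth of the data, the result should not be expected to match 0.863 exactly; `pearson` is a helper written for this example.

```python
import math

# (TC_VAL, LDL_VAL) values from the 20 sample rows shown in this report.
tc = [126, 144, 193, 141, 143, 199, 197, 164, 164, 118,
      183, 165, 257, 232, 159, 74, 225, 110, 145, 198]
ldl = [64, 82, 129, 56, 56, 123, 131, 97, 89, 47,
       108, 102, 176, 122, 79, 37, 150, 66, 57, 134]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(tc, ldl)
print(f"Pearson r (TC_VAL vs LDL_VAL, 20 sample rows): {r:.3f}")
```

A strongly positive value on the sample is consistent with the HIGH CORRELATION alerts raised for TC_VAL and LDL_VAL.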
Index | RID | A1C_DATE | A1C_VAL | A1C_VAL_C | TC/TG/HDL/LDL_DATE | TC_VAL | TG_VAL | HDL_VAL | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
0 | R0000001 | 2009-09 | 6.4 | 6 | 2009-09 | 126 | 43 | 50 | 64 |
1 | R0000002 | 2011-11 | 10.1 | 10 | 2011-12 | 144 | 111 | 41 | 82 |
2 | R0000003 | 2009-12 | 6.6 | 7 | 2009-12 | 193 | 95 | 48 | 129 |
3 | R0000004 | 2017-06 | 12.1 | 12 | 2017-06 | 141 | 275 | 38 | 56 |
4 | R0000005 | 2009-07 | 6.8 | 7 | 2009-07 | 143 | 87 | 68 | 56 |
5 | R0000006 | 2015-07 | 12.8 | 13 | 2015-07 | 199 | 91 | 52 | 123 |
6 | R0000007 | 2017-08 | 8.9 | 9 | 2017-08 | 197 | 154 | 40 | 131 |
7 | R0000008 | 2015-12 | 9.8 | 10 | 2015-12 | 164 | 208 | 37 | 97 |
8 | R0000009 | 2010-08 | 10.9 | 11 | 2010-09 | 164 | 53 | 59 | 89 |
9 | R0000010 | 2015-12 | 7.5 | 8 | 2015-11 | 118 | 112 | 51 | 47 |
Index | RID | A1C_DATE | A1C_VAL | A1C_VAL_C | TC/TG/HDL/LDL_DATE | TC_VAL | TG_VAL | HDL_VAL | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
90 | R0000091 | 2016-04 | 8.0 | 8 | 2016-04 | 183 | 197 | 43 | 108 |
91 | R0000092 | 2009-09 | 7.0 | 7 | 2009-09 | 165 | 34 | 52 | 102 |
92 | R0000093 | 2015-11 | 10.7 | 11 | 2015-11 | 257 | 100 | 58 | 176 |
93 | R0000094 | 2013-06 | 17.6 | 18 | 2013-06 | 232 | 97 | 86 | 122 |
94 | R0000095 | 2019-02 | 9.7 | 10 | 2019-02 | 159 | 109 | 60 | 79 |
95 | R0000096 | 2016-12 | 7.0 | 7 | 2016-12 | 74 | 66 | 24 | 37 |
96 | R0000097 | 2014-12 | 9.0 | 9 | 2014-11 | 225 | 283 | 36 | 150 |
97 | R0000098 | 2016-02 | 5.9 | 6 | 2016-02 | 110 | 52 | 33 | 66 |
98 | R0000099 | 2015-10 | 9.4 | 9 | 2015-10 | 145 | 138 | 64 | 57 |
99 | R0000100 | 2013-10 | 7.0 | 7 | 2013-10 | 198 | 102 | 48 | 134 |
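In the rows above, A1C_VAL_C looks like A1C_VAL rounded to the nearest integer, which would also explain the 0.978 correlation between the two columns. The report does not document how A1C_VAL_C was derived, so this is an inference. The sketch below tests it on the 20 sample rows, assuming ties round up (7.5 maps to 8 in row 9); `round_half_up` is a helper written for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# (A1C_VAL, A1C_VAL_C) pairs from the 20 sample rows shown above.
pairs = [
    ("6.4", 6), ("10.1", 10), ("6.6", 7), ("12.1", 12), ("6.8", 7),
    ("12.8", 13), ("8.9", 9), ("9.8", 10), ("10.9", 11), ("7.5", 8),
    ("8.0", 8), ("7.0", 7), ("10.7", 11), ("17.6", 18), ("9.7", 10),
    ("7.0", 7), ("9.0", 9), ("5.9", 6), ("9.4", 9), ("7.0", 7),
]


def round_half_up(value: str) -> int:
    """Round a decimal string to the nearest integer, with ties going up."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Collect any rows where the rounded A1C_VAL differs from A1C_VAL_C.
mismatches = [(v, c) for v, c in pairs if round_half_up(v) != c]
print(f"rows checked: {len(pairs)}, mismatches: {mismatches}")
```

The frequency tables point the same way: the A1C_VAL counts for 5.5 through 6.4 add up to 25, which equals the count of 6 in A1C_VAL_C.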