Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 7.6 KiB |
Average record size in memory | 77.3 B |
Variable types
Numeric | 4 |
---|---|
DateTime | 4 |
Text | 1 |
Dataset
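Description | Contains laboratory data from blood tests performed on patients with alcohol use disorder, which can be used to evaluate associations with diabetes and hyperlipidemia. Time-point data were generated by using the specimen collection date and the receipt date to calculate the period elapsed since the time of the order (prescription). Test items include HbA1c, Glucose, HDL Cholesterol, and LDL Cholesterol. - HbA1c (glycated hemoglobin): hemoglobin inside red blood cells to which some glucose has bound. An ordinary blood glucose test shows only the glucose level at the time of testing, whereas HbA1c indicates the average blood glucose over three months. - LDL (Low Density Lipoprotein) Cholesterol: low-density lipoprotein cholesterol, also called "bad" cholesterol. It makes up most of the body's cholesterol, and high levels raise the risk of heart disease and stroke. - HDL (High Density Lipoprotein) Cholesterol: high-density lipoprotein cholesterol, also called "good" cholesterol, which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke. |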
---|---|
Author | The Catholic University of Korea, Eunpyeong St. Mary's Hospital (가톨릭대학교 은평성모병원) |
URL | http://cmcdata.net/data/dataset/coexistence-disease-analysis-blood-test-data-alcohol-use-disorder-eunpyeong |
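The description mentions time-point data derived from test dates as elapsed periods. The following is a minimal sketch of that kind of calculation, not the hospital's actual procedure. It uses two dates from the sample row with RID 4 in this report. The order date is not among the profiled columns, so treating the `A1C_DCT` date as the reference point is an assumption made only for illustration, and `days_between` is a hypothetical helper name.

```python
from datetime import date


def days_between(reference: date, event: date) -> int:
    """Signed number of days from `reference` to `event`."""
    return (event - reference).days


# Dates copied from the sample row with RID 4 in this report.
a1c_date = date(2018, 8, 17)  # A1C_DCT value
glc_date = date(2018, 9, 4)   # GLC_DCT value

# Assumption: A1C_DCT stands in for the reference (order) date.
gap = days_between(a1c_date, glc_date)
print(gap)  # 18
```

A negative result means the event date precedes the reference date, which does occur between columns in the sample rows (for example RID 7).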
Reproduction
Analysis started | 2023-10-08 18:56:49.224184 |
---|---|
Analysis finished | 2023-10-08 18:56:54.568114 |
Duration | 5.34 seconds |
Software version | ydata-profiling v4.5.1 |
RID
Real number (ℝ)
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 50.5 |
Minimum | 1 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 5.95 |
Q1 | 25.75 |
median | 50.5 |
Q3 | 75.25 |
95-th percentile | 95.05 |
Maximum | 100 |
Range | 99 |
Interquartile range (IQR) | 49.5 |
Descriptive statistics
Standard deviation | 29.011492 |
---|---|
Coefficient of variation (CV) | 0.57448499 |
Kurtosis | -1.2 |
Mean | 50.5 |
Median Absolute Deviation (MAD) | 25 |
Skewness | 0 |
Sum | 5050 |
Variance | 841.66667 |
Monotonicity | Strictly increasing |
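The RID figures above are consistent with the column holding the integers 1 through 100. As a rough cross-check, the sketch below recomputes several of them with the Python standard library. It assumes the sample variance (dividing by n - 1) and linearly interpolated percentiles; `percentile` is a helper written for this example, not a ydata-profiling function.

```python
import statistics


def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbours (q in [0, 1])."""
    pos = (len(sorted_values) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


# Assumption: RID is exactly 1..100 (100 distinct values, min 1, max 100, sum 5050).
rid = list(range(1, 101))

mean = statistics.mean(rid)
variance = statistics.variance(rid)  # sample variance, divides by n - 1
std = statistics.stdev(rid)
cv = std / mean                      # coefficient of variation
median = statistics.median(rid)
mad = statistics.median(abs(x - median) for x in rid)  # median absolute deviation
p5 = percentile(rid, 0.05)
q1, q3 = percentile(rid, 0.25), percentile(rid, 0.75)

print(mean, round(variance, 5), round(std, 6), round(cv, 8))
print(round(p5, 2), q1, q3, q3 - q1, mad)
```

Under these assumptions the results agree with the table: mean 50.5, variance about 841.66667, standard deviation about 29.011492, CV about 0.57448499, 5th percentile 5.95, Q1 25.75, Q3 75.25, IQR 49.5, and MAD 25.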
Common values
Value | Count | Frequency (%) |
1 | 1 | 1.0% |
65 | 1 | 1.0% |
75 | 1 | 1.0% |
74 | 1 | 1.0% |
73 | 1 | 1.0% |
72 | 1 | 1.0% |
71 | 1 | 1.0% |
70 | 1 | 1.0% |
69 | 1 | 1.0% |
68 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | 1.0% |
2 | 1 | 1.0% |
3 | 1 | 1.0% |
4 | 1 | 1.0% |
5 | 1 | 1.0% |
6 | 1 | 1.0% |
7 | 1 | 1.0% |
8 | 1 | 1.0% |
9 | 1 | 1.0% |
10 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
100 | 1 | 1.0% |
99 | 1 | 1.0% |
98 | 1 | 1.0% |
97 | 1 | 1.0% |
96 | 1 | 1.0% |
95 | 1 | 1.0% |
94 | 1 | 1.0% |
93 | 1 | 1.0% |
92 | 1 | 1.0% |
91 | 1 | 1.0% |
A1C_DCT
Date
Distinct | 90 |
---|---|
Distinct (%) | 90.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2015-11-17 00:00:00 |
---|---|
Maximum | 2020-04-07 00:00:00 |
A1C_SRC
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 48 |
---|---|
Distinct (%) | 48.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.58 |
Minimum | 4 |
---|---|
Maximum | 14.4 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 4 |
---|---|
5-th percentile | 4.595 |
Q1 | 5.3 |
median | 5.9 |
Q3 | 7.425 |
95-th percentile | 10.01 |
Maximum | 14.4 |
Range | 10.4 |
Interquartile range (IQR) | 2.125 |
Descriptive statistics
Standard deviation | 1.8508529 |
---|---|
Coefficient of variation (CV) | 0.28128464 |
Kurtosis | 2.4682709 |
Mean | 6.58 |
Median Absolute Deviation (MAD) | 0.8 |
Skewness | 1.4458676 |
Sum | 658 |
Variance | 3.4256566 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
5.3 | 9 | 9.0% |
5.7 | 5 | 5.0% |
6.0 | 4 | 4.0% |
5.2 | 4 | 4.0% |
6.3 | 4 | 4.0% |
5.6 | 4 | 4.0% |
5.0 | 4 | 4.0% |
5.4 | 3 | 3.0% |
7.1 | 3 | 3.0% |
5.5 | 3 | 3.0% |
Other values (38) | 57 | 57.0% |
Minimum 10 values
Value | Count | Frequency (%) |
4.0 | 1 | 1.0% |
4.3 | 1 | 1.0% |
4.4 | 2 | 2.0% |
4.5 | 1 | 1.0% |
4.6 | 2 | 2.0% |
4.8 | 1 | 1.0% |
4.9 | 2 | 2.0% |
5.0 | 4 | 4.0% |
5.1 | 3 | 3.0% |
5.2 | 4 | 4.0% |
Maximum 10 values
Value | Count | Frequency (%) |
14.4 | 1 | 1.0% |
11.4 | 1 | 1.0% |
10.9 | 1 | 1.0% |
10.4 | 1 | 1.0% |
10.2 | 1 | 1.0% |
10.0 | 1 | 1.0% |
9.7 | 1 | 1.0% |
9.6 | 3 | 3.0% |
9.4 | 2 | 2.0% |
9.3 | 1 | 1.0% |
GLC_DCT
Date
Distinct | 94 |
---|---|
Distinct (%) | 94.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2015-11-18 00:00:00 |
---|---|
Maximum | 2020-04-07 00:00:00 |
GLC_SRC
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 71 |
---|---|
Distinct (%) | 71.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 160.48 |
Minimum | 79 |
---|---|
Maximum | 672 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 79 |
---|---|
5-th percentile | 93.85 |
Q1 | 115.75 |
median | 134 |
Q3 | 181 |
95-th percentile | 269.65 |
Maximum | 672 |
Range | 593 |
Interquartile range (IQR) | 65.25 |
Descriptive statistics
Standard deviation | 85.252978 |
---|---|
Coefficient of variation (CV) | 0.5312374 |
Kurtosis | 15.104049 |
Mean | 160.48 |
Median Absolute Deviation (MAD) | 23.5 |
Skewness | 3.3394795 |
Sum | 16048 |
Variance | 7268.0703 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
121 | 4 | 4.0% |
102 | 4 | 4.0% |
120 | 3 | 3.0% |
124 | 3 | 3.0% |
201 | 3 | 3.0% |
126 | 3 | 3.0% |
159 | 3 | 3.0% |
135 | 2 | 2.0% |
181 | 2 | 2.0% |
125 | 2 | 2.0% |
Other values (61) | 71 | 71.0% |
Minimum 10 values
Value | Count | Frequency (%) |
79 | 1 | 1.0% |
84 | 1 | 1.0% |
88 | 2 | 2.0% |
91 | 1 | 1.0% |
94 | 1 | 1.0% |
95 | 1 | 1.0% |
98 | 2 | 2.0% |
101 | 1 | 1.0% |
102 | 4 | 4.0% |
105 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
672 | 1 | 1.0% |
505 | 1 | 1.0% |
426 | 1 | 1.0% |
313 | 1 | 1.0% |
301 | 1 | 1.0% |
268 | 1 | 1.0% |
265 | 1 | 1.0% |
260 | 1 | 1.0% |
255 | 1 | 1.0% |
253 | 1 | 1.0% |
HDL_DCT
Date
Distinct | 90 |
---|---|
Distinct (%) | 90.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2015-11-18 00:00:00 |
---|---|
Maximum | 2020-04-07 00:00:00 |
HDL_SRC
Text
Distinct | 54 |
---|---|
Distinct (%) | 54.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
42 | 5 | 5.0% |
47 | 5 | 5.0% |
46 | 4 | 4.0% |
34 | 4 | 4.0% |
37 | 4 | 4.0% |
40 | 4 | 4.0% |
39 | 3 | 3.0% |
38 | 3 | 3.0% |
71 | 3 | 3.0% |
64 | 3 | 3.0% |
Other values (45) | 63 |
Most occurring characters
Value | Count | Frequency (%) |
4 | 41 | 20.6% |
3 | 25 | 12.6% |
6 | 24 | 12.1% |
7 | 22 | 11.1% |
5 | 18 | 9.0% |
1 | 18 | 9.0% |
2 | 15 | 7.5% |
9 | 14 | 7.0% |
0 | 10 | 5.0% |
8 | 10 | 5.0% |
Other values (2) | 2 | 1.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 197 | 99.0% |
Math Symbol | 1 | 0.5% |
Space Separator | 1 | 0.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
4 | 41 | 20.8% |
3 | 25 | 12.7% |
6 | 24 | 12.2% |
7 | 22 | 11.2% |
5 | 18 | 9.1% |
1 | 18 | 9.1% |
2 | 15 | 7.6% |
9 | 14 | 7.1% |
0 | 10 | 5.1% |
8 | 10 | 5.1% |
Math Symbol
Value | Count | Frequency (%) |
< | 1 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 199 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
4 | 41 | 20.6% |
3 | 25 | 12.6% |
6 | 24 | 12.1% |
7 | 22 | 11.1% |
5 | 18 | 9.0% |
1 | 18 | 9.0% |
2 | 15 | 7.5% |
9 | 14 | 7.0% |
0 | 10 | 5.0% |
8 | 10 | 5.0% |
Other values (2) | 2 | 1.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 199 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
4 | 41 | 20.6% |
3 | 25 | 12.6% |
6 | 24 | 12.1% |
7 | 22 | 11.1% |
5 | 18 | 9.0% |
1 | 18 | 9.0% |
2 | 15 | 7.5% |
9 | 14 | 7.0% |
0 | 10 | 5.0% |
8 | 10 | 5.0% |
Other values (2) | 2 | 1.0% |
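HDL_SRC is profiled as Text rather than as a number, and the character breakdown shows a single `<` and a single space among 197 digits. One plausible reading is that one result was recorded as a censored value such as "below some limit"; the report does not show the exact string. The sketch below is one way such a column could be coerced to numbers while keeping the qualifier. The function name `parse_lab_value` and the example string `"< 5"` are hypothetical.

```python
import re

# Optional comparison sign, optional spaces, then a non-negative number.
_PATTERN = re.compile(r"^\s*([<>]=?)?\s*(\d+(?:\.\d+)?)\s*$")


def parse_lab_value(raw):
    """Return (value, qualifier) for strings such as '42' or '< 5'.

    qualifier is '' for a plain number or the comparison sign for a
    censored result. Returns (None, None) when the string does not match.
    """
    match = _PATTERN.match(raw)
    if match is None:
        return None, None
    qualifier, number = match.groups()
    return float(number), qualifier or ""


print(parse_lab_value("42"))   # (42.0, '')
print(parse_lab_value("< 5"))  # (5.0, '<'); "< 5" is a made-up example
```

Keeping the qualifier separate avoids silently treating a censored result as an exact measurement when statistics are computed later.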
LDL_DCT
Date
Distinct | 90 |
---|---|
Distinct (%) | 90.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2015-11-18 00:00:00 |
---|---|
Maximum | 2020-04-07 00:00:00 |
LDL_SRC
Real number (ℝ)
Distinct | 74 |
---|---|
Distinct (%) | 74.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 85.42 |
Minimum | 11 |
---|---|
Maximum | 211 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 11 |
---|---|
5-th percentile | 27.9 |
Q1 | 51.75 |
median | 82.5 |
Q3 | 112.75 |
95-th percentile | 165 |
Maximum | 211 |
Range | 200 |
Interquartile range (IQR) | 61 |
Descriptive statistics
Standard deviation | 41.923735 |
---|---|
Coefficient of variation (CV) | 0.49079531 |
Kurtosis | -0.15173594 |
Mean | 85.42 |
Median Absolute Deviation (MAD) | 31 |
Skewness | 0.56727878 |
Sum | 8542 |
Variance | 1757.5996 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
50 | 3 | 3.0% |
102 | 3 | 3.0% |
44 | 3 | 3.0% |
51 | 3 | 3.0% |
122 | 3 | 3.0% |
55 | 2 | 2.0% |
52 | 2 | 2.0% |
107 | 2 | 2.0% |
115 | 2 | 2.0% |
112 | 2 | 2.0% |
Other values (64) | 75 | 75.0% |
Minimum 10 values
Value | Count | Frequency (%) |
11 | 1 | 1.0% |
16 | 1 | 1.0% |
20 | 1 | 1.0% |
25 | 1 | 1.0% |
26 | 1 | 1.0% |
28 | 1 | 1.0% |
30 | 1 | 1.0% |
32 | 1 | 1.0% |
38 | 1 | 1.0% |
39 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
211 | 1 | 1.0% |
182 | 1 | 1.0% |
172 | 1 | 1.0% |
171 | 1 | 1.0% |
165 | 2 | 2.0% |
161 | 1 | 1.0% |
157 | 1 | 1.0% |
155 | 1 | 1.0% |
141 | 1 | 1.0% |
137 | 1 | 1.0% |
Correlations
Variable | RID | A1C_DCT | A1C_SRC | GLC_DCT | GLC_SRC | HDL_DCT | HDL_SRC | LDL_DCT | LDL_SRC |
---|---|---|---|---|---|---|---|---|---|
RID | 1.000 | 0.879 | 0.102 | 0.557 | 0.260 | 0.834 | 0.000 | 0.834 | 0.101 |
A1C_DCT | 0.879 | 1.000 | 0.946 | 0.998 | 0.948 | 0.999 | 0.000 | 0.999 | 0.000 |
A1C_SRC | 0.102 | 0.946 | 1.000 | 0.970 | 0.621 | 0.959 | 0.000 | 0.959 | 0.000 |
GLC_DCT | 0.557 | 0.998 | 0.970 | 1.000 | 0.992 | 0.999 | 0.878 | 0.999 | 0.932 |
GLC_SRC | 0.260 | 0.948 | 0.621 | 0.992 | 1.000 | 0.988 | 0.499 | 0.988 | 0.104 |
HDL_DCT | 0.834 | 0.999 | 0.959 | 0.999 | 0.988 | 1.000 | 0.000 | 1.000 | 0.000 |
HDL_SRC | 0.000 | 0.000 | 0.000 | 0.878 | 0.499 | 0.000 | 1.000 | 0.000 | 0.704 |
LDL_DCT | 0.834 | 0.999 | 0.959 | 0.999 | 0.988 | 1.000 | 0.000 | 1.000 | 0.000 |
LDL_SRC | 0.101 | 0.000 | 0.000 | 0.932 | 0.104 | 0.000 | 0.704 | 0.000 | 1.000 |
Variable | RID | A1C_SRC | GLC_SRC | LDL_SRC |
---|---|---|---|---|
RID | 1.000 | -0.028 | -0.027 | 0.115 |
A1C_SRC | -0.028 | 1.000 | 0.652 | -0.029 |
GLC_SRC | -0.027 | 0.652 | 1.000 | -0.027 |
LDL_SRC | 0.115 | -0.029 | -0.027 | 1.000 |
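The report does not state which coefficient the numeric-only matrix uses. As an illustration of how one such pairwise value could be computed, the sketch below implements the Pearson correlation coefficient and applies it to the A1C_SRC and GLC_SRC values of RID 1 to 10, copied from the sample rows in this report. With only ten of the 100 rows, the result is not expected to match the 0.652 shown in the matrix; `pearson` is a helper written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# A1C_SRC and GLC_SRC for RID 1..10, copied from the first sample rows.
a1c = [8.2, 7.1, 10.4, 14.4, 5.2, 4.9, 5.7, 6.8, 6.3, 5.7]
glc = [246, 149, 217, 121, 129, 121, 124, 200, 201, 126]

r = pearson(a1c, glc)
print(round(r, 3))
```

The coefficient is symmetric in its arguments, which is why both matrices above mirror across the diagonal.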
First rows
Index | RID | A1C_DCT | A1C_SRC | GLC_DCT | GLC_SRC | HDL_DCT | HDL_SRC | LDL_DCT | LDL_SRC |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2018-04-10T00:00:00 | 8.2 | 2018-04-09T00:00:00 | 246 | 2018-04-10T00:00:00 | 37 | 2018-04-10T00:00:00 | 25 |
1 | 2 | 2017-04-05T00:00:00 | 7.1 | 2017-04-05T00:00:00 | 149 | 2017-04-05T00:00:00 | 62 | 2017-04-05T00:00:00 | 50 |
2 | 3 | 2017-04-27T00:00:00 | 10.4 | 2017-04-28T00:00:00 | 217 | 2017-04-28T00:00:00 | 52 | 2017-04-28T00:00:00 | 102 |
3 | 4 | 2018-08-17T00:00:00 | 14.4 | 2018-09-04T00:00:00 | 121 | 2018-08-21T00:00:00 | 18 | 2018-08-21T00:00:00 | 75 |
4 | 5 | 2018-06-21T00:00:00 | 5.2 | 2018-06-21T00:00:00 | 129 | 2018-06-21T00:00:00 | 71 | 2018-06-21T00:00:00 | 41 |
5 | 6 | 2019-09-18T00:00:00 | 4.9 | 2019-09-18T00:00:00 | 121 | 2019-09-18T00:00:00 | 99 | 2019-09-18T00:00:00 | 40 |
6 | 7 | 2018-07-04T00:00:00 | 5.7 | 2018-06-26T00:00:00 | 124 | 2018-06-26T00:00:00 | 37 | 2018-06-26T00:00:00 | 50 |
7 | 8 | 2019-07-04T00:00:00 | 6.8 | 2019-07-04T00:00:00 | 200 | 2019-07-04T00:00:00 | 61 | 2019-07-04T00:00:00 | 155 |
8 | 9 | 2019-10-10T00:00:00 | 6.3 | 2019-10-10T00:00:00 | 201 | 2019-10-10T00:00:00 | 46 | 2019-10-10T00:00:00 | 97 |
9 | 10 | 2017-08-25T00:00:00 | 5.7 | 2017-08-25T00:00:00 | 126 | 2017-08-25T00:00:00 | 74 | 2017-08-25T00:00:00 | 54 |
Last rows
Index | RID | A1C_DCT | A1C_SRC | GLC_DCT | GLC_SRC | HDL_DCT | HDL_SRC | LDL_DCT | LDL_SRC |
---|---|---|---|---|---|---|---|---|---|
90 | 91 | 2019-10-29T00:00:00 | 5.0 | 2019-10-15T00:00:00 | 115 | 2019-10-30T00:00:00 | 45 | 2019-10-30T00:00:00 | 65 |
91 | 92 | 2020-01-10T00:00:00 | 9.6 | 2020-01-10T00:00:00 | 225 | 2020-01-10T00:00:00 | 39 | 2020-01-10T00:00:00 | 51 |
92 | 93 | 2018-07-09T00:00:00 | 6.3 | 2018-07-09T00:00:00 | 125 | 2018-07-09T00:00:00 | 34 | 2018-07-09T00:00:00 | 53 |
93 | 94 | 2019-10-14T00:00:00 | 5.3 | 2019-10-14T00:00:00 | 138 | 2019-10-14T00:00:00 | 41 | 2019-10-14T00:00:00 | 124 |
94 | 95 | 2020-01-31T00:00:00 | 5.1 | 2020-01-31T00:00:00 | 102 | 2020-01-31T00:00:00 | 47 | 2020-01-31T00:00:00 | 135 |
95 | 96 | 2019-10-08T00:00:00 | 5.5 | 2019-10-09T00:00:00 | 130 | 2019-10-08T00:00:00 | 39 | 2019-10-08T00:00:00 | 32 |
96 | 97 | 2019-11-09T00:00:00 | 5.3 | 2019-11-09T00:00:00 | 177 | 2019-11-09T00:00:00 | 50 | 2019-11-09T00:00:00 | 136 |
97 | 98 | 2018-04-20T00:00:00 | 10.9 | 2018-04-20T00:00:00 | 505 | 2018-04-21T00:00:00 | 38 | 2018-04-21T00:00:00 | 39 |
98 | 99 | 2019-06-22T00:00:00 | 4.6 | 2019-07-11T00:00:00 | 115 | 2019-07-11T00:00:00 | 40 | 2019-07-11T00:00:00 | 43 |
99 | 100 | 2019-05-16T00:00:00 | 5.3 | 2019-05-16T00:00:00 | 98 | 2019-05-16T00:00:00 | 64 | 2019-05-16T00:00:00 | 122 |