Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 7.6 KiB |
Average record size in memory | 77.3 B |
Variable types
Text | 1 |
---|---|
DateTime | 4 |
Numeric | 4 |
Dataset
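Description | Contains the key blood test data for evaluating the effect of statin drugs, drawn from blood tests performed on patients with hyperlipidemia. Time-point data were generated by using the specimen collection date and the receipt date to calculate the period elapsed since the time of prescription. The test items are lipid panel items such as LDL Cholesterol, HDL Cholesterol, Total Cholesterol, and Triglyceride. - LDL (Low Density Lipoprotein) Cholesterol: low-density lipoprotein cholesterol, also called "bad cholesterol". It accounts for most of the body's cholesterol, and high levels raise the risk of heart disease and stroke. - HDL (High Density Lipoprotein) Cholesterol: high-density lipoprotein cholesterol, also called "good cholesterol", which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke. - Total Cholesterol (TC): all of the cholesterol present in the blood. - Triglyceride (TG): mostly stored in the body's fat tissue, with only a very small portion present in the blood. Hypertriglyceridemia is also a form of hyperlipidemia; triglycerides accumulate in the skin, internal organs, and blood vessels, causing obesity and various diseases. |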
---|---|
Author | The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원) |
URL | http://cmcdata.net/data/dataset/main-effect-blood-test-data-dyslipidemia |
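The description says that time-point data were derived by measuring the period from the time of prescription to the specimen date. A minimal sketch of that kind of calculation follows. The prescription date is not part of this dataset, so the date used here and the helper name `days_since_prescription` are hypothetical.

```python
from datetime import date


def days_since_prescription(prescribed_on: date, collected_on: date) -> int:
    """Return the number of days from a prescription date to a collection date.

    Hypothetical helper: the report does not show how the publisher
    actually computed the time-point values.
    """
    return (collected_on - prescribed_on).days


# Assumed prescription date (illustrative only); the collection date
# 2003-04-03 is the earliest TC_DCT value shown in this report.
elapsed = days_since_prescription(date(2003, 3, 1), date(2003, 4, 3))
print(elapsed)
```

A negative result would mean the specimen was collected before the prescription, which may be worth flagging when building such a column.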
Reproduction
Analysis started | 2023-10-08 18:57:27.123224 |
---|---|
Analysis finished | 2023-10-08 18:57:32.693570 |
Duration | 5.57 seconds |
Software version | ydata-profiling v4.5.1 |
RID
Text
UNIQUE
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
r0000022 | 1 | 1.0% |
r0001324 | 1 | 1.0% |
r0001495 | 1 | 1.0% |
r0001445 | 1 | 1.0% |
r0001442 | 1 | 1.0% |
r0001440 | 1 | 1.0% |
r0001433 | 1 | 1.0% |
r0001415 | 1 | 1.0% |
r0001409 | 1 | 1.0% |
r0001407 | 1 | 1.0% |
Other values (90) | 90 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 396 | |
R | 100 | 12.5% |
1 | 65 | 8.1% |
7 | 33 | 4.1% |
3 | 32 | 4.0% |
5 | 32 | 4.0% |
4 | 31 | 3.9% |
8 | 31 | 3.9% |
2 | 29 | 3.6% |
6 | 29 | 3.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 700 | |
Uppercase Letter | 100 | 12.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 396 | |
1 | 65 | 9.3% |
7 | 33 | 4.7% |
3 | 32 | 4.6% |
5 | 32 | 4.6% |
4 | 31 | 4.4% |
8 | 31 | 4.4% |
2 | 29 | 4.1% |
6 | 29 | 4.1% |
9 | 22 | 3.1% |
Uppercase Letter
Value | Count | Frequency (%) |
R | 100 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 700 | |
Latin | 100 | 12.5% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 396 | |
1 | 65 | 9.3% |
7 | 33 | 4.7% |
3 | 32 | 4.6% |
5 | 32 | 4.6% |
4 | 31 | 4.4% |
8 | 31 | 4.4% |
2 | 29 | 4.1% |
6 | 29 | 4.1% |
9 | 22 | 3.1% |
Latin
Value | Count | Frequency (%) |
R | 100 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 800 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 396 | |
R | 100 | 12.5% |
1 | 65 | 8.1% |
7 | 33 | 4.1% |
3 | 32 | 4.0% |
5 | 32 | 4.0% |
4 | 31 | 3.9% |
8 | 31 | 3.9% |
2 | 29 | 3.6% |
6 | 29 | 3.6% |
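The character tables above are plain frequency counts over every character in the RID column. As a rough sketch, the same kind of tally can be made with `collections.Counter`. Only three RID values from the sample rows of this report are used here, so the counts will not match the full-column figures.

```python
from collections import Counter

# Three RID values copied from the sample rows of this report
# (an illustrative subset, not the full 100-row column).
rids = ["R0000022", "R0000025", "R0000033"]

# Count every character across all identifiers.
counts = Counter("".join(rids))
total = sum(counts.values())

# Print value, count, and share of all characters, most frequent first.
for char, count in counts.most_common():
    print(f"{char} | {count} | {100 * count / total:.1f}%")
```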
TC_DCT
Date
Distinct | 49 |
---|---|
Distinct (%) | 49.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2003-04-03 00:00:00 |
---|---|
Maximum | 2006-02-12 00:00:00 |
TC_VAL
Real number (ℝ)
HIGH CORRELATION
Distinct | 80 |
---|---|
Distinct (%) | 80.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 188.09 |
Minimum | 70 |
---|---|
Maximum | 307 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 70 |
---|---|
5-th percentile | 116.9 |
Q1 | 160.75 |
median | 186 |
Q3 | 216 |
95-th percentile | 266.05 |
Maximum | 307 |
Range | 237 |
Interquartile range (IQR) | 55.25 |
Descriptive statistics
Standard deviation | 45.857067 |
---|---|
Coefficient of variation (CV) | 0.24380386 |
Kurtosis | -0.023873567 |
Mean | 188.09 |
Median Absolute Deviation (MAD) | 27.5 |
Skewness | 0.2450285 |
Sum | 18809 |
Variance | 2102.8706 |
Monotonicity | Not monotonic |
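Several of the figures above follow directly from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below checks these relationships using only the TC_VAL values reported in this section; the variable names are illustrative.

```python
import math

# Values reported for TC_VAL in the tables above.
mean, std = 188.09, 45.857067
minimum, q1, q3, maximum = 70, 160.75, 216, 307

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), round(variance, 2))

# Compare against the reported figures, allowing for rounding in the report.
assert math.isclose(cv, 0.24380386, abs_tol=1e-6)
assert math.isclose(variance, 2102.8706, abs_tol=1e-3)
```

The same checks apply to the TG_VAL, HDL_VAL, and LDL_VAL sections with their own reported values.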
Value | Count | Frequency (%) |
196 | 4 | 4.0% |
190 | 3 | 3.0% |
186 | 3 | 3.0% |
180 | 3 | 3.0% |
199 | 2 | 2.0% |
163 | 2 | 2.0% |
203 | 2 | 2.0% |
256 | 2 | 2.0% |
216 | 2 | 2.0% |
167 | 2 | 2.0% |
Other values (70) | 75 |
Value | Count | Frequency (%) |
70 | 1 | |
108 | 1 | |
112 | 1 | |
114 | 1 | |
115 | 1 | |
117 | 1 | |
119 | 1 | |
121 | 1 | |
122 | 1 | |
124 | 1 |
Value | Count | Frequency (%) |
307 | 1 | |
300 | 1 | |
284 | 1 | |
271 | 1 | |
267 | 1 | |
266 | 1 | |
264 | 1 | |
259 | 1 | |
256 | 2 | |
255 | 1 |
TG_DCT
Date
Distinct | 49 |
---|---|
Distinct (%) | 49.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2003-04-03 00:00:00 |
---|---|
Maximum | 2006-02-12 00:00:00 |
TG_VAL
Real number (ℝ)
Distinct | 80 |
---|---|
Distinct (%) | 80.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 158.15 |
Minimum | 36 |
---|---|
Maximum | 676 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 36 |
---|---|
5-th percentile | 64.85 |
Q1 | 96.75 |
median | 131 |
Q3 | 173.25 |
95-th percentile | 310.8 |
Maximum | 676 |
Range | 640 |
Interquartile range (IQR) | 76.5 |
Descriptive statistics
Standard deviation | 108.25898 |
---|---|
Coefficient of variation (CV) | 0.68453357 |
Kurtosis | 9.4778168 |
Mean | 158.15 |
Median Absolute Deviation (MAD) | 38 |
Skewness | 2.7969473 |
Sum | 15815 |
Variance | 11720.008 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
97 | 3 | 3.0% |
100 | 3 | 3.0% |
132 | 3 | 3.0% |
102 | 2 | 2.0% |
120 | 2 | 2.0% |
90 | 2 | 2.0% |
80 | 2 | 2.0% |
161 | 2 | 2.0% |
148 | 2 | 2.0% |
113 | 2 | 2.0% |
Other values (70) | 77 |
Value | Count | Frequency (%) |
36 | 1 | |
46 | 1 | |
56 | 1 | |
61 | 1 | |
62 | 1 | |
65 | 1 | |
68 | 1 | |
71 | 1 | |
75 | 1 | |
80 | 2 |
Value | Count | Frequency (%) |
676 | 1 | |
623 | 1 | |
542 | 1 | |
516 | 1 | |
326 | 1 | |
310 | 1 | |
304 | 1 | |
286 | 1 | |
280 | 1 | |
264 | 1 |
HDL_DCT
Date
Distinct | 49 |
---|---|
Distinct (%) | 49.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2003-04-03 00:00:00 |
---|---|
Maximum | 2006-02-12 00:00:00 |
HDL_VAL
Real number (ℝ)
Distinct | 44 |
---|---|
Distinct (%) | 44.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 46.05 |
Minimum | 15 |
---|---|
Maximum | 113 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 15 |
---|---|
5-th percentile | 28.85 |
Q1 | 34 |
median | 45 |
Q3 | 55 |
95-th percentile | 68.15 |
Maximum | 113 |
Range | 98 |
Interquartile range (IQR) | 21 |
Descriptive statistics
Standard deviation | 14.701319 |
---|---|
Coefficient of variation (CV) | 0.31924689 |
Kurtosis | 4.1727758 |
Mean | 46.05 |
Median Absolute Deviation (MAD) | 10 |
Skewness | 1.3816347 |
Sum | 4605 |
Variance | 216.12879 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
34 | 6 | 6.0% |
45 | 5 | 5.0% |
33 | 4 | 4.0% |
48 | 4 | 4.0% |
44 | 4 | 4.0% |
46 | 4 | 4.0% |
31 | 4 | 4.0% |
56 | 4 | 4.0% |
39 | 4 | 4.0% |
57 | 3 | 3.0% |
Other values (34) | 58 |
Value | Count | Frequency (%) |
15 | 1 | 1.0% |
25 | 1 | 1.0% |
26 | 3 | |
29 | 2 | 2.0% |
30 | 3 | |
31 | 4 | |
32 | 2 | 2.0% |
33 | 4 | |
34 | 6 | |
35 | 1 | 1.0% |
Value | Count | Frequency (%) |
113 | 1 | |
91 | 1 | |
87 | 1 | |
76 | 1 | |
71 | 1 | |
68 | 1 | |
67 | 1 | |
63 | 1 | |
61 | 2 | |
60 | 1 |
LDL_DCT
Date
Distinct | 50 |
---|---|
Distinct (%) | 50.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Minimum | 2003-04-03 00:00:00 |
---|---|
Maximum | 2006-02-14 00:00:00 |
LDL_VAL
Real number (ℝ)
HIGH CORRELATION
Distinct | 63 |
---|---|
Distinct (%) | 63.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 98.13 |
Minimum | 18 |
---|---|
Maximum | 207 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 18 |
---|---|
5-th percentile | 54.65 |
Q1 | 78 |
median | 96 |
Q3 | 118.5 |
95-th percentile | 149 |
Maximum | 207 |
Range | 189 |
Interquartile range (IQR) | 40.5 |
Descriptive statistics
Standard deviation | 31.659698 |
---|---|
Coefficient of variation (CV) | 0.32263016 |
Kurtosis | 0.59942723 |
Mean | 98.13 |
Median Absolute Deviation (MAD) | 20 |
Skewness | 0.325284 |
Sum | 9813 |
Variance | 1002.3365 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
98 | 4 | 4.0% |
83 | 4 | 4.0% |
87 | 4 | 4.0% |
101 | 3 | 3.0% |
112 | 3 | 3.0% |
149 | 3 | 3.0% |
103 | 3 | 3.0% |
94 | 3 | 3.0% |
69 | 2 | 2.0% |
70 | 2 | 2.0% |
Other values (53) | 69 |
Value | Count | Frequency (%) |
18 | 1 | |
32 | 1 | |
42 | 2 | |
48 | 1 | |
55 | 2 | |
57 | 2 | |
58 | 1 | |
60 | 2 | |
61 | 2 | |
62 | 1 |
Value | Count | Frequency (%) |
207 | 1 | 1.0% |
165 | 1 | 1.0% |
154 | 1 | 1.0% |
149 | 3 | |
148 | 1 | 1.0% |
143 | 2 | |
142 | 1 | 1.0% |
140 | 1 | 1.0% |
139 | 1 | 1.0% |
138 | 1 | 1.0% |
Correlations
Variable | RID | TC_DCT | TC_VAL | TG_DCT | TG_VAL | HDL_DCT | HDL_VAL | LDL_DCT | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
RID | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
TC_DCT | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.691 |
TC_VAL | 1.000 | 0.000 | 1.000 | 0.000 | 0.512 | 0.000 | 0.677 | 0.000 | 0.806 |
TG_DCT | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.691 |
TG_VAL | 1.000 | 0.000 | 0.512 | 0.000 | 1.000 | 0.000 | 0.568 | 0.000 | 0.322 |
HDL_DCT | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.691 |
HDL_VAL | 1.000 | 0.000 | 0.677 | 0.000 | 0.568 | 0.000 | 1.000 | 0.354 | 0.510 |
LDL_DCT | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.354 | 1.000 | 0.703 |
LDL_VAL | 1.000 | 0.691 | 0.806 | 0.691 | 0.322 | 0.691 | 0.510 | 0.703 | 1.000 |
Variable | TC_VAL | TG_VAL | HDL_VAL | LDL_VAL |
---|---|---|---|---|
TC_VAL | 1.000 | 0.206 | 0.361 | 0.913 |
TG_VAL | 0.206 | 1.000 | -0.326 | 0.111 |
HDL_VAL | 0.361 | -0.326 | 1.000 | 0.270 |
LDL_VAL | 0.913 | 0.111 | 0.270 | 1.000 |
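The 4×4 matrix above holds signed pairwise coefficients between the four lipid values. The report does not state which coefficient it used, so the following is only a sketch of one common choice, the Pearson coefficient, written out by hand. It uses the TC_VAL and LDL_VAL values from the ten first rows shown in this report, so it is not expected to reproduce the 0.913 computed over all 100 observations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# TC_VAL and LDL_VAL from rows 0-9 of the sample table in this report.
tc = [235, 300, 196, 150, 174, 217, 161, 242, 180, 226]
ldl = [143, 207, 83, 42, 121, 149, 103, 133, 103, 165]

r = pearson(tc, ldl)
print(f"Pearson r (first 10 rows): {r:.3f}")
```

A coefficient is symmetric in its arguments, which is why the matrix above mirrors across its diagonal.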
First rows
Row | RID | TC_DCT | TC_VAL | TG_DCT | TG_VAL | HDL_DCT | HDL_VAL | LDL_DCT | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
0 | R0000022 | 2003-04-03 | 235 | 2003-04-03 | 280 | 2003-04-03 | 33 | 2003-04-03 | 143 |
1 | R0000025 | 2003-05-28 | 300 | 2003-05-28 | 119 | 2003-05-28 | 60 | 2003-05-28 | 207 |
2 | R0000033 | 2004-04-24 | 196 | 2004-04-24 | 97 | 2004-04-24 | 87 | 2004-04-24 | 83 |
3 | R0000035 | 2004-05-28 | 150 | 2004-05-28 | 676 | 2004-05-28 | 26 | 2004-05-28 | 42 |
4 | R0000036 | 2004-08-10 | 174 | 2004-08-10 | 91 | 2004-08-10 | 36 | 2004-08-10 | 121 |
5 | R0000037 | 2004-08-26 | 217 | 2004-08-26 | 192 | 2004-08-26 | 46 | 2004-08-26 | 149 |
6 | R0000039 | 2004-12-06 | 161 | 2004-12-06 | 88 | 2004-12-06 | 31 | 2004-12-06 | 103 |
7 | R0000043 | 2004-12-27 | 242 | 2004-12-27 | 310 | 2004-12-27 | 43 | 2004-12-27 | 133 |
8 | R0000047 | 2005-01-14 | 180 | 2005-01-14 | 110 | 2005-01-14 | 61 | 2005-01-14 | 103 |
9 | R0000058 | 2005-03-12 | 226 | 2005-03-12 | 146 | 2005-03-12 | 30 | 2005-03-12 | 165 |
Last rows
Row | RID | TC_DCT | TC_VAL | TG_DCT | TG_VAL | HDL_DCT | HDL_VAL | LDL_DCT | LDL_VAL |
---|---|---|---|---|---|---|---|---|---|
90 | R0001940 | 2006-01-23 | 165 | 2006-01-23 | 113 | 2006-01-23 | 43 | 2006-01-23 | 88 |
91 | R0001948 | 2006-01-18 | 162 | 2006-01-18 | 56 | 2006-01-18 | 44 | 2006-01-18 | 83 |
92 | R0001966 | 2006-01-05 | 153 | 2006-01-05 | 36 | 2006-01-05 | 48 | 2006-01-05 | 83 |
93 | R0001969 | 2006-01-18 | 237 | 2006-01-18 | 286 | 2006-01-18 | 46 | 2006-01-18 | 118 |
94 | R0001977 | 2006-01-21 | 199 | 2006-01-21 | 130 | 2006-01-21 | 48 | 2006-01-21 | 105 |
95 | R0002001 | 2006-01-18 | 140 | 2006-01-18 | 87 | 2006-01-18 | 57 | 2006-01-18 | 57 |
96 | R0002003 | 2006-01-19 | 212 | 2006-01-19 | 153 | 2006-01-19 | 55 | 2006-01-19 | 103 |
97 | R0002023 | 2006-01-19 | 196 | 2006-01-19 | 213 | 2006-01-19 | 31 | 2006-01-19 | 101 |
98 | R0002052 | 2006-01-05 | 236 | 2006-01-05 | 120 | 2006-01-05 | 53 | 2006-01-05 | 132 |
99 | R0002069 | 2006-01-17 | 249 | 2006-01-17 | 174 | 2006-01-17 | 47 | 2006-01-17 | 140 |