Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B

Variable types

Text: 1
DateTime: 4
Numeric: 4

Dataset

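Description: Contains the main test data, drawn from blood tests performed on patients with hyperlipidemia, that can be used to evaluate the effect of statin drugs. Time-point data were generated by calculating the period elapsed since the time of prescription from the sample collection date and the receipt date. The test items are lipid panel items such as LDL Cholesterol, HDL Cholesterol, Total Cholesterol, and Triglyceride.
- LDL (Low Density Lipoprotein) Cholesterol: low-density lipoprotein cholesterol, also called "bad" cholesterol. It makes up most of the body's cholesterol, and high levels raise the risk of heart disease and stroke.
- HDL (High Density Lipoprotein) Cholesterol: high-density lipoprotein cholesterol, also called "good" cholesterol, which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke.
- Total Cholesterol (TC): all of the cholesterol in the blood.
- Triglyceride (TG): mostly stored in the body's fat tissue, with only a very small portion present in the blood. Hypertriglyceridemia is also a form of hyperlipidemia; triglycerides accumulate in the skin, internal organs, and blood vessels, causing obesity and various diseases.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/main-effect-blood-test-data-dyslipidemia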
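
The description mentions time points computed as the period elapsed since prescription. A minimal sketch of that calculation follows. It assumes the `*_DCT` columns hold test dates; the prescription date and the function name `days_since_prescription` are hypothetical, since the dataset shown here does not include a prescription date.

```python
from datetime import date


def days_since_prescription(prescribed_on: date, tested_on: date) -> int:
    """Return the number of days between prescription and test date."""
    return (tested_on - prescribed_on).days


# Hypothetical prescription date (not part of the dataset).
prescribed_on = date(2003, 1, 3)
# TC_DCT value of the first sample row in the report.
tested_on = date(2003, 4, 3)

elapsed = days_since_prescription(prescribed_on, tested_on)
print(elapsed)
```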

Alerts

TC_VAL is highly overall correlated with LDL_VAL (High correlation)
LDL_VAL is highly overall correlated with TC_VAL (High correlation)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:57:27.123224
Analysis finished: 2023-10-08 18:57:32.693570
Duration: 5.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
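
As a rough illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses the five sample RID values from this report; the helper name `category_counts` is an assumption for illustration and is not how the profiling tool itself is implemented.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample RID values listed in the report.
sample_rids = ["R0000022", "R0000025", "R0000033", "R0000035", "R0000036"]

counts = category_counts(sample_rids)
# "Lu" is Uppercase Letter, "Nd" is Decimal Number.
print(dict(counts))
```

Each RID is one uppercase letter followed by seven digits, which is why the report finds only two distinct categories.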

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000022
2nd row: R0000025
3rd row: R0000033
4th row: R0000035
5th row: R0000036

Value              Count  Frequency (%)
r0000022           1      1.0%
r0001324           1      1.0%
r0001495           1      1.0%
r0001445           1      1.0%
r0001442           1      1.0%
r0001440           1      1.0%
r0001433           1      1.0%
r0001415           1      1.0%
r0001409           1      1.0%
r0001407           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      396    49.5%
R      100    12.5%
1      65     8.1%
7      33     4.1%
3      32     4.0%
5      32     4.0%
4      31     3.9%
8      31     3.9%
2      29     3.6%
6      29     3.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      396    56.6%
1      65     9.3%
7      33     4.7%
3      32     4.6%
5      32     4.6%
4      31     4.4%
8      31     4.4%
2      29     4.1%
6      29     4.1%
9      22     3.1%

Uppercase Letter
Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      396    56.6%
1      65     9.3%
7      33     4.7%
3      32     4.6%
5      32     4.6%
4      31     4.4%
8      31     4.4%
2      29     4.1%
6      29     4.1%
9      22     3.1%

Latin
Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      396    49.5%
R      100    12.5%
1      65     8.1%
7      33     4.1%
3      32     4.0%
5      32     4.0%
4      31     3.9%
8      31     3.9%
2      29     3.6%
6      29     3.6%

TC_DCT
Date

Distinct: 49
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2003-04-03 00:00:00
Maximum: 2006-02-12 00:00:00
Histogram with fixed size bins (bins=49)

TC_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 80
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 188.09
Minimum: 70
Maximum: 307
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 70
5-th percentile: 116.9
Q1: 160.75
Median: 186
Q3: 216
95-th percentile: 266.05
Maximum: 307
Range: 237
Interquartile range (IQR): 55.25
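
The fractional percentiles are consistent with linear interpolation between neighbouring order statistics. That the report computes them this way is an assumption, but the sketch below reproduces the reported 5-th and 95-th percentiles of TC_VAL from the ten smallest and ten largest values listed for this variable. The function names are illustrative only.

```python
import math


def quantile_position(n, q):
    """Return (lower index, fraction) of the q-quantile among n sorted values."""
    pos = (n - 1) * q
    lower = math.floor(pos)
    return lower, pos - lower


def interpolate(low_value, high_value, frac):
    """Linearly interpolate between two neighbouring order statistics."""
    return low_value + frac * (high_value - low_value)


n = 100  # number of observations in the report
# Ten smallest TC_VAL values (ascending) and ten largest (descending),
# expanded from the value/count listings in the report.
smallest = [70, 108, 112, 114, 115, 117, 119, 121, 122, 124]
largest = [307, 300, 284, 271, 267, 266, 264, 259, 256, 256]

lo, frac = quantile_position(n, 0.05)
p5 = interpolate(smallest[lo], smallest[lo + 1], frac)

lo, frac = quantile_position(n, 0.95)
# Sorted index i corresponds to largest[n - 1 - i].
p95 = interpolate(largest[n - 1 - lo], largest[n - 2 - lo], frac)

print(round(p5, 2), round(p95, 2))  # 116.9 266.05
```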

Descriptive statistics

Standard deviation: 45.857067
Coefficient of variation (CV): 0.24380386
Kurtosis: -0.023873567
Mean: 188.09
Median Absolute Deviation (MAD): 27.5
Skewness: 0.2450285
Sum: 18809
Variance: 2102.8706
Monotonicity: Not monotonic
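
Several of these figures follow from one another. As a quick consistency check, the sketch below recomputes the range, IQR, mean, coefficient of variation, and variance of TC_VAL from the other reported values, using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared).

```python
# Values copied from the TC_VAL statistics in the report.
minimum, maximum = 70, 307
q1, q3 = 160.75, 216
total, n = 18809, 100
std = 45.857067

value_range = maximum - minimum   # reported: 237
iqr = q3 - q1                     # reported: 55.25
mean = total / n                  # reported: 188.09
cv = std / mean                   # reported: 0.24380386
variance = std ** 2               # reported: 2102.8706

print(value_range, iqr, mean, round(cv, 8), round(variance, 4))
```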
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
196                4      4.0%
190                3      3.0%
186                3      3.0%
180                3      3.0%
199                2      2.0%
163                2      2.0%
203                2      2.0%
256                2      2.0%
216                2      2.0%
167                2      2.0%
Other values (70)  75     75.0%

Value  Count  Frequency (%)
70     1      1.0%
108    1      1.0%
112    1      1.0%
114    1      1.0%
115    1      1.0%
117    1      1.0%
119    1      1.0%
121    1      1.0%
122    1      1.0%
124    1      1.0%

Value  Count  Frequency (%)
307    1      1.0%
300    1      1.0%
284    1      1.0%
271    1      1.0%
267    1      1.0%
266    1      1.0%
264    1      1.0%
259    1      1.0%
256    2      2.0%
255    1      1.0%

TG_DCT
Date

Distinct: 49
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2003-04-03 00:00:00
Maximum: 2006-02-12 00:00:00
Histogram with fixed size bins (bins=49)

TG_VAL
Real number (ℝ)

Distinct: 80
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158.15
Minimum: 36
Maximum: 676
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 36
5-th percentile: 64.85
Q1: 96.75
Median: 131
Q3: 173.25
95-th percentile: 310.8
Maximum: 676
Range: 640
Interquartile range (IQR): 76.5

Descriptive statistics

Standard deviation: 108.25898
Coefficient of variation (CV): 0.68453357
Kurtosis: 9.4778168
Mean: 158.15
Median Absolute Deviation (MAD): 38
Skewness: 2.7969473
Sum: 15815
Variance: 11720.008
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
97                 3      3.0%
100                3      3.0%
132                3      3.0%
102                2      2.0%
120                2      2.0%
90                 2      2.0%
80                 2      2.0%
161                2      2.0%
148                2      2.0%
113                2      2.0%
Other values (70)  77     77.0%

Value  Count  Frequency (%)
36     1      1.0%
46     1      1.0%
56     1      1.0%
61     1      1.0%
62     1      1.0%
65     1      1.0%
68     1      1.0%
71     1      1.0%
75     1      1.0%
80     2      2.0%

Value  Count  Frequency (%)
676    1      1.0%
623    1      1.0%
542    1      1.0%
516    1      1.0%
326    1      1.0%
310    1      1.0%
304    1      1.0%
286    1      1.0%
280    1      1.0%
264    1      1.0%

HDL_DCT
Date

Distinct: 49
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2003-04-03 00:00:00
Maximum: 2006-02-12 00:00:00
Histogram with fixed size bins (bins=49)

HDL_VAL
Real number (ℝ)

Distinct: 44
Distinct (%): 44.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.05
Minimum: 15
Maximum: 113
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 15
5-th percentile: 28.85
Q1: 34
Median: 45
Q3: 55
95-th percentile: 68.15
Maximum: 113
Range: 98
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 14.701319
Coefficient of variation (CV): 0.31924689
Kurtosis: 4.1727758
Mean: 46.05
Median Absolute Deviation (MAD): 10
Skewness: 1.3816347
Sum: 4605
Variance: 216.12879
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)

Value              Count  Frequency (%)
34                 6      6.0%
45                 5      5.0%
33                 4      4.0%
48                 4      4.0%
44                 4      4.0%
46                 4      4.0%
31                 4      4.0%
56                 4      4.0%
39                 4      4.0%
57                 3      3.0%
Other values (34)  58     58.0%

Value  Count  Frequency (%)
15     1      1.0%
25     1      1.0%
26     3      3.0%
29     2      2.0%
30     3      3.0%
31     4      4.0%
32     2      2.0%
33     4      4.0%
34     6      6.0%
35     1      1.0%

Value  Count  Frequency (%)
113    1      1.0%
91     1      1.0%
87     1      1.0%
76     1      1.0%
71     1      1.0%
68     1      1.0%
67     1      1.0%
63     1      1.0%
61     2      2.0%
60     1      1.0%

LDL_DCT
Date

Distinct: 50
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2003-04-03 00:00:00
Maximum: 2006-02-14 00:00:00
Histogram with fixed size bins (bins=50)

LDL_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 63
Distinct (%): 63.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.13
Minimum: 18
Maximum: 207
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 18
5-th percentile: 54.65
Q1: 78
Median: 96
Q3: 118.5
95-th percentile: 149
Maximum: 207
Range: 189
Interquartile range (IQR): 40.5

Descriptive statistics

Standard deviation: 31.659698
Coefficient of variation (CV): 0.32263016
Kurtosis: 0.59942723
Mean: 98.13
Median Absolute Deviation (MAD): 20
Skewness: 0.325284
Sum: 9813
Variance: 1002.3365
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
98                 4      4.0%
83                 4      4.0%
87                 4      4.0%
101                3      3.0%
112                3      3.0%
149                3      3.0%
103                3      3.0%
94                 3      3.0%
69                 2      2.0%
70                 2      2.0%
Other values (53)  69     69.0%

Value  Count  Frequency (%)
18     1      1.0%
32     1      1.0%
42     2      2.0%
48     1      1.0%
55     2      2.0%
57     2      2.0%
58     1      1.0%
60     2      2.0%
61     2      2.0%
62     1      1.0%

Value  Count  Frequency (%)
207    1      1.0%
165    1      1.0%
154    1      1.0%
149    3      3.0%
148    1      1.0%
143    2      2.0%
142    1      1.0%
140    1      1.0%
139    1      1.0%
138    1      1.0%

Interactions

[Figures: interaction plots, 16 panels]

Correlations

         RID    TC_DCT  TC_VAL  TG_DCT  TG_VAL  HDL_DCT  HDL_VAL  LDL_DCT  LDL_VAL
RID      1.000  1.000   1.000   1.000   1.000   1.000    1.000    1.000    1.000
TC_DCT   1.000  1.000   0.000   1.000   0.000   1.000    0.000    1.000    0.691
TC_VAL   1.000  0.000   1.000   0.000   0.512   0.000    0.677    0.000    0.806
TG_DCT   1.000  1.000   0.000   1.000   0.000   1.000    0.000    1.000    0.691
TG_VAL   1.000  0.000   0.512   0.000   1.000   0.000    0.568    0.000    0.322
HDL_DCT  1.000  1.000   0.000   1.000   0.000   1.000    0.000    1.000    0.691
HDL_VAL  1.000  0.000   0.677   0.000   0.568   0.000    1.000    0.354    0.510
LDL_DCT  1.000  1.000   0.000   1.000   0.000   1.000    0.354    1.000    0.703
LDL_VAL  1.000  0.691   0.806   0.691   0.322   0.691    0.510    0.703    1.000

         TC_VAL  TG_VAL  HDL_VAL  LDL_VAL
TC_VAL   1.000   0.206   0.361    0.913
TG_VAL   0.206   1.000   -0.326   0.111
HDL_VAL  0.361   -0.326  1.000    0.270
LDL_VAL  0.913   0.111   0.270    1.000
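
The report does not state which coefficient each matrix uses. Purely as an illustration of how a pairwise correlation is computed, the sketch below calculates Pearson's r for TC_VAL and LDL_VAL on the first ten rows shown in the Sample section. Because it uses only ten of the 100 observations, and Pearson's r is an assumed choice, the result is not expected to match the values above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# TC_VAL and LDL_VAL from sample rows 0 to 9 of the report.
tc_val = [235, 300, 196, 150, 174, 217, 161, 242, 180, 226]
ldl_val = [143, 207, 83, 42, 121, 149, 103, 133, 103, 165]

r = pearson(tc_val, ldl_val)
print(f"{r:.3f}")
```

Even on this small subset the two variables move together strongly, which is in line with the High correlation alert raised for TC_VAL and LDL_VAL.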

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    RID       TC_DCT      TC_VAL  TG_DCT      TG_VAL  HDL_DCT     HDL_VAL  LDL_DCT     LDL_VAL
0   R0000022  2003-04-03  235     2003-04-03  280     2003-04-03  33       2003-04-03  143
1   R0000025  2003-05-28  300     2003-05-28  119     2003-05-28  60       2003-05-28  207
2   R0000033  2004-04-24  196     2004-04-24  97      2004-04-24  87       2004-04-24  83
3   R0000035  2004-05-28  150     2004-05-28  676     2004-05-28  26       2004-05-28  42
4   R0000036  2004-08-10  174     2004-08-10  91      2004-08-10  36       2004-08-10  121
5   R0000037  2004-08-26  217     2004-08-26  192     2004-08-26  46       2004-08-26  149
6   R0000039  2004-12-06  161     2004-12-06  88      2004-12-06  31       2004-12-06  103
7   R0000043  2004-12-27  242     2004-12-27  310     2004-12-27  43       2004-12-27  133
8   R0000047  2005-01-14  180     2005-01-14  110     2005-01-14  61       2005-01-14  103
9   R0000058  2005-03-12  226     2005-03-12  146     2005-03-12  30       2005-03-12  165

    RID       TC_DCT      TC_VAL  TG_DCT      TG_VAL  HDL_DCT     HDL_VAL  LDL_DCT     LDL_VAL
90  R0001940  2006-01-23  165     2006-01-23  113     2006-01-23  43       2006-01-23  88
91  R0001948  2006-01-18  162     2006-01-18  56      2006-01-18  44       2006-01-18  83
92  R0001966  2006-01-05  153     2006-01-05  36      2006-01-05  48       2006-01-05  83
93  R0001969  2006-01-18  237     2006-01-18  286     2006-01-18  46       2006-01-18  118
94  R0001977  2006-01-21  199     2006-01-21  130     2006-01-21  48       2006-01-21  105
95  R0002001  2006-01-18  140     2006-01-18  87      2006-01-18  57       2006-01-18  57
96  R0002003  2006-01-19  212     2006-01-19  153     2006-01-19  55       2006-01-19  103
97  R0002023  2006-01-19  196     2006-01-19  213     2006-01-19  31       2006-01-19  101
98  R0002052  2006-01-05  236     2006-01-05  120     2006-01-05  53       2006-01-05  132
99  R0002069  2006-01-17  249     2006-01-17  174     2006-01-17  47       2006-01-17  140