Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B

Variable types

Text: 1
DateTime: 4
Numeric: 4

Dataset

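Description: Contains laboratory data from blood tests performed on patients with alcohol use disorder, for evaluating associations with diabetes and hyperlipidemia. Time-point data were generated by using the specimen collection date and the reception date to calculate the time elapsed since the prescription. The test items include HbA1c, Glucose, HDL Cholesterol, and LDL Cholesterol.
- HbA1c (glycated hemoglobin): hemoglobin in red blood cells to which glucose has bound. An ordinary blood glucose test shows only the glucose level at the time of the test, whereas HbA1c indicates the average blood glucose over three months.
- LDL (Low Density Lipoprotein) Cholesterol: low-density lipoprotein cholesterol, also called "bad" cholesterol. It makes up most of the body's cholesterol, and high levels raise the risk of heart disease and stroke.
- HDL (High Density Lipoprotein) Cholesterol: high-density lipoprotein cholesterol, also called "good" cholesterol, which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/coexistence-disease-analysis-blood-test-data-alcohol-use-disorder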

Alerts

High correlation: A1C_SRC is highly overall correlated with GLC_SRC
High correlation: GLC_SRC is highly overall correlated with A1C_SRC
Unique: RID has unique values

Reproduction

Analysis started: 2023-10-08 18:56:30.201794
Analysis finished: 2023-10-08 18:56:34.782063
Duration: 4.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
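The reported duration follows directly from the two timestamps above. A minimal sketch of that arithmetic, using only the Python standard library (the function name `duration_seconds` is illustrative, not part of ydata-profiling):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
STARTED = "2023-10-08 18:56:30.201794"
FINISHED = "2023-10-08 18:56:34.782063"
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Return the elapsed time between two timestamp strings, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


elapsed = duration_seconds(STARTED, FINISHED)
print(f"{elapsed:.2f} seconds")  # 4.58 seconds
```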

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
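As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with the standard `unicodedata` module. It uses only the five RID values listed under Sample for this variable, so the counts are for that subset, not the full column; `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

# The five RID values shown under "Sample" for this variable.
sample_rids = ["R0000002", "R0000004", "R0000009", "R0000020", "R0000032"]


def category_counts(values):
    """Count characters per Unicode general category (e.g. 'Lu', 'Nd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(sample_rids)
total = sum(counts.values())
for category, n in counts.most_common():
    # 'Nd' is Decimal Number, 'Lu' is Uppercase Letter.
    print(category, n, f"{100 * n / total:.1f}%")
```

Because every RID here is one uppercase letter followed by seven digits, the subset shows the same 87.5% / 12.5% split between Decimal Number and Uppercase Letter that the report gives for all 800 characters.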

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000002
2nd row: R0000004
3rd row: R0000009
4th row: R0000020
5th row: R0000032
Value              Count  Frequency (%)
r0000002           1      1.0%
r0000368           1      1.0%
r0000434           1      1.0%
r0000432           1      1.0%
r0000425           1      1.0%
r0000413           1      1.0%
r0000411           1      1.0%
r0000401           1      1.0%
r0000398           1      1.0%
r0000397           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      439    54.9%
R      100    12.5%
2      45     5.6%
4      41     5.1%
1      38     4.8%
3      34     4.2%
5      27     3.4%
6      23     2.9%
7      21     2.6%
9      17     2.1%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      439    62.7%
2      45     6.4%
4      41     5.9%
1      38     5.4%
3      34     4.9%
5      27     3.9%
6      23     3.3%
7      21     3.0%
9      17     2.4%
8      15     2.1%

Uppercase Letter
Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      439    62.7%
2      45     6.4%
4      41     5.9%
1      38     5.4%
3      34     4.9%
5      27     3.9%
6      23     3.3%
7      21     3.0%
9      17     2.4%
8      15     2.1%

Latin
Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      439    54.9%
R      100    12.5%
2      45     5.6%
4      41     5.1%
1      38     4.8%
3      34     4.2%
5      27     3.4%
6      23     2.9%
7      21     2.6%
9      17     2.1%

A1C_DCT
DateTime

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2008-11-03 00:00:00
Maximum: 2018-07-31 00:00:00
Histogram with fixed size bins (bins=50)

A1C_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 71
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134.413
Minimum: 77
Maximum: 403
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 77
5-th percentile: 84.9
Q1: 97.75
Median: 113.5
Q3: 147
95-th percentile: 249.84
Maximum: 403
Range: 326
Interquartile range (IQR): 49.25
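Percentiles such as 84.9 and 249.84 are not observed values, which suggests they were interpolated between neighbouring ranks. The sketch below assumes linear interpolation (an assumption; the report does not state the method) and uses the six lowest and six highest A1C_SRC values listed for this variable. The 88 middle observations are not listed in the report, so they are padded with `None`; this is safe only because the 5th and 95th percentiles of 100 sorted values touch positions 4 to 5 and 94 to 95. `linear_quantile` is an illustrative name.

```python
import math

N_OBSERVATIONS = 100  # "Number of observations" in the overview

# Six lowest and six highest A1C_SRC values listed in this report, ascending.
lowest = [77.0, 78.0, 79.0, 82.0, 83.0, 85.0]
highest = [249.2, 262.0, 268.0, 277.0, 380.0, 403.0]

# Placeholder padding for the unlisted middle values (never read below).
padding = [None] * (N_OBSERVATIONS - len(lowest) - len(highest))
sorted_values = lowest + padding + highest


def linear_quantile(values, q):
    """Quantile of an ascending list by linear interpolation between ranks."""
    pos = q * (len(values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return values[lo] + frac * (values[hi] - values[lo])


p05 = linear_quantile(sorted_values, 0.05)
p95 = linear_quantile(sorted_values, 0.95)
print(f"{p05:.2f}")  # 84.90
print(f"{p95:.2f}")  # 249.84
```

Both results agree with the reported percentiles, which is consistent with (but does not prove) linear interpolation being the method used.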

Descriptive statistics

Standard deviation: 58.071189
Coefficient of variation (CV): 0.43203551
Kurtosis: 6.8768221
Mean: 134.413
Median Absolute Deviation (MAD): 20
Skewness: 2.3589926
Sum: 13441.3
Variance: 3372.263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
111.0              5      5.0%
97.0               4      4.0%
113.0              3      3.0%
147.0              3      3.0%
91.0               3      3.0%
86.0               3      3.0%
110.0              2      2.0%
112.0              2      2.0%
170.0              2      2.0%
105.0              2      2.0%
Other values (61)  71     71.0%

Minimum 10 values
Value  Count  Frequency (%)
77.0   1      1.0%
78.0   1      1.0%
79.0   1      1.0%
82.0   1      1.0%
83.0   1      1.0%
85.0   1      1.0%
86.0   3      3.0%
87.0   1      1.0%
89.0   2      2.0%
91.0   3      3.0%

Maximum 10 values
Value  Count  Frequency (%)
403.0  1      1.0%
380.0  1      1.0%
277.0  1      1.0%
268.0  1      1.0%
262.0  1      1.0%
249.2  1      1.0%
238.0  1      1.0%
237.4  1      1.0%
234.0  1      1.0%
213.0  1      1.0%

GLC_DCT
DateTime

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2008-10-23 00:00:00
Maximum: 2018-07-31 00:00:00
Histogram with fixed size bins (bins=50)

GLC_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 42
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.433
Minimum: 4.2
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 4.2
5-th percentile: 4.695
Q1: 5.4
Median: 5.95
Q3: 7.125
95-th percentile: 10.02
Maximum: 13
Range: 8.8
Interquartile range (IQR): 1.725

Descriptive statistics

Standard deviation: 1.6425562
Coefficient of variation (CV): 0.25533285
Kurtosis: 2.7920744
Mean: 6.433
Median Absolute Deviation (MAD): 0.65
Skewness: 1.5838866
Sum: 643.3
Variance: 2.6979909
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
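The histogram captions in this report show 50 bins for variables with more than 50 distinct values, and otherwise a bin count equal to the distinct count (42 here). A minimal sketch of that pattern, assuming the rule is "one bin per distinct value, capped at 50"; the cap and both function names are inferred for illustration, not taken from the report's configuration:

```python
MAX_BINS = 50  # assumed cap, inferred from the "bins=50" captions in this report


def bin_count(n_distinct, max_bins=MAX_BINS):
    """Number of histogram bins: one per distinct value, capped at max_bins."""
    return min(n_distinct, max_bins)


def bin_edges(minimum, maximum, bins):
    """Edges of `bins` equal-width intervals covering [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


# (distinct count, bins shown in the caption) as reported per numeric variable.
reported = {
    "A1C_SRC": (71, 50),
    "GLC_SRC": (42, 42),
    "HDL_SRC": (46, 46),
    "LDL_SRC": (72, 50),
}
for name, (n_distinct, caption_bins) in reported.items():
    print(name, bin_count(n_distinct), caption_bins)

# Fixed-size bins for GLC_SRC, using its reported minimum and maximum.
glc_edges = bin_edges(4.2, 13, bin_count(42))
print(len(glc_edges) - 1, "bins of width", round(glc_edges[1] - glc_edges[0], 4))
```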
Common values
Value              Count  Frequency (%)
5.5                11     11.0%
5.4                7      7.0%
5.7                5      5.0%
7.6                5      5.0%
6.4                4      4.0%
6.5                4      4.0%
6.3                4      4.0%
5.1                4      4.0%
4.7                3      3.0%
5.0                3      3.0%
Other values (32)  50     50.0%

Minimum 10 values
Value  Count  Frequency (%)
4.2    1      1.0%
4.3    1      1.0%
4.4    1      1.0%
4.6    2      2.0%
4.7    3      3.0%
4.9    2      2.0%
5.0    3      3.0%
5.1    4      4.0%
5.2    2      2.0%
5.3    2      2.0%

Maximum 10 values
Value  Count  Frequency (%)
13.0   1      1.0%
11.2   1      1.0%
10.8   2      2.0%
10.4   1      1.0%
10.0   1      1.0%
9.4    2      2.0%
9.3    1      1.0%
8.6    1      1.0%
8.3    1      1.0%
8.2    2      2.0%

HDL_DCT
DateTime

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2008-11-03 00:00:00
Maximum: 2018-07-31 00:00:00
Histogram with fixed size bins (bins=50)

HDL_SRC
Real number (ℝ)

Distinct: 46
Distinct (%): 46.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.09
Minimum: 3
Maximum: 94
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 9
Q1: 36
Median: 44.5
Q3: 54
95-th percentile: 71.15
Maximum: 94
Range: 91
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 17.549408
Coefficient of variation (CV): 0.39803601
Kurtosis: 0.43752756
Mean: 44.09
Median Absolute Deviation (MAD): 9.5
Skewness: -0.18746109
Sum: 4409
Variance: 307.98172
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)
Common values
Value              Count  Frequency (%)
41                 7      7.0%
47                 6      6.0%
49                 6      6.0%
65                 5      5.0%
54                 4      4.0%
39                 4      4.0%
10                 4      4.0%
40                 4      4.0%
29                 3      3.0%
55                 3      3.0%
Other values (36)  54     54.0%

Minimum 10 values
Value  Count  Frequency (%)
3      1      1.0%
7      1      1.0%
8      2      2.0%
9      2      2.0%
10     4      4.0%
24     1      1.0%
27     2      2.0%
28     1      1.0%
29     3      3.0%
30     1      1.0%

Maximum 10 values
Value  Count  Frequency (%)
94     1      1.0%
83     1      1.0%
75     1      1.0%
74     2      2.0%
71     1      1.0%
68     1      1.0%
66     2      2.0%
65     5      5.0%
64     2      2.0%
61     1      1.0%

LDL_DCT
DateTime

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2008-10-23 00:00:00
Maximum: 2018-07-31 00:00:00
Histogram with fixed size bins (bins=50)

LDL_SRC
Real number (ℝ)

Distinct: 72
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.61
Minimum: 5
Maximum: 195
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 28.95
Q1: 70
Median: 99
Q3: 124
95-th percentile: 165.35
Maximum: 195
Range: 190
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 39.881616
Coefficient of variation (CV): 0.40443785
Kurtosis: -0.2955984
Mean: 98.61
Median Absolute Deviation (MAD): 29
Skewness: 0.010456013
Sum: 9861
Variance: 1590.5433
Monotonicity: Not monotonic
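Several of the figures above are derived from one another, so they can be cross-checked without the raw data. The sketch below verifies those relationships for LDL_SRC using only numbers printed in this report; the tolerances are an assumption that allows for the rounding of the displayed values.

```python
import math

# Values copied from the LDL_SRC statistics in this report.
ldl = {
    "n": 100, "sum": 9861, "mean": 98.61,
    "std": 39.881616, "variance": 1590.5433, "cv": 0.40443785,
    "min": 5, "max": 195, "range": 190,
    "q1": 70, "q3": 124, "iqr": 54,
}

# Each entry checks one textbook identity against the reported figures.
checks = {
    "mean = sum / n": math.isclose(ldl["sum"] / ldl["n"], ldl["mean"]),
    "variance = std ** 2": math.isclose(ldl["std"] ** 2, ldl["variance"], rel_tol=1e-6),
    "cv = std / mean": math.isclose(ldl["std"] / ldl["mean"], ldl["cv"], rel_tol=1e-6),
    "range = max - min": ldl["max"] - ldl["min"] == ldl["range"],
    "iqr = q3 - q1": ldl["q3"] - ldl["q1"] == ldl["iqr"],
}

for identity, ok in checks.items():
    print(f"{identity}: {'ok' if ok else 'MISMATCH'}")
```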
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
70                 3      3.0%
93                 3      3.0%
117                3      3.0%
100                3      3.0%
69                 3      3.0%
135                2      2.0%
98                 2      2.0%
78                 2      2.0%
55                 2      2.0%
121                2      2.0%
Other values (62)  75     75.0%

Minimum 10 values
Value  Count  Frequency (%)
5      1      1.0%
17     1      1.0%
20     1      1.0%
23     1      1.0%
28     1      1.0%
29     1      1.0%
38     1      1.0%
42     1      1.0%
46     2      2.0%
48     2      2.0%

Maximum 10 values
Value  Count  Frequency (%)
195    1      1.0%
182    1      1.0%
177    1      1.0%
175    1      1.0%
172    1      1.0%
165    1      1.0%
164    1      1.0%
159    1      1.0%
154    1      1.0%
149    1      1.0%

Interactions

[16 pairwise interaction plots; images not reproduced]

Correlations

         RID    A1C_DCT  A1C_SRC  GLC_DCT  GLC_SRC  HDL_DCT  HDL_SRC  LDL_DCT  LDL_SRC
RID      1.000  1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
A1C_DCT  1.000  1.000    0.000    1.000    0.000    1.000    0.965    1.000    0.951
A1C_SRC  1.000  0.000    1.000    0.889    0.635    0.872    0.000    0.889    0.000
GLC_DCT  1.000  1.000    0.889    1.000    0.000    1.000    0.990    1.000    0.959
GLC_SRC  1.000  0.000    0.635    0.000    1.000    0.000    0.000    0.000    0.000
HDL_DCT  1.000  1.000    0.872    1.000    0.000    1.000    0.965    1.000    0.844
HDL_SRC  1.000  0.965    0.000    0.990    0.000    0.965    1.000    0.990    0.326
LDL_DCT  1.000  1.000    0.889    1.000    0.000    1.000    0.990    1.000    0.959
LDL_SRC  1.000  0.951    0.000    0.959    0.000    0.844    0.326    0.959    1.000

         A1C_SRC  GLC_SRC  HDL_SRC  LDL_SRC
A1C_SRC  1.000    0.629    -0.165   -0.073
GLC_SRC  0.629    1.000    -0.048   -0.113
HDL_SRC  -0.165   -0.048   1.000    0.210
LDL_SRC  -0.073   -0.113   0.210    1.000
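The "High correlation" alerts can be related to a matrix like the one above by scanning the off-diagonal entries for large absolute values. The sketch below does this for the four numeric variables. The 0.5 threshold is an assumption (the report does not state its cut-off), and the alerts may have been derived from the other matrix; `highly_correlated` is an illustrative name.

```python
from itertools import combinations

THRESHOLD = 0.5  # assumed cut-off; the report does not state the value it used

names = ["A1C_SRC", "GLC_SRC", "HDL_SRC", "LDL_SRC"]
# Coefficients copied from the matrix of numeric variables in this report.
matrix = [
    [1.000, 0.629, -0.165, -0.073],
    [0.629, 1.000, -0.048, -0.113],
    [-0.165, -0.048, 1.000, 0.210],
    [-0.073, -0.113, 0.210, 1.000],
]


def highly_correlated(names, matrix, threshold):
    """Return (name_a, name_b, coefficient) for pairs with |coefficient| >= threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


flagged = highly_correlated(names, matrix, THRESHOLD)
print(flagged)  # [('A1C_SRC', 'GLC_SRC', 0.629)]
```

Under this assumed threshold, only the A1C_SRC and GLC_SRC pair is flagged, which matches the two alerts listed in the overview.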

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
    RID       A1C_DCT     A1C_SRC  GLC_DCT     GLC_SRC  HDL_DCT     HDL_SRC  LDL_DCT     LDL_SRC
0   R0000002  2011-09-23  113.0    2011-09-23  6.4      2011-09-23  54       2011-09-23  118
1   R0000004  2014-10-24  160.0    2014-10-24  6.3      2014-10-24  52       2014-10-24  23
2   R0000009  2013-01-10  111.0    2013-01-10  6.0      2013-01-10  29       2013-01-10  154
3   R0000020  2015-09-23  133.0    2015-09-23  5.5      2015-09-23  34       2015-10-02  93
4   R0000032  2009-02-05  249.2    2009-01-27  7.1      2009-01-24  64       2009-01-24  129
5   R0000039  2011-04-05  147.0    2011-04-22  7.8      2011-04-05  40       2011-04-05  141
6   R0000044  2016-09-27  93.0     2016-09-27  5.1      2016-09-27  47       2016-09-27  182
7   R0000053  2016-05-12  97.0     2016-05-12  5.5      2016-05-12  49       2016-05-12  133
8   R0000057  2017-11-21  109.0    2017-11-21  6.3      2017-11-21  47       2017-11-21  175
9   R0000060  2016-08-16  95.0     2016-08-16  5.8      2016-08-16  41       2016-08-29  84

Last rows
    RID       A1C_DCT     A1C_SRC  GLC_DCT     GLC_SRC  HDL_DCT     HDL_SRC  LDL_DCT     LDL_SRC
90  R0000503  2011-03-23  380.0    2011-03-22  7.2      2011-03-22  8        2011-03-22  109
91  R0000505  2009-05-16  111.0    2009-05-12  4.7      2009-05-28  38       2009-05-28  107
92  R0000507  2009-05-13  111.0    2009-05-14  5.4      2009-05-13  55       2009-05-13  103
93  R0000514  2011-03-23  113.0    2011-03-23  7.0      2011-03-23  39       2011-03-23  128
94  R0000516  2017-04-25  116.0    2017-04-25  6.0      2017-04-25  3        2017-04-25  17
95  R0000517  2010-01-25  86.0     2010-01-25  5.4      2010-01-25  32       2010-01-25  149
96  R0000520  2013-06-17  99.0     2013-06-17  6.2      2013-06-17  35       2013-06-17  46
97  R0000526  2010-11-16  403.0    2010-11-16  10.8     2010-11-20  42       2010-11-20  130
98  R0000527  2013-07-01  170.0    2013-07-15  6.4      2013-07-01  38       2013-07-01  117
99  R0000528  2017-03-21  111.0    2017-03-21  5.2      2017-03-21  94       2017-03-21  69
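The dataset description mentions time-point data computed from test dates. The prescription date itself is not among the columns, so the sketch below uses the earliest test date within a record as the reference point; that choice is an assumption for illustration only. It uses the record with RID R0000032 from the sample, whose four `_DCT` dates differ, and `days_from_earliest` is a hypothetical helper name.

```python
from datetime import date

# The four test dates of the sample record with RID R0000032.
record_dates = {
    "A1C_DCT": date(2009, 2, 5),
    "GLC_DCT": date(2009, 1, 27),
    "HDL_DCT": date(2009, 1, 24),
    "LDL_DCT": date(2009, 1, 24),
}


def days_from_earliest(dates):
    """Offset of each date, in days, from the earliest date in the record."""
    first = min(dates.values())
    return {name: (d - first).days for name, d in dates.items()}


offsets = days_from_earliest(record_dates)
print(offsets)  # {'A1C_DCT': 12, 'GLC_DCT': 3, 'HDL_DCT': 0, 'LDL_DCT': 0}
```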