Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B

Variable types

Numeric: 4
DateTime: 4
Text: 1

Dataset

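Description: Laboratory data from blood tests performed on patients with alcohol use disorder, intended for evaluating associations with diabetes and hyperlipidemia. Time-point data were generated by using the specimen collection date and the receipt date to calculate the elapsed time from the time of prescription. The test items include HbA1c, Glucose, HDL Cholesterol, and LDL Cholesterol.
- HbA1c (glycated hemoglobin): hemoglobin in red blood cells to which glucose has bound. An ordinary blood glucose test shows only the glucose level at the time of testing, whereas HbA1c indicates the average blood glucose over 3 months.
- LDL (Low Density Lipoprotein) Cholesterol: low-density lipoprotein cholesterol, also called "bad" cholesterol. It makes up most of the body's cholesterol, and high levels raise the risk of heart disease and stroke.
- HDL (High Density Lipoprotein) Cholesterol: high-density lipoprotein cholesterol, also called "good" cholesterol, which absorbs cholesterol and carries it back to the liver. High HDL cholesterol can lower the risk of heart disease and stroke.
Author: 가톨릭대학교 은평성모병원 (The Catholic University of Korea, Eunpyeong St. Mary's Hospital)
URL: http://cmcdata.net/data/dataset/coexistence-disease-analysis-blood-test-data-alcohol-use-disorder-eunpyeong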

Alerts

A1C_SRC is highly overall correlated with GLC_SRC (High correlation)
GLC_SRC is highly overall correlated with A1C_SRC (High correlation)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:56:49.224184
Analysis finished: 2023-10-08 18:56:54.568114
Duration: 5.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
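
The RID statistics above can be cross-checked by hand. The following is a minimal sketch, assuming RID is simply the sequence 1..100 (which the minimum, maximum, distinct count and strictly increasing monotonicity suggest, but the report does not state outright). The `percentile` helper is a hypothetical linear-interpolation implementation written for this illustration, not a function taken from the profiling library.

```python
import math
import statistics

# Assumption: RID is the sequence 1..100.
rid = list(range(1, 101))


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


mean = statistics.mean(rid)
variance = statistics.variance(rid)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation

q1 = percentile(rid, 0.25)
q3 = percentile(rid, 0.75)
iqr = q3 - q1

median = statistics.median(rid)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(x - median) for x in rid)

print(f"mean={mean} variance={variance:.5f} std={std:.6f} cv={cv:.8f}")
print(f"Q1={q1} Q3={q3} IQR={iqr} MAD={mad}")
```

Under that assumption the sample (n - 1) variance reproduces the reported 841.66667, whereas a population variance would not, which suggests the report uses the sample form.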
Common values
Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Minimum 10 values
Value              Count  Frequency (%)
1                  1      1.0%
2                  1      1.0%
3                  1      1.0%
4                  1      1.0%
5                  1      1.0%
6                  1      1.0%
7                  1      1.0%
8                  1      1.0%
9                  1      1.0%
10                 1      1.0%

Maximum 10 values
Value              Count  Frequency (%)
100                1      1.0%
99                 1      1.0%
98                 1      1.0%
97                 1      1.0%
96                 1      1.0%
95                 1      1.0%
94                 1      1.0%
93                 1      1.0%
92                 1      1.0%
91                 1      1.0%

A1C_DCT
DateTime

Distinct: 90
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-11-17 00:00:00
Maximum: 2020-04-07 00:00:00
Histogram with fixed size bins (bins=50)

A1C_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 48
Distinct (%): 48.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.58
Minimum: 4
Maximum: 14.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 4.595
Q1: 5.3
Median: 5.9
Q3: 7.425
95-th percentile: 10.01
Maximum: 14.4
Range: 10.4
Interquartile range (IQR): 2.125

Descriptive statistics

Standard deviation: 1.8508529
Coefficient of variation (CV): 0.28128464
Kurtosis: 2.4682709
Mean: 6.58
Median Absolute Deviation (MAD): 0.8
Skewness: 1.4458676
Sum: 658
Variance: 3.4256566
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
Common values
Value              Count  Frequency (%)
5.3                9      9.0%
5.7                5      5.0%
6.0                4      4.0%
5.2                4      4.0%
6.3                4      4.0%
5.6                4      4.0%
5.0                4      4.0%
5.4                3      3.0%
7.1                3      3.0%
5.5                3      3.0%
Other values (38)  57     57.0%

Minimum 10 values
Value              Count  Frequency (%)
4.0                1      1.0%
4.3                1      1.0%
4.4                2      2.0%
4.5                1      1.0%
4.6                2      2.0%
4.8                1      1.0%
4.9                2      2.0%
5.0                4      4.0%
5.1                3      3.0%
5.2                4      4.0%

Maximum 10 values
Value              Count  Frequency (%)
14.4               1      1.0%
11.4               1      1.0%
10.9               1      1.0%
10.4               1      1.0%
10.2               1      1.0%
10.0               1      1.0%
9.7                1      1.0%
9.6                3      3.0%
9.4                2      2.0%
9.3                1      1.0%

GLC_DCT
DateTime

Distinct: 94
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-11-18 00:00:00
Maximum: 2020-04-07 00:00:00
Histogram with fixed size bins (bins=50)

GLC_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 71
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.48
Minimum: 79
Maximum: 672
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 79
5-th percentile: 93.85
Q1: 115.75
Median: 134
Q3: 181
95-th percentile: 269.65
Maximum: 672
Range: 593
Interquartile range (IQR): 65.25

Descriptive statistics

Standard deviation: 85.252978
Coefficient of variation (CV): 0.5312374
Kurtosis: 15.104049
Mean: 160.48
Median Absolute Deviation (MAD): 23.5
Skewness: 3.3394795
Sum: 16048
Variance: 7268.0703
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
121                4      4.0%
102                4      4.0%
120                3      3.0%
124                3      3.0%
201                3      3.0%
126                3      3.0%
159                3      3.0%
135                2      2.0%
181                2      2.0%
125                2      2.0%
Other values (61)  71     71.0%

Minimum 10 values
Value              Count  Frequency (%)
79                 1      1.0%
84                 1      1.0%
88                 2      2.0%
91                 1      1.0%
94                 1      1.0%
95                 1      1.0%
98                 2      2.0%
101                1      1.0%
102                4      4.0%
105                1      1.0%

Maximum 10 values
Value              Count  Frequency (%)
672                1      1.0%
505                1      1.0%
426                1      1.0%
313                1      1.0%
301                1      1.0%
268                1      1.0%
265                1      1.0%
260                1      1.0%
255                1      1.0%
253                1      1.0%

HDL_DCT
DateTime

Distinct: 90
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-11-18 00:00:00
Maximum: 2020-04-07 00:00:00
Histogram with fixed size bins (bins=50)

HDL_SRC
Text

Distinct: 54
Distinct (%): 54.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 2
Mean length: 1.99
Min length: 1

Characters and Unicode

Total characters: 199
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 29.0%

Sample

1st row: 37
2nd row: 62
3rd row: 52
4th row: 18
5th row: 71

Common values
Value              Count  Frequency (%)
42                 5      5.0%
47                 5      5.0%
46                 4      4.0%
34                 4      4.0%
37                 4      4.0%
40                 4      4.0%
39                 3      3.0%
38                 3      3.0%
71                 3      3.0%
64                 3      3.0%
Other values (45)  63     62.4%

Most occurring characters

Value              Count  Frequency (%)
4                  41     20.6%
3                  25     12.6%
6                  24     12.1%
7                  22     11.1%
5                  18     9.0%
1                  18     9.0%
2                  15     7.5%
9                  14     7.0%
0                  10     5.0%
8                  10     5.0%
Other values (2)   2      1.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     197    99.0%
Math Symbol        1      0.5%
Space Separator    1      0.5%

Most frequent character per category

Decimal Number
Value              Count  Frequency (%)
4                  41     20.8%
3                  25     12.7%
6                  24     12.2%
7                  22     11.2%
5                  18     9.1%
1                  18     9.1%
2                  15     7.6%
9                  14     7.1%
0                  10     5.1%
8                  10     5.1%

Math Symbol
Value              Count  Frequency (%)
<                  1      100.0%

Space Separator
Value              Count  Frequency (%)
(space)            1      100.0%

Most occurring scripts

Value              Count  Frequency (%)
Common             199    100.0%

Most frequent character per script

Common
Value              Count  Frequency (%)
4                  41     20.6%
3                  25     12.6%
6                  24     12.1%
7                  22     11.1%
5                  18     9.0%
1                  18     9.0%
2                  15     7.5%
9                  14     7.0%
0                  10     5.0%
8                  10     5.0%
Other values (2)   2      1.0%

Most occurring blocks

Value              Count  Frequency (%)
ASCII              199    100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
4                  41     20.6%
3                  25     12.6%
6                  24     12.1%
7                  22     11.1%
5                  18     9.0%
1                  18     9.0%
2                  15     7.5%
9                  14     7.0%
0                  10     5.0%
8                  10     5.0%
Other values (2)   2      1.0%
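
The character statistics above show why this column is typed as Text rather than numeric: besides digits it contains one "<" and one space, which suggests a single censored result of the form "< N". A minimal sketch of how such values could be coerced follows. The function name `parse_lab_value` and the example string "< 5" are hypothetical, since the report does not show the actual censored value.

```python
import re

# Optional comparator, optional whitespace, then a non-negative number.
_LAB_VALUE = re.compile(r"\s*([<>]?)\s*(\d+(?:\.\d+)?)\s*")


def parse_lab_value(raw):
    """Split a lab result string into (comparator, number).

    Plain numbers get the comparator "="; censored results such as
    "< 5" keep their "<" so the censoring is not silently lost.
    """
    match = _LAB_VALUE.fullmatch(raw)
    if match is None:
        raise ValueError(f"unrecognised lab value: {raw!r}")
    comparator, number = match.groups()
    return comparator or "=", float(number)


# The first five strings are the sample rows listed in the report;
# "< 5" is a hypothetical stand-in for the one censored value.
for raw in ["37", "62", "52", "18", "71", "< 5"]:
    print(raw, "->", parse_lab_value(raw))
```

Keeping the comparator alongside the number lets a later analysis decide how to treat values below the detection limit instead of discarding them or treating them as exact.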

LDL_DCT
DateTime

Distinct: 90
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2015-11-18 00:00:00
Maximum: 2020-04-07 00:00:00
Histogram with fixed size bins (bins=50)

LDL_SRC
Real number (ℝ)

Distinct: 74
Distinct (%): 74.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.42
Minimum: 11
Maximum: 211
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11
5-th percentile: 27.9
Q1: 51.75
Median: 82.5
Q3: 112.75
95-th percentile: 165
Maximum: 211
Range: 200
Interquartile range (IQR): 61

Descriptive statistics

Standard deviation: 41.923735
Coefficient of variation (CV): 0.49079531
Kurtosis: -0.15173594
Mean: 85.42
Median Absolute Deviation (MAD): 31
Skewness: 0.56727878
Sum: 8542
Variance: 1757.5996
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
50                 3      3.0%
102                3      3.0%
44                 3      3.0%
51                 3      3.0%
122                3      3.0%
55                 2      2.0%
52                 2      2.0%
107                2      2.0%
115                2      2.0%
112                2      2.0%
Other values (64)  75     75.0%

Minimum 10 values
Value              Count  Frequency (%)
11                 1      1.0%
16                 1      1.0%
20                 1      1.0%
25                 1      1.0%
26                 1      1.0%
28                 1      1.0%
30                 1      1.0%
32                 1      1.0%
38                 1      1.0%
39                 1      1.0%

Maximum 10 values
Value              Count  Frequency (%)
211                1      1.0%
182                1      1.0%
172                1      1.0%
171                1      1.0%
165                2      2.0%
161                1      1.0%
157                1      1.0%
155                1      1.0%
141                1      1.0%
137                1      1.0%

Interactions

[16 interaction plots; images not available in this text version]

Correlations

         RID      A1C_DCT  A1C_SRC  GLC_DCT  GLC_SRC  HDL_DCT  HDL_SRC  LDL_DCT  LDL_SRC
RID      1.000    0.879    0.102    0.557    0.260    0.834    0.000    0.834    0.101
A1C_DCT  0.879    1.000    0.946    0.998    0.948    0.999    0.000    0.999    0.000
A1C_SRC  0.102    0.946    1.000    0.970    0.621    0.959    0.000    0.959    0.000
GLC_DCT  0.557    0.998    0.970    1.000    0.992    0.999    0.878    0.999    0.932
GLC_SRC  0.260    0.948    0.621    0.992    1.000    0.988    0.499    0.988    0.104
HDL_DCT  0.834    0.999    0.959    0.999    0.988    1.000    0.000    1.000    0.000
HDL_SRC  0.000    0.000    0.000    0.878    0.499    0.000    1.000    0.000    0.704
LDL_DCT  0.834    0.999    0.959    0.999    0.988    1.000    0.000    1.000    0.000
LDL_SRC  0.101    0.000    0.000    0.932    0.104    0.000    0.704    0.000    1.000

         RID      A1C_SRC  GLC_SRC  LDL_SRC
RID      1.000    -0.028   -0.027   0.115
A1C_SRC  -0.028   1.000    0.652    -0.029
GLC_SRC  -0.027   0.652    1.000    -0.027
LDL_SRC  0.115    -0.029   -0.027   1.000
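
To make the A1C_SRC / GLC_SRC entries above more concrete, here is a minimal sketch that computes a Pearson coefficient for the two columns from the 20 rows listed in the Sample section. Two caveats: this text does not say which coefficient each matrix uses, so Pearson is chosen purely for illustration, and 20 rows are only a fifth of the data, so the result need not match either matrix. The `pearson` helper is written for this example.

```python
import math

# A1C_SRC and GLC_SRC for the 20 rows shown in the Sample section
# (first 10 rows followed by last 10 rows).
a1c = [8.2, 7.1, 10.4, 14.4, 5.2, 4.9, 5.7, 6.8, 6.3, 5.7,
       5.0, 9.6, 6.3, 5.3, 5.1, 5.5, 5.3, 10.9, 4.6, 5.3]
glc = [246, 149, 217, 121, 129, 121, 124, 200, 201, 126,
       115, 225, 125, 138, 102, 130, 177, 505, 115, 98]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(a1c, glc)
print(f"Pearson r on the 20 sampled rows: {r:.3f}")
```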

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
    RID  A1C_DCT              A1C_SRC  GLC_DCT              GLC_SRC  HDL_DCT              HDL_SRC  LDL_DCT              LDL_SRC
0   1    2018-04-10T00:00:00  8.2      2018-04-09T00:00:00  246      2018-04-10T00:00:00  37       2018-04-10T00:00:00  25
1   2    2017-04-05T00:00:00  7.1      2017-04-05T00:00:00  149      2017-04-05T00:00:00  62       2017-04-05T00:00:00  50
2   3    2017-04-27T00:00:00  10.4     2017-04-28T00:00:00  217      2017-04-28T00:00:00  52       2017-04-28T00:00:00  102
3   4    2018-08-17T00:00:00  14.4     2018-09-04T00:00:00  121      2018-08-21T00:00:00  18       2018-08-21T00:00:00  75
4   5    2018-06-21T00:00:00  5.2      2018-06-21T00:00:00  129      2018-06-21T00:00:00  71       2018-06-21T00:00:00  41
5   6    2019-09-18T00:00:00  4.9      2019-09-18T00:00:00  121      2019-09-18T00:00:00  99       2019-09-18T00:00:00  40
6   7    2018-07-04T00:00:00  5.7      2018-06-26T00:00:00  124      2018-06-26T00:00:00  37       2018-06-26T00:00:00  50
7   8    2019-07-04T00:00:00  6.8      2019-07-04T00:00:00  200      2019-07-04T00:00:00  61       2019-07-04T00:00:00  155
8   9    2019-10-10T00:00:00  6.3      2019-10-10T00:00:00  201      2019-10-10T00:00:00  46       2019-10-10T00:00:00  97
9   10   2017-08-25T00:00:00  5.7      2017-08-25T00:00:00  126      2017-08-25T00:00:00  74       2017-08-25T00:00:00  54

Last rows
    RID  A1C_DCT              A1C_SRC  GLC_DCT              GLC_SRC  HDL_DCT              HDL_SRC  LDL_DCT              LDL_SRC
90  91   2019-10-29T00:00:00  5.0      2019-10-15T00:00:00  115      2019-10-30T00:00:00  45       2019-10-30T00:00:00  65
91  92   2020-01-10T00:00:00  9.6      2020-01-10T00:00:00  225      2020-01-10T00:00:00  39       2020-01-10T00:00:00  51
92  93   2018-07-09T00:00:00  6.3      2018-07-09T00:00:00  125      2018-07-09T00:00:00  34       2018-07-09T00:00:00  53
93  94   2019-10-14T00:00:00  5.3      2019-10-14T00:00:00  138      2019-10-14T00:00:00  41       2019-10-14T00:00:00  124
94  95   2020-01-31T00:00:00  5.1      2020-01-31T00:00:00  102      2020-01-31T00:00:00  47       2020-01-31T00:00:00  135
95  96   2019-10-08T00:00:00  5.5      2019-10-09T00:00:00  130      2019-10-08T00:00:00  39       2019-10-08T00:00:00  32
96  97   2019-11-09T00:00:00  5.3      2019-11-09T00:00:00  177      2019-11-09T00:00:00  50       2019-11-09T00:00:00  136
97  98   2018-04-20T00:00:00  10.9     2018-04-20T00:00:00  505      2018-04-21T00:00:00  38       2018-04-21T00:00:00  39
98  99   2019-06-22T00:00:00  4.6      2019-07-11T00:00:00  115      2019-07-11T00:00:00  40       2019-07-11T00:00:00  43
99  100  2019-05-16T00:00:00  5.3      2019-05-16T00:00:00  98       2019-05-16T00:00:00  64       2019-05-16T00:00:00  122
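
The dataset description mentions deriving time-point data from test dates. The sample rows show that the *_DCT dates for one record do not always coincide, so a day offset between two dates is a natural derived quantity. The sketch below computes the signed offset from A1C_DCT to GLC_DCT for the first four sample rows. It assumes the *_DCT columns hold the date associated with each test; the prescription date itself is not part of this table, so A1C_DCT stands in as the reference purely for illustration, and `days_between` is a helper written for this example.

```python
from datetime import datetime

# (RID, A1C_DCT, GLC_DCT) for the first four rows of the Sample table.
rows = [
    (1, "2018-04-10T00:00:00", "2018-04-09T00:00:00"),
    (2, "2017-04-05T00:00:00", "2017-04-05T00:00:00"),
    (3, "2017-04-27T00:00:00", "2017-04-28T00:00:00"),
    (4, "2018-08-17T00:00:00", "2018-09-04T00:00:00"),
]


def days_between(reference, other):
    """Signed whole days from `reference` to `other` (ISO 8601 strings)."""
    delta = datetime.fromisoformat(other) - datetime.fromisoformat(reference)
    return delta.days


offsets = {rid: days_between(a1c_date, glc_date)
           for rid, a1c_date, glc_date in rows}
print(offsets)  # {1: -1, 2: 0, 3: 1, 4: 18}
```

A negative offset means the glucose date precedes the HbA1c date, as in the first row.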