Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B

Variable types

Text: 1
Unsupported: 4
Numeric: 4

Dataset

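Description: Contains the main blood test data, from tests performed on patients with alcohol use disorder, that can be used to evaluate effects on liver function. Time-point data were generated by calculating the period elapsed since the prescription date, using the specimen collection date and the receipt date. The test items include AST (GOT), ALT (GPT), ALP, and γ-GTP, the main tests for evaluating liver function improvement and alcohol use.
- AST (aspartate aminotransferase; GOT, glutamic oxaloacetic transaminase) and ALT (alanine aminotransferase; GPT, glutamic pyruvic transaminase): aminotransferases that reflect hepatocellular damage; they are basic liver function test items.
- ALP (alkaline phosphatase): an enzyme present in the bile ducts within the liver; it rises quickly, mainly when bile excretion is impaired.
- γ-GTP (gamma-glutamyl transferase, GGT): an enzyme present in the bile ducts within the liver, used together with ALP to assess impaired bile excretion. It can also rise without liver disease, for example in people with alcohol dependence, in some obese people, and after an overdose of drugs such as acetaminophen, phenytoin, or carbamazepine.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/main-effect-blood-test-data-alcohol-use-disorder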

Alerts

AST_SRC is highly overall correlated with ALT_SRC and 2 other fields (High correlation)
ALT_SRC is highly overall correlated with AST_SRC (High correlation)
ALP_SRC is highly overall correlated with AST_SRC (High correlation)
GTP_SRC is highly overall correlated with AST_SRC (High correlation)
RID has unique values (Unique)
AST_DCT is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
ALT_DCT is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
ALP_DCT is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
GTP_DCT is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
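
The report does not say why the four *_DCT columns are flagged as unsupported. One plausible cause, and it is only an assumption, is that the columns mix value formats: the Sample section shows most dates as "YYYY-MM-DD HH:MM:SS" but the last row as a bare "YYYY-MM-DD". The sketch below shows one way such values could be normalised before profiling. The helper name parse_dct and the list of formats are hypothetical and are not part of ydata-profiling.

```python
from datetime import datetime

# Assumed formats, based only on the values visible in the Sample section.
FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_dct(value):
    """Return a datetime for a *_DCT value given as a datetime or a date string."""
    if isinstance(value, datetime):
        return value
    text = str(value).strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognised date value: {value!r}")


# Two values in the shapes seen in the Sample section.
raw = ["2011-05-15 00:00:00", "2018-06-04"]
parsed = [parse_dct(v) for v in raw]
print(parsed)
```

After such a conversion every value has the same type, which is usually what a profiler needs to treat a column as a date column.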

Reproduction

Analysis started: 2023-10-08 18:56:19.576207
Analysis finished: 2023-10-08 18:56:23.238689
Duration: 3.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
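
As a rough illustration of how such character statistics can be derived, the following sketch counts characters and their Unicode general categories with the Python standard library. It uses only the five RID values listed under Sample, so the counts differ from the full-column figures in this report, and it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five RID values shown under "Sample"; the full column has 100 values.
ids = ["R0000001", "R0000002", "R0000003", "R0000005", "R0000007"]

# Frequency of each character across all values.
chars = Counter("".join(ids))

# Frequency of each Unicode general category:
# "Nd" is Decimal Number, "Lu" is Uppercase Letter.
categories = Counter(unicodedata.category(c) for s in ids for c in s)

total = sum(chars.values())
for ch, n in chars.most_common(3):
    print(ch, n, f"{100 * n / total:.1f}%")
print(dict(categories))
```

The standard library module unicodedata exposes categories but not scripts or blocks, so those two breakdowns would need an additional data source.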

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000001
2nd row: R0000002
3rd row: R0000003
4th row: R0000005
5th row: R0000007

Value              Count  Frequency (%)
r0000001           1      1.0%
r0000078           1      1.0%
r0000090           1      1.0%
r0000089           1      1.0%
r0000088           1      1.0%
r0000086           1      1.0%
r0000085           1      1.0%
r0000084           1      1.0%
r0000083           1      1.0%
r0000082           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      508    63.5%
R      100    12.5%
1      47     5.9%
3      21     2.6%
6      20     2.5%
8      19     2.4%
9      19     2.4%
7      17     2.1%
2      17     2.1%
5      16     2.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      508    72.6%
1      47     6.7%
3      21     3.0%
6      20     2.9%
8      19     2.7%
9      19     2.7%
7      17     2.4%
2      17     2.4%
5      16     2.3%
4      16     2.3%

Uppercase Letter

Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      508    72.6%
1      47     6.7%
3      21     3.0%
6      20     2.9%
8      19     2.7%
9      19     2.7%
7      17     2.4%
2      17     2.4%
5      16     2.3%
4      16     2.3%

Latin

Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      508    63.5%
R      100    12.5%
1      47     5.9%
3      21     2.6%
6      20     2.5%
8      19     2.4%
9      19     2.4%
7      17     2.1%
2      17     2.1%
5      16     2.0%

AST_DCT
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

AST_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.72
Minimum: 15
Maximum: 1085
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 15
5-th percentile: 19
Q1: 27.75
median: 41.5
Q3: 78
95-th percentile: 212.25
Maximum: 1085
Range: 1070
Interquartile range (IQR): 50.25

Descriptive statistics

Standard deviation: 137.1877
Coefficient of variation (CV): 1.6787531
Kurtosis: 34.099658
Mean: 81.72
Median Absolute Deviation (MAD): 19
Skewness: 5.396802
Sum: 8172
Variance: 18820.466
Monotonicity: Not monotonic
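
Several of the figures above follow directly from others. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for AST_SRC from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are rounded values copied from this report, so the recomputed variance agrees with the reported one only to within rounding.

```python
# Values copied from the AST_SRC statistics in this report.
n = 100                      # number of observations
mean, std = 81.72, 137.1877
minimum, maximum = 15, 1085
q1, q3 = 27.75, 78

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum is mean times count

print(value_range, iqr, round(cv, 4), round(variance, 2), round(total, 2))
```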
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
33                 4      4.0%
28                 4      4.0%
30                 3      3.0%
24                 3      3.0%
23                 3      3.0%
27                 3      3.0%
19                 3      3.0%
32                 3      3.0%
20                 3      3.0%
22                 2      2.0%
Other values (56)  69     69.0%

Minimum 10 values

Value  Count  Frequency (%)
15     1      1.0%
16     1      1.0%
17     1      1.0%
19     3      3.0%
20     3      3.0%
21     2      2.0%
22     2      2.0%
23     3      3.0%
24     3      3.0%
25     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
1085   1      1.0%
752    1      1.0%
380    1      1.0%
309    1      1.0%
236    1      1.0%
211    1      1.0%
207    1      1.0%
187    1      1.0%
176    1      1.0%
174    1      1.0%

ALT_DCT
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

ALT_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.36
Minimum: 8
Maximum: 471
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 8
5-th percentile: 13
Q1: 22
median: 41
Q3: 70.5
95-th percentile: 156.85
Maximum: 471
Range: 463
Interquartile range (IQR): 48.5

Descriptive statistics

Standard deviation: 71.969764
Coefficient of variation (CV): 1.1541014
Kurtosis: 19.259367
Mean: 62.36
Median Absolute Deviation (MAD): 20
Skewness: 3.8996276
Sum: 6236
Variance: 5179.6469
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
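
To make "fixed size bins" concrete, the sketch below splits the interval from the minimum to the maximum into 50 equal-width bins and counts the values in each. It is a minimal illustration under the assumption that the maximum is placed in the last bin, not the profiler's actual code; the function name fixed_width_histogram is hypothetical, and the four input values are a small selection of ALT_SRC values from this section.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width bins spanning min(values)..max(values).

    Assumes at least two distinct values, so that the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Minimum (8), two common values (22, 41), and maximum (471) of ALT_SRC.
values = [8, 22, 41, 471]
counts, width = fixed_width_histogram(values)
print(round(width, 2))  # 9.26, i.e. the range 463 divided by 50
```

With a range of 463 and 50 bins, each bin is 9.26 units wide, so the two largest ALT_SRC values (471 and 467) fall into the last bin while most observations crowd into the first few.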

Common values

Value              Count  Frequency (%)
22                 6      6.0%
13                 4      4.0%
40                 4      4.0%
20                 4      4.0%
25                 4      4.0%
33                 2      2.0%
119                2      2.0%
56                 2      2.0%
111                2      2.0%
41                 2      2.0%
Other values (56)  68     68.0%

Minimum 10 values

Value  Count  Frequency (%)
8      1      1.0%
11     1      1.0%
13     4      4.0%
14     2      2.0%
15     1      1.0%
16     1      1.0%
17     2      2.0%
18     1      1.0%
19     2      2.0%
20     4      4.0%

Maximum 10 values

Value  Count  Frequency (%)
471    1      1.0%
467    1      1.0%
200    1      1.0%
193    1      1.0%
173    1      1.0%
156    1      1.0%
142    1      1.0%
137    1      1.0%
129    1      1.0%
126    1      1.0%

ALP_DCT
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

ALP_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 68
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.97
Minimum: 28
Maximum: 263
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 28
5-th percentile: 38.85
Q1: 57
median: 75
Q3: 95.25
95-th percentile: 195.5
Maximum: 263
Range: 235
Interquartile range (IQR): 38.25

Descriptive statistics

Standard deviation: 45.475591
Coefficient of variation (CV): 0.52897047
Kurtosis: 3.4461141
Mean: 85.97
Median Absolute Deviation (MAD): 20
Skewness: 1.8030196
Sum: 8597
Variance: 2068.0294
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
59                 4      4.0%
75                 4      4.0%
57                 3      3.0%
64                 3      3.0%
50                 3      3.0%
94                 3      3.0%
55                 3      3.0%
54                 2      2.0%
112                2      2.0%
87                 2      2.0%
Other values (58)  71     71.0%

Minimum 10 values

Value  Count  Frequency (%)
28     1      1.0%
32     1      1.0%
33     1      1.0%
34     1      1.0%
36     1      1.0%
39     1      1.0%
41     1      1.0%
43     1      1.0%
46     2      2.0%
48     2      2.0%

Maximum 10 values

Value  Count  Frequency (%)
263    1      1.0%
232    1      1.0%
216    1      1.0%
211    1      1.0%
205    1      1.0%
195    1      1.0%
187    1      1.0%
165    1      1.0%
162    1      1.0%
145    1      1.0%

GTP_DCT
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

GTP_SRC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 89
Distinct (%): 89.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 280.804
Minimum: 14
Maximum: 1928
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 14
5-th percentile: 24
Q1: 69.75
median: 140.5
Q3: 328.25
95-th percentile: 889
Maximum: 1928
Range: 1914
Interquartile range (IQR): 258.5

Descriptive statistics

Standard deviation: 320.02431
Coefficient of variation (CV): 1.1396715
Kurtosis: 6.674795
Mean: 280.804
Median Absolute Deviation (MAD): 99.85
Skewness: 2.2227236
Sum: 28080.4
Variance: 102415.56
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
108.0              3      3.0%
27.0               2      2.0%
103.0              2      2.0%
24.0               2      2.0%
95.0               2      2.0%
307.0              2      2.0%
97.0               2      2.0%
54.0               2      2.0%
82.0               2      2.0%
87.0               2      2.0%
Other values (79)  79     79.0%

Minimum 10 values

Value  Count  Frequency (%)
14.0   1      1.0%
18.0   1      1.0%
19.0   1      1.0%
20.0   1      1.0%
24.0   2      2.0%
26.0   1      1.0%
27.0   2      2.0%
35.0   1      1.0%
39.0   1      1.0%
40.0   1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
1928.0  1      1.0%
1104.0  1      1.0%
1088.0  1      1.0%
1083.0  1      1.0%
965.0   1      1.0%
885.0   1      1.0%
811.0   1      1.0%
792.0   1      1.0%
786.0   1      1.0%
732.0   1      1.0%

Interactions

[Figures: 16 interaction plots]

Correlations

         RID    AST_SRC  ALT_SRC  ALP_SRC  GTP_SRC
RID      1.000  1.000    1.000    1.000    1.000
AST_SRC  1.000  1.000    0.903    0.521    0.543
ALT_SRC  1.000  0.903    1.000    0.000    0.488
ALP_SRC  1.000  0.521    0.000    1.000    0.696
GTP_SRC  1.000  0.543    0.488    0.696    1.000

         AST_SRC  ALT_SRC  ALP_SRC  GTP_SRC
AST_SRC  1.000    0.611    0.584    0.719
ALT_SRC  0.611    1.000    0.282    0.441
ALP_SRC  0.584    0.282    1.000    0.493
GTP_SRC  0.719    0.441    0.493    1.000
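
This text version does not state which coefficient each matrix uses. Purely to illustrate how a pairwise correlation is computed, the sketch below implements the Pearson coefficient with the standard library and applies it to the AST_SRC and ALT_SRC values of the first ten rows in the Sample section. Because it uses ten rows rather than all 100, and possibly a different coefficient from the report, its result is not expected to match the matrices above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# AST_SRC and ALT_SRC for the first ten rows shown under "Sample".
ast = [39, 22, 138, 75, 19, 19, 43, 71, 47, 21]
alt = [41, 22, 61, 70, 20, 11, 117, 173, 29, 34]

r = pearson(ast, alt)
print(round(r, 3))
```

Both variables are strongly right-skewed in this dataset, so a rank-based coefficient may describe their relationship differently from Pearson; that choice is left open here.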

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    RID       AST_DCT              AST_SRC  ALT_DCT              ALT_SRC  ALP_DCT              ALP_SRC  GTP_DCT              GTP_SRC
0   R0000001  2011-05-15 00:00:00  39       2011-05-15 00:00:00  41       2011-05-15 00:00:00  57       2011-05-15 00:00:00  27.0
1   R0000002  2011-09-23 00:00:00  22       2011-09-23 00:00:00  22       2011-09-23 00:00:00  39       2011-09-23 00:00:00  41.0
2   R0000003  2016-01-05 00:00:00  138      2016-01-05 00:00:00  61       2016-01-05 00:00:00  211      2016-01-05 00:00:00  594.0
3   R0000005  2017-10-24 00:00:00  75       2017-10-24 00:00:00  70       2017-10-24 00:00:00  65       2017-10-24 00:00:00  246.0
4   R0000007  2010-06-04 00:00:00  19       2010-06-04 00:00:00  20       2010-06-04 00:00:00  55       2010-06-04 00:00:00  18.0
5   R0000008  2012-11-29 00:00:00  19       2012-11-29 00:00:00  11       2012-11-29 00:00:00  59       2012-11-29 00:00:00  24.0
6   R0000009  2012-12-27 00:00:00  43       2012-12-27 00:00:00  117      2012-12-27 00:00:00  54       2012-12-27 00:00:00  118.0
7   R0000011  2009-05-27 00:00:00  71       2009-05-27 00:00:00  173      2009-05-27 00:00:00  79       2009-05-27 00:00:00  307.0
8   R0000012  2016-06-27 00:00:00  47       2016-06-27 00:00:00  29       2016-06-27 00:00:00  124      2016-06-27 00:00:00  236.0
9   R0000013  2016-12-14 00:00:00  21       2016-12-14 00:00:00  34       2016-12-14 00:00:00  55       2016-12-14 00:00:00  97.0

Last rows

    RID       AST_DCT              AST_SRC  ALT_DCT              ALT_SRC  ALP_DCT              ALP_SRC  GTP_DCT              GTP_SRC
90  R0000109  2012-09-08 00:00:00  1085     2012-09-08 00:00:00  467      2012-09-08 00:00:00  88       2012-09-08 00:00:00  467.0
91  R0000110  2016-11-15 00:00:00  26       2016-11-15 00:00:00  56       2016-11-15 00:00:00  82       2016-11-15 00:00:00  259.0
92  R0000111  2014-11-26 00:00:00  48       2014-11-26 00:00:00  13       2014-11-26 00:00:00  96       2014-11-26 00:00:00  432.0
93  R0000112  2013-12-11 00:00:00  187      2013-12-11 00:00:00  110      2013-12-11 00:00:00  232      2013-12-11 00:00:00  207.0
94  R0000113  2018-01-20 00:00:00  102      2018-01-20 00:00:00  49       2018-01-20 00:00:00  117      2018-01-20 00:00:00  1104.0
95  R0000114  2010-08-19 00:00:00  92       2010-08-19 00:00:00  126      2010-08-19 00:00:00  104      2010-08-19 00:00:00  732.0
96  R0000115  2012-10-17 00:00:00  43       2012-10-17 00:00:00  20       2012-10-17 00:00:00  48       2012-10-17 00:00:00  66.0
97  R0000116  2016-03-29 00:00:00  236      2016-03-29 00:00:00  193      2016-03-29 00:00:00  78       2016-03-29 00:00:00  1088.0
98  R0000118  2015-10-27 00:00:00  78       2015-10-27 00:00:00  39       2015-10-27 00:00:00  145      2015-10-27 00:00:00  811.0
99  R0000119  2018-06-04           27       2018-06-04           43       2018-06-04           49       2018-06-04           108.0