Overview

Dataset statistics

Number of variables: 27
Number of observations: 30
Missing cells: 53
Missing cells (%): 6.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 233.4 B
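
The missing-cell percentage can be checked against the per-variable missing counts listed under Alerts. The following is a minimal sketch that uses only figures printed in this report; the variable names in the code are illustrative.

```python
# Per-variable missing counts, copied from the Alerts section of this report.
missing_per_variable = {
    "AGE_DONOR": 1, "HCTCI": 2, "CD3_INFU": 1, "ANC_ENG_YN": 1,
    "PLT_ENG_YN": 3, "AGVHD_YN": 1, "CGVHD_YN": 1,
    "CGVHD_SITE": 19, "CGVHD_ADD_DRUG": 24,
}

n_variables = 27
n_observations = 30

total_cells = n_variables * n_observations           # every cell in the table
missing_cells = sum(missing_per_variable.values())   # cells flagged as missing
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

Placeholder strings such as <NA> inside categorical columns are counted by the report as ordinary categories, so they do not contribute to this total.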

Variable types

Numeric: 7
Categorical: 13
Boolean: 5
Text: 2

Dataset

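Description: Causes of death in hematopoietic stem cell transplantation patients. The data are drawn from the hematologic disease registry dataset and include the cause-of-death classification entered directly by clinicians, donor data, and whether chronic GVHD occurred and which organs it affected. Blood type was extracted from the donor blood-type checkboxes and encoded as a number (A+ = 11, A- = 10, AB+ = 31, AB- = 30, B+ = 21, B- = 20, O+ = 41, O- = 40).
Author: The Catholic University of Korea, Seoul St. Mary's Hospital
URL: http://cmcdata.net/data/dataset/stem-cell-transplantation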
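
The numeric blood-type encoding in the description can be decoded with a small lookup table. This is a sketch only: the helper name decode_abo is hypothetical, and returning None for unrecognised values (such as the Q and <NA> entries that appear in the ABO columns below) is an assumption about how such values should be handled.

```python
# Code-to-label mapping as stated in the dataset description.
ABO_CODES = {
    11: "A+", 10: "A-",
    31: "AB+", 30: "AB-",
    21: "B+", 20: "B-",
    41: "O+", 40: "O-",
}

def decode_abo(raw):
    """Return the blood-type label for a raw code, or None if it is not a known code."""
    try:
        return ABO_CODES.get(int(str(raw).strip()))
    except ValueError:
        # Non-numeric entries such as "Q" or "<NA>" end up here.
        return None

for raw in ["41", 11, "Q", "<NA>"]:
    print(raw, "->", decode_abo(raw))
```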

Alerts

TRNSPLANT_YN has constant value "True" [Constant]
ANC_ENG_YN has constant value "True" [Constant]
CELL_SOURCE is highly imbalanced (78.9%) [Imbalance]
ECOG is highly imbalanced (68.6%) [Imbalance]
AGE_DONOR has 1 (3.3%) missing values [Missing]
HCTCI has 2 (6.7%) missing values [Missing]
CD3_INFU has 1 (3.3%) missing values [Missing]
ANC_ENG_YN has 1 (3.3%) missing values [Missing]
PLT_ENG_YN has 3 (10.0%) missing values [Missing]
AGVHD_YN has 1 (3.3%) missing values [Missing]
CGVHD_YN has 1 (3.3%) missing values [Missing]
CGVHD_SITE has 19 (63.3%) missing values [Missing]
CGVHD_ADD_DRUG has 24 (80.0%) missing values [Missing]
PID has unique values [Unique]
HCTCI has 13 (43.3%) zeros [Zeros]
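
The two imbalance percentages are consistent with a score of one minus the normalised Shannon entropy of the category counts. The sketch below reproduces both figures from the counts in the CELL_SOURCE and ECOG sections; treating this as the formula the profiler uses is an assumption based on that agreement, and imbalance_score is an illustrative name.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = perfectly balanced, 1 = one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

cell_source = [29, 1]        # PBSC, BM_PBSC
ecog = [27, 1, 1, 1]         # 1, <NA>, 11, 0

print(f"CELL_SOURCE: {100 * imbalance_score(cell_source):.1f}%")
print(f"ECOG:        {100 * imbalance_score(ecog):.1f}%")
```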

Reproduction

Analysis started: 2023-10-08 18:57:47.236058
Analysis finished: 2023-10-08 18:57:48.024516
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
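
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-10-08 18:57:47.236058")
finished = datetime.fromisoformat("2023-10-08 18:57:48.024516")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```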

Variables

PID
Real number (ℝ)

UNIQUE 

Distinct30
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.5
Minimum1
Maximum30
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum1
5-th percentile2.45
Q18.25
median15.5
Q322.75
95-th percentile28.55
Maximum30
Range29
Interquartile range (IQR)14.5

Descriptive statistics

Standard deviation8.8034084
Coefficient of variation (CV)0.56796183
Kurtosis-1.2
Mean15.5
Median Absolute Deviation (MAD)7.5
Skewness0
Sum465
Variance77.5
MonotonicityStrictly increasing
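
Because PID is simply the sequence 1 to 30, its statistics can be recomputed by hand. The sketch below assumes the report uses the sample (n - 1) variance and linearly interpolated percentiles, which is what the printed values imply; the percentile helper is illustrative.

```python
import statistics

pid = list(range(1, 31))  # PID runs from 1 to 30 without gaps

mean = statistics.mean(pid)
variance = statistics.variance(pid)   # sample variance, divides by n - 1
std = statistics.stdev(pid)
cv = std / mean                       # coefficient of variation

def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 1] on already sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

q1, q3 = percentile(pid, 0.25), percentile(pid, 0.75)
print(mean, variance, round(std, 7), round(cv, 8))
print(percentile(pid, 0.05), q1, q3, percentile(pid, 0.95), q3 - q1)
```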
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
1 1
 
3.3%
17 1
 
3.3%
30 1
 
3.3%
29 1
 
3.3%
28 1
 
3.3%
27 1
 
3.3%
26 1
 
3.3%
25 1
 
3.3%
24 1
 
3.3%
23 1
 
3.3%
Other values (20) 20
66.7%
ValueCountFrequency (%)
1 1
3.3%
2 1
3.3%
3 1
3.3%
4 1
3.3%
5 1
3.3%
6 1
3.3%
7 1
3.3%
8 1
3.3%
9 1
3.3%
10 1
3.3%
ValueCountFrequency (%)
30 1
3.3%
29 1
3.3%
28 1
3.3%
27 1
3.3%
26 1
3.3%
25 1
3.3%
24 1
3.3%
23 1
3.3%
22 1
3.3%
21 1
3.3%

SEX_PATIENT
Categorical

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
1
19 
2
11 

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row1
2nd row2
3rd row2
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 19
63.3%
2 11
36.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 19
63.3%
2 11
36.7%

AGE_PATIENT
Real number (ℝ)

Distinct22
Distinct (%)73.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean54.466667
Minimum25
Maximum90
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum25
5-th percentile29.35
Q144.5
median58.5
Q364
95-th percentile74.55
Maximum90
Range65
Interquartile range (IQR)19.5

Descriptive statistics

Standard deviation15.90648
Coefficient of variation (CV)0.29204063
Kurtosis-0.39072746
Mean54.466667
Median Absolute Deviation (MAD)10
Skewness-0.097505464
Sum1634
Variance253.01609
MonotonicityNot monotonic
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
59 3
 
10.0%
61 2
 
6.7%
60 2
 
6.7%
31 2
 
6.7%
52 2
 
6.7%
67 2
 
6.7%
74 2
 
6.7%
65 1
 
3.3%
58 1
 
3.3%
28 1
 
3.3%
Other values (12) 12
40.0%
ValueCountFrequency (%)
25 1
3.3%
28 1
3.3%
31 2
6.7%
34 1
3.3%
36 1
3.3%
39 1
3.3%
44 1
3.3%
46 1
3.3%
47 1
3.3%
51 1
3.3%
ValueCountFrequency (%)
90 1
 
3.3%
75 1
 
3.3%
74 2
6.7%
72 1
 
3.3%
67 2
6.7%
65 1
 
3.3%
61 2
6.7%
60 2
6.7%
59 3
10.0%
58 1
 
3.3%

TRNSPLANT_YN
Boolean

CONSTANT 

Distinct1
Distinct (%)3.3%
Missing0
Missing (%)0.0%
Memory size162.0 B
True
30 
ValueCountFrequency (%)
True 30
100.0%

COD
Real number (ℝ)

Distinct6
Distinct (%)20.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum1
5-th percentile1
Q11.5
median4
Q34
95-th percentile5.55
Maximum7
Range6
Interquartile range (IQR)2.5

Descriptive statistics

Standard deviation1.6315848
Coefficient of variation (CV)0.47987788
Kurtosis-0.37781029
Mean3.4
Median Absolute Deviation (MAD)0
Skewness-0.29397473
Sum102
Variance2.662069
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
4 17
56.7%
1 8
26.7%
5 2
 
6.7%
3 1
 
3.3%
7 1
 
3.3%
6 1
 
3.3%
ValueCountFrequency (%)
1 8
26.7%
3 1
 
3.3%
4 17
56.7%
5 2
 
6.7%
6 1
 
3.3%
7 1
 
3.3%
ValueCountFrequency (%)
7 1
 
3.3%
6 1
 
3.3%
5 2
 
6.7%
4 17
56.7%
3 1
 
3.3%
1 8
26.7%

ABO_PATIENT
Categorical

Distinct5
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
41
21
11
31
Q

Length

Max length2
Median length2
Mean length1.9666667
Min length1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row31
2nd row41
3rd rowQ
4th row21
5th row41

Common Values

ValueCountFrequency (%)
41 9
30.0%
21 8
26.7%
11 8
26.7%
31 4
13.3%
Q 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
41 9
30.0%
21 8
26.7%
11 8
26.7%
31 4
13.3%
q 1
 
3.3%

DONOR_TYPE
Categorical

Distinct4
Distinct (%)13.3%
Missing0
Missing (%)0.0%
Memory size372.0 B
URD
14 
HAPLO
SIB
AUTO
 
1

Length

Max length5
Median length3
Mean length3.6333333
Min length3

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st rowURD
2nd rowAUTO
3rd rowURD
4th rowURD
5th rowURD

Common Values

ValueCountFrequency (%)
URD 14
46.7%
HAPLO 9
30.0%
SIB 6
20.0%
AUTO 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
urd 14
46.7%
haplo 9
30.0%
sib 6
20.0%
auto 1
 
3.3%

HLA_MATCH
Categorical

Distinct4
Distinct (%)13.3%
Missing0
Missing (%)0.0%
Memory size372.0 B
Match
14 
HAPLO
Mismatch
AUTO
 
1

Length

Max length8
Median length5
Mean length5.5666667
Min length4

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st rowMatch
2nd rowAUTO
3rd rowMatch
4th rowMismatch
5th rowMismatch

Common Values

ValueCountFrequency (%)
Match 14
46.7%
HAPLO 9
30.0%
Mismatch 6
20.0%
AUTO 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
match 14
46.7%
haplo 9
30.0%
mismatch 6
20.0%
auto 1
 
3.3%

CELL_SOURCE
Categorical

IMBALANCE 

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
PBSC
29 
BM_PBSC
 
1

Length

Max length7
Median length4
Mean length4.1
Min length4

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st rowPBSC
2nd rowBM_PBSC
3rd rowPBSC
4th rowPBSC
5th rowPBSC

Common Values

ValueCountFrequency (%)
PBSC 29
96.7%
BM_PBSC 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
pbsc 29
96.7%
bm_pbsc 1
 
3.3%

SEX_DONOR
Categorical

Distinct3
Distinct (%)10.0%
Missing0
Missing (%)0.0%
Memory size372.0 B
1
20 
2
<NA>
 
1

Length

Max length4
Median length1
Mean length1.1
Min length1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row1
2nd row<NA>
3rd row2
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 20
66.7%
2 9
30.0%
<NA> 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 20
66.7%
2 9
30.0%
na 1
 
3.3%

AGE_DONOR
Real number (ℝ)

MISSING 

Distinct23
Distinct (%)79.3%
Missing1
Missing (%)3.3%
Infinite0
Infinite (%)0.0%
Mean36.827586
Minimum9
Maximum63
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum9
5-th percentile17.6
Q128
median36
Q345
95-th percentile56.8
Maximum63
Range54
Interquartile range (IQR)17

Descriptive statistics

Standard deviation12.88429
Coefficient of variation (CV)0.34985431
Kurtosis-0.26965707
Mean36.827586
Median Absolute Deviation (MAD)9
Skewness-0.010449526
Sum1068
Variance166.00493
MonotonicityNot monotonic
Histogram with fixed size bins (bins=23)
ValueCountFrequency (%)
36 3
 
10.0%
25 2
 
6.7%
33 2
 
6.7%
35 2
 
6.7%
45 2
 
6.7%
26 1
 
3.3%
16 1
 
3.3%
28 1
 
3.3%
49 1
 
3.3%
39 1
 
3.3%
Other values (13) 13
43.3%
ValueCountFrequency (%)
9 1
3.3%
16 1
3.3%
20 1
3.3%
21 1
3.3%
25 2
6.7%
26 1
3.3%
28 1
3.3%
31 1
3.3%
33 2
6.7%
34 1
3.3%
ValueCountFrequency (%)
63 1
3.3%
58 1
3.3%
55 1
3.3%
54 1
3.3%
49 1
3.3%
48 1
3.3%
47 1
3.3%
45 2
6.7%
44 1
3.3%
42 1
3.3%

ABO_DONOR
Categorical

Distinct5
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
11
11 
21
41
31
<NA>
 
1

Length

Max length4
Median length2
Mean length2.0666667
Min length2

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row11
2nd row<NA>
3rd row11
4th row31
5th row21

Common Values

ValueCountFrequency (%)
11 11
36.7%
21 7
23.3%
41 6
20.0%
31 5
16.7%
<NA> 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
11 11
36.7%
21 7
23.3%
41 6
20.0%
31 5
16.7%
na 1
 
3.3%

ABO_MATCHING
Categorical

Distinct5
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
Match
15 
<NA>
Major mismatch
Bidirectional mismatch
Minor mismatch
 
1

Length

Max length22
Median length18
Mean length7.0333333
Min length4

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th rowMajor mismatch

Common Values

ValueCountFrequency (%)
Match 15
50.0%
<NA> 9
30.0%
Major mismatch 3
 
10.0%
Bidirectional mismatch 2
 
6.7%
Minor mismatch 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
match 15
41.7%
na 9
25.0%
mismatch 6
 
16.7%
major 3
 
8.3%
bidirectional 2
 
5.6%
minor 1
 
2.8%
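
The plot table above counts words rather than whole category values, which is why its percentages differ from the Common Values table: multi-word categories such as "Major mismatch" contribute to both "major" and "mismatch", so the denominator is 36 tokens instead of 30 rows. The sketch below reproduces those figures; the tokenisation rule (lowercase, split on whitespace, strip surrounding punctuation) is an assumption inferred from the output.

```python
import string
from collections import Counter

# Category counts copied from the Common Values table for ABO_MATCHING.
value_counts = {
    "Match": 15,
    "<NA>": 9,
    "Major mismatch": 3,
    "Bidirectional mismatch": 2,
    "Minor mismatch": 1,
}

def tokenize(value):
    """Lowercase, split on whitespace, and strip punctuation around each token."""
    return [tok.strip(string.punctuation) for tok in value.lower().split()]

word_counts = Counter()
for value, count in value_counts.items():
    for token in tokenize(value):
        word_counts[token] += count

total_tokens = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<14} {count:>2}  {100 * count / total_tokens:.1f}%")
```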

HCTCI
Real number (ℝ)

MISSING  ZEROS 

Distinct6
Distinct (%)21.4%
Missing2
Missing (%)6.7%
Infinite0
Infinite (%)0.0%
Mean1.6785714
Minimum0
Maximum6
Zeros13
Zeros (%)43.3%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum0
5-th percentile0
Q10
median1.5
Q33
95-th percentile4
Maximum6
Range6
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.8268283
Coefficient of variation (CV)1.0883232
Kurtosis-0.83216583
Mean1.6785714
Median Absolute Deviation (MAD)1.5
Skewness0.59416062
Sum47
Variance3.3373016
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
0 13
43.3%
4 5
 
16.7%
2 4
 
13.3%
3 4
 
13.3%
6 1
 
3.3%
1 1
 
3.3%
(Missing) 2
 
6.7%
ValueCountFrequency (%)
0 13
43.3%
1 1
 
3.3%
2 4
 
13.3%
3 4
 
13.3%
4 5
 
16.7%
6 1
 
3.3%
ValueCountFrequency (%)
6 1
 
3.3%
4 5
 
16.7%
3 4
 
13.3%
2 4
 
13.3%
1 1
 
3.3%
0 13
43.3%

ECOG
Categorical

IMBALANCE 

Distinct4
Distinct (%)13.3%
Missing0
Missing (%)0.0%
Memory size372.0 B
1
27 
<NA>
 
1
11
 
1
0
 
1

Length

Max length4
Median length1
Mean length1.1333333
Min length1

Unique

Unique: 3
Unique (%): 10.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 27
90.0%
<NA> 1
 
3.3%
11 1
 
3.3%
0 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 27
90.0%
na 1
 
3.3%
11 1
 
3.3%
0 1
 
3.3%

KPS
Categorical

Distinct3
Distinct (%)10.0%
Missing0
Missing (%)0.0%
Memory size372.0 B
90
18 
80
11 
<NA>
 
1

Length

Max length4
Median length2
Mean length2.0666667
Min length2

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row90
2nd row90
3rd row90
4th row90
5th row90

Common Values

ValueCountFrequency (%)
90 18
60.0%
80 11
36.7%
<NA> 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
90 18
60.0%
80 11
36.7%
na 1
 
3.3%

CD34_INFU
Real number (ℝ)

Distinct10
Distinct (%)33.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.2666667
Minimum1
Maximum21
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum1
5-th percentile3
Q15
median6
Q36.75
95-th percentile12.2
Maximum21
Range20
Interquartile range (IQR)1.75

Descriptive statistics

Standard deviation3.6096932
Coefficient of variation (CV)0.57601487
Kurtosis9.6756511
Mean6.2666667
Median Absolute Deviation (MAD)1
Skewness2.6952094
Sum188
Variance13.029885
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
6 9
30.0%
5 7
23.3%
7 4
13.3%
3 3
 
10.0%
4 2
 
6.7%
1 1
 
3.3%
8 1
 
3.3%
14 1
 
3.3%
10 1
 
3.3%
21 1
 
3.3%
ValueCountFrequency (%)
1 1
 
3.3%
3 3
 
10.0%
4 2
 
6.7%
5 7
23.3%
6 9
30.0%
7 4
13.3%
8 1
 
3.3%
10 1
 
3.3%
14 1
 
3.3%
21 1
 
3.3%
ValueCountFrequency (%)
21 1
 
3.3%
14 1
 
3.3%
10 1
 
3.3%
8 1
 
3.3%
7 4
13.3%
6 9
30.0%
5 7
23.3%
4 2
 
6.7%
3 3
 
10.0%
1 1
 
3.3%

CD3_INFU
Real number (ℝ)

MISSING 

Distinct29
Distinct (%)100.0%
Missing1
Missing (%)3.3%
Infinite0
Infinite (%)0.0%
Mean334.65517
Minimum121
Maximum582
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum121
5-th percentile197
Q1259
median315
Q3405
95-th percentile557.2
Maximum582
Range461
Interquartile range (IQR)146

Descriptive statistics

Standard deviation111.86065
Coefficient of variation (CV)0.33425646
Kurtosis0.20380756
Mean334.65517
Median Absolute Deviation (MAD)60
Skewness0.65238365
Sum9705
Variance12512.805
MonotonicityNot monotonic
Histogram with fixed size bins (bins=29)
ValueCountFrequency (%)
223 1
 
3.3%
306 1
 
3.3%
189 1
 
3.3%
268 1
 
3.3%
433 1
 
3.3%
435 1
 
3.3%
347 1
 
3.3%
316 1
 
3.3%
405 1
 
3.3%
538 1
 
3.3%
Other values (19) 19
63.3%
ValueCountFrequency (%)
121 1
3.3%
189 1
3.3%
209 1
3.3%
223 1
3.3%
239 1
3.3%
241 1
3.3%
255 1
3.3%
259 1
3.3%
268 1
3.3%
291 1
3.3%
ValueCountFrequency (%)
582 1
3.3%
570 1
3.3%
538 1
3.3%
491 1
3.3%
435 1
3.3%
433 1
3.3%
411 1
3.3%
405 1
3.3%
366 1
3.3%
347 1
3.3%

ANC_ENG_YN
Boolean

CONSTANT  MISSING 

Distinct1
Distinct (%)3.4%
Missing1
Missing (%)3.3%
Memory size192.0 B
True
29 
(Missing)
 
1
ValueCountFrequency (%)
True 29
96.7%
(Missing) 1
 
3.3%

PLT_ENG_YN
Boolean

MISSING 

Distinct2
Distinct (%)7.4%
Missing3
Missing (%)10.0%
Memory size192.0 B
True
24 
False
(Missing)
ValueCountFrequency (%)
True 24
80.0%
False 3
 
10.0%
(Missing) 3
 
10.0%

RELAPSE_YN
Categorical

Distinct3
Distinct (%)10.0%
Missing0
Missing (%)0.0%
Memory size372.0 B
N
15 
Persistent
10 
Y

Length

Max length10
Median length1
Mean length4
Min length1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowY
2nd rowY
3rd rowN
4th rowPersistent
5th rowPersistent

Common Values

ValueCountFrequency (%)
N 15
50.0%
Persistent 10
33.3%
Y 5
 
16.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 15
50.0%
persistent 10
33.3%
y 5
 
16.7%

AGVHD_YN
Boolean

MISSING 

Distinct2
Distinct (%)6.9%
Missing1
Missing (%)3.3%
Memory size192.0 B
False
17 
True
12 
(Missing)
 
1
ValueCountFrequency (%)
False 17
56.7%
True 12
40.0%
(Missing) 1
 
3.3%

AGVHD_MAX_GR
Categorical

Distinct5
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
0
17 
2
1
3
<NA>
 
1

Length

Max length4
Median length1
Mean length1.1
Min length1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row0
2nd row<NA>
3rd row0
4th row0
5th row2

Common Values

ValueCountFrequency (%)
0 17
56.7%
2 6
 
20.0%
1 4
 
13.3%
3 2
 
6.7%
<NA> 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 17
56.7%
2 6
 
20.0%
1 4
 
13.3%
3 2
 
6.7%
na 1
 
3.3%

AGVHD_ADD_DRUG
Categorical

Distinct4
Distinct (%)13.3%
Missing0
Missing (%)0.0%
Memory size372.0 B
<NA>
23 
jakavi,mmf
jakavi
 
2
mmf
 
1

Length

Max length10
Median length4
Mean length4.9
Min length3

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th rowjakavi,mmf

Common Values

ValueCountFrequency (%)
<NA> 23
76.7%
jakavi,mmf 4
 
13.3%
jakavi 2
 
6.7%
mmf 1
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 23
76.7%
jakavi,mmf 4
 
13.3%
jakavi 2
 
6.7%
mmf 1
 
3.3%

CGVHD_YN
Boolean

MISSING 

Distinct2
Distinct (%)6.9%
Missing1
Missing (%)3.3%
Memory size192.0 B
False
18 
True
11 
(Missing)
 
1
ValueCountFrequency (%)
False 18
60.0%
True 11
36.7%
(Missing) 1
 
3.3%

CGVHD_SITE
Text

MISSING 

Distinct9
Distinct (%)81.8%
Missing19
Missing (%)63.3%
Memory size372.0 B

Length

Max length26
Median length22
Mean length13.181818
Min length4

Characters and Unicode

Total characters: 145
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 72.7%

Sample

1st rowmouth
2nd rowskin
3rd rowskin,nail,hematopietic
4th rowskin,nail,eyes,liver
5th rowskin
ValueCountFrequency (%)
skin 3
27.3%
mouth 1
 
9.1%
skin,nail,hematopietic 1
 
9.1%
skin,nail,eyes,liver 1
 
9.1%
eyes 1
 
9.1%
skin,nail,eyes,gi,liver 1
 
9.1%
skin,nail,gi,liver 1
 
9.1%
skin,nail,mouth,eyes,liver 1
 
9.1%
skin,nail,liver 1
 
9.1%
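
Each CGVHD_SITE value is a comma-separated list of affected organs, so per-organ frequencies are more informative than counts of whole strings. The sketch below splits the values from the table above; spellings are kept as they appear in the data (including "hematopietic"), and lowercasing is applied because the sample rows show "GI" in upper case.

```python
from collections import Counter

# Value counts copied from the CGVHD_SITE frequency table (11 non-missing rows).
site_counts = {
    "skin": 3,
    "mouth": 1,
    "skin,nail,hematopietic": 1,
    "skin,nail,eyes,liver": 1,
    "eyes": 1,
    "skin,nail,eyes,gi,liver": 1,
    "skin,nail,gi,liver": 1,
    "skin,nail,mouth,eyes,liver": 1,
    "skin,nail,liver": 1,
}

organ_counts = Counter()
for sites, n_patients in site_counts.items():
    for organ in sites.split(","):
        organ_counts[organ.strip().lower()] += n_patients

n_patients_with_site = sum(site_counts.values())
for organ, count in organ_counts.most_common():
    print(f"{organ:<13} {count} of {n_patients_with_site} patients")
```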

Most occurring characters

ValueCountFrequency (%)
i 22
15.2%
, 18
12.4%
n 15
10.3%
e 15
10.3%
s 13
9.0%
l 11
7.6%
k 9
 
6.2%
a 7
 
4.8%
v 5
 
3.4%
r 5
 
3.4%
Other values (10) 25
17.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 123
84.8%
Other Punctuation 18
 
12.4%
Uppercase Letter 4
 
2.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 22
17.9%
n 15
12.2%
e 15
12.2%
s 13
10.6%
l 11
8.9%
k 9
7.3%
a 7
 
5.7%
v 5
 
4.1%
r 5
 
4.1%
t 4
 
3.3%
Other values (7) 17
13.8%
Uppercase Letter
ValueCountFrequency (%)
G 2
50.0%
I 2
50.0%
Other Punctuation
ValueCountFrequency (%)
, 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 127
87.6%
Common 18
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 22
17.3%
n 15
11.8%
e 15
11.8%
s 13
10.2%
l 11
8.7%
k 9
7.1%
a 7
 
5.5%
v 5
 
3.9%
r 5
 
3.9%
t 4
 
3.1%
Other values (9) 21
16.5%
Common
ValueCountFrequency (%)
, 18
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 145
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 22
15.2%
, 18
12.4%
n 15
10.3%
e 15
10.3%
s 13
9.0%
l 11
7.6%
k 9
 
6.2%
a 7
 
4.8%
v 5
 
3.4%
r 5
 
3.4%
Other values (10) 25
17.2%

CGVHD_ADD_DRUG
Text

MISSING 

Distinct3
Distinct (%)50.0%
Missing24
Missing (%)80.0%
Memory size372.0 B

Length

Max length10
Median length8
Mean length7.5
Min length3

Characters and Unicode

Total characters: 45
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 16.7%

Sample

1st rowjakavi,mmf
2nd rowjakavi
3rd rowjakavi,mmf
4th rowjakavi
5th rowmmf
ValueCountFrequency (%)
jakavi,mmf 3
50.0%
jakavi 2
33.3%
mmf 1
 
16.7%

Most occurring characters

ValueCountFrequency (%)
a 10
22.2%
m 8
17.8%
j 5
11.1%
k 5
11.1%
v 5
11.1%
i 5
11.1%
f 4
 
8.9%
, 3
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 42
93.3%
Other Punctuation 3
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
23.8%
m 8
19.0%
j 5
11.9%
k 5
11.9%
v 5
11.9%
i 5
11.9%
f 4
 
9.5%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 42
93.3%
Common 3
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
23.8%
m 8
19.0%
j 5
11.9%
k 5
11.9%
v 5
11.9%
i 5
11.9%
f 4
 
9.5%
Common
ValueCountFrequency (%)
, 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 10
22.2%
m 8
17.8%
j 5
11.1%
k 5
11.1%
v 5
11.1%
i 5
11.1%
f 4
 
8.9%
, 3
 
6.7%

Sample

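(index) | PID | SEX_PATIENT | AGE_PATIENT | TRNSPLANT_YN | COD | ABO_PATIENT | DONOR_TYPE | HLA_MATCH | CELL_SOURCE | SEX_DONOR | AGE_DONOR | ABO_DONOR | ABO_MATCHING | HCTCI | ECOG | KPS | CD34_INFU | CD3_INFU | ANC_ENG_YN | PLT_ENG_YN | RELAPSE_YN | AGVHD_YN | AGVHD_MAX_GR | AGVHD_ADD_DRUG | CGVHD_YN | CGVHD_SITE | CGVHD_ADD_DRUG
0 | 1 | 1 | 65 | Y | 4 | 31 | URD | Match | PBSC | 1 | 26 | 11 | <NA> | 0 | 1 | 90 | 6 | 223 | Y | Y | Y | N | 0 | <NA> | N | <NA> | <NA>
1 | 2 | 2 | 67 | Y | 4 | 41 | AUTO | AUTO | BM_PBSC | <NA> | <NA> | <NA> | <NA> | 0 | 1 | 90 | 1 | <NA> | Y | Y | Y | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 3 | 2 | 44 | Y | 5 | Q | URD | Match | PBSC | 2 | 20 | 11 | <NA> | 2 | 1 | 90 | 6 | 306 | Y | N | N | N | 0 | <NA> | N | <NA> | <NA>
3 | 4 | 1 | 60 | Y | 1 | 21 | URD | Mismatch | PBSC | 1 | 25 | 31 | <NA> | 4 | 1 | 90 | 3 | 255 | <NA> | N | Persistent | N | 0 | <NA> | N | <NA> | <NA>
4 | 5 | 1 | 59 | Y | 1 | 41 | URD | Mismatch | PBSC | 1 | 34 | 21 | Major mismatch | 0 | 1 | 90 | 7 | 293 | Y | Y | Persistent | Y | 2 | jakavi,mmf | N | <NA> | <NA>
5 | 6 | 1 | 31 | Y | 1 | 21 | HAPLO | HAPLO | PBSC | 1 | 58 | 11 | Bidirectional mismatch | 0 | 1 | 80 | 5 | 241 | Y | Y | Persistent | N | 0 | <NA> | Y | mouth | <NA>
6 | 7 | 1 | 60 | Y | 4 | 31 | URD | Match | PBSC | 1 | 33 | 31 | Match | 0 | 1 | 90 | 8 | 315 | Y | Y | N | N | 0 | <NA> | N | <NA> | <NA>
7 | 8 | 1 | 46 | Y | 4 | 21 | HAPLO | HAPLO | PBSC | 1 | 9 | 21 | Match | 0 | 1 | 90 | 3 | 259 | Y | <NA> | Persistent | N | 0 | <NA> | N | <NA> | <NA>
8 | 9 | 1 | 25 | Y | 1 | 11 | URD | Match | PBSC | 2 | 35 | 11 | Match | 4 | 1 | 90 | 14 | 582 | Y | <NA> | Persistent | Y | 1 | <NA> | Y | skin | <NA>
9 | 10 | 2 | 31 | Y | 4 | 31 | HAPLO | HAPLO | PBSC | 1 | 63 | 21 | <NA> | 3 | 1 | 90 | 6 | 366 | Y | Y | N | Y | 2 | <NA> | N | <NA> | <NA>

(index) | PID | SEX_PATIENT | AGE_PATIENT | TRNSPLANT_YN | COD | ABO_PATIENT | DONOR_TYPE | HLA_MATCH | CELL_SOURCE | SEX_DONOR | AGE_DONOR | ABO_DONOR | ABO_MATCHING | HCTCI | ECOG | KPS | CD34_INFU | CD3_INFU | ANC_ENG_YN | PLT_ENG_YN | RELAPSE_YN | AGVHD_YN | AGVHD_MAX_GR | AGVHD_ADD_DRUG | CGVHD_YN | CGVHD_SITE | CGVHD_ADD_DRUG
20 | 21 | 1 | 34 | Y | 5 | 41 | URD | Mismatch | PBSC | 2 | 35 | 41 | Match | 0 | 1 | 80 | 3 | 411 | Y | Y | Persistent | Y | 2 | jakavi | N | <NA> | <NA>
21 | 22 | 2 | 39 | Y | 1 | 41 | SIB | Match | PBSC | 1 | 31 | 41 | Match | 0 | 1 | 80 | 10 | 570 | Y | Y | Persistent | Y | 2 | <NA> | N | <NA> | <NA>
22 | 23 | 2 | 36 | Y | 4 | 11 | URD | Mismatch | PBSC | 2 | 45 | 11 | Match | 0 | 1 | 90 | 5 | 538 | Y | Y | N | Y | 3 | <NA> | Y | skin,nail,eyes,GI,liver | jakavi,mmf
23 | 24 | 2 | 52 | Y | 4 | 41 | SIB | Match | PBSC | 1 | 39 | 21 | Major mismatch | 2 | 1 | 80 | 21 | 405 | Y | Y | N | N | 0 | <NA> | Y | skin,nail,GI,liver | jakavi
24 | 25 | 1 | 61 | Y | 4 | 21 | URD | Match | PBSC | 1 | 36 | 31 | <NA> | 4 | 1 | 90 | 4 | 316 | Y | Y | N | Y | 3 | mmf | N | <NA> | <NA>
25 | 26 | 1 | 28 | Y | 1 | 41 | URD | Match | PBSC | 1 | 36 | 41 | Match | 2 | 1 | 90 | 6 | 347 | Y | Y | Y | Y | 1 | jakavi,mmf | Y | skin | mmf
26 | 27 | 1 | 52 | Y | 4 | 11 | URD | Mismatch | PBSC | 1 | 45 | 21 | Bidirectional mismatch | 4 | 1 | 80 | 5 | 435 | Y | N | Persistent | Y | 2 | <NA> | N | <NA> | <NA>
27 | 28 | 2 | 58 | Y | 6 | 11 | HAPLO | HAPLO | PBSC | 1 | 33 | 11 | Match | 2 | 11 | 80 | 5 | 433 | Y | Y | N | N | 0 | <NA> | Y | skin,nail,mouth,eyes,liver | <NA>
28 | 29 | 1 | 59 | Y | 4 | 41 | SIB | Mismatch | PBSC | 1 | 49 | 41 | Match | 3 | 0 | 90 | 7 | 268 | Y | Y | Persistent | N | 0 | <NA> | Y | skin,nail,liver | jakavi,mmf
29 | 30 | 1 | 74 | Y | 1 | 11 | URD | Match | PBSC | 1 | 28 | 11 | Match | 0 | 1 | 90 | 6 | 189 | Y | Y | Y | N | 0 | <NA> | N | <NA> | <NA>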