Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 72.3 B
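The two memory figures agree once rounding is taken into account. A minimal check, assuming the reported 72.3 B average is itself rounded to one decimal place:

```python
# Values copied from the dataset statistics of this report.
n_observations = 100
avg_record_bytes = 72.3  # reported average record size, in bytes

# Total size implied by the average, converted from bytes to KiB.
total_kib = n_observations * avg_record_bytes / 1024

print(f"{total_kib:.1f} KiB")
```

The result rounds to the 7.1 KiB shown as the total size in memory.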

Variable types

Numeric: 4
Categorical: 4

Dataset

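Description: Laboratory test records of hyperlipidemia patients, produced in OMOP CDM format
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/_measurement_2020-omop-cdm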

Alerts

measurement_type_concept_id has constant value "44818702" (Constant)
unit_concept_id is highly overall correlated with unit_source_value (High correlation)
unit_source_value is highly overall correlated with measurement_concept_id and 1 other field (High correlation)
measurement_id is highly overall correlated with measurement_date (High correlation)
measurement_concept_id is highly overall correlated with unit_source_value (High correlation)
measurement_date is highly overall correlated with measurement_id (High correlation)
measurement_id has unique values (Unique)
measurement_concept_id has 28 (28.0%) zeros (Zeros)
value_as_number has 2 (2.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-10-08 18:57:46.870516
Analysis finished: 2023-10-08 18:57:52.249045
Duration: 5.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
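The reported duration follows directly from the two timestamps. A short sketch using only the standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-10-08 18:57:46.870516", FMT)
finished = datetime.strptime("2023-10-08 18:57:52.249045", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")
```

Rounded to two decimals, this matches the 5.38 seconds shown in the report.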

Variables

measurement_id
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89571.75
Minimum: 44866
Maximum: 119049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 44866
5-th percentile: 53165.9
Q1: 59459.5
Median: 99846
Q3: 103106.5
95-th percentile: 119040.05
Maximum: 119049
Range: 74183
Interquartile range (IQR): 43647

Descriptive statistics

Standard deviation: 24635.097
Coefficient of variation (CV): 0.275032
Kurtosis: -1.1150824
Mean: 89571.75
Median Absolute Deviation (MAD): 7388.5
Skewness: -0.66391852
Sum: 8957175
Variance: 6.0688801 × 10^8
Monotonicity: Strictly increasing
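Several of the measurement_id figures are derived from others in the same block. The sketch below recomputes them from the reported values only, not from the underlying data, so small differences in the last digits come from rounding in the report:

```python
# Reported values for measurement_id, copied from this report.
n = 100
minimum, maximum = 44866, 119049
q1, q3 = 59459.5, 103106.5
mean, std = 89571.75, 24635.097

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
total = mean * n                 # Sum
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), total, f"{variance:.7e}")
```

Each result agrees with the corresponding line of the report to the precision shown there.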
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
44866              1      1.0%
102500             1      1.0%
103106             1      1.0%
103105             1      1.0%
103087             1      1.0%
102526             1      1.0%
102525             1      1.0%
102523             1      1.0%
102522             1      1.0%
102519             1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value   Count  Frequency (%)
44866   1      1.0%
44867   1      1.0%
44869   1      1.0%
44872   1      1.0%
53164   1      1.0%
53166   1      1.0%
53167   1      1.0%
53168   1      1.0%
53169   1      1.0%
53170   1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
119049  1      1.0%
119044  1      1.0%
119043  1      1.0%
119042  1      1.0%
119041  1      1.0%
119040  1      1.0%
116868  1      1.0%
116865  1      1.0%
116853  1      1.0%
116850  1      1.0%

measurement_concept_id
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 26
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2170940.1
Minimum: 0
Maximum: 3036887
Zeros: 28
Zeros (%): 28.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3011903
Q3: 3020416
95-th percentile: 3024929
Maximum: 3036887
Range: 3036887
Interquartile range (IQR): 3020416

Descriptive statistics

Standard deviation: 1360660.6
Coefficient of variation (CV): 0.62676101
Kurtosis: -1.031168
Mean: 2170940.1
Median Absolute Deviation (MAD): 10998
Skewness: -0.99483847
Sum: 2.1709401 × 10^8
Variance: 1.8513973 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Common values

Value              Count  Frequency (%)
0                  28     28.0%
3000905            6      6.0%
3013682            6      6.0%
3020416            5      5.0%
3000963            5      5.0%
3023314            5      5.0%
3024929            5      5.0%
3016723            5      5.0%
3019550            5      5.0%
3023103            5      5.0%
Other values (16)  25     25.0%

Minimum 10 values

Value    Count  Frequency (%)
0        28     28.0%
3000905  6      6.0%
3000963  5      5.0%
3004410  2      2.0%
3004501  2      2.0%
3006923  2      2.0%
3007070  1      1.0%
3007220  1      1.0%
3009966  1      1.0%
3010156  2      2.0%

Maximum 10 values

Value    Count  Frequency (%)
3036887  1      1.0%
3035995  1      1.0%
3026910  2      2.0%
3024929  5      5.0%
3024128  1      1.0%
3023314  5      5.0%
3023103  5      5.0%
3022192  1      1.0%
3020416  5      5.0%
3019550  5      5.0%

measurement_date
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201343.75
Minimum: 201106
Maximum: 201505
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 201106
5-th percentile: 201112
Q1: 201204
Median: 201406
Q3: 201408
95-th percentile: 201505
Maximum: 201505
Range: 399
Interquartile range (IQR): 204

Descriptive statistics

Standard deviation: 145.27218
Coefficient of variation (CV): 0.00072151326
Kurtosis: -1.0257128
Mean: 201343.75
Median Absolute Deviation (MAD): 5
Skewness: -0.75677841
Sum: 20134375
Variance: 21104.008
Monotonicity: Increasing
Histogram with fixed size bins (bins=11)
Common values

Value   Count  Frequency (%)
201408  23     23.0%
201112  20     20.0%
201504  13     13.0%
201401  10     10.0%
201405  7      7.0%
201505  6      6.0%
201204  5      5.0%
201106  4      4.0%
201404  4      4.0%
201407  4      4.0%

Minimum 10 values

Value   Count  Frequency (%)
201106  4      4.0%
201112  20     20.0%
201204  5      5.0%
201401  10     10.0%
201404  4      4.0%
201405  7      7.0%
201407  4      4.0%
201408  23     23.0%
201409  4      4.0%
201504  13     13.0%

Maximum 10 values

Value   Count  Frequency (%)
201505  6      6.0%
201504  13     13.0%
201409  4      4.0%
201408  23     23.0%
201407  4      4.0%
201405  7      7.0%
201404  4      4.0%
201401  10     10.0%
201204  5      5.0%
201112  20     20.0%
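The measurement_date values look like year-month codes (YYYYMM) stored as integers, which would explain why the profiler treats the column as a real number. If that reading is right, numeric summaries such as Range 399 do not measure elapsed time. A small sketch under that assumption (the report does not state the encoding; `decode_yyyymm` and `months_between` are hypothetical helper names):

```python
def decode_yyyymm(value: int) -> tuple[int, int]:
    """Split an integer such as 201106 into (year, month)."""
    year, month = divmod(value, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"{value} is not a valid YYYYMM code")
    return year, month


def months_between(start: int, end: int) -> int:
    """Number of calendar months from one YYYYMM code to another."""
    (y0, m0), (y1, m1) = decode_yyyymm(start), decode_yyyymm(end)
    return (y1 - y0) * 12 + (m1 - m0)


# Minimum and maximum reported for measurement_date.
print(decode_yyyymm(201106), decode_yyyymm(201505))
print(months_between(201106, 201505))
```

Under this reading the data span 47 calendar months, from June 2011 to May 2015, rather than 399 units.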

measurement_type_concept_id
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value     Count
44818702  100

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44818702
2nd row: 44818702
3rd row: 44818702
4th row: 44818702
5th row: 44818702

Common Values

Value     Count  Frequency (%)
44818702  100    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
44818702  100    100.0%

operator_concept_id
Categorical

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value    Count
<NA>     83
4171756  12
4172704  5

Length

Max length: 7
Median length: 4
Mean length: 4.51
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value    Count  Frequency (%)
<NA>     83     83.0%
4171756  12     12.0%
4172704  5      5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
na       83     83.0%
4171756  12     12.0%
4172704  5      5.0%

value_as_number
Real number (ℝ)

ZEROS 

Distinct: 53
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.98
Minimum: 0
Maximum: 392
Zeros: 2
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
Median: 13
Q3: 41.75
95-th percentile: 168.2
Maximum: 392
Range: 392
Interquartile range (IQR): 37.75

Descriptive statistics

Standard deviation: 71.378629
Coefficient of variation (CV): 1.586897
Kurtosis: 8.8991991
Mean: 44.98
Median Absolute Deviation (MAD): 11
Skewness: 2.6996555
Sum: 4498
Variance: 5094.9087
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1                  13     13.0%
4                  10     10.0%
11                 5      5.0%
24                 4      4.0%
12                 4      4.0%
10                 3      3.0%
5                  3      3.0%
13                 3      3.0%
19                 3      3.0%
140                2      2.0%
Other values (43)  50     50.0%

Minimum 10 values

Value  Count  Frequency (%)
0      2      2.0%
1      13     13.0%
2      2      2.0%
3      1      1.0%
4      10     10.0%
5      3      3.0%
6      2      2.0%
7      2      2.0%
9      1      1.0%
10     3      3.0%

Maximum 10 values

Value  Count  Frequency (%)
392    1      1.0%
371    1      1.0%
239    1      1.0%
192    1      1.0%
172    1      1.0%
168    1      1.0%
152    1      1.0%
144    1      1.0%
141    1      1.0%
140    2      2.0%

unit_concept_id
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
0      86
8554   14

Length

Max length: 4
Median length: 1
Mean length: 1.42
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8554
2nd row: 0
3rd row: 0
4th row: 0
5th row: 8554

Common Values

Value  Count  Frequency (%)
0      86     86.0%
8554   14     14.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      86     86.0%
8554   14     14.0%

unit_source_value
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value             Count
㎎/㎗             18
%                 14
(blank)           14
10^9/L            14
mmol/L            10
Other values (6)  30

Length

Max length: 7
Median length: 6
Mean length: 3.43
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: %
2nd row: <NA>
3rd row: (blank)
4th row: (blank)
5th row: %

Common Values

Value    Count  Frequency (%)
㎎/㎗    18     18.0%
%        14     14.0%
(blank)  14     14.0%
10^9/L   14     14.0%
mmol/L   10     10.0%
U/ℓ      9      9.0%
<NA>     7      7.0%
10^12/L  5      5.0%
g/㎗     5      5.0%
㎎/ℓ     2      2.0%

Length

Histogram of lengths of the category
Value    Count  Frequency (%)
㎎/㎗    18     18.0%
(blank)  14     14.0%
(blank)  14     14.0%
10^9/l   14     14.0%
mmol/l   10     10.0%
u/ℓ      9      9.0%
na       7      7.0%
10^12/l  5      5.0%
g/㎗     5      5.0%
㎎/ℓ     2      2.0%

Interactions

[Figure: 16 pairwise interaction plots]

Correlations

                        measurement_id  measurement_concept_id  measurement_date  operator_concept_id  value_as_number  unit_concept_id  unit_source_value
measurement_id          1.000           0.394                   0.835             0.000                0.000            0.000            0.000
measurement_concept_id  0.394           1.000                   0.544             0.000                0.243            0.207            0.979
measurement_date        0.835           0.544                   1.000             0.000                0.000            0.000            0.367
operator_concept_id     0.000           0.000                   0.000             1.000                0.306            0.000            0.682
value_as_number         0.000           0.243                   0.000             0.306                1.000            0.432            0.468
unit_concept_id         0.000           0.207                   0.000             0.000                0.432            1.000            1.000
unit_source_value       0.000           0.979                   0.367             0.682                0.468            1.000            1.000

                     operator_concept_id  unit_concept_id  unit_source_value
operator_concept_id  1.000                0.000            0.411
unit_concept_id      0.000                1.000            0.955
unit_source_value    0.411                0.955            1.000

                        measurement_id  measurement_concept_id  measurement_date  value_as_number  operator_concept_id  unit_concept_id  unit_source_value
measurement_id          1.000           0.013                   0.988             -0.060           0.000                0.000            0.000
measurement_concept_id  0.013           1.000                   0.024             0.161            0.000                0.132            0.833
measurement_date        0.988           0.024                   1.000             -0.078           0.000                0.000            0.177
value_as_number         -0.060          0.161                   -0.078            1.000            0.470                0.450            0.252
operator_concept_id     0.000           0.000                   0.000             0.470            1.000                0.000            0.411
unit_concept_id         0.000           0.132                   0.000             0.450            0.000                1.000            0.955
unit_source_value       0.000           0.833                   0.177             0.252            0.411                0.955            1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    measurement_id  measurement_concept_id  measurement_date  measurement_type_concept_id  operator_concept_id  value_as_number  unit_concept_id  unit_source_value
0   44866           0                       201106            44818702                     <NA>                 113              8554             %
1   44867           0                       201106            44818702                     <NA>                 1                0                <NA>
2   44869           0                       201106            44818702                     <NA>                 10               0                (blank)
3   44872           0                       201106            44818702                     <NA>                 24               0                (blank)
4   53164           3004410                 201112            44818702                     <NA>                 5                8554             %
5   53166           3000905                 201112            44818702                     <NA>                 6                0                10^9/L
6   53167           3020416                 201112            44818702                     <NA>                 4                0                10^12/L
7   53168           3000963                 201112            44818702                     <NA>                 13               0                g/㎗
8   53169           3023314                 201112            44818702                     4171756              39               8554             %
9   53170           3024929                 201112            44818702                     <NA>                 192              0                10^9/L

Last rows

    measurement_id  measurement_concept_id  measurement_date  measurement_type_concept_id  operator_concept_id  value_as_number  unit_concept_id  unit_source_value
90  116850          0                       201504            44818702                     <NA>                 11               0                (blank)
91  116853          0                       201504            44818702                     <NA>                 29               0                (blank)
92  116865          3035995                 201504            44818702                     <NA>                 50               0                U/ℓ
93  116868          3010156                 201504            44818702                     <NA>                 0                0                ㎎/ℓ
94  119040          3000905                 201505            44818702                     <NA>                 5                0                10^9/L
95  119041          3020416                 201505            44818702                     4171756              3                0                10^12/L
96  119042          3000963                 201505            44818702                     4171756              9                0                g/㎗
97  119043          3023314                 201505            44818702                     4171756              28               8554             %
98  119044          3024929                 201505            44818702                     <NA>                 152              0                10^9/L
99  119049          3013682                 201505            44818702                     4172704              24               0                ㎎/㎗
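The high-correlation alert between unit_concept_id and unit_source_value can be illustrated with the 20 sample rows alone. The sketch below groups concept ids by source value. The pairs are read from the sample rows of this report, with blank cells represented as empty strings (an assumption), and the check covers only these 20 rows, not the full 100 observations.

```python
from collections import defaultdict

# (unit_concept_id, unit_source_value) pairs read from the 20 sample rows.
# None stands for <NA>; "" stands for a blank cell (an assumption).
SAMPLE_UNITS = [
    (8554, "%"), (0, None), (0, ""), (0, ""), (8554, "%"),
    (0, "10^9/L"), (0, "10^12/L"), (0, "g/㎗"), (8554, "%"), (0, "10^9/L"),
    (0, ""), (0, ""), (0, "U/ℓ"), (0, "㎎/ℓ"), (0, "10^9/L"),
    (0, "10^12/L"), (0, "g/㎗"), (8554, "%"), (0, "10^9/L"), (0, "㎎/㎗"),
]

# Collect the set of concept ids seen for each source value.
concepts_by_source = defaultdict(set)
for concept_id, source_value in SAMPLE_UNITS:
    concepts_by_source[source_value].add(concept_id)

# True when every source value maps to exactly one concept id.
source_determines_concept = all(
    len(ids) == 1 for ids in concepts_by_source.values()
)

for source_value, ids in concepts_by_source.items():
    print(repr(source_value), sorted(ids))
print("source value determines concept id:", source_determines_concept)
```

In these rows only "%" maps to concept id 8554 and every other unit maps to 0, which is in line with the 14 occurrences reported for both "%" and 8554.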