Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 62.3 B
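
As a quick consistency check on the figures above, the following minimal sketch derives the total cell count, the missing-cell percentage, and the total memory size from the reported counts. The helper name summarize_overview is hypothetical and is not part of ydata-profiling; the inputs are the values shown in this report.

```python
def summarize_overview(n_variables, n_observations, missing_cells, avg_record_bytes):
    """Derive overview figures from the basic counts (illustrative helper)."""
    total_cells = n_variables * n_observations
    missing_pct = 100.0 * missing_cells / total_cells if total_cells else 0.0
    total_kib = avg_record_bytes * n_observations / 1024  # 1 KiB = 1024 bytes
    return {
        "total_cells": total_cells,
        "missing_pct": missing_pct,
        "total_kib": round(total_kib, 1),
    }


# Values taken from the Dataset statistics block of this report.
overview = summarize_overview(7, 100, 0, 62.3)
print(overview)
```

The rounded total agrees with the reported 6.1 KiB, so the average record size and the total size are consistent with each other.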

Variable types

Numeric: 3
Categorical: 4

Dataset

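Description: Diagnosis data for hyperlipidemia patients, produced in OMOP CDM format
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/_condition_occurrence_2020-omop-cdm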

Alerts

condition_status_source_value has constant value "C" (Constant)
condition_status_concept_id has constant value "4230359" (Constant)
condition_source_value is highly overall correlated with condition_concept_id and 1 other field (High correlation)
condition_type_concept_id is highly overall correlated with condition_source_value (High correlation)
condition_occurrence_id is highly overall correlated with condition_start_date (High correlation)
condition_concept_id is highly overall correlated with condition_source_value (High correlation)
condition_start_date is highly overall correlated with condition_occurrence_id (High correlation)
condition_occurrence_id has unique values (Unique)
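
The Constant and Unique alerts follow directly from the distinct-value counts. The sketch below illustrates those two definitions on plain Python lists; column_alerts is a hypothetical helper written for this illustration and is not how ydata-profiling implements its alerts.

```python
def column_alerts(values):
    """Return the simple alerts that apply to one column (illustrative only)."""
    distinct = set(values)
    alerts = []
    if len(distinct) == 1:
        alerts.append("Constant")  # every row holds the same value
    if len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")  # no value repeats
    return alerts


# Toy columns shaped like the ones in this report (100 rows each).
status_column = ["C"] * 100
id_column = list(range(100))

print(column_alerts(status_column))
print(column_alerts(id_column))
```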

Reproduction

Analysis started: 2023-10-08 18:57:36.465424
Analysis finished: 2023-10-08 18:57:39.555659
Duration: 3.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
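
The reported duration can be reproduced from the two timestamps above. This is a small sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-10-08 18:57:36.465424", FMT)
finished = datetime.strptime("2023-10-08 18:57:39.555659", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```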

Variables

condition_occurrence_id
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52078455
Minimum: 30177580
Maximum: 75781836
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 30177580
5-th percentile: 32141348
Q1: 37121454
Median: 53523540
Q3: 64609179
95-th percentile: 75547625
Maximum: 75781836
Range: 45604256
Interquartile range (IQR): 27487725

Descriptive statistics

Standard deviation: 14434387
Coefficient of variation (CV): 0.2771662
Kurtosis: -1.3597282
Mean: 52078455
Median Absolute Deviation (MAD): 14217894
Skewness: 0.026315487
Sum: 5.2078455 × 10^9
Variance: 2.0835154 × 10^14
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
70252909           1      1.0%
72035568           1      1.0%
70252907           1      1.0%
75781833           1      1.0%
47823829           1      1.0%
41149888           1      1.0%
55895296           1      1.0%
67741743           1      1.0%
72035569           1      1.0%
40995285           1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value     Count  Frequency (%)
30177580  1      1.0%
30177581  1      1.0%
30177582  1      1.0%
31348976  1      1.0%
31348977  1      1.0%
32183052  1      1.0%
32183053  1      1.0%
32183055  1      1.0%
33110890  1      1.0%
33177936  1      1.0%

Maximum 10 values

Value     Count  Frequency (%)
75781836  1      1.0%
75781835  1      1.0%
75781834  1      1.0%
75781833  1      1.0%
75547627  1      1.0%
75547625  1      1.0%
72035570  1      1.0%
72035569  1      1.0%
72035568  1      1.0%
72035567  1      1.0%
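
Several of the statistics reported above for condition_occurrence_id are simple functions of the others. The sketch below recomputes the range, the interquartile range, the coefficient of variation, and the sum from the reported values; it is a consistency check on this report, not the profiler's own code, and small differences are expected because the report rounds its inputs.

```python
# Values copied from the condition_occurrence_id statistics in this report.
n_observations = 100
minimum, maximum = 30177580, 75781836
q1, q3 = 37121454, 64609179
mean, std = 52078455, 14434387

value_range = maximum - minimum      # Range = Maximum - Minimum
iqr = q3 - q1                        # IQR = Q3 - Q1
cv = std / mean                      # CV = standard deviation / mean
total = mean * n_observations        # Sum = mean * number of observations

print(value_range, iqr, round(cv, 7), total)
```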

condition_concept_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1658690.3
Minimum: 75860
Maximum: 4294549
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 75860
5-th percentile: 80502
Q1: 193518
Median: 443731
Q3: 4098483
95-th percentile: 4212516
Maximum: 4294549
Range: 4218689
Interquartile range (IQR): 3904965

Descriptive statistics

Standard deviation: 1882554.6
Coefficient of variation (CV): 1.1349645
Kurtosis: -1.676965
Mean: 1658690.3
Median Absolute Deviation (MAD): 363229
Skewness: 0.57936763
Sum: 1.6586903 × 10^8
Variance: 3.5440117 × 10^12
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=17)]

Common values

Value             Count  Frequency (%)
80502             22     22.0%
443731            19     19.0%
4098483           12     12.0%
321318            7      7.0%
378416            6      6.0%
4159131           6      6.0%
4193704           5      5.0%
201820            5      5.0%
193518            3      3.0%
4127568           3      3.0%
Other values (7)  12     12.0%

Minimum 10 values

Value    Count  Frequency (%)
75860    2      2.0%
80502    22     22.0%
193518   3      3.0%
201820   5      5.0%
321318   7      7.0%
378416   6      6.0%
443731   19     19.0%
4001645  1      1.0%
4098483  12     12.0%
4102985  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
4294549  2      2.0%
4224741  2      2.0%
4212516  2      2.0%
4193704  5      5.0%
4174977  2      2.0%
4159131  6      6.0%
4127568  3      3.0%
4102985  1      1.0%
4098483  12     12.0%
4001645  1      1.0%

condition_start_date
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201569.14
Minimum: 201111
Maximum: 201908
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 201111
5-th percentile: 201205.85
Q1: 201311
Median: 201609
Q3: 201803
95-th percentile: 201908
Maximum: 201908
Range: 797
Interquartile range (IQR): 492

Descriptive statistics

Standard deviation: 243.9246
Coefficient of variation (CV): 0.0012101287
Kurtosis: -1.2398727
Mean: 201569.14
Median Absolute Deviation (MAD): 199
Skewness: -0.29656554
Sum: 20156914
Variance: 59499.213
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value              Count  Frequency (%)
201808             10     10.0%
201609             7      7.0%
201908             6      6.0%
201209             5      5.0%
201903             4      4.0%
201311             4      4.0%
201307             4      4.0%
201706             4      4.0%
201801             4      4.0%
201803             4      4.0%
Other values (21)  48     48.0%

Minimum 10 values

Value   Count  Frequency (%)
201111  3      3.0%
201203  2      2.0%
201206  3      3.0%
201209  5      5.0%
201212  3      3.0%
201304  2      2.0%
201307  4      4.0%
201309  1      1.0%
201311  4      4.0%
201403  2      2.0%

Maximum 10 values

Value   Count  Frequency (%)
201908  6      6.0%
201903  4      4.0%
201812  3      3.0%
201808  10     10.0%
201807  1      1.0%
201803  4      4.0%
201801  4      4.0%
201709  3      3.0%
201706  4      4.0%
201705  2      2.0%
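
The condition_start_date values look like year-month codes in YYYYMM form (for example 201908 for August 2019). That reading is an assumption; the report itself treats the column as a plain real number, which is why the Range of 797 above is a difference of codes rather than a span of time. The sketch below, with a hypothetical helper yyyymm_to_month_index, shows how the span in months could be computed under that assumption.

```python
def yyyymm_to_month_index(value):
    """Convert an assumed YYYYMM integer code to a running month count."""
    year, month = divmod(value, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM code: {value}")
    return year * 12 + (month - 1)


# Minimum and maximum as reported for condition_start_date.
first_code, last_code = 201111, 201908

naive_range = last_code - first_code  # what the report calls "Range"
months_spanned = yyyymm_to_month_index(last_code) - yyyymm_to_month_index(first_code)

print(naive_range, months_spanned)
```

Under the YYYYMM assumption the data cover 93 months (November 2011 to August 2019), so the mean, variance, and similar statistics above should be read with care.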

condition_type_concept_id
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

44786629: 63
44786627: 37

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44786629
2nd row: 44786627
3rd row: 44786629
4th row: 44786629
5th row: 44786627

Common Values

Value     Count  Frequency (%)
44786629  63     63.0%
44786627  37     37.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value     Count  Frequency (%)
44786629  63     63.0%
44786627  37     37.0%

condition_source_value
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

E11: 24
M81: 23
E78: 18
H35: 8
I20: 7
Other values (8): 20

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: I20
2nd row: E11
3rd row: I20
4th row: M81
5th row: H35

Common Values

Value             Count  Frequency (%)
E11               24     24.0%
M81               23     23.0%
E78               18     18.0%
H35               8      8.0%
I20               7      7.0%
E14               5      5.0%
N32               3      3.0%
K56               3      3.0%
F98               2      2.0%
K59               2      2.0%
Other values (3)  5      5.0%

Length

[Figure: Histogram of lengths of the category]

Words

Value             Count  Frequency (%)
e11               24     24.0%
m81               23     23.0%
e78               18     18.0%
h35               8      8.0%
i20               7      7.0%
e14               5      5.0%
n32               3      3.0%
k56               3      3.0%
f98               2      2.0%
k59               2      2.0%
Other values (3)  5      5.0%
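
For categorical variables, the report distinguishes Distinct (the number of different values) from Unique (the number of values that occur exactly once). The sketch below illustrates both definitions with collections.Counter on a small made-up list of codes; the list is hypothetical toy data, not the dataset profiled here, and categorical_summary is an illustrative helper rather than a ydata-profiling function.

```python
from collections import Counter


def categorical_summary(values):
    """Summarize a categorical column: distinct count, unique count, top values."""
    counts = Counter(values)
    return {
        "distinct": len(counts),  # number of different values
        "unique": sum(1 for c in counts.values() if c == 1),  # values seen once
        "top": counts.most_common(3),  # most frequent values with their counts
    }


# Hypothetical toy column of diagnosis codes.
toy_codes = ["E11", "E11", "M81", "E78", "E78", "E78", "H35"]
summary = categorical_summary(toy_codes)
print(summary)
```

In the toy column, M81 and H35 each occur once, so Unique is 2 while Distinct is 4.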

condition_status_source_value
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

C: 100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: C
5th row: C

Common Values

Value  Count  Frequency (%)
C      100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value  Count  Frequency (%)
c      100    100.0%

condition_status_concept_id
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

4230359: 100

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4230359
2nd row: 4230359
3rd row: 4230359
4th row: 4230359
5th row: 4230359

Common Values

Value    Count  Frequency (%)
4230359  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value    Count  Frequency (%)
4230359  100    100.0%

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

                           condition_occurrence_id  condition_concept_id  condition_start_date  condition_type_concept_id  condition_source_value
condition_occurrence_id    1.000                    0.198                 0.962                 0.000                      0.318
condition_concept_id       0.198                    1.000                 0.153                 0.231                      0.957
condition_start_date       0.962                    0.153                 1.000                 0.000                      0.230
condition_type_concept_id  0.000                    0.231                 0.000                 1.000                      0.951
condition_source_value     0.318                    0.957                 0.230                 0.951                      1.000

                           condition_source_value  condition_type_concept_id
condition_source_value     1.000                   0.907
condition_type_concept_id  0.907                   1.000

                           condition_occurrence_id  condition_concept_id  condition_start_date  condition_type_concept_id  condition_source_value
condition_occurrence_id    1.000                    -0.100                0.999                 0.000                      0.149
condition_concept_id       -0.100                   1.000                 -0.097                0.131                      0.779
condition_start_date       0.999                    -0.097                1.000                 0.000                      0.030
condition_type_concept_id  0.000                    0.131                 0.000                 1.000                      0.907
condition_source_value     0.149                    0.779                 0.030                 0.907                      1.000
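
The High correlation alerts listed earlier in this report appear consistent with the first correlation matrix above: the only off-diagonal coefficients that stand out are the three pairs named in those alerts. The sketch below extracts such pairs from that matrix. The cutoff of 0.5 is an assumption made for this illustration; the extract does not state which threshold or which correlation method the profiler used.

```python
# First correlation matrix of this report, copied as plain Python data.
names = [
    "condition_occurrence_id",
    "condition_concept_id",
    "condition_start_date",
    "condition_type_concept_id",
    "condition_source_value",
]
matrix = [
    [1.000, 0.198, 0.962, 0.000, 0.318],
    [0.198, 1.000, 0.153, 0.231, 0.957],
    [0.962, 0.153, 1.000, 0.000, 0.230],
    [0.000, 0.231, 0.000, 1.000, 0.951],
    [0.318, 0.957, 0.230, 0.951, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Walk the upper triangle so each pair is reported once.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Any cutoff between 0.318 and 0.951 yields the same three pairs, so the result does not depend strongly on the assumed threshold.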

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    condition_occurrence_id  condition_concept_id  condition_start_date  condition_type_concept_id  condition_source_value  condition_status_source_value  condition_status_concept_id
0   70252909                 321318                201812                44786629                   I20                     C                              4230359
1   72035567                 443731                201903                44786627                   E11                     C                              4230359
2   58418326                 321318                201705                44786629                   I20                     C                              4230359
3   37121513                 80502                 201311                44786629                   M81                     C                              4230359
4   53511004                 378416                201609                44786627                   H35                     C                              4230359
5   36109157                 4159131               201307                44786629                   E78                     C                              4230359
6   58580276                 443731                201706                44786627                   E11                     C                              4230359
7   38485973                 80502                 201403                44786629                   M81                     C                              4230359
8   46695791                 378416                201509                44786627                   H35                     C                              4230359
9   54236221                 4294549               201610                44786629                   F98                     C                              4230359

Last rows

    condition_occurrence_id  condition_concept_id  condition_start_date  condition_type_concept_id  condition_source_value  condition_status_source_value  condition_status_concept_id
90  32183053                 201820                201206                44786629                   E14                     C                              4230359
91  36510933                 378416                201309                44786627                   H35                     C                              4230359
92  53671092                 4294549               201609                44786629                   F98                     C                              4230359
93  67741742                 193518                201808                44786627                   K56                     C                              4230359
94  53523541                 4098483               201609                44786629                   E78                     C                              4230359
95  36109156                 201820                201307                44786629                   E14                     C                              4230359
96  30177582                 4159131               201111                44786629                   E78                     C                              4230359
97  46695793                 4224741               201509                44786629                   H35                     C                              4230359
98  63273384                 321318                201801                44786629                   I20                     C                              4230359
99  53523539                 443731                201609                44786627                   E11                     C                              4230359