Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 6.1 KiB |
Average record size in memory | 62.3 B |
Variable types
Numeric | 3 |
---|---|
Categorical | 4 |
Dataset
Description | Symptom information for patients with alcohol use disorder, produced in OMOP CDM format |
---|---|
Author | The Catholic University of Korea, Seoul St. Mary's Hospital |
URL | http://cmcdata.net/data/dataset/alcohol_condition_occurrence_2020-omop-cdm |
Alerts
condition_status_source_value has constant value "C" | Constant |
condition_status_concept_id has constant value "4230359" | Constant |
condition_occurrence_id is highly overall correlated with condition_start_date | High correlation |
condition_concept_id is highly overall correlated with condition_source_value | High correlation |
condition_start_date is highly overall correlated with condition_occurrence_id | High correlation |
condition_source_value is highly overall correlated with condition_concept_id | High correlation |
condition_type_concept_id is highly imbalanced (53.1%) | Imbalance |
condition_occurrence_id has unique values | Unique |
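The 53.1% imbalance reported for condition_type_concept_id is consistent with a normalized-entropy score computed from the 90/10 split shown in that variable's Common Values table. The sketch below assumes the score is 1 minus the Shannon entropy of the class shares divided by log2 of the number of classes; the function name `imbalance_score` is illustrative and is not the profiler's API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as 0 (nothing to balance)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 44786629 and 44786627 taken from the Common Values table.
score = imbalance_score([90, 10])
print(f"{score:.1%}")  # prints 53.1%
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single class dominates.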
Reproduction
Analysis started | 2023-10-08 18:55:58.070727 |
---|---|
Analysis finished | 2023-10-08 18:56:02.651693 |
Duration | 4.58 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
condition_occurrence_id
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 56274886 |
Minimum | 37892410 |
---|---|
Maximum | 76135914 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 37892410 |
---|---|
5-th percentile | 38391587 |
Q1 | 46607654 |
median | 57291136 |
Q3 | 66851673 |
95-th percentile | 72039281 |
Maximum | 76135914 |
Range | 38243504 |
Interquartile range (IQR) | 20244020 |
Descriptive statistics
Standard deviation | 11203691 |
---|---|
Coefficient of variation (CV) | 0.19908865 |
Kurtosis | -1.1244452 |
Mean | 56274886 |
Median Absolute Deviation (MAD) | 9600076.5 |
Skewness | -0.16326485 |
Sum | 5.6274886 × 10^9 |
Variance | 1.2552269 × 10^14 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
70329014 | 1 | 1.0% |
44799099 | 1 | 1.0% |
61973785 | 1 | 1.0% |
67690880 | 1 | 1.0% |
44434129 | 1 | 1.0% |
38392457 | 1 | 1.0% |
43181834 | 1 | 1.0% |
39411149 | 1 | 1.0% |
53432479 | 1 | 1.0% |
57291132 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Value | Count | Frequency (%) |
37892410 | 1 | 1.0% |
37959287 | 1 | 1.0% |
37959288 | 1 | 1.0% |
38375073 | 1 | 1.0% |
38375074 | 1 | 1.0% |
38392456 | 1 | 1.0% |
38392457 | 1 | 1.0% |
38761936 | 1 | 1.0% |
39224891 | 1 | 1.0% |
39224893 | 1 | 1.0% |
Value | Count | Frequency (%) |
76135914 | 1 | 1.0% |
74487210 | 1 | 1.0% |
74487206 | 1 | 1.0% |
74453623 | 1 | 1.0% |
74084930 | 1 | 1.0% |
71931615 | 1 | 1.0% |
71931613 | 1 | 1.0% |
71931607 | 1 | 1.0% |
70329014 | 1 | 1.0% |
70121178 | 1 | 1.0% |
condition_concept_id
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 18 |
---|---|
Distinct (%) | 18.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7302959.2 |
Minimum | 194984 |
---|---|
Maximum | 40525349 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 194984 |
---|---|
5-th percentile | 197032 |
Q1 | 320128 |
median | 4001171 |
Q3 | 4145627 |
95-th percentile | 40525349 |
Maximum | 40525349 |
Range | 40330365 |
Interquartile range (IQR) | 3825499 |
Descriptive statistics
Standard deviation | 13020870 |
---|---|
Coefficient of variation (CV) | 1.782958 |
Kurtosis | 2.841802 |
Mean | 7302959.2 |
Median Absolute Deviation (MAD) | 3565647 |
Skewness | 2.1491705 |
Sum | 7.3029592 × 10^8 |
Variance | 1.6954305 × 10^14 |
Monotonicity | Not monotonic |
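Several of the descriptive statistics above are derived from others, so they can be cross-checked by hand. The sketch below recomputes range, IQR, coefficient of variation, sum, and variance for condition_concept_id from the reported values; the inputs are the rounded figures printed in the report, so small differences in the last digits are expected.

```python
# Reported values for condition_concept_id (rounded as printed in the report).
n = 100
mean = 7302959.2
std = 13020870
minimum, maximum = 194984, 40525349
q1, q3 = 320128, 4145627

value_range = maximum - minimum  # reported Range: 40330365
iqr = q3 - q1                    # reported IQR: 3825499
cv = std / mean                  # reported CV: 1.782958
total = mean * n                 # reported Sum: 7.3029592 x 10^8
variance = std ** 2              # reported Variance: 1.6954305 x 10^14

print(value_range, iqr, round(cv, 6), total, variance)
```

The same relations hold for the other numeric variables, up to rounding of the displayed quartiles.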
Value | Count | Frequency (%) |
4001171 | 23 | 23.0% |
201820 | 13 | 13.0% |
40525349 | 11 | 11.0% |
320128 | 10 | 10.0% |
435524 | 8 | 8.0% |
4098483 | 7 | 7.0% |
197032 | 6 | 6.0% |
4340383 | 6 | 6.0% |
4145627 | 4 | 4.0% |
4096673 | 2 | 2.0% |
Other values (8) | 10 | 10.0% |
Value | Count | Frequency (%) |
194984 | 1 | 1.0% |
197032 | 6 | 6.0% |
201820 | 13 | 13.0% |
320128 | 10 | 10.0% |
321318 | 2 | 2.0% |
435524 | 8 | 8.0% |
4001171 | 23 | 23.0% |
4096044 | 1 | 1.0% |
4096673 | 2 | 2.0% |
4098483 | 7 | 7.0% |
Value | Count | Frequency (%) |
40525349 | 11 | 11.0% |
40397928 | 1 | 1.0% |
40356720 | 1 | 1.0% |
4342779 | 1 | 1.0% |
4340383 | 6 | 6.0% |
4169287 | 2 | 2.0% |
4145627 | 4 | 4.0% |
4121624 | 1 | 1.0% |
4098483 | 7 | 7.0% |
4096673 | 2 | 2.0% |
condition_start_date
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 36 |
---|---|
Distinct (%) | 36.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 201658.23 |
Minimum | 201402 |
---|---|
Maximum | 201909 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 201402 |
---|---|
5-th percentile | 201403 |
Q1 | 201509 |
median | 201703 |
Q3 | 201807 |
95-th percentile | 201902.2 |
Maximum | 201909 |
Range | 507 |
Interquartile range (IQR) | 298 |
Descriptive statistics
Standard deviation | 159.52574 |
---|---|
Coefficient of variation (CV) | 0.00079106982 |
Kurtosis | -1.1169059 |
Mean | 201658.23 |
Median Absolute Deviation (MAD) | 104 |
Skewness | -0.28461164 |
Sum | 20165823 |
Variance | 25448.462 |
Monotonicity | Not monotonic |
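The profiler treats condition_start_date as a plain real number, so statistics such as Range (507) and IQR (298) are differences between integers, not time spans. Assuming the values are YYYYMM-encoded year and month (an inference from the observed values, not stated in the report), the sketch below converts two of these differences into calendar months; `months_between` is a hypothetical helper.

```python
def months_between(start_yyyymm: int, end_yyyymm: int) -> int:
    """Number of calendar months from start to end, both encoded as YYYYMM integers."""
    start_year, start_month = divmod(start_yyyymm, 100)
    end_year, end_month = divmod(end_yyyymm, 100)
    return (end_year - start_year) * 12 + (end_month - start_month)

numeric_range = 201909 - 201402                # 507, as reported under Range
span_months = months_between(201402, 201909)   # 67 months between minimum and maximum
iqr_months = months_between(201509, 201807)    # 34 months between Q1 and Q3

print(numeric_range, span_months, iqr_months)
```

Under that assumption the data cover February 2014 through September 2019, and the mean, standard deviation, and skewness of this column should be read with the same caution.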
Value | Count | Frequency (%) |
201807 | 7 | 7.0% |
201405 | 6 | 6.0% |
201703 | 5 | 5.0% |
201808 | 5 | 5.0% |
201812 | 4 | 4.0% |
201403 | 4 | 4.0% |
201505 | 4 | 4.0% |
201906 | 4 | 4.0% |
201704 | 4 | 4.0% |
201803 | 4 | 4.0% |
Other values (26) | 53 | 53.0% |
Value | Count | Frequency (%) |
201402 | 3 | 3.0% |
201403 | 4 | 4.0% |
201404 | 1 | 1.0% |
201405 | 6 | 6.0% |
201407 | 1 | 1.0% |
201411 | 2 | 2.0% |
201502 | 1 | 1.0% |
201503 | 1 | 1.0% |
201504 | 1 | 1.0% |
201505 | 4 | 4.0% |
Value | Count | Frequency (%) |
201909 | 1 | 1.0% |
201906 | 4 | 4.0% |
201902 | 3 | 3.0% |
201812 | 4 | 4.0% |
201810 | 3 | 3.0% |
201808 | 5 | 5.0% |
201807 | 7 | 7.0% |
201806 | 1 | 1.0% |
201803 | 4 | 4.0% |
201802 | 2 | 2.0% |
condition_type_concept_id
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 44786629 |
---|---|
2nd row | 44786629 |
3rd row | 44786629 |
4th row | 44786629 |
5th row | 44786629 |
Common Values
Value | Count | Frequency (%) |
44786629 | 90 | 90.0% |
44786627 | 10 | 10.0% |
condition_source_value
Categorical
HIGH CORRELATION
 
Distinct | 16 |
---|---|
Distinct (%) | 16.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 3 |
Min length | 3 |
Unique
Unique | 4 |
---|---|
Unique (%) | 4.0% |
Sample
1st row | I20 |
---|---|
2nd row | F32 |
3rd row | I20 |
4th row | E11 |
5th row | K76 |
Common Values
Value | Count | Frequency (%) |
C22 | 23 | 23.0% |
E14 | 13 | 13.0% |
Z94 | 11 | 11.0% |
I10 | 10 | 10.0% |
G47 | 8 | 8.0% |
E78 | 7 | 7.0% |
K70 | 6 | 6.0% |
N40 | 6 | 6.0% |
K80 | 4 | 4.0% |
E11 | 3 | 3.0% |
Other values (6) | 9 | 9.0% |
condition_status_source_value
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | C |
---|---|
2nd row | C |
3rd row | C |
4th row | C |
5th row | C |
Common Values
Value | Count | Frequency (%) |
C | 100 | 100.0% |
condition_status_concept_id
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 7 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 4230359 |
---|---|
2nd row | 4230359 |
3rd row | 4230359 |
4th row | 4230359 |
5th row | 4230359 |
Common Values
Value | Count | Frequency (%) |
4230359 | 100 | 100.0% |
Variable | condition_occurrence_id | condition_concept_id | condition_start_date | condition_type_concept_id | condition_source_value |
---|---|---|---|---|---|
condition_occurrence_id | 1.000 | 0.569 | 0.922 | 0.351 | 0.638 |
condition_concept_id | 0.569 | 1.000 | 0.686 | 0.071 | 1.000 |
condition_start_date | 0.922 | 0.686 | 1.000 | 0.275 | 0.606 |
condition_type_concept_id | 0.351 | 0.071 | 0.275 | 1.000 | 0.466 |
condition_source_value | 0.638 | 1.000 | 0.606 | 0.466 | 1.000 |
Variable | condition_type_concept_id | condition_source_value |
---|---|---|
condition_type_concept_id | 1.000 | 0.338 |
condition_source_value | 0.338 | 1.000 |
Variable | condition_occurrence_id | condition_concept_id | condition_start_date | condition_type_concept_id | condition_source_value |
---|---|---|---|---|---|
condition_occurrence_id | 1.000 | -0.155 | 0.999 | 0.250 | 0.296 |
condition_concept_id | -0.155 | 1.000 | -0.156 | 0.117 | 0.914 |
condition_start_date | 0.999 | -0.156 | 1.000 | 0.249 | 0.289 |
condition_type_concept_id | 0.250 | 0.117 | 0.249 | 1.000 | 0.338 |
condition_source_value | 0.296 | 0.914 | 0.289 | 0.338 | 1.000 |
First rows
Row index | condition_occurrence_id | condition_concept_id | condition_start_date | condition_type_concept_id | condition_source_value | condition_status_source_value | condition_status_concept_id |
---|---|---|---|---|---|---|---|
0 | 70329014 | 321318 | 201812 | 44786629 | I20 | C | 4230359 |
1 | 67974602 | 40356720 | 201808 | 44786629 | F32 | C | 4230359 |
2 | 66284110 | 321318 | 201806 | 44786629 | I20 | C | 4230359 |
3 | 74084930 | 4096044 | 201906 | 44786629 | E11 | C | 4230359 |
4 | 76135914 | 194984 | 201909 | 44786629 | K76 | C | 4230359 |
5 | 67974600 | 4121624 | 201808 | 44786627 | I65 | C | 4230359 |
6 | 40026298 | 4169287 | 201407 | 44786629 | L29 | C | 4230359 |
7 | 58051545 | 201820 | 201705 | 44786629 | E14 | C | 4230359 |
8 | 58752609 | 4001171 | 201706 | 44786629 | C22 | C | 4230359 |
9 | 38761936 | 40525349 | 201404 | 44786629 | Z94 | C | 4230359 |
Last rows
Row index | condition_occurrence_id | condition_concept_id | condition_start_date | condition_type_concept_id | condition_source_value | condition_status_source_value | condition_status_concept_id |
---|---|---|---|---|---|---|---|
90 | 39235215 | 40525349 | 201405 | 44786629 | Z94 | C | 4230359 |
91 | 68860230 | 4098483 | 201810 | 44786629 | E78 | C | 4230359 |
92 | 41864809 | 40525349 | 201411 | 44786629 | Z94 | C | 4230359 |
93 | 37959288 | 40525349 | 201402 | 44786629 | Z94 | C | 4230359 |
94 | 70121173 | 435524 | 201812 | 44786629 | G47 | C | 4230359 |
95 | 44799096 | 40525349 | 201505 | 44786627 | Z94 | C | 4230359 |
96 | 44742779 | 40525349 | 201505 | 44786629 | Z94 | C | 4230359 |
97 | 67160029 | 4001171 | 201807 | 44786629 | C22 | C | 4230359 |
98 | 38375073 | 320128 | 201403 | 44786629 | I10 | C | 4230359 |
99 | 37959287 | 201820 | 201402 | 44786629 | E14 | C | 4230359 |
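The high-correlation alert between condition_concept_id and condition_source_value suggests that each source code maps to a fixed concept. The sketch below checks this on the 20 sample rows displayed above only; the pairs are copied by hand from those rows, and nothing is claimed about the 80 rows not shown.

```python
from collections import defaultdict

# (condition_concept_id, condition_source_value) pairs copied from the
# 20 displayed sample rows (row index 0-9 and 90-99).
sample_pairs = [
    (321318, "I20"), (40356720, "F32"), (321318, "I20"), (4096044, "E11"),
    (194984, "K76"), (4121624, "I65"), (4169287, "L29"), (201820, "E14"),
    (4001171, "C22"), (40525349, "Z94"), (40525349, "Z94"), (4098483, "E78"),
    (40525349, "Z94"), (40525349, "Z94"), (435524, "G47"), (40525349, "Z94"),
    (40525349, "Z94"), (4001171, "C22"), (320128, "I10"), (201820, "E14"),
]

# Collect every concept seen per source code, and every source code per concept.
concepts_by_source = defaultdict(set)
sources_by_concept = defaultdict(set)
for concept_id, source_value in sample_pairs:
    concepts_by_source[source_value].add(concept_id)
    sources_by_concept[concept_id].add(source_value)

# The mapping is one-to-one if no key on either side has more than one partner.
one_to_one = (
    all(len(ids) == 1 for ids in concepts_by_source.values())
    and all(len(vals) == 1 for vals in sources_by_concept.values())
)
print(one_to_one, len(concepts_by_source), len(sources_by_concept))  # prints True 12 12
```

Across the full dataset the report lists 18 distinct concept IDs but only 16 distinct source values, so at least one source code must map to more than one concept there.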