Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 KiB
Average record size in memory: 63.3 B

Variable types

Numeric: 3
Categorical: 4

Dataset

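Description: Observation records of patients with alcohol use disorder, produced in OMOP CDM format
Author: Seoul St. Mary's Hospital, The Catholic University of Korea (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/alcohol_observation_2020-omop-cdm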

Alerts

observation_type_concept_id has constant value "44814644" (Constant)
unit_concept_id is highly overall correlated with value_as_number and 2 other fields (High correlation)
unit_source_value is highly overall correlated with value_as_number and 2 other fields (High correlation)
observation_concept_id is highly overall correlated with value_as_number and 2 other fields (High correlation)
value_as_number is highly overall correlated with observation_concept_id and 2 other fields (High correlation)

Reproduction

Analysis started: 2023-10-08 18:56:30.035754
Analysis finished: 2023-10-08 18:56:32.826761
Duration: 2.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
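
The reported duration can be cross-checked from the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-10-08 18:56:30.035754")
finished = datetime.fromisoformat("2023-10-08 18:56:32.826761")

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```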

Variables

observation_id
Real number (ℝ)

Distinct: 85
Distinct (%): 85.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1057774 × 10^16
Minimum: 9.643633 × 10^10
Maximum: 2.458426 × 10^16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 9.643633 × 10^10
5-th percentile: 1.0894539 × 10^11
Q1: 2.5217985 × 10^11
median: 5.7121808 × 10^11
Q3: 2.457347 × 10^16
95-th percentile: 2.4580041 × 10^16
Maximum: 2.458426 × 10^16
Range: 2.4584164 × 10^16
Interquartile range (IQR): 2.4573218 × 10^16

Descriptive statistics

Standard deviation: 1.2286038 × 10^16
Coefficient of variation (CV): 1.1110769
Kurtosis: -1.9987369
Mean: 1.1057774 × 10^16
Median Absolute Deviation (MAD): 4.6227269 × 10^11
Skewness: 0.20408207
Sum: 1.1057774 × 10^18
Variance: 1.5094672 × 10^32
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
24573470000580000   2      2.0%
24576200001000000   2      2.0%
24580240000560000   2      2.0%
24576600000350000   2      2.0%
24574630000330000   2      2.0%
24571000000170000   2      2.0%
24567060000950000   2      2.0%
24575830000230000   2      2.0%
24578820001000000   2      2.0%
24579130000460000   2      2.0%
Other values (75)   80     80.0%

Minimum 10 values

Value               Count  Frequency (%)
96436330006         1      1.0%
98268710001         1      1.0%
99237860001         1      1.0%
99411780001         1      1.0%
108945380001        1      1.0%
108945390001        1      1.0%
130448900001        1      1.0%
130448910001        1      1.0%
141816890001        1      1.0%
141816900001        1      1.0%

Maximum 10 values

Value               Count  Frequency (%)
24584260000040000   1      1.0%
24582470000780000   1      1.0%
24580610000130000   1      1.0%
24580240000560000   2      2.0%
24580030000390000   1      1.0%
24579130000460000   2      2.0%
24578820001000000   2      2.0%
24577650000420000   2      2.0%
24576600000350000   2      2.0%
24576200001000000   2      2.0%

observation_concept_id
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4099154
2nd row: 4177340
3rd row: 4099154
4th row: 4177340
5th row: 4099154

Common Values

Value     Count  Frequency (%)
4099154   52     52.0%
4177340   48     48.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

observation_date
Real number (ℝ)

Distinct: 39
Distinct (%): 39.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201533.11
Minimum: 201104
Maximum: 201911
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 201104
5-th percentile: 201104
Q1: 201312
median: 201556
Q3: 201709.5
95-th percentile: 201910
Maximum: 201911
Range: 807
Interquartile range (IQR): 397.5

Descriptive statistics

Standard deviation: 254.39327
Coefficient of variation (CV): 0.0012622902
Kurtosis: -1.0771982
Mean: 201533.11
Median Absolute Deviation (MAD): 155
Skewness: -0.1511825
Sum: 20153311
Variance: 64715.937
Monotonicity: Not monotonic
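
The values of observation_date (201104 through 201911) look like year-month codes of the form YYYYMM stored as integers. That reading is an assumption; the report itself only types the column as a real number. Under that assumption, statistics such as the range (807) are differences between codes, not calendar durations. The sketch below decodes such codes and counts calendar months; `decode_yyyymm` and `months_between` are hypothetical helper names.

```python
from typing import Tuple


def decode_yyyymm(code: int) -> Tuple[int, int]:
    """Split an integer such as 201104 into (year, month).

    Assumes the YYYYMM encoding described above (an assumption, not
    something the report states).
    """
    year, month = divmod(code, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"{code} is not a valid YYYYMM code")
    return year, month


def months_between(start: int, end: int) -> int:
    """Number of calendar months from one YYYYMM code to another."""
    (y0, m0), (y1, m1) = decode_yyyymm(start), decode_yyyymm(end)
    return (y1 - y0) * 12 + (m1 - m0)


# Minimum and maximum taken from the report.
first, last = 201104, 201911
print(decode_yyyymm(first), decode_yyyymm(last))
print(months_between(first, last))
```

Note that a summary value such as the reported median 201556 is not itself a valid code under this reading (there is no month 56), so `decode_yyyymm` would reject it.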
Histogram with fixed size bins (bins=39)

Common values

Value               Count  Frequency (%)
201104              7      7.0%
201402              5      5.0%
201605              4      4.0%
201404              4      4.0%
201403              4      4.0%
201709              4      4.0%
201910              4      4.0%
201904              4      4.0%
201706              3      3.0%
201905              3      3.0%
Other values (29)   58     58.0%

Minimum 10 values

Value    Count  Frequency (%)
201104   7      7.0%
201107   2      2.0%
201112   2      2.0%
201203   2      2.0%
201204   2      2.0%
201211   2      2.0%
201301   2      2.0%
201305   2      2.0%
201309   1      1.0%
201310   2      2.0%

Maximum 10 values

Value    Count  Frequency (%)
201911   2      2.0%
201910   4      4.0%
201905   3      3.0%
201904   4      4.0%
201812   2      2.0%
201811   2      2.0%
201809   2      2.0%
201805   2      2.0%
201804   2      2.0%
201711   2      2.0%

observation_type_concept_id
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44814644
2nd row: 44814644
3rd row: 44814644
4th row: 44814644
5th row: 44814644

Common Values

Value      Count  Frequency (%)
44814644   100    100.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

value_as_number
Real number (ℝ)

HIGH CORRELATION 

Distinct: 76
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 115.0885
Minimum: 29
Maximum: 189
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 29
5-th percentile: 51.765
Q1: 65.75
median: 92
Q3: 168
95-th percentile: 178.1
Maximum: 189
Range: 160
Interquartile range (IQR): 102.25
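
The range and interquartile range follow directly from the other quantiles: Range = Maximum - Minimum and IQR = Q3 - Q1. A short check with the values reported for value_as_number (variable names are illustrative):

```python
# Quantile statistics for value_as_number, copied from the report.
minimum, q1, q3, maximum = 29.0, 65.75, 168.0, 189.0

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range, iqr)
```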

Descriptive statistics

Standard deviation: 52.25436
Coefficient of variation (CV): 0.45403633
Kurtosis: -1.8201058
Mean: 115.0885
Median Absolute Deviation (MAD): 43.5
Skewness: 0.035975827
Sum: 11508.85
Variance: 2730.5182
Monotonicity: Not monotonic
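
Several of these descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported figures for value_as_number against those identities. The inputs are the rounded values printed in the report, so agreement is only expected up to rounding.

```python
import math

# Descriptive statistics for value_as_number, copied from the report.
n = 100
mean = 115.0885
std = 52.25436
reported_cv = 0.45403633
reported_variance = 2730.5182
reported_sum = 11508.85

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the squared standard deviation
total = mean * n     # sum = mean * number of observations

# Compare with tolerances that allow for the rounding of the inputs.
print(math.isclose(cv, reported_cv, abs_tol=1e-6))
print(math.isclose(variance, reported_variance, abs_tol=1e-3))
print(math.isclose(total, reported_sum, rel_tol=1e-9))
```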
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
170.0               4      4.0%
175.0               4      4.0%
168.0               4      4.0%
167.0               3      3.0%
65.0                3      3.0%
75.0                3      3.0%
70.0                2      2.0%
176.0               2      2.0%
172.0               2      2.0%
173.0               2      2.0%
Other values (66)   71     71.0%

Minimum 10 values

Value   Count  Frequency (%)
29.0    1      1.0%
38.0    1      1.0%
46.0    1      1.0%
51.0    1      1.0%
51.1    1      1.0%
51.8    1      1.0%
52.0    1      1.0%
52.8    1      1.0%
53.8    1      1.0%
54.0    2      2.0%

Maximum 10 values

Value   Count  Frequency (%)
189.0   1      1.0%
186.0   1      1.0%
184.3   1      1.0%
181.0   1      1.0%
180.0   1      1.0%
178.0   1      1.0%
177.6   1      1.0%
177.0   1      1.0%
176.0   2      2.0%
175.0   4      4.0%

unit_concept_id
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9529
2nd row: 8582
3rd row: 9529
4th row: 8582
5th row: 9529

Common Values

Value   Count  Frequency (%)
9529    52     52.0%
8582    48     48.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

unit_source_value
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KG
2nd row: CM
3rd row: KG
4th row: CM
5th row: KG

Common Values

Value   Count  Frequency (%)
KG      52     52.0%
CM      48     48.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

Interactions

[Figures: interaction plots between pairs of variables]

Correlations

                        observation_id  observation_concept_id  observation_date  value_as_number  unit_concept_id  unit_source_value
observation_id          1.000           0.000                   0.705             0.192            0.000            0.000
observation_concept_id  0.000           1.000                   0.000             1.000            0.999            0.999
observation_date        0.705           0.000                   1.000             0.245            0.000            0.000
value_as_number         0.192           1.000                   0.245             1.000            1.000            1.000
unit_concept_id         0.000           0.999                   0.000             1.000            1.000            0.999
unit_source_value       0.000           0.999                   0.000             1.000            0.999            1.000

                        unit_concept_id  unit_source_value  observation_concept_id
unit_concept_id         1.000            0.980              0.980
unit_source_value       0.980            1.000              0.980
observation_concept_id  0.980            0.980              1.000

                        observation_id  observation_date  value_as_number  observation_concept_id  unit_concept_id  unit_source_value
observation_id          1.000           0.442             0.068            0.000                   0.000            0.000
observation_date        0.442           1.000             0.043            0.000                   0.000            0.000
value_as_number         0.068           0.043             1.000            0.964                   0.964            0.964
observation_concept_id  0.000           0.000             0.964            1.000                   0.980            0.980
unit_concept_id         0.000           0.000             0.964            0.980                   1.000            0.980
unit_source_value       0.000           0.000             0.964            0.980                   0.980            1.000
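
The "High correlation" alerts near the top of the report can be related to the first correlation matrix in this section by listing, for each variable, the other variables whose coefficient exceeds a cut-off. The sketch below does that with plain Python. The threshold of 0.9 is an assumption chosen for illustration, not a setting taken from the report, and `high_correlations` is a hypothetical helper.

```python
# First correlation matrix of this section, row and column order as in the report.
names = [
    "observation_id", "observation_concept_id", "observation_date",
    "value_as_number", "unit_concept_id", "unit_source_value",
]
matrix = [
    [1.000, 0.000, 0.705, 0.192, 0.000, 0.000],
    [0.000, 1.000, 0.000, 1.000, 0.999, 0.999],
    [0.705, 0.000, 1.000, 0.245, 0.000, 0.000],
    [0.192, 1.000, 0.245, 1.000, 1.000, 1.000],
    [0.000, 0.999, 0.000, 1.000, 1.000, 0.999],
    [0.000, 0.999, 0.000, 1.000, 0.999, 1.000],
]

THRESHOLD = 0.9  # assumed cut-off for this illustration only


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables correlated at or above threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            other for j, other in enumerate(names)
            if i != j and abs(matrix[i][j]) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cut-off, the four flagged variables each have three partners, which is consistent with the alert wording "and 2 other fields"; observation_id and observation_date (0.705) fall below it.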

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    observation_id     observation_concept_id  observation_date  observation_type_concept_id  value_as_number  unit_concept_id  unit_source_value
0   24573470000580000  4099154                 201511            44814644                     92.0             9529             KG
1   24573470000580000  4177340                 201511            44814644                     189.0            8582             CM
2   144652320001       4099154                 201204            44814644                     65.0             9529             KG
3   144652310001       4177340                 201204            44814644                     166.0            8582             CM
4   24575110000630000  4099154                 201605            44814644                     75.0             9529             KG
5   24575110000630000  4177340                 201605            44814644                     178.0            8582             CM
6   509727380001       4099154                 201812            44814644                     74.0             9529             KG
7   509727370001       4177340                 201812            44814644                     180.0            8582             CM
8   24565700000280000  4099154                 201310            44814644                     75.0             9529             KG
9   222981690001       4177340                 201310            44814644                     175.0            8582             CM

Last rows

    observation_id     observation_concept_id  observation_date  observation_type_concept_id  value_as_number  unit_concept_id  unit_source_value
90  495829870001       4099154                 201809            44814644                     75.3             9529             KG
91  495829860001       4177340                 201809            44814644                     172.9            8582             CM
92  233429680003       4099154                 201312            44814644                     52.8             9529             KG
93  24566430001620000  4177340                 201312            44814644                     161.0            8582             CM
94  24569710001510000  4099154                 201411            44814644                     74.1             9529             KG
95  24569710001510000  4177340                 201411            44814644                     184.3            8582             CM
96  24564400000380000  4099154                 201305            44814644                     61.45            9529             KG
97  204336800001       4177340                 201305            44814644                     160.2            8582             CM
98  24562440000190000  4099154                 201211            44814644                     51.8             9529             KG
99  24562440000190000  4177340                 201211            44814644                     161.6            8582             CM
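
In the sample, rows alternate between KG and CM, and value_as_number takes clearly different magnitudes for the two units. Grouping the values by unit makes this visible and is consistent with the high correlation reported between value_as_number and the unit columns. The sketch below uses only the first ten sample rows (index 0 to 9), so the figures describe that sample and not the full dataset.

```python
from collections import defaultdict
from statistics import mean

# (unit_source_value, value_as_number) pairs for sample rows 0-9 of the report.
rows = [
    ("KG", 92.0), ("CM", 189.0),
    ("KG", 65.0), ("CM", 166.0),
    ("KG", 75.0), ("CM", 178.0),
    ("KG", 74.0), ("CM", 180.0),
    ("KG", 75.0), ("CM", 175.0),
]

# Collect the values belonging to each unit.
by_unit = defaultdict(list)
for unit, value in rows:
    by_unit[unit].append(value)

# Minimum, mean and maximum per unit.
summary = {
    unit: (min(values), mean(values), max(values))
    for unit, values in by_unit.items()
}
for unit, (lo, avg, hi) in summary.items():
    print(unit, lo, round(avg, 1), hi)
```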