Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 44.3 B

Variable types

Numeric: 2
Categorical: 3

Dataset

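Description: Data for analyzing statin prescriptions in patients with hyperlipidemia, together with the preceding and concomitant drugs prescribed before or after the statin. For statin prescriptions, the total administered dose can be derived from the daily dose, quantity, number of prescriptions, and number of prescription days. Drug prescription data are mapped to RxNorm codes.
- Preceding drug flag: 0 = No, 1 = Yes
- Concomitant drug flag: 0 = No, 1 = Yes
Author: The Catholic University of Korea, Eunpyeong St. Mary's Hospital (가톨릭대학교 은평성모병원)
URL: http://cmcdata.net/data/dataset/precedence-combination-administration-drug-data-dyslipidemia-eunpyeong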

Alerts

drug_start_date has constant value "2015-10-14T00:00:00" (Constant)
Drug_cd is highly overall correlated with Drug_name (High correlation)
Drug_name is highly overall correlated with Drug_cd (High correlation)
drug_exposure is highly imbalanced (59.8%) (Imbalance)
일련번호 has unique values (Unique)
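
The 59.8% imbalance figure for drug_exposure can be reproduced from the class counts reported in that variable's section (92 zeros, 8 ones). The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value. The function name `imbalance_score` is hypothetical, and the formula is an assumption that happens to match the reported number, not a statement of how the profiling tool computes it.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that
    one class dominates. Assumed formula, see the note above.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for drug_exposure: 92 rows with 0, 8 rows with 1.
score = imbalance_score([92, 8])
print(f"{score:.1%}")
```

With only 8 positive rows out of 100, any model or summary built on drug_exposure should account for the skew, for example by reporting per-class metrics rather than overall accuracy.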

Reproduction

Analysis started: 2023-10-08 18:56:13.913242
Analysis finished: 2023-10-08 18:56:15.330268
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일련번호
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
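
Because 일련번호 is simply the sequence 1 to 100, the quantile and descriptive statistics above can be checked by hand. The sketch below uses only the Python standard library and assumes the sample variance (n - 1 denominator) and linearly interpolated percentiles. Those assumptions reproduce the reported values, but they are not confirmed to be the conventions the profiling tool uses internally.

```python
import statistics

# The 100 serial numbers reported for this column: 1, 2, ..., 100.
values = list(range(1, 101))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates linearly between the observed min and max.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(mean, round(variance, 5), round(std, 6), round(cv, 8), mad)
print(p5, q1, q2, q3, p95, q3 - q1)
```

A strictly increasing identifier with a distinct value for every row carries no analytic signal, so these statistics mainly serve as a sanity check that all 100 rows are present.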
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%

Extreme values (smallest first)
Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%

Extreme values (largest first)
Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

Drug_name
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Fenofibrate: 15
Gemfibrozil: 15
Omega-3: 14
Propranolol: 14
Thyroxine: 14
Other values (2): 28

Length

Max length: 14
Median length: 11
Mean length: 10.16
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fenofibrate
2nd row: Gemfibrozil
3rd row: Omega-3
4th row: Propranolol
5th row: Thyroxine

Common Values

Value           Count  Frequency (%)
Fenofibrate     15     15.0%
Gemfibrozil     15     15.0%
Omega-3         14     14.0%
Propranolol     14     14.0%
Thyroxine       14     14.0%
Warfarin        14     14.0%
Bisphosphonate  14     14.0%

Length

Histogram of lengths of the category


Drug_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 669002.27
Minimum: 315106
Maximum: 966221
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 315106
5-th percentile: 315106
Q1: 349287
Median: 855302
Q3: 904419
95-th percentile: 966221
Maximum: 966221
Range: 651115
Interquartile range (IQR): 555132

Descriptive statistics

Standard deviation: 263613.51
Coefficient of variation (CV): 0.39403978
Kurtosis: -1.7714708
Mean: 669002.27
Median Absolute Deviation (MAD): 110919
Skewness: -0.28707765
Sum: 66900227
Variance: 6.9492082 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value   Count  Frequency (%)
349287  15     15.0%
315106  15     15.0%
484348  14     14.0%
856448  14     14.0%
966221  14     14.0%
855302  14     14.0%
904419  14     14.0%

Extreme values (smallest first)
Value   Count  Frequency (%)
315106  15     15.0%
349287  15     15.0%
484348  14     14.0%
855302  14     14.0%
856448  14     14.0%
904419  14     14.0%
966221  14     14.0%

Extreme values (largest first)
Value   Count  Frequency (%)
966221  14     14.0%
904419  14     14.0%
856448  14     14.0%
855302  14     14.0%
484348  14     14.0%
349287  15     15.0%
315106  15     15.0%
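
The sum, mean, median, and quartiles reported for Drug_cd follow directly from the seven-row frequency table. The sketch below rebuilds the 100 values from the counts and recomputes them with the standard library. The variable names are illustrative, and the quartile method (linear interpolation, `method="inclusive"`) is an assumption that matches the reported numbers. Since the dataset description says prescriptions are mapped to RxNorm codes, Drug_cd is presumably an identifier, so these figures are arithmetic on codes rather than measurements.

```python
import statistics

# Frequency table copied from the Drug_cd section: code -> row count.
code_counts = {
    315106: 15,
    349287: 15,
    484348: 14,
    855302: 14,
    856448: 14,
    904419: 14,
    966221: 14,
}

# Expand the table back into a sorted list with one entry per row.
expanded = [code for code, n in sorted(code_counts.items()) for _ in range(n)]

total = sum(expanded)
mean = total / len(expanded)
median = statistics.median(expanded)
q1, _, q3 = statistics.quantiles(expanded, n=4, method="inclusive")

print(len(expanded), total, round(mean, 2), median, q1, q3)
```

Treating the column as categorical (for example by casting it to a string before profiling) would avoid these numeric summaries altogether.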

drug_exposure
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
0: 92
1: 8

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      92     92.0%
1      8      8.0%

Length

Histogram of lengths of the category


drug_start_date
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
2015-10-14T00:00:00: 100

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-10-14T00:00:00
2nd row: 2015-10-14T00:00:00
3rd row: 2015-10-14T00:00:00
4th row: 2015-10-14T00:00:00
5th row: 2015-10-14T00:00:00

Common Values

Value                Count  Frequency (%)
2015-10-14T00:00:00  100    100.0%

Length

Histogram of lengths of the category

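
drug_start_date is reported as Categorical because its values are 19-character text strings rather than parsed timestamps. As a minimal sketch, the single value in the column can be parsed with the standard library to confirm that it is a valid ISO 8601 timestamp with no time-of-day information. The variable names are illustrative only.

```python
from datetime import datetime

# The only value observed in drug_start_date (all 100 rows).
raw = "2015-10-14T00:00:00"

parsed = datetime.fromisoformat(raw)

# True when the timestamp carries a time of day other than midnight.
has_time_component = (parsed.hour, parsed.minute, parsed.second) != (0, 0, 0)

print(parsed.date().isoformat(), has_time_component)
```

Since every row shares this one date, the column cannot distinguish between records in this 100-row sample.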

Interactions


Correlations

               일련번호  Drug_name  Drug_cd  drug_exposure
일련번호        1.000     0.000      0.000    0.157
Drug_name      0.000     1.000      1.000    0.000
Drug_cd        0.000     1.000      1.000    0.000
drug_exposure  0.157     0.000      0.000    1.000

               Drug_name  drug_exposure
Drug_name      1.000      0.000
drug_exposure  0.000      1.000

               일련번호  Drug_cd  Drug_name  drug_exposure
일련번호        1.000     0.011    0.000      0.112
Drug_cd        0.011     1.000    0.984      0.000
Drug_name      0.000     0.984    1.000      0.000
drug_exposure  0.112     0.000    0.000      1.000
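
A correlation of 1.000 between Drug_name and Drug_cd is what one would expect if each drug name maps to exactly one code and vice versa. The sketch below checks that property on the ten (Drug_name, Drug_cd) pairs listed first in the Sample section of this report. The helper `is_one_to_one` is hypothetical, and the check covers only those sample rows, not the full dataset.

```python
# (Drug_name, Drug_cd) pairs as they appear in the first sample rows.
sample_pairs = [
    ("Fenofibrate", 349287),
    ("Gemfibrozil", 315106),
    ("Omega-3", 484348),
    ("Propranolol", 856448),
    ("Thyroxine", 966221),
    ("Warfarin", 855302),
    ("Bisphosphonate", 904419),
    ("Fenofibrate", 349287),
    ("Gemfibrozil", 315106),
    ("Omega-3", 484348),
]


def is_one_to_one(pairs):
    """Return True if every name maps to one code and every code to one name."""
    name_to_code = {}
    code_to_name = {}
    for name, code in pairs:
        # setdefault stores the first mapping seen and returns the stored one,
        # so a mismatch means the same key appeared with a different partner.
        if name_to_code.setdefault(name, code) != code:
            return False
        if code_to_name.setdefault(code, name) != name:
            return False
    return True


print(is_one_to_one(sample_pairs))
```

If the mapping holds for the whole dataset, one of the two columns is redundant for modelling purposes and can be dropped or kept only as a label.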

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
    일련번호  Drug_name       Drug_cd  drug_exposure  drug_start_date
0   1         Fenofibrate     349287   0              2015-10-14T00:00:00
1   2         Gemfibrozil     315106   0              2015-10-14T00:00:00
2   3         Omega-3         484348   0              2015-10-14T00:00:00
3   4         Propranolol     856448   0              2015-10-14T00:00:00
4   5         Thyroxine       966221   0              2015-10-14T00:00:00
5   6         Warfarin        855302   0              2015-10-14T00:00:00
6   7         Bisphosphonate  904419   0              2015-10-14T00:00:00
7   8         Fenofibrate     349287   0              2015-10-14T00:00:00
8   9         Gemfibrozil     315106   0              2015-10-14T00:00:00
9   10        Omega-3         484348   0              2015-10-14T00:00:00

Last rows
    일련번호  Drug_name       Drug_cd  drug_exposure  drug_start_date
90  91        Bisphosphonate  904419   1              2015-10-14T00:00:00
91  92        Fenofibrate     349287   0              2015-10-14T00:00:00
92  93        Gemfibrozil     315106   0              2015-10-14T00:00:00
93  94        Omega-3         484348   0              2015-10-14T00:00:00
94  95        Propranolol     856448   1              2015-10-14T00:00:00
95  96        Thyroxine       966221   0              2015-10-14T00:00:00
96  97        Warfarin        855302   0              2015-10-14T00:00:00
97  98        Bisphosphonate  904419   0              2015-10-14T00:00:00
98  99        Fenofibrate     349287   0              2015-10-14T00:00:00
99  100       Gemfibrozil     315106   1              2015-10-14T00:00:00