Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.4 KiB
Average record size in memory: 45.3 B
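
The two memory figures above can be cross-checked with simple arithmetic. The sketch below assumes that the average record size is the total size divided by the number of observations, and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_observations = 100
avg_record_bytes = 45.3  # "Average record size in memory"

# Assumption: total size = number of observations * average record size.
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_bytes:.0f} B is about {total_kib:.1f} KiB")
```

Under that assumption the result rounds to the 4.4 KiB reported as the total size in memory.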

Variable types

Text: 1
Numeric: 4

Dataset

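Description: Vital sign data for hyperlipidemia patients, comprising anthropometric measurements such as height and weight at the time of first prescription, together with systolic and diastolic blood pressure. Body Mass Index (BMI) can be derived from the height and weight data, and the blood pressure data can be used to determine whether a patient has hypertension.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital
URL: http://cmcdata.net/data/dataset/vital-signs-dyslipidemia-data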
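
The description mentions deriving BMI and a hypertension flag from these columns. A minimal sketch follows, under several assumptions that the report does not state: BDHT is height in centimetres, BDWT is weight in kilograms, blood pressure is in mmHg, and hypertension is flagged at a 140/90 cutoff (clinical guidelines differ, so the cutoffs are parameters). The class and function names are hypothetical; the three rows are taken from the report's Sample section.

```python
from dataclasses import dataclass


@dataclass
class VitalSigns:
    rid: str
    bdht_cm: float   # assumed unit: centimetres
    bdwt_kg: float   # assumed unit: kilograms
    systolic: int    # assumed unit: mmHg
    diastolic: int   # assumed unit: mmHg


def bmi(height_cm: float, weight_kg: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def is_hypertensive(systolic: int, diastolic: int,
                    sys_cutoff: int = 140, dia_cutoff: int = 90) -> bool:
    """Flag a reading when either value reaches its (assumed) cutoff."""
    return systolic >= sys_cutoff or diastolic >= dia_cutoff


# First three rows of the report's Sample section.
rows = [
    VitalSigns("R0000076", 158.0, 64.0, 141, 90),
    VitalSigns("R0000082", 168.5, 62.0, 130, 70),
    VitalSigns("R0000083", 155.8, 46.5, 140, 60),
]

for r in rows:
    print(r.rid,
          round(bmi(r.bdht_cm, r.bdwt_kg), 1),
          is_hypertensive(r.systolic, r.diastolic))
```

Keeping the cutoffs as keyword arguments makes it easy to swap in a different guideline without touching the rest of the code.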

Alerts

SYSTOLIC is highly overall correlated with DIASTOLIC (High correlation)
DIASTOLIC is highly overall correlated with SYSTOLIC (High correlation)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:55:39.066970
Analysis finished: 2023-10-08 18:55:49.498324
Duration: 10.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
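
As an illustration of these character properties, the sketch below counts characters and Unicode general categories for the five sample RID values listed in this report, using Python's standard `unicodedata` module. Note that `unicodedata.category` returns two-letter codes (`Lu` for Uppercase Letter, `Nd` for Decimal Number) rather than the long names shown in the report; scripts and blocks are not exposed by the standard library and are omitted here.

```python
import unicodedata
from collections import Counter

# The five sample RID values shown in the report.
rids = ["R0000076", "R0000082", "R0000083", "R0000085", "R0000087"]

text = "".join(rids)

# Frequency of each individual character.
chars = Counter(text)

# Frequency of each Unicode general category (two-letter codes).
categories = Counter(unicodedata.category(c) for c in text)

print("total characters:", sum(chars.values()))
print("distinct characters:", len(chars))
print("categories:", dict(categories))
```

Run over all 100 RID values instead of five, the same counting would be expected to give the totals reported here (800 characters in two categories).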

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000076
2nd row: R0000082
3rd row: R0000083
4th row: R0000085
5th row: R0000087

Value              Count  Frequency (%)
r0000076           1      1.0%
r0000290           1      1.0%
r0000329           1      1.0%
r0000327           1      1.0%
r0000322           1      1.0%
r0000321           1      1.0%
r0000320           1      1.0%
r0000318           1      1.0%
r0000315           1      1.0%
r0000314           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      422    52.8%
R      100    12.5%
2      51     6.4%
1      48     6.0%
3      48     6.0%
5      27     3.4%
4      26     3.2%
8      22     2.8%
7      20     2.5%
6      20     2.5%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      422    60.3%
2      51     7.3%
1      48     6.9%
3      48     6.9%
5      27     3.9%
4      26     3.7%
8      22     3.1%
7      20     2.9%
6      20     2.9%
9      16     2.3%

Uppercase Letter
Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      422    60.3%
2      51     7.3%
1      48     6.9%
3      48     6.9%
5      27     3.9%
4      26     3.7%
8      22     3.1%
7      20     2.9%
6      20     2.9%
9      16     2.3%

Latin
Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      422    52.8%
R      100    12.5%
2      51     6.4%
1      48     6.0%
3      48     6.0%
5      27     3.4%
4      26     3.2%
8      22     2.8%
7      20     2.5%
6      20     2.5%

BDHT
Real number (ℝ)

Distinct: 55
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162.734
Minimum: 150
Maximum: 183
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 150
5-th percentile: 151.975
Q1: 156.075
Median: 163
Q3: 168.625
95-th percentile: 176.1
Maximum: 183
Range: 33
Interquartile range (IQR): 12.55
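
For reference, the range is the maximum minus the minimum (183 - 150 = 33) and the IQR is Q3 minus Q1 (168.625 - 156.075 = 12.55). The sketch below computes the same kind of statistics on the BDHT values of the first ten sample rows in this report. It assumes quartiles are obtained by linear interpolation between order statistics, which is what `statistics.quantiles(..., method="inclusive")` does; the report does not state its interpolation method, and ten rows will not reproduce the 100-row figures above.

```python
import statistics

# BDHT values from the first ten sample rows of the report.
bdht = [158.0, 168.5, 155.8, 182.0, 163.0, 169.0, 158.0, 169.4, 175.0, 158.6]

# Quartiles by linear interpolation between order statistics
# (assumed method; the report does not say which one it uses).
q1, median, q3 = statistics.quantiles(bdht, n=4, method="inclusive")

iqr = q3 - q1                          # interquartile range
value_range = max(bdht) - min(bdht)    # range

print(f"Q1={q1:.2f} median={median:.2f} Q3={q3:.2f}")
print(f"IQR={iqr:.2f} range={value_range:.2f}")
```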

Descriptive statistics

Standard deviation: 7.8114164
Coefficient of variation (CV): 0.048001133
Kurtosis: -0.45128775
Mean: 162.734
Median Absolute Deviation (MAD): 6
Skewness: 0.40785802
Sum: 16273.4
Variance: 61.018226
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
165.0              8      8.0%
163.0              6      6.0%
158.0              5      5.0%
170.0              5      5.0%
153.0              4      4.0%
172.0              3      3.0%
156.0              3      3.0%
169.0              3      3.0%
153.5              3      3.0%
167.0              3      3.0%
Other values (45)  57     57.0%

Value  Count  Frequency (%)
150.0  2      2.0%
150.2  1      1.0%
151.0  1      1.0%
151.5  1      1.0%
152.0  2      2.0%
153.0  4      4.0%
153.2  1      1.0%
153.5  3      3.0%
153.7  1      1.0%
154.0  1      1.0%

Value  Count  Frequency (%)
183.0  1      1.0%
182.0  1      1.0%
180.0  1      1.0%
178.0  2      2.0%
176.0  1      1.0%
175.0  2      2.0%
174.0  1      1.0%
173.1  1      1.0%
172.0  3      3.0%
171.0  1      1.0%

BDWT
Real number (ℝ)

Distinct: 57
Distinct (%): 57.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.3225
Minimum: 46.5
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 46.5
5-th percentile: 51.95
Q1: 58.7375
Median: 64
Q3: 71.2
95-th percentile: 82.9725
Maximum: 90
Range: 43.5
Interquartile range (IQR): 12.4625

Descriptive statistics

Standard deviation: 9.3748357
Coefficient of variation (CV): 0.14351618
Kurtosis: -0.27415835
Mean: 65.3225
Median Absolute Deviation (MAD): 6
Skewness: 0.37750014
Sum: 6532.25
Variance: 87.887544
Monotonicity: Not monotonic
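
Several of the BDWT descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below checks these using only figures printed in this report; the variable names are illustrative.

```python
# Figures copied from the BDWT section of the report.
n = 100
mean = 65.3225
std = 9.3748357

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all observations

print(f"CV={cv:.8f} variance={variance:.6f} sum={total:.2f}")
```

The results agree with the reported CV (0.14351618), variance (87.887544) and sum (6532.25) to within rounding of the printed inputs.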
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
64.0               5      5.0%
75.0               5      5.0%
70.0               5      5.0%
63.0               5      5.0%
58.0               4      4.0%
62.0               4      4.0%
53.0               3      3.0%
69.0               3      3.0%
57.0               3      3.0%
80.0               3      3.0%
Other values (47)  60     60.0%

Value  Count  Frequency (%)
46.5   1      1.0%
48.0   1      1.0%
50.0   2      2.0%
51.0   1      1.0%
52.0   1      1.0%
52.8   1      1.0%
53.0   3      3.0%
53.4   1      1.0%
53.75  1      1.0%
54.0   2      2.0%

Value  Count  Frequency (%)
90.0   1      1.0%
87.3   1      1.0%
84.4   1      1.0%
84.0   1      1.0%
83.4   1      1.0%
82.95  1      1.0%
80.2   1      1.0%
80.0   3      3.0%
79.4   1      1.0%
76.0   1      1.0%

SYSTOLIC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 32
Distinct (%): 32.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.88
Minimum: 100
Maximum: 150
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 100
5-th percentile: 106.9
Q1: 120
Median: 130
Q3: 140
95-th percentile: 146.05
Maximum: 150
Range: 50
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 12.801736
Coefficient of variation (CV): 0.10010741
Kurtosis: -0.74962024
Mean: 127.88
Median Absolute Deviation (MAD): 10
Skewness: -0.28859456
Sum: 12788
Variance: 163.88444
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)
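
A fixed-size-bin histogram splits the value range into equally wide intervals. Assuming the bins span the minimum to the maximum, the 32 bins here would each be (150 - 100) / 32 = 1.5625 wide; the report does not state how bin edges are chosen, so that is an assumption. The sketch below uses a hypothetical helper, `fixed_width_histogram`, on the SYSTOLIC values of the first ten sample rows with four bins to keep the output short.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Assumes max(values) > min(values), so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end; clamp it
        # into the last bin so the final interval is closed.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return width, counts


# SYSTOLIC values from the first ten sample rows of the report.
systolic = [141, 130, 140, 110, 130, 136, 140, 142, 120, 140]

width, counts = fixed_width_histogram(systolic, bins=4)
print("bin width:", width)
print("counts per bin:", counts)
```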
Value              Count  Frequency (%)
140                17     17.0%
120                15     15.0%
130                14     14.0%
110                6      6.0%
100                4      4.0%
112                3      3.0%
150                3      3.0%
136                3      3.0%
121                3      3.0%
129                2      2.0%
Other values (22)  30     30.0%

Value  Count  Frequency (%)
100    4      4.0%
105    1      1.0%
107    1      1.0%
110    6      6.0%
112    3      3.0%
113    1      1.0%
115    2      2.0%
116    1      1.0%
117    2      2.0%
120    15     15.0%

Value  Count  Frequency (%)
150    3      3.0%
148    1      1.0%
147    1      1.0%
146    1      1.0%
145    2      2.0%
144    1      1.0%
143    2      2.0%
142    1      1.0%
141    2      2.0%
140    17     17.0%

DIASTOLIC
Real number (ℝ)

HIGH CORRELATION 

Distinct: 29
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.05
Minimum: 50
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 50
5-th percentile: 59.95
Q1: 70
Median: 74.5
Q3: 80
95-th percentile: 90
Maximum: 90
Range: 40
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 9.6172982
Coefficient of variation (CV): 0.12987574
Kurtosis: -0.45872115
Mean: 74.05
Median Absolute Deviation (MAD): 5.5
Skewness: -0.29019658
Sum: 7405
Variance: 92.492424
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Value              Count  Frequency (%)
80                 27     27.0%
70                 17     17.0%
90                 10     10.0%
60                 9      9.0%
71                 4      4.0%
83                 3      3.0%
67                 3      3.0%
72                 2      2.0%
74                 2      2.0%
78                 2      2.0%
Other values (19)  21     21.0%

Value  Count  Frequency (%)
50     1      1.0%
52     1      1.0%
54     1      1.0%
55     1      1.0%
59     1      1.0%
60     9      9.0%
61     1      1.0%
62     1      1.0%
64     1      1.0%
65     2      2.0%

Value  Count  Frequency (%)
90     10     10.0%
86     1      1.0%
85     1      1.0%
83     3      3.0%
82     1      1.0%
81     1      1.0%
80     27     27.0%
78     2      2.0%
77     2      2.0%
76     1      1.0%

Interactions

[Figure: 16 interaction plots; images not preserved in this text version]

Correlations

           RID    BDHT   BDWT   SYSTOLIC  DIASTOLIC
RID        1.000  1.000  1.000  1.000     1.000
BDHT       1.000  1.000  0.600  0.000     0.271
BDWT       1.000  0.600  1.000  0.215     0.211
SYSTOLIC   1.000  0.000  0.215  1.000     0.619
DIASTOLIC  1.000  0.271  0.211  0.619     1.000

           BDHT    BDWT    SYSTOLIC  DIASTOLIC
BDHT       1.000   0.491   -0.099    0.096
BDWT       0.491   1.000   -0.016    0.198
SYSTOLIC   -0.099  -0.016  1.000     0.528
DIASTOLIC  0.096   0.198   0.528     1.000
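
The report does not state which correlation coefficient each matrix uses. As one common choice, the sketch below computes the Pearson correlation between SYSTOLIC and DIASTOLIC for the first ten sample rows in this report. The `pearson` helper is written out by hand for illustration, and a ten-row result should not be expected to match the 100-row values above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# SYSTOLIC and DIASTOLIC values from the first ten sample rows.
systolic = [141, 130, 140, 110, 130, 136, 140, 142, 120, 140]
diastolic = [90, 70, 60, 70, 80, 85, 80, 81, 70, 90]

r = pearson(systolic, diastolic)
print(round(r, 3))
```

The coefficient is symmetric in its arguments, which is why both matrices above are mirrored across the diagonal.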

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    RID       BDHT   BDWT  SYSTOLIC  DIASTOLIC
0   R0000076  158.0  64.0  141       90
1   R0000082  168.5  62.0  130       70
2   R0000083  155.8  46.5  140       60
3   R0000085  182.0  75.0  110       70
4   R0000087  163.0  60.0  130       80
5   R0000096  169.0  62.0  136       85
6   R0000104  158.0  75.3  140       80
7   R0000110  169.4  65.9  142       81
8   R0000112  175.0  70.0  120       70
9   R0000115  158.6  59.4  140       90

    RID       BDHT   BDWT  SYSTOLIC  DIASTOLIC
90  R0000384  170.0  69.0  110       70
91  R0000394  152.0  50.0  120       70
92  R0000395  167.0  73.0  136       83
93  R0000404  162.0  62.0  145       74
94  R0000405  165.0  63.0  125       80
95  R0000408  175.0  65.6  120       70
96  R0000409  173.1  84.4  140       80
97  R0000415  156.0  54.0  121       71
98  R0000423  159.0  62.0  130       80
99  R0000425  165.0  69.0  117       76