Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 26.3 B

Variable types

Text: 1
Categorical: 2

Dataset

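Description: Demographic data for a cohort extracted from the full data stored in the hospital information system, using a query that applies the selection criteria for a hyperlipidemia study. The age and sex of patients at the time of their first statin prescription can be used to analyze characteristics by age group and by sex. SEX: 0 = male, 1 = female.
Author: The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/demographic-data-dyslipidemia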
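
The coding in the description (SEX 0 = male, 1 = female) and the Age_grp labels such as 60대 (the 60s decade) can be decoded before analysis. The following is a minimal sketch; the helper names `decode_sex` and `decade_start` are hypothetical and not part of the dataset or of any library.

```python
# Mapping taken from the dataset description: 0 = male, 1 = female.
SEX_LABELS = {"0": "male", "1": "female"}


def decode_sex(code):
    """Map a SEX code ("0"/"1" or 0/1) to its label; raise on anything else."""
    key = str(code).strip()
    if key not in SEX_LABELS:
        raise ValueError(f"unexpected SEX code: {code!r}")
    return SEX_LABELS[key]


def decade_start(age_grp):
    """Convert an Age_grp label such as "60대" to the first age of that decade (60)."""
    if not age_grp.endswith("대"):
        raise ValueError(f"unexpected Age_grp label: {age_grp!r}")
    return int(age_grp[:-1])


# Two rows copied from the sample shown in this report.
rows = [("R0000001", "60대", "1"), ("R0000002", "40대", "0")]
decoded = [(rid, decade_start(age), decode_sex(sex)) for rid, age, sex in rows]
print(decoded)
```

Raising on unknown codes, rather than passing them through, is a design choice that makes unexpected values visible early.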

Alerts

RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:55:48.946786
Analysis finished: 2023-10-08 18:55:52.024460
Duration: 3.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
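
The reported duration follows from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2023-10-08 18:55:48.946786", FMT)
finished = datetime.strptime("2023-10-08 18:55:52.024460", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```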

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
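
As an illustration of these character properties, the sketch below counts Unicode general categories for the five sample RID values listed in this report, using Python's standard `unicodedata` module. It returns the two-letter category abbreviations: `Lu` corresponds to "Uppercase Letter" and `Nd` to "Decimal Number". The standard library does not expose the script property, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter

# The five sample RID values shown in this report.
sample_rids = ["R0000001", "R0000002", "R0000004", "R0000010", "R0000015"]
chars = "".join(sample_rids)

# General category of every character, e.g. "Lu" for "R" and "Nd" for digits.
category_counts = Counter(unicodedata.category(ch) for ch in chars)

# Raw character frequencies, comparable to the "Most occurring characters" table.
char_counts = Counter(chars)

print(dict(category_counts))
print(char_counts.most_common(3))
```

Because this uses only five of the 100 values, the counts are smaller than the totals reported for the full column.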

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000001
2nd row: R0000002
3rd row: R0000004
4th row: R0000010
5th row: R0000015
Value              Count  Frequency (%)
r0000001           1      1.0%
r0000204           1      1.0%
r0000230           1      1.0%
r0000226           1      1.0%
r0000225           1      1.0%
r0000222           1      1.0%
r0000219           1      1.0%
r0000210           1      1.0%
r0000209           1      1.0%
r0000208           1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value  Count  Frequency (%)
0      454    56.8%
R      100    12.5%
2      58     7.2%
1      51     6.4%
5      26     3.2%
3      24     3.0%
6      21     2.6%
8      18     2.2%
4      16     2.0%
7      16     2.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter  100    12.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      454    64.9%
2      58     8.3%
1      51     7.3%
5      26     3.7%
3      24     3.4%
6      21     3.0%
8      18     2.6%
4      16     2.3%
7      16     2.3%
9      16     2.3%

Uppercase Letter
Value  Count  Frequency (%)
R      100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  700    87.5%
Latin   100    12.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      454    64.9%
2      58     8.3%
1      51     7.3%
5      26     3.7%
3      24     3.4%
6      21     3.0%
8      18     2.6%
4      16     2.3%
7      16     2.3%
9      16     2.3%

Latin
Value  Count  Frequency (%)
R      100    100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  800    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      454    56.8%
R      100    12.5%
2      58     7.2%
1      51     6.4%
5      26     3.2%
3      24     3.0%
6      21     2.6%
8      18     2.2%
4      16     2.0%
7      16     2.0%

Age_grp
Categorical

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
60대 (60s): 31
50대 (50s): 24
40대 (40s): 20
70대 (70s): 17
30대 (30s): 4
Other values (2): 4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60대 (60s)
2nd row: 40대 (40s)
3rd row: 60대 (60s)
4th row: 60대 (60s)
5th row: 40대 (40s)

Common Values

Value        Count  Frequency (%)
60대 (60s)   31     31.0%
50대 (50s)   24     24.0%
40대 (40s)   20     20.0%
70대 (70s)   17     17.0%
30대 (30s)   4      4.0%
20대 (20s)   2      2.0%
80대 (80s)   2      2.0%
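
The frequency column is each count divided by the number of observations. A minimal sketch that recomputes it from the counts listed for Age_grp (the variable names are illustrative):

```python
# Counts copied from the Age_grp common values in this report.
counts = {"60대": 31, "50대": 24, "40대": 20, "70대": 17, "30대": 4, "20대": 2, "80대": 2}

total = sum(counts.values())  # number of observations covered by the counts
distinct = len(counts)        # number of distinct categories

# Percentage share of each category, rounded to one decimal as in the report.
frequency_pct = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}\t{n}\t{frequency_pct[label]}%")
```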

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
60대 (60s)   31     31.0%
50대 (50s)   24     24.0%
40대 (40s)   20     20.0%
70대 (70s)   17     17.0%
30대 (30s)   4      4.0%
20대 (20s)   2      2.0%
80대 (80s)   2      2.0%

SEX
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
1: 57
0: 43

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      57     57.0%
0      43     43.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      57     57.0%
0      43     43.0%

Correlations

         RID    Age_grp  SEX
RID      1.000  1.000    1.000
Age_grp  1.000  1.000    0.239
SEX      1.000  0.239    1.000

         Age_grp  SEX
Age_grp  1.000    0.248
SEX      0.248    1.000

         Age_grp  SEX
Age_grp  1.000    0.248
SEX      0.248    1.000
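
The extracted report does not name the association measure behind these matrices. For a pair of categorical variables such as Age_grp and SEX, Cramér's V is one common choice, so the sketch below computes the plain (uncorrected) Cramér's V in pure Python. This is an assumption for illustration, not a statement of what the profiling software computed. The function name `cramers_v` is hypothetical, and because the input is only the 20 sample rows shown in this report, the result is not expected to match the values above.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed counts per (x, y) cell
    row = Counter(xs)             # marginal counts of x
    col = Counter(ys)             # marginal counts of y
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0                # a constant variable carries no association
    chi2 = 0.0
    for r, r_n in row.items():
        for c, c_n in col.items():
            expected = r_n * c_n / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# The 20 sample rows (first 10 and last 10) shown in this report.
age_grp = ["60대", "40대", "60대", "60대", "40대", "40대", "50대", "70대", "70대", "40대",
           "50대", "40대", "40대", "50대", "60대", "50대", "70대", "50대", "20대", "60대"]
sex = ["1", "0", "0", "0", "0", "0", "1", "1", "0", "1",
       "0", "0", "1", "1", "0", "1", "0", "1", "0", "1"]

print(round(cramers_v(age_grp, sex), 3))
```

The value ranges from 0 (no association) to 1 (one variable fully determines the other), which is consistent with an identifier column such as RID showing 1.000 against every other variable.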

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    RID       Age_grp  SEX
0   R0000001  60대     1
1   R0000002  40대     0
2   R0000004  60대     0
3   R0000010  60대     0
4   R0000015  40대     0
5   R0000016  40대     0
6   R0000018  50대     1
7   R0000021  70대     1
8   R0000022  70대     0
9   R0000030  40대     1

    RID       Age_grp  SEX
90  R0000265  50대     0
91  R0000266  40대     0
92  R0000268  40대     1
93  R0000276  50대     1
94  R0000280  60대     0
95  R0000281  50대     1
96  R0000283  70대     0
97  R0000284  50대     1
98  R0000285  20대     0
99  R0000287  60대     1