Dataset statistics
Number of variables | 3 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.6 KiB |
Average record size in memory | 26.3 B |
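The overview figures above (observations, variables, missing cells, duplicate rows) can be recomputed from raw rows. The following is a minimal sketch, assuming the data is held as a list of (RID, Age_grp, SEX) tuples. The five rows are copied from the sample rows at the end of this report; the variable names are illustrative and are not taken from ydata-profiling, whose exact definitions may differ.

```python
# Five rows copied from the sample at the end of this report: (RID, Age_grp, SEX).
rows = [
    ("R0000002", "60대", "0"),
    ("R0000003", "50대", "1"),
    ("R0000004", "70대", "0"),
    ("R0000006", "30대", "1"),
    ("R0000008", "50대", "0"),
]

n_observations = len(rows)
n_variables = len(rows[0])

# Assumption: a cell counts as missing when it is None or an empty string.
missing_cells = sum(1 for row in rows for cell in row if cell is None or cell == "")
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Assumption: duplicates are rows identical to an earlier row in every column.
duplicate_rows = n_observations - len(set(rows))
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Number of variables: {n_variables}")
print(f"Number of observations: {n_observations}")
print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Duplicate rows: {duplicate_rows} ({duplicate_pct:.1f}%)")
```

On the full 100-row dataset the same counts would be expected to match the table above, provided the report uses comparable definitions.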
Variable types
Text | 1 |
---|---|
Categorical | 2 |
Dataset
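Description | Demographic data for a cohort extracted from all data stored in the hospital information system: patients with the ICD-10 diagnosis codes F101, F102, F103, F104, or F109, and patients with the diagnosis codes K700, K701, K703, K7030, K7031, K7041, or K709. Each patient's age and sex at the time of the first prescription can be used to analyze characteristics by age group and by sex. SEX: 0 = male, 1 = female. |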
---|---|
Author | The Catholic University of Korea, Seoul St. Mary's Hospital (가톨릭대학교 서울성모병원) |
URL | http://cmcdata.net/data/dataset/demographic-data-alcohol-use-disorder |
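The cohort definition in the description can be sketched as a simple code-set filter. This is an illustration only: the record layout, the `normalize_code` helper, the patient IDs, and the decade-floor rule in `age_group` are all assumptions, since the report does not say how the hospital system stores diagnosis codes or how ages were binned. Only the code lists and the age-group label format (for example `50대`) come from this report.

```python
# ICD-10 codes listed in the dataset description.
F10_CODES = frozenset({"F101", "F102", "F103", "F104", "F109"})
K70_CODES = frozenset({"K700", "K701", "K703", "K7030", "K7031", "K7041", "K709"})
COHORT_CODES = F10_CODES | K70_CODES


def normalize_code(code: str) -> str:
    """Assumed normalization: drop dots and whitespace, upper-case ("f10.2" -> "F102")."""
    return code.replace(".", "").strip().upper()


def in_cohort(diagnosis_codes) -> bool:
    """True if any of the patient's diagnosis codes is in the cohort code set."""
    return any(normalize_code(code) in COHORT_CODES for code in diagnosis_codes)


def age_group(age: int) -> str:
    """Assumed binning: floor the age to its decade and append the suffix used in Age_grp."""
    return f"{age // 10 * 10}대"


# Hypothetical patients, not taken from the dataset.
patients = [
    {"id": "P1", "codes": ["F10.2", "I10"], "age": 57},
    {"id": "P2", "codes": ["E119"], "age": 43},
    {"id": "P3", "codes": ["K7030"], "age": 68},
]

cohort = [(p["id"], age_group(p["age"])) for p in patients if in_cohort(p["codes"])]
print(cohort)
```

Keeping the two code families in separate sets makes it easy to report them separately later, should that distinction matter for an analysis.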
Reproduction
Analysis started | 2023-10-08 18:56:20.154028 |
---|---|
Analysis finished | 2023-10-08 18:56:20.562791 |
Duration | 0.41 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
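The reported duration follows from the two timestamps above. A short sketch of that arithmetic, using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction table.
started = datetime.strptime("2023-10-08 18:56:20.154028", FMT)
finished = datetime.strptime("2023-10-08 18:56:20.562791", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "0.41 seconds"
```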
RID
Text
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
r0000002 | 1 | 1.0% |
r0000109 | 1 | 1.0% |
r0000133 | 1 | 1.0% |
r0000129 | 1 | 1.0% |
r0000128 | 1 | 1.0% |
r0000125 | 1 | 1.0% |
r0000122 | 1 | 1.0% |
r0000118 | 1 | 1.0% |
r0000116 | 1 | 1.0% |
r0000114 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 479 | 59.9% |
R | 100 | 12.5% |
1 | 65 | 8.1% |
6 | 24 | 3.0% |
2 | 22 | 2.8% |
5 | 21 | 2.6% |
4 | 21 | 2.6% |
3 | 21 | 2.6% |
7 | 17 | 2.1% |
9 | 16 | 2.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 700 | 87.5% |
Uppercase Letter | 100 | 12.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 479 | 68.4% |
1 | 65 | 9.3% |
6 | 24 | 3.4% |
2 | 22 | 3.1% |
5 | 21 | 3.0% |
4 | 21 | 3.0% |
3 | 21 | 3.0% |
7 | 17 | 2.4% |
9 | 16 | 2.3% |
8 | 14 | 2.0% |
Uppercase Letter
Value | Count | Frequency (%) |
R | 100 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 700 | 87.5% |
Latin | 100 | 12.5% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 479 | 68.4% |
1 | 65 | 9.3% |
6 | 24 | 3.4% |
2 | 22 | 3.1% |
5 | 21 | 3.0% |
4 | 21 | 3.0% |
3 | 21 | 3.0% |
7 | 17 | 2.4% |
9 | 16 | 2.3% |
8 | 14 | 2.0% |
Latin
Value | Count | Frequency (%) |
R | 100 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 800 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 479 | 59.9% |
R | 100 | 12.5% |
1 | 65 | 8.1% |
6 | 24 | 3.0% |
2 | 22 | 2.8% |
5 | 21 | 2.6% |
4 | 21 | 2.6% |
3 | 21 | 2.6% |
7 | 17 | 2.1% |
9 | 16 | 2.0% |
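The character and category tables for RID can be approximated with a character count grouped by Unicode general category. The sketch below is not the profiler's own code; it uses `collections.Counter` and `unicodedata.category` from the standard library on the ten RID values listed in the sample rows at the end of this report, so its totals cover 80 characters rather than the full 800.

```python
import unicodedata
from collections import Counter

# The ten RID values from the sample rows at the end of this report.
rids = [
    "R0000002", "R0000003", "R0000004", "R0000006", "R0000008",
    "R0000010", "R0000016", "R0000019", "R0000020", "R0000022",
]

# Count every character across all values.
char_counts = Counter(ch for rid in rids for ch in rid)
total_chars = sum(char_counts.values())

# Group by Unicode general category: "Nd" is Decimal Number, "Lu" is Uppercase Letter.
category_counts = Counter(unicodedata.category(ch) for rid in rids for ch in rid)

for ch, count in char_counts.most_common(3):
    print(f"{ch} | {count} | {100 * count / total_chars:.1f}% |")
for category, count in category_counts.most_common():
    print(f"{category} | {count} | {100 * count / total_chars:.1f}% |")
```

The 12.5% share for uppercase letters is the same as in the full report because every RID has one letter followed by seven digits.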
Age_grp
Categorical
Distinct | 6 |
---|---|
Distinct (%) | 6.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 3 |
Min length | 3 |
Unique
Unique | 1 |
---|---|
Unique (%) | 1.0% |
Sample
1st row | 60대 |
---|---|
2nd row | 50대 |
3rd row | 70대 |
4th row | 30대 |
5th row | 50대 |
Common Values
Value | Count | Frequency (%) |
50대 (50s) | 35 | 35.0% |
40대 (40s) | 25 | 25.0% |
60대 (60s) | 18 | 18.0% |
30대 (30s) | 13 | 13.0% |
70대 (70s) | 8 | 8.0% |
10대 (10s) | 1 | 1.0% |
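The Distinct, Unique, and Frequency (%) figures for Age_grp follow directly from the counts in the table above. A minimal sketch of that arithmetic, assuming "Unique" means the number of values that occur exactly once; the variable names are illustrative:

```python
# Counts copied from the Common Values table for Age_grp.
age_group_counts = {"50대": 35, "40대": 25, "60대": 18, "30대": 13, "70대": 8, "10대": 1}

total = sum(age_group_counts.values())
distinct = len(age_group_counts)

# Assumption: a value is "unique" when it appears in exactly one row.
unique = sum(1 for count in age_group_counts.values() if count == 1)

frequency_pct = {
    label: round(100 * count / total, 1) for label, count in age_group_counts.items()
}

print(f"Distinct: {distinct} ({100 * distinct / total:.1f}%)")
print(f"Unique: {unique} ({100 * unique / total:.1f}%)")
for label, pct in frequency_pct.items():
    print(f"{label} | {age_group_counts[label]} | {pct}% |")
```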
SEX
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 0 |
---|---|
2nd row | 1 |
3rd row | 0 |
4th row | 1 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
0 | 90 | 90.0% |
1 | 10 | 10.0% |
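The 90/10 split above is what the IMBALANCE alert on SEX refers to. One common way to quantify class imbalance is one minus the normalized Shannon entropy, where 0 means perfectly balanced and values near 1 mean a single class dominates. The sketch below applies that measure to the reported counts. Treating this as the formula behind the alert is an assumption, since the report does not state how the alert was computed; `imbalance_score` and `SEX_LABELS` are illustrative names.

```python
from math import log2

# Coding taken from the dataset description: 0 = male, 1 = female.
SEX_LABELS = {"0": "male", "1": "female"}

# Counts copied from the Common Values table for SEX.
sex_counts = {"0": 90, "1": 10}


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy over k classes."""
    total = sum(counts.values())
    n_classes = len(counts)
    if n_classes < 2:
        # The score is undefined for a single class; 0.0 is a convention chosen here.
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)
    return 1 - entropy / log2(n_classes)


for code, count in sex_counts.items():
    print(f"{SEX_LABELS[code]}: {count}")
print(f"imbalance score: {imbalance_score(sex_counts):.3f}")
```

A perfectly balanced 50/50 split scores 0.0 under this measure, so the reported split is clearly skewed toward male patients.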
Correlations
Variable | RID | Age_grp | SEX |
---|---|---|---|
RID | 1.000 | 1.000 | 1.000 |
Age_grp | 1.000 | 1.000 | 0.123 |
SEX | 1.000 | 0.123 | 1.000 |
Variable | Age_grp | SEX |
---|---|---|
Age_grp | 1.000 | 0.084 |
SEX | 0.084 | 1.000 |
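The report does not state which association measure produced the matrices above. For two categorical variables such as Age_grp and SEX, Cramér's V is one common choice, so the sketch below shows how it can be computed from paired values with the standard library. It is an uncorrected Cramér's V written for illustration, and it is not claimed to reproduce the 0.123 or 0.084 values, which would require the full dataset and the profiler's exact method.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of category labels."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        # One of the variables is constant, so no association can be measured.
        return 0.0
    return sqrt(chi2 / (n * k))


# Toy inputs, not taken from the dataset.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

Values range from 0 (no association) to 1 (each category of one variable determines the other), which fits the scale of the matrices above.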
First rows
Index | RID | Age_grp | SEX |
---|---|---|---|
0 | R0000002 | 60대 | 0 |
1 | R0000003 | 50대 | 1 |
2 | R0000004 | 70대 | 0 |
3 | R0000006 | 30대 | 1 |
4 | R0000008 | 50대 | 0 |
5 | R0000010 | 30대 | 0 |
6 | R0000016 | 30대 | 0 |
7 | R0000019 | 50대 | 0 |
8 | R0000020 | 50대 | 0 |
9 | R0000022 | 40대 | 1 |
Last rows
Index | RID | Age_grp | SEX |
---|---|---|---|
90 | R0000163 | 70대 | 1 |
91 | R0000164 | 60대 | 0 |
92 | R0000166 | 50대 | 0 |
93 | R0000171 | 40대 | 1 |
94 | R0000172 | 60대 | 0 |
95 | R0000173 | 50대 | 0 |
96 | R0000175 | 40대 | 0 |
97 | R0000176 | 60대 | 0 |
98 | R0000178 | 40대 | 0 |
99 | R0000181 | 30대 | 0 |