Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 28.3 B

Variable types

Numeric: 2
Categorical: 1

Dataset

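Description: Demographic data for a cohort of patients with ICD-10 diagnosis codes E10, E11-E14, or O24, extracted from the full data stored in the hospital information system. The patients' age and sex at first diagnosis can be used to analyze characteristics by age group and by sex. SEX is coded as 0 = male, 1 = female.
Author: The Catholic University of Korea Eunpyeong St. Mary's Hospital
URL: http://cmcdata.net/data/dataset/diabetes_demo-eunpyeong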

Alerts

RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:57:09.466871
Analysis finished: 2023-10-08 18:57:12.634287
Duration: 3.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
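
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a minimal sketch, not the configuration actually used: the CSV path, the report title, and the helper name build_report are hypothetical, and the original config.json is not reproduced here.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write the HTML report to out_path.

    csv_path and out_path are hypothetical names chosen for illustration.
    """
    # Imported inside the function so the module loads even when the
    # third-party packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is an assumption; the original report title is not shown.
    profile = ProfileReport(df, title="Diabetes cohort demographics")
    profile.to_file(out_path)
    return out_path

# Example call (requires the packages and a local copy of the data):
# build_report("diabetes_demo.csv")
```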

Variables

RID
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
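
The quantile figures for RID can be checked by hand. The sketch below assumes that RID is exactly the integers 1 to 100 (consistent with the reported minimum, maximum, distinct count, and sum) and that percentiles use linear interpolation between neighbouring values, which is what the "inclusive" method of Python's statistics.quantiles does.

```python
import statistics

# Assumption: RID holds the integers 1..100, one per observation.
rid = list(range(1, 101))

# 99 cut points; index k-1 is the k-th percentile (linear interpolation).
pct = statistics.quantiles(rid, n=100, method="inclusive")

p5, q1, median, q3, p95 = pct[4], pct[24], pct[49], pct[74], pct[94]
value_range = max(rid) - min(rid)
iqr = q3 - q1

print(p5, q1, median, q3, p95, value_range, iqr)
```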

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
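
The descriptive statistics for RID follow from the same assumption that the column is the sequence 1 to 100. The sketch below recomputes them with the standard library; it suggests that the reported variance is the sample variance (divisor n - 1) and that the CV is the standard deviation divided by the mean.

```python
import statistics

# Assumption: RID holds the integers 1..100, one per observation.
rid = list(range(1, 101))

total = sum(rid)
mean = statistics.mean(rid)
variance = statistics.variance(rid)   # sample variance, divisor n - 1
std = statistics.stdev(rid)           # square root of the sample variance
cv = std / mean                       # coefficient of variation
median = statistics.median(rid)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(x - median) for x in rid)
strictly_increasing = all(a < b for a, b in zip(rid, rid[1:]))

print(total, mean, variance, std, cv, mad, strictly_increasing)
```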
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
1                  1      1.0%
65                 1      1.0%
75                 1      1.0%
74                 1      1.0%
73                 1      1.0%
72                 1      1.0%
71                 1      1.0%
70                 1      1.0%
69                 1      1.0%
68                 1      1.0%
Other values (90)  90     90.0%
Minimum 10 values
Value  Count  Frequency (%)
1      1      1.0%
2      1      1.0%
3      1      1.0%
4      1      1.0%
5      1      1.0%
6      1      1.0%
7      1      1.0%
8      1      1.0%
9      1      1.0%
10     1      1.0%
Maximum 10 values
Value  Count  Frequency (%)
100    1      1.0%
99     1      1.0%
98     1      1.0%
97     1      1.0%
96     1      1.0%
95     1      1.0%
94     1      1.0%
93     1      1.0%
92     1      1.0%
91     1      1.0%

Age_grp
Real number (ℝ)

Distinct: 41
Distinct (%): 41.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.19
Minimum: 31
Maximum: 88
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 31
5-th percentile: 40
Q1: 55
Median: 63.5
Q3: 73
95-th percentile: 85
Maximum: 88
Range: 57
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 13.083025
Coefficient of variation (CV): 0.20704266
Kurtosis: -0.37539115
Mean: 63.19
Median Absolute Deviation (MAD): 8.5
Skewness: -0.32296594
Sum: 6319
Variance: 171.16556
Monotonicity: Not monotonic
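
The raw Age_grp values are not listed in full, so the statistics cannot be recomputed from scratch, but the reported figures can be checked against each other. The sketch below uses only numbers printed in this report and tests the usual identities (sum = mean x n, variance = std squared, CV = std / mean, range = max - min, IQR = Q3 - Q1); the tolerances are an assumption that allows for rounding in the printed values.

```python
import math

# Values copied from the Age_grp section of this report.
n = 100
mean, std = 63.19, 13.083025
variance, cv, total = 171.16556, 0.20704266, 6319
minimum, q1, q3, maximum = 31, 55, 73, 88
reported_range, reported_iqr = 57, 18

checks = {
    "sum = mean * n": math.isclose(mean * n, total, rel_tol=1e-9),
    "variance = std^2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "range = max - min": maximum - minimum == reported_range,
    "iqr = q3 - q1": q3 - q1 == reported_iqr,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```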
Histogram with fixed size bins (bins=41)
Common values
Value              Count  Frequency (%)
55                 7      7.0%
60                 5      5.0%
73                 5      5.0%
68                 4      4.0%
78                 4      4.0%
58                 4      4.0%
59                 4      4.0%
69                 4      4.0%
85                 4      4.0%
64                 3      3.0%
Other values (31)  56     56.0%
Minimum 10 values
Value  Count  Frequency (%)
31     2      2.0%
37     1      1.0%
40     3      3.0%
41     2      2.0%
42     1      1.0%
43     1      1.0%
44     2      2.0%
46     2      2.0%
47     1      1.0%
52     2      2.0%
Maximum 10 values
Value  Count  Frequency (%)
88     1      1.0%
87     1      1.0%
85     4      4.0%
82     1      1.0%
81     1      1.0%
79     2      2.0%
78     4      4.0%
77     3      3.0%
76     2      2.0%
75     3      3.0%

SEX
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value 1: 52
Value 0: 48

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
1      52     52.0%
0      48     48.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of SEX value frequencies: 1 = 52 (52.0%), 0 = 48 (48.0%)]

Interactions

[Figures: interaction plots (images not reproduced)]

Correlations

         RID    Age_grp  SEX
RID      1.000  0.000    0.000
Age_grp  0.000  1.000    0.369
SEX      0.000  0.369    1.000

         RID    Age_grp  SEX
RID      1.000  0.022    0.000
Age_grp  0.022  1.000    0.270
SEX      0.000  0.270    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
    RID  Age_grp  SEX
0   1    68       0
1   2    46       1
2   3    78       0
3   4    74       0
4   5    55       1
5   6    59       0
6   7    37       0
7   8    63       1
8   9    42       0
9   10   87       1
Last rows
    RID  Age_grp  SEX
90  91   40       1
91  92   76       1
92  93   55       0
93  94   69       1
94  95   56       0
95  96   71       1
96  97   69       1
97  98   77       1
98  99   76       1
99  100  60       1
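
The dataset description says the age and sex columns can be used to analyze age-group and sex characteristics, with SEX coded as 0 = male and 1 = female. The sketch below illustrates that kind of breakdown on the first ten sample rows only, transcribed as (RID, Age_grp, SEX) tuples. The ten-year age bands and the helper name decade are assumptions made for illustration, and ten rows are far too few to draw conclusions about the cohort.

```python
from collections import Counter

# SEX coding as stated in the dataset description.
SEX_LABELS = {0: "male", 1: "female"}

# First ten sample rows of the report: (RID, Age_grp, SEX).
first_rows = [
    (1, 68, 0), (2, 46, 1), (3, 78, 0), (4, 74, 0), (5, 55, 1),
    (6, 59, 0), (7, 37, 0), (8, 63, 1), (9, 42, 0), (10, 87, 1),
]

def decade(age):
    """Map an age to a ten-year band label (hypothetical grouping)."""
    return f"{age // 10 * 10}s"

by_sex = Counter(SEX_LABELS[sex] for _, _, sex in first_rows)
by_band_and_sex = Counter(
    (decade(age), SEX_LABELS[sex]) for _, age, sex in first_rows
)

print(dict(by_sex))
for (band, sex), count in sorted(by_band_and_sex.items()):
    print(band, sex, count)
```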