Overview

Dataset statistics

Number of variables: 4
Number of observations: 398
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.0 KiB
Average record size in memory: 33.3 B
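
The counts above can be cross-checked with simple arithmetic. The sketch below assumes, based on the sample rows later in the report, that each observation is one year/age/sex combination; the 101 distinct ages, 2 years, and 2 sexes are taken from the per-variable sections, and the variable names are illustrative only.

```python
# Figures copied from the report.
n_variables = 4
n_observations = 398
distinct_years, distinct_ages, distinct_sexes = 2, 101, 2

# Every cell is one (row, column) pair; the report finds none of them missing.
total_cells = n_variables * n_observations

# Assumption: one row per year/age/sex combination, so a complete grid
# would hold years * ages * sexes rows.
full_grid = distinct_years * distinct_ages * distinct_sexes
absent_combinations = full_grid - n_observations

print(total_cells)                     # 1592
print(full_grid, absent_combinations)  # 404 6
```

Under that assumption, six combinations have no row at all, which is different from a missing cell: the report counts only empty values inside existing rows.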

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

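Description: Number of type 1 diabetes patients, 2021 to 2022.
1. Based on date of treatment (Korean medicine classifications and pharmacies excluded); age in 1-year units, as of year-end.
2. Number of patients treated under primary diagnosis code E10.
- Based on National Health Insurance benefit records (Medical Aid excluded); non-covered services are excluded.
- Reflects payments made through June 2023.
3. The data is extracted by primary diagnosis from claims in which a medical institution assigned a preliminary diagnosis based on the patient's complaints and symptoms before the diagnosis was confirmed, so it may differ from the final confirmed disease.
※ Provided in response to a request from a member of the public; extracted on 2023-09-13.
Author: National Health Insurance Service (국민건강보험공단)
URL: https://www.data.go.kr/data/15122824/fileData.do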

Reproduction

Analysis started: 2023-12-12 17:05:14.322246
Analysis finished: 2023-12-12 17:05:14.775751
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

진료년도 (treatment year)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB
2022년 (year 2022): 200
2021년 (year 2021): 198

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021년
2nd row: 2021년
3rd row: 2021년
4th row: 2021년
5th row: 2021년

Common Values

Value     Count   Frequency (%)
2022년    200     50.3%
2021년    198     49.7%
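
The Count and Frequency (%) columns follow directly from the value counts. As a minimal sketch (standard library only; `frequency_table` is a hypothetical helper, not part of ydata-profiling), the table for 진료년도 can be rebuilt from the two counts reported above:

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows, most common value first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, round(100 * count / total, 1))
        for value, count in counts.most_common()
    ]


# Rebuild the column from the reported counts (200 and 198 rows).
years = ["2022년"] * 200 + ["2021년"] * 198
table = frequency_table(years)

for value, count, percent in table:
    print(value, count, f"{percent}%")
```

The percentages are shares of all 398 observations, which is why two nearly balanced categories come out as 50.3% and 49.7% rather than exactly 50%.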

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot: 2022년 200 (50.3%), 2021년 198 (49.7%)]

나이 (age; values carry the suffix 세, "years of age", and the top bucket is 100세 이상, "100 or older")
Text

Distinct: 101
Distinct (%): 25.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9221106
Min length: 2

Characters and Unicode

Total characters: 1163
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
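
As an illustration of that idea, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in two values from this column. It is not the profiler's own implementation, and `category_counts` is a hypothetical helper. The two-letter codes map to the category names used in this report: `Nd` is Decimal Number, `Lo` is Other Letter, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values that appear in the 나이 column.
sample = ["0세", "100세 이상"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(code, count)
```

Digits fall under `Nd`, the Hangul syllables 세, 이, and 상 under `Lo`, and the single space inside "100세 이상" under `Zs`, which matches the three distinct categories reported for this variable.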

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0세
2nd row: 1세
3rd row: 1세
4th row: 2세
5th row: 2세
Value               Count   Frequency (%)
50세                4       1.0%
1세                 4       1.0%
70세                4       1.0%
69세                4       1.0%
68세                4       1.0%
67세                4       1.0%
66세                4       1.0%
65세                4       1.0%
64세                4       1.0%
63세                4       1.0%
Other values (92)   360     90.0%

Most occurring characters

Value              Count   Frequency (%)
세                 398     34.2%
1                  82      7.1%
5                  80      6.9%
2                  80      6.9%
4                  80      6.9%
3                  80      6.9%
6                  80      6.9%
8                  80      6.9%
7                  79      6.8%
9                  75      6.4%
Other values (4)   49      4.2%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    759     65.3%
Other Letter      402     34.6%
Space Separator   2       0.2%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       82      10.8%
5       80      10.5%
2       80      10.5%
4       80      10.5%
3       80      10.5%
6       80      10.5%
8       80      10.5%
7       79      10.4%
9       75      9.9%
0       43      5.7%
Other Letter
Value   Count   Frequency (%)
세      398     99.0%
이      2       0.5%
상      2       0.5%
Space Separator
Value     Count   Frequency (%)
(space)   2       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   761     65.4%
Hangul   402     34.6%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       82      10.8%
5       80      10.5%
2       80      10.5%
4       80      10.5%
3       80      10.5%
6       80      10.5%
8       80      10.5%
7       79      10.4%
9       75      9.9%
0       43      5.7%
Hangul
Value   Count   Frequency (%)
세      398     99.0%
이      2       0.5%
상      2       0.5%
0.5%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    761     65.4%
Hangul   402     34.6%

Most frequent character per block

Hangul
Value   Count   Frequency (%)
세      398     99.0%
이      2       0.5%
상      2       0.5%
ASCII
Value   Count   Frequency (%)
1       82      10.8%
5       80      10.5%
2       80      10.5%
4       80      10.5%
3       80      10.5%
6       80      10.5%
8       80      10.5%
7       79      10.4%
9       75      9.9%
0       43      5.7%
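
Because 나이 is profiled as Text rather than as a number, any numeric analysis by age needs a conversion step first. The following is a minimal sketch of one way to do that, assuming every value is either digits followed by 세 or the open-ended form "100세 이상" seen in the sample rows; `parse_age` is a hypothetical helper and is not part of the dataset or the profiler.

```python
import re

# Digits, the suffix 세, and an optional " 이상" ("or older") marker.
_AGE_PATTERN = re.compile(r"^(\d+)세( 이상)?$")


def parse_age(text):
    """Return (age, is_open_ended) for values such as '0세' or '100세 이상'."""
    match = _AGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised age value: {text!r}")
    return int(match.group(1)), match.group(2) is not None


for raw in ["0세", "50세", "100세 이상"]:
    print(raw, parse_age(raw))
```

Returning the open-ended flag separately keeps "100세 이상" from being treated as exactly 100 when ages are later averaged or binned.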

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB
남자 (male): 199
여자 (female): 199

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 남자
3rd row: 여자
4th row: 남자
5th row: 여자

Common Values

Value   Count   Frequency (%)
남자    199     50.0%
여자    199     50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot: 남자 199 (50.0%), 여자 199 (50.0%)]

진료인원(명) (number of patients treated, in persons)
Real number (ℝ)

Distinct: 258
Distinct (%): 64.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 225.65829
Minimum: 1
Maximum: 583
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 107.25
Median: 244
Q3: 308
95-th percentile: 468.3
Maximum: 583
Range: 582
Interquartile range (IQR): 200.75

Descriptive statistics

Standard deviation: 140.02306
Coefficient of variation (CV): 0.62050926
Kurtosis: -0.65915922
Mean: 225.65829
Median Absolute Deviation (MAD): 90
Skewness: 0.046718852
Sum: 89812
Variance: 19606.457
Monotonicity: Not monotonic
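
Several of these statistics are derived from one another, so the reported figures can be checked for internal consistency. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the values printed in this report; the variable names and the relative tolerance of 1e-6 are assumptions chosen for illustration, and the raw data is not needed.

```python
import math

# Values copied from the report for 진료인원(명).
n = 398
total = 89812
mean = 225.65829
std = 140.02306
variance = 19606.457
cv = 0.62050926
q1, q3 = 107.25, 308
minimum, maximum = 1, 583

# Relationships between the reported statistics.
derived = {
    "mean": total / n,             # sum / number of observations
    "cv": std / mean,              # standard deviation / mean
    "variance": std ** 2,          # square of the standard deviation
    "range": maximum - minimum,    # maximum - minimum
    "iqr": q3 - q1,                # Q3 - Q1
}
reported = {"mean": mean, "cv": cv, "variance": variance,
            "range": 582, "iqr": 200.75}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.6f} reported={reported[name]} match={ok}")
```

A CV of about 0.62 means the standard deviation is roughly 62% of the mean, which reflects how widely the patient counts vary across age groups.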
[Figure: histogram with fixed size bins (bins=50)]
Value                Count   Frequency (%)
1                    12      3.0%
2                    7       1.8%
274                  6       1.5%
3                    5       1.3%
253                  5       1.3%
10                   4       1.0%
238                  4       1.0%
277                  4       1.0%
281                  4       1.0%
235                  4       1.0%
Other values (248)   343     86.2%
Minimum 10 values
Value   Count   Frequency (%)
1       12      3.0%
2       7       1.8%
3       5       1.3%
5       1       0.3%
6       1       0.3%
7       1       0.3%
8       1       0.3%
10      4       1.0%
11      2       0.5%
12      1       0.3%
Maximum 10 values
Value   Count   Frequency (%)
583     1       0.3%
554     1       0.3%
538     1       0.3%
535     1       0.3%
532     1       0.3%
528     1       0.3%
521     1       0.3%
520     1       0.3%
518     1       0.3%
514     1       0.3%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
                        진료년도   성별    진료인원(명)
진료년도 (year)         1.000      0.000   0.000
성별 (sex)              0.000      1.000   0.490
진료인원(명) (patients) 0.000      0.490   1.000

[Figure: correlation heatmap]
                  성별    진료년도
성별 (sex)        1.000   0.000
진료년도 (year)   0.000   1.000

[Figure: correlation heatmap]
                        진료인원(명)   진료년도   성별
진료인원(명) (patients) 1.000          0.000      0.373
진료년도 (year)         0.000          1.000      0.000
성별 (sex)              0.373          0.000      1.000

Missing values

[Figure: nullity bar chart. A simple visualization of nullity by column.]
[Figure: nullity matrix. Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

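First rows
Index   진료년도 (year)   나이 (age)   성별 (sex)   진료인원(명) (patients)
0       2021년            0세          남자         1
1       2021년            1세          남자         1
2       2021년            1세          여자         1
3       2021년            2세          남자         5
4       2021년            2세          여자         10
5       2021년            3세          남자         14
6       2021년            3세          여자         17
7       2021년            4세          남자         20
8       2021년            4세          여자         18
9       2021년            5세          남자         26

Last rows
Index   진료년도 (year)   나이 (age)   성별 (sex)   진료인원(명) (patients)
388     2022년            95세         남자         1
389     2022년            95세         여자         11
390     2022년            96세         남자         1
391     2022년            96세         여자         10
392     2022년            97세         남자         2
393     2022년            97세         여자         1
394     2022년            98세         남자         1
395     2022년            98세         여자         3
396     2022년            99세         여자         1
397     2022년            100세 이상   남자         1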