Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 546.9 KiB
Average record size in memory: 56.0 B
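
As a quick arithmetic check, the average record size follows from the total size in memory and the number of observations. The sketch below uses only the figures reported above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_kib = 546.9      # total size in memory, in KiB
n_rows = 10000         # number of observations

# 1 KiB = 1024 bytes, so the mean bytes per record is:
avg_bytes = total_kib * 1024 / n_rows

print(f"{avg_bytes:.1f} B per record")
```

The result rounds to the 56.0 B shown in the report.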

Variable types

Categorical: 3
Text: 3

Dataset

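Description:
1. Treatment year (진료년도): 2022, based on date of treatment
2. Patient address (수진자 주소지): Dalseo-gu, Daegu Metropolitan City (based on the resident registration address)
3. Age (연령): 5-year bands, as of year end
4. Principal diagnosis code and name (주상병코드, 주상병명): diagnosis codes and names claimed during 2022, grouped by patient address
5. Number of patients (진료인원): number of patients treated in each category
- Health insurance benefit records only (Medical Aid excluded); Korean medicine classifications, pharmacies, and non-covered services are excluded
- Reflects payments made through June 2023
- To protect personal information, counts below 5 at the si/gun/gu level are masked as '*'
※ These disease statistics are extracted by principal diagnosis from claims in which care institutions assigned a primary diagnosis based on complaints and symptoms before the diagnosis was confirmed, so they may differ from the finally confirmed disease.
※ Data extracted and compiled on 2024-03-13 in response to a citizen's public data request.
Author: 국민건강보험공단 (National Health Insurance Service)
URL: https://www.data.go.kr/data/15127191/fileData.do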
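
Because counts below 5 are masked as '*', the 진료인원(명) column cannot be read directly as integers. The following is a minimal sketch of one way to parse it, assuming the mask is exactly the single character '*'; `parse_patient_count` is a hypothetical helper, not part of the dataset or of ydata-profiling.

```python
from typing import Optional

MASK = "*"  # marker the description says replaces counts below 5


def parse_patient_count(raw: str) -> Optional[int]:
    """Return the count as an int, or None when the cell is masked."""
    text = raw.strip()
    if text == MASK:
        return None
    return int(text)


# First five values of 진료인원(명) as shown in this report's sample.
sample = ["*", "3441", "*", "256", "*"]
parsed = [parse_patient_count(v) for v in sample]

known_total = sum(v for v in parsed if v is not None)  # sum of unmasked counts
masked_rows = sum(v is None for v in parsed)           # rows hidden by the mask
```

Keeping masked cells as `None` rather than 0 avoids understating totals; each masked row stands for some count below 5.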

Alerts

진료년도 has constant value "2022년" (Constant)
수진자 주소지 has constant value "달서구" (Constant)
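
A constant column carries no information for analysis and is usually dropped before modelling. The sketch below shows one way such columns could be detected in plain Python, assuming rows are dictionaries keyed by column name; `constant_columns` is a hypothetical helper, not the check ydata-profiling itself runs.

```python
def constant_columns(rows):
    """Return the names of columns whose value is identical in every row."""
    if not rows:
        return []
    first = rows[0]
    return [
        name
        for name in first
        if all(row[name] == first[name] for row in rows)
    ]


# Two rows taken from the sample shown in this report.
rows = [
    {"진료년도": "2022년", "수진자 주소지": "달서구", "연령": "15-19세"},
    {"진료년도": "2022년", "수진자 주소지": "달서구", "연령": "50-54세"},
]
constant = constant_columns(rows)
```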

Reproduction

Analysis started: 2024-03-16 04:18:44.033879
Analysis finished: 2024-03-16 04:18:45.572569
Duration: 1.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

진료년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2022년 | 10000

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022년
2nd row: 2022년
3rd row: 2022년
4th row: 2022년
5th row: 2022년

Common Values

Value | Count | Frequency (%)
2022년 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

수진자 주소지
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
달서구 | 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 달서구
2nd row: 달서구
3rd row: 달서구
4th row: 달서구
5th row: 달서구

Common Values

Value | Count | Frequency (%)
달서구 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

연령
Categorical

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
60-64세 | 624
45-49세 | 610
50-54세 | 607
65-69세 | 602
40-44세 | 587
Other values (16) | 6970

Length

Max length: 7
Median length: 6
Mean length: 5.8593
Min length: 4
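
The length statistics count characters per value. The sketch below shows how such a summary could be computed with the standard library, assuming length means the number of Unicode code points (what Python's `len` returns for a string); `length_summary` is a hypothetical helper, and the three inputs are age bands that appear elsewhere in this report.

```python
from statistics import mean, median


def length_summary(values):
    """Summarise string lengths (in code points) for a list of values."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# "0-4세" has 4 characters, the other two have 6.
summary = length_summary(["0-4세", "15-19세", "50-54세"])
```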

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 15-19세
2nd row: 50-54세
3rd row: 30-34세
4th row: 35-39세
5th row: 60-64세

Common Values

Value | Count | Frequency (%)
60-64세 | 624 | 6.2%
45-49세 | 610 | 6.1%
50-54세 | 607 | 6.1%
65-69세 | 602 | 6.0%
40-44세 | 587 | 5.9%
55-59세 | 583 | 5.8%
25-29세 | 571 | 5.7%
35-39세 | 568 | 5.7%
70-74세 | 566 | 5.7%
30-34세 | 560 | 5.6%
Other values (11) | 4122 | 41.2%
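
The common-values table lists the ten most frequent values and folds the rest into an "Other values" row. The sketch below reproduces that layout with `collections.Counter`; it is an illustration under that assumption, not ydata-profiling's own code, and `common_values` is a hypothetical name.

```python
from collections import Counter


def common_values(values, top=10):
    """Return (value, count, percent) rows, with the tail grouped as 'Other values (n)'."""
    counts = Counter(values)
    total = sum(counts.values())
    ranked = counts.most_common(top)
    rows = [(value, count, 100 * count / total) for value, count in ranked]
    remaining = len(counts) - len(ranked)  # distinct values not shown individually
    if remaining > 0:
        other = total - sum(count for _, count in ranked)
        rows.append((f"Other values ({remaining})", other, 100 * other / total))
    return rows


# Consistency check on the table above: the ten listed counts and the remainder.
top10 = [624, 610, 607, 602, 587, 583, 571, 568, 566, 560]
other = 10000 - sum(top10)
```

With 21 distinct age bands and 10 shown, 11 remain, and the remainder of 4122 rows matches the 41.2% in the table.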

Length

[Figure: histogram of lengths of the category]

주상병코드
Text

Distinct: 1432
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9994
Min length: 2

Characters and Unicode

Total characters: 29994
Distinct characters: 32
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
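
To illustrate what "character properties" means here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample codes shown in this report. The standard library exposes categories ("Lu" is Uppercase Letter, "Nd" is Decimal Number) but not scripts or blocks, so only the category breakdown is sketched; `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


codes = ["H13", "J06", "M61", "N41", "B19"]  # sample rows of 주상병코드
counts = category_counts(codes)
```

Each code is one uppercase letter followed by two digits, which is why the report finds roughly one third Uppercase Letter and two thirds Decimal Number.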

Unique

Unique: 147
Unique (%): 1.5%
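
"Distinct" and "Unique" measure different things. The sketch below assumes the usual ydata-profiling meaning, where Distinct is the number of different values and Unique is the number of values that occur exactly once; `distinct_and_unique` is a hypothetical helper written for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return distinct, unique


# "N17" repeats, so it is distinct but not unique.
result = distinct_and_unique(["N17", "N17", "R63", "J18"])
```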

Sample

1st row: H13
2nd row: J06
3rd row: M61
4th row: N41
5th row: B19

Value | Count | Frequency (%)
n17 | 18 | 0.2%
r63 | 17 | 0.2%
t81 | 16 | 0.2%
j18 | 16 | 0.2%
r11 | 16 | 0.2%
k21 | 15 | 0.1%
m71 | 15 | 0.1%
k03 | 15 | 0.1%
n31 | 15 | 0.1%
r06 | 15 | 0.1%
Other values (1422) | 9842 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 2583 | 8.6%
1 | 2326 | 7.8%
2 | 2272 | 7.6%
3 | 2218 | 7.4%
4 | 2004 | 6.7%
6 | 1892 | 6.3%
5 | 1860 | 6.2%
8 | 1683 | 5.6%
7 | 1625 | 5.4%
9 | 1525 | 5.1%
Other values (22) | 10006 | 33.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 19988 | 66.6%
Uppercase Letter | 10006 | 33.4%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
R | 773 | 7.7%
S | 751 | 7.5%
M | 663 | 6.6%
K | 651 | 6.5%
N | 635 | 6.3%
H | 598 | 6.0%
L | 567 | 5.7%
I | 544 | 5.4%
D | 538 | 5.4%
J | 502 | 5.0%
Other values (12) | 3784 | 37.8%

Decimal Number
Value | Count | Frequency (%)
0 | 2583 | 12.9%
1 | 2326 | 11.6%
2 | 2272 | 11.4%
3 | 2218 | 11.1%
4 | 2004 | 10.0%
6 | 1892 | 9.5%
5 | 1860 | 9.3%
8 | 1683 | 8.4%
7 | 1625 | 8.1%
9 | 1525 | 7.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 19988 | 66.6%
Latin | 10006 | 33.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
R | 773 | 7.7%
S | 751 | 7.5%
M | 663 | 6.6%
K | 651 | 6.5%
N | 635 | 6.3%
H | 598 | 6.0%
L | 567 | 5.7%
I | 544 | 5.4%
D | 538 | 5.4%
J | 502 | 5.0%
Other values (12) | 3784 | 37.8%

Common
Value | Count | Frequency (%)
0 | 2583 | 12.9%
1 | 2326 | 11.6%
2 | 2272 | 11.4%
3 | 2218 | 11.1%
4 | 2004 | 10.0%
6 | 1892 | 9.5%
5 | 1860 | 9.3%
8 | 1683 | 8.4%
7 | 1625 | 8.1%
9 | 1525 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 29994 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 2583 | 8.6%
1 | 2326 | 7.8%
2 | 2272 | 7.6%
3 | 2218 | 7.4%
4 | 2004 | 6.7%
6 | 1892 | 6.3%
5 | 1860 | 6.2%
8 | 1683 | 5.6%
7 | 1625 | 5.4%
9 | 1525 | 5.1%
Other values (22) | 10006 | 33.4%

주상병명
Text

Distinct: 1432
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 24
Mean length: 10.4408
Min length: 1

Characters and Unicode

Total characters: 104408
Distinct characters: 497
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 5

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 147
Unique (%): 1.5%

Sample

1st row: 달리분류된질환에서의결막의장애
2nd row: 다발성및상세불명부위의급성상기도감염
3rd row: 근육의석회화및골화
4th row: 전립선의염증성질환
5th row: 상세불명의바이러스간염

Value | Count | Frequency (%)
급성신부전 | 18 | 0.2%
음식및수액섭취에관계된증상및징후 | 17 | 0.2%
달리분류되지않은처치의합병증 | 16 | 0.2%
상세불명병원체의폐렴 | 16 | 0.2%
구역및구토 | 16 | 0.2%
위-식도역류병 | 15 | 0.1%
기타윤활낭병증 | 15 | 0.1%
치아경조직의기타질환 | 15 | 0.1%
달리분류되지않은방광의신경근육기능장애 | 15 | 0.1%
호흡의이상 | 15 | 0.1%
Other values (1422) | 9842 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
 | 5966 | 5.7%
 | 4136 | 4.0%
 | 3542 | 3.4%
 | 3022 | 2.9%
 | 2892 | 2.8%
 | 2669 | 2.6%
 | 2359 | 2.3%
 | 1765 | 1.7%
 | 1726 | 1.7%
 | 1715 | 1.6%
Other values (487) | 74616 | 71.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 102997 | 98.6%
Other Punctuation | 514 | 0.5%
Open Punctuation | 205 | 0.2%
Close Punctuation | 205 | 0.2%
Decimal Number | 178 | 0.2%
Dash Punctuation | 141 | 0.1%
Uppercase Letter | 135 | 0.1%
Math Symbol | 33 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5966 | 5.8%
 | 4136 | 4.0%
 | 3542 | 3.4%
 | 3022 | 2.9%
 | 2892 | 2.8%
 | 2669 | 2.6%
 | 2359 | 2.3%
 | 1765 | 1.7%
 | 1726 | 1.7%
 | 1715 | 1.7%
Other values (457) | 73205 | 71.1%

Uppercase Letter
Value | Count | Frequency (%)
N | 23 | 17.0%
B | 14 | 10.4%
T | 13 | 9.6%
K | 13 | 9.6%
U | 12 | 8.9%
A | 12 | 8.9%
I | 12 | 8.9%
O | 10 | 7.4%
D | 10 | 7.4%
S | 10 | 7.4%

Decimal Number
Value | Count | Frequency (%)
1 | 50 | 28.1%
9 | 34 | 19.1%
2 | 28 | 15.7%
0 | 24 | 13.5%
7 | 18 | 10.1%
6 | 12 | 6.7%
3 | 6 | 3.4%
8 | 6 | 3.4%

Other Punctuation
Value | Count | Frequency (%)
, | 489 | 95.1%
/ | 13 | 2.5%
 | 6 | 1.2%
. | 6 | 1.2%

Open Punctuation
Value | Count | Frequency (%)
[ | 152 | 74.1%
( | 53 | 25.9%

Close Punctuation
Value | Count | Frequency (%)
] | 152 | 74.1%
) | 53 | 25.9%

Math Symbol
Value | Count | Frequency (%)
~ | 27 | 81.8%
+ | 6 | 18.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 141 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 102975 | 98.6%
Common | 1276 | 1.2%
Latin | 135 | 0.1%
Han | 22 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5966 | 5.8%
 | 4136 | 4.0%
 | 3542 | 3.4%
 | 3022 | 2.9%
 | 2892 | 2.8%
 | 2669 | 2.6%
 | 2359 | 2.3%
 | 1765 | 1.7%
 | 1726 | 1.7%
 | 1715 | 1.7%
Other values (452) | 73183 | 71.1%

Common
Value | Count | Frequency (%)
, | 489 | 38.3%
[ | 152 | 11.9%
] | 152 | 11.9%
- | 141 | 11.1%
( | 53 | 4.2%
) | 53 | 4.2%
1 | 50 | 3.9%
9 | 34 | 2.7%
2 | 28 | 2.2%
~ | 27 | 2.1%
Other values (9) | 97 | 7.6%

Latin
Value | Count | Frequency (%)
N | 23 | 17.0%
B | 14 | 10.4%
T | 13 | 9.6%
K | 13 | 9.6%
U | 12 | 8.9%
A | 12 | 8.9%
I | 12 | 8.9%
O | 10 | 7.4%
D | 10 | 7.4%
S | 10 | 7.4%

Han
Value | Count | Frequency (%)
 | 8 | 36.4%
 | 8 | 36.4%
 | 2 | 9.1%
 | 2 | 9.1%
 | 2 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 102975 | 98.6%
ASCII | 1405 | 1.3%
CJK | 20 | < 0.1%
Punctuation | 6 | < 0.1%
CJK Compat Ideographs | 2 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 5966 | 5.8%
 | 4136 | 4.0%
 | 3542 | 3.4%
 | 3022 | 2.9%
 | 2892 | 2.8%
 | 2669 | 2.6%
 | 2359 | 2.3%
 | 1765 | 1.7%
 | 1726 | 1.7%
 | 1715 | 1.7%
Other values (452) | 73183 | 71.1%

ASCII
Value | Count | Frequency (%)
, | 489 | 34.8%
[ | 152 | 10.8%
] | 152 | 10.8%
- | 141 | 10.0%
( | 53 | 3.8%
) | 53 | 3.8%
1 | 50 | 3.6%
9 | 34 | 2.4%
2 | 28 | 2.0%
~ | 27 | 1.9%
Other values (19) | 226 | 16.1%

CJK
Value | Count | Frequency (%)
 | 8 | 40.0%
 | 8 | 40.0%
 | 2 | 10.0%
 | 2 | 10.0%

Punctuation
Value | Count | Frequency (%)
 | 6 | 100.0%

CJK Compat Ideographs
Value | Count | Frequency (%)
 | 2 | 100.0%

진료인원(명)
Text

Distinct: 1126
Distinct (%): 11.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 1.7898
Min length: 1

Characters and Unicode

Total characters: 17898
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 620
Unique (%): 6.2%

Sample

1st row: *
2nd row: 3441
3rd row: *
4th row: 256
5th row: *

Value | Count | Frequency (%)
* | 3540 | 35.4%
5 | 322 | 3.2%
6 | 255 | 2.5%
7 | 226 | 2.3%
8 | 210 | 2.1%
9 | 182 | 1.8%
10 | 168 | 1.7%
11 | 143 | 1.4%
12 | 128 | 1.3%
13 | 123 | 1.2%
Other values (1116) | 4703 | 47.0%

Most occurring characters

Value | Count | Frequency (%)
* | 3540 | 19.8%
1 | 2862 | 16.0%
2 | 1870 | 10.4%
3 | 1429 | 8.0%
5 | 1422 | 7.9%
6 | 1278 | 7.1%
4 | 1221 | 6.8%
7 | 1197 | 6.7%
8 | 1101 | 6.2%
9 | 1025 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 14358 | 80.2%
Other Punctuation | 3540 | 19.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 2862 | 19.9%
2 | 1870 | 13.0%
3 | 1429 | 10.0%
5 | 1422 | 9.9%
6 | 1278 | 8.9%
4 | 1221 | 8.5%
7 | 1197 | 8.3%
8 | 1101 | 7.7%
9 | 1025 | 7.1%
0 | 953 | 6.6%

Other Punctuation
Value | Count | Frequency (%)
* | 3540 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 17898 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
* | 3540 | 19.8%
1 | 2862 | 16.0%
2 | 1870 | 10.4%
3 | 1429 | 8.0%
5 | 1422 | 7.9%
6 | 1278 | 7.1%
4 | 1221 | 6.8%
7 | 1197 | 6.7%
8 | 1101 | 6.2%
9 | 1025 | 5.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17898 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
* | 3540 | 19.8%
1 | 2862 | 16.0%
2 | 1870 | 10.4%
3 | 1429 | 8.0%
5 | 1422 | 7.9%
6 | 1278 | 7.1%
4 | 1221 | 6.8%
7 | 1197 | 6.7%
8 | 1101 | 6.2%
9 | 1025 | 5.7%

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion]
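
Nullity here means truly absent values. The sketch below counts missing cells per column in plain Python, assuming a cell is missing when it is `None` or an empty string; `missing_per_column` is a hypothetical helper. Under that assumption a masked '*' is an ordinary string, which is consistent with the report showing 0 missing cells even though 3540 rows of 진료인원(명) are masked.

```python
def missing_per_column(rows):
    """Count cells that are None or empty, per column, over a list of dict rows."""
    columns = list(rows[0]) if rows else []
    return {
        column: sum(1 for row in rows if row.get(column) in (None, ""))
        for column in columns
    }


# Two rows from this report's sample: the masked '*' is not counted as missing.
rows = [
    {"주상병코드": "H13", "진료인원(명)": "*"},
    {"주상병코드": "J06", "진료인원(명)": "3441"},
]
missing = missing_per_column(rows)
```

If masked counts should be treated as missing for analysis, they would need to be converted before profiling.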

Sample

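Index | 진료년도 | 수진자 주소지 | 연령 | 주상병코드 | 주상병명 | 진료인원(명)
2355 | 2022년 | 달서구 | 15-19세 | H13 | 달리분류된질환에서의결막의장애 | *
9585 | 2022년 | 달서구 | 50-54세 | J06 | 다발성및상세불명부위의급성상기도감염 | 3441
5538 | 2022년 | 달서구 | 30-34세 | M61 | 근육의석회화및골화 | *
6625 | 2022년 | 달서구 | 35-39세 | N41 | 전립선의염증성질환 | 256
11334 | 2022년 | 달서구 | 60-64세 | B19 | 상세불명의바이러스간염 | *
15107 | 2022년 | 달서구 | 75-79세 | M18 | 제1수근중수관절의관절증 | *
8527 | 2022년 | 달서구 | 45-49세 | J30 | 혈관운동성및앨러지성비염 | 3621
17191 | 2022년 | 달서구 | 90-94세 | A02 | 기타살모넬라감염 | *
8420 | 2022년 | 달서구 | 45-49세 | H47 | 시[제2]신경및시각경로의기타장애 | 15
16958 | 2022년 | 달서구 | 85-89세 | N18 | 만성신장병 | 188

Index | 진료년도 | 수진자 주소지 | 연령 | 주상병코드 | 주상병명 | 진료인원(명)
14703 | 2022년 | 달서구 | 75-79세 | E34 | 기타내분비장애 | *
4204 | 2022년 | 달서구 | 25-29세 | G53 | 달리분류된질환에서의뇌신경장애 | 24
16789 | 2022년 | 달서구 | 85-89세 | K03 | 치아경조직의기타질환 | 144
17192 | 2022년 | 달서구 | 90-94세 | A04 | 기타세균성장감염 | 9
17078 | 2022년 | 달서구 | 85-89세 | S21 | 흉부의열린상처 | *
460 | 2022년 | 달서구 | 0-4세 | Q74 | 사지의기타선천기형 | 7
2717 | 2022년 | 달서구 | 15-19세 | Q18 | 얼굴및목의기타선천기형 | 11
2327 | 2022년 | 달서구 | 15-19세 | G51 | 안면신경장애 | 10
13473 | 2022년 | 달서구 | 65-69세 | Z91 | 달리분류되지않은위험요인의개인력 | *
6440 | 2022년 | 달서구 | 35-39세 | K31 | 위및십이지장의기타질환 | 50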