Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 535
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
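
The percentage and per-record figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming that "Missing cells (%)" means missing cells divided by variables times observations and that 1 KiB is 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the Dataset statistics table.
n_variables = 7
n_observations = 10_000
missing_cells = 535
total_size_kib = 634.8

# Share of missing cells among all cells (assumed definition).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 0.8%
print(f"{avg_record_bytes:.1f} B")  # 65.0 B
```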

Variable types

Categorical: 5
Text: 1
Numeric: 1

Dataset

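Description:
1. Based on treatment date (Korean-medicine classifications and pharmacies excluded); age is as of year-end.
2. National Health Insurance benefit records (Medical Aid excluded); non-covered services are excluded. Payments through June 2023 are reflected.
3. The disease statistics below are extracted by principal diagnosis and first secondary diagnosis from claims in which healthcare institutions assigned a primary diagnosis based on the patient's complaints and symptoms before the diagnosis was confirmed, so they may differ from the final confirmed disease.
Principal diagnosis codes and first secondary diagnosis codes:
- A: F70-99, R62
- B: G40-47, G70-73, G80-83, Q00-07, Q85, Q87, Q90-99
* For a detailed description of each code, see the KOICD disease classification information center (https://www.koicd.kr).
※ District-level (시군구) records with fewer than 5 persons are left blank to protect personal information.
※ Data extracted on 2023.12.4. and provided in response to a citizen's request.
Author: National Health Insurance Service (국민건강보험공단)
URL: https://www.data.go.kr/data/15125351/fileData.do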
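
The description defines the two code groups by ranges of diagnosis codes. A minimal sketch of that mapping, assuming codes are written as one letter followed by two digits with an optional suffix (for example "F84.0"); the function name `code_group` is hypothetical, and the dataset itself stores only the group label A or B:

```python
import re
from typing import Optional

# Code ranges copied from the dataset description: (letter, first, last).
GROUP_RANGES = {
    "A": [("F", 70, 99), ("R", 62, 62)],
    "B": [("G", 40, 47), ("G", 70, 73), ("G", 80, 83),
          ("Q", 0, 7), ("Q", 85, 85), ("Q", 87, 87), ("Q", 90, 99)],
}

def code_group(code: str) -> Optional[str]:
    """Return "A" or "B" for a code such as "F84.0", or None if it is in neither group."""
    match = re.match(r"([A-Z])(\d{2})", code.strip().upper())
    if match is None:
        return None
    letter, number = match.group(1), int(match.group(2))
    for group, ranges in GROUP_RANGES.items():
        for range_letter, first, last in ranges:
            if letter == range_letter and first <= number <= last:
                return group
    return None

print(code_group("F84.0"))  # A
print(code_group("G40"))    # B
print(code_group("Q86"))    # None
```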

Alerts

진료인원(명) (number of patients) has 535 (5.3%) missing values (alert type: Missing)
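
A plausible reading of this alert is that the missing values come from the small-cell suppression stated in the dataset description (district-level counts below 5 are left blank); the smallest value the report lists for this column is 5. A minimal sketch of such a rule, with a hypothetical function name and made-up input counts:

```python
from typing import Optional

SUPPRESSION_THRESHOLD = 5  # "fewer than 5 persons", per the dataset description

def suppress_small_count(count: int) -> Optional[int]:
    """Blank out counts below the threshold; keep all other counts unchanged."""
    return None if count < SUPPRESSION_THRESHOLD else count

raw_counts = [24, 3, 5, 0, 61]  # made-up values for illustration only
published = [suppress_small_count(c) for c in raw_counts]
print(published)  # [24, None, 5, None, 61]
```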

Reproduction

Analysis started: 2023-12-12 00:02:50.789217
Analysis finished: 2023-12-12 00:02:51.651629
Duration: 0.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
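
The reported duration is the difference between the two timestamps. A short check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 00:02:50.789217")
finished = datetime.fromisoformat("2023-12-12 00:02:51.651629")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.86 seconds
```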

Variables

주부상병코드그룹 (principal and first secondary diagnosis code group)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
B: 5036
A: 4964

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: A
3rd row: B
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
B | 5036 | 50.4%
A | 4964 | 49.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common value frequencies listed above]

진료년도 (year of treatment; 년 = year)
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2021년: 2038
2019년: 2007
2020년: 1997
2022년: 1988
2018년: 1970

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021년
2nd row: 2019년
3rd row: 2020년
4th row: 2019년
5th row: 2021년

Common Values

Value | Count | Frequency (%)
2021년 | 2038 | 20.4%
2019년 | 2007 | 20.1%
2020년 | 1997 | 20.0%
2022년 | 1988 | 19.9%
2018년 | 1970 | 19.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common value frequencies listed above]

시도 (province or metropolitan city)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
경기도: 6254
서울특별시: 3746

Length

Max length: 5
Median length: 3
Mean length: 3.7492
Min length: 3
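
The mean length above is the count-weighted average of the two category lengths (3 characters for 경기도, 5 for 서울특별시). A small check using the counts reported for this variable (the variable names are illustrative):

```python
# Category counts copied from the 시도 variable summary.
counts = {"경기도": 6254, "서울특별시": 3746}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_chars, total_rows)  # 37492 10000
print(mean_length)              # 3.7492
```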

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 경기도
3rd row: 서울특별시
4th row: 경기도
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 6254 | 62.5%
서울특별시 | 3746 | 37.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common value frequencies listed above]

시군구 (city, county, or district)
Text

Distinct: 72
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 3
Mean length: 4.1198
Min length: 2

Characters and Unicode

Total characters: 41198
Distinct characters: 80
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
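
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. "Other Letter" corresponds to the category code `Lo` and "Space Separator" to `Zs`. The standard library does not expose script or block properties, so the character name is printed only as a rough hint; the three input values are taken from this column.

```python
import unicodedata
from collections import Counter

# Values taken from the 시군구 column of this report.
values = ["구로구", "과천시", "성남시 분당구"]

# Count the general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)
print(category_counts)  # Counter({'Lo': 12, 'Zs': 1})

# The character name hints at the script (Hangul syllables here).
first_name = unicodedata.name(values[0][0])
print(first_name)
```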

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구로구
2nd row: 과천시
3rd row: 구로구
4th row: 하남시
5th row: 구리시

Value | Count | Frequency (%)
수원시 | 597 | 4.8%
성남시 | 456 | 3.6%
고양시 | 451 | 3.6%
용인시 | 434 | 3.5%
안산시 | 318 | 2.5%
안양시 | 297 | 2.4%
부천시 | 163 | 1.3%
동작구 | 162 | 1.3%
노원구 | 161 | 1.3%
일산서구 | 160 | 1.3%
Other values (66) | 9331 | 74.5%
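
The frequencies in the table above appear to be computed over whitespace-separated tokens rather than over rows: the listed counts sum to 12,530, which equals the 10,000 rows plus the 2,530 spaces in the column, and 597 / 12,530 rounds to 4.8%. A minimal sketch of that kind of token count on a made-up miniature column, followed by the consistency check on the reported numbers:

```python
from collections import Counter

# Made-up miniature column; real values look like "성남시 분당구" or "구로구".
rows = ["수원시 권선구", "수원시 영통구", "구로구", "성남시 분당구"]

# Split each value on whitespace and count the resulting tokens.
token_counts = Counter(token for row in rows for token in row.split())
total_tokens = sum(token_counts.values())
print(token_counts["수원시"], total_tokens)  # 2 7

# Consistency check against the report's own table.
top10 = [597, 456, 451, 434, 318, 297, 163, 162, 161, 160]
report_total = sum(top10) + 9331  # ten listed counts plus "Other values (66)"
print(report_total, f"{100 * 597 / report_total:.1f}%")  # 12530 4.8%
```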

Most occurring characters

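Value | Count | Frequency (%)
(not preserved) | 6566 | 15.9%
(not preserved) | 5974 | 14.5%
(space) | 2530 | 6.1%
(not preserved) | 1497 | 3.6%
(not preserved) | 1215 | 2.9%
(not preserved) | 1194 | 2.9%
(not preserved) | 1072 | 2.6%
(not preserved) | 1052 | 2.6%
(not preserved) | 1043 | 2.5%
(not preserved) | 940 | 2.3%
Other values (70) | 18115 | 44.0%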

Most occurring categories

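Value | Count | Frequency (%)
Other Letter | 38668 | 93.9%
Space Separator | 2530 | 6.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 6566 | 17.0%
(not preserved) | 5974 | 15.4%
(not preserved) | 1497 | 3.9%
(not preserved) | 1215 | 3.1%
(not preserved) | 1194 | 3.1%
(not preserved) | 1072 | 2.8%
(not preserved) | 1052 | 2.7%
(not preserved) | 1043 | 2.7%
(not preserved) | 940 | 2.4%
(not preserved) | 914 | 2.4%
Other values (69) | 17201 | 44.5%

Space Separator
Value | Count | Frequency (%)
(space) | 2530 | 100.0%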

Most occurring scripts

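Value | Count | Frequency (%)
Hangul | 38668 | 93.9%
Common | 2530 | 6.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 6566 | 17.0%
(not preserved) | 5974 | 15.4%
(not preserved) | 1497 | 3.9%
(not preserved) | 1215 | 3.1%
(not preserved) | 1194 | 3.1%
(not preserved) | 1072 | 2.8%
(not preserved) | 1052 | 2.7%
(not preserved) | 1043 | 2.7%
(not preserved) | 940 | 2.4%
(not preserved) | 914 | 2.4%
Other values (69) | 17201 | 44.5%

Common
Value | Count | Frequency (%)
(space) | 2530 | 100.0%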

Most occurring blocks

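Value | Count | Frequency (%)
Hangul | 38668 | 93.9%
ASCII | 2530 | 6.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 6566 | 17.0%
(not preserved) | 5974 | 15.4%
(not preserved) | 1497 | 3.9%
(not preserved) | 1215 | 3.1%
(not preserved) | 1194 | 3.1%
(not preserved) | 1072 | 2.8%
(not preserved) | 1052 | 2.7%
(not preserved) | 1043 | 2.7%
(not preserved) | 940 | 2.4%
(not preserved) | 914 | 2.4%
Other values (69) | 17201 | 44.5%

ASCII
Value | Count | Frequency (%)
(space) | 2530 | 100.0%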

연령 (age; 세 = years old)
Categorical

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
16~18세: 1027
1세: 1018
4세: 1007
2세: 1006
3세: 1004
Other values (5): 4938

Length

Max length: 6
Median length: 2
Mean length: 3.1041
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4세
2nd row: 11~15세
3rd row: 5세
4th row: 4세
5th row: 4세

Common Values

Value | Count | Frequency (%)
16~18세 | 1027 | 10.3%
1세 | 1018 | 10.2%
4세 | 1007 | 10.1%
2세 | 1006 | 10.1%
3세 | 1004 | 10.0%
11~15세 | 1002 | 10.0%
5세 | 992 | 9.9%
6세 | 992 | 9.9%
0세 | 977 | 9.8%
7~10세 | 975 | 9.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common value frequencies listed above]

성별 (sex; 여자 = female, 남자 = male)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
여자: 5006
남자: 4994

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 남자
3rd row: 여자
4th row: 남자
5th row: 여자

Common Values

Value | Count | Frequency (%)
여자 | 5006 | 50.1%
남자 | 4994 | 49.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common value frequencies listed above]

진료인원(명) (number of patients, in persons)
Real number (ℝ)

Alert: MISSING

Distinct: 582
Distinct (%): 6.1%
Missing: 535
Missing (%): 5.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.345061
Minimum: 5
Maximum: 1559
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 7
Q1: 17
Median: 35
Q3: 90
95-th percentile: 313
Maximum: 1559
Range: 1554
Interquartile range (IQR): 73

Descriptive statistics

Standard deviation: 118.32596
Coefficient of variation (CV): 1.4727222
Kurtosis: 18.70616
Mean: 80.345061
Median Absolute Deviation (MAD): 23
Skewness: 3.5634552
Sum: 760466
Variance: 14001.032
Monotonicity: Not monotonic
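
Several of the statistics above can be derived from one another, which gives a quick consistency check. A minimal sketch, assuming the mean and sum are taken over the non-missing rows only; small differences from the reported values come from the rounding of the inputs:

```python
# Values copied from the report for 진료인원(명).
minimum, q1, q3, maximum = 5, 17, 90, 1559
mean, std = 80.345061, 118.32596
n_rows, n_missing = 10_000, 535

value_range = maximum - minimum      # reported: 1554
iqr = q3 - q1                        # reported: 73
cv = std / mean                      # reported: 1.4727222
variance = std ** 2                  # reported: 14001.032
total = mean * (n_rows - n_missing)  # reported: 760466

print(value_range, iqr, round(cv, 4), round(variance, 1), round(total))
```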
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
11 | 217 | 2.2%
15 | 212 | 2.1%
12 | 204 | 2.0%
14 | 200 | 2.0%
16 | 198 | 2.0%
13 | 187 | 1.9%
10 | 185 | 1.8%
19 | 179 | 1.8%
9 | 176 | 1.8%
8 | 174 | 1.7%
Other values (572) | 7533 | 75.3%
(Missing) | 535 | 5.3%

Minimum 10 values

Value | Count | Frequency (%)
5 | 168 | 1.7%
6 | 161 | 1.6%
7 | 164 | 1.6%
8 | 174 | 1.7%
9 | 176 | 1.8%
10 | 185 | 1.8%
11 | 217 | 2.2%
12 | 204 | 2.0%
13 | 187 | 1.9%
14 | 200 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
1559 | 1 | < 0.1%
1427 | 1 | < 0.1%
1231 | 1 | < 0.1%
1224 | 1 | < 0.1%
1215 | 1 | < 0.1%
1189 | 1 | < 0.1%
1039 | 1 | < 0.1%
1020 | 1 | < 0.1%
986 | 1 | < 0.1%
985 | 1 | < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 주부상병코드그룹 | 진료년도 | 시도 | 시군구 | 연령 | 성별 | 진료인원(명)
주부상병코드그룹 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.007 | 0.039
진료년도 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.085
시도 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.008
시군구 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.325
연령 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.400
성별 | 0.007 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.078
진료인원(명) | 0.039 | 0.085 | 0.008 | 0.325 | 0.400 | 0.078 | 1.000

[Figure: correlation heatmap]
Variable | 주부상병코드그룹 | 시도 | 성별 | 연령 | 진료년도
주부상병코드그룹 | 1.000 | 0.000 | 0.005 | 0.000 | 0.000
시도 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
성별 | 0.005 | 0.000 | 1.000 | 0.000 | 0.000
연령 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
진료년도 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 진료인원(명) | 주부상병코드그룹 | 진료년도 | 시도 | 연령 | 성별
진료인원(명) | 1.000 | 0.039 | 0.049 | 0.008 | 0.194 | 0.078
주부상병코드그룹 | 0.039 | 1.000 | 0.000 | 0.000 | 0.000 | 0.005
진료년도 | 0.049 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
시도 | 0.008 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
연령 | 0.194 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
성별 | 0.078 | 0.005 | 0.000 | 0.000 | 0.000 | 1.000
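
The matrices above do not state which association measure they show. Purely as an illustration of how a 0-to-1 association between two categorical columns can be computed, and not as a claim about the report's method, the sketch below implements uncorrected Cramér's V with the standard library. The function name `cramers_v` and the toy inputs are assumptions.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for x, x_n in x_totals.items():
        for y, y_n in y_totals.items():
            expected = x_n * y_n / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: the first pair is perfectly associated, the second is independent.
print(cramers_v(["A", "A", "B", "B"], ["남자", "남자", "여자", "여자"]))  # 1.0
print(cramers_v(["A", "A", "B", "B"], ["남자", "여자", "남자", "여자"]))  # 0.0
```

A value of 1.000 between 시도 and 시군구 is what any such measure would be expected to give when each district belongs to exactly one province.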

Missing values

[Figure: nullity count per column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
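
The per-column nullity shown in these figures amounts to counting missing entries in each column. A minimal sketch on three rows taken from the Sample section (only three of the seven columns are shown, and `None` stands in for `<NA>`):

```python
# Three rows taken from the Sample section; None stands for <NA>.
rows = [
    {"시군구": "구로구", "연령": "4세", "진료인원(명)": 24},
    {"시군구": "양평군", "연령": "0세", "진료인원(명)": None},
    {"시군구": "오산시", "연령": "0세", "진료인원(명)": 10},
]

# Count missing entries per column.
columns = list(rows[0])
missing_per_column = {
    column: sum(row[column] is None for row in rows) for column in columns
}
print(missing_per_column)  # {'시군구': 0, '연령': 0, '진료인원(명)': 1}
```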

Sample

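Index | 주부상병코드그룹 | 진료년도 | 시도 | 시군구 | 연령 | 성별 | 진료인원(명)
11065 | B | 2021년 | 서울특별시 | 구로구 | 4세 | 남자 | 24
2253 | A | 2019년 | 경기도 | 과천시 | 11~15세 | 남자 | 51
9732 | B | 2020년 | 서울특별시 | 구로구 | 5세 | 여자 | 19
2385 | A | 2019년 | 경기도 | 하남시 | 4세 | 남자 | 61
4948 | A | 2021년 | 경기도 | 구리시 | 4세 | 여자 | 17
9381 | B | 2019년 | 경기도 | 양평군 | 0세 | 남자 | <NA>
4652 | A | 2021년 | 경기도 | 성남시 분당구 | 0세 | 여자 | 20
185 | A | 2018년 | 서울특별시 | 도봉구 | 2세 | 여자 | 17
3636 | A | 2020년 | 경기도 | 오산시 | 0세 | 여자 | 10
4014 | A | 2020년 | 경기도 | 양평군 | 1세 | 여자 | <NA>

Index | 주부상병코드그룹 | 진료년도 | 시도 | 시군구 | 연령 | 성별 | 진료인원(명)
10065 | B | 2020년 | 경기도 | 안양시 만안구 | 2세 | 남자 | 8
2127 | A | 2019년 | 경기도 | 안산시 | 1세 | 남자 | <NA>
10 | A | 2018년 | 서울특별시 | 종로구 | 5세 | 남자 | 24
12900 | B | 2022년 | 경기도 | 고양시 덕양구 | 1세 | 남자 | 37
10165 | B | 2020년 | 경기도 | 동두천시 | 2세 | 남자 | <NA>
12598 | B | 2022년 | 경기도 | 수원시 권선구 | 1세 | 남자 | 20
10945 | B | 2021년 | 서울특별시 | 노원구 | 4세 | 남자 | 23
5692 | A | 2022년 | 서울특별시 | 강서구 | 6세 | 남자 | 152
8941 | B | 2019년 | 경기도 | 고양시 일산서구 | 11~15세 | 남자 | 240
12198 | B | 2022년 | 서울특별시 | 중랑구 | 1세 | 남자 | 23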