Overview

Dataset statistics

Number of variables: 8
Number of observations: 8010
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 532.0 KiB
Average record size in memory: 68.0 B

Variable types

Categorical: 4
Text: 1
Numeric: 3

Dataset

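Description: Proportion of patients with hypertension, diabetes, or dyslipidemia among National Health Insurance subscribers and Medical Aid beneficiaries.
○ 기준연도 (reference year): 2009 to 2021
○ 시도 and 시군구 (province and city/county/district): broken down by city/county/district within the 17 provinces nationwide
- Place names are fixed as of the start of each year. When a place name changes during the year, only the residents whose records were updated after the change are reflected.
○ 성별 (sex): ALL (total), 1 (male), 2 (female)
○ 연령그룹 (age group): ALL (total), 00 (under 10), 10 (10 to 19), 20 (20 to 29), 30 (30 to 39), 40 (40 to 49), 50 (50 to 59), 60 (60 to 69), 70 (70 to 79), 80 (80 and over)
○ 분모(명) (denominator, persons): number of National Health Insurance subscribers or Medical Aid beneficiaries
○ 분자(명) (numerator, persons): number of patients with hypertension, diabetes, or dyslipidemia
- Hypertension patient: a patient prescribed an antihypertensive drug together with a primary diagnosis code for hypertension (I10-I15)
- Diabetes patient: a patient prescribed a diabetes drug together with a primary diagnosis code for diabetes (E10-E14)
- Dyslipidemia patient: a patient prescribed a dyslipidemia drug together with a primary diagnosis code for dyslipidemia (E78)
○ 고당지의료이용률 (medical utilization rate for hypertension, diabetes, and dyslipidemia) = 분자 / 분모 × 100 (%)
※ Rows with a denominator below 20 are not provided.
Author: 국민건강보험공단 (National Health Insurance Service)
URL: https://www.data.go.kr/data/15119871/fileData.do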
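
The rate definition in the description (numerator divided by denominator, times 100, with rows suppressed when the denominator is below 20) can be sketched as follows. The function name `utilization_rate`, the constant `MIN_DENOMINATOR`, and the rounding to two decimals are assumptions made for illustration; the description does not state how the published values are rounded.

```python
from typing import Optional

# Disclosure threshold taken from the dataset description:
# rows with a denominator below 20 are not provided.
MIN_DENOMINATOR = 20


def utilization_rate(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator * 100, rounded to two decimals.

    Returns None when the denominator is below the disclosure threshold.
    The two-decimal rounding is an assumption, not a documented rule.
    """
    if denominator < MIN_DENOMINATOR:
        return None
    return round(numerator / denominator * 100, 2)


# Counts copied from two sample rows of this report
# (nationwide, sex 1, age groups 0 and ALL).
print(utilization_rate(1812, 1749041))
print(utilization_rate(7692275, 25898916))
```

Returning `None` for suppressed rows keeps them distinguishable from a genuine rate of zero, which the Zeros alerts show does occur in the data.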

Alerts

기준연도 has constant value "2022" [Constant]
분모(명) is highly overall correlated with 분자(명) [High correlation]
분자(명) is highly overall correlated with 분모(명) and 1 other field [High correlation]
고당지의료이용률 is highly overall correlated with 분자(명) and 1 other field [High correlation]
연령그룹 is highly overall correlated with 고당지의료이용률 [High correlation]
분모(명) is highly skewed (γ1 = 44.2784302) [Skewed]
분자(명) is highly skewed (γ1 = 41.28332342) [Skewed]
분자(명) has 109 (1.4%) zeros [Zeros]
고당지의료이용률 has 109 (1.4%) zeros [Zeros]
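
The Skewed alerts report γ1, the skewness coefficient. A minimal sketch of the moment-based definition, γ1 = m3 / m2^1.5, is given here to show why a few very large counts produce such high values. The function name `skewness` and the toy inputs are illustrative assumptions, and the profiling tool may apply a small-sample bias correction, so its figures need not match this formula exactly.

```python
from typing import Sequence


def skewness(values: Sequence[float]) -> float:
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry to measure
    return m3 / m2 ** 1.5


# Hypothetical toy data: a symmetric series and a series with one large outlier.
symmetric = [1, 2, 3, 4, 5]
one_outlier = [1] * 9 + [1000]

print(skewness(symmetric))
print(skewness(one_outlier))
```

A single value far above the rest pulls the third moment strongly positive, which is the pattern the 분모(명) and 분자(명) maxima suggest.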

Reproduction

Analysis started: 2023-12-16 15:23:10.633918
Analysis finished: 2023-12-16 15:23:20.793684
Duration: 10.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB
2022: 8010

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2022  8010  100.0%

Length

Histogram of lengths of the category


시도
Categorical

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB
경기도: 1290
서울특별시: 780
경상북도: 750
경상남도: 690
전라남도: 690
Other values (13): 3810

Length

Max length: 7
Median length: 5
Mean length: 4.1086142
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 전국

Common Values

Value  Count  Frequency (%)
경기도  1290  16.1%
서울특별시  780  9.7%
경상북도  750  9.4%
경상남도  690  8.6%
전라남도  690  8.6%
강원도  570  7.1%
부산광역시  510  6.4%
충청남도  510  6.4%
전라북도  480  6.0%
충청북도  450  5.6%
Other values (8)  1290  16.1%

Length

Histogram of lengths of the category

시군구
Text

Distinct: 229
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.3745318
Min length: 2

Characters and Unicode

Total characters: 27030
Distinct characters: 144
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Value  Count  Frequency (%)
전체  510  5.7%
중구  180  2.0%
동구  180  2.0%
서구  150  1.7%
북구  150  1.7%
창원시  150  1.7%
남구  150  1.7%
수원시  120  1.3%
청주시  120  1.3%
고양시  90  1.0%
Other values (228)  7170  79.9%

Most occurring characters

Value  Count  Frequency (%)
       3180   11.8%
       3000   11.1%
       2550   9.4%
       960    3.6%
       720    2.7%
       690    2.6%
       690    2.6%
       660    2.4%
       630    2.3%
       600    2.2%
Other values (134)  13350  49.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  26070  96.4%
Space Separator  960  3.6%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
       3180   12.2%
       3000   11.5%
       2550   9.8%
       720    2.8%
       690    2.6%
       690    2.6%
       660    2.5%
       630    2.4%
       600    2.3%
       570    2.2%
Other values (133)  12780  49.0%

Space Separator
Value  Count  Frequency (%)
       960    100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  26070  96.4%
Common  960  3.6%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
       3180   12.2%
       3000   11.5%
       2550   9.8%
       720    2.8%
       690    2.6%
       690    2.6%
       660    2.5%
       630    2.4%
       600    2.3%
       570    2.2%
Other values (133)  12780  49.0%

Common
Value  Count  Frequency (%)
       960    100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  26070  96.4%
ASCII  960  3.6%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
       3180   12.2%
       3000   11.5%
       2550   9.8%
       720    2.8%
       690    2.6%
       690    2.6%
       660    2.5%
       630    2.4%
       600    2.3%
       570    2.2%
Other values (133)  12780  49.0%

ASCII
Value  Count  Frequency (%)
       960    100.0%

성별
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB
1: 2670
2: 2670
ALL: 2670

Length

Max length: 3
Median length: 1
Mean length: 1.6666667
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  2670  33.3%
2  2670  33.3%
ALL  2670  33.3%

Length

Histogram of lengths of the category


연령그룹
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB
0: 801
10: 801
20: 801
30: 801
40: 801
Other values (5): 4005

Length

Max length: 3
Median length: 2
Mean length: 2
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 10
3rd row: 20
4th row: 30
5th row: 40

Common Values

Value  Count  Frequency (%)
0  801  10.0%
10  801  10.0%
20  801  10.0%
30  801  10.0%
40  801  10.0%
50  801  10.0%
60  801  10.0%
70  801  10.0%
80  801  10.0%
ALL  801  10.0%

Length

Histogram of lengths of the category


분모(명)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 7245
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77560.697
Minimum: 158
Maximum: 51893314
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 158
5-th percentile: 1183.45
Q1: 4524.25
Median: 12779
Q3: 31581.5
95-th percentile: 203008.65
Maximum: 51893314
Range: 51893156
Interquartile range (IQR): 27057.25

Descriptive statistics

Standard deviation: 799486.49
Coefficient of variation (CV): 10.307882
Kurtosis: 2496.6017
Mean: 77560.697
Median Absolute Deviation (MAD): 9918
Skewness: 44.27843
Sum: 6.2126118 × 10^8
Variance: 6.3917865 × 10^11
Monotonicity: Not monotonic
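
Several of the figures above are derived from others: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below recomputes them from the reported 분모(명) values as a consistency check. The dictionary keys and the function name `derived_statistics` are illustrative assumptions, not part of the profiling tool.

```python
import math
from typing import Dict

# Figures copied from the 분모(명) summary in this report.
REPORTED: Dict[str, float] = {
    "mean": 77560.697,
    "std": 799486.49,
    "min": 158,
    "max": 51893314,
    "q1": 4524.25,
    "q3": 31581.5,
}


def derived_statistics(stats: Dict[str, float]) -> Dict[str, float]:
    """Recompute the statistics that follow directly from the others."""
    return {
        "range": stats["max"] - stats["min"],
        "iqr": stats["q3"] - stats["q1"],
        "cv": stats["std"] / stats["mean"],
        "variance": stats["std"] ** 2,
    }


derived = derived_statistics(REPORTED)
for name, value in derived.items():
    print(f"{name}: {value:.6g}")

# The recomputed CV should agree with the reported 10.307882 up to rounding.
print(math.isclose(derived["cv"], 10.307882, rel_tol=1e-5))
```

A CV above 10 means the standard deviation is more than ten times the mean, consistent with the Skewed alert for this variable.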
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
1809  5  0.1%
1047  4  < 0.1%
2696  4  < 0.1%
1666  4  < 0.1%
1380  4  < 0.1%
8944  4  < 0.1%
4639  4  < 0.1%
1301  3  < 0.1%
854  3  < 0.1%
1083  3  < 0.1%
Other values (7235)  7972  99.5%

Minimum 10 values
Value  Count  Frequency (%)
158  1  < 0.1%
175  1  < 0.1%
216  1  < 0.1%
220  1  < 0.1%
242  1  < 0.1%
251  1  < 0.1%
268  1  < 0.1%
276  1  < 0.1%
282  1  < 0.1%
290  2  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
51893314  1  < 0.1%
25994398  1  < 0.1%
25898916  1  < 0.1%
13756986  1  < 0.1%
9640552  1  < 0.1%
8723077  1  < 0.1%
8088750  1  < 0.1%
7465685  1  < 0.1%
6944223  1  < 0.1%
6812763  1  < 0.1%

분자(명)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 5145
Distinct (%): 64.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22871.545
Minimum: 0
Maximum: 15293502
Zeros: 109
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 170.25
Median: 2602.5
Q3: 9983.75
95-th percentile: 58961.85
Maximum: 15293502
Range: 15293502
Interquartile range (IQR): 9813.5

Descriptive statistics

Standard deviation: 243499
Coefficient of variation (CV): 10.646373
Kurtosis: 2208.6165
Mean: 22871.545
Median Absolute Deviation (MAD): 2580.5
Skewness: 41.283323
Sum: 1.8320108 × 10^8
Variance: 5.9291765 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
0  109  1.4%
1  69  0.9%
2  66  0.8%
3  63  0.8%
5  61  0.8%
6  53  0.7%
7  49  0.6%
9  48  0.6%
11  48  0.6%
10  41  0.5%
Other values (5135)  7403  92.4%

Minimum 10 values
Value  Count  Frequency (%)
0  109  1.4%
1  69  0.9%
2  66  0.8%
3  63  0.8%
4  40  0.5%
5  61  0.8%
6  53  0.7%
7  49  0.6%
8  38  0.5%
9  48  0.6%

Maximum 10 values
Value  Count  Frequency (%)
15293502  1  < 0.1%
7692275  1  < 0.1%
7601227  1  < 0.1%
4662776  1  < 0.1%
3777053  1  < 0.1%
3502940  1  < 0.1%
2989908  1  < 0.1%
2694776  1  < 0.1%
2360193  1  < 0.1%
2302583  1  < 0.1%

고당지의료이용률
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3940
Distinct (%): 49.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.305976
Minimum: 0
Maximum: 88.2
Zeros: 109
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.1
Q1: 1.86
Median: 26.39
Q3: 62.55
95-th percentile: 80.99
Maximum: 88.2
Range: 88.2
Interquartile range (IQR): 60.69

Descriptive statistics

Standard deviation: 30.219873
Coefficient of variation (CV): 0.93542671
Kurtosis: -1.3572517
Mean: 32.305976
Median Absolute Deviation (MAD): 25.425
Skewness: 0.42033112
Sum: 258770.87
Variance: 913.24073
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
0.0  109  1.4%
0.09  80  1.0%
0.1  70  0.9%
0.11  70  0.9%
0.08  57  0.7%
0.12  56  0.7%
0.07  55  0.7%
0.13  52  0.6%
0.06  41  0.5%
0.14  33  0.4%
Other values (3930)  7387  92.2%

Minimum 10 values
Value  Count  Frequency (%)
0.0  109  1.4%
0.03  9  0.1%
0.04  13  0.2%
0.05  21  0.3%
0.06  41  0.5%
0.07  55  0.7%
0.08  57  0.7%
0.09  80  1.0%
0.1  70  0.9%
0.11  70  0.9%

Maximum 10 values
Value  Count  Frequency (%)
88.2  1  < 0.1%
87.1  1  < 0.1%
86.67  1  < 0.1%
86.53  1  < 0.1%
86.52  1  < 0.1%
86.37  1  < 0.1%
86.33  1  < 0.1%
86.28  1  < 0.1%
86.09  1  < 0.1%
85.97  1  < 0.1%

Interactions


Correlations

                  시도     성별     연령그룹  분모(명)  분자(명)  고당지의료이용률
시도                1.000  0.000  0.000  0.452  0.476  0.164
성별                0.000  1.000  0.000  0.040  0.005  0.367
연령그룹              0.000  0.000  1.000  0.007  0.000  0.960
분모(명)             0.452  0.040  0.007  1.000  0.940  0.044
분자(명)             0.476  0.005  0.000  0.940  1.000  0.045
고당지의료이용률          0.164  0.367  0.960  0.044  0.045  1.000

          연령그룹  성별     시도
연령그룹      1.000  0.000  0.000
성별        0.000  1.000  0.000
시도        0.000  0.000  1.000

                  분모(명)  분자(명)  고당지의료이용률  시도     성별     연령그룹
분모(명)             1.000  0.656  0.041  0.197  0.016  0.004
분자(명)             0.656  1.000  0.718  0.235  0.003  0.000
고당지의료이용률          0.041  0.718  1.000  0.063  0.237  0.657
시도                0.197  0.235  0.063  1.000  0.000  0.000
성별                0.016  0.003  0.237  0.000  1.000  0.000
연령그룹              0.004  0.000  0.657  0.000  0.000  1.000
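
The categorical-only matrix above shows 0.000 for every pair among 연령그룹, 성별, and 시도. One plausible reading, assuming those coefficients are Cramér's V (the extracted text does not preserve the label of each matrix), is that the dataset is fully crossed: every sex appears equally often with every age group, so no category predicts another. The sketch below computes the uncorrected Cramér's V from two label sequences; the function name `cramers_v` and the toy data are illustrative assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cramers_v(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Uncorrected Cramér's V between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))


# Fully crossed toy design: each sex code paired once with each age group.
sexes = [s for s in ("1", "2", "ALL") for _ in ("0", "10", "20")]
ages = [a for _ in ("1", "2", "ALL") for a in ("0", "10", "20")]

print(cramers_v(sexes, ages))   # balanced design, no association
print(cramers_v(sexes, sexes))  # a variable compared with itself
```

In a balanced design every observed cell count equals its expected count, so the chi-squared statistic, and with it V, is zero.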

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      기준연도  시도  시군구  성별  연령그룹  분모(명)  분자(명)  고당지의료이용률
0     2022  전국  전체  1  0  1749041  1812  0.1
1     2022  전국  전체  1  10  2389691  12946  0.54
2     2022  전국  전체  1  20  3400939  76160  2.24
3     2022  전국  전체  1  30  3541864  319156  9.01
4     2022  전국  전체  1  40  4116680  1065146  25.87
5     2022  전국  전체  1  50  4405905  1942344  44.09
6     2022  전국  전체  1  60  3677918  2302583  62.61
7     2022  전국  전체  1  70  1770238  1319175  74.52
8     2022  전국  전체  1  80  846640  652953  77.12
9     2022  전국  전체  1  ALL  25898916  7692275  29.7

Last rows

      기준연도  시도  시군구  성별  연령그룹  분모(명)  분자(명)  고당지의료이용률
8000  2022  제주특별자치도  서귀포시  ALL  0  12631  11  0.09
8001  2022  제주특별자치도  서귀포시  ALL  10  17769  94  0.53
8002  2022  제주특별자치도  서귀포시  ALL  20  19139  356  1.86
8003  2022  제주특별자치도  서귀포시  ALL  30  19566  1312  6.71
8004  2022  제주특별자치도  서귀포시  ALL  40  29061  5377  18.5
8005  2022  제주특별자치도  서귀포시  ALL  50  32576  12264  37.65
8006  2022  제주특별자치도  서귀포시  ALL  60  27307  16346  59.86
8007  2022  제주특별자치도  서귀포시  ALL  70  15680  12004  76.56
8008  2022  제주특별자치도  서귀포시  ALL  80  11487  9106  79.27
8009  2022  제주특별자치도  서귀포시  ALL  ALL  185216  56870  30.7
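
The sample rows show that aggregate levels (시도 = 전국, 시군구 = 전체, 성별 = ALL, 연령그룹 = ALL) are stored in the same table as the detail rows. Summing or averaging over all rows would therefore count people more than once, and these totals likely contribute to the extreme skew of the count columns. A minimal sketch of separating the two kinds of rows follows. The helper name `is_aggregate` and the last example row are hypothetical, and treating 시군구 = 전체 as an aggregate is an assumption based on the sample.

```python
from typing import Dict, List

Row = Dict[str, str]

# Marker values that denote a total in each grouping column, as seen in the sample.
AGGREGATE_MARKERS = {
    "시도": "전국",
    "시군구": "전체",
    "성별": "ALL",
    "연령그룹": "ALL",
}


def is_aggregate(row: Row) -> bool:
    """True when any grouping column holds its 'total' marker value."""
    return any(row[column] == marker for column, marker in AGGREGATE_MARKERS.items())


rows: List[Row] = [
    # Grouping fields of sample rows 9 and 8000 from this report.
    {"시도": "전국", "시군구": "전체", "성별": "1", "연령그룹": "ALL"},
    {"시도": "제주특별자치도", "시군구": "서귀포시", "성별": "ALL", "연령그룹": "0"},
    # Hypothetical detail row, added only to illustrate the filter.
    {"시도": "제주특별자치도", "시군구": "서귀포시", "성별": "1", "연령그룹": "40"},
]

detail_rows = [row for row in rows if not is_aggregate(row)]
print(len(detail_rows))
```

Filtering on all four columns keeps only the finest-grained cells; relaxing one marker at a time yields the corresponding subtotal level instead.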