Overview

Dataset statistics

Number of variables: 8
Number of observations: 8010
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 532.0 KiB
Average record size in memory: 68.0 B

Variable types

Categorical: 4
Text: 1
Numeric: 3

Dataset

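Description: Proportion of National Health Insurance subscribers and Medical Aid beneficiaries who are patients with hypertension or diabetes.
○ 시도 and 시군구 (province and city/county/district): broken down by city/county/district within the 17 provinces and metropolitan cities nationwide
- Place names are fixed as of the start of the year. If a place name changes during the year, only the subset of residents whose records were moved after the change is reflected.
○ 성별 (sex): ALL (total), 1 (male), 2 (female)
○ 연령그룹 (age group): ALL (total), 00 (under 10), 10 (10s), 20 (20s), 30 (30s), 40 (40s), 50 (50s), 60 (60s), 70 (70s), 80 (80 and over)
○ 분모(명) (denominator, persons): number of National Health Insurance subscribers or Medical Aid beneficiaries
○ 분자(명) (numerator, persons): number of patients who developed hypertension or diabetes
- Hypertension patient: a patient prescribed hypertension medication together with a hypertension primary diagnosis code (I10~I15)
- Diabetes patient: a patient prescribed diabetes medication together with a diabetes primary diagnosis code (E10~E14)
○ 고당의료이용률 (hypertension/diabetes medical care utilization rate) = 분자/분모*100 (%)
※ Rows with a denominator below 20 are not provided.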
Author: National Health Insurance Service (국민건강보험공단)
URL: https://www.data.go.kr/data/15119863/fileData.do
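
The rate formula given in the description can be sketched in a few lines of Python. This is only an illustration, not the publisher's code: the function name `utilization_rate`, the `min_denominator` parameter, and the rounding to two decimals are assumptions (two decimals is inferred from how the values appear in the sample rows). The inputs are the nationwide male rows shown in the Sample section.

```python
from typing import Optional


def utilization_rate(numerator: int, denominator: int,
                     min_denominator: int = 20) -> Optional[float]:
    """Return numerator / denominator * 100, rounded to 2 decimals.

    Returns None when the denominator is below the suppression
    threshold stated in the dataset description (assumed handling).
    """
    if denominator < min_denominator:
        return None
    return round(numerator / denominator * 100, 2)


# Nationwide, 성별 = 1, 연령그룹 = 0 and ALL (values from the sample rows)
rate_under_10 = utilization_rate(1656, 1749041)
rate_all_ages = utilization_rate(6600964, 25898916)
suppressed = utilization_rate(5, 19)  # hypothetical row below the threshold

print(rate_under_10, rate_all_ages, suppressed)
```

Returning `None` for suppressed rows is one possible design; the published file simply omits such rows.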

Alerts

기준연도 has constant value "2022" (Constant)
분모(명) is highly overall correlated with 분자(명) (High correlation)
분자(명) is highly overall correlated with 분모(명) and 1 other field (High correlation)
고당의료이용률 is highly overall correlated with 분자(명) and 1 other field (High correlation)
연령그룹 is highly overall correlated with 고당의료이용률 (High correlation)
분모(명) is highly skewed (γ1 = 44.2784302) (Skewed)
분자(명) is highly skewed (γ1 = 41.34779685) (Skewed)
분자(명) has 116 (1.4%) zeros (Zeros)
고당의료이용률 has 116 (1.4%) zeros (Zeros)
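
The percentages quoted in the alerts and in the per-variable "Distinct (%)" fields are simple shares of the 8010 observations. The sketch below reproduces a few of them; the helper name `share` and the rounding to one decimal are assumptions chosen to match how the report appears to display these figures.

```python
def share(count: int, total: int, digits: int = 1) -> float:
    """Percentage of `count` within `total`, rounded to `digits` decimals."""
    return round(count / total * 100, digits)


N_OBSERVATIONS = 8010  # "Number of observations" in the overview

zero_share = share(116, N_OBSERVATIONS)             # zeros in 분자(명) and 고당의료이용률
distinct_sido = share(18, N_OBSERVATIONS)           # Distinct (%) for 시도
distinct_denominator = share(7245, N_OBSERVATIONS)  # Distinct (%) for 분모(명)

print(zero_share, distinct_sido, distinct_denominator)
```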

Reproduction

Analysis started: 2024-04-17 09:55:57.944417
Analysis finished: 2024-04-17 09:55:59.736307
Duration: 1.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Value | Count
2022 | 8010

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 8010 | 100.0%

Length

[Plot: histogram of lengths of the category]

Common Values (Plot)

[Plot: bar chart of the value counts listed under Common Values]

시도
Categorical

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Value | Count
경기도 | 1290
서울특별시 | 780
경상북도 | 750
경상남도 | 690
전라남도 | 690
Other values (13) | 3810

Length

Max length: 7
Median length: 5
Mean length: 4.1086142
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전국
2nd row: 전국
3rd row: 전국
4th row: 전국
5th row: 전국

Common Values

Value | Count | Frequency (%)
경기도 | 1290 | 16.1%
서울특별시 | 780 | 9.7%
경상북도 | 750 | 9.4%
경상남도 | 690 | 8.6%
전라남도 | 690 | 8.6%
강원도 | 570 | 7.1%
부산광역시 | 510 | 6.4%
충청남도 | 510 | 6.4%
전라북도 | 480 | 6.0%
충청북도 | 450 | 5.6%
Other values (8) | 1290 | 16.1%

Length

[Plot: histogram of lengths of the category]

시군구
Text

Distinct: 229
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.3745318
Min length: 2

Characters and Unicode

Total characters: 27030
Distinct characters: 144
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Value | Count | Frequency (%)
전체 | 510 | 5.7%
중구 | 180 | 2.0%
동구 | 180 | 2.0%
서구 | 150 | 1.7%
북구 | 150 | 1.7%
창원시 | 150 | 1.7%
남구 | 150 | 1.7%
수원시 | 120 | 1.3%
청주시 | 120 | 1.3%
고양시 | 90 | 1.0%
Other values (228) | 7170 | 79.9%

Most occurring characters

ValueCountFrequency (%)
3180
 
11.8%
3000
 
11.1%
2550
 
9.4%
960
 
3.6%
720
 
2.7%
690
 
2.6%
690
 
2.6%
660
 
2.4%
630
 
2.3%
600
 
2.2%
Other values (134) 13350
49.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 26070 | 96.4%
Space Separator | 960 | 3.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3180
 
12.2%
3000
 
11.5%
2550
 
9.8%
720
 
2.8%
690
 
2.6%
690
 
2.6%
660
 
2.5%
630
 
2.4%
600
 
2.3%
570
 
2.2%
Other values (133) 12780
49.0%
Space Separator
ValueCountFrequency (%)
960
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 26070 | 96.4%
Common | 960 | 3.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3180
 
12.2%
3000
 
11.5%
2550
 
9.8%
720
 
2.8%
690
 
2.6%
690
 
2.6%
660
 
2.5%
630
 
2.4%
600
 
2.3%
570
 
2.2%
Other values (133) 12780
49.0%
Common
ValueCountFrequency (%)
960
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 26070 | 96.4%
ASCII | 960 | 3.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3180
 
12.2%
3000
 
11.5%
2550
 
9.8%
720
 
2.8%
690
 
2.6%
690
 
2.6%
660
 
2.5%
630
 
2.4%
600
 
2.3%
570
 
2.2%
Other values (133) 12780
49.0%
ASCII
ValueCountFrequency (%)
960
100.0%

성별
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Value | Count
1 | 2670
2 | 2670
ALL | 2670

Length

Max length: 3
Median length: 1
Mean length: 1.6666667
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 2670 | 33.3%
2 | 2670 | 33.3%
ALL | 2670 | 33.3%

Length

[Plot: histogram of lengths of the category]

Common Values (Plot)

[Plot: bar chart of the value counts listed under Common Values]

연령그룹
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.7 KiB

Value | Count
0 | 801
10 | 801
20 | 801
30 | 801
40 | 801
Other values (5) | 4005

Length

Max length: 3
Median length: 2
Mean length: 2
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 10
3rd row: 20
4th row: 30
5th row: 40

Common Values

Value | Count | Frequency (%)
0 | 801 | 10.0%
10 | 801 | 10.0%
20 | 801 | 10.0%
30 | 801 | 10.0%
40 | 801 | 10.0%
50 | 801 | 10.0%
60 | 801 | 10.0%
70 | 801 | 10.0%
80 | 801 | 10.0%
ALL | 801 | 10.0%
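
The perfectly balanced counts (2670 rows per 성별 value, 801 rows per 연령그룹 value) are what a fully crossed layout would produce, where every region unit carries one row for each 성별 × 연령그룹 combination. The sketch below works through that arithmetic. The fully crossed layout is an assumption: the counts are consistent with it but do not prove it, and the variable names are illustrative.

```python
N_ROWS = 8010
N_SEX = 3    # 1, 2, ALL
N_AGE = 10   # 0, 10, ..., 80, ALL

rows_per_region = N_SEX * N_AGE            # rows per region unit if fully crossed
n_region_units = N_ROWS // rows_per_region
divides_evenly = N_ROWS % rows_per_region == 0

# Row counts per 시도 taken from the 시도 "Common Values" table
sido_rows = {"경기도": 1290, "서울특별시": 780, "경상북도": 750}
units_per_sido = {name: count // rows_per_region
                  for name, count in sido_rows.items()}

print(n_region_units, divides_evenly)
print(units_per_sido)
```

Under this assumption the file holds 267 region units, and each 시도 row count divided by 30 gives its number of 시군구 entries, including any 전체 (total) row.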

Length

[Plot: histogram of lengths of the category]

Common Values (Plot)

[Plot: bar chart of the value counts listed under Common Values]

분모(명)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 7245
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77560.697
Minimum: 158
Maximum: 51893314
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 158
5-th percentile: 1183.45
Q1: 4524.25
Median: 12779
Q3: 31581.5
95-th percentile: 203008.65
Maximum: 51893314
Range: 51893156
Interquartile range (IQR): 27057.25

Descriptive statistics

Standard deviation: 799486.49
Coefficient of variation (CV): 10.307882
Kurtosis: 2496.6017
Mean: 77560.697
Median Absolute Deviation (MAD): 9918
Skewness: 44.27843
Sum: 6.2126118 × 10^8
Variance: 6.3917865 × 10^11
Monotonicity: Not monotonic
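
Several of the figures above are derived from others: the range from the minimum and maximum, the IQR from the quartiles, the CV from the standard deviation and mean, and the variance from the standard deviation. A minimal consistency check is sketched below, using only the numbers reported for 분모(명), with the variance taken as 6.3917865e11. The dictionary layout and the tolerance are illustrative choices, not part of the report.

```python
import math

# Values as reported for 분모(명)
reported = {
    "min": 158, "max": 51893314,
    "q1": 4524.25, "q3": 31581.5,
    "mean": 77560.697, "std": 799486.49,
    "range": 51893156, "iqr": 27057.25,
    "cv": 10.307882, "variance": 6.3917865e11,
}

# Recompute the derived statistics from their definitions
derived = {
    "range": reported["max"] - reported["min"],
    "iqr": reported["q3"] - reported["q1"],
    "cv": reported["std"] / reported["mean"],
    "variance": reported["std"] ** 2,
}

# Compare with a relative tolerance, since the report rounds its output
checks = {name: math.isclose(value, reported[name], rel_tol=1e-6)
          for name, value in derived.items()}
print(checks)
```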
[Plot: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1809 | 5 | 0.1%
1047 | 4 | < 0.1%
2696 | 4 | < 0.1%
1666 | 4 | < 0.1%
1380 | 4 | < 0.1%
8944 | 4 | < 0.1%
4639 | 4 | < 0.1%
1301 | 3 | < 0.1%
854 | 3 | < 0.1%
1083 | 3 | < 0.1%
Other values (7235) | 7972 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
158 | 1 | < 0.1%
175 | 1 | < 0.1%
216 | 1 | < 0.1%
220 | 1 | < 0.1%
242 | 1 | < 0.1%
251 | 1 | < 0.1%
268 | 1 | < 0.1%
276 | 1 | < 0.1%
282 | 1 | < 0.1%
290 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
51893314 | 1 | < 0.1%
25994398 | 1 | < 0.1%
25898916 | 1 | < 0.1%
13756986 | 1 | < 0.1%
9640552 | 1 | < 0.1%
8723077 | 1 | < 0.1%
8088750 | 1 | < 0.1%
7465685 | 1 | < 0.1%
6944223 | 1 | < 0.1%
6812763 | 1 | < 0.1%

분자(명)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 4917
Distinct (%): 61.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18785.269
Minimum: 0
Maximum: 12560064
Zeros: 116
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 128
Median: 2119
Q3: 8316
95-th percentile: 47292.9
Maximum: 12560064
Range: 12560064
Interquartile range (IQR): 8188

Descriptive statistics

Standard deviation: 199875.3
Coefficient of variation (CV): 10.640002
Kurtosis: 2215.2524
Mean: 18785.269
Median Absolute Deviation (MAD): 2103
Skewness: 41.347797
Sum: 1.5047001 × 10^8
Variance: 3.9950137 × 10^10
Monotonicity: Not monotonic
[Plot: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 116 | 1.4%
2 | 76 | 0.9%
1 | 74 | 0.9%
5 | 72 | 0.9%
3 | 68 | 0.8%
7 | 57 | 0.7%
6 | 56 | 0.7%
9 | 55 | 0.7%
4 | 53 | 0.7%
10 | 44 | 0.5%
Other values (4907) | 7339 | 91.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 116 | 1.4%
1 | 74 | 0.9%
2 | 76 | 0.9%
3 | 68 | 0.8%
4 | 53 | 0.7%
5 | 72 | 0.9%
6 | 56 | 0.7%
7 | 57 | 0.7%
8 | 41 | 0.5%
9 | 55 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
12560064 | 1 | < 0.1%
6600964 | 1 | < 0.1%
5959100 | 1 | < 0.1%
3746731 | 1 | < 0.1%
3077751 | 1 | < 0.1%
2694042 | 1 | < 0.1%
2651913 | 1 | < 0.1%
2158963 | 1 | < 0.1%
2027500 | 1 | < 0.1%
1852924 | 1 | < 0.1%

고당의료이용률
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3980
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.777829
Minimum: 0
Maximum: 85.58
Zeros: 116
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 70.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.09
Q1: 1.35
Median: 20.705
Q3: 51.505
95-th percentile: 75.7855
Maximum: 85.58
Range: 85.58
Interquartile range (IQR): 50.155

Descriptive statistics

Standard deviation: 27.471351
Coefficient of variation (CV): 0.98896683
Kurtosis: -1.1391334
Mean: 27.777829
Median Absolute Deviation (MAD): 20.035
Skewness: 0.58157175
Sum: 222500.41
Variance: 754.67514
Monotonicity: Not monotonic
[Plot: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.0 | 116 | 1.4%
0.08 | 85 | 1.1%
0.11 | 70 | 0.9%
0.1 | 67 | 0.8%
0.09 | 65 | 0.8%
0.07 | 62 | 0.8%
0.12 | 50 | 0.6%
0.06 | 43 | 0.5%
0.05 | 41 | 0.5%
0.38 | 38 | 0.5%
Other values (3970) | 7373 | 92.0%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 116 | 1.4%
0.03 | 13 | 0.2%
0.04 | 26 | 0.3%
0.05 | 41 | 0.5%
0.06 | 43 | 0.5%
0.07 | 62 | 0.8%
0.08 | 85 | 1.1%
0.09 | 65 | 0.8%
0.1 | 67 | 0.8%
0.11 | 70 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
85.58 | 1 | < 0.1%
84.81 | 1 | < 0.1%
83.74 | 1 | < 0.1%
83.08 | 1 | < 0.1%
83.07 | 1 | < 0.1%
82.9 | 1 | < 0.1%
82.73 | 1 | < 0.1%
82.54 | 1 | < 0.1%
82.52 | 1 | < 0.1%
82.39 | 1 | < 0.1%

Interactions

[Plots: 9 interaction plots]

Correlations

Variable | 시도 | 성별 | 연령그룹 | 분모(명) | 분자(명) | 고당의료이용률
시도 | 1.000 | 0.000 | 0.000 | 0.452 | 0.516 | 0.151
성별 | 0.000 | 1.000 | 0.000 | 0.040 | 0.040 | 0.371
연령그룹 | 0.000 | 0.000 | 1.000 | 0.007 | 0.002 | 0.957
분모(명) | 0.452 | 0.040 | 0.007 | 1.000 | 0.989 | 0.018
분자(명) | 0.516 | 0.040 | 0.002 | 0.989 | 1.000 | 0.026
고당의료이용률 | 0.151 | 0.371 | 0.957 | 0.018 | 0.026 | 1.000

Variable | 성별 | 연령그룹 | 시도
성별 | 1.000 | 0.000 | 0.000
연령그룹 | 0.000 | 1.000 | 0.000
시도 | 0.000 | 0.000 | 1.000

Variable | 분모(명) | 분자(명) | 고당의료이용률 | 시도 | 성별 | 연령그룹
분모(명) | 1.000 | 0.641 | 0.033 | 0.197 | 0.016 | 0.004
분자(명) | 0.641 | 1.000 | 0.732 | 0.232 | 0.016 | 0.001
고당의료이용률 | 0.033 | 0.732 | 1.000 | 0.058 | 0.240 | 0.645
시도 | 0.197 | 0.232 | 0.058 | 1.000 | 0.000 | 0.000
성별 | 0.016 | 0.016 | 0.240 | 0.000 | 1.000 | 0.000
연령그룹 | 0.004 | 0.001 | 0.645 | 0.000 | 0.000 | 1.000

Missing values

[Plot] A simple visualization of nullity by column.
[Plot] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

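Index | 기준연도 | 시도 | 시군구 | 성별 | 연령그룹 | 분모(명) | 분자(명) | 고당의료이용률
0 | 2022 | 전국 | 전체 | 1 | 0 | 1749041 | 1656 | 0.09
1 | 2022 | 전국 | 전체 | 1 | 10 | 2389691 | 10547 | 0.44
2 | 2022 | 전국 | 전체 | 1 | 20 | 3400939 | 55339 | 1.63
3 | 2022 | 전국 | 전체 | 1 | 30 | 3541864 | 226567 | 6.4
4 | 2022 | 전국 | 전체 | 1 | 40 | 4116680 | 825777 | 20.06
5 | 2022 | 전국 | 전체 | 1 | 50 | 4405905 | 1639535 | 37.21
6 | 2022 | 전국 | 전체 | 1 | 60 | 3677918 | 2027500 | 55.13
7 | 2022 | 전국 | 전체 | 1 | 70 | 1770238 | 1202169 | 67.91
8 | 2022 | 전국 | 전체 | 1 | 80 | 846640 | 611874 | 72.27
9 | 2022 | 전국 | 전체 | 1 | ALL | 25898916 | 6600964 | 25.49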
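
Index | 기준연도 | 시도 | 시군구 | 성별 | 연령그룹 | 분모(명) | 분자(명) | 고당의료이용률
8000 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 0 | 12631 | 10 | 0.08
8001 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 10 | 17769 | 87 | 0.49
8002 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 20 | 19139 | 292 | 1.53
8003 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 30 | 19566 | 1037 | 5.3
8004 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 40 | 29061 | 4418 | 15.2
8005 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 50 | 32576 | 9939 | 30.51
8006 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 60 | 27307 | 13369 | 48.96
8007 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 70 | 15680 | 10616 | 67.7
8008 | 2022 | 제주특별자치도 | 서귀포시 | ALL | 80 | 11487 | 8513 | 74.11
8009 | 2022 | 제주특별자치도 | 서귀포시 | ALL | ALL | 185216 | 48281 | 26.07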
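
The sample rows also allow a quick check of how the ALL age group relates to the individual age groups. The sketch below uses the ten 서귀포시 rows shown above (성별 = ALL) and tests whether the ALL row equals the sum of the nine age-group rows. The tuple layout and variable names are illustrative, and whether this holds for every region in the file is not verified here.

```python
# (연령그룹, 분모(명), 분자(명)) for 제주특별자치도 / 서귀포시, 성별 = ALL
rows = [
    ("0", 12631, 10), ("10", 17769, 87), ("20", 19139, 292),
    ("30", 19566, 1037), ("40", 29061, 4418), ("50", 32576, 9939),
    ("60", 27307, 13369), ("70", 15680, 10616), ("80", 11487, 8513),
    ("ALL", 185216, 48281),
]

by_age = [r for r in rows if r[0] != "ALL"]
total = next(r for r in rows if r[0] == "ALL")

denominator_sum = sum(r[1] for r in by_age)
numerator_sum = sum(r[2] for r in by_age)

# Rate for the ALL row recomputed from the summed counts
recomputed_rate = round(numerator_sum / denominator_sum * 100, 2)

print(denominator_sum == total[1], numerator_sum == total[2], recomputed_rate)
```

For these rows the sums match the ALL row exactly, and the recomputed rate agrees with the 26.07 listed in the table.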