Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 839.8 KiB
Average record size in memory: 86.0 B
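
The average record size appears to be the total memory footprint divided by the number of observations. The short check below reproduces that arithmetic from the two reported figures. It is a sketch under that assumption, not a statement of how the profiler computes the value internally.

```python
# Figures copied from the "Dataset statistics" block of this report.
total_size_kib = 839.8      # total size in memory, in KiB
n_observations = 10_000     # number of observations (rows)

# 1 KiB = 1024 bytes, so convert to bytes before dividing by the row count.
average_record_bytes = total_size_kib * 1024 / n_observations

print(f"{average_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 86.0 B shown above.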

Variable types

Categorical: 3
Text: 1
Numeric: 5

Dataset

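Description: Health insurance medical expense statistics by 5-character disease code / Based on date of treatment (reviewed claims cover each treatment year plus 4 months; e.g., treatment months January to December 2022, review months January 2022 to April 2023) / Insurer: National Health Insurance / Care institution type: pharmacies excluded / Korean medicine diagnoses excluded
URL: https://www.data.go.kr/data/15072880/fileData.do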

Alerts

진료년도 has constant value "2022" (Constant)
환자수 is highly overall correlated with 명세서건수 and 3 other fields (High correlation)
명세서건수 is highly overall correlated with 환자수 and 3 other fields (High correlation)
입내원일수 is highly overall correlated with 환자수 and 3 other fields (High correlation)
보험자부담금 is highly overall correlated with 환자수 and 3 other fields (High correlation)
요양급여비용총액 is highly overall correlated with 환자수 and 3 other fields (High correlation)
환자수 is highly skewed (γ1 = 71.74764149) (Skewed)
명세서건수 is highly skewed (γ1 = 70.87948436) (Skewed)
입내원일수 is highly skewed (γ1 = 67.90367968) (Skewed)
보험자부담금 is highly skewed (γ1 = 41.94012728) (Skewed)
요양급여비용총액 is highly skewed (γ1 = 45.15516829) (Skewed)
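
To make the "Skewed" alerts more concrete, the sketch below computes a moment-based skewness coefficient and flags a column when its absolute value exceeds a threshold. The threshold of 20, the helper names, and the toy data are all assumptions for illustration. The profiler may apply a bias-corrected estimator and a different cut-off, so the numbers here are not expected to match γ1 above exactly.

```python
# Assumed alert threshold; illustrative only, not taken from this report.
SKEWNESS_THRESHOLD = 20.0


def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant column: skewness is undefined, report 0 here
    return m3 / m2 ** 1.5


def is_highly_skewed(values, threshold=SKEWNESS_THRESHOLD):
    """Return True when the absolute skewness exceeds the threshold."""
    return abs(skewness(values)) > threshold


# Hypothetical data: many small counts plus one very large one, similar in
# shape to the count columns profiled here.
heavy_tail = [1] * 999 + [1_000_000]
symmetric = [1, 2, 3, 4, 5]

print(is_highly_skewed(heavy_tail), is_highly_skewed(symmetric))
```

A single extreme value among many small ones is enough to push the coefficient well past the threshold, which is the pattern the minimum and maximum values of these columns suggest.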

Reproduction

Analysis started: 2023-12-12 13:00:36.627872
Analysis finished: 2023-12-12 13:00:40.949520
Duration: 4.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

진료년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2022: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

주상병코드
Text

Distinct: 7709
Distinct (%): 77.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 50000
Distinct characters: 33
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
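
As a rough illustration of how such per-character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. The input codes are the five sample rows of this variable. The trailing space on `C181 ` is an assumption based on every value having length 5, and `category_counts` is a hypothetical helper name, not part of the profiler.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lu" = Uppercase Letter, "Nd" = Decimal Number,
            # "Zs" = Space Separator
            counts[unicodedata.category(ch)] += 1
    return counts


# Sample codes from this report; the padding space on the last one is assumed.
sample_codes = ["M8168", "K5782", "M1090", "M7990", "C181 "]
counts = category_counts(sample_codes)

for category, count in counts.most_common():
    print(category, count)
```

The three categories that appear in this small sample (Lu, Nd, Zs) correspond to the Uppercase Letter, Decimal Number, and Space Separator categories reported for the full column.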

Unique

Unique: 5707
Unique (%): 57.1%

Sample

1st row: M8168
2nd row: K5782
3rd row: M1090
4th row: M7990
5th row: C181

Value | Count | Frequency (%)
a560 | 4 | < 0.1%
e1464 | 4 | < 0.1%
f072 | 4 | < 0.1%
i879 | 4 | < 0.1%
h681 | 4 | < 0.1%
s3631 | 4 | < 0.1%
c1641 | 4 | < 0.1%
e834 | 4 | < 0.1%
k5721 | 4 | < 0.1%
e078 | 4 | < 0.1%
Other values (7699) | 9960 | 99.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6643 | 13.3%
0 | 4635 | 9.3%
1 | 4270 | 8.5%
2 | 3918 | 7.8%
8 | 3475 | 7.0%
3 | 3233 | 6.5%
9 | 3137 | 6.3%
4 | 3130 | 6.3%
6 | 2647 | 5.3%
5 | 2640 | 5.3%
Other values (23) | 12272 | 24.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 33357 | 66.7%
Uppercase Letter | 10000 | 20.0%
Space Separator | 6643 | 13.3%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
M | 2337 | 23.4%
S | 930 | 9.3%
K | 574 | 5.7%
T | 552 | 5.5%
H | 478 | 4.8%
Q | 434 | 4.3%
C | 426 | 4.3%
I | 417 | 4.2%
D | 412 | 4.1%
E | 381 | 3.8%
Other values (12) | 3059 | 30.6%

Decimal Number
Value | Count | Frequency (%)
0 | 4635 | 13.9%
1 | 4270 | 12.8%
2 | 3918 | 11.7%
8 | 3475 | 10.4%
3 | 3233 | 9.7%
9 | 3137 | 9.4%
4 | 3130 | 9.4%
6 | 2647 | 7.9%
5 | 2640 | 7.9%
7 | 2272 | 6.8%

Space Separator
Value | Count | Frequency (%)
(space) | 6643 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 40000 | 80.0%
Latin | 10000 | 20.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
M | 2337 | 23.4%
S | 930 | 9.3%
K | 574 | 5.7%
T | 552 | 5.5%
H | 478 | 4.8%
Q | 434 | 4.3%
C | 426 | 4.3%
I | 417 | 4.2%
D | 412 | 4.1%
E | 381 | 3.8%
Other values (12) | 3059 | 30.6%

Common
Value | Count | Frequency (%)
(space) | 6643 | 16.6%
0 | 4635 | 11.6%
1 | 4270 | 10.7%
2 | 3918 | 9.8%
8 | 3475 | 8.7%
3 | 3233 | 8.1%
9 | 3137 | 7.8%
4 | 3130 | 7.8%
6 | 2647 | 6.6%
5 | 2640 | 6.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 50000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6643 | 13.3%
0 | 4635 | 9.3%
1 | 4270 | 8.5%
2 | 3918 | 7.8%
8 | 3475 | 7.0%
3 | 3233 | 6.5%
9 | 3137 | 6.3%
4 | 3130 | 6.3%
6 | 2647 | 5.3%
5 | 2640 | 5.3%
Other values (23) | 12272 | 24.5%

성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
5146 
4854 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
5146
51.5%
4854
48.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

입원외래구분
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
외래: 5721
입원: 4279

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%
Unique (%)0.0%

Sample

1st row: 외래
2nd row: 입원
3rd row: 입원
4th row: 외래
5th row: 입원

Common Values

Value | Count | Frequency (%)
외래 | 5721 | 57.2%
입원 | 4279 | 42.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

환자수
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2493
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6704.9307
Minimum: 1
Maximum: 10867248
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 7
Median: 50.5
Q3: 440
95th percentile: 13179.4
Maximum: 10867248
Range: 10867247
Interquartile range (IQR): 433

Descriptive statistics

Standard deviation: 123340.74
Coefficient of variation (CV): 18.395528
Kurtosis: 6081.5416
Mean: 6704.9307
Median Absolute Deviation (MAD): 49.5
Skewness: 71.747641
Sum: 67049307
Variance: 1.5212938 × 10^10
Monotonicity: Not monotonic
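
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are differences of quantiles. The sketch below re-derives them from the reported 환자수 values as a consistency check. The variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Figures copied from the 환자수 statistics in this report.
mean = 6704.9307
std_dev = 123340.74
q1, q3 = 7, 440
minimum, maximum = 1, 10867248

cv = std_dev / mean              # coefficient of variation
variance = std_dev ** 2          # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.7e} IQR={iqr} range={value_range}")
```

The derived values agree with the reported CV, variance, IQR, and range to within rounding.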
[Figure: histogram with fixed size bins (bins=50)]

Common Values
Value | Count | Frequency (%)
1 | 887 | 8.9%
2 | 527 | 5.3%
3 | 333 | 3.3%
4 | 292 | 2.9%
5 | 227 | 2.3%
6 | 200 | 2.0%
7 | 171 | 1.7%
8 | 167 | 1.7%
9 | 145 | 1.5%
10 | 126 | 1.3%
Other values (2483) | 6925 | 69.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 887 | 8.9%
2 | 527 | 5.3%
3 | 333 | 3.3%
4 | 292 | 2.9%
5 | 227 | 2.3%
6 | 200 | 2.0%
7 | 171 | 1.7%
8 | 167 | 1.7%
9 | 145 | 1.5%
10 | 126 | 1.3%

Maximum 10 values
Value | Count | Frequency (%)
10867248 | 1 | < 0.1%
2929112 | 1 | < 0.1%
2768451 | 1 | < 0.1%
2078845 | 1 | < 0.1%
1305983 | 1 | < 0.1%
1293145 | 1 | < 0.1%
1047665 | 1 | < 0.1%
931839 | 1 | < 0.1%
920388 | 1 | < 0.1%
765700 | 1 | < 0.1%

명세서건수
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 3154
Distinct (%): 31.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16007.252
Minimum: 1
Maximum: 24432321
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 11
Median: 101
Q3: 1017.5
95th percentile: 36147.25
Maximum: 24432321
Range: 24432320
Interquartile range (IQR): 1006.5

Descriptive statistics

Standard deviation: 278019.5
Coefficient of variation (CV): 17.368347
Kurtosis: 6006.2069
Mean: 16007.252
Median Absolute Deviation (MAD): 99
Skewness: 70.879484
Sum: 1.6007252 × 10^8
Variance: 7.7294845 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values
Value | Count | Frequency (%)
1 | 662 | 6.6%
2 | 419 | 4.2%
3 | 284 | 2.8%
4 | 219 | 2.2%
5 | 189 | 1.9%
6 | 166 | 1.7%
7 | 157 | 1.6%
8 | 142 | 1.4%
10 | 115 | 1.1%
11 | 104 | 1.0%
Other values (3144) | 7543 | 75.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 662 | 6.6%
2 | 419 | 4.2%
3 | 284 | 2.8%
4 | 219 | 2.2%
5 | 189 | 1.9%
6 | 166 | 1.7%
7 | 157 | 1.6%
8 | 142 | 1.4%
9 | 102 | 1.0%
10 | 115 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
24432321 | 1 | < 0.1%
6720782 | 1 | < 0.1%
4578223 | 1 | < 0.1%
4429749 | 1 | < 0.1%
3916760 | 1 | < 0.1%
3314644 | 1 | < 0.1%
2635955 | 1 | < 0.1%
2574913 | 1 | < 0.1%
2229325 | 1 | < 0.1%
2158557 | 1 | < 0.1%

입내원일수
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 3795
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18474.109
Minimum: 0
Maximum: 24418203
Zeros: 65
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 2
Q1: 29
Median: 244
Q3: 1997.25
95th percentile: 47770.85
Maximum: 24418203
Range: 24418203
Interquartile range (IQR): 1968.25

Descriptive statistics

Standard deviation: 282203.13
Coefficient of variation (CV): 15.275602
Kurtosis: 5644.6911
Mean: 18474.109
Median Absolute Deviation (MAD): 240
Skewness: 67.90368
Sum: 1.8474109 × 10^8
Variance: 7.9638608 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values
Value | Count | Frequency (%)
1 | 296 | 3.0%
2 | 210 | 2.1%
3 | 193 | 1.9%
5 | 141 | 1.4%
4 | 139 | 1.4%
6 | 126 | 1.3%
7 | 121 | 1.2%
8 | 96 | 1.0%
11 | 89 | 0.9%
10 | 76 | 0.8%
Other values (3785) | 8513 | 85.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 65 | 0.7%
1 | 296 | 3.0%
2 | 210 | 2.1%
3 | 193 | 1.9%
4 | 139 | 1.4%
5 | 141 | 1.4%
6 | 126 | 1.3%
7 | 121 | 1.2%
8 | 96 | 1.0%
9 | 73 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
24418203 | 1 | < 0.1%
6673702 | 1 | < 0.1%
4570342 | 1 | < 0.1%
4428627 | 1 | < 0.1%
3914253 | 1 | < 0.1%
3312966 | 1 | < 0.1%
3227507 | 1 | < 0.1%
3169294 | 1 | < 0.1%
2634336 | 1 | < 0.1%
2574605 | 1 | < 0.1%

보험자부담금
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 9898
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2740972 × 10^9
Minimum: 0
Maximum: 8.22893 × 10^11
Zeros: 5
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 83222
Q1: 2274900
Median: 23041045
Q3: 2.1051079 × 10^8
95th percentile: 3.6654327 × 10^9
Maximum: 8.22893 × 10^11
Range: 8.22893 × 10^11
Interquartile range (IQR): 2.0823589 × 10^8

Descriptive statistics

Standard deviation: 1.3124072 × 10^10
Coefficient of variation (CV): 10.300684
Kurtosis: 2309.5357
Mean: 1.2740972 × 10^9
Median Absolute Deviation (MAD): 22846010
Skewness: 41.940127
Sum: 1.2740972 × 10^13
Variance: 1.7224126 × 10^20
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values
Value | Count | Frequency (%)
11970 | 21 | 0.2%
8530 | 9 | 0.1%
15370 | 6 | 0.1%
7170 | 6 | 0.1%
10630 | 5 | 0.1%
0 | 5 | 0.1%
14880 | 5 | 0.1%
10380 | 5 | 0.1%
3210 | 4 | < 0.1%
10330 | 3 | < 0.1%
Other values (9888) | 9931 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 5 | 0.1%
160 | 1 | < 0.1%
800 | 1 | < 0.1%
2350 | 2 | < 0.1%
2520 | 1 | < 0.1%
2770 | 1 | < 0.1%
2860 | 1 | < 0.1%
3070 | 2 | < 0.1%
3170 | 1 | < 0.1%
3210 | 4 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
822893000000 | 1 | < 0.1%
683971000000 | 1 | < 0.1%
235128000000 | 1 | < 0.1%
215531000000 | 1 | < 0.1%
206861000000 | 1 | < 0.1%
182563000000 | 1 | < 0.1%
180810000000 | 1 | < 0.1%
177767000000 | 1 | < 0.1%
157670000000 | 1 | < 0.1%
139273000000 | 1 | < 0.1%

요양급여비용총액
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 9891
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6464795 × 10^9
Minimum: 0
Maximum: 1.16798 × 10^12
Zeros: 5
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 125672.5
Q1: 3352115
Median: 32153150
Q3: 2.8721504 × 10^8
95th percentile: 4.977873 × 10^9
Maximum: 1.16798 × 10^12
Range: 1.16798 × 10^12
Interquartile range (IQR): 2.8386292 × 10^8

Descriptive statistics

Standard deviation: 1.7025943 × 10^10
Coefficient of variation (CV): 10.340817
Kurtosis: 2699.8623
Mean: 1.6464795 × 10^9
Median Absolute Deviation (MAD): 31844690
Skewness: 45.155168
Sum: 1.6464795 × 10^13
Variance: 2.8988274 × 10^20
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values
Value | Count | Frequency (%)
16970 | 30 | 0.3%
12130 | 16 | 0.2%
11870 | 7 | 0.1%
14780 | 6 | 0.1%
0 | 5 | 0.1%
21180 | 5 | 0.1%
23770 | 3 | < 0.1%
33940 | 3 | < 0.1%
29100 | 3 | < 0.1%
102030 | 3 | < 0.1%
Other values (9881) | 9919 | 99.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 5 | 0.1%
190 | 1 | < 0.1%
1000 | 1 | < 0.1%
2790 | 1 | < 0.1%
3850 | 2 | < 0.1%
3960 | 1 | < 0.1%
5530 | 1 | < 0.1%
5610 | 1 | < 0.1%
6120 | 1 | < 0.1%
7460 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1167980000000 | 1 | < 0.1%
789164000000 | 1 | < 0.1%
336612000000 | 1 | < 0.1%
273920000000 | 1 | < 0.1%
241003000000 | 1 | < 0.1%
230619000000 | 1 | < 0.1%
225292000000 | 1 | < 0.1%
207276000000 | 1 | < 0.1%
188306000000 | 1 | < 0.1%
159763000000 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots between the numeric variables]

Correlations

Variable | 성별 | 입원외래구분 | 환자수 | 명세서건수 | 입내원일수 | 보험자부담금 | 요양급여비용총액
성별 | 1.000 | 0.000 | 0.009 | 0.000 | 0.000 | 0.000 | 0.011
입원외래구분 | 0.000 | 1.000 | 0.018 | 0.026 | 0.000 | 0.000 | 0.000
환자수 | 0.009 | 0.018 | 1.000 | 0.973 | 0.968 | 0.676 | 0.789
명세서건수 | 0.000 | 0.026 | 0.973 | 1.000 | 0.999 | 0.724 | 0.756
입내원일수 | 0.000 | 0.000 | 0.968 | 0.999 | 1.000 | 0.746 | 0.794
보험자부담금 | 0.000 | 0.000 | 0.676 | 0.724 | 0.746 | 1.000 | 0.990
요양급여비용총액 | 0.011 | 0.000 | 0.789 | 0.756 | 0.794 | 0.990 | 1.000

Variable | 성별 | 입원외래구분
성별 | 1.000 | 0.000
입원외래구분 | 0.000 | 1.000

Variable | 환자수 | 명세서건수 | 입내원일수 | 보험자부담금 | 요양급여비용총액 | 성별 | 입원외래구분
환자수 | 1.000 | 0.981 | 0.937 | 0.799 | 0.822 | 0.006 | 0.012
명세서건수 | 0.981 | 1.000 | 0.941 | 0.781 | 0.804 | 0.000 | 0.017
입내원일수 | 0.937 | 0.941 | 1.000 | 0.920 | 0.934 | 0.000 | 0.000
보험자부담금 | 0.799 | 0.781 | 0.920 | 1.000 | 0.998 | 0.000 | 0.000
요양급여비용총액 | 0.822 | 0.804 | 0.934 | 0.998 | 1.000 | 0.014 | 0.000
성별 | 0.006 | 0.000 | 0.000 | 0.000 | 0.014 | 1.000 | 0.000
입원외래구분 | 0.012 | 0.017 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
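
The report does not state which coefficient each matrix uses. As background for reading correlations on heavily skewed columns like these, the sketch below contrasts Pearson's coefficient with Spearman's rank correlation on a small hypothetical sample. The function names and data are illustrative assumptions and are not taken from the profiler.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: ys rises with xs but has one extreme value.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 10_000]

print(round(pearson(xs, ys), 3), round(spearman(xs, ys), 3))
```

Because ranks ignore magnitudes, the rank-based coefficient stays at 1 for any strictly increasing relationship, while Pearson's coefficient is pulled down by the single extreme value.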

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

진료년도 주상병코드 성별 입원외래구분 환자수 명세서건수 입내원일수 보험자부담금 요양급여비용총액
301432022M8168외래1964234231661215027810250
187092022K5782입원1321451034296867700380516910
225672022M1090입원1401731177215228420277003900
297652022M7990외래2836406391141984016131840
28162022C181입원2811221906229934720903238765130
46702022D0511입원1011684386773047508170
399072022S1314외래34808026590105226010
312002022M8688입원2437526168711350209019310
226022022M1099외래6434617918117908942260070906831594010
57002022D464외래713423415862591062965510
진료년도 주상병코드 성별 입원외래구분 환자수 명세서건수 입내원일수 보험자부담금 요양급여비용총액
11532022A830입원1128490249082160303213840
226062022M1100외래81616482660699260
37022022C675입원1112061884434946800478232380
416502022S568외래31010128620205420
193132022K761외래2636856854034300073259900
9612022A563외래455174930253030
384932022R470입원172199853187541540268275750
269072022M5414외래178545674566291236910428507510
287212022M7103입원11121111940043025365120
157612022J0381입원992838347404442210