Overview

Dataset statistics

Number of variables: 8
Number of observations: 480
Missing cells: 84
Missing cells (%): 2.2%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 32.9 KiB
Average record size in memory: 70.3 B
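
The two percentages follow directly from the counts. A minimal sketch of the arithmetic, using only the figures reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 480
missing_cells = 84
duplicate_rows = 1

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations # share of duplicated rows

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

The 84 missing cells also equal 7 columns × 12 rows, which matches the seven per-column "Missing" alerts of 12 values each.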

Variable types

DateTime: 1
Categorical: 1
Numeric: 6

Dataset

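Description: Medical fee claim status data. It provides the treatment year and month (진료년월), insurance type, visit type, number of claims (청구건수), total medical care benefit amount (요양급여총액), patient copayment (본인부담), subsidy (지원금), disabled persons (장애인), claim amount (청구금액), and related information.
Author: 질병관리청 국립마산병원 (Korea Disease Control and Prevention Agency, National Masan Hospital)
URL: https://www.data.go.kr/data/3048703/fileData.do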

Alerts

Dataset has 1 (0.2%) duplicate row [Duplicates]
인원(청구건수) is highly overall correlated with 본인부담(B) and 1 other field [High correlation]
요양급여총액 is highly overall correlated with 본인부담(B) and 2 other fields [High correlation]
본인부담(B) is highly overall correlated with 인원(청구건수) and 2 other fields [High correlation]
지원금 is highly overall correlated with 인원(청구건수) [High correlation]
청구금액(D) is highly overall correlated with 요양급여총액 and 2 other fields [High correlation]
서식 is highly overall correlated with 요양급여총액 and 1 other field [High correlation]
진료년월 has 12 (2.5%) missing values [Missing]
인원(청구건수) has 12 (2.5%) missing values [Missing]
요양급여총액 has 12 (2.5%) missing values [Missing]
본인부담(B) has 12 (2.5%) missing values [Missing]
지원금 has 12 (2.5%) missing values [Missing]
장애인(C) has 12 (2.5%) missing values [Missing]
청구금액(D) has 12 (2.5%) missing values [Missing]
인원(청구건수) has 6 (1.2%) zeros [Zeros]
요양급여총액 has 6 (1.2%) zeros [Zeros]
본인부담(B) has 38 (7.9%) zeros [Zeros]
지원금 has 365 (76.0%) zeros [Zeros]
장애인(C) has 430 (89.6%) zeros [Zeros]
청구금액(D) has 7 (1.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 21:18:00.742182
Analysis finished: 2023-12-12 21:18:05.405906
Duration: 4.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

진료년월
Date

MISSING 

Distinct: 117
Distinct (%): 25.0%
Missing: 12
Missing (%): 2.5%
Memory size: 3.9 KiB
Minimum: 2014-01-01 00:00:00
Maximum: 2023-09-01 00:00:00

Histogram with fixed size bins (bins=50)
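
The distinct count is consistent with one value per calendar month between the reported minimum and maximum. The sketch below checks this, assuming each 진료년월 value is a month start and each month appears once per non-missing 서식 value (117 rows each, as reported for that variable); the variable names are illustrative.

```python
from datetime import date

start = date(2014, 1, 1)   # reported Minimum
end = date(2023, 9, 1)     # reported Maximum

# Inclusive count of calendar months between the two dates.
n_months = (end.year - start.year) * 12 + (end.month - start.month) + 1

n_categories = 4           # non-missing 서식 values, 117 rows each
n_missing_rows = 12        # rows where 진료년월 is missing

print(n_months)
print(n_months * n_categories + n_missing_rows)
```

Under these assumptions the month count equals the 117 distinct values, and months × categories plus the missing rows equals the 480 observations.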

서식
Categorical

HIGH CORRELATION 

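Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value | Count
건강보험 외래 (health insurance, outpatient) | 117
건강보험 입원 (health insurance, inpatient) | 117
의료급여 외래 (medical aid, outpatient) | 117
의료급여 입원 (medical aid, inpatient) | 117
<NA> | 12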

Length

Max length: 7
Median length: 7
Mean length: 6.925
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건강보험 외래
2nd row: 건강보험 입원
3rd row: 의료급여 외래
4th row: 의료급여 입원
5th row: 건강보험 외래

Common Values

Value | Count | Frequency (%)
건강보험 외래 | 117 | 24.4%
건강보험 입원 | 117 | 24.4%
의료급여 외래 | 117 | 24.4%
의료급여 입원 | 117 | 24.4%
<NA> | 12 | 2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
건강보험 | 234 | 24.7%
외래 | 234 | 24.7%
입원 | 234 | 24.7%
의료급여 | 234 | 24.7%
na | 12 | 1.3%
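
The second frequency table appears to count whitespace-separated tokens rather than whole categories. The sketch below reproduces those counts and the reported mean length from the category counts alone. It is an illustration, not the profiler's own tokenizer: it keeps the missing marker as the literal string "<NA>", whereas the report shows it as "na".

```python
from collections import Counter

# Category counts copied from the Common Values table; "<NA>" marks missing rows.
category_counts = {
    "건강보험 외래": 117,
    "건강보험 입원": 117,
    "의료급여 외래": 117,
    "의료급여 입원": 117,
    "<NA>": 12,
}

# Count whitespace-separated tokens, weighting each by its category count.
token_counts = Counter()
for category, count in category_counts.items():
    for token in category.split():
        token_counts[token] += count

total_tokens = sum(token_counts.values())
for token, count in token_counts.most_common():
    print(token, count, f"{100 * count / total_tokens:.1f}%")

# Mean string length, treating the missing marker as the 4-character string "<NA>".
n_rows = sum(category_counts.values())
mean_length = sum(len(c) * n for c, n in category_counts.items()) / n_rows
print(mean_length)
```

Each insurance type and each visit type occurs in two of the four categories, which is why every token count is 2 × 117 = 234.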

인원(청구건수)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 193
Distinct (%): 41.2%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 88.087607
Minimum: 0
Maximum: 419
Zeros: 6
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7
Q1: 32
Median: 58
Q3: 121.25
95-th percentile: 255.65
Maximum: 419
Range: 419
Interquartile range (IQR): 89.25

Descriptive statistics

Standard deviation: 79.446525
Coefficient of variation (CV): 0.90190355
Kurtosis: 1.4108488
Mean: 88.087607
Median Absolute Deviation (MAD): 37
Skewness: 1.3597069
Sum: 41225
Variance: 6311.7503
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
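
Several of the statistics for 인원(청구건수) are derived from one another, so they can be cross-checked from the reported figures alone. A small sketch of those relationships (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, mean = sum / non-missing count); the variable names are illustrative:

```python
# Values copied from the statistics reported for 인원(청구건수).
mean = 88.087607
std = 79.446525
q1, q3 = 32, 121.25
total = 41225
n_non_missing = 480 - 12   # observations minus missing values

cv = std / mean                        # coefficient of variation
variance = std ** 2                    # variance from the standard deviation
iqr = q3 - q1                          # interquartile range
mean_from_sum = total / n_non_missing  # mean recomputed from the sum

print(round(cv, 4), round(variance, 2), iqr, round(mean_from_sum, 4))
```

The same relationships apply to the other numeric variables in this report.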
Common values

Value | Count | Frequency (%)
19 | 11 | 2.3%
52 | 7 | 1.5%
58 | 7 | 1.5%
43 | 7 | 1.5%
42 | 7 | 1.5%
35 | 7 | 1.5%
30 | 6 | 1.2%
37 | 6 | 1.2%
0 | 6 | 1.2%
36 | 6 | 1.2%
Other values (183) | 398 | 82.9%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6 | 1.2%
1 | 3 | 0.6%
2 | 3 | 0.6%
3 | 1 | 0.2%
4 | 4 | 0.8%
5 | 2 | 0.4%
6 | 3 | 0.6%
7 | 4 | 0.8%
8 | 4 | 0.8%
9 | 2 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
419 | 1 | 0.2%
400 | 1 | 0.2%
365 | 1 | 0.2%
353 | 1 | 0.2%
342 | 1 | 0.2%
311 | 1 | 0.2%
307 | 1 | 0.2%
304 | 1 | 0.2%
293 | 1 | 0.2%
289 | 1 | 0.2%

요양급여총액
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 463
Distinct (%): 98.9%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 86983967
Minimum: 0
Maximum: 9.1239458 × 10^8
Zeros: 6
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 726424
Q1: 5576860
Median: 19548670
Q3: 1.2452174 × 10^8
95-th percentile: 2.7949522 × 10^8
Maximum: 9.1239458 × 10^8
Range: 9.1239458 × 10^8
Interquartile range (IQR): 1.1894488 × 10^8

Descriptive statistics

Standard deviation: 1.2020361 × 10^8
Coefficient of variation (CV): 1.3819054
Kurtosis: 12.187663
Mean: 86983967
Median Absolute Deviation (MAD): 19324955
Skewness: 2.7806897
Sum: 4.0708496 × 10^10
Variance: 1.4448909 × 10^16
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 6 | 1.2%
27734320 | 1 | 0.2%
12511930 | 1 | 0.2%
70825570 | 1 | 0.2%
2270040 | 1 | 0.2%
172750450 | 1 | 0.2%
10046310 | 1 | 0.2%
54562520 | 1 | 0.2%
2429930 | 1 | 0.2%
126289090 | 1 | 0.2%
Other values (453) | 453 | 94.4%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6 | 1.2%
11530 | 1 | 0.2%
16370 | 1 | 0.2%
49980 | 1 | 0.2%
58640 | 1 | 0.2%
74290 | 1 | 0.2%
104580 | 1 | 0.2%
125140 | 1 | 0.2%
147290 | 1 | 0.2%
201230 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
912394580 | 1 | 0.2%
849943320 | 1 | 0.2%
810540360 | 1 | 0.2%
670620200 | 1 | 0.2%
619323030 | 1 | 0.2%
519330840 | 1 | 0.2%
512327660 | 1 | 0.2%
506149100 | 1 | 0.2%
414181840 | 1 | 0.2%
369786900 | 1 | 0.2%

본인부담(B)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 422
Distinct (%): 90.2%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 7607222.4
Minimum: 0
Maximum: 1.9823357 × 10^8
Zeros: 38
Zeros (%): 7.9%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 41475
Median: 1612090
Q3: 3874400
95-th percentile: 33346858
Maximum: 1.9823357 × 10^8
Range: 1.9823357 × 10^8
Interquartile range (IQR): 3832925

Descriptive statistics

Standard deviation: 20293549
Coefficient of variation (CV): 2.6676686
Kurtosis: 45.005884
Mean: 7607222.4
Median Absolute Deviation (MAD): 1574770
Skewness: 6.0612147
Sum: 3.5601801 × 10^9
Variance: 4.1182812 × 10^14
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 38 | 7.9%
1500 | 3 | 0.6%
4500 | 3 | 0.6%
6000 | 2 | 0.4%
7500 | 2 | 0.4%
39699720 | 2 | 0.4%
21900 | 2 | 0.4%
2306500 | 2 | 0.4%
1827700 | 1 | 0.2%
25940 | 1 | 0.2%
Other values (412) | 412 | 85.8%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 38 | 7.9%
1500 | 3 | 0.6%
1750 | 1 | 0.2%
3000 | 1 | 0.2%
3500 | 1 | 0.2%
4500 | 3 | 0.6%
6000 | 2 | 0.4%
6990 | 1 | 0.2%
7500 | 2 | 0.4%
7850 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
198233570 | 1 | 0.2%
182665700 | 1 | 0.2%
170840320 | 1 | 0.2%
144343100 | 1 | 0.2%
135908270 | 1 | 0.2%
110404250 | 1 | 0.2%
74510780 | 1 | 0.2%
60116020 | 1 | 0.2%
52710730 | 1 | 0.2%
46895640 | 1 | 0.2%

지원금
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 102
Distinct (%): 21.8%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 82683.632
Minimum: 0
Maximum: 843550
Zeros: 365
Zeros (%): 76.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 641120
Maximum: 843550
Range: 843550
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 194540.85
Coefficient of variation (CV): 2.3528339
Kurtosis: 4.0133562
Mean: 82683.632
Median Absolute Deviation (MAD): 0
Skewness: 2.3160623
Sum: 38695940
Variance: 3.7846142 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 365 | 76.0%
452070 | 2 | 0.4%
49270 | 2 | 0.4%
117800 | 1 | 0.2%
159780 | 1 | 0.2%
243140 | 1 | 0.2%
127320 | 1 | 0.2%
220380 | 1 | 0.2%
220880 | 1 | 0.2%
282800 | 1 | 0.2%
Other values (92) | 92 | 19.2%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 365 | 76.0%
4300 | 1 | 0.2%
4480 | 1 | 0.2%
4700 | 1 | 0.2%
9900 | 1 | 0.2%
16600 | 1 | 0.2%
17100 | 1 | 0.2%
17900 | 1 | 0.2%
31700 | 1 | 0.2%
36500 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
843550 | 1 | 0.2%
778840 | 1 | 0.2%
772250 | 1 | 0.2%
764500 | 1 | 0.2%
733750 | 1 | 0.2%
713900 | 1 | 0.2%
711900 | 1 | 0.2%
710950 | 1 | 0.2%
706750 | 1 | 0.2%
684000 | 1 | 0.2%

장애인(C)
Real number (ℝ)

MISSING  ZEROS 

Distinct: 27
Distinct (%): 5.8%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 7271.1538
Minimum: 0
Maximum: 1660800
Zeros: 430
Zeros (%): 89.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5178
Maximum: 1660800
Range: 1660800
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 81788.724
Coefficient of variation (CV): 11.248383
Kurtosis: 360.78871
Mean: 7271.1538
Median Absolute Deviation (MAD): 0
Skewness: 18.16732
Sum: 3402900
Variance: 6.6893954 × 10^9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=27)
Common values

Value | Count | Frequency (%)
0 | 430 | 89.6%
1000 | 5 | 1.0%
5220 | 3 | 0.6%
2000 | 3 | 0.6%
5310 | 3 | 0.6%
23800 | 2 | 0.4%
3000 | 2 | 0.4%
250080 | 1 | 0.2%
790 | 1 | 0.2%
239540 | 1 | 0.2%
Other values (17) | 17 | 3.5%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 430 | 89.6%
790 | 1 | 0.2%
1000 | 5 | 1.0%
2000 | 3 | 0.6%
2150 | 1 | 0.2%
2860 | 1 | 0.2%
3000 | 2 | 0.4%
5100 | 1 | 0.2%
5220 | 3 | 0.6%
5310 | 3 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
1660800 | 1 | 0.2%
366820 | 1 | 0.2%
283690 | 1 | 0.2%
250080 | 1 | 0.2%
239540 | 1 | 0.2%
172700 | 1 | 0.2%
132720 | 1 | 0.2%
52400 | 1 | 0.2%
51060 | 1 | 0.2%
24720 | 1 | 0.2%

청구금액(D)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 462
Distinct (%): 98.7%
Missing: 12
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 79186751
Minimum: 0
Maximum: 7.1416101 × 10^8
Zeros: 7
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 368804.5
Q1: 4577370
Median: 16486260
Q3: 1.1967872 × 10^8
95-th percentile: 2.653258 × 10^8
Maximum: 7.1416101 × 10^8
Range: 7.1416101 × 10^8
Interquartile range (IQR): 1.1510134 × 10^8

Descriptive statistics

Standard deviation: 1.0389655 × 10^8
Coefficient of variation (CV): 1.3120446
Kurtosis: 7.6740321
Mean: 79186751
Median Absolute Deviation (MAD): 16359210
Skewness: 2.2429573
Sum: 3.70594 × 10^10
Variance: 1.0794493 × 10^16
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 7 | 1.5%
26564080 | 1 | 0.2%
10390530 | 1 | 0.2%
69470800 | 1 | 0.2%
2244100 | 1 | 0.2%
163954630 | 1 | 0.2%
8216110 | 1 | 0.2%
53567520 | 1 | 0.2%
2394670 | 1 | 0.2%
119570980 | 1 | 0.2%
Other values (452) | 452 | 94.2%
(Missing) | 12 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 7 | 1.5%
11530 | 1 | 0.2%
16370 | 1 | 0.2%
49980 | 1 | 0.2%
56890 | 1 | 0.2%
62980 | 1 | 0.2%
74290 | 1 | 0.2%
75440 | 1 | 0.2%
99890 | 1 | 0.2%
154210 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
714161010 | 1 | 0.2%
667277620 | 1 | 0.2%
639700040 | 1 | 0.2%
526277100 | 1 | 0.2%
499924640 | 1 | 0.2%
483414760 | 1 | 0.2%
444820060 | 1 | 0.2%
401923410 | 1 | 0.2%
365568390 | 1 | 0.2%
354065820 | 1 | 0.2%

Interactions

[Figure: 36 pairwise interaction plots of the numeric variables; images not preserved.]

Correlations

Variable | 서식 | 인원(청구건수) | 요양급여총액 | 본인부담(B) | 지원금 | 장애인(C) | 청구금액(D)
서식 | 1.000 | 0.683 | 0.777 | 0.452 | 0.532 | 0.203 | 0.845
인원(청구건수) | 0.683 | 1.000 | 0.873 | 0.751 | 0.743 | 0.000 | 0.852
요양급여총액 | 0.777 | 0.873 | 1.000 | 0.940 | 0.216 | 0.173 | 0.988
본인부담(B) | 0.452 | 0.751 | 0.940 | 1.000 | 0.461 | 0.072 | 0.910
지원금 | 0.532 | 0.743 | 0.216 | 0.461 | 1.000 | 0.205 | 0.311
장애인(C) | 0.203 | 0.000 | 0.173 | 0.072 | 0.205 | 1.000 | 0.164
청구금액(D) | 0.845 | 0.852 | 0.988 | 0.910 | 0.311 | 0.164 | 1.000

Variable | 인원(청구건수) | 요양급여총액 | 본인부담(B) | 지원금 | 장애인(C) | 청구금액(D) | 서식
인원(청구건수) | 1.000 | 0.402 | 0.579 | 0.639 | 0.215 | 0.383 | 0.481
요양급여총액 | 0.402 | 1.000 | 0.734 | 0.082 | -0.058 | 0.991 | 0.589
본인부담(B) | 0.579 | 0.734 | 1.000 | 0.415 | 0.073 | 0.708 | 0.304
지원금 | 0.639 | 0.082 | 0.415 | 1.000 | 0.241 | 0.048 | 0.345
장애인(C) | 0.215 | -0.058 | 0.073 | 0.241 | 1.000 | -0.062 | 0.081
청구금액(D) | 0.383 | 0.991 | 0.708 | 0.048 | -0.062 | 1.000 | 0.686
서식 | 0.481 | 0.589 | 0.304 | 0.345 | 0.081 | 0.686 | 1.000
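
The "High correlation" alerts listed earlier can be related to the second matrix. The sketch below flags every pair whose absolute coefficient exceeds a cutoff. The cutoff of 0.5 is an assumption (the report does not state it), and `highly_correlated` is a hypothetical helper, not a ydata-profiling function.

```python
# Second correlation matrix from this section, copied row by row.
columns = ["인원(청구건수)", "요양급여총액", "본인부담(B)", "지원금",
           "장애인(C)", "청구금액(D)", "서식"]
matrix = [
    [1.000, 0.402, 0.579, 0.639, 0.215, 0.383, 0.481],
    [0.402, 1.000, 0.734, 0.082, -0.058, 0.991, 0.589],
    [0.579, 0.734, 1.000, 0.415, 0.073, 0.708, 0.304],
    [0.639, 0.082, 0.415, 1.000, 0.241, 0.048, 0.345],
    [0.215, -0.058, 0.073, 0.241, 1.000, -0.062, 0.081],
    [0.383, 0.991, 0.708, 0.048, -0.062, 1.000, 0.686],
    [0.481, 0.589, 0.304, 0.345, 0.081, 0.686, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report


def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns whose |coefficient| exceeds threshold."""
    result = {}
    for i, name in enumerate(columns):
        result[name] = [
            other for j, other in enumerate(columns)
            if i != j and abs(matrix[i][j]) > threshold
        ]
    return result


pairs = highly_correlated(columns, matrix, THRESHOLD)
for name, others in pairs.items():
    if others:
        print(f"{name}: {', '.join(others)}")
```

With this cutoff, the flagged partners per variable match the counts in the alerts (for example, 지원금 pairs only with 인원(청구건수), and 장애인(C) has no partner), which suggests the alerts were derived from the second matrix rather than the first.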

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

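First rows

Index | 진료년월 | 서식 | 인원(청구건수) | 요양급여총액 | 본인부담(B) | 지원금 | 장애인(C) | 청구금액(D)
0 | 2014-01 | 건강보험 외래 | 260 | 16766020 | 2627900 | 664250 | 0 | 14138120
1 | 2014-01 | 건강보험 입원 | 177 | 229439520 | 39539180 | 251520 | 172700 | 189727640
2 | 2014-01 | 의료급여 외래 | 37 | 1929460 | 73780 | 0 | 0 | 1855680
3 | 2014-01 | 의료급여 입원 | 81 | 118172210 | 206430 | 0 | 0 | 117965780
4 | 2014-02 | 건강보험 외래 | 248 | 14845890 | 2307600 | 584500 | 0 | 12538290
5 | 2014-02 | 건강보험 입원 | 162 | 204558620 | 35082300 | 214720 | 250080 | 169226240
6 | 2014-02 | 의료급여 외래 | 54 | 2878380 | 102010 | 0 | 10440 | 2765930
7 | 2014-02 | 의료급여 입원 | 75 | 97133660 | 0 | 0 | 0 | 97133660
8 | 2014-03 | 건강보험 외래 | 293 | 19557310 | 3849100 | 650300 | 0 | 15708210
9 | 2014-03 | 건강보험 입원 | 177 | 237044780 | 39899850 | 490440 | 283690 | 196861240

Last rows

Index | 진료년월 | 서식 | 인원(청구건수) | 요양급여총액 | 본인부담(B) | 지원금 | 장애인(C) | 청구금액(D)
470 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
471 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
472 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
473 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
474 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
475 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
476 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
477 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
478 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
479 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>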
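
In the first ten sample rows, the column suffixes (B), (C), and (D) appear to encode an accounting identity: 청구금액(D) = 요양급여총액 − 본인부담(B) − 장애인(C). The sketch below checks that relationship on those ten rows only. It is an observation about the sample, not a documented rule of the dataset, and `claim_matches` is a hypothetical helper.

```python
# (요양급여총액, 본인부담(B), 장애인(C), 청구금액(D)) for sample rows 0 to 9.
rows = [
    (16766020, 2627900, 0, 14138120),
    (229439520, 39539180, 172700, 189727640),
    (1929460, 73780, 0, 1855680),
    (118172210, 206430, 0, 117965780),
    (14845890, 2307600, 0, 12538290),
    (204558620, 35082300, 250080, 169226240),
    (2878380, 102010, 10440, 2765930),
    (97133660, 0, 0, 97133660),
    (19557310, 3849100, 0, 15708210),
    (237044780, 39899850, 283690, 196861240),
]


def claim_matches(total, copay, disabled, claimed):
    """Return True when the claimed amount equals total minus copay minus disabled."""
    return total - copay - disabled == claimed


# Collect the indices of rows that break the identity.
mismatches = [i for i, row in enumerate(rows) if not claim_matches(*row)]
print(mismatches)
```

지원금 does not enter the identity in these rows, which is consistent with its near-zero correlation with 청구금액(D) in the second correlation matrix.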

Duplicate rows

Most frequently occurring

Index | 진료년월 | 서식 | 인원(청구건수) | 요양급여총액 | 본인부담(B) | 지원금 | 장애인(C) | 청구금액(D) | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 12