Overview

Dataset statistics

Number of variables: 6
Number of observations: 240
Missing cells: 25
Missing cells (%): 1.7%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 12.3 KiB
Average record size in memory: 52.5 B
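
The percentages and the average record size follow from the raw counts. The sketch below recomputes them as a sanity check; the variable names are illustrative, and the formulas (missing cells over all cells, total memory over row count) are an assumption about how the report derives these figures, although they reproduce the reported values.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 6
n_observations = 240
missing_cells = 25
duplicate_rows = 1
total_size_kib = 12.3

# Assumed formulas: share of all cells, share of all rows, bytes per row.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```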

Variable types

DateTime: 1
Categorical: 1
Numeric: 4

Dataset

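Description: Outpatient counts at 국립마산병원 (National Masan Hospital). Visits are split into first visits (초진) and return visits (재진), and patient counts are provided for health insurance (건강보험(일반)), medical aid (의료급여(일반)), and uninsured (비보험(일반)) patients.
Author: 질병관리청 국립마산병원 (Korea Disease Control and Prevention Agency, National Masan Hospital)
URL: https://www.data.go.kr/data/3048701/fileData.do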

Alerts

Dataset has 1 (0.4%) duplicate rows [Duplicates]
건강보험(일반) is highly overall correlated with 의료급여(일반) and 2 other fields [High correlation]
의료급여(일반) is highly overall correlated with 건강보험(일반) and 2 other fields [High correlation]
비보험(일반) is highly overall correlated with 건강보험(일반) and 2 other fields [High correlation]
계(일반) is highly overall correlated with 건강보험(일반) and 2 other fields [High correlation]
년월 has 4 (1.7%) missing values [Missing]
건강보험(일반) has 4 (1.7%) missing values [Missing]
의료급여(일반) has 5 (2.1%) missing values [Missing]
비보험(일반) has 8 (3.3%) missing values [Missing]
계(일반) has 4 (1.7%) missing values [Missing]
의료급여(일반) has 6 (2.5%) zeros [Zeros]
비보험(일반) has 13 (5.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 21:48:54.174903
Analysis finished: 2023-12-12 21:48:55.820667
Duration: 1.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

년월
Date

MISSING 

Distinct: 118
Distinct (%): 50.0%
Missing: 4
Missing (%): 1.7%
Memory size: 2.0 KiB
Minimum: 2014-01-01 00:00:00
Maximum: 2023-10-01 00:00:00
Histogram with fixed size bins (bins=50)

구분(초진_재진)
Categorical

Distinct: 3
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value   Count
초진    119
재진    117
<NA>    4

Length

Max length: 4
Median length: 2
Mean length: 2.0333333
Min length: 2
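
The length statistics are consistent with the missing category being stored as the literal four-character string `<NA>`, which would also explain why this variable reports 0 missing values. The sketch below recomputes the lengths from the category counts given in this report; treating `<NA>` as a string is an assumption, not something the report states.

```python
# Category counts copied from the report.
# Assumption: the missing marker is the literal string "<NA>".
counts = {"초진": 119, "재진": 117, "<NA>": 4}

total = sum(counts.values())
# One length per row, sorted so the median can be read off directly.
lengths = sorted(len(value) for value, n in counts.items() for _ in range(n))

mean_length = sum(lengths) / total
# total is even here, so the median is the mean of the two middle values.
median_length = (lengths[total // 2 - 1] + lengths[total // 2]) / 2

print(min(lengths), max(lengths), median_length, round(mean_length, 7))
```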

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 초진
2nd row: 재진
3rd row: 초진
4th row: 재진
5th row: 초진

Common Values

Value   Count   Frequency (%)
초진    119     49.6%
재진    117     48.8%
<NA>    4       1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
초진    119     49.6%
재진    117     48.8%
na      4       1.7%

건강보험(일반)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 144
Distinct (%): 61.0%
Missing: 4
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 121.46186
Minimum: 8
Maximum: 353
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 8
5-th percentile: 31
Q1: 57.5
Median: 96.5
Q3: 173.5
95-th percentile: 264.25
Maximum: 353
Range: 345
Interquartile range (IQR): 116

Descriptive statistics

Standard deviation: 78.046703
Coefficient of variation (CV): 0.64256138
Kurtosis: -0.08547842
Mean: 121.46186
Median Absolute Deviation (MAD): 51.5
Skewness: 0.80410372
Sum: 28665
Variance: 6091.2879
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
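
Several of the 건강보험(일반) statistics can be derived from the others. The sketch below recomputes range, IQR, mean, CV, and variance from the reported primitives. The formulas are the standard definitions and the variable names are illustrative; whether the report uses exactly these definitions is an assumption, though the results agree with the reported figures to the printed precision.

```python
# Values copied from the 건강보험(일반) statistics in this report.
n_observations = 240
missing = 4
minimum, maximum = 8, 353
q1, q3 = 57.5, 173.5
total_sum = 28665
std = 78.046703

count = n_observations - missing   # non-missing values
value_range = maximum - minimum    # Range
iqr = q3 - q1                      # Interquartile range
mean = total_sum / count           # Mean over non-missing values
cv = std / mean                    # Coefficient of variation
variance = std ** 2                # Variance

print(value_range, iqr, round(mean, 5), round(cv, 5), round(variance, 2))
```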
Common Values

Value                Count   Frequency (%)
89                   5       2.1%
45                   5       2.1%
38                   4       1.7%
52                   4       1.7%
80                   4       1.7%
36                   4       1.7%
71                   4       1.7%
8                    3       1.2%
94                   3       1.2%
90                   3       1.2%
Other values (134)   197     82.1%
(Missing)            4       1.7%

Minimum 10 values

Value   Count   Frequency (%)
8       3       1.2%
13      1       0.4%
20      1       0.4%
21      1       0.4%
23      1       0.4%
25      1       0.4%
29      2       0.8%
30      1       0.4%
31      2       0.8%
32      3       1.2%

Maximum 10 values

Value   Count   Frequency (%)
353     1       0.4%
348     1       0.4%
346     1       0.4%
337     1       0.4%
311     1       0.4%
308     1       0.4%
280     1       0.4%
275     1       0.4%
272     1       0.4%
269     2       0.8%

의료급여(일반)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 63
Distinct (%): 26.8%
Missing: 5
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.548936
Minimum: 0
Maximum: 96
Zeros: 6
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 8
Median: 17
Q3: 28.5
95-th percentile: 61
Maximum: 96
Range: 96
Interquartile range (IQR): 20.5

Descriptive statistics

Standard deviation: 18.213852
Coefficient of variation (CV): 0.84523205
Kurtosis: 1.5643153
Mean: 21.548936
Median Absolute Deviation (MAD): 10
Skewness: 1.3351771
Sum: 5064
Variance: 331.74439
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count   Frequency (%)
5                   10      4.2%
8                   10      4.2%
19                  9       3.8%
24                  9       3.8%
9                   9       3.8%
7                   8       3.3%
15                  8       3.3%
1                   7       2.9%
17                  7       2.9%
16                  7       2.9%
Other values (53)   151     62.9%

Minimum 10 values

Value   Count   Frequency (%)
0       6       2.5%
1       7       2.9%
2       6       2.5%
3       3       1.2%
4       7       2.9%
5       10      4.2%
6       6       2.5%
7       8       3.3%
8       10      4.2%
9       9       3.8%

Maximum 10 values

Value   Count   Frequency (%)
96      1       0.4%
76      2       0.8%
73      1       0.4%
71      1       0.4%
69      1       0.4%
67      1       0.4%
66      3       1.2%
62      1       0.4%
61      2       0.8%
60      1       0.4%

비보험(일반)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 30
Distinct (%): 12.9%
Missing: 8
Missing (%): 3.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.9094828
Minimum: 0
Maximum: 119
Zeros: 13
Zeros (%): 5.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4.5
Q3: 8.25
95-th percentile: 29
Maximum: 119
Range: 119
Interquartile range (IQR): 6.25

Descriptive statistics

Standard deviation: 12.947964
Coefficient of variation (CV): 1.6370178
Kurtosis: 34.836835
Mean: 7.9094828
Median Absolute Deviation (MAD): 2.5
Skewness: 5.2359653
Sum: 1835
Variance: 167.64978
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Common Values

Value               Count   Frequency (%)
3                   32      13.3%
1                   25      10.4%
2                   24      10.0%
5                   23      9.6%
4                   22      9.2%
0                   13      5.4%
7                   13      5.4%
6                   11      4.6%
8                   11      4.6%
10                  8       3.3%
Other values (20)   50      20.8%

Minimum 10 values

Value   Count   Frequency (%)
0       13      5.4%
1       25      10.4%
2       24      10.0%
3       32      13.3%
4       22      9.2%
5       23      9.6%
6       11      4.6%
7       13      5.4%
8       11      4.6%
9       6       2.5%

Maximum 10 values

Value   Count   Frequency (%)
119     1       0.4%
95      1       0.4%
70      1       0.4%
64      1       0.4%
43      1       0.4%
37      2       0.8%
33      2       0.8%
31      1       0.4%
30      1       0.4%
29      2       0.8%

계(일반)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 165
Distinct (%): 69.9%
Missing: 4
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.69492
Minimum: 8
Maximum: 419
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 8
5-th percentile: 35.75
Q1: 73.75
Median: 125
Q3: 211.25
95-th percentile: 329
Maximum: 419
Range: 411
Interquartile range (IQR): 137.5

Descriptive statistics

Standard deviation: 95.456485
Coefficient of variation (CV): 0.63344198
Kurtosis: -0.26536599
Mean: 150.69492
Median Absolute Deviation (MAD): 66.5
Skewness: 0.732638
Sum: 35564
Variance: 9111.9406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value                Count   Frequency (%)
90                   5       2.1%
54                   4       1.7%
47                   4       1.7%
97                   3       1.2%
105                  3       1.2%
122                  3       1.2%
188                  3       1.2%
131                  3       1.2%
46                   3       1.2%
69                   3       1.2%
Other values (155)   202     84.2%
(Missing)            4       1.7%

Minimum 10 values

Value   Count   Frequency (%)
8       3       1.2%
17      1       0.4%
22      2       0.8%
25      1       0.4%
29      1       0.4%
32      1       0.4%
33      1       0.4%
34      1       0.4%
35      1       0.4%
36      1       0.4%

Maximum 10 values

Value   Count   Frequency (%)
419     1       0.4%
414     1       0.4%
392     1       0.4%
388     1       0.4%
383     1       0.4%
380     1       0.4%
361     1       0.4%
347     1       0.4%
345     1       0.4%
342     1       0.4%

Interactions

[Figures: 16 interaction plots, images not included]

Correlations

                  구분(초진_재진)   건강보험(일반)   의료급여(일반)   비보험(일반)   계(일반)
구분(초진_재진)   1.000             0.650            0.458            0.177          0.643
건강보험(일반)    0.650             1.000            0.619            0.495          0.967
의료급여(일반)    0.458             0.619            1.000            0.236          0.732
비보험(일반)      0.177             0.495            0.236            1.000          0.419
계(일반)          0.643             0.967            0.732            0.419          1.000

                  건강보험(일반)   의료급여(일반)   비보험(일반)   계(일반)   구분(초진_재진)
건강보험(일반)    1.000            0.683            0.516          0.983      0.496
의료급여(일반)    0.683            1.000            0.599          0.771      0.452
비보험(일반)      0.516            0.599            1.000          0.602      0.187
계(일반)          0.983            0.771            0.602          1.000      0.491
구분(초진_재진)   0.496            0.452            0.187          0.491      1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

      년월      구분(초진_재진)   건강보험(일반)   의료급여(일반)   비보험(일반)   계(일반)
0     2014-01   초진              21               1                <NA>           22
1     2014-01   재진              254              38               14             306
2     2014-02   초진              34               4                2              40
3     2014-02   재진              235              51               6              292
4     2014-03   초진              52               2                <NA>           54
5     2014-03   재진              260              54               10             324
6     2014-04   초진              52               2                <NA>           54
7     2014-04   재진              272              62               27             361
8     2014-05   초진              25               2                2              29
9     2014-05   재진              264              73               8              345

Last rows

      년월      구분(초진_재진)   건강보험(일반)   의료급여(일반)   비보험(일반)   계(일반)
230   2023-08   초진              43               9                1              53
231   2023-08   재진              62               13               5              80
232   2023-09   초진              40               11               2              53
233   2023-09   재진              78               9                3              90
234   2023-10   초진              169              5                1              175
235   2023-10   재진              90               12               5              107
236   <NA>      <NA>              <NA>             <NA>             <NA>           <NA>
237   <NA>      <NA>              <NA>             <NA>             <NA>           <NA>
238   <NA>      <NA>              <NA>             <NA>             <NA>           <NA>
239   <NA>      <NA>              <NA>             <NA>             <NA>           <NA>
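
In the sampled rows, 계(일반) appears to equal the sum of the three payer columns, with `<NA>` counted as 0. That relationship would also be consistent with the high-correlation alerts involving 계(일반). The sketch below checks it on the 16 non-empty rows shown in the sample only. It says nothing about the rows not shown, and the helper name `components_match_total` is illustrative.

```python
# (년월, 구분, 건강보험, 의료급여, 비보험, 계) for the non-empty sample rows.
# None stands in for <NA>.
sample_rows = [
    ("2014-01", "초진", 21, 1, None, 22),
    ("2014-01", "재진", 254, 38, 14, 306),
    ("2014-02", "초진", 34, 4, 2, 40),
    ("2014-02", "재진", 235, 51, 6, 292),
    ("2014-03", "초진", 52, 2, None, 54),
    ("2014-03", "재진", 260, 54, 10, 324),
    ("2014-04", "초진", 52, 2, None, 54),
    ("2014-04", "재진", 272, 62, 27, 361),
    ("2014-05", "초진", 25, 2, 2, 29),
    ("2014-05", "재진", 264, 73, 8, 345),
    ("2023-08", "초진", 43, 9, 1, 53),
    ("2023-08", "재진", 62, 13, 5, 80),
    ("2023-09", "초진", 40, 11, 2, 53),
    ("2023-09", "재진", 78, 9, 3, 90),
    ("2023-10", "초진", 169, 5, 1, 175),
    ("2023-10", "재진", 90, 12, 5, 107),
]


def components_match_total(row):
    """Return True if the three payer counts add up to the total (<NA> as 0)."""
    *_, health, medical_aid, uninsured, total = row
    return sum(v or 0 for v in (health, medical_aid, uninsured)) == total


mismatches = [row for row in sample_rows if not components_match_total(row)]
print(f"{len(sample_rows) - len(mismatches)} of {len(sample_rows)} sampled rows are consistent")
```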

Duplicate rows

Most frequently occurring

    년월   구분(초진_재진)   건강보험(일반)   의료급여(일반)   비보험(일반)   계(일반)   # duplicates
0   <NA>   <NA>              <NA>             <NA>             <NA>           <NA>       4
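
The only duplicated row is entirely `<NA>` and occurs 4 times, so every per-column missing count in the Alerts section includes those 4 rows. The sketch below shows how the counts would change if those rows were dropped. It assumes the 4 empty rows are padding rather than real observations, which the report itself does not confirm.

```python
# Missing counts per column, copied from the Alerts section.
reported_missing = {
    "년월": 4,
    "건강보험(일반)": 4,
    "의료급여(일반)": 5,
    "비보험(일반)": 8,
    "계(일반)": 4,
}
n_observations = 240
empty_rows = 4  # all-<NA> rows, per the duplicates table (assumed to be padding)

remaining_rows = n_observations - empty_rows
remaining_missing = {col: n - empty_rows for col, n in reported_missing.items()}

for col, n in remaining_missing.items():
    print(f"{col}: {n} missing of {remaining_rows} ({100 * n / remaining_rows:.1f}%)")
```

Under that assumption, only 의료급여(일반) and 비보험(일반) would still contain missing values.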