Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 673.8 KiB
Average record size in memory: 69.0 B
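
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 673.8      # Total size in memory
n_observations = 10_000     # Number of observations

# Assumption: KiB means 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # 69.0 B
```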

Variable types

Categorical: 2
Text: 1
Numeric: 4

Dataset

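Description: Procedure statistics by listed specialty for clinic-level medical institutions / Based on treatment date (reviewed claims cover each treatment year plus 4 months); e.g. treatment months: January to December 2020, review months: January 2020 to April 2021 / Insurer: National Health Insurance / Note: paper claims and DRG claims are excluded starting with 2019 treatment data
URL: https://www.data.go.kr/data/15055563/fileData.do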

Alerts

Column names are kept in Korean: 진료년도 (treatment year), 표시과목 (listed specialty), 행위코드 (procedure code), 환자수 (number of patients), 명세서청구건수 (number of claim statements), 총사용량 (total usage), 진료행위청구금액 (claimed amount for procedures).

진료년도 has constant value "2022" [Constant]
환자수 is highly overall correlated with 명세서청구건수 and 2 other fields [High correlation]
명세서청구건수 is highly overall correlated with 환자수 and 2 other fields [High correlation]
총사용량 is highly overall correlated with 환자수 and 2 other fields [High correlation]
진료행위청구금액 is highly overall correlated with 환자수 and 2 other fields [High correlation]
환자수 is highly skewed (γ1 = 20.24294616) [Skewed]
명세서청구건수 is highly skewed (γ1 = 28.75499717) [Skewed]
총사용량 is highly skewed (γ1 = 28.47770602) [Skewed]
진료행위청구금액 is highly skewed (γ1 = 34.37560011) [Skewed]
진료행위청구금액 has 112 (1.1%) zeros [Zeros]
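
The γ1 values in the skewness alerts are skewness coefficients. As a rough illustration, the sketch below computes the Fisher-Pearson moment coefficient of skewness in plain Python. The function name and the input lists are hypothetical, and the profiler's own estimator may differ (for example by applying a bias correction), so this is not expected to reproduce the reported figures exactly.

```python
def sample_skewness(values):
    """Fisher-Pearson moment coefficient of skewness: g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return float("nan")  # undefined for a constant column
    return m3 / m2 ** 1.5

# Illustrative inputs only, not rows from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tailed = [1, 1, 2, 4, 44, 771, 46166]

print(sample_skewness(symmetric))    # 0.0 for a symmetric sample
print(sample_skewness(long_tailed))  # positive: long right tail
```

A large positive value, as in the four numeric columns here, indicates that a few very large observations dominate the right tail.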

Reproduction

Analysis started: 2023-12-12 15:58:37.120817
Analysis finished: 2023-12-12 15:58:40.299237
Duration: 3.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
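
The reported duration can be checked from the two timestamps. A small sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-12-12 15:58:37.120817")
finished = datetime.fromisoformat("2023-12-12 15:58:40.299237")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 3.18 seconds
```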

Variables

진료년도 (treatment year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
2022   10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2022   10000  100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values (2022: 10000, 100.0%)
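표시과목 (listed specialty)
Categorical

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
일반의 (general practitioner)  1220
내과 (internal medicine)  838
정형외과 (orthopedic surgery)  822
외과 (general surgery)  714
산부인과 (obstetrics and gynecology)  595
Other values (23)  5811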

Length

Max length: 12
Median length: 7
Mean length: 4.3282
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

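1st row: 산부인과 (obstetrics and gynecology)
2nd row: 외과 (general surgery)
3rd row: 일반의 (general practitioner)
4th row: 소아청소년과 (pediatrics)
5th row: 내과 (internal medicine)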

Common Values

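Value  Count  Frequency (%)
일반의 (general practitioner)  1220  12.2%
내과 (internal medicine)  838  8.4%
정형외과 (orthopedic surgery)  822  8.2%
외과 (general surgery)  714  7.1%
산부인과 (obstetrics and gynecology)  595  5.9%
신경외과 (neurosurgery)  554  5.5%
마취통증의학과 (anesthesiology and pain medicine)  506  5.1%
이비인후과 (otorhinolaryngology)  497  5.0%
재활의학과 (rehabilitation medicine)  470  4.7%
가정의학과 (family medicine)  440  4.4%
Other values (18)  3344  33.4%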

Length

Figure: Histogram of lengths of the category

행위코드 (procedure code)
Text

Distinct: 3677
Distinct (%): 36.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 50000
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
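
To make the category counts concrete, the sketch below looks up the Unicode general category of each character with Python's standard `unicodedata` module. It uses the five sample codes reported for this variable; the mapping from two-letter category codes to the names shown in this report (`Nd` for Decimal Number, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Sc` for Currency Symbol) comes from the Unicode Standard. This is only an illustration of the idea, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable.
codes = ["D3214", "N0210", "D1850", "Y2400", "G6503"]

# Count the general category of every character across all codes.
category_counts = Counter(
    unicodedata.category(ch) for code in codes for ch in code
)
print(category_counts)  # Counter({'Nd': 20, 'Lu': 5})

# The two rarer categories reported for this column.
print(unicodedata.category(" "))  # Zs (Space Separator)
print(unicodedata.category("$"))  # Sc (Currency Symbol)
```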

Unique

Unique: 1598
Unique (%): 16.0%

Sample

1st row: D3214
2nd row: N0210
3rd row: D1850
4th row: Y2400
5th row: G6503

Value  Count  Frequency (%)
al801  14  0.1%
d7026  14  0.1%
d4350  14  0.1%
d1860  14  0.1%
d7015  13  0.1%
d6201  13  0.1%
d2252  13  0.1%
d2611  13  0.1%
aa250  13  0.1%
d7460  13  0.1%
Other values (3667)  9866  98.7%

Most occurring characters

Value  Count  Frequency (%)
0  7993  16.0%
1  5873  11.7%
2  4884  9.8%
4  3658  7.3%
3  3635  7.3%
5  3278  6.6%
6  2744  5.5%
D  2506  5.0%
7  2191  4.4%
8  1776  3.6%
Other values (28)  11462  22.9%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  37145  74.3%
Uppercase Letter  12820  25.6%
Space Separator  28  0.1%
Currency Symbol  7  < 0.1%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
D  2506  19.5%
A  1530  11.9%
G  1158  9.0%
H  837  6.5%
E  705  5.5%
N  685  5.3%
B  681  5.3%
L  581  4.5%
C  567  4.4%
S  538  4.2%
Other values (16)  3032  23.7%

Decimal Number
Value  Count  Frequency (%)
0  7993  21.5%
1  5873  15.8%
2  4884  13.1%
4  3658  9.8%
3  3635  9.8%
5  3278  8.8%
6  2744  7.4%
7  2191  5.9%
8  1776  4.8%
9  1113  3.0%

Space Separator
Value  Count  Frequency (%)
(space)  28  100.0%

Currency Symbol
Value  Count  Frequency (%)
$  7  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  37180  74.4%
Latin  12820  25.6%

Most frequent character per script

Latin
Value  Count  Frequency (%)
D  2506  19.5%
A  1530  11.9%
G  1158  9.0%
H  837  6.5%
E  705  5.5%
N  685  5.3%
B  681  5.3%
L  581  4.5%
C  567  4.4%
S  538  4.2%
Other values (16)  3032  23.7%

Common
Value  Count  Frequency (%)
0  7993  21.5%
1  5873  15.8%
2  4884  13.1%
4  3658  9.8%
3  3635  9.8%
5  3278  8.8%
6  2744  7.4%
7  2191  5.9%
8  1776  4.8%
9  1113  3.0%
Other values (2)  35  0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  50000  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  7993  16.0%
1  5873  11.7%
2  4884  9.8%
4  3658  7.3%
3  3635  7.3%
5  3278  6.6%
6  2744  5.5%
D  2506  5.0%
7  2191  4.4%
8  1776  3.6%
Other values (28)  11462  22.9%

환자수 (number of patients)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 2960
Distinct (%): 29.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41874.321
Minimum: 1
Maximum: 17096454
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 44
Q3: 771
95-th percentile: 46166.05
Maximum: 17096454
Range: 17096453
Interquartile range (IQR): 767

Descriptive statistics

Standard deviation: 442429.82
Coefficient of variation (CV): 10.565659
Kurtosis: 530.74761
Mean: 41874.321
Median Absolute Deviation (MAD): 43
Skewness: 20.242946
Sum: 4.1874322 × 10^8
Variance: 1.9574414 × 10^11
Monotonicity: Not monotonic
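
Several of these statistics are simple functions of others in the same tables. A short sketch, using the 환자수 figures reported above, recomputes the derived values as a consistency check (variable names are illustrative; small differences in the last digits come from rounding in the displayed inputs):

```python
# Figures copied from the 환자수 statistics tables.
mean = 41874.321
std = 442429.82
q1, q3 = 4, 771
minimum, maximum = 1, 17096454

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV = {cv:.4f}")
print(f"IQR = {iqr}")              # 767
print(f"Range = {value_range}")    # 17096453
print(f"Variance = {variance:.4e}")
```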

Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
1  1491  14.9%
2  615  6.2%
3  392  3.9%
4  253  2.5%
5  208  2.1%
6  176  1.8%
7  152  1.5%
8  150  1.5%
9  107  1.1%
10  98  1.0%
Other values (2950)  6358  63.6%

Minimum 10 values

Value  Count  Frequency (%)
1  1491  14.9%
2  615  6.2%
3  392  3.9%
4  253  2.5%
5  208  2.1%
6  176  1.8%
7  152  1.5%
8  150  1.5%
9  107  1.1%
10  98  1.0%

Maximum 10 values

Value  Count  Frequency (%)
17096454  1  < 0.1%
13285051  1  < 0.1%
12419539  1  < 0.1%
11728819  1  < 0.1%
10909351  1  < 0.1%
9466206  1  < 0.1%
7735609  1  < 0.1%
6903488  1  < 0.1%
6865198  1  < 0.1%
6534314  1  < 0.1%

명세서청구건수 (number of claim statements)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 3135
Distinct (%): 31.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111964.96
Minimum: 1
Maximum: 73055041
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 50
Q3: 977
95-th percentile: 67983.95
Maximum: 73055041
Range: 73055040
Interquartile range (IQR): 973

Descriptive statistics

Standard deviation: 1585213.8
Coefficient of variation (CV): 14.158125
Kurtosis: 1067.6666
Mean: 111964.96
Median Absolute Deviation (MAD): 49
Skewness: 28.754997
Sum: 1.1196496 × 10^9
Variance: 2.5129028 × 10^12
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
1  1424  14.2%
2  592  5.9%
3  377  3.8%
4  255  2.5%
5  194  1.9%
6  168  1.7%
7  150  1.5%
8  148  1.5%
10  111  1.1%
9  108  1.1%
Other values (3125)  6473  64.7%

Minimum 10 values

Value  Count  Frequency (%)
1  1424  14.2%
2  592  5.9%
3  377  3.8%
4  255  2.5%
5  194  1.9%
6  168  1.7%
7  150  1.5%
8  148  1.5%
9  108  1.1%
10  111  1.1%

Maximum 10 values

Value  Count  Frequency (%)
73055041  1  < 0.1%
72591612  1  < 0.1%
41328880  1  < 0.1%
38425323  1  < 0.1%
36380871  1  < 0.1%
32862566  1  < 0.1%
29545207  1  < 0.1%
28431177  1  < 0.1%
27277405  1  < 0.1%
23945572  1  < 0.1%

총사용량 (total usage)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 4333
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127250.25
Minimum: 0.5
Maximum: 80322308
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 1
Q1: 4.2
Median: 55
Q3: 1111.65
95-th percentile: 80417.37
Maximum: 80322308
Range: 80322307
Interquartile range (IQR): 1107.45

Descriptive statistics

Standard deviation: 1717395.3
Coefficient of variation (CV): 13.496204
Kurtosis: 1068.8219
Mean: 127250.25
Median Absolute Deviation (MAD): 54
Skewness: 28.477706
Sum: 1.2725025 × 10^9
Variance: 2.9494467 × 10^12
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
1.0  923  9.2%
2.0  382  3.8%
1.1  313  3.1%
3.0  253  2.5%
4.0  161  1.6%
2.2  138  1.4%
5.0  123  1.2%
6.0  116  1.2%
7.0  90  0.9%
8.0  87  0.9%
Other values (4323)  7414  74.1%

Minimum 10 values

Value  Count  Frequency (%)
0.5  15  0.1%
0.55  1  < 0.1%
0.58  1  < 0.1%
0.66  1  < 0.1%
0.75  1  < 0.1%
1.0  923  9.2%
1.1  313  3.1%
1.15  31  0.3%
1.2  15  0.1%
1.21  2  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
80322307.9  1  < 0.1%
78232832.0  1  < 0.1%
42463231.15  1  < 0.1%
41537287.8  1  < 0.1%
40048664.5  1  < 0.1%
33096723.0  1  < 0.1%
33056349.0  1  < 0.1%
30009297.52  1  < 0.1%
28439197.25  1  < 0.1%
24116353.0  1  < 0.1%

진료행위청구금액 (claimed amount for procedures)
Real number (ℝ)

HIGH CORRELATION, SKEWED, ZEROS

Distinct: 9517
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1235972 × 10^9
Minimum: 0
Maximum: 9.037726 × 10^11
Zeros: 112
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5741.25
Q1: 105261.5
Median: 1482867.5
Q3: 24258888
95-th percentile: 1.1830847 × 10^9
Maximum: 9.037726 × 10^11
Range: 9.037726 × 10^11
Interquartile range (IQR): 24153626

Descriptive statistics

Standard deviation: 1.618496 × 10^10
Coefficient of variation (CV): 14.404593
Kurtosis: 1499.8703
Mean: 1.1235972 × 10^9
Median Absolute Deviation (MAD): 1473711.5
Skewness: 34.3756
Sum: 1.1235972 × 10^13
Variance: 2.6195293 × 10^20
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0  112  1.1%
2990  6  0.1%
146752  6  0.1%
8234  6  0.1%
11121  6  0.1%
4968  5  0.1%
25116  5  0.1%
900  5  0.1%
22242  5  0.1%
9154  5  0.1%
Other values (9507)  9839  98.4%

Minimum 10 values

Value  Count  Frequency (%)
0  112  1.1%
70  2  < 0.1%
80  1  < 0.1%
100  1  < 0.1%
150  2  < 0.1%
200  1  < 0.1%
280  1  < 0.1%
288  2  < 0.1%
320  1  < 0.1%
325  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
903772596148  1  < 0.1%
672611145592  1  < 0.1%
482386160428  1  < 0.1%
472239649465  1  < 0.1%
390432653380  1  < 0.1%
379158862255  1  < 0.1%
291906088784  1  < 0.1%
202615528470  1  < 0.1%
190142941680  1  < 0.1%
181125539725  1  < 0.1%

Interactions

Figure: pairwise interaction plots (16 panels)

Correlations

                표시과목  환자수  명세서청구건수  총사용량  진료행위청구금액
표시과목          1.000  0.235  0.269  0.269  0.176
환자수            0.235  1.000  0.762  0.787  0.847
명세서청구건수     0.269  0.762  1.000  0.997  0.824
총사용량          0.269  0.787  0.997  1.000  0.826
진료행위청구금액   0.176  0.847  0.824  0.826  1.000

                환자수  명세서청구건수  총사용량  진료행위청구금액  표시과목
환자수            1.000  0.994  0.989  0.880  0.089
명세서청구건수     0.994  1.000  0.995  0.883  0.104
총사용량          0.989  0.995  1.000  0.880  0.104
진료행위청구금액   0.880  0.883  0.880  1.000  0.069
표시과목          0.089  0.104  0.104  0.069  1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  진료년도  표시과목  행위코드  환자수+명세서청구건수+총사용량+진료행위청구금액 (values run together)
16478  2022  산부인과  D3214  171819.8872520
8949  2022  외과  N0210  745583068741.0514377798
685  2022  일반의  D1850  176537324036802586829.384892870617
18906  2022  소아청소년과  Y2400  3026940591388234.01603655511
4999  2022  내과  G6503  598562236803.2565156231
15909  2022  마취통증의학과  NN004  27116116.08128120
21752  2022  비뇨의학과  AL807  183418921893.02366250
7254  2022  정신건강의학과  FB050  167167167.010386064
26302  2022  가정의학과  G6403  369432439.04238879
10245  2022  정형외과  EB457  485595595.1583615376

Index  진료년도  표시과목  행위코드  환자수+명세서청구건수+총사용량+진료행위청구금액 (values run together)
10134  2022  정형외과  D7865  455.5268400
10407  2022  정형외과  G1805  98010091009.010960192
10124  2022  정형외과  D7840  777.7107668
10291  2022  정형외과  F6101  85701102010909.05332625258
25774  2022  가정의학과  D3211  313235.1258541
14168  2022  성형외과  G7203  222.019344
11581  2022  정형외과  T6154  673056725568727646.3817371480245
9456  2022  정형외과  AA100  667.045330
4968  2022  내과  G5201  310131383231.020619759
6777  2022  신경과  M0045  111.05727