Overview

Dataset statistics

Number of variables: 8
Number of observations: 151
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 KiB
Average record size in memory: 66.9 B

Variable types

Numeric: 2
DateTime: 1
Categorical: 4
Text: 1

Dataset

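Description: Records of the products applied to patients with renal failure among medical-aid (의료급여) patients. Each record gives the product's code, application start date, and application end date, together with its name, pharmaceutical company, specification, and price. The column names are 제품코드 (product code), 제품적용시작일자 (application start date), 제품적용종료일자 (application end date), 제품명 (product name), 제약회사 (pharmaceutical company), 제품규격 (product specification), and 제품가격 (product price).
URL: https://www.data.go.kr/data/15121272/fileData.do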

Alerts

비고 is highly overall correlated with 제품코드 and 2 other fields (High correlation)
제품적용종료일자 is highly overall correlated with 제품규격 and 1 other fields (High correlation)
제품코드 is highly overall correlated with 제약회사 and 1 other fields (High correlation)
제품가격 is highly overall correlated with 제약회사 and 1 other fields (High correlation)
제약회사 is highly overall correlated with 제품코드 and 2 other fields (High correlation)
제품규격 is highly overall correlated with 제품가격 and 3 other fields (High correlation)
제품적용종료일자 is highly imbalanced (67.4%) (Imbalance)
비고 is highly imbalanced (68.4%) (Imbalance)
제품코드 has unique values (Unique)
제품가격 has 9 (6.0%) zeros (Zeros)
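
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category counts. The report does not state its formula, so the sketch below assumes that definition; `imbalance_score` is a hypothetical helper name, and the counts are the ones listed for 제품적용종료일자 (142, 9) and 비고 (135, 9, 5, 2) in the variable sections.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the category counts and k is the number of categories.
    Assumed definition; not taken from the report itself."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # with a single category, treat the score as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

end_date_counts = [142, 9]        # 제품적용종료일자: 9999-12-31, 2001-02-15
remark_counts = [135, 9, 5, 2]    # 비고: four distinct values

print(f"{imbalance_score(end_date_counts):.1%}")  # 67.4%
print(f"{imbalance_score(remark_counts):.1%}")    # 68.4%
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one value approaches 1, which matches both alerts above.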

Reproduction

Analysis started: 2023-12-12 22:24:41.847318
Analysis finished: 2023-12-12 22:24:42.928111
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

제품코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 151
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 568.78146
Minimum: 100
Maximum: 9999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 100
5-th percentile: 107.5
Q1: 302.5
Median: 512
Q3: 726.5
95-th percentile: 904.5
Maximum: 9999
Range: 9899
Interquartile range (IQR): 424

Descriptive statistics

Standard deviation: 818.56545
Coefficient of variation (CV): 1.4391564
Kurtosis: 119.18928
Mean: 568.78146
Median Absolute Deviation (MAD): 212
Skewness: 10.296841
Sum: 85886
Variance: 670049.4
Monotonicity: Not monotonic
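
Several of these figures follow from the others by simple arithmetic. As a rough consistency check, the sketch below recomputes the range, IQR, mean, coefficient of variation, and variance for 제품코드 from the values reported above. The variable names are illustrative only.

```python
# Values copied from the 제품코드 statistics in this report.
n = 151
minimum, maximum = 100, 9999
q1, q3 = 302.5, 726.5
mean, std = 568.78146, 818.56545
total = 85886

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
mean_from_sum = total / n            # Mean = Sum / number of observations
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance = standard deviation squared

print(value_range, iqr)
print(round(mean_from_sum, 5), round(cv, 5), round(variance, 1))
```

The large gap between the mean (about 569) and the median (512), together with the high skewness, comes from the single code 9999 at the top of the range.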
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
903                 1      0.7%
123                 1      0.7%
116                 1      0.7%
117                 1      0.7%
118                 1      0.7%
119                 1      0.7%
120                 1      0.7%
121                 1      0.7%
122                 1      0.7%
124                 1      0.7%
Other values (141)  141    93.4%

Minimum 10 values

Value  Count  Frequency (%)
100    1      0.7%
101    1      0.7%
102    1      0.7%
103    1      0.7%
104    1      0.7%
105    1      0.7%
106    1      0.7%
107    1      0.7%
108    1      0.7%
109    1      0.7%

Maximum 10 values

Value  Count  Frequency (%)
9999   1      0.7%
911    1      0.7%
910    1      0.7%
909    1      0.7%
908    1      0.7%
907    1      0.7%
906    1      0.7%
905    1      0.7%
904    1      0.7%
903    1      0.7%

제품적용시작일자
DateTime

Distinct: 4
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 1999-11-15 00:00:00
Maximum: 2001-03-25 00:00:00
Histogram with fixed size bins (bins=4)

제품적용종료일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

9999-12-31  142
2001-02-15  9

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9999-12-31
2nd row: 9999-12-31
3rd row: 9999-12-31
4th row: 9999-12-31
5th row: 9999-12-31

Common Values

Value       Count  Frequency (%)
9999-12-31  142    94.0%
2001-02-15  9      6.0%

Length

Histogram of lengths of the category

Common Values (Plot)


제품명
Text

Distinct: 141
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 24
Median length: 18
Mean length: 13.516556
Min length: 2

Characters and Unicode

Total characters: 2041
Distinct characters: 115
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
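
As a small illustration of that idea, the sketch below counts the Unicode general categories of the characters in one product name from this dataset, using Python's standard `unicodedata` module. The helper name `category_counts` and the short `CATEGORY_NAMES` mapping are illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Readable names for a few general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}

def category_counts(text):
    """Count characters per Unicode general category (after NFC normalization)."""
    normalized = unicodedata.normalize("NFC", text)
    return Counter(unicodedata.category(ch) for ch in normalized)

sample = "헤모비덱스0.15%2호액"  # first sample row of 제품명
counts = category_counts(sample)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

For this name the Hangul syllables fall under Other Letter, the digits under Decimal Number, and `.` and `%` under Other Punctuation, the same three categories that lead the tables below.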

Unique

Unique: 131
Unique (%): 86.8%

Sample

1st row: 헤모비덱스0.15%2호액
2nd row: 헤모졸엘오
3rd row: 헤모졸엘지2
4th row: 헤모졸엘지4
5th row: 헤모트레이트디

Value  Count  Frequency (%)
system  4  2.5%
겜브로솔트리오  4  2.5%
페리시스1.5%투백  2  1.2%
명문씨.에이.피.디액4.25  2  1.2%
씨에이피디2복강투석액  2  1.2%
명문씨.에이.피.디액2.5  2  1.2%
명문씨.에이.피.디액1.5  2  1.2%
페리시스4.25%투백  2  1.2%
씨에이피디4복강투석액  2  1.2%
페리시스2.3%투백  2  1.2%
Other values (133)  136  85.0%

Most occurring characters

Value               Count  Frequency (%)
.                   110    5.4%
2                   95     4.7%
1                   92     4.5%
5                   91     4.5%
(not preserved)     69     3.4%
(not preserved)     64     3.1%
0                   60     2.9%
%                   60     2.9%
(not preserved)     59     2.9%
4                   53     2.6%
Other values (105)  1288   63.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1236   60.6%
Decimal Number     413    20.2%
Other Punctuation  170    8.3%
Lowercase Letter   100    4.9%
Close Punctuation  45     2.2%
Open Punctuation   45     2.2%
Dash Punctuation   19     0.9%
Space Separator    9      0.4%
Uppercase Letter   4      0.2%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    69     5.6%
(not preserved)    64     5.2%
(not preserved)    59     4.8%
(not preserved)    48     3.9%
(not preserved)    48     3.9%
(not preserved)    44     3.6%
(not preserved)    43     3.5%
(not preserved)    43     3.5%
(not preserved)    41     3.3%
(not preserved)    34     2.8%
Other values (75)  743    60.1%

Lowercase Letter
Value  Count  Frequency (%)
ℓ      35     35.0%
l      25     25.0%
m      8      8.0%
s      8      8.0%
e      6      6.0%
t      4      4.0%
y      4      4.0%
i      4      4.0%
n      2      2.0%
b      2      2.0%

Decimal Number
Value  Count  Frequency (%)
2      95     23.0%
1      92     22.3%
5      91     22.0%
0      60     14.5%
4      53     12.8%
3      13     3.1%
9      3      0.7%
7      2      0.5%
6      2      0.5%
8      2      0.5%

Uppercase Letter
Value  Count  Frequency (%)
G      2      50.0%
A      1      25.0%
B      1      25.0%

Other Punctuation
Value  Count  Frequency (%)
.      110    64.7%
%      60     35.3%

Close Punctuation
Value  Count  Frequency (%)
)      45     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      45     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      19     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  9      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1236   60.6%
Common  736    36.1%
Latin   69     3.4%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not preserved)    69     5.6%
(not preserved)    64     5.2%
(not preserved)    59     4.8%
(not preserved)    48     3.9%
(not preserved)    48     3.9%
(not preserved)    44     3.6%
(not preserved)    43     3.5%
(not preserved)    43     3.5%
(not preserved)    41     3.3%
(not preserved)    34     2.8%
Other values (75)  743    60.1%

Common
Value             Count  Frequency (%)
.                 110    14.9%
2                 95     12.9%
1                 92     12.5%
5                 91     12.4%
0                 60     8.2%
%                 60     8.2%
4                 53     7.2%
)                 45     6.1%
(                 45     6.1%
ℓ                 35     4.8%
Other values (7)  50     6.8%

Latin
Value             Count  Frequency (%)
l                 25     36.2%
m                 8      11.6%
s                 8      11.6%
e                 6      8.7%
t                 4      5.8%
y                 4      5.8%
i                 4      5.8%
n                 2      2.9%
G                 2      2.9%
b                 2      2.9%
Other values (3)  4      5.8%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              1236   60.6%
ASCII               770    37.7%
Letterlike Symbols  35     1.7%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
.                  110    14.3%
2                  95     12.3%
1                  92     11.9%
5                  91     11.8%
0                  60     7.8%
%                  60     7.8%
4                  53     6.9%
)                  45     5.8%
(                  45     5.8%
l                  25     3.2%
Other values (19)  94     12.2%

Hangul
Value              Count  Frequency (%)
(not preserved)    69     5.6%
(not preserved)    64     5.2%
(not preserved)    59     4.8%
(not preserved)    48     3.9%
(not preserved)    48     3.9%
(not preserved)    44     3.6%
(not preserved)    43     3.5%
(not preserved)    43     3.5%
(not preserved)    41     3.3%
(not preserved)    34     2.8%
Other values (75)  743    60.1%

Letterlike Symbols
Value  Count  Frequency (%)
ℓ      35     100.0%

제약회사
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

한국갬브로  39
녹십자의료공업  31
박스터  19
보령제약  17
프레제니우스메디칼  17
Other values (7)  28

Length

Max length: 9
Median length: 7
Mean length: 5.3245033
Min length: 2

Unique

Unique: 2
Unique (%): 1.3%

Sample

1st row: 중외제약
2nd row: 한국갬브로
3rd row: 한국갬브로
4th row: 한국갬브로
5th row: 중외제약

Common Values

Value  Count  Frequency (%)
한국갬브로  39  25.8%
녹십자의료공업  31  20.5%
박스터  19  12.6%
보령제약  17  11.3%
프레제니우스메디칼  17  11.3%
중외제약  10  6.6%
명문제약  8  5.3%
코오롱제약  4  2.6%
제일제당  2  1.3%
제니스팜  2  1.3%
Other values (2)  2  1.3%

Length

Histogram of lengths of the category

제품규격
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

2ℓ/2백  23
1500ml/2백  12
5ℓ/1백  12
2ℓ/1백  11
0  9
Other values (30)  84

Length

Max length: 9
Median length: 8
Mean length: 5.8278146
Min length: 1

Unique

Unique: 10
Unique (%): 6.6%

Sample

1st row: 12.6L
2nd row: 5l/1백
3rd row: 5l/1백
4th row: 5l/1백
5th row: 20L/1백

Common Values

Value  Count  Frequency (%)
2ℓ/2백  23  15.2%
1500ml/2백  12  7.9%
5ℓ/1백  12  7.9%
2ℓ/1백  11  7.3%
0  9  6.0%
2000ml/2백  9  6.0%
20L/1백  6  4.0%
5L/1백  5  3.3%
10l/1백  4  2.6%
10L  4  2.6%
Other values (25)  56  37.1%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
2ℓ/2백  23  15.2%
1500ml/2백  12  7.9%
5ℓ/1백  12  7.9%
2ℓ/1백  11  7.3%
0  9  6.0%
2000ml/2백  9  6.0%
5l/1백  8  5.3%
10l/1백  8  5.3%
20l/1백  6  4.0%
10l  4  2.6%
Other values (23)  49  32.5%
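
The tables for 제품규격 show the litre unit spelled three ways (`ℓ`, `l`, and `L`), so specifications such as 5ℓ/1백, 5l/1백, and 5L/1백 are counted as separate categories. The sketch below shows one possible way to unify the spelling before profiling. It assumes these variants denote the same package size, which the source does not confirm, and `normalize_spec` is a hypothetical helper.

```python
def normalize_spec(spec: str) -> str:
    """Spell the litre unit as 'L' regardless of the source spelling.

    Assumption: 'ℓ', 'l', and 'L' are interchangeable in this column.
    """
    return spec.replace("ℓ", "L").replace("l", "L")

variants = ["5ℓ/1백", "5l/1백", "5L/1백"]
print({normalize_spec(v) for v in variants})   # a single spelling remains
print(normalize_spec("1500ml/2백"))            # millilitres become 'mL'
```

Applying such a step would lower the distinct count of the column and could change the correlation and imbalance figures, so it is worth re-running the profile afterwards.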

제품가격
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 51
Distinct (%): 33.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17713.166
Minimum: 0
Maximum: 715934
Zeros: 9
Zeros (%): 6.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7902
Median: 11503
Q3: 13069.5
95-th percentile: 25141.5
Maximum: 715934
Range: 715934
Interquartile range (IQR): 5167.5

Descriptive statistics

Standard deviation: 62629.372
Coefficient of variation (CV): 3.5357527
Kurtosis: 106.6505
Mean: 17713.166
Median Absolute Deviation (MAD): 2325
Skewness: 9.9560774
Sum: 2674688
Variance: 3.9224383 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
11630              17     11.3%
13428              13     8.6%
10052              9      6.0%
0                  9      6.0%
13828              9      6.0%
7902               9      6.0%
11503              6      4.0%
11891              6      4.0%
12793              5      3.3%
3202               3      2.0%
Other values (41)  65     43.0%

Minimum 10 values

Value  Count  Frequency (%)
0      9      6.0%
3202   3      2.0%
3532   3      2.0%
3914   1      0.7%
3927   2      1.3%
4924   1      0.7%
4966   2      1.3%
5164   1      0.7%
5212   2      1.3%
5738   1      0.7%

Maximum 10 values

Value   Count  Frequency (%)
715934  1      0.7%
288149  1      0.7%
143187  1      0.7%
27931   2      1.3%
25144   3      2.0%
25139   1      0.7%
22347   1      0.7%
15812   1      0.7%
14800   1      0.7%
14771   2      1.3%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

보건복지부 고시 제2000-81호  135
<NA>  9
보건복지부 고시 제2001-5호  5
보건복지부 고시 제2001-13호  2

Length

Max length: 18
Median length: 18
Mean length: 17.13245
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보건복지부 고시 제2000-81호
2nd row: 보건복지부 고시 제2000-81호
3rd row: 보건복지부 고시 제2000-81호
4th row: 보건복지부 고시 제2000-81호
5th row: 보건복지부 고시 제2000-81호

Common Values

Value  Count  Frequency (%)
보건복지부 고시 제2000-81호  135  89.4%
<NA>  9  6.0%
보건복지부 고시 제2001-5호  5  3.3%
보건복지부 고시 제2001-13호  2  1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
보건복지부  142  32.6%
고시  142  32.6%
제2000-81호  135  31.0%
na  9  2.1%
제2001-5호  5  1.1%
제2001-13호  2  0.5%

Interactions


Correlations

Variable  제품코드  제품적용시작일자  제품적용종료일자  제약회사  제품규격  제품가격  비고
제품코드  1.000  0.438  0.201  1.000  0.000  0.000  NaN
제품적용시작일자  0.438  1.000  1.000  0.380  0.948  0.000  1.000
제품적용종료일자  0.201  1.000  1.000  0.358  1.000  0.000  NaN
제약회사  1.000  0.380  0.358  1.000  0.919  0.977  0.226
제품규격  0.000  0.948  1.000  0.919  1.000  1.000  0.872
제품가격  0.000  0.000  0.000  0.977  1.000  1.000  0.000
비고  NaN  1.000  NaN  0.226  0.872  0.000  1.000

Variable  제약회사  비고  제품적용종료일자  제품규격
제약회사  1.000  0.128  0.268  0.553
비고  0.128  1.000  1.000  0.610
제품적용종료일자  0.268  1.000  1.000  0.882
제품규격  0.553  0.610  0.882  1.000

Variable  제품코드  제품가격  제품적용종료일자  제약회사  제품규격  비고
제품코드  1.000  -0.006  0.128  0.966  0.000  1.000
제품가격  -0.006  1.000  0.000  0.778  0.888  0.000
제품적용종료일자  0.128  0.000  1.000  0.268  0.882  1.000
제약회사  0.966  0.778  0.268  1.000  0.553  0.128
제품규격  0.000  0.888  0.882  0.553  1.000  0.610
비고  1.000  0.000  1.000  0.128  0.610  1.000
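
The report text does not name the coefficient behind each matrix. For pairs of categorical columns, one common choice is Cramér's V, and the sketch below computes the plain version (without bias correction) from a contingency table using only the standard library. The cross-tabulation shown is an assumption: it supposes that the 9 rows ending on 2001-02-15 are exactly the 9 rows whose 비고 is `<NA>`, which agrees with the sample rows but is not confirmed by the source.

```python
import math

def cramers_v(table):
    """Plain Cramér's V (no bias correction) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Assumed cross-tabulation: rows are 제품적용종료일자 (2001-02-15, 9999-12-31),
# columns are 비고 (<NA>, 제2000-81호, 제2001-5호, 제2001-13호).
assumed_table = [
    [9, 0, 0, 0],
    [0, 135, 5, 2],
]
print(round(cramers_v(assumed_table), 3))
```

When one column fully determines the other, as in this assumed table, the statistic reaches its maximum of 1, which would be consistent with the 1.000 reported for 제품적용종료일자 and 비고.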

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
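
The overview reports 0 missing cells, yet 비고 contains the literal string `<NA>` in 9 rows and 제품적용종료일자 uses 9999-12-31 in 142 rows. If those values are placeholders (an assumption, since the source does not say so), nullity can be recounted after mapping them to `None`. The sketch below uses two rows taken from the sample tables; `clean_row` and the sentinel constants are hypothetical names.

```python
from datetime import date

SENTINEL_STRINGS = {"<NA>", ""}       # assumed placeholder spellings
OPEN_ENDED = date(9999, 12, 31)       # assumed "no end date" marker

def clean_row(row):
    """Return a copy of the row with assumed sentinel values replaced by None."""
    cleaned = dict(row)
    if cleaned["비고"] in SENTINEL_STRINGS:
        cleaned["비고"] = None
    if date.fromisoformat(cleaned["제품적용종료일자"]) == OPEN_ENDED:
        cleaned["제품적용종료일자"] = None
    return cleaned

rows = [
    {"제품코드": 903, "제품적용종료일자": "9999-12-31", "비고": "보건복지부 고시 제2000-81호"},
    {"제품코드": 9999, "제품적용종료일자": "2001-02-15", "비고": "<NA>"},
]
cleaned_rows = [clean_row(r) for r in rows]

# Count missing cells per column after cleaning.
missing_per_column = {
    column: sum(1 for r in cleaned_rows if r[column] is None)
    for column in rows[0]
}
print(missing_per_column)
```

Whether an open-ended end date should count as missing depends on the analysis; keeping it as a separate flag is an equally reasonable design.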

Sample

First rows

Index  제품코드  제품적용시작일자  제품적용종료일자  제품명  제약회사  제품규격  제품가격  비고
0  903  2001-01-01  9999-12-31  헤모비덱스0.15%2호액  중외제약  12.6L  11821  보건복지부 고시 제2000-81호
1  904  2001-01-01  9999-12-31  헤모졸엘오  한국갬브로  5l/1백  13428  보건복지부 고시 제2000-81호
2  905  2001-01-01  9999-12-31  헤모졸엘지2  한국갬브로  5l/1백  13428  보건복지부 고시 제2000-81호
3  906  2001-01-01  9999-12-31  헤모졸엘지4  한국갬브로  5l/1백  13428  보건복지부 고시 제2000-81호
4  907  2001-01-01  9999-12-31  헤모트레이트디  중외제약  20L/1백  27931  보건복지부 고시 제2000-81호
5  908  2001-01-01  9999-12-31  헤모트레이트비1호  중외제약  10L/1백  12539  보건복지부 고시 제2000-81호
6  909  2001-01-01  9999-12-31  헤모트레이트비2호  중외제약  10L/1백  10962  보건복지부 고시 제2000-81호
7  910  2001-01-01  9999-12-31  헤모트레이트비2호액  중외제약  12.6L  12463  보건복지부 고시 제2000-81호
8  911  2001-01-01  9999-12-31  헤모트레이트에프  중외제약  20L/1백  27931  보건복지부 고시 제2000-81호
9  9999  1999-11-15  2001-02-15  기타(전산통합이전자료)  기타  0  0  <NA>

Last rows

Index  제품코드  제품적용시작일자  제품적용종료일자  제품명  제약회사  제품규격  제품가격  비고
141  410  2001-01-01  9999-12-31  바이카트293  한국갬브로  10ℓ  13428  보건복지부 고시 제2000-81호
142  411  2001-01-01  9999-12-31  바이카트294  한국갬브로  10ℓ  13428  보건복지부 고시 제2000-81호
143  412  2001-01-01  9999-12-31  바이카트761  한국갬브로  10ℓ  13428  보건복지부 고시 제2000-81호
144  500  1999-11-15  2001-02-15  보령페리시스(전산통합이전자료)  보령제약  0  0  <NA>
145  501  2001-01-01  9999-12-31  보령페리시스2액(1.5%)  보령제약  2ℓ/2백  11503  보건복지부 고시 제2000-81호
146  502  2001-01-01  9999-12-31  보령페리시스3액(2.5%)  보령제약  2ℓ/2백  11503  보건복지부 고시 제2000-81호
147  503  2001-01-01  9999-12-31  보령페리시스4액(4.25%)  보령제약  2ℓ/2백  11503  보건복지부 고시 제2000-81호
148  504  2001-01-01  9999-12-31  페리시스1.5%원백  보령제약  5ℓ/1백  13828  보건복지부 고시 제2000-81호
149  505  2001-01-01  9999-12-31  페리시스1.5%투백  보령제약  1500ml  8296  보건복지부 고시 제2000-81호
150  506  2001-01-01  9999-12-31  페리시스1.5%투백  보령제약  2000ml  9599  보건복지부 고시 제2000-81호