Overview

Dataset statistics

Number of variables: 5
Number of observations: 239
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.5 KiB
Average record size in memory: 40.5 B

Variable types

Text: 3
Categorical: 2

Dataset

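Description: Data on the pharmaceuticals used at Naju National Hospital (국립나주병원) under the Ministry of Health and Welfare. It covers the drug code (약품코드), ingredient name in Korean (성분한글명), ingredient name in English (성분영문명), drug classification (약품분류: general or psychotropic), and drug type (약품구분: oral, topical, or injection).
Author: Ministry of Health and Welfare, Naju National Hospital (보건복지부 국립나주병원)
URL: https://www.data.go.kr/data/15079898/fileData.do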

Alerts

약품분류 is highly imbalanced (50.4%) [Imbalance]
약품코드 has unique values [Unique]
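
The 50.4% figure for 약품분류 can be checked by hand. The following is a minimal sketch, assuming the imbalance score is one minus the Shannon entropy of the class shares divided by the maximum entropy for that number of classes. That formula is an assumption about how the profiler scores imbalance; the report itself does not state it. The counts 213 and 26 come from the 약품분류 section of this report, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    Assumed definition: 0 means perfectly balanced, 1 means a single dominant class.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is treated as not scorable
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts for 약품분류 taken from this report: 일반 = 213, 향정 = 26
score = imbalance_score([213, 26])
print(f"{score:.1%}")
```

Under that assumption the score rounds to the 50.4% shown in the alert, even though the raw class split is 89.1% to 10.9%.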

Reproduction

Analysis started: 2024-03-15 01:52:52.347531
Analysis finished: 2024-03-15 01:52:53.244298
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

약품코드
Text

UNIQUE 

Distinct: 239
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 11
Median length: 9
Mean length: 5.3305439
Min length: 3
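
The mean length ties directly to two other figures in this report: it is the total character count of the variable divided by the 239 observations. A small sketch of that arithmetic, using the "Total characters" values reported for the three text variables (the dictionary and variable names are illustrative only):

```python
# Number of observations, from the dataset statistics in this report
observations = 239

# "Total characters" as reported for each text variable
total_characters = {
    "약품코드": 1274,
    "성분한글명": 3594,
    "성분영문명": 6318,
}

# Mean length = total characters / number of observations
mean_length = {name: total / observations for name, total in total_characters.items()}

for name, value in mean_length.items():
    print(name, round(value, 7))
```

The results agree with the mean lengths listed in the report (5.3305439, 15.037657 and 26.435146) to the precision shown there.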

Characters and Unicode

Total characters: 1274
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
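
As an illustration of the kind of per-character analysis summarised in this section, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It runs only over the five sample codes listed in this report, so its counts are not the full-column totals; the helper name `category_counts` is hypothetical, and nothing here is claimed to be how the profiler itself computes these tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Tally the Unicode general category (e.g. 'Lu', 'Nd') of every character."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# The five sample values of 약품코드 shown in this report
sample_codes = ["DAAP", "DABILOD10", "DABILOD15", "DACAM", "DACAR2"]
counts = category_counts(sample_codes)

for category, n in counts.most_common():
    print(category, n)
```

In the general-category abbreviations, `Lu` is Uppercase Letter, `Nd` is Decimal Number and `Po` is Other Punctuation, which are the three categories reported for this variable.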

Unique

Unique: 239
Unique (%): 100.0%

Sample

1st row: DAAP
2nd row: DABILOD10
3rd row: DABILOD15
4th row: DACAM
5th row: DACAR2

Value | Count | Frequency (%)
daap | 1 | 0.4%
drisq2 | 1 | 0.4%
dro | 1 | 0.4%
drzp | 1 | 0.4%
dscital | 1 | 0.4%
dscital2 | 1 | 0.4%
dscital5 | 1 | 0.4%
dscitalod10 | 1 | 0.4%
dscitalod20 | 1 | 0.4%
dsero | 1 | 0.4%
Other values (229) | 229 | 95.8%

Most occurring characters

Value | Count | Frequency (%)
D | 233 | 18.3%
A | 89 | 7.0%
L | 70 | 5.5%
P | 68 | 5.3%
0 | 68 | 5.3%
I | 62 | 4.9%
O | 54 | 4.2%
T | 53 | 4.2%
R | 51 | 4.0%
1 | 44 | 3.5%
Other values (26) | 482 | 37.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1037 | 81.4%
Decimal Number | 235 | 18.4%
Other Punctuation | 2 | 0.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
D | 233 | 22.5%
A | 89 | 8.6%
L | 70 | 6.8%
P | 68 | 6.6%
I | 62 | 6.0%
O | 54 | 5.2%
T | 53 | 5.1%
R | 51 | 4.9%
W | 38 | 3.7%
E | 37 | 3.6%
Other values (15) | 282 | 27.2%

Decimal Number
Value | Count | Frequency (%)
0 | 68 | 28.9%
1 | 44 | 18.7%
5 | 44 | 18.7%
2 | 40 | 17.0%
4 | 12 | 5.1%
3 | 11 | 4.7%
8 | 6 | 2.6%
6 | 5 | 2.1%
7 | 3 | 1.3%
9 | 2 | 0.9%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1037 | 81.4%
Common | 237 | 18.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
D | 233 | 22.5%
A | 89 | 8.6%
L | 70 | 6.8%
P | 68 | 6.6%
I | 62 | 6.0%
O | 54 | 5.2%
T | 53 | 5.1%
R | 51 | 4.9%
W | 38 | 3.7%
E | 37 | 3.6%
Other values (15) | 282 | 27.2%

Common
Value | Count | Frequency (%)
0 | 68 | 28.7%
1 | 44 | 18.6%
5 | 44 | 18.6%
2 | 40 | 16.9%
4 | 12 | 5.1%
3 | 11 | 4.6%
8 | 6 | 2.5%
6 | 5 | 2.1%
7 | 3 | 1.3%
. | 2 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1274 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
D | 233 | 18.3%
A | 89 | 7.0%
L | 70 | 5.5%
P | 68 | 5.3%
0 | 68 | 5.3%
I | 62 | 4.9%
O | 54 | 4.2%
T | 53 | 4.2%
R | 51 | 4.0%
1 | 44 | 3.5%
Other values (26) | 482 | 37.8%

성분한글명
Text

Distinct: 222
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 304
Median length: 67
Mean length: 15.037657
Min length: 7

Characters and Unicode

Total characters: 3594
Distinct characters: 215
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 205
Unique (%): 85.8%

Sample

1st row: 아세트아미노펜 300mg
2nd row: 아리피프라졸 10mg
3rd row: 아리피프라졸 15mg
4th row: 아캄프로세이트칼슘 333mg
5th row: 아카보즈 100mg

Value | Count | Frequency (%)
10mg | 23 | 4.3%
25mg | 16 | 3.0%
100mg | 16 | 3.0%
50mg | 15 | 2.8%
5mg | 11 | 2.1%
2mg | 11 | 2.1%
1mg | 10 | 1.9%
아리피프라졸 | 10 | 1.9%
쿠에티아핀 | 9 | 1.7%
200mg | 8 | 1.5%
Other values (258) | 404 | 75.8%

Most occurring characters

ValueCountFrequency (%)
m 303
 
8.4%
g 298
 
8.3%
296
 
8.2%
0 210
 
5.8%
5 114
 
3.2%
1 112
 
3.1%
112
 
3.1%
2 93
 
2.6%
92
 
2.6%
72
 
2.0%
Other values (205) 1892
52.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter | 1848 | 51.4%
Decimal Number | 646 | 18.0%
Lowercase Letter | 604 | 16.8%
Space Separator | 296 | 8.2%
Other Punctuation | 90 | 2.5%
Math Symbol | 53 | 1.5%
Uppercase Letter | 45 | 1.3%
Dash Punctuation | 4 | 0.1%
Open Punctuation | 4 | 0.1%
Close Punctuation | 4 | 0.1%

Most frequent character per category

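Other Letter
Value | Count | Frequency (%)
[Hangul character, not recovered] | 112 | 6.1%
[Hangul character, not recovered] | 92 | 5.0%
[Hangul character, not recovered] | 72 | 3.9%
[Hangul character, not recovered] | 69 | 3.7%
[Hangul character, not recovered] | 62 | 3.4%
[Hangul character, not recovered] | 56 | 3.0%
[Hangul character, not recovered] | 55 | 3.0%
[Hangul character, not recovered] | 54 | 2.9%
[Hangul character, not recovered] | 41 | 2.2%
[Hangul character, not recovered] | 39 | 2.1%
Other values (179) | 1196 | 64.7%

Decimal Number
Value | Count | Frequency (%)
0 | 210 | 32.5%
5 | 114 | 17.6%
1 | 112 | 17.3%
2 | 93 | 14.4%
3 | 43 | 6.7%
4 | 30 | 4.6%
6 | 17 | 2.6%
7 | 13 | 2.0%
8 | 8 | 1.2%
9 | 6 | 0.9%

Other Punctuation
Value | Count | Frequency (%)
/ | 48 | 53.3%
. | 37 | 41.1%
% | 3 | 3.3%
: | 2 | 2.2%

Lowercase Letter
Value | Count | Frequency (%)
m | 303 | 50.2%
g | 298 | 49.3%
μ | 3 | 0.5%

Uppercase Letter
Value | Count | Frequency (%)
L | 43 | 95.6%
D | 1 | 2.2%
S | 1 | 2.2%

Math Symbol
Value | Count | Frequency (%)
+ | 52 | 98.1%
[character not recovered] | 1 | 1.9%

Space Separator
Value | Count | Frequency (%)
(space) | 296 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%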

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1848 | 51.4%
Common | 1097 | 30.5%
Latin | 646 | 18.0%
Greek | 3 | 0.1%

Most frequent character per script

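Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 112 | 6.1%
[Hangul character, not recovered] | 92 | 5.0%
[Hangul character, not recovered] | 72 | 3.9%
[Hangul character, not recovered] | 69 | 3.7%
[Hangul character, not recovered] | 62 | 3.4%
[Hangul character, not recovered] | 56 | 3.0%
[Hangul character, not recovered] | 55 | 3.0%
[Hangul character, not recovered] | 54 | 2.9%
[Hangul character, not recovered] | 41 | 2.2%
[Hangul character, not recovered] | 39 | 2.1%
Other values (179) | 1196 | 64.7%

Common
Value | Count | Frequency (%)
(space) | 296 | 27.0%
0 | 210 | 19.1%
5 | 114 | 10.4%
1 | 112 | 10.2%
2 | 93 | 8.5%
+ | 52 | 4.7%
/ | 48 | 4.4%
3 | 43 | 3.9%
. | 37 | 3.4%
4 | 30 | 2.7%
Other values (10) | 62 | 5.7%

Latin
Value | Count | Frequency (%)
m | 303 | 46.9%
g | 298 | 46.1%
L | 43 | 6.7%
D | 1 | 0.2%
S | 1 | 0.2%

Greek
Value | Count | Frequency (%)
μ | 3 | 100.0%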

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1848 | 51.4%
ASCII | 1742 | 48.5%
None | 3 | 0.1%
Arrows | 1 | < 0.1%

Most frequent character per block

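ASCII
Value | Count | Frequency (%)
m | 303 | 17.4%
g | 298 | 17.1%
(space) | 296 | 17.0%
0 | 210 | 12.1%
5 | 114 | 6.5%
1 | 112 | 6.4%
2 | 93 | 5.3%
+ | 52 | 3.0%
/ | 48 | 2.8%
L | 43 | 2.5%
Other values (14) | 173 | 9.9%

Hangul
Value | Count | Frequency (%)
[Hangul character, not recovered] | 112 | 6.1%
[Hangul character, not recovered] | 92 | 5.0%
[Hangul character, not recovered] | 72 | 3.9%
[Hangul character, not recovered] | 69 | 3.7%
[Hangul character, not recovered] | 62 | 3.4%
[Hangul character, not recovered] | 56 | 3.0%
[Hangul character, not recovered] | 55 | 3.0%
[Hangul character, not recovered] | 54 | 2.9%
[Hangul character, not recovered] | 41 | 2.2%
[Hangul character, not recovered] | 39 | 2.1%
Other values (179) | 1196 | 64.7%

None
Value | Count | Frequency (%)
μ | 3 | 100.0%

Arrows
Value | Count | Frequency (%)
[character not recovered] | 1 | 100.0%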

성분영문명
Text

Distinct: 223
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 533
Median length: 77
Mean length: 26.435146
Min length: 11

Characters and Unicode

Total characters: 6318
Distinct characters: 68
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 207
Unique (%): 86.6%

Sample

1st row: Acetaminophen 300mg
2nd row: Aripiprazole 10mg
3rd row: Aripiprazole 15mg
4th row: Acamprosate Calcium 333mg
5th row: Acarbose 100mg

Value | Count | Frequency (%)
hydrochloride | 47 | 6.8%
10mg | 22 | 3.2%
25mg | 16 | 2.3%
100mg | 16 | 2.3%
50mg | 15 | 2.2%
sodium | 13 | 1.9%
5mg | 11 | 1.6%
2mg | 11 | 1.6%
aripiprazole | 10 | 1.4%
1mg | 10 | 1.4%
Other values (307) | 523 | 75.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 474 | 7.5%
i | 466 | 7.4%
e | 459 | 7.3%
m | 435 | 6.9%
o | 398 | 6.3%
a | 341 | 5.4%
r | 319 | 5.0%
g | 310 | 4.9%
l | 270 | 4.3%
n | 263 | 4.2%
Other values (58) | 2583 | 40.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4555 | 72.1%
Decimal Number | 645 | 10.2%
Uppercase Letter | 486 | 7.7%
Space Separator | 474 | 7.5%
Other Punctuation | 97 | 1.5%
Math Symbol | 52 | 0.8%
Dash Punctuation | 5 | 0.1%
Close Punctuation | 2 | < 0.1%
Open Punctuation | 2 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 466 | 10.2%
e | 459 | 10.1%
m | 435 | 9.5%
o | 398 | 8.7%
a | 341 | 7.5%
r | 319 | 7.0%
g | 310 | 6.8%
l | 270 | 5.9%
n | 263 | 5.8%
d | 223 | 4.9%
Other values (16) | 1071 | 23.5%

Uppercase Letter
Value | Count | Frequency (%)
H | 66 | 13.6%
L | 64 | 13.2%
C | 52 | 10.7%
A | 51 | 10.5%
P | 36 | 7.4%
S | 33 | 6.8%
M | 28 | 5.8%
D | 28 | 5.8%
T | 18 | 3.7%
B | 17 | 3.5%
Other values (13) | 93 | 19.1%

Decimal Number
Value | Count | Frequency (%)
0 | 210 | 32.6%
5 | 115 | 17.8%
1 | 112 | 17.4%
2 | 92 | 14.3%
3 | 43 | 6.7%
4 | 30 | 4.7%
6 | 17 | 2.6%
7 | 13 | 2.0%
8 | 8 | 1.2%
9 | 5 | 0.8%

Other Punctuation
Value | Count | Frequency (%)
/ | 48 | 49.5%
. | 44 | 45.4%
% | 4 | 4.1%
& | 1 | 1.0%

Space Separator
Value | Count | Frequency (%)
(space) | 474 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 52 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5037 | 79.7%
Common | 1277 | 20.2%
Greek | 4 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 466 | 9.3%
e | 459 | 9.1%
m | 435 | 8.6%
o | 398 | 7.9%
a | 341 | 6.8%
r | 319 | 6.3%
g | 310 | 6.2%
l | 270 | 5.4%
n | 263 | 5.2%
d | 223 | 4.4%
Other values (37) | 1553 | 30.8%

Common
Value | Count | Frequency (%)
(space) | 474 | 37.1%
0 | 210 | 16.4%
5 | 115 | 9.0%
1 | 112 | 8.8%
2 | 92 | 7.2%
+ | 52 | 4.1%
/ | 48 | 3.8%
. | 44 | 3.4%
3 | 43 | 3.4%
4 | 30 | 2.3%
Other values (9) | 57 | 4.5%

Greek
Value | Count | Frequency (%)
μ | 3 | 75.0%
β | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6314 | 99.9%
None | 4 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 474 | 7.5%
i | 466 | 7.4%
e | 459 | 7.3%
m | 435 | 6.9%
o | 398 | 6.3%
a | 341 | 5.4%
r | 319 | 5.1%
g | 310 | 4.9%
l | 270 | 4.3%
n | 263 | 4.2%
Other values (56) | 2579 | 40.8%

None
Value | Count | Frequency (%)
μ | 3 | 75.0%
β | 1 | 25.0%

약품분류
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

일반: 213
향정: 26

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 213 | 89.1%
향정 | 26 | 10.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반 | 213 | 89.1%
향정 | 26 | 10.9%

약품구분
Categorical

Distinct: 3
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

내복약: 193
주사약: 36
외용약: 10

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 내복약
2nd row: 내복약
3rd row: 내복약
4th row: 내복약
5th row: 내복약

Common Values

Value | Count | Frequency (%)
내복약 | 193 | 80.8%
주사약 | 36 | 15.1%
외용약 | 10 | 4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
내복약 | 193 | 80.8%
주사약 | 36 | 15.1%
외용약 | 10 | 4.2%

Correlations

Variable | 약품분류 | 약품구분
약품분류 | 1.000 | 0.034
약품구분 | 0.034 | 1.000

Variable | 약품구분 | 약품분류
약품구분 | 1.000 | 0.056
약품분류 | 0.056 | 1.000

Variable | 약품분류 | 약품구분
약품분류 | 1.000 | 0.056
약품구분 | 0.056 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 약품코드 | 성분한글명 | 성분영문명 | 약품분류 | 약품구분
0 | DAAP | 아세트아미노펜 300mg | Acetaminophen 300mg | 일반 | 내복약
1 | DABILOD10 | 아리피프라졸 10mg | Aripiprazole 10mg | 일반 | 내복약
2 | DABILOD15 | 아리피프라졸 15mg | Aripiprazole 15mg | 일반 | 내복약
3 | DACAM | 아캄프로세이트칼슘 333mg | Acamprosate Calcium 333mg | 일반 | 내복약
4 | DACAR2 | 아카보즈 100mg | Acarbose 100mg | 일반 | 내복약
5 | DACTI | 슈도에페드린염산염 60mg+트리프롤리딘염산염수화물 2.5mg | Pseudoephedrine Hydrochloride 60mg+Triprolidine Hydrochloride Hydrate 2.5mg | 일반 | 내복약
6 | DALP | 알프라졸람 0.25mg | Alprazolam 0.25mg | 향정 | 내복약
7 | DALP125 | 알프라졸람 0.125mg | Alprazolam 0.125mg | 향정 | 내복약
8 | DALP5 | 알프라졸람 0.5mg | Alprazolam 0.5mg | 향정 | 내복약
9 | DAMA | 글리메피리드 2mg | Glimepiride 2mg | 일반 | 내복약

Index | 약품코드 | 성분한글명 | 성분영문명 | 약품분류 | 약품구분
229 | WNS1 | 염화나트륨 9g/L | Sodium Chloride 9g/L | 일반 | 주사약
230 | WPAL100 | 팔리페리돈 100mg | Paliperidone 100mg | 일반 | 주사약
231 | WPAL150 | 팔리페리돈 150mg | Paliperidone 150mg | 일반 | 주사약
232 | WPAL50 | 팔리페리돈 50mg | Paliperidone 50mg | 일반 | 주사약
233 | WPAL546 | 팔리페리돈팔미테이트 312mg/mL | Paliperidone palmitate 312mg/mL | 일반 | 주사약
234 | WPAL75 | 팔리페리돈 75mg | Paliperidone 75mg | 일반 | 주사약
235 | WPAL819 | 팔리페리돈팔미테이트 312mg/mL | Paliperidone palmitate 312mg/mL | 일반 | 주사약
236 | WPERI | 할로페리돌 5mg/mL | Haloperidol 5mg/1mL | 일반 | 주사약
237 | WPPCT | 염산프로파세타몰 1g | Propacetamol Hydrochloride 1g | 일반 | 주사약
238 | WTHI | 치아민염산염 50mg/2mL | Thiamine Hydrochloride 50mg/2mL | 일반 | 주사약