gimi9 Pandas Profiling

Dataset statistics

Number of variables	6
Number of observations	10000
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	546.9 KiB
Average record size in memory	56.0 B
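
The average record size appears to be the total memory divided by the number of observations. The following is a minimal check of that arithmetic, assuming 1 KiB means 1024 bytes; the variable and function names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
TOTAL_SIZE_KIB = 546.9
N_OBSERVATIONS = 10_000


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row, treating 1 KiB as 1024 bytes."""
    return total_kib * 1024 / n_rows


avg_bytes = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")
```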

Variable types

Text	5
Categorical	1

Dataset

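Description	Input data for Korean medicine research reports from OASIS, the Traditional Medicine Information Portal. It consists of the ontology keyword control number, keyword classification, identifier, medicinal material name, Korean (Hangul) name, and ontology search Chinese-character name.
Author	Korea Institute of Oriental Medicine (한국한의학연구원)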
URL	https://www.data.go.kr/data/15086079/fileData.do

Alerts

온톨로지키워드제어번호 has unique values

Reproduction

Analysis started	2023-12-12 22:30:25.832877
Analysis finished	2023-12-12 22:30:30.275585
Duration	4.44 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json
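
The reported duration can be recomputed from the two timestamps. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 22:30:25.832877")
finished = datetime.fromisoformat("2023-12-12 22:30:30.275585")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```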

온톨로지키워드제어번호
Text

UNIQUE

Distinct	10000
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	6
Median length	6
Mean length	5.5743
Min length	1

Characters and Unicode

Total characters	55743
Distinct characters	11
Distinct categories	2
Distinct scripts	1
Distinct blocks	1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
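
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is not necessarily what the report generator does internally. The function name `category_counts` is hypothetical, and the sample values are the five sample rows of this variable. `unicodedata.category` returns two-letter abbreviations: `Nd` corresponds to "Decimal Number" and `Po` to "Other Punctuation".

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["9,491", "12,922", "28,041", "356", "13,812"]
counts = category_counts(sample)
print(dict(counts))
```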

Unique

Unique	10000
Unique (%)	100.0%

Sample

1st row	9,491
2nd row	12,922
3rd row	28,041
4th row	356
5th row	13,812

Value	Count	Frequency (%)
9,491	1	< 0.1%
26,727	1	< 0.1%
9,461	1	< 0.1%
8,161	1	< 0.1%
20,776	1	< 0.1%
15,667	1	< 0.1%
12,168	1	< 0.1%
9,222	1	< 0.1%
1,056	1	< 0.1%
22,705	1	< 0.1%
Other values (9990)	9990	99.9%
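
The values look like integers written with a thousands separator, which would explain why the column is typed as Text and why the comma is its most frequent character. Under that assumption, which the report itself does not confirm, they could be converted to integers as sketched below. `parse_control_number` is a hypothetical helper.

```python
def parse_control_number(text: str) -> int:
    """Drop thousands separators and convert to int (assumes ',' is only a separator)."""
    return int(text.replace(",", ""))


values = ["9,491", "26,727", "356"]
parsed = [parse_control_number(v) for v in values]
print(parsed)
```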

Most occurring characters

Value	Count	Frequency (%)
,	9626	17.3%
1	7532	13.5%
2	7095	12.7%
3	4107	7.4%
5	4080	7.3%
7	4024	7.2%
4	4021	7.2%
6	3975	7.1%
8	3899	7.0%
0	3706	6.6%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	46117	82.7%
Other Punctuation	9626	17.3%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
1	7532	16.3%
2	7095	15.4%
3	4107	8.9%
5	4080	8.8%
7	4024	8.7%
4	4021	8.7%
6	3975	8.6%
8	3899	8.5%
0	3706	8.0%
9	3678	8.0%

Other Punctuation

Value	Count	Frequency (%)
,	9626	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	55743	100.0%

Most frequent character per script

Common

Value	Count	Frequency (%)
,	9626	17.3%
1	7532	13.5%
2	7095	12.7%
3	4107	7.4%
5	4080	7.3%
7	4024	7.2%
4	4021	7.2%
6	3975	7.1%
8	3899	7.0%
0	3706	6.6%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	55743	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
,	9626	17.3%
1	7532	13.5%
2	7095	12.7%
3	4107	7.4%
5	4080	7.3%
7	4024	7.2%
4	4021	7.2%
6	3975	7.1%
8	3899	7.0%
0	3706	6.6%

키워드분류
Categorical

Distinct	6
Distinct (%)	0.1%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Formula	3885
Symptom	3509
Medicinal_Material	1051
Effect	688
Disease	647

Length

Max length	18
Median length	7
Mean length	8.0873
Min length	6

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	Formula
2nd row	Formula
3rd row	Symptom
4th row	Disease
5th row	Formula

Common Values

Value	Count	Frequency (%)
Formula	3885	38.9%
Symptom	3509	35.1%
Medicinal_Material	1051	10.5%
Effect	688	6.9%
Disease	647	6.5%
Pattern	220	2.2%
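
The frequency column is each count divided by the number of observations. The short sketch below recomputes the shares from the counts listed above and confirms that the six categories account for every row. The dictionary is transcribed from the table, and the names are illustrative.

```python
# Counts transcribed from the Common Values table.
category_counts = {
    "Formula": 3885,
    "Symptom": 3509,
    "Medicinal_Material": 1051,
    "Effect": 688,
    "Disease": 647,
    "Pattern": 220,
}

total = sum(category_counts.values())
# Share of each category as a percentage of all rows.
shares = {name: 100 * n / total for name, n in category_counts.items()}

for name, share in shares.items():
    print(f"{name}\t{share:.2f}%")
```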

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
formula	3885	38.9%
symptom	3509	35.1%
medicinal_material	1051	10.5%
effect	688	6.9%
disease	647	6.5%
pattern	220	2.2%

식별자(CID)
Text

Distinct	9964
Distinct (%)	99.6%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	180
Median length	89
Mean length	11.4994
Min length	3

Characters and Unicode

Total characters	114994
Distinct characters	2220
Distinct categories	10
Distinct scripts	4
Distinct blocks	5

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	9940
Unique (%)	99.4%
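
"Distinct" and "Unique" measure different things here, as far as the figures suggest: distinct counts how many different values occur, while unique counts values that occur exactly once. With 9964 distinct and 9940 unique values, 24 values would appear more than once. The sketch below shows the distinction on a toy list; `distinct_and_unique` is a hypothetical helper, not the report's own code.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


toy = ["加", "加", "大黃", "人蔘", "人蔘", "知母"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)
```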

Sample

1st row	FO海蛤散-MB得效-BK심계내과학
2nd row	FO藿香安胃散-MB東醫寶鑑-MB醫鑑-BK동의방제와처방해설-108
3rd row	SY食無味
4th row	DI宿食秘
5th row	FO金水六君煎-MB景岳全書
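
The sample identifiers seem to follow a pattern: a two-letter class prefix, a name, and optional hyphen-separated segments that often start with a two-letter tag such as MB or BK. The sketch below splits an identifier along those lines. The prefix mapping is inferred from rows shown in this report (FO, SY, DI, MM). The report does not document what the tags mean, and prefixes for the Effect and Pattern classes are not visible, so treat `parse_cid` and `PREFIX_TO_CLASS` as assumptions.

```python
import re

# Inferred from rows in this report; not an official specification.
PREFIX_TO_CLASS = {
    "FO": "Formula",
    "SY": "Symptom",
    "DI": "Disease",
    "MM": "Medicinal_Material",
}
# A segment such as "MB景岳全書": two uppercase ASCII letters, then text.
TAG_PATTERN = re.compile(r"^([A-Z]{2})(.+)$")


def parse_cid(cid: str) -> dict:
    """Split an identifier into class, name, and (tag, text) segments."""
    prefix, rest = cid[:2], cid[2:]
    name, *segments = rest.split("-")
    tags = []
    for segment in segments:
        match = TAG_PATTERN.match(segment)
        if match:
            tags.append((match.group(1), match.group(2)))
        elif segment:  # untagged qualifier, e.g. a page number; empty segments are skipped
            tags.append((None, segment))
    return {"class": PREFIX_TO_CLASS.get(prefix), "name": name, "tags": tags}


print(parse_cid("FO金水六君煎-MB景岳全書"))
```

Identifiers with a doubled hyphen produce an empty segment, which this sketch skips silently.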

Value	Count	Frequency (%)
加	35	0.3%
mmoo	6	0.1%
syo	6	0.1%
合	6	0.1%
大黃	6	0.1%
人蔘	6	0.1%
fo二陳湯	5	< 0.1%
知母	5	< 0.1%
杏仁	5	< 0.1%
fo四物湯	5	< 0.1%
Other values (10097)	10201	99.2%

Most occurring characters

Value	Count	Frequency (%)
-	7705	6.7%
O	5367	4.7%
B	4911	4.3%
F	4573	4.0%
M	4316	3.8%
S	3511	3.1%
Y	3509	3.1%
K	2225	1.9%
방	2060	1.8%
湯	2012	1.7%
Other values (2210)	74805	65.1%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	71737	62.4%
Uppercase Letter	31313	27.2%
Dash Punctuation	7705	6.7%
Decimal Number	2218	1.9%
Connector Punctuation	1404	1.2%
Space Separator	286	0.2%
Lowercase Letter	254	0.2%
Other Punctuation	29	< 0.1%
Close Punctuation	24	< 0.1%
Open Punctuation	24	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
방	2060	2.9%
湯	2012	2.8%
학	1349	1.9%
제	1191	1.7%
의	1151	1.6%
醫	1121	1.6%
동	1095	1.5%
과	1068	1.5%
내	1047	1.5%
계	1044	1.5%
Other values (2150)	58599	81.7%

Lowercase Letter

Value	Count	Frequency (%)
a	34	13.4%
e	27	10.6%
r	27	10.6%
i	21	8.3%
t	21	8.3%
g	18	7.1%
s	15	5.9%
o	14	5.5%
c	12	4.7%
l	9	3.5%
Other values (13)	56	22.0%

Uppercase Letter

Value	Count	Frequency (%)
O	5367	17.1%
B	4911	15.7%
F	4573	14.6%
M	4316	13.8%
S	3511	11.2%
Y	3509	11.2%
K	2225	7.1%
E	689	2.2%
I	647	2.1%
D	647	2.1%
Other values (6)	918	2.9%

Decimal Number

Value	Count	Frequency (%)
1	291	13.1%
4	279	12.6%
2	265	11.9%
3	255	11.5%
5	240	10.8%
7	217	9.8%
6	182	8.2%
9	170	7.7%
0	162	7.3%
8	157	7.1%

Other Punctuation

Value	Count	Frequency (%)
.	13	44.8%
·	9	31.0%
/	3	10.3%
;	2	6.9%
\	1	3.4%
"	1	3.4%

Dash Punctuation

Value	Count	Frequency (%)
-	7705	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	1404	100.0%

Space Separator

Value	Count	Frequency (%)
	286	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	24	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	24	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	51791	45.0%
Latin	31567	27.5%
Hangul	19946	17.3%
Common	11690	10.2%
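
Python's standard library does not expose the Unicode script property directly, but a rough approximation can be derived from character names. The sketch below is a heuristic only, not the method used to produce this report. `rough_script` is a hypothetical helper that lumps everything else into "Common".

```python
import unicodedata
from collections import Counter


def rough_script(ch: str) -> str:
    """Approximate a character's script from the prefix of its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK"):
        return "Han"
    if name.startswith("HANGUL"):
        return "Hangul"
    if name.startswith("LATIN"):
        return "Latin"
    return "Common"


# First sample row of this variable.
value = "FO海蛤散-MB得效-BK심계내과학"
script_counts = Counter(rough_script(ch) for ch in value)
print(dict(script_counts))
```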

Most frequent character per script

Han

Value	Count	Frequency (%)
湯	2012	3.9%
醫	1121	2.2%
鑑	1021	2.0%
東	944	1.8%
寶	919	1.8%
散	854	1.6%
丸	687	1.3%
氣	669	1.3%
痛	485	0.9%
黃	475	0.9%
Other values (1777)	42604	82.3%

Hangul

Value	Count	Frequency (%)
방	2060	10.3%
학	1349	6.8%
제	1191	6.0%
의	1151	5.8%
동	1095	5.5%
과	1068	5.4%
내	1047	5.2%
계	1044	5.2%
해	891	4.5%
와	884	4.4%
Other values (363)	8166	40.9%

Latin

Value	Count	Frequency (%)
O	5367	17.0%
B	4911	15.6%
F	4573	14.5%
M	4316	13.7%
S	3511	11.1%
Y	3509	11.1%
K	2225	7.0%
E	689	2.2%
I	647	2.0%
D	647	2.0%
Other values (29)	1172	3.7%

Common

Value	Count	Frequency (%)
-	7705	65.9%
_	1404	12.0%
1	291	2.5%
	286	2.4%
4	279	2.4%
2	265	2.3%
3	255	2.2%
5	240	2.1%
7	217	1.9%
6	182	1.6%
Other values (11)	566	4.8%

Most occurring blocks

Value	Count	Frequency (%)
CJK	51789	45.0%
ASCII	43248	37.6%
Hangul	19946	17.3%
None	9	< 0.1%
CJK Compat Ideographs	2	< 0.1%
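
The two characters in the CJK Compat Ideographs block are compatibility code points that look identical to ordinary unified ideographs, so they can defeat exact string matching. Unicode normalization folds them onto their unified equivalents. The sketch below assumes the two characters are U+F9B1 and U+F984; the report shows only glyphs, so those code points are a guess. `fold_compat_ideographs` is a hypothetical helper.

```python
import unicodedata


def fold_compat_ideographs(text: str) -> str:
    """Apply NFC, which replaces CJK compatibility ideographs with unified ones."""
    return unicodedata.normalize("NFC", text)


# Assumed code points for the two compatibility ideographs.
compat = "\uf9b1\uf984"
folded = fold_compat_ideographs(compat)

for before, after in zip(compat, folded):
    print(f"U+{ord(before):04X} -> U+{ord(after):04X}")
```

Normalizing both the data and any search terms before comparison would avoid mismatches caused by these look-alike code points.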

Most frequent character per block

ASCII

Value	Count	Frequency (%)
-	7705	17.8%
O	5367	12.4%
B	4911	11.4%
F	4573	10.6%
M	4316	10.0%
S	3511	8.1%
Y	3509	8.1%
K	2225	5.1%
_	1404	3.2%
E	689	1.6%
Other values (49)	5038	11.6%

Hangul

Value	Count	Frequency (%)
방	2060	10.3%
학	1349	6.8%
제	1191	6.0%
의	1151	5.8%
동	1095	5.5%
과	1068	5.4%
내	1047	5.2%
계	1044	5.2%
해	891	4.5%
와	884	4.4%
Other values (363)	8166	40.9%

CJK

Value	Count	Frequency (%)
湯	2012	3.9%
醫	1121	2.2%
鑑	1021	2.0%
東	944	1.8%
寶	919	1.8%
散	854	1.6%
丸	687	1.3%
氣	669	1.3%
痛	485	0.9%
黃	475	0.9%
Other values (1775)	42602	82.3%

None

Value	Count	Frequency (%)
·	9	100.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
鈴	1	50.0%
濾	1	50.0%

약재명(LOCAL_NAME)
Text

Distinct	9903
Distinct (%)	99.0%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	178
Median length	87
Mean length	9.4994
Min length	1

Characters and Unicode

Total characters	94994
Distinct characters	2216
Distinct categories	10
Distinct scripts	4
Distinct blocks	5

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	9826
Unique (%)	98.3%

Sample

1st row	海蛤散-MB得效-BK심계내과학
2nd row	藿香安胃散-MB東醫寶鑑-MB醫鑑-BK동의방제와처방해설-108
3rd row	食無味
4th row	宿食秘
5th row	金水六君煎-MB景岳全書

Value	Count	Frequency (%)
加	35	0.3%
oo	12	0.1%
o	7	0.1%
大黃	6	0.1%
人蔘	6	0.1%
二陳湯	6	0.1%
合	6	0.1%
黃芩	6	0.1%
知母	5	< 0.1%
杏仁	5	< 0.1%
Other values (10020)	10189	99.1%

Most occurring characters

Value	Count	Frequency (%)
-	7705	8.1%
B	4911	5.2%
K	2225	2.3%
M	2214	2.3%
방	2060	2.2%
湯	2012	2.1%
O	1482	1.6%
_	1404	1.5%
학	1349	1.4%
제	1191	1.3%
Other values (2206)	68441	72.0%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	71737	75.5%
Uppercase Letter	11313	11.9%
Dash Punctuation	7705	8.1%
Decimal Number	2218	2.3%
Connector Punctuation	1404	1.5%
Space Separator	286	0.3%
Lowercase Letter	254	0.3%
Other Punctuation	29	< 0.1%
Close Punctuation	24	< 0.1%
Open Punctuation	24	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
방	2060	2.9%
湯	2012	2.8%
학	1349	1.9%
제	1191	1.7%
의	1151	1.6%
醫	1121	1.6%
동	1095	1.5%
과	1068	1.5%
내	1047	1.5%
계	1044	1.5%
Other values (2150)	58599	81.7%

Lowercase Letter

Value	Count	Frequency (%)
a	34	13.4%
e	27	10.6%
r	27	10.6%
i	21	8.3%
t	21	8.3%
g	18	7.1%
s	15	5.9%
o	14	5.5%
c	12	4.7%
h	9	3.5%
Other values (13)	56	22.0%

Uppercase Letter

Value	Count	Frequency (%)
B	4911	43.4%
K	2225	19.7%
M	2214	19.6%
O	1482	13.1%
C	472	4.2%
P	2	< 0.1%
S	2	< 0.1%
L	1	< 0.1%
H	1	< 0.1%
A	1	< 0.1%
Other values (2)	2	< 0.1%

Decimal Number

Value	Count	Frequency (%)
1	291	13.1%
4	279	12.6%
2	265	11.9%
3	255	11.5%
5	240	10.8%
7	217	9.8%
6	182	8.2%
9	170	7.7%
0	162	7.3%
8	157	7.1%

Other Punctuation

Value	Count	Frequency (%)
.	13	44.8%
·	9	31.0%
/	3	10.3%
;	2	6.9%
\	1	3.4%
"	1	3.4%

Dash Punctuation

Value	Count	Frequency (%)
-	7705	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	1404	100.0%

Space Separator

Value	Count	Frequency (%)
	286	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	24	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	24	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	51791	54.5%
Hangul	19946	21.0%
Common	11690	12.3%
Latin	11567	12.2%

Most frequent character per script

Han

Value	Count	Frequency (%)
湯	2012	3.9%
醫	1121	2.2%
鑑	1021	2.0%
東	944	1.8%
寶	919	1.8%
散	854	1.6%
丸	687	1.3%
氣	669	1.3%
痛	485	0.9%
黃	475	0.9%
Other values (1777)	42604	82.3%

Hangul

Value	Count	Frequency (%)
방	2060	10.3%
학	1349	6.8%
제	1191	6.0%
의	1151	5.8%
동	1095	5.5%
과	1068	5.4%
내	1047	5.2%
계	1044	5.2%
해	891	4.5%
와	884	4.4%
Other values (363)	8166	40.9%

Latin

Value	Count	Frequency (%)
B	4911	42.5%
K	2225	19.2%
M	2214	19.1%
O	1482	12.8%
C	472	4.1%
a	34	0.3%
e	27	0.2%
r	27	0.2%
i	21	0.2%
t	21	0.2%
Other values (25)	133	1.1%

Common

Value	Count	Frequency (%)
-	7705	65.9%
_	1404	12.0%
1	291	2.5%
	286	2.4%
4	279	2.4%
2	265	2.3%
3	255	2.2%
5	240	2.1%
7	217	1.9%
6	182	1.6%
Other values (11)	566	4.8%

Most occurring blocks

Value	Count	Frequency (%)
CJK	51789	54.5%
ASCII	23248	24.5%
Hangul	19946	21.0%
None	9	< 0.1%
CJK Compat Ideographs	2	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
-	7705	33.1%
B	4911	21.1%
K	2225	9.6%
M	2214	9.5%
O	1482	6.4%
_	1404	6.0%
C	472	2.0%
1	291	1.3%
	286	1.2%
4	279	1.2%
Other values (45)	1979	8.5%

Hangul

Value	Count	Frequency (%)
방	2060	10.3%
학	1349	6.8%
제	1191	6.0%
의	1151	5.8%
동	1095	5.5%
과	1068	5.4%
내	1047	5.2%
계	1044	5.2%
해	891	4.5%
와	884	4.4%
Other values (363)	8166	40.9%

CJK

Value	Count	Frequency (%)
湯	2012	3.9%
醫	1121	2.2%
鑑	1021	2.0%
東	944	1.8%
寶	919	1.8%
散	854	1.6%
丸	687	1.3%
氣	669	1.3%
痛	485	0.9%
黃	475	0.9%
Other values (1775)	42602	82.3%

None

Value	Count	Frequency (%)
·	9	100.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
鈴	1	50.0%
濾	1	50.0%

한문명
Text

Distinct	8050
Distinct (%)	80.5%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	178
Median length	130
Mean length	5.1377
Min length	1

Characters and Unicode

Total characters	51377
Distinct characters	1769
Distinct categories	10
Distinct scripts	4
Distinct blocks	6

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	6960
Unique (%)	69.6%

Sample

1st row	해합산
2nd row	곽향안위산
3rd row	식무미
4th row	숙식비
5th row	금수육군전

Value	Count	Frequency (%)
加	37	0.3%
가	16	0.1%
접촉시	13	0.1%
양격산	9	0.1%
계지복령환	9	0.1%
이진탕	9	0.1%
귀비탕	9	0.1%
지출환	8	0.1%
가미온담탕	8	0.1%
백호탕	8	0.1%
Other values (8423)	10592	98.8%

Most occurring characters

Value	Count	Frequency (%)
탕	1767	3.4%
-	1193	2.3%
증	1036	2.0%
산	858	1.7%
상	844	1.6%
_	798	1.6%
기	748	1.5%
	718	1.4%
아	698	1.4%
님	641	1.2%
Other values (1759)	42076	81.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	47743	92.9%
Dash Punctuation	1193	2.3%
Connector Punctuation	798	1.6%
Space Separator	718	1.4%
Uppercase Letter	410	0.8%
Lowercase Letter	236	0.5%
Open Punctuation	87	0.2%
Close Punctuation	87	0.2%
Decimal Number	87	0.2%
Other Punctuation	18	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
탕	1767	3.7%
증	1036	2.2%
산	858	1.8%
상	844	1.8%
기	748	1.6%
아	698	1.5%
님	641	1.3%
환	639	1.3%
지	540	1.1%
소	534	1.1%
Other values (1707)	39438	82.6%

Lowercase Letter

Value	Count	Frequency (%)
a	29	12.3%
r	27	11.4%
e	26	11.0%
t	20	8.5%
i	19	8.1%
g	17	7.2%
s	15	6.4%
o	13	5.5%
c	12	5.1%
n	9	3.8%
Other values (12)	49	20.8%

Decimal Number

Value	Count	Frequency (%)
5	23	26.4%
1	17	19.5%
7	14	16.1%
3	11	12.6%
0	7	8.0%
2	7	8.0%
4	3	3.4%
8	3	3.4%
9	1	1.1%
6	1	1.1%

Uppercase Letter

Value	Count	Frequency (%)
O	401	97.8%
P	2	0.5%
S	2	0.5%
H	1	0.2%
L	1	0.2%
A	1	0.2%
E	1	0.2%
R	1	0.2%

Other Punctuation

Value	Count	Frequency (%)
,	7	38.9%
·	5	27.8%
/	3	16.7%
;	2	11.1%
"	1	5.6%

Open Punctuation

Value	Count	Frequency (%)
(	56	64.4%
{	31	35.6%

Close Punctuation

Value	Count	Frequency (%)
)	56	64.4%
}	31	35.6%

Dash Punctuation

Value	Count	Frequency (%)
-	1193	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	798	100.0%

Space Separator

Value	Count	Frequency (%)
	718	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	38117	74.2%
Han	9626	18.7%
Common	2988	5.8%
Latin	646	1.3%

Most frequent character per script

Han

Value	Count	Frequency (%)
湯	240	2.5%
炒	201	2.1%
黃	162	1.7%
加	147	1.5%
子	134	1.4%
不	122	1.3%
酒	120	1.2%
去	117	1.2%
氣	107	1.1%
熱	94	1.0%
Other values (1172)	8182	85.0%

Hangul

Value	Count	Frequency (%)
탕	1767	4.6%
증	1036	2.7%
산	858	2.3%
상	844	2.2%
기	748	2.0%
아	698	1.8%
님	641	1.7%
환	639	1.7%
지	540	1.4%
소	534	1.4%
Other values (525)	29812	78.2%

Latin

Value	Count	Frequency (%)
O	401	62.1%
a	29	4.5%
r	27	4.2%
e	26	4.0%
t	20	3.1%
i	19	2.9%
g	17	2.6%
s	15	2.3%
o	13	2.0%
c	12	1.9%
Other values (20)	67	10.4%

Common

Value	Count	Frequency (%)
-	1193	39.9%
_	798	26.7%
	718	24.0%
(	56	1.9%
)	56	1.9%
}	31	1.0%
{	31	1.0%
5	23	0.8%
1	17	0.6%
7	14	0.5%
Other values (12)	51	1.7%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	38114	74.2%
CJK	9625	18.7%
ASCII	3629	7.1%
None	5	< 0.1%
Compat Jamo	3	< 0.1%
CJK Compat Ideographs	1	< 0.1%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
탕	1767	4.6%
증	1036	2.7%
산	858	2.3%
상	844	2.2%
기	748	2.0%
아	698	1.8%
님	641	1.7%
환	639	1.7%
지	540	1.4%
소	534	1.4%
Other values (524)	29809	78.2%

ASCII

Value	Count	Frequency (%)
-	1193	32.9%
_	798	22.0%
	718	19.8%
O	401	11.0%
(	56	1.5%
)	56	1.5%
}	31	0.9%
{	31	0.9%
a	29	0.8%
r	27	0.7%
Other values (41)	289	8.0%

CJK

Value	Count	Frequency (%)
湯	240	2.5%
炒	201	2.1%
黃	162	1.7%
加	147	1.5%
子	134	1.4%
不	122	1.3%
酒	120	1.2%
去	117	1.2%
氣	107	1.1%
熱	94	1.0%
Other values (1171)	8181	85.0%

None

Value	Count	Frequency (%)
·	5	100.0%

Compat Jamo

Value	Count	Frequency (%)
ㆍ	3	100.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
濾	1	100.0%

온톨로지검색한문명
Text

Distinct	8019
Distinct (%)	80.2%
Missing	0
Missing (%)	0.0%
Memory size	156.2 KiB

Length

Max length	178
Median length	130
Mean length	5.1315
Min length	1

Characters and Unicode

Total characters	51315
Distinct characters	2113
Distinct categories	10
Distinct scripts	4
Distinct blocks	5

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	6897
Unique (%)	69.0%

Sample

1st row	海蛤散
2nd row	藿香安胃散
3rd row	食無味
4th row	宿食秘
5th row	金水六君煎

Value	Count	Frequency (%)
加	37	0.4%
oo	12	0.1%
二陳湯	11	0.1%
四物湯	10	0.1%
歸脾湯	9	0.1%
桂枝茯o丸	9	0.1%
枳朮丸	8	0.1%
血府逐瘀湯	8	0.1%
加味溫膽湯	8	0.1%
當歸四逆湯	8	0.1%
Other values (8115)	10163	98.8%

Most occurring characters

Value	Count	Frequency (%)
湯	2006	3.9%
_	1365	2.7%
O	1304	2.5%
-	1231	2.4%
散	848	1.7%
丸	684	1.3%
증	664	1.3%
상	653	1.3%
아	650	1.3%
氣	642	1.3%
Other values (2103)	41268	80.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	46724	91.1%
Connector Punctuation	1365	2.7%
Uppercase Letter	1313	2.6%
Dash Punctuation	1231	2.4%
Space Separator	283	0.6%
Lowercase Letter	252	0.5%
Decimal Number	85	0.2%
Close Punctuation	24	< 0.1%
Open Punctuation	24	< 0.1%
Other Punctuation	14	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
湯	2006	4.3%
散	848	1.8%
丸	684	1.5%
증	664	1.4%
상	653	1.4%
아	650	1.4%
氣	642	1.4%
님	641	1.4%
黃	469	1.0%
痛	457	1.0%
Other values (2053)	39010	83.5%

Lowercase Letter

Value	Count	Frequency (%)
a	34	13.5%
r	27	10.7%
e	27	10.7%
t	21	8.3%
i	21	8.3%
g	16	6.3%
s	15	6.0%
o	14	5.6%
c	12	4.8%
h	9	3.6%
Other values (13)	56	22.2%

Decimal Number

Value	Count	Frequency (%)
5	22	25.9%
1	17	20.0%
7	13	15.3%
3	11	12.9%
2	7	8.2%
0	7	8.2%
8	3	3.5%
4	3	3.5%
9	1	1.2%
6	1	1.2%

Uppercase Letter

Value	Count	Frequency (%)
O	1304	99.3%
P	2	0.2%
S	2	0.2%
L	1	0.1%
H	1	0.1%
A	1	0.1%
R	1	0.1%
E	1	0.1%

Other Punctuation

Value	Count	Frequency (%)
·	8	57.1%
/	3	21.4%
;	2	14.3%
"	1	7.1%

Connector Punctuation

Value	Count	Frequency (%)
_	1365	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	1231	100.0%

Space Separator

Value	Count	Frequency (%)
	283	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	24	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	24	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	41266	80.4%
Hangul	5458	10.6%
Common	3026	5.9%
Latin	1565	3.0%

Most frequent character per script

Han

Value	Count	Frequency (%)
湯	2006	4.9%
散	848	2.1%
丸	684	1.7%
氣	642	1.6%
黃	469	1.1%
痛	457	1.1%
加	439	1.1%
血	428	1.0%
不	425	1.0%
熱	421	1.0%
Other values (1684)	34447	83.5%

Hangul

Value	Count	Frequency (%)
증	664	12.2%
상	653	12.0%
아	650	11.9%
님	641	11.7%
하	214	3.9%
이	138	2.5%
에	133	2.4%
을	113	2.1%
다	110	2.0%
여	90	1.6%
Other values (359)	2052	37.6%

Latin

Value	Count	Frequency (%)
O	1304	83.3%
a	34	2.2%
r	27	1.7%
e	27	1.7%
t	21	1.3%
i	21	1.3%
g	16	1.0%
s	15	1.0%
o	14	0.9%
c	12	0.8%
Other values (21)	74	4.7%

Common

Value	Count	Frequency (%)
_	1365	45.1%
-	1231	40.7%
	283	9.4%
)	24	0.8%
(	24	0.8%
5	22	0.7%
1	17	0.6%
7	13	0.4%
3	11	0.4%
·	8	0.3%
Other values (9)	28	0.9%

Most occurring blocks

Value	Count	Frequency (%)
CJK	41264	80.4%
Hangul	5458	10.6%
ASCII	4583	8.9%
None	8	< 0.1%
CJK Compat Ideographs	2	< 0.1%

Most frequent character per block

CJK

Value	Count	Frequency (%)
湯	2006	4.9%
散	848	2.1%
丸	684	1.7%
氣	642	1.6%
黃	469	1.1%
痛	457	1.1%
加	439	1.1%
血	428	1.0%
不	425	1.0%
熱	421	1.0%
Other values (1682)	34445	83.5%

ASCII

Value	Count	Frequency (%)
_	1365	29.8%
O	1304	28.5%
-	1231	26.9%
	283	6.2%
a	34	0.7%
r	27	0.6%
e	27	0.6%
)	24	0.5%
(	24	0.5%
5	22	0.5%
Other values (39)	242	5.3%

Hangul

Value	Count	Frequency (%)
증	664	12.2%
상	653	12.0%
아	650	11.9%
님	641	11.7%
하	214	3.9%
이	138	2.5%
에	133	2.4%
을	113	2.1%
다	110	2.0%
여	90	1.6%
Other values (359)	2052	37.6%

None

Value	Count	Frequency (%)
·	8	100.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
鈴	1	50.0%
濾	1	50.0%

Missing values

Count: A simple visualization of nullity by column.

Matrix: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
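
A nullity matrix is simply a table of booleans marking which cells are missing. The sketch below builds one from toy rows and sums it per column. Treating empty strings as missing is an assumption, and the function names are hypothetical. For this dataset every per-column count would be 0, since the report lists no missing cells.

```python
def nullity_matrix(rows, missing=(None, "")):
    """Return a row-by-column grid of booleans: True where a cell is missing."""
    return [[cell in missing for cell in row] for row in rows]


def missing_per_column(matrix):
    """Count the True entries in each column of a nullity matrix."""
    return [sum(column) for column in zip(*matrix)]


toy_rows = [
    ("9,491", "Formula", "海蛤散"),
    ("356", None, "宿食秘"),
    ("", "Symptom", "食無味"),
]
matrix = nullity_matrix(toy_rows)
print(missing_per_column(matrix))
```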

First rows

	온톨로지키워드제어번호	키워드분류	식별자(CID)	약재명(LOCAL_NAME)	한문명	온톨로지검색한문명
16047	9,491	Formula	FO海蛤散-MB得效-BK심계내과학	海蛤散-MB得效-BK심계내과학	해합산	海蛤散
18937	12,922	Formula	FO藿香安胃散-MB東醫寶鑑-MB醫鑑-BK동의방제와처방해설-108	藿香安胃散-MB東醫寶鑑-MB醫鑑-BK동의방제와처방해설-108	곽향안위산	藿香安胃散
7903	28,041	Symptom	SY食無味	食無味	식무미	食無味
8460	356	Disease	DI宿食秘	宿食秘	숙식비	宿食秘
18983	13,812	Formula	FO金水六君煎-MB景岳全書	金水六君煎-MB景岳全書	금수육군전	金水六君煎
3644	22,874	Symptom	SY氣上衝胸	氣上衝胸	기상충흉	氣上衝胸
17716	10,218	Formula	FO滋補養榮丸	滋補養榮丸	자보양영환	滋補養榮丸
7363	27,731	Symptom	SY頑癬	頑癬	완선	頑癬
24439	17,535	Medicinal_Material	MM銅靑	銅靑	동청	銅靑
17878	8,304	Formula	FO旋覆代O湯-MB傷寒論-BC辨太陽病脈證幷治下-BK방제학-399	旋覆代O湯-MB傷寒論-BC辨太陽病脈證幷治下-BK방제학-399	선복대자탕	旋覆代O湯

Last rows

	온톨로지키워드제어번호	키워드분류	식별자(CID)	약재명(LOCAL_NAME)	한문명	온톨로지검색한문명
15836	9,444	Formula	FO活絡丹-MB東醫寶鑑-MB局方-BK동의방제와처방해설-861	活絡丹-MB東醫寶鑑-MB局方-BK동의방제와처방해설-861	활락단	活絡丹
1349	22,288	Symptom	SY或面黃而O	或面黃而O	혹면황이반	或面黃而O
27421	16,230	Medicinal_Material	MM玄胡索-炒	玄胡索-炒	玄胡索-炒	玄胡索-炒
28672	26,968	Symptom	SY贅O	贅O	췌우	贅O
1419	23,597	Symptom	SY熱結裏實-증상아님	熱結裏實-증상아님	熱結裏實-증상아님	熱結裏實-증상아님
24208	13,673	Formula	FO連翹敗毒散-MB東醫寶鑑-BC癰疽--癰疽五發證-MB醫鑑-BK동의방제와처방해설-104	連翹敗毒散-MB東醫寶鑑-BC癰疽--癰疽五發證-MB醫鑑-BK동의방제와처방해설-104	연교패독산	連翹敗毒散
7352	27,720	Symptom	SY項背拘急不舒	項背拘急不舒	항배구급불서	項背拘急不舒
14261	7,081	Formula	FO大秦O湯-MB東醫寶鑑-MB易老-BK동의방제와처방해설-287	大秦O湯-MB東醫寶鑑-MB易老-BK동의방제와처방해설-287	대진교탕	大秦O湯
4219	22,973	Symptom	SY氣血疼痛	氣血疼痛	氣血疼痛	氣血疼痛
15860	9,468	Formula	FO活血驅風湯(散)-BK신계내과학	活血驅風湯(散)-BK신계내과학	活血驅風湯(散)	活血驅風湯(散)
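
Several of the rows above carry the same Chinese-character string in both the 한문명 and 온톨로지검색한문명 columns, whereas most rows have a Hangul reading in 한문명. One way to flag such rows is sketched below. It is a name-based heuristic using the standard library, and `contains_han` is a hypothetical helper. The pairs are copied from the last-rows table.

```python
import unicodedata


def contains_han(text: str) -> bool:
    """Return True if any character's Unicode name marks it as a CJK ideograph."""
    return any(unicodedata.name(ch, "").startswith("CJK") for ch in text)


# (한문명, 온톨로지검색한문명) pairs taken from the last-rows table.
pairs = [
    ("활락단", "活絡丹"),
    ("玄胡索-炒", "玄胡索-炒"),
    ("氣血疼痛", "氣血疼痛"),
    ("항배구급불서", "項背拘急不舒"),
]

flagged = [reading for reading, _ in pairs if contains_han(reading)]
print(flagged)
```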
