Dataset statistics
Number of variables | 31 |
---|---|
Number of observations | 10000 |
Missing cells | 131140 |
Missing cells (%) | 42.3% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.5 MiB |
Average record size in memory | 267.0 B |
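The missing-cell percentage can be checked from the counts above. This is a minimal sketch, assuming the percentage is the missing-cell count divided by variables times observations; the variable names are illustrative.

```python
# Counts copied from the dataset statistics table.
n_variables = 31
n_observations = 10_000
missing_cells = 131_140

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```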
Variable types
Text | 10 |
---|---|
Categorical | 11 |
Unsupported | 10 |
Dataset
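Description | Multimedia records from the historical-materials metadata that the Korean History Information Integration System (한국역사정보통합시스템) provides for metadata-based integrated search across history-related sites |
---|---|
Author | National Institute of Korean History (국사편찬위원회), Ministry of Education |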
URL | https://www.data.go.kr/data/15051037/fileData.do |
Alerts
SUBJECT_KHON1 has constant value "KH.13" | Constant |
UNIT has constant value "2" | Constant |
ALTERNATIVE has 4430 (44.3%) missing values | Missing |
DOCSENDER has 10000 (100.0%) missing values | Missing |
EDITOR has 10000 (100.0%) missing values | Missing |
AUTHOR has 10000 (100.0%) missing values | Missing |
TYPE has 10000 (100.0%) missing values | Missing |
PUBLISHER has 10000 (100.0%) missing values | Missing |
TABLEOFCONTENTS has 10000 (100.0%) missing values | Missing |
ABSTRACT has 4430 (44.3%) missing values | Missing |
ISPARTOF_ID has 5570 (55.7%) missing values | Missing |
ISPARTOF has 5570 (55.7%) missing values | Missing |
REQUIRES has 10000 (100.0%) missing values | Missing |
DATEEVENT has 10000 (100.0%) missing values | Missing |
DOCCREATED has 5570 (55.7%) missing values | Missing |
DOCISSUED has 10000 (100.0%) missing values | Missing |
CREATORSORT has 10000 (100.0%) missing values | Missing |
DATESORT has 5570 (55.7%) missing values | Missing |
URI_KHON has unique values | Unique |
URI_KHDP has unique values | Unique |
URL has unique values | Unique |
DOCSENDER is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
EDITOR is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
AUTHOR is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
TYPE is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
PUBLISHER is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
TABLEOFCONTENTS is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
REQUIRES is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
DATEEVENT is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
DOCISSUED is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
CREATORSORT is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
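The per-column missing counts in the alerts can be reconciled with the dataset-level total of 131140 missing cells. The sketch below only re-adds the numbers listed above; the dictionary name is illustrative.

```python
# Missing-value counts per column, copied from the alerts above.
missing_by_column = {
    "ALTERNATIVE": 4430, "ABSTRACT": 4430,
    "ISPARTOF_ID": 5570, "ISPARTOF": 5570,
    "DOCCREATED": 5570, "DATESORT": 5570,
    "DOCSENDER": 10000, "EDITOR": 10000, "AUTHOR": 10000,
    "TYPE": 10000, "PUBLISHER": 10000, "TABLEOFCONTENTS": 10000,
    "REQUIRES": 10000, "DATEEVENT": 10000, "DOCISSUED": 10000,
    "CREATORSORT": 10000,
}
n_observations = 10_000

total_missing = sum(missing_by_column.values())
# Columns with no values at all; these are the ones flagged as Unsupported.
fully_missing = sorted(
    name for name, count in missing_by_column.items() if count == n_observations
)

print(total_missing)
print(len(fully_missing), fully_missing)
```

The ten fully empty columns are the same ten reported as Unsupported, which suggests they can be dropped before further analysis.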
Reproduction
Analysis started | 2023-12-12 11:45:24.216106 |
---|---|
Analysis finished | 2023-12-12 11:45:27.683837 |
Duration | 3.47 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
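The reported duration follows from the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-12-12 11:45:24.216106")
finished = datetime.fromisoformat("2023-12-12 11:45:27.683837")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```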
URI_KHON
Text
UNIQUE
 
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 28 |
---|---|
Median length | 28 |
Mean length | 24.013 |
Min length | 19 |
Characters and Unicode
Total characters | 240130 |
---|---|
Distinct characters | 20 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | KH.AKS.0539_23_5579 |
---|---|
2nd row | KH.AKS.0059_08_5121 |
3rd row | KH.AKS.0278_03_5705 |
4th row | KH.AKS.0196_14_5825 |
5th row | KH.AC.AC_ENG_0369_01_0204_02 |
Value | Count | Frequency (%) |
kh.aks.0539_23_5579 | 1 | < 0.1% |
kh.aks.0023_01_5069 | 1 | < 0.1% |
kh.aks.0589_08_5612 | 1 | < 0.1% |
kh.ac.ac_eng_0088_01_0317_01 | 1 | < 0.1% |
kh.aks.0078_20_5145 | 1 | < 0.1% |
kh.ac.ac_eng_0268_01_0339_02 | 1 | < 0.1% |
kh.ac.ac_eng_0098_01_0418_01 | 1 | < 0.1% |
kh.ac.ac_eng_0366_01_0278_01 | 1 | < 0.1% |
kh.ac.ac_eng_0014_01_0180_01 | 1 | < 0.1% |
kh.aks.0522_01_5561 | 1 | < 0.1% |
Other values (9990) | 9990 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
. | 20000 | 8.3% |
1 | 19233 | 8.0% |
A | 15570 | 6.5% |
K | 14430 | 6.0% |
C | 11140 | 4.6% |
5 | 10576 | 4.4% |
H | 10000 | 4.2% |
2 | 9261 | 3.9% |
Other values (10) | 55895 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 111140 | |
Uppercase Letter | 72280 | |
Connector Punctuation | 36710 | 15.3% |
Other Punctuation | 20000 | 8.3% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 37315 | |
1 | 19233 | |
5 | 10576 | 9.5% |
2 | 9261 | 8.3% |
3 | 8509 | 7.7% |
4 | 6518 | 5.9% |
6 | 5615 | 5.1% |
8 | 5192 | 4.7% |
7 | 4948 | 4.5% |
9 | 3973 | 3.6% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 15570 | |
K | 14430 | |
C | 11140 | |
H | 10000 | |
E | 5570 | 7.7% |
N | 5570 | 7.7% |
G | 5570 | 7.7% |
S | 4430 | 6.1% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 36710 |
Other Punctuation
Value | Count | Frequency (%) |
. | 20000 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 167850 | |
Latin | 72280 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
. | 20000 | |
1 | 19233 | |
5 | 10576 | 6.3% |
2 | 9261 | 5.5% |
3 | 8509 | 5.1% |
4 | 6518 | 3.9% |
6 | 5615 | 3.3% |
8 | 5192 | 3.1% |
Other values (2) | 8921 | 5.3% |
Latin
Value | Count | Frequency (%) |
A | 15570 | |
K | 14430 | |
C | 11140 | |
H | 10000 | |
E | 5570 | 7.7% |
N | 5570 | 7.7% |
G | 5570 | 7.7% |
S | 4430 | 6.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 240130 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
. | 20000 | 8.3% |
1 | 19233 | 8.0% |
A | 15570 | 6.5% |
K | 14430 | 6.0% |
C | 11140 | 4.6% |
5 | 10576 | 4.4% |
H | 10000 | 4.2% |
2 | 9261 | 3.9% |
Other values (10) | 55895 |
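The category tables above are based on Unicode general categories. The sketch below shows one way to obtain such counts with Python's standard `unicodedata` module. It runs on the five sample identifiers only, so its counts are not the full-column totals, and it is not necessarily how ydata-profiling computes them.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for URI_KHON.
samples = [
    "KH.AKS.0539_23_5579",
    "KH.AKS.0059_08_5121",
    "KH.AKS.0278_03_5705",
    "KH.AKS.0196_14_5825",
    "KH.AC.AC_ENG_0369_01_0204_02",
]

# Two-letter category codes: Nd = Decimal Number, Lu = Uppercase Letter,
# Pc = Connector Punctuation, Po = Other Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```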
MDCENTER
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 3 |
---|---|
Median length | 2 |
Mean length | 2.443 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | AKS |
---|---|
2nd row | AKS |
3rd row | AKS |
4th row | AKS |
5th row | AC |
Common Values
Value | Count | Frequency (%) |
AC | 5570 | 55.7% |
AKS | 4430 | 44.3% |
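The mean length of 2.443 reported for MDCENTER is the count-weighted average of the two value lengths. A minimal sketch, with illustrative variable names:

```python
# (value, count) pairs from the Common Values table.
common_values = [("AC", 5570), ("AKS", 4430)]

total = sum(count for _, count in common_values)
mean_length = sum(len(value) * count for value, count in common_values) / total

print(round(mean_length, 3))
```

The same calculation reproduces the mean lengths reported for DBINFO (4.886) and FORMAT_MEDIUM (13.544), which share the 5570 / 4430 split.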
SUBJECT_KHON
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 12 |
---|---|
Median length | 12 |
Mean length | 12 |
Min length | 12 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | KH.13.02.001 |
---|---|
2nd row | KH.13.02.001 |
3rd row | KH.13.02.001 |
4th row | KH.13.02.001 |
5th row | KH.13.01.011 |
Common Values
Value | Count | Frequency (%) |
KH.13.01.011 | 5570 | 55.7% |
KH.13.02.001 | 4430 | 44.3% |
DBINFO
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 6 |
---|---|
Median length | 4 |
Mean length | 4.886 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 한국민요대관 |
---|---|
2nd row | 한국민요대관 |
3rd row | 한국민요대관 |
4th row | 한국민요대관 |
5th row | 서양고서 |
Common Values
Value | Count | Frequency (%) |
서양고서 | 5570 | 55.7% |
한국민요대관 | 4430 | 44.3% |
URI_KHDP
Text
UNIQUE
 
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 22 |
---|---|
Median length | 22 |
Mean length | 17.57 |
Min length | 12 |
Characters and Unicode
Total characters | 175700 |
---|---|
Distinct characters | 16 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 0539_23_5579 |
---|---|
2nd row | 0059_08_5121 |
3rd row | 0278_03_5705 |
4th row | 0196_14_5825 |
5th row | AC_ENG_0369_01_0204_02 |
Value | Count | Frequency (%) |
0539_23_5579 | 1 | < 0.1% |
0023_01_5069 | 1 | < 0.1% |
0589_08_5612 | 1 | < 0.1% |
ac_eng_0088_01_0317_01 | 1 | < 0.1% |
0078_20_5145 | 1 | < 0.1% |
ac_eng_0268_01_0339_02 | 1 | < 0.1% |
ac_eng_0098_01_0418_01 | 1 | < 0.1% |
ac_eng_0366_01_0278_01 | 1 | < 0.1% |
ac_eng_0014_01_0180_01 | 1 | < 0.1% |
0522_01_5561 | 1 | < 0.1% |
Other values (9990) | 9990 |
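Comparing the sample rows suggests that each URI_KHON value consists of `KH.`, the MDCENTER code, a dot, and the URI_KHDP value. This is an inference from the samples in this report, not a documented rule, and the helper name below is hypothetical.

```python
def build_uri_khon(mdcenter: str, uri_khdp: str) -> str:
    """Hypothetical helper: compose a URI_KHON value from its apparent parts."""
    return f"KH.{mdcenter}.{uri_khdp}"


# (MDCENTER, URI_KHDP, URI_KHON) triples taken from the sample rows.
samples = [
    ("AKS", "0539_23_5579", "KH.AKS.0539_23_5579"),
    ("AKS", "0059_08_5121", "KH.AKS.0059_08_5121"),
    ("AC", "AC_ENG_0369_01_0204_02", "KH.AC.AC_ENG_0369_01_0204_02"),
]

for mdcenter, uri_khdp, expected in samples:
    assert build_uri_khon(mdcenter, uri_khdp) == expected
print("all sample identifiers match the assumed pattern")
```

The reported mean lengths are consistent with this pattern: 3 (`KH.`) + 2.443 (MDCENTER) + 1 (dot) + 17.57 (URI_KHDP) = 24.013, the mean length of URI_KHON.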
Most occurring characters
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
1 | 19233 | |
5 | 10576 | 6.0% |
2 | 9261 | 5.3% |
3 | 8509 | 4.8% |
4 | 6518 | 3.7% |
6 | 5615 | 3.2% |
A | 5570 | 3.2% |
C | 5570 | 3.2% |
Other values (6) | 30823 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 111140 | |
Connector Punctuation | 36710 | 20.9% |
Uppercase Letter | 27850 | 15.9% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 37315 | |
1 | 19233 | |
5 | 10576 | 9.5% |
2 | 9261 | 8.3% |
3 | 8509 | 7.7% |
4 | 6518 | 5.9% |
6 | 5615 | 5.1% |
8 | 5192 | 4.7% |
7 | 4948 | 4.5% |
9 | 3973 | 3.6% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 5570 | |
C | 5570 | |
E | 5570 | |
N | 5570 | |
G | 5570 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 36710 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 147850 | |
Latin | 27850 | 15.9% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
1 | 19233 | |
5 | 10576 | 7.2% |
2 | 9261 | 6.3% |
3 | 8509 | 5.8% |
4 | 6518 | 4.4% |
6 | 5615 | 3.8% |
8 | 5192 | 3.5% |
7 | 4948 | 3.3% |
Latin
Value | Count | Frequency (%) |
A | 5570 | |
C | 5570 | |
E | 5570 | |
N | 5570 | |
G | 5570 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 175700 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 37315 | |
_ | 36710 | |
1 | 19233 | |
5 | 10576 | 6.0% |
2 | 9261 | 5.3% |
3 | 8509 | 4.8% |
4 | 6518 | 3.7% |
6 | 5615 | 3.2% |
A | 5570 | 3.2% |
C | 5570 | 3.2% |
Other values (6) | 30823 |
MAINTITLE
Text
Distinct | 6848 |
---|---|
Distinct (%) | 68.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 200 |
---|---|
Median length | 185 |
Mean length | 29.0554 |
Min length | 1 |
Characters and Unicode
Total characters | 290554 |
---|---|
Distinct characters | 1175 |
Distinct categories | 16 |
Distinct scripts | 5 |
Distinct blocks | 9 |
Unique
Unique | 6508 |
---|---|
Unique (%) | 65.1% |
Sample
1st row | 토끼타령 |
---|---|
2nd row | 곱새치기소리 |
3rd row | 창부타령 |
4th row | 창부타령 |
5th row | THERESA MULLIN, Bootle, Liverpool. |
Value | Count | Frequency (%) |
the | 2974 | 6.3% |
of | 1991 | 4.2% |
a | 1044 | 2.2% |
in | 1010 | 2.1% |
fig | 777 | 1.6% |
and | 693 | 1.5% |
at | 487 | 1.0% |
(blank) | 450 | 0.9% |
to | 357 | 0.8% |
창부타령 | 334 | 0.7% |
Other values (11469) | 37409 |
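The word table above appears to count lowercased, whitespace-separated tokens with the surrounding punctuation removed. The tokenizer that ydata-profiling actually uses is not shown in this report, so the sketch below is only an approximation, applied to the five sample titles; `tokenize` is a hypothetical helper.

```python
import string
from collections import Counter

# The five sample rows shown for MAINTITLE.
titles = [
    "토끼타령",
    "곱새치기소리",
    "창부타령",
    "창부타령",
    "THERESA MULLIN, Bootle, Liverpool.",
]


def tokenize(text):
    """Split on whitespace, strip ASCII punctuation, lowercase, drop empties."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    return [token for token in tokens if token]


word_counts = Counter(token for title in titles for token in tokenize(title))
print(word_counts.most_common(3))
```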
Most occurring characters
Value | Count | Frequency (%) |
(space) | 37551 | 12.9% |
e | 13500 | 4.6% |
a | 10409 | 3.6% |
n | 9545 | 3.3% |
o | 9263 | 3.2% |
i | 9152 | 3.1% |
E | 9092 | 3.1% |
A | 8705 | 3.0% |
t | 8351 | 2.9% |
r | 7766 | 2.7% |
Other values (1165) | 167220 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 114110 | |
Uppercase Letter | 93606 | |
Space Separator | 37551 | 12.9% |
Other Letter | 24041 | 8.3% |
Other Punctuation | 11681 | 4.0% |
Decimal Number | 4422 | 1.5% |
Dash Punctuation | 2057 | 0.7% |
Open Punctuation | 1528 | 0.5% |
Close Punctuation | 1526 | 0.5% |
Letter Number | 10 | < 0.1% |
Other values (6) | 22 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
리 | 1778 | 7.4% |
소 | 1332 | 5.5% |
대 | 1129 | 4.7% |
담 | 1097 | 4.6% |
는 | 819 | 3.4% |
타 | 786 | 3.3% |
령 | 778 | 3.2% |
아 | 657 | 2.7% |
가 | 462 | 1.9% |
이 | 412 | 1.7% |
Other values (1032) | 14791 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 13500 | |
a | 10409 | 9.1% |
n | 9545 | 8.4% |
o | 9263 | 8.1% |
i | 9152 | 8.0% |
t | 8351 | 7.3% |
r | 7766 | 6.8% |
s | 6587 | 5.8% |
h | 5571 | 4.9% |
l | 5012 | 4.4% |
Other values (41) | 28954 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 9092 | 9.7% |
A | 8705 | 9.3% |
T | 7257 | 7.8% |
I | 6809 | 7.3% |
N | 6777 | 7.2% |
S | 6350 | 6.8% |
O | 6123 | 6.5% |
R | 5976 | 6.4% |
H | 4586 | 4.9% |
C | 3571 | 3.8% |
Other values (30) | 28360 |
Other Punctuation
Value | Count | Frequency (%) |
. | 7487 | |
, | 2710 | 23.2% |
" | 583 | 5.0% |
' | 478 | 4.1% |
: | 182 | 1.6% |
; | 111 | 1.0% |
/ | 51 | 0.4% |
! | 38 | 0.3% |
& | 18 | 0.2% |
? | 18 | 0.2% |
Other values (2) | 5 | < 0.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 1003 | |
2 | 648 | |
3 | 407 | |
0 | 397 | 9.0% |
9 | 379 | 8.6% |
4 | 349 | 7.9% |
8 | 330 | 7.5% |
6 | 319 | 7.2% |
5 | 303 | 6.9% |
7 | 287 | 6.5% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 4 | |
Ⅹ | 1 | 10.0% |
Ⅵ | 1 | 10.0% |
Ⅰ | 1 | 10.0% |
Ⅴ | 1 | 10.0% |
Ⅷ | 1 | 10.0% |
Ⅳ | 1 | 10.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1524 | |
[ | 4 | 0.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1522 | |
] | 4 | 0.3% |
Math Symbol
Value | Count | Frequency (%) |
~ | 5 | |
= | 1 | 16.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 37551 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2057 |
Other Symbol
Value | Count | Frequency (%) |
★ | 5 |
Other Number
Value | Count | Frequency (%) |
½ | 4 |
Final Punctuation
Value | Count | Frequency (%) |
» | 3 |
Initial Punctuation
Value | Count | Frequency (%) |
« | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 207726 | |
Common | 58787 | 20.2% |
Hangul | 22610 | 7.8% |
Han | 1394 | 0.5% |
Katakana | 37 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
리 | 1778 | 7.9% |
소 | 1332 | 5.9% |
대 | 1129 | 5.0% |
담 | 1097 | 4.9% |
는 | 819 | 3.6% |
타 | 786 | 3.5% |
령 | 778 | 3.4% |
아 | 657 | 2.9% |
가 | 462 | 2.0% |
이 | 412 | 1.8% |
Other values (606) | 13360 |
Han
Value | Count | Frequency (%) |
京 | 56 | 4.0% |
城 | 52 | 3.7% |
大 | 41 | 2.9% |
道 | 38 | 2.7% |
山 | 35 | 2.5% |
天 | 32 | 2.3% |
奉 | 29 | 2.1% |
南 | 24 | 1.7% |
寺 | 24 | 1.7% |
運 | 24 | 1.7% |
Other values (397) | 1039 |
Latin
Value | Count | Frequency (%) |
e | 13500 | 6.5% |
a | 10409 | 5.0% |
n | 9545 | 4.6% |
o | 9263 | 4.5% |
i | 9152 | 4.4% |
E | 9092 | 4.4% |
A | 8705 | 4.2% |
t | 8351 | 4.0% |
r | 7766 | 3.7% |
T | 7257 | 3.5% |
Other values (88) | 114686 |
Common
Value | Count | Frequency (%) |
(space) | 37551 | |
. | 7487 | 12.7% |
, | 2710 | 4.6% |
- | 2057 | 3.5% |
( | 1524 | 2.6% |
) | 1522 | 2.6% |
1 | 1003 | 1.7% |
2 | 648 | 1.1% |
" | 583 | 1.0% |
' | 478 | 0.8% |
Other values (25) | 3224 | 5.5% |
Katakana
Value | Count | Frequency (%) |
ノ | 11 | |
イ | 4 | 10.8% |
ラ | 2 | 5.4% |
ア | 2 | 5.4% |
ヌ | 2 | 5.4% |
シ | 2 | 5.4% |
ロ | 2 | 5.4% |
タ | 1 | 2.7% |
コ | 1 | 2.7% |
ヶ | 1 | 2.7% |
Other values (9) | 9 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 265760 | |
Hangul | 22610 | 7.8% |
CJK | 1393 | 0.5% |
None | 734 | 0.3% |
Katakana | 37 | < 0.1% |
Number Forms | 10 | < 0.1% |
Misc Symbols | 5 | < 0.1% |
Punctuation | 4 | < 0.1% |
CJK Ext A | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 37551 | 14.1% |
e | 13500 | 5.1% |
a | 10409 | 3.9% |
n | 9545 | 3.6% |
o | 9263 | 3.5% |
i | 9152 | 3.4% |
E | 9092 | 3.4% |
A | 8705 | 3.3% |
t | 8351 | 3.1% |
r | 7766 | 2.9% |
Other values (71) | 142426 |
Hangul
Value | Count | Frequency (%) |
리 | 1778 | 7.9% |
소 | 1332 | 5.9% |
대 | 1129 | 5.0% |
담 | 1097 | 4.9% |
는 | 819 | 3.6% |
타 | 786 | 3.5% |
령 | 778 | 3.4% |
아 | 657 | 2.9% |
가 | 462 | 2.0% |
이 | 412 | 1.8% |
Other values (606) | 13360 |
None
Value | Count | Frequency (%) |
ŏ | 164 | |
é | 153 | |
è | 59 | 8.0% |
ô | 53 | 7.2% |
ō | 47 | 6.4% |
â | 26 | 3.5% |
à | 21 | 2.9% |
É | 18 | 2.5% |
ê | 18 | 2.5% |
Ö | 14 | 1.9% |
Other values (33) | 161 |
CJK
Value | Count | Frequency (%) |
京 | 56 | 4.0% |
城 | 52 | 3.7% |
大 | 41 | 2.9% |
道 | 38 | 2.7% |
山 | 35 | 2.5% |
天 | 32 | 2.3% |
奉 | 29 | 2.1% |
南 | 24 | 1.7% |
寺 | 24 | 1.7% |
運 | 24 | 1.7% |
Other values (396) | 1038 |
Katakana
Value | Count | Frequency (%) |
ノ | 11 | |
イ | 4 | 10.8% |
ラ | 2 | 5.4% |
ア | 2 | 5.4% |
ヌ | 2 | 5.4% |
シ | 2 | 5.4% |
ロ | 2 | 5.4% |
タ | 1 | 2.7% |
コ | 1 | 2.7% |
ヶ | 1 | 2.7% |
Other values (9) | 9 |
Misc Symbols
Value | Count | Frequency (%) |
★ | 5 |
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 4 | |
Ⅹ | 1 | 10.0% |
Ⅵ | 1 | 10.0% |
Ⅰ | 1 | 10.0% |
Ⅴ | 1 | 10.0% |
Ⅷ | 1 | 10.0% |
Ⅳ | 1 | 10.0% |
Punctuation
Value | Count | Frequency (%) |
… | 4 |
CJK Ext A
Value | Count | Frequency (%) |
㕍 | 1 |
ALTERNATIVE
Text
MISSING
 
Distinct | 5365 |
---|---|
Distinct (%) | 96.3% |
Missing | 4430 |
Missing (%) | 44.3% |
Memory size | 156.2 KiB |
Length
Max length | 200 |
---|---|
Median length | 125 |
Mean length | 18.92675 |
Min length | 1 |
Characters and Unicode
Total characters | 105422 |
---|---|
Distinct characters | 1224 |
Distinct categories | 12 |
Distinct scripts | 4 |
Distinct blocks | 7 |
Unique
Unique | 5245 |
---|---|
Unique (%) | 94.2% |
Sample
1st row | 테레사 뮬린, 부틀, 리버풀 |
---|---|
2nd row | 달간의 미국인 학생들 |
3rd row | 옛 지방 관청 |
4th row | 조지 케네디, 카타운 하우스, 킬디모 |
5th row | 닛코 위쪽에 있는 케곤 폭포는 "우주의 신비를 풀 수 없어" 자살하는 일본 젊은이들의 성지이다 |
Value | Count | Frequency (%) |
(blank) | 357 | 1.3% |
있는 | 225 | 0.8% |
일본 | 221 | 0.8% |
중국 | 162 | 0.6% |
있다 | 159 | 0.6% |
서울 | 148 | 0.6% |
신부 | 133 | 0.5% |
모습 | 128 | 0.5% |
fig | 108 | 0.4% |
한국의 | 101 | 0.4% |
Other values (13007) | 25070 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 21271 | 20.2% |
의 | 2930 | 2.8% |
, | 1772 | 1.7% |
. | 1502 | 1.4% |
이 | 1495 | 1.4% |
에 | 1399 | 1.3% |
는 | 1282 | 1.2% |
사 | 1092 | 1.0% |
다 | 1073 | 1.0% |
서 | 1002 | 1.0% |
Other values (1214) | 70604 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 71411 | |
Space Separator | 21271 | 20.2% |
Other Punctuation | 4176 | 4.0% |
Decimal Number | 2963 | 2.8% |
Uppercase Letter | 2239 | 2.1% |
Lowercase Letter | 2188 | 2.1% |
Dash Punctuation | 487 | 0.5% |
Open Punctuation | 339 | 0.3% |
Close Punctuation | 339 | 0.3% |
Math Symbol | 5 | < 0.1% |
Other values (2) | 4 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
의 | 2930 | 4.1% |
이 | 1495 | 2.1% |
에 | 1399 | 2.0% |
는 | 1282 | 1.8% |
사 | 1092 | 1.5% |
다 | 1073 | 1.5% |
서 | 1002 | 1.4% |
한 | 991 | 1.4% |
국 | 944 | 1.3% |
리 | 898 | 1.3% |
Other values (1111) | 58305 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 267 | |
i | 230 | 10.5% |
n | 222 | 10.1% |
e | 176 | 8.0% |
o | 148 | 6.8% |
u | 130 | 5.9% |
h | 109 | 5.0% |
s | 97 | 4.4% |
g | 96 | 4.4% |
r | 86 | 3.9% |
Other values (28) | 627 |
Uppercase Letter
Value | Count | Frequency (%) |
I | 241 | 10.8% |
G | 179 | 8.0% |
N | 174 | 7.8% |
A | 161 | 7.2% |
S | 154 | 6.9% |
F | 137 | 6.1% |
H | 118 | 5.3% |
T | 116 | 5.2% |
O | 108 | 4.8% |
M | 102 | 4.6% |
Other values (21) | 749 |
Other Punctuation
Value | Count | Frequency (%) |
, | 1772 | |
. | 1502 | |
" | 494 | 11.8% |
: | 170 | 4.1% |
; | 92 | 2.2% |
/ | 54 | 1.3% |
' | 30 | 0.7% |
! | 25 | 0.6% |
? | 15 | 0.4% |
& | 12 | 0.3% |
Other values (2) | 10 | 0.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 738 | |
2 | 423 | |
0 | 292 | 9.9% |
9 | 257 | 8.7% |
3 | 243 | 8.2% |
4 | 208 | 7.0% |
6 | 208 | 7.0% |
8 | 206 | 7.0% |
5 | 201 | 6.8% |
7 | 187 | 6.3% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅷ | 1 | |
Ⅹ | 1 |
Open Punctuation
Value | Count | Frequency (%) |
( | 338 | |
[ | 1 | 0.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 338 | |
] | 1 | 0.3% |
Math Symbol
Value | Count | Frequency (%) |
~ | 4 | |
+ | 1 | 20.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 21271 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 487 |
Other Number
Value | Count | Frequency (%) |
½ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 71320 | |
Common | 29581 | |
Latin | 4430 | 4.2% |
Han | 91 | 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
의 | 2930 | 4.1% |
이 | 1495 | 2.1% |
에 | 1399 | 2.0% |
는 | 1282 | 1.8% |
사 | 1092 | 1.5% |
다 | 1073 | 1.5% |
서 | 1002 | 1.4% |
한 | 991 | 1.4% |
국 | 944 | 1.3% |
리 | 898 | 1.3% |
Other values (1046) | 58214 |
Latin
Value | Count | Frequency (%) |
a | 267 | 6.0% |
I | 241 | 5.4% |
i | 230 | 5.2% |
n | 222 | 5.0% |
G | 179 | 4.0% |
e | 176 | 4.0% |
N | 174 | 3.9% |
A | 161 | 3.6% |
S | 154 | 3.5% |
o | 148 | 3.3% |
Other values (62) | 2478 |
Han
Value | Count | Frequency (%) |
刀 | 12 | 13.2% |
太 | 11 | 12.1% |
德 | 2 | 2.2% |
水 | 2 | 2.2% |
山 | 2 | 2.2% |
故 | 2 | 2.2% |
寺 | 2 | 2.2% |
皇 | 1 | 1.1% |
位 | 1 | 1.1% |
泉 | 1 | 1.1% |
Other values (55) | 55 |
Common
Value | Count | Frequency (%) |
(space) | 21271 | |
, | 1772 | 6.0% |
. | 1502 | 5.1% |
1 | 738 | 2.5% |
" | 494 | 1.7% |
- | 487 | 1.6% |
2 | 423 | 1.4% |
( | 338 | 1.1% |
) | 338 | 1.1% |
0 | 292 | 1.0% |
Other values (21) | 1926 | 6.5% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 71319 | |
ASCII | 33912 | |
CJK | 91 | 0.1% |
None | 87 | 0.1% |
Punctuation | 9 | < 0.1% |
Number Forms | 3 | < 0.1% |
Compat Jamo | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 21271 | |
, | 1772 | 5.2% |
. | 1502 | 4.4% |
1 | 738 | 2.2% |
" | 494 | 1.5% |
- | 487 | 1.4% |
2 | 423 | 1.2% |
( | 338 | 1.0% |
) | 338 | 1.0% |
0 | 292 | 0.9% |
Other values (70) | 6257 | 18.5% |
Hangul
Value | Count | Frequency (%) |
의 | 2930 | 4.1% |
이 | 1495 | 2.1% |
에 | 1399 | 2.0% |
는 | 1282 | 1.8% |
사 | 1092 | 1.5% |
다 | 1073 | 1.5% |
서 | 1002 | 1.4% |
한 | 991 | 1.4% |
국 | 944 | 1.3% |
리 | 898 | 1.3% |
Other values (1045) | 58213 |
None
Value | Count | Frequency (%) |
ô | 24 | |
â | 24 | |
î | 8 | 9.2% |
ç | 6 | 6.9% |
ñ | 4 | 4.6% |
û | 4 | 4.6% |
É | 2 | 2.3% |
ö | 2 | 2.3% |
ä | 2 | 2.3% |
é | 2 | 2.3% |
Other values (9) | 9 | 10.3% |
CJK
Value | Count | Frequency (%) |
刀 | 12 | 13.2% |
太 | 11 | 12.1% |
德 | 2 | 2.2% |
水 | 2 | 2.2% |
山 | 2 | 2.2% |
故 | 2 | 2.2% |
寺 | 2 | 2.2% |
皇 | 1 | 1.1% |
位 | 1 | 1.1% |
泉 | 1 | 1.1% |
Other values (55) | 55 |
Punctuation
Value | Count | Frequency (%) |
… | 9 |
Compat Jamo
Value | Count | Frequency (%) |
ㅣ | 1 |
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅷ | 1 | |
Ⅹ | 1 |
DOCSENDER
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
EDITOR
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
AUTHOR
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
SUBJECT_KHON1
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 5 |
---|---|
Median length | 5 |
Mean length | 5 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | KH.13 |
---|---|
2nd row | KH.13 |
3rd row | KH.13 |
4th row | KH.13 |
5th row | KH.13 |
Common Values
Value | Count | Frequency (%) |
KH.13 | 10000 | 100.0% |
SUBJECT_KHON2
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | KH.13.02 |
---|---|
2nd row | KH.13.02 |
3rd row | KH.13.02 |
4th row | KH.13.02 |
5th row | KH.13.01 |
Common Values
Value | Count | Frequency (%) |
KH.13.01 | 5570 | 55.7% |
KH.13.02 | 4430 | 44.3% |
SUBJECT_KHDP
Categorical
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 2 |
---|---|
Median length | 1 |
Mean length | 1.443 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | um |
---|---|
2nd row | um |
3rd row | um |
4th row | um |
5th row | d |
Common Values
Value | Count | Frequency (%) |
um | 4430 | 44.3% |
e | 2261 | 22.6% |
c | 2018 | 20.2% |
d | 1043 | 10.4% |
a | 151 | 1.5% |
b | 97 | 1.0% |
TYPE
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
UNIT
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2 |
---|---|
2nd row | 2 |
3rd row | 2 |
4th row | 2 |
5th row | 2 |
Common Values
Value | Count | Frequency (%) |
2 | 10000 | 100.0% |
PUBLISHER
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
FORMAT_MEDIUM
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 18 |
---|---|
Median length | 10 |
Mean length | 13.544 |
Min length | 10 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | text/xml|audio/asf |
---|---|
2nd row | text/xml|audio/asf |
3rd row | text/xml|audio/asf |
4th row | text/xml|audio/asf |
5th row | image/jpeg |
Common Values
Value | Count | Frequency (%) |
image/jpeg | 5570 | 55.7% |
text/xml|audio/asf | 4430 | 44.3% |
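The value `text/xml|audio/asf` looks like two MIME types joined by a pipe character. Assuming `|` is a multi-value separator inside FORMAT_MEDIUM (the report does not state this), the per-type counts can be recovered as follows:

```python
from collections import Counter

# (value, count) pairs from the Common Values table.
format_medium = [("image/jpeg", 5570), ("text/xml|audio/asf", 4430)]

mime_counts = Counter()
for value, count in format_medium:
    # Assumption: "|" separates several MIME types within a single cell.
    for mime in value.split("|"):
        mime_counts[mime] += count

print(dict(mime_counts))
```

Under this assumption, every record that carries an XML text part also carries an ASF audio part, and the remaining records are JPEG images.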
TABLEOFCONTENTS
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
ABSTRACT
Text
MISSING
 
Distinct | 261 |
---|---|
Distinct (%) | 4.7% |
Missing | 4430 |
Missing (%) | 44.3% |
Memory size | 156.2 KiB |
Length
Max length | 1024 |
---|---|
Median length | 603 |
Mean length | 77.147935 |
Min length | 14 |
Characters and Unicode
Total characters | 429714 |
---|---|
Distinct characters | 374 |
Distinct categories | 13 |
Distinct scripts | 4 |
Distinct blocks | 5 |
Unique
Unique | 102 |
---|---|
Unique (%) | 1.8% |
Sample
1st row | 출전 : The Far East ( ) |
---|---|
2nd row | 출전 : The Far East ( ) |
3rd row | 출전 : Economic history of Chosen ( comp. in commemoration of the decennial of the Bank of Chosen ) |
4th row | 출전 : The Far East ( ) |
5th row | 출전 : Japan and Korea ( ) |
Value | Count | Frequency (%) |
(blank) | 17432 | |
the | 6987 | 8.1% |
출전 | 5570 | 6.4% |
of | 4429 | 5.1% |
and | 3423 | 4.0% |
east | 2159 | 2.5% |
far | 1767 | 2.0% |
in | 1588 | 1.8% |
by | 1540 | 1.8% |
japan | 1251 | 1.4% |
Other values (2503) | 40302 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 80788 | |
a | 31923 | 7.4% |
e | 29626 | 6.9% |
n | 23425 | 5.5% |
o | 21682 | 5.0% |
t | 20685 | 4.8% |
i | 20373 | 4.7% |
s | 19884 | 4.6% |
r | 19655 | 4.6% |
h | 15736 | 3.7% |
Other values (364) | 145937 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 272229 | |
Space Separator | 80812 | 18.8% |
Uppercase Letter | 31139 | 7.2% |
Other Punctuation | 16578 | 3.9% |
Other Letter | 12215 | 2.8% |
Open Punctuation | 5744 | 1.3% |
Close Punctuation | 5743 | 1.3% |
Decimal Number | 3676 | 0.9% |
Dash Punctuation | 1382 | 0.3% |
Control | 190 | < 0.1% |
Other values (3) | 6 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
전 | 5579 | |
출 | 5570 | |
목 | 103 | 0.8% |
제 | 101 | 0.8% |
이 | 38 | 0.3% |
에 | 29 | 0.2% |
의 | 29 | 0.2% |
다 | 22 | 0.2% |
사 | 19 | 0.2% |
서 | 14 | 0.1% |
Other values (272) | 711 | 5.8% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 31923 | |
e | 29626 | |
n | 23425 | |
o | 21682 | 8.0% |
t | 20685 | 7.6% |
i | 20373 | 7.5% |
s | 19884 | 7.3% |
r | 19655 | 7.2% |
h | 15736 | 5.8% |
l | 11241 | 4.1% |
Other values (26) | 57999 |
Uppercase Letter
Value | Count | Frequency (%) |
T | 3482 | |
E | 3452 | |
C | 3410 | |
F | 3061 | |
J | 2375 | 7.6% |
K | 2140 | 6.9% |
A | 1864 | 6.0% |
M | 1560 | 5.0% |
P | 1272 | 4.1% |
S | 1086 | 3.5% |
Other values (17) | 7437 |
Decimal Number
Value | Count | Frequency (%) |
1 | 942 | |
8 | 583 | |
6 | 503 | |
0 | 357 | 9.7% |
5 | 334 | 9.1% |
4 | 329 | 8.9% |
9 | 283 | 7.7% |
2 | 156 | 4.2% |
3 | 149 | 4.1% |
7 | 40 | 1.1% |
Other Punctuation
Value | Count | Frequency (%) |
: | 5941 | |
, | 5175 | |
. | 3540 | |
; | 987 | 6.0% |
' | 583 | 3.5% |
& | 192 | 1.2% |
/ | 134 | 0.8% |
" | 26 | 0.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 80788 | |
(non-ASCII space) | 24 | < 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 5706 | |
[ | 38 | 0.7% |
Close Punctuation
Value | Count | Frequency (%) |
) | 5705 | |
] | 38 | 0.7% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1382 |
Control
Value | Count | Frequency (%) |
(control character) | 190 |
Final Punctuation
Value | Count | Frequency (%) |
’ | 4 |
Other Number
Value | Count | Frequency (%) |
½ | 1 |
Currency Symbol
Value | Count | Frequency (%) |
£ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 303368 | |
Common | 114131 | 26.6% |
Hangul | 12121 | 2.8% |
Han | 94 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
전 | 5579 | |
출 | 5570 | |
목 | 103 | 0.8% |
제 | 101 | 0.8% |
이 | 38 | 0.3% |
에 | 29 | 0.2% |
의 | 29 | 0.2% |
다 | 22 | 0.2% |
사 | 19 | 0.2% |
서 | 14 | 0.1% |
Other values (214) | 617 | 5.1% |
Latin
Value | Count | Frequency (%) |
a | 31923 | 10.5% |
e | 29626 | 9.8% |
n | 23425 | 7.7% |
o | 21682 | 7.1% |
t | 20685 | 6.8% |
i | 20373 | 6.7% |
s | 19884 | 6.6% |
r | 19655 | 6.5% |
h | 15736 | 5.2% |
l | 11241 | 3.7% |
Other values (53) | 89138 |
Han
Value | Count | Frequency (%) |
元 | 8 | 8.5% |
寶 | 7 | 7.4% |
山 | 3 | 3.2% |
五 | 3 | 3.2% |
大 | 3 | 3.2% |
光 | 3 | 3.2% |
緖 | 3 | 3.2% |
通 | 2 | 2.1% |
吉 | 2 | 2.1% |
林 | 2 | 2.1% |
Other values (48) | 58 |
Common
Value | Count | Frequency (%) |
(space) | 80788 | |
: | 5941 | 5.2% |
( | 5706 | 5.0% |
) | 5705 | 5.0% |
, | 5175 | 4.5% |
. | 3540 | 3.1% |
- | 1382 | 1.2% |
; | 987 | 0.9% |
1 | 942 | 0.8% |
8 | 583 | 0.5% |
Other values (19) | 3382 | 3.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 417111 | |
Hangul | 12121 | 2.8% |
None | 384 | 0.1% |
CJK | 94 | < 0.1% |
Punctuation | 4 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 80788 | |
a | 31923 | 7.7% |
e | 29626 | 7.1% |
n | 23425 | 5.6% |
o | 21682 | 5.2% |
t | 20685 | 5.0% |
i | 20373 | 4.9% |
s | 19884 | 4.8% |
r | 19655 | 4.7% |
h | 15736 | 3.8% |
Other values (67) | 133334 |
Hangul
Value | Count | Frequency (%) |
전 | 5579 | |
출 | 5570 | |
목 | 103 | 0.8% |
제 | 101 | 0.8% |
이 | 38 | 0.3% |
에 | 29 | 0.2% |
의 | 29 | 0.2% |
다 | 22 | 0.2% |
사 | 19 | 0.2% |
서 | 14 | 0.1% |
Other values (214) | 617 | 5.1% |
None
Value | Count | Frequency (%) |
ō | 196 | |
é | 124 | |
(non-ASCII space) | 24 | 6.2% |
ö | 14 | 3.6% |
æ | 6 | 1.6% |
ü | 6 | 1.6% |
ä | 3 | 0.8% |
ï | 2 | 0.5% |
Ü | 2 | 0.5% |
è | 2 | 0.5% |
Other values (4) | 5 | 1.3% |
CJK
Value | Count | Frequency (%) |
元 | 8 | 8.5% |
寶 | 7 | 7.4% |
山 | 3 | 3.2% |
五 | 3 | 3.2% |
大 | 3 | 3.2% |
光 | 3 | 3.2% |
緖 | 3 | 3.2% |
通 | 2 | 2.1% |
吉 | 2 | 2.1% |
林 | 2 | 2.1% |
Other values (48) | 58 |
Punctuation
Value | Count | Frequency (%) |
’ | 4 |
ISPARTOF_ID
Text
MISSING
 
Distinct | 643 |
---|---|
Distinct (%) | 14.5% |
Missing | 5570 |
Missing (%) | 55.7% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 15 |
Mean length | 15 |
Min length | 15 |
Characters and Unicode
Total characters | 66450 |
---|---|
Distinct characters | 19 |
Distinct categories | 5 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 62 |
---|---|
Unique (%) | 1.4% |
Sample
1st row | KH.AKS.msu_5579 |
---|---|
2nd row | KH.AKS.msu_5121 |
3rd row | KH.AKS.msu_5705 |
4th row | KH.AKS.msu_5825 |
5th row | KH.AKS.msu_5288 |
Value | Count | Frequency (%) |
kh.aks.msu_5731 | 31 | 0.7% |
kh.aks.msu_5470 | 31 | 0.7% |
kh.aks.msu_5806 | 30 | 0.7% |
kh.aks.msu_5856 | 29 | 0.7% |
kh.aks.msu_5851 | 28 | 0.6% |
kh.aks.msu_5835 | 28 | 0.6% |
kh.aks.msu_5847 | 26 | 0.6% |
kh.aks.msu_5821 | 26 | 0.6% |
kh.aks.msu_5153 | 25 | 0.6% |
kh.aks.msu_5853 | 24 | 0.5% |
Other values (633) | 4152 | 93.7% |
Most occurring characters
Value | Count | Frequency (%) |
K | 8860 | 13.3% |
. | 8860 | 13.3% |
5 | 6005 | 9.0% |
A | 4430 | 6.7% |
S | 4430 | 6.7% |
m | 4430 | 6.7% |
s | 4430 | 6.7% |
u | 4430 | 6.7% |
_ | 4430 | 6.7% |
H | 4430 | 6.7% |
Other values (9) | 11715 | 17.6% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 22150 | 33.3% |
Decimal Number | 17720 | 26.7% |
Lowercase Letter | 13290 | 20.0% |
Other Punctuation | 8860 | 13.3% |
Connector Punctuation | 4430 | 6.7% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
5 | 6005 | 33.9% |
8 | 1947 | 11.0% |
6 | 1517 | 8.6% |
7 | 1481 | 8.4% |
4 | 1465 | 8.3% |
0 | 1323 | 7.5% |
1 | 1222 | 6.9% |
2 | 1017 | 5.7% |
3 | 962 | 5.4% |
9 | 781 | 4.4% |
Uppercase Letter
Value | Count | Frequency (%) |
K | 8860 | 40.0% |
A | 4430 | 20.0% |
S | 4430 | 20.0% |
H | 4430 | 20.0% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 4430 | 33.3% |
s | 4430 | 33.3% |
u | 4430 | 33.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 8860 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 4430 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 35440 | 53.3% |
Common | 31010 | 46.7% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
. | 8860 | 28.6% |
5 | 6005 | 19.4% |
_ | 4430 | 14.3% |
8 | 1947 | 6.3% |
6 | 1517 | 4.9% |
7 | 1481 | 4.8% |
4 | 1465 | 4.7% |
0 | 1323 | 4.3% |
1 | 1222 | 3.9% |
2 | 1017 | 3.3% |
Other values (2) | 1743 | 5.6% |
Latin
Value | Count | Frequency (%) |
K | 8860 | 25.0% |
A | 4430 | 12.5% |
S | 4430 | 12.5% |
m | 4430 | 12.5% |
s | 4430 | 12.5% |
u | 4430 | 12.5% |
H | 4430 | 12.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 66450 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
K | 8860 | 13.3% |
. | 8860 | 13.3% |
5 | 6005 | 9.0% |
A | 4430 | 6.7% |
S | 4430 | 6.7% |
m | 4430 | 6.7% |
s | 4430 | 6.7% |
u | 4430 | 6.7% |
_ | 4430 | 6.7% |
H | 4430 | 6.7% |
Other values (9) | 11715 | 17.6% |
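The figures above (a constant length of 15, exactly 19 distinct characters, four digits per value) are consistent with every ISPARTOF_ID following the layout `KH.AKS.msu_NNNN`. A minimal sketch of a format check, assuming that inferred layout; the pattern and the function name `is_valid_ispartof_id` are illustrative and do not come from a published specification:

```python
import re

# Layout inferred from the report's samples and character counts (an assumption).
ISPARTOF_ID_RE = re.compile(r"KH\.AKS\.msu_\d{4}")


def is_valid_ispartof_id(value):
    """Return True when value is a string matching the inferred KH.AKS.msu_NNNN layout."""
    return isinstance(value, str) and ISPARTOF_ID_RE.fullmatch(value) is not None
```

Missing entries (55.7% of rows) fail the check, so it is probably best applied only to non-missing values.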
ISPARTOF
Text
MISSING
 
Distinct | 240 |
---|---|
Distinct (%) | 5.4% |
Missing | 5570 |
Missing (%) | 55.7% |
Memory size | 156.2 KiB |
Length
Max length | 14 |
---|---|
Median length | 12 |
Mean length | 11.715124 |
Min length | 3 |
Characters and Unicode
Total characters | 51898 |
---|---|
Distinct characters | 168 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 10 |
---|---|
Unique (%) | 0.2% |
Sample
1st row | 전라남도 여수시 화정면 |
---|---|
2nd row | 강원도 정선군 임계면 |
3rd row | 경상북도 문경시 가은읍 |
4th row | 경상남도 사천시 남양동 |
5th row | 대전광역시 대덕구 신탄진동 |
Value | Count | Frequency (%) |
전라남도 | 1553 | 11.8% |
경상남도 | 843 | 6.4% |
강원도 | 627 | 4.8% |
경상북도 | 555 | 4.2% |
사천시 | 546 | 4.2% |
여수시 | 421 | 3.2% |
남해군 | 280 | 2.1% |
부산광역시 | 278 | 2.1% |
진도군 | 238 | 1.8% |
대구광역시 | 200 | 1.5% |
Other values (307) | 7594 | 57.8% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 8705 | 16.8% |
도 | 4455 | 8.6% |
남 | 2984 | 5.7% |
면 | 2961 | 5.7% |
시 | 2647 | 5.1% |
군 | 2104 | 4.1% |
전 | 1760 | 3.4% |
경 | 1722 | 3.3% |
라 | 1720 | 3.3% |
상 | 1549 | 3.0% |
Other values (158) | 21291 | 41.0% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 43193 | 83.2% |
Space Separator | 8705 | 16.8% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
도 | 4455 | 10.3% |
남 | 2984 | 6.9% |
면 | 2961 | 6.9% |
시 | 2647 | 6.1% |
군 | 2104 | 4.9% |
전 | 1760 | 4.1% |
경 | 1722 | 4.0% |
라 | 1720 | 4.0% |
상 | 1549 | 3.6% |
동 | 1297 | 3.0% |
Other values (157) | 19994 | 46.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 8705 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 43193 | 83.2% |
Common | 8705 | 16.8% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
도 | 4455 | 10.3% |
남 | 2984 | 6.9% |
면 | 2961 | 6.9% |
시 | 2647 | 6.1% |
군 | 2104 | 4.9% |
전 | 1760 | 4.1% |
경 | 1722 | 4.0% |
라 | 1720 | 4.0% |
상 | 1549 | 3.6% |
동 | 1297 | 3.0% |
Other values (157) | 19994 | 46.3% |
Common
Value | Count | Frequency (%) |
(space) | 8705 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 43193 | 83.2% |
ASCII | 8705 | 16.8% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 8705 | 100.0% |
Hangul
Value | Count | Frequency (%) |
도 | 4455 | 10.3% |
남 | 2984 | 6.9% |
면 | 2961 | 6.9% |
시 | 2647 | 6.1% |
군 | 2104 | 4.9% |
전 | 1760 | 4.1% |
경 | 1722 | 4.0% |
라 | 1720 | 4.0% |
상 | 1549 | 3.6% |
동 | 1297 | 3.0% |
Other values (157) | 19994 | 46.3% |
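The ISPARTOF samples are space-separated place names such as `전라남도 여수시 화정면`, and the minimum length of 3 suggests that some rows hold fewer than three levels. A minimal sketch of splitting such a value into up to three parts; the function name `split_place` and the reading of the parts as province, city or county, and district are assumptions for illustration:

```python
def split_place(value):
    """Split a space-separated place name into a 3-tuple, padding with None."""
    if not value:
        return (None, None, None)
    # maxsplit=2 keeps any extra tokens together in the third element.
    parts = value.split(maxsplit=2)
    parts += [None] * (3 - len(parts))
    return tuple(parts)
```

For example, `split_place("강원도")` yields `("강원도", None, None)`, so shorter values stay aligned with the full three-level ones.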
REQUIRES
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
DATEEVENT
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
DOCCREATED
Text
MISSING
 
Distinct | 273 |
---|---|
Distinct (%) | 6.2% |
Missing | 5570 |
Missing (%) | 55.7% |
Memory size | 156.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 9 |
Mean length | 8.3340858 |
Min length | 2 |
Characters and Unicode
Total characters | 36920 |
---|---|
Distinct characters | 14 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 15 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | 1992.7.17 |
---|---|
2nd row | 1986.6.24 |
3rd row | 1985.8.6 |
4th row | 1998.11.13 |
5th row | 1980.8.4 |
Value | Count | Frequency (%) |
미상 | 481 | 10.9% |
1998.11.13 | 263 | 5.9% |
1998.11.14 | 152 | 3.4% |
2002.11.8 | 72 | 1.6% |
1996-99-99 | 70 | 1.6% |
1997.10.4 | 66 | 1.5% |
1996.7.19 | 55 | 1.2% |
1993.11.13 | 52 | 1.2% |
1993.7.9 | 51 | 1.2% |
1997.10.5 | 49 | 1.1% |
Other values (263) | 3119 | 70.4% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 8924 | 24.2% |
9 | 7957 | 21.6% |
. | 7696 | 20.8% |
2 | 2366 | 6.4% |
0 | 1753 | 4.7% |
7 | 1553 | 4.2% |
8 | 1514 | 4.1% |
3 | 1228 | 3.3% |
5 | 1227 | 3.3% |
6 | 909 | 2.5% |
Other values (4) | 1793 | 4.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 28114 | 76.1% |
Other Punctuation | 7696 | 20.8% |
Other Letter | 962 | 2.6% |
Dash Punctuation | 148 | 0.4% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 8924 | 31.7% |
9 | 7957 | 28.3% |
2 | 2366 | 8.4% |
0 | 1753 | 6.2% |
7 | 1553 | 5.5% |
8 | 1514 | 5.4% |
3 | 1228 | 4.4% |
5 | 1227 | 4.4% |
6 | 909 | 3.2% |
4 | 683 | 2.4% |
Other Letter
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 7696 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 148 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 35958 | 97.4% |
Hangul | 962 | 2.6% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 8924 | 24.8% |
9 | 7957 | 22.1% |
. | 7696 | 21.4% |
2 | 2366 | 6.6% |
0 | 1753 | 4.9% |
7 | 1553 | 4.3% |
8 | 1514 | 4.2% |
3 | 1228 | 3.4% |
5 | 1227 | 3.4% |
6 | 909 | 2.5% |
Other values (2) | 831 | 2.3% |
Hangul
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 35958 | 97.4% |
Hangul | 962 | 2.6% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 8924 | 24.8% |
9 | 7957 | 22.1% |
. | 7696 | 21.4% |
2 | 2366 | 6.6% |
0 | 1753 | 4.9% |
7 | 1553 | 4.3% |
8 | 1514 | 4.2% |
3 | 1228 | 3.4% |
5 | 1227 | 3.4% |
6 | 909 | 2.5% |
Other values (2) | 831 | 2.3% |
Hangul
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
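DOCCREATED mixes at least three kinds of value: dotted dates without zero padding (`1992.7.17`), the literal `미상` (Korean for "unknown", 481 rows), and dashed placeholders such as `1996-99-99`. A minimal sketch of a tolerant parser, assuming that `99` stands in for an unknown month or day (the source does not say so) and using a hypothetical function name `parse_doccreated`:

```python
import re
from datetime import date

UNKNOWN = "미상"  # Korean for "unknown"; the most common DOCCREATED value
DATE_RE = re.compile(r"(\d{4})[.-](\d{1,2})[.-](\d{1,2})")


def parse_doccreated(value):
    """Return a datetime.date, or None when the value is missing, unknown, or incomplete."""
    if not isinstance(value, str) or value.strip() == UNKNOWN:
        return None
    match = DATE_RE.fullmatch(value.strip())
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Out-of-range parts such as "1996-99-99" are treated as incomplete dates.
        return None
```

Returning `None` for placeholders discards the year in values like `1996-99-99`; a caller that needs year-level precision would have to keep the year separately.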
DOCISSUED
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
DATE_ISSUED
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1900-01-01 00:00:00 | 5570 |
---|---|
2007-11-29 00:00:00 | 4430 |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2007-11-29 00:00:00 |
---|---|
2nd row | 2007-11-29 00:00:00 |
3rd row | 2007-11-29 00:00:00 |
4th row | 2007-11-29 00:00:00 |
5th row | 1900-01-01 00:00:00 |
Common Values
Value | Count | Frequency (%) |
1900-01-01 00:00:00 | 5570 | 55.7% |
2007-11-29 00:00:00 | 4430 | 44.3% |
Common Values (Plot)
Value | Count | Frequency (%) |
00:00:00 | 10000 | |
1900-01-01 | 5570 | |
2007-11-29 | 4430 |
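In DATE_ISSUED, `1900-01-01 00:00:00` appears 5570 times, the same count as the rows missing DOCCREATED and ISPARTOF. That pattern suggests a placeholder rather than a real issue date, although the source does not confirm it. A minimal sketch that parses the timestamp strings and maps the assumed placeholder to `None`; `SENTINEL` and `parse_timestamp` are hypothetical names:

```python
from datetime import datetime

# Assumed placeholder for "no date"; this reading is not confirmed by the source.
SENTINEL = datetime(1900, 1, 1)


def parse_timestamp(value, sentinel=SENTINEL):
    """Parse 'YYYY-MM-DD HH:MM:SS' and return None for the assumed placeholder."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return None if parsed == sentinel else parsed
```

The same function would apply to DATE_CREATED, where the 1900-01-01 value occurs 12 times.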
DATE_CREATED
Categorical
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2006-12-20 00:00:00 | 5570 |
---|---|
2003-11-01 00:00:00 | 4418 |
1900-01-01 00:00:00 | 12 |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2003-11-01 00:00:00 |
---|---|
2nd row | 2003-11-01 00:00:00 |
3rd row | 2003-11-01 00:00:00 |
4th row | 2003-11-01 00:00:00 |
5th row | 2006-12-20 00:00:00 |
Common Values
Value | Count | Frequency (%) |
2006-12-20 00:00:00 | 5570 | 55.7% |
2003-11-01 00:00:00 | 4418 | 44.2% |
1900-01-01 00:00:00 | 12 | 0.1% |
Common Values (Plot)
Value | Count | Frequency (%) |
00:00:00 | 10000 | |
2006-12-20 | 5570 | |
2003-11-01 | 4418 | |
1900-01-01 | 12 | 0.1% |
DATE_MODIFIED
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2007-10-16 00:00:00 | 5570 |
---|---|
2008-04-21 00:00:00 | 4430 |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2008-04-21 00:00:00 |
---|---|
2nd row | 2008-04-21 00:00:00 |
3rd row | 2008-04-21 00:00:00 |
4th row | 2008-04-21 00:00:00 |
5th row | 2007-10-16 00:00:00 |
Common Values
Value | Count | Frequency (%) |
2007-10-16 00:00:00 | 5570 | 55.7% |
2008-04-21 00:00:00 | 4430 | 44.3% |
Common Values (Plot)
Value | Count | Frequency (%) |
00:00:00 | 10000 | |
2007-10-16 | 5570 | |
2008-04-21 | 4430 |
URL
Text
UNIQUE
 
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 182 |
---|---|
Median length | 182 |
Mean length | 161.622 |
Min length | 136 |
Characters and Unicode
Total characters | 1616220 |
---|---|
Distinct characters | 50 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5579&um20no=0539_23_5579</get> </url> |
---|---|
2nd row | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5121&um20no=0059_08_5121</get> </url> |
3rd row | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5705&um20no=0278_03_5705</get> </url> |
4th row | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5825&um20no=0196_14_5825</get> </url> |
5th row | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0369&vol_id=01&page_id=0204&photo_id=02</get> </url> |
Value | Count | Frequency (%) |
url | 20000 | 66.7% |
get>http://yoksa.aks.ac.kr/jsp/um/list.jsp?um10no=msu_5069&um20no=0023_01_5069</get | 1 | < 0.1% |
get>http://yoksa.aks.ac.kr/jsp/um/list.jsp?um10no=msu_5612&um20no=0589_08_5612</get | 1 | < 0.1% |
get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=ac_eng_0088&vol_id=01&page_id=0317&photo_id=01</get | 1 | < 0.1% |
get>http://yoksa.aks.ac.kr/jsp/um/list.jsp?um10no=msu_5145&um20no=0078_20_5145</get | 1 | < 0.1% |
get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=ac_eng_0268&vol_id=01&page_id=0339&photo_id=02</get | 1 | < 0.1% |
get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=ac_eng_0098&vol_id=01&page_id=0418&photo_id=01</get | 1 | < 0.1% |
get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=ac_eng_0366&vol_id=01&page_id=0278&photo_id=01</get | 1 | < 0.1% |
get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=ac_eng_0014&vol_id=01&page_id=0180&photo_id=01</get | 1 | < 0.1% |
get>http://yoksa.aks.ac.kr/jsp/um/list.jsp?um10no=msu_5561&um20no=0522_01_5561</get | 1 | < 0.1% |
Other values (9991) | 9991 | 33.3% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 360000 | 22.3% |
p | 78990 | 4.9% |
o | 74560 | 4.6% |
/ | 64430 | 4.0% |
_ | 57850 | 3.6% |
a | 56710 | 3.5% |
t | 55570 | 3.4% |
0 | 53068 | 3.3% |
r | 46710 | 2.9% |
m | 44430 | 2.7% |
Other values (40) | 723902 | 44.8% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 702670 | 43.5% |
Space Separator | 360000 | 22.3% |
Other Punctuation | 177850 | 11.0% |
Decimal Number | 163290 | 10.1% |
Math Symbol | 116710 | 7.2% |
Connector Punctuation | 57850 | 3.6% |
Uppercase Letter | 32280 | 2.0% |
Dash Punctuation | 5570 | 0.3% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
p | 78990 | 11.2% |
o | 74560 | 10.6% |
a | 56710 | 8.1% |
t | 55570 | 7.9% |
r | 46710 | 6.6% |
m | 44430 | 6.3% |
u | 43290 | 6.2% |
e | 42280 | 6.0% |
s | 37720 | 5.4% |
g | 31140 | 4.4% |
Other values (12) | 191270 | 27.2% |
Decimal Number
Value | Count | Frequency (%) |
0 | 53068 | 32.5% |
2 | 25848 | 15.8% |
1 | 24885 | 15.2% |
5 | 16581 | 10.2% |
3 | 9471 | 5.8% |
4 | 7983 | 4.9% |
8 | 7139 | 4.4% |
6 | 7132 | 4.4% |
7 | 6429 | 3.9% |
9 | 4754 | 2.9% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 64430 | 36.2% |
. | 40000 | 22.5% |
& | 26710 | 15.0% |
; | 26710 | 15.0% |
: | 10000 | 5.6% |
? | 10000 | 5.6% |
Uppercase Letter
Value | Count | Frequency (%) |
E | 5570 | 17.3% |
G | 5570 | 17.3% |
N | 5570 | 17.3% |
A | 5570 | 17.3% |
C | 5570 | 17.3% |
L | 4430 | 13.7% |
Math Symbol
Value | Count | Frequency (%) |
< | 40000 | 34.3% |
> | 40000 | 34.3% |
= | 36710 | 31.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 360000 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 57850 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 5570 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 881270 | 54.5% |
Latin | 734950 | 45.5% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
p | 78990 | 10.7% |
o | 74560 | 10.1% |
a | 56710 | 7.7% |
t | 55570 | 7.6% |
r | 46710 | 6.4% |
m | 44430 | 6.0% |
u | 43290 | 5.9% |
e | 42280 | 5.8% |
s | 37720 | 5.1% |
g | 31140 | 4.2% |
Other values (18) | 223550 | 30.4% |
Common
Value | Count | Frequency (%) |
(space) | 360000 | 40.9% |
/ | 64430 | 7.3% |
_ | 57850 | 6.6% |
0 | 53068 | 6.0% |
. | 40000 | 4.5% |
< | 40000 | 4.5% |
> | 40000 | 4.5% |
= | 36710 | 4.2% |
& | 26710 | 3.0% |
; | 26710 | 3.0% |
Other values (12) | 135792 | 15.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1616220 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 360000 | 22.3% |
p | 78990 | 4.9% |
o | 74560 | 4.6% |
/ | 64430 | 4.0% |
_ | 57850 | 3.6% |
a | 56710 | 3.5% |
t | 55570 | 3.4% |
0 | 53068 | 3.3% |
r | 46710 | 2.9% |
m | 44430 | 2.7% |
Other values (40) | 723902 | 44.8% |
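Each URL value wraps the actual address in `<url> <get>…</get> </url>` markup. The character counts also show `;` occurring exactly as often as `&` (26710 each), which suggests that the raw values contain `&amp;` entities even though the samples above display a plain `&`. A minimal sketch of extracting a usable address under that assumption; `extract_url` is a hypothetical helper name:

```python
import html
import re

# Capture whatever sits between <get> and </get>, allowing embedded whitespace.
GET_RE = re.compile(r"<get>(.*?)</get>", re.DOTALL)


def extract_url(value):
    """Return the address inside <get>...</get> with HTML entities decoded, or None."""
    match = GET_RE.search(value)
    if match is None:
        return None
    return html.unescape(match.group(1).strip())
```

`html.unescape` leaves the plain `&name=` separators seen in these samples untouched, but it can rewrite parameters whose names collide with legacy HTML entities, so it is worth spot-checking the output if the values turn out not to be escaped.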
CREATORSORT
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10000 |
---|---|
Missing (%) | 100.0% |
Memory size | 166.0 KiB |
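CREATORSORT, like REQUIRES, DATEEVENT and DOCISSUED above, is missing in all 10000 rows, so it carries no information for analysis. A minimal sketch for listing such columns before dropping them, assuming the records have been loaded as a list of dictionaries with `None` for missing cells; the function name `all_missing_columns` and that data layout are illustrative assumptions:

```python
def all_missing_columns(rows):
    """Return the column names whose value is None in every row."""
    if not rows:
        return []
    # Column names are taken from the first row; absent keys count as missing.
    return [col for col in rows[0] if all(row.get(col) is None for row in rows)]
```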
DATESORT
Text
MISSING
 
Distinct | 274 |
---|---|
Distinct (%) | 6.2% |
Missing | 5570 |
Missing (%) | 55.7% |
Memory size | 156.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 9 |
Mean length | 8.3706546 |
Min length | 2 |
Characters and Unicode
Total characters | 37082 |
---|---|
Distinct characters | 14 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 15 |
---|---|
Unique (%) | 0.3% |
Sample
1st row | 1992.7.17 |
---|---|
2nd row | 1986.6.24 |
3rd row | 1985.8.6 |
4th row | 1998.11.13 |
5th row | 1980.8.4 |
Value | Count | Frequency (%) |
미상 | 481 | 10.9% |
1998.11.13 | 263 | 5.9% |
1998.11.14 | 152 | 3.4% |
2002.11.8 | 72 | 1.6% |
1996-99-99 | 70 | 1.6% |
1997.10.4 | 66 | 1.5% |
1996.7.19 | 55 | 1.2% |
1993.11.13 | 52 | 1.2% |
1993.7.9 | 51 | 1.2% |
1997.10.5 | 49 | 1.1% |
Other values (264) | 3119 | 70.4% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 8924 | 24.1% |
9 | 8011 | 21.6% |
. | 7723 | 20.8% |
2 | 2366 | 6.4% |
0 | 1753 | 4.7% |
7 | 1553 | 4.2% |
8 | 1518 | 4.1% |
3 | 1228 | 3.3% |
5 | 1227 | 3.3% |
6 | 932 | 2.5% |
Other values (4) | 1847 | 5.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 28195 | 76.0% |
Other Punctuation | 7723 | 20.8% |
Other Letter | 962 | 2.6% |
Dash Punctuation | 202 | 0.5% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 8924 | 31.7% |
9 | 8011 | 28.4% |
2 | 2366 | 8.4% |
0 | 1753 | 6.2% |
7 | 1553 | 5.5% |
8 | 1518 | 5.4% |
3 | 1228 | 4.4% |
5 | 1227 | 4.4% |
6 | 932 | 3.3% |
4 | 683 | 2.4% |
Other Letter
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 7723 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 202 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 36120 | 97.4% |
Hangul | 962 | 2.6% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 8924 | 24.7% |
9 | 8011 | 22.2% |
. | 7723 | 21.4% |
2 | 2366 | 6.6% |
0 | 1753 | 4.9% |
7 | 1553 | 4.3% |
8 | 1518 | 4.2% |
3 | 1228 | 3.4% |
5 | 1227 | 3.4% |
6 | 932 | 2.6% |
Other values (2) | 885 | 2.5% |
Hangul
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 36120 | 97.4% |
Hangul | 962 | 2.6% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 8924 | 24.7% |
9 | 8011 | 22.2% |
. | 7723 | 21.4% |
2 | 2366 | 6.6% |
0 | 1753 | 4.9% |
7 | 1553 | 4.3% |
8 | 1518 | 4.2% |
3 | 1228 | 3.4% |
5 | 1227 | 3.4% |
6 | 932 | 2.6% |
Other values (2) | 885 | 2.5% |
Hangul
Value | Count | Frequency (%) |
미 | 481 | 50.0% |
상 | 481 | 50.0% |
Index | URI_KHON | MDCENTER | SUBJECT_KHON | DBINFO | URI_KHDP | MAINTITLE | ALTERNATIVE | DOCSENDER | EDITOR | AUTHOR | SUBJECT_KHON1 | SUBJECT_KHON2 | SUBJECT_KHDP | TYPE | UNIT | PUBLISHER | FORMAT_MEDIUM | TABLEOFCONTENTS | ABSTRACT | ISPARTOF_ID | ISPARTOF | REQUIRES | DATEEVENT | DOCCREATED | DOCISSUED | DATE_ISSUED | DATE_CREATED | DATE_MODIFIED | URL | CREATORSORT | DATESORT |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
17649 | KH.AKS.0539_23_5579 | AKS | KH.13.02.001 | 한국민요대관 | 0539_23_5579 | 토끼타령 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5579 | 전라남도 여수시 화정면 | <NA> | <NA> | 1992.7.17 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5579&um20no=0539_23_5579</get> </url> | <NA> | 1992.7.17 |
11082 | KH.AKS.0059_08_5121 | AKS | KH.13.02.001 | 한국민요대관 | 0059_08_5121 | 곱새치기소리 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5121 | 강원도 정선군 임계면 | <NA> | <NA> | 1986.6.24 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5121&um20no=0059_08_5121</get> </url> | <NA> | 1986.6.24 |
13855 | KH.AKS.0278_03_5705 | AKS | KH.13.02.001 | 한국민요대관 | 0278_03_5705 | 창부타령 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5705 | 경상북도 문경시 가은읍 | <NA> | <NA> | 1985.8.6 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5705&um20no=0278_03_5705</get> </url> | <NA> | 1985.8.6 |
12648 | KH.AKS.0196_14_5825 | AKS | KH.13.02.001 | 한국민요대관 | 0196_14_5825 | 창부타령 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5825 | 경상남도 사천시 남양동 | <NA> | <NA> | 1998.11.13 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5825&um20no=0196_14_5825</get> </url> | <NA> | 1998.11.13 |
9711 | KH.AC.AC_ENG_0369_01_0204_02 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0369_01_0204_02 | THERESA MULLIN, Bootle, Liverpool. | 테레사 뮬린, 부틀, 리버풀 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | d | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : The Far East ( ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0369&vol_id=01&page_id=0204&photo_id=02</get> </url> | <NA> | <NA> |
8957 | KH.AC.AC_ENG_0365_01_0313_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0365_01_0313_01 | AMERICAN STUDENTS AT DALGAN. | 달간의 미국인 학생들 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | c | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : The Far East ( ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0365&vol_id=01&page_id=0313&photo_id=01</get> </url> | <NA> | <NA> |
15008 | KH.AKS.0364_08_5288 | AKS | KH.13.02.001 | 한국민요대관 | 0364_08_5288 | 까마귀노래 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5288 | 대전광역시 대덕구 신탄진동 | <NA> | <NA> | 1980.8.4 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5288&um20no=0364_08_5288</get> </url> | <NA> | 1980.8.4 |
5194 | KH.AC.AC_ENG_0234_01_0053_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0234_01_0053_01 | Old Provincial Government Offices | 옛 지방 관청 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | e | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : Economic history of Chosen ( comp. in commemoration of the decennial of the Bank of Chosen ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0234&vol_id=01&page_id=0053&photo_id=01</get> </url> | <NA> | <NA> |
9267 | KH.AC.AC_ENG_0367_01_0164_06 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0367_01_0164_06 | GEORGIE KENNEDY, CARTOWN HOUSE, KILDIMO. | 조지 케네디, 카타운 하우스, 킬디모 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | d | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : The Far East ( ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0367&vol_id=01&page_id=0164&photo_id=06</get> </url> | <NA> | <NA> |
10579 | KH.AKS.0011_03_5057 | AKS | KH.13.02.001 | 한국민요대관 | 0011_03_5057 | 모심는소리 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5057 | 강원도 강릉시 성산면 | <NA> | <NA> | 2001.11.9 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5057&um20no=0011_03_5057</get> </url> | <NA> | 2001.11.9 |
Index | URI_KHON | MDCENTER | SUBJECT_KHON | DBINFO | URI_KHDP | MAINTITLE | ALTERNATIVE | DOCSENDER | EDITOR | AUTHOR | SUBJECT_KHON1 | SUBJECT_KHON2 | SUBJECT_KHDP | TYPE | UNIT | PUBLISHER | FORMAT_MEDIUM | TABLEOFCONTENTS | ABSTRACT | ISPARTOF_ID | ISPARTOF | REQUIRES | DATEEVENT | DOCCREATED | DOCISSUED | DATE_ISSUED | DATE_CREATED | DATE_MODIFIED | URL | CREATORSORT | DATESORT |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7701 | KH.AC.AC_ENG_0339_01_0125_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0339_01_0125_01 | ICHANG GORGE, RED-BOAT AND JUNK PASSING. | 이창 협곡, 빨간 배와 정크가 지나가고 있다 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | e | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : Missionary joys in Japan ( ;or, Leaves from my journal. With an introd. by Barclay F. Buxton. ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0339&vol_id=01&page_id=0125&photo_id=01</get> </url> | <NA> | <NA> |
16933 | KH.AKS.0494_26_5523 | AKS | KH.13.02.001 | 한국민요대관 | 0494_26_5523 | 산아지타령 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5523 | 전라남도 여수시 남면 | <NA> | <NA> | 1993.7.7 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5523&um20no=0494_26_5523</get> </url> | <NA> | 1993.7.7 |
5088 | KH.AC.AC_ENG_0233_01_0203_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0233_01_0203_01 | Fig. 108. -Strong erosion in Shantung, with wheat on remnants of tables. | 고지 아래의 밀밭과 산동 지방의 강한 부식 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | e | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : Farmers of forty centuries ( or Permanent agriculture in China, Korea and Japan ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0233&vol_id=01&page_id=0203&photo_id=01</get> </url> | <NA> | <NA> |
1288 | KH.AC.AC_ENG_0047_01_0253_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0047_01_0253_01 | TOMB NEAR SEOUL. | 서울 근교의 무덤 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | c | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : The story of Korea ( by Joseph H. Longford. With 33 illustrations and three maps ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0047&vol_id=01&page_id=0253&photo_id=01</get> </url> | <NA> | <NA> |
1827 | KH.AC.AC_ENG_0082_01_0094_01 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0082_01_0094_01 | 154. Model Farm at Suwon(水原), Kyungki Province(京畿道).155. Tuksum(纛島) Branch Farm near Seoul(京城). | 154. 수원의 시범 농장, 경기도155. 서울 근교 뚝섬 농장 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | e | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : Pictorial Chosen and Manchuria ( compiled in commemoration of the decennial of the Bank of Chosen ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0082&vol_id=01&page_id=0094&photo_id=01</get> </url> | <NA> | <NA> |
18508 | KH.AKS.0593_11_5645 | AKS | KH.13.02.001 | 한국민요대관 | 0593_11_5645 | 북충(사물가락) | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5645 | 전라남도 진도군 지산면 | <NA> | <NA> | 1997.10.13 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5645&um20no=0593_11_5645</get> </url> | <NA> | 1997.10.13 |
16257 | KH.AKS.0450_38_5470 | AKS | KH.13.02.001 | 한국민요대관 | 0450_38_5470 | 시집살이노래 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5470 | 전라남도 담양군 수북면 | <NA> | <NA> | 1980.8.10 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5470&um20no=0450_38_5470</get> </url> | <NA> | 1980.8.10 |
1748 | KH.AC.AC_ENG_0082_01_0039_02 | AC | KH.13.01.011 | 서양고서 | AC_ENG_0082_01_0039_02 | 53. Washing clothes.54. Making sticks for smoothing cloth after washing and drying. | 53. 빨래터54. 다듬이 만드는 모습 | <NA> | <NA> | <NA> | KH.13 | KH.13.01 | e | <NA> | 2 | <NA> | image/jpeg | <NA> | 출전 : Pictorial Chosen and Manchuria ( compiled in commemoration of the decennial of the Bank of Chosen ) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 1900-01-01 00:00:00 | 2006-12-20 00:00:00 | 2007-10-16 00:00:00 | <url> <get>http://www.e-coreana.or.kr/photo/group_se_02.jsp?op=2&book_id=AC_ENG_0082&vol_id=01&page_id=0039&photo_id=02</get> </url> | <NA> | <NA> |
17180 | KH.AKS.0505_02_5537 | AKS | KH.13.02.001 | 한국민요대관 | 0505_02_5537 | 산아지타령 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5537 | 전라남도 여수시 소라면 | <NA> | <NA> | 1993.11.12 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5537&um20no=0505_02_5537</get> </url> | <NA> | 1993.11.12 |
14218 | KH.AKS.0308_07_5731 | AKS | KH.13.02.001 | 한국민요대관 | 0308_07_5731 | 쌍가락지노래 | <NA> | <NA> | <NA> | <NA> | KH.13 | KH.13.02 | um | <NA> | 2 | <NA> | text/xml|audio/asf | <NA> | <NA> | KH.AKS.msu_5731 | 경상북도 영천시 화북면 | <NA> | <NA> | 1995.5.20 | <NA> | 2007-11-29 00:00:00 | 2003-11-01 00:00:00 | 2008-04-21 00:00:00 | <url> <get>http://yoksa.aks.ac.kr/jsp/um/List.jsp?um10no=msu_5731&um20no=0308_07_5731</get> </url> | <NA> | 1995.5.20 |