Dataset statistics
Number of variables | 10 |
---|---|
Number of observations | 10000 |
Missing cells | 767 |
Missing cells (%) | 0.8% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 888.7 KiB |
Average record size in memory | 91.0 B |
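The percentage and per-record figures in this table follow directly from the counts. As a quick check, the sketch below only re-derives numbers already shown above; the variable names are illustrative.

```python
# Re-derive two dataset-level figures from the counts reported above.
n_variables = 10
n_observations = 10_000
missing_cells = 767
total_size_kib = 888.7

# Share of missing cells over all cells (rows x columns).
missing_pct = missing_cells / (n_variables * n_observations) * 100

# Average bytes per record: total size converted from KiB to bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```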
Variable types
Numeric | 2 |
---|---|
Text | 5 |
Categorical | 3 |
Dataset
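Description | 자료번호 (item number), 청구기호 (call number), 서지번호 (bibliographic number), 서명 (title), 저자 (author), 출판사 (publisher), 출판일 (publication date), 배가위치코드 (shelving location code), 배가위치명 (shelving location name), 언어명 (language name) |
---|---|
Author | 서울특별시 (Seoul Metropolitan Government) |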
URL | https://data.seoul.go.kr/dataList/OA-2251/S/1/datasetView.do |
Alerts
배가위치코드 is highly overall correlated with 자료번호 and 3 other fields | High correlation |
---|---|
언어명 is highly overall correlated with 배가위치코드 and 1 other field | High correlation |
배가위치명 is highly overall correlated with 자료번호 and 3 other fields | High correlation |
자료번호 is highly overall correlated with 서지번호 and 2 other fields | High correlation |
서지번호 is highly overall correlated with 자료번호 and 2 other fields | High correlation |
배가위치코드 is highly imbalanced (89.7%) | Imbalance |
배가위치명 is highly imbalanced (89.7%) | Imbalance |
언어명 is highly imbalanced (72.8%) | Imbalance |
저자 has 574 (5.7%) missing values | Missing |
자료번호 has unique values | Unique |
Reproduction
Analysis started | 2023-12-11 06:07:40.848836 |
---|---|
Analysis finished | 2023-12-11 06:07:45.143955 |
Duration | 4.3 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
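A report like this one can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used for this run: the CSV path and report title are hypothetical, default settings are assumed instead of the downloaded configuration, and `pandas` and `ydata-profiling` must be installed.

```python
def build_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is illustrative; it is not taken from the original run.
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(output_html)

# Example call (hypothetical file name):
# build_report("library_holdings.csv")
```

Timings and memory figures will differ from the values above depending on the machine and library version.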
자료번호
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 38591.378 |
Minimum | 21151 |
---|---|
Maximum | 55249 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 21151 |
---|---|
5-th percentile | 22905.95 |
Q1 | 30206.75 |
median | 38628.5 |
Q3 | 47161.75 |
95-th percentile | 53697.25 |
Maximum | 55249 |
Range | 34098 |
Interquartile range (IQR) | 16955 |
Descriptive statistics
Standard deviation | 9829.1902 |
---|---|
Coefficient of variation (CV) | 0.25469912 |
Kurtosis | -1.1904679 |
Mean | 38591.378 |
Median Absolute Deviation (MAD) | 8467.5 |
Skewness | -0.03229132 |
Sum | 3.8591378 × 10^8 |
Variance | 96612980 |
Monotonicity | Not monotonic |
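Several of the figures above are derived from one another. The sketch below recomputes them from the reported minimum, maximum, quartiles, mean, and standard deviation of 자료번호; it uses only values from these tables, so small differences reflect rounding in the report.

```python
# Summary figures copied from the tables above for 자료번호.
n = 10_000
minimum, maximum = 21151, 55249
q1, q3 = 30206.75, 47161.75
mean, std = 38591.378, 9829.1902

value_range = maximum - minimum   # compare with Range
iqr = q3 - q1                     # compare with Interquartile range (IQR)
cv = std / mean                   # compare with Coefficient of variation (CV)
variance = std ** 2               # compare with Variance
total = mean * n                  # compare with Sum

print(value_range, iqr, round(cv, 4), round(variance), round(total))
```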
Value | Count | Frequency (%) |
48924 | 1 | < 0.1% |
28725 | 1 | < 0.1% |
38576 | 1 | < 0.1% |
36890 | 1 | < 0.1% |
35325 | 1 | < 0.1% |
39103 | 1 | < 0.1% |
53297 | 1 | < 0.1% |
27789 | 1 | < 0.1% |
41721 | 1 | < 0.1% |
41989 | 1 | < 0.1% |
Other values (9990) | 9990 |
Value | Count | Frequency (%) |
21151 | 1 | |
21159 | 1 | |
21162 | 1 | |
21163 | 1 | |
21168 | 1 | |
21170 | 1 | |
21171 | 1 | |
21172 | 1 | |
21177 | 1 | |
21178 | 1 |
Value | Count | Frequency (%) |
55249 | 1 | |
55247 | 1 | |
55245 | 1 | |
55243 | 1 | |
55236 | 1 | |
55234 | 1 | |
55230 | 1 | |
55219 | 1 | |
55218 | 1 | |
55216 | 1 |
청구기호
Text
Distinct | 9942 |
---|---|
Distinct (%) | 99.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 35 |
---|---|
Median length | 31 |
Mean length | 19.4027 |
Min length | 5 |
Characters and Unicode
Total characters | 194027 |
---|---|
Distinct characters | 486 |
Distinct categories | 11 |
Distinct scripts | 4 |
Distinct blocks | 5 |
Unique
Unique | 9926 |
---|---|
Unique (%) | 99.3% |
Sample
1st row | 340.9 하195ㅈ |
---|---|
2nd row | 818.08 한257한 v.8-4 |
3rd row | 810.81 정428여 v.17 |
4th row | P 911.005 서272 v.35 |
5th row | 911.0091 한257고 v.23 |
Value | Count | Frequency (%) |
p | 2078 | 6.7% |
r | 624 | 2.0% |
c.2 | 556 | 1.8% |
071.1 | 373 | 1.2% |
911.05 | 364 | 1.2% |
911 | 313 | 1.0% |
v.1 | 312 | 1.0% |
v.2 | 296 | 0.9% |
서272서 | 292 | 0.9% |
911.6 | 249 | 0.8% |
Other values (7300) | 25704 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 49473 | |
1 | 20373 | |
. | 14577 | 7.5% |
9 | 12900 | 6.6% |
2 | 12152 | 6.3% |
0 | 10818 | 5.6% |
3 | 8327 | 4.3% |
5 | 8038 | 4.1% |
6 | 7400 | 3.8% |
8 | 7177 | 3.7% |
Other values (476) | 42792 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 97870 | |
Space Separator | 49473 | |
Other Letter | 20979 | 10.8% |
Other Punctuation | 14581 | 7.5% |
Lowercase Letter | 4889 | 2.5% |
Uppercase Letter | 4451 | 2.3% |
Dash Punctuation | 721 | 0.4% |
Open Punctuation | 488 | 0.3% |
Close Punctuation | 463 | 0.2% |
Math Symbol | 111 | 0.1% |
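The category names in this table are Unicode general categories. As an illustration of how such a tally can be produced, the sketch below uses Python's standard `unicodedata` module; that choice is an assumption, and the profiling tool may rely on a different library internally.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this column.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Lo": "Other Letter",
    "Po": "Other Punctuation",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Nl": "Letter Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unlisted categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# First sample row of 청구기호 shown above.
counts = category_counts(["340.9 하195ㅈ"])
print(counts.most_common())
```

Applied to the whole column instead of one row, the same function would yield a table of the shape shown above.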
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
서 | 1938 | 9.2% |
한 | 1419 | 6.8% |
국 | 813 | 3.9% |
ㅇ | 717 | 3.4% |
조 | 666 | 3.2% |
ㅅ | 638 | 3.0% |
동 | 604 | 2.9% |
민 | 514 | 2.5% |
이 | 467 | 2.2% |
ㅈ | 465 | 2.2% |
Other values (419) | 12738 |
Lowercase Letter
Value | Count | Frequency (%) |
v | 4782 | |
c | 41 | 0.8% |
n | 13 | 0.3% |
s | 12 | 0.2% |
e | 11 | 0.2% |
d | 4 | 0.1% |
k | 4 | 0.1% |
t | 4 | 0.1% |
r | 3 | 0.1% |
h | 3 | 0.1% |
Other values (9) | 12 | 0.2% |
Uppercase Letter
Value | Count | Frequency (%) |
P | 2275 | |
C | 723 | 16.2% |
R | 626 | 14.1% |
V | 416 | 9.3% |
A | 240 | 5.4% |
D | 140 | 3.1% |
S | 13 | 0.3% |
L | 4 | 0.1% |
N | 3 | 0.1% |
H | 2 | < 0.1% |
Other values (8) | 9 | 0.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 20373 | |
9 | 12900 | |
2 | 12152 | |
0 | 10818 | |
3 | 8327 | |
5 | 8038 | 8.2% |
6 | 7400 | 7.6% |
8 | 7177 | 7.3% |
7 | 5816 | 5.9% |
4 | 4869 | 5.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 14577 | |
? | 4 | < 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 483 | |
[ | 5 | 1.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 458 | |
] | 5 | 1.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 49473 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 721 |
Math Symbol
Value | Count | Frequency (%) |
~ | 111 |
Letter Number
Value | Count | Frequency (%) |
Ⅰ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 163707 | |
Hangul | 20822 | 10.7% |
Latin | 9341 | 4.8% |
Han | 157 | 0.1% |
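Script counts like these group characters by writing system, with digits, spaces, and punctuation falling under Common. Python's standard library does not expose the Unicode Script property, so the sketch below approximates it from character names. The prefix mapping is a heuristic introduced for illustration and will misclassify some characters (for example, script-specific punctuation).

```python
import unicodedata
from collections import Counter

# Name prefixes mapped to script labels; anything else is counted as Common.
# This prefix test is a rough stand-in for the real Unicode Script property.
PREFIX_TO_SCRIPT = (
    ("HANGUL", "Hangul"),
    ("CJK", "Han"),
    ("LATIN", "Latin"),
    ("HIRAGANA", "Hiragana"),
    ("KATAKANA", "Katakana"),
    ("CYRILLIC", "Cyrillic"),
)

def guess_script(ch):
    """Guess a script label for one character from its Unicode name."""
    name = unicodedata.name(ch, "")
    for prefix, script in PREFIX_TO_SCRIPT:
        if name.startswith(prefix):
            return script
    return "Common"

def script_counts(values):
    """Tally guessed scripts over every character in every string."""
    return Counter(guess_script(ch) for text in values for ch in text)

# Fourth sample row of 청구기호 shown above.
counts = script_counts(["P 911.005 서272 v.35"])
print(counts)
```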
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
서 | 1938 | 9.3% |
한 | 1419 | 6.8% |
국 | 813 | 3.9% |
ㅇ | 717 | 3.4% |
조 | 666 | 3.2% |
ㅅ | 638 | 3.1% |
동 | 604 | 2.9% |
민 | 514 | 2.5% |
이 | 467 | 2.2% |
ㅈ | 465 | 2.2% |
Other values (409) | 12581 |
Latin
Value | Count | Frequency (%) |
v | 4782 | |
P | 2275 | |
C | 723 | 7.7% |
R | 626 | 6.7% |
V | 416 | 4.5% |
A | 240 | 2.6% |
D | 140 | 1.5% |
c | 41 | 0.4% |
S | 13 | 0.1% |
n | 13 | 0.1% |
Other values (28) | 72 | 0.8% |
Common
Value | Count | Frequency (%) |
(space) | 49473 | |
1 | 20373 | |
. | 14577 | 8.9% |
9 | 12900 | 7.9% |
2 | 12152 | 7.4% |
0 | 10818 | 6.6% |
3 | 8327 | 5.1% |
5 | 8038 | 4.9% |
6 | 7400 | 4.5% |
8 | 7177 | 4.4% |
Other values (9) | 12472 | 7.6% |
Han
Value | Count | Frequency (%) |
上 | 71 | |
下 | 56 | |
中 | 23 | 14.6% |
後 | 1 | 0.6% |
經 | 1 | 0.6% |
子 | 1 | 0.6% |
基 | 1 | 0.6% |
豊 | 1 | 0.6% |
興 | 1 | 0.6% |
順 | 1 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 173047 | |
Hangul | 17093 | 8.8% |
Compat Jamo | 3729 | 1.9% |
CJK | 157 | 0.1% |
Number Forms | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 49473 | |
1 | 20373 | |
. | 14577 | 8.4% |
9 | 12900 | 7.5% |
2 | 12152 | 7.0% |
0 | 10818 | 6.3% |
3 | 8327 | 4.8% |
5 | 8038 | 4.6% |
6 | 7400 | 4.3% |
8 | 7177 | 4.1% |
Other values (46) | 21812 |
Hangul
Value | Count | Frequency (%) |
서 | 1938 | 11.3% |
한 | 1419 | 8.3% |
국 | 813 | 4.8% |
조 | 666 | 3.9% |
동 | 604 | 3.5% |
민 | 514 | 3.0% |
이 | 467 | 2.7% |
사 | 455 | 2.7% |
대 | 450 | 2.6% |
시 | 413 | 2.4% |
Other values (390) | 9354 |
Compat Jamo
Value | Count | Frequency (%) |
ㅇ | 717 | |
ㅅ | 638 | |
ㅈ | 465 | |
ㄱ | 455 | |
ㅎ | 450 | |
ㄷ | 256 | 6.9% |
ㅁ | 252 | 6.8% |
ㅂ | 193 | 5.2% |
ㅊ | 133 | 3.6% |
ㄴ | 67 | 1.8% |
Other values (9) | 103 | 2.8% |
CJK
Value | Count | Frequency (%) |
上 | 71 | |
下 | 56 | |
中 | 23 | 14.6% |
後 | 1 | 0.6% |
經 | 1 | 0.6% |
子 | 1 | 0.6% |
基 | 1 | 0.6% |
豊 | 1 | 0.6% |
興 | 1 | 0.6% |
順 | 1 | 0.6% |
Number Forms
Value | Count | Frequency (%) |
Ⅰ | 1 |
서지번호
Real number (ℝ)
HIGH CORRELATION
Distinct | 9681 |
---|---|
Distinct (%) | 97.0% |
Missing | 17 |
Missing (%) | 0.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 36316.34 |
Minimum | 19928 |
---|---|
Maximum | 52159 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 19928 |
---|---|
5-th percentile | 21603 |
Q1 | 28502 |
median | 36626 |
Q3 | 44100 |
95-th percentile | 50610.4 |
Maximum | 52159 |
Range | 32231 |
Interquartile range (IQR) | 15598 |
Descriptive statistics
Standard deviation | 9202.6033 |
---|---|
Coefficient of variation (CV) | 0.25340117 |
Kurtosis | -1.1570382 |
Mean | 36316.34 |
Median Absolute Deviation (MAD) | 7804 |
Skewness | -0.03964684 |
Sum | 3.6254603 × 10^8 |
Variance | 84687908 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
39305 | 13 | 0.1% |
40354 | 13 | 0.1% |
39219 | 11 | 0.1% |
39905 | 10 | 0.1% |
39221 | 9 | 0.1% |
39197 | 8 | 0.1% |
40305 | 8 | 0.1% |
40292 | 7 | 0.1% |
39352 | 7 | 0.1% |
40284 | 7 | 0.1% |
Other values (9671) | 9890 | |
(Missing) | 17 | 0.2% |
Value | Count | Frequency (%) |
19928 | 1 | |
19936 | 1 | |
19939 | 1 | |
19940 | 1 | |
19945 | 1 | |
19947 | 1 | |
19948 | 1 | |
19949 | 1 | |
19954 | 1 | |
19955 | 1 |
Value | Count | Frequency (%) |
52159 | 1 | |
52157 | 1 | |
52155 | 1 | |
52153 | 1 | |
52146 | 1 | |
52144 | 1 | |
52140 | 1 | |
52129 | 1 | |
52128 | 1 | |
52126 | 1 |
서명
Text
Distinct | 9582 |
---|---|
Distinct (%) | 95.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 100 |
---|---|
Median length | 83 |
Mean length | 21.9857 |
Min length | 1 |
Characters and Unicode
Total characters | 219857 |
---|---|
Distinct characters | 2472 |
Distinct categories | 15 |
Distinct scripts | 7 |
Distinct blocks | 12 |
Unique
Unique | 9321 |
---|---|
Unique (%) | 93.2% |
Sample
1st row | 죽은 자의 정치학 : 프랑스?미국?한국 국립묘지의 탄생과 진화 |
---|---|
2nd row | 韓國口碑文學大系 8-4:慶尙南道 晋州市 晋陽郡篇(2) |
3rd row | 與猶堂全集 17:政法集 第23~29卷 |
4th row | 韓國史論 35 |
5th row | 古文書集成 23:居昌 草溪鄭氏篇 |
Value | Count | Frequency (%) |
(blank) | 1393 | 3.4% |
2 | 316 | 0.8% |
1 | 296 | 0.7% |
of | 261 | 0.6% |
서울 | 214 | 0.5% |
3 | 197 | 0.5% |
the | 195 | 0.5% |
서울의 | 157 | 0.4% |
국역 | 151 | 0.4% |
4 | 128 | 0.3% |
Other values (17742) | 37740 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 32918 | 15.0% |
1 | 7561 | 3.4% |
0 | 4331 | 2.0% |
2 | 4129 | 1.9% |
9 | 3856 | 1.8% |
( | 3392 | 1.5% |
) | 3372 | 1.5% |
사 | 2703 | 1.2% |
서 | 2366 | 1.1% |
3 | 2324 | 1.1% |
Other values (2462) | 152905 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 118303 | |
Space Separator | 32918 | 15.0% |
Decimal Number | 31095 | 14.1% |
Lowercase Letter | 15255 | 6.9% |
Other Punctuation | 7866 | 3.6% |
Open Punctuation | 3750 | 1.7% |
Close Punctuation | 3726 | 1.7% |
Uppercase Letter | 2397 | 1.1% |
Modifier Symbol | 1894 | 0.9% |
Math Symbol | 1420 | 0.6% |
Other values (5) | 1233 | 0.6% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
사 | 2703 | 2.3% |
서 | 2366 | 2.0% |
의 | 2286 | 1.9% |
국 | 1758 | 1.5% |
한 | 1412 | 1.2% |
울 | 1401 | 1.2% |
기 | 1359 | 1.1% |
대 | 1261 | 1.1% |
년 | 1246 | 1.1% |
역 | 1193 | 1.0% |
Other values (2327) | 101318 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 285 | 11.9% |
K | 177 | 7.4% |
T | 167 | 7.0% |
A | 143 | 6.0% |
C | 123 | 5.1% |
E | 122 | 5.1% |
I | 121 | 5.0% |
N | 114 | 4.8% |
H | 112 | 4.7% |
J | 108 | 4.5% |
Other values (35) | 925 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 1736 | |
o | 1623 | |
a | 1329 | 8.7% |
n | 1301 | 8.5% |
i | 1158 | 7.6% |
r | 1094 | 7.2% |
t | 1066 | 7.0% |
s | 910 | 6.0% |
l | 740 | 4.9% |
u | 647 | 4.2% |
Other values (29) | 3651 |
Decimal Number
Value | Count | Frequency (%) |
1 | 7561 | |
0 | 4331 | |
2 | 4129 | |
9 | 3856 | |
3 | 2324 | 7.5% |
4 | 1955 | 6.3% |
8 | 1871 | 6.0% |
6 | 1717 | 5.5% |
7 | 1678 | 5.4% |
5 | 1673 | 5.4% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 2310 | |
: | 1745 | |
; | 1728 | |
. | 1441 | |
? | 416 | 5.3% |
' | 118 | 1.5% |
& | 45 | 0.6% |
# | 34 | 0.4% |
! | 27 | 0.3% |
: | 2 | < 0.1% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 42 | |
Ⅰ | 33 | |
Ⅲ | 16 | 14.5% |
Ⅳ | 8 | 7.3% |
Ⅴ | 6 | 5.5% |
Ⅹ | 2 | 1.8% |
Ⅷ | 1 | 0.9% |
Ⅶ | 1 | 0.9% |
Ⅵ | 1 | 0.9% |
Math Symbol
Value | Count | Frequency (%) |
~ | 901 | |
= | 455 | |
> | 30 | 2.1% |
< | 29 | 2.0% |
+ | 3 | 0.2% |
| | 2 | 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 3392 | |
[ | 333 | 8.9% |
『 | 18 | 0.5% |
「 | 6 | 0.2% |
《 | 1 | < 0.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 3372 | |
] | 327 | 8.8% |
』 | 20 | 0.5% |
」 | 6 | 0.2% |
》 | 1 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 32918 |
Modifier Symbol
Value | Count | Frequency (%) |
^ | 1894 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1115 |
Other Symbol
Value | Count | Frequency (%) |
○ | 4 |
Final Punctuation
Value | Count | Frequency (%) |
’ | 2 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 83792 | |
Hangul | 82552 | |
Han | 35529 | |
Latin | 17653 | 8.0% |
Hiragana | 134 | 0.1% |
Cyrillic | 109 | < 0.1% |
Katakana | 88 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
國 | 932 | 2.6% |
第 | 912 | 2.6% |
史 | 874 | 2.5% |
韓 | 838 | 2.4% |
文 | 779 | 2.2% |
報 | 709 | 2.0% |
集 | 676 | 1.9% |
年 | 645 | 1.8% |
書 | 552 | 1.6% |
朝 | 507 | 1.4% |
Other values (1415) | 28105 |
Hangul
Value | Count | Frequency (%) |
사 | 2703 | 3.3% |
서 | 2366 | 2.9% |
의 | 2286 | 2.8% |
국 | 1758 | 2.1% |
한 | 1412 | 1.7% |
울 | 1401 | 1.7% |
기 | 1359 | 1.6% |
대 | 1261 | 1.5% |
년 | 1246 | 1.5% |
역 | 1193 | 1.4% |
Other values (844) | 65567 |
Latin
Value | Count | Frequency (%) |
e | 1736 | 9.8% |
o | 1623 | 9.2% |
a | 1329 | 7.5% |
n | 1301 | 7.4% |
i | 1158 | 6.6% |
r | 1094 | 6.2% |
t | 1066 | 6.0% |
s | 910 | 5.2% |
l | 740 | 4.2% |
u | 647 | 3.7% |
Other values (51) | 6049 |
Common
Value | Count | Frequency (%) |
(space) | 32918 | |
1 | 7561 | 9.0% |
0 | 4331 | 5.2% |
2 | 4129 | 4.9% |
9 | 3856 | 4.6% |
( | 3392 | 4.0% |
) | 3372 | 4.0% |
3 | 2324 | 2.8% |
/ | 2310 | 2.8% |
4 | 1955 | 2.3% |
Other values (32) | 17644 |
Katakana
Value | Count | Frequency (%) |
ア | 14 | 15.9% |
ル | 9 | 10.2% |
ジ | 7 | 8.0% |
ソ | 5 | 5.7% |
ゥ | 4 | 4.5% |
ト | 3 | 3.4% |
シ | 3 | 3.4% |
リ | 3 | 3.4% |
ン | 2 | 2.3% |
ケ | 2 | 2.3% |
Other values (27) | 36 |
Cyrillic
Value | Count | Frequency (%) |
Р | 15 | |
А | 11 | 10.1% |
О | 9 | 8.3% |
Г | 7 | 6.4% |
Ф | 7 | 6.4% |
И | 6 | 5.5% |
С | 5 | 4.6% |
К | 5 | 4.6% |
и | 4 | 3.7% |
с | 4 | 3.7% |
Other values (22) | 36 |
Hiragana
Value | Count | Frequency (%) |
の | 69 | |
と | 36 | |
か | 4 | 3.0% |
に | 4 | 3.0% |
る | 3 | 2.2% |
け | 2 | 1.5% |
で | 2 | 1.5% |
れ | 1 | 0.7% |
た | 1 | 0.7% |
も | 1 | 0.7% |
Other values (11) | 11 | 8.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 101273 | |
Hangul | 82431 | |
CJK | 34865 | 15.9% |
CJK Compat Ideographs | 664 | 0.3% |
Hiragana | 134 | 0.1% |
Compat Jamo | 121 | 0.1% |
Number Forms | 110 | 0.1% |
Cyrillic | 109 | < 0.1% |
Katakana | 88 | < 0.1% |
None | 56 | < 0.1% |
Other values (2) | 6 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 32918 | |
1 | 7561 | 7.5% |
0 | 4331 | 4.3% |
2 | 4129 | 4.1% |
9 | 3856 | 3.8% |
( | 3392 | 3.3% |
) | 3372 | 3.3% |
3 | 2324 | 2.3% |
/ | 2310 | 2.3% |
4 | 1955 | 1.9% |
Other values (74) | 35125 |
Hangul
Value | Count | Frequency (%) |
사 | 2703 | 3.3% |
서 | 2366 | 2.9% |
의 | 2286 | 2.8% |
국 | 1758 | 2.1% |
한 | 1412 | 1.7% |
울 | 1401 | 1.7% |
기 | 1359 | 1.6% |
대 | 1261 | 1.5% |
년 | 1246 | 1.5% |
역 | 1193 | 1.4% |
Other values (829) | 65446 |
CJK
Value | Count | Frequency (%) |
國 | 932 | 2.7% |
第 | 912 | 2.6% |
史 | 874 | 2.5% |
韓 | 838 | 2.4% |
文 | 779 | 2.2% |
報 | 709 | 2.0% |
集 | 676 | 1.9% |
年 | 645 | 1.8% |
書 | 552 | 1.6% |
朝 | 507 | 1.5% |
Other values (1342) | 27441 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
論 | 150 | |
年 | 105 | |
歷 | 81 | |
臨 | 31 | 4.7% |
李 | 30 | 4.5% |
六 | 28 | 4.2% |
金 | 19 | 2.9% |
梨 | 12 | 1.8% |
柳 | 9 | 1.4% |
列 | 9 | 1.4% |
Other values (63) | 190 |
Hiragana
Value | Count | Frequency (%) |
の | 69 | |
と | 36 | |
か | 4 | 3.0% |
に | 4 | 3.0% |
る | 3 | 2.2% |
け | 2 | 1.5% |
で | 2 | 1.5% |
れ | 1 | 0.7% |
た | 1 | 0.7% |
も | 1 | 0.7% |
Other values (11) | 11 | 8.2% |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 52 | |
ㄱ | 12 | 9.9% |
ㅇ | 12 | 9.9% |
ㅎ | 9 | 7.4% |
ㅅ | 7 | 5.8% |
ㅈ | 5 | 4.1% |
ㄴ | 5 | 4.1% |
ㅁ | 5 | 4.1% |
ㄷ | 4 | 3.3% |
ㅂ | 4 | 3.3% |
Other values (5) | 6 | 5.0% |
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 42 | |
Ⅰ | 33 | |
Ⅲ | 16 | 14.5% |
Ⅳ | 8 | 7.3% |
Ⅴ | 6 | 5.5% |
Ⅹ | 2 | 1.8% |
Ⅷ | 1 | 0.9% |
Ⅶ | 1 | 0.9% |
Ⅵ | 1 | 0.9% |
None
Value | Count | Frequency (%) |
』 | 20 | |
『 | 18 | |
」 | 6 | 10.7% |
「 | 6 | 10.7% |
: | 2 | 3.6% |
| | 2 | 3.6% |
》 | 1 | 1.8% |
《 | 1 | 1.8% |
Cyrillic
Value | Count | Frequency (%) |
Р | 15 | |
А | 11 | 10.1% |
О | 9 | 8.3% |
Г | 7 | 6.4% |
Ф | 7 | 6.4% |
И | 6 | 5.5% |
С | 5 | 4.6% |
К | 5 | 4.6% |
и | 4 | 3.7% |
с | 4 | 3.7% |
Other values (22) | 36 |
Katakana
Value | Count | Frequency (%) |
ア | 14 | 15.9% |
ル | 9 | 10.2% |
ジ | 7 | 8.0% |
ソ | 5 | 5.7% |
ゥ | 4 | 4.5% |
ト | 3 | 3.4% |
シ | 3 | 3.4% |
リ | 3 | 3.4% |
ン | 2 | 2.3% |
ケ | 2 | 2.3% |
Other values (27) | 36 |
Geometric Shapes
Value | Count | Frequency (%) |
○ | 4 |
Punctuation
Value | Count | Frequency (%) |
’ | 2 |
저자
Text
MISSING
Distinct | 4355 |
---|---|
Distinct (%) | 46.2% |
Missing | 574 |
Missing (%) | 5.7% |
Memory size | 156.2 KiB |
Length
Max length | 99 |
---|---|
Median length | 66 |
Mean length | 11.488861 |
Min length | 2 |
Characters and Unicode
Total characters | 108294 |
---|---|
Distinct characters | 1568 |
Distinct categories | 11 |
Distinct scripts | 6 |
Distinct blocks | 8 |
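The mean length reported for 저자 appears to be taken over non-missing values only. Dividing the total character count by the 9426 non-missing rows (10000 minus the 574 missing) reproduces it, while dividing by all 10000 rows does not. A small check, using only figures from this section:

```python
# Figures copied from the 저자 tables above.
total_characters = 108_294
n_observations = 10_000
n_missing = 574

non_missing = n_observations - n_missing
mean_over_non_missing = total_characters / non_missing
mean_over_all_rows = total_characters / n_observations

# Compare the first value with the reported mean length.
print(f"{mean_over_non_missing:.6f}")
print(f"{mean_over_all_rows:.4f}")
```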
Unique
Unique | 3434 |
---|---|
Unique (%) | 36.4% |
Sample
1st row | 하상복 지음 |
---|---|
2nd row | 韓國精神文化硏究院 |
3rd row | 정약용(丁若鏞) |
4th row | 서울大學校 國史學科 |
5th row | 韓國精神文化硏究院 |
Value | Count | Frequency (%) |
편 | 1310 | 6.0% |
지음 | 1027 | 4.7% |
서울특별시 | 692 | 3.2% |
(blank) | 594 | 2.7% |
編 | 447 | 2.1% |
옮김 | 355 | 1.6% |
서울特別市 | 306 | 1.4% |
민족문화추진회 | 302 | 1.4% |
저 | 229 | 1.1% |
國史編纂委員會 | 227 | 1.0% |
Other values (6219) | 16242 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 13371 | 12.3% |
; | 2369 | 2.2% |
편 | 2069 | 1.9% |
사 | 2001 | 1.8% |
서 | 1887 | 1.7% |
울 | 1740 | 1.6% |
회 | 1616 | 1.5% |
원 | 1416 | 1.3% |
지 | 1323 | 1.2% |
김 | 1309 | 1.2% |
Other values (1558) | 79193 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 85734 | |
Space Separator | 13371 | 12.3% |
Other Punctuation | 2844 | 2.6% |
Lowercase Letter | 1820 | 1.7% |
Open Punctuation | 1562 | 1.4% |
Close Punctuation | 1562 | 1.4% |
Uppercase Letter | 1114 | 1.0% |
Decimal Number | 242 | 0.2% |
Dash Punctuation | 43 | < 0.1% |
Control | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
편 | 2069 | 2.4% |
사 | 2001 | 2.3% |
서 | 1887 | 2.2% |
울 | 1740 | 2.0% |
회 | 1616 | 1.9% |
원 | 1416 | 1.7% |
지 | 1323 | 1.5% |
김 | 1309 | 1.5% |
국 | 1239 | 1.4% |
시 | 1214 | 1.4% |
Other values (1482) | 69920 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 199 | |
B | 165 | |
K | 163 | |
A | 68 | 6.1% |
C | 57 | 5.1% |
T | 51 | 4.6% |
R | 48 | 4.3% |
M | 47 | 4.2% |
E | 46 | 4.1% |
O | 41 | 3.7% |
Other values (15) | 229 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 259 | |
o | 175 | |
n | 172 | |
i | 163 | |
a | 158 | 8.7% |
t | 144 | 7.9% |
r | 128 | 7.0% |
d | 75 | 4.1% |
u | 72 | 4.0% |
s | 72 | 4.0% |
Other values (14) | 402 |
Decimal Number
Value | Count | Frequency (%) |
0 | 63 | |
1 | 34 | |
2 | 30 | |
5 | 25 | 10.3% |
3 | 20 | 8.3% |
6 | 17 | 7.0% |
8 | 17 | 7.0% |
7 | 16 | 6.6% |
4 | 13 | 5.4% |
9 | 7 | 2.9% |
Other Punctuation
Value | Count | Frequency (%) |
; | 2369 | |
. | 211 | 7.4% |
: | 140 | 4.9% |
? | 83 | 2.9% |
& | 17 | 0.6% |
# | 16 | 0.6% |
' | 6 | 0.2% |
: | 1 | < 0.1% |
/ | 1 | < 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
[ | 1223 | |
( | 339 | 21.7% |
Close Punctuation
Value | Count | Frequency (%) |
] | 1222 | |
) | 340 | 21.8% |
Space Separator
Value | Count | Frequency (%) |
(space) | 13371 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 43 |
Control
Value | Count | Frequency (%) |
(control character) | 1 |
Other Symbol
Value | Count | Frequency (%) |
㈜ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 61729 | |
Han | 23913 | 22.1% |
Common | 19625 | 18.1% |
Latin | 2934 | 2.7% |
Katakana | 75 | 0.1% |
Hiragana | 18 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
學 | 1157 | 4.8% |
編 | 1035 | 4.3% |
國 | 960 | 4.0% |
會 | 959 | 4.0% |
硏 | 760 | 3.2% |
究 | 759 | 3.2% |
大 | 705 | 2.9% |
史 | 701 | 2.9% |
文 | 638 | 2.7% |
市 | 579 | 2.4% |
Other values (884) | 15660 |
Hangul
Value | Count | Frequency (%) |
편 | 2069 | 3.4% |
사 | 2001 | 3.2% |
서 | 1887 | 3.1% |
울 | 1740 | 2.8% |
회 | 1616 | 2.6% |
원 | 1416 | 2.3% |
지 | 1323 | 2.1% |
김 | 1309 | 2.1% |
국 | 1239 | 2.0% |
시 | 1214 | 2.0% |
Other values (573) | 45915 |
Latin
Value | Count | Frequency (%) |
e | 259 | 8.8% |
S | 199 | 6.8% |
o | 175 | 6.0% |
n | 172 | 5.9% |
B | 165 | 5.6% |
K | 163 | 5.6% |
i | 163 | 5.6% |
a | 158 | 5.4% |
t | 144 | 4.9% |
r | 128 | 4.4% |
Other values (39) | 1208 |
Common
Value | Count | Frequency (%) |
(space) | 13371 | |
; | 2369 | 12.1% |
[ | 1223 | 6.2% |
] | 1222 | 6.2% |
) | 340 | 1.7% |
( | 339 | 1.7% |
. | 211 | 1.1% |
: | 140 | 0.7% |
? | 83 | 0.4% |
0 | 63 | 0.3% |
Other values (16) | 264 | 1.3% |
Katakana
Value | Count | Frequency (%) |
ン | 20 | |
タ | 20 | |
セ | 20 | |
ソ | 4 | 5.3% |
ル | 3 | 4.0% |
ツ | 2 | 2.7% |
ゥ | 2 | 2.7% |
エ | 1 | 1.3% |
シ | 1 | 1.3% |
ア | 1 | 1.3% |
Hiragana
Value | Count | Frequency (%) |
ゆ | 4 | |
ま | 4 | |
に | 4 | |
さ | 3 | |
ん | 3 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 61701 | |
CJK | 23655 | 21.8% |
ASCII | 22557 | 20.8% |
CJK Compat Ideographs | 258 | 0.2% |
Katakana | 75 | 0.1% |
Compat Jamo | 27 | < 0.1% |
Hiragana | 18 | < 0.1% |
None | 3 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 13371 | |
; | 2369 | 10.5% |
[ | 1223 | 5.4% |
] | 1222 | 5.4% |
) | 340 | 1.5% |
( | 339 | 1.5% |
e | 259 | 1.1% |
. | 211 | 0.9% |
S | 199 | 0.9% |
o | 175 | 0.8% |
Other values (63) | 2849 | 12.6% |
Hangul
Value | Count | Frequency (%) |
편 | 2069 | 3.4% |
사 | 2001 | 3.2% |
서 | 1887 | 3.1% |
울 | 1740 | 2.8% |
회 | 1616 | 2.6% |
원 | 1416 | 2.3% |
지 | 1323 | 2.1% |
김 | 1309 | 2.1% |
국 | 1239 | 2.0% |
시 | 1214 | 2.0% |
Other values (571) | 45887 |
CJK
Value | Count | Frequency (%) |
學 | 1157 | 4.9% |
編 | 1035 | 4.4% |
國 | 960 | 4.1% |
會 | 959 | 4.1% |
硏 | 760 | 3.2% |
究 | 759 | 3.2% |
大 | 705 | 3.0% |
史 | 701 | 3.0% |
文 | 638 | 2.7% |
市 | 579 | 2.4% |
Other values (845) | 15402 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 100 | |
歷 | 23 | 8.9% |
女 | 13 | 5.0% |
嶺 | 13 | 5.0% |
梨 | 11 | 4.3% |
年 | 8 | 3.1% |
聯 | 8 | 3.1% |
龍 | 7 | 2.7% |
柳 | 6 | 2.3% |
羅 | 6 | 2.3% |
Other values (29) | 63 |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 27 |
Katakana
Value | Count | Frequency (%) |
ン | 20 | |
タ | 20 | |
セ | 20 | |
ソ | 4 | 5.3% |
ル | 3 | 4.0% |
ツ | 2 | 2.7% |
ゥ | 2 | 2.7% |
エ | 1 | 1.3% |
シ | 1 | 1.3% |
ア | 1 | 1.3% |
Hiragana
Value | Count | Frequency (%) |
ゆ | 4 | |
ま | 4 | |
に | 4 | |
さ | 3 | |
ん | 3 |
None
Value | Count | Frequency (%) |
: | 1 | |
/ | 1 | |
㈜ | 1 |
출판사
Text
Distinct | 2128 |
---|---|
Distinct (%) | 21.5% |
Missing | 99 |
Missing (%) | 1.0% |
Memory size | 156.2 KiB |
Length
Max length | 66 |
---|---|
Median length | 47 |
Mean length | 6.8715281 |
Min length | 1 |
Characters and Unicode
Total characters | 68035 |
---|---|
Distinct characters | 1003 |
Distinct categories | 10 |
Distinct scripts | 7 |
Distinct blocks | 9 |
Unique
Unique | 1259 |
---|---|
Unique (%) | 12.7% |
Sample
1st row | 모티브북 |
---|---|
2nd row | 한국정신문화연구원 |
3rd row | 아름출판사 |
4th row | 서울大學校 國史學科 |
5th row | 한국정신문화연구원 |
Value | Count | Frequency (%) |
서울특별시 | 1075 | 9.3% |
민족문화추진회 | 503 | 4.3% |
국사편찬위원회 | 350 | 3.0% |
세종대왕기념사업회 | 262 | 2.3% |
경인문화사 | 173 | 1.5% |
서울역사편찬원 | 122 | 1.0% |
한국고전번역원 | 114 | 1.0% |
홍보담당관 | 106 | 0.9% |
서울특별시사편찬위원회 | 99 | 0.9% |
민속원 | 95 | 0.8% |
Other values (2239) | 8721 |
Most occurring characters
Value | Count | Frequency (%) |
사 | 3253 | 4.8% |
서 | 2171 | 3.2% |
울 | 2011 | 3.0% |
회 | 1989 | 2.9% |
원 | 1838 | 2.7% |
문 | 1762 | 2.6% |
(space) | 1723 | 2.5% |
시 | 1652 | 2.4% |
국 | 1609 | 2.4% |
화 | 1475 | 2.2% |
Other values (993) | 48552 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 63885 | |
Space Separator | 1723 | 2.5% |
Lowercase Letter | 1121 | 1.6% |
Uppercase Letter | 733 | 1.1% |
Other Punctuation | 263 | 0.4% |
Decimal Number | 131 | 0.2% |
Open Punctuation | 86 | 0.1% |
Close Punctuation | 85 | 0.1% |
Dash Punctuation | 6 | < 0.1% |
Other Symbol | 2 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
사 | 3253 | 5.1% |
서 | 2171 | 3.4% |
울 | 2011 | 3.1% |
회 | 1989 | 3.1% |
원 | 1838 | 2.9% |
문 | 1762 | 2.8% |
시 | 1652 | 2.6% |
국 | 1609 | 2.5% |
화 | 1475 | 2.3% |
별 | 1233 | 1.9% |
Other values (925) | 44892 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 175 | |
B | 173 | |
K | 152 | |
M | 56 | 7.6% |
C | 26 | 3.5% |
T | 21 | 2.9% |
G | 15 | 2.0% |
E | 13 | 1.8% |
U | 12 | 1.6% |
V | 11 | 1.5% |
Other values (13) | 79 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 140 | |
o | 136 | |
i | 121 | |
a | 99 | |
t | 95 | |
n | 87 | |
r | 77 | 6.9% |
s | 69 | 6.2% |
d | 48 | 4.3% |
u | 44 | 3.9% |
Other values (12) | 205 |
Decimal Number
Value | Count | Frequency (%) |
1 | 29 | |
0 | 27 | |
5 | 21 | |
8 | 18 | |
2 | 16 | |
6 | 8 | 6.1% |
3 | 3 | 2.3% |
9 | 3 | 2.3% |
4 | 3 | 2.3% |
7 | 3 | 2.3% |
Other Punctuation
Value | Count | Frequency (%) |
. | 87 | |
; | 78 | |
: | 44 | |
? | 41 | |
& | 9 | 3.4% |
# | 4 | 1.5% |
Open Punctuation
Value | Count | Frequency (%) |
( | 47 | |
[ | 39 |
Close Punctuation
Value | Count | Frequency (%) |
) | 46 | |
] | 39 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1723 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 |
Other Symbol
Value | Count | Frequency (%) |
㈜ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 53395 | |
Han | 10427 | 15.3% |
Common | 2294 | 3.4% |
Latin | 1853 | 2.7% |
Katakana | 40 | 0.1% |
Hiragana | 25 | < 0.1% |
Cyrillic | 1 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
사 | 3253 | 6.1% |
서 | 2171 | 4.1% |
울 | 2011 | 3.8% |
회 | 1989 | 3.7% |
원 | 1838 | 3.4% |
문 | 1762 | 3.3% |
시 | 1652 | 3.1% |
국 | 1609 | 3.0% |
화 | 1475 | 2.8% |
별 | 1233 | 2.3% |
Other values (451) | 34402 |
Han
Value | Count | Frequency (%) |
學 | 627 | 6.0% |
文 | 492 | 4.7% |
國 | 471 | 4.5% |
大 | 461 | 4.4% |
會 | 435 | 4.2% |
化 | 389 | 3.7% |
究 | 366 | 3.5% |
硏 | 365 | 3.5% |
社 | 336 | 3.2% |
史 | 319 | 3.1% |
Other values (442) | 6166 |
Latin
Value | Count | Frequency (%) |
S | 175 | 9.4% |
B | 173 | 9.3% |
K | 152 | 8.2% |
e | 140 | 7.6% |
o | 136 | 7.3% |
i | 121 | 6.5% |
a | 99 | 5.3% |
t | 95 | 5.1% |
n | 87 | 4.7% |
r | 77 | 4.2% |
Other values (34) | 598 |
Common
Value | Count | Frequency (%) |
(space) | 1723 | |
. | 87 | 3.8% |
; | 78 | 3.4% |
( | 47 | 2.0% |
) | 46 | 2.0% |
: | 44 | 1.9% |
? | 41 | 1.8% |
] | 39 | 1.7% |
[ | 39 | 1.7% |
1 | 29 | 1.3% |
Other values (12) | 121 | 5.3% |
Katakana
Value | Count | Frequency (%) |
ア | 6 | |
ル | 5 | |
ジ | 3 | 7.5% |
ソ | 3 | 7.5% |
タ | 2 | 5.0% |
ン | 2 | 5.0% |
セ | 2 | 5.0% |
ゥ | 2 | 5.0% |
ヴ | 2 | 5.0% |
ネ | 2 | 5.0% |
Other values (9) | 11 |
Hiragana
Value | Count | Frequency (%) |
ま | 8 | |
ゆ | 8 | |
に | 8 | |
の | 1 | 4.0% |
Cyrillic
Value | Count | Frequency (%) |
Щ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 53377 | |
CJK | 10343 | 15.2% |
ASCII | 4147 | 6.1% |
CJK Compat Ideographs | 84 | 0.1% |
Katakana | 40 | 0.1% |
Hiragana | 25 | < 0.1% |
Compat Jamo | 16 | < 0.1% |
None | 2 | < 0.1% |
Cyrillic | 1 | < 0.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
사 | 3253 | 6.1% |
서 | 2171 | 4.1% |
울 | 2011 | 3.8% |
회 | 1989 | 3.7% |
원 | 1838 | 3.4% |
문 | 1762 | 3.3% |
시 | 1652 | 3.1% |
국 | 1609 | 3.0% |
화 | 1475 | 2.8% |
별 | 1233 | 2.3% |
Other values (449) | 34384 |
ASCII
Value | Count | Frequency (%) |
(space) | 1723 | |
S | 175 | 4.2% |
B | 173 | 4.2% |
K | 152 | 3.7% |
e | 140 | 3.4% |
o | 136 | 3.3% |
i | 121 | 2.9% |
a | 99 | 2.4% |
t | 95 | 2.3% |
. | 87 | 2.1% |
Other values (56) | 1246 |
CJK
Value | Count | Frequency (%) |
學 | 627 | 6.1% |
文 | 492 | 4.8% |
國 | 471 | 4.6% |
大 | 461 | 4.5% |
會 | 435 | 4.2% |
化 | 389 | 3.8% |
究 | 366 | 3.5% |
硏 | 365 | 3.5% |
社 | 336 | 3.2% |
史 | 319 | 3.1% |
Other values (426) | 6082 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
歷 | 35 | |
嶺 | 11 | 13.1% |
女 | 10 | 11.9% |
李 | 5 | 6.0% |
梨 | 5 | 6.0% |
臨 | 4 | 4.8% |
龍 | 3 | 3.6% |
聯 | 2 | 2.4% |
茶 | 2 | 2.4% |
蘆 | 1 | 1.2% |
Other values (6) | 6 | 7.1% |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 16 |
Hiragana
Value | Count | Frequency (%) |
ま | 8 | |
ゆ | 8 | |
に | 8 | |
の | 1 | 4.0% |
Katakana
Value | Count | Frequency (%) |
ア | 6 | |
ル | 5 | |
ジ | 3 | 7.5% |
ソ | 3 | 7.5% |
タ | 2 | 5.0% |
ン | 2 | 5.0% |
セ | 2 | 5.0% |
ゥ | 2 | 5.0% |
ヴ | 2 | 5.0% |
ネ | 2 | 5.0% |
Other values (9) | 11 |
None
Value | Count | Frequency (%) |
㈜ | 2 |
Cyrillic
Value | Count | Frequency (%) |
Щ | 1 |
출판일
Text
Distinct | 493 |
---|---|
Distinct (%) | 5.0% |
Missing | 77 |
Missing (%) | 0.8% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
2010 | 299 | 2.8% |
1995 | 288 | 2.7% |
2017 | 279 | 2.6% |
2016 | 271 | 2.6% |
2009 | 270 | 2.5% |
1994 | 270 | 2.5% |
2005 | 261 | 2.5% |
1993 | 261 | 2.5% |
2007 | 254 | 2.4% |
2006 | 254 | 2.4% |
Other values (412) | 7908 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 9009 | |
9 | 8519 | |
1 | 8375 | |
2 | 6133 | |
8 | 2390 | 5.3% |
7 | 1807 | 4.0% |
6 | 1342 | 3.0% |
5 | 1158 | 2.6% |
3 | 1063 | 2.4% |
4 | 1059 | 2.3% |
Other values (163) | 4305 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 40855 | |
Other Letter | 2058 | 4.6% |
Space Separator | 823 | 1.8% |
Other Punctuation | 472 | 1.0% |
Dash Punctuation | 393 | 0.9% |
Open Punctuation | 160 | 0.4% |
Close Punctuation | 143 | 0.3% |
Uppercase Letter | 125 | 0.3% |
Lowercase Letter | 81 | 0.2% |
Math Symbol | 50 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
유 | 140 | 6.8% |
성 | 138 | 6.7% |
천 | 117 | 5.7% |
연 | 111 | 5.4% |
색 | 110 | 5.3% |
年 | 106 | 5.2% |
불 | 97 | 4.7% |
명 | 97 | 4.7% |
月 | 73 | 3.5% |
년 | 65 | 3.2% |
Other values (131) | 1004 |
Decimal Number
Value | Count | Frequency (%) |
0 | 9009 | |
9 | 8519 | |
1 | 8375 | |
2 | 6133 | |
8 | 2390 | 5.8% |
7 | 1807 | 4.4% |
6 | 1342 | 3.3% |
5 | 1158 | 2.8% |
3 | 1063 | 2.6% |
4 | 1059 | 2.6% |
Lowercase Letter
Value | Count | Frequency (%) |
r | 16 | |
e | 16 | |
a | 16 | |
s | 16 | |
t | 16 | |
m | 1 | 1.2% |
Other Punctuation
Value | Count | Frequency (%) |
. | 237 | |
: | 139 | |
; | 44 | 9.3% |
? | 27 | 5.7% |
/ | 25 | 5.3% |
Uppercase Letter
Value | Count | Frequency (%) |
D | 65 | |
C | 25 | 20.0% |
V | 20 | 16.0% |
M | 15 | 12.0% |
Close Punctuation
Value | Count | Frequency (%) |
] | 118 | |
) | 25 | 17.5% |
Open Punctuation
Value | Count | Frequency (%) |
[ | 118 | |
( | 42 | 26.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 823 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 393 |
Math Symbol
Value | Count | Frequency (%) |
~ | 50 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 42896 | |
Hangul | 1454 | 3.2% |
Han | 604 | 1.3% |
Latin | 206 | 0.5% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
유 | 140 | 9.6% |
성 | 138 | 9.5% |
천 | 117 | 8.0% |
연 | 111 | 7.6% |
색 | 110 | 7.6% |
불 | 97 | 6.7% |
명 | 97 | 6.7% |
년 | 65 | 4.5% |
매 | 58 | 4.0% |
월 | 40 | 2.8% |
Other values (82) | 481 | 33.1% |
Han
Value | Count | Frequency (%) |
年 | 106 | 17.5% |
月 | 73 | 12.1% |
報 | 64 | 10.6% |
日 | 64 | 10.6% |
東 | 42 | 7.0% |
亞 | 42 | 7.0% |
鮮 | 36 | 6.0% |
朝 | 36 | 6.0% |
昭 | 26 | 4.3% |
和 | 26 | 4.3% |
Other values (39) | 89 | 14.7% |
Common
Value | Count | Frequency (%) |
0 | 9009 | 21.0% |
9 | 8519 | 19.9% |
1 | 8375 | 19.5% |
2 | 6133 | 14.3% |
8 | 2390 | 5.6% |
7 | 1807 | 4.2% |
6 | 1342 | 3.1% |
5 | 1158 | 2.7% |
3 | 1063 | 2.5% |
4 | 1059 | 2.5% |
Other values (12) | 2041 | 4.8% |
Latin
Value | Count | Frequency (%) |
D | 65 | 31.6% |
C | 25 | 12.1% |
V | 20 | 9.7% |
r | 16 | 7.8% |
e | 16 | 7.8% |
a | 16 | 7.8% |
s | 16 | 7.8% |
t | 16 | 7.8% |
M | 15 | 7.3% |
m | 1 | 0.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 43102 | 95.4% |
Hangul | 1454 | 3.2% |
CJK | 603 | 1.3% |
CJK Compat Ideographs | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 9009 | 20.9% |
9 | 8519 | 19.8% |
1 | 8375 | 19.4% |
2 | 6133 | 14.2% |
8 | 2390 | 5.5% |
7 | 1807 | 4.2% |
6 | 1342 | 3.1% |
5 | 1158 | 2.7% |
3 | 1063 | 2.5% |
4 | 1059 | 2.5% |
Other values (22) | 2247 | 5.2% |
Hangul
Value | Count | Frequency (%) |
유 | 140 | 9.6% |
성 | 138 | 9.5% |
천 | 117 | 8.0% |
연 | 111 | 7.6% |
색 | 110 | 7.6% |
불 | 97 | 6.7% |
명 | 97 | 6.7% |
년 | 65 | 4.5% |
매 | 58 | 4.0% |
월 | 40 | 2.8% |
Other values (82) | 481 | 33.1% |
CJK
Value | Count | Frequency (%) |
年 | 106 | 17.6% |
月 | 73 | 12.1% |
報 | 64 | 10.6% |
日 | 64 | 10.6% |
東 | 42 | 7.0% |
亞 | 42 | 7.0% |
鮮 | 36 | 6.0% |
朝 | 36 | 6.0% |
昭 | 26 | 4.3% |
和 | 26 | 4.3% |
Other values (38) | 88 | 14.6% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 1 | 100.0% |
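The per-category character counts above can be approximated with Python's standard `unicodedata` module. The sketch below is not the profiler's own code: the function name `category_counts` and the hand-written `CATEGORY_NAMES` mapping are assumptions made for illustration, and the three input strings are 출판일 values taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes that appear in this
# report. The mapping is written out by hand and is deliberately incomplete.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three 출판일 values copied from the sample rows of this report.
sample = ["2014", "1981", "2002.: 유성천연색 -"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(f"{name} | {count} | {count / sum(counts.values()):.1%} |")
```

Dividing each count by the total number of characters gives the Frequency (%) column; on the full column that total is 45160 characters.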
배가위치코드
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1 | 9865 |
---|---|
<NA> | 135 |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.0405 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 9865 | 98.65% |
<NA> | 135 | 1.4% |
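The Length figures for this variable are consistent with the placeholder `<NA>` being measured as an ordinary 4-character string. That reading is an inference from Max length = 4, not something the report states. A minimal check, using the counts from the Common Values table (the name `value_counts` is chosen only for this sketch):

```python
# Counts copied from the Common Values table for 배가위치코드.
value_counts = {"1": 9865, "<NA>": 135}

total = sum(value_counts.values())

# Weight each value's string length by how often the value occurs.
mean_length = sum(len(value) * n for value, n in value_counts.items()) / total
max_length = max(len(value) for value in value_counts)
min_length = min(len(value) for value in value_counts)

print(mean_length, max_length, min_length)  # prints: 1.0405 4 1
```

The same arithmetic with `서울역사자료실` (7 characters) in place of `1` gives the mean length of 6.9595 reported for 배가위치명.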
배가위치명
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
서울역사자료실 | 9865 |
---|---|
<NA> | 135 |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 6.9595 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 서울역사자료실 |
---|---|
2nd row | 서울역사자료실 |
3rd row | 서울역사자료실 |
4th row | 서울역사자료실 |
5th row | 서울역사자료실 |
Common Values
Value | Count | Frequency (%) |
서울역사자료실 | 9865 | 98.65% |
<NA> | 135 | 1.4% |
언어명
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
한국어 | 8555 |
---|---|
<NA> | 1257 |
일본어 | 120 |
영어 | 41 |
중국어 | 24 |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 3.1219 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 한국어 |
---|---|
2nd row | 한국어 |
3rd row | 한국어 |
4th row | 한국어 |
5th row | 한국어 |
Common Values
Value | Count | Frequency (%) |
한국어 | 8555 | 85.55% |
<NA> | 1257 | 12.6% |
일본어 | 120 | 1.2% |
영어 | 41 | 0.4% |
중국어 | 24 | 0.2% |
러시아어 | 3 | < 0.1% |
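The IMBALANCE flag on the categorical variables can be checked from the Common Values counts. The sketch below assumes the score is one minus the Shannon entropy normalised by the maximum entropy for the number of distinct values. That formula is an assumption rather than something stated in this report, although it does reproduce the 89.7% and 72.8% quoted in the alerts. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 minus Shannon entropy normalised by log2(number of classes)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        # Convention chosen for this sketch: a single class scores 0.
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))


# Counts copied from the Common Values tables.
location = imbalance_score([9865, 135])                    # 배가위치코드 / 배가위치명
language = imbalance_score([8555, 1257, 120, 41, 24, 3])   # 언어명

print(f"{location:.1%} {language:.1%}")  # prints: 89.7% 72.8%
```

A score near 100% means almost all rows share a single value; here the `<NA>` placeholder counts as a category of its own, which is why Missing is reported as 0 for these variables.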
| | 자료번호 | 서지번호 | 언어명 |
---|---|---|---|
자료번호 | 1.000 | 0.994 | 0.169 |
서지번호 | 0.994 | 1.000 | 0.165 |
언어명 | 0.169 | 0.165 | 1.000 |
| | 배가위치코드 | 언어명 | 배가위치명 |
---|---|---|---|
배가위치코드 | 1.000 | 1.000 | 1.000 |
언어명 | 1.000 | 1.000 | 1.000 |
배가위치명 | 1.000 | 1.000 | 1.000 |
| | 자료번호 | 서지번호 | 배가위치코드 | 배가위치명 | 언어명 |
---|---|---|---|---|---|
자료번호 | 1.000 | 0.977 | 1.000 | 1.000 | 0.098 |
서지번호 | 0.977 | 1.000 | 1.000 | 1.000 | 0.096 |
배가위치코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
배가위치명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
언어명 | 0.098 | 0.096 | 1.000 | 1.000 | 1.000 |
| | 자료번호 | 청구기호 | 서지번호 | 서명 | 저자 | 출판사 | 출판일 | 배가위치코드 | 배가위치명 | 언어명 |
---|---|---|---|---|---|---|---|---|---|---|
28739 | 48924 | 340.9 하195ㅈ | 45868 | 죽은 자의 정치학 : 프랑스?미국?한국 국립묘지의 탄생과 진화 | 하상복 지음 | 모티브북 | 2014 | 1 | 서울역사자료실 | 한국어 |
4566 | 30717 | 818.08 한257한 v.8-4 | 29494 | 韓國口碑文學大系 8-4:慶尙南道 晋州市 晋陽郡篇(2) | 韓國精神文化硏究院 | 한국정신문화연구원 | 1981 | 1 | 서울역사자료실 | 한국어 |
5840 | 30373 | 810.81 정428여 v.17 | 29150 | 與猶堂全集 17:政法集 第23~29卷 | 정약용(丁若鏞) | 아름출판사 | 1995 | 1 | 서울역사자료실 | 한국어 |
17609 | 37354 | P 911.005 서272 v.35 | 36131 | 韓國史論 35 | 서울大學校 國史學科 | 서울大學校 國史學科 | 1995 | 1 | 서울역사자료실 | 한국어 |
9528 | 33917 | 911.0091 한257고 v.23 | 32694 | 古文書集成 23:居昌 草溪鄭氏篇 | 韓國精神文化硏究院 | 한국정신문화연구원 | 1995 | 1 | 서울역사자료실 | 한국어 |
15213 | 39638 | 911.6 홍635ㅇ v.321 | 38415 | 안녕하세요 서울입니다 v.321 | 서울특별시 홍보담당관 | 서울특별시 홍보담당관 | 2002.: 유성천연색 - | 1 | 서울역사자료실 | 한국어 |
12154 | 31092 | 911.06302 국513 v.28 | 29869 | 韓民族獨立運動史資料集 28 ;의열투쟁 1 | 국사편찬위원회 | 국사편찬위원회 | 1996 | 1 | 서울역사자료실 | 한국어 |
28210 | 51753 | 070.434 신682ㅌ | 48654 | 특종 1987 :박종철과 한국 민주화 | 신성호 지음 | 중앙books :중앙일보플러스 | 2017 | 1 | 서울역사자료실 | <NA> |
18405 | 45001 | 911.78 대51ㅁ C.2 | 42009 | 沔川邑城 精密地表調査報告書 | 大田産業大學校 鄕土文化硏究所 편 | 唐津郡 | 1999 | 1 | 서울역사자료실 | 한국어 |
14884 | 39040 | AV(CD) 359.05 서272회 v.6 C.3 | 37817 | 서울특별시의회 회의록[컴퓨터파일] 제6대(2002.7.1~2004.11.19) | 서울特別市議會 | 서울特別市議會 | 2003 | 1 | 서울역사자료실 | 한국어 |
| | 자료번호 | 청구기호 | 서지번호 | 서명 | 저자 | 출판사 | 출판일 | 배가위치코드 | 배가위치명 | 언어명 |
---|---|---|---|---|---|---|---|---|---|---|
17873 | 43840 | 679.1 박67ㄱ | 40854 | 고려사 악지의 당악연구: 高麗史』樂志의 唐樂硏究 = (A)comparative study of Sa-ak in Goryeosa Akji and | 박은옥 | 민속원 | 2006 | <NA> | <NA> | 한국어 |
9567 | 34526 | P 906 역337 v.121 | 33303 | 歷史學報 第121輯 | <NA> | 歷史學會 | 1989 | 1 | 서울역사자료실 | 한국어 |
954 | 22201 | 326.3352 한239서 v.1 | 20978 | 서울市內버스 路線別 交通量 調査 | 서울特別市 ;韓國科學技術硏究所 | 서울특별시 | 1975 | 1 | 서울역사자료실 | 한국어 |
7254 | 26201 | 911.05 사174 v.116 | 24978 | 이조실록 116:중종공희대왕실록 (5년4월-5년12월) | (평양)사회과학원 민족고전연구소 번역 사회과학원 민족고전연구소 | 여강출판사 | 1993 | 1 | 서울역사자료실 | 한국어 |
19117 | 44516 | 600.15 서66ㅅ | 41489 | 서울시 문화지표 설정 및 측정 연구= (A)study on the development and measurement of cultural indicator | 서울시정개발연구원 [편]장영희 이기현 신경희 전기택 | 서울시정연구원 | 1996 | 1 | 서울역사자료실 | 한국어 |
5126 | 30203 | 259.3 대51삼 c.2 | 28980 | 삼일신고(譯解三一神誥) | 대종교 총본사 | 대종교 총본사 | 1949 | 1 | 서울역사자료실 | 한국어 |
31397 | 54880 | 911.1 안478ㅂ | 51790 | 북한 민중사 | 안문석 지음 | 일조각 | 2020 | 1 | 서울역사자료실 | <NA> |
29856 | 53318 | 334.4 서789ㄲ 2 | 50224 | 끌려가다 버려지다 우리 앞에 서다 ;사진과 자료로 보는 일본군 '위안부' 피해 여성 이야기 ^2 | 서울대 인권센터 정진성 연구팀 지음 | 푸른역사 | 2018 | 1 | 서울역사자료실 | <NA> |
28150 | 52446 | 600.15 강256문 2014 | 49350 | (2014년도) 문화재수리보고서 :도지정문화재 | 江原道 [편] | 강원도 | 2017 | 1 | 서울역사자료실 | <NA> |
23264 | 48778 | 시사 911.6 김721조 | 45721 | 조선왕릉의 멋 헌인릉 | 김웅호 | 시사편찬위원회 | <NA> | 1 | 서울역사자료실 | 한국어 |