Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 10000 |
Missing cells | 102 |
Missing cells (%) | 0.3% |
Duplicate rows | 355 |
Duplicate rows (%) | 3.5% |
Total size in memory | 400.4 KiB |
Average record size in memory | 41.0 B |
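The percentages and the average record size in the table above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB = 1024 B. The helper name `dataset_summary` is hypothetical and is not part of ydata-profiling:

```python
def dataset_summary(n_rows, n_vars, missing_cells, duplicate_rows, memory_kib):
    """Recompute the derived overview figures from the raw counts."""
    total_cells = n_rows * n_vars
    return {
        # Share of empty cells over the whole table (assumed denominator).
        "missing_cells_pct": 100 * missing_cells / total_cells,
        # Share of rows reported as duplicates.
        "duplicate_rows_pct": 100 * duplicate_rows / n_rows,
        # Memory footprint per observation, in bytes.
        "avg_record_bytes": memory_kib * 1024 / n_rows,
    }


summary = dataset_summary(
    n_rows=10000, n_vars=4, missing_cells=102, duplicate_rows=355, memory_kib=400.4
)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

With the values reported here this gives about 0.255%, 3.55% and 41.0 B, in line with the rounded figures in the table.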
Variable types
Text | 3 |
---|---|
Numeric | 1 |
Dataset
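Description | Information on the books held by the Science and Technology Reference Room (과학기술자료실) of the Gwacheon National Science Museum. The dataset contains the following columns: 서명 (title), 저작자 (author), 발행자 (publisher), 발행년 (publication year). |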
---|---|
Author | Ministry of Science and ICT, Gwacheon National Science Museum (과학기술정보통신부 국립과천과학관) |
URL | https://www.data.go.kr/data/15024947/fileData.do |
Dataset has 355 (3.5%) duplicate rows | Duplicates |
Reproduction
Analysis started | 2024-04-21 15:39:33.822155 |
---|---|
Analysis finished | 2024-04-21 15:39:38.500182 |
Duration | 4.68 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
서명
Text
Distinct | 9292 |
---|---|
Distinct (%) | 92.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 176 |
---|---|
Median length | 115 |
Mean length | 20.9847 |
Min length | 1 |
Characters and Unicode
Total characters | 209847 |
---|---|
Distinct characters | 2561 |
Distinct categories | 15 |
Distinct scripts | 8 |
Distinct blocks | 14 |
Unique
Unique | 8761 |
---|---|
Unique (%) | 87.6% |
Sample
1st row | 기술가치평가 사례 : 기법과 적용 |
---|---|
2nd row | IL MUSEO NAZIONALE DELLA SCIENZA E DELLA TECNECA LEONARDO DAVINCI |
3rd row | 환경문제와 첨단기술 |
4th row | (우리 겨레는) 수학의 달인 : 경주로 떠나는 수학 여행 |
5th row | 石城南京 |
Value | Count | Frequency (%) |
(blank) | 3063 | 6.8% |
the | 540 | 1.2% |
of | 537 | 1.2% |
이야기 | 354 | 0.8% |
and | 319 | 0.7% |
과학 | 268 | 0.6% |
1 | 223 | 0.5% |
위한 | 191 | 0.4% |
science | 184 | 0.4% |
2 | 182 | 0.4% |
Other values (17878) | 39435 |
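The word table above reads like a count of lowercased, whitespace-separated tokens. A minimal sketch of that kind of count, assuming plain `str.lower().split()` tokenization; the report's own tokenizer evidently differs in some details, since it also lists a blank token. `word_counts` is a hypothetical helper, and the input is the five sample titles listed above:

```python
from collections import Counter


def word_counts(texts):
    """Count lowercased, whitespace-separated tokens across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


titles = [
    "기술가치평가 사례 : 기법과 적용",
    "IL MUSEO NAZIONALE DELLA SCIENZA E DELLA TECNECA LEONARDO DAVINCI",
    "환경문제와 첨단기술",
    "(우리 겨레는) 수학의 달인 : 경주로 떠나는 수학 여행",
    "石城南京",
]
counts = word_counts(titles)
print(counts.most_common(3))
```

Note that a standalone `:` counts as a token under this scheme, and that `수학의` and `수학` are counted separately because no morphological analysis is applied.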
Most occurring characters
Value | Count | Frequency (%) |
(space) | 36409 | 17.4% |
e | 4274 | 2.0% |
의 | 3532 | 1.7% |
o | 3415 | 1.6% |
학 | 3236 | 1.5% |
n | 3164 | 1.5% |
i | 3128 | 1.5% |
과 | 3066 | 1.5% |
: | 2837 | 1.4% |
a | 2723 | 1.3% |
Other values (2551) | 144063 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 110191 | |
Space Separator | 36409 | 17.4% |
Lowercase Letter | 35114 | 16.7% |
Uppercase Letter | 9536 | 4.5% |
Decimal Number | 8425 | 4.0% |
Other Punctuation | 5822 | 2.8% |
Close Punctuation | 1633 | 0.8% |
Open Punctuation | 1632 | 0.8% |
Dash Punctuation | 568 | 0.3% |
Math Symbol | 389 | 0.2% |
Other values (5) | 128 | 0.1% |
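The category labels above correspond to Unicode general categories. A minimal sketch of how such a tally can be produced with Python's standard `unicodedata` module; the mapping from two-letter codes to the labels used in this report, and the helper name `category_counts`, are assumptions made for illustration:

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the labels used in the report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(texts):
    """Tally characters by Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_LABELS.get(code, code)] += 1
    return counts


counts = category_counts(["환경문제와 첨단기술", "(2006) 백제의 공방 = Artifacts"])
for label, n in counts.most_common():
    print(label, n)
```

Hangul syllables and Han ideographs both fall under "Other Letter", which is why that category dominates a mostly Korean column.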
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
의 | 3532 | 3.2% |
학 | 3236 | 2.9% |
과 | 3066 | 2.8% |
기 | 2377 | 2.2% |
사 | 1915 | 1.7% |
이 | 1829 | 1.7% |
한 | 1742 | 1.6% |
국 | 1300 | 1.2% |
는 | 1264 | 1.1% |
지 | 1157 | 1.0% |
Other values (2407) | 88773 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 4274 | |
o | 3415 | |
n | 3164 | 9.0% |
i | 3128 | 8.9% |
a | 2723 | 7.8% |
t | 2560 | 7.3% |
r | 2368 | 6.7% |
s | 2133 | 6.1% |
l | 1527 | 4.3% |
c | 1498 | 4.3% |
Other values (29) | 8324 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 872 | 9.1% |
T | 865 | 9.1% |
S | 805 | 8.4% |
N | 685 | 7.2% |
A | 618 | 6.5% |
I | 601 | 6.3% |
C | 552 | 5.8% |
O | 464 | 4.9% |
R | 427 | 4.5% |
M | 394 | 4.1% |
Other values (27) | 3253 |
Other Punctuation
Value | Count | Frequency (%) |
: | 2837 | |
. | 1053 | 18.1% |
, | 975 | 16.7% |
· | 238 | 4.1% |
' | 198 | 3.4% |
? | 186 | 3.2% |
! | 113 | 1.9% |
/ | 69 | 1.2% |
& | 65 | 1.1% |
; | 32 | 0.5% |
Other values (9) | 56 | 1.0% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 40 | |
Ⅰ | 36 | |
Ⅹ | 14 | 12.1% |
Ⅲ | 10 | 8.6% |
Ⅳ | 5 | 4.3% |
Ⅵ | 3 | 2.6% |
Ⅸ | 3 | 2.6% |
Ⅴ | 2 | 1.7% |
Ⅶ | 1 | 0.9% |
Ⅷ | 1 | 0.9% |
Decimal Number
Value | Count | Frequency (%) |
1 | 1947 | |
0 | 1469 | |
2 | 1387 | |
9 | 774 | 9.2% |
3 | 705 | 8.4% |
4 | 508 | 6.0% |
5 | 479 | 5.7% |
6 | 418 | 5.0% |
7 | 373 | 4.4% |
8 | 365 | 4.3% |
Math Symbol
Value | Count | Frequency (%) |
= | 289 | |
~ | 44 | 11.3% |
< | 18 | 4.6% |
> | 18 | 4.6% |
+ | 12 | 3.1% |
∼ | 8 | 2.1% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1590 | |
] | 33 | 2.0% |
》 | 5 | 0.3% |
」 | 3 | 0.2% |
』 | 2 | 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1588 | |
[ | 34 | 2.1% |
《 | 5 | 0.3% |
「 | 3 | 0.2% |
『 | 2 | 0.1% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 566 | |
- | 1 | 0.2% |
― | 1 | 0.2% |
Modifier Symbol
Value | Count | Frequency (%) |
` | 2 | |
˘ | 1 | |
´ | 1 |
Other Number
Value | Count | Frequency (%) |
² | 2 | |
½ | 1 |
Other Symbol
Value | Count | Frequency (%) |
ⓔ | 2 | |
ⓝ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 36409 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 92431 | |
Common | 54890 | |
Latin | 44730 | |
Han | 17424 | 8.3% |
Hiragana | 215 | 0.1% |
Katakana | 121 | 0.1% |
Cyrillic | 34 | < 0.1% |
Greek | 2 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
學 | 635 | 3.6% |
國 | 528 | 3.0% |
史 | 507 | 2.9% |
韓 | 408 | 2.3% |
文 | 359 | 2.1% |
科 | 305 | 1.8% |
究 | 255 | 1.5% |
硏 | 255 | 1.5% |
古 | 254 | 1.5% |
報 | 244 | 1.4% |
Other values (1275) | 13674 |
Hangul
Value | Count | Frequency (%) |
의 | 3532 | 3.8% |
학 | 3236 | 3.5% |
과 | 3066 | 3.3% |
기 | 2377 | 2.6% |
사 | 1915 | 2.1% |
이 | 1829 | 2.0% |
한 | 1742 | 1.9% |
국 | 1300 | 1.4% |
는 | 1264 | 1.4% |
지 | 1157 | 1.3% |
Other values (1051) | 71013 |
Latin
Value | Count | Frequency (%) |
e | 4274 | 9.6% |
o | 3415 | 7.6% |
n | 3164 | 7.1% |
i | 3128 | 7.0% |
a | 2723 | 6.1% |
t | 2560 | 5.7% |
r | 2368 | 5.3% |
s | 2133 | 4.8% |
l | 1527 | 3.4% |
c | 1498 | 3.3% |
Other values (53) | 17940 |
Common
Value | Count | Frequency (%) |
(space) | 36409 | |
: | 2837 | 5.2% |
1 | 1947 | 3.5% |
) | 1590 | 2.9% |
( | 1588 | 2.9% |
0 | 1469 | 2.7% |
2 | 1387 | 2.5% |
. | 1053 | 1.9% |
, | 975 | 1.8% |
9 | 774 | 1.4% |
Other values (47) | 4861 | 8.9% |
Katakana
Value | Count | Frequency (%) |
ル | 18 | |
ジ | 16 | |
ナ | 15 | |
ャ | 15 | |
ア | 7 | 5.8% |
ロ | 5 | 4.1% |
ク | 4 | 3.3% |
ッ | 4 | 3.3% |
リ | 3 | 2.5% |
ヅ | 2 | 1.7% |
Other values (26) | 32 |
Hiragana
Value | Count | Frequency (%) |
の | 102 | |
と | 42 | |
か | 8 | 3.7% |
る | 6 | 2.8% |
ら | 5 | 2.3% |
に | 4 | 1.9% |
た | 4 | 1.9% |
な | 4 | 1.9% |
は | 4 | 1.9% |
し | 2 | 0.9% |
Other values (25) | 34 | 15.8% |
Cyrillic
Value | Count | Frequency (%) |
Н | 4 | 11.8% |
н | 3 | 8.8% |
о | 3 | 8.8% |
С | 2 | 5.9% |
Е | 2 | 5.9% |
В | 2 | 5.9% |
и | 2 | 5.9% |
а | 2 | 5.9% |
К | 1 | 2.9% |
И | 1 | 2.9% |
Other values (12) | 12 |
Greek
Value | Count | Frequency (%) |
Ι | 1 | |
π | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 99214 | |
Hangul | 92413 | |
CJK | 17055 | 8.1% |
CJK Compat Ideographs | 369 | 0.2% |
None | 275 | 0.1% |
Hiragana | 215 | 0.1% |
Katakana | 121 | 0.1% |
Number Forms | 116 | 0.1% |
Cyrillic | 34 | < 0.1% |
Compat Jamo | 18 | < 0.1% |
Other values (4) | 17 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 36409 | |
e | 4274 | 4.3% |
o | 3415 | 3.4% |
n | 3164 | 3.2% |
i | 3128 | 3.2% |
: | 2837 | 2.9% |
a | 2723 | 2.7% |
t | 2560 | 2.6% |
r | 2368 | 2.4% |
s | 2133 | 2.1% |
Other values (77) | 36203 |
Hangul
Value | Count | Frequency (%) |
의 | 3532 | 3.8% |
학 | 3236 | 3.5% |
과 | 3066 | 3.3% |
기 | 2377 | 2.6% |
사 | 1915 | 2.1% |
이 | 1829 | 2.0% |
한 | 1742 | 1.9% |
국 | 1300 | 1.4% |
는 | 1264 | 1.4% |
지 | 1157 | 1.3% |
Other values (1045) | 70995 |
CJK
Value | Count | Frequency (%) |
學 | 635 | 3.7% |
國 | 528 | 3.1% |
史 | 507 | 3.0% |
韓 | 408 | 2.4% |
文 | 359 | 2.1% |
科 | 305 | 1.8% |
究 | 255 | 1.5% |
硏 | 255 | 1.5% |
古 | 254 | 1.5% |
報 | 244 | 1.4% |
Other values (1216) | 13305 |
None
Value | Count | Frequency (%) |
· | 238 | |
《 | 5 | 1.8% |
》 | 5 | 1.8% |
「 | 3 | 1.1% |
」 | 3 | 1.1% |
& | 3 | 1.1% |
! | 2 | 0.7% |
² | 2 | 0.7% |
。 | 2 | 0.7% |
』 | 2 | 0.7% |
Other values (8) | 10 | 3.6% |
Hiragana
Value | Count | Frequency (%) |
の | 102 | |
と | 42 | |
か | 8 | 3.7% |
る | 6 | 2.8% |
ら | 5 | 2.3% |
に | 4 | 1.9% |
た | 4 | 1.9% |
な | 4 | 1.9% |
は | 4 | 1.9% |
し | 2 | 0.9% |
Other values (25) | 34 | 15.8% |
CJK Compat Ideographs
Value | Count | Frequency (%) |
歷 | 58 | |
論 | 47 | 12.7% |
年 | 31 | 8.4% |
金 | 22 | 6.0% |
李 | 22 | 6.0% |
老 | 20 | 5.4% |
林 | 14 | 3.8% |
理 | 13 | 3.5% |
龍 | 12 | 3.3% |
曆 | 11 | 3.0% |
Other values (49) | 119 |
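Characters in the CJK Compat Ideographs block are duplicate encodings of ideographs that also exist in the main CJK block, so one visible character can be counted under two code points. A minimal sketch of folding them before counting, assuming that merging is actually wanted for this column. It relies on Unicode normalization (NFC), which replaces compatibility ideographs with their unified counterparts; `fold_compat_ideographs` is a hypothetical helper and U+F98C is used only as an illustrative code point:

```python
import unicodedata


def fold_compat_ideographs(text):
    """Normalize to NFC, which maps CJK compatibility ideographs
    to their unified equivalents."""
    return unicodedata.normalize("NFC", text)


compat = "\uf98c"  # an ideograph from the CJK Compatibility Ideographs block
unified = fold_compat_ideographs(compat)

print(unicodedata.name(compat))
print(unicodedata.name(unified))
```

NFC also composes other sequences (for example conjoining Hangul jamo), so applying it changes more than this one block; whether that is acceptable depends on the downstream use.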
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 40 | |
Ⅰ | 36 | |
Ⅹ | 14 | 12.1% |
Ⅲ | 10 | 8.6% |
Ⅳ | 5 | 4.3% |
Ⅵ | 3 | 2.6% |
Ⅸ | 3 | 2.6% |
Ⅴ | 2 | 1.7% |
Ⅶ | 1 | 0.9% |
Ⅷ | 1 | 0.9% |
Katakana
Value | Count | Frequency (%) |
ル | 18 | |
ジ | 16 | |
ナ | 15 | |
ャ | 15 | |
ア | 7 | 5.8% |
ロ | 5 | 4.1% |
ク | 4 | 3.3% |
ッ | 4 | 3.3% |
リ | 3 | 2.5% |
ヅ | 2 | 1.7% |
Other values (26) | 32 |
Math Operators
Value | Count | Frequency (%) |
∼ | 8 |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 8 | |
ㄱ | 3 | 16.7% |
ㅇ | 3 | 16.7% |
ㄴ | 2 | 11.1% |
ㅅ | 1 | 5.6% |
ㅎ | 1 | 5.6% |
Punctuation
Value | Count | Frequency (%) |
… | 4 | |
― | 1 | 20.0% |
Cyrillic
Value | Count | Frequency (%) |
Н | 4 | 11.8% |
н | 3 | 8.8% |
о | 3 | 8.8% |
С | 2 | 5.9% |
Е | 2 | 5.9% |
В | 2 | 5.9% |
и | 2 | 5.9% |
а | 2 | 5.9% |
К | 1 | 2.9% |
И | 1 | 2.9% |
Other values (12) | 12 |
Enclosed Alphanum
Value | Count | Frequency (%) |
ⓔ | 2 | |
ⓝ | 1 |
Modifier Letters
Value | Count | Frequency (%) |
˘ | 1 |
저작자
Text
Distinct | 7203 |
---|---|
Distinct (%) | 72.5% |
Missing | 68 |
Missing (%) | 0.7% |
Memory size | 156.2 KiB |
Length
Max length | 186 |
---|---|
Median length | 88 |
Mean length | 13.562022 |
Min length | 2 |
Characters and Unicode
Total characters | 134698 |
---|---|
Distinct characters | 1822 |
Distinct categories | 12 |
Distinct scripts | 6 |
Distinct blocks | 10 |
Unique
Unique | 6238 |
---|---|
Unique (%) | 62.8% |
Sample
1st row | 박현우;정혜순;유선희;송명규 |
---|---|
2nd row | Orazio Curti |
3rd row | 한국과학기술정보연구원 기술정보분석팀 편 |
4th row | 안소정 글 ;최현정 그림 |
5th row | 蔣永才 |
Value | Count | Frequency (%) |
편 | 1878 | 6.9% |
지음 | 1860 | 6.8% |
옮김 | 1176 | 4.3% |
그림 | 527 | 1.9% |
編 | 360 | 1.3% |
著 | 325 | 1.2% |
글 | 268 | 1.0% |
저 | 231 | 0.8% |
엮음 | 133 | 0.5% |
the | 128 | 0.5% |
Other values (10402) | 20453 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 17668 | 13.1% |
; | 4712 | 3.5% |
지 | 3001 | 2.2% |
음 | 2853 | 2.1% |
김 | 2540 | 1.9% |
편 | 2403 | 1.8% |
이 | 2154 | 1.6% |
e | 2139 | 1.6% |
] | 1839 | 1.4% |
[ | 1838 | 1.4% |
Other values (1812) | 93551 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 83125 | |
Space Separator | 17668 | 13.1% |
Lowercase Letter | 16824 | 12.5% |
Other Punctuation | 7836 | 5.8% |
Uppercase Letter | 5274 | 3.9% |
Close Punctuation | 1872 | 1.4% |
Open Punctuation | 1871 | 1.4% |
Dash Punctuation | 113 | 0.1% |
Decimal Number | 100 | 0.1% |
Math Symbol | 9 | < 0.1% |
Other values (2) | 6 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
지 | 3001 | 3.6% |
음 | 2853 | 3.4% |
김 | 2540 | 3.1% |
편 | 2403 | 2.9% |
이 | 2154 | 2.6% |
학 | 1674 | 2.0% |
국 | 1531 | 1.8% |
옮 | 1360 | 1.6% |
한 | 1162 | 1.4% |
과 | 1053 | 1.3% |
Other values (1719) | 63394 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 2139 | |
o | 1559 | |
a | 1508 | 9.0% |
r | 1489 | 8.9% |
n | 1478 | 8.8% |
i | 1443 | 8.6% |
t | 1012 | 6.0% |
s | 852 | 5.1% |
l | 787 | 4.7% |
h | 710 | 4.2% |
Other values (16) | 3847 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 447 | 8.5% |
C | 409 | 7.8% |
E | 348 | 6.6% |
A | 328 | 6.2% |
H | 316 | 6.0% |
T | 287 | 5.4% |
R | 278 | 5.3% |
M | 272 | 5.2% |
N | 252 | 4.8% |
J | 229 | 4.3% |
Other values (16) | 2108 |
Other Punctuation
Value | Count | Frequency (%) |
; | 4712 | |
. | 1365 | 17.4% |
, | 905 | 11.5% |
: | 715 | 9.1% |
· | 94 | 1.2% |
& | 18 | 0.2% |
' | 15 | 0.2% |
/ | 9 | 0.1% |
" | 2 | < 0.1% |
… | 1 | < 0.1% |
Decimal Number
Value | Count | Frequency (%) |
0 | 20 | |
1 | 19 | |
3 | 18 | |
2 | 13 | |
5 | 7 | 7.0% |
4 | 6 | 6.0% |
8 | 6 | 6.0% |
6 | 4 | 4.0% |
7 | 4 | 4.0% |
9 | 3 | 3.0% |
Close Punctuation
Value | Count | Frequency (%) |
] | 1839 | |
) | 23 | 1.2% |
》 | 8 | 0.4% |
〉 | 1 | 0.1% |
』 | 1 | 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
[ | 1838 | |
( | 23 | 1.2% |
《 | 8 | 0.4% |
〈 | 1 | 0.1% |
『 | 1 | 0.1% |
Math Symbol
Value | Count | Frequency (%) |
> | 4 | |
< | 4 | |
= | 1 | 11.1% |
Modifier Symbol
Value | Count | Frequency (%) |
^ | 1 | |
´ | 1 | |
` | 1 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 112 | |
― | 1 | 0.9% |
Other Symbol
Value | Count | Frequency (%) |
ⓔ | 2 | |
㈜ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 17668 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 71689 | |
Common | 29474 | |
Latin | 22098 | 16.4% |
Han | 11356 | 8.4% |
Katakana | 77 | 0.1% |
Hiragana | 4 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
編 | 573 | 5.0% |
著 | 495 | 4.4% |
學 | 467 | 4.1% |
國 | 356 | 3.1% |
韓 | 268 | 2.4% |
會 | 253 | 2.2% |
文 | 231 | 2.0% |
大 | 229 | 2.0% |
硏 | 207 | 1.8% |
究 | 204 | 1.8% |
Other values (913) | 8073 |
Hangul
Value | Count | Frequency (%) |
지 | 3001 | 4.2% |
음 | 2853 | 4.0% |
김 | 2540 | 3.5% |
편 | 2403 | 3.4% |
이 | 2154 | 3.0% |
학 | 1674 | 2.3% |
국 | 1531 | 2.1% |
옮 | 1360 | 1.9% |
한 | 1162 | 1.6% |
과 | 1053 | 1.5% |
Other values (754) | 51958 |
Latin
Value | Count | Frequency (%) |
e | 2139 | 9.7% |
o | 1559 | 7.1% |
a | 1508 | 6.8% |
r | 1489 | 6.7% |
n | 1478 | 6.7% |
i | 1443 | 6.5% |
t | 1012 | 4.6% |
s | 852 | 3.9% |
l | 787 | 3.6% |
h | 710 | 3.2% |
Other values (42) | 9121 |
Common
Value | Count | Frequency (%) |
(space) | 17668 | |
; | 4712 | 16.0% |
] | 1839 | 6.2% |
[ | 1838 | 6.2% |
. | 1365 | 4.6% |
, | 905 | 3.1% |
: | 715 | 2.4% |
- | 112 | 0.4% |
· | 94 | 0.3% |
( | 23 | 0.1% |
Other values (30) | 203 | 0.7% |
Katakana
Value | Count | Frequency (%) |
ン | 8 | 10.4% |
ス | 5 | 6.5% |
ィ | 4 | 5.2% |
ト | 4 | 5.2% |
シ | 3 | 3.9% |
フ | 3 | 3.9% |
リ | 3 | 3.9% |
ネ | 3 | 3.9% |
ル | 3 | 3.9% |
タ | 3 | 3.9% |
Other values (29) | 38 |
Hiragana
Value | Count | Frequency (%) |
と | 1 | |
の | 1 | |
そ | 1 | |
が | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 71679 | |
ASCII | 51452 | |
CJK | 11074 | 8.2% |
CJK Compat Ideographs | 282 | 0.2% |
None | 117 | 0.1% |
Katakana | 77 | 0.1% |
Compat Jamo | 9 | < 0.1% |
Hiragana | 4 | < 0.1% |
Enclosed Alphanum | 2 | < 0.1% |
Punctuation | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 17668 | |
; | 4712 | 9.2% |
e | 2139 | 4.2% |
] | 1839 | 3.6% |
[ | 1838 | 3.6% |
o | 1559 | 3.0% |
a | 1508 | 2.9% |
r | 1489 | 2.9% |
n | 1478 | 2.9% |
i | 1443 | 2.8% |
Other values (70) | 15779 |
Hangul
Value | Count | Frequency (%) |
지 | 3001 | 4.2% |
음 | 2853 | 4.0% |
김 | 2540 | 3.5% |
편 | 2403 | 3.4% |
이 | 2154 | 3.0% |
학 | 1674 | 2.3% |
국 | 1531 | 2.1% |
옮 | 1360 | 1.9% |
한 | 1162 | 1.6% |
과 | 1053 | 1.5% |
Other values (752) | 51948 |
CJK
Value | Count | Frequency (%) |
編 | 573 | 5.2% |
著 | 495 | 4.5% |
學 | 467 | 4.2% |
國 | 356 | 3.2% |
韓 | 268 | 2.4% |
會 | 253 | 2.3% |
文 | 231 | 2.1% |
大 | 229 | 2.1% |
硏 | 207 | 1.9% |
究 | 204 | 1.8% |
Other values (874) | 7791 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
李 | 153 | |
歷 | 14 | 5.0% |
林 | 9 | 3.2% |
龍 | 9 | 3.2% |
金 | 8 | 2.8% |
女 | 8 | 2.8% |
論 | 8 | 2.8% |
沈 | 7 | 2.5% |
嶺 | 6 | 2.1% |
盧 | 6 | 2.1% |
Other values (29) | 54 | 19.1% |
None
Value | Count | Frequency (%) |
· | 94 | |
《 | 8 | 6.8% |
》 | 8 | 6.8% |
^ | 1 | 0.9% |
㈜ | 1 | 0.9% |
〉 | 1 | 0.9% |
〈 | 1 | 0.9% |
』 | 1 | 0.9% |
『 | 1 | 0.9% |
´ | 1 | 0.9% |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 9 |
Katakana
Value | Count | Frequency (%) |
ン | 8 | 10.4% |
ス | 5 | 6.5% |
ィ | 4 | 5.2% |
ト | 4 | 5.2% |
シ | 3 | 3.9% |
フ | 3 | 3.9% |
リ | 3 | 3.9% |
ネ | 3 | 3.9% |
ル | 3 | 3.9% |
タ | 3 | 3.9% |
Other values (29) | 38 |
Enclosed Alphanum
Value | Count | Frequency (%) |
ⓔ | 2 |
Punctuation
Value | Count | Frequency (%) |
― | 1 | |
… | 1 |
Hiragana
Value | Count | Frequency (%) |
と | 1 | |
の | 1 | |
そ | 1 | |
が | 1 |
발행자
Text
Distinct | 3414 |
---|---|
Distinct (%) | 34.2% |
Missing | 21 |
Missing (%) | 0.2% |
Memory size | 156.2 KiB |
Length
Max length | 94 |
---|---|
Median length | 84 |
Mean length | 8.0315663 |
Min length | 1 |
Characters and Unicode
Total characters | 80147 |
---|---|
Distinct characters | 1197 |
Distinct categories | 10 |
Distinct scripts | 6 |
Distinct blocks | 7 |
Unique
Unique | 2207 |
---|---|
Unique (%) | 22.1% |
Sample
1st row | 한국과학기술정보연구원 |
---|---|
2nd row | Federico Garolla Editore |
3rd row | 한국과학기술정보연구원 |
4th row | 창비 |
5th row | 上海敎育出版社 |
Value | Count | Frequency (%) |
press | 257 | 1.9% |
university | 224 | 1.7% |
of | 185 | 1.4% |
과학기술정책연구원 | 174 | 1.3% |
the | 162 | 1.2% |
국립중앙과학관 | 121 | 0.9% |
한국과학기술정보연구원 | 104 | 0.8% |
한국과학사학회 | 96 | 0.7% |
사이언스북스 | 88 | 0.7% |
전파과학사 | 85 | 0.6% |
Other values (3608) | 11994 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 3584 | 4.5% |
사 | 2303 | 2.9% |
학 | 2209 | 2.8% |
e | 1992 | 2.5% |
o | 1760 | 2.2% |
i | 1756 | 2.2% |
n | 1603 | 2.0% |
과 | 1600 | 2.0% |
국 | 1577 | 2.0% |
s | 1475 | 1.8% |
Other values (1187) | 60288 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 52631 | |
Lowercase Letter | 17667 | 22.0% |
Uppercase Letter | 5150 | 6.4% |
Space Separator | 3584 | 4.5% |
Other Punctuation | 780 | 1.0% |
Open Punctuation | 98 | 0.1% |
Close Punctuation | 96 | 0.1% |
Dash Punctuation | 82 | 0.1% |
Decimal Number | 58 | 0.1% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
사 | 2303 | 4.4% |
학 | 2209 | 4.2% |
과 | 1600 | 3.0% |
국 | 1577 | 3.0% |
한 | 1361 | 2.6% |
원 | 976 | 1.9% |
연 | 887 | 1.7% |
기 | 874 | 1.7% |
문 | 849 | 1.6% |
구 | 781 | 1.5% |
Other values (1106) | 39214 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 1992 | |
o | 1760 | |
i | 1756 | |
n | 1603 | |
s | 1475 | 8.3% |
r | 1460 | 8.3% |
a | 1256 | 7.1% |
t | 1132 | 6.4% |
c | 699 | 4.0% |
l | 661 | 3.7% |
Other values (16) | 3873 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 488 | 9.5% |
C | 435 | 8.4% |
S | 397 | 7.7% |
T | 349 | 6.8% |
I | 329 | 6.4% |
U | 314 | 6.1% |
E | 311 | 6.0% |
A | 311 | 6.0% |
M | 268 | 5.2% |
N | 258 | 5.0% |
Other values (16) | 1690 |
Other Punctuation
Value | Count | Frequency (%) |
: | 288 | |
. | 174 | |
, | 147 | |
& | 101 | 12.9% |
; | 26 | 3.3% |
' | 13 | 1.7% |
· | 11 | 1.4% |
/ | 9 | 1.2% |
# | 6 | 0.8% |
& | 3 | 0.4% |
Decimal Number
Value | Count | Frequency (%) |
1 | 18 | |
2 | 14 | |
9 | 8 | |
0 | 5 | 8.6% |
5 | 4 | 6.9% |
7 | 4 | 6.9% |
4 | 2 | 3.4% |
3 | 2 | 3.4% |
8 | 1 | 1.7% |
Open Punctuation
Value | Count | Frequency (%) |
( | 87 | |
[ | 10 | 10.2% |
《 | 1 | 1.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 85 | |
] | 10 | 10.4% |
》 | 1 | 1.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 3584 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 82 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 41901 | |
Latin | 22817 | |
Han | 10691 | 13.3% |
Common | 4699 | 5.9% |
Katakana | 35 | < 0.1% |
Hiragana | 4 | < 0.1% |
Most frequent character per script
Han
Value | Count | Frequency (%) |
學 | 634 | 5.9% |
社 | 534 | 5.0% |
國 | 461 | 4.3% |
文 | 384 | 3.6% |
韓 | 357 | 3.3% |
大 | 317 | 3.0% |
化 | 297 | 2.8% |
硏 | 270 | 2.5% |
究 | 262 | 2.5% |
校 | 227 | 2.1% |
Other values (544) | 6948 |
Hangul
Value | Count | Frequency (%) |
사 | 2303 | 5.5% |
학 | 2209 | 5.3% |
과 | 1600 | 3.8% |
국 | 1577 | 3.8% |
한 | 1361 | 3.2% |
원 | 976 | 2.3% |
연 | 887 | 2.1% |
기 | 874 | 2.1% |
문 | 849 | 2.0% |
구 | 781 | 1.9% |
Other values (525) | 28484 |
Latin
Value | Count | Frequency (%) |
e | 1992 | 8.7% |
o | 1760 | 7.7% |
i | 1756 | 7.7% |
n | 1603 | 7.0% |
s | 1475 | 6.5% |
r | 1460 | 6.4% |
a | 1256 | 5.5% |
t | 1132 | 5.0% |
c | 699 | 3.1% |
l | 661 | 2.9% |
Other values (42) | 9023 |
Common
Value | Count | Frequency (%) |
(space) | 3584 | |
: | 288 | 6.1% |
. | 174 | 3.7% |
, | 147 | 3.1% |
& | 101 | 2.1% |
( | 87 | 1.9% |
) | 85 | 1.8% |
- | 82 | 1.7% |
; | 26 | 0.6% |
1 | 18 | 0.4% |
Other values (19) | 107 | 2.3% |
Katakana
Value | Count | Frequency (%) |
イ | 3 | 8.6% |
ス | 3 | 8.6% |
ン | 3 | 8.6% |
ル | 2 | 5.7% |
エ | 2 | 5.7% |
サ | 2 | 5.7% |
ュ | 2 | 5.7% |
ト | 2 | 5.7% |
ナ | 2 | 5.7% |
ニ | 1 | 2.9% |
Other values (13) | 13 |
Hiragana
Value | Count | Frequency (%) |
ぺ | 1 | |
ん | 1 | |
か | 1 | |
り | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 41901 | |
ASCII | 27500 | |
CJK | 10616 | 13.2% |
CJK Compat Ideographs | 75 | 0.1% |
Katakana | 35 | < 0.1% |
None | 16 | < 0.1% |
Hiragana | 4 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 3584 | 13.0% |
e | 1992 | 7.2% |
o | 1760 | 6.4% |
i | 1756 | 6.4% |
n | 1603 | 5.8% |
s | 1475 | 5.4% |
r | 1460 | 5.3% |
a | 1256 | 4.6% |
t | 1132 | 4.1% |
c | 699 | 2.5% |
Other values (67) | 10783 |
Hangul
Value | Count | Frequency (%) |
사 | 2303 | 5.5% |
학 | 2209 | 5.3% |
과 | 1600 | 3.8% |
국 | 1577 | 3.8% |
한 | 1361 | 3.2% |
원 | 976 | 2.3% |
연 | 887 | 2.1% |
기 | 874 | 2.1% |
문 | 849 | 2.0% |
구 | 781 | 1.9% |
Other values (525) | 28484 |
CJK
Value | Count | Frequency (%) |
學 | 634 | 6.0% |
社 | 534 | 5.0% |
國 | 461 | 4.3% |
文 | 384 | 3.6% |
韓 | 357 | 3.4% |
大 | 317 | 3.0% |
化 | 297 | 2.8% |
硏 | 270 | 2.5% |
究 | 262 | 2.5% |
校 | 227 | 2.1% |
Other values (525) | 6873 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
歷 | 21 | |
女 | 14 | |
嶺 | 8 | 10.7% |
梨 | 5 | 6.7% |
金 | 5 | 6.7% |
陸 | 4 | 5.3% |
靈 | 2 | 2.7% |
理 | 2 | 2.7% |
龍 | 2 | 2.7% |
驪 | 2 | 2.7% |
Other values (9) | 10 |
None
Value | Count | Frequency (%) |
· | 11 | |
& | 3 | 18.8% |
》 | 1 | 6.2% |
《 | 1 | 6.2% |
Katakana
Value | Count | Frequency (%) |
イ | 3 | 8.6% |
ス | 3 | 8.6% |
ン | 3 | 8.6% |
ル | 2 | 5.7% |
エ | 2 | 5.7% |
サ | 2 | 5.7% |
ュ | 2 | 5.7% |
ト | 2 | 5.7% |
ナ | 2 | 5.7% |
ニ | 1 | 2.9% |
Other values (13) | 13 |
Hiragana
Value | Count | Frequency (%) |
ぺ | 1 | |
ん | 1 | |
か | 1 | |
り | 1 |
발행년
Real number (ℝ)
Distinct | 107 |
---|---|
Distinct (%) | 1.1% |
Missing | 13 |
Missing (%) | 0.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1997.5798 |
Minimum | 1300 |
---|---|
Maximum | 2019 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1300 |
---|---|
5-th percentile | 1969 |
Q1 | 1992 |
median | 2001 |
Q3 | 2008 |
95-th percentile | 2015 |
Maximum | 2019 |
Range | 719 |
Interquartile range (IQR) | 16 |
Descriptive statistics
Standard deviation | 16.588664 |
---|---|
Coefficient of variation (CV) | 0.0083043814 |
Kurtosis | 336.4308 |
Mean | 1997.5798 |
Median Absolute Deviation (MAD) | 8 |
Skewness | -9.6969518 |
Sum | 19949829 |
Variance | 275.18378 |
Monotonicity | Not monotonic |
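Several of the figures above are related by simple identities, which makes them easy to cross-check. A minimal sketch, assuming the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum) and that the mean is taken over non-missing values only; all inputs are copied from the tables for 발행년:

```python
# Values reported for 발행년 in the tables above.
mean = 1997.5798
std = 16.588664
q1, q3 = 1992, 2008
minimum, maximum = 1300, 2019
total = 19949829
n_rows, n_missing = 10000, 13

cv = std / mean                               # coefficient of variation
variance = std ** 2                           # variance from the standard deviation
iqr = q3 - q1                                 # interquartile range
value_range = maximum - minimum               # range
mean_from_sum = total / (n_rows - n_missing)  # mean over non-missing rows

print(f"CV={cv:.10f} variance={variance:.5f} IQR={iqr} range={value_range}")
print(f"mean from sum={mean_from_sum:.4f}")
```

The recomputed values agree with the reported ones up to rounding. The single observation at 1300 lies far below the 5th percentile (1969), which is consistent with the strongly negative skewness and the very large kurtosis.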
Value | Count | Frequency (%) |
2008 | 474 | 4.7% |
2007 | 465 | 4.7% |
2006 | 436 | 4.4% |
2005 | 403 | 4.0% |
2004 | 365 | 3.6% |
1998 | 323 | 3.2% |
2000 | 307 | 3.1% |
1996 | 304 | 3.0% |
2001 | 288 | 2.9% |
2013 | 287 | 2.9% |
Other values (97) | 6335 |
Value | Count | Frequency (%) |
1300 | 1 | |
1691 | 1 | |
1708 | 1 | |
1848 | 1 | |
1863 | 1 | |
1867 | 2 | |
1887 | 1 | |
1901 | 1 | |
1908 | 1 | |
1913 | 1 |
Value | Count | Frequency (%) |
2019 | 2 | < 0.1% |
2018 | 70 | 0.7% |
2017 | 131 | |
2016 | 136 | |
2015 | 246 | |
2014 | 261 | |
2013 | 287 | |
2012 | 212 | |
2011 | 243 | |
2010 | 235 |
(index) | 서명 | 저작자 | 발행자 | 발행년 |
---|---|---|---|---|
8731 | 기술가치평가 사례 : 기법과 적용 | 박현우;정혜순;유선희;송명규 | 한국과학기술정보연구원 | 2002 |
14084 | IL MUSEO NAZIONALE DELLA SCIENZA E DELLA TECNECA LEONARDO DAVINCI | Orazio Curti | Federico Garolla Editore | 1984 |
8828 | 환경문제와 첨단기술 | 한국과학기술정보연구원 기술정보분석팀 편 | 한국과학기술정보연구원 | 2006 |
5680 | (우리 겨레는) 수학의 달인 : 경주로 떠나는 수학 여행 | 안소정 글 ;최현정 그림 | 창비 | 2010 |
13753 | 石城南京 | 蔣永才 | 上海敎育出版社 | 1985 |
8987 | 국립중앙과학관 여행 | 국립중앙과학관 [편] | 국립중앙과학관 | 2001 |
13386 | BRIDGES : The story of great bridges and how they built | Henry Billings | Viking Press | 1964 |
5565 | (알쏭달쏭) 수산물 | 김영혜 지음 | 국립수산과학원:농림수산식품부 | 2010 |
16106 | 조선후기 醫官의 顯官實職진출 : 경기도 守令 등 지방관을 중심으로 | 김양수 | 청주대학교 사학회 [편] | 1994 |
13674 | 天學初函 (三) | 李之藻 輯 | 臺灣學生書局 | 1965 |
(index) | 서명 | 저작자 | 발행자 | 발행년 |
---|---|---|---|---|
11735 | The Copernican Revolution | Thomas S. Kuhn | AlfredA, Knopf | 1959 |
12418 | This Kind of War : a study in unpreparedness | Fehrenbach, T. R. | The Macmillan Company | 1963 |
15398 | 年報 5 | 忠北大學校 博物館 [편] | 忠北大學校 博物館 | 1996 |
12202 | 한국과학사학회지 제16권 제1호 | 한국과학사학회 편 | 한국과학사학회 | 1994 |
9184 | Bulletin of the Korean Mathematical Society. 1984/2-2008/8 | <NA> | Korean Mathematical Society | 1984 |
1759 | (퀴즈!)과학상식 : 두뇌탐험 | 안영주 글;윤현우 그림 | 글송이 | 2008 |
280 | 博物館學 : 박물관 관리 운영의 이론과 실무 | 李蘭暎 지음 | 삼화출판사 | 2008 |
9563 | (2006) 백제의 공방 = Artifacts and workshops in Baekje | 국립부여박물관 [편] | 국립부여박물관 | 2006 |
3099 | 과학 사상 | <NA> | 범양사 | 1999 |
12835 | A Prelude -Symposium for Future Seoul/Kyoto Symposia on Language Problems in the Modern Sciences | KIM Yung Sik YOKOYAMA Toshio | the institute for Research in Humanities, Kyoto University | 2001 |
Most frequently occurring
(index) | 서명 | 저작자 | 발행자 | 발행년 | # duplicates |
---|---|---|---|---|---|
290 | 조선왕조실록 | 국사편찬위원회 | 탐구당 | 1986 | 30 |
231 | 수시력 수용과 칠정산 완성 : 중국 원형의 한국적 변형 | 박성래 지음 | 한국과학사학회 | 2002 | 15 |
3 | (外羅老島 宇宙센터 建設事業地域內 文化遺蹟 地表調査 報告書)外羅老島 | 국립중앙과학관 [편] | 국립중앙과학관 | 2001 | 8 |
189 | 다시 '민족과학'을 말한다 | 박성래 지음 | 나남출판 | 1996 | 8 |
67 | Modern Scientific Terms in East Asia ; Their Births and Modifications | Park Seongrae | 한국외국어대학교 국제지역연구센터 | 2004 | 6 |
32 | (전주 평화동 동도아파트 신축공사 예정지구 내)문화유적 지표조사 보고서 | 국립중앙과학관 [편] | 국립중앙과학관; (주)동부건설 | 2002 | 5 |
129 | 겨레과학의 발자취 2 : 유물로 보는 전통과학기술 | 국립중앙과학관 [편] | 국립중앙과학관 | 1996 | 5 |
170 | 국립중앙과학관 | 국립중앙과학관 [편] | 선명문화사 | 1999 | 5 |
6 | (국립과천과학관) 과학관 사이언스. 1 | 정인경;손영란 글 | 미래엔컬처그룹 | 2009 | 4 |
90 | 外大史學 창간호 | 韓國外國語大學校 歷史文化硏究所 편 | 韓國外國語大學校 歷史文化硏究所 | 1987 | 4 |