Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 102
Missing cells (%): 0.3%
Duplicate rows: 355
Duplicate rows (%): 3.5%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
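
The percentages and the average record size above follow from the raw counts. As a quick consistency check, the arithmetic can be reproduced with a few lines of Python. This is only a sketch: the helper names are illustrative and are not part of ydata-profiling; only the input numbers come from the report.

```python
# Consistency check for the overview figures reported above.
# Helper names are hypothetical; only the input numbers come from the report.


def missing_cell_percent(missing_cells, n_rows, n_columns):
    """Share of empty cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_columns)


def average_record_bytes(total_kib, n_rows):
    """Average in-memory size of one row, in bytes (1 KiB = 1024 B)."""
    return total_kib * 1024 / n_rows


missing_pct = missing_cell_percent(102, 10000, 4)
duplicate_pct = 100.0 * 355 / 10000
record_bytes = average_record_bytes(400.4, 10000)

print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {record_bytes:.1f} B")
```

The results (about 0.255% missing cells, 3.55% duplicate rows, and roughly 41.0 B per record) agree with the rounded values shown in the report.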

Variable types

Text: 3
Numeric: 1

Dataset

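Description: Information on the books held by the Science and Technology Library (과학기술자료실) of the Gwacheon National Science Museum (국립과천과학관). The columns included in the data are as follows.
Author: Ministry of Science and ICT, Gwacheon National Science Museum (과학기술정보통신부 국립과천과학관)
URL: https://www.data.go.kr/data/15024947/fileData.do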

Alerts

Dataset has 355 (3.5%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2024-04-21 15:39:33.822155
Analysis finished: 2024-04-21 15:39:38.500182
Duration: 4.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서명 (title)
Text

Distinct: 9292
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 176
Median length: 115
Mean length: 20.9847
Min length: 1

Characters and Unicode

Total characters: 209847
Distinct characters: 2561
Distinct categories: 15
Distinct scripts: 8
Distinct blocks: 14
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
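
To make the category statistics concrete, the sketch below counts Unicode general categories for one of the sample titles using Python's standard `unicodedata` module. The function name `category_counts` is illustrative; the report's own implementation may differ. The two-letter codes map to the names used in the tables of this report, for example `Lo` is "Other Letter" and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values of the 서명 column.
sample_title = "환경문제와 첨단기술"
counts = category_counts(sample_title)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables for a mostly Korean column.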

Unique

Unique: 8761
Unique (%): 87.6%

Sample

1st row: 기술가치평가 사례 : 기법과 적용
2nd row: IL MUSEO NAZIONALE DELLA SCIENZA E DELLA TECNECA LEONARDO DAVINCI
3rd row: 환경문제와 첨단기술
4th row: (우리 겨레는) 수학의 달인 : 경주로 떠나는 수학 여행
5th row: 石城南京
Value | Count | Frequency (%)
(value missing) | 3063 | 6.8%
the | 540 | 1.2%
of | 537 | 1.2%
이야기 | 354 | 0.8%
and | 319 | 0.7%
과학 | 268 | 0.6%
1 | 223 | 0.5%
위한 | 191 | 0.4%
science | 184 | 0.4%
2 | 182 | 0.4%
Other values (17878) | 39435 | 87.1%
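
A word-frequency table of this kind can be approximated by splitting each value on whitespace and counting the resulting tokens. The sketch below is an assumption about the method, not the report's actual tokenizer: lowercasing is suggested by entries such as "the" and "science" in the table, and `word_counts` is a hypothetical helper. The input titles are taken from the sample rows of this report.

```python
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        counts.update(value.lower().split())
    return counts


titles = [
    "The Copernican Revolution",
    "This Kind of War : a study in unpreparedness",
    "BRIDGES : The story of great bridges and how they built",
    None,
]
counts = word_counts(titles)

for word, n in counts.most_common(4):
    print(word, n)
```

Because the split is on whitespace only, a free-standing ":" counts as a token of its own, which is one reason punctuation can rank highly in such tables.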

Most occurring characters

ValueCountFrequency (%)
36409
 
17.4%
e 4274
 
2.0%
3532
 
1.7%
o 3415
 
1.6%
3236
 
1.5%
n 3164
 
1.5%
i 3128
 
1.5%
3066
 
1.5%
: 2837
 
1.4%
a 2723
 
1.3%
Other values (2551) 144063
68.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 110191 | 52.5%
Space Separator | 36409 | 17.4%
Lowercase Letter | 35114 | 16.7%
Uppercase Letter | 9536 | 4.5%
Decimal Number | 8425 | 4.0%
Other Punctuation | 5822 | 2.8%
Close Punctuation | 1633 | 0.8%
Open Punctuation | 1632 | 0.8%
Dash Punctuation | 568 | 0.3%
Math Symbol | 389 | 0.2%
Other values (5) | 128 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3532
 
3.2%
3236
 
2.9%
3066
 
2.8%
2377
 
2.2%
1915
 
1.7%
1829
 
1.7%
1742
 
1.6%
1300
 
1.2%
1264
 
1.1%
1157
 
1.0%
Other values (2407) 88773
80.6%
Lowercase Letter
ValueCountFrequency (%)
e 4274
12.2%
o 3415
9.7%
n 3164
 
9.0%
i 3128
 
8.9%
a 2723
 
7.8%
t 2560
 
7.3%
r 2368
 
6.7%
s 2133
 
6.1%
l 1527
 
4.3%
c 1498
 
4.3%
Other values (29) 8324
23.7%
Uppercase Letter
ValueCountFrequency (%)
E 872
 
9.1%
T 865
 
9.1%
S 805
 
8.4%
N 685
 
7.2%
A 618
 
6.5%
I 601
 
6.3%
C 552
 
5.8%
O 464
 
4.9%
R 427
 
4.5%
M 394
 
4.1%
Other values (27) 3253
34.1%
Other Punctuation
ValueCountFrequency (%)
: 2837
48.7%
. 1053
 
18.1%
, 975
 
16.7%
· 238
 
4.1%
' 198
 
3.4%
? 186
 
3.2%
! 113
 
1.9%
/ 69
 
1.2%
& 65
 
1.1%
; 32
 
0.5%
Other values (9) 56
 
1.0%
Letter Number
ValueCountFrequency (%)
40
34.5%
36
31.0%
14
 
12.1%
10
 
8.6%
5
 
4.3%
3
 
2.6%
3
 
2.6%
2
 
1.7%
1
 
0.9%
1
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 1947
23.1%
0 1469
17.4%
2 1387
16.5%
9 774
 
9.2%
3 705
 
8.4%
4 508
 
6.0%
5 479
 
5.7%
6 418
 
5.0%
7 373
 
4.4%
8 365
 
4.3%
Math Symbol
ValueCountFrequency (%)
= 289
74.3%
~ 44
 
11.3%
< 18
 
4.6%
> 18
 
4.6%
+ 12
 
3.1%
8
 
2.1%
Close Punctuation
ValueCountFrequency (%)
) 1590
97.4%
] 33
 
2.0%
5
 
0.3%
3
 
0.2%
2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1588
97.3%
[ 34
 
2.1%
5
 
0.3%
3
 
0.2%
2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 566
99.6%
1
 
0.2%
1
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
` 2
50.0%
˘ 1
25.0%
´ 1
25.0%
Other Number
ValueCountFrequency (%)
² 2
66.7%
½ 1
33.3%
Other Symbol
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
36409
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 92431 | 44.0%
Common | 54890 | 26.2%
Latin | 44730 | 21.3%
Han | 17424 | 8.3%
Hiragana | 215 | 0.1%
Katakana | 121 | 0.1%
Cyrillic | 34 | < 0.1%
Greek | 2 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
635
 
3.6%
528
 
3.0%
507
 
2.9%
408
 
2.3%
359
 
2.1%
305
 
1.8%
255
 
1.5%
255
 
1.5%
254
 
1.5%
244
 
1.4%
Other values (1275) 13674
78.5%
Hangul
ValueCountFrequency (%)
3532
 
3.8%
3236
 
3.5%
3066
 
3.3%
2377
 
2.6%
1915
 
2.1%
1829
 
2.0%
1742
 
1.9%
1300
 
1.4%
1264
 
1.4%
1157
 
1.3%
Other values (1051) 71013
76.8%
Latin
ValueCountFrequency (%)
e 4274
 
9.6%
o 3415
 
7.6%
n 3164
 
7.1%
i 3128
 
7.0%
a 2723
 
6.1%
t 2560
 
5.7%
r 2368
 
5.3%
s 2133
 
4.8%
l 1527
 
3.4%
c 1498
 
3.3%
Other values (53) 17940
40.1%
Common
ValueCountFrequency (%)
36409
66.3%
: 2837
 
5.2%
1 1947
 
3.5%
) 1590
 
2.9%
( 1588
 
2.9%
0 1469
 
2.7%
2 1387
 
2.5%
. 1053
 
1.9%
, 975
 
1.8%
9 774
 
1.4%
Other values (47) 4861
 
8.9%
Katakana
ValueCountFrequency (%)
18
14.9%
16
13.2%
15
12.4%
15
12.4%
7
 
5.8%
5
 
4.1%
4
 
3.3%
4
 
3.3%
3
 
2.5%
2
 
1.7%
Other values (26) 32
26.4%
Hiragana
ValueCountFrequency (%)
102
47.4%
42
19.5%
8
 
3.7%
6
 
2.8%
5
 
2.3%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
2
 
0.9%
Other values (25) 34
 
15.8%
Cyrillic
ValueCountFrequency (%)
Н 4
 
11.8%
н 3
 
8.8%
о 3
 
8.8%
С 2
 
5.9%
Е 2
 
5.9%
В 2
 
5.9%
и 2
 
5.9%
а 2
 
5.9%
К 1
 
2.9%
И 1
 
2.9%
Other values (12) 12
35.3%
Greek
ValueCountFrequency (%)
Ι 1
50.0%
π 1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 99214 | 47.3%
Hangul | 92413 | 44.0%
CJK | 17055 | 8.1%
CJK Compat Ideographs | 369 | 0.2%
None | 275 | 0.1%
Hiragana | 215 | 0.1%
Katakana | 121 | 0.1%
Number Forms | 116 | 0.1%
Cyrillic | 34 | < 0.1%
Compat Jamo | 18 | < 0.1%
Other values (4) | 17 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
36409
36.7%
e 4274
 
4.3%
o 3415
 
3.4%
n 3164
 
3.2%
i 3128
 
3.2%
: 2837
 
2.9%
a 2723
 
2.7%
t 2560
 
2.6%
r 2368
 
2.4%
s 2133
 
2.1%
Other values (77) 36203
36.5%
Hangul
ValueCountFrequency (%)
3532
 
3.8%
3236
 
3.5%
3066
 
3.3%
2377
 
2.6%
1915
 
2.1%
1829
 
2.0%
1742
 
1.9%
1300
 
1.4%
1264
 
1.4%
1157
 
1.3%
Other values (1045) 70995
76.8%
CJK
ValueCountFrequency (%)
635
 
3.7%
528
 
3.1%
507
 
3.0%
408
 
2.4%
359
 
2.1%
305
 
1.8%
255
 
1.5%
255
 
1.5%
254
 
1.5%
244
 
1.4%
Other values (1216) 13305
78.0%
None
ValueCountFrequency (%)
· 238
86.5%
5
 
1.8%
5
 
1.8%
3
 
1.1%
3
 
1.1%
3
 
1.1%
2
 
0.7%
² 2
 
0.7%
2
 
0.7%
2
 
0.7%
Other values (8) 10
 
3.6%
Hiragana
ValueCountFrequency (%)
102
47.4%
42
19.5%
8
 
3.7%
6
 
2.8%
5
 
2.3%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
2
 
0.9%
Other values (25) 34
 
15.8%
CJK Compat Ideographs
ValueCountFrequency (%)
58
15.7%
47
 
12.7%
31
 
8.4%
22
 
6.0%
22
 
6.0%
20
 
5.4%
14
 
3.8%
13
 
3.5%
12
 
3.3%
11
 
3.0%
Other values (49) 119
32.2%
Number Forms
ValueCountFrequency (%)
40
34.5%
36
31.0%
14
 
12.1%
10
 
8.6%
5
 
4.3%
3
 
2.6%
3
 
2.6%
2
 
1.7%
1
 
0.9%
1
 
0.9%
Katakana
ValueCountFrequency (%)
18
14.9%
16
13.2%
15
12.4%
15
12.4%
7
 
5.8%
5
 
4.1%
4
 
3.3%
4
 
3.3%
3
 
2.5%
2
 
1.7%
Other values (26) 32
26.4%
Math Operators
ValueCountFrequency (%)
8
100.0%
Compat Jamo
ValueCountFrequency (%)
8
44.4%
3
 
16.7%
3
 
16.7%
2
 
11.1%
1
 
5.6%
1
 
5.6%
Punctuation
ValueCountFrequency (%)
4
80.0%
1
 
20.0%
Cyrillic
ValueCountFrequency (%)
Н 4
 
11.8%
н 3
 
8.8%
о 3
 
8.8%
С 2
 
5.9%
Е 2
 
5.9%
В 2
 
5.9%
и 2
 
5.9%
а 2
 
5.9%
К 1
 
2.9%
И 1
 
2.9%
Other values (12) 12
35.3%
Enclosed Alphanum
ValueCountFrequency (%)
2
66.7%
1
33.3%
Modifier Letters
ValueCountFrequency (%)
˘ 1
100.0%

저작자 (author)
Text

Distinct: 7203
Distinct (%): 72.5%
Missing: 68
Missing (%): 0.7%
Memory size: 156.2 KiB

Length

Max length: 186
Median length: 88
Mean length: 13.562022
Min length: 2

Characters and Unicode

Total characters: 134698
Distinct characters: 1822
Distinct categories: 12
Distinct scripts: 6
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6238
Unique (%): 62.8%

Sample

1st row: 박현우;정혜순;유선희;송명규
2nd row: Orazio Curti
3rd row: 한국과학기술정보연구원 기술정보분석팀 편
4th row: 안소정 글 ;최현정 그림
5th row: 蔣永才
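Value | Count | Frequency (%)
(value missing) | 1878 | 6.9%
지음 | 1860 | 6.8%
옮김 | 1176 | 4.3%
그림 | 527 | 1.9%
(value missing) | 360 | 1.3%
(value missing) | 325 | 1.2%
(value missing) | 268 | 1.0%
(value missing) | 231 | 0.8%
엮음 | 133 | 0.5%
the | 128 | 0.5%
Other values (10402) | 20453 | 74.8%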

Most occurring characters

ValueCountFrequency (%)
17668
 
13.1%
; 4712
 
3.5%
3001
 
2.2%
2853
 
2.1%
2540
 
1.9%
2403
 
1.8%
2154
 
1.6%
e 2139
 
1.6%
] 1839
 
1.4%
[ 1838
 
1.4%
Other values (1812) 93551
69.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 83125 | 61.7%
Space Separator | 17668 | 13.1%
Lowercase Letter | 16824 | 12.5%
Other Punctuation | 7836 | 5.8%
Uppercase Letter | 5274 | 3.9%
Close Punctuation | 1872 | 1.4%
Open Punctuation | 1871 | 1.4%
Dash Punctuation | 113 | 0.1%
Decimal Number | 100 | 0.1%
Math Symbol | 9 | < 0.1%
Other values (2) | 6 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3001
 
3.6%
2853
 
3.4%
2540
 
3.1%
2403
 
2.9%
2154
 
2.6%
1674
 
2.0%
1531
 
1.8%
1360
 
1.6%
1162
 
1.4%
1053
 
1.3%
Other values (1719) 63394
76.3%
Lowercase Letter
ValueCountFrequency (%)
e 2139
12.7%
o 1559
9.3%
a 1508
 
9.0%
r 1489
 
8.9%
n 1478
 
8.8%
i 1443
 
8.6%
t 1012
 
6.0%
s 852
 
5.1%
l 787
 
4.7%
h 710
 
4.2%
Other values (16) 3847
22.9%
Uppercase Letter
ValueCountFrequency (%)
S 447
 
8.5%
C 409
 
7.8%
E 348
 
6.6%
A 328
 
6.2%
H 316
 
6.0%
T 287
 
5.4%
R 278
 
5.3%
M 272
 
5.2%
N 252
 
4.8%
J 229
 
4.3%
Other values (16) 2108
40.0%
Other Punctuation
ValueCountFrequency (%)
; 4712
60.1%
. 1365
 
17.4%
, 905
 
11.5%
: 715
 
9.1%
· 94
 
1.2%
& 18
 
0.2%
' 15
 
0.2%
/ 9
 
0.1%
" 2
 
< 0.1%
1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 20
20.0%
1 19
19.0%
3 18
18.0%
2 13
13.0%
5 7
 
7.0%
4 6
 
6.0%
8 6
 
6.0%
6 4
 
4.0%
7 4
 
4.0%
9 3
 
3.0%
Close Punctuation
ValueCountFrequency (%)
] 1839
98.2%
) 23
 
1.2%
8
 
0.4%
1
 
0.1%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 1838
98.2%
( 23
 
1.2%
8
 
0.4%
1
 
0.1%
1
 
0.1%
Math Symbol
ValueCountFrequency (%)
> 4
44.4%
< 4
44.4%
= 1
 
11.1%
Modifier Symbol
ValueCountFrequency (%)
1
33.3%
´ 1
33.3%
` 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 112
99.1%
1
 
0.9%
Other Symbol
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
17668
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 71689 | 53.2%
Common | 29474 | 21.9%
Latin | 22098 | 16.4%
Han | 11356 | 8.4%
Katakana | 77 | 0.1%
Hiragana | 4 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
573
 
5.0%
495
 
4.4%
467
 
4.1%
356
 
3.1%
268
 
2.4%
253
 
2.2%
231
 
2.0%
229
 
2.0%
207
 
1.8%
204
 
1.8%
Other values (913) 8073
71.1%
Hangul
ValueCountFrequency (%)
3001
 
4.2%
2853
 
4.0%
2540
 
3.5%
2403
 
3.4%
2154
 
3.0%
1674
 
2.3%
1531
 
2.1%
1360
 
1.9%
1162
 
1.6%
1053
 
1.5%
Other values (754) 51958
72.5%
Latin
ValueCountFrequency (%)
e 2139
 
9.7%
o 1559
 
7.1%
a 1508
 
6.8%
r 1489
 
6.7%
n 1478
 
6.7%
i 1443
 
6.5%
t 1012
 
4.6%
s 852
 
3.9%
l 787
 
3.6%
h 710
 
3.2%
Other values (42) 9121
41.3%
Common
ValueCountFrequency (%)
17668
59.9%
; 4712
 
16.0%
] 1839
 
6.2%
[ 1838
 
6.2%
. 1365
 
4.6%
, 905
 
3.1%
: 715
 
2.4%
- 112
 
0.4%
· 94
 
0.3%
( 23
 
0.1%
Other values (30) 203
 
0.7%
Katakana
ValueCountFrequency (%)
8
 
10.4%
5
 
6.5%
4
 
5.2%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (29) 38
49.4%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 71679 | 53.2%
ASCII | 51452 | 38.2%
CJK | 11074 | 8.2%
CJK Compat Ideographs | 282 | 0.2%
None | 117 | 0.1%
Katakana | 77 | 0.1%
Compat Jamo | 9 | < 0.1%
Hiragana | 4 | < 0.1%
Enclosed Alphanum | 2 | < 0.1%
Punctuation | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17668
34.3%
; 4712
 
9.2%
e 2139
 
4.2%
] 1839
 
3.6%
[ 1838
 
3.6%
o 1559
 
3.0%
a 1508
 
2.9%
r 1489
 
2.9%
n 1478
 
2.9%
i 1443
 
2.8%
Other values (70) 15779
30.7%
Hangul
ValueCountFrequency (%)
3001
 
4.2%
2853
 
4.0%
2540
 
3.5%
2403
 
3.4%
2154
 
3.0%
1674
 
2.3%
1531
 
2.1%
1360
 
1.9%
1162
 
1.6%
1053
 
1.5%
Other values (752) 51948
72.5%
CJK
ValueCountFrequency (%)
573
 
5.2%
495
 
4.5%
467
 
4.2%
356
 
3.2%
268
 
2.4%
253
 
2.3%
231
 
2.1%
229
 
2.1%
207
 
1.9%
204
 
1.8%
Other values (874) 7791
70.4%
CJK Compat Ideographs
ValueCountFrequency (%)
153
54.3%
14
 
5.0%
9
 
3.2%
9
 
3.2%
8
 
2.8%
8
 
2.8%
8
 
2.8%
7
 
2.5%
6
 
2.1%
6
 
2.1%
Other values (29) 54
 
19.1%
None
ValueCountFrequency (%)
· 94
80.3%
8
 
6.8%
8
 
6.8%
1
 
0.9%
1
 
0.9%
1
 
0.9%
1
 
0.9%
1
 
0.9%
1
 
0.9%
´ 1
 
0.9%
Compat Jamo
ValueCountFrequency (%)
9
100.0%
Katakana
ValueCountFrequency (%)
8
 
10.4%
5
 
6.5%
4
 
5.2%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (29) 38
49.4%
Enclosed Alphanum
ValueCountFrequency (%)
2
100.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

발행자 (publisher)
Text

Distinct: 3414
Distinct (%): 34.2%
Missing: 21
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 94
Median length: 84
Mean length: 8.0315663
Min length: 1

Characters and Unicode

Total characters: 80147
Distinct characters: 1197
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2207
Unique (%): 22.1%

Sample

1st row: 한국과학기술정보연구원
2nd row: Federico Garolla Editore
3rd row: 한국과학기술정보연구원
4th row: 창비
5th row: 上海敎育出版社
Value | Count | Frequency (%)
press | 257 | 1.9%
university | 224 | 1.7%
of | 185 | 1.4%
과학기술정책연구원 | 174 | 1.3%
the | 162 | 1.2%
국립중앙과학관 | 121 | 0.9%
한국과학기술정보연구원 | 104 | 0.8%
한국과학사학회 | 96 | 0.7%
사이언스북스 | 88 | 0.7%
전파과학사 | 85 | 0.6%
Other values (3608) | 11994 | 88.9%

Most occurring characters

ValueCountFrequency (%)
3584
 
4.5%
2303
 
2.9%
2209
 
2.8%
e 1992
 
2.5%
o 1760
 
2.2%
i 1756
 
2.2%
n 1603
 
2.0%
1600
 
2.0%
1577
 
2.0%
s 1475
 
1.8%
Other values (1187) 60288
75.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 52631 | 65.7%
Lowercase Letter | 17667 | 22.0%
Uppercase Letter | 5150 | 6.4%
Space Separator | 3584 | 4.5%
Other Punctuation | 780 | 1.0%
Open Punctuation | 98 | 0.1%
Close Punctuation | 96 | 0.1%
Dash Punctuation | 82 | 0.1%
Decimal Number | 58 | 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2303
 
4.4%
2209
 
4.2%
1600
 
3.0%
1577
 
3.0%
1361
 
2.6%
976
 
1.9%
887
 
1.7%
874
 
1.7%
849
 
1.6%
781
 
1.5%
Other values (1106) 39214
74.5%
Lowercase Letter
ValueCountFrequency (%)
e 1992
11.3%
o 1760
10.0%
i 1756
9.9%
n 1603
9.1%
s 1475
 
8.3%
r 1460
 
8.3%
a 1256
 
7.1%
t 1132
 
6.4%
c 699
 
4.0%
l 661
 
3.7%
Other values (16) 3873
21.9%
Uppercase Letter
ValueCountFrequency (%)
P 488
 
9.5%
C 435
 
8.4%
S 397
 
7.7%
T 349
 
6.8%
I 329
 
6.4%
U 314
 
6.1%
E 311
 
6.0%
A 311
 
6.0%
M 268
 
5.2%
N 258
 
5.0%
Other values (16) 1690
32.8%
Other Punctuation
ValueCountFrequency (%)
: 288
36.9%
. 174
22.3%
, 147
18.8%
& 101
 
12.9%
; 26
 
3.3%
' 13
 
1.7%
· 11
 
1.4%
/ 9
 
1.2%
# 6
 
0.8%
3
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 18
31.0%
2 14
24.1%
9 8
13.8%
0 5
 
8.6%
5 4
 
6.9%
7 4
 
6.9%
4 2
 
3.4%
3 2
 
3.4%
8 1
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 87
88.8%
[ 10
 
10.2%
1
 
1.0%
Close Punctuation
ValueCountFrequency (%)
) 85
88.5%
] 10
 
10.4%
1
 
1.0%
Space Separator
ValueCountFrequency (%)
3584
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 82
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 41901 | 52.3%
Latin | 22817 | 28.5%
Han | 10691 | 13.3%
Common | 4699 | 5.9%
Katakana | 35 | < 0.1%
Hiragana | 4 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
634
 
5.9%
534
 
5.0%
461
 
4.3%
384
 
3.6%
357
 
3.3%
317
 
3.0%
297
 
2.8%
270
 
2.5%
262
 
2.5%
227
 
2.1%
Other values (544) 6948
65.0%
Hangul
ValueCountFrequency (%)
2303
 
5.5%
2209
 
5.3%
1600
 
3.8%
1577
 
3.8%
1361
 
3.2%
976
 
2.3%
887
 
2.1%
874
 
2.1%
849
 
2.0%
781
 
1.9%
Other values (525) 28484
68.0%
Latin
ValueCountFrequency (%)
e 1992
 
8.7%
o 1760
 
7.7%
i 1756
 
7.7%
n 1603
 
7.0%
s 1475
 
6.5%
r 1460
 
6.4%
a 1256
 
5.5%
t 1132
 
5.0%
c 699
 
3.1%
l 661
 
2.9%
Other values (42) 9023
39.5%
Common
ValueCountFrequency (%)
3584
76.3%
: 288
 
6.1%
. 174
 
3.7%
, 147
 
3.1%
& 101
 
2.1%
( 87
 
1.9%
) 85
 
1.8%
- 82
 
1.7%
; 26
 
0.6%
1 18
 
0.4%
Other values (19) 107
 
2.3%
Katakana
ValueCountFrequency (%)
3
 
8.6%
3
 
8.6%
3
 
8.6%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
1
 
2.9%
Other values (13) 13
37.1%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 41901 | 52.3%
ASCII | 27500 | 34.3%
CJK | 10616 | 13.2%
CJK Compat Ideographs | 75 | 0.1%
Katakana | 35 | < 0.1%
None | 16 | < 0.1%
Hiragana | 4 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3584
 
13.0%
e 1992
 
7.2%
o 1760
 
6.4%
i 1756
 
6.4%
n 1603
 
5.8%
s 1475
 
5.4%
r 1460
 
5.3%
a 1256
 
4.6%
t 1132
 
4.1%
c 699
 
2.5%
Other values (67) 10783
39.2%
Hangul
ValueCountFrequency (%)
2303
 
5.5%
2209
 
5.3%
1600
 
3.8%
1577
 
3.8%
1361
 
3.2%
976
 
2.3%
887
 
2.1%
874
 
2.1%
849
 
2.0%
781
 
1.9%
Other values (525) 28484
68.0%
CJK
ValueCountFrequency (%)
634
 
6.0%
534
 
5.0%
461
 
4.3%
384
 
3.6%
357
 
3.4%
317
 
3.0%
297
 
2.8%
270
 
2.5%
262
 
2.5%
227
 
2.1%
Other values (525) 6873
64.7%
CJK Compat Ideographs
ValueCountFrequency (%)
21
28.0%
14
18.7%
8
 
10.7%
5
 
6.7%
5
 
6.7%
4
 
5.3%
2
 
2.7%
2
 
2.7%
2
 
2.7%
2
 
2.7%
Other values (9) 10
13.3%
None
ValueCountFrequency (%)
· 11
68.8%
3
 
18.8%
1
 
6.2%
1
 
6.2%
Katakana
ValueCountFrequency (%)
3
 
8.6%
3
 
8.6%
3
 
8.6%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
2
 
5.7%
1
 
2.9%
Other values (13) 13
37.1%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

발행년 (publication year)
Real number (ℝ)

Distinct: 107
Distinct (%): 1.1%
Missing: 13
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1997.5798
Minimum: 1300
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1300
5-th percentile: 1969
Q1: 1992
Median: 2001
Q3: 2008
95-th percentile: 2015
Maximum: 2019
Range: 719
Interquartile range (IQR): 16
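
The range and IQR are derived values: range = maximum - minimum and IQR = Q3 - Q1. The sketch below shows one way to compute quartiles with Python's standard `statistics` module, then checks the two derived figures against the reported ones. The year list is hypothetical, and the choice of linear interpolation (`method="inclusive"`) is an assumption; the report's quantile method is not stated here.

```python
import statistics


def quartile_summary(values):
    """Quartiles, IQR and range using linear interpolation between points."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }


# Hypothetical publication years, for illustration only.
hypothetical_years = [1985, 1992, 1996, 2001, 2004, 2008, 2010, 2013, 2015]
summary = quartile_summary(hypothetical_years)
print(summary)

# Derived figures recomputed from the values reported above.
reported_range = 2019 - 1300
reported_iqr = 2008 - 1992
print(reported_range, reported_iqr)
```

The single record dated 1300 accounts for most of the 719-year range, while the IQR of 16 years describes the bulk of the collection far better.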

Descriptive statistics

Standard deviation: 16.588664
Coefficient of variation (CV): 0.0083043814
Kurtosis: 336.4308
Mean: 1997.5798
Median Absolute Deviation (MAD): 8
Skewness: -9.6969518
Sum: 19949829
Variance: 275.18378
Monotonicity: Not monotonic
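
Several of these statistics are linked by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and mean = sum / number of non-missing values. The following sketch re-derives them from the reported figures as a sanity check. It assumes that the 13 missing values are excluded from the mean, so that 10000 - 13 = 9987 values contribute; variable names are illustrative.

```python
# Reported figures for the 발행년 column.
std = 16.588664
mean = 1997.5798
total = 19949829
n_present = 10000 - 13  # assumption: missing values are excluded

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from standard deviation
mean_from_sum = total / n_present # mean recomputed from the sum

print(f"CV: {cv:.10f}")
print(f"variance: {variance:.5f}")
print(f"mean from sum: {mean_from_sum:.4f}")
```

All three recomputed values match the report to within rounding of the inputs.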
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2008 | 474 | 4.7%
2007 | 465 | 4.7%
2006 | 436 | 4.4%
2005 | 403 | 4.0%
2004 | 365 | 3.6%
1998 | 323 | 3.2%
2000 | 307 | 3.1%
1996 | 304 | 3.0%
2001 | 288 | 2.9%
2013 | 287 | 2.9%
Other values (97) | 6335 | 63.3%

Minimum 10 values

Value | Count | Frequency (%)
1300 | 1 | < 0.1%
1691 | 1 | < 0.1%
1708 | 1 | < 0.1%
1848 | 1 | < 0.1%
1863 | 1 | < 0.1%
1867 | 2 | < 0.1%
1887 | 1 | < 0.1%
1901 | 1 | < 0.1%
1908 | 1 | < 0.1%
1913 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2019 | 2 | < 0.1%
2018 | 70 | 0.7%
2017 | 131 | 1.3%
2016 | 136 | 1.4%
2015 | 246 | 2.5%
2014 | 261 | 2.6%
2013 | 287 | 2.9%
2012 | 212 | 2.1%
2011 | 243 | 2.4%
2010 | 235 | 2.4%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

(index) | 서명 | 저작자 | 발행자 | 발행년
8731 | 기술가치평가 사례 : 기법과 적용 | 박현우;정혜순;유선희;송명규 | 한국과학기술정보연구원 | 2002
14084 | IL MUSEO NAZIONALE DELLA SCIENZA E DELLA TECNECA LEONARDO DAVINCI | Orazio Curti | Federico Garolla Editore | 1984
8828 | 환경문제와 첨단기술 | 한국과학기술정보연구원 기술정보분석팀 편 | 한국과학기술정보연구원 | 2006
5680 | (우리 겨레는) 수학의 달인 : 경주로 떠나는 수학 여행 | 안소정 글 ;최현정 그림 | 창비 | 2010
13753 | 石城南京 | 蔣永才 | 上海敎育出版社 | 1985
8987 | 국립중앙과학관 여행 | 국립중앙과학관 [편] | 국립중앙과학관 | 2001
13386 | BRIDGES : The story of great bridges and how they built | Henry Billings | Viking Press | 1964
5565 | (알쏭달쏭) 수산물 | 김영혜 지음 | 국립수산과학원:농림수산식품부 | 2010
16106 | 조선후기 醫官의 顯官實職진출 : 경기도 守令 등 지방관을 중심으로 | 김양수 | 청주대학교 사학회 [편] | 1994
13674 | 天學初函 (三) | 李之藻 輯 | 臺灣學生書局 | 1965

Last rows

(index) | 서명 | 저작자 | 발행자 | 발행년
11735 | The Copernican Revolution | Thomas S. Kuhn | AlfredA, Knopf | 1959
12418 | This Kind of War : a study in unpreparedness | Fehrenbach, T. R. | The Macmillan Company | 1963
15398 | 年報 5 | 忠北大學校 博物館 [편] | 忠北大學校 博物館 | 1996
12202 | 한국과학사학회지 제16권 제1호 | 한국과학사학회 편 | 한국과학사학회 | 1994
9184 | Bulletin of the Korean Mathematical Society. 1984/2-2008/8 | <NA> | Korean Mathematical Society | 1984
1759 | (퀴즈!)과학상식 : 두뇌탐험 | 안영주 글;윤현우 그림 | 글송이 | 2008
280 | 博物館學 : 박물관 관리 운영의 이론과 실무 | 李蘭暎 지음 | 삼화출판사 | 2008
9563 | (2006) 백제의 공방 = Artifacts and workshops in Baekje | 국립부여박물관 [편] | 국립부여박물관 | 2006
3099 | 과학 사상 | <NA> | 범양사 | 1999
12835 | A Prelude -Symposium for Future Seoul/Kyoto Symposia on Language Problems in the Modern Sciences | KIM Yung Sik YOKOYAMA Toshio | the institute for Research in Humanities, Kyoto University | 2001

Duplicate rows

Most frequently occurring

(index) | 서명 | 저작자 | 발행자 | 발행년 | # duplicates
290 | 조선왕조실록 | 국사편찬위원회 | 탐구당 | 1986 | 30
231 | 수시력 수용과 칠정산 완성 : 중국 원형의 한국적 변형 | 박성래 지음 | 한국과학사학회 | 2002 | 15
3 | (外羅老島 宇宙센터 建設事業地域內 文化遺蹟 地表調査 報告書)外羅老島 | 국립중앙과학관 [편] | 국립중앙과학관 | 2001 | 8
189 | 다시 '민족과학'을 말한다 | 박성래 지음 | 나남출판 | 1996 | 8
67 | Modern Scientific Terms in East Asia ; Their Births and Modifications | Park Seongrae | 한국외국어대학교 국제지역연구센터 | 2004 | 6
32 | (전주 평화동 동도아파트 신축공사 예정지구 내)문화유적 지표조사 보고서 | 국립중앙과학관 [편] | 국립중앙과학관; (주)동부건설 | 2002 | 5
129 | 겨레과학의 발자취 2 : 유물로 보는 전통과학기술 | 국립중앙과학관 [편] | 국립중앙과학관 | 1996 | 5
170 | 국립중앙과학관 | 국립중앙과학관 [편] | 선명문화사 | 1999 | 5
6 | (국립과천과학관) 과학관 사이언스. 1 | 정인경;손영란 글 | 미래엔컬처그룹 | 2009 | 4
90 | 外大史學 창간호 | 韓國外國語大學校 歷史文化硏究所 편 | 韓國外國語大學校 歷史文化硏究所 | 1987 | 4
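
A table like the one above can be produced by grouping identical rows and counting how often each occurs. The sketch below does this with `collections.Counter` over tuples. It assumes that "# duplicates" is the number of times a row occurs in the data; the report does not define the column, so this reading is a guess, and `duplicate_groups` and the example rows are hypothetical.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, occurrences) for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    groups = [(row, n) for row, n in counts.items() if n > 1]
    groups.sort(key=lambda item: item[1], reverse=True)
    return groups


# Hypothetical records: (title, author, publisher, year).
rows = [
    ("Title A", "Author A", "Publisher A", 2001),
    ("Title B", "Author B", "Publisher B", 1996),
    ("Title A", "Author A", "Publisher A", 2001),
    ("Title C", "Author C", "Publisher C", 2010),
    ("Title B", "Author B", "Publisher B", 1996),
    ("Title A", "Author A", "Publisher A", 2001),
]
groups = duplicate_groups(rows)

for row, n in groups:
    print(n, row)
```

Rows must match on every column to be grouped, so two editions of the same title with different years are not counted as duplicates.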