Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 114
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
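
The derived figures in this block can be rechecked from the raw counts. The sketch below is only an illustration: it assumes the percentage of missing cells is taken over all rows × columns and that 1 KiB is 1024 bytes. The variable and function names are made up for this example.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 10_000
missing_cells = 114
total_size_kib = 732.4


def missing_cells_pct(missing, rows, cols):
    """Share of empty cells among all rows x cols cells, in percent."""
    return 100.0 * missing / (rows * cols)


def avg_record_size_bytes(total_kib, rows):
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / rows


pct = missing_cells_pct(missing_cells, n_observations, n_variables)
avg = avg_record_size_bytes(total_size_kib, n_observations)
print(f"{pct:.1f}%")   # 0.1%
print(f"{avg:.1f} B")  # 75.0 B
```

Under these assumptions both values agree with the report after rounding to one decimal place.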

Variable types

Numeric: 3
Text: 4
DateTime: 1

Dataset

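Description: Information on the library holdings of the Korea Advanced Institute of Science and Technology (KAIST), covering material type, author, bibliographic control number, publication year, publisher, and so on. The dataset contains the following columns: 자료유형 (material type), 저자 (author), 서지제어번호 (bibliographic control number), 출판년도 (publication year), 출판사 (publisher), 입력일 (entry date), 서명 (title).
Author: Korea Advanced Institute of Science and Technology (한국과학기술원)
URL: https://www.data.go.kr/data/3069671/fileData.do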

Alerts

일련번호 is highly correlated overall with 서지제어번호 and 1 other field (High correlation)
서지제어번호 is highly correlated overall with 일련번호 and 1 other field (High correlation)
출판년도 is highly correlated overall with 일련번호 and 1 other field (High correlation)
일련번호 has unique values (Unique)
서지제어번호 has unique values (Unique)
홈페이지 주소(URL) has unique values (Unique)

Reproduction

Analysis started: 2024-03-14 17:03:59.695203
Analysis finished: 2024-03-14 17:04:05.859496
Duration: 6.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
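
The reported duration follows from the two timestamps. A minimal check using only the standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-14 17:03:59.695203", FMT)
finished = datetime.strptime("2024-03-14 17:04:05.859496", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 6.16 seconds
```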

Variables

일련번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30137.826
Minimum: 2
Maximum: 59756
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5th percentile: 2931.25
Q1: 15307.75
Median: 30147.5
Q3: 45064
95th percentile: 56831.05
Maximum: 59756
Range: 59754
Interquartile range (IQR): 29756.25

Descriptive statistics

Standard deviation: 17261.108
Coefficient of variation (CV): 0.572739
Kurtosis: -1.2039425
Mean: 30137.826
Median Absolute Deviation (MAD): 14876
Skewness: -0.010505142
Sum: 3.0137826 × 10^8
Variance: 2.9794585 × 10^8
Monotonicity: Not monotonic
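
Several of the statistics for 일련번호 are simple functions of the others, which gives a quick consistency check. The sketch below uses the standard textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × n). Whether the profiler computes them exactly this way is an assumption.

```python
import math

# Reported values for 일련번호, copied from the statistics blocks.
minimum, maximum = 2, 59756
q1, q3 = 15307.75, 45064
mean, std = 30137.826, 17261.108
n = 10_000

value_range = maximum - minimum  # reported: 59754
iqr = q3 - q1                    # reported: 29756.25
cv = std / mean                  # reported: 0.572739
variance = std ** 2              # reported: 2.9794585e8
total = mean * n                 # reported: 3.0137826e8

print(value_range, iqr)
print(math.isclose(cv, 0.572739, rel_tol=1e-5))
print(math.isclose(variance, 2.9794585e8, rel_tol=1e-6))
print(math.isclose(total, 3.0137826e8, rel_tol=1e-9))
```

Small differences in the last digits are expected because the inputs are themselves rounded.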
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value | Count | Frequency (%)
40038 | 1 | < 0.1%
55027 | 1 | < 0.1%
51841 | 1 | < 0.1%
78 | 1 | < 0.1%
27238 | 1 | < 0.1%
5725 | 1 | < 0.1%
8471 | 1 | < 0.1%
36918 | 1 | < 0.1%
33039 | 1 | < 0.1%
23775 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
2 | 1 | < 0.1%
15 | 1 | < 0.1%
16 | 1 | < 0.1%
21 | 1 | < 0.1%
27 | 1 | < 0.1%
47 | 1 | < 0.1%
54 | 1 | < 0.1%
55 | 1 | < 0.1%
56 | 1 | < 0.1%
66 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
59756 | 1 | < 0.1%
59743 | 1 | < 0.1%
59739 | 1 | < 0.1%
59733 | 1 | < 0.1%
59728 | 1 | < 0.1%
59726 | 1 | < 0.1%
59722 | 1 | < 0.1%
59720 | 1 | < 0.1%
59718 | 1 | < 0.1%
59713 | 1 | < 0.1%

저자
Text

Distinct: 8716
Distinct (%): 88.0%
Missing: 96
Missing (%): 1.0%
Memory size: 156.2 KiB

Length

Max length: 100
Median length: 95
Mean length: 13.527868
Min length: 2

Characters and Unicode

Total characters: 133980
Distinct characters: 839
Distinct categories: 11
Distinct scripts: 7
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
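
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter code (for example `Lu` for Uppercase Letter, `Ll` for Lowercase Letter, `Lo` for Other Letter, `Po` for Other Punctuation, `Zs` for Space Separator). The sketch below counts categories for two 저자 values that appear in this report. It is not the profiler's own implementation, and `category_counts` is a name chosen for this example.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters per Unicode general category (two-letter codes)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two author values taken from this report's sample rows.
latin = category_counts("Mussen, Paul Henry")
hangul = category_counts("정기수")

print(dict(latin))   # {'Lu': 3, 'Ll': 12, 'Po': 1, 'Zs': 2}
print(dict(hangul))  # {'Lo': 3}
```

Hangul syllables fall under Other Letter because they have no upper or lower case.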

Unique

Unique: 7914
Unique (%): 79.9%
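
The Distinct (%) and Unique (%) figures for 저자 are consistent with a denominator of non-missing values rather than all rows. This is inferred from the numbers in this report, not stated in it, so treat the sketch below as a plausibility check only.

```python
# Reported counts for 저자.
n_rows = 10_000
missing = 96
distinct = 8716
unique = 7914

non_missing = n_rows - missing  # 9904


def pct(count, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / denominator, 1)


print(pct(distinct, non_missing))  # 88.0, matches the reported Distinct (%)
print(pct(distinct, n_rows))       # 87.2, does not match
print(pct(unique, non_missing))    # 79.9, matches the reported Unique (%)
```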

Sample

1st row: 石原愼太郞
2nd row: 早居鎭夫
3rd row: Mussen, Paul Henry
4th row: 정기수
5th row: Den Hartog, J. P
Value | Count | Frequency (%)
j | 384 | 1.7%
a | 376 | 1.6%
on | 350 | 1.5%
m | 295 | 1.3%
h | 245 | 1.1%
c | 244 | 1.1%
r | 243 | 1.1%
w | 232 | 1.0%
e | 225 | 1.0%
l | 218 | 1.0%
Other values (9794) | 20075 | 87.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 12985 | 9.7%
e | 9445 | 7.0%
n | 7942 | 5.9%
a | 7865 | 5.9%
r | 7139 | 5.3%
o | 6726 | 5.0%
i | 6472 | 4.8%
, | 5728 | 4.3%
l | 4684 | 3.5%
t | 4484 | 3.3%
Other values (829) | 60510 | 45.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 80210 | 59.9%
Uppercase Letter | 18599 | 13.9%
Other Letter | 14061 | 10.5%
Space Separator | 12985 | 9.7%
Other Punctuation | 7102 | 5.3%
Decimal Number | 415 | 0.3%
Dash Punctuation | 222 | 0.2%
Close Punctuation | 214 | 0.2%
Math Symbol | 106 | 0.1%
Open Punctuation | 65 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
496
 
3.5%
489
 
3.5%
283
 
2.0%
269
 
1.9%
229
 
1.6%
220
 
1.6%
218
 
1.6%
216
 
1.5%
194
 
1.4%
189
 
1.3%
Other values (710) 11258
80.1%
Lowercase Letter
ValueCountFrequency (%)
e 9445
11.8%
n 7942
9.9%
a 7865
9.8%
r 7139
 
8.9%
o 6726
 
8.4%
i 6472
 
8.1%
l 4684
 
5.8%
t 4484
 
5.6%
s 4126
 
5.1%
h 2848
 
3.6%
Other values (36) 18479
23.0%
Uppercase Letter
ValueCountFrequency (%)
S 1646
 
8.8%
C 1337
 
7.2%
M 1336
 
7.2%
A 1298
 
7.0%
R 1170
 
6.3%
J 1120
 
6.0%
H 1023
 
5.5%
B 987
 
5.3%
D 887
 
4.8%
G 858
 
4.6%
Other values (31) 6937
37.3%
Other Punctuation
ValueCountFrequency (%)
, 5728
80.7%
. 1174
 
16.5%
@ 59
 
0.8%
? 39
 
0.5%
' 34
 
0.5%
" 32
 
0.5%
: 13
 
0.2%
/ 10
 
0.1%
* 8
 
0.1%
& 4
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 119
28.7%
9 78
18.8%
8 52
12.5%
2 35
 
8.4%
0 31
 
7.5%
6 26
 
6.3%
3 21
 
5.1%
5 20
 
4.8%
7 20
 
4.8%
4 13
 
3.1%
Close Punctuation
ValueCountFrequency (%)
} 150
70.1%
) 62
29.0%
] 2
 
0.9%
Dash Punctuation
ValueCountFrequency (%)
- 221
99.5%
1
 
0.5%
Math Symbol
ValueCountFrequency (%)
> 97
91.5%
< 9
 
8.5%
Open Punctuation
ValueCountFrequency (%)
( 62
95.4%
[ 3
 
4.6%
Space Separator
ValueCountFrequency (%)
12985
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 98732 | 73.7%
Common | 21110 | 15.8%
Hangul | 13654 | 10.2%
Han | 292 | 0.2%
Hiragana | 98 | 0.1%
Cyrillic | 77 | 0.1%
Katakana | 17 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
496
 
3.6%
489
 
3.6%
283
 
2.1%
269
 
2.0%
229
 
1.7%
220
 
1.6%
218
 
1.6%
216
 
1.6%
194
 
1.4%
189
 
1.4%
Other values (472) 10851
79.5%
Han
ValueCountFrequency (%)
13
 
4.5%
13
 
4.5%
12
 
4.1%
10
 
3.4%
5
 
1.7%
4
 
1.4%
4
 
1.4%
4
 
1.4%
3
 
1.0%
3
 
1.0%
Other values (171) 221
75.7%
Latin
ValueCountFrequency (%)
e 9445
 
9.6%
n 7942
 
8.0%
a 7865
 
8.0%
r 7139
 
7.2%
o 6726
 
6.8%
i 6472
 
6.6%
l 4684
 
4.7%
t 4484
 
4.5%
s 4126
 
4.2%
h 2848
 
2.9%
Other values (43) 37001
37.5%
Hiragana
ValueCountFrequency (%)
8
 
8.2%
6
 
6.1%
6
 
6.1%
5
 
5.1%
4
 
4.1%
4
 
4.1%
4
 
4.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
Other values (32) 52
53.1%
Cyrillic
ValueCountFrequency (%)
е 7
 
9.1%
о 6
 
7.8%
л 5
 
6.5%
Н 4
 
5.2%
и 4
 
5.2%
в 4
 
5.2%
н 3
 
3.9%
ч 3
 
3.9%
Т 3
 
3.9%
Л 3
 
3.9%
Other values (24) 35
45.5%
Common
ValueCountFrequency (%)
12985
61.5%
, 5728
27.1%
. 1174
 
5.6%
- 221
 
1.0%
} 150
 
0.7%
1 119
 
0.6%
> 97
 
0.5%
9 78
 
0.4%
) 62
 
0.3%
( 62
 
0.3%
Other values (22) 434
 
2.1%
Katakana
ValueCountFrequency (%)
3
17.6%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (5) 5
29.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 119839 | 89.4%
Hangul | 13652 | 10.2%
CJK | 285 | 0.2%
Hiragana | 98 | 0.1%
Cyrillic | 77 | 0.1%
Katakana | 17 | < 0.1%
CJK Compat Ideographs | 7 | < 0.1%
Compat Jamo | 2 | < 0.1%
CJK Compat | 1 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12985
 
10.8%
e 9445
 
7.9%
n 7942
 
6.6%
a 7865
 
6.6%
r 7139
 
6.0%
o 6726
 
5.6%
i 6472
 
5.4%
, 5728
 
4.8%
l 4684
 
3.9%
t 4484
 
3.7%
Other values (72) 46369
38.7%
Hangul
ValueCountFrequency (%)
496
 
3.6%
489
 
3.6%
283
 
2.1%
269
 
2.0%
229
 
1.7%
220
 
1.6%
218
 
1.6%
216
 
1.6%
194
 
1.4%
189
 
1.4%
Other values (471) 10849
79.5%
CJK
ValueCountFrequency (%)
13
 
4.6%
13
 
4.6%
12
 
4.2%
10
 
3.5%
5
 
1.8%
4
 
1.4%
4
 
1.4%
4
 
1.4%
3
 
1.1%
3
 
1.1%
Other values (165) 214
75.1%
Hiragana
ValueCountFrequency (%)
8
 
8.2%
6
 
6.1%
6
 
6.1%
5
 
5.1%
4
 
4.1%
4
 
4.1%
4
 
4.1%
3
 
3.1%
3
 
3.1%
3
 
3.1%
Other values (32) 52
53.1%
Cyrillic
ValueCountFrequency (%)
е 7
 
9.1%
о 6
 
7.8%
л 5
 
6.5%
Н 4
 
5.2%
и 4
 
5.2%
в 4
 
5.2%
н 3
 
3.9%
ч 3
 
3.9%
Т 3
 
3.9%
Л 3
 
3.9%
Other values (24) 35
45.5%
Katakana
ValueCountFrequency (%)
3
17.6%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (5) 5
29.4%
CJK Compat Ideographs
ValueCountFrequency (%)
2
28.6%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
CJK Compat
ValueCountFrequency (%)
1
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
Ł 1
100.0%

서지제어번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35541.637
Minimum: 3
Maximum: 104392
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5th percentile: 3149.2
Q1: 16553.75
Median: 32496
Q3: 48943
95th percentile: 97797.15
Maximum: 104392
Range: 104389
Interquartile range (IQR): 32389.25

Descriptive statistics

Standard deviation: 24957.911
Coefficient of variation (CV): 0.70221614
Kurtosis: 0.86876787
Mean: 35541.637
Median Absolute Deviation (MAD): 16198.5
Skewness: 1.0112908
Sum: 3.5541638 × 10^8
Variance: 6.2289734 × 10^8
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value | Count | Frequency (%)
43248 | 1 | < 0.1%
76231 | 1 | < 0.1%
56284 | 1 | < 0.1%
147 | 1 | < 0.1%
29284 | 1 | < 0.1%
6259 | 1 | < 0.1%
9158 | 1 | < 0.1%
39758 | 1 | < 0.1%
35680 | 1 | < 0.1%
25512 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
3 | 1 | < 0.1%
82 | 1 | < 0.1%
83 | 1 | < 0.1%
88 | 1 | < 0.1%
94 | 1 | < 0.1%
115 | 1 | < 0.1%
122 | 1 | < 0.1%
123 | 1 | < 0.1%
124 | 1 | < 0.1%
135 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
104392 | 1 | < 0.1%
104379 | 1 | < 0.1%
104375 | 1 | < 0.1%
104369 | 1 | < 0.1%
104364 | 1 | < 0.1%
104361 | 1 | < 0.1%
104357 | 1 | < 0.1%
104355 | 1 | < 0.1%
104353 | 1 | < 0.1%
104348 | 1 | < 0.1%

홈페이지 주소(URL)
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 65
Median length: 64
Mean length: 63.8712
Min length: 60

Characters and Unicode

Total characters: 638712
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=43248
2nd row: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=44451
3rd row: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=343
4th row: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=55688
5th row: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=18936
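
Each sample URL is a fixed prefix followed by a number in the `bibCtrlNo` query parameter. The sketch below extracts that number with the standard library and checks the prefix length against the reported string lengths. `control_number` and `PREFIX` are names chosen for this example, and the idea that every row follows this pattern is an assumption based on the samples shown.

```python
from urllib.parse import parse_qs, urlsplit

# Constant part shared by the sample URLs in this report.
PREFIX = "http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo="


def control_number(url):
    """Return the bibCtrlNo query parameter of a catalogue URL as an int."""
    query = parse_qs(urlsplit(url).query)
    return int(query["bibCtrlNo"][0])


samples = [PREFIX + "43248", PREFIX + "343"]
numbers = [control_number(u) for u in samples]

print(numbers)      # [43248, 343]
print(len(PREFIX))  # 59
```

A 59-character prefix plus 1 to 6 digits is consistent with the reported minimum and maximum lengths of 60 and 65. Likewise, 59 × 10000 plus the 48712 characters counted as Decimal Number equals the reported total of 638712 characters.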
Value | Count | Frequency (%)
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=43248 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=39758 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=43083 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=36992 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=56284 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=147 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=29284 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=6259 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=9158 | 1 | < 0.1%
http://library.kaist.ac.kr/search/detail/view.do?bibctrlno=76231 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
a | 50000 | 7.8%
t | 50000 | 7.8%
/ | 50000 | 7.8%
i | 50000 | 7.8%
r | 50000 | 7.8%
. | 40000 | 6.3%
l | 30000 | 4.7%
b | 30000 | 4.7%
e | 30000 | 4.7%
s | 20000 | 3.1%
Other values (24) | 238712 | 37.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 450000 | 70.5%
Other Punctuation | 110000 | 17.2%
Decimal Number | 48712 | 7.6%
Uppercase Letter | 20000 | 3.1%
Math Symbol | 10000 | 1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 50000
11.1%
t 50000
11.1%
i 50000
11.1%
r 50000
11.1%
l 30000
 
6.7%
b 30000
 
6.7%
e 30000
 
6.7%
s 20000
 
4.4%
d 20000
 
4.4%
c 20000
 
4.4%
Other values (7) 100000
22.2%
Decimal Number
ValueCountFrequency (%)
1 5863
12.0%
2 5758
11.8%
3 5588
11.5%
4 5472
11.2%
5 5354
11.0%
9 4399
9.0%
0 4195
8.6%
7 4176
8.6%
6 3961
8.1%
8 3946
8.1%
Other Punctuation
ValueCountFrequency (%)
/ 50000
45.5%
. 40000
36.4%
? 10000
 
9.1%
: 10000
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
C 10000
50.0%
N 10000
50.0%
Math Symbol
ValueCountFrequency (%)
= 10000
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 470000 | 73.6%
Common | 168712 | 26.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 50000
10.6%
t 50000
10.6%
i 50000
10.6%
r 50000
10.6%
l 30000
 
6.4%
b 30000
 
6.4%
e 30000
 
6.4%
s 20000
 
4.3%
d 20000
 
4.3%
c 20000
 
4.3%
Other values (9) 120000
25.5%
Common
ValueCountFrequency (%)
/ 50000
29.6%
. 40000
23.7%
= 10000
 
5.9%
? 10000
 
5.9%
: 10000
 
5.9%
1 5863
 
3.5%
2 5758
 
3.4%
3 5588
 
3.3%
4 5472
 
3.2%
5 5354
 
3.2%
Other values (5) 20677
12.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 638712 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 50000
 
7.8%
t 50000
 
7.8%
/ 50000
 
7.8%
i 50000
 
7.8%
r 50000
 
7.8%
. 40000
 
6.3%
l 30000
 
4.7%
b 30000
 
4.7%
e 30000
 
4.7%
s 20000
 
3.1%
Other values (24) 238712
37.4%

출판년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 75
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1980.9186
Minimum: 1923
Maximum: 2014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1923
5th percentile: 1965
Q1: 1976
Median: 1982
Q3: 1987
95th percentile: 1993
Maximum: 2014
Range: 91
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.6168787
Coefficient of variation (CV): 0.0043499408
Kurtosis: 1.6743
Mean: 1980.9186
Median Absolute Deviation (MAD): 5
Skewness: -0.90339961
Sum: 19809186
Variance: 74.250599
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values
Value | Count | Frequency (%)
1984 | 741 | 7.4%
1983 | 731 | 7.3%
1982 | 648 | 6.5%
1985 | 480 | 4.8%
1981 | 451 | 4.5%
1988 | 441 | 4.4%
1987 | 437 | 4.4%
1980 | 424 | 4.2%
1979 | 374 | 3.7%
1992 | 316 | 3.2%
Other values (65) | 4957 | 49.6%

Minimum 10 values
Value | Count | Frequency (%)
1923 | 1 | < 0.1%
1927 | 1 | < 0.1%
1928 | 1 | < 0.1%
1930 | 1 | < 0.1%
1935 | 1 | < 0.1%
1938 | 4 | < 0.1%
1939 | 1 | < 0.1%
1940 | 2 | < 0.1%
1941 | 1 | < 0.1%
1943 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2014 | 1 | < 0.1%
2013 | 1 | < 0.1%
2012 | 3 | < 0.1%
2011 | 1 | < 0.1%
2007 | 1 | < 0.1%
2006 | 1 | < 0.1%
2005 | 1 | < 0.1%
2002 | 3 | < 0.1%
2001 | 1 | < 0.1%
2000 | 1 | < 0.1%

출판사
Text

Distinct: 2279
Distinct (%): 22.8%
Missing: 18
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 100
Median length: 88
Mean length: 10.962132
Min length: 1

Characters and Unicode

Total characters: 109424
Distinct characters: 967
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1371
Unique (%): 13.7%

Sample

1st row: 이성과현실
2nd row: 고려원미디어
3rd row: Harper & Row
4th row: 乙酉文化社
5th row: McGraw-Hill
Value | Count | Frequency (%)
press | 1203 | 7.3%
wiley | 528 | 3.2%
academic | 446 | 2.7%
mcgraw-hill | 427 | 2.6%
springer-verlag | 406 | 2.5%
co | 391 | 2.4%
pub | 346 | 2.1%
prentice-hall | 287 | 1.7%
of | 253 | 1.5%
society | 245 | 1.5%
Other values (2315) | 12022 | 72.6%

Most occurring characters

Value | Count | Frequency (%)
e | 8897 | 8.1%
r | 6679 | 6.1%
(space) | 6572 | 6.0%
i | 6292 | 5.8%
s | 5140 | 4.7%
a | 5098 | 4.7%
n | 5019 | 4.6%
l | 4897 | 4.5%
o | 4107 | 3.8%
c | 3662 | 3.3%
Other values (957) | 53061 | 48.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 66694 | 61.0%
Other Letter | 19547 | 17.9%
Uppercase Letter | 13721 | 12.5%
Space Separator | 6572 | 6.0%
Dash Punctuation | 1560 | 1.4%
Other Punctuation | 1240 | 1.1%
Decimal Number | 37 | < 0.1%
Close Punctuation | 27 | < 0.1%
Open Punctuation | 24 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1972
 
10.1%
721
 
3.7%
660
 
3.4%
567
 
2.9%
563
 
2.9%
471
 
2.4%
264
 
1.4%
249
 
1.3%
227
 
1.2%
226
 
1.2%
Other values (879) 13627
69.7%
Lowercase Letter
ValueCountFrequency (%)
e 8897
13.3%
r 6679
10.0%
i 6292
9.4%
s 5140
 
7.7%
a 5098
 
7.6%
n 5019
 
7.5%
l 4897
 
7.3%
o 4107
 
6.2%
c 3662
 
5.5%
t 3197
 
4.8%
Other values (16) 13706
20.6%
Uppercase Letter
ValueCountFrequency (%)
P 2643
19.3%
A 1251
9.1%
H 1232
9.0%
C 1143
 
8.3%
S 1099
 
8.0%
M 941
 
6.9%
W 851
 
6.2%
G 520
 
3.8%
V 518
 
3.8%
E 483
 
3.5%
Other values (16) 3040
22.2%
Other Punctuation
ValueCountFrequency (%)
. 833
67.2%
, 209
 
16.9%
& 104
 
8.4%
/ 47
 
3.8%
' 23
 
1.9%
@ 8
 
0.6%
; 7
 
0.6%
: 7
 
0.6%
? 1
 
0.1%
· 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 13
35.1%
1 11
29.7%
2 5
 
13.5%
3 3
 
8.1%
8 2
 
5.4%
5 2
 
5.4%
4 1
 
2.7%
Close Punctuation
ValueCountFrequency (%)
] 19
70.4%
) 6
 
22.2%
} 2
 
7.4%
Dash Punctuation
ValueCountFrequency (%)
- 1558
99.9%
2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 18
75.0%
( 6
 
25.0%
Space Separator
ValueCountFrequency (%)
6572
100.0%
Math Symbol
ValueCountFrequency (%)
< 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 80415 | 73.5%
Han | 12778 | 11.7%
Common | 9462 | 8.6%
Hangul | 6367 | 5.8%
Katakana | 381 | 0.3%
Hiragana | 21 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
1972
 
15.4%
660
 
5.2%
567
 
4.4%
563
 
4.4%
471
 
3.7%
264
 
2.1%
249
 
1.9%
202
 
1.6%
189
 
1.5%
189
 
1.5%
Other values (482) 7452
58.3%
Hangul
ValueCountFrequency (%)
721
 
11.3%
227
 
3.6%
226
 
3.5%
222
 
3.5%
215
 
3.4%
156
 
2.5%
132
 
2.1%
129
 
2.0%
124
 
1.9%
116
 
1.8%
Other values (329) 4099
64.4%
Latin
ValueCountFrequency (%)
e 8897
 
11.1%
r 6679
 
8.3%
i 6292
 
7.8%
s 5140
 
6.4%
a 5098
 
6.3%
n 5019
 
6.2%
l 4897
 
6.1%
o 4107
 
5.1%
c 3662
 
4.6%
t 3197
 
4.0%
Other values (42) 27427
34.1%
Katakana
ValueCountFrequency (%)
36
 
9.4%
34
 
8.9%
34
 
8.9%
33
 
8.7%
31
 
8.1%
16
 
4.2%
16
 
4.2%
14
 
3.7%
12
 
3.1%
12
 
3.1%
Other values (35) 143
37.5%
Common
ValueCountFrequency (%)
6572
69.5%
- 1558
 
16.5%
. 833
 
8.8%
, 209
 
2.2%
& 104
 
1.1%
/ 47
 
0.5%
' 23
 
0.2%
] 19
 
0.2%
[ 18
 
0.2%
9 13
 
0.1%
Other values (16) 66
 
0.7%
Hiragana
ValueCountFrequency (%)
3
14.3%
3
14.3%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (3) 3
14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 89874 | 82.1%
CJK | 12728 | 11.6%
Hangul | 6366 | 5.8%
Katakana | 381 | 0.3%
CJK Compat Ideographs | 50 | < 0.1%
Hiragana | 21 | < 0.1%
Punctuation | 2 | < 0.1%
Compat Jamo | 1 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 8897
 
9.9%
r 6679
 
7.4%
6572
 
7.3%
i 6292
 
7.0%
s 5140
 
5.7%
a 5098
 
5.7%
n 5019
 
5.6%
l 4897
 
5.4%
o 4107
 
4.6%
c 3662
 
4.1%
Other values (66) 33511
37.3%
CJK
ValueCountFrequency (%)
1972
 
15.5%
660
 
5.2%
567
 
4.5%
563
 
4.4%
471
 
3.7%
264
 
2.1%
249
 
2.0%
202
 
1.6%
189
 
1.5%
189
 
1.5%
Other values (467) 7402
58.2%
Hangul
ValueCountFrequency (%)
721
 
11.3%
227
 
3.6%
226
 
3.6%
222
 
3.5%
215
 
3.4%
156
 
2.5%
132
 
2.1%
129
 
2.0%
124
 
1.9%
116
 
1.8%
Other values (328) 4098
64.4%
Katakana
ValueCountFrequency (%)
36
 
9.4%
34
 
8.9%
34
 
8.9%
33
 
8.7%
31
 
8.1%
16
 
4.2%
16
 
4.2%
14
 
3.7%
12
 
3.1%
12
 
3.1%
Other values (35) 143
37.5%
CJK Compat Ideographs
ValueCountFrequency (%)
12
24.0%
9
18.0%
4
 
8.0%
3
 
6.0%
3
 
6.0%
3
 
6.0%
3
 
6.0%
2
 
4.0%
2
 
4.0%
2
 
4.0%
Other values (5) 7
14.0%
Hiragana
ValueCountFrequency (%)
3
14.3%
3
14.3%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (3) 3
14.3%
Punctuation
ValueCountFrequency (%)
2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

입력일
DateTime

Distinct: 727
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1991-05-07 00:00:00
Maximum: 1999-10-27 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

서명
Text

Distinct: 9793
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 400
Median length: 242
Mean length: 36.0847
Min length: 1

Characters and Unicode

Total characters: 360847
Distinct characters: 2392
Distinct categories: 16
Distinct scripts: 7
Distinct blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9630
Unique (%): 96.3%

Sample

1st row: 그래도 ‘NO’라고 말할 수 있는 일본 : 미일 간의 근본문제
2nd row: 아침이 즐거운 건강 수면법
3rd row: Psychological development, a life-span approach
4th row: 幸福
5th row: Mechanical vibrations
Value | Count | Frequency (%)
(value lost in extraction) | 3210 | 5.9%
of | 2298 | 4.2%
and | 2239 | 4.1%
the | 1565 | 2.9%
in | 973 | 1.8%
a | 558 | 1.0%
for | 499 | 0.9%
to | 471 | 0.9%
theory | 408 | 0.7%
proceedings | 407 | 0.7%
Other values (14717) | 42182 | 77.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 44824 | 12.4%
e | 26462 | 7.3%
n | 21805 | 6.0%
i | 21761 | 6.0%
o | 20827 | 5.8%
a | 20375 | 5.6%
t | 19536 | 5.4%
s | 16654 | 4.6%
r | 16464 | 4.6%
c | 12759 | 3.5%
Other values (2382) | 139380 | 38.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 246896 | 68.4%
Space Separator | 44824 | 12.4%
Other Letter | 40482 | 11.2%
Uppercase Letter | 13352 | 3.7%
Other Punctuation | 6866 | 1.9%
Decimal Number | 5688 | 1.6%
Dash Punctuation | 1390 | 0.4%
Close Punctuation | 542 | 0.2%
Open Punctuation | 535 | 0.1%
Math Symbol | 259 | 0.1%
Other values (6) | 13 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1206
 
3.0%
907
 
2.2%
519
 
1.3%
429
 
1.1%
336
 
0.8%
317
 
0.8%
312
 
0.8%
302
 
0.7%
301
 
0.7%
271
 
0.7%
Other values (2278) 35582
87.9%
Lowercase Letter
ValueCountFrequency (%)
e 26462
10.7%
n 21805
 
8.8%
i 21761
 
8.8%
o 20827
 
8.4%
a 20375
 
8.3%
t 19536
 
7.9%
s 16654
 
6.7%
r 16464
 
6.7%
c 12759
 
5.2%
l 10645
 
4.3%
Other values (18) 59608
24.1%
Uppercase Letter
ValueCountFrequency (%)
S 1356
 
10.2%
C 1315
 
9.8%
A 1285
 
9.6%
I 1062
 
8.0%
M 989
 
7.4%
T 960
 
7.2%
P 878
 
6.6%
E 693
 
5.2%
D 544
 
4.1%
F 528
 
4.0%
Other values (17) 3742
28.0%
Other Punctuation
ValueCountFrequency (%)
, 2748
40.0%
: 2743
40.0%
. 534
 
7.8%
/ 339
 
4.9%
' 214
 
3.1%
; 149
 
2.2%
& 48
 
0.7%
? 34
 
0.5%
" 22
 
0.3%
@ 18
 
0.3%
Other values (6) 17
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 1562
27.5%
9 1060
18.6%
2 672
11.8%
8 531
 
9.3%
7 402
 
7.1%
0 379
 
6.7%
3 354
 
6.2%
6 266
 
4.7%
4 235
 
4.1%
5 227
 
4.0%
Close Punctuation
ValueCountFrequency (%)
) 474
87.5%
] 53
 
9.8%
} 9
 
1.7%
5
 
0.9%
1
 
0.2%
Math Symbol
ValueCountFrequency (%)
= 196
75.7%
> 37
 
14.3%
+ 14
 
5.4%
< 11
 
4.2%
1
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 474
88.6%
[ 55
 
10.3%
5
 
0.9%
1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 1386
99.7%
4
 
0.3%
Space Separator
ValueCountFrequency (%)
44824
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%
Modifier Symbol
ValueCountFrequency (%)
˙ 3
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 260248 | 72.1%
Common | 60116 | 16.7%
Hangul | 20341 | 5.6%
Han | 18145 | 5.0%
Katakana | 1358 | 0.4%
Hiragana | 638 | 0.2%
Greek | 1 | < 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
907
 
5.0%
336
 
1.9%
302
 
1.7%
232
 
1.3%
228
 
1.3%
214
 
1.2%
210
 
1.2%
202
 
1.1%
166
 
0.9%
159
 
0.9%
Other values (1321) 15189
83.7%
Hangul
ValueCountFrequency (%)
1206
 
5.9%
519
 
2.6%
429
 
2.1%
317
 
1.6%
312
 
1.5%
301
 
1.5%
271
 
1.3%
261
 
1.3%
251
 
1.2%
247
 
1.2%
Other values (818) 16227
79.8%
Katakana
ValueCountFrequency (%)
120
 
8.8%
97
 
7.1%
78
 
5.7%
55
 
4.1%
51
 
3.8%
49
 
3.6%
48
 
3.5%
47
 
3.5%
46
 
3.4%
41
 
3.0%
Other values (66) 726
53.5%
Latin
ValueCountFrequency (%)
e 26462
 
10.2%
n 21805
 
8.4%
i 21761
 
8.4%
o 20827
 
8.0%
a 20375
 
7.8%
t 19536
 
7.5%
s 16654
 
6.4%
r 16464
 
6.3%
c 12759
 
4.9%
l 10645
 
4.1%
Other values (45) 72960
28.0%
Hiragana
ValueCountFrequency (%)
220
34.5%
99
15.5%
26
 
4.1%
23
 
3.6%
17
 
2.7%
17
 
2.7%
16
 
2.5%
13
 
2.0%
13
 
2.0%
12
 
1.9%
Other values (43) 182
28.5%
Common
ValueCountFrequency (%)
44824
74.6%
, 2748
 
4.6%
: 2743
 
4.6%
1 1562
 
2.6%
- 1386
 
2.3%
9 1060
 
1.8%
2 672
 
1.1%
. 534
 
0.9%
8 531
 
0.9%
) 474
 
0.8%
Other values (38) 3582
 
6.0%
Greek
ValueCountFrequency (%)
β 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 320320 | 88.8%
Hangul | 20265 | 5.6%
CJK | 17562 | 4.9%
Katakana | 1358 | 0.4%
Hiragana | 638 | 0.2%
CJK Compat Ideographs | 583 | 0.2%
Compat Jamo | 76 | < 0.1%
None | 28 | < 0.1%
Punctuation | 8 | < 0.1%
Geometric Shapes | 4 | < 0.1%
Other values (3) | 5 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
44824
14.0%
e 26462
 
8.3%
n 21805
 
6.8%
i 21761
 
6.8%
o 20827
 
6.5%
a 20375
 
6.4%
t 19536
 
6.1%
s 16654
 
5.2%
r 16464
 
5.1%
c 12759
 
4.0%
Other values (75) 98853
30.9%
Hangul
ValueCountFrequency (%)
1206
 
6.0%
519
 
2.6%
429
 
2.1%
317
 
1.6%
312
 
1.5%
301
 
1.5%
271
 
1.3%
261
 
1.3%
251
 
1.2%
247
 
1.2%
Other values (817) 16151
79.7%
CJK
ValueCountFrequency (%)
907
 
5.2%
336
 
1.9%
302
 
1.7%
232
 
1.3%
228
 
1.3%
214
 
1.2%
210
 
1.2%
202
 
1.2%
166
 
0.9%
159
 
0.9%
Other values (1254) 14606
83.2%
Hiragana
ValueCountFrequency (%)
220
34.5%
99
15.5%
26
 
4.1%
23
 
3.6%
17
 
2.7%
17
 
2.7%
16
 
2.5%
13
 
2.0%
13
 
2.0%
12
 
1.9%
Other values (43) 182
28.5%
CJK Compat Ideographs
ValueCountFrequency (%)
125
21.4%
65
11.1%
62
10.6%
48
 
8.2%
44
 
7.5%
39
 
6.7%
22
 
3.8%
13
 
2.2%
12
 
2.1%
10
 
1.7%
Other values (57) 143
24.5%
Katakana
ValueCountFrequency (%)
120
 
8.8%
97
 
7.1%
78
 
5.7%
55
 
4.1%
51
 
3.8%
49
 
3.6%
48
 
3.5%
47
 
3.5%
46
 
3.4%
41
 
3.0%
Other values (66) 726
53.5%
Compat Jamo
ValueCountFrequency (%)
76
100.0%
None
ValueCountFrequency (%)
· 7
25.0%
5
17.9%
5
17.9%
2
 
7.1%
ł 2
 
7.1%
1
 
3.6%
Ł 1
 
3.6%
β 1
 
3.6%
1
 
3.6%
1
 
3.6%
Other values (2) 2
 
7.1%
Geometric Shapes
ValueCountFrequency (%)
4
100.0%
Punctuation
ValueCountFrequency (%)
4
50.0%
2
25.0%
2
25.0%
Modifier Letters
ValueCountFrequency (%)
˙ 3
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]
(variable) | 일련번호 | 서지제어번호 | 출판년도
일련번호 | 1.000 | 0.925 | 0.583
서지제어번호 | 0.925 | 1.000 | 0.445
출판년도 | 0.583 | 0.445 | 1.000

[Figure: correlation heatmap]
(variable) | 일련번호 | 서지제어번호 | 출판년도
일련번호 | 1.000 | 1.000 | 0.555
서지제어번호 | 1.000 | 1.000 | 0.555
출판년도 | 0.555 | 0.555 | 1.000
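
The report does not say which coefficient each matrix uses. One reading that fits the numbers is that a rank-based coefficient gives exactly 1.000 when two columns increase together, even if the relationship is not linear. The sketch below illustrates that effect with hand-written Pearson and Spearman functions on the 20 (일련번호, 서지제어번호) pairs listed in this report's Sample section. It uses only those 20 rows, so it will not reproduce the matrices, and it is not the profiler's implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; assumes all values are distinct (no tie handling)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(ranks(xs), ranks(ys))


# 일련번호 and 서지제어번호 from the 20 sample rows in this report.
serial = [40038, 41112, 264, 51278, 17542, 37287, 19009, 56756, 40577, 50854,
          4004, 39600, 23806, 56855, 34458, 31522, 19253, 46775, 38383, 14094]
control = [43248, 44451, 343, 55688, 18936, 40143, 20470, 97701, 43840, 54920,
           4411, 42789, 25545, 97837, 37189, 34009, 20735, 50709, 41542, 15272]

print(round(spearman(serial, control), 3))  # 1.0
print(round(pearson(serial, control), 3))
```

In these rows the control number always rises with the serial number, so the rank correlation is 1. The two largest serial numbers map to much larger control numbers, which pulls the linear correlation below 1.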

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
(index) | 일련번호 | 저자 | 서지제어번호 | 홈페이지 주소(URL) | 출판년도 | 출판사 | 입력일 | 서명
40037 | 40038 | 石原愼太郞 | 43248 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=43248 | 1990 | 이성과현실 | 1994-12-05 | 그래도 ‘NO’라고 말할 수 있는 일본 : 미일 간의 근본문제
41111 | 41112 | 早居鎭夫 | 44451 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=44451 | 1993 | 고려원미디어 | 1994-12-01 | 아침이 즐거운 건강 수면법
263 | 264 | Mussen, Paul Henry | 343 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=343 | 1979 | Harper & Row | 1999-10-26 | Psychological development, a life-span approach
51277 | 51278 | 정기수 | 55688 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=55688 | 1978 | 乙酉文化社 | 1993-11-24 | 幸福
17541 | 17542 | Den Hartog, J. P | 18936 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=18936 | 1956 | McGraw-Hill | 1994-11-24 | Mechanical vibrations
37286 | 37287 | 한국전기통신공사 홍보실 | 40143 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=40143 | 1990 | 韓國電氣通信公社 | 1994-12-06 | 電氣通信論叢 / 第1輯
19008 | 19009 | Aggarwal, J. K | 20470 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=20470 | 1977 | IEEE | 1994-11-21 | Computer methods in image analysis
56755 | 56756 | Begley, David L | 97701 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=97701 | 1991 | SPIE | 1997-02-18 | Free-space laser communication technologies III : proceedings, 21-22 January 1991, Los Angeles, California
40576 | 40577 | 유제구 | 43840 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=43840 | 1990 | 大光書林 | 1994-12-06 | 特殊加工
50853 | 50854 | Kristeva, julia | 54920 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=54920 | 1988 | 文藝出版社 | 1994-01-11 | 페미니즘과 文學

Last rows
(index) | 일련번호 | 저자 | 서지제어번호 | 홈페이지 주소(URL) | 출판년도 | 출판사 | 입력일 | 서명
4003 | 4004 | Christie, Agatha | 4411 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=4411 | 1978 | Dodd, Mead | 1994-11-18 | Destination unknown
39599 | 39600 | Chorlton, Windsor | 42789 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=42789 | 1986 | 한국일보타임-라이프 | 1994-12-02 | 永河期
23805 | 23806 | Carter, Giles F | 25545 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=25545 | 1979 | American Society for Metals | 1994-11-21 | Principles of physical and chemical metallurgy
56854 | 56855 | 김지하 | 97837 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=97837 | 1994 | 범우사 | 1997-12-23 | 병든 바다 병든 지구
34457 | 34458 | 박명규 | 37189 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=37189 | 1987 | 韓國理工學社 | 1994-11-29 | CP/M 기초와 활용
31521 | 31522 | 조성환 | 34009 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=34009 | 1979 | 博英社 | 1994-12-02 | 프로그램 經濟分析 = Economic Analysis : A programmed textbook
19252 | 19253 | Lee, Kaiman | 20735 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=20735 | 1975 | Environmental Design & Research Center | 1994-11-21 | Environmental design evaluation : a matrix method
46774 | 46775 | 한양순 | 50709 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=50709 | 1977 | 延世大學校 出版部 | 1993-12-03 | 레크레이션의 理論과 實際 / 上 - 下
38382 | 38383 | 반전수일 | 41542 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=41542 | 1966 | 朝倉書店 | 1994-12-06 | 基礎技術
14093 | 14094 | Society for Experimental Biology (Great Britain) | 15272 | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=15272 | 1973 | University Press | 1994-11-15 | Rate control of biological processes