Overview

Dataset statistics

Number of variables: 7
Number of observations: 3315
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 184.7 KiB
Average record size in memory: 57.0 B

Variable types

Numeric: 1
Text: 5
Categorical: 1

Dataset

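Description: Information on books recently acquired by Daegu Metropolitan Suseong Library (대구광역시립수성도서관). It provides the title (서명), author (저자), publisher (발행자), publication year (발행년), and call number (청구기호) of each book.
Author: Daegu Metropolitan Office of Education, Daegu Metropolitan Suseong Library (대구광역시교육청 대구광역시립수성도서관)
URL: https://www.data.go.kr/data/15005090/fileData.do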

Alerts

발행년 is highly imbalanced (58.5%) [Imbalance]
번호 has unique values [Unique]
등록번호 has unique values [Unique]

Reproduction

Analysis started: 2024-04-06 08:51:12.982314
Analysis finished: 2024-04-06 08:51:17.339320
Duration: 4.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
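
The reported duration can be recomputed from the two timestamps. The following is a minimal sketch using only the Python standard library; the timestamp format string is an assumption based on how the values are printed in this report.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-06 08:51:12.982314", FMT)
finished = datetime.strptime("2024-04-06 08:51:17.339320", FMT)

# Difference between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line.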

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 3315
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1658
Minimum: 1
Maximum: 3315
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 166.7
Q1: 829.5
Median: 1658
Q3: 2486.5
95-th percentile: 3149.3
Maximum: 3315
Range: 3314
Interquartile range (IQR): 1657
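
The quantiles above are consistent with 번호 being the integers 1 to 3315 and with quantiles computed by linear interpolation. Both points are assumptions: the report does not state how quantiles are computed, and the helper `percentile` below is a hypothetical illustration, not part of ydata-profiling.

```python
def percentile(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) of a sorted list, by linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Assumption: 번호 is a running row number 1..3315.
values = list(range(1, 3316))

q05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
q95 = percentile(values, 0.95)
iqr = q3 - q1
value_range = values[-1] - values[0]
```

Under these assumptions the results match the table up to floating-point rounding.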

Descriptive statistics

Standard deviation: 957.1024
Coefficient of variation (CV): 0.57726321
Kurtosis: -1.2
Mean: 1658
Median Absolute Deviation (MAD): 829
Skewness: 0
Sum: 5496270
Variance: 916045
Monotonicity: Strictly increasing
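
As a cross-check, the descriptive statistics can be reproduced with the standard library, again assuming that 번호 is the sequence 1 to 3315 (an inference from the minimum, maximum, distinct count, and sum, not something the report states). The variance here is the sample variance, with n - 1 in the denominator.

```python
import math
import statistics

# Assumption: 번호 is a running row number 1..3315.
values = list(range(1, 3316))

total = sum(values)
mean = total / len(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)
# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
```

That every figure matches suggests 번호 carries no information beyond row order.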
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2216 | 1 | < 0.1%
2206 | 1 | < 0.1%
2207 | 1 | < 0.1%
2208 | 1 | < 0.1%
2209 | 1 | < 0.1%
2210 | 1 | < 0.1%
2211 | 1 | < 0.1%
2212 | 1 | < 0.1%
2213 | 1 | < 0.1%
Other values (3305) | 3305 | 99.7%
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%
Value | Count | Frequency (%)
3315 | 1 | < 0.1%
3314 | 1 | < 0.1%
3313 | 1 | < 0.1%
3312 | 1 | < 0.1%
3311 | 1 | < 0.1%
3310 | 1 | < 0.1%
3309 | 1 | < 0.1%
3308 | 1 | < 0.1%
3307 | 1 | < 0.1%
3306 | 1 | < 0.1%

등록번호
Text

UNIQUE 

Distinct: 3315
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 39780
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
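
To illustrate the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the two inputs are sample values taken from this report. The module returns two-letter codes: `Lu` corresponds to Uppercase Letter, `Nd` to Decimal Number, and `Lo` to Other Letter. Scripts and blocks are not exposed by `unicodedata`, so they are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# A registration number (등록번호) sample value from this report.
registration = category_counts("BDD000001351")

# A call number (청구기호) sample value from this report.
call_number = category_counts("필(J) 650-팔17ㅇ")
```

The registration number contains only uppercase letters and decimal digits, in line with the two distinct categories reported for 등록번호.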

Unique

Unique: 3315
Unique (%): 100.0%

Sample

1st row: BDD000001351
2nd row: BDD000001352
3rd row: BDD000001353
4th row: BDD000001354
5th row: BDD000001355
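
The length and character statistics suggest a fixed layout for 등록번호. Every value has 12 characters, and the totals of 9945 uppercase letters and 29835 digits equal 3 and 9 per row across 3315 rows. The pattern below is a sketch under that assumption; placing the letters first is inferred only from the sample rows, and the function name is hypothetical.

```python
import re

# Assumed layout: three uppercase ASCII letters followed by nine digits.
REGISTRATION_PATTERN = re.compile(r"[A-Z]{3}[0-9]{9}")

def is_valid_registration(value):
    """Return True if `value` matches the assumed registration-number layout."""
    return REGISTRATION_PATTERN.fullmatch(value) is not None

# Sample values that appear in this report.
samples = ["BDD000001351", "BDD000001352", "BEM000005579"]
results = [is_valid_registration(s) for s in samples]
```

A check like this could flag malformed identifiers before the column is used as a key.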
Value | Count | Frequency (%)
bdd000001351 | 1 | < 0.1%
bds000519567 | 1 | < 0.1%
bds000519569 | 1 | < 0.1%
bds000519558 | 1 | < 0.1%
bds000519559 | 1 | < 0.1%
bds000519560 | 1 | < 0.1%
bds000519561 | 1 | < 0.1%
bds000519562 | 1 | < 0.1%
bds000519563 | 1 | < 0.1%
bds000519564 | 1 | < 0.1%
Other values (3305) | 3305 | 99.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 11926 | 30.0%
5 | 4349 | 10.9%
1 | 3589 | 9.0%
B | 3315 | 8.3%
D | 3135 | 7.9%
S | 3045 | 7.7%
9 | 1981 | 5.0%
8 | 1964 | 4.9%
7 | 1572 | 4.0%
2 | 1369 | 3.4%
Other values (7) | 3535 | 8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29835
75.0%
Uppercase Letter 9945
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 11926
40.0%
5 4349
 
14.6%
1 3589
 
12.0%
9 1981
 
6.6%
8 1964
 
6.6%
7 1572
 
5.3%
2 1369
 
4.6%
4 1091
 
3.7%
6 1002
 
3.4%
3 992
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
B 3315
33.3%
D 3135
31.5%
S 3045
30.6%
E 200
 
2.0%
M 200
 
2.0%
Q 30
 
0.3%
W 20
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 29835
75.0%
Latin 9945
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 11926
40.0%
5 4349
 
14.6%
1 3589
 
12.0%
9 1981
 
6.6%
8 1964
 
6.6%
7 1572
 
5.3%
2 1369
 
4.6%
4 1091
 
3.7%
6 1002
 
3.4%
3 992
 
3.3%
Latin
ValueCountFrequency (%)
B 3315
33.3%
D 3135
31.5%
S 3045
30.6%
E 200
 
2.0%
M 200
 
2.0%
Q 30
 
0.3%
W 20
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39780
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 11926
30.0%
5 4349
 
10.9%
1 3589
 
9.0%
B 3315
 
8.3%
D 3135
 
7.9%
S 3045
 
7.7%
9 1981
 
5.0%
8 1964
 
4.9%
7 1572
 
4.0%
2 1369
 
3.4%
Other values (7) 3535
 
8.9%

청구기호
Text

Distinct: 3266
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB

Length

Max length: 19
Median length: 17
Mean length: 11.933333
Min length: 7

Characters and Unicode

Total characters: 39559
Distinct characters: 423
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3231
Unique (%): 97.5%
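
"Distinct" and "Unique" measure different things here. The sketch below assumes the usual reading: distinct is the number of different values, and unique is the number of values that occur exactly once. The toy list is hypothetical and not taken from the dataset; the two percentages are recomputed from the counts reported for this variable.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values,
    and the number of values that occur exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy column: "A" and "D" occur once, "B" twice, "C" three times.
toy = ["A", "B", "B", "C", "C", "C", "D"]
distinct, unique = distinct_and_unique(toy)

# Percentages recomputed from the reported counts (3266 distinct, 3231 unique, 3315 rows).
distinct_pct = round(100 * 3266 / 3315, 1)
unique_pct = round(100 * 3231 / 3315, 1)
```

The gap between the two counts (3266 - 3231 = 35) is the number of values that occur more than once.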

Sample

1st row: 필(J) 650-팔17ㅇ
2nd row: 필(J) 796.8-닥833ㅇ
3rd row: 필(J) 896.8-바58ㅇ
4th row: 필(J) 896.8-아53ㅂ
5th row: 필(J) 796.8-타31ㅍ
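
The sample rows suggest that a value consists of an optional collection prefix, a classification number, a hyphen, and a book number. The parser below is a sketch under that assumption only. The field names (`prefix`, `class_no`, `book_number`) are hypothetical, and other rows in the column, for example those containing `=`, may not follow this layout.

```python
import re

# Assumed layout, inferred only from the sample rows of this report:
#   [collection prefix] <classification number>-<book number>
CALL_NUMBER = re.compile(
    r"^(?:(?P<prefix>\S+)\s+)?(?P<class_no>\d+(?:\.\d+)?)-(?P<book_number>.+)$"
)

def parse_call_number(value):
    """Split a call number into its assumed parts, or return None if it does not match."""
    match = CALL_NUMBER.match(value)
    return match.groupdict() if match else None

parsed = parse_call_number("필(J) 796.8-닥833ㅇ")
print(parsed)
```

Returning `None` for non-matching values makes it easy to count how many rows deviate from the assumed layout before relying on it.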
ValueCountFrequency (%)
j 692
 
14.2%
mc 460
 
9.4%
양(j 157
 
3.2%
119
 
2.4%
43
 
0.9%
dv 30
 
0.6%
17
 
0.3%
일(j 10
 
0.2%
10
 
0.2%
필(j 10
 
0.2%
Other values (3263) 3327
68.2%

Most occurring characters

ValueCountFrequency (%)
- 4038
 
10.2%
8 3483
 
8.8%
1 3393
 
8.6%
3 3373
 
8.5%
2 2365
 
6.0%
. 2281
 
5.8%
4 2007
 
5.1%
5 1954
 
4.9%
9 1896
 
4.8%
7 1600
 
4.0%
Other values (413) 13169
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 22518
56.9%
Other Letter 6126
 
15.5%
Dash Punctuation 4038
 
10.2%
Other Punctuation 2281
 
5.8%
Uppercase Letter 2048
 
5.2%
Space Separator 1560
 
3.9%
Math Symbol 436
 
1.1%
Lowercase Letter 196
 
0.5%
Close Punctuation 178
 
0.4%
Open Punctuation 178
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
578
 
9.4%
328
 
5.4%
289
 
4.7%
282
 
4.6%
271
 
4.4%
212
 
3.5%
201
 
3.3%
188
 
3.1%
176
 
2.9%
170
 
2.8%
Other values (350) 3431
56.0%
Lowercase Letter
ValueCountFrequency (%)
c 20
 
10.2%
s 17
 
8.7%
a 16
 
8.2%
i 14
 
7.1%
h 14
 
7.1%
b 13
 
6.6%
w 13
 
6.6%
l 12
 
6.1%
t 10
 
5.1%
n 9
 
4.6%
Other values (13) 58
29.6%
Uppercase Letter
ValueCountFrequency (%)
J 872
42.6%
M 471
23.0%
C 465
22.7%
L 66
 
3.2%
D 38
 
1.9%
V 31
 
1.5%
S 19
 
0.9%
H 16
 
0.8%
B 15
 
0.7%
K 9
 
0.4%
Other values (12) 46
 
2.2%
Decimal Number
ValueCountFrequency (%)
8 3483
15.5%
1 3393
15.1%
3 3373
15.0%
2 2365
10.5%
4 2007
8.9%
5 1954
8.7%
9 1896
8.4%
7 1600
7.1%
6 1363
 
6.1%
0 1084
 
4.8%
Close Punctuation
ValueCountFrequency (%)
) 177
99.4%
] 1
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 177
99.4%
[ 1
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
- 4038
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2281
100.0%
Space Separator
ValueCountFrequency (%)
1560
100.0%
Math Symbol
ValueCountFrequency (%)
= 436
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31189
78.8%
Hangul 6126
 
15.5%
Latin 2244
 
5.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
578
 
9.4%
328
 
5.4%
289
 
4.7%
282
 
4.6%
271
 
4.4%
212
 
3.5%
201
 
3.3%
188
 
3.1%
176
 
2.9%
170
 
2.8%
Other values (350) 3431
56.0%
Latin
ValueCountFrequency (%)
J 872
38.9%
M 471
21.0%
C 465
20.7%
L 66
 
2.9%
D 38
 
1.7%
V 31
 
1.4%
c 20
 
0.9%
S 19
 
0.8%
s 17
 
0.8%
H 16
 
0.7%
Other values (35) 229
 
10.2%
Common
ValueCountFrequency (%)
- 4038
12.9%
8 3483
11.2%
1 3393
10.9%
3 3373
10.8%
2 2365
7.6%
. 2281
7.3%
4 2007
6.4%
5 1954
6.3%
9 1896
 
6.1%
7 1600
 
5.1%
Other values (8) 4799
15.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33433
84.5%
Hangul 3511
 
8.9%
Compat Jamo 2615
 
6.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 4038
12.1%
8 3483
10.4%
1 3393
10.1%
3 3373
10.1%
2 2365
 
7.1%
. 2281
 
6.8%
4 2007
 
6.0%
5 1954
 
5.8%
9 1896
 
5.7%
7 1600
 
4.8%
Other values (53) 7043
21.1%
Compat Jamo
ValueCountFrequency (%)
578
22.1%
328
12.5%
282
10.8%
201
 
7.7%
188
 
7.2%
176
 
6.7%
170
 
6.5%
161
 
6.2%
154
 
5.9%
95
 
3.6%
Other values (9) 282
10.8%
Hangul
ValueCountFrequency (%)
289
 
8.2%
271
 
7.7%
212
 
6.0%
129
 
3.7%
121
 
3.4%
73
 
2.1%
70
 
2.0%
57
 
1.6%
54
 
1.5%
51
 
1.5%
Other values (331) 2184
62.2%

서명
Text

Distinct: 3126
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB

Length

Max length: 125
Median length: 71
Mean length: 24.169834
Min length: 1

Characters and Unicode

Total characters: 80123
Distinct characters: 1330
Distinct categories: 16
Distinct scripts: 6
Distinct blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3047
Unique (%): 91.9%

Sample

1st row: What kids should know about filipino visual art
2nd row: (Ang)aking mukha
3rd row: (Ang)awit ni balagtas
4th row: (Ang)buhok nga naglimpyo kang suba
5th row: (Ang)pamilya namin
ValueCountFrequency (%)
1443
 
7.0%
이야기 148
 
0.7%
1 110
 
0.5%
106
 
0.5%
위한 105
 
0.5%
2 97
 
0.5%
장편소설 71
 
0.3%
the 70
 
0.3%
그림책 64
 
0.3%
교과서 62
 
0.3%
Other values (9661) 18462
89.0%

Most occurring characters

ValueCountFrequency (%)
17424
 
21.7%
: 1431
 
1.8%
1420
 
1.8%
1328
 
1.7%
1040
 
1.3%
, 960
 
1.2%
e 757
 
0.9%
734
 
0.9%
727
 
0.9%
707
 
0.9%
Other values (1320) 53595
66.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 48619
60.7%
Space Separator 17424
 
21.7%
Lowercase Letter 6501
 
8.1%
Other Punctuation 3635
 
4.5%
Decimal Number 1860
 
2.3%
Uppercase Letter 919
 
1.1%
Close Punctuation 474
 
0.6%
Open Punctuation 474
 
0.6%
Math Symbol 136
 
0.2%
Dash Punctuation 63
 
0.1%
Other values (6) 18
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1420
 
2.9%
1328
 
2.7%
1040
 
2.1%
734
 
1.5%
727
 
1.5%
707
 
1.5%
702
 
1.4%
684
 
1.4%
668
 
1.4%
642
 
1.3%
Other values (1215) 39967
82.2%
Lowercase Letter
ValueCountFrequency (%)
e 757
11.6%
o 567
 
8.7%
a 534
 
8.2%
i 496
 
7.6%
n 489
 
7.5%
t 471
 
7.2%
r 417
 
6.4%
s 385
 
5.9%
h 334
 
5.1%
l 329
 
5.1%
Other values (16) 1722
26.5%
Uppercase Letter
ValueCountFrequency (%)
T 90
 
9.8%
D 89
 
9.7%
A 88
 
9.6%
W 79
 
8.6%
M 68
 
7.4%
I 62
 
6.7%
S 46
 
5.0%
B 44
 
4.8%
C 39
 
4.2%
V 39
 
4.2%
Other values (16) 275
29.9%
Other Punctuation
ValueCountFrequency (%)
: 1431
39.4%
, 960
26.4%
. 692
19.0%
! 218
 
6.0%
? 194
 
5.3%
' 58
 
1.6%
· 49
 
1.3%
& 16
 
0.4%
% 4
 
0.1%
" 4
 
0.1%
Other values (4) 9
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 423
22.7%
2 349
18.8%
0 269
14.5%
3 200
10.8%
4 192
10.3%
5 134
 
7.2%
6 91
 
4.9%
9 79
 
4.2%
7 65
 
3.5%
8 58
 
3.1%
Math Symbol
ValueCountFrequency (%)
= 88
64.7%
~ 42
30.9%
× 2
 
1.5%
> 1
 
0.7%
< 1
 
0.7%
1
 
0.7%
1
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 427
90.1%
] 39
 
8.2%
5
 
1.1%
1
 
0.2%
1
 
0.2%
1
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 427
90.1%
[ 39
 
8.2%
5
 
1.1%
1
 
0.2%
1
 
0.2%
1
 
0.2%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
17424
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 63
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 5
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Initial Punctuation
ValueCountFrequency (%)
4
100.0%
Modifier Letter
ValueCountFrequency (%)
ː 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 48395
60.4%
Common 24082
30.1%
Latin 7422
 
9.3%
Hiragana 105
 
0.1%
Han 83
 
0.1%
Katakana 36
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1420
 
2.9%
1328
 
2.7%
1040
 
2.1%
734
 
1.5%
727
 
1.5%
707
 
1.5%
702
 
1.5%
684
 
1.4%
668
 
1.4%
642
 
1.3%
Other values (1074) 39743
82.1%
Han
ValueCountFrequency (%)
3
 
3.6%
3
 
3.6%
3
 
3.6%
3
 
3.6%
3
 
3.6%
3
 
3.6%
2
 
2.4%
2
 
2.4%
2
 
2.4%
2
 
2.4%
Other values (57) 57
68.7%
Latin
ValueCountFrequency (%)
e 757
 
10.2%
o 567
 
7.6%
a 534
 
7.2%
i 496
 
6.7%
n 489
 
6.6%
t 471
 
6.3%
r 417
 
5.6%
s 385
 
5.2%
h 334
 
4.5%
l 329
 
4.4%
Other values (44) 2643
35.6%
Common
ValueCountFrequency (%)
17424
72.4%
: 1431
 
5.9%
, 960
 
4.0%
. 692
 
2.9%
) 427
 
1.8%
( 427
 
1.8%
1 423
 
1.8%
2 349
 
1.4%
0 269
 
1.1%
! 218
 
0.9%
Other values (41) 1462
 
6.1%
Hiragana
ValueCountFrequency (%)
7
 
6.7%
6
 
5.7%
6
 
5.7%
5
 
4.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
Other values (38) 57
54.3%
Katakana
ValueCountFrequency (%)
4
 
11.1%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
Other values (16) 16
44.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 48390
60.4%
ASCII 31419
39.2%
Hiragana 105
 
0.1%
CJK 82
 
0.1%
None 71
 
0.1%
Katakana 36
 
< 0.1%
Punctuation 9
 
< 0.1%
Compat Jamo 5
 
< 0.1%
Number Forms 2
 
< 0.1%
Misc Symbols 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17424
55.5%
: 1431
 
4.6%
, 960
 
3.1%
e 757
 
2.4%
. 692
 
2.2%
o 567
 
1.8%
a 534
 
1.7%
i 496
 
1.6%
n 489
 
1.6%
t 471
 
1.5%
Other values (74) 7598
24.2%
Hangul
ValueCountFrequency (%)
1420
 
2.9%
1328
 
2.7%
1040
 
2.1%
734
 
1.5%
727
 
1.5%
707
 
1.5%
702
 
1.5%
684
 
1.4%
668
 
1.4%
642
 
1.3%
Other values (1069) 39738
82.1%
None
ValueCountFrequency (%)
· 49
69.0%
5
 
7.0%
5
 
7.0%
2
 
2.8%
× 2
 
2.8%
1
 
1.4%
1
 
1.4%
1
 
1.4%
1
 
1.4%
1
 
1.4%
Other values (3) 3
 
4.2%
Hiragana
ValueCountFrequency (%)
7
 
6.7%
6
 
5.7%
6
 
5.7%
5
 
4.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
4
 
3.8%
Other values (38) 57
54.3%
Punctuation
ValueCountFrequency (%)
4
44.4%
4
44.4%
1
 
11.1%
Katakana
ValueCountFrequency (%)
4
 
11.1%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
Other values (16) 16
44.4%
CJK
ValueCountFrequency (%)
3
 
3.7%
3
 
3.7%
3
 
3.7%
3
 
3.7%
3
 
3.7%
3
 
3.7%
2
 
2.4%
2
 
2.4%
2
 
2.4%
2
 
2.4%
Other values (56) 56
68.3%
Misc Symbols
ValueCountFrequency (%)
1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Modifier Letters
ValueCountFrequency (%)
ː 1
100.0%

저자
Text

Distinct: 2825
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB

Length

Max length: 140
Median length: 87
Mean length: 15.318552
Min length: 2

Characters and Unicode

Total characters: 50781
Distinct characters: 884
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2596
Unique (%): 78.3%

Sample

1st row: written by Raissa Rivera Falgui,Denise Besinga-Manlapaz ; illustratrated by Pat Portugal
2nd row: guhit ni Yas Doctor
3rd row: isinulat ni Eugene Y. Evasco ; iginuhit ni Ara Villena
4th row: kuwento ni Genevieve L. Asenjo ; guhit ni Viel Vidal
5th row: gunit ni Vanessa Tamayo
ValueCountFrequency (%)
1967
 
13.9%
지음 1581
 
11.1%
그림 797
 
5.6%
742
 
5.2%
옮김 737
 
5.2%
지은이 230
 
1.6%
by 153
 
1.1%
글·그림 126
 
0.9%
원작 62
 
0.4%
60
 
0.4%
Other values (4867) 7741
54.5%

Most occurring characters

ValueCountFrequency (%)
10921
 
21.5%
2131
 
4.2%
; 1955
 
3.8%
1664
 
3.3%
1521
 
3.0%
1321
 
2.6%
1012
 
2.0%
979
 
1.9%
926
 
1.8%
808
 
1.6%
Other values (874) 27543
54.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30410
59.9%
Space Separator 10921
 
21.5%
Lowercase Letter 4509
 
8.9%
Other Punctuation 3689
 
7.3%
Uppercase Letter 866
 
1.7%
Open Punctuation 175
 
0.3%
Close Punctuation 175
 
0.3%
Dash Punctuation 16
 
< 0.1%
Decimal Number 16
 
< 0.1%
Math Symbol 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2131
 
7.0%
1664
 
5.5%
1521
 
5.0%
1321
 
4.3%
1012
 
3.3%
979
 
3.2%
926
 
3.0%
808
 
2.7%
513
 
1.7%
508
 
1.7%
Other values (799) 19027
62.6%
Lowercase Letter
ValueCountFrequency (%)
a 484
10.7%
e 469
10.4%
n 467
10.4%
i 456
10.1%
o 317
 
7.0%
t 313
 
6.9%
r 307
 
6.8%
l 288
 
6.4%
y 238
 
5.3%
s 208
 
4.6%
Other values (16) 962
21.3%
Uppercase Letter
ValueCountFrequency (%)
L 146
16.9%
S 70
 
8.1%
A 62
 
7.2%
J 59
 
6.8%
D 49
 
5.7%
H 48
 
5.5%
T 47
 
5.4%
C 47
 
5.4%
M 46
 
5.3%
B 42
 
4.8%
Other values (15) 250
28.9%
Other Punctuation
ValueCountFrequency (%)
; 1955
53.0%
: 744
 
20.2%
, 720
 
19.5%
· 152
 
4.1%
. 86
 
2.3%
? 29
 
0.8%
/ 1
 
< 0.1%
@ 1
 
< 0.1%
' 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 5
31.2%
0 3
18.8%
5 2
 
12.5%
7 2
 
12.5%
9 2
 
12.5%
2 1
 
6.2%
3 1
 
6.2%
Open Punctuation
ValueCountFrequency (%)
[ 154
88.0%
( 21
 
12.0%
Close Punctuation
ValueCountFrequency (%)
] 154
88.0%
) 21
 
12.0%
Math Symbol
ValueCountFrequency (%)
< 2
50.0%
> 2
50.0%
Space Separator
ValueCountFrequency (%)
10921
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30224
59.5%
Common 14996
29.5%
Latin 5375
 
10.6%
Han 79
 
0.2%
Katakana 64
 
0.1%
Hiragana 43
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2131
 
7.1%
1664
 
5.5%
1521
 
5.0%
1321
 
4.4%
1012
 
3.3%
979
 
3.2%
926
 
3.1%
808
 
2.7%
513
 
1.7%
508
 
1.7%
Other values (689) 18841
62.3%
Latin
ValueCountFrequency (%)
a 484
 
9.0%
e 469
 
8.7%
n 467
 
8.7%
i 456
 
8.5%
o 317
 
5.9%
t 313
 
5.8%
r 307
 
5.7%
l 288
 
5.4%
y 238
 
4.4%
s 208
 
3.9%
Other values (41) 1828
34.0%
Han
ValueCountFrequency (%)
12
 
15.2%
5
 
6.3%
3
 
3.8%
3
 
3.8%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
Other values (39) 44
55.7%
Katakana
ValueCountFrequency (%)
9
 
14.1%
4
 
6.2%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
2
 
3.1%
2
 
3.1%
Other values (23) 29
45.3%
Hiragana
ValueCountFrequency (%)
4
 
9.3%
3
 
7.0%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
Other values (18) 20
46.5%
Common
ValueCountFrequency (%)
10921
72.8%
; 1955
 
13.0%
: 744
 
5.0%
, 720
 
4.8%
[ 154
 
1.0%
] 154
 
1.0%
· 152
 
1.0%
. 86
 
0.6%
? 29
 
0.2%
( 21
 
0.1%
Other values (14) 60
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30224
59.5%
ASCII 20219
39.8%
None 152
 
0.3%
CJK 79
 
0.2%
Katakana 64
 
0.1%
Hiragana 43
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10921
54.0%
; 1955
 
9.7%
: 744
 
3.7%
, 720
 
3.6%
a 484
 
2.4%
e 469
 
2.3%
n 467
 
2.3%
i 456
 
2.3%
o 317
 
1.6%
t 313
 
1.5%
Other values (64) 3373
 
16.7%
Hangul
ValueCountFrequency (%)
2131
 
7.1%
1664
 
5.5%
1521
 
5.0%
1321
 
4.4%
1012
 
3.3%
979
 
3.2%
926
 
3.1%
808
 
2.7%
513
 
1.7%
508
 
1.7%
Other values (689) 18841
62.3%
None
ValueCountFrequency (%)
· 152
100.0%
CJK
ValueCountFrequency (%)
12
 
15.2%
5
 
6.3%
3
 
3.8%
3
 
3.8%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
2
 
2.5%
Other values (39) 44
55.7%
Katakana
ValueCountFrequency (%)
9
 
14.1%
4
 
6.2%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
3
 
4.7%
2
 
3.1%
2
 
3.1%
Other values (23) 29
45.3%
Hiragana
ValueCountFrequency (%)
4
 
9.3%
3
 
7.0%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
Other values (18) 20
46.5%

발행자
Text

Distinct: 1375
Distinct (%): 41.5%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB

Length

Max length: 40
Median length: 35
Mean length: 5.547813
Min length: 1

Characters and Unicode

Total characters: 18391
Distinct characters: 699
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 861
Unique (%): 26.0%

Sample

1st row: Adarna House
2nd row: Adarna House
3rd row: Johnny & Hansel Publications
4th row: Aklat Alamid
5th row: Adarna House
ValueCountFrequency (%)
books 109
 
2.9%
다산어린이 62
 
1.7%
한국헤르만헤세 56
 
1.5%
그레이트북스 55
 
1.5%
dragonfly 55
 
1.5%
문학동네 54
 
1.4%
예림당 47
 
1.3%
그레이트 37
 
1.0%
books(그레이트북스 37
 
1.0%
위즈덤하우스 31
 
0.8%
Other values (1454) 3193
85.5%

Most occurring characters

ValueCountFrequency (%)
o 609
 
3.3%
602
 
3.3%
549
 
3.0%
486
 
2.6%
426
 
2.3%
374
 
2.0%
s 300
 
1.6%
r 263
 
1.4%
262
 
1.4%
a 256
 
1.4%
Other values (689) 14264
77.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13062
71.0%
Lowercase Letter 3530
 
19.2%
Uppercase Letter 890
 
4.8%
Space Separator 426
 
2.3%
Open Punctuation 185
 
1.0%
Close Punctuation 184
 
1.0%
Other Punctuation 67
 
0.4%
Decimal Number 44
 
0.2%
Dash Punctuation 2
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
602
 
4.6%
549
 
4.2%
486
 
3.7%
374
 
2.9%
262
 
2.0%
248
 
1.9%
198
 
1.5%
197
 
1.5%
191
 
1.5%
184
 
1.4%
Other values (613) 9771
74.8%
Lowercase Letter
ValueCountFrequency (%)
o 609
17.3%
s 300
 
8.5%
r 263
 
7.5%
a 256
 
7.3%
n 250
 
7.1%
l 241
 
6.8%
e 233
 
6.6%
i 215
 
6.1%
k 197
 
5.6%
g 113
 
3.2%
Other values (16) 853
24.2%
Uppercase Letter
ValueCountFrequency (%)
B 205
23.0%
D 73
 
8.2%
S 67
 
7.5%
P 59
 
6.6%
A 59
 
6.6%
R 46
 
5.2%
K 43
 
4.8%
H 41
 
4.6%
C 40
 
4.5%
M 40
 
4.5%
Other values (15) 217
24.4%
Other Punctuation
ValueCountFrequency (%)
& 18
26.9%
. 17
25.4%
? 9
13.4%
, 6
 
9.0%
' 5
 
7.5%
/ 4
 
6.0%
· 2
 
3.0%
; 2
 
3.0%
# 2
 
3.0%
! 1
 
1.5%
Decimal Number
ValueCountFrequency (%)
2 17
38.6%
1 16
36.4%
6 3
 
6.8%
0 3
 
6.8%
3 2
 
4.5%
5 2
 
4.5%
4 1
 
2.3%
Open Punctuation
ValueCountFrequency (%)
( 180
97.3%
[ 5
 
2.7%
Close Punctuation
ValueCountFrequency (%)
) 179
97.3%
] 5
 
2.7%
Space Separator
ValueCountFrequency (%)
426
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 12948
70.4%
Latin 4420
 
24.0%
Common 909
 
4.9%
Han 93
 
0.5%
Katakana 17
 
0.1%
Hiragana 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
602
 
4.6%
549
 
4.2%
486
 
3.8%
374
 
2.9%
262
 
2.0%
248
 
1.9%
198
 
1.5%
197
 
1.5%
191
 
1.5%
184
 
1.4%
Other values (542) 9657
74.6%
Han
ValueCountFrequency (%)
12
 
12.9%
7
 
7.5%
4
 
4.3%
3
 
3.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
Other values (45) 55
59.1%
Latin
ValueCountFrequency (%)
o 609
 
13.8%
s 300
 
6.8%
r 263
 
6.0%
a 256
 
5.8%
n 250
 
5.7%
l 241
 
5.5%
e 233
 
5.3%
i 215
 
4.9%
B 205
 
4.6%
k 197
 
4.5%
Other values (41) 1651
37.4%
Common
ValueCountFrequency (%)
426
46.9%
( 180
19.8%
) 179
19.7%
& 18
 
2.0%
2 17
 
1.9%
. 17
 
1.9%
1 16
 
1.8%
? 9
 
1.0%
, 6
 
0.7%
[ 5
 
0.6%
Other values (15) 36
 
4.0%
Katakana
ValueCountFrequency (%)
2
11.8%
2
11.8%
2
11.8%
2
11.8%
2
11.8%
1
5.9%
1
5.9%
1
5.9%
1
5.9%
1
5.9%
Other values (2) 2
11.8%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 12945
70.4%
ASCII 5327
29.0%
CJK 93
 
0.5%
Katakana 17
 
0.1%
Hiragana 4
 
< 0.1%
Compat Jamo 3
 
< 0.1%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 609
 
11.4%
426
 
8.0%
s 300
 
5.6%
r 263
 
4.9%
a 256
 
4.8%
n 250
 
4.7%
l 241
 
4.5%
e 233
 
4.4%
i 215
 
4.0%
B 205
 
3.8%
Other values (65) 2329
43.7%
Hangul
ValueCountFrequency (%)
602
 
4.7%
549
 
4.2%
486
 
3.8%
374
 
2.9%
262
 
2.0%
248
 
1.9%
198
 
1.5%
197
 
1.5%
191
 
1.5%
184
 
1.4%
Other values (539) 9654
74.6%
CJK
ValueCountFrequency (%)
12
 
12.9%
7
 
7.5%
4
 
4.3%
3
 
3.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
2
 
2.2%
Other values (45) 55
59.1%
Katakana
ValueCountFrequency (%)
2
11.8%
2
11.8%
2
11.8%
2
11.8%
2
11.8%
1
5.9%
1
5.9%
1
5.9%
1
5.9%
1
5.9%
Other values (2) 2
11.8%
None
ValueCountFrequency (%)
· 2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

발행년
Categorical

IMBALANCE 

Distinct: 27
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 26.0 KiB
Value | Count
2023 | 2105
2022 | 388
2024 | 319
2021 | 228
2020 | 46
Other values (22) | 229

Length

Max length: 6
Median length: 4
Mean length: 4.053997
Min length: 4

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: 2022
2nd row: 2021
3rd row: 2023
4th row: 2022
5th row: 2021

Common Values

ValueCountFrequency (%)
2023 2105
63.5%
2022 388
 
11.7%
2024 319
 
9.6%
2021 228
 
6.9%
2020 46
 
1.4%
[2023] 43
 
1.3%
[2021] 42
 
1.3%
2019 27
 
0.8%
2017 23
 
0.7%
2016 16
 
0.5%
Other values (17) 78
 
2.4%
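
The table lists bracketed variants such as [2023] separately from 2023, which splits one year across several categories. The sketch below shows one possible way to merge them. It uses only the ten counts listed above, since the 17 values grouped under "Other values" are not itemized in the report, and the helper name `normalize_year` is hypothetical.

```python
import re
from collections import Counter

# Top-10 counts copied from the Common Values table of this report.
# The remaining 17 values (78 rows) are not itemized there and are left out.
reported_counts = {
    "2023": 2105, "2022": 388, "2024": 319, "2021": 228, "2020": 46,
    "[2023]": 43, "[2021]": 42, "2019": 27, "2017": 23, "2016": 16,
}

def normalize_year(raw):
    """Return the first four-digit run in `raw`, or None if there is none."""
    match = re.search(r"\d{4}", raw)
    return match.group(0) if match else None

# Sum the counts of all raw values that normalize to the same year.
normalized = Counter()
for raw, count in reported_counts.items():
    normalized[normalize_year(raw)] += count
```

After merging, 2021 has 228 + 42 = 270 rows and 2023 has at least 2105 + 43 = 2148; the true 2023 total may be higher if further variants sit among the unlisted values.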

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2023 | 2150 | 64.9%
2022 | 388 | 11.7%
2024 | 323 | 9.7%
2021 | 270 | 8.1%
2020 | 46 | 1.4%
2019 | 27 | 0.8%
2017 | 23 | 0.7%
2016 | 16 | 0.5%
2012 | 11 | 0.3%
2018 | 11 | 0.3%
Other values (11) | 50 | 1.5%

Interactions

[Figure: interaction plot]

Correlations

 | 번호 | 발행년
번호 | 1.000 | 0.698
발행년 | 0.698 | 1.000

 | 번호 | 발행년
번호 | 1.000 | 0.336
발행년 | 0.336 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

Index | 번호 | 등록번호 | 청구기호 | 서명 | 저자 | 발행자 | 발행년
0 | 1 | BDD000001351 | 필(J) 650-팔17ㅇ | What kids should know about filipino visual art | written by Raissa Rivera Falgui,Denise Besinga-Manlapaz ; illustratrated by Pat Portugal | Adarna House | 2022
1 | 2 | BDD000001352 | 필(J) 796.8-닥833ㅇ | (Ang)aking mukha | guhit ni Yas Doctor | Adarna House | 2021
2 | 3 | BDD000001353 | 필(J) 896.8-바58ㅇ | (Ang)awit ni balagtas | isinulat ni Eugene Y. Evasco ; iginuhit ni Ara Villena | Johnny & Hansel Publications | 2023
3 | 4 | BDD000001354 | 필(J) 896.8-아53ㅂ | (Ang)buhok nga naglimpyo kang suba | kuwento ni Genevieve L. Asenjo ; guhit ni Viel Vidal | Aklat Alamid | 2022
4 | 5 | BDD000001355 | 필(J) 796.8-타31ㅍ | (Ang)pamilya namin | gunit ni Vanessa Tamayo | Adarna House | 2021
5 | 6 | BDD000001356 | 필(J) 896.8-디31ㄷ | Datu Birang : ang tagapagtanggol ng apo sandawa | isinulat ni Janine Dimaranan ; guhit ni Noah Ocfemia | Southern Voices | 2023
6 | 7 | BDD000001357 | 필(J) 896.8-브231ㅁ | May alaga akong bakulaw | kuwento ni Becky Bravo ; guhit ni Ara Villena | Adarna House | 2019
7 | 8 | BDD000001358 | 필(J) 896.8-운55ㅍ | (Mga)pasahero sa dyip | sulat ni Gina Unson-Rivera ; guhit ni Domz Agsaway | Anvil Publishing | 2022
8 | 9 | BDD000001359 | 필(J) 896.8-운55ㅌ | (Mga)Tinapay ni tinay | sulat ni Gina Unson-Rivera ; guhit ni Domz Agsaway | Anvil Publishing | 2022
9 | 10 | BDD000001360 | 필(J) 896.8-산835ㅇ | Unang engkantada | kuwento ni Al Santos ; guhit ni Jap Mikel | Komiket | 2022

Last 10 rows

Index | 번호 | 등록번호 | 청구기호 | 서명 | 저자 | 발행자 | 발행년
3305 | 3306 | BEM000005579 | 양 813.7-S34i | I went to see my father : a novel | Kyung-sook Shin ; translated by Anton Hur | Weidenfeld & Nicolson | 2023
3306 | 3307 | BEM000005580 | 양 513.8-B45i | In love : a memoir of love and loss | Amy Bloom | Random House | 2022
3307 | 3308 | BEM000005581 | 양 895.82-F57m | Melancholy Ⅰ-Ⅱ | Jon Fosse | Cartwheel Books | 2023
3308 | 3309 | BEM000005582 | 양 843-H15n | Nuclear family : a novel | Joseph Han | Counterpoint | 2023
3309 | 3310 | BEM000005583 | 양 843-D18l | (The)last thing he told me : a novel | Laura Dave | Marysue Rucci Books | 2023
3310 | 3311 | BEM000005584 | 양 843-C31s | (The)school for good mothers : a novel | Jessamine Chan | Marysue Rucci Books/Scribner | 2023
3311 | 3312 | BEM000005585 | 양 859.7-B12w | (The)winners | Fredrik Backman ; translated by Neil Smith | Simon & Schuster | 2021
3312 | 3313 | BEM000005586 | 양 843-S25k-2 | Things we hide from the light | Lucy Score | Hodder & Stoughton | 2023
3313 | 3314 | BEM000005587 | 양 843-H16t | This other Eden : a novel | Paul Harding | W. W. Norton & Company | 2024
3314 | 3315 | BEM000005588 | 양 813.7-H81w | Welcome to the Hyunam-Dong Bookshop | Hwang Bo-reum ; translated by Shanna Tan | Bloomsbury Publishing | 2023