Overview

Dataset statistics

Number of variables: 7
Number of observations: 8251
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 192
Duplicate rows (%): 2.3%
Total size in memory: 459.4 KiB
Average record size in memory: 57.0 B
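
The percentages and the average record size follow from the raw counts. A minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the Dataset statistics block.
n_variables = 7
n_observations = 8251
missing_cells = 3
duplicate_rows = 192
total_size_kib = 459.4


def pct(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.4f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing-cell share is well below 0.1%, which is why the report prints it as "< 0.1%".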

Variable types

Text: 3
Categorical: 3
Numeric: 1

Dataset

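Description: 서명 (title), 저자 (author), 발행처 (publisher), 발행년도 (publication year), 자료유형 (material type), 소장처 (holding library), 등록일 (registration date)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15484/S/1/datasetView.do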

Alerts

소장처 has constant value "서울도서관" (Constant)
Dataset has 192 (2.3%) duplicate rows (Duplicates)
발행년도 is highly imbalanced (57.7%) (Imbalance)
자료유형 is highly imbalanced (86.5%) (Imbalance)
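
The imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report does not state the formula) and applies it to the nine 자료유형 counts listed under that variable's Common Values; under that assumption it gives about 86.5%, in line with the alert. `imbalance_score` is a hypothetical helper name, and returning 0 for a single class is a convention chosen here.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, near 1 = one dominant class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumed convention: a single class gets a score of 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts of the nine 자료유형 categories, copied from its Common Values table.
material_type_counts = [7729, 349, 123, 32, 8, 5, 2, 2, 1]
score = imbalance_score(material_type_counts)
print(f"{score:.1%}")
```

The same function cannot be checked against 발행년도 here, because the report lists only its ten most common years and groups the remaining twenty.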

Reproduction

Analysis started: 2024-05-10 22:57:58.203541
Analysis finished: 2024-05-10 22:58:05.290783
Duration: 7.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서명
Text

Distinct: 8036
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 64.6 KiB

Length

Max length: 161
Median length: 106
Mean length: 31.043025
Min length: 1

Characters and Unicode

Total characters: 256136
Distinct characters: 2173
Distinct categories: 16
Distinct scripts: 7
Distinct blocks: 14
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7826
Unique (%): 94.8%
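
Distinct counts the different values present; Unique counts the values that occur exactly once, which is why it is the smaller figure. A small sketch of that distinction on made-up values (the sample list and the function name are illustrative, not taken from the dataset):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Hypothetical titles: "B" and "C" repeat, "A" and "D" appear once.
titles = ["A", "B", "B", "C", "C", "C", "D"]
n_distinct, n_unique = distinct_and_unique(titles)
print(n_distinct, n_unique)
```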

Sample

1st row: 2000년생이 온다 :초합리, 초개인, 초자율의 탈사회형 AI인간
2nd row: 도시의 미래 :현상과 전망 그리고 처방
3rd row: 기록하지 않으면 존재하지 않는다 :인권위 상임위원 3년의 기록
4th row: 로스쿨에 가고 싶어졌습니다 :서울대 로스쿨 학생들이 직접 말하는 지금 로스쿨 이야기
5th row: 미래 학교, 학생이 주도하는 교실
Value | Count | Frequency (%)
(not shown) | 531 | 0.9%
위한 | 435 | 0.7%
the | 406 | 0.7%
이야기 | 344 | 0.6%
of | 280 | 0.5%
장편소설 | 271 | 0.5%
1 | 264 | 0.5%
2 | 261 | 0.4%
(not shown) | 199 | 0.3%
a | 198 | 0.3%
Other values (24693) | 54820 | 94.5%

Most occurring characters

ValueCountFrequency (%)
49765
 
19.4%
: 5337
 
2.1%
e 4380
 
1.7%
3740
 
1.5%
3428
 
1.3%
o 3291
 
1.3%
a 3168
 
1.2%
i 3010
 
1.2%
t 2864
 
1.1%
n 2852
 
1.1%
Other values (2163) 174301
68.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 142710 | 55.7%
Space Separator | 49766 | 19.4%
Lowercase Letter | 37768 | 14.7%
Other Punctuation | 11134 | 4.3%
Decimal Number | 6714 | 2.6%
Uppercase Letter | 3446 | 1.3%
Open Punctuation | 1623 | 0.6%
Close Punctuation | 1623 | 0.6%
Math Symbol | 992 | 0.4%
Dash Punctuation | 323 | 0.1%
Other values (6) | 37 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3740
 
2.6%
3428
 
2.4%
2765
 
1.9%
2306
 
1.6%
2081
 
1.5%
2012
 
1.4%
1886
 
1.3%
1825
 
1.3%
1711
 
1.2%
1612
 
1.1%
Other values (2001) 119344
83.6%
Lowercase Letter
ValueCountFrequency (%)
e 4380
11.6%
o 3291
 
8.7%
a 3168
 
8.4%
i 3010
 
8.0%
t 2864
 
7.6%
n 2852
 
7.6%
r 2479
 
6.6%
s 2442
 
6.5%
l 1921
 
5.1%
h 1385
 
3.7%
Other values (47) 9976
26.4%
Uppercase Letter
ValueCountFrequency (%)
T 413
 
12.0%
S 289
 
8.4%
A 281
 
8.2%
C 228
 
6.6%
P 206
 
6.0%
I 203
 
5.9%
M 185
 
5.4%
L 156
 
4.5%
G 151
 
4.4%
D 146
 
4.2%
Other values (28) 1188
34.5%
Other Punctuation
ValueCountFrequency (%)
: 5337
47.9%
, 2070
 
18.6%
? 1506
 
13.5%
. 1319
 
11.8%
! 393
 
3.5%
' 287
 
2.6%
& 78
 
0.7%
/ 37
 
0.3%
; 32
 
0.3%
% 28
 
0.3%
Other values (8) 47
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 1987
29.6%
0 1304
19.4%
1 1110
16.5%
3 671
 
10.0%
4 474
 
7.1%
5 372
 
5.5%
9 214
 
3.2%
6 206
 
3.1%
7 198
 
2.9%
8 178
 
2.7%
Other Number
ValueCountFrequency (%)
4
21.1%
3
15.8%
3
15.8%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
¼ 1
 
5.3%
Math Symbol
ValueCountFrequency (%)
= 836
84.3%
~ 113
 
11.4%
+ 15
 
1.5%
> 8
 
0.8%
< 8
 
0.8%
× 5
 
0.5%
4
 
0.4%
| 3
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 1261
77.7%
[ 324
 
20.0%
25
 
1.5%
7
 
0.4%
6
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 1261
77.7%
] 324
 
20.0%
25
 
1.5%
7
 
0.4%
6
 
0.4%
Letter Number
ValueCountFrequency (%)
4
44.4%
4
44.4%
1
 
11.1%
Other Symbol
ValueCountFrequency (%)
3
60.0%
1
 
20.0%
1
 
20.0%
Space Separator
ValueCountFrequency (%)
49765
> 99.9%
  1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 323
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 140392 | 54.8%
Common | 72203 | 28.2%
Latin | 41050 | 16.0%
Han | 1553 | 0.6%
Hiragana | 559 | 0.2%
Katakana | 206 | 0.1%
Cyrillic | 173 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3740
 
2.7%
3428
 
2.4%
2765
 
2.0%
2306
 
1.6%
2081
 
1.5%
2012
 
1.4%
1886
 
1.3%
1825
 
1.3%
1711
 
1.2%
1612
 
1.1%
Other values (1262) 117026
83.4%
Han
ValueCountFrequency (%)
45
 
2.9%
35
 
2.3%
29
 
1.9%
21
 
1.4%
19
 
1.2%
19
 
1.2%
15
 
1.0%
13
 
0.8%
13
 
0.8%
13
 
0.8%
Other values (618) 1331
85.7%
Common
ValueCountFrequency (%)
49765
68.9%
: 5337
 
7.4%
, 2070
 
2.9%
2 1987
 
2.8%
? 1506
 
2.1%
. 1319
 
1.8%
0 1304
 
1.8%
( 1261
 
1.7%
) 1261
 
1.7%
1 1110
 
1.5%
Other values (54) 5283
 
7.3%
Latin
ValueCountFrequency (%)
e 4380
 
10.7%
o 3291
 
8.0%
a 3168
 
7.7%
i 3010
 
7.3%
t 2864
 
7.0%
n 2852
 
6.9%
r 2479
 
6.0%
s 2442
 
5.9%
l 1921
 
4.7%
h 1385
 
3.4%
Other values (48) 13258
32.3%
Hiragana
ValueCountFrequency (%)
72
 
12.9%
39
 
7.0%
30
 
5.4%
30
 
5.4%
27
 
4.8%
25
 
4.5%
21
 
3.8%
20
 
3.6%
19
 
3.4%
18
 
3.2%
Other values (46) 258
46.2%
Katakana
ValueCountFrequency (%)
15
 
7.3%
14
 
6.8%
13
 
6.3%
12
 
5.8%
10
 
4.9%
10
 
4.9%
8
 
3.9%
8
 
3.9%
7
 
3.4%
7
 
3.4%
Other values (45) 102
49.5%
Cyrillic
ValueCountFrequency (%)
а 26
15.0%
н 14
 
8.1%
л 11
 
6.4%
у 11
 
6.4%
д 11
 
6.4%
о 10
 
5.8%
и 9
 
5.2%
г 9
 
5.2%
р 8
 
4.6%
т 5
 
2.9%
Other values (30) 59
34.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 140391 | 54.8%
ASCII | 113105 | 44.2%
CJK | 1543 | 0.6%
Hiragana | 559 | 0.2%
Katakana | 206 | 0.1%
Cyrillic | 173 | 0.1%
None | 106 | < 0.1%
Enclosed Alphanum | 19 | < 0.1%
Punctuation | 10 | < 0.1%
CJK Compat Ideographs | 10 | < 0.1%
Other values (4) | 14 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
49765
44.0%
: 5337
 
4.7%
e 4380
 
3.9%
o 3291
 
2.9%
a 3168
 
2.8%
i 3010
 
2.7%
t 2864
 
2.5%
n 2852
 
2.5%
r 2479
 
2.2%
s 2442
 
2.2%
Other values (78) 33517
29.6%
Hangul
ValueCountFrequency (%)
3740
 
2.7%
3428
 
2.4%
2765
 
2.0%
2306
 
1.6%
2081
 
1.5%
2012
 
1.4%
1886
 
1.3%
1825
 
1.3%
1711
 
1.2%
1612
 
1.1%
Other values (1261) 117025
83.4%
Hiragana
ValueCountFrequency (%)
72
 
12.9%
39
 
7.0%
30
 
5.4%
30
 
5.4%
27
 
4.8%
25
 
4.5%
21
 
3.8%
20
 
3.6%
19
 
3.4%
18
 
3.2%
Other values (46) 258
46.2%
CJK
ValueCountFrequency (%)
45
 
2.9%
35
 
2.3%
29
 
1.9%
21
 
1.4%
19
 
1.2%
19
 
1.2%
15
 
1.0%
13
 
0.8%
13
 
0.8%
13
 
0.8%
Other values (613) 1321
85.6%
Cyrillic
ValueCountFrequency (%)
а 26
15.0%
н 14
 
8.1%
л 11
 
6.4%
у 11
 
6.4%
д 11
 
6.4%
о 10
 
5.8%
и 9
 
5.2%
г 9
 
5.2%
р 8
 
4.6%
т 5
 
2.9%
Other values (30) 59
34.1%
None
ValueCountFrequency (%)
25
23.6%
25
23.6%
13
12.3%
7
 
6.6%
7
 
6.6%
6
 
5.7%
6
 
5.7%
× 5
 
4.7%
4
 
3.8%
¼ 1
 
0.9%
Other values (7) 7
 
6.6%
Katakana
ValueCountFrequency (%)
15
 
7.3%
14
 
6.8%
13
 
6.3%
12
 
5.8%
10
 
4.9%
10
 
4.9%
8
 
3.9%
8
 
3.9%
7
 
3.4%
7
 
3.4%
Other values (45) 102
49.5%
Punctuation
ValueCountFrequency (%)
8
80.0%
1
 
10.0%
1
 
10.0%
CJK Compat Ideographs
ValueCountFrequency (%)
6
60.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
Number Forms
ValueCountFrequency (%)
4
44.4%
4
44.4%
1
 
11.1%
Enclosed Alphanum
ValueCountFrequency (%)
4
21.1%
3
15.8%
3
15.8%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Misc Symbols
ValueCountFrequency (%)
3
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%

저자
Text

Distinct: 6911
Distinct (%): 83.8%
Missing: 0
Missing (%): 0.0%
Memory size: 64.6 KiB

Length

Max length: 66
Median length: 60
Mean length: 8.2513635
Min length: 2

Characters and Unicode

Total characters: 68082
Distinct characters: 1294
Distinct categories: 12
Distinct scripts: 7
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6140
Unique (%): 74.4%

Sample

1st row: 임홍택 지음
2nd row: 윤대식 지음
3rd row: 박찬운 지음
4th row: 김성윤
5th row: 이보람
Value | Count | Frequency (%)
지음 | 4172 | 22.1%
(not shown) | 806 | 4.3%
by | 536 | 2.8%
(not shown) | 251 | 1.3%
(not shown) | 239 | 1.3%
글?그림 | 195 | 1.0%
서울특별시 | 162 | 0.9%
편집 | 109 | 0.6%
연구책임 | 98 | 0.5%
(not shown) | 83 | 0.4%
Other values (8255) | 12269 | 64.8%

Most occurring characters

ValueCountFrequency (%)
10684
 
15.7%
4533
 
6.7%
4227
 
6.2%
1254
 
1.8%
1120
 
1.6%
915
 
1.3%
e 859
 
1.3%
] 795
 
1.2%
[ 795
 
1.2%
a 786
 
1.2%
Other values (1284) 42114
61.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 45501 | 66.8%
Space Separator | 10684 | 15.7%
Lowercase Letter | 7734 | 11.4%
Uppercase Letter | 1534 | 2.3%
Other Punctuation | 944 | 1.4%
Close Punctuation | 801 | 1.2%
Open Punctuation | 801 | 1.2%
Dash Punctuation | 39 | 0.1%
Decimal Number | 32 | < 0.1%
Math Symbol | 9 | < 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4533
 
10.0%
4227
 
9.3%
1254
 
2.8%
1120
 
2.5%
915
 
2.0%
649
 
1.4%
635
 
1.4%
635
 
1.4%
533
 
1.2%
451
 
1.0%
Other values (1163) 30549
67.1%
Lowercase Letter
ValueCountFrequency (%)
e 859
11.1%
a 786
10.2%
y 685
 
8.9%
i 607
 
7.8%
b 605
 
7.8%
n 592
 
7.7%
r 528
 
6.8%
o 440
 
5.7%
l 382
 
4.9%
t 378
 
4.9%
Other values (39) 1872
24.2%
Uppercase Letter
ValueCountFrequency (%)
S 147
 
9.6%
M 121
 
7.9%
C 109
 
7.1%
A 108
 
7.0%
J 102
 
6.6%
L 94
 
6.1%
B 93
 
6.1%
D 76
 
5.0%
H 76
 
5.0%
K 73
 
4.8%
Other values (30) 535
34.9%
Decimal Number
ValueCountFrequency (%)
1 12
37.5%
3 4
 
12.5%
2 4
 
12.5%
6 3
 
9.4%
9 3
 
9.4%
8 2
 
6.2%
4 2
 
6.2%
7 1
 
3.1%
0 1
 
3.1%
Other Punctuation
ValueCountFrequency (%)
? 718
76.1%
. 174
 
18.4%
: 19
 
2.0%
/ 13
 
1.4%
, 13
 
1.4%
' 5
 
0.5%
· 1
 
0.1%
& 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
] 795
99.3%
) 3
 
0.4%
2
 
0.2%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 795
99.3%
( 3
 
0.4%
2
 
0.2%
1
 
0.1%
Math Symbol
ValueCountFrequency (%)
> 4
44.4%
< 4
44.4%
+ 1
 
11.1%
Space Separator
ValueCountFrequency (%)
10684
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 39
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 44654 | 65.6%
Common | 13313 | 19.6%
Latin | 9150 | 13.4%
Han | 676 | 1.0%
Katakana | 150 | 0.2%
Cyrillic | 118 | 0.2%
Hiragana | 21 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4533
 
10.2%
4227
 
9.5%
1254
 
2.8%
1120
 
2.5%
915
 
2.0%
649
 
1.5%
635
 
1.4%
635
 
1.4%
533
 
1.2%
451
 
1.0%
Other values (789) 29702
66.5%
Han
ValueCountFrequency (%)
97
 
14.3%
31
 
4.6%
18
 
2.7%
13
 
1.9%
9
 
1.3%
8
 
1.2%
7
 
1.0%
7
 
1.0%
7
 
1.0%
6
 
0.9%
Other values (300) 473
70.0%
Latin
ValueCountFrequency (%)
e 859
 
9.4%
a 786
 
8.6%
y 685
 
7.5%
i 607
 
6.6%
b 605
 
6.6%
n 592
 
6.5%
r 528
 
5.8%
o 440
 
4.8%
l 382
 
4.2%
t 378
 
4.1%
Other values (42) 3288
35.9%
Katakana
ValueCountFrequency (%)
11
 
7.3%
9
 
6.0%
7
 
4.7%
7
 
4.7%
7
 
4.7%
6
 
4.0%
6
 
4.0%
6
 
4.0%
6
 
4.0%
5
 
3.3%
Other values (42) 80
53.3%
Cyrillic
ValueCountFrequency (%)
а 11
 
9.3%
о 9
 
7.6%
р 8
 
6.8%
н 7
 
5.9%
и 7
 
5.9%
с 6
 
5.1%
е 5
 
4.2%
д 5
 
4.2%
м 5
 
4.2%
э 4
 
3.4%
Other values (27) 51
43.2%
Common
ValueCountFrequency (%)
10684
80.3%
] 795
 
6.0%
[ 795
 
6.0%
? 718
 
5.4%
. 174
 
1.3%
- 39
 
0.3%
: 19
 
0.1%
/ 13
 
0.1%
, 13
 
0.1%
1 12
 
0.1%
Other values (22) 51
 
0.4%
Hiragana
ValueCountFrequency (%)
4
19.0%
3
14.3%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (2) 2
9.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 44654 | 65.6%
ASCII | 22454 | 33.0%
CJK | 674 | 1.0%
Katakana | 150 | 0.2%
Cyrillic | 118 | 0.2%
Hiragana | 21 | < 0.1%
None | 7 | < 0.1%
Enclosed Alphanum | 2 | < 0.1%
CJK Compat Ideographs | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10684
47.6%
e 859
 
3.8%
] 795
 
3.5%
[ 795
 
3.5%
a 786
 
3.5%
? 718
 
3.2%
y 685
 
3.1%
i 607
 
2.7%
b 605
 
2.7%
n 592
 
2.6%
Other values (68) 5328
23.7%
Hangul
ValueCountFrequency (%)
4533
 
10.2%
4227
 
9.5%
1254
 
2.8%
1120
 
2.5%
915
 
2.0%
649
 
1.5%
635
 
1.4%
635
 
1.4%
533
 
1.2%
451
 
1.0%
Other values (789) 29702
66.5%
CJK
ValueCountFrequency (%)
97
 
14.4%
31
 
4.6%
18
 
2.7%
13
 
1.9%
9
 
1.3%
8
 
1.2%
7
 
1.0%
7
 
1.0%
7
 
1.0%
6
 
0.9%
Other values (298) 471
69.9%
Cyrillic
ValueCountFrequency (%)
а 11
 
9.3%
о 9
 
7.6%
р 8
 
6.8%
н 7
 
5.9%
и 7
 
5.9%
с 6
 
5.1%
е 5
 
4.2%
д 5
 
4.2%
м 5
 
4.2%
э 4
 
3.4%
Other values (27) 51
43.2%
Katakana
ValueCountFrequency (%)
11
 
7.3%
9
 
6.0%
7
 
4.7%
7
 
4.7%
7
 
4.7%
6
 
4.0%
6
 
4.0%
6
 
4.0%
6
 
4.0%
5
 
3.3%
Other values (42) 80
53.3%
Hiragana
ValueCountFrequency (%)
4
19.0%
3
14.3%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (2) 2
9.5%
Enclosed Alphanum
ValueCountFrequency (%)
2
100.0%
None
ValueCountFrequency (%)
2
28.6%
2
28.6%
· 1
14.3%
1
14.3%
1
14.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%

발행처
Text

Distinct: 3364
Distinct (%): 40.8%
Missing: 3
Missing (%): < 0.1%
Memory size: 64.6 KiB

Length

Max length: 87
Median length: 54
Mean length: 7.5425558
Min length: 1

Characters and Unicode

Total characters: 62211
Distinct characters: 1074
Distinct categories: 10
Distinct scripts: 7
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2082
Unique (%): 25.2%

Sample

1st row: 11%
2nd row: 박영사
3rd row: 혜윰터
4th row: 메가스터디books : 메가스터디
5th row: 두드림미디어
Value | Count | Frequency (%)
(not shown) | 872 | 7.3%
서울특별시 | 262 | 2.2%
출판문화원 | 95 | 0.8%
위즈덤하우스 | 88 | 0.7%
books | 86 | 0.7%
한국방송통신대학교 | 84 | 0.7%
김영사 | 78 | 0.7%
국회예산정책처 | 73 | 0.6%
다산북스 | 72 | 0.6%
문학동네 | 66 | 0.6%
Other values (3648) | 10162 | 85.1%

Most occurring characters

ValueCountFrequency (%)
3692
 
5.9%
1402
 
2.3%
1302
 
2.1%
: 1162
 
1.9%
o 1153
 
1.9%
e 1105
 
1.8%
1083
 
1.7%
1014
 
1.6%
a 890
 
1.4%
i 858
 
1.4%
Other values (1064) 48550
78.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 42190 | 67.8%
Lowercase Letter | 10051 | 16.2%
Space Separator | 3692 | 5.9%
Uppercase Letter | 2847 | 4.6%
Other Punctuation | 1898 | 3.1%
Open Punctuation | 670 | 1.1%
Close Punctuation | 670 | 1.1%
Decimal Number | 180 | 0.3%
Dash Punctuation | 9 | < 0.1%
Math Symbol | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1402
 
3.3%
1302
 
3.1%
1083
 
2.6%
1014
 
2.4%
724
 
1.7%
695
 
1.6%
681
 
1.6%
670
 
1.6%
651
 
1.5%
617
 
1.5%
Other values (956) 33351
79.0%
Lowercase Letter
ValueCountFrequency (%)
o 1153
11.5%
e 1105
11.0%
a 890
 
8.9%
i 858
 
8.5%
r 798
 
7.9%
n 760
 
7.6%
s 716
 
7.1%
t 581
 
5.8%
l 444
 
4.4%
u 395
 
3.9%
Other values (36) 2351
23.4%
Uppercase Letter
ValueCountFrequency (%)
B 361
 
12.7%
P 281
 
9.9%
S 197
 
6.9%
M 187
 
6.6%
K 166
 
5.8%
A 165
 
5.8%
C 164
 
5.8%
H 156
 
5.5%
T 130
 
4.6%
R 115
 
4.0%
Other values (22) 925
32.5%
Other Punctuation
ValueCountFrequency (%)
: 1162
61.2%
? 539
28.4%
. 61
 
3.2%
& 43
 
2.3%
' 31
 
1.6%
# 30
 
1.6%
; 13
 
0.7%
, 11
 
0.6%
/ 5
 
0.3%
% 1
 
0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 55
30.6%
1 52
28.9%
0 27
15.0%
5 20
 
11.1%
6 8
 
4.4%
8 6
 
3.3%
4 5
 
2.8%
7 5
 
2.8%
9 1
 
0.6%
3 1
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 604
90.1%
[ 66
 
9.9%
Close Punctuation
ValueCountFrequency (%)
) 604
90.1%
] 66
 
9.9%
Math Symbol
ValueCountFrequency (%)
× 2
50.0%
+ 2
50.0%
Space Separator
ValueCountFrequency (%)
3692
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 41250 | 66.3%
Latin | 12800 | 20.6%
Common | 7123 | 11.4%
Han | 755 | 1.2%
Katakana | 168 | 0.3%
Cyrillic | 98 | 0.2%
Hiragana | 17 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1402
 
3.4%
1302
 
3.2%
1083
 
2.6%
1014
 
2.5%
724
 
1.8%
695
 
1.7%
681
 
1.7%
670
 
1.6%
651
 
1.6%
617
 
1.5%
Other values (710) 32411
78.6%
Han
ValueCountFrequency (%)
47
 
6.2%
47
 
6.2%
39
 
5.2%
35
 
4.6%
15
 
2.0%
15
 
2.0%
14
 
1.9%
13
 
1.7%
13
 
1.7%
12
 
1.6%
Other values (183) 505
66.9%
Latin
ValueCountFrequency (%)
o 1153
 
9.0%
e 1105
 
8.6%
a 890
 
7.0%
i 858
 
6.7%
r 798
 
6.2%
n 760
 
5.9%
s 716
 
5.6%
t 581
 
4.5%
l 444
 
3.5%
u 395
 
3.1%
Other values (42) 5100
39.8%
Katakana
ValueCountFrequency (%)
12
 
7.1%
12
 
7.1%
9
 
5.4%
8
 
4.8%
8
 
4.8%
7
 
4.2%
7
 
4.2%
6
 
3.6%
6
 
3.6%
5
 
3.0%
Other values (31) 88
52.4%
Common
ValueCountFrequency (%)
3692
51.8%
: 1162
 
16.3%
( 604
 
8.5%
) 604
 
8.5%
? 539
 
7.6%
[ 66
 
0.9%
] 66
 
0.9%
. 61
 
0.9%
2 55
 
0.8%
1 52
 
0.7%
Other values (20) 222
 
3.1%
Cyrillic
ValueCountFrequency (%)
и 11
11.2%
а 11
11.2%
н 11
11.2%
р 8
 
8.2%
м 8
 
8.2%
о 5
 
5.1%
л 5
 
5.1%
М 4
 
4.1%
д 4
 
4.1%
г 4
 
4.1%
Other values (16) 27
27.6%
Hiragana
ValueCountFrequency (%)
3
17.6%
3
17.6%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (2) 2
11.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 41218 | 66.3%
ASCII | 19920 | 32.0%
CJK | 754 | 1.2%
Katakana | 168 | 0.3%
Cyrillic | 98 | 0.2%
Compat Jamo | 32 | 0.1%
Hiragana | 17 | < 0.1%
None | 3 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3692
18.5%
: 1162
 
5.8%
o 1153
 
5.8%
e 1105
 
5.5%
a 890
 
4.5%
i 858
 
4.3%
r 798
 
4.0%
n 760
 
3.8%
s 716
 
3.6%
( 604
 
3.0%
Other values (70) 8182
41.1%
Hangul
ValueCountFrequency (%)
1402
 
3.4%
1302
 
3.2%
1083
 
2.6%
1014
 
2.5%
724
 
1.8%
695
 
1.7%
681
 
1.7%
670
 
1.6%
651
 
1.6%
617
 
1.5%
Other values (697) 32379
78.6%
CJK
ValueCountFrequency (%)
47
 
6.2%
47
 
6.2%
39
 
5.2%
35
 
4.6%
15
 
2.0%
15
 
2.0%
14
 
1.9%
13
 
1.7%
13
 
1.7%
12
 
1.6%
Other values (182) 504
66.8%
Katakana
ValueCountFrequency (%)
12
 
7.1%
12
 
7.1%
9
 
5.4%
8
 
4.8%
8
 
4.8%
7
 
4.2%
7
 
4.2%
6
 
3.6%
6
 
3.6%
5
 
3.0%
Other values (31) 88
52.4%
Cyrillic
ValueCountFrequency (%)
и 11
11.2%
а 11
11.2%
н 11
11.2%
р 8
 
8.2%
м 8
 
8.2%
о 5
 
5.1%
л 5
 
5.1%
М 4
 
4.1%
д 4
 
4.1%
г 4
 
4.1%
Other values (16) 27
27.6%
Compat Jamo
ValueCountFrequency (%)
7
21.9%
6
18.8%
3
9.4%
3
9.4%
2
 
6.2%
2
 
6.2%
2
 
6.2%
2
 
6.2%
1
 
3.1%
1
 
3.1%
Other values (3) 3
9.4%
Hiragana
ValueCountFrequency (%)
3
17.6%
3
17.6%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Other values (2) 2
11.8%
None
ValueCountFrequency (%)
× 2
66.7%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

발행년도
Categorical

IMBALANCE 

Distinct: 30
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 64.6 KiB

Value | Count
2023 | 4679
2024 | 1528
2022 | 904
2021 | 406
2020 | 267
Other values (25) | 467

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 4679 | 56.7%
2024 | 1528 | 18.5%
2022 | 904 | 11.0%
2021 | 406 | 4.9%
2020 | 267 | 3.2%
2019 | 136 | 1.6%
2018 | 65 | 0.8%
2014 | 46 | 0.6%
2015 | 32 | 0.4%
2017 | 31 | 0.4%
Other values (20) | 157 | 1.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2023 | 4679 | 56.7%
2024 | 1528 | 18.5%
2022 | 904 | 11.0%
2021 | 406 | 4.9%
2020 | 267 | 3.2%
2019 | 136 | 1.6%
2018 | 65 | 0.8%
2014 | 46 | 0.6%
2015 | 32 | 0.4%
2017 | 31 | 0.4%
Other values (19) | 156 | 1.9%

자료유형
Categorical

IMBALANCE 

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.6 KiB

Value | Count
단행본 | 7729
서울시 간행물 | 349
기술용역보고서 | 123
학술용역보고서 | 32
서울시 연속간행물 | 8
Other values (4) | 10

Length

Max length: 10
Median length: 3
Mean length: 3.253545
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 단행본
2nd row: 단행본
3rd row: 단행본
4th row: 단행본
5th row: 단행본

Common Values

Value | Count | Frequency (%)
단행본 | 7729 | 93.7%
서울시 간행물 | 349 | 4.2%
기술용역보고서 | 123 | 1.5%
학술용역보고서 | 32 | 0.4%
서울시 연속간행물 | 8 | 0.1%
일반도서 | 5 | 0.1%
백서(서울시간행물) | 2 | < 0.1%
DVD자료 | 2 | < 0.1%
기관 연속간행물 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
단행본 | 7729 | 89.8%
서울시 | 357 | 4.1%
간행물 | 349 | 4.1%
기술용역보고서 | 123 | 1.4%
학술용역보고서 | 32 | 0.4%
연속간행물 | 9 | 0.1%
일반도서 | 5 | 0.1%
백서(서울시간행물 | 2 | < 0.1%
dvd자료 | 2 | < 0.1%
기관 | 1 | < 0.1%

소장처
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.6 KiB

Value | Count
서울도서관 | 8251

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울도서관
2nd row: 서울도서관
3rd row: 서울도서관
4th row: 서울도서관
5th row: 서울도서관

Common Values

Value | Count | Frequency (%)
서울도서관 | 8251 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서울도서관 | 8251 | 100.0%

등록일
Real number (ℝ)

Distinct: 49
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240314
Minimum: 20240111
Maximum: 20240509
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 72.6 KiB

Quantile statistics

Minimum: 20240111
5-th percentile: 20240117
Q1: 20240221
median: 20240321
Q3: 20240412
95-th percentile: 20240504
Maximum: 20240509
Range: 398
Interquartile range (IQR): 191
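
등록일 appears to hold dates encoded as YYYYMMDD integers (an inference from the values, not something the report states), so the Range and IQR above are differences between integers, not day counts. A sketch comparing the two readings for the reported minimum, maximum, Q1 and Q3; the helper name is hypothetical:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20240111 as a calendar date (assumed encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


minimum, maximum = 20240111, 20240509
q1, q3 = 20240221, 20240412

# What the profiler reports: plain integer subtraction.
numeric_range = maximum - minimum
numeric_iqr = q3 - q1

# What the values mean if they are dates: elapsed days.
calendar_span = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days
calendar_iqr = (parse_yyyymmdd(q3) - parse_yyyymmdd(q1)).days

print(numeric_range, calendar_span)
print(numeric_iqr, calendar_iqr)
```

If the date reading is right, parsing the column as a date before profiling would make the quantile and dispersion figures interpretable as days.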

Descriptive statistics

Standard deviation: 120.49105
Coefficient of variation (CV): 5.9530225 × 10^-6
Kurtosis: -0.95231863
Mean: 20240314
Median Absolute Deviation (MAD): 97
Skewness: -0.18204374
Sum: 1.6700283 × 10^11
Variance: 14518.092
Monotonicity: Decreasing
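
Several of these figures are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A quick consistency check using the rounded values printed in the report, so agreement is only approximate; the observation count 8251 comes from the Overview:

```python
# Rounded values as printed in the report.
std = 120.49105
mean = 20240314
n_observations = 8251

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance as the square of the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV       = {cv:.7e}")
print(f"Variance = {variance:.3f}")
print(f"Sum      = {total:.7e}")
```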
Histogram with fixed size bins (bins=49)
Common values

Value | Count | Frequency (%)
20240329 | 491 | 6.0%
20240321 | 489 | 5.9%
20240405 | 411 | 5.0%
20240308 | 408 | 4.9%
20240315 | 399 | 4.8%
20240418 | 355 | 4.3%
20240215 | 329 | 4.0%
20240327 | 306 | 3.7%
20240131 | 291 | 3.5%
20240425 | 290 | 3.5%
Other values (39) | 4482 | 54.3%

Minimum 10 values

Value | Count | Frequency (%)
20240111 | 146 | 1.8%
20240112 | 177 | 2.1%
20240117 | 115 | 1.4%
20240118 | 175 | 2.1%
20240119 | 5 | 0.1%
20240124 | 226 | 2.7%
20240125 | 104 | 1.3%
20240130 | 201 | 2.4%
20240131 | 291 | 3.5%
20240201 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20240509 | 201 | 2.4%
20240508 | 182 | 2.2%
20240504 | 196 | 2.4%
20240502 | 241 | 2.9%
20240501 | 143 | 1.7%
20240430 | 2 | < 0.1%
20240425 | 290 | 3.5%
20240424 | 168 | 2.0%
20240418 | 355 | 4.3%
20240417 | 172 | 2.1%

Interactions


Correlations

 | 발행년도 | 자료유형 | 등록일
발행년도 | 1.000 | 0.593 | 0.356
자료유형 | 0.593 | 1.000 | 0.304
등록일 | 0.356 | 0.304 | 1.000

 | 자료유형 | 발행년도
자료유형 | 1.000 | 0.264
발행년도 | 0.264 | 1.000

 | 등록일 | 발행년도 | 자료유형
등록일 | 1.000 | 0.156 | 0.163
발행년도 | 0.156 | 1.000 | 0.264
자료유형 | 0.163 | 0.264 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 서명 | 저자 | 발행처 | 발행년도 | 자료유형 | 소장처 | 등록일
0 | 2000년생이 온다 :초합리, 초개인, 초자율의 탈사회형 AI인간 | 임홍택 지음 | 11% | 2023 | 단행본 | 서울도서관 | 20240509
1 | 도시의 미래 :현상과 전망 그리고 처방 | 윤대식 지음 | 박영사 | 2023 | 단행본 | 서울도서관 | 20240509
2 | 기록하지 않으면 존재하지 않는다 :인권위 상임위원 3년의 기록 | 박찬운 지음 | 혜윰터 | 2023 | 단행본 | 서울도서관 | 20240509
3 | 로스쿨에 가고 싶어졌습니다 :서울대 로스쿨 학생들이 직접 말하는 지금 로스쿨 이야기 | 김성윤 | 메가스터디books : 메가스터디 | 2023 | 단행본 | 서울도서관 | 20240509
4 | 미래 학교, 학생이 주도하는 교실 | 이보람 | 두드림미디어 | 2023 | 단행본 | 서울도서관 | 20240509
5 | 센스 :문화 차이를 느껴야 영어가 는다 =Sense : making sense of enlish language under Korean culture | 안준성 지음 | 안다 | 2023 | 단행본 | 서울도서관 | 20240509
6 | 7 weeks challenge :기초영어법 | 이시원 | 시원스쿨닷컴 : 에스제이더블유인터내셔널 | 2023 | 단행본 | 서울도서관 | 20240509
7 | 시골버스는 착하다 :이철 동시집 | 이철 지음 | 학이사어린이 : 학이사 | 2023 | 단행본 | 서울도서관 | 20240509
8 | 한글과 타자기 :한글 기계화의 기술, 미학, 역사 | 김태호 지음 | 역사비평사 | 2023 | 단행본 | 서울도서관 | 20240509
9 | 벌거벗은 한국사 :본격 우리 역사 스토리텔링쇼 ,영웅편 | tvN story <벌거벗은 한국사> 제작팀 지음 | 프런트페이지 | 2023 | 단행본 | 서울도서관 | 20240509

Last rows

Index | 서명 | 저자 | 발행처 | 발행년도 | 자료유형 | 소장처 | 등록일
8241 | 내 하루는 네 시간 :10년 루푸스 악착발랄 투병기 | 희우 지음 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8242 | 현대가 낳은 아기괴물들 | 허혜민 글?그림 | 꿀떡 | 2020 | 단행본 | 서울도서관 | 20240111
8243 | 작은 책방에서 만나 :열 번째 시집 | 박혜숙 지음 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8244 | 이런 시베리아 | 양서연 지음 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8245 | 애프터 로드킬 :남겨진 것들의 기록 | 말더 글?그림 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8246 | 애정결핍 :과거의 내가 현재의 나에게 | 고선영 저 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8247 | 까만밤의 인턴썰 | 까만밤 글 | [발행처불명] | 2020 | 단행본 | 서울도서관 | 20240111
8248 | 우리는 영영 볼 수 없겠지만 | 지은이: 도티끌 | 스튜디오 티끌 | 2020 | 단행본 | 서울도서관 | 20240111
8249 | 하고 싶은 말이 이렇게 많을 줄이야 | 김남제 지음 | 실험과 관찰 | 2020 | 단행본 | 서울도서관 | 20240111
8250 | 모스크바 여행기 | 김웃 지음 | 옥구슬 | 2021 | 단행본 | 서울도서관 | 20240111

Duplicate rows

Most frequently occurring

서명저자발행처발행년도자료유형소장처등록일# duplicates
158역사를 바꾼 젊은 영웅들유수진 책임작대한민국해양연맹: 도훈2023단행본서울도서관202401123
0(2022년) 서울시 노인실태조사김정현 연구책임서울시복지재단2022서울시 간행물서울도서관202405042
1(2022년) 장애인 전수조사 :지체?뇌병변 장애인이송희 연구서울시복지재단2022서울시 간행물서울도서관202405042
2(2023년) 서울시 노숙인이용시설 현장평가 결과보고서성기원서울시복지재단2023서울시 간행물서울도서관202403082
3(2023년) 서울시 장애인거주시설 퇴소장애인 자립실태 조사 :보고서서울특별시 복지정책실 복지기획관 장애인복지정책과 [편]서울특별시 장애인복지정책과2024서울시 간행물서울도서관202405042
4(2023년) 서울시 장애인복지관 현장평가 결과보고서성기원서울시복지재단2023서울시 간행물서울도서관202403082
5(2023년) 서울시 지하수 보조측정망 관리개선 용역 :최종보고서서울특별시 물순환안전국 수변감성도시과서울특별시 물순환안전국 수변감성도시과2024기술용역보고서서울도서관202405042
6(ADHD?틱?정서장애 자폐스펙트럼) 음식치유가 답이다 :고현아의 새로운 개념과 음식치유법 .1고현아오늘도 사랑해2020단행본서울도서관202403272
7(Un) peu de magie dans l'air[by] Fabrice ColinGautier Languereau2022단행본서울도서관202403272
8(달라지는) 서울생활 :서울시 정책 미리 알고 알차게 즐겨요! .2024김태균서울특별시2024서울시 간행물서울도서관202403082
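
A table like this can be produced by counting identical rows. The sketch below uses made-up rows and hypothetical names; it also shows two common ways to count duplicates (distinct duplicated rows versus surplus copies), since the report does not say which convention its figure of 192 follows.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical (title, author, year) records; not taken from the dataset.
rows = [
    ("t1", "a1", 2023),
    ("t1", "a1", 2023),
    ("t2", "a2", 2024),
    ("t1", "a1", 2023),
    ("t3", "a3", 2022),
    ("t3", "a3", 2022),
]

groups = duplicate_groups(rows)
n_groups = len(groups)                           # distinct rows that repeat
n_surplus = sum(n - 1 for n in groups.values())  # copies beyond the first

# List the groups from most to least repeated, like the table above.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```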