Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 556.6 KiB
Average record size in memory: 57.0 B
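
The average record size follows from the total in-memory size divided by the number of observations. A quick check of that arithmetic, assuming that KiB here means 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics above.
total_kib = 556.6
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 57.0 B
```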

Variable types

Text: 4
Categorical: 1
Numeric: 1

Dataset

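Description: Book check-out/check-in system of the Korea Communications Commission (방송통신위원회); provides summary information on books
Author: Korea Communications Commission (방송통신위원회)
URL: https://www.data.go.kr/data/3047123/fileData.do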

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
별치 is highly imbalanced (79.0%) [Imbalance]
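
The 79.0% figure is consistent with an entropy-based imbalance score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how such a score can be computed, not a copy of the profiler's code. `imbalance_score` is a hypothetical helper. The ten largest counts come from the common-values table for 별치; the last three are invented placeholders that only need to sum to the 42 reported under "Other values (3)".

```python
import math


def imbalance_score(counts):
    """Hypothetical helper: 1 - H / log2(k) for a list of k class counts."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class is treated as balanced
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# First ten counts: reported in the document. Last three: ASSUMED split of 42.
counts = [8951, 412, 145, 135, 89, 81, 64, 40, 22, 19, 14, 14, 14]
score = imbalance_score(counts)
print(f"{score:.3f}")
```

A score of 0 means perfectly balanced classes and a score near 1 means one class dominates. With the assumed split the result lands close to the reported 79.0%.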

Reproduction

Analysis started: 2023-12-11 22:47:47.852807
Analysis finished: 2023-12-11 22:47:49.864966
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
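
The reported duration can be re-derived from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 22:47:47.852807")
finished = datetime.fromisoformat("2023-12-11 22:47:49.864966")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 2.01 seconds
```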

Variables

청구기호 (Call number)
Text

Distinct: 9996
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 23
Mean length: 11.7127
Min length: 8

Characters and Unicode

Total characters: 117127
Distinct characters: 447
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
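
As an illustration of those character properties, the sketch below tallies the Unicode general category of each character in the first sample value of this variable. It uses Python's standard `unicodedata` module, which exposes general categories as two-letter codes; it does not expose scripts or blocks, so those are left out here.

```python
import unicodedata
from collections import Counter

value = "359.01 이14ㅇ"  # first sample row of this variable

# unicodedata.category returns two-letter codes, for example:
# "Nd" (Decimal Number), "Zs" (Space Separator),
# "Lo" (Other Letter), "Po" (Other Punctuation).
categories = Counter(unicodedata.category(ch) for ch in value)
print(dict(categories))
```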

Unique

Unique: 9992
Unique (%): 99.9%
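
The two counts measure different things: "distinct" is the number of different values, while "unique" appears to be the number of values that occur exactly once. The figures for this variable fit that reading; for example, 9992 values occurring once plus 4 values occurring twice gives 10000 rows and 9996 distinct values. A small sketch of the distinction, with `distinct_and_unique` as a hypothetical helper:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


print(distinct_and_unique(["a", "b", "b", "c"]))  # (3, 2)
```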

Sample

1st row: 359.01 이14ㅇ
2nd row: 61B 02-063
3rd row: 90Z 94-026
4th row: 320.911 사15ㅅ
5th row: 00A 92-008
Value | Count | Frequency (%)
00a | 524 | 2.2%
b | 412 | 1.7%
813.6 | 388 | 1.6%
61b | 349 | 1.5%
c.2 | 339 | 1.4%
v.2 | 319 | 1.4%
v.1 | 304 | 1.3%
11a | 301 | 1.3%
21a | 268 | 1.1%
37a | 261 | 1.1%
Other values (6259) | 20140 | 85.3%
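
The entries in this table are lower-cased fragments of call numbers ("00a", "61b"), which suggests the values were split on whitespace and case-folded before counting. The sketch below reproduces that idea on the five sample rows only; the tokenization rule is an assumption, not the profiler's documented behavior.

```python
from collections import Counter

# The five sample rows listed for this variable.
samples = ["359.01 이14ㅇ", "61B 02-063", "90Z 94-026", "320.911 사15ㅅ", "00A 92-008"]

# Assumed rule: lower-case each value, then split on whitespace.
word_counts = Counter(token for value in samples for token in value.lower().split())
print(word_counts.most_common(3))
```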

Most occurring characters

Value | Count | Frequency (%)
0 | 16839 | 14.4%
(space) | 13605 | 11.6%
1 | 11482 | 9.8%
3 | 9004 | 7.7%
2 | 8198 | 7.0%
9 | 6194 | 5.3%
. | 5557 | 4.7%
- | 5435 | 4.6%
6 | 5389 | 4.6%
8 | 4657 | 4.0%
Other values (437) | 30767 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 75002 | 64.0%
Space Separator | 13605 | 11.6%
Other Letter | 8867 | 7.6%
Uppercase Letter | 6606 | 5.6%
Other Punctuation | 5581 | 4.8%
Dash Punctuation | 5435 | 4.6%
Lowercase Letter | 2030 | 1.7%
Format | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
714
 
8.1%
564
 
6.4%
528
 
6.0%
485
 
5.5%
426
 
4.8%
420
 
4.7%
389
 
4.4%
335
 
3.8%
310
 
3.5%
275
 
3.1%
Other values (372) 4421
49.9%
Uppercase Letter
ValueCountFrequency (%)
A 2712
41.1%
B 1446
21.9%
Z 999
 
15.1%
C 275
 
4.2%
L 215
 
3.3%
J 176
 
2.7%
E 158
 
2.4%
W 106
 
1.6%
D 91
 
1.4%
R 76
 
1.2%
Other values (16) 352
 
5.3%
Lowercase Letter
ValueCountFrequency (%)
v 1317
64.9%
c 385
 
19.0%
m 66
 
3.3%
t 58
 
2.9%
p 22
 
1.1%
a 20
 
1.0%
w 17
 
0.8%
s 17
 
0.8%
n 16
 
0.8%
b 14
 
0.7%
Other values (13) 98
 
4.8%
Decimal Number
ValueCountFrequency (%)
0 16839
22.5%
1 11482
15.3%
3 9004
12.0%
2 8198
10.9%
9 6194
 
8.3%
6 5389
 
7.2%
8 4657
 
6.2%
5 4487
 
6.0%
4 4406
 
5.9%
7 4346
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 5557
99.6%
, 23
 
0.4%
/ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
13605
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5435
100.0%
Format
ValueCountFrequency (%)
­ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 99624 | 85.1%
Hangul | 8867 | 7.6%
Latin | 8636 | 7.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
714
 
8.1%
564
 
6.4%
528
 
6.0%
485
 
5.5%
426
 
4.8%
420
 
4.7%
389
 
4.4%
335
 
3.8%
310
 
3.5%
275
 
3.1%
Other values (372) 4421
49.9%
Latin
ValueCountFrequency (%)
A 2712
31.4%
B 1446
16.7%
v 1317
15.3%
Z 999
 
11.6%
c 385
 
4.5%
C 275
 
3.2%
L 215
 
2.5%
J 176
 
2.0%
E 158
 
1.8%
W 106
 
1.2%
Other values (39) 847
 
9.8%
Common
ValueCountFrequency (%)
0 16839
16.9%
13605
13.7%
1 11482
11.5%
3 9004
9.0%
2 8198
8.2%
9 6194
 
6.2%
. 5557
 
5.6%
- 5435
 
5.5%
6 5389
 
5.4%
8 4657
 
4.7%
Other values (6) 13264
13.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 108259 | 92.4%
Hangul | 4699 | 4.0%
Compat Jamo | 4168 | 3.6%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 16839
15.6%
13605
12.6%
1 11482
10.6%
3 9004
8.3%
2 8198
 
7.6%
9 6194
 
5.7%
. 5557
 
5.1%
- 5435
 
5.0%
6 5389
 
5.0%
8 4657
 
4.3%
Other values (54) 21899
20.2%
Compat Jamo
ValueCountFrequency (%)
714
17.1%
564
13.5%
426
10.2%
420
10.1%
389
9.3%
335
8.0%
310
7.4%
275
 
6.6%
206
 
4.9%
113
 
2.7%
Other values (9) 416
10.0%
Hangul
ValueCountFrequency (%)
528
 
11.2%
485
 
10.3%
185
 
3.9%
137
 
2.9%
130
 
2.8%
128
 
2.7%
106
 
2.3%
89
 
1.9%
81
 
1.7%
74
 
1.6%
Other values (353) 2756
58.7%
None
ValueCountFrequency (%)
­ 1
100.0%

서명 (Title)
Text

Distinct: 8345
Distinct (%): 83.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 124
Median length: 76
Mean length: 15.4013
Min length: 1

Characters and Unicode

Total characters: 154013
Distinct characters: 1207
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7669
Unique (%): 76.7%

Sample

1st row: 아래로부터의 정부개혁
2nd row: Changes in industrial interdependency betweeb Japan and Korea since 1985
3rd row: 환경청정기술개발의 국제적동향파악 및 종합추진전략 방안에 관한 연구
4th row: 세계속의 한국경제
5th row: 미국 ABC 및 영국 IBA 광고방송 기준
Value | Count | Frequency (%)
연구 | 843 | 2.6%
(not recovered) | 409 | 1.2%
관한 | 388 | 1.2%
위한 | 346 | 1.0%
방안 | 184 | 0.6%
보고서 | 183 | 0.6%
방송 | 175 | 0.5%
미디어 | 162 | 0.5%
the | 145 | 0.4%
디지털 | 140 | 0.4%
Other values (13709) | 30035 | 91.0%

Most occurring characters

ValueCountFrequency (%)
23026
 
15.0%
3317
 
2.2%
2959
 
1.9%
2046
 
1.3%
1871
 
1.2%
1864
 
1.2%
1726
 
1.1%
1659
 
1.1%
1633
 
1.1%
e 1569
 
1.0%
Other values (1197) 112343
72.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 103138
67.0%
Space Separator 23026
 
15.0%
Lowercase Letter 14462
 
9.4%
Decimal Number 5087
 
3.3%
Uppercase Letter 4031
 
2.6%
Open Punctuation 1411
 
0.9%
Close Punctuation 1411
 
0.9%
Other Punctuation 1060
 
0.7%
Math Symbol 234
 
0.2%
Dash Punctuation 118
 
0.1%
Other values (5) 35
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3317
 
3.2%
2959
 
2.9%
2046
 
2.0%
1871
 
1.8%
1864
 
1.8%
1726
 
1.7%
1659
 
1.6%
1633
 
1.6%
1545
 
1.5%
1390
 
1.3%
Other values (1101) 83128
80.6%
Lowercase Letter
ValueCountFrequency (%)
e 1569
10.8%
i 1435
9.9%
a 1331
9.2%
n 1329
9.2%
o 1295
 
9.0%
t 1138
 
7.9%
s 916
 
6.3%
r 882
 
6.1%
c 673
 
4.7%
d 585
 
4.0%
Other values (16) 3309
22.9%
Uppercase Letter
ValueCountFrequency (%)
T 609
15.1%
V 335
 
8.3%
I 271
 
6.7%
C 257
 
6.4%
A 256
 
6.4%
B 253
 
6.3%
S 239
 
5.9%
M 197
 
4.9%
E 195
 
4.8%
K 162
 
4.0%
Other values (16) 1257
31.2%
Other Punctuation
ValueCountFrequency (%)
· 390
36.8%
, 309
29.2%
. 117
 
11.0%
? 70
 
6.6%
' 46
 
4.3%
! 43
 
4.1%
& 34
 
3.2%
/ 27
 
2.5%
: 8
 
0.8%
" 6
 
0.6%
Other values (6) 10
 
0.9%
Decimal Number
ValueCountFrequency (%)
0 1367
26.9%
1 990
19.5%
9 896
17.6%
2 784
15.4%
8 211
 
4.1%
7 190
 
3.7%
3 190
 
3.7%
5 163
 
3.2%
6 150
 
2.9%
4 146
 
2.9%
Math Symbol
ValueCountFrequency (%)
~ 216
92.3%
+ 10
 
4.3%
5
 
2.1%
= 3
 
1.3%
Letter Number
ValueCountFrequency (%)
10
45.5%
10
45.5%
1
 
4.5%
1
 
4.5%
Open Punctuation
ValueCountFrequency (%)
( 1409
99.9%
2
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1408
99.8%
3
 
0.2%
Space Separator
ValueCountFrequency (%)
23026
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 118
100.0%
Final Punctuation
ValueCountFrequency (%)
8
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 102652
66.7%
Common 32360
 
21.0%
Latin 18515
 
12.0%
Han 486
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3317
 
3.2%
2959
 
2.9%
2046
 
2.0%
1871
 
1.8%
1864
 
1.8%
1726
 
1.7%
1659
 
1.6%
1633
 
1.6%
1545
 
1.5%
1390
 
1.4%
Other values (927) 82642
80.5%
Han
ValueCountFrequency (%)
36
 
7.4%
23
 
4.7%
17
 
3.5%
16
 
3.3%
16
 
3.3%
15
 
3.1%
13
 
2.7%
13
 
2.7%
12
 
2.5%
12
 
2.5%
Other values (164) 313
64.4%
Latin
ValueCountFrequency (%)
e 1569
 
8.5%
i 1435
 
7.8%
a 1331
 
7.2%
n 1329
 
7.2%
o 1295
 
7.0%
t 1138
 
6.1%
s 916
 
4.9%
r 882
 
4.8%
c 673
 
3.6%
T 609
 
3.3%
Other values (46) 7338
39.6%
Common
ValueCountFrequency (%)
23026
71.2%
( 1409
 
4.4%
) 1408
 
4.4%
0 1367
 
4.2%
1 990
 
3.1%
9 896
 
2.8%
2 784
 
2.4%
· 390
 
1.2%
, 309
 
1.0%
~ 216
 
0.7%
Other values (30) 1565
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 102651
66.7%
ASCII 50440
32.8%
CJK 476
 
0.3%
None 397
 
0.3%
Number Forms 22
 
< 0.1%
Punctuation 10
 
< 0.1%
CJK Compat Ideographs 10
 
< 0.1%
Math Operators 5
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23026
45.7%
e 1569
 
3.1%
i 1435
 
2.8%
( 1409
 
2.8%
) 1408
 
2.8%
0 1367
 
2.7%
a 1331
 
2.6%
n 1329
 
2.6%
o 1295
 
2.6%
t 1138
 
2.3%
Other values (73) 15133
30.0%
Hangul
ValueCountFrequency (%)
3317
 
3.2%
2959
 
2.9%
2046
 
2.0%
1871
 
1.8%
1864
 
1.8%
1726
 
1.7%
1659
 
1.6%
1633
 
1.6%
1545
 
1.5%
1390
 
1.4%
Other values (926) 82641
80.5%
None
ValueCountFrequency (%)
· 390
98.2%
3
 
0.8%
2
 
0.5%
1
 
0.3%
1
 
0.3%
CJK
ValueCountFrequency (%)
36
 
7.6%
23
 
4.8%
17
 
3.6%
16
 
3.4%
16
 
3.4%
15
 
3.2%
13
 
2.7%
13
 
2.7%
12
 
2.5%
12
 
2.5%
Other values (160) 303
63.7%
Number Forms
ValueCountFrequency (%)
10
45.5%
10
45.5%
1
 
4.5%
1
 
4.5%
Punctuation
ValueCountFrequency (%)
8
80.0%
2
 
20.0%
Math Operators
ValueCountFrequency (%)
5
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
5
50.0%
2
 
20.0%
2
 
20.0%
1
 
10.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

저자 (Author)
Text

Distinct: 4687
Distinct (%): 46.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 73
Median length: 59
Mean length: 5.8234
Min length: 2

Characters and Unicode

Total characters: 58234
Distinct characters: 723
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3381
Unique (%): 33.8%

Sample

1st row: 이계식
2nd row: Lee HongBae
3rd row: 신명교
4th row: 사공일
5th row: 방송위원회
Value | Count | Frequency (%)
방송위원회 | 526 | 4.1%
한국언론재단 | 101 | 0.8%
한국방송학회 | 100 | 0.8%
한국방송공사 | 97 | 0.8%
한국언론학회 | 91 | 0.7%
한국방송광고공사 | 81 | 0.6%
한국방송개발원 | 61 | 0.5%
시공사 | 60 | 0.5%
한국언론연구원 | 59 | 0.5%
편집부 | 58 | 0.5%
Other values (5542) | 11454 | 90.3%

Most occurring characters

ValueCountFrequency (%)
2693
 
4.6%
1662
 
2.9%
1562
 
2.7%
1557
 
2.7%
1501
 
2.6%
1500
 
2.6%
1407
 
2.4%
, 1249
 
2.1%
1157
 
2.0%
916
 
1.6%
Other values (713) 43030
73.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 45896
78.8%
Lowercase Letter 6102
 
10.5%
Space Separator 2693
 
4.6%
Uppercase Letter 1969
 
3.4%
Other Punctuation 1386
 
2.4%
Decimal Number 89
 
0.2%
Close Punctuation 34
 
0.1%
Open Punctuation 34
 
0.1%
Dash Punctuation 31
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1662
 
3.6%
1562
 
3.4%
1557
 
3.4%
1501
 
3.3%
1500
 
3.3%
1407
 
3.1%
1157
 
2.5%
916
 
2.0%
884
 
1.9%
877
 
1.9%
Other values (647) 32873
71.6%
Lowercase Letter
ValueCountFrequency (%)
e 712
11.7%
a 665
10.9%
n 568
 
9.3%
i 553
 
9.1%
o 474
 
7.8%
r 471
 
7.7%
s 352
 
5.8%
t 311
 
5.1%
l 296
 
4.9%
d 207
 
3.4%
Other values (16) 1493
24.5%
Uppercase Letter
ValueCountFrequency (%)
B 177
 
9.0%
S 159
 
8.1%
T 158
 
8.0%
M 143
 
7.3%
K 135
 
6.9%
C 121
 
6.1%
J 112
 
5.7%
A 105
 
5.3%
N 98
 
5.0%
D 96
 
4.9%
Other values (15) 665
33.8%
Decimal Number
ValueCountFrequency (%)
2 41
46.1%
1 20
22.5%
0 17
19.1%
3 4
 
4.5%
4 4
 
4.5%
5 2
 
2.2%
6 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
, 1249
90.1%
. 120
 
8.7%
· 9
 
0.6%
& 8
 
0.6%
Space Separator
ValueCountFrequency (%)
2693
100.0%
Close Punctuation
ValueCountFrequency (%)
) 34
100.0%
Open Punctuation
ValueCountFrequency (%)
( 34
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 31
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 45896
78.8%
Latin 8071
 
13.9%
Common 4267
 
7.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1662
 
3.6%
1562
 
3.4%
1557
 
3.4%
1501
 
3.3%
1500
 
3.3%
1407
 
3.1%
1157
 
2.5%
916
 
2.0%
884
 
1.9%
877
 
1.9%
Other values (647) 32873
71.6%
Latin
ValueCountFrequency (%)
e 712
 
8.8%
a 665
 
8.2%
n 568
 
7.0%
i 553
 
6.9%
o 474
 
5.9%
r 471
 
5.8%
s 352
 
4.4%
t 311
 
3.9%
l 296
 
3.7%
d 207
 
2.6%
Other values (41) 3462
42.9%
Common
ValueCountFrequency (%)
2693
63.1%
, 1249
29.3%
. 120
 
2.8%
2 41
 
1.0%
) 34
 
0.8%
( 34
 
0.8%
- 31
 
0.7%
1 20
 
0.5%
0 17
 
0.4%
· 9
 
0.2%
Other values (5) 19
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 45896
78.8%
ASCII 12329
 
21.2%
None 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2693
21.8%
, 1249
 
10.1%
e 712
 
5.8%
a 665
 
5.4%
n 568
 
4.6%
i 553
 
4.5%
o 474
 
3.8%
r 471
 
3.8%
s 352
 
2.9%
t 311
 
2.5%
Other values (55) 4281
34.7%
Hangul
ValueCountFrequency (%)
1662
 
3.6%
1562
 
3.4%
1557
 
3.4%
1501
 
3.3%
1500
 
3.3%
1407
 
3.1%
1157
 
2.5%
916
 
2.0%
884
 
1.9%
877
 
1.9%
Other values (647) 32873
71.6%
None
ValueCountFrequency (%)
· 9
100.0%

출판사 (Publisher)
Text

Distinct: 2118
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 142
Median length: 75
Mean length: 6.9258
Min length: 1

Characters and Unicode

Total characters: 69258
Distinct characters: 769
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1282
Unique (%): 12.8%

Sample

1st row: 博英社
2nd row: Korea Institute for International economic policy
3rd row: 한국환경기술개발원
4th row: 김영사
5th row: 방송위원회
Value | Count | Frequency (%)
방송위원회 | 661 | 5.7%
한국법제연구원 | 263 | 2.3%
대외경제정책연구원 | 244 | 2.1%
커뮤니케이션북스 | 218 | 1.9%
한국언론재단 | 196 | 1.7%
한국행정연구원 | 195 | 1.7%
한국방송개발원 | 174 | 1.5%
한국개발연구원 | 152 | 1.3%
한국방송공사 | 140 | 1.2%
방송통신위원회 | 120 | 1.0%
Other values (2220) | 9293 | 79.7%

Most occurring characters

ValueCountFrequency (%)
3054
 
4.4%
2944
 
4.3%
2905
 
4.2%
2021
 
2.9%
2006
 
2.9%
1885
 
2.7%
1747
 
2.5%
1656
 
2.4%
1562
 
2.3%
1478
 
2.1%
Other values (759) 48000
69.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 54881
79.2%
Lowercase Letter 10022
 
14.5%
Uppercase Letter 2109
 
3.0%
Space Separator 1656
 
2.4%
Other Punctuation 321
 
0.5%
Decimal Number 195
 
0.3%
Dash Punctuation 27
 
< 0.1%
Open Punctuation 21
 
< 0.1%
Close Punctuation 21
 
< 0.1%
Final Punctuation 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3054
 
5.6%
2944
 
5.4%
2905
 
5.3%
2021
 
3.7%
2006
 
3.7%
1885
 
3.4%
1747
 
3.2%
1562
 
2.8%
1478
 
2.7%
1140
 
2.1%
Other values (684) 34139
62.2%
Lowercase Letter
ValueCountFrequency (%)
e 1199
12.0%
o 997
9.9%
i 920
9.2%
n 861
8.6%
t 825
8.2%
a 811
8.1%
r 785
 
7.8%
s 678
 
6.8%
c 495
 
4.9%
l 488
 
4.9%
Other values (15) 1963
19.6%
Uppercase Letter
ValueCountFrequency (%)
K 198
 
9.4%
P 197
 
9.3%
I 191
 
9.1%
B 188
 
8.9%
M 153
 
7.3%
S 150
 
7.1%
T 142
 
6.7%
N 106
 
5.0%
R 106
 
5.0%
C 100
 
4.7%
Other values (15) 578
27.4%
Other Punctuation
ValueCountFrequency (%)
: 101
31.5%
& 62
19.3%
; 52
16.2%
. 38
 
11.8%
· 33
 
10.3%
18
 
5.6%
, 11
 
3.4%
/ 4
 
1.2%
@ 2
 
0.6%
Decimal Number
ValueCountFrequency (%)
2 87
44.6%
1 86
44.1%
0 9
 
4.6%
4 5
 
2.6%
3 4
 
2.1%
9 2
 
1.0%
5 1
 
0.5%
6 1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 20
95.2%
[ 1
 
4.8%
Close Punctuation
ValueCountFrequency (%)
) 20
95.2%
] 1
 
4.8%
Space Separator
ValueCountFrequency (%)
1656
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 27
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 53918
77.9%
Latin 12131
 
17.5%
Common 2246
 
3.2%
Han 963
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3054
 
5.7%
2944
 
5.5%
2905
 
5.4%
2021
 
3.7%
2006
 
3.7%
1885
 
3.5%
1747
 
3.2%
1562
 
2.9%
1478
 
2.7%
1140
 
2.1%
Other values (508) 33176
61.5%
Han
ValueCountFrequency (%)
177
18.4%
75
 
7.8%
65
 
6.7%
58
 
6.0%
47
 
4.9%
25
 
2.6%
24
 
2.5%
22
 
2.3%
18
 
1.9%
17
 
1.8%
Other values (166) 435
45.2%
Latin
ValueCountFrequency (%)
e 1199
 
9.9%
o 997
 
8.2%
i 920
 
7.6%
n 861
 
7.1%
t 825
 
6.8%
a 811
 
6.7%
r 785
 
6.5%
s 678
 
5.6%
c 495
 
4.1%
l 488
 
4.0%
Other values (40) 4072
33.6%
Common
ValueCountFrequency (%)
1656
73.7%
: 101
 
4.5%
2 87
 
3.9%
1 86
 
3.8%
& 62
 
2.8%
; 52
 
2.3%
. 38
 
1.7%
· 33
 
1.5%
- 27
 
1.2%
( 20
 
0.9%
Other values (15) 84
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 53918
77.9%
ASCII 14322
 
20.7%
CJK 960
 
1.4%
None 51
 
0.1%
Punctuation 4
 
< 0.1%
CJK Compat Ideographs 3
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3054
 
5.7%
2944
 
5.5%
2905
 
5.4%
2021
 
3.7%
2006
 
3.7%
1885
 
3.5%
1747
 
3.2%
1562
 
2.9%
1478
 
2.7%
1140
 
2.1%
Other values (508) 33176
61.5%
ASCII
ValueCountFrequency (%)
1656
 
11.6%
e 1199
 
8.4%
o 997
 
7.0%
i 920
 
6.4%
n 861
 
6.0%
t 825
 
5.8%
a 811
 
5.7%
r 785
 
5.5%
s 678
 
4.7%
c 495
 
3.5%
Other values (62) 5095
35.6%
CJK
ValueCountFrequency (%)
177
18.4%
75
 
7.8%
65
 
6.8%
58
 
6.0%
47
 
4.9%
25
 
2.6%
24
 
2.5%
22
 
2.3%
18
 
1.9%
17
 
1.8%
Other values (163) 432
45.0%
None
ValueCountFrequency (%)
· 33
64.7%
18
35.3%
Punctuation
ValueCountFrequency (%)
4
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

별치 (Separate-shelving code)
Categorical

IMBALANCE 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 8951
B | 412
L | 145
Z | 135
W | 89
Other values (8) | 268

Length

Max length: 4
Median length: 4
Mean length: 3.6853
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 8951 | 89.5%
B | 412 | 4.1%
L | 145 | 1.5%
Z | 135 | 1.4%
W | 89 | 0.9%
E | 81 | 0.8%
R | 64 | 0.6%
X | 40 | 0.4%
H | 22 | 0.2%
P | 19 | 0.2%
Other values (3) | 42 | 0.4%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
na | 8951 | 89.5%
b | 412 | 4.1%
l | 145 | 1.5%
z | 135 | 1.4%
w | 89 | 0.9%
e | 81 | 0.8%
r | 64 | 0.6%
x | 40 | 0.4%
h | 22 | 0.2%
p | 19 | 0.2%
Other values (3) | 42 | 0.4%

출판년도 (Publication year)
Real number (ℝ)

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2001.3512
Minimum: 1956
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1956
5th percentile: 1986
Q1: 1996
Median: 2002
Q3: 2007
95th percentile: 2015
Maximum: 2019
Range: 63
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.9013809
Coefficient of variation (CV): 0.0044476856
Kurtosis: 1.1703731
Mean: 2001.3512
Median Absolute Deviation (MAD): 5
Skewness: -0.70064161
Sum: 20013512
Variance: 79.234582
Monotonicity: Not monotonic
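
Several of these statistics are simple functions of the others, which makes them easy to cross-check. The sketch below recomputes the range, interquartile range, mean, variance, and coefficient of variation from the figures reported for this variable; the variable names are illustrative, and the checks rely on the usual definitions (variance as the square of the standard deviation, CV as standard deviation over mean).

```python
# Figures copied from the statistics reported for this variable.
minimum, q1, q3, maximum = 1956, 1996, 2007, 2019
std, total, n_observations = 8.9013809, 20013512, 10_000

value_range = maximum - minimum   # reported: 63
iqr = q3 - q1                     # reported: 11
mean = total / n_observations     # reported: 2001.3512
variance = std ** 2               # reported: 79.234582
cv = std / mean                   # reported: 0.0044476856

print(value_range, iqr, mean, round(variance, 6), round(cv, 10))
```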
[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values

Value | Count | Frequency (%)
2006 | 578 | 5.8%
2004 | 564 | 5.6%
2005 | 555 | 5.5%
2007 | 552 | 5.5%
2001 | 525 | 5.2%
2003 | 475 | 4.8%
2002 | 468 | 4.7%
1994 | 408 | 4.1%
2000 | 370 | 3.7%
1997 | 350 | 3.5%
Other values (49) | 5155 | 51.5%

Smallest 10 values

Value | Count | Frequency (%)
1956 | 1 | < 0.1%
1961 | 3 | < 0.1%
1962 | 13 | 0.1%
1963 | 2 | < 0.1%
1964 | 1 | < 0.1%
1966 | 4 | < 0.1%
1967 | 8 | 0.1%
1968 | 1 | < 0.1%
1969 | 8 | 0.1%
1970 | 22 | 0.2%

Largest 10 values

Value | Count | Frequency (%)
2019 | 71 | 0.7%
2018 | 78 | 0.8%
2017 | 87 | 0.9%
2016 | 228 | 2.3%
2015 | 142 | 1.4%
2014 | 166 | 1.7%
2013 | 198 | 2.0%
2012 | 189 | 1.9%
2011 | 266 | 2.7%
2010 | 237 | 2.4%

Interactions


Correlations

 | 별치 | 출판년도
별치 | 1.000 | 0.533
출판년도 | 0.533 | 1.000

 | 출판년도 | 별치
출판년도 | 1.000 | 0.256
별치 | 0.256 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 청구기호 | 서명 | 저자 | 출판사 | 별치 | 출판년도
13754 | 359.01 이14ㅇ | 아래로부터의 정부개혁 | 이계식 | 博英社 | <NA> | 1997
5426 | 61B 02-063 | Changes in industrial interdependency betweeb Japan and Korea since 1985 | Lee HongBae | Korea Institute for International economic policy | <NA> | 2002
5792 | 90Z 94-026 | 환경청정기술개발의 국제적동향파악 및 종합추진전략 방안에 관한 연구 | 신명교 | 한국환경기술개발원 | <NA> | 1994
14288 | 320.911 사15ㅅ | 세계속의 한국경제 | 사공일 | 김영사 | <NA> | 1993
2108 | 00A 92-008 | 미국 ABC 및 영국 IBA 광고방송 기준 | 방송위원회 | 방송위원회 | <NA> | 1992
10758 | 070.4 안44ㅎ | 행동하는 언론, 공공 저널리즘 | 안병길 | 전망 | <NA> | 2005
6913 | 68A 05-014 | 디지털 시대 한·일 양국의 저작권 법제와 처리관행에 관한 비교 연구 | 한국방송광고공사 | 한국방송광고공사 | <NA> | 2005
15351 | 005.3 엣829ㅍ | 프레젠테이션을 부탁해 | 엣킨슨, 클리프 | 정보문화사 | <NA> | 2009
5849 | 36A 96-002 | 방송관계법 개정방향에 관한 공청회 | 국회 제도개선특별위원회 | 국회제도개선특별위원회 | <NA> | 1996
5538 | 61B 03-084 | European integration and the Asia-pacific region | 김흥종 | Korea Institute for International economic policy | <NA> | 2003

Index | 청구기호 | 서명 | 저자 | 출판사 | 별치 | 출판년도
2623 | 21A 00-014 | 지역공동체와 저널리즘 | 한국언론재단 | 한국언론재단 | <NA> | 2000
6851 | 67Z 83-001 | TV광고방송량 83상반기 | 한국광보문화연구원 | 한국광보문화연구원 | <NA> | 1983
1161 | 21B 04-011 | KI 도입기반 구축 연구 결과 발표 및 토론회 | 한국언론학회 | 한국언론학회 | <NA> | 2004
4487 | 35A 03-029 | 규제영향분석 지침서 및 교재개발 | 윤종설 | 한국행정연구원 | <NA> | 2003
15729 | 326.14 이59ㅁ | 미디어 소비자 광고의 변화 | 이시훈 | 한경사 | <NA> | 2008
6278 | 83C 99-002 | (언론개혁시민연대 토론회)방송개혁, 이제 시작이다 | 언론개혁시민연대 | 언론개혁시민연대 | <NA> | 1999
1311 | 00A 06-053 | T-Commerce의 방송산업 파급효과와 정책방안에 관한 연구 | 주정민 | 방송위원회 | <NA> | 2006
3798 | 36A 95-003 | (국회)경과보고서 | 국회사무처 | 국회사무처 | <NA> | 1995
13355 | 367.564 우94ㅂ | (왕초보 박과장)부동산 경매로 집고 사고 돈도 벌다 | 우형달 | 원앤원북스 | <NA> | 2006
4243 | 31A 03-002 | 문화예술인실태조사 | 문화관광부 | 문화관광부: 한국문화관광정책연구원 | <NA> | 2003

Duplicate rows

Most frequently occurring

Index | 청구기호 | 서명 | 저자 | 출판사 | 별치 | 출판년도 | # duplicates
0 | 02A 05-003 | 외국방송 재송신 승인 정책 수립을 위한 전문가 토론회 | 방송위원회 | 방송위원회 | <NA> | 2005 | 2