Overview

Dataset statistics

Number of variables: 12
Number of observations: 1708
Missing cells: 35
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 160.3 KiB
Average record size in memory: 96.1 B

Variable types

Unsupported: 4
Text: 6
Categorical: 2

Dataset

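Description: New-book listings for the 18 public libraries of Yongin-si, Gyeonggi-do. Provides fields such as reading room name, title, author, and publication year. ※ Data reference date: 2023-04-30
URL: https://www.data.go.kr/data/3038399/fileData.do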

Alerts

Unnamed: 8 is highly overall correlated with Unnamed: 3 [High correlation]
Unnamed: 3 is highly overall correlated with Unnamed: 8 [High correlation]
Unnamed: 8 is highly imbalanced (99.3%) [Imbalance]
Unnamed: 1 has unique values [Unique]
수지도서관 신간도서 정보 (2023.01.01~2023.04.30. 배가도서) is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
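
The 99.3% imbalance reported for Unnamed: 8 is consistent with a score of one minus the normalized Shannon entropy of the value counts. The sketch below assumes that definition (the report itself does not state its formula) and uses the counts listed for that variable: 1707 blank values plus one stray header cell. The name `imbalance_score` is illustrative and is not taken from ydata-profiling.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class carries no notion of balance
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(k)


# Value counts reported for "Unnamed: 8": the blank value and the header cell.
score = imbalance_score([1707, 1])
print(f"{score:.1%}")  # 99.3%
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one value approaches 1, which is why a single stray header cell is enough to trigger the alert here.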

Reproduction

Analysis started: 2023-12-12 15:38:05.346921
Analysis finished: 2023-12-12 15:38:07.639389
Duration: 2.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

수지도서관 신간도서 정보 (2023.01.01~2023.04.30. 배가도서)
Unsupported

Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Unnamed: 1
Text

UNIQUE 

Distinct: 1708
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Length

Max length: 12
Median length: 12
Mean length: 11.995316
Min length: 4

Characters and Unicode

Total characters: 20488
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
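
As a rough illustration of the category counts that follow, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The input is three values taken from the Sample rows of this variable. The helper name `category_counts` is hypothetical, and the two-letter codes it returns (`Lo`, `Lu`, `Nd`) correspond to the long names used in this report (Other Letter, Uppercase Letter, Decimal Number).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# The header cell plus two registration numbers from the Sample rows.
sample = ["등록번호", "JM0000110714", "JM0000110715"]
counts = category_counts(sample)
print(counts["Lo"], counts["Lu"], counts["Nd"])  # 4 4 20
```

The four `Lo` characters come entirely from the header cell, which matches the "Other Letter 4" row reported for the full column.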

Unique

Unique: 1708
Unique (%): 100.0%

Sample

1st row: 등록번호
2nd row: JM0000110714
3rd row: JM0000110715
4th row: JM0000110716
5th row: JM0000110717
ValueCountFrequency (%)
등록번호 1
 
0.1%
sm0000220650 1
 
0.1%
sm0000220648 1
 
0.1%
sm0000220647 1
 
0.1%
sm0000220646 1
 
0.1%
sm0000220644 1
 
0.1%
sm0000220643 1
 
0.1%
sm0000220642 1
 
0.1%
sm0000220641 1
 
0.1%
sm0000220640 1
 
0.1%
Other values (1698) 1698
99.4%

Most occurring characters

ValueCountFrequency (%)
0 8536
41.7%
2 2865
 
14.0%
M 1707
 
8.3%
1 1333
 
6.5%
S 1174
 
5.7%
8 760
 
3.7%
6 678
 
3.3%
3 678
 
3.3%
4 646
 
3.2%
9 542
 
2.6%
Other values (7) 1569
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17070
83.3%
Uppercase Letter 3414
 
16.7%
Other Letter 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8536
50.0%
2 2865
 
16.8%
1 1333
 
7.8%
8 760
 
4.5%
6 678
 
4.0%
3 678
 
4.0%
4 646
 
3.8%
9 542
 
3.2%
7 539
 
3.2%
5 493
 
2.9%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Uppercase Letter
ValueCountFrequency (%)
M 1707
50.0%
S 1174
34.4%
J 533
 
15.6%

Most occurring scripts

ValueCountFrequency (%)
Common 17070
83.3%
Latin 3414
 
16.7%
Hangul 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8536
50.0%
2 2865
 
16.8%
1 1333
 
7.8%
8 760
 
4.5%
6 678
 
4.0%
3 678
 
4.0%
4 646
 
3.8%
9 542
 
3.2%
7 539
 
3.2%
5 493
 
2.9%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Latin
ValueCountFrequency (%)
M 1707
50.0%
S 1174
34.4%
J 533
 
15.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20484
> 99.9%
Hangul 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8536
41.7%
2 2865
 
14.0%
M 1707
 
8.3%
1 1333
 
6.5%
S 1174
 
5.7%
8 760
 
3.7%
6 678
 
3.3%
3 678
 
3.3%
4 646
 
3.2%
9 542
 
2.6%
Other values (3) 1565
 
7.6%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 2
Text

Distinct: 1702
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Length

Max length: 24
Median length: 21
Mean length: 12.37822
Min length: 4

Characters and Unicode

Total characters: 21142
Distinct characters: 340
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1696
Unique (%): 99.3%
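
The gap between Distinct (1702) and Unique (1696) is easier to read with a small example. The sketch below assumes that "Distinct" counts different values and "Unique" counts values occurring exactly once. Under that reading, the figures for this column over 1708 rows imply six values that each appear twice. The labels are synthetic stand-ins, not real call numbers, and `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


# Synthetic column: 1696 one-off values plus 6 values that each occur twice.
values = [f"single-{i}" for i in range(1696)] + [f"pair-{i}" for i in range(6)] * 2
print(len(values), distinct_and_unique(values))  # 1708 (1702, 1696)
```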

Sample

1st row: 청구기호
2nd row: 점자 813.8-전94ㄷ
3rd row: J 843-C777b
4th row: J 843-G894b-v.10
5th row: J 843-G894b-v.11
ValueCountFrequency (%)
338
 
14.7%
만화 87
 
3.8%
유아 78
 
3.4%
j 45
 
2.0%
특화 18
 
0.8%
fl 11
 
0.5%
r 4
 
0.2%
용인 4
 
0.2%
843-코884ㅈ 2
 
0.1%
813.8-민74ㅁ-v.2 2
 
0.1%
Other values (1700) 1705
74.3%

Most occurring characters

ValueCountFrequency (%)
- 2183
 
10.3%
8 2026
 
9.6%
. 1761
 
8.3%
1 1565
 
7.4%
3 1490
 
7.0%
2 1243
 
5.9%
4 1081
 
5.1%
7 1022
 
4.8%
5 938
 
4.4%
9 935
 
4.4%
Other values (330) 6898
32.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11680
55.2%
Other Letter 4020
 
19.0%
Dash Punctuation 2183
 
10.3%
Other Punctuation 1761
 
8.3%
Lowercase Letter 652
 
3.1%
Space Separator 586
 
2.8%
Math Symbol 133
 
0.6%
Uppercase Letter 127
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
435
 
10.8%
373
 
9.3%
205
 
5.1%
169
 
4.2%
148
 
3.7%
138
 
3.4%
137
 
3.4%
127
 
3.2%
119
 
3.0%
111
 
2.8%
Other values (280) 2058
51.2%
Lowercase Letter
ValueCountFrequency (%)
v 463
71.0%
c 136
 
20.9%
b 17
 
2.6%
s 7
 
1.1%
w 4
 
0.6%
h 4
 
0.6%
a 4
 
0.6%
d 4
 
0.6%
l 3
 
0.5%
m 2
 
0.3%
Other values (8) 8
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
J 47
37.0%
L 14
 
11.0%
G 14
 
11.0%
F 12
 
9.4%
S 6
 
4.7%
R 6
 
4.7%
M 6
 
4.7%
E 4
 
3.1%
N 3
 
2.4%
C 3
 
2.4%
Other values (8) 12
 
9.4%
Decimal Number
ValueCountFrequency (%)
8 2026
17.3%
1 1565
13.4%
3 1490
12.8%
2 1243
10.6%
4 1081
9.3%
7 1022
8.8%
5 938
8.0%
9 935
8.0%
6 783
 
6.7%
0 597
 
5.1%
Dash Punctuation
ValueCountFrequency (%)
- 2183
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1761
100.0%
Space Separator
ValueCountFrequency (%)
586
100.0%
Math Symbol
ValueCountFrequency (%)
= 133
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16343
77.3%
Hangul 4020
 
19.0%
Latin 779
 
3.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
435
 
10.8%
373
 
9.3%
205
 
5.1%
169
 
4.2%
148
 
3.7%
138
 
3.4%
137
 
3.4%
127
 
3.2%
119
 
3.0%
111
 
2.8%
Other values (280) 2058
51.2%
Latin
ValueCountFrequency (%)
v 463
59.4%
c 136
 
17.5%
J 47
 
6.0%
b 17
 
2.2%
L 14
 
1.8%
G 14
 
1.8%
F 12
 
1.5%
s 7
 
0.9%
S 6
 
0.8%
R 6
 
0.8%
Other values (26) 57
 
7.3%
Common
ValueCountFrequency (%)
- 2183
13.4%
8 2026
12.4%
. 1761
10.8%
1 1565
9.6%
3 1490
9.1%
2 1243
7.6%
4 1081
6.6%
7 1022
6.3%
5 938
5.7%
9 935
5.7%
Other values (4) 2099
12.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17122
81.0%
Hangul 2369
 
11.2%
Compat Jamo 1651
 
7.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 2183
12.7%
8 2026
11.8%
. 1761
10.3%
1 1565
9.1%
3 1490
8.7%
2 1243
7.3%
4 1081
 
6.3%
7 1022
 
6.0%
5 938
 
5.5%
9 935
 
5.5%
Other values (40) 2878
16.8%
Hangul
ValueCountFrequency (%)
435
 
18.4%
169
 
7.1%
137
 
5.8%
105
 
4.4%
91
 
3.8%
89
 
3.8%
61
 
2.6%
47
 
2.0%
40
 
1.7%
34
 
1.4%
Other values (261) 1161
49.0%
Compat Jamo
ValueCountFrequency (%)
373
22.6%
205
12.4%
148
 
9.0%
138
 
8.4%
127
 
7.7%
119
 
7.2%
111
 
6.7%
94
 
5.7%
88
 
5.3%
76
 
4.6%
Other values (9) 172
10.4%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

[수지]제1종합자료실: 725
[수지]어린이자료실: 518
[수지]제2종합자료실: 464
자료실명: 1

Length

Max length: 11
Median length: 11
Mean length: 10.692623
Min length: 4

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 자료실명
2nd row: [수지]제1종합자료실
3rd row: [수지]제1종합자료실
4th row: [수지]어린이자료실
5th row: [수지]어린이자료실

Common Values

Value | Count | Frequency (%)
[수지]제1종합자료실 | 725 | 42.4%
[수지]어린이자료실 | 518 | 30.3%
[수지]제2종합자료실 | 464 | 27.2%
자료실명 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; same counts as the Common Values table]

Unnamed: 4
Text

Distinct: 1622
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Length

Max length: 170
Median length: 67
Mean length: 26.92096
Min length: 1

Characters and Unicode

Total characters: 45981
Distinct characters: 1097
Distinct categories: 15
Distinct scripts: 5
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1536
Unique (%): 89.9%

Sample

1st row: 서명
2nd row: 독도 수중 동굴의 비밀= Dokdo, Secrets of The Underwater Cave
3rd row: (The)bear under the stairs
4th row: Bear Grylls adventures. [10], (The) Mountain Challenge
5th row: Bear Grylls adventures. [11], The Arctic Challenge
ValueCountFrequency (%)
431
 
3.7%
장편소설 81
 
0.7%
위한 77
 
0.7%
1 71
 
0.6%
2 64
 
0.6%
이야기 49
 
0.4%
the 44
 
0.4%
35
 
0.3%
3 34
 
0.3%
33
 
0.3%
Other values (6291) 10654
92.1%

Most occurring characters

ValueCountFrequency (%)
10362
 
22.5%
: 920
 
2.0%
894
 
1.9%
800
 
1.7%
681
 
1.5%
, 496
 
1.1%
433
 
0.9%
407
 
0.9%
393
 
0.9%
388
 
0.8%
Other values (1087) 30207
65.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 28727
62.5%
Space Separator 10362
 
22.5%
Lowercase Letter 2619
 
5.7%
Other Punctuation 2002
 
4.4%
Decimal Number 1082
 
2.4%
Uppercase Letter 507
 
1.1%
Close Punctuation 280
 
0.6%
Open Punctuation 280
 
0.6%
Math Symbol 79
 
0.2%
Dash Punctuation 32
 
0.1%
Other values (5) 11
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
894
 
3.1%
800
 
2.8%
681
 
2.4%
433
 
1.5%
407
 
1.4%
393
 
1.4%
388
 
1.4%
383
 
1.3%
350
 
1.2%
340
 
1.2%
Other values (992) 23658
82.4%
Lowercase Letter
ValueCountFrequency (%)
e 339
12.9%
a 233
 
8.9%
r 214
 
8.2%
o 202
 
7.7%
t 196
 
7.5%
n 185
 
7.1%
i 176
 
6.7%
s 163
 
6.2%
l 144
 
5.5%
h 122
 
4.7%
Other values (16) 645
24.6%
Uppercase Letter
ValueCountFrequency (%)
T 71
14.0%
S 50
 
9.9%
B 39
 
7.7%
G 37
 
7.3%
C 33
 
6.5%
A 29
 
5.7%
I 26
 
5.1%
P 22
 
4.3%
E 20
 
3.9%
O 19
 
3.7%
Other values (16) 161
31.8%
Other Punctuation
ValueCountFrequency (%)
: 920
46.0%
, 496
24.8%
. 309
 
15.4%
! 116
 
5.8%
? 51
 
2.5%
· 42
 
2.1%
' 28
 
1.4%
/ 25
 
1.2%
& 10
 
0.5%
; 3
 
0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 250
23.1%
2 232
21.4%
0 192
17.7%
3 116
10.7%
5 73
 
6.7%
4 69
 
6.4%
6 41
 
3.8%
7 40
 
3.7%
9 38
 
3.5%
8 31
 
2.9%
Math Symbol
ValueCountFrequency (%)
= 51
64.6%
~ 17
 
21.5%
× 3
 
3.8%
+ 3
 
3.8%
> 2
 
2.5%
< 2
 
2.5%
| 1
 
1.3%
Close Punctuation
ValueCountFrequency (%)
) 256
91.4%
] 23
 
8.2%
1
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 256
91.4%
[ 23
 
8.2%
1
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 31
96.9%
1
 
3.1%
Space Separator
ValueCountFrequency (%)
10362
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 6
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 28695
62.4%
Common 14126
30.7%
Latin 3128
 
6.8%
Han 31
 
0.1%
Hiragana 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
894
 
3.1%
800
 
2.8%
681
 
2.4%
433
 
1.5%
407
 
1.4%
393
 
1.4%
388
 
1.4%
383
 
1.3%
350
 
1.2%
340
 
1.2%
Other values (963) 23626
82.3%
Latin
ValueCountFrequency (%)
e 339
 
10.8%
a 233
 
7.4%
r 214
 
6.8%
o 202
 
6.5%
t 196
 
6.3%
n 185
 
5.9%
i 176
 
5.6%
s 163
 
5.2%
l 144
 
4.6%
h 122
 
3.9%
Other values (43) 1154
36.9%
Common
ValueCountFrequency (%)
10362
73.4%
: 920
 
6.5%
, 496
 
3.5%
. 309
 
2.2%
) 256
 
1.8%
( 256
 
1.8%
1 250
 
1.8%
2 232
 
1.6%
0 192
 
1.4%
3 116
 
0.8%
Other values (32) 737
 
5.2%
Han
ValueCountFrequency (%)
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (18) 18
58.1%
Hiragana
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 28695
62.4%
ASCII 17202
37.4%
None 47
 
0.1%
CJK 31
 
0.1%
Punctuation 3
 
< 0.1%
Number Forms 2
 
< 0.1%
Hiragana 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10362
60.2%
: 920
 
5.3%
, 496
 
2.9%
e 339
 
2.0%
. 309
 
1.8%
) 256
 
1.5%
( 256
 
1.5%
1 250
 
1.5%
a 233
 
1.4%
2 232
 
1.3%
Other values (77) 3549
 
20.6%
Hangul
ValueCountFrequency (%)
894
 
3.1%
800
 
2.8%
681
 
2.4%
433
 
1.5%
407
 
1.4%
393
 
1.4%
388
 
1.4%
383
 
1.3%
350
 
1.2%
340
 
1.2%
Other values (963) 23626
82.3%
None
ValueCountFrequency (%)
· 42
89.4%
× 3
 
6.4%
1
 
2.1%
1
 
2.1%
CJK
ValueCountFrequency (%)
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (18) 18
58.1%
Number Forms
ValueCountFrequency (%)
2
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Hiragana
ValueCountFrequency (%)
1
100.0%

Unnamed: 5
Text

Distinct: 1514
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Length

Max length: 261
Median length: 94
Mean length: 14.469555
Min length: 2

Characters and Unicode

Total characters: 24714
Distinct characters: 670
Distinct categories: 11
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1371
Unique (%): 80.3%

Sample

1st row: 저작자
2nd row: 전혜선 글.그림
3rd row: Helen Cooper
4th row: by Bear Grylls ; illustrated by Emma McCann
5th row: Bear Grylls; illustrated by Emma McCann
ValueCountFrequency (%)
지음 1163
 
16.7%
옮김 481
 
6.9%
그림 397
 
5.7%
301
 
4.3%
253
 
3.6%
글·그림 72
 
1.0%
by 57
 
0.8%
공]지음 28
 
0.4%
원작 27
 
0.4%
27
 
0.4%
Other values (2975) 4153
59.7%

Most occurring characters

ValueCountFrequency (%)
5270
21.3%
1377
 
5.6%
1222
 
4.9%
; 998
 
4.0%
873
 
3.5%
557
 
2.3%
500
 
2.0%
498
 
2.0%
491
 
2.0%
407
 
1.6%
Other values (660) 12521
50.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16161
65.4%
Space Separator 5270
 
21.3%
Other Punctuation 1550
 
6.3%
Lowercase Letter 1220
 
4.9%
Uppercase Letter 274
 
1.1%
Open Punctuation 105
 
0.4%
Close Punctuation 105
 
0.4%
Decimal Number 22
 
0.1%
Dash Punctuation 4
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1377
 
8.5%
1222
 
7.6%
873
 
5.4%
557
 
3.4%
500
 
3.1%
498
 
3.1%
491
 
3.0%
407
 
2.5%
273
 
1.7%
272
 
1.7%
Other values (586) 9691
60.0%
Lowercase Letter
ValueCountFrequency (%)
a 141
11.6%
e 118
9.7%
r 114
 
9.3%
l 110
 
9.0%
t 94
 
7.7%
y 83
 
6.8%
n 81
 
6.6%
i 70
 
5.7%
b 67
 
5.5%
s 57
 
4.7%
Other values (16) 285
23.4%
Uppercase Letter
ValueCountFrequency (%)
M 28
10.2%
G 26
 
9.5%
S 25
 
9.1%
E 23
 
8.4%
B 21
 
7.7%
C 21
 
7.7%
T 18
 
6.6%
P 15
 
5.5%
A 15
 
5.5%
D 13
 
4.7%
Other values (13) 69
25.2%
Decimal Number
ValueCountFrequency (%)
0 5
22.7%
1 4
18.2%
7 4
18.2%
3 3
13.6%
2 2
 
9.1%
9 2
 
9.1%
8 2
 
9.1%
Other Punctuation
ValueCountFrequency (%)
; 998
64.4%
, 361
 
23.3%
· 82
 
5.3%
. 57
 
3.7%
: 52
 
3.4%
Open Punctuation
ValueCountFrequency (%)
[ 86
81.9%
( 17
 
16.2%
1
 
1.0%
1
 
1.0%
Close Punctuation
ValueCountFrequency (%)
] 86
81.9%
) 17
 
16.2%
1
 
1.0%
1
 
1.0%
Math Symbol
ValueCountFrequency (%)
> 1
50.0%
< 1
50.0%
Space Separator
ValueCountFrequency (%)
5270
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16159
65.4%
Common 7058
28.6%
Latin 1494
 
6.0%
Han 2
 
< 0.1%
Inherited 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1377
 
8.5%
1222
 
7.6%
873
 
5.4%
557
 
3.4%
500
 
3.1%
498
 
3.1%
491
 
3.0%
407
 
2.5%
273
 
1.7%
272
 
1.7%
Other values (584) 9689
60.0%
Latin
ValueCountFrequency (%)
a 141
 
9.4%
e 118
 
7.9%
r 114
 
7.6%
l 110
 
7.4%
t 94
 
6.3%
y 83
 
5.6%
n 81
 
5.4%
i 70
 
4.7%
b 67
 
4.5%
s 57
 
3.8%
Other values (39) 559
37.4%
Common
ValueCountFrequency (%)
5270
74.7%
; 998
 
14.1%
, 361
 
5.1%
[ 86
 
1.2%
] 86
 
1.2%
· 82
 
1.2%
. 57
 
0.8%
: 52
 
0.7%
( 17
 
0.2%
) 17
 
0.2%
Other values (14) 32
 
0.5%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%
Inherited
ValueCountFrequency (%)
́ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16159
65.4%
ASCII 8465
34.3%
None 87
 
0.4%
CJK 2
 
< 0.1%
Diacriticals 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5270
62.3%
; 998
 
11.8%
, 361
 
4.3%
a 141
 
1.7%
e 118
 
1.4%
r 114
 
1.3%
l 110
 
1.3%
t 94
 
1.1%
[ 86
 
1.0%
] 86
 
1.0%
Other values (57) 1087
 
12.8%
Hangul
ValueCountFrequency (%)
1377
 
8.5%
1222
 
7.6%
873
 
5.4%
557
 
3.4%
500
 
3.1%
498
 
3.1%
491
 
3.0%
407
 
2.5%
273
 
1.7%
272
 
1.7%
Other values (584) 9689
60.0%
None
ValueCountFrequency (%)
· 82
94.3%
1
 
1.1%
1
 
1.1%
1
 
1.1%
1
 
1.1%
ê 1
 
1.1%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%
Diacriticals
ValueCountFrequency (%)
́ 1
100.0%

Unnamed: 6
Text

Distinct: 936
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

Length

Max length: 40
Median length: 27
Mean length: 5.2271663
Min length: 1

Characters and Unicode

Total characters: 8928
Distinct characters: 540
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 630
Unique (%): 36.9%

Sample

1st row: 발행자
2nd row: 에버그린 솔페지(evergreen solfege)
3rd row: Picture Corgi:
4th row: Bonnier Zaffre
5th row: Bear Grylls
ValueCountFrequency (%)
86
 
4.5%
문학동네 35
 
1.8%
위즈덤하우스 33
 
1.7%
비룡소 25
 
1.3%
창비 22
 
1.1%
books 20
 
1.0%
아울북 19
 
1.0%
김영사 19
 
1.0%
민음사 19
 
1.0%
서울문화사 15
 
0.8%
Other values (887) 1636
84.8%

Most occurring characters

ValueCountFrequency (%)
350
 
3.9%
244
 
2.7%
235
 
2.6%
220
 
2.5%
210
 
2.4%
: 205
 
2.3%
o 160
 
1.8%
154
 
1.7%
122
 
1.4%
115
 
1.3%
Other values (530) 6913
77.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6782
76.0%
Lowercase Letter 1157
 
13.0%
Uppercase Letter 361
 
4.0%
Space Separator 235
 
2.6%
Other Punctuation 225
 
2.5%
Open Punctuation 67
 
0.8%
Close Punctuation 67
 
0.8%
Decimal Number 32
 
0.4%
Connector Punctuation 1
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
350
 
5.2%
244
 
3.6%
220
 
3.2%
210
 
3.1%
154
 
2.3%
122
 
1.8%
115
 
1.7%
113
 
1.7%
100
 
1.5%
95
 
1.4%
Other values (466) 5059
74.6%
Lowercase Letter
ValueCountFrequency (%)
o 160
13.8%
s 114
9.9%
r 107
9.2%
e 103
8.9%
a 92
 
8.0%
i 83
 
7.2%
n 75
 
6.5%
l 74
 
6.4%
k 56
 
4.8%
t 39
 
3.4%
Other values (15) 254
22.0%
Uppercase Letter
ValueCountFrequency (%)
B 71
19.7%
H 36
 
10.0%
S 29
 
8.0%
P 25
 
6.9%
K 25
 
6.9%
R 23
 
6.4%
O 18
 
5.0%
G 17
 
4.7%
C 16
 
4.4%
D 11
 
3.0%
Other values (13) 90
24.9%
Other Punctuation
ValueCountFrequency (%)
: 205
91.1%
. 8
 
3.6%
& 4
 
1.8%
' 3
 
1.3%
, 2
 
0.9%
1
 
0.4%
# 1
 
0.4%
/ 1
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 17
53.1%
1 13
40.6%
6 2
 
6.2%
Space Separator
ValueCountFrequency (%)
235
100.0%
Open Punctuation
ValueCountFrequency (%)
( 67
100.0%
Close Punctuation
ValueCountFrequency (%)
) 67
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6779
75.9%
Latin 1518
 
17.0%
Common 628
 
7.0%
Han 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
350
 
5.2%
244
 
3.6%
220
 
3.2%
210
 
3.1%
154
 
2.3%
122
 
1.8%
115
 
1.7%
113
 
1.7%
100
 
1.5%
95
 
1.4%
Other values (463) 5056
74.6%
Latin
ValueCountFrequency (%)
o 160
 
10.5%
s 114
 
7.5%
r 107
 
7.0%
e 103
 
6.8%
a 92
 
6.1%
i 83
 
5.5%
n 75
 
4.9%
l 74
 
4.9%
B 71
 
4.7%
k 56
 
3.7%
Other values (38) 583
38.4%
Common
ValueCountFrequency (%)
235
37.4%
: 205
32.6%
( 67
 
10.7%
) 67
 
10.7%
2 17
 
2.7%
1 13
 
2.1%
. 8
 
1.3%
& 4
 
0.6%
' 3
 
0.5%
6 2
 
0.3%
Other values (6) 7
 
1.1%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6777
75.9%
ASCII 2145
 
24.0%
CJK 3
 
< 0.1%
Compat Jamo 2
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
350
 
5.2%
244
 
3.6%
220
 
3.2%
210
 
3.1%
154
 
2.3%
122
 
1.8%
115
 
1.7%
113
 
1.7%
100
 
1.5%
95
 
1.4%
Other values (461) 5054
74.6%
ASCII
ValueCountFrequency (%)
235
 
11.0%
: 205
 
9.6%
o 160
 
7.5%
s 114
 
5.3%
r 107
 
5.0%
e 103
 
4.8%
a 92
 
4.3%
i 83
 
3.9%
n 75
 
3.5%
l 74
 
3.4%
Other values (53) 897
41.8%
None
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%

Unnamed: 7
Unsupported

REJECTED  UNSUPPORTED 

Missing0
Missing (%)0.0%
Memory size13.5 KiB

Unnamed: 8
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.5 KiB

(blank): 1707
구분: 1

Length

Max length: 2
Median length: 1
Mean length: 1.0005855
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 구분
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 1707 | 99.9%
구분 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; same counts as the Common Values table]

Unnamed: 9
Text

Distinct: 1536
Distinct (%): 90.8%
Missing: 16
Missing (%): 0.9%
Memory size: 13.5 KiB

Length

Max length: 58
Median length: 52
Mean length: 16.472813
Min length: 4

Characters and Unicode

Total characters: 27872
Distinct characters: 139
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1410
Unique (%): 83.3%

Sample

1st row: 형태사항
2nd row: 47 p. ;삽화:22 cm
3rd row: 1 v.;col. ill.:27 cm+CD-ROM 1매
4th row: 139 p. ;ill. :20 cm
5th row: 116 p.;ill.:20 cm
ValueCountFrequency (%)
cm 1553
25.9%
p 493
 
8.2%
천연색삽화 164
 
2.7%
21 122
 
2.0%
p.;21 112
 
1.9%
삽화 102
 
1.7%
23 87
 
1.4%
p.;삽화:21 75
 
1.2%
22 72
 
1.2%
p.;20 63
 
1.0%
Other values (832) 3163
52.7%

Most occurring characters

ValueCountFrequency (%)
4318
15.5%
2 2628
 
9.4%
. 1737
 
6.2%
c 1711
 
6.1%
; 1694
 
6.1%
m 1691
 
6.1%
p 1688
 
6.1%
1 1408
 
5.1%
: 1156
 
4.1%
1066
 
3.8%
Other values (129) 8775
31.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8374
30.0%
Lowercase Letter 5378
19.3%
Other Punctuation 4817
17.3%
Other Letter 4772
17.1%
Space Separator 4318
15.5%
Close Punctuation 76
 
0.3%
Open Punctuation 76
 
0.3%
Math Symbol 37
 
0.1%
Uppercase Letter 20
 
0.1%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1066
22.3%
1061
22.2%
631
13.2%
630
13.2%
629
13.2%
174
 
3.6%
125
 
2.6%
53
 
1.1%
53
 
1.1%
46
 
1.0%
Other values (79) 304
 
6.4%
Lowercase Letter
ValueCountFrequency (%)
c 1711
31.8%
m 1691
31.4%
p 1688
31.4%
l 109
 
2.0%
i 59
 
1.1%
x 46
 
0.9%
o 25
 
0.5%
v 15
 
0.3%
s 6
 
0.1%
a 6
 
0.1%
Other values (9) 22
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 2628
31.4%
1 1408
16.8%
3 964
 
11.5%
4 640
 
7.6%
0 615
 
7.3%
9 512
 
6.1%
6 459
 
5.5%
5 413
 
4.9%
8 373
 
4.5%
7 362
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 1737
36.1%
; 1694
35.2%
: 1156
24.0%
, 226
 
4.7%
* 2
 
< 0.1%
· 1
 
< 0.1%
& 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
C 5
25.0%
D 5
25.0%
M 4
20.0%
O 3
15.0%
R 3
15.0%
Close Punctuation
ValueCountFrequency (%)
] 54
71.1%
) 22
28.9%
Open Punctuation
ValueCountFrequency (%)
[ 54
71.1%
( 22
28.9%
Math Symbol
ValueCountFrequency (%)
+ 27
73.0%
× 10
 
27.0%
Space Separator
ValueCountFrequency (%)
4318
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17702
63.5%
Latin 5398
 
19.4%
Hangul 4772
 
17.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1066
22.3%
1061
22.2%
631
13.2%
630
13.2%
629
13.2%
174
 
3.6%
125
 
2.6%
53
 
1.1%
53
 
1.1%
46
 
1.0%
Other values (79) 304
 
6.4%
Common
ValueCountFrequency (%)
4318
24.4%
2 2628
14.8%
. 1737
9.8%
; 1694
 
9.6%
1 1408
 
8.0%
: 1156
 
6.5%
3 964
 
5.4%
4 640
 
3.6%
0 615
 
3.5%
9 512
 
2.9%
Other values (16) 2030
11.5%
Latin
ValueCountFrequency (%)
c 1711
31.7%
m 1691
31.3%
p 1688
31.3%
l 109
 
2.0%
i 59
 
1.1%
x 46
 
0.9%
o 25
 
0.5%
v 15
 
0.3%
s 6
 
0.1%
a 6
 
0.1%
Other values (14) 42
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23089
82.8%
Hangul 4772
 
17.1%
None 11
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4318
18.7%
2 2628
11.4%
. 1737
7.5%
c 1711
 
7.4%
; 1694
 
7.3%
m 1691
 
7.3%
p 1688
 
7.3%
1 1408
 
6.1%
: 1156
 
5.0%
3 964
 
4.2%
Other values (38) 4094
17.7%
Hangul
ValueCountFrequency (%)
1066
22.3%
1061
22.2%
631
13.2%
630
13.2%
629
13.2%
174
 
3.6%
125
 
2.6%
53
 
1.1%
53
 
1.1%
46
 
1.0%
Other values (79) 304
 
6.4%
None
ValueCountFrequency (%)
× 10
90.9%
· 1
 
9.1%

Unnamed: 10
Unsupported

REJECTED  UNSUPPORTED 

Missing12
Missing (%)0.7%
Memory size13.5 KiB

Unnamed: 11
Unsupported

REJECTED  UNSUPPORTED 

Missing7
Missing (%)0.4%
Memory size13.5 KiB

Correlations

(variable) | Unnamed: 3 | Unnamed: 8
Unnamed: 3 | 1.000 | 1.000
Unnamed: 8 | 1.000 | 1.000

(variable) | Unnamed: 8 | Unnamed: 3
Unnamed: 8 | 1.000 | 0.999
Unnamed: 3 | 0.999 | 1.000

(variable) | Unnamed: 3 | Unnamed: 8
Unnamed: 3 | 1.000 | 0.999
Unnamed: 8 | 0.999 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
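
One common way to quantify nullity correlation is the Pearson correlation between the is-missing indicators of two columns. The sketch below assumes that reading. The two columns are invented toy data rather than values from this dataset, and `nullity_correlation` is a hypothetical helper name.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two equally long columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns: missing together in rows 2 and 4, and the second is also missing in row 6.
first = [10, None, 20, None, 30, 40]
second = ["a", None, "b", None, "c", None]
r = nullity_correlation(first, second)
print(round(r, 4))  # 0.7071
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present exactly when the other is absent, and a value near 0 means their gaps are unrelated.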

Sample

(index) | 수지도서관 신간도서 정보 (2023.01.01~2023.04.30. 배가도서) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11
0 | 연번 | 등록번호 | 청구기호 | 자료실명 | 서명 | 저작자 | 발행자 | 발행년 | 구분 | 형태사항 | 가격 | ISBN
1 | 1 | JM0000110714 | 점자 813.8-전94ㄷ | [수지]제1종합자료실 | 독도 수중 동굴의 비밀= Dokdo, Secrets of The Underwater Cave | 전혜선 글.그림 | 에버그린 솔페지(evergreen solfege) | 2022 | | 47 p. ;삽화:22 cm | 40000 | 979119718114
2 | 2 | JM0000110715 | J 843-C777b | [수지]제1종합자료실 | (The)bear under the stairs | Helen Cooper | Picture Corgi: | 2008 | | 1 v.;col. ill.:27 cm+CD-ROM 1매 | 14000 | 9780552558457
3 | 3 | JM0000110716 | J 843-G894b-v.10 | [수지]어린이자료실 | Bear Grylls adventures. [10], (The) Mountain Challenge | by Bear Grylls ; illustrated by Emma McCann | Bonnier Zaffre | 2020 | | 139 p. ;ill. :20 cm | 10200 | 9781786960566
4 | 4 | JM0000110717 | J 843-G894b-v.11 | [수지]어린이자료실 | Bear Grylls adventures. [11], The Arctic Challenge | Bear Grylls; illustrated by Emma McCann | Bear Grylls | 2019 | | 116 p.;ill.:20 cm | 8500 | 9781786960795
5 | 5 | JM0000110718 | J 843-G894b-v.12 | [수지]어린이자료실 | Bear Grylls adventures. [12], The Sailing Challenge | Bear Grylls; illustrated by Emma McCann | Bear Grylls | 2019 | | 115 p.;ill.:20 cm | 8500 | 9781786960818
6 | 6 | JM0000110719 | J 843-G894b-v.1 | [수지]어린이자료실 | Bear Grylls adventures. [1], (The)blizzard challenge | Bear Grylls; illustrated by Emma McCann | Bear Grylls | 2017 | | 117 p.;ill.:20 cm | 11000 | 9781786960122
7 | 7 | JM0000110720 | J 843-G894b-v.2 | [수지]어린이자료실 | Bear Grylls adventure. [2], (The) Desert challenge | by Bear Grylls, illustrated by Emma Mccann | Bear Grylls | 2017 | | 115 p.;ill.:20 cm | 8500 | 9781786960139
8 | 8 | JM0000110721 | J 843-G894b-v.3 | [수지]어린이자료실 | Bear Grylls adventures. [3], The Jungle challenge | Bear Grylls ; illustrated by Emma McCann | Bear Grylls | 2017 | | 117p.;ill.:20cm +CD 1매 | 9900 | 9781786960146
9 | 9 | JM0000110722 | J 843-G894b-v.4 | [수지]어린이자료실 | Bear Grylls adventures. [4], The sea challenge | Bear Grylls ; illustrated by Emma McCann | Bear Grylls | 2017 | | 118p.;ill.:20cm | 9900 | 9781786960153

(index) | 수지도서관 신간도서 정보 (2023.01.01~2023.04.30. 배가도서) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11
1698 | 1698 | SM0000262990 | 592.27-윤95ㅅ | [수지]제1종합자료실 | (예쁘고 만들기 쉬운) 손뜨개 인형옷 | 윤효진, 이산라, 박수진 [공]지음 | 경향BP | 2022 | | 237 p.;천연색삽화:25 cm | 22000 | 9788969525260
1699 | 1699 | SM0000262991 | 517-손54ㅂ | [수지]제1종합자료실 | 병! 도대체 왜 생길까?: 종횡무진 한의사가 정리한 '교양으로 읽는 병인 백과' | 손성훈 지음 | 북랩 | 2022 | | 454 p.;삽화, 도표:26 cm | 24000 | 9791168365988
1700 | 1700 | SM0000262992 | 556-김74ㅍ | [수지]제1종합자료실 | 퓨쳐 모빌리티 | 김정훈 지음 | 동아엠앤비 | 2022 | | 198 p.;천연색삽화:26 cm | 18000 | 9791163636311
1701 | 1701 | SM0000262993 | 592.27-이92ㅁ | [수지]제1종합자료실 | 마마랜스의 일상 니트: 누구나 쉽게 즐길 수 있는 뜨개옷과 소품 17 | 이하니 지음 | 한스미디어 | 2022 | | 200 p.;천연색삽화:27 cm | 22000 | 9791160078633
1702 | 1702 | SM0000262994 | 598.1-김66ㅇ | [수지]제1종합자료실 | 0~5세 성장 발달에 맞추는 놀이 육아 | 김원철 외 지음; 전선진 그림 | 마음책방 | 2022 | | 328 p.;천연색삽화:23 cm | 19800 | 9791190888219
1703 | 1703 | SM0000262995 | 598.1-서298ㅈ | [수지]제1종합자료실 | 조금 다른 육아의 길을 걷는 중입니다: '생각의 힘'과 '마음의 힘'을 길러주는 미래형 육아 철학 | 서린 글·그림 | 루리책방 | 2023 | | 299 p.;천연색삽화:21 cm | 18000 | 9791197333743
1704 | 1704 | SM0000262996 | 만화 657.1-네65ㅇ=c.2 | [수지]어린이자료실 | 양아치의 스피치 | 네온비 글; 김인정 그림 | 문학동네 | 2023 | | 252 p.;천연색삽화:22 cm | 16000 | 9788954690546
1705 | 1705 | SM0000262997 | 특화 984.221-조67ㄴ | [수지]제1종합자료실 | (셀프트래블) 뉴욕: '23~'24 최신판 | 조은정 지음 | 상상 | 2022 | | 267 p.;천연색삽화, 지도:21 cm+별책부록 1책 | 16500 | 9791167821102
1706 | 1706 | SM0000262998 | 특화 980.24-중62ㅂ-v.3 | [수지]제1종합자료실 | (베스트 프렌즈) 코타키나발루: 최신판 '20~'21 | 김준현 지음 | 중앙북스 | 2019 | | 136 p.;천연색삽화, 지도:21 cm | 11000 | 9788927810568
1707 | 1707 | SM0000262999 | 특화 980.24-프29ㅈ-v.12 | [수지]제1종합자료실 | (프렌즈) 독일: 최고의 독일 여행을 위한 한국인 맞춤형 가이드북 | 유상현 지음 | 중앙books: | 2022 | | 688 p.;천연색삽화, 지도:21 cm+독일 전도 1매 | 24000 | 9788927879510
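
The sample suggests that the file's first row is a sheet title and that the real column names (연번, 등록번호, ...) sit in the row below it, which would explain the `Unnamed: N` columns and the header cell counted as a value in each variable. A minimal sketch of promoting that second row to the header follows, using only the standard `csv` module. The three-line excerpt is a hypothetical stand-in for the source file, trimmed to three columns with a made-up title cell and book title.

```python
import csv
import io

# Hypothetical excerpt: a title row, then the real header row, then one data row.
raw = (
    "Sheet title,,\n"
    "연번,등록번호,서명\n"
    "1,JM0000110714,Example title\n"
)

reader = csv.reader(io.StringIO(raw))
next(reader)            # discard the sheet-title row
header = next(reader)   # the real column names
rows = [dict(zip(header, row)) for row in reader]

print(header)
print(rows[0]["등록번호"])
```

When loading with pandas instead, the readers accept a `header` argument that can point at the correct row. Re-profiling after such a fix would likely remove the stray header value from every column, though that has not been verified against this dataset.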