Overview

Dataset statistics

Number of variables: 19
Number of observations: 10000
Missing cells: 13625
Missing cells (%): 7.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.5 MiB
Average record size in memory: 162.0 B
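
The missing-cell percentage can be cross-checked from the counts in this block. The sketch below assumes the percentage is the number of missing cells divided by all cells (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics block.
n_variables = 19
n_observations = 10_000
missing_cells = 13_625

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")
```

Under that assumption the result rounds to the 7.2% shown in the report.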

Variable types

Numeric: 2
Categorical: 5
Text: 11
DateTime: 1

Dataset

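Description: Holdings list of the Danyang County Library (단양군립도서관), Danyang-gun, Chungcheongbuk-do. It includes the serial number (연번), management category (관리구분), shelving status (배가상태), usage restriction category (이용제한 구분), registration number (등록번호), call number (청구기호), shelving code (배가기호), reading room name (자료실명), title (서명), author (저작자), part/volume number (편/권차), volume title (권서명), publisher (발행자), publication year (발행년), place of publication (발행지), original/duplicate copy indicator (원/복본구분), category (구분), physical description (형태사항), and data reference date (데이터 기준일자).
Author: 충청북도 단양군 (Danyang-gun, Chungcheongbuk-do)
URL: https://www.data.go.kr/data/15106862/fileData.do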

Alerts

관리구분 has constant value "본관" [Constant]
이용제한구분 has constant value "일반" [Constant]
데이터 기준일자 has a constant value [Constant]
배가기호 is highly overall correlated with 자료실명 [High correlation]
자료실명 is highly overall correlated with 배가기호 [High correlation]
배가상태 is highly imbalanced (97.4%) [Imbalance]
원_복본구분 is highly imbalanced (93.1%) [Imbalance]
편_권차 has 4264 (42.6%) missing values [Missing]
권서명 has 8809 (88.1%) missing values [Missing]
형태사항 has 513 (5.1%) missing values [Missing]
연번 has unique values [Unique]
등록번호 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 17:19:12.391669
Analysis finished: 2023-12-12 17:19:18.298891
Duration: 5.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
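
The reported duration is the difference between the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 17:19:12.391669")
finished = datetime.fromisoformat("2023-12-12 17:19:18.298891")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```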

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25192.932
Minimum: 2
Maximum: 50811
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2493.85
Q1: 12546.75
Median: 25019
Q3: 37952.75
95-th percentile: 48146.15
Maximum: 50811
Range: 50809
Interquartile range (IQR): 25406

Descriptive statistics

Standard deviation: 14646.927
Coefficient of variation (CV): 0.58139033
Kurtosis: -1.1981815
Mean: 25192.932
Median Absolute Deviation (MAD): 12689.5
Skewness: 0.014255863
Sum: 2.5192932 × 10^8
Variance: 2.1453248 × 10^8
Monotonicity: Not monotonic
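
Several of the 연번 figures are derived from others in the same report, which makes a quick consistency check possible. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported minimum, maximum, quartiles, mean and standard deviation. The inputs are already rounded, so the results agree with the report only to within rounding; the variable names are illustrative.

```python
# Inputs copied from the 연번 statistics (rounded as shown in the report).
minimum, maximum = 2, 50811
q1, q3 = 12546.75, 37952.75
mean, std = 25192.932, 14646.927
n_observations = 10_000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_observations     # Sum (no missing values in this column)

print(value_range, iqr, round(cv, 6), f"{variance:.4e}", f"{total:.7e}")
```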
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
37771                  1       < 0.1%
12444                  1       < 0.1%
22178                  1       < 0.1%
14537                  1       < 0.1%
43615                  1       < 0.1%
8693                   1       < 0.1%
648                    1       < 0.1%
15802                  1       < 0.1%
27740                  1       < 0.1%
2632                   1       < 0.1%
Other values (9990)    9990    99.9%

Minimum 10 values

Value                  Count   Frequency (%)
2                      1       < 0.1%
7                      1       < 0.1%
13                     1       < 0.1%
18                     1       < 0.1%
23                     1       < 0.1%
25                     1       < 0.1%
29                     1       < 0.1%
32                     1       < 0.1%
33                     1       < 0.1%
35                     1       < 0.1%

Maximum 10 values

Value                  Count   Frequency (%)
50811                  1       < 0.1%
50807                  1       < 0.1%
50806                  1       < 0.1%
50804                  1       < 0.1%
50802                  1       < 0.1%
50800                  1       < 0.1%
50793                  1       < 0.1%
50785                  1       < 0.1%
50781                  1       < 0.1%
50779                  1       < 0.1%

관리구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 본관
2nd row: 본관
3rd row: 본관
4th row: 본관
5th row: 본관

Common Values

Value    Count    Frequency (%)
본관     10000    100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

배가상태
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.0084
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비치자료
2nd row: 비치자료
3rd row: 비치자료
4th row: 비치자료
5th row: 비치자료

Common Values

Value           Count    Frequency (%)
비치자료        9958     99.6%
관외대출자료    36       0.4%
특별대출자료    6        0.1%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]
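
The 97.4% imbalance reported for 배가상태 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class shares, divided by the maximum entropy for that number of classes. The sketch below is an illustration under that assumption, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy (in bits) of the class shares."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 배가상태, copied from the Common Values table.
score = imbalance_score([9958, 36, 6])
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes; values near 1 mean one class dominates, as 비치자료 does here.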

이용제한구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value    Count    Frequency (%)
일반     10000    100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

등록번호
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: ADJ000036699
2nd row: ADJ000012764
3rd row: ADJ000023929
4th row: ADD000001044
5th row: ADJ000032145
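
Every 등록번호 value is 12 characters long, and the character totals (30000 uppercase letters and 90000 digits over 10000 values) average three letters and nine digits per value. The sample rows suggest the letters come first. The sketch below validates that assumed layout; the pattern is inferred from the samples and aggregates, not stated by the source, and `is_valid_registration_number` is a hypothetical helper.

```python
import re

# Assumed layout: three uppercase ASCII letters followed by nine digits.
REGISTRATION_NUMBER = re.compile(r"[A-Z]{3}[0-9]{9}")

def is_valid_registration_number(value: str) -> bool:
    """Return True if the whole string matches the assumed layout."""
    return REGISTRATION_NUMBER.fullmatch(value) is not None

# Sample rows copied from the report.
samples = ["ADJ000036699", "ADJ000012764", "ADJ000023929",
           "ADD000001044", "ADJ000032145"]
all_valid = all(is_valid_registration_number(s) for s in samples)
print(all_valid)
```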
ValueCountFrequency (%)
adj000036699 1
 
< 0.1%
adj000010671 1
 
< 0.1%
adj000036216 1
 
< 0.1%
ade000001671 1
 
< 0.1%
adj000018644 1
 
< 0.1%
adj000007691 1
 
< 0.1%
adj000042643 1
 
< 0.1%
adj000001107 1
 
< 0.1%
abc000000048 1
 
< 0.1%
adj000005222 1
 
< 0.1%
Other values (9990) 9990
99.9%

Most occurring characters

ValueCountFrequency (%)
0 47791
39.8%
A 10390
 
8.7%
D 9959
 
8.3%
J 8435
 
7.0%
3 5890
 
4.9%
2 5884
 
4.9%
4 5881
 
4.9%
1 5862
 
4.9%
7 3804
 
3.2%
5 3751
 
3.1%
Other values (7) 12353
 
10.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 90000
75.0%
Uppercase Letter 30000
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 47791
53.1%
3 5890
 
6.5%
2 5884
 
6.5%
4 5881
 
6.5%
1 5862
 
6.5%
7 3804
 
4.2%
5 3751
 
4.2%
8 3745
 
4.2%
9 3703
 
4.1%
6 3689
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
A 10390
34.6%
D 9959
33.2%
J 8435
28.1%
E 717
 
2.4%
C 312
 
1.0%
B 185
 
0.6%
T 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 90000
75.0%
Latin 30000
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 47791
53.1%
3 5890
 
6.5%
2 5884
 
6.5%
4 5881
 
6.5%
1 5862
 
6.5%
7 3804
 
4.2%
5 3751
 
4.2%
8 3745
 
4.2%
9 3703
 
4.1%
6 3689
 
4.1%
Latin
ValueCountFrequency (%)
A 10390
34.6%
D 9959
33.2%
J 8435
28.1%
E 717
 
2.4%
C 312
 
1.0%
B 185
 
0.6%
T 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 120000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 47791
39.8%
A 10390
 
8.7%
D 9959
 
8.3%
J 8435
 
7.0%
3 5890
 
4.9%
2 5884
 
4.9%
4 5881
 
4.9%
1 5862
 
4.9%
7 3804
 
3.2%
5 3751
 
3.1%
Other values (7) 12353
 
10.3%

청구기호
Text

Distinct: 9197
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 26
Mean length: 13.9287
Min length: 1

Characters and Unicode

Total characters: 139287
Distinct characters: 578
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8826
Unique (%): 88.3%

Sample

1st row: 유아 408-말231ㅁ-28
2nd row: 아동 813.8-남55ㄱ
3rd row: 유아 813.8-오79ㅇ
4th row: 911-2-2
5th row: 아동 813.8-큰47ㅋ-3
ValueCountFrequency (%)
아동 5312
28.2%
유아 2497
 
13.2%
아동en 375
 
2.0%
딸림자료 148
 
0.8%
cd-rom 147
 
0.8%
일반 76
 
0.4%
아동ch 75
 
0.4%
아동ja 44
 
0.2%
688.6 40
 
0.2%
아동참고 36
 
0.2%
Other values (9139) 10096
53.6%

Most occurring characters

ValueCountFrequency (%)
- 15324
 
11.0%
8 13573
 
9.7%
1 10109
 
7.3%
3 9505
 
6.8%
8846
 
6.4%
8621
 
6.2%
9 6868
 
4.9%
2 6457
 
4.6%
. 6408
 
4.6%
5959
 
4.3%
Other values (568) 47617
34.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 69845
50.1%
Other Letter 35856
25.7%
Dash Punctuation 15324
 
11.0%
Space Separator 8846
 
6.4%
Other Punctuation 6423
 
4.6%
Uppercase Letter 2304
 
1.7%
Lowercase Letter 459
 
0.3%
Math Symbol 210
 
0.2%
Close Punctuation 10
 
< 0.1%
Open Punctuation 10
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8621
24.0%
5959
16.6%
2619
 
7.3%
1912
 
5.3%
1352
 
3.8%
785
 
2.2%
669
 
1.9%
635
 
1.8%
602
 
1.7%
531
 
1.5%
Other values (501) 12171
33.9%
Uppercase Letter
ValueCountFrequency (%)
N 384
16.7%
E 383
16.6%
C 250
10.9%
M 185
8.0%
D 181
7.9%
R 176
7.6%
O 166
7.2%
H 105
 
4.6%
I 85
 
3.7%
A 70
 
3.0%
Other values (14) 319
13.8%
Lowercase Letter
ValueCountFrequency (%)
s 102
22.2%
c 40
 
8.7%
m 39
 
8.5%
r 32
 
7.0%
a 31
 
6.8%
d 28
 
6.1%
b 27
 
5.9%
w 22
 
4.8%
f 20
 
4.4%
t 17
 
3.7%
Other values (14) 101
22.0%
Decimal Number
ValueCountFrequency (%)
8 13573
19.4%
1 10109
14.5%
3 9505
13.6%
9 6868
9.8%
2 6457
9.2%
4 5255
 
7.5%
0 5215
 
7.5%
5 4654
 
6.7%
7 4567
 
6.5%
6 3642
 
5.2%
Other Punctuation
ValueCountFrequency (%)
. 6408
99.8%
/ 10
 
0.2%
: 4
 
0.1%
, 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 15324
100.0%
Space Separator
ValueCountFrequency (%)
8846
100.0%
Math Symbol
ValueCountFrequency (%)
= 210
100.0%
Close Punctuation
ValueCountFrequency (%)
] 10
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 100668
72.3%
Hangul 35856
 
25.7%
Latin 2763
 
2.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8621
24.0%
5959
16.6%
2619
 
7.3%
1912
 
5.3%
1352
 
3.8%
785
 
2.2%
669
 
1.9%
635
 
1.8%
602
 
1.7%
531
 
1.5%
Other values (501) 12171
33.9%
Latin
ValueCountFrequency (%)
N 384
13.9%
E 383
13.9%
C 250
 
9.0%
M 185
 
6.7%
D 181
 
6.6%
R 176
 
6.4%
O 166
 
6.0%
H 105
 
3.8%
s 102
 
3.7%
I 85
 
3.1%
Other values (38) 746
27.0%
Common
ValueCountFrequency (%)
- 15324
15.2%
8 13573
13.5%
1 10109
10.0%
3 9505
9.4%
8846
8.8%
9 6868
6.8%
2 6457
6.4%
. 6408
6.4%
4 5255
 
5.2%
0 5215
 
5.2%
Other values (9) 13108
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 103431
74.3%
Hangul 26904
 
19.3%
Compat Jamo 8952
 
6.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 15324
14.8%
8 13573
13.1%
1 10109
9.8%
3 9505
9.2%
8846
8.6%
9 6868
6.6%
2 6457
6.2%
. 6408
6.2%
4 5255
 
5.1%
0 5215
 
5.0%
Other values (57) 15871
15.3%
Hangul
ValueCountFrequency (%)
8621
32.0%
5959
22.1%
2619
 
9.7%
466
 
1.7%
404
 
1.5%
274
 
1.0%
215
 
0.8%
162
 
0.6%
152
 
0.6%
149
 
0.6%
Other values (482) 7883
29.3%
Compat Jamo
ValueCountFrequency (%)
1912
21.4%
1352
15.1%
785
8.8%
669
 
7.5%
635
 
7.1%
602
 
6.7%
531
 
5.9%
523
 
5.8%
462
 
5.2%
354
 
4.0%
Other values (9) 1127
12.6%

배가기호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.1928
Minimum: 1
Maximum: 201
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 201
Range: 200
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 21.167249
Coefficient of variation (CV): 4.0762689
Kurtosis: 81.266677
Mean: 5.1928
Median Absolute Deviation (MAD): 0
Skewness: 9.1037813
Sum: 51928
Variance: 448.05243
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Common values

Value    Count    Frequency (%)
2        5146     51.5%
3        2395     23.9%
4        1251     12.5%
7        555      5.5%
5        341      3.4%
1        130      1.3%
201      115      1.1%
9        51       0.5%
10       11       0.1%
8        4        < 0.1%

Minimum 10 values

Value    Count    Frequency (%)
1        130      1.3%
2        5146     51.5%
3        2395     23.9%
4        1251     12.5%
5        341      3.4%
7        555      5.5%
8        4        < 0.1%
9        51       0.5%
10       11       0.1%
11       1        < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
201      115      1.1%
11       1        < 0.1%
10       11       0.1%
9        51       0.5%
8        4        < 0.1%
7        555      5.5%
5        341      3.4%
4        1251     12.5%
3        2395     23.9%
2        5146     51.5%

자료실명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 5
Mean length: 5.1917
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 유아자료실
2nd row: 아동자료실
3rd row: 유아자료실
4th row: 디지털자료실
5th row: 아동자료실

Common Values

Value               Count    Frequency (%)
아동자료실          5146     51.5%
유아자료실          2395     23.9%
디지털자료실        1251     12.5%
다문화자료실        555      5.5%
보존서고            341      3.4%
종합자료실          130      1.3%
코아루작은도서관    115      1.1%
정감록 북카페       51       0.5%
스마트도서관        11       0.1%
정리실              4        < 0.1%

[Figure: Histogram of lengths of the category]
Value               Count    Frequency (%)
아동자료실          5146     51.2%
유아자료실          2395     23.8%
디지털자료실        1251     12.4%
다문화자료실        555      5.5%
보존서고            341      3.4%
종합자료실          130      1.3%
코아루작은도서관    115      1.1%
정감록              51       0.5%
북카페              51       0.5%
스마트도서관        11       0.1%
Other values (2)    5        < 0.1%
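
The High correlation alert between 배가기호 and 자료실명 has a simple likely explanation: their value counts line up pairwise (for example, code 2 and 아동자료실 both occur 5146 times, code 3 and 유아자료실 both 2395 times), which suggests 배가기호 is a numeric code for the room. The sketch below checks whether two columns determine each other. The rows are hypothetical and the code-to-room pairing is inferred from the matching counts, not confirmed by the source.

```python
from collections import defaultdict

def is_one_to_one(pairs):
    """Return True if each left value maps to exactly one right value and vice versa."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    return (all(len(v) == 1 for v in left_to_right.values())
            and all(len(v) == 1 for v in right_to_left.values()))

# Hypothetical (배가기호, 자료실명) rows; the pairing is an assumption.
rows = [(2, "아동자료실"), (3, "유아자료실"), (2, "아동자료실"), (4, "디지털자료실")]
consistent = is_one_to_one(rows)
print(consistent)
```

If the check holds on the full data, one of the two columns is redundant for modelling purposes.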

서명
Text

Distinct: 9543
Distinct (%): 95.5%
Missing: 4
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 138
Median length: 89
Mean length: 17.689176
Min length: 1

Characters and Unicode

Total characters: 176821
Distinct characters: 1783
Distinct categories: 14
Distinct scripts: 7
Distinct blocks: 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9340
Unique (%): 93.4%

Sample

1st row: 맑은 날, 흐린 날
2nd row: 강아지도 꿈이 있다 : 남솔고 장편동화
3rd row: 우리 집에 놀러 오세요
4th row: 역사채널e. 2 -2, Vol.2
5th row: 아프지만 괜찮아
ValueCountFrequency (%)
2234
 
5.0%
이야기 372
 
0.8%
1 323
 
0.7%
the 226
 
0.5%
우리 226
 
0.5%
2 214
 
0.5%
175
 
0.4%
위한 121
 
0.3%
세계 116
 
0.3%
아이 113
 
0.3%
Other values (17767) 40472
90.8%

Most occurring characters

ValueCountFrequency (%)
34875
 
19.7%
3374
 
1.9%
2243
 
1.3%
2101
 
1.2%
) 2038
 
1.2%
( 2038
 
1.2%
: 1968
 
1.1%
e 1929
 
1.1%
, 1831
 
1.0%
1740
 
1.0%
Other values (1773) 122684
69.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 106259
60.1%
Space Separator 34875
 
19.7%
Lowercase Letter 16333
 
9.2%
Other Punctuation 6833
 
3.9%
Decimal Number 4160
 
2.4%
Uppercase Letter 2953
 
1.7%
Close Punctuation 2314
 
1.3%
Open Punctuation 2314
 
1.3%
Math Symbol 430
 
0.2%
Dash Punctuation 336
 
0.2%
Other values (4) 14
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3374
 
3.2%
2243
 
2.1%
2101
 
2.0%
1740
 
1.6%
1740
 
1.6%
1666
 
1.6%
1536
 
1.4%
1444
 
1.4%
1422
 
1.3%
1347
 
1.3%
Other values (1630) 87646
82.5%
Lowercase Letter
ValueCountFrequency (%)
e 1929
11.8%
a 1351
 
8.3%
o 1317
 
8.1%
r 1186
 
7.3%
n 1172
 
7.2%
t 1129
 
6.9%
i 1116
 
6.8%
s 1059
 
6.5%
h 859
 
5.3%
l 755
 
4.6%
Other values (45) 4460
27.3%
Uppercase Letter
ValueCountFrequency (%)
S 314
 
10.6%
T 304
 
10.3%
C 178
 
6.0%
A 177
 
6.0%
E 175
 
5.9%
W 175
 
5.9%
M 153
 
5.2%
I 148
 
5.0%
D 146
 
4.9%
H 138
 
4.7%
Other values (25) 1045
35.4%
Other Punctuation
ValueCountFrequency (%)
: 1968
28.8%
, 1831
26.8%
. 1573
23.0%
! 966
14.1%
' 171
 
2.5%
· 139
 
2.0%
/ 57
 
0.8%
; 39
 
0.6%
& 28
 
0.4%
24
 
0.4%
Other values (10) 37
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 1160
27.9%
0 619
14.9%
2 573
13.8%
3 395
 
9.5%
4 371
 
8.9%
5 329
 
7.9%
6 266
 
6.4%
7 168
 
4.0%
8 151
 
3.6%
9 128
 
3.1%
Math Symbol
ValueCountFrequency (%)
= 336
78.1%
~ 34
 
7.9%
23
 
5.3%
> 14
 
3.3%
< 14
 
3.3%
+ 9
 
2.1%
Close Punctuation
ValueCountFrequency (%)
) 2038
88.1%
] 272
 
11.8%
3
 
0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2038
88.1%
[ 272
 
11.8%
3
 
0.1%
1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
4
57.1%
1
 
14.3%
1
 
14.3%
1
 
14.3%
Space Separator
ValueCountFrequency (%)
34875
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 336
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 105243
59.5%
Common 51275
29.0%
Latin 19162
 
10.8%
Han 549
 
0.3%
Hiragana 392
 
0.2%
Cyrillic 125
 
0.1%
Katakana 75
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3374
 
3.2%
2243
 
2.1%
2101
 
2.0%
1740
 
1.7%
1740
 
1.7%
1666
 
1.6%
1536
 
1.5%
1444
 
1.4%
1422
 
1.4%
1347
 
1.3%
Other values (1248) 86630
82.3%
Han
ValueCountFrequency (%)
31
 
5.6%
12
 
2.2%
12
 
2.2%
11
 
2.0%
10
 
1.8%
9
 
1.6%
9
 
1.6%
8
 
1.5%
8
 
1.5%
7
 
1.3%
Other values (271) 432
78.7%
Hiragana
ValueCountFrequency (%)
27
 
6.9%
22
 
5.6%
21
 
5.4%
18
 
4.6%
16
 
4.1%
16
 
4.1%
16
 
4.1%
15
 
3.8%
13
 
3.3%
12
 
3.1%
Other values (51) 216
55.1%
Latin
ValueCountFrequency (%)
e 1929
 
10.1%
a 1351
 
7.1%
o 1317
 
6.9%
r 1186
 
6.2%
n 1172
 
6.1%
t 1129
 
5.9%
i 1116
 
5.8%
s 1059
 
5.5%
h 859
 
4.5%
l 755
 
3.9%
Other values (45) 7289
38.0%
Common
ValueCountFrequency (%)
34875
68.0%
) 2038
 
4.0%
( 2038
 
4.0%
: 1968
 
3.8%
, 1831
 
3.6%
. 1573
 
3.1%
1 1160
 
2.3%
! 966
 
1.9%
0 619
 
1.2%
2 573
 
1.1%
Other values (42) 3634
 
7.1%
Katakana
ValueCountFrequency (%)
5
 
6.7%
5
 
6.7%
5
 
6.7%
4
 
5.3%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
Other values (30) 38
50.7%
Cyrillic
ValueCountFrequency (%)
и 13
 
10.4%
а 12
 
9.6%
о 11
 
8.8%
н 9
 
7.2%
е 8
 
6.4%
р 6
 
4.8%
к 5
 
4.0%
л 5
 
4.0%
ь 4
 
3.2%
с 4
 
3.2%
Other values (26) 48
38.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 105221
59.5%
ASCII 70211
39.7%
CJK 547
 
0.3%
Hiragana 392
 
0.2%
None 190
 
0.1%
Cyrillic 125
 
0.1%
Katakana 75
 
< 0.1%
Math Operators 23
 
< 0.1%
Compat Jamo 22
 
< 0.1%
Punctuation 5
 
< 0.1%
Other values (6) 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
34875
49.7%
) 2038
 
2.9%
( 2038
 
2.9%
: 1968
 
2.8%
e 1929
 
2.7%
, 1831
 
2.6%
. 1573
 
2.2%
a 1351
 
1.9%
o 1317
 
1.9%
r 1186
 
1.7%
Other values (77) 20105
28.6%
Hangul
ValueCountFrequency (%)
3374
 
3.2%
2243
 
2.1%
2101
 
2.0%
1740
 
1.7%
1740
 
1.7%
1666
 
1.6%
1536
 
1.5%
1444
 
1.4%
1422
 
1.4%
1347
 
1.3%
Other values (1241) 86608
82.3%
None
ValueCountFrequency (%)
· 139
73.2%
24
 
12.6%
đ 5
 
2.6%
5
 
2.6%
3
 
1.6%
3
 
1.6%
3
 
1.6%
2
 
1.1%
2
 
1.1%
Ð 1
 
0.5%
Other values (3) 3
 
1.6%
CJK
ValueCountFrequency (%)
31
 
5.7%
12
 
2.2%
12
 
2.2%
11
 
2.0%
10
 
1.8%
9
 
1.6%
9
 
1.6%
8
 
1.5%
8
 
1.5%
7
 
1.3%
Other values (269) 430
78.6%
Hiragana
ValueCountFrequency (%)
27
 
6.9%
22
 
5.6%
21
 
5.4%
18
 
4.6%
16
 
4.1%
16
 
4.1%
16
 
4.1%
15
 
3.8%
13
 
3.3%
12
 
3.1%
Other values (51) 216
55.1%
Math Operators
ValueCountFrequency (%)
23
100.0%
Cyrillic
ValueCountFrequency (%)
и 13
 
10.4%
а 12
 
9.6%
о 11
 
8.8%
н 9
 
7.2%
е 8
 
6.4%
р 6
 
4.8%
к 5
 
4.0%
л 5
 
4.0%
ь 4
 
3.2%
с 4
 
3.2%
Other values (26) 48
38.4%
Punctuation
ValueCountFrequency (%)
5
100.0%
Compat Jamo
ValueCountFrequency (%)
5
22.7%
5
22.7%
5
22.7%
2
 
9.1%
2
 
9.1%
2
 
9.1%
1
 
4.5%
Katakana
ValueCountFrequency (%)
5
 
6.7%
5
 
6.7%
5
 
6.7%
4
 
5.3%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
3
 
4.0%
Other values (30) 38
50.7%
Enclosed Alphanum
ValueCountFrequency (%)
4
100.0%
Misc Symbols
ValueCountFrequency (%)
1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Box Drawing
ValueCountFrequency (%)
1
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%

저작자
Text

Distinct: 8533
Distinct (%): 85.5%
Missing: 20
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 287
Median length: 102
Mean length: 18.1
Min length: 1

Characters and Unicode

Total characters: 180638
Distinct characters: 1346
Distinct categories: 11
Distinct scripts: 7
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7868
Unique (%): 78.8%

Sample

1st row: 이미애 글 ; 임일수 그림
2nd row: 남솔고 지음 ; 이진호 그림
3rd row: 오진희 글 ; 김홍모 그림
4th row: EBS 기획
5th row: 김별 지음 ; 윤은희 그림
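
The sample rows follow a "name role ; name role" layout, with a semicolon between contributors and a trailing role word such as 글, 그림 or 지음. The sketch below splits a value along those lines. It assumes that layout holds; entries in other forms (for example English statements beginning with "by") would need separate handling, and `split_contributors` is a hypothetical helper.

```python
def split_contributors(statement: str) -> list[tuple[str, str]]:
    """Split a contributor statement into (name, role) pairs."""
    contributors = []
    for part in statement.split(";"):
        part = part.strip()
        if not part:
            continue
        # Assumption: the last space-separated token is the role.
        name, _, role = part.rpartition(" ")
        if not name:
            # No space in the part: treat it as a name without a role.
            name, role = role, ""
        contributors.append((name, role))
    return contributors

print(split_contributors("이미애 글 ; 임일수 그림"))
```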
ValueCountFrequency (%)
9147
 
17.0%
그림 5273
 
9.8%
4506
 
8.4%
옮김 2282
 
4.2%
지음 1750
 
3.2%
글·그림 790
 
1.5%
by 668
 
1.2%
원작 402
 
0.7%
233
 
0.4%
엮음 227
 
0.4%
Other values (12006) 28649
53.1%

Most occurring characters

ValueCountFrequency (%)
45319
25.1%
; 9114
 
5.0%
6596
 
3.7%
6481
 
3.6%
5727
 
3.2%
4751
 
2.6%
3572
 
2.0%
2644
 
1.5%
2327
 
1.3%
2076
 
1.1%
Other values (1336) 92031
50.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 103257
57.2%
Space Separator 45320
25.1%
Lowercase Letter 15328
 
8.5%
Other Punctuation 12275
 
6.8%
Uppercase Letter 2898
 
1.6%
Open Punctuation 697
 
0.4%
Close Punctuation 697
 
0.4%
Decimal Number 91
 
0.1%
Dash Punctuation 62
 
< 0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6596
 
6.4%
6481
 
6.3%
5727
 
5.5%
4751
 
4.6%
3572
 
3.5%
2644
 
2.6%
2327
 
2.3%
2076
 
2.0%
1727
 
1.7%
1552
 
1.5%
Other values (1206) 65804
63.7%
Lowercase Letter
ValueCountFrequency (%)
e 1619
10.6%
a 1352
 
8.8%
r 1256
 
8.2%
i 1149
 
7.5%
n 1136
 
7.4%
t 1101
 
7.2%
l 1054
 
6.9%
y 945
 
6.2%
o 891
 
5.8%
b 824
 
5.4%
Other values (42) 4001
26.1%
Uppercase Letter
ValueCountFrequency (%)
S 294
 
10.1%
B 258
 
8.9%
M 249
 
8.6%
R 198
 
6.8%
P 171
 
5.9%
C 169
 
5.8%
J 163
 
5.6%
H 155
 
5.3%
L 138
 
4.8%
D 138
 
4.8%
Other values (33) 965
33.3%
Other Punctuation
ValueCountFrequency (%)
; 9114
74.2%
· 932
 
7.6%
. 774
 
6.3%
: 731
 
6.0%
, 667
 
5.4%
& 24
 
0.2%
/ 13
 
0.1%
' 8
 
0.1%
4
 
< 0.1%
! 3
 
< 0.1%
Other values (3) 5
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 32
35.2%
6 18
19.8%
7 11
 
12.1%
1 9
 
9.9%
5 6
 
6.6%
2 5
 
5.5%
0 4
 
4.4%
4 4
 
4.4%
9 2
 
2.2%
Open Punctuation
ValueCountFrequency (%)
[ 691
99.1%
( 3
 
0.4%
3
 
0.4%
Close Punctuation
ValueCountFrequency (%)
] 691
99.1%
) 3
 
0.4%
3
 
0.4%
Space Separator
ValueCountFrequency (%)
45319
> 99.9%
  1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 4
50.0%
> 4
50.0%
Other Symbol
ValueCountFrequency (%)
3
60.0%
2
40.0%
Dash Punctuation
ValueCountFrequency (%)
- 62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 102095
56.5%
Common 59155
32.7%
Latin 18079
 
10.0%
Han 639
 
0.4%
Katakana 301
 
0.2%
Hiragana 222
 
0.1%
Cyrillic 147
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6596
 
6.5%
6481
 
6.3%
5727
 
5.6%
4751
 
4.7%
3572
 
3.5%
2644
 
2.6%
2327
 
2.3%
2076
 
2.0%
1727
 
1.7%
1552
 
1.5%
Other values (868) 64642
63.3%
Han
ValueCountFrequency (%)
37
 
5.8%
33
 
5.2%
27
 
4.2%
19
 
3.0%
18
 
2.8%
13
 
2.0%
13
 
2.0%
12
 
1.9%
9
 
1.4%
9
 
1.4%
Other values (218) 449
70.3%
Katakana
ValueCountFrequency (%)
41
 
13.6%
19
 
6.3%
13
 
4.3%
12
 
4.0%
10
 
3.3%
10
 
3.3%
9
 
3.0%
8
 
2.7%
8
 
2.7%
8
 
2.7%
Other values (54) 163
54.2%
Latin
ValueCountFrequency (%)
e 1619
 
9.0%
a 1352
 
7.5%
r 1256
 
6.9%
i 1149
 
6.4%
n 1136
 
6.3%
t 1101
 
6.1%
l 1054
 
5.8%
y 945
 
5.2%
o 891
 
4.9%
b 824
 
4.6%
Other values (45) 6752
37.3%
Hiragana
ValueCountFrequency (%)
22
 
9.9%
20
 
9.0%
12
 
5.4%
11
 
5.0%
11
 
5.0%
9
 
4.1%
9
 
4.1%
9
 
4.1%
8
 
3.6%
7
 
3.2%
Other values (36) 104
46.8%
Cyrillic
ValueCountFrequency (%)
а 14
 
9.5%
л 12
 
8.2%
и 10
 
6.8%
р 10
 
6.8%
о 10
 
6.8%
н 9
 
6.1%
е 7
 
4.8%
э 6
 
4.1%
с 6
 
4.1%
в 6
 
4.1%
Other values (30) 57
38.8%
Common
ValueCountFrequency (%)
45319
76.6%
; 9114
 
15.4%
· 932
 
1.6%
. 774
 
1.3%
: 731
 
1.2%
[ 691
 
1.2%
] 691
 
1.2%
, 667
 
1.1%
- 62
 
0.1%
3 32
 
0.1%
Other values (25) 142
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 102084
56.5%
ASCII 76280
42.2%
None 949
 
0.5%
CJK 637
 
0.4%
Katakana 301
 
0.2%
Hiragana 222
 
0.1%
Cyrillic 147
 
0.1%
Compat Jamo 11
 
< 0.1%
Enclosed Alphanum 3
 
< 0.1%
Box Drawing 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
45319
59.4%
; 9114
 
11.9%
e 1619
 
2.1%
a 1352
 
1.8%
r 1256
 
1.6%
i 1149
 
1.5%
n 1136
 
1.5%
t 1101
 
1.4%
l 1054
 
1.4%
y 945
 
1.2%
Other values (68) 12235
 
16.0%
Hangul
ValueCountFrequency (%)
6596
 
6.5%
6481
 
6.3%
5727
 
5.6%
4751
 
4.7%
3572
 
3.5%
2644
 
2.6%
2327
 
2.3%
2076
 
2.0%
1727
 
1.7%
1552
 
1.5%
Other values (865) 64631
63.3%
None
ValueCountFrequency (%)
· 932
98.2%
4
 
0.4%
3
 
0.3%
3
 
0.3%
2
 
0.2%
1
 
0.1%
1
 
0.1%
1
 
0.1%
đ 1
 
0.1%
  1
 
0.1%
Katakana
ValueCountFrequency (%)
41
 
13.6%
19
 
6.3%
13
 
4.3%
12
 
4.0%
10
 
3.3%
10
 
3.3%
9
 
3.0%
8
 
2.7%
8
 
2.7%
8
 
2.7%
Other values (54) 163
54.2%
CJK
ValueCountFrequency (%)
37
 
5.8%
33
 
5.2%
27
 
4.2%
19
 
3.0%
18
 
2.8%
13
 
2.0%
13
 
2.0%
12
 
1.9%
9
 
1.4%
9
 
1.4%
Other values (216) 447
70.2%
Hiragana
ValueCountFrequency (%)
22
 
9.9%
20
 
9.0%
12
 
5.4%
11
 
5.0%
11
 
5.0%
9
 
4.1%
9
 
4.1%
9
 
4.1%
8
 
3.6%
7
 
3.2%
Other values (36) 104
46.8%
Cyrillic
ValueCountFrequency (%)
а 14
 
9.5%
л 12
 
8.2%
и 10
 
6.8%
р 10
 
6.8%
о 10
 
6.8%
н 9
 
6.1%
е 7
 
4.8%
э 6
 
4.1%
с 6
 
4.1%
в 6
 
4.1%
Other values (30) 57
38.8%
Compat Jamo
ValueCountFrequency (%)
8
72.7%
2
 
18.2%
1
 
9.1%
Enclosed Alphanum
ValueCountFrequency (%)
3
100.0%
Box Drawing
ValueCountFrequency (%)
2
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
50.0%
1
50.0%

편_권차
Text

MISSING 

Distinct: 413
Distinct (%): 7.2%
Missing: 4264
Missing (%): 42.6%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 8
Mean length: 1.8807531
Min length: 1

Characters and Unicode

Total characters: 10788
Distinct characters: 50
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 200
Unique (%): 3.5%

Sample

1st row: 28
2nd row: 02월 02일
3rd row: 3
4th row: 15
5th row: 25
ValueCountFrequency (%)
1 474
 
7.9%
2 421
 
7.0%
3 305
 
5.1%
4 278
 
4.7%
5 242
 
4.1%
6 208
 
3.5%
7 162
 
2.7%
8 144
 
2.4%
9 138
 
2.3%
10 126
 
2.1%
Other values (320) 3477
58.2%

Most occurring characters

ValueCountFrequency (%)
1 2120
19.7%
2 1613
15.0%
3 1214
11.3%
4 1026
9.5%
0 898
8.3%
5 828
 
7.7%
6 647
 
6.0%
7 540
 
5.0%
8 501
 
4.6%
9 442
 
4.1%
Other values (40) 959
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9829
91.1%
Other Letter 499
 
4.6%
Space Separator 239
 
2.2%
Other Punctuation 71
 
0.7%
Dash Punctuation 65
 
0.6%
Open Punctuation 30
 
0.3%
Close Punctuation 30
 
0.3%
Lowercase Letter 18
 
0.2%
Uppercase Letter 7
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
235
47.1%
235
47.1%
3
 
0.6%
3
 
0.6%
3
 
0.6%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
Other values (10) 10
 
2.0%
Decimal Number
ValueCountFrequency (%)
1 2120
21.6%
2 1613
16.4%
3 1214
12.4%
4 1026
10.4%
0 898
9.1%
5 828
 
8.4%
6 647
 
6.6%
7 540
 
5.5%
8 501
 
5.1%
9 442
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
e 5
27.8%
b 4
22.2%
o 2
 
11.1%
l 1
 
5.6%
h 1
 
5.6%
c 1
 
5.6%
s 1
 
5.6%
r 1
 
5.6%
n 1
 
5.6%
a 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
F 4
57.1%
P 1
 
14.3%
C 1
 
14.3%
J 1
 
14.3%
Other Punctuation
ValueCountFrequency (%)
: 70
98.6%
, 1
 
1.4%
Space Separator
ValueCountFrequency (%)
239
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 30
100.0%
Close Punctuation
ValueCountFrequency (%)
] 30
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10264
95.1%
Hangul 499
 
4.6%
Latin 25
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
235
47.1%
235
47.1%
3
 
0.6%
3
 
0.6%
3
 
0.6%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
Other values (10) 10
 
2.0%
Common
ValueCountFrequency (%)
1 2120
20.7%
2 1613
15.7%
3 1214
11.8%
4 1026
10.0%
0 898
8.7%
5 828
 
8.1%
6 647
 
6.3%
7 540
 
5.3%
8 501
 
4.9%
9 442
 
4.3%
Other values (6) 435
 
4.2%
Latin
ValueCountFrequency (%)
e 5
20.0%
b 4
16.0%
F 4
16.0%
o 2
 
8.0%
l 1
 
4.0%
h 1
 
4.0%
c 1
 
4.0%
s 1
 
4.0%
r 1
 
4.0%
P 1
 
4.0%
Other values (4) 4
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10289
95.4%
Hangul 499
 
4.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2120
20.6%
2 1613
15.7%
3 1214
11.8%
4 1026
10.0%
0 898
8.7%
5 828
 
8.0%
6 647
 
6.3%
7 540
 
5.2%
8 501
 
4.9%
9 442
 
4.3%
Other values (20) 460
 
4.5%
Hangul
ValueCountFrequency (%)
235
47.1%
235
47.1%
3
 
0.6%
3
 
0.6%
3
 
0.6%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
2
 
0.4%
Other values (10) 10
 
2.0%

권서명
Text

MISSING 

Distinct: 1104
Distinct (%): 92.7%
Missing: 8809
Missing (%): 88.1%
Memory size: 156.2 KiB

Length

Max length: 59
Median length: 38
Mean length: 12.287993
Min length: 1

Characters and Unicode

Total characters: 14635
Distinct characters: 850
Distinct categories: 14
Distinct scripts: 5
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1063
Unique (%): 89.3%

Sample

1st row: Vol.2
2nd row: 성하와 불의 노바썬
3rd row: 가면 쓴 우체국의 유령
4th row: 정 대리│권 사원 편
5th row: 집을 위한 발명 글
ValueCountFrequency (%)
44
 
1.2%
29
 
0.8%
빅북 25
 
0.7%
시대 21
 
0.6%
the 21
 
0.6%
조선 19
 
0.5%
중국어 17
 
0.5%
and 14
 
0.4%
사는 14
 
0.4%
동물 13
 
0.4%
Other values (2543) 3444
94.1%

Most occurring characters

ValueCountFrequency (%)
2470
 
16.9%
261
 
1.8%
e 197
 
1.3%
188
 
1.3%
a 180
 
1.2%
166
 
1.1%
i 156
 
1.1%
r 155
 
1.1%
146
 
1.0%
143
 
1.0%
Other values (840) 10573
72.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9019
61.6%
Space Separator 2470
 
16.9%
Lowercase Letter 1887
 
12.9%
Other Punctuation 336
 
2.3%
Uppercase Letter 325
 
2.2%
Decimal Number 191
 
1.3%
Open Punctuation 171
 
1.2%
Close Punctuation 171
 
1.2%
Dash Punctuation 38
 
0.3%
Math Symbol 20
 
0.1%
Other values (4) 7
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
261
 
2.9%
188
 
2.1%
166
 
1.8%
146
 
1.6%
143
 
1.6%
139
 
1.5%
135
 
1.5%
134
 
1.5%
124
 
1.4%
124
 
1.4%
Other values (751) 7459
82.7%
Lowercase Letter
ValueCountFrequency (%)
e 197
 
10.4%
a 180
 
9.5%
i 156
 
8.3%
r 155
 
8.2%
t 136
 
7.2%
n 135
 
7.2%
s 133
 
7.0%
o 130
 
6.9%
h 92
 
4.9%
l 75
 
4.0%
Other values (15) 498
26.4%
Uppercase Letter
ValueCountFrequency (%)
S 43
13.2%
T 38
11.7%
I 27
 
8.3%
C 26
 
8.0%
A 23
 
7.1%
M 21
 
6.5%
E 17
 
5.2%
B 15
 
4.6%
H 15
 
4.6%
F 13
 
4.0%
Other values (13) 87
26.8%
Other Punctuation
ValueCountFrequency (%)
! 118
35.1%
, 116
34.5%
· 30
 
8.9%
' 23
 
6.8%
: 14
 
4.2%
. 13
 
3.9%
& 5
 
1.5%
5
 
1.5%
/ 3
 
0.9%
; 2
 
0.6%
Other values (5) 7
 
2.1%
Decimal Number
ValueCountFrequency (%)
0 60
31.4%
1 55
28.8%
2 29
15.2%
6 15
 
7.9%
5 9
 
4.7%
4 8
 
4.2%
3 6
 
3.1%
9 4
 
2.1%
8 3
 
1.6%
7 2
 
1.0%
Math Symbol
ValueCountFrequency (%)
~ 11
55.0%
6
30.0%
< 1
 
5.0%
> 1
 
5.0%
+ 1
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 137
80.1%
[ 34
 
19.9%
Close Punctuation
ValueCountFrequency (%)
) 137
80.1%
] 34
 
19.9%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
2470
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 3
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8966
61.3%
Common 3403
 
23.3%
Latin 2213
 
15.1%
Han 49
 
0.3%
Hiragana 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
261
 
2.9%
188
 
2.1%
166
 
1.9%
146
 
1.6%
143
 
1.6%
139
 
1.6%
135
 
1.5%
134
 
1.5%
124
 
1.4%
124
 
1.4%
Other values (710) 7406
82.6%
Latin
ValueCountFrequency (%)
e 197
 
8.9%
a 180
 
8.1%
i 156
 
7.0%
r 155
 
7.0%
t 136
 
6.1%
n 135
 
6.1%
s 133
 
6.0%
o 130
 
5.9%
h 92
 
4.2%
l 75
 
3.4%
Other values (39) 824
37.2%
Common
ValueCountFrequency (%)
2470
72.6%
( 137
 
4.0%
) 137
 
4.0%
! 118
 
3.5%
, 116
 
3.4%
0 60
 
1.8%
1 55
 
1.6%
- 38
 
1.1%
[ 34
 
1.0%
] 34
 
1.0%
Other values (30) 204
 
6.0%
Han
ValueCountFrequency (%)
3
 
6.1%
3
 
6.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
1
 
2.0%
1
 
2.0%
Other values (29) 29
59.2%
Hiragana
ValueCountFrequency (%)
2
50.0%
2
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8966
61.3%
ASCII 5567
38.0%
CJK 49
 
0.3%
None 39
 
0.3%
Math Operators 6
 
< 0.1%
Hiragana 4
 
< 0.1%
Number Forms 1
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%
Punctuation 1
 
< 0.1%
Box Drawing 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2470
44.4%
e 197
 
3.5%
a 180
 
3.2%
i 156
 
2.8%
r 155
 
2.8%
( 137
 
2.5%
) 137
 
2.5%
t 136
 
2.4%
n 135
 
2.4%
s 133
 
2.4%
Other values (69) 1731
31.1%
Hangul
ValueCountFrequency (%)
261
 
2.9%
188
 
2.1%
166
 
1.9%
146
 
1.6%
143
 
1.6%
139
 
1.6%
135
 
1.5%
134
 
1.5%
124
 
1.4%
124
 
1.4%
Other values (710) 7406
82.6%
None
ValueCountFrequency (%)
· 30
76.9%
5
 
12.8%
2
 
5.1%
1
 
2.6%
đ 1
 
2.6%
Math Operators
ValueCountFrequency (%)
6
100.0%
CJK
ValueCountFrequency (%)
3
 
6.1%
3
 
6.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
2
 
4.1%
1
 
2.0%
1
 
2.0%
Other values (29) 29
59.2%
Hiragana
ValueCountFrequency (%)
2
50.0%
2
50.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%
Box Drawing
ValueCountFrequency (%)
1
100.0%

발행자
Text

Distinct: 1867
Distinct (%): 18.7%
Missing: 3
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 46
Median length: 39
Mean length: 5.5345604
Min length: 1

Characters and Unicode

Total characters: 55329
Distinct characters: 754
Distinct categories: 11
Distinct scripts: 7
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 949
Unique (%): 9.5%

Sample

1st row: 한국헤밍웨이
2nd row: 온지
3rd row: 이후
4th row: EBS 미디어센터
5th row: 큰북작은북
Value | Count | Frequency (%)
오디언소리 | 261 | 2.3%
시공주니어 | 228 | 2.0%
웅진주니어 | 218 | 2.0%
교원 | 169 | 1.5%
비룡소 | 166 | 1.5%
scholastic | 165 | 1.5%
한국톨스토이 | 159 | 1.4%
한국헤르만헤세 | 134 | 1.2%
주니어김영사 | 129 | 1.2%
예림당 | 117 | 1.0%
Other values (1885) | 9407 | 84.3%

Most occurring characters

ValueCountFrequency (%)
1738
 
3.1%
1644
 
3.0%
1363
 
2.5%
1163
 
2.1%
1159
 
2.1%
1125
 
2.0%
906
 
1.6%
o 881
 
1.6%
866
 
1.6%
865
 
1.6%
Other values (744) 43619
78.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 43379
78.4%
Lowercase Letter 7283
 
13.2%
Uppercase Letter 1957
 
3.5%
Space Separator 1163
 
2.1%
Other Punctuation 1005
 
1.8%
Open Punctuation 207
 
0.4%
Close Punctuation 206
 
0.4%
Decimal Number 121
 
0.2%
Dash Punctuation 6
 
< 0.1%
Math Symbol 1
 
< 0.1%
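
The category counts above come from the Unicode general category of each character. As a rough illustration of how such a tally could be reproduced, the sketch below uses Python's standard `unicodedata` module. The function name `category_counts` and the two-value sample are hypothetical; `unicodedata.category` returns two-letter codes, where `Lo` corresponds to "Other Letter", `Lu` to "Uppercase Letter", and `Zs` to "Space Separator".

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of the given strings."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical mini-sample built from two publisher names shown in this report.
sample = ["한국헤밍웨이", "EBS 미디어센터", None]
print(category_counts(sample))
```

Hangul syllables fall under `Lo`, which is consistent with "Other Letter" dominating this column.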

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1738
 
4.0%
1644
 
3.8%
1363
 
3.1%
1159
 
2.7%
1125
 
2.6%
906
 
2.1%
866
 
2.0%
865
 
2.0%
864
 
2.0%
817
 
1.9%
Other values (646) 32032
73.8%
Lowercase Letter
ValueCountFrequency (%)
o 881
12.1%
s 704
9.7%
i 639
 
8.8%
r 614
 
8.4%
a 493
 
6.8%
e 489
 
6.7%
c 466
 
6.4%
n 456
 
6.3%
l 422
 
5.8%
t 361
 
5.0%
Other values (30) 1758
24.1%
Uppercase Letter
ValueCountFrequency (%)
S 301
15.4%
B 275
14.1%
H 152
 
7.8%
O 125
 
6.4%
K 124
 
6.3%
C 120
 
6.1%
M 110
 
5.6%
P 107
 
5.5%
E 101
 
5.2%
R 74
 
3.8%
Other values (23) 468
23.9%
Other Punctuation
ValueCountFrequency (%)
: 756
75.2%
& 77
 
7.7%
· 43
 
4.3%
, 37
 
3.7%
. 30
 
3.0%
; 21
 
2.1%
' 20
 
2.0%
10
 
1.0%
/ 9
 
0.9%
# 2
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 57
47.1%
1 53
43.8%
0 5
 
4.1%
3 2
 
1.7%
4 2
 
1.7%
6 1
 
0.8%
5 1
 
0.8%
Close Punctuation
ValueCountFrequency (%)
] 156
75.7%
) 50
 
24.3%
Open Punctuation
ValueCountFrequency (%)
[ 156
75.4%
( 51
 
24.6%
Space Separator
ValueCountFrequency (%)
1163
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 42777
77.3%
Latin 9206
 
16.6%
Common 2709
 
4.9%
Han 584
 
1.1%
Cyrillic 34
 
0.1%
Katakana 14
 
< 0.1%
Hiragana 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1738
 
4.1%
1644
 
3.8%
1363
 
3.2%
1159
 
2.7%
1125
 
2.6%
906
 
2.1%
866
 
2.0%
865
 
2.0%
864
 
2.0%
817
 
1.9%
Other values (550) 31430
73.5%
Han
ValueCountFrequency (%)
91
15.6%
91
15.6%
91
15.6%
19
 
3.3%
15
 
2.6%
13
 
2.2%
11
 
1.9%
11
 
1.9%
9
 
1.5%
9
 
1.5%
Other values (76) 224
38.4%
Latin
ValueCountFrequency (%)
o 881
 
9.6%
s 704
 
7.6%
i 639
 
6.9%
r 614
 
6.7%
a 493
 
5.4%
e 489
 
5.3%
c 466
 
5.1%
n 456
 
5.0%
l 422
 
4.6%
t 361
 
3.9%
Other values (44) 3681
40.0%
Common
ValueCountFrequency (%)
1163
42.9%
: 756
27.9%
] 156
 
5.8%
[ 156
 
5.8%
& 77
 
2.8%
2 57
 
2.1%
1 53
 
2.0%
( 51
 
1.9%
) 50
 
1.8%
· 43
 
1.6%
Other values (14) 147
 
5.4%
Cyrillic
ValueCountFrequency (%)
а 5
14.7%
к 3
 
8.8%
о 3
 
8.8%
н 3
 
8.8%
с 2
 
5.9%
е 2
 
5.9%
ь 2
 
5.9%
М 2
 
5.9%
х 2
 
5.9%
и 1
 
2.9%
Other values (9) 9
26.5%
Katakana
ValueCountFrequency (%)
3
21.4%
3
21.4%
3
21.4%
2
14.3%
1
 
7.1%
1
 
7.1%
1
 
7.1%
Hiragana
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 42776
77.3%
ASCII 11858
 
21.4%
CJK 584
 
1.1%
None 58
 
0.1%
Cyrillic 34
 
0.1%
Katakana 14
 
< 0.1%
Hiragana 5
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1738
 
4.1%
1644
 
3.8%
1363
 
3.2%
1159
 
2.7%
1125
 
2.6%
906
 
2.1%
866
 
2.0%
865
 
2.0%
864
 
2.0%
817
 
1.9%
Other values (549) 31429
73.5%
ASCII
ValueCountFrequency (%)
1163
 
9.8%
o 881
 
7.4%
: 756
 
6.4%
s 704
 
5.9%
i 639
 
5.4%
r 614
 
5.2%
a 493
 
4.2%
e 489
 
4.1%
c 466
 
3.9%
n 456
 
3.8%
Other values (63) 5197
43.8%
CJK
ValueCountFrequency (%)
91
15.6%
91
15.6%
91
15.6%
19
 
3.3%
15
 
2.6%
13
 
2.2%
11
 
1.9%
11
 
1.9%
9
 
1.5%
9
 
1.5%
Other values (76) 224
38.4%
None
ValueCountFrequency (%)
· 43
74.1%
10
 
17.2%
đ 2
 
3.4%
1
 
1.7%
1
 
1.7%
1
 
1.7%
Cyrillic
ValueCountFrequency (%)
а 5
14.7%
к 3
 
8.8%
о 3
 
8.8%
н 3
 
8.8%
с 2
 
5.9%
е 2
 
5.9%
ь 2
 
5.9%
М 2
 
5.9%
х 2
 
5.9%
и 1
 
2.9%
Other values (9) 9
26.5%
Katakana
ValueCountFrequency (%)
3
21.4%
3
21.4%
3
21.4%
2
14.3%
1
 
7.1%
1
 
7.1%
1
 
7.1%
Hiragana
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%

발행년
Text

Distinct: 80
Distinct (%): 0.8%
Missing: 12
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 4
Mean length: 4.1079295
Min length: 4

Characters and Unicode

Total characters: 41030
Distinct characters: 35
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 0.2%

Sample

1st row: [2015]
2nd row: 2012
3rd row: 2013
4th row: [2012]
5th row: 2014
Value | Count | Frequency (%)
2011 | 1029 | 10.3%
2010 | 960 | 9.6%
2015 | 957 | 9.6%
2012 | 949 | 9.5%
2016 | 840 | 8.4%
2014 | 767 | 7.7%
2013 | 686 | 6.9%
2009 | 645 | 6.5%
2017 | 447 | 4.5%
2018 | 389 | 3.9%
Other values (54) | 2321 | 23.2%
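
The sample rows show the same kind of value written both as `2012` and as `[2012]`, so this column is stored as text rather than as a number. A minimal cleaning sketch follows, assuming that bracketed and plain years should be treated alike and that only the first four-digit year matters; the report itself does not state either. The helper name `extract_year` and the accepted range (1500 to 2099) are illustrative choices.

```python
import re

# First four-digit year between 1500 and 2099 (the range is an assumption).
YEAR_RE = re.compile(r"(1[5-9]\d{2}|20\d{2})")


def extract_year(raw):
    """Return the first plausible year in a publication-year string, or None."""
    if raw is None:
        return None
    match = YEAR_RE.search(raw)
    return int(match.group(1)) if match else None


for raw in ["[2015]", "2012", None]:
    print(repr(raw), "->", extract_year(raw))
```

Values without any recognisable year come back as `None`, so they can be reviewed separately instead of being silently coerced.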

Most occurring characters

ValueCountFrequency (%)
0 12870
31.4%
2 10979
26.8%
1 8965
21.8%
9 1665
 
4.1%
5 1112
 
2.7%
6 1070
 
2.6%
4 902
 
2.2%
3 859
 
2.1%
8 857
 
2.1%
7 700
 
1.7%
Other values (25) 1051
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 39979
97.4%
Open Punctuation 501
 
1.2%
Close Punctuation 501
 
1.2%
Other Letter 20
 
< 0.1%
Other Punctuation 17
 
< 0.1%
Lowercase Letter 7
 
< 0.1%
Dash Punctuation 3
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (5) 5
25.0%
Decimal Number
ValueCountFrequency (%)
0 12870
32.2%
2 10979
27.5%
1 8965
22.4%
9 1665
 
4.2%
5 1112
 
2.8%
6 1070
 
2.7%
4 902
 
2.3%
3 859
 
2.1%
8 857
 
2.1%
7 700
 
1.8%
Other Punctuation
ValueCountFrequency (%)
. 14
82.4%
, 2
 
11.8%
: 1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
[ 499
99.6%
( 2
 
0.4%
Close Punctuation
ValueCountFrequency (%)
] 499
99.6%
) 2
 
0.4%
Lowercase Letter
ValueCountFrequency (%)
c 7
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 41003
99.9%
Hangul 20
 
< 0.1%
Latin 7
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 12870
31.4%
2 10979
26.8%
1 8965
21.9%
9 1665
 
4.1%
5 1112
 
2.7%
6 1070
 
2.6%
4 902
 
2.2%
3 859
 
2.1%
8 857
 
2.1%
7 700
 
1.7%
Other values (9) 1024
 
2.5%
Hangul
ValueCountFrequency (%)
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (5) 5
25.0%
Latin
ValueCountFrequency (%)
c 7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41010
> 99.9%
Hangul 20
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 12870
31.4%
2 10979
26.8%
1 8965
21.9%
9 1665
 
4.1%
5 1112
 
2.7%
6 1070
 
2.6%
4 902
 
2.2%
3 859
 
2.1%
8 857
 
2.1%
7 700
 
1.7%
Other values (10) 1031
 
2.5%
Hangul
ValueCountFrequency (%)
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (5) 5
25.0%

발행지
Text

Distinct: 148
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 43
Median length: 2
Mean length: 2.4622
Min length: 1

Characters and Unicode

Total characters: 24622
Distinct characters: 152
Distinct categories: 7
Distinct scripts: 5
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 0.7%

Sample

1st row: 성남
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울
Value | Count | Frequency (%)
서울 | 6341 | 61.1%
파주 | 1702 | 16.4%
성남 | 383 | 3.7%
고양 | 349 | 3.4%
new | 277 | 2.7%
york | 277 | 2.7%
원본 | 198 | 1.9%
london | 79 | 0.8%
oxford | 60 | 0.6%
인천 | 46 | 0.4%
Other values (126) | 669 | 6.4%
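
The table above lists `new` and `york` as separate lowercase entries with identical counts, which suggests that it counts lowercased, whitespace-separated tokens rather than whole cell values. The sketch below shows that kind of tally under this assumption; the helper name `token_counts` and the sample list are hypothetical.

```python
from collections import Counter


def token_counts(values):
    """Count lowercased, whitespace-separated tokens across all non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        counts.update(value.lower().split())
    return counts


# Hypothetical values in the style of this column.
places = ["서울", "New York", "서울", "New York", None]
print(token_counts(places).most_common(3))
```

Counting whole values instead (for example with `Counter(places)`) would keep "New York" as a single entry.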

Most occurring characters

ValueCountFrequency (%)
6344
25.8%
6342
25.8%
1752
 
7.1%
1704
 
6.9%
1537
 
6.2%
o 605
 
2.5%
409
 
1.7%
404
 
1.6%
389
 
1.6%
r 357
 
1.4%
Other values (142) 4779
19.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 18901
76.8%
Lowercase Letter 3026
 
12.3%
Space Separator 1537
 
6.2%
Uppercase Letter 1013
 
4.1%
Other Punctuation 55
 
0.2%
Open Punctuation 45
 
0.2%
Close Punctuation 45
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6344
33.6%
6342
33.6%
1752
 
9.3%
1704
 
9.0%
409
 
2.2%
404
 
2.1%
389
 
2.1%
349
 
1.8%
200
 
1.1%
199
 
1.1%
Other values (79) 809
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
o 605
20.0%
r 357
11.8%
e 353
11.7%
n 315
10.4%
k 291
9.6%
w 287
9.5%
d 158
 
5.2%
i 128
 
4.2%
a 74
 
2.4%
s 61
 
2.0%
Other values (20) 397
13.1%
Uppercase Letter
ValueCountFrequency (%)
N 321
31.7%
Y 285
28.1%
L 82
 
8.1%
O 65
 
6.4%
M 49
 
4.8%
H 35
 
3.5%
S 34
 
3.4%
B 29
 
2.9%
U 23
 
2.3%
A 20
 
2.0%
Other values (17) 70
 
6.9%
Other Punctuation
ValueCountFrequency (%)
, 44
80.0%
. 7
 
12.7%
; 4
 
7.3%
Space Separator
ValueCountFrequency (%)
1537
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 45
100.0%
Close Punctuation
ValueCountFrequency (%)
] 45
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 18680
75.9%
Latin 4012
 
16.3%
Common 1682
 
6.8%
Han 221
 
0.9%
Cyrillic 27
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6344
34.0%
6342
34.0%
1752
 
9.4%
1704
 
9.1%
409
 
2.2%
404
 
2.2%
389
 
2.1%
349
 
1.9%
200
 
1.1%
199
 
1.1%
Other values (59) 588
 
3.1%
Latin
ValueCountFrequency (%)
o 605
15.1%
r 357
 
8.9%
e 353
 
8.8%
N 321
 
8.0%
n 315
 
7.9%
k 291
 
7.3%
w 287
 
7.2%
Y 285
 
7.1%
d 158
 
3.9%
i 128
 
3.2%
Other values (38) 912
22.7%
Han
ValueCountFrequency (%)
73
33.0%
40
18.1%
32
14.5%
21
 
9.5%
13
 
5.9%
7
 
3.2%
7
 
3.2%
7
 
3.2%
3
 
1.4%
3
 
1.4%
Other values (10) 15
 
6.8%
Cyrillic
ValueCountFrequency (%)
к 4
14.8%
о 4
14.8%
а 4
14.8%
в 4
14.8%
М 4
14.8%
с 4
14.8%
С 1
 
3.7%
П 1
 
3.7%
б 1
 
3.7%
Common
ValueCountFrequency (%)
1537
91.4%
[ 45
 
2.7%
] 45
 
2.7%
, 44
 
2.6%
. 7
 
0.4%
; 4
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 18680
75.9%
ASCII 5694
 
23.1%
CJK 221
 
0.9%
Cyrillic 27
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6344
34.0%
6342
34.0%
1752
 
9.4%
1704
 
9.1%
409
 
2.2%
404
 
2.2%
389
 
2.1%
349
 
1.9%
200
 
1.1%
199
 
1.1%
Other values (59) 588
 
3.1%
ASCII
ValueCountFrequency (%)
1537
27.0%
o 605
 
10.6%
r 357
 
6.3%
e 353
 
6.2%
N 321
 
5.6%
n 315
 
5.5%
k 291
 
5.1%
w 287
 
5.0%
Y 285
 
5.0%
d 158
 
2.8%
Other values (44) 1185
20.8%
CJK
ValueCountFrequency (%)
73
33.0%
40
18.1%
32
14.5%
21
 
9.5%
13
 
5.9%
7
 
3.2%
7
 
3.2%
7
 
3.2%
3
 
1.4%
3
 
1.4%
Other values (10) 15
 
6.8%
Cyrillic
ValueCountFrequency (%)
к 4
14.8%
о 4
14.8%
а 4
14.8%
в 4
14.8%
М 4
14.8%
с 4
14.8%
С 1
 
3.7%
П 1
 
3.7%
б 1
 
3.7%

원_복본구분
Categorical

IMBALANCE 

Distinct: 49
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
원본 | 9592
복본 | 209
495000 | 17
660000 | 14
680000 | 13
Other values (44) | 155

Length

Max length: 6
Median length: 2
Mean length: 2.0691
Min length: 1

Unique

Unique: 20
Unique (%): 0.2%

Sample

1st row: 원본
2nd row: 원본
3rd row: 원본
4th row: 원본
5th row: 원본

Common Values

Value | Count | Frequency (%)
원본 | 9592 | 95.9%
복본 | 209 | 2.1%
495000 | 17 | 0.2%
660000 | 14 | 0.1%
680000 | 13 | 0.1%
420000 | 13 | 0.1%
7500 | 10 | 0.1%
50850 | 10 | 0.1%
620000 | 10 | 0.1%
392000 | 9 | 0.1%
Other values (39) | 103 | 1.0%
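
Almost all values in this column are 원본 or 복본, while the remainder are numbers such as 495000. One possible reading, which the report does not confirm, is that those rows had their fields shifted by one column in the source file. A minimal sketch for flagging such rows, assuming the column is meant to hold only the two labels (the names `EXPECTED` and `find_unexpected` are hypothetical):

```python
# Labels assumed to be the only legitimate values of this column.
EXPECTED = {"원본", "복본"}


def find_unexpected(values):
    """Return (position, value) pairs whose value is not one of the expected labels."""
    return [(i, v) for i, v in enumerate(values) if v not in EXPECTED]


# Hypothetical slice of the column.
column = ["원본", "복본", "495000", "원본", "13000"]
print(find_unexpected(column))
```

Flagged positions can then be inspected next to the neighbouring columns before deciding whether to repair or drop them.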

Length

Histogram of lengths of the category

구분
Text

Distinct: 54
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 39
Median length: 1
Mean length: 1.2798
Min length: 1

Characters and Unicode

Total characters: 12798
Distinct characters: 85
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 0.3%

Sample

1st row
2nd row
3rd row
4th row
5th row
Value | Count | Frequency (%)
 | 9487 | 93.6%
부록 | 314 | 3.1%
cm | 49 | 0.5%
70책:삽도;26cm | 17 | 0.2%
60책:삽화;29 | 14 | 0.1%
삽도 | 13 | 0.1%
40책:색채삽도;22cm | 13 | 0.1%
1책p.:삽도;25cm | 13 | 0.1%
60책:그림;28cm | 10 | 0.1%
68책:삽화;28 | 10 | 0.1%
Other values (76) | 192 | 1.9%

Most occurring characters

ValueCountFrequency (%)
9660
75.5%
314
 
2.5%
314
 
2.5%
2 204
 
1.6%
: 189
 
1.5%
; 179
 
1.4%
c 174
 
1.4%
m 170
 
1.3%
151
 
1.2%
134
 
1.0%
Other values (75) 1309
 
10.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 11010
86.0%
Decimal Number 759
 
5.9%
Other Punctuation 443
 
3.5%
Lowercase Letter 370
 
2.9%
Space Separator 134
 
1.0%
Uppercase Letter 46
 
0.4%
Math Symbol 19
 
0.1%
Close Punctuation 6
 
< 0.1%
Open Punctuation 6
 
< 0.1%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9660
87.7%
314
 
2.9%
314
 
2.9%
151
 
1.4%
116
 
1.1%
64
 
0.6%
43
 
0.4%
38
 
0.3%
22
 
0.2%
21
 
0.2%
Other values (44) 267
 
2.4%
Decimal Number
ValueCountFrequency (%)
2 204
26.9%
0 117
15.4%
6 89
11.7%
3 76
 
10.0%
1 67
 
8.8%
8 57
 
7.5%
5 48
 
6.3%
9 37
 
4.9%
7 36
 
4.7%
4 28
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
D 16
34.8%
C 6
 
13.0%
V 6
 
13.0%
M 4
 
8.7%
O 4
 
8.7%
R 4
 
8.7%
T 2
 
4.3%
S 2
 
4.3%
N 2
 
4.3%
Other Punctuation
ValueCountFrequency (%)
: 189
42.7%
; 179
40.4%
. 54
 
12.2%
, 21
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
c 174
47.0%
m 170
45.9%
p 26
 
7.0%
Space Separator
ValueCountFrequency (%)
134
100.0%
Math Symbol
ValueCountFrequency (%)
+ 19
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 11010
86.0%
Common 1372
 
10.7%
Latin 416
 
3.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9660
87.7%
314
 
2.9%
314
 
2.9%
151
 
1.4%
116
 
1.1%
64
 
0.6%
43
 
0.4%
38
 
0.3%
22
 
0.2%
21
 
0.2%
Other values (44) 267
 
2.4%
Common
ValueCountFrequency (%)
2 204
14.9%
: 189
13.8%
; 179
13.0%
134
9.8%
0 117
8.5%
6 89
6.5%
3 76
 
5.5%
1 67
 
4.9%
8 57
 
4.2%
. 54
 
3.9%
Other values (9) 206
15.0%
Latin
ValueCountFrequency (%)
c 174
41.8%
m 170
40.9%
p 26
 
6.2%
D 16
 
3.8%
C 6
 
1.4%
V 6
 
1.4%
M 4
 
1.0%
O 4
 
1.0%
R 4
 
1.0%
T 2
 
0.5%
Other values (2) 4
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 11010
86.0%
ASCII 1788
 
14.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9660
87.7%
314
 
2.9%
314
 
2.9%
151
 
1.4%
116
 
1.1%
64
 
0.6%
43
 
0.4%
38
 
0.3%
22
 
0.2%
21
 
0.2%
Other values (44) 267
 
2.4%
ASCII
ValueCountFrequency (%)
2 204
11.4%
: 189
10.6%
; 179
10.0%
c 174
9.7%
m 170
9.5%
134
 
7.5%
0 117
 
6.5%
6 89
 
5.0%
3 76
 
4.3%
1 67
 
3.7%
Other values (21) 389
21.8%

형태사항
Text

MISSING 

Distinct: 5514
Distinct (%): 58.1%
Missing: 513
Missing (%): 5.1%
Memory size: 156.2 KiB

Length

Max length: 57
Median length: 44
Mean length: 16.084748
Min length: 2

Characters and Unicode

Total characters: 152596
Distinct characters: 206
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4237
Unique (%): 44.7%

Sample

1st row: 31, [2] p.:삽화, 사진;24 cm
2nd row: 199p:삽도;26cm
3rd row: 64 p.; 25 cm
4th row: 비디오디스크 1매(각50분):유성, 천연색;12 cm
5th row: 72 p.:삽화;21 cm
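
The sample values pack several facts into one string, and most of them end with a size in centimetres (`24 cm`, `26cm`). As an illustration of how that part could be pulled out, here is a small regular-expression sketch. It assumes the size is written as one number, or two numbers joined by `x` or `×`, followed by `cm`; the helper name `parse_size_cm` is hypothetical.

```python
import re

# One number, optionally followed by "x" or "×" and a second number, then "cm".
SIZE_RE = re.compile(r"(\d+)\s*(?:[x×]\s*(\d+)\s*)?cm")


def parse_size_cm(raw):
    """Return the dimension(s) in cm found in a physical description, or None."""
    if raw is None:
        return None
    match = SIZE_RE.search(raw)
    if match is None:
        return None
    # Drop the second group when only one dimension is present.
    return tuple(int(g) for g in match.groups() if g is not None)


for raw in ["72 p.:삽화;21 cm", "199p:삽도;26cm", "전자책 1책:천연색"]:
    print(raw, "->", parse_size_cm(raw))
```

Descriptions without a size, such as the e-book entries, yield `None` rather than a guessed value.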
Value | Count | Frequency (%)
cm | 6265 | 25.1%
전자책 | 656 | 2.6%
1책:천연색 | 655 | 2.6%
p.:삽화;26 | 380 | 1.5%
1 | 332 | 1.3%
p.:삽화;24 | 321 | 1.3%
32 | 320 | 1.3%
p.:삽화;23 | 304 | 1.2%
스테레오 | 266 | 1.1%
오디오북(약 | 261 | 1.0%
Other values (3040) | 15213 | 60.9%

Most occurring characters

ValueCountFrequency (%)
15888
 
10.4%
2 11825
 
7.7%
: 8932
 
5.9%
c 8754
 
5.7%
; 8736
 
5.7%
m 8504
 
5.6%
. 7742
 
5.1%
1 7522
 
4.9%
7056
 
4.6%
p 6836
 
4.5%
Other values (196) 60801
39.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 38945
25.5%
Other Letter 38329
25.1%
Lowercase Letter 26906
17.6%
Other Punctuation 26406
17.3%
Space Separator 15888
10.4%
Close Punctuation 2023
 
1.3%
Open Punctuation 2023
 
1.3%
Uppercase Letter 1435
 
0.9%
Math Symbol 477
 
0.3%
Dash Punctuation 164
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7056
18.4%
5966
15.6%
4449
11.6%
3591
9.4%
3541
9.2%
2629
 
6.9%
1230
 
3.2%
915
 
2.4%
893
 
2.3%
792
 
2.1%
Other values (136) 7267
19.0%
Lowercase Letter
ValueCountFrequency (%)
c 8754
32.5%
m 8504
31.6%
p 6836
25.4%
l 1108
 
4.1%
x 617
 
2.3%
i 429
 
1.6%
o 294
 
1.1%
v 191
 
0.7%
d 34
 
0.1%
u 22
 
0.1%
Other values (11) 117
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
D 511
35.6%
C 198
 
13.8%
V 166
 
11.6%
R 161
 
11.2%
M 160
 
11.1%
O 160
 
11.1%
S 22
 
1.5%
T 21
 
1.5%
N 21
 
1.5%
X 8
 
0.6%
Other values (5) 7
 
0.5%
Decimal Number
ValueCountFrequency (%)
2 11825
30.4%
1 7522
19.3%
3 4426
 
11.4%
6 2711
 
7.0%
4 2637
 
6.8%
5 2266
 
5.8%
7 2132
 
5.5%
8 1838
 
4.7%
0 1806
 
4.6%
9 1782
 
4.6%
Other Punctuation
ValueCountFrequency (%)
: 8932
33.8%
; 8736
33.1%
. 7742
29.3%
, 969
 
3.7%
* 26
 
0.1%
/ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 1343
66.4%
) 680
33.6%
Open Punctuation
ValueCountFrequency (%)
[ 1343
66.4%
( 680
33.6%
Math Symbol
ValueCountFrequency (%)
+ 254
53.2%
× 223
46.8%
Space Separator
ValueCountFrequency (%)
15888
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 85926
56.3%
Hangul 38062
24.9%
Latin 28341
 
18.6%
Han 259
 
0.2%
Katakana 4
 
< 0.1%
Hiragana 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7056
18.5%
5966
15.7%
4449
11.7%
3591
9.4%
3541
9.3%
2629
 
6.9%
1230
 
3.2%
915
 
2.4%
893
 
2.3%
792
 
2.1%
Other values (121) 7000
18.4%
Latin
ValueCountFrequency (%)
c 8754
30.9%
m 8504
30.0%
p 6836
24.1%
l 1108
 
3.9%
x 617
 
2.2%
D 511
 
1.8%
i 429
 
1.5%
o 294
 
1.0%
C 198
 
0.7%
v 191
 
0.7%
Other values (26) 899
 
3.2%
Common
ValueCountFrequency (%)
15888
18.5%
2 11825
13.8%
: 8932
10.4%
; 8736
10.2%
. 7742
9.0%
1 7522
8.8%
3 4426
 
5.2%
6 2711
 
3.2%
4 2637
 
3.1%
5 2266
 
2.6%
Other values (14) 13241
15.4%
Han
ValueCountFrequency (%)
91
35.1%
91
35.1%
49
18.9%
15
 
5.8%
5
 
1.9%
2
 
0.8%
2
 
0.8%
1
 
0.4%
1
 
0.4%
1
 
0.4%
Katakana
ValueCountFrequency (%)
2
50.0%
2
50.0%
Hiragana
ValueCountFrequency (%)
2
50.0%
2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 114044
74.7%
Hangul 38062
 
24.9%
CJK 259
 
0.2%
None 223
 
0.1%
Katakana 4
 
< 0.1%
Hiragana 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
15888
13.9%
2 11825
10.4%
: 8932
 
7.8%
c 8754
 
7.7%
; 8736
 
7.7%
m 8504
 
7.5%
. 7742
 
6.8%
1 7522
 
6.6%
p 6836
 
6.0%
3 4426
 
3.9%
Other values (49) 24879
21.8%
Hangul
ValueCountFrequency (%)
7056
18.5%
5966
15.7%
4449
11.7%
3591
9.4%
3541
9.3%
2629
 
6.9%
1230
 
3.2%
915
 
2.4%
893
 
2.3%
792
 
2.1%
Other values (121) 7000
18.4%
None
ValueCountFrequency (%)
× 223
100.0%
CJK
ValueCountFrequency (%)
91
35.1%
91
35.1%
49
18.9%
15
 
5.8%
5
 
1.9%
2
 
0.8%
2
 
0.8%
1
 
0.4%
1
 
0.4%
1
 
0.4%
Katakana
ValueCountFrequency (%)
2
50.0%
2
50.0%
Hiragana
ValueCountFrequency (%)
2
50.0%
2
50.0%

데이터 기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-09-28 00:00:00
Maximum: 2022-09-28 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Figures: four interaction plots]

Correlations

Variable | 연번 | 배가상태 | 배가기호 | 자료실명 | 발행년 | 원_복본구분 | 구분
연번 | 1.000 | 0.084 | 0.103 | 0.618 | 0.827 | 0.338 | 0.537
배가상태 | 0.084 | 1.000 | 0.000 | 0.050 | 0.146 | 0.000 | 0.000
배가기호 | 0.103 | 0.000 | 1.000 | 1.000 | 0.203 | 0.307 | 0.320
자료실명 | 0.618 | 0.050 | 1.000 | 1.000 | 0.603 | 0.416 | 0.617
발행년 | 0.827 | 0.146 | 0.203 | 0.603 | 1.000 | 0.820 | 0.815
원_복본구분 | 0.338 | 0.000 | 0.307 | 0.416 | 0.820 | 1.000 | 0.999
구분 | 0.537 | 0.000 | 0.320 | 0.617 | 0.815 | 0.999 | 1.000

Variable | 배가상태 | 원_복본구분 | 자료실명
배가상태 | 1.000 | 0.000 | 0.029
원_복본구분 | 0.000 | 1.000 | 0.148
자료실명 | 0.029 | 0.148 | 1.000

Variable | 연번 | 배가기호 | 배가상태 | 자료실명 | 원_복본구분
연번 | 1.000 | -0.157 | 0.049 | 0.320 | 0.121
배가기호 | -0.157 | 1.000 | 0.000 | 1.000 | 0.255
배가상태 | 0.049 | 0.000 | 1.000 | 0.029 | 0.000
자료실명 | 0.320 | 1.000 | 0.029 | 1.000 | 0.148
원_복본구분 | 0.121 | 0.255 | 0.000 | 0.148 | 1.000
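
This section does not name the coefficient behind each matrix. For a pair of label-valued columns such as 배가기호 and 자료실명, one common association measure is Cramér's V, and the sketch below computes it with the standard library only. Treat it as an illustration under that assumption, not as the report's actual method: it applies no bias correction, the function name `cramers_v` is hypothetical, and the toy data simply mirror the one-to-one pairing of shelf code and room seen in the sample rows.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: each shelf code maps to exactly one room.
codes = ["3", "2", "4", "3", "2", "4"]
rooms = ["유아자료실", "아동자료실", "디지털자료실"] * 2
print(round(cramers_v(codes, rooms), 3))
```

A value near 1 means that knowing one column almost determines the other, which is the pattern the matrices report for 배가기호 and 자료실명.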

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
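
Nullity correlation can be read as an ordinary Pearson correlation between presence indicators (1 if a cell is present, 0 if it is missing). The following sketch computes it for two columns with the standard library; the function name `nullity_correlation` and the five-row example are hypothetical, with `None` standing in for `<NA>`.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Hypothetical rows in the style of the 편_권차 and 권서명 columns.
volume = ["28", None, None, "02월 02일", "3"]
volume_title = [None, None, None, "Vol.2", None]
print(nullity_correlation(volume, volume_title))
```

A positive value means the two columns tend to be filled in (or left empty) together; a value near zero means their missingness is unrelated.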

Sample

(index) | 연번 | 관리구분 | 배가상태 | 이용제한구분 | 등록번호 | 청구기호 | 배가기호 | 자료실명 | 서명 | 저작자 | 편_권차 | 권서명 | 발행자 | 발행년 | 발행지 | 원_복본구분 | 구분 | 형태사항 | 데이터 기준일자
37770 | 37771 | 본관 | 비치자료 | 일반 | ADJ000036699 | 유아 408-말231ㅁ-28 | 3 | 유아자료실 | 맑은 날, 흐린 날 | 이미애 글 ; 임일수 그림 | 28 | <NA> | 한국헤밍웨이 | [2015] | 성남 | 원본 |  | 31, [2] p.:삽화, 사진;24 cm | 2022-09-28
17548 | 17549 | 본관 | 비치자료 | 일반 | ADJ000012764 | 아동 813.8-남55ㄱ | 2 | 아동자료실 | 강아지도 꿈이 있다 : 남솔고 장편동화 | 남솔고 지음 ; 이진호 그림 | <NA> | <NA> | 온지 | 2012 | 서울 | 원본 |  | 199p:삽도;26cm | 2022-09-28
26383 | 26384 | 본관 | 비치자료 | 일반 | ADJ000023929 | 유아 813.8-오79ㅇ | 3 | 유아자료실 | 우리 집에 놀러 오세요 | 오진희 글 ; 김홍모 그림 | <NA> | <NA> | 이후 | 2013 | 서울 | 원본 |  | 64 p.; 25 cm | 2022-09-28
3862 | 3863 | 본관 | 비치자료 | 일반 | ADD000001044 | 911-2-2 | 4 | 디지털자료실 | 역사채널e. 2 -2, Vol.2 | EBS 기획 | 02월 02일 | Vol.2 | EBS 미디어센터 | [2012] | 서울 | 원본 |  | 비디오디스크 1매(각50분):유성, 천연색;12 cm | 2022-09-28
33386 | 33387 | 본관 | 비치자료 | 일반 | ADJ000032145 | 아동 813.8-큰47ㅋ-3 | 2 | 아동자료실 | 아프지만 괜찮아 | 김별 지음 ; 윤은희 그림 | 3 | <NA> | 큰북작은북 | 2014 | 서울 | 원본 |  | 72 p.:삽화;21 cm | 2022-09-28
30035 | 30036 | 본관 | 비치자료 | 일반 | ADJ000028358 | 아동 410-송225ㅅ-15=3 | 2 | 아동자료실 | (코믹 메이플스토리)수학도둑. 15 | 송도수 글 ; 서정은 그림 | 15 | <NA> | 서울문화사 | 2010 | 서울 | 복본 |  | 170 p.:삽화;26 cm | 2022-09-28
46401 | 46402 | 본관 | 비치자료 | 일반 | ADJ000045504 | 유아 813.8-이53ㄱ | 3 | 유아자료실 | 귀신 안녕 | 이선미 | <NA> | <NA> | 글로연 | 2018 | 원본 | 13000 | 40 p.;24 cm | <NA> | 2022-09-28
20967 | 20968 | 본관 | 관외대출자료 | 일반 | ADJ000016830 | 아동 808.9-버219ㅈ | 2 | 아동자료실 | 지각대장 존 | 존 버닝햄 글·그림 ; 박상희 옮김 | <NA> | <NA> | 비룡소 | 2004 | 서울 | 원본 |  | 1책:색채삽도;26cm | 2022-09-28
16492 | 16493 | 본관 | 비치자료 | 일반 | ADJ000011648 | 유아 813.8-문58ㅇ | 3 | 유아자료실 | 우리는 벌거숭이 화가 | 문승연 글 ; 이수지 그림 | <NA> | <NA> | 길벗어린이 | 2012 | 파주 | 원본 |  | [30] p.:천연색삽화;22 cm | 2022-09-28
48237 | 48238 | 본관 | 비치자료 | 일반 | ADJ000047371 | 아동 410-하69ㅅ-25 | 2 | 아동자료실 | 수학 비밀일기 : 수·연산편. 25 | 하이툰 닷컴 글·그림 | 25 | 성하와 불의 노바썬 | 천재코믹스 | 2017 | 서울 | 원본 |  | 135 p.:만화;26 cm | 2022-09-28
(index) | 연번 | 관리구분 | 배가상태 | 이용제한구분 | 등록번호 | 청구기호 | 배가기호 | 자료실명 | 서명 | 저작자 | 편_권차 | 권서명 | 발행자 | 발행년 | 발행지 | 원_복본구분 | 구분 | 형태사항 | 데이터 기준일자
33622 | 33623 | 본관 | 비치자료 | 일반 | ADJ000032385 | 유아 375.1-뜨233ㄸ-4-3 | 201 | 코아루작은도서관 | 뜨레풀책놀이. 4-3 | 동화가있는집 | 04월 03일 | 색깔`나`가족 | 동심 | 2010 | 서울 | 원본 |  | 3책:삽화;15x16cm | 2022-09-28
10684 | 10685 | 본관 | 비치자료 | 일반 | ADJ000003221 | 유아 808.9-봄45ㅂ-22 | 3 | 유아자료실 | 등대 소년 조르디 | 얀나 카리올리 글 ; 마리나 마르콜린 그림 ; 김현좌 옮김 | 22 | <NA> | 봄봄 | 2011 | 서울 | 원본 |  | [32] p.:삽화;24x30 cm | 2022-09-28
7818 | 7819 | 본관 | 비치자료 | 일반 | ADE000003650 | 802.56 | 4 | 디지털자료실 | 말버릇의 힘 : 1일 1언 긍정의 말이 불러온 기적 같은 변화 | 나이토 요시히토 지음 ; 김윤경 옮김 | <NA> | <NA> | 비즈니스북스 | 2021 | 서울 | 원본 |  | 전자책 1책:천연색 | 2022-09-28
42025 | 42026 | 본관 | 비치자료 | 일반 | ADJ000041045 | 아동 747-옥58ㅇ-6-3=3 | 2 | 아동자료실 | The Castle Garden | Roderick Hunt | 06월 03일 | <NA> | Oxford university press | 2016 | Oxford | 복본 |  | 6책;22cm | 2022-09-28
5719 | 5720 | 본관 | 비치자료 | 일반 | ADE000001551 | 813.8-양45ㅎ | 4 | 디지털자료실 | 황제를 우습게 본 치우 | 양봉선 | <NA> | <NA> | 한국문학방송 | 2013 | 서울 | 원본 |  | 전자책 1책:천연색 | 2022-09-28
24135 | 24136 | 본관 | 비치자료 | 일반 | ADJ000021471 | 아동 813.8-마68ㅁ | 2 | 아동자료실 | 소년병과 들국화 | 남미영 글 ; 정수영 그림 | <NA> | <NA> | 세상모든책 | 2004 | 서울 | 원본 |  | 63p.:색채삽도;24cm | 2022-09-28
23770 | 23771 | 본관 | 비치자료 | 일반 | ADJ000020965 | 유아 808.9-책228ㅊ | 3 | 유아자료실 | 무지개 호랑이 | 게일 노드홈 글 ; 제니퍼 프로워크 그림 ; 이지영 옮김 | <NA> | <NA> | 행복도서관 | 2008 | 고양 | 원본 |  | 1책:채색삽도;23x29 cm | 2022-09-28
37415 | 37416 | 본관 | 비치자료 | 일반 | ADJ000036340 | 유아 808.9-월228ㅇ-54 | 3 | 유아자료실 | 하양이의 빨간 모자 | A. J. 우드 글 ; 매기 닌 그림 ; 이경희 옮김 | 54 | <NA> | 한국셰익스피어 | 2016 | 서울 | 원본 |  | [30] p.:채색삽도;27 cm | 2022-09-28
35178 | 35179 | 본관 | 비치자료 | 일반 | ADJ000033967 | 유아 375.1-하31ㅇ | 3 | 유아자료실 | 오늘은 우리가 선생님! | 하마다 게이코 지음 ; 김난주 옮김 | <NA> | <NA> | 찰리북 | 2013 | 서울 | 원본 |  | [1책]:삽화;23 cm | 2022-09-28
3832 | 3833 | 본관 | 비치자료 | 일반 | ADD000001014 | 688 | 4 | 디지털자료실 | 온리 유 | 장하오 감독 | <NA> | <NA> | 비디오여행 | 2016 | 서울 | 원본 |  | DVD 1매(113분):유성,천연색;12cm | 2022-09-28