Overview

Dataset statistics

Number of variables: 5
Number of observations: 1698
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 66.5 KiB
Average record size in memory: 40.1 B
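The average record size appears to be the total in-memory size divided by the number of observations. The short check below is a sketch using only the two figures reported above; the variable names are illustrative, and the reported 66.5 KiB is itself a rounded value.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 66.5
n_observations = 1698

# 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "40.1 B"
```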

Variable types

Text: 5

Dataset

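Description: Digital music chart information used to select priority-protection works for the online illegal-copy monitoring carried out by the Korea Copyright Protection Agency (한국저작권보호원)
Author: Korea Copyright Protection Agency ((재)한국저작권보호원)
URL: https://www.data.go.kr/data/15071046/fileData.do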

Reproduction

Analysis started: 2023-12-12 13:22:16.880290
Analysis finished: 2023-12-12 13:22:18.090607
Duration: 1.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

저작물명 (work title)
Distinct: 1658
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 13.4 KiB

Length

Max length: 90
Median length: 54
Mean length: 12.142521
Min length: 1
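The length figures are per-value character counts summarised over the column. The sketch below computes the same four statistics for the five sample titles of this variable, and separately checks that the reported mean equals total characters divided by observations (20618 / 1698). The function name is illustrative and is not part of ydata-profiling.

```python
from statistics import mean, median

def length_stats(values):
    """Summarise the character lengths of a list of strings."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# The five sample titles shown for this variable.
sample = ["소나기", "미워요", "신촌을 못가", "가질수 없는 너", "술 한잔 해요"]
stats = length_stats(sample)

# Cross-check of the reported mean length: total characters / observations.
reported_mean = 20618 / 1698
print(stats, round(reported_mean, 6))
```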

Characters and Unicode

Total characters: 20618
Distinct characters: 746
Distinct categories: 14
Distinct scripts: 5
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
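As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The standard library exposes categories but not scripts or blocks, so only categories are shown. The `CATEGORY_NAMES` mapping is a partial, hand-written table covering the labels used in this report, and the function name is illustrative.

```python
import unicodedata
from collections import Counter

# Partial mapping from two-letter general category codes to the
# long names used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Sk": "Modifier Symbol",
}

def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("아이오아이 (I.O.I)")
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" (Lo) because they have no case, which is why that category is large in every column of this dataset.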

Unique

Unique: 1620
Unique (%): 95.4%
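"Distinct" and "Unique" measure different things: distinct is the number of different values, while unique is the number of values that occur exactly once. The sketch below assumes that reading and uses five agency names taken from the sample rows of this dataset; the function name is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

sample = [
    "로엔엔터테인먼트",
    "CJ E&M",
    "로엔엔터테인먼트",
    "Universal Music",
    "로엔엔터테인먼트",
]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)  # prints "3 2"

# The percentages in the report are these counts over the 1698 observations.
print(f"{1658 / 1698:.1%}", f"{1620 / 1698:.1%}")  # prints "97.6% 95.4%"
```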

Sample

1st row: 소나기
2nd row: 미워요
3rd row: 신촌을 못가
4th row: 가질수 없는 너
5th row: 술 한잔 해요

Value | Count | Frequency (%)
feat | 260 | 5.7%
you | 73 | 1.6%
love | 53 | 1.2%
prod | 49 | 1.1%
me | 44 | 1.0%
the | 39 | 0.8%
i | 37 | 0.8%
of | 33 | 0.7%
(value not shown in source) | 28 | 0.6%
my | 23 | 0.5%
Other values (2510) | 3950 | 86.1%
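Tokens such as "feat" and "prod" suggest the word counts are taken over lowercased text with punctuation stripped. The sketch below is one simple way to get comparable counts; it is an assumption for illustration and is not necessarily how ydata-profiling tokenizes. The titles are taken from the sample rows of this dataset.

```python
import re
from collections import Counter

def word_counts(values):
    """Count lowercase word tokens across a list of strings."""
    counter = Counter()
    for value in values:
        # \w+ is Unicode-aware for str patterns, so Hangul words are kept whole.
        counter.update(re.findall(r"\w+", value.lower()))
    return counter

titles = ["여전히 아름다운지 (Feat. 김연우)", "Fix You", "Don`t Know Why"]
counts = word_counts(titles)
print(counts.most_common(3))
```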

Most occurring characters

ValueCountFrequency (%)
2893
 
14.0%
e 1025
 
5.0%
a 816
 
4.0%
o 739
 
3.6%
t 597
 
2.9%
( 582
 
2.8%
) 582
 
2.8%
i 450
 
2.2%
n 431
 
2.1%
r 409
 
2.0%
Other values (736) 12094
58.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6945 | 33.7%
Other Letter | 5634 | 27.3%
Uppercase Letter | 3256 | 15.8%
Space Separator | 2893 | 14.0%
Open Punctuation | 582 | 2.8%
Close Punctuation | 582 | 2.8%
Other Punctuation | 524 | 2.5%
Decimal Number | 124 | 0.6%
Modifier Symbol | 48 | 0.2%
Dash Punctuation | 22 | 0.1%
Other values (4) | 8 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
176
 
3.1%
123
 
2.2%
115
 
2.0%
110
 
2.0%
103
 
1.8%
103
 
1.8%
96
 
1.7%
88
 
1.6%
85
 
1.5%
81
 
1.4%
Other values (654) 4554
80.8%
Lowercase Letter
ValueCountFrequency (%)
e 1025
14.8%
a 816
11.7%
o 739
10.6%
t 597
 
8.6%
i 450
 
6.5%
n 431
 
6.2%
r 409
 
5.9%
l 328
 
4.7%
y 256
 
3.7%
u 250
 
3.6%
Other values (16) 1644
23.7%
Uppercase Letter
ValueCountFrequency (%)
F 336
 
10.3%
L 224
 
6.9%
S 204
 
6.3%
A 192
 
5.9%
M 181
 
5.6%
O 170
 
5.2%
I 168
 
5.2%
B 165
 
5.1%
T 162
 
5.0%
E 161
 
4.9%
Other values (16) 1293
39.7%
Other Punctuation
ValueCountFrequency (%)
. 378
72.1%
, 88
 
16.8%
& 19
 
3.6%
: 11
 
2.1%
? 10
 
1.9%
! 8
 
1.5%
# 4
 
0.8%
% 3
 
0.6%
/ 2
 
0.4%
1
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 34
27.4%
2 32
25.8%
0 19
15.3%
4 11
 
8.9%
3 7
 
5.6%
7 5
 
4.0%
8 5
 
4.0%
6 4
 
3.2%
5 4
 
3.2%
9 3
 
2.4%
Math Symbol
ValueCountFrequency (%)
+ 2
66.7%
= 1
33.3%
Space Separator
ValueCountFrequency (%)
2893
100.0%
Open Punctuation
ValueCountFrequency (%)
( 582
100.0%
Close Punctuation
ValueCountFrequency (%)
) 582
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 48
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10201 | 49.5%
Hangul | 5611 | 27.2%
Common | 4783 | 23.2%
Han | 22 | 0.1%
Hiragana | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
176
 
3.1%
123
 
2.2%
115
 
2.0%
110
 
2.0%
103
 
1.8%
103
 
1.8%
96
 
1.7%
88
 
1.6%
85
 
1.5%
81
 
1.4%
Other values (634) 4531
80.8%
Latin
ValueCountFrequency (%)
e 1025
 
10.0%
a 816
 
8.0%
o 739
 
7.2%
t 597
 
5.9%
i 450
 
4.4%
n 431
 
4.2%
r 409
 
4.0%
F 336
 
3.3%
l 328
 
3.2%
y 256
 
2.5%
Other values (42) 4814
47.2%
Common
ValueCountFrequency (%)
2893
60.5%
( 582
 
12.2%
) 582
 
12.2%
. 378
 
7.9%
, 88
 
1.8%
` 48
 
1.0%
1 34
 
0.7%
2 32
 
0.7%
- 22
 
0.5%
& 19
 
0.4%
Other values (20) 105
 
2.2%
Han
ValueCountFrequency (%)
2
 
9.1%
2
 
9.1%
2
 
9.1%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
1
 
4.5%
Other values (9) 9
40.9%
Hiragana
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14980 | 72.7%
Hangul | 5611 | 27.2%
CJK | 21 | 0.1%
Punctuation | 3 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%
Hiragana | 1 | < 0.1%
Geometric Shapes | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2893
19.3%
e 1025
 
6.8%
a 816
 
5.4%
o 739
 
4.9%
t 597
 
4.0%
( 582
 
3.9%
) 582
 
3.9%
i 450
 
3.0%
n 431
 
2.9%
r 409
 
2.7%
Other values (69) 6456
43.1%
Hangul
ValueCountFrequency (%)
176
 
3.1%
123
 
2.2%
115
 
2.0%
110
 
2.0%
103
 
1.8%
103
 
1.8%
96
 
1.7%
88
 
1.6%
85
 
1.5%
81
 
1.4%
Other values (634) 4531
80.8%
CJK
ValueCountFrequency (%)
2
 
9.5%
2
 
9.5%
2
 
9.5%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (8) 8
38.1%
Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
Hiragana
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
앨범명 (album name)
Distinct: 1378
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.4 KiB

Length

Max length: 70
Median length: 42
Mean length: 14.409894
Min length: 1

Characters and Unicode

Total characters: 24468
Distinct characters: 631
Distinct categories: 17
Distinct scripts: 5
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1213
Unique (%): 71.4%

Sample

1st row: 소나기
2nd row: 정인 From Andromeda
3rd row: 신촌을 못가
4th row: Bank 선물
5th row: Atelier

Value | Count | Frequency (%)
ost | 172 | 3.2%
the | 167 | 3.1%
part | 154 | 2.9%
(value not shown in source) | 135 | 2.5%
album | 121 | 2.3%
love | 77 | 1.4%
1 | 67 | 1.2%
2 | 58 | 1.1%
mini | 56 | 1.0%
you | 51 | 0.9%
Other values (1875) | 4318 | 80.3%

Most occurring characters

ValueCountFrequency (%)
3683
 
15.1%
e 1083
 
4.4%
a 733
 
3.0%
i 691
 
2.8%
t 656
 
2.7%
r 651
 
2.7%
o 622
 
2.5%
T 608
 
2.5%
n 564
 
2.3%
S 549
 
2.2%
Other values (621) 14628
59.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8372 | 34.2%
Uppercase Letter | 5978 | 24.4%
Other Letter | 4278 | 17.5%
Space Separator | 3683 | 15.1%
Decimal Number | 1018 | 4.2%
Modifier Symbol | 301 | 1.2%
Other Punctuation | 276 | 1.1%
Close Punctuation | 190 | 0.8%
Open Punctuation | 190 | 0.8%
Dash Punctuation | 116 | 0.5%
Other values (7) | 66 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
107
 
2.5%
102
 
2.4%
92
 
2.2%
91
 
2.1%
77
 
1.8%
72
 
1.7%
69
 
1.6%
62
 
1.4%
60
 
1.4%
57
 
1.3%
Other values (523) 3489
81.6%
Lowercase Letter
ValueCountFrequency (%)
e 1083
12.9%
a 733
 
8.8%
i 691
 
8.3%
t 656
 
7.8%
r 651
 
7.8%
o 622
 
7.4%
n 564
 
6.7%
l 519
 
6.2%
u 346
 
4.1%
h 337
 
4.0%
Other values (17) 2170
25.9%
Uppercase Letter
ValueCountFrequency (%)
T 608
 
10.2%
S 549
 
9.2%
O 537
 
9.0%
E 501
 
8.4%
P 399
 
6.7%
A 395
 
6.6%
L 309
 
5.2%
M 303
 
5.1%
I 244
 
4.1%
R 238
 
4.0%
Other values (16) 1895
31.7%
Other Punctuation
ValueCountFrequency (%)
. 148
53.6%
, 33
 
12.0%
: 29
 
10.5%
& 18
 
6.5%
# 14
 
5.1%
/ 13
 
4.7%
? 10
 
3.6%
! 6
 
2.2%
; 2
 
0.7%
* 2
 
0.7%
Decimal Number
ValueCountFrequency (%)
1 305
30.0%
2 194
19.1%
3 112
 
11.0%
4 104
 
10.2%
0 72
 
7.1%
5 65
 
6.4%
6 59
 
5.8%
8 42
 
4.1%
7 40
 
3.9%
9 25
 
2.5%
Math Symbol
ValueCountFrequency (%)
= 19
52.8%
+ 7
 
19.4%
÷ 7
 
19.4%
1
 
2.8%
< 1
 
2.8%
> 1
 
2.8%
Letter Number
ValueCountFrequency (%)
5
71.4%
1
 
14.3%
1
 
14.3%
Modifier Symbol
ValueCountFrequency (%)
` 300
99.7%
˚ 1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 188
98.9%
] 2
 
1.1%
Open Punctuation
ValueCountFrequency (%)
( 188
98.9%
[ 2
 
1.1%
Final Punctuation
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
Initial Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
3683
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 116
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 10
100.0%
Other Number
ValueCountFrequency (%)
¹ 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 14352 | 58.7%
Common | 5833 | 23.8%
Hangul | 4216 | 17.2%
Han | 62 | 0.3%
Greek | 5 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
107
 
2.5%
102
 
2.4%
92
 
2.2%
91
 
2.2%
77
 
1.8%
72
 
1.7%
69
 
1.6%
62
 
1.5%
60
 
1.4%
57
 
1.4%
Other values (494) 3427
81.3%
Latin
ValueCountFrequency (%)
e 1083
 
7.5%
a 733
 
5.1%
i 691
 
4.8%
t 656
 
4.6%
r 651
 
4.5%
o 622
 
4.3%
T 608
 
4.2%
n 564
 
3.9%
S 549
 
3.8%
O 537
 
3.7%
Other values (45) 7658
53.4%
Common
ValueCountFrequency (%)
3683
63.1%
1 305
 
5.2%
` 300
 
5.1%
2 194
 
3.3%
) 188
 
3.2%
( 188
 
3.2%
. 148
 
2.5%
- 116
 
2.0%
3 112
 
1.9%
4 104
 
1.8%
Other values (32) 495
 
8.5%
Han
ValueCountFrequency (%)
11
17.7%
10
16.1%
5
 
8.1%
4
 
6.5%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
Other values (19) 20
32.3%
Greek
ValueCountFrequency (%)
χ 5
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20156 | 82.4%
Hangul | 4216 | 17.2%
CJK | 62 | 0.3%
None | 15 | 0.1%
Punctuation | 10 | < 0.1%
Number Forms | 7 | < 0.1%
Modifier Letters | 1 | < 0.1%
Letterlike Symbols | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3683
 
18.3%
e 1083
 
5.4%
a 733
 
3.6%
i 691
 
3.4%
t 656
 
3.3%
r 651
 
3.2%
o 622
 
3.1%
T 608
 
3.0%
n 564
 
2.8%
S 549
 
2.7%
Other values (75) 10316
51.2%
Hangul
ValueCountFrequency (%)
107
 
2.5%
102
 
2.4%
92
 
2.2%
91
 
2.2%
77
 
1.8%
72
 
1.7%
69
 
1.6%
62
 
1.5%
60
 
1.4%
57
 
1.4%
Other values (494) 3427
81.3%
CJK
ValueCountFrequency (%)
11
17.7%
10
16.1%
5
 
8.1%
4
 
6.5%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
2
 
3.2%
Other values (19) 20
32.3%
None
ValueCountFrequency (%)
÷ 7
46.7%
χ 5
33.3%
¹ 2
 
13.3%
1
 
6.7%
Number Forms
ValueCountFrequency (%)
5
71.4%
1
 
14.3%
1
 
14.3%
Punctuation
ValueCountFrequency (%)
5
50.0%
3
30.0%
1
 
10.0%
1
 
10.0%
Modifier Letters
ValueCountFrequency (%)
˚ 1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
아티스트명 (artist name)
Distinct: 648
Distinct (%): 38.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.4 KiB

Length

Max length: 60
Median length: 36
Mean length: 9.2526502
Min length: 2

Characters and Unicode

Total characters: 15711
Distinct characters: 411
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 393
Unique (%): 23.1%

Sample

1st row: 아이오아이 (I.O.I)
2nd row: 정인
3rd row: 포스트맨 (Postmen)
4th row: 뱅크
5th row: 지아 (Zia)

Value | Count | Frequency (%)
방탄소년단 | 40 | 1.3%
아이유 | 40 | 1.3%
iu | 39 | 1.2%
버스커 | 38 | 1.2%
busker | 38 | 1.2%
볼빨간사춘기 | 26 | 0.8%
빅뱅 | 24 | 0.8%
the | 24 | 0.8%
다비치 | 22 | 0.7%
one | 22 | 0.7%
Other values (965) | 2819 | 90.0%

Most occurring characters

ValueCountFrequency (%)
1434
 
9.1%
) 757
 
4.8%
( 757
 
4.8%
e 584
 
3.7%
a 535
 
3.4%
n 445
 
2.8%
i 419
 
2.7%
313
 
2.0%
r 302
 
1.9%
o 290
 
1.8%
Other values (401) 9875
62.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4852 | 30.9%
Lowercase Letter | 4398 | 28.0%
Uppercase Letter | 3091 | 19.7%
Space Separator | 1434 | 9.1%
Close Punctuation | 757 | 4.8%
Open Punctuation | 757 | 4.8%
Other Punctuation | 276 | 1.8%
Decimal Number | 100 | 0.6%
Dash Punctuation | 34 | 0.2%
Modifier Symbol | 9 | 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
313
 
6.5%
148
 
3.1%
91
 
1.9%
86
 
1.8%
77
 
1.6%
72
 
1.5%
65
 
1.3%
61
 
1.3%
59
 
1.2%
55
 
1.1%
Other values (329) 3825
78.8%
Uppercase Letter
ValueCountFrequency (%)
E 229
 
7.4%
N 206
 
6.7%
O 198
 
6.4%
A 195
 
6.3%
M 194
 
6.3%
I 182
 
5.9%
B 177
 
5.7%
C 176
 
5.7%
S 158
 
5.1%
H 143
 
4.6%
Other values (16) 1233
39.9%
Lowercase Letter
ValueCountFrequency (%)
e 584
13.3%
a 535
12.2%
n 445
10.1%
i 419
 
9.5%
r 302
 
6.9%
o 290
 
6.6%
l 269
 
6.1%
s 194
 
4.4%
u 157
 
3.6%
h 152
 
3.5%
Other values (15) 1051
23.9%
Decimal Number
ValueCountFrequency (%)
1 33
33.0%
0 19
19.0%
4 17
17.0%
5 14
14.0%
2 11
 
11.0%
7 3
 
3.0%
9 2
 
2.0%
8 1
 
1.0%
Other Punctuation
ValueCountFrequency (%)
, 202
73.2%
. 58
 
21.0%
& 9
 
3.3%
: 4
 
1.4%
* 2
 
0.7%
! 1
 
0.4%
Space Separator
ValueCountFrequency (%)
1434
100.0%
Close Punctuation
ValueCountFrequency (%)
) 757
100.0%
Open Punctuation
ValueCountFrequency (%)
( 757
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 9
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7489 | 47.7%
Hangul | 4851 | 30.9%
Common | 3370 | 21.4%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
313
 
6.5%
148
 
3.1%
91
 
1.9%
86
 
1.8%
77
 
1.6%
72
 
1.5%
65
 
1.3%
61
 
1.3%
59
 
1.2%
55
 
1.1%
Other values (328) 3824
78.8%
Latin
ValueCountFrequency (%)
e 584
 
7.8%
a 535
 
7.1%
n 445
 
5.9%
i 419
 
5.6%
r 302
 
4.0%
o 290
 
3.9%
l 269
 
3.6%
E 229
 
3.1%
N 206
 
2.8%
O 198
 
2.6%
Other values (41) 4012
53.6%
Common
ValueCountFrequency (%)
1434
42.6%
) 757
22.5%
( 757
22.5%
, 202
 
6.0%
. 58
 
1.7%
- 34
 
1.0%
1 33
 
1.0%
0 19
 
0.6%
4 17
 
0.5%
5 14
 
0.4%
Other values (11) 45
 
1.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10857 | 69.1%
Hangul | 4851 | 30.9%
Misc Symbols | 2 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1434
 
13.2%
) 757
 
7.0%
( 757
 
7.0%
e 584
 
5.4%
a 535
 
4.9%
n 445
 
4.1%
i 419
 
3.9%
r 302
 
2.8%
o 290
 
2.7%
l 269
 
2.5%
Other values (61) 5065
46.7%
Hangul
ValueCountFrequency (%)
313
 
6.5%
148
 
3.1%
91
 
1.9%
86
 
1.8%
77
 
1.6%
72
 
1.5%
65
 
1.3%
61
 
1.3%
59
 
1.2%
55
 
1.1%
Other values (328) 3824
78.8%
Misc Symbols
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
100.0%
대리중개사명 (agency/intermediary name)
Distinct: 51
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.4 KiB

Length

Max length: 31
Median length: 24
Mean length: 9.5147232
Min length: 2

Characters and Unicode

Total characters: 16156
Distinct characters: 132
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 1.1%

Sample

1st row: 로엔엔터테인먼트
2nd row: CJ E&M
3rd row: 로엔엔터테인먼트
4th row: Universal Music
5th row: 로엔엔터테인먼트

Value | Count | Frequency (%)
music | 483 | 16.2%
카카오 | 429 | 14.4%
m | 429 | 14.4%
지니뮤직 | 264 | 8.8%
entertainment | 238 | 8.0%
stone | 234 | 7.8%
로엔엔터테인먼트 | 156 | 5.2%
universal | 132 | 4.4%
dreamus | 72 | 2.4%
cj | 69 | 2.3%
Other values (52) | 483 | 16.2%

Most occurring characters

ValueCountFrequency (%)
1291
 
8.0%
n 1195
 
7.4%
M 987
 
6.1%
e 982
 
6.1%
t 948
 
5.9%
858
 
5.3%
i 854
 
5.3%
s 687
 
4.3%
r 566
 
3.5%
u 554
 
3.4%
Other values (122) 7234
44.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 7697 | 47.6%
Other Letter | 4337 | 26.8%
Uppercase Letter | 2727 | 16.9%
Space Separator | 1291 | 8.0%
Other Punctuation | 104 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
858
19.8%
435
10.0%
337
 
7.8%
306
 
7.1%
303
 
7.0%
268
 
6.2%
267
 
6.2%
194
 
4.5%
187
 
4.3%
187
 
4.3%
Other values (82) 995
22.9%
Uppercase Letter
ValueCountFrequency (%)
M 987
36.2%
E 371
 
13.6%
S 326
 
12.0%
U 160
 
5.9%
I 128
 
4.7%
R 127
 
4.7%
N 116
 
4.3%
C 72
 
2.6%
D 72
 
2.6%
J 69
 
2.5%
Other values (10) 299
 
11.0%
Lowercase Letter
ValueCountFrequency (%)
n 1195
15.5%
e 982
12.8%
t 948
12.3%
i 854
11.1%
s 687
8.9%
r 566
7.4%
u 554
7.2%
a 505
6.6%
c 484
6.3%
m 310
 
4.0%
Other values (7) 612
8.0%
Other Punctuation
ValueCountFrequency (%)
& 70
67.3%
, 34
32.7%
Space Separator
ValueCountFrequency (%)
1291
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10424 | 64.5%
Hangul | 4337 | 26.8%
Common | 1395 | 8.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
858
19.8%
435
10.0%
337
 
7.8%
306
 
7.1%
303
 
7.0%
268
 
6.2%
267
 
6.2%
194
 
4.5%
187
 
4.3%
187
 
4.3%
Other values (82) 995
22.9%
Latin
ValueCountFrequency (%)
n 1195
11.5%
M 987
 
9.5%
e 982
 
9.4%
t 948
 
9.1%
i 854
 
8.2%
s 687
 
6.6%
r 566
 
5.4%
u 554
 
5.3%
a 505
 
4.8%
c 484
 
4.6%
Other values (27) 2662
25.5%
Common
ValueCountFrequency (%)
1291
92.5%
& 70
 
5.0%
, 34
 
2.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11819 | 73.2%
Hangul | 4337 | 26.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1291
 
10.9%
n 1195
 
10.1%
M 987
 
8.4%
e 982
 
8.3%
t 948
 
8.0%
i 854
 
7.2%
s 687
 
5.8%
r 566
 
4.8%
u 554
 
4.7%
a 505
 
4.3%
Other values (30) 3250
27.5%
Hangul
ValueCountFrequency (%)
858
19.8%
435
10.0%
337
 
7.8%
306
 
7.1%
303
 
7.0%
268
 
6.2%
267
 
6.2%
194
 
4.5%
187
 
4.3%
187
 
4.3%
Other values (82) 995
22.9%
제작사명 (production company name)
Distinct: 388
Distinct (%): 22.9%
Missing: 0
Missing (%): 0.0%
Memory size: 13.4 KiB

Length

Max length: 60
Median length: 44
Mean length: 13.18669
Min length: 3

Characters and Unicode

Total characters: 22391
Distinct characters: 316
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 11.7%

Sample

1st row: YMC엔터테인먼트
2nd row: 정글엔터테인먼트
3rd row: Good fellas Entertainment
4th row: Universal Music
5th row: 로엔엔터테인먼트

Value | Count | Frequency (%)
entertainment | 461 | 14.1%
music | 250 | 7.6%
stone | 179 | 5.5%
yg | 115 | 3.5%
엔터테인먼트 | 105 | 3.2%
sm | 105 | 3.2%
records | 95 | 2.9%
jyp | 43 | 1.3%
빅히트 | 43 | 1.3%
m | 41 | 1.3%
Other values (456) | 1840 | 56.1%

Most occurring characters

ValueCountFrequency (%)
n 1762
 
7.9%
t 1677
 
7.5%
1579
 
7.1%
e 1426
 
6.4%
i 914
 
4.1%
r 742
 
3.3%
a 641
 
2.9%
M 597
 
2.7%
E 584
 
2.6%
560
 
2.5%
Other values (306) 11909
53.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10099 | 45.1%
Other Letter | 6819 | 30.5%
Uppercase Letter | 3493 | 15.6%
Space Separator | 1579 | 7.1%
Other Punctuation | 339 | 1.5%
Decimal Number | 43 | 0.2%
Close Punctuation | 7 | < 0.1%
Open Punctuation | 6 | < 0.1%
Dash Punctuation | 3 | < 0.1%
Currency Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
560
 
8.2%
494
 
7.2%
493
 
7.2%
465
 
6.8%
463
 
6.8%
460
 
6.7%
265
 
3.9%
172
 
2.5%
169
 
2.5%
168
 
2.5%
Other values (233) 3110
45.6%
Uppercase Letter
ValueCountFrequency (%)
M 597
17.1%
E 584
16.7%
S 398
11.4%
R 205
 
5.9%
Y 204
 
5.8%
G 179
 
5.1%
C 158
 
4.5%
I 129
 
3.7%
N 108
 
3.1%
T 101
 
2.9%
Other values (16) 830
23.8%
Lowercase Letter
ValueCountFrequency (%)
n 1762
17.4%
t 1677
16.6%
e 1426
14.1%
i 914
9.1%
r 742
7.3%
a 641
 
6.3%
s 516
 
5.1%
o 493
 
4.9%
m 487
 
4.8%
c 470
 
4.7%
Other values (15) 971
9.6%
Decimal Number
ValueCountFrequency (%)
2 13
30.2%
1 8
18.6%
0 6
14.0%
9 3
 
7.0%
8 3
 
7.0%
4 2
 
4.7%
3 2
 
4.7%
6 2
 
4.7%
7 2
 
4.7%
5 2
 
4.7%
Other Punctuation
ValueCountFrequency (%)
, 242
71.4%
& 56
 
16.5%
. 22
 
6.5%
/ 14
 
4.1%
! 3
 
0.9%
: 2
 
0.6%
Space Separator
ValueCountFrequency (%)
1579
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 13592 | 60.7%
Hangul | 6819 | 30.5%
Common | 1979 | 8.8%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
560
 
8.2%
494
 
7.2%
493
 
7.2%
465
 
6.8%
463
 
6.8%
460
 
6.7%
265
 
3.9%
172
 
2.5%
169
 
2.5%
168
 
2.5%
Other values (233) 3110
45.6%
Latin
ValueCountFrequency (%)
n 1762
13.0%
t 1677
 
12.3%
e 1426
 
10.5%
i 914
 
6.7%
r 742
 
5.5%
a 641
 
4.7%
M 597
 
4.4%
E 584
 
4.3%
s 516
 
3.8%
o 493
 
3.6%
Other values (41) 4240
31.2%
Common
ValueCountFrequency (%)
1579
79.8%
, 242
 
12.2%
& 56
 
2.8%
. 22
 
1.1%
/ 14
 
0.7%
2 13
 
0.7%
1 8
 
0.4%
) 7
 
0.4%
( 6
 
0.3%
0 6
 
0.3%
Other values (11) 26
 
1.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15571 | 69.5%
Hangul | 6818 | 30.4%
None | 1 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1762
 
11.3%
t 1677
 
10.8%
1579
 
10.1%
e 1426
 
9.2%
i 914
 
5.9%
r 742
 
4.8%
a 641
 
4.1%
M 597
 
3.8%
E 584
 
3.8%
s 516
 
3.3%
Other values (62) 5133
33.0%
Hangul
ValueCountFrequency (%)
560
 
8.2%
494
 
7.2%
493
 
7.2%
465
 
6.8%
463
 
6.8%
460
 
6.7%
265
 
3.9%
172
 
2.5%
169
 
2.5%
168
 
2.5%
Other values (232) 3109
45.6%
None
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
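A nullity matrix is a grid of booleans, one per cell, marking where a value is missing. The sketch below builds one from plain dictionaries. The first two rows come from this dataset's sample; the third row is hypothetical and is added only so that a gap is visible, since the dataset itself reports zero missing cells. Function names are illustrative.

```python
COLUMNS = ["저작물명", "앨범명", "아티스트명", "대리중개사명", "제작사명"]

def nullity_matrix(rows, columns):
    """Return one list of booleans per row; True marks a missing cell."""
    return [[row.get(col) is None for col in columns] for row in rows]

def missing_per_column(rows, columns):
    """Count missing cells in each column."""
    matrix = nullity_matrix(rows, columns)
    return {col: sum(flags[i] for flags in matrix) for i, col in enumerate(columns)}

rows = [
    dict(zip(COLUMNS, ["소나기", "소나기", "아이오아이 (I.O.I)", "로엔엔터테인먼트", "YMC엔터테인먼트"])),
    dict(zip(COLUMNS, ["Fix You", "X & Y", "Coldplay", "Warner Music", "EMI"])),
    # Hypothetical record with no production company, for illustration only.
    dict(zip(COLUMNS, ["(hypothetical title)", "(hypothetical album)",
                       "(hypothetical artist)", "(hypothetical agency)", None])),
]

matrix = nullity_matrix(rows, COLUMNS)
missing = missing_per_column(rows, COLUMNS)
print(missing)
```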

Sample

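First rows

Index | 저작물명 (work title) | 앨범명 (album name) | 아티스트명 (artist name) | 대리중개사명 (agency/intermediary name) | 제작사명 (production company name)
0 | 소나기 | 소나기 | 아이오아이 (I.O.I) | 로엔엔터테인먼트 | YMC엔터테인먼트
1 | 미워요 | 정인 From Andromeda | 정인 | CJ E&M | 정글엔터테인먼트
2 | 신촌을 못가 | 신촌을 못가 | 포스트맨 (Postmen) | 로엔엔터테인먼트 | Good fellas Entertainment
3 | 가질수 없는 너 | Bank 선물 | 뱅크 | Universal Music | Universal Music
4 | 술 한잔 해요 | Atelier | 지아 (Zia) | 로엔엔터테인먼트 | 로엔엔터테인먼트
5 | 눈의 꽃 | 미안하다 사랑한다 OST | 박효신 | 오감엔터테인먼트 | 스펀지엔터테인먼트
6 | 지우개 | 지우개 | 알리 (Ali) | 로엔엔터테인먼트 | 예당컴퍼니
7 | 오르막길 | 2012 월간 윤종신 6월호 | 정인, 윤종신 | 미러볼뮤직 | 미스틱 엔터테인먼트
8 | 그녀를 찾아주세요 | This Is The Name | 더 네임 (The Name) | Warner Music | Warner Music
9 | 걱정말아요 그대 | 응답하라 1988 OST Part 2 | 이적 | CJ E&M | CJ E&M, 쿵엔터테인먼트

Last rows

Index | 저작물명 (work title) | 앨범명 (album name) | 아티스트명 (artist name) | 대리중개사명 (agency/intermediary name) | 제작사명 (production company name)
1688 | 서쪽 하늘 | 청연 OST | 이승철 | 한국음반산업협회 | 아이에스엔터미디어그룹
1689 | 광대 | 3집 Library Of Soul | 리쌍, BMK | 위지스 | 제이엔터컴
1690 | Fix You | X & Y | Coldplay | Warner Music | EMI
1691 | 인연 (동녘바람) | 사춘기 | 이선희 | 후크엔터테인먼트 | 후크엔터테인먼트
1692 | 바람이 분다 | 6집 눈썹달 | 이소라 | IS MUSIC | 아인스디지탈
1693 | Sunday Morning | Sunday Morning | Maroon 5 | Universal Music | J Records
1694 | 너에게 쓰는 편지 | 1집 180 Degree | MC 몽 | Stone Music Entertainment | M.A 엔터테인먼트
1695 | The Scientist | A Rush Of Blood To The Head | Coldplay | Warner Music | Warner Music
1696 | Don`t Know Why | Come Away With Me | Norah Jones | Warner Music | Blue Note
1697 | 여전히 아름다운지 (Feat. 김연우) | A Night In Seoul | 토이 (Toy) | 삼성뮤직 | Orange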