gimi9 Pandas Profiling

Dataset statistics

Number of variables	5
Number of observations	1698
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	66.5 KiB
Average record size in memory	40.1 B
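
The size figures above are mutually consistent, which a few lines of arithmetic can confirm. This is only a sanity-check sketch: it assumes the average record size is the total size divided by the number of observations, and that 1 KiB is 1024 bytes. The variable names are illustrative and are not part of the report.

```python
# Values copied from the "Dataset statistics" table above.
KIB = 1024                      # assumption: 1 KiB = 1024 bytes
n_variables = 5
n_observations = 1698
missing_cells = 0
total_size_bytes = 66.5 * KIB   # the report rounds this to one decimal place

# Assumed relationship: average record size = total size / number of rows.
avg_record_bytes = total_size_bytes / n_observations

# Missing cells as a share of all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"{avg_record_bytes:.1f} B per record, {missing_pct:.1f}% missing")
```

Because the reported total is itself rounded, the result should be read as approximate.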

Variable types

Text	5

Dataset

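Description	Digital music chart information used to select priority-protection works for the online illegal-copy monitoring carried out by the Korea Copyright Protection Agency
Author	Korea Copyright Protection Agency ((재)한국저작권보호원)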
URL	https://www.data.go.kr/data/15071046/fileData.do

Reproduction

Analysis started	2023-12-12 13:22:16.880290
Analysis finished	2023-12-12 13:22:18.090607
Duration	1.21 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

저작물명 (work title)
Text

Distinct	1658
Distinct (%)	97.6%
Missing	0
Missing (%)	0.0%
Memory size	13.4 KiB

Length

Max length	90
Median length	54
Mean length	12.142521
Min length	1

Characters and Unicode

Total characters	20618
Distinct characters	746
Distinct categories	14
Distinct scripts	5
Distinct blocks	7

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
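
As a rough illustration of how such per-category counts can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the report generator's own code; the function name `category_counts` is hypothetical, and the input is just the five sample rows of this variable. The standard library returns two-letter codes, which correspond to the long names used in this report (for example `Lo` is Other Letter, `Zs` is Space Separator, `Sk` is Modifier Symbol).

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            # unicodedata.category returns a two-letter code such as "Lo" or "Zs".
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample rows shown for this variable.
sample = ["소나기", "미워요", "신촌을 못가", "가질수 없는 너", "술 한잔 해요"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter, which is why that category ranks so high for Korean titles. Script and block assignments are not exposed by `unicodedata`, so those tables would need a different data source.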

Unique

Unique	1620
Unique (%)	95.4%
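
The difference between "Distinct" and "Unique" is easy to miss. The sketch below assumes that distinct means the number of different values, that unique means the number of values occurring exactly once, and that both percentages are taken relative to the number of observations; the reported figures are consistent with that reading (1658 / 1698 is about 97.6%, 1620 / 1698 is about 95.4%). The helper name `distinct_and_unique` is hypothetical, and the input is the five sample rows of the 대리중개사명 column, chosen because they contain a repeated value.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return distinct and unique counts with percentages of the row count."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)                                # different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return {
        "distinct": distinct,
        "distinct_pct": round(100 * distinct / n, 1),
        "unique": unique,
        "unique_pct": round(100 * unique / n, 1),
    }

# Sample rows of the 대리중개사명 column from this report.
sample = ["로엔엔터테인먼트", "CJ E&M", "로엔엔터테인먼트", "Universal Music", "로엔엔터테인먼트"]
stats = distinct_and_unique(sample)
print(stats)
```

Here three different values appear, but only two of them occur once, so the unique count is lower than the distinct count.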

Sample

1st row	소나기
2nd row	미워요
3rd row	신촌을 못가
4th row	가질수 없는 너
5th row	술 한잔 해요

Value	Count	Frequency (%)
feat	260	5.7%
you	73	1.6%
love	53	1.2%
prod	49	1.1%
me	44	1.0%
the	39	0.8%
i	37	0.8%
of	33	0.7%
	28	0.6%
my	23	0.5%
Other values (2510)	3950	86.1%

Most occurring characters

Value	Count	Frequency (%)
	2893	14.0%
e	1025	5.0%
a	816	4.0%
o	739	3.6%
t	597	2.9%
(	582	2.8%
)	582	2.8%
i	450	2.2%
n	431	2.1%
r	409	2.0%
Other values (736)	12094	58.7%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	6945	33.7%
Other Letter	5634	27.3%
Uppercase Letter	3256	15.8%
Space Separator	2893	14.0%
Open Punctuation	582	2.8%
Close Punctuation	582	2.8%
Other Punctuation	524	2.5%
Decimal Number	124	0.6%
Modifier Symbol	48	0.2%
Dash Punctuation	22	0.1%
Other values (4)	8	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
이	176	3.1%
사	123	2.2%
다	115	2.0%
나	110	2.0%
아	103	1.8%
리	103	1.8%
가	96	1.7%
지	88	1.6%
그	85	1.5%
랑	81	1.4%
Other values (654)	4554	80.8%

Lowercase Letter

Value	Count	Frequency (%)
e	1025	14.8%
a	816	11.7%
o	739	10.6%
t	597	8.6%
i	450	6.5%
n	431	6.2%
r	409	5.9%
l	328	4.7%
y	256	3.7%
u	250	3.6%
Other values (16)	1644	23.7%

Uppercase Letter

Value	Count	Frequency (%)
F	336	10.3%
L	224	6.9%
S	204	6.3%
A	192	5.9%
M	181	5.6%
O	170	5.2%
I	168	5.2%
B	165	5.1%
T	162	5.0%
E	161	4.9%
Other values (16)	1293	39.7%

Other Punctuation

Value	Count	Frequency (%)
.	378	72.1%
,	88	16.8%
&	19	3.6%
:	11	2.1%
?	10	1.9%
!	8	1.5%
#	4	0.8%
%	3	0.6%
/	2	0.4%
…	1	0.2%

Decimal Number

Value	Count	Frequency (%)
1	34	27.4%
2	32	25.8%
0	19	15.3%
4	11	8.9%
3	7	5.6%
7	5	4.0%
8	5	4.0%
6	4	3.2%
5	4	3.2%
9	3	2.4%

Math Symbol

Value	Count	Frequency (%)
+	2	66.7%
=	1	33.3%

Space Separator

Value	Count	Frequency (%)
	2893	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	582	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	582	100.0%

Modifier Symbol

Value	Count	Frequency (%)
`	48	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	22	100.0%

Currency Symbol

Value	Count	Frequency (%)
$	2	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	2	100.0%

Other Symbol

Value	Count	Frequency (%)
◑	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	10201	49.5%
Hangul	5611	27.2%
Common	4783	23.2%
Han	22	0.1%
Hiragana	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
이	176	3.1%
사	123	2.2%
다	115	2.0%
나	110	2.0%
아	103	1.8%
리	103	1.8%
가	96	1.7%
지	88	1.6%
그	85	1.5%
랑	81	1.4%
Other values (634)	4531	80.8%

Latin

Value	Count	Frequency (%)
e	1025	10.0%
a	816	8.0%
o	739	7.2%
t	597	5.9%
i	450	4.4%
n	431	4.2%
r	409	4.0%
F	336	3.3%
l	328	3.2%
y	256	2.5%
Other values (42)	4814	47.2%

Common

Value	Count	Frequency (%)
	2893	60.5%
(	582	12.2%
)	582	12.2%
.	378	7.9%
,	88	1.8%
`	48	1.0%
1	34	0.7%
2	32	0.7%
-	22	0.5%
&	19	0.4%
Other values (20)	105	2.2%

Han

Value	Count	Frequency (%)
美	2	9.1%
人	2	9.1%
花	2	9.1%
承	1	4.5%
轉	1	4.5%
上	1	4.5%
海	1	4.5%
之	1	4.5%
戀	1	4.5%
情	1	4.5%
Other values (9)	9	40.9%

Hiragana

Value	Count	Frequency (%)
の	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	14980	72.7%
Hangul	5611	27.2%
CJK	21	0.1%
Punctuation	3	< 0.1%
CJK Compat Ideographs	1	< 0.1%
Hiragana	1	< 0.1%
Geometric Shapes	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	2893	19.3%
e	1025	6.8%
a	816	5.4%
o	739	4.9%
t	597	4.0%
(	582	3.9%
)	582	3.9%
i	450	3.0%
n	431	2.9%
r	409	2.7%
Other values (69)	6456	43.1%

Hangul

Value	Count	Frequency (%)
이	176	3.1%
사	123	2.2%
다	115	2.0%
나	110	2.0%
아	103	1.8%
리	103	1.8%
가	96	1.7%
지	88	1.6%
그	85	1.5%
랑	81	1.4%
Other values (634)	4531	80.8%

CJK

Value	Count	Frequency (%)
美	2	9.5%
人	2	9.5%
花	2	9.5%
承	1	4.8%
轉	1	4.8%
上	1	4.8%
海	1	4.8%
之	1	4.8%
情	1	4.8%
女	1	4.8%
Other values (8)	8	38.1%

Punctuation

Value	Count	Frequency (%)
’	2	66.7%
…	1	33.3%

CJK Compat Ideographs

Value	Count	Frequency (%)
戀	1	100.0%

Hiragana

Value	Count	Frequency (%)
の	1	100.0%

Geometric Shapes

Value	Count	Frequency (%)
◑	1	100.0%

앨범명 (album name)
Text

Distinct	1378
Distinct (%)	81.2%
Missing	0
Missing (%)	0.0%
Memory size	13.4 KiB

Length

Max length	70
Median length	42
Mean length	14.409894
Min length	1

Characters and Unicode

Total characters	24468
Distinct characters	631
Distinct categories	17
Distinct scripts	5
Distinct blocks	8

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	1213
Unique (%)	71.4%

Sample

1st row	소나기
2nd row	정인 From Andromeda
3rd row	신촌을 못가
4th row	Bank 선물
5th row	Atelier

Value	Count	Frequency (%)
ost	172	3.2%
the	167	3.1%
part	154	2.9%
	135	2.5%
album	121	2.3%
love	77	1.4%
1	67	1.2%
2	58	1.1%
mini	56	1.0%
you	51	0.9%
Other values (1875)	4318	80.3%

Most occurring characters

Value	Count	Frequency (%)
	3683	15.1%
e	1083	4.4%
a	733	3.0%
i	691	2.8%
t	656	2.7%
r	651	2.7%
o	622	2.5%
T	608	2.5%
n	564	2.3%
S	549	2.2%
Other values (621)	14628	59.8%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	8372	34.2%
Uppercase Letter	5978	24.4%
Other Letter	4278	17.5%
Space Separator	3683	15.1%
Decimal Number	1018	4.2%
Modifier Symbol	301	1.2%
Other Punctuation	276	1.1%
Close Punctuation	190	0.8%
Open Punctuation	190	0.8%
Dash Punctuation	116	0.5%
Other values (7)	66	0.3%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
집	107	2.5%
이	102	2.4%
스	92	2.2%
가	91	2.1%
의	77	1.8%
다	72	1.7%
사	69	1.6%
나	62	1.4%
고	60	1.4%
리	57	1.3%
Other values (523)	3489	81.6%

Lowercase Letter

Value	Count	Frequency (%)
e	1083	12.9%
a	733	8.8%
i	691	8.3%
t	656	7.8%
r	651	7.8%
o	622	7.4%
n	564	6.7%
l	519	6.2%
u	346	4.1%
h	337	4.0%
Other values (17)	2170	25.9%

Uppercase Letter

Value	Count	Frequency (%)
T	608	10.2%
S	549	9.2%
O	537	9.0%
E	501	8.4%
P	399	6.7%
A	395	6.6%
L	309	5.2%
M	303	5.1%
I	244	4.1%
R	238	4.0%
Other values (16)	1895	31.7%

Other Punctuation

Value	Count	Frequency (%)
.	148	53.6%
,	33	12.0%
:	29	10.5%
&	18	6.5%
#	14	5.1%
/	13	4.7%
?	10	3.6%
!	6	2.2%
;	2	0.7%
*	2	0.7%

Decimal Number

Value	Count	Frequency (%)
1	305	30.0%
2	194	19.1%
3	112	11.0%
4	104	10.2%
0	72	7.1%
5	65	6.4%
6	59	5.8%
8	42	4.1%
7	40	3.9%
9	25	2.5%

Math Symbol

Value	Count	Frequency (%)
=	19	52.8%
+	7	19.4%
÷	7	19.4%
｜	1	2.8%
<	1	2.8%
>	1	2.8%

Letter Number

Value	Count	Frequency (%)
Ⅰ	5	71.4%
Ⅲ	1	14.3%
Ⅱ	1	14.3%

Modifier Symbol

Value	Count	Frequency (%)
`	300	99.7%
˚	1	0.3%

Close Punctuation

Value	Count	Frequency (%)
)	188	98.9%
]	2	1.1%

Open Punctuation

Value	Count	Frequency (%)
(	188	98.9%
[	2	1.1%

Final Punctuation

Value	Count	Frequency (%)
’	5	83.3%
”	1	16.7%

Initial Punctuation

Value	Count	Frequency (%)
‘	3	75.0%
“	1	25.0%

Space Separator

Value	Count	Frequency (%)
	3683	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	116	100.0%

Connector Punctuation

Value	Count	Frequency (%)
_	10	100.0%

Other Number

Value	Count	Frequency (%)
¹	2	100.0%

Other Symbol

Value	Count	Frequency (%)
℃	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	14352	58.7%
Common	5833	23.8%
Hangul	4216	17.2%
Han	62	0.3%
Greek	5	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
집	107	2.5%
이	102	2.4%
스	92	2.2%
가	91	2.2%
의	77	1.8%
다	72	1.7%
사	69	1.6%
나	62	1.5%
고	60	1.4%
리	57	1.4%
Other values (494)	3427	81.3%

Latin

Value	Count	Frequency (%)
e	1083	7.5%
a	733	5.1%
i	691	4.8%
t	656	4.6%
r	651	4.5%
o	622	4.3%
T	608	4.2%
n	564	3.9%
S	549	3.8%
O	537	3.7%
Other values (45)	7658	53.4%

Common

Value	Count	Frequency (%)
	3683	63.1%
1	305	5.2%
`	300	5.1%
2	194	3.3%
)	188	3.2%
(	188	3.2%
.	148	2.5%
-	116	2.0%
3	112	1.9%
4	104	1.8%
Other values (32)	495	8.5%

Han

Value	Count	Frequency (%)
轉	11	17.7%
結	10	16.1%
甲	5	8.1%
承	4	6.5%
之	2	3.2%
記	2	3.2%
春	2	3.2%
思	2	3.2%
上	2	3.2%
戀	2	3.2%
Other values (19)	20	32.3%

Greek

Value	Count	Frequency (%)
χ	5	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	20156	82.4%
Hangul	4216	17.2%
CJK	62	0.3%
None	15	0.1%
Punctuation	10	< 0.1%
Number Forms	7	< 0.1%
Modifier Letters	1	< 0.1%
Letterlike Symbols	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	3683	18.3%
e	1083	5.4%
a	733	3.6%
i	691	3.4%
t	656	3.3%
r	651	3.2%
o	622	3.1%
T	608	3.0%
n	564	2.8%
S	549	2.7%
Other values (75)	10316	51.2%

Hangul

Value	Count	Frequency (%)
집	107	2.5%
이	102	2.4%
스	92	2.2%
가	91	2.2%
의	77	1.8%
다	72	1.7%
사	69	1.6%
나	62	1.5%
고	60	1.4%
리	57	1.4%
Other values (494)	3427	81.3%

CJK

Value	Count	Frequency (%)
轉	11	17.7%
結	10	16.1%
甲	5	8.1%
承	4	6.5%
之	2	3.2%
記	2	3.2%
春	2	3.2%
思	2	3.2%
上	2	3.2%
戀	2	3.2%
Other values (19)	20	32.3%

None

Value	Count	Frequency (%)
÷	7	46.7%
χ	5	33.3%
¹	2	13.3%
｜	1	6.7%

Number Forms

Value	Count	Frequency (%)
Ⅰ	5	71.4%
Ⅲ	1	14.3%
Ⅱ	1	14.3%

Punctuation

Value	Count	Frequency (%)
’	5	50.0%
‘	3	30.0%
“	1	10.0%
”	1	10.0%

Modifier Letters

Value	Count	Frequency (%)
˚	1	100.0%

Letterlike Symbols

Value	Count	Frequency (%)
℃	1	100.0%

아티스트명 (artist name)
Text

Distinct	648
Distinct (%)	38.2%
Missing	0
Missing (%)	0.0%
Memory size	13.4 KiB

Length

Max length	60
Median length	36
Mean length	9.2526502
Min length	2

Characters and Unicode

Total characters	15711
Distinct characters	411
Distinct categories	12
Distinct scripts	4
Distinct blocks	4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	393
Unique (%)	23.1%

Sample

1st row	아이오아이 (I.O.I)
2nd row	정인
3rd row	포스트맨 (Postmen)
4th row	뱅크
5th row	지아 (Zia)

Value	Count	Frequency (%)
방탄소년단	40	1.3%
아이유	40	1.3%
iu	39	1.2%
버스커	38	1.2%
busker	38	1.2%
볼빨간사춘기	26	0.8%
빅뱅	24	0.8%
the	24	0.8%
다비치	22	0.7%
one	22	0.7%
Other values (965)	2819	90.0%

Most occurring characters

Value	Count	Frequency (%)
	1434	9.1%
)	757	4.8%
(	757	4.8%
e	584	3.7%
a	535	3.4%
n	445	2.8%
i	419	2.7%
이	313	2.0%
r	302	1.9%
o	290	1.8%
Other values (401)	9875	62.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	4852	30.9%
Lowercase Letter	4398	28.0%
Uppercase Letter	3091	19.7%
Space Separator	1434	9.1%
Close Punctuation	757	4.8%
Open Punctuation	757	4.8%
Other Punctuation	276	1.8%
Decimal Number	100	0.6%
Dash Punctuation	34	0.2%
Modifier Symbol	9	0.1%
Other values (2)	3	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
이	313	6.5%
스	148	3.1%
아	91	1.9%
지	86	1.8%
비	77	1.6%
소	72	1.5%
유	65	1.3%
김	61	1.3%
라	59	1.2%
정	55	1.1%
Other values (329)	3825	78.8%

Uppercase Letter

Value	Count	Frequency (%)
E	229	7.4%
N	206	6.7%
O	198	6.4%
A	195	6.3%
M	194	6.3%
I	182	5.9%
B	177	5.7%
C	176	5.7%
S	158	5.1%
H	143	4.6%
Other values (16)	1233	39.9%

Lowercase Letter

Value	Count	Frequency (%)
e	584	13.3%
a	535	12.2%
n	445	10.1%
i	419	9.5%
r	302	6.9%
o	290	6.6%
l	269	6.1%
s	194	4.4%
u	157	3.6%
h	152	3.5%
Other values (15)	1051	23.9%

Decimal Number

Value	Count	Frequency (%)
1	33	33.0%
0	19	19.0%
4	17	17.0%
5	14	14.0%
2	11	11.0%
7	3	3.0%
9	2	2.0%
8	1	1.0%

Other Punctuation

Value	Count	Frequency (%)
,	202	73.2%
.	58	21.0%
&	9	3.3%
:	4	1.4%
*	2	0.7%
!	1	0.4%

Space Separator

Value	Count	Frequency (%)
	1434	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	757	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	757	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	34	100.0%

Modifier Symbol

Value	Count	Frequency (%)
`	9	100.0%

Other Symbol

Value	Count	Frequency (%)
★	2	100.0%

Currency Symbol

Value	Count	Frequency (%)
$	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	7489	47.7%
Hangul	4851	30.9%
Common	3370	21.4%
Han	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
이	313	6.5%
스	148	3.1%
아	91	1.9%
지	86	1.8%
비	77	1.6%
소	72	1.5%
유	65	1.3%
김	61	1.3%
라	59	1.2%
정	55	1.1%
Other values (328)	3824	78.8%

Latin

Value	Count	Frequency (%)
e	584	7.8%
a	535	7.1%
n	445	5.9%
i	419	5.6%
r	302	4.0%
o	290	3.9%
l	269	3.6%
E	229	3.1%
N	206	2.8%
O	198	2.6%
Other values (41)	4012	53.6%

Common

Value	Count	Frequency (%)
	1434	42.6%
)	757	22.5%
(	757	22.5%
,	202	6.0%
.	58	1.7%
-	34	1.0%
1	33	1.0%
0	19	0.6%
4	17	0.5%
5	14	0.4%
Other values (11)	45	1.3%

Han

Value	Count	Frequency (%)
美	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	10857	69.1%
Hangul	4851	30.9%
Misc Symbols	2	< 0.1%
CJK	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	1434	13.2%
)	757	7.0%
(	757	7.0%
e	584	5.4%
a	535	4.9%
n	445	4.1%
i	419	3.9%
r	302	2.8%
o	290	2.7%
l	269	2.5%
Other values (61)	5065	46.7%

Hangul

Value	Count	Frequency (%)
이	313	6.5%
스	148	3.1%
아	91	1.9%
지	86	1.8%
비	77	1.6%
소	72	1.5%
유	65	1.3%
김	61	1.3%
라	59	1.2%
정	55	1.1%
Other values (328)	3824	78.8%

Misc Symbols

Value	Count	Frequency (%)
★	2	100.0%

CJK

Value	Count	Frequency (%)
美	1	100.0%

대리중개사명 (agency/broker name)
Text

Distinct	51
Distinct (%)	3.0%
Missing	0
Missing (%)	0.0%
Memory size	13.4 KiB

Length

Max length	31
Median length	24
Mean length	9.5147232
Min length	2

Characters and Unicode

Total characters	16156
Distinct characters	132
Distinct categories	5
Distinct scripts	3
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	18
Unique (%)	1.1%

Sample

1st row	로엔엔터테인먼트
2nd row	CJ E&M
3rd row	로엔엔터테인먼트
4th row	Universal Music
5th row	로엔엔터테인먼트

Value	Count	Frequency (%)
music	483	16.2%
카카오	429	14.4%
m	429	14.4%
지니뮤직	264	8.8%
entertainment	238	8.0%
stone	234	7.8%
로엔엔터테인먼트	156	5.2%
universal	132	4.4%
dreamus	72	2.4%
cj	69	2.3%
Other values (52)	483	16.2%

Most occurring characters

Value	Count	Frequency (%)
	1291	8.0%
n	1195	7.4%
M	987	6.1%
e	982	6.1%
t	948	5.9%
카	858	5.3%
i	854	5.3%
s	687	4.3%
r	566	3.5%
u	554	3.4%
Other values (122)	7234	44.8%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	7697	47.6%
Other Letter	4337	26.8%
Uppercase Letter	2727	16.9%
Space Separator	1291	8.0%
Other Punctuation	104	0.6%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
카	858	19.8%
오	435	10.0%
엔	337	7.8%
뮤	306	7.1%
직	303	7.0%
니	268	6.2%
지	267	6.2%
인	194	4.5%
트	187	4.3%
터	187	4.3%
Other values (82)	995	22.9%

Uppercase Letter

Value	Count	Frequency (%)
M	987	36.2%
E	371	13.6%
S	326	12.0%
U	160	5.9%
I	128	4.7%
R	127	4.7%
N	116	4.3%
C	72	2.6%
D	72	2.6%
J	69	2.5%
Other values (10)	299	11.0%

Lowercase Letter

Value	Count	Frequency (%)
n	1195	15.5%
e	982	12.8%
t	948	12.3%
i	854	11.1%
s	687	8.9%
r	566	7.4%
u	554	7.2%
a	505	6.6%
c	484	6.3%
m	310	4.0%
Other values (7)	612	8.0%

Other Punctuation

Value	Count	Frequency (%)
&	70	67.3%
,	34	32.7%

Space Separator

Value	Count	Frequency (%)
	1291	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	10424	64.5%
Hangul	4337	26.8%
Common	1395	8.6%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
카	858	19.8%
오	435	10.0%
엔	337	7.8%
뮤	306	7.1%
직	303	7.0%
니	268	6.2%
지	267	6.2%
인	194	4.5%
트	187	4.3%
터	187	4.3%
Other values (82)	995	22.9%

Latin

Value	Count	Frequency (%)
n	1195	11.5%
M	987	9.5%
e	982	9.4%
t	948	9.1%
i	854	8.2%
s	687	6.6%
r	566	5.4%
u	554	5.3%
a	505	4.8%
c	484	4.6%
Other values (27)	2662	25.5%

Common

Value	Count	Frequency (%)
	1291	92.5%
&	70	5.0%
,	34	2.4%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	11819	73.2%
Hangul	4337	26.8%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	1291	10.9%
n	1195	10.1%
M	987	8.4%
e	982	8.3%
t	948	8.0%
i	854	7.2%
s	687	5.8%
r	566	4.8%
u	554	4.7%
a	505	4.3%
Other values (30)	3250	27.5%

Hangul

Value	Count	Frequency (%)
카	858	19.8%
오	435	10.0%
엔	337	7.8%
뮤	306	7.1%
직	303	7.0%
니	268	6.2%
지	267	6.2%
인	194	4.5%
트	187	4.3%
터	187	4.3%
Other values (82)	995	22.9%

제작사명 (production company name)
Text

Distinct	388
Distinct (%)	22.9%
Missing	0
Missing (%)	0.0%
Memory size	13.4 KiB

Length

Max length	60
Median length	44
Mean length	13.18669
Min length	3

Characters and Unicode

Total characters	22391
Distinct characters	316
Distinct categories	11
Distinct scripts	4
Distinct blocks	4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	198
Unique (%)	11.7%

Sample

1st row	YMC엔터테인먼트
2nd row	정글엔터테인먼트
3rd row	Good fellas Entertainment
4th row	Universal Music
5th row	로엔엔터테인먼트

Value	Count	Frequency (%)
entertainment	461	14.1%
music	250	7.6%
stone	179	5.5%
yg	115	3.5%
엔터테인먼트	105	3.2%
sm	105	3.2%
records	95	2.9%
jyp	43	1.3%
빅히트	43	1.3%
m	41	1.3%
Other values (456)	1840	56.1%

Most occurring characters

Value	Count	Frequency (%)
n	1762	7.9%
t	1677	7.5%
	1579	7.1%
e	1426	6.4%
i	914	4.1%
r	742	3.3%
a	641	2.9%
M	597	2.7%
E	584	2.6%
트	560	2.5%
Other values (306)	11909	53.2%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	10099	45.1%
Other Letter	6819	30.5%
Uppercase Letter	3493	15.6%
Space Separator	1579	7.1%
Other Punctuation	339	1.5%
Decimal Number	43	0.2%
Close Punctuation	7	< 0.1%
Open Punctuation	6	< 0.1%
Dash Punctuation	3	< 0.1%
Currency Symbol	2	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
트	560	8.2%
인	494	7.2%
엔	493	7.2%
테	465	6.8%
터	463	6.8%
먼	460	6.7%
스	265	3.9%
직	172	2.5%
뮤	169	2.5%
이	168	2.5%
Other values (233)	3110	45.6%

Uppercase Letter

Value	Count	Frequency (%)
M	597	17.1%
E	584	16.7%
S	398	11.4%
R	205	5.9%
Y	204	5.8%
G	179	5.1%
C	158	4.5%
I	129	3.7%
N	108	3.1%
T	101	2.9%
Other values (16)	830	23.8%

Lowercase Letter

Value	Count	Frequency (%)
n	1762	17.4%
t	1677	16.6%
e	1426	14.1%
i	914	9.1%
r	742	7.3%
a	641	6.3%
s	516	5.1%
o	493	4.9%
m	487	4.8%
c	470	4.7%
Other values (15)	971	9.6%

Decimal Number

Value	Count	Frequency (%)
2	13	30.2%
1	8	18.6%
0	6	14.0%
9	3	7.0%
8	3	7.0%
4	2	4.7%
3	2	4.7%
6	2	4.7%
7	2	4.7%
5	2	4.7%

Other Punctuation

Value	Count	Frequency (%)
,	242	71.4%
&	56	16.5%
.	22	6.5%
/	14	4.1%
!	3	0.9%
:	2	0.6%

Space Separator

Value	Count	Frequency (%)
	1579	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	7	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	6	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	3	100.0%

Currency Symbol

Value	Count	Frequency (%)
$	2	100.0%

Other Symbol

Value	Count	Frequency (%)
㈜	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	13592	60.7%
Hangul	6819	30.5%
Common	1979	8.8%
Han	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
트	560	8.2%
인	494	7.2%
엔	493	7.2%
테	465	6.8%
터	463	6.8%
먼	460	6.7%
스	265	3.9%
직	172	2.5%
뮤	169	2.5%
이	168	2.5%
Other values (233)	3110	45.6%

Latin

Value	Count	Frequency (%)
n	1762	13.0%
t	1677	12.3%
e	1426	10.5%
i	914	6.7%
r	742	5.5%
a	641	4.7%
M	597	4.4%
E	584	4.3%
s	516	3.8%
o	493	3.6%
Other values (41)	4240	31.2%

Common

Value	Count	Frequency (%)
	1579	79.8%
,	242	12.2%
&	56	2.8%
.	22	1.1%
/	14	0.7%
2	13	0.7%
1	8	0.4%
)	7	0.4%
(	6	0.3%
0	6	0.3%
Other values (11)	26	1.3%

Han

Value	Count	Frequency (%)
無	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	15571	69.5%
Hangul	6818	30.4%
None	1	< 0.1%
CJK	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
n	1762	11.3%
t	1677	10.8%
	1579	10.1%
e	1426	9.2%
i	914	5.9%
r	742	4.8%
a	641	4.1%
M	597	3.8%
E	584	3.8%
s	516	3.3%
Other values (62)	5133	33.0%

Hangul

Value	Count	Frequency (%)
트	560	8.2%
인	494	7.2%
엔	493	7.2%
테	465	6.8%
터	463	6.8%
먼	460	6.7%
스	265	3.9%
직	172	2.5%
뮤	169	2.5%
이	168	2.5%
Other values (232)	3109	45.6%

None

Value	Count	Frequency (%)
㈜	1	100.0%

CJK

Value	Count	Frequency (%)
無	1	100.0%

Count: A simple visualization of nullity by column.

Matrix: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
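
The per-column nullity count can be approximated in a few lines. This is a minimal sketch, not the report's plotting code: it assumes rows are dictionaries keyed by column name and that a missing value is represented as `None`. The helper name `non_null_counts` is hypothetical, and the two rows are taken from the first rows of this dataset.

```python
def non_null_counts(rows, columns):
    """Return, per column, how many rows hold a non-missing (non-None) value."""
    return {
        col: sum(1 for row in rows if row.get(col) is not None)
        for col in columns
    }

columns = ["저작물명", "앨범명", "아티스트명", "대리중개사명", "제작사명"]
rows = [
    dict(zip(columns, ["소나기", "소나기", "아이오아이 (I.O.I)", "로엔엔터테인먼트", "YMC엔터테인먼트"])),
    dict(zip(columns, ["미워요", "정인 From Andromeda", "정인", "CJ E&M", "정글엔터테인먼트"])),
]

counts = non_null_counts(rows, columns)
# Missing cells are whatever is left once the non-null cells are subtracted.
missing_cells = len(rows) * len(columns) - sum(counts.values())
print(counts, missing_cells)
```

For this dataset every column is fully populated, which matches the zero missing cells reported in the dataset statistics.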

First rows

	저작물명	앨범명	아티스트명	대리중개사명	제작사명
0	소나기	소나기	아이오아이 (I.O.I)	로엔엔터테인먼트	YMC엔터테인먼트
1	미워요	정인 From Andromeda	정인	CJ E&M	정글엔터테인먼트
2	신촌을 못가	신촌을 못가	포스트맨 (Postmen)	로엔엔터테인먼트	Good fellas Entertainment
3	가질수 없는 너	Bank 선물	뱅크	Universal Music	Universal Music
4	술 한잔 해요	Atelier	지아 (Zia)	로엔엔터테인먼트	로엔엔터테인먼트
5	눈의 꽃	미안하다 사랑한다 OST	박효신	오감엔터테인먼트	스펀지엔터테인먼트
6	지우개	지우개	알리 (Ali)	로엔엔터테인먼트	예당컴퍼니
7	오르막길	2012 월간 윤종신 6월호	정인, 윤종신	미러볼뮤직	미스틱 엔터테인먼트
8	그녀를 찾아주세요	This Is The Name	더 네임 (The Name)	Warner Music	Warner Music
9	걱정말아요 그대	응답하라 1988 OST Part 2	이적	CJ E&M	CJ E&M, 쿵엔터테인먼트

Last rows

	저작물명	앨범명	아티스트명	대리중개사명	제작사명
1688	서쪽 하늘	청연 OST	이승철	한국음반산업협회	아이에스엔터미디어그룹
1689	광대	3집 Library Of Soul	리쌍, BMK	위지스	제이엔터컴
1690	Fix You	X & Y	Coldplay	Warner Music	EMI
1691	인연 (동녘바람)	사춘기	이선희	후크엔터테인먼트	후크엔터테인먼트
1692	바람이 분다	6집 눈썹달	이소라	IS MUSIC	아인스디지탈
1693	Sunday Morning	Sunday Morning	Maroon 5	Universal Music	J Records
1694	너에게 쓰는 편지	1집 180 Degree	MC 몽	Stone Music Entertainment	M.A 엔터테인먼트
1695	The Scientist	A Rush Of Blood To The Head	Coldplay	Warner Music	Warner Music
1696	Don`t Know Why	Come Away With Me	Norah Jones	Warner Music	Blue Note
1697	여전히 아름다운지 (Feat. 김연우)	A Night In Seoul	토이 (Toy)	삼성뮤직	Orange
