Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B

Variable types

Categorical: 4
Text: 2

Dataset

Description: Sample data
Author: Hanyang University
URL: https://bigdata-region.kr/#/dataset/c5bceea1-a856-42f4-9d2b-0f28733f70bb

Alerts

해시태그수집일자 has constant value "2021-02-01" (Constant)
해시태그채널ID has constant value "https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA" (Constant)
해시태그빈도수 is highly imbalanced (53.1%) (Imbalance)
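
The 53.1% figure in the imbalance alert can be reproduced from the value counts of 해시태그빈도수 (27 rows with value 1, 3 rows with value 2). The sketch below assumes the score is one minus the Shannon entropy normalized by the log of the class count. That assumption matches the reported number, but the function name `imbalance_score` is illustrative and is not taken from the ydata-profiling API.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumed definition: 0.0 means perfectly balanced classes,
    values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class leaves nothing to normalize against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts reported for 해시태그빈도수: value 1 appears 27 times, value 2 appears 3 times.
score = imbalance_score([27, 3])
print(f"{score:.1%}")  # prints 53.1%
```

A perfectly balanced two-class column, for example `imbalance_score([15, 15])`, gives 0.0 under the same definition.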

Reproduction

Analysis started: 2023-12-10 14:16:10.655638
Analysis finished: 2023-12-10 14:16:11.427607
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

해시태그수집일자
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
2021-02-01: 30

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-02-01
2nd row: 2021-02-01
3rd row: 2021-02-01
4th row: 2021-02-01
5th row: 2021-02-01

Common Values

Value | Count | Frequency (%)
2021-02-01 | 30 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 2021-02-01 accounts for all 30 rows (100.0%)]

해시태그채널ID
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA: 30

Length

Max length: 56
Median length: 56
Mean length: 56
Min length: 56

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA
2nd row: https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA
3rd row: https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA
4th row: https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA
5th row: https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA

Common Values

Value | Count | Frequency (%)
https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | 30 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; the single channel URL accounts for all 30 rows (100.0%)]

해시태그영상ID
Text

Distinct: 26
Distinct (%): 86.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 43
Median length: 43
Mean length: 43
Min length: 43

Characters and Unicode

Total characters: 1290
Distinct characters: 68
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
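
As a small illustration of these character properties, the sketch below counts Unicode general categories for the first sample URL of this variable using Python's standard `unicodedata` module. It is not the profiler's own implementation, and it covers categories only, since the standard library does not expose script or block names.

```python
import unicodedata
from collections import Counter

# First sample row of the variable, copied from the report.
url = "https://www.youtube.com/watch?v=9Bl1gauPpCE"

# unicodedata.category returns a two-letter general category code,
# e.g. "Ll" (Lowercase Letter), "Lu" (Uppercase Letter), "Nd" (Decimal Number),
# "Po" (Other Punctuation), "Sm" (Math Symbol).
category_counts = Counter(unicodedata.category(ch) for ch in url)

for code, count in category_counts.most_common():
    print(code, count)
```

Each URL contributes 7 Other Punctuation characters (`:`, three `/`, two `.`, `?`) and 1 Math Symbol (`=`), which is consistent with the totals of 210 and 30 reported over 30 rows.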

Unique

Unique: 23
Unique (%): 76.7%
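
The difference between Distinct (26) and Unique (23) follows from the frequency pattern of this variable: one URL occurs 3 times, two URLs occur twice, and the remaining 23 occur once. The sketch below reproduces both figures with stand-in labels; the labels are hypothetical placeholders, and only the counts come from the report.

```python
from collections import Counter

# Stand-in values (hypothetical labels) following the frequency pattern
# reported for 해시태그영상ID: one value seen 3 times, two values seen twice,
# and 23 values seen once, for 30 observations in total.
values = ["a"] * 3 + ["b"] * 2 + ["c"] * 2 + [f"single_{i}" for i in range(23)]

counts = Counter(values)
n = len(values)
distinct = len(counts)                              # values that occur at least once
unique = sum(1 for c in counts.values() if c == 1)  # values that occur exactly once

print(f"Distinct: {distinct} ({distinct / n:.1%})")  # Distinct: 26 (86.7%)
print(f"Unique: {unique} ({unique / n:.1%})")        # Unique: 23 (76.7%)
```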

Sample

1st row: https://www.youtube.com/watch?v=9Bl1gauPpCE
2nd row: https://www.youtube.com/watch?v=95GNUlH17Gs
3rd row: https://www.youtube.com/watch?v=9DyNBiHmxJg
4th row: https://www.youtube.com/watch?v=9LTRrSxilvo
5th row: https://www.youtube.com/watch?v=9GXDFnQDiiY

Value | Count | Frequency (%)
https://www.youtube.com/watch?v=a2nk64pffma | 3 | 10.0%
https://www.youtube.com/watch?v=9xl82wvv7b8 | 2 | 6.7%
https://www.youtube.com/watch?v=aqasxkexkpe | 2 | 6.7%
https://www.youtube.com/watch?v=9rvbpogzbdi | 1 | 3.3%
https://www.youtube.com/watch?v=9bl1gauppce | 1 | 3.3%
https://www.youtube.com/watch?v=9xee8laaowm | 1 | 3.3%
https://www.youtube.com/watch?v=ab77fq9tly0 | 1 | 3.3%
https://www.youtube.com/watch?v=ayd3r0y2yls | 1 | 3.3%
https://www.youtube.com/watch?v=arynve-axj4 | 1 | 3.3%
https://www.youtube.com/watch?v=ayw9pk09vgi | 1 | 3.3%
Other values (16) | 16 | 53.3%

Most occurring characters

Value | Count | Frequency (%)
w | 125 | 9.7%
t | 121 | 9.4%
/ | 90 | 7.0%
o | 64 | 5.0%
u | 63 | 4.9%
h | 62 | 4.8%
c | 62 | 4.8%
. | 60 | 4.7%
p | 39 | 3.0%
m | 37 | 2.9%
Other values (58) | 567 | 44.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 847 | 65.7%
Other Punctuation | 210 | 16.3%
Uppercase Letter | 130 | 10.1%
Decimal Number | 69 | 5.3%
Math Symbol | 30 | 2.3%
Connector Punctuation | 3 | 0.2%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 21 | 16.2%
X | 9 | 6.9%
B | 8 | 6.2%
E | 8 | 6.2%
Q | 7 | 5.4%
V | 7 | 5.4%
D | 6 | 4.6%
K | 6 | 4.6%
Y | 6 | 4.6%
P | 6 | 4.6%
Other values (16) | 46 | 35.4%

Lowercase Letter
Value | Count | Frequency (%)
w | 125 | 14.8%
t | 121 | 14.3%
o | 64 | 7.6%
u | 63 | 7.4%
h | 62 | 7.3%
c | 62 | 7.3%
p | 39 | 4.6%
m | 37 | 4.4%
y | 36 | 4.3%
s | 36 | 4.3%
Other values (15) | 202 | 23.8%

Decimal Number
Value | Count | Frequency (%)
9 | 23 | 33.3%
8 | 8 | 11.6%
4 | 7 | 10.1%
2 | 7 | 10.1%
7 | 6 | 8.7%
1 | 5 | 7.2%
6 | 4 | 5.8%
0 | 4 | 5.8%
3 | 3 | 4.3%
5 | 2 | 2.9%

Other Punctuation
Value | Count | Frequency (%)
/ | 90 | 42.9%
. | 60 | 28.6%
: | 30 | 14.3%
? | 30 | 14.3%

Math Symbol
Value | Count | Frequency (%)
= | 30 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 3 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 977 | 75.7%
Common | 313 | 24.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
w | 125 | 12.8%
t | 121 | 12.4%
o | 64 | 6.6%
u | 63 | 6.4%
h | 62 | 6.3%
c | 62 | 6.3%
p | 39 | 4.0%
m | 37 | 3.8%
y | 36 | 3.7%
s | 36 | 3.7%
Other values (41) | 332 | 34.0%

Common
Value | Count | Frequency (%)
/ | 90 | 28.8%
. | 60 | 19.2%
: | 30 | 9.6%
? | 30 | 9.6%
= | 30 | 9.6%
9 | 23 | 7.3%
8 | 8 | 2.6%
4 | 7 | 2.2%
2 | 7 | 2.2%
7 | 6 | 1.9%
Other values (7) | 22 | 7.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1290 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
w | 125 | 9.7%
t | 121 | 9.4%
/ | 90 | 7.0%
o | 64 | 5.0%
u | 63 | 4.9%
h | 62 | 4.8%
c | 62 | 4.8%
. | 60 | 4.7%
p | 39 | 3.0%
m | 37 | 2.9%
Other values (58) | 567 | 44.0%

해시태그추출명사명
Text

Distinct: 24
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.1
Min length: 1

Characters and Unicode

Total characters: 63
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 66.7%

Sample

1st row:
2nd row: 수박
3rd row: 화장품
4th row: 수다
5th row: 도서관

Value | Count | Frequency (%)
화장품 | 3 | 10.0%
도서관 | 3 | 10.0%
 | 2 | 6.7%
 | 2 | 6.7%
화장 | 1 | 3.3%
 | 1 | 3.3%
아이돌 | 1 | 3.3%
 | 1 | 3.3%
레시피 | 1 | 3.3%
먹방 | 1 | 3.3%
Other values (14) | 14 | 46.7%

Most occurring characters

Value | Count | Frequency (%)
 | 5 | 7.9%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 3 | 4.8%
 | 3 | 4.8%
 | 2 | 3.2%
 | 2 | 3.2%
 | 2 | 3.2%
Other values (28) | 30 | 47.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 63 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5 | 7.9%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 3 | 4.8%
 | 3 | 4.8%
 | 2 | 3.2%
 | 2 | 3.2%
 | 2 | 3.2%
Other values (28) | 30 | 47.6%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 63 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5 | 7.9%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 3 | 4.8%
 | 3 | 4.8%
 | 2 | 3.2%
 | 2 | 3.2%
 | 2 | 3.2%
Other values (28) | 30 | 47.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 63 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 5 | 7.9%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 4 | 6.3%
 | 3 | 4.8%
 | 3 | 4.8%
 | 2 | 3.2%
 | 2 | 3.2%
 | 2 | 3.2%
Other values (28) | 30 | 47.6%

해시태그빈도수
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
1: 27
2: 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 27 | 90.0%
2 | 3 | 10.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; value 1 has 27 rows (90.0%), value 2 has 3 rows (10.0%)]

해시태그최근지수
Categorical

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
19: 16
18: 7
17: 5
20: 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 18
2nd row: 18
3rd row: 18
4th row: 20
5th row: 19

Common Values

Value | Count | Frequency (%)
19 | 16 | 53.3%
18 | 7 | 23.3%
17 | 5 | 16.7%
20 | 2 | 6.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 19 (53.3%), 18 (23.3%), 17 (16.7%), 20 (6.7%)]

Correlations

[Figure: correlation heatmap]

 | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
해시태그영상ID | 1.000 | 0.902 | 1.000 | 1.000
해시태그추출명사명 | 0.902 | 1.000 | 1.000 | 0.949
해시태그빈도수 | 1.000 | 1.000 | 1.000 | 0.406
해시태그최근지수 | 1.000 | 0.949 | 0.406 | 1.000

[Figure: correlation heatmap]

 | 해시태그빈도수 | 해시태그최근지수
해시태그빈도수 | 1.000 | 0.256
해시태그최근지수 | 0.256 | 1.000

[Figure: correlation heatmap]

 | 해시태그빈도수 | 해시태그최근지수
해시태그빈도수 | 1.000 | 0.256
해시태그최근지수 | 0.256 | 1.000

Missing values

[Figure: a simple visualization of nullity by column.]

[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

 | 해시태그수집일자 | 해시태그채널ID | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
0 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9Bl1gauPpCE |  | 1 | 18
1 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=95GNUlH17Gs | 수박 | 1 | 18
2 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9DyNBiHmxJg | 화장품 | 1 | 18
3 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9LTRrSxilvo | 수다 | 1 | 20
4 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9GXDFnQDiiY | 도서관 | 1 | 19
5 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9Xl82wVV7B8 | 뷰티 | 1 | 19
6 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9Xl82wVV7B8 |  | 1 | 19
7 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9_Dgp9WKKmg | 호텔 | 1 | 19
8 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9a4L8n1J3Pc | 상담 | 1 | 18
9 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=9b3WPBvQAXE | 썸녀 | 1 | 17

 | 해시태그수집일자 | 해시태그채널ID | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
20 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=A2nk64pffmA | 도서관 | 1 | 17
21 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=A2nk64pffmA | 요리 | 1 | 17
22 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=A65ym_Qx1qE | 메이크업 | 1 | 19
23 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=AQASxkeXkPE | 먹방 | 1 | 19
24 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=AQASxkeXkPE | 화장품 | 1 | 19
25 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=AYW9PK09vgI | 도서관 | 1 | 19
26 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=ARyNVE-Axj4 | 레시피 | 1 | 19
27 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=AYd3r0Y2yLs |  | 1 | 19
28 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=Ab77fq9Tly0 | 아이돌 | 1 | 19
29 | 2021-02-01 | https://www.youtube.com/channel/UCoDLLyiqum2JOBI4Dty8eUA | https://www.youtube.com/watch?v=AbnXtYfjvLU | 여친 | 1 | 17