Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B
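As a quick consistency check, the average record size should be roughly the total size in memory divided by the number of observations. The sketch below assumes 1 KiB = 1024 B and that the reported 1.6 KiB is rounded to one decimal place, so only approximate agreement is expected.

```python
# Consistency check: total size in memory vs. average record size.
# Assumption: the report rounds the total to one decimal place in KiB.
KIB = 1024  # bytes per KiB

n_observations = 30
avg_record_bytes = 54.4  # "Average record size in memory"

total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / KIB

print(f"{total_bytes:.0f} B = {total_kib:.2f} KiB")
```

The product comes to about 1632 B, which rounds to the 1.6 KiB shown in the report.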

Variable types

DateTime: 1
Categorical: 3
Text: 2

Dataset

Description: Sample data
Author: Hanyang University
URL: https://bigdata-region.kr/#/dataset/3c212550-81d5-4b91-b3ba-5a76e02ec943

Alerts

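해시태그수집일자 (hashtag collection date) has the constant value "2021-10-01" [Constant]
해시태그채널ID (hashtag channel ID) is highly overall correlated with 해시태그최근지수 (hashtag recency index) [High correlation]
해시태그최근지수 is highly overall correlated with 해시태그채널ID [High correlation]
해시태그빈도수 (hashtag frequency count) is highly imbalanced (64.6%) [Imbalance]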
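
The 64.6% imbalance figure can be reproduced from the value counts of 해시태그빈도수 listed in its Common Values table (27, 2 and 1 out of 30) by taking one minus the normalised Shannon entropy. That this is the formula the profiling tool applies is an assumption based on the numbers agreeing; the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    The score is 0 for a uniform distribution and approaches 1 as one
    class dominates. Assumed formula, not stated in the report itself.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts of 해시태그빈도수: value 1 -> 27, value 3 -> 2, value 2 -> 1
score = imbalance_score([27, 2, 1])
print(f"{score:.1%}")  # prints 64.6%
```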

Reproduction

Analysis started: 2023-12-10 14:15:49.361605
Analysis finished: 2023-12-10 14:15:50.063942
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
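
The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 14:15:49.361605")
finished = datetime.fromisoformat("2023-12-10 14:15:50.063942")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # prints 0.7 seconds
```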

Variables

해시태그수집일자 (hashtag collection date)
DateTime

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2021-10-01 00:00:00
Maximum: 2021-10-01 00:00:00

Histogram with fixed size bins (bins=1) (figure omitted)

해시태그채널ID
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value | Count
https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | 17
https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | 13

Length

Max length: 56
Median length: 56
Mean length: 56
Min length: 56

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw
2nd row: https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw
3rd row: https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw
4th row: https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw
5th row: https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw

Common Values

Value | Count | Frequency (%)
https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | 17 | 56.7%
https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | 13 | 43.3%
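
The Count and Frequency (%) columns are a plain value count divided by the number of rows. The sketch below rebuilds the table from a stand-in column (17 rows of one channel and 13 of the other, matching the counts above); the variable names are illustrative.

```python
from collections import Counter

# Stand-in for the column: 17 rows of one channel, 13 of the other.
channel_a = "https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw"
channel_b = "https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw"
column = [channel_a] * 17 + [channel_b] * 13

counts = Counter(column)
total = sum(counts.values())

# most_common() orders rows by descending count, like the report's table.
table = [(value, n, f"{n / total:.1%}") for value, n in counts.most_common()]
for value, n, pct in table:
    print(value, n, pct)
```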

Length

Histogram of lengths of the category (figure omitted)

Common Values (Plot)

(figure omitted)

해시태그영상ID (hashtag video ID)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 43
Median length: 43
Mean length: 43
Min length: 43

Characters and Unicode

Total characters: 1290
Distinct characters: 69
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
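
To illustrate how such character properties can be read in practice, the sketch below classifies the characters of the first sample URL with Python's standard `unicodedata` module. Using this module is a choice made here for illustration; the report does not say how its own counts were computed.

```python
import unicodedata
from collections import Counter

# First sample value of the column, taken from the report.
url = "https://www.youtube.com/watch?v=9bf_LM0p_nA"

# unicodedata.category returns two-letter general-category codes,
# e.g. "Ll" = Lowercase Letter, "Po" = Other Punctuation, "Sm" = Math Symbol.
categories = Counter(unicodedata.category(ch) for ch in url)

for code, n in categories.most_common():
    print(code, n)
```

Each URL contributes 7 Other Punctuation characters and 1 Math Symbol, which is consistent with the report's totals of 210 and 30 across 30 rows.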

Unique

Unique: 28
Unique (%): 93.3%
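
Distinct and Unique differ for this column: 29 distinct values but only 28 unique ones. A reading consistent with these numbers (an assumption, since the report does not define the terms) is that Distinct counts different values, while Unique counts values that occur exactly once. A sketch with hypothetical video IDs in which one value appears twice among 30 rows:

```python
from collections import Counter

# Hypothetical column: 30 rows, one value repeated twice, 28 values seen once.
values = ["video_dup", "video_dup"] + [f"video_{i}" for i in range(28)]

counts = Counter(values)
n_rows = len(values)
n_distinct = len(counts)                                # different values
n_unique = sum(1 for n in counts.values() if n == 1)    # values seen exactly once

print(n_distinct, f"{n_distinct / n_rows:.1%}")
print(n_unique, f"{n_unique / n_rows:.1%}")
```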

Sample

1st row: https://www.youtube.com/watch?v=9bf_LM0p_nA
2nd row: https://www.youtube.com/watch?v=9bf_LM0p_nA
3rd row: https://www.youtube.com/watch?v=cJ1AreifGxw
4th row: https://www.youtube.com/watch?v=9pOC80Llvto
5th row: https://www.youtube.com/watch?v=9pGhong3FgI

Value | Count | Frequency (%)
https://www.youtube.com/watch?v=9bf_lm0p_na | 2 | 6.7%
https://www.youtube.com/watch?v=c_9yz5pnmga | 1 | 3.3%
https://www.youtube.com/watch?v=bnapmj04dum | 1 | 3.3%
https://www.youtube.com/watch?v=cdfgnbk0hdy | 1 | 3.3%
https://www.youtube.com/watch?v=bdpsxlfzfco | 1 | 3.3%
https://www.youtube.com/watch?v=bmdvfgdvypm | 1 | 3.3%
https://www.youtube.com/watch?v=cbxewahrva4 | 1 | 3.3%
https://www.youtube.com/watch?v=b_8oxvs5ele | 1 | 3.3%
https://www.youtube.com/watch?v=cbpnypsyc_g | 1 | 3.3%
https://www.youtube.com/watch?v=b6q54dz3-f8 | 1 | 3.3%
Other values (19) | 19 | 63.3%

Most occurring characters

Value | Count | Frequency (%)
t | 124 | 9.6%
w | 121 | 9.4%
/ | 90 | 7.0%
c | 74 | 5.7%
o | 67 | 5.2%
h | 63 | 4.9%
u | 61 | 4.7%
. | 60 | 4.7%
b | 40 | 3.1%
p | 40 | 3.1%
Other values (59) | 550 | 42.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 865 | 67.1%
Other Punctuation | 210 | 16.3%
Uppercase Letter | 122 | 9.5%
Decimal Number | 51 | 4.0%
Math Symbol | 30 | 2.3%
Connector Punctuation | 8 | 0.6%
Dash Punctuation | 4 | 0.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 124 | 14.3%
w | 121 | 14.0%
c | 74 | 8.6%
o | 67 | 7.7%
h | 63 | 7.3%
u | 61 | 7.1%
b | 40 | 4.6%
p | 40 | 4.6%
y | 37 | 4.3%
s | 36 | 4.2%
Other values (16) | 202 | 23.4%

Uppercase Letter
Value | Count | Frequency (%)
A | 14 | 11.5%
B | 9 | 7.4%
O | 8 | 6.6%
F | 8 | 6.6%
U | 7 | 5.7%
G | 7 | 5.7%
Y | 6 | 4.9%
M | 6 | 4.9%
Q | 6 | 4.9%
C | 5 | 4.1%
Other values (16) | 46 | 37.7%

Decimal Number
Value | Count | Frequency (%)
9 | 9 | 17.6%
0 | 8 | 15.7%
8 | 8 | 15.7%
4 | 7 | 13.7%
5 | 4 | 7.8%
6 | 4 | 7.8%
3 | 4 | 7.8%
7 | 3 | 5.9%
2 | 3 | 5.9%
1 | 1 | 2.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 90 | 42.9%
. | 60 | 28.6%
: | 30 | 14.3%
? | 30 | 14.3%

Math Symbol
Value | Count | Frequency (%)
= | 30 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 8 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 987 | 76.5%
Common | 303 | 23.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 124 | 12.6%
w | 121 | 12.3%
c | 74 | 7.5%
o | 67 | 6.8%
h | 63 | 6.4%
u | 61 | 6.2%
b | 40 | 4.1%
p | 40 | 4.1%
y | 37 | 3.7%
s | 36 | 3.6%
Other values (42) | 324 | 32.8%

Common
Value | Count | Frequency (%)
/ | 90 | 29.7%
. | 60 | 19.8%
: | 30 | 9.9%
= | 30 | 9.9%
? | 30 | 9.9%
9 | 9 | 3.0%
_ | 8 | 2.6%
0 | 8 | 2.6%
8 | 8 | 2.6%
4 | 7 | 2.3%
Other values (7) | 23 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1290 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 124 | 9.6%
w | 121 | 9.4%
/ | 90 | 7.0%
c | 74 | 5.7%
o | 67 | 5.2%
h | 63 | 4.9%
u | 61 | 4.7%
. | 60 | 4.7%
b | 40 | 3.1%
p | 40 | 3.1%
Other values (59) | 550 | 42.6%

해시태그추출명사명 (hashtag extracted noun)
Text

Distinct: 24
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.1666667
Min length: 1

Characters and Unicode

Total characters: 65
Distinct characters: 36
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 66.7%

Sample

1st row: 도서관
2nd row: [not recoverable]
3rd row: [not recoverable]
4th row: 폴리
5th row: 스톤

Value | Count | Frequency (%)
스톤 | 3 | 10.0%
철권 | 3 | 10.0%
에스에스비 | 2 | 6.7%
[not recoverable] | 2 | 6.7%
에픽 | 1 | 3.3%
도서관 | 1 | 3.3%
민국 | 1 | 3.3%
접종 | 1 | 3.3%
생산 | 1 | 3.3%
블리자드 | 1 | 3.3%
Other values (14) | 14 | 46.7%

Most occurring characters

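Value | Count | Frequency (%)
[not recoverable] | 10 | 15.4%
[not recoverable] | 6 | 9.2%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
Other values (26) | 30 | 46.2%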

Most occurring categories

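Value | Count | Frequency (%)
Other Letter | 65 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not recoverable] | 10 | 15.4%
[not recoverable] | 6 | 9.2%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
Other values (26) | 30 | 46.2%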

Most occurring scripts

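Value | Count | Frequency (%)
Hangul | 65 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not recoverable] | 10 | 15.4%
[not recoverable] | 6 | 9.2%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
Other values (26) | 30 | 46.2%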

Most occurring blocks

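Value | Count | Frequency (%)
Hangul | 65 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not recoverable] | 10 | 15.4%
[not recoverable] | 6 | 9.2%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 3 | 4.6%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
[not recoverable] | 2 | 3.1%
Other values (26) | 30 | 46.2%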

해시태그빈도수
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value | Count
1 | 27
3 | 2
2 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 27 | 90.0%
3 | 2 | 6.7%
2 | 1 | 3.3%

Length

Histogram of lengths of the category (figure omitted)

Common Values (Plot)

(figure omitted)

해시태그최근지수
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value | Count
20 | 15
18 | 11
19 | 4

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 18
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 15 | 50.0%
18 | 11 | 36.7%
19 | 4 | 13.3%

Length

Histogram of lengths of the category (figure omitted)

Common Values (Plot)

(figure omitted)

Correlations

(heatmap omitted)

 | 해시태그채널ID | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
해시태그채널ID | 1.000 | 1.000 | 1.000 | 0.000 | 0.600
해시태그영상ID | 1.000 | 1.000 | 0.948 | 1.000 | 1.000
해시태그추출명사명 | 1.000 | 0.948 | 1.000 | 1.000 | 0.967
해시태그빈도수 | 0.000 | 1.000 | 1.000 | 1.000 | 0.625
해시태그최근지수 | 0.600 | 1.000 | 0.967 | 0.625 | 1.000

(heatmap omitted)

 | 해시태그최근지수 | 해시태그빈도수 | 해시태그채널ID
해시태그최근지수 | 1.000 | 0.287 | 0.853
해시태그빈도수 | 0.287 | 1.000 | 0.000
해시태그채널ID | 0.853 | 0.000 | 1.000

(heatmap omitted)

 | 해시태그채널ID | 해시태그빈도수 | 해시태그최근지수
해시태그채널ID | 1.000 | 0.000 | 0.853
해시태그빈도수 | 0.000 | 1.000 | 0.287
해시태그최근지수 | 0.853 | 0.287 | 1.000
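
The report does not state which association measure produced these matrices. For pairs of categorical columns, one common choice is Cramér's V, and the sketch below shows its uncorrected form on made-up data. It is not claimed to reproduce the values above, which would require all 30 rows and possibly a bias correction; the function name `cramers_v` and the example lists are illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, ra in row_totals.items():
        for b, cb in col_totals.items():
            expected = ra * cb / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return float("nan")  # undefined when either column is constant
    return math.sqrt(chi2 / (n * k))

# Made-up example data (not from the report).
channel = ["A", "A", "B", "B"]
perfectly_linked = ["x", "x", "y", "y"]
unrelated = ["x", "y", "x", "y"]

print(cramers_v(channel, perfectly_linked))
print(cramers_v(channel, unrelated))
```

The undefined case for a constant column is in line with 해시태그수집일자, which has a single value, not appearing in any of the matrices.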

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 해시태그수집일자 | 해시태그채널ID | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
0 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=9bf_LM0p_nA | 도서관 | 1 | 20
1 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=9bf_LM0p_nA | [not recoverable] | 1 | 20
2 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cJ1AreifGxw | [not recoverable] | 1 | 18
3 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=9pOC80Llvto | 폴리 | 1 | 20
4 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=9pGhong3FgI | 스톤 | 1 | 20
5 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cNOFfzrOHGA | 뉴스 | 3 | 20
6 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=9qs976PQokg | 스톤 | 1 | 20
7 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cQVKvkddG-g | 에스에스비 | 1 | 18
8 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=A0CUrUTGiTY | 하스 | 1 | 20
9 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cQZnGaeBOIQ | 에스 | 2 | 19

Last rows

 | 해시태그수집일자 | 해시태그채널ID | 해시태그영상ID | 해시태그추출명사명 | 해시태그빈도수 | 해시태그최근지수
20 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cantqg2Bt_4 | 서울 | 1 | 18
21 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=B6q54Dz3-f8 | [not recoverable] | 1 | 20
22 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cbpnYPsyc_g | 미국 | 1 | 18
23 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=B_8OXVS5ElE | 블리자드 | 1 | 19
24 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cbxeWaHrvA4 | 생산 | 1 | 18
25 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=BmdvFgdvYpM | [not recoverable] | 1 | 19
26 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=BdpsxLfZFCo | 스톤 | 1 | 20
27 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cdfgnbK0hdY | 접종 | 1 | 18
28 | 2021-10-01 | https://www.youtube.com/channel/UCFM_07Mxv6CglREk8qdkPaw | https://www.youtube.com/watch?v=BnAPMj04dUM | 철권 | 1 | 20
29 | 2021-10-01 | https://www.youtube.com/channel/UCkinYTS9IHqOEwR1Sze2JTw | https://www.youtube.com/watch?v=cfbxHYUVHtA | 법무 | 1 | 18