Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.8 KiB
Average record size in memory: 59.3 B

Variable types

Categorical: 5
Text: 1
Numeric: 1

Alerts

Country_CD has constant value "vn" (Constant)
Collection_CH_NM has constant value "NEWS" (Constant)
FILE_NAME has constant value "KC_KEYWORD_NEWS_VN_2019" (Constant)
BASE_YMD has constant value "2019" (Constant)

Reproduction

Analysis started: 2023-12-10 10:10:01.396780
Analysis finished: 2023-12-10 10:10:02.297233
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
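
A report of this kind is usually produced with a few lines of Python. The following is a minimal sketch, assuming the data is available as a CSV file. The helper name `build_report`, the file names, and the report title are hypothetical and do not come from this report; only `ProfileReport` and its `to_file` method belong to ydata-profiling.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report (sketch)."""
    # Imported inside the function so that defining it does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder chosen for this example.
    profile = ProfileReport(df, title="Keyword profile")
    profile.to_file(html_path)


# Hypothetical usage; both file names are placeholders:
# build_report("keywords.csv", "report.html")
```

The column types inferred by the profiler depend on how the file is read, so a rerun may classify columns differently from the types listed in this report.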

Variables

Social_Data_Collection_Date_YM
Categorical

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value             Count
2017-03           26
2017-05           24
2017-04           19
2017-01           15
2017-02           10
Other values (2)  6

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 2017-01
2nd row: 2017-01
3rd row: 2017-01
4th row: 2017-01
5th row: 2017-01

Common Values

Value    Count  Frequency (%)
2017-03  26     26.0%
2017-05  24     24.0%
2017-04  19     19.0%
2017-01  15     15.0%
2017-02  10     10.0%
2017-06  5      5.0%
2017-07  1      1.0%
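
The Distinct, Unique, and Frequency (%) figures for this variable all follow from the value counts. The sketch below recomputes them from the counts listed under Common Values; the variable names are illustrative only.

```python
from collections import Counter

# Counts copied from the Common Values table of this variable.
counts = Counter({
    "2017-03": 26, "2017-05": 24, "2017-04": 19, "2017-01": 15,
    "2017-02": 10, "2017-06": 5, "2017-07": 1,
})

n_rows = sum(counts.values())                       # number of observations
distinct = len(counts)                              # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

for value, count in counts.most_common():
    print(f"{value}  {count:>3}  {100 * count / n_rows:.1f}%")
print(f"Distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n_rows:.1f}%)")
```

Only 2017-07 occurs exactly once, which is why the report shows Unique as 1 while Distinct is 7.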

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Country_CD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
vn     100

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: vn
2nd row: vn
3rd row: vn
4th row: vn
5th row: vn

Common Values

Value  Count  Frequency (%)
vn     100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Collection_CH_NM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
NEWS   100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NEWS
2nd row: NEWS
3rd row: NEWS
4th row: NEWS
5th row: NEWS

Common Values

Value  Count  Frequency (%)
NEWS   100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

News_KEY_W
Text

Distinct: 69
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 7
Mean length: 5.51
Min length: 3

Characters and Unicode

Total characters: 551
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
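
As a rough illustration of these character properties, the sketch below inspects the five sample keywords shown in this report using Python's standard `unicodedata` module. The standard library exposes no script property, so the script is approximated from the first word of each character name; that proxy is an assumption of this sketch and not necessarily how the profiler works.

```python
import unicodedata

# Sample keywords copied from this report; the full column is not reproduced here.
words = ["korean", "south", "korea", "china", "million"]
text = "".join(words)

total_chars = len(text)
mean_length = total_chars / len(words)

# General category of each code point, e.g. "Ll" for Lowercase Letter.
categories = {unicodedata.category(ch) for ch in text}

# Rough script proxy (assumption): the first word of the official
# character name, e.g. "LATIN SMALL LETTER K" gives "LATIN".
scripts = {unicodedata.name(ch).split()[0] for ch in text}

# Code points below 128 fall in the ASCII (Basic Latin) block.
all_ascii = all(ord(ch) < 128 for ch in text)

print(total_chars, mean_length, categories, scripts, all_ascii)
```

On the full column the same kind of count yields the totals above: 551 characters over 100 rows gives the mean length of 5.51.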

Unique

Unique: 49
Unique (%): 49.0%

Sample

1st row: korean
2nd row: south
3rd row: korea
4th row: china
5th row: million

Value              Count  Frequency (%)
vietnam            5      5.0%
vietnames          4      4.0%
korean             4      4.0%
korea              4      4.0%
citi               3      3.0%
band               3      3.0%
hcm                2      2.0%
hanoi              2      2.0%
two                2      2.0%
kpop               2      2.0%
Other values (59)  69     69.0%

Most occurring characters

Value              Count  Frequency (%)
a                  56     10.2%
i                  56     10.2%
e                  52     9.4%
n                  51     9.3%
o                  44     8.0%
t                  41     7.4%
r                  37     6.7%
m                  31     5.6%
c                  23     4.2%
s                  21     3.8%
Other values (13)  139    25.2%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  551    100.0%

Most frequent character per category

Lowercase Letter

Value              Count  Frequency (%)
a                  56     10.2%
i                  56     10.2%
e                  52     9.4%
n                  51     9.3%
o                  44     8.0%
t                  41     7.4%
r                  37     6.7%
m                  31     5.6%
c                  23     4.2%
s                  21     3.8%
Other values (13)  139    25.2%

Most occurring scripts

Value  Count  Frequency (%)
Latin  551    100.0%

Most frequent character per script

Latin

Value              Count  Frequency (%)
a                  56     10.2%
i                  56     10.2%
e                  52     9.4%
n                  51     9.3%
o                  44     8.0%
t                  41     7.4%
r                  37     6.7%
m                  31     5.6%
c                  23     4.2%
s                  21     3.8%
Other values (13)  139    25.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  551    100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
a                  56     10.2%
i                  56     10.2%
e                  52     9.4%
n                  51     9.3%
o                  44     8.0%
t                  41     7.4%
r                  37     6.7%
m                  31     5.6%
c                  23     4.2%
s                  21     3.8%
Other values (13)  139    25.2%

Keyword_FQ
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.68
Minimum: 11
Maximum: 67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11
5-th percentile: 11
Q1: 13
Median: 16
Q3: 23.25
95-th percentile: 42.05
Maximum: 67
Range: 56
Interquartile range (IQR): 10.25
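
The last two entries are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A minimal check using the figures reported here (variable names are illustrative):

```python
# Quantile figures copied from this report for Keyword_FQ.
minimum, q1, q3, maximum = 11, 13, 23.25, 67

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50% of the data

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```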

Descriptive statistics

Standard deviation: 10.674126
Coefficient of variation (CV): 0.54238446
Kurtosis: 4.8208262
Mean: 19.68
Median Absolute Deviation (MAD): 4
Skewness: 2.0855703
Sum: 1968
Variance: 113.93697
Monotonicity: Not monotonic
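
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks those relations against the reported values; the tolerance is an arbitrary choice that allows for the rounding in the report.

```python
import math

# Figures copied from this report for Keyword_FQ.
n_rows = 100
total = 1968
std = 10.674126

mean = total / n_rows   # reported Mean: 19.68
cv = std / mean         # reported CV: 0.54238446
variance = std ** 2     # reported Variance: 113.93697

# Compare with the reported values, tolerating rounding in the last digits.
checks = {
    "mean": math.isclose(mean, 19.68, rel_tol=1e-6),
    "cv": math.isclose(cv, 0.54238446, rel_tol=1e-6),
    "variance": math.isclose(variance, 113.93697, rel_tol=1e-6),
}
print(checks)
```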

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value              Count  Frequency (%)
13                 15     15.0%
12                 13     13.0%
14                 9      9.0%
11                 9      9.0%
16                 7      7.0%
18                 4      4.0%
25                 4      4.0%
24                 3      3.0%
20                 3      3.0%
19                 3      3.0%
Other values (21)  30     30.0%

Minimum 10 values

Value  Count  Frequency (%)
11     9      9.0%
12     13     13.0%
13     15     15.0%
14     9      9.0%
15     3      3.0%
16     7      7.0%
17     3      3.0%
18     4      4.0%
19     3      3.0%
20     3      3.0%

Maximum 10 values

Value  Count  Frequency (%)
67     1      1.0%
54     1      1.0%
53     1      1.0%
49     1      1.0%
43     1      1.0%
42     1      1.0%
39     1      1.0%
38     1      1.0%
37     1      1.0%
36     1      1.0%

FILE_NAME
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value                    Count
KC_KEYWORD_NEWS_VN_2019  100

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KC_KEYWORD_NEWS_VN_2019
2nd row: KC_KEYWORD_NEWS_VN_2019
3rd row: KC_KEYWORD_NEWS_VN_2019
4th row: KC_KEYWORD_NEWS_VN_2019
5th row: KC_KEYWORD_NEWS_VN_2019

Common Values

Value                    Count  Frequency (%)
KC_KEYWORD_NEWS_VN_2019  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

BASE_YMD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
2019   100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value  Count  Frequency (%)
2019   100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figure: interactions plot]

Correlations

                                Social_Data_Collection_Date_YM  News_KEY_W  Keyword_FQ
Social_Data_Collection_Date_YM  1.000                           0.469       0.000
News_KEY_W                      0.469                           1.000       0.000
Keyword_FQ                      0.000                           0.000       1.000

                                Keyword_FQ  Social_Data_Collection_Date_YM
Keyword_FQ                      1.000       0.000
Social_Data_Collection_Date_YM  0.000       1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

    Social_Data_Collection_Date_YM  Country_CD  Collection_CH_NM  News_KEY_W  Keyword_FQ  FILE_NAME                BASE_YMD
0   2017-01                         vn          NEWS              korean      43          KC_KEYWORD_NEWS_VN_2019  2019
1   2017-01                         vn          NEWS              south       24          KC_KEYWORD_NEWS_VN_2019  2019
2   2017-01                         vn          NEWS              korea       23          KC_KEYWORD_NEWS_VN_2019  2019
3   2017-01                         vn          NEWS              china       21          KC_KEYWORD_NEWS_VN_2019  2019
4   2017-01                         vn          NEWS              million     21          KC_KEYWORD_NEWS_VN_2019  2019
5   2017-01                         vn          NEWS              nam         20          KC_KEYWORD_NEWS_VN_2019  2019
6   2017-01                         vn          NEWS              vietnames   14          KC_KEYWORD_NEWS_VN_2019  2019
7   2017-01                         vn          NEWS              citi        14          KC_KEYWORD_NEWS_VN_2019  2019
8   2017-01                         vn          NEWS              megastar    13          KC_KEYWORD_NEWS_VN_2019  2019
9   2017-01                         vn          NEWS              billion     12          KC_KEYWORD_NEWS_VN_2019  2019

Last rows

    Social_Data_Collection_Date_YM  Country_CD  Collection_CH_NM  News_KEY_W  Keyword_FQ  FILE_NAME                BASE_YMD
90  2017-05                         vn          NEWS              night       13          KC_KEYWORD_NEWS_VN_2019  2019
91  2017-05                         vn          NEWS              market      12          KC_KEYWORD_NEWS_VN_2019  2019
92  2017-05                         vn          NEWS              fest        12          KC_KEYWORD_NEWS_VN_2019  2019
93  2017-05                         vn          NEWS              winner      12          KC_KEYWORD_NEWS_VN_2019  2019
94  2017-06                         vn          NEWS              industri    28          KC_KEYWORD_NEWS_VN_2019  2019
95  2017-06                         vn          NEWS              vietnam     18          KC_KEYWORD_NEWS_VN_2019  2019
96  2017-06                         vn          NEWS              asean       13          KC_KEYWORD_NEWS_VN_2019  2019
97  2017-06                         vn          NEWS              macadamia   12          KC_KEYWORD_NEWS_VN_2019  2019
98  2017-06                         vn          NEWS              develop     11          KC_KEYWORD_NEWS_VN_2019  2019
99  2017-07                         vn          NEWS              album       15          KC_KEYWORD_NEWS_VN_2019  2019