Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 5.8 KiB |
Average record size in memory | 59.3 B |
Variable types
Categorical | 5 |
---|---|
Text | 1 |
Numeric | 1 |
Dataset
Description | Sample |
---|---|
Author | Korea Culture Information Service Agency (한국문화정보원) |
URL | https://www.bigdata-culture.kr/bigdata/user/data_market/detail.do?id=124bd400-6270-11ea-8b67-7b32ce18203a |
Alerts
Country_CD has constant value "vn" | Constant |
Collection_CH_NM has constant value "NEWS" | Constant |
FILE_NAME has constant value "KC_KEYWORD_NEWS_VN_2019" | Constant |
BASE_YMD has constant value "2019" | Constant |
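The Constant alerts flag columns that hold a single distinct value. A minimal Python sketch of that check follows. It uses three rows copied from the report's sample tables rather than the full 100-row dataset, and the helper name `constant_columns` is hypothetical, not part of ydata-profiling.

```python
# Three rows copied from the report's sample tables (indexes 0, 1 and 94).
rows = [
    {"Social_Data_Collection_Date_YM": "2017-01", "Country_CD": "vn",
     "Collection_CH_NM": "NEWS", "News_KEY_W": "korean", "Keyword_FQ": 43,
     "FILE_NAME": "KC_KEYWORD_NEWS_VN_2019", "BASE_YMD": "2019"},
    {"Social_Data_Collection_Date_YM": "2017-01", "Country_CD": "vn",
     "Collection_CH_NM": "NEWS", "News_KEY_W": "south", "Keyword_FQ": 24,
     "FILE_NAME": "KC_KEYWORD_NEWS_VN_2019", "BASE_YMD": "2019"},
    {"Social_Data_Collection_Date_YM": "2017-06", "Country_CD": "vn",
     "Collection_CH_NM": "NEWS", "News_KEY_W": "industri", "Keyword_FQ": 28,
     "FILE_NAME": "KC_KEYWORD_NEWS_VN_2019", "BASE_YMD": "2019"},
]


def constant_columns(records):
    """Return {column: value} for every column with exactly one distinct value."""
    if not records:
        return {}
    constants = {}
    for column in records[0]:
        distinct = {record[column] for record in records}
        if len(distinct) == 1:
            constants[column] = next(iter(distinct))
    return constants


constants = constant_columns(rows)
for column, value in constants.items():
    print(f'{column} has constant value "{value}"')
```

Constant columns carry no information for modelling or correlation analysis, so they are usually dropped before further work.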
Reproduction
Analysis started | 2023-12-10 10:10:01.396780 |
---|---|
Analysis finished | 2023-12-10 10:10:02.297233 |
Duration | 0.9 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
Social_Data_Collection_Date_YM
Categorical
Distinct | 7 |
---|---|
Distinct (%) | 7.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
2017-03 | 26 |
---|---|
2017-05 | 24 |
2017-04 | 19 |
2017-01 | 15 |
2017-02 | 10 |
Other values (2) | 6 |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 7 |
Min length | 7 |
Unique
Unique | 1 |
---|---|
Unique (%) | 1.0% |
Sample
1st row | 2017-01 |
---|---|
2nd row | 2017-01 |
3rd row | 2017-01 |
4th row | 2017-01 |
5th row | 2017-01 |
Common Values
Value | Count | Frequency (%) |
2017-03 | 26 | 26.0% |
2017-05 | 24 | 24.0% |
2017-04 | 19 | 19.0% |
2017-01 | 15 | 15.0% |
2017-02 | 10 | 10.0% |
2017-06 | 5 | 5.0% |
2017-07 | 1 | 1.0% |
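Each Frequency (%) value is the count divided by the 100 observations, and the Distinct and Unique figures follow from the same counts. The sketch below recomputes them from the table above. The variable names are illustrative, and "unique" is assumed to mean a value that occurs exactly once.

```python
# Counts copied from the Common Values table for Social_Data_Collection_Date_YM.
counts = {
    "2017-03": 26,
    "2017-05": 24,
    "2017-04": 19,
    "2017-01": 15,
    "2017-02": 10,
    "2017-06": 5,
    "2017-07": 1,
}

n_observations = sum(counts.values())

# Percentage share of each value, rounded to one decimal as in the report.
frequency_pct = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

n_distinct = len(counts)  # number of different values
n_unique = sum(1 for count in counts.values() if count == 1)  # values seen once (assumed definition)

for value, pct in frequency_pct.items():
    print(f"{value}: {pct:.1f}%")
print(f"distinct={n_distinct}, unique={n_unique}")
```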
Country_CD
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
vn |
---|
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | vn |
---|---|
2nd row | vn |
3rd row | vn |
4th row | vn |
5th row | vn |
Common Values
Value | Count | Frequency (%) |
vn | 100 | 100.0% |
Collection_CH_NM
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
NEWS |
---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | NEWS |
---|---|
2nd row | NEWS |
3rd row | NEWS |
4th row | NEWS |
5th row | NEWS |
Common Values
Value | Count | Frequency (%) |
NEWS | 100 | 100.0% |
News_KEY_W
Text
Distinct | 69 |
---|---|
Distinct (%) | 69.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
vietnam | 5 | 5.0% |
vietnames | 4 | 4.0% |
korean | 4 | 4.0% |
korea | 4 | 4.0% |
citi | 3 | 3.0% |
band | 3 | 3.0% |
hcm | 2 | 2.0% |
hanoi | 2 | 2.0% |
two | 2 | 2.0% |
kpop | 2 | 2.0% |
Other values (59) | 69 | 69.0% |
Most occurring characters
Value | Count | Frequency (%) |
a | 56 | 10.2% |
i | 56 | 10.2% |
e | 52 | 9.4% |
n | 51 | 9.3% |
o | 44 | 8.0% |
t | 41 | 7.4% |
r | 37 | 6.7% |
m | 31 | 5.6% |
c | 23 | 4.2% |
s | 21 | 3.8% |
Other values (13) | 139 | 25.2% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 551 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 551 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 551 | 100.0% |
Keyword_FQ
Real number (ℝ)
Distinct | 31 |
---|---|
Distinct (%) | 31.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 19.68 |
Minimum | 11 |
---|---|
Maximum | 67 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 11 |
---|---|
5-th percentile | 11 |
Q1 | 13 |
median | 16 |
Q3 | 23.25 |
95-th percentile | 42.05 |
Maximum | 67 |
Range | 56 |
Interquartile range (IQR) | 10.25 |
Descriptive statistics
Standard deviation | 10.674126 |
---|---|
Coefficient of variation (CV) | 0.54238446 |
Kurtosis | 4.8208262 |
Mean | 19.68 |
Median Absolute Deviation (MAD) | 4 |
Skewness | 2.0855703 |
Sum | 1968 |
Variance | 113.93697 |
Monotonicity | Not monotonic |
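Several of the figures above are tied together by simple identities: range = maximum - minimum, IQR = Q3 - Q1, mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. The Python sketch below checks those identities against the reported values only. It does not recompute anything from the raw data, and the tolerances are arbitrary choices that allow for rounding in the report.

```python
import math

# Values copied from the Keyword_FQ statistics in this report.
reported = {
    "n": 100,
    "sum": 1968,
    "mean": 19.68,
    "minimum": 11,
    "maximum": 67,
    "range": 56,
    "q1": 13,
    "q3": 23.25,
    "iqr": 10.25,
    "std": 10.674126,
    "cv": 0.54238446,
    "variance": 113.93697,
}

# Each entry is True when the reported figures agree with the identity.
checks = {
    "range": math.isclose(reported["maximum"] - reported["minimum"], reported["range"]),
    "iqr": math.isclose(reported["q3"] - reported["q1"], reported["iqr"]),
    "mean": math.isclose(reported["sum"] / reported["n"], reported["mean"]),
    "cv": math.isclose(reported["std"] / reported["mean"], reported["cv"], abs_tol=1e-6),
    "variance": math.isclose(reported["std"] ** 2, reported["variance"], abs_tol=1e-4),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

A mean (19.68) above the median (16) together with a positive skewness of about 2.09 points to a right-skewed distribution: a few keywords have much higher frequencies than the rest.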
Common Values
Value | Count | Frequency (%) |
13 | 15 | 15.0% |
12 | 13 | 13.0% |
14 | 9 | 9.0% |
11 | 9 | 9.0% |
16 | 7 | 7.0% |
18 | 4 | 4.0% |
25 | 4 | 4.0% |
24 | 3 | 3.0% |
20 | 3 | 3.0% |
19 | 3 | 3.0% |
Other values (21) | 30 | 30.0% |
Minimum 10 values
Value | Count | Frequency (%) |
11 | 9 | 9.0% |
12 | 13 | 13.0% |
13 | 15 | 15.0% |
14 | 9 | 9.0% |
15 | 3 | 3.0% |
16 | 7 | 7.0% |
17 | 3 | 3.0% |
18 | 4 | 4.0% |
19 | 3 | 3.0% |
20 | 3 | 3.0% |
Maximum 10 values
Value | Count | Frequency (%) |
67 | 1 | 1.0% |
54 | 1 | 1.0% |
53 | 1 | 1.0% |
49 | 1 | 1.0% |
43 | 1 | 1.0% |
42 | 1 | 1.0% |
39 | 1 | 1.0% |
38 | 1 | 1.0% |
37 | 1 | 1.0% |
36 | 1 | 1.0% |
FILE_NAME
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
KC_KEYWORD_NEWS_VN_2019 |
---|
Length
Max length | 23 |
---|---|
Median length | 23 |
Mean length | 23 |
Min length | 23 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | KC_KEYWORD_NEWS_VN_2019 |
---|---|
2nd row | KC_KEYWORD_NEWS_VN_2019 |
3rd row | KC_KEYWORD_NEWS_VN_2019 |
4th row | KC_KEYWORD_NEWS_VN_2019 |
5th row | KC_KEYWORD_NEWS_VN_2019 |
Common Values
Value | Count | Frequency (%) |
KC_KEYWORD_NEWS_VN_2019 | 100 | 100.0% |
BASE_YMD
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
2019 |
---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2019 |
---|---|
2nd row | 2019 |
3rd row | 2019 |
4th row | 2019 |
5th row | 2019 |
Common Values
Value | Count | Frequency (%) |
2019 | 100 | 100.0% |
Correlations
Variable | Social_Data_Collection_Date_YM | News_KEY_W | Keyword_FQ |
---|---|---|---|
Social_Data_Collection_Date_YM | 1.000 | 0.469 | 0.000 |
News_KEY_W | 0.469 | 1.000 | 0.000 |
Keyword_FQ | 0.000 | 0.000 | 1.000 |
Variable | Keyword_FQ | Social_Data_Collection_Date_YM |
---|---|---|
Keyword_FQ | 1.000 | 0.000 |
Social_Data_Collection_Date_YM | 0.000 | 1.000 |
First rows
Row | Social_Data_Collection_Date_YM | Country_CD | Collection_CH_NM | News_KEY_W | Keyword_FQ | FILE_NAME | BASE_YMD |
---|---|---|---|---|---|---|---|
0 | 2017-01 | vn | NEWS | korean | 43 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
1 | 2017-01 | vn | NEWS | south | 24 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
2 | 2017-01 | vn | NEWS | korea | 23 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
3 | 2017-01 | vn | NEWS | china | 21 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
4 | 2017-01 | vn | NEWS | million | 21 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
5 | 2017-01 | vn | NEWS | nam | 20 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
6 | 2017-01 | vn | NEWS | vietnames | 14 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
7 | 2017-01 | vn | NEWS | citi | 14 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
8 | 2017-01 | vn | NEWS | megastar | 13 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
9 | 2017-01 | vn | NEWS | billion | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
Last rows
Row | Social_Data_Collection_Date_YM | Country_CD | Collection_CH_NM | News_KEY_W | Keyword_FQ | FILE_NAME | BASE_YMD |
---|---|---|---|---|---|---|---|
90 | 2017-05 | vn | NEWS | night | 13 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
91 | 2017-05 | vn | NEWS | market | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
92 | 2017-05 | vn | NEWS | fest | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
93 | 2017-05 | vn | NEWS | winner | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
94 | 2017-06 | vn | NEWS | industri | 28 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
95 | 2017-06 | vn | NEWS | vietnam | 18 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
96 | 2017-06 | vn | NEWS | asean | 13 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
97 | 2017-06 | vn | NEWS | macadamia | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
98 | 2017-06 | vn | NEWS | develop | 11 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
99 | 2017-07 | vn | NEWS | album | 15 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
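Rows in this pipe-delimited layout are straightforward to load for further analysis. The Python sketch below parses the last six rows shown above and picks the highest-frequency keyword per month. The field list, the `index` label for the first column, and both function names are assumptions made for illustration, and the result covers only these six rows, not the full dataset.

```python
# Rows 94-99 copied from the sample table above.
SAMPLE_ROWS = """\
94 | 2017-06 | vn | NEWS | industri | 28 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
95 | 2017-06 | vn | NEWS | vietnam | 18 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
96 | 2017-06 | vn | NEWS | asean | 13 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
97 | 2017-06 | vn | NEWS | macadamia | 12 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
98 | 2017-06 | vn | NEWS | develop | 11 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
99 | 2017-07 | vn | NEWS | album | 15 | KC_KEYWORD_NEWS_VN_2019 | 2019 |
"""

# "index" is an assumed label for the unnamed first column.
FIELDS = [
    "index", "Social_Data_Collection_Date_YM", "Country_CD", "Collection_CH_NM",
    "News_KEY_W", "Keyword_FQ", "FILE_NAME", "BASE_YMD",
]


def parse_rows(text):
    """Turn pipe-delimited lines into dicts, converting the numeric fields to int."""
    records = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if cells and cells[-1] == "":
            cells.pop()  # drop the empty cell left by the trailing pipe
        record = dict(zip(FIELDS, cells))
        record["index"] = int(record["index"])
        record["Keyword_FQ"] = int(record["Keyword_FQ"])
        records.append(record)
    return records


def top_keyword_per_month(records):
    """Return {month: (keyword, frequency)} for the most frequent keyword in each month."""
    best = {}
    for record in records:
        month = record["Social_Data_Collection_Date_YM"]
        if month not in best or record["Keyword_FQ"] > best[month]["Keyword_FQ"]:
            best[month] = record
    return {m: (r["News_KEY_W"], r["Keyword_FQ"]) for m, r in best.items()}


records = parse_rows(SAMPLE_ROWS)
top = top_keyword_per_month(records)
for month, (keyword, frequency) in top.items():
    print(f"{month}: {keyword} ({frequency})")
```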