Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 654.3 KiB |
Average record size in memory | 67.0 B |
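The average record size can be cross-checked from the two figures above it. A minimal sketch, assuming the total size is reported in KiB of 1024 bytes and that the average is simply total bytes divided by the number of observations (the variable names are illustrative, not part of the report):

```python
# Values copied from the "Dataset statistics" table.
total_size_kib = 654.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes, average = total bytes / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the 67.0 B shown in the table to one decimal place.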
Variable types
Categorical | 3 |
---|---|
Text | 1 |
Numeric | 3 |
Dataset
Description | Weekly keyword data, aggregated and analyzed to build the main keywords and keyword relationship-network graphs in the news-based statistics search service. |
---|---|
URL | https://www.data.go.kr/data/15121130/fileData.do |
주 is highly overall correlated with 등록일자 | High correlation |
등록일자 is highly overall correlated with 주 | High correlation |
주간단어개수 is highly overall correlated with 주간랭크 and 주간합계건수 | High correlation |
주간랭크 is highly overall correlated with 주간단어개수 | High correlation |
주간합계건수 is highly overall correlated with 주간단어개수 | High correlation |
Reproduction
Analysis started | 2023-12-12 12:08:50.193328 |
---|---|
Analysis finished | 2023-12-12 12:08:53.478520 |
Duration | 3.29 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
주 (week)
Categorical
HIGH CORRELATION
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 17 |
---|---|
Median length | 17 |
Mean length | 17 |
Min length | 17 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20220828-20220903 |
---|---|
2nd row | 20220821-20220827 |
3rd row | 20220731-20220806 |
4th row | 20220814-20220820 |
5th row | 20220911-20220917 |
Common Values
Value | Count | Frequency (%) |
20220828-20220903 | 1606 | 16.1% |
20220821-20220827 | 1586 | 15.9% |
20220904-20220910 | 1579 | 15.8% |
20220814-20220820 | 1577 | 15.8% |
20220807-20220813 | 1565 | 15.7% |
20220731-20220806 | 1552 | 15.5% |
20220911-20220917 | 535 | 5.3% |
단어 (word)
Text
Distinct | 5007 |
---|---|
Distinct (%) | 50.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
국가 | 10 | 0.1% |
안팎 | 9 | 0.1% |
침체 | 9 | 0.1% |
연합뉴스 | 9 | 0.1% |
물량 | 9 | 0.1% |
사실 | 9 | 0.1% |
능력 | 8 | 0.1% |
인정 | 8 | 0.1% |
회복 | 8 | 0.1% |
활성화 | 8 | 0.1% |
Other values (4996) | 9913 | 99.1% |
Most occurring characters
Value | Count | Frequency (%) |
기 | 423 | 1.6% |
사 | 385 | 1.5% |
이 | 368 | 1.4% |
자 | 350 | 1.3% |
지 | 350 | 1.3% |
대 | 329 | 1.2% |
시 | 311 | 1.2% |
가 | 307 | 1.2% |
상 | 304 | 1.1% |
수 | 298 | 1.1% |
Other values (769) | 23104 | 87.1% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 25876 | 97.5% |
Uppercase Letter | 533 | 2.0% |
Lowercase Letter | 110 | 0.4% |
Decimal Number | 10 | < 0.1% |
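The category names in this table correspond to Unicode general categories. A minimal sketch of how such counts could be reproduced with Python's standard `unicodedata` module, assuming the report counts one general category per character; the four sample words are taken from elsewhere in this report, and `count_categories` and `CATEGORY_NAMES` are hypothetical helpers, not the profiler's own code:

```python
import unicodedata
from collections import Counter

# Readable names for the general categories that appear in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",      # Hangul syllables fall here
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
}

def count_categories(words):
    """Count characters per Unicode general category across all words."""
    counts = Counter()
    for word in words:
        for ch in word:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Words that occur in the 단어 column according to this report.
sample_words = ["국가", "연합뉴스", "EU", "만t"]
print(dict(count_categories(sample_words)))
```

On the full column the same loop would be run over all 10000 values rather than four.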
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
기 | 423 | 1.6% |
사 | 385 | 1.5% |
이 | 368 | 1.4% |
자 | 350 | 1.4% |
지 | 350 | 1.4% |
대 | 329 | 1.3% |
시 | 311 | 1.2% |
가 | 307 | 1.2% |
상 | 304 | 1.2% |
수 | 298 | 1.2% |
Other values (722) | 22451 | 86.8% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 52 | 9.8% |
S | 45 | 8.4% |
B | 37 | 6.9% |
T | 34 | 6.4% |
M | 30 | 5.6% |
D | 26 | 4.9% |
P | 26 | 4.9% |
G | 25 | 4.7% |
A | 24 | 4.5% |
I | 24 | 4.5% |
Other values (14) | 210 | 39.4% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 13 | 11.8% |
o | 11 | 10.0% |
t | 10 | 9.1% |
w | 8 | 7.3% |
s | 8 | 7.3% |
a | 7 | 6.4% |
i | 7 | 6.4% |
n | 6 | 5.5% |
p | 5 | 4.5% |
m | 5 | 4.5% |
Other values (11) | 30 | 27.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 5 | 50.0% |
9 | 5 | 50.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 25876 | 97.5% |
Latin | 643 | 2.4% |
Common | 10 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
기 | 423 | 1.6% |
사 | 385 | 1.5% |
이 | 368 | 1.4% |
자 | 350 | 1.4% |
지 | 350 | 1.4% |
대 | 329 | 1.3% |
시 | 311 | 1.2% |
가 | 307 | 1.2% |
상 | 304 | 1.2% |
수 | 298 | 1.2% |
Other values (722) | 22451 | 86.8% |
Latin
Value | Count | Frequency (%) |
C | 52 | 8.1% |
S | 45 | 7.0% |
B | 37 | 5.8% |
T | 34 | 5.3% |
M | 30 | 4.7% |
D | 26 | 4.0% |
P | 26 | 4.0% |
G | 25 | 3.9% |
A | 24 | 3.7% |
I | 24 | 3.7% |
Other values (35) | 320 | 49.8% |
Common
Value | Count | Frequency (%) |
1 | 5 | 50.0% |
9 | 5 | 50.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 25876 | 97.5% |
ASCII | 653 | 2.5% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
기 | 423 | 1.6% |
사 | 385 | 1.5% |
이 | 368 | 1.4% |
자 | 350 | 1.4% |
지 | 350 | 1.4% |
대 | 329 | 1.3% |
시 | 311 | 1.2% |
가 | 307 | 1.2% |
상 | 304 | 1.2% |
수 | 298 | 1.2% |
Other values (722) | 22451 | 86.8% |
ASCII
Value | Count | Frequency (%) |
C | 52 | 8.0% |
S | 45 | 6.9% |
B | 37 | 5.7% |
T | 34 | 5.2% |
M | 30 | 4.6% |
D | 26 | 4.0% |
P | 26 | 4.0% |
G | 25 | 3.8% |
A | 24 | 3.7% |
I | 24 | 3.7% |
Other values (37) | 330 | 50.5% |
등록일자 (registration date)
Categorical
HIGH CORRELATION
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2022-09-03 |
---|---|
2nd row | 2022-08-27 |
3rd row | 2022-08-06 |
4th row | 2022-08-20 |
5th row | 2022-09-17 |
Common Values
Value | Count | Frequency (%) |
2022-09-03 | 1606 | 16.1% |
2022-08-27 | 1586 | 15.9% |
2022-09-10 | 1579 | 15.8% |
2022-08-20 | 1577 | 15.8% |
2022-08-13 | 1565 | 15.7% |
2022-08-06 | 1552 | 15.5% |
2022-09-17 | 535 | 5.3% |
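The counts for 등록일자 match the counts for 주 value for value, and in the sample rows each registration date equals the last day of the corresponding week range. That would explain the correlation of 1.000 between the two columns. A minimal sketch that checks this on the five sample pairs shown in this report, assuming the 주 label always has the form `YYYYMMDD-YYYYMMDD` (`parse_week` is a hypothetical helper):

```python
from datetime import datetime, timedelta

def parse_week(week):
    """Split a 'YYYYMMDD-YYYYMMDD' week label into (start, end) dates."""
    start_text, end_text = week.split("-")
    start = datetime.strptime(start_text, "%Y%m%d").date()
    end = datetime.strptime(end_text, "%Y%m%d").date()
    return start, end

# (주, 등록일자) pairs copied from the sample rows of the two variables.
pairs = [
    ("20220828-20220903", "2022-09-03"),
    ("20220821-20220827", "2022-08-27"),
    ("20220731-20220806", "2022-08-06"),
    ("20220814-20220820", "2022-08-20"),
    ("20220911-20220917", "2022-09-17"),
]

for week, registered in pairs:
    start, end = parse_week(week)
    assert end - start == timedelta(days=6)   # each label spans seven days
    assert end.isoformat() == registered      # registration date = week end
```

Only five pairs are visible here, so whether the relationship holds for every row would need to be confirmed on the full dataset.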
주간단어개수 (weekly word count)
Real number (ℝ)
HIGH CORRELATION
Distinct | 1605 |
---|---|
Distinct (%) | 16.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 377.8704 |
Minimum | 4 |
---|---|
Maximum | 20040 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 4 |
---|---|
5-th percentile | 7 |
Q1 | 24 |
median | 87 |
Q3 | 321 |
95-th percentile | 1634.05 |
Maximum | 20040 |
Range | 20036 |
Interquartile range (IQR) | 297 |
Descriptive statistics
Standard deviation | 951.81615 |
---|---|
Coefficient of variation (CV) | 2.5188958 |
Kurtosis | 67.440283 |
Mean | 377.8704 |
Median Absolute Deviation (MAD) | 76 |
Skewness | 6.7739566 |
Sum | 3778704 |
Variance | 905953.98 |
Monotonicity | Not monotonic |
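Several of the derived figures above follow directly from the others. A minimal sketch that recomputes them from the reported values, assuming the usual definitions (IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); small differences are expected because the inputs are already rounded:

```python
# Values copied from the quantile and descriptive statistics above.
mean = 377.8704
std = 951.81615
q1, q3 = 24, 321
minimum, maximum = 4, 20040

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2

print(iqr, value_range, round(cv, 4), round(variance, 2))
```

A mean of about 378 against a median of 87, together with a CV above 2, is consistent with the strong right skew reported for this column.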
Value | Count | Frequency (%) |
6 | 207 | 2.1% |
16 | 163 | 1.6% |
22 | 162 | 1.6% |
21 | 159 | 1.6% |
7 | 159 | 1.6% |
20 | 144 | 1.4% |
24 | 142 | 1.4% |
17 | 142 | 1.4% |
8 | 133 | 1.3% |
18 | 132 | 1.3% |
Other values (1595) | 8457 | 84.6% |
Value | Count | Frequency (%) |
4 | 16 | 0.2% |
5 | 124 | |
6 | 207 | |
7 | 159 | |
8 | 133 | |
9 | 114 | |
10 | 115 | |
11 | 84 | |
12 | 71 | 0.7% |
13 | 71 | 0.7% |
Value | Count | Frequency (%) |
20040 | 1 | < 0.1% |
15541 | 1 | < 0.1% |
14522 | 1 | < 0.1% |
13207 | 1 | < 0.1% |
12290 | 1 | < 0.1% |
11175 | 1 | < 0.1% |
10787 | 1 | < 0.1% |
10512 | 1 | < 0.1% |
10179 | 1 | < 0.1% |
10070 | 1 | < 0.1% |
주간랭크 (weekly rank)
Real number (ℝ)
HIGH CORRELATION
Distinct | 2897 |
---|---|
Distinct (%) | 29.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1490.6225 |
Minimum | 1 |
---|---|
Maximum | 3000 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 137 |
Q1 | 737.75 |
median | 1499 |
Q3 | 2244 |
95-th percentile | 2841 |
Maximum | 3000 |
Range | 2999 |
Interquartile range (IQR) | 1506.25 |
Descriptive statistics
Standard deviation | 872.95589 |
---|---|
Coefficient of variation (CV) | 0.58563177 |
Kurtosis | -1.2214308 |
Mean | 1490.6225 |
Median Absolute Deviation (MAD) | 753 |
Skewness | -0.0048208467 |
Sum | 14906225 |
Variance | 762051.99 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2211 | 11 | 0.1% |
105 | 10 | 0.1% |
791 | 10 | 0.1% |
2684 | 9 | 0.1% |
883 | 9 | 0.1% |
872 | 9 | 0.1% |
1628 | 9 | 0.1% |
2758 | 9 | 0.1% |
130 | 9 | 0.1% |
1977 | 9 | 0.1% |
Other values (2887) | 9906 | 99.1% |
Value | Count | Frequency (%) |
1 | 4 | |
2 | 6 | |
3 | 4 | |
4 | 4 | |
5 | 2 | < 0.1% |
6 | 3 | |
7 | 6 | |
8 | 6 | |
9 | 6 | |
11 | 5 |
Value | Count | Frequency (%) |
3000 | 6 | |
2999 | 2 | < 0.1% |
2998 | 5 | |
2997 | 1 | < 0.1% |
2996 | 2 | < 0.1% |
2995 | 2 | < 0.1% |
2994 | 5 | |
2993 | 3 | |
2992 | 1 | < 0.1% |
2991 | 3 |
주간합계건수 (weekly total count)
Real number (ℝ)
HIGH CORRELATION
Distinct | 1245 |
---|---|
Distinct (%) | 12.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 226.455 |
Minimum | 1 |
---|---|
Maximum | 15264 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 4 |
Q1 | 15 |
median | 52 |
Q3 | 202 |
95-th percentile | 991.05 |
Maximum | 15264 |
Range | 15263 |
Interquartile range (IQR) | 187 |
Descriptive statistics
Standard deviation | 557.77365 |
---|---|
Coefficient of variation (CV) | 2.4630662 |
Kurtosis | 97.342258 |
Mean | 226.455 |
Median Absolute Deviation (MAD) | 46 |
Skewness | 7.386956 |
Sum | 2264550 |
Variance | 311111.44 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
5 | 222 | 2.2% |
6 | 205 | 2.1% |
7 | 194 | 1.9% |
4 | 185 | 1.8% |
12 | 183 | 1.8% |
14 | 178 | 1.8% |
15 | 175 | 1.8% |
13 | 174 | 1.7% |
10 | 167 | 1.7% |
8 | 165 | 1.7% |
Other values (1235) | 8152 | 81.5% |
Value | Count | Frequency (%) |
1 | 135 | |
2 | 136 | |
3 | 131 | |
4 | 185 | |
5 | 222 | |
6 | 205 | |
7 | 194 | |
8 | 165 | |
9 | 158 | |
10 | 167 |
Value | Count | Frequency (%) |
15264 | 1 | < 0.1% |
9680 | 1 | < 0.1% |
7735 | 1 | < 0.1% |
7067 | 1 | < 0.1% |
6721 | 1 | < 0.1% |
6446 | 1 | < 0.1% |
6382 | 1 | < 0.1% |
6209 | 1 | < 0.1% |
6046 | 1 | < 0.1% |
6028 | 1 | < 0.1% |
키워드별코드 (keyword category code)
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 13 |
---|---|
Median length | 13 |
Mean length | 11.2462 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | COVID_ECO_KWD |
---|---|
2nd row | COVID_SOC_KWD |
3rd row | FRMPRD_KWD |
4th row | FRMPRD_KWD |
5th row | COVID_SOC_KWD |
Common Values
Value | Count | Frequency (%) |
COVID_SOC_KWD | 2101 | 21.0% |
COVID_ECO_KWD | 2003 | 20.0% |
INSTITUTE_KWD | 1987 | 19.9% |
FRMPRD_KWD | 1972 | 19.7% |
ECO_KWD | 1937 | 19.4% |
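The mean length of 11.2462 reported for this column can be recovered from the counts above as a count-weighted average of the code lengths. A minimal sketch using only the figures in this table (the variable names are illustrative):

```python
# Counts copied from the "Common Values" table.
counts = {
    "COVID_SOC_KWD": 2101,
    "COVID_ECO_KWD": 2003,
    "INSTITUTE_KWD": 1987,
    "FRMPRD_KWD": 1972,
    "ECO_KWD": 1937,
}

total_rows = sum(counts.values())
# Each code contributes its character length once per row it appears in.
total_chars = sum(len(code) * n for code, n in counts.items())
mean_length = total_chars / total_rows

print(total_rows, mean_length)
```

The five counts add up to the 10000 observations of the dataset, so no other code values are present.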
Variable | 주 | 등록일자 | 주간단어개수 | 주간랭크 | 주간합계건수 | 키워드별코드 |
---|---|---|---|---|---|---|
주 | 1.000 | 1.000 | 0.044 | 0.074 | 0.041 | 0.100 |
등록일자 | 1.000 | 1.000 | 0.044 | 0.074 | 0.041 | 0.100 |
주간단어개수 | 0.044 | 0.044 | 1.000 | 0.390 | 0.892 | 0.225 |
주간랭크 | 0.074 | 0.074 | 0.390 | 1.000 | 0.350 | 0.000 |
주간합계건수 | 0.041 | 0.041 | 0.892 | 0.350 | 1.000 | 0.183 |
키워드별코드 | 0.100 | 0.100 | 0.225 | 0.000 | 0.183 | 1.000 |
Variable | 키워드별코드 | 주 | 등록일자 |
---|---|---|---|
키워드별코드 | 1.000 | 0.064 | 0.064 |
주 | 0.064 | 1.000 | 1.000 |
등록일자 | 0.064 | 1.000 | 1.000 |
Variable | 주간단어개수 | 주간랭크 | 주간합계건수 | 주 | 등록일자 | 키워드별코드 |
---|---|---|---|---|---|---|
주간단어개수 | 1.000 | -0.518 | 0.967 | 0.023 | 0.023 | 0.132 |
주간랭크 | -0.518 | 1.000 | -0.487 | 0.037 | 0.037 | 0.000 |
주간합계건수 | 0.967 | -0.487 | 1.000 | 0.022 | 0.022 | 0.113 |
주 | 0.023 | 0.037 | 0.022 | 1.000 | 1.000 | 0.064 |
등록일자 | 0.023 | 0.037 | 0.022 | 1.000 | 1.000 | 0.064 |
키워드별코드 | 0.132 | 0.000 | 0.113 | 0.064 | 0.064 | 1.000 |
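The matrix directly above contains negative entries between numeric variables, for example -0.518 between 주간단어개수 and 주간랭크, which is what a signed rank correlation would give when a larger count goes with a smaller rank number. The extracted text does not name the measure, so treating it as Spearman's rho is an assumption. A minimal pure-Python sketch of Spearman's rho with average ranks for ties, applied to the ten rows of the first sample table at the end of this report (`average_ranks`, `pearson` and `spearman` are hypothetical helpers, not the profiler's code):

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# 주간단어개수 and 주간합계건수 from the first ten sample rows.
word_counts = [20, 21, 10, 9, 124, 21, 22, 24, 425, 155]
total_counts = [17, 21, 4, 7, 76, 13, 14, 17, 258, 132]

rho = spearman(word_counts, total_counts)
print(round(rho, 3))
```

Ten rows are far too few to reproduce the reported 0.967, but the sign and rough strength of the relationship already show up on this small sample.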
Index | 주 | 단어 | 등록일자 | 주간단어개수 | 주간랭크 | 주간합계건수 | 키워드별코드 |
---|---|---|---|---|---|---|---|
72054 | 20220828-20220903 | 시공 | 2022-09-03 | 20 | 2943 | 17 | COVID_ECO_KWD |
49418 | 20220821-20220827 | 개가 | 2022-08-27 | 21 | 2431 | 21 | COVID_SOC_KWD |
4431 | 20220731-20220806 | 가게 | 2022-08-06 | 10 | 1664 | 4 | FRMPRD_KWD |
42560 | 20220814-20220820 | 해남 | 2022-08-20 | 9 | 1932 | 7 | FRMPRD_KWD |
93076 | 20220911-20220917 | 독감유행 | 2022-09-17 | 124 | 397 | 76 | COVID_SOC_KWD |
48892 | 20220821-20220827 | 코딩 | 2022-08-27 | 21 | 2353 | 13 | COVID_SOC_KWD |
36042 | 20220814-20220820 | 중소기업계 | 2022-08-20 | 22 | 2739 | 14 | COVID_ECO_KWD |
8413 | 20220731-20220806 | 병의원 | 2022-08-06 | 24 | 1983 | 17 | COVID_SOC_KWD |
83349 | 20220904-20220910 | 북상 | 2022-09-10 | 425 | 1059 | 258 | ECO_KWD |
63513 | 20220828-20220903 | 주방 | 2022-09-03 | 155 | 2362 | 132 | ECO_KWD |
Index | 주 | 단어 | 등록일자 | 주간단어개수 | 주간랭크 | 주간합계건수 | 키워드별코드 |
---|---|---|---|---|---|---|---|
1789 | 20220731-20220806 | 세종뉴스 | 2022-08-06 | 18 | 959 | 18 | FRMPRD_KWD |
88206 | 20220904-20220910 | 만t | 2022-09-10 | 27 | 750 | 11 | FRMPRD_KWD |
30698 | 20220814-20220820 | 구성원 | 2022-08-20 | 190 | 2149 | 121 | ECO_KWD |
1124 | 20220731-20220806 | 농촌진흥청 | 2022-08-06 | 25 | 722 | 16 | FRMPRD_KWD |
76206 | 20220904-20220910 | EU | 2022-09-10 | 30 | 2213 | 17 | COVID_ECO_KWD |
31134 | 20220814-20220820 | 영세 | 2022-08-20 | 172 | 2341 | 96 | ECO_KWD |
23448 | 20220807-20220813 | 필수 | 2022-08-13 | 350 | 1134 | 318 | ECO_KWD |
31770 | 20220814-20220820 | 마무리 | 2022-08-20 | 549 | 811 | 455 | ECO_KWD |
21073 | 20220807-20220813 | 비판 | 2022-08-13 | 1003 | 632 | 686 | INSTITUTE_KWD |
76243 | 20220904-20220910 | 호실 | 2022-09-10 | 29 | 2250 | 21 | COVID_ECO_KWD |