Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 576.3 KiB |
Average record size in memory | 59.0 B |
Variable types
DateTime | 1 |
---|---|
Text | 2 |
Numeric | 3 |
Categorical | 1 |
Dataset
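Description | Daily aggregated data addresses and keyword analysis data used to build the key keywords and keyword relationship network graphs in the news-based statistics search service. |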
---|---|
URL | https://www.data.go.kr/data/15121203/fileData.do |
Reproduction
Analysis started | 2023-12-12 19:39:33.995965 |
---|---|
Analysis finished | 2023-12-12 19:39:35.852863 |
Duration | 1.86 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
등록일자 (registration date)
Date
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.3 KiB |
Minimum | 2023-08-01 00:00:00 |
---|---|
Maximum | 2023-08-01 00:00:00 |
데이터주소 (data address)
Text
Distinct | 4205 |
---|---|
Distinct (%) | 42.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.3 KiB |
Length
Max length | 61 |
---|---|
Median length | 61 |
Mean length | 61 |
Min length | 61 |
Characters and Unicode
Total characters | 610000 |
---|---|
Distinct characters | 31 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1601 |
---|---|
Unique (%) | 16.0% |
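The difference between the Distinct and Unique counts can be illustrated on a toy list. This is a minimal sketch, assuming the report uses the usual profiling definitions: "distinct" is the number of different values, and "unique" is the number of values that occur exactly once. The function name `distinct_and_unique` and the toy data are illustrative only.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) under the assumed definitions."""
    counts = Counter(values)
    distinct = len(counts)                                # different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return distinct, unique

# Toy data: "a" and "d" occur once, "b" twice, "c" three times.
toy = ["a", "b", "b", "c", "c", "c", "d"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)
```

Under these definitions, 4205 distinct addresses with 1601 unique ones would mean that the remaining 2604 addresses each appear in more than one row.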
Sample
1st row | https://n.news.naver.com/mnews/article/028/0002650578?sid=101 |
---|---|
2nd row | https://n.news.naver.com/mnews/article/144/0000903833?sid=101 |
3rd row | https://n.news.naver.com/mnews/article/076/0004038967?sid=101 |
4th row | https://n.news.naver.com/mnews/article/662/0000025491?sid=101 |
5th row | https://n.news.naver.com/mnews/article/036/0000048547?sid=102 |
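Every sampled address has the same 61-character layout, so its parts can be separated with the standard library. The following is a minimal sketch, assuming the two numeric path segments are a publisher code and an article number; the names `office_id` and `article_id` are hypothetical labels chosen here and are not defined by the dataset.

```python
from urllib.parse import urlparse, parse_qs

def parse_article_url(url):
    """Split an article address into host, path segments and the sid query value."""
    parts = urlparse(url)
    # Expected path layout: /mnews/article/<3 digits>/<10 digits>
    segments = parts.path.strip("/").split("/")
    if len(segments) != 4 or segments[:2] != ["mnews", "article"]:
        raise ValueError(f"unexpected path layout: {parts.path}")
    query = parse_qs(parts.query)
    return {
        "host": parts.netloc,
        "office_id": segments[2],    # hypothetical name for the 3-digit segment
        "article_id": segments[3],   # hypothetical name for the 10-digit segment
        "sid": query.get("sid", [None])[0],
    }

sample = "https://n.news.naver.com/mnews/article/028/0002650578?sid=101"
info = parse_article_url(sample)
print(len(sample), info)
```

Grouping rows by the parsed `sid` or by the 3-digit segment is one way to explore why some addresses repeat across many rows.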
Value | Count | Frequency (%) |
https://n.news.naver.com/mnews/article/640/0000041513?sid=101 | 20 | 0.2% |
https://n.news.naver.com/mnews/article/056/0011537207?sid=102 | 17 | 0.2% |
https://n.news.naver.com/mnews/article/028/0002650578?sid=101 | 16 | 0.2% |
https://n.news.naver.com/mnews/article/001/0014105357?sid=102 | 16 | 0.2% |
https://n.news.naver.com/mnews/article/640/0000041526?sid=102 | 14 | 0.1% |
https://n.news.naver.com/mnews/article/640/0000041534?sid=102 | 14 | 0.1% |
https://n.news.naver.com/mnews/article/215/0001116766?sid=101 | 14 | 0.1% |
https://n.news.naver.com/mnews/article/640/0000041533?sid=101 | 14 | 0.1% |
https://n.news.naver.com/mnews/article/003/0012007971?sid=101 | 13 | 0.1% |
https://n.news.naver.com/mnews/article/640/0000041527?sid=102 | 13 | 0.1% |
Other values (4195) | 9849 | 98.5% |
Most occurring characters
Value | Count | Frequency (%) |
/ | 60000 | 9.8% |
0 | 58051 | 9.5% |
s | 40000 | 6.6% |
n | 40000 | 6.6% |
e | 40000 | 6.6% |
1 | 30060 | 4.9% |
t | 30000 | 4.9% |
. | 30000 | 4.9% |
r | 20000 | 3.3% |
i | 20000 | 3.3% |
Other values (21) | 241889 | 39.7% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 330000 | 54.1% |
Decimal Number | 160000 | 26.2% |
Other Punctuation | 110000 | 18.0% |
Math Symbol | 10000 | 1.6% |
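The category totals can be checked by hand on a single address. This is a small sketch, assuming the category names in the report correspond to the Unicode general categories `Ll`, `Nd`, `Po` and `Sm`. Because every address is 61 characters long with the same layout, one row should contribute the reported totals divided by 10000.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Ll', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "https://n.news.naver.com/mnews/article/028/0002650578?sid=101"
counts = category_counts(sample)
print(dict(counts))
# One row holds 33 lowercase letters, 16 digits, 11 punctuation marks and
# 1 math symbol, matching 330000, 160000, 110000 and 10000 over 10000 rows.
```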
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
s | 40000 | 12.1% |
n | 40000 | 12.1% |
e | 40000 | 12.1% |
t | 30000 | 9.1% |
r | 20000 | 6.1% |
i | 20000 | 6.1% |
c | 20000 | 6.1% |
m | 20000 | 6.1% |
a | 20000 | 6.1% |
w | 20000 | 6.1% |
Other values (6) | 60000 | 18.2% |
Decimal Number
Value | Count | Frequency (%) |
0 | 58051 | 36.3% |
1 | 30060 | 18.8% |
2 | 15385 | 9.6% |
4 | 9718 | 6.1% |
5 | 8991 | 5.6% |
6 | 8741 | 5.5% |
9 | 7677 | 4.8% |
7 | 7490 | 4.7% |
3 | 7351 | 4.6% |
8 | 6536 | 4.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 60000 | 54.5% |
. | 30000 | 27.3% |
? | 10000 | 9.1% |
: | 10000 | 9.1% |
Math Symbol
Value | Count | Frequency (%) |
= | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 330000 | 54.1% |
Common | 280000 | 45.9% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
s | 40000 | 12.1% |
n | 40000 | 12.1% |
e | 40000 | 12.1% |
t | 30000 | 9.1% |
r | 20000 | 6.1% |
i | 20000 | 6.1% |
c | 20000 | 6.1% |
m | 20000 | 6.1% |
a | 20000 | 6.1% |
w | 20000 | 6.1% |
Other values (6) | 60000 | 18.2% |
Common
Value | Count | Frequency (%) |
/ | 60000 | 21.4% |
0 | 58051 | 20.7% |
1 | 30060 | 10.7% |
. | 30000 | 10.7% |
2 | 15385 | 5.5% |
? | 10000 | 3.6% |
= | 10000 | 3.6% |
: | 10000 | 3.6% |
4 | 9718 | 3.5% |
5 | 8991 | 3.2% |
Other values (5) | 37795 | 13.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 610000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
/ | 60000 | 9.8% |
0 | 58051 | 9.5% |
s | 40000 | 6.6% |
n | 40000 | 6.6% |
e | 40000 | 6.6% |
1 | 30060 | 4.9% |
t | 30000 | 4.9% |
. | 30000 | 4.9% |
r | 20000 | 3.3% |
i | 20000 | 3.3% |
Other values (21) | 241889 | 39.7% |
단어 (word)
Text
Distinct | 3937 |
---|---|
Distinct (%) | 39.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.3 KiB |
Value | Count | Frequency (%) |
기자 | 152 | 1.5% |
확대 | 60 | 0.6% |
db | 57 | 0.6% |
필요 | 55 | 0.5% |
대상 | 54 | 0.5% |
기준 | 54 | 0.5% |
뉴스 | 51 | 0.5% |
다양 | 51 | 0.5% |
기업 | 48 | 0.5% |
가능 | 47 | 0.5% |
Other values (3920) | 9371 | 93.7% |
Most occurring characters
Value | Count | Frequency (%) |
기 | 957 | 3.3% |
대 | 549 | 1.9% |
가 | 455 | 1.6% |
자 | 424 | 1.5% |
국 | 412 | 1.4% |
구 | 337 | 1.2% |
스 | 300 | 1.0% |
사 | 287 | 1.0% |
해 | 264 | 0.9% |
강 | 246 | 0.9% |
Other values (751) | 24579 | 85.3% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 24340 | 84.5% |
Uppercase Letter | 2241 | 7.8% |
Lowercase Letter | 2229 | 7.7% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
기 | 957 | 3.9% |
대 | 549 | 2.3% |
가 | 455 | 1.9% |
자 | 424 | 1.7% |
국 | 412 | 1.7% |
구 | 337 | 1.4% |
스 | 300 | 1.2% |
사 | 287 | 1.2% |
해 | 264 | 1.1% |
강 | 246 | 1.0% |
Other values (699) | 20109 | 82.6% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 246 | 11.0% |
a | 201 | 9.0% |
n | 198 | 8.9% |
o | 184 | 8.3% |
i | 164 | 7.4% |
s | 150 | 6.7% |
t | 128 | 5.7% |
r | 126 | 5.7% |
d | 86 | 3.9% |
p | 83 | 3.7% |
Other values (16) | 663 | 29.7% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 224 | 10.0% |
B | 201 | 9.0% |
S | 175 | 7.8% |
D | 157 | 7.0% |
A | 156 | 7.0% |
M | 149 | 6.6% |
I | 146 | 6.5% |
O | 101 | 4.5% |
T | 100 | 4.5% |
P | 90 | 4.0% |
Other values (16) | 742 | 33.1% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 24340 | 84.5% |
Latin | 4470 | 15.5% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
기 | 957 | 3.9% |
대 | 549 | 2.3% |
가 | 455 | 1.9% |
자 | 424 | 1.7% |
국 | 412 | 1.7% |
구 | 337 | 1.4% |
스 | 300 | 1.2% |
사 | 287 | 1.2% |
해 | 264 | 1.1% |
강 | 246 | 1.0% |
Other values (699) | 20109 | 82.6% |
Latin
Value | Count | Frequency (%) |
e | 246 | 5.5% |
C | 224 | 5.0% |
a | 201 | 4.5% |
B | 201 | 4.5% |
n | 198 | 4.4% |
o | 184 | 4.1% |
S | 175 | 3.9% |
i | 164 | 3.7% |
D | 157 | 3.5% |
A | 156 | 3.5% |
Other values (42) | 2564 | 57.4% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 24340 | 84.5% |
ASCII | 4470 | 15.5% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
기 | 957 | 3.9% |
대 | 549 | 2.3% |
가 | 455 | 1.9% |
자 | 424 | 1.7% |
국 | 412 | 1.7% |
구 | 337 | 1.4% |
스 | 300 | 1.2% |
사 | 287 | 1.2% |
해 | 264 | 1.1% |
강 | 246 | 1.0% |
Other values (699) | 20109 | 82.6% |
ASCII
Value | Count | Frequency (%) |
e | 246 | 5.5% |
C | 224 | 5.0% |
a | 201 | 4.5% |
B | 201 | 4.5% |
n | 198 | 4.4% |
o | 184 | 4.1% |
S | 175 | 3.9% |
i | 164 | 3.7% |
D | 157 | 3.5% |
A | 156 | 3.5% |
Other values (42) | 2564 | 57.4% |
단어개수 (number of words)
Real number (ℝ)
Distinct | 22 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.5754 |
Minimum | 1 |
---|---|
Maximum | 25 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 2 |
95-th percentile | 4 |
Maximum | 25 |
Range | 24 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 1.5022452 |
---|---|
Coefficient of variation (CV) | 0.95356427 |
Kurtosis | 46.11651 |
Mean | 1.5754 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 5.4617943 |
Sum | 15754 |
Variance | 2.2567405 |
Monotonicity | Not monotonic |
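Several of the figures above follow from one another, which gives a quick consistency check on the report. This is a minimal sketch using only the values printed in the tables; the helper name `derived_stats` is illustrative, and `n` is taken from the number of observations in the dataset statistics.

```python
import math

def derived_stats(n, total, std, q1, q3, minimum, maximum):
    """Recompute the statistics that follow from other reported figures."""
    mean = total / n
    return {
        "mean": mean,
        "cv": std / mean,            # coefficient of variation
        "variance": std ** 2,
        "iqr": q3 - q1,
        "range": maximum - minimum,
    }

# Figures copied from the 단어개수 tables.
stats = derived_stats(n=10000, total=15754, std=1.5022452,
                      q1=1, q3=2, minimum=1, maximum=25)
for name, value in stats.items():
    print(f"{name}: {value:.7g}")
```

The recomputed values agree with the reported mean, CV, variance, IQR and range up to rounding of the printed standard deviation.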
Common values
Value | Count | Frequency (%) |
1 | 7371 | 73.7% |
2 | 1477 | 14.8% |
3 | 504 | 5.0% |
4 | 250 | 2.5% |
5 | 128 | 1.3% |
6 | 100 | 1.0% |
7 | 53 | 0.5% |
8 | 32 | 0.3% |
10 | 18 | 0.2% |
9 | 16 | 0.2% |
Other values (12) | 51 | 0.5% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 7371 | 73.7% |
2 | 1477 | 14.8% |
3 | 504 | 5.0% |
4 | 250 | 2.5% |
5 | 128 | 1.3% |
6 | 100 | 1.0% |
7 | 53 | 0.5% |
8 | 32 | 0.3% |
9 | 16 | 0.2% |
10 | 18 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
25 | 1 | < 0.1% |
23 | 3 | < 0.1% |
22 | 1 | < 0.1% |
20 | 1 | < 0.1% |
18 | 1 | < 0.1% |
17 | 4 | < 0.1% |
16 | 3 | < 0.1% |
15 | 3 | < 0.1% |
14 | 6 | 0.1% |
13 | 4 | < 0.1% |
데이터건수 (number of data records)
Real number (ℝ)
Distinct | 273 |
---|---|
Distinct (%) | 2.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 214.0577 |
Minimum | 1 |
---|---|
Maximum | 2551 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 10 |
median | 61 |
Q3 | 297 |
95-th percentile | 771 |
Maximum | 2551 |
Range | 2550 |
Interquartile range (IQR) | 287 |
Descriptive statistics
Standard deviation | 374.37153 |
---|---|
Coefficient of variation (CV) | 1.7489281 |
Kurtosis | 20.67667 |
Mean | 214.0577 |
Median Absolute Deviation (MAD) | 59 |
Skewness | 3.9690751 |
Sum | 2140577 |
Variance | 140154.04 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
1 | 734 | 7.3% |
2 | 412 | 4.1% |
3 | 323 | 3.2% |
4 | 250 | 2.5% |
6 | 183 | 1.8% |
5 | 183 | 1.8% |
2551 | 152 | 1.5% |
7 | 149 | 1.5% |
10 | 138 | 1.4% |
11 | 132 | 1.3% |
Other values (263) | 7344 | 73.4% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 734 | 7.3% |
2 | 412 | 4.1% |
3 | 323 | 3.2% |
4 | 250 | 2.5% |
5 | 183 | 1.8% |
6 | 183 | 1.8% |
7 | 149 | 1.5% |
8 | 125 | 1.2% |
9 | 128 | 1.3% |
10 | 138 | 1.4% |
Maximum 10 values
Value | Count | Frequency (%) |
2551 | 152 | 1.5% |
961 | 60 | 0.6% |
942 | 54 | 0.5% |
893 | 34 | 0.3% |
881 | 54 | 0.5% |
814 | 55 | 0.5% |
806 | 51 | 0.5% |
771 | 45 | 0.4% |
754 | 47 | 0.5% |
750 | 45 | 0.4% |
단어건수 (number of word occurrences)
Real number (ℝ)
Distinct | 304 |
---|---|
Distinct (%) | 3.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 125.0957 |
Minimum | 3 |
---|---|
Maximum | 700 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 39 |
Q1 | 77 |
median | 109 |
Q3 | 153 |
95-th percentile | 255 |
Maximum | 700 |
Range | 697 |
Interquartile range (IQR) | 76 |
Descriptive statistics
Standard deviation | 76.580564 |
---|---|
Coefficient of variation (CV) | 0.61217583 |
Kurtosis | 10.134513 |
Mean | 125.0957 |
Median Absolute Deviation (MAD) | 36 |
Skewness | 2.411821 |
Sum | 1250957 |
Variance | 5864.5828 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
122 | 118 | 1.2% |
80 | 111 | 1.1% |
77 | 105 | 1.1% |
101 | 101 | 1.0% |
68 | 97 | 1.0% |
86 | 95 | 0.9% |
87 | 95 | 0.9% |
102 | 95 | 0.9% |
79 | 94 | 0.9% |
91 | 93 | 0.9% |
Other values (294) | 8996 | 90.0% |
Minimum 10 values
Value | Count | Frequency (%) |
3 | 1 | < 0.1% |
6 | 1 | < 0.1% |
7 | 2 | < 0.1% |
8 | 6 | 0.1% |
10 | 3 | < 0.1% |
11 | 7 | 0.1% |
12 | 2 | < 0.1% |
13 | 3 | < 0.1% |
14 | 3 | < 0.1% |
15 | 13 | 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
700 | 16 | 0.2% |
574 | 16 | 0.2% |
517 | 11 | 0.1% |
514 | 5 | 0.1% |
502 | 9 | 0.1% |
488 | 5 | 0.1% |
480 | 12 | 0.1% |
467 | 20 | 0.2% |
433 | 17 | 0.2% |
421 | 8 | 0.1% |
주 (week)
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.3 KiB |
20230730-20230805 |
---|
Length
Max length | 17 |
---|---|
Median length | 17 |
Mean length | 17 |
Min length | 17 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20230730-20230805 |
---|---|
2nd row | 20230730-20230805 |
3rd row | 20230730-20230805 |
4th row | 20230730-20230805 |
5th row | 20230730-20230805 |
Common Values
Value | Count | Frequency (%) |
20230730-20230805 | 10000 | 100.0% |
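The single value of 주 appears to encode a date range as `YYYYMMDD-YYYYMMDD`. The following is a short sketch under that assumption; it parses the range and checks that the constant 등록일자 value, 2023-08-01, falls inside it. The helper name `parse_week` is illustrative.

```python
from datetime import date, datetime

def parse_week(label):
    """Parse a 'YYYYMMDD-YYYYMMDD' label into (start, end) dates."""
    start_text, end_text = label.split("-")
    start = datetime.strptime(start_text, "%Y%m%d").date()
    end = datetime.strptime(end_text, "%Y%m%d").date()
    return start, end

start, end = parse_week("20230730-20230805")
registered = date(2023, 8, 1)           # the constant 등록일자 value
in_week = start <= registered <= end
span_days = (end - start).days + 1      # inclusive length of the range
print(start, end, in_week, span_days)
```

Under this reading the range covers seven days, from a Sunday to a Saturday.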
| | 단어개수 | 데이터건수 | 단어건수 |
---|---|---|---|
단어개수 | 1.000 | 0.029 | 0.222 |
데이터건수 | 0.029 | 1.000 | 0.087 |
단어건수 | 0.222 | 0.087 | 1.000 |
| | 단어개수 | 데이터건수 | 단어건수 |
---|---|---|---|
단어개수 | 1.000 | 0.111 | 0.056 |
데이터건수 | 0.111 | 1.000 | -0.071 |
단어건수 | 0.056 | -0.071 | 1.000 |
First rows
| | 등록일자 | 데이터주소 | 단어 | 단어개수 | 데이터건수 | 단어건수 | 주 |
---|---|---|---|---|---|---|---|
0 | 2023-08-01 | https://n.news.naver.com/mnews/article/028/0002650578?sid=101 | 다양 | 1 | 806 | 700 | 20230730-20230805 |
1 | 2023-08-01 | https://n.news.naver.com/mnews/article/144/0000903833?sid=101 | 다양 | 1 | 806 | 145 | 20230730-20230805 |
2 | 2023-08-01 | https://n.news.naver.com/mnews/article/076/0004038967?sid=101 | 다양 | 1 | 806 | 89 | 20230730-20230805 |
3 | 2023-08-01 | https://n.news.naver.com/mnews/article/662/0000025491?sid=101 | 다양 | 2 | 806 | 139 | 20230730-20230805 |
4 | 2023-08-01 | https://n.news.naver.com/mnews/article/036/0000048547?sid=102 | 다양 | 1 | 806 | 480 | 20230730-20230805 |
5 | 2023-08-01 | https://n.news.naver.com/mnews/article/421/0006963833?sid=101 | 다양 | 1 | 806 | 219 | 20230730-20230805 |
6 | 2023-08-01 | https://n.news.naver.com/mnews/article/001/0014105158?sid=102 | 다양 | 1 | 806 | 129 | 20230730-20230805 |
7 | 2023-08-01 | https://n.news.naver.com/mnews/article/648/0000018372?sid=101 | 다양 | 1 | 806 | 153 | 20230730-20230805 |
8 | 2023-08-01 | https://n.news.naver.com/mnews/article/003/0012007000?sid=102 | 다양 | 1 | 806 | 165 | 20230730-20230805 |
9 | 2023-08-01 | https://n.news.naver.com/mnews/article/020/0003512553?sid=102 | 다양 | 1 | 806 | 138 | 20230730-20230805 |
Last rows
| | 등록일자 | 데이터주소 | 단어 | 단어개수 | 데이터건수 | 단어건수 | 주 |
---|---|---|---|---|---|---|---|
9990 | 2023-08-01 | https://n.news.naver.com/mnews/article/081/0003381483?sid=102 | 맹주 | 1 | 3 | 143 | 20230730-20230805 |
9991 | 2023-08-01 | https://n.news.naver.com/mnews/article/008/0004919637?sid=101 | 머니투데이 | 1 | 24 | 64 | 20230730-20230805 |
9992 | 2023-08-01 | https://n.news.naver.com/mnews/article/008/0004919682?sid=101 | 머니투데이 | 1 | 24 | 71 | 20230730-20230805 |
9993 | 2023-08-01 | https://n.news.naver.com/mnews/article/003/0012007327?sid=102 | 머리 | 2 | 62 | 94 | 20230730-20230805 |
9994 | 2023-08-01 | https://n.news.naver.com/mnews/article/087/0000986529?sid=101 | 머스 | 1 | 3 | 57 | 20230730-20230805 |
9995 | 2023-08-01 | https://n.news.naver.com/mnews/article/030/0003122065?sid=101 | 머신러닝 | 1 | 14 | 68 | 20230730-20230805 |
9996 | 2023-08-01 | https://n.news.naver.com/mnews/article/215/0001116946?sid=101 | 먹거리 | 2 | 51 | 91 | 20230730-20230805 |
9997 | 2023-08-01 | https://n.news.naver.com/mnews/article/008/0004919811?sid=101 | 먹거리 | 1 | 51 | 149 | 20230730-20230805 |
9998 | 2023-08-01 | https://n.news.naver.com/mnews/article/003/0012006461?sid=101 | 먹거리 | 1 | 51 | 213 | 20230730-20230805 |
9999 | 2023-08-01 | https://n.news.naver.com/mnews/article/366/0000921069?sid=101 | 먹거리 | 2 | 51 | 261 | 20230730-20230805 |