Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.3 KiB
Average record size in memory: 59.0 B
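
As a quick consistency check, the average record size should equal the total memory footprint divided by the number of observations. The sketch below uses only the figures reported above and assumes 1 KiB = 1024 bytes; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 576.3
n_observations = 10_000
n_variables = 7
missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"missing cells: {missing_pct:.1f}%")
```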

Variable types

DateTime: 1
Text: 2
Numeric: 3
Categorical: 1

Dataset

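Description: Daily aggregated data addresses (article URLs) and keyword analysis data, used to build the key keywords and keyword relationship network graphs in the news-based statistics search service.
URL: https://www.data.go.kr/data/15121203/fileData.do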

Alerts

등록일자 has constant value "2023-08-01" (Constant)
The unnamed column has constant value "20230730-20230805" (Constant)

Reproduction

Analysis started: 2023-12-12 19:39:33.995965
Analysis finished: 2023-12-12 19:39:35.852863
Duration: 1.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
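
A report of this kind can be regenerated with ydata-profiling roughly as sketched below. This is a minimal sketch, not the configuration stored in config.json: the file name, the report title, and the encoding are assumptions, and the source CSV has to be downloaded from the URL above first.

```python
def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (sketch, not the original config)."""
    # Imported lazily so the function can be defined even when the optional
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a plain CSV. Korean public datasets are
    # sometimes encoded as cp949 rather than UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_html)
    return report

# Example call (hypothetical file name):
# build_report("news_keywords.csv", encoding="cp949")
```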

Variables

등록일자 (registration date)
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB
Minimum: 2023-08-01 00:00:00
Maximum: 2023-08-01 00:00:00
Histogram with fixed size bins (bins=1)

데이터주소 (data address)
Text

Distinct: 4205
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Length

Max length: 61
Median length: 61
Mean length: 61
Min length: 61

Characters and Unicode

Total characters: 610000
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
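
To illustrate how such character properties are obtained, the sketch below counts the Unicode general category of every character in one URL from this column, using Python's standard unicodedata module. This is not necessarily how ydata-profiling computes its figures internally. The per-URL counts are consistent with the column totals in the tables of this section, because every value has the same structure and length.

```python
import unicodedata
from collections import Counter

# One URL from this column.
url = "https://n.news.naver.com/mnews/article/028/0002650578?sid=101"

# Two-letter general category per character, e.g. "Ll" = Lowercase Letter,
# "Nd" = Decimal Number, "Po" = Other Punctuation, "Sm" = Math Symbol.
category_counts = Counter(unicodedata.category(ch) for ch in url)

print(len(url))               # number of characters in the URL
print(dict(category_counts))  # characters per general category
```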

Unique

Unique: 1601
Unique (%): 16.0%

Sample

1st row: https://n.news.naver.com/mnews/article/028/0002650578?sid=101
2nd row: https://n.news.naver.com/mnews/article/144/0000903833?sid=101
3rd row: https://n.news.naver.com/mnews/article/076/0004038967?sid=101
4th row: https://n.news.naver.com/mnews/article/662/0000025491?sid=101
5th row: https://n.news.naver.com/mnews/article/036/0000048547?sid=102

Common Values

Value                                                          Count  Frequency (%)
https://n.news.naver.com/mnews/article/640/0000041513?sid=101  20     0.2%
https://n.news.naver.com/mnews/article/056/0011537207?sid=102  17     0.2%
https://n.news.naver.com/mnews/article/028/0002650578?sid=101  16     0.2%
https://n.news.naver.com/mnews/article/001/0014105357?sid=102  16     0.2%
https://n.news.naver.com/mnews/article/640/0000041526?sid=102  14     0.1%
https://n.news.naver.com/mnews/article/640/0000041534?sid=102  14     0.1%
https://n.news.naver.com/mnews/article/215/0001116766?sid=101  14     0.1%
https://n.news.naver.com/mnews/article/640/0000041533?sid=101  14     0.1%
https://n.news.naver.com/mnews/article/003/0012007971?sid=101  13     0.1%
https://n.news.naver.com/mnews/article/640/0000041527?sid=102  13     0.1%
Other values (4195)                                            9849   98.5%

Most occurring characters

Value              Count   Frequency (%)
/                  60000   9.8%
0                  58051   9.5%
s                  40000   6.6%
n                  40000   6.6%
e                  40000   6.6%
1                  30060   4.9%
t                  30000   4.9%
.                  30000   4.9%
r                  20000   3.3%
i                  20000   3.3%
Other values (21)  241889  39.7%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   330000  54.1%
Decimal Number     160000  26.2%
Other Punctuation  110000  18.0%
Math Symbol        10000   1.6%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
s                 40000  12.1%
n                 40000  12.1%
e                 40000  12.1%
t                 30000  9.1%
r                 20000  6.1%
i                 20000  6.1%
c                 20000  6.1%
m                 20000  6.1%
a                 20000  6.1%
w                 20000  6.1%
Other values (6)  60000  18.2%

Decimal Number
Value  Count  Frequency (%)
0      58051  36.3%
1      30060  18.8%
2      15385  9.6%
4      9718   6.1%
5      8991   5.6%
6      8741   5.5%
9      7677   4.8%
7      7490   4.7%
3      7351   4.6%
8      6536   4.1%

Other Punctuation
Value  Count  Frequency (%)
/      60000  54.5%
.      30000  27.3%
?      10000  9.1%
:      10000  9.1%

Math Symbol
Value  Count  Frequency (%)
=      10000  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   330000  54.1%
Common  280000  45.9%

Most frequent character per script

Latin
Value             Count  Frequency (%)
s                 40000  12.1%
n                 40000  12.1%
e                 40000  12.1%
t                 30000  9.1%
r                 20000  6.1%
i                 20000  6.1%
c                 20000  6.1%
m                 20000  6.1%
a                 20000  6.1%
w                 20000  6.1%
Other values (6)  60000  18.2%

Common
Value             Count  Frequency (%)
/                 60000  21.4%
0                 58051  20.7%
1                 30060  10.7%
.                 30000  10.7%
2                 15385  5.5%
?                 10000  3.6%
=                 10000  3.6%
:                 10000  3.6%
4                 9718   3.5%
5                 8991   3.2%
Other values (5)  37795  13.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  610000  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
/                  60000   9.8%
0                  58051   9.5%
s                  40000   6.6%
n                  40000   6.6%
e                  40000   6.6%
1                  30060   4.9%
t                  30000   4.9%
.                  30000   4.9%
r                  20000   3.3%
i                  20000   3.3%
Other values (21)  241889  39.7%

단어 (word)
Text

Distinct: 3937
Distinct (%): 39.4%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB

Length

Max length: 15
Median length: 2
Mean length: 2.881
Min length: 2

Characters and Unicode

Total characters: 28810
Distinct characters: 761
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
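
For a column that mixes Hangul and Latin text, the same idea can be sketched with the standard unicodedata module. The three words are values that appear in this column elsewhere in the report. The range test for Hangul is an assumption (the Hangul Syllables block, U+AC00 to U+D7A3) and may not match the block definition the profiler uses.

```python
import unicodedata
from collections import Counter

# Values that appear in this column elsewhere in the report.
words = ["머신러닝", "먹거리", "db"]

def is_hangul_syllable(ch):
    # Assumption: "Hangul" means the Hangul Syllables block, U+AC00..U+D7A3.
    return 0xAC00 <= ord(ch) <= 0xD7A3

# Normalize to NFC so each syllable is a single precomposed code point.
chars = [ch for w in words for ch in unicodedata.normalize("NFC", w)]

# "Lo" = Other Letter (Hangul syllables), "Ll" = Lowercase Letter.
category_counts = Counter(unicodedata.category(ch) for ch in chars)
hangul_chars = sum(is_hangul_syllable(ch) for ch in chars)

print(dict(category_counts))
print(hangul_chars)
```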

Unique

Unique: 2682
Unique (%): 26.8%

Sample

1st row: 다양 (diverse)
2nd row: 다양 (diverse)
3rd row: 다양 (diverse)
4th row: 다양 (diverse)
5th row: 다양 (diverse)

Common Values

Value                Count  Frequency (%)
기자 (reporter)      152    1.5%
확대 (expansion)     60     0.6%
db                   57     0.6%
필요 (need)          55     0.5%
대상 (target)        54     0.5%
기준 (standard)      54     0.5%
뉴스 (news)          51     0.5%
다양 (diverse)       51     0.5%
기업 (company)       48     0.5%
가능 (possible)      47     0.5%
Other values (3920)  9371   93.7%

Most occurring characters

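Value                Count  Frequency (%)
(character missing)  957    3.3%
(character missing)  549    1.9%
(character missing)  455    1.6%
(character missing)  424    1.5%
(character missing)  412    1.4%
(character missing)  337    1.2%
(character missing)  300    1.0%
(character missing)  287    1.0%
(character missing)  264    0.9%
(character missing)  246    0.9%
Other values (751)   24579  85.3%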

Most occurring categories

Value             Count  Frequency (%)
Other Letter      24340  84.5%
Uppercase Letter  2241   7.8%
Lowercase Letter  2229   7.7%

Most frequent character per category

Other Letter
Value                Count  Frequency (%)
(character missing)  957    3.9%
(character missing)  549    2.3%
(character missing)  455    1.9%
(character missing)  424    1.7%
(character missing)  412    1.7%
(character missing)  337    1.4%
(character missing)  300    1.2%
(character missing)  287    1.2%
(character missing)  264    1.1%
(character missing)  246    1.0%
Other values (699)   20109  82.6%

Lowercase Letter
Value              Count  Frequency (%)
e                  246    11.0%
a                  201    9.0%
n                  198    8.9%
o                  184    8.3%
i                  164    7.4%
s                  150    6.7%
t                  128    5.7%
r                  126    5.7%
d                  86     3.9%
p                  83     3.7%
Other values (16)  663    29.7%

Uppercase Letter
Value              Count  Frequency (%)
C                  224    10.0%
B                  201    9.0%
S                  175    7.8%
D                  157    7.0%
A                  156    7.0%
M                  149    6.6%
I                  146    6.5%
O                  101    4.5%
T                  100    4.5%
P                  90     4.0%
Other values (16)  742    33.1%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  24340  84.5%
Latin   4470   15.5%

Most frequent character per script

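Hangul
Value                Count  Frequency (%)
(character missing)  957    3.9%
(character missing)  549    2.3%
(character missing)  455    1.9%
(character missing)  424    1.7%
(character missing)  412    1.7%
(character missing)  337    1.4%
(character missing)  300    1.2%
(character missing)  287    1.2%
(character missing)  264    1.1%
(character missing)  246    1.0%
Other values (699)   20109  82.6%

Latin
Value              Count  Frequency (%)
e                  246    5.5%
C                  224    5.0%
a                  201    4.5%
B                  201    4.5%
n                  198    4.4%
o                  184    4.1%
S                  175    3.9%
i                  164    3.7%
D                  157    3.5%
A                  156    3.5%
Other values (42)  2564   57.4%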

Most occurring blocks

Value   Count  Frequency (%)
Hangul  24340  84.5%
ASCII   4470   15.5%

Most frequent character per block

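Hangul
Value                Count  Frequency (%)
(character missing)  957    3.9%
(character missing)  549    2.3%
(character missing)  455    1.9%
(character missing)  424    1.7%
(character missing)  412    1.7%
(character missing)  337    1.4%
(character missing)  300    1.2%
(character missing)  287    1.2%
(character missing)  264    1.1%
(character missing)  246    1.0%
Other values (699)   20109  82.6%

ASCII
Value              Count  Frequency (%)
e                  246    5.5%
C                  224    5.0%
a                  201    4.5%
B                  201    4.5%
n                  198    4.4%
o                  184    4.1%
S                  175    3.9%
i                  164    3.7%
D                  157    3.5%
A                  156    3.5%
Other values (42)  2564   57.4%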

단어개수 (word count)
Real number (ℝ)

Distinct: 22
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5754
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 4
Maximum: 25
Range: 24
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.5022452
Coefficient of variation (CV): 0.95356427
Kurtosis: 46.11651
Mean: 1.5754
Median Absolute Deviation (MAD): 0
Skewness: 5.4617943
Sum: 15754
Variance: 2.2567405
Monotonicity: Not monotonic
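
Several of these figures are derived from one another. The sketch below recomputes the coefficient of variation, variance, sum, range, and IQR from the reported mean, standard deviation, and quantiles. It uses the textbook definitions and only the numbers printed above, so small rounding differences are expected.

```python
import math

# Figures reported for 단어개수 above.
n = 10_000
mean = 1.5754
std = 1.5022452
minimum, q1, q3, maximum = 1, 1, 2, 25

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n                  # sum of all values
value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range

print(f"CV={cv:.8f} variance={variance:.7f} sum={total:.0f}")
print(f"range={value_range} IQR={iqr}")
```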
Histogram with fixed size bins (bins=22)

Common values

Value              Count  Frequency (%)
1                  7371   73.7%
2                  1477   14.8%
3                  504    5.0%
4                  250    2.5%
5                  128    1.3%
6                  100    1.0%
7                  53     0.5%
8                  32     0.3%
10                 18     0.2%
9                  16     0.2%
Other values (12)  51     0.5%

Minimum 10 values

Value  Count  Frequency (%)
1      7371   73.7%
2      1477   14.8%
3      504    5.0%
4      250    2.5%
5      128    1.3%
6      100    1.0%
7      53     0.5%
8      32     0.3%
9      16     0.2%
10     18     0.2%

Maximum 10 values

Value  Count  Frequency (%)
25     1      < 0.1%
23     3      < 0.1%
22     1      < 0.1%
20     1      < 0.1%
18     1      < 0.1%
17     4      < 0.1%
16     3      < 0.1%
15     3      < 0.1%
14     6      0.1%
13     4      < 0.1%

데이터건수 (data record count)
Real number (ℝ)

Distinct: 273
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 214.0577
Minimum: 1
Maximum: 2551
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 10
Median: 61
Q3: 297
95-th percentile: 771
Maximum: 2551
Range: 2550
Interquartile range (IQR): 287
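
One way to read these quantiles is through Tukey's fences, a common rule of thumb that flags values more than 1.5 × IQR outside the quartiles. The report itself does not compute this; the sketch below is an illustration that uses only the figures above.

```python
# Quantiles reported for 데이터건수 above.
q1, q3 = 10, 297
minimum, maximum = 1, 2551

iqr = q3 - q1

# Tukey's rule of thumb: values beyond 1.5 * IQR from the quartiles
# are often treated as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR={iqr}, fences=({lower_fence}, {upper_fence})")
print("maximum beyond upper fence:", maximum > upper_fence)
print("minimum beyond lower fence:", minimum < lower_fence)
```

Under this rule the maximum of 2551 lies well above the upper fence, which agrees with the strong right skew reported for this variable.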

Descriptive statistics

Standard deviation: 374.37153
Coefficient of variation (CV): 1.7489281
Kurtosis: 20.67667
Mean: 214.0577
Median Absolute Deviation (MAD): 59
Skewness: 3.9690751
Sum: 2140577
Variance: 140154.04
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   734    7.3%
2                   412    4.1%
3                   323    3.2%
4                   250    2.5%
6                   183    1.8%
5                   183    1.8%
2551                152    1.5%
7                   149    1.5%
10                  138    1.4%
11                  132    1.3%
Other values (263)  7344   73.4%

Minimum 10 values

Value  Count  Frequency (%)
1      734    7.3%
2      412    4.1%
3      323    3.2%
4      250    2.5%
5      183    1.8%
6      183    1.8%
7      149    1.5%
8      125    1.2%
9      128    1.3%
10     138    1.4%

Maximum 10 values

Value  Count  Frequency (%)
2551   152    1.5%
961    60     0.6%
942    54     0.5%
893    34     0.3%
881    54     0.5%
814    55     0.5%
806    51     0.5%
771    45     0.4%
754    47     0.5%
750    45     0.4%

단어건수 (word record count)
Real number (ℝ)

Distinct: 304
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 125.0957
Minimum: 3
Maximum: 700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 39
Q1: 77
Median: 109
Q3: 153
95-th percentile: 255
Maximum: 700
Range: 697
Interquartile range (IQR): 76

Descriptive statistics

Standard deviation: 76.580564
Coefficient of variation (CV): 0.61217583
Kurtosis: 10.134513
Mean: 125.0957
Median Absolute Deviation (MAD): 36
Skewness: 2.411821
Sum: 1250957
Variance: 5864.5828
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
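
To make "fixed size bins" concrete, the sketch below splits the reported range of this variable (minimum 3, maximum 700) into 50 equal-width bins and locates the bin that holds a given value. It assumes the bins are equally spaced between the minimum and the maximum, which is how NumPy's histogram behaves by default; the helper names are illustrative.

```python
def fixed_bin_edges(lo, hi, bins):
    """Return bins + 1 equally spaced edges covering [lo, hi]."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]

def bin_index(value, lo, hi, bins):
    """Index of the bin holding value; the maximum falls in the last bin."""
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)

# Minimum, maximum, and bin count as reported for this variable.
edges = fixed_bin_edges(3, 700, 50)
bin_width = edges[1] - edges[0]

print(f"bin width: {bin_width:.2f}")
print("bin of the median (109):", bin_index(109, 3, 700, 50))
```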

Common values

Value               Count  Frequency (%)
122                 118    1.2%
80                  111    1.1%
77                  105    1.1%
101                 101    1.0%
68                  97     1.0%
86                  95     0.9%
87                  95     0.9%
102                 95     0.9%
79                  94     0.9%
91                  93     0.9%
Other values (294)  8996   90.0%

Minimum 10 values

Value  Count  Frequency (%)
3      1      < 0.1%
6      1      < 0.1%
7      2      < 0.1%
8      6      0.1%
10     3      < 0.1%
11     7      0.1%
12     2      < 0.1%
13     3      < 0.1%
14     3      < 0.1%
15     13     0.1%

Maximum 10 values

Value  Count  Frequency (%)
700    16     0.2%
574    16     0.2%
517    11     0.1%
514    5      0.1%
502    9      0.1%
488    5      0.1%
480    12     0.1%
467    20     0.2%
433    17     0.2%
421    8      0.1%


(unnamed column)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.3 KiB
20230730-20230805: 10000

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230730-20230805
2nd row: 20230730-20230805
3rd row: 20230730-20230805
4th row: 20230730-20230805
5th row: 20230730-20230805

Common Values

Value              Count  Frequency (%)
20230730-20230805  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value              Count  Frequency (%)
20230730-20230805  10000  100.0%

Interactions

[Figures: nine pairwise interaction plots (images omitted)]

Correlations

            단어개수  데이터건수  단어건수
단어개수    1.000     0.029       0.222
데이터건수  0.029     1.000       0.087
단어건수    0.222     0.087       1.000

            단어개수  데이터건수  단어건수
단어개수    1.000     0.111       0.056
데이터건수  0.111     1.000       -0.071
단어건수    0.056     -0.071      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

   등록일자    데이터주소                                                     단어  단어개수  데이터건수  단어건수  (unnamed)
0  2023-08-01  https://n.news.naver.com/mnews/article/028/0002650578?sid=101  다양  1         806         700       20230730-20230805
1  2023-08-01  https://n.news.naver.com/mnews/article/144/0000903833?sid=101  다양  1         806         145       20230730-20230805
2  2023-08-01  https://n.news.naver.com/mnews/article/076/0004038967?sid=101  다양  1         806         89        20230730-20230805
3  2023-08-01  https://n.news.naver.com/mnews/article/662/0000025491?sid=101  다양  2         806         139       20230730-20230805
4  2023-08-01  https://n.news.naver.com/mnews/article/036/0000048547?sid=102  다양  1         806         480       20230730-20230805
5  2023-08-01  https://n.news.naver.com/mnews/article/421/0006963833?sid=101  다양  1         806         219       20230730-20230805
6  2023-08-01  https://n.news.naver.com/mnews/article/001/0014105158?sid=102  다양  1         806         129       20230730-20230805
7  2023-08-01  https://n.news.naver.com/mnews/article/648/0000018372?sid=101  다양  1         806         153       20230730-20230805
8  2023-08-01  https://n.news.naver.com/mnews/article/003/0012007000?sid=102  다양  1         806         165       20230730-20230805
9  2023-08-01  https://n.news.naver.com/mnews/article/020/0003512553?sid=102  다양  1         806         138       20230730-20230805

Last rows

      등록일자    데이터주소                                                     단어        단어개수, 데이터건수, 단어건수 (digits run together)  (unnamed)
9990  2023-08-01  https://n.news.naver.com/mnews/article/081/0003381483?sid=102  맹주        13143                                                20230730-20230805
9991  2023-08-01  https://n.news.naver.com/mnews/article/008/0004919637?sid=101  머니투데이  12464                                                20230730-20230805
9992  2023-08-01  https://n.news.naver.com/mnews/article/008/0004919682?sid=101  머니투데이  12471                                                20230730-20230805
9993  2023-08-01  https://n.news.naver.com/mnews/article/003/0012007327?sid=102  머리        26294                                                20230730-20230805
9994  2023-08-01  https://n.news.naver.com/mnews/article/087/0000986529?sid=101  머스        1357                                                 20230730-20230805
9995  2023-08-01  https://n.news.naver.com/mnews/article/030/0003122065?sid=101  머신러닝    11468                                                20230730-20230805
9996  2023-08-01  https://n.news.naver.com/mnews/article/215/0001116946?sid=101  먹거리      25191                                                20230730-20230805
9997  2023-08-01  https://n.news.naver.com/mnews/article/008/0004919811?sid=101  먹거리      151149                                               20230730-20230805
9998  2023-08-01  https://n.news.naver.com/mnews/article/003/0012006461?sid=101  먹거리      151213                                               20230730-20230805
9999  2023-08-01  https://n.news.naver.com/mnews/article/366/0000921069?sid=101  먹거리      251261                                               20230730-20230805