Overview

Dataset statistics

Number of variables: 3
Number of observations: 2620
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 64.1 KiB
Average record size in memory: 25.1 B

Variable types

DateTime: 1
Text: 1
Numeric: 1

Dataset

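Description: "Today's issues" data produced by analysing newspaper and broadcast news in the news database "BIGKinds". It provides the 10 issues covered most each day by 54 domestic (South Korean) news outlets, together with their frequency counts. More information is available at https://www.bigkinds.or.kr/v2/news/weekendNews.do.
Author: 한국언론진흥재단 (Korea Press Foundation)
URL: https://www.data.go.kr/data/15119893/fileData.do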

Reproduction

Analysis started: 2024-04-06 08:06:51.816948
Analysis finished: 2024-04-06 08:06:53.558910
Duration: 1.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

날짜 (date)
Date

Distinct: 262
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB
Minimum: 2020-01-01 00:00:00
Maximum: 2020-12-31 00:00:00
Histogram with fixed size bins (bins=50)

제목 (title)
Text

Distinct: 2446
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Memory size: 20.6 KiB

Length

Max length: 27
Median length: 24
Mean length: 18.614885
Min length: 8

Characters and Unicode

Total characters: 48771
Distinct characters: 674
Distinct categories: 10
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
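
As an illustration of how such character properties can be read, the following minimal sketch counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how ydata-profiling computes its tables. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lo', 'Zs', 'Nd') over all characters."""
    counts = Counter()
    for text in texts:
        # unicodedata.category returns the two-letter general category of one character.
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# One title taken from the Sample section of this report.
sample = ["영국발 변이 바이러스 2명 추가 확인"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Running the same function over every value of the 제목 column should, under these assumptions, give totals comparable to the "Most occurring categories" table.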

Unique

Unique: 2372
Unique (%): 90.5%

Sample

1st row: 초대 공수처장 후보 김진욱 지명
2nd row: 박범계 신임 법무부장관 내정
3rd row: 코스피 사상 마감 최고 8만 전자
4th row: 김진욱 국민 공수처장 권한 후보
5th row: 카투사 한미 미군 한국인 백신 접종
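Value | Count | Frequency (%)
코로나19 (COVID-19) | 495 | 3.6%
확진자 (confirmed case) | 291 | 2.1%
확진 (confirmed) | 194 | 1.4%
신규 (new) | 178 | 1.3%
코로나 (corona) | 178 | 1.3%
대통령 (president) | 143 | 1.0%
발생 (occurrence) | 126 | 0.9%
트럼프 (Trump) | 119 | 0.9%
감염 (infection) | 118 | 0.9%
추가 (additional) | 103 | 0.7%
Other values (3696) | 11882 | 85.9%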
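
A word-frequency table like the one above can be approximated by splitting each title on whitespace and counting tokens. The sketch below assumes plain whitespace tokenisation, which may differ from the profiler's exact rules. The helper name `word_frequencies` is hypothetical, and the input is only the five sample rows shown in this report, not the full column.

```python
from collections import Counter


def word_frequencies(titles):
    """Split each title on whitespace and count how often each token occurs."""
    counts = Counter()
    for title in titles:
        counts.update(title.split())
    return counts


# The five sample rows listed in the Sample section above.
titles = [
    "초대 공수처장 후보 김진욱 지명",
    "박범계 신임 법무부장관 내정",
    "코스피 사상 마감 최고 8만 전자",
    "김진욱 국민 공수처장 권한 후보",
    "카투사 한미 미군 한국인 백신 접종",
]

freq = word_frequencies(titles)
total = sum(freq.values())

# Print value, count and share of all tokens, mirroring the table columns.
for word, n in freq.most_common(3):
    print(f"{word}\t{n}\t{n / total:.1%}")
```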

Most occurring characters

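Value | Count | Frequency (%)
(space) | 11207 | 23.0%
(character missing) | 803 | 1.6%
1 | 791 | 1.6%
(character missing) | 774 | 1.6%
(character missing) | 744 | 1.5%
(character missing) | 660 | 1.4%
(character missing) | 589 | 1.2%
(character missing) | 583 | 1.2%
(character missing) | 568 | 1.2%
9 | 566 | 1.2%
Other values (664) | 31486 | 64.6%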

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33167 | 68.0%
Space Separator | 11207 | 23.0%
Decimal Number | 2449 | 5.0%
Other Punctuation | 839 | 1.7%
Initial Punctuation | 478 | 1.0%
Final Punctuation | 476 | 1.0%
Uppercase Letter | 100 | 0.2%
Dash Punctuation | 25 | 0.1%
Lowercase Letter | 25 | 0.1%
Math Symbol | 5 | < 0.1%

Most frequent character per category

Other Letter
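Value | Count | Frequency (%)
(character missing) | 803 | 2.4%
(character missing) | 774 | 2.3%
(character missing) | 744 | 2.2%
(character missing) | 660 | 2.0%
(character missing) | 589 | 1.8%
(character missing) | 583 | 1.8%
(character missing) | 568 | 1.7%
(character missing) | 565 | 1.7%
(character missing) | 492 | 1.5%
(character missing) | 476 | 1.4%
Other values (614) | 26913 | 81.1%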
Uppercase Letter
Value | Count | Frequency (%)
O | 12 | 12.0%
C | 11 | 11.0%
S | 9 | 9.0%
H | 8 | 8.0%
D | 7 | 7.0%
W | 7 | 7.0%
T | 6 | 6.0%
P | 6 | 6.0%
K | 5 | 5.0%
G | 5 | 5.0%
Other values (9) | 24 | 24.0%
Decimal Number
Value | Count | Frequency (%)
1 | 791 | 32.3%
9 | 566 | 23.1%
2 | 246 | 10.0%
0 | 225 | 9.2%
3 | 181 | 7.4%
5 | 122 | 5.0%
4 | 109 | 4.5%
7 | 79 | 3.2%
8 | 67 | 2.7%
6 | 63 | 2.6%
Other Punctuation
Value | Count | Frequency (%)
, | 542 | 64.6%
· | 136 | 16.2%
. | 64 | 7.6%
% | 58 | 6.9%
? | 33 | 3.9%
(character missing) | 2 | 0.2%
" | 2 | 0.2%
' | 2 | 0.2%
Lowercase Letter
Value | Count | Frequency (%)
n | 12 | 48.0%
m | 8 | 32.0%
v | 2 | 8.0%
s | 2 | 8.0%
α | 1 | 4.0%
Initial Punctuation
Value | Count | Frequency (%)
(character missing) | 343 | 71.8%
(character missing) | 135 | 28.2%
Final Punctuation
Value | Count | Frequency (%)
(character missing) | 341 | 71.6%
(character missing) | 135 | 28.4%
Math Symbol
Value | Count | Frequency (%)
~ | 3 | 60.0%
+ | 2 | 40.0%
Space Separator
Value | Count | Frequency (%)
(space) | 11207 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 32923 | 67.5%
Common | 15479 | 31.7%
Han | 244 | 0.5%
Latin | 124 | 0.3%
Greek | 1 | < 0.1%

Most frequent character per script

Hangul
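Value | Count | Frequency (%)
(character missing) | 803 | 2.4%
(character missing) | 774 | 2.4%
(character missing) | 744 | 2.3%
(character missing) | 660 | 2.0%
(character missing) | 589 | 1.8%
(character missing) | 583 | 1.8%
(character missing) | 568 | 1.7%
(character missing) | 565 | 1.7%
(character missing) | 492 | 1.5%
(character missing) | 476 | 1.4%
Other values (597) | 26669 | 81.0%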
Common
Value | Count | Frequency (%)
(space) | 11207 | 72.4%
1 | 791 | 5.1%
9 | 566 | 3.7%
, | 542 | 3.5%
(character missing) | 343 | 2.2%
(character missing) | 341 | 2.2%
2 | 246 | 1.6%
0 | 225 | 1.5%
3 | 181 | 1.2%
· | 136 | 0.9%
Other values (16) | 901 | 5.8%
Latin
Value | Count | Frequency (%)
n | 12 | 9.7%
O | 12 | 9.7%
C | 11 | 8.9%
S | 9 | 7.3%
m | 8 | 6.5%
H | 8 | 6.5%
D | 7 | 5.6%
W | 7 | 5.6%
T | 6 | 4.8%
P | 6 | 4.8%
Other values (13) | 38 | 30.6%
Han
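Value | Count | Frequency (%)
(character missing) | 75 | 30.7%
(character missing) | 44 | 18.0%
(character missing) | 30 | 12.3%
(character missing) | 18 | 7.4%
(character missing) | 16 | 6.6%
(character missing) | 14 | 5.7%
(character missing) | 13 | 5.3%
(character missing) | 12 | 4.9%
(character missing) | 10 | 4.1%
(character missing) | 3 | 1.2%
Other values (7) | 9 | 3.7%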
Greek
Value | Count | Frequency (%)
α | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 32923 | 67.5%
ASCII | 14511 | 29.8%
Punctuation | 956 | 2.0%
CJK | 244 | 0.5%
None | 137 | 0.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 11207 | 77.2%
1 | 791 | 5.5%
9 | 566 | 3.9%
, | 542 | 3.7%
2 | 246 | 1.7%
0 | 225 | 1.6%
3 | 181 | 1.2%
5 | 122 | 0.8%
4 | 109 | 0.8%
7 | 79 | 0.5%
Other values (33) | 443 | 3.1%
Hangul
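Value | Count | Frequency (%)
(character missing) | 803 | 2.4%
(character missing) | 774 | 2.4%
(character missing) | 744 | 2.3%
(character missing) | 660 | 2.0%
(character missing) | 589 | 1.8%
(character missing) | 583 | 1.8%
(character missing) | 568 | 1.7%
(character missing) | 565 | 1.7%
(character missing) | 492 | 1.5%
(character missing) | 476 | 1.4%
Other values (597) | 26669 | 81.0%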
Punctuation
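Value | Count | Frequency (%)
(character missing) | 343 | 35.9%
(character missing) | 341 | 35.7%
(character missing) | 135 | 14.1%
(character missing) | 135 | 14.1%
(character missing) | 2 | 0.2%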
None
Value | Count | Frequency (%)
· | 136 | 99.3%
α | 1 | 0.7%
CJK
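Value | Count | Frequency (%)
(character missing) | 75 | 30.7%
(character missing) | 44 | 18.0%
(character missing) | 30 | 12.3%
(character missing) | 18 | 7.4%
(character missing) | 16 | 6.6%
(character missing) | 14 | 5.7%
(character missing) | 13 | 5.3%
(character missing) | 12 | 4.9%
(character missing) | 10 | 4.1%
(character missing) | 3 | 1.2%
Other values (7) | 9 | 3.7%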

건수 (count)
Real number (ℝ)

Distinct: 359
Distinct (%): 13.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.90038
Minimum: 26
Maximum: 1264
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 26
5th percentile: 59
Q1: 81
Median: 105
Q3: 149
95th percentile: 305
Maximum: 1264
Range: 1238
Interquartile range (IQR): 68
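
The range and IQR are derived from the other quantiles: range = maximum - minimum and IQR = Q3 - Q1. The following is a minimal sketch using Python's standard `statistics` module. The helper name `quantile_summary` is hypothetical, and the `toy` list is not the real column, only the five-number summary reported above reused as illustrative input. The `"inclusive"` method interpolates linearly between order statistics, which is assumed here to be close to what the profiler does.

```python
import statistics


def quantile_summary(values):
    """Return min, quartiles, max, range and IQR for a list of numbers."""
    data = sorted(values)
    # n=4 yields the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }


# Illustrative input only: the reported minimum, Q1, median, Q3 and maximum.
toy = [26, 81, 105, 149, 1264]
summary = quantile_summary(toy)
print(summary)
```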

Descriptive statistics

Standard deviation: 95.073326
Coefficient of variation (CV): 0.71537286
Kurtosis: 27.509239
Mean: 132.90038
Median Absolute Deviation (MAD): 30
Skewness: 4.0222328
Sum: 348199
Variance: 9038.9374
Monotonicity: Not monotonic
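
Several of these figures are tied together by simple identities: mean = sum / n, variance = (standard deviation)², and CV = standard deviation / mean. The sketch below checks those identities against the values printed in this report. The variable names are illustrative, and the tolerance only allows for the rounding of the displayed figures.

```python
import math

# Values copied from this report.
n = 2620               # number of observations
total = 348199         # Sum
mean = 132.90038       # Mean
std = 95.073326        # Standard deviation
variance = 9038.9374   # Variance
cv = 0.71537286        # Coefficient of variation

# Each identity should hold up to the rounding of the displayed figures.
checks = {
    "mean == sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance == std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```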
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
79 | 37 | 1.4%
96 | 37 | 1.4%
83 | 36 | 1.4%
99 | 36 | 1.4%
90 | 35 | 1.3%
104 | 35 | 1.3%
81 | 34 | 1.3%
95 | 34 | 1.3%
73 | 32 | 1.2%
74 | 31 | 1.2%
Other values (349) | 2273 | 86.8%
Minimum 10 values
Value | Count | Frequency (%)
26 | 1 | < 0.1%
28 | 1 | < 0.1%
37 | 1 | < 0.1%
39 | 3 | 0.1%
40 | 4 | 0.2%
41 | 3 | 0.1%
42 | 1 | < 0.1%
43 | 2 | 0.1%
44 | 5 | 0.2%
45 | 2 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
1264 | 1 | < 0.1%
1237 | 1 | < 0.1%
1026 | 1 | < 0.1%
944 | 1 | < 0.1%
924 | 1 | < 0.1%
850 | 1 | < 0.1%
709 | 1 | < 0.1%
700 | 1 | < 0.1%
699 | 1 | < 0.1%
674 | 1 | < 0.1%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 날짜 | 제목 | 건수
0 | 2020-12-31 | 초대 공수처장 후보 김진욱 지명 | 220
1 | 2020-12-31 | 박범계 신임 법무부장관 내정 | 192
2 | 2020-12-31 | 코스피 사상 마감 최고 8만 전자 | 114
3 | 2020-12-31 | 김진욱 국민 공수처장 권한 후보 | 107
4 | 2020-12-31 | 카투사 한미 미군 한국인 백신 접종 | 105
5 | 2020-12-31 | 서울 동부구치소 코로나19 추가 확진 | 96
6 | 2020-12-31 | 여성단체연합, ‘박원순 피소’ 유출 연루 | 95
7 | 2020-12-31 | 영국발 변이 바이러스 2명 추가 확인 | 94
8 | 2020-12-31 | 이재용 국정농단 재판 징역 9년 구형 | 89
9 | 2020-12-31 | 김진욱 공수처장 초대 지명 출신 | 86
Last rows
Index | 날짜 | 제목 | 건수
2610 | 2020-01-01 | 검찰 조국 뇌물 혐의 불구속 기소 | 208
2611 | 2020-01-01 | 폼페이오 김정은 트럼프 약속 전략무기 | 148
2612 | 2020-01-01 | 검찰 뇌물 조국 혐의 불구속 기소 | 81
2613 | 2020-01-01 | 2020년 대통령 새해 열매 국민 성과 신년 | 80
2614 | 2020-01-01 | 송병기 선거 개입 영장 구속 | 73
2615 | 2020-01-01 | 날씨 첫날 해돋이 새해 한파 강추위 | 71
2616 | 2020-01-01 | 여야 새해 이해찬 총선 승리 집권 다짐 | 70
2617 | 2020-01-01 | 김정은 투쟁 결심 전원회의 | 69
2618 | 2020-01-01 | 소비자물가 상승 0.4% 최저 | 60
2619 | 2020-01-01 | 폼페이오 김정은 트럼프 약속 전략무기 | 59