Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 275 |
Missing cells | 2501 |
Missing cells (%) | 82.7% |
Duplicate rows | 5 |
Duplicate rows (%) | 1.8% |
Total size in memory | 25.4 KiB |
Average record size in memory | 94.5 B |
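As a consistency check, the missing-cell and duplicate percentages above can be recomputed from the per-column missing counts that this report lists for each variable. The following is a minimal sketch in plain Python; the dictionary is transcribed from the report and the variable names are illustrative only.

```python
# Per-column missing counts, transcribed from the per-variable sections of this report.
missing_per_column = {
    "BDCT_NO": 104, "NEWS_CGR_CD": 257, "NEWS_NO": 255, "BDCT_DATE": 255,
    "BDCT_TIME": 265, "NWS_SJ": 265, "NWS_CN": 0, "NWS_JRNL_NM": 275,
    "REG_DATE": 275, "MVP_CRS_NM": 275, "Unnamed: 10": 275,
}
n_rows = 275
n_cols = len(missing_per_column)

missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

# The report gives 5 duplicate rows out of 275 observations.
duplicate_pct = 100 * 5 / n_rows

print(missing_cells, total_cells, f"{missing_pct:.1f}%", f"{duplicate_pct:.1f}%")
```

The recomputed totals agree with the table: 2501 missing cells out of 3025, and a duplicate share that rounds to 1.8%.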
Variable types
Text | 4 |
---|---|
Numeric | 2 |
Categorical | 1 |
Unsupported | 4 |
Dataset
Description | Sample data |
---|---|
Author | MBN |
URL | https://kdx.kr/data/view/1013 |
Dataset has 5 (1.8%) duplicate rows | Duplicates |
NEWS_NO is highly overall correlated with NWS_CN | High correlation |
BDCT_TIME is highly overall correlated with NWS_CN | High correlation |
NWS_CN is highly overall correlated with NEWS_NO and 1 other fields | High correlation |
NWS_CN is highly imbalanced (87.1%) | Imbalance |
BDCT_NO has 104 (37.8%) missing values | Missing |
NEWS_CGR_CD has 257 (93.5%) missing values | Missing |
NEWS_NO has 255 (92.7%) missing values | Missing |
BDCT_DATE has 255 (92.7%) missing values | Missing |
BDCT_TIME has 265 (96.4%) missing values | Missing |
NWS_SJ has 265 (96.4%) missing values | Missing |
NWS_JRNL_NM has 275 (100.0%) missing values | Missing |
REG_DATE has 275 (100.0%) missing values | Missing |
MVP_CRS_NM has 275 (100.0%) missing values | Missing |
Unnamed: 10 has 275 (100.0%) missing values | Missing |
NWS_JRNL_NM is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
REG_DATE is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
MVP_CRS_NM is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
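The 87.1% figure in the imbalance alert for NWS_CN is consistent with a score of one minus the normalized Shannon entropy of the value counts. The sketch below assumes that definition (it is not taken from this report) and uses the four NWS_CN counts listed under Common Values; the function name is hypothetical.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)

# NWS_CN value counts from this report: <NA>, the anchor-comment marker, and two singletons.
nws_cn_counts = [265, 8, 1, 1]
score = imbalance_score(nws_cn_counts)
print(f"{score:.1%}")
```

Under this assumption the score comes out at roughly 0.87, matching the alert.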
Reproduction
Analysis started | 2023-12-11 21:13:41.404029 |
---|---|
Analysis finished | 2023-12-11 21:13:43.903691 |
Duration | 2.5 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
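The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction table.
started = datetime.fromisoformat("2023-12-11 21:13:41.404029")
finished = datetime.fromisoformat("2023-12-11 21:13:43.903691")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```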
BDCT_NO
Text
MISSING
 
Distinct | 160 |
---|---|
Distinct (%) | 93.6% |
Missing | 104 |
Missing (%) | 37.8% |
Memory size | 2.3 KiB |
Length
Max length | 141 |
---|---|
Median length | 70 |
Mean length | 43.263158 |
Min length | 6 |
Characters and Unicode
Total characters | 7398 |
---|---|
Distinct characters | 519 |
Distinct categories | 10 |
Distinct scripts | 4 |
Distinct blocks | 6 |
Unique
Unique | 155 |
---|---|
Unique (%) | 90.6% |
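The percentages and the mean length for BDCT_NO are all relative to the 171 non-missing values rather than to the 275 rows. The sketch below recomputes them from the figures printed above; the variable names are illustrative.

```python
n_rows = 275
n_missing = 104
n_present = n_rows - n_missing   # 171 non-missing values

total_characters = 7398
n_distinct = 160
n_unique = 155

mean_length = total_characters / n_present
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

print(f"{mean_length:.6f}", f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%")
```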
Sample
1st row | 1201343 |
---|---|
2nd row | 60년 만에 찾아온 황금돼지의 해가 밝았습니다. 오늘 전국 곳곳의 해돋이 명소에는 새해 첫 일출을 보며 소원을 비는 인파로 인산인해를 이뤘습니다. |
3rd row | ▶ 김정은 신년사 "완전한 비핵화 확고" |
4th row | 북한 김정은 위원장이 신년사를 통해 "완전한 비핵화는 불변한 입장"이라며, 한반도 비핵화 의지를 재확인했습니다. 또한 트럼프 대통령과 언제든 다시 만날 준비가 되어 있지만, 북한의 인내심을 오판하면 새로운 길을 모색하지 않을 수 없다고 밝혔습니다. |
5th row | ▶ 육성으로 첫 언급…"평화 의지 환영" |
Value | Count | Frequency (%) |
(blank) | 56 | 3.4% |
김정은 | 24 | 1.4% |
▶ | 22 | 1.3% |
인터뷰 | 16 | 1.0% |
북한 | 12 | 0.7% |
기자 | 11 | 0.7% |
위원장의 | 10 | 0.6% |
김 | 9 | 0.5% |
전 | 9 | 0.5% |
문재인 | 8 | 0.5% |
Other values (1115) | 1493 | 89.4% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1745 | 23.6% |
다 | 154 | 2.1% |
. | 149 | 2.0% |
이 | 134 | 1.8% |
니 | 120 | 1.6% |
지 | 97 | 1.3% |
는 | 96 | 1.3% |
에 | 92 | 1.2% |
한 | 89 | 1.2% |
가 | 80 | 1.1% |
Other values (509) | 4642 | 62.7% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 4964 | 67.1% |
Space Separator | 1745 | 23.6% |
Other Punctuation | 351 | 4.7% |
Decimal Number | 202 | 2.7% |
Uppercase Letter | 39 | 0.5% |
Lowercase Letter | 30 | 0.4% |
Other Symbol | 22 | 0.3% |
Dash Punctuation | 17 | 0.2% |
Open Punctuation | 14 | 0.2% |
Close Punctuation | 14 | 0.2% |
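The category names in this table are Unicode general categories. As a rough illustration, the sketch below counts them for one sample value of BDCT_NO using Python's standard `unicodedata` module. This is an assumption about how such counts can be obtained, not a statement of how the profiling tool computes them; the name mapping and the helper function are hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes that occur in this table.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator", "Po": "Other Punctuation",
    "Pd": "Dash Punctuation", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "So": "Other Symbol",
}

def category_counts(text):
    """Count characters per Unicode general category, using long names where known."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

# Third sample row of BDCT_NO, with ASCII double quotes.
sample = '▶ 김정은 신년사 "완전한 비핵화 확고"'
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter and the ▶ marker under Other Symbol, which is why those two categories appear in a column of Korean news text.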
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
다 | 154 | 3.1% |
이 | 134 | 2.7% |
니 | 120 | 2.4% |
지 | 97 | 2.0% |
는 | 96 | 1.9% |
에 | 92 | 1.9% |
한 | 89 | 1.8% |
가 | 80 | 1.6% |
의 | 80 | 1.6% |
습 | 77 | 1.6% |
Other values (456) | 3945 | 79.5% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 4 | 13.3% |
o | 4 | 13.3% |
m | 3 | 10.0% |
t | 2 | 6.7% |
c | 2 | 6.7% |
r | 2 | 6.7% |
e | 2 | 6.7% |
v | 2 | 6.7% |
n | 2 | 6.7% |
z | 1 | 3.3% |
Other values (6) | 6 | 20.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 149 | 42.5% |
, | 57 | 16.2% |
" | 50 | 14.2% |
: | 25 | 7.1% |
' | 24 | 6.8% |
/ | 14 | 4.0% |
% | 12 | 3.4% |
· | 8 | 2.3% |
… | 7 | 2.0% |
? | 3 | 0.9% |
Decimal Number
Value | Count | Frequency (%) |
1 | 47 | 23.3% |
0 | 43 | 21.3% |
2 | 29 | 14.4% |
4 | 20 | 9.9% |
3 | 20 | 9.9% |
5 | 17 | 8.4% |
9 | 9 | 4.5% |
6 | 6 | 3.0% |
7 | 6 | 3.0% |
8 | 5 | 2.5% |
Uppercase Letter
Value | Count | Frequency (%) |
N | 11 | 28.2% |
B | 11 | 28.2% |
M | 11 | 28.2% |
T | 2 | 5.1% |
V | 2 | 5.1% |
I | 1 | 2.6% |
Q | 1 | 2.6% |
Open Punctuation
Value | Count | Frequency (%) |
【 | 8 | 57.1% |
( | 4 | 28.6% |
[ | 2 | 14.3% |
Close Punctuation
Value | Count | Frequency (%) |
】 | 8 | 57.1% |
) | 4 | 28.6% |
] | 2 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1745 | 100.0% |
Other Symbol
Value | Count | Frequency (%) |
▶ | 22 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 17 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 4962 | 67.1% |
Common | 2365 | 32.0% |
Latin | 69 | 0.9% |
Han | 2 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
다 | 154 | 3.1% |
이 | 134 | 2.7% |
니 | 120 | 2.4% |
지 | 97 | 2.0% |
는 | 96 | 1.9% |
에 | 92 | 1.9% |
한 | 89 | 1.8% |
가 | 80 | 1.6% |
의 | 80 | 1.6% |
습 | 77 | 1.6% |
Other values (454) | 3943 | 79.5% |
Common
Value | Count | Frequency (%) |
(space) | 1745 | 73.8% |
. | 149 | 6.3% |
, | 57 | 2.4% |
" | 50 | 2.1% |
1 | 47 | 2.0% |
0 | 43 | 1.8% |
2 | 29 | 1.2% |
: | 25 | 1.1% |
' | 24 | 1.0% |
▶ | 22 | 0.9% |
Other values (20) | 174 | 7.4% |
Latin
Value | Count | Frequency (%) |
N | 11 | 15.9% |
B | 11 | 15.9% |
M | 11 | 15.9% |
a | 4 | 5.8% |
o | 4 | 5.8% |
m | 3 | 4.3% |
T | 2 | 2.9% |
V | 2 | 2.9% |
t | 2 | 2.9% |
c | 2 | 2.9% |
Other values (13) | 17 | 24.6% |
Han
Value | Count | Frequency (%) |
己 | 1 | 50.0% |
亥 | 1 | 50.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 4962 | 67.1% |
ASCII | 2381 | 32.2% |
None | 24 | 0.3% |
Geometric Shapes | 22 | 0.3% |
Punctuation | 7 | 0.1% |
CJK | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1745 | 73.3% |
. | 149 | 6.3% |
, | 57 | 2.4% |
" | 50 | 2.1% |
1 | 47 | 2.0% |
0 | 43 | 1.8% |
2 | 29 | 1.2% |
: | 25 | 1.0% |
' | 24 | 1.0% |
4 | 20 | 0.8% |
Other values (38) | 192 | 8.1% |
Hangul
Value | Count | Frequency (%) |
다 | 154 | 3.1% |
이 | 134 | 2.7% |
니 | 120 | 2.4% |
지 | 97 | 2.0% |
는 | 96 | 1.9% |
에 | 92 | 1.9% |
한 | 89 | 1.8% |
가 | 80 | 1.6% |
의 | 80 | 1.6% |
습 | 77 | 1.6% |
Other values (454) | 3943 | 79.5% |
Geometric Shapes
Value | Count | Frequency (%) |
▶ | 22 | 100.0% |
None
Value | Count | Frequency (%) |
· | 8 | 33.3% |
【 | 8 | 33.3% |
】 | 8 | 33.3% |
Punctuation
Value | Count | Frequency (%) |
… | 7 | 100.0% |
CJK
Value | Count | Frequency (%) |
己 | 1 | 50.0% |
亥 | 1 | 50.0% |
NEWS_CGR_CD
Text
MISSING
 
Distinct | 11 |
---|---|
Distinct (%) | 61.1% |
Missing | 257 |
Missing (%) | 93.5% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
mbn00006 | 7 | 38.9% |
mbn00009 | 2 | 11.1% |
mbn00007 | 1 | 5.6% |
이동훈 | 1 | 5.6% |
김근희 | 1 | 5.6% |
주진희 | 1 | 5.6% |
이상주 | 1 | 5.6% |
김경기 | 1 | 5.6% |
최중락 | 1 | 5.6% |
이정호 | 1 | 5.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 40 | 38.5% |
m | 10 | 9.6% |
n | 10 | 9.6% |
b | 10 | 9.6% |
6 | 7 | 6.7% |
이 | 3 | 2.9% |
김 | 2 | 1.9% |
희 | 2 | 1.9% |
주 | 2 | 1.9% |
9 | 2 | 1.9% |
Other values (16) | 16 | 15.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 50 | 48.1% |
Lowercase Letter | 30 | 28.8% |
Other Letter | 24 | 23.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
이 | 3 | 12.5% |
김 | 2 | 8.3% |
희 | 2 | 8.3% |
주 | 2 | 8.3% |
훈 | 1 | 4.2% |
최 | 1 | 4.2% |
태 | 1 | 4.2% |
오 | 1 | 4.2% |
호 | 1 | 4.2% |
정 | 1 | 4.2% |
Other values (9) | 9 | 37.5% |
Decimal Number
Value | Count | Frequency (%) |
0 | 40 | 80.0% |
6 | 7 | 14.0% |
9 | 2 | 4.0% |
7 | 1 | 2.0% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 10 | 33.3% |
n | 10 | 33.3% |
b | 10 | 33.3% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 50 | 48.1% |
Latin | 30 | 28.8% |
Hangul | 24 | 23.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
이 | 3 | 12.5% |
김 | 2 | 8.3% |
희 | 2 | 8.3% |
주 | 2 | 8.3% |
훈 | 1 | 4.2% |
최 | 1 | 4.2% |
태 | 1 | 4.2% |
오 | 1 | 4.2% |
호 | 1 | 4.2% |
정 | 1 | 4.2% |
Other values (9) | 9 | 37.5% |
Common
Value | Count | Frequency (%) |
0 | 40 | 80.0% |
6 | 7 | 14.0% |
9 | 2 | 4.0% |
7 | 1 | 2.0% |
Latin
Value | Count | Frequency (%) |
m | 10 | 33.3% |
n | 10 | 33.3% |
b | 10 | 33.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 80 | 76.9% |
Hangul | 24 | 23.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 40 | 50.0% |
m | 10 | 12.5% |
n | 10 | 12.5% |
b | 10 | 12.5% |
6 | 7 | 8.8% |
9 | 2 | 2.5% |
7 | 1 | 1.2% |
Hangul
Value | Count | Frequency (%) |
이 | 3 | 12.5% |
김 | 2 | 8.3% |
희 | 2 | 8.3% |
주 | 2 | 8.3% |
훈 | 1 | 4.2% |
최 | 1 | 4.2% |
태 | 1 | 4.2% |
오 | 1 | 4.2% |
호 | 1 | 4.2% |
정 | 1 | 4.2% |
Other values (9) | 9 | 37.5% |
NEWS_NO
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 11 |
---|---|
Distinct (%) | 55.0% |
Missing | 255 |
Missing (%) | 92.7% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 11957298 |
Minimum | 3724486 |
---|---|
Maximum | 20190101 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 3724486 |
---|---|
5-th percentile | 3724487 |
Q1 | 3724493.5 |
median | 11957303 |
Q3 | 20190101 |
95-th percentile | 20190101 |
Maximum | 20190101 |
Range | 16465615 |
Interquartile range (IQR) | 16465608 |
Descriptive statistics
Standard deviation | 8446677.8 |
---|---|
Coefficient of variation (CV) | 0.70640356 |
Kurtosis | -2.2352941 |
Mean | 11957298 |
Median Absolute Deviation (MAD) | 8232798 |
Skewness | -1.0298138 × 10^-12 |
Sum | 2.3914596 × 10^8 |
Variance | 7.1346365 × 10^13 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
20190101 | 10 | 3.6% |
3724486 | 1 | 0.4% |
3724504 | 1 | 0.4% |
3724497 | 1 | 0.4% |
3724492 | 1 | 0.4% |
3724502 | 1 | 0.4% |
3724487 | 1 | 0.4% |
3724494 | 1 | 0.4% |
3724505 | 1 | 0.4% |
3724490 | 1 | 0.4% |
(Missing) | 255 | 92.7% |
Value | Count | Frequency (%) |
3724486 | 1 | 0.4% |
3724487 | 1 | 0.4% |
3724490 | 1 | 0.4% |
3724491 | 1 | 0.4% |
3724492 | 1 | 0.4% |
3724494 | 1 | 0.4% |
3724497 | 1 | 0.4% |
3724502 | 1 | 0.4% |
3724504 | 1 | 0.4% |
3724505 | 1 | 0.4% |
Value | Count | Frequency (%) |
20190101 | 10 | 3.6% |
3724505 | 1 | 0.4% |
3724504 | 1 | 0.4% |
3724502 | 1 | 0.4% |
3724497 | 1 | 0.4% |
3724494 | 1 | 0.4% |
3724492 | 1 | 0.4% |
3724491 | 1 | 0.4% |
3724490 | 1 | 0.4% |
3724487 | 1 | 0.4% |
BDCT_DATE
Text
MISSING
 
Distinct | 11 |
---|---|
Distinct (%) | 55.0% |
Missing | 255 |
Missing (%) | 92.7% |
Memory size | 2.3 KiB |
Length
Max length | 82 |
---|---|
Median length | 45 |
Mean length | 45 |
Min length | 8 |
Characters and Unicode
Total characters | 900 |
---|---|
Distinct characters | 37 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10 |
---|---|
Unique (%) | 50.0% |
Sample
1st row | 20190101 |
---|---|
2nd row | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1201343 |
3rd row | 20190101 |
4th row | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1201344 |
5th row | 20190101 |
Value | Count | Frequency (%) |
20190101 | 10 | 50.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201343 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201344 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201345 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201346 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201347 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201348 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201349 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201350 | 1 | 5.0% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1201351 | 1 | 5.0% |
Most occurring characters
Value | Count | Frequency (%) |
t | 80 | 8.9% |
n | 80 | 8.9% |
0 | 51 | 5.7% |
1 | 51 | 5.7% |
e | 50 | 5.6% |
o | 50 | 5.6% |
c | 50 | 5.6% |
. | 40 | 4.4% |
/ | 40 | 4.4% |
2 | 31 | 3.4% |
Other values (27) | 377 | 41.9% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 560 | 62.2% |
Decimal Number | 170 | 18.9% |
Other Punctuation | 110 | 12.2% |
Connector Punctuation | 30 | 3.3% |
Math Symbol | 20 | 2.2% |
Uppercase Letter | 10 | 1.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 80 | 14.3% |
n | 80 | 14.3% |
e | 50 | 8.9% |
o | 50 | 8.9% |
c | 50 | 8.9% |
w | 30 | 5.4% |
m | 30 | 5.4% |
s | 20 | 3.6% |
i | 20 | 3.6% |
d | 20 | 3.6% |
Other values (9) | 130 | 23.2% |
Decimal Number
Value | Count | Frequency (%) |
0 | 51 | 30.0% |
1 | 51 | 30.0% |
2 | 31 | 18.2% |
9 | 11 | 6.5% |
3 | 11 | 6.5% |
4 | 8 | 4.7% |
5 | 4 | 2.4% |
6 | 1 | 0.6% |
7 | 1 | 0.6% |
8 | 1 | 0.6% |
Other Punctuation
Value | Count | Frequency (%) |
. | 40 | 36.4% |
/ | 40 | 36.4% |
: | 10 | 9.1% |
? | 10 | 9.1% |
& | 10 | 9.1% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 30 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
= | 20 | 100.0% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 10 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 570 | 63.3% |
Common | 330 | 36.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
t | 80 | 14.0% |
n | 80 | 14.0% |
e | 50 | 8.8% |
o | 50 | 8.8% |
c | 50 | 8.8% |
w | 30 | 5.3% |
m | 30 | 5.3% |
s | 20 | 3.5% |
i | 20 | 3.5% |
d | 20 | 3.5% |
Other values (10) | 140 | 24.6% |
Common
Value | Count | Frequency (%) |
0 | 51 | 15.5% |
1 | 51 | 15.5% |
. | 40 | 12.1% |
/ | 40 | 12.1% |
2 | 31 | 9.4% |
_ | 30 | 9.1% |
= | 20 | 6.1% |
9 | 11 | 3.3% |
3 | 11 | 3.3% |
: | 10 | 3.0% |
Other values (7) | 35 | 10.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 900 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
t | 80 | 8.9% |
n | 80 | 8.9% |
0 | 51 | 5.7% |
1 | 51 | 5.7% |
e | 50 | 5.6% |
o | 50 | 5.6% |
c | 50 | 5.6% |
. | 40 | 4.4% |
/ | 40 | 4.4% |
2 | 31 | 3.4% |
Other values (27) | 377 | 41.9% |
BDCT_TIME
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 265 |
Missing (%) | 96.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1937.8 |
Minimum | 1930 |
---|---|
Maximum | 1944 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 1930 |
---|---|
5-th percentile | 1931.35 |
Q1 | 1935.25 |
median | 1938 |
Q3 | 1940.75 |
95-th percentile | 1943.55 |
Maximum | 1944 |
Range | 14 |
Interquartile range (IQR) | 5.5 |
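The quantile figures for BDCT_TIME are consistent with linear interpolation between order statistics, which is the default in NumPy and pandas. The sketch below reimplements that rule in plain Python on the ten non-missing values listed in this section; the helper `quantile` is hypothetical and is not part of any library.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile: interpolate at position q * (n - 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# The ten non-missing BDCT_TIME values, in ascending order.
bdct_time = [1930, 1933, 1935, 1936, 1937, 1939, 1940, 1941, 1943, 1944]

p05 = quantile(bdct_time, 0.05)
q1 = quantile(bdct_time, 0.25)
median = quantile(bdct_time, 0.50)
q3 = quantile(bdct_time, 0.75)
p95 = quantile(bdct_time, 0.95)
iqr = q3 - q1
value_range = bdct_time[-1] - bdct_time[0]

print(p05, q1, median, q3, p95, iqr, value_range)
```

With only ten values the 5th and 95th percentiles land between the two outermost observations, which explains the fractional results for an integer-valued column.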
Descriptive statistics
Standard deviation | 4.4422217 |
---|---|
Coefficient of variation (CV) | 0.0022924046 |
Kurtosis | -0.61551869 |
Mean | 1937.8 |
Median Absolute Deviation (MAD) | 3 |
Skewness | -0.30002336 |
Sum | 19378 |
Variance | 19.733333 |
Monotonicity | Strictly increasing |
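Most of the descriptive statistics for BDCT_TIME can be reproduced with Python's `statistics` module, assuming the variance is the sample variance (divisor n - 1) and the MAD is the median of absolute deviations from the median. Both assumptions are inferred from the printed values rather than stated in this report.

```python
import statistics

# The ten non-missing BDCT_TIME values listed in this section.
bdct_time = [1930, 1933, 1935, 1936, 1937, 1939, 1940, 1941, 1943, 1944]

mean = statistics.mean(bdct_time)
variance = statistics.variance(bdct_time)   # sample variance, divisor n - 1
std = statistics.stdev(bdct_time)           # square root of the sample variance
cv = std / mean                             # coefficient of variation

med = statistics.median(bdct_time)
mad = statistics.median([abs(x - med) for x in bdct_time])

print(sum(bdct_time), mean, round(variance, 6), round(std, 7), round(cv, 10), mad)
```

Skewness and kurtosis are left out because the standard library does not provide them.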
Value | Count | Frequency (%) |
1930 | 1 | 0.4% |
1933 | 1 | 0.4% |
1935 | 1 | 0.4% |
1936 | 1 | 0.4% |
1937 | 1 | 0.4% |
1939 | 1 | 0.4% |
1940 | 1 | 0.4% |
1941 | 1 | 0.4% |
1943 | 1 | 0.4% |
1944 | 1 | 0.4% |
(Missing) | 265 | 96.4% |
Value | Count | Frequency (%) |
1930 | 1 | 0.4% |
1933 | 1 | 0.4% |
1935 | 1 | 0.4% |
1936 | 1 | 0.4% |
1937 | 1 | 0.4% |
1939 | 1 | 0.4% |
1940 | 1 | 0.4% |
1941 | 1 | 0.4% |
1943 | 1 | 0.4% |
1944 | 1 | 0.4% |
Value | Count | Frequency (%) |
1944 | 1 | 0.4% |
1943 | 1 | 0.4% |
1941 | 1 | 0.4% |
1940 | 1 | 0.4% |
1939 | 1 | 0.4% |
1937 | 1 | 0.4% |
1936 | 1 | 0.4% |
1935 | 1 | 0.4% |
1933 | 1 | 0.4% |
1930 | 1 | 0.4% |
NWS_SJ
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 265 |
Missing (%) | 96.4% |
Memory size | 2.3 KiB |
Length
Max length | 41 |
---|---|
Median length | 30 |
Mean length | 28.5 |
Min length | 18 |
Characters and Unicode
Total characters | 285 |
---|---|
Distinct characters | 132 |
Distinct categories | 8 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 김주하 앵커가 전하는 1월 1일 뉴스8 주요뉴스 |
---|---|
2nd row | [신년영상] 2019 황금돼지 해 밝았다 |
3rd row | 60년 만에 돌아온 황금돼지의 해 |
4th row | 김정은, '완전한 비핵화' 첫 언급…"약속 안 지키면 새 길” |
5th row | 김정은, 직접 개성공단 언급했지만…사실상 남측에 숙제 |
Value | Count | Frequency (%) |
김정은 | 4 | 6.0% |
대통령 | 2 | 3.0% |
첫 | 2 | 3.0% |
mbn | 2 | 3.0% |
여론조사 | 2 | 3.0% |
해 | 2 | 3.0% |
긍정적"…문 | 1 | 1.5% |
해결에 | 1 | 1.5% |
문제 | 1 | 1.5% |
김주하 | 1 | 1.5% |
Other values (49) | 49 | 73.1% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 57 | 20.0% |
지 | 8 | 2.8% |
… | 6 | 2.1% |
정 | 6 | 2.1% |
김 | 5 | 1.8% |
, | 5 | 1.8% |
은 | 4 | 1.4% |
해 | 4 | 1.4% |
대 | 4 | 1.4% |
에 | 4 | 1.4% |
Other values (122) | 182 | 63.9% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 182 | 63.9% |
Space Separator | 57 | 20.0% |
Other Punctuation | 19 | 6.7% |
Decimal Number | 14 | 4.9% |
Uppercase Letter | 6 | 2.1% |
Open Punctuation | 3 | 1.1% |
Close Punctuation | 3 | 1.1% |
Final Punctuation | 1 | 0.4% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
지 | 8 | 4.4% |
정 | 6 | 3.3% |
김 | 5 | 2.7% |
은 | 4 | 2.2% |
해 | 4 | 2.2% |
대 | 4 | 2.2% |
에 | 4 | 2.2% |
사 | 4 | 2.2% |
년 | 3 | 1.6% |
황 | 3 | 1.6% |
Other values (101) | 137 | 75.3% |
Decimal Number
Value | Count | Frequency (%) |
0 | 3 | 21.4% |
1 | 3 | 21.4% |
5 | 2 | 14.3% |
4 | 2 | 14.3% |
6 | 1 | 7.1% |
9 | 1 | 7.1% |
2 | 1 | 7.1% |
8 | 1 | 7.1% |
Other Punctuation
Value | Count | Frequency (%) |
… | 6 | 31.6% |
, | 5 | 26.3% |
" | 3 | 15.8% |
' | 2 | 10.5% |
% | 2 | 10.5% |
. | 1 | 5.3% |
Uppercase Letter
Value | Count | Frequency (%) |
M | 2 | 33.3% |
B | 2 | 33.3% |
N | 2 | 33.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 57 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
[ | 3 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
] | 3 | 100.0% |
Final Punctuation
Value | Count | Frequency (%) |
” | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 182 | 63.9% |
Common | 97 | 34.0% |
Latin | 6 | 2.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
지 | 8 | 4.4% |
정 | 6 | 3.3% |
김 | 5 | 2.7% |
은 | 4 | 2.2% |
해 | 4 | 2.2% |
대 | 4 | 2.2% |
에 | 4 | 2.2% |
사 | 4 | 2.2% |
년 | 3 | 1.6% |
황 | 3 | 1.6% |
Other values (101) | 137 | 75.3% |
Common
Value | Count | Frequency (%) |
(space) | 57 | 58.8% |
… | 6 | 6.2% |
, | 5 | 5.2% |
[ | 3 | 3.1% |
] | 3 | 3.1% |
0 | 3 | 3.1% |
" | 3 | 3.1% |
1 | 3 | 3.1% |
' | 2 | 2.1% |
5 | 2 | 2.1% |
Other values (8) | 10 | 10.3% |
Latin
Value | Count | Frequency (%) |
M | 2 | 33.3% |
B | 2 | 33.3% |
N | 2 | 33.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 182 | 63.9% |
ASCII | 96 | 33.7% |
Punctuation | 7 | 2.5% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 57 | 59.4% |
, | 5 | 5.2% |
[ | 3 | 3.1% |
] | 3 | 3.1% |
0 | 3 | 3.1% |
" | 3 | 3.1% |
1 | 3 | 3.1% |
M | 2 | 2.1% |
B | 2 | 2.1% |
N | 2 | 2.1% |
Other values (9) | 13 | 13.5% |
Hangul
Value | Count | Frequency (%) |
지 | 8 | 4.4% |
정 | 6 | 3.3% |
김 | 5 | 2.7% |
은 | 4 | 2.2% |
해 | 4 | 2.2% |
대 | 4 | 2.2% |
에 | 4 | 2.2% |
사 | 4 | 2.2% |
년 | 3 | 1.6% |
황 | 3 | 1.6% |
Other values (101) | 137 | 75.3% |
Punctuation
Value | Count | Frequency (%) |
… | 6 | 85.7% |
” | 1 | 14.3% |
NWS_CN
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 1.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
<NA> | 265 |
---|---|
【 앵커멘트 】 | 8 |
▶ '황금돼지 해' 밝아…해맞이 인파 북적 | 1 |
2019 기해년, 황금돼지 해가 밝았습니다. | 1 |
Length
Max length | 24 |
---|---|
Median length | 4 |
Mean length | 4.2581818 |
Min length | 4 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.7% |
Sample
1st row | <NA> |
---|---|
2nd row | ▶ '황금돼지 해' 밝아…해맞이 인파 북적 |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 265 | 96.4% |
【 앵커멘트 】 | 8 | 2.9% |
▶ '황금돼지 해' 밝아…해맞이 인파 북적 | 1 | 0.4% |
2019 기해년, 황금돼지 해가 밝았습니다. | 1 | 0.4% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 265 | 88.3% |
【 | 8 | 2.7% |
앵커멘트 | 8 | 2.7% |
】 | 8 | 2.7% |
황금돼지 | 2 | 0.7% |
▶ | 1 | 0.3% |
해 | 1 | 0.3% |
밝아…해맞이 | 1 | 0.3% |
인파 | 1 | 0.3% |
북적 | 1 | 0.3% |
Other values (4) | 4 | 1.3% |
NWS_JRNL_NM
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 275 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.5 KiB |
REG_DATE
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 275 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.5 KiB |
MVP_CRS_NM
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 275 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.5 KiB |
Unnamed: 10
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 275 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.5 KiB |
Variable | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN |
---|---|---|---|---|---|---|
NEWS_CGR_CD | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.898 |
NEWS_NO | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN |
BDCT_DATE | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN |
BDCT_TIME | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 |
NWS_SJ | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 |
NWS_CN | 0.898 | NaN | NaN | 1.000 | 1.000 | 1.000 |
Variable | NEWS_NO | BDCT_TIME | NWS_CN |
---|---|---|---|
NEWS_NO | 1.000 | -0.018 | 1.000 |
BDCT_TIME | -0.018 | 1.000 | 0.655 |
NWS_CN | 1.000 | 0.655 | 1.000 |
Row | BDCT_NO | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | NWS_JRNL_NM | REG_DATE | MVP_CRS_NM | Unnamed: 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
1 | 1201343 | mbn00009 | 3724486 | 20190101 | 1930 | 김주하 앵커가 전하는 1월 1일 뉴스8 주요뉴스 | ▶ '황금돼지 해' 밝아…해맞이 인파 북적 | <NA> | <NA> | <NA> | <NA> |
2 | 60년 만에 찾아온 황금돼지의 해가 밝았습니다. 오늘 전국 곳곳의 해돋이 명소에는 새해 첫 일출을 보며 소원을 비는 인파로 인산인해를 이뤘습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
3 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
4 | ▶ 김정은 신년사 "완전한 비핵화 확고" | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
5 | 북한 김정은 위원장이 신년사를 통해 "완전한 비핵화는 불변한 입장"이라며, 한반도 비핵화 의지를 재확인했습니다. 또한 트럼프 대통령과 언제든 다시 만날 준비가 되어 있지만, 북한의 인내심을 오판하면 새로운 길을 모색하지 않을 수 없다고 밝혔습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
7 | ▶ 육성으로 첫 언급…"평화 의지 환영" | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
8 | 김정은 위원장이 '완전한 비핵화'를 북한 주민들에게 육성으로 언급한 건 이번이 처음입니다. 우리 정부는 김정은 위원장의 신년사에 환영의 뜻을 밝혔습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Row | BDCT_NO | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | NWS_JRNL_NM | REG_DATE | MVP_CRS_NM | Unnamed: 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
265 | 그렇다면, 이 총리와 황 전 총리가 양자대결을 한다면 어떻게 될까. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
266 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
267 | 여론조사 결과, 이 총리가 40.4%를 기록하며, 24.5%를 기록한 황 전 총리에 크게 앞서는 것으로 나타났습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
268 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
269 | 하지만, 진보후보와 보수후보 가운데 차기 대선 지지후보가 없다고 응답한 비율이 30%가 넘는다는 점에서, 시민들은 좀 더 상황을 지켜보겠다는 신중한 자세를 보이고 있습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
270 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
271 | MBN뉴스 오태윤입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
272 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
273 | 영상편집 : 이유진 | 오태윤 | 20190101 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1201352 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
274 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Most frequently occurring
Row | BDCT_NO | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | # duplicates |
---|---|---|---|---|---|---|---|---|
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 104 |
3 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 7 |
0 | ▶ 인터뷰 : 김정은 / 북한 국무위원장 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3 |
1 | ▶ 인터뷰 : 문재인 대통령 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |
2 | ▶ 인터뷰 : 신범철 / 아산정책연구원 안보통일센터장 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |
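The table above lists five distinct row patterns that each occur more than once, which matches the "Duplicate rows | 5" figure in the dataset statistics. In other words, the figure appears to count distinct repeated patterns rather than surplus rows. The sketch below illustrates that way of counting on a hypothetical six-row, two-column miniature of the data, with `None` standing in for `<NA>`.

```python
from collections import Counter

# Hypothetical miniature of the dataset: each tuple is one row, None marks a missing cell.
rows = [
    (None, None),
    (None, None),
    (None, None),
    ("【 기자 】", None),
    ("【 기자 】", None),
    ("MBN뉴스 오태윤입니다.", None),
]

counts = Counter(rows)
# Keep only the row patterns that occur more than once.
duplicated = {row: n for row, n in counts.items() if n > 1}

print(len(duplicated))
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
```

In this miniature there are two duplicated patterns even though three rows are redundant copies, so the two ways of counting give different numbers.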