Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 226 |
Missing cells | 1632 |
Missing cells (%) | 65.6% |
Duplicate rows | 2 |
Duplicate rows (%) | 0.9% |
Total size in memory | 20.9 KiB |
Average record size in memory | 94.6 B |
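The percentages in this table follow directly from the raw counts. A minimal Python sketch of that arithmetic, using only the figures reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 226
missing_cells = 1632
duplicate_rows = 2

# Total number of cells is rows x columns.
total_cells = n_variables * n_observations

# Shares expressed as percentages.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Rounded to one decimal place, both shares agree with the table.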
Variable types
Text | 4 |
---|---|
Unsupported | 3 |
Categorical | 3 |
Numeric | 1 |
Dataset
Description | Sample data |
---|---|
Author | MBN |
URL | https://kdx.kr/data/view/167 |
Dataset has 2 (0.9%) duplicate rows | Duplicates |
REG_DATE is highly overall correlated with BDCT_TIME and 1 other field | High correlation |
NWS_CN is highly overall correlated with REG_DATE | High correlation |
BDCT_TIME is highly overall correlated with REG_DATE | High correlation |
NEWS_NO is highly imbalanced (77.9%) | Imbalance |
NWS_CN is highly imbalanced (86.2%) | Imbalance |
REG_DATE is highly imbalanced (92.7%) | Imbalance |
BDCT_NO has 90 (39.8%) missing values | Missing |
NEWS_CGR_CD has 226 (100.0%) missing values | Missing |
BDCT_DATE has 208 (92.0%) missing values | Missing |
BDCT_TIME has 216 (95.6%) missing values | Missing |
NWS_SJ has 216 (95.6%) missing values | Missing |
NWS_JRNL_NM has 226 (100.0%) missing values | Missing |
MVP_CRS_NM has 224 (99.1%) missing values | Missing |
Unnamed: 10 has 226 (100.0%) missing values | Missing |
NEWS_CGR_CD is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
NWS_JRNL_NM is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
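The imbalance percentages in the alerts above are consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. The report does not state the formula, so the sketch below is an assumption checked against the reported numbers, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption made for this sketch: a single class scores 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts taken from the "Common Values" tables of this report.
news_no = [218, 8]            # <NA>, 20120407
nws_cn = [216, 7, 1, 1, 1]    # <NA>, anchor intro, and three single values
reg_date = [224, 2]           # <NA>, 20120407

for name, counts in [("NEWS_NO", news_no), ("NWS_CN", nws_cn), ("REG_DATE", reg_date)]:
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

Under this assumption the scores come out close to the reported 77.9%, 86.2% and 92.7%. A score of 0 would mean perfectly balanced classes, and values near 100% mean a single class dominates.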
Reproduction
Analysis started | 2023-12-11 22:28:29.508974 |
---|---|
Analysis finished | 2023-12-11 22:28:31.783223 |
Duration | 2.27 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
BDCT_NO
Text
MISSING
 
Distinct | 130 |
---|---|
Distinct (%) | 95.6% |
Missing | 90 |
Missing (%) | 39.8% |
Memory size | 1.9 KiB |
Length
Max length | 109 |
---|---|
Median length | 68 |
Mean length | 42.558824 |
Min length | 1 |
Characters and Unicode
Total characters | 5788 |
---|---|
Distinct characters | 537 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 6 |
Unique
Unique | 129 |
---|---|
Unique (%) | 94.9% |
Sample
1st row | 1023827 |
---|---|
2nd row | 1023828 |
3rd row | 1023829 |
4th row | 바람이 불어 다소 쌀쌀했지만 화창한 날씨 덕분에 많은 시민들이 개나리가 활짝 핀 산을 찾는 등, 휴일을 즐겼는데요, |
5th row | 다양했던 시민들 표정 김순철 기자가 담았습니다. |
Value | Count | Frequency (%) |
(blank) | 46 | 3.5% |
▶ | 12 | 0.9% |
기자 | 9 | 0.7% |
인터뷰 | 9 | 0.7% |
있습니다 | 9 | 0.7% |
】 | 7 | 0.5% |
【 | 7 | 0.5% |
mbn뉴스 | 6 | 0.5% |
북한이 | 5 | 0.4% |
불법 | 5 | 0.4% |
Other values (992) | 1216 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1383 | 23.9% |
다 | 127 | 2.2% |
이 | 125 | 2.2% |
. | 119 | 2.1% |
니 | 99 | 1.7% |
에 | 71 | 1.2% |
는 | 71 | 1.2% |
을 | 63 | 1.1% |
고 | 62 | 1.1% |
지 | 60 | 1.0% |
Other values (527) | 3608 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3762 | 65.0% |
Space Separator | 1383 | 23.9% |
Other Punctuation | 252 | 4.4% |
Decimal Number | 181 | 3.1% |
Lowercase Letter | 92 | 1.6% |
Open Punctuation | 31 | 0.5% |
Close Punctuation | 31 | 0.5% |
Uppercase Letter | 29 | 0.5% |
Other Symbol | 14 | 0.2% |
Dash Punctuation | 13 | 0.2% |
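The category names in this table are Unicode general categories. As a rough illustration of how such counts can be derived, the sketch below tallies the categories of one BDCT_NO value from this report using Python's standard `unicodedata` module. It is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

# One BDCT_NO value that appears in the sample rows of this report,
# normalized to NFC so each Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", "MBN뉴스 이미혜입니다.")

# unicodedata.category returns the two-letter general category code,
# e.g. "Lo" (Other Letter), "Lu" (Uppercase Letter), "Zs" (Space Separator),
# "Po" (Other Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count, f"{100 * count / len(text):.1f}%")
```

Hangul syllables fall under "Lo", which is why Other Letter dominates the table above.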
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
다 | 127 | 3.4% |
이 | 125 | 3.3% |
니 | 99 | 2.6% |
에 | 71 | 1.9% |
는 | 71 | 1.9% |
을 | 63 | 1.7% |
고 | 62 | 1.6% |
지 | 60 | 1.6% |
기 | 58 | 1.5% |
가 | 53 | 1.4% |
Other values (465) | 2973 |
Lowercase Letter
Value | Count | Frequency (%) |
o | 13 | |
m | 9 | |
n | 9 | |
k | 8 | 8.7% |
b | 7 | 7.6% |
c | 7 | 7.6% |
r | 6 | 6.5% |
e | 5 | 5.4% |
g | 4 | 4.3% |
a | 4 | 4.3% |
Other values (12) | 20 |
Other Punctuation
Value | Count | Frequency (%) |
. | 119 | |
" | 36 | 14.3% |
, | 28 | 11.1% |
' | 24 | 9.5% |
/ | 13 | 5.2% |
: | 13 | 5.2% |
@ | 5 | 2.0% |
? | 4 | 1.6% |
· | 4 | 1.6% |
! | 3 | 1.2% |
Other values (2) | 3 | 1.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 51 | |
2 | 31 | |
3 | 28 | |
0 | 21 | |
8 | 15 | 8.3% |
4 | 12 | 6.6% |
7 | 8 | 4.4% |
9 | 5 | 2.8% |
6 | 5 | 2.8% |
5 | 5 | 2.8% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 8 | |
M | 8 | |
N | 8 | |
A | 3 | 10.3% |
V | 1 | 3.4% |
T | 1 | 3.4% |
Open Punctuation
Value | Count | Frequency (%) |
( | 10 | |
「 | 8 | |
【 | 7 | |
[ | 6 |
Close Punctuation
Value | Count | Frequency (%) |
) | 10 | |
」 | 8 | |
】 | 7 | |
] | 6 |
Other Symbol
Value | Count | Frequency (%) |
▶ | 12 | |
☎ | 2 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1383 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 13 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 3762 | 65.0% |
Common | 1905 | 32.9% |
Latin | 121 | 2.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
다 | 127 | 3.4% |
이 | 125 | 3.3% |
니 | 99 | 2.6% |
에 | 71 | 1.9% |
는 | 71 | 1.9% |
을 | 63 | 1.7% |
고 | 62 | 1.6% |
지 | 60 | 1.6% |
기 | 58 | 1.5% |
가 | 53 | 1.4% |
Other values (465) | 2973 |
Common
Value | Count | Frequency (%) |
(space) | 1383 | 72.6% |
. | 119 | 6.2% |
1 | 51 | 2.7% |
" | 36 | 1.9% |
2 | 31 | 1.6% |
3 | 28 | 1.5% |
, | 28 | 1.5% |
' | 24 | 1.3% |
0 | 21 | 1.1% |
8 | 15 | 0.8% |
Other values (24) | 169 | 8.9% |
Latin
Value | Count | Frequency (%) |
o | 13 | 10.7% |
m | 9 | 7.4% |
n | 9 | 7.4% |
B | 8 | 6.6% |
k | 8 | 6.6% |
M | 8 | 6.6% |
N | 8 | 6.6% |
b | 7 | 5.8% |
c | 7 | 5.8% |
r | 6 | 5.0% |
Other values (18) | 38 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 3762 | 65.0% |
ASCII | 1976 | 34.1% |
None | 34 | 0.6% |
Geometric Shapes | 12 | 0.2% |
Punctuation | 2 | < 0.1% |
Misc Symbols | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1383 | 70.0% |
. | 119 | 6.0% |
1 | 51 | 2.6% |
" | 36 | 1.8% |
2 | 31 | 1.6% |
3 | 28 | 1.4% |
, | 28 | 1.4% |
' | 24 | 1.2% |
0 | 21 | 1.1% |
8 | 15 | 0.8% |
Other values (44) | 240 | 12.1% |
Hangul
Value | Count | Frequency (%) |
다 | 127 | 3.4% |
이 | 125 | 3.3% |
니 | 99 | 2.6% |
에 | 71 | 1.9% |
는 | 71 | 1.9% |
을 | 63 | 1.7% |
고 | 62 | 1.6% |
지 | 60 | 1.6% |
기 | 58 | 1.5% |
가 | 53 | 1.4% |
Other values (465) | 2973 |
Geometric Shapes
Value | Count | Frequency (%) |
▶ | 12 |
None
Value | Count | Frequency (%) |
」 | 8 | |
「 | 8 | |
【 | 7 | |
】 | 7 | |
· | 4 |
Punctuation
Value | Count | Frequency (%) |
… | 2 |
Misc Symbols
Value | Count | Frequency (%) |
☎ | 2 |
NEWS_CGR_CD
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 226 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.1 KiB |
NEWS_NO
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 0.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | 218 |
---|---|
20120407 | 8 |
Length
Max length | 8 |
---|---|
Median length | 4 |
Mean length | 4.1415929 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 218 | 96.5% |
20120407 | 8 | 3.5% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 218 | |
20120407 | 8 | 3.5% |
BDCT_DATE
Text
MISSING
 
Distinct | 9 |
---|---|
Distinct (%) | 50.0% |
Missing | 208 |
Missing (%) | 92.0% |
Memory size | 1.9 KiB |
Length
Max length | 82 |
---|---|
Median length | 8 |
Mean length | 40.888889 |
Min length | 8 |
Characters and Unicode
Total characters | 736 |
---|---|
Distinct characters | 37 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 8 |
---|---|
Unique (%) | 44.4% |
Sample
1st row | 20120407 |
---|---|
2nd row | 20120407 |
3rd row | 20120407 |
4th row | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023829 |
5th row | 20120407 |
Value | Count | Frequency (%) |
20120407 | 10 | 55.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023829 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023830 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023831 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023832 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023833 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023834 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023835 | 1 | 5.6% |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023836 | 1 | 5.6% |
Most occurring characters
Value | Count | Frequency (%) |
t | 64 | 8.7% |
n | 64 | 8.7% |
0 | 47 | 6.4% |
e | 40 | 5.4% |
o | 40 | 5.4% |
c | 40 | 5.4% |
2 | 38 | 5.2% |
. | 32 | 4.3% |
/ | 32 | 4.3% |
_ | 24 | 3.3% |
Other values (27) | 315 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 448 | |
Decimal Number | 152 | 20.7% |
Other Punctuation | 88 | 12.0% |
Connector Punctuation | 24 | 3.3% |
Math Symbol | 16 | 2.2% |
Uppercase Letter | 8 | 1.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 64 | |
n | 64 | |
e | 40 | 8.9% |
o | 40 | 8.9% |
c | 40 | 8.9% |
w | 24 | 5.4% |
m | 24 | 5.4% |
s | 16 | 3.6% |
l | 16 | 3.6% |
i | 16 | 3.6% |
Other values (9) | 104 |
Decimal Number
Value | Count | Frequency (%) |
0 | 47 | |
2 | 38 | |
1 | 19 | |
3 | 16 | 10.5% |
4 | 11 | 7.2% |
7 | 10 | 6.6% |
8 | 8 | 5.3% |
9 | 1 | 0.7% |
5 | 1 | 0.7% |
6 | 1 | 0.7% |
Other Punctuation
Value | Count | Frequency (%) |
. | 32 | |
/ | 32 | |
& | 8 | 9.1% |
? | 8 | 9.1% |
: | 8 | 9.1% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 24 |
Math Symbol
Value | Count | Frequency (%) |
= | 16 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 8 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 456 | |
Common | 280 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
t | 64 | |
n | 64 | |
e | 40 | 8.8% |
o | 40 | 8.8% |
c | 40 | 8.8% |
w | 24 | 5.3% |
m | 24 | 5.3% |
s | 16 | 3.5% |
l | 16 | 3.5% |
i | 16 | 3.5% |
Other values (10) | 112 |
Common
Value | Count | Frequency (%) |
0 | 47 | |
2 | 38 | |
. | 32 | |
/ | 32 | |
_ | 24 | |
1 | 19 | |
= | 16 | 5.7% |
3 | 16 | 5.7% |
4 | 11 | 3.9% |
7 | 10 | 3.6% |
Other values (7) | 35 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 736 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
t | 64 | 8.7% |
n | 64 | 8.7% |
0 | 47 | 6.4% |
e | 40 | 5.4% |
o | 40 | 5.4% |
c | 40 | 5.4% |
2 | 38 | 5.2% |
. | 32 | 4.3% |
/ | 32 | 4.3% |
_ | 24 | 3.3% |
Other values (27) | 315 |
BDCT_TIME
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 216 |
Missing (%) | 95.6% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2021.1 |
Minimum | 2000 |
---|---|
Maximum | 2028 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.1 KiB |
Quantile statistics
Minimum | 2000 |
---|---|
5-th percentile | 2008.1 |
Q1 | 2020.25 |
median | 2023 |
Q3 | 2025.75 |
95-th percentile | 2027.55 |
Maximum | 2028 |
Range | 28 |
Interquartile range (IQR) | 5.5 |
Descriptive statistics
Standard deviation | 8.0753397 |
---|---|
Coefficient of variation (CV) | 0.0039955171 |
Kurtosis | 6.0621744 |
Mean | 2021.1 |
Median Absolute Deviation (MAD) | 3 |
Skewness | -2.291706 |
Sum | 20211 |
Variance | 65.211111 |
Monotonicity | Not monotonic |
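The quantile and descriptive statistics above can be recomputed from the ten non-missing BDCT_TIME values listed in this report. The sketch below uses only Python's standard `statistics` module. It assumes the sample (n - 1) variance and linearly interpolated quantiles, which is what reproduces the reported figures.

```python
import statistics

# The ten non-missing BDCT_TIME values listed in this report.
values = [2000, 2018, 2020, 2021, 2022, 2024, 2025, 2026, 2027, 2028]

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)       # square root of the sample variance
median = statistics.median(values)
cv = std_dev / mean                      # coefficient of variation

# method="inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)

print(f"Sum={sum(values)} Mean={mean} Median={median}")
print(f"Variance={variance:.6f} Std={std_dev:.7f} CV={cv:.10f}")
print(f"Q1={q1} Q3={q3} IQR={iqr} MAD={mad}")
```

With only ten observations and one value (2000) far from the rest, the median, IQR and MAD describe this column more robustly than the mean and standard deviation.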
Value | Count | Frequency (%) |
2028 | 1 | 0.4% |
2000 | 1 | 0.4% |
2027 | 1 | 0.4% |
2026 | 1 | 0.4% |
2025 | 1 | 0.4% |
2024 | 1 | 0.4% |
2022 | 1 | 0.4% |
2021 | 1 | 0.4% |
2020 | 1 | 0.4% |
2018 | 1 | 0.4% |
(Missing) | 216 | 95.6% |
Value | Count | Frequency (%) |
2000 | 1 | |
2018 | 1 | |
2020 | 1 | |
2021 | 1 | |
2022 | 1 | |
2024 | 1 | |
2025 | 1 | |
2026 | 1 | |
2027 | 1 | |
2028 | 1 |
Value | Count | Frequency (%) |
2028 | 1 | |
2027 | 1 | |
2026 | 1 | |
2025 | 1 | |
2024 | 1 | |
2022 | 1 | |
2021 | 1 | |
2020 | 1 | |
2018 | 1 | |
2000 | 1 |
NWS_SJ
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 216 |
Missing (%) | 95.6% |
Memory size | 1.9 KiB |
Length
Max length | 42 |
---|---|
Median length | 34.5 |
Mean length | 27.3 |
Min length | 3 |
Characters and Unicode
Total characters | 273 |
---|---|
Distinct characters | 125 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 4 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 주요 뉴스 |
---|---|
2nd row | 클로징 |
3rd row | [김순철 기자] 개나리꽃 보러 왔어요…봄 나들이객 '북적' |
4th row | [김태영 기자] 강풍 피해 속출…상하층 온도 차 원인 |
5th row | [김명준 기자] [4·11 총선] 여야, 마지막 주말 '유세 총력전' |
Value | Count | Frequency (%) |
기자 | 7 | 10.6% |
총선 | 3 | 4.5% |
4·11 | 3 | 4.5% |
주요 | 1 | 1.5% |
경찰 | 1 | 1.5% |
김시영 | 1 | 1.5% |
선거판에 | 1 | 1.5% |
선파라치 | 1 | 1.5% |
떴다 | 1 | 1.5% |
이성훈 | 1 | 1.5% |
Other values (46) | 46 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 56 | 20.5% |
1 | 10 | 3.7% |
] | 9 | 3.3% |
[ | 9 | 3.3% |
' | 8 | 2.9% |
자 | 7 | 2.6% |
기 | 7 | 2.6% |
선 | 5 | 1.8% |
김 | 5 | 1.8% |
이 | 4 | 1.5% |
Other values (115) | 153 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 159 | |
Space Separator | 56 | 20.5% |
Other Punctuation | 22 | 8.1% |
Decimal Number | 18 | 6.6% |
Close Punctuation | 9 | 3.3% |
Open Punctuation | 9 | 3.3% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
자 | 7 | 4.4% |
기 | 7 | 4.4% |
선 | 5 | 3.1% |
김 | 5 | 3.1% |
이 | 4 | 2.5% |
총 | 4 | 2.5% |
전 | 3 | 1.9% |
성 | 3 | 1.9% |
리 | 3 | 1.9% |
고 | 3 | 1.9% |
Other values (100) | 115 |
Decimal Number
Value | Count | Frequency (%) |
1 | 10 | |
4 | 3 | 16.7% |
2 | 2 | 11.1% |
6 | 1 | 5.6% |
3 | 1 | 5.6% |
7 | 1 | 5.6% |
Other Punctuation
Value | Count | Frequency (%) |
' | 8 | |
" | 4 | |
, | 3 | 13.6% |
· | 3 | 13.6% |
… | 3 | 13.6% |
! | 1 | 4.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 56 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
] | 9 |
Open Punctuation
Value | Count | Frequency (%) |
[ | 9 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 159 | |
Common | 114 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
자 | 7 | 4.4% |
기 | 7 | 4.4% |
선 | 5 | 3.1% |
김 | 5 | 3.1% |
이 | 4 | 2.5% |
총 | 4 | 2.5% |
전 | 3 | 1.9% |
성 | 3 | 1.9% |
리 | 3 | 1.9% |
고 | 3 | 1.9% |
Other values (100) | 115 |
Common
Value | Count | Frequency (%) |
(space) | 56 | 49.1% |
1 | 10 | 8.8% |
] | 9 | 7.9% |
[ | 9 | 7.9% |
' | 8 | 7.0% |
" | 4 | 3.5% |
4 | 3 | 2.6% |
, | 3 | 2.6% |
· | 3 | 2.6% |
… | 3 | 2.6% |
Other values (5) | 6 | 5.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 159 | |
ASCII | 108 | |
None | 3 | 1.1% |
Punctuation | 3 | 1.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 56 | 51.9% |
1 | 10 | 9.3% |
] | 9 | 8.3% |
[ | 9 | 8.3% |
' | 8 | 7.4% |
" | 4 | 3.7% |
4 | 3 | 2.8% |
, | 3 | 2.8% |
2 | 2 | 1.9% |
! | 1 | 0.9% |
Other values (3) | 3 | 2.8% |
Hangul
Value | Count | Frequency (%) |
자 | 7 | 4.4% |
기 | 7 | 4.4% |
선 | 5 | 3.1% |
김 | 5 | 3.1% |
이 | 4 | 2.5% |
총 | 4 | 2.5% |
전 | 3 | 1.9% |
성 | 3 | 1.9% |
리 | 3 | 1.9% |
고 | 3 | 1.9% |
Other values (100) | 115 |
None
Value | Count | Frequency (%) |
· | 3 |
Punctuation
Value | Count | Frequency (%) |
… | 3 |
NWS_CN
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 2.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | 216 |
---|---|
【 앵커멘트 】 | 7 |
주요 뉴스 | 1 |
클로징 | 1 |
경기도 수원 여성 납치 살해 사건의 부실 대응 논란과 관련해 경찰이 112 신고센터와 상황실 운영 체계를 전면 개편하기로 했습니다. | 1 |
Length
Max length | 73 |
---|---|
Median length | 4 |
Mean length | 4.4292035 |
Min length | 3 |
Unique
Unique | 3 |
---|---|
Unique (%) | 1.3% |
Sample
1st row | <NA> |
---|---|
2nd row | 주요 뉴스 |
3rd row | 클로징 |
4th row | 【 앵커멘트 】 |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 216 | 95.6% |
【 앵커멘트 】 | 7 | 3.1% |
주요 뉴스 | 1 | 0.4% |
클로징 | 1 | 0.4% |
경기도 수원 여성 납치 살해 사건의 부실 대응 논란과 관련해 경찰이 112 신고센터와 상황실 운영 체계를 전면 개편하기로 했습니다. | 1 | 0.4% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 216 | |
앵커멘트 | 7 | 2.7% |
】 | 7 | 2.7% |
【 | 7 | 2.7% |
논란과 | 1 | 0.4% |
개편하기로 | 1 | 0.4% |
전면 | 1 | 0.4% |
체계를 | 1 | 0.4% |
운영 | 1 | 0.4% |
상황실 | 1 | 0.4% |
Other values (16) | 16 | 6.2% |
NWS_JRNL_NM
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 226 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.1 KiB |
REG_DATE
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 0.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | 224 |
---|---|
20120407 | 2 |
Length
Max length | 8 |
---|---|
Median length | 4 |
Mean length | 4.0353982 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | 20120407 |
3rd row | 20120407 |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 224 | 99.1% |
20120407 | 2 | 0.9% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 224 | |
20120407 | 2 | 0.9% |
MVP_CRS_NM
Text
MISSING
 
Distinct | 2 |
---|---|
Distinct (%) | 100.0% |
Missing | 224 |
Missing (%) | 99.1% |
Memory size | 1.9 KiB |
Length
Max length | 82 |
---|---|
Median length | 82 |
Mean length | 82 |
Min length | 82 |
Characters and Unicode
Total characters | 164 |
---|---|
Distinct characters | 33 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023827 |
---|---|
2nd row | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023828 |
Value | Count | Frequency (%) |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023827 | 1 | |
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=20&content_id=1023828 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
n | 16 | 9.8% |
t | 16 | 9.8% |
c | 10 | 6.1% |
e | 10 | 6.1% |
o | 10 | 6.1% |
/ | 8 | 4.9% |
. | 8 | 4.9% |
2 | 6 | 3.7% |
w | 6 | 3.7% |
m | 6 | 3.7% |
Other values (23) | 68 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 112 | |
Other Punctuation | 22 | 13.4% |
Decimal Number | 18 | 11.0% |
Connector Punctuation | 6 | 3.7% |
Math Symbol | 4 | 2.4% |
Uppercase Letter | 2 | 1.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
n | 16 | |
t | 16 | |
c | 10 | 8.9% |
e | 10 | 8.9% |
o | 10 | 8.9% |
w | 6 | 5.4% |
m | 6 | 5.4% |
l | 4 | 3.6% |
i | 4 | 3.6% |
d | 4 | 3.6% |
Other values (9) | 26 |
Decimal Number
Value | Count | Frequency (%) |
2 | 6 | |
0 | 4 | |
8 | 3 | |
1 | 2 | 11.1% |
3 | 2 | 11.1% |
7 | 1 | 5.6% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 8 | |
. | 8 | |
& | 2 | 9.1% |
? | 2 | 9.1% |
: | 2 | 9.1% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 6 |
Math Symbol
Value | Count | Frequency (%) |
= | 4 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 114 | |
Common | 50 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
n | 16 | |
t | 16 | |
c | 10 | 8.8% |
e | 10 | 8.8% |
o | 10 | 8.8% |
w | 6 | 5.3% |
m | 6 | 5.3% |
l | 4 | 3.5% |
i | 4 | 3.5% |
d | 4 | 3.5% |
Other values (10) | 28 |
Common
Value | Count | Frequency (%) |
/ | 8 | |
. | 8 | |
2 | 6 | |
_ | 6 | |
= | 4 | |
0 | 4 | |
8 | 3 | 6.0% |
& | 2 | 4.0% |
1 | 2 | 4.0% |
3 | 2 | 4.0% |
Other values (3) | 5 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 164 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
n | 16 | 9.8% |
t | 16 | 9.8% |
c | 10 | 6.1% |
e | 10 | 6.1% |
o | 10 | 6.1% |
/ | 8 | 4.9% |
. | 8 | 4.9% |
2 | 6 | 3.7% |
w | 6 | 3.7% |
m | 6 | 3.7% |
Other values (23) | 68 |
Unnamed: 10
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 226 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.1 KiB |
Variable | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | MVP_CRS_NM |
---|---|---|---|---|---|
BDCT_DATE | 1.000 | NaN | NaN | NaN | NaN |
BDCT_TIME | NaN | 1.000 | 1.000 | 0.000 | NaN |
NWS_SJ | NaN | 1.000 | 1.000 | 1.000 | 0.000 |
NWS_CN | NaN | 0.000 | 1.000 | 1.000 | 0.000 |
MVP_CRS_NM | NaN | NaN | 0.000 | 0.000 | 1.000 |
Variable | NEWS_NO | REG_DATE | NWS_CN |
---|---|---|---|
NEWS_NO | 1.000 | NaN | NaN |
REG_DATE | NaN | 1.000 | 1.000 |
NWS_CN | NaN | 1.000 | 1.000 |
Variable | BDCT_TIME | NEWS_NO | NWS_CN | REG_DATE |
---|---|---|---|---|
BDCT_TIME | 1.000 | 0.000 | 0.267 | 1.000 |
NEWS_NO | 0.000 | 1.000 | 0.000 | 0.000 |
NWS_CN | 0.267 | 0.000 | 1.000 | 1.000 |
REG_DATE | 1.000 | 0.000 | 1.000 | 1.000 |
Index | BDCT_NO | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | NWS_JRNL_NM | REG_DATE | MVP_CRS_NM | Unnamed: 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
1 | 1023827 | <NA> | <NA> | 20120407 | 2028 | 주요 뉴스 | 주요 뉴스 | <NA> | 20120407 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023827 | <NA> |
2 | 1023828 | <NA> | <NA> | 20120407 | 2000 | 클로징 | 클로징 | <NA> | 20120407 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023828 | <NA> |
3 | 1023829 | <NA> | <NA> | 20120407 | 2027 | [김순철 기자] 개나리꽃 보러 왔어요…봄 나들이객 '북적' | 【 앵커멘트 】 | <NA> | <NA> | <NA> | <NA> |
4 | 바람이 불어 다소 쌀쌀했지만 화창한 날씨 덕분에 많은 시민들이 개나리가 활짝 핀 산을 찾는 등, 휴일을 즐겼는데요, | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
5 | 다양했던 시민들 표정 김순철 기자가 담았습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
8 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
9 | 「"날씨가 오늘 기가 막힌다. 하늘 봐, 하늘." | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Index | BDCT_NO | NEWS_CGR_CD | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | NWS_JRNL_NM | REG_DATE | MVP_CRS_NM | Unnamed: 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
216 | 시기적으로도 북한이 14일을 선택할 것이라는 전망이 우세합니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
217 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
218 | 오는 15일은 북한이 전례 없는 대규모 행사를 준비하고 있는 김일성의 100회 생일입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
219 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
220 | 또 13일은 최고인민회의가 예정돼 있어 14일이 로켓 발사의 극적인 효과를 내는 데 가장 좋은 날이기 때문입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
221 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
222 | 군 당국은 북한이 곧 로켓에 연료주입을 시작할 것으로 보고, 발사장 상황을 면밀하게 살피고 있습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
223 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
224 | MBN뉴스 이미혜입니다. | <NA> | 20120407 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=20&content_id=1023836 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
225 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Most frequently occurring
Index | BDCT_NO | NEWS_NO | BDCT_DATE | BDCT_TIME | NWS_SJ | NWS_CN | REG_DATE | MVP_CRS_NM | # duplicates |
---|---|---|---|---|---|---|---|---|---|
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 90 |
0 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 7 |
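The summary reports 2 duplicate rows (0.9%), while this table lists two row patterns that occur 90 and 7 times. Those figures are consistent with counting distinct duplicated row patterns (2 out of 226 rows is about 0.9%) rather than individual repeated rows. That reading is an inference from the numbers, not something the report states. A minimal sketch of that way of counting, on hypothetical toy rows:

```python
from collections import Counter

# Toy rows (hypothetical), each a tuple of column values; None stands for <NA>.
rows = [
    (None, None),
    ("【 기자 】", None),
    ("1023827", "20120407"),
    (None, None),
    ("【 기자 】", None),
    (None, None),
]

# Count how often each distinct row occurs, then keep those seen more than once.
row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# Number of distinct duplicated patterns, and its share of all rows.
n_duplicate_patterns = len(duplicated)
duplicate_pct = 100 * n_duplicate_patterns / len(rows)

for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
print(f"{n_duplicate_patterns} duplicated patterns ({duplicate_pct:.1f}% of rows)")
```

Here the all-missing row plays the same role as the 90-fold pattern in the table above: it counts once as a pattern, however many times it repeats.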