Dataset statistics
Number of variables | 10 |
---|---|
Number of observations | 154 |
Missing cells | 927 |
Missing cells (%) | 60.2% |
Duplicate rows | 2 |
Duplicate rows (%) | 1.3% |
Total size in memory | 12.8 KiB |
Average record size in memory | 84.9 B |
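The missing-cell share can be cross-checked from the per-column counts that the report lists in its Missing alerts. The sketch below is only an arithmetic check. It assumes those seven columns are the only ones with missing cells, since the three Categorical columns report 0 missing. The variable names are illustrative.

```python
# Per-column missing counts, copied from the report's "Missing" alerts.
missing_per_column = {
    "MBN_MDA_SP_CD": 55,
    "MDA_ART_ESSN_NO": 139,
    "STD_YEAR": 137,
    "ART_SJ_CN": 144,
    "ART_CN": 144,
    "JRNL_NM": 154,
    "Unnamed: 9": 154,
}
n_rows, n_cols = 154, 10  # observations and variables from the table above

missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)
print(missing_cells, f"{missing_pct:.1f}%")  # 927 60.2%
```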
Variable types
Text | 4 |
---|---|
Categorical | 3 |
Numeric | 1 |
Unsupported | 2 |
Dataset
Description | Sample data |
---|---|
Author | MBN |
URL | https://kdx.kr/data/view/26943 |
Dataset has 2 (1.3%) duplicate rows | Duplicates |
MDA_CGR_NM is highly overall correlated with STD_YEAR and 2 other fields | High correlation |
WRT_DATE is highly overall correlated with STD_YEAR and 2 other fields | High correlation |
ATCH_IMG_NM is highly overall correlated with STD_YEAR and 2 other fields | High correlation |
STD_YEAR is highly overall correlated with MDA_CGR_NM and 2 other fields | High correlation |
MDA_CGR_NM is highly imbalanced (77.7%) | Imbalance |
ATCH_IMG_NM is highly imbalanced (86.1%) | Imbalance |
WRT_DATE is highly imbalanced (91.5%) | Imbalance |
MBN_MDA_SP_CD has 55 (35.7%) missing values | Missing |
MDA_ART_ESSN_NO has 139 (90.3%) missing values | Missing |
STD_YEAR has 137 (89.0%) missing values | Missing |
ART_SJ_CN has 144 (93.5%) missing values | Missing |
ART_CN has 144 (93.5%) missing values | Missing |
JRNL_NM has 154 (100.0%) missing values | Missing |
Unnamed: 9 has 154 (100.0%) missing values | Missing |
JRNL_NM is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
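The three Imbalance percentages are consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below is inferred from the reported numbers, not taken from the tool's source. The function name `imbalance_score` is hypothetical, and the counts come from the Common Values tables of the respective columns.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts.

    H is the Shannon entropy (base 2) of the class frequencies and k is
    the number of classes. 0 means perfectly balanced, 1 means one class
    holds every row. Assumed formula, checked only against this report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts as listed in the report's Common Values tables.
atch_img_nm = imbalance_score([151, 3])
wrt_date = imbalance_score([151, 1, 1, 1])
mda_cgr_nm = imbalance_score([139, 10, 1, 1, 1, 1, 1])
print(f"{atch_img_nm:.3f} {wrt_date:.3f} {mda_cgr_nm:.3f}")
```

Under this reading, a column dominated by the literal string `<NA>` scores as highly imbalanced even though the report counts it as having no missing values.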
Reproduction
Analysis started | 2023-12-11 21:15:52.971080 |
---|---|
Analysis finished | 2023-12-11 21:15:53.985366 |
Duration | 1.01 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
MBN_MDA_SP_CD
Text
MISSING
 
Distinct | 87 |
---|---|
Distinct (%) | 87.9% |
Missing | 55 |
Missing (%) | 35.7% |
Memory size | 1.3 KiB |
Length
Max length | 311 |
---|---|
Median length | 79 |
Mean length | 48.919192 |
Min length | 1 |
Characters and Unicode
Total characters | 4843 |
---|---|
Distinct characters | 485 |
Distinct categories | 11 |
Distinct scripts | 4 |
Distinct blocks | 7 |
Unique
Unique | 85 |
---|---|
Unique (%) | 85.9% |
Sample
1st row | MBN |
---|---|
2nd row | MBN |
3rd row | MBN |
4th row | MBN |
5th row | 김 의원은 "좌파독재의 도구인 공수처법이 통과됐다"며 "문재인 좌파독재 정권에 헌법이 무참히 짓밟히는 현장을 무기력하게 지켜볼 수밖에 없었다, 참담하다"고 토로했습니다. |
Value | Count | Frequency (%) |
(blank) | 41 | 3.7% |
mbn | 13 | 1.2% |
▶ | 10 | 0.9% |
기자 | 9 | 0.8% |
인터뷰 | 9 | 0.8% |
검찰 | 8 | 0.7% |
수 | 7 | 0.6% |
지난해 | 7 | 0.6% |
조국 | 7 | 0.6% |
한 | 5 | 0.4% |
Other values (813) | 1006 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1147 | 23.7% |
이 | 105 | 2.2% |
다 | 101 | 2.1% |
" | 92 | 1.9% |
. | 73 | 1.5% |
의 | 66 | 1.4% |
에 | 58 | 1.2% |
는 | 58 | 1.2% |
니 | 55 | 1.1% |
고 | 54 | 1.1% |
Other values (475) | 3034 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3281 | |
Space Separator | 1147 | 23.7% |
Other Punctuation | 234 | 4.8% |
Uppercase Letter | 51 | 1.1% |
Decimal Number | 32 | 0.7% |
Close Punctuation | 26 | 0.5% |
Open Punctuation | 26 | 0.5% |
Lowercase Letter | 20 | 0.4% |
Dash Punctuation | 12 | 0.2% |
Other Symbol | 11 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
이 | 105 | 3.2% |
다 | 101 | 3.1% |
의 | 66 | 2.0% |
에 | 58 | 1.8% |
는 | 58 | 1.8% |
니 | 55 | 1.7% |
고 | 54 | 1.6% |
한 | 51 | 1.6% |
을 | 44 | 1.3% |
지 | 43 | 1.3% |
Other values (426) | 2646 |
Other Punctuation
Value | Count | Frequency (%) |
" | 92 | |
. | 73 | |
, | 28 | 12.0% |
: | 16 | 6.8% |
/ | 10 | 4.3% |
' | 6 | 2.6% |
! | 4 | 1.7% |
… | 2 | 0.9% |
? | 1 | 0.4% |
@ | 1 | 0.4% |
Lowercase Letter
Value | Count | Frequency (%) |
r | 3 | |
c | 3 | |
k | 3 | |
o | 2 | |
m | 2 | |
b | 2 | |
n | 2 | |
h | 1 | 5.0% |
a | 1 | 5.0% |
g | 1 | 5.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 10 | |
4 | 6 | |
2 | 4 | 12.5% |
5 | 3 | 9.4% |
8 | 3 | 9.4% |
3 | 2 | 6.2% |
0 | 2 | 6.2% |
7 | 1 | 3.1% |
6 | 1 | 3.1% |
Uppercase Letter
Value | Count | Frequency (%) |
N | 17 | |
B | 15 | |
M | 15 | |
S | 4 | 7.8% |
Close Punctuation
Value | Count | Frequency (%) |
) | 13 | |
」 | 5 | 19.2% |
] | 4 | 15.4% |
】 | 4 | 15.4% |
Open Punctuation
Value | Count | Frequency (%) |
( | 13 | |
「 | 5 | 19.2% |
【 | 4 | 15.4% |
[ | 4 | 15.4% |
Math Symbol
Value | Count | Frequency (%) |
+ | 1 | |
< | 1 | |
> | 1 |
Other Symbol
Value | Count | Frequency (%) |
▶ | 10 | |
ⓒ | 1 | 9.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1147 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 12 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 3274 | |
Common | 1491 | |
Latin | 71 | 1.5% |
Han | 7 | 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
이 | 105 | 3.2% |
다 | 101 | 3.1% |
의 | 66 | 2.0% |
에 | 58 | 1.8% |
는 | 58 | 1.8% |
니 | 55 | 1.7% |
고 | 54 | 1.6% |
한 | 51 | 1.6% |
을 | 44 | 1.3% |
지 | 43 | 1.3% |
Other values (419) | 2639 |
Common
Value | Count | Frequency (%) |
(space) | 1147 | |
" | 92 | 6.2% |
. | 73 | 4.9% |
, | 28 | 1.9% |
: | 16 | 1.1% |
) | 13 | 0.9% |
( | 13 | 0.9% |
- | 12 | 0.8% |
1 | 10 | 0.7% |
▶ | 10 | 0.7% |
Other values (25) | 77 | 5.2% |
Latin
Value | Count | Frequency (%) |
N | 17 | |
B | 15 | |
M | 15 | |
S | 4 | 5.6% |
r | 3 | 4.2% |
c | 3 | 4.2% |
k | 3 | 4.2% |
o | 2 | 2.8% |
m | 2 | 2.8% |
b | 2 | 2.8% |
Other values (4) | 5 | 7.0% |
Han
Value | Count | Frequency (%) |
泰 | 1 | |
山 | 1 | |
鳴 | 1 | |
動 | 1 | |
鼠 | 1 | |
一 | 1 | |
匹 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 3274 | |
ASCII | 1531 | |
None | 18 | 0.4% |
Geometric Shapes | 10 | 0.2% |
CJK | 7 | 0.1% |
Punctuation | 2 | < 0.1% |
Enclosed Alphanum | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1147 | |
" | 92 | 6.0% |
. | 73 | 4.8% |
, | 28 | 1.8% |
N | 17 | 1.1% |
: | 16 | 1.0% |
B | 15 | 1.0% |
M | 15 | 1.0% |
) | 13 | 0.8% |
( | 13 | 0.8% |
Other values (32) | 102 | 6.7% |
Hangul
Value | Count | Frequency (%) |
이 | 105 | 3.2% |
다 | 101 | 3.1% |
의 | 66 | 2.0% |
에 | 58 | 1.8% |
는 | 58 | 1.8% |
니 | 55 | 1.7% |
고 | 54 | 1.6% |
한 | 51 | 1.6% |
을 | 44 | 1.3% |
지 | 43 | 1.3% |
Other values (419) | 2639 |
Geometric Shapes
Value | Count | Frequency (%) |
▶ | 10 |
None
Value | Count | Frequency (%) |
」 | 5 | |
「 | 5 | |
【 | 4 | |
】 | 4 |
Punctuation
Value | Count | Frequency (%) |
… | 2 |
Enclosed Alphanum
Value | Count | Frequency (%) |
ⓒ | 1 |
CJK
Value | Count | Frequency (%) |
泰 | 1 | |
山 | 1 | |
鳴 | 1 | |
動 | 1 | |
鼠 | 1 | |
一 | 1 | |
匹 | 1 |
MDA_ART_ESSN_NO
Text
MISSING
 
Distinct | 12 |
---|---|
Distinct (%) | 80.0% |
Missing | 139 |
Missing (%) | 90.3% |
Memory size | 1.3 KiB |
Value | Count | Frequency (%) |
(blank) | 4 | |
4023123 | 1 | 6.7% |
4023124 | 1 | 6.7% |
4023125 | 1 | 6.7% |
4023130 | 1 | 6.7% |
4023131 | 1 | 6.7% |
4023146 | 1 | 6.7% |
4023172 | 1 | 6.7% |
http://img.mbn.co.kr/filewww/news/other/2020/01/01/020320020300.jpg | 1 | 6.7% |
4023183 | 1 | 6.7% |
Other values (2) | 2 |
Most occurring characters
Value | Count | Frequency (%) |
, | 45 | |
0 | 22 | |
2 | 19 | |
3 | 16 | 8.8% |
1 | 13 | 7.1% |
4 | 12 | 6.6% |
/ | 9 | 4.9% |
w | 4 | 2.2% |
. | 4 | 2.2% |
8 | 4 | 2.2% |
Other values (21) | 34 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 90 | |
Other Punctuation | 59 | |
Lowercase Letter | 33 | 18.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
w | 4 | |
t | 3 | 9.1% |
e | 3 | 9.1% |
n | 2 | 6.1% |
r | 2 | 6.1% |
o | 2 | 6.1% |
p | 2 | 6.1% |
m | 2 | 6.1% |
g | 2 | 6.1% |
i | 2 | 6.1% |
Other values (8) | 9 |
Decimal Number
Value | Count | Frequency (%) |
0 | 22 | |
2 | 19 | |
3 | 16 | |
1 | 13 | |
4 | 12 | |
8 | 4 | 4.4% |
5 | 2 | 2.2% |
6 | 1 | 1.1% |
7 | 1 | 1.1% |
Other Punctuation
Value | Count | Frequency (%) |
, | 45 | |
/ | 9 | 15.3% |
. | 4 | 6.8% |
: | 1 | 1.7% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 149 | |
Latin | 33 | 18.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
w | 4 | |
t | 3 | 9.1% |
e | 3 | 9.1% |
n | 2 | 6.1% |
r | 2 | 6.1% |
o | 2 | 6.1% |
p | 2 | 6.1% |
m | 2 | 6.1% |
g | 2 | 6.1% |
i | 2 | 6.1% |
Other values (8) | 9 |
Common
Value | Count | Frequency (%) |
, | 45 | |
0 | 22 | |
2 | 19 | |
3 | 16 | 10.7% |
1 | 13 | 8.7% |
4 | 12 | 8.1% |
/ | 9 | 6.0% |
. | 4 | 2.7% |
8 | 4 | 2.7% |
5 | 2 | 1.3% |
Other values (3) | 3 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 182 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
, | 45 | |
0 | 22 | |
2 | 19 | |
3 | 16 | 8.8% |
1 | 13 | 7.1% |
4 | 12 | 6.6% |
/ | 9 | 4.9% |
w | 4 | 2.2% |
. | 4 | 2.2% |
8 | 4 | 2.2% |
Other values (21) | 34 |
MDA_CGR_NM
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 7 |
---|---|
Distinct (%) | 4.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.3 KiB |
<NA> | 139 |
---|---|
mbn00006 | 10 |
황재헌 | 1 |
이석희 | 1 |
최중락 | 1 |
Other values (2) | 2 |
Length
Max length | 8 |
---|---|
Median length | 4 |
Mean length | 4.2272727 |
Min length | 3 |
Unique
Unique | 5 |
---|---|
Unique (%) | 3.2% |
Sample
1st row | <NA> |
---|---|
2nd row | mbn00006 |
3rd row | mbn00006 |
4th row | mbn00006 |
5th row | mbn00006 |
Common Values
Value | Count | Frequency (%) |
<NA> | 139 | |
mbn00006 | 10 | 6.5% |
황재헌 | 1 | 0.6% |
이석희 | 1 | 0.6% |
최중락 | 1 | 0.6% |
선한빛 | 1 | 0.6% |
조창훈 | 1 | 0.6% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 139 | |
mbn00006 | 10 | 6.5% |
황재헌 | 1 | 0.6% |
이석희 | 1 | 0.6% |
최중락 | 1 | 0.6% |
선한빛 | 1 | 0.6% |
조창훈 | 1 | 0.6% |
STD_YEAR
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 8 |
---|---|
Distinct (%) | 47.1% |
Missing | 137 |
Missing (%) | 89.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 8.3176887 × 10^12 |
Minimum | 2020 |
---|---|
Maximum | 2.0200101 × 10^13 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.5 KiB |
Quantile statistics
Minimum | 2020 |
---|---|
5-th percentile | 2020 |
Q1 | 2020 |
median | 2020 |
Q3 | 2.0200101 × 10^13 |
95-th percentile | 2.0200101 × 10^13 |
Maximum | 2.0200101 × 10^13 |
Range | 2.0200101 × 10^13 |
Interquartile range (IQR) | 2.0200101 × 10^13 |
Descriptive statistics
Standard deviation | 1.0247504 × 10^13 |
---|---|
Coefficient of variation (CV) | 1.2320135 |
Kurtosis | -2.1093878 |
Mean | 8.3176887 × 10^12 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 0.3942443 |
Sum | 1.4140071 × 10^14 |
Variance | 1.0501135 × 10^26 |
Monotonicity | Not monotonic |
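The summary statistics for STD_YEAR can be reproduced from the 17 non-missing values in the column's frequency table. The sketch below uses only the standard library and assumes the sample (n - 1) standard deviation, which is what matches the reported Variance. It illustrates how ten plain years and seven 14-digit values combine into a mean that describes neither group.

```python
import statistics

# Non-missing STD_YEAR values, taken from the column's frequency table:
# ten rows holding the year 2020 and seven rows holding 14-digit values.
values = [2020] * 10 + [
    20200101071535,
    20200101071713,
    20200101080035,
    20200101093912,
    20200101100100,
    20200101100250,
    20200101100425,
]

mean = statistics.mean(values)
stdev = statistics.stdev(values)   # sample standard deviation (n - 1)
cv = stdev / mean                  # coefficient of variation

print(f"sum={sum(values)} mean={mean:.7e} stdev={stdev:.7e} cv={cv:.7f}")
```

Because the median and Q1 are both 2020 while Q3 equals the maximum, the quantiles show the same two-cluster shape that the mean hides.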
Common Values
Value | Count | Frequency (%) |
2020 | 10 | 6.5% |
20200101071535 | 1 | 0.6% |
20200101071713 | 1 | 0.6% |
20200101080035 | 1 | 0.6% |
20200101093912 | 1 | 0.6% |
20200101100100 | 1 | 0.6% |
20200101100250 | 1 | 0.6% |
20200101100425 | 1 | 0.6% |
(Missing) | 137 |
Minimum 10 values
Value | Count | Frequency (%) |
2020 | 10 | |
20200101071535 | 1 | 0.6% |
20200101071713 | 1 | 0.6% |
20200101080035 | 1 | 0.6% |
20200101093912 | 1 | 0.6% |
20200101100100 | 1 | 0.6% |
20200101100250 | 1 | 0.6% |
20200101100425 | 1 | 0.6% |
Maximum 10 values
Value | Count | Frequency (%) |
20200101100425 | 1 | 0.6% |
20200101100250 | 1 | 0.6% |
20200101100100 | 1 | 0.6% |
20200101093912 | 1 | 0.6% |
20200101080035 | 1 | 0.6% |
20200101071713 | 1 | 0.6% |
20200101071535 | 1 | 0.6% |
2020 | 10 |
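The 14-digit STD_YEAR values look like `YYYYMMDDHHMMSS` timestamps, the same shape as the WRT_DATE values, which suggests they landed in the wrong column. The sketch below is one possible way to separate the two kinds of value before analysis. The timestamp interpretation is an assumption, and `split_year_and_timestamp` is a hypothetical helper name.

```python
from datetime import datetime
from typing import Optional, Tuple


def split_year_and_timestamp(raw: int) -> Tuple[int, Optional[datetime]]:
    """Split a STD_YEAR cell into (year, timestamp or None).

    A 4-digit value is treated as a plain year. A 14-digit value is
    assumed to be a YYYYMMDDHHMMSS timestamp. Anything else is rejected.
    """
    text = str(raw)
    if len(text) == 4:
        return int(text), None
    if len(text) == 14:
        stamp = datetime.strptime(text, "%Y%m%d%H%M%S")
        return stamp.year, stamp
    raise ValueError(f"unexpected STD_YEAR value: {raw!r}")


year, stamp = split_year_and_timestamp(20200101071535)
print(year, stamp)  # 2020 2020-01-01 07:15:35
```

With the timestamps split out, every non-missing STD_YEAR row resolves to the year 2020.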
ART_SJ_CN
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 144 |
Missing (%) | 93.5% |
Memory size | 1.3 KiB |
Length
Max length | 56 |
---|---|
Median length | 34.5 |
Mean length | 34.9 |
Min length | 25 |
Characters and Unicode
Total characters | 349 |
---|---|
Distinct characters | 141 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 4 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | [속보] 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" |
---|---|
2nd row | [속보] 북한 김정은 "머지않아 새 전략무기 목격하게 될 것" |
3rd row | [속보] 북한 김정은 "억제력 강화, 미국 입장따라 상향조정…적대정책 철회까지 전략무기개발 계속진행" |
4th row | 김도읍 자유한국당 의원 내년 총선 불출마 선언 |
5th row | 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" |
Value | Count | Frequency (%) |
북한 | 4 | 4.8% |
김정은 | 4 | 4.8% |
속보 | 3 | 3.6% |
미국 | 3 | 3.6% |
것 | 3 | 3.6% |
시간 | 2 | 2.4% |
끌수록 | 2 | 2.4% |
위력 | 2 | 2.4% |
앞에 | 2 | 2.4% |
속수무책 | 2 | 2.4% |
Other values (55) | 56 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 73 | 20.9% |
" | 16 | 4.6% |
속 | 7 | 2.0% |
정 | 7 | 2.0% |
한 | 6 | 1.7% |
수 | 6 | 1.7% |
무 | 6 | 1.7% |
미 | 5 | 1.4% |
에 | 5 | 1.4% |
국 | 5 | 1.4% |
Other values (131) | 213 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 235 | |
Space Separator | 73 | 20.9% |
Other Punctuation | 28 | 8.0% |
Decimal Number | 7 | 2.0% |
Open Punctuation | 3 | 0.9% |
Close Punctuation | 3 | 0.9% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
속 | 7 | 3.0% |
정 | 7 | 3.0% |
한 | 6 | 2.6% |
수 | 6 | 2.6% |
무 | 6 | 2.6% |
미 | 5 | 2.1% |
에 | 5 | 2.1% |
국 | 5 | 2.1% |
은 | 5 | 2.1% |
김 | 5 | 2.1% |
Other values (118) | 178 |
Other Punctuation
Value | Count | Frequency (%) |
" | 16 | |
… | 4 | 14.3% |
' | 4 | 14.3% |
, | 3 | 10.7% |
· | 1 | 3.6% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3 | |
4 | 1 | 14.3% |
9 | 1 | 14.3% |
0 | 1 | 14.3% |
2 | 1 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 73 |
Open Punctuation
Value | Count | Frequency (%) |
[ | 3 |
Close Punctuation
Value | Count | Frequency (%) |
] | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 235 | |
Common | 114 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
속 | 7 | 3.0% |
정 | 7 | 3.0% |
한 | 6 | 2.6% |
수 | 6 | 2.6% |
무 | 6 | 2.6% |
미 | 5 | 2.1% |
에 | 5 | 2.1% |
국 | 5 | 2.1% |
은 | 5 | 2.1% |
김 | 5 | 2.1% |
Other values (118) | 178 |
Common
Value | Count | Frequency (%) |
(space) | 73 | |
" | 16 | 14.0% |
… | 4 | 3.5% |
' | 4 | 3.5% |
[ | 3 | 2.6% |
1 | 3 | 2.6% |
, | 3 | 2.6% |
] | 3 | 2.6% |
· | 1 | 0.9% |
4 | 1 | 0.9% |
Other values (3) | 3 | 2.6% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 235 | |
ASCII | 109 | |
Punctuation | 4 | 1.1% |
None | 1 | 0.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 73 | |
" | 16 | 14.7% |
' | 4 | 3.7% |
[ | 3 | 2.8% |
1 | 3 | 2.8% |
, | 3 | 2.8% |
] | 3 | 2.8% |
4 | 1 | 0.9% |
9 | 1 | 0.9% |
0 | 1 | 0.9% |
Hangul
Value | Count | Frequency (%) |
속 | 7 | 3.0% |
정 | 7 | 3.0% |
한 | 6 | 2.6% |
수 | 6 | 2.6% |
무 | 6 | 2.6% |
미 | 5 | 2.1% |
에 | 5 | 2.1% |
국 | 5 | 2.1% |
은 | 5 | 2.1% |
김 | 5 | 2.1% |
Other values (118) | 178 |
Punctuation
Value | Count | Frequency (%) |
… | 4 |
None
Value | Count | Frequency (%) |
· | 1 |
ART_CN
Text
MISSING
 
Distinct | 7 |
---|---|
Distinct (%) | 70.0% |
Missing | 144 |
Missing (%) | 93.5% |
Memory size | 1.3 KiB |
Length
Max length | 133 |
---|---|
Median length | 57.5 |
Mean length | 38.4 |
Min length | 8 |
Characters and Unicode
Total characters | 384 |
---|---|
Distinct characters | 147 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 4 |
Unique
Unique | 6 |
---|---|
Unique (%) | 60.0% |
Sample
1st row | 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" |
---|---|
2nd row | 북한 김정은 "머지않아 새 전략무기 목격하게 될 것" |
3rd row | 북한 김정은 "억제력 강화, 미국 입장따라 상향조정…적대정책 철회까지 전략무기개발 계속진행" |
4th row | (이런 가운데) 자유한국당 김도읍 의원이 공수처법 저지 실패에 책임을 지고 내년 총선에 불출마하겠다고 선언했습니다. |
5th row | <!------------ PHOTO_POS_0 ------------> |
Value | Count | Frequency (%) |
【 | 4 | 5.1% |
】 | 4 | 5.1% |
앵커멘트 | 4 | 5.1% |
(blank) | 4 | 5.1% |
김정은 | 3 | 3.8% |
북한 | 3 | 3.8% |
것 | 2 | 2.6% |
미국 | 2 | 2.6% |
photo_pos_0 | 2 | 2.6% |
대표은 | 1 | 1.3% |
Other values (49) | 49 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 68 | 17.7% |
- | 48 | 12.5% |
" | 8 | 2.1% |
O | 6 | 1.6% |
한 | 6 | 1.6% |
수 | 5 | 1.3% |
정 | 5 | 1.3% |
은 | 5 | 1.3% |
이 | 4 | 1.0% |
법 | 4 | 1.0% |
Other values (137) | 225 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 214 | |
Space Separator | 68 | 17.7% |
Dash Punctuation | 48 | 12.5% |
Uppercase Letter | 16 | 4.2% |
Other Punctuation | 15 | 3.9% |
Open Punctuation | 6 | 1.6% |
Close Punctuation | 6 | 1.6% |
Connector Punctuation | 4 | 1.0% |
Math Symbol | 4 | 1.0% |
Decimal Number | 3 | 0.8% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
한 | 6 | 2.8% |
수 | 5 | 2.3% |
정 | 5 | 2.3% |
은 | 5 | 2.3% |
이 | 4 | 1.9% |
법 | 4 | 1.9% |
지 | 4 | 1.9% |
무 | 4 | 1.9% |
에 | 4 | 1.9% |
멘 | 4 | 1.9% |
Other values (116) | 169 |
Other Punctuation
Value | Count | Frequency (%) |
" | 8 | |
. | 2 | 13.3% |
! | 2 | 13.3% |
, | 2 | 13.3% |
… | 1 | 6.7% |
Uppercase Letter
Value | Count | Frequency (%) |
O | 6 | |
P | 4 | |
H | 2 | 12.5% |
T | 2 | 12.5% |
S | 2 | 12.5% |
Open Punctuation
Value | Count | Frequency (%) |
【 | 4 | |
( | 2 |
Close Punctuation
Value | Count | Frequency (%) |
】 | 4 | |
) | 2 |
Decimal Number
Value | Count | Frequency (%) |
0 | 2 | |
1 | 1 |
Math Symbol
Value | Count | Frequency (%) |
< | 2 | |
> | 2 |
Space Separator
Value | Count | Frequency (%) |
(space) | 68 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 48 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 214 | |
Common | 154 | |
Latin | 16 | 4.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
한 | 6 | 2.8% |
수 | 5 | 2.3% |
정 | 5 | 2.3% |
은 | 5 | 2.3% |
이 | 4 | 1.9% |
법 | 4 | 1.9% |
지 | 4 | 1.9% |
무 | 4 | 1.9% |
에 | 4 | 1.9% |
멘 | 4 | 1.9% |
Other values (116) | 169 |
Common
Value | Count | Frequency (%) |
(space) | 68 | |
- | 48 | |
" | 8 | 5.2% |
_ | 4 | 2.6% |
【 | 4 | 2.6% |
】 | 4 | 2.6% |
) | 2 | 1.3% |
( | 2 | 1.3% |
0 | 2 | 1.3% |
. | 2 | 1.3% |
Other values (6) | 10 | 6.5% |
Latin
Value | Count | Frequency (%) |
O | 6 | |
P | 4 | |
H | 2 | 12.5% |
T | 2 | 12.5% |
S | 2 | 12.5% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 214 | |
ASCII | 161 | |
None | 8 | 2.1% |
Punctuation | 1 | 0.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 68 | |
- | 48 | |
" | 8 | 5.0% |
O | 6 | 3.7% |
P | 4 | 2.5% |
_ | 4 | 2.5% |
) | 2 | 1.2% |
( | 2 | 1.2% |
0 | 2 | 1.2% |
. | 2 | 1.2% |
Other values (8) | 15 | 9.3% |
Hangul
Value | Count | Frequency (%) |
한 | 6 | 2.8% |
수 | 5 | 2.3% |
정 | 5 | 2.3% |
은 | 5 | 2.3% |
이 | 4 | 1.9% |
법 | 4 | 1.9% |
지 | 4 | 1.9% |
무 | 4 | 1.9% |
에 | 4 | 1.9% |
멘 | 4 | 1.9% |
Other values (116) | 169 |
None
Value | Count | Frequency (%) |
【 | 4 | |
】 | 4 |
Punctuation
Value | Count | Frequency (%) |
… | 1 |
ATCH_IMG_NM
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 1.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.3 KiB |
<NA> | 151 |
---|---|
,,,,,,,,, | 3 |
Length
Max length | 9 |
---|---|
Median length | 4 |
Mean length | 4.0974026 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | ,,,,,,,,, |
3rd row | ,,,,,,,,, |
4th row | ,,,,,,,,, |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 151 | |
,,,,,,,,, | 3 | 1.9% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 151 | |
3 | 1.9% |
JRNL_NM
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 154 |
---|---|
Missing (%) | 100.0% |
Memory size | 1.5 KiB |
WRT_DATE
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 2.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.3 KiB |
<NA> | 151 |
---|---|
20200101062259 | 1 |
20200101062328 | 1 |
20200101062900 | 1 |
Length
Max length | 14 |
---|---|
Median length | 4 |
Mean length | 4.1948052 |
Min length | 4 |
Unique
Unique | 3 |
---|---|
Unique (%) | 1.9% |
Sample
1st row | <NA> |
---|---|
2nd row | 20200101062259 |
3rd row | 20200101062328 |
4th row | 20200101062900 |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 151 | |
20200101062259 | 1 | 0.6% |
20200101062328 | 1 | 0.6% |
20200101062900 | 1 | 0.6% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 151 | |
20200101062259 | 1 | 0.6% |
20200101062328 | 1 | 0.6% |
20200101062900 | 1 | 0.6% |
Unnamed: 9
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 154 |
---|---|
Missing (%) | 100.0% |
Memory size | 1.5 KiB |
Variable | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | WRT_DATE |
---|---|---|---|---|---|---|---|
MBN_MDA_SP_CD | 1.000 | 0.000 | 1.000 | NaN | NaN | NaN | NaN |
MDA_ART_ESSN_NO | 0.000 | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 1.000 |
MDA_CGR_NM | 1.000 | 0.000 | 1.000 | NaN | NaN | NaN | NaN |
STD_YEAR | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN |
ART_SJ_CN | NaN | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 |
ART_CN | NaN | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 |
WRT_DATE | NaN | 1.000 | NaN | NaN | 1.000 | 1.000 | 1.000 |
Variable | MDA_CGR_NM | WRT_DATE | ATCH_IMG_NM |
---|---|---|---|
MDA_CGR_NM | 1.000 | 1.000 | 1.000 |
WRT_DATE | 1.000 | 1.000 | 1.000 |
ATCH_IMG_NM | 1.000 | 1.000 | 1.000 |
Variable | STD_YEAR | MDA_CGR_NM | ATCH_IMG_NM | WRT_DATE |
---|---|---|---|---|
STD_YEAR | 1.000 | 0.832 | 1.000 | 1.000 |
MDA_CGR_NM | 0.832 | 1.000 | 1.000 | 1.000 |
ATCH_IMG_NM | 1.000 | 1.000 | 1.000 | 1.000 |
WRT_DATE | 1.000 | 1.000 | 1.000 | 1.000 |
Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
1 | MBN | 4023123 | mbn00006 | 2020 | [속보] 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" | 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" | ,,,,,,,,, | <NA> | 20200101062259 | <NA> |
2 | MBN | 4023124 | mbn00006 | 2020 | [속보] 북한 김정은 "머지않아 새 전략무기 목격하게 될 것" | 북한 김정은 "머지않아 새 전략무기 목격하게 될 것" | ,,,,,,,,, | <NA> | 20200101062328 | <NA> |
3 | MBN | 4023125 | mbn00006 | 2020 | [속보] 북한 김정은 "억제력 강화, 미국 입장따라 상향조정…적대정책 철회까지 전략무기개발 계속진행" | 북한 김정은 "억제력 강화, 미국 입장따라 상향조정…적대정책 철회까지 전략무기개발 계속진행" | ,,,,,,,,, | <NA> | 20200101062900 | <NA> |
4 | MBN | 4023130 | mbn00006 | 2020 | 김도읍 자유한국당 의원 내년 총선 불출마 선언 | (이런 가운데) 자유한국당 김도읍 의원이 공수처법 저지 실패에 책임을 지고 내년 총선에 불출마하겠다고 선언했습니다. | <NA> | <NA> | <NA> | <NA> |
5 | 김 의원은 "좌파독재의 도구인 공수처법이 통과됐다"며 "문재인 좌파독재 정권에 헌법이 무참히 짓밟히는 현장을 무기력하게 지켜볼 수밖에 없었다, 참담하다"고 토로했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
6 | 한국당 의원 가운데 공수처법 저지 실패에 대한 책임을 지겠다며 총선 불출마를 선언한 것은 김 의원이 처음입니다. | <NA> | <NA> | 20200101071535 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
7 | MBN | 4023131 | mbn00006 | 2020 | 김정은 "미국, 시간 끌수록 북한 위력 앞에 속수무책 당할 것" | <!------------ PHOTO_POS_0 ------------> | <NA> | <NA> | <NA> | <NA> |
8 | 김정은 북한 국무위원장은 "미국이 시간을 끌면 끌수록, 조미(북미)관계의 결산을 주저하면 할수록 예측할 수 없이 강대해지는 조선민주주의인민공화국의 위력 앞에 속수무책으로 당할 수밖에 없게 되어있으며 더욱더 막다른 처지에 빠져들게 되어있다"고 말했다고 조선중앙방송이 1일 보도했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9 |
---|---|---|---|---|---|---|---|---|---|---|
144 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
145 | 신년사에서 협치를 외쳤던 문 의장은 결국 자신의 처지를 하소연하는 말로 한 해를 마무리해야 했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
146 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
147 | ▶ 인터뷰 : 문희상 / 국회의장 (지난달 27일) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
148 | - "문희상이는 하루에도 열두 번씩 요새 죽습니다. 이미 죽었어요." | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
149 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
150 | MBN뉴스 조창훈입니다. [ chang@mbn.co.kr ] | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
151 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
152 | 영상편집 : 송현주 | ,,,,,,,,, | 조창훈 | 20200101100425 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
153 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
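In the sample rows, article body paragraphs appear alone in the first column with every other field empty. One plausible cause, offered here as an assumption and not confirmed by the report, is that the source CSV holds multi-line article text that was split on raw newlines instead of being read as a single quoted field. The sketch below uses made-up data to contrast naive line splitting with Python's `csv` reader.

```python
import csv
import io

# Made-up two-line article body stored in one quoted CSV field.
raw = 'id,title,body\n1,"Headline","First paragraph.\nSecond paragraph."\n'

# Naive approach: every physical line becomes a "row", so the second
# paragraph spills into the first column of a row of its own.
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv.reader keeps the quoted field together, embedded newline included.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows), len(parsed_rows))  # 3 2
print(parsed_rows[1][2].splitlines())     # ['First paragraph.', 'Second paragraph.']
```

If the source file does not quote its multi-line fields at all, a reader like this cannot recover the rows and the export itself would need to be regenerated.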
Most frequently occurring
Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | WRT_DATE | # duplicates |
---|---|---|---|---|---|---|---|---|---|
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 55 |
0 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4 |
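The table above lists each repeated row pattern once, with the number of times it occurs. The overview figure of 2 duplicate rows (2 / 154 ≈ 1.3%) matches the number of such patterns. The sketch below shows that style of counting on toy data. It assumes the profiler counts distinct repeated patterns, which is inferred from this report, and `duplicate_patterns` is a hypothetical helper name.

```python
from collections import Counter


def duplicate_patterns(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows as tuples; None stands in for a missing cell.
rows = [(None,), ("a",), (None,), ("b",), ("a",), (None,)]
patterns = duplicate_patterns(rows)

print(patterns)            # {(None,): 3, ('a',): 2}
print(len(patterns))       # 2 distinct repeated patterns
```

Here most of the repetition comes from rows that are entirely `<NA>`, so dropping empty rows first would likely leave only the `【 기자 】` pattern.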