Overview

Dataset statistics

Number of variables: 10
Number of observations: 186
Missing cells: 1323
Missing cells (%): 71.1%
Duplicate rows: 2
Duplicate rows (%): 1.1%
Total size in memory: 15.4 KiB
Average record size in memory: 84.7 B
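
The missing-cell and duplicate-row percentages can be cross-checked against the per-column missing counts that this report gives for each variable. The short Python check below copies those counts from the report; it is only an arithmetic sanity check, not part of the profiling tool.

```python
# Per-column missing counts, copied from the variable summaries in this report.
missing_by_column = {
    "MBN_MDA_SP_CD": 77,
    "MDA_ART_ESSN_NO": 170,
    "MDA_CGR_NM": 0,
    "STD_YEAR": 167,
    "ART_SJ_CN": 176,
    "ART_CN": 176,
    "ATCH_IMG_NM": 185,
    "JRNL_NM": 186,
    "WRT_DATE": 0,
    "Unnamed: 9": 186,
}

n_rows, n_cols = 186, 10

# Missing cells (%) = missing cells / (rows * columns).
total_missing = sum(missing_by_column.values())
missing_pct = 100 * total_missing / (n_rows * n_cols)

# Duplicate rows (%) = duplicate rows / rows.
duplicate_pct = 100 * 2 / n_rows

print(total_missing, round(missing_pct, 1), round(duplicate_pct, 1))
```

MDA_CGR_NM and WRT_DATE count as fully populated here because their "<NA>" entries are treated as a category value rather than as missing cells.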

Variable types

Text: 5
Categorical: 2
Numeric: 1
Unsupported: 2

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/26951

Alerts

ATCH_IMG_NM has constant value "" [Constant]
Dataset has 2 (1.1%) duplicate rows [Duplicates]
STD_YEAR is highly overall correlated with MDA_CGR_NM [High correlation]
MDA_CGR_NM is highly overall correlated with STD_YEAR [High correlation]
MDA_CGR_NM is highly imbalanced (80.8%) [Imbalance]
WRT_DATE is highly imbalanced (95.2%) [Imbalance]
MBN_MDA_SP_CD has 77 (41.4%) missing values [Missing]
MDA_ART_ESSN_NO has 170 (91.4%) missing values [Missing]
STD_YEAR has 167 (89.8%) missing values [Missing]
ART_SJ_CN has 176 (94.6%) missing values [Missing]
ART_CN has 176 (94.6%) missing values [Missing]
ATCH_IMG_NM has 185 (99.5%) missing values [Missing]
JRNL_NM has 186 (100.0%) missing values [Missing]
Unnamed: 9 has 186 (100.0%) missing values [Missing]
JRNL_NM is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
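
The two imbalance percentages above are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below reproduces them from the counts listed under Common Values for each variable. That the profiling tool computes the score in exactly this way is an assumption; `imbalance_score` is a name chosen for this illustration.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch: one class has no imbalance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts copied from the Common Values tables of this report.
mda_cgr_nm_counts = [173, 10, 1, 1, 1]
wrt_date_counts = [185, 1]

print(f"MDA_CGR_NM: {imbalance_score(mda_cgr_nm_counts):.1%}")  # 80.8%
print(f"WRT_DATE:   {imbalance_score(wrt_date_counts):.1%}")    # 95.2%
```

A score near 100% means almost every row falls into one category, which here is the "<NA>" placeholder in both columns.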

Reproduction

Analysis started: 2023-12-11 21:13:13.948897
Analysis finished: 2023-12-11 21:13:15.356377
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
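
A report like this one could be regenerated roughly as follows. This is a minimal sketch, assuming the data is available as a CSV file. The file path, the report title, and the output name are hypothetical, and the encoding and delimiter of the original file are not stated in this report.

```python
# Dataset metadata as shown in the Dataset section of this report.
DATASET_META = {
    "description": "Sample data",
    "author": "MBN",
    "url": "https://kdx.kr/data/view/26951",
}


def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write the HTML report (paths are hypothetical)."""
    # Imported lazily so the metadata above can be inspected without
    # pandas or ydata-profiling installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="MBN sample data", dataset=DATASET_META)
    report.to_file(out_path)
    return out_path
```

Calling `build_report("mbn_sample.csv")` with a real path writes the report next to the script; the exact alerts may differ across ydata-profiling versions.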

Variables

MBN_MDA_SP_CD
Text

MISSING 

Distinct: 95
Distinct (%): 87.2%
Missing: 77
Missing (%): 41.4%
Memory size: 1.6 KiB

Length

Max length: 271
Median length: 119
Mean length: 81.394495
Min length: 3

Characters and Unicode

Total characters: 8872
Distinct characters: 549
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
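
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a short string assembled from tokens that occur in this column; the mapping from two-letter codes to the names used in this report covers only the categories needed here. Script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category, using readable names."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative string built from tokens seen in the column, not an actual row.
sample_counts = category_counts("MBN 트럼프 2020.")
print(sample_counts)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the counts for this column.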

Unique

Unique: 92
Unique (%): 84.4%

Sample

1st row: MBN
2nd row: 트럼프 대통령은 또 나중에 2단계 회담이 시작되는 베이징으로 갈 것이라고 밝혀 베이징에서 미중 정상회담이 개최될 것임을 시사했으나 시기 등 구체적인 언급은 하지 않았습니다.
3rd row: 중국은 농산물을 포함해 미국산 제품을 대규모로 구매하고, 미국은 추가 관세 부과를 철회하는 한편 기존 관세 중 일부 제품의 관세율을 낮추는 것이 1단계 합의의 주된 내용입니다.
4th row: MBN
5th row: 최근 미국 정부에 '크리스마스 선물'을 언급하며 군사 도발 불안감을 키웠던 북한이 전략무기개발을 계속하겠다고 밝혔습니다.
ValueCountFrequency (%)
30
 
1.5%
25
 
1.2%
트럼프 24
 
1.2%
24
 
1.2%
위원장이 17
 
0.8%
mbn 15
 
0.7%
북한이 13
 
0.6%
미국 13
 
0.6%
11
 
0.5%
것이라고 10
 
0.5%
Other values (1344) 1863
91.1%

Most occurring characters

ValueCountFrequency (%)
2017
 
22.7%
219
 
2.5%
205
 
2.3%
132
 
1.5%
. 115
 
1.3%
113
 
1.3%
110
 
1.2%
104
 
1.2%
102
 
1.1%
101
 
1.1%
Other values (539) 5654
63.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6182 | 69.7%
Space Separator | 2017 | 22.7%
Other Punctuation | 300 | 3.4%
Uppercase Letter | 150 | 1.7%
Decimal Number | 98 | 1.1%
Close Punctuation | 34 | 0.4%
Open Punctuation | 33 | 0.4%
Dash Punctuation | 26 | 0.3%
Lowercase Letter | 23 | 0.3%
Math Symbol | 5 | 0.1%
Other values (2) | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
219
 
3.5%
205
 
3.3%
132
 
2.1%
113
 
1.8%
110
 
1.8%
104
 
1.7%
102
 
1.6%
101
 
1.6%
98
 
1.6%
95
 
1.5%
Other values (474) 4903
79.3%
Uppercase Letter
ValueCountFrequency (%)
N 30
20.0%
M 26
17.3%
B 25
16.7%
C 16
10.7%
I 10
 
6.7%
P 9
 
6.0%
A 7
 
4.7%
T 7
 
4.7%
Y 4
 
2.7%
O 3
 
2.0%
Other values (8) 13
8.7%
Lowercase Letter
ValueCountFrequency (%)
r 3
13.0%
i 3
13.0%
o 3
13.0%
g 2
8.7%
k 2
8.7%
n 2
8.7%
a 2
8.7%
b 2
8.7%
c 1
 
4.3%
m 1
 
4.3%
Other values (2) 2
8.7%
Other Punctuation
ValueCountFrequency (%)
. 115
38.3%
" 69
23.0%
' 58
19.3%
, 40
 
13.3%
· 8
 
2.7%
% 3
 
1.0%
: 2
 
0.7%
/ 2
 
0.7%
& 1
 
0.3%
1
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 28
28.6%
2 25
25.5%
0 14
14.3%
3 8
 
8.2%
4 6
 
6.1%
5 5
 
5.1%
7 4
 
4.1%
9 3
 
3.1%
6 3
 
3.1%
8 2
 
2.0%
Close Punctuation
ValueCountFrequency (%)
) 25
73.5%
] 7
 
20.6%
2
 
5.9%
Open Punctuation
ValueCountFrequency (%)
( 24
72.7%
[ 7
 
21.2%
2
 
6.1%
Math Symbol
ValueCountFrequency (%)
> 2
40.0%
< 2
40.0%
= 1
20.0%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
2017
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6174 | 69.6%
Common | 2517 | 28.4%
Latin | 173 | 1.9%
Han | 8 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
219
 
3.5%
205
 
3.3%
132
 
2.1%
113
 
1.8%
110
 
1.8%
104
 
1.7%
102
 
1.7%
101
 
1.6%
98
 
1.6%
95
 
1.5%
Other values (466) 4895
79.3%
Common
ValueCountFrequency (%)
2017
80.1%
. 115
 
4.6%
" 69
 
2.7%
' 58
 
2.3%
, 40
 
1.6%
1 28
 
1.1%
- 26
 
1.0%
) 25
 
1.0%
2 25
 
1.0%
( 24
 
1.0%
Other values (25) 90
 
3.6%
Latin
ValueCountFrequency (%)
N 30
17.3%
M 26
15.0%
B 25
14.5%
C 16
9.2%
I 10
 
5.8%
P 9
 
5.2%
A 7
 
4.0%
T 7
 
4.0%
Y 4
 
2.3%
O 3
 
1.7%
Other values (20) 36
20.8%
Han
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6174 | 69.6%
ASCII | 2675 | 30.2%
None | 12 | 0.1%
CJK | 8 | 0.1%
Geometric Shapes | 1 | < 0.1%
Enclosed Alphanum | 1 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2017
75.4%
. 115
 
4.3%
" 69
 
2.6%
' 58
 
2.2%
, 40
 
1.5%
N 30
 
1.1%
1 28
 
1.0%
M 26
 
1.0%
- 26
 
1.0%
) 25
 
0.9%
Other values (49) 241
 
9.0%
Hangul
ValueCountFrequency (%)
219
 
3.5%
205
 
3.3%
132
 
2.1%
113
 
1.8%
110
 
1.8%
104
 
1.7%
102
 
1.7%
101
 
1.6%
98
 
1.6%
95
 
1.5%
Other values (466) 4895
79.3%
None
ValueCountFrequency (%)
· 8
66.7%
2
 
16.7%
2
 
16.7%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

MDA_ART_ESSN_NO
Text

MISSING 

Distinct: 12
Distinct (%): 75.0%
Missing: 170
Missing (%): 91.4%
Memory size: 1.6 KiB

Length

Max length: 76
Median length: 7
Mean length: 11.9375
Min length: 7

Characters and Unicode

Total characters: 191
Distinct characters: 32
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 68.8%

Sample

1st row: 4023133
2nd row: 4023138
3rd row: 4023166
4th row: ,,,,,,,,,
5th row: 4023232
ValueCountFrequency (%)
5
31.2%
4023133 1
 
6.2%
4023138 1
 
6.2%
4023166 1
 
6.2%
4023232 1
 
6.2%
4023246 1
 
6.2%
4023278 1
 
6.2%
4023297 1
 
6.2%
http://img.mbn.co.kr/filewww/news/other/2020/01/01/032001212223.jpg 1
 
6.2%
4023298 1
 
6.2%
Other values (2) 2
 
12.5%

Most occurring characters

ValueCountFrequency (%)
, 54
28.3%
2 25
13.1%
0 18
 
9.4%
3 16
 
8.4%
4 12
 
6.3%
/ 9
 
4.7%
1 7
 
3.7%
. 4
 
2.1%
w 4
 
2.1%
8 3
 
1.6%
Other values (22) 39
20.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90 | 47.1%
Other Punctuation | 68 | 35.6%
Lowercase Letter | 33 | 17.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 4
12.1%
e 3
 
9.1%
t 3
 
9.1%
g 2
 
6.1%
r 2
 
6.1%
o 2
 
6.1%
n 2
 
6.1%
m 2
 
6.1%
i 2
 
6.1%
p 2
 
6.1%
Other values (8) 9
27.3%
Decimal Number
ValueCountFrequency (%)
2 25
27.8%
0 18
20.0%
3 16
17.8%
4 12
13.3%
1 7
 
7.8%
8 3
 
3.3%
6 3
 
3.3%
9 2
 
2.2%
7 2
 
2.2%
5 2
 
2.2%
Other Punctuation
ValueCountFrequency (%)
, 54
79.4%
/ 9
 
13.2%
. 4
 
5.9%
: 1
 
1.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 158 | 82.7%
Latin | 33 | 17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 4
12.1%
e 3
 
9.1%
t 3
 
9.1%
g 2
 
6.1%
r 2
 
6.1%
o 2
 
6.1%
n 2
 
6.1%
m 2
 
6.1%
i 2
 
6.1%
p 2
 
6.1%
Other values (8) 9
27.3%
Common
ValueCountFrequency (%)
, 54
34.2%
2 25
15.8%
0 18
 
11.4%
3 16
 
10.1%
4 12
 
7.6%
/ 9
 
5.7%
1 7
 
4.4%
. 4
 
2.5%
8 3
 
1.9%
6 3
 
1.9%
Other values (4) 7
 
4.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 191 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 54
28.3%
2 25
13.1%
0 18
 
9.4%
3 16
 
8.4%
4 12
 
6.3%
/ 9
 
4.7%
1 7
 
3.7%
. 4
 
2.1%
w 4
 
2.1%
8 3
 
1.6%
Other values (22) 39
20.4%

MDA_CGR_NM
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
<NA>: 173
mbn00008: 10
이혁준: 1
정욱: 1
임성재: 1
1

Length

Max length: 8
Median length: 4
Mean length: 4.1935484
Min length: 2

Unique

Unique: 3
Unique (%): 1.6%

Sample

1st row: <NA>
2nd row: mbn00008
3rd row: <NA>
4th row: <NA>
5th row: mbn00008

Common Values

Value | Count | Frequency (%)
<NA> | 173 | 93.0%
mbn00008 | 10 | 5.4%
이혁준 | 1 | 0.5%
정욱 | 1 | 0.5%
임성재 | 1 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 173 | 93.0%
mbn00008 | 10 | 5.4%
이혁준 | 1 | 0.5%
정욱 | 1 | 0.5%
임성재 | 1 | 0.5%

STD_YEAR
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 52.6%
Missing: 167
Missing (%): 89.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.568469 × 10^12
Minimum: 2020
Maximum: 2.0200102 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 2020
5-th percentile: 2020
Q1: 2020
Median: 2020
Q3: 2.0200101 × 10^13
95-th percentile: 2.0200101 × 10^13
Maximum: 2.0200102 × 10^13
Range: 2.0200102 × 10^13
Interquartile range (IQR): 2.0200101 × 10^13

Descriptive statistics

Standard deviation: 1.0362433 × 10^13
Coefficient of variation (CV): 1.0829771
Kurtosis: -2.2352941
Mean: 9.568469 × 10^12
Median Absolute Deviation (MAD): 0
Skewness: 0.11466817
Sum: 1.8180091 × 10^14
Variance: 1.0738002 × 10^26
Monotonicity: Not monotonic
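
These statistics can be recomputed from the 19 non-missing values that the frequency table for this variable lists: ten occurrences of 2020 and nine 14-digit values. The check below uses only Python's `statistics` module. It shows that the mean and standard deviation are driven almost entirely by the mix of two magnitudes, so they say little about a "year" column. The reported standard deviation matches the sample (n - 1) estimator.

```python
import statistics

# The nine 14-digit values, copied from the frequency table for STD_YEAR.
long_values = [
    20200101072035, 20200101080035, 20200101092438,
    20200101111007, 20200101122349, 20200101133332,
    20200101133442, 20200101193035, 20200102072234,
]
values = [2020] * 10 + long_values  # 19 non-missing observations

total = sum(values)
mean = total / len(values)
stdev = statistics.stdev(values)          # sample standard deviation (n - 1)
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(f"sum={total:.7e} mean={mean:.6e} stdev={stdev:.7e}")
print(f"median={median} MAD={mad} CV={stdev / mean:.7f}")
```

Because more than half of the observations equal 2020, both the median and the median absolute deviation collapse to that value and to 0 respectively.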
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
2020 | 10 | 5.4%
20200101072035 | 1 | 0.5%
20200101080035 | 1 | 0.5%
20200101092438 | 1 | 0.5%
20200101111007 | 1 | 0.5%
20200101122349 | 1 | 0.5%
20200101133332 | 1 | 0.5%
20200101133442 | 1 | 0.5%
20200101193035 | 1 | 0.5%
20200102072234 | 1 | 0.5%
(Missing) | 167 | 89.8%
Value | Count | Frequency (%)
2020 | 10 | 5.4%
20200101072035 | 1 | 0.5%
20200101080035 | 1 | 0.5%
20200101092438 | 1 | 0.5%
20200101111007 | 1 | 0.5%
20200101122349 | 1 | 0.5%
20200101133332 | 1 | 0.5%
20200101133442 | 1 | 0.5%
20200101193035 | 1 | 0.5%
20200102072234 | 1 | 0.5%
Value | Count | Frequency (%)
20200102072234 | 1 | 0.5%
20200101193035 | 1 | 0.5%
20200101133442 | 1 | 0.5%
20200101133332 | 1 | 0.5%
20200101122349 | 1 | 0.5%
20200101111007 | 1 | 0.5%
20200101092438 | 1 | 0.5%
20200101080035 | 1 | 0.5%
20200101072035 | 1 | 0.5%
2020 | 10 | 5.4%
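
The 14-digit values look like timestamps in YYYYMMDDHHMMSS form rather than years, which would fit a write-date field whose values landed in this column. That reading is an assumption, not something the report states. A minimal sketch that separates the two kinds of value, with `classify_std_year` as a hypothetical helper name:

```python
from datetime import datetime


def classify_std_year(value):
    """Classify a STD_YEAR value as a 4-digit year or a 14-digit timestamp.

    Reading 14-digit values as YYYYMMDDHHMMSS is an assumption of this sketch.
    """
    text = str(value)
    if len(text) == 4 and text.isdigit():
        return ("year", int(text))
    if len(text) == 14 and text.isdigit():
        return ("timestamp", datetime.strptime(text, "%Y%m%d%H%M%S"))
    raise ValueError(f"unexpected STD_YEAR value: {value!r}")


print(classify_std_year(2020))
print(classify_std_year(20200101072035))
```

Splitting the column this way before profiling would keep the numeric statistics for the year values meaningful.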

ART_SJ_CN
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 176
Missing (%): 94.6%
Memory size: 1.6 KiB

Length

Max length: 43
Median length: 33.5
Mean length: 32
Min length: 15

Characters and Unicode

Total characters: 320
Distinct characters: 143
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row: 트럼프 "미·중 1단계 무역합의, 1월15일 서명"
2nd row: 김정은 "전략무기개발 계속…실제 행동 넘어갈 것"
3rd row: 폼페이오, 김정은에 경고 "새 무기 목격하게 될 것...'옳은 결정' 바란다"
4th row: 아베, 새해 첫 소감 '평화헌법' 개정 추진 뜻 거듭 밝혀
5th row: [속보] 트럼프 "김정은과 좋은 관계…김정은 약속 지킬 것"<로이터>
ValueCountFrequency (%)
트럼프 4
 
5.1%
김정은 3
 
3.8%
2
 
2.6%
2
 
2.6%
약속 2
 
2.6%
선물 1
 
1.3%
꽃병이길"...비핵화 1
 
1.3%
북한 1
 
1.3%
이행 1
 
1.3%
낙관 1
 
1.3%
Other values (60) 60
76.9%

Most occurring characters

ValueCountFrequency (%)
68
 
21.2%
" 14
 
4.4%
8
 
2.5%
8
 
2.5%
6
 
1.9%
, 6
 
1.9%
. 6
 
1.9%
5
 
1.6%
4
 
1.2%
4
 
1.2%
Other values (133) 191
59.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 200 | 62.5%
Space Separator | 68 | 21.2%
Other Punctuation | 37 | 11.6%
Uppercase Letter | 7 | 2.2%
Decimal Number | 4 | 1.2%
Math Symbol | 2 | 0.6%
Close Punctuation | 1 | 0.3%
Open Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
4.0%
8
 
4.0%
6
 
3.0%
5
 
2.5%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
3
 
1.5%
Other values (115) 151
75.5%
Other Punctuation
ValueCountFrequency (%)
" 14
37.8%
, 6
16.2%
. 6
16.2%
4
 
10.8%
' 4
 
10.8%
· 3
 
8.1%
Uppercase Letter
ValueCountFrequency (%)
C 2
28.6%
N 2
28.6%
M 1
14.3%
B 1
14.3%
I 1
14.3%
Decimal Number
ValueCountFrequency (%)
1 3
75.0%
5 1
 
25.0%
Math Symbol
ValueCountFrequency (%)
> 1
50.0%
< 1
50.0%
Space Separator
ValueCountFrequency (%)
68
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 200 | 62.5%
Common | 113 | 35.3%
Latin | 7 | 2.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8
 
4.0%
8
 
4.0%
6
 
3.0%
5
 
2.5%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
3
 
1.5%
Other values (115) 151
75.5%
Common
ValueCountFrequency (%)
68
60.2%
" 14
 
12.4%
, 6
 
5.3%
. 6
 
5.3%
4
 
3.5%
' 4
 
3.5%
· 3
 
2.7%
1 3
 
2.7%
> 1
 
0.9%
5 1
 
0.9%
Other values (3) 3
 
2.7%
Latin
ValueCountFrequency (%)
C 2
28.6%
N 2
28.6%
M 1
14.3%
B 1
14.3%
I 1
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 200 | 62.5%
ASCII | 113 | 35.3%
Punctuation | 4 | 1.2%
None | 3 | 0.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
68
60.2%
" 14
 
12.4%
, 6
 
5.3%
. 6
 
5.3%
' 4
 
3.5%
1 3
 
2.7%
C 2
 
1.8%
N 2
 
1.8%
M 1
 
0.9%
B 1
 
0.9%
Other values (6) 6
 
5.3%
Hangul
ValueCountFrequency (%)
8
 
4.0%
8
 
4.0%
6
 
3.0%
5
 
2.5%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
3
 
1.5%
Other values (115) 151
75.5%
Punctuation
ValueCountFrequency (%)
4
100.0%
None
ValueCountFrequency (%)
· 3
100.0%

ART_CN
Text

MISSING 

Distinct: 5
Distinct (%): 50.0%
Missing: 176
Missing (%): 94.6%
Memory size: 1.6 KiB

Length

Max length: 107
Median length: 94
Mean length: 43.7
Min length: 8

Characters and Unicode

Total characters: 437
Distinct characters: 112
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 30.0%

Sample

1st row: 트럼프 미국 대통령이 오는 1월15일 백악관에서 중국과의 매우 크고 포괄적인 1단계 무역합의에 서명할 것이라고 트위터 계정을 통해 밝혔습니다.
2nd row: 【 앵커멘트 】
3rd row: <!------------ PHOTO_POS_0 ------------>
4th row: <!------------ PHOTO_POS_0 ------------>
5th row: 트럼프 "김정은과 좋은 관계…김정은 약속 지킬 것"<로이터>
ValueCountFrequency (%)
12
 
18.2%
photo_pos_0 6
 
9.1%
2
 
3.0%
앵커멘트 2
 
3.0%
2
 
3.0%
트럼프 2
 
3.0%
1
 
1.5%
것"<로이터 1
 
1.5%
일본 1
 
1.5%
사법당국의 1
 
1.5%
Other values (36) 36
54.5%

Most occurring characters

ValueCountFrequency (%)
- 144
33.0%
59
 
13.5%
O 18
 
4.1%
P 12
 
2.7%
_ 12
 
2.7%
< 7
 
1.6%
> 7
 
1.6%
H 6
 
1.4%
T 6
 
1.4%
S 6
 
1.4%
Other values (102) 160
36.6%

Most occurring categories

Value | Count | Frequency (%)
Dash Punctuation | 144 | 33.0%
Other Letter | 135 | 30.9%
Space Separator | 59 | 13.5%
Uppercase Letter | 48 | 11.0%
Math Symbol | 14 | 3.2%
Connector Punctuation | 12 | 2.7%
Other Punctuation | 11 | 2.5%
Decimal Number | 10 | 2.3%
Open Punctuation | 2 | 0.5%
Close Punctuation | 2 | 0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
3.7%
4
 
3.0%
4
 
3.0%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
2
 
1.5%
2
 
1.5%
Other values (83) 103
76.3%
Uppercase Letter
ValueCountFrequency (%)
O 18
37.5%
P 12
25.0%
H 6
 
12.5%
T 6
 
12.5%
S 6
 
12.5%
Other Punctuation
ValueCountFrequency (%)
! 6
54.5%
. 2
 
18.2%
" 2
 
18.2%
1
 
9.1%
Decimal Number
ValueCountFrequency (%)
0 6
60.0%
1 3
30.0%
5 1
 
10.0%
Math Symbol
ValueCountFrequency (%)
< 7
50.0%
> 7
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 144
100.0%
Space Separator
ValueCountFrequency (%)
59
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 12
100.0%
Open Punctuation
ValueCountFrequency (%)
2
100.0%
Close Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 254 | 58.1%
Hangul | 135 | 30.9%
Latin | 48 | 11.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
3.7%
4
 
3.0%
4
 
3.0%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
2
 
1.5%
2
 
1.5%
Other values (83) 103
76.3%
Common
ValueCountFrequency (%)
- 144
56.7%
59
23.2%
_ 12
 
4.7%
< 7
 
2.8%
> 7
 
2.8%
0 6
 
2.4%
! 6
 
2.4%
1 3
 
1.2%
2
 
0.8%
. 2
 
0.8%
Other values (4) 6
 
2.4%
Latin
ValueCountFrequency (%)
O 18
37.5%
P 12
25.0%
H 6
 
12.5%
T 6
 
12.5%
S 6
 
12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 297 | 68.0%
Hangul | 135 | 30.9%
None | 4 | 0.9%
Punctuation | 1 | 0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 144
48.5%
59
19.9%
O 18
 
6.1%
P 12
 
4.0%
_ 12
 
4.0%
< 7
 
2.4%
> 7
 
2.4%
H 6
 
2.0%
T 6
 
2.0%
S 6
 
2.0%
Other values (6) 20
 
6.7%
Hangul
ValueCountFrequency (%)
5
 
3.7%
4
 
3.0%
4
 
3.0%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
2
 
1.5%
2
 
1.5%
Other values (83) 103
76.3%
None
ValueCountFrequency (%)
2
50.0%
2
50.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

ATCH_IMG_NM
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 185
Missing (%): 99.5%
Memory size: 1.6 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: ,,,,,,,,,
ValueCountFrequency (%)
1
100.0%

Most occurring characters

ValueCountFrequency (%)
, 9
100.0%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 9
100.0%

Most frequent character per category

Other Punctuation
ValueCountFrequency (%)
, 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
, 9
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 9
100.0%

JRNL_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 186
Missing (%): 100.0%
Memory size: 1.8 KiB

WRT_DATE
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
<NA>: 185
20200101113443: 1

Length

Max length14
Median length4
Mean length4.0537634
Min length4

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 185 | 99.5%
20200101113443 | 1 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 185 | 99.5%
20200101113443 | 1 | 0.5%

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 186
Missing (%): 100.0%
Memory size: 1.8 KiB

Interactions


Correlations

Variable | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN
MBN_MDA_SP_CD | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN
MDA_ART_ESSN_NO | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
MDA_CGR_NM | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN
STD_YEAR | NaN | NaN | NaN | 1.000 | NaN | NaN
ART_SJ_CN | NaN | 1.000 | NaN | NaN | 1.000 | 1.000
ART_CN | NaN | 1.000 | NaN | NaN | 1.000 | 1.000
Variable | MDA_CGR_NM | WRT_DATE
MDA_CGR_NM | 1.000 | NaN
WRT_DATE | NaN | 1.000
Variable | STD_YEAR | MDA_CGR_NM | WRT_DATE
STD_YEAR | 1.000 | 0.905 | NaN
MDA_CGR_NM | 0.905 | 1.000 | NaN
WRT_DATE | NaN | NaN | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
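
Nullity correlation can be illustrated as the Pearson correlation between the present/absent indicators of two columns. The sketch below uses small hypothetical columns, not rows from this dataset, and plain Python rather than the plotting library behind the heatmap.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 'value present' indicators of two columns.

    None stands for a missing value. Returns NaN when either column is
    entirely present or entirely missing (zero variance).
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return math.nan
    return cov / math.sqrt(var_a * var_b)


# Hypothetical columns: title and body are missing together, note is the opposite.
title = ["t1", None, "t2", None]
body = ["b1", None, "b2", None]
note = [None, "n1", None, "n2"]

print(nullity_correlation(title, body))  # 1.0
print(nullity_correlation(title, note))  # -1.0
```

A value of 1 means two columns are always present or missing together, and -1 means one is present exactly when the other is missing.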

Sample

Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | MBN | 4023133 | mbn00008 | 2020 | 트럼프 "미·중 1단계 무역합의, 1월15일 서명" | 트럼프 미국 대통령이 오는 1월15일 백악관에서 중국과의 매우 크고 포괄적인 1단계 무역합의에 서명할 것이라고 트위터 계정을 통해 밝혔습니다. | <NA> | <NA> | <NA> | <NA>
2 | 트럼프 대통령은 또 나중에 2단계 회담이 시작되는 베이징으로 갈 것이라고 밝혀 베이징에서 미중 정상회담이 개최될 것임을 시사했으나 시기 등 구체적인 언급은 하지 않았습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 중국은 농산물을 포함해 미국산 제품을 대규모로 구매하고, 미국은 추가 관세 부과를 철회하는 한편 기존 관세 중 일부 제품의 관세율을 낮추는 것이 1단계 합의의 주된 내용입니다. | <NA> | <NA> | 20200101072035 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | MBN | 4023138 | mbn00008 | 2020 | 김정은 "전략무기개발 계속…실제 행동 넘어갈 것" | 【 앵커멘트 】 | <NA> | <NA> | <NA> | <NA>
5 | 최근 미국 정부에 '크리스마스 선물'을 언급하며 군사 도발 불안감을 키웠던 북한이 전략무기개발을 계속하겠다고 밝혔습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 김정은 국무위원장은 제재에 따른 조치라는 단서를 달았지만, 실제 행동하겠다고도 언급했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 이혁준 기자입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9
176 | 중국의 대북 영향력이 미 전문가들이 생각하는 수준보다는 훨씬 더 제한적일 수 있지만, 그럼에도 북의 무기 시험발사를 자제시키는데 있어 중국의 외교적 노력이 성공을 거둬왔다는 것입니다. 북중간 이해관계 등을 제대로 파악해 대처해야 한다는 취지로 보입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
177 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
178 | 로버트 매닝 애틀랜틱 카운슬 선임연구원은 더힐에 기고한 '2020년에 주시할 6가지 최고 위험'이라는 글에서 북한 문제를 그중 하나로 꼽고 "이는 이제 형태를 갖추고 있는 새로운 핵 시대의 가장 당면한 도전과제'라고 밝혔습니다. 이어 "추가 북미정상회담 개최 여부와 상관없이 북한이 전략 무기 개발을 지속하면서 2020년 긴장이 고조될 가능성이 높다"고 말했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
179 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
180 | 그러면서 "표면적으로는 외교가 지속될지 모르지만, 김정은은 '전략적 결정'에 조금씩 가까이 다가갈 것이다. 김정은의 목표는 미사일과 핵무기를 향상하면서 시간을 버는 것"이라며 김 위원장의 '새로운 길' 위협에도 전쟁은 여전히 상상도 할 수 없는 파멸적 선택인 만큼 상호 억지가 유지돼야 한다고 말했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
181 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
182 | 이어 다음 단계는 핵을 보유한 북한과 어떻게 함께 살아가느냐의 문제일 수 있다고 지적하기도 했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
183 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
184 | [MBN 온라인뉴스팀] | ,,,,,,,,, | <NA> | 20200102072234 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
185 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
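
The sample suggests that multi-line article bodies were split across rows when the file was read: a row starting with "MBN" opens a record, and the following rows carry body text in the first column only. Under that assumption, which the report itself does not confirm, continuation rows could be regrouped as sketched below. `regroup` is a hypothetical helper, and the input is a few (first column, second column) pairs copied from the sample, with other rows omitted.

```python
def regroup(rows):
    """Group continuation rows under the record that precedes them.

    rows: list of (first_col, second_col) pairs; None stands for <NA>.
    Assumes a record starts where the first column equals "MBN".
    """
    records = []
    for first, second in rows:
        if first == "MBN":
            records.append({"article_no": second, "body": []})
        elif first is not None and records:
            records[-1]["body"].append(first)
        # Fully empty rows are skipped.
    return records


rows = [
    (None, None),
    ("MBN", "4023133"),
    ("MBN", "4023138"),
    ("김정은 국무위원장은 제재에 따른 조치라는 단서를 달았지만, 실제 행동하겠다고도 언급했습니다.", None),
    ("이혁준 기자입니다.", None),
    (None, None),
]

records = regroup(rows)
for record in records:
    print(record["article_no"], len(record["body"]))
```

If the source file is a CSV with unquoted line breaks inside the body field, fixing the export or the quoting at read time would be preferable to regrouping after the fact.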

Duplicate rows

Most frequently occurring

Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | WRT_DATE | # duplicates
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 77
0 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2