Overview

Dataset statistics

Number of variables: 10
Number of observations: 192
Missing cells: 1704
Missing cells (%): 88.8%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 16.1 KiB
Average record size in memory: 85.7 B

Variable types

Text: 5
Numeric: 1
Unsupported: 4

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/26952

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
MBN_MDA_SP_CD has 50 (26.0%) missing values [Missing]
MDA_ART_ESSN_NO has 176 (91.7%) missing values [Missing]
MDA_CGR_NM has 174 (90.6%) missing values [Missing]
STD_YEAR has 172 (89.6%) missing values [Missing]
ART_SJ_CN has 182 (94.8%) missing values [Missing]
ART_CN has 182 (94.8%) missing values [Missing]
ATCH_IMG_NM has 192 (100.0%) missing values [Missing]
JRNL_NM has 192 (100.0%) missing values [Missing]
WRT_DATE has 192 (100.0%) missing values [Missing]
Unnamed: 9 has 192 (100.0%) missing values [Missing]
ATCH_IMG_NM is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
JRNL_NM is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
WRT_DATE is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
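
The per-column missing counts in the alerts can be cross-checked against the overview figures. The following is a minimal sketch in plain Python that uses only numbers quoted in this report; the variable names are illustrative and not part of the profiler's output.

```python
# Missing-value counts per column, copied from the alerts in this report.
missing_per_column = {
    "MBN_MDA_SP_CD": 50,
    "MDA_ART_ESSN_NO": 176,
    "MDA_CGR_NM": 174,
    "STD_YEAR": 172,
    "ART_SJ_CN": 182,
    "ART_CN": 182,
    "ATCH_IMG_NM": 192,
    "JRNL_NM": 192,
    "WRT_DATE": 192,
    "Unnamed: 9": 192,
}
n_rows = 192  # number of observations

total_cells = n_rows * len(missing_per_column)    # 192 * 10 = 1920
missing_cells = sum(missing_per_column.values())  # 1704
missing_pct = 100 * missing_cells / total_cells   # 88.75, reported as 88.8%

# Columns with no values at all (the ones flagged as Unsupported).
empty_columns = [name for name, n in missing_per_column.items() if n == n_rows]

print(missing_cells, f"{missing_pct:.2f}%", empty_columns)
```

The four fully empty columns are the same four that the report lists as Unsupported, which suggests the type could not be inferred simply because there is nothing to infer it from.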

Reproduction

Analysis started: 2023-12-11 21:03:40.276899
Analysis finished: 2023-12-11 21:03:42.510278
Duration: 2.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

MBN_MDA_SP_CD
Text

MISSING 

Distinct: 126
Distinct (%): 88.7%
Missing: 50
Missing (%): 26.0%
Memory size: 1.6 KiB

Length

Max length: 480
Median length: 144.5
Mean length: 90.669014
Min length: 3

Characters and Unicode

Total characters: 12875
Distinct characters: 688
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
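
As an illustration of how such per-character properties can be tallied, here is a minimal sketch using Python's standard `unicodedata` module. It covers general categories only (the standard library does not expose scripts or blocks), the helper name `category_counts` is hypothetical, and the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Ll": "Lowercase Letter", "Lu": "Uppercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator", "Po": "Other Punctuation",
    "Pd": "Dash Punctuation", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation", "Pi": "Initial Punctuation",
    "Pf": "Final Punctuation", "Sm": "Math Symbol", "So": "Other Symbol",
    "Sk": "Modifier Symbol",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category (long names)."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return Counter({CATEGORY_NAMES.get(code, code): n for code, n in codes.items()})

# Hangul syllables fall under "Other Letter", which is why that category
# dominates the Korean text columns in this report.
print(category_counts("GV80, 1월"))
```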

Unique

Unique: 123
Unique (%): 86.6%

Sample

1st rowMBN
2nd row수도권 남부 중 수원시는 교통의 요충지이자 도청이 자리한 도시로 발전 중입니다. 수원을 중심으로 화성, 오산, 평택 등 신도시의 개발과 신규 아파트가 들어서면서 역세권을 중심으로 희소성이 있습니다. 수원역과 고색역을 동시에 누리며 주변 시세 대비 저렴한 공급가로 실거주자의 인기가 높은 ‘남수원 이지더원’이 막바지 특별 모집 중입니다.
3rd row‘남수원 이지더원’은 경기도 화성시 배양동 일대로 총 1014세대의 대단지로 조성됩니다. 선호도 높은 중소형 평형위주로 전용면적 59㎡, 69㎡, 84㎡로 선택의 폭이 넓습니다.
4th row전 세대 남향위주의 배치와 3천여평의 공원을 조성해 일조량 및 채광이 우수합니다. 테마정원, 어린이놀이터, 주민운동시설 등 입주민을 위한 힐링 라이프를 선사합니다. 드레스룸, 알파룸, 현관수납장 등 수납공간을 넓히고 일부 타입의 경우 1+1 세대분리형으로 2세대가 거주가능합니다.
5th row북카페, 사우나, 피트니스센터, 배드민턴&탁구 등 다양한 커뮤니티시설을 도입합니다. 남수원 골프장 조망권과 숲세권을 누릴 수 있습니다. 황구지천 수변공원을 도보로 이용가능하며 단지 내 다양한 어린이놀이터, 산책로, 작은공원 등 설계했습니다.
ValueCountFrequency (%)
23
 
0.8%
21
 
0.7%
19
 
0.7%
19
 
0.7%
통해 18
 
0.6%
있는 13
 
0.5%
mbn 12
 
0.4%
제네시스 11
 
0.4%
것으로 11
 
0.4%
대형 10
 
0.4%
Other values (2050) 2680
94.5%

Most occurring characters

ValueCountFrequency (%)
2756
 
21.4%
231
 
1.8%
225
 
1.7%
. 200
 
1.6%
162
 
1.3%
161
 
1.3%
152
 
1.2%
136
 
1.1%
136
 
1.1%
135
 
1.0%
Other values (678) 8581
66.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8564 | 66.5%
Space Separator | 2756 | 21.4%
Other Punctuation | 438 | 3.4%
Decimal Number | 351 | 2.7%
Lowercase Letter | 274 | 2.1%
Uppercase Letter | 224 | 1.7%
Dash Punctuation | 84 | 0.7%
Open Punctuation | 63 | 0.5%
Close Punctuation | 63 | 0.5%
Math Symbol | 25 | 0.2%
Other values (4) | 33 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
231
 
2.7%
225
 
2.6%
162
 
1.9%
161
 
1.9%
152
 
1.8%
136
 
1.6%
136
 
1.6%
135
 
1.6%
124
 
1.4%
115
 
1.3%
Other values (591) 6987
81.6%
Lowercase Letter
ValueCountFrequency (%)
r 30
10.9%
e 29
10.6%
i 27
9.9%
o 23
 
8.4%
a 19
 
6.9%
n 18
 
6.6%
t 18
 
6.6%
s 15
 
5.5%
c 15
 
5.5%
k 14
 
5.1%
Other values (14) 66
24.1%
Uppercase Letter
ValueCountFrequency (%)
V 29
12.9%
S 28
12.5%
M 18
 
8.0%
U 18
 
8.0%
N 17
 
7.6%
G 16
 
7.1%
B 16
 
7.1%
C 13
 
5.8%
P 9
 
4.0%
O 9
 
4.0%
Other values (11) 51
22.8%
Other Punctuation
ValueCountFrequency (%)
. 200
45.7%
, 87
19.9%
' 50
 
11.4%
" 40
 
9.1%
% 24
 
5.5%
: 8
 
1.8%
· 8
 
1.8%
& 8
 
1.8%
/ 5
 
1.1%
! 4
 
0.9%
Other values (2) 4
 
0.9%
Decimal Number
ValueCountFrequency (%)
0 85
24.2%
1 72
20.5%
2 56
16.0%
3 40
11.4%
8 22
 
6.3%
5 21
 
6.0%
4 18
 
5.1%
9 14
 
4.0%
7 13
 
3.7%
6 10
 
2.8%
Other Symbol
ValueCountFrequency (%)
7
33.3%
7
33.3%
3
14.3%
2
 
9.5%
2
 
9.5%
Math Symbol
ValueCountFrequency (%)
< 10
40.0%
> 10
40.0%
+ 3
 
12.0%
~ 2
 
8.0%
Open Punctuation
ValueCountFrequency (%)
( 46
73.0%
[ 16
 
25.4%
1
 
1.6%
Close Punctuation
ValueCountFrequency (%)
) 46
73.0%
] 16
 
25.4%
1
 
1.6%
Space Separator
ValueCountFrequency (%)
2756
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 84
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 6
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8562 | 66.5%
Common | 3813 | 29.6%
Latin | 498 | 3.9%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
231
 
2.7%
225
 
2.6%
162
 
1.9%
161
 
1.9%
152
 
1.8%
136
 
1.6%
136
 
1.6%
135
 
1.6%
124
 
1.4%
115
 
1.3%
Other values (589) 6985
81.6%
Latin
ValueCountFrequency (%)
r 30
 
6.0%
V 29
 
5.8%
e 29
 
5.8%
S 28
 
5.6%
i 27
 
5.4%
o 23
 
4.6%
a 19
 
3.8%
M 18
 
3.6%
n 18
 
3.6%
U 18
 
3.6%
Other values (35) 259
52.0%
Common
ValueCountFrequency (%)
2756
72.3%
. 200
 
5.2%
, 87
 
2.3%
0 85
 
2.2%
- 84
 
2.2%
1 72
 
1.9%
2 56
 
1.5%
' 50
 
1.3%
( 46
 
1.2%
) 46
 
1.2%
Other values (32) 331
 
8.7%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8562 | 66.5%
ASCII | 4273 | 33.2%
None | 10 | 0.1%
Geometric Shapes | 9 | 0.1%
Enclosed Alphanum | 7 | 0.1%
Punctuation | 7 | 0.1%
CJK Compat | 5 | < 0.1%
CJK | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2756
64.5%
. 200
 
4.7%
, 87
 
2.0%
0 85
 
2.0%
- 84
 
2.0%
1 72
 
1.7%
2 56
 
1.3%
' 50
 
1.2%
( 46
 
1.1%
) 46
 
1.1%
Other values (66) 791
 
18.5%
Hangul
ValueCountFrequency (%)
231
 
2.7%
225
 
2.6%
162
 
1.9%
161
 
1.9%
152
 
1.8%
136
 
1.6%
136
 
1.6%
135
 
1.6%
124
 
1.4%
115
 
1.3%
Other values (589) 6985
81.6%
None
ValueCountFrequency (%)
· 8
80.0%
1
 
10.0%
1
 
10.0%
Geometric Shapes
ValueCountFrequency (%)
7
77.8%
2
 
22.2%
Enclosed Alphanum
ValueCountFrequency (%)
7
100.0%
Punctuation
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%
CJK Compat
ValueCountFrequency (%)
3
60.0%
2
40.0%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%

MDA_ART_ESSN_NO
Text

MISSING 

Distinct: 16
Distinct (%): 100.0%
Missing: 176
Missing (%): 91.7%
Memory size: 1.6 KiB

Length

Max length: 210
Median length: 7
Mean length: 37.375
Min length: 7

Characters and Unicode

Total characters: 598
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 100.0%

Sample

1st row4022298
2nd rowhttp://img.mbn.co.kr/filewww/news/2019/12/31/15777540505e0a9dc2d9285.jpg,,,,,,,,,
3rd row4023158
4th rowhttp://img.mbn.co.kr/filewww/news/other/2020/01/01/121021909010.jpg,http://img.mbn.co.kr/filewww/news/other/2020/01/01/000001012910.jpg,http://img.mbn.co.kr/filewww/news/other/2020/01/01/110901021202.jpg,,,,,,,
5th row4023174
ValueCountFrequency (%)
4022298 1
 
6.2%
http://img.mbn.co.kr/filewww/news/2019/12/31/15777540505e0a9dc2d9285.jpg 1
 
6.2%
4023158 1
 
6.2%
http://img.mbn.co.kr/filewww/news/other/2020/01/01/121021909010.jpg,http://img.mbn.co.kr/filewww/news/other/2020/01/01/000001012910.jpg,http://img.mbn.co.kr/filewww/news/other/2020/01/01/110901021202.jpg 1
 
6.2%
4023174 1
 
6.2%
1
 
6.2%
4023201 1
 
6.2%
http://img.mbn.co.kr/filewww/news/other/2020/01/01/012050025012.jpg 1
 
6.2%
4023233 1
 
6.2%
http://img.mbn.co.kr/filewww/news/other/2020/01/01/522001053151.jpg 1
 
6.2%
Other values (6) 6
37.5%

Most occurring characters

ValueCountFrequency (%)
0 70
 
11.7%
/ 62
 
10.4%
, 54
 
9.0%
2 45
 
7.5%
1 39
 
6.5%
w 28
 
4.7%
. 28
 
4.7%
e 21
 
3.5%
t 20
 
3.3%
4 16
 
2.7%
Other values (24) 215
36.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 231 | 38.6%
Decimal Number | 216 | 36.1%
Other Punctuation | 151 | 25.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 28
12.1%
e 21
 
9.1%
t 20
 
8.7%
i 14
 
6.1%
m 14
 
6.1%
g 14
 
6.1%
p 14
 
6.1%
n 14
 
6.1%
r 13
 
5.6%
o 13
 
5.6%
Other values (10) 66
28.6%
Decimal Number
ValueCountFrequency (%)
0 70
32.4%
2 45
20.8%
1 39
18.1%
4 16
 
7.4%
3 16
 
7.4%
5 11
 
5.1%
9 10
 
4.6%
7 5
 
2.3%
8 3
 
1.4%
6 1
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 62
41.1%
, 54
35.8%
. 28
18.5%
: 7
 
4.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 367 | 61.4%
Latin | 231 | 38.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 28
12.1%
e 21
 
9.1%
t 20
 
8.7%
i 14
 
6.1%
m 14
 
6.1%
g 14
 
6.1%
p 14
 
6.1%
n 14
 
6.1%
r 13
 
5.6%
o 13
 
5.6%
Other values (10) 66
28.6%
Common
ValueCountFrequency (%)
0 70
19.1%
/ 62
16.9%
, 54
14.7%
2 45
12.3%
1 39
10.6%
. 28
 
7.6%
4 16
 
4.4%
3 16
 
4.4%
5 11
 
3.0%
9 10
 
2.7%
Other values (4) 16
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 598
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 70
 
11.7%
/ 62
 
10.4%
, 54
 
9.0%
2 45
 
7.5%
1 39
 
6.5%
w 28
 
4.7%
. 28
 
4.7%
e 21
 
3.5%
t 20
 
3.3%
4 16
 
2.7%
Other values (24) 215
36.0%

MDA_CGR_NM
Text

MISSING 

Distinct: 9
Distinct (%): 50.0%
Missing: 174
Missing (%): 90.6%
Memory size: 1.6 KiB

Length

Max length: 8
Median length: 8
Mean length: 5.7777778
Min length: 3

Characters and Unicode

Total characters: 104
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 44.4%
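
The length and uniqueness figures for this variable can be reproduced from its 18 non-missing values as listed in this report. The sketch below is plain Python rather than the profiler's own code. It assumes that "Distinct" counts different values and "Unique" counts values that occur exactly once, a reading the reported numbers are consistent with.

```python
from collections import Counter
from statistics import mean, median

# Non-missing MDA_CGR_NM values as listed in this report's value table.
values = ["mbn00003"] * 10 + [
    "최기성", "김연주", "김태성", "이호승",
    "정슬기", "임성현", "강계만", "서영수",
]

lengths = [len(v) for v in values]
counts = Counter(values)

summary = {
    "max_length": max(lengths),
    "median_length": median(lengths),
    "mean_length": round(mean(lengths), 7),
    "min_length": min(lengths),
    "total_characters": sum(lengths),
    "distinct": len(counts),                              # different values
    "unique": sum(1 for n in counts.values() if n == 1),  # values seen exactly once
}
print(summary)
```

With 18 non-missing values, 9 distinct values give the reported 50.0% and 8 unique values give 44.4%.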

Sample

1st rowmbn00003
2nd rowmbn00003
3rd row최기성
4th rowmbn00003
5th rowmbn00003
Value | Count | Frequency (%)
mbn00003 | 10 | 55.6%
최기성 | 1 | 5.6%
김연주 | 1 | 5.6%
김태성 | 1 | 5.6%
이호승 | 1 | 5.6%
정슬기 | 1 | 5.6%
임성현 | 1 | 5.6%
강계만 | 1 | 5.6%
서영수 | 1 | 5.6%

Most occurring characters

ValueCountFrequency (%)
0 40
38.5%
m 10
 
9.6%
n 10
 
9.6%
3 10
 
9.6%
b 10
 
9.6%
3
 
2.9%
2
 
1.9%
2
 
1.9%
1
 
1.0%
1
 
1.0%
Other values (15) 15
 
14.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 50 | 48.1%
Lowercase Letter | 30 | 28.8%
Other Letter | 24 | 23.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3
 
12.5%
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (10) 10
41.7%
Lowercase Letter
ValueCountFrequency (%)
m 10
33.3%
n 10
33.3%
b 10
33.3%
Decimal Number
ValueCountFrequency (%)
0 40
80.0%
3 10
 
20.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 50 | 48.1%
Latin | 30 | 28.8%
Hangul | 24 | 23.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3
 
12.5%
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (10) 10
41.7%
Latin
ValueCountFrequency (%)
m 10
33.3%
n 10
33.3%
b 10
33.3%
Common
ValueCountFrequency (%)
0 40
80.0%
3 10
 
20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 80 | 76.9%
Hangul | 24 | 23.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 40
50.0%
m 10
 
12.5%
n 10
 
12.5%
3 10
 
12.5%
b 10
 
12.5%
Hangul
ValueCountFrequency (%)
3
 
12.5%
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (10) 10
41.7%

STD_YEAR
Real number (ℝ)

MISSING 

Distinct: 11
Distinct (%): 55.0%
Missing: 172
Missing (%): 89.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0100051 × 10^13
Minimum: 2020
Maximum: 2.0200102 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 2020
5-th percentile: 2020
Q1: 2020
median: 1.0100051 × 10^13
Q3: 2.0200101 × 10^13
95-th percentile: 2.0200101 × 10^13
Maximum: 2.0200102 × 10^13
Range: 2.0200102 × 10^13
Interquartile range (IQR): 2.0200101 × 10^13

Descriptive statistics

Standard deviation: 1.0362433 × 10^13
Coefficient of variation (CV): 1.0259784
Kurtosis: -2.2352941
Mean: 1.0100051 × 10^13
Median Absolute Deviation (MAD): 1.0100051 × 10^13
Skewness: 1.1674698 × 10^-15
Sum: 2.0200101 × 10^14
Variance: 1.0738002 × 10^26
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
2020 | 10 | 5.2%
20200102090022 | 1 | 0.5%
20200101090112 | 1 | 0.5%
20200101094041 | 1 | 0.5%
20200101101944 | 1 | 0.5%
20200101111043 | 1 | 0.5%
20200101133232 | 1 | 0.5%
20200101141032 | 1 | 0.5%
20200101161233 | 1 | 0.5%
20200101164234 | 1 | 0.5%
(Missing) | 172 | 89.6%

Minimum 10 values

Value | Count | Frequency (%)
2020 | 10 | 5.2%
20200101090112 | 1 | 0.5%
20200101094041 | 1 | 0.5%
20200101101944 | 1 | 0.5%
20200101111043 | 1 | 0.5%
20200101133232 | 1 | 0.5%
20200101141032 | 1 | 0.5%
20200101161233 | 1 | 0.5%
20200101164234 | 1 | 0.5%
20200101193035 | 1 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
20200102090022 | 1 | 0.5%
20200101193035 | 1 | 0.5%
20200101164234 | 1 | 0.5%
20200101161233 | 1 | 0.5%
20200101141032 | 1 | 0.5%
20200101133232 | 1 | 0.5%
20200101111043 | 1 | 0.5%
20200101101944 | 1 | 0.5%
20200101094041 | 1 | 0.5%
20200101090112 | 1 | 0.5%
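
The column mixes two kinds of values: the plain year 2020 and 14-digit numbers such as 20200101090112. That mixture is why the mean, range and variance are on the order of 10^13 and carry little meaning as year statistics. One plausible reading, an assumption not confirmed by the report, is that the 14-digit values are YYYYMMDDHHMMSS timestamps that landed in this column. The sketch below separates the two cases under that assumption; `parse_std_year` is a hypothetical helper name.

```python
from datetime import datetime

def parse_std_year(raw):
    """Split a raw STD_YEAR value into (year, timestamp).

    Assumption: 4-digit values are plain years and 14-digit values are
    YYYYMMDDHHMMSS timestamps; anything else is returned as (None, None).
    """
    text = str(int(raw))  # tolerate floats such as 2020.0
    if len(text) == 4:
        return int(text), None
    if len(text) == 14:
        try:
            stamp = datetime.strptime(text, "%Y%m%d%H%M%S")
        except ValueError:
            return None, None
        return stamp.year, stamp
    return None, None

print(parse_std_year(2020))
print(parse_std_year(20200101090112))
```

Under this reading every non-missing value resolves to the year 2020, so the column would be constant once cleaned.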

ART_SJ_CN
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 182
Missing (%): 94.8%
Memory size: 1.6 KiB

Length

Max length: 34
Median length: 31.5
Mean length: 28.5
Min length: 15

Characters and Unicode

Total characters: 285
Distinct characters: 146
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row‘남수원 이지더원’ 수원 생활인프라 공유, 마지막 조합원 모집
2nd row새해 첫날 `베일 벗은` 제네시스 GV80, 1월 출시 확정
3rd row작년 수출 금융위기후 10년만에 첫 두자릿수 하락…10.3%↓
4th row[단독] 자연출산률 0%에도···정부 콘트롤타워는 마비중
5th row극한배달전쟁에…이마트24도 요기요와 배달서비스
ValueCountFrequency (%)
출시 2
 
3.2%
항변 1
 
1.6%
흰쥐의 1
 
1.6%
1
 
1.6%
맞아 1
 
1.6%
해피치즈 1
 
1.6%
화이트모카 1
 
1.6%
당뇨병 1
 
1.6%
관리기기 1
 
1.6%
구입도 1
 
1.6%
Other values (52) 52
82.5%

Most occurring characters

ValueCountFrequency (%)
53
 
18.6%
0 6
 
2.1%
5
 
1.8%
5
 
1.8%
5
 
1.8%
4
 
1.4%
4
 
1.4%
1 4
 
1.4%
4
 
1.4%
4
 
1.4%
Other values (136) 191
67.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 191 | 67.0%
Space Separator | 53 | 18.6%
Decimal Number | 18 | 6.3%
Other Punctuation | 14 | 4.9%
Modifier Symbol | 2 | 0.7%
Uppercase Letter | 2 | 0.7%
Initial Punctuation | 1 | 0.4%
Final Punctuation | 1 | 0.4%
Close Punctuation | 1 | 0.4%
Open Punctuation | 1 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
2.6%
5
 
2.6%
5
 
2.6%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
Other values (113) 148
77.5%
Decimal Number
ValueCountFrequency (%)
0 6
33.3%
1 4
22.2%
2 3
16.7%
9 1
 
5.6%
6 1
 
5.6%
4 1
 
5.6%
8 1
 
5.6%
3 1
 
5.6%
Other Punctuation
ValueCountFrequency (%)
· 3
21.4%
, 3
21.4%
3
21.4%
% 2
14.3%
' 2
14.3%
. 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
V 1
50.0%
G 1
50.0%
Space Separator
ValueCountFrequency (%)
53
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 191 | 67.0%
Common | 92 | 32.3%
Latin | 2 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
2.6%
5
 
2.6%
5
 
2.6%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
Other values (113) 148
77.5%
Common
ValueCountFrequency (%)
53
57.6%
0 6
 
6.5%
1 4
 
4.3%
· 3
 
3.3%
, 3
 
3.3%
3
 
3.3%
2 3
 
3.3%
% 2
 
2.2%
` 2
 
2.2%
' 2
 
2.2%
Other values (11) 11
 
12.0%
Latin
ValueCountFrequency (%)
V 1
50.0%
G 1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 191 | 67.0%
ASCII | 85 | 29.8%
Punctuation | 5 | 1.8%
None | 3 | 1.1%
Arrows | 1 | 0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
53
62.4%
0 6
 
7.1%
1 4
 
4.7%
, 3
 
3.5%
2 3
 
3.5%
% 2
 
2.4%
` 2
 
2.4%
' 2
 
2.4%
9 1
 
1.2%
6 1
 
1.2%
Other values (8) 8
 
9.4%
Hangul
ValueCountFrequency (%)
5
 
2.6%
5
 
2.6%
5
 
2.6%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
4
 
2.1%
Other values (113) 148
77.5%
None
ValueCountFrequency (%)
· 3
100.0%
Punctuation
ValueCountFrequency (%)
3
60.0%
1
 
20.0%
1
 
20.0%
Arrows
ValueCountFrequency (%)
1
100.0%

ART_CN
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 182
Missing (%): 94.8%
Memory size: 1.6 KiB

Length

Max length: 205
Median length: 72.5
Mean length: 85.4
Min length: 8

Characters and Unicode

Total characters: 854
Distinct characters: 237
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row<!------------ PHOTO_POS_0 ------------>
2nd row<!------------ PHOTO_POS_0 ------------>지난해 하반기부터 국내 자동차 시장의 핫이슈로 떠오른 제네시스 GV80이 마침내 새해 첫달 출시된다.
3rd row2019년 수출이 10년 만에 두 자릿수의 하락세를 보였습니다.
4th row<!------------ PHOTO_POS_0 ------------> 지난 10월 자연출산률이 0까지 떨어지면서 인구 '데드크로스'가 코 앞으로 닥쳤다. 그러나 '인구재앙'이라 할 만한 위기상황에서 인구정책의 컨트롤타워인 저출산고령사회 위원회는 사실상 개점휴업 상태다. 총책임자인 부위원장이 3개월째 공석인데다, 대면회의는 단 한번도 열지 않은 것으로 알려졌다.
5th row<!------------ PHOTO_POS_0 ------------> 편의점 이마트24가 배달앱 '요기요'와 손잡고 편의점 상품을 배달하는 서비스를 시작한다고 1일 밝혔다.
ValueCountFrequency (%)
9
 
5.7%
photo_pos_0 5
 
3.2%
새해 2
 
1.3%
최대 2
 
1.3%
편의점 2
 
1.3%
관리기기를 1
 
0.6%
당뇨병 1
 
0.6%
환자가 1
 
0.6%
당뇨 1
 
0.6%
소아당뇨(제1형 1
 
0.6%
Other values (133) 133
84.2%

Most occurring characters

ValueCountFrequency (%)
150
 
17.6%
- 120
 
14.1%
15
 
1.8%
O 15
 
1.8%
0 14
 
1.6%
13
 
1.5%
12
 
1.4%
. 11
 
1.3%
11
 
1.3%
_ 10
 
1.2%
Other values (227) 483
56.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 446 | 52.2%
Space Separator | 150 | 17.6%
Dash Punctuation | 120 | 14.1%
Uppercase Letter | 46 | 5.4%
Decimal Number | 34 | 4.0%
Other Punctuation | 30 | 3.5%
Connector Punctuation | 10 | 1.2%
Math Symbol | 10 | 1.2%
Close Punctuation | 4 | 0.5%
Open Punctuation | 4 | 0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
15
 
3.4%
13
 
2.9%
12
 
2.7%
11
 
2.5%
8
 
1.8%
7
 
1.6%
7
 
1.6%
7
 
1.6%
7
 
1.6%
6
 
1.3%
Other values (194) 353
79.1%
Uppercase Letter
ValueCountFrequency (%)
O 15
32.6%
P 10
21.7%
T 5
 
10.9%
H 5
 
10.9%
S 5
 
10.9%
M 2
 
4.3%
G 2
 
4.3%
D 1
 
2.2%
V 1
 
2.2%
Decimal Number
ValueCountFrequency (%)
0 14
41.2%
1 6
17.6%
2 5
 
14.7%
6 2
 
5.9%
5 2
 
5.9%
3 2
 
5.9%
8 1
 
2.9%
9 1
 
2.9%
4 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
. 11
36.7%
' 8
26.7%
! 5
16.7%
, 3
 
10.0%
" 2
 
6.7%
% 1
 
3.3%
Math Symbol
ValueCountFrequency (%)
< 5
50.0%
> 5
50.0%
Close Punctuation
ValueCountFrequency (%)
) 3
75.0%
1
 
25.0%
Open Punctuation
ValueCountFrequency (%)
( 3
75.0%
1
 
25.0%
Space Separator
ValueCountFrequency (%)
150
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 120
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 10
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 443 | 51.9%
Common | 362 | 42.4%
Latin | 46 | 5.4%
Han | 3 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
15
 
3.4%
13
 
2.9%
12
 
2.7%
11
 
2.5%
8
 
1.8%
7
 
1.6%
7
 
1.6%
7
 
1.6%
7
 
1.6%
6
 
1.4%
Other values (191) 350
79.0%
Common
ValueCountFrequency (%)
150
41.4%
- 120
33.1%
0 14
 
3.9%
. 11
 
3.0%
_ 10
 
2.8%
' 8
 
2.2%
1 6
 
1.7%
2 5
 
1.4%
! 5
 
1.4%
< 5
 
1.4%
Other values (14) 28
 
7.7%
Latin
ValueCountFrequency (%)
O 15
32.6%
P 10
21.7%
T 5
 
10.9%
H 5
 
10.9%
S 5
 
10.9%
M 2
 
4.3%
G 2
 
4.3%
D 1
 
2.2%
V 1
 
2.2%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 443 | 51.9%
ASCII | 406 | 47.5%
CJK | 3 | 0.4%
None | 2 | 0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
150
36.9%
- 120
29.6%
O 15
 
3.7%
0 14
 
3.4%
. 11
 
2.7%
_ 10
 
2.5%
P 10
 
2.5%
' 8
 
2.0%
1 6
 
1.5%
2 5
 
1.2%
Other values (21) 57
 
14.0%
Hangul
ValueCountFrequency (%)
15
 
3.4%
13
 
2.9%
12
 
2.7%
11
 
2.5%
8
 
1.8%
7
 
1.6%
7
 
1.6%
7
 
1.6%
7
 
1.6%
6
 
1.4%
Other values (191) 350
79.0%
None
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

ATCH_IMG_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 192
Missing (%): 100.0%
Memory size: 1.8 KiB

JRNL_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 192
Missing (%): 100.0%
Memory size: 1.8 KiB

WRT_DATE
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 192
Missing (%): 100.0%
Memory size: 1.8 KiB

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 192
Missing (%): 100.0%
Memory size: 1.8 KiB

Interactions

[Figure: interactions plot]

Correlations

Variable | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN
MDA_ART_ESSN_NO | 1.000 | 1.000 | NaN | 1.000 | 1.000
MDA_CGR_NM | 1.000 | 1.000 | NaN | NaN | NaN
STD_YEAR | NaN | NaN | 1.000 | NaN | NaN
ART_SJ_CN | 1.000 | NaN | NaN | 1.000 | 1.000
ART_CN | 1.000 | NaN | NaN | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
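
One common way to compute a nullity correlation is the Pearson correlation between the missingness indicators of two columns. The sketch below illustrates that idea in plain Python; it is not necessarily the exact computation behind the heatmap, and the toy columns are made up for illustration.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    None marks a missing value. Returns None when either column is entirely
    missing or entirely present, because its indicator then has zero variance.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Toy columns (not taken from the dataset).
title = ["t1", None, "t2", None]
body = ["b1", None, "b2", None]
empty = [None, None, None, None]

print(nullity_correlation(title, body))   # columns that are missing together
print(nullity_correlation(title, empty))  # undefined for a fully empty column
```

Under this definition the four fully missing columns in this dataset have no defined nullity correlation with any other column.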

Sample

First rows

Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | MBN | 4022298 | mbn00003 | 2020 | ‘남수원 이지더원’ 수원 생활인프라 공유, 마지막 조합원 모집 | <!------------ PHOTO_POS_0 ------------> | <NA> | <NA> | <NA> | <NA>
2 | 수도권 남부 중 수원시는 교통의 요충지이자 도청이 자리한 도시로 발전 중입니다. 수원을 중심으로 화성, 오산, 평택 등 신도시의 개발과 신규 아파트가 들어서면서 역세권을 중심으로 희소성이 있습니다. 수원역과 고색역을 동시에 누리며 주변 시세 대비 저렴한 공급가로 실거주자의 인기가 높은 ‘남수원 이지더원’이 막바지 특별 모집 중입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | ‘남수원 이지더원’은 경기도 화성시 배양동 일대로 총 1014세대의 대단지로 조성됩니다. 선호도 높은 중소형 평형위주로 전용면적 59㎡, 69㎡, 84㎡로 선택의 폭이 넓습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 전 세대 남향위주의 배치와 3천여평의 공원을 조성해 일조량 및 채광이 우수합니다. 테마정원, 어린이놀이터, 주민운동시설 등 입주민을 위한 힐링 라이프를 선사합니다. 드레스룸, 알파룸, 현관수납장 등 수납공간을 넓히고 일부 타입의 경우 1+1 세대분리형으로 2세대가 거주가능합니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 북카페, 사우나, 피트니스센터, 배드민턴&탁구 등 다양한 커뮤니티시설을 도입합니다. 남수원 골프장 조망권과 숲세권을 누릴 수 있습니다. 황구지천 수변공원을 도보로 이용가능하며 단지 내 다양한 어린이놀이터, 산책로, 작은공원 등 설계했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | ATCH_IMG_NM | JRNL_NM | WRT_DATE | Unnamed: 9
182 | 지난 2018년 3월에도 삼성전자 평택 반도체 공장에서 20여 분간 정전이 발생해 500억 원가량의 피해가 나기도 했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
183 | 다만, 이번 사고는 정전 시간이 지난번보다 짧고 바로 복구해 피해가 크지 않을 것으로 보입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
184 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
185 | 업계에서는 생산라인의 완전 복구까지는 2~3일 정도가 걸릴 것으로 보고 있습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
186 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
187 | MBN뉴스 서영수입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
188 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
189 | 영상취재 : 배완호 기자 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
190 | 영상편집 : 이재형 | <NA> | 서영수 | 20200101193035 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
191 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | MBN_MDA_SP_CD | MDA_ART_ESSN_NO | MDA_CGR_NM | STD_YEAR | ART_SJ_CN | ART_CN | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50
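
The only duplicated row is the one in which every listed column is missing, and it occurs 50 times. A natural first cleaning step is to drop rows that carry no values at all. The sketch below shows this in plain Python on toy rows; the helper name `drop_empty_rows` and the use of `None` for `<NA>` are assumptions made for illustration.

```python
def drop_empty_rows(rows):
    """Return only the rows that have at least one non-missing value.

    Each row is a dict; None stands for a missing cell (<NA> in the report).
    """
    return [row for row in rows if any(v is not None for v in row.values())]

# Toy rows shaped like the sample (abbreviated to two columns).
rows = [
    {"MBN_MDA_SP_CD": None, "MDA_ART_ESSN_NO": None},
    {"MBN_MDA_SP_CD": "MBN", "MDA_ART_ESSN_NO": "4022298"},
    {"MBN_MDA_SP_CD": None, "MDA_ART_ESSN_NO": None},
]
cleaned = drop_empty_rows(rows)
print(len(rows) - len(cleaned), "empty rows removed")
```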