Overview

Dataset statistics

Number of variables: 6
Number of observations: 52
Missing cells: 242
Missing cells (%): 77.6%
Duplicate rows: 1
Duplicate rows (%): 1.9%
Total size in memory: 2.7 KiB
Average record size in memory: 53.5 B
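
The missing-cell percentage can be cross-checked against the per-variable counts listed under Alerts. The following is a minimal sketch of that arithmetic, assuming the figure is the total number of missing cells divided by observations times variables; the names `missing_per_variable` and `n_observations` are illustrative and not part of the report.

```python
# Per-variable missing counts, copied from the Alerts section of this report.
missing_per_variable = {
    "RSTRC_VID_ESSN_NO": 22,
    "VID_SJ_CN": 32,
    "VID_CN": 32,
    "REG_DATE": 52,
    "VOD_CRS_NM": 52,
    "Unnamed: 5": 52,
}
n_observations = 52

# Assumption: percentage = missing cells / (rows * columns).
missing_cells = sum(missing_per_variable.values())
total_cells = n_observations * len(missing_per_variable)
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # 242 77.6%
```

Both values agree with the statistics above, which suggests the three fully empty columns account for 156 of the 242 missing cells.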

Variable types

Text: 3
Unsupported: 3

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/146

Alerts

Dataset has 1 (1.9%) duplicate rows [Duplicates]
RSTRC_VID_ESSN_NO has 22 (42.3%) missing values [Missing]
VID_SJ_CN has 32 (61.5%) missing values [Missing]
VID_CN has 32 (61.5%) missing values [Missing]
REG_DATE has 52 (100.0%) missing values [Missing]
VOD_CRS_NM has 52 (100.0%) missing values [Missing]
Unnamed: 5 has 52 (100.0%) missing values [Missing]
REG_DATE is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
VOD_CRS_NM is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-11 21:15:17.433394
Analysis finished: 2023-12-11 21:15:17.947922
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RSTRC_VID_ESSN_NO
Text

MISSING 

Distinct: 30
Distinct (%): 100.0%
Missing: 22
Missing (%): 42.3%
Memory size: 548.0 B

Length

Max length: 37
Median length: 30
Mean length: 18.833333
Min length: 7

Characters and Unicode

Total characters: 565
Distinct characters: 185
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
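
As an illustration of reading such properties per character, here is a minimal sketch using Python's standard `unicodedata` module on the 5th sample row of this variable. It is an assumption that this mirrors how the report derives its categories; the standard-library module exposes general categories but not scripts or blocks, and the `CATEGORY_NAMES` mapping and `category_counts` helper are hypothetical names introduced for this example.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "Pc": "Connector Punctuation",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

# 5th sample row of RSTRC_VID_ESSN_NO, as shown in this report.
sample = "중년엔 비타민D결핍 예방,"
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for the Korean text in this dataset.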

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 1005578
2nd row: 성인 남녀 기준 90cm, 85cm이상이면
3rd row: 실제 나이에 +1을 추가하면 된다고...
4th row: 1005579
5th row: 중년엔 비타민D결핍 예방,

Value | Count | Frequency (%)
한다 | 5 | 3.9%
 | 2 | 1.6%
좋다고 | 2 | 1.6%
된다고 | 2 | 1.6%
충분한 | 1 | 0.8%
수액대사를 | 1 | 0.8%
짠맛은 | 1 | 0.8%
비슷하고 | 1 | 0.8%
바닷물과 | 1 | 0.8%
1005645 | 1 | 0.8%
Other values (111) | 111 | 86.7%

Most occurring characters

ValueCountFrequency (%)
108
 
19.1%
0 23
 
4.1%
19
 
3.4%
17
 
3.0%
1 15
 
2.7%
5 15
 
2.7%
14
 
2.5%
. 12
 
2.1%
11
 
1.9%
11
 
1.9%
Other values (175) 320
56.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 355 | 62.8%
Space Separator | 108 | 19.1%
Decimal Number | 77 | 13.6%
Other Punctuation | 19 | 3.4%
Lowercase Letter | 4 | 0.7%
Math Symbol | 1 | 0.2%
Uppercase Letter | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19
 
5.4%
17
 
4.8%
14
 
3.9%
11
 
3.1%
11
 
3.1%
8
 
2.3%
8
 
2.3%
8
 
2.3%
7
 
2.0%
6
 
1.7%
Other values (157) 246
69.3%
Decimal Number
Value | Count | Frequency (%)
0 | 23 | 29.9%
1 | 15 | 19.5%
5 | 15 | 19.5%
6 | 8 | 10.4%
4 | 4 | 5.2%
8 | 4 | 5.2%
9 | 3 | 3.9%
3 | 2 | 2.6%
7 | 2 | 2.6%
2 | 1 | 1.3%
Other Punctuation
Value | Count | Frequency (%)
. | 12 | 63.2%
, | 5 | 26.3%
' | 2 | 10.5%
Lowercase Letter
Value | Count | Frequency (%)
c | 2 | 50.0%
m | 2 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 108 | 100.0%
Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
D | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 355 | 62.8%
Common | 205 | 36.3%
Latin | 5 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
 
5.4%
17
 
4.8%
14
 
3.9%
11
 
3.1%
11
 
3.1%
8
 
2.3%
8
 
2.3%
8
 
2.3%
7
 
2.0%
6
 
1.7%
Other values (157) 246
69.3%
Common
ValueCountFrequency (%)
108
52.7%
0 23
 
11.2%
1 15
 
7.3%
5 15
 
7.3%
. 12
 
5.9%
6 8
 
3.9%
, 5
 
2.4%
4 4
 
2.0%
8 4
 
2.0%
9 3
 
1.5%
Other values (5) 8
 
3.9%
Latin
ValueCountFrequency (%)
c 2
40.0%
m 2
40.0%
D 1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 355 | 62.8%
ASCII | 210 | 37.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
108
51.4%
0 23
 
11.0%
1 15
 
7.1%
5 15
 
7.1%
. 12
 
5.7%
6 8
 
3.8%
, 5
 
2.4%
4 4
 
1.9%
8 4
 
1.9%
9 3
 
1.4%
Other values (8) 13
 
6.2%
Hangul
ValueCountFrequency (%)
19
 
5.4%
17
 
4.8%
14
 
3.9%
11
 
3.1%
11
 
3.1%
8
 
2.3%
8
 
2.3%
8
 
2.3%
7
 
2.0%
6
 
1.7%
Other values (157) 246
69.3%

VID_SJ_CN
Text

MISSING 

Distinct: 14
Distinct (%): 70.0%
Missing: 32
Missing (%): 61.5%
Memory size: 548.0 B

Length

Max length: 25
Median length: 21.5
Mean length: 11.9
Min length: 8

Characters and Unicode

Total characters: 238
Distinct characters: 90
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 55.0%

Sample

1st row: 생체나이 자가 진단법
2nd row: 20150107
3rd row: 남녀노소 부담없는 뱅어포 주먹밥
4th row: 20150107
5th row: 회춘의 명약 '새싹보리'

Value | Count | Frequency (%)
20150107 | 3 | 6.0%
20150121 | 3 | 6.0%
20150114 | 3 | 6.0%
중요성 | 1 | 2.0%
건강의 | 1 | 2.0%
마블링을 | 1 | 2.0%
만드는 | 1 | 2.0%
방법 | 1 | 2.0%
임신하면 | 1 | 2.0%
입맛이 | 1 | 2.0%
Other values (34) | 34 | 68.0%

Most occurring characters

ValueCountFrequency (%)
30
 
12.6%
1 26
 
10.9%
0 24
 
10.1%
2 14
 
5.9%
5 11
 
4.6%
7
 
2.9%
5
 
2.1%
5
 
2.1%
' 4
 
1.7%
4
 
1.7%
Other values (80) 108
45.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 111 | 46.6%
Decimal Number | 82 | 34.5%
Space Separator | 30 | 12.6%
Other Punctuation | 15 | 6.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
6.3%
5
 
4.5%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
2
 
1.8%
2
 
1.8%
Other values (67) 74
66.7%
Decimal Number
Value | Count | Frequency (%)
1 | 26 | 31.7%
0 | 24 | 29.3%
2 | 14 | 17.1%
5 | 11 | 13.4%
7 | 3 | 3.7%
4 | 3 | 3.7%
8 | 1 | 1.2%
Other Punctuation
Value | Count | Frequency (%)
' | 4 | 26.7%
. | 3 | 20.0%
? | 3 | 20.0%
! | 3 | 20.0%
, | 2 | 13.3%
Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 127 | 53.4%
Hangul | 111 | 46.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
6.3%
5
 
4.5%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
2
 
1.8%
2
 
1.8%
Other values (67) 74
66.7%
Common
ValueCountFrequency (%)
30
23.6%
1 26
20.5%
0 24
18.9%
2 14
11.0%
5 11
 
8.7%
' 4
 
3.1%
. 3
 
2.4%
? 3
 
2.4%
! 3
 
2.4%
7 3
 
2.4%
Other values (3) 6
 
4.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 127 | 53.4%
Hangul | 111 | 46.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
30
23.6%
1 26
20.5%
0 24
18.9%
2 14
11.0%
5 11
 
8.7%
' 4
 
3.1%
. 3
 
2.4%
? 3
 
2.4%
! 3
 
2.4%
7 3
 
2.4%
Other values (3) 6
 
4.7%
Hangul
ValueCountFrequency (%)
7
 
6.3%
5
 
4.5%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
2
 
1.8%
2
 
1.8%
Other values (67) 74
66.7%

VID_CN
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 32
Missing (%): 61.5%
Memory size: 548.0 B

Length

Max length: 82
Median length: 53.5
Mean length: 50.65
Min length: 12

Characters and Unicode

Total characters: 1013
Distinct characters: 135
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 내 몸의 변화를 살펴보고
2nd row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005578
3rd row: 어린이에게는 성장을,
4th row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005579
5th row: 옛날부터 구황작물로 불렸던 새싹보리는

Value | Count | Frequency (%)
 | 1 | 1.8%
내어 | 1 | 1.8%
임신했을때 | 1 | 1.8%
신맛이 | 1 | 1.8%
당기는 | 1 | 1.8%
이유는 | 1 | 1.8%
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=21&content_id=1005644 | 1 | 1.8%
농도의 | 1 | 1.8%
차이는 | 1 | 1.8%
있지만 | 1 | 1.8%
Other values (47) | 47 | 82.5%

Most occurring characters

ValueCountFrequency (%)
n 80
 
7.9%
t 80
 
7.9%
c 50
 
4.9%
e 50
 
4.9%
o 50
 
4.9%
47
 
4.6%
. 40
 
3.9%
/ 40
 
3.9%
_ 30
 
3.0%
m 30
 
3.0%
Other values (125) 516
50.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 560 | 55.3%
Other Letter | 142 | 14.0%
Other Punctuation | 112 | 11.1%
Decimal Number | 92 | 9.1%
Space Separator | 47 | 4.6%
Connector Punctuation | 30 | 3.0%
Math Symbol | 20 | 2.0%
Uppercase Letter | 10 | 1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
4.9%
6
 
4.2%
4
 
2.8%
4
 
2.8%
3
 
2.1%
3
 
2.1%
3
 
2.1%
3
 
2.1%
2
 
1.4%
2
 
1.4%
Other values (86) 105
73.9%
Lowercase Letter
Value | Count | Frequency (%)
n | 80 | 14.3%
t | 80 | 14.3%
c | 50 | 8.9%
e | 50 | 8.9%
o | 50 | 8.9%
m | 30 | 5.4%
w | 30 | 5.4%
i | 20 | 3.6%
r | 20 | 3.6%
b | 20 | 3.6%
Other values (9) | 130 | 23.2%
Decimal Number
Value | Count | Frequency (%)
1 | 24 | 26.1%
0 | 23 | 25.0%
5 | 15 | 16.3%
2 | 11 | 12.0%
6 | 8 | 8.7%
4 | 4 | 4.3%
8 | 3 | 3.3%
7 | 2 | 2.2%
3 | 1 | 1.1%
9 | 1 | 1.1%
Other Punctuation
Value | Count | Frequency (%)
. | 40 | 35.7%
/ | 40 | 35.7%
? | 10 | 8.9%
& | 10 | 8.9%
: | 10 | 8.9%
, | 2 | 1.8%
Space Separator
Value | Count | Frequency (%)
(space) | 47 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 30 | 100.0%
Math Symbol
Value | Count | Frequency (%)
= | 20 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
C | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 570 | 56.3%
Common | 301 | 29.7%
Hangul | 142 | 14.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
4.9%
6
 
4.2%
4
 
2.8%
4
 
2.8%
3
 
2.1%
3
 
2.1%
3
 
2.1%
3
 
2.1%
2
 
1.4%
2
 
1.4%
Other values (86) 105
73.9%
Latin
ValueCountFrequency (%)
n 80
14.0%
t 80
14.0%
c 50
 
8.8%
e 50
 
8.8%
o 50
 
8.8%
m 30
 
5.3%
w 30
 
5.3%
i 20
 
3.5%
r 20
 
3.5%
b 20
 
3.5%
Other values (10) 140
24.6%
Common
ValueCountFrequency (%)
47
15.6%
. 40
13.3%
/ 40
13.3%
_ 30
10.0%
1 24
8.0%
0 23
7.6%
= 20
6.6%
5 15
 
5.0%
2 11
 
3.7%
? 10
 
3.3%
Other values (9) 41
13.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 871 | 86.0%
Hangul | 142 | 14.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 80
 
9.2%
t 80
 
9.2%
c 50
 
5.7%
e 50
 
5.7%
o 50
 
5.7%
47
 
5.4%
. 40
 
4.6%
/ 40
 
4.6%
_ 30
 
3.4%
m 30
 
3.4%
Other values (29) 374
42.9%
Hangul
ValueCountFrequency (%)
7
 
4.9%
6
 
4.2%
4
 
2.8%
4
 
2.8%
3
 
2.1%
3
 
2.1%
3
 
2.1%
3
 
2.1%
2
 
1.4%
2
 
1.4%
Other values (86) 105
73.9%

REG_DATE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 52
Missing (%): 100.0%
Memory size: 600.0 B

VOD_CRS_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 52
Missing (%): 100.0%
Memory size: 600.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 52
Missing (%): 100.0%
Memory size: 600.0 B

Correlations

 | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN
RSTRC_VID_ESSN_NO | 1.000 | 1.000 | 1.000
VID_SJ_CN | 1.000 | 1.000 | 1.000
VID_CN | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
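
To make the idea of nullity correlation concrete, here is a minimal sketch that computes a Pearson correlation between the missing-value indicators of two variables. It assumes nullity correlation is the ordinary Pearson correlation of 0/1 missingness flags, and it uses only rows 0 to 9 of the Sample section, so the result is illustrative and is not a value from the heatmap. The `pearson` helper and the list names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# 1 = missing, 0 = present, for rows 0-9 of the Sample section.
rstrc_missing = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]   # RSTRC_VID_ESSN_NO
vid_sj_missing = [1, 0, 1, 1, 1, 0, 0, 1, 1, 1]  # VID_SJ_CN

nullity_corr = pearson(rstrc_missing, vid_sj_missing)
print(round(nullity_corr, 3))
```

A positive value means the two variables tend to be missing in the same rows, which fits the pattern in the sample where fully empty rows alternate with populated ones.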

Sample

 | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 1005578 | 생체나이 자가 진단법 | 내 몸의 변화를 살펴보고 | <NA> | <NA> | <NA>
2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 성인 남녀 기준 90cm, 85cm이상이면 | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 실제 나이에 +1을 추가하면 된다고... | 20150107 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005578 | <NA> | <NA> | <NA>
6 | 1005579 | 남녀노소 부담없는 뱅어포 주먹밥 | 어린이에게는 성장을, | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 중년엔 비타민D결핍 예방, | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

 | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
42 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
43 | 어떤 첨가물이 들어있지 않고, 근육이완을 시켜주고, | <NA> | <NA> | <NA> | <NA> | <NA>
44 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
45 | 위경련 완화 및 위 연동운동 활성화에 좋다고 한다. | 20150121 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005646 | <NA> | <NA> | <NA>
46 | 1005680 | 50대 이후, 손발 건강의 중요성 | 대부분 겪는 50대 이후, 남자 여자 모두 | <NA> | <NA> | <NA>
47 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
48 | 갱년기를 겪으며 여자는 폐경도 함께 오고, | <NA> | <NA> | <NA> | <NA> | <NA>
49 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
50 | 호르몬의 변화로 노화가 빨라진다고 했다. | 20150128 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005680 | <NA> | <NA> | <NA>
51 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

 | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | # duplicates
0 | <NA> | <NA> | <NA> | 22
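
The duplicate count can be illustrated by tallying identical rows. The sketch below is a minimal standard-library version, assuming duplicates are detected on the three supported text columns only (the table above lists just those three); it uses rows 0 to 9 of the Sample section, and `NA`, `rows`, and `duplicates` are illustrative names.

```python
from collections import Counter

NA = None  # stand-in for a missing value

# (RSTRC_VID_ESSN_NO, VID_SJ_CN, VID_CN) for rows 0-9 of the Sample section.
rows = [
    (NA, NA, NA),
    ("1005578", "생체나이 자가 진단법", "내 몸의 변화를 살펴보고"),
    (NA, NA, NA),
    ("성인 남녀 기준 90cm, 85cm이상이면", NA, NA),
    (NA, NA, NA),
    (
        "실제 나이에 +1을 추가하면 된다고...",
        "20150107",
        "http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005578",
    ),
    ("1005579", "남녀노소 부담없는 뱅어포 주먹밥", "어린이에게는 성장을,"),
    (NA, NA, NA),
    ("중년엔 비타민D결핍 예방,", NA, NA),
    (NA, NA, NA),
]

# Keep only the row values that occur more than once.
row_counts = Counter(rows)
duplicates = {row: n for row, n in row_counts.items() if n > 1}
print(duplicates)  # {(None, None, None): 5}
```

In these ten rows the only repeated row is the fully empty one. Over all 52 observations the same tally would be expected to give one distinct duplicated row occurring 22 times, consistent with the figures reported here.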