Overview

Dataset statistics

Number of variables: 6
Number of observations: 29
Missing cells: 111
Missing cells (%): 63.8%
Duplicate rows: 1
Duplicate rows (%): 3.4%
Total size in memory: 1.6 KiB
Average record size in memory: 55.6 B
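The two percentages above can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is the missing cells divided by variables × observations, and that the duplicate percentage is relative to the number of observations; the variable names are illustrative.

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 29
missing_cells = 111
duplicate_rows = 1

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions the script reproduces the reported 63.8% and 3.4%.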

Variable types

Text: 3
Unsupported: 3

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/161

Alerts

Dataset has 1 (3.4%) duplicate rows [Duplicates]
RSTRC_VID_ESSN_NO has 2 (6.9%) missing values [Missing]
VID_SJ_CN has 11 (37.9%) missing values [Missing]
VID_CN has 11 (37.9%) missing values [Missing]
REG_DATE has 29 (100.0%) missing values [Missing]
VOD_CRS_NM has 29 (100.0%) missing values [Missing]
Unnamed: 5 has 29 (100.0%) missing values [Missing]
REG_DATE is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
VOD_CRS_NM is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-11 20:24:54.751230
Analysis finished: 2023-12-11 20:24:56.733061
Duration: 1.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RSTRC_VID_ESSN_NO
Text

MISSING 

Distinct: 27
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.9%
Memory size: 364.0 B

Length

Max length: 32
Median length: 27
Mean length: 15.444444
Min length: 6

Characters and Unicode

Total characters: 417
Distinct characters: 122
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
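As a rough illustration of how such properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for two values taken from this variable's sample rows. The helper name `category_counts` is illustrative, and this is not necessarily how the report itself computes its tables. Note that `unicodedata` exposes the general category (for example `Lo` for Other Letter, `Nd` for Decimal Number) but has no script or block lookup.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values from the sample rows of RSTRC_VID_ESSN_NO.
sample = ["1005466", "'마지막'에 숨겨진 의미는?"]
counts = category_counts(sample)

# Lo = Other Letter, Nd = Decimal Number, Po = Other Punctuation, Zs = Space Separator
for code, n in counts.most_common():
    print(code, n)
```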

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: 1005466
2nd row: 하지만 모두 믿고 샀다가는 낭패 본다?!
3rd row: 쇼 호스트가 말하는 믿어도 되는 쇼 호스트의 말은?
4th row: 1005467
5th row: '마지막'에 숨겨진 의미는?
Value | Count | Frequency (%)
 | 5 | 4.9%
호스트의 | 3 | 2.9%
시장 | 3 | 2.9%
말은 | 2 | 1.9%
약재 | 2 | 1.9%
 | 2 | 1.9%
호스트가 | 2 | 1.9%
천마를 | 2 | 1.9%
제품이기도 | 2 | 1.9%
한의약 | 2 | 1.9%
Other values (78) | 78 | 75.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 77 | 18.5%
0 | 21 | 5.0%
5 | 14 | 3.4%
 | 12 | 2.9%
1 | 11 | 2.6%
? | 10 | 2.4%
 | 9 | 2.2%
 | 9 | 2.2%
 | 7 | 1.7%
 | 7 | 1.7%
Other values (112) | 240 | 57.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 254 | 60.9%
Space Separator | 77 | 18.5%
Decimal Number | 66 | 15.8%
Other Punctuation | 20 | 4.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 12 | 4.7%
 | 9 | 3.5%
 | 9 | 3.5%
 | 7 | 2.8%
 | 7 | 2.8%
 | 7 | 2.8%
 | 6 | 2.4%
 | 6 | 2.4%
 | 6 | 2.4%
 | 5 | 2.0%
Other values (97) | 180 | 70.9%
Decimal Number
Value | Count | Frequency (%)
0 | 21 | 31.8%
5 | 14 | 21.2%
1 | 11 | 16.7%
4 | 5 | 7.6%
3 | 4 | 6.1%
6 | 4 | 6.1%
9 | 2 | 3.0%
2 | 2 | 3.0%
7 | 2 | 3.0%
8 | 1 | 1.5%
Other Punctuation
Value | Count | Frequency (%)
? | 10 | 50.0%
! | 5 | 25.0%
, | 3 | 15.0%
' | 2 | 10.0%
Space Separator
Value | Count | Frequency (%)
(space) | 77 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 254 | 60.9%
Common | 163 | 39.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 12 | 4.7%
 | 9 | 3.5%
 | 9 | 3.5%
 | 7 | 2.8%
 | 7 | 2.8%
 | 7 | 2.8%
 | 6 | 2.4%
 | 6 | 2.4%
 | 6 | 2.4%
 | 5 | 2.0%
Other values (97) | 180 | 70.9%
Common
Value | Count | Frequency (%)
(space) | 77 | 47.2%
0 | 21 | 12.9%
5 | 14 | 8.6%
1 | 11 | 6.7%
? | 10 | 6.1%
! | 5 | 3.1%
4 | 5 | 3.1%
3 | 4 | 2.5%
6 | 4 | 2.5%
, | 3 | 1.8%
Other values (5) | 9 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 254 | 60.9%
ASCII | 163 | 39.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 77 | 47.2%
0 | 21 | 12.9%
5 | 14 | 8.6%
1 | 11 | 6.7%
? | 10 | 6.1%
! | 5 | 3.1%
4 | 5 | 3.1%
3 | 4 | 2.5%
6 | 4 | 2.5%
, | 3 | 1.8%
Other values (5) | 9 | 5.5%
Hangul
Value | Count | Frequency (%)
 | 12 | 4.7%
 | 9 | 3.5%
 | 9 | 3.5%
 | 7 | 2.8%
 | 7 | 2.8%
 | 7 | 2.8%
 | 6 | 2.4%
 | 6 | 2.4%
 | 6 | 2.4%
 | 5 | 2.0%
Other values (97) | 180 | 70.9%

VID_SJ_CN
Text

MISSING 

Distinct: 12
Distinct (%): 66.7%
Missing: 11
Missing (%): 37.9%
Memory size: 364.0 B

Length

Max length: 21
Median length: 20.5
Mean length: 12.944444
Min length: 8

Characters and Unicode

Total characters: 233
Distinct characters: 83
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 50.0%
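The gap between the Distinct and Unique counts for this variable can be reproduced with a small sketch. It assumes that "Distinct" counts different non-missing values, that "Unique" counts values occurring exactly once, and that both percentages are relative to the 18 non-missing values. The three dates come from the frequency table of this variable; the nine title strings are placeholders, not the actual data.

```python
from collections import Counter

# 18 non-missing values: three dates that each occur three times,
# plus nine titles that each occur once (placeholders stand in for the titles).
values = ["20141215"] * 3 + ["20141222"] * 3 + ["20141229"] * 3
values += [f"title {i}" for i in range(9)]

freq = Counter(values)
n_distinct = len(freq)                              # different values
n_unique = sum(1 for c in freq.values() if c == 1)  # values seen exactly once

print(n_distinct, f"{100 * n_distinct / len(values):.1f}%")
print(n_unique, f"{100 * n_unique / len(values):.1f}%")
```

With these inputs the sketch yields 12 distinct values (66.7%) and 9 unique values (50.0%), matching the figures reported for VID_SJ_CN.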

Sample

1st row: 믿어도 되는 쇼 호스트의 말
2nd row: 20141215
3rd row: 홈쇼핑에서 '마지막'의 의미는?
4th row: 20141215
5th row: 판매 상품을 직접 사용하는 쇼 호스트?
Value | Count | Frequency (%)
20141215 | 3 | 5.9%
20141229 | 3 | 5.9%
20141222 | 3 | 5.9%
 | 2 | 3.9%
이용하라 | 2 | 3.9%
세일을 | 2 | 3.9%
서울 | 1 | 2.0%
청량리로 | 1 | 2.0%
가라 | 1 | 2.0%
국내 | 1 | 2.0%
Other values (32) | 32 | 62.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 33 | 14.2%
2 | 28 | 12.0%
1 | 22 | 9.4%
4 | 9 | 3.9%
0 | 9 | 3.9%
 | 7 | 3.0%
 | 5 | 2.1%
 | 4 | 1.7%
 | 4 | 1.7%
 | 4 | 1.7%
Other values (73) | 108 | 46.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 119 | 51.1%
Decimal Number | 75 | 32.2%
Space Separator | 33 | 14.2%
Other Punctuation | 6 | 2.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 7 | 5.9%
 | 5 | 4.2%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (62) | 78 | 65.5%
Decimal Number
Value | Count | Frequency (%)
2 | 28 | 37.3%
1 | 22 | 29.3%
4 | 9 | 12.0%
0 | 9 | 12.0%
9 | 3 | 4.0%
5 | 3 | 4.0%
7 | 1 | 1.3%
Other Punctuation
Value | Count | Frequency (%)
, | 2 | 33.3%
' | 2 | 33.3%
? | 2 | 33.3%
Space Separator
Value | Count | Frequency (%)
(space) | 33 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 119 | 51.1%
Common | 114 | 48.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 7 | 5.9%
 | 5 | 4.2%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (62) | 78 | 65.5%
Common
Value | Count | Frequency (%)
(space) | 33 | 28.9%
2 | 28 | 24.6%
1 | 22 | 19.3%
4 | 9 | 7.9%
0 | 9 | 7.9%
9 | 3 | 2.6%
5 | 3 | 2.6%
, | 2 | 1.8%
' | 2 | 1.8%
? | 2 | 1.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 119 | 51.1%
ASCII | 114 | 48.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 33 | 28.9%
2 | 28 | 24.6%
1 | 22 | 19.3%
4 | 9 | 7.9%
0 | 9 | 7.9%
9 | 3 | 2.6%
5 | 3 | 2.6%
, | 2 | 1.8%
' | 2 | 1.8%
? | 2 | 1.8%
Hangul
Value | Count | Frequency (%)
 | 7 | 5.9%
 | 5 | 4.2%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 4 | 3.4%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (62) | 78 | 65.5%

VID_CN
Text

MISSING 

Distinct: 18
Distinct (%): 100.0%
Missing: 11
Missing (%): 37.9%
Memory size: 364.0 B

Length

Max length: 82
Median length: 55
Mean length: 50.222222
Min length: 12

Characters and Unicode

Total characters: 904
Distinct characters: 122
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 100.0%

Sample

1st row: 쇼호스트의 현란한 말솜씨!
2nd row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005466
3rd row: 홈쇼핑에서 자주 보이는 단어, '마지막'
4th row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005467
5th row: 자기가 판매하는 상품을
Value | Count | Frequency (%)
백화점 | 2 | 3.8%
쇼호스트의 | 1 | 1.9%
청량리에 | 1 | 1.9%
각종 | 1 | 1.9%
도매 | 1 | 1.9%
시장들 | 1 | 1.9%
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=21&content_id=1005501 | 1 | 1.9%
조선시대 | 1 | 1.9%
보제원에서 | 1 | 1.9%
발전된 | 1 | 1.9%
Other values (41) | 41 | 78.8%

Most occurring characters

Value | Count | Frequency (%)
n | 72 | 8.0%
t | 72 | 8.0%
o | 45 | 5.0%
c | 45 | 5.0%
e | 45 | 5.0%
. | 36 | 4.0%
/ | 36 | 4.0%
(space) | 34 | 3.8%
m | 27 | 3.0%
_ | 27 | 3.0%
Other values (112) | 465 | 51.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 504 | 55.8%
Other Letter | 115 | 12.7%
Other Punctuation | 109 | 12.1%
Decimal Number | 88 | 9.7%
Space Separator | 34 | 3.8%
Connector Punctuation | 27 | 3.0%
Math Symbol | 18 | 2.0%
Uppercase Letter | 9 | 1.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 6 | 5.2%
 | 5 | 4.3%
 | 5 | 4.3%
 | 4 | 3.5%
 | 3 | 2.6%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
Other values (71) | 82 | 71.3%
Lowercase Letter
Value | Count | Frequency (%)
n | 72 | 14.3%
t | 72 | 14.3%
o | 45 | 8.9%
c | 45 | 8.9%
e | 45 | 8.9%
m | 27 | 5.4%
w | 27 | 5.4%
l | 18 | 3.6%
b | 18 | 3.6%
i | 18 | 3.6%
Other values (9) | 117 | 23.2%
Decimal Number
Value | Count | Frequency (%)
0 | 22 | 25.0%
1 | 22 | 25.0%
5 | 14 | 15.9%
2 | 11 | 12.5%
4 | 6 | 6.8%
3 | 4 | 4.5%
6 | 4 | 4.5%
9 | 2 | 2.3%
7 | 2 | 2.3%
8 | 1 | 1.1%
Other Punctuation
Value | Count | Frequency (%)
. | 36 | 33.0%
/ | 36 | 33.0%
? | 9 | 8.3%
& | 9 | 8.3%
: | 9 | 8.3%
, | 5 | 4.6%
! | 3 | 2.8%
' | 2 | 1.8%
Space Separator
Value | Count | Frequency (%)
(space) | 34 | 100.0%
Connector Punctuation
Value | Count | Frequency (%)
_ | 27 | 100.0%
Math Symbol
Value | Count | Frequency (%)
= | 18 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
C | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 513 | 56.7%
Common | 276 | 30.5%
Hangul | 115 | 12.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 6 | 5.2%
 | 5 | 4.3%
 | 5 | 4.3%
 | 4 | 3.5%
 | 3 | 2.6%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
Other values (71) | 82 | 71.3%
Common
Value | Count | Frequency (%)
. | 36 | 13.0%
/ | 36 | 13.0%
(space) | 34 | 12.3%
_ | 27 | 9.8%
0 | 22 | 8.0%
1 | 22 | 8.0%
= | 18 | 6.5%
5 | 14 | 5.1%
2 | 11 | 4.0%
? | 9 | 3.3%
Other values (11) | 47 | 17.0%
Latin
Value | Count | Frequency (%)
n | 72 | 14.0%
t | 72 | 14.0%
o | 45 | 8.8%
c | 45 | 8.8%
e | 45 | 8.8%
m | 27 | 5.3%
w | 27 | 5.3%
l | 18 | 3.5%
b | 18 | 3.5%
i | 18 | 3.5%
Other values (10) | 126 | 24.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 789 | 87.3%
Hangul | 115 | 12.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 72 | 9.1%
t | 72 | 9.1%
o | 45 | 5.7%
c | 45 | 5.7%
e | 45 | 5.7%
. | 36 | 4.6%
/ | 36 | 4.6%
(space) | 34 | 4.3%
m | 27 | 3.4%
_ | 27 | 3.4%
Other values (31) | 350 | 44.4%
Hangul
Value | Count | Frequency (%)
 | 6 | 5.2%
 | 5 | 4.3%
 | 5 | 4.3%
 | 4 | 3.5%
 | 3 | 2.6%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
 | 2 | 1.7%
Other values (71) | 82 | 71.3%

REG_DATE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 29
Missing (%): 100.0%
Memory size: 393.0 B

VOD_CRS_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 29
Missing (%): 100.0%
Memory size: 393.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 29
Missing (%): 100.0%
Memory size: 393.0 B

Correlations

Variable | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN
RSTRC_VID_ESSN_NO | 1.000 | 1.000 | 1.000
VID_SJ_CN | 1.000 | 1.000 | 1.000
VID_CN | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
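A minimal sketch of this idea follows, assuming the nullity correlation is a Pearson correlation between present/missing indicators of two columns. The indicator lists are transcribed from rows 0 to 9 of the Sample table in this report; the helper `pearson` is illustrative and is not part of ydata-profiling.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# 1 = value present, 0 = value missing, for sample rows 0 to 9.
rstrc_vid_essn_no = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
vid_sj_cn = [0, 1, 0, 1, 1, 0, 1, 1, 0, 1]
vid_cn = [0, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Columns that are always missing together correlate perfectly.
print(round(pearson(vid_sj_cn, vid_cn), 3))
# Columns whose gaps only partly coincide correlate more weakly.
print(round(pearson(rstrc_vid_essn_no, vid_sj_cn), 3))
```

In these rows VID_SJ_CN and VID_CN are always present or missing together, which is why their indicators correlate perfectly, while RSTRC_VID_ESSN_NO is missing only in the all-empty row.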

Sample

Row | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 1005466 | 믿어도 되는 쇼 호스트의 말 | 쇼호스트의 현란한 말솜씨! | <NA> | <NA> | <NA>
2 | 하지만 모두 믿고 샀다가는 낭패 본다?! | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 쇼 호스트가 말하는 믿어도 되는 쇼 호스트의 말은? | 20141215 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005466 | <NA> | <NA> | <NA>
4 | 1005467 | 홈쇼핑에서 '마지막'의 의미는? | 홈쇼핑에서 자주 보이는 단어, '마지막' | <NA> | <NA> | <NA>
5 | '마지막'에 숨겨진 의미는? | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 쇼 호스트의 말을 끝까지 들어야 하는 이유! | 20141215 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005467 | <NA> | <NA> | <NA>
7 | 1005468 | 판매 상품을 직접 사용하는 쇼 호스트? | 자기가 판매하는 상품을 | <NA> | <NA> | <NA>
8 | 직접 사용한다는 쇼 호스트의 말! | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 쇼 호스트가 신뢰감을 주기 위해 자주 쓰는 말은? | 20141215 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005468 | <NA> | <NA> | <NA>

Row | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
19 | 1005532 | 국내 최대의 한약시장, 서울 약령시장 | 조선시대 보제원에서 발전된 약령시장 | <NA> | <NA> | <NA>
20 | 약령시장 전문가가 밝히는 | <NA> | <NA> | <NA> | <NA> | <NA>
21 | 약재 시장에서 약초 잘 고르는 법 | 20141229 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005532 | <NA> | <NA> | <NA>
22 | 1005533 | 한의약 박물관 이용하는 방법 | 한약에 대한 모든 것을 알 수 있다는 | <NA> | <NA> | <NA>
23 | 약령시 한의약 박물관! | <NA> | <NA> | <NA> | <NA> | <NA>
24 | 한의약 박물관을 유용하게 이용하는 방법은? | 20141229 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005533 | <NA> | <NA> | <NA>
25 | 1005534 | 하늘이 내려준 신비의 약초, 천마 | 겨울에 먹으면 더욱 좋다는 천마! | <NA> | <NA> | <NA>
26 | 좋은 천마를 고르려면 | <NA> | <NA> | <NA> | <NA> | <NA>
27 | 천마를 백열등에 비춰봐라? | 20141229 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1005534 | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | # duplicates
0 | <NA> | <NA> | <NA> | 2
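The duplicate listed above is the all-missing row, which appears twice in the Sample table (rows 0 and 28). The sketch below shows one way to find such repeats with the standard library. It assumes the overview's "Duplicate rows" figure counts distinct row patterns that occur more than once, which is an interpretation rather than something the report states. Only a few rows are reproduced, with placeholder strings for the text fields.

```python
from collections import Counter

# Rows as tuples over the three text columns; None marks a missing value.
# The strings are placeholders, not the actual titles or descriptions.
rows = [
    (None, None, None),
    ("1005466", "title A", "description A"),
    ("1005467", "title B", "description B"),
    (None, None, None),
]

occurrences = Counter(rows)
# Keep only row patterns that occur more than once.
repeated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in repeated.items():
    print(row, n)
print("distinct repeated rows:", len(repeated))
```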