Overview

Dataset statistics

Number of variables: 6
Number of observations: 58
Missing cells: 275
Missing cells (%): 79.0%
Duplicate rows: 1
Duplicate rows (%): 1.7%
Total size in memory: 3.0 KiB
Average record size in memory: 53.3 B
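
The missing-cell figures can be cross-checked from the per-column missing counts that the report lists under Alerts. The following is a minimal sketch in plain Python; the function name `missing_cell_share` is illustrative and is not part of ydata-profiling.

```python
def missing_cell_share(missing_per_column, n_rows):
    """Return (missing cells, percentage of all cells) for a table."""
    total_cells = n_rows * len(missing_per_column)
    missing_cells = sum(missing_per_column.values())
    return missing_cells, 100.0 * missing_cells / total_cells


# Per-column missing counts as reported in the Alerts section.
missing = {
    "RSTRC_VID_ESSN_NO": 25,
    "VID_SJ_CN": 38,
    "VID_CN": 38,
    "REG_DATE": 58,
    "VOD_CRS_NM": 58,
    "Unnamed: 5": 58,
}

cells, share = missing_cell_share(missing, n_rows=58)
print(cells, f"{share:.1f}%")  # prints: 275 79.0%
```

With 58 rows and 6 columns there are 348 cells, so 275 missing cells correspond to the reported 79.0%.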

Variable types

Text: 3
Unsupported: 3

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/163

Alerts

Dataset has 1 (1.7%) duplicate rows [Duplicates]
RSTRC_VID_ESSN_NO has 25 (43.1%) missing values [Missing]
VID_SJ_CN has 38 (65.5%) missing values [Missing]
VID_CN has 38 (65.5%) missing values [Missing]
REG_DATE has 58 (100.0%) missing values [Missing]
VOD_CRS_NM has 58 (100.0%) missing values [Missing]
Unnamed: 5 has 58 (100.0%) missing values [Missing]
REG_DATE is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
VOD_CRS_NM is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
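
One possible cleaning step suggested by these alerts is to drop the columns that are missing in every row. The sketch below uses plain Python dictionaries; the helper names and the toy rows are hypothetical and only mimic the shape of the profiled table.

```python
def fully_missing_columns(rows, columns):
    """Return the columns that are None in every row."""
    return [c for c in columns if all(row.get(c) is None for row in rows)]


def drop_columns(rows, to_drop):
    """Return copies of the rows without the given columns."""
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


# Toy rows shaped like the profiled table; the values are illustrative only.
columns = ["RSTRC_VID_ESSN_NO", "VID_SJ_CN", "REG_DATE"]
rows = [
    {"RSTRC_VID_ESSN_NO": "1010501", "VID_SJ_CN": "title", "REG_DATE": None},
    {"RSTRC_VID_ESSN_NO": None, "VID_SJ_CN": None, "REG_DATE": None},
]

empty = fully_missing_columns(rows, columns)
cleaned = drop_columns(rows, set(empty))
print(empty)  # prints: ['REG_DATE']
```

Whether dropping is appropriate depends on why the columns are empty; in this report the three 100% missing columns may also point to a parsing problem in the source file.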

Reproduction

Analysis started: 2023-12-11 22:09:12.919547
Analysis finished: 2023-12-11 22:09:14.650093
Duration: 1.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
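
A report of this kind can typically be regenerated along the following lines. This is a sketch under assumptions: it requires pandas and ydata-profiling to be installed, the CSV path is a placeholder, and the report does not show the original invocation or the contents of config.json.

```python
def build_report(csv_path, html_path, title="Profiling report"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)
    return profile


# Example call; "sample.csv" is a placeholder, not the original file name:
# build_report("sample.csv", "report.html")
```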

Variables

RSTRC_VID_ESSN_NO
Text

MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 25
Missing (%): 43.1%
Memory size: 596.0 B

Length

Max length: 37
Median length: 32
Mean length: 21.727273
Min length: 7

Characters and Unicode

Total characters: 717
Distinct characters: 159
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
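
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one of the sample values; the helper name `category_counts` is illustrative. The two-letter codes map to the category names used in this report: Lo is Other Letter, Zs is Space Separator, Nd is Decimal Number, Po is Other Punctuation, and Sm is Math Symbol. Scripts and blocks are not available from `unicodedata` and are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values shown for RSTRC_VID_ESSN_NO.
sample = "그러나 이런 꿈들이 무조건 길몽은 아니다?!"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```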

Unique

Unique: 33
Unique (%): 100.0%

Sample

1st row: 1010501
2nd row: 그러나 이런 꿈들이 무조건 길몽은 아니다?!
3rd row: 조상 꿈 해석의 포인트가 되는 것은 과연 무엇일까?
4th row: 1010502
5th row: 돼지 꿈 중에서도 특히 이런 돼지 꿈이 대박이다?

Value | Count | Frequency (%)
돼지 | 5 | 2.7%
(not preserved) | 5 | 2.7%
(not preserved) | 4 | 2.2%
과연 | 4 | 2.2%
이런 | 3 | 1.6%
하는데 | 3 | 1.6%
실제 | 3 | 1.6%
것은 | 3 | 1.6%
중에서도 | 3 | 1.6%
아이의 | 2 | 1.1%
Other values (134) | 147 | 80.8%

Most occurring characters

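Value | Count | Frequency (%)
(space) | 152 | 21.2%
(not preserved) | 25 | 3.5%
0 | 25 | 3.5%
1 | 25 | 3.5%
(not preserved) | 18 | 2.5%
(not preserved) | 15 | 2.1%
(not preserved) | 15 | 2.1%
(not preserved) | 15 | 2.1%
. | 12 | 1.7%
(not preserved) | 11 | 1.5%
Other values (149) | 404 | 56.3%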

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 461 | 64.3%
Space Separator | 152 | 21.2%
Decimal Number | 71 | 9.9%
Other Punctuation | 31 | 4.3%
Math Symbol | 2 | 0.3%

Most frequent character per category

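Other Letter
Value | Count | Frequency (%)
(not preserved) | 25 | 5.4%
(not preserved) | 18 | 3.9%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 11 | 2.4%
(not preserved) | 10 | 2.2%
(not preserved) | 9 | 2.0%
(not preserved) | 9 | 2.0%
(not preserved) | 8 | 1.7%
Other values (136) | 326 | 70.7%

Decimal Number
Value | Count | Frequency (%)
0 | 25 | 35.2%
1 | 25 | 35.2%
5 | 11 | 15.5%
2 | 5 | 7.0%
8 | 2 | 2.8%
7 | 1 | 1.4%
6 | 1 | 1.4%
9 | 1 | 1.4%

Other Punctuation
Value | Count | Frequency (%)
. | 12 | 38.7%
? | 11 | 35.5%
! | 8 | 25.8%

Space Separator
Value | Count | Frequency (%)
(space) | 152 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%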

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 461 | 64.3%
Common | 256 | 35.7%

Most frequent character per script

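Hangul
Value | Count | Frequency (%)
(not preserved) | 25 | 5.4%
(not preserved) | 18 | 3.9%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 11 | 2.4%
(not preserved) | 10 | 2.2%
(not preserved) | 9 | 2.0%
(not preserved) | 9 | 2.0%
(not preserved) | 8 | 1.7%
Other values (136) | 326 | 70.7%

Common
Value | Count | Frequency (%)
(space) | 152 | 59.4%
0 | 25 | 9.8%
1 | 25 | 9.8%
. | 12 | 4.7%
? | 11 | 4.3%
5 | 11 | 4.3%
! | 8 | 3.1%
2 | 5 | 2.0%
8 | 2 | 0.8%
~ | 2 | 0.8%
Other values (3) | 3 | 1.2%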

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 461 | 64.3%
ASCII | 256 | 35.7%

Most frequent character per block

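ASCII
Value | Count | Frequency (%)
(space) | 152 | 59.4%
0 | 25 | 9.8%
1 | 25 | 9.8%
. | 12 | 4.7%
? | 11 | 4.3%
5 | 11 | 4.3%
! | 8 | 3.1%
2 | 5 | 2.0%
8 | 2 | 0.8%
~ | 2 | 0.8%
Other values (3) | 3 | 1.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 25 | 5.4%
(not preserved) | 18 | 3.9%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 15 | 3.3%
(not preserved) | 11 | 2.4%
(not preserved) | 10 | 2.2%
(not preserved) | 9 | 2.0%
(not preserved) | 9 | 2.0%
(not preserved) | 8 | 1.7%
Other values (136) | 326 | 70.7%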

VID_SJ_CN
Text

MISSING 

Distinct: 11
Distinct (%): 55.0%
Missing: 38
Missing (%): 65.5%
Memory size: 596.0 B

Length

Max length: 31
Median length: 30
Mean length: 16.95
Min length: 8

Characters and Unicode

Total characters: 339
Distinct characters: 116
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 50.0%

Sample

1st row: 길몽인 줄 알았던 조상 꿈, 무조건 길몽은 아니다?!
2nd row: 20160104
3rd row: 새해맞이 꿈 풀이, 이런 돼지 꿈이 대박이다?
4th row: 20160104
5th row: 많은 동물 중 왜 하필 돼지 꿈이 길몽일까?

Value | Count | Frequency (%)
20160104 | 10 | 12.2%
꿈이 | 3 | 3.7%
길몽은 | 2 | 2.4%
(not preserved) | 2 | 2.4%
있다 | 2 | 2.4%
돼지 | 2 | 2.4%
풀이 | 2 | 2.4%
(not preserved) | 2 | 2.4%
할까 | 1 | 1.2%
태어날 | 1 | 1.2%
Other values (55) | 55 | 67.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 62 | 18.3%
0 | 30 | 8.8%
1 | 21 | 6.2%
2 | 10 | 2.9%
6 | 10 | 2.9%
4 | 10 | 2.9%
(not preserved) | 10 | 2.9%
? | 8 | 2.4%
(not preserved) | 7 | 2.1%
! | 6 | 1.8%
Other values (106) | 165 | 48.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 178 | 52.5%
Decimal Number | 81 | 23.9%
Space Separator | 62 | 18.3%
Other Punctuation | 17 | 5.0%
Math Symbol | 1 | 0.3%

Most frequent character per category

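Other Letter
Value | Count | Frequency (%)
(not preserved) | 10 | 5.6%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 3 | 1.7%
(not preserved) | 3 | 1.7%
Other values (96) | 124 | 69.7%

Decimal Number
Value | Count | Frequency (%)
0 | 30 | 37.0%
1 | 21 | 25.9%
2 | 10 | 12.3%
6 | 10 | 12.3%
4 | 10 | 12.3%

Other Punctuation
Value | Count | Frequency (%)
? | 8 | 47.1%
! | 6 | 35.3%
, | 3 | 17.6%

Space Separator
Value | Count | Frequency (%)
(space) | 62 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%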

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 178 | 52.5%
Common | 161 | 47.5%

Most frequent character per script

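Hangul
Value | Count | Frequency (%)
(not preserved) | 10 | 5.6%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 3 | 1.7%
(not preserved) | 3 | 1.7%
Other values (96) | 124 | 69.7%

Common
Value | Count | Frequency (%)
(space) | 62 | 38.5%
0 | 30 | 18.6%
1 | 21 | 13.0%
2 | 10 | 6.2%
6 | 10 | 6.2%
4 | 10 | 6.2%
? | 8 | 5.0%
! | 6 | 3.7%
, | 3 | 1.9%
~ | 1 | 0.6%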

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 178 | 52.5%
ASCII | 161 | 47.5%

Most frequent character per block

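ASCII
Value | Count | Frequency (%)
(space) | 62 | 38.5%
0 | 30 | 18.6%
1 | 21 | 13.0%
2 | 10 | 6.2%
6 | 10 | 6.2%
4 | 10 | 6.2%
? | 8 | 5.0%
! | 6 | 3.7%
, | 3 | 1.9%
~ | 1 | 0.6%

Hangul
Value | Count | Frequency (%)
(not preserved) | 10 | 5.6%
(not preserved) | 7 | 3.9%
(not preserved) | 6 | 3.4%
(not preserved) | 6 | 3.4%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 5 | 2.8%
(not preserved) | 4 | 2.2%
(not preserved) | 3 | 1.7%
(not preserved) | 3 | 1.7%
Other values (96) | 124 | 69.7%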

VID_CN
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 38
Missing (%): 65.5%
Memory size: 596.0 B

Length

Max length: 82
Median length: 60
Mean length: 54.25
Min length: 20

Characters and Unicode

Total characters: 1085
Distinct characters: 143
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 흔히 길몽이라 알려진 조상 꿈과 돼지 꿈!
2nd row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1010501
3rd row: 꿈에서 보기만 해도 행복해지는 돼지!
4th row: http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1010502
5th row: 많은 사람이 길몽이라 알고 있는 돼지 꿈.

Value | Count | Frequency (%)
돼지 | 3 | 3.6%
길몽이라 | 2 | 2.4%
(not preserved) | 2 | 2.4%
사람이 | 2 | 2.4%
흔히 | 1 | 1.2%
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=21&content_id=1010525 | 1 | 1.2%
http://www.mbn.co.kr/player/moviecontents.mbn?content_cls_cd=21&content_id=1010526 | 1 | 1.2%
악몽 | 1 | 1.2%
괴롭히는 | 1 | 1.2%
잠자리를 | 1 | 1.2%
Other values (69) | 69 | 82.1%

Most occurring characters

Value | Count | Frequency (%)
t | 80 | 7.4%
n | 80 | 7.4%
(space) | 64 | 5.9%
e | 50 | 4.6%
o | 50 | 4.6%
c | 50 | 4.6%
. | 42 | 3.9%
/ | 40 | 3.7%
1 | 35 | 3.2%
m | 30 | 2.8%
Other values (133) | 564 | 52.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 560 | 51.6%
Other Letter | 188 | 17.3%
Other Punctuation | 121 | 11.2%
Decimal Number | 91 | 8.4%
Space Separator | 64 | 5.9%
Connector Punctuation | 30 | 2.8%
Math Symbol | 21 | 1.9%
Uppercase Letter | 10 | 0.9%

Most frequent character per category

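Other Letter
Value | Count | Frequency (%)
(not preserved) | 7 | 3.7%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.7%
(not preserved) | 5 | 2.7%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
Other values (94) | 135 | 71.8%

Lowercase Letter
Value | Count | Frequency (%)
t | 80 | 14.3%
n | 80 | 14.3%
e | 50 | 8.9%
o | 50 | 8.9%
c | 50 | 8.9%
m | 30 | 5.4%
w | 30 | 5.4%
r | 20 | 3.6%
p | 20 | 3.6%
b | 20 | 3.6%
Other values (9) | 130 | 23.2%

Decimal Number
Value | Count | Frequency (%)
1 | 35 | 38.5%
0 | 25 | 27.5%
2 | 15 | 16.5%
5 | 11 | 12.1%
8 | 2 | 2.2%
7 | 1 | 1.1%
6 | 1 | 1.1%
9 | 1 | 1.1%

Other Punctuation
Value | Count | Frequency (%)
. | 42 | 34.7%
/ | 40 | 33.1%
? | 12 | 9.9%
: | 10 | 8.3%
& | 10 | 8.3%
! | 5 | 4.1%
, | 2 | 1.7%

Math Symbol
Value | Count | Frequency (%)
= | 20 | 95.2%
~ | 1 | 4.8%

Space Separator
Value | Count | Frequency (%)
(space) | 64 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 30 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 10 | 100.0%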

Most occurring scripts

Value | Count | Frequency (%)
Latin | 570 | 52.5%
Common | 327 | 30.1%
Hangul | 188 | 17.3%

Most frequent character per script

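Hangul
Value | Count | Frequency (%)
(not preserved) | 7 | 3.7%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.7%
(not preserved) | 5 | 2.7%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
Other values (94) | 135 | 71.8%

Latin
Value | Count | Frequency (%)
t | 80 | 14.0%
n | 80 | 14.0%
e | 50 | 8.8%
o | 50 | 8.8%
c | 50 | 8.8%
m | 30 | 5.3%
w | 30 | 5.3%
r | 20 | 3.5%
p | 20 | 3.5%
b | 20 | 3.5%
Other values (10) | 140 | 24.6%

Common
Value | Count | Frequency (%)
(space) | 64 | 19.6%
. | 42 | 12.8%
/ | 40 | 12.2%
1 | 35 | 10.7%
_ | 30 | 9.2%
0 | 25 | 7.6%
= | 20 | 6.1%
2 | 15 | 4.6%
? | 12 | 3.7%
5 | 11 | 3.4%
Other values (9) | 33 | 10.1%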

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 897 | 82.7%
Hangul | 188 | 17.3%

Most frequent character per block

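ASCII
Value | Count | Frequency (%)
t | 80 | 8.9%
n | 80 | 8.9%
(space) | 64 | 7.1%
e | 50 | 5.6%
o | 50 | 5.6%
c | 50 | 5.6%
. | 42 | 4.7%
/ | 40 | 4.5%
1 | 35 | 3.9%
m | 30 | 3.3%
Other values (29) | 376 | 41.9%

Hangul
Value | Count | Frequency (%)
(not preserved) | 7 | 3.7%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 6 | 3.2%
(not preserved) | 5 | 2.7%
(not preserved) | 5 | 2.7%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
(not preserved) | 4 | 2.1%
Other values (94) | 135 | 71.8%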

REG_DATE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 58
Missing (%): 100.0%
Memory size: 654.0 B

VOD_CRS_NM
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 58
Missing (%): 100.0%
Memory size: 654.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 58
Missing (%): 100.0%
Memory size: 654.0 B

Correlations

Variable | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN
RSTRC_VID_ESSN_NO | 1.000 | 1.000 | 1.000
VID_SJ_CN | 1.000 | 1.000 | 1.000
VID_CN | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
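
A common way to quantify nullity correlation is the Pearson correlation between the present/missing indicators of two columns. The sketch below follows that usual definition in plain Python; it is not necessarily the exact computation performed by ydata-profiling, and the function name and toy columns are illustrative.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: the first two are missing in the same rows, the third never has data.
ids = ["1010501", None, "1010502", None]
titles = ["a", None, "b", None]
empty = [None, None, None, None]

same_pattern = nullity_correlation(ids, titles)
undefined = nullity_correlation(ids, empty)
print(same_pattern, undefined)
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present where the other is missing. Columns such as REG_DATE that are missing in every row have no variance, so no correlation can be computed for them.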

Sample

Index | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 1010501 | 길몽인 줄 알았던 조상 꿈, 무조건 길몽은 아니다?! | 흔히 길몽이라 알려진 조상 꿈과 돼지 꿈! | <NA> | <NA> | <NA>
2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 그러나 이런 꿈들이 무조건 길몽은 아니다?! | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 조상 꿈 해석의 포인트가 되는 것은 과연 무엇일까? | 20160104 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1010501 | <NA> | <NA> | <NA>
6 | 1010502 | 새해맞이 꿈 풀이, 이런 돼지 꿈이 대박이다? | 꿈에서 보기만 해도 행복해지는 돼지! | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 돼지 꿈 중에서도 특히 이런 돼지 꿈이 대박이다? | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Index | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | REG_DATE | VOD_CRS_NM | Unnamed: 5
48 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
49 | 그런데 이런 길몽도 효력과 효능이 제각각이라고 하는데... | 20160104 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1010527 | <NA> | <NA> | <NA>
50 | 1010528 | 다른 사람의 길몽, 비싸게 주고 사세요~ | 다른 사람이 꾼 길몽, 최대한 빠르고 비싸게 사는 게 좋다고 하는데~ | <NA> | <NA> | <NA>
51 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
52 | 그중에서도 태몽은 꼭 돈을 주고 사야 한다? | <NA> | <NA> | <NA> | <NA> | <NA>
53 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
54 | 꿈을 꾼 후 내용을 발설하는 것은 그 운이 달아난다? | <NA> | <NA> | <NA> | <NA> | <NA>
55 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
56 | 꿈 내용을 얘기하지 말라는 것은 모두 속설이라고 하는데... | 20160104 | http://www.mbn.co.kr/player/movieContents.mbn?content_cls_cd=21&content_id=1010528 | <NA> | <NA> | <NA>
57 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | RSTRC_VID_ESSN_NO | VID_SJ_CN | VID_CN | # duplicates
0 | <NA> | <NA> | <NA> | 25
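
The Overview reports 1 duplicate row while this table reports 25 duplicates. The two figures appear to count different things: one distinct row pattern (all values missing) that occurs 25 times. A minimal sketch of that distinction in plain Python, with a hypothetical helper name and toy rows:

```python
from collections import Counter


def duplicated_rows(rows):
    """Return {row: occurrences} for every row that occurs more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy table: three fully missing rows and one populated row (illustrative values).
rows = [
    (None, None, None),
    ("1010501", "title", "description"),
    (None, None, None),
    (None, None, None),
]

dupes = duplicated_rows(rows)
print(len(dupes), dupes)
```

Here `len(dupes)` is the number of distinct duplicated rows, and each value is how often that row occurs.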