Overview

Dataset statistics

Number of variables: 9
Number of observations: 124
Missing cells: 959
Missing cells (%): 85.9%
Duplicate rows: 4
Duplicate rows (%): 3.2%
Total size in memory: 9.5 KiB
Average record size in memory: 78.1 B

Variable types

Text: 4
Numeric: 4
Unsupported: 1

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/29824

Alerts

Dataset has 4 (3.2%) duplicate rows [Duplicates]
play_sec is highly overall correlated with play_hour and 1 other field [High correlation]
play_hour is highly overall correlated with play_sec and 1 other field [High correlation]
file_size is highly overall correlated with play_sec and 1 other field [High correlation]
vod_seq_no has 37 (29.8%) missing values [Missing]
bcast_seq_no has 114 (91.9%) missing values [Missing]
play_sec has 114 (91.9%) missing values [Missing]
play_hour has 114 (91.9%) missing values [Missing]
file_size has 114 (91.9%) missing values [Missing]
vod_path has 114 (91.9%) missing values [Missing]
title has 114 (91.9%) missing values [Missing]
contents has 114 (91.9%) missing values [Missing]
Unnamed: 8 has 124 (100.0%) missing values [Missing]
Unnamed: 8 is an unsupported type; check if it needs cleaning or further analysis [Unsupported]
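
The missing-cell total in the overview can be cross-checked against the per-variable alerts. The following is a minimal sketch that uses only the counts reported above; the name `missing_per_variable` is illustrative and not part of the report.

```python
# Per-variable missing counts, copied from the alerts in this report.
missing_per_variable = {
    "vod_seq_no": 37,
    "bcast_seq_no": 114,
    "play_sec": 114,
    "play_hour": 114,
    "file_size": 114,
    "vod_path": 114,
    "title": 114,
    "contents": 114,
    "Unnamed: 8": 124,
}

n_observations = 124
n_variables = len(missing_per_variable)  # 9 variables

# Missing cells are the sum over variables; the percentage is taken
# over all cells (observations x variables).
missing_cells = sum(missing_per_variable.values())
total_cells = n_observations * n_variables
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 959
print(round(missing_pct, 1))  # 85.9
```

Both values agree with the overview, which suggests the percentage is computed over all 9 x 124 cells, including the fully empty Unnamed: 8 column.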

Reproduction

Analysis started: 2024-03-11 03:30:33.922587
Analysis finished: 2024-03-11 03:30:37.290159
Duration: 3.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
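
The reported duration follows directly from the two timestamps. A short sketch of that arithmetic, using only the Python standard library:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-11 03:30:33.922587", FMT)
finished = datetime.strptime("2024-03-11 03:30:37.290159", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")  # 3.37 seconds
```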

Variables

vod_seq_no
Text

MISSING 

Distinct: 81
Distinct (%): 93.1%
Missing: 37
Missing (%): 29.8%
Memory size: 1.1 KiB

Length

Max length: 93
Median length: 58
Mean length: 32.678161
Min length: 6

Characters and Unicode

Total characters: 2843
Distinct characters: 359
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
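
As an illustration of those character properties, the sketch below counts characters by Unicode general category with the standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the sample string is a title value from this dataset; the two-letter codes correspond to the category names used in this report (`Lo` = Other Letter, `Zs` = Space Separator, `Sm` = Math Symbol).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (for example 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# A title value from this dataset: angle brackets, Hangul syllables, spaces.
sample = "<뉴스 파이터> 오프닝"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

For this string the Hangul syllables fall under Other Letter, the spaces under Space Separator, and the angle brackets under Math Symbol, which is why `<` and `>` appear in the Math Symbol tables of the text variables.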

Unique

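Unique: 78
Unique (%): 89.7%

Sample

1st row: 557892
2nd row: The news that digs into the news and breaks open the issues,
3rd row: News Fighter..
4th row: I'm your host, Choi Jung-rak.
5th row: Let me introduce the guests who will thoroughly dissect the news with me.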
Value                 Count  Frequency (%)
(not shown)           12     1.8%
눈이 (snow)            8      1.2%
조사를 (investigation)  7      1.1%
뉴스 (news)            7      1.1%
(not shown)           7      1.1%
(not shown)           7      1.1%
것으로 (as, that)       7      1.1%
(not shown)           7      1.1%
보입니다 (appears)      6      0.9%
소환 (summons)         6      0.9%
Other values (458)    584    88.8%

Most occurring characters

ValueCountFrequency (%)
644
 
22.7%
71
 
2.5%
. 60
 
2.1%
57
 
2.0%
46
 
1.6%
40
 
1.4%
35
 
1.2%
33
 
1.2%
31
 
1.1%
30
 
1.1%
Other values (349) 1796
63.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1965
69.1%
Space Separator 644
 
22.7%
Other Punctuation 92
 
3.2%
Decimal Number 77
 
2.7%
Control 19
 
0.7%
Math Symbol 11
 
0.4%
Open Punctuation 10
 
0.4%
Close Punctuation 10
 
0.4%
Lowercase Letter 6
 
0.2%
Uppercase Letter 5
 
0.2%
Other values (2) 4
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
71
 
3.6%
57
 
2.9%
46
 
2.3%
40
 
2.0%
35
 
1.8%
33
 
1.7%
31
 
1.6%
30
 
1.5%
29
 
1.5%
27
 
1.4%
Other values (317) 1566
79.7%
Decimal Number
ValueCountFrequency (%)
5 23
29.9%
8 15
19.5%
2 12
15.6%
1 9
 
11.7%
4 5
 
6.5%
0 4
 
5.2%
3 4
 
5.2%
7 2
 
2.6%
9 2
 
2.6%
6 1
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 60
65.2%
, 21
 
22.8%
? 5
 
5.4%
" 4
 
4.3%
! 2
 
2.2%
Uppercase Letter
ValueCountFrequency (%)
O 2
40.0%
B 1
20.0%
M 1
20.0%
N 1
20.0%
Math Symbol
ValueCountFrequency (%)
> 6
54.5%
< 4
36.4%
~ 1
 
9.1%
Open Punctuation
ValueCountFrequency (%)
7
70.0%
( 3
30.0%
Close Punctuation
ValueCountFrequency (%)
7
70.0%
) 3
30.0%
Lowercase Letter
ValueCountFrequency (%)
m 3
50.0%
c 3
50.0%
Space Separator
ValueCountFrequency (%)
644
100.0%
Control
ValueCountFrequency (%)
19
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1965
69.1%
Common 867
30.5%
Latin 11
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
71
 
3.6%
57
 
2.9%
46
 
2.3%
40
 
2.0%
35
 
1.8%
33
 
1.7%
31
 
1.6%
30
 
1.5%
29
 
1.5%
27
 
1.4%
Other values (317) 1566
79.7%
Common
ValueCountFrequency (%)
644
74.3%
. 60
 
6.9%
5 23
 
2.7%
, 21
 
2.4%
19
 
2.2%
8 15
 
1.7%
2 12
 
1.4%
1 9
 
1.0%
7
 
0.8%
7
 
0.8%
Other values (16) 50
 
5.8%
Latin
ValueCountFrequency (%)
m 3
27.3%
c 3
27.3%
O 2
18.2%
B 1
 
9.1%
M 1
 
9.1%
N 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1965
69.1%
ASCII 860
30.2%
None 14
 
0.5%
Punctuation 4
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
644
74.9%
. 60
 
7.0%
5 23
 
2.7%
, 21
 
2.4%
19
 
2.2%
8 15
 
1.7%
2 12
 
1.4%
1 9
 
1.0%
> 6
 
0.7%
? 5
 
0.6%
Other values (18) 46
 
5.3%
Hangul
ValueCountFrequency (%)
71
 
3.6%
57
 
2.9%
46
 
2.3%
40
 
2.0%
35
 
1.8%
33
 
1.7%
31
 
1.6%
30
 
1.5%
29
 
1.5%
27
 
1.4%
Other values (317) 1566
79.7%
None
ValueCountFrequency (%)
7
50.0%
7
50.0%
Punctuation
ValueCountFrequency (%)
2
50.0%
2
50.0%

bcast_seq_no
Real number (ℝ)

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 114
Missing (%): 91.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1085455.9
Minimum: 1085286
Maximum: 1085569
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1085286
5th percentile: 1085319.8
Q1: 1085379.2
Median: 1085494.5
Q3: 1085496.8
95th percentile: 1085568.6
Maximum: 1085569
Range: 283
Interquartile range (IQR): 117.5

Descriptive statistics

Standard deviation: 93.657117
Coefficient of variation (CV): 8.6283668 × 10^-5
Kurtosis: -0.58388648
Mean: 1085455.9
Median Absolute Deviation (MAD): 68.5
Skewness: -0.59137155
Sum: 10854559
Variance: 8771.6556
Monotonicity: Strictly increasing
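
The monotonicity flag and the very small coefficient of variation can both be reproduced from the ten non-missing values. This is a sketch under one assumption: the values are listed here in row order, which the report does not state explicitly but which is consistent with the "Strictly increasing" flag.

```python
import statistics

# The ten non-missing bcast_seq_no values, assumed to be in row order.
bcast_seq_no = [1085286, 1085361, 1085362, 1085431, 1085494,
                1085495, 1085496, 1085497, 1085568, 1085569]

# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(bcast_seq_no, bcast_seq_no[1:]))

# Coefficient of variation = sample standard deviation / mean.
mean = statistics.mean(bcast_seq_no)
std = statistics.stdev(bcast_seq_no)  # divides by n - 1
cv = std / mean

print(strictly_increasing)
print(f"{cv:.4e}")
```

The CV is tiny because the identifiers vary by a few hundred around a mean above one million; for sequence numbers like these it carries little analytical meaning.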
Histogram with fixed size bins (bins=10)

Common values

Value      Count  Frequency (%)
1085286    1      0.8%
1085361    1      0.8%
1085362    1      0.8%
1085431    1      0.8%
1085494    1      0.8%
1085495    1      0.8%
1085496    1      0.8%
1085497    1      0.8%
1085568    1      0.8%
1085569    1      0.8%
(Missing)  114    91.9%

Minimum 10 values

Value    Count  Frequency (%)
1085286  1      0.8%
1085361  1      0.8%
1085362  1      0.8%
1085431  1      0.8%
1085494  1      0.8%
1085495  1      0.8%
1085496  1      0.8%
1085497  1      0.8%
1085568  1      0.8%
1085569  1      0.8%

Maximum 10 values

Value    Count  Frequency (%)
1085569  1      0.8%
1085568  1      0.8%
1085497  1      0.8%
1085496  1      0.8%
1085495  1      0.8%
1085494  1      0.8%
1085431  1      0.8%
1085362  1      0.8%
1085361  1      0.8%
1085286  1      0.8%

play_sec
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 114
Missing (%): 91.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 239.6
Minimum: 27
Maximum: 1079
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 27
5th percentile: 29.7
Q1: 37
Median: 55
Q3: 207
95th percentile: 893.6
Maximum: 1079
Range: 1052
Interquartile range (IQR): 170
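
The quantile statistics above are consistent with percentiles computed by linear interpolation between neighbouring ranks (the default in NumPy and pandas). That the report uses this method is an inference from the numbers, not something it states. A self-contained sketch, with `percentile` as an illustrative helper:

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted data.
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# The ten non-missing play_sec values from this report.
play_sec = [59, 667, 186, 33, 34, 1079, 46, 214, 27, 51]

p5 = percentile(play_sec, 5)
q1 = percentile(play_sec, 25)
median = percentile(play_sec, 50)
q3 = percentile(play_sec, 75)
p95 = percentile(play_sec, 95)
iqr = q3 - q1

print(q1, median, q3, iqr)
```

With only ten observations, the 5th and 95th percentiles are interpolated between the two smallest and the two largest values, so they should be read as rough indications rather than stable estimates.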

Descriptive statistics

Standard deviation: 353.80666
Coefficient of variation (CV): 1.4766555
Kurtosis: 3.171038
Mean: 239.6
Median Absolute Deviation (MAD): 25
Skewness: 1.9517025
Sum: 2396
Variance: 125179.16
Monotonicity: Not monotonic
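
The variance, standard deviation, and MAD above match the sample (n - 1) definitions and the median of absolute deviations from the median. The sketch below reproduces them with the standard `statistics` module; treating the report's figures as sample statistics is an inference from the matching values.

```python
import statistics

# The ten non-missing play_sec values from this report.
play_sec = [59, 667, 186, 33, 34, 1079, 46, 214, 27, 51]

variance = statistics.variance(play_sec)  # sample variance, divides by n - 1
std = statistics.stdev(play_sec)          # square root of the sample variance

# Median Absolute Deviation: median of |x - median(x)|.
median = statistics.median(play_sec)
mad = statistics.median([abs(x - median) for x in play_sec])

print(round(variance, 2), round(std, 5), mad)
```

The gap between the standard deviation (about 354) and the MAD (25) reflects the two long clips of 667 and 1079 seconds, which dominate the variance but barely move the median-based measure.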
Histogram with fixed size bins (bins=10)

Common values

Value      Count  Frequency (%)
59         1      0.8%
667        1      0.8%
186        1      0.8%
33         1      0.8%
34         1      0.8%
1079       1      0.8%
46         1      0.8%
214        1      0.8%
27         1      0.8%
51         1      0.8%
(Missing)  114    91.9%

Minimum 10 values

Value  Count  Frequency (%)
27     1      0.8%
33     1      0.8%
34     1      0.8%
46     1      0.8%
51     1      0.8%
59     1      0.8%
186    1      0.8%
214    1      0.8%
667    1      0.8%
1079   1      0.8%

Maximum 10 values

Value  Count  Frequency (%)
1079   1      0.8%
667    1      0.8%
214    1      0.8%
186    1      0.8%
59     1      0.8%
51     1      0.8%
46     1      0.8%
34     1      0.8%
33     1      0.8%
27     1      0.8%

play_hour
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 114
Missing (%): 91.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06656
Minimum: 0.0075
Maximum: 0.2997
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 0.0075
5th percentile: 0.008265
Q1: 0.01025
Median: 0.0153
Q3: 0.057475
95th percentile: 0.24822
Maximum: 0.2997
Range: 0.2922
Interquartile range (IQR): 0.047225

Descriptive statistics

Standard deviation: 0.098273306
Coefficient of variation (CV): 1.4764619
Kurtosis: 3.1700556
Mean: 0.06656
Median Absolute Deviation (MAD): 0.00695
Skewness: 1.9515842
Sum: 0.6656
Variance: 0.0096576427
Monotonicity: Not monotonic
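
The play_hour statistics mirror those of play_sec because the values appear to be play_sec divided by 3600 and rounded to four decimals. This relationship is inferred from the reported values, not documented by the dataset. A quick check:

```python
# Non-missing values of both variables, as listed in this report.
play_sec = [59, 667, 186, 33, 34, 1079, 46, 214, 27, 51]
play_hour = [0.0164, 0.1853, 0.0517, 0.0092, 0.0094,
             0.2997, 0.0128, 0.0594, 0.0075, 0.0142]

# Convert seconds to hours and round to the four decimals seen in the data.
derived = [round(sec / 3600, 4) for sec in play_sec]

# Compare in sorted order so the check does not depend on row order.
matches = all(
    abs(d - h) < 1e-9 for d, h in zip(sorted(derived), sorted(play_hour))
)

print(matches)
```

If the relationship holds for the full dataset, one of the two columns is redundant, which also explains the High correlation alerts raised for play_sec and play_hour.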
Histogram with fixed size bins (bins=10)

Common values

Value      Count  Frequency (%)
0.0164     1      0.8%
0.1853     1      0.8%
0.0517     1      0.8%
0.0092     1      0.8%
0.0094     1      0.8%
0.2997     1      0.8%
0.0128     1      0.8%
0.0594     1      0.8%
0.0075     1      0.8%
0.0142     1      0.8%
(Missing)  114    91.9%

Minimum 10 values

Value   Count  Frequency (%)
0.0075  1      0.8%
0.0092  1      0.8%
0.0094  1      0.8%
0.0128  1      0.8%
0.0142  1      0.8%
0.0164  1      0.8%
0.0517  1      0.8%
0.0594  1      0.8%
0.1853  1      0.8%
0.2997  1      0.8%

Maximum 10 values

Value   Count  Frequency (%)
0.2997  1      0.8%
0.1853  1      0.8%
0.0594  1      0.8%
0.0517  1      0.8%
0.0164  1      0.8%
0.0142  1      0.8%
0.0128  1      0.8%
0.0094  1      0.8%
0.0092  1      0.8%
0.0075  1      0.8%

file_size
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 114
Missing (%): 91.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 38938598
Minimum: 4447308
Maximum: 1.6952648 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 4447308
5th percentile: 4888862.4
Q1: 6404105.8
Median: 8935672.5
Q3: 34356230
95th percentile: 1.434459 × 10^8
Maximum: 1.6952648 × 10^8
Range: 1.6507917 × 10^8
Interquartile range (IQR): 27952125

Descriptive statistics

Standard deviation: 56339606
Coefficient of variation (CV): 1.4468833
Kurtosis: 2.7024016
Mean: 38938598
Median Absolute Deviation (MAD): 3997748.5
Skewness: 1.8724102
Sum: 3.8938598 × 10^8
Variance: 3.1741512 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values

Value      Count  Frequency (%)
9494006    1      0.8%
111569625  1      0.8%
31426768   1      0.8%
5428540    1      0.8%
5916616    1      0.8%
169526481  1      0.8%
7866575    1      0.8%
35332718   1      0.8%
4447308    1      0.8%
8377339    1      0.8%
(Missing)  114    91.9%

Minimum 10 values

Value      Count  Frequency (%)
4447308    1      0.8%
5428540    1      0.8%
5916616    1      0.8%
7866575    1      0.8%
8377339    1      0.8%
9494006    1      0.8%
31426768   1      0.8%
35332718   1      0.8%
111569625  1      0.8%
169526481  1      0.8%

Maximum 10 values

Value      Count  Frequency (%)
169526481  1      0.8%
111569625  1      0.8%
35332718   1      0.8%
31426768   1      0.8%
9494006    1      0.8%
8377339    1      0.8%
7866575    1      0.8%
5916616    1      0.8%
5428540    1      0.8%
4447308    1      0.8%

vod_path
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 114
Missing (%): 91.9%
Memory size: 1.1 KiB

Length

Max length: 61
Median length: 61
Mean length: 61
Min length: 61

Characters and Unicode

Total characters: 610
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row: /mbnvod2/673/2014/12/01/20141201103143_20_673_1085286_360.mp4
2nd row: /mbnvod2/673/2014/12/02/20141202104424_20_673_1085361_360.mp4
3rd row: /mbnvod2/673/2014/12/02/20141202104424_20_673_1085362_360.mp4
4th row: /mbnvod2/673/2014/12/03/20141203101507_20_673_1085431_360.mp4
5th row: /mbnvod2/673/2014/12/04/20141204101630_20_673_1085494_360.mp4
Value                                                         Count  Frequency (%)
mbnvod2/673/2014/12/01/20141201103143_20_673_1085286_360.mp4  1      10.0%
mbnvod2/673/2014/12/02/20141202104424_20_673_1085361_360.mp4  1      10.0%
mbnvod2/673/2014/12/02/20141202104424_20_673_1085362_360.mp4  1      10.0%
mbnvod2/673/2014/12/03/20141203101507_20_673_1085431_360.mp4  1      10.0%
mbnvod2/673/2014/12/04/20141204101630_20_673_1085494_360.mp4  1      10.0%
mbnvod2/673/2014/12/04/20141204102655_20_673_1085495_360.mp4  1      10.0%
mbnvod2/673/2014/12/04/20141204102655_20_673_1085496_360.mp4  1      10.0%
mbnvod2/673/2014/12/04/20141204102431_20_673_1085497_360.mp4  1      10.0%
mbnvod2/673/2014/12/05/20141205103252_20_673_1085568_360.mp4  1      10.0%
mbnvod2/673/2014/12/05/20141205102302_20_673_1085569_360.mp4  1      10.0%
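
Every vod_path value has the same 61-character layout, and the number before the final `_360` segment matches the bcast_seq_no values listed earlier in this report. The sketch below parses a path into its parts. The group names (`program`, `code`, `quality`, and so on) are hypothetical labels chosen for illustration; the dataset does not document what each segment means.

```python
import re
from datetime import datetime

# Hypothetical field names; only the overall layout is taken from the data.
VOD_PATH_RE = re.compile(
    r"^/mbnvod2/(?P<program>\d+)/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<timestamp>\d{14})_(?P<code>\d+)_(?P<program_repeat>\d+)"
    r"_(?P<bcast_seq_no>\d+)_(?P<quality>\d+)\.mp4$"
)


def parse_vod_path(path):
    """Split a vod_path into named parts; return None if it does not match."""
    match = VOD_PATH_RE.match(path)
    if match is None:
        return None
    fields = match.groupdict()
    # The 14-digit segment looks like a YYYYMMDDHHMMSS timestamp.
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y%m%d%H%M%S")
    fields["bcast_seq_no"] = int(fields["bcast_seq_no"])
    return fields


parsed = parse_vod_path(
    "/mbnvod2/673/2014/12/01/20141201103143_20_673_1085286_360.mp4"
)
print(parsed["bcast_seq_no"], parsed["timestamp"].isoformat())
```

Because the path embeds the bcast_seq_no, the two columns carry overlapping information, which fits the 1.000 entry between bcast_seq_no and vod_path in the correlation table.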

Most occurring characters

ValueCountFrequency (%)
0 83
13.6%
2 75
12.3%
1 68
11.1%
/ 60
9.8%
4 52
8.5%
3 41
 
6.7%
_ 40
 
6.6%
6 39
 
6.4%
5 23
 
3.8%
7 22
 
3.6%
Other values (10) 107
17.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 420
68.9%
Lowercase Letter 80
 
13.1%
Other Punctuation 70
 
11.5%
Connector Punctuation 40
 
6.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 83
19.8%
2 75
17.9%
1 68
16.2%
4 52
12.4%
3 41
9.8%
6 39
9.3%
5 23
 
5.5%
7 22
 
5.2%
8 12
 
2.9%
9 5
 
1.2%
Lowercase Letter
ValueCountFrequency (%)
m 20
25.0%
d 10
12.5%
o 10
12.5%
v 10
12.5%
n 10
12.5%
b 10
12.5%
p 10
12.5%
Other Punctuation
ValueCountFrequency (%)
/ 60
85.7%
. 10
 
14.3%
Connector Punctuation
ValueCountFrequency (%)
_ 40
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 530
86.9%
Latin 80
 
13.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 83
15.7%
2 75
14.2%
1 68
12.8%
/ 60
11.3%
4 52
9.8%
3 41
7.7%
_ 40
7.5%
6 39
7.4%
5 23
 
4.3%
7 22
 
4.2%
Other values (3) 27
 
5.1%
Latin
ValueCountFrequency (%)
m 20
25.0%
d 10
12.5%
o 10
12.5%
v 10
12.5%
n 10
12.5%
b 10
12.5%
p 10
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 610
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 83
13.6%
2 75
12.3%
1 68
11.1%
/ 60
9.8%
4 52
8.5%
3 41
 
6.7%
_ 40
 
6.6%
6 39
 
6.4%
5 23
 
3.8%
7 22
 
3.6%
Other values (10) 107
17.5%

title
Text

MISSING 

Distinct: 8
Distinct (%): 80.0%
Missing: 114
Missing (%): 91.9%
Memory size: 1.1 KiB

Length

Max length: 27
Median length: 26
Mean length: 17.8
Min length: 11

Characters and Unicode

Total characters: 178
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

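Unique: 7
Unique (%): 70.0%

Sample

1st row: (Digging into the Issues) Opening
2nd row: Digging into People - 'Freedom of expression vs pro-North Korea remarks' 1
3rd row: Digging into People - 'Freedom of expression vs pro-North Korea remarks' 2
4th row: <News Fighter> Opening
5th row: <News Fighter> Opening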
Value                                          Count  Frequency (%)
뉴스 (news)                                     3      6.8%
오프닝 (opening)                                 3      6.8%
파이터 (fighter)                                 3      6.8%
인물 (person)                                   3      6.8%
자유 (freedom)                                  2      4.5%
발언 (remarks)                                  2      4.5%
vs                                             2      4.5%
종북 (pro-North Korea)                          2      4.5%
파헤치기-'표현의 (digging into - 'of expression)  2      4.5%
유출 (leak)                                     1      2.3%
Other values (21)                              21     47.7%

Most occurring characters

ValueCountFrequency (%)
34
 
19.1%
7
 
3.9%
' 6
 
3.4%
5
 
2.8%
4
 
2.2%
4
 
2.2%
4
 
2.2%
4
 
2.2%
4
 
2.2%
3
 
1.7%
Other values (65) 103
57.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 117
65.7%
Space Separator 34
 
19.1%
Other Punctuation 10
 
5.6%
Math Symbol 6
 
3.4%
Lowercase Letter 4
 
2.2%
Dash Punctuation 3
 
1.7%
Decimal Number 2
 
1.1%
Close Punctuation 1
 
0.6%
Open Punctuation 1
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
6.0%
5
 
4.3%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
3
 
2.6%
3
 
2.6%
3
 
2.6%
Other values (52) 76
65.0%
Other Punctuation
ValueCountFrequency (%)
' 6
60.0%
2
 
20.0%
· 2
 
20.0%
Math Symbol
ValueCountFrequency (%)
< 3
50.0%
> 3
50.0%
Lowercase Letter
ValueCountFrequency (%)
s 2
50.0%
v 2
50.0%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Space Separator
ValueCountFrequency (%)
34
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 117
65.7%
Common 57
32.0%
Latin 4
 
2.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
6.0%
5
 
4.3%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
3
 
2.6%
3
 
2.6%
3
 
2.6%
Other values (52) 76
65.0%
Common
ValueCountFrequency (%)
34
59.6%
' 6
 
10.5%
- 3
 
5.3%
< 3
 
5.3%
> 3
 
5.3%
2
 
3.5%
· 2
 
3.5%
) 1
 
1.8%
1 1
 
1.8%
2 1
 
1.8%
Latin
ValueCountFrequency (%)
s 2
50.0%
v 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 117
65.7%
ASCII 57
32.0%
Punctuation 2
 
1.1%
None 2
 
1.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
34
59.6%
' 6
 
10.5%
- 3
 
5.3%
< 3
 
5.3%
> 3
 
5.3%
s 2
 
3.5%
v 2
 
3.5%
) 1
 
1.8%
1 1
 
1.8%
2 1
 
1.8%
Hangul
ValueCountFrequency (%)
7
 
6.0%
5
 
4.3%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
3
 
2.6%
3
 
2.6%
3
 
2.6%
Other values (52) 76
65.0%
Punctuation
ValueCountFrequency (%)
2
100.0%
None
ValueCountFrequency (%)
· 2
100.0%

contents
Text

MISSING 

Distinct: 9
Distinct (%): 90.0%
Missing: 114
Missing (%): 91.9%
Memory size: 1.1 KiB

Length

Max length: 67
Median length: 27
Mean length: 33
Min length: 6

Characters and Unicode

Total characters: 330
Distinct characters: 122
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

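Unique: 8
Unique (%): 80.0%

Sample

1st row: Hello?
2nd row: This is Digging into People, the segment that takes an in-depth look at a public figure.
3rd row: This is Digging into People, the segment that takes an in-depth look at a public figure.
4th row: Good day, viewers. Talk about the president's close aides, unofficial confidants, and younger brother is blanketing the country like the white snow that fell today.
5th row: Good day, viewers. Speculation is rampant over who leaked the document drafted at the Blue House, and why.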
Value                             Count  Frequency (%)
시청자 (viewers)                    3      4.1%
안녕하십니까 (good day)              3      4.1%
여러분 (everyone)                   3      4.1%
인물을 (person, as object)          2      2.7%
집중 (focus)                       2      2.7%
있습니다 (there is)                 2      2.7%
파헤치기입니다 (is digging into)     2      2.7%
인물 (person)                      2      2.7%
코너 (segment)                     2      2.7%
해부해보는 (dissecting)             2      2.7%
Other values (51)                 51     68.9%

Most occurring characters

ValueCountFrequency (%)
70
 
21.2%
9
 
2.7%
8
 
2.4%
. 8
 
2.4%
7
 
2.1%
5
 
1.5%
5
 
1.5%
5
 
1.5%
, 5
 
1.5%
4
 
1.2%
Other values (112) 204
61.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 235
71.2%
Space Separator 70
 
21.2%
Other Punctuation 17
 
5.2%
Math Symbol 4
 
1.2%
Decimal Number 2
 
0.6%
Open Punctuation 1
 
0.3%
Close Punctuation 1
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9
 
3.8%
8
 
3.4%
7
 
3.0%
5
 
2.1%
5
 
2.1%
5
 
2.1%
4
 
1.7%
4
 
1.7%
4
 
1.7%
4
 
1.7%
Other values (102) 180
76.6%
Other Punctuation
ValueCountFrequency (%)
. 8
47.1%
, 5
29.4%
? 2
 
11.8%
' 2
 
11.8%
Math Symbol
ValueCountFrequency (%)
< 2
50.0%
> 2
50.0%
Space Separator
ValueCountFrequency (%)
70
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Open Punctuation
ValueCountFrequency (%)
1
100.0%
Close Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 235
71.2%
Common 95
28.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9
 
3.8%
8
 
3.4%
7
 
3.0%
5
 
2.1%
5
 
2.1%
5
 
2.1%
4
 
1.7%
4
 
1.7%
4
 
1.7%
4
 
1.7%
Other values (102) 180
76.6%
Common
ValueCountFrequency (%)
70
73.7%
. 8
 
8.4%
, 5
 
5.3%
? 2
 
2.1%
< 2
 
2.1%
1 2
 
2.1%
> 2
 
2.1%
' 2
 
2.1%
1
 
1.1%
1
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 235
71.2%
ASCII 93
 
28.2%
None 2
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
70
75.3%
. 8
 
8.6%
, 5
 
5.4%
? 2
 
2.2%
< 2
 
2.2%
1 2
 
2.2%
> 2
 
2.2%
' 2
 
2.2%
Hangul
ValueCountFrequency (%)
9
 
3.8%
8
 
3.4%
7
 
3.0%
5
 
2.1%
5
 
2.1%
5
 
2.1%
4
 
1.7%
4
 
1.7%
4
 
1.7%
4
 
1.7%
Other values (102) 180
76.6%
None
ValueCountFrequency (%)
1
50.0%
1
50.0%

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 124
Missing (%): 100.0%
Memory size: 1.2 KiB

Interactions


Correlations

              vod_seq_no  bcast_seq_no  play_sec  play_hour  file_size  vod_path  title  contents
vod_seq_no    1.000       1.000         1.000     1.000      1.000      1.000     1.000  1.000
bcast_seq_no  1.000       1.000         0.000     0.000      0.000      1.000     0.000  1.000
play_sec      1.000       0.000         1.000     1.000      1.000      1.000     1.000  0.000
play_hour     1.000       0.000         1.000     1.000      1.000      1.000     1.000  0.000
file_size     1.000       0.000         1.000     1.000      1.000      1.000     1.000  0.000
vod_path      1.000       1.000         1.000     1.000      1.000      1.000     1.000  1.000
title         1.000       0.000         1.000     1.000      1.000      1.000     1.000  0.782
contents      1.000       1.000         0.000     0.000      0.000      1.000     0.782  1.000

              bcast_seq_no  play_sec  play_hour  file_size
bcast_seq_no  1.000         -0.285    -0.285     -0.285
play_sec      -0.285        1.000     1.000      1.000
play_hour     -0.285        1.000     1.000      1.000
file_size     -0.285        1.000     1.000      1.000
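
The entries of the second table are consistent with a Spearman rank correlation. The sketch below reproduces them under two assumptions that the report does not state: the "common values" lists of each variable are in row order (the first row of the Sample section agrees with this), and the coefficient is Spearman's rho. The helpers `ranks` and `spearman` are illustrative.

```python
def ranks(values):
    """Rank values from 1 to n; assumes no ties, which holds for these columns."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's rho via the no-ties formula 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Non-missing values, assumed to be aligned row by row.
bcast_seq_no = [1085286, 1085361, 1085362, 1085431, 1085494,
                1085495, 1085496, 1085497, 1085568, 1085569]
play_sec = [59, 667, 186, 33, 34, 1079, 46, 214, 27, 51]
file_size = [9494006, 111569625, 31426768, 5428540, 5916616,
             169526481, 7866575, 35332718, 4447308, 8377339]

rho_seq_vs_sec = spearman(bcast_seq_no, play_sec)
rho_sec_vs_size = spearman(play_sec, file_size)

print(round(rho_seq_vs_sec, 3), round(rho_sec_vs_size, 3))
```

A rank correlation of exactly 1.000 between play_sec and file_size means only that longer clips are always larger files in these ten rows; with so few observations, none of these coefficients should be taken as strong evidence.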

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

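First rows

(index) | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 557892 | 1085286 | 59 | 0.0164 | 9494006 | /mbnvod2/673/2014/12/01/20141201103143_20_673_1085286_360.mp4 | (Digging into the Issues) Opening | Hello? | <NA>
2 | The news that digs into the news and breaks open the issues, | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | News Fighter.. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | I'm your host, Choi Jung-rak. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | Let me introduce the guests who will thoroughly dissect the news with me. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | First, the one who will break down current affairs across politics and society, | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | news mentor Hwang Jang-su, director of the Future Management Institute, is here. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

(index) | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8
114 | <2>Besides the cold, there is frequent news of snow. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
115 | With heavy snow warnings issued day after day for the Chungcheong and Honam regions, snow will continue on and off through tonight. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
116 | In particular, heavy snow of up to 15 cm or more is forecast, mainly along the west coast and in the mountains of Jeju. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
117 | With temperatures low, snow will freeze in places, so please take extra care on the roads. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
118 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
119 | <Weekly>Sub-zero cold will continue for the time being. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
120 | Next Monday, another round of snow is expected south of Chungcheong. That was the weather. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
121 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
122 | (Weathercaster Jeon Ju-won) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
123 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>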
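
In the sample, clip metadata rows appear to be interleaved with transcript lines that landed in the vod_seq_no column, plus fully empty rows and an empty Unnamed: 8 column. The sketch below shows one possible cleanup on a few rows modelled on the sample; the helper `drop_empty` and the rule used to separate metadata from transcript rows are assumptions for illustration, not part of the dataset's documentation.

```python
COLUMNS = ["vod_seq_no", "bcast_seq_no", "play_sec", "play_hour", "file_size",
           "vod_path", "title", "contents", "Unnamed: 8"]

# Rows modelled on the first rows of the sample; None stands for <NA>.
# Text values are English translations of the Korean originals.
rows = [
    [None] * 9,
    ["557892", 1085286, 59, 0.0164, 9494006,
     "/mbnvod2/673/2014/12/01/20141201103143_20_673_1085286_360.mp4",
     "(Digging into the Issues) Opening", "Hello?", None],
    ["News Fighter..", None, None, None, None, None, None, None, None],
    [None] * 9,
]


def drop_empty(rows, columns):
    """Remove rows with no values, then columns with no values."""
    kept_rows = [row for row in rows if any(v is not None for v in row)]
    keep = [i for i in range(len(columns))
            if any(row[i] is not None for row in kept_rows)]
    cleaned = [[row[i] for i in keep] for row in kept_rows]
    return cleaned, [columns[i] for i in keep]


cleaned_rows, cleaned_columns = drop_empty(rows, COLUMNS)

# Hypothetical rule: a row is clip metadata when bcast_seq_no is present;
# otherwise it is treated as a transcript line.
bcast_idx = cleaned_columns.index("bcast_seq_no")
metadata_rows = [r for r in cleaned_rows if r[bcast_idx] is not None]
transcript_rows = [r for r in cleaned_rows if r[bcast_idx] is None]

print(cleaned_columns)
print(len(metadata_rows), len(transcript_rows))
```

Separating the two row types before profiling would likely give more meaningful statistics, since the 91.9% missing rate of most columns mainly reflects transcript rows that never had those fields.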

Duplicate rows

Most frequently occurring

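(index) | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | # duplicates
3 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 37
1 | 【 Reporter 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4
2 | 【 Anchor comment 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
0 | (Weathercaster Jeon Ju-won) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2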
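
The overview figure of 4 duplicate rows (3.2%) appears to count distinct repeated row patterns, one per line of the table above, rather than the total number of repeated rows. The sketch below rebuilds a stand-in dataset from the table's counts to illustrate this reading. The labels are placeholders, and the assumption that all remaining rows are unique is an inference (it matches the 78 unique vod_seq_no values reported earlier).

```python
from collections import Counter

# Duplicate groups from the table above: (placeholder label, occurrences).
duplicate_groups = [
    ("<all values missing>", 37),
    ("reporter tag", 4),
    ("anchor comment tag", 3),
    ("weathercaster credit", 2),
]
n_observations = 124

# Expand each group into repeated stand-in rows.
rows = [label for label, n in duplicate_groups for _ in range(n)]
# Assumption: every remaining row occurs exactly once.
rows += [f"unique row {i}" for i in range(n_observations - len(rows))]

# A pattern counts as duplicated when it occurs more than once.
counts = Counter(rows)
repeated = {row: n for row, n in counts.items() if n > 1}
duplicate_pct = 100 * len(repeated) / n_observations

print(len(repeated), round(duplicate_pct, 1))  # 4 3.2
```

Under this reading, 46 of the 124 rows belong to some repeated pattern, so the 3.2% headline figure understates how much of the table is repetitive.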