Overview

Dataset statistics

Number of variables: 9
Number of observations: 384
Missing cells: 3112
Missing cells (%): 90.0%
Duplicate rows: 2
Duplicate rows (%): 0.5%
Total size in memory: 29.0 KiB
Average record size in memory: 77.3 B
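
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers copied from this table (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_variables = 9
n_observations = 384
missing_cells = 3112
total_size_kib = 29.0

total_cells = n_variables * n_observations                  # 9 * 384 cells
missing_pct = 100 * missing_cells / total_cells             # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations   # KiB -> bytes per row

print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (90.0% and 77.3 B).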

Variable types

Text: 4
Numeric: 4
Unsupported: 1

Dataset

Description: Sample data
Author: MBN
URL: https://kdx.kr/data/view/29829

Alerts

Dataset has 2 (0.5%) duplicate rows [Duplicates]
play_sec is highly correlated overall with play_hour and 1 other field [High correlation]
play_hour is highly correlated overall with play_sec and 1 other field [High correlation]
file_size is highly correlated overall with play_sec and 1 other field [High correlation]
vod_seq_no has 110 (28.6%) missing values [Missing]
bcast_seq_no has 374 (97.4%) missing values [Missing]
play_sec has 374 (97.4%) missing values [Missing]
play_hour has 374 (97.4%) missing values [Missing]
file_size has 374 (97.4%) missing values [Missing]
vod_path has 374 (97.4%) missing values [Missing]
title has 374 (97.4%) missing values [Missing]
contents has 374 (97.4%) missing values [Missing]
Unnamed: 8 has 384 (100.0%) missing values [Missing]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
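
The duplicate-row and fully-missing-column alerts suggest a basic cleaning pass. The following is only a sketch using the standard library: the helper names and the three-row miniature dataset are hypothetical, and rows are assumed to be dictionaries with identical keys.

```python
def drop_empty_columns(rows):
    """Remove columns whose value is None or '' in every row."""
    if not rows:
        return []
    keep = [k for k in rows[0] if any(r.get(k) not in (None, "") for r in rows)]
    return [{k: r.get(k) for k in keep} for r in rows]

def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r.items())
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

# Hypothetical miniature of the dataset: one populated row, two empty rows.
rows = [
    {"vod_seq_no": "476674", "title": "뉴스있수다", "Unnamed: 8": None},
    {"vod_seq_no": None, "title": None, "Unnamed: 8": None},
    {"vod_seq_no": None, "title": None, "Unnamed: 8": None},
]
cleaned = drop_duplicate_rows(drop_empty_columns(rows))
print(cleaned)
```

Whether empty rows should be dropped entirely, rather than de-duplicated, depends on how the source file uses them as separators.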

Reproduction

Analysis started: 2024-03-11 03:19:12.256947
Analysis finished: 2024-03-11 03:19:15.596084
Duration: 3.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
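
The reported duration can be checked from the two timestamps. A small sketch with the standard library (nothing here comes from the profiling tool itself):

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-03-11 03:19:12.256947")
finished = datetime.fromisoformat("2024-03-11 03:19:15.596084")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```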

Variables

vod_seq_no
Text

MISSING 

Distinct: 272
Distinct (%): 99.3%
Missing: 110
Missing (%): 28.6%
Memory size: 3.1 KiB

Length

Max length: 360
Median length: 132
Mean length: 51.368613
Min length: 1

Characters and Unicode

Total characters: 14075
Distinct characters: 676
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
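
As a rough illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The mapping from two-letter codes to the category names used in this report is a partial, hand-written assumption; the profiling tool's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Partial mapping from Unicode general-category codes to the names
# used in this report (assumed; only a few codes are listed).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(text):
    """Count characters of `text` per Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}

# Second sample row of vod_seq_no.
sample = "귀국한 안철수 전 서울대 교수가"
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the text columns in this dataset.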

Unique

Unique: 271
Unique (%): 98.9%

Sample

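1st row: 476674
2nd row: 귀국한 안철수 전 서울대 교수가 [Former Seoul National University professor Ahn Cheol-soo, who has returned to Korea,]
3rd row: 서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요. [is said to have decided to run himself in the National Assembly by-election for Seoul's Nowon-byeong district.]
4th row: 매일경제신문 정치부 이상훈 차장과 함께 [Together with Lee Sang-hoon, deputy editor of the politics desk at Maeil Business Newspaper,]
5th row: 자세한 정치권 소식 알아보겠습니다. [we will take a closer look at the political news.]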
Value                Count  Frequency (%)
(not shown)          17     0.5%
오늘                 14     0.4%
(not shown)          10     0.3%
(not shown)          10     0.3%
당시                 10     0.3%
(not shown)          10     0.3%
(not shown)          10     0.3%
(not shown)          10     0.3%
어떻게               9      0.3%
안철수               9      0.3%
Other values (2365)  3167   96.7%

Most occurring characters

ValueCountFrequency (%)
3246
 
23.1%
313
 
2.2%
. 285
 
2.0%
231
 
1.6%
225
 
1.6%
162
 
1.2%
156
 
1.1%
151
 
1.1%
151
 
1.1%
148
 
1.1%
Other values (666) 9007
64.0%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       9608   68.3%
Space Separator    3247   23.1%
Other Punctuation  574    4.1%
Decimal Number     372    2.6%
Math Symbol        77     0.5%
Control            62     0.4%
Uppercase Letter   37     0.3%
Close Punctuation  23     0.2%
Open Punctuation   23     0.2%
Dash Punctuation   18     0.1%
Other values (4)   34     0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
313
 
3.3%
231
 
2.4%
225
 
2.3%
162
 
1.7%
156
 
1.6%
151
 
1.6%
151
 
1.6%
148
 
1.5%
148
 
1.5%
137
 
1.4%
Other values (610) 7786
81.0%
Uppercase Letter
ValueCountFrequency (%)
C 7
18.9%
B 5
13.5%
G 4
10.8%
D 3
8.1%
W 3
8.1%
Y 2
 
5.4%
P 2
 
5.4%
N 2
 
5.4%
M 2
 
5.4%
T 1
 
2.7%
Other values (6) 6
16.2%
Other Punctuation
ValueCountFrequency (%)
. 285
49.7%
, 118
20.6%
? 75
 
13.1%
' 24
 
4.2%
% 23
 
4.0%
! 15
 
2.6%
" 15
 
2.6%
/ 11
 
1.9%
5
 
0.9%
# 2
 
0.3%
Decimal Number
ValueCountFrequency (%)
2 64
17.2%
0 58
15.6%
1 50
13.4%
4 37
9.9%
7 34
9.1%
5 32
8.6%
6 32
8.6%
3 32
8.6%
8 18
 
4.8%
9 15
 
4.0%
Math Symbol
ValueCountFrequency (%)
= 43
55.8%
> 15
 
19.5%
~ 8
 
10.4%
< 6
 
7.8%
3
 
3.9%
2
 
2.6%
Space Separator
ValueCountFrequency (%)
3246
> 99.9%
  1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 22
95.7%
1
 
4.3%
Final Punctuation
ValueCountFrequency (%)
8
57.1%
6
42.9%
Initial Punctuation
ValueCountFrequency (%)
8
61.5%
5
38.5%
Control
ValueCountFrequency (%)
62
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 5
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  9604   68.2%
Common  4425   31.4%
Latin   42     0.3%
Han     4      < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
313
 
3.3%
231
 
2.4%
225
 
2.3%
162
 
1.7%
156
 
1.6%
151
 
1.6%
151
 
1.6%
148
 
1.5%
148
 
1.5%
137
 
1.4%
Other values (606) 7782
81.0%
Common
ValueCountFrequency (%)
3246
73.4%
. 285
 
6.4%
, 118
 
2.7%
? 75
 
1.7%
2 64
 
1.4%
62
 
1.4%
0 58
 
1.3%
1 50
 
1.1%
= 43
 
1.0%
4 37
 
0.8%
Other values (29) 387
 
8.7%
Latin
ValueCountFrequency (%)
C 7
16.7%
m 5
11.9%
B 5
11.9%
G 4
9.5%
D 3
7.1%
W 3
7.1%
Y 2
 
4.8%
P 2
 
4.8%
N 2
 
4.8%
M 2
 
4.8%
Other values (7) 7
16.7%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value                  Count  Frequency (%)
Hangul                 9604   68.2%
ASCII                  4428   31.5%
Punctuation            32     0.2%
None                   7      < 0.1%
CJK                    3      < 0.1%
CJK Compat Ideographs  1      < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3246
73.3%
. 285
 
6.4%
, 118
 
2.7%
? 75
 
1.7%
2 64
 
1.4%
62
 
1.4%
0 58
 
1.3%
1 50
 
1.1%
= 43
 
1.0%
4 37
 
0.8%
Other values (37) 390
 
8.8%
Hangul
ValueCountFrequency (%)
313
 
3.3%
231
 
2.4%
225
 
2.3%
162
 
1.7%
156
 
1.6%
151
 
1.6%
151
 
1.6%
148
 
1.5%
148
 
1.5%
137
 
1.4%
Other values (606) 7782
81.0%
Punctuation
ValueCountFrequency (%)
8
25.0%
8
25.0%
6
18.8%
5
15.6%
5
15.6%
None
ValueCountFrequency (%)
3
42.9%
2
28.6%
  1
 
14.3%
1
 
14.3%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

bcast_seq_no
Real number (ℝ)

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1042702.5
Minimum: 1042632
Maximum: 1042786
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1042632
5-th percentile: 1042632.4
Q1: 1042634.2
median: 1042711.5
Q3: 1042766.2
95-th percentile: 1042785.6
Maximum: 1042786
Range: 154
Interquartile range (IQR): 132

Descriptive statistics

Standard deviation: 66.451737
Coefficient of variation (CV): 6.3730294 × 10^-5
Kurtosis: -1.7661437
Mean: 1042702.5
Median Absolute Deviation (MAD): 74
Skewness: 0.17635645
Sum: 10427025
Variance: 4415.8333
Monotonicity: Strictly increasing
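
Several of the descriptive statistics for bcast_seq_no can be recomputed from its ten non-missing values, which are listed in this section. This is a sketch with the standard `statistics` module; it assumes the sample (n - 1) variance, which is what reproduces the reported figure.

```python
import statistics

# The ten non-missing bcast_seq_no values from this report.
values = [1042632, 1042633, 1042634, 1042635, 1042711,
          1042712, 1042713, 1042784, 1042785, 1042786]

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)
cv = std / mean                           # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])  # median absolute deviation

print(mean, round(variance, 4), round(std, 4), mad)
```

The tiny coefficient of variation reflects that these identifiers differ by at most 154 around a mean above one million.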
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
1042632    1      0.3%
1042633    1      0.3%
1042634    1      0.3%
1042635    1      0.3%
1042711    1      0.3%
1042712    1      0.3%
1042713    1      0.3%
1042784    1      0.3%
1042785    1      0.3%
1042786    1      0.3%
(Missing)  374    97.4%

Value    Count  Frequency (%)
1042632  1      0.3%
1042633  1      0.3%
1042634  1      0.3%
1042635  1      0.3%
1042711  1      0.3%
1042712  1      0.3%
1042713  1      0.3%
1042784  1      0.3%
1042785  1      0.3%
1042786  1      0.3%

Value    Count  Frequency (%)
1042786  1      0.3%
1042785  1      0.3%
1042784  1      0.3%
1042713  1      0.3%
1042712  1      0.3%
1042711  1      0.3%
1042635  1      0.3%
1042634  1      0.3%
1042633  1      0.3%
1042632  1      0.3%

play_sec
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 583
Minimum: 258
Maximum: 1060
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 258
5-th percentile: 271.95
Q1: 398.75
median: 553.5
Q3: 773
95-th percentile: 938.05
Maximum: 1060
Range: 802
Interquartile range (IQR): 374.25

Descriptive statistics

Standard deviation: 263.2312
Coefficient of variation (CV): 0.4515115
Kurtosis: -0.82973325
Mean: 583
Median Absolute Deviation (MAD): 209.5
Skewness: 0.41853378
Sum: 5830
Variance: 69290.667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
743        1      0.3%
436        1      0.3%
289        1      0.3%
671        1      0.3%
1060       1      0.3%
397        1      0.3%
789        1      0.3%
783        1      0.3%
404        1      0.3%
258        1      0.3%
(Missing)  374    97.4%

Value  Count  Frequency (%)
258    1      0.3%
289    1      0.3%
397    1      0.3%
404    1      0.3%
436    1      0.3%
671    1      0.3%
743    1      0.3%
783    1      0.3%
789    1      0.3%
1060   1      0.3%

Value  Count  Frequency (%)
1060   1      0.3%
789    1      0.3%
783    1      0.3%
743    1      0.3%
671    1      0.3%
436    1      0.3%
404    1      0.3%
397    1      0.3%
289    1      0.3%
258    1      0.3%
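
The histogram for play_sec is described as using fixed-size bins (bins=10), but the plot itself is not preserved here. The sketch below shows one common way to compute such bins, with equal-width bins between the minimum and maximum and the last bin closed on the right. The function name is hypothetical, and the report's exact binning may differ.

```python
def fixed_width_histogram(values, bins=10):
    """Equal-width bins over [min, max]; assumes at least two distinct values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# The ten non-missing play_sec values from this report.
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
counts, edges = fixed_width_histogram(play_sec)
print(counts)
```

With a range of 802 seconds, each bin is about 80 seconds wide.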

play_hour
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16195
Minimum: 0.0717
Maximum: 0.2944
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 0.0717
5-th percentile: 0.07557
Q1: 0.110775
median: 0.15375
Q3: 0.214725
95-th percentile: 0.26056
Maximum: 0.2944
Range: 0.2227
Interquartile range (IQR): 0.10395

Descriptive statistics

Standard deviation: 0.073108188
Coefficient of variation (CV): 0.45142444
Kurtosis: -0.83174487
Mean: 0.16195
Median Absolute Deviation (MAD): 0.0582
Skewness: 0.41819554
Sum: 1.6195
Variance: 0.0053448072
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
0.2064     1      0.3%
0.1211     1      0.3%
0.0803     1      0.3%
0.1864     1      0.3%
0.2944     1      0.3%
0.1103     1      0.3%
0.2192     1      0.3%
0.2175     1      0.3%
0.1122     1      0.3%
0.0717     1      0.3%
(Missing)  374    97.4%

Value   Count  Frequency (%)
0.0717  1      0.3%
0.0803  1      0.3%
0.1103  1      0.3%
0.1122  1      0.3%
0.1211  1      0.3%
0.1864  1      0.3%
0.2064  1      0.3%
0.2175  1      0.3%
0.2192  1      0.3%
0.2944  1      0.3%

Value   Count  Frequency (%)
0.2944  1      0.3%
0.2192  1      0.3%
0.2175  1      0.3%
0.2064  1      0.3%
0.1864  1      0.3%
0.1211  1      0.3%
0.1122  1      0.3%
0.1103  1      0.3%
0.0803  1      0.3%
0.0717  1      0.3%
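
The play_hour values appear to be play_sec converted to hours and rounded to four decimals, which would explain the high-correlation alert between the two columns. The check below assumes that the values listed for each variable in this report are in the same row order; that pairing is an assumption, supported by the one fully populated sample row (743 seconds, 0.2064 hours).

```python
# Values in the order they are listed for each variable (assumed to be row order).
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
play_hour = [0.2064, 0.1211, 0.0803, 0.1864, 0.2944,
             0.1103, 0.2192, 0.2175, 0.1122, 0.0717]

# Convert seconds to hours and round to four decimals.
derived = [round(sec / 3600, 4) for sec in play_sec]
matches = derived == play_hour
print(matches)
```

If the relationship holds for the full dataset, one of the two columns is redundant for modelling purposes.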

file_size
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 95374723
Minimum: 40022256
Maximum: 1.7607761 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 40022256
5-th percentile: 42519062
Q1: 67381922
median: 89287218
Q3: 1.258217 × 10^8
95-th percentile: 1.5503235 × 10^8
Maximum: 1.7607761 × 10^8
Range: 1.3605535 × 10^8
Interquartile range (IQR): 58439779

Descriptive statistics

Standard deviation: 43743223
Coefficient of variation (CV): 0.45864587
Kurtosis: -0.62098998
Mean: 95374723
Median Absolute Deviation (MAD): 35219442
Skewness: 0.44200301
Sum: 9.5374723 × 10^8
Variance: 1.9134695 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
121876576  1      0.3%
70812784   1      0.3%
45570713   1      0.3%
107761651  1      0.3%
176077608  1      0.3%
67174581   1      0.3%
129310378  1      0.3%
127136742  1      0.3%
68003943   1      0.3%
40022256   1      0.3%
(Missing)  374    97.4%

Value      Count  Frequency (%)
40022256   1      0.3%
45570713   1      0.3%
67174581   1      0.3%
68003943   1      0.3%
70812784   1      0.3%
107761651  1      0.3%
121876576  1      0.3%
127136742  1      0.3%
129310378  1      0.3%
176077608  1      0.3%

Value      Count  Frequency (%)
176077608  1      0.3%
129310378  1      0.3%
127136742  1      0.3%
121876576  1      0.3%
107761651  1      0.3%
70812784   1      0.3%
68003943   1      0.3%
67174581   1      0.3%
45570713   1      0.3%
40022256   1      0.3%

vod_path
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Memory size: 3.1 KiB

Length

Max length: 61
Median length: 61
Mean length: 61
Min length: 61

Characters and Unicode

Total characters: 610
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

1st row/mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4
2nd row/mbnvod2/606/2013/03/04/20130304091959_20_606_1042633_360.mp4
3rd row/mbnvod2/606/2013/03/04/20130304091959_20_606_1042634_360.mp4
4th row/mbnvod2/606/2013/03/04/20130304091959_20_606_1042635_360.mp4
5th row/mbnvod2/606/2013/03/05/20130305091825_20_606_1042711_360.mp4
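
The sample paths share a fixed layout, and the number before the final `_360` matches bcast_seq_no in the one fully populated sample row (1042632). A hedged sketch of splitting a path with a regular expression follows. Apart from `bcast_seq_no`, every group name is a hypothetical label chosen for illustration; the report does not document what the segments mean.

```python
import re

# Group names other than bcast_seq_no are hypothetical labels.
VOD_PATH = re.compile(
    r"^/(?P<root>[^/]+)/(?P<channel>\d+)"
    r"/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})"
    r"/(?P<timestamp>\d{14})_(?P<field_a>\d+)_(?P<channel_repeat>\d+)"
    r"_(?P<bcast_seq_no>\d+)_(?P<variant>\d+)\.mp4$"
)

def parse_vod_path(path):
    """Return the named segments of a vod_path, or None if it does not match."""
    match = VOD_PATH.match(path)
    return match.groupdict() if match else None

info = parse_vod_path(
    "/mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4"
)
print(info)
```

Because every path is exactly 61 characters long, a non-matching value would be a quick signal of a malformed record.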
Value                                                         Count  Frequency (%)
mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4  1      10.0%
mbnvod2/606/2013/03/04/20130304091959_20_606_1042633_360.mp4  1      10.0%
mbnvod2/606/2013/03/04/20130304091959_20_606_1042634_360.mp4  1      10.0%
mbnvod2/606/2013/03/04/20130304091959_20_606_1042635_360.mp4  1      10.0%
mbnvod2/606/2013/03/05/20130305091825_20_606_1042711_360.mp4  1      10.0%
mbnvod2/606/2013/03/05/20130305091825_20_606_1042712_360.mp4  1      10.0%
mbnvod2/606/2013/03/05/20130305091825_20_606_1042713_360.mp4  1      10.0%
mbnvod2/606/2013/03/06/20130306085742_20_606_1042784_360.mp4  1      10.0%
mbnvod2/606/2013/03/06/20130306090853_20_606_1042785_360.mp4  1      10.0%
mbnvod2/606/2013/03/06/20130306091926_20_606_1042786_360.mp4  1      10.0%

Most occurring characters

Value              Count  Frequency (%)
0                  121    19.8%
6                  62     10.2%
/                  60     9.8%
2                  57     9.3%
3                  57     9.3%
1                  42     6.9%
_                  40     6.6%
4                  31     5.1%
m                  20     3.3%
9                  18     3.0%
Other values (10)  102    16.7%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         420    68.9%
Lowercase Letter       80     13.1%
Other Punctuation      70     11.5%
Connector Punctuation  40     6.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 121
28.8%
6 62
14.8%
2 57
13.6%
3 57
13.6%
1 42
 
10.0%
4 31
 
7.4%
9 18
 
4.3%
5 17
 
4.0%
8 8
 
1.9%
7 7
 
1.7%
Lowercase Letter
ValueCountFrequency (%)
m 20
25.0%
d 10
12.5%
o 10
12.5%
v 10
12.5%
n 10
12.5%
b 10
12.5%
p 10
12.5%
Other Punctuation
ValueCountFrequency (%)
/ 60
85.7%
. 10
 
14.3%
Connector Punctuation
ValueCountFrequency (%)
_ 40
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  530    86.9%
Latin   80     13.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 121
22.8%
6 62
11.7%
/ 60
11.3%
2 57
10.8%
3 57
10.8%
1 42
 
7.9%
_ 40
 
7.5%
4 31
 
5.8%
9 18
 
3.4%
5 17
 
3.2%
Other values (3) 25
 
4.7%
Latin
ValueCountFrequency (%)
m 20
25.0%
d 10
12.5%
o 10
12.5%
v 10
12.5%
n 10
12.5%
b 10
12.5%
p 10
12.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  610    100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 121
19.8%
6 62
10.2%
/ 60
9.8%
2 57
9.3%
3 57
9.3%
1 42
 
6.9%
_ 40
 
6.6%
4 31
 
5.1%
m 20
 
3.3%
9 18
 
3.0%
Other values (10) 102
16.7%

title
Text

MISSING 

Distinct: 8
Distinct (%): 80.0%
Missing: 374
Missing (%): 97.4%
Memory size: 3.1 KiB

Length

Max length: 20
Median length: 15
Mean length: 11.9
Min length: 5

Characters and Unicode

Total characters: 119
Distinct characters: 71
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 70.0%

Sample

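1st row: 안철수, 4월 재보선 출마 [Ahn Cheol-soo to run in April by-election]
2nd row: 뉴스있수다
3rd row: 백악관 흉내라도 내봐라? [At least try to imitate the White House?]
4th row: ""개인회생 파산 눈덩이"" [Personal rehabilitation and bankruptcy filings snowball]
5th row: 김종훈 사퇴, 누구의 탓인가 [Kim Jong-hoon's resignation: whose fault is it?]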
Value              Count  Frequency (%)
뉴스있수다         3      10.0%
안철수             1      3.3%
장광익의           1      3.3%
(not shown)        1      3.3%
역사               1      3.3%
도는               1      3.3%
돌고               1      3.3%
해법은             1      3.3%
재테크             1      3.3%
시대               1      3.3%
Other values (18)  18     60.0%

Most occurring characters

ValueCountFrequency (%)
20
 
16.8%
4
 
3.4%
" 4
 
3.4%
, 4
 
3.4%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
2
 
1.7%
2
 
1.7%
Other values (61) 71
59.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       88     73.9%
Space Separator    20     16.8%
Other Punctuation  9      7.6%
Decimal Number     1      0.8%
Dash Punctuation   1      0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
Other values (55) 62
70.5%
Other Punctuation
ValueCountFrequency (%)
" 4
44.4%
, 4
44.4%
? 1
 
11.1%
Space Separator
ValueCountFrequency (%)
20
100.0%
Decimal Number
ValueCountFrequency (%)
4 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  88     73.9%
Common  31     26.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
Other values (55) 62
70.5%
Common
ValueCountFrequency (%)
20
64.5%
" 4
 
12.9%
, 4
 
12.9%
4 1
 
3.2%
- 1
 
3.2%
? 1
 
3.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  88     73.9%
ASCII   31     26.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20
64.5%
" 4
 
12.9%
, 4
 
12.9%
4 1
 
3.2%
- 1
 
3.2%
? 1
 
3.2%
Hangul
ValueCountFrequency (%)
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
2
 
2.3%
Other values (55) 62
70.5%

contents
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 374
Missing (%): 97.4%
Memory size: 3.1 KiB

Length

Max length: 64
Median length: 40
Mean length: 29.2
Min length: 17

Characters and Unicode

Total characters: 292
Distinct characters: 139
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 100.0%

Sample

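1st row: 오늘 첫 번째 순서로는 정치권 이슈를 살펴보겠습니다. [For our first segment today, we will look at issues in politics.]
2nd row: (1) 지하경제 규모, GDP의 약 23% [(1) Size of the underground economy: about 23% of GDP]
3rd row: 여러분의 아침에 생각할 거리를 던집니다. [We offer you something to think about this morning.]
4th row: 먹거리 물가가 뜀박질하고 있습니다. [Food prices are soaring.]
5th row: 김종훈 미래창조과학부 장관 내정자가 [Kim Jong-hoon, the nominee for Minister of Science, ICT and Future Planning,]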
Value              Count  Frequency (%)
1                  3      4.5%
(not shown)        2      3.0%
김종훈             2      3.0%
오늘               1      1.5%
2                  1      1.5%
아파트             1      1.5%
서울               1      1.5%
합니다             1      1.5%
시대라고           1      1.5%
선보입니다         1      1.5%
Other values (53)  53     79.1%

Most occurring characters

ValueCountFrequency (%)
61
 
20.9%
7
 
2.4%
7
 
2.4%
. 6
 
2.1%
5
 
1.7%
4
 
1.4%
4
 
1.4%
1 4
 
1.4%
4
 
1.4%
4
 
1.4%
Other values (129) 186
63.7%

Most occurring categories

Value                Count  Frequency (%)
Other Letter         196    67.1%
Space Separator      61     20.9%
Other Punctuation    11     3.8%
Decimal Number       11     3.8%
Close Punctuation    3      1.0%
Open Punctuation     3      1.0%
Uppercase Letter     3      1.0%
Math Symbol          2      0.7%
Final Punctuation    1      0.3%
Initial Punctuation  1      0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
3.6%
7
 
3.6%
5
 
2.6%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
Other values (111) 150
76.5%
Decimal Number
ValueCountFrequency (%)
1 4
36.4%
2 3
27.3%
0 2
18.2%
4 1
 
9.1%
3 1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
. 6
54.5%
, 3
27.3%
% 2
 
18.2%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
D 1
33.3%
P 1
33.3%
Math Symbol
ValueCountFrequency (%)
> 1
50.0%
< 1
50.0%
Space Separator
ValueCountFrequency (%)
61
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  196    67.1%
Common  93     31.8%
Latin   3      1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
3.6%
7
 
3.6%
5
 
2.6%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
Other values (111) 150
76.5%
Common
ValueCountFrequency (%)
61
65.6%
. 6
 
6.5%
1 4
 
4.3%
, 3
 
3.2%
) 3
 
3.2%
( 3
 
3.2%
2 3
 
3.2%
0 2
 
2.2%
% 2
 
2.2%
1
 
1.1%
Other values (5) 5
 
5.4%
Latin
ValueCountFrequency (%)
G 1
33.3%
D 1
33.3%
P 1
33.3%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       196    67.1%
ASCII        94     32.2%
Punctuation  2      0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
61
64.9%
. 6
 
6.4%
1 4
 
4.3%
, 3
 
3.2%
) 3
 
3.2%
( 3
 
3.2%
2 3
 
3.2%
0 2
 
2.1%
% 2
 
2.1%
4 1
 
1.1%
Other values (6) 6
 
6.4%
Hangul
ValueCountFrequency (%)
7
 
3.6%
7
 
3.6%
5
 
2.6%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
4
 
2.0%
3
 
1.5%
Other values (111) 150
76.5%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 384
Missing (%): 100.0%
Memory size: 3.5 KiB

Interactions


Correlations

              bcast_seq_no  play_sec  play_hour  file_size  vod_path  title  contents
bcast_seq_no  1.000         0.000     0.000      0.000      1.000     0.000  1.000
play_sec      0.000         1.000     1.000      0.985      1.000     0.798  1.000
play_hour     0.000         1.000     1.000      0.985      1.000     0.776  1.000
file_size     0.000         0.985     0.985      1.000      1.000     0.798  1.000
vod_path      1.000         1.000     1.000      1.000      1.000     1.000  1.000
title         0.000         0.798     0.776      0.798      1.000     1.000  1.000
contents      1.000         1.000     1.000      1.000      1.000     1.000  1.000
              bcast_seq_no  play_sec  play_hour  file_size
bcast_seq_no  1.000         -0.176    -0.176     -0.176
play_sec      -0.176        1.000     1.000      1.000
play_hour     -0.176        1.000     1.000      1.000
file_size     -0.176        1.000     1.000      1.000
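
The report does not name the coefficient behind the numeric-only matrix, but its values are consistent with Spearman rank correlation. The sketch below recomputes two entries under the assumption that the values listed for each variable in this report share the same row order. The helper functions are illustrative and assume no tied values, which holds for these ten rows.

```python
def ranks(values):
    """Rank values from 1 to n; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Values in the order listed for each variable (assumed to be row order).
bcast_seq_no = [1042632, 1042633, 1042634, 1042635, 1042711,
                1042712, 1042713, 1042784, 1042785, 1042786]
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
file_size = [121876576, 70812784, 45570713, 107761651, 176077608,
             67174581, 129310378, 127136742, 68003943, 40022256]

rho_seq_sec = spearman(bcast_seq_no, play_sec)
rho_sec_size = spearman(play_sec, file_size)
print(round(rho_seq_sec, 3), round(rho_sec_size, 3))
```

A rank correlation of exactly 1 between play_sec and file_size means longer clips are always larger files in this sample, even though the relationship need not be perfectly linear.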

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 476674 | 1042632 | 743 | 0.2064 | 121876576 | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4 | 안철수, 4월 재보선 출마 | 오늘 첫 번째 순서로는 정치권 이슈를 살펴보겠습니다. | <NA>
2 | 귀국한 안철수 전 서울대 교수가 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 매일경제신문 정치부 이상훈 차장과 함께 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 자세한 정치권 소식 알아보겠습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 1. 드디어 안철수 전 대선 후보가 다시 등장했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 어제 대변인격인 송호창 의원을 통해, 곧 귀국해 4월 재보선에서 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8
374 | 인사청문회와 관련한 입장도 뒤바뀌었습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
375 | 2006년 2월 한나라당은 김우식 과학기술부 장관 후보자 등의 인선에 반대하면서 지명 철회를 요구했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
376 | 박 대통령은 당시 대표 신분으로 “대통령이 국무위원 청문회의 입법 취지를 존중하지 않고 야당의 검증과 입장을 무시하는 건 문제가 있다”고 날을 세웠습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
377 | 하지만 박대통령은 당선인 시절 김용준 전 국무총리 후보자가 자진 사퇴하자 “신상털기식 검증은 문제가 있다. 이런 상황에 누가 청문회를 하려고 하겠느냐”고 정반대의 입장을 취했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
378 | 우리가 역사를 배우는 것은 바로 이렇게 돌고 도는 역사에서, 배워야 할 것과 버려야 할 것을 골라내기 위해서가 아닐까요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
379 | 우리나라에서 내놓라하는 분들이 다 모였다는 정치인들만은 그런 학습 능력이 없는 모양입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
380 | 그래서 한국의 국가경쟁력중에서 정치가 가장 뒤떨어지는 분야가 됐구요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
381 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
382 | 모닝톡톡 오늘은 여기까집니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
383 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
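
The sample rows suggest a ragged layout: one row carries the full metadata for a clip, and the following rows hold transcript lines in the vod_seq_no column with every other field empty. A possible way to regroup such rows is sketched below. The function name and the two-field row representation are hypothetical simplifications, and the interpretation of the layout is an assumption drawn from the sample, not something the report states.

```python
def group_transcripts(rows):
    """Attach transcript lines to the metadata row that precedes them.

    rows: (first_column, bcast_seq_no) tuples in file order, with None for missing.
    """
    records = []
    for first_col, bcast in rows:
        if bcast is not None:
            # A populated bcast_seq_no marks a metadata row: start a new record.
            records.append({"vod_seq_no": first_col, "bcast_seq_no": bcast, "lines": []})
        elif first_col and records:
            # Text with no metadata is treated as a transcript line.
            records[-1]["lines"].append(first_col)
        # Fully empty rows are skipped.
    return records

# First rows of the sample, reduced to the two columns that matter here.
rows = [
    (None, None),
    ("476674", 1042632),
    ("귀국한 안철수 전 서울대 교수가", None),
    ("서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요.", None),
    (None, None),
]
records = group_transcripts(rows)
print(records)
```

Regrouping this way would leave one record per clip, which makes the 97.4% missing rate in the metadata columns an artifact of the file layout rather than lost data.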

Duplicate rows

Most frequently occurring

index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | # duplicates
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 110
0 | 오늘 말씀 잘 들었습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3