Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 384 |
Missing cells | 3112 |
Missing cells (%) | 90.0% |
Duplicate rows | 2 |
Duplicate rows (%) | 0.5% |
Total size in memory | 29.0 KiB |
Average record size in memory | 77.3 B |
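The headline figures above can be cross-checked against the per-variable missing counts that the report lists in its alerts. The following is a minimal sketch in plain Python; the dictionary only restates counts from this report, the variable names are illustrative, and the record-size check assumes 1 KiB = 1024 bytes.

```python
# Per-variable missing counts, restated from the report's "Missing" alerts.
missing_per_variable = {
    "vod_seq_no": 110,
    "bcast_seq_no": 374,
    "play_sec": 374,
    "play_hour": 374,
    "file_size": 374,
    "vod_path": 374,
    "title": 374,
    "contents": 374,
    "Unnamed: 8": 384,
}

n_observations = 384
n_variables = len(missing_per_variable)  # 9 variables

# Missing cells are the sum of the per-variable counts.
missing_cells = sum(missing_per_variable.values())
total_cells = n_observations * n_variables
missing_pct = 100 * missing_cells / total_cells

# Two duplicate rows out of 384 observations.
duplicate_pct = 100 * 2 / n_observations

# Assumption: 1 KiB = 1024 bytes; the reported 29.0 KiB is itself rounded.
avg_record_bytes = 29.0 * 1024 / n_observations

print(missing_cells, f"{missing_pct:.1f}%")
print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```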
Variable types
Text | 4 |
---|---|
Numeric | 4 |
Unsupported | 1 |
Dataset
Description | Sample data |
---|---|
Author | MBN |
URL | https://kdx.kr/data/view/29829 |
Dataset has 2 (0.5%) duplicate rows | Duplicates |
play_sec is highly overall correlated with play_hour and 1 other fields | High correlation |
play_hour is highly overall correlated with play_sec and 1 other fields | High correlation |
file_size is highly overall correlated with play_sec and 1 other fields | High correlation |
vod_seq_no has 110 (28.6%) missing values | Missing |
bcast_seq_no has 374 (97.4%) missing values | Missing |
play_sec has 374 (97.4%) missing values | Missing |
play_hour has 374 (97.4%) missing values | Missing |
file_size has 374 (97.4%) missing values | Missing |
vod_path has 374 (97.4%) missing values | Missing |
title has 374 (97.4%) missing values | Missing |
contents has 374 (97.4%) missing values | Missing |
Unnamed: 8 has 384 (100.0%) missing values | Missing |
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
Analysis started | 2024-03-11 03:19:12.256947 |
---|---|
Analysis finished | 2024-03-11 03:19:15.596084 |
Duration | 3.34 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
vod_seq_no
Text
MISSING
 
Distinct | 272 |
---|---|
Distinct (%) | 99.3% |
Missing | 110 |
Missing (%) | 28.6% |
Memory size | 3.1 KiB |
Length
Max length | 360 |
---|---|
Median length | 132 |
Mean length | 51.368613 |
Min length | 1 |
Characters and Unicode
Total characters | 14075 |
---|---|
Distinct characters | 676 |
Distinct categories | 14 |
Distinct scripts | 4 |
Distinct blocks | 6 |
Unique
Unique | 271 |
---|---|
Unique (%) | 98.9% |
Sample
1st row | 476674 |
---|---|
2nd row | 귀국한 안철수 전 서울대 교수가 |
3rd row | 서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요. |
4th row | 매일경제신문 정치부 이상훈 차장과 함께 |
5th row | 자세한 정치권 소식 알아보겠습니다. |
Value | Count | Frequency (%) |
수 | 17 | 0.5% |
오늘 | 14 | 0.4% |
한 | 10 | 0.3% |
더 | 10 | 0.3% |
당시 | 10 | 0.3% |
이 | 10 | 0.3% |
것 | 10 | 0.3% |
(blank) | 10 | 0.3% |
어떻게 | 9 | 0.3% |
안철수 | 9 | 0.3% |
Other values (2365) | 3167 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 3246 | 23.1% |
이 | 313 | 2.2% |
. | 285 | 2.0% |
다 | 231 | 1.6% |
는 | 225 | 1.6% |
고 | 162 | 1.2% |
가 | 156 | 1.1% |
을 | 151 | 1.1% |
에 | 151 | 1.1% |
대 | 148 | 1.1% |
Other values (666) | 9007 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 9608 | |
Space Separator | 3247 | 23.1% |
Other Punctuation | 574 | 4.1% |
Decimal Number | 372 | 2.6% |
Math Symbol | 77 | 0.5% |
Control | 62 | 0.4% |
Uppercase Letter | 37 | 0.3% |
Close Punctuation | 23 | 0.2% |
Open Punctuation | 23 | 0.2% |
Dash Punctuation | 18 | 0.1% |
Other values (4) | 34 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
이 | 313 | 3.3% |
다 | 231 | 2.4% |
는 | 225 | 2.3% |
고 | 162 | 1.7% |
가 | 156 | 1.6% |
을 | 151 | 1.6% |
에 | 151 | 1.6% |
대 | 148 | 1.5% |
지 | 148 | 1.5% |
니 | 137 | 1.4% |
Other values (610) | 7786 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 7 | |
B | 5 | |
G | 4 | |
D | 3 | |
W | 3 | |
Y | 2 | 5.4% |
P | 2 | 5.4% |
N | 2 | 5.4% |
M | 2 | 5.4% |
T | 1 | 2.7% |
Other values (6) | 6 |
Other Punctuation
Value | Count | Frequency (%) |
. | 285 | |
, | 118 | |
? | 75 | 13.1% |
' | 24 | 4.2% |
% | 23 | 4.0% |
! | 15 | 2.6% |
" | 15 | 2.6% |
/ | 11 | 1.9% |
… | 5 | 0.9% |
# | 2 | 0.3% |
Decimal Number
Value | Count | Frequency (%) |
2 | 64 | |
0 | 58 | |
1 | 50 | |
4 | 37 | |
7 | 34 | |
5 | 32 | |
6 | 32 | |
3 | 32 | |
8 | 18 | 4.8% |
9 | 15 | 4.0% |
Math Symbol
Value | Count | Frequency (%) |
= | 43 | |
> | 15 | 19.5% |
~ | 8 | 10.4% |
< | 6 | 7.8% |
> | 3 | 3.9% |
< | 2 | 2.6% |
Space Separator
Value | Count | Frequency (%) |
(space) | 3246 | |
(other space character) | 1 | < 0.1% |
Open Punctuation
Value | Count | Frequency (%) |
( | 22 | |
( | 1 | 4.3% |
Final Punctuation
Value | Count | Frequency (%) |
’ | 8 | |
” | 6 |
Initial Punctuation
Value | Count | Frequency (%) |
‘ | 8 | |
“ | 5 |
Control
Value | Count | Frequency (%) |
(control character) | 62 |
Close Punctuation
Value | Count | Frequency (%) |
) | 23 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 18 |
Lowercase Letter
Value | Count | Frequency (%) |
m | 5 |
Modifier Symbol
Value | Count | Frequency (%) |
` | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 9604 | |
Common | 4425 | |
Latin | 42 | 0.3% |
Han | 4 | < 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
이 | 313 | 3.3% |
다 | 231 | 2.4% |
는 | 225 | 2.3% |
고 | 162 | 1.7% |
가 | 156 | 1.6% |
을 | 151 | 1.6% |
에 | 151 | 1.6% |
대 | 148 | 1.5% |
지 | 148 | 1.5% |
니 | 137 | 1.4% |
Other values (606) | 7782 |
Common
Value | Count | Frequency (%) |
(space) | 3246 | |
. | 285 | 6.4% |
, | 118 | 2.7% |
? | 75 | 1.7% |
2 | 64 | 1.4% |
(control character) | 62 | 1.4% |
0 | 58 | 1.3% |
1 | 50 | 1.1% |
= | 43 | 1.0% |
4 | 37 | 0.8% |
Other values (29) | 387 | 8.7% |
Latin
Value | Count | Frequency (%) |
C | 7 | |
m | 5 | |
B | 5 | |
G | 4 | |
D | 3 | |
W | 3 | |
Y | 2 | 4.8% |
P | 2 | 4.8% |
N | 2 | 4.8% |
M | 2 | 4.8% |
Other values (7) | 7 |
Han
Value | Count | Frequency (%) |
英 | 1 | |
女 | 1 | |
情 | 1 | |
父 | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 9604 | |
ASCII | 4428 | |
Punctuation | 32 | 0.2% |
None | 7 | < 0.1% |
CJK | 3 | < 0.1% |
CJK Compat Ideographs | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 3246 | |
. | 285 | 6.4% |
, | 118 | 2.7% |
? | 75 | 1.7% |
2 | 64 | 1.4% |
(control character) | 62 | 1.4% |
0 | 58 | 1.3% |
1 | 50 | 1.1% |
= | 43 | 1.0% |
4 | 37 | 0.8% |
Other values (37) | 390 | 8.8% |
Hangul
Value | Count | Frequency (%) |
이 | 313 | 3.3% |
다 | 231 | 2.4% |
는 | 225 | 2.3% |
고 | 162 | 1.7% |
가 | 156 | 1.6% |
을 | 151 | 1.6% |
에 | 151 | 1.6% |
대 | 148 | 1.5% |
지 | 148 | 1.5% |
니 | 137 | 1.4% |
Other values (606) | 7782 |
Punctuation
Value | Count | Frequency (%) |
’ | 8 | |
‘ | 8 | |
” | 6 | |
… | 5 | |
“ | 5 |
None
Value | Count | Frequency (%) |
> | 3 | |
< | 2 | |
(blank) | 1 | 14.3% |
( | 1 | 14.3% |
CJK
Value | Count | Frequency (%) |
英 | 1 | |
情 | 1 | |
父 | 1 |
CJK Compat Ideographs
Value | Count | Frequency (%) |
女 | 1 |
bcast_seq_no
Real number (ℝ)
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1042702.5 |
Minimum | 1042632 |
---|---|
Maximum | 1042786 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 1042632 |
---|---|
5-th percentile | 1042632.4 |
Q1 | 1042634.2 |
median | 1042711.5 |
Q3 | 1042766.2 |
95-th percentile | 1042785.6 |
Maximum | 1042786 |
Range | 154 |
Interquartile range (IQR) | 132 |
Descriptive statistics
Standard deviation | 66.451737 |
---|---|
Coefficient of variation (CV) | 6.3730294 × 10⁻⁵ |
Kurtosis | -1.7661437 |
Mean | 1042702.5 |
Median Absolute Deviation (MAD) | 74 |
Skewness | 0.17635645 |
Sum | 10427025 |
Variance | 4415.8333 |
Monotonicity | Strictly increasing |
Value | Count | Frequency (%) |
1042632 | 1 | 0.3% |
1042633 | 1 | 0.3% |
1042634 | 1 | 0.3% |
1042635 | 1 | 0.3% |
1042711 | 1 | 0.3% |
1042712 | 1 | 0.3% |
1042713 | 1 | 0.3% |
1042784 | 1 | 0.3% |
1042785 | 1 | 0.3% |
1042786 | 1 | 0.3% |
(Missing) | 374 |
Value | Count | Frequency (%) |
1042632 | 1 | |
1042633 | 1 | |
1042634 | 1 | |
1042635 | 1 | |
1042711 | 1 | |
1042712 | 1 | |
1042713 | 1 | |
1042784 | 1 | |
1042785 | 1 | |
1042786 | 1 |
Value | Count | Frequency (%) |
1042786 | 1 | |
1042785 | 1 | |
1042784 | 1 | |
1042713 | 1 | |
1042712 | 1 | |
1042711 | 1 | |
1042635 | 1 | |
1042634 | 1 | |
1042633 | 1 | |
1042632 | 1 |
play_sec
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 583 |
Minimum | 258 |
---|---|
Maximum | 1060 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 258 |
---|---|
5-th percentile | 271.95 |
Q1 | 398.75 |
median | 553.5 |
Q3 | 773 |
95-th percentile | 938.05 |
Maximum | 1060 |
Range | 802 |
Interquartile range (IQR) | 374.25 |
Descriptive statistics
Standard deviation | 263.2312 |
---|---|
Coefficient of variation (CV) | 0.4515115 |
Kurtosis | -0.82973325 |
Mean | 583 |
Median Absolute Deviation (MAD) | 209.5 |
Skewness | 0.41853378 |
Sum | 5830 |
Variance | 69290.667 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
743 | 1 | 0.3% |
436 | 1 | 0.3% |
289 | 1 | 0.3% |
671 | 1 | 0.3% |
1060 | 1 | 0.3% |
397 | 1 | 0.3% |
789 | 1 | 0.3% |
783 | 1 | 0.3% |
404 | 1 | 0.3% |
258 | 1 | 0.3% |
(Missing) | 374 |
Value | Count | Frequency (%) |
258 | 1 | |
289 | 1 | |
397 | 1 | |
404 | 1 | |
436 | 1 | |
671 | 1 | |
743 | 1 | |
783 | 1 | |
789 | 1 | |
1060 | 1 |
Value | Count | Frequency (%) |
1060 | 1 | |
789 | 1 | |
783 | 1 | |
743 | 1 | |
671 | 1 | |
436 | 1 | |
404 | 1 | |
397 | 1 | |
289 | 1 | |
258 | 1 |
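Because only ten play_sec values are present, the summary statistics above can be recomputed by hand. The sketch below uses Python's standard `statistics` module; it assumes the quartiles are linearly interpolated (the "inclusive" method), which happens to match the figures reported here, and the variable names are illustrative.

```python
import statistics

# The ten non-missing play_sec values listed in the report.
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]

mean = statistics.mean(play_sec)
variance = statistics.variance(play_sec)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(play_sec)      # sample standard deviation

# Assumption: linear interpolation between order statistics ("inclusive").
q1, median, q3 = statistics.quantiles(play_sec, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(x - median) for x in play_sec)

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

print(mean, round(variance, 3), round(std_dev, 4))
print(q1, median, q3, iqr, mad, round(cv, 7))
```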
play_hour
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.16195 |
Minimum | 0.0717 |
---|---|
Maximum | 0.2944 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 0.0717 |
---|---|
5-th percentile | 0.07557 |
Q1 | 0.110775 |
median | 0.15375 |
Q3 | 0.214725 |
95-th percentile | 0.26056 |
Maximum | 0.2944 |
Range | 0.2227 |
Interquartile range (IQR) | 0.10395 |
Descriptive statistics
Standard deviation | 0.073108188 |
---|---|
Coefficient of variation (CV) | 0.45142444 |
Kurtosis | -0.83174487 |
Mean | 0.16195 |
Median Absolute Deviation (MAD) | 0.0582 |
Skewness | 0.41819554 |
Sum | 1.6195 |
Variance | 0.0053448072 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.2064 | 1 | 0.3% |
0.1211 | 1 | 0.3% |
0.0803 | 1 | 0.3% |
0.1864 | 1 | 0.3% |
0.2944 | 1 | 0.3% |
0.1103 | 1 | 0.3% |
0.2192 | 1 | 0.3% |
0.2175 | 1 | 0.3% |
0.1122 | 1 | 0.3% |
0.0717 | 1 | 0.3% |
(Missing) | 374 |
Value | Count | Frequency (%) |
0.0717 | 1 | |
0.0803 | 1 | |
0.1103 | 1 | |
0.1122 | 1 | |
0.1211 | 1 | |
0.1864 | 1 | |
0.2064 | 1 | |
0.2175 | 1 | |
0.2192 | 1 | |
0.2944 | 1 |
Value | Count | Frequency (%) |
0.2944 | 1 | |
0.2192 | 1 | |
0.2175 | 1 | |
0.2064 | 1 | |
0.1864 | 1 | |
0.1211 | 1 | |
0.1122 | 1 | |
0.1103 | 1 | |
0.0803 | 1 | |
0.0717 | 1 |
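The high correlation between play_hour and play_sec has a simple explanation: the listed values are consistent with play_hour being play_sec divided by 3600 and rounded to four decimals. The check below is a sketch; it assumes the two value tables list their entries in the same row order (the first pair, 743 and 0.2064, is confirmed by the first data row of the sample).

```python
# Values in the order they appear in the report's value tables.
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
play_hour = [0.2064, 0.1211, 0.0803, 0.1864, 0.2944,
             0.1103, 0.2192, 0.2175, 0.1122, 0.0717]

# Convert seconds to hours and round to four decimal places.
derived = [round(sec / 3600, 4) for sec in play_sec]

# Largest difference between the derived and the reported hours.
max_abs_error = max(abs(d - h) for d, h in zip(derived, play_hour))
print(max_abs_error)
```

If this relationship holds for the full dataset, play_hour carries no information beyond play_sec and one of the two could be dropped before modeling.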
file_size
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 95374723 |
Minimum | 40022256 |
---|---|
Maximum | 1.7607761 × 10⁸ |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 40022256 |
---|---|
5-th percentile | 42519062 |
Q1 | 67381922 |
median | 89287218 |
Q3 | 1.258217 × 10⁸ |
95-th percentile | 1.5503235 × 10⁸ |
Maximum | 1.7607761 × 10⁸ |
Range | 1.3605535 × 10⁸ |
Interquartile range (IQR) | 58439779 |
Descriptive statistics
Standard deviation | 43743223 |
---|---|
Coefficient of variation (CV) | 0.45864587 |
Kurtosis | -0.62098998 |
Mean | 95374723 |
Median Absolute Deviation (MAD) | 35219442 |
Skewness | 0.44200301 |
Sum | 9.5374723 × 10⁸ |
Variance | 1.9134695 × 10¹⁵ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
121876576 | 1 | 0.3% |
70812784 | 1 | 0.3% |
45570713 | 1 | 0.3% |
107761651 | 1 | 0.3% |
176077608 | 1 | 0.3% |
67174581 | 1 | 0.3% |
129310378 | 1 | 0.3% |
127136742 | 1 | 0.3% |
68003943 | 1 | 0.3% |
40022256 | 1 | 0.3% |
(Missing) | 374 |
Value | Count | Frequency (%) |
40022256 | 1 | |
45570713 | 1 | |
67174581 | 1 | |
68003943 | 1 | |
70812784 | 1 | |
107761651 | 1 | |
121876576 | 1 | |
127136742 | 1 | |
129310378 | 1 | |
176077608 | 1 |
Value | Count | Frequency (%) |
176077608 | 1 | |
129310378 | 1 | |
127136742 | 1 | |
121876576 | 1 | |
107761651 | 1 | |
70812784 | 1 | |
68003943 | 1 | |
67174581 | 1 | |
45570713 | 1 | |
40022256 | 1 |
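The strong link between file_size and play_sec is what one would expect if the videos share a roughly constant bitrate. The sketch below estimates that rate; it assumes file_size is in bytes (the report does not state units) and that the value tables pair up row by row (the first pair is confirmed by the sample's first data row).

```python
# Values in the order they appear in the report's value tables.
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
file_size = [121876576, 70812784, 45570713, 107761651, 176077608,
             67174581, 129310378, 127136742, 68003943, 40022256]

# Assumption: file_size is in bytes, so size / seconds is bytes per second.
bytes_per_sec = [size / sec for size, sec in zip(file_size, play_sec)]

# Convert to megabits per second (decimal megabits).
mbit_per_sec = [8 * rate / 1_000_000 for rate in bytes_per_sec]

print(f"{min(mbit_per_sec):.2f} to {max(mbit_per_sec):.2f} Mbit/s")
```

Under these assumptions every file falls in a narrow band, so file size is close to a linear function of duration.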
vod_path
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Memory size | 3.1 KiB |
Length
Max length | 61 |
---|---|
Median length | 61 |
Mean length | 61 |
Min length | 61 |
Characters and Unicode
Total characters | 610 |
---|---|
Distinct characters | 20 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4 |
---|---|
2nd row | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042633_360.mp4 |
3rd row | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042634_360.mp4 |
4th row | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042635_360.mp4 |
5th row | /mbnvod2/606/2013/03/05/20130305091825_20_606_1042711_360.mp4 |
Value | Count | Frequency (%) |
mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4 | 1 | |
mbnvod2/606/2013/03/04/20130304091959_20_606_1042633_360.mp4 | 1 | |
mbnvod2/606/2013/03/04/20130304091959_20_606_1042634_360.mp4 | 1 | |
mbnvod2/606/2013/03/04/20130304091959_20_606_1042635_360.mp4 | 1 | |
mbnvod2/606/2013/03/05/20130305091825_20_606_1042711_360.mp4 | 1 | |
mbnvod2/606/2013/03/05/20130305091825_20_606_1042712_360.mp4 | 1 | |
mbnvod2/606/2013/03/05/20130305091825_20_606_1042713_360.mp4 | 1 | |
mbnvod2/606/2013/03/06/20130306085742_20_606_1042784_360.mp4 | 1 | |
mbnvod2/606/2013/03/06/20130306090853_20_606_1042785_360.mp4 | 1 | |
mbnvod2/606/2013/03/06/20130306091926_20_606_1042786_360.mp4 | 1 |
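Every vod_path follows the same fixed layout, and the number before the final `_360` matches the row's bcast_seq_no. The parser below is a sketch only: the group names (`dir_id`, `stamp`, `code`, `file_id`, `suffix`) are hypothetical labels, and reading the 14-digit run as a timestamp is an assumption, not something the report documents.

```python
import re
from datetime import datetime

# Hypothetical field names; only bcast_seq_no is confirmed by the sample rows.
VOD_PATH_RE = re.compile(
    r"^/mbnvod2/(?P<dir_id>\d+)/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<stamp>\d{14})_(?P<code>\d+)_(?P<file_id>\d+)"
    r"_(?P<bcast_seq_no>\d+)_(?P<suffix>\d+)\.mp4$"
)


def parse_vod_path(path):
    """Split a vod_path into its components, or return None if it does not match."""
    match = VOD_PATH_RE.match(path)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "dir_id": fields["dir_id"],
        "folder_date": f'{fields["year"]}-{fields["month"]}-{fields["day"]}',
        # Assumption: the 14-digit run is a YYYYMMDDhhmmss timestamp.
        "stamp": datetime.strptime(fields["stamp"], "%Y%m%d%H%M%S"),
        "bcast_seq_no": int(fields["bcast_seq_no"]),
        "suffix": fields["suffix"],
    }


parsed = parse_vod_path(
    "/mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4"
)
print(parsed)
```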
Most occurring characters
Value | Count | Frequency (%) |
0 | 121 | |
6 | 62 | |
/ | 60 | |
2 | 57 | |
3 | 57 | |
1 | 42 | 6.9% |
_ | 40 | 6.6% |
4 | 31 | 5.1% |
m | 20 | 3.3% |
9 | 18 | 3.0% |
Other values (10) | 102 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 420 | |
Lowercase Letter | 80 | 13.1% |
Other Punctuation | 70 | 11.5% |
Connector Punctuation | 40 | 6.6% |
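The category counts above can be reproduced from the ten paths with Python's `unicodedata` module, which reports the Unicode general category of each character (`Nd` decimal number, `Ll` lowercase letter, `Po` other punctuation, `Pc` connector punctuation). This is a sketch; the paths are rebuilt from the values in this report, with the leading slash taken from the sample rows.

```python
import unicodedata
from collections import Counter

# (date folder, 14-digit prefix, bcast_seq_no) for the ten paths in the report.
parts = [
    ("2013/03/04", "20130304091959", 1042632),
    ("2013/03/04", "20130304091959", 1042633),
    ("2013/03/04", "20130304091959", 1042634),
    ("2013/03/04", "20130304091959", 1042635),
    ("2013/03/05", "20130305091825", 1042711),
    ("2013/03/05", "20130305091825", 1042712),
    ("2013/03/05", "20130305091825", 1042713),
    ("2013/03/06", "20130306085742", 1042784),
    ("2013/03/06", "20130306090853", 1042785),
    ("2013/03/06", "20130306091926", 1042786),
]
paths = [f"/mbnvod2/606/{day}/{stamp}_20_606_{seq}_360.mp4"
         for day, stamp, seq in parts]

# Count characters by Unicode general category across all paths.
category_counts = Counter(
    unicodedata.category(ch) for path in paths for ch in path
)
total_characters = sum(category_counts.values())

print(total_characters, dict(category_counts))
```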
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 121 | |
6 | 62 | |
2 | 57 | |
3 | 57 | |
1 | 42 | 10.0% |
4 | 31 | 7.4% |
9 | 18 | 4.3% |
5 | 17 | 4.0% |
8 | 8 | 1.9% |
7 | 7 | 1.7% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 20 | |
d | 10 | |
o | 10 | |
v | 10 | |
n | 10 | |
b | 10 | |
p | 10 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 60 | |
. | 10 | 14.3% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 40 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 530 | |
Latin | 80 | 13.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 121 | |
6 | 62 | |
/ | 60 | |
2 | 57 | |
3 | 57 | |
1 | 42 | 7.9% |
_ | 40 | 7.5% |
4 | 31 | 5.8% |
9 | 18 | 3.4% |
5 | 17 | 3.2% |
Other values (3) | 25 | 4.7% |
Latin
Value | Count | Frequency (%) |
m | 20 | |
d | 10 | |
o | 10 | |
v | 10 | |
n | 10 | |
b | 10 | |
p | 10 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 610 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 121 | |
6 | 62 | |
/ | 60 | |
2 | 57 | |
3 | 57 | |
1 | 42 | 6.9% |
_ | 40 | 6.6% |
4 | 31 | 5.1% |
m | 20 | 3.3% |
9 | 18 | 3.0% |
Other values (10) | 102 |
title
Text
MISSING
 
Distinct | 8 |
---|---|
Distinct (%) | 80.0% |
Missing | 374 |
Missing (%) | 97.4% |
Memory size | 3.1 KiB |
Length
Max length | 20 |
---|---|
Median length | 15 |
Mean length | 11.9 |
Min length | 5 |
Characters and Unicode
Total characters | 119 |
---|---|
Distinct characters | 71 |
Distinct categories | 5 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 7 |
---|---|
Unique (%) | 70.0% |
Sample
1st row | 안철수, 4월 재보선 출마 |
---|---|
2nd row | 뉴스있수다 |
3rd row | 백악관 흉내라도 내봐라? |
4th row | ""개인회생 파산 눈덩이"" |
5th row | 김종훈 사퇴, 누구의 탓인가 |
Value | Count | Frequency (%) |
뉴스있수다 | 3 | 10.0% |
안철수 | 1 | 3.3% |
장광익의 | 1 | 3.3% |
(blank) | 1 | 3.3% |
역사 | 1 | 3.3% |
도는 | 1 | 3.3% |
돌고 | 1 | 3.3% |
해법은 | 1 | 3.3% |
재테크 | 1 | 3.3% |
시대 | 1 | 3.3% |
Other values (18) | 18 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 20 | 16.8% |
수 | 4 | 3.4% |
" | 4 | 3.4% |
, | 4 | 3.4% |
뉴 | 3 | 2.5% |
스 | 3 | 2.5% |
있 | 3 | 2.5% |
다 | 3 | 2.5% |
도 | 2 | 1.7% |
인 | 2 | 1.7% |
Other values (61) | 71 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 88 | |
Space Separator | 20 | 16.8% |
Other Punctuation | 9 | 7.6% |
Decimal Number | 1 | 0.8% |
Dash Punctuation | 1 | 0.8% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
수 | 4 | 4.5% |
뉴 | 3 | 3.4% |
스 | 3 | 3.4% |
있 | 3 | 3.4% |
다 | 3 | 3.4% |
도 | 2 | 2.3% |
인 | 2 | 2.3% |
이 | 2 | 2.3% |
혼 | 2 | 2.3% |
사 | 2 | 2.3% |
Other values (55) | 62 |
Other Punctuation
Value | Count | Frequency (%) |
" | 4 | |
, | 4 | |
? | 1 | 11.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 20 |
Decimal Number
Value | Count | Frequency (%) |
4 | 1 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 88 | |
Common | 31 | 26.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
수 | 4 | 4.5% |
뉴 | 3 | 3.4% |
스 | 3 | 3.4% |
있 | 3 | 3.4% |
다 | 3 | 3.4% |
도 | 2 | 2.3% |
인 | 2 | 2.3% |
이 | 2 | 2.3% |
혼 | 2 | 2.3% |
사 | 2 | 2.3% |
Other values (55) | 62 |
Common
Value | Count | Frequency (%) |
(space) | 20 | |
" | 4 | 12.9% |
, | 4 | 12.9% |
4 | 1 | 3.2% |
- | 1 | 3.2% |
? | 1 | 3.2% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 88 | |
ASCII | 31 | 26.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 20 | |
" | 4 | 12.9% |
, | 4 | 12.9% |
4 | 1 | 3.2% |
- | 1 | 3.2% |
? | 1 | 3.2% |
Hangul
Value | Count | Frequency (%) |
수 | 4 | 4.5% |
뉴 | 3 | 3.4% |
스 | 3 | 3.4% |
있 | 3 | 3.4% |
다 | 3 | 3.4% |
도 | 2 | 2.3% |
인 | 2 | 2.3% |
이 | 2 | 2.3% |
혼 | 2 | 2.3% |
사 | 2 | 2.3% |
Other values (55) | 62 |
contents
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 374 |
Missing (%) | 97.4% |
Memory size | 3.1 KiB |
Length
Max length | 64 |
---|---|
Median length | 40 |
Mean length | 29.2 |
Min length | 17 |
Characters and Unicode
Total characters | 292 |
---|---|
Distinct characters | 139 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 오늘 첫 번째 순서로는 정치권 이슈를 살펴보겠습니다. |
---|---|
2nd row | (1) 지하경제 규모, GDP의 약 23% |
3rd row | 여러분의 아침에 생각할 거리를 던집니다. |
4th row | 먹거리 물가가 뜀박질하고 있습니다. |
5th row | 김종훈 미래창조과학부 장관 내정자가 |
Value | Count | Frequency (%) |
1 | 3 | 4.5% |
주 | 2 | 3.0% |
김종훈 | 2 | 3.0% |
오늘 | 1 | 1.5% |
2 | 1 | 1.5% |
아파트 | 1 | 1.5% |
서울 | 1 | 1.5% |
합니다 | 1 | 1.5% |
시대라고 | 1 | 1.5% |
선보입니다 | 1 | 1.5% |
Other values (53) | 53 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 61 | 20.9% |
니 | 7 | 2.4% |
다 | 7 | 2.4% |
. | 6 | 2.1% |
리 | 5 | 1.7% |
이 | 4 | 1.4% |
를 | 4 | 1.4% |
1 | 4 | 1.4% |
정 | 4 | 1.4% |
라 | 4 | 1.4% |
Other values (129) | 186 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 196 | |
Space Separator | 61 | 20.9% |
Other Punctuation | 11 | 3.8% |
Decimal Number | 11 | 3.8% |
Close Punctuation | 3 | 1.0% |
Open Punctuation | 3 | 1.0% |
Uppercase Letter | 3 | 1.0% |
Math Symbol | 2 | 0.7% |
Final Punctuation | 1 | 0.3% |
Initial Punctuation | 1 | 0.3% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
니 | 7 | 3.6% |
다 | 7 | 3.6% |
리 | 5 | 2.6% |
이 | 4 | 2.0% |
를 | 4 | 2.0% |
정 | 4 | 2.0% |
라 | 4 | 2.0% |
에 | 4 | 2.0% |
아 | 4 | 2.0% |
습 | 3 | 1.5% |
Other values (111) | 150 |
Decimal Number
Value | Count | Frequency (%) |
1 | 4 | |
2 | 3 | |
0 | 2 | |
4 | 1 | 9.1% |
3 | 1 | 9.1% |
Other Punctuation
Value | Count | Frequency (%) |
. | 6 | |
, | 3 | |
% | 2 | 18.2% |
Uppercase Letter
Value | Count | Frequency (%) |
G | 1 | |
D | 1 | |
P | 1 |
Math Symbol
Value | Count | Frequency (%) |
> | 1 | |
< | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 61 |
Close Punctuation
Value | Count | Frequency (%) |
) | 3 |
Open Punctuation
Value | Count | Frequency (%) |
( | 3 |
Final Punctuation
Value | Count | Frequency (%) |
’ | 1 |
Initial Punctuation
Value | Count | Frequency (%) |
‘ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 196 | |
Common | 93 | |
Latin | 3 | 1.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
니 | 7 | 3.6% |
다 | 7 | 3.6% |
리 | 5 | 2.6% |
이 | 4 | 2.0% |
를 | 4 | 2.0% |
정 | 4 | 2.0% |
라 | 4 | 2.0% |
에 | 4 | 2.0% |
아 | 4 | 2.0% |
습 | 3 | 1.5% |
Other values (111) | 150 |
Common
Value | Count | Frequency (%) |
(space) | 61 | |
. | 6 | 6.5% |
1 | 4 | 4.3% |
, | 3 | 3.2% |
) | 3 | 3.2% |
( | 3 | 3.2% |
2 | 3 | 3.2% |
0 | 2 | 2.2% |
% | 2 | 2.2% |
’ | 1 | 1.1% |
Other values (5) | 5 | 5.4% |
Latin
Value | Count | Frequency (%) |
G | 1 | |
D | 1 | |
P | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 196 | |
ASCII | 94 | |
Punctuation | 2 | 0.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 61 | |
. | 6 | 6.4% |
1 | 4 | 4.3% |
, | 3 | 3.2% |
) | 3 | 3.2% |
( | 3 | 3.2% |
2 | 3 | 3.2% |
0 | 2 | 2.1% |
% | 2 | 2.1% |
4 | 1 | 1.1% |
Other values (6) | 6 | 6.4% |
Hangul
Value | Count | Frequency (%) |
니 | 7 | 3.6% |
다 | 7 | 3.6% |
리 | 5 | 2.6% |
이 | 4 | 2.0% |
를 | 4 | 2.0% |
정 | 4 | 2.0% |
라 | 4 | 2.0% |
에 | 4 | 2.0% |
아 | 4 | 2.0% |
습 | 3 | 1.5% |
Other values (111) | 150 |
Punctuation
Value | Count | Frequency (%) |
’ | 1 | |
‘ | 1 |
Unnamed: 8
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 384 |
---|---|
Missing (%) | 100.0% |
Memory size | 3.5 KiB |
Variable | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents |
---|---|---|---|---|---|---|---|
bcast_seq_no | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 |
play_sec | 0.000 | 1.000 | 1.000 | 0.985 | 1.000 | 0.798 | 1.000 |
play_hour | 0.000 | 1.000 | 1.000 | 0.985 | 1.000 | 0.776 | 1.000 |
file_size | 0.000 | 0.985 | 0.985 | 1.000 | 1.000 | 0.798 | 1.000 |
vod_path | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
title | 0.000 | 0.798 | 0.776 | 0.798 | 1.000 | 1.000 | 1.000 |
contents | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Variable | bcast_seq_no | play_sec | play_hour | file_size |
---|---|---|---|---|
bcast_seq_no | 1.000 | -0.176 | -0.176 | -0.176 |
play_sec | -0.176 | 1.000 | 1.000 | 1.000 |
play_hour | -0.176 | 1.000 | 1.000 | 1.000 |
file_size | -0.176 | 1.000 | 1.000 | 1.000 |
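The numeric matrix above is consistent with Spearman's rank correlation, although the extracted report does not label it, so treat that identification as an assumption. The sketch below recomputes the coefficients from the ten complete rows using the no-ties formula 1 - 6Σd²/(n(n² - 1)); it assumes the value tables list their entries in row order (the first row is confirmed by the sample), and `ranks` and `spearman` are illustrative helper names.

```python
def ranks(values):
    """Return 1-based ranks; valid only when all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's rho via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# The ten complete rows, in the order listed in the report's value tables.
bcast_seq_no = [1042632, 1042633, 1042634, 1042635, 1042711,
                1042712, 1042713, 1042784, 1042785, 1042786]
play_sec = [743, 436, 289, 671, 1060, 397, 789, 783, 404, 258]
file_size = [121876576, 70812784, 45570713, 107761651, 176077608,
             67174581, 129310378, 127136742, 68003943, 40022256]

rho_bcast_sec = spearman(bcast_seq_no, play_sec)
rho_sec_size = spearman(play_sec, file_size)

print(round(rho_bcast_sec, 3), round(rho_sec_size, 3))
```

A coefficient of 1.000 between play_sec and file_size means only that the two variables order the ten rows identically, not that they are exactly proportional.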
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
1 | 476674 | 1042632 | 743 | 0.2064 | 121876576 | /mbnvod2/606/2013/03/04/20130304091959_20_606_1042632_360.mp4 | 안철수, 4월 재보선 출마 | 오늘 첫 번째 순서로는 정치권 이슈를 살펴보겠습니다. | <NA> |
2 | 귀국한 안철수 전 서울대 교수가 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
3 | 서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
4 | 매일경제신문 정치부 이상훈 차장과 함께 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
5 | 자세한 정치권 소식 알아보겠습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
8 | 1. 드디어 안철수 전 대선 후보가 다시 등장했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
9 | 어제 대변인격인 송호창 의원을 통해, 곧 귀국해 4월 재보선에서 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
374 | 인사청문회와 관련한 입장도 뒤바뀌었습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
375 | 2006년 2월 한나라당은 김우식 과학기술부 장관 후보자 등의 인선에 반대하면서 지명 철회를 요구했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
376 | 박 대통령은 당시 대표 신분으로 “대통령이 국무위원 청문회의 입법 취지를 존중하지 않고 야당의 검증과 입장을 무시하는 건 문제가 있다”고 날을 세웠습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
377 | 하지만 박대통령은 당선인 시절 김용준 전 국무총리 후보자가 자진 사퇴하자 “신상털기식 검증은 문제가 있다. 이런 상황에 누가 청문회를 하려고 하겠느냐”고 정반대의 입장을 취했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
378 | 우리가 역사를 배우는 것은 바로 이렇게 돌고 도는 역사에서, 배워야 할 것과 버려야 할 것을 골라내기 위해서가 아닐까요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
379 | 우리나라에서 내놓라하는 분들이 다 모였다는 정치인들만은 그런 학습 능력이 없는 모양입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
380 | 그래서 한국의 국가경쟁력중에서 정치가 가장 뒤떨어지는 분야가 됐구요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
381 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
382 | 모닝톡톡 오늘은 여기까집니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
383 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
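The sample rows suggest why most columns are 97.4% missing: a handful of rows carry the video metadata, and the rows that follow hold transcript lines in the vod_seq_no column with everything else empty. The sketch below regroups such rows; it is an interpretation of the sample rather than a documented schema, and `group_transcripts` and the `transcript` key are hypothetical names.

```python
def group_transcripts(rows):
    """Attach transcript-only rows to the metadata row that precedes them."""
    episodes = []
    current = None
    for row in rows:
        if row.get("bcast_seq_no") is not None:
            # A metadata row starts a new episode.
            current = {**row, "transcript": []}
            episodes.append(current)
        elif row.get("vod_seq_no") is not None and current is not None:
            # Assumption: text in vod_seq_no belongs to the preceding episode.
            current["transcript"].append(row["vod_seq_no"])
        # Rows that are entirely missing are skipped.
    return episodes


# First rows of the sample, reduced to the columns needed here.
rows = [
    {"vod_seq_no": None, "bcast_seq_no": None},
    {"vod_seq_no": "476674", "bcast_seq_no": 1042632,
     "title": "안철수, 4월 재보선 출마"},
    {"vod_seq_no": "귀국한 안철수 전 서울대 교수가", "bcast_seq_no": None},
    {"vod_seq_no": "서울 노원병 국회의원 보궐선거에 직접 출마하기로 했다는데요.",
     "bcast_seq_no": None},
    {"vod_seq_no": "매일경제신문 정치부 이상훈 차장과 함께", "bcast_seq_no": None},
    {"vod_seq_no": "자세한 정치권 소식 알아보겠습니다.", "bcast_seq_no": None},
]

episodes = group_transcripts(rows)
print(len(episodes), len(episodes[0]["transcript"]))
```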
Most frequently occurring
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | # duplicates |
---|---|---|---|---|---|---|---|---|---|
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 110 |
0 | 오늘 말씀 잘 들었습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3 |