Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 237 |
Missing cells | 1912 |
Missing cells (%) | 89.6% |
Duplicate rows | 5 |
Duplicate rows (%) | 2.1% |
Total size in memory | 17.9 KiB |
Average record size in memory | 77.5 B |
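The two percentages in this table follow directly from the counts. A minimal sketch in plain Python, using only the figures reported above (the variable names are illustrative and not part of the report):

```python
# Counts copied from the dataset statistics table.
n_variables = 9
n_observations = 237
missing_cells = 1912
duplicate_rows = 5

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of empty cells and of duplicated rows, in percent.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

With nearly nine in ten cells empty, most per-column statistics further down rest on only 10 populated rows.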
Variable types
Text | 4 |
---|---|
Numeric | 4 |
Unsupported | 1 |
Dataset
Description | Sample data (샘플 데이터) |
---|---|
Author | MBN |
URL | https://kdx.kr/data/view/29831 |
Dataset has 5 (2.1%) duplicate rows | Duplicates |
play_sec is highly correlated overall with play_hour and 1 other field | High correlation |
play_hour is highly correlated overall with play_sec and 1 other field | High correlation |
file_size is highly correlated overall with play_sec and 1 other field | High correlation |
vod_seq_no has 86 (36.3%) missing values | Missing |
bcast_seq_no has 227 (95.8%) missing values | Missing |
play_sec has 227 (95.8%) missing values | Missing |
play_hour has 227 (95.8%) missing values | Missing |
file_size has 227 (95.8%) missing values | Missing |
vod_path has 227 (95.8%) missing values | Missing |
title has 227 (95.8%) missing values | Missing |
contents has 227 (95.8%) missing values | Missing |
Unnamed: 8 has 237 (100.0%) missing values | Missing |
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
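The per-column missing counts in these alerts can be cross-checked against the dataset-level total. The sketch below uses only the counts listed above. The dictionary and variable names are illustrative.

```python
n_observations = 237

# Missing-value counts copied from the alerts above.
missing_per_column = {
    "vod_seq_no": 86,
    "bcast_seq_no": 227,
    "play_sec": 227,
    "play_hour": 227,
    "file_size": 227,
    "vod_path": 227,
    "title": 227,
    "contents": 227,
    "Unnamed: 8": 237,
}

# Percentage of missing values per column, rounded as in the report.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_per_column.items()
}

# The column totals should add up to the "Missing cells" figure.
total_missing = sum(missing_per_column.values())

# Columns with no values at all are natural candidates for dropping.
empty_columns = [
    column for column, count in missing_per_column.items()
    if count == n_observations
]

print(total_missing, empty_columns)
```

Seven columns share the same count of 227, which would be consistent with the metadata being filled in on the same 10 rows, although the alerts alone do not prove that.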
Reproduction
Analysis started | 2024-04-21 16:23:38.225012 |
---|---|
Analysis finished | 2024-04-21 16:23:44.229703 |
Duration | 6 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
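A report of this kind can be regenerated roughly as sketched below. This assumes the data is available as a CSV file and that `pandas` and `ydata-profiling` are installed. The function name, file paths, and report title are hypothetical; only the dataset metadata is taken from this report.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(
        df,
        title="Dataset profile",  # illustrative title
        # Metadata shown in the "Dataset" table of this report.
        dataset={
            "description": "샘플 데이터",
            "author": "MBN",
            "url": "https://kdx.kr/data/view/29831",
        },
    )
    report.to_file(output_path)
```

Calling `build_report("sample.csv")` with a hypothetical input file would write `report.html`. The exact settings used for this run are in the downloadable `config.json` and are not reproduced here.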
vod_seq_no
Text
MISSING
Distinct | 142 |
---|---|
Distinct (%) | 94.0% |
Missing | 86 |
Missing (%) | 36.3% |
Memory size | 2.0 KiB |
Length
Max length | 95 |
---|---|
Median length | 72 |
Mean length | 39.298013 |
Min length | 6 |
Characters and Unicode
Total characters | 5934 |
---|---|
Distinct characters | 510 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 138 |
---|---|
Unique (%) | 91.4% |
Sample
1st row | 588417 |
---|---|
2nd row | 서울에 노란색 택시가 등장했습니다. |
3rd row | 법인택시도 아니고 개인택시도 아닌 이 노란색 택시는 국내 최초로 설립된 협동조합 택시인데요. |
4th row | 승차거부나 부당요금을 없애겠다고 다짐했습니다. |
5th row | 김수형 기자가 보도합니다. |
Value | Count | Frequency (%) |
(empty) | 67 | 5.0% |
▶ | 18 | 1.4% |
기자 | 16 | 1.2% |
인터뷰 | 16 | 1.2% |
있습니다 | 12 | 0.9% |
유라시아 | 7 | 0.5% |
이 | 7 | 0.5% |
【 | 7 | 0.5% |
】 | 7 | 0.5% |
영상취재 | 6 | 0.5% |
Other values (958) | 1168 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1398 | 23.6% |
. | 123 | 2.1% |
다 | 115 | 1.9% |
이 | 99 | 1.7% |
니 | 98 | 1.7% |
시 | 90 | 1.5% |
는 | 62 | 1.0% |
의 | 61 | 1.0% |
고 | 60 | 1.0% |
에 | 60 | 1.0% |
Other values (500) | 3768 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3857 | |
Space Separator | 1398 | 23.6% |
Other Punctuation | 269 | 4.5% |
Decimal Number | 213 | 3.6% |
Lowercase Letter | 83 | 1.4% |
Uppercase Letter | 40 | 0.7% |
Open Punctuation | 19 | 0.3% |
Close Punctuation | 19 | 0.3% |
Dash Punctuation | 18 | 0.3% |
Other Symbol | 18 | 0.3% |
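The category names in this table are Unicode general categories. As a rough illustration, Python's standard `unicodedata` module can classify characters the same way. This is a standard-library approximation, not necessarily how the profiling tool computes its counts, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the two-letter code for categories not named above.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# One of the values that occurs in vod_seq_no.
counts = category_counts("【 기자 】")
print(counts)
```

Hangul syllables fall under "Other Letter", which would explain why that category dominates the table.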
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
다 | 115 | 3.0% |
이 | 99 | 2.6% |
니 | 98 | 2.5% |
시 | 90 | 2.3% |
는 | 62 | 1.6% |
의 | 61 | 1.6% |
고 | 60 | 1.6% |
에 | 60 | 1.6% |
기 | 58 | 1.5% |
대 | 57 | 1.5% |
Other values (441) | 3097 |
Lowercase Letter
Value | Count | Frequency (%) |
m | 10 | |
n | 8 | 9.6% |
k | 7 | 8.4% |
o | 7 | 8.4% |
r | 6 | 7.2% |
c | 6 | 7.2% |
a | 6 | 7.2% |
s | 4 | 4.8% |
i | 4 | 4.8% |
h | 4 | 4.8% |
Other values (11) | 21 |
Other Punctuation
Value | Count | Frequency (%) |
. | 123 | |
" | 38 | 14.1% |
, | 34 | 12.6% |
: | 29 | 10.8% |
/ | 17 | 6.3% |
' | 9 | 3.3% |
@ | 6 | 2.2% |
… | 6 | 2.2% |
· | 4 | 1.5% |
% | 3 | 1.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 39 | |
0 | 39 | |
8 | 31 | |
2 | 24 | |
5 | 21 | |
3 | 16 | |
7 | 16 | |
4 | 12 | 5.6% |
6 | 10 | 4.7% |
9 | 5 | 2.3% |
Uppercase Letter
Value | Count | Frequency (%) |
N | 7 | |
B | 7 | |
M | 7 | |
A | 5 | |
C | 4 | |
P | 4 | |
S | 4 | |
T | 1 | 2.5% |
W | 1 | 2.5% |
Open Punctuation
Value | Count | Frequency (%) |
【 | 7 | |
[ | 6 | |
( | 6 |
Close Punctuation
Value | Count | Frequency (%) |
】 | 7 | |
] | 6 | |
) | 6 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1398 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 18 |
Other Symbol
Value | Count | Frequency (%) |
▶ | 18 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 3857 | |
Common | 1954 | |
Latin | 123 | 2.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
다 | 115 | 3.0% |
이 | 99 | 2.6% |
니 | 98 | 2.5% |
시 | 90 | 2.3% |
는 | 62 | 1.6% |
의 | 61 | 1.6% |
고 | 60 | 1.6% |
에 | 60 | 1.6% |
기 | 58 | 1.5% |
대 | 57 | 1.5% |
Other values (441) | 3097 |
Latin
Value | Count | Frequency (%) |
m | 10 | 8.1% |
n | 8 | 6.5% |
k | 7 | 5.7% |
o | 7 | 5.7% |
N | 7 | 5.7% |
B | 7 | 5.7% |
M | 7 | 5.7% |
r | 6 | 4.9% |
c | 6 | 4.9% |
a | 6 | 4.9% |
Other values (20) | 52 |
Common
Value | Count | Frequency (%) |
(space) | 1398 | |
. | 123 | 6.3% |
1 | 39 | 2.0% |
0 | 39 | 2.0% |
" | 38 | 1.9% |
, | 34 | 1.7% |
8 | 31 | 1.6% |
: | 29 | 1.5% |
2 | 24 | 1.2% |
5 | 21 | 1.1% |
Other values (19) | 178 | 9.1% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 3857 | |
ASCII | 2035 | |
Geometric Shapes | 18 | 0.3% |
None | 18 | 0.3% |
Punctuation | 6 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1398 | |
. | 123 | 6.0% |
1 | 39 | 1.9% |
0 | 39 | 1.9% |
" | 38 | 1.9% |
, | 34 | 1.7% |
8 | 31 | 1.5% |
: | 29 | 1.4% |
2 | 24 | 1.2% |
5 | 21 | 1.0% |
Other values (44) | 259 | 12.7% |
Hangul
Value | Count | Frequency (%) |
다 | 115 | 3.0% |
이 | 99 | 2.6% |
니 | 98 | 2.5% |
시 | 90 | 2.3% |
는 | 62 | 1.6% |
의 | 61 | 1.6% |
고 | 60 | 1.6% |
에 | 60 | 1.6% |
기 | 58 | 1.5% |
대 | 57 | 1.5% |
Other values (441) | 3097 |
Geometric Shapes
Value | Count | Frequency (%) |
▶ | 18 |
None
Value | Count | Frequency (%) |
【 | 7 | |
】 | 7 | |
· | 4 |
Punctuation
Value | Count | Frequency (%) |
… | 6 |
bcast_seq_no
Real number (ℝ)
MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 227 |
Missing (%) | 95.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1100912.5 |
Minimum | 1100884 |
---|---|
Maximum | 1100973 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 1100884 |
---|---|
5-th percentile | 1100884.4 |
Q1 | 1100886.2 |
median | 1100888.5 |
Q3 | 1100950.8 |
95-th percentile | 1100972.6 |
Maximum | 1100973 |
Range | 89 |
Interquartile range (IQR) | 64.5 |
Descriptive statistics
Standard deviation | 41.099473 |
---|---|
Coefficient of variation (CV) | 3.7332188 × 10⁻⁵ |
Kurtosis | -1.22515 |
Mean | 1100912.5 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 1.0284649 |
Sum | 11009125 |
Variance | 1689.1667 |
Monotonicity | Strictly increasing |
Value | Count | Frequency (%) |
1100884 | 1 | 0.4% |
1100885 | 1 | 0.4% |
1100886 | 1 | 0.4% |
1100887 | 1 | 0.4% |
1100888 | 1 | 0.4% |
1100889 | 1 | 0.4% |
1100890 | 1 | 0.4% |
1100971 | 1 | 0.4% |
1100972 | 1 | 0.4% |
1100973 | 1 | 0.4% |
(Missing) | 227 |
Value | Count | Frequency (%) |
1100884 | 1 | |
1100885 | 1 | |
1100886 | 1 | |
1100887 | 1 | |
1100888 | 1 | |
1100889 | 1 | |
1100890 | 1 | |
1100971 | 1 | |
1100972 | 1 | |
1100973 | 1 |
Value | Count | Frequency (%) |
1100973 | 1 | |
1100972 | 1 | |
1100971 | 1 | |
1100890 | 1 | |
1100889 | 1 | |
1100888 | 1 | |
1100887 | 1 | |
1100886 | 1 | |
1100885 | 1 | |
1100884 | 1 |
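Because all 10 non-missing values of bcast_seq_no are listed above, the summary statistics can be recomputed directly. The sketch below uses Python's `statistics` module. It assumes the report uses the sample variance (divisor n - 1) and linearly interpolated quantiles, which is what the reported figures are consistent with.

```python
import statistics

# The ten non-missing bcast_seq_no values listed in the tables above.
values = [
    1100884, 1100885, 1100886, 1100887, 1100888,
    1100889, 1100890, 1100971, 1100972, 1100973,
]

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std_dev = statistics.stdev(values)

# "inclusive" interpolates linearly between the observed minimum and maximum.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Coefficient of variation: spread relative to the mean.
cv = std_dev / mean

print(mean, median, round(variance, 4), round(std_dev, 6), q1, q3, iqr, cv)
```

The very small coefficient of variation reflects that these are sequence identifiers clustered in a narrow range, not measurements.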
play_sec
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 8 |
---|---|
Distinct (%) | 80.0% |
Missing | 227 |
Missing (%) | 95.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 79.2 |
Minimum | 20 |
---|---|
Maximum | 123 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 20 |
---|---|
5-th percentile | 20.45 |
Q1 | 38 |
median | 98.5 |
Q3 | 105 |
95-th percentile | 119.4 |
Maximum | 123 |
Range | 103 |
Interquartile range (IQR) | 67 |
Descriptive statistics
Standard deviation | 41.480384 |
---|---|
Coefficient of variation (CV) | 0.52374222 |
Kurtosis | -1.2731476 |
Mean | 79.2 |
Median Absolute Deviation (MAD) | 13 |
Skewness | -0.8314088 |
Sum | 792 |
Variance | 1720.6222 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
99 | 2 | 0.8% |
21 | 2 | 0.8% |
98 | 1 | 0.4% |
89 | 1 | 0.4% |
107 | 1 | 0.4% |
20 | 1 | 0.4% |
115 | 1 | 0.4% |
123 | 1 | 0.4% |
(Missing) | 227 |
Value | Count | Frequency (%) |
20 | 1 | |
21 | 2 | |
89 | 1 | |
98 | 1 | |
99 | 2 | |
107 | 1 | |
115 | 1 | |
123 | 1 |
Value | Count | Frequency (%) |
123 | 1 | |
115 | 1 | |
107 | 1 | |
99 | 2 | |
98 | 1 | |
89 | 1 | |
21 | 2 | |
20 | 1 |
play_hour
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 8 |
---|---|
Distinct (%) | 80.0% |
Missing | 227 |
Missing (%) | 95.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.02199 |
Minimum | 0.0056 |
---|---|
Maximum | 0.0342 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 0.0056 |
---|---|
5-th percentile | 0.00569 |
Q1 | 0.010525 |
median | 0.02735 |
Q3 | 0.02915 |
95-th percentile | 0.033165 |
Maximum | 0.0342 |
Range | 0.0286 |
Interquartile range (IQR) | 0.018625 |
Descriptive statistics
Standard deviation | 0.011522003 |
---|---|
Coefficient of variation (CV) | 0.52396558 |
Kurtosis | -1.2729393 |
Mean | 0.02199 |
Median Absolute Deviation (MAD) | 0.0036 |
Skewness | -0.83012866 |
Sum | 0.2199 |
Variance | 0.00013275656 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.0275 | 2 | 0.8% |
0.0058 | 2 | 0.8% |
0.0272 | 1 | 0.4% |
0.0247 | 1 | 0.4% |
0.0297 | 1 | 0.4% |
0.0056 | 1 | 0.4% |
0.0319 | 1 | 0.4% |
0.0342 | 1 | 0.4% |
(Missing) | 227 |
Value | Count | Frequency (%) |
0.0056 | 1 | |
0.0058 | 2 | |
0.0247 | 1 | |
0.0272 | 1 | |
0.0275 | 2 | |
0.0297 | 1 | |
0.0319 | 1 | |
0.0342 | 1 |
Value | Count | Frequency (%) |
0.0342 | 1 | |
0.0319 | 1 | |
0.0297 | 1 | |
0.0275 | 2 | |
0.0272 | 1 | |
0.0247 | 1 | |
0.0058 | 2 | |
0.0056 | 1 |
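The distinct values of play_hour line up with those of play_sec divided by 3600 and rounded to four decimals, which would account for the correlation of 1.000 between the two columns. The check below pairs the sorted distinct values from both sections. That pairing is an assumption, since the report does not show the two columns side by side for every row.

```python
# Distinct values in ascending order, copied from the two sections.
play_sec = [20, 21, 89, 98, 99, 107, 115, 123]
play_hour = [0.0056, 0.0058, 0.0247, 0.0272, 0.0275, 0.0297, 0.0319, 0.0342]

# Convert seconds to hours and round to the four decimals shown in the report.
derived = [round(seconds / 3600, 4) for seconds in play_sec]

matches = derived == play_hour
print(matches)
```

If the relationship holds for every row, play_hour carries no information beyond play_sec and one of the two columns could be dropped before modelling.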
file_size
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 227 |
Missing (%) | 95.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12934689 |
Minimum | 3094308 |
---|---|
Maximum | 20558238 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 3094308 |
---|---|
5-th percentile | 3192738.3 |
Q1 | 6321855 |
median | 15846926 |
Q3 | 16856573 |
95-th percentile | 19953285 |
Maximum | 20558238 |
Range | 17463930 |
Interquartile range (IQR) | 10534718 |
Descriptive statistics
Standard deviation | 6849841.4 |
---|---|
Coefficient of variation (CV) | 0.52957142 |
Kurtosis | -1.2555113 |
Mean | 12934689 |
Median Absolute Deviation (MAD) | 2331330 |
Skewness | -0.78249951 |
Sum | 1.2934689 × 10⁸ |
Variance | 4.6920327 × 10¹³ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
15768640 | 1 | 0.4% |
16302226 | 1 | 0.4% |
14551239 | 1 | 0.4% |
17041355 | 1 | 0.4% |
3313042 | 1 | 0.4% |
3578727 | 1 | 0.4% |
3094308 | 1 | 0.4% |
19213899 | 1 | 0.4% |
15925212 | 1 | 0.4% |
20558238 | 1 | 0.4% |
(Missing) | 227 |
Value | Count | Frequency (%) |
3094308 | 1 | |
3313042 | 1 | |
3578727 | 1 | |
14551239 | 1 | |
15768640 | 1 | |
15925212 | 1 | |
16302226 | 1 | |
17041355 | 1 | |
19213899 | 1 | |
20558238 | 1 |
Value | Count | Frequency (%) |
20558238 | 1 | |
19213899 | 1 | |
17041355 | 1 | |
16302226 | 1 | |
15925212 | 1 | |
15768640 | 1 | |
14551239 | 1 | |
3578727 | 1 | |
3313042 | 1 | |
3094308 | 1 |
vod_path
Text
MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 227 |
Missing (%) | 95.8% |
Memory size | 2.0 KiB |
Length
Max length | 61 |
---|---|
Median length | 61 |
Mean length | 61 |
Min length | 61 |
Characters and Unicode
Total characters | 610 |
---|---|
Distinct characters | 20 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | /mbnvod2/689/2015/07/15/20150715104207_20_689_1100884_360.mp4 |
---|---|
2nd row | /mbnvod2/689/2015/07/15/20150715104602_20_689_1100885_360.mp4 |
3rd row | /mbnvod2/689/2015/07/15/20150715104958_20_689_1100886_360.mp4 |
4th row | /mbnvod2/689/2015/07/15/20150715105403_20_689_1100887_360.mp4 |
5th row | /mbnvod2/689/2015/07/15/20150715105636_20_689_1100888_360.mp4 |
Value | Count | Frequency (%) |
mbnvod2/689/2015/07/15/20150715104207_20_689_1100884_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715104602_20_689_1100885_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715104958_20_689_1100886_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715105403_20_689_1100887_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715105636_20_689_1100888_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715105950_20_689_1100889_360.mp4 | 1 | |
mbnvod2/689/2015/07/15/20150715110205_20_689_1100890_360.mp4 | 1 | |
mbnvod2/689/2015/07/16/20150716105225_20_689_1100971_360.mp4 | 1 | |
mbnvod2/689/2015/07/16/20150716105602_20_689_1100972_360.mp4 | 1 | |
mbnvod2/689/2015/07/16/20150716105923_20_689_1100973_360.mp4 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 97 | |
1 | 72 | |
/ | 60 | |
2 | 48 | |
5 | 45 | 7.4% |
6 | 41 | 6.7% |
_ | 40 | 6.6% |
8 | 35 | 5.7% |
9 | 28 | 4.6% |
7 | 25 | 4.1% |
Other values (10) | 119 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 420 | |
Lowercase Letter | 80 | 13.1% |
Other Punctuation | 70 | 11.5% |
Connector Punctuation | 40 | 6.6% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 97 | |
1 | 72 | |
2 | 48 | |
5 | 45 | |
6 | 41 | |
8 | 35 | 8.3% |
9 | 28 | 6.7% |
7 | 25 | 6.0% |
4 | 15 | 3.6% |
3 | 14 | 3.3% |
Lowercase Letter
Value | Count | Frequency (%) |
m | 20 | |
d | 10 | |
o | 10 | |
v | 10 | |
n | 10 | |
b | 10 | |
p | 10 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 60 | |
. | 10 | 14.3% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 40 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 530 | |
Latin | 80 | 13.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 97 | |
1 | 72 | |
/ | 60 | |
2 | 48 | |
5 | 45 | |
6 | 41 | |
_ | 40 | |
8 | 35 | 6.6% |
9 | 28 | 5.3% |
7 | 25 | 4.7% |
Other values (3) | 39 |
Latin
Value | Count | Frequency (%) |
m | 20 | |
d | 10 | |
o | 10 | |
v | 10 | |
n | 10 | |
b | 10 | |
p | 10 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 610 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 97 | |
1 | 72 | |
/ | 60 | |
2 | 48 | |
5 | 45 | 7.4% |
6 | 41 | 6.7% |
_ | 40 | 6.6% |
8 | 35 | 5.7% |
9 | 28 | 4.6% |
7 | 25 | 4.1% |
Other values (10) | 119 |
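Every vod_path value has the same 61-character layout, and the file name appears to embed a 14-digit timestamp and the bcast_seq_no. The sketch below parses that layout with a regular expression. The pattern and the field names `timestamp` and `bcast_seq_no` are inferred from the sample rows and are assumptions, not a documented format. The meaning of the other numeric segments is not stated in the report, so they are matched but not named.

```python
import re
from datetime import datetime

# Layout inferred from the sample paths; group names are guesses.
VOD_PATH_RE = re.compile(
    r"^/[^/]+/\d+/\d{4}/\d{2}/\d{2}/"
    r"(?P<timestamp>\d{14})_\d+_\d+_(?P<bcast_seq_no>\d+)_\d+\.mp4$"
)


def parse_vod_path(path: str) -> dict:
    """Extract the embedded timestamp and sequence number from a path."""
    match = VOD_PATH_RE.match(path)
    if match is None:
        raise ValueError(f"unexpected vod_path layout: {path!r}")
    return {
        "timestamp": datetime.strptime(match["timestamp"], "%Y%m%d%H%M%S"),
        "bcast_seq_no": int(match["bcast_seq_no"]),
    }


parsed = parse_vod_path(
    "/mbnvod2/689/2015/07/15/20150715104207_20_689_1100884_360.mp4"
)
print(parsed)
```

A sequence number embedded in the path is consistent with the correlation of 1.000 reported between vod_path and bcast_seq_no.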
title
Text
MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 227 |
Missing (%) | 95.8% |
Memory size | 2.0 KiB |
Length
Max length | 32 |
---|---|
Median length | 27.5 |
Mean length | 26.2 |
Min length | 21 |
Characters and Unicode
Total characters | 262 |
---|---|
Distinct characters | 127 |
Distinct categories | 7 |
Distinct scripts | 2 |
Distinct blocks | 4 |
Unique
Unique | 10 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 노란색 택시 등장…"승차거부 없어요." |
---|---|
2nd row | "공동육아가 혐오시설?"…주민 갈등에도 과천시 '수수방관' |
3rd row | 400억대 불법 스포츠 사이트 일당 '덜미' |
4th row | [경북] '평화·통일 염원…유라시아 친선특급 |
5th row | [경기] 국가건축정책위 '건축·도시정책 포럼' 개최 |
Value | Count | Frequency (%) |
부산 | 2 | 3.7% |
경기 | 2 | 3.7% |
노란색 | 1 | 1.9% |
화백…두 | 1 | 1.9% |
대전-충남 | 1 | 1.9% |
지역 | 1 | 1.9% |
공동 | 1 | 1.9% |
현안 | 1 | 1.9% |
해결 | 1 | 1.9% |
상생협력협약 | 1 | 1.9% |
Other values (42) | 42 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 44 | 16.8% |
' | 11 | 4.2% |
시 | 10 | 3.8% |
… | 6 | 2.3% |
" | 4 | 1.5% |
[ | 4 | 1.5% |
기 | 4 | 1.5% |
] | 4 | 1.5% |
부 | 4 | 1.5% |
경 | 4 | 1.5% |
Other values (117) | 167 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 176 | |
Space Separator | 44 | 16.8% |
Other Punctuation | 26 | 9.9% |
Decimal Number | 7 | 2.7% |
Open Punctuation | 4 | 1.5% |
Close Punctuation | 4 | 1.5% |
Dash Punctuation | 1 | 0.4% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
시 | 10 | 5.7% |
기 | 4 | 2.3% |
부 | 4 | 2.3% |
경 | 4 | 2.3% |
아 | 4 | 2.3% |
도 | 4 | 2.3% |
산 | 4 | 2.3% |
에 | 3 | 1.7% |
일 | 3 | 1.7% |
대 | 3 | 1.7% |
Other values (102) | 133 |
Other Punctuation
Value | Count | Frequency (%) |
' | 11 | |
… | 6 | |
" | 4 | 15.4% |
· | 3 | 11.5% |
? | 1 | 3.8% |
. | 1 | 3.8% |
Decimal Number
Value | Count | Frequency (%) |
0 | 3 | |
5 | 1 | 14.3% |
8 | 1 | 14.3% |
6 | 1 | 14.3% |
4 | 1 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 44 |
Open Punctuation
Value | Count | Frequency (%) |
[ | 4 |
Close Punctuation
Value | Count | Frequency (%) |
] | 4 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 176 | |
Common | 86 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
시 | 10 | 5.7% |
기 | 4 | 2.3% |
부 | 4 | 2.3% |
경 | 4 | 2.3% |
아 | 4 | 2.3% |
도 | 4 | 2.3% |
산 | 4 | 2.3% |
에 | 3 | 1.7% |
일 | 3 | 1.7% |
대 | 3 | 1.7% |
Other values (102) | 133 |
Common
Value | Count | Frequency (%) |
(space) | 44 | |
' | 11 | 12.8% |
… | 6 | 7.0% |
" | 4 | 4.7% |
[ | 4 | 4.7% |
] | 4 | 4.7% |
0 | 3 | 3.5% |
· | 3 | 3.5% |
5 | 1 | 1.2% |
- | 1 | 1.2% |
Other values (5) | 5 | 5.8% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 176 | |
ASCII | 77 | |
Punctuation | 6 | 2.3% |
None | 3 | 1.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 44 | |
' | 11 | 14.3% |
" | 4 | 5.2% |
[ | 4 | 5.2% |
] | 4 | 5.2% |
0 | 3 | 3.9% |
5 | 1 | 1.3% |
- | 1 | 1.3% |
8 | 1 | 1.3% |
6 | 1 | 1.3% |
Other values (3) | 3 | 3.9% |
Hangul
Value | Count | Frequency (%) |
시 | 10 | 5.7% |
기 | 4 | 2.3% |
부 | 4 | 2.3% |
경 | 4 | 2.3% |
아 | 4 | 2.3% |
도 | 4 | 2.3% |
산 | 4 | 2.3% |
에 | 3 | 1.7% |
일 | 3 | 1.7% |
대 | 3 | 1.7% |
Other values (102) | 133 |
Punctuation
Value | Count | Frequency (%) |
… | 6 |
None
Value | Count | Frequency (%) |
· | 3 |
contents
Text
MISSING
Distinct | 5 |
---|---|
Distinct (%) | 50.0% |
Missing | 227 |
Missing (%) | 95.8% |
Memory size | 2.0 KiB |
Length
Max length | 56 |
---|---|
Median length | 8 |
Mean length | 21.6 |
Min length | 8 |
Characters and Unicode
Total characters | 216 |
---|---|
Distinct characters | 87 |
Distinct categories | 5 |
Distinct scripts | 2 |
Distinct blocks | 3 |
Unique
Unique | 4 |
---|---|
Unique (%) | 40.0% |
Sample
1st row | 【 앵커논평 】 |
---|---|
2nd row | 【 앵커멘트 】 |
3rd row | 【 앵커멘트 】 |
4th row | 【 앵커멘트 】 |
5th row | 경기도가 대통령직속 위원회인 국가건축정책위원회와 공동으로 첫 번째 건축·도시정책 포럼을 개최했습니다. |
Value | Count | Frequency (%) |
【 | 7 | 12.5% |
】 | 7 | 12.5% |
앵커멘트 | 6 | 10.7% |
손을 | 1 | 1.8% |
지역 | 1 | 1.8% |
활기를 | 1 | 1.8% |
띤 | 1 | 1.8% |
것으로 | 1 | 1.8% |
나타났습니다 | 1 | 1.8% |
대전광역시와 | 1 | 1.8% |
Other values (29) | 29 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 46 | 21.3% |
【 | 7 | 3.2% |
앵 | 7 | 3.2% |
커 | 7 | 3.2% |
】 | 7 | 3.2% |
멘 | 6 | 2.8% |
트 | 6 | 2.8% |
도 | 4 | 1.9% |
위 | 4 | 1.9% |
을 | 4 | 1.9% |
Other values (77) | 118 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 152 | |
Space Separator | 46 | 21.3% |
Open Punctuation | 7 | 3.2% |
Close Punctuation | 7 | 3.2% |
Other Punctuation | 4 | 1.9% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
앵 | 7 | 4.6% |
커 | 7 | 4.6% |
멘 | 6 | 3.9% |
트 | 6 | 3.9% |
도 | 4 | 2.6% |
위 | 4 | 2.6% |
을 | 4 | 2.6% |
기 | 4 | 2.6% |
회 | 3 | 2.0% |
니 | 3 | 2.0% |
Other values (72) | 104 |
Other Punctuation
Value | Count | Frequency (%) |
. | 3 | |
· | 1 | 25.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 46 |
Open Punctuation
Value | Count | Frequency (%) |
【 | 7 |
Close Punctuation
Value | Count | Frequency (%) |
】 | 7 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 152 | |
Common | 64 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
앵 | 7 | 4.6% |
커 | 7 | 4.6% |
멘 | 6 | 3.9% |
트 | 6 | 3.9% |
도 | 4 | 2.6% |
위 | 4 | 2.6% |
을 | 4 | 2.6% |
기 | 4 | 2.6% |
회 | 3 | 2.0% |
니 | 3 | 2.0% |
Other values (72) | 104 |
Common
Value | Count | Frequency (%) |
(space) | 46 | |
【 | 7 | 10.9% |
】 | 7 | 10.9% |
. | 3 | 4.7% |
· | 1 | 1.6% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 152 | |
ASCII | 49 | 22.7% |
None | 15 | 6.9% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 46 | |
. | 3 | 6.1% |
None
Value | Count | Frequency (%) |
【 | 7 | |
】 | 7 | |
· | 1 | 6.7% |
Hangul
Value | Count | Frequency (%) |
앵 | 7 | 4.6% |
커 | 7 | 4.6% |
멘 | 6 | 3.9% |
트 | 6 | 3.9% |
도 | 4 | 2.6% |
위 | 4 | 2.6% |
을 | 4 | 2.6% |
기 | 4 | 2.6% |
회 | 3 | 2.0% |
니 | 3 | 2.0% |
Other values (72) | 104 |
Unnamed: 8
Unsupported
MISSING, REJECTED, UNSUPPORTED
Missing | 237 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.2 KiB |
Variable | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents |
---|---|---|---|---|---|---|---|
bcast_seq_no | 1.000 | 0.616 | 0.616 | 0.857 | 1.000 | 1.000 | 0.000 |
play_sec | 0.616 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 |
play_hour | 0.616 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 |
file_size | 0.857 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.566 |
vod_path | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
title | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
contents | 0.000 | 0.000 | 0.000 | 0.566 | 1.000 | 1.000 | 1.000 |
Variable | bcast_seq_no | play_sec | play_hour | file_size |
---|---|---|---|---|
bcast_seq_no | 1.000 | 0.280 | 0.280 | 0.261 |
play_sec | 0.280 | 1.000 | 1.000 | 0.957 |
play_hour | 0.280 | 1.000 | 1.000 | 0.957 |
file_size | 0.261 | 0.957 | 0.957 | 1.000 |
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
1 | 588417 | 1100884 | 99 | 0.0275 | 15768640 | /mbnvod2/689/2015/07/15/20150715104207_20_689_1100884_360.mp4 | 노란색 택시 등장…"승차거부 없어요." | 【 앵커논평 】 | <NA> |
2 | 서울에 노란색 택시가 등장했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
3 | 법인택시도 아니고 개인택시도 아닌 이 노란색 택시는 국내 최초로 설립된 협동조합 택시인데요. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
4 | 승차거부나 부당요금을 없애겠다고 다짐했습니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
5 | 김수형 기자가 보도합니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
8 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
9 | 택시기사 정용준 씨. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
227 | 대전시는 대전이 주도적으로 만든 세계과학도시연합, WTA를 함께 개최해 과학도시로서의 위상을 높일 계획입니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
228 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
229 | 2천7백억 원에 달하는 경제 파급 효과와 미래 성장동력으로 부상한 전시회의 산업도 활성화될 것으로 기대됩니다. | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
230 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
231 | ▶ 스탠딩 : 이상곤 / 기자 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
232 | - "강력한 경쟁도시들을 제치고 대규모 국제대회 유치에 성공함에 따라 대전은 세계가 주목하는 국제도시로서의 위상을 한 단계 더 높이게 됐습니다. MBN뉴스 이상곤입니다." | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
233 | [ lsk9017@mbn.co.kr ] | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
234 | 영상취재 : 박인학 기자 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
235 | 영상편집 : 김경준 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
236 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
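The first and last rows suggest that the file interleaves two kinds of rows: a metadata row, where bcast_seq_no and the other columns are populated, followed by transcript lines that land in vod_seq_no with every other column empty. The sketch below regroups such rows into one record per broadcast. That reading of the layout is an assumption based on the rows shown, and the function name and record structure are hypothetical.

```python
def group_records(rows):
    """Group (vod_seq_no, bcast_seq_no) pairs into one record per broadcast.

    Assumption: a row with a non-null bcast_seq_no starts a new record, and
    the non-empty rows that follow are transcript lines of that record.
    """
    records = []
    for vod_seq_no, bcast_seq_no in rows:
        if bcast_seq_no is not None:
            records.append(
                {"vod_seq_no": vod_seq_no, "bcast_seq_no": bcast_seq_no, "lines": []}
            )
        elif vod_seq_no is not None and records:
            records[-1]["lines"].append(vod_seq_no)
        # Fully empty rows are skipped.
    return records


# A few rows from the head of the table, with None standing in for <NA>.
rows = [
    (None, None),
    ("588417", 1100884),
    ("서울에 노란색 택시가 등장했습니다.", None),
    ("김수형 기자가 보도합니다.", None),
    (None, None),
    ("【 기자 】", None),
]

records = group_records(rows)
print(len(records), len(records[0]["lines"]))
```

Regrouping along these lines would also explain the high missing-value rates: under this reading the 227 rows without metadata are transcript lines or blank separators, not incomplete records.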
Most frequently occurring
Index | vod_seq_no | bcast_seq_no | play_sec | play_hour | file_size | vod_path | title | contents | # duplicates |
---|---|---|---|---|---|---|---|---|---|
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 86 |
3 | 【 기자 】 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 7 |
0 | 영상편집 : 김경준 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |
1 | 영상편집 : 한남선 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |
2 | ▶ 인터뷰 : 그라함 쿽 / 호주 브리즈번 시장 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2 |