Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 69 |
Missing cells | 185 |
Missing cells (%) | 24.4% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 6.5 KiB |
Average record size in memory | 96.9 B |
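The missing-cell percentage follows from the counts above: 11 variables times 69 observations gives 759 cells, of which 185 are missing. A minimal sketch of that arithmetic (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 69
missing_cells = 185

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are empty

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```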
Variable types
Numeric | 1 |
---|---|
Text | 1 |
Unsupported | 2 |
Boolean | 1 |
Categorical | 5 |
DateTime | 1 |
Dataset
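Description | Miscellaneous notice information that the National Cancer Center (국립암센터) published through its website, covering the period up to September 2019 |
---|---|
Author | National Cancer Center (국립암센터) |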
URL | https://www.data.go.kr/data/15049631/fileData.do |
BBSCONTENTYN has constant value "True" | Constant |
BBS_KIND is highly overall correlated with BBSNUM and 2 other fields | High correlation |
READNUM is highly overall correlated with BBSNUM and 2 other fields | High correlation |
CONTENT2 is highly overall correlated with BBSNUM and 2 other fields | High correlation |
BBSNUM is highly overall correlated with READNUM and 2 other fields | High correlation |
CONTENT2 is highly imbalanced (62.5%) | Imbalance |
BBS_KIND is highly imbalanced (62.5%) | Imbalance |
BBSCONTENT has 69 (100.0%) missing values | Missing |
BBSCONTENTYN has 47 (68.1%) missing values | Missing |
ETC has 69 (100.0%) missing values | Missing |
BBSNUM has unique values | Unique |
BBSCONTENT is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
ETC is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
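The 62.5% imbalance reported for CONTENT2 and BBS_KIND is consistent with a score of one minus the normalized Shannon entropy of the value counts (64 versus 5). The sketch below assumes that definition. The function name `imbalance_score` is illustrative and is not claimed to be ydata-profiling's API.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class proportions and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# CONTENT2 and BBS_KIND both split 64 / 5 across their two values.
score = imbalance_score([64, 5])
print(f"{score:.1%}")
```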
Reproduction
Analysis started | 2023-12-12 11:05:05.255887 |
---|---|
Analysis finished | 2023-12-12 11:05:06.574131 |
Duration | 1.32 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
BBSNUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 69 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 151.52174 |
Minimum | 59 |
---|---|
Maximum | 292 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 753.0 B |
Quantile statistics
Minimum | 59 |
---|---|
5-th percentile | 63.4 |
Q1 | 77 |
median | 113 |
Q3 | 274 |
95-th percentile | 288.6 |
Maximum | 292 |
Range | 233 |
Interquartile range (IQR) | 197 |
Descriptive statistics
Standard deviation | 90.682641 |
---|---|
Coefficient of variation (CV) | 0.59847941 |
Kurtosis | -1.4049328 |
Mean | 151.52174 |
Median Absolute Deviation (MAD) | 41 |
Skewness | 0.67139649 |
Sum | 10455 |
Variance | 8223.3414 |
Monotonicity | Not monotonic |
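Several of the BBSNUM statistics are derived from one another, so they can be cross-checked. A short sketch using only the figures reported above (tolerances are needed because the report rounds its values):

```python
import math

# Values copied from the BBSNUM tables above.
n, total = 69, 10455
mean, std, variance = 151.52174, 90.682641, 8223.3414
minimum, maximum, q1, q3 = 59, 292, 77, 274

assert math.isclose(total / n, mean, rel_tol=1e-6)      # mean = sum / n
assert math.isclose(std ** 2, variance, rel_tol=1e-6)   # variance = std squared

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(cv, value_range, iqr)
```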
Common Values
Value | Count | Frequency (%) |
124 | 1 | 1.4% |
64 | 1 | 1.4% |
89 | 1 | 1.4% |
82 | 1 | 1.4% |
81 | 1 | 1.4% |
80 | 1 | 1.4% |
79 | 1 | 1.4% |
78 | 1 | 1.4% |
63 | 1 | 1.4% |
90 | 1 | 1.4% |
Other values (59) | 59 | 85.5% |
Minimum 10 values
Value | Count | Frequency (%) |
59 | 1 | 1.4% |
61 | 1 | 1.4% |
62 | 1 | 1.4% |
63 | 1 | 1.4% |
64 | 1 | 1.4% |
65 | 1 | 1.4% |
66 | 1 | 1.4% |
67 | 1 | 1.4% |
68 | 1 | 1.4% |
69 | 1 | 1.4% |
Maximum 10 values
Value | Count | Frequency (%) |
292 | 1 | 1.4% |
291 | 1 | 1.4% |
290 | 1 | 1.4% |
289 | 1 | 1.4% |
288 | 1 | 1.4% |
287 | 1 | 1.4% |
286 | 1 | 1.4% |
285 | 1 | 1.4% |
284 | 1 | 1.4% |
283 | 1 | 1.4% |
BBSTITLE
Text
Distinct | 61 |
---|---|
Distinct (%) | 88.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 159 |
---|---|
Median length | 39 |
Mean length | 30.376812 |
Min length | 2 |
Characters and Unicode
Total characters | 2096 |
---|---|
Distinct characters | 127 |
Distinct categories | 7 |
Distinct scripts | 2 |
Distinct blocks | 4 |
Unique
Unique | 55 |
---|---|
Unique (%) | 79.7% |
Sample
1st row | 2006년도 암정복추진연구개발사업 지정과제(추가) 공고 |
---|---|
2nd row | 2006년도 암정복추진연구개발사업 추가 지정과제 선정결과 |
3rd row | 암정복추진연구개발사업 연구성과물, 수행과제요약서, 진도보고서 제출 요청 |
4th row | 암정복추진연구개발사업「연구비 사용실적보고서 및 최종결과보고서」서식 안내 |
5th row | 2003년도 암정복추진연구개발사업 신청서 |
Value | Count | Frequency (%) |
암정복추진연구개발사업 | 37 | 11.5% |
안내 | 21 | 6.5% |
및 | 13 | 4.0% |
2009년도 | 11 | 3.4% |
선정결과 | 10 | 3.1% |
구두발표평가 | 10 | 3.1% |
2006년도 | 10 | 3.1% |
2005년도 | 9 | 2.8% |
공고 | 9 | 2.8% |
2003년도 | 7 | 2.2% |
Other values (110) | 185 | 57.5% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 253 | 12.1% |
0 | 106 | 5.1% |
2 | 83 | 4.0% |
정 | 77 | 3.7% |
구 | 73 | 3.5% |
연 | 67 | 3.2% |
발 | 62 | 3.0% |
사 | 59 | 2.8% |
년 | 55 | 2.6% |
추 | 52 | 2.5% |
Other values (117) | 1209 | 57.7% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1400 | 66.8% |
Decimal Number | 331 | 15.8% |
Space Separator | 253 | 12.1% |
Other Punctuation | 70 | 3.3% |
Open Punctuation | 20 | 1.0% |
Close Punctuation | 20 | 1.0% |
Dash Punctuation | 2 | 0.1% |
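The category names in this table are Unicode general categories. As an illustration, the sketch below tallies them for the first sample title with Python's standard `unicodedata` module. It only shows how such counts arise and is not claimed to be the profiler's own code.

```python
import unicodedata
from collections import Counter

title = "2006년도 암정복추진연구개발사업 지정과제(추가) 공고"  # 1st sample row

# unicodedata.category returns two-letter codes, for example
# "Lo" = Other Letter, "Nd" = Decimal Number, "Zs" = Space Separator,
# "Ps" / "Pe" = Open / Close Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in title)

print(dict(category_counts))
```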
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
정 | 77 | 5.5% |
구 | 73 | 5.2% |
연 | 67 | 4.8% |
발 | 62 | 4.4% |
사 | 59 | 4.2% |
년 | 55 | 3.9% |
추 | 52 | 3.7% |
업 | 52 | 3.7% |
도 | 51 | 3.6% |
암 | 51 | 3.6% |
Other values (95) | 801 | 57.2% |
Decimal Number
Value | Count | Frequency (%) |
0 | 106 | 32.0% |
2 | 83 | 25.1% |
1 | 26 | 7.9% |
5 | 24 | 7.3% |
6 | 23 | 6.9% |
4 | 23 | 6.9% |
9 | 17 | 5.1% |
3 | 16 | 4.8% |
8 | 7 | 2.1% |
7 | 5 | 1.5% |
Other Punctuation
Value | Count | Frequency (%) |
; | 19 | 27.1% |
& | 19 | 27.1% |
# | 19 | 27.1% |
· | 7 | 10.0% |
, | 6 | 8.6% |
Open Punctuation
Value | Count | Frequency (%) |
( | 14 | 70.0% |
「 | 6 | 30.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 14 | 70.0% |
」 | 6 | 30.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 253 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1400 | 66.8% |
Common | 696 | 33.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
정 | 77 | 5.5% |
구 | 73 | 5.2% |
연 | 67 | 4.8% |
발 | 62 | 4.4% |
사 | 59 | 4.2% |
년 | 55 | 3.9% |
추 | 52 | 3.7% |
업 | 52 | 3.7% |
도 | 51 | 3.6% |
암 | 51 | 3.6% |
Other values (95) | 801 | 57.2% |
Common
Value | Count | Frequency (%) |
(space) | 253 | 36.4% |
0 | 106 | 15.2% |
2 | 83 | 11.9% |
1 | 26 | 3.7% |
5 | 24 | 3.4% |
6 | 23 | 3.3% |
4 | 23 | 3.3% |
; | 19 | 2.7% |
& | 19 | 2.7% |
# | 19 | 2.7% |
Other values (12) | 101 | 14.5% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1398 | 66.7% |
ASCII | 676 | 32.3% |
None | 20 | 1.0% |
Compat Jamo | 2 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 253 | 37.4% |
0 | 106 | 15.7% |
2 | 83 | 12.3% |
1 | 26 | 3.8% |
5 | 24 | 3.6% |
6 | 23 | 3.4% |
4 | 23 | 3.4% |
; | 19 | 2.8% |
& | 19 | 2.8% |
# | 19 | 2.8% |
Other values (8) | 81 | 12.0% |
Hangul
Value | Count | Frequency (%) |
정 | 77 | 5.5% |
구 | 73 | 5.2% |
연 | 67 | 4.8% |
발 | 62 | 4.4% |
사 | 59 | 4.2% |
년 | 55 | 3.9% |
추 | 52 | 3.7% |
업 | 52 | 3.7% |
도 | 51 | 3.6% |
암 | 51 | 3.6% |
Other values (94) | 799 | 57.2% |
None
Value | Count | Frequency (%) |
· | 7 | 35.0% |
」 | 6 | 30.0% |
「 | 6 | 30.0% |
5 | 1 | 5.0% |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 2 | 100.0% |
BBSCONTENT
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 69 |
---|---|
Missing (%) | 100.0% |
Memory size | 753.0 B |
BBSCONTENTYN
Boolean
CONSTANT
  MISSING
 
Distinct | 1 |
---|---|
Distinct (%) | 4.5% |
Missing | 47 |
Missing (%) | 68.1% |
Memory size | 270.0 B |
Value | Count | Frequency (%) |
True | 22 | 31.9% |
(Missing) | 47 | 68.1% |
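The percentages for BBSCONTENTYN use different denominators: Missing (%) is taken over all 69 rows, while Distinct (%) is taken over the 22 non-missing rows. A sketch under the assumption, based on the sample rows at the end of the report, that the raw column holds "Y" or is empty:

```python
# Rebuild the column from the reported counts: 22 "Y" flags, 47 empty cells.
raw = ["Y"] * 22 + [None] * 47

# Map the flag to a boolean and keep missing cells as None.
flags = [True if v == "Y" else None for v in raw]
present = [f for f in flags if f is not None]

missing_pct = 100 * flags.count(None) / len(flags)      # over all rows
distinct_pct = 100 * len(set(present)) / len(present)   # over non-missing rows

print(f"missing {missing_pct:.1f}%, distinct {distinct_pct:.1f}%")
```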
BBSFROMDATE
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 4.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 8 |
---|---|
Median length | 4 |
Mean length | 4.9855072 |
Min length | 4 |
Unique
Unique | 1 |
---|---|
Unique (%) | 1.4% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 52 | 75.4% |
20100705 | 16 | 23.2% |
20090818 | 1 | 1.4% |
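BBSFROMDATE is profiled as a categorical column with no missing values because the placeholder `<NA>` is stored as a literal four-character string, which also explains the minimum and median length of 4. The sketch below, with illustrative helper names, reproduces the mean length and shows one possible way to turn the column into real dates and real missing values.

```python
from datetime import datetime

# Rebuild the column from the "Common Values" counts.
values = ["<NA>"] * 52 + ["20100705"] * 16 + ["20090818"]

# The placeholder counts as 4 characters, the dates as 8.
mean_length = sum(len(v) for v in values) / len(values)

def parse_yyyymmdd(text):
    """Return a date for 'YYYYMMDD' strings and None for the '<NA>' placeholder."""
    if text == "<NA>":
        return None
    return datetime.strptime(text, "%Y%m%d").date()

parsed = [parse_yyyymmdd(v) for v in values]
n_missing = parsed.count(None)

print(round(mean_length, 7), n_missing, parsed[-1])
```

After this conversion the column would report 52 missing values instead of 0.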
BBSTODATE
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 4.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 8 |
---|---|
Median length | 4 |
Mean length | 4.9855072 |
Min length | 4 |
Unique
Unique | 1 |
---|---|
Unique (%) | 1.4% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 52 | 75.4% |
20100705 | 16 | 23.2% |
20090818 | 1 | 1.4% |
READNUM
Categorical
HIGH CORRELATION
 
Distinct | 2 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.7391304 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 0 |
---|---|
2nd row | 0 |
3rd row | 0 |
4th row | 0 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
0 | 52 | 75.4% |
<NA> | 17 | 24.6% |
CONTENT2
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 64 | 92.8% |
1111 | 5 | 7.2% |
BBS_KIND
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.7826087 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 64 | 92.8% |
A | 5 | 7.2% |
MOD_DATE
Date
Distinct | 43 |
---|---|
Distinct (%) | 62.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 684.0 B |
Minimum | 2001-01-01 00:00:00 |
---|---|
Maximum | 2011-04-18 00:00:00 |
ETC
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 69 |
---|---|
Missing (%) | 100.0% |
Memory size | 753.0 B |
Variable | BBSNUM | BBSTITLE | BBSFROMDATE | BBSTODATE | MOD_DATE |
---|---|---|---|---|---|
BBSNUM | 1.000 | 0.935 | 0.605 | 0.605 | 1.000 |
BBSTITLE | 0.935 | 1.000 | 1.000 | 1.000 | 0.983 |
BBSFROMDATE | 0.605 | 1.000 | 1.000 | 0.605 | 0.605 |
BBSTODATE | 0.605 | 1.000 | 0.605 | 1.000 | 0.605 |
MOD_DATE | 1.000 | 0.983 | 0.605 | 0.605 | 1.000 |
Variable | BBSFROMDATE | BBS_KIND | BBSTODATE | READNUM | CONTENT2 |
---|---|---|---|---|---|
BBSFROMDATE | 1.000 | NaN | 0.410 | NaN | NaN |
BBS_KIND | NaN | 1.000 | NaN | 1.000 | 1.000 |
BBSTODATE | 0.410 | NaN | 1.000 | NaN | NaN |
READNUM | NaN | 1.000 | NaN | 1.000 | 1.000 |
CONTENT2 | NaN | 1.000 | NaN | 1.000 | 1.000 |
Variable | BBSNUM | BBSFROMDATE | BBSTODATE | READNUM | CONTENT2 | BBS_KIND |
---|---|---|---|---|---|---|
BBSNUM | 1.000 | 0.410 | 0.410 | 1.000 | 1.000 | 1.000 |
BBSFROMDATE | 0.410 | 1.000 | 0.410 | 0.000 | 0.000 | 0.000 |
BBSTODATE | 0.410 | 0.410 | 1.000 | 0.000 | 0.000 | 0.000 |
READNUM | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 |
CONTENT2 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 |
BBS_KIND | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 |
First rows
Row | BBSNUM | BBSTITLE | BBSCONTENT | BBSCONTENTYN | BBSFROMDATE | BBSTODATE | READNUM | CONTENT2 | BBS_KIND | MOD_DATE | ETC |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 124 | 2006년도 암정복추진연구개발사업 지정과제(추가) 공고 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2006-08-14 | <NA> |
1 | 125 | 2006년도 암정복추진연구개발사업 추가 지정과제 선정결과 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2006-09-28 | <NA> |
2 | 130 | 암정복추진연구개발사업 연구성과물, 수행과제요약서, 진도보고서 제출 요청 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2006-12-08 | <NA> |
3 | 74 | 암정복추진연구개발사업「연구비 사용실적보고서 및 최종결과보고서」서식 안내 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2004-05-10 | <NA> |
4 | 65 | 2003년도 암정복추진연구개발사업 신청서 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2003-01-01 | <NA> |
5 | 66 | 암정복 추진연구개발사업 사업설명회 개최일정 변경 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2003-04-15 | <NA> |
6 | 67 | 2003년도 암정복추진연구사업 2차평가 대상자 및 구두발표평가 일정안내 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2003-05-22 | <NA> |
7 | 68 | 2003년도 암정복사업 2차평가 대상과제 선정결과 고지일정 변경 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2003-06-17 | <NA> |
8 | 77 | 제5기 암정복추진기획단장 위촉-김창민 국립암센터 연구소장 연임 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2004-09-01 | <NA> |
9 | 95 | 2005년도 암정복추진연구개발사업 선정결과 안내 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2005-06-28 | <NA> |
Last rows
Row | BBSNUM | BBSTITLE | BBSCONTENT | BBSCONTENTYN | BBSFROMDATE | BBSTODATE | READNUM | CONTENT2 | BBS_KIND | MOD_DATE | ETC |
---|---|---|---|---|---|---|---|---|---|---|---|
59 | 118 | 2006년도 암정복추진연구개발사업 계속과제 구두발표평가 일정 안내 | <NA> | <NA> | <NA> | <NA> | 0 | <NA> | <NA> | 2006-05-23 | <NA> |
60 | 283 | 2010년도 암정복추진연구개발사업 연차(단계)실적ㆍ계획서 제출요청 | <NA> | Y | 20100705 | 20100705 | <NA> | <NA> | <NA> | 2010-07-05 | <NA> |
61 | 284 | 2010년도 암정복추진연구개발사업 공고 | <NA> | Y | 20100705 | 20100705 | <NA> | <NA> | <NA> | 2010-07-05 | <NA> |
62 | 285 | 2010년도 암정복추진연구개발사업 계속과제(2년차 및 2단계 2, 3년차) 구두발표평가 실시 안내 | <NA> | Y | 20100705 | 20100705 | <NA> | <NA> | <NA> | 2010-07-05 | <NA> |
63 | 286 | 2010년도 암정복추진연구개발사업 계속과제(2년차 및 2단계 2,3년차) 선정결과 통보 | <NA> | Y | 20100705 | 20100705 | <NA> | <NA> | <NA> | 2010-07-05 | <NA> |
64 | 287 | 2010년도 신규 및 계속과제(3년차 및 2단계1년차) 선정결과 통보 - 이메일 공지 | <NA> | Y | 20100705 | 20100705 | <NA> | <NA> | <NA> | 2010-07-05 | <NA> |
65 | 289 | 11 | <NA> | Y | <NA> | <NA> | 0 | 1111 | A | 2011-04-18 | <NA> |
66 | 290 | 111 | <NA> | Y | <NA> | <NA> | 0 | 1111 | A | 2011-04-18 | <NA> |
67 | 291 | 111 | <NA> | Y | <NA> | <NA> | 0 | 1111 | A | 2011-04-18 | <NA> |
68 | 292 | 111 | <NA> | Y | <NA> | <NA> | 0 | 1111 | A | 2011-04-18 | <NA> |