Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 284 |
Missing cells | 284 |
Missing cells (%) | 9.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 25.4 KiB |
Average record size in memory | 91.5 B |
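The percentages and averages in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming "Missing cells (%)" is missing cells over total cells and that the average record size is total memory over observations (the variable names are illustrative, and the total size is already rounded in the report, so the last figure is only approximate):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 284
missing_cells = 284
total_size_kib = 25.4  # rounded in the report

total_cells = n_variables * n_observations        # rows x columns
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.1f} B")
```

The 284 missing cells equal one full column of 284 rows, which matches the alert that dispFormat is 100% missing.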
Variable types
Numeric | 2 |
---|---|
Categorical | 5 |
Text | 3 |
Unsupported | 1 |
Dataset
Description | Provides metadata for the colorectal cancer registry (the data items to be provided, their types, sizes, per-item record counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048689/fileData.do |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with colCnt and 4 other fields | High correlation |
colCnt is highly overall correlated with NUM and 4 other fields | High correlation |
dispFormat has 284 (100.0%) missing values | Missing |
NUM has unique values | Unique |
dispFormat is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
colCnt has 98 (34.5%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-12 13:10:52.065721 |
---|---|
Analysis finished | 2023-12-12 13:10:53.376701 |
Duration | 1.31 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
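The reported duration can be checked against the two timestamps. A small sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-12 13:10:52.065721", FMT)
finished = datetime.strptime("2023-12-12 13:10:53.376701", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```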
NUM
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 284 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 142.5 |
Minimum | 1 |
---|---|
Maximum | 284 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 15.15 |
Q1 | 71.75 |
median | 142.5 |
Q3 | 213.25 |
95-th percentile | 269.85 |
Maximum | 284 |
Range | 283 |
Interquartile range (IQR) | 141.5 |
Descriptive statistics
Standard deviation | 82.127949 |
---|---|
Coefficient of variation (CV) | 0.57633648 |
Kurtosis | -1.2 |
Mean | 142.5 |
Median Absolute Deviation (MAD) | 71 |
Skewness | 0 |
Sum | 40470 |
Variance | 6745 |
Monotonicity | Strictly increasing |
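Since NUM has 284 distinct values between 1 and 284 and sums to 40470, it behaves like a plain row counter. The sketch below recomputes the statistics above under the assumption that NUM is exactly 1, 2, ..., 284, using sample variance (n - 1) and linear-interpolation percentiles; the `percentile` helper is illustrative, not the tool's own code.

```python
import statistics

values = list(range(1, 285))  # assumption: NUM is exactly 1..284

def percentile(sorted_values, q):
    """Percentile by linear interpolation between neighbouring ranks."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = statistics.stdev(values)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
iqr = q3 - q1
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(mean, median, variance, round(std, 6), q1, q3, iqr, mad)
```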
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
189 | 1 | 0.4% |
195 | 1 | 0.4% |
194 | 1 | 0.4% |
193 | 1 | 0.4% |
192 | 1 | 0.4% |
191 | 1 | 0.4% |
190 | 1 | 0.4% |
188 | 1 | 0.4% |
197 | 1 | 0.4% |
Other values (274) | 274 | 96.5% |
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
2 | 1 | 0.4% |
3 | 1 | 0.4% |
4 | 1 | 0.4% |
5 | 1 | 0.4% |
6 | 1 | 0.4% |
7 | 1 | 0.4% |
8 | 1 | 0.4% |
9 | 1 | 0.4% |
10 | 1 | 0.4% |
Value | Count | Frequency (%) |
284 | 1 | 0.4% |
283 | 1 | 0.4% |
282 | 1 | 0.4% |
281 | 1 | 0.4% |
280 | 1 | 0.4% |
279 | 1 | 0.4% |
278 | 1 | 0.4% |
277 | 1 | 0.4% |
276 | 1 | 0.4% |
275 | 1 | 0.4% |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 13 |
---|---|
Distinct (%) | 4.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
_CLRC_CHMO | 58 |
---|---|
_CLRC_SPR | 52 |
_CLRC_OPRT | 42 |
_CLRC_RTX | 41 |
_CLRC_RLPS_MIST | 22 |
Other values (8) | 69 |
Length
Max length | 15 |
---|---|
Median length | 13 |
Mean length | 10.771127 |
Min length | 9 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _CLRC_Summary |
---|---|
2nd row | _CLRC_Summary |
3rd row | _CLRC_Summary |
4th row | _CLRC_Summary |
5th row | _CLRC_Summary |
Common Values
Value | Count | Frequency (%) |
_CLRC_CHMO | 58 | 20.4% |
_CLRC_SPR | 52 | 18.3% |
_CLRC_OPRT | 42 | 14.8% |
_CLRC_RTX | 41 | 14.4% |
_CLRC_RLPS_MIST | 22 | 7.7% |
_CLRC_Summary | 14 | 4.9% |
_CLRC_EVAL_DEAD | 14 | 4.9% |
_CLRC_CNDX | 11 | 3.9% |
_CLRC_COMP | 10 | 3.5% |
_CLRC_MIEX_INIT | 6 | 2.1% |
Other values (3) | 14 | 4.9% |
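Each frequency in this table is the count divided by the 284 observations. A short sketch of that calculation, using the counts listed above (the dictionary name is illustrative):

```python
# Counts copied from the gpId "Common Values" table.
counts = {
    "_CLRC_CHMO": 58,
    "_CLRC_SPR": 52,
    "_CLRC_OPRT": 42,
    "_CLRC_RTX": 41,
    "_CLRC_RLPS_MIST": 22,
    "_CLRC_Summary": 14,
    "_CLRC_EVAL_DEAD": 14,
    "_CLRC_CNDX": 11,
    "_CLRC_COMP": 10,
    "_CLRC_MIEX_INIT": 6,
    "Other values (3)": 14,
}

total = sum(counts.values())  # should equal the number of observations
percent = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percent.items():
    print(f"{name} | {counts[name]} | {pct}% |")
```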
Words
Value | Count | Frequency (%) |
clrc_chmo | 58 | 20.4% |
clrc_spr | 52 | 18.3% |
clrc_oprt | 42 | 14.8% |
clrc_rtx | 41 | 14.4% |
clrc_rlps_mist | 22 | 7.7% |
clrc_summary | 14 | 4.9% |
clrc_eval_dead | 14 | 4.9% |
clrc_cndx | 11 | 3.9% |
clrc_comp | 10 | 3.5% |
clrc_miex_init | 6 | 2.1% |
Other values (3) | 14 | 4.9% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 13 |
---|---|
Distinct (%) | 4.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
항암치료 | 58 |
---|---|
외과병리보고서 | 52 |
수술정보 | 42 |
방사선치료 | 41 |
전이 및 재발 | 22 |
Other values (8) | 69 |
Length
Max length | 17 |
---|---|
Median length | 16 |
Mean length | 5.915493 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
항암치료 | 58 | 20.4% |
외과병리보고서 | 52 | 18.3% |
수술정보 | 42 | 14.8% |
방사선치료 | 41 | 14.4% |
전이 및 재발 | 22 | 7.7% |
Summary | 14 | 4.9% |
사망 및 치료평가 | 14 | 4.9% |
진단정보 | 11 | 3.9% |
합병증 | 10 | 3.5% |
진단검사(영상/시술) | 6 | 2.1% |
Other values (3) | 14 | 4.9% |
Words
Value | Count | Frequency (%) |
항암치료 | 58 | 15.7% |
외과병리보고서 | 52 | 14.1% |
수술정보 | 42 | 11.4% |
방사선치료 | 41 | 11.1% |
및 | 36 | 9.7% |
전이 | 22 | 5.9% |
재발 | 22 | 5.9% |
치료평가 | 14 | 3.8% |
사망 | 14 | 3.8% |
summary | 14 | 3.8% |
Other values (7) | 55 | 14.9% |
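The word counts above are consistent with splitting each gpNm value on whitespace and lower-casing the pieces. The sketch below reproduces a few of them from the category counts; the tokenisation rule is an assumption inferred from the table, not the tool's documented behaviour, and only values made of plain space-separated words are included.

```python
from collections import Counter

# Category counts copied from the gpNm "Common Values" table.
category_counts = {
    "항암치료": 58,
    "외과병리보고서": 52,
    "수술정보": 42,
    "방사선치료": 41,
    "전이 및 재발": 22,
    "Summary": 14,
    "사망 및 치료평가": 14,
}

words = Counter()
for value, count in category_counts.items():
    # Assumed tokenisation: lower-case, then split on whitespace.
    for word in value.lower().split():
        words[word] += count

print(words.most_common())
```

The connective 및 reaches 36 because it appears in both 전이 및 재발 (22 rows) and 사망 및 치료평가 (14 rows).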
tblId
Categorical
HIGH CORRELATION
 
Distinct | 25 |
---|---|
Distinct (%) | 8.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
_CLRC_PE_SPR | 52 |
---|---|
_CLRC_PE_OPRT | 31 |
_CLRC_PE_CHMO | 20 |
_CLRC_PE_RTX | 16 |
_CLRC_PE_RTX_PRE | 15 |
Other values (20) | 150 |
Length
Max length | 18 |
---|---|
Median length | 13 |
Mean length | 13.327465 |
Min length | 11 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _CLRC_PT_TRGT |
---|---|
2nd row | _CLRC_PT_TRGT |
3rd row | _CLRC_PT_TRGT |
4th row | _CLRC_PT_TRGT |
5th row | _CLRC_PT_TRGT |
Common Values
Value | Count | Frequency (%) |
_CLRC_PE_SPR | 52 | 18.3% |
_CLRC_PE_OPRT | 31 | 10.9% |
_CLRC_PE_CHMO | 20 | 7.0% |
_CLRC_PE_RTX | 16 | 5.6% |
_CLRC_PE_RTX_PRE | 15 | 5.3% |
_CLRC_PT_TRGT | 14 | 4.9% |
_CLRC_PE_RLPS | 14 | 4.9% |
_CLRC_PE_4S | 12 | 4.2% |
_CLRC_PE_RTX_PST | 10 | 3.5% |
_CLRC_PE_COMP | 10 | 3.5% |
Other values (15) | 90 | 31.7% |
Words
Value | Count | Frequency (%) |
clrc_pe_spr | 52 | 18.3% |
clrc_pe_oprt | 31 | 10.9% |
clrc_pe_chmo | 20 | 7.0% |
clrc_pe_rtx | 16 | 5.6% |
clrc_pe_rtx_pre | 15 | 5.3% |
clrc_pt_trgt | 14 | 4.9% |
clrc_pe_rlps | 14 | 4.9% |
clrc_pe_4s | 12 | 4.2% |
clrc_pe_rtx_pst | 10 | 3.5% |
clrc_pe_comp | 10 | 3.5% |
Other values (15) | 90 | 31.7% |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 25 |
---|---|
Distinct (%) | 8.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
대장암 외과병리결과 | 52 |
---|---|
대장암 수술기록 | 31 |
대장암 항암치료 | 20 |
대장암 방사선치료 | 16 |
대장암 방사선치료전검사 | 15 |
Other values (20) | 150 |
Length
Max length | 19 |
---|---|
Median length | 16 |
Mean length | 9.7570423 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 대장암 대상자 |
---|---|
2nd row | 대장암 대상자 |
3rd row | 대장암 대상자 |
4th row | 대장암 대상자 |
5th row | 대장암 대상자 |
Common Values
Value | Count | Frequency (%) |
대장암 외과병리결과 | 52 | 18.3% |
대장암 수술기록 | 31 | 10.9% |
대장암 항암치료 | 20 | 7.0% |
대장암 방사선치료 | 16 | 5.6% |
대장암 방사선치료전검사 | 15 | 5.3% |
대장암 대상자 | 14 | 4.9% |
대장암 재발정보 | 14 | 4.9% |
대장암 4기진단정보 | 12 | 4.2% |
대장암 방사선치료후검사 | 10 | 3.5% |
대장암 합병증 | 10 | 3.5% |
Other values (15) | 90 | 31.7% |
Words
Value | Count | Frequency (%) |
대장암 | 284 | 47.9% |
외과병리결과 | 52 | 8.8% |
수술기록 | 31 | 5.2% |
항암치료 | 20 | 3.4% |
initial | 16 | 2.7% |
방사선치료 | 16 | 2.7% |
방사선치료전검사 | 15 | 2.5% |
대상자 | 14 | 2.4% |
재발정보 | 14 | 2.4% |
4기진단정보 | 12 | 2.0% |
Other values (18) | 119 | 20.1% |
colId
Text
Distinct | 253 |
---|---|
Distinct (%) | 89.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
oprt_ymd | 4 | 1.4% |
diag_ymd | 3 | 1.1% |
t4b_inst_cmnt | 2 | 0.7% |
meta_loca_cmnt | 2 | 0.7% |
tme_eval_cnmt | 2 | 0.7% |
exam_ymd | 2 | 0.7% |
tumr_size_vl | 2 | 0.7% |
ht_vl | 2 | 0.7% |
mtst_part_cmnt | 2 | 0.7% |
cexm_rslt_cmnt | 2 | 0.7% |
Other values (243) | 261 | 91.9% |
Most occurring characters
Value | Count | Frequency (%) |
_ | 458 | 15.1% |
T | 304 | 10.0% |
M | 281 | 9.2% |
N | 257 | 8.5% |
C | 215 | 7.1% |
R | 190 | 6.3% |
D | 185 | 6.1% |
A | 130 | 4.3% |
E | 126 | 4.1% |
S | 125 | 4.1% |
Other values (23) | 767 | 25.2% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2565 | 84.4% |
Connector Punctuation | 458 | 15.1% |
Decimal Number | 15 | 0.5% |
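The category names in this table correspond to Unicode general categories (Lu, Pc, Nd). A minimal sketch of such a tally using Python's `unicodedata` module, run on a handful of colId values taken from the sample rows of this report rather than on the full column; the `names` mapping is illustrative:

```python
import unicodedata
from collections import Counter

# A few colId values from the sample rows, not the whole column.
col_ids = ["DIAG_AGE", "BRTH_YMD", "SEX_CD", "DCUZ1_CMNT"]

# Readable labels for the general-category codes that occur here.
names = {
    "Lu": "Uppercase Letter",
    "Pc": "Connector Punctuation",
    "Nd": "Decimal Number",
}

categories = Counter(
    names[unicodedata.category(ch)] for col_id in col_ids for ch in col_id
)

for name, count in categories.most_common():
    print(f"{name} | {count} |")
```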
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 304 | 11.9% |
M | 281 | 11.0% |
N | 257 | 10.0% |
C | 215 | 8.4% |
R | 190 | 7.4% |
D | 185 | 7.2% |
A | 130 | 5.1% |
E | 126 | 4.9% |
S | 125 | 4.9% |
L | 96 | 3.7% |
Other values (16) | 656 | 25.6% |
Decimal Number
Value | Count | Frequency (%) |
4 | 4 | 26.7% |
2 | 4 | 26.7% |
1 | 2 | 13.3% |
6 | 2 | 13.3% |
3 | 2 | 13.3% |
8 | 1 | 6.7% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 458 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2565 | 84.4% |
Common | 473 | 15.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 304 | 11.9% |
M | 281 | 11.0% |
N | 257 | 10.0% |
C | 215 | 8.4% |
R | 190 | 7.4% |
D | 185 | 7.2% |
A | 130 | 5.1% |
E | 126 | 4.9% |
S | 125 | 4.9% |
L | 96 | 3.7% |
Other values (16) | 656 | 25.6% |
Common
Value | Count | Frequency (%) |
_ | 458 | 96.8% |
4 | 4 | 0.8% |
2 | 4 | 0.8% |
1 | 2 | 0.4% |
6 | 2 | 0.4% |
3 | 2 | 0.4% |
8 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3038 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 458 | 15.1% |
T | 304 | 10.0% |
M | 281 | 9.2% |
N | 257 | 8.5% |
C | 215 | 7.1% |
R | 190 | 6.3% |
D | 185 | 6.1% |
A | 130 | 4.3% |
E | 126 | 4.1% |
S | 125 | 4.1% |
Other values (23) | 767 | 25.2% |
colNm
Text
Distinct | 256 |
---|---|
Distinct (%) | 90.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
tumor | 19 | 3.2% |
of | 15 | 2.5% |
the | 12 | 2.0% |
재발 | 10 | 1.7% |
margin | 8 | 1.3% |
invasion | 7 | 1.2% |
lymph | 6 | 1.0% |
node | 6 | 1.0% |
기타 | 6 | 1.0% |
stage | 6 | 1.0% |
Other values (335) | 506 | 84.2% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 317 | 8.6% |
e | 228 | 6.2% |
o | 199 | 5.4% |
r | 197 | 5.4% |
t | 193 | 5.3% |
a | 192 | 5.2% |
i | 177 | 4.8% |
n | 149 | 4.1% |
s | 122 | 3.3% |
l | 112 | 3.0% |
Other values (182) | 1787 | 48.7% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2194 | 59.7% |
Other Letter | 728 | 19.8% |
Space Separator | 317 | 8.6% |
Uppercase Letter | 287 | 7.8% |
Close Punctuation | 40 | 1.1% |
Open Punctuation | 40 | 1.1% |
Other Punctuation | 27 | 0.7% |
Connector Punctuation | 17 | 0.5% |
Decimal Number | 14 | 0.4% |
Dash Punctuation | 8 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 228 | 10.4% |
o | 199 | 9.1% |
r | 197 | 9.0% |
t | 193 | 8.8% |
a | 192 | 8.8% |
i | 177 | 8.1% |
n | 149 | 6.8% |
s | 122 | 5.6% |
l | 112 | 5.1% |
m | 102 | 4.6% |
Other values (15) | 523 | 23.8% |
Uppercase Letter
Value | Count | Frequency (%) |
M | 26 | 9.1% |
D | 26 | 9.1% |
C | 21 | 7.3% |
T | 20 | 7.0% |
E | 20 | 7.0% |
N | 18 | 6.3% |
S | 18 | 6.3% |
A | 17 | 5.9% |
R | 17 | 5.9% |
P | 16 | 5.6% |
Other values (12) | 88 | 30.7% |
Decimal Number
Value | Count | Frequency (%) |
2 | 4 | 28.6% |
3 | 3 | 21.4% |
4 | 3 | 21.4% |
1 | 2 | 14.3% |
6 | 1 | 7.1% |
8 | 1 | 7.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 21 | 77.8% |
. | 4 | 14.8% |
: | 1 | 3.7% |
, | 1 | 3.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 317 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 40 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 40 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 17 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 8 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
≥ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2481 | 67.5% |
Hangul | 728 | 19.8% |
Common | 464 | 12.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Latin
Value | Count | Frequency (%) |
e | 228 | 9.2% |
o | 199 | 8.0% |
r | 197 | 7.9% |
t | 193 | 7.8% |
a | 192 | 7.7% |
i | 177 | 7.1% |
n | 149 | 6.0% |
s | 122 | 4.9% |
l | 112 | 4.5% |
m | 102 | 4.1% |
Other values (37) | 810 | 32.6% |
Common
Value | Count | Frequency (%) |
(space) | 317 | 68.3% |
) | 40 | 8.6% |
( | 40 | 8.6% |
/ | 21 | 4.5% |
_ | 17 | 3.7% |
- | 8 | 1.7% |
2 | 4 | 0.9% |
. | 4 | 0.9% |
3 | 3 | 0.6% |
4 | 3 | 0.6% |
Other values (6) | 7 | 1.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2944 | 80.2% |
Hangul | 728 | 19.8% |
Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 317 | 10.8% |
e | 228 | 7.7% |
o | 199 | 6.8% |
r | 197 | 6.7% |
t | 193 | 6.6% |
a | 192 | 6.5% |
i | 177 | 6.0% |
n | 149 | 5.1% |
s | 122 | 4.1% |
l | 112 | 3.8% |
Other values (52) | 1058 | 35.9% |
Hangul
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Math Operators
Value | Count | Frequency (%) |
≥ | 1 | 100.0% |
dataType
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 1.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
STRING | 213 |
---|---|
DATE | 47 |
INTEGER | 24 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 5.7535211 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | INTEGER |
---|---|
2nd row | DATE |
3rd row | STRING |
4th row | DATE |
5th row | STRING |
Common Values
Value | Count | Frequency (%) |
STRING | 213 | 75.0% |
DATE | 47 | 16.5% |
INTEGER | 24 | 8.5% |
Words
Value | Count | Frequency (%) |
string | 213 | 75.0% |
date | 47 | 16.5% |
integer | 24 | 8.5% |
colDesc
Text
Distinct | 256 |
---|---|
Distinct (%) | 90.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
tumor | 19 | 3.2% |
of | 15 | 2.5% |
the | 12 | 2.0% |
재발 | 10 | 1.7% |
margin | 8 | 1.3% |
invasion | 7 | 1.2% |
lymph | 6 | 1.0% |
node | 6 | 1.0% |
기타 | 6 | 1.0% |
stage | 6 | 1.0% |
Other values (335) | 506 | 84.2% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 317 | 8.6% |
e | 228 | 6.2% |
o | 199 | 5.4% |
r | 197 | 5.4% |
t | 193 | 5.3% |
a | 192 | 5.2% |
i | 177 | 4.8% |
n | 149 | 4.1% |
s | 122 | 3.3% |
l | 112 | 3.0% |
Other values (182) | 1787 | 48.7% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2194 | 59.7% |
Other Letter | 728 | 19.8% |
Space Separator | 317 | 8.6% |
Uppercase Letter | 287 | 7.8% |
Close Punctuation | 40 | 1.1% |
Open Punctuation | 40 | 1.1% |
Other Punctuation | 27 | 0.7% |
Connector Punctuation | 17 | 0.5% |
Decimal Number | 14 | 0.4% |
Dash Punctuation | 8 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 228 | 10.4% |
o | 199 | 9.1% |
r | 197 | 9.0% |
t | 193 | 8.8% |
a | 192 | 8.8% |
i | 177 | 8.1% |
n | 149 | 6.8% |
s | 122 | 5.6% |
l | 112 | 5.1% |
m | 102 | 4.6% |
Other values (15) | 523 | 23.8% |
Uppercase Letter
Value | Count | Frequency (%) |
M | 26 | 9.1% |
D | 26 | 9.1% |
C | 21 | 7.3% |
T | 20 | 7.0% |
E | 20 | 7.0% |
N | 18 | 6.3% |
S | 18 | 6.3% |
A | 17 | 5.9% |
R | 17 | 5.9% |
P | 16 | 5.6% |
Other values (12) | 88 | 30.7% |
Decimal Number
Value | Count | Frequency (%) |
2 | 4 | 28.6% |
3 | 3 | 21.4% |
4 | 3 | 21.4% |
1 | 2 | 14.3% |
6 | 1 | 7.1% |
8 | 1 | 7.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 21 | 77.8% |
. | 4 | 14.8% |
: | 1 | 3.7% |
, | 1 | 3.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 317 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 40 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 40 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 17 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 8 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
≥ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2481 | 67.5% |
Hangul | 728 | 19.8% |
Common | 464 | 12.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Latin
Value | Count | Frequency (%) |
e | 228 | 9.2% |
o | 199 | 8.0% |
r | 197 | 7.9% |
t | 193 | 7.8% |
a | 192 | 7.7% |
i | 177 | 7.1% |
n | 149 | 6.0% |
s | 122 | 4.9% |
l | 112 | 4.5% |
m | 102 | 4.1% |
Other values (37) | 810 | 32.6% |
Common
Value | Count | Frequency (%) |
(space) | 317 | 68.3% |
) | 40 | 8.6% |
( | 40 | 8.6% |
/ | 21 | 4.5% |
_ | 17 | 3.7% |
- | 8 | 1.7% |
2 | 4 | 0.9% |
. | 4 | 0.9% |
3 | 3 | 0.6% |
4 | 3 | 0.6% |
Other values (6) | 7 | 1.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2944 | 80.2% |
Hangul | 728 | 19.8% |
Math Operators | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 317 | 10.8% |
e | 228 | 7.7% |
o | 199 | 6.8% |
r | 197 | 6.7% |
t | 193 | 6.6% |
a | 192 | 6.5% |
i | 177 | 6.0% |
n | 149 | 5.1% |
s | 122 | 4.1% |
l | 112 | 3.8% |
Other values (52) | 1058 | 35.9% |
Hangul
Value | Count | Frequency (%) |
일 | 46 | 6.3% |
사 | 32 | 4.4% |
수 | 23 | 3.2% |
검 | 21 | 2.9% |
부 | 19 | 2.6% |
드 | 19 | 2.6% |
코 | 19 | 2.6% |
자 | 18 | 2.5% |
치 | 18 | 2.5% |
술 | 16 | 2.2% |
Other values (119) | 497 | 68.3% |
Math Operators
Value | Count | Frequency (%) |
≥ | 1 | 100.0% |
colCnt
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 109 |
---|---|
Distinct (%) | 38.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 74970.236 |
Minimum | 0 |
---|---|
Maximum | 7821950 |
Zeros | 98 |
Zeros (%) | 34.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 1321.5 |
Q3 | 12801 |
95-th percentile | 123415.4 |
Maximum | 7821950 |
Range | 7821950 |
Interquartile range (IQR) | 12801 |
Descriptive statistics
Standard deviation | 642776.1 |
---|---|
Coefficient of variation (CV) | 8.5737506 |
Kurtosis | 136.83565 |
Mean | 74970.236 |
Median Absolute Deviation (MAD) | 1321.5 |
Skewness | 11.68213 |
Sum | 21291547 |
Variance | 4.1316112 × 10¹¹ |
Monotonicity | Not monotonic |
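Several of the colCnt figures are derived from one another: the mean is the sum over the number of observations, the coefficient of variation is the standard deviation over the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. A quick consistency check using only the numbers reported above (inputs are rounded in the report, so the results match to within rounding):

```python
# Figures copied from the colCnt statistics.
n_observations = 284
total = 21291547
std = 642776.1
q1, q3 = 0, 12801

mean = total / n_observations   # arithmetic mean
cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
iqr = q3 - q1                   # interquartile range

print(f"mean={mean:.3f} cv={cv:.7f} variance={variance:.7e} iqr={iqr}")
```

A CV above 8, together with a median of 1321.5 against a mean near 75,000, reflects how strongly the two values above seven million dominate the distribution.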
Value | Count | Frequency (%) |
0 | 98 | 34.5% |
49741 | 11 | 3.9% |
12801 | 10 | 3.5% |
19106 | 7 | 2.5% |
1864 | 6 | 2.1% |
251 | 6 | 2.1% |
4883 | 6 | 2.1% |
10135 | 5 | 1.8% |
56267 | 5 | 1.8% |
443 | 4 | 1.4% |
Other values (99) | 126 | 44.4% |
Value | Count | Frequency (%) |
0 | 98 | 34.5% |
16 | 1 | 0.4% |
33 | 1 | 0.4% |
65 | 1 | 0.4% |
85 | 1 | 0.4% |
94 | 1 | 0.4% |
133 | 1 | 0.4% |
185 | 1 | 0.4% |
193 | 1 | 0.4% |
196 | 1 | 0.4% |
Value | Count | Frequency (%) |
7821950 | 1 | 0.4% |
7483859 | 1 | 0.4% |
516812 | 1 | 0.4% |
491323 | 1 | 0.4% |
354382 | 4 | 1.4% |
340848 | 1 | 0.4% |
133075 | 3 | 1.1% |
131312 | 1 | 0.4% |
130852 | 1 | 0.4% |
129485 | 1 | 0.4% |
dispFormat
Unsupported
MISSING, REJECTED, UNSUPPORTED
 
Missing | 284 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.6 KiB |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.925 | 0.925 | 0.972 | 0.972 | 0.338 | 0.241 |
gpId | 0.925 | 1.000 | 1.000 | 1.000 | 1.000 | 0.324 | 0.723 |
gpNm | 0.925 | 1.000 | 1.000 | 1.000 | 1.000 | 0.324 | 0.723 |
tblId | 0.972 | 1.000 | 1.000 | 1.000 | 1.000 | 0.530 | 0.754 |
tblNm | 0.972 | 1.000 | 1.000 | 1.000 | 1.000 | 0.530 | 0.754 |
dataType | 0.338 | 0.324 | 0.324 | 0.530 | 0.530 | 1.000 | 0.000 |
colCnt | 0.241 | 0.723 | 0.723 | 0.754 | 0.754 | 0.000 | 1.000 |
Variable | tblId | tblNm | gpNm | gpId | dataType |
---|---|---|---|---|---|
tblId | 1.000 | 1.000 | 0.978 | 0.978 | 0.310 |
tblNm | 1.000 | 1.000 | 0.978 | 0.978 | 0.310 |
gpNm | 0.978 | 0.978 | 1.000 | 1.000 | 0.189 |
gpId | 0.978 | 0.978 | 1.000 | 1.000 | 0.189 |
dataType | 0.310 | 0.310 | 0.189 | 0.189 | 1.000 |
Variable | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|---|
NUM | 1.000 | -0.536 | 0.730 | 0.730 | 0.793 | 0.793 | 0.213 |
colCnt | -0.536 | 1.000 | 0.675 | 0.675 | 0.643 | 0.643 | 0.000 |
gpId | 0.730 | 0.675 | 1.000 | 1.000 | 0.978 | 0.978 | 0.189 |
gpNm | 0.730 | 0.675 | 1.000 | 1.000 | 0.978 | 0.978 | 0.189 |
tblId | 0.793 | 0.643 | 0.978 | 0.978 | 1.000 | 1.000 | 0.310 |
tblNm | 0.793 | 0.643 | 0.978 | 0.978 | 1.000 | 1.000 | 0.310 |
dataType | 0.213 | 0.000 | 0.189 | 0.189 | 0.310 | 0.310 | 1.000 |
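The matrices above report 1.000 between tblId and tblNm and between gpId and gpNm, which is what an association measure gives when each ID maps to exactly one name. The report does not state which coefficient each matrix uses, so the sketch below is only an illustration with uncorrected Cramér's V (an assumption), applied to ID and name pairs that appear in the sample rows of this report.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Table IDs and names as they pair up in the sample rows (repeat counts are illustrative).
tbl_id = ["_CLRC_PT_TRGT"] * 3 + ["_CLRC_PE_EVAL"] * 2 + ["_CLRC_PT_DEAD"] * 2
tbl_nm = ["대장암 대상자"] * 3 + ["대장암 치료평가"] * 2 + ["대장암 사망정보"] * 2
one_to_one = cramers_v(tbl_id, tbl_nm)

# A hypothetical pair with no association at all, for contrast.
unrelated = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])

print(one_to_one, unrelated)
```

In practice this means the ID and name columns carry the same information, so one of each pair can be dropped before modelling.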
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | DIAG_AGE | 나이 | INTEGER | 나이 | 19106 | <NA> |
1 | 2 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | BRTH_YMD | 생년월일 | DATE | 생년월일 | 19106 | <NA> |
2 | 3 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | SEX_CD | 성별 | STRING | 성별 | 19106 | <NA> |
3 | 4 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | FRMD_YMD | 초진일 | DATE | 초진일 | 19087 | <NA> |
4 | 5 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | OPRT_CD | 수술코드 | STRING | 수술코드 | 10135 | <NA> |
5 | 6 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | OPRT_NM | 수술명 | STRING | 수술명 | 10135 | <NA> |
6 | 7 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | OPDR_ID | 집도의ID | STRING | 집도의ID | 10135 | <NA> |
7 | 8 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | OPDR_NM | 집도의 | STRING | 집도의 | 10135 | <NA> |
8 | 9 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | DRTR_YMD | 약물치료시작일 | DATE | 약물치료시작일 | 0 | <NA> |
9 | 10 | _CLRC_Summary | Summary | _CLRC_PT_TRGT | 대장암 대상자 | RATH_YMD | 방사선치료시작일 | DATE | 방사선치료시작일 | 3615 | <NA> |
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
274 | 275 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PE_EVAL | 대장암 치료평가 | FLUP_LOSS_CD | 마지막 F/U 상태 | STRING | 마지막 F/U 상태 | 0 | <NA> |
275 | 276 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PE_EVAL | 대장암 치료평가 | DFS_DRTN | Disease-free survival (DFS) | STRING | Disease-free survival (DFS) | 0 | <NA> |
276 | 277 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PE_EVAL | 대장암 치료평가 | OS_DRTN | Overall survival (OS) | STRING | Overall survival (OS) | 0 | <NA> |
277 | 278 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DEAD_YN | 사망여부 | STRING | 사망여부 | 19106 | <NA> |
278 | 279 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DEAD_YMD | 사망일 | DATE | 사망일 | 719 | <NA> |
279 | 280 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DCUZ1_CMNT | 사망 사유(text) | STRING | 사망 사유(text) | 712 | <NA> |
280 | 281 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DCUZ2_CMNT | 기타 사망원인1 | STRING | 기타 사망원인1 | 266 | <NA> |
281 | 282 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DCUZ3_CMNT | 기타 사망원인2 | STRING | 기타 사망원인2 | 133 | <NA> |
282 | 283 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DCUZ4_CMNT | 기타 사망원인3 | STRING | 기타 사망원인3 | 33 | <NA> |
283 | 284 | _CLRC_EVAL_DEAD | 사망 및 치료평가 | _CLRC_PT_DEAD | 대장암 사망정보 | DCUZ_KCD6_CMNT | 주 사망원인코드(KCD) | STRING | 주 사망원인코드(KCD) | 0 | <NA> |
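Going by the dataset description, colCnt appears to be the number of records available for each column, so a value of 0 marks a column that is defined but has no data. A small sketch of how such columns could be picked out, using a few of the first sample rows shown above (the tuple layout is illustrative):

```python
from collections import Counter

# (colId, dataType, colCnt) copied from the first sample rows.
rows = [
    ("DIAG_AGE", "INTEGER", 19106),
    ("BRTH_YMD", "DATE", 19106),
    ("SEX_CD", "STRING", 19106),
    ("FRMD_YMD", "DATE", 19087),
    ("OPRT_CD", "STRING", 10135),
    ("DRTR_YMD", "DATE", 0),
    ("RATH_YMD", "DATE", 3615),
]

# Columns that are defined in the registry but hold no records.
empty_columns = [col_id for col_id, _, count in rows if count == 0]

# How many of the sampled columns fall under each declared data type.
type_counts = Counter(data_type for _, data_type, _ in rows)

print(empty_columns)
print(dict(type_counts))
```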