Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 246 |
Missing cells | 114 |
Missing cells (%) | 4.2% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 21.8 KiB |
Average record size in memory | 90.5 B |
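The missing-cell percentage follows from the figures in this table: missing cells divided by the total number of cells (variables × observations). A minimal Python sketch of that arithmetic, using only the values reported above; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 246
missing_cells = 114

total_cells = n_variables * n_observations   # 2706 cells in total
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")  # 4.2%
```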
Variable types
Numeric | 2 |
---|---|
Categorical | 6 |
Text | 3 |
Dataset
Description | Provides metadata for the biliary tract cancer registry (data items to be provided, types, sizes, per-item record counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048698/fileData.do |
Alerts
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
---|---|
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
colCnt is highly overall correlated with gpId and 4 other fields | High correlation |
dispFormat is highly overall correlated with colCnt | High correlation |
dataType is highly imbalanced (63.3%) | Imbalance |
colCnt has 114 (46.3%) missing values | Missing |
NUM has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 13:11:21.220729 |
---|---|
Analysis finished | 2023-12-12 13:11:22.361377 |
Duration | 1.14 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 246 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 123.5 |
Minimum | 1 |
---|---|
Maximum | 246 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.3 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 13.25 |
Q1 | 62.25 |
median | 123.5 |
Q3 | 184.75 |
95-th percentile | 233.75 |
Maximum | 246 |
Range | 245 |
Interquartile range (IQR) | 122.5 |
Descriptive statistics
Standard deviation | 71.158274 |
---|---|
Coefficient of variation (CV) | 0.57618036 |
Kurtosis | -1.2 |
Mean | 123.5 |
Median Absolute Deviation (MAD) | 61.5 |
Skewness | 0 |
Sum | 30381 |
Variance | 5063.5 |
Monotonicity | Strictly increasing |
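Because NUM is unique, strictly increasing, and runs from 1 to 246 over 246 rows, it must be the sequence 1, 2, ..., 246, so the statistics above can be reproduced with the Python standard library. The sketch below assumes sample (n − 1) variance and linearly interpolated quantiles, which is what matches the reported values; it is an illustration, not the profiler's own code.

```python
import statistics

# NUM takes every integer from 1 to 246 exactly once.
num = list(range(1, 247))

mean = statistics.mean(num)
variance = statistics.variance(num)      # sample variance (n - 1 denominator)
std = statistics.stdev(num)
cv = std / mean                          # coefficient of variation

# Quartiles with linear interpolation over the inclusive range.
q1, median, q3 = statistics.quantiles(num, n=4, method="inclusive")

# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(x - median) for x in num)

print(sum(num), mean, variance)          # 30381 123.5 5063.5
print(round(std, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)      # 62.25 123.5 184.75 122.5 61.5
```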
Common Values
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
156 | 1 | 0.4% |
158 | 1 | 0.4% |
159 | 1 | 0.4% |
160 | 1 | 0.4% |
161 | 1 | 0.4% |
162 | 1 | 0.4% |
163 | 1 | 0.4% |
164 | 1 | 0.4% |
165 | 1 | 0.4% |
Other values (236) | 236 | 95.9% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
2 | 1 | 0.4% |
3 | 1 | 0.4% |
4 | 1 | 0.4% |
5 | 1 | 0.4% |
6 | 1 | 0.4% |
7 | 1 | 0.4% |
8 | 1 | 0.4% |
9 | 1 | 0.4% |
10 | 1 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
246 | 1 | 0.4% |
245 | 1 | 0.4% |
244 | 1 | 0.4% |
243 | 1 | 0.4% |
242 | 1 | 0.4% |
241 | 1 | 0.4% |
240 | 1 | 0.4% |
239 | 1 | 0.4% |
238 | 1 | 0.4% |
237 | 1 | 0.4% |
gpId
Categorical
HIGH CORRELATION
Distinct | 14 |
---|---|
Distinct (%) | 5.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 12 |
---|---|
Median length | 7 |
Mean length | 7.601626 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | BC_SUMMARY |
---|---|
2nd row | BC_SUMMARY |
3rd row | BC_SUMMARY |
4th row | BC_SUMMARY |
5th row | BC_SUMMARY |
Common Values
Value | Count | Frequency (%) |
BC_HLTH | 73 | 29.7% |
BC_SPR | 35 | 14.2% |
BC_OPRT | 31 | 12.6% |
BC_COMP | 28 | 11.4% |
BC_CCRT_RT | 15 | 6.1% |
BC_CST | 12 | 4.9% |
BC_FLUP_DEAD | 12 | 4.9% |
BC_DIAG | 11 | 4.5% |
BC_MIEX_SREX | 6 | 2.4% |
BC_INIT_BX | 6 | 2.4% |
Other values (4) | 17 | 6.9% |
Words
Value | Count | Frequency (%) |
bc_hlth | 73 | 29.7% |
bc_spr | 35 | 14.2% |
bc_oprt | 31 | 12.6% |
bc_comp | 28 | 11.4% |
bc_ccrt_rt | 15 | 6.1% |
bc_cst | 12 | 4.9% |
bc_flup_dead | 12 | 4.9% |
bc_diag | 11 | 4.5% |
bc_miex_srex | 6 | 2.4% |
bc_init_bx | 6 | 2.4% |
Other values (4) | 17 | 6.9% |
gpNm
Categorical
HIGH CORRELATION
Distinct | 14 |
---|---|
Distinct (%) | 5.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 17 |
---|---|
Median length | 16 |
Mean length | 6.199187 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
기타건강정보 | 73 | 29.7% |
외과병리보고서 | 35 | 14.2% |
수술정보 | 31 | 12.6% |
합병증 | 28 | 11.4% |
CCRT/RT | 15 | 6.1% |
전이 및 재발 | 12 | 4.9% |
사망 및 치료평가 | 12 | 4.9% |
진단정보 | 11 | 4.5% |
진단검사(영상/시술) | 6 | 2.4% |
진단검사(Initial Bx) | 6 | 2.4% |
Other values (4) | 17 | 6.9% |
Words
Value | Count | Frequency (%) |
기타건강정보 | 73 | 23.9% |
외과병리보고서 | 35 | 11.4% |
수술정보 | 31 | 10.1% |
합병증 | 28 | 9.2% |
및 | 24 | 7.8% |
ccrt/rt | 15 | 4.9% |
사망 | 12 | 3.9% |
치료평가 | 12 | 3.9% |
재발 | 12 | 3.9% |
전이 | 12 | 3.9% |
Other values (8) | 52 | 17.0% |
tblId
Categorical
HIGH CORRELATION
Distinct | 36 |
---|---|
Distinct (%) | 14.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 17 |
---|---|
Median length | 16 |
Mean length | 11.51626 |
Min length | 9 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.8% |
Sample
1st row | BC_SUMMARY_PTIF_V |
---|---|
2nd row | BC_SUMMARY_PTIF_V |
3rd row | BC_SUMMARY_PTIF_V |
4th row | BC_SUMMARY_PTIF_V |
5th row | BC_SUMMARY_PTIF_V |
Common Values
Value | Count | Frequency (%) |
PE_BC_SPR | 35 | 14.2% |
PE_BC_COMP | 28 | 11.4% |
MR_BC_HLTH_9 | 14 | 5.7% |
PE_BC_OPRT | 10 | 4.1% |
PE_BC_RTX_1 | 10 | 4.1% |
MR_BC_HLTH_4 | 9 | 3.7% |
MR_BC_HLTH_7 | 9 | 3.7% |
MR_BC_HLTH_5 | 9 | 3.7% |
MR_BC_HLTH_6 | 9 | 3.7% |
MR_BC_HLTH_2 | 7 | 2.8% |
Other values (26) | 106 | 43.1% |
Words
Value | Count | Frequency (%) |
pe_bc_spr | 35 | 14.2% |
pe_bc_comp | 28 | 11.4% |
mr_bc_hlth_9 | 14 | 5.7% |
pe_bc_oprt | 10 | 4.1% |
pe_bc_rtx_1 | 10 | 4.1% |
mr_bc_hlth_4 | 9 | 3.7% |
mr_bc_hlth_7 | 9 | 3.7% |
mr_bc_hlth_5 | 9 | 3.7% |
mr_bc_hlth_6 | 9 | 3.7% |
pt_bc_dead | 7 | 2.8% |
Other values (26) | 106 | 43.1% |
tblNm
Categorical
HIGH CORRELATION
Distinct | 35 |
---|---|
Distinct (%) | 14.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 22 |
---|---|
Median length | 14 |
Mean length | 6.5934959 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | Patient info |
---|---|
2nd row | Patient info |
3rd row | Patient info |
4th row | Patient info |
5th row | Patient info |
Common Values
Value | Count | Frequency (%) |
외과병리보고서 | 35 | 14.2% |
합병증 | 28 | 11.4% |
과거력 | 14 | 5.7% |
수술정보 | 10 | 4.1% |
RT 정보 | 10 | 4.1% |
가족력(형제/자매) | 9 | 3.7% |
가족력(부) | 9 | 3.7% |
가족력(자녀) | 9 | 3.7% |
가족력(모) | 9 | 3.7% |
사망정보 | 7 | 2.8% |
Other values (25) | 106 | 43.1% |
Words
Value | Count | Frequency (%) |
외과병리보고서 | 35 | 10.7% |
합병증 | 28 | 8.5% |
정보 | 16 | 4.9% |
결과 | 15 | 4.6% |
과거력 | 14 | 4.3% |
initial | 11 | 3.4% |
수술정보 | 10 | 3.0% |
rt | 10 | 3.0% |
가족력(모 | 9 | 2.7% |
가족력(자녀 | 9 | 2.7% |
Other values (35) | 171 | 52.1% |
colId
Text
Distinct | 237 |
---|---|
Distinct (%) | 96.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 19 |
---|---|
Median length | 17 |
Mean length | 12.373984 |
Min length | 5 |
Characters and Unicode
Total characters | 3044 |
---|---|
Distinct characters | 32 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 228 |
---|---|
Unique (%) | 92.7% |
Sample
1st row | DIAG_AGE |
---|---|
2nd row | FRMD_YMD |
3rd row | ORD_YMD |
4th row | FRST_TRTM_RSRV_YMD |
5th row | OPRT_NM |
Words
Value | Count | Frequency (%) |
ancd_nm | 2 | 0.8% |
clnc_diag_nm | 2 | 0.8% |
stag_rcrd_ymd | 2 | 0.8% |
mtst_part_cmnt | 2 | 0.8% |
ancd_ingr_nm | 2 | 0.8% |
ctx_cycl | 2 | 0.8% |
oprt_nm | 2 | 0.8% |
frst_trtm_rsrv_ymd | 2 | 0.8% |
ord_ymd | 2 | 0.8% |
last_vshs_ymd | 1 | 0.4% |
Other values (227) | 227 | 92.3% |
Most occurring characters
Value | Count | Frequency (%) |
_ | 487 | 16.0% |
T | 271 | 8.9% |
N | 269 | 8.8% |
M | 238 | 7.8% |
C | 223 | 7.3% |
S | 182 | 6.0% |
R | 163 | 5.4% |
D | 150 | 4.9% |
Y | 115 | 3.8% |
H | 110 | 3.6% |
Other values (22) | 836 | 27.5% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2549 | 83.7% |
Connector Punctuation | 487 | 16.0% |
Decimal Number | 8 | 0.3% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 271 | 10.6% |
N | 269 | 10.6% |
M | 238 | 9.3% |
C | 223 | 8.7% |
S | 182 | 7.1% |
R | 163 | 6.4% |
D | 150 | 5.9% |
Y | 115 | 4.5% |
H | 110 | 4.3% |
A | 99 | 3.9% |
Other values (16) | 729 | 28.6% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3 | 37.5% |
2 | 2 | 25.0% |
6 | 1 | 12.5% |
3 | 1 | 12.5% |
4 | 1 | 12.5% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 487 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2549 | 83.7% |
Common | 495 | 16.3% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 271 | 10.6% |
N | 269 | 10.6% |
M | 238 | 9.3% |
C | 223 | 8.7% |
S | 182 | 7.1% |
R | 163 | 6.4% |
D | 150 | 5.9% |
Y | 115 | 4.5% |
H | 110 | 4.3% |
A | 99 | 3.9% |
Other values (16) | 729 | 28.6% |
Common
Value | Count | Frequency (%) |
_ | 487 | 98.4% |
1 | 3 | 0.6% |
2 | 2 | 0.4% |
6 | 1 | 0.2% |
3 | 1 | 0.2% |
4 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3044 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 487 | 16.0% |
T | 271 | 8.9% |
N | 269 | 8.8% |
M | 238 | 7.8% |
C | 223 | 7.3% |
S | 182 | 6.0% |
R | 163 | 5.4% |
D | 150 | 4.9% |
Y | 115 | 3.8% |
H | 110 | 3.6% |
Other values (22) | 836 | 27.5% |
colNm
Text
Distinct | 230 |
---|---|
Distinct (%) | 93.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Words
Value | Count | Frequency (%) |
grade | 14 | 3.1% |
of | 10 | 2.2% |
가족병력(모 | 8 | 1.7% |
기타 | 8 | 1.7% |
invasion | 8 | 1.7% |
가족병력(부 | 8 | 1.7% |
가족병력(형제/자매 | 8 | 1.7% |
stage | 6 | 1.3% |
구분 | 6 | 1.3% |
resection | 5 | 1.1% |
Other values (242) | 378 | 82.4% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 217 | 7.6% |
A | 163 | 5.7% |
E | 161 | 5.6% |
I | 147 | 5.1% |
T | 121 | 4.2% |
O | 113 | 3.9% |
N | 104 | 3.6% |
S | 102 | 3.6% |
R | 101 | 3.5% |
C | 88 | 3.1% |
Other values (150) | 1551 | 54.1% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 1584 | 55.2% |
Other Letter | 951 | 33.2% |
Space Separator | 217 | 7.6% |
Open Punctuation | 43 | 1.5% |
Close Punctuation | 43 | 1.5% |
Other Punctuation | 19 | 0.7% |
Dash Punctuation | 6 | 0.2% |
Decimal Number | 5 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
부 | 54 | 5.7% |
력 | 53 | 5.6% |
병 | 51 | 5.4% |
가 | 40 | 4.2% |
족 | 38 | 4.0% |
여 | 37 | 3.9% |
기 | 26 | 2.7% |
타 | 24 | 2.5% |
일 | 22 | 2.3% |
자 | 20 | 2.1% |
Other values (116) | 586 | 61.6% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 163 | 10.3% |
E | 161 | 10.2% |
I | 147 | 9.3% |
T | 121 | 7.6% |
O | 113 | 7.1% |
N | 104 | 6.6% |
S | 102 | 6.4% |
R | 101 | 6.4% |
C | 88 | 5.6% |
L | 79 | 5.0% |
Other values (15) | 405 | 25.6% |
Decimal Number
Value | Count | Frequency (%) |
2 | 2 | 40.0% |
1 | 2 | 40.0% |
3 | 1 | 20.0% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 16 | 84.2% |
. | 3 | 15.8% |
Space Separator
Value | Count | Frequency (%) |
(space) | 217 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 43 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 43 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1584 | 55.2% |
Hangul | 951 | 33.2% |
Common | 333 | 11.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
부 | 54 | 5.7% |
력 | 53 | 5.6% |
병 | 51 | 5.4% |
가 | 40 | 4.2% |
족 | 38 | 4.0% |
여 | 37 | 3.9% |
기 | 26 | 2.7% |
타 | 24 | 2.5% |
일 | 22 | 2.3% |
자 | 20 | 2.1% |
Other values (116) | 586 | 61.6% |
Latin
Value | Count | Frequency (%) |
A | 163 | 10.3% |
E | 161 | 10.2% |
I | 147 | 9.3% |
T | 121 | 7.6% |
O | 113 | 7.1% |
N | 104 | 6.6% |
S | 102 | 6.4% |
R | 101 | 6.4% |
C | 88 | 5.6% |
L | 79 | 5.0% |
Other values (15) | 405 | 25.6% |
Common
Value | Count | Frequency (%) |
(space) | 217 | 65.2% |
( | 43 | 12.9% |
) | 43 | 12.9% |
/ | 16 | 4.8% |
- | 6 | 1.8% |
. | 3 | 0.9% |
2 | 2 | 0.6% |
1 | 2 | 0.6% |
3 | 1 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1917 | 66.8% |
Hangul | 951 | 33.2% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 217 | 11.3% |
A | 163 | 8.5% |
E | 161 | 8.4% |
I | 147 | 7.7% |
T | 121 | 6.3% |
O | 113 | 5.9% |
N | 104 | 5.4% |
S | 102 | 5.3% |
R | 101 | 5.3% |
C | 88 | 4.6% |
Other values (24) | 600 | 31.3% |
Hangul
Value | Count | Frequency (%) |
부 | 54 | 5.7% |
력 | 53 | 5.6% |
병 | 51 | 5.4% |
가 | 40 | 4.2% |
족 | 38 | 4.0% |
여 | 37 | 3.9% |
기 | 26 | 2.7% |
타 | 24 | 2.5% |
일 | 22 | 2.3% |
자 | 20 | 2.1% |
Other values (116) | 586 | 61.6% |
dataType
Categorical
IMBALANCE
Distinct | 14 |
---|---|
Distinct (%) | 5.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 13 |
---|---|
Median length | 8 |
Mean length | 7.7764228 |
Min length | 4 |
Unique
Unique | 5 |
---|---|
Unique (%) | 2.0% |
Sample
1st row | Integer() |
---|---|
2nd row | DATE |
3rd row | DATE |
4th row | DATE |
5th row | String() |
Common Values
Value | Count | Frequency (%) |
String() | 193 | 78.5% |
DATE | 13 | 5.3% |
Integer() | 12 | 4.9% |
<NA> | 11 | 4.5% |
Float() | 3 | 1.2% |
Integer(code) | 3 | 1.2% |
Float(51) | 2 | 0.8% |
String(code) | 2 | 0.8% |
Float(,) | 2 | 0.8% |
Float(62) | 1 | 0.4% |
Other values (4) | 4 | 1.6% |
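The alert for dataType reports an imbalance of 63.3%. That figure can be reproduced from the counts above if the score is taken to be one minus the Shannon entropy of the category distribution divided by its maximum (the logarithm of the number of categories). This definition is an assumption that happens to match the reported value, not something the report states; the sketch is illustrative only.

```python
import math

# Category counts from the "Common Values" table; the four values grouped
# under "Other values (4)" occur once each.
counts = [193, 13, 12, 11, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1]

total = sum(counts)                                  # 246 observations
entropy = -sum(c / total * math.log(c / total) for c in counts)
max_entropy = math.log(len(counts))                  # entropy of a uniform split

# Assumed definition: 0 = perfectly balanced, 1 = a single class.
imbalance = 1 - entropy / max_entropy

print(f"{imbalance:.1%}")  # 63.3%
```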
Words
Value | Count | Frequency (%) |
string | 193 | 78.5% |
date | 13 | 5.3% |
integer | 13 | 5.3% |
na | 11 | 4.5% |
float | 5 | 2.0% |
integer(code | 3 | 1.2% |
float(51 | 2 | 0.8% |
string(code | 2 | 0.8% |
float(62 | 1 | 0.4% |
string(4000 | 1 | 0.4% |
Other values (2) | 2 | 0.8% |
colDesc
Text
Distinct | 244 |
---|---|
Distinct (%) | 99.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 54 |
---|---|
Median length | 25 |
Mean length | 13.109756 |
Min length | 5 |
Characters and Unicode
Total characters | 3225 |
---|---|
Distinct characters | 242 |
Distinct categories | 9 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 242 |
---|---|
Unique (%) | 98.4% |
Sample
1st row | 담도암 진단시 나이 |
---|---|
2nd row | 간암센터 외래 초진일 |
3rd row | 항암치료 첫 치료시작일 |
4th row | 방사선치료 첫 치료시작일 |
5th row | 담도암 수술명 |
Words
Value | Count | Frequency (%) |
유무 | 43 | 5.5% |
상세내용 | 22 | 2.8% |
수술 | 21 | 2.7% |
및 | 20 | 2.6% |
기타 | 16 | 2.0% |
과거력 | 14 | 1.8% |
악성도 | 13 | 1.7% |
담도암 | 11 | 1.4% |
시 | 10 | 1.3% |
첫번째 | 10 | 1.3% |
Other values (296) | 602 | 77.0% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 536 | 16.6% |
유 | 87 | 2.7% |
무 | 72 | 2.2% |
사 | 59 | 1.8% |
수 | 55 | 1.7% |
력 | 55 | 1.7% |
시 | 52 | 1.6% |
부 | 48 | 1.5% |
기 | 46 | 1.4% |
( | 46 | 1.4% |
Other values (232) | 2169 | 67.3% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2273 | 70.5% |
Space Separator | 536 | 16.6% |
Uppercase Letter | 258 | 8.0% |
Other Punctuation | 50 | 1.6% |
Open Punctuation | 46 | 1.4% |
Close Punctuation | 46 | 1.4% |
Decimal Number | 12 | 0.4% |
Dash Punctuation | 3 | 0.1% |
Other Number | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
유 | 87 | 3.8% |
무 | 72 | 3.2% |
사 | 59 | 2.6% |
수 | 55 | 2.4% |
력 | 55 | 2.4% |
시 | 52 | 2.3% |
부 | 48 | 2.1% |
기 | 46 | 2.0% |
술 | 46 | 2.0% |
가 | 46 | 2.0% |
Other values (199) | 1707 | 75.1% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 40 | 15.5% |
T | 26 | 10.1% |
L | 26 | 10.1% |
E | 22 | 8.5% |
A | 22 | 8.5% |
I | 20 | 7.8% |
N | 14 | 5.4% |
R | 14 | 5.4% |
O | 13 | 5.0% |
S | 11 | 4.3% |
Other values (10) | 50 | 19.4% |
Decimal Number
Value | Count | Frequency (%) |
1 | 4 | 33.3% |
2 | 3 | 25.0% |
4 | 2 | 16.7% |
3 | 2 | 16.7% |
0 | 1 | 8.3% |
Other Punctuation
Value | Count | Frequency (%) |
' | 26 | 52.0% |
/ | 22 | 44.0% |
, | 2 | 4.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 536 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 46 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 46 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 | 100.0% |
Other Number
Value | Count | Frequency (%) |
² | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 2273 | 70.5% |
Common | 694 | 21.5% |
Latin | 258 | 8.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
유 | 87 | 3.8% |
무 | 72 | 3.2% |
사 | 59 | 2.6% |
수 | 55 | 2.4% |
력 | 55 | 2.4% |
시 | 52 | 2.3% |
부 | 48 | 2.1% |
기 | 46 | 2.0% |
술 | 46 | 2.0% |
가 | 46 | 2.0% |
Other values (199) | 1707 | 75.1% |
Latin
Value | Count | Frequency (%) |
C | 40 | 15.5% |
T | 26 | 10.1% |
L | 26 | 10.1% |
E | 22 | 8.5% |
A | 22 | 8.5% |
I | 20 | 7.8% |
N | 14 | 5.4% |
R | 14 | 5.4% |
O | 13 | 5.0% |
S | 11 | 4.3% |
Other values (10) | 50 | 19.4% |
Common
Value | Count | Frequency (%) |
(space) | 536 | 77.2% |
( | 46 | 6.6% |
) | 46 | 6.6% |
' | 26 | 3.7% |
/ | 22 | 3.2% |
1 | 4 | 0.6% |
- | 3 | 0.4% |
2 | 3 | 0.4% |
4 | 2 | 0.3% |
3 | 2 | 0.3% |
Other values (3) | 4 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 2273 | 70.5% |
ASCII | 951 | 29.5% |
None | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 536 | 56.4% |
( | 46 | 4.8% |
) | 46 | 4.8% |
C | 40 | 4.2% |
' | 26 | 2.7% |
T | 26 | 2.7% |
L | 26 | 2.7% |
E | 22 | 2.3% |
/ | 22 | 2.3% |
A | 22 | 2.3% |
Other values (22) | 139 | 14.6% |
Hangul
Value | Count | Frequency (%) |
유 | 87 | 3.8% |
무 | 72 | 3.2% |
사 | 59 | 2.6% |
수 | 55 | 2.4% |
력 | 55 | 2.4% |
시 | 52 | 2.3% |
부 | 48 | 2.1% |
기 | 46 | 2.0% |
술 | 46 | 2.0% |
가 | 46 | 2.0% |
Other values (199) | 1707 | 75.1% |
None
Value | Count | Frequency (%) |
² | 1 | 100.0% |
colCnt
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 83 |
---|---|
Distinct (%) | 62.9% |
Missing | 114 |
Missing (%) | 46.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4628.0909 |
Minimum | 1 |
---|---|
Maximum | 90932 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.3 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 103.5 |
Q1 | 384 |
median | 743.5 |
Q3 | 2321 |
95-th percentile | 28783 |
Maximum | 90932 |
Range | 90931 |
Interquartile range (IQR) | 1937 |
Descriptive statistics
Standard deviation | 13943.03 |
---|---|
Coefficient of variation (CV) | 3.0126957 |
Kurtosis | 24.843484 |
Mean | 4628.0909 |
Median Absolute Deviation (MAD) | 563.5 |
Skewness | 4.7912511 |
Sum | 610908 |
Variance | 1.9440808 × 10⁸ |
Monotonicity | Not monotonic |
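The derived figures for colCnt are mutually consistent once the 114 missing values are excluded: the mean is the sum divided by the 132 non-missing rows, the coefficient of variation is the standard deviation over the mean, and the variance is the standard deviation squared. A short Python sketch of that cross-check, using only numbers reported in the tables above; the variable names are illustrative.

```python
# Figures copied from the colCnt tables.
n_rows = 246
n_missing = 114
total_sum = 610908
std = 13943.03
q1, q3 = 384, 2321
minimum, maximum = 1, 90932

n_present = n_rows - n_missing       # non-missing values only
mean = total_sum / n_present         # the mean ignores missing values
cv = std / mean                      # coefficient of variation
variance = std ** 2

print(n_present, round(mean, 4))     # 132 4628.0909
print(round(cv, 4))                  # 3.0127
print(q3 - q1, maximum - minimum)    # 1937 90931
print(f"{variance:.4e}")             # 1.9441e+08
```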
Common Values
Value | Count | Frequency (%) |
2321 | 17 | 6.9% |
523 | 7 | 2.8% |
1009 | 7 | 2.8% |
658 | 6 | 2.4% |
180 | 4 | 1.6% |
4774 | 3 | 1.2% |
34339 | 3 | 1.2% |
828 | 2 | 0.8% |
712 | 2 | 0.8% |
166 | 2 | 0.8% |
Other values (73) | 79 | 32.1% |
(Missing) | 114 | 46.3% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
19 | 1 | 0.4% |
26 | 1 | 0.4% |
29 | 1 | 0.4% |
55 | 1 | 0.4% |
79 | 1 | 0.4% |
87 | 1 | 0.4% |
117 | 1 | 0.4% |
145 | 1 | 0.4% |
163 | 1 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
90932 | 2 | 0.8% |
69449 | 1 | 0.4% |
34339 | 3 | 1.2% |
28783 | 2 | 0.8% |
27949 | 1 | 0.4% |
17879 | 2 | 0.8% |
5718 | 1 | 0.4% |
4774 | 3 | 1.2% |
4750 | 1 | 0.4% |
4570 | 1 | 0.4% |
dispFormat
Categorical
HIGH CORRELATION
Distinct | 21 |
---|---|
Distinct (%) | 8.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 681 |
---|---|
Median length | 71 |
Mean length | 8.5447154 |
Min length | 2 |
Unique
Unique | 12 |
---|---|
Unique (%) | 4.9% |
Sample
1st row | 숫자 |
---|---|
2nd row | YYYY-MM-DD |
3rd row | YYYY-MM-DD |
4th row | YYYY-MM-DD |
5th row | 텍스트 |
Common Values
Value | Count | Frequency (%) |
<NA> | 105 | 42.7% |
텍스트 | 54 | 22.0% |
Y 유 |N 무 | 21 | 8.5% |
YYYY-MM-DD | 16 | 6.5% |
숫자 | 16 | 6.5% |
Y : 유 / N : 무 | 9 | 3.7% |
Y, Y |N, N | 8 | 3.3% |
Y Y |N N | 3 | 1.2% |
내부 | 외부 | 2 | 0.8% |
예)NEEDLE BIOPSY | 1 | 0.4% |
Other values (11) | 11 | 4.5% |
Words
Value | Count | Frequency (%) |
na | 105 | 18.6% |
(empty) | 81 | 14.3% |
텍스트 | 55 | 9.7% |
y | 52 | 9.2% |
n | 52 | 9.2% |
유 | 30 | 5.3% |
무 | 30 | 5.3% |
yyyy-mm-dd | 16 | 2.8% |
숫자 | 16 | 2.8% |
neutrophil | 2 | 0.4% |
Other values (122) | 127 | 22.4% |
Correlations
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.938 | 0.938 | 0.988 | 0.991 | 0.445 | 0.440 | 0.734 |
gpId | 0.938 | 1.000 | 1.000 | 1.000 | 1.000 | 0.565 | 0.847 | 0.822 |
gpNm | 0.938 | 1.000 | 1.000 | 1.000 | 1.000 | 0.565 | 0.847 | 0.822 |
tblId | 0.988 | 1.000 | 1.000 | 1.000 | 1.000 | 0.721 | 0.871 | 0.887 |
tblNm | 0.991 | 1.000 | 1.000 | 1.000 | 1.000 | 0.715 | 0.871 | 0.887 |
dataType | 0.445 | 0.565 | 0.565 | 0.721 | 0.715 | 1.000 | 0.000 | 0.853 |
colCnt | 0.440 | 0.847 | 0.847 | 0.871 | 0.871 | 0.000 | 1.000 | 0.908 |
dispFormat | 0.734 | 0.822 | 0.822 | 0.887 | 0.887 | 0.853 | 0.908 | 1.000 |
Variable | dispFormat | gpId | gpNm | tblId | dataType | tblNm |
---|---|---|---|---|---|---|
dispFormat | 1.000 | 0.457 | 0.457 | 0.456 | 0.434 | 0.456 |
gpId | 0.457 | 1.000 | 1.000 | 0.951 | 0.241 | 0.948 |
gpNm | 0.457 | 1.000 | 1.000 | 0.951 | 0.241 | 0.948 |
tblId | 0.456 | 0.951 | 0.951 | 1.000 | 0.295 | 0.998 |
dataType | 0.434 | 0.241 | 0.241 | 0.295 | 1.000 | 0.292 |
tblNm | 0.456 | 0.948 | 0.948 | 0.998 | 0.292 | 1.000 |
Variable | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | -0.090 | 0.752 | 0.752 | 0.851 | 0.849 | 0.204 | 0.381 |
colCnt | -0.090 | 1.000 | 0.686 | 0.686 | 0.613 | 0.613 | 0.000 | 0.716 |
gpId | 0.752 | 0.686 | 1.000 | 1.000 | 0.951 | 0.948 | 0.241 | 0.457 |
gpNm | 0.752 | 0.686 | 1.000 | 1.000 | 0.951 | 0.948 | 0.241 | 0.457 |
tblId | 0.851 | 0.613 | 0.951 | 0.951 | 1.000 | 0.998 | 0.295 | 0.456 |
tblNm | 0.849 | 0.613 | 0.948 | 0.948 | 0.998 | 1.000 | 0.292 | 0.456 |
dataType | 0.204 | 0.000 | 0.241 | 0.241 | 0.295 | 0.292 | 1.000 | 0.434 |
dispFormat | 0.381 | 0.716 | 0.457 | 0.457 | 0.456 | 0.456 | 0.434 | 1.000 |
First rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | BC_SUMMARY | Summary | BC_SUMMARY_PTIF_V | Patient info | DIAG_AGE | 진단 시 나이 | Integer() | 담도암 진단시 나이 | <NA> | 숫자 |
1 | 2 | BC_SUMMARY | Summary | BC_SUMMARY_PTIF_V | Patient info | FRMD_YMD | 초진일 | DATE | 간암센터 외래 초진일 | <NA> | YYYY-MM-DD |
2 | 3 | BC_SUMMARY | Summary | BC_SUMMARY_PTIF_V | Patient info | ORD_YMD | 약물치료시작일 | DATE | 항암치료 첫 치료시작일 | <NA> | YYYY-MM-DD |
3 | 4 | BC_SUMMARY | Summary | BC_SUMMARY_PTIF_V | Patient info | FRST_TRTM_RSRV_YMD | 방사선치료시작일 | DATE | 방사선치료 첫 치료시작일 | <NA> | YYYY-MM-DD |
4 | 5 | BC_SUMMARY | Summary | BC_SUMMARY_PTIF_V | Patient info | OPRT_NM | 수술명 | String() | 담도암 수술명 | <NA> | 텍스트 |
5 | 6 | BC_DIAG | 진단정보 | RG_BC_CNDX | 진단정보 | CLNC_DIAG_NM | 임상진단명 | String() | 담도암 및 기타암 임상진단명 | 4570 | 텍스트 |
6 | 7 | BC_DIAG | 진단정보 | RG_BC_CNDX | 진단정보 | CLNC_DIAG_CD | 담도암/기타암 임상진단코드 | String() | 담도암과 기타암 임상진단코드 | <NA> | 원내검사코드 |
7 | 8 | BC_DIAG | 진단정보 | RG_BC_CNDX | 진단정보 | DIAG_YMD | 진단일 | <NA> | 담도암 및 기타암 진단일자 | <NA> | YYYY-MM-DD |
8 | 9 | BC_DIAG | 진단정보 | PT_BC_BDMS | 증상 및 신체계측 | HT_MSRM_YMD | 신장 측정일 | DATE | 담도암 진단 이후 첫번째 신장 측정일 | 1694 | YYYY-MM-DD |
9 | 10 | BC_DIAG | 진단정보 | PT_BC_BDMS | 증상 및 신체계측 | HT_VL | 신장 | Float(51) | 담도암 진단 이후 첫번째 신장 값 | 1782 | 숫자 |
Last rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
236 | 237 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_CNCR_YN | 과거병력암여부 | String() | 과거력 암 유무 | <NA> | <NA> |
237 | 238 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_DEPR_YN | 과거병력우울증여부 | String() | 과거력 우울 유무 | <NA> | <NA> |
238 | 239 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_INSM_YN | 과거병력불면증여부 | String() | 과거력 불면 유무 | <NA> | <NA> |
239 | 240 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_CADZ_YN | 과거병력심장질환여부 | String() | 과거력 심장질환 유무 | <NA> | <NA> |
240 | 241 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_CADZ_CMNT | 과거병력심장질환내용 | String() | 과거력 심장질환 상세내용 | <NA> | <NA> |
241 | 242 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_ETC_YN | 과거병력기타여부 | String() | 과거력 기타 유무 | <NA> | <NA> |
242 | 243 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_9 | 과거력 | PHIS_ETC_CMNT | 과거병력기타내용 | String() | 과거력 기타 상세내용 | <NA> | <NA> |
243 | 244 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_10 | 증상/전원 정보 | MAIN_SYMP_YN | 증상 | String() | 입원 시 주증상 유무 | <NA> | <NA> |
244 | 245 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_10 | 증상/전원 정보 | MAIN_SYMP_CMNT | 증상 상세내용 | String() | 입원 시 주증상 상세내용 | <NA> | <NA> |
245 | 246 | BC_HLTH | 기타건강정보 | MR_BC_HLTH_10 | 증상/전원 정보 | OUTS_DIAG_TRANS_YN | 타 병원 진단 후 전원여부 | String() | 타 병원 진단 후 전원여부 | <NA> | <NA> |