Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 221 |
Missing cells | 183 |
Missing cells (%) | 7.5% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 19.6 KiB |
Average record size in memory | 90.6 B |
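The missing-cell percentage follows from the figures in the table above. A minimal sketch of the arithmetic, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 221
missing_cells = 183

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 2431
print(f"{missing_pct:.1f}%")  # 7.5%
```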
Variable types
Numeric | 2 |
---|---|
Categorical | 6 |
Text | 3 |
Dataset
Description | Provides meta-information for the pancreatic cancer registry (the data items to be provided, their types, sizes, per-item counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048699/fileData.do |
tblNm is highly overall correlated with NUM and 5 other fields | High correlation |
gpId is highly overall correlated with NUM and 5 other fields | High correlation |
gpNm is highly overall correlated with NUM and 5 other fields | High correlation |
tblId is highly overall correlated with NUM and 5 other fields | High correlation |
NUM is highly overall correlated with gpId and 4 other fields | High correlation |
colCnt is highly overall correlated with gpId and 3 other fields | High correlation |
dataType is highly overall correlated with dispFormat | High correlation |
dispFormat is highly overall correlated with NUM and 5 other fields | High correlation |
dataType is highly imbalanced (59.4%) | Imbalance |
colCnt has 183 (82.8%) missing values | Missing |
NUM has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 20:07:47.201709 |
---|---|
Analysis finished | 2023-12-12 20:07:48.684454 |
Duration | 1.48 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 221 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 111 |
Minimum | 1 |
---|---|
Maximum | 221 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.1 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 12 |
Q1 | 56 |
median | 111 |
Q3 | 166 |
95-th percentile | 210 |
Maximum | 221 |
Range | 220 |
Interquartile range (IQR) | 110 |
Descriptive statistics
Standard deviation | 63.941379 |
---|---|
Coefficient of variation (CV) | 0.57604846 |
Kurtosis | -1.2 |
Mean | 111 |
Median Absolute Deviation (MAD) | 55 |
Skewness | 0 |
Sum | 24531 |
Variance | 4088.5 |
Monotonicity | Strictly increasing |
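The NUM figures are consistent with a plain row counter. The sketch below recomputes them with the Python standard library, under the assumption that NUM holds the integers 1 to 221 (suggested by the minimum, maximum, distinct count, and monotonicity, but not stated in the report). It also assumes sample variance (n − 1 denominator) and inclusive linear-interpolation quartiles, which happen to reproduce the reported values.

```python
import statistics

# Assumption: NUM is the sequence 1..221.
values = list(range(1, 222))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
cv = std / mean                          # coefficient of variation

# Quartiles with inclusive linear interpolation.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, variance, mad, iqr, strictly_increasing)
```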
Value | Count | Frequency (%) |
1 | 1 | 0.5% |
153 | 1 | 0.5% |
142 | 1 | 0.5% |
143 | 1 | 0.5% |
144 | 1 | 0.5% |
145 | 1 | 0.5% |
146 | 1 | 0.5% |
147 | 1 | 0.5% |
148 | 1 | 0.5% |
149 | 1 | 0.5% |
Other values (211) | 211 |
Value | Count | Frequency (%) |
1 | 1 | |
2 | 1 | |
3 | 1 | |
4 | 1 | |
5 | 1 | |
6 | 1 | |
7 | 1 | |
8 | 1 | |
9 | 1 | |
10 | 1 |
Value | Count | Frequency (%) |
221 | 1 | |
220 | 1 | |
219 | 1 | |
218 | 1 | |
217 | 1 | |
216 | 1 | |
215 | 1 | |
214 | 1 | |
213 | 1 | |
212 | 1 |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 14 |
---|---|
Distinct (%) | 6.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
PNC_HLTH | |
---|---|
PNC_OPRT | |
PNC_COMP | |
PNC_SPR | |
PNC_CST | |
Other values (9) |
Length
Max length | 13 |
---|---|
Median length | 8 |
Mean length | 8.5972851 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | PNC_SUMMARY |
---|---|
2nd row | PNC_SUMMARY |
3rd row | PNC_SUMMARY |
4th row | PNC_SUMMARY |
5th row | PNC_SUMMARY |
Common Values
Value | Count | Frequency (%) |
PNC_HLTH | 73 | 33.0% |
PNC_OPRT | 29 | 13.1% |
PNC_COMP | 28 | 12.7% |
PNC_SPR | 21 | 9.5% |
PNC_CST | 12 | 5.4% |
PNC_FLUP_DEAD | 12 | 5.4% |
PNC_DIAG | 11 | 5.0% |
PNC_CHMO | 7 | 3.2% |
PNC_MIEX_SREX | 6 | 2.7% |
PNC_INIT_BX | 6 | 2.7% |
Other values (4) | 16 | 7.2% |
Length
Value | Count | Frequency (%) |
pnc_hlth | 73 | 33.0% |
pnc_oprt | 29 | 13.1% |
pnc_comp | 28 | 12.7% |
pnc_spr | 21 | 9.5% |
pnc_cst | 12 | 5.4% |
pnc_flup_dead | 12 | 5.4% |
pnc_diag | 11 | 5.0% |
pnc_chmo | 7 | 3.2% |
pnc_miex_srex | 6 | 2.7% |
pnc_init_bx | 6 | 2.7% |
Other values (4) | 16 | 7.2% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 14 |
---|---|
Distinct (%) | 6.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
기타건강정보 | |
---|---|
수술정보 | |
합병증 | |
외과병리보고서 | |
전이 및 재발 | |
Other values (9) |
Length
Max length | 17 |
---|---|
Median length | 16 |
Mean length | 6.1221719 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
기타건강정보 | 73 | 33.0% |
수술정보 | 29 | 13.1% |
합병증 | 28 | 12.7% |
외과병리보고서 | 21 | 9.5% |
전이 및 재발 | 12 | 5.4% |
사망 및 치료평가 | 12 | 5.4% |
진단정보 | 11 | 5.0% |
항암치료 | 7 | 3.2% |
진단검사(영상/시술) | 6 | 2.7% |
진단검사(Initial Bx) | 6 | 2.7% |
Other values (4) | 16 | 7.2% |
Length
Value | Count | Frequency (%) |
기타건강정보 | 73 | 26.0% |
수술정보 | 29 | 10.3% |
합병증 | 28 | 10.0% |
및 | 24 | 8.5% |
외과병리보고서 | 21 | 7.5% |
전이 | 12 | 4.3% |
재발 | 12 | 4.3% |
사망 | 12 | 4.3% |
치료평가 | 12 | 4.3% |
진단정보 | 11 | 3.9% |
Other values (8) | 47 |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 35 |
---|---|
Distinct (%) | 15.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
PE_PNC_COMP | |
---|---|
PE_PNC_SPR | |
MR_PNC_HLTH_9 | |
PE_PNC_OPRT | 10 |
MR_PNC_HLTH_4 | 9 |
Other values (30) |
Length
Max length | 18 |
---|---|
Median length | 17 |
Mean length | 12.687783 |
Min length | 10 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.9% |
Sample
1st row | PNC_SUMMARY_PTIF_V |
---|---|
2nd row | PNC_SUMMARY_PTIF_V |
3rd row | PNC_SUMMARY_PTIF_V |
4th row | PNC_SUMMARY_PTIF_V |
5th row | PNC_SUMMARY_PTIF_V |
Common Values
Value | Count | Frequency (%) |
PE_PNC_COMP | 28 | 12.7% |
PE_PNC_SPR | 21 | 9.5% |
MR_PNC_HLTH_9 | 14 | 6.3% |
PE_PNC_OPRT | 10 | 4.5% |
MR_PNC_HLTH_4 | 9 | 4.1% |
MR_PNC_HLTH_7 | 9 | 4.1% |
MR_PNC_HLTH_5 | 9 | 4.1% |
MR_PNC_HLTH_6 | 9 | 4.1% |
PE_PNC_CHMO | 7 | 3.2% |
PT_PNC_DEAD | 7 | 3.2% |
Other values (25) | 98 |
Length
Value | Count | Frequency (%) |
pe_pnc_comp | 28 | 12.7% |
pe_pnc_spr | 21 | 9.5% |
mr_pnc_hlth_9 | 14 | 6.3% |
pe_pnc_oprt | 10 | 4.5% |
mr_pnc_hlth_4 | 9 | 4.1% |
mr_pnc_hlth_7 | 9 | 4.1% |
mr_pnc_hlth_5 | 9 | 4.1% |
mr_pnc_hlth_6 | 9 | 4.1% |
pe_pnc_chmo | 7 | 3.2% |
pt_pnc_dead | 7 | 3.2% |
Other values (25) | 98 |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 34 |
---|---|
Distinct (%) | 15.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
합병증 | |
---|---|
외과병리보고서 | |
과거력 | |
수술정보 | 10 |
가족력(형제/자매) | 9 |
Other values (29) |
Length
Max length | 22 |
---|---|
Median length | 14 |
Mean length | 6.6063348 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.5% |
Sample
1st row | Patient info |
---|---|
2nd row | Patient info |
3rd row | Patient info |
4th row | Patient info |
5th row | Patient info |
Common Values
Value | Count | Frequency (%) |
합병증 | 28 | 12.7% |
외과병리보고서 | 21 | 9.5% |
과거력 | 14 | 6.3% |
수술정보 | 10 | 4.5% |
가족력(형제/자매) | 9 | 4.1% |
가족력(자녀) | 9 | 4.1% |
가족력(모) | 9 | 4.1% |
가족력(부) | 9 | 4.1% |
항암치료정보 | 7 | 3.2% |
음주력 | 7 | 3.2% |
Other values (24) | 98 |
Length
Value | Count | Frequency (%) |
합병증 | 28 | 9.6% |
외과병리보고서 | 21 | 7.2% |
결과 | 15 | 5.1% |
과거력 | 14 | 4.8% |
initial | 11 | 3.8% |
수술정보 | 10 | 3.4% |
가족력(형제/자매 | 9 | 3.1% |
가족력(자녀 | 9 | 3.1% |
가족력(모 | 9 | 3.1% |
가족력(부 | 9 | 3.1% |
Other values (34) | 158 |
colId
Text
Distinct | 214 |
---|---|
Distinct (%) | 96.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
Length
Max length | 19 |
---|---|
Median length | 16 |
Mean length | 12.253394 |
Min length | 5 |
Characters and Unicode
Total characters | 2708 |
---|---|
Distinct characters | 32 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 207 |
---|---|
Unique (%) | 93.7% |
Sample
1st row | FRMD_YMD |
---|---|
2nd row | DIAG_AGE |
3rd row | OPRT_NM |
4th row | ORD_YMD |
5th row | FRST_TRTM_RSRV_YMD |
Value | Count | Frequency (%) |
stag_rcrd_ymd | 2 | 0.9% |
ancd_ingr_nm | 2 | 0.9% |
ctx_cycl | 2 | 0.9% |
oprt_nm | 2 | 0.9% |
ord_ymd | 2 | 0.9% |
mtst_part_cmnt | 2 | 0.9% |
ancd_nm | 2 | 0.9% |
dcuz3_cmnt | 1 | 0.5% |
edu_dgre_cd | 1 | 0.5% |
adm_ymd | 1 | 0.5% |
Other values (204) | 204 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 436 | |
N | 238 | 8.8% |
T | 222 | 8.2% |
M | 209 | 7.7% |
C | 198 | 7.3% |
S | 161 | 5.9% |
D | 141 | 5.2% |
R | 132 | 4.9% |
Y | 110 | 4.1% |
H | 103 | 3.8% |
Other values (22) | 758 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2264 | |
Connector Punctuation | 436 | 16.1% |
Decimal Number | 8 | 0.3% |
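The category labels in this table correspond to Unicode general categories ("Uppercase Letter" is `Lu`, "Connector Punctuation" is `Pc`, "Decimal Number" is `Nd`). As an illustration only, the sketch below counts categories with the standard-library `unicodedata` module over the five sample colId values listed earlier. Whether the report uses this module internally is not stated here, and the five values are a small sample rather than the full column.

```python
import unicodedata
from collections import Counter

# The five colId sample rows shown in the "Sample" table.
sample_col_ids = [
    "FRMD_YMD",
    "DIAG_AGE",
    "OPRT_NM",
    "ORD_YMD",
    "FRST_TRTM_RSRV_YMD",
]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_col_ids for ch in value
)

print(dict(category_counts))  # {'Lu': 41, 'Pc': 7}
```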
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 238 | 10.5% |
T | 222 | 9.8% |
M | 209 | 9.2% |
C | 198 | 8.7% |
S | 161 | 7.1% |
D | 141 | 6.2% |
R | 132 | 5.8% |
Y | 110 | 4.9% |
H | 103 | 4.5% |
A | 93 | 4.1% |
Other values (16) | 657 |
Decimal Number
Value | Count | Frequency (%) |
1 | 3 | |
2 | 2 | |
3 | 1 | 12.5% |
4 | 1 | 12.5% |
6 | 1 | 12.5% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 436 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2264 | |
Common | 444 | 16.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 238 | 10.5% |
T | 222 | 9.8% |
M | 209 | 9.2% |
C | 198 | 8.7% |
S | 161 | 7.1% |
D | 141 | 6.2% |
R | 132 | 5.8% |
Y | 110 | 4.9% |
H | 103 | 4.5% |
A | 93 | 4.1% |
Other values (16) | 657 |
Common
Value | Count | Frequency (%) |
_ | 436 | |
1 | 3 | 0.7% |
2 | 2 | 0.5% |
3 | 1 | 0.2% |
4 | 1 | 0.2% |
6 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2708 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 436 | |
N | 238 | 8.8% |
T | 222 | 8.2% |
M | 209 | 7.7% |
C | 198 | 7.3% |
S | 161 | 5.9% |
D | 141 | 5.2% |
R | 132 | 4.9% |
Y | 110 | 4.1% |
H | 103 | 3.8% |
Other values (22) | 758 |
colNm
Text
Distinct | 206 |
---|---|
Distinct (%) | 93.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
Value | Count | Frequency (%) |
grade | 13 | 3.3% |
가족병력(부 | 9 | 2.3% |
가족병력(형제/자매 | 9 | 2.3% |
가족병력(모 | 9 | 2.3% |
가족병력(자녀 | 9 | 2.3% |
기타 | 8 | 2.0% |
유무 | 7 | 1.8% |
stage | 6 | 1.5% |
기타내용 | 5 | 1.3% |
invasion | 5 | 1.3% |
Other values (193) | 317 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 181 | 7.5% |
E | 135 | 5.6% |
A | 129 | 5.3% |
I | 111 | 4.6% |
T | 93 | 3.8% |
S | 84 | 3.5% |
R | 84 | 3.5% |
O | 80 | 3.3% |
N | 74 | 3.1% |
L | 66 | 2.7% |
Other values (148) | 1383 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 1235 | |
Other Letter | 891 | |
Space Separator | 181 | 7.5% |
Open Punctuation | 43 | 1.8% |
Close Punctuation | 43 | 1.8% |
Other Punctuation | 16 | 0.7% |
Dash Punctuation | 6 | 0.2% |
Decimal Number | 5 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
력 | 53 | 5.9% |
병 | 53 | 5.9% |
부 | 50 | 5.6% |
가 | 40 | 4.5% |
족 | 38 | 4.3% |
여 | 34 | 3.8% |
기 | 25 | 2.8% |
타 | 23 | 2.6% |
사 | 19 | 2.1% |
일 | 19 | 2.1% |
Other values (114) | 537 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 135 | |
A | 129 | |
I | 111 | 9.0% |
T | 93 | 7.5% |
S | 84 | 6.8% |
R | 84 | 6.8% |
O | 80 | 6.5% |
N | 74 | 6.0% |
L | 66 | 5.3% |
C | 65 | 5.3% |
Other values (15) | 314 |
Decimal Number
Value | Count | Frequency (%) |
2 | 2 | |
1 | 2 | |
3 | 1 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 14 | |
. | 2 | 12.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 181 |
Open Punctuation
Value | Count | Frequency (%) |
( | 43 |
Close Punctuation
Value | Count | Frequency (%) |
) | 43 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1235 | |
Hangul | 891 | |
Common | 294 | 12.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
력 | 53 | 5.9% |
병 | 53 | 5.9% |
부 | 50 | 5.6% |
가 | 40 | 4.5% |
족 | 38 | 4.3% |
여 | 34 | 3.8% |
기 | 25 | 2.8% |
타 | 23 | 2.6% |
사 | 19 | 2.1% |
일 | 19 | 2.1% |
Other values (114) | 537 |
Latin
Value | Count | Frequency (%) |
E | 135 | |
A | 129 | |
I | 111 | 9.0% |
T | 93 | 7.5% |
S | 84 | 6.8% |
R | 84 | 6.8% |
O | 80 | 6.5% |
N | 74 | 6.0% |
L | 66 | 5.3% |
C | 65 | 5.3% |
Other values (15) | 314 |
Common
Value | Count | Frequency (%) |
(space) | 181 | |
( | 43 | 14.6% |
) | 43 | 14.6% |
/ | 14 | 4.8% |
- | 6 | 2.0% |
. | 2 | 0.7% |
2 | 2 | 0.7% |
1 | 2 | 0.7% |
3 | 1 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1529 | |
Hangul | 891 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 181 | 11.8% |
E | 135 | 8.8% |
A | 129 | 8.4% |
I | 111 | 7.3% |
T | 93 | 6.1% |
S | 84 | 5.5% |
R | 84 | 5.5% |
O | 80 | 5.2% |
N | 74 | 4.8% |
L | 66 | 4.3% |
Other values (24) | 492 |
Hangul
Value | Count | Frequency (%) |
력 | 53 | 5.9% |
병 | 53 | 5.9% |
부 | 50 | 5.6% |
가 | 40 | 4.5% |
족 | 38 | 4.3% |
여 | 34 | 3.8% |
기 | 25 | 2.8% |
타 | 23 | 2.6% |
사 | 19 | 2.1% |
일 | 19 | 2.1% |
Other values (114) | 537 |
dataType
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 12 |
---|---|
Distinct (%) | 5.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
String() | |
---|---|
DATE | |
Integer() | 13 |
Float() | 6 |
Integer(code) | 6 |
Other values (7) | 9 |
Length
Max length | 13 |
---|---|
Median length | 8 |
Mean length | 7.8371041 |
Min length | 4 |
Unique
Unique | 5 |
---|---|
Unique (%) | 2.3% |
Sample
1st row | DATE |
---|---|
2nd row | Integer() |
3rd row | String() |
4th row | DATE |
5th row | DATE |
Common Values
Value | Count | Frequency (%) |
String() | 166 | 75.1% |
DATE | 21 | 9.5% |
Integer() | 13 | 5.9% |
Float() | 6 | 2.7% |
Integer(code) | 6 | 2.7% |
Float(51) | 2 | 0.9% |
Integer(1) | 2 | 0.9% |
String(code) | 1 | 0.5% |
Float(62) | 1 | 0.5% |
String(4000) | 1 | 0.5% |
Other values (2) | 2 | 0.9% |
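The 59.4% imbalance alert for dataType can be reproduced from the counts in the Common Values table. The sketch below assumes the imbalance score is one minus the Shannon entropy normalized by the log of the number of classes; that formula is an assumption here, chosen because it matches the reported figure. It also assumes the two values grouped under "Other values (2)" occur once each, which is consistent with the Unique count of 5.

```python
import math

# Counts from the Common Values table; the final two 1s are the
# "Other values (2)" row, assumed to be one occurrence each.
counts = [166, 21, 13, 6, 6, 2, 2, 1, 1, 1, 1, 1]
total = sum(counts)

# Shannon entropy in bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts)

# Assumed definition: 0 means perfectly balanced, 1 means a single class.
imbalance = 1 - entropy / math.log2(len(counts))

print(total)                # 221
print(f"{imbalance:.1%}")   # 59.4%
```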
Length
Value | Count | Frequency (%) |
string | 166 | 75.1% |
date | 21 | 9.5% |
integer | 13 | 5.9% |
float | 7 | 3.2% |
integer(code | 6 | 2.7% |
float(51 | 2 | 0.9% |
integer(1 | 2 | 0.9% |
string(code | 1 | 0.5% |
float(62 | 1 | 0.5% |
string(4000 | 1 | 0.5% |
colDesc
Text
Distinct | 219 |
---|---|
Distinct (%) | 99.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
Length
Max length | 54 |
---|---|
Median length | 25 |
Mean length | 13.40724 |
Min length | 5 |
Characters and Unicode
Total characters | 2963 |
---|---|
Distinct characters | 232 |
Distinct categories | 9 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 217 |
---|---|
Unique (%) | 98.2% |
Sample
1st row | 간암센터 외래 초진일 |
---|---|
2nd row | 췌장암 진단시 나이 |
3rd row | 췌장암 수술명 |
4th row | 항암치료 첫 치료시작일 |
5th row | 방사선치료 첫 치료시작일 |
Value | Count | Frequency (%) |
유무 | 44 | 6.1% |
수술 | 21 | 2.9% |
상세내용 | 21 | 2.9% |
및 | 17 | 2.4% |
기타 | 16 | 2.2% |
과거력 | 14 | 1.9% |
악성도 | 13 | 1.8% |
췌장암 | 11 | 1.5% |
첫번째 | 10 | 1.4% |
시 | 10 | 1.4% |
Other values (258) | 542 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 498 | 16.8% |
유 | 88 | 3.0% |
무 | 73 | 2.5% |
력 | 54 | 1.8% |
수 | 49 | 1.7% |
사 | 48 | 1.6% |
시 | 48 | 1.6% |
) | 46 | 1.6% |
( | 46 | 1.6% |
기 | 46 | 1.6% |
Other values (222) | 1967 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2051 | |
Space Separator | 498 | 16.8% |
Uppercase Letter | 255 | 8.6% |
Other Punctuation | 52 | 1.8% |
Close Punctuation | 46 | 1.6% |
Open Punctuation | 46 | 1.6% |
Decimal Number | 11 | 0.4% |
Dash Punctuation | 3 | 0.1% |
Other Number | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
유 | 88 | 4.3% |
무 | 73 | 3.6% |
력 | 54 | 2.6% |
수 | 49 | 2.4% |
사 | 48 | 2.3% |
시 | 48 | 2.3% |
기 | 46 | 2.2% |
가 | 45 | 2.2% |
암 | 43 | 2.1% |
술 | 42 | 2.0% |
Other values (189) | 1515 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 40 | |
L | 26 | |
T | 25 | |
E | 22 | |
A | 21 | |
I | 20 | |
N | 14 | 5.5% |
R | 14 | 5.5% |
O | 13 | 5.1% |
S | 11 | 4.3% |
Other values (10) | 49 |
Decimal Number
Value | Count | Frequency (%) |
2 | 3 | |
1 | 3 | |
4 | 2 | |
3 | 2 | |
0 | 1 | 9.1% |
Other Punctuation
Value | Count | Frequency (%) |
' | 26 | |
/ | 24 | |
, | 2 | 3.8% |
Space Separator
Value | Count | Frequency (%) |
(space) | 498 |
Close Punctuation
Value | Count | Frequency (%) |
) | 46 |
Open Punctuation
Value | Count | Frequency (%) |
( | 46 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 |
Other Number
Value | Count | Frequency (%) |
² | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 2051 | |
Common | 657 | 22.2% |
Latin | 255 | 8.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
유 | 88 | 4.3% |
무 | 73 | 3.6% |
력 | 54 | 2.6% |
수 | 49 | 2.4% |
사 | 48 | 2.3% |
시 | 48 | 2.3% |
기 | 46 | 2.2% |
가 | 45 | 2.2% |
암 | 43 | 2.1% |
술 | 42 | 2.0% |
Other values (189) | 1515 |
Latin
Value | Count | Frequency (%) |
C | 40 | |
L | 26 | |
T | 25 | |
E | 22 | |
A | 21 | |
I | 20 | |
N | 14 | 5.5% |
R | 14 | 5.5% |
O | 13 | 5.1% |
S | 11 | 4.3% |
Other values (10) | 49 |
Common
Value | Count | Frequency (%) |
(space) | 498 | |
) | 46 | 7.0% |
( | 46 | 7.0% |
' | 26 | 4.0% |
/ | 24 | 3.7% |
2 | 3 | 0.5% |
- | 3 | 0.5% |
1 | 3 | 0.5% |
, | 2 | 0.3% |
4 | 2 | 0.3% |
Other values (3) | 4 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 2051 | |
ASCII | 911 | |
None | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 498 | |
) | 46 | 5.0% |
( | 46 | 5.0% |
C | 40 | 4.4% |
L | 26 | 2.9% |
' | 26 | 2.9% |
T | 25 | 2.7% |
/ | 24 | 2.6% |
E | 22 | 2.4% |
A | 21 | 2.3% |
Other values (22) | 137 | 15.0% |
Hangul
Value | Count | Frequency (%) |
유 | 88 | 4.3% |
무 | 73 | 3.6% |
력 | 54 | 2.6% |
수 | 49 | 2.4% |
사 | 48 | 2.3% |
시 | 48 | 2.3% |
기 | 46 | 2.2% |
가 | 45 | 2.2% |
암 | 43 | 2.1% |
술 | 42 | 2.0% |
Other values (189) | 1515 |
None
Value | Count | Frequency (%) |
² | 1 |
colCnt
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 23 |
---|---|
Distinct (%) | 60.5% |
Missing | 183 |
Missing (%) | 82.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 8048.2895 |
Minimum | 14 |
---|---|
Maximum | 85076 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.1 KiB |
Quantile statistics
Minimum | 14 |
---|---|
5-th percentile | 48.3 |
Q1 | 514.75 |
median | 2269.5 |
Q3 | 2321 |
95-th percentile | 82753.8 |
Maximum | 85076 |
Range | 85062 |
Interquartile range (IQR) | 1806.25 |
Descriptive statistics
Standard deviation | 22613.591 |
---|---|
Coefficient of variation (CV) | 2.8097388 |
Kurtosis | 9.0122344 |
Mean | 8048.2895 |
Median Absolute Deviation (MAD) | 1301.5 |
Skewness | 3.2406059 |
Sum | 305835 |
Variance | 5.1137451 × 10⁸ |
Monotonicity | Not monotonic |
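The colCnt statistics appear to be computed over the non-missing values only. A short arithmetic check, using figures copied from the tables above (variable names are illustrative):

```python
# Figures copied from the colCnt tables.
n_observations = 221
n_missing = 183
n_distinct = 23
total_sum = 305835
std = 22613.591

# Statistics are taken over the rows that actually have a value.
n_present = n_observations - n_missing        # 38
mean = total_sum / n_present
distinct_pct = 100 * n_distinct / n_present
cv = std / mean                               # coefficient of variation

print(n_present)               # 38
print(f"{mean:.4f}")           # 8048.2895
print(f"{distinct_pct:.1f}%")  # 60.5%
```

This explains why Distinct (%) is 60.5% rather than 23/221: the denominator is the 38 populated rows.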
Value | Count | Frequency (%) |
2321 | 13 | 5.9% |
85076 | 2 | 0.9% |
4156 | 2 | 0.9% |
995 | 2 | 0.9% |
401 | 1 | 0.5% |
1427 | 1 | 0.5% |
941 | 1 | 0.5% |
68 | 1 | 0.5% |
512 | 1 | 0.5% |
564 | 1 | 0.5% |
Other values (13) | 13 | 5.9% |
(Missing) | 183 |
Value | Count | Frequency (%) |
14 | 1 | |
16 | 1 | |
54 | 1 | |
62 | 1 | |
68 | 1 | |
130 | 1 | |
401 | 1 | |
426 | 1 | |
427 | 1 | |
512 | 1 |
Value | Count | Frequency (%) |
85076 | 2 | 0.9% |
82344 | 1 | 0.5% |
4156 | 2 | 0.9% |
2321 | 13 | |
2320 | 1 | 0.5% |
2219 | 1 | 0.5% |
2058 | 1 | 0.5% |
1427 | 1 | 0.5% |
995 | 2 | 0.9% |
941 | 1 | 0.5% |
dispFormat
Categorical
HIGH CORRELATION
 
Distinct | 12 |
---|---|
Distinct (%) | 5.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | |
---|---|
텍스트 | |
Y : 유 / N : 무 | |
숫자 | |
YYYY-MM-DD | |
Other values (7) |
Length
Max length | 15 |
---|---|
Median length | 13 |
Mean length | 6.6877828 |
Min length | 2 |
Unique
Unique | 4 |
---|---|
Unique (%) | 1.8% |
Sample
1st row | YYYY-MM-DD |
---|---|
2nd row | 숫자 |
3rd row | 텍스트 |
4th row | YYYY-MM-DD |
5th row | YYYY-MM-DD |
Common Values
Value | Count | Frequency (%) |
<NA> | 60 | 27.1% |
텍스트 | 43 | 19.5% |
Y : 유 / N : 무 | 32 | 14.5% |
숫자 | 25 | 11.3% |
YYYY-MM-DD | 19 | 8.6% |
Y, Y |N, N | 13 | 5.9% |
Y 유 |N 무 | 13 | 5.9% |
Y : 내부 / N : 외부 | 12 | 5.4% |
원내검사코드 | 1 | 0.5% |
Free 텍스트 | 1 | 0.5% |
Other values (2) | 2 | 0.9% |
Length
Value | Count | Frequency (%) |
132 | ||
y | 84 | |
n | 84 | |
na | 60 | |
유 | 45 | 7.9% |
무 | 45 | 7.9% |
텍스트 | 44 | 7.8% |
숫자 | 25 | 4.4% |
yyyy-mm-dd | 20 | 3.5% |
내부 | 12 | 2.1% |
Other values (5) | 16 | 2.8% |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.915 | 0.915 | 0.984 | 0.980 | 0.433 | 0.963 | 0.758 |
gpId | 0.915 | 1.000 | 1.000 | 1.000 | 0.999 | 0.638 | 1.000 | 0.819 |
gpNm | 0.915 | 1.000 | 1.000 | 1.000 | 0.999 | 0.638 | 1.000 | 0.819 |
tblId | 0.984 | 1.000 | 1.000 | 1.000 | 1.000 | 0.741 | 1.000 | 0.936 |
tblNm | 0.980 | 0.999 | 0.999 | 1.000 | 1.000 | 0.745 | 1.000 | 0.936 |
dataType | 0.433 | 0.638 | 0.638 | 0.741 | 0.745 | 1.000 | 0.000 | 0.882 |
colCnt | 0.963 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 |
dispFormat | 0.758 | 0.819 | 0.819 | 0.936 | 0.936 | 0.882 | 0.000 | 1.000 |
Variable | dataType | tblNm | dispFormat | gpId | gpNm | tblId |
---|---|---|---|---|---|---|
dataType | 1.000 | 0.334 | 0.665 | 0.309 | 0.309 | 0.332 |
tblNm | 0.334 | 1.000 | 0.639 | 0.940 | 0.940 | 0.997 |
dispFormat | 0.665 | 0.639 | 1.000 | 0.506 | 0.506 | 0.639 |
gpId | 0.309 | 0.940 | 0.506 | 1.000 | 1.000 | 0.948 |
gpNm | 0.309 | 0.940 | 0.506 | 1.000 | 1.000 | 0.948 |
tblId | 0.332 | 0.997 | 0.639 | 0.948 | 0.948 | 1.000 |
Variable | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.017 | 0.678 | 0.678 | 0.812 | 0.810 | 0.196 | 0.551 |
colCnt | 0.017 | 1.000 | 0.957 | 0.957 | 0.866 | 0.866 | 0.000 | 0.000 |
gpId | 0.678 | 0.957 | 1.000 | 1.000 | 0.948 | 0.940 | 0.309 | 0.506 |
gpNm | 0.678 | 0.957 | 1.000 | 1.000 | 0.948 | 0.940 | 0.309 | 0.506 |
tblId | 0.812 | 0.866 | 0.948 | 0.948 | 1.000 | 0.997 | 0.332 | 0.639 |
tblNm | 0.810 | 0.866 | 0.940 | 0.940 | 0.997 | 1.000 | 0.334 | 0.639 |
dataType | 0.196 | 0.000 | 0.309 | 0.309 | 0.332 | 0.334 | 1.000 | 0.665 |
dispFormat | 0.551 | 0.000 | 0.506 | 0.506 | 0.639 | 0.639 | 0.665 | 1.000 |
First rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | PNC_SUMMARY | Summary | PNC_SUMMARY_PTIF_V | Patient info | FRMD_YMD | 초진일 | DATE | 간암센터 외래 초진일 | <NA> | YYYY-MM-DD |
1 | 2 | PNC_SUMMARY | Summary | PNC_SUMMARY_PTIF_V | Patient info | DIAG_AGE | 진단 시 나이 | Integer() | 췌장암 진단시 나이 | <NA> | 숫자 |
2 | 3 | PNC_SUMMARY | Summary | PNC_SUMMARY_PTIF_V | Patient info | OPRT_NM | 수술명 | String() | 췌장암 수술명 | <NA> | 텍스트 |
3 | 4 | PNC_SUMMARY | Summary | PNC_SUMMARY_PTIF_V | Patient info | ORD_YMD | 약물치료시작일 | DATE | 항암치료 첫 치료시작일 | <NA> | YYYY-MM-DD |
4 | 5 | PNC_SUMMARY | Summary | PNC_SUMMARY_PTIF_V | Patient info | FRST_TRTM_RSRV_YMD | 방사선치료시작일 | DATE | 방사선치료 첫 치료시작일 | <NA> | YYYY-MM-DD |
5 | 6 | PNC_DIAG | 진단정보 | RG_PNC_CNDX | 진단정보 | CLNC_DIAG_CD | 임상진단코드 | String(code) | 췌장암과 기타암 임상진단코드 | <NA> | 원내검사코드 |
6 | 7 | PNC_DIAG | 진단정보 | RG_PNC_CNDX | 진단정보 | CLNC_DIAG_NM | 임상진단명 | String() | 췌장암 및 기타암 임상진단명 | <NA> | 텍스트 |
7 | 8 | PNC_DIAG | 진단정보 | RG_PNC_CNDX | 진단정보 | DIAG_YMD | 진단일 | DATE | 췌장암 및 기타암 진단일자 | <NA> | YYYY-MM-DD |
8 | 9 | PNC_DIAG | 진단정보 | PT_PNC_BDMS | 증상 및 신체계측 | HT_MSRM_YMD | 신장 측정일 | DATE | 췌장암 진단 이후 첫번째 신장 측정일 | <NA> | YYYY-MM-DD |
9 | 10 | PNC_DIAG | 진단정보 | PT_PNC_BDMS | 증상 및 신체계측 | HT_VL | 신장 | Float(51) | 췌장암 진단 이후 첫번째 신장 값 | <NA> | 숫자 |
Last rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
211 | 212 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_CNCR_YN | 과거병력암여부 | String() | 과거력 암 유무 | 2321 | Y 유 |N 무 |
212 | 213 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_DEPR_YN | 과거병력우을증여부 | String() | 과거력 우울 유무 | 2321 | Y 유 |N 무 |
213 | 214 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_INSM_YN | 과거병력불면증여부 | String() | 과거력 불면 유무 | 2321 | Y 유 |N 무 |
214 | 215 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_CADZ_YN | 과거병력심장질환여부 | String() | 과거력 심장질환 유무 | 2321 | Y 유 |N 무 |
215 | 216 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_CADZ_CMNT | 과거병력심장질환내용 | String() | 과거력 심장질환 상세내용 | 68 | 텍스트 |
216 | 217 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_ETC_YN | 과거병력기타여부 | String() | 과거력 기타 유무 | 2321 | Y 유 |N 무 |
217 | 218 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_9 | 과거력 | PHIS_ETC_CMNT | 과거병력기타내용 | String() | 과거력 기타 상세내용 | 941 | 텍스트 |
218 | 219 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_10 | 증상/전원 정보 | MAIN_SYMP_YN | 주증상유무 | String() | 입원 시 주증상 유무 | 995 | Y 유 |N 무 |
219 | 220 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_10 | 증상/전원 정보 | MAIN_SYMP_CMNT | 증상 상세내용 | String() | 입원 시 주증상 상세내용 | 1427 | 텍스트 |
220 | 221 | PNC_HLTH | 기타건강정보 | MR_PNC_HLTH_10 | 증상/전원 정보 | OUTS_DIAG_TRANS_YN | 타 병원 진단 후 전원 | String() | 타 병원 진단 후 전원여부 | 2321 | Y 유 |N 무 |