Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 231 |
Missing cells | 236 |
Missing cells (%) | 9.3% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 20.4 KiB |
Average record size in memory | 90.6 B |
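The missing-cell percentage above can be checked by hand from the other figures in this report. The sketch below is only a sanity check, assuming that the two per-variable missing counts listed in the alerts (colCnt and colDesc) account for every missing cell; the variable names are illustrative.

```python
# Figures copied from the report: 11 variables, 231 observations.
n_variables = 11
n_observations = 231

# Per-variable missing counts taken from the report's alerts:
# colCnt is missing in every row, colDesc in 5 rows.
missing_by_variable = {"colCnt": 231, "colDesc": 5}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, round(missing_pct, 1))
```

Under that assumption the totals agree with the table: 236 missing cells out of 2541, or about 9.3%.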
Variable types
Numeric | 1 |
---|---|
Categorical | 6 |
Text | 3 |
Unsupported | 1 |
Dataset
Description | Provides metadata for the prostate cancer registry (the data items to be provided, their types, sizes, per-item record counts, sample data, and so on) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048688/fileData.do |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
dispFormat is highly overall correlated with NUM and 5 other fields | High correlation |
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
dataType is highly overall correlated with dispFormat | High correlation |
NUM is highly overall correlated with gpId and 4 other fields | High correlation |
dataType is highly imbalanced (71.7%) | Imbalance |
colDesc has 5 (2.2%) missing values | Missing |
colCnt has 231 (100.0%) missing values | Missing |
NUM has unique values | Unique |
colCnt is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
Analysis started | 2023-12-12 04:39:59.308562 |
---|---|
Analysis finished | 2023-12-12 04:40:00.429805 |
Duration | 1.12 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 231 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 116 |
Minimum | 1 |
---|---|
Maximum | 231 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 12.5 |
Q1 | 58.5 |
median | 116 |
Q3 | 173.5 |
95-th percentile | 219.5 |
Maximum | 231 |
Range | 230 |
Interquartile range (IQR) | 115 |
Descriptive statistics
Standard deviation | 66.828138 |
---|---|
Coefficient of variation (CV) | 0.57610464 |
Kurtosis | -1.2 |
Mean | 116 |
Median Absolute Deviation (MAD) | 58 |
Skewness | 0 |
Sum | 26796 |
Variance | 4466 |
Monotonicity | Strictly increasing |
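Because NUM is simply a row counter running from 1 to 231, the statistics above can be reproduced with the standard library alone. This is a rough cross-check rather than the profiler's own code; it assumes the report uses the sample (n - 1) variance and linear-interpolation quantiles, which is what the figures suggest.

```python
import statistics

values = list(range(1, 232))  # NUM takes each integer from 1 to 231 once

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)

# Median absolute deviation: median of |x - median|
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates between data points, like a linear percentile
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Strictly increasing: every value is larger than its predecessor
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance, round(std_dev, 6), round(cv, 8))
print(median, mad, q1, q3, iqr, strictly_increasing)
```

The results line up with the tables: sum 26796, mean 116, variance 4466, Q1 58.5, Q3 173.5, IQR 115 and MAD 58.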
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
160 | 1 | 0.4% |
148 | 1 | 0.4% |
149 | 1 | 0.4% |
150 | 1 | 0.4% |
151 | 1 | 0.4% |
152 | 1 | 0.4% |
153 | 1 | 0.4% |
154 | 1 | 0.4% |
155 | 1 | 0.4% |
Other values (221) | 221 |
Value | Count | Frequency (%) |
1 | 1 | |
2 | 1 | |
3 | 1 | |
4 | 1 | |
5 | 1 | |
6 | 1 | |
7 | 1 | |
8 | 1 | |
9 | 1 | |
10 | 1 |
Value | Count | Frequency (%) |
231 | 1 | |
230 | 1 | |
229 | 1 | |
228 | 1 | |
227 | 1 | |
226 | 1 | |
225 | 1 | |
224 | 1 | |
223 | 1 | |
222 | 1 |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 3.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
MR_PRST_HLTH | |
---|---|
PE_PRST_NHT | |
RG_CNDX_PRST | |
PE_PRST_LAB | |
PE_PRST_REBX | |
Other values (4) |
Length
Max length | 13 |
---|---|
Median length | 12 |
Mean length | 11.714286 |
Min length | 11 |
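The length figures for gpId can be recomputed from the value counts that this report lists under Common Values for the same variable. The sketch below is only an illustration of that arithmetic; the dictionary is copied from the report and the variable names are made up for the example.

```python
import statistics

# Value counts for gpId, as listed under "Common Values"
gp_id_counts = {
    "MR_PRST_HLTH": 66,
    "PE_PRST_NHT": 35,
    "RG_CNDX_PRST": 34,
    "PE_PRST_LAB": 34,
    "PE_PRST_REBX": 18,
    "PE_PRST_MIEX": 18,
    "PE_PRST_PINFO": 12,
    "PE_PRST_PSA": 9,
    "PT_PRST_DEAD": 5,
}

# One string length per row of the dataset
lengths = [len(value) for value, count in gp_id_counts.items() for _ in range(count)]

n_rows = len(lengths)
mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

print(n_rows, min(lengths), max(lengths), median_length, round(mean_length, 6))
```

The counts add up to all 231 rows, and the resulting minimum, median, maximum and mean match the Length table.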
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | PE_PRST_PINFO |
---|---|
2nd row | PE_PRST_PINFO |
3rd row | PE_PRST_PINFO |
4th row | PE_PRST_PINFO |
5th row | PE_PRST_PINFO |
Common Values
Value | Count | Frequency (%) |
MR_PRST_HLTH | 66 | |
PE_PRST_NHT | 35 | |
RG_CNDX_PRST | 34 | |
PE_PRST_LAB | 34 | |
PE_PRST_REBX | 18 | 7.8% |
PE_PRST_MIEX | 18 | 7.8% |
PE_PRST_PINFO | 12 | 5.2% |
PE_PRST_PSA | 9 | 3.9% |
PT_PRST_DEAD | 5 | 2.2% |
Words
Value | Count | Frequency (%) |
mr_prst_hlth | 66 | |
pe_prst_nht | 35 | |
rg_cndx_prst | 34 | |
pe_prst_lab | 34 | |
pe_prst_rebx | 18 | 7.8% |
pe_prst_miex | 18 | 7.8% |
pe_prst_pinfo | 12 | 5.2% |
pe_prst_psa | 9 | 3.9% |
pt_prst_dead | 5 | 2.2% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 3.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
기타건강정보 | |
---|---|
수술정보 | |
진단정보 | |
진단검사 | |
약물치료 | |
Other values (4) |
Length
Max length | 9 |
---|---|
Median length | 4 |
Mean length | 5.1861472 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
기타건강정보 | 66 | |
수술정보 | 35 | |
진단정보 | 34 | |
진단검사 | 34 | |
약물치료 | 18 | 7.8% |
전이 및 재발 | 18 | 7.8% |
Summary | 12 | 5.2% |
PSA_F/U | 9 | 3.9% |
사망 및 치료평가 | 5 | 2.2% |
Words
Value | Count | Frequency (%) |
기타건강정보 | 66 | |
수술정보 | 35 | |
진단정보 | 34 | |
진단검사 | 34 | |
및 | 23 | 8.3% |
약물치료 | 18 | 6.5% |
전이 | 18 | 6.5% |
재발 | 18 | 6.5% |
summary | 12 | 4.3% |
psa_f/u | 9 | 3.2% |
Other values (2) | 10 | 3.6% |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 33 |
---|---|
Distinct (%) | 14.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
PE_PRST_SPR | |
---|---|
MR_PRST_HLTH_1 | |
RG_PRST_ESMP_1 | 14 |
MR_PRST_HLTH_7 | 14 |
PE_PRST_FRBX_2 | 13 |
Other values (28) |
Length
Max length | 19 |
---|---|
Median length | 14 |
Mean length | 13.536797 |
Min length | 11 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | PE_PRST_CNST_V |
---|---|
2nd row | PE_PRST_CNST_V |
3rd row | PE_PRST_CNST_V |
4th row | PE_PRST_GSSC_V |
5th row | PE_PRST_GSSC_V |
Common Values
Value | Count | Frequency (%) |
PE_PRST_SPR | 27 | 11.7% |
MR_PRST_HLTH_1 | 15 | 6.5% |
RG_PRST_ESMP_1 | 14 | 6.1% |
MR_PRST_HLTH_7 | 14 | 6.1% |
PE_PRST_FRBX_2 | 13 | 5.6% |
PE_PRST_FRBX_1 | 11 | 4.8% |
MR_PRST_HLTH_5 | 9 | 3.9% |
MR_PRST_HLTH_2 | 9 | 3.9% |
MR_PRST_HLTH_3 | 9 | 3.9% |
MR_PRST_HLTH_4 | 9 | 3.9% |
Other values (23) | 101 |
Words
Value | Count | Frequency (%) |
pe_prst_spr | 27 | 11.7% |
mr_prst_hlth_1 | 15 | 6.5% |
rg_prst_esmp_1 | 14 | 6.1% |
mr_prst_hlth_7 | 14 | 6.1% |
pe_prst_frbx_2 | 13 | 5.6% |
pe_prst_frbx_1 | 11 | 4.8% |
mr_prst_hlth_5 | 9 | 3.9% |
mr_prst_hlth_2 | 9 | 3.9% |
mr_prst_hlth_3 | 9 | 3.9% |
mr_prst_hlth_4 | 9 | 3.9% |
Other values (23) | 101 |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 33 |
---|---|
Distinct (%) | 14.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
외과병리보고서 | |
---|---|
기타건강정보 | |
전자설문 | 14 |
과거력 | 14 |
원내 Initial 조직검사 | 13 |
Other values (28) |
Length
Max length | 27 |
---|---|
Median length | 21 |
Mean length | 8.017316 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | Cancer Stage |
---|---|
2nd row | Cancer Stage |
3rd row | Cancer Stage |
4th row | Initial Gleason Score |
5th row | Initial Gleason Score |
Common Values
Value | Count | Frequency (%) |
외과병리보고서 | 27 | 11.7% |
기타건강정보 | 15 | 6.5% |
전자설문 | 14 | 6.1% |
과거력 | 14 | 6.1% |
원내 Initial 조직검사 | 13 | 5.6% |
외부 Initial 조직검사 | 11 | 4.8% |
가족력(형제/자매) | 9 | 3.9% |
가족력(부) | 9 | 3.9% |
가족력(모) | 9 | 3.9% |
가족력(자녀) | 9 | 3.9% |
Other values (23) | 101 |
Words
Value | Count | Frequency (%) |
initial | 40 | 12.0% |
외과병리보고서 | 27 | 8.1% |
조직검사 | 24 | 7.2% |
psa | 15 | 4.5% |
기타건강정보 | 15 | 4.5% |
과거력 | 14 | 4.2% |
전자설문 | 14 | 4.2% |
원내 | 13 | 3.9% |
외부 | 11 | 3.3% |
clinical | 11 | 3.3% |
Other values (30) | 149 |
colId
Text
Distinct | 223 |
---|---|
Distinct (%) | 96.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
Length
Max length | 18 |
---|---|
Median length | 15 |
Mean length | 11.359307 |
Min length | 5 |
Characters and Unicode
Total characters | 2624 |
---|---|
Distinct characters | 35 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 218 |
---|---|
Unique (%) | 94.4% |
Sample
1st row | STAG_TYPE |
---|---|
2nd row | STAG_YMD |
3rd row | STAG_VL |
4th row | GLSN_FRST_SCR |
5th row | GLSN_SCND_SCR |
Value | Count | Frequency (%) |
glsn_frst_scr | 3 | 1.3% |
glsn_scnd_scr | 3 | 1.3% |
glsn_sum_scr | 3 | 1.3% |
prnr_invs | 2 | 0.9% |
max_tumr_len | 2 | 0.9% |
clrc1_part | 1 | 0.4% |
last_pt_stts | 1 | 0.4% |
clrc1_ymd | 1 | 0.4% |
clrc1_prgr_yn | 1 | 0.4% |
clrc1_seq | 1 | 0.4% |
Other values (213) | 213 |
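The report gives two different counts for colId: Distinct (223) and Unique (218). A small sketch may help show the difference, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once. The five repeated identifiers come from the table above; the `col_N` names are hypothetical placeholders for the 218 identifiers that occur once.

```python
from collections import Counter

# Identifiers that appear more than once, with their counts from the table
repeated = {
    "glsn_frst_scr": 3,
    "glsn_scnd_scr": 3,
    "glsn_sum_scr": 3,
    "prnr_invs": 2,
    "max_tumr_len": 2,
}
values = [name for name, count in repeated.items() for _ in range(count)]

# Hypothetical stand-ins for the remaining identifiers, each occurring once
values += [f"col_{i}" for i in range(218)]

counts = Counter(values)
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

n = len(values)
print(n, distinct, round(100 * distinct / n, 1), unique, round(100 * unique / n, 1))
```

With 231 rows this gives 223 distinct values (96.5%) and 218 unique values (94.4%), the same figures as the report.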
Most occurring characters
Value | Count | Frequency (%) |
_ | 421 | |
S | 222 | 8.5% |
T | 197 | 7.5% |
M | 179 | 6.8% |
N | 160 | 6.1% |
C | 155 | 5.9% |
R | 146 | 5.6% |
D | 127 | 4.8% |
L | 114 | 4.3% |
H | 113 | 4.3% |
Other values (25) | 790 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2188 | |
Connector Punctuation | 421 | 16.0% |
Decimal Number | 15 | 0.6% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
S | 222 | 10.1% |
T | 197 | 9.0% |
M | 179 | 8.2% |
N | 160 | 7.3% |
C | 155 | 7.1% |
R | 146 | 6.7% |
D | 127 | 5.8% |
L | 114 | 5.2% |
H | 113 | 5.2% |
Y | 92 | 4.2% |
Other values (16) | 683 |
Decimal Number
Value | Count | Frequency (%) |
1 | 7 | |
2 | 2 | 13.3% |
3 | 1 | 6.7% |
4 | 1 | 6.7% |
5 | 1 | 6.7% |
6 | 1 | 6.7% |
7 | 1 | 6.7% |
8 | 1 | 6.7% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 421 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2188 | |
Common | 436 | 16.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
S | 222 | 10.1% |
T | 197 | 9.0% |
M | 179 | 8.2% |
N | 160 | 7.3% |
C | 155 | 7.1% |
R | 146 | 6.7% |
D | 127 | 5.8% |
L | 114 | 5.2% |
H | 113 | 5.2% |
Y | 92 | 4.2% |
Other values (16) | 683 |
Common
Value | Count | Frequency (%) |
_ | 421 | |
1 | 7 | 1.6% |
2 | 2 | 0.5% |
3 | 1 | 0.2% |
4 | 1 | 0.2% |
5 | 1 | 0.2% |
6 | 1 | 0.2% |
7 | 1 | 0.2% |
8 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2624 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 421 | |
S | 222 | 8.5% |
T | 197 | 7.5% |
M | 179 | 6.8% |
N | 160 | 6.1% |
C | 155 | 5.9% |
R | 146 | 5.6% |
D | 127 | 4.8% |
L | 114 | 4.3% |
H | 113 | 4.3% |
Other values (25) | 790 |
colNm
Text
Distinct | 204 |
---|---|
Distinct (%) | 88.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
Value | Count | Frequency (%) |
과거력 | 14 | 3.1% |
score | 14 | 3.1% |
tumor | 13 | 2.9% |
검사 | 12 | 2.7% |
영상 | 12 | 2.7% |
psa | 11 | 2.4% |
date | 11 | 2.4% |
gleason | 9 | 2.0% |
ipss | 8 | 1.8% |
bx | 7 | 1.5% |
Other values (225) | 341 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 221 | 8.9% |
e | 132 | 5.3% |
o | 113 | 4.5% |
t | 82 | 3.3% |
a | 81 | 3.3% |
n | 75 | 3.0% |
s | 75 | 3.0% |
i | 68 | 2.7% |
r | 67 | 2.7% |
) | 67 | 2.7% |
Other values (169) | 1504 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1010 | |
Other Letter | 748 | |
Uppercase Letter | 314 | 12.6% |
Space Separator | 221 | 8.9% |
Close Punctuation | 67 | 2.7% |
Open Punctuation | 67 | 2.7% |
Other Punctuation | 23 | 0.9% |
Decimal Number | 23 | 0.9% |
Dash Punctuation | 6 | 0.2% |
Connector Punctuation | 6 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
력 | 52 | 7.0% |
가 | 42 | 5.6% |
족 | 37 | 4.9% |
자 | 31 | 4.1% |
부 | 27 | 3.6% |
과 | 24 | 3.2% |
일 | 22 | 2.9% |
사 | 22 | 2.9% |
검 | 22 | 2.9% |
기 | 16 | 2.1% |
Other values (106) | 453 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 132 | |
o | 113 | |
t | 82 | 8.1% |
a | 81 | 8.0% |
n | 75 | 7.4% |
s | 75 | 7.4% |
i | 68 | 6.7% |
r | 67 | 6.6% |
l | 52 | 5.1% |
c | 43 | 4.3% |
Other values (13) | 222 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 46 | |
P | 45 | |
I | 26 | 8.3% |
T | 23 | 7.3% |
B | 19 | 6.1% |
R | 17 | 5.4% |
C | 16 | 5.1% |
A | 16 | 5.1% |
E | 13 | 4.1% |
D | 12 | 3.8% |
Other values (12) | 81 |
Decimal Number
Value | Count | Frequency (%) |
1 | 6 | |
2 | 5 | |
0 | 4 | |
5 | 3 | |
4 | 1 | 4.3% |
3 | 1 | 4.3% |
6 | 1 | 4.3% |
8 | 1 | 4.3% |
7 | 1 | 4.3% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 14 | |
% | 5 | 21.7% |
* | 2 | 8.7% |
. | 2 | 8.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 221 |
Close Punctuation
Value | Count | Frequency (%) |
) | 67 |
Open Punctuation
Value | Count | Frequency (%) |
( | 67 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 6 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1324 | |
Hangul | 748 | |
Common | 413 | 16.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
력 | 52 | 7.0% |
가 | 42 | 5.6% |
족 | 37 | 4.9% |
자 | 31 | 4.1% |
부 | 27 | 3.6% |
과 | 24 | 3.2% |
일 | 22 | 2.9% |
사 | 22 | 2.9% |
검 | 22 | 2.9% |
기 | 16 | 2.1% |
Other values (106) | 453 |
Latin
Value | Count | Frequency (%) |
e | 132 | 10.0% |
o | 113 | 8.5% |
t | 82 | 6.2% |
a | 81 | 6.1% |
n | 75 | 5.7% |
s | 75 | 5.7% |
i | 68 | 5.1% |
r | 67 | 5.1% |
l | 52 | 3.9% |
S | 46 | 3.5% |
Other values (35) | 533 |
Common
Value | Count | Frequency (%) |
(space) | 221 | |
) | 67 | 16.2% |
( | 67 | 16.2% |
/ | 14 | 3.4% |
- | 6 | 1.5% |
1 | 6 | 1.5% |
_ | 6 | 1.5% |
2 | 5 | 1.2% |
% | 5 | 1.2% |
0 | 4 | 1.0% |
Other values (8) | 12 | 2.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1737 | |
Hangul | 748 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 221 | 12.7% |
e | 132 | 7.6% |
o | 113 | 6.5% |
t | 82 | 4.7% |
a | 81 | 4.7% |
n | 75 | 4.3% |
s | 75 | 4.3% |
i | 68 | 3.9% |
r | 67 | 3.9% |
) | 67 | 3.9% |
Other values (53) | 756 |
Hangul
Value | Count | Frequency (%) |
력 | 52 | 7.0% |
가 | 42 | 5.6% |
족 | 37 | 4.9% |
자 | 31 | 4.1% |
부 | 27 | 3.6% |
과 | 24 | 3.2% |
일 | 22 | 2.9% |
사 | 22 | 2.9% |
검 | 22 | 2.9% |
기 | 16 | 2.1% |
Other values (106) | 453 |
dataType
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 8 |
---|---|
Distinct (%) | 3.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | |
---|---|
STRING | 7 |
DATE | 6 |
String | 5 |
Date | 4 |
Other values (3) | 6 |
Length
Max length | 7 |
---|---|
Median length | 4 |
Mean length | 4.1731602 |
Min length | 4 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | String |
---|---|
2nd row | Date |
3rd row | String |
4th row | Integer |
5th row | Integer |
Common Values
Value | Count | Frequency (%) |
<NA> | 203 | |
STRING | 7 | 3.0% |
DATE | 6 | 2.6% |
String | 5 | 2.2% |
Date | 4 | 1.7% |
Integer | 3 | 1.3% |
INTEGER | 2 | 0.9% |
FLOAT | 1 | 0.4% |
Words
Value | Count | Frequency (%) |
na | 203 | |
string | 12 | 5.2% |
date | 10 | 4.3% |
integer | 5 | 2.2% |
float | 1 | 0.4% |
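Two things stand out in dataType: the 203 `<NA>` entries are literal strings (the variable reports 0 missing values), and the same type appears in several spellings (`STRING`/`String`, `DATE`/`Date`, `Integer`/`INTEGER`). The sketch below merges the case variants and computes a normalized-entropy imbalance score. The formula, 1 minus entropy divided by log2 of the number of classes, is an assumption about how the 71.7% alert is derived; it does reproduce that figure from the raw counts.

```python
import math
from collections import Counter

# Raw value counts for dataType, copied from "Common Values"
raw_counts = {
    "<NA>": 203, "STRING": 7, "DATE": 6, "String": 5,
    "Date": 4, "Integer": 3, "INTEGER": 2, "FLOAT": 1,
}

def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts.values())
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1 - entropy / math.log2(k)

# Merge spellings that differ only by letter case
normalized = Counter()
for label, count in raw_counts.items():
    normalized[label.lower()] += count

print(round(imbalance_score(raw_counts), 3))
print(dict(normalized))
```

After case folding, five classes remain (203, 12, 10, 5 and 1 rows), which matches the lowercased table above.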
colDesc
Text
MISSING
 
Distinct | 209 |
---|---|
Distinct (%) | 92.5% |
Missing | 5 |
Missing (%) | 2.2% |
Memory size | 1.9 KiB |
Length
Max length | 49 |
---|---|
Median length | 32 |
Mean length | 14.681416 |
Min length | 5 |
Characters and Unicode
Total characters | 3318 |
---|---|
Distinct characters | 250 |
Distinct categories | 8 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 195 |
---|---|
Unique (%) | 86.3% |
Sample
1st row | 병기 종류 |
---|---|
2nd row | 병기 기재 일자 |
3rd row | 진단 병기 값 |
4th row | 외과병리보고서의 Gleason score 1st |
5th row | 외과병리보고서의 Gleason score 2nd |
Value | Count | Frequency (%) |
환자 | 39 | 4.5% |
가족력 | 25 | 2.9% |
환자의 | 25 | 2.9% |
관련 | 24 | 2.7% |
유무 | 16 | 1.8% |
psa | 16 | 1.8% |
검사일 | 15 | 1.7% |
시행한 | 14 | 1.6% |
값 | 14 | 1.6% |
과거력 | 13 | 1.5% |
Other values (281) | 673 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 663 | 20.0% |
의 | 144 | 4.3% |
자 | 86 | 2.6% |
환 | 80 | 2.4% |
가 | 52 | 1.6% |
력 | 51 | 1.5% |
사 | 46 | 1.4% |
부 | 45 | 1.4% |
검 | 45 | 1.4% |
일 | 42 | 1.3% |
Other values (240) | 2064 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2182 | |
Space Separator | 663 | 20.0% |
Lowercase Letter | 250 | 7.5% |
Uppercase Letter | 184 | 5.5% |
Decimal Number | 21 | 0.6% |
Other Punctuation | 10 | 0.3% |
Close Punctuation | 4 | 0.1% |
Open Punctuation | 4 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
의 | 144 | 6.6% |
자 | 86 | 3.9% |
환 | 80 | 3.7% |
가 | 52 | 2.4% |
력 | 51 | 2.3% |
사 | 46 | 2.1% |
부 | 45 | 2.1% |
검 | 45 | 2.1% |
일 | 42 | 1.9% |
전 | 38 | 1.7% |
Other values (192) | 1553 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 37 | |
P | 29 | |
A | 18 | |
I | 16 | |
B | 14 | 7.6% |
G | 13 | 7.1% |
M | 10 | 5.4% |
Q | 9 | 4.9% |
C | 9 | 4.9% |
T | 9 | 4.9% |
Other values (8) | 20 |
Lowercase Letter
Value | Count | Frequency (%) |
o | 40 | |
s | 39 | |
e | 37 | |
n | 28 | |
a | 24 | |
c | 20 | |
r | 15 | 6.0% |
l | 13 | 5.2% |
i | 7 | 2.8% |
y | 6 | 2.4% |
Other values (6) | 21 |
Decimal Number
Value | Count | Frequency (%) |
1 | 5 | |
2 | 4 | |
6 | 3 | |
4 | 2 | 9.5% |
8 | 2 | 9.5% |
3 | 2 | 9.5% |
5 | 1 | 4.8% |
7 | 1 | 4.8% |
0 | 1 | 4.8% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 9 | |
% | 1 | 10.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 663 |
Close Punctuation
Value | Count | Frequency (%) |
) | 4 |
Open Punctuation
Value | Count | Frequency (%) |
( | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 2182 | |
Common | 702 | 21.2% |
Latin | 434 | 13.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
의 | 144 | 6.6% |
자 | 86 | 3.9% |
환 | 80 | 3.7% |
가 | 52 | 2.4% |
력 | 51 | 2.3% |
사 | 46 | 2.1% |
부 | 45 | 2.1% |
검 | 45 | 2.1% |
일 | 42 | 1.9% |
전 | 38 | 1.7% |
Other values (192) | 1553 |
Latin
Value | Count | Frequency (%) |
o | 40 | 9.2% |
s | 39 | 9.0% |
S | 37 | 8.5% |
e | 37 | 8.5% |
P | 29 | 6.7% |
n | 28 | 6.5% |
a | 24 | 5.5% |
c | 20 | 4.6% |
A | 18 | 4.1% |
I | 16 | 3.7% |
Other values (24) | 146 |
Common
Value | Count | Frequency (%) |
(space) | 663 | |
/ | 9 | 1.3% |
1 | 5 | 0.7% |
2 | 4 | 0.6% |
) | 4 | 0.6% |
( | 4 | 0.6% |
6 | 3 | 0.4% |
4 | 2 | 0.3% |
8 | 2 | 0.3% |
3 | 2 | 0.3% |
Other values (4) | 4 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 2182 | |
ASCII | 1136 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 663 | |
o | 40 | 3.5% |
s | 39 | 3.4% |
S | 37 | 3.3% |
e | 37 | 3.3% |
P | 29 | 2.6% |
n | 28 | 2.5% |
a | 24 | 2.1% |
c | 20 | 1.8% |
A | 18 | 1.6% |
Other values (38) | 201 | 17.7% |
Hangul
Value | Count | Frequency (%) |
의 | 144 | 6.6% |
자 | 86 | 3.9% |
환 | 80 | 3.7% |
가 | 52 | 2.4% |
력 | 51 | 2.3% |
사 | 46 | 2.1% |
부 | 45 | 2.1% |
검 | 45 | 2.1% |
일 | 42 | 1.9% |
전 | 38 | 1.7% |
Other values (192) | 1553 |
colCnt
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 231 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.2 KiB |
dispFormat
Categorical
HIGH CORRELATION
 
Distinct | 29 |
---|---|
Distinct (%) | 12.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.9 KiB |
<NA> | |
---|---|
Y유 | N무 | |
텍스트 | 10 |
Y,유 | N,무 | 8 |
1 Not listed | 2 Present | 3 Not indentified | 6 |
Other values (24) |
Length
Max length | 132 |
---|---|
Median length | 4 |
Mean length | 11.896104 |
Min length | 2 |
Unique
Unique | 18 |
---|---|
Unique (%) | 7.8% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 131 | |
Y유 | N무 | 35 | 15.2% |
텍스트 | 10 | 4.3% |
Y,유 | N,무 | 8 | 3.5% |
1 Not listed | 2 Present | 3 Not indentified | 6 | 2.6% |
YYYY-MM-DD | 6 | 2.6% |
1,유 | 2, 무 | 3, 모름 | 4 | 1.7% |
숫자 | 4 | 1.7% |
1, Not listed | 2,Present | 3, Not indentified | 4 | 1.7% |
1 위 | 2 폐 | 3 간 | 4 대장 | 5 유방 | 6 자궁 | 7 기타 | 8 갑상선 | 9 전립선 | 3 | 1.3% |
Other values (19) | 20 | 8.7% |
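Many dispFormat values are code lists written as `code label` pairs separated by `|`, but with inconsistent punctuation (`Y유 | N무`, `Y,유 | N,무`, `1 Not listed | 2 Present`, `1, Not listed | 2,Present`). A tolerant parser along the following lines could turn them into mappings. The function name and the rule "a code is either digits or a single letter" are assumptions made for this sketch, not part of the dataset's documentation.

```python
import re

# A code is a run of digits or one letter, optionally followed by a comma
ITEM = re.compile(r"(\d+|[A-Za-z])\s*,?\s*(.+)")

def parse_code_list(disp_format):
    """Return {code: label} for a pipe-separated code list, else None."""
    if "|" not in disp_format:
        return None  # plain formats such as YYYY-MM-DD or free-text markers
    mapping = {}
    for item in disp_format.split("|"):
        match = ITEM.fullmatch(item.strip())
        if match is None:
            return None
        code, label = match.groups()
        mapping[code] = label.strip()
    return mapping

print(parse_code_list("Y유 | N무"))
print(parse_code_list("1, Not listed | 2,Present | 3, Not indentified"))
print(parse_code_list("YYYY-MM-DD"))
```

Labels are kept exactly as they appear in the source data, including its spelling, so that parsed values can still be matched against the original file.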
Words
Value | Count | Frequency (%) |
(pipe character) | 166 | |
na | 131 | |
n무 | 35 | 4.7% |
y유 | 35 | 4.7% |
3 | 32 | 4.3% |
not | 28 | 3.8% |
2 | 27 | 3.6% |
1 | 27 | 3.6% |
listed | 16 | 2.1% |
indentified | 11 | 1.5% |
Other values (95) | 238 |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.929 | 0.929 | 0.989 | 0.989 | 0.304 | 0.921 |
gpId | 0.929 | 1.000 | 1.000 | 1.000 | 1.000 | 0.585 | 0.994 |
gpNm | 0.929 | 1.000 | 1.000 | 1.000 | 1.000 | 0.585 | 0.994 |
tblId | 0.989 | 1.000 | 1.000 | 1.000 | 1.000 | 0.700 | 0.939 |
tblNm | 0.989 | 1.000 | 1.000 | 1.000 | 1.000 | 0.700 | 0.939 |
dataType | 0.304 | 0.585 | 0.585 | 0.700 | 0.700 | 1.000 | 1.000 |
dispFormat | 0.921 | 0.994 | 0.994 | 0.939 | 0.939 | 1.000 | 1.000 |
Variable | tblNm | dispFormat | gpNm | gpId | tblId | dataType |
---|---|---|---|---|---|---|
tblNm | 1.000 | 0.542 | 0.944 | 0.944 | 1.000 | 0.433 |
dispFormat | 0.542 | 1.000 | 0.779 | 0.779 | 0.542 | 0.953 |
gpNm | 0.944 | 0.779 | 1.000 | 1.000 | 0.944 | 0.408 |
gpId | 0.944 | 0.779 | 1.000 | 1.000 | 0.944 | 0.408 |
tblId | 1.000 | 0.542 | 0.944 | 0.944 | 1.000 | 0.433 |
dataType | 0.433 | 0.953 | 0.408 | 0.408 | 0.433 | 1.000 |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.753 | 0.753 | 0.866 | 0.866 | 0.211 | 0.599 |
gpId | 0.753 | 1.000 | 1.000 | 0.944 | 0.944 | 0.408 | 0.779 |
gpNm | 0.753 | 1.000 | 1.000 | 0.944 | 0.944 | 0.408 | 0.779 |
tblId | 0.866 | 0.944 | 0.944 | 1.000 | 1.000 | 0.433 | 0.542 |
tblNm | 0.866 | 0.944 | 0.944 | 1.000 | 1.000 | 0.433 | 0.542 |
dataType | 0.211 | 0.408 | 0.408 | 0.433 | 0.433 | 1.000 | 0.953 |
dispFormat | 0.599 | 0.779 | 0.779 | 0.542 | 0.542 | 0.953 | 1.000 |
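The matrices above do not name the coefficient they report. For pairs of categorical variables, one common measure of association is Cramér's V, and the sketch below shows the plain (uncorrected) version to illustrate why an ID column and its matching name column, such as gpId and gpNm, score 1.000. This is an assumption for illustration only; the profiler may use a bias-corrected or different coefficient, so its numbers need not match this function exactly.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-square over every cell of the contingency table
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Two gpId/gpNm pairs that appear together in the sample rows of this report
gp_id = ["PE_PRST_PINFO"] * 12 + ["MR_PRST_HLTH"] * 66
gp_nm = ["Summary"] * 12 + ["기타건강정보"] * 66
v_paired = cramers_v(gp_id, gp_nm)

# A toy pair of independent columns for contrast
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(round(v_paired, 3), round(v_independent, 3))
```

When each value of one column determines the value of the other, the coefficient reaches its maximum, which is why identifier and label columns describing the same thing show up as perfectly correlated.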
First rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | PE_PRST_PINFO | Summary | PE_PRST_CNST_V | Cancer Stage | STAG_TYPE | 구분 | String | 병기 종류 | <NA> | <NA> |
1 | 2 | PE_PRST_PINFO | Summary | PE_PRST_CNST_V | Cancer Stage | STAG_YMD | 일자 | Date | 병기 기재 일자 | <NA> | <NA> |
2 | 3 | PE_PRST_PINFO | Summary | PE_PRST_CNST_V | Cancer Stage | STAG_VL | 값 | String | 진단 병기 값 | <NA> | <NA> |
3 | 4 | PE_PRST_PINFO | Summary | PE_PRST_GSSC_V | Initial Gleason Score | GLSN_FRST_SCR | First Score | Integer | 외과병리보고서의 Gleason score 1st | <NA> | <NA> |
4 | 5 | PE_PRST_PINFO | Summary | PE_PRST_GSSC_V | Initial Gleason Score | GLSN_SCND_SCR | Second Score | Integer | 외과병리보고서의 Gleason score 2nd | <NA> | <NA> |
5 | 6 | PE_PRST_PINFO | Summary | PE_PRST_GSSC_V | Initial Gleason Score | GLSN_SUM_SCR | Sum Score | Integer | 외과병리보고서의 Gleason score sum | <NA> | <NA> |
6 | 7 | PE_PRST_PINFO | Summary | PE_PRST_PSA_V | PSA | PSA_BX_YMD | Initial Date | Date | 최초 PSA 검사일 | <NA> | <NA> |
7 | 8 | PE_PRST_PINFO | Summary | PE_PRST_PSA_V | PSA | PSA_BX_VL | Initial Value | String | 최초 PSA 검사값 | <NA> | <NA> |
8 | 9 | PE_PRST_PINFO | Summary | PE_PRST_PSA_V | PSA | PSA_PREOP_YMD | Pre OP Date | Date | 수술 직전 PSA 검사일 | <NA> | <NA> |
9 | 10 | PE_PRST_PINFO | Summary | PE_PRST_PSA_V | PSA | PSA_PREOP_VL | Pre OP Value | String | 수술 직전 PSA 검사값 | <NA> | <NA> |
Last rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
221 | 222 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_DM_YN | 과거력 당뇨 | <NA> | 환자의 당뇨관련 과거력 | <NA> | Y유 | N무 |
222 | 223 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_DM_CTNT | 과거력 당뇨 상세내용 | <NA> | 환자의 당뇨관련 과거력 상세내용 | <NA> | 텍스트 |
223 | 224 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_TB_YN | 과거력 결핵 | <NA> | 환자의 결핵 관련 과거력 | <NA> | Y유 | N무 |
224 | 225 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_LVDZ_YN | 과거력 간질환 | <NA> | 환자의 간질환 관련 과거력 | <NA> | Y유 | N무 |
225 | 226 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_CNCR_YN | 과거력 암 | <NA> | 환자의 암관련 과거력 | <NA> | Y유 | N무 |
226 | 227 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_DEPR_YN | 과거력 우울 | <NA> | 환자의 우울증 관련 과거력 | <NA> | Y유 | N무 |
227 | 228 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_INSM_YN | 과거력 불면 | <NA> | 환자의 불면증 관련 과거력 | <NA> | Y유 | N무 |
228 | 229 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_CADZ_YN | 과거력 심장질환 | <NA> | 환자의 심장실환 관련 과거력 | <NA> | Y유 | N무 |
229 | 230 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_7 | 과거력 | PHIS_CADZ_CTNT | 과거력 심장질환 상세내용 | <NA> | 환자의 심장질환 관련 과거력 상세내용 | <NA> | 텍스트 |
230 | 231 | MR_PRST_HLTH | 기타건강정보 | MR_PRST_HLTH_6 | 가족력(기타) | FMHS_ETC_YN | 가족력(기타)유무 | <NA> | 환자 의 가족력 유무 | <NA> | Y유 | N무 |