Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 295 |
Missing cells | 333 |
Missing cells (%) | 10.3% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 26.1 KiB |
Average record size in memory | 90.4 B |
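The missing-cell figures above can be cross-checked against the per-variable missing counts given in the variable sections of this report (295 for colCnt, 36 for dispFormat, 2 for colDesc). A minimal sketch of that arithmetic, with the counts hard-coded from the report rather than read from the data:

```python
# Per-variable missing counts as reported in the variable sections of this report.
missing_per_variable = {"colCnt": 295, "dispFormat": 36, "colDesc": 2}

n_rows = 295  # "Number of observations"
n_cols = 11   # "Number of variables"

missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(missing_cells)          # 333
print(f"{missing_pct:.1f}%")  # 10.3%
```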
Variable types
Numeric | 1 |
---|---|
Categorical | 5 |
Text | 4 |
Unsupported | 1 |
Dataset
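Description | Provides metadata for the kidney cancer registry (the data items to be provided, their types, sizes, per-item counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |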
URL | https://www.data.go.kr/data/15048684/fileData.do |
tblNm is highly overall correlated with NUM and 3 other fields | High correlation |
tblId is highly overall correlated with NUM and 3 other fields | High correlation |
gpNm is highly overall correlated with NUM and 3 other fields | High correlation |
gpId is highly overall correlated with NUM and 3 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
dataType is highly imbalanced (58.5%) | Imbalance |
colCnt has 295 (100.0%) missing values | Missing |
dispFormat has 36 (12.2%) missing values | Missing |
NUM has unique values | Unique |
colCnt is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
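The high-correlation alerts for gpId and gpNm are consistent with one column being the code and the other the display name of the same group: both have 15 distinct values with identical counts. That reading is an assumption, not something the report states. A small sketch of how a one-to-one mapping between two columns could be checked, using a few (gpId, gpNm) pairs from the sample rows as illustrative input:

```python
from collections import defaultdict

def is_one_to_one(pairs):
    """Return True if each left value maps to exactly one right value and vice versa."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    return (all(len(v) == 1 for v in left_to_right.values())
            and all(len(v) == 1 for v in right_to_left.values()))

# Illustrative (gpId, gpNm) pairs taken from the sample rows of this report.
pairs = [
    ("KDNY_SUMMARY_PTIF", "기본정보"),
    ("KDNY_SUMMARY_PTIF", "기본정보"),
    ("KDNY_KUOS", "비뇨기종양학회"),
]
print(is_one_to_one(pairs))  # True
```

If the check holds on the full data, one of the two columns could be dropped before modelling without losing information.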
Reproduction
Analysis started | 2024-04-19 06:27:37.316049 |
---|---|
Analysis finished | 2024-04-19 06:27:38.636324 |
Duration | 1.32 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
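A report of this kind can be regenerated with ydata-profiling roughly as follows. This is a sketch under assumptions: the function name, the output path and the default encoding are placeholders, and the configuration used for the original run (config.json) is not reproduced here.

```python
def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (sketch; paths are placeholders)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_html)
    return report
```

A call such as `build_report("registry_meta.csv", encoding="cp949")` uses a hypothetical file name; the encoding of the published file is not stated in the report and may need to be adjusted.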
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 295 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 148 |
Minimum | 1 |
---|---|
Maximum | 295 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.7 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 15.7 |
Q1 | 74.5 |
median | 148 |
Q3 | 221.5 |
95-th percentile | 280.3 |
Maximum | 295 |
Range | 294 |
Interquartile range (IQR) | 147 |
Descriptive statistics
Standard deviation | 85.30338 |
---|---|
Coefficient of variation (CV) | 0.57637419 |
Kurtosis | -1.2 |
Mean | 148 |
Median Absolute Deviation (MAD) | 74 |
Skewness | 0 |
Sum | 43660 |
Variance | 7276.6667 |
Monotonicity | Strictly increasing |
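Because NUM is unique, strictly increasing, and spans 1 to 295, it behaves like a plain row counter, and the statistics above can be reproduced from the sequence 1..295 alone. A minimal sketch using only the standard library, assuming the values really are the integers 1 through 295 with no gaps:

```python
import statistics

values = list(range(1, 296))  # assumed: NUM is exactly 1..295

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean

# 99 cut points; "inclusive" interpolates linearly between min and max.
percentiles = statistics.quantiles(values, n=100, method="inclusive")
p5, q1, q3, p95 = percentiles[4], percentiles[24], percentiles[74], percentiles[94]

median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, round(variance, 4), round(std_dev, 5))  # 148 7276.6667 85.30338
print(p5, q1, q3, p95)                              # 15.7 74.5 221.5 280.3
print(mad, strictly_increasing)                     # 74 True
```

Since the column carries no information beyond row order, its high correlation with the group and table columns most likely reflects the file being sorted by group and table.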
Value | Count | Frequency (%) |
1 | 1 | 0.3% |
204 | 1 | 0.3% |
202 | 1 | 0.3% |
201 | 1 | 0.3% |
200 | 1 | 0.3% |
199 | 1 | 0.3% |
198 | 1 | 0.3% |
197 | 1 | 0.3% |
196 | 1 | 0.3% |
195 | 1 | 0.3% |
Other values (285) | 285 | 96.6% |
Value | Count | Frequency (%) |
1 | 1 | 0.3% |
2 | 1 | 0.3% |
3 | 1 | 0.3% |
4 | 1 | 0.3% |
5 | 1 | 0.3% |
6 | 1 | 0.3% |
7 | 1 | 0.3% |
8 | 1 | 0.3% |
9 | 1 | 0.3% |
10 | 1 | 0.3% |
Value | Count | Frequency (%) |
295 | 1 | 0.3% |
294 | 1 | 0.3% |
293 | 1 | 0.3% |
292 | 1 | 0.3% |
291 | 1 | 0.3% |
290 | 1 | 0.3% |
289 | 1 | 0.3% |
288 | 1 | 0.3% |
287 | 1 | 0.3% |
286 | 1 | 0.3% |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 15 |
---|---|
Distinct (%) | 5.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
KDNY_HLTH | 69 |
---|---|
KDNY_KUOS | 44 |
KDNY_COMP | 24 |
KDNY_OPRT | 23 |
KDNY_SPR | 22 |
Other values (10) | 113 |
Length
Max length | 18 |
---|---|
Median length | 9 |
Mean length | 10.979661 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | KDNY_SUMMARY_PTIF |
---|---|
2nd row | KDNY_SUMMARY_PTIF |
3rd row | KDNY_SUMMARY_PTIF |
4th row | KDNY_SUMMARY_PTIF |
5th row | KDNY_SUMMARY_PTIF |
Common Values
Value | Count | Frequency (%) |
KDNY_HLTH | 69 | 23.4% |
KDNY_KUOS | 44 | 14.9% |
KDNY_COMP | 24 | 8.1% |
KDNY_OPRT | 23 | 7.8% |
KDNY_SPR | 22 | 7.5% |
KDNY_FLUP_DEAD | 19 | 6.4% |
KDNY_SUMMARY_PTIF | 17 | 5.8% |
KDNY_FLUP_CST_RLPS | 15 | 5.1% |
KDNY_CHMO_HRSK | 14 | 4.7% |
KDNY_CEXM_DTPA | 12 | 4.1% |
Other values (5) | 36 | 12.2% |
Words
Value | Count | Frequency (%) |
kdny_hlth | 69 | 23.4% |
kdny_kuos | 44 | 14.9% |
kdny_comp | 24 | 8.1% |
kdny_oprt | 23 | 7.8% |
kdny_spr | 22 | 7.5% |
kdny_flup_dead | 19 | 6.4% |
kdny_summary_ptif | 17 | 5.8% |
kdny_flup_cst_rlps | 15 | 5.1% |
kdny_chmo_hrsk | 14 | 4.7% |
kdny_cexm_dtpa | 12 | 4.1% |
Other values (5) | 36 | 12.2% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 15 |
---|---|
Distinct (%) | 5.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
기타건강정보 | 69 |
---|---|
비뇨기종양학회 | 44 |
합병증 | 24 |
수술정보 | 23 |
외과병리보고서 | 22 |
Other values (10) | 113 |
Length
Max length | 12 |
---|---|
Median length | 10 |
Mean length | 6.3525424 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 기본정보 |
---|---|
2nd row | 기본정보 |
3rd row | 기본정보 |
4th row | 기본정보 |
5th row | 기본정보 |
Common Values
Value | Count | Frequency (%) |
기타건강정보 | 69 | 23.4% |
비뇨기종양학회 | 44 | 14.9% |
합병증 | 24 | 8.1% |
수술정보 | 23 | 7.8% |
외과병리보고서 | 22 | 7.5% |
사망 및 치료평가 | 19 | 6.4% |
기본정보 | 17 | 5.8% |
전이 및 재발 | 15 | 5.1% |
항암화학요법 | 14 | 4.7% |
진단검사(검체) | 12 | 4.1% |
Other values (5) | 36 | 12.2% |
Words
Value | Count | Frequency (%) |
기타건강정보 | 69 | 17.7% |
비뇨기종양학회 | 44 | 11.3% |
및 | 34 | 8.7% |
진단검사(검체 | 24 | 6.2% |
합병증 | 24 | 6.2% |
수술정보 | 23 | 5.9% |
외과병리보고서 | 22 | 5.7% |
사망 | 19 | 4.9% |
치료평가 | 19 | 4.9% |
기본정보 | 17 | 4.4% |
Other values (8) | 94 | 24.2% |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 39 |
---|---|
Distinct (%) | 13.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
MR_KDNY_HLTH_4 | 34 |
---|---|
PE_KDNY_COMP | 24 |
PE_KDNY_SPR_V | 22 |
MR_KDNY_HLTH_2 | 15 |
PE_KDNY_KUOS_OPRT_2 | 15 |
Other values (34) | 185 |
Length
Max length | 19 |
---|---|
Median length | 17 |
Mean length | 14.291525 |
Min length | 11 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.7% |
Sample
1st row | PT_KDNY_TRGT |
---|---|
2nd row | PT_KDNY_TRGT |
3rd row | RG_KDNY_CNDX_V |
4th row | RG_KDNY_CNDX_V |
5th row | RG_KDNY_CNDX_V |
Common Values
Value | Count | Frequency (%) |
MR_KDNY_HLTH_4 | 34 | 11.5% |
PE_KDNY_COMP | 24 | 8.1% |
PE_KDNY_SPR_V | 22 | 7.5% |
MR_KDNY_HLTH_2 | 15 | 5.1% |
PE_KDNY_KUOS_OPRT_2 | 15 | 5.1% |
MR_KDNY_HLTH_3 | 14 | 4.7% |
PE_KDNY_KUOS_OPRT_1 | 13 | 4.4% |
PE_KDNY_CHMO | 12 | 4.1% |
PE_KDNY_RTX | 10 | 3.4% |
PE_KDNY_FLUP | 10 | 3.4% |
Other values (29) | 126 | 42.7% |
Words
Value | Count | Frequency (%) |
mr_kdny_hlth_4 | 34 | 11.5% |
pe_kdny_comp | 24 | 8.1% |
pe_kdny_spr_v | 22 | 7.5% |
mr_kdny_hlth_2 | 15 | 5.1% |
pe_kdny_kuos_oprt_2 | 15 | 5.1% |
mr_kdny_hlth_3 | 14 | 4.7% |
pe_kdny_kuos_oprt_1 | 13 | 4.4% |
pe_kdny_chmo | 12 | 4.1% |
pe_kdny_flup | 10 | 3.4% |
pe_kdny_rtx | 10 | 3.4% |
Other values (29) | 126 | 42.7% |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 36 |
---|---|
Distinct (%) | 12.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
가족력 | 34 |
---|---|
외과병리보고서 | 27 |
합병증 | 24 |
Initial 영상검사 | 19 |
입원정보 | 15 |
Other values (31) | 176 |
Length
Max length | 22 |
---|---|
Median length | 19 |
Mean length | 7.1254237 |
Min length | 2 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.7% |
Sample
1st row | 기본정보 |
---|---|
2nd row | 기본정보 |
3rd row | 진단정보 |
4th row | 진단정보 |
5th row | 진단정보 |
Common Values
Value | Count | Frequency (%) |
가족력 | 34 | 11.5% |
외과병리보고서 | 27 | 9.2% |
합병증 | 24 | 8.1% |
Initial 영상검사 | 19 | 6.4% |
입원정보 | 15 | 5.1% |
PADUA/RENAL Score | 15 | 5.1% |
과거력 | 14 | 4.7% |
항암화학요법 | 12 | 4.1% |
RT | 10 | 3.4% |
추적관찰 | 10 | 3.4% |
Other values (26) | 115 | 39.0% |
Words
Value | Count | Frequency (%) |
initial | 47 | 11.9% |
가족력 | 34 | 8.6% |
외과병리보고서 | 27 | 6.8% |
합병증 | 24 | 6.1% |
영상검사 | 22 | 5.6% |
입원정보 | 15 | 3.8% |
padua/renal | 15 | 3.8% |
score | 15 | 3.8% |
f/u | 15 | 3.8% |
과거력 | 14 | 3.5% |
Other values (31) | 168 | 42.4% |
colId
Text
Distinct | 270 |
---|---|
Distinct (%) | 91.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
Value | Count | Frequency (%) |
miex_ymd | 3 | 1.0% |
diur_t_half_l_cmnt | 2 | 0.7% |
diur_t_half_r_cmnt | 2 | 0.7% |
miex_nm | 2 | 0.7% |
ancd_nm | 2 | 0.7% |
t_max_l_cmnt | 2 | 0.7% |
relt_func_r_cmnt | 2 | 0.7% |
cexm_nm | 2 | 0.7% |
cexm_ymd | 2 | 0.7% |
t_half_r_cmnt | 2 | 0.7% |
Other values (260) | 274 | 92.9% |
Most occurring characters
Value | Count | Frequency (%) |
_ | 624 | 16.5% |
T | 358 | 9.5% |
M | 348 | 9.2% |
N | 340 | 9.0% |
C | 296 | 7.8% |
S | 208 | 5.5% |
R | 199 | 5.3% |
D | 167 | 4.4% |
P | 118 | 3.1% |
L | 118 | 3.1% |
Other values (21) | 996 | 26.4% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 3139 | 83.2% |
Connector Punctuation | 624 | 16.5% |
Decimal Number | 9 | 0.2% |
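The category names in this table correspond to Unicode general categories (Uppercase Letter is `Lu`, Connector Punctuation is `Pc`, Decimal Number is `Nd`). A minimal sketch of how such counts can be derived with the standard library, applied to two column identifiers from the sample rows rather than the full column:

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category (e.g. 'Lu', 'Pc', 'Nd')."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# Two column identifiers taken from the sample rows of this report.
counts = category_counts(["FRMD_YMD", "OPRT_AGE"])
print(counts["Lu"], counts["Pc"], counts["Nd"])  # 14 2 0
```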
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 358 | 11.4% |
M | 348 | 11.1% |
N | 340 | 10.8% |
C | 296 | 9.4% |
S | 208 | 6.6% |
R | 199 | 6.3% |
D | 167 | 5.3% |
P | 118 | 3.8% |
L | 118 | 3.8% |
H | 116 | 3.7% |
Other values (16) | 871 | 27.7% |
Decimal Number
Value | Count | Frequency (%) |
1 | 5 | 55.6% |
2 | 2 | 22.2% |
3 | 1 | 11.1% |
4 | 1 | 11.1% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 624 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 3139 | 83.2% |
Common | 633 | 16.8% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 358 | 11.4% |
M | 348 | 11.1% |
N | 340 | 10.8% |
C | 296 | 9.4% |
S | 208 | 6.6% |
R | 199 | 6.3% |
D | 167 | 5.3% |
P | 118 | 3.8% |
L | 118 | 3.8% |
H | 116 | 3.7% |
Other values (16) | 871 | 27.7% |
Common
Value | Count | Frequency (%) |
_ | 624 | 98.6% |
1 | 5 | 0.8% |
2 | 2 | 0.3% |
3 | 1 | 0.2% |
4 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3772 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 624 | 16.5% |
T | 358 | 9.5% |
M | 348 | 9.2% |
N | 340 | 9.0% |
C | 296 | 7.8% |
S | 208 | 5.5% |
R | 199 | 5.3% |
D | 167 | 4.4% |
P | 118 | 3.1% |
L | 118 | 3.1% |
Other values (21) | 996 | 26.4% |
colNm
Text
Distinct | 238 |
---|---|
Distinct (%) | 80.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
Value | Count | Frequency (%) |
기타 | 14 | 2.9% |
t | 10 | 2.1% |
invasion | 9 | 1.9% |
상세내용 | 9 | 1.9% |
of | 8 | 1.7% |
오른쪽 | 8 | 1.7% |
왼쪽 | 8 | 1.7% |
당뇨 | 6 | 1.3% |
검사명 | 5 | 1.0% |
간질환 | 5 | 1.0% |
Other values (289) | 395 | 82.8% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 182 | 7.5% |
i | 94 | 3.9% |
a | 91 | 3.7% |
o | 88 | 3.6% |
t | 81 | 3.3% |
e | 79 | 3.2% |
n | 76 | 3.1% |
s | 66 | 2.7% |
r | 58 | 2.4% |
기 | 53 | 2.2% |
Other values (205) | 1570 | 64.4% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1041 | 42.7% |
Lowercase Letter | 954 | 39.1% |
Space Separator | 182 | 7.5% |
Uppercase Letter | 158 | 6.5% |
Open Punctuation | 39 | 1.6% |
Close Punctuation | 39 | 1.6% |
Other Punctuation | 12 | 0.5% |
Decimal Number | 10 | 0.4% |
Dash Punctuation | 2 | 0.1% |
Math Symbol | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
기 | 53 | 5.1% |
타 | 33 | 3.2% |
일 | 33 | 3.2% |
사 | 28 | 2.7% |
병 | 22 | 2.1% |
상 | 20 | 1.9% |
계 | 20 | 1.9% |
용 | 20 | 1.9% |
검 | 19 | 1.8% |
내 | 19 | 1.8% |
Other values (150) | 774 | 74.4% |
Lowercase Letter
Value | Count | Frequency (%) |
i | 94 | 9.9% |
a | 91 | 9.5% |
o | 88 | 9.2% |
t | 81 | 8.5% |
e | 79 | 8.3% |
n | 76 | 8.0% |
s | 66 | 6.9% |
r | 58 | 6.1% |
c | 52 | 5.5% |
m | 48 | 5.0% |
Other values (15) | 221 | 23.2% |
Uppercase Letter
Value | Count | Frequency (%) |
T | 17 | 10.8% |
S | 16 | 10.1% |
C | 13 | 8.2% |
A | 13 | 8.2% |
R | 10 | 6.3% |
L | 10 | 6.3% |
N | 10 | 6.3% |
D | 9 | 5.7% |
P | 9 | 5.7% |
M | 9 | 5.7% |
Other values (10) | 42 | 26.6% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 10 | 83.3% |
' | 1 | 8.3% |
. | 1 | 8.3% |
Decimal Number
Value | Count | Frequency (%) |
2 | 5 | 50.0% |
1 | 5 | 50.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 182 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 39 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 39 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
< | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1112 | 45.6% |
Hangul | 1041 | 42.7% |
Common | 285 | 11.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
기 | 53 | 5.1% |
타 | 33 | 3.2% |
일 | 33 | 3.2% |
사 | 28 | 2.7% |
병 | 22 | 2.1% |
상 | 20 | 1.9% |
계 | 20 | 1.9% |
용 | 20 | 1.9% |
검 | 19 | 1.8% |
내 | 19 | 1.8% |
Other values (150) | 774 | 74.4% |
Latin
Value | Count | Frequency (%) |
i | 94 | 8.5% |
a | 91 | 8.2% |
o | 88 | 7.9% |
t | 81 | 7.3% |
e | 79 | 7.1% |
n | 76 | 6.8% |
s | 66 | 5.9% |
r | 58 | 5.2% |
c | 52 | 4.7% |
m | 48 | 4.3% |
Other values (35) | 379 | 34.1% |
Common
Value | Count | Frequency (%) |
(space) | 182 | 63.9% |
( | 39 | 13.7% |
) | 39 | 13.7% |
/ | 10 | 3.5% |
2 | 5 | 1.8% |
1 | 5 | 1.8% |
- | 2 | 0.7% |
' | 1 | 0.4% |
. | 1 | 0.4% |
< | 1 | 0.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1397 | 57.3% |
Hangul | 1041 | 42.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 182 | 13.0% |
i | 94 | 6.7% |
a | 91 | 6.5% |
o | 88 | 6.3% |
t | 81 | 5.8% |
e | 79 | 5.7% |
n | 76 | 5.4% |
s | 66 | 4.7% |
r | 58 | 4.2% |
c | 52 | 3.7% |
Other values (45) | 530 | 37.9% |
Hangul
Value | Count | Frequency (%) |
기 | 53 | 5.1% |
타 | 33 | 3.2% |
일 | 33 | 3.2% |
사 | 28 | 2.7% |
병 | 22 | 2.1% |
상 | 20 | 1.9% |
계 | 20 | 1.9% |
용 | 20 | 1.9% |
검 | 19 | 1.8% |
내 | 19 | 1.8% |
Other values (150) | 774 | 74.4% |
dataType
Categorical
IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 1.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.4 KiB |
String | 240 |
---|---|
Date | 35 |
Float | 12 |
FLOAT | 4 |
INTEGER | 4 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 5.7220339 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Date |
---|---|
2nd row | FLOAT |
3rd row | Date |
4th row | String |
5th row | String |
Common Values
Value | Count | Frequency (%) |
String | 240 | 81.4% |
Date | 35 | 11.9% |
Float | 12 | 4.1% |
FLOAT | 4 | 1.4% |
INTEGER | 4 | 1.4% |
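The 58.5% imbalance figure in the alerts can be reproduced from these counts as one minus the normalised Shannon entropy of the class distribution. The sketch below is an assumption about how the score is computed, chosen because it matches the reported value; treating a single-class column as 0 is also an assumption.

```python
import math

def imbalance_score(counts):
    """1 - normalised Shannon entropy: 0 for a uniform split, near 1 for one dominant class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single-class column is reported as not imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts from the Common Values table above.
data_type_counts = {"String": 240, "Date": 35, "Float": 12, "FLOAT": 4, "INTEGER": 4}
score = imbalance_score(list(data_type_counts.values()))
print(f"{score:.1%}")  # 58.5%
```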
Words
Value | Count | Frequency (%) |
string | 240 | 81.4% |
date | 35 | 11.9% |
float | 16 | 5.4% |
integer | 4 | 1.4% |
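The lowercased table above merges `Float` and `FLOAT` into a single entry of 16, which suggests the five distinct dataType values are really four types with inconsistent casing. A minimal sketch of normalising the labels before analysis, using the counts from this report; whether lowercase is the right canonical form is a choice made here for illustration:

```python
from collections import Counter

# Counts from the Common Values table of dataType.
data_type_counts = {"String": 240, "Date": 35, "Float": 12, "FLOAT": 4, "INTEGER": 4}

normalised = Counter()
for label, count in data_type_counts.items():
    normalised[label.strip().lower()] += count

print(dict(normalised))  # {'string': 240, 'date': 35, 'float': 16, 'integer': 4}
```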
colDesc
Text
Distinct | 287 |
---|---|
Distinct (%) | 98.0% |
Missing | 2 |
Missing (%) | 0.7% |
Memory size | 2.4 KiB |
Length
Max length | 83 |
---|---|
Median length | 38 |
Mean length | 17.587031 |
Min length | 5 |
Characters and Unicode
Total characters | 5153 |
---|---|
Distinct characters | 334 |
Distinct categories | 11 |
Distinct scripts | 3 |
Distinct blocks | 5 |
Unique
Unique | 285 |
---|---|
Unique (%) | 97.3% |
Sample
1st row | KCD 분류가 C64 C65인 최초 진단 등록일 |
---|---|
2nd row | 신장암 수술 당시 나이 |
3rd row | 환자가 진단받은 암 진단일 |
4th row | KCD 분류 모든 등록 진단 코드 (하위코드 포함) |
5th row | 환자가 진단받은 암 진단명 |
Value | Count | Frequency (%) |
여부 | 66 | 5.0% |
환자의 | 52 | 4.0% |
기타 | 30 | 2.3% |
종양의 | 29 | 2.2% |
시행한 | 28 | 2.1% |
ncc에서 | 25 | 1.9% |
합병증 | 21 | 1.6% |
dtpa | 18 | 1.4% |
(empty) | 18 | 1.4% |
f/u | 16 | 1.2% |
Other values (472) | 1013 | 77.0% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1025 | 19.9% |
의 | 105 | 2.0% |
자 | 98 | 1.9% |
시 | 90 | 1.7% |
부 | 86 | 1.7% |
기 | 78 | 1.5% |
환 | 77 | 1.5% |
사 | 75 | 1.5% |
C | 74 | 1.4% |
수 | 71 | 1.4% |
Other values (324) | 3374 | 65.5% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3321 | 64.4% |
Space Separator | 1025 | 19.9% |
Uppercase Letter | 377 | 7.3% |
Lowercase Letter | 242 | 4.7% |
Other Punctuation | 64 | 1.2% |
Open Punctuation | 38 | 0.7% |
Close Punctuation | 38 | 0.7% |
Decimal Number | 33 | 0.6% |
Math Symbol | 8 | 0.2% |
Dash Punctuation | 6 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
의 | 105 | 3.2% |
자 | 98 | 3.0% |
시 | 90 | 2.7% |
부 | 86 | 2.6% |
기 | 78 | 2.3% |
환 | 77 | 2.3% |
사 | 75 | 2.3% |
수 | 71 | 2.1% |
여 | 69 | 2.1% |
종 | 63 | 1.9% |
Other values (260) | 2509 | 75.5% |
Lowercase Letter
Value | Count | Frequency (%) |
i | 26 | 10.7% |
a | 26 | 10.7% |
o | 24 | 9.9% |
e | 20 | 8.3% |
t | 19 | 7.9% |
s | 16 | 6.6% |
l | 15 | 6.2% |
r | 14 | 5.8% |
n | 12 | 5.0% |
c | 10 | 4.1% |
Other values (12) | 60 | 24.8% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 74 | 19.6% |
N | 43 | 11.4% |
T | 40 | 10.6% |
M | 28 | 7.4% |
F | 25 | 6.6% |
D | 24 | 6.4% |
A | 22 | 5.8% |
P | 22 | 5.8% |
U | 17 | 4.5% |
E | 17 | 4.5% |
Other values (10) | 65 | 17.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 14 | 42.4% |
2 | 7 | 21.2% |
3 | 3 | 9.1% |
6 | 3 | 9.1% |
0 | 3 | 9.1% |
4 | 2 | 6.1% |
5 | 1 | 3.0% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 35 | 54.7% |
: | 18 | 28.1% |
, | 5 | 7.8% |
. | 4 | 6.2% |
* | 2 | 3.1% |
Math Symbol
Value | Count | Frequency (%) |
= | 2 | 25.0% |
≥ | 2 | 25.0% |
~ | 2 | 25.0% |
< | 1 | 12.5% |
→ | 1 | 12.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1025 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 38 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 38 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 | 100.0% |
Other Number
Value | Count | Frequency (%) |
² | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 3321 | 64.4% |
Common | 1213 | 23.5% |
Latin | 619 | 12.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
의 | 105 | 3.2% |
자 | 98 | 3.0% |
시 | 90 | 2.7% |
부 | 86 | 2.6% |
기 | 78 | 2.3% |
환 | 77 | 2.3% |
사 | 75 | 2.3% |
수 | 71 | 2.1% |
여 | 69 | 2.1% |
종 | 63 | 1.9% |
Other values (260) | 2509 | 75.5% |
Latin
Value | Count | Frequency (%) |
C | 74 | 12.0% |
N | 43 | 6.9% |
T | 40 | 6.5% |
M | 28 | 4.5% |
i | 26 | 4.2% |
a | 26 | 4.2% |
F | 25 | 4.0% |
D | 24 | 3.9% |
o | 24 | 3.9% |
A | 22 | 3.6% |
Other values (32) | 287 | 46.4% |
Common
Value | Count | Frequency (%) |
(space) | 1025 | 84.5% |
( | 38 | 3.1% |
) | 38 | 3.1% |
/ | 35 | 2.9% |
: | 18 | 1.5% |
1 | 14 | 1.2% |
2 | 7 | 0.6% |
- | 6 | 0.5% |
, | 5 | 0.4% |
. | 4 | 0.3% |
Other values (12) | 23 | 1.9% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 3321 | 64.4% |
ASCII | 1828 | 35.5% |
Math Operators | 2 | < 0.1% |
None | 1 | < 0.1% |
Arrows | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1025 | 56.1% |
C | 74 | 4.0% |
N | 43 | 2.4% |
T | 40 | 2.2% |
( | 38 | 2.1% |
) | 38 | 2.1% |
/ | 35 | 1.9% |
M | 28 | 1.5% |
i | 26 | 1.4% |
a | 26 | 1.4% |
Other values (51) | 455 | 24.9% |
Hangul
Value | Count | Frequency (%) |
의 | 105 | 3.2% |
자 | 98 | 3.0% |
시 | 90 | 2.7% |
부 | 86 | 2.6% |
기 | 78 | 2.3% |
환 | 77 | 2.3% |
사 | 75 | 2.3% |
수 | 71 | 2.1% |
여 | 69 | 2.1% |
종 | 63 | 1.9% |
Other values (260) | 2509 | 75.5% |
Math Operators
Value | Count | Frequency (%) |
≥ | 2 | 100.0% |
None
Value | Count | Frequency (%) |
² | 1 | 100.0% |
Arrows
Value | Count | Frequency (%) |
→ | 1 | 100.0% |
colCnt
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 295 |
---|---|
Missing (%) | 100.0% |
Memory size | 2.7 KiB |
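Since colCnt is missing in every row, it carries no information and is a natural candidate for removal before further analysis. A minimal standard-library sketch, assuming the rows are available as dictionaries and that `None` or an empty string marks a missing value; the two rows are abbreviated from the sample rows of this report:

```python
def drop_empty_columns(rows, missing=(None, "")):
    """Return (cleaned_rows, dropped), where dropped lists columns with no usable value."""
    columns = list(rows[0].keys()) if rows else []
    dropped = [c for c in columns if all(r.get(c) in missing for r in rows)]
    cleaned = [{k: v for k, v in r.items() if k not in dropped} for r in rows]
    return cleaned, dropped

# Abbreviated rows based on the sample at the end of this report.
rows = [
    {"NUM": 1, "colId": "FRMD_YMD", "colCnt": None},
    {"NUM": 2, "colId": "OPRT_AGE", "colCnt": None},
]
cleaned, dropped = drop_empty_columns(rows)
print(dropped)     # ['colCnt']
print(cleaned[0])  # {'NUM': 1, 'colId': 'FRMD_YMD'}
```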
dispFormat
Text
MISSING
 
Distinct | 74 |
---|---|
Distinct (%) | 28.6% |
Missing | 36 |
Missing (%) | 12.2% |
Memory size | 2.4 KiB |
Length
Max length | 587 |
---|---|
Median length | 118 |
Mean length | 20.494208 |
Min length | 2 |
Characters and Unicode
Total characters | 5308 |
---|---|
Distinct characters | 157 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 57 |
---|---|
Unique (%) | 22.0% |
Sample
1st row | YYYY-MM-DD |
---|---|
2nd row | 숫자 |
3rd row | YYYY-MM-DD |
4th row | ex) C64 |
5th row | ex) Mlignant neoplasms of kidney, except renal pelvis |
Value | Count | Frequency (%) |
(empty) | 194 | 14.3% |
n | 61 | 4.5% |
y | 61 | 4.5% |
1 | 55 | 4.1% |
2 | 55 | 4.1% |
텍스트 | 54 | 4.0% |
grade | 54 | 4.0% |
유 | 49 | 3.6% |
무 | 49 | 3.6% |
3 | 45 | 3.3% |
Other values (258) | 675 | 49.9% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1118 | 21.1% |
\| | 298 | 5.6% |
, | 250 | 4.7% |
e | 225 | 4.2% |
Y | 201 | 3.8% |
a | 167 | 3.1% |
r | 146 | 2.8% |
i | 145 | 2.7% |
n | 144 | 2.7% |
o | 109 | 2.1% |
Other values (147) | 2505 | 47.2% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1633 | 30.8% |
Space Separator | 1118 | 21.1% |
Uppercase Letter | 761 | 14.3% |
Other Letter | 593 | 11.2% |
Decimal Number | 433 | 8.2% |
Math Symbol | 303 | 5.7% |
Other Punctuation | 302 | 5.7% |
Dash Punctuation | 75 | 1.4% |
Close Punctuation | 58 | 1.1% |
Open Punctuation | 32 | 0.6% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
텍 | 54 | 9.1% |
트 | 54 | 9.1% |
스 | 54 | 9.1% |
유 | 53 | 8.9% |
무 | 50 | 8.4% |
수 | 21 | 3.5% |
로 | 16 | 2.7% |
기 | 15 | 2.5% |
정 | 15 | 2.5% |
자 | 14 | 2.4% |
Other values (77) | 247 | 41.7% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 225 | 13.8% |
a | 167 | 10.2% |
r | 146 | 8.9% |
i | 145 | 8.9% |
n | 144 | 8.8% |
o | 109 | 6.7% |
t | 92 | 5.6% |
l | 88 | 5.4% |
d | 80 | 4.9% |
s | 61 | 3.7% |
Other values (14) | 376 | 23.0% |
Uppercase Letter
Value | Count | Frequency (%) |
Y | 201 | 26.4% |
M | 92 | 12.1% |
D | 84 | 11.0% |
N | 77 | 10.1% |
G | 60 | 7.9% |
C | 39 | 5.1% |
R | 26 | 3.4% |
U | 26 | 3.4% |
T | 25 | 3.3% |
B | 21 | 2.8% |
Other values (12) | 110 | 14.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 86 | 19.9% |
2 | 83 | 19.2% |
3 | 81 | 18.7% |
4 | 57 | 13.2% |
5 | 36 | 8.3% |
6 | 25 | 5.8% |
7 | 20 | 4.6% |
8 | 18 | 4.2% |
9 | 16 | 3.7% |
0 | 11 | 2.5% |
Other Punctuation
Value | Count | Frequency (%) |
, | 250 | 82.8% |
* | 38 | 12.6% |
: | 7 | 2.3% |
% | 4 | 1.3% |
/ | 2 | 0.7% |
. | 1 | 0.3% |
Math Symbol
Value | Count | Frequency (%) |
\| | 298 | 98.3% |
+ | 2 | 0.7% |
> | 2 | 0.7% |
~ | 1 | 0.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1118 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 75 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 58 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 32 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2394 | 45.1% |
Common | 2321 | 43.7% |
Hangul | 593 | 11.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
텍 | 54 | 9.1% |
트 | 54 | 9.1% |
스 | 54 | 9.1% |
유 | 53 | 8.9% |
무 | 50 | 8.4% |
수 | 21 | 3.5% |
로 | 16 | 2.7% |
기 | 15 | 2.5% |
정 | 15 | 2.5% |
자 | 14 | 2.4% |
Other values (77) | 247 | 41.7% |
Latin
Value | Count | Frequency (%) |
e | 225 | 9.4% |
Y | 201 | 8.4% |
a | 167 | 7.0% |
r | 146 | 6.1% |
i | 145 | 6.1% |
n | 144 | 6.0% |
o | 109 | 4.6% |
t | 92 | 3.8% |
M | 92 | 3.8% |
l | 88 | 3.7% |
Other values (36) | 985 | 41.1% |
Common
Value | Count | Frequency (%) |
(space) | 1118 | 48.2% |
\| | 298 | 12.8% |
, | 250 | 10.8% |
1 | 86 | 3.7% |
2 | 83 | 3.6% |
3 | 81 | 3.5% |
- | 75 | 3.2% |
) | 58 | 2.5% |
4 | 57 | 2.5% |
* | 38 | 1.6% |
Other values (14) | 177 | 7.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4715 | 88.8% |
Hangul | 593 | 11.2% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1118 | 23.7% |
\| | 298 | 6.3% |
, | 250 | 5.3% |
e | 225 | 4.8% |
Y | 201 | 4.3% |
a | 167 | 3.5% |
r | 146 | 3.1% |
i | 145 | 3.1% |
n | 144 | 3.1% |
o | 109 | 2.3% |
Other values (60) | 1912 | 40.6% |
Hangul
Value | Count | Frequency (%) |
텍 | 54 | 9.1% |
트 | 54 | 9.1% |
스 | 54 | 9.1% |
유 | 53 | 8.9% |
무 | 50 | 8.4% |
수 | 21 | 3.5% |
로 | 16 | 2.7% |
기 | 15 | 2.5% |
정 | 15 | 2.5% |
자 | 14 | 2.4% |
Other values (77) | 247 | 41.7% |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.955 | 0.955 | 0.983 | 0.975 | 0.504 | 0.844 |
gpId | 0.955 | 1.000 | 1.000 | 1.000 | 0.999 | 0.698 | 0.910 |
gpNm | 0.955 | 1.000 | 1.000 | 1.000 | 0.999 | 0.698 | 0.910 |
tblId | 0.983 | 1.000 | 1.000 | 1.000 | 1.000 | 0.723 | 0.955 |
tblNm | 0.975 | 0.999 | 0.999 | 1.000 | 1.000 | 0.697 | 0.949 |
dataType | 0.504 | 0.698 | 0.698 | 0.723 | 0.697 | 1.000 | 0.913 |
dispFormat | 0.844 | 0.910 | 0.910 | 0.955 | 0.949 | 0.913 | 1.000 |
Variable | tblNm | dataType | tblId | gpNm | gpId |
---|---|---|---|---|---|
tblNm | 1.000 | 0.389 | 0.994 | 0.947 | 0.947 |
dataType | 0.389 | 1.000 | 0.408 | 0.368 | 0.368 |
tblId | 0.994 | 0.408 | 1.000 | 0.956 | 0.956 |
gpNm | 0.947 | 0.368 | 0.956 | 1.000 | 1.000 |
gpId | 0.947 | 0.368 | 0.956 | 1.000 | 1.000 |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|
NUM | 1.000 | 0.754 | 0.754 | 0.821 | 0.798 | 0.231 |
gpId | 0.754 | 1.000 | 1.000 | 0.956 | 0.947 | 0.368 |
gpNm | 0.754 | 1.000 | 1.000 | 0.956 | 0.947 | 0.368 |
tblId | 0.821 | 0.956 | 0.956 | 1.000 | 0.994 | 0.408 |
tblNm | 0.798 | 0.947 | 0.947 | 0.994 | 1.000 | 0.389 |
dataType | 0.231 | 0.368 | 0.368 | 0.408 | 0.389 | 1.000 |
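The report does not say which coefficient lies behind each of the matrices above. For pairs of categorical variables, Cramér's V is one standard association measure, and a value of 1.000 (as shown for gpId and gpNm) is what it yields when one variable fully determines the other. The sketch below is a plain version without bias correction, so its values need not match the matrices exactly; the input lists are toy data, not the registry columns.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: a code column and a name column that map one-to-one.
codes = ["a", "a", "b", "b", "c", "c"]
names = ["x", "x", "y", "y", "z", "z"]
v_dependent = cramers_v(codes, names)

# Toy data: two columns with no association.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
```

Here `v_dependent` comes out at 1 (up to floating-point error) and `v_independent` at 0.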
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_TRGT | 기본정보 | FRMD_YMD | 초진일 | Date | KCD 분류가 C64 C65인 최초 진단 등록일 | <NA> | YYYY-MM-DD |
1 | 2 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_TRGT | 기본정보 | OPRT_AGE | 수술 당시 연령 | FLOAT | 신장암 수술 당시 나이 | <NA> | 숫자 |
2 | 3 | KDNY_SUMMARY_PTIF | 기본정보 | RG_KDNY_CNDX_V | 진단정보 | DIAG_YMD | 진단일 | Date | 환자가 진단받은 암 진단일 | <NA> | YYYY-MM-DD |
3 | 4 | KDNY_SUMMARY_PTIF | 기본정보 | RG_KDNY_CNDX_V | 진단정보 | DIAG_CD | 진단코드 | String | KCD 분류 모든 등록 진단 코드 (하위코드 포함) | <NA> | ex) C64 |
4 | 5 | KDNY_SUMMARY_PTIF | 기본정보 | RG_KDNY_CNDX_V | 진단정보 | DIAG_ENM | 진단명 | String | 환자가 진단받은 암 진단명 | <NA> | ex) Mlignant neoplasms of kidney, except renal pelvis |
5 | 6 | KDNY_SUMMARY_PTIF | 기본정보 | RG_KDNY_CNDX_V | 진단정보 | ETC_CNCR_YN | 기타암여부 | String | 신장암 혹은 신장암을 제외한 기타부위의 암종 여부 | <NA> | Y, 유 \| N, 무 |
6 | 7 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_BDMS | 신체계측 | WT_MSRM_YMD | 체중(kg) | Date | 환자의 몸무게 | <NA> | YYYY-MM-DD |
7 | 8 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_BDMS | 신체계측 | WT_VL | 체중측정일 | FLOAT | 환자의 몸무게 측정일 | <NA> | 숫자 |
8 | 9 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_BDMS | 신체계측 | HT_MSRM_YMD | 신장(cm) | Date | 환자의 신장 | <NA> | YYYY-MM-DD |
9 | 10 | KDNY_SUMMARY_PTIF | 기본정보 | PT_KDNY_BDMS | 신체계측 | HT_VL | 신장측정일 | FLOAT | 환자의 신장 측정일 | <NA> | 숫자 |
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
285 | 286 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_SPR | 외과병리보고서 | SP_ACPT_YMD | 접수일자 | Date | 수술후 외과병리 검체접수일자 | <NA> | YYYY-MM-DD |
286 | 287 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_SPR | 외과병리보고서 | VSCL_INVS_CMNT | Vascular invasion | String | 종양의 혈관 침범 여부 | <NA> | ex) vascular invasion: not identified |
287 | 288 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_SPR | 외과병리보고서 | SRG_MRGN_CMNT | Surgical margin | String | 수술적으로 제거된 조직의 가장자리 전이 여부 | <NA> | ex) Surgical margins: not identified |
288 | 289 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_SPR | 외과병리보고서 | CPSL_INVS_CMNT | Capsular invasion | String | 종양의 피막 침범 여부 | <NA> | ex) capsular invasion: not identified |
289 | 290 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_SPR | 외과병리보고서 | SCMT_DIFF_CMNT | Sarcomatoid differentiation | String | 종양의 육종 분화 정도 | <NA> | <NA> |
290 | 291 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_HLTH | 기타건강정보 | ACMP_DISS_CMNT | 동반질환 | String | 입원 당시 환자가 가지고 있었던 기존 질환 | <NA> | 1, DM \| 2, HTN \| 3, ETC |
291 | 292 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_HLTH | 기타건강정보 | CHRN_RNLF_YN | 만성신부전 | String | 만성신부전 여부 | <NA> | Y, 유 \| N, 무 |
292 | 293 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_HLTH | 기타건강정보 | DALY_YN | 투석 | String | 투석 여부 | <NA> | Y, 유 \| N, 무 |
293 | 294 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_HLTH | 기타건강정보 | CHRL_CMBD_INDX_CMNT | Charlson comorbidity index | String | 환자가 앓고 있는 다른 상병들이 환자의 사망에 미치는 영향을 알기 위한 지표 | <NA> | Myocardial infarct (1점) |
294 | 295 | KDNY_KUOS | 비뇨기종양학회 | PE_KDNY_KUOS_HLTH | 기타건강정보 | CHRL_CMBD_INDX_TPNT | Charison comorbidity index Total (점) | Float | Charison comorbidity index을 모두 합산한 점수 | <NA> | 정수로 표현 |
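Several dispFormat values in the sample rows encode a code list as `code, label` pairs separated by `|`, for example `Y, 유 | N, 무` or `1, DM | 2, HTN | 3, ETC`. A minimal sketch of turning such a value into a lookup table, assuming that layout holds wherever a `|` appears; the function name is illustrative, and other formats (dates, free-text examples) are returned as `None`:

```python
def parse_code_list(disp_format):
    """Split 'code, label | code, label' into a {code: label} dict; None if not a code list."""
    if not disp_format or "|" not in disp_format:
        return None
    mapping = {}
    for part in disp_format.split("|"):
        code, sep, label = part.partition(",")
        if not sep:
            return None  # a segment without the 'code, label' shape
        mapping[code.strip()] = label.strip()
    return mapping

print(parse_code_list("Y, 유 | N, 무"))            # {'Y': '유', 'N': '무'}
print(parse_code_list("1, DM | 2, HTN | 3, ETC"))  # {'1': 'DM', '2': 'HTN', '3': 'ETC'}
print(parse_code_list("YYYY-MM-DD"))               # None
```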