Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 489 |
Missing cells | 572 |
Missing cells (%) | 10.6% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 43.1 KiB |
Average record size in memory | 90.3 B |
Variable types
Numeric | 1 |
---|---|
Categorical | 3 |
Text | 6 |
Unsupported | 1 |
Dataset
Description | Provides metadata for the ovarian cancer registry (data items to be provided, types, sizes, per-item counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048685/fileData.do |
gpId is highly correlated overall with NUM and 1 other field | High correlation |
gpNm is highly correlated overall with NUM and 1 other field | High correlation |
NUM is highly correlated overall with gpId and 1 other field | High correlation |
dataType is highly imbalanced (75.7%) | Imbalance |
colDesc has 8 (1.6%) missing values | Missing |
colCnt has 489 (100.0%) missing values | Missing |
dispFormat has 75 (15.3%) missing values | Missing |
NUM has unique values | Unique |
colCnt is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
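The missing-cell total in the dataset statistics can be cross-checked against the per-column alerts. The sketch below is only a sanity check on the reported figures. It assumes the three columns named in the alerts are the only ones with missing values (every other variable section reports 0), and it uses the rounded 43.1 KiB total for the record-size estimate.

```python
# Per-column missing counts taken from the alerts; all other columns report 0.
missing_by_column = {"colDesc": 8, "colCnt": 489, "dispFormat": 75}

n_variables = 11
n_observations = 489

missing_cells = sum(missing_by_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, derived from the (already rounded) total of 43.1 KiB.
total_bytes = 43.1 * 1024
avg_record_bytes = total_bytes / n_observations

print(missing_cells)                 # 572
print(f"{missing_pct:.1f}%")         # 10.6%
print(f"{avg_record_bytes:.1f} B")   # 90.3 B
```

Note that colCnt alone accounts for 489 of the 572 missing cells; without it the missing rate would be 83 of 4890 cells.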
Reproduction
Analysis started | 2024-04-17 18:53:38.623977 |
---|---|
Analysis finished | 2024-04-17 18:53:40.501479 |
Duration | 1.88 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 489 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 245 |
Minimum | 1 |
---|---|
Maximum | 489 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 25.4 |
Q1 | 123 |
median | 245 |
Q3 | 367 |
95-th percentile | 464.6 |
Maximum | 489 |
Range | 488 |
Interquartile range (IQR) | 244 |
Descriptive statistics
Standard deviation | 141.3064 |
---|---|
Coefficient of variation (CV) | 0.57676084 |
Kurtosis | -1.2 |
Mean | 245 |
Median Absolute Deviation (MAD) | 122 |
Skewness | 0 |
Sum | 119805 |
Variance | 19967.5 |
Monotonicity | Strictly increasing |
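Because NUM is unique, strictly increasing, and runs from 1 to 489, it behaves like a row counter, and the statistics above can be reproduced from the sequence 1..489 alone. The following is a minimal sketch using only the standard library. The `percentile` helper is a hypothetical name and assumes linear interpolation between neighbouring values, which matches the reported quantiles.

```python
import statistics

values = list(range(1, 490))  # NUM: 1, 2, ..., 489


def percentile(sorted_data, q):
    """Linearly interpolated percentile for q in [0, 1]."""
    pos = q * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])


total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)

print(total, mean, variance)          # 119805 245 19967.5
print(round(std, 4), round(cv, 8))    # 141.3064 0.57676084
print(mad, round(p05, 1), q1, q3, round(p95, 1))
```

A column with these properties carries no information beyond row order, which is also why it correlates strongly with the grouping columns: the rows are listed group by group.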
Value | Count | Frequency (%) |
1 | 1 | 0.2% |
337 | 1 | 0.2% |
335 | 1 | 0.2% |
334 | 1 | 0.2% |
333 | 1 | 0.2% |
332 | 1 | 0.2% |
331 | 1 | 0.2% |
330 | 1 | 0.2% |
329 | 1 | 0.2% |
328 | 1 | 0.2% |
Other values (479) | 479 | 98.0% |
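Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | 0.2% |
2 | 1 | 0.2% |
3 | 1 | 0.2% |
4 | 1 | 0.2% |
5 | 1 | 0.2% |
6 | 1 | 0.2% |
7 | 1 | 0.2% |
8 | 1 | 0.2% |
9 | 1 | 0.2% |
10 | 1 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
489 | 1 | 0.2% |
488 | 1 | 0.2% |
487 | 1 | 0.2% |
486 | 1 | 0.2% |
485 | 1 | 0.2% |
484 | 1 | 0.2% |
483 | 1 | 0.2% |
482 | 1 | 0.2% |
481 | 1 | 0.2% |
480 | 1 | 0.2% |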
gpId
Categorical
HIGH CORRELATION
 
Distinct | 14 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Length
Max length | 18 |
---|---|
Median length | 17 |
Mean length | 11.903885 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | OVRY_SUMMARY_PTIF |
---|---|
2nd row | OVRY_SUMMARY_PTIF |
3rd row | OVRY_SUMMARY_PTIF |
4th row | OVRY_SUMMARY_PTIF |
5th row | OVRY_SUMMARY_PTIF |
Common Values
Value | Count | Frequency (%) |
OVRY_OPRT_SOTOC | 190 | 38.9% |
OVRY_OPRT | 83 | 17.0% |
OVRY_HLTH | 79 | 16.2% |
OVRY_SPR | 33 | 6.7% |
OVRY_FLUP_CST_FLUP | 24 | 4.9% |
OVRY_BX | 17 | 3.5% |
OVRY_SUMMARY_PTIF | 16 | 3.3% |
OVRY_RTX | 10 | 2.0% |
OVRY_CHMO | 9 | 1.8% |
OVRY_BRCA | 9 | 1.8% |
Other values (4) | 19 | 3.9% |
Words
Value | Count | Frequency (%) |
ovry_oprt_sotoc | 190 | 38.9% |
ovry_oprt | 83 | 17.0% |
ovry_hlth | 79 | 16.2% |
ovry_spr | 33 | 6.7% |
ovry_flup_cst_flup | 24 | 4.9% |
ovry_bx | 17 | 3.5% |
ovry_summary_ptif | 16 | 3.3% |
ovry_rtx | 10 | 2.0% |
ovry_chmo | 9 | 1.8% |
ovry_brca | 9 | 1.8% |
Other values (4) | 19 | 3.9% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 14 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Length
Max length | 18 |
---|---|
Median length | 17 |
Mean length | 7.7484663 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
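1st row | 기본/진단정보 [basic/diagnosis information] |
---|---|
2nd row | 기본/진단정보 [basic/diagnosis information] |
3rd row | 기본/진단정보 [basic/diagnosis information] |
4th row | 기본/진단정보 [basic/diagnosis information] |
5th row | 기본/진단정보 [basic/diagnosis information] |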
Common Values
Value | Count | Frequency (%) |
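수술정보(SOTOC) [surgery information (SOTOC)] | 190 | 38.9% |
수술정보 [surgery information] | 83 | 17.0% |
기타건강정보 [other health information] | 79 | 16.2% |
외과병리결과 [surgical pathology results] | 33 | 6.7% |
추적관찰 [follow-up] | 24 | 4.9% |
Bx | 17 | 3.5% |
기본/진단정보 [basic/diagnosis information] | 16 | 3.3% |
방사선치료 [radiation therapy] | 10 | 2.0% |
항암화학요법 [chemotherapy] | 9 | 1.8% |
BRCA 검사 [BRCA test] | 9 | 1.8% |
Other values (4) | 19 | 3.9% |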
Words
Value | Count | Frequency (%) |
수술정보(sotoc | 190 | 36.5% |
수술정보 | 83 | 16.0% |
기타건강정보 | 79 | 15.2% |
외과병리결과 | 33 | 6.3% |
추적관찰 | 24 | 4.6% |
bx | 17 | 3.3% |
기본/진단정보 | 16 | 3.1% |
방사선치료 | 10 | 1.9% |
brca | 9 | 1.7% |
검사 | 9 | 1.7% |
Other values (10) | 50 | 9.6% |
tblId
Text
Distinct | 66 |
---|---|
Distinct (%) | 13.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Length
Max length | 21 |
---|---|
Median length | 20 |
Mean length | 16.257669 |
Min length | 10 |
Characters and Unicode
Total characters | 7950 |
---|---|
Distinct characters | 32 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | PT_OVRY_TRGT |
---|---|
2nd row | PT_OVRY_TRGT |
3rd row | RG_OVRY_CNDX_V |
4th row | RG_OVRY_CNDX_V |
5th row | RG_OVRY_CNDX_V |
Value | Count | Frequency (%) |
pe_ovry_oprt_4 | 50 | 10.2% |
pe_ovry_spr_1 | 28 | 5.7% |
pe_ovry_oprt_sotoc_4 | 23 | 4.7% |
pe_ovry_oprt_sotoc_8 | 22 | 4.5% |
pe_ovry_oprt_sotoc_5 | 22 | 4.5% |
pe_ovry_oprt_1 | 18 | 3.7% |
pe_ovry_oprt_sotoc_2 | 16 | 3.3% |
mr_ovry_hlth_10 | 14 | 2.9% |
pe_ovry_oprt_sotoc_3 | 14 | 2.9% |
pe_ovry_rtx_v | 10 | 2.0% |
Other values (56) | 272 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 1607 | |
O | 1145 | |
R | 905 | |
P | 728 | |
T | 600 | 7.5% |
V | 520 | 6.5% |
Y | 494 | 6.2% |
E | 410 | 5.2% |
S | 251 | 3.2% |
C | 243 | 3.1% |
Other values (22) | 1047 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 5875 | |
Connector Punctuation | 1607 | 20.2% |
Decimal Number | 468 | 5.9% |
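The category breakdown above corresponds to Unicode general categories. As a rough illustration of how such counts can be derived, the sketch below classifies each character with the standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, and the three sample identifiers are tblId values that appear elsewhere in this report, not the full column.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes seen in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["PT_OVRY_TRGT", "RG_OVRY_CNDX_V", "MR_OVRY_HLTH_10"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

For these three identifiers the result is 31 uppercase letters, 8 underscores (connector punctuation), and 2 digits, the same three categories the report lists for the whole column.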
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
O | 1145 | |
R | 905 | |
P | 728 | |
T | 600 | |
V | 520 | |
Y | 494 | |
E | 410 | 7.0% |
S | 251 | 4.3% |
C | 243 | 4.1% |
H | 167 | 2.8% |
Other values (11) | 412 | 7.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 128 | |
4 | 92 | |
2 | 64 | |
3 | 39 | 8.3% |
5 | 37 | 7.9% |
8 | 35 | 7.5% |
6 | 22 | 4.7% |
0 | 22 | 4.7% |
7 | 21 | 4.5% |
9 | 8 | 1.7% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1607 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 5875 | |
Common | 2075 | 26.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
O | 1145 | |
R | 905 | |
P | 728 | |
T | 600 | |
V | 520 | |
Y | 494 | |
E | 410 | 7.0% |
S | 251 | 4.3% |
C | 243 | 4.1% |
H | 167 | 2.8% |
Other values (11) | 412 | 7.0% |
Common
Value | Count | Frequency (%) |
_ | 1607 | |
1 | 128 | 6.2% |
4 | 92 | 4.4% |
2 | 64 | 3.1% |
3 | 39 | 1.9% |
5 | 37 | 1.8% |
8 | 35 | 1.7% |
6 | 22 | 1.1% |
0 | 22 | 1.1% |
7 | 21 | 1.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 7950 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 1607 | |
O | 1145 | |
R | 905 | |
P | 728 | |
T | 600 | 7.5% |
V | 520 | 6.5% |
Y | 494 | 6.2% |
E | 410 | 5.2% |
S | 251 | 3.2% |
C | 243 | 3.1% |
Other values (22) | 1047 |
tblNm
Text
Distinct | 65 |
---|---|
Distinct (%) | 13.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Value | Count | Frequency (%) |
ln | 64 | 7.5% |
(blank) | 58 | 6.8% |
name | 50 | 5.8% |
op | 50 | 5.8% |
postop | 33 | 3.9% |
bx | 28 | 3.3% |
ruq | 23 | 2.7% |
pelvis | 22 | 2.6% |
colon | 22 | 2.6% |
수술정보 | 18 | 2.1% |
Other values (81) | 487 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 370 | 8.2% |
a | 308 | 6.8% |
e | 221 | 4.9% |
t | 201 | 4.5% |
o | 185 | 4.1% |
i | 182 | 4.0% |
l | 174 | 3.9% |
r | 173 | 3.8% |
L | 160 | 3.5% |
P | 154 | 3.4% |
Other values (113) | 2387 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2282 | |
Uppercase Letter | 815 | 18.1% |
Other Letter | 731 | 16.2% |
Space Separator | 370 | 8.2% |
Dash Punctuation | 94 | 2.1% |
Other Punctuation | 61 | 1.4% |
Close Punctuation | 60 | 1.3% |
Open Punctuation | 60 | 1.3% |
Decimal Number | 42 | 0.9% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
력 | 72 | 9.8% |
가 | 41 | 5.6% |
족 | 38 | 5.2% |
정 | 26 | 3.6% |
보 | 26 | 3.6% |
수 | 25 | 3.4% |
자 | 24 | 3.3% |
술 | 21 | 2.9% |
부 | 21 | 2.9% |
사 | 21 | 2.9% |
Other values (61) | 416 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 308 | |
e | 221 | |
t | 201 | |
o | 185 | 8.1% |
i | 182 | 8.0% |
l | 174 | 7.6% |
r | 173 | 7.6% |
n | 152 | 6.7% |
s | 121 | 5.3% |
m | 99 | 4.3% |
Other values (13) | 466 |
Uppercase Letter
Value | Count | Frequency (%) |
L | 160 | |
P | 154 | |
N | 110 | |
O | 75 | |
C | 50 | 6.1% |
U | 39 | 4.8% |
Q | 39 | 4.8% |
B | 37 | 4.5% |
R | 36 | 4.4% |
S | 32 | 3.9% |
Other values (6) | 83 |
Decimal Number
Value | Count | Frequency (%) |
1 | 7 | |
5 | 6 | |
3 | 6 | |
6 | 6 | |
4 | 6 | |
8 | 4 | |
7 | 4 | |
2 | 3 |
Space Separator
Value | Count | Frequency (%) |
(space) | 370 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 94 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 61 |
Close Punctuation
Value | Count | Frequency (%) |
) | 60 |
Open Punctuation
Value | Count | Frequency (%) |
( | 60 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 3097 | |
Hangul | 731 | 16.2% |
Common | 687 | 15.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
력 | 72 | 9.8% |
가 | 41 | 5.6% |
족 | 38 | 5.2% |
정 | 26 | 3.6% |
보 | 26 | 3.6% |
수 | 25 | 3.4% |
자 | 24 | 3.3% |
술 | 21 | 2.9% |
부 | 21 | 2.9% |
사 | 21 | 2.9% |
Other values (61) | 416 |
Latin
Value | Count | Frequency (%) |
a | 308 | 9.9% |
e | 221 | 7.1% |
t | 201 | 6.5% |
o | 185 | 6.0% |
i | 182 | 5.9% |
l | 174 | 5.6% |
r | 173 | 5.6% |
L | 160 | 5.2% |
P | 154 | 5.0% |
n | 152 | 4.9% |
Other values (29) | 1187 |
Common
Value | Count | Frequency (%) |
(space) | 370 | 53.9% |
- | 94 | 13.7% |
/ | 61 | 8.9% |
) | 60 | 8.7% |
( | 60 | 8.7% |
1 | 7 | 1.0% |
5 | 6 | 0.9% |
3 | 6 | 0.9% |
6 | 6 | 0.9% |
4 | 6 | 0.9% |
Other values (3) | 11 | 1.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3784 | |
Hangul | 731 | 16.2% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 370 | 9.8% |
a | 308 | 8.1% |
e | 221 | 5.8% |
t | 201 | 5.3% |
o | 185 | 4.9% |
i | 182 | 4.8% |
l | 174 | 4.6% |
r | 173 | 4.6% |
L | 160 | 4.2% |
P | 154 | 4.1% |
Other values (42) | 1656 |
Hangul
Value | Count | Frequency (%) |
력 | 72 | 9.8% |
가 | 41 | 5.6% |
족 | 38 | 5.2% |
정 | 26 | 3.6% |
보 | 26 | 3.6% |
수 | 25 | 3.4% |
자 | 24 | 3.3% |
술 | 21 | 2.9% |
부 | 21 | 2.9% |
사 | 21 | 2.9% |
Other values (61) | 416 |
colId
Text
Distinct | 478 |
---|---|
Distinct (%) | 97.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Length
Max length | 27 |
---|---|
Median length | 23 |
Mean length | 14.437628 |
Min length | 5 |
Characters and Unicode
Total characters | 7060 |
---|---|
Distinct characters | 35 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 468 |
---|---|
Unique (%) | 95.7% |
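"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts values that occur exactly once. The sketch below rebuilds the colId counts from the word table in this section to show how 478 distinct and 468 unique fit together. It assumes the lowercased word counts equal the value counts (the identifiers contain no whitespace), and the 468 singleton names are hypothetical placeholders, not real column IDs.

```python
from collections import Counter

# Repeated identifiers as listed in the colId word table.
repeated = {
    "figo_stag": 3,
    "ancd_ingr_nm": 2, "clnc_m_stag": 2, "clnc_n_stag": 2,
    "ord_ymd": 2, "ancd_nm": 2, "oprt_ymd": 2,
    "clnc_stag": 2, "ancd_ord_seq": 2, "clnc_t_stag": 2,
}

values = [name for name, n in repeated.items() for _ in range(n)]
values += [f"placeholder_{i}" for i in range(468)]  # hypothetical stand-ins

counts = Counter(values)
n_rows = len(values)
n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen once

print(n_rows, n_distinct, n_unique)          # 489 478 468
print(f"{100 * n_distinct / n_rows:.1f}%")   # 97.8%
print(f"{100 * n_unique / n_rows:.1f}%")     # 95.7%
```

The ten repeated identifiers are worth a look in the source data: the same column ID used in more than one table may be intentional (a shared key) or a naming collision.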
Sample
1st row | FRMD_YMD |
---|---|
2nd row | DIAG_AGE |
3rd row | DIAG_YMD |
4th row | DIAG_ENM |
5th row | ETC_CNCR_YN |
Value | Count | Frequency (%) |
figo_stag | 3 | 0.6% |
ancd_ingr_nm | 2 | 0.4% |
clnc_m_stag | 2 | 0.4% |
clnc_n_stag | 2 | 0.4% |
ord_ymd | 2 | 0.4% |
ancd_nm | 2 | 0.4% |
oprt_ymd | 2 | 0.4% |
clnc_stag | 2 | 0.4% |
ancd_ord_seq | 2 | 0.4% |
clnc_t_stag | 2 | 0.4% |
Other values (468) | 468 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 1267 | |
N | 706 | 10.0% |
T | 567 | 8.0% |
M | 470 | 6.7% |
L | 432 | 6.1% |
C | 421 | 6.0% |
R | 376 | 5.3% |
S | 359 | 5.1% |
P | 295 | 4.2% |
Y | 275 | 3.9% |
Other values (25) | 1892 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 5735 | |
Connector Punctuation | 1267 | 17.9% |
Decimal Number | 58 | 0.8% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 706 | |
T | 567 | 9.9% |
M | 470 | 8.2% |
L | 432 | 7.5% |
C | 421 | 7.3% |
R | 376 | 6.6% |
S | 359 | 6.3% |
P | 295 | 5.1% |
Y | 275 | 4.8% |
D | 263 | 4.6% |
Other values (16) | 1571 |
Decimal Number
Value | Count | Frequency (%) |
1 | 11 | |
4 | 9 | |
3 | 9 | |
5 | 8 | |
2 | 7 | |
6 | 6 | |
7 | 4 | 6.9% |
8 | 4 | 6.9% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1267 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 5735 | |
Common | 1325 | 18.8% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 706 | |
T | 567 | 9.9% |
M | 470 | 8.2% |
L | 432 | 7.5% |
C | 421 | 7.3% |
R | 376 | 6.6% |
S | 359 | 6.3% |
P | 295 | 5.1% |
Y | 275 | 4.8% |
D | 263 | 4.6% |
Other values (16) | 1571 |
Common
Value | Count | Frequency (%) |
_ | 1267 | |
1 | 11 | 0.8% |
4 | 9 | 0.7% |
3 | 9 | 0.7% |
5 | 8 | 0.6% |
2 | 7 | 0.5% |
6 | 6 | 0.5% |
7 | 4 | 0.3% |
8 | 4 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 7060 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 1267 | |
N | 706 | 10.0% |
T | 567 | 8.0% |
M | 470 | 6.7% |
L | 432 | 6.1% |
C | 421 | 6.0% |
R | 376 | 5.3% |
S | 359 | 5.1% |
P | 295 | 4.2% |
Y | 275 | 3.9% |
Other values (25) | 1892 |
colNm
Text
Distinct | 313 |
---|---|
Distinct (%) | 64.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Value | Count | Frequency (%) |
right | 34 | 3.5% |
left | 32 | 3.3% |
lnd | 30 | 3.1% |
size | 29 | 3.0% |
op | 27 | 2.8% |
of | 26 | 2.7% |
lns | 24 | 2.5% |
resection | 16 | 1.7% |
stage | 15 | 1.6% |
기타 | 15 | 1.6% |
Other values (325) | 718 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 478 | 8.3% |
e | 469 | 8.2% |
t | 383 | 6.7% |
i | 338 | 5.9% |
o | 334 | 5.8% |
r | 275 | 4.8% |
a | 275 | 4.8% |
c | 209 | 3.6% |
s | 175 | 3.0% |
n | 166 | 2.9% |
Other values (181) | 2639 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 3707 | |
Uppercase Letter | 739 | 12.9% |
Other Letter | 714 | 12.4% |
Space Separator | 478 | 8.3% |
Close Punctuation | 28 | 0.5% |
Open Punctuation | 28 | 0.5% |
Decimal Number | 17 | 0.3% |
Other Punctuation | 14 | 0.2% |
Dash Punctuation | 13 | 0.2% |
Math Symbol | 3 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
일 | 30 | 4.2% |
기 | 26 | 3.6% |
자 | 23 | 3.2% |
부 | 23 | 3.2% |
사 | 20 | 2.8% |
타 | 19 | 2.7% |
진 | 17 | 2.4% |
수 | 16 | 2.2% |
용 | 14 | 2.0% |
내 | 14 | 2.0% |
Other values (122) | 512 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 469 | |
t | 383 | |
i | 338 | 9.1% |
o | 334 | 9.0% |
r | 275 | 7.4% |
a | 275 | 7.4% |
c | 209 | 5.6% |
s | 175 | 4.7% |
n | 166 | 4.5% |
l | 145 | 3.9% |
Other values (14) | 938 |
Uppercase Letter
Value | Count | Frequency (%) |
L | 114 | |
S | 106 | |
P | 89 | |
N | 72 | |
R | 58 | |
D | 52 | |
O | 50 | |
A | 36 | 4.9% |
C | 30 | 4.1% |
I | 29 | 3.9% |
Other values (12) | 103 |
Decimal Number
Value | Count | Frequency (%) |
2 | 4 | |
1 | 4 | |
5 | 3 | |
4 | 3 | |
3 | 3 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 10 | |
, | 2 | 14.3% |
& | 2 | 14.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 478 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 28 |
Open Punctuation
Value | Count | Frequency (%) |
( | 28 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 13 |
Math Symbol
Value | Count | Frequency (%) |
+ | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 4446 | |
Hangul | 714 | 12.4% |
Common | 581 | 10.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
일 | 30 | 4.2% |
기 | 26 | 3.6% |
자 | 23 | 3.2% |
부 | 23 | 3.2% |
사 | 20 | 2.8% |
타 | 19 | 2.7% |
진 | 17 | 2.4% |
수 | 16 | 2.2% |
용 | 14 | 2.0% |
내 | 14 | 2.0% |
Other values (122) | 512 |
Latin
Value | Count | Frequency (%) |
e | 469 | 10.5% |
t | 383 | 8.6% |
i | 338 | 7.6% |
o | 334 | 7.5% |
r | 275 | 6.2% |
a | 275 | 6.2% |
c | 209 | 4.7% |
s | 175 | 3.9% |
n | 166 | 3.7% |
l | 145 | 3.3% |
Other values (36) | 1677 |
Common
Value | Count | Frequency (%) |
(space) | 478 | 82.3% |
) | 28 | 4.8% |
( | 28 | 4.8% |
- | 13 | 2.2% |
/ | 10 | 1.7% |
2 | 4 | 0.7% |
1 | 4 | 0.7% |
+ | 3 | 0.5% |
5 | 3 | 0.5% |
4 | 3 | 0.5% |
Other values (3) | 7 | 1.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 5027 | |
Hangul | 714 | 12.4% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 478 | 9.5% |
e | 469 | 9.3% |
t | 383 | 7.6% |
i | 338 | 6.7% |
o | 334 | 6.6% |
r | 275 | 5.5% |
a | 275 | 5.5% |
c | 209 | 4.2% |
s | 175 | 3.5% |
n | 166 | 3.3% |
Other values (49) | 1925 |
Hangul
Value | Count | Frequency (%) |
일 | 30 | 4.2% |
기 | 26 | 3.6% |
자 | 23 | 3.2% |
부 | 23 | 3.2% |
사 | 20 | 2.8% |
타 | 19 | 2.7% |
진 | 17 | 2.4% |
수 | 16 | 2.2% |
용 | 14 | 2.0% |
내 | 14 | 2.0% |
Other values (122) | 512 |
dataType
Categorical
IMBALANCE
 
Distinct | 9 |
---|---|
Distinct (%) | 1.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.9 KiB |
Length
Max length | 9 |
---|---|
Median length | 6 |
Mean length | 5.8916155 |
Min length | 4 |
Unique
Unique | 3 |
---|---|
Unique (%) | 0.6% |
Sample
1st row | <NA> |
---|---|
2nd row | Float() |
3rd row | DATE |
4th row | DATE |
5th row | String |
Common Values
Value | Count | Frequency (%) |
String | 433 | 88.5% |
Date | 26 | 5.3% |
Integer | 10 | 2.0% |
Float | 9 | 1.8% |
DATE | 4 | 0.8% |
INTEGER | 4 | 0.8% |
<NA> | 1 | 0.2% |
Float() | 1 | 0.2% |
Float(51) | 1 | 0.2% |
Words
Value | Count | Frequency (%) |
string | 433 | 88.5% |
date | 30 | 6.1% |
integer | 14 | 2.9% |
float | 10 | 2.0% |
na | 1 | 0.2% |
float(51 | 1 | 0.2% |
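The 75.7% imbalance figure in the alerts can be reproduced from the Common Values counts if the score is taken to be one minus the normalized Shannon entropy of the class distribution. That definition is an assumption made here, chosen because it matches the reported number; `imbalance_score` is a hypothetical helper name.

```python
import math

# Class counts from the dataType Common Values table.
data_type_counts = {
    "String": 433, "Date": 26, "Integer": 10, "Float": 9,
    "DATE": 4, "INTEGER": 4, "<NA>": 1, "Float()": 1, "Float(51)": 1,
}


def imbalance_score(class_counts):
    """Return 1 - H / log(k): 0 for a uniform split, near 1 for one dominant class."""
    class_counts = [c for c in class_counts if c > 0]
    k = len(class_counts)
    if k < 2:
        return 0.0
    total = sum(class_counts)
    entropy = -sum((c / total) * math.log(c / total) for c in class_counts)
    return 1 - entropy / math.log(k)


score = imbalance_score(data_type_counts.values())
print(f"{100 * score:.1f}%")  # 75.7%
```

Part of the imbalance is a labelling artifact: "Date"/"DATE", "Integer"/"INTEGER", and the three "Float" spellings are counted as separate classes, which raises the class count k without adding real variety.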
colDesc
Text
MISSING
 
Distinct | 475 |
---|---|
Distinct (%) | 98.8% |
Missing | 8 |
Missing (%) | 1.6% |
Memory size | 3.9 KiB |
Length
Max length | 436 |
---|---|
Median length | 39 |
Mean length | 23.328482 |
Min length | 5 |
Characters and Unicode
Total characters | 11221 |
---|---|
Distinct characters | 283 |
Distinct categories | 11 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 469 |
---|---|
Unique (%) | 97.5% |
Sample
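1st row | 자궁암센터 외래 초진일 [date of first outpatient visit to the uterine cancer center] |
---|---|
2nd row | 환자의 진단시 나이 [patient's age at diagnosis] |
3rd row | KCD가 C48 C56 C57 중 하나 이상인 최초 진단 등록일 [registration date of the first diagnosis whose KCD code is one of C48, C56, C57] |
4th row | KCD가 C48 C56 C57 중 하나 이상인 모든 등록 진단 정보 (하위코드 포함) [all registered diagnosis information whose KCD code is one of C48, C56, C57 (including subcodes)] |
5th row | 난소암 or 기타암 구분 [ovarian cancer vs. other cancer classification] |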
Value | Count | Frequency (%) |
여부 | 171 | 8.0% |
내용 | 85 | 4.0% |
환자의 | 53 | 2.5% |
(blank) | 52 | 2.4% |
유무 | 43 | 2.0% |
환자 | 39 | 1.8% |
right | 39 | 1.8% |
left | 37 | 1.7% |
lnd | 30 | 1.4% |
size | 28 | 1.3% |
Other values (480) | 1569 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1684 | 15.0% |
e | 571 | 5.1% |
t | 490 | 4.4% |
i | 481 | 4.3% |
a | 441 | 3.9% |
r | 434 | 3.9% |
o | 414 | 3.7% |
c | 267 | 2.4% |
l | 261 | 2.3% |
L | 251 | 2.2% |
Other values (273) | 5927 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 5016 | |
Other Letter | 2887 | |
Space Separator | 1684 | 15.0% |
Uppercase Letter | 1217 | 10.8% |
Dash Punctuation | 103 | 0.9% |
Decimal Number | 90 | 0.8% |
Open Punctuation | 78 | 0.7% |
Close Punctuation | 78 | 0.7% |
Other Punctuation | 64 | 0.6% |
Math Symbol | 3 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
부 | 199 | 6.9% |
여 | 177 | 6.1% |
자 | 133 | 4.6% |
의 | 120 | 4.2% |
내 | 103 | 3.6% |
환 | 100 | 3.5% |
용 | 100 | 3.5% |
수 | 72 | 2.5% |
기 | 64 | 2.2% |
사 | 53 | 1.8% |
Other values (208) | 1766 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 571 | |
t | 490 | |
i | 481 | |
a | 441 | 8.8% |
r | 434 | 8.7% |
o | 414 | 8.3% |
c | 267 | 5.3% |
l | 261 | 5.2% |
s | 244 | 4.9% |
n | 234 | 4.7% |
Other values (14) | 1179 |
Uppercase Letter
Value | Count | Frequency (%) |
L | 251 | |
P | 143 | |
N | 140 | |
S | 98 | 8.1% |
R | 92 | 7.6% |
O | 76 | 6.2% |
C | 69 | 5.7% |
D | 53 | 4.4% |
A | 52 | 4.3% |
U | 46 | 3.8% |
Other values (12) | 197 |
Decimal Number
Value | Count | Frequency (%) |
4 | 17 | |
3 | 16 | |
5 | 13 | |
1 | 12 | |
2 | 11 | |
6 | 9 | |
8 | 6 | 6.7% |
7 | 6 | 6.7% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 59 | |
. | 2 | 3.1% |
& | 2 | 3.1% |
, | 1 | 1.6% |
Math Symbol
Value | Count | Frequency (%) |
= | 2 | |
+ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1684 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 103 |
Open Punctuation
Value | Count | Frequency (%) |
( | 78 |
Close Punctuation
Value | Count | Frequency (%) |
) | 78 |
Other Number
Value | Count | Frequency (%) |
² | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 6233 | |
Hangul | 2887 | |
Common | 2101 | 18.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
부 | 199 | 6.9% |
여 | 177 | 6.1% |
자 | 133 | 4.6% |
의 | 120 | 4.2% |
내 | 103 | 3.6% |
환 | 100 | 3.5% |
용 | 100 | 3.5% |
수 | 72 | 2.5% |
기 | 64 | 2.2% |
사 | 53 | 1.8% |
Other values (208) | 1766 |
Latin
Value | Count | Frequency (%) |
e | 571 | 9.2% |
t | 490 | 7.9% |
i | 481 | 7.7% |
a | 441 | 7.1% |
r | 434 | 7.0% |
o | 414 | 6.6% |
c | 267 | 4.3% |
l | 261 | 4.2% |
L | 251 | 4.0% |
s | 244 | 3.9% |
Other values (36) | 2379 |
Common
Value | Count | Frequency (%) |
(space) | 1684 | 80.2% |
- | 103 | 4.9% |
( | 78 | 3.7% |
) | 78 | 3.7% |
/ | 59 | 2.8% |
4 | 17 | 0.8% |
3 | 16 | 0.8% |
5 | 13 | 0.6% |
1 | 12 | 0.6% |
2 | 11 | 0.5% |
Other values (9) | 30 | 1.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 8333 | |
Hangul | 2887 | 25.7% |
None | 1 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1684 | 20.2% |
e | 571 | 6.9% |
t | 490 | 5.9% |
i | 481 | 5.8% |
a | 441 | 5.3% |
r | 434 | 5.2% |
o | 414 | 5.0% |
c | 267 | 3.2% |
l | 261 | 3.1% |
L | 251 | 3.0% |
Other values (54) | 3039 |
Hangul
Value | Count | Frequency (%) |
부 | 199 | 6.9% |
여 | 177 | 6.1% |
자 | 133 | 4.6% |
의 | 120 | 4.2% |
내 | 103 | 3.6% |
환 | 100 | 3.5% |
용 | 100 | 3.5% |
수 | 72 | 2.5% |
기 | 64 | 2.2% |
사 | 53 | 1.8% |
Other values (208) | 1766 |
None
Value | Count | Frequency (%) |
² | 1 |
colCnt
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 489 |
---|---|
Missing (%) | 100.0% |
Memory size | 4.4 KiB |
dispFormat
Text
MISSING
 
Distinct | 105 |
---|---|
Distinct (%) | 25.4% |
Missing | 75 |
Missing (%) | 15.3% |
Memory size | 3.9 KiB |
Length
Max length | 328 |
---|---|
Median length | 207 |
Mean length | 16.413043 |
Min length | 2 |
Characters and Unicode
Total characters | 6795 |
---|---|
Distinct characters | 101 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 81 |
---|---|
Unique (%) | 19.6% |
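The "Mean length" values in the text-variable sections appear to equal total characters divided by the number of non-missing rows. The short check below uses only figures printed in this report. The dictionary layout is just for illustration, and the relationship itself is an inference from the numbers, not something the report states.

```python
N_ROWS = 489

# column -> (total characters, missing rows), as reported per variable
text_columns = {
    "tblId": (7950, 0),
    "colId": (7060, 0),
    "colDesc": (11221, 8),
    "dispFormat": (6795, 75),
}

mean_lengths = {
    name: chars / (N_ROWS - missing)
    for name, (chars, missing) in text_columns.items()
}

for name, value in mean_lengths.items():
    print(f"{name}: {value:.6f}")
```

The results agree with the reported mean lengths (16.257669, 14.437628, 23.328482, and 16.413043), which suggests missing cells are excluded from length statistics.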
Sample
1st row | YYYY-MM-DD |
---|---|
2nd row | 숫자 [number] |
3rd row | YYYY-MM-DD |
4th row | OVARIAN CANCER UNSPECIFIED SIDE |
5th row | Y기타암 \|N난소암 [Y: other cancer, N: ovarian cancer] |
Value | Count | Frequency (%) |
y | 176 | 11.5% |
(blank) | 140 | 9.2% |
n | 112 | 7.3% |
텍스트 | 74 | 4.8% |
숫자 | 61 | 4.0% |
no | 60 | 3.9% |
yes | 60 | 3.9% |
무 | 45 | 2.9% |
y유 | 44 | 2.9% |
yyyy-mm-dd | 30 | 2.0% |
Other values (230) | 727 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1265 | 18.6% |
e | 376 | 5.5% |
Y | 373 | 5.5% |
o | 301 | 4.4% |
t | 273 | 4.0% |
, | 269 | 4.0% |
N | 253 | 3.7% |
i | 227 | 3.3% |
a | 195 | 2.9% |
s | 191 | 2.8% |
Other values (91) | 3072 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2865 | |
Space Separator | 1265 | |
Uppercase Letter | 1165 | |
Other Letter | 486 | 7.2% |
Other Punctuation | 396 | 5.8% |
Decimal Number | 299 | 4.4% |
Math Symbol | 200 | 2.9% |
Dash Punctuation | 77 | 1.1% |
Close Punctuation | 27 | 0.4% |
Open Punctuation | 15 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
스 | 74 | |
트 | 74 | |
텍 | 74 | |
자 | 62 | |
숫 | 61 | |
유 | 46 | |
무 | 45 | |
예 | 6 | 1.2% |
부 | 4 | 0.8% |
타 | 3 | 0.6% |
Other values (23) | 37 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 376 | |
o | 301 | |
t | 273 | 9.5% |
i | 227 | 7.9% |
a | 195 | 6.8% |
s | 191 | 6.7% |
r | 157 | 5.5% |
c | 147 | 5.1% |
n | 132 | 4.6% |
m | 125 | 4.4% |
Other values (13) | 741 |
Uppercase Letter
Value | Count | Frequency (%) |
Y | 373 | |
N | 253 | |
L | 140 | 12.0% |
D | 94 | 8.1% |
M | 58 | 5.0% |
R | 52 | 4.5% |
S | 44 | 3.8% |
C | 28 | 2.4% |
A | 20 | 1.7% |
P | 18 | 1.5% |
Other values (13) | 85 | 7.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 71 | |
2 | 62 | |
3 | 44 | |
4 | 36 | |
0 | 25 | 8.4% |
8 | 18 | 6.0% |
6 | 13 | 4.3% |
7 | 13 | 4.3% |
5 | 9 | 3.0% |
9 | 8 | 2.7% |
Other Punctuation
Value | Count | Frequency (%) |
, | 269 | |
: | 54 | 13.6% |
/ | 47 | 11.9% |
. | 26 | 6.6% |
Math Symbol
Value | Count | Frequency (%) |
\| | 188 | 94.0% |
+ | 6 | 3.0% |
= | 4 | 2.0% |
> | 2 | 1.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1265 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 77 |
Close Punctuation
Value | Count | Frequency (%) |
) | 27 |
Open Punctuation
Value | Count | Frequency (%) |
( | 15 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 4030 | |
Common | 2279 | |
Hangul | 486 | 7.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 376 | 9.3% |
Y | 373 | 9.3% |
o | 301 | 7.5% |
t | 273 | 6.8% |
N | 253 | 6.3% |
i | 227 | 5.6% |
a | 195 | 4.8% |
s | 191 | 4.7% |
r | 157 | 3.9% |
c | 147 | 3.6% |
Other values (36) | 1537 |
Hangul
Value | Count | Frequency (%) |
스 | 74 | |
트 | 74 | |
텍 | 74 | |
자 | 62 | |
숫 | 61 | |
유 | 46 | |
무 | 45 | |
예 | 6 | 1.2% |
부 | 4 | 0.8% |
타 | 3 | 0.6% |
Other values (23) | 37 |
Common
Value | Count | Frequency (%) |
(space) | 1265 | 55.5% |
, | 269 | 11.8% |
\| | 188 | 8.2% |
- | 77 | 3.4% |
1 | 71 | 3.1% |
2 | 62 | 2.7% |
: | 54 | 2.4% |
/ | 47 | 2.1% |
3 | 44 | 1.9% |
4 | 36 | 1.6% |
Other values (12) | 166 | 7.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 6309 | |
Hangul | 486 | 7.2% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1265 | 20.1% |
e | 376 | 6.0% |
Y | 373 | 5.9% |
o | 301 | 4.8% |
t | 273 | 4.3% |
, | 269 | 4.3% |
N | 253 | 4.0% |
i | 227 | 3.6% |
a | 195 | 3.1% |
s | 191 | 3.0% |
Other values (58) | 2586 |
Hangul
Value | Count | Frequency (%) |
스 | 74 | |
트 | 74 | |
텍 | 74 | |
자 | 62 | |
숫 | 61 | |
유 | 46 | |
무 | 45 | |
예 | 6 | 1.2% |
부 | 4 | 0.8% |
타 | 3 | 0.6% |
Other values (23) | 37 |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|
NUM | 1.000 | 0.891 | 0.891 | 0.992 | 0.993 | 0.353 |
gpId | 0.891 | 1.000 | 1.000 | 1.000 | 1.000 | 0.630 |
gpNm | 0.891 | 1.000 | 1.000 | 1.000 | 1.000 | 0.630 |
tblId | 0.992 | 1.000 | 1.000 | 1.000 | 1.000 | 0.863 |
tblNm | 0.993 | 1.000 | 1.000 | 1.000 | 1.000 | 0.864 |
dataType | 0.353 | 0.630 | 0.630 | 0.863 | 0.864 | 1.000 |
Variable | gpId | gpNm | dataType |
---|---|---|---|
gpId | 1.000 | 1.000 | 0.337 |
gpNm | 1.000 | 1.000 | 0.337 |
dataType | 0.337 | 0.337 | 1.000 |
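The matrix above covers only the categorical variables, and gpId and gpNm score 1.000 against each other because each group ID maps to exactly one group name. If this matrix is Cramér's V (an assumption; the heading did not survive extraction), the behaviour can be illustrated with a plain, uncorrected implementation. The toy rows below are made up for illustration and are not the real data, and profiling tools may apply a bias correction that this sketch omits.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_totals), len(y_totals)) - 1
    if k == 0:
        return 0.0
    return math.sqrt(chi2 / (n * k))


# Toy example: every ID has exactly one name, so the association is perfect.
gp_id = ["OVRY_OPRT"] * 3 + ["OVRY_HLTH"] * 2 + ["OVRY_SPR"] * 2
gp_nm = ["수술정보"] * 3 + ["기타건강정보"] * 2 + ["외과병리결과"] * 2

v_perfect = cramers_v(gp_id, gp_nm)
v_none = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(round(v_perfect, 3), round(v_none, 3))  # 1.0 0.0
```

A score of 1.000 between two columns means one of them is redundant for modelling purposes; here gpNm is simply the display name for gpId.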
Variable | NUM | gpId | gpNm | dataType |
---|---|---|---|---|
NUM | 1.000 | 0.637 | 0.637 | 0.176 |
gpId | 0.637 | 1.000 | 1.000 | 0.337 |
gpNm | 0.637 | 1.000 | 1.000 | 0.337 |
dataType | 0.176 | 0.337 | 0.337 | 1.000 |
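Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_TRGT | 기본정보 [basic information] | FRMD_YMD | 초진일 [date of first visit] | <NA> | 자궁암센터 외래 초진일 [date of first outpatient visit to the uterine cancer center] | <NA> | YYYY-MM-DD |
1 | 2 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_TRGT | 기본정보 [basic information] | DIAG_AGE | 진단시나이 [age at diagnosis] | Float() | 환자의 진단시 나이 [patient's age at diagnosis] | <NA> | 숫자 [number] |
2 | 3 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | RG_OVRY_CNDX_V | 진단정보 [diagnosis information] | DIAG_YMD | 진단일 [diagnosis date] | DATE | KCD가 C48 C56 C57 중 하나 이상인 최초 진단 등록일 [registration date of the first diagnosis whose KCD code is one of C48, C56, C57] | <NA> | YYYY-MM-DD |
3 | 4 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | RG_OVRY_CNDX_V | 진단정보 [diagnosis information] | DIAG_ENM | 진단명 [diagnosis name] | DATE | KCD가 C48 C56 C57 중 하나 이상인 모든 등록 진단 정보 (하위코드 포함) [all registered diagnosis information whose KCD code is one of C48, C56, C57 (including subcodes)] | <NA> | OVARIAN CANCER UNSPECIFIED SIDE |
4 | 5 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | RG_OVRY_CNDX_V | 진단정보 [diagnosis information] | ETC_CNCR_YN | 난소암/기타암 구분 [ovarian cancer / other cancer classification] | String | 난소암 or 기타암 구분 [ovarian cancer vs. other cancer classification] | <NA> | Y기타암 \|N난소암 [Y: other cancer, N: ovarian cancer] |
5 | 6 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_BDMS | 신체계측 [body measurements] | WT_MSRM_YMD | 체중 측정일 [weight measurement date] | DATE | 첫번째 간호기록의 입원일(측정일) [admission date (measurement date) of the first nursing record] | <NA> | YYYY-MM-DD |
6 | 7 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_BDMS | 신체계측 [body measurements] | PT_OVRY_BDMS_WT_VL | 체중(KG) [weight (kg)] | Float(51) | 첫번째 간호기록의 입원일에 측정한 체중 측정 [weight measured on the admission date of the first nursing record] | <NA> | 숫자 [number] |
7 | 8 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_BDMS | 신체계측 [body measurements] | HT_MSRM_YMD | 신장 측정일 [height measurement date] | DATE | 첫번째 간호기록의 입원일(측정일) [admission date (measurement date) of the first nursing record] | <NA> | YYYY-MM-DD |
8 | 9 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_BDMS | 신체계측 [body measurements] | HT_VL | 신장(cm) [height (cm)] | Float | 첫번째 간호기록의 입원일에 측정한 신장 측정 [height measured on the admission date of the first nursing record] | <NA> | 숫자 [number] |
9 | 10 | OVRY_SUMMARY_PTIF | 기본/진단정보 [basic/diagnosis information] | PT_OVRY_BDMS | 신체계측 [body measurements] | BMI_VL | BMI | Float | 자동계산 = 환자의 체중/(환자의 신장)² [auto-calculated = patient's weight / (patient's height)²] | <NA> | 숫자 [number] |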
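Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
479 | 480 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_INSM_YN | 불면 [insomnia] | String | 환자의 과거 불면증 유무 [whether the patient has a history of insomnia] | <NA> | Y유 \| N 무 [Y: present, N: absent] |
480 | 481 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_CADZ_YN | 심장질환 [heart disease] | String | 환자의 과거 심장질환 유무 [whether the patient has a history of heart disease] | <NA> | Y유 \| N 무 [Y: present, N: absent] |
481 | 482 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_ETC_YN | 기타 [other] | String | 환자의 과거 기타 병력 유무 [whether the patient has any other past medical history] | <NA> | Y유 \| N 무 [Y: present, N: absent] |
482 | 483 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_HTN_CMNT | 고혈압 상세내용 [hypertension details] | String | 환자의 과거 고혈압 관련 기타내용 [other notes on the patient's history of hypertension] | <NA> | 텍스트 [text] |
483 | 484 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_DM_CMNT | 당뇨내용 [diabetes details] | String | 환자의 과거 당뇨 관련 기타 내용 [other notes on the patient's history of diabetes] | <NA> | 텍스트 [text] |
484 | 485 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_CADZ_CMNT | 심장질환 상세내용 [heart disease details] | String | 환자의 과거 심장질환 관련 기타 내용 [other notes on the patient's history of heart disease] | <NA> | 텍스트 [text] |
485 | 486 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_10 | 과거력 [past medical history] | PHIS_ETC_CMNT | 기타 상세내용 [other details] | String | 환자의 과거 기타 병력 내용 [details of the patient's other past medical history] | <NA> | 텍스트 [text] |
486 | 487 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_11 | 증상/전원 정보 [symptom/transfer information] | MAIN_SYMP_YN | 증상 [symptoms] | String | 환자의 주 증상 유무 [whether the patient has a chief symptom] | <NA> | Y유 \| N 무 [Y: present, N: absent] |
487 | 488 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_11 | 증상/전원 정보 [symptom/transfer information] | MAIN_SYMP_CMNT | 증상 상세내용 [symptom details] | String | 환자의 증상 상세내용 [details of the patient's symptoms] | <NA> | 텍스트 [text] |
488 | 489 | OVRY_HLTH | 기타건강정보 [other health information] | MR_OVRY_HLTH_11 | 증상/전원 정보 [symptom/transfer information] | OUTS_DIAG_TRANS_YN | 타 병원 진단 후 전원 [transfer after diagnosis at another hospital] | String | 환자의 타 병원 진단 후 전원 여부 [whether the patient was transferred after diagnosis at another hospital] | <NA> | Y유 \| N 무 [Y: present, N: absent] |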
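Several dispFormat values in the sample rows encode a code-to-label mapping in a single string, with entries separated by a pipe and a one-character code placed directly before its label. A small parser for that pattern might look like the sketch below. `parse_code_labels` is a hypothetical helper, and the one-character-code assumption is inferred from these sample rows only; it would misread free-form formats such as `YYYY-MM-DD`, so it should be applied only to the coded `_YN` columns.

```python
def parse_code_labels(disp_format):
    """Parse strings like 'Y유 | N 무' into {'Y': '유', 'N': '무'}.

    Assumes each pipe-separated entry starts with a one-character code.
    """
    mapping = {}
    for part in disp_format.split("|"):
        part = part.strip()
        if not part:
            continue  # skip empty entries from leading or trailing pipes
        mapping[part[0]] = part[1:].strip()
    return mapping


print(parse_code_labels("Y유 | N 무"))         # {'Y': '유', 'N': '무'}
print(parse_code_labels("Y기타암 |N난소암"))    # {'Y': '기타암', 'N': '난소암'}
```

Note the inconsistent spacing around the pipe and between code and label in the source values; stripping whitespace on both sides keeps the parsed labels comparable.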