Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 321 |
Missing cells | 320 |
Missing cells (%) | 9.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 28.3 KiB |
Average record size in memory | 90.4 B |
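The missing-cell percentage follows from the counts above: the dataset has 11 × 321 cells, of which 320 are missing. A minimal sketch of that arithmetic is shown here; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 11
n_observations = 321
missing_cells = 320

total_cells = n_variables * n_observations        # every variable has one cell per observation
missing_pct = 100 * missing_cells / total_cells   # share of all cells that are missing

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
# prints "3531 cells, 9.1% missing"
```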
Variable types
Numeric | 2 |
---|---|
Categorical | 5 |
Text | 4 |
Dataset
Description | Provides metadata for the thyroid cancer registry (the data items to be provided, their types, sizes, per-item record counts, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048690/fileData.do |
dispFormat has constant value "" | Constant |
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
colCnt is highly overall correlated with gpId and 3 other fields | High correlation |
dispFormat has 320 (99.7%) missing values | Missing |
NUM has unique values | Unique |
colCnt has 6 (1.9%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-12 15:53:23.646175 |
---|---|
Analysis finished | 2023-12-12 15:53:25.907905 |
Duration | 2.26 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 321 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 161 |
Minimum | 1 |
---|---|
Maximum | 321 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 17 |
Q1 | 81 |
median | 161 |
Q3 | 241 |
95-th percentile | 305 |
Maximum | 321 |
Range | 320 |
Interquartile range (IQR) | 160 |
Descriptive statistics
Standard deviation | 92.808944 |
---|---|
Coefficient of variation (CV) | 0.57645307 |
Kurtosis | -1.2 |
Mean | 161 |
Median Absolute Deviation (MAD) | 80 |
Skewness | 0 |
Sum | 51681 |
Variance | 8613.5 |
Monotonicity | Strictly increasing |
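These figures are what one would expect if NUM is simply a row counter running from 1 to 321. The report does not say so explicitly, but the minimum, maximum, distinct count and "Strictly increasing" flag are all consistent with it. Under that assumption, the sketch below reproduces several of the statistics with Python's standard library only.

```python
import statistics

# Assumption: NUM is the consecutive sequence 1..321 (not stated in the report).
values = list(range(1, 322))

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

# Quartiles by linear interpolation that includes both end points.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, iqr, mad)
```

If the assumption holds, the computed sum, mean, variance, quartiles and MAD agree with the reported 51681, 161, 8613.5, 81/241 and 80.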
Value | Count | Frequency (%) |
1 | 1 | 0.3% |
242 | 1 | 0.3% |
220 | 1 | 0.3% |
219 | 1 | 0.3% |
218 | 1 | 0.3% |
217 | 1 | 0.3% |
216 | 1 | 0.3% |
215 | 1 | 0.3% |
214 | 1 | 0.3% |
213 | 1 | 0.3% |
Other values (311) | 311 |
Value | Count | Frequency (%) |
1 | 1 | |
2 | 1 | |
3 | 1 | |
4 | 1 | |
5 | 1 | |
6 | 1 | |
7 | 1 | |
8 | 1 | |
9 | 1 | |
10 | 1 |
Value | Count | Frequency (%) |
321 | 1 | |
320 | 1 | |
319 | 1 | |
318 | 1 | |
317 | 1 | |
316 | 1 | |
315 | 1 | |
314 | 1 | |
313 | 1 | |
312 | 1 |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 18 |
---|---|
Distinct (%) | 5.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
_THRD_HLTH | |
---|---|
_THRD_OPRT | |
_THRD_SPR | |
_THRD_CNDX | |
_THRD_RADI | |
Other values (13) |
Length
Max length | 15 |
---|---|
Median length | 10 |
Mean length | 10.82243 |
Min length | 9 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _THRD_Summary |
---|---|
2nd row | _THRD_Summary |
3rd row | _THRD_Summary |
4th row | _THRD_Summary |
5th row | _THRD_Summary |
Common Values
Value | Count | Frequency (%) |
_THRD_HLTH | 68 | |
_THRD_OPRT | 49 | |
_THRD_SPR | 48 | |
_THRD_CNDX | 20 | 6.2% |
_THRD_RADI | 17 | 5.3% |
_THRD_RTX | 14 | 4.4% |
_THRD_IMMU | 13 | 4.0% |
_THRD_BX_INIT | 12 | 3.7% |
_THRD_Summary | 12 | 3.7% |
_THRD_BX_FLUP | 12 | 3.7% |
Other values (8) | 56 |
Length
Value | Count | Frequency (%) |
thrd_hlth | 68 | |
thrd_oprt | 49 | |
thrd_spr | 48 | |
thrd_cndx | 20 | 6.2% |
thrd_radi | 17 | 5.3% |
thrd_rtx | 14 | 4.4% |
thrd_immu | 13 | 4.0% |
thrd_eval_dead | 12 | 3.7% |
thrd_bx_flup | 12 | 3.7% |
thrd_summary | 12 | 3.7% |
Other values (8) | 56 |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 18 |
---|---|
Distinct (%) | 5.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
기타 | |
---|---|
수술 | |
외과병리 | |
진단정보 | |
방사성요오드치료 | |
Other values (13) |
Length
Max length | 15 |
---|---|
Median length | 13 |
Mean length | 4.728972 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
기타 | 68 | |
수술 | 49 | |
외과병리 | 48 | |
진단정보 | 20 | 6.2% |
방사성요오드치료 | 17 | 5.3% |
방사선치료 | 14 | 4.4% |
면역병리 | 13 | 4.0% |
수술 전 병리검사 | 12 | 3.7% |
Summary | 12 | 3.7% |
F/U 병리검사 | 12 | 3.7% |
Other values (8) | 56 |
Length
Value | Count | Frequency (%) |
수술 | 73 | |
기타 | 68 | |
외과병리 | 48 | |
f/u | 24 | 5.6% |
전 | 24 | 5.6% |
병리검사 | 24 | 5.6% |
진단정보 | 20 | 4.7% |
방사성요오드치료 | 17 | 4.0% |
방사선치료 | 14 | 3.3% |
면역병리 | 13 | 3.1% |
Other values (13) | 100 |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 29 |
---|---|
Distinct (%) | 9.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
_THRD_MR_HLTH | |
---|---|
_THRD_PE_OPRT_LIST | |
_THRD_PE_SPR_SUB | |
_THRD_PE_SPR | |
_THRD_PE_OPRT | |
Other values (24) |
Length
Max length | 18 |
---|---|
Median length | 13 |
Mean length | 14.573209 |
Min length | 12 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _THRD_PT_TRGT |
---|---|
2nd row | _THRD_PT_TRGT |
3rd row | _THRD_PT_TRGT |
4th row | _THRD_PT_TRGT |
5th row | _THRD_PT_TRGT |
Common Values
Value | Count | Frequency (%) |
_THRD_MR_HLTH | 68 | |
_THRD_PE_OPRT_LIST | 26 | 8.1% |
_THRD_PE_SPR_SUB | 23 | 7.2% |
_THRD_PE_SPR | 19 | 5.9% |
_THRD_PE_OPRT | 17 | 5.3% |
_THRD_PE_IMMU | 13 | 4.0% |
_THRD_PE_BX_INIT | 12 | 3.7% |
_THRD_PE_RADI | 12 | 3.7% |
_THRD_PE_BX_FLUP | 12 | 3.7% |
_THRD_PT_TRGT | 12 | 3.7% |
Other values (19) | 107 |
Length
Value | Count | Frequency (%) |
thrd_mr_hlth | 68 | |
thrd_pe_oprt_list | 26 | 8.1% |
thrd_pe_spr_sub | 23 | 7.2% |
thrd_pe_spr | 19 | 5.9% |
thrd_pe_oprt | 17 | 5.3% |
thrd_pe_immu | 13 | 4.0% |
thrd_pe_bx_init | 12 | 3.7% |
thrd_pe_radi | 12 | 3.7% |
thrd_pe_bx_flup | 12 | 3.7% |
thrd_pt_trgt | 12 | 3.7% |
Other values (19) | 107 |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 28 |
---|---|
Distinct (%) | 8.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
갑상선암 환자건강정보 | |
---|---|
갑상선암 수술내용 | |
갑상선암 외과병리(SUB) | |
갑상선암 외과병리내용 | |
갑상선암 수술정보 | |
Other values (23) |
Length
Max length | 27 |
---|---|
Median length | 20 |
Mean length | 12.05919 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 갑성선암 대상자 |
---|---|
2nd row | 갑성선암 대상자 |
3rd row | 갑성선암 대상자 |
4th row | 갑성선암 대상자 |
5th row | 갑성선암 대상자 |
Common Values
Value | Count | Frequency (%) |
갑상선암 환자건강정보 | 68 | |
갑상선암 수술내용 | 26 | 8.1% |
갑상선암 외과병리(SUB) | 23 | 7.2% |
갑상선암 외과병리내용 | 19 | 5.9% |
갑상선암 수술정보 | 17 | 5.3% |
갑상선암 방사성요오드치료 | 17 | 5.3% |
갑상선암 면역병리 | 13 | 4.0% |
갑상선암 Initial 병리검사 | 12 | 3.7% |
갑상선암 F/U 병리검사 | 12 | 3.7% |
갑성선암 대상자 | 12 | 3.7% |
Other values (18) | 102 |
Length
Value | Count | Frequency (%) |
갑상선암 | 309 | |
환자건강정보 | 68 | 9.5% |
수술내용 | 26 | 3.6% |
initial | 25 | 3.5% |
병리검사 | 24 | 3.3% |
f/u | 24 | 3.3% |
외과병리(sub | 23 | 3.2% |
외과병리내용 | 19 | 2.6% |
수술정보 | 17 | 2.4% |
방사성요오드치료 | 17 | 2.4% |
Other values (24) | 165 |
colId
Text
Distinct | 269 |
---|---|
Distinct (%) | 83.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
Value | Count | Frequency (%) |
pt_sbst_no | 8 | 2.5% |
ldng_ymd | 7 | 2.2% |
oprt_ymd | 4 | 1.2% |
oprt_nm | 3 | 0.9% |
miex_nm | 2 | 0.6% |
bx_inhs_yn | 2 | 0.6% |
bx_mthd_cmnt | 2 | 0.6% |
cexm_nm | 2 | 0.6% |
cexm_rslt_cmnt | 2 | 0.6% |
miex_srex_cd | 2 | 0.6% |
Other values (259) | 287 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 630 | |
T | 361 | 9.3% |
M | 329 | 8.5% |
N | 316 | 8.1% |
C | 217 | 5.6% |
D | 211 | 5.4% |
R | 208 | 5.4% |
S | 203 | 5.2% |
E | 153 | 3.9% |
A | 143 | 3.7% |
Other values (25) | 1115 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 3212 | |
Connector Punctuation | 630 | 16.2% |
Decimal Number | 44 | 1.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 361 | 11.2% |
M | 329 | 10.2% |
N | 316 | 9.8% |
C | 217 | 6.8% |
D | 211 | 6.6% |
R | 208 | 6.5% |
S | 203 | 6.3% |
E | 153 | 4.8% |
A | 143 | 4.5% |
Y | 141 | 4.4% |
Other values (16) | 930 |
Decimal Number
Value | Count | Frequency (%) |
1 | 13 | |
0 | 12 | |
2 | 9 | |
7 | 4 | 9.1% |
3 | 3 | 6.8% |
9 | 1 | 2.3% |
5 | 1 | 2.3% |
4 | 1 | 2.3% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 630 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 3212 | |
Common | 674 | 17.3% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 361 | 11.2% |
M | 329 | 10.2% |
N | 316 | 9.8% |
C | 217 | 6.8% |
D | 211 | 6.6% |
R | 208 | 6.5% |
S | 203 | 6.3% |
E | 153 | 4.8% |
A | 143 | 4.5% |
Y | 141 | 4.4% |
Other values (16) | 930 |
Common
Value | Count | Frequency (%) |
_ | 630 | |
1 | 13 | 1.9% |
0 | 12 | 1.8% |
2 | 9 | 1.3% |
7 | 4 | 0.6% |
3 | 3 | 0.4% |
9 | 1 | 0.1% |
5 | 1 | 0.1% |
4 | 1 | 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3886 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 630 | |
T | 361 | 9.3% |
M | 329 | 8.5% |
N | 316 | 8.1% |
C | 217 | 5.6% |
D | 211 | 5.4% |
R | 208 | 5.4% |
S | 203 | 5.2% |
E | 153 | 3.9% |
A | 143 | 3.7% |
Other values (25) | 1115 |
colNm
Text
Distinct | 268 |
---|---|
Distinct (%) | 83.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
Value | Count | Frequency (%) |
stage | 14 | 2.4% |
of | 13 | 2.2% |
finding | 11 | 1.9% |
operation | 11 | 1.9% |
procedure | 10 | 1.7% |
검체결과 | 10 | 1.7% |
and | 10 | 1.7% |
최초 | 8 | 1.4% |
환자대체번호 | 8 | 1.4% |
diagnosis | 7 | 1.2% |
Other values (306) | 490 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 272 | 7.6% |
o | 159 | 4.4% |
i | 150 | 4.2% |
e | 130 | 3.6% |
n | 117 | 3.3% |
a | 115 | 3.2% |
t | 106 | 2.9% |
r | 92 | 2.6% |
s | 77 | 2.1% |
자 | 74 | 2.1% |
Other values (218) | 2302 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1435 | |
Other Letter | 1430 | |
Space Separator | 272 | 7.6% |
Uppercase Letter | 202 | 5.6% |
Open Punctuation | 73 | 2.0% |
Close Punctuation | 73 | 2.0% |
Other Punctuation | 55 | 1.5% |
Decimal Number | 49 | 1.4% |
Dash Punctuation | 3 | 0.1% |
Connector Punctuation | 2 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.6% |
병 | 51 | 3.6% |
력 | 49 | 3.4% |
검 | 49 | 3.4% |
사 | 44 | 3.1% |
여 | 37 | 2.6% |
족 | 34 | 2.4% |
가 | 34 | 2.4% |
Other values (154) | 952 |
Lowercase Letter
Value | Count | Frequency (%) |
o | 159 | |
i | 150 | |
e | 130 | 9.1% |
n | 117 | 8.2% |
a | 115 | 8.0% |
t | 106 | 7.4% |
r | 92 | 6.4% |
s | 77 | 5.4% |
d | 71 | 4.9% |
g | 65 | 4.5% |
Other values (13) | 353 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 34 | |
N | 27 | |
T | 19 | |
O | 14 | 6.9% |
L | 14 | 6.9% |
S | 14 | 6.9% |
M | 12 | 5.9% |
C | 10 | 5.0% |
E | 9 | 4.5% |
F | 6 | 3.0% |
Other values (13) | 43 |
Decimal Number
Value | Count | Frequency (%) |
1 | 14 | |
0 | 14 | |
2 | 10 | |
7 | 4 | 8.2% |
3 | 3 | 6.1% |
4 | 1 | 2.0% |
5 | 1 | 2.0% |
6 | 1 | 2.0% |
9 | 1 | 2.0% |
Other Punctuation
Value | Count | Frequency (%) |
: | 33 | |
/ | 14 | |
, | 4 | 7.3% |
" | 4 | 7.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 272 |
Open Punctuation
Value | Count | Frequency (%) |
( | 73 |
Close Punctuation
Value | Count | Frequency (%) |
) | 73 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1637 | |
Hangul | 1430 | |
Common | 527 | 14.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.6% |
병 | 51 | 3.6% |
력 | 49 | 3.4% |
검 | 49 | 3.4% |
사 | 44 | 3.1% |
여 | 37 | 2.6% |
족 | 34 | 2.4% |
가 | 34 | 2.4% |
Other values (154) | 952 |
Latin
Value | Count | Frequency (%) |
o | 159 | 9.7% |
i | 150 | 9.2% |
e | 130 | 7.9% |
n | 117 | 7.1% |
a | 115 | 7.0% |
t | 106 | 6.5% |
r | 92 | 5.6% |
s | 77 | 4.7% |
d | 71 | 4.3% |
g | 65 | 4.0% |
Other values (36) | 555 |
Common
Value | Count | Frequency (%) |
(space) | 272 | |
( | 73 | 13.9% |
) | 73 | 13.9% |
: | 33 | 6.3% |
1 | 14 | 2.7% |
0 | 14 | 2.7% |
/ | 14 | 2.7% |
2 | 10 | 1.9% |
7 | 4 | 0.8% |
, | 4 | 0.8% |
Other values (8) | 16 | 3.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2164 | |
Hangul | 1430 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 272 | 12.6% |
o | 159 | 7.3% |
i | 150 | 6.9% |
e | 130 | 6.0% |
n | 117 | 5.4% |
a | 115 | 5.3% |
t | 106 | 4.9% |
r | 92 | 4.3% |
s | 77 | 3.6% |
( | 73 | 3.4% |
Other values (54) | 873 |
Hangul
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.6% |
병 | 51 | 3.6% |
력 | 49 | 3.4% |
검 | 49 | 3.4% |
사 | 44 | 3.1% |
여 | 37 | 2.6% |
족 | 34 | 2.4% |
가 | 34 | 2.4% |
Other values (154) | 952 |
dataType
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 0.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
STRING | |
---|---|
DATE | |
INTEGER | 23 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 5.7601246 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | STRING |
---|---|
2nd row | DATE |
3rd row | STRING |
4th row | DATE |
5th row | INTEGER |
Common Values
Value | Count | Frequency (%) |
STRING | 248 | |
DATE | 50 | 15.6% |
INTEGER | 23 | 7.2% |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
string | 248 | |
date | 50 | 15.6% |
integer | 23 | 7.2% |
colDesc
Text
Distinct | 268 |
---|---|
Distinct (%) | 83.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 KiB |
Value | Count | Frequency (%) |
stage | 14 | 2.4% |
of | 13 | 2.2% |
operation | 11 | 1.9% |
finding | 11 | 1.9% |
검체결과 | 10 | 1.7% |
procedure | 10 | 1.7% |
and | 10 | 1.7% |
환자대체번호 | 8 | 1.4% |
최초 | 8 | 1.4% |
lymph | 7 | 1.2% |
Other values (307) | 490 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 272 | 7.6% |
o | 161 | 4.5% |
i | 150 | 4.2% |
e | 130 | 3.6% |
n | 117 | 3.3% |
a | 115 | 3.2% |
t | 106 | 3.0% |
r | 92 | 2.6% |
s | 77 | 2.1% |
자 | 74 | 2.1% |
Other values (219) | 2296 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1437 | |
Other Letter | 1416 | |
Space Separator | 272 | 7.6% |
Uppercase Letter | 208 | 5.8% |
Open Punctuation | 73 | 2.0% |
Close Punctuation | 73 | 2.0% |
Other Punctuation | 57 | 1.6% |
Decimal Number | 49 | 1.4% |
Dash Punctuation | 3 | 0.1% |
Connector Punctuation | 2 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.7% |
병 | 51 | 3.6% |
검 | 49 | 3.5% |
력 | 49 | 3.5% |
사 | 42 | 3.0% |
여 | 37 | 2.6% |
가 | 34 | 2.4% |
족 | 34 | 2.4% |
Other values (154) | 940 |
Lowercase Letter
Value | Count | Frequency (%) |
o | 161 | |
i | 150 | |
e | 130 | 9.0% |
n | 117 | 8.1% |
a | 115 | 8.0% |
t | 106 | 7.4% |
r | 92 | 6.4% |
s | 77 | 5.4% |
d | 71 | 4.9% |
g | 65 | 4.5% |
Other values (13) | 353 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 34 | |
N | 29 | |
T | 21 | |
O | 14 | 6.7% |
L | 14 | 6.7% |
S | 14 | 6.7% |
M | 12 | 5.8% |
C | 10 | 4.8% |
E | 9 | 4.3% |
R | 8 | 3.8% |
Other values (13) | 43 |
Decimal Number
Value | Count | Frequency (%) |
1 | 14 | |
0 | 14 | |
2 | 10 | |
7 | 4 | 8.2% |
3 | 3 | 6.1% |
4 | 1 | 2.0% |
5 | 1 | 2.0% |
6 | 1 | 2.0% |
9 | 1 | 2.0% |
Other Punctuation
Value | Count | Frequency (%) |
: | 33 | |
/ | 14 | |
" | 4 | 7.0% |
, | 4 | 7.0% |
. | 2 | 3.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 272 |
Open Punctuation
Value | Count | Frequency (%) |
( | 73 |
Close Punctuation
Value | Count | Frequency (%) |
) | 73 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1645 | |
Hangul | 1416 | |
Common | 529 | 14.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.7% |
병 | 51 | 3.6% |
검 | 49 | 3.5% |
력 | 49 | 3.5% |
사 | 42 | 3.0% |
여 | 37 | 2.6% |
가 | 34 | 2.4% |
족 | 34 | 2.4% |
Other values (154) | 940 |
Latin
Value | Count | Frequency (%) |
o | 161 | 9.8% |
i | 150 | 9.1% |
e | 130 | 7.9% |
n | 117 | 7.1% |
a | 115 | 7.0% |
t | 106 | 6.4% |
r | 92 | 5.6% |
s | 77 | 4.7% |
d | 71 | 4.3% |
g | 65 | 4.0% |
Other values (36) | 561 |
Common
Value | Count | Frequency (%) |
(space) | 272 | |
( | 73 | 13.8% |
) | 73 | 13.8% |
: | 33 | 6.2% |
/ | 14 | 2.6% |
1 | 14 | 2.6% |
0 | 14 | 2.6% |
2 | 10 | 1.9% |
7 | 4 | 0.8% |
" | 4 | 0.8% |
Other values (9) | 18 | 3.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2174 | |
Hangul | 1416 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 272 | 12.5% |
o | 161 | 7.4% |
i | 150 | 6.9% |
e | 130 | 6.0% |
n | 117 | 5.4% |
a | 115 | 5.3% |
t | 106 | 4.9% |
r | 92 | 4.2% |
s | 77 | 3.5% |
( | 73 | 3.4% |
Other values (55) | 881 |
Hangul
Value | Count | Frequency (%) |
자 | 74 | 5.2% |
부 | 54 | 3.8% |
일 | 52 | 3.7% |
병 | 51 | 3.6% |
검 | 49 | 3.5% |
력 | 49 | 3.5% |
사 | 42 | 3.0% |
여 | 37 | 2.6% |
가 | 34 | 2.4% |
족 | 34 | 2.4% |
Other values (154) | 940 |
colCnt
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 173 |
---|---|
Distinct (%) | 53.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 73865.209 |
Minimum | 0 |
---|---|
Maximum | 2951558 |
Zeros | 6 |
Zeros (%) | 1.9% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 23 |
Q1 | 1279 |
median | 6000 |
Q3 | 7999 |
95-th percentile | 87424 |
Maximum | 2951558 |
Range | 2951558 |
Interquartile range (IQR) | 6720 |
Descriptive statistics
Standard deviation | 421073.01 |
---|---|
Coefficient of variation (CV) | 5.7005594 |
Kurtosis | 41.772907 |
Mean | 73865.209 |
Median Absolute Deviation (MAD) | 3619 |
Skewness | 6.5702359 |
Sum | 23710732 |
Variance | 1.7730248 × 10¹¹ |
Monotonicity | Not monotonic |
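Several of the colCnt figures are derived from one another, so they can be cross-checked without the raw data. The sketch below uses only numbers copied from the tables above; the variable names are illustrative.

```python
# Figures copied from the colCnt tables above.
n = 321
total = 23710732
std = 421073.01
q1, q3 = 1279, 7999
minimum, maximum = 0, 2951558

mean = total / n                  # Mean = Sum / number of observations
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = standard deviation squared
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
value_range = maximum - minimum   # Range = Maximum - Minimum

print(f"mean={mean:.3f} cv={cv:.4f} variance={variance:.4e} "
      f"iqr={iqr} range={value_range}")
```

The mean sits far above the median of 6000 and the CV is well above 1, which matches the strong right skew reported (skewness 6.57): a few very large counts such as 2951558 dominate the sum.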
Value | Count | Frequency (%) |
7785 | 34 | 10.6% |
5900 | 8 | 2.5% |
3893 | 7 | 2.2% |
7802 | 7 | 2.2% |
11061 | 6 | 1.9% |
203408 | 6 | 1.9% |
8658 | 6 | 1.9% |
0 | 6 | 1.9% |
2951558 | 6 | 1.9% |
4576 | 6 | 1.9% |
Other values (163) | 229 |
Value | Count | Frequency (%) |
0 | 6 | |
1 | 1 | 0.3% |
2 | 3 | |
4 | 1 | 0.3% |
11 | 1 | 0.3% |
13 | 1 | 0.3% |
15 | 1 | 0.3% |
22 | 1 | 0.3% |
23 | 6 | |
24 | 1 | 0.3% |
Value | Count | Frequency (%) |
2951558 | 6 | |
2430429 | 1 | 0.3% |
203408 | 6 | |
87424 | 4 | |
76052 | 1 | 0.3% |
38147 | 5 | |
30165 | 3 | |
29842 | 1 | 0.3% |
29814 | 1 | 0.3% |
29565 | 1 | 0.3% |
dispFormat
Text
CONSTANT
  MISSING
 
Distinct | 1 |
---|---|
Distinct (%) | 100.0% |
Missing | 320 |
Missing (%) | 99.7% |
Memory size | 2.6 KiB |
Value | Count | Frequency (%) |
yyyy-mm-dd | 1 |
Most occurring characters
Value | Count | Frequency (%) |
Y | 4 | |
- | 2 | |
M | 2 | |
D | 2 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 8 | |
Dash Punctuation | 2 | 20.0% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
Y | 4 | |
M | 2 | |
D | 2 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 8 | |
Common | 2 | 20.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
Y | 4 | |
M | 2 | |
D | 2 |
Common
Value | Count | Frequency (%) |
- | 2 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 10 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
Y | 4 | |
- | 2 | |
M | 2 | |
D | 2 |
| | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.960 | 0.960 | 0.983 | 0.981 | 0.417 | 0.414 |
gpId | 0.960 | 1.000 | 1.000 | 1.000 | 1.000 | 0.589 | 0.909 |
gpNm | 0.960 | 1.000 | 1.000 | 1.000 | 1.000 | 0.589 | 0.909 |
tblId | 0.983 | 1.000 | 1.000 | 1.000 | 1.000 | 0.648 | 0.854 |
tblNm | 0.981 | 1.000 | 1.000 | 1.000 | 1.000 | 0.639 | 0.854 |
dataType | 0.417 | 0.589 | 0.589 | 0.648 | 0.639 | 1.000 | 0.000 |
colCnt | 0.414 | 0.909 | 0.909 | 0.854 | 0.854 | 0.000 | 1.000 |
| | gpNm | tblNm | dataType | gpId | tblId |
---|---|---|---|---|---|
gpNm | 1.000 | 0.983 | 0.325 | 1.000 | 0.982 |
tblNm | 0.983 | 1.000 | 0.400 | 0.983 | 0.998 |
dataType | 0.325 | 0.400 | 1.000 | 0.325 | 0.406 |
gpId | 1.000 | 0.983 | 0.325 | 1.000 | 0.982 |
tblId | 0.982 | 0.998 | 0.406 | 0.982 | 1.000 |
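A value of 1.000 between gpId and gpNm suggests that each group ID maps to exactly one group name. The report text does not say which association measure this matrix uses. Assuming it is Cramér's V, a common choice for pairs of categorical variables, the sketch below computes the uncorrected statistic in plain Python. The function name `cramers_v` and the toy sequences are illustrative: they reuse identifiers that appear in the report but not the real row counts, and a profiling tool may apply a bias correction that this sketch omits.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count for each (x, y) pair
    x_totals = Counter(xs)         # marginal counts of the first variable
    y_totals = Counter(ys)         # marginal counts of the second variable

    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data (hypothetical counts): every group ID has exactly one group name.
gp_id = ["_THRD_Summary"] * 3 + ["_THRD_HLTH"] * 4 + ["_THRD_OPRT"] * 2
gp_nm = ["Summary"] * 3 + ["기타"] * 4 + ["수술"] * 2
data_type = ["STRING", "DATE", "STRING", "DATE", "STRING",
             "DATE", "STRING", "STRING", "DATE"]

v_id_name = cramers_v(gp_id, gp_nm)      # one-to-one mapping, so close to 1
v_id_type = cramers_v(gp_id, data_type)  # types are mixed within groups, so much lower

print(round(v_id_name, 3), round(v_id_type, 3))
```

A one-to-one mapping yields the maximum value, which is why redundant ID/name pairs such as gpId/gpNm and tblId/tblNm trigger the "High correlation" alerts.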
| | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|---|
NUM | 1.000 | -0.222 | 0.795 | 0.795 | 0.850 | 0.852 | 0.281 |
colCnt | -0.222 | 1.000 | 0.671 | 0.671 | 0.644 | 0.647 | 0.000 |
gpId | 0.795 | 0.671 | 1.000 | 1.000 | 0.982 | 0.983 | 0.325 |
gpNm | 0.795 | 0.671 | 1.000 | 1.000 | 0.982 | 0.983 | 0.325 |
tblId | 0.850 | 0.644 | 0.982 | 0.982 | 1.000 | 0.998 | 0.406 |
tblNm | 0.852 | 0.647 | 0.983 | 0.983 | 0.998 | 1.000 | 0.400 |
dataType | 0.281 | 0.000 | 0.325 | 0.325 | 0.406 | 0.400 | 1.000 |
| | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | PT_NM | 성명 | STRING | 성명 | 11061 | <NA> |
1 | 2 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | BRTH_YMD | 생년월일 | DATE | 생년월일 | 11061 | <NA> |
2 | 3 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | SEX_CD | 성별 | STRING | 성별 | 11061 | <NA> |
3 | 4 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | DIAG_YMD | 최초 진단일자 | DATE | 최초 진단일자 | 10513 | <NA> |
4 | 5 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | DIAG_AGE | 최초 진단시 나이 | INTEGER | 최초 진단시 나이 | 10513 | <NA> |
5 | 6 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | DIAG_ENM | 최초 진단영문명 | STRING | 최초 진단영문명 | 10513 | <NA> |
6 | 7 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | DIAG_KNM | 최초 진단한글명 | STRING | 최초 진단한글명 | 10513 | <NA> |
7 | 8 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | OPRT_YMD | 최초 수술일자 | DATE | 최초 수술일자 | 7572 | <NA> |
8 | 9 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | OPRT_NM | 수술명 | STRING | 수술명 | 7572 | <NA> |
9 | 10 | _THRD_Summary | Summary | _THRD_PT_TRGT | 갑성선암 대상자 | MDCL_YMD | 최초 초진일자 | DATE | 최초 초진일자 | 9368 | <NA> |
| | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
311 | 312 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_INSM_YN | 과거병력불면증여부 | STRING | 과거병력불면증여부 | 7785 | <NA> |
312 | 313 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_CADZ_YN | 과거병력심장질환여부 | STRING | 과거병력심장질환여부 | 7785 | <NA> |
313 | 314 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_ETC_YN | 과거병력기타여부 | STRING | 과거병력기타여부 | 7785 | <NA> |
314 | 315 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_HTN_CMNT | 과거병력고혈압내용 | STRING | 과거병력고혈압내용 | 2056 | <NA> |
315 | 316 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_DM_CMNT | 과거병력당뇨내용 | STRING | 과거병력당뇨내용 | 345 | <NA> |
316 | 317 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_CADZ_CMNT | 과거병력심장질환내용 | STRING | 과거병력심장질환내용 | 68 | <NA> |
317 | 318 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | PHIS_ETC_CMNT | 과거병력기타내용 | STRING | 과거병력기타내용 | 1096 | <NA> |
318 | 319 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | MAIN_SYMP_YN | 주증상 | STRING | 주증상 | 3776 | <NA> |
319 | 320 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | MAIN_SYMP_CMNT | 주증상 상세내용 | STRING | 주증상 상세내용 | 3553 | <NA> |
320 | 321 | _THRD_HLTH | 기타 | _THRD_MR_HLTH | 갑상선암 환자건강정보 | LDNG_YMD | 적재일자 | DATE | 적재일자 | 7785 | <NA> |
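Each row in the sample tables above describes one column of a registry table, so the rows can be grouped by tblId to recover a per-table schema. The sketch below does this for four rows copied from the samples (NUM 1, 2, 5 and 321). The `ColumnMeta` class and its field names are hypothetical and only illustrate one way to hold the metadata.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """One metadata row: a single column of a registry table (hypothetical type)."""
    tbl_id: str
    col_id: str
    col_nm: str
    data_type: str
    col_cnt: int


# Rows copied from the sample tables above.
rows = [
    ColumnMeta("_THRD_PT_TRGT", "PT_NM", "성명", "STRING", 11061),
    ColumnMeta("_THRD_PT_TRGT", "BRTH_YMD", "생년월일", "DATE", 11061),
    ColumnMeta("_THRD_PT_TRGT", "DIAG_AGE", "최초 진단시 나이", "INTEGER", 10513),
    ColumnMeta("_THRD_MR_HLTH", "LDNG_YMD", "적재일자", "DATE", 7785),
]

# Group column IDs and their declared types by table ID.
schemas = defaultdict(dict)
for row in rows:
    schemas[row.tbl_id][row.col_id] = row.data_type

for tbl_id, columns in schemas.items():
    print(tbl_id, columns)
```

Keeping colCnt on each record also makes it possible to compare how well populated the columns of one table are, for example 11061 values for PT_NM against 10513 for DIAG_AGE.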