Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 382 |
Missing cells | 382 |
Missing cells (%) | 9.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 34.1 KiB |
Average record size in memory | 91.3 B |
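The missing-cell percentage follows from the other figures in this table: 382 missing cells out of 382 × 11 cells in total. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 382
missing_cells = 382

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

The count of 382 matches the single column (dispFormat) that the alerts report as 100.0% missing.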
Variable types
Numeric | 2 |
---|---|
Categorical | 5 |
Text | 3 |
Unsupported | 1 |
Dataset
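Description | Provides metadata for the liver cancer registry (the data items to be provided, their types, sizes, per-item record counts, sample data, and so on) |
---|---|
Author | National Cancer Center (국립암센터) |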
URL | https://www.data.go.kr/data/15048691/fileData.do |
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
colCnt is highly overall correlated with gpId and 3 other fields | High correlation |
dataType is highly imbalanced (50.6%) | Imbalance |
dispFormat has 382 (100.0%) missing values | Missing |
NUM has unique values | Unique |
dispFormat is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
colCnt has 82 (21.5%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-12 17:10:20.259944 |
---|---|
Analysis finished | 2023-12-12 17:10:21.621350 |
Duration | 1.36 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 382 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 191.5 |
Minimum | 1 |
---|---|
Maximum | 382 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 20.05 |
Q1 | 96.25 |
median | 191.5 |
Q3 | 286.75 |
95-th percentile | 362.95 |
Maximum | 382 |
Range | 381 |
Interquartile range (IQR) | 190.5 |
Descriptive statistics
Standard deviation | 110.41814 |
---|---|
Coefficient of variation (CV) | 0.57659606 |
Kurtosis | -1.2 |
Mean | 191.5 |
Median Absolute Deviation (MAD) | 95.5 |
Skewness | 0 |
Sum | 73153 |
Variance | 12192.167 |
Monotonicity | Strictly increasing |
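These figures are what a plain row counter would produce. The sketch below assumes NUM holds the integers 1 to 382 (consistent with the unique, strictly increasing values, the minimum of 1 and the maximum of 382) and recomputes the statistics with the standard library. The `quantile` helper is a hypothetical name, and linear interpolation is an assumption about the convention used.

```python
import math
import statistics

# Assumption: NUM is the row counter 1..382.
values = list(range(1, 383))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
print(median, mad, q1, q3, q3 - q1)
```

Under that assumption the sum, mean, variance, standard deviation, CV, MAD and quartiles all agree with the tables above.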
Value | Count | Frequency (%) |
1 | 1 | 0.3% |
253 | 1 | 0.3% |
262 | 1 | 0.3% |
261 | 1 | 0.3% |
260 | 1 | 0.3% |
259 | 1 | 0.3% |
258 | 1 | 0.3% |
257 | 1 | 0.3% |
256 | 1 | 0.3% |
255 | 1 | 0.3% |
Other values (372) | 372 | 97.4% |
Value | Count | Frequency (%) |
1 | 1 | 0.3% |
2 | 1 | 0.3% |
3 | 1 | 0.3% |
4 | 1 | 0.3% |
5 | 1 | 0.3% |
6 | 1 | 0.3% |
7 | 1 | 0.3% |
8 | 1 | 0.3% |
9 | 1 | 0.3% |
10 | 1 | 0.3% |
Value | Count | Frequency (%) |
382 | 1 | 0.3% |
381 | 1 | 0.3% |
380 | 1 | 0.3% |
379 | 1 | 0.3% |
378 | 1 | 0.3% |
377 | 1 | 0.3% |
376 | 1 | 0.3% |
375 | 1 | 0.3% |
374 | 1 | 0.3% |
373 | 1 | 0.3% |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 18 |
---|---|
Distinct (%) | 4.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
_LVER_HLTH | 79 |
---|---|
_LVER_SPR | 77 |
_LVER_OPRT | 38 |
_LVER_COMP | 28 |
_LVER_HEPA | 23 |
Other values (13) | 137 |
Length
Max length | 15 |
---|---|
Median length | 10 |
Mean length | 10.670157 |
Min length | 9 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _LVER_Summary |
---|---|
2nd row | _LVER_Summary |
3rd row | _LVER_Summary |
4th row | _LVER_Summary |
5th row | _LVER_Summary |
Common Values
Value | Count | Frequency (%) |
_LVER_HLTH | 79 | 20.7% |
_LVER_SPR | 77 | 20.2% |
_LVER_OPRT | 38 | 9.9% |
_LVER_COMP | 28 | 7.3% |
_LVER_HEPA | 23 | 6.0% |
_LVER_Summary | 18 | 4.7% |
_LVER_RTX | 15 | 3.9% |
_LVER_RLPS_MIST | 15 | 3.9% |
_LVER_CNDX | 14 | 3.7% |
_LVER_BX_INIT | 13 | 3.4% |
Other values (8) | 62 | 16.2% |
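Each Frequency (%) value in a Common Values table is the row count divided by the 382 observations. The sketch below recomputes the column from the gpId counts. The helper name `frequency_pct` is hypothetical, and rounding to one decimal place is an assumption about how the report formats the figure.

```python
# Counts copied from the gpId "Common Values" table.
counts = {
    "_LVER_HLTH": 79,
    "_LVER_SPR": 77,
    "_LVER_OPRT": 38,
    "_LVER_COMP": 28,
    "_LVER_HEPA": 23,
    "_LVER_Summary": 18,
    "_LVER_RTX": 15,
    "_LVER_RLPS_MIST": 15,
    "_LVER_CNDX": 14,
    "_LVER_BX_INIT": 13,
    "Other values (8)": 62,
}

# The counts cover every row, so they sum to the number of observations.
n_observations = sum(counts.values())

def frequency_pct(count, total):
    """Share of rows as a percentage string with one decimal place."""
    return f"{100 * count / total:.1f}%"

for value, count in counts.items():
    print(f"{value} | {count} | {frequency_pct(count, n_observations)} |")
```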
Length
Words
Value | Count | Frequency (%) |
lver_hlth | 79 | 20.7% |
lver_spr | 77 | 20.2% |
lver_oprt | 38 | 9.9% |
lver_comp | 28 | 7.3% |
lver_hepa | 23 | 6.0% |
lver_summary | 18 | 4.7% |
lver_rtx | 15 | 3.9% |
lver_rlps_mist | 15 | 3.9% |
lver_cndx | 14 | 3.7% |
lver_bx_init | 13 | 3.4% |
Other values (8) | 62 | 16.2% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 18 |
---|---|
Distinct (%) | 4.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
기타건강정보 | 79 |
---|---|
외과병리보고서 | 77 |
수술정보 | 38 |
합병증 | 28 |
기저간질환 정보 | 23 |
Other values (13) | 137 |
Length
Max length | 12 |
---|---|
Median length | 11 |
Mean length | 6.1806283 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
기타건강정보 | 79 | 20.7% |
외과병리보고서 | 77 | 20.2% |
수술정보 | 38 | 9.9% |
합병증 | 28 | 7.3% |
기저간질환 정보 | 23 | 6.0% |
Summary | 18 | 4.7% |
방사선치료 | 15 | 3.9% |
전이 및 재발 | 15 | 3.9% |
진단정보 | 14 | 3.7% |
병리검사 | 13 | 3.4% |
Other values (8) | 62 | 16.2% |
Length
Words
Value | Count | Frequency (%) |
기타건강정보 | 79 | 16.4% |
외과병리보고서 | 77 | 15.9% |
수술정보 | 38 | 7.9% |
합병증 | 28 | 5.8% |
및 | 27 | 5.6% |
기저간질환 | 23 | 4.8% |
정보 | 23 | 4.8% |
summary | 18 | 3.7% |
방사선치료 | 15 | 3.1% |
전이 | 15 | 3.1% |
Other values (13) | 140 | 29.0% |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 33 |
---|---|
Distinct (%) | 8.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
_LVER_MR_HLTH | 79 |
---|---|
_LVER_PE_SPR | 44 |
_LVER_PE_COMP | 28 |
_LVER_PE_SPR_SUB | 23 |
_LVER_PT_TRGT | 18 |
Other values (28) | 190 |
Length
Max length | 18 |
---|---|
Median length | 13 |
Mean length | 13.740838 |
Min length | 12 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _LVER_PT_TRGT |
---|---|
2nd row | _LVER_PT_TRGT |
3rd row | _LVER_PT_TRGT |
4th row | _LVER_PT_TRGT |
5th row | _LVER_PT_TRGT |
Common Values
Value | Count | Frequency (%) |
_LVER_MR_HLTH | 79 | 20.7% |
_LVER_PE_SPR | 44 | 11.5% |
_LVER_PE_COMP | 28 | 7.3% |
_LVER_PE_SPR_SUB | 23 | 6.0% |
_LVER_PT_TRGT | 18 | 4.7% |
_LVER_PE_RTX | 15 | 3.9% |
_LVER_PE_OPRT | 14 | 3.7% |
_LVER_PE_CHMO | 12 | 3.1% |
_LVER_PE_OPRT_LIST | 9 | 2.4% |
_LVER_PE_BX_INIT | 9 | 2.4% |
Other values (23) | 131 | 34.3% |
Length
Words
Value | Count | Frequency (%) |
lver_mr_hlth | 79 | 20.7% |
lver_pe_spr | 44 | 11.5% |
lver_pe_comp | 28 | 7.3% |
lver_pe_spr_sub | 23 | 6.0% |
lver_pt_trgt | 18 | 4.7% |
lver_pe_rtx | 15 | 3.9% |
lver_pe_oprt | 14 | 3.7% |
lver_pe_chmo | 12 | 3.1% |
lver_pe_oprt_list | 9 | 2.4% |
lver_pe_bx_init | 9 | 2.4% |
Other values (23) | 131 | 34.3% |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 33 |
---|---|
Distinct (%) | 8.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
간암 환자건강정보 | 79 |
---|---|
간암 외과병리 | 44 |
간암 합병증 | 28 |
간암 외과병리내용 | 23 |
간암 대상자 | 18 |
Other values (28) | 190 |
Length
Max length | 25 |
---|---|
Median length | 19 |
Mean length | 9.3743455 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 간암 대상자 |
---|---|
2nd row | 간암 대상자 |
3rd row | 간암 대상자 |
4th row | 간암 대상자 |
5th row | 간암 대상자 |
Common Values
Value | Count | Frequency (%) |
간암 환자건강정보 | 79 | 20.7% |
간암 외과병리 | 44 | 11.5% |
간암 합병증 | 28 | 7.3% |
간암 외과병리내용 | 23 | 6.0% |
간암 대상자 | 18 | 4.7% |
간암 방사선치료 | 15 | 3.9% |
간암 수술정보 | 14 | 3.7% |
간암 항암치료 | 12 | 3.1% |
간암 수술기록 | 9 | 2.4% |
간암 Initial 병리검사 | 9 | 2.4% |
Other values (23) | 131 | 34.3% |
Length
Words
Value | Count | Frequency (%) |
간암 | 359 | 40.7% |
환자건강정보 | 79 | 9.0% |
외과병리 | 44 | 5.0% |
initial | 30 | 3.4% |
합병증 | 28 | 3.2% |
외과병리내용 | 23 | 2.6% |
대상자 | 18 | 2.0% |
정보 | 18 | 2.0% |
관련 | 18 | 2.0% |
방사선치료 | 15 | 1.7% |
Other values (34) | 250 | 28.3% |
colId
Text
Distinct | 337 |
---|---|
Distinct (%) | 88.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
Value | Count | Frequency (%) |
path_no | 6 | 1.6% |
oprt_ymd | 5 | 1.3% |
ord_seq | 3 | 0.8% |
cexm_kind_nm | 3 | 0.8% |
cexm_nm | 3 | 0.8% |
cexm_rslt_cmnt | 3 | 0.8% |
cexm_cd | 3 | 0.8% |
ctx_estm_cmnt | 3 | 0.8% |
stag_rcrd_ymd | 2 | 0.5% |
lymp_seq | 2 | 0.5% |
Other values (327) | 349 | 91.4% |
Most occurring characters
Value | Count | Frequency (%) |
_ | 688 | 15.8% |
T | 395 | 9.1% |
M | 349 | 8.0% |
N | 346 | 8.0% |
C | 289 | 6.6% |
R | 234 | 5.4% |
S | 229 | 5.3% |
D | 210 | 4.8% |
A | 196 | 4.5% |
H | 160 | 3.7% |
Other values (23) | 1256 | 28.9% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 3615 | 83.1% |
Connector Punctuation | 688 | 15.8% |
Decimal Number | 49 | 1.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 395 | 10.9% |
M | 349 | 9.7% |
N | 346 | 9.6% |
C | 289 | 8.0% |
R | 234 | 6.5% |
S | 229 | 6.3% |
D | 210 | 5.8% |
A | 196 | 5.4% |
H | 160 | 4.4% |
Y | 156 | 4.3% |
Other values (16) | 1051 | 29.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 15 | 30.6% |
2 | 15 | 30.6% |
0 | 12 | 24.5% |
7 | 4 | 8.2% |
3 | 2 | 4.1% |
4 | 1 | 2.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 688 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 3615 | 83.1% |
Common | 737 | 16.9% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 395 | 10.9% |
M | 349 | 9.7% |
N | 346 | 9.6% |
C | 289 | 8.0% |
R | 234 | 6.5% |
S | 229 | 6.3% |
D | 210 | 5.8% |
A | 196 | 5.4% |
H | 160 | 4.4% |
Y | 156 | 4.3% |
Other values (16) | 1051 | 29.1% |
Common
Value | Count | Frequency (%) |
_ | 688 | 93.4% |
1 | 15 | 2.0% |
2 | 15 | 2.0% |
0 | 12 | 1.6% |
7 | 4 | 0.5% |
3 | 2 | 0.3% |
4 | 1 | 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4352 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 688 | 15.8% |
T | 395 | 9.1% |
M | 349 | 8.0% |
N | 346 | 8.0% |
C | 289 | 6.6% |
R | 234 | 5.4% |
S | 229 | 5.3% |
D | 210 | 4.8% |
A | 196 | 4.5% |
H | 160 | 3.7% |
Other values (23) | 1256 | 28.9% |
colNm
Text
Distinct | 329 |
---|---|
Distinct (%) | 86.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
Value | Count | Frequency (%) |
stage | 33 | 4.3% |
grade | 19 | 2.5% |
pathology | 16 | 2.1% |
항바이러스제 | 12 | 1.6% |
ajcc | 12 | 1.6% |
경구용 | 12 | 1.6% |
of | 11 | 1.4% |
histologic | 10 | 1.3% |
lymph | 7 | 0.9% |
node | 7 | 0.9% |
Other values (357) | 635 | 82.0% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 398 | 7.9% |
e | 257 | 5.1% |
a | 239 | 4.7% |
t | 228 | 4.5% |
i | 218 | 4.3% |
o | 216 | 4.3% |
r | 165 | 3.3% |
n | 148 | 2.9% |
s | 139 | 2.8% |
g | 115 | 2.3% |
Other values (209) | 2917 | 57.9% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2407 | 47.8% |
Other Letter | 1634 | 32.4% |
Space Separator | 398 | 7.9% |
Uppercase Letter | 292 | 5.8% |
Open Punctuation | 99 | 2.0% |
Close Punctuation | 99 | 2.0% |
Decimal Number | 51 | 1.0% |
Other Punctuation | 50 | 1.0% |
Dash Punctuation | 9 | 0.2% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 257 | 10.7% |
a | 239 | 9.9% |
t | 228 | 9.5% |
i | 218 | 9.1% |
o | 216 | 9.0% |
r | 165 | 6.9% |
n | 148 | 6.1% |
s | 139 | 5.8% |
g | 115 | 4.8% |
l | 107 | 4.4% |
Other values (16) | 575 | 23.9% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 58 | 19.9% |
P | 28 | 9.6% |
S | 23 | 7.9% |
A | 20 | 6.8% |
I | 19 | 6.5% |
N | 17 | 5.8% |
L | 15 | 5.1% |
B | 14 | 4.8% |
J | 12 | 4.1% |
T | 12 | 4.1% |
Other values (11) | 74 | 25.3% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 18 | 36.0% |
: | 17 | 34.0% |
, | 9 | 18.0% |
% | 3 | 6.0% |
. | 3 | 6.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 16 | 31.4% |
2 | 16 | 31.4% |
0 | 12 | 23.5% |
7 | 4 | 7.8% |
3 | 3 | 5.9% |
Space Separator
Value | Count | Frequency (%) |
(space) | 398 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 99 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 99 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 9 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2699 | 53.6% |
Hangul | 1634 | 32.4% |
Common | 707 | 14.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
Latin
Value | Count | Frequency (%) |
e | 257 | 9.5% |
a | 239 | 8.9% |
t | 228 | 8.4% |
i | 218 | 8.1% |
o | 216 | 8.0% |
r | 165 | 6.1% |
n | 148 | 5.5% |
s | 139 | 5.2% |
g | 115 | 4.3% |
l | 107 | 4.0% |
Other values (37) | 867 | 32.1% |
Common
Value | Count | Frequency (%) |
(space) | 398 | 56.3% |
( | 99 | 14.0% |
) | 99 | 14.0% |
/ | 18 | 2.5% |
: | 17 | 2.4% |
1 | 16 | 2.3% |
2 | 16 | 2.3% |
0 | 12 | 1.7% |
, | 9 | 1.3% |
- | 9 | 1.3% |
Other values (5) | 14 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3406 | 67.6% |
Hangul | 1634 | 32.4% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 398 | 11.7% |
e | 257 | 7.5% |
a | 239 | 7.0% |
t | 228 | 6.7% |
i | 218 | 6.4% |
o | 216 | 6.3% |
r | 165 | 4.8% |
n | 148 | 4.3% |
s | 139 | 4.1% |
g | 115 | 3.4% |
Other values (52) | 1283 | 37.7% |
Hangul
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
dataType
Categorical
IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
STRING | 303 |
---|---|
DATE | 44 |
INTEGER | 31 |
<NA> | 4 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 5.8298429 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | STRING |
---|---|
2nd row | STRING |
3rd row | DATE |
4th row | INTEGER |
5th row | DATE |
Common Values
Value | Count | Frequency (%) |
STRING | 303 | 79.3% |
DATE | 44 | 11.5% |
INTEGER | 31 | 8.1% |
<NA> | 4 | 1.0% |
Length
Common Values (Plot)
Words
Value | Count | Frequency (%) |
string | 303 | 79.3% |
date | 44 | 11.5% |
integer | 31 | 8.1% |
na | 4 | 1.0% |
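The alerts flag dataType as 50.6% imbalanced. The report itself does not state the formula. As an assumption, the sketch below uses a normalized-entropy score, one minus the Shannon entropy divided by the maximum possible entropy for four classes, and applies it to the counts above. The function name `imbalance_score` is hypothetical.

```python
import math

# Counts copied from the dataType "Common Values" table.
counts = {"STRING": 303, "DATE": 44, "INTEGER": 31, "<NA>": 4}

def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for evenly spread classes, near 1 when one class dominates."""
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

With STRING covering 303 of the 382 rows, the entropy is roughly half of the two-bit maximum, which is consistent with the reported figure.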
colDesc
Text
Distinct | 329 |
---|---|
Distinct (%) | 86.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.1 KiB |
Value | Count | Frequency (%) |
stage | 33 | 4.3% |
grade | 19 | 2.5% |
pathology | 16 | 2.1% |
항바이러스제 | 12 | 1.6% |
ajcc | 12 | 1.6% |
경구용 | 12 | 1.6% |
of | 11 | 1.4% |
histologic | 10 | 1.3% |
lymph | 7 | 0.9% |
node | 7 | 0.9% |
Other values (357) | 635 | 82.0% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 398 | 7.9% |
e | 257 | 5.1% |
a | 239 | 4.7% |
t | 228 | 4.5% |
i | 218 | 4.3% |
o | 216 | 4.3% |
r | 165 | 3.3% |
n | 148 | 2.9% |
s | 139 | 2.8% |
g | 115 | 2.3% |
Other values (209) | 2917 | 57.9% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2407 | 47.8% |
Other Letter | 1634 | 32.4% |
Space Separator | 398 | 7.9% |
Uppercase Letter | 292 | 5.8% |
Open Punctuation | 99 | 2.0% |
Close Punctuation | 99 | 2.0% |
Decimal Number | 51 | 1.0% |
Other Punctuation | 50 | 1.0% |
Dash Punctuation | 9 | 0.2% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 257 | 10.7% |
a | 239 | 9.9% |
t | 228 | 9.5% |
i | 218 | 9.1% |
o | 216 | 9.0% |
r | 165 | 6.9% |
n | 148 | 6.1% |
s | 139 | 5.8% |
g | 115 | 4.8% |
l | 107 | 4.4% |
Other values (16) | 575 | 23.9% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 58 | 19.9% |
P | 28 | 9.6% |
S | 23 | 7.9% |
A | 20 | 6.8% |
I | 19 | 6.5% |
N | 17 | 5.8% |
L | 15 | 5.1% |
B | 14 | 4.8% |
J | 12 | 4.1% |
T | 12 | 4.1% |
Other values (11) | 74 | 25.3% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 18 | 36.0% |
: | 17 | 34.0% |
, | 9 | 18.0% |
% | 3 | 6.0% |
. | 3 | 6.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 16 | 31.4% |
2 | 16 | 31.4% |
0 | 12 | 23.5% |
7 | 4 | 7.8% |
3 | 3 | 5.9% |
Space Separator
Value | Count | Frequency (%) |
(space) | 398 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 99 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 99 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 9 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2699 | 53.6% |
Hangul | 1634 | 32.4% |
Common | 707 | 14.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
Latin
Value | Count | Frequency (%) |
e | 257 | 9.5% |
a | 239 | 8.9% |
t | 228 | 8.4% |
i | 218 | 8.1% |
o | 216 | 8.0% |
r | 165 | 6.1% |
n | 148 | 5.5% |
s | 139 | 5.2% |
g | 115 | 4.3% |
l | 107 | 4.0% |
Other values (37) | 867 | 32.1% |
Common
Value | Count | Frequency (%) |
(space) | 398 | 56.3% |
( | 99 | 14.0% |
) | 99 | 14.0% |
/ | 18 | 2.5% |
: | 17 | 2.4% |
1 | 16 | 2.3% |
2 | 16 | 2.3% |
0 | 12 | 1.7% |
, | 9 | 1.3% |
- | 9 | 1.3% |
Other values (5) | 14 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3406 | 67.6% |
Hangul | 1634 | 32.4% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 398 | 11.7% |
e | 257 | 7.5% |
a | 239 | 7.0% |
t | 228 | 6.7% |
i | 218 | 6.4% |
o | 216 | 6.3% |
r | 165 | 4.8% |
n | 148 | 4.3% |
s | 139 | 4.1% |
g | 115 | 3.4% |
Other values (52) | 1283 | 37.7% |
Hangul
Value | Count | Frequency (%) |
병 | 70 | 4.3% |
부 | 65 | 4.0% |
력 | 58 | 3.5% |
일 | 47 | 2.9% |
가 | 46 | 2.8% |
여 | 44 | 2.7% |
족 | 42 | 2.6% |
자 | 42 | 2.6% |
사 | 36 | 2.2% |
드 | 35 | 2.1% |
Other values (147) | 1149 | 70.3% |
colCnt
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 142 |
---|---|
Distinct (%) | 37.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 74961.393 |
Minimum | 0 |
---|---|
Maximum | 5061656 |
Zeros | 82 |
Zeros (%) | 21.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 3.5 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 36.75 |
median | 1688 |
Q3 | 2257 |
95-th percentile | 63745.3 |
Maximum | 5061656 |
Range | 5061656 |
Interquartile range (IQR) | 2220.25 |
Descriptive statistics
Standard deviation | 572154.62 |
---|---|
Coefficient of variation (CV) | 7.6326573 |
Kurtosis | 71.920694 |
Mean | 74961.393 |
Median Absolute Deviation (MAD) | 1504 |
Skewness | 8.5609433 |
Sum | 28635252 |
Variance | 3.2736091 × 10^11 |
Monotonicity | Not monotonic |
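Several of the colCnt figures can be cross-checked against one another: the mean is the sum divided by the 382 observations, the variance is the square of the standard deviation, and the CV is the standard deviation divided by the mean. A minimal sketch of those checks, using only values copied from the tables above and illustrative variable names:

```python
# Figures copied from the colCnt tables.
n_observations = 382
total = 28635252          # Sum
std = 572154.62           # Standard deviation
zeros = 82                # Zeros
q1, q3 = 36.75, 2257      # Q1 and Q3

mean = total / n_observations            # compare with Mean
variance = std ** 2                      # compare with Variance
cv = std / mean                          # compare with Coefficient of variation (CV)
iqr = q3 - q1                            # compare with Interquartile range (IQR)
zeros_pct = 100 * zeros / n_observations # compare with Zeros (%)

print(f"mean={mean:.3f} variance={variance:.7e} cv={cv:.7f}")
print(f"iqr={iqr} zeros={zeros_pct:.1f}%")
```

The mean (about 74961) sits far above the median (1688), in line with the strong positive skewness of 8.56 and the handful of very large counts in the extreme-value table.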
Value | Count | Frequency (%) |
0 | 82 | 21.5% |
2104 | 37 | 9.7% |
2102 | 12 | 3.1% |
1728 | 11 | 2.9% |
2257 | 10 | 2.6% |
903 | 8 | 2.1% |
13049 | 7 | 1.8% |
50443 | 7 | 1.8% |
881 | 6 | 1.6% |
543 | 6 | 1.6% |
Other values (132) | 196 | 51.3% |
Value | Count | Frequency (%) |
0 | 82 | 21.5% |
1 | 1 | 0.3% |
2 | 2 | 0.5% |
3 | 2 | 0.5% |
4 | 1 | 0.3% |
8 | 1 | 0.3% |
9 | 1 | 0.3% |
15 | 1 | 0.3% |
16 | 1 | 0.3% |
24 | 1 | 0.3% |
Value | Count | Frequency (%) |
5061656 | 4 | 1.0% |
4892526 | 1 | 0.3% |
261251 | 4 | 1.0% |
254070 | 1 | 0.3% |
105528 | 4 | 1.0% |
102459 | 1 | 0.3% |
101029 | 1 | 0.3% |
65605 | 3 | 0.8% |
63806 | 1 | 0.3% |
62592 | 1 | 0.3% |
dispFormat
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 382 |
---|---|
Missing (%) | 100.0% |
Memory size | 3.5 KiB |
| | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.955 | 0.955 | 0.976 | 0.976 | 0.378 | 0.409 |
gpId | 0.955 | 1.000 | 1.000 | 1.000 | 1.000 | 0.497 | 1.000 |
gpNm | 0.955 | 1.000 | 1.000 | 1.000 | 1.000 | 0.497 | 1.000 |
tblId | 0.976 | 1.000 | 1.000 | 1.000 | 1.000 | 0.623 | 1.000 |
tblNm | 0.976 | 1.000 | 1.000 | 1.000 | 1.000 | 0.623 | 1.000 |
dataType | 0.378 | 0.497 | 0.497 | 0.623 | 0.623 | 1.000 | 0.000 |
colCnt | 0.409 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 |
| | gpNm | dataType | tblNm | gpId | tblId |
---|---|---|---|---|---|
gpNm | 1.000 | 0.259 | 0.979 | 1.000 | 0.979 |
dataType | 0.259 | 1.000 | 0.350 | 0.259 | 0.350 |
tblNm | 0.979 | 0.350 | 1.000 | 0.979 | 1.000 |
gpId | 1.000 | 0.259 | 0.979 | 1.000 | 0.979 |
tblId | 0.979 | 0.350 | 1.000 | 0.979 | 1.000 |
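gpId and gpNm score 1.000 against each other, as do tblId and tblNm, which is what a one-to-one relabelling of categories produces. The report does not name the association measure behind each matrix. As an assumption, the sketch below uses plain (uncorrected) Cramér's V on toy data to show why such a pair reaches 1 while an unrelated column scores 0. The row counts and the `flag` column are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    rows, cols, joint = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k < 1:
        return 0.0
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in rows.items():
        for y, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy data: each group ID maps to exactly one group name.
gp_id = ["_LVER_Summary"] * 3 + ["_LVER_HLTH"] * 3
gp_nm = ["Summary"] * 3 + ["기타건강정보"] * 3
# A hypothetical column distributed identically within both groups.
flag = ["a", "b", "a", "a", "b", "a"]

print(cramers_v(gp_id, gp_nm))  # 1.0
print(cramers_v(gp_id, flag))   # 0.0
```

Because each ID/name pair carries the same information, one column of each pair could be dropped before modelling without losing anything.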
| | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|---|
NUM | 1.000 | -0.214 | 0.786 | 0.786 | 0.819 | 0.819 | 0.243 |
colCnt | -0.214 | 1.000 | 0.979 | 0.979 | 0.958 | 0.958 | 0.000 |
gpId | 0.786 | 0.979 | 1.000 | 1.000 | 0.979 | 0.979 | 0.259 |
gpNm | 0.786 | 0.979 | 1.000 | 1.000 | 0.979 | 0.979 | 0.259 |
tblId | 0.819 | 0.958 | 0.979 | 0.979 | 1.000 | 1.000 | 0.350 |
tblNm | 0.819 | 0.958 | 0.979 | 0.979 | 1.000 | 1.000 | 0.350 |
dataType | 0.243 | 0.000 | 0.259 | 0.259 | 0.350 | 0.350 | 1.000 |
| | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | PT_NM | 성명 | STRING | 성명 | 13049 | <NA> |
1 | 2 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | SEX_CD | 성별코드 | STRING | 성별코드 | 13049 | <NA> |
2 | 3 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | BRTH_YMD | 생년월일 | DATE | 생년월일 | 13049 | <NA> |
3 | 4 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | DIAG_AGE | 진단시나이 | INTEGER | 진단시나이 | 13049 | <NA> |
4 | 5 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | FRMD_YMD | 첫진료일자 | DATE | 첫진료일자 | 13049 | <NA> |
5 | 6 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | DIAG_CD | 첫진단코드 | STRING | 첫진단코드 | 13049 | <NA> |
6 | 7 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | DIAG_ENM | 첫진단한글명 | STRING | 첫진단한글명 | 13049 | <NA> |
7 | 8 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | FMDR_ID | 주치의ID | STRING | 주치의ID | 12899 | <NA> |
8 | 9 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | FMDR_NM | 주치의명 | STRING | 주치의명 | 12888 | <NA> |
9 | 10 | _LVER_Summary | Summary | _LVER_PT_TRGT | 간암 대상자 | OPRT_YMD | 첫수술일자 | DATE | 첫수술일자 | 2102 | <NA> |
| | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
372 | 373 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_CNCR_YN | 과거병력암여부 | STRING | 과거병력암여부 | 2104 | <NA> |
373 | 374 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_DEPR_YN | 과거병력우울증여부 | STRING | 과거병력우울증여부 | 2104 | <NA> |
374 | 375 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_INSM_YN | 과거병력불면증여부 | STRING | 과거병력불면증여부 | 2104 | <NA> |
375 | 376 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_CADZ_YN | 과거병력심장질환여부 | STRING | 과거병력심장질환여부 | 2104 | <NA> |
376 | 377 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_CADZ_CMNT | 과거병력심장질환내용 | STRING | 과거병력심장질환내용 | 39 | <NA> |
377 | 378 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_ETC_YN | 과거병력기타여부 | STRING | 과거병력기타여부 | 2104 | <NA> |
378 | 379 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | PHIS_ETC_CMNT | 과거병력기타내용 | STRING | 과거병력기타내용 | 678 | <NA> |
379 | 380 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | MAIN_SYMP_YN | 주증상유무 | STRING | 주증상유무 | 889 | <NA> |
380 | 381 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | MAIN_SYMP_CMNT | 주증상내용 | STRING | 주증상내용 | 1038 | <NA> |
381 | 382 | _LVER_HLTH | 기타건강정보 | _LVER_MR_HLTH | 간암 환자건강정보 | OUTS_DIAG_TRANS_YN | 타병원진단후전원여부 | STRING | 타병원진단후전원여부 | 2104 | <NA> |