Dataset statistics
Statistic | Value |
---|---|
Number of variables | 11 |
Number of observations | 309 |
Missing cells | 309 |
Missing cells (%) | 9.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 27.6 KiB |
Average record size in memory | 91.4 B |
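The missing-cell percentage follows directly from the counts in this table. The short check below is only a sketch that recomputes it from the reported figures; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 309
missing_cells = 309

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 309 missing cells belong to a single column, dispFormat, which is empty in every row.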
Variable types
Type | Count |
---|---|
Numeric | 2 |
Categorical | 5 |
Text | 3 |
Unsupported | 1 |
Dataset
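Field | Value |
---|---|
Description | Provides metadata for the gastric cancer registry (the data items to be provided, their types, sizes, per-item record counts, sample data, etc.) |
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048687/fileData.do |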
Alerts
Alert | Type |
---|---|
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 3 other fields | High correlation |
gpNm is highly overall correlated with NUM and 3 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
colCnt is highly overall correlated with tblId and 1 other field | High correlation |
dispFormat has 309 (100.0%) missing values | Missing |
NUM has unique values | Unique |
dispFormat is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
colCnt has 113 (36.6%) zeros | Zeros |
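The report does not say which association measure produces the "highly overall correlated" alerts. As an illustration only, the sketch below computes plain (uncorrected) Cramér's V, a common measure for pairs of categorical columns, for gpId and gpNm. The one-to-one pairing of codes and names is an assumption inferred from their identical counts, and `cramers_v` is a hypothetical helper, not a function of the profiling tool.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n
            chi2 += (joint.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Assumption: each gpId code maps to exactly one gpNm label (counts from this report).
pairs = [
    ("_GSTR_OPRT", "수술기록", 110),
    ("_GSTR_HLTH", "기타건강정보", 80),
    ("_GSTR_CEXM", "진단검사", 62),
    ("_GSTR_FLUP", "추적관찰", 37),
    ("_GSTR_Summary", "Patient info", 20),
]
gp_id = [code for code, name, count in pairs for i in range(count)]
gp_nm = [name for code, name, count in pairs for i in range(count)]

v = cramers_v(gp_id, gp_nm)
print(round(v, 3))
```

Under that assumption one column fully determines the other, so the measure reaches its maximum. This is consistent with the 1.000 entries for gpId and gpNm in the correlation tables, and it suggests that one column of each code/name pair is redundant for modelling purposes.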
Reproduction
Field | Value |
---|---|
Analysis started | 2023-12-12 05:20:51.941086 |
Analysis finished | 2023-12-12 05:20:53.299697 |
Duration | 1.36 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
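The reported duration is simply the difference between the two timestamps. A minimal sketch with the standard library, assuming both timestamps share the same clock and time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the "Reproduction" table.
started = datetime.strptime("2023-12-12 05:20:51.941086", FMT)
finished = datetime.strptime("2023-12-12 05:20:53.299697", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```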
NUM
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Statistic | Value |
---|---|
Distinct | 309 |
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 155 |
Minimum | 1 |
Maximum | 309 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.8 KiB |
Quantile statistics
Statistic | Value |
---|---|
Minimum | 1 |
5-th percentile | 16.4 |
Q1 | 78 |
Median | 155 |
Q3 | 232 |
95-th percentile | 293.6 |
Maximum | 309 |
Range | 308 |
Interquartile range (IQR) | 154 |
Descriptive statistics
Statistic | Value |
---|---|
Standard deviation | 89.344838 |
Coefficient of variation (CV) | 0.57641831 |
Kurtosis | -1.2 |
Mean | 155 |
Median Absolute Deviation (MAD) | 77 |
Skewness | 0 |
Sum | 47895 |
Variance | 7982.5 |
Monotonicity | Strictly increasing |
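Because NUM is unique, strictly increasing, and spans 1 to 309 over 309 rows, it must be the plain sequence 1, 2, ..., 309, that is, a row counter. The sketch below rebuilds that sequence and recomputes several of the statistics above with the standard library. It assumes the report uses the sample variance (n - 1 denominator), which matches the reported value.

```python
import statistics

values = list(range(1, 310))             # the integers 1..309

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, median, variance, mad)
print(round(std, 6), round(cv, 8))
```

A column like this carries no information beyond row order, so its high correlation with gpId and tblId most likely reflects that the rows are grouped by table rather than any real relationship.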
Common Values
Value | Count | Frequency (%) |
---|---|---|
1 | 1 | 0.3% |
205 | 1 | 0.3% |
212 | 1 | 0.3% |
211 | 1 | 0.3% |
210 | 1 | 0.3% |
209 | 1 | 0.3% |
208 | 1 | 0.3% |
207 | 1 | 0.3% |
206 | 1 | 0.3% |
204 | 1 | 0.3% |
Other values (299) | 299 | 96.8% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
1 | 1 | 0.3% |
2 | 1 | 0.3% |
3 | 1 | 0.3% |
4 | 1 | 0.3% |
5 | 1 | 0.3% |
6 | 1 | 0.3% |
7 | 1 | 0.3% |
8 | 1 | 0.3% |
9 | 1 | 0.3% |
10 | 1 | 0.3% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
309 | 1 | 0.3% |
308 | 1 | 0.3% |
307 | 1 | 0.3% |
306 | 1 | 0.3% |
305 | 1 | 0.3% |
304 | 1 | 0.3% |
303 | 1 | 0.3% |
302 | 1 | 0.3% |
301 | 1 | 0.3% |
300 | 1 | 0.3% |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 5 |
---|---|
Distinct (%) | 1.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count |
---|---|
_GSTR_OPRT | 110 |
_GSTR_HLTH | 80 |
_GSTR_CEXM | 62 |
_GSTR_FLUP | 37 |
_GSTR_Summary | 20 |
Length
Max length | 13 |
---|---|
Median length | 10 |
Mean length | 10.194175 |
Min length | 10 |
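The length statistics can be recovered from the category counts alone, since every value of gpId is one of five strings. The following is a rough check, using the counts listed under "Common Values" for this variable; the names are illustrative.

```python
import statistics

# Category counts from the gpId "Common Values" table.
counts = {
    "_GSTR_OPRT": 110,
    "_GSTR_HLTH": 80,
    "_GSTR_CEXM": 62,
    "_GSTR_FLUP": 37,
    "_GSTR_Summary": 20,
}

# One length per row: each category's length repeated by its count.
lengths = [len(value) for value, n in counts.items() for i in range(n)]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = sum(lengths) / len(lengths)

print(max_length, median_length, round(mean_length, 6), min_length)
```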
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _GSTR_Summary |
---|---|
2nd row | _GSTR_Summary |
3rd row | _GSTR_Summary |
4th row | _GSTR_Summary |
5th row | _GSTR_Summary |
Common Values
Value | Count | Frequency (%) |
---|---|---|
_GSTR_OPRT | 110 | 35.6% |
_GSTR_HLTH | 80 | 25.9% |
_GSTR_CEXM | 62 | 20.1% |
_GSTR_FLUP | 37 | 12.0% |
_GSTR_Summary | 20 | 6.5% |
Words
Value | Count | Frequency (%) |
---|---|---|
gstr_oprt | 110 | 35.6% |
gstr_hlth | 80 | 25.9% |
gstr_cexm | 62 | 20.1% |
gstr_flup | 37 | 12.0% |
gstr_summary | 20 | 6.5% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 5 |
---|---|
Distinct (%) | 1.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
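Value | Count | English gloss |
---|---|---|
수술기록 | 110 | Surgical records |
기타건강정보 | 80 | Other health information |
진단검사 | 62 | Diagnostic tests |
추적관찰 | 37 | Follow-up |
Patient info | 20 | Patient info |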
Length
Max length | 12 |
---|---|
Median length | 4 |
Mean length | 5.0355987 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Patient info |
---|---|
2nd row | Patient info |
3rd row | Patient info |
4th row | Patient info |
5th row | Patient info |
Common Values
Value | Count | Frequency (%) |
---|---|---|
수술기록 | 110 | 35.6% |
기타건강정보 | 80 | 25.9% |
진단검사 | 62 | 20.1% |
추적관찰 | 37 | 12.0% |
Patient info | 20 | 6.5% |
Words
Value | Count | Frequency (%) |
---|---|---|
수술기록 | 110 | 33.4% |
기타건강정보 | 80 | 24.3% |
진단검사 | 62 | 18.8% |
추적관찰 | 37 | 11.2% |
patient | 20 | 6.1% |
info | 20 | 6.1% |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 21 |
---|---|
Distinct (%) | 6.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count |
---|---|
_GSTR_MR_HLTH | 80 |
_GSTR_PE_OPRT | 37 |
_GSTR_PE_BIOP | 35 |
_GSTR_PE_CEXM | 23 |
_GSTR_PE_RTX | 15 |
Other values (16) | 119 |
Length
Max length | 18 |
---|---|
Median length | 13 |
Mean length | 13.375405 |
Min length | 12 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | _GSTR_PT_TRGT |
---|---|
2nd row | _GSTR_PT_TRGT |
3rd row | _GSTR_PT_TRGT |
4th row | _GSTR_PT_TRGT |
5th row | _GSTR_PT_TRGT |
Common Values
Value | Count | Frequency (%) |
---|---|---|
_GSTR_MR_HLTH | 80 | 25.9% |
_GSTR_PE_OPRT | 37 | 12.0% |
_GSTR_PE_BIOP | 35 | 11.3% |
_GSTR_PE_CEXM | 23 | 7.4% |
_GSTR_PE_RTX | 15 | 4.9% |
_GSTR_PE_SPR | 15 | 4.9% |
_GSTR_PE_CHMO | 12 | 3.9% |
_GSTR_PE_MIEX_FLUP | 11 | 3.6% |
_GSTR_PT_TRGT | 9 | 2.9% |
_GSTR_PE_ESD | 9 | 2.9% |
Other values (11) | 63 | 20.4% |
Words
Value | Count | Frequency (%) |
---|---|---|
gstr_mr_hlth | 80 | 25.9% |
gstr_pe_oprt | 37 | 12.0% |
gstr_pe_biop | 35 | 11.3% |
gstr_pe_cexm | 23 | 7.4% |
gstr_pe_rtx | 15 | 4.9% |
gstr_pe_spr | 15 | 4.9% |
gstr_pe_chmo | 12 | 3.9% |
gstr_pe_miex_flup | 11 | 3.6% |
gstr_pe_esd | 9 | 2.9% |
gstr_pe_sten | 9 | 2.9% |
Other values (11) | 63 | 20.4% |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 21 |
---|---|
Distinct (%) | 6.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
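Value | Count | English gloss |
---|---|---|
위암 환자건강정보 | 80 | Gastric cancer patient health information |
위암 수술기록 | 37 | Gastric cancer surgical records |
위암 조직검사 | 35 | Gastric cancer biopsy |
위암 진단검사 | 23 | Gastric cancer diagnostic tests |
위암 방사선치료 | 15 | Gastric cancer radiotherapy |
Other values (16) | 119 | |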
Length
Max length | 25 |
---|---|
Median length | 15 |
Mean length | 8.6569579 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 위암 대상자 |
---|---|
2nd row | 위암 대상자 |
3rd row | 위암 대상자 |
4th row | 위암 대상자 |
5th row | 위암 대상자 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
위암 환자건강정보 | 80 | 25.9% |
위암 수술기록 | 37 | 12.0% |
위암 조직검사 | 35 | 11.3% |
위암 진단검사 | 23 | 7.4% |
위암 방사선치료 | 15 | 4.9% |
위암 ESD 외과병리결과 | 15 | 4.9% |
위암 항암치료 | 12 | 3.9% |
위암 영상기능검사 | 11 | 3.6% |
위암 대상자 | 9 | 2.9% |
위암 ESD 검사 | 9 | 2.9% |
Other values (11) | 63 | 20.4% |
Words
Value | Count | Frequency (%) |
---|---|---|
위암 | 309 | 46.1% |
환자건강정보 | 80 | 11.9% |
수술기록 | 37 | 5.5% |
조직검사 | 35 | 5.2% |
진단검사 | 29 | 4.3% |
esd | 24 | 3.6% |
방사선치료 | 15 | 2.2% |
외과병리결과 | 15 | 2.2% |
initial | 15 | 2.2% |
항암치료 | 12 | 1.8% |
Other values (15) | 100 | 14.9% |
colId
Text
Distinct | 284 |
---|---|
Distinct (%) | 91.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count | Frequency (%) |
comp_yn | 3 | 1.0% |
oprt_ymd | 3 | 1.0% |
mult_cnt | 2 | 0.6% |
diag_cmnt | 2 | 0.6% |
cexm_nm | 2 | 0.6% |
path_no | 2 | 0.6% |
ct_rslt_cmnt | 2 | 0.6% |
ct_ymd | 2 | 0.6% |
pa_rslt_cmnt | 2 | 0.6% |
pa_ymd | 2 | 0.6% |
Other values (274) | 287 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 565 | |
T | 345 | 9.7% |
M | 309 | 8.7% |
N | 297 | 8.4% |
C | 294 | 8.3% |
S | 203 | 5.7% |
D | 192 | 5.4% |
R | 152 | 4.3% |
E | 129 | 3.6% |
H | 125 | 3.5% |
Other values (23) | 942 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2970 | |
Connector Punctuation | 565 | 15.9% |
Decimal Number | 18 | 0.5% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 345 | |
M | 309 | 10.4% |
N | 297 | 10.0% |
C | 294 | 9.9% |
S | 203 | 6.8% |
D | 192 | 6.5% |
R | 152 | 5.1% |
E | 129 | 4.3% |
H | 125 | 4.2% |
Y | 111 | 3.7% |
Other values (16) | 813 |
Decimal Number
Value | Count | Frequency (%) |
1 | 8 | |
2 | 5 | |
3 | 2 | 11.1% |
5 | 1 | 5.6% |
8 | 1 | 5.6% |
0 | 1 | 5.6% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 565 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2970 | |
Common | 583 | 16.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 345 | |
M | 309 | 10.4% |
N | 297 | 10.0% |
C | 294 | 9.9% |
S | 203 | 6.8% |
D | 192 | 6.5% |
R | 152 | 5.1% |
E | 129 | 4.3% |
H | 125 | 4.2% |
Y | 111 | 3.7% |
Other values (16) | 813 |
Common
Value | Count | Frequency (%) |
_ | 565 | |
1 | 8 | 1.4% |
2 | 5 | 0.9% |
3 | 2 | 0.3% |
5 | 1 | 0.2% |
8 | 1 | 0.2% |
0 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3553 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 565 | |
T | 345 | 9.7% |
M | 309 | 8.7% |
N | 297 | 8.4% |
C | 294 | 8.3% |
S | 203 | 5.7% |
D | 192 | 5.4% |
R | 152 | 4.3% |
E | 129 | 3.6% |
H | 125 | 3.5% |
Other values (23) | 942 |
colNm
Text
Distinct | 289 |
---|---|
Distinct (%) | 93.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count | Frequency (%) |
egd | 9 | 1.8% |
결과 | 8 | 1.6% |
location | 7 | 1.4% |
stage | 7 | 1.4% |
invasion | 6 | 1.2% |
type | 6 | 1.2% |
y:유 | 6 | 1.2% |
n:무 | 6 | 1.2% |
검사일 | 6 | 1.2% |
상세내용 | 6 | 1.2% |
Other values (318) | 446 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 500 | 14.4% |
i | 145 | 4.2% |
e | 137 | 3.9% |
t | 133 | 3.8% |
o | 113 | 3.3% |
n | 102 | 2.9% |
s | 98 | 2.8% |
a | 96 | 2.8% |
( | 72 | 2.1% |
) | 72 | 2.1% |
Other values (194) | 2007 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1287 | |
Other Letter | 1216 | |
Space Separator | 500 | 14.4% |
Uppercase Letter | 265 | 7.6% |
Open Punctuation | 72 | 2.1% |
Close Punctuation | 72 | 2.1% |
Other Punctuation | 37 | 1.1% |
Decimal Number | 13 | 0.4% |
Dash Punctuation | 7 | 0.2% |
Connector Punctuation | 4 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Lowercase Letter
Value | Count | Frequency (%) |
i | 145 | |
e | 137 | |
t | 133 | |
o | 113 | |
n | 102 | 7.9% |
s | 98 | 7.6% |
a | 96 | 7.5% |
r | 67 | 5.2% |
c | 65 | 5.1% |
l | 54 | 4.2% |
Other values (13) | 277 |
Uppercase Letter
Value | Count | Frequency (%) |
D | 30 | |
C | 26 | 9.8% |
E | 23 | 8.7% |
S | 20 | 7.5% |
N | 18 | 6.8% |
I | 18 | 6.8% |
G | 17 | 6.4% |
A | 14 | 5.3% |
M | 13 | 4.9% |
L | 12 | 4.5% |
Other values (11) | 74 |
Decimal Number
Value | Count | Frequency (%) |
1 | 6 | |
2 | 3 | |
3 | 1 | 7.7% |
0 | 1 | 7.7% |
5 | 1 | 7.7% |
8 | 1 | 7.7% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 17 | |
: | 12 | |
. | 6 | 16.2% |
% | 2 | 5.4% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅰ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 500 |
Open Punctuation
Value | Count | Frequency (%) |
( | 72 |
Close Punctuation
Value | Count | Frequency (%) |
) | 72 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 7 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1554 | |
Hangul | 1216 | |
Common | 705 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Latin
Value | Count | Frequency (%) |
i | 145 | 9.3% |
e | 137 | 8.8% |
t | 133 | 8.6% |
o | 113 | 7.3% |
n | 102 | 6.6% |
s | 98 | 6.3% |
a | 96 | 6.2% |
r | 67 | 4.3% |
c | 65 | 4.2% |
l | 54 | 3.5% |
Other values (36) | 544 |
Common
Value | Count | Frequency (%) |
(space) | 500 | 70.9% |
( | 72 | 10.2% |
) | 72 | 10.2% |
/ | 17 | 2.4% |
: | 12 | 1.7% |
- | 7 | 1.0% |
. | 6 | 0.9% |
1 | 6 | 0.9% |
_ | 4 | 0.6% |
2 | 3 | 0.4% |
Other values (5) | 6 | 0.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2257 | |
Hangul | 1216 | |
Number Forms | 2 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 500 | 22.2% |
i | 145 | 6.4% |
e | 137 | 6.1% |
t | 133 | 5.9% |
o | 113 | 5.0% |
n | 102 | 4.5% |
s | 98 | 4.3% |
a | 96 | 4.3% |
( | 72 | 3.2% |
) | 72 | 3.2% |
Other values (49) | 789 |
Hangul
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅰ | 1 |
dataType
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count |
---|---|
STRING | 252 |
DATE | 36 |
INTEGER | 21 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 5.8349515 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | INTEGER |
---|---|
2nd row | STRING |
3rd row | DATE |
4th row | DATE |
5th row | DATE |
Common Values
Value | Count | Frequency (%) |
---|---|---|
STRING | 252 | 81.6% |
DATE | 36 | 11.7% |
INTEGER | 21 | 6.8% |
Words
Value | Count | Frequency (%) |
---|---|---|
string | 252 | 81.6% |
date | 36 | 11.7% |
integer | 21 | 6.8% |
colDesc
Text
Distinct | 289 |
---|---|
Distinct (%) | 93.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 KiB |
Value | Count | Frequency (%) |
egd | 9 | 1.8% |
결과 | 8 | 1.6% |
location | 7 | 1.4% |
stage | 7 | 1.4% |
invasion | 6 | 1.2% |
type | 6 | 1.2% |
y:유 | 6 | 1.2% |
n:무 | 6 | 1.2% |
검사일 | 6 | 1.2% |
상세내용 | 6 | 1.2% |
Other values (318) | 446 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1224 | 29.1% |
i | 145 | 3.5% |
e | 137 | 3.3% |
t | 133 | 3.2% |
o | 113 | 2.7% |
n | 102 | 2.4% |
s | 98 | 2.3% |
a | 96 | 2.3% |
( | 72 | 1.7% |
) | 72 | 1.7% |
Other values (194) | 2007 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1287 | |
Space Separator | 1224 | |
Other Letter | 1216 | |
Uppercase Letter | 265 | 6.3% |
Open Punctuation | 72 | 1.7% |
Close Punctuation | 72 | 1.7% |
Other Punctuation | 37 | 0.9% |
Decimal Number | 13 | 0.3% |
Dash Punctuation | 7 | 0.2% |
Connector Punctuation | 4 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Lowercase Letter
Value | Count | Frequency (%) |
i | 145 | |
e | 137 | |
t | 133 | |
o | 113 | |
n | 102 | 7.9% |
s | 98 | 7.6% |
a | 96 | 7.5% |
r | 67 | 5.2% |
c | 65 | 5.1% |
l | 54 | 4.2% |
Other values (13) | 277 |
Uppercase Letter
Value | Count | Frequency (%) |
D | 30 | |
C | 26 | 9.8% |
E | 23 | 8.7% |
S | 20 | 7.5% |
N | 18 | 6.8% |
I | 18 | 6.8% |
G | 17 | 6.4% |
A | 14 | 5.3% |
M | 13 | 4.9% |
L | 12 | 4.5% |
Other values (11) | 74 |
Decimal Number
Value | Count | Frequency (%) |
1 | 6 | |
2 | 3 | |
3 | 1 | 7.7% |
0 | 1 | 7.7% |
5 | 1 | 7.7% |
8 | 1 | 7.7% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 17 | |
: | 12 | |
. | 6 | 16.2% |
% | 2 | 5.4% |
Letter Number
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅰ | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1224 |
Open Punctuation
Value | Count | Frequency (%) |
( | 72 |
Close Punctuation
Value | Count | Frequency (%) |
) | 72 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 7 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 4 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1554 | |
Common | 1429 | |
Hangul | 1216 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Latin
Value | Count | Frequency (%) |
i | 145 | 9.3% |
e | 137 | 8.8% |
t | 133 | 8.6% |
o | 113 | 7.3% |
n | 102 | 6.6% |
s | 98 | 6.3% |
a | 96 | 6.2% |
r | 67 | 4.3% |
c | 65 | 4.2% |
l | 54 | 3.5% |
Other values (36) | 544 |
Common
Value | Count | Frequency (%) |
(space) | 1224 | 85.7% |
( | 72 | 5.0% |
) | 72 | 5.0% |
/ | 17 | 1.2% |
: | 12 | 0.8% |
- | 7 | 0.5% |
. | 6 | 0.4% |
1 | 6 | 0.4% |
_ | 4 | 0.3% |
2 | 3 | 0.2% |
Other values (5) | 6 | 0.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2981 | |
Hangul | 1216 | |
Number Forms | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1224 | 41.1% |
i | 145 | 4.9% |
e | 137 | 4.6% |
t | 133 | 4.5% |
o | 113 | 3.8% |
n | 102 | 3.4% |
s | 98 | 3.3% |
a | 96 | 3.2% |
( | 72 | 2.4% |
) | 72 | 2.4% |
Other values (49) | 789 |
Hangul
Value | Count | Frequency (%) |
병 | 69 | 5.7% |
력 | 58 | 4.8% |
부 | 51 | 4.2% |
가 | 44 | 3.6% |
족 | 42 | 3.5% |
여 | 39 | 3.2% |
일 | 37 | 3.0% |
사 | 29 | 2.4% |
과 | 29 | 2.4% |
자 | 28 | 2.3% |
Other values (133) | 790 |
Number Forms
Value | Count | Frequency (%) |
Ⅱ | 1 | |
Ⅰ | 1 |
colCnt
Real number (ℝ)
HIGH CORRELATION, ZEROS
Statistic | Value |
---|---|
Distinct | 97 |
Distinct (%) | 31.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 37756.388 |
Minimum | 0 |
Maximum | 908669 |
Zeros | 113 |
Zeros (%) | 36.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.8 KiB |
Quantile statistics
Statistic | Value |
---|---|
Minimum | 0 |
5-th percentile | 0 |
Q1 | 0 |
Median | 1499 |
Q3 | 8697 |
95-th percentile | 214531.2 |
Maximum | 908669 |
Range | 908669 |
Interquartile range (IQR) | 8697 |
Descriptive statistics
Statistic | Value |
---|---|
Standard deviation | 136944.26 |
Coefficient of variation (CV) | 3.6270487 |
Kurtosis | 28.133127 |
Mean | 37756.388 |
Median Absolute Deviation (MAD) | 1499 |
Skewness | 5.2092516 |
Sum | 11666724 |
Variance | 1.875373 × 10^10 |
Monotonicity | Not monotonic |
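Several of the colCnt figures are tied together by simple identities: the mean is the sum divided by the number of rows, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks those relations using only the reported values; it does not recompute anything from the raw data.

```python
# Reported values for colCnt.
n_rows = 309
total = 11_666_724
std = 136_944.26

mean = total / n_rows    # arithmetic mean
cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the squared standard deviation

print(f"mean={mean:.3f} cv={cv:.4f} variance={variance:.4e}")
```

A coefficient of variation above 3, together with a median (1499) far below the mean, shows how strongly a few very large counts dominate this column.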
Common Values
Value | Count | Frequency (%) |
---|---|---|
0 | 113 | 36.6% |
8697 | 42 | 13.6% |
1148 | 11 | 3.6% |
23174 | 7 | 2.3% |
3196 | 7 | 2.3% |
908669 | 5 | 1.6% |
68620 | 4 | 1.3% |
7627 | 4 | 1.3% |
90737 | 4 | 1.3% |
226585 | 3 | 1.0% |
Other values (87) | 109 | 35.3% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 113 | 36.6% |
7 | 1 | 0.3% |
41 | 1 | 0.3% |
42 | 1 | 0.3% |
103 | 1 | 0.3% |
177 | 1 | 0.3% |
259 | 1 | 0.3% |
432 | 1 | 0.3% |
530 | 1 | 0.3% |
588 | 1 | 0.3% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
908669 | 5 | 1.6% |
607968 | 3 | 1.0% |
607933 | 1 | 0.3% |
226585 | 3 | 1.0% |
217029 | 3 | 1.0% |
216976 | 1 | 0.3% |
210864 | 3 | 1.0% |
111067 | 2 | 0.6% |
100622 | 2 | 0.6% |
90737 | 4 | 1.3% |
dispFormat
Unsupported
MISSING, REJECTED, UNSUPPORTED
Statistic | Value |
---|---|
Missing | 309 |
Missing (%) | 100.0% |
Memory size | 2.8 KiB |
Correlations
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.987 | 0.987 | 0.967 | 0.967 | 0.456 | 0.726 |
gpId | 0.987 | 1.000 | 1.000 | 1.000 | 1.000 | 0.343 | 0.621 |
gpNm | 0.987 | 1.000 | 1.000 | 1.000 | 1.000 | 0.343 | 0.621 |
tblId | 0.967 | 1.000 | 1.000 | 1.000 | 1.000 | 0.649 | 0.883 |
tblNm | 0.967 | 1.000 | 1.000 | 1.000 | 1.000 | 0.649 | 0.883 |
dataType | 0.456 | 0.343 | 0.343 | 0.649 | 0.649 | 1.000 | 0.114 |
colCnt | 0.726 | 0.621 | 0.621 | 0.883 | 0.883 | 0.114 | 1.000 |
Variable | tblNm | gpId | dataType | gpNm | tblId |
---|---|---|---|---|---|
tblNm | 1.000 | 0.973 | 0.374 | 0.973 | 1.000 |
gpId | 0.973 | 1.000 | 0.275 | 1.000 | 0.973 |
dataType | 0.374 | 0.275 | 1.000 | 0.275 | 0.374 |
gpNm | 0.973 | 1.000 | 0.275 | 1.000 | 0.973 |
tblId | 1.000 | 0.973 | 0.374 | 0.973 | 1.000 |
Variable | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType |
---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.080 | 0.833 | 0.833 | 0.804 | 0.804 | 0.304 |
colCnt | 0.080 | 1.000 | 0.274 | 0.274 | 0.661 | 0.661 | 0.085 |
gpId | 0.833 | 0.274 | 1.000 | 1.000 | 0.973 | 0.973 | 0.275 |
gpNm | 0.833 | 0.274 | 1.000 | 1.000 | 0.973 | 0.973 | 0.275 |
tblId | 0.804 | 0.661 | 0.973 | 0.973 | 1.000 | 1.000 | 0.374 |
tblNm | 0.804 | 0.661 | 0.973 | 0.973 | 1.000 | 1.000 | 0.374 |
dataType | 0.304 | 0.085 | 0.275 | 0.275 | 0.374 | 0.374 | 1.000 |
First rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | DIAG_AGE | 진단시나이 | INTEGER | 진단시나이 | 20886 | <NA> |
1 | 2 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | SEX_CD | 성별코드 | STRING | 성별코드 | 20886 | <NA> |
2 | 3 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | FRMD_YMD | 첫진료일자 | DATE | 첫진료일자 | 19927 | <NA> |
3 | 4 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | DRTR_YMD | 약물치료시작일 | DATE | 약물치료시작일 | 16915 | <NA> |
4 | 5 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | OPRT_YMD | 수술일자 | DATE | 수술일자 | 7624 | <NA> |
5 | 6 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | OPRT_CD | 수술코드 | STRING | 수술코드 | 7627 | <NA> |
6 | 7 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | OPRT_NM | 수술명 | STRING | 수술명 | 7627 | <NA> |
7 | 8 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | OPDR_ID | 집도의ID | STRING | 집도의ID | 7627 | <NA> |
8 | 9 | _GSTR_Summary | Patient info | _GSTR_PT_TRGT | 위암 대상자 | OPDR_NM | 집도의명 | STRING | 집도의명 | 7627 | <NA> |
9 | 10 | _GSTR_Summary | Patient info | _GSTR_RG_CNDX | 위암 진단정보 | INIT_DIAG_YMD | 위암 최초 진단일 | DATE | 위암 최초 진단일 | 9669 | <NA> |
Last rows
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
299 | 300 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_INSM_YN | 과거병력불면증여부 | STRING | 과거병력불면증여부 | 8697 | <NA> |
300 | 301 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_CADZ_YN | 과거병력심장질환여부 | STRING | 과거병력심장질환여부 | 8697 | <NA> |
301 | 302 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_ETC_YN | 과거병력기타여부 | STRING | 과거병력기타여부 | 8697 | <NA> |
302 | 303 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_HTN_CMNT | 과거병력고혈압내용 | STRING | 과거병력고혈압내용 | 1419 | <NA> |
303 | 304 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_DM_CMNT | 과거병력당뇨내용 | STRING | 과거병력당뇨내용 | 653 | <NA> |
304 | 305 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_CADZ_CMNT | 과거병력심장질환내용 | STRING | 과거병력심장질환내용 | 177 | <NA> |
305 | 306 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | PHIS_ETC_CMNT | 과거병력기타내용 | STRING | 과거병력기타내용 | 3376 | <NA> |
306 | 307 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | MAIN_SYMP_YN | 주증상유무 | STRING | 주증상유무 | 8697 | <NA> |
307 | 308 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | MAIN_SYMP_CMNT | 주증상내용 | STRING | 주증상내용 | 5500 | <NA> |
308 | 309 | _GSTR_HLTH | 기타건강정보 | _GSTR_MR_HLTH | 위암 환자건강정보 | OUTS_DIAG_TRANS_YN | 타병원진단후전원여부 | STRING | 타병원진단후전원여부 | 0 | <NA> |