Dataset statistics
Number of variables | 11 |
---|---|
Number of observations | 253 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 22.4 KiB |
Average record size in memory | 90.5 B |
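The two memory figures above are linked by the row count: total size is roughly the average record size times the number of observations. The sketch below checks this using only the rounded values shown in the table, so it can agree only to the displayed precision.

```python
# Values as reported in the dataset statistics (both are rounded in the report).
n_observations = 253
avg_record_bytes = 90.5

# Total size implied by the average record size, converted from bytes to KiB.
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024

print(f"{total_bytes:.1f} B = {total_kib:.1f} KiB")
```

The result rounds to the 22.4 KiB reported as total size in memory.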
Variable types
Numeric | 2 |
---|---|
Categorical | 6 |
Text | 3 |
Dataset
Description | Provides metadata for the lung cancer registry (data items to be provided, data types, sizes, record counts per item, sample data, etc.) |
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15048700/fileData.do |
tblNm is highly overall correlated with NUM and 4 other fields | High correlation |
gpId is highly overall correlated with NUM and 4 other fields | High correlation |
gpNm is highly overall correlated with NUM and 4 other fields | High correlation |
tblId is highly overall correlated with NUM and 4 other fields | High correlation |
NUM is highly overall correlated with gpId and 3 other fields | High correlation |
colCnt is highly overall correlated with gpId and 3 other fields | High correlation |
dataType is highly overall correlated with dispFormat | High correlation |
dispFormat is highly overall correlated with dataType | High correlation |
NUM has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 23:00:44.278263 |
---|---|
Analysis finished | 2023-12-12 23:00:45.905236 |
Duration | 1.63 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
NUM
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 253 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 127 |
Minimum | 1 |
---|---|
Maximum | 253 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 13.6 |
Q1 | 64 |
median | 127 |
Q3 | 190 |
95-th percentile | 240.4 |
Maximum | 253 |
Range | 252 |
Interquartile range (IQR) | 126 |
Descriptive statistics
Standard deviation | 73.179004 |
---|---|
Coefficient of variation (CV) | 0.57621263 |
Kurtosis | -1.2 |
Mean | 127 |
Median Absolute Deviation (MAD) | 63 |
Skewness | 0 |
Sum | 32131 |
Variance | 5355.1667 |
Monotonicity | Strictly increasing |
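The NUM statistics above are what one would expect from a plain row counter. The sketch below recomputes them, assuming NUM is exactly the integer sequence 1..253 (an assumption, though it is consistent with the reported minimum, maximum, distinct count and sum). The `percentile` helper is a hypothetical name for ordinary linear interpolation between order statistics.

```python
import math
import statistics

# Assumption: NUM is the row counter 1, 2, ..., 253.
values = list(range(1, 254))

def percentile(sorted_values, p):
    """Linear-interpolation percentile for p in [0, 1]."""
    pos = p * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation
mad = statistics.median(abs(v - mean) for v in values)  # median absolute deviation

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
iqr = q3 - q1

print(total, mean, round(variance, 4), round(std_dev, 6), round(cv, 8), mad)
print(round(p05, 1), q1, q3, iqr)
```

The reported variance of 5355.1667 matches the sample (n - 1) formula, not the population formula.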
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
175 | 1 | 0.4% |
162 | 1 | 0.4% |
163 | 1 | 0.4% |
164 | 1 | 0.4% |
165 | 1 | 0.4% |
166 | 1 | 0.4% |
167 | 1 | 0.4% |
168 | 1 | 0.4% |
169 | 1 | 0.4% |
Other values (243) | 243 |
Value | Count | Frequency (%) |
1 | 1 | |
2 | 1 | |
3 | 1 | |
4 | 1 | |
5 | 1 | |
6 | 1 | |
7 | 1 | |
8 | 1 | |
9 | 1 | |
10 | 1 |
Value | Count | Frequency (%) |
253 | 1 | |
252 | 1 | |
251 | 1 | |
250 | 1 | |
249 | 1 | |
248 | 1 | |
247 | 1 | |
246 | 1 | |
245 | 1 | |
244 | 1 |
gpId
Categorical
HIGH CORRELATION
 
Distinct | 15 |
---|---|
Distinct (%) | 5.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
LUNG_HLTH | |
---|---|
LUNG_SPR | |
LUNG_CHMO_FLST | |
LUNG_BRON | |
LUNG_OPRT | |
Other values (10) |
Length
Max length | 14 |
---|---|
Median length | 9 |
Mean length | 10.079051 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | LUNG_TRGT |
---|---|
2nd row | LUNG_TRGT |
3rd row | LUNG_TRGT |
4th row | LUNG_TRGT |
5th row | LUNG_TRGT |
Common Values
Value | Count | Frequency (%) |
LUNG_HLTH | 74 | |
LUNG_SPR | 32 | |
LUNG_CHMO_FLST | 26 | 10.3% |
LUNG_BRON | 18 | 7.1% |
LUNG_OPRT | 17 | 6.7% |
LUNG_CHMO | 14 | 5.5% |
LUNG_TRGT | 13 | 5.1% |
LUNG_CNDX_BDMS | 12 | 4.7% |
LUNG_EVAL_DEAD | 10 | 4.0% |
LUNG_RTX | 9 | 3.6% |
Other values (5) | 28 | 11.1% |
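The Frequency (%) column is the count divided by the number of observations. The sketch below recomputes it from the counts in the table above. The frequency cells for the two largest groups are blank in this extract, so the values it prints for them (about 29.2% and 12.6%) are derived here and do not come from the report.

```python
# Counts copied from the gpId "Common Values" table above.
counts = {
    "LUNG_HLTH": 74,
    "LUNG_SPR": 32,
    "LUNG_CHMO_FLST": 26,
    "LUNG_BRON": 18,
    "LUNG_OPRT": 17,
    "LUNG_CHMO": 14,
    "LUNG_TRGT": 13,
    "LUNG_CNDX_BDMS": 12,
    "LUNG_EVAL_DEAD": 10,
    "LUNG_RTX": 9,
    "Other values (5)": 28,
}

# The counts should add up to the 253 rows in the dataset.
n_observations = sum(counts.values())

# Frequency (%) = 100 * count / number of observations, rounded to one decimal.
frequency_pct = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

for value, pct in frequency_pct.items():
    print(f"{value:<18} {counts[value]:>3} {pct:>5.1f}%")
```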
Length
Value | Count | Frequency (%) |
lung_hlth | 74 | |
lung_spr | 32 | |
lung_chmo_flst | 26 | 10.3% |
lung_bron | 18 | 7.1% |
lung_oprt | 17 | 6.7% |
lung_chmo | 14 | 5.5% |
lung_trgt | 13 | 5.1% |
lung_cndx_bdms | 12 | 4.7% |
lung_eval_dead | 10 | 4.0% |
lung_rtx | 9 | 3.6% |
Other values (5) | 28 | 11.1% |
gpNm
Categorical
HIGH CORRELATION
 
Distinct | 15 |
---|---|
Distinct (%) | 5.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
환자건강정보 | |
---|---|
외과병리 | |
항암 FlowSheet | |
기관지내시경검사 | |
수술 | |
Other values (10) |
Length
Max length | 12 |
---|---|
Median length | 11 |
Mean length | 6.8853755 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Summary |
---|---|
2nd row | Summary |
3rd row | Summary |
4th row | Summary |
5th row | Summary |
Common Values
Value | Count | Frequency (%) |
환자건강정보 | 74 | |
외과병리 | 32 | |
항암 FlowSheet | 26 | 10.3% |
기관지내시경검사 | 18 | 7.1% |
수술 | 17 | 6.7% |
항암치료 | 14 | 5.5% |
Summary | 13 | 5.1% |
진단 및 신체 | 12 | 4.7% |
치료평가 및 사망정보 | 10 | 4.0% |
방사선 치료 | 9 | 3.6% |
Other values (5) | 28 | 11.1% |
Length
Value | Count | Frequency (%) |
환자건강정보 | 74 | |
외과병리 | 32 | 9.1% |
항암 | 26 | 7.4% |
flowsheet | 26 | 7.4% |
및 | 22 | 6.3% |
initial | 19 | 5.4% |
기관지내시경검사 | 18 | 5.1% |
수술 | 17 | 4.8% |
항암치료 | 14 | 4.0% |
summary | 13 | 3.7% |
Other values (11) | 90 |
tblId
Categorical
HIGH CORRELATION
 
Distinct | 29 |
---|---|
Distinct (%) | 11.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
LUNG_PE_SPR | |
---|---|
LUNG_PE_CHMO_FLST | |
LUNG_PE_BRON | |
LUNG_MR_HLTH_10 | 14 |
LUNG_PT_TRGT | 13 |
Other values (24) |
Length
Max length | 17 |
---|---|
Median length | 15 |
Mean length | 13.549407 |
Min length | 11 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | LUNG_PT_TRGT |
---|---|
2nd row | LUNG_PT_TRGT |
3rd row | LUNG_PT_TRGT |
4th row | LUNG_PT_TRGT |
5th row | LUNG_PT_TRGT |
Common Values
Value | Count | Frequency (%) |
LUNG_PE_SPR | 32 | 12.6% |
LUNG_PE_CHMO_FLST | 26 | 10.3% |
LUNG_PE_BRON | 18 | 7.1% |
LUNG_MR_HLTH_10 | 14 | 5.5% |
LUNG_PT_TRGT | 13 | 5.1% |
LUNG_PE_RTX | 9 | 3.6% |
LUNG_PE_CHMO | 9 | 3.6% |
LUNG_MR_HLTH_8 | 9 | 3.6% |
LUNG_MR_HLTH_7 | 9 | 3.6% |
LUNG_PE_OPRT | 9 | 3.6% |
Other values (19) | 105 |
Length
Value | Count | Frequency (%) |
lung_pe_spr | 32 | 12.6% |
lung_pe_chmo_flst | 26 | 10.3% |
lung_pe_bron | 18 | 7.1% |
lung_mr_hlth_10 | 14 | 5.5% |
lung_pt_trgt | 13 | 5.1% |
lung_pe_rtx | 9 | 3.6% |
lung_pe_chmo | 9 | 3.6% |
lung_mr_hlth_8 | 9 | 3.6% |
lung_mr_hlth_7 | 9 | 3.6% |
lung_pe_oprt | 9 | 3.6% |
Other values (19) | 105 |
tblNm
Categorical
HIGH CORRELATION
 
Distinct | 29 |
---|---|
Distinct (%) | 11.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
수술 후 결과 | |
---|---|
Flow Sheet | |
기관지내시경검사 결과 | |
과거력 | 14 |
기본정보 | 13 |
Other values (24) |
Length
Max length | 13 |
---|---|
Median length | 10 |
Mean length | 7.0474308 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 기본정보 |
---|---|
2nd row | 기본정보 |
3rd row | 기본정보 |
4th row | 기본정보 |
5th row | 기본정보 |
Common Values
Value | Count | Frequency (%) |
수술 후 결과 | 32 | 12.6% |
Flow Sheet | 26 | 10.3% |
기관지내시경검사 결과 | 18 | 7.1% |
과거력 | 14 | 5.5% |
기본정보 | 13 | 5.1% |
방사선치료정보 | 9 | 3.6% |
항암치료정보 | 9 | 3.6% |
가족력(형제/자매) | 9 | 3.6% |
가족력(자녀) | 9 | 3.6% |
수술정보 | 9 | 3.6% |
Other values (19) | 105 |
Length
Value | Count | Frequency (%) |
결과 | 67 | |
수술 | 32 | 7.8% |
후 | 32 | 7.8% |
flow | 26 | 6.4% |
sheet | 26 | 6.4% |
initial | 19 | 4.6% |
기관지내시경검사 | 18 | 4.4% |
과거력 | 14 | 3.4% |
기본정보 | 13 | 3.2% |
방사선치료정보 | 9 | 2.2% |
Other values (26) | 153 |
colId
Text
Distinct | 221 |
---|---|
Distinct (%) | 87.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Length
Max length | 22 |
---|---|
Median length | 17 |
Mean length | 12.027668 |
Min length | 5 |
Characters and Unicode
Total characters | 3043 |
---|---|
Distinct characters | 30 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 207 |
---|---|
Unique (%) | 81.8% |
Sample
1st row | PT_SBST_NO |
---|---|
2nd row | SEX_CD |
3rd row | BRTH_YMD |
4th row | FRST_DIAG_CD |
5th row | FRST_DIAG_YMD |
Value | Count | Frequency (%) |
pt_sbst_no | 19 | 7.5% |
chmo_strt_ymd | 3 | 1.2% |
dead_ymd | 2 | 0.8% |
cexm_clsf_nm | 2 | 0.8% |
cexm_ymd | 2 | 0.8% |
oprt_ymd | 2 | 0.8% |
cexm_rslt_cmnt | 2 | 0.8% |
cexm_nm | 2 | 0.8% |
chmo_prps_nm | 2 | 0.8% |
rtx_strt_ymd | 2 | 0.8% |
Other values (211) | 215 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 505 | |
T | 297 | 9.8% |
N | 268 | 8.8% |
M | 258 | 8.5% |
C | 196 | 6.4% |
S | 192 | 6.3% |
D | 149 | 4.9% |
R | 123 | 4.0% |
H | 121 | 4.0% |
Y | 99 | 3.3% |
Other values (20) | 835 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 2535 | |
Connector Punctuation | 505 | 16.6% |
Decimal Number | 3 | 0.1% |
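The category names in this table correspond to Unicode general categories. As a rough illustration of how such a breakdown can be produced, the sketch below classifies the characters of the five sample colId values shown earlier, using Python's standard `unicodedata` module. It is not the profiler's own implementation, and the `CATEGORY_NAMES` mapping is a hypothetical helper covering only the three categories that appear here.

```python
import unicodedata
from collections import Counter

# The five sample colId values shown in the "Sample" table above.
sample_col_ids = ["PT_SBST_NO", "SEX_CD", "BRTH_YMD", "FRST_DIAG_CD", "FRST_DIAG_YMD"]

# Two-letter Unicode general categories mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Pc": "Connector Punctuation",
    "Nd": "Decimal Number",
}

# Count every character by its general category, falling back to the raw code.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for col_id in sample_col_ids
    for ch in col_id
)

for name, count in category_counts.most_common():
    print(f"{name}: {count}")
```

On these five identifiers only uppercase letters and underscores occur, in line with the dominance of those two categories across the full column.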
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 297 | |
N | 268 | 10.6% |
M | 258 | 10.2% |
C | 196 | 7.7% |
S | 192 | 7.6% |
D | 149 | 5.9% |
R | 123 | 4.9% |
H | 121 | 4.8% |
Y | 99 | 3.9% |
E | 97 | 3.8% |
Other values (16) | 735 |
Decimal Number
Value | Count | Frequency (%) |
3 | 1 | |
1 | 1 | |
2 | 1 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 505 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2535 | |
Common | 508 | 16.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 297 | |
N | 268 | 10.6% |
M | 258 | 10.2% |
C | 196 | 7.7% |
S | 192 | 7.6% |
D | 149 | 5.9% |
R | 123 | 4.9% |
H | 121 | 4.8% |
Y | 99 | 3.9% |
E | 97 | 3.8% |
Other values (16) | 735 |
Common
Value | Count | Frequency (%) |
_ | 505 | |
3 | 1 | 0.2% |
1 | 1 | 0.2% |
2 | 1 | 0.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3043 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 505 | |
T | 297 | 9.8% |
N | 268 | 8.8% |
M | 258 | 8.5% |
C | 196 | 6.4% |
S | 192 | 6.3% |
D | 149 | 4.9% |
R | 123 | 4.0% |
H | 121 | 4.0% |
Y | 99 | 3.3% |
Other values (20) | 835 |
colNm
Text
Distinct | 205 |
---|---|
Distinct (%) | 81.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Value | Count | Frequency (%) |
내용 | 49 | 10.4% |
환자대체번호 | 19 | 4.0% |
검사 | 14 | 3.0% |
명 | 9 | 1.9% |
기타 | 7 | 1.5% |
to | 7 | 1.5% |
상세 | 7 | 1.5% |
결과 | 7 | 1.5% |
검사일 | 6 | 1.3% |
n:무 | 6 | 1.3% |
Other values (225) | 342 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 220 | 9.0% |
a | 74 | 3.0% |
내 | 62 | 2.5% |
i | 59 | 2.4% |
용 | 58 | 2.4% |
) | 58 | 2.4% |
( | 58 | 2.4% |
부 | 55 | 2.2% |
병 | 54 | 2.2% |
력 | 53 | 2.2% |
Other values (177) | 1697 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1261 | |
Lowercase Letter | 598 | |
Space Separator | 220 | 9.0% |
Uppercase Letter | 217 | 8.9% |
Close Punctuation | 58 | 2.4% |
Open Punctuation | 58 | 2.4% |
Other Punctuation | 28 | 1.1% |
Decimal Number | 4 | 0.2% |
Dash Punctuation | 3 | 0.1% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
내 | 62 | 4.9% |
용 | 58 | 4.6% |
부 | 55 | 4.4% |
병 | 54 | 4.3% |
력 | 53 | 4.2% |
가 | 38 | 3.0% |
족 | 38 | 3.0% |
여 | 38 | 3.0% |
자 | 37 | 2.9% |
사 | 37 | 2.9% |
Other values (120) | 791 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 74 | |
i | 59 | |
e | 53 | |
n | 51 | |
o | 49 | 8.2% |
t | 48 | 8.0% |
s | 40 | 6.7% |
c | 35 | 5.9% |
l | 34 | 5.7% |
r | 32 | 5.4% |
Other values (14) | 123 |
Uppercase Letter
Value | Count | Frequency (%) |
I | 22 | 10.1% |
N | 21 | 9.7% |
C | 18 | 8.3% |
T | 17 | 7.8% |
A | 16 | 7.4% |
S | 15 | 6.9% |
B | 15 | 6.9% |
P | 13 | 6.0% |
E | 10 | 4.6% |
L | 10 | 4.6% |
Other values (12) | 60 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 14 | |
: | 12 | |
. | 2 | 7.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 2 | |
3 | 1 | |
2 | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 220 |
Close Punctuation
Value | Count | Frequency (%) |
) | 58 |
Open Punctuation
Value | Count | Frequency (%) |
( | 58 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1261 | |
Latin | 815 | |
Common | 372 | 15.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
내 | 62 | 4.9% |
용 | 58 | 4.6% |
부 | 55 | 4.4% |
병 | 54 | 4.3% |
력 | 53 | 4.2% |
가 | 38 | 3.0% |
족 | 38 | 3.0% |
여 | 38 | 3.0% |
자 | 37 | 2.9% |
사 | 37 | 2.9% |
Other values (120) | 791 |
Latin
Value | Count | Frequency (%) |
a | 74 | 9.1% |
i | 59 | 7.2% |
e | 53 | 6.5% |
n | 51 | 6.3% |
o | 49 | 6.0% |
t | 48 | 5.9% |
s | 40 | 4.9% |
c | 35 | 4.3% |
l | 34 | 4.2% |
r | 32 | 3.9% |
Other values (36) | 340 |
Common
Value | Count | Frequency (%) |
(space) | 220 | |
) | 58 | 15.6% |
( | 58 | 15.6% |
/ | 14 | 3.8% |
: | 12 | 3.2% |
- | 3 | 0.8% |
1 | 2 | 0.5% |
. | 2 | 0.5% |
3 | 1 | 0.3% |
2 | 1 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1261 | |
ASCII | 1187 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 220 | |
a | 74 | 6.2% |
i | 59 | 5.0% |
) | 58 | 4.9% |
( | 58 | 4.9% |
e | 53 | 4.5% |
n | 51 | 4.3% |
o | 49 | 4.1% |
t | 48 | 4.0% |
s | 40 | 3.4% |
Other values (47) | 477 |
Hangul
Value | Count | Frequency (%) |
내 | 62 | 4.9% |
용 | 58 | 4.6% |
부 | 55 | 4.4% |
병 | 54 | 4.3% |
력 | 53 | 4.2% |
가 | 38 | 3.0% |
족 | 38 | 3.0% |
여 | 38 | 3.0% |
자 | 37 | 2.9% |
사 | 37 | 2.9% |
Other values (120) | 791 |
dataType
Categorical
HIGH CORRELATION
 
Distinct | 28 |
---|---|
Distinct (%) | 11.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
String(1) | |
---|---|
DATE | |
String(10) | |
String(100) | |
String(200) | |
Other values (23) |
Length
Max length | 13 |
---|---|
Median length | 12 |
Mean length | 9.4624506 |
Min length | 4 |
Unique
Unique | 8 |
---|---|
Unique (%) | 3.2% |
Sample
1st row | String(10) |
---|---|
2nd row | String(code) |
3rd row | DATE |
4th row | String(code) |
5th row | DATE |
Common Values
Value | Count | Frequency (%) |
String(1) | 49 | |
DATE | 34 | |
String(10) | 28 | |
String(100) | 26 | |
String(200) | 20 | |
String(50) | 16 | 6.3% |
String(400) | 13 | 5.1% |
String(20) | 12 | 4.7% |
Integer(code) | 9 | 3.6% |
Integer(4) | 6 | 2.4% |
Other values (18) | 40 |
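The dataType values combine a base type with an optional argument in parentheses, either a length such as `String(10)` or a marker such as `String(code)`. The sketch below shows one way to split them apart for downstream use. The function name `parse_data_type` and the regular expression are illustrative assumptions and are not part of the registry's specification.

```python
import re

# Matches "String(10)" or "Integer(code)"; bare names such as "DATE" have no parentheses.
DATA_TYPE_PATTERN = re.compile(r"^(?P<base>[A-Za-z]+)(?:\((?P<arg>[^)]*)\))?$")

def parse_data_type(declaration):
    """Split a dataType string into (base type, argument or None)."""
    match = DATA_TYPE_PATTERN.match(declaration.strip())
    if match is None:
        raise ValueError(f"unrecognised dataType: {declaration!r}")
    return match.group("base"), match.group("arg")

# Examples taken from the values listed in this section.
for declaration in ["String(1)", "DATE", "String(code)", "Integer(4)"]:
    print(declaration, "->", parse_data_type(declaration))
```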
Length
Value | Count | Frequency (%) |
string(1 | 49 | |
date | 34 | |
string(10 | 28 | |
string(100 | 26 | |
string(200 | 20 | |
string(50 | 16 | 6.3% |
string(400 | 13 | 5.1% |
string(20 | 12 | 4.7% |
integer(code | 9 | 3.6% |
integer(4 | 6 | 2.4% |
Other values (18) | 40 |
colDesc
Text
Distinct | 222 |
---|---|
Distinct (%) | 87.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
Value | Count | Frequency (%) |
환자대체번호 | 19 | 4.3% |
내용 | 11 | 2.5% |
상세 | 10 | 2.3% |
to | 7 | 1.6% |
병리 | 7 | 1.6% |
항암화학요법치료 | 6 | 1.4% |
기타 | 6 | 1.4% |
명칭 | 6 | 1.4% |
stage | 5 | 1.1% |
일 | 5 | 1.1% |
Other values (262) | 357 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 186 | 7.4% |
a | 74 | 2.9% |
i | 71 | 2.8% |
병 | 63 | 2.5% |
) | 59 | 2.3% |
( | 59 | 2.3% |
e | 54 | 2.1% |
o | 53 | 2.1% |
부 | 53 | 2.1% |
n | 53 | 2.1% |
Other values (189) | 1803 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1346 | |
Lowercase Letter | 628 | |
Uppercase Letter | 217 | 8.6% |
Space Separator | 186 | 7.4% |
Close Punctuation | 59 | 2.3% |
Open Punctuation | 59 | 2.3% |
Other Punctuation | 25 | 1.0% |
Decimal Number | 4 | 0.2% |
Dash Punctuation | 3 | 0.1% |
Connector Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
병 | 63 | 4.7% |
부 | 53 | 3.9% |
력 | 53 | 3.9% |
자 | 50 | 3.7% |
사 | 41 | 3.0% |
가 | 40 | 3.0% |
족 | 38 | 2.8% |
여 | 36 | 2.7% |
일 | 35 | 2.6% |
내 | 31 | 2.3% |
Other values (133) | 906 |
Lowercase Letter
Value | Count | Frequency (%) |
a | 74 | |
i | 71 | |
e | 54 | |
o | 53 | |
n | 53 | |
t | 48 | 7.6% |
s | 44 | 7.0% |
c | 37 | 5.9% |
l | 34 | 5.4% |
r | 33 | 5.3% |
Other values (13) | 127 |
Uppercase Letter
Value | Count | Frequency (%) |
N | 21 | 9.7% |
B | 19 | 8.8% |
C | 18 | 8.3% |
I | 18 | 8.3% |
T | 17 | 7.8% |
S | 15 | 6.9% |
A | 15 | 6.9% |
P | 12 | 5.5% |
L | 11 | 5.1% |
E | 10 | 4.6% |
Other values (12) | 61 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 13 | |
: | 10 | |
. | 2 | 8.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 2 | |
3 | 1 | |
2 | 1 |
Space Separator
Value | Count | Frequency (%) |
(space) | 186 |
Close Punctuation
Value | Count | Frequency (%) |
) | 59 |
Open Punctuation
Value | Count | Frequency (%) |
( | 59 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1346 | |
Latin | 845 | |
Common | 337 | 13.3% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
병 | 63 | 4.7% |
부 | 53 | 3.9% |
력 | 53 | 3.9% |
자 | 50 | 3.7% |
사 | 41 | 3.0% |
가 | 40 | 3.0% |
족 | 38 | 2.8% |
여 | 36 | 2.7% |
일 | 35 | 2.6% |
내 | 31 | 2.3% |
Other values (133) | 906 |
Latin
Value | Count | Frequency (%) |
a | 74 | 8.8% |
i | 71 | 8.4% |
e | 54 | 6.4% |
o | 53 | 6.3% |
n | 53 | 6.3% |
t | 48 | 5.7% |
s | 44 | 5.2% |
c | 37 | 4.4% |
l | 34 | 4.0% |
r | 33 | 3.9% |
Other values (35) | 344 |
Common
Value | Count | Frequency (%) |
(space) | 186 | |
) | 59 | 17.5% |
( | 59 | 17.5% |
/ | 13 | 3.9% |
: | 10 | 3.0% |
- | 3 | 0.9% |
1 | 2 | 0.6% |
. | 2 | 0.6% |
3 | 1 | 0.3% |
2 | 1 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1346 | |
ASCII | 1182 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 186 | 15.7% |
a | 74 | 6.3% |
i | 71 | 6.0% |
) | 59 | 5.0% |
( | 59 | 5.0% |
e | 54 | 4.6% |
o | 53 | 4.5% |
n | 53 | 4.5% |
t | 48 | 4.1% |
s | 44 | 3.7% |
Other values (46) | 481 |
Hangul
Value | Count | Frequency (%) |
병 | 63 | 4.7% |
부 | 53 | 3.9% |
력 | 53 | 3.9% |
자 | 50 | 3.7% |
사 | 41 | 3.0% |
가 | 40 | 3.0% |
족 | 38 | 2.8% |
여 | 36 | 2.7% |
일 | 35 | 2.6% |
내 | 31 | 2.3% |
Other values (133) | 906 |
colCnt
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 147 |
---|---|
Distinct (%) | 58.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 60406.569 |
Minimum | 10 |
---|---|
Maximum | 755482 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.4 KiB |
Quantile statistics
Minimum | 10 |
---|---|
5-th percentile | 422.4 |
Q1 | 5599 |
median | 19267 |
Q3 | 39381 |
95-th percentile | 286468.6 |
Maximum | 755482 |
Range | 755472 |
Interquartile range (IQR) | 33782 |
Descriptive statistics
Standard deviation | 129867.02 |
---|---|
Coefficient of variation (CV) | 2.1498824 |
Kurtosis | 15.683781 |
Mean | 60406.569 |
Median Absolute Deviation (MAD) | 13848 |
Skewness | 3.8602949 |
Sum | 15282862 |
Variance | 1.6865443 × 10¹⁰ |
Monotonicity | Not monotonic |
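Several of the colCnt figures above can be derived from one another, and together they indicate a heavy right tail. The sketch below recomputes the derived quantities from the rounded values in the tables. It also applies Tukey's upper fence (Q3 + 1.5 × IQR) as a conventional way to judge how far the maximum lies from the bulk of the data. The fence is not part of the report and is added here only for illustration.

```python
# Summary values copied from the colCnt tables above (rounded as reported).
n, total = 253, 15282862
minimum, maximum = 10, 755482
q1, q3 = 5599, 39381
std_dev = 129867.02

mean = total / n                 # ~60406.569
value_range = maximum - minimum  # 755472
iqr = q3 - q1                    # 33782
cv = std_dev / mean              # ~2.15
variance = std_dev ** 2          # ~1.6865e10

# Tukey's upper fence: a conventional cut-off for unusually large values.
upper_fence = q3 + 1.5 * iqr

print(f"mean={mean:.3f} range={value_range} IQR={iqr} CV={cv:.7f}")
print(f"variance={variance:.7e} upper fence={upper_fence:.0f} max={maximum}")
```

The maximum is several times larger than the fence, which is in line with the reported skewness of about 3.86 and a mean well above the median.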
Value | Count | Frequency (%) |
128430 | 12 | 4.7% |
31410 | 10 | 4.0% |
13393 | 9 | 3.6% |
24809 | 7 | 2.8% |
22501 | 7 | 2.8% |
18640 | 7 | 2.8% |
23008 | 7 | 2.8% |
16395 | 7 | 2.8% |
23358 | 7 | 2.8% |
60299 | 6 | 2.4% |
Other values (137) | 174 |
Value | Count | Frequency (%) |
10 | 1 | |
26 | 1 | |
41 | 1 | |
47 | 1 | |
67 | 1 | |
123 | 1 | |
128 | 1 | |
130 | 1 | |
145 | 1 | |
197 | 2 |
Value | Count | Frequency (%) |
755482 | 4 | |
700110 | 1 | 0.4% |
500357 | 1 | 0.4% |
499632 | 1 | 0.4% |
467196 | 1 | 0.4% |
445627 | 4 | |
412393 | 1 | 0.4% |
202519 | 1 | 0.4% |
202517 | 2 | |
202514 | 1 | 0.4% |
dispFormat
Categorical
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 3.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.1 KiB |
텍스트 | |
---|---|
Y : 유 / N : 무 | |
YYYY-MM-DD | |
숫자 | |
RN+비식별숫자(8) | |
Other values (4) |
Length
Max length | 15 |
---|---|
Median length | 13 |
Mean length | 6.6679842 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.4% |
Sample
1st row | RN+비식별숫자(8) |
---|---|
2nd row | M 남 \| F 여 |
3rd row | YYYY-MM-DD |
4th row | 원내검사 코드 |
5th row | YYYY-MM-DD |
Common Values
Value | Count | Frequency (%) |
텍스트 | 112 | |
Y : 유 / N : 무 | 47 | |
YYYY-MM-DD | 34 | 13.4% |
숫자 | 25 | 9.9% |
RN+비식별숫자(8) | 19 | 7.5% |
Free 텍스트 | 11 | 4.3% |
원내검사 코드 | 2 | 0.8% |
Y : 내부 / N : 외부 | 2 | 0.8% |
M 남 \| F 여 | 1 | 0.4% |
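Each dispFormat value describes the expected shape of the data in a column. The sketch below shows how three of these labels might be turned into simple validators. It rests on assumptions about their meaning: `YYYY-MM-DD` is read as an ISO calendar date, `Y : 유 / N : 무` as a Y (yes) / N (no) flag, and `RN+비식별숫자(8)` as the literal prefix `RN` followed by eight de-identified digits. The `VALIDATORS` mapping and the test strings are hypothetical and are not taken from the registry.

```python
import re
from datetime import datetime

def is_iso_date(text):
    """True if text is a valid calendar date written as YYYY-MM-DD."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text) is None:
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True

# Assumed interpretation of three display-format labels from the table above.
VALIDATORS = {
    "YYYY-MM-DD": is_iso_date,
    "Y : 유 / N : 무": lambda text: text in {"Y", "N"},
    "RN+비식별숫자(8)": lambda text: re.fullmatch(r"RN\d{8}", text) is not None,
}

# Hypothetical example values, for illustration only.
print(VALIDATORS["YYYY-MM-DD"]("2023-12-12"))
print(VALIDATORS["RN+비식별숫자(8)"]("RN12345678"))
```

The `RN` plus eight digits reading gives a ten-character identifier, which fits the `String(10)` type shown for `PT_SBST_NO` in the sample rows.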
Length
Common Values (Plot)
Value | Count | Frequency (%) |
(empty) | 148 | |
텍스트 | 123 | |
y | 49 | 8.7% |
n | 49 | 8.7% |
유 | 47 | 8.3% |
무 | 47 | 8.3% |
yyyy-mm-dd | 34 | 6.0% |
숫자 | 25 | 4.4% |
rn+비식별숫자(8 | 19 | 3.4% |
free | 11 | 2.0% |
Other values (8) | 12 | 2.1% |
Variable | NUM | gpId | gpNm | tblId | tblNm | dataType | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.954 | 0.954 | 0.985 | 0.985 | 0.776 | 0.751 | 0.585 |
gpId | 0.954 | 1.000 | 1.000 | 1.000 | 1.000 | 0.791 | 0.955 | 0.631 |
gpNm | 0.954 | 1.000 | 1.000 | 1.000 | 1.000 | 0.791 | 0.955 | 0.631 |
tblId | 0.985 | 1.000 | 1.000 | 1.000 | 1.000 | 0.709 | 0.986 | 0.652 |
tblNm | 0.985 | 1.000 | 1.000 | 1.000 | 1.000 | 0.709 | 0.986 | 0.652 |
dataType | 0.776 | 0.791 | 0.791 | 0.709 | 0.709 | 1.000 | 0.541 | 0.958 |
colCnt | 0.751 | 0.955 | 0.955 | 0.986 | 0.986 | 0.541 | 1.000 | 0.000 |
dispFormat | 0.585 | 0.631 | 0.631 | 0.652 | 0.652 | 0.958 | 0.000 | 1.000 |
Variable | dataType | tblNm | dispFormat | gpId | gpNm | tblId |
---|---|---|---|---|---|---|
dataType | 1.000 | 0.233 | 0.751 | 0.359 | 0.359 | 0.233 |
tblNm | 0.233 | 1.000 | 0.297 | 0.970 | 0.970 | 1.000 |
dispFormat | 0.751 | 0.297 | 1.000 | 0.312 | 0.312 | 0.297 |
gpId | 0.359 | 0.970 | 0.312 | 1.000 | 1.000 | 0.970 |
gpNm | 0.359 | 0.970 | 0.312 | 1.000 | 1.000 | 0.970 |
tblId | 0.233 | 1.000 | 0.297 | 0.970 | 0.970 | 1.000 |
Variable | NUM | colCnt | gpId | gpNm | tblId | tblNm | dataType | dispFormat |
---|---|---|---|---|---|---|---|---|
NUM | 1.000 | 0.155 | 0.750 | 0.750 | 0.853 | 0.853 | 0.394 | 0.312 |
colCnt | 0.155 | 1.000 | 0.810 | 0.810 | 0.878 | 0.878 | 0.262 | 0.000 |
gpId | 0.750 | 0.810 | 1.000 | 1.000 | 0.970 | 0.970 | 0.359 | 0.312 |
gpNm | 0.750 | 0.810 | 1.000 | 1.000 | 0.970 | 0.970 | 0.359 | 0.312 |
tblId | 0.853 | 0.878 | 0.970 | 0.970 | 1.000 | 1.000 | 0.233 | 0.297 |
tblNm | 0.853 | 0.878 | 0.970 | 0.970 | 1.000 | 1.000 | 0.233 | 0.297 |
dataType | 0.394 | 0.262 | 0.359 | 0.359 | 0.233 | 0.233 | 1.000 | 0.751 |
dispFormat | 0.312 | 0.000 | 0.312 | 0.312 | 0.297 | 0.297 | 0.751 | 1.000 |
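The "High correlation" alerts listed near the top of the report can be reproduced from the matrix directly above. The sketch below lists, for each variable, the other variables whose coefficient reaches a cut-off. The value 0.5 is an assumption, chosen because it yields the same counts as the alerts (for example, NUM with gpId and 3 other fields). It is not read from the report's configuration.

```python
# Column order and values copied from the correlation matrix directly above.
columns = ["NUM", "colCnt", "gpId", "gpNm", "tblId", "tblNm", "dataType", "dispFormat"]
matrix = [
    [1.000, 0.155, 0.750, 0.750, 0.853, 0.853, 0.394, 0.312],
    [0.155, 1.000, 0.810, 0.810, 0.878, 0.878, 0.262, 0.000],
    [0.750, 0.810, 1.000, 1.000, 0.970, 0.970, 0.359, 0.312],
    [0.750, 0.810, 1.000, 1.000, 0.970, 0.970, 0.359, 0.312],
    [0.853, 0.878, 0.970, 0.970, 1.000, 1.000, 0.233, 0.297],
    [0.853, 0.878, 0.970, 0.970, 1.000, 1.000, 0.233, 0.297],
    [0.394, 0.262, 0.359, 0.359, 0.233, 0.233, 1.000, 0.751],
    [0.312, 0.000, 0.312, 0.312, 0.297, 0.297, 0.751, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; it happens to match the alert counts in this report

# For each variable, collect the other variables at or above the threshold.
high = {
    name: [other for other, value in zip(columns, row)
           if other != name and value >= THRESHOLD]
    for name, row in zip(columns, matrix)
}

for name, others in high.items():
    print(f"{name}: {', '.join(others)}")
```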
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | PT_SBST_NO | 환자대체번호 | String(10) | 환자대체번호 | 24809 | RN+비식별숫자(8) |
1 | 2 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | SEX_CD | 성별 코드 | String(code) | 성별코드 | 24791 | M 남 \| F 여 |
2 | 3 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | BRTH_YMD | 생년월일 | DATE | 생년월일 | 24791 | YYYY-MM-DD |
3 | 4 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRST_DIAG_CD | 최초 진단 코드 | String(code) | 최초진단코드 | 24809 | 원내검사 코드 |
4 | 5 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRST_DIAG_YMD | 최초 진단일 | DATE | 최초진단일자 | 24809 | YYYY-MM-DD |
5 | 6 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRST_DIAG_NM | 최초 진단명 | String(256) | 최초진단명 | 24809 | 텍스트 |
6 | 7 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | DIAG_ATT_AGE | 진단 시 나이 | Integer(3) | 진단 시 나이 | 24791 | 숫자 |
7 | 8 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRMD_YMD | 초진일 | DATE | 초진일자 | 24391 | YYYY-MM-DD |
8 | 9 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRST_OPRT_YMD | 최초 수술일 | DATE | 최초 수술일자 | 5307 | YYYY-MM-DD |
9 | 10 | LUNG_TRGT | Summary | LUNG_PT_TRGT | 기본정보 | FRST_OPRT_NM | 최초 수술명 | String(256) | 최초 수술명 | 5307 | 텍스트 |
Index | NUM | gpId | gpNm | tblId | tblNm | colId | colNm | dataType | colDesc | colCnt | dispFormat |
---|---|---|---|---|---|---|---|---|---|---|---|
243 | 244 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | FATG_CMNT | FATIGUE 내용 | String(50) | FATIGUE | 128430 | 텍스트 |
244 | 245 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | NV_CMNT | NV 내용 | String(50) | NV | 128430 | 텍스트 |
245 | 246 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | CSTP_CMNT | CONSTIPATION 내용 | String(50) | CONSTIPATION | 128430 | 텍스트 |
246 | 247 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | DIAR_CMNT | DIARRHEA 내용 | String(50) | DIARRHEA | 128430 | 텍스트 |
247 | 248 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | SKIN_RASH_CMNT | SKINRASH 내용 | String(50) | SKINRASH | 128430 | 텍스트 |
248 | 249 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | MCST_CMNT | MUCOSITIS 내용 | String(50) | MUCOSITIS | 128430 | 텍스트 |
249 | 250 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | NURO_PTHY_CMNT | NEUROPATHY 내용 | String(50) | NEUROPATHY | 128430 | 텍스트 |
250 | 251 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | ECOG_CD | ECOG 코드 | Integer(code) | ECOG 전신상태평가 | 70183 | 숫자 |
251 | 252 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | WT_VL | 체중 (kg) | String(8) | 체중 | 121580 | 텍스트 |
252 | 253 | LUNG_CHMO_FLST | 항암 FlowSheet | LUNG_PE_CHMO_FLST | Flow Sheet | BSA_VL | BSA | Float(102) | 체표면적 | 121542 | 숫자 |