Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 1725 |
Missing cells | 230 |
Missing cells (%) | 1.9% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 101.2 KiB |
Average record size in memory | 60.1 B |
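The two derived figures in this table can be recomputed from the other entries. The sketch below assumes that Missing cells (%) is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations, with 1 KiB = 1024 bytes. Both assumptions reproduce the reported values.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 1725
missing_cells = 230
total_memory_kib = 101.2

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 1.9%
print(f"{avg_record_bytes:.1f} B")  # 60.1 B
```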
Variable types
Numeric | 4 |
---|---|
Categorical | 3 |
Dataset
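Description | Cancer incidence data for 24 cancer types in Korea, taken from the [2021 Cancer Registry Statistics] (2021년 암등록통계) published in December 2023. Figures for 1999-2020 have changed because earlier data were updated. (Units: persons; incidence rate per 100,000 persons) |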
---|---|
Author | National Cancer Center (국립암센터) |
URL | https://www.data.go.kr/data/15009644/fileData.do |
국제질병분류 is highly overall correlated with 암종 | High correlation |
암종 is highly overall correlated with 국제질병분류 | High correlation |
발생자수 is highly overall correlated with 조발생률 and 1 other fields | High correlation |
조발생률 is highly overall correlated with 발생자수 and 1 other fields | High correlation |
연령표준화발생률 is highly overall correlated with 발생자수 and 1 other fields | High correlation |
조발생률 has 115 (6.7%) missing values | Missing |
연령표준화발생률 has 115 (6.7%) missing values | Missing |
발생자수 has 115 (6.7%) zeros | Zeros |
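The 115 zeros in 발생자수 and the 115 missing values in each rate column may well fall on the same rows. The report does not state this, but the last-rows sample at the end of the report contains one such row (2021, 여자, C62 고환: 0 cases, both rates `<NA>`). A minimal sketch of how this could be checked, using three rows copied from that sample with `None` standing in for `<NA>`; the helper name `zero_count_rows_lack_rates` is hypothetical.

```python
# Rows copied from the report's last-rows sample:
# (발생연도, 성별, 국제질병분류, 암종, 발생자수, 조발생률, 연령표준화발생률)
rows = [
    (2021, "여자", "C62", "고환", 0, None, None),
    (2021, "여자", "C64", "신장", 2108, 8.2, 7.7),
    (2021, "여자", "C67", "방광", 968, 3.8, 3.2),
]

def zero_count_rows_lack_rates(rows):
    """Return True if both rates are missing exactly on the zero-count rows."""
    for *_, cases, crude_rate, std_rate in rows:
        rates_missing = crude_rate is None and std_rate is None
        if (cases == 0) != rates_missing:
            return False
    return True

print(zero_count_rows_lack_rates(rows))  # True for these three rows
```

Running the same check over the full file would confirm or refute the pattern; three rows only illustrate it.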
Reproduction
Analysis started | 2024-03-14 23:14:27.450923 |
---|---|
Analysis finished | 2024-03-14 23:14:33.541664 |
Duration | 6.09 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
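The reported duration follows directly from the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-03-14 23:14:27.450923")
finished = datetime.fromisoformat("2024-03-14 23:14:33.541664")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 6.09 seconds
```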
발생연도
Real number (ℝ)
Distinct | 23 |
---|---|
Distinct (%) | 1.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2010 |
Minimum | 1999 |
---|---|
Maximum | 2021 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.3 KiB |
Quantile statistics
Minimum | 1999 |
---|---|
5-th percentile | 2000 |
Q1 | 2004 |
median | 2010 |
Q3 | 2016 |
95-th percentile | 2020 |
Maximum | 2021 |
Range | 22 |
Interquartile range (IQR) | 12 |
Descriptive statistics
Standard deviation | 6.6351731 |
---|---|
Coefficient of variation (CV) | 0.0033010811 |
Kurtosis | -1.2045578 |
Mean | 2010 |
Median Absolute Deviation (MAD) | 6 |
Skewness | 0 |
Sum | 3467250 |
Variance | 44.025522 |
Monotonicity | Increasing |
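Because every year from 1999 to 2021 occurs exactly 75 times, the descriptive statistics for 발생연도 can be rebuilt from scratch. The sketch below assumes the report uses the sample variance (n − 1 denominator), which is what matches the reported 44.025522; the population variance would be exactly 44.

```python
import statistics

# Each year from 1999 to 2021 appears 75 times (see the frequency tables).
years = [year for year in range(1999, 2022) for _ in range(75)]

n = len(years)                         # number of observations
total = sum(years)                     # reported as "Sum"
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation

print(n, total, mean)                  # 1725 3467250 2010
print(round(variance, 6), round(std, 7), round(cv, 10))
```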
Value | Count | Frequency (%) |
1999 | 75 | 4.3% |
2000 | 75 | 4.3% |
2021 | 75 | 4.3% |
2020 | 75 | 4.3% |
2019 | 75 | 4.3% |
2018 | 75 | 4.3% |
2017 | 75 | 4.3% |
2016 | 75 | 4.3% |
2015 | 75 | 4.3% |
2014 | 75 | 4.3% |
Other values (13) | 975 | 56.5% |
Value | Count | Frequency (%) |
1999 | 75 | 4.3% |
2000 | 75 | 4.3% |
2001 | 75 | 4.3% |
2002 | 75 | 4.3% |
2003 | 75 | 4.3% |
2004 | 75 | 4.3% |
2005 | 75 | 4.3% |
2006 | 75 | 4.3% |
2007 | 75 | 4.3% |
2008 | 75 | 4.3% |
Value | Count | Frequency (%) |
2021 | 75 | 4.3% |
2020 | 75 | 4.3% |
2019 | 75 | 4.3% |
2018 | 75 | 4.3% |
2017 | 75 | 4.3% |
2016 | 75 | 4.3% |
2015 | 75 | 4.3% |
2014 | 75 | 4.3% |
2013 | 75 | 4.3% |
2012 | 75 | 4.3% |
성별
Categorical
Distinct | 3 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.6 KiB |
남녀전체 | 575 |
---|---|
남자 | 575 |
여자 | 575 |
Length
Max length | 4 |
---|---|
Median length | 2 |
Mean length | 2.6666667 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 남녀전체 |
---|---|
2nd row | 남녀전체 |
3rd row | 남녀전체 |
4th row | 남녀전체 |
5th row | 남녀전체 |
Common Values
Value | Count | Frequency (%) |
남녀전체 | 575 | 33.3% |
남자 | 575 | 33.3% |
여자 | 575 | 33.3% |
국제질병분류
Categorical
HIGH CORRELATION
Distinct | 25 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.6 KiB |
C00-C96 | 69 |
---|---|
C00-C14 | 69 |
C15 | 69 |
C16 | 69 |
C18-C20 | 69 |
Other values (20) | 1380 |
Length
Max length | 11 |
---|---|
Median length | 3 |
Mean length | 4.72 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | C00-C96 |
---|---|
2nd row | C00-C14 |
3rd row | C15 |
4th row | C16 |
5th row | C18-C20 |
Common Values
Value | Count | Frequency (%) |
C00-C96 | 69 | 4.0% |
C00-C14 | 69 | 4.0% |
C15 | 69 | 4.0% |
C16 | 69 | 4.0% |
C18-C20 | 69 | 4.0% |
C22 | 69 | 4.0% |
C23-C24 | 69 | 4.0% |
C25 | 69 | 4.0% |
C32 | 69 | 4.0% |
C33-C34 | 69 | 4.0% |
Other values (15) | 1035 | 60.0% |
Words
Value | Count | Frequency (%) |
c00-c96 | 69 | 4.0% |
c56 | 69 | 4.0% |
c91-c95 | 69 | 4.0% |
c90 | 69 | 4.0% |
c82-c86,c96 | 69 | 4.0% |
c81 | 69 | 4.0% |
c73 | 69 | 4.0% |
c70-c72 | 69 | 4.0% |
c67 | 69 | 4.0% |
c64 | 69 | 4.0% |
Other values (15) | 1035 | 60.0% |
암종
Categorical
HIGH CORRELATION
Distinct | 25 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 13.6 KiB |
모든암 | 69 |
---|---|
입술, 구강 및 인두 | 69 |
식도 | 69 |
위 | 69 |
대장 | 69 |
Other values (20) | 1380 |
Length
Max length | 11 |
---|---|
Median length | 9 |
Mean length | 3.76 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 모든암 |
---|---|
2nd row | 입술, 구강 및 인두 |
3rd row | 식도 |
4th row | 위 |
5th row | 대장 |
Common Values
Value | Count | Frequency (%) |
모든암 | 69 | 4.0% |
입술, 구강 및 인두 | 69 | 4.0% |
식도 | 69 | 4.0% |
위 | 69 | 4.0% |
대장 | 69 | 4.0% |
간 | 69 | 4.0% |
담낭 및 기타담도 | 69 | 4.0% |
췌장 | 69 | 4.0% |
후두 | 69 | 4.0% |
폐 | 69 | 4.0% |
Other values (15) | 1035 | 60.0% |
Words
Value | Count | Frequency (%) |
및 | 207 | 8.8% |
모든암 | 69 | 2.9% |
난소 | 69 | 2.9% |
기타 | 69 | 2.9% |
백혈병 | 69 | 2.9% |
골수종 | 69 | 2.9% |
다발성 | 69 | 2.9% |
비호지킨림프종 | 69 | 2.9% |
호지킨림프종 | 69 | 2.9% |
갑상선 | 69 | 2.9% |
Other values (22) | 1518 | 64.7% |
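The table above reads like a word-frequency count over the 암종 values: "및" appears in three category names, each of which occurs 69 times, giving 207. The sketch below reproduces that count under a simple, assumed tokenization (split on whitespace, strip commas); the tokenizer actually used by the profiling tool is not shown in the report, and the list contains only category names that appear verbatim elsewhere in this report.

```python
from collections import Counter

ROWS_PER_CATEGORY = 69  # each 암종 value occurs 69 times in the report

# Category names that appear verbatim in this report; categories that are
# not listed anywhere in the source are deliberately left out.
known_categories = [
    "모든암", "입술, 구강 및 인두", "식도", "위", "대장", "간",
    "담낭 및 기타담도", "췌장", "후두", "폐", "고환", "신장", "방광",
    "뇌 및 중추신경계", "갑상선", "호지킨림프종", "비호지킨림프종",
    "다발성 골수종", "백혈병", "기타 암",
]

def tokenize(name):
    """Hypothetical tokenizer: split on whitespace and strip commas."""
    return [token.strip(",") for token in name.split() if token.strip(",")]

word_counts = Counter()
for name in known_categories:
    for token in tokenize(name):
        word_counts[token] += ROWS_PER_CATEGORY

print(word_counts["및"])      # 207
print(word_counts["다발성"])  # 69
```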
발생자수
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 1369 |
---|---|
Distinct (%) | 79.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 10161.7 |
Minimum | 0 |
---|---|
Maximum | 277523 |
Zeros | 115 |
Zeros (%) | 6.7% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.3 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 906 |
median | 2463 |
Q3 | 7992 |
95-th percentile | 30526.6 |
Maximum | 277523 |
Range | 277523 |
Interquartile range (IQR) | 7086 |
Descriptive statistics
Standard deviation | 27561.421 |
---|---|
Coefficient of variation (CV) | 2.7122845 |
Kurtosis | 38.225033 |
Mean | 10161.7 |
Median Absolute Deviation (MAD) | 2077 |
Skewness | 5.7387966 |
Sum | 17528932 |
Variance | 7.5963191 × 10^8 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 115 | 6.7% |
62 | 4 | 0.2% |
177 | 4 | 0.2% |
155 | 4 | 0.2% |
182 | 4 | 0.2% |
76 | 4 | 0.2% |
154 | 4 | 0.2% |
328 | 3 | 0.2% |
2808 | 3 | 0.2% |
9458 | 3 | 0.2% |
Other values (1359) | 1577 | 91.4% |
Value | Count | Frequency (%) |
0 | 115 | 6.7% |
35 | 1 | 0.1% |
37 | 1 | 0.1% |
41 | 1 | 0.1% |
44 | 1 | 0.1% |
46 | 1 | 0.1% |
48 | 2 | 0.1% |
51 | 2 | 0.1% |
53 | 1 | 0.1% |
57 | 1 | 0.1% |
Value | Count | Frequency (%) |
277523 | 1 | 0.1% |
258121 | 1 | 0.1% |
250521 | 1 | 0.1% |
247251 | 1 | 0.1% |
237181 | 1 | 0.1% |
233426 | 1 | 0.1% |
229659 | 1 | 0.1% |
228956 | 1 | 0.1% |
222996 | 1 | 0.1% |
221356 | 1 | 0.1% |
조발생률
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 551 |
---|---|
Distinct (%) | 34.2% |
Missing | 115 |
Missing (%) | 6.7% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 32.622174 |
Minimum | 0.1 |
---|---|
Maximum | 561.7 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.3 KiB |
Quantile statistics
Minimum | 0.1 |
---|---|
5-th percentile | 0.4 |
Q1 | 3.7 |
median | 8.2 |
Q3 | 29.275 |
95-th percentile | 91.795 |
Maximum | 561.7 |
Range | 561.6 |
Interquartile range (IQR) | 25.575 |
Descriptive statistics
Standard deviation | 79.326079 |
---|---|
Coefficient of variation (CV) | 2.4316613 |
Kurtosis | 20.692893 |
Mean | 32.622174 |
Median Absolute Deviation (MAD) | 6.2 |
Skewness | 4.4973279 |
Sum | 52521.7 |
Variance | 6292.6269 |
Monotonicity | Not monotonic |
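The mean and Distinct (%) for 조발생률 appear to be computed over the 1610 non-missing values rather than all 1725 rows. That is an inference from the numbers, not something the report states; the sketch below shows the arithmetic that supports it.

```python
# Values copied from the 조발생률 tables.
n_observations = 1725
n_missing = 115
total = 52521.7   # reported Sum
distinct = 551    # reported Distinct
std = 79.326079   # reported Standard deviation

# Assumption: statistics are taken over non-missing values only.
n_valid = n_observations - n_missing

mean = total / n_valid                   # compare with reported 32.622174
distinct_pct = 100 * distinct / n_valid  # compare with reported 34.2%
cv = std / mean                          # compare with reported 2.4316613

print(n_valid, round(mean, 4), round(distinct_pct, 1), round(cv, 4))
```

Dividing by all 1725 rows instead would give a mean of about 30.4 and a distinct share of about 31.9%, neither of which matches the report.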
Value | Count | Frequency (%) |
0.3 | 40 | 2.3% |
0.4 | 33 | 1.9% |
0.7 | 24 | 1.4% |
0.5 | 19 | 1.1% |
0.6 | 18 | 1.0% |
0.2 | 18 | 1.0% |
3.1 | 16 | 0.9% |
2.8 | 15 | 0.9% |
4.2 | 15 | 0.9% |
4.3 | 15 | 0.9% |
Other values (541) | 1397 | 81.0% |
(Missing) | 115 | 6.7% |
Value | Count | Frequency (%) |
0.1 | 1 | 0.1% |
0.2 | 18 | 1.0% |
0.3 | 40 | 2.3% |
0.4 | 33 | 1.9% |
0.5 | 19 | 1.1% |
0.6 | 18 | 1.0% |
0.7 | 24 | 1.4% |
0.8 | 10 | 0.6% |
0.9 | 6 | 0.3% |
1.0 | 11 | 0.6% |
Value | Count | Frequency (%) |
561.7 | 1 | 0.1% |
540.6 | 1 | 0.1% |
530.9 | 1 | 0.1% |
519.7 | 1 | 0.1% |
515.2 | 1 | 0.1% |
509.9 | 1 | 0.1% |
502.8 | 1 | 0.1% |
487.9 | 2 | 0.1% |
482.0 | 1 | 0.1% |
479.1 | 1 | 0.1% |
연령표준화발생률
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 619 |
---|---|
Distinct (%) | 38.4% |
Missing | 115 |
Missing (%) | 6.7% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 43.623043 |
Minimum | 0.2 |
---|---|
Maximum | 686.9 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 15.3 KiB |
Quantile statistics
Minimum | 0.2 |
---|---|
5-th percentile | 0.5 |
Q1 | 4.5 |
median | 11.7 |
Q3 | 36.525 |
95-th percentile | 131.03 |
Maximum | 686.9 |
Range | 686.7 |
Interquartile range (IQR) | 32.025 |
Descriptive statistics
Standard deviation | 104.35027 |
---|---|
Coefficient of variation (CV) | 2.3920905 |
Kurtosis | 18.697302 |
Mean | 43.623043 |
Median Absolute Deviation (MAD) | 8.5 |
Skewness | 4.3125338 |
Sum | 70233.1 |
Variance | 10888.978 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.5 | 36 | 2.1% |
0.4 | 31 | 1.8% |
0.6 | 25 | 1.4% |
0.3 | 22 | 1.3% |
3.5 | 19 | 1.1% |
0.7 | 16 | 0.9% |
3.6 | 15 | 0.9% |
3.9 | 15 | 0.9% |
3.4 | 15 | 0.9% |
5.8 | 13 | 0.8% |
Other values (609) | 1403 | 81.3% |
(Missing) | 115 | 6.7% |
Value | Count | Frequency (%) |
0.2 | 11 | 0.6% |
0.3 | 22 | 1.3% |
0.4 | 31 | 1.8% |
0.5 | 36 | 2.1% |
0.6 | 25 | 1.4% |
0.7 | 16 | 0.9% |
0.8 | 6 | 0.3% |
0.9 | 7 | 0.4% |
1.0 | 12 | 0.7% |
1.1 | 7 | 0.4% |
Value | Count | Frequency (%) |
686.9 | 1 | 0.1% |
675.8 | 1 | 0.1% |
672.5 | 1 | 0.1% |
672.1 | 1 | 0.1% |
660.3 | 1 | 0.1% |
651.2 | 1 | 0.1% |
646.9 | 1 | 0.1% |
636.0 | 1 | 0.1% |
631.9 | 1 | 0.1% |
626.5 | 1 | 0.1% |
 | 발생연도 | 성별 | 국제질병분류 | 암종 | 발생자수 | 조발생률 | 연령표준화발생률 |
---|---|---|---|---|---|---|---|
발생연도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.188 | 0.308 | 0.067 |
성별 | 0.000 | 1.000 | 0.000 | 0.000 | 0.243 | 0.125 | 0.420 |
국제질병분류 | 0.000 | 0.000 | 1.000 | 1.000 | 0.696 | 0.744 | 0.740 |
암종 | 0.000 | 0.000 | 1.000 | 1.000 | 0.696 | 0.744 | 0.740 |
발생자수 | 0.188 | 0.243 | 0.696 | 0.696 | 1.000 | 0.938 | 0.794 |
조발생률 | 0.308 | 0.125 | 0.744 | 0.744 | 0.938 | 1.000 | 0.876 |
연령표준화발생률 | 0.067 | 0.420 | 0.740 | 0.740 | 0.794 | 0.876 | 1.000 |
 | 성별 | 국제질병분류 | 암종 |
---|---|---|---|
성별 | 1.000 | 0.000 | 0.000 |
국제질병분류 | 0.000 | 1.000 | 1.000 |
암종 | 0.000 | 1.000 | 1.000 |
 | 발생연도 | 발생자수 | 조발생률 | 연령표준화발생률 | 성별 | 국제질병분류 | 암종 |
---|---|---|---|---|---|---|---|
발생연도 | 1.000 | 0.175 | 0.192 | 0.060 | 0.000 | 0.000 | 0.000 |
발생자수 | 0.175 | 1.000 | 0.968 | 0.946 | 0.149 | 0.325 | 0.325 |
조발생률 | 0.192 | 0.968 | 1.000 | 0.983 | 0.075 | 0.368 | 0.368 |
연령표준화발생률 | 0.060 | 0.946 | 0.983 | 1.000 | 0.204 | 0.392 | 0.392 |
성별 | 0.000 | 0.149 | 0.075 | 0.204 | 1.000 | 0.000 | 0.000 |
국제질병분류 | 0.000 | 0.325 | 0.368 | 0.392 | 0.000 | 1.000 | 1.000 |
암종 | 0.000 | 0.325 | 0.368 | 0.392 | 0.000 | 1.000 | 1.000 |
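The high-correlation alerts near the top of the report are consistent with the matrix directly above if a cutoff of 0.5 is assumed. Both the cutoff and the choice of this matrix as the source of the alerts are assumptions; the report states neither. A minimal sketch that lists the variable pairs at or above the cutoff (`high_pairs` is a hypothetical helper):

```python
# Matrix copied from the table directly above, in the same variable order.
names = ["발생연도", "발생자수", "조발생률", "연령표준화발생률",
         "성별", "국제질병분류", "암종"]
matrix = [
    [1.000, 0.175, 0.192, 0.060, 0.000, 0.000, 0.000],
    [0.175, 1.000, 0.968, 0.946, 0.149, 0.325, 0.325],
    [0.192, 0.968, 1.000, 0.983, 0.075, 0.368, 0.368],
    [0.060, 0.946, 0.983, 1.000, 0.204, 0.392, 0.392],
    [0.000, 0.149, 0.075, 0.204, 1.000, 0.000, 0.000],
    [0.000, 0.325, 0.368, 0.392, 0.000, 1.000, 1.000],
    [0.000, 0.325, 0.368, 0.392, 0.000, 1.000, 1.000],
]
THRESHOLD = 0.5  # assumed cutoff, not stated in the report

def high_pairs(names, matrix, threshold):
    """Return (a, b, value) for each distinct pair at or above the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

for a, b, value in high_pairs(names, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {value:.3f}")
```

Under these assumptions the output contains four pairs: the three numeric measures with one another, and 국제질병분류 with 암종, which matches the alert list.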
 | 발생연도 | 성별 | 국제질병분류 | 암종 | 발생자수 | 조발생률 | 연령표준화발생률 |
---|---|---|---|---|---|---|---|
0 | 1999 | 남녀전체 | C00-C96 | 모든암 | 101857 | 216.0 | 402.7 |
1 | 1999 | 남녀전체 | C00-C14 | 입술, 구강 및 인두 | 1739 | 3.7 | 6.6 |
2 | 1999 | 남녀전체 | C15 | 식도 | 1861 | 3.9 | 8.2 |
3 | 1999 | 남녀전체 | C16 | 위 | 20901 | 44.3 | 86.0 |
4 | 1999 | 남녀전체 | C18-C20 | 대장 | 9780 | 20.7 | 40.8 |
5 | 1999 | 남녀전체 | C22 | 간 | 13262 | 28.1 | 52.4 |
6 | 1999 | 남녀전체 | C23-C24 | 담낭 및 기타담도 | 3047 | 6.5 | 14.0 |
7 | 1999 | 남녀전체 | C25 | 췌장 | 2614 | 5.5 | 11.7 |
8 | 1999 | 남녀전체 | C32 | 후두 | 1101 | 2.3 | 4.8 |
9 | 1999 | 남녀전체 | C33-C34 | 폐 | 13230 | 28.1 | 59.8 |
 | 발생연도 | 성별 | 국제질병분류 | 암종 | 발생자수 | 조발생률 | 연령표준화발생률 |
---|---|---|---|---|---|---|---|
1715 | 2021 | 여자 | C62 | 고환 | 0 | <NA> | <NA> |
1716 | 2021 | 여자 | C64 | 신장 | 2108 | 8.2 | 7.7 |
1717 | 2021 | 여자 | C67 | 방광 | 968 | 3.8 | 3.2 |
1718 | 2021 | 여자 | C70-C72 | 뇌 및 중추신경계 | 939 | 3.6 | 3.4 |
1719 | 2021 | 여자 | C73 | 갑상선 | 26532 | 103.1 | 104.0 |
1720 | 2021 | 여자 | C81 | 호지킨림프종 | 112 | 0.4 | 0.4 |
1721 | 2021 | 여자 | C82-C86,C96 | 비호지킨림프종 | 2434 | 9.5 | 8.7 |
1722 | 2021 | 여자 | C90 | 다발성 골수종 | 923 | 3.6 | 3.2 |
1723 | 2021 | 여자 | C91-C95 | 백혈병 | 1686 | 6.5 | 6.2 |
1724 | 2021 | 여자 | Re.C00-C96 | 기타 암 | 11995 | 46.6 | 41.2 |