Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 626
Duplicate rows (%): 6.3%
Total size in memory: 859.4 KiB
Average record size in memory: 88.0 B
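
The two derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the values printed above; it assumes that 1 KiB is 1024 bytes and that the report rounds to one decimal place. The variable names are illustrative and are not part of the report.

```python
# Values copied from the "Dataset statistics" block.
n_rows = 10_000
duplicate_rows = 626
total_size_kib = 859.4

# Share of duplicate rows, as a percentage rounded to one decimal.
duplicate_pct = round(duplicate_rows / n_rows * 100, 1)

# Average record size in bytes (assumes 1 KiB = 1024 B).
avg_record_bytes = round(total_size_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record_bytes} B")
```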

Variable types

Categorical: 3
Numeric: 6

Dataset

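Description:
o (Content) Number of people eligible for, and number of people who received, general (cancer) health screening, by province/city and age
o (Population) Health insurance subscribers who are eligible for at least one type of general (cancer) screening in the given year
o (Variable layout)
  1 사업년도 (program year)
  2 센터구분코드 (center code; 01: Seoul, 02: Busan, 03: Daegu, 04: Gwangju, 05: Daejeon, 06: Gyeongin)
  3 관리지사코드 (managing branch code; the branch with jurisdiction over the examinee's address at the time of screening)
  4 소속지사구분코드 (branch type code; 0: branch, 1: sub-office 1, 2: sub-office 2, 3: sub-office 3)
  5 대상자연령 (age of eligible persons)
  6 건강검진대상유형코드 (screening type code; A0: general, A5: life-transition screening, D1: stomach cancer, D2: colorectal cancer, D3: breast cancer, D4: liver cancer (first half of year), D5: cervical cancer, D6: liver cancer (second half of year), D7: lung cancer)
  7 대상자인원수 (number of eligible persons)
  8 수검자인원수 (number of persons screened)
  9 수검율 (screening rate; 수검자인원수 / 대상자인원수 x 100 (%))
o (Data coverage) The most recent one month for which data exist (May 1, 2020 to May 31, 2020); 6 or more rows cannot be provided
URL: https://www.data.go.kr/data/15121859/fileData.do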
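
The variable layout defines 수검율 as 수검자인원수 / 대상자인원수 x 100 (%). A minimal sketch of that formula follows, checked against three rows taken from the Sample section of this report. The function name `screening_rate` and the rounding to two decimals are assumptions; the report only shows that published rates carry at most two decimals.

```python
def screening_rate(screened: int, eligible: int) -> float:
    """Return 수검율 = 수검자인원수 / 대상자인원수 x 100, rounded to 2 decimals (assumed)."""
    if eligible <= 0:
        raise ValueError("eligible must be a positive count")
    return round(screened / eligible * 100, 2)


# (대상자인원수, 수검자인원수, reported 수검율) as read from the Sample section.
sample_rows = [
    (8920, 392, 4.39),
    (2447, 148, 6.05),
    (1125, 8, 0.71),
]

for eligible, screened, reported in sample_rows:
    computed = screening_rate(screened, eligible)
    print(eligible, screened, computed, abs(computed - reported) < 1e-9)
```

A row with 수검자인원수 equal to 0 yields a rate of 0.0, which is consistent with both variables reporting the same number of zeros (264).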

Alerts

사업년도 has constant value "2020" (Constant)
Dataset has 626 (6.3%) duplicate rows (Duplicates)
대상자인원수 is highly overall correlated with 수검자인원수 (High correlation)
수검자인원수 is highly overall correlated with 대상자인원수 (High correlation)
수검자인원수 has 264 (2.6%) zeros (Zeros)
수검율 has 264 (2.6%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 20:17:15.651246
Analysis finished: 2023-12-12 20:17:21.224563
Duration: 5.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2020: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020  10000  100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value  Count  Frequency (%)
2020  10000  100.0%

센터구분코드
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.4766
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.785153
Coefficient of variation (CV): 0.51347666
Kurtosis: -1.3426381
Mean: 3.4766
Median Absolute Deviation (MAD): 2
Skewness: 0.0086074088
Sum: 34766
Variance: 3.1867711
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common Values

Value  Count  Frequency (%)
1  2002  20.0%
6  1927  19.3%
4  1822  18.2%
2  1543  15.4%
5  1355  13.6%
3  1351  13.5%

Minimum values

Value  Count  Frequency (%)
1  2002  20.0%
2  1543  15.4%
3  1351  13.5%
4  1822  18.2%
5  1355  13.6%
6  1927  19.3%

Maximum values

Value  Count  Frequency (%)
6  1927  19.3%
5  1355  13.6%
4  1822  18.2%
3  1351  13.5%
2  1543  15.4%
1  2002  20.0%
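
Because 센터구분코드 takes only six values, its summary statistics can be recomputed directly from the value counts listed above. The sketch below does that with the standard library. Using the sample variance (denominator n - 1) is an assumption; it is chosen because it reproduces the reported variance of 3.1867711.

```python
import math

# Value -> count, copied from the frequency table for 센터구분코드.
counts = {1: 2002, 2: 1543, 3: 1351, 4: 1822, 5: 1355, 6: 1927}

n = sum(counts.values())                         # number of observations
total = sum(v * c for v, c in counts.items())    # reported as "Sum"
mean = total / n

# Sample variance with an (n - 1) denominator (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                  # coefficient of variation

print(n, total, mean)
print(round(variance, 7), round(std, 6), round(cv, 8))
```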

관리지사코드
Real number (ℝ)

Distinct: 178
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 444.5311
Minimum: 101
Maximum: 802
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 113
Q1: 242
median: 405
Q3: 664
95-th percentile: 762
Maximum: 802
Range: 701
Interquartile range (IQR): 422

Descriptive statistics

Standard deviation: 218.97221
Coefficient of variation (CV): 0.49259144
Kurtosis: -1.4349458
Mean: 444.5311
Median Absolute Deviation (MAD): 198
Skewness: 0.023513377
Sum: 4445311
Variance: 47948.83
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
716  141  1.4%
765  141  1.4%
606  137  1.4%
704  136  1.4%
401  128  1.3%
405  128  1.3%
604  122  1.2%
505  115  1.1%
416  110  1.1%
771  104  1.0%
Other values (168)  8738  87.4%

Minimum values

Value  Count  Frequency (%)
101  41  0.4%
103  35  0.4%
104  38  0.4%
105  44  0.4%
106  44  0.4%
107  36  0.4%
108  46  0.5%
109  51  0.5%
110  43  0.4%
111  46  0.5%

Maximum values

Value  Count  Frequency (%)
802  53  0.5%
801  38  0.4%
771  104  1.0%
769  53  0.5%
767  43  0.4%
765  141  1.4%
762  72  0.7%
759  72  0.7%
757  36  0.4%
756  38  0.4%

소속지사구분코드
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 7691
1: 1844
2: 465

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  7691  76.9%
1  1844  18.4%
2  465  4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value  Count  Frequency (%)
0  7691  76.9%
1  1844  18.4%
2  465  4.7%

대상자연령
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.047
Minimum: 10
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 20
Q1: 50
median: 60
Q3: 70
95-th percentile: 80
Maximum: 80
Range: 70
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 17.787676
Coefficient of variation (CV): 0.31180739
Kurtosis: -0.20970883
Mean: 57.047
Median Absolute Deviation (MAD): 10
Skewness: -0.61574988
Sum: 570470
Variance: 316.40143
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=8)]

Common Values

Value  Count  Frequency (%)
70  2048  20.5%
60  1973  19.7%
50  1803  18.0%
80  1776  17.8%
40  1336  13.4%
20  454  4.5%
30  394  3.9%
10  216  2.2%

Minimum values

Value  Count  Frequency (%)
10  216  2.2%
20  454  4.5%
30  394  3.9%
40  1336  13.4%
50  1803  18.0%
60  1973  19.7%
70  2048  20.5%
80  1776  17.8%

Maximum values

Value  Count  Frequency (%)
80  1776  17.8%
70  2048  20.5%
60  1973  19.7%
50  1803  18.0%
40  1336  13.4%
30  394  3.9%
20  454  4.5%
10  216  2.2%

건강검진대상유형코드
Categorical

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
A0: 1688
D5: 1532
D6: 1161
D1: 1151
D3: 1127
Other values (4): 3341

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D2
2nd row: D3
3rd row: D7
4th row: D3
5th row: D6

Common Values

Value  Count  Frequency (%)
A0  1688  16.9%
D5  1532  15.3%
D6  1161  11.6%
D1  1151  11.5%
D3  1127  11.3%
D4  1105  11.1%
D2  900  9.0%
A5  685  6.9%
D7  651  6.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value  Count  Frequency (%)
a0  1688  16.9%
d5  1532  15.3%
d6  1161  11.6%
d1  1151  11.5%
d3  1127  11.3%
d4  1105  11.1%
d2  900  9.0%
a5  685  6.9%
d7  651  6.5%

대상자인원수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5907
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7584.2882
Minimum: 1
Maximum: 142013
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 52
Q1: 524
median: 2176.5
Q3: 9103
95-th percentile: 33385.6
Maximum: 142013
Range: 142012
Interquartile range (IQR): 8579

Descriptive statistics

Standard deviation: 12877.772
Coefficient of variation (CV): 1.6979539
Kurtosis: 17.701979
Mean: 7584.2882
Median Absolute Deviation (MAD): 2029.5
Skewness: 3.4544721
Sum: 75842882
Variance: 1.6583701 × 10^8
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
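
Several of the statistics reported for 대상자인원수 are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks those relationships using the figures printed above. The dictionary keys are illustrative names, and the tolerance only allows for the rounding of the printed values.

```python
import math

# Figures copied from the 대상자인원수 statistics.
reported = {
    "minimum": 1,
    "maximum": 142013,
    "q1": 524,
    "q3": 9103,
    "mean": 7584.2882,
    "std": 12877.772,
    "range": 142012,
    "iqr": 8579,
    "cv": 1.6979539,
    "variance": 1.6583701e8,
}

derived = {
    "range": reported["maximum"] - reported["minimum"],
    "iqr": reported["q3"] - reported["q1"],
    "cv": reported["std"] / reported["mean"],
    "variance": reported["std"] ** 2,
}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.7g} reported={reported[name]:.7g} consistent={ok}")
```

The mean (7584.2882) sits far above the median (2176.5), which is in line with the strong positive skewness (3.4544721) reported for this variable.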

Common Values

Value  Count  Frequency (%)
1  53  0.5%
96  18  0.2%
9  17  0.2%
66  17  0.2%
82  17  0.2%
105  16  0.2%
13  15  0.1%
103  14  0.1%
39  14  0.1%
4  14  0.1%
Other values (5897)  9805  98.0%

Minimum values

Value  Count  Frequency (%)
1  53  0.5%
2  8  0.1%
3  12  0.1%
4  14  0.1%
5  5  0.1%
6  9  0.1%
7  13  0.1%
8  8  0.1%
9  17  0.2%
10  5  0.1%

Maximum values

Value  Count  Frequency (%)
142013  1  < 0.1%
141904  1  < 0.1%
141816  1  < 0.1%
141628  1  < 0.1%
128136  1  < 0.1%
128084  1  < 0.1%
127900  1  < 0.1%
118513  1  < 0.1%
118059  1  < 0.1%
115711  1  < 0.1%

수검자인원수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1996
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 472.4716
Minimum: 0
Maximum: 9774
Zeros: 264
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 35
median: 170
Q3: 614
95-th percentile: 1922
Maximum: 9774
Range: 9774
Interquartile range (IQR): 579

Descriptive statistics

Standard deviation: 703.46407
Coefficient of variation (CV): 1.4889023
Kurtosis: 10.585367
Mean: 472.4716
Median Absolute Deviation (MAD): 158
Skewness: 2.6485137
Sum: 4724716
Variance: 494861.7
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  264  2.6%
1  180  1.8%
2  134  1.3%
4  102  1.0%
3  99  1.0%
5  96  1.0%
11  93  0.9%
7  91  0.9%
6  91  0.9%
8  89  0.9%
Other values (1986)  8761  87.6%

Minimum values

Value  Count  Frequency (%)
0  264  2.6%
1  180  1.8%
2  134  1.3%
3  99  1.0%
4  102  1.0%
5  96  1.0%
6  91  0.9%
7  91  0.9%
8  89  0.9%
9  68  0.7%

Maximum values

Value  Count  Frequency (%)
9774  1  < 0.1%
6333  1  < 0.1%
5760  1  < 0.1%
5371  1  < 0.1%
5343  1  < 0.1%
5306  1  < 0.1%
5178  1  < 0.1%
5102  1  < 0.1%
4948  1  < 0.1%
4884  1  < 0.1%

수검율
Real number (ℝ)

ZEROS 

Distinct: 1766
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.978185
Minimum: 0
Maximum: 100
Zeros: 264
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.6
Q1: 4.57
median: 7.385
Q3: 10.4325
95-th percentile: 15.92
Maximum: 100
Range: 100
Interquartile range (IQR): 5.8625

Descriptive statistics

Standard deviation: 5.9923127
Coefficient of variation (CV): 0.75108721
Kurtosis: 96.621141
Mean: 7.978185
Median Absolute Deviation (MAD): 2.935
Skewness: 6.8195187
Sum: 79781.85
Variance: 35.907812
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0.0  264  2.6%
11.11  25  0.2%
9.09  22  0.2%
8.61  22  0.2%
5.66  22  0.2%
6.25  20  0.2%
9.5  20  0.2%
5.88  20  0.2%
7.89  20  0.2%
6.67  19  0.2%
Other values (1756)  9546  95.5%

Minimum values

Value  Count  Frequency (%)
0.0  264  2.6%
0.11  1  < 0.1%
0.13  1  < 0.1%
0.16  1  < 0.1%
0.43  2  < 0.1%
0.45  1  < 0.1%
0.54  2  < 0.1%
0.56  3  < 0.1%
0.61  2  < 0.1%
0.64  1  < 0.1%

Maximum values

Value  Count  Frequency (%)
100.0  17  0.2%
75.0  1  < 0.1%
66.67  2  < 0.1%
42.86  2  < 0.1%
33.33  7  0.1%
32.0  2  < 0.1%
31.76  1  < 0.1%
31.42  1  < 0.1%
31.27  1  < 0.1%
28.57  1  < 0.1%

Interactions

[Figure: 36 pairwise interaction plots of the numeric variables; images not preserved]

Correlations

[Figure: correlation heatmap]
Variable  센터구분코드  관리지사코드  소속지사구분코드  대상자연령  건강검진대상유형코드  대상자인원수  수검자인원수  수검율
센터구분코드  1.000  0.935  0.388  0.000  0.000  0.191  0.159  0.153
관리지사코드  0.935  1.000  0.469  0.000  0.000  0.255  0.248  0.141
소속지사구분코드  0.388  0.469  1.000  0.000  0.046  0.348  0.249  0.166
대상자연령  0.000  0.000  0.000  1.000  0.455  0.204  0.293  0.375
건강검진대상유형코드  0.000  0.000  0.046  0.455  1.000  0.486  0.304  0.394
대상자인원수  0.191  0.255  0.348  0.204  0.486  1.000  0.634  0.153
수검자인원수  0.159  0.248  0.249  0.293  0.304  0.634  1.000  0.107
수검율  0.153  0.141  0.166  0.375  0.394  0.153  0.107  1.000

[Figure: correlation heatmap]
Variable  건강검진대상유형코드  소속지사구분코드
건강검진대상유형코드  1.000  0.020
소속지사구분코드  0.020  1.000

[Figure: correlation heatmap]
Variable  센터구분코드  관리지사코드  대상자연령  대상자인원수  수검자인원수  수검율  소속지사구분코드  건강검진대상유형코드
센터구분코드  1.000  0.099  -0.008  0.004  0.009  0.022  0.174  0.000
관리지사코드  0.099  1.000  -0.001  -0.282  -0.275  0.058  0.319  0.000
대상자연령  -0.008  -0.001  1.000  -0.192  -0.198  0.042  0.000  0.243
대상자인원수  0.004  -0.282  -0.192  1.000  0.954  -0.127  0.164  0.174
수검자인원수  0.009  -0.275  -0.198  0.954  1.000  0.141  0.162  0.154
수검율  0.022  0.058  0.042  -0.127  0.141  1.000  0.104  0.204
소속지사구분코드  0.174  0.319  0.000  0.164  0.162  0.104  1.000  0.020
건강검진대상유형코드  0.000  0.000  0.243  0.174  0.154  0.204  0.020  1.000
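
The report does not state which coefficient each matrix uses. As an illustration only, the sketch below assumes a Spearman rank correlation and applies it to 대상자인원수 and 수검자인원수 from the first ten rows of the Sample section. With ten rows the result is not expected to match the 0.954 shown above for the full data; it only shows how a rank-based coefficient between the two counts could be computed. The helper names `ranks`, `pearson`, and `spearman` are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# 대상자인원수 and 수검자인원수 from the first ten Sample rows.
eligible = [8920, 2447, 1256, 9141, 1010, 401, 1232, 542, 1714, 39501]
screened = [392, 148, 33, 843, 122, 58, 22, 84, 210, 913]

rho = spearman(eligible, screened)
print(f"Spearman rho on 10 sample rows: {rho:.4f}")
```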

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  사업년도  센터구분코드  관리지사코드  소속지사구분코드  대상자연령  건강검진대상유형코드  대상자인원수  수검자인원수  수검율
51122  2020  6  311  0  70  D2  8920  392  4.39
27299  2020  5  560  0  40  D3  2447  148  6.05
89200  2020  1  134  0  60  D7  1256  33  2.63
21052  2020  2  205  0  60  D3  9141  843  9.22
65729  2020  6  309  0  70  D6  1010  122  12.08
74496  2020  2  262  0  70  D4  401  58  14.46
19911  2020  2  203  0  60  D7  1232  22  1.79
31664  2020  5  252  0  70  D4  542  84  15.5
88860  2020  6  328  0  60  D6  1714  210  12.25
28565  2020  6  306  0  50  D2  39501  913  2.31

(index)  사업년도  센터구분코드  관리지사코드  소속지사구분코드  대상자연령  건강검진대상유형코드  대상자인원수  수검자인원수  수검율
20920  2020  3  704  2  30  D5  239  18  7.53
53937  2020  1  418  0  70  D2  5564  266  4.78
61474  2020  2  753  0  80  D3  1710  41  2.4
36477  2020  6  318  0  80  D1  8155  568  6.97
66813  2020  6  232  0  60  D1  22741  2435  10.71
55408  2020  1  141  0  80  A0  10592  647  6.11
55681  2020  4  611  0  70  D7  278  11  3.96
65344  2020  6  326  0  70  D6  176  16  9.09
87061  2020  6  342  0  40  D5  22843  1751  7.67
6460  2020  3  705  1  20  A0  1125  8  0.71

Duplicate rows

Most frequently occurring

(index)  사업년도  센터구분코드  관리지사코드  소속지사구분코드  대상자연령  건강검진대상유형코드  대상자인원수  수검자인원수  수검율  # duplicates
72  2020  1  404  1  70  D7  43  2  4.65  4
97  2020  1  408  0  60  A5  88  13  14.77  4
134  2020  2  209  0  60  A5  306  34  11.11  4
270  2020  3  716  2  70  D6  82  8  9.76  4
329  2020  4  606  0  10  A0  1  0  0.0  4
412  2020  4  666  1  70  D7  21  2  9.52  4
514  2020  5  555  1  40  D1  1  0  0.0  4
574  2020  6  312  1  10  A0  13  0  0.0  4
612  2020  6  333  0  80  A5  466  16  3.43  4
9  2020  1  107  0  50  D7  434  11  2.53  3
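
A "# duplicates" column like the one above can be produced by counting identical rows. The sketch below shows one way to do this with `collections.Counter`. The row list is an illustrative reconstruction and not the real dataset: two rows from the table are repeated as often as their "# duplicates" value, and one row from the Sample section is added once. The report does not say whether its total of 626 counts distinct duplicated rows or surplus copies, so both figures are computed.

```python
from collections import Counter

# Column order: 사업년도, 센터구분코드, 관리지사코드, 소속지사구분코드, 대상자연령,
# 건강검진대상유형코드, 대상자인원수, 수검자인원수, 수검율
row_a = (2020, 4, 606, 0, 10, "A0", 1, 0, 0.0)        # listed with 4 duplicates
row_b = (2020, 1, 107, 0, 50, "D7", 434, 11, 2.53)    # listed with 3 duplicates
row_c = (2020, 6, 311, 0, 70, "D2", 8920, 392, 4.39)  # a Sample row, used once here

# Illustrative data only: multiplicities follow the table, the rest is assumed.
rows = [row_a] * 4 + [row_b] * 3 + [row_c]

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

n_groups = len(duplicated)                            # distinct rows occurring more than once
n_surplus = sum(n - 1 for n in duplicated.values())   # copies beyond the first occurrence

# Most frequently occurring rows first, as in the table.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
print(n_groups, n_surplus)
```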