Overview

Dataset statistics

Number of variables: 3
Number of observations: 122
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 27.1 B
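The two memory figures are related through the number of observations. The following is a quick consistency check, not the profiler's own computation; it assumes each figure was rounded independently to one decimal place, and the variable names are illustrative only.

```python
# Figures copied from the dataset statistics (already rounded by the report).
n_observations = 122
avg_record_bytes = 27.1

# Total size is roughly (average record size) x (number of observations).
total_bytes = avg_record_bytes * n_observations
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_kib:.1f} KiB")
```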

Variable types

Categorical: 1
Text: 1
Numeric: 1

Dataset

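Description:
o (Content) Records of how the target population for each health screening type was built
o (Target) Health insurance subscribers who are eligible for at least one health screening type in the given year
o (Variable layout) 1 사업년도 (business year), 2 분류명 (category name; target-population build stage), 3 건수 (count)
o (Data provision scope) The most recent one month within the range for which data exist (2022-12-01 to 2022-12-30); 6 or more rows cannot be provided
URL: https://www.data.go.kr/data/15121847/fileData.do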

Alerts

사업년도 (business year) has constant value "2023" (Constant)
건수 (count) has 27 (22.1%) zeros (Zeros)
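The two alerts can be reproduced with very little logic. The sketch below is a simplified illustration, not ydata-profiling's implementation; the helper names `constant_alert` and `zeros_alert` are hypothetical, and the non-zero counts are placeholders chosen only to match the shape of this report (122 rows, 27 zeros).

```python
def constant_alert(values):
    """Return True when a column holds exactly one distinct value."""
    return len(set(values)) == 1


def zeros_alert(values):
    """Return (count, percentage) of entries equal to zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100 * zeros / len(values)


# Stand-in columns: a constant year and 27 zero counts out of 122 rows.
year = ["2023"] * 122
counts = [0] * 27 + [1] * 95  # the 95 non-zero values are placeholders

is_constant = constant_alert(year)
n_zeros, pct_zeros = zeros_alert(counts)
print(is_constant, n_zeros, f"{pct_zeros:.1f}%")
```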

Reproduction

Analysis started: 2023-12-12 12:06:13.714973
Analysis finished: 2023-12-12 12:06:14.149699
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
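A report of this kind can be regenerated along the following lines. This is a minimal sketch under stated assumptions: the CSV file name and its encoding are placeholders (the source file is not named in this report), and the function name `build_report` is hypothetical. Only `pandas.read_csv`, `ProfileReport(df, title=...)` and `ProfileReport.to_file` are used.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset")
    report.to_file(out_path)
    return out_path


# Hypothetical usage; the file name and the cp949 encoding are assumptions.
# build_report("health_screening_targets.csv", encoding="cp949")
```

Matching the software version listed above (ydata-profiling v4.5.1) is likely to give the closest output, since defaults can differ between releases.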

Variables

사업년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
2023 | 122

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%
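"Distinct" and "Unique" measure different things here: distinct counts the different values present, while unique counts values that occur exactly once. A small sketch with a stand-in column of 122 identical entries shows why this variable has 1 distinct value but 0 unique values. This illustrates the definitions only and is not the profiler's code.

```python
from collections import Counter

# Stand-in for the 사업년도 column: 122 rows, all "2023".
values = ["2023"] * 122
counts = Counter(values)

distinct = len(counts)                              # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
distinct_pct = 100 * distinct / len(values)

print(distinct, unique, f"{distinct_pct:.1f}%")
```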

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 122 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023 | 122 | 100.0%

분류명 (category name)
Text

Distinct: 106
Distinct (%): 86.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 31
Median length: 25
Mean length: 7.5245902
Min length: 1
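The mean length is consistent with two other figures in this report: the total character count of this variable (918) divided by the number of observations (122). A short check, with illustrative variable names:

```python
total_characters = 918  # "Total characters" reported for this variable
n_observations = 122    # "Number of observations" from the dataset statistics

mean_length = total_characters / n_observations
print(f"{mean_length:.7f}")
```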

Characters and Unicode

Total characters: 918
Distinct characters: 148
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
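As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the first sample value of this variable; it covers categories only (scripts and blocks are not available in the standard library) and is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First sample value of the text column, normalized so that each Hangul
# syllable is a single precomposed code point.
text = unicodedata.normalize("NFC", "폐암 고위험군 등록")

# Two-letter general category per character,
# e.g. "Lo" (Other Letter) or "Zs" (Space Separator).
categories = Counter(unicodedata.category(ch) for ch in text)
print(dict(categories))
```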

Unique

Unique: 90
Unique (%): 73.8%

Sample

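1st row: 폐암 고위험군 등록 (lung cancer high-risk group registration)
2nd row: 대상자 구축현황 (target population build status)
3rd row: 자격건수 (eligibility count)
4th row: 자격건수 (eligibility count)
5th row: 검진건수 (screening count)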
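Value | Count | Frequency (%)
대상자 (target persons) | 8 | 3.9%
반영 (applied) | 7 | 3.4%
산정특례 (special calculation exemption) | 5 | 2.4%
고위험군 (high-risk group) | 3 | 1.5%
구축 (build) | 3 | 1.5%
의료급여 (medical aid) | 3 | 1.5%
제외처리 (exclusion processing) | 3 | 1.5%
자격 (eligibility) | 3 | 1.5%
직장가입자 (employee-insured subscriber) | 3 | 1.5%
검진대상자 (screening target) | 2 | 1.0%
Other values (137) | 166 | 80.6%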

Most occurring characters

ValueCountFrequency (%)
85
 
9.3%
42
 
4.6%
38
 
4.1%
0 33
 
3.6%
( 22
 
2.4%
) 22
 
2.4%
20
 
2.2%
20
 
2.2%
3 18
 
2.0%
2 18
 
2.0%
Other values (138) 600
65.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 622 | 67.8%
Decimal Number | 124 | 13.5%
Space Separator | 85 | 9.3%
Uppercase Letter | 35 | 3.8%
Open Punctuation | 22 | 2.4%
Close Punctuation | 22 | 2.4%
Other Punctuation | 5 | 0.5%
Dash Punctuation | 2 | 0.2%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
42
 
6.8%
38
 
6.1%
20
 
3.2%
20
 
3.2%
17
 
2.7%
17
 
2.7%
14
 
2.3%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (115) 415
66.7%
Decimal Number
Value | Count | Frequency (%)
0 | 33 | 26.6%
3 | 18 | 14.5%
2 | 18 | 14.5%
1 | 16 | 12.9%
7 | 11 | 8.9%
4 | 8 | 6.5%
6 | 6 | 4.8%
5 | 6 | 4.8%
8 | 5 | 4.0%
9 | 3 | 2.4%
Uppercase Letter
Value | Count | Frequency (%)
A | 18 | 51.4%
B | 6 | 17.1%
C | 5 | 14.3%
T | 3 | 8.6%
H | 3 | 8.6%
Other Punctuation
Value | Count | Frequency (%)
. | 3 | 60.0%
/ | 1 | 20.0%
! | 1 | 20.0%
Space Separator
Value | Count | Frequency (%)
(space) | 85 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 22 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 22 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%
Math Symbol
Value | Count | Frequency (%)
= | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 622 | 67.8%
Common | 261 | 28.4%
Latin | 35 | 3.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
42
 
6.8%
38
 
6.1%
20
 
3.2%
20
 
3.2%
17
 
2.7%
17
 
2.7%
14
 
2.3%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (115) 415
66.7%
Common
Value | Count | Frequency (%)
(space) | 85 | 32.6%
0 | 33 | 12.6%
( | 22 | 8.4%
) | 22 | 8.4%
3 | 18 | 6.9%
2 | 18 | 6.9%
1 | 16 | 6.1%
7 | 11 | 4.2%
4 | 8 | 3.1%
6 | 6 | 2.3%
Other values (8) | 22 | 8.4%
Latin
Value | Count | Frequency (%)
A | 18 | 51.4%
B | 6 | 17.1%
C | 5 | 14.3%
T | 3 | 8.6%
H | 3 | 8.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 622 | 67.8%
ASCII | 296 | 32.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 85 | 28.7%
0 | 33 | 11.1%
( | 22 | 7.4%
) | 22 | 7.4%
3 | 18 | 6.1%
2 | 18 | 6.1%
A | 18 | 6.1%
1 | 16 | 5.4%
7 | 11 | 3.7%
4 | 8 | 2.7%
Other values (13) | 45 | 15.2%
Hangul
ValueCountFrequency (%)
42
 
6.8%
38
 
6.1%
20
 
3.2%
20
 
3.2%
17
 
2.7%
17
 
2.7%
14
 
2.3%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (115) 415
66.7%

건수
Real number (ℝ)

ZEROS 

Distinct: 89
Distinct (%): 73.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4016375.2
Minimum: 0
Maximum: 60179692
Zeros: 27
Zeros (%): 22.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 14.5
Median: 85538
Q3: 2541822.2
95-th percentile: 22830917
Maximum: 60179692
Range: 60179692
Interquartile range (IQR): 2541807.8
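Range and IQR follow directly from the other quantile figures: range is maximum minus minimum, and IQR is Q3 minus Q1. The check below uses the values as displayed; because those are already rounded, the recomputed IQR can differ from the reported 2541807.8 in the last digit.

```python
# Quantile figures as displayed in the report (already rounded).
minimum, maximum = 0, 60179692
q1, q3 = 14.5, 2541822.2

value_range = maximum - minimum
iqr = q3 - q1  # interquartile range

print(value_range, round(iqr, 1))
```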

Descriptive statistics

Standard deviation: 9485584
Coefficient of variation (CV): 2.3617275
Kurtosis: 14.15388
Mean: 4016375.2
Median Absolute Deviation (MAD): 85538
Skewness: 3.5500346
Sum: 4.8999778 × 10^8
Variance: 8.9976304 × 10^13
Monotonicity: Not monotonic
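Several of these figures can be cross-checked against each other: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations (122). The sketch below uses the rounded values shown in this report, so agreement is only expected to within rounding.

```python
n = 122           # number of observations
mean = 4016375.2  # reported mean
std = 9485584     # reported standard deviation

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance as the square of the standard deviation
total = mean * n     # sum of all values

print(f"{cv:.5f} {variance:.4e} {total:.4e}")
```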
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 27 | 22.1%
41876269 | 2 | 1.6%
22719404 | 2 | 1.6%
22836786 | 2 | 1.6%
25244 | 2 | 1.6%
399974 | 2 | 1.6%
3 | 2 | 1.6%
1862 | 2 | 1.6%
4636 | 1 | 0.8%
51758 | 1 | 0.8%
Other values (79) | 79 | 64.8%

Value | Count | Frequency (%)
0 | 27 | 22.1%
3 | 2 | 1.6%
13 | 1 | 0.8%
14 | 1 | 0.8%
16 | 1 | 0.8%
31 | 1 | 0.8%
33 | 1 | 0.8%
37 | 1 | 0.8%
90 | 1 | 0.8%
110 | 1 | 0.8%

Value | Count | Frequency (%)
60179692 | 1 | 0.8%
41876269 | 2 | 1.6%
36386995 | 1 | 0.8%
27843452 | 1 | 0.8%
22836786 | 2 | 1.6%
22719404 | 2 | 1.6%
18835976 | 1 | 0.8%
12318830 | 1 | 0.8%
12259273 | 1 | 0.8%
11691233 | 1 | 0.8%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

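Index | 사업년도 | 분류명 | 건수
0 | 2023 | 폐암 고위험군 등록 | 522339
1 | 2023 | 대상자 구축현황 | 110
2 | 2023 | 자격건수 | 22719404
3 | 2023 | 자격건수 | 22836786
4 | 2023 | 검진건수 | 22719404
5 | 2023 | 검진건수 | 22836786
6 | 2023 | 의료급여세대원 | 80110
7 | 2023 | 의료급여세대주 | 579342
8 | 2023 | 지역세대원 | 3522777
9 | 2023 | 지역세대주 | 4261900

Index | 사업년도 | 분류명 | 건수
112 | 2023 | 60대 | 3999079
113 | 2023 | 40대 | 3999754
114 | 2023 | 70대 | 1884900
115 | 2023 | 30대 | 2548124
116 | 2023 | 20대 | 2288570
117 | 2023 | 50대 | 4552086
118 | 2023 | 30대 | 3091293
119 | 2023 | 70대 | 2195481
120 | 2023 | 20대 | 2170974
121 | 2023 | 60대 | 3897385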