Overview

Dataset statistics

Number of variables: 4
Number of observations: 45
Missing cells: 37
Missing cells (%): 20.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.5 KiB
Average record size in memory: 34.9 B
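
The missing-cell percentage follows directly from the counts above. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 45
missing_cells = 37

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```

All 37 missing cells sit in a single column, which is why the same count reappears in the Alerts section.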

Variable types

Text: 2
Categorical: 2

Dataset

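Description: Data on the national free immunization program, covering the types of nationally required vaccinations, vaccination timing and target recipients, booster doses, and cost.
URL: https://www.data.go.kr/data/15026410/fileData.do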

Alerts

비용 (cost) has constant value "무료" (free) [Constant]
접종대상자(추가접종) (target recipients, booster vaccination) has 37 (82.2%) missing values [Missing]
구분 (vaccine category) has unique values [Unique]
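
To illustrate what these three alerts mean, here is a minimal sketch that derives similar labels from the first ten rows shown in the Sample section. The function `column_alerts` and its rules are assumptions made for illustration; they are not ydata-profiling's actual alert logic or thresholds.

```python
from typing import List, Optional

# First ten rows from the Sample section; None marks a truly missing cell.
# The literal string "<NA>" in 접종대상자(기초접종) is kept as an ordinary value,
# which matches how the report counts it (Missing: 0 for that column).
ROWS = [
    ("BCG(피내용)", "생후4주", None, "무료"),
    ("B형간염(1차)", "출생시", None, "무료"),
    ("B형간염(2차)", "생후1~2개월", None, "무료"),
    ("B형간염(3차)", "생후6~18개월", None, "무료"),
    ("디프테리아/파상풍/백일해(1차)", "생후2개월", None, "무료"),
    ("디프테리아/파상풍/백일해(2차)", "생후4개월", None, "무료"),
    ("디프테리아/파상풍/백일해(3차)", "생후6개월", None, "무료"),
    ("디프테리아/파상풍/백일해(4차)", "<NA>", "생후15~18개월", "무료"),
    ("디프테리아/파상풍/백일해(5차)", "<NA>", "만4~6세", "무료"),
    ("디프테리아/파상풍/백일해/폴리오(1차)", "생후2개월", None, "무료"),
]
COLUMNS = ["구분", "접종대상자(기초접종)", "접종대상자(추가접종)", "비용"]


def column_alerts(values: List[Optional[str]]) -> List[str]:
    """Return illustrative alert labels for one column (hypothetical rules)."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    distinct = set(present)
    alerts = []
    # Constant: no missing cells and a single distinct value.
    if n_missing == 0 and len(distinct) == 1:
        alerts.append("Constant")
    # Unique: no missing cells and every value differs from the others.
    if n_missing == 0 and len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")
    # Missing: at least one missing cell, reported with its share.
    if n_missing > 0:
        alerts.append(f"Missing ({100 * n_missing / len(values):.1f}%)")
    return alerts


alerts_by_column = {
    name: column_alerts([row[i] for row in ROWS])
    for i, name in enumerate(COLUMNS)
}
for name, alerts in alerts_by_column.items():
    print(name, alerts)
```

On this ten-row excerpt the missing share differs from the 82.2% reported for all 45 rows, since only a subset of the data is used.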

Reproduction

Analysis started: 2023-12-12 14:16:47.186598
Analysis finished: 2023-12-12 14:16:47.788274
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

UNIQUE 

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 26
Median length: 17
Mean length: 12.688889
Min length: 2

Characters and Unicode

Total characters: 571
Distinct characters: 62
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
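
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module over the five sample values listed for this variable. The helper name `category_counts` is hypothetical, and this is not necessarily how the report itself computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample values listed for the 구분 variable.
SAMPLE = [
    "BCG(피내용)",
    "B형간염(1차)",
    "B형간염(2차)",
    "B형간염(3차)",
    "디프테리아/파상풍/백일해(1차)",
]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(SAMPLE)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Po` is Other Punctuation.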

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: BCG(피내용)
2nd row: B형간염(1차)
3rd row: B형간염(2차)
4th row: B형간염(3차)
5th row: 디프테리아/파상풍/백일해(1차)

Value | Count | Frequency (%)
bcg(피내용 | 1 | 2.2%
일본뇌염(사백신)(2차 | 1 | 2.2%
일본뇌염(사백신)(4차 | 1 | 2.2%
일본뇌염(사백신)(5차 | 1 | 2.2%
일본뇌염(생백신)(1차 | 1 | 2.2%
일본뇌염(생백신)(2차 | 1 | 2.2%
수두 | 1 | 2.2%
뇌수막염(1차 | 1 | 2.2%
뇌수막염(2차 | 1 | 2.2%
뇌수막염(3차 | 1 | 2.2%
Other values (35) | 35 | 77.8%

Most occurring characters

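Value | Count | Frequency (%)
( | 48 | 8.4%
) | 48 | 8.4%
/ | 41 | 7.2%
[character lost] | 40 | 7.0%
[character lost] | 25 | 4.4%
[character lost] | 21 | 3.7%
[character lost] | 20 | 3.5%
[character lost] | 20 | 3.5%
[character lost] | 16 | 2.8%
[character lost] | 14 | 2.5%
Other values (52) | 278 | 48.7%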

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 386 | 67.6%
Open Punctuation | 48 | 8.4%
Close Punctuation | 48 | 8.4%
Other Punctuation | 41 | 7.2%
Decimal Number | 40 | 7.0%
Uppercase Letter | 8 | 1.4%

Most frequent character per category

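Other Letter
Value | Count | Frequency (%)
[character lost] | 40 | 10.4%
[character lost] | 25 | 6.5%
[character lost] | 21 | 5.4%
[character lost] | 20 | 5.2%
[character lost] | 20 | 5.2%
[character lost] | 16 | 4.1%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
Other values (40) | 188 | 48.7%

Decimal Number
Value | Count | Frequency (%)
2 | 12 | 30.0%
1 | 12 | 30.0%
3 | 8 | 20.0%
4 | 6 | 15.0%
5 | 2 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 4 | 50.0%
A | 2 | 25.0%
G | 1 | 12.5%
C | 1 | 12.5%

Open Punctuation
Value | Count | Frequency (%)
( | 48 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 48 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 41 | 100.0%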

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 386 | 67.6%
Common | 177 | 31.0%
Latin | 8 | 1.4%

Most frequent character per script

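Hangul
Value | Count | Frequency (%)
[character lost] | 40 | 10.4%
[character lost] | 25 | 6.5%
[character lost] | 21 | 5.4%
[character lost] | 20 | 5.2%
[character lost] | 20 | 5.2%
[character lost] | 16 | 4.1%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
Other values (40) | 188 | 48.7%

Common
Value | Count | Frequency (%)
( | 48 | 27.1%
) | 48 | 27.1%
/ | 41 | 23.2%
2 | 12 | 6.8%
1 | 12 | 6.8%
3 | 8 | 4.5%
4 | 6 | 3.4%
5 | 2 | 1.1%

Latin
Value | Count | Frequency (%)
B | 4 | 50.0%
A | 2 | 25.0%
G | 1 | 12.5%
C | 1 | 12.5%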

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 386 | 67.6%
ASCII | 185 | 32.4%

Most frequent character per block

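ASCII
Value | Count | Frequency (%)
( | 48 | 25.9%
) | 48 | 25.9%
/ | 41 | 22.2%
2 | 12 | 6.5%
1 | 12 | 6.5%
3 | 8 | 4.3%
4 | 6 | 3.2%
B | 4 | 2.2%
A | 2 | 1.1%
5 | 2 | 1.1%
Other values (2) | 2 | 1.1%

Hangul
Value | Count | Frequency (%)
[character lost] | 40 | 10.4%
[character lost] | 25 | 6.5%
[character lost] | 21 | 5.4%
[character lost] | 20 | 5.2%
[character lost] | 20 | 5.2%
[character lost] | 16 | 4.1%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
[character lost] | 14 | 3.6%
Other values (40) | 188 | 48.7%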

접종대상자(기초접종)
Categorical

Distinct: 19
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B
[Figure: category bar chart] <NA>, 생후4개월, 생후2개월, 생후6개월, 생후12~15개월, Other values (14): 17

Length

Max length: 12
Median length: 11
Mean length: 5.9111111
Min length: 3

Unique

Unique: 11
Unique (%): 24.4%

Sample

1st row: 생후4주
2nd row: 출생시
3rd row: 생후1~2개월
4th row: 생후6~18개월
5th row: 생후2개월

Common Values

Value | Count | Frequency (%)
<NA> | 8 | 17.8%
생후4개월 | 6 | 13.3%
생후2개월 | 6 | 13.3%
생후6개월 | 5 | 11.1%
생후12~15개월 | 3 | 6.7%
생후6~18개월 | 2 | 4.4%
12~23개월 | 2 | 4.4%
24~35개월 | 2 | 4.4%
만4~6세 | 1 | 2.2%
생후1~2개월 | 1 | 2.2%
Other values (9) | 9 | 20.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
na | 8 | 17.4%
생후2개월 | 6 | 13.0%
생후4개월 | 6 | 13.0%
생후6개월 | 5 | 10.9%
생후12~15개월 | 3 | 6.5%
생후6~18개월 | 2 | 4.3%
12~23개월 | 2 | 4.3%
24~35개월 | 2 | 4.3%
생후 | 1 | 2.2%
생후4주 | 1 | 2.2%
Other values (10) | 10 | 21.7%

접종대상자(추가접종)
Text

Distinct: 6
Distinct (%): 75.0%
Missing: 37
Missing (%): 82.2%
Memory size: 492.0 B

Length

Max length: 9
Median length: 6.5
Mean length: 5.5
Min length: 3

Characters and Unicode

Total characters: 44
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 62.5%

Sample

1st row: 생후15~18개월
2nd row: 만4~6세
3rd row: 만4~6세
4th row: 만4~6세
5th row: 만6세

Value | Count | Frequency (%)
만4~6세 | 3 | 37.5%
생후15~18개월 | 1 | 12.5%
만6세 | 1 | 12.5%
만12세 | 1 | 12.5%
만11~12세 | 1 | 12.5%
만11세이상 | 1 | 12.5%

Most occurring characters

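Value | Count | Frequency (%)
1 | 8 | 18.2%
[character lost] | 7 | 15.9%
[character lost] | 7 | 15.9%
~ | 5 | 11.4%
6 | 4 | 9.1%
4 | 3 | 6.8%
2 | 2 | 4.5%
[character lost] | 1 | 2.3%
[character lost] | 1 | 2.3%
5 | 1 | 2.3%
Other values (5) | 5 | 11.4%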

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 20 | 45.5%
Decimal Number | 19 | 43.2%
Math Symbol | 5 | 11.4%

Most frequent character per category

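Other Letter
Value | Count | Frequency (%)
[character lost] | 7 | 35.0%
[character lost] | 7 | 35.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%

Decimal Number
Value | Count | Frequency (%)
1 | 8 | 42.1%
6 | 4 | 21.1%
4 | 3 | 15.8%
2 | 2 | 10.5%
5 | 1 | 5.3%
8 | 1 | 5.3%

Math Symbol
Value | Count | Frequency (%)
~ | 5 | 100.0%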

Most occurring scripts

Value | Count | Frequency (%)
Common | 24 | 54.5%
Hangul | 20 | 45.5%

Most frequent character per script

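Hangul
Value | Count | Frequency (%)
[character lost] | 7 | 35.0%
[character lost] | 7 | 35.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%

Common
Value | Count | Frequency (%)
1 | 8 | 33.3%
~ | 5 | 20.8%
6 | 4 | 16.7%
4 | 3 | 12.5%
2 | 2 | 8.3%
5 | 1 | 4.2%
8 | 1 | 4.2%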

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24 | 54.5%
Hangul | 20 | 45.5%

Most frequent character per block

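ASCII
Value | Count | Frequency (%)
1 | 8 | 33.3%
~ | 5 | 20.8%
6 | 4 | 16.7%
4 | 3 | 12.5%
2 | 2 | 8.3%
5 | 1 | 4.2%
8 | 1 | 4.2%

Hangul
Value | Count | Frequency (%)
[character lost] | 7 | 35.0%
[character lost] | 7 | 35.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%
[character lost] | 1 | 5.0%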

비용
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B
[Figure: category bar chart] 무료: 45

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무료
2nd row: 무료
3rd row: 무료
4th row: 무료
5th row: 무료

Common Values

Value | Count | Frequency (%)
무료 | 45 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
무료 | 45 | 100.0%

Correlations

Variable | 구분 | 접종대상자(기초접종) | 접종대상자(추가접종)
구분 | 1.000 | 1.000 | 1.000
접종대상자(기초접종) | 1.000 | 1.000 | NaN
접종대상자(추가접종) | 1.000 | NaN | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
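
As a rough text-only stand-in for the nullity matrix, the sketch below marks each cell of the last ten rows from the Sample section as present (`#`) or missing (`.`). The function `nullity_matrix` is a hypothetical helper, not part of ydata-profiling, and `None` is assumed to mark a missing cell.

```python
# Last ten rows (35 to 44) from the Sample section; None marks a missing cell.
# The literal string "<NA>" in the second column is kept as an ordinary value,
# consistent with the report listing 0 missing values for that column.
ROWS = [
    ("폐렴구균(2차)", "생후4개월", None, "무료"),
    ("폐렴구균(3차)", "생후6개월", None, "무료"),
    ("폐렴구균(4차)", "생후12~15개월", None, "무료"),
    ("A형간염(1차)", "생후12~23개월", None, "무료"),
    ("A형간염(2차)", "생후 18개월", None, "무료"),
    ("성인용디프테리아/파상풍", "<NA>", "만11~12세", "무료"),
    ("성인용디프테리아/파상풍/백일해", "<NA>", "만11세이상", "무료"),
    ("인유두종바이러스(1차)", "11~12세", None, "무료"),
    ("인유두종바이러스(2차)", "11~12세(+6개월)", None, "무료"),
    ("인플루엔자", "생후6개월~13세이하", None, "무료"),
]


def nullity_matrix(rows):
    """Return one string per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("." if cell is None else "#" for cell in row) for row in rows]


matrix = nullity_matrix(ROWS)
for index, line in enumerate(matrix, start=35):
    print(index, line)

# Total number of missing cells in this excerpt.
missing_total = sum(line.count(".") for line in matrix)
print("Missing cells in excerpt:", missing_total)
```

In this excerpt every gap falls in the third column, 접종대상자(추가접종), which is the only column the report flags as having missing values.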

Sample

First rows

Index | 구분 | 접종대상자(기초접종) | 접종대상자(추가접종) | 비용
0 | BCG(피내용) | 생후4주 | <NA> | 무료
1 | B형간염(1차) | 출생시 | <NA> | 무료
2 | B형간염(2차) | 생후1~2개월 | <NA> | 무료
3 | B형간염(3차) | 생후6~18개월 | <NA> | 무료
4 | 디프테리아/파상풍/백일해(1차) | 생후2개월 | <NA> | 무료
5 | 디프테리아/파상풍/백일해(2차) | 생후4개월 | <NA> | 무료
6 | 디프테리아/파상풍/백일해(3차) | 생후6개월 | <NA> | 무료
7 | 디프테리아/파상풍/백일해(4차) | <NA> | 생후15~18개월 | 무료
8 | 디프테리아/파상풍/백일해(5차) | <NA> | 만4~6세 | 무료
9 | 디프테리아/파상풍/백일해/폴리오(1차) | 생후2개월 | <NA> | 무료

Last rows

Index | 구분 | 접종대상자(기초접종) | 접종대상자(추가접종) | 비용
35 | 폐렴구균(2차) | 생후4개월 | <NA> | 무료
36 | 폐렴구균(3차) | 생후6개월 | <NA> | 무료
37 | 폐렴구균(4차) | 생후12~15개월 | <NA> | 무료
38 | A형간염(1차) | 생후12~23개월 | <NA> | 무료
39 | A형간염(2차) | 생후 18개월 | <NA> | 무료
40 | 성인용디프테리아/파상풍 | <NA> | 만11~12세 | 무료
41 | 성인용디프테리아/파상풍/백일해 | <NA> | 만11세이상 | 무료
42 | 인유두종바이러스(1차) | 11~12세 | <NA> | 무료
43 | 인유두종바이러스(2차) | 11~12세(+6개월) | <NA> | 무료
44 | 인플루엔자 | 생후6개월~13세이하 | <NA> | 무료