Overview

Dataset statistics

Number of variables: 4
Number of observations: 213
Missing cells: 11
Missing cells (%): 1.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 KiB
Average record size in memory: 34.6 B
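The percentage and per-record figures above can be checked from the raw counts. The sketch below assumes the missing-cell share is taken over all cells (rows times columns) and that the average record size is total memory divided by the row count; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 213
missing_cells = 11
total_memory_kib = 7.2

# Assumption: the missing-cell share is taken over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size is total memory divided by the row count.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures reported above.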

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

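Description: Provides statistics on infectious disease occurrences in Jeju. Data provider: Jeju Special Self-Governing Province (제주특별자치도)
Author: Future Growth Division (미래성장과), Jeju Special Self-Governing Province
URL: https://www.jejudatahub.net/data/view/data/896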

Alerts

발생 건수 (number of cases) has 11 (5.2%) missing values [Missing]
발생 건수 (number of cases) has 113 (53.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-11 20:10:53.252670
Analysis finished: 2023-12-11 20:10:53.720656
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
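A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example, not the configuration actually used for this report: the helper name `build_profile`, the CSV path passed to it, the report title, and the output file name are all assumptions.

```python
def build_profile(csv_path, output_html="report.html"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is illustrative; the original report's settings are unknown.
    report = ProfileReport(df, title="Jeju infectious disease statistics")
    report.to_file(output_html)
    return report
```

A call such as `build_profile("jeju_infectious_disease.csv")` would write `report.html`, assuming a local copy of the dataset saved under that hypothetical file name.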

Variables

기준 연도 (reference year)
Categorical

Distinct: 3
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
2017: 71
2018: 71
2019: 71

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value | Count | Frequency (%)
2017 | 71 | 33.3%
2018 | 71 | 33.3%
2019 | 71 | 33.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
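
법정감염군 대분류 (legally designated infectious disease group, major category)
Categorical

Distinct: 4
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
제4군 (Group 4): 75
제3군 (Group 3): 72
제2군 (Group 2): 45
제1군 (Group 1): 21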

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제1군
2nd row: 제1군
3rd row: 제1군
4th row: 제1군
5th row: 제1군

Common Values

Value | Count | Frequency (%)
제4군 | 75 | 35.2%
제3군 | 72 | 33.8%
제2군 | 45 | 21.1%
제1군 | 21 | 9.9%
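The frequency column is each count divided by the number of observations. A small sketch of that calculation, using the counts listed above (the variable names are illustrative):

```python
from collections import Counter

# Counts copied from the Common Values table above.
counts = Counter({"제4군": 75, "제3군": 72, "제2군": 45, "제1군": 21})
total = sum(counts.values())  # equals the number of observations

# most_common() sorts by descending count, matching the report's ordering.
frequencies = {
    value: round(100 * count / total, 1)
    for value, count in counts.most_common()
}

for value, pct in frequencies.items():
    print(f"{value}: {pct}%")
```

The counts add up to 213, so no rows of this variable fall outside the four groups.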

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

법정감염군 소분류 (legally designated infectious disease group, sub-category)
Text

Distinct: 68
Distinct (%): 31.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 22
Median length: 14
Mean length: 5.9295775
Min length: 2

Characters and Unicode

Total characters: 1263
Distinct characters: 157
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
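
As an illustration of these character properties, the sketch below classifies the characters of one disease name with Python's standard `unicodedata` module. The exact spelling of the string is an assumption (the word table shows the acronym lowercased, while the uppercase-letter counts suggest the data uses uppercase). The standard library exposes general categories but not scripts, so the code point name is used as a rough stand-in for detecting Hangul.

```python
import unicodedata
from collections import Counter

# One value of this variable; the uppercase acronym is an assumption
# (the word table in the report shows it lowercased).
text = "반코마이신내성황색포도알균(VRSA)감염증"

# General category per code point: Lo = Other Letter, Lu = Uppercase Letter,
# Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for ch in text)

# The standard library has no script property, so the code point name is
# used here as a rough stand-in for detecting Hangul.
hangul = sum(unicodedata.name(ch, "").startswith("HANGUL") for ch in text)

print(dict(categories))
print("Hangul characters:", hangul)
```

With this spelling the string is 22 characters long, which agrees with the maximum length reported for the variable.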

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 소계 (subtotal)
2nd row: 콜레라 (cholera)
3rd row: 장티푸스 (typhoid fever)
4th row: 파라티푸스 (paratyphoid fever)
5th row: 세균성이질 (shigellosis)
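Value | Count | Frequency (%)
소계 (subtotal) | 12 | 5.6%
반코마이신내성황색포도알균(vrsa)감염증 (vancomycin-resistant Staphylococcus aureus infection) | 3 | 1.4%
크로이츠펠트야콥병(cjd (Creutzfeldt-Jakob disease) | 3 | 1.4%
결핵 (tuberculosis) | 3 | 1.4%
한센병 (Hansen's disease) | 3 | 1.4%
후천성면역결핍증 (acquired immunodeficiency syndrome) | 3 | 1.4%
c형간염 (hepatitis C) | 3 | 1.4%
콜레라 (cholera) | 3 | 1.4%
페스트 (plague) | 3 | 1.4%
백일해 (pertussis) | 3 | 1.4%
Other values (58) | 174 | 81.7%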

Most occurring characters

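Value | Count | Frequency (%)
[Hangul character lost] | 69 | 5.5%
[Hangul character lost] | 45 | 3.6%
[Hangul character lost] | 39 | 3.1%
) | 30 | 2.4%
[Hangul character lost] | 30 | 2.4%
[Hangul character lost] | 30 | 2.4%
( | 30 | 2.4%
[Hangul character lost] | 21 | 1.7%
[Hangul character lost] | 21 | 1.7%
[Hangul character lost] | 21 | 1.7%
Other values (147) | 927 | 73.4%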

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1137 | 90.0%
Uppercase Letter | 57 | 4.5%
Close Punctuation | 30 | 2.4%
Open Punctuation | 30 | 2.4%
Decimal Number | 6 | 0.5%
Lowercase Letter | 3 | 0.2%

Most frequent character per category

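Other Letter
Value | Count | Frequency (%)
[Hangul character lost] | 69 | 6.1%
[Hangul character lost] | 45 | 4.0%
[Hangul character lost] | 39 | 3.4%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
Other values (133) | 819 | 72.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 9 | 15.8%
B | 9 | 15.8%
R | 9 | 15.8%
C | 9 | 15.8%
S | 9 | 15.8%
V | 3 | 5.3%
D | 3 | 5.3%
J | 3 | 5.3%
E | 3 | 5.3%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 50.0%
1 | 3 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 30 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 30 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
b | 3 | 100.0%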

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1137 | 90.0%
Common | 66 | 5.2%
Latin | 60 | 4.8%

Most frequent character per script

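Hangul
Value | Count | Frequency (%)
[Hangul character lost] | 69 | 6.1%
[Hangul character lost] | 45 | 4.0%
[Hangul character lost] | 39 | 3.4%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
Other values (133) | 819 | 72.0%

Latin
Value | Count | Frequency (%)
A | 9 | 15.0%
B | 9 | 15.0%
R | 9 | 15.0%
C | 9 | 15.0%
S | 9 | 15.0%
V | 3 | 5.0%
D | 3 | 5.0%
J | 3 | 5.0%
E | 3 | 5.0%
b | 3 | 5.0%

Common
Value | Count | Frequency (%)
) | 30 | 45.5%
( | 30 | 45.5%
2 | 3 | 4.5%
1 | 3 | 4.5%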

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1137 | 90.0%
ASCII | 126 | 10.0%

Most frequent character per block

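Hangul
Value | Count | Frequency (%)
[Hangul character lost] | 69 | 6.1%
[Hangul character lost] | 45 | 4.0%
[Hangul character lost] | 39 | 3.4%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 30 | 2.6%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
[Hangul character lost] | 21 | 1.8%
Other values (133) | 819 | 72.0%

ASCII
Value | Count | Frequency (%)
) | 30 | 23.8%
( | 30 | 23.8%
A | 9 | 7.1%
B | 9 | 7.1%
R | 9 | 7.1%
C | 9 | 7.1%
S | 9 | 7.1%
V | 3 | 2.4%
D | 3 | 2.4%
J | 3 | 2.4%
Other values (4) | 12 | 9.5%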

발생 건수 (number of cases)
Real number (ℝ)

Alerts: MISSING, ZEROS

Distinct: 52
Distinct (%): 25.7%
Missing: 11
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.90099
Minimum: 0
Maximum: 3579
Zeros: 113
Zeros (%): 53.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 8
95-th percentile: 372.45
Maximum: 3579
Range: 3579
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 423.62869
Coefficient of variation (CV): 4.1984592
Kurtosis: 40.084077
Mean: 100.90099
Median Absolute Deviation (MAD): 0
Skewness: 5.9972996
Sum: 20382
Variance: 179461.26
Monotonicity: Not monotonic
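
Several of these figures are linked by simple identities: the mean is the sum over the non-missing values, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported numbers only, assuming the missing rows are excluded from the mean; the variable names are illustrative.

```python
# Values copied from the report for 발생 건수.
n_observations = 213
n_missing = 11
total = 20382            # Sum
std_dev = 423.62869      # Standard deviation

# Assumption: summary statistics are computed over non-missing values only.
n_valid = n_observations - n_missing
mean = total / n_valid

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# Variance is the square of the standard deviation.
variance = std_dev ** 2

print(f"n_valid={n_valid}, mean={mean:.5f}, cv={cv:.5f}, variance={variance:.2f}")
```

The recomputed values agree with the report up to rounding of the printed standard deviation. A CV above 4 together with a median of 0 reflects a distribution dominated by zeros with a few very large counts.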
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 113 | 53.1%
1 | 14 | 6.6%
2 | 7 | 3.3%
3 | 7 | 3.3%
4 | 6 | 2.8%
15 | 4 | 1.9%
8 | 3 | 1.4%
5 | 2 | 0.9%
14 | 2 | 0.9%
23 | 2 | 0.9%
Other values (42) | 42 | 19.7%
(Missing) | 11 | 5.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 113 | 53.1%
1 | 14 | 6.6%
2 | 7 | 3.3%
3 | 7 | 3.3%
4 | 6 | 2.8%
5 | 2 | 0.9%
7 | 1 | 0.5%
8 | 3 | 1.4%
9 | 1 | 0.5%
10 | 1 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
3579 | 1 | 0.5%
3241 | 1 | 0.5%
1860 | 1 | 0.5%
1771 | 1 | 0.5%
1552 | 1 | 0.5%
1528 | 1 | 0.5%
1018 | 1 | 0.5%
909 | 1 | 0.5%
848 | 1 | 0.5%
380 | 1 | 0.5%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 기준 연도 | 법정감염군 대분류 | 법정감염군 소분류 | 발생 건수
기준 연도 | 1.000 | 0.000 | 0.000 | 0.037
법정감염군 대분류 | 0.000 | 1.000 | 0.998 | 0.286
법정감염군 소분류 | 0.000 | 0.998 | 1.000 | 0.000
발생 건수 | 0.037 | 0.286 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 기준 연도 | 법정감염군 대분류
기준 연도 | 1.000 | 0.000
법정감염군 대분류 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 발생 건수 | 기준 연도 | 법정감염군 대분류
발생 건수 | 1.000 | 0.010 | 0.187
기준 연도 | 0.010 | 1.000 | 0.000
법정감염군 대분류 | 0.187 | 0.000 | 1.000

Missing values

[Figure: count of values per column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

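First rows
Row | 기준 연도 | 법정감염군 대분류 | 법정감염군 소분류 | 발생 건수
0 | 2017 | 제1군 | 소계 (subtotal) | 42
1 | 2017 | 제1군 | 콜레라 (cholera) | 0
2 | 2017 | 제1군 | 장티푸스 (typhoid fever) | 8
3 | 2017 | 제1군 | 파라티푸스 (paratyphoid fever) | 2
4 | 2017 | 제1군 | 세균성이질 (shigellosis) | 2
5 | 2017 | 제1군 | 장출혈성대장균감염증 (enterohemorrhagic E. coli infection) | 4
6 | 2017 | 제1군 | A형간염 (hepatitis A) | 26
7 | 2018 | 제1군 | 소계 (subtotal) | 27
8 | 2018 | 제1군 | 콜레라 (cholera) | 0
9 | 2018 | 제1군 | 장티푸스 (typhoid fever) | 3

Last rows
Row | 기준 연도 | 법정감염군 대분류 | 법정감염군 소분류 | 발생 건수
203 | 2019 | 제4군 | 진드기매개뇌염 (tick-borne encephalitis) | 0
204 | 2019 | 제4군 | 유비저 (melioidosis) | 0
205 | 2019 | 제4군 | 치쿤구니야열 (chikungunya fever) | 1
206 | 2019 | 제4군 | 중증열성혈소판감소증후군 (severe fever with thrombocytopenia syndrome) | 9
207 | 2019 | 제4군 | 중동호흡기증후군 (Middle East respiratory syndrome) | 0
208 | 2019 | 제4군 | 지카바이러스감염증 (Zika virus infection) | 0
209 | 2019 | 제4군 | 리슈마니아증 (leishmaniasis) | 0
210 | 2019 | 제4군 | 바베시아증 (babesiosis) | 0
211 | 2019 | 제4군 | 크립토스포리디움증 (cryptosporidiosis) | 0
212 | 2019 | 제4군 | 주혈흡충증 (schistosomiasis) | 0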
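
In the first rows, the 소계 (subtotal) row for 2017 제1군 appears to be the sum of the six disease rows that follow it. The sketch below checks this for that one group, assuming the seven rows shown are the complete group; the tuple layout and variable names are illustrative.

```python
# First rows of the sample above: (year, major class, sub-class, cases).
rows = [
    (2017, "제1군", "소계", 42),
    (2017, "제1군", "콜레라", 0),
    (2017, "제1군", "장티푸스", 8),
    (2017, "제1군", "파라티푸스", 2),
    (2017, "제1군", "세균성이질", 2),
    (2017, "제1군", "장출혈성대장균감염증", 4),
    (2017, "제1군", "A형간염", 26),
]

SUBTOTAL = "소계"  # label of the aggregate row

subtotal = sum(cases for _, _, name, cases in rows if name == SUBTOTAL)
items = sum(cases for _, _, name, cases in rows if name != SUBTOTAL)

# If the subtotal row aggregates the items, the two sums agree, and adding
# every row together would count each case twice.
print("subtotal:", subtotal, "items:", items)
```

If the same pattern holds for the other groups and years, the subtotal rows should be filtered out before aggregating 발생 건수, since statistics computed over all 213 rows would include both the items and their subtotals.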