Overview

Dataset statistics

Number of variables: 3
Number of observations: 90
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 26.5 B

Variable types

Categorical: 1
Text: 1
Numeric: 1

Dataset

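Description: Notification data for legally designated infectious diseases in Yeonsu-gu, Incheon Metropolitan City. The dataset consists of fields such as infectious disease class, disease name, and number of notifications, and is suited to analysing the infectious diseases that mainly occur in Yeonsu-gu.
Author: Yeonsu-gu, Incheon Metropolitan City
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15116722&srcSe=7661IVAWM27C61E190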

Alerts

감염병명 (infectious disease name) has unique values (Unique)
신고횟수 (number of notifications) has 60 (66.7%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-17 18:41:20.013798
Analysis finished: 2024-04-17 18:41:20.324343
Duration: 0.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

급수 (infectious disease class)
Categorical

Distinct: 4
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

3급: 26
2급: 24
4급: 23
1급: 17

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1급
2nd row: 1급
3rd row: 1급
4th row: 1급
5th row: 1급

Common Values

Value | Count | Frequency (%)
3급 | 26 | 28.9%
2급 | 24 | 26.7%
4급 | 23 | 25.6%
1급 | 17 | 18.9%
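
Each frequency in the Common Values table is the value's count divided by the 90 observations. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not part of the report.

```python
# Counts of each 급수 (class) value, copied from the Common Values table.
counts = {"3급": 26, "2급": 24, "4급": 23, "1급": 17}

total = sum(counts.values())  # 90 observations in the dataset

# Share of each class as a percentage, rounded to one decimal like the report.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequencies.items():
    print(f"{value}: {counts[value]} ({pct}%)")
```

The four counts add up to 90, so the categories cover every row and no "Other values" line is needed for this variable.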

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 급수]

감염병명 (infectious disease name)
Text

UNIQUE

Distinct: 90
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Length

Max length: 36
Median length: 21
Mean length: 7.1666667
Min length: 2

Characters and Unicode

Total characters: 645
Distinct characters: 204
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
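
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories for one disease name using Python's standard `unicodedata` module. The chosen value comes from the report's sample rows; the variable names are illustrative. The standard library does not expose the script property, so the Hangul count uses the character-name prefix as an approximate stand-in, which is an assumption of this sketch rather than the method used by the report.

```python
import unicodedata
from collections import Counter

# One value from the 감염병명 column, taken from the report's sample rows.
name = "반코마이신내성장알균(VRE)감염증"

# General category of each character, e.g. "Lo" = Other Letter,
# "Lu" = Uppercase Letter, "Ps"/"Pe" = Open/Close Punctuation.
categories = Counter(unicodedata.category(ch) for ch in name)

# Approximate script detection: Unicode names of Hangul characters
# start with "HANGUL" (assumption used only for this sketch).
hangul_count = sum(
    1 for ch in name if unicodedata.name(ch, "").startswith("HANGUL")
)

print(dict(categories))
print("Hangul characters:", hangul_count)
```

Summing such counts over all 90 names would give tables of the same shape as the category and script tables in this section.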

Unique

Unique: 90
Unique (%): 100.0%

Sample

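1st row: 에볼라바이러스병 (Ebola virus disease)
2nd row: 마버그열 (Marburg fever)
3rd row: 라싸열 (Lassa fever)
4th row: 크리미안콩고출혈열 (Crimean-Congo hemorrhagic fever)
5th row: 남아메리카출혈열 (South American hemorrhagic fever)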

Value | Count | Frequency (%)
감염증 (infection) | 2 | 2.1%
에볼라바이러스병 (Ebola virus disease) | 1 | 1.0%
말라리아 (malaria) | 1 | 1.0%
유비저 (melioidosis) | 1 | 1.0%
진드기매개뇌염 (tick-borne encephalitis) | 1 | 1.0%
라임병 (Lyme disease) | 1 | 1.0%
웨스트나일열 (West Nile fever) | 1 | 1.0%
큐열 (Q fever) | 1 | 1.0%
뎅기열 (dengue fever) | 1 | 1.0%
황열 (yellow fever) | 1 | 1.0%
Other values (85) | 85 | 88.5%

Most occurring characters

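Value | Count | Frequency (%)
[illegible] | 38 | 5.9%
[illegible] | 26 | 4.0%
[illegible] | 21 | 3.3%
[illegible] | 18 | 2.8%
) | 16 | 2.5%
( | 16 | 2.5%
[illegible] | 14 | 2.2%
[illegible] | 13 | 2.0%
[illegible] | 11 | 1.7%
[illegible] | 11 | 1.7%
Other values (194) | 461 | 71.5%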

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 551 | 85.4%
Uppercase Letter | 44 | 6.8%
Close Punctuation | 16 | 2.5%
Open Punctuation | 16 | 2.5%
Decimal Number | 7 | 1.1%
Space Separator | 6 | 0.9%
Dash Punctuation | 3 | 0.5%
Lowercase Letter | 2 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 38 | 6.9%
[illegible] | 26 | 4.7%
[illegible] | 21 | 3.8%
[illegible] | 18 | 3.3%
[illegible] | 14 | 2.5%
[illegible] | 13 | 2.4%
[illegible] | 11 | 2.0%
[illegible] | 11 | 2.0%
[illegible] | 10 | 1.8%
[illegible] | 9 | 1.6%
Other values (172) | 380 | 69.0%

Uppercase Letter
Value | Count | Frequency (%)
R | 8 | 18.2%
A | 7 | 15.9%
S | 6 | 13.6%
C | 4 | 9.1%
E | 4 | 9.1%
M | 4 | 9.1%
D | 3 | 6.8%
V | 2 | 4.5%
B | 2 | 4.5%
J | 2 | 4.5%
Other values (2) | 2 | 4.5%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 42.9%
9 | 2 | 28.6%
2 | 1 | 14.3%
0 | 1 | 14.3%

Lowercase Letter
Value | Count | Frequency (%)
v | 1 | 50.0%
b | 1 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 16 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 551 | 85.4%
Common | 48 | 7.4%
Latin | 46 | 7.1%

Most frequent character per script

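Hangul
Value | Count | Frequency (%)
[illegible] | 38 | 6.9%
[illegible] | 26 | 4.7%
[illegible] | 21 | 3.8%
[illegible] | 18 | 3.3%
[illegible] | 14 | 2.5%
[illegible] | 13 | 2.4%
[illegible] | 11 | 2.0%
[illegible] | 11 | 2.0%
[illegible] | 10 | 1.8%
[illegible] | 9 | 1.6%
Other values (172) | 380 | 69.0%

Latin
Value | Count | Frequency (%)
R | 8 | 17.4%
A | 7 | 15.2%
S | 6 | 13.0%
C | 4 | 8.7%
E | 4 | 8.7%
M | 4 | 8.7%
D | 3 | 6.5%
V | 2 | 4.3%
B | 2 | 4.3%
J | 2 | 4.3%
Other values (4) | 4 | 8.7%

Common
Value | Count | Frequency (%)
) | 16 | 33.3%
( | 16 | 33.3%
(space) | 6 | 12.5%
1 | 3 | 6.2%
- | 3 | 6.2%
9 | 2 | 4.2%
2 | 1 | 2.1%
0 | 1 | 2.1%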

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 551 | 85.4%
ASCII | 94 | 14.6%

Most frequent character per block

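Hangul
Value | Count | Frequency (%)
[illegible] | 38 | 6.9%
[illegible] | 26 | 4.7%
[illegible] | 21 | 3.8%
[illegible] | 18 | 3.3%
[illegible] | 14 | 2.5%
[illegible] | 13 | 2.4%
[illegible] | 11 | 2.0%
[illegible] | 11 | 2.0%
[illegible] | 10 | 1.8%
[illegible] | 9 | 1.6%
Other values (172) | 380 | 69.0%

ASCII
Value | Count | Frequency (%)
) | 16 | 17.0%
( | 16 | 17.0%
R | 8 | 8.5%
A | 7 | 7.4%
S | 6 | 6.4%
(space) | 6 | 6.4%
C | 4 | 4.3%
E | 4 | 4.3%
M | 4 | 4.3%
1 | 3 | 3.2%
Other values (12) | 20 | 21.3%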

신고횟수 (number of notifications)
Real number (ℝ)

ZEROS

Distinct: 18
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 295.47778
Minimum: 0
Maximum: 26083
Zeros: 60
Zeros (%): 66.7%
Negative: 0
Negative (%): 0.0%
Memory size: 942.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 40.3
Maximum: 26083
Range: 26083
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2748.8422
Coefficient of variation (CV): 9.303042
Kurtosis: 89.992334
Mean: 295.47778
Median Absolute Deviation (MAD): 0
Skewness: 9.4862348
Sum: 26593
Variance: 7556133.3
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=18)]

Common values

Value | Count | Frequency (%)
0 | 60 | 66.7%
2 | 6 | 6.7%
5 | 4 | 4.4%
4 | 4 | 4.4%
3 | 2 | 2.2%
1 | 2 | 2.2%
17 | 1 | 1.1%
56 | 1 | 1.1%
114 | 1 | 1.1%
22 | 1 | 1.1%
Other values (8) | 8 | 8.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 60 | 66.7%
1 | 2 | 2.2%
2 | 6 | 6.7%
3 | 2 | 2.2%
4 | 4 | 4.4%
5 | 4 | 4.4%
6 | 1 | 1.1%
7 | 1 | 1.1%
17 | 1 | 1.1%
22 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
26083 | 1 | 1.1%
114 | 1 | 1.1%
91 | 1 | 1.1%
56 | 1 | 1.1%
43 | 1 | 1.1%
37 | 1 | 1.1%
36 | 1 | 1.1%
25 | 1 | 1.1%
22 | 1 | 1.1%
17 | 1 | 1.1%
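
Because 신고횟수 has only 18 distinct values, the two sorted value tables above (the ten smallest and the ten largest values) together list every value with its count. The sketch below rebuilds the 90 observations from those counts and recomputes several of the reported statistics with Python's standard `statistics` module. It assumes the report uses the sample standard deviation (n - 1) and linearly interpolated quantiles; the variable names are illustrative.

```python
import statistics

# Value -> count for 신고횟수, merged from the two sorted value tables
# (the ten smallest and the ten largest values) in the report.
value_counts = {
    0: 60, 1: 2, 2: 6, 3: 2, 4: 4, 5: 4, 6: 1, 7: 1, 17: 1, 22: 1,
    25: 1, 36: 1, 37: 1, 43: 1, 56: 1, 91: 1, 114: 1, 26083: 1,
}

# Expand the counts into a sorted list of individual observations.
values = sorted(v for v, n in value_counts.items() for _ in range(n))

mean = statistics.mean(values)
std = statistics.stdev(values)      # sample standard deviation (n - 1)
cv = std / mean                     # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" quantiles interpolate linearly between order statistics.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

print(len(values), sum(values), round(mean, 5), round(std, 4), round(cv, 6))
print(q1, median, q3, round(p95, 1), mad)
```

Under these assumptions the recomputed values agree with the report. The single observation of 26083 accounts for almost all of the sum of 26593, which is why the mean (about 295.5) sits far above the median (0) and the coefficient of variation exceeds 9.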

Interactions

[Figure: interactions plot]

Correlations

 | 급수 | 감염병명 | 신고횟수
급수 | 1.000 | 1.000 | 0.000
감염병명 | 1.000 | 1.000 | 1.000
신고횟수 | 0.000 | 1.000 | 1.000

 | 신고횟수 | 급수
신고횟수 | 1.000 | 0.000
급수 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

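First rows

 | 급수 | 감염병명 | 신고횟수
0 | 1급 | 에볼라바이러스병 (Ebola virus disease) | 0
1 | 1급 | 마버그열 (Marburg fever) | 0
2 | 1급 | 라싸열 (Lassa fever) | 0
3 | 1급 | 크리미안콩고출혈열 (Crimean-Congo hemorrhagic fever) | 0
4 | 1급 | 남아메리카출혈열 (South American hemorrhagic fever) | 0
5 | 1급 | 리프트밸리열 (Rift Valley fever) | 0
6 | 1급 | 두창 (smallpox) | 0
7 | 1급 | 페스트 (plague) | 0
8 | 1급 | 탄저 (anthrax) | 0
9 | 1급 | 보툴리눔독소증 (botulism) | 0

Last rows

 | 급수 | 감염병명 | 신고횟수
80 | 4급 | 첨규콘딜롬 (condyloma acuminatum) | 5
81 | 4급 | 반코마이신내성장알균(VRE)감염증 (vancomycin-resistant enterococci infection) | 0
82 | 4급 | 메티실린내성활색포도알균(MRSA)감염증 (methicillin-resistant Staphylococcus aureus infection) | 0
83 | 4급 | 다제내성녹농균(MRPA)감염증 (multidrug-resistant Pseudomonas aeruginosa infection) | 0
84 | 4급 | 다제내성아시네토박터바우마니균(MRAB)감염증 (multidrug-resistant Acinetobacter baumannii infection) | 0
85 | 4급 | 사람유두종바이러스감염증 (human papillomavirus infection) | 56
86 | 4급 | 해외유입기생충감염증(11종) (imported parasitic infections, 11 types) | 0
87 | 4급 | 장관감염증(20종) (intestinal infections, 20 types) | 0
88 | 4급 | 급성호흡기감염증(9종) (acute respiratory infections, 9 types) | 0
89 | 4급 | 엔테로바이러스감염증 (enterovirus infection) | 0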