Overview

Dataset statistics

Number of variables: 6
Number of observations: 48
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 51.8 B

Variable types

Numeric: 1
Categorical: 4
Text: 1

Dataset

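Description: Data on the status of infectious diseases in Seo-gu, Incheon Metropolitan City. Fields include 연번 (serial number), 구분 (class), 질병명 (disease name), 발생기간 (occurrence period), and 신고건수 (number of reported cases).
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15090927&srcSe=7661IVAWM27C61E190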

Alerts

발생기간 has constant value "2023-01-01~2023-09-30" (Constant)
데이터기준일자 (data reference date) has constant value "2023-09-30" (Constant)
연번 is highly overall correlated with 구분 (High correlation)
구분 is highly overall correlated with 연번 and 1 other field (High correlation)
신고건수 is highly overall correlated with 구분 (High correlation)
연번 has unique values (Unique)
질병명 has unique values (Unique)
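
The Constant and Unique alerts can be reproduced with a plain distinct-count check per column. The following is a minimal sketch, not ydata-profiling's actual implementation: the helper `flag_columns` is hypothetical, and the four rows are copied from the Sample section of this report (연번 1, 2, 3, and 48).

```python
def flag_columns(rows):
    """Label each column 'Constant', 'Unique', or None from its distinct count."""
    flags = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        distinct = len(set(values))
        if distinct == 1:
            flags[col] = "Constant"   # every row holds the same value
        elif distinct == len(values):
            flags[col] = "Unique"     # no value repeats
        else:
            flags[col] = None
    return flags


# Hypothetical mini-dataset: four rows taken from the report's Sample section.
PERIOD = "2023-01-01~2023-09-30"
DATE = "2023-09-30"
rows = [
    {"연번": 1, "구분": "2급", "질병명": "수두", "발생기간": PERIOD, "신고건수": "207", "데이터기준일자": DATE},
    {"연번": 2, "구분": "2급", "질병명": "홍역", "발생기간": PERIOD, "신고건수": "0", "데이터기준일자": DATE},
    {"연번": 3, "구분": "2급", "질병명": "콜레라", "발생기간": PERIOD, "신고건수": "0", "데이터기준일자": DATE},
    {"연번": 48, "구분": "4급", "질병명": "COVID-19", "발생기간": PERIOD, "신고건수": "389,931", "데이터기준일자": DATE},
]

flags = flag_columns(rows)
for column, flag in flags.items():
    print(column, flag)
```

On these rows the check flags 발생기간 and 데이터기준일자 as Constant and 연번 and 질병명 as Unique, in line with the alerts listed here.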

Reproduction

Analysis started: 2024-01-28 14:39:14.416514
Analysis finished: 2024-01-28 14:39:14.828664
Duration: 0.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 48
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.5
Minimum: 1
Maximum: 48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 564.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.35
Q1: 12.75
Median: 24.5
Q3: 36.25
95-th percentile: 45.65
Maximum: 48
Range: 47
Interquartile range (IQR): 23.5
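
The quantile statistics for 연번 are consistent with percentiles computed by linear interpolation between the closest ranks of the values 1 to 48. The sketch below assumes that interpolation rule (the report itself does not state which one was used); the helper `percentile` is hypothetical.

```python
def percentile(sorted_values, p):
    """Percentile for p in [0, 1], interpolating linearly between closest ranks."""
    pos = p * (len(sorted_values) - 1)          # fractional index into the data
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


serial_numbers = list(range(1, 49))  # 연번 runs from 1 to 48 with no gaps

p5 = percentile(serial_numbers, 0.05)
q1 = percentile(serial_numbers, 0.25)
q3 = percentile(serial_numbers, 0.75)
p95 = percentile(serial_numbers, 0.95)
iqr = q3 - q1

print(round(p5, 2), q1, q3, round(p95, 2), iqr)
```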

Descriptive statistics

Standard deviation: 14
Coefficient of variation (CV): 0.57142857
Kurtosis: -1.2
Mean: 24.5
Median Absolute Deviation (MAD): 12
Skewness: 0
Sum: 1176
Variance: 196
Monotonicity: Strictly increasing
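
Because 연번 is simply the sequence 1 to 48, several of these descriptive statistics can be checked by hand. The sketch below uses Python's standard `statistics` module and assumes the report's variance and standard deviation are the sample (n - 1) versions, which matches the reported 196 and 14.

```python
import statistics

serial_numbers = list(range(1, 49))  # 연번 values 1..48

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)   # sample variance, divisor n - 1
std = statistics.stdev(serial_numbers)           # sample standard deviation
cv = std / mean                                  # coefficient of variation
median = statistics.median(serial_numbers)
# Median absolute deviation: median of absolute distances from the median.
mad = statistics.median(abs(x - median) for x in serial_numbers)

print(total, mean, variance, std, round(cv, 8), mad)
```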
Histogram with fixed size bins (bins=48)
Value  Count  Frequency (%)
1      1      2.1%
26     1      2.1%
28     1      2.1%
29     1      2.1%
30     1      2.1%
31     1      2.1%
32     1      2.1%
33     1      2.1%
34     1      2.1%
35     1      2.1%
Other values (38)  38  79.2%

Value  Count  Frequency (%)
1      1      2.1%
2      1      2.1%
3      1      2.1%
4      1      2.1%
5      1      2.1%
6      1      2.1%
7      1      2.1%
8      1      2.1%
9      1      2.1%
10     1      2.1%

Value  Count  Frequency (%)
48     1      2.1%
47     1      2.1%
46     1      2.1%
45     1      2.1%
44     1      2.1%
43     1      2.1%
42     1      2.1%
41     1      2.1%
40     1      2.1%
39     1      2.1%

구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 516.0 B
3급: 25, 2급: 22, 4급: 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 2.1%

Sample

1st row: 2급
2nd row: 2급
3rd row: 2급
4th row: 2급
5th row: 2급

Common Values

Value  Count  Frequency (%)
3급    25     52.1%
2급    22     45.8%
4급    1      2.1%
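
The percentages in the Common Values table are each category's count divided by the 48 observations. As a minimal sketch, the list `classes` below is rebuilt from the reported counts rather than read from the original file.

```python
from collections import Counter

# Class labels rebuilt from the counts reported for 구분 (25 + 22 + 1 = 48 rows).
classes = ["3급"] * 25 + ["2급"] * 22 + ["4급"] * 1

counts = Counter(classes)
total = sum(counts.values())

# Share of each class as a percentage, rounded to one decimal place.
frequency = {label: round(100 * n / total, 1) for label, n in counts.most_common()}

for label, pct in frequency.items():
    print(f"{label}: {counts[label]} ({pct}%)")
```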

Length

Histogram of lengths of the category


질병명
Text

UNIQUE 

Distinct: 48
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 516.0 B

Length

Max length: 36
Median length: 20
Mean length: 6.7916667
Min length: 2

Characters and Unicode

Total characters: 326
Distinct characters: 144
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
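
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (for example `Lo` for Other Letter, `Lu` for Uppercase Letter). The sketch below counts categories for two disease names that appear in this dataset; the helper `category_counts` is hypothetical and is not how the report itself was computed.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sfts = category_counts("중증열성혈소판감소증후군(SFTS)")
covid = category_counts("COVID-19")

print(dict(sfts))
print(dict(covid))
```

Hangul syllables fall under `Lo`, parentheses under `Ps` and `Pe`, the hyphen under `Pd`, and digits under `Nd`, which corresponds to the category names used in the tables of this section.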

Unique

Unique: 48
Unique (%): 100.0%

Sample

1st row: 수두
2nd row: 홍역
3rd row: 콜레라
4th row: 장티푸스
5th row: 파라티푸스

Value  Count  Frequency (%)
감염증  4  7.4%
수두  1  1.9%
c형간염  1  1.9%
말라리아  1  1.9%
레지오넬라증  1  1.9%
비브리오패혈증  1  1.9%
발진티푸스  1  1.9%
발진열  1  1.9%
쯔쯔가무시증  1  1.9%
렙토스피라증  1  1.9%
Other values (41)  41  75.9%

Most occurring characters

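Value  Count  Frequency (%)
(not preserved)  14  4.3%
(not preserved)  13  4.0%
(not preserved)  9  2.8%
(not preserved)  9  2.8%
(  8  2.5%
)  8  2.5%
(not preserved)  7  2.1%
(not preserved)  7  2.1%
(not preserved)  7  2.1%
(not preserved)  7  2.1%
Other values (134)  237  72.7%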

Most occurring categories

Value  Count  Frequency (%)
Other Letter  267  81.9%
Uppercase Letter  26  8.0%
Open Punctuation  8  2.5%
Close Punctuation  8  2.5%
Space Separator  6  1.8%
Decimal Number  6  1.8%
Dash Punctuation  3  0.9%
Lowercase Letter  2  0.6%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not preserved)  14  5.2%
(not preserved)  13  4.9%
(not preserved)  9  3.4%
(not preserved)  9  3.4%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  6  2.2%
Other values (110)  181  67.8%

Uppercase Letter
Value  Count  Frequency (%)
C  5  19.2%
S  3  11.5%
D  3  11.5%
V  2  7.7%
J  2  7.7%
R  2  7.7%
E  2  7.7%
A  2  7.7%
I  1  3.8%
O  1  3.8%
Other values (3)  3  11.5%

Decimal Number
Value  Count  Frequency (%)
1  2  33.3%
2  1  16.7%
0  1  16.7%
8  1  16.7%
9  1  16.7%

Lowercase Letter
Value  Count  Frequency (%)
v  1  50.0%
b  1  50.0%

Open Punctuation
Value  Count  Frequency (%)
(  8  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  8  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  6  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  267  81.9%
Common  31  9.5%
Latin  28  8.6%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not preserved)  14  5.2%
(not preserved)  13  4.9%
(not preserved)  9  3.4%
(not preserved)  9  3.4%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  6  2.2%
Other values (110)  181  67.8%

Latin
Value  Count  Frequency (%)
C  5  17.9%
S  3  10.7%
D  3  10.7%
V  2  7.1%
J  2  7.1%
R  2  7.1%
E  2  7.1%
A  2  7.1%
v  1  3.6%
I  1  3.6%
Other values (5)  5  17.9%

Common
Value  Count  Frequency (%)
(  8  25.8%
)  8  25.8%
(space)  6  19.4%
-  3  9.7%
1  2  6.5%
2  1  3.2%
0  1  3.2%
8  1  3.2%
9  1  3.2%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  267  81.9%
ASCII  59  18.1%

Most frequent character per block

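Hangul
Value  Count  Frequency (%)
(not preserved)  14  5.2%
(not preserved)  13  4.9%
(not preserved)  9  3.4%
(not preserved)  9  3.4%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  7  2.6%
(not preserved)  6  2.2%
Other values (110)  181  67.8%

ASCII
Value  Count  Frequency (%)
(  8  13.6%
)  8  13.6%
(space)  6  10.2%
C  5  8.5%
S  3  5.1%
D  3  5.1%
-  3  5.1%
V  2  3.4%
J  2  3.4%
R  2  3.4%
Other values (14)  17  28.8%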

발생기간
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 516.0 B
2023-01-01~2023-09-30: 48

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-01-01~2023-09-30
2nd row: 2023-01-01~2023-09-30
3rd row: 2023-01-01~2023-09-30
4th row: 2023-01-01~2023-09-30
5th row: 2023-01-01~2023-09-30

Common Values

Value  Count  Frequency (%)
2023-01-01~2023-09-30  48  100.0%

Length

Histogram of lengths of the category


신고건수
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 27.1%
Missing: 0
Missing (%): 0.0%
Memory size: 516.0 B
0: 32, 5: 3, 1: 2, 4: 2, 207: 1, Other values (8)

Length

Max length: 7
Median length: 1
Mean length: 1.2916667
Min length: 1

Unique

Unique: 9
Unique (%): 18.8%

Sample

1st row: 207
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      32     66.7%
5      3      6.2%
1      2      4.2%
4      2      4.2%
207    1      2.1%
3      1      2.1%
6      1      2.1%
116    1      2.1%
339    1      2.1%
73     1      2.1%
Other values (3)  3  6.2%
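
신고건수 holds case counts, yet the report types it as Categorical. One plausible cause, stated here as an assumption rather than something the report confirms, is the thousands separator in the COVID-19 value `389,931` from the Sample section, which is also consistent with the reported max length of 7. A minimal sketch of cleaning such values before profiling (the helper `parse_count` is hypothetical):

```python
def parse_count(raw):
    """Convert a count string such as '389,931' to an int by dropping commas."""
    return int(raw.replace(",", ""))


# A few 신고건수 values as they appear in this report, stored as strings.
raw_counts = ["207", "0", "116", "339", "73", "389,931"]
parsed = [parse_count(value) for value in raw_counts]

print(parsed)
print(max(parsed))
```

With the column converted to integers, a profiler could treat it as numeric and report quantiles and a histogram instead of string lengths.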

Length

Histogram of lengths of the category

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 516.0 B
2023-09-30: 48

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-09-30
2nd row: 2023-09-30
3rd row: 2023-09-30
4th row: 2023-09-30
5th row: 2023-09-30

Common Values

Value  Count  Frequency (%)
2023-09-30  48  100.0%

Length

Histogram of lengths of the category


Interactions


Correlations

          연번     구분     질병명   신고건수
연번      1.000  0.773  1.000  0.351
구분      0.773  1.000  1.000  0.818
질병명    1.000  1.000  1.000  1.000
신고건수  0.351  0.818  1.000  1.000

          신고건수  구분
신고건수  1.000  0.599
구분      0.599  1.000

          연번     구분     신고건수
연번      1.000  0.597  0.123
구분      0.597  1.000  0.599
신고건수  0.123  0.599  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  연번  구분  질병명  발생기간  신고건수  데이터기준일자
0  1  2급  수두  2023-01-01~2023-09-30  207  2023-09-30
1  2  2급  홍역  2023-01-01~2023-09-30  0  2023-09-30
2  3  2급  콜레라  2023-01-01~2023-09-30  0  2023-09-30
3  4  2급  장티푸스  2023-01-01~2023-09-30  0  2023-09-30
4  5  2급  파라티푸스  2023-01-01~2023-09-30  1  2023-09-30
5  6  2급  세균성이질  2023-01-01~2023-09-30  0  2023-09-30
6  7  2급  장출혈성대장균감염증  2023-01-01~2023-09-30  3  2023-09-30
7  8  2급  A형간염  2023-01-01~2023-09-30  6  2023-09-30
8  9  2급  백일해  2023-01-01~2023-09-30  0  2023-09-30
9  10  2급  유행성이하선염  2023-01-01~2023-09-30  116  2023-09-30

(index)  연번  구분  질병명  발생기간  신고건수  데이터기준일자
38  39  3급  뎅기열  2023-01-01~2023-09-30  4  2023-09-30
39  40  3급  큐열  2023-01-01~2023-09-30  0  2023-09-30
40  41  3급  웨스트나일열  2023-01-01~2023-09-30  0  2023-09-30
41  42  3급  라임병  2023-01-01~2023-09-30  0  2023-09-30
42  43  3급  진드기매개뇌염  2023-01-01~2023-09-30  0  2023-09-30
43  44  3급  유비저  2023-01-01~2023-09-30  0  2023-09-30
44  45  3급  치쿤구니야열  2023-01-01~2023-09-30  0  2023-09-30
45  46  3급  중증열성혈소판감소증후군(SFTS)  2023-01-01~2023-09-30  0  2023-09-30
46  47  3급  지카바이러스감염증  2023-01-01~2023-09-30  0  2023-09-30
47  48  4급  COVID-19  2023-01-01~2023-09-30  389,931  2023-09-30