Overview

Dataset statistics

Number of variables: 4
Number of observations: 64
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 36.1 B

Variable types

Numeric: 2
Text: 1
Categorical: 1

Dataset

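Description: Infectious disease occurrence data for Bucheon City, cumulative for January to July 2023. For 64 infectious diseases grouped by class (Class 1, 2, and 3), it provides the disease name, infection grade, and number of cases.
URL: https://www.data.go.kr/data/15090519/fileData.do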

Alerts

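연번 (serial number) is highly overall correlated with 감염등급 (infection grade) (High correlation)
감염등급 (infection grade) is highly overall correlated with 연번 (serial number) (High correlation)
연번 (serial number) has unique values (Unique)
감염병명 (disease name) has unique values (Unique)
발생인원 (number of cases) has 49 (76.6%) zeros (Zeros)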

Reproduction

Analysis started: 2023-12-12 04:24:47.094897
Analysis finished: 2023-12-12 04:24:47.802848
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
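
A report of this kind could be regenerated roughly as follows. This is a minimal sketch rather than the invocation actually used for this report: the helper name `build_report`, the CSV file name, the `cp949` encoding, and the report title are all assumptions. `ProfileReport` and its `to_file` method are the public ydata-profiling entry points, and they are imported lazily so that defining the helper does not require any third-party package.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> Path:
    """Profile a CSV file with ydata-profiling and write an HTML report.

    csv_path, out_path and encoding are hypothetical values; adjust them
    to the file actually downloaded from the dataset URL.
    """
    # Imported lazily: both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Bucheon infectious disease cases, Jan-Jul 2023")
    report.to_file(out_path)
    return Path(out_path)


# Example call (the file name is an assumption):
# build_report("bucheon_infectious_diseases_2023.csv")
```

The timestamps and memory figures in a regenerated report will differ from the ones listed above; the variable statistics should not, provided the same file is used.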

Variables

연번 (serial number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 64
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.5
Minimum: 1
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 1
5th percentile: 4.15
Q1: 16.75
Median: 32.5
Q3: 48.25
95th percentile: 60.85
Maximum: 64
Range: 63
Interquartile range (IQR): 31.5

Descriptive statistics

Standard deviation: 18.618987
Coefficient of variation (CV): 0.5728919
Kurtosis: -1.2
Mean: 32.5
Median Absolute Deviation (MAD): 16
Skewness: 0
Sum: 2080
Variance: 346.66667
Monotonicity: Strictly increasing
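
The figures above are what one would expect if 연번 simply holds the integers 1 to 64. The sketch below checks this with the standard library only. The contents of the column are an assumption inferred from the reported minimum, maximum, distinct count, and monotonicity; the helper `percentile` is hypothetical and uses linear interpolation between neighbouring values.

```python
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) of a sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 65))  # assumed contents of 연번: 1, 2, ..., 64

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)

print(sum(values), mean, median, mad)
print(round(variance, 5), round(std, 6), round(cv, 7))
print(round(p05, 2), q1, q3, round(p95, 2), q3 - q1)
```

The variance here is the sample variance, which for the integers 1 to n equals n(n + 1)/12, that is 64 × 65 / 12 ≈ 346.67.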
Figure: Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 1.6%
34 | 1 | 1.6%
36 | 1 | 1.6%
37 | 1 | 1.6%
38 | 1 | 1.6%
39 | 1 | 1.6%
40 | 1 | 1.6%
41 | 1 | 1.6%
42 | 1 | 1.6%
43 | 1 | 1.6%
Other values (54) | 54 | 84.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.6%
2 | 1 | 1.6%
3 | 1 | 1.6%
4 | 1 | 1.6%
5 | 1 | 1.6%
6 | 1 | 1.6%
7 | 1 | 1.6%
8 | 1 | 1.6%
9 | 1 | 1.6%
10 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
64 | 1 | 1.6%
63 | 1 | 1.6%
62 | 1 | 1.6%
61 | 1 | 1.6%
60 | 1 | 1.6%
59 | 1 | 1.6%
58 | 1 | 1.6%
57 | 1 | 1.6%
56 | 1 | 1.6%
55 | 1 | 1.6%

감염병명 (disease name)
Text

Alerts: UNIQUE

Distinct: 64
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

Length

Max length: 36
Median length: 18
Mean length: 6.8125
Min length: 2

Characters and Unicode

Total characters: 436
Distinct characters: 169
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
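
As an illustration of how such character properties can be read out, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper `category_counts` and the lookup table `CATEGORY_NAMES` are hypothetical names, and the input string is one disease name taken from this dataset.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text: str) -> Counter:
    """Count characters of `text` per Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("중증열성혈소판감소증후군(SFTS)")
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the counts for this column.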

Unique

Unique: 64
Unique (%): 100.0%

Sample

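1st row: 에볼라바이러스병 (Ebola virus disease)
2nd row: 마버그열 (Marburg fever)
3rd row: 라싸열 (Lassa fever)
4th row: 크리미안콩고출혈열 (Crimean-Congo hemorrhagic fever)
5th row: 남아메리카출혈열 (South American hemorrhagic fever)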
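
Words

Value | Count | Frequency (%)
감염증 (infection) | 4 | 5.6%
b형간염 (hepatitis B) | 1 | 1.4%
발진티푸스 (epidemic typhus) | 1 | 1.4%
비브리오패혈증 (Vibrio sepsis) | 1 | 1.4%
레지오넬라증 (legionellosis) | 1 | 1.4%
말라리아 (malaria) | 1 | 1.4%
c형간염 (hepatitis C) | 1 | 1.4%
일본뇌염 (Japanese encephalitis) | 1 | 1.4%
파상풍 (tetanus) | 1 | 1.4%
폐렴구균 (pneumococcus) | 1 | 1.4%
Other values (58) | 58 | 81.7%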

Most occurring characters

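Value | Count | Frequency (%)
[not preserved] | 20 | 4.6%
[not preserved] | 15 | 3.4%
[not preserved] | 14 | 3.2%
) | 10 | 2.3%
( | 10 | 2.3%
[not preserved] | 10 | 2.3%
[not preserved] | 9 | 2.1%
[not preserved] | 9 | 2.1%
[not preserved] | 9 | 2.1%
[not preserved] | 9 | 2.1%
Other values (159) | 321 | 73.6%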

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 372 | 85.3%
Uppercase Letter | 29 | 6.7%
Close Punctuation | 10 | 2.3%
Open Punctuation | 10 | 2.3%
Space Separator | 7 | 1.6%
Decimal Number | 4 | 0.9%
Dash Punctuation | 2 | 0.5%
Lowercase Letter | 2 | 0.5%

Most frequent character per category

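Other Letter

Value | Count | Frequency (%)
[not preserved] | 20 | 5.4%
[not preserved] | 15 | 4.0%
[not preserved] | 14 | 3.8%
[not preserved] | 10 | 2.7%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 8 | 2.2%
[not preserved] | 8 | 2.2%
Other values (137) | 261 | 70.2%

Uppercase Letter

Value | Count | Frequency (%)
S | 6 | 20.7%
C | 4 | 13.8%
R | 4 | 13.8%
A | 3 | 10.3%
E | 3 | 10.3%
D | 2 | 6.9%
J | 2 | 6.9%
B | 1 | 3.4%
V | 1 | 3.4%
F | 1 | 3.4%
Other values (2) | 2 | 6.9%

Decimal Number

Value | Count | Frequency (%)
8 | 1 | 25.0%
1 | 1 | 25.0%
0 | 1 | 25.0%
2 | 1 | 25.0%

Lowercase Letter

Value | Count | Frequency (%)
v | 1 | 50.0%
b | 1 | 50.0%

Close Punctuation

Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 10 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 7 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 2 | 100.0%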

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 372 | 85.3%
Common | 33 | 7.6%
Latin | 31 | 7.1%

Most frequent character per script

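Hangul

Value | Count | Frequency (%)
[not preserved] | 20 | 5.4%
[not preserved] | 15 | 4.0%
[not preserved] | 14 | 3.8%
[not preserved] | 10 | 2.7%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 8 | 2.2%
[not preserved] | 8 | 2.2%
Other values (137) | 261 | 70.2%

Latin

Value | Count | Frequency (%)
S | 6 | 19.4%
C | 4 | 12.9%
R | 4 | 12.9%
A | 3 | 9.7%
E | 3 | 9.7%
D | 2 | 6.5%
J | 2 | 6.5%
B | 1 | 3.2%
V | 1 | 3.2%
F | 1 | 3.2%
Other values (4) | 4 | 12.9%

Common

Value | Count | Frequency (%)
) | 10 | 30.3%
( | 10 | 30.3%
(space) | 7 | 21.2%
- | 2 | 6.1%
8 | 1 | 3.0%
1 | 1 | 3.0%
0 | 1 | 3.0%
2 | 1 | 3.0%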

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 372 | 85.3%
ASCII | 64 | 14.7%

Most frequent character per block

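Hangul

Value | Count | Frequency (%)
[not preserved] | 20 | 5.4%
[not preserved] | 15 | 4.0%
[not preserved] | 14 | 3.8%
[not preserved] | 10 | 2.7%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 9 | 2.4%
[not preserved] | 8 | 2.2%
[not preserved] | 8 | 2.2%
Other values (137) | 261 | 70.2%

ASCII

Value | Count | Frequency (%)
) | 10 | 15.6%
( | 10 | 15.6%
(space) | 7 | 10.9%
S | 6 | 9.4%
C | 4 | 6.2%
R | 4 | 6.2%
A | 3 | 4.7%
E | 3 | 4.7%
- | 2 | 3.1%
D | 2 | 3.1%
Other values (12) | 13 | 20.3%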

감염등급 (infection grade)
Categorical

Alerts: HIGH CORRELATION

Distinct: 3
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B
3급: 25
2급: 22
1급: 17

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1급
2nd row: 1급
3rd row: 1급
4th row: 1급
5th row: 1급

Common Values

Value | Count | Frequency (%)
3급 | 25 | 39.1%
2급 | 22 | 34.4%
1급 | 17 | 26.6%
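
The percentages follow directly from the counts and the 64 observations. The sketch below rebuilds the column from the counts listed above (row order is not needed for this check) and recomputes the shares; the variable names are illustrative only.

```python
from collections import Counter

# Column rebuilt from the reported counts; the original row order is irrelevant here.
grades = ["3급"] * 25 + ["2급"] * 22 + ["1급"] * 17

counts = Counter(grades)
total = sum(counts.values())

# Share of each grade, formatted to one decimal place as in the report.
shares = {grade: f"{100 * n / total:.1f}%" for grade, n in counts.most_common()}

# Share of distinct values among all observations ("Distinct (%)").
distinct_pct = f"{100 * len(counts) / total:.1f}%"

print(shares)
print(distinct_pct)
```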

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Plot of the common values (3급: 25, 2급: 22, 1급: 17)

발생인원 (number of cases)
Real number (ℝ)

Alerts: ZEROS

Distinct: 12
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5625
Minimum: 0
Maximum: 529
Zeros: 49
Zeros (%): 76.6%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 66.05
Maximum: 529
Range: 529
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 71.147036
Coefficient of variation (CV): 4.8856334
Kurtosis: 45.43633
Mean: 14.5625
Median Absolute Deviation (MAD): 0
Skewness: 6.5061234
Sum: 932
Variance: 5061.9008
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
0 | 49 | 76.6%
1 | 3 | 4.7%
3 | 2 | 3.1%
2 | 2 | 3.1%
202 | 1 | 1.6%
21 | 1 | 1.6%
74 | 1 | 1.6%
529 | 1 | 1.6%
4 | 1 | 1.6%
5 | 1 | 1.6%
Other values (2) | 2 | 3.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 49 | 76.6%
1 | 3 | 4.7%
2 | 2 | 3.1%
3 | 2 | 3.1%
4 | 1 | 1.6%
5 | 1 | 1.6%
8 | 1 | 1.6%
21 | 1 | 1.6%
74 | 1 | 1.6%
76 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
529 | 1 | 1.6%
202 | 1 | 1.6%
76 | 1 | 1.6%
74 | 1 | 1.6%
21 | 1 | 1.6%
8 | 1 | 1.6%
5 | 1 | 1.6%
4 | 1 | 1.6%
3 | 2 | 3.1%
2 | 2 | 3.1%
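
Because the tables above list all 12 distinct values of 발생인원 with their counts, the summary statistics can be cross-checked by rebuilding the column. The sketch below does so with the standard library; the variable names are illustrative, and the 95th percentile is computed with linear interpolation between neighbouring sorted values, which is an assumption about the method used.

```python
import statistics

# Value -> count, transcribed from the frequency tables above.
value_counts = {0: 49, 1: 3, 2: 2, 3: 2, 4: 1, 5: 1,
                8: 1, 21: 1, 74: 1, 76: 1, 202: 1, 529: 1}

# Expand the counts back into a sorted list of 64 observations.
values = sorted(v for v, n in value_counts.items() for _ in range(n))

n = len(values)
total = sum(values)
mean = total / n
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = variance ** 0.5
cv = std / mean                          # coefficient of variation
zeros_pct = 100 * value_counts[0] / n

# 95th percentile by linear interpolation between neighbouring sorted values.
pos = 0.95 * (n - 1)
lower = int(pos)
p95 = values[lower] + (pos - lower) * (values[lower + 1] - values[lower])

print(n, total, mean, round(zeros_pct, 1))
print(round(variance, 4), round(std, 6), round(cv, 7), round(p95, 2))
```

The single value 529 accounts for more than half of the total of 932 cases, which is why the standard deviation is almost five times the mean while the median and all quartiles are 0.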

Interactions

Figures: Interaction plots (4 panels)

Correlations

Variable | 연번 | 감염병명 | 감염등급 | 발생인원
연번 | 1.000 | 1.000 | 0.947 | 0.105
감염병명 | 1.000 | 1.000 | 1.000 | 1.000
감염등급 | 0.947 | 1.000 | 1.000 | 0.000
발생인원 | 0.105 | 1.000 | 0.000 | 1.000

Variable | 연번 | 발생인원 | 감염등급
연번 | 1.000 | 0.235 | 0.881
발생인원 | 0.235 | 1.000 | 0.000
감염등급 | 0.881 | 0.000 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

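First rows

Index | 연번 | 감염병명 | 감염등급 | 발생인원
0 | 1 | 에볼라바이러스병 (Ebola virus disease) | 1급 | 0
1 | 2 | 마버그열 (Marburg fever) | 1급 | 0
2 | 3 | 라싸열 (Lassa fever) | 1급 | 0
3 | 4 | 크리미안콩고출혈열 (Crimean-Congo hemorrhagic fever) | 1급 | 0
4 | 5 | 남아메리카출혈열 (South American hemorrhagic fever) | 1급 | 0
5 | 6 | 리프트밸리열 (Rift Valley fever) | 1급 | 0
6 | 7 | 두창 (smallpox) | 1급 | 0
7 | 8 | 페스트 (plague) | 1급 | 0
8 | 9 | 탄저 (anthrax) | 1급 | 0
9 | 10 | 보툴리눔독소증 (botulism) | 1급 | 0

Last rows

Index | 연번 | 감염병명 | 감염등급 | 발생인원
54 | 55 | 황열 (yellow fever) | 3급 | 0
55 | 56 | 뎅기열 (dengue fever) | 3급 | 2
56 | 57 | 큐열 (Q fever) | 3급 | 0
57 | 58 | 웨스트나일열 (West Nile fever) | 3급 | 0
58 | 59 | 라임병 (Lyme disease) | 3급 | 1
59 | 60 | 진드기매개뇌염 (tick-borne encephalitis) | 3급 | 0
60 | 61 | 유비저 (melioidosis) | 3급 | 0
61 | 62 | 치쿤구니야열 (chikungunya fever) | 3급 | 0
62 | 63 | 중증열성혈소판감소증후군(SFTS) (severe fever with thrombocytopenia syndrome) | 3급 | 3
63 | 64 | 지카바이러스감염증 (Zika virus infection) | 3급 | 0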