Overview

Dataset statistics

Number of variables: 6
Number of observations: 742
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.8 KiB
Average record size in memory: 52.2 B
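
The average record size can be derived from the total size in memory and the number of observations. The following is a minimal check of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Average record size = total in-memory size / number of observations.
KIB = 1024  # bytes per KiB (assumption: binary unit, as the "KiB" suffix suggests)

total_size_bytes = 37.8 * KIB  # "Total size in memory" from the report
n_observations = 742           # "Number of observations" from the report

avg_record_size = total_size_bytes / n_observations
print(f"{avg_record_size:.1f} B")
```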

Variable types

Numeric: 4
DateTime: 1
Categorical: 1

Dataset

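Description: Daily COVID-19 confirmed-case figures for Nam-gu, Daegu Metropolitan City, from February 18, 2020 to February 28, 2022. Fields: 연번 (serial number), 날짜 (date), 일일 확진자수 (daily confirmed cases), 총 확진자수 (total confirmed cases), 총 사망자 (total deaths), 비고 (remarks).
Author: 대구광역시 남구 (Nam-gu, Daegu Metropolitan City)
URL: https://www.data.go.kr/data/15085588/fileData.do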

Alerts

High correlation: 연번 is highly overall correlated with 일일 확진자수 and 3 other fields
High correlation: 일일 확진자수 is highly overall correlated with 연번 and 2 other fields
High correlation: 누계 is highly overall correlated with 연번 and 3 other fields
High correlation: 총 사망자 is highly overall correlated with 연번 and 2 other fields
High correlation: 비고 is highly overall correlated with 연번 and 3 other fields
Imbalance: 비고 is highly imbalanced (90.5%)
Unique: 연번 has unique values
Unique: 날짜 has unique values
Zeros: 일일 확진자수 has 339 (45.7%) zeros
Zeros: 총 사망자 has 51 (6.9%) zeros

Reproduction

Analysis started: 2023-12-12 09:20:40.231269
Analysis finished: 2023-12-12 09:20:42.666972
Duration: 2.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
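
The reported duration is the difference between the two analysis timestamps. A short sketch of that calculation with the standard library, using the timestamps exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" and "Analysis finished" lines.
started = datetime.fromisoformat("2023-12-12 09:20:40.231269")
finished = datetime.fromisoformat("2023-12-12 09:20:42.666972")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```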

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 742
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 371.5
Minimum: 1
Maximum: 742
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 38.05
Q1: 186.25
Median: 371.5
Q3: 556.75
95th percentile: 704.95
Maximum: 742
Range: 741
Interquartile range (IQR): 370.5
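
The fractional quantiles of 연번 (for example 38.05 and 186.25) are consistent with linear interpolation between neighbouring sorted values. The sketch below assumes that interpolation rule, which is the default in pandas and NumPy; the report itself does not state the method, and `linear_quantile` is a hypothetical helper written for illustration.

```python
def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# 연번 runs from 1 to 742 with every value distinct (see Distinct: 742).
serial_numbers = list(range(1, 743))

for label, q in [("5th percentile", 0.05), ("Q1", 0.25), ("Median", 0.50),
                 ("Q3", 0.75), ("95th percentile", 0.95)]:
    print(label, round(linear_quantile(serial_numbers, q), 2))

iqr = linear_quantile(serial_numbers, 0.75) - linear_quantile(serial_numbers, 0.25)
print("IQR", round(iqr, 2))
```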

Descriptive statistics

Standard deviation: 214.34124
Coefficient of variation (CV): 0.57696161
Kurtosis: -1.2
Mean: 371.5
Median Absolute Deviation (MAD): 185.5
Skewness: 0
Sum: 275653
Variance: 45942.167
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1                       1  0.1%
500                     1  0.1%
491                     1  0.1%
492                     1  0.1%
493                     1  0.1%
494                     1  0.1%
495                     1  0.1%
496                     1  0.1%
497                     1  0.1%
498                     1  0.1%
Other values (732)    732  98.7%

Minimum 10 values

Value  Count  Frequency (%)
1          1  0.1%
2          1  0.1%
3          1  0.1%
4          1  0.1%
5          1  0.1%
6          1  0.1%
7          1  0.1%
8          1  0.1%
9          1  0.1%
10         1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
742        1  0.1%
741        1  0.1%
740        1  0.1%
739        1  0.1%
738        1  0.1%
737        1  0.1%
736        1  0.1%
735        1  0.1%
734        1  0.1%
733        1  0.1%
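
Because 연번 is simply the sequence 1 to 742, its descriptive statistics can be cross-checked against closed-form expressions. The sketch below assumes the report uses the sample variance (divisor n - 1), which is what the reported value of 45942.167 matches.

```python
import statistics

n = 742
serial_numbers = list(range(1, n + 1))

total = sum(serial_numbers)                     # closed form: n * (n + 1) / 2
mean = statistics.mean(serial_numbers)          # closed form: (n + 1) / 2
variance = statistics.variance(serial_numbers)  # sample variance; closed form: n * (n + 1) / 12
std_dev = statistics.stdev(serial_numbers)
cv = std_dev / mean                             # coefficient of variation

print(total, mean, round(variance, 3), round(std_dev, 5), round(cv, 8))
```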

날짜 (date)
Date

UNIQUE

Distinct: 742
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
Minimum: 2020-02-18 00:00:00
Maximum: 2022-02-28 00:00:00
Histogram with fixed size bins (bins=50)

일일 확진자수 (daily confirmed cases)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 61
Distinct (%): 8.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.291105
Minimum: 0
Maximum: 445
Zeros: 339
Zeros (%): 45.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 3
95th percentile: 58.7
Maximum: 445
Range: 445
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 45.284781
Coefficient of variation (CV): 4.0106597
Kurtosis: 39.578383
Mean: 11.291105
Median Absolute Deviation (MAD): 1
Skewness: 5.9985404
Sum: 8378
Variance: 2050.7114
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
0                    339  45.7%
1                    127  17.1%
2                     56  7.5%
3                     47  6.3%
4                     32  4.3%
5                     25  3.4%
8                     15  2.0%
6                     13  1.8%
7                     13  1.8%
10                     6  0.8%
Other values (51)     69  9.3%

Minimum 10 values

Value  Count  Frequency (%)
0        339  45.7%
1        127  17.1%
2         56  7.5%
3         47  6.3%
4         32  4.3%
5         25  3.4%
6         13  1.8%
7         13  1.8%
8         15  2.0%
9          4  0.5%

Maximum 10 values

Value  Count  Frequency (%)
445        1  0.1%
374        1  0.1%
347        1  0.1%
340        1  0.1%
331        1  0.1%
327        1  0.1%
313        1  0.1%
288        1  0.1%
263        1  0.1%
262        1  0.1%
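
Several of the 일일 확진자수 figures are simple ratios of other reported values. As a small worked check (using only numbers printed in this section), the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the zero share is the zero count divided by the number of observations:

```python
# Values copied from the 일일 확진자수 statistics.
n_observations = 742
total_cases = 8378      # Sum
std_dev = 45.284781     # Standard deviation
zero_count = 339        # Zeros

mean = total_cases / n_observations
cv = std_dev / mean
zero_share_pct = 100 * zero_count / n_observations

print(round(mean, 6), round(cv, 4), round(zero_share_pct, 1))
```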

누계 (cumulative confirmed cases)
Real number (ℝ)

HIGH CORRELATION

Distinct: 403
Distinct (%): 54.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1712.2601
Minimum: 4
Maximum: 8378
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 4
5th percentile: 1355
Q1: 1366
Median: 1505.5
Q3: 1845.5
95th percentile: 2463.5
Maximum: 8378
Range: 8374
Interquartile range (IQR): 479.5

Descriptive statistics

Standard deviation: 776.60953
Coefficient of variation (CV): 0.45355815
Kurtosis: 30.278344
Mean: 1712.2601
Median Absolute Deviation (MAD): 143.5
Skewness: 4.7644251
Sum: 1270497
Variance: 603122.35
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1361                   57  7.7%
1365                   49  6.6%
1383                   14  1.9%
1362                   13  1.8%
1360                   11  1.5%
1386                   11  1.5%
1382                   10  1.3%
1374                    8  1.1%
1379                    8  1.1%
1384                    8  1.1%
Other values (393)    553  74.5%

Minimum 10 values

Value  Count  Frequency (%)
4          1  0.1%
20         1  0.1%
65         1  0.1%
152        1  0.1%
211        1  0.1%
298        1  0.1%
418        1  0.1%
477        1  0.1%
580        1  0.1%
716        1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
8378       1  0.1%
8051       1  0.1%
7720       1  0.1%
7346       1  0.1%
6901       1  0.1%
6561       1  0.1%
6214       1  0.1%
5901       1  0.1%
5613       1  0.1%
5351       1  0.1%
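
누계 is reported as "Increasing" rather than "Strictly increasing" (the label given to 연번), which fits a cumulative count that stays flat on days with zero new cases (for example, the value 1361 occurs 57 times). The sketch below shows one way to make that distinction; `classify_monotonicity` is a hypothetical helper and the second input series is toy data, not taken from the dataset.

```python
def classify_monotonicity(values):
    """Label a sequence using the same wording as the report's Monotonicity field."""
    pairs = list(zip(values, values[1:]))  # consecutive (previous, next) pairs
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"

# First ten 누계 values from the report: every step goes up.
print(classify_monotonicity([4, 20, 65, 152, 211, 298, 418, 477, 580, 716]))
# Toy series with a repeated value, as happens on a day with no new cases.
print(classify_monotonicity([1360, 1361, 1361, 1362]))
# Toy series that goes up and then down.
print(classify_monotonicity([3, 5, 4]))
```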

총 사망자 (total deaths)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 13
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.812668
Minimum: 0
Maximum: 30
Zeros: 51
Zeros (%): 6.9%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 18
Median: 19
Q3: 19
95th percentile: 26
Maximum: 30
Range: 30
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 5.2406283
Coefficient of variation (CV): 0.29420793
Kurtosis: 6.6877193
Mean: 17.812668
Median Absolute Deviation (MAD): 1
Skewness: -2.4337007
Sum: 13217
Variance: 27.464185
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values

Value             Count  Frequency (%)
19                  293  39.5%
18                  252  34.0%
0                    51  6.9%
26                   40  5.4%
20                   39  5.3%
17                   32  4.3%
21                   18  2.4%
16                    6  0.8%
25                    5  0.7%
22                    2  0.3%
Other values (3)      4  0.5%

Minimum 10 values

Value  Count  Frequency (%)
0         51  6.9%
16         6  0.8%
17        32  4.3%
18       252  34.0%
19       293  39.5%
20        39  5.3%
21        18  2.4%
22         2  0.3%
23         1  0.1%
24         1  0.1%

Maximum 10 values

Value  Count  Frequency (%)
30         2  0.3%
26        40  5.4%
25         5  0.7%
24         1  0.1%
23         1  0.1%
22         2  0.3%
21        18  2.4%
20        39  5.3%
19       293  39.5%
18       252  34.0%
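
With only 13 distinct values, the full frequency distribution of 총 사망자 can be read off the value tables in this section, and the reported count, sum, and mean can be rebuilt from it. A minimal sketch of that cross-check, with the dictionary assembled by hand from those tables:

```python
# value -> count, collected from the common and extreme value tables for 총 사망자.
death_value_counts = {
    0: 51, 16: 6, 17: 32, 18: 252, 19: 293, 20: 39, 21: 18,
    22: 2, 23: 1, 24: 1, 25: 5, 26: 40, 30: 2,
}

n_observations = sum(death_value_counts.values())
total = sum(value * count for value, count in death_value_counts.items())
mean = total / n_observations

print(len(death_value_counts), n_observations, total, round(mean, 6))
```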

비고 (remarks)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Category counts: <NA> 733, 해외1 (overseas 1) 9

Length

Max length: 4
Median length: 4
Mean length: 3.9878706
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>     733  98.8%
해외1       9  1.2%
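
The 90.5% imbalance reported for 비고 can be reproduced from the two category counts if the score is taken to be one minus the normalized Shannon entropy. That formula is an assumption made for this sketch; the report does not define the score, and `imbalance_score` is a hypothetical helper.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = one class dominates."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    max_entropy = math.log2(len(counts))
    if max_entropy == 0:
        return 0.0  # a single category: treated as balanced here (a choice of this sketch)
    return 1 - entropy / max_entropy

# Category counts for 비고: 733 rows of "<NA>" and 9 rows of "해외1".
score = imbalance_score([733, 9])
print(f"{score:.1%}")
```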

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na       733  98.8%
해외1       9  1.2%

Interactions

[Figures: 16 interaction plots, not reproduced in this text version.]

Correlations

               연번     일일 확진자수  누계     총 사망자
연번            1.000  0.474         0.827  0.810
일일 확진자수   0.474  1.000         0.955  0.652
누계            0.827  0.955         1.000  0.792
총 사망자       0.810  0.652         0.792  1.000
               연번     일일 확진자수  누계     총 사망자  비고
연번            1.000  0.534         1.000  0.948      1.000
일일 확진자수   0.534  1.000         0.535  0.467      1.000
누계            1.000  0.535         1.000  0.946      1.000
총 사망자       0.948  0.467         0.946  1.000      1.000
비고            1.000  1.000         1.000  1.000      1.000
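
The two matrices give different values for the same pair of variables, for example 0.827 and 1.000 for 연번 versus 누계. The report text does not name the coefficients used. One plausible reading, offered here only as an assumption, is that one matrix is a linear (Pearson) correlation and the other is rank-based (Spearman), since a series that never decreases has a rank correlation of exactly 1 with the row number even when the relationship is not linear. The sketch below illustrates that effect on the first ten rows of the dataset; the helper functions are illustrative and written without external libraries.

```python
import math

def pearson(xs, ys):
    """Pearson (linear) correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks. Assumes no ties, which holds for the sample below."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# First ten rows of the dataset: 연번 and 누계.
serial = list(range(1, 11))
cumulative = [4, 20, 65, 152, 211, 298, 418, 477, 580, 716]

pearson_r = pearson(serial, cumulative)
spearman_r = spearman(serial, cumulative)
print(round(pearson_r, 3), round(spearman_r, 3))
```

On this ten-row slice the Pearson value is high but below 1, while the Spearman value is exactly 1; over the full 742 rows the gap would be expected to be wider because of the sharp rise at the end of the series.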

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

index  연번  날짜        일일 확진자수  누계  총 사망자  비고
0      1     2020-02-18  4              4     0          <NA>
1      2     2020-02-19  16             20    0          <NA>
2      3     2020-02-20  45             65    0          <NA>
3      4     2020-02-21  87             152   0          <NA>
4      5     2020-02-22  59             211   0          <NA>
5      6     2020-02-23  87             298   0          <NA>
6      7     2020-02-24  120            418   0          <NA>
7      8     2020-02-25  59             477   0          <NA>
8      9     2020-02-26  103            580   0          <NA>
9      10    2020-02-27  136            716   0          <NA>

Last rows

index  연번  날짜        일일 확진자수  누계  총 사망자  비고
732    733   2022-02-19  261            5351  26         <NA>
733    734   2022-02-20  262            5613  26         <NA>
734    735   2022-02-21  288            5901  26         <NA>
735    736   2022-02-22  313            6214  26         <NA>
736    737   2022-02-23  347            6561  26         <NA>
737    738   2022-02-24  340            6901  26         <NA>
738    739   2022-02-25  445            7346  26         <NA>
739    740   2022-02-26  374            7720  26         <NA>
740    741   2022-02-27  331            8051  30         <NA>
741    742   2022-02-28  327            8378  30         <NA>
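
In the sample rows, 누계 behaves as the running total of 일일 확진자수. The sketch below checks this for the first ten rows using values copied from the sample; it is only a consistency check on those rows, not a statement about every row in the dataset.

```python
from itertools import accumulate

# First ten rows of the sample: daily confirmed cases and the reported cumulative total.
daily_cases = [4, 16, 45, 87, 59, 87, 120, 59, 103, 136]
reported_cumulative = [4, 20, 65, 152, 211, 298, 418, 477, 580, 716]

running_total = list(accumulate(daily_cases))
print(running_total == reported_cumulative)
```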