Overview

Dataset statistics

Number of variables: 8
Number of observations: 423
Missing cells: 417
Missing cells (%): 12.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.0 KiB
Average record size in memory: 65.3 B
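
The percentage and size figures above follow from the raw counts. As a quick arithmetic check (a sketch; the variable names are illustrative and not part of the report):

```python
# Reproduce the derived dataset statistics from the raw counts.
n_variables = 8
n_observations = 423
missing_cells = 417

total_cells = n_variables * n_observations             # 3384 cells in total
missing_pct = 100 * missing_cells / total_cells        # share of empty cells

avg_record_bytes = 65.3                                 # as reported above
total_kib = avg_record_bytes * n_observations / 1024    # bytes -> KiB

print(f"Missing cells (%): {missing_pct:.1f}%")         # 12.3%
print(f"Total size in memory: {total_kib:.1f} KiB")     # 27.0 KiB
```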

Variable types

Text: 3
Categorical: 5

Dataset

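Description: Daily status of confirmed COVID-19 cases in Jung-gu, Daegu Metropolitan City. The data provide daily and cumulative totals for confirmed cases, deaths, and transfers to other districts. Column names are kept in Korean: 날짜/ 현황 = date/status, 확진환자 = confirmed cases, 사망자 = deaths, 타구이관 = transfers to other districts, 일계 = daily total, 누계 = cumulative total, 비고 = remarks.
Author: Jung-gu, Daegu Metropolitan City (대구광역시 중구)
URL: https://www.data.go.kr/data/15080726/fileData.do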

Alerts

확진환자(일계) is highly overall correlated with 타구이관(일계) and 1 other field [High correlation]
사망자(일계) is highly overall correlated with 사망자(누계) [High correlation]
사망자(누계) is highly overall correlated with 사망자(일계) and 1 other field [High correlation]
타구이관(일계) is highly overall correlated with 확진환자(일계) and 1 other field [High correlation]
타구이관(누계) is highly overall correlated with 확진환자(일계) and 2 other fields [High correlation]
확진환자(일계) is highly imbalanced (68.7%) [Imbalance]
타구이관(일계) is highly imbalanced (86.5%) [Imbalance]
타구이관(누계) is highly imbalanced (80.6%) [Imbalance]
비고 has 417 (98.6%) missing values [Missing]
날짜/ 현황 has unique values [Unique]
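
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the value counts. The sketch below is an assumption about the scoring, not a quote of the profiler's source; `imbalance_score` is an illustrative name. It reproduces the two figures for which this report lists the full value counts.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes (0 means perfectly balanced)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here: a single class scores 0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Value counts copied from the Common Values tables of this report.
daily_transfers = [407, 12, 3, 1]            # 타구이관(일계)
cumulative_transfers = [395, 13, 12, 2, 1]   # 타구이관(누계)

print(f"{imbalance_score(daily_transfers):.1%}")       # 86.5%
print(f"{imbalance_score(cumulative_transfers):.1%}")  # 80.6%
```

Both results match the percentages in the Imbalance alerts, which supports the assumed formula without confirming it.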

Reproduction

Analysis started: 2023-12-12 13:57:41.830302
Analysis finished: 2023-12-12 13:57:42.493605
Duration: 0.66 seconds
Software version: ydata-profiling v4.5.1

Variables

날짜/ 현황
Text

UNIQUE 

Distinct: 423
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Length

Max length: 8
Median length: 7
Mean length: 6.9858156
Min length: 6

Characters and Unicode

Total characters: 2955
Distinct characters: 21
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
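
To illustrate the kind of character properties meant here, the following sketch uses Python's standard `unicodedata` module to tally general categories for the value `2.18(화)` from this column. The helper name `category_counts` is illustrative; the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Nd', 'Lo')."""
    return Counter(unicodedata.category(ch) for ch in text)

value = "2.18(화)"
counts = category_counts(value)
for category, n in sorted(counts.items()):
    print(category, n)

# Character names expose the script of a letter, here a Hangul syllable.
print(unicodedata.name("화"))
```

The two-letter codes correspond to the category names used in this report: `Nd` is Decimal Number, `Po` Other Punctuation, `Ps` Open Punctuation, `Pe` Close Punctuation, `Lo` Other Letter, and `Zs` Space Separator.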

Unique

Unique: 423
Unique (%): 100.0%

Sample

1st row: 2.18(화)
2nd row: 2.19(수)
3rd row: 2.20(목)
4th row: 2.21(금)
5th row: 2.22(토)
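
The sample values follow a `month.day(weekday)` pattern without a year. A minimal parsing sketch, assuming the series starts in 2020 and that rows are in chronological order (neither is stated in the report); `parse_dates` is an illustrative helper:

```python
import re
from datetime import date

# Korean weekday abbreviations indexed by date.weekday() (Monday = 0).
WEEKDAYS = "월화수목금토일"
PATTERN = re.compile(r"\s*(\d{1,2})\.\s*(\d{1,2})\s*\((.)\)\s*")

def parse_dates(labels, start_year=2020):
    """Parse 'M.D(weekday)' labels, rolling the year over when the month drops."""
    year, prev_month, parsed = start_year, None, []
    for label in labels:
        match = PATTERN.fullmatch(label)
        if match is None:
            raise ValueError(f"unrecognized label: {label!r}")
        month, day, weekday = int(match[1]), int(match[2]), match[3]
        if prev_month is not None and month < prev_month:
            year += 1  # e.g. 12.31 followed by 1.1
        prev_month = month
        parsed_date = date(year, month, day)
        # The weekday in parentheses doubles as a check on the assumed year.
        if WEEKDAYS[parsed_date.weekday()] != weekday:
            raise ValueError(f"weekday mismatch for {label!r}")
        parsed.append(parsed_date)
    return parsed

print(parse_dates(["2.18(화)", "2.19(수)", "2.20(목)"]))
```

The weekday check passes for the sample rows with a 2020 start, which is why that year is used as the default here.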
Value               Count  Frequency (%)
10                      9  2.1%
2.18(화                 1  0.2%
12.03(목                1  0.2%
12.02(수                1  0.2%
12.01(화                1  0.2%
11.30(월                1  0.2%
11.29(일                1  0.2%
11.28(토                1  0.2%
11.27(금                1  0.2%
11.26(목                1  0.2%
Other values (415)    415  95.8%

Most occurring characters

Value              Count  Frequency (%)
(                    423  14.3%
)                    423  14.3%
.                    423  14.3%
1                    341  11.5%
2                    252  8.5%
3                    124  4.2%
0                     89  3.0%
4                     87  2.9%
5                     73  2.5%
8                     73  2.5%
Other values (11)    647  21.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number      1253  42.4%
Open Punctuation     423  14.3%
Close Punctuation    423  14.3%
Other Punctuation    423  14.3%
Other Letter         423  14.3%
Space Separator       10  0.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1        341  27.2%
2        252  20.1%
3        124  9.9%
0         89  7.1%
4         87  6.9%
5         73  5.8%
8         73  5.8%
7         72  5.7%
6         71  5.7%
9         71  5.7%
Other Letter
Value      Count  Frequency (%)
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%
Open Punctuation
Value  Count  Frequency (%)
(        423  100.0%
Close Punctuation
Value  Count  Frequency (%)
)        423  100.0%
Other Punctuation
Value  Count  Frequency (%)
.        423  100.0%
Space Separator
Value    Count  Frequency (%)
(space)     10  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common   2532  85.7%
Hangul    423  14.3%

Most frequent character per script

Common
Value             Count  Frequency (%)
(                   423  16.7%
)                   423  16.7%
.                   423  16.7%
1                   341  13.5%
2                   252  10.0%
3                   124  4.9%
0                    89  3.5%
4                    87  3.4%
5                    73  2.9%
8                    73  2.9%
Other values (4)    224  8.8%
Hangul
Value      Count  Frequency (%)
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%

Most occurring blocks

Value   Count  Frequency (%)
ASCII    2532  85.7%
Hangul    423  14.3%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(                   423  16.7%
)                   423  16.7%
.                   423  16.7%
1                   341  13.5%
2                   252  10.0%
3                   124  4.9%
0                    89  3.5%
4                    87  3.4%
5                    73  2.9%
8                    73  2.9%
Other values (4)    224  8.8%
Hangul
Value      Count  Frequency (%)
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     61  14.4%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%
(missing)     60  14.2%

확진환자(일계)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 17
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value              Count
-                    334
1                     46
2                     15
5                      5
6                      4
Other values (12)     19

Length

Max length: 2
Median length: 1
Mean length: 1.0189125
Min length: 1

Unique

Unique: 8
Unique (%): 1.9%

Sample

1st row: -
2nd row: 2
3rd row: 2
4th row: 2
5th row: 6

Common Values

Value             Count  Frequency (%)
-                   334  79.0%
1                    46  10.9%
2                    15  3.5%
5                     5  1.2%
6                     4  0.9%
3                     4  0.9%
4                     3  0.7%
8                     2  0.5%
9                     2  0.5%
12                    1  0.2%
Other values (7)      7  1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value             Count  Frequency (%)
-                   334  79.0%
1                    47  11.1%
2                    15  3.5%
5                     5  1.2%
6                     4  0.9%
3                     4  0.9%
4                     3  0.7%
8                     2  0.5%
9                     2  0.5%
12                    1  0.2%
Other values (6)      6  1.4%

확진환자(누계)
Text

Distinct: 88
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9692671
Min length: 1

Characters and Unicode

Total characters: 1256
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 47
Unique (%): 11.1%

Sample

1st row:  -
2nd row: 2
3rd row: 4
4th row: 6
5th row: 12

Value              Count  Frequency (%)
249                   52  12.3%
259                   52  12.3%
250                   41  9.7%
248                   37  8.7%
261                   18  4.3%
318                   15  3.5%
305                   15  3.5%
258                   12  2.8%
251                   12  2.8%
310                   10  2.4%
Other values (78)    159  37.6%

Most occurring characters

Value             Count  Frequency (%)
2                   356  28.3%
5                   167  13.3%
9                   145  11.5%
4                   124  9.9%
3                   104  8.3%
0                    95  7.6%
8                    89  7.1%
1                    85  6.8%
6                    53  4.2%
7                    35  2.8%
Other values (2)      3  0.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number     1253  99.8%
Space Separator       2  0.2%
Dash Punctuation      1  0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2        356  28.4%
5        167  13.3%
9        145  11.6%
4        124  9.9%
3        104  8.3%
0         95  7.6%
8         89  7.1%
1         85  6.8%
6         53  4.2%
7         35  2.8%
Space Separator
Value    Count  Frequency (%)
(space)      2  100.0%
Dash Punctuation
Value  Count  Frequency (%)
-          1  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common   1256  100.0%

Most frequent character per script

Common
Value             Count  Frequency (%)
2                   356  28.3%
5                   167  13.3%
9                   145  11.5%
4                   124  9.9%
3                   104  8.3%
0                    95  7.6%
8                    89  7.1%
1                    85  6.8%
6                    53  4.2%
7                    35  2.8%
Other values (2)      3  0.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII   1256  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
2                   356  28.3%
5                   167  13.3%
9                   145  11.5%
4                   124  9.9%
3                   104  8.3%
0                    95  7.6%
8                    89  7.1%
1                    85  6.8%
6                    53  4.2%
7                    35  2.8%
Other values (2)      3  0.2%

사망자(일계)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value  Count
-        239
-        179
1          5

Length

Max length: 3
Median length: 1
Mean length: 1.8463357
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value  Count  Frequency (%)
-        239  56.5%
-        179  42.3%
1          5  1.2%
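
The table lists `-` twice because values that differ only in whitespace are counted as distinct. The length statistics are consistent with 239 one-character and 179 three-character placeholders: (239 + 3 × 179 + 5) / 423 ≈ 1.8463. The sketch below checks that arithmetic and shows one way to normalize the column before analysis. The exact padded form (`" - "`) and the decision to read the placeholder as zero are assumptions, and `normalize_count` is an illustrative helper.

```python
def normalize_count(raw):
    """Map a raw cell to an int. A dash placeholder, with or without
    surrounding whitespace, is read as 0; '<NA>' or blank gives None."""
    cleaned = raw.strip()
    if cleaned in ("", "<NA>"):
        return None
    if cleaned == "-":
        return 0  # assumption: '-' means no events recorded that day
    return int(cleaned)

# Assumed raw forms, weighted by the counts in the table above.
values = ["-"] * 239 + [" - "] * 179 + ["1"] * 5

mean_length = sum(len(v) for v in values) / len(values)
print(round(mean_length, 7))  # 1.8463357

normalized = [normalize_count(v) for v in values]
print(sorted(set(normalized)))  # [0, 1]
```

After normalization the column has two distinct values instead of three, which matches the merged 418-row bar in the Common Values plot for this variable.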

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
-        418  98.8%
1          5  1.2%

사망자(누계)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value  Count
4        259
5         92
3         33
-         29
2          8

Length

Max length: 3
Median length: 1
Mean length: 1.1371158
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:  -
2nd row:  -
3rd row:  -
4th row:  -
5th row:  -

Common Values

Value  Count  Frequency (%)
4        259  61.2%
5         92  21.7%
3         33  7.8%
-         29  6.9%
2          8  1.9%
1          2  0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
4        259  61.2%
5         92  21.7%
3         33  7.8%
-         29  6.9%
2          8  1.9%
1          2  0.5%

타구이관(일계)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value  Count
-        407
<NA>      12
1          3
3          1

Length

Max length: 4
Median length: 1
Mean length: 1.0851064
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
-        407  96.2%
<NA>      12  2.8%
1          3  0.7%
3          1  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
-        407  96.2%
na        12  2.8%
1          3  0.7%
3          1  0.2%

타구이관(누계)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value  Count
6        395
4         13
<NA>      12
5          2
1          1

Length

Max length: 4
Median length: 1
Mean length: 1.0851064
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
6        395  93.4%
4         13  3.1%
<NA>      12  2.8%
5          2  0.5%
1          1  0.2%
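
The Distinct and Unique figures for this variable can be recovered from the counts above: distinct is the number of different values, while unique counts only the values that occur exactly once. A small check (a sketch; these definitions are inferred from the numbers in this report):

```python
from collections import Counter

# Value counts for 타구이관(누계), copied from the Common Values table.
counts = Counter({"6": 395, "4": 13, "<NA>": 12, "5": 2, "1": 1})

n_rows = sum(counts.values())
distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n_rows, distinct, unique)                       # 423 5 1
print(f"{distinct / n_rows:.1%} {unique / n_rows:.1%}")  # 1.2% 0.2%
```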

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
6        395  93.4%
4         13  3.1%
na        12  2.8%
5          2  0.5%
1          1  0.2%

비고
Text

MISSING 

Distinct: 4
Distinct (%): 66.7%
Missing: 417
Missing (%): 98.6%
Memory size: 3.4 KiB

Length

Max length: 9
Median length: 7
Mean length: 4.6666667
Min length: 2

Characters and Unicode

Total characters: 28
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2
Unique (%): 33.3%

Sample

1st row: 퇴원5포함
2nd row: 중구재이관1 포함
3rd row: 타구이관1
4th row: 타구이관1
5th row: ?

Value        Count  Frequency (%)
타구이관1        2  28.6%
(missing)        2  28.6%
퇴원5포함        1  14.3%
중구재이관1      1  14.3%
포함             1  14.3%

Most occurring characters

Value             Count  Frequency (%)
(missing)             3  10.7%
(missing)             3  10.7%
(missing)             3  10.7%
1                     3  10.7%
(missing)             3  10.7%
(missing)             2  7.1%
?                     2  7.1%
(missing)             2  7.1%
(missing)             2  7.1%
(missing)             1  3.6%
Other values (4)      4  14.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter          19  67.9%
Decimal Number         4  14.3%
Space Separator        3  10.7%
Other Punctuation      2  7.1%

Most frequent character per category

Other Letter
Value      Count  Frequency (%)
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
Decimal Number
Value  Count  Frequency (%)
1          3  75.0%
5          1  25.0%
Space Separator
Value    Count  Frequency (%)
(space)      3  100.0%
Other Punctuation
Value  Count  Frequency (%)
?          2  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul     19  67.9%
Common      9  32.1%

Most frequent character per script

Hangul
Value      Count  Frequency (%)
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
Common
Value    Count  Frequency (%)
1            3  33.3%
(space)      3  33.3%
?            2  22.2%
5            1  11.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul     19  67.9%
ASCII       9  32.1%

Most frequent character per block

Hangul
Value      Count  Frequency (%)
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      3  15.8%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      2  10.5%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
(missing)      1  5.3%
ASCII
Value    Count  Frequency (%)
1            3  33.3%
(space)      3  33.3%
?            2  22.2%
5            1  11.1%

Correlations

                확진환자(일계)  확진환자(누계)  사망자(일계)  사망자(누계)  타구이관(일계)  타구이관(누계)  비고
확진환자(일계)  1.000  0.996  0.234  0.631  0.719  0.918  1.000
확진환자(누계)  0.996  1.000  0.901  0.975  0.988  0.996  1.000
사망자(일계)    0.234  0.901  1.000  0.829  0.000  0.102  NaN
사망자(누계)    0.631  0.975  0.829  1.000  0.623  0.719  1.000
타구이관(일계)  0.719  0.988  0.000  0.623  1.000  0.527  1.000
타구이관(누계)  0.918  0.996  0.102  0.719  0.527  1.000  0.568
비고            1.000  1.000  NaN    1.000  1.000  0.568  1.000

                확진환자(일계)  타구이관(누계)  타구이관(일계)  사망자(일계)  사망자(누계)
확진환자(일계)  1.000  0.805  0.543  0.126  0.354
타구이관(누계)  0.805  1.000  0.529  0.096  0.550
타구이관(일계)  0.543  0.529  1.000  0.000  0.320
사망자(일계)    0.126  0.096  0.000  1.000  0.514
사망자(누계)    0.354  0.550  0.320  0.514  1.000

                확진환자(일계)  사망자(일계)  사망자(누계)  타구이관(일계)  타구이관(누계)
확진환자(일계)  1.000  0.126  0.354  0.543  0.805
사망자(일계)    0.126  1.000  0.514  0.000  0.096
사망자(누계)    0.354  0.514  1.000  0.320  0.550
타구이관(일계)  0.543  0.000  0.320  1.000  0.529
타구이관(누계)  0.805  0.096  0.550  0.529  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     날짜/ 현황  확진환자(일계)  확진환자(누계)  사망자(일계)  사망자(누계)  타구이관(일계)  타구이관(누계)  비고
0    2.18(화)    -    -    -  -  <NA>  <NA>  <NA>
1    2.19(수)    2    2    -  -  <NA>  <NA>  <NA>
2    2.20(목)    2    4    -  -  <NA>  <NA>  <NA>
3    2.21(금)    2    6    -  -  <NA>  <NA>  <NA>
4    2.22(토)    6    12   -  -  <NA>  <NA>  <NA>
5    2.23(일)    9    21   -  -  <NA>  <NA>  <NA>
6    2.24(월)    2    23   -  -  <NA>  <NA>  <NA>
7    2.25(화)    27   50   -  -  <NA>  <NA>  <NA>
8    2.26(수)    8    58   -  -  <NA>  <NA>  <NA>
9    2.27(목)    21   79   -  -  <NA>  <NA>  <NA>

     날짜/ 현황  확진환자(일계)  확진환자(누계)  사망자(일계)  사망자(누계)  타구이관(일계)  타구이관(누계)  비고
413  4.6(화)     -    319  -  5  -  6  <NA>
414  4.7(수)     -    319  -  5  -  6  <NA>
415  4.8(목)     -    319  -  5  -  6  <NA>
416  4.9(금)     -    319  -  5  -  6  <NA>
417  4.10(토)    -    319  -  5  -  6  <NA>
418  4.11(일)    1    320  -  5  -  6  <NA>
419  4.12(월)    -    320  -  5  -  6  <NA>
420  4.13(화)    1    321  -  5  -  6  <NA>
421  4.14(수)    1    322  -  5  -  6  <NA>
422  4.15(목)    -    322  -  5  -  6  <NA>
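
In the rows shown, each cumulative count of confirmed cases equals the previous cumulative count plus the daily count. A minimal consistency check over the first ten rows, reading `-` as zero (an assumption about the placeholder's meaning; `to_int` is an illustrative helper):

```python
from itertools import accumulate

# (daily, cumulative) confirmed cases for rows 0 to 9 of the sample.
rows = [("-", "-"), ("2", "2"), ("2", "4"), ("2", "6"), ("6", "12"),
        ("9", "21"), ("2", "23"), ("27", "50"), ("8", "58"), ("21", "79")]

def to_int(cell):
    """Read '-' (optionally padded with whitespace) as 0."""
    cell = cell.strip()
    return 0 if cell == "-" else int(cell)

daily = [to_int(d) for d, _ in rows]
cumulative = [to_int(c) for _, c in rows]

# The running sum of daily counts should reproduce the cumulative column.
running_total = list(accumulate(daily))
print(running_total == cumulative)  # True
```

The same check could be run over the whole file to flag rows where the cumulative column was corrected by hand, for example days annotated in 비고.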