Overview

Dataset statistics

Number of variables: 7
Number of observations: 2484
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 136.0 KiB
Average record size in memory: 56.1 B
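
The summary figures above can be recomputed from the raw rows. The following is a minimal sketch, assuming the table is available as a list of dictionaries; the helper name `dataset_statistics` and the toy rows are hypothetical and are not part of the report. The last line reproduces the average record size from the two reported numbers (total memory divided by the number of observations).

```python
def dataset_statistics(rows):
    """Compute overview counts for a table given as a list of dicts.

    Hypothetical helper: None counts as a missing cell, and a row is a
    duplicate when an earlier row has exactly the same values.
    """
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    missing = sum(1 for row in rows for value in row.values() if value is None)
    seen, duplicates = set(), 0
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)
    cells = n_obs * n_vars
    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / cells if cells else 0.0,
        "duplicate_rows": duplicates,
    }

# Toy rows, made up for illustration (not taken from the dataset).
toy = [
    {"id": "a", "status": "x"},
    {"id": "b", "status": None},
    {"id": "a", "status": "x"},
]
stats = dataset_statistics(toy)

# Average record size: 136.0 KiB spread over 2484 observations, in bytes.
avg_record_bytes = 136.0 * 1024 / 2484
print(stats, round(avg_record_bytes, 1))
```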

Variable types

Categorical: 3
Text: 3
DateTime: 1

Dataset

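Description: COVID-19 confirmed cases and deaths in Jinju-si, Gyeongsangnam-do (경상남도 진주시), from the date of the first confirmed case in 2020 through 2021-12-31
Author: Jinju-si, Gyeongsangnam-do (경상남도 진주시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15098679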

Alerts

시군구명 has constant value "경상남도 진주시" (Constant)
데이터기준일 has constant value "2022-01-28" (Constant)
확진자 상태 is highly imbalanced (95.7%) (Imbalance)
확진자 번호 has unique values (Unique)
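
The 95.7% imbalance figure is consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The report does not state the formula, so the sketch below is an assumption; it uses the four 확진자 상태 counts reported in that variable's section, and the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform distribution, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        # Convention assumed here: a single class has no imbalance to measure.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 완치(퇴원), <NA>, 사망, 격리해제 as listed in the report.
status_counts = [2462, 12, 7, 3]
score = imbalance_score(status_counts)
print(f"{score:.1%}")  # 95.7%
```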

Reproduction

Analysis started: 2023-12-10 23:19:05.148551
Analysis finished: 2023-12-10 23:19:05.996394
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
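
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is only an arithmetic sketch using Python's standard `datetime` module; the variable names are illustrative.

```python
from datetime import datetime

started = datetime.fromisoformat("2023-12-10 23:19:05.148551")
finished = datetime.fromisoformat("2023-12-10 23:19:05.996394")

# Elapsed wall-clock time between the two reported timestamps.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.85 seconds
```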

Variables

시군구명 (city/county/district name)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Value | Count
경상남도 진주시 | 2484

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상남도 진주시
2nd row: 경상남도 진주시
3rd row: 경상남도 진주시
4th row: 경상남도 진주시
5th row: 경상남도 진주시

Common Values

Value | Count | Frequency (%)
경상남도 진주시 | 2484 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
경상남도 | 2484 | 50.0%
진주시 | 2484 | 50.0%

확진자 번호 (confirmed case number)
Text

UNIQUE 

Distinct: 2484
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Length

Max length: 16
Median length: 15
Mean length: 13.845813
Min length: 8

Characters and Unicode

Total characters: 34393
Distinct characters: 19
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
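
To illustrate what these character properties look like in practice, the sketch below counts Unicode general categories for one identifier from this variable's sample rows, using Python's standard `unicodedata` module. The function name is hypothetical, and the report itself may derive its category, script and block tables differently.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general-category codes (for example 'Lo', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "진주2483(경남20778)"  # one of the sample values of 확진자 번호
counts = category_counts(sample)

# Lo = Other Letter (Hangul syllables), Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation.
print(counts)
```

The two-letter codes map onto the category names used in the tables of this section, such as "Other Letter" and "Decimal Number".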

Unique

Unique: 2484
Unique (%): 100.0%

Sample

1st row: 진주2483(경남20778)
2nd row: 진주2482(경남20777)
3rd row: 진주2481(경남20776)
4th row: 진주2480(경남20775)
5th row: 진주2479(경남20774)
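
The sample values follow a `진주<n>(경남<m>)` pattern, which looks like a city-level sequence number followed by a province-level number in parentheses. The report does not document this format, so the pattern, the interpretation and the function name below are assumptions. Under that assumption, a regular expression can extract the two numbers and return `None` for values that deviate, so they can be reviewed separately.

```python
import re

# Assumed layout: "진주" + digits + "(" + "경남" + digits + ")", with optional spaces.
ID_PATTERN = re.compile(r"^진주\s*(\d+)\s*\(\s*경남\s*(\d+)\s*\)$")

def parse_case_id(value):
    """Return (city_number, province_number), or None if the value does not match."""
    match = ID_PATTERN.match(value.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

parsed = parse_case_id("진주2483(경남20778)")
# Made-up malformed value (closing parenthesis removed) to show the None path.
malformed = parse_case_id("진주2483(경남20778")
print(parsed, malformed)
```

The character counts in this section (2484 opening versus 2482 closing parentheses, plus 5 spaces) suggest that a small number of values do not fit the pattern exactly, which is why unmatched values are kept apart rather than discarded.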

Words

Value | Count | Frequency (%)
진주2483(경남20778 | 1 | < 0.1%
진주826(경남2916 | 1 | < 0.1%
진주832(경남2923 | 1 | < 0.1%
진주816(경남2881 | 1 | < 0.1%
진주831(경남2922 | 1 | < 0.1%
진주830(경남2921 | 1 | < 0.1%
진주829(경남2920 | 1 | < 0.1%
진주828(경남2919 | 1 | < 0.1%
진주827(경남2917 | 1 | < 0.1%
진주825(경남2909 | 1 | < 0.1%
Other values (2476) | 2476 | 99.6%

Most occurring characters

ValueCountFrequency (%)
1 3633
 
10.6%
2 2703
 
7.9%
2486
 
7.2%
2486
 
7.2%
( 2484
 
7.2%
2482
 
7.2%
2482
 
7.2%
) 2482
 
7.2%
3 1977
 
5.7%
4 1725
 
5.0%
Other values (9) 9453
27.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19482
56.6%
Other Letter 9940
28.9%
Open Punctuation 2484
 
7.2%
Close Punctuation 2482
 
7.2%
Space Separator 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3633
18.6%
2 2703
13.9%
3 1977
10.1%
4 1725
8.9%
5 1648
8.5%
9 1587
8.1%
7 1578
8.1%
6 1572
8.1%
8 1563
8.0%
0 1496
7.7%
Other Letter
ValueCountFrequency (%)
2486
25.0%
2486
25.0%
2482
25.0%
2482
25.0%
2
 
< 0.1%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2484
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2482
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24453
71.1%
Hangul 9940
28.9%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3633
14.9%
2 2703
11.1%
( 2484
10.2%
) 2482
10.2%
3 1977
8.1%
4 1725
7.1%
5 1648
6.7%
9 1587
6.5%
7 1578
6.5%
6 1572
6.4%
Other values (3) 3064
12.5%
Hangul
ValueCountFrequency (%)
2486
25.0%
2486
25.0%
2482
25.0%
2482
25.0%
2
 
< 0.1%
2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24453
71.1%
Hangul 9940
28.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3633
14.9%
2 2703
11.1%
( 2484
10.2%
) 2482
10.2%
3 1977
8.1%
4 1725
7.1%
5 1648
6.7%
9 1587
6.5%
7 1578
6.5%
6 1572
6.4%
Other values (3) 3064
12.5%
Hangul
ValueCountFrequency (%)
2486
25.0%
2486
25.0%
2482
25.0%
2482
25.0%
2
 
< 0.1%
2
 
< 0.1%

감염경로 (infection route)
Text

Distinct: 65
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Length

Max length: 25
Median length: 21
Mean length: 8.2117552
Min length: 2

Characters and Unicode

Total characters: 20398
Distinct characters: 121
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 1.1%

Sample

1st row: 도내 확진자 접촉
2nd row: 조사중
3rd row: 도내 확진자 접촉
4th row: 도내 확진자 접촉
5th row: 도내 확진자 접촉
ValueCountFrequency (%)
접촉 1127
17.5%
확진자 1099
17.1%
도내 951
14.8%
관련 610
9.5%
조사중 484
7.5%
진주 397
 
6.2%
233
 
3.6%
진주목욕탕 233
 
3.6%
소재 171
 
2.7%
타지역 146
 
2.3%
Other values (83) 973
15.1%

Most occurring characters

ValueCountFrequency (%)
3942
19.3%
1874
 
9.2%
1167
 
5.7%
1131
 
5.5%
1130
 
5.5%
1126
 
5.5%
1023
 
5.0%
973
 
4.8%
792
 
3.9%
688
 
3.4%
Other values (111) 6552
32.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16123
79.0%
Space Separator 3942
 
19.3%
Letter Number 236
 
1.2%
Decimal Number 72
 
0.4%
Close Punctuation 11
 
0.1%
Open Punctuation 11
 
0.1%
Uppercase Letter 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1874
 
11.6%
1167
 
7.2%
1131
 
7.0%
1130
 
7.0%
1126
 
7.0%
1023
 
6.3%
973
 
6.0%
792
 
4.9%
688
 
4.3%
688
 
4.3%
Other values (96) 5531
34.3%
Decimal Number
ValueCountFrequency (%)
3 49
68.1%
4 6
 
8.3%
1 5
 
6.9%
5 3
 
4.2%
2 3
 
4.2%
6 3
 
4.2%
7 2
 
2.8%
8 1
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
I 1
50.0%
M 1
50.0%
Space Separator
ValueCountFrequency (%)
3942
100.0%
Letter Number
ValueCountFrequency (%)
236
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16123
79.0%
Common 4037
 
19.8%
Latin 238
 
1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1874
 
11.6%
1167
 
7.2%
1131
 
7.0%
1130
 
7.0%
1126
 
7.0%
1023
 
6.3%
973
 
6.0%
792
 
4.9%
688
 
4.3%
688
 
4.3%
Other values (96) 5531
34.3%
Common
ValueCountFrequency (%)
3942
97.6%
3 49
 
1.2%
) 11
 
0.3%
( 11
 
0.3%
4 6
 
0.1%
1 5
 
0.1%
5 3
 
0.1%
2 3
 
0.1%
6 3
 
0.1%
7 2
 
< 0.1%
Other values (2) 2
 
< 0.1%
Latin
ValueCountFrequency (%)
236
99.2%
I 1
 
0.4%
M 1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16122
79.0%
ASCII 4039
 
19.8%
Number Forms 236
 
1.2%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3942
97.6%
3 49
 
1.2%
) 11
 
0.3%
( 11
 
0.3%
4 6
 
0.1%
1 5
 
0.1%
5 3
 
0.1%
2 3
 
0.1%
6 3
 
0.1%
7 2
 
< 0.1%
Other values (4) 4
 
0.1%
Hangul
ValueCountFrequency (%)
1874
 
11.6%
1167
 
7.2%
1131
 
7.0%
1130
 
7.0%
1126
 
7.0%
1023
 
6.3%
973
 
6.0%
792
 
4.9%
688
 
4.3%
688
 
4.3%
Other values (95) 5530
34.3%
Number Forms
ValueCountFrequency (%)
236
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

확진일자 (confirmation date)
DateTime

Distinct: 331
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB
Minimum: 2020-01-12 00:00:00
Maximum: 2021-12-31 00:00:00

Histogram with fixed size bins (bins=50)
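
Fixed-size binning divides the range between the minimum and maximum date into equal-width intervals and counts the observations that fall into each. The following is a minimal sketch, assuming 50 bins as in the caption and using only three dates that appear in this report (the minimum, the earliest date in the sample rows, and the maximum). The function name is hypothetical, and the report's plotting code may treat bin edges differently.

```python
from datetime import date

def histogram_fixed_bins(dates, bins=50):
    """Count dates in `bins` equal-width intervals spanning min(dates)..max(dates)."""
    lo, hi = min(dates), max(dates)
    span_days = (hi - lo).days
    counts = [0] * bins
    for d in dates:
        if span_days == 0:
            index = 0
        else:
            # Scale the day offset onto the bin range; the maximum is clamped into the last bin.
            index = min((d - lo).days * bins // span_days, bins - 1)
        counts[index] += 1
    return counts

dates = [date(2020, 1, 12), date(2020, 2, 21), date(2021, 12, 31)]
counts = histogram_fixed_bins(dates)
print([i for i, c in enumerate(counts) if c])  # indices of the non-empty bins
```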

입원기관 (admitting institution)
Text

Distinct: 58
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Length

Max length: 15
Median length: 13
Mean length: 8.5845411
Min length: 5

Characters and Unicode

Total characters: 21324
Distinct characters: 78
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 1.0%

Sample

1st row: 경남권 생활치료센터
2nd row: 경남권 생활치료센터
3rd row: 근로복지공단 창원병원
4th row: 국립마산병원
5th row: 근로복지공단 창원병원

Words

Value | Count | Frequency (%)
생활치료센터 | 1215 | 29.3%
경남권 | 908 | 21.9%
마산의료원 | 651 | 15.7%
근로복지공단 | 341 | 8.2%
창원병원 | 341 | 8.2%
국립마산병원 | 111 | 2.7%
창원 | 96 | 2.3%
진주 | 89 | 2.1%
양산 | 73 | 1.8%
한마음 | 68 | 1.6%
Other values (45) | 247 | 6.0%

Most occurring characters

ValueCountFrequency (%)
1933
 
9.1%
1728
 
8.1%
1657
 
7.8%
1279
 
6.0%
1279
 
6.0%
1279
 
6.0%
1277
 
6.0%
1277
 
6.0%
981
 
4.6%
914
 
4.3%
Other values (68) 7720
36.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 19661
92.2%
Space Separator 1657
 
7.8%
Uppercase Letter 4
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1933
 
9.8%
1728
 
8.8%
1279
 
6.5%
1279
 
6.5%
1279
 
6.5%
1277
 
6.5%
1277
 
6.5%
981
 
5.0%
914
 
4.6%
912
 
4.6%
Other values (63) 6802
34.6%
Uppercase Letter
ValueCountFrequency (%)
H 2
50.0%
M 2
50.0%
Space Separator
ValueCountFrequency (%)
1657
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 19661
92.2%
Common 1659
 
7.8%
Latin 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1933
 
9.8%
1728
 
8.8%
1279
 
6.5%
1279
 
6.5%
1279
 
6.5%
1277
 
6.5%
1277
 
6.5%
981
 
5.0%
914
 
4.6%
912
 
4.6%
Other values (63) 6802
34.6%
Common
ValueCountFrequency (%)
1657
99.9%
( 1
 
0.1%
) 1
 
0.1%
Latin
ValueCountFrequency (%)
H 2
50.0%
M 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 19661
92.2%
ASCII 1663
 
7.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1933
 
9.8%
1728
 
8.8%
1279
 
6.5%
1279
 
6.5%
1279
 
6.5%
1277
 
6.5%
1277
 
6.5%
981
 
5.0%
914
 
4.6%
912
 
4.6%
Other values (63) 6802
34.6%
ASCII
ValueCountFrequency (%)
1657
99.6%
H 2
 
0.1%
M 2
 
0.1%
( 1
 
0.1%
) 1
 
0.1%

확진자 상태 (case status)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Value | Count
완치(퇴원) | 2462
<NA> | 12
사망 | 7
격리해제 | 3

Length

Max length: 6
Median length: 6
Mean length: 5.9766506
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 완치(퇴원)
2nd row: 완치(퇴원)
3rd row: 완치(퇴원)
4th row: 완치(퇴원)
5th row: 완치(퇴원)

Common Values

Value | Count | Frequency (%)
완치(퇴원) | 2462 | 99.1%
<NA> | 12 | 0.5%
사망 | 7 | 0.3%
격리해제 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
완치(퇴원 | 2462 | 99.1%
na | 12 | 0.5%
사망 | 7 | 0.3%
격리해제 | 3 | 0.1%

데이터기준일 (data reference date)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.5 KiB

Value | Count
2022-01-28 | 2484

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-01-28
2nd row: 2022-01-28
3rd row: 2022-01-28
4th row: 2022-01-28
5th row: 2022-01-28

Common Values

Value | Count | Frequency (%)
2022-01-28 | 2484 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
2022-01-28 | 2484 | 100.0%

Correlations

Variable | 감염경로 | 입원기관 | 확진자 상태
감염경로 | 1.000 | 0.781 | 0.449
입원기관 | 0.781 | 1.000 | 0.639
확진자 상태 | 0.449 | 0.639 | 1.000
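
The report does not state which association measure fills this matrix. For pairs of categorical variables a common choice is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below implements the plain (uncorrected) form purely as an illustration; that a measure of this kind was used here is an assumption, the function name is hypothetical, and the toy inputs are made up.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    denom = n * min(len(row_totals) - 1, len(col_totals) - 1)
    # A variable with a single level carries no association information.
    return math.sqrt(chi2 / denom) if denom > 0 else 0.0

# Toy data: one pair where y is fully determined by x, one where they are unrelated.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

Values near 1, such as the 0.781 between 감염경로 and 입원기관, would indicate that knowing one variable narrows down the other considerably, while values near 0 indicate little association.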

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 시군구명 | 확진자 번호 | 감염경로 | 확진일자 | 입원기관 | 확진자 상태 | 데이터기준일
0 | 경상남도 진주시 | 진주2483(경남20778) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
1 | 경상남도 진주시 | 진주2482(경남20777) | 조사중 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
2 | 경상남도 진주시 | 진주2481(경남20776) | 도내 확진자 접촉 | 2021-12-31 | 근로복지공단 창원병원 | 완치(퇴원) | 2022-01-28
3 | 경상남도 진주시 | 진주2480(경남20775) | 도내 확진자 접촉 | 2021-12-31 | 국립마산병원 | 완치(퇴원) | 2022-01-28
4 | 경상남도 진주시 | 진주2479(경남20774) | 도내 확진자 접촉 | 2021-12-31 | 근로복지공단 창원병원 | 완치(퇴원) | 2022-01-28
5 | 경상남도 진주시 | 진주2478(경남20773) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
6 | 경상남도 진주시 | 진주2477(경남20772) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
7 | 경상남도 진주시 | 진주2476(경남20771) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
8 | 경상남도 진주시 | 진주2475(경남20770) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28
9 | 경상남도 진주시 | 진주2474(경남20769) | 도내 확진자 접촉 | 2021-12-31 | 경남권 생활치료센터 | 완치(퇴원) | 2022-01-28

Index | 시군구명 | 확진자 번호 | 감염경로 | 확진일자 | 입원기관 | 확진자 상태 | 데이터기준일
2474 | 경상남도 진주시 | 진주10(경남114) | 윙스타워 관련(진주7번 접촉, 진주8번 접촉) | 2020-04-08 | 경상국립대학교병원 | 완치(퇴원) | 2022-01-28
2475 | 경상남도 진주시 | 진주9(경남107) | 윙스타워 관련(윙스온천 이용) | 2020-04-03 | 마산의료원 | 완치(퇴원) | 2022-01-28
2476 | 경상남도 진주시 | 진주8(경남103) | 윙스타워 관련(진주7번 접촉) | 2020-03-31 | 경상국립대학교병원 | 완치(퇴원) | 2022-01-28
2477 | 경상남도 진주시 | 진주7(경남100) | 윙스타워 관련(진주4번 접촉) | 2020-03-31 | 마산의료원 | 완치(퇴원) | 2022-01-28
2478 | 경상남도 진주시 | 진주6(경남99) | 윙스타워 관련(진주5번 접촉) | 2020-03-31 | 마산의료원 | 완치(퇴원) | 2022-01-28
2479 | 경상남도 진주시 | 진주5(경남98) | 윙스타워 관련(진주4번(배우자) 접촉) | 2020-03-31 | 마산의료원 | 완치(퇴원) | 2022-01-28
2480 | 경상남도 진주시 | 진주4(경남97) | 윙스타워 관련(윙스온천 이용) | 2020-03-31 | 마산의료원 | 완치(퇴원) | 2022-01-28
2481 | 경상남도 진주시 | 진주3(경남93) | 윙스타워 관련(윙스온천 이용) | 2020-03-28 | 마산의료원 | 완치(퇴원) | 2022-01-28
2482 | 경상남도 진주시 | 진주2(경남4) | 신천지 대구교회 방문(2월 16일) | 2020-02-21 | 마산의료원 | 완치(퇴원) | 2022-01-28
2483 | 경상남도 진주시 | 진주1(경남3) | 신천지 대구교회 방문(2월 16일) | 2020-02-21 | 마산의료원 | 완치(퇴원) | 2022-01-28