Overview

Dataset statistics

Number of variables: 7
Number of observations: 89
Missing cells: 251
Missing cells (%): 40.3%
Duplicate rows: 9
Duplicate rows (%): 10.1%
Total size in memory: 5.0 KiB
Average record size in memory: 57.5 B

Variable types

Unsupported: 2
Text: 4
Categorical: 1

Dataset

Description: 2014 civil defense training schedule
Author: Gwangmyeong-si, Gyeonggi-do
URL: https://www.data.go.kr/data/15053863/fileData.do

Alerts

Unnamed: 6 has constant value "" (Constant)
Dataset has 9 (10.1%) duplicate rows (Duplicates)
Unnamed: 0 has 15 (16.9%) missing values (Missing)
Unnamed: 1 has 50 (56.2%) missing values (Missing)
Unnamed: 2 has 79 (88.8%) missing values (Missing)
Unnamed: 4 has 1 (1.1%) missing values (Missing)
Unnamed: 5 has 18 (20.2%) missing values (Missing)
Unnamed: 6 has 88 (98.9%) missing values (Missing)
Unnamed: 0 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
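
The per-column missing counts in the alerts can be cross-checked against the headline figures in the dataset statistics. The sketch below only re-derives the percentages from numbers printed in this report. The variable names are illustrative, and the zero for Unnamed: 3 is taken from that variable's own section.

```python
# Figures copied from the "Dataset statistics" and "Alerts" sections of this report.
n_variables = 7
n_observations = 89
duplicate_rows = 9
missing_per_column = {
    "Unnamed: 0": 15,
    "Unnamed: 1": 50,
    "Unnamed: 2": 79,
    "Unnamed: 3": 0,   # reported as 0 missing in the variable section
    "Unnamed: 4": 1,
    "Unnamed: 5": 18,
    "Unnamed: 6": 88,
}

# Missing cells (%) is missing cells divided by rows x columns.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = round(100 * missing_cells / total_cells, 1)

# Duplicate rows (%) is duplicate rows divided by the number of observations.
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(missing_cells, missing_pct, duplicate_pct)
```

Both percentages reproduce the values shown in the overview, which suggests the alerts and the summary were computed from the same 89 x 7 table.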

Reproduction

Analysis started: 2023-12-12 06:39:03.026244
Analysis finished: 2023-12-12 06:39:03.636487
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 15
Missing (%): 16.9%
Memory size: 844.0 B

Unnamed: 1
Text

MISSING 

Distinct: 39
Distinct (%): 100.0%
Missing: 50
Missing (%): 56.2%
Memory size: 844.0 B

Length

Max length: 14
Median length: 14
Mean length: 13.538462
Min length: 3

Characters and Unicode

Total characters: 528
Distinct characters: 23
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 100.0%

Sample

1st row: 일자별
2nd row: 2014. 3. 17(월)
3rd row: 2014. 3. 18(화)
4th row: 2014. 3. 19(수)
5th row: 2014. 3. 20(목)
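
The sample rows show that this column stores dates as free text with a weekday character in parentheses, and that spacing is not uniform (the final sample table also contains `2014.6. 19(목)`). A minimal sketch of parsing such cells with the standard library follows. The function name and the regular expression are assumptions made for illustration, not part of the profiling report.

```python
import re
from datetime import date

# Matches "2014. 3. 17(월)" as well as the tighter "2014.6. 19(목)".
DATE_PATTERN = re.compile(r"^\s*(\d{4})\.\s*(\d{1,2})\.\s*(\d{1,2})\s*\((.)\)\s*$")

def parse_schedule_date(text):
    """Return (date, weekday_char) for a schedule cell, or None if it is not a date."""
    match = DATE_PATTERN.match(text)
    if match is None:
        return None  # e.g. the embedded header cell "일자별"
    year, month, day, weekday = match.groups()
    return date(int(year), int(month), int(day)), weekday

print(parse_schedule_date("2014. 3. 17(월)"))
print(parse_schedule_date("일자별"))
```

Returning `None` for non-matching cells keeps the embedded header row from raising an error during cleaning.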
Value | Count | Frequency (%)
2014 | 28 | 25.9%
3 | 11 | 10.2%
4 | 10 | 9.3%
5 | 8 | 7.4%
2014.6 | 5 | 4.6%
2013 | 5 | 4.6%
6 | 2 | 1.9%
17(화 | 1 | 0.9%
14(수 | 1 | 0.9%
20(금 | 1 | 0.9%
Other values (36) | 36 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
. | 76 | 14.4%
(space) | 69 | 13.1%
2 | 60 | 11.4%
1 | 60 | 11.4%
4 | 50 | 9.5%
0 | 42 | 8.0%
( | 38 | 7.2%
) | 38 | 7.2%
3 | 19 | 3.6%
5 | 12 | 2.3%
Other values (13) | 64 | 12.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 266 | 50.4%
Other Punctuation | 76 | 14.4%
Space Separator | 69 | 13.1%
Other Letter | 41 | 7.8%
Open Punctuation | 38 | 7.2%
Close Punctuation | 38 | 7.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 60 | 22.6%
1 | 60 | 22.6%
4 | 50 | 18.8%
0 | 42 | 15.8%
3 | 19 | 7.1%
5 | 12 | 4.5%
6 | 11 | 4.1%
7 | 4 | 1.5%
8 | 4 | 1.5%
9 | 4 | 1.5%
Other Letter
ValueCountFrequency (%)
8
19.5%
8
19.5%
7
17.1%
6
14.6%
6
14.6%
2
 
4.9%
2
 
4.9%
1
 
2.4%
1
 
2.4%
Other Punctuation
Value | Count | Frequency (%)
. | 76 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 69 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 38 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 38 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 487 | 92.2%
Hangul | 41 | 7.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 76 | 15.6%
(space) | 69 | 14.2%
2 | 60 | 12.3%
1 | 60 | 12.3%
4 | 50 | 10.3%
0 | 42 | 8.6%
( | 38 | 7.8%
) | 38 | 7.8%
3 | 19 | 3.9%
5 | 12 | 2.5%
Other values (4) | 23 | 4.7%
Hangul
ValueCountFrequency (%)
8
19.5%
8
19.5%
7
17.1%
6
14.6%
6
14.6%
2
 
4.9%
2
 
4.9%
1
 
2.4%
1
 
2.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 487 | 92.2%
Hangul | 41 | 7.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 76 | 15.6%
(space) | 69 | 14.2%
2 | 60 | 12.3%
1 | 60 | 12.3%
4 | 50 | 10.3%
0 | 42 | 8.6%
( | 38 | 7.8%
) | 38 | 7.8%
3 | 19 | 3.9%
5 | 12 | 2.5%
Other values (4) | 23 | 4.7%
Hangul
ValueCountFrequency (%)
8
19.5%
8
19.5%
7
17.1%
6
14.6%
6
14.6%
2
 
4.9%
2
 
4.9%
1
 
2.4%
1
 
2.4%

Unnamed: 2
Text

MISSING 

Distinct: 5
Distinct (%): 50.0%
Missing: 79
Missing (%): 88.8%
Memory size: 844.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.9
Min length: 2

Characters and Unicode

Total characters: 29
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 10.0%

Sample

1st row: 년차
2nd row: 2년차
3rd row: 2년차
4th row: 1년차
5th row: 1년차
Value | Count | Frequency (%)
4년차 | 3 | 30.0%
2년차 | 2 | 20.0%
1년차 | 2 | 20.0%
3년차 | 2 | 20.0%
년차 | 1 | 10.0%

Most occurring characters

Value | Count | Frequency (%)
년 | 10 | 34.5%
차 | 10 | 34.5%
4 | 3 | 10.3%
2 | 2 | 6.9%
1 | 2 | 6.9%
3 | 2 | 6.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 20 | 69.0%
Decimal Number | 9 | 31.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 3 | 33.3%
2 | 2 | 22.2%
1 | 2 | 22.2%
3 | 2 | 22.2%
Other Letter
Value | Count | Frequency (%)
년 | 10 | 50.0%
차 | 10 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 20 | 69.0%
Common | 9 | 31.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
4 | 3 | 33.3%
2 | 2 | 22.2%
1 | 2 | 22.2%
3 | 2 | 22.2%
Hangul
Value | Count | Frequency (%)
년 | 10 | 50.0%
차 | 10 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 20 | 69.0%
ASCII | 9 | 31.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
년 | 10 | 50.0%
차 | 10 | 50.0%
ASCII
Value | Count | Frequency (%)
4 | 3 | 33.3%
2 | 2 | 22.2%
1 | 2 | 22.2%
3 | 2 | 22.2%

Unnamed: 3
Categorical

Distinct: 5
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 844.0 B
Value | Count
오전 | 38
오후 | 34
<NA> | 15
교육대상 | 1
시간별 | 1

Length

Max length: 4
Median length: 2
Mean length: 2.3707865
Min length: 2

Unique

Unique: 2
Unique (%): 2.2%

Sample

1st row: 교육대상
2nd row: 시간별
3rd row: 오전
4th row: 오후
5th row: 오전

Common Values

Value | Count | Frequency (%)
오전 | 38 | 42.7%
오후 | 34 | 38.2%
<NA> | 15 | 16.9%
교육대상 | 1 | 1.1%
시간별 | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
오전 | 38 | 42.7%
오후 | 34 | 38.2%
na | 15 | 16.9%
교육대상 | 1 | 1.1%
시간별 | 1 | 1.1%

Unnamed: 4
Text

MISSING 

Distinct: 47
Distinct (%): 53.4%
Missing: 1
Missing (%): 1.1%
Memory size: 844.0 B

Length

Max length: 10
Median length: 4
Mean length: 5.8068182
Min length: 2

Characters and Unicode

Total characters: 511
Distinct characters: 27
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 33.0%

Sample

1st row: 동별
2nd row: 광명1동
3rd row: 하안3동
4th row: 광명3동
5th row: 광명4동
ValueCountFrequency (%)
철산3동 5
 
5.3%
광명7동 5
 
5.3%
소하1동 4
 
4.3%
하안2동 4
 
4.3%
하안1동 4
 
4.3%
철산4동 4
 
4.3%
소하2동 4
 
4.3%
하안3동 4
 
4.3%
3
 
3.2%
광명1동 3
 
3.2%
Other values (39) 54
57.4%

Most occurring characters

ValueCountFrequency (%)
85
16.6%
61
11.9%
1 40
 
7.8%
31
 
6.1%
31
 
6.1%
) 30
 
5.9%
( 30
 
5.9%
2 26
 
5.1%
18
 
3.5%
18
 
3.5%
Other values (17) 141
27.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 293 | 57.3%
Decimal Number | 152 | 29.7%
Close Punctuation | 30 | 5.9%
Open Punctuation | 30 | 5.9%
Space Separator | 6 | 1.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
85
29.0%
61
20.8%
31
 
10.6%
31
 
10.6%
18
 
6.1%
18
 
6.1%
18
 
6.1%
13
 
4.4%
4
 
1.4%
4
 
1.4%
Other values (4) 10
 
3.4%
Decimal Number
Value | Count | Frequency (%)
1 | 40 | 26.3%
2 | 26 | 17.1%
4 | 17 | 11.2%
3 | 17 | 11.2%
7 | 13 | 8.6%
9 | 10 | 6.6%
0 | 9 | 5.9%
8 | 7 | 4.6%
5 | 7 | 4.6%
6 | 6 | 3.9%
Close Punctuation
Value | Count | Frequency (%)
) | 30 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 30 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 293 | 57.3%
Common | 218 | 42.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
85
29.0%
61
20.8%
31
 
10.6%
31
 
10.6%
18
 
6.1%
18
 
6.1%
18
 
6.1%
13
 
4.4%
4
 
1.4%
4
 
1.4%
Other values (4) 10
 
3.4%
Common
Value | Count | Frequency (%)
1 | 40 | 18.3%
) | 30 | 13.8%
( | 30 | 13.8%
2 | 26 | 11.9%
4 | 17 | 7.8%
3 | 17 | 7.8%
7 | 13 | 6.0%
9 | 10 | 4.6%
0 | 9 | 4.1%
8 | 7 | 3.2%
Other values (3) | 19 | 8.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 293 | 57.3%
ASCII | 218 | 42.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
85
29.0%
61
20.8%
31
 
10.6%
31
 
10.6%
18
 
6.1%
18
 
6.1%
18
 
6.1%
13
 
4.4%
4
 
1.4%
4
 
1.4%
Other values (4) 10
 
3.4%
ASCII
Value | Count | Frequency (%)
1 | 40 | 18.3%
) | 30 | 13.8%
( | 30 | 13.8%
2 | 26 | 11.9%
4 | 17 | 7.8%
3 | 17 | 7.8%
7 | 13 | 6.0%
9 | 10 | 4.6%
0 | 9 | 4.1%
8 | 7 | 3.2%
Other values (3) | 19 | 8.7%

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 18
Missing (%): 20.2%
Memory size: 844.0 B

Unnamed: 6
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 88
Missing (%): 98.9%
Memory size: 844.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 비고
Value | Count | Frequency (%)
비고 | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 0.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.909
Unnamed: 4 | 1.000 | 0.000 | 0.909 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
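
The idea of nullity correlation can be illustrated as a Pearson correlation between presence indicators (1 where a cell is filled, 0 where it is missing). The sketch below is an assumption about how such a figure could be computed and is not taken from the profiler's source. The function name and the toy columns are hypothetical.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two equal-length columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # a fully present or fully missing column has no defined correlation
    return cov / sqrt(var_a * var_b)

# Toy columns shaped like the schedule: dates on morning rows only, year label on the first row only.
dates = ["2014. 3. 17(월)", None, "2014. 3. 18(화)", None]
years = ["2년차", None, None, None]
print(nullity_correlation(dates, years))
```

A value near 1 means the two columns tend to be missing on the same rows, which is what a merged-cell layout like this schedule would be expected to produce.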

Sample

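Index | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | 연 번 | 일자별 | 년차 | 교육대상 | <NA> | 교육 인원 | 비고
1 |  | <NA> | <NA> | 시간별 | 동별 | 10525 | <NA>
2 | 1 | 2014. 3. 17(월) | 2년차 | 오전 | 광명1동 | 136 | <NA>
3 | 2 | <NA> | <NA> | 오후 | 하안3동 | 163 | <NA>
4 | 3 | 2014. 3. 18(화) | <NA> | 오전 | 광명3동 | 105 | <NA>
5 | 4 | <NA> | <NA> | 오후 | 광명4동 | 148 | <NA>
6 | 5 | 2014. 3. 19(수) | <NA> | 오전 | 광명5동 | 140 | <NA>
7 | 6 | <NA> | <NA> | 오후 | 광명6동 | 118 | <NA>
8 | 7 | 2014. 3. 20(목) | <NA> | 오전 | 광명7동 | 104 | <NA>
9 | 8 | <NA> | <NA> | 오후 | 광명7동 | 102 | <NA>

Index | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
79 | NaN | <NA> | <NA> | <NA> | 철산2동(110명) | NaN | <NA>
80 | 65 | <NA> | <NA> | 오후 | 철산3동(177명) | 193 | <NA>
81 | NaN | <NA> | <NA> | <NA> | 학온동(16명) | NaN | <NA>
82 | 66 | 2014.6. 19(목) | <NA> | 오전 | 철산4동 | 152 | <NA>
83 | 67 | <NA> | <NA> | 오후 | 하안1동 | 149 | <NA>
84 | 68 | 2014.6. 20(금) | <NA> | 오전 | 하안2동 | 172 | <NA>
85 | 69 | <NA> | <NA> | 오후 | 하안3동 | 142 | <NA>
86 | 70 | 2014. 6. 23(월) | 4년차 | 오전 | 하안4동 | 104 | <NA>
87 | 71 | <NA> | <NA> | 오후 | 소하1동 | 186 | <NA>
88 | 72 | 2014. 6. 24(화) | <NA> | 오전 | 소하2동 | 210 | <NA>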
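
In the sample rows, the date (Unnamed: 1) and year label (Unnamed: 2) appear only on the first row of each group, which is consistent with merged cells in the source spreadsheet. One plausible cleaning step, shown here as a sketch rather than a prescribed method, is to carry the last seen value forward. The helper name is hypothetical. In pandas the equivalent would be `Series.ffill()`.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value (leading None stays None)."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Date cells for sample rows 2 to 5; None marks the empty afternoon rows.
dates = ["2014. 3. 17(월)", None, "2014. 3. 18(화)", None]
print(forward_fill(dates))
```

Continuation rows such as 79 and 81, where only the district cell is filled, may need separate handling before a forward fill is meaningful.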

Duplicate rows

Most frequently occurring

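Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 6 | # duplicates
0 | <NA> | <NA> | 오후 | 광명4동 | <NA> | 3
1 | <NA> | <NA> | 오후 | 광명6동 | <NA> | 3
3 | <NA> | <NA> | 오후 | 소하1동 | <NA> | 3
2 | <NA> | <NA> | 오후 | 광명7동 | <NA> | 2
4 | <NA> | <NA> | 오후 | 철산3동 | <NA> | 2
5 | <NA> | <NA> | 오후 | 철산4동 | <NA> | 2
6 | <NA> | <NA> | 오후 | 하안1동 | <NA> | 2
7 | <NA> | <NA> | 오후 | 하안2동 | <NA> | 2
8 | <NA> | <NA> | 오후 | 하안3동 | <NA> | 2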
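
The duplicate table lists only five columns, so these rows appear to match once the two unsupported columns (Unnamed: 0 and Unnamed: 5) are left out. That is why afternoon rows for the same district, which have no date or year label of their own, collapse into one group. A small sketch of that grouping follows, using illustrative rows and a hypothetical helper name.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Rows restricted to Unnamed: 1, 2, 3, 4, 6; None marks a missing cell.
# Illustrative data: the 광명4동 row is repeated three times, as in the table above.
rows = [
    (None, None, "오후", "광명4동", None),
    ("2014. 3. 18(화)", None, "오전", "광명3동", None),
    (None, None, "오후", "광명4동", None),
    (None, None, "오후", "광명4동", None),
]
groups = duplicate_groups(rows)
print(groups)
```

If the date were forward-filled first, rows like these would most likely stop counting as duplicates, so the 10.1% figure probably reflects the sheet layout rather than repeated records.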