Overview

Dataset statistics

Number of variables: 7
Number of observations: 57
Missing cells: 155
Missing cells (%): 38.8%
Duplicate rows: 1
Duplicate rows (%): 1.8%
Total size in memory: 3.2 KiB
Average record size in memory: 58.3 B
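
The two percentages follow from the counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative, and the per-column missing counts are the ones listed under Alerts:

```python
# Counts copied from this report.
n_variables = 7
n_observations = 57
missing_cells = 155
duplicate_rows = 1

# Missing values per column, in column order, as listed under Alerts.
missing_per_column = [26, 8, 7, 26, 8, 24, 56]

total_cells = n_variables * n_observations            # 399 cells in total
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(sum(missing_per_column))     # 155
print(f"{missing_pct:.1f}%")       # 38.8%
print(f"{duplicate_pct:.1f}%")     # 1.8%
```

The per-column counts add up to the reported total, which is a quick consistency check on the summary.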

Variable types

Unsupported: 3
Text: 4

Dataset

Description: 안전한보행환경조성사업추진현황20147 [Safe Pedestrian Environment Development Project Progress Status 20147]
Author: 전라북도 [Jeollabuk-do]
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202076

Alerts

Unnamed: 6 has constant value "" | Constant
Dataset has 1 (1.8%) duplicate rows | Duplicates
2010~2014년 안전한 보행환경 조성사업 추진 현황 has 26 (45.6%) missing values | Missing
Unnamed: 1 has 8 (14.0%) missing values | Missing
Unnamed: 2 has 7 (12.3%) missing values | Missing
Unnamed: 3 has 26 (45.6%) missing values | Missing
Unnamed: 4 has 8 (14.0%) missing values | Missing
Unnamed: 5 has 24 (42.1%) missing values | Missing
Unnamed: 6 has 56 (98.2%) missing values | Missing
2010~2014년 안전한 보행환경 조성사업 추진 현황 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis | Unsupported

Reproduction

Analysis started: 2024-03-14 00:10:18.610210
Analysis finished: 2024-03-14 00:10:19.104009
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

2010~2014년 안전한 보행환경 조성사업 추진 현황
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 26
Missing (%): 45.6%
Memory size: 588.0 B

Unnamed: 1
Text

MISSING 

Distinct: 25
Distinct (%): 51.0%
Missing: 8
Missing (%): 14.0%
Memory size: 588.0 B

Length

Max length: 13
Median length: 11
Mean length: 10.020408
Min length: 3
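
A minimal sketch of how length statistics of this kind could be computed, assuming lengths are counted in Unicode code points and missing values are skipped. `length_summary` is an illustrative helper, not part of ydata-profiling, and the input is only the five sample values listed for this variable, not the full column:

```python
import statistics

def length_summary(values):
    """Return max, median, mean and min string length, skipping None (missing)."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }

# The five sample values of "Unnamed: 1" from this report, plus one missing entry.
sample = [
    "사업명",
    "안전한 보행환경 조성사업",
    "(순창 농암)",
    "안전한 보행환경 조성사업",
    "(부안 모산)",
    None,
]
summary = length_summary(sample)
print(summary["max"], summary["min"])   # 13 3
```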

Characters and Unicode

Total characters: 491
Distinct characters: 74
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
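
As a hedged illustration of that idea, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two-letter codes it returns correspond to the long names used in this report (`Lo` = Other Letter, `Zs` = Space Separator, `Ps` = Open Punctuation, `Pe` = Close Punctuation). `category_counts` is an illustrative helper, and the input is two sample values rather than the full column:

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two sample values of "Unnamed: 1" from this report.
counts = category_counts(["안전한 보행환경 조성사업", "(순창 농암)"])
print(counts["Lo"], counts["Zs"], counts["Ps"], counts["Pe"])   # 15 3 1 1
```

The standard library exposes the general category but not the script or block properties, so the script and block tables would need a separate data source.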

Unique

Unique: 23
Unique (%): 46.9%
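
The figures are consistent with "Distinct" counting the different non-missing values, "Unique" counting the values that occur exactly once, and both percentages being taken over non-missing rows. That reading is an assumption about the profiler's definitions, sketched below with an illustrative helper:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique, non_missing); None marks a missing value."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)                                   # different values seen
    unique = sum(1 for c in freq.values() if c == 1)       # values seen exactly once
    return distinct, unique, len(present)

# Check against the reported figures for "Unnamed: 1": 57 rows, 8 missing.
non_missing = 57 - 8
distinct_pct = 100 * 25 / non_missing
unique_pct = 100 * 23 / non_missing
print(f"{distinct_pct:.1f}%")   # 51.0%
print(f"{unique_pct:.1f}%")     # 46.9%
```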

Sample

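1st row: 사업명 [project name]
2nd row: 안전한 보행환경 조성사업 [safe pedestrian environment development project]
3rd row: (순창 농암) [Sunchang Nongam]
4th row: 안전한 보행환경 조성사업 [safe pedestrian environment development project]
5th row: (부안 모산) [Buan Mosan]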
Value | Count | Frequency (%)
안전한 [safe] | 24 | 20.0%
조성사업 [development project] | 24 | 20.0%
보행환경 [pedestrian environment] | 24 | 20.0%
익산 [Iksan] | 5 | 4.2%
완주 [Wanju] | 4 | 3.3%
정읍 [Jeongeup] | 2 | 1.7%
무주 [Muju] | 2 | 1.7%
김제 [Gimje] | 2 | 1.7%
순창 [Sunchang] | 2 | 1.7%
남원 [Namwon] | 2 | 1.7%
Other values (28) | 29 | 24.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 71 | 14.5%
 | 26 | 5.3%
 | 25 | 5.1%
 | 25 | 5.1%
 | 24 | 4.9%
 | 24 | 4.9%
 | 24 | 4.9%
 | 24 | 4.9%
 | 24 | 4.9%
 | 24 | 4.9%
Other values (64) | 200 | 40.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 368 | 74.9%
Space Separator | 71 | 14.5%
Open Punctuation | 24 | 4.9%
Close Punctuation | 24 | 4.9%
Dash Punctuation | 2 | 0.4%
Decimal Number | 2 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 26 | 7.1%
 | 25 | 6.8%
 | 25 | 6.8%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
Other values (58) | 124 | 33.7%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
2 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 71 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 24 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 24 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 368 | 74.9%
Common | 123 | 25.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 26 | 7.1%
 | 25 | 6.8%
 | 25 | 6.8%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
Other values (58) | 124 | 33.7%

Common
Value | Count | Frequency (%)
(space) | 71 | 57.7%
( | 24 | 19.5%
) | 24 | 19.5%
- | 2 | 1.6%
1 | 1 | 0.8%
2 | 1 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 368 | 74.9%
ASCII | 123 | 25.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 71 | 57.7%
( | 24 | 19.5%
) | 24 | 19.5%
- | 2 | 1.6%
1 | 1 | 0.8%
2 | 1 | 0.8%

Hangul
Value | Count | Frequency (%)
 | 26 | 7.1%
 | 25 | 6.8%
 | 25 | 6.8%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
 | 24 | 6.5%
Other values (58) | 124 | 33.7%

Unnamed: 2
Text

MISSING 

Distinct: 40
Distinct (%): 80.0%
Missing: 7
Missing (%): 12.3%
Memory size: 588.0 B

Length

Max length: 6
Median length: 5.5
Mean length: 5.1
Min length: 3

Characters and Unicode

Total characters: 255
Distinct characters: 62
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 66.0%

Sample

1st row: 위 치 [location]
2nd row: 25개소 [25 sites]
3rd row: 7개소 [7 sites]
4th row: 순창 복흥 [Sunchang Bokheung]
5th row: 부안 부안 [Buan Buan]
Value | Count | Frequency (%)
익산 [Iksan] | 7 | 9.2%
완주 [Wanju] | 4 | 5.3%
부안 [Buan] | 3 | 3.9%
지722 | 3 | 3.9%
낭산 [Nangsan] | 3 | 3.9%
금마 [Geumma] | 3 | 3.9%
무주 [Muju] | 2 | 2.6%
정읍 [Jeongeup] | 2 | 2.6%
남원 [Namwon] | 2 | 2.6%
김제 [Gimje] | 2 | 2.6%
Other values (40) | 45 | 59.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 26 | 10.2%
 | 19 | 7.5%
( | 18 | 7.1%
7 | 18 | 7.1%
) | 18 | 7.1%
 | 15 | 5.9%
2 | 9 | 3.5%
4 | 8 | 3.1%
 | 7 | 2.7%
1 | 7 | 2.7%
Other values (52) | 110 | 43.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 133 | 52.2%
Decimal Number | 60 | 23.5%
Space Separator | 26 | 10.2%
Open Punctuation | 18 | 7.1%
Close Punctuation | 18 | 7.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 19 | 14.3%
 | 15 | 11.3%
 | 7 | 5.3%
 | 7 | 5.3%
 | 6 | 4.5%
 | 6 | 4.5%
 | 4 | 3.0%
 | 4 | 3.0%
 | 3 | 2.3%
 | 3 | 2.3%
Other values (39) | 59 | 44.4%

Decimal Number
Value | Count | Frequency (%)
7 | 18 | 30.0%
2 | 9 | 15.0%
4 | 8 | 13.3%
1 | 7 | 11.7%
6 | 5 | 8.3%
5 | 4 | 6.7%
3 | 3 | 5.0%
9 | 2 | 3.3%
0 | 2 | 3.3%
8 | 2 | 3.3%

Space Separator
Value | Count | Frequency (%)
(space) | 26 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 18 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 18 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 133 | 52.2%
Common | 122 | 47.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 19 | 14.3%
 | 15 | 11.3%
 | 7 | 5.3%
 | 7 | 5.3%
 | 6 | 4.5%
 | 6 | 4.5%
 | 4 | 3.0%
 | 4 | 3.0%
 | 3 | 2.3%
 | 3 | 2.3%
Other values (39) | 59 | 44.4%

Common
Value | Count | Frequency (%)
(space) | 26 | 21.3%
( | 18 | 14.8%
7 | 18 | 14.8%
) | 18 | 14.8%
2 | 9 | 7.4%
4 | 8 | 6.6%
1 | 7 | 5.7%
6 | 5 | 4.1%
5 | 4 | 3.3%
3 | 3 | 2.5%
Other values (3) | 6 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 133 | 52.2%
ASCII | 122 | 47.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 26 | 21.3%
( | 18 | 14.8%
7 | 18 | 14.8%
) | 18 | 14.8%
2 | 9 | 7.4%
4 | 8 | 6.6%
1 | 7 | 5.7%
6 | 5 | 4.1%
5 | 4 | 3.3%
3 | 3 | 2.5%
Other values (3) | 6 | 4.9%

Hangul
Value | Count | Frequency (%)
 | 19 | 14.3%
 | 15 | 11.3%
 | 7 | 5.3%
 | 7 | 5.3%
 | 6 | 4.5%
 | 6 | 4.5%
 | 4 | 3.0%
 | 4 | 3.0%
 | 3 | 2.3%
 | 3 | 2.3%
Other values (39) | 59 | 44.4%

Unnamed: 3
Text

MISSING 

Distinct: 22
Distinct (%): 71.0%
Missing: 26
Missing (%): 45.6%
Memory size: 588.0 B

Length

Max length: 10
Median length: 5
Mean length: 5.4193548
Min length: 5

Characters and Unicode

Total characters: 168
Distinct characters: 21
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 61.3%

Sample

1st row: 사 업 량 (km) [project volume (km)]
2nd row: L=13.4
3rd row: L=4.3
4th row: L=0.73
5th row: L=0.78
Value | Count | Frequency (%)
l=0.4 | 5 | 14.7%
l=0.5 | 4 | 11.8%
l=0.3 | 3 | 8.8%
l=0.78 | 1 | 2.9%
l=0.73 | 1 | 2.9%
l=1.0 | 1 | 2.9%
l=0.2 | 1 | 2.9%
l=1.3 | 1 | 2.9%
l=0.8 | 1 | 2.9%
l=1.2 | 1 | 2.9%
Other values (15) | 15 | 44.1%
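
The values in this column pair a literal `L=` prefix with a decimal number, and the header row labels the column in km. A minimal sketch of how such strings might be converted to numbers during cleaning, assuming every data value follows the `L=<number>` pattern seen in the sample. `LENGTH_PATTERN` and `parse_length_km` are illustrative names, not part of the report or of ydata-profiling:

```python
import re

# Matches strings such as "L=0.73" or "l=13.4", allowing stray whitespace.
LENGTH_PATTERN = re.compile(r"^\s*L\s*=\s*(\d+(?:\.\d+)?)\s*$", re.IGNORECASE)

def parse_length_km(value):
    """Return the length in km from a string like 'L=0.73', or None otherwise."""
    if value is None:
        return None
    match = LENGTH_PATTERN.match(value)
    return float(match.group(1)) if match else None

# Header text, three sample values from this report, and one missing entry.
samples = ["사 업 량 (km)", "L=13.4", "L=4.3", "L=0.73", None]
parsed = [parse_length_km(v) for v in samples]
print(parsed)   # [None, 13.4, 4.3, 0.73, None]
```

Returning `None` for anything that does not match keeps the embedded header text and the missing cells out of later numeric summaries.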

Most occurring characters

Value | Count | Frequency (%)
L | 30 | 17.9%
= | 30 | 17.9%
. | 30 | 17.9%
0 | 24 | 14.3%
4 | 11 | 6.5%
3 | 10 | 6.0%
1 | 9 | 5.4%
5 | 5 | 3.0%
7 | 3 | 1.8%
8 | 3 | 1.8%
Other values (11) | 13 | 7.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 68 | 40.5%
Uppercase Letter | 30 | 17.9%
Math Symbol | 30 | 17.9%
Other Punctuation | 30 | 17.9%
Other Letter | 3 | 1.8%
Space Separator | 2 | 1.2%
Lowercase Letter | 2 | 1.2%
Control | 1 | 0.6%
Open Punctuation | 1 | 0.6%
Close Punctuation | 1 | 0.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 24 | 35.3%
4 | 11 | 16.2%
3 | 10 | 14.7%
1 | 9 | 13.2%
5 | 5 | 7.4%
7 | 3 | 4.4%
8 | 3 | 4.4%
2 | 2 | 2.9%
9 | 1 | 1.5%

Other Letter
Value | Count | Frequency (%)
사 | 1 | 33.3%
업 | 1 | 33.3%
량 | 1 | 33.3%

Lowercase Letter
Value | Count | Frequency (%)
k | 1 | 50.0%
m | 1 | 50.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 30 | 100.0%

Math Symbol
Value | Count | Frequency (%)
= | 30 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 30 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 133 | 79.2%
Latin | 32 | 19.0%
Hangul | 3 | 1.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
= | 30 | 22.6%
. | 30 | 22.6%
0 | 24 | 18.0%
4 | 11 | 8.3%
3 | 10 | 7.5%
1 | 9 | 6.8%
5 | 5 | 3.8%
7 | 3 | 2.3%
8 | 3 | 2.3%
(space) | 2 | 1.5%
Other values (5) | 6 | 4.5%

Latin
Value | Count | Frequency (%)
L | 30 | 93.8%
k | 1 | 3.1%
m | 1 | 3.1%

Hangul
Value | Count | Frequency (%)
사 | 1 | 33.3%
업 | 1 | 33.3%
량 | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 165 | 98.2%
Hangul | 3 | 1.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
L | 30 | 18.2%
= | 30 | 18.2%
. | 30 | 18.2%
0 | 24 | 14.5%
4 | 11 | 6.7%
3 | 10 | 6.1%
1 | 9 | 5.5%
5 | 5 | 3.0%
7 | 3 | 1.8%
8 | 3 | 1.8%
Other values (8) | 10 | 6.1%

Hangul
Value | Count | Frequency (%)
사 | 1 | 33.3%
업 | 1 | 33.3%
량 | 1 | 33.3%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 8
Missing (%): 14.0%
Memory size: 588.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 42.1%
Memory size: 588.0 B

Unnamed: 6
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 56
Missing (%): 98.2%
Memory size: 588.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 비고 [remarks]
Value | Count | Frequency (%)
비고 [remarks] | 1 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
비 | 1 | 50.0%
고 | 1 | 50.0%

Correlations

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
Unnamed: 1 | 1.000 | 0.844 | 1.000
Unnamed: 2 | 0.844 | 1.000 | 0.940
Unnamed: 3 | 1.000 | 0.940 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
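
A minimal sketch of nullity correlation, assuming it is the Pearson correlation between the presence indicators (1 = present, 0 = missing) of two columns. `nullity_correlation` is an illustrative helper, and the toy columns are hypothetical rather than taken from the dataset:

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    None marks a missing value. Returns None when either column is entirely
    present or entirely missing, because its indicator has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Toy columns (hypothetical values).
x = ["a", None, "b", None]
y = [1, None, 2, None]      # missing exactly where x is missing
z = [None, 3, None, 4]      # present exactly where x is missing

print(nullity_correlation(x, y))   # 1.0
print(nullity_correlation(x, z))   # -1.0
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one tends to be present where the other is missing. That is the pattern to expect in a sheet where each record is split across alternating rows.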

Sample

 | 2010~2014년 안전한 보행환경 조성사업 추진 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | NaN | <NA> | <NA> | <NA> | NaN | (2014.7.31일 기준) | <NA>
1 | 년도별 | 사업명 | 위 치 | 사 업 량 (km) | 사업기간 | 사업비 | 비고
2 | NaN | <NA> | <NA> | <NA> | NaN | (백만원) | <NA>
3 |  | <NA> | 25개소 | L=13.4 | NaN | 8790 | <NA>
4 | 2010 | <NA> | 7개소 | L=4.3 | NaN | 2800 | <NA>
5 |  | 안전한 보행환경 조성사업 | 순창 복흥 | L=0.73 | 2010.05~ | 190 | <NA>
6 | NaN | (순창 농암) | <NA> | <NA> | 2010.12 | NaN | <NA>
7 |  | 안전한 보행환경 조성사업 | 부안 부안 | L=0.78 | 2010.05~ | 430 | <NA>
8 | NaN | (부안 모산) | <NA> | <NA> | 2010.11 | NaN | <NA>
9 |  | 안전한 보행환경 조성사업 | 고창 부안 | L=0.51 | 2010.05~ | 340 | <NA>

 | 2010~2014년 안전한 보행환경 조성사업 추진 현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
47 | NaN | (익산 석천) | (지718) | <NA> | 2013.1 | NaN | <NA>
48 |  | 안전한 보행환경 조성사업 | 정읍 두지 | L=0.2 | 2013.01~ | 80 | <NA>
49 | NaN | (정읍 두지) | (지736) | <NA> | 2013.08 | NaN | <NA>
50 |  | 안전한 보행환경 조성사업 | 완주 고산 | L=0.3 | 2013.01~ | 290 | <NA>
51 | NaN | (완주 어우) | (지741) | <NA> | 2013.12 | NaN | <NA>
52 |  | 안전한 보행환경 조성사업 | 익산 금마 | L=0.4 | 2013.10~ | 90 | <NA>
53 | NaN | (익산 금마-1공구) | (지722) | <NA> | 2013.12 | NaN | <NA>
54 | 2014 | <NA> | 1개소 | L=0.3 | NaN | 90 | <NA>
55 |  | 안전한 보행환경 조성사업 | 익산 금마 | L=0.3 | 2014.01~ | 90 | <NA>
56 | NaN | (익산 금마-2공구) | (지722) | <NA> | 2014.07 | NaN | <NA>

Duplicate rows

Most frequently occurring

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 2