Overview

Dataset statistics

Number of variables: 10
Number of observations: 776
Missing cells: 6475
Missing cells (%): 83.4%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 60.8 KiB
Average record size in memory: 80.2 B
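
The missing-cell percentage can be cross-checked against the other figures in this report. The Python sketch below is illustrative only: `missing_cell_ratio` is a hypothetical helper, not a ydata-profiling function, and the per-column counts are copied from the Alerts section.

```python
def missing_cell_ratio(n_missing: int, n_rows: int, n_cols: int) -> float:
    """Return the share of missing cells in an n_rows x n_cols table."""
    total_cells = n_rows * n_cols
    if total_cells == 0:
        raise ValueError("table has no cells")
    return n_missing / total_cells


# Per-column missing counts as listed in the Alerts section of this report.
per_column_missing = [620, 624, 624, 484, 601, 743, 681, 761, 709, 628]
total_missing = sum(per_column_missing)

# 776 observations and 10 variables, as reported above.
ratio = missing_cell_ratio(total_missing, n_rows=776, n_cols=10)
print(total_missing, f"{ratio:.1%}")  # prints: 6475 83.4%
```

The per-column counts add up to the reported 6475 missing cells, and 6475 out of 7760 cells gives the reported 83.4%.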

Variable types

Text: 4
Unsupported: 6

Dataset

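Description: Provides a list of road occupancies within Cheongju City that were granted a road occupancy permit and involve road excavation, with the aim of minimizing damage during excavation work.
Author: Cheongju-si, Chungcheongbuk-do (충청북도 청주시)
URL: https://www.data.go.kr/data/15084287/fileData.do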

Alerts

Dataset has 1 (0.1%) duplicate rows [Duplicates]
도 로 굴 착 대 장 통 합 주 시 (2019.01.01부터) has 620 (79.9%) missing values [Missing]
Unnamed: 1 has 624 (80.4%) missing values [Missing]
Unnamed: 2 has 624 (80.4%) missing values [Missing]
Unnamed: 3 has 484 (62.4%) missing values [Missing]
Unnamed: 4 has 601 (77.4%) missing values [Missing]
Unnamed: 5 has 743 (95.7%) missing values [Missing]
Unnamed: 6 has 681 (87.8%) missing values [Missing]
Unnamed: 7 has 761 (98.1%) missing values [Missing]
Unnamed: 8 has 709 (91.4%) missing values [Missing]
Unnamed: 9 has 628 (80.9%) missing values [Missing]
Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 05:23:40.922004
Analysis finished: 2023-12-12 05:23:42.184281
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

도 로 굴 착 대 장 통 합 주 시 (2019.01.01부터)
Text

MISSING

Distinct: 156
Distinct (%): 100.0%
Missing: 620
Missing (%): 79.9%
Memory size: 6.2 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.3205128
Min length: 1
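
As a rough illustration of how such length statistics can be derived, the sketch below summarises string lengths while skipping missing entries. `length_summary` is a hypothetical helper, and the input is only the five sample rows shown later in this section plus one missing value, so its output does not match the full-column figures above.

```python
from statistics import mean, median


def length_summary(values):
    """Summarise string lengths, ignoring missing (None) entries."""
    lengths = [len(v) for v in values if v is not None]
    if not lengths:
        raise ValueError("no non-missing values")
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# First five sample rows of this variable, plus one missing entry.
sample = ["허가 번호", "1", "2", "3", "4", None]
summary = length_summary(sample)

# Consistency check on the report itself: total characters / non-missing values.
reported_mean = 362 / 156
```

The last line reproduces the reported mean length: 362 total characters over 156 non-missing values is about 2.3205128.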

Characters and Unicode

Total characters: 362
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
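
A minimal sketch of this kind of analysis, using Python's standard `unicodedata` module to count general categories per character. The helper `category_counts`, the `CATEGORY_NAMES` subset, and the input values are assumptions made for illustration; in particular, the line break inside the first value is a guess at why a "Control" character could appear in a column like this one.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (a subset chosen for this example).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Cc": "Control",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code when no long name is listed.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative input: a header-like value containing a line break, then two numbers.
counts = category_counts(["허가\n번호", "1", "2"])
```

Script and block names are not exposed by `unicodedata`, so this sketch covers only the category breakdown.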

Unique

Unique: 156
Unique (%): 100.0%

Sample

1st row: 허가 번호
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4
Value | Count | Frequency (%)
7 | 1 | 0.6%
98 | 1 | 0.6%
100 | 1 | 0.6%
101 | 1 | 0.6%
102 | 1 | 0.6%
103 | 1 | 0.6%
104 | 1 | 0.6%
105 | 1 | 0.6%
97 | 1 | 0.6%
108 | 1 | 0.6%
Other values (147) | 147 | 93.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 92 | 25.4%
2 | 36 | 9.9%
3 | 36 | 9.9%
4 | 36 | 9.9%
5 | 32 | 8.8%
7 | 25 | 6.9%
6 | 25 | 6.9%
8 | 25 | 6.9%
9 | 25 | 6.9%
0 | 25 | 6.9%
Other values (5) | 5 | 1.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 357 | 98.6%
Other Letter | 4 | 1.1%
Control | 1 | 0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 92
25.8%
2 36
 
10.1%
3 36
 
10.1%
4 36
 
10.1%
5 32
 
9.0%
7 25
 
7.0%
6 25
 
7.0%
8 25
 
7.0%
9 25
 
7.0%
0 25
 
7.0%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 358 | 98.9%
Hangul | 4 | 1.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 92
25.7%
2 36
 
10.1%
3 36
 
10.1%
4 36
 
10.1%
5 32
 
8.9%
7 25
 
7.0%
6 25
 
7.0%
8 25
 
7.0%
9 25
 
7.0%
0 25
 
7.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 358 | 98.9%
Hangul | 4 | 1.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 92
25.7%
2 36
 
10.1%
3 36
 
10.1%
4 36
 
10.1%
5 32
 
8.9%
7 25
 
7.0%
6 25
 
7.0%
8 25
 
7.0%
9 25
 
7.0%
0 25
 
7.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 1
Text

MISSING 

Distinct: 73
Distinct (%): 48.0%
Missing: 624
Missing (%): 80.4%
Memory size: 6.2 KiB

Length

Max length: 22
Median length: 19
Mean length: 8.2302632
Min length: 1

Characters and Unicode

Total characters: 1251
Distinct characters: 133
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 28.3%

Sample

1st row: 굴 착 목 적
2nd row: 전력관매설
3rd row: 우수관로 매설공사 (D300)
4th row: 통신주 (23본)
5th row: 오수관 매설공사
Value | Count | Frequency (%)
매설 | 24 | 8.7%
통신주 | 20 | 7.2%
매설공사 | 11 | 4.0%
도시가스 | 11 | 4.0%
1본 | 9 | 3.2%
오수관 | 8 | 2.9%
배관공사 | 7 | 2.5%
통신주설치 | 7 | 2.5%
지중관로매설 | 5 | 1.8%
배수관 | 5 | 1.8%
Other values (101) | 170 | 61.4%
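
The counts in this table total 277, more than the 152 non-missing values of the column, which suggests that it counts words within each value rather than whole values. A minimal sketch of such a word count follows, assuming whitespace splitting and stripping of surrounding ASCII punctuation; the exact tokenisation used by ydata-profiling may differ, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens, ignoring missing (None) values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for token in value.split():
            # Drop surrounding ASCII punctuation such as parentheses.
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


# Sample rows 2 to 5 of this variable, plus one missing entry.
sample = ["전력관매설", "우수관로 매설공사 (D300)", "통신주 (23본)", "오수관 매설공사", None]
counts = word_counts(sample)
```

On these four values the token 매설공사 is counted twice, while tokens such as D300 and 23본 lose their parentheses.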

Most occurring characters

ValueCountFrequency (%)
101
 
8.1%
85
 
6.8%
84
 
6.7%
61
 
4.9%
54
 
4.3%
50
 
4.0%
44
 
3.5%
44
 
3.5%
43
 
3.4%
41
 
3.3%
Other values (123) 644
51.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 973 | 77.8%
Space Separator | 101 | 8.1%
Decimal Number | 55 | 4.4%
Close Punctuation | 28 | 2.2%
Open Punctuation | 28 | 2.2%
Uppercase Letter | 27 | 2.2%
Control | 26 | 2.1%
Other Punctuation | 6 | 0.5%
Lowercase Letter | 6 | 0.5%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
85
 
8.7%
84
 
8.6%
61
 
6.3%
54
 
5.5%
50
 
5.1%
44
 
4.5%
44
 
4.5%
43
 
4.4%
41
 
4.2%
27
 
2.8%
Other values (100) 440
45.2%
Decimal Number
ValueCountFrequency (%)
0 17
30.9%
1 13
23.6%
3 10
18.2%
2 6
 
10.9%
5 5
 
9.1%
4 2
 
3.6%
6 2
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
C 10
37.0%
P 4
 
14.8%
A 4
 
14.8%
V 3
 
11.1%
T 3
 
11.1%
M 2
 
7.4%
D 1
 
3.7%
Lowercase Letter
ValueCountFrequency (%)
p 2
33.3%
a 2
33.3%
c 2
33.3%
Space Separator
ValueCountFrequency (%)
101
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Control
ValueCountFrequency (%)
26
100.0%
Other Punctuation
ValueCountFrequency (%)
, 6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 973 | 77.8%
Common | 245 | 19.6%
Latin | 33 | 2.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
85
 
8.7%
84
 
8.6%
61
 
6.3%
54
 
5.5%
50
 
5.1%
44
 
4.5%
44
 
4.5%
43
 
4.4%
41
 
4.2%
27
 
2.8%
Other values (100) 440
45.2%
Common
ValueCountFrequency (%)
101
41.2%
) 28
 
11.4%
( 28
 
11.4%
26
 
10.6%
0 17
 
6.9%
1 13
 
5.3%
3 10
 
4.1%
, 6
 
2.4%
2 6
 
2.4%
5 5
 
2.0%
Other values (3) 5
 
2.0%
Latin
ValueCountFrequency (%)
C 10
30.3%
P 4
 
12.1%
A 4
 
12.1%
V 3
 
9.1%
T 3
 
9.1%
M 2
 
6.1%
p 2
 
6.1%
a 2
 
6.1%
c 2
 
6.1%
D 1
 
3.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 973 | 77.8%
ASCII | 278 | 22.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
101
36.3%
) 28
 
10.1%
( 28
 
10.1%
26
 
9.4%
0 17
 
6.1%
1 13
 
4.7%
3 10
 
3.6%
C 10
 
3.6%
, 6
 
2.2%
2 6
 
2.2%
Other values (13) 33
 
11.9%
Hangul
ValueCountFrequency (%)
85
 
8.7%
84
 
8.6%
61
 
6.3%
54
 
5.5%
50
 
5.1%
44
 
4.5%
44
 
4.5%
43
 
4.4%
41
 
4.2%
27
 
2.8%
Other values (100) 440
45.2%

Unnamed: 2
Text

MISSING 

Distinct: 149
Distinct (%): 98.0%
Missing: 624
Missing (%): 80.4%
Memory size: 6.2 KiB

Length

Max length: 292
Median length: 103
Mean length: 31.427632
Min length: 1

Characters and Unicode

Total characters: 4777
Distinct characters: 108
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 147
Unique (%): 96.7%

Sample

1st row: 굴 착 장 소 건축주소(도로번지)
2nd row: 명암동 101-1, 101-14, 101-4, 183-2, 73-24
3rd row: 가덕 계산 194-2 앞(194-1, 193-2, 520)
4th row: 문의 남계 135-6번지 외 14필지
5th row: 용암동 1626번지 앞
Value | Count | Frequency (%)
 | 94 | 9.2%
용암동 | 57 | 5.6%
일원 | 27 | 2.6%
금천동 | 20 | 1.9%
탑동 | 20 | 1.9%
문의면 | 19 | 1.9%
용담동 | 18 | 1.8%
영운동 | 18 | 1.8%
수동 | 14 | 1.4%
미원면 | 14 | 1.4%
Other values (495) | 726 | 70.7%

Most occurring characters

ValueCountFrequency (%)
651
 
13.6%
1 391
 
8.2%
- 312
 
6.5%
2 270
 
5.7%
230
 
4.8%
225
 
4.7%
3 191
 
4.0%
4 159
 
3.3%
6 143
 
3.0%
5 137
 
2.9%
Other values (98) 2068
43.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1720 | 36.0%
Other Letter | 1674 | 35.0%
Space Separator | 651 | 13.6%
Dash Punctuation | 312 | 6.5%
Control | 230 | 4.8%
Open Punctuation | 73 | 1.5%
Close Punctuation | 71 | 1.5%
Other Punctuation | 39 | 0.8%
Math Symbol | 7 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
225
 
13.4%
109
 
6.5%
97
 
5.8%
95
 
5.7%
92
 
5.5%
80
 
4.8%
65
 
3.9%
64
 
3.8%
63
 
3.8%
50
 
3.0%
Other values (81) 734
43.8%
Decimal Number
ValueCountFrequency (%)
1 391
22.7%
2 270
15.7%
3 191
11.1%
4 159
9.2%
6 143
 
8.3%
5 137
 
8.0%
7 123
 
7.2%
0 113
 
6.6%
9 97
 
5.6%
8 96
 
5.6%
Space Separator
ValueCountFrequency (%)
651
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 312
100.0%
Control
ValueCountFrequency (%)
230
100.0%
Open Punctuation
ValueCountFrequency (%)
( 73
100.0%
Close Punctuation
ValueCountFrequency (%)
) 71
100.0%
Other Punctuation
ValueCountFrequency (%)
, 39
100.0%
Math Symbol
ValueCountFrequency (%)
~ 7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3103 | 65.0%
Hangul | 1674 | 35.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
225
 
13.4%
109
 
6.5%
97
 
5.8%
95
 
5.7%
92
 
5.5%
80
 
4.8%
65
 
3.9%
64
 
3.8%
63
 
3.8%
50
 
3.0%
Other values (81) 734
43.8%
Common
ValueCountFrequency (%)
651
21.0%
1 391
12.6%
- 312
10.1%
2 270
8.7%
230
 
7.4%
3 191
 
6.2%
4 159
 
5.1%
6 143
 
4.6%
5 137
 
4.4%
7 123
 
4.0%
Other values (7) 496
16.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3103 | 65.0%
Hangul | 1674 | 35.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
651
21.0%
1 391
12.6%
- 312
10.1%
2 270
8.7%
230
 
7.4%
3 191
 
6.2%
4 159
 
5.1%
6 143
 
4.6%
5 137
 
4.4%
7 123
 
4.0%
Other values (7) 496
16.0%
Hangul
ValueCountFrequency (%)
225
 
13.4%
109
 
6.5%
97
 
5.8%
95
 
5.7%
92
 
5.5%
80
 
4.8%
65
 
3.9%
64
 
3.8%
63
 
3.8%
50
 
3.0%
Other values (81) 734
43.8%

Unnamed: 3
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 484
Missing (%): 62.4%
Memory size: 6.2 KiB

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 601
Missing (%): 77.4%
Memory size: 6.2 KiB

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 743
Missing (%): 95.7%
Memory size: 6.2 KiB

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 681
Missing (%): 87.8%
Memory size: 6.2 KiB

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 761
Missing (%): 98.1%
Memory size: 6.2 KiB

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 709
Missing (%): 91.4%
Memory size: 6.2 KiB

Unnamed: 9
Text

MISSING 

Distinct: 83
Distinct (%): 56.1%
Missing: 628
Missing (%): 80.9%
Memory size: 6.2 KiB

Length

Max length: 47
Median length: 38
Mean length: 28.851351
Min length: 6

Characters and Unicode

Total characters: 4270
Distinct characters: 31
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 39.2%

Sample

1st row: 굴 착(점 용) 기 간
2nd row: 착공일~266
3rd row: 착공일~10
4th row: 착공일로부터~30일간/ 허가일~2028.12.31.
5th row: 착공일로부터~3일간/ 허가일~2028.12.31.
Value | Count | Frequency (%)
착공일로부터 | 71 | 18.3%
허가일로부터~2028.12.31 | 67 | 17.3%
허가일~2028.12.31 | 30 | 7.8%
 | 19 | 4.9%
착공일로부터~7일간 | 15 | 3.9%
30일간 | 11 | 2.8%
착공일로부터~30일간 | 9 | 2.3%
7일간 | 8 | 2.1%
2019 | 8 | 2.1%
2028.12.31 | 7 | 1.8%
Other values (79) | 142 | 36.7%

Most occurring characters

ValueCountFrequency (%)
. 511
 
12.0%
2 461
 
10.8%
386
 
9.0%
1 330
 
7.7%
228
 
5.3%
228
 
5.3%
228
 
5.3%
0 227
 
5.3%
~ 200
 
4.7%
3 173
 
4.1%
Other values (21) 1298
30.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1721 | 40.3%
Decimal Number | 1455 | 34.1%
Other Punctuation | 645 | 15.1%
Math Symbol | 200 | 4.7%
Space Separator | 143 | 3.3%
Control | 102 | 2.4%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
386
22.4%
228
13.2%
228
13.2%
228
13.2%
137
 
8.0%
137
 
8.0%
127
 
7.4%
126
 
7.3%
119
 
6.9%
1
 
0.1%
Other values (4) 4
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 461
31.7%
1 330
22.7%
0 227
15.6%
3 173
 
11.9%
8 136
 
9.3%
7 39
 
2.7%
9 35
 
2.4%
5 19
 
1.3%
6 19
 
1.3%
4 16
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 511
79.2%
/ 134
 
20.8%
Math Symbol
ValueCountFrequency (%)
~ 200
100.0%
Space Separator
ValueCountFrequency (%)
143
100.0%
Control
ValueCountFrequency (%)
102
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2549 | 59.7%
Hangul | 1721 | 40.3%

Most frequent character per script

Common
ValueCountFrequency (%)
. 511
20.0%
2 461
18.1%
1 330
12.9%
0 227
8.9%
~ 200
 
7.8%
3 173
 
6.8%
143
 
5.6%
8 136
 
5.3%
/ 134
 
5.3%
102
 
4.0%
Other values (7) 132
 
5.2%
Hangul
ValueCountFrequency (%)
386
22.4%
228
13.2%
228
13.2%
228
13.2%
137
 
8.0%
137
 
8.0%
127
 
7.4%
126
 
7.3%
119
 
6.9%
1
 
0.1%
Other values (4) 4
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2549 | 59.7%
Hangul | 1720 | 40.3%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 511
20.0%
2 461
18.1%
1 330
12.9%
0 227
8.9%
~ 200
 
7.8%
3 173
 
6.8%
143
 
5.6%
8 136
 
5.3%
/ 134
 
5.3%
102
 
4.0%
Other values (7) 132
 
5.2%
Hangul
ValueCountFrequency (%)
386
22.4%
228
13.3%
228
13.3%
228
13.3%
137
 
8.0%
137
 
8.0%
127
 
7.4%
126
 
7.3%
119
 
6.9%
1
 
0.1%
Other values (3) 3
 
0.2%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

Correlations

 | Unnamed: 1 | Unnamed: 9
Unnamed: 1 | 1.000 | 0.984
Unnamed: 9 | 0.984 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
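
One common way to compute a nullity correlation is the Pearson correlation between the missing/present indicators of two columns. The sketch below assumes that definition; `nullity_correlation` is a hypothetical helper, the toy columns are made up, and the report's own heatmap may be computed differently in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present/missing indicators of two columns."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    # 1.0 where a value is present, 0.0 where it is missing (None).
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is fully present or fully missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns: the first two are missing on the same rows, the third on the opposite rows.
purpose = ["a", None, None, "b"]
period = ["x", None, None, "y"]
other = [None, "p", "q", None]

same = nullity_correlation(purpose, period)      # 1.0: missing together
opposite = nullity_correlation(purpose, other)   # -1.0: one present when the other is missing
```

A value near 1 means two columns tend to be missing on the same rows, which is the pattern the sample rows of this dataset suggest for its sparsely filled text columns.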

Sample

도 로 굴 착 대 장 통 합 주 시 (2019.01.01부터)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9
0허가 번호굴 착 목 적굴 착 장 소 건축주소(도로번지)굴 착 및 복 구 규 모NaNNaNNaNNaNNaN굴 착(점 용) 기 간
1<NA><NA><NA>아스팔트콘크리트보도블럭투수콘비포장<NA>
2<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
31전력관매설명암동 101-1, 101-14, 101-4, 183-2, 73-2420289NaN113NaNNaN착공일~266
4<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
5<NA><NA><NA>20289NaN113NaNNaN<NA>
6<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
7<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
82우수관로 매설공사 (D300)가덕 계산 194-2 앞(194-1, 193-2, 520)97.5NaNNaNNaN1.5착공일~10
9<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
도 로 굴 착 대 장 통 합 주 시 (2019.01.01부터)Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4Unnamed: 5Unnamed: 6Unnamed: 7Unnamed: 8Unnamed: 9
766<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
767<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
768154통신주 및 통신관로매설탑동 276번지 (277-1번지 앞)10.810.8NaNNaNNaNNaN착공일로부터 2일간./ 허가일로부터~2028.12.31.
769<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
770<NA><NA><NA>10.810.8NaNNaNNaNNaN<NA>
771<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
772<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
773155하수관로매설영운동 127-2 (156-2번지 앞)4.84.8NaNNaNNaNNaN착공일로부터 3일간./ 허가일로부터~2028.12.31.
774<NA><NA><NA>NaNNaNNaNNaNNaNNaN<NA>
775<NA><NA><NA>4.84.8NaNNaNNaNNaN<NA>

Duplicate rows

Most frequently occurring

 | 도 로 굴 착 대 장 통 합 주 시 (2019.01.01부터) | Unnamed: 1 | Unnamed: 2 | Unnamed: 9 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 620
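
A table like this can be approximated by counting identical rows. The sketch below is a simplified illustration rather than the profiler's own logic: `duplicate_rows` is a hypothetical helper, and the toy rows stand in for the four text columns listed in the header.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy rows over four columns; None stands for a missing value.
rows = [
    ("1", "전력관매설", "명암동 101-1", "착공일~266"),
    (None, None, None, None),
    (None, None, None, None),
    ("2", "오수관 매설공사", "용암동 1626번지 앞", "착공일~10"),
    (None, None, None, None),
]
dupes = duplicate_rows(rows)
```

Here the all-missing row is the only repeated one, mirroring the single entry in the table above.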