Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 390.6 KiB
Average record size in memory: 40.0 B
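The two memory figures agree with each other: dividing the total size by the number of observations gives the average record size. A minimal check, assuming the report uses 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics in this report.
total_kib = 390.6
n_rows = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_rows

# Rounded to one decimal, as the report displays it.
print(f"{avg_record_bytes:.1f} B")
```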

Variable types

Text: 2
DateTime: 1
Categorical: 1

Dataset

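Description: Excavation permit information (road excavation management number, road excavation permit number, permit date, reference date) from the Urban Spatial Information Work Portal operated by Busan Metropolitan City.
URL: https://www.data.go.kr/data/15119848/fileData.do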

Alerts

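기준일자 (reference date) has constant value "2023-08-12" [Constant]
도로굴착관리번호 (road excavation management number) has unique values [Unique]
도로굴착허가번호 (road excavation permit number) has unique values [Unique]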
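These alerts follow from the distinct counts reported for each variable. The sketch below applies a simplified rule (one distinct value means Constant, as many distinct values as rows means Unique). The rule and the function name `alert_for` are assumptions for illustration, not the profiling library's exact logic.

```python
N_OBSERVATIONS = 10_000

# Distinct counts as reported per variable in this report.
distinct_counts = {
    "도로굴착관리번호": 10_000,
    "도로굴착허가번호": 10_000,
    "허가날짜": 1_744,
    "기준일자": 1,
}


def alert_for(distinct, n_rows):
    """Simplified rule (an assumption, not the library's exact logic)."""
    if distinct == 1:
        return "Constant"
    if distinct == n_rows:
        return "Unique"
    return None


alerts = {name: alert_for(d, N_OBSERVATIONS) for name, d in distinct_counts.items()}
for name, alert in alerts.items():
    print(name, alert)
```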

Reproduction

Analysis started: 2023-12-12 22:34:09.427130
Analysis finished: 2023-12-12 22:34:09.992080
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

도로굴착관리번호
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 180000
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
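As a rough illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the first sample row of this variable. The standard library exposes the general category but not the script or block, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

sample = "WRK001201901290024"  # 1st sample row of this variable

# unicodedata.category returns the two-letter general category code,
# e.g. "Lu" (Uppercase Letter) or "Nd" (Decimal Number).
category_counts = Counter(unicodedata.category(ch) for ch in sample)

print(dict(category_counts))
```

Three uppercase letters and fifteen digits per value match the two categories the report lists for this variable.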

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: WRK001201901290024
2nd row: WRK001202006260022
3rd row: WRK001201710270012
4th row: WRK001202307100002
5th row: WRK001201711010023
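The sample values all have 18 characters and appear to embed a calendar date. The sketch below splits a value under an assumed layout ("WRK", a 3-digit code, an 8-digit YYYYMMDD date, a 4-digit sequence). This layout is inferred from the samples only and is not documented in the source. The function name `split_management_number` is hypothetical.

```python
import re
from datetime import datetime

# Assumed layout (inferred from the samples, not documented in the source):
# "WRK" + 3-digit code + 8-digit date (YYYYMMDD) + 4-digit sequence = 18 chars.
PATTERN = re.compile(r"^(WRK)(\d{3})(\d{8})(\d{4})$")


def split_management_number(value):
    """Split a management number into its assumed parts, or return None."""
    match = PATTERN.match(value)
    if match is None:
        return None
    prefix, code, ymd, seq = match.groups()
    try:
        embedded = datetime.strptime(ymd, "%Y%m%d").date()
    except ValueError:
        return None  # the 8 digits are not a valid calendar date
    return {"prefix": prefix, "code": code, "date": embedded, "sequence": int(seq)}


parsed = split_management_number("WRK001201901290024")
print(parsed)
```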
Value | Count | Frequency (%)
wrk001201901290024 | 1 | < 0.1%
wrk001201806040010 | 1 | < 0.1%
wrk001201907160028 | 1 | < 0.1%
wrk001202209050006 | 1 | < 0.1%
wrk001202205020019 | 1 | < 0.1%
wrk001202103240034 | 1 | < 0.1%
wrk001202010260010 | 1 | < 0.1%
wrk001202111050009 | 1 | < 0.1%
wrk001201712110015 | 1 | < 0.1%
wrk001202301150003 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 68374 | 38.0%
1 | 29095 | 16.2%
2 | 25550 | 14.2%
W | 10000 | 5.6%
R | 10000 | 5.6%
K | 10000 | 5.6%
3 | 4900 | 2.7%
7 | 4486 | 2.5%
8 | 4274 | 2.4%
9 | 4117 | 2.3%
Other values (3) | 9204 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 150000 | 83.3%
Uppercase Letter | 30000 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 68374 | 45.6%
1 | 29095 | 19.4%
2 | 25550 | 17.0%
3 | 4900 | 3.3%
7 | 4486 | 3.0%
8 | 4274 | 2.8%
9 | 4117 | 2.7%
4 | 3244 | 2.2%
5 | 3021 | 2.0%
6 | 2939 | 2.0%
Uppercase Letter
Value | Count | Frequency (%)
W | 10000 | 33.3%
R | 10000 | 33.3%
K | 10000 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 150000 | 83.3%
Latin | 30000 | 16.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 68374 | 45.6%
1 | 29095 | 19.4%
2 | 25550 | 17.0%
3 | 4900 | 3.3%
7 | 4486 | 3.0%
8 | 4274 | 2.8%
9 | 4117 | 2.7%
4 | 3244 | 2.2%
5 | 3021 | 2.0%
6 | 2939 | 2.0%
Latin
Value | Count | Frequency (%)
W | 10000 | 33.3%
R | 10000 | 33.3%
K | 10000 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 180000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 68374 | 38.0%
1 | 29095 | 16.2%
2 | 25550 | 14.2%
W | 10000 | 5.6%
R | 10000 | 5.6%
K | 10000 | 5.6%
3 | 4900 | 2.7%
7 | 4486 | 2.5%
8 | 4274 | 2.4%
9 | 4117 | 2.3%
Other values (3) | 9204 | 5.1%

도로굴착허가번호
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 18
Mean length: 17.9313
Min length: 16

Characters and Unicode

Total characters: 179313
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
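For this mixed Hangul and ASCII variable, the category breakdown can be sketched the same way with Python's standard `unicodedata` module, using the first sample row. The standard library has no script lookup, so the Hangul test below relies on the character name. That is a stand-in chosen for this sketch, not how the report computes scripts.

```python
import unicodedata
from collections import Counter

sample = "연제구-2019-가스 -00023"  # 1st sample row of this variable

# Count the Unicode general category of every character.
# Codes seen here: "Lo" (Other Letter), "Nd" (Decimal Number),
# "Pd" (Dash Punctuation) and "Zs" (Space Separator).
category_counts = Counter(unicodedata.category(ch) for ch in sample)

# Rough stand-in for a script lookup (an assumption of this sketch):
# treat a character as Hangul when its Unicode name starts with "HANGUL".
hangul_chars = [ch for ch in sample if unicodedata.name(ch, "").startswith("HANGUL")]

print(dict(category_counts))
print(hangul_chars)
```

The four categories found in this single value are the same four the report lists for the whole variable.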

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 연제구-2019-가스 -00023
2nd row: 강서구-2020-상수 -00159
3rd row: 부산진구-2017-상수 -00180
4th row: 남구-2023-상수 -00112
5th row: 동구-2017-전기 -00017
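The sample permit numbers look like four hyphen-separated fields: district, year, utility type, and a 5-digit sequence, with a space before the last hyphen in these samples. The sketch below parses that assumed layout. The field names and the function `parse_permit_number` are hypothetical, and the whitespace is treated as optional as a precaution.

```python
import re

# Assumed layout (inferred from the samples, not documented in the source):
# <district>-<year>-<utility type>[optional whitespace]-<5-digit sequence>
PATTERN = re.compile(
    r"^(?P<district>[^-]+)-(?P<year>\d{4})-(?P<kind>[^-]+?)\s*-(?P<seq>\d{5})$"
)


def parse_permit_number(value):
    """Return the assumed fields of a permit number, or None if it does not fit."""
    match = PATTERN.match(value)
    if match is None:
        return None
    return {
        "district": match.group("district"),
        "year": int(match.group("year")),
        "kind": match.group("kind"),
        "sequence": int(match.group("seq")),
    }


result = parse_permit_number("연제구-2019-가스 -00023")
print(result)
```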
Value | Count | Frequency (%)
00004 | 165 | 0.8%
00001 | 163 | 0.8%
00002 | 149 | 0.7%
00006 | 148 | 0.7%
00003 | 138 | 0.7%
00007 | 133 | 0.7%
00005 | 130 | 0.7%
00014 | 125 | 0.6%
00009 | 121 | 0.6%
00008 | 117 | 0.6%
Other values (999) | 18565 | 93.0%

Most occurring characters

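Value | Count | Frequency (%)
0 | 41202 | 23.0%
- | 30000 | 16.7%
2 | 19504 | 10.9%
1 | 11447 | 6.4%
(space) | 9954 | 5.6%
(Hangul character, not preserved) | 9157 | 5.1%
(Hangul character, not preserved) | 4586 | 2.6%
(Hangul character, not preserved) | 4371 | 2.4%
(Hangul character, not preserved) | 3872 | 2.2%
(Hangul character, not preserved) | 3872 | 2.2%
Other values (39) | 41348 | 23.1%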

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90000 | 50.2%
Other Letter | 49359 | 27.5%
Dash Punctuation | 30000 | 16.7%
Space Separator | 9954 | 5.6%

Most frequent character per category

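Other Letter
Value | Count | Frequency (%)
(Hangul character, not preserved) | 9157 | 18.6%
(Hangul character, not preserved) | 4586 | 9.3%
(Hangul character, not preserved) | 4371 | 8.9%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 2638 | 5.3%
(Hangul character, not preserved) | 1319 | 2.7%
(Hangul character, not preserved) | 1255 | 2.5%
(Hangul character, not preserved) | 1048 | 2.1%
(Hangul character, not preserved) | 1023 | 2.1%
Other values (27) | 16218 | 32.9%
Decimal Number
Value | Count | Frequency (%)
0 | 41202 | 45.8%
2 | 19504 | 21.7%
1 | 11447 | 12.7%
7 | 3187 | 3.5%
8 | 3183 | 3.5%
9 | 3169 | 3.5%
3 | 2839 | 3.2%
4 | 1971 | 2.2%
5 | 1762 | 2.0%
6 | 1736 | 1.9%
Dash Punctuation
Value | Count | Frequency (%)
- | 30000 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 9954 | 100.0%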

Most occurring scripts

Value | Count | Frequency (%)
Common | 129954 | 72.5%
Hangul | 49359 | 27.5%

Most frequent character per script

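Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 9157 | 18.6%
(Hangul character, not preserved) | 4586 | 9.3%
(Hangul character, not preserved) | 4371 | 8.9%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 2638 | 5.3%
(Hangul character, not preserved) | 1319 | 2.7%
(Hangul character, not preserved) | 1255 | 2.5%
(Hangul character, not preserved) | 1048 | 2.1%
(Hangul character, not preserved) | 1023 | 2.1%
Other values (27) | 16218 | 32.9%
Common
Value | Count | Frequency (%)
0 | 41202 | 31.7%
- | 30000 | 23.1%
2 | 19504 | 15.0%
1 | 11447 | 8.8%
(space) | 9954 | 7.7%
7 | 3187 | 2.5%
8 | 3183 | 2.4%
9 | 3169 | 2.4%
3 | 2839 | 2.2%
4 | 1971 | 1.5%
Other values (2) | 3498 | 2.7%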

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 129954 | 72.5%
Hangul | 49359 | 27.5%

Most frequent character per block

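ASCII
Value | Count | Frequency (%)
0 | 41202 | 31.7%
- | 30000 | 23.1%
2 | 19504 | 15.0%
1 | 11447 | 8.8%
(space) | 9954 | 7.7%
7 | 3187 | 2.5%
8 | 3183 | 2.4%
9 | 3169 | 2.4%
3 | 2839 | 2.2%
4 | 1971 | 1.5%
Other values (2) | 3498 | 2.7%
Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 9157 | 18.6%
(Hangul character, not preserved) | 4586 | 9.3%
(Hangul character, not preserved) | 4371 | 8.9%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 3872 | 7.8%
(Hangul character, not preserved) | 2638 | 5.3%
(Hangul character, not preserved) | 1319 | 2.7%
(Hangul character, not preserved) | 1255 | 2.5%
(Hangul character, not preserved) | 1048 | 2.1%
(Hangul character, not preserved) | 1023 | 2.1%
Other values (27) | 16218 | 32.9%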

허가날짜
DateTime

Distinct: 1744
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2017-01-11 00:00:00
Maximum: 2023-08-11 00:00:00
Histogram with fixed size bins (bins=50)
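To give a sense of the bin size, the sketch below divides the reported minimum-to-maximum range into 50 equal-width bins. It assumes the bins span exactly that range. The edges used by the plotting library may differ slightly.

```python
from datetime import datetime

# Minimum and maximum as reported for this variable.
minimum = datetime(2017, 1, 11)
maximum = datetime(2023, 8, 11)
BINS = 50

span = maximum - minimum   # total range as a timedelta
bin_width = span / BINS    # equal-width bins (assumption: bins cover min..max)

# BINS + 1 edges delimit BINS bins.
edges = [minimum + i * bin_width for i in range(BINS + 1)]

print(span.days, "days in total")
print(bin_width, "per bin")
```

Under this assumption each bin covers roughly 48 days, so one bar in the histogram corresponds to about a month and a half of permits.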

기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2023-08-12: 10000

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-12
2nd row: 2023-08-12
3rd row: 2023-08-12
4th row: 2023-08-12
5th row: 2023-08-12

Common Values

Value | Count | Frequency (%)
2023-08-12 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-08-12 | 10000 | 100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 도로굴착관리번호 | 도로굴착허가번호 | 허가날짜 | 기준일자
14196 | WRK001201901290024 | 연제구-2019-가스 -00023 | 2019-02-01 | 2023-08-12
24102 | WRK001202006260022 | 강서구-2020-상수 -00159 | 2020-07-06 | 2023-08-12
5484 | WRK001201710270012 | 부산진구-2017-상수 -00180 | 2017-10-31 | 2023-08-12
42212 | WRK001202307100002 | 남구-2023-상수 -00112 | 2023-07-18 | 2023-08-12
5564 | WRK001201711010023 | 동구-2017-전기 -00017 | 2017-11-09 | 2023-08-12
19911 | WRK001201911110015 | 동래구-2019-통신 -00012 | 2019-12-23 | 2023-08-12
21303 | WRK001202001300023 | 부산진구-2020-가스 -00024 | 2020-02-03 | 2023-08-12
2470 | WRK001201706050032 | 중구-2017-가스 -00016 | 2017-06-07 | 2023-08-12
38290 | WRK001202209150010 | 중구-2022-상수 -00023 | 2022-09-20 | 2023-08-12
19841 | WRK001201911080007 | 수영구-2019-상수 -00172 | 2019-11-28 | 2023-08-12

(index) | 도로굴착관리번호 | 도로굴착허가번호 | 허가날짜 | 기준일자
13852 | WRK001201901080030 | 부산진구-2019-기타 -00002 | 2019-01-09 | 2023-08-12
38176 | WRK001202209060003 | 동구-2022-가스 -00049 | 2022-09-07 | 2023-08-12
14810 | WRK001201903080023 | 서구-2019-상수 -00040 | 2019-03-25 | 2023-08-12
6098 | WRK001201711210003 | 동래구-2017-전기 -00029 | 2017-11-30 | 2023-08-12
5515 | WRK001201710300009 | 동구-2017-가스 -00124 | 2017-11-03 | 2023-08-12
15894 | WRK001201905010005 | 중구-2019-상수 -00021 | 2019-05-02 | 2023-08-12
9276 | WRK001201805230020 | 사하구-2018-가스 -00098 | 2018-06-22 | 2023-08-12
36354 | WRK001202205240012 | 동래구-2022-가스 -00070 | 2022-06-10 | 2023-08-12
5812 | WRK001201711090036 | 북구-2017-상수 -00102 | 2017-11-20 | 2023-08-12
31338 | WRK001202107160013 | 강서구-2021-가스 -00051 | 2021-07-21 | 2023-08-12