Overview

Dataset statistics

Number of variables: 6
Number of observations: 453
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 22.2 KiB
Average record size in memory: 50.3 B

Variable types

Categorical: 4
Text: 1
Numeric: 1

Dataset

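Description: 부산광역시_도로굴착포장재질변경정보_20230812 (Busan Metropolitan City, road excavation pavement material change information, 2023-08-12)
Author: 부산광역시 (Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15119840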

Alerts

기준일자 has constant value "2023-08-12" [Constant]
도로굴착관리번호 is highly overall correlated with 변경일련번호 and 1 other field (포장종류) [High correlation]
변경일련번호 is highly overall correlated with 도로굴착관리번호 [High correlation]
포장종류 is highly overall correlated with 도로굴착관리번호 [High correlation]
도로굴착관리번호 is highly imbalanced (56.6%) [Imbalance]
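
The 56.6% imbalance figure for 도로굴착관리번호 can be reproduced from the value counts listed for that variable in this report. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy, log2(k) for k classes. That formula is an assumption about how the profiler defines the score; the report itself does not state it, and the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    Assumed definition (not stated in the report): 0 means the classes
    are perfectly balanced, values approaching 1 mean a few classes
    dominate.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts of 도로굴착관리번호, copied from its Common Values table.
counts = [244, 202, 3, 2, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Two values account for 446 of the 453 rows, so the entropy is far below the maximum for six classes, which is what drives the score above one half.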

Reproduction

Analysis started: 2023-12-10 16:15:31.640919
Analysis finished: 2023-12-10 16:15:32.228561
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

도로굴착관리번호 (road excavation management number)
Categorical

Alerts: High correlation, Imbalance

Distinct: 6
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Most frequent values (count):
WRK001201807040021  244
WRK001201712270009  202
WRK001202001310018  3
WRK002202107070007  2
WRK001201704210016  1

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: WRK001201704210016
2nd row: WRK001201712270009
3rd row: WRK001201712270009
4th row: WRK001201712270009
5th row: WRK001201712270009

Common Values

Value               Count  Frequency (%)
WRK001201807040021  244    53.9%
WRK001201712270009  202    44.6%
WRK001202001310018  3      0.7%
WRK002202107070007  2      0.4%
WRK001201704210016  1      0.2%
WRK001202205030030  1      0.2%
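
The report lists both a Distinct count (6) and a Unique count (2) for this variable. The numbers are consistent with reading "distinct" as the number of different values and "unique" as the number of values that occur exactly once. The sketch below checks that reading against the Common Values table; the interpretation is an assumption, since the report does not define the two terms.

```python
from collections import Counter

# Value counts of 도로굴착관리번호, copied from the Common Values table.
value_counts = Counter({
    "WRK001201807040021": 244,
    "WRK001201712270009": 202,
    "WRK001202001310018": 3,
    "WRK002202107070007": 2,
    "WRK001201704210016": 1,
    "WRK001202205030030": 1,
})

n_rows = sum(value_counts.values())

# Distinct: how many different values appear at all.
distinct = len(value_counts)

# Unique (assumed meaning): how many values appear in exactly one row.
unique = sum(1 for count in value_counts.values() if count == 1)

print(f"Rows: {n_rows}")
print(f"Distinct: {distinct} ({distinct / n_rows:.1%})")
print(f"Unique: {unique} ({unique / n_rows:.1%})")
```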

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

굴착위치관리번호 (excavation location management number)
Text

Distinct: 123
Distinct (%): 27.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 8154
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
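
As a minimal sketch of how such per-character properties can be read, the example below counts Unicode general categories with Python's standard `unicodedata` module for three sample values of this column. The category codes `Lu` and `Nd` correspond to the "Uppercase Letter" and "Decimal Number" labels used in this report. The standard library does not expose the script or block properties, so only categories are shown, and the report does not say which library the profiler itself uses.

```python
import unicodedata
from collections import Counter

# Sample values of 굴착위치관리번호, copied from the Sample rows in this report.
samples = [
    "WRK001201704210038",
    "WRK001201712220007",
    "WRK001201712220008",
]

# Count the Unicode general category of every character in every sample.
category_counts = Counter()
for value in samples:
    for ch in value:
        category_counts[unicodedata.category(ch)] += 1

# Lu = Uppercase Letter, Nd = Decimal Number
for category, count in category_counts.most_common():
    print(category, count)
```

Each 18-character identifier contributes 3 uppercase letters and 15 digits, which matches the 16.7% / 83.3% split reported for the full column.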

Unique

Unique: 6
Unique (%): 1.3%

Sample

1st row: WRK001201704210038
2nd row: WRK001201712220007
3rd row: WRK001201712220007
4th row: WRK001201712220008
5th row: WRK001201712220008

Words

Value               Count  Frequency (%)
wrk001201712220048  14     3.1%
wrk001201712220028  14     3.1%
wrk001201712220046  13     2.9%
wrk001201807030043  12     2.6%
wrk001201712220018  10     2.2%
wrk001201712220030  10     2.2%
wrk001201712220029  10     2.2%
wrk001201712220024  9      2.0%
wrk001201712220015  9      2.0%
wrk001201712220032  8      1.8%
Other values (113)  344    75.9%

Most occurring characters

Value             Count  Frequency (%)
0                 2810   34.5%
1                 1202   14.7%
2                 1173   14.4%
7                 525    6.4%
W                 453    5.6%
R                 453    5.6%
K                 453    5.6%
3                 388    4.8%
8                 338    4.1%
4                 138    1.7%
Other values (3)  221    2.7%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    6795   83.3%
Uppercase Letter  1359   16.7%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      2810   41.4%
1      1202   17.7%
2      1173   17.3%
7      525    7.7%
3      388    5.7%
8      338    5.0%
4      138    2.0%
6      77     1.1%
9      77     1.1%
5      67     1.0%

Uppercase Letter

Value  Count  Frequency (%)
W      453    33.3%
R      453    33.3%
K      453    33.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  6795   83.3%
Latin   1359   16.7%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      2810   41.4%
1      1202   17.7%
2      1173   17.3%
7      525    7.7%
3      388    5.7%
8      338    5.0%
4      138    2.0%
6      77     1.1%
9      77     1.1%
5      67     1.0%

Latin

Value  Count  Frequency (%)
W      453    33.3%
R      453    33.3%
K      453    33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8154   100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 2810   34.5%
1                 1202   14.7%
2                 1173   14.4%
7                 525    6.4%
W                 453    5.6%
R                 453    5.6%
K                 453    5.6%
3                 388    4.8%
8                 338    4.1%
4                 138    1.7%
Other values (3)  221    2.7%

포장내역일련번호 (pavement detail serial number)
Real number (ℝ)

Distinct: 14
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8278146
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 9
Maximum: 14
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.6518359
Coefficient of variation (CV): 0.93776866
Kurtosis: 3.391391
Mean: 2.8278146
Median Absolute Deviation (MAD): 1
Skewness: 1.8918665
Sum: 1281
Variance: 7.0322335
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=14)]

Common values

Value             Count  Frequency (%)
1                 205    45.3%
2                 85     18.8%
3                 47     10.4%
4                 32     7.1%
5                 19     4.2%
6                 17     3.8%
7                 12     2.6%
8                 11     2.4%
9                 8      1.8%
10                6      1.3%
Other values (4)  11     2.4%

Minimum 10 values

Value  Count  Frequency (%)
1      205    45.3%
2      85     18.8%
3      47     10.4%
4      32     7.1%
5      19     4.2%
6      17     3.8%
7      12     2.6%
8      11     2.4%
9      8      1.8%
10     6      1.3%

Maximum 10 values

Value  Count  Frequency (%)
14     2      0.4%
13     3      0.7%
12     3      0.7%
11     3      0.7%
10     6      1.3%
9      8      1.8%
8      11     2.4%
7      12     2.6%
6      17     3.8%
5      19     4.2%
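
The descriptive statistics reported for 포장내역일련번호 can be cross-checked from its frequency tables alone. The sketch below expands the value counts back into 453 observations and recomputes the sum, mean, median, variance and coefficient of variation with Python's `statistics` module. It assumes the reported variance uses the sample (n - 1) denominator, which is what `statistics.variance` computes; the report does not state which denominator it uses.

```python
import statistics

# Value counts of 포장내역일련번호, copied from the frequency tables in this report.
value_counts = {
    1: 205, 2: 85, 3: 47, 4: 32, 5: 19, 6: 17, 7: 12,
    8: 11, 9: 8, 10: 6, 11: 3, 12: 3, 13: 3, 14: 2,
}

# Expand the counts back into one entry per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

print("Observations:", len(values))
print("Sum:", total)
print("Mean:", round(mean, 7))
print("Median:", median)
print("Variance:", round(variance, 7))
print("Standard deviation:", round(std, 7))
print("CV:", round(cv, 8))
```

The recomputed figures agree with the values in the Descriptive statistics block to the precision shown there, which supports the sample-variance assumption.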

변경일련번호 (change serial number)
Categorical

Alerts: High correlation

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Most frequent values (count):
1  331
2  122

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      331    73.1%
2      122    26.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

포장종류 (pavement type)
Categorical

Alerts: High correlation

Distinct: 5
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Most frequent values (count):
아스팔트콘크리트  172
콘크리트  152
아스팔트  117
보도(블록류)  11
보도(기타)  1

Length

Max length: 8
Median length: 4
Mean length: 5.5960265
Min length: 4

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 보도(기타)
2nd row: 콘크리트
3rd row: 콘크리트
4th row: 콘크리트
5th row: 콘크리트

Common Values

Value (English gloss)  Count  Frequency (%)
아스팔트콘크리트 (asphalt concrete)  172  38.0%
콘크리트 (concrete)  152  33.6%
아스팔트 (asphalt)  117  25.8%
보도(블록류) (sidewalk, block type)  11  2.4%
보도(기타) (sidewalk, other)  1  0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
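
The length statistics for 포장종류 follow directly from its value counts. The sketch below recomputes them, assuming length is measured in Unicode code points (so each precomposed Hangul syllable and each parenthesis counts as one character); the report does not state the unit, but the results match under that assumption.

```python
import statistics

# Value counts of 포장종류, copied from the Common Values table in this report.
value_counts = {
    "아스팔트콘크리트": 172,
    "콘크리트": 152,
    "아스팔트": 117,
    "보도(블록류)": 11,
    "보도(기타)": 1,
}

# One length per row; len() counts code points, not bytes.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)

print("Max length:", max_length)
print("Median length:", median_length)
print("Mean length:", round(mean_length, 7))
print("Min length:", min_length)
```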

기준일자 (reference date)
Categorical

Alerts: Constant

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Most frequent values (count):
2023-08-12  453

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-12
2nd row: 2023-08-12
3rd row: 2023-08-12
4th row: 2023-08-12
5th row: 2023-08-12

Common Values

Value       Count  Frequency (%)
2023-08-12  453    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap 1]

                  도로굴착관리번호  포장내역일련번호  변경일련번호  포장종류
도로굴착관리번호  1.000  0.400  0.747  0.731
포장내역일련번호  0.400  1.000  0.374  0.216
변경일련번호      0.747  0.374  1.000  0.172
포장종류          0.731  0.216  0.172  1.000

[Figure: correlation heatmap 2]

                  도로굴착관리번호  포장종류  변경일련번호
도로굴착관리번호  1.000  0.600  0.553
포장종류          0.600  1.000  0.210
변경일련번호      0.553  0.210  1.000

[Figure: correlation heatmap 3]

                  포장내역일련번호  도로굴착관리번호  변경일련번호  포장종류
포장내역일련번호  1.000  0.223  0.284  0.091
도로굴착관리번호  0.223  1.000  0.553  0.600
변경일련번호      0.284  0.553  1.000  0.210
포장종류          0.091  0.600  0.210  1.000
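
The three High correlation alerts in the Alerts section are consistent with flagging every off-diagonal entry of the first correlation matrix that exceeds a cut-off. The sketch below assumes a threshold of 0.5; that value is an assumption chosen because it separates the flagged pairs (0.747, 0.731) from the unflagged ones (0.400 and below), and the report does not state the threshold actually used.

```python
from itertools import combinations

# First correlation matrix from this report (full symmetric matrix).
names = ["도로굴착관리번호", "포장내역일련번호", "변경일련번호", "포장종류"]
matrix = [
    [1.000, 0.400, 0.747, 0.731],
    [0.400, 1.000, 0.374, 0.216],
    [0.747, 0.374, 1.000, 0.172],
    [0.731, 0.216, 0.172, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Visit each unordered pair once and keep those above the threshold.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) > THRESHOLD
]

for first, second, value in high_pairs:
    print(f"{first} ~ {second}: {value:.3f}")
```

Under this assumption only the two pairs involving 도로굴착관리번호 are flagged, which matches the alerts: 도로굴착관리번호 with 변경일련번호 and with 포장종류.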

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     도로굴착관리번호    굴착위치관리번호    포장내역일련번호  변경일련번호  포장종류  기준일자
0    WRK001201704210016  WRK001201704210038  1  1  보도(기타)  2023-08-12
1    WRK001201712270009  WRK001201712220007  1  1  콘크리트  2023-08-12
2    WRK001201712270009  WRK001201712220007  2  1  콘크리트  2023-08-12
3    WRK001201712270009  WRK001201712220008  1  1  콘크리트  2023-08-12
4    WRK001201712270009  WRK001201712220008  2  1  콘크리트  2023-08-12
5    WRK001201712270009  WRK001201712220008  3  1  콘크리트  2023-08-12
6    WRK001201712270009  WRK001201712220009  1  1  콘크리트  2023-08-12
7    WRK001201712270009  WRK001201712220010  1  1  콘크리트  2023-08-12
8    WRK001201712270009  WRK001201712220010  2  1  콘크리트  2023-08-12
9    WRK001201712270009  WRK001201712220011  1  1  콘크리트  2023-08-12

Last rows

     도로굴착관리번호    굴착위치관리번호    포장내역일련번호  변경일련번호  포장종류  기준일자
443  WRK001201807040021  WRK001201807030106  1  1  콘크리트  2023-08-12
444  WRK001201807040021  WRK001201807030106  1  2  콘크리트  2023-08-12
445  WRK001201807040021  WRK001201807030107  1  1  아스팔트  2023-08-12
446  WRK001201807040021  WRK001201807030107  1  2  아스팔트  2023-08-12
447  WRK001202001310018  WRK001202001310023  1  1  보도(블록류)  2023-08-12
448  WRK001202001310018  WRK001202001310023  2  1  보도(블록류)  2023-08-12
449  WRK001202001310018  WRK001202001310023  3  1  보도(블록류)  2023-08-12
450  WRK002202107070007  WRK001202107070039  1  1  아스팔트  2023-08-12
451  WRK002202107070007  WRK001202107070039  2  1  아스팔트  2023-08-12
452  WRK001202205030030  WRK001202204200014  1  1  아스팔트  2023-08-12