Overview

Dataset statistics

Number of variables: 6
Number of observations: 41
Missing cells: 1
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 53.2 B
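
The missing-cell percentage follows from the counts above: one missing cell out of 6 × 41 cells. A minimal sketch of that arithmetic, with illustrative variable names that are not taken from the profiler:

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 41
missing_cells = 1

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```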

Variable types

Numeric: 2
Categorical: 3
Text: 1

Alerts

집계년도 is highly overall correlated with 사업기간 (High correlation)
예산액(백만원) is highly overall correlated with 사업명 (High correlation)
구분명 is highly overall correlated with 사업명 and 1 other field (High correlation)
사업명 is highly overall correlated with 예산액(백만원) and 1 other field (High correlation)
사업기간 is highly overall correlated with 집계년도 and 1 other field (High correlation)
예산액(백만원) has 1 (2.4%) missing value (Missing)

Reproduction

Analysis started: 2024-03-12 23:14:39.610365
Analysis finished: 2024-03-12 23:14:40.469862
Duration: 0.86 seconds
Software version: ydata-profiling v4.5.1
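
The reported duration is the difference between the two analysis timestamps. A short sketch using only the standard library, with the timestamp strings copied from the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-12 23:14:39.610365")
finished = datetime.fromisoformat("2024-03-12 23:14:40.469862")

duration = finished - started
print(f"{duration.total_seconds():.2f} seconds")
```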

Variables

집계년도 (aggregation year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.3902
Minimum: 2015
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 2015
5th percentile: 2015
Q1: 2016
Median: 2018
Q3: 2021
95th percentile: 2023
Maximum: 2023
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.6634381
Coefficient of variation (CV): 0.0013195853
Kurtosis: -1.2675661
Mean: 2018.3902
Median Absolute Deviation (MAD): 2
Skewness: 0.26618592
Sum: 82754
Variance: 7.0939024
Monotonicity: Decreasing
Histogram with fixed size bins (bins=9)
Common values
Value | Count | Frequency (%)
2016 | 7 | 17.1%
2015 | 7 | 17.1%
2022 | 4 | 9.8%
2021 | 4 | 9.8%
2020 | 4 | 9.8%
2019 | 4 | 9.8%
2018 | 4 | 9.8%
2017 | 4 | 9.8%
2023 | 3 | 7.3%

Minimum values
Value | Count | Frequency (%)
2015 | 7 | 17.1%
2016 | 7 | 17.1%
2017 | 4 | 9.8%
2018 | 4 | 9.8%
2019 | 4 | 9.8%
2020 | 4 | 9.8%
2021 | 4 | 9.8%
2022 | 4 | 9.8%
2023 | 3 | 7.3%

Maximum values
Value | Count | Frequency (%)
2023 | 3 | 7.3%
2022 | 4 | 9.8%
2021 | 4 | 9.8%
2020 | 4 | 9.8%
2019 | 4 | 9.8%
2018 | 4 | 9.8%
2017 | 4 | 9.8%
2016 | 7 | 17.1%
2015 | 7 | 17.1%
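
The summary statistics for 집계년도 can be cross-checked against its value counts. The sketch below rebuilds the 41 observations from the counts and recomputes the mean, variance, standard deviation and coefficient of variation. It assumes the profiler reports the sample (n - 1) variance, which is consistent with the figures in this report but is not stated in it.

```python
import statistics

# Value counts copied from the frequency table for 집계년도.
year_counts = {2015: 7, 2016: 7, 2017: 4, 2018: 4, 2019: 4,
               2020: 4, 2021: 4, 2022: 4, 2023: 3}

# Expand the counts back into one value per observation.
years = [year for year, n in sorted(year_counts.items()) for _ in range(n)]

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)           # sample standard deviation
cv = std / mean                         # coefficient of variation

print(len(years), sum(years), statistics.median(years))
print(round(mean, 4), round(variance, 7), round(std, 7), round(cv, 10))
```

The coefficient of variation is tiny only because calendar years sit far from zero; the spread of about 2.7 years is the more informative figure here.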

구분명 (category name)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 11
Median length: 10
Mean length: 9.4146341
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도로 포장 유지관리
2nd row: 도로 포장 유지관리
3rd row: 도로 포장 유지관리
4th row: 도로 포장 유지관리
5th row: 도로 포장 유지관리

Common Values

Value | Count | Frequency (%)
도로 포장 유지관리 | 31 | 75.6%
지방도유지관리 | 4 | 9.8%
도로 구조물 유지관리 | 2 | 4.9%
터널위탁관리 | 2 | 4.9%
구조물유지관리 | 2 | 4.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of common values; image not preserved]

Words

Value | Count | Frequency (%)
도로 | 33 | 30.8%
유지관리 | 33 | 30.8%
포장 | 31 | 29.0%
지방도유지관리 | 4 | 3.7%
구조물 | 2 | 1.9%
터널위탁관리 | 2 | 1.9%
구조물유지관리 | 2 | 1.9%

사업명 (project name)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 19.5%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 15
Median length: 12
Mean length: 5.6097561
Min length: 4

Unique

Unique: 1
Unique (%): 2.4%

Sample

1st row: 시설물보수
2nd row: 포장도 보수(차선도색 포함)
3rd row: 상시보수
4th row: 포장도 보수
5th row: 차선도색

Common Values

Value | Count | Frequency (%)
시설물보수 | 9 | 22.0%
상시보수 | 9 | 22.0%
포장도 보수 | 8 | 19.5%
차선도색 | 8 | 19.5%
교량 보수,보강 | 2 | 4.9%
문수산터널 위탁관리용역 | 2 | 4.9%
교량 안전점검 | 2 | 4.9%
포장도 보수(차선도색 포함) | 1 | 2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of common values; image not preserved]

Words

Value | Count | Frequency (%)
시설물보수 | 9 | 15.8%
상시보수 | 9 | 15.8%
포장도 | 9 | 15.8%
보수 | 8 | 14.0%
차선도색 | 8 | 14.0%
교량 | 4 | 7.0%
보수,보강 | 2 | 3.5%
문수산터널 | 2 | 3.5%
위탁관리용역 | 2 | 3.5%
안전점검 | 2 | 3.5%
Other values (2) | 2 | 3.5%

사업기간 (project period)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 26.8%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Unique

Unique: 2
Unique (%): 4.9%

Sample

1st row: 202301~202312
2nd row: 202301~202312
3rd row: 202301~202312
4th row: 202201~202212
5th row: 202201~202212
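
Every 사업기간 value is a 13-character string of the form YYYYMM~YYYYMM, so it can be split into a start month and an end month. The sketch below is one way to do that; the helper names `parse_period` and `months_covered` are hypothetical and not part of the dataset or the profiler.

```python
from datetime import date
from typing import Tuple


def _to_month(yyyymm: str) -> date:
    """Turn a 'YYYYMM' string into the first day of that month."""
    return date(int(yyyymm[:4]), int(yyyymm[4:6]), 1)


def parse_period(period: str) -> Tuple[date, date]:
    """Split 'YYYYMM~YYYYMM' into (start month, end month)."""
    start, end = period.split("~")
    return _to_month(start), _to_month(end)


def months_covered(period: str) -> int:
    """Count the calendar months spanned, including both endpoints."""
    start, end = parse_period(period)
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


for value in ["202301~202312", "201602~201612", "201502~201602"]:
    print(value, parse_period(value), months_covered(value))
```

Parsing the column this way would also make the alert about its correlation with 집계년도 easy to inspect, since the start year can be compared directly with the aggregation year.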

Common Values

Value | Count | Frequency (%)
201601~201612 | 6 | 14.6%
201501~201512 | 6 | 14.6%
202201~202212 | 4 | 9.8%
202101~202112 | 4 | 9.8%
202001~202012 | 4 | 9.8%
201901~201912 | 4 | 9.8%
201801~201812 | 4 | 9.8%
201701~201712 | 4 | 9.8%
202301~202312 | 3 | 7.3%
201602~201612 | 1 | 2.4%

Length

Histogram of lengths of the category

사업량(km) (work quantity, km)
Text

Distinct: 23
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 21
Median length: 20
Mean length: 9.9512195
Min length: 4

Characters and Unicode

Total characters: 408
Distinct characters: 50
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 41.5%

Sample

1st row: 가드레일, 안전휀스 등
2nd row: L=109.1km
3rd row: 배수로, 도로법면 복구 등
4th row: L=100km
5th row: L=360km
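
The sample shows that this column mixes free-text descriptions with length entries such as L=109.1km. A minimal sketch for pulling the numeric length out of the entries that follow that shape; the pattern and the helper name `length_km` are assumptions for illustration, and other formats in the column (for example counts of sites or tunnel lengths in metres) are deliberately left as None.

```python
import re
from typing import Optional

# Matches values such as "L=109.1km"; anything else is treated as free text.
LENGTH_PATTERN = re.compile(r"^L=(\d+(?:\.\d+)?)km$")


def length_km(value: str) -> Optional[float]:
    """Return the length in kilometres, or None if the value is not 'L=<number>km'."""
    match = LENGTH_PATTERN.match(value.strip())
    return float(match.group(1)) if match else None


samples = ["가드레일, 안전휀스 등", "L=109.1km", "배수로, 도로법면 복구 등", "L=100km", "L=360km"]
print([length_km(s) for s in samples])
```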

Words

Value | Count | Frequency (%)
(value not preserved) | 17 | 18.5%
가드레일 | 8 | 8.7%
배수로 | 8 | 8.7%
도로법면 | 8 | 8.7%
복구 | 8 | 8.7%
안전휀스 | 8 | 8.7%
l=100km | 2 | 2.2%
l=45km | 2 | 2.2%
l=700km | 2 | 2.2%
문수산터널 | 2 | 2.2%
Other values (25) | 27 | 29.3%

Most occurring characters

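Value | Count | Frequency (%)
(space) | 51 | 12.5%
m | 19 | 4.7%
(Hangul character not preserved) | 19 | 4.7%
, | 18 | 4.4%
L | 17 | 4.2%
k | 17 | 4.2%
(Hangul character not preserved) | 17 | 4.2%
= | 17 | 4.2%
0 | 15 | 3.7%
1 | 12 | 2.9%
Other values (40) | 206 | 50.5%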

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 202 | 49.5%
Decimal Number | 66 | 16.2%
Space Separator | 51 | 12.5%
Lowercase Letter | 36 | 8.8%
Other Punctuation | 19 | 4.7%
Uppercase Letter | 17 | 4.2%
Math Symbol | 17 | 4.2%

Most frequent character per category

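Other Letter
Value | Count | Frequency (%)
(character not preserved) | 19 | 9.4%
(character not preserved) | 17 | 8.4%
(character not preserved) | 11 | 5.4%
(character not preserved) | 10 | 5.0%
(character not preserved) | 10 | 5.0%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
Other values (23) | 90 | 44.6%

Decimal Number
Value | Count | Frequency (%)
0 | 15 | 22.7%
1 | 12 | 18.2%
4 | 9 | 13.6%
5 | 8 | 12.1%
6 | 6 | 9.1%
9 | 6 | 9.1%
7 | 5 | 7.6%
2 | 2 | 3.0%
8 | 2 | 3.0%
3 | 1 | 1.5%

Lowercase Letter
Value | Count | Frequency (%)
m | 19 | 52.8%
k | 17 | 47.2%

Other Punctuation
Value | Count | Frequency (%)
, | 18 | 94.7%
. | 1 | 5.3%

Space Separator
Value | Count | Frequency (%)
(space) | 51 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 17 | 100.0%

Math Symbol
Value | Count | Frequency (%)
= | 17 | 100.0%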

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 202 | 49.5%
Common | 153 | 37.5%
Latin | 53 | 13.0%

Most frequent character per script

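Hangul
Value | Count | Frequency (%)
(character not preserved) | 19 | 9.4%
(character not preserved) | 17 | 8.4%
(character not preserved) | 11 | 5.4%
(character not preserved) | 10 | 5.0%
(character not preserved) | 10 | 5.0%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
Other values (23) | 90 | 44.6%

Common
Value | Count | Frequency (%)
(space) | 51 | 33.3%
, | 18 | 11.8%
= | 17 | 11.1%
0 | 15 | 9.8%
1 | 12 | 7.8%
4 | 9 | 5.9%
5 | 8 | 5.2%
6 | 6 | 3.9%
9 | 6 | 3.9%
7 | 5 | 3.3%
Other values (4) | 6 | 3.9%

Latin
Value | Count | Frequency (%)
m | 19 | 35.8%
L | 17 | 32.1%
k | 17 | 32.1%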

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 206 | 50.5%
Hangul | 202 | 49.5%

Most frequent character per block

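ASCII
Value | Count | Frequency (%)
(space) | 51 | 24.8%
m | 19 | 9.2%
, | 18 | 8.7%
L | 17 | 8.3%
k | 17 | 8.3%
= | 17 | 8.3%
0 | 15 | 7.3%
1 | 12 | 5.8%
4 | 9 | 4.4%
5 | 8 | 3.9%
Other values (7) | 23 | 11.2%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 19 | 9.4%
(character not preserved) | 17 | 8.4%
(character not preserved) | 11 | 5.4%
(character not preserved) | 10 | 5.0%
(character not preserved) | 10 | 5.0%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
(character not preserved) | 9 | 4.5%
Other values (23) | 90 | 44.6%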

예산액(백만원) (budget, million KRW)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 33
Distinct (%): 82.5%
Missing: 1
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 9038.425
Minimum: 543
Maximum: 215000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 543
5th percentile: 655.9
Q1: 1350
Median: 2883
Q3: 4350
95th percentile: 13156.5
Maximum: 215000
Range: 214457
Interquartile range (IQR): 3000

Descriptive statistics

Standard deviation: 33580.856
Coefficient of variation (CV): 3.7153438
Kurtosis: 39.079015
Mean: 9038.425
Median Absolute Deviation (MAD): 1533
Skewness: 6.2198409
Sum: 361537
Variance: 1.1276739 × 10^9
Monotonicity: Not monotonic
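
Several of these figures depend on the single missing value being excluded, and the mean is pulled far above the median by the maximum of 215000. A small sketch that reproduces the reported mean, distinct percentage and coefficient of variation from the summary figures, then shows the mean with the largest value left out. The last figure is a derived illustration, not something the report states.

```python
# Figures copied from the statistics for 예산액(백만원).
total = 361537           # Sum
n_observations = 41
n_missing = 1
n_distinct = 33
maximum = 215000         # largest single value
std = 33580.856          # Standard deviation

n_valid = n_observations - n_missing              # non-missing observations
mean = total / n_valid                            # mean over non-missing values
distinct_pct = 100 * n_distinct / n_valid         # distinct share of non-missing values
cv = std / mean                                   # coefficient of variation

# Illustration only: mean of the remaining values once the maximum is dropped.
mean_without_max = (total - maximum) / (n_valid - 1)

print(n_valid, round(mean, 3), round(distinct_pct, 1), round(cv, 4))
print(round(mean_without_max, 2))
```

With the maximum removed, the mean falls to roughly 3757, much closer to the reported median of 2883, which is why the median and IQR describe a typical budget better than the mean and standard deviation here.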
Histogram with fixed size bins (bins=33)
Common values
Value | Count | Frequency (%)
1350 | 3 | 7.3%
900 | 3 | 7.3%
3000 | 3 | 7.3%
4100 | 2 | 4.9%
6234 | 1 | 2.4%
543 | 1 | 2.4%
2766 | 1 | 2.4%
5466 | 1 | 2.4%
2014 | 1 | 2.4%
2700 | 1 | 2.4%
Other values (23) | 23 | 56.1%

Minimum values
Value | Count | Frequency (%)
543 | 1 | 2.4%
578 | 1 | 2.4%
660 | 1 | 2.4%
700 | 1 | 2.4%
800 | 1 | 2.4%
900 | 3 | 7.3%
1300 | 1 | 2.4%
1350 | 3 | 7.3%
2014 | 1 | 2.4%
2140 | 1 | 2.4%

Maximum values
Value | Count | Frequency (%)
215000 | 1 | 2.4%
14230 | 1 | 2.4%
13100 | 1 | 2.4%
12000 | 1 | 2.4%
11847 | 1 | 2.4%
7800 | 1 | 2.4%
6234 | 1 | 2.4%
5466 | 1 | 2.4%
4700 | 1 | 2.4%
4500 | 1 | 2.4%

Interactions

[Interaction plots; images not preserved]

Correlations

Variable | 집계년도 | 구분명 | 사업명 | 사업기간 | 사업량(km) | 예산액(백만원)
집계년도 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.459
구분명 | 0.000 | 1.000 | 0.781 | 0.838 | 1.000 | 0.000
사업명 | 0.000 | 0.781 | 1.000 | 0.000 | 1.000 | 1.000
사업기간 | 1.000 | 0.838 | 0.000 | 1.000 | 0.000 | 0.433
사업량(km) | 0.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000
예산액(백만원) | 0.459 | 0.000 | 1.000 | 0.433 | 1.000 | 1.000

Variable | 구분명 | 사업기간 | 사업명
구분명 | 1.000 | 0.599 | 0.602
사업기간 | 0.599 | 1.000 | 0.000
사업명 | 0.602 | 0.000 | 1.000

Variable | 집계년도 | 예산액(백만원) | 구분명 | 사업명 | 사업기간
집계년도 | 1.000 | 0.361 | 0.346 | 0.000 | 0.968
예산액(백만원) | 0.361 | 1.000 | 0.000 | 0.918 | 0.248
구분명 | 0.346 | 0.000 | 1.000 | 0.602 | 0.599
사업명 | 0.000 | 0.918 | 0.602 | 1.000 | 0.000
사업기간 | 0.968 | 0.248 | 0.599 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 집계년도 | 구분명 | 사업명 | 사업기간 | 사업량(km) | 예산액(백만원)
0 | 2023 | 도로 포장 유지관리 | 시설물보수 | 202301~202312 | 가드레일, 안전휀스 등 | 6234
1 | 2023 | 도로 포장 유지관리 | 포장도 보수(차선도색 포함) | 202301~202312 | L=109.1km | 215000
2 | 2023 | 도로 포장 유지관리 | 상시보수 | 202301~202312 | 배수로, 도로법면 복구 등 | 2214
3 | 2022 | 도로 포장 유지관리 | 포장도 보수 | 202201~202212 | L=100km | 14230
4 | 2022 | 도로 포장 유지관리 | 차선도색 | 202201~202212 | L=360km | 4100
5 | 2022 | 도로 포장 유지관리 | 시설물보수 | 202201~202212 | 가드레일, 안전휀스 등 | 1350
6 | 2022 | 도로 포장 유지관리 | 상시보수 | 202201~202212 | 배수로, 도로법면 복구 등 | 4270
7 | 2021 | 도로 포장 유지관리 | 차선도색 | 202101~202112 | L=581km | 4100
8 | 2021 | 도로 포장 유지관리 | 포장도 보수 | 202101~202112 | L=194km | 13100
9 | 2021 | 도로 포장 유지관리 | 시설물보수 | 202101~202112 | 가드레일, 안전휀스 등 | 1350

Last rows
Index | 집계년도 | 구분명 | 사업명 | 사업기간 | 사업량(km) | 예산액(백만원)
31 | 2016 | 도로 구조물 유지관리 | 교량 보수,보강 | 201601~201612 | 47개소 | 4300
32 | 2016 | 터널위탁관리 | 문수산터널 위탁관리용역 | 201602~201612 | 문수산터널 1,566m | 543
33 | 2016 | 도로 구조물 유지관리 | 교량 안전점검 | 201601~201612 | 412개소 | 3000
34 | 2015 | 지방도유지관리 | 포장도 보수 | 201501~201512 | L=40km | 4500
35 | 2015 | 지방도유지관리 | 차선도색 | 201501~201512 | L=710km | 3200
36 | 2015 | 지방도유지관리 | 시설물보수 | 201501~201512 | 가드레일등 안전시설물 및 안전표지판등 | 900
37 | 2015 | 지방도유지관리 | 상시보수 | 201501~201512 | 배수시설 등 도로 시설물 정비 및 개선 | 660
38 | 2015 | 구조물유지관리 | 교량 보수,보강 | 201501~201512 | 59개소 | 4700
39 | 2015 | 구조물유지관리 | 교량 안전점검 | 201501~201512 | 420개소 | 1300
40 | 2015 | 터널위탁관리 | 문수산터널 위탁관리용역 | 201502~201602 | 문수산터널 1,566m | 700