Overview

Dataset statistics

Number of variables: 8
Number of observations: 7118
Missing cells: 1790
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 486.7 KiB
Average record size in memory: 70.0 B
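
The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 7118
missing_cells = 1790
total_size_kib = 486.7

# Missing percentage is assumed to be relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (3.1% and 70.0 B).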

Variable types

Numeric: 5
Categorical: 2
Text: 1

Dataset

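Description: Gyeongsangnam-do development bond (경상남도_개발공채) data. It includes fields such as 공사년도 (construction year), 공사구분 (construction type), 공사번호 (construction number), 지급일자 (payment date), 소요액 (required amount), and 실적금액 (actual amount).
Author: 경상남도 (Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15049535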

Alerts

부서코드 has constant value "1" (Constant)
소요액 is highly overall correlated with 실적금액 (High correlation)
실적금액 is highly overall correlated with 소요액 (High correlation)
소요액 has 1155 (16.2%) missing values (Missing)
실적금액 has 635 (8.9%) missing values (Missing)
소요액 is highly skewed (γ1 = 68.79262527) (Skewed)
실적금액 is highly skewed (γ1 = 59.15766638) (Skewed)
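
To see why a handful of very large amounts produces γ1 values this extreme, here is a minimal sketch of the moment-based skewness coefficient. It uses the plain population formula with no bias correction, which may differ slightly from the estimator the report applies, and the amounts are made up for illustration rather than taken from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical amounts: many small payments, then one very large payment.
typical = [20_000, 30_000, 45_000, 55_000, 115_000] * 20
with_outlier = typical + [81_180_000_000]

print(skewness(typical))       # mildly right-skewed
print(skewness(with_outlier))  # dominated by the single large value
```

A single value several orders of magnitude above the rest is enough to push the coefficient far beyond what the bulk of the data would suggest.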

Reproduction

Analysis started: 2023-12-11 00:36:50.158409
Analysis finished: 2023-12-11 00:36:52.957827
Duration: 2.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
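
A report like this one can be regenerated with ydata-profiling. The sketch below is a minimal example rather than the exact command used here: the CSV path, output path, encoding, and report title are all assumptions, and the original run may have used a custom config.json.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this function can be defined without the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is hypothetical; any string works.
    report = ProfileReport(df, title="경상남도_개발공채")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.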

Variables

공사년도
Real number (ℝ)

Distinct: 24
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.7201
Minimum: 1990
Maximum: 2013
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.7 KiB

Quantile statistics

Minimum: 1990
5-th percentile: 1991
Q1: 1999
Median: 2004
Q3: 2009
95-th percentile: 2011
Maximum: 2013
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.4661822
Coefficient of variation (CV): 0.0032286998
Kurtosis: -0.97022669
Mean: 2002.7201
Median Absolute Deviation (MAD): 5
Skewness: -0.40839173
Sum: 14255362
Variance: 41.811512
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=24)]

Common values

Value              Count  Frequency (%)
2010                 697  9.8%
2011                 547  7.7%
2000                 442  6.2%
2004                 433  6.1%
2003                 404  5.7%
2007                 386  5.4%
2001                 373  5.2%
2006                 368  5.2%
2009                 365  5.1%
2005                 352  4.9%
Other values (14)   2751  38.6%

Minimum 10 values

Value  Count  Frequency (%)
1990     207  2.9%
1991     275  3.9%
1992     195  2.7%
1993     267  3.8%
1994     204  2.9%
1995     256  3.6%
1996     185  2.6%
1997      37  0.5%
1998      90  1.3%
1999     332  4.7%

Maximum 10 values

Value  Count  Frequency (%)
2013      10  0.1%
2012     197  2.8%
2011     547  7.7%
2010     697  9.8%
2009     365  5.1%
2008     243  3.4%
2007     386  5.4%
2006     368  5.2%
2005     352  4.9%
2004     433  6.1%

공사구분
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.7 KiB

공사: 5238
용역: 1291
기타: 546
구매: 43

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공사
2nd row: 공사
3rd row: 공사
4th row: 공사
5th row: 공사

Common Values

Value  Count  Frequency (%)
공사    5238  73.6%
용역    1291  18.1%
기타     546  7.7%
구매      43  0.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

공사번호
Real number (ℝ)

Distinct: 531
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97.908401
Minimum: 1
Maximum: 623
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 29
Median: 59
Q3: 100
95-th percentile: 413
Maximum: 623
Range: 622
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 118.71299
Coefficient of variation (CV): 1.2124904
Kurtosis: 4.186912
Mean: 97.908401
Median Absolute Deviation (MAD): 34
Skewness: 2.1870508
Sum: 696912
Variance: 14092.775
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
29                     79  1.1%
8                      77  1.1%
30                     73  1.0%
21                     73  1.0%
7                      71  1.0%
5                      70  1.0%
33                     69  1.0%
31                     69  1.0%
1                      68  1.0%
20                     68  1.0%
Other values (521)   6401  89.9%

Minimum 10 values

Value  Count  Frequency (%)
1         68  1.0%
2         61  0.9%
3         63  0.9%
4         54  0.8%
5         70  1.0%
6         67  0.9%
7         71  1.0%
8         77  1.1%
9         65  0.9%
10        62  0.9%

Maximum 10 values

Value  Count  Frequency (%)
623        1  < 0.1%
620        1  < 0.1%
619        1  < 0.1%
618        1  < 0.1%
617        2  < 0.1%
616        1  < 0.1%
615        1  < 0.1%
614        1  < 0.1%
607        1  < 0.1%
604        1  < 0.1%

부서코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.7 KiB

1: 7118

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1       7118  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

순번
Real number (ℝ)

Distinct: 31
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0973588
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 31
Range: 30
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.1467036
Coefficient of variation (CV): 1.0235271
Kurtosis: 52.509832
Mean: 2.0973588
Median Absolute Deviation (MAD): 0
Skewness: 5.9527792
Sum: 14929
Variance: 4.6083364
Monotonicity: Not monotonic
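
The MAD of 0 follows from the frequency counts: more than half of the rows have 순번 = 1, so the median is 1 and the median absolute deviation from it is 0. The sketch below rebuilds the column from the reported counts. The 80 rows with values above 10 are lumped at a placeholder value of 11, which is an assumption that does not affect the median or the MAD.

```python
from statistics import median

# Counts for values 1..10 come from the report's frequency table.
# Key 11 is a placeholder for the 80 "other" rows (actual values 11..31).
counts = {1: 3591, 2: 1784, 3: 933, 4: 395, 5: 162,
          6: 84, 7: 38, 8: 26, 9: 13, 10: 12, 11: 80}

# Expand the counts back into one value per row.
values = [value for value, count in counts.items() for _ in range(count)]

med = median(values)
mad = median(abs(v - med) for v in values)

print(len(values), med, mad)
```

Because the MAD is 0 here, it is not a useful measure of spread for this column; the IQR or standard deviation is more informative.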
[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value              Count  Frequency (%)
1                   3591  50.4%
2                   1784  25.1%
3                    933  13.1%
4                    395  5.5%
5                    162  2.3%
6                     84  1.2%
7                     38  0.5%
8                     26  0.4%
9                     13  0.2%
10                    12  0.2%
Other values (21)     80  1.1%

Minimum 10 values

Value  Count  Frequency (%)
1       3591  50.4%
2       1784  25.1%
3        933  13.1%
4        395  5.5%
5        162  2.3%
6         84  1.2%
7         38  0.5%
8         26  0.4%
9         13  0.2%
10        12  0.2%

Maximum 10 values

Value  Count  Frequency (%)
31         1  < 0.1%
30         1  < 0.1%
29         1  < 0.1%
28         2  < 0.1%
27         2  < 0.1%
26         2  < 0.1%
25         2  < 0.1%
24         2  < 0.1%
23         3  < 0.1%
22         3  < 0.1%

지급일자
Text

Distinct: 3079
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Memory size: 55.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.9519528
Min length: 4

Characters and Unicode

Total characters: 70838
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1474
Unique (%): 20.7%

Sample

1st row: 1990-03-28
2nd row: 1990-05-22
3rd row: 2002-10-30
4th row: 2002-10-30
5th row: 1990-04-09
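
The length statistics for this variable (minimum 4, median 10) suggest that some values are not full YYYY-MM-DD dates, which may be why the column was profiled as text. The report does not show the short values, so the sketch below uses the first sample rows plus two hypothetical malformed strings to illustrate one way to flag them before converting the column to dates.

```python
from datetime import datetime

def is_iso_date(text):
    """Return True if text is a zero-padded, valid YYYY-MM-DD date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime also accepts unpadded fields such as "1990-3-28".
    return len(text) == 10

# The first three values come from the sample rows.
# "2002" and "2003-02-30" are hypothetical examples of bad values.
candidates = ["1990-03-28", "1990-05-22", "2002-10-30", "2002", "2003-02-30"]

invalid = [value for value in candidates if not is_iso_date(value)]
print(invalid)
```

Values flagged this way can be inspected by hand before deciding whether to repair or drop them.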
Value                 Count  Frequency (%)
2003-12-30               34  0.5%
2010-12-31               27  0.4%
2005-12-28               27  0.4%
2003-09-05               25  0.4%
2010-09-17               25  0.4%
2010-06-29               24  0.3%
2006-12-27               22  0.3%
2010-06-28               22  0.3%
2004-03-03               17  0.2%
2011-06-29               17  0.2%
Other values (3069)    6878  96.6%

Most occurring characters

Value  Count  Frequency (%)
0      17995  25.4%
-      14068  19.9%
2      10966  15.5%
1      10416  14.7%
9       5846  8.3%
6       2071  2.9%
4       2037  2.9%
3       2018  2.8%
5       1977  2.8%
7       1850  2.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    56770  80.1%
Dash Punctuation  14068  19.9%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      17995  31.7%
2      10966  19.3%
1      10416  18.3%
9       5846  10.3%
6       2071  3.6%
4       2037  3.6%
3       2018  3.6%
5       1977  3.5%
7       1850  3.3%
8       1594  2.8%

Dash Punctuation

Value  Count  Frequency (%)
-      14068  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  70838  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      17995  25.4%
-      14068  19.9%
2      10966  15.5%
1      10416  14.7%
9       5846  8.3%
6       2071  2.9%
4       2037  2.9%
3       2018  2.8%
5       1977  2.8%
7       1850  2.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  70838  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      17995  25.4%
-      14068  19.9%
2      10966  15.5%
1      10416  14.7%
9       5846  8.3%
6       2071  2.9%
4       2037  2.9%
3       2018  2.8%
5       1977  2.8%
7       1850  2.6%

소요액
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct: 2642
Distinct (%): 44.3%
Missing: 1155
Missing (%): 16.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 38144867
Minimum: 0
Maximum: 8.118 × 10¹⁰
Zeros: 5
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 62.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 45000
Q1: 685000
Median: 2715000
Q3: 7647500
95-th percentile: 35500000
Maximum: 8.118 × 10¹⁰
Range: 8.118 × 10¹⁰
Interquartile range (IQR): 6962500

Descriptive statistics

Standard deviation: 1.0963646 × 10⁹
Coefficient of variation (CV): 28.742127
Kurtosis: 5042.9965
Mean: 38144867
Median Absolute Deviation (MAD): 2480000
Skewness: 68.792625
Sum: 2.2745784 × 10¹¹
Variance: 1.2020154 × 10¹⁸
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count  Frequency (%)
55000                    55  0.8%
20000                    53  0.7%
50000                    46  0.6%
45000                    46  0.6%
30000                    45  0.6%
35000                    42  0.6%
15000                    40  0.6%
115000                   40  0.6%
25000                    38  0.5%
60000                    38  0.5%
Other values (2632)    5520  77.5%
(Missing)              1155  16.2%

Minimum 10 values

Value  Count  Frequency (%)
0          5  0.1%
5000       5  0.1%
10000     30  0.4%
15000     40  0.6%
20000     53  0.7%
25000     38  0.5%
27000      1  < 0.1%
30000     45  0.6%
35000     42  0.6%
40000     34  0.5%

Maximum 10 values

Value        Count  Frequency (%)
81180000000      1  < 0.1%
13692095000      1  < 0.1%
12750000000      1  < 0.1%
10539105000      1  < 0.1%
4922070000       1  < 0.1%
3519865000       1  < 0.1%
3256385000       1  < 0.1%
3162415000       1  < 0.1%
2735625000       1  < 0.1%
2638090000       1  < 0.1%

실적금액
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct: 2718
Distinct (%): 41.9%
Missing: 635
Missing (%): 8.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 11068677
Minimum: 0
Maximum: 1.275 × 10¹⁰
Zeros: 23
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 62.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 45000
Q1: 765000
Median: 2920000
Q3: 7655000
95-th percentile: 29718500
Maximum: 1.275 × 10¹⁰
Range: 1.275 × 10¹⁰
Interquartile range (IQR): 6890000

Descriptive statistics

Standard deviation: 1.904491 × 10⁸
Coefficient of variation (CV): 17.20613
Kurtosis: 3654.5187
Mean: 11068677
Median Absolute Deviation (MAD): 2620000
Skewness: 59.157666
Sum: 7.1758234 × 10¹⁰
Variance: 3.6270858 × 10¹⁶
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                 Count  Frequency (%)
55000                    62  0.9%
20000                    55  0.8%
30000                    50  0.7%
50000                    47  0.7%
45000                    44  0.6%
15000                    44  0.6%
115000                   40  0.6%
35000                    40  0.6%
60000                    38  0.5%
25000                    37  0.5%
Other values (2708)    6026  84.7%
(Missing)               635  8.9%

Minimum 10 values

Value  Count  Frequency (%)
0         23  0.3%
5000       5  0.1%
10000     28  0.4%
15000     44  0.6%
20000     55  0.8%
20555      1  < 0.1%
25000     37  0.5%
27000      1  < 0.1%
30000     50  0.7%
35000     40  0.6%

Maximum 10 values

Value        Count  Frequency (%)
12750000000      1  < 0.1%
8345000000       1  < 0.1%
1192000000       1  < 0.1%
441575000        1  < 0.1%
367180000        1  < 0.1%
353310000        1  < 0.1%
249425000        1  < 0.1%
213680000        1  < 0.1%
191040000        1  < 0.1%
176855000        1  < 0.1%

Interactions

[Figures: 25 interaction plots, one per ordered pair of the 5 numeric variables]

Correlations

[Figure: correlation heatmap]

           공사년도  공사구분  공사번호  순번     소요액    실적금액
공사년도   1.000     0.574     0.677     0.292    0.036    0.000
공사구분   0.574     1.000     0.554     0.102    0.000    0.000
공사번호   0.677     0.554     1.000     0.150    0.000    0.072
순번       0.292     0.102     0.150     1.000    0.041    0.000
소요액     0.036     0.000     0.000     0.041    1.000    0.750
실적금액   0.000     0.000     0.072     0.000    0.750    1.000

[Figure: correlation heatmap]

           공사년도  공사번호  순번      소요액   실적금액  공사구분
공사년도   1.000     0.460     -0.167    -0.392   -0.387    0.385
공사번호   0.460     1.000     -0.078    -0.193   -0.189    0.365
순번       -0.167    -0.078    1.000     0.248    0.289     0.035
소요액     -0.392    -0.193    0.248     1.000    0.960     0.000
실적금액   -0.387    -0.189    0.289     0.960    1.000     0.000
공사구분   0.385     0.365     0.035     0.000    0.000     1.000
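
The extracted text does not say which coefficient each matrix reports. For heavily skewed monetary columns such as 소요액 and 실적금액, a rank-based measure like Spearman's rho is a common choice, so the sketch below shows how it could be computed on rows where both amounts are present. The helper names are illustrative, and the pairs are a few rows from the report's sample tables, not the full dataset.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of ranks, skipping missing pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))

# (소요액, 실적금액) pairs from the sample rows; None stands for <NA>.
required = [534000, 760000, 75000, 1250000, 220000, 255000, 2475000]
actual = [534000, 760000, 75000, 1250000, 220000, 255000, None]

rho = spearman(required, actual)
print(round(rho, 3))
```

In these few rows the two amounts are identical wherever both are present, so the ranks agree perfectly. The full dataset need not behave the same way.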

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]
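
Nullity correlation can be illustrated by correlating the "is missing" indicators of two columns. The sketch below does this for 소요액 and 실적금액 using only the ten first rows shown in the Sample section. That is far too few rows to be representative, so the result only shows the mechanics.

```python
# (소요액, 실적금액) for the ten first rows of the sample; None stands for <NA>.
rows = [
    (2475000, None), (None, 2475000), (665000, None), (1350000, None),
    (None, 23985000), (None, 10662000), (1320000, 10686000),
    (1110000, None), (23985000, None), (534000, 534000),
]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# 1 where the value is missing, 0 where it is present.
null_required = [1 if required is None else 0 for required, _ in rows]
null_actual = [1 if actual is None else 0 for _, actual in rows]

nullity_corr = pearson(null_required, null_actual)
print(round(nullity_corr, 3))
```

On these rows the coefficient is negative: when one amount is missing the other tends to be present, which is consistent with rows that record either a required amount or an actual amount.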

Sample

First rows

      공사년도  공사구분  공사번호  부서코드  순번  지급일자    소요액      실적금액
0     1990      공사      5         1         1     1990-03-28  2475000     <NA>
1     1990      공사      5         1         2     1990-05-22  <NA>        2475000
2     1990      공사      1         1         3     2002-10-30  665000      <NA>
3     1990      공사      1         1         4     2002-10-30  1350000     <NA>
4     1990      공사      2         1         2     1990-04-09  <NA>        23985000
5     1990      공사      4         1         1     1990-06-27  <NA>        10662000
6     1990      공사      1         1         2     2002-10-30  1320000     10686000
7     1990      공사      1         1         1     2002-10-30  1110000     <NA>
8     1990      공사      2         1         1     1990-03-30  23985000    <NA>
9     1990      공사      6         1         2     1990-06-05  534000      534000

Last rows

      공사년도  공사구분  공사번호  부서코드  순번  지급일자    소요액      실적금액
7108  2013      용역      149       1         1     2013-05-29  760000      760000
7109  2013      용역      155       1         1     2013-06-25  75000       75000
7110  1996      공사      48        1         1     1996-07-04  <NA>        8600000
7111  2000      용역      56        1         1     2000-12-04  1250000     1250000
7112  2004      용역      2         1         1     2004-03-03  3256385000  <NA>
7113  2004      용역      2         1         2     2004-04-07  240350000   <NA>
7114  2011      용역      483       1         1     2012-01-17  220000      220000
7115  2004      용역      2         1         4     2004-05-17  1162995000  <NA>
7116  2010      용역      97        1         1     2011-02-24  255000      255000
7117  2004      용역      2         1         3     2004-05-06  232595000   <NA>