Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 2512 |
Missing cells | 7536 |
Missing cells (%) | 33.3% |
Duplicate rows | 1 |
Duplicate rows (%) | < 0.1% |
Total size in memory | 193.9 KiB |
Average record size in memory | 79.1 B |
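The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers reported in this table (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 2512
missing_cells = 7536
total_size_kib = 193.9

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory in bytes divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The record size computed this way comes out near 79 B; a small difference from the reported 79.1 B is expected because the total size is itself rounded to one decimal.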
Variable types
Numeric | 3 |
---|---|
Categorical | 2 |
Text | 1 |
Unsupported | 3 |
Dataset
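Description | Advance payment settlement data from the Gyeongsangnam-do construction contract ledger system. It includes fields such as construction year (공사년도), construction category (공사구분), submission date (제출일자), and settlement amount (정산금액). |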
---|---|
Author | Gyeongsangnam-do (경상남도) |
URL | https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15049521 |
부서코드 has constant value "1" | Constant |
Dataset has 1 (< 0.1%) duplicate rows | Duplicates |
공사구분 is highly imbalanced (63.0%) | Imbalance |
Unnamed: 6 has 2512 (100.0%) missing values | Missing |
Unnamed: 7 has 2512 (100.0%) missing values | Missing |
Unnamed: 8 has 2512 (100.0%) missing values | Missing |
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
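The alerts above suggest a cleaning step before further analysis: drop the columns that are entirely missing (the `Unnamed: N` columns) and the constant column. The following is a minimal sketch using plain Python dictionaries; the helper name `drop_uninformative_columns` and the two-row sample are illustrative assumptions, not part of the report.

```python
def drop_uninformative_columns(rows):
    """Return copies of the rows without all-missing or constant columns.

    `rows` is a list of dicts that share the same keys; None marks a missing cell.
    """
    if not rows:
        return []
    keep = []
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        if not present:
            continue  # every value is missing (e.g. "Unnamed: 6")
        if len(present) == len(values) and len(set(present)) == 1:
            continue  # single constant value (e.g. 부서코드)
        keep.append(col)
    return [{col: row[col] for col in keep} for row in rows]


# Two rows taken from the sample tables of this report (hypothetical subset of columns).
sample = [
    {"공사년도": 1990, "부서코드": "1", "정산금액": 53400000, "Unnamed: 6": None},
    {"공사년도": 2019, "부서코드": "1", "정산금액": 47000000, "Unnamed: 6": None},
]
cleaned = drop_uninformative_columns(sample)
print(list(cleaned[0].keys()))
```

Whether a constant column should really be dropped depends on the downstream use; it is removed here only because it carries no variation.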
Reproduction
Analysis started | 2023-12-11 00:38:12.918941 |
---|---|
Analysis finished | 2023-12-11 00:38:14.265340 |
Duration | 1.35 seconds |
Software version | ydata-profiling v4.5.1 |
공사년도
Real number (ℝ)
Distinct | 29 |
---|---|
Distinct (%) | 1.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2006.6286 |
Minimum | 1990 |
---|---|
Maximum | 2019 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 22.2 KiB |
Quantile statistics
Minimum | 1990 |
---|---|
5-th percentile | 1992 |
Q1 | 2003 |
median | 2009 |
Q3 | 2011 |
95-th percentile | 2016 |
Maximum | 2019 |
Range | 29 |
Interquartile range (IQR) | 8 |
Descriptive statistics
Standard deviation | 7.0183939 |
---|---|
Coefficient of variation (CV) | 0.0034976049 |
Kurtosis | -0.040319738 |
Mean | 2006.6286 |
Median Absolute Deviation (MAD) | 4 |
Skewness | -0.8594025 |
Sum | 5040651 |
Variance | 49.257853 |
Monotonicity | Not monotonic |
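Two of the descriptive statistics above are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short check using the reported values (a sketch of the standard definitions, not the report's own code):

```python
# Values copied from the 공사년도 statistics above.
mean = 2006.6286
std = 7.0183939

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation

print(f"CV = {cv:.7f}")
print(f"Variance = {variance:.4f}")
```

Both results agree with the reported CV and variance up to rounding of the inputs.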
Value | Count | Frequency (%) |
2010 | 243 | 9.7% |
2009 | 241 | 9.6% |
2012 | 193 | 7.7% |
2011 | 184 | 7.3% |
2008 | 160 | 6.4% |
2007 | 134 | 5.3% |
2015 | 125 | 5.0% |
2003 | 109 | 4.3% |
2014 | 92 | 3.7% |
2005 | 91 | 3.6% |
Other values (19) | 940 | 37.4% |
Value | Count | Frequency (%) |
1990 | 61 | 2.4% |
1991 | 60 | 2.4% |
1992 | 51 | 2.0% |
1993 | 73 | 2.9% |
1994 | 61 | 2.4% |
1995 | 10 | 0.4% |
1997 | 1 | < 0.1% |
1998 | 1 | < 0.1% |
1999 | 60 | 2.4% |
2000 | 83 | 3.3% |
Value | Count | Frequency (%) |
2019 | 2 | 0.1% |
2018 | 29 | 1.2% |
2017 | 26 | 1.0% |
2016 | 74 | 2.9% |
2015 | 125 | 5.0% |
2014 | 92 | 3.7% |
2013 | 76 | 3.0% |
2012 | 193 | 7.7% |
2011 | 184 | 7.3% |
2010 | 243 | 9.7% |
공사구분
Categorical
IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 19.8 KiB |
공사 | 2165 |
---|---|
용역 | 345 |
구매 | 2 |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 공사 |
---|---|
2nd row | 공사 |
3rd row | 공사 |
4th row | 공사 |
5th row | 공사 |
Common Values
Value | Count | Frequency (%) |
공사 | 2165 | 86.2% |
용역 | 345 | 13.7% |
구매 | 2 | 0.1% |
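The 63.0% imbalance figure reported for 공사구분 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below assumes that definition; the function name `imbalance_score` is illustrative and is not taken from the ydata-profiling API.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(counts))


# Class counts for 공사, 용역, 구매 from the table above.
score = imbalance_score([2165, 345, 2])
print(f"{score:.1%}")  # 63.0%
```

A perfectly balanced three-class column would score 0 under this definition, so 63.0% reflects how strongly the 공사 class dominates.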
공사번호
Real number (ℝ)
Distinct | 377 |
---|---|
Distinct (%) | 15.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 104.0211 |
Minimum | 1 |
---|---|
Maximum | 617 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 22.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 8 |
Q1 | 33 |
median | 65 |
Q3 | 114 |
95-th percentile | 416 |
Maximum | 617 |
Range | 616 |
Interquartile range (IQR) | 81 |
Descriptive statistics
Standard deviation | 116.27908 |
---|---|
Coefficient of variation (CV) | 1.1178413 |
Kurtosis | 3.771799 |
Mean | 104.0211 |
Median Absolute Deviation (MAD) | 40 |
Skewness | 2.0740245 |
Sum | 261301 |
Variance | 13520.824 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
113 | 33 | 1.3% |
42 | 30 | 1.2% |
40 | 28 | 1.1% |
20 | 28 | 1.1% |
39 | 28 | 1.1% |
7 | 27 | 1.1% |
12 | 27 | 1.1% |
36 | 27 | 1.1% |
105 | 27 | 1.1% |
58 | 26 | 1.0% |
Other values (367) | 2231 | 88.8% |
Value | Count | Frequency (%) |
1 | 22 | 0.9% |
2 | 13 | 0.5% |
3 | 24 | 1.0% |
4 | 11 | 0.4% |
5 | 15 | 0.6% |
6 | 9 | 0.4% |
7 | 27 | 1.1% |
8 | 24 | 1.0% |
9 | 20 | 0.8% |
10 | 22 | 0.9% |
Value | Count | Frequency (%) |
617 | 1 | < 0.1% |
596 | 1 | < 0.1% |
594 | 1 | < 0.1% |
582 | 1 | < 0.1% |
564 | 1 | < 0.1% |
554 | 2 | 0.1% |
543 | 2 | 0.1% |
530 | 1 | < 0.1% |
529 | 1 | < 0.1% |
527 | 1 | < 0.1% |
부서코드
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 19.8 KiB |
1 | 2512 |
---|---|
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 2512 | 100.0% |
제출일자
Text
Distinct | 1465 |
---|---|
Distinct (%) | 58.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 19.8 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 9.9164013 |
Min length | 4 |
Characters and Unicode
Total characters | 24910 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 983 |
---|---|
Unique (%) | 39.1% |
Sample
1st row | 1990-12-15 |
---|---|
2nd row | 1990-12-29 |
3rd row | 1990-12-07 |
4th row | 199012 |
5th row | 1990-04-06 |
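The sample shows that 제출일자 mixes at least two formats: full dates such as `1990-12-15` and year-month values such as `199012`. The minimum length of 4 suggests that some values may hold only a year, although no such value appears in the sample. A minimal normalization sketch under those assumptions follows; the function name and the choice to default missing parts to 1 are illustrative.

```python
from datetime import date


def parse_submission_date(text):
    """Parse 'YYYY-MM-DD', 'YYYYMM', or 'YYYY' into a date.

    A missing month or day defaults to 1 (an assumption for this sketch).
    """
    text = text.strip()
    if len(text) == 10:
        return date.fromisoformat(text)  # e.g. 1990-12-15
    if len(text) == 6 and text.isdigit():
        return date(int(text[:4]), int(text[4:]), 1)  # e.g. 199012
    if len(text) == 4 and text.isdigit():
        return date(int(text), 1, 1)  # year only (assumed format)
    raise ValueError(f"unrecognized date format: {text!r}")


for raw in ["1990-12-15", "199012"]:
    print(raw, "->", parse_submission_date(raw).isoformat())
```

Raising on unknown formats, rather than guessing, keeps unexpected values visible during cleaning.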
Value | Count | Frequency (%) |
2009-12-30 | 20 | 0.8% |
2010-12-30 | 17 | 0.7% |
2007-12-28 | 15 | 0.6% |
2009-12-29 | 15 | 0.6% |
2012-12-28 | 14 | 0.6% |
2009-06-25 | 12 | 0.5% |
2010-06-28 | 11 | 0.4% |
2010-06-29 | 11 | 0.4% |
2008-12-29 | 11 | 0.4% |
2015-12-23 | 10 | 0.4% |
Other values (1455) | 2376 | 94.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 6178 | 24.8% |
- | 4920 | 19.8% |
2 | 4493 | 18.0% |
1 | 3745 | 15.0% |
9 | 1695 | 6.8% |
6 | 773 | 3.1% |
3 | 728 | 2.9% |
5 | 613 | 2.5% |
8 | 602 | 2.4% |
7 | 590 | 2.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 19990 | 80.2% |
Dash Punctuation | 4920 | 19.8% |
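The two categories above correspond to the Unicode general categories `Nd` (Decimal Number) and `Pd` (Dash Punctuation). The sketch below shows how such counts can be obtained with Python's standard `unicodedata` module. It illustrates the idea on two sample values and is not the report's own implementation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values from the 제출일자 sample: 14 digits and 2 hyphens in total.
counts = category_counts(["1990-12-15", "199012"])
print(dict(counts))
```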
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 6178 | 30.9% |
2 | 4493 | 22.5% |
1 | 3745 | 18.7% |
9 | 1695 | 8.5% |
6 | 773 | 3.9% |
3 | 728 | 3.6% |
5 | 613 | 3.1% |
8 | 602 | 3.0% |
7 | 590 | 3.0% |
4 | 573 | 2.9% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 4920 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 24910 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 6178 | 24.8% |
- | 4920 | 19.8% |
2 | 4493 | 18.0% |
1 | 3745 | 15.0% |
9 | 1695 | 6.8% |
6 | 773 | 3.1% |
3 | 728 | 2.9% |
5 | 613 | 2.5% |
8 | 602 | 2.4% |
7 | 590 | 2.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 24910 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 6178 | 24.8% |
- | 4920 | 19.8% |
2 | 4493 | 18.0% |
1 | 3745 | 15.0% |
9 | 1695 | 6.8% |
6 | 773 | 3.1% |
3 | 728 | 2.9% |
5 | 613 | 2.5% |
8 | 602 | 2.4% |
7 | 590 | 2.4% |
정산금액
Real number (ℝ)
Distinct | 1889 |
---|---|
Distinct (%) | 75.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2.7389131 × 10^8 |
Minimum | -9.68 × 10^8 |
---|---|
Maximum | 8.568216 × 10^9 |
Zeros | 17 |
Zeros (%) | 0.7% |
Negative | 1 |
Negative (%) | < 0.1% |
Memory size | 22.2 KiB |
Quantile statistics
Minimum | -9.68 × 10^8 |
---|---|
5-th percentile | 20000000 |
Q1 | 61000000 |
median | 1.3081599 × 10^8 |
Q3 | 2.7979 × 10^8 |
95-th percentile | 8.424625 × 10^8 |
Maximum | 8.568216 × 10^9 |
Range | 9.536216 × 10^9 |
Interquartile range (IQR) | 2.1879 × 10^8 |
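The quartiles above make it easy to see how far the extremes of 정산금액 sit from the bulk of the data. The sketch below applies Tukey's 1.5 × IQR fences, a common rule of thumb that the report itself does not use, to the reported quartiles (Q1 = 61,000,000 and Q3 = 279,790,000).

```python
# Quartiles, extremes, and 95th percentile taken from the report.
q1 = 61_000_000
q3 = 279_790_000
minimum = -968_000_000
maximum = 8_568_216_000
p95 = 842_462_500

iqr = q3 - q1  # interquartile range, 218,790,000

# Tukey fences (an assumed rule, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR: {iqr:,}")
print(f"Fences: [{lower_fence:,.0f}, {upper_fence:,.0f}]")
print("Minimum below lower fence:", minimum < lower_fence)
print("Maximum above upper fence:", maximum > upper_fence)
print("95th percentile above upper fence:", p95 > upper_fence)
```

Under this rule the single negative value and the largest settlements all fall outside the fences, which is consistent with the high skewness and kurtosis reported for this variable.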
Descriptive statistics
Standard deviation | 5.7660852 × 10^8 |
---|---|
Coefficient of variation (CV) | 2.1052458 |
Kurtosis | 81.941892 |
Mean | 2.7389131 × 10^8 |
Median Absolute Deviation (MAD) | 86815988 |
Skewness | 7.8059608 |
Sum | 6.8801496 × 10^11 |
Variance | 3.3247739 × 10^17 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
300000000 | 34 | 1.4% |
200000000 | 21 | 0.8% |
50000000 | 18 | 0.7% |
0 | 17 | 0.7% |
100000000 | 15 | 0.6% |
40000000 | 13 | 0.5% |
80000000 | 13 | 0.5% |
150000000 | 12 | 0.5% |
250000000 | 12 | 0.5% |
120000000 | 10 | 0.4% |
Other values (1879) | 2347 | 93.4% |
Value | Count | Frequency (%) |
-968000000 | 1 | < 0.1% |
0 | 17 | 0.7% |
280000 | 1 | < 0.1% |
300000 | 1 | < 0.1% |
840000 | 1 | < 0.1% |
1167910 | 1 | < 0.1% |
1760000 | 1 | < 0.1% |
2670000 | 1 | < 0.1% |
3500000 | 1 | < 0.1% |
4074750 | 1 | < 0.1% |
Value | Count | Frequency (%) |
8568216000 | 1 | < 0.1% |
8448000000 | 1 | < 0.1% |
7654240000 | 1 | < 0.1% |
7345700000 | 1 | < 0.1% |
7273200000 | 1 | < 0.1% |
5976520000 | 1 | < 0.1% |
5941350000 | 1 | < 0.1% |
5896000000 | 1 | < 0.1% |
5451160000 | 1 | < 0.1% |
4531950000 | 1 | < 0.1% |
Unnamed: 6
Unsupported
MISSING, REJECTED, UNSUPPORTED
Missing | 2512 |
---|---|
Missing (%) | 100.0% |
Memory size | 22.2 KiB |
Unnamed: 7
Unsupported
MISSING, REJECTED, UNSUPPORTED
Missing | 2512 |
---|---|
Missing (%) | 100.0% |
Memory size | 22.2 KiB |
Unnamed: 8
Unsupported
MISSING, REJECTED, UNSUPPORTED
Missing | 2512 |
---|---|
Missing (%) | 100.0% |
Memory size | 22.2 KiB |
Variable | 공사년도 | 공사구분 | 공사번호 | 정산금액 |
---|---|---|---|---|
공사년도 | 1.000 | 0.318 | 0.548 | 0.158 |
공사구분 | 0.318 | 1.000 | 0.564 | 0.024 |
공사번호 | 0.548 | 0.564 | 1.000 | 0.020 |
정산금액 | 0.158 | 0.024 | 0.020 | 1.000 |
Variable | 공사년도 | 공사번호 | 정산금액 | 공사구분 |
---|---|---|---|---|
공사년도 | 1.000 | 0.280 | 0.062 | 0.205 |
공사번호 | 0.280 | 1.000 | 0.011 | 0.407 |
정산금액 | 0.062 | 0.011 | 1.000 | 0.000 |
공사구분 | 0.205 | 0.407 | 0.000 | 1.000 |
Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 제출일자 | 정산금액 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
0 | 1990 | 공사 | 1 | 1 | 1990-12-15 | 53400000 | <NA> | <NA> | <NA> |
1 | 1990 | 공사 | 4 | 1 | 1990-12-29 | 40850000 | <NA> | <NA> | <NA> |
2 | 1990 | 공사 | 2 | 1 | 1990-12-07 | 98000000 | <NA> | <NA> | <NA> |
3 | 1990 | 공사 | 2 | 1 | 199012 | 61900000 | <NA> | <NA> | <NA> |
4 | 1990 | 공사 | 1 | 1 | 1990-04-06 | 35600000 | <NA> | <NA> | <NA> |
5 | 1990 | 공사 | 4 | 1 | 1990-09-26 | 48000000 | <NA> | <NA> | <NA> |
6 | 1990 | 공사 | 3 | 1 | 199012 | 140090000 | <NA> | <NA> | <NA> |
7 | 1990 | 공사 | 3 | 1 | 1990-04-11 | 101600000 | <NA> | <NA> | <NA> |
8 | 1990 | 공사 | 7 | 1 | 1990-12-31 | 59000000 | <NA> | <NA> | <NA> |
9 | 1990 | 공사 | 7 | 1 | 1990-05-03 | 82000000 | <NA> | <NA> | <NA> |
Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 제출일자 | 정산금액 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
2502 | 2018 | 공사 | 133 | 1 | 2019-04-01 | 150754000 | <NA> | <NA> | <NA> |
2503 | 2018 | 공사 | 134 | 1 | 2019-03-26 | 38500000 | <NA> | <NA> | <NA> |
2504 | 2018 | 공사 | 133 | 1 | 2018-12-19 | 100000000 | <NA> | <NA> | <NA> |
2505 | 2018 | 공사 | 133 | 1 | 2019-01-31 | 109246000 | <NA> | <NA> | <NA> |
2506 | 2018 | 공사 | 134 | 1 | 2018-12-21 | 77000000 | <NA> | <NA> | <NA> |
2507 | 2018 | 공사 | 140 | 1 | 2018-12-24 | 80000000 | <NA> | <NA> | <NA> |
2508 | 2018 | 공사 | 149 | 1 | 2019-05-13 | 110000000 | <NA> | <NA> | <NA> |
2509 | 2018 | 공사 | 150 | 1 | 2019-03-11 | 12882000 | <NA> | <NA> | <NA> |
2510 | 2019 | 공사 | 1 | 1 | 2019-05-29 | 47000000 | <NA> | <NA> | <NA> |
2511 | 2019 | 공사 | 60 | 1 | 2019-07-31 | 80000000 | <NA> | <NA> | <NA> |
Most frequently occurring
Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 제출일자 | 정산금액 | # duplicates |
---|---|---|---|---|---|---|---|
0 | 2005 | 용역 | 77 | 1 | 2006-08-31 | 54000000 | 2 |
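The duplicate listed above is a row whose values appear twice across all compared columns. A minimal sketch of how such rows can be found with a counter over row tuples follows. The helper name `duplicate_rows` and the three-row sample are illustrative; the sample repeats the duplicated row from the table and adds one unrelated row from the last-rows sample.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    (2005, "용역", 77, "1", "2006-08-31", 54000000),
    (2005, "용역", 77, "1", "2006-08-31", 54000000),
    (2019, "공사", 60, "1", "2019-07-31", 80000000),
]
dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row, "occurs", n, "times")
```

Whether such a row is a true data-entry duplicate or two legitimate settlements with identical values cannot be decided from the table alone.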