Overview

Dataset statistics

Number of variables: 9
Number of observations: 31
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 KiB
Average record size in memory: 82.3 B

Variable types

Categorical: 1
Numeric: 5
Boolean: 3

Dataset

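Description: Data on forest resource service progress-rate reports (산림자원용역진척율보고) from forest project service management (산림사업용역관리). It provides the progress report round, service project number, planned process area, actual process area, and related fields.
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15124729/fileData.do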

Alerts

공정계획면적 (planned process area) is highly overall correlated with 공정실적면적 (actual process area) [High correlation]
공정실적면적 is highly overall correlated with 공정계획면적 [High correlation]
공정진척율 (process progress rate) is highly overall correlated with 공정적합여부 (process conformity) [High correlation]
공정적합여부 is highly overall correlated with 공정진척율 and 2 other fields [High correlation]
규격적합여부 (specification conformity) is highly overall correlated with 공정적합여부 and 1 other field [High correlation]
품질적합여부 (quality conformity) is highly overall correlated with 공정적합여부 and 1 other field [High correlation]
공정적합여부 is highly imbalanced (65.5%) [Imbalance]
규격적합여부 is highly imbalanced (54.1%) [Imbalance]
품질적합여부 is highly imbalanced (54.1%) [Imbalance]
알림아이디 (alert ID) has unique values [Unique]
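
The imbalance percentages in these alerts are consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how the figures can be reproduced, not a statement of the tool's implementation. The function name `imbalance_score` is hypothetical; the class counts (29 True / 2 False, and 28 True / 3 False, out of 31 rows) come from the Boolean variable summaries in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k), with H the base-2 Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts taken from the report: 공정적합여부 (29 True, 2 False)
process_fit = imbalance_score([29, 2])
# Counts taken from the report: 규격적합여부 and 품질적합여부 (28 True, 3 False)
spec_fit = imbalance_score([28, 3])

print(f"{process_fit:.1%}")  # 65.5%
print(f"{spec_fit:.1%}")     # 54.1%
```

Under this assumption, both values match the percentages shown in the Imbalance alerts.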

Reproduction

Analysis started: 2023-12-12 17:57:21.949248
Analysis finished: 2023-12-12 17:57:27.287757
Duration: 5.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

진척보고회차 (progress report round)
Categorical

Distinct: 5
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      14     45.2%
2      6      19.4%
3      5      16.1%
4      5      16.1%
5      1      3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of value counts; the data repeats the Common Values table]

용역사업번호 (service project number)
Real number (ℝ)

Distinct: 14
Distinct (%): 45.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6212654 × 10^8
Minimum: 1.2018003 × 10^8
Maximum: 2.2019024 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 1.2018003 × 10^8
5-th percentile: 1.2018004 × 10^8
Q1: 1.2020004 × 10^8
Median: 1.2020006 × 10^8
Q3: 2.2018002 × 10^8
95-th percentile: 2.2019024 × 10^8
Maximum: 2.2019024 × 10^8
Range: 1.0001021 × 10^8
Interquartile range (IQR): 99979990

Descriptive statistics

Standard deviation: 50156252
Coefficient of variation (CV): 0.30936485
Kurtosis: -2.0165466
Mean: 1.6212654 × 10^8
Median Absolute Deviation (MAD): 20027
Skewness: 0.34372056
Sum: 5.0259227 × 10^9
Variance: 2.5156496 × 10^15
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=14)]
Common values

Value             Count  Frequency (%)
220180021         5      16.1%
120200035         4      12.9%
120200056         4      12.9%
120200063         4      12.9%
220190245         4      12.9%
120180034         2      6.5%
120180036         1      3.2%
120190057         1      3.2%
120190060         1      3.2%
120190070         1      3.2%
Other values (4)  4      12.9%

Minimum 10 values

Value      Count  Frequency (%)
120180034  2      6.5%
120180036  1      3.2%
120190057  1      3.2%
120190060  1      3.2%
120190070  1      3.2%
120200035  4      12.9%
120200056  4      12.9%
120200063  4      12.9%
220180021  5      16.1%
220180028  1      3.2%

Maximum 10 values

Value      Count  Frequency (%)
220190245  4      12.9%
220190238  1      3.2%
220190237  1      3.2%
220190205  1      3.2%
220180028  1      3.2%
220180021  5      16.1%
120200063  4      12.9%
120200056  4      12.9%
120200035  4      12.9%
120190070  1      3.2%

공정계획면적 (planned process area)
Real number (ℝ)

HIGH CORRELATION

Distinct: 18
Distinct (%): 58.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3166.8065
Minimum: 1
Maximum: 20000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 52
Median: 560
Q3: 4500
95-th percentile: 16492.5
Maximum: 20000
Range: 19999
Interquartile range (IQR): 4448

Descriptive statistics

Standard deviation: 5482.4733
Coefficient of variation (CV): 1.7312309
Kurtosis: 4.5045379
Mean: 3166.8065
Median Absolute Deviation (MAD): 558
Skewness: 2.2340166
Sum: 98171
Variance: 30057514
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=18)]
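
Several of the figures reported for 공정계획면적 are simple functions of one another (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The sketch below re-derives them from the reported values as a consistency check. It uses only numbers printed in this report, with n = 31 taken from the dataset statistics; the variable names are illustrative.

```python
import math

# Values copied from the 공정계획면적 summary in this report.
n = 31
total = 98171
mean = 3166.8065
std = 5482.4733
variance = 30057514
cv = 1.7312309
q1, q3 = 52, 4500
minimum, maximum = 1, 20000

# Re-derive each statistic from the others.
derived = {
    "mean": total / n,
    "cv": std / mean,
    "variance": std ** 2,
    "iqr": q3 - q1,
    "range": maximum - minimum,
}

# Compare against the reported values, allowing for display rounding.
checks = {
    "mean": math.isclose(derived["mean"], mean, rel_tol=1e-6),
    "cv": math.isclose(derived["cv"], cv, rel_tol=1e-6),
    "variance": math.isclose(derived["variance"], variance, rel_tol=1e-6),
    "iqr": derived["iqr"] == 4448,
    "range": derived["range"] == 19999,
}
print(checks)
```

The same relationships can be checked for the other numeric variables by substituting their reported values.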
Common values

Value             Count  Frequency (%)
100               5      16.1%
1                 3      9.7%
1000              3      9.7%
2                 3      9.7%
5000              3      9.7%
20000             2      6.5%
4                 1      3.2%
2000              1      3.2%
3000              1      3.2%
4000              1      3.2%
Other values (8)  8      25.8%

Minimum 10 values

Value  Count  Frequency (%)
1      3      9.7%
2      3      9.7%
3      1      3.2%
4      1      3.2%
100    5      16.1%
150    1      3.2%
200    1      3.2%
560    1      3.2%
760    1      3.2%
1000   3      9.7%

Maximum 10 values

Value  Count  Frequency (%)
20000  2      6.5%
12985  1      3.2%
10000  1      3.2%
6000   1      3.2%
5000   3      9.7%
4000   1      3.2%
3000   1      3.2%
2000   1      3.2%
1000   3      9.7%
760    1      3.2%

공정실적면적 (actual process area)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 61.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2721.129
Minimum: 1
Maximum: 20000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4.5
Median: 200
Q3: 3500
95-th percentile: 13992.5
Maximum: 20000
Range: 19999
Interquartile range (IQR): 3495.5

Descriptive statistics

Standard deviation: 5029.972
Coefficient of variation (CV): 1.8484871
Kurtosis: 4.7381543
Mean: 2721.129
Median Absolute Deviation (MAD): 199
Skewness: 2.2825438
Sum: 84355
Variance: 25300619
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=19)]
Common values

Value             Count  Frequency (%)
100               4      12.9%
1                 3      9.7%
900               2      6.5%
5000              2      6.5%
4000              2      6.5%
1000              2      6.5%
3                 2      6.5%
2                 2      6.5%
200               2      6.5%
360               1      3.2%
Other values (9)  9      29.0%

Minimum 10 values

Value  Count  Frequency (%)
1      3      9.7%
2      2      6.5%
3      2      6.5%
4      1      3.2%
5      1      3.2%
100    4      12.9%
148    1      3.2%
200    2      6.5%
240    1      3.2%
360    1      3.2%

Maximum 10 values

Value  Count  Frequency (%)
20000  1      3.2%
15000  1      3.2%
12985  1      3.2%
10000  1      3.2%
5000   2      6.5%
4000   2      6.5%
3000   1      3.2%
1000   2      6.5%
900    2      6.5%
360    1      3.2%

공정진척율 (process progress rate)
Real number (ℝ)

HIGH CORRELATION

Distinct: 12
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.084194
Minimum: 5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 5
5-th percentile: 22
Q1: 81.665
Median: 100
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 95
Interquartile range (IQR): 18.335

Descriptive statistics

Standard deviation: 28.317783
Coefficient of variation (CV): 0.33677891
Kurtosis: 1.9264126
Mean: 84.084194
Median Absolute Deviation (MAD): 0
Skewness: -1.7817722
Sum: 2606.61
Variance: 801.89682
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=12)]
Common values

Value             Count  Frequency (%)
100.0             19     61.3%
90.0              2      6.5%
98.67             1      3.2%
5.0               1      3.2%
75.0              1      3.2%
24.0              1      3.2%
26.32             1      3.2%
64.29             1      3.2%
20.0              1      3.2%
80.0              1      3.2%
Other values (2)  2      6.5%

Minimum 10 values

Value  Count  Frequency (%)
5.0    1      3.2%
20.0   1      3.2%
24.0   1      3.2%
26.32  1      3.2%
50.0   1      3.2%
64.29  1      3.2%
75.0   1      3.2%
80.0   1      3.2%
83.33  1      3.2%
90.0   2      6.5%

Maximum 10 values

Value  Count  Frequency (%)
100.0  19     61.3%
98.67  1      3.2%
90.0   2      6.5%
83.33  1      3.2%
80.0   1      3.2%
75.0   1      3.2%
64.29  1      3.2%
50.0   1      3.2%
26.32  1      3.2%
24.0   1      3.2%

공정적합여부 (process conformity)
Boolean

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 163.0 B

Value  Count  Frequency (%)
True   29     93.5%
False  2      6.5%

규격적합여부 (specification conformity)
Boolean

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 163.0 B

Value  Count  Frequency (%)
True   28     90.3%
False  3      9.7%

품질적합여부 (quality conformity)
Boolean

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 163.0 B

Value  Count  Frequency (%)
True   28     90.3%
False  3      9.7%

알림아이디 (alert ID)
Real number (ℝ)

UNIQUE

Distinct: 31
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0191411 × 10^9
Minimum: 2.0180201 × 10^9
Maximum: 2.0200302 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 2.0180201 × 10^9
5-th percentile: 2.0180201 × 10^9
Q1: 2.01803 × 10^9
Median: 2.0190802 × 10^9
Q3: 2.0200301 × 10^9
95-th percentile: 2.0200302 × 10^9
Maximum: 2.0200302 × 10^9
Range: 2010076
Interquartile range (IQR): 2000061.5

Descriptive statistics

Standard deviation: 831537.58
Coefficient of variation (CV): 0.00041182738
Kurtosis: -1.4949833
Mean: 2.0191411 × 10^9
Median Absolute Deviation (MAD): 949926
Skewness: -0.25408464
Sum: 6.2593374 × 10^10
Variance: 6.9145474 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]
Common values

Value              Count  Frequency (%)
2018030037         1      3.2%
2018030038         1      3.2%
2019080189         1      3.2%
2019080188         1      3.2%
2019080187         1      3.2%
2019080186         1      3.2%
2019080122         1      3.2%
2019080126         1      3.2%
2019070125         1      3.2%
2018030008         1      3.2%
Other values (21)  21     67.7%

Minimum 10 values

Value       Count  Frequency (%)
2018020081  1      3.2%
2018020082  1      3.2%
2018020083  1      3.2%
2018020084  1      3.2%
2018020096  1      3.2%
2018030008  1      3.2%
2018030037  1      3.2%
2018030038  1      3.2%
2018030062  1      3.2%
2019070125  1      3.2%

Maximum 10 values

Value       Count  Frequency (%)
2020030157  1      3.2%
2020030156  1      3.2%
2020030155  1      3.2%
2020030154  1      3.2%
2020030152  1      3.2%
2020030114  1      3.2%
2020030113  1      3.2%
2020030112  1      3.2%
2020030111  1      3.2%
2020030017  1      3.2%

Interactions

[Figures: 25 pairwise interaction plots; images not preserved]

Correlations

[Figure: heatmap of the matrix below]

Columns (in order): 진척보고회차, 용역사업번호, 공정계획면적, 공정실적면적, 공정진척율, 공정적합여부, 규격적합여부, 품질적합여부, 알림아이디

진척보고회차  1.000  0.000  0.000  0.000  0.000  0.422  0.256  0.256  0.000
용역사업번호  0.000  1.000  0.688  0.369  0.000  0.000  0.293  0.293  0.479
공정계획면적  0.000  0.688  1.000  0.948  0.000  0.000  0.000  0.000  0.777
공정실적면적  0.000  0.369  0.948  1.000  0.000  0.000  0.000  0.000  0.453
공정진척율    0.000  0.000  0.000  0.000  1.000  0.762  0.909  0.909  0.473
공정적합여부  0.422  0.000  0.000  0.000  0.762  1.000  0.771  0.771  0.000
규격적합여부  0.256  0.293  0.000  0.000  0.909  0.771  1.000  0.955  0.081
품질적합여부  0.256  0.293  0.000  0.000  0.909  0.771  0.955  1.000  0.081
알림아이디    0.000  0.479  0.777  0.453  0.473  0.000  0.081  0.081  1.000

[Figure: heatmap of the matrix below]

Columns (in order): 공정적합여부, 규격적합여부, 진척보고회차, 품질적합여부

공정적합여부  1.000  0.560  0.483  0.560
규격적합여부  0.560  1.000  0.290  0.808
진척보고회차  0.483  0.290  1.000  0.290
품질적합여부  0.560  0.808  0.290  1.000

[Figure: heatmap of the matrix below]

Columns (in order): 용역사업번호, 공정계획면적, 공정실적면적, 공정진척율, 알림아이디, 진척보고회차, 공정적합여부, 규격적합여부, 품질적합여부

용역사업번호   1.000   0.359   0.373  -0.084  -0.182   0.000   0.000   0.209   0.209
공정계획면적   0.359   1.000   0.992  -0.355   0.470   0.000   0.000   0.000   0.000
공정실적면적   0.373   0.992   1.000  -0.288   0.458   0.000   0.000   0.000   0.000
공정진척율    -0.084  -0.355  -0.288   1.000  -0.404   0.000   0.518   0.489   0.489
알림아이디    -0.182   0.470   0.458  -0.404   1.000   0.000   0.000   0.124   0.124
진척보고회차   0.000   0.000   0.000   0.000   0.000   1.000   0.483   0.290   0.290
공정적합여부   0.000   0.000   0.000   0.518   0.000   0.483   1.000   0.560   0.560
규격적합여부   0.209   0.000   0.000   0.489   0.124   0.290   0.560   1.000   0.808
품질적합여부   0.209   0.000   0.000   0.489   0.124   0.290   0.560   0.808   1.000
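
The high-correlation alerts near the top of this report can be cross-checked against the last (9 × 9, signed) matrix above. The sketch below flags every off-diagonal pair whose absolute value is at least 0.5. That threshold is an assumption: the report does not state it, but it reproduces exactly the six alerts listed. The helper name `high_correlations` is hypothetical.

```python
# Variable order and values copied from the last correlation matrix in this report.
names = [
    "용역사업번호", "공정계획면적", "공정실적면적", "공정진척율", "알림아이디",
    "진척보고회차", "공정적합여부", "규격적합여부", "품질적합여부",
]
matrix = [
    [1.000, 0.359, 0.373, -0.084, -0.182, 0.000, 0.000, 0.209, 0.209],
    [0.359, 1.000, 0.992, -0.355, 0.470, 0.000, 0.000, 0.000, 0.000],
    [0.373, 0.992, 1.000, -0.288, 0.458, 0.000, 0.000, 0.000, 0.000],
    [-0.084, -0.355, -0.288, 1.000, -0.404, 0.000, 0.518, 0.489, 0.489],
    [-0.182, 0.470, 0.458, -0.404, 1.000, 0.000, 0.000, 0.124, 0.124],
    [0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 0.483, 0.290, 0.290],
    [0.000, 0.000, 0.000, 0.518, 0.000, 0.483, 1.000, 0.560, 0.560],
    [0.209, 0.000, 0.000, 0.489, 0.124, 0.290, 0.560, 1.000, 0.808],
    [0.209, 0.000, 0.000, 0.489, 0.124, 0.290, 0.560, 0.808, 1.000],
]

THRESHOLD = 0.5  # assumption: not stated in the report, chosen to match the alerts


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |correlation| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = high_correlations(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

With this threshold, 알림아이디 (maximum absolute value 0.470) and 진척보고회차 (0.483) fall just short and raise no alert, which agrees with the Alerts section.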

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Columns (in order): row index, 진척보고회차, 용역사업번호, 공정계획면적, 공정실적면적, 공정진척율, 공정적합여부, 규격적합여부, 품질적합여부, 알림아이디

0   1  120180034  1     1     100.0  Y  Y  Y  2018030037
1   2  120180034  2     2     100.0  Y  Y  Y  2018030038
2   1  120180036  100   100   100.0  Y  Y  Y  2018030062
3   1  120190057  1000  900   90.0   Y  Y  Y  2019080148
4   1  120190060  1     1     100.0  Y  Y  Y  2019080161
5   1  120190070  1000  900   90.0   Y  Y  Y  2019080268
6   1  120200035  100   100   100.0  Y  Y  Y  2020030014
7   2  120200035  150   148   98.67  Y  Y  Y  2020030015
8   3  120200035  2     2     100.0  Y  Y  Y  2020030017
9   4  120200035  100   5     5.0    Y  Y  Y  2020030152

Last rows

Columns (in order): row index, 진척보고회차, 용역사업번호, 공정계획면적, 공정실적면적, 공정진척율, 공정적합여부, 규격적합여부, 품질적합여부, 알림아이디

21  4  220180021  4     4     100.0  N  N  N  2018020084
22  5  220180021  100   100   100.0  Y  Y  Y  2018020096
23  1  220180028  100   100   100.0  Y  Y  Y  2018030008
24  1  220190205  5000  1000  20.0   Y  N  N  2019070125
25  1  220190237  5000  4000  80.0   Y  Y  Y  2019080126
26  1  220190238  6000  5000  83.33  Y  Y  Y  2019080122
27  1  220190245  5000  5000  100.0  Y  Y  Y  2019080186
28  2  220190245  4000  4000  100.0  Y  Y  Y  2019080187
29  3  220190245  3000  3000  100.0  Y  Y  Y  2019080188
30  4  220190245  2000  1000  50.0   N  N  N  2019080189
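
In the sample rows, 공정진척율 appears to equal 공정실적면적 / 공정계획면적 × 100, rounded to two decimals. This formula is inferred from the data and is not documented by the source. The sketch below checks it on a handful of rows copied from the sample; the function name `progress_rate` is hypothetical.

```python
def progress_rate(planned_area, actual_area):
    """Progress rate in percent, rounded to two decimals (inferred formula)."""
    return round(actual_area * 100 / planned_area, 2)


# (공정계획면적, 공정실적면적, reported 공정진척율) copied from the sample rows.
rows = [
    (1000, 900, 90.0),
    (150, 148, 98.67),
    (100, 5, 5.0),
    (5000, 1000, 20.0),
    (6000, 5000, 83.33),
    (2000, 1000, 50.0),
]

# Collect any rows where the inferred formula disagrees with the reported rate.
mismatches = [
    (planned, actual, reported)
    for planned, actual, reported in rows
    if progress_rate(planned, actual) != reported
]
print(mismatches)  # []
```

The conformity flags do not follow from the rate alone: row 21 reports a rate of 100.0 with all three flags set to N, so the flags presumably encode a separate inspection result.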