Overview

Dataset statistics

Number of variables: 10
Number of observations: 1257
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 106.9 KiB
Average record size in memory: 87.1 B
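
The missing-cell percentage and the average record size can be recomputed from the other entries. The sketch below assumes that the report counts cells as variables × observations and that 1 KiB is 1024 bytes; both are assumptions about the tool, and the variable names are illustrative only.

```python
# Values copied from the dataset statistics in this report.
n_variables = 10
n_observations = 1257
missing_cells = 3
total_size_kib = 106.9

# Share of missing cells among all cells (assumed: cells = variables x observations).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average bytes per record (assumed: 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share is well below 0.1% and the record size rounds to 87.1 B, which agrees with the reported figures.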

Variable types

Text: 1
Categorical: 6
Numeric: 3

Dataset

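Description: Haul road unit prices and unit-price calculation details, produced with the Integrated Forest Resources Management System (산림자원통합관리시스템), an information system for forest resources that covers afforestation, forest tending, timber production, and related work.
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15093802/fileData.do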

Alerts

기계단가 is highly overall correlated with 잡품비율 and 1 other field (High correlation)
시간당소료계수 is highly overall correlated with 잡품비율 and 1 other field (High correlation)
주연료량 is highly overall correlated with 굴삭기구분명 and 3 other fields (High correlation)
운반로작업로구분명 is highly overall correlated with 기안순서번호 (High correlation)
기안순서번호 is highly overall correlated with 운반로작업로구분명 (High correlation)
굴삭기구분명 is highly overall correlated with 주연료량 and 2 other fields (High correlation)
바켓용량 is highly overall correlated with 주연료량 and 2 other fields (High correlation)
잡품비율 is highly overall correlated with 기계단가 and 5 other fields (High correlation)
건설기계운전자수 is highly overall correlated with 기계단가 and 3 other fields (High correlation)
운반로작업로구분명 is highly imbalanced (75.9%) (Imbalance)
기안순서번호 is highly imbalanced (60.8%) (Imbalance)
굴삭기구분명 is highly imbalanced (89.2%) (Imbalance)
바켓용량 is highly imbalanced (90.2%) (Imbalance)
잡품비율 is highly imbalanced (90.8%) (Imbalance)
건설기계운전자수 is highly imbalanced (97.0%) (Imbalance)
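
The imbalance percentages for the two-class variables are consistent with one minus the normalized Shannon entropy of the class shares. That the tool uses exactly this formula is an assumption; the sketch below only shows that the formula reproduces the reported figures from the class counts given in this report. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the Common Values tables in this report.
print(round(100 * imbalance_score([1207, 50]), 1))  # 운반로작업로구분명
print(round(100 * imbalance_score([1160, 97]), 1))  # 기안순서번호
print(round(100 * imbalance_score([1239, 18]), 1))  # 굴삭기구분명
```

A score near 100% means almost all rows fall into a single class; a score of 0% means the classes are equally frequent.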

Reproduction

Analysis started: 2023-12-12 22:15:56.871467
Analysis finished: 2023-12-12 22:15:58.706272
Duration: 1.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업번호 (project number)
Text

Distinct: 1241
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 12570
Distinct characters: 54
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
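
As a small illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The five input strings are the sample values shown in this report; the choice of Python and of this module is an assumption for illustration, not necessarily how the report was produced.

```python
import unicodedata
from collections import Counter

# Sample values copied from the report's sample rows for this variable.
samples = ["구미2019E009", "정읍2019F024", "정선2019F031", "춘천2019F044", "단양2019F020"]

# Count general categories over every character in the samples:
# "Lo" = Other Letter, "Nd" = Decimal Number, "Lu" = Uppercase Letter.
category_counts = Counter(unicodedata.category(ch) for s in samples for ch in s)
print(category_counts)
```

Each identifier contributes two Other Letter characters (Hangul syllables), seven decimal digits, and one uppercase letter, which matches the 20% / 70% / 10% split in the category table. The standard library does not expose the script or block properties directly, so those tables would need another source.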

Unique

Unique: 1225
Unique (%): 97.5%
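
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A minimal sketch with hypothetical values (not taken from the dataset):

```python
from collections import Counter

# Hypothetical values for illustration only.
values = ["a", "b", "b", "c", "c", "d"]

counts = Counter(values)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

Applied to this variable, 1241 distinct values with 1225 unique ones leaves 16 repeated values covering the remaining 32 of 1257 rows, which is consistent with each repeated identifier appearing twice.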

Sample

1st row: 구미2019E009
2nd row: 정읍2019F024
3rd row: 정선2019F031
4th row: 춘천2019F044
5th row: 단양2019F020
Value | Count | Frequency (%)
충주2015e037 | 2 | 0.2%
단양2016e021 | 2 | 0.2%
강릉2018e029 | 2 | 0.2%
충주2018e006 | 2 | 0.2%
영덕2014e092 | 2 | 0.2%
강릉2021e063 | 2 | 0.2%
홍천2019e016 | 2 | 0.2%
단양2022e003 | 2 | 0.2%
구미2014e060 | 2 | 0.2%
영주2017e022 | 2 | 0.2%
Other values (1231) | 1237 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 2939 | 23.4%
2 | 2393 | 19.0%
F | 1160 | 9.2%
1 | 1113 | 8.9%
3 | 538 | 4.3%
4 | 428 | 3.4%
5 | 336 | 2.7%
6 | 290 | 2.3%
9 | 285 | 2.3%
(character not preserved) | 257 | 2.0%
Other values (44) | 2831 | 22.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8799 | 70.0%
Other Letter | 2514 | 20.0%
Uppercase Letter | 1257 | 10.0%

Most frequent character per category

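Other Letter
Value | Count | Frequency (%)
(character not preserved) | 257 | 10.2%
(character not preserved) | 204 | 8.1%
(character not preserved) | 180 | 7.2%
(character not preserved) | 124 | 4.9%
(character not preserved) | 115 | 4.6%
(character not preserved) | 94 | 3.7%
(character not preserved) | 94 | 3.7%
(character not preserved) | 77 | 3.1%
(character not preserved) | 67 | 2.7%
(character not preserved) | 67 | 2.7%
Other values (32) | 1235 | 49.1%

Decimal Number
Value | Count | Frequency (%)
0 | 2939 | 33.4%
2 | 2393 | 27.2%
1 | 1113 | 12.6%
3 | 538 | 6.1%
4 | 428 | 4.9%
5 | 336 | 3.8%
6 | 290 | 3.3%
9 | 285 | 3.2%
8 | 240 | 2.7%
7 | 237 | 2.7%

Uppercase Letter
Value | Count | Frequency (%)
F | 1160 | 92.3%
E | 97 | 7.7%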

Most occurring scripts

Value | Count | Frequency (%)
Common | 8799 | 70.0%
Hangul | 2514 | 20.0%
Latin | 1257 | 10.0%

Most frequent character per script

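Hangul
Value | Count | Frequency (%)
(character not preserved) | 257 | 10.2%
(character not preserved) | 204 | 8.1%
(character not preserved) | 180 | 7.2%
(character not preserved) | 124 | 4.9%
(character not preserved) | 115 | 4.6%
(character not preserved) | 94 | 3.7%
(character not preserved) | 94 | 3.7%
(character not preserved) | 77 | 3.1%
(character not preserved) | 67 | 2.7%
(character not preserved) | 67 | 2.7%
Other values (32) | 1235 | 49.1%

Common
Value | Count | Frequency (%)
0 | 2939 | 33.4%
2 | 2393 | 27.2%
1 | 1113 | 12.6%
3 | 538 | 6.1%
4 | 428 | 4.9%
5 | 336 | 3.8%
6 | 290 | 3.3%
9 | 285 | 3.2%
8 | 240 | 2.7%
7 | 237 | 2.7%

Latin
Value | Count | Frequency (%)
F | 1160 | 92.3%
E | 97 | 7.7%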

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10056 | 80.0%
Hangul | 2514 | 20.0%

Most frequent character per block

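ASCII
Value | Count | Frequency (%)
0 | 2939 | 29.2%
2 | 2393 | 23.8%
F | 1160 | 11.5%
1 | 1113 | 11.1%
3 | 538 | 5.4%
4 | 428 | 4.3%
5 | 336 | 3.3%
6 | 290 | 2.9%
9 | 285 | 2.8%
8 | 240 | 2.4%
Other values (2) | 334 | 3.3%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 257 | 10.2%
(character not preserved) | 204 | 8.1%
(character not preserved) | 180 | 7.2%
(character not preserved) | 124 | 4.9%
(character not preserved) | 115 | 4.6%
(character not preserved) | 94 | 3.7%
(character not preserved) | 94 | 3.7%
(character not preserved) | 77 | 3.1%
(character not preserved) | 67 | 2.7%
(character not preserved) | 67 | 2.7%
Other values (32) | 1235 | 49.1%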

운반로작업로구분명 (haul road / work road type name)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기계화작업로
2nd row: 임산물운반로
3rd row: 임산물운반로
4th row: 임산물운반로
5th row: 임산물운반로

Common Values

Value | Count | Frequency (%)
임산물운반로 | 1207 | 96.0%
기계화작업로 | 50 | 4.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

기안순서번호 (draft sequence number)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 1160 | 92.3%
3 | 97 | 7.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

굴삭기구분명 (excavator type name)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 굴삭기(0.2㎥)
2nd row: 굴삭기(0.2㎥)
3rd row: 굴삭기(0.2㎥)
4th row: 굴삭기(0.2㎥)
5th row: 굴삭기(0.2㎥)

Common Values

Value | Count | Frequency (%)
굴삭기(0.2㎥) | 1239 | 98.6%
굴삭기(0.4㎥) | 18 | 1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

바켓용량 (bucket capacity)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.2
2nd row: 0.2
3rd row: 0.2
4th row: 0.2
5th row: 0.2

Common Values

Value | Count | Frequency (%)
0.2 | 1241 | 98.7%
0.4 | 16 | 1.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

기계단가 (machine unit price)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 1.5%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 53558717
Minimum: 0
Maximum: 65000000
Zeros: 2
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 52000000
Q1: 52000000
Median: 52000000
Q3: 52000000
95-th percentile: 60857000
Maximum: 65000000
Range: 65000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4808980.1
Coefficient of variation (CV): 0.089788933
Kurtosis: 52.434401
Mean: 53558717
Median Absolute Deviation (MAD): 0
Skewness: -4.6576289
Sum: 6.7269749 × 10^10
Variance: 2.3126289 × 10^13
Monotonicity: Not monotonic
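
Two of the descriptive statistics for 기계단가 follow directly from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the rounded values printed in this report (so agreement is only up to rounding):

```python
# Values copied from the statistics for 기계단가 in this report.
mean = 53558717
std = 4808980.1

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the squared standard deviation

print(f"CV = {cv:.6f}")
print(f"variance = {variance:.4e}")
```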
[Figure: histogram with fixed size bins (bins=19)]

Common values
Value | Count | Frequency (%)
52000000 | 960 | 76.4%
60000000 | 128 | 10.2%
60857000 | 30 | 2.4%
59000000 | 25 | 2.0%
59400000 | 25 | 2.0%
60905000 | 16 | 1.3%
55890000 | 16 | 1.3%
61106000 | 16 | 1.3%
59780000 | 12 | 1.0%
58627000 | 11 | 0.9%
Other values (9) | 17 | 1.4%

Smallest values
Value | Count | Frequency (%)
0 | 2 | 0.2%
60000 | 1 | 0.1%
6000000 | 2 | 0.2%
20000000 | 1 | 0.1%
30000000 | 1 | 0.1%
52000000 | 960 | 76.4%
55208000 | 2 | 0.2%
55890000 | 16 | 1.3%
57445000 | 2 | 0.2%
58627000 | 11 | 0.9%

Largest values
Value | Count | Frequency (%)
65000000 | 2 | 0.2%
64500000 | 4 | 0.3%
61106000 | 16 | 1.3%
60905000 | 16 | 1.3%
60857000 | 30 | 2.4%
60000000 | 128 | 10.2%
59780000 | 12 | 1.0%
59400000 | 25 | 2.0%
59000000 | 25 | 2.0%
58627000 | 11 | 0.9%

시간당소료계수
Real number (ℝ)

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.6%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2021.9944
Minimum: 0
Maximum: 2085
Zeros: 7
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2038
Q1: 2038
Median: 2038
Q3: 2038
95-th percentile: 2038
Maximum: 2085
Range: 2085
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 182.7712
Coefficient of variation (CV): 0.090391546
Kurtosis: 115.67968
Mean: 2021.9944
Median Absolute Deviation (MAD): 0
Skewness: -10.764649
Sum: 2539625
Variance: 33405.312
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=8)]

Common values
Value | Count | Frequency (%)
2038 | 1204 | 95.8%
2085 | 37 | 2.9%
0 | 7 | 0.6%
1500 | 3 | 0.2%
2084 | 2 | 0.2%
15 | 1 | 0.1%
25 | 1 | 0.1%
20 | 1 | 0.1%
(Missing) | 1 | 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 7 | 0.6%
15 | 1 | 0.1%
20 | 1 | 0.1%
25 | 1 | 0.1%
1500 | 3 | 0.2%
2038 | 1204 | 95.8%
2084 | 2 | 0.2%
2085 | 37 | 2.9%

Largest values
Value | Count | Frequency (%)
2085 | 37 | 2.9%
2084 | 2 | 0.2%
2038 | 1204 | 95.8%
1500 | 3 | 0.2%
25 | 1 | 0.1%
20 | 1 | 0.1%
15 | 1 | 0.1%
0 | 7 | 0.6%
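
Because 시간당소료계수 has only eight distinct values, its sum and mean can be recomputed from the frequency table alone. The sketch below assumes the single missing entry is excluded from both the count and the sum.

```python
# (value, count) pairs copied from the frequency table for 시간당소료계수;
# the single missing entry is left out.
freq = [(2038, 1204), (2085, 37), (0, 7), (1500, 3), (2084, 2), (15, 1), (25, 1), (20, 1)]

n = sum(count for _, count in freq)
total = sum(value * count for value, count in freq)
mean = total / n

print(n, total, round(mean, 4))
```

The result agrees with the reported Sum (2539625) and Mean (2021.9944) over 1256 non-missing observations.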

주연료량 (main fuel quantity)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.5%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.1257962
Minimum: 0
Maximum: 21
Zeros: 5
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 11.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 5
Median: 5
Q3: 5
95-th percentile: 5
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.353209
Coefficient of variation (CV): 0.26399977
Kurtosis: 102.76752
Mean: 5.1257962
Median Absolute Deviation (MAD): 0
Skewness: 9.1237403
Sum: 6438
Variance: 1.8311747
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
5 | 1226 | 97.5%
10 | 16 | 1.3%
21 | 6 | 0.5%
0 | 5 | 0.4%
1 | 2 | 0.2%
20 | 1 | 0.1%
(Missing) | 1 | 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 5 | 0.4%
1 | 2 | 0.2%
5 | 1226 | 97.5%
10 | 16 | 1.3%
20 | 1 | 0.1%
21 | 6 | 0.5%

Largest values
Value | Count | Frequency (%)
21 | 6 | 0.5%
20 | 1 | 0.1%
10 | 16 | 1.3%
5 | 1226 | 97.5%
1 | 2 | 0.2%
0 | 5 | 0.4%

잡품비율 (miscellaneous items ratio)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 6
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 4
Median length: 2
Mean length: 1.9944312
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 21
2nd row: 21
3rd row: 21
4th row: 21
5th row: 21

Common Values

Value | Count | Frequency (%)
21 | 1221 | 97.1%
22 | 19 | 1.5%
30 | 7 | 0.6%
0 | 7 | 0.6%
1 | 2 | 0.2%
<NA> | 1 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

건설기계운전자수 (number of construction machinery operators)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.0023866
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1251 | 99.5%
0 | 5 | 0.4%
<NA> | 1 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]
Variable | 운반로작업로구분명 | 기안순서번호 | 굴삭기구분명 | 바켓용량 | 기계단가 | 시간당소료계수 | 주연료량 | 잡품비율 | 건설기계운전자수
운반로작업로구분명 | 1.000 | 0.888 | 0.000 | 0.000 | 0.264 | 0.162 | 0.377 | 0.198 | 0.126
기안순서번호 | 0.888 | 1.000 | 0.000 | 0.000 | 0.272 | 0.190 | 0.434 | 0.214 | 0.300
굴삭기구분명 | 0.000 | 0.000 | 1.000 | 0.973 | 0.238 | 0.000 | 0.983 | 0.675 | 0.000
바켓용량 | 0.000 | 0.000 | 0.973 | 1.000 | 0.265 | 0.000 | 0.995 | 0.723 | 0.000
기계단가 | 0.264 | 0.272 | 0.238 | 0.265 | 1.000 | 0.764 | 0.623 | 0.713 | 0.792
시간당소료계수 | 0.162 | 0.190 | 0.000 | 0.000 | 0.764 | 1.000 | 0.744 | 0.746 | 0.456
주연료량 | 0.377 | 0.434 | 0.983 | 0.995 | 0.623 | 0.744 | 1.000 | 0.923 | 0.970
잡품비율 | 0.198 | 0.214 | 0.675 | 0.723 | 0.713 | 0.746 | 0.923 | 1.000 | 0.709
건설기계운전자수 | 0.126 | 0.300 | 0.000 | 0.000 | 0.792 | 0.456 | 0.970 | 0.709 | 1.000

[Figure: correlation heatmap]
Variable | 굴삭기구분명 | 잡품비율 | 운반로작업로구분명 | 기안순서번호 | 바켓용량 | 건설기계운전자수
굴삭기구분명 | 1.000 | 0.807 | 0.000 | 0.000 | 0.852 | 0.000
잡품비율 | 0.807 | 1.000 | 0.241 | 0.262 | 0.857 | 0.843
운반로작업로구분명 | 0.000 | 0.241 | 1.000 | 0.696 | 0.000 | 0.080
기안순서번호 | 0.000 | 0.262 | 0.696 | 1.000 | 0.000 | 0.194
바켓용량 | 0.852 | 0.857 | 0.000 | 0.000 | 1.000 | 0.000
건설기계운전자수 | 0.000 | 0.843 | 0.080 | 0.194 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 기계단가 | 시간당소료계수 | 주연료량 | 운반로작업로구분명 | 기안순서번호 | 굴삭기구분명 | 바켓용량 | 잡품비율 | 건설기계운전자수
기계단가 | 1.000 | 0.310 | 0.320 | 0.189 | 0.195 | 0.171 | 0.190 | 0.579 | 0.595
시간당소료계수 | 0.310 | 1.000 | 0.013 | 0.267 | 0.313 | 0.000 | 0.000 | 0.744 | 0.705
주연료량 | 0.320 | 0.013 | 1.000 | 0.252 | 0.292 | 0.881 | 0.936 | 0.937 | 0.843
운반로작업로구분명 | 0.189 | 0.267 | 0.252 | 1.000 | 0.696 | 0.000 | 0.000 | 0.241 | 0.080
기안순서번호 | 0.195 | 0.313 | 0.292 | 0.696 | 1.000 | 0.000 | 0.000 | 0.262 | 0.194
굴삭기구분명 | 0.171 | 0.000 | 0.881 | 0.000 | 0.000 | 1.000 | 0.852 | 0.807 | 0.000
바켓용량 | 0.190 | 0.000 | 0.936 | 0.000 | 0.000 | 0.852 | 1.000 | 0.857 | 0.000
잡품비율 | 0.579 | 0.744 | 0.937 | 0.241 | 0.262 | 0.807 | 0.857 | 1.000 | 0.843
건설기계운전자수 | 0.595 | 0.705 | 0.843 | 0.080 | 0.194 | 0.000 | 0.000 | 0.843 | 1.000
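
The report does not state which association measure each matrix uses. For pairs of categorical variables, one widely used measure is Cramér's V, which is derived from the chi-squared statistic of a contingency table. The sketch below shows the uncorrected form on hypothetical tables; it is an assumption, not a confirmed detail, that any matrix in this section was computed this way, and the function name `cramers_v` is illustrative.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 and n > 0 else 0.0

# Hypothetical 2x2 tables, not taken from the dataset.
print(cramers_v([[10, 0], [0, 10]]))  # perfectly associated categories
print(cramers_v([[5, 5], [5, 5]]))    # independent categories
```

Values range from 0 (no association) to 1 (one variable fully determines the other), the same scale as the matrices above.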

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

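First rows
Row | 사업번호 | 운반로작업로구분명 | 기안순서번호 | 굴삭기구분명 | 바켓용량 | 기계단가 | 시간당소료계수 | 주연료량 | 잡품비율 | 건설기계운전자수
0 | 구미2019E009 | 기계화작업로 | 3 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1 | 정읍2019F024 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
2 | 정선2019F031 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 60000000 | 2085 | 5 | 21 | 1
3 | 춘천2019F044 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 58627000 | 2038 | 5 | 21 | 1
4 | 단양2019F020 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
5 | 함양2019F057 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
6 | 울진2019F037 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
7 | 영덕2019F046 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
8 | 순천2019F043 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
9 | 영암2020F009 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1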
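
Last rows
Row | 사업번호 | 운반로작업로구분명 | 기안순서번호 | 굴삭기구분명 | 바켓용량 | 기계단가 | 시간당소료계수 | 주연료량 | 잡품비율 | 건설기계운전자수
1247 | 충주2023F003 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1248 | 단양2023F002 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1249 | 양양2023F039 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1250 | 보은2023F026 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1251 | 정선2023F034 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1252 | 강릉2023F032 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1253 | 정선2023F038 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1254 | 정선2023F042 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1255 | 정선2023F044 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1
1256 | 순천2023F034 | 임산물운반로 | 2 | 굴삭기(0.2㎥) | 0.2 | 52000000 | 2038 | 5 | 21 | 1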