Overview

Dataset statistics

Number of variables: 9
Number of observations: 2529
Missing cells: 708
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 187.8 KiB
Average record size in memory: 76.1 B
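
The missing-cell percentage follows directly from the counts above. The snippet below is a small arithmetic check rather than part of the profiling tool; the variable names are illustrative, and the per-column missing counts are the two figures reported under Alerts.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 9
n_observations = 2529
missing_cells = 708

# Per-column missing counts as reported in the Alerts section.
missing_by_column = {"공정일수(일)": 245, "작업인원값": 463}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print("per-column counts add up:", sum(missing_by_column.values()) == missing_cells)
```

The two columns with missing values account for all 708 missing cells, so the remaining seven columns are complete.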

Variable types

Text: 2
Categorical: 1
Numeric: 4
DateTime: 2

Dataset

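Description: Data on work processes in forestry project service management. It provides the service management project ID (용역관리사업ID), category name (구분명), planned process number (예정공정번호), planned process name (예정공정명), and other fields.
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15120720/fileData.do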

Alerts

예정공정번호 is highly overall correlated with 정렬순서 (High correlation)
공정일수(일) is highly overall correlated with 작업인원값 (High correlation)
작업인원값 is highly overall correlated with 공정일수(일) (High correlation)
정렬순서 is highly overall correlated with 예정공정번호 (High correlation)
공정일수(일) has 245 (9.7%) missing values (Missing)
작업인원값 has 463 (18.3%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 14:27:55.840560
Analysis finished: 2023-12-12 14:27:58.701663
Duration: 2.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

용역관리사업아이디
Text

Distinct: 690
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Memory size: 19.9 KiB

Length

Max length: 14
Median length: 9
Mean length: 9.0197707
Min length: 9

Characters and Unicode

Total characters: 22811
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 221
Unique (%): 8.7%

Sample

1st row: 220190387
2nd row: 920200020
3rd row: 920200020
4th row: 920200020
5th row: 920200020
Value | Count | Frequency (%)
120200035 | 17 | 0.7%
920220006 | 16 | 0.6%
920220086 | 16 | 0.6%
920220007 | 16 | 0.6%
920220056 | 16 | 0.6%
920220019 | 16 | 0.6%
920210079 | 16 | 0.6%
920200020 | 15 | 0.6%
920210014 | 14 | 0.6%
920200310 | 14 | 0.6%
Other values (680) | 2373 | 93.8%

Most occurring characters

Value | Count | Frequency (%)
2 | 7452 | 32.7%
0 | 7241 | 31.7%
1 | 2075 | 9.1%
9 | 1809 | 7.9%
3 | 1140 | 5.0%
5 | 792 | 3.5%
6 | 670 | 2.9%
4 | 635 | 2.8%
7 | 512 | 2.2%
8 | 445 | 2.0%
Other values (7) | 40 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 22771 | 99.8%
Uppercase Letter | 40 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 7452 | 32.7%
0 | 7241 | 31.8%
1 | 2075 | 9.1%
9 | 1809 | 7.9%
3 | 1140 | 5.0%
5 | 792 | 3.5%
6 | 670 | 2.9%
4 | 635 | 2.8%
7 | 512 | 2.2%
8 | 445 | 2.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 10 | 25.0%
N | 10 | 25.0%
F | 9 | 22.5%
D | 6 | 15.0%
C | 3 | 7.5%
E | 1 | 2.5%
S | 1 | 2.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 22771 | 99.8%
Latin | 40 | 0.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 7452 | 32.7%
0 | 7241 | 31.8%
1 | 2075 | 9.1%
9 | 1809 | 7.9%
3 | 1140 | 5.0%
5 | 792 | 3.5%
6 | 670 | 2.9%
4 | 635 | 2.8%
7 | 512 | 2.2%
8 | 445 | 2.0%

Latin
Value | Count | Frequency (%)
B | 10 | 25.0%
N | 10 | 25.0%
F | 9 | 22.5%
D | 6 | 15.0%
C | 3 | 7.5%
E | 1 | 2.5%
S | 1 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 22811 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 7452 | 32.7%
0 | 7241 | 31.7%
1 | 2075 | 9.1%
9 | 1809 | 7.9%
3 | 1140 | 5.0%
5 | 792 | 3.5%
6 | 670 | 2.9%
4 | 635 | 2.8%
7 | 512 | 2.2%
8 | 445 | 2.0%
Other values (7) | 40 | 0.2%

구분명
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.9 KiB

Value | Count
P | 1693
H | 576
E | 260

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P
2nd row: E
3rd row: E
4th row: E
5th row: P

Common Values

Value | Count | Frequency (%)
P | 1693 | 66.9%
H | 576 | 22.8%
E | 260 | 10.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
p | 1693 | 66.9%
h | 576 | 22.8%
e | 260 | 10.3%

예정공정번호
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.250692
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 9
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.3298932
Coefficient of variation (CV): 0.59088194
Kurtosis: 0.62949335
Mean: 2.250692
Median Absolute Deviation (MAD): 1
Skewness: 0.96503241
Sum: 5692
Variance: 1.768616
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)
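
Common values

Value | Count | Frequency (%)
1 | 1008 | 39.9%
2 | 566 | 22.4%
3 | 486 | 19.2%
4 | 308 | 12.2%
5 | 123 | 4.9%
6 | 26 | 1.0%
7 | 7 | 0.3%
8 | 3 | 0.1%
9 | 2 | 0.1%

Extreme values (smallest)

Value | Count | Frequency (%)
1 | 1008 | 39.9%
2 | 566 | 22.4%
3 | 486 | 19.2%
4 | 308 | 12.2%
5 | 123 | 4.9%
6 | 26 | 1.0%
7 | 7 | 0.3%
8 | 3 | 0.1%
9 | 2 | 0.1%

Extreme values (largest)

Value | Count | Frequency (%)
9 | 2 | 0.1%
8 | 3 | 0.1%
7 | 7 | 0.3%
6 | 26 | 1.0%
5 | 123 | 4.9%
4 | 308 | 12.2%
3 | 486 | 19.2%
2 | 566 | 22.4%
1 | 1008 | 39.9%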
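
Because 예정공정번호 takes only nine distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does that as a consistency check. It assumes the reported variance is the sample variance (dividing by n - 1), which is what the reported figures agree with. The variable names are illustrative.

```python
from math import sqrt

# Value -> count, copied from the 예정공정번호 frequency table above.
counts = {1: 1008, 2: 566, 3: 486, 4: 308, 5: 123, 6: 26, 7: 7, 8: 3, 9: 2}

n = sum(counts.values())                      # number of observations
total = sum(v * c for v, c in counts.items())  # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 7), round(cv, 6))
```

The recomputed values agree with the reported Sum, Mean, Variance, Standard deviation, and CV to within rounding.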

예정공정명
Text

Distinct: 550
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 19.9 KiB

Length

Max length: 31
Median length: 29
Mean length: 6.3262159
Min length: 1

Characters and Unicode

Total characters: 15999
Distinct characters: 230
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 331
Unique (%): 13.1%

Sample

1st row: 숲가꾸기 설계
2nd row: 기계톱
3rd row: 굴삭기
4th row: 스마트집재기
5th row: 경계확인
Value | Count | Frequency (%)
 | 205 | 5.7%
정리보완 | 157 | 4.4%
경계확인 | 149 | 4.2%
숲가꾸기 | 135 | 3.8%
설계도서작성 | 114 | 3.2%
현지조사 | 110 | 3.1%
2022년 | 108 | 3.0%
사업 | 84 | 2.3%
납품 | 79 | 2.2%
현장조사 | 73 | 2.0%
Other values (494) | 2372 | 66.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1067 | 6.7%
 | 682 | 4.3%
 | 676 | 4.2%
2 | 637 | 4.0%
 | 621 | 3.9%
 | 477 | 3.0%
 | 436 | 2.7%
 | 407 | 2.5%
 | 356 | 2.2%
 | 338 | 2.1%
Other values (220) | 10302 | 64.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13034 | 81.5%
Decimal Number | 1252 | 7.8%
Space Separator | 1067 | 6.7%
Open Punctuation | 201 | 1.3%
Close Punctuation | 200 | 1.3%
Lowercase Letter | 120 | 0.8%
Other Punctuation | 92 | 0.6%
Uppercase Letter | 20 | 0.1%
Dash Punctuation | 6 | < 0.1%
Connector Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 682 | 5.2%
 | 676 | 5.2%
 | 621 | 4.8%
 | 477 | 3.7%
 | 436 | 3.3%
 | 407 | 3.1%
 | 356 | 2.7%
 | 338 | 2.6%
 | 320 | 2.5%
 | 312 | 2.4%
Other values (188) | 8409 | 64.5%

Decimal Number
Value | Count | Frequency (%)
2 | 637 | 50.9%
0 | 304 | 24.3%
1 | 141 | 11.3%
3 | 89 | 7.1%
4 | 22 | 1.8%
5 | 19 | 1.5%
7 | 18 | 1.4%
6 | 10 | 0.8%
9 | 7 | 0.6%
8 | 5 | 0.4%

Uppercase Letter
Value | Count | Frequency (%)
A | 6 | 30.0%
T | 4 | 20.0%
H | 3 | 15.0%
M | 3 | 15.0%
S | 2 | 10.0%
E | 2 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
, | 64 | 69.6%
/ | 25 | 27.2%
. | 2 | 2.2%
· | 1 | 1.1%

Lowercase Letter
Value | Count | Frequency (%)
t | 60 | 50.0%
e | 30 | 25.0%
s | 30 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 200 | 99.5%
[ | 1 | 0.5%

Close Punctuation
Value | Count | Frequency (%)
) | 199 | 99.5%
] | 1 | 0.5%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 50.0%
~ | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1067 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13034 | 81.5%
Common | 2825 | 17.7%
Latin | 140 | 0.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 682 | 5.2%
 | 676 | 5.2%
 | 621 | 4.8%
 | 477 | 3.7%
 | 436 | 3.3%
 | 407 | 3.1%
 | 356 | 2.7%
 | 338 | 2.6%
 | 320 | 2.5%
 | 312 | 2.4%
Other values (188) | 8409 | 64.5%

Common
Value | Count | Frequency (%)
(space) | 1067 | 37.8%
2 | 637 | 22.5%
0 | 304 | 10.8%
( | 200 | 7.1%
) | 199 | 7.0%
1 | 141 | 5.0%
3 | 89 | 3.2%
, | 64 | 2.3%
/ | 25 | 0.9%
4 | 22 | 0.8%
Other values (13) | 77 | 2.7%

Latin
Value | Count | Frequency (%)
t | 60 | 42.9%
e | 30 | 21.4%
s | 30 | 21.4%
A | 6 | 4.3%
T | 4 | 2.9%
H | 3 | 2.1%
M | 3 | 2.1%
S | 2 | 1.4%
E | 2 | 1.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12997 | 81.2%
ASCII | 2964 | 18.5%
Compat Jamo | 37 | 0.2%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1067 | 36.0%
2 | 637 | 21.5%
0 | 304 | 10.3%
( | 200 | 6.7%
) | 199 | 6.7%
1 | 141 | 4.8%
3 | 89 | 3.0%
, | 64 | 2.2%
t | 60 | 2.0%
e | 30 | 1.0%
Other values (21) | 173 | 5.8%

Hangul
Value | Count | Frequency (%)
 | 682 | 5.2%
 | 676 | 5.2%
 | 621 | 4.8%
 | 477 | 3.7%
 | 436 | 3.4%
 | 407 | 3.1%
 | 356 | 2.7%
 | 338 | 2.6%
 | 320 | 2.5%
 | 312 | 2.4%
Other values (180) | 8372 | 64.4%

Compat Jamo
Value | Count | Frequency (%)
 | 16 | 43.2%
 | 9 | 24.3%
 | 7 | 18.9%
 | 1 | 2.7%
 | 1 | 2.7%
 | 1 | 2.7%
 | 1 | 2.7%
 | 1 | 2.7%

None
Value | Count | Frequency (%)
· | 1 | 100.0%

공정시작일
DateTime

Distinct: 805
Distinct (%): 31.8%
Missing: 0
Missing (%): 0.0%
Memory size: 19.9 KiB
Minimum: 2019-11-21 00:00:00
Maximum: 2023-09-02 00:00:00

Histogram with fixed size bins (bins=50)

공정종료일
DateTime

Distinct: 820
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Memory size: 19.9 KiB
Minimum: 2019-11-23 00:00:00
Maximum: 2023-09-02 00:00:00

Histogram with fixed size bins (bins=50)

공정일수(일)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 90
Distinct (%): 3.9%
Missing: 245
Missing (%): 9.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.534151
Minimum: 1
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 9
Q3: 22
95-th percentile: 55.85
Maximum: 144
Range: 143
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.919971
Coefficient of variation (CV): 1.2047774
Kurtosis: 7.7170679
Mean: 16.534151
Median Absolute Deviation (MAD): 6
Skewness: 2.4103574
Sum: 37764
Variance: 396.80523
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2 | 252 | 10.0%
3 | 162 | 6.4%
1 | 154 | 6.1%
6 | 127 | 5.0%
5 | 122 | 4.8%
7 | 108 | 4.3%
8 | 101 | 4.0%
10 | 101 | 4.0%
9 | 98 | 3.9%
4 | 90 | 3.6%
Other values (80) | 969 | 38.3%
(Missing) | 245 | 9.7%

Extreme values (smallest)

Value | Count | Frequency (%)
1 | 154 | 6.1%
2 | 252 | 10.0%
3 | 162 | 6.4%
4 | 90 | 3.6%
5 | 122 | 4.8%
6 | 127 | 5.0%
7 | 108 | 4.3%
8 | 101 | 4.0%
9 | 98 | 3.9%
10 | 101 | 4.0%

Extreme values (largest)

Value | Count | Frequency (%)
144 | 3 | 0.1%
129 | 1 | < 0.1%
126 | 5 | 0.2%
124 | 1 | < 0.1%
120 | 2 | 0.1%
118 | 2 | 0.1%
116 | 2 | 0.1%
106 | 1 | < 0.1%
104 | 1 | < 0.1%
103 | 3 | 0.1%

작업인원값
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 80
Distinct (%): 3.9%
Missing: 463
Missing (%): 18.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.780736
Minimum: 1
Maximum: 496
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 8
95-th percentile: 17
Maximum: 496
Range: 495
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 42.188105
Coefficient of variation (CV): 3.5811095
Kurtosis: 54.259005
Mean: 11.780736
Median Absolute Deviation (MAD): 2
Skewness: 6.9090241
Sum: 24339
Variance: 1779.8362
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2 | 490 | 19.4%
1 | 485 | 19.2%
3 | 171 | 6.8%
8 | 156 | 6.2%
6 | 131 | 5.2%
9 | 109 | 4.3%
4 | 90 | 3.6%
5 | 88 | 3.5%
7 | 86 | 3.4%
10 | 68 | 2.7%
Other values (70) | 192 | 7.6%
(Missing) | 463 | 18.3%

Extreme values (smallest)

Value | Count | Frequency (%)
1 | 485 | 19.2%
2 | 490 | 19.4%
3 | 171 | 6.8%
4 | 90 | 3.6%
5 | 88 | 3.5%
6 | 131 | 5.2%
7 | 86 | 3.4%
8 | 156 | 6.2%
9 | 109 | 4.3%
10 | 68 | 2.7%

Extreme values (largest)

Value | Count | Frequency (%)
496 | 2 | 0.1%
449 | 1 | < 0.1%
418 | 1 | < 0.1%
398 | 2 | 0.1%
362 | 2 | 0.1%
348 | 2 | 0.1%
342 | 1 | < 0.1%
318 | 1 | < 0.1%
302 | 2 | 0.1%
280 | 2 | 0.1%

정렬순서
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.214314
Minimum: 0
Maximum: 8
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 22.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 8
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.2725563
Coefficient of variation (CV): 0.5746955
Kurtosis: -0.17370083
Mean: 2.214314
Median Absolute Deviation (MAD): 1
Skewness: 0.8004964
Sum: 5600
Variance: 1.6193995
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)
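
Common values

Value | Count | Frequency (%)
1 | 1016 | 40.2%
2 | 570 | 22.5%
3 | 491 | 19.4%
4 | 309 | 12.2%
5 | 121 | 4.8%
6 | 18 | 0.7%
7 | 2 | 0.1%
8 | 1 | < 0.1%
0 | 1 | < 0.1%

Extreme values (smallest)

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1016 | 40.2%
2 | 570 | 22.5%
3 | 491 | 19.4%
4 | 309 | 12.2%
5 | 121 | 4.8%
6 | 18 | 0.7%
7 | 2 | 0.1%
8 | 1 | < 0.1%

Extreme values (largest)

Value | Count | Frequency (%)
8 | 1 | < 0.1%
7 | 2 | 0.1%
6 | 18 | 0.7%
5 | 121 | 4.8%
4 | 309 | 12.2%
3 | 491 | 19.4%
2 | 570 | 22.5%
1 | 1016 | 40.2%
0 | 1 | < 0.1%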

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

Variable | 구분명 | 예정공정번호 | 공정일수(일) | 작업인원값 | 정렬순서
구분명 | 1.000 | 0.195 | 0.429 | 0.144 | 0.241
예정공정번호 | 0.195 | 1.000 | 0.112 | 0.000 | 0.980
공정일수(일) | 0.429 | 0.112 | 1.000 | 0.548 | 0.092
작업인원값 | 0.144 | 0.000 | 0.548 | 1.000 | 0.000
정렬순서 | 0.241 | 0.980 | 0.092 | 0.000 | 1.000

Variable | 예정공정번호 | 공정일수(일) | 작업인원값 | 정렬순서 | 구분명
예정공정번호 | 1.000 | -0.107 | -0.031 | 0.974 | 0.086
공정일수(일) | -0.107 | 1.000 | 0.509 | -0.103 | 0.285
작업인원값 | -0.031 | 0.509 | 1.000 | -0.025 | 0.086
정렬순서 | 0.974 | -0.103 | -0.025 | 1.000 | 0.109
구분명 | 0.086 | 0.285 | 0.086 | 0.109 | 1.000
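
The Alerts section flags two variable pairs as highly correlated. As a rough cross-check, the sketch below scans the first matrix above for off-diagonal entries at or above a cutoff. The 0.5 cutoff and the helper name `high_correlation_pairs` are assumptions made for illustration; neither is taken from this report's configuration.

```python
# Column order and values copied from the first correlation matrix above.
columns = ["구분명", "예정공정번호", "공정일수(일)", "작업인원값", "정렬순서"]
matrix = [
    [1.000, 0.195, 0.429, 0.144, 0.241],
    [0.195, 1.000, 0.112, 0.000, 0.980],
    [0.429, 0.112, 1.000, 0.548, 0.092],
    [0.144, 0.000, 0.548, 1.000, 0.000],
    [0.241, 0.980, 0.092, 0.000, 1.000],
]

def high_correlation_pairs(columns, matrix, threshold=0.5):
    """Return (name_a, name_b, value) for each off-diagonal pair whose
    absolute correlation is at or above `threshold` (an assumed cutoff)."""
    pairs = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs

pairs = high_correlation_pairs(columns, matrix)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this cutoff, the only pairs returned are 예정공정번호 with 정렬순서 and 공정일수(일) with 작업인원값, the same two pairs listed in the alerts.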

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
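
One common way to compute a nullity correlation is to take the Pearson correlation of per-column missing indicators (1 where a value is missing, 0 otherwise). The sketch below assumes that definition and applies it to 공정일수(일) and 작업인원값, using only the 20 rows listed in the Sample section. Those rows are the head and tail of the file, so the result illustrates the calculation and is not an estimate for the full dataset. The `pearson` helper is written out by hand to keep the example dependency-free.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Missing indicators (1 = missing) for the 20 rows in the Sample section:
# the first 10 rows (index 0-9) followed by the last 10 (index 2519-2528).
days_missing = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0] + [0] * 10
workers_missing = [0] * 10 + [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

r = pearson(days_missing, workers_missing)
print(round(r, 3))
```

In these 20 rows the two columns are never missing together, so the indicator correlation comes out slightly negative.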

Sample

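First rows

index | 용역관리사업아이디 | 구분명 | 예정공정번호 | 예정공정명 | 공정시작일 | 공정종료일 | 공정일수(일) | 작업인원값 | 정렬순서
0 | 220190387 | P | 1 | 숲가꾸기 설계 | 2019-12-17 | 2019-12-31 | <NA> | 2 | 1
1 | 920200020 | E | 1 | 기계톱 | 2020-01-30 | 2020-03-16 | 47 | 5 | 1
2 | 920200020 | E | 2 | 굴삭기 | 2020-01-29 | 2020-03-17 | 49 | 1 | 2
3 | 920200020 | E | 3 | 스마트집재기 | 2020-02-06 | 2020-03-15 | 39 | 1 | 3
4 | 920200020 | P | 1 | 경계확인 | 2020-01-29 | 2020-01-31 | 3 | 1 | 1
5 | 920200020 | P | 2 | 선목 | 2020-01-29 | 2020-02-05 | <NA> | 2 | 2
6 | 920200020 | P | 7 | 가지치기 | 2020-02-12 | 2020-03-10 | 28 | 4 | 3
7 | 920200020 | P | 4 | 벌도 | 2020-01-30 | 2020-03-13 | 44 | 5 | 4
8 | 920200020 | P | 8 | 하산집재 | 2020-02-06 | 2020-03-17 | 41 | 4 | 5
9 | 920200020 | P | 9 | 정리보완 | 2020-03-07 | 2020-03-17 | 11 | 5 | 6

Last rows

index | 용역관리사업아이디 | 구분명 | 예정공정번호 | 예정공정명 | 공정시작일 | 공정종료일 | 공정일수(일) | 작업인원값 | 정렬순서
2519 | 920230050 | H | 4 | 보완작업 | 2023-04-24 | 2023-05-07 | 14 | 8 | 4
2520 | 920230050 | P | 1 | 선목 | 2023-02-07 | 2023-02-11 | 5 | 8 | 1
2521 | 920230050 | P | 2 | 솎아베기 | 2023-02-13 | 2023-04-07 | 54 | 8 | 2
2522 | 920230050 | P | 3 | 산물수집 | 2023-03-01 | 2023-04-21 | 52 | 6 | 3
2523 | 920230050 | P | 4 | 보완작업 | 2023-04-24 | 2023-05-07 | 14 | 8 | 4
2524 | 920230010 | H | 1 | 2023년 제5차 조림예정지정리사업 | 2023-01-09 | 2023-01-15 | 7 | 37 | 1
2525 | 920230010 | E | 1 | 우드그랩(굴삭기) | 2023-01-09 | 2023-01-15 | 7 | 7 | 1
2526 | 920230010 | P | 1 | 2023년 제5차 조림예정지정리사업 | 2023-01-09 | 2023-01-15 | 7 | <NA> | 1
2527 | 220230162 | P | 1 | 조사및 최종보 | 2023-05-16 | 2023-06-12 | 28 | 6 | 1
2528 | 220230173 | P | 1 | 숲가꾸기설계 | 2023-08-04 | 2023-08-18 | 15 | <NA> | 1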
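
In the sample rows shown, 공정일수(일) appears to equal the number of calendar days from 공정시작일 to 공정종료일, counting both endpoints. That is an observation about these rows, not a documented rule of the dataset. The sketch below checks it on a handful of them; `inclusive_days` is a hypothetical helper name.

```python
from datetime import date

# (공정시작일, 공정종료일, 공정일수(일)) for a few of the sample rows above.
rows = [
    (date(2020, 1, 29), date(2020, 1, 31), 3),
    (date(2020, 3, 7), date(2020, 3, 17), 11),
    (date(2023, 2, 7), date(2023, 2, 11), 5),
    (date(2023, 1, 9), date(2023, 1, 15), 7),
    (date(2023, 8, 4), date(2023, 8, 18), 15),
]

def inclusive_days(start, end):
    """Number of calendar days from start to end, counting both ends."""
    return (end - start).days + 1

# Collect any rows where the recorded duration disagrees with the dates.
mismatches = [row for row in rows if inclusive_days(row[0], row[1]) != row[2]]
print("mismatching rows:", len(mismatches))
```

If the relationship holds across the file, it could be used to fill some of the 245 missing 공정일수(일) values from the two date columns, which have no missing values. It should be verified on the full data first.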