Overview

Dataset statistics

Number of variables: 6
Number of observations: 43
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 55.1 B

Variable types

Text: 1
Numeric: 2
Categorical: 2
Boolean: 1

Dataset

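Description: Forestry subsidy program information used when carrying out forestry projects, consisting of the detailed project name (상세사업명), national treasury share (국고비율), city/county/district share (시군구비율), self-funded share (자부담비율), post-management period (사후관리기간), and in-use flag (사용여부).
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15093785/fileData.do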

Alerts

사후관리기간 has constant value "2" [Constant]
국고비율 is highly overall correlated with 자부담비율 [High correlation]
자부담비율 is highly overall correlated with 국고비율 and 1 other field [High correlation]
시군구비율 is highly overall correlated with 자부담비율 [High correlation]
사용여부 is highly imbalanced (84.1%) [Imbalance]
상세사업명 has unique values [Unique]
자부담비율 has 2 (4.7%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 12:13:06.491344
Analysis finished: 2023-12-12 12:13:07.444910
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상세사업명
Text

UNIQUE 

Distinct: 43
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 476.0 B

Length

Max length: 26
Median length: 16
Mean length: 11.325581
Min length: 5

Characters and Unicode

Total characters: 487
Distinct characters: 136
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 100.0%

Sample

1st row: 산림복합경영단지
2nd row: 산지종합유통센터
3rd row: 가공산업활성화
4th row: 생산기반지원
5th row: 상품화지원
Value | Count | Frequency (%)
지원 | 16 | 16.7%
해외 | 4 | 4.2%
임산물 | 4 | 4.2%
수출협의회 | 2 | 2.1%
(value not preserved) | 2 | 2.1%
해외공동물류센터 | 2 | 2.1%
수출 | 2 | 2.1%
결성·운영 | 1 | 1.0%
정보 | 1 | 1.0%
활성화 | 1 | 1.0%
Other values (61) | 61 | 63.5%

Most occurring characters

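Value | Count | Frequency (%)
(space) | 53 | 10.9%
(character not preserved) | 27 | 5.5%
(character not preserved) | 22 | 4.5%
(character not preserved) | 15 | 3.1%
(character not preserved) | 15 | 3.1%
(character not preserved) | 13 | 2.7%
(character not preserved) | 12 | 2.5%
(character not preserved) | 10 | 2.1%
(character not preserved) | 10 | 2.1%
(character not preserved) | 9 | 1.8%
Other values (126) | 301 | 61.8%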

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 423 | 86.9%
Space Separator | 53 | 10.9%
Other Punctuation | 5 | 1.0%
Close Punctuation | 3 | 0.6%
Open Punctuation | 3 | 0.6%

Most frequent character per category

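Other Letter
Value | Count | Frequency (%)
(character not preserved) | 27 | 6.4%
(character not preserved) | 22 | 5.2%
(character not preserved) | 15 | 3.5%
(character not preserved) | 15 | 3.5%
(character not preserved) | 13 | 3.1%
(character not preserved) | 12 | 2.8%
(character not preserved) | 10 | 2.4%
(character not preserved) | 10 | 2.4%
(character not preserved) | 9 | 2.1%
(character not preserved) | 8 | 1.9%
Other values (122) | 282 | 66.7%

Space Separator
Value | Count | Frequency (%)
(space) | 53 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%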

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 423 | 86.9%
Common | 64 | 13.1%

Most frequent character per script

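Hangul
Value | Count | Frequency (%)
(character not preserved) | 27 | 6.4%
(character not preserved) | 22 | 5.2%
(character not preserved) | 15 | 3.5%
(character not preserved) | 15 | 3.5%
(character not preserved) | 13 | 3.1%
(character not preserved) | 12 | 2.8%
(character not preserved) | 10 | 2.4%
(character not preserved) | 10 | 2.4%
(character not preserved) | 9 | 2.1%
(character not preserved) | 8 | 1.9%
Other values (122) | 282 | 66.7%

Common
Value | Count | Frequency (%)
(space) | 53 | 82.8%
· | 5 | 7.8%
) | 3 | 4.7%
( | 3 | 4.7%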

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 423 | 86.9%
ASCII | 59 | 12.1%
None | 5 | 1.0%

Most frequent character per block

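ASCII
Value | Count | Frequency (%)
(space) | 53 | 89.8%
) | 3 | 5.1%
( | 3 | 5.1%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 27 | 6.4%
(character not preserved) | 22 | 5.2%
(character not preserved) | 15 | 3.5%
(character not preserved) | 15 | 3.5%
(character not preserved) | 13 | 3.1%
(character not preserved) | 12 | 2.8%
(character not preserved) | 10 | 2.4%
(character not preserved) | 10 | 2.4%
(character not preserved) | 9 | 2.1%
(character not preserved) | 8 | 1.9%
Other values (122) | 282 | 66.7%

None
Value | Count | Frequency (%)
· | 5 | 100.0%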

국고비율
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.302326
Minimum: 20
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 519.0 B

Quantile statistics

Minimum: 20
5th percentile: 20
Q1: 20
Median: 50
Q3: 75
95th percentile: 90
Maximum: 100
Range: 80
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 26.312712
Coefficient of variation (CV): 0.53370123
Kurtosis: -1.0979033
Mean: 49.302326
Median Absolute Deviation (MAD): 30
Skewness: 0.41652149
Sum: 2120
Variance: 692.3588
Monotonicity: Not monotonic
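
The descriptive statistics above can be cross-checked against the value counts reported for 국고비율 in this section. The sketch below rebuilds the 43 observations from that frequency table and recomputes the mean, variance, standard deviation, and coefficient of variation with Python's standard library. It assumes the reported variance is the sample variance (divisor n - 1), which is what the figures are consistent with; the variable names are illustrative only.

```python
import statistics

# Value -> count pairs copied from the 국고비율 frequency table in this report.
value_counts = {20: 13, 30: 3, 40: 2, 50: 12, 70: 2, 80: 6, 90: 3, 100: 2}

# Expand the frequency table back into individual observations.
values = [value for value, count in value_counts.items() for _ in range(count)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

print(f"n={len(values)} sum={sum(values)}")
print(f"mean={mean:.6f} variance={variance:.4f} std={std_dev:.6f} cv={cv:.8f}")
```

If the sample-variance assumption holds, the printed values should agree with the report to the displayed precision.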
Histogram with fixed size bins (bins=8)
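Common values
Value | Count | Frequency (%)
20 | 13 | 30.2%
50 | 12 | 27.9%
80 | 6 | 14.0%
30 | 3 | 7.0%
90 | 3 | 7.0%
40 | 2 | 4.7%
70 | 2 | 4.7%
100 | 2 | 4.7%

Minimum values (ascending)
Value | Count | Frequency (%)
20 | 13 | 30.2%
30 | 3 | 7.0%
40 | 2 | 4.7%
50 | 12 | 27.9%
70 | 2 | 4.7%
80 | 6 | 14.0%
90 | 3 | 7.0%
100 | 2 | 4.7%

Maximum values (descending)
Value | Count | Frequency (%)
100 | 2 | 4.7%
90 | 3 | 7.0%
80 | 6 | 14.0%
70 | 2 | 4.7%
50 | 12 | 27.9%
40 | 2 | 4.7%
30 | 3 | 7.0%
20 | 13 | 30.2%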

시군구비율
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 476.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.2790698
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
0 | 31 | 72.1%
20 | 9 | 20.9%
40 | 3 | 7.0%

Length

Histogram of lengths of the category

Common Values (Plot)

자부담비율
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.72093
Minimum: 0
Maximum: 80
Zeros: 2
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 519.0 B

Quantile statistics

Minimum: 0
5th percentile: 10
Q1: 20
Median: 40
Q3: 65
95th percentile: 80
Maximum: 80
Range: 80
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 25.729662
Coefficient of variation (CV): 0.58849759
Kurtosis: -1.2372978
Mean: 43.72093
Median Absolute Deviation (MAD): 20
Skewness: 0.16332268
Sum: 1880
Variance: 662.0155
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
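Common values
Value | Count | Frequency (%)
80 | 10 | 23.3%
30 | 9 | 20.9%
20 | 7 | 16.3%
50 | 7 | 16.3%
60 | 3 | 7.0%
10 | 3 | 7.0%
0 | 2 | 4.7%
40 | 1 | 2.3%
70 | 1 | 2.3%

Minimum values (ascending)
Value | Count | Frequency (%)
0 | 2 | 4.7%
10 | 3 | 7.0%
20 | 7 | 16.3%
30 | 9 | 20.9%
40 | 1 | 2.3%
50 | 7 | 16.3%
60 | 3 | 7.0%
70 | 1 | 2.3%
80 | 10 | 23.3%

Maximum values (descending)
Value | Count | Frequency (%)
80 | 10 | 23.3%
70 | 1 | 2.3%
60 | 3 | 7.0%
50 | 7 | 16.3%
40 | 1 | 2.3%
30 | 9 | 20.9%
20 | 7 | 16.3%
10 | 3 | 7.0%
0 | 2 | 4.7%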

사후관리기간
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 476.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 43 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

사용여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 175.0 B

Value | Count | Frequency (%)
True | 42 | 97.7%
False | 1 | 2.3%
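
The 84.1% in the imbalance alert for 사용여부 is not the share of the majority class (that is 97.7%). It is consistent with an entropy-based score: one minus the Shannon entropy of the class counts, normalized by the logarithm of the number of classes. That this is the exact formula the profiler applies is an assumption here, and `imbalance_score` is a hypothetical helper name; the sketch only shows that the formula reproduces the reported figure from the 42/1 split.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 사용여부 taken from the table above: 42 True, 1 False.
score = imbalance_score([42, 1])
print(f"{score:.1%}")  # prints 84.1%
```

A perfectly balanced column, for example `imbalance_score([10, 10])`, scores 0.0 under the same formula.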

Interactions


Correlations

(variable) | 상세사업명 | 국고비율 | 시군구비율 | 자부담비율 | 사용여부
상세사업명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
국고비율 | 1.000 | 1.000 | 0.649 | 0.932 | 0.000
시군구비율 | 1.000 | 0.649 | 1.000 | 0.840 | 0.000
자부담비율 | 1.000 | 0.932 | 0.840 | 1.000 | 0.000
사용여부 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000

(variable) | 시군구비율 | 사용여부
시군구비율 | 1.000 | 0.000
사용여부 | 0.000 | 1.000

(variable) | 국고비율 | 자부담비율 | 시군구비율 | 사용여부
국고비율 | 1.000 | -0.911 | 0.487 | 0.000
자부담비율 | -0.911 | 1.000 | 0.507 | 0.000
시군구비율 | 0.487 | 0.507 | 1.000 | 0.000
사용여부 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

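First rows
Index | 상세사업명 | 국고비율 | 시군구비율 | 자부담비율 | 사후관리기간 | 사용여부
0 | 산림복합경영단지 | 40 | 40 | 20 | 2 | Y
1 | 산지종합유통센터 | 50 | 20 | 30 | 2 | Y
2 | 가공산업활성화 | 50 | 20 | 30 | 2 | Y
3 | 생산기반지원 | 20 | 20 | 60 | 2 | Y
4 | 상품화지원 | 20 | 20 | 60 | 2 | Y
5 | 유통기반지원 | 20 | 20 | 60 | 2 | Y
6 | 산림작물생산단지 | 40 | 20 | 40 | 2 | Y
7 | 임산물 판매촉진비 지원 | 30 | 0 | 70 | 2 | Y
8 | 임산물 수출기계장비 구입비 지원 | 50 | 0 | 50 | 2 | Y
9 | 수출 포장디자인 개발비 지원 | 70 | 0 | 30 | 2 | Y

Last rows
Index | 상세사업명 | 국고비율 | 시군구비율 | 자부담비율 | 사후관리기간 | 사용여부
33 | 목재체험교실 | 50 | 0 | 50 | 2 | Y
34 | 목공지도자 양성교실 | 50 | 0 | 50 | 2 | Y
35 | 초·중등교사 목공체험 | 50 | 0 | 50 | 2 | Y
36 | 목재산업박람회 | 50 | 0 | 50 | 2 | Y
37 | 목재제품(목공예품) 전시·홍보 | 50 | 0 | 50 | 2 | Y
38 | 목재문화진흥회 관련 사업 | 50 | 0 | 50 | 2 | Y
39 | 주민편의시설용목재팰릿보일러 | 30 | 40 | 30 | 2 | Y
40 | 주택용목재팰릿보일러 | 30 | 40 | 30 | 2 | Y
41 | 목재팰릿제조시설효율 개선 | 50 | 20 | 30 | 2 | Y
42 | 목재산업시설 현대화 | 50 | 20 | 30 | 2 | Y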
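
The strong correlation reported between 국고비율, 시군구비율, and 자부담비율 is what one would expect if the three shares of each project add up to 100. The report does not state this constraint, so the sketch below treats it as a hypothesis and checks it two ways: row by row on the 20 sample rows shown above, and in aggregate using the column sums and value counts reported earlier. The variable names are illustrative only.

```python
# (국고비율, 시군구비율, 자부담비율) for the 20 sample rows shown in this report.
sample_rows = [
    (40, 40, 20), (50, 20, 30), (50, 20, 30), (20, 20, 60), (20, 20, 60),
    (20, 20, 60), (40, 20, 40), (30, 0, 70), (50, 0, 50), (70, 0, 30),
    (50, 0, 50), (50, 0, 50), (50, 0, 50), (50, 0, 50), (50, 0, 50),
    (50, 0, 50), (30, 40, 30), (30, 40, 30), (50, 20, 30), (50, 20, 30),
]

# Row-level check: collect the distinct totals across the visible rows.
row_sums = {sum(row) for row in sample_rows}
print("distinct row totals:", row_sums)

# Aggregate check over all 43 rows, using figures reported in this document:
# Sum of 국고비율 = 2120, Sum of 자부담비율 = 1880, and the 시군구비율
# value counts 0 x 31, 20 x 9, 40 x 3.
n_rows = 43
total_shares = 2120 + 1880 + (0 * 31 + 20 * 9 + 40 * 3)
print("aggregate total:", total_shares, "expected if every row sums to 100:", n_rows * 100)
```

The aggregate check is necessary but not sufficient: matching totals are consistent with every row summing to 100 but do not prove it for the 23 rows not shown.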