Overview

Dataset statistics

Number of variables: 5
Number of observations: 4757
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 199.9 KiB
Average record size in memory: 43.0 B
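
The average record size can be recovered from the total memory size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 199.9
n_observations = 4757
n_variables = 5

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Every variable has one cell per observation.
n_cells = n_variables * n_observations

print(f"{avg_record_bytes:.1f} B per record, {n_cells} cells in total")
```

With zero missing cells, all of these cells hold a value.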

Variable types

Numeric: 3
Text: 1
Categorical: 1

Dataset

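Description: The gender-responsive budgeting system analyzes in advance how a budget will affect women and men and reflects the results in budget formulation, so that women and men benefit equally from the budget.
Author: 행정안전부 (Ministry of the Interior and Safety)
URL: https://www.data.go.kr/data/15066112/fileData.do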

Alerts

사업수 is highly overall correlated with 예산액(천원) (High correlation)
예산액(천원) is highly overall correlated with 사업수 (High correlation)

Reproduction

Analysis started: 2023-12-12 18:11:57.014017
Analysis finished: 2023-12-12 18:11:58.529998
Duration: 1.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

회계연도
Real number (ℝ)

Distinct: 8
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.4082
Minimum: 2015
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.9 KiB

Quantile statistics

Minimum: 2015
5-th percentile: 2016
Q1: 2017
Median: 2019
Q3: 2022
95-th percentile: 2023
Maximum: 2023
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.4501419
Coefficient of variation (CV): 0.001213297
Kurtosis: -1.4239457
Mean: 2019.4082
Median Absolute Deviation (MAD): 2
Skewness: 0.087165901
Sum: 9606325
Variance: 6.0031953
Monotonicity: Not monotonic
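
Several of the figures for 회계연도 are derived from one another. The sketch below recomputes them from the reported sum, count, standard deviation and quantiles, as a consistency check rather than a description of how the profiling tool works internally (the variable names are illustrative):

```python
# Figures copied from the 회계연도 statistics in this report.
n = 4757
total = 9606325
std = 2.4501419
minimum, q1, q3, maximum = 2015, 2017, 2022, 2023

mean = total / n                  # reported as 2019.4082
variance = std ** 2               # reported as 6.0031953
cv = std / mean                   # reported as 0.001213297
value_range = maximum - minimum   # reported as 8
iqr = q3 - q1                     # reported as 5

print(round(mean, 4), round(variance, 7), value_range, iqr)
```

The very small coefficient of variation reflects that the values are calendar years clustered around 2019, so the spread is tiny relative to the mean.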
Histogram with fixed size bins (bins=8)
Value  Count  Frequency (%)
2017  683  14.4%
2018  678  14.3%
2021  677  14.2%
2022  677  14.2%
2016  677  14.2%
2019  675  14.2%
2023  674  14.2%
2015  16  0.3%

Value  Count  Frequency (%)
2015  16  0.3%
2016  677  14.2%
2017  683  14.4%
2018  678  14.3%
2019  675  14.2%
2021  677  14.2%
2022  677  14.2%
2023  674  14.2%

Value  Count  Frequency (%)
2023  674  14.2%
2022  677  14.2%
2021  677  14.2%
2019  675  14.2%
2018  678  14.3%
2017  683  14.4%
2016  677  14.2%
2015  16  0.3%
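
Each percentage in the 회계연도 frequency tables is the count divided by the 4757 observations. The following sketch recomputes the shares and lists any year inside the observed range that has no rows at all (the variable names are illustrative):

```python
# Counts copied from the 회계연도 frequency tables in this report.
counts = {2015: 16, 2016: 677, 2017: 683, 2018: 678,
          2019: 675, 2021: 677, 2022: 677, 2023: 674}

# The counts should add up to the number of observations.
n = sum(counts.values())

# Share of each year, rounded to one decimal place like the report.
shares = {year: round(100 * c / n, 1) for year, c in counts.items()}

# Years between the minimum and maximum that never occur in the data.
missing_years = sorted(set(range(min(counts), max(counts) + 1)) - set(counts))

print(n, shares, missing_years)
```

The check shows that 2020 does not occur, which is why there are 8 distinct values even though the range spans nine calendar years.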

자치단체명
Text

Distinct: 263
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 37.3 KiB

Length

Max length: 11
Median length: 8
Mean length: 7.9066639
Min length: 3

Characters and Unicode

Total characters: 37612
Distinct characters: 139
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
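
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general categories in the five sample rows this report lists for the variable. It is a small example of the idea, not the profiling tool's own implementation:

```python
import unicodedata
from collections import Counter

# The five sample rows shown for this variable in the report.
sample = ["서울특별시", "서울특별시", "서울특별시",
          "서울특별시 종로구", "서울특별시 종로구"]

# Count characters by Unicode general category:
# "Lo" is Other Letter (Hangul syllables), "Zs" is Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for text in sample for ch in text
)

print(category_counts)
```

For the full column the report gives 33212 Other Letter characters and 4400 Space Separator characters, the same two categories that appear in this small sample.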

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시 종로구
5th row: 서울특별시 종로구

Value  Count  Frequency (%)
경기도  650  7.1%
서울특별시  487  5.3%
전라남도  473  5.2%
경상북도  449  4.9%
경상남도  376  4.1%
부산광역시  339  3.7%
충청남도  327  3.6%
전라북도  303  3.3%
충청북도  241  2.6%
강원도  215  2.3%
Other values (212)  5297  57.8%

Most occurring characters

ValueCountFrequency (%)
4400
 
11.7%
3328
 
8.8%
3058
 
8.1%
1675
 
4.5%
1535
 
4.1%
1522
 
4.0%
1420
 
3.8%
1269
 
3.4%
1111
 
3.0%
1027
 
2.7%
Other values (129) 17267
45.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 33212
88.3%
Space Separator 4400
 
11.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3328
 
10.0%
3058
 
9.2%
1675
 
5.0%
1535
 
4.6%
1522
 
4.6%
1420
 
4.3%
1269
 
3.8%
1111
 
3.3%
1027
 
3.1%
905
 
2.7%
Other values (128) 16362
49.3%
Space Separator
ValueCountFrequency (%)
4400
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 33212
88.3%
Common 4400
 
11.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3328
 
10.0%
3058
 
9.2%
1675
 
5.0%
1535
 
4.6%
1522
 
4.6%
1420
 
4.3%
1269
 
3.8%
1111
 
3.3%
1027
 
3.1%
905
 
2.7%
Other values (128) 16362
49.3%
Common
ValueCountFrequency (%)
4400
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 33212
88.3%
ASCII 4400
 
11.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4400
100.0%
Hangul
ValueCountFrequency (%)
3328
 
10.0%
3058
 
9.2%
1675
 
5.0%
1535
 
4.6%
1522
 
4.6%
1420
 
4.3%
1269
 
3.8%
1111
 
3.3%
1027
 
3.1%
905
 
2.7%
Other values (128) 16362
49.3%

대상사업유형
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.3 KiB

자치단체특화사업  1431
양성평등정책추진사업  1384
성별영향분석평가사업  977
성별영향평가사업  729
여성정책추진사업  236

Length

Max length: 10
Median length: 8
Mean length: 8.9926424
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 양성평등정책추진사업
2nd row: 성별영향평가사업
3rd row: 자치단체특화사업
4th row: 양성평등정책추진사업
5th row: 성별영향평가사업

Common Values

Value  Count  Frequency (%)
자치단체특화사업  1431  30.1%
양성평등정책추진사업  1384  29.1%
성별영향분석평가사업  977  20.5%
성별영향평가사업  729  15.3%
여성정책추진사업  236  5.0%

Length

Histogram of lengths of the category

Common Values (Plot)


사업수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 156
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.565693
Minimum: 1
Maximum: 281
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
Median: 17
Q3: 34
95-th percentile: 71.2
Maximum: 281
Range: 280
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 26.479425
Coefficient of variation (CV): 1.0779026
Kurtosis: 15.697886
Mean: 24.565693
Median Absolute Deviation (MAD): 12
Skewness: 2.9071038
Sum: 116859
Variance: 701.15995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
2  248  5.2%
3  239  5.0%
1  235  4.9%
5  176  3.7%
4  176  3.7%
6  158  3.3%
7  156  3.3%
8  126  2.6%
9  116  2.4%
16  112  2.4%
Other values (146)  3015  63.4%

Value  Count  Frequency (%)
1  235  4.9%
2  248  5.2%
3  239  5.0%
4  176  3.7%
5  176  3.7%
6  158  3.3%
7  156  3.3%
8  126  2.6%
9  116  2.4%
10  104  2.2%

Value  Count  Frequency (%)
281  1  < 0.1%
273  2  < 0.1%
269  1  < 0.1%
261  1  < 0.1%
252  1  < 0.1%
251  1  < 0.1%
226  1  < 0.1%
223  1  < 0.1%
212  1  < 0.1%
205  1  < 0.1%

예산액(천원)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4703
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31589846
Minimum: 0
Maximum: 3.0958585 × 10^9
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 41.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 116984
Q1: 1825334
Median: 7660846
Q3: 20829399
95-th percentile: 1.0849368 × 10^8
Maximum: 3.0958585 × 10^9
Range: 3.0958585 × 10^9
Interquartile range (IQR): 19004065

Descriptive statistics

Standard deviation: 1.197522 × 10^8
Coefficient of variation (CV): 3.7908445
Kurtosis: 212.8579
Mean: 31589846
Median Absolute Deviation (MAD): 6854728
Skewness: 12.426453
Sum: 1.502729 × 10^11
Variance: 1.4340588 × 10^16
Monotonicity: Not monotonic
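
The column name carries its unit: 천원 means thousands of won, so every figure here must be multiplied by 1000 to get won. A short sketch of that conversion, which also compares the mean with the median to show how strongly the distribution is skewed to the right (the variable names are illustrative):

```python
# Figures copied from the 예산액(천원) statistics in this report.
n = 4757
total_thousand_won = 1.502729e11   # Sum, in thousands of won
median_thousand_won = 7660846      # Median, in thousands of won

# Mean recomputed from sum and count (reported as 31589846).
mean_thousand_won = total_thousand_won / n

# Convert the total from thousands of won to won.
total_won = total_thousand_won * 1000

# A mean several times the median indicates a long right tail.
mean_to_median = mean_thousand_won / median_thousand_won

print(f"{total_won:.4e} won in total, mean/median = {mean_to_median:.2f}")
```

The mean is roughly four times the median, in line with the large skewness and kurtosis reported for this variable.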
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
25000  6  0.1%
1000000  5  0.1%
40000  4  0.1%
30500  4  0.1%
20000  3  0.1%
0  3  0.1%
8000  3  0.1%
83856  3  0.1%
70000  3  0.1%
295000  3  0.1%
Other values (4693)  4720  99.2%

Value  Count  Frequency (%)
0  3  0.1%
500  1  < 0.1%
980  1  < 0.1%
1516  1  < 0.1%
1720  2  < 0.1%
1900  1  < 0.1%
2270  1  < 0.1%
2491  1  < 0.1%
2800  1  < 0.1%
3200  2  < 0.1%

Value  Count  Frequency (%)
3095858535  1  < 0.1%
2299520281  1  < 0.1%
2263029194  1  < 0.1%
2208862319  1  < 0.1%
1935244087  1  < 0.1%
1567554539  1  < 0.1%
1543548662  1  < 0.1%
1436938618  1  < 0.1%
1320198185  1  < 0.1%
1299421448  1  < 0.1%

Interactions


Correlations

  회계연도  대상사업유형  사업수  예산액(천원)
회계연도  1.000  0.575  0.035  0.000
대상사업유형  0.575  1.000  0.554  0.056
사업수  0.035  0.554  1.000  0.441
예산액(천원)  0.000  0.056  0.441  1.000

  회계연도  사업수  예산액(천원)  대상사업유형
회계연도  1.000  0.068  0.110  0.417
사업수  0.068  1.000  0.836  0.258
예산액(천원)  0.110  0.836  1.000  0.032
대상사업유형  0.417  0.258  0.032  1.000
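
The two High correlation alerts near the top of this report can be related to the second matrix in this section. The sketch below scans that matrix for off-diagonal pairs above a cutoff. The cutoff of 0.5 is an assumption made for illustration, since the report does not state the threshold it used, and the code is not taken from the profiling tool:

```python
# Second correlation matrix from this section, as a list of rows.
columns = ["회계연도", "사업수", "예산액(천원)", "대상사업유형"]
matrix = [
    [1.000, 0.068, 0.110, 0.417],
    [0.068, 1.000, 0.836, 0.258],
    [0.110, 0.836, 1.000, 0.032],
    [0.417, 0.258, 0.032, 1.000],
]

THRESHOLD = 0.5  # assumption: the report does not state its cutoff

# Visit each unordered pair once by walking the upper triangle only.
flagged = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= THRESHOLD
]

print(flagged)
```

Under that assumed cutoff the only flagged pair is 사업수 and 예산액(천원) at 0.836, the same pair named in the alerts.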

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      회계연도  자치단체명  대상사업유형  사업수예산액(천원)
0  2021  서울특별시  양성평등정책추진사업  50838262191
1  2021  서울특별시  성별영향평가사업  139484643223
2  2021  서울특별시  자치단체특화사업  1292263029194
3  2021  서울특별시 종로구  양성평등정책추진사업  348744564
4  2021  서울특별시 종로구  성별영향평가사업  2714773437
5  2021  서울특별시 종로구  자치단체특화사업  153270
6  2021  서울특별시 중구  양성평등정책추진사업  1010813709
7  2021  서울특별시 중구  성별영향평가사업  97323932
8  2021  서울특별시 중구  자치단체특화사업  5428100
9  2021  서울특별시 용산구  양성평등정책추진사업  2116715449

      회계연도  자치단체명  대상사업유형  사업수예산액(천원)
4747  2019  경상남도 함양군  자치단체특화사업  1310976248
4748  2019  경상남도 거창군  양성평등정책추진사업  128384892
4749  2019  경상남도 거창군  성별영향분석평가사업  268189226
4750  2019  경상남도 거창군  자치단체특화사업  4725697
4751  2019  경상남도 합천군  양성평등정책추진사업  233200
4752  2019  경상남도 합천군  성별영향분석평가사업  3312523664
4753  2019  경상남도 합천군  자치단체특화사업  22031088
4754  2019  제주특별자치도  양성평등정책추진사업  2543491174
4755  2019  제주특별자치도  성별영향분석평가사업  226128073067
4756  2019  제주특별자치도  자치단체특화사업  233531874