Overview

Dataset statistics

Number of variables: 7
Number of observations: 4978
Missing cells: 9956
Missing cells (%): 28.6%
Duplicate rows: 20
Duplicate rows (%): 0.4%
Total size in memory: 296.7 KiB
Average record size in memory: 61.0 B
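
The percentages and the average record size follow from the raw counts. The sketch below recomputes them from the figures in this block. It assumes that "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 7
n_observations = 4978
missing_cells = 9956
duplicate_rows = 20
total_size_kib = 296.7


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

The 9956 missing cells are exactly 2 × 4978, which matches the two fully empty columns flagged in the alerts.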

Variable types

Categorical: 3
Unsupported: 2
Numeric: 2

Dataset

Description: Expenditure budget status by function and by account (기능별 회계별 세출예산 현황)
Author: 행정안전부 (Ministry of the Interior and Safety)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=4M2LWOK0S7DHBB5VGCSH22385240&infSeq=1

Alerts

회계연도 has constant value "2022" (Constant)
Dataset has 20 (0.4%) duplicate rows (Duplicates)
세출예산총계액(원) is highly overall correlated with 세출예산순계액(원) (High correlation)
세출예산순계액(원) is highly overall correlated with 세출예산총계액(원) (High correlation)
시군명 has 4978 (100.0%) missing values (Missing)
자치단체명 has 4978 (100.0%) missing values (Missing)
시군명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
자치단체명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
세출예산총계액(원) has 129 (2.6%) zeros (Zeros)
세출예산순계액(원) has 140 (2.8%) zeros (Zeros)
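
Alerts of the Constant, Missing and Zeros kind can be derived from simple per-column counts. The following is a minimal sketch with toy rules and toy data. It is not the profiling library's implementation, and `describe_column` is a hypothetical helper introduced only for illustration.

```python
from typing import List, Sequence


def describe_column(name: str, values: Sequence[object]) -> List[str]:
    """Return alert-style messages for one column (toy rules, assumed)."""
    n = len(values)
    alerts = []
    present = [v for v in values if v is not None]

    # Missing: None stands in for a missing cell in this sketch.
    n_missing = n - len(present)
    if n_missing:
        alerts.append(
            f"{name} has {n_missing} ({100 * n_missing / n:.1f}%) missing values"
        )

    # Constant: exactly one distinct non-missing value.
    if present and len(set(present)) == 1:
        alerts.append(f'{name} has constant value "{present[0]}"')

    # Zeros: count of cells equal to numeric zero.
    n_zeros = sum(1 for v in present if v == 0)
    if n_zeros:
        alerts.append(f"{name} has {n_zeros} ({100 * n_zeros / n:.1f}%) zeros")

    return alerts


# Toy columns, not taken from the dataset.
for column, data in {
    "year": ["2022"] * 4,
    "city": [None] * 4,
    "amount": [0, 500, 0, 1200],
}.items():
    for message in describe_column(column, data):
        print(message)
```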

Reproduction

Analysis started: 2023-12-10 22:50:11.531520
Analysis finished: 2023-12-10 22:50:12.304399
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
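
The reported duration is the difference between the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-10 22:50:11.531520")
finished = datetime.fromisoformat("2023-12-10 22:50:12.304399")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```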

Variables

회계연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 39.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 4978 | 100.0%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

시군명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4978
Missing (%): 100.0%
Memory size: 43.9 KiB

자치단체명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4978
Missing (%): 100.0%
Memory size: 43.9 KiB

회계구분명
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 39.0 KiB

일반회계: 3180
기타특별회계: 1479
공기업특별회계: 319

Length

Max length: 7
Median length: 4
Mean length: 4.7864604
Min length: 4
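
The length statistics can be reproduced from the three category counts alone. The sketch below assumes that length means the number of characters in each category label, as counted by Python's `len`.

```python
from statistics import median

# Category counts copied from the 회계구분명 block.
counts = {"일반회계": 3180, "기타특별회계": 1479, "공기업특별회계": 319}

total = sum(counts.values())

# Mean length is the count-weighted average of the label lengths.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

# Expand to one length per row to take the median, minimum and maximum.
lengths = sorted(len(label) for label, n in counts.items() for _ in range(n))
median_length = median(lengths)

print(f"Max length: {max(lengths)}")
print(f"Median length: {median_length:g}")
print(f"Mean length: {mean_length:.7f}")
print(f"Min length: {min(lengths)}")
```

Because 일반회계 (4 characters) covers more than half of the rows, the median sits at 4 while the longer labels pull the mean up to about 4.79.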

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타특별회계
2nd row: 기타특별회계
3rd row: 기타특별회계
4th row: 기타특별회계
5th row: 일반회계

Common Values

Value | Count | Frequency (%)
일반회계 | 3180 | 63.9%
기타특별회계 | 1479 | 29.7%
공기업특별회계 | 319 | 6.4%

[Figure: histogram of lengths of the category]
[Figure: common values plot]

분야명
Categorical

Distinct: 14
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 39.0 KiB

기타: 586
환경: 550
사회복지: 486
국토및지역개발: 482
교통및물류: 429
Other values (9): 2445

Length

Max length: 11
Median length: 6
Mean length: 4.674769
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 산업ㆍ중소기업및에너지
2nd row: 교통및물류
3rd row: 국토및지역개발
4th row: 기타
5th row: 일반공공행정

Common Values

Value | Count | Frequency (%)
기타 | 586 | 11.8%
환경 | 550 | 11.0%
사회복지 | 486 | 9.8%
국토및지역개발 | 482 | 9.7%
교통및물류 | 429 | 8.6%
산업ㆍ중소기업및에너지 | 383 | 7.7%
일반공공행정 | 342 | 6.9%
예비비 | 321 | 6.4%
농림해양수산 | 311 | 6.2%
문화및관광 | 278 | 5.6%
Other values (4) | 810 | 16.3%

[Figure: histogram of lengths of the category]

세출예산총계액(원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 4830
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0374365 × 10^10
Minimum: 0
Maximum: 1.1630777 × 10^13
Zeros: 129
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 43.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 74685000
Q1: 3.3939002 × 10^9
Median: 1.5738563 × 10^10
Q3: 5.7658312 × 10^10
95-th percentile: 3.1248506 × 10^11
Maximum: 1.1630777 × 10^13
Range: 1.1630777 × 10^13
Interquartile range (IQR): 5.4264412 × 10^10

Descriptive statistics

Standard deviation: 3.6129902 × 10^11
Coefficient of variation (CV): 4.4952022
Kurtosis: 504.1835
Mean: 8.0374365 × 10^10
Median Absolute Deviation (MAD): 1.4935184 × 10^10
Skewness: 19.203051
Sum: 4.0010359 × 10^14
Variance: 1.3053698 × 10^23
Monotonicity: Not monotonic
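
Several of the figures above are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, and IQR = Q3 − Q1. The sketch below checks these using the rounded values reported for 세출예산총계액(원), so agreement is only expected to within rounding.

```python
import math

# Reported values for 세출예산총계액(원), copied from the report.
n = 4978
total_sum = 4.0010359e14
mean = 8.0374365e10
std = 3.6129902e11
q1 = 3.3939002e9
q3 = 5.7658312e10

derived_mean = total_sum / n   # mean = sum / n
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance = std squared
iqr = q3 - q1                  # interquartile range

checks = {
    "Mean": (derived_mean, 8.0374365e10),
    "CV": (cv, 4.4952022),
    "Variance": (variance, 1.3053698e23),
    "IQR": (iqr, 5.4264412e10),
}
for label, (derived, reported) in checks.items():
    ok = math.isclose(derived, reported, rel_tol=1e-6)
    print(f"{label}: derived={derived:.7e} reported={reported:.7e} match={ok}")
```

A CV above 4 together with a skewness of about 19 indicates a heavily right-skewed distribution, which is why the mean is several times larger than the median.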
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 129 | 2.6%
500000000 | 4 | 0.1%
4000000000 | 3 | 0.1%
40000000 | 3 | 0.1%
70326000 | 2 | < 0.1%
658700000 | 2 | < 0.1%
983000000 | 2 | < 0.1%
200000000 | 2 | < 0.1%
35690000 | 2 | < 0.1%
5000000000 | 2 | < 0.1%
Other values (4820) | 4827 | 97.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 129 | 2.6%
142000 | 1 | < 0.1%
480000 | 1 | < 0.1%
550000 | 1 | < 0.1%
620000 | 1 | < 0.1%
1100000 | 2 | < 0.1%
1200000 | 1 | < 0.1%
1400000 | 1 | < 0.1%
1500000 | 1 | < 0.1%
1834000 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
11630777199000 | 1 | < 0.1%
11536486863000 | 1 | < 0.1%
7315585320000 | 1 | < 0.1%
7293472874000 | 1 | < 0.1%
4858905122000 | 1 | < 0.1%
4416741457000 | 1 | < 0.1%
4315832075000 | 1 | < 0.1%
4275025713000 | 1 | < 0.1%
3766900950000 | 1 | < 0.1%
3753077307000 | 1 | < 0.1%

세출예산순계액(원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 4819
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7916497 × 10^10
Minimum: 0
Maximum: 4.2845248 × 10^12
Zeros: 140
Zeros (%): 2.8%
Negative: 0
Negative (%): 0.0%
Memory size: 43.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 61175800
Q1: 2.9762232 × 10^9
Median: 1.5128546 × 10^10
Q3: 5.3628243 × 10^10
95-th percentile: 2.3788497 × 10^11
Maximum: 4.2845248 × 10^12
Range: 4.2845248 × 10^12
Interquartile range (IQR): 5.065202 × 10^10

Descriptive statistics

Standard deviation: 1.6299997 × 10^11
Coefficient of variation (CV): 2.8143963
Kurtosis: 299.36619
Mean: 5.7916497 × 10^10
Median Absolute Deviation (MAD): 1.4405578 × 10^10
Skewness: 13.697535
Sum: 2.8830832 × 10^14
Variance: 2.6568991 × 10^22
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 140 | 2.8%
4000000000 | 4 | 0.1%
500000000 | 4 | 0.1%
900000000 | 2 | < 0.1%
1394000000 | 2 | < 0.1%
300000000 | 2 | < 0.1%
200000000 | 2 | < 0.1%
1100000 | 2 | < 0.1%
100000000 | 2 | < 0.1%
658700000 | 2 | < 0.1%
Other values (4809) | 4816 | 96.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 140 | 2.8%
142000 | 1 | < 0.1%
480000 | 1 | < 0.1%
550000 | 1 | < 0.1%
620000 | 1 | < 0.1%
1079000 | 1 | < 0.1%
1100000 | 2 | < 0.1%
1200000 | 1 | < 0.1%
1400000 | 1 | < 0.1%
1500000 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4284524829000 | 1 | < 0.1%
4256402043000 | 1 | < 0.1%
4222695135000 | 1 | < 0.1%
3046541727000 | 1 | < 0.1%
1939122949000 | 1 | < 0.1%
1774328000000 | 1 | < 0.1%
1424832455000 | 1 | < 0.1%
1242802023000 | 1 | < 0.1%
1201789553000 | 1 | < 0.1%
1161531276000 | 1 | < 0.1%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable | 회계구분명 | 분야명 | 세출예산총계액(원) | 세출예산순계액(원)
회계구분명 | 1.000 | 0.521 | 0.029 | 0.066
분야명 | 0.521 | 1.000 | 0.142 | 0.272
세출예산총계액(원) | 0.029 | 0.142 | 1.000 | 0.848
세출예산순계액(원) | 0.066 | 0.272 | 0.848 | 1.000

[Figure: correlation heatmap]

Variable | 회계구분명 | 분야명
회계구분명 | 1.000 | 0.340
분야명 | 0.340 | 1.000

[Figure: correlation heatmap]

Variable | 세출예산총계액(원) | 세출예산순계액(원) | 회계구분명 | 분야명
세출예산총계액(원) | 1.000 | 0.985 | 0.019 | 0.053
세출예산순계액(원) | 0.985 | 1.000 | 0.044 | 0.104
회계구분명 | 0.019 | 0.044 | 1.000 | 0.340
분야명 | 0.053 | 0.104 | 0.340 | 1.000
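
The report does not state which coefficient each matrix uses. As an illustration of how a linear correlation between the two budget columns can be computed, the sketch below applies a hand-written Pearson coefficient to the first ten rows of the Sample section. This is an assumption-laden toy: ten rows are not expected to reproduce the 0.848 or 0.985 reported over all 4978 observations, and `pearson` is a hypothetical helper.

```python
import math
from typing import Sequence

# First ten rows of the Sample section: gross total and net total (won).
gross = [
    654360000, 54641362000, 54372298000, 3160544000, 141801549000,
    55778668000, 37541565000, 191499693000, 171014452000, 1212800416000,
]
net = [
    654360000, 54641362000, 54372298000, 3160544000, 141801549000,
    27733668000, 37541565000, 191496693000, 169514452000, 1201789553000,
]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(gross, net)
print(f"Pearson r on the ten sample rows: {r:.4f}")
```

On such skewed data a single very large row dominates the result, which is one reason a rank-based coefficient can differ noticeably from a linear one.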

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

First rows

Index | 회계연도 | 시군명 | 자치단체명 | 회계구분명 | 분야명 | 세출예산총계액(원) | 세출예산순계액(원)
0 | 2022 | <NA> | <NA> | 기타특별회계 | 산업ㆍ중소기업및에너지 | 654360000 | 654360000
1 | 2022 | <NA> | <NA> | 기타특별회계 | 교통및물류 | 54641362000 | 54641362000
2 | 2022 | <NA> | <NA> | 기타특별회계 | 국토및지역개발 | 54372298000 | 54372298000
3 | 2022 | <NA> | <NA> | 기타특별회계 | 기타 | 3160544000 | 3160544000
4 | 2022 | <NA> | <NA> | 일반회계 | 일반공공행정 | 141801549000 | 141801549000
5 | 2022 | <NA> | <NA> | 일반회계 | 공공질서및안전 | 55778668000 | 27733668000
6 | 2022 | <NA> | <NA> | 일반회계 | 교육 | 37541565000 | 37541565000
7 | 2022 | <NA> | <NA> | 일반회계 | 문화및관광 | 191499693000 | 191496693000
8 | 2022 | <NA> | <NA> | 일반회계 | 환경 | 171014452000 | 169514452000
9 | 2022 | <NA> | <NA> | 일반회계 | 사회복지 | 1212800416000 | 1201789553000

Last rows

Index | 회계연도 | 시군명 | 자치단체명 | 회계구분명 | 분야명 | 세출예산총계액(원) | 세출예산순계액(원)
4968 | 2022 | <NA> | <NA> | 기타특별회계 | 기타 | 43673000 | 43673000
4969 | 2022 | <NA> | <NA> | 일반회계 | 일반공공행정 | 17942626000 | 17910546000
4970 | 2022 | <NA> | <NA> | 일반회계 | 공공질서및안전 | 13118626000 | 10992856000
4971 | 2022 | <NA> | <NA> | 일반회계 | 교육 | 1833644000 | 1833644000
4972 | 2022 | <NA> | <NA> | 일반회계 | 문화및관광 | 28119808000 | 27867808000
4973 | 2022 | <NA> | <NA> | 일반회계 | 환경 | 34805467000 | 25583496000
4974 | 2022 | <NA> | <NA> | 일반회계 | 사회복지 | 79525912000 | 79085057000
4975 | 2022 | <NA> | <NA> | 일반회계 | 보건 | 8389296000 | 8389296000
4976 | 2022 | <NA> | <NA> | 일반회계 | 농림해양수산 | 108972536000 | 108461516000
4977 | 2022 | <NA> | <NA> | 일반회계 | 산업ㆍ중소기업및에너지 | 16073122000 | 15674158000

Duplicate rows

Most frequently occurring

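Index | 회계연도 | 회계구분명 | 분야명 | 세출예산총계액(원) | 세출예산순계액(원) | # duplicates
17 | 2022 | 기타특별회계 | 일반공공행정 | 0 | 0 | 33
13 | 2022 | 기타특별회계 | 산업ㆍ중소기업및에너지 | 0 | 0 | 16
5 | 2022 | 기타특별회계 | 공공질서및안전 | 0 | 0 | 9
15 | 2022 | 기타특별회계 | 예비비 | 0 | 0 | 9
18 | 2022 | 기타특별회계 | 환경 | 0 | 0 | 9
6 | 2022 | 기타특별회계 | 국토및지역개발 | 0 | 0 | 8
7 | 2022 | 기타특별회계 | 기타 | 0 | 0 | 8
10 | 2022 | 기타특별회계 | 농림해양수산 | 0 | 0 | 7
2 | 2022 | 공기업특별회계 | 예비비 | 0 | 0 | 5
3 | 2022 | 공기업특별회계 | 일반공공행정 | 0 | 0 | 5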
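
A table of this kind can be built by counting identical rows and keeping those that occur more than once. The sketch below uses `collections.Counter` on a handful of toy rows modelled on the table above. The repeat counts are invented for illustration and are not the dataset's.

```python
from collections import Counter

# Toy rows: (회계연도, 회계구분명, 분야명, gross total, net total).
# Repeat counts here are illustrative, not taken from the dataset.
rows = [
    ("2022", "기타특별회계", "일반공공행정", 0, 0),
    ("2022", "기타특별회계", "일반공공행정", 0, 0),
    ("2022", "기타특별회계", "일반공공행정", 0, 0),
    ("2022", "공기업특별회계", "예비비", 0, 0),
    ("2022", "공기업특별회계", "예비비", 0, 0),
    ("2022", "일반회계", "교육", 37541565000, 37541565000),
]

# Count identical rows, then keep only those seen more than once,
# ordered from most to least frequent.
counts = Counter(rows)
duplicated = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicated:
    print(*row, n, sep=" | ")
```

All duplicated rows listed in the report have zero in both amount columns, so they likely correspond to empty budget lines rather than repeated real entries.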