Overview

Dataset statistics

Number of variables: 7
Number of observations: 3598
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 210.9 KiB
Average record size in memory: 60.0 B
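
The average record size can be cross-checked from the two figures above it: total memory divided by the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
total_kib = 210.9        # total size in memory, in KiB
n_observations = 3598    # number of rows

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```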

Variable types

Categorical: 3
Text: 1
Numeric: 3

Dataset

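Description: Expenditure settlement status by structure and function (구조별 기능별 세출결산 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=B0SMC68T4DJAL2R1HL4722306449&infSeq=1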

Alerts

회계연도 is highly overall correlated with 시군명 (High correlation)
시군명 is highly overall correlated with 회계연도 (High correlation)
정책사업금액(원) is highly overall correlated with 재무활동비(원) (High correlation)
재무활동비(원) is highly overall correlated with 정책사업금액(원) (High correlation)
시군명 is highly imbalanced (78.2%) (Imbalance)
정책사업금액(원) has 510 (14.2%) zeros (Zeros)
재무활동비(원) has 1105 (30.7%) zeros (Zeros)
행정운영경비(원) has 3315 (92.1%) zeros (Zeros)
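
The percentages in the Zeros and Imbalance alerts can be recomputed from counts that appear in this report. The sketch below makes two assumptions: that the imbalance score is one minus the Shannon entropy of the value counts normalised by log2 of the number of classes, and that 시군명 consists of 3179 `<NA>` rows, three values with 14 rows each and 29 values with 13 rows each. The second assumption is inferred from the Common Values table for 시군명 (3 × 14 + 29 × 13 = 419 remaining rows), not stated in the report. The function names are illustrative.

```python
import math

N_ROWS = 3598  # number of observations reported in the overview


def zero_share(n_zeros: int, n_rows: int = N_ROWS) -> float:
    """Percentage of rows whose value is zero."""
    return 100.0 * n_zeros / n_rows


def imbalance_score(counts: list[int]) -> float:
    """One minus Shannon entropy normalised by log2(number of classes).

    Assumption: this is the definition behind the Imbalance alert.
    """
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(len(counts))


# Zero counts taken from the alerts above.
zero_counts = {"정책사업금액(원)": 510, "재무활동비(원)": 1105, "행정운영경비(원)": 3315}
for column, zeros in zero_counts.items():
    print(column, f"{zero_share(zeros):.1f}%")

# Assumed value counts for 시군명 (see the lead-in): 33 classes, 3598 rows.
sigun_counts = [3179] + [14] * 3 + [13] * 29
print("시군명 imbalance", f"{100 * imbalance_score(sigun_counts):.1f}%")
```

Under these assumptions the computed shares round to the reported 14.2%, 30.7%, 92.1% and 78.2%.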

Reproduction

Analysis started: 2023-12-10 20:59:25.367157
Analysis finished: 2023-12-10 20:59:27.142726
Duration: 1.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
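
A report like this one could be regenerated roughly as follows. This is a sketch, not the configuration actually used: the report does not say how the data was loaded, so the CSV path, output path, and title are hypothetical, and the `config.json` referenced above is not reproduced here. It relies on `ProfileReport` from `ydata_profiling` and its `dataset` metadata argument.

```python
from pathlib import Path

# Metadata as shown in the Dataset section of the report.
DATASET_INFO = {
    "description": "구조별 기능별 세출결산 현황",
    "author": "행정안전부",
    "url": (
        "https://data.gg.go.kr/portal/data/service/selectServicePage.do"
        "?&infId=B0SMC68T4DJAL2R1HL4722306449&infSeq=1"
    ),
}


def build_report(csv_path: Path, out_path: Path) -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module loads without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the data is available as CSV
    report = ProfileReport(df, title="Profiling report", dataset=DATASET_INFO)
    report.to_file(out_path)
```

Calling `build_report(Path("expenditure.csv"), Path("report.html"))` would write the HTML file; both file names are placeholders.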

Variables

회계연도 (fiscal year)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.2 KiB
2020: 3179
2021: 419

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2020 | 3179 | 88.4%
2021 | 419 | 11.6%

Length

Histogram of lengths of the category

시군명 (city/county name)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 33
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 28.2 KiB
<NA>: 3179
경기도: 14
안성시: 14
시흥시: 14
성남시: 13
Other values (28): 364

Length

Max length: 4
Median length: 4
Mean length: 3.8943858
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

Value | Count | Frequency (%)
<NA> | 3179 | 88.4%
경기도 | 14 | 0.4%
안성시 | 14 | 0.4%
시흥시 | 14 | 0.4%
성남시 | 13 | 0.4%
안양시 | 13 | 0.4%
고양시 | 13 | 0.4%
안산시 | 13 | 0.4%
과천시 | 13 | 0.4%
수원시 | 13 | 0.4%
Other values (23) | 299 | 8.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 3179 | 88.4%
안성시 | 14 | 0.4%
시흥시 | 14 | 0.4%
경기도 | 14 | 0.4%
파주시 | 13 | 0.4%
오산시 | 13 | 0.4%
용인시 | 13 | 0.4%
의왕시 | 13 | 0.4%
의정부시 | 13 | 0.4%
이천시 | 13 | 0.4%
Other values (23) | 299 | 8.3%

자치단체명 (local government name)
Text

Distinct: 243
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 28.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.8827126
Min length: 4

Characters and Unicode

Total characters: 17568
Distinct characters: 133
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경기가평군
2nd row: 경기가평군
3rd row: 경기가평군
4th row: 경기가평군
5th row: 경기가평군

Value | Count | Frequency (%)
경기안성시 | 28 | 0.8%
경기본청 | 28 | 0.8%
경기시흥시 | 28 | 0.8%
경기용인시 | 26 | 0.7%
경기가평군 | 26 | 0.7%
경기의왕시 | 26 | 0.7%
경기화성시 | 26 | 0.7%
경기연천군 | 26 | 0.7%
경기오산시 | 26 | 0.7%
경기양주시 | 26 | 0.7%
Other values (233) | 3332 | 92.6%

Most occurring characters

ValueCountFrequency (%)
1441
 
8.2%
1375
 
7.8%
1159
 
6.6%
1088
 
6.2%
956
 
5.4%
851
 
4.8%
746
 
4.2%
592
 
3.4%
537
 
3.1%
507
 
2.9%
Other values (123) 8316
47.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 17568 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 17568 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 17568 | 100.0%

분야명 (sector name)
Categorical

Distinct: 14
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 28.2 KiB
사회복지: 275
예비비: 275
국토및지역개발: 275
교통및물류: 275
산업ㆍ중소기업및에너지: 275
Other values (9): 2223

Length

Max length: 11
Median length: 6
Mean length: 4.7620901
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사회복지
2nd row: 예비비
3rd row: 국토및지역개발
4th row: 교통및물류
5th row: 산업ㆍ중소기업및에너지

Common Values

Value | Count | Frequency (%)
사회복지 | 275 | 7.6%
예비비 | 275 | 7.6%
국토및지역개발 | 275 | 7.6%
교통및물류 | 275 | 7.6%
산업ㆍ중소기업및에너지 | 275 | 7.6%
보건 | 275 | 7.6%
기타 | 275 | 7.6%
환경 | 275 | 7.6%
문화및관광 | 275 | 7.6%
공공질서및안전 | 275 | 7.6%
Other values (4) | 848 | 23.6%

Length

Histogram of lengths of the category

정책사업금액(원) (policy project amount, KRW)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 3089
Distinct (%): 85.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1897853 × 10^11
Minimum: 0
Maximum: 1.6264507 × 10^13
Zeros: 510
Zeros (%): 14.2%
Negative: 0
Negative (%): 0.0%
Memory size: 31.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6.9667308 × 10^9
median: 2.6659389 × 10^10
Q3: 8.346715 × 10^10
95-th percentile: 4.1688911 × 10^11
Maximum: 1.6264507 × 10^13
Range: 1.6264507 × 10^13
Interquartile range (IQR): 7.6500419 × 10^10

Descriptive statistics

Standard deviation: 5.7001432 × 10^11
Coefficient of variation (CV): 4.7909007
Kurtosis: 486.8517
Mean: 1.1897853 × 10^11
Median Absolute Deviation (MAD): 2.573558 × 10^10
Skewness: 19.558506
Sum: 4.2808475 × 10^14
Variance: 3.2491632 × 10^23
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
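
Several of the statistics reported for 정책사업금액(원) are tied together by standard identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, and IQR = Q3 − Q1. The sketch below checks these against the reported figures. The tolerance is an assumption, chosen because the report rounds to about eight significant digits.

```python
import math

# Values reported for 정책사업금액(원) in this section.
n = 3598
total = 4.2808475e14
mean = 1.1897853e11
std = 5.7001432e11
variance = 3.2491632e23
cv = 4.7909007
q1, q3 = 6.9667308e9, 8.346715e10
iqr = 7.6500419e10

# Each entry pairs a recomputed value with the value printed in the report.
checks = {
    "mean = sum / n": (total / n, mean),
    "variance = std ** 2": (std ** 2, variance),
    "CV = std / mean": (std / mean, cv),
    "IQR = Q3 - Q1": (q3 - q1, iqr),
}

for name, (computed, reported) in checks.items():
    # Assumed tolerance: the report shows roughly eight significant digits.
    ok = math.isclose(computed, reported, rel_tol=1e-6)
    print(f"{name}: computed={computed:.7e} reported={reported:.7e} match={ok}")
```

The same identities can be applied to the other two numeric variables by swapping in their figures.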
Value | Count | Frequency (%)
0 | 510 | 14.2%
2667539870 | 1 | < 0.1%
4609219960 | 1 | < 0.1%
59689284014 | 1 | < 0.1%
9832158250 | 1 | < 0.1%
24429903514 | 1 | < 0.1%
46999253225 | 1 | < 0.1%
526850771840 | 1 | < 0.1%
341192876510 | 1 | < 0.1%
260296830000 | 1 | < 0.1%
Other values (3079) | 3079 | 85.6%

Value | Count | Frequency (%)
0 | 510 | 14.2%
7137270 | 1 | < 0.1%
11504573 | 1 | < 0.1%
13633004 | 1 | < 0.1%
22249400 | 1 | < 0.1%
46580130 | 1 | < 0.1%
50000000 | 1 | < 0.1%
67870775 | 1 | < 0.1%
81306600 | 1 | < 0.1%
81653869 | 1 | < 0.1%

Value | Count | Frequency (%)
16264507488566 | 1 | < 0.1%
15537485959380 | 1 | < 0.1%
15251990201970 | 1 | < 0.1%
7150857028470 | 1 | < 0.1%
6807507397877 | 1 | < 0.1%
6230472305380 | 1 | < 0.1%
5002884354290 | 1 | < 0.1%
4755103109213 | 1 | < 0.1%
4622611607109 | 1 | < 0.1%
4055877133270 | 1 | < 0.1%

재무활동비(원) (financial activity expenses, KRW)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 2483
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.2129513 × 10^9
Minimum: 0
Maximum: 1.6091427 × 10^12
Zeros: 1105
Zeros (%): 30.7%
Negative: 0
Negative (%): 0.0%
Memory size: 31.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2.9255152 × 10^8
Q3: 3.1116686 × 10^9
95-th percentile: 2.9145809 × 10^10
Maximum: 1.6091427 × 10^12
Range: 1.6091427 × 10^12
Interquartile range (IQR): 3.1116686 × 10^9

Descriptive statistics

Standard deviation: 6.2626403 × 10^10
Coefficient of variation (CV): 6.7976483
Kurtosis: 361.47534
Mean: 9.2129513 × 10^9
Median Absolute Deviation (MAD): 2.9255152 × 10^8
Skewness: 17.666689
Sum: 3.3148199 × 10^13
Variance: 3.9220663 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1105 | 30.7%
200000000 | 5 | 0.1%
100000000 | 4 | 0.1%
2000000000 | 2 | 0.1%
46000000 | 2 | 0.1%
1000000000 | 2 | 0.1%
500000000 | 2 | 0.1%
49351220 | 1 | < 0.1%
640018320 | 1 | < 0.1%
219852070 | 1 | < 0.1%
Other values (2473) | 2473 | 68.7%

Value | Count | Frequency (%)
0 | 1105 | 30.7%
450 | 1 | < 0.1%
1420 | 1 | < 0.1%
1830 | 1 | < 0.1%
14102 | 1 | < 0.1%
41000 | 1 | < 0.1%
55890 | 1 | < 0.1%
100000 | 1 | < 0.1%
111370 | 1 | < 0.1%
121960 | 1 | < 0.1%

Value | Count | Frequency (%)
1609142726220 | 1 | < 0.1%
1390249130280 | 1 | < 0.1%
1306795548894 | 1 | < 0.1%
1268279166210 | 1 | < 0.1%
1157809301670 | 1 | < 0.1%
1079285988680 | 1 | < 0.1%
1065462980000 | 1 | < 0.1%
540930594660 | 1 | < 0.1%
501885994275 | 1 | < 0.1%
498380924008 | 1 | < 0.1%

행정운영경비(원) (administrative operating expenses, KRW)
Real number (ℝ)

ZEROS

Distinct: 284
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.058766 × 10^10
Minimum: 0
Maximum: 1.8064021 × 10^12
Zeros: 3315
Zeros (%): 92.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 6.8660865 × 10^10
Maximum: 1.8064021 × 10^12
Range: 1.8064021 × 10^12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.1415506 × 10^10
Coefficient of variation (CV): 5.8006684
Kurtosis: 305.68147
Mean: 1.058766 × 10^10
Median Absolute Deviation (MAD): 0
Skewness: 14.374916
Sum: 3.8094402 × 10^13
Variance: 3.7718644 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 3315 | 92.1%
50267091205 | 1 | < 0.1%
114289071480 | 1 | < 0.1%
17625440 | 1 | < 0.1%
134680841850 | 1 | < 0.1%
342935982068 | 1 | < 0.1%
354883500990 | 1 | < 0.1%
50205180972 | 1 | < 0.1%
51742064112 | 1 | < 0.1%
62188498550 | 1 | < 0.1%
Other values (274) | 274 | 7.6%

Value | Count | Frequency (%)
0 | 3315 | 92.1%
16300000 | 1 | < 0.1%
17625440 | 1 | < 0.1%
48084900 | 1 | < 0.1%
51588360 | 1 | < 0.1%
55048970 | 1 | < 0.1%
121103040 | 1 | < 0.1%
350572980 | 1 | < 0.1%
473105965 | 1 | < 0.1%
34806859358 | 1 | < 0.1%

Value | Count | Frequency (%)
1806402128209 | 1 | < 0.1%
1201893169769 | 1 | < 0.1%
1163356838054 | 1 | < 0.1%
811385729419 | 1 | < 0.1%
675321103128 | 1 | < 0.1%
661413608021 | 1 | < 0.1%
652202456580 | 1 | < 0.1%
595886493434 | 1 | < 0.1%
522886456277 | 1 | < 0.1%
506920399040 | 1 | < 0.1%

Interactions

Correlations

Variable | 회계연도 | 시군명 | 분야명 | 정책사업금액(원) | 재무활동비(원) | 행정운영경비(원)
회계연도 | 1.000 | NaN | 0.000 | 0.000 | 0.053 | 0.060
시군명 | NaN | 1.000 | 0.000 | 0.248 | 0.240 | 0.000
분야명 | 0.000 | 0.000 | 1.000 | 0.144 | 0.089 | 0.363
정책사업금액(원) | 0.000 | 0.248 | 0.144 | 1.000 | 0.696 | 0.000
재무활동비(원) | 0.053 | 0.240 | 0.089 | 0.696 | 1.000 | 0.000
행정운영경비(원) | 0.060 | 0.000 | 0.363 | 0.000 | 0.000 | 1.000

Variable | 회계연도 | 분야명 | 시군명
회계연도 | 1.000 | 0.000 | 1.000
분야명 | 0.000 | 1.000 | 0.000
시군명 | 1.000 | 0.000 | 1.000

Variable | 정책사업금액(원) | 재무활동비(원) | 행정운영경비(원) | 회계연도 | 시군명 | 분야명
정책사업금액(원) | 1.000 | 0.584 | -0.398 | 0.000 | 0.115 | 0.071
재무활동비(원) | 0.584 | 1.000 | -0.061 | 0.040 | 0.111 | 0.039
행정운영경비(원) | -0.398 | -0.061 | 1.000 | 0.064 | 0.000 | 0.142
회계연도 | 0.000 | 0.040 | 0.064 | 1.000 | 1.000 | 0.000
시군명 | 0.115 | 0.111 | 0.000 | 1.000 | 1.000 | 0.000
분야명 | 0.071 | 0.039 | 0.142 | 0.000 | 0.000 | 1.000
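
The 1.000 between 회계연도 and 시군명 in the categorical-only matrix is what a perfect association would produce. The sketch below computes plain (uncorrected) Cramér's V on an assumed cross-tabulation in which every 2020 row has `<NA>` for 시군명 and the 419 rows from 2021 are spread over 32 named values. That layout is inferred from the value counts and sample rows in this report, not stated by it, and the report does not say which association measure or correction it applies, so this is only an illustration of why the value reaches 1.

```python
import math


def cramers_v(table: list[list[int]]) -> float:
    """Uncorrected Cramér's V for a contingency table given as rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Assumed cross-tabulation: rows are 회계연도 (2020, 2021), columns are 시군명.
named_counts = [14] * 3 + [13] * 29   # 32 named values, 419 rows in total
table = [
    [3179] + [0] * 32,   # 2020: only "<NA>"
    [0] + named_counts,  # 2021: only named values
]
print(round(cramers_v(table), 3))
```

For comparison, a table with no association such as `[[10, 10], [10, 10]]` gives 0 under the same function.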

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row | 회계연도 | 시군명 | 자치단체명 | 분야명 | 정책사업금액(원), 재무활동비(원), 행정운영경비(원) (values run together)
0 | 2021 | 가평군 | 경기가평군 | 사회복지 | 14607493130944517950400
1 | 2021 | 가평군 | 경기가평군 | 예비비 | 000
2 | 2021 | 가평군 | 경기가평군 | 국토및지역개발 | 5788243122436487686500
3 | 2021 | 가평군 | 경기가평군 | 교통및물류 | 3085671194000
4 | 2021 | 가평군 | 경기가평군 | 산업ㆍ중소기업및에너지 | 233513901909813605900
5 | 2021 | 가평군 | 경기가평군 | 농림해양수산 | 618723980418948640400
6 | 2021 | 가평군 | 경기가평군 | 보건 | 68596811407104825300
7 | 2021 | 가평군 | 경기가평군 | 기타 | 0294136984064811992010
8 | 2021 | 가평군 | 경기가평군 | 환경 | 104106275499159904002700
9 | 2021 | 가평군 | 경기가평군 | 문화및관광 | 5105568897820639524270

Row | 회계연도 | 시군명 | 자치단체명 | 분야명 | 정책사업금액(원), 재무활동비(원), 행정운영경비(원) (values run together)
3588 | 2020 | <NA> | 부산금정구 | 환경 | 15835994190427167400
3589 | 2020 | <NA> | 부산금정구 | 사회복지 | 35141319253565519367500
3590 | 2020 | <NA> | 부산금정구 | 보건 | 93827697206511577900
3591 | 2020 | <NA> | 부산금정구 | 농림해양수산 | 6074301760172371000
3592 | 2020 | <NA> | 부산금정구 | 산업ㆍ중소기업및에너지 | 1360510481000
3593 | 2020 | <NA> | 부산금정구 | 교통및물류 | 253658758605385265000
3594 | 2020 | <NA> | 부산금정구 | 국토및지역개발 | 23238114100635153500
3595 | 2020 | <NA> | 부산금정구 | 예비비 | 000
3596 | 2020 | <NA> | 부산금정구 | 기타 | 0068630066650
3597 | 2020 | <NA> | 부산강서구 | 일반공공행정 | 8298438562660102760910