Overview

Dataset statistics

Number of variables: 4
Number of observations: 1215
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.7 KiB
Average record size in memory: 35.1 B

Variable types

Numeric: 2
Categorical: 1
Text: 1

Dataset

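Description: Net carried-over surplus (순세계잉여금) budget information by local government (local government, net carried-over surplus budget amount (general account))
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://www.data.go.kr/data/15066421/fileData.do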

Alerts

연번 (serial number) is highly overall correlated with 연도 (year): High correlation
연도 (year) is highly overall correlated with 연번 (serial number): High correlation
연번 (serial number) has unique values: Unique
예산액(백만원) (budget amount, million KRW) has 24 (2.0%) zeros: Zeros

Reproduction

Analysis started: 2023-12-12 13:04:32.196846
Analysis finished: 2023-12-12 13:04:32.816672
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1215
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 608
Minimum: 1
Maximum: 1215
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 61.7
Q1: 304.5
Median: 608
Q3: 911.5
95th percentile: 1154.3
Maximum: 1215
Range: 1214
Interquartile range (IQR): 607

Descriptive statistics

Standard deviation: 350.8846
Coefficient of variation (CV): 0.57711282
Kurtosis: -1.2
Mean: 608
Median Absolute Deviation (MAD): 304
Skewness: 0
Sum: 738720
Variance: 123120
Monotonicity: Strictly increasing
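
The minimum, maximum, distinct count, and strict monotonicity suggest that 연번 is simply the sequence 1, 2, ..., 1215. Under that assumption (it is not stated by the report itself), a minimal Python sketch using only the standard library recomputes the quantile and descriptive statistics listed here. The variable names are illustrative.

```python
import statistics

# Assumption: 연번 is the plain sequence 1..1215 (unique, strictly increasing).
serial = list(range(1, 1216))

total = sum(serial)
mean = statistics.mean(serial)
median = statistics.median(serial)
variance = statistics.variance(serial)  # sample variance (n - 1 denominator)
std = statistics.stdev(serial)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(x - median) for x in serial)

# Quartiles with linear interpolation between order statistics.
q1, q2, q3 = statistics.quantiles(serial, n=4, method="inclusive")
iqr = q3 - q1

print(total, mean, median, variance, std, cv, mad, q1, q3, iqr)
```

If the printed values agree with the figures listed here, that supports reading 연번 as a plain row counter rather than a measured quantity.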
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1  1  0.1%
809  1  0.1%
816  1  0.1%
815  1  0.1%
814  1  0.1%
813  1  0.1%
812  1  0.1%
811  1  0.1%
810  1  0.1%
808  1  0.1%
Other values (1205)  1205  99.2%

Value  Count  Frequency (%)
1  1  0.1%
2  1  0.1%
3  1  0.1%
4  1  0.1%
5  1  0.1%
6  1  0.1%
7  1  0.1%
8  1  0.1%
9  1  0.1%
10  1  0.1%

Value  Count  Frequency (%)
1215  1  0.1%
1214  1  0.1%
1213  1  0.1%
1212  1  0.1%
1211  1  0.1%
1210  1  0.1%
1209  1  0.1%
1208  1  0.1%
1207  1  0.1%
1206  1  0.1%

연도 (year)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.6 KiB
2015: 243
2016: 243
2017: 243
2018: 243
2019: 243

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2015
3rd row: 2015
4th row: 2015
5th row: 2015

Common Values

Value  Count  Frequency (%)
2015  243  20.0%
2016  243  20.0%
2017  243  20.0%
2018  243  20.0%
2019  243  20.0%

Length

Histogram of lengths of the category

Common Values (Plot)

자치단체 (local government)
Text

Distinct: 243
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.6 KiB

Length

Max length: 10
Median length: 8
Mean length: 7.8024691
Min length: 3

Characters and Unicode

Total characters: 9480
Distinct characters: 139
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
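
As a rough illustration of how such properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count the general category of each character in the five sample values of 자치단체 shown in this report. It is not the profiler's own implementation, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample values of 자치단체 listed in this report.
rows = [
    "서울특별시",
    "서울특별시 종로구",
    "서울특별시 중구",
    "서울특별시 용산구",
    "서울특별시 성동구",
]

# Tally the Unicode general category of every character.
# "Lo" is Other Letter (Hangul syllables), "Zs" is Space Separator.
categories = Counter(unicodedata.category(ch) for row in rows for ch in row)

total_chars = sum(len(row) for row in rows)
mean_length = total_chars / len(rows)

print(categories)
print(total_chars, mean_length)
```

On the full column the same tally would be expected to yield only the two categories the report lists, Other Letter and Space Separator.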

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시 종로구
3rd row: 서울특별시 중구
4th row: 서울특별시 용산구
5th row: 서울특별시 성동구

Value  Count  Frequency (%)
경기도  160  6.8%
서울특별시  130  5.5%
경상북도  120  5.1%
전라남도  115  4.9%
강원도  95  4.1%
경상남도  95  4.1%
부산광역시  85  3.6%
충청남도  80  3.4%
전라북도  75  3.2%
충청북도  60  2.6%
Other values (211)  1330  56.7%
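
These counts appear to be per whitespace-separated word rather than per full value: the ten listed counts plus the 1330 other values sum to 2345, which equals the 1215 rows plus the 1130 spaces reported under Space Separator. The Python sketch below checks that arithmetic and shows the assumed tokenization on the five sample values. The whitespace split is an assumption about how the profiler counts words, not something the report states.

```python
from collections import Counter

# Counts copied from the word table in this report.
top_counts = [160, 130, 120, 115, 95, 95, 85, 80, 75, 60]
other_count = 1330
total_words = sum(top_counts) + other_count

# Each space separates two words, so words = rows + spaces.
n_rows, n_spaces = 1215, 1130
words_from_spaces = n_rows + n_spaces

# Share of the most frequent word (경기도), as a percentage.
top_share = round(100 * top_counts[0] / total_words, 1)

# Assumed tokenization, shown on the five sample values of 자치단체.
rows = [
    "서울특별시",
    "서울특별시 종로구",
    "서울특별시 중구",
    "서울특별시 용산구",
    "서울특별시 성동구",
]
words = Counter(word for row in rows for word in row.split())

print(total_words, words_from_spaces, top_share)
print(words.most_common(1))
```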

Most occurring characters

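Value  Count  Frequency (%)
(space)  1130  11.9%
(not preserved)  830  8.8%
(not preserved)  790  8.3%
(not preserved)  425  4.5%
(not preserved)  415  4.4%
(not preserved)  390  4.1%
(not preserved)  350  3.7%
(not preserved)  335  3.5%
(not preserved)  285  3.0%
(not preserved)  275  2.9%
Other values (129)  4255  44.9%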

Most occurring categories

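Value  Count  Frequency (%)
Other Letter  8350  88.1%
Space Separator  1130  11.9%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not preserved)  830  9.9%
(not preserved)  790  9.5%
(not preserved)  425  5.1%
(not preserved)  415  5.0%
(not preserved)  390  4.7%
(not preserved)  350  4.2%
(not preserved)  335  4.0%
(not preserved)  285  3.4%
(not preserved)  275  3.3%
(not preserved)  225  2.7%
Other values (128)  4030  48.3%

Space Separator
Value  Count  Frequency (%)
(space)  1130  100.0%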

Most occurring scripts

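Value  Count  Frequency (%)
Hangul  8350  88.1%
Common  1130  11.9%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not preserved)  830  9.9%
(not preserved)  790  9.5%
(not preserved)  425  5.1%
(not preserved)  415  5.0%
(not preserved)  390  4.7%
(not preserved)  350  4.2%
(not preserved)  335  4.0%
(not preserved)  285  3.4%
(not preserved)  275  3.3%
(not preserved)  225  2.7%
Other values (128)  4030  48.3%

Common
Value  Count  Frequency (%)
(space)  1130  100.0%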

Most occurring blocks

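Value  Count  Frequency (%)
Hangul  8350  88.1%
ASCII  1130  11.9%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  1130  100.0%

Hangul
Value  Count  Frequency (%)
(not preserved)  830  9.9%
(not preserved)  790  9.5%
(not preserved)  425  5.1%
(not preserved)  415  5.0%
(not preserved)  390  4.7%
(not preserved)  350  4.2%
(not preserved)  335  4.0%
(not preserved)  285  3.4%
(not preserved)  275  3.3%
(not preserved)  225  2.7%
Other values (128)  4030  48.3%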

예산액(백만원) (budget amount, million KRW)
Real number (ℝ)

ZEROS 

Distinct: 591
Distinct (%): 48.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30492.619
Minimum: 0
Maximum: 499500
Zeros: 24
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 3985
Q1: 12000
Median: 20000
Q3: 32000
95th percentile: 100000
Maximum: 499500
Range: 499500
Interquartile range (IQR): 20000

Descriptive statistics

Standard deviation: 39919.528
Coefficient of variation (CV): 1.3091538
Kurtosis: 40.955556
Mean: 30492.619
Median Absolute Deviation (MAD): 9700
Skewness: 5.2232852
Sum: 37048532
Variance: 1.5935687 × 10^9
Monotonicity: Not monotonic
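
Several of these figures follow from one another: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The short Python sketch below recomputes them from the values reported for 예산액(백만원). The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from this report for 예산액(백만원).
n = 1215                # number of observations
total = 37_048_532      # Sum
std = 39_919.528        # Standard deviation
zeros = 24              # Zeros

mean = total / n                 # arithmetic mean
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
zero_share = 100 * zeros / n     # percentage of zero values

print(round(mean, 3), round(cv, 7), f"{variance:.7e}", round(zero_share, 1))
```

A CV above 1, together with the large positive skewness and kurtosis, is consistent with a long right tail: most budget amounts are far below the maximum of 499500.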
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
20000  58  4.8%
10000  50  4.1%
15000  37  3.0%
25000  31  2.6%
30000  29  2.4%
12000  26  2.1%
0  24  2.0%
18000  19  1.6%
13000  18  1.5%
14000  17  1.4%
Other values (581)  906  74.6%

Value  Count  Frequency (%)
0  24  2.0%
1000  1  0.1%
1369  1  0.1%
1393  1  0.1%
1564  1  0.1%
1601  1  0.1%
1737  1  0.1%
1752  1  0.1%
2000  6  0.5%
2145  1  0.1%

Value  Count  Frequency (%)
499500  1  0.1%
447000  1  0.1%
371000  1  0.1%
360000  1  0.1%
330000  1  0.1%
325000  1  0.1%
280000  1  0.1%
225000  1  0.1%
222987  1  0.1%
210000  2  0.2%

Interactions


Correlations

(variable)  연번  연도  예산액(백만원)
연번  1.000  1.000  0.093
연도  1.000  1.000  0.140
예산액(백만원)  0.093  0.140  1.000

(variable)  연번  예산액(백만원)  연도
연번  1.000  0.286  0.998
예산액(백만원)  0.286  1.000  0.058
연도  0.998  0.058  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  연번  연도  자치단체  예산액(백만원)
0  1  2015  서울특별시  0
1  2  2015  서울특별시 종로구  17400
2  3  2015  서울특별시 중구  20500
3  4  2015  서울특별시 용산구  15000
4  5  2015  서울특별시 성동구  12500
5  6  2015  서울특별시 광진구  11748
6  7  2015  서울특별시 동대문구  9270
7  8  2015  서울특별시 중랑구  3050
8  9  2015  서울특별시 성북구  9409
9  10  2015  서울특별시 강북구  13000

(index)  연번  연도  자치단체  예산액(백만원)
1205  1206  2019  경상남도 함안군  38000
1206  1207  2019  경상남도 창녕군  36319
1207  1208  2019  경상남도 고성군  28000
1208  1209  2019  경상남도 남해군  45787
1209  1210  2019  경상남도 하동군  20700
1210  1211  2019  경상남도 산청군  30081
1211  1212  2019  경상남도 함양군  45000
1212  1213  2019  경상남도 거창군  72900
1213  1214  2019  경상남도 합천군  25000
1214  1215  2019  제주특별자치도  100000