Overview

Dataset statistics

Number of variables: 4
Number of observations: 6207
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 212.3 KiB
Average record size in memory: 35.0 B
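
The memory figures above are consistent with each other: dividing the total size by the number of observations gives the average record size. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and do not come from the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 6207
missing_cells = 0
total_size_kib = 212.3  # assumption: 1 KiB = 1024 bytes

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```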

Variable types

Numeric: 2
Categorical: 2

Dataset

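Description: 회계년도 (fiscal year), 참여예산위원코드 (participatory budget committee member code), 위원코드구분 (member code category), 구별 참여예산 제안사업 투표여부 (whether the member voted on district-level participatory budget proposals)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15710/S/1/datasetView.do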

Alerts

회계년도 is highly overall correlated with 구별 참여예산 제안사업 투표여부 [High correlation]
참여예산위원코드 is highly overall correlated with 위원코드구분 and 1 other field [High correlation]
위원코드구분 is highly overall correlated with 참여예산위원코드 [High correlation]
구별 참여예산 제안사업 투표여부 is highly overall correlated with 회계년도 and 1 other field [High correlation]
위원코드구분 is highly imbalanced (51.3%) [Imbalance]
참여예산위원코드 is highly skewed (γ1 = 24.87223492) [Skewed]
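
The report does not state how the 51.3% imbalance figure is derived. One definition that reproduces it from the category counts of 위원코드구분 is one minus the normalized Shannon entropy. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper, not a function from the profiling library.

```python
import math

# Category counts for 위원코드구분, copied from its Common Values table.
counts = {"<NA>": 3136, "1": 2966, "4": 52, "3": 31, "2": 22}


def imbalance_score(category_counts):
    """Return 1 - H / log2(k): 0 for a uniform column, near 1 when one category dominates."""
    total = sum(category_counts.values())
    k = len(category_counts)
    if k < 2:
        return 0.0  # assumption: a single-category column is scored 0
    entropy = -sum(
        (c / total) * math.log2(c / total)
        for c in category_counts.values()
        if c > 0
    )
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Two categories hold about 98% of the rows, so the entropy is roughly half of the maximum possible for five categories.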

Reproduction

Analysis started: 2024-05-11 08:24:05.210010
Analysis finished: 2024-05-11 08:24:06.850719
Duration: 1.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
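
The reported duration follows directly from the two timestamps. A small check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2024-05-11 08:24:05.210010")
finished = datetime.fromisoformat("2024-05-11 08:24:06.850719")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```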

Variables

회계년도 (fiscal year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.091
Minimum: 2012
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 54.7 KiB

Quantile statistics

Minimum: 2012
5-th percentile: 2013
Q1: 2017
Median: 2018
Q3: 2020
95-th percentile: 2023
Maximum: 2024
Range: 12
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.8189992
Coefficient of variation (CV): 0.0013968642
Kurtosis: -0.26771106
Mean: 2018.091
Median Absolute Deviation (MAD): 2
Skewness: -0.12680304
Sum: 12526291
Variance: 7.9467564
Monotonicity: Decreasing
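
Several of the descriptive statistics above are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A quick consistency check on the reported figures (tolerances allow for the rounding in the report):

```python
import math

# Figures copied from the report.
n = 6207
total = 12526291
std = 2.8189992
reported_mean = 2018.091
reported_variance = 7.9467564
reported_cv = 0.0013968642

mean = total / n
variance = std ** 2
cv = std / mean

print(math.isclose(mean, reported_mean, rel_tol=1e-6))
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
print(math.isclose(cv, reported_cv, rel_tol=1e-6))
```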
[Figure: Histogram with fixed size bins (bins=13)]
Common values
Value             Count  Frequency (%)
2018              1230   19.8%
2017              1132   18.2%
2021              803    12.9%
2020              737    11.9%
2019              380    6.1%
2015              361    5.8%
2016              301    4.8%
2014              249    4.0%
2013              247    4.0%
2012              235    3.8%
Other values (3)  532    8.6%

Minimum 10 values
Value  Count  Frequency (%)
2012   235    3.8%
2013   247    4.0%
2014   249    4.0%
2015   361    5.8%
2016   301    4.8%
2017   1132   18.2%
2018   1230   19.8%
2019   380    6.1%
2020   737    11.9%
2021   803    12.9%

Maximum 10 values
Value  Count  Frequency (%)
2024   205    3.3%
2023   222    3.6%
2022   105    1.7%
2021   803    12.9%
2020   737    11.9%
2019   380    6.1%
2018   1230   19.8%
2017   1132   18.2%
2016   301    4.8%
2015   361    5.8%
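
Because 회계년도 has only 13 distinct values, the value tables above contain the full frequency distribution, and the summary statistics can be rebuilt from it. The sketch below assumes the sample variance (dividing by n - 1) and linearly interpolated quantiles; both choices reproduce the reported figures, but the report itself does not name its methods. `value_at` and `quantile` are hypothetical helpers.

```python
from bisect import bisect_right
from itertools import accumulate

# Fiscal-year counts assembled from the value tables above.
counts = {
    2012: 235, 2013: 247, 2014: 249, 2015: 361, 2016: 301,
    2017: 1132, 2018: 1230, 2019: 380, 2020: 737, 2021: 803,
    2022: 105, 2023: 222, 2024: 205,
}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n
# Sample variance: divide by n - 1 (assumption that matches the report).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)

values = sorted(counts)
cumulative = list(accumulate(counts[v] for v in values))


def value_at(index):
    """Return the value at a 0-based position in the sorted data."""
    return values[bisect_right(cumulative, index)]


def quantile(q):
    """Linearly interpolated quantile over the sorted data."""
    position = q * (n - 1)
    lower = int(position)
    upper = min(lower + 1, n - 1)
    fraction = position - lower
    return value_at(lower) + fraction * (value_at(upper) - value_at(lower))


print(n, round(mean, 3), round(variance, 7))
print([quantile(q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)])
```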

참여예산위원코드 (participatory budget committee member code)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 2650
Distinct (%): 42.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1134.3491
Minimum: 0
Maximum: 99999
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 54.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 39
Q1: 212
Median: 512
Q3: 1099.5
95-th percentile: 5244.4
Maximum: 99999
Range: 99999
Interquartile range (IQR): 887.5

Descriptive statistics

Standard deviation: 3164.6585
Coefficient of variation (CV): 2.7898453
Kurtosis: 766.18933
Mean: 1134.3491
Median Absolute Deviation (MAD): 352
Skewness: 24.872235
Sum: 7040905
Variance: 10015064
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value                Count  Frequency (%)
63                   8      0.1%
27                   8      0.1%
48                   8      0.1%
47                   8      0.1%
46                   8      0.1%
43                   8      0.1%
42                   8      0.1%
41                   8      0.1%
40                   8      0.1%
39                   8      0.1%
Other values (2640)  6127   98.7%

Minimum 10 values
Value  Count  Frequency (%)
0      3      < 0.1%
1      8      0.1%
2      8      0.1%
3      8      0.1%
4      8      0.1%
5      8      0.1%
6      8      0.1%
7      8      0.1%
8      8      0.1%
9      8      0.1%

Maximum 10 values
Value  Count  Frequency (%)
99999  2      < 0.1%
99998  2      < 0.1%
99997  1      < 0.1%
20224  1      < 0.1%
20223  1      < 0.1%
20222  1      < 0.1%
20221  1      < 0.1%
5744   1      < 0.1%
5743   1      < 0.1%
5742   1      < 0.1%
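
The skewness of 24.87 is driven by a handful of very large codes. One common way to flag such values is Tukey's rule (1.5 times the IQR beyond the quartiles), sketched here with the reported quartiles. The report does not say whether codes such as 99997 to 99999 are placeholders, so treating them as suspect is an assumption to confirm with the data owner; this column is also an identifier, so the rule serves only as a screening aid.

```python
# Quartiles copied from the quantile statistics of 참여예산위원코드.
q1, q3 = 212, 1099.5
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest values listed in the report.
largest_values = [99999, 99998, 99997, 20224, 20223, 20222, 20221, 5744, 5743, 5742]
flagged = [v for v in largest_values if v > upper_fence]

print(f"IQR: {iqr}, fences: [{lower_fence}, {upper_fence}]")
print(f"Flagged {len(flagged)} of {len(largest_values)} listed values")
```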

위원코드구분 (member code category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.6 KiB

Value  Count
<NA>   3136
1      2966
4      52
3      31
2      22

Length

Max length: 4
Median length: 4
Mean length: 2.5157081
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
<NA>   3136   50.5%
1      2966   47.8%
4      52     0.8%
3      31     0.5%
2      22     0.4%
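
The report lists 0 missing values for this column, yet "<NA>" accounts for 50.5% of the rows. The length statistics (max length 4, mean length 2.5157081) suggest that "<NA>" is stored as a literal four-character string rather than as a true missing value. The sketch below rebuilds the column from the counts above, checks the mean length, and recounts missing values after mapping the placeholder to `None`. The `PLACEHOLDERS` set and `normalize` helper are illustrative assumptions.

```python
from collections import Counter

# Counts copied from the Common Values table above.
counts = {"<NA>": 3136, "1": 2966, "4": 52, "3": 31, "2": 22}
column = [value for value, count in counts.items() for _ in range(count)]

# If "<NA>" is a 4-character string, this matches the reported mean length.
mean_length = sum(len(v) for v in column) / len(column)

PLACEHOLDERS = {"<NA>"}  # assumption: this string stands for a missing value


def normalize(value):
    """Map placeholder strings to None and leave other values unchanged."""
    return None if value in PLACEHOLDERS else value


cleaned = [normalize(v) for v in column]
missing = sum(v is None for v in cleaned)
missing_pct = 100 * missing / len(cleaned)
remaining = Counter(v for v in cleaned if v is not None)

print(f"Mean length: {mean_length:.7f}")
print(f"Missing after normalization: {missing} ({missing_pct:.1f}%)")
print(remaining.most_common())
```

After normalization, roughly half of the column would count as missing, which changes how the imbalance and correlation alerts for this variable should be read.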

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value  Count  Frequency (%)
na     3136   50.5%
1      2966   47.8%
4      52     0.8%
3      31     0.5%
2      22     0.4%

구별 참여예산 제안사업 투표여부 (whether the member voted on district-level participatory budget proposals)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.6 KiB

Value    Count
<NA>     2941
N        1586
1        880
Y        586
(blank)  214

Length

Max length: 4
Median length: 1
Mean length: 2.4214596
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value    Count  Frequency (%)
<NA>     2941   47.4%
N        1586   25.6%
1        880    14.2%
Y        586    9.4%
(blank)  214    3.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value  Count  Frequency (%)
na     2941   49.1%
n      1586   26.5%
1      880    14.7%
y      586    9.8%
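
The two frequency tables for this variable report different percentages for the same counts (for example 47.4% and 49.1% for 2941 rows). The difference appears to come from the denominator: the first table divides by all 6207 rows, while the second excludes the 214 blank values. The sketch below checks that reading; representing the blank value as a single space is an assumption based on the minimum length of 1.

```python
# Counts copied from the Common Values table; " " stands for the blank value
# (assumption: a single space, consistent with Min length 1).
common_values = {"<NA>": 2941, "N": 1586, "1": 880, "Y": 586, " ": 214}

total = sum(common_values.values())
non_blank_total = sum(c for v, c in common_values.items() if v.strip())

share_of_all = {v: round(100 * c / total, 1) for v, c in common_values.items()}
share_of_non_blank = {
    v: round(100 * c / non_blank_total, 1)
    for v, c in common_values.items()
    if v.strip()
}

print(total, non_blank_total)
print(share_of_all)
print(share_of_non_blank)
```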

Interactions

[Figures: four pairwise interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]
                                회계년도  참여예산위원코드  위원코드구분  구별 참여예산 제안사업 투표여부
회계년도                          1.000    0.181            0.371        0.810
참여예산위원코드                   0.181    1.000            NaN          NaN
위원코드구분                       0.371    NaN              1.000        0.511
구별 참여예산 제안사업 투표여부     0.810    NaN              0.511        1.000

[Figure: correlation heatmap]
                                구별 참여예산 제안사업 투표여부  위원코드구분
구별 참여예산 제안사업 투표여부     1.000                          0.219
위원코드구분                       0.219                          1.000

[Figure: correlation heatmap]
                                회계년도  참여예산위원코드  위원코드구분  구별 참여예산 제안사업 투표여부
회계년도                          1.000    -0.292           0.174        0.619
참여예산위원코드                   -0.292   1.000            1.000        1.000
위원코드구분                       0.174    1.000            1.000        0.219
구별 참여예산 제안사업 투표여부     0.619    1.000            0.219        1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out visual patterns in data completion.]

Sample

First rows
Index  회계년도  참여예산위원코드  위원코드구분  구별 참여예산 제안사업 투표여부
0      2024     419              1            <NA>
1      2024     418              1            <NA>
2      2024     417              1            <NA>
3      2024     416              1            <NA>
4      2024     415              1            <NA>
5      2024     414              1            <NA>
6      2024     413              1            <NA>
7      2024     412              1            <NA>
8      2024     411              1            <NA>
9      2024     410              1            <NA>

Last rows
Index  회계년도  참여예산위원코드  위원코드구분  구별 참여예산 제안사업 투표여부
6197   2012     10               <NA>         N
6198   2012     9                <NA>         N
6199   2012     8                <NA>         N
6200   2012     7                <NA>         N
6201   2012     6                1            1
6202   2012     5                1            1
6203   2012     4                1            1
6204   2012     3                1            1
6205   2012     2                1            1
6206   2012     1                1            1