Overview

Dataset statistics

Number of variables: 3
Number of observations: 120
Missing cells: 9
Missing cells (%): 2.5%
Duplicate rows: 1
Duplicate rows (%): 0.8%
Total size in memory: 3.2 KiB
Average record size in memory: 27.1 B
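
The two percentages follow directly from the counts listed above. A minimal sanity check, with illustrative variable names that are not part of the report:

```python
# Recompute the overview percentages from the counts reported above.
n_variables = 3
n_observations = 120
missing_cells = 9
duplicate_rows = 1

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations # share of duplicate rows

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```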

Variable types

Numeric: 2
Text: 1

Dataset

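Description: Status of resident participatory budgeting projects in Yeongdeungpo-gu, Seoul. (1) Data provided: the names and budgets of the projects implemented in each year. Note that participatory budget amounts are given in units of 1,000 KRW (천원).
Author: Yeongdeungpo-gu, Seoul (서울특별시 영등포구)
URL: https://www.data.go.kr/data/15048758/fileData.do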

Alerts

Dataset has 1 (0.8%) duplicate rows (Duplicates)
회계연도 has 3 (2.5%) missing values (Missing)
사업명 has 3 (2.5%) missing values (Missing)
사업예산(천원) has 3 (2.5%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 12:23:18.000131
Analysis finished: 2023-12-12 12:23:19.363048
Duration: 1.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
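
A report like this one can be regenerated with ydata-profiling. The sketch below is hedged: the function name `build_report`, the file paths, the report title and the `cp949` encoding are assumptions for illustration, since the report does not state the original file name or encoding. Only the metadata values are taken from the Dataset section.

```python
# Dataset metadata as shown in the "Dataset" section of this report.
DATASET_METADATA = {
    "description": "Participatory budgeting projects of Yeongdeungpo-gu, Seoul "
                   "(budgets in units of 1,000 KRW)",
    "author": "서울특별시 영등포구",
    "url": "https://www.data.go.kr/data/15048758/fileData.do",
}


def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, output_path and the default encoding are hypothetical;
    adjust them to the file actually downloaded from the URL above.
    """
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(
        df,
        title="Participatory budgeting projects",  # hypothetical title
        dataset=DATASET_METADATA,
    )
    profile.to_file(output_path)
    return profile
```

Calling `build_report("projects.csv")` (a hypothetical file name) would write `report.html` next to the script.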

Variables

회계연도 (fiscal year)
Real number (ℝ)

MISSING

Distinct: 7
Distinct (%): 6.0%
Missing: 3
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2020.0085
Minimum: 2017
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 2017
5th percentile: 2017
Q1: 2018
Median: 2020
Q3: 2022
95th percentile: 2023
Maximum: 2023
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.8822938
Coefficient of variation (CV): 0.00093182465
Kurtosis: -1.1482231
Mean: 2020.0085
Median Absolute Deviation (MAD): 2
Skewness: -0.1230148
Sum: 236341
Variance: 3.5430298
Monotonicity: Increasing
[Figure: histogram with fixed-size bins (bins=7)]
Most common values:
Value | Count | Frequency (%)
2022 | 22 | 18.3%
2020 | 20 | 16.7%
2021 | 19 | 15.8%
2018 | 16 | 13.3%
2017 | 15 | 12.5%
2019 | 15 | 12.5%
2023 | 10 | 8.3%
(Missing) | 3 | 2.5%

Smallest values:
Value | Count | Frequency (%)
2017 | 15 | 12.5%
2018 | 16 | 13.3%
2019 | 15 | 12.5%
2020 | 20 | 16.7%
2021 | 19 | 15.8%
2022 | 22 | 18.3%
2023 | 10 | 8.3%

Largest values:
Value | Count | Frequency (%)
2023 | 10 | 8.3%
2022 | 22 | 18.3%
2021 | 19 | 15.8%
2020 | 20 | 16.7%
2019 | 15 | 12.5%
2018 | 16 | 13.3%
2017 | 15 | 12.5%
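
The descriptive statistics for 회계연도 can be reproduced from the value counts alone. The sketch below expands the counts into individual observations and uses Python's `statistics` module. It assumes the report's variance is the sample variance (n - 1 denominator), which is what the reported figure is consistent with.

```python
import statistics

# Value counts for 회계연도, copied from the frequency table above.
year_counts = {2017: 15, 2018: 16, 2019: 15, 2020: 20, 2021: 19, 2022: 22, 2023: 10}

# Expand the counts back into one value per non-missing observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

n = len(years)                         # non-missing rows (120 observations - 3 missing)
total = sum(years)                     # compare with the reported Sum
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation
median = statistics.median(years)

print(n, total, round(mean, 4), round(variance, 4), round(std, 4), median)
```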

사업명 (project name)
Text

MISSING

Distinct: 117
Distinct (%): 100.0%
Missing: 3
Missing (%): 2.5%
Memory size: 1.1 KiB

Length

Max length: 42
Median length: 29
Mean length: 20.213675
Min length: 4

Characters and Unicode

Total characters: 2365
Distinct characters: 415
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
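
As an illustration of those character properties, the standard-library `unicodedata` module returns the general category of each code point. The sketch below applies it to one of the sample values of 사업명 from this report. The NFC normalization step is a precaution added here, not something the report describes.

```python
import unicodedata
from collections import Counter

# One of the sample values of 사업명 shown in this report.
# NFC normalization keeps each Hangul syllable as a single code point.
text = unicodedata.normalize("NFC", "장애인,노인,임산부를 위한 작은배려")

# Map every character to its two-letter Unicode general category
# (Lo = Other Letter, Po = Other Punctuation, Zs = Space Separator).
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(category_counts)
```

The same call explains some of the less obvious rows in the category tables: `unicodedata.category("~")` is `Sm` (Math Symbol) and `unicodedata.category("-")` is `Pd` (Dash Punctuation).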

Unique

Unique: 117
Unique (%): 100.0%

Sample

1st row: 춥고 비새는 교실에 양동이는 이제 그만
2nd row: 빗물로 얼룩진 교실 어르신들 한글공부
3rd row: 영등포전통시장 버티칼 가림막 시설
4th row: 장애인,노인,임산부를 위한 작은배려
5th row: 아동들과 함께하는 체험학습과 여행
Most common words:
Value | Count | Frequency (%)
설치 | 15 | 2.7%
[unreadable] | 7 | 1.3%
[unreadable] | 6 | 1.1%
위한 | 5 | 0.9%
조성 | 5 | 0.9%
안전한 | 4 | 0.7%
우리 | 4 | 0.7%
건강 | 3 | 0.5%
up | 3 | 0.5%
안양천 | 3 | 0.5%
Other values (447) | 501 | 90.1%
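
A word table of this kind can be approximated by splitting each value on whitespace and counting the tokens. The sketch below does this for the five sample values of 사업명 listed above. Lower-casing and whitespace splitting are assumptions, and the profiler's exact tokenisation may differ.

```python
from collections import Counter

# The five sample values of 사업명 listed above.
samples = [
    "춥고 비새는 교실에 양동이는 이제 그만",
    "빗물로 얼룩진 교실 어르신들 한글공부",
    "영등포전통시장 버티칼 가림막 시설",
    "장애인,노인,임산부를 위한 작은배려",
    "아동들과 함께하는 체험학습과 여행",
]

# Assumed tokenisation: lower-case, then split on whitespace.
words = [word for sample in samples for word in sample.lower().split()]
word_counts = Counter(words)

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / len(words):.1f}%")

# The frequencies in the table are shares of all words: the listed counts
# add up to 556, so the top entry is 15 / 556.
table_total = 15 + 7 + 6 + 5 + 5 + 4 + 4 + 3 + 3 + 3 + 501
top_share = round(100 * 15 / table_total, 1)
```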

Most occurring characters

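Value | Count | Frequency (%)
(space) | 439 | 18.6%
[unreadable] | 36 | 1.5%
[unreadable] | 28 | 1.2%
[unreadable] | 26 | 1.1%
[unreadable] | 25 | 1.1%
! | 25 | 1.1%
[unreadable] | 24 | 1.0%
[unreadable] | 24 | 1.0%
[unreadable] | 23 | 1.0%
[unreadable] | 23 | 1.0%
Other values (405) | 1692 | 71.5%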

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1798 | 76.0%
Space Separator | 439 | 18.6%
Other Punctuation | 51 | 2.2%
Uppercase Letter | 15 | 0.6%
Lowercase Letter | 14 | 0.6%
Close Punctuation | 12 | 0.5%
Open Punctuation | 12 | 0.5%
Decimal Number | 12 | 0.5%
Final Punctuation | 4 | 0.2%
Initial Punctuation | 4 | 0.2%
Other values (2) | 4 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 36 | 2.0%
[unreadable] | 28 | 1.6%
[unreadable] | 26 | 1.4%
[unreadable] | 25 | 1.4%
[unreadable] | 24 | 1.3%
[unreadable] | 24 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 22 | 1.2%
Other values (363) | 1544 | 85.9%

Uppercase Letter
Value | Count | Frequency (%)
O | 3 | 20.0%
U | 2 | 13.3%
D | 2 | 13.3%
G | 2 | 13.3%
C | 2 | 13.3%
J | 1 | 6.7%
B | 1 | 6.7%
T | 1 | 6.7%
V | 1 | 6.7%

Other Punctuation
Value | Count | Frequency (%)
! | 25 | 49.0%
, | 9 | 17.6%
" | 7 | 13.7%
? | 3 | 5.9%
' | 2 | 3.9%
: | 2 | 3.9%
. | 2 | 3.9%
& | 1 | 2.0%

Lowercase Letter
Value | Count | Frequency (%)
o | 3 | 21.4%
p | 3 | 21.4%
n | 2 | 14.3%
w | 2 | 14.3%
r | 1 | 7.1%
z | 1 | 7.1%
e | 1 | 7.1%
u | 1 | 7.1%

Decimal Number
Value | Count | Frequency (%)
3 | 3 | 25.0%
2 | 2 | 16.7%
6 | 2 | 16.7%
4 | 1 | 8.3%
1 | 1 | 8.3%
9 | 1 | 8.3%
0 | 1 | 8.3%
7 | 1 | 8.3%

Final Punctuation
Value | Count | Frequency (%)
[unreadable] | 2 | 50.0%
[unreadable] | 2 | 50.0%

Initial Punctuation
Value | Count | Frequency (%)
[unreadable] | 2 | 50.0%
[unreadable] | 2 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 439 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1798 | 76.0%
Common | 538 | 22.7%
Latin | 29 | 1.2%

Most frequent character per script

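Hangul
Value | Count | Frequency (%)
[unreadable] | 36 | 2.0%
[unreadable] | 28 | 1.6%
[unreadable] | 26 | 1.4%
[unreadable] | 25 | 1.4%
[unreadable] | 24 | 1.3%
[unreadable] | 24 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 22 | 1.2%
Other values (363) | 1544 | 85.9%

Common
Value | Count | Frequency (%)
(space) | 439 | 81.6%
! | 25 | 4.6%
) | 12 | 2.2%
( | 12 | 2.2%
, | 9 | 1.7%
" | 7 | 1.3%
3 | 3 | 0.6%
? | 3 | 0.6%
[unreadable] | 2 | 0.4%
2 | 2 | 0.4%
Other values (15) | 24 | 4.5%

Latin
Value | Count | Frequency (%)
o | 3 | 10.3%
p | 3 | 10.3%
O | 3 | 10.3%
U | 2 | 6.9%
n | 2 | 6.9%
w | 2 | 6.9%
D | 2 | 6.9%
G | 2 | 6.9%
C | 2 | 6.9%
r | 1 | 3.4%
Other values (7) | 7 | 24.1%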

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1798 | 76.0%
ASCII | 559 | 23.6%
Punctuation | 8 | 0.3%

Most frequent character per block

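ASCII
Value | Count | Frequency (%)
(space) | 439 | 78.5%
! | 25 | 4.5%
) | 12 | 2.1%
( | 12 | 2.1%
, | 9 | 1.6%
" | 7 | 1.3%
o | 3 | 0.5%
3 | 3 | 0.5%
? | 3 | 0.5%
p | 3 | 0.5%
Other values (28) | 43 | 7.7%

Hangul
Value | Count | Frequency (%)
[unreadable] | 36 | 2.0%
[unreadable] | 28 | 1.6%
[unreadable] | 26 | 1.4%
[unreadable] | 25 | 1.4%
[unreadable] | 24 | 1.3%
[unreadable] | 24 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 23 | 1.3%
[unreadable] | 22 | 1.2%
Other values (363) | 1544 | 85.9%

Punctuation
Value | Count | Frequency (%)
[unreadable] | 2 | 25.0%
[unreadable] | 2 | 25.0%
[unreadable] | 2 | 25.0%
[unreadable] | 2 | 25.0%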

사업예산(천원) (project budget, in units of 1,000 KRW)
Real number (ℝ)

MISSING

Distinct: 45
Distinct (%): 38.5%
Missing: 3
Missing (%): 2.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 46633.615
Minimum: 7000
Maximum: 100000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 7000
5th percentile: 10000
Q1: 23000
Median: 50000
Q3: 50000
95th percentile: 100000
Maximum: 100000
Range: 93000
Interquartile range (IQR): 27000

Descriptive statistics

Standard deviation: 26960.789
Coefficient of variation (CV): 0.57814065
Kurtosis: -0.48929562
Mean: 46633.615
Median Absolute Deviation (MAD): 20000
Skewness: 0.58742963
Sum: 5456133
Variance: 7.2688414 × 10^8
Monotonicity: Not monotonic
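
Several of the figures for 사업예산(천원) are derived from one another, which makes a quick consistency check possible. The sketch below recomputes them from the reported values. The variable names are illustrative, and the last line converts the total from units of 1,000 KRW to KRW.

```python
import math

# Figures reported above for 사업예산(천원); amounts are in units of 1,000 KRW.
n = 120 - 3                     # observations minus missing values
total = 5_456_133               # reported Sum
std = 26_960.789                # reported standard deviation
q1, q3 = 23_000, 50_000
minimum, maximum = 7_000, 100_000

mean = total / n                # compare with the reported Mean
variance = std ** 2             # compare with the reported Variance
cv = std / mean                 # coefficient of variation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
total_krw = total * 1_000       # total budget expressed in KRW

print(f"mean={mean:.3f} variance={variance:.4e} cv={cv:.4f} "
      f"iqr={iqr} range={value_range} total_krw={total_krw}")
```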
[Figure: histogram with fixed-size bins (bins=45)]
Most common values:
Value | Count | Frequency (%)
50000 | 31 | 25.8%
20000 | 11 | 9.2%
100000 | 10 | 8.3%
30000 | 6 | 5.0%
10000 | 6 | 5.0%
70000 | 4 | 3.3%
15000 | 4 | 3.3%
45000 | 3 | 2.5%
40000 | 2 | 1.7%
47000 | 2 | 1.7%
Other values (35) | 38 | 31.7%
(Missing) | 3 | 2.5%

Smallest values:
Value | Count | Frequency (%)
7000 | 1 | 0.8%
8800 | 1 | 0.8%
9000 | 1 | 0.8%
10000 | 6 | 5.0%
12000 | 1 | 0.8%
13000 | 1 | 0.8%
15000 | 4 | 3.3%
17000 | 1 | 0.8%
18000 | 1 | 0.8%
19000 | 1 | 0.8%

Largest values:
Value | Count | Frequency (%)
100000 | 10 | 8.3%
98500 | 1 | 0.8%
96100 | 1 | 0.8%
95000 | 1 | 0.8%
90000 | 2 | 1.7%
86100 | 1 | 0.8%
85000 | 1 | 0.8%
81600 | 1 | 0.8%
81000 | 1 | 0.8%
80000 | 1 | 0.8%

Interactions

[Figures: pairwise interaction plots for 회계연도 and 사업예산(천원)]

Correlations

Variable | 회계연도 | 사업예산(천원)
회계연도 | 1.000 | 0.448
사업예산(천원) | 0.448 | 1.000

Variable | 회계연도 | 사업예산(천원)
회계연도 | 1.000 | -0.060
사업예산(천원) | -0.060 | 1.000

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Heatmap: the correlation heatmap measures nullity correlation, that is, how strongly the presence or absence of one variable affects the presence of another.
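
In this dataset each column has exactly 3 missing values, and the Sample section shows that they fall in the same three rows (117 to 119). The nullity indicators of any two columns are therefore identical, so their correlation is 1. A minimal sketch with a hand-written Pearson correlation (the helper `pearson` is illustrative, not part of the profiler):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# 1 marks a missing cell: the last 3 of the 120 rows are empty in every column.
null_year = [0] * 117 + [1] * 3
null_budget = [0] * 117 + [1] * 3

r = pearson(null_year, null_budget)
print(round(r, 6))
```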

Sample

First rows:
Index | 회계연도 | 사업명 | 사업예산(천원)
0 | 2017 | 춥고 비새는 교실에 양동이는 이제 그만 | 30000
1 | 2017 | 빗물로 얼룩진 교실 어르신들 한글공부 | 19000
2 | 2017 | 영등포전통시장 버티칼 가림막 시설 | 50000
3 | 2017 | 장애인,노인,임산부를 위한 작은배려 | 50000
4 | 2017 | 아동들과 함께하는 체험학습과 여행 | 23000
5 | 2017 | 가족봉사단 봉사 트럭 "사랑이 붕붕" | 20000
6 | 2017 | 결혼이민자 취업을 JOB자 | 50000
7 | 2017 | 한강 가을 가족힐링소풍 | 13000
8 | 2017 | 무궁화꽃이 피었습니다 | 50000
9 | 2017 | 가로녹지조성사업(선유로207 양평동 6차 현대아파트 앞) | 50000

Last rows:
Index | 회계연도 | 사업명 | 사업예산(천원)
110 | 2023 | 따뜻한 이불나기 | 10000
111 | 2023 | 레기 무단투기 잡GO "그 놈"도 막? | 20000
112 | 2023 | 양평동유수지 생태공원 속 야외 커뮤니티공간" 조성 | 70000
113 | 2023 | 다모이소 안전둥지 둘레길 복원 | 7000
114 | 2023 | 양평2동 당산나무 주변 “(가칭) 선유봉 느티나무 공원”조성 | 53000
115 | 2023 | 쾌적한 도보환경을 위한 띠녹지 조성사업 | 50000
116 | 2023 | 눈이 오면 무서운 골목길, 친환경 안전하고 편안한 길로 변신 | 70000
117 | <NA> | <NA> | <NA>
118 | <NA> | <NA> | <NA>
119 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 회계연도 | 사업명 | 사업예산(천원) | # duplicates
0 | <NA> | <NA> | <NA> | 3
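
The overview reports 1 duplicate row (0.8%), while this table shows the all-empty row occurring 3 times. One reading consistent with both numbers is that the overview counts distinct duplicated row patterns (1 of 120 rows is about 0.8%) and the table counts occurrences. That reading is an assumption, not something the report states. The sketch below groups the last five rows from the Sample section in that way, with `None` standing in for `<NA>`.

```python
from collections import Counter

# Last five rows from the Sample section; None stands for <NA>.
rows = [
    (2023, "쾌적한 도보환경을 위한 띠녹지 조성사업", 50000),
    (2023, "눈이 오면 무서운 골목길, 친환경 안전하고 편안한 길로 변신", 70000),
    (None, None, None),
    (None, None, None),
    (None, None, None),
]

# Count how often each complete row occurs, then keep rows seen more than once.
row_counts = Counter(rows)
duplicate_groups = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicate_groups.items():
    print(row, count)
```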