Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B
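The average record size is the total memory divided by the number of observations. A quick check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
total_kib = 576.2
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "59.0 B"
```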

Variable types

Numeric: 3
Categorical: 3

Dataset

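Description: 부산광역시상수도사업본부_수용가정보시스템_민원신청정보_급수공사비_20220131 (Busan Metropolitan City Waterworks Headquarters, customer information system, civil complaint application information, water supply construction costs, 2022-01-31)
Author: 부산광역시 상수도사업본부 (Busan Metropolitan City Waterworks Headquarters)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083685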

Alerts

사업소코드 is highly overall correlated with 사업소명 [High correlation]
사업소명 is highly overall correlated with 사업소코드 [High correlation]
예산과목 is highly overall correlated with 분류코드 [High correlation]
분류코드 is highly overall correlated with 예산과목 [High correlation]
실제공사비 is highly skewed (γ1 = 37.93999146) [Skewed]
연번 has unique values [Unique]
실제공사비 has 8972 (89.7%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 16:41:07.627716
Analysis finished: 2023-12-10 16:41:09.764391
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
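The reported duration can be checked against the two timestamps above. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 16:41:07.627716")
finished = datetime.fromisoformat("2023-12-10 16:41:09.764391")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "2.14 seconds"
```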

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15783.627
Minimum: 1
Maximum: 31840
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1590.8
Q1: 7758.5
Median: 15768.5
Q3: 23784.75
95-th percentile: 30224.25
Maximum: 31840
Range: 31839
Interquartile range (IQR): 16026.25

Descriptive statistics

Standard deviation: 9224.9972
Coefficient of variation (CV): 0.58446626
Kurtosis: -1.2077914
Mean: 15783.627
Median Absolute Deviation (MAD): 8013.5
Skewness: 0.020036236
Sum: 1.5783627 × 10^8
Variance: 85100573
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
20662 | 1 | < 0.1%
2174 | 1 | < 0.1%
19733 | 1 | < 0.1%
2537 | 1 | < 0.1%
7021 | 1 | < 0.1%
2859 | 1 | < 0.1%
19480 | 1 | < 0.1%
5418 | 1 | < 0.1%
11664 | 1 | < 0.1%
20301 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
11 | 1 | < 0.1%
13 | 1 | < 0.1%
17 | 1 | < 0.1%
20 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
31840 | 1 | < 0.1%
31839 | 1 | < 0.1%
31838 | 1 | < 0.1%
31830 | 1 | < 0.1%
31823 | 1 | < 0.1%
31822 | 1 | < 0.1%
31819 | 1 | < 0.1%
31812 | 1 | < 0.1%
31809 | 1 | < 0.1%
31808 | 1 | < 0.1%
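Several of the 연번 statistics above are derived from one another. The sketch below recomputes them from the reported figures as a consistency check. The inputs are rounded as printed in the report, so the comparisons are approximate, and the variable names are illustrative.

```python
from math import isclose

# Reported figures for 연번.
minimum, maximum = 1, 31840
q1, q3 = 7758.5, 23784.75
mean, std = 15783.627, 9224.9972
n = 10_000

value_range = maximum - minimum  # reported Range: 31839
iqr = q3 - q1                    # reported IQR: 16026.25
cv = std / mean                  # reported CV: 0.58446626
variance = std ** 2              # reported Variance: 85100573
total = mean * n                 # reported Sum: 1.5783627 x 10^8

# Loose tolerances because the report rounds its inputs.
print(isclose(cv, 0.58446626, rel_tol=1e-6))
print(isclose(variance, 85100573, rel_tol=1e-6))
```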

사업소코드 (office code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 297.1018
Minimum: 201
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 201
5-th percentile: 244
Q1: 303
Median: 307
Q3: 311
95-th percentile: 312
Maximum: 312
Range: 111
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 24.784763
Coefficient of variation (CV): 0.083421786
Kurtosis: 1.326096
Mean: 297.1018
Median Absolute Deviation (MAD): 4
Skewness: -1.7450553
Sum: 2971018
Variance: 614.28447
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values

Value | Count | Frequency (%)
311 | 1944 | 19.4%
312 | 1680 | 16.8%
244 | 1674 | 16.7%
307 | 1063 | 10.6%
306 | 922 | 9.2%
304 | 744 | 7.4%
308 | 558 | 5.6%
309 | 484 | 4.8%
301 | 347 | 3.5%
302 | 284 | 2.8%
Other values (2) | 300 | 3.0%

Minimum 10 values

Value | Count | Frequency (%)
201 | 33 | 0.3%
244 | 1674 | 16.7%
301 | 347 | 3.5%
302 | 284 | 2.8%
303 | 267 | 2.7%
304 | 744 | 7.4%
306 | 922 | 9.2%
307 | 1063 | 10.6%
308 | 558 | 5.6%
309 | 484 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
312 | 1680 | 16.8%
311 | 1944 | 19.4%
309 | 484 | 4.8%
308 | 558 | 5.6%
307 | 1063 | 10.6%
306 | 922 | 9.2%
304 | 744 | 7.4%
303 | 267 | 2.7%
302 | 284 | 2.8%
301 | 347 | 3.5%

사업소명 (office name)
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

강서 사업소: 1944
기장 사업소: 1680
동래통합사업소: 1674
북부 사업소: 1063
남부 사업소: 922
Other values (7): 2717

Length

Max length: 9
Median length: 9
Mean length: 8.4937
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서 사업소
2nd row: 남부 사업소
3rd row: 기장 사업소
4th row: 남부 사업소
5th row: 기장 사업소

Common Values

Value | Count | Frequency (%)
강서 사업소 | 1944 | 19.4%
기장 사업소 | 1680 | 16.8%
동래통합사업소 | 1674 | 16.7%
북부 사업소 | 1063 | 10.6%
남부 사업소 | 922 | 9.2%
부산진 사업소 | 744 | 7.4%
해운대 사업소 | 558 | 5.6%
사하 사업소 | 484 | 4.8%
중동부 사업소 | 347 | 3.5%
서부 사업소 | 284 | 2.8%
Other values (2) | 300 | 3.0%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
사업소 | 8293 | 45.3%
강서 | 1944 | 10.6%
기장 | 1680 | 9.2%
동래통합사업소 | 1674 | 9.2%
북부 | 1063 | 5.8%
남부 | 922 | 5.0%
부산진 | 744 | 4.1%
해운대 | 558 | 3.1%
사하 | 484 | 2.6%
중동부 | 347 | 1.9%
Other values (3) | 584 | 3.2%

예산과목 (budget item)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

기타수수료수익: 2979
신설공사수입: 2271
개조공사수입: 1485
일반용: 1242
<NA>: 1194
Other values (3): 829

Length

Max length: 9
Median length: 7
Mean length: 5.9313
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 일반용
5th row: 일반용

Common Values

Value | Count | Frequency (%)
기타수수료수익 | 2979 | 29.8%
신설공사수입 | 2271 | 22.7%
개조공사수입 | 1485 | 14.8%
일반용 | 1242 | 12.4%
<NA> | 1194 | 11.9%
기타공사부담금수입 | 816 | 8.2%
신설공사수익 | 10 | 0.1%
개조공사수익 | 3 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot image]

Words

Value | Count | Frequency (%)
기타수수료수익 | 2979 | 29.8%
신설공사수입 | 2271 | 22.7%
개조공사수입 | 1485 | 14.8%
일반용 | 1242 | 12.4%
na | 1194 | 11.9%
기타공사부담금수입 | 816 | 8.2%
신설공사수익 | 10 | 0.1%
개조공사수익 | 3 | < 0.1%

분류코드 (classification code)
Categorical

HIGH CORRELATION

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

신설수탁급수공사수입: 2281
물이용부담금(일반용): 1194
개조수탁급수공사수입: 876
설계료(25mm이하): 838
준공검사료(25mm이하): 824
Other values (23): 3987

Length

Max length: 17
Median length: 14
Mean length: 10.5666
Min length: 8

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 물이용부담금(일반용)
2nd row: 물이용부담금(일반용)
3rd row: 물이용부담금(일반용)
4th row: 공사방수료(개조)
5th row: 공사방수량(신설)

Common Values

Value | Count | Frequency (%)
신설수탁급수공사수입 | 2281 | 22.8%
물이용부담금(일반용) | 1194 | 11.9%
개조수탁급수공사수입 | 876 | 8.8%
설계료(25mm이하) | 838 | 8.4%
준공검사료(25mm이하) | 824 | 8.2%
공사방수량(신설) | 670 | 6.7%
원인자부담금수입 | 573 | 5.7%
공사방수료(개조) | 572 | 5.7%
자재비(이동공사) | 228 | 2.3%
설계료(비정액제) | 207 | 2.1%
Other values (18) | 1737 | 17.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
신설수탁급수공사수입 | 2281 | 22.8%
물이용부담금(일반용 | 1194 | 11.9%
개조수탁급수공사수입 | 876 | 8.8%
설계료(25mm이하 | 838 | 8.4%
준공검사료(25mm이하 | 824 | 8.2%
공사방수량(신설 | 670 | 6.7%
원인자부담금수입 | 573 | 5.7%
공사방수료(개조 | 572 | 5.7%
자재비(이동공사 | 228 | 2.3%
설계료(비정액제 | 207 | 2.1%
Other values (18) | 1737 | 17.4%

실제공사비 (actual construction cost)
Real number (ℝ)

SKEWED, ZEROS

Distinct: 931
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 281969.57
Minimum: 0
Maximum: 1.77621 × 10^8
Zeros: 8972
Zeros (%): 89.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 924220
Maximum: 1.77621 × 10^8
Range: 1.77621 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3076321.3
Coefficient of variation (CV): 10.910118
Kurtosis: 1846.1542
Mean: 281969.57
Median Absolute Deviation (MAD): 0
Skewness: 37.939991
Sum: 2.8196957 × 10^9
Variance: 9.4637529 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 8972 | 89.7%
271300 | 13 | 0.1%
7000 | 9 | 0.1%
150 | 9 | 0.1%
1330 | 8 | 0.1%
999900 | 7 | 0.1%
34000 | 4 | < 0.1%
8000 | 4 | < 0.1%
10000 | 4 | < 0.1%
1386000 | 3 | < 0.1%
Other values (921) | 967 | 9.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8972 | 89.7%
150 | 9 | 0.1%
1330 | 8 | 0.1%
7000 | 9 | 0.1%
8000 | 4 | < 0.1%
9680 | 1 | < 0.1%
10000 | 4 | < 0.1%
11700 | 1 | < 0.1%
12470 | 1 | < 0.1%
15300 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
177621000 | 1 | < 0.1%
136600000 | 1 | < 0.1%
131699000 | 1 | < 0.1%
50319560 | 1 | < 0.1%
45489430 | 1 | < 0.1%
42304900 | 1 | < 0.1%
38919300 | 1 | < 0.1%
37064000 | 1 | < 0.1%
30109000 | 1 | < 0.1%
21964000 | 1 | < 0.1%
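The 실제공사비 figures above describe a zero-inflated column. With 89.7% zeros, Q1, the median and Q3 all fall on 0, while the mean is carried by the 1,028 non-zero rows. The sketch below works this out from the reported counts. The Sum is rounded as printed, so the means are approximate, and the variable names are illustrative.

```python
from math import isclose

# Reported figures for 실제공사비.
n = 10_000
zeros = 8972
total = 2.8196957e9  # reported Sum, rounded as printed

zero_share = zeros / n          # 0.8972, reported as 89.7%
nonzero = n - zeros             # rows with a non-zero cost
mean_all = total / n            # close to the reported Mean, 281969.57
mean_nonzero = total / nonzero  # average cost among non-zero rows only

# More than 75% of rows are zero, so Q1, median and Q3 are all 0.
quartiles_are_zero = zero_share > 0.75

print(nonzero, quartiles_are_zero)  # prints "1028 True"
print(isclose(mean_all, 281969.57, rel_tol=1e-6))
```

The non-zero mean is roughly ten times the overall mean, which is why the report flags the column as both SKEWED and ZEROS.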

Interactions

[Interaction plots: nine images, not reproduced in this text version]

Correlations

Variable | 연번 | 사업소코드 | 사업소명 | 예산과목 | 분류코드 | 실제공사비
연번 | 1.000 | 0.000 | 0.156 | 0.034 | 0.040 | 0.000
사업소코드 | 0.000 | 1.000 | 1.000 | 0.139 | 0.281 | 0.010
사업소명 | 0.156 | 1.000 | 1.000 | 0.265 | 0.398 | 0.027
예산과목 | 0.034 | 0.139 | 0.265 | 1.000 | 0.960 | 0.048
분류코드 | 0.040 | 0.281 | 0.398 | 0.960 | 1.000 | 0.069
실제공사비 | 0.000 | 0.010 | 0.027 | 0.048 | 0.069 | 1.000

Variable | 예산과목 | 분류코드 | 사업소명
예산과목 | 1.000 | 0.815 | 0.132
분류코드 | 0.815 | 1.000 | 0.141
사업소명 | 0.132 | 0.141 | 1.000

Variable | 연번 | 사업소코드 | 실제공사비 | 사업소명 | 예산과목 | 분류코드
연번 | 1.000 | -0.040 | -0.045 | 0.066 | 0.017 | 0.014
사업소코드 | -0.040 | 1.000 | 0.237 | 1.000 | 0.131 | 0.173
실제공사비 | -0.045 | 0.237 | 1.000 | 0.015 | 0.030 | 0.033
사업소명 | 0.066 | 1.000 | 0.015 | 1.000 | 0.132 | 0.141
예산과목 | 0.017 | 0.131 | 0.030 | 0.132 | 1.000 | 0.815
분류코드 | 0.014 | 0.173 | 0.033 | 0.141 | 0.815 | 1.000
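The coefficient of 1.000 between 사업소코드 and 사업소명 is what a one-to-one mapping between two columns produces. The report does not state which association measure each matrix uses, so the sketch below uses Cramér's V without bias correction purely as an illustration. `cramers_v` is a hypothetical helper, and the rows are toy data, not taken from the dataset.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length label sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy rows: every code maps to exactly one name.
codes = [311, 311, 311, 312, 312, 244, 244]
names = ["강서 사업소"] * 3 + ["기장 사업소"] * 2 + ["동래통합사업소"] * 2
v_one_to_one = cramers_v(codes, names)

# Toy rows where the two columns are unrelated.
v_independent = cramers_v([1, 1, 2, 2], ["a", "b", "a", "b"])

print(round(v_one_to_one, 6), v_independent)  # prints "1.0 0.0"
```

A pair like this carries redundant information, so one of the two columns can usually be dropped before modelling.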

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

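First rows

(index) | 연번 | 사업소코드 | 사업소명 | 예산과목 | 분류코드 | 실제공사비
20661 | 20662 | 311 | 강서 사업소 | <NA> | 물이용부담금(일반용) | 0
18928 | 18929 | 306 | 남부 사업소 | <NA> | 물이용부담금(일반용) | 0
14613 | 14614 | 312 | 기장 사업소 | <NA> | 물이용부담금(일반용) | 0
4205 | 4206 | 306 | 남부 사업소 | 일반용 | 공사방수료(개조) | 0
31509 | 31510 | 312 | 기장 사업소 | 일반용 | 공사방수량(신설) | 0
1041 | 1042 | 312 | 기장 사업소 | 기타공사부담금수입 | 원인자부담금수입 | 0
14374 | 14375 | 244 | 동래통합사업소 | <NA> | 물이용부담금(일반용) | 0
26680 | 26681 | 307 | 북부 사업소 | 신설공사수입 | 신설수탁급수공사수입 | 0
24901 | 24902 | 244 | 동래통합사업소 | 기타수수료수익 | 준공검사료(25mm이하) | 0
6984 | 6985 | 311 | 강서 사업소 | 신설공사수입 | 신설수탁급수공사수입 | 928400

Last rows

(index) | 연번 | 사업소코드 | 사업소명 | 예산과목 | 분류코드 | 실제공사비
17460 | 17461 | 311 | 강서 사업소 | 일반용 | 공사방수료(개조) | 0
16629 | 16630 | 307 | 북부 사업소 | 기타수수료수익 | 준공검사료(25mm이하) | 0
11201 | 11202 | 304 | 부산진 사업소 | 개조공사수입 | 개조수탁급수공사수입 | 0
25736 | 25737 | 244 | 동래통합사업소 | 개조공사수입 | 개조수탁급수공사수입 | 0
14411 | 14412 | 308 | 해운대 사업소 | <NA> | 물이용부담금(일반용) | 0
22656 | 22657 | 306 | 남부 사업소 | 기타공사부담금수입 | 원인자부담금(구경확대) | 0
12248 | 12249 | 244 | 동래통합사업소 | 신설공사수입 | 신설수탁급수공사수입 | 0
18042 | 18043 | 301 | 중동부 사업소 | 개조공사수입 | 자재비(세대별분리공사) | 0
27661 | 27662 | 306 | 남부 사업소 | <NA> | 물이용부담금(일반용) | 0
5169 | 5170 | 301 | 중동부 사업소 | 개조공사수입 | 자재비(세대별분리공사) | 0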