Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B
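
The average record size follows from the total memory figure divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures taken from the dataset statistics above.
total_kib = 576.2        # total size in memory, in KiB
n_observations = 10_000  # number of rows

# Assumption: 1 KiB = 1024 bytes.
bytes_per_record = total_kib * 1024 / n_observations
print(f"{bytes_per_record:.1f} B per record")
```

Rounded to one decimal place, the result agrees with the 59.0 B shown in the report.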

Variable types

Numeric: 3
Categorical: 3

Dataset

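Description: 부산광역시상수도사업본부_수용가정보시스템_민원신청정보_급수공사비_20230126 (Busan Metropolitan City Waterworks Headquarters, customer information system, civil application information, water supply construction costs, 2023-01-26)
Author: 부산광역시 상수도사업본부 (Busan Metropolitan City Waterworks Headquarters)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083685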

Alerts

분류코드 has a high cardinality: 51 distinct values (High cardinality)
예산과목 is highly overall correlated with 분류코드 (High correlation)
분류코드 is highly overall correlated with 예산과목 (High correlation)
사업소코드 is highly overall correlated with 사업소명 (High correlation)
사업소명 is highly overall correlated with 사업소코드 (High correlation)
연번 has unique values (Unique)
실제공사비 has 8705 (87.1%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 16:41:00.812765
Analysis finished: 2023-12-10 16:41:03.362070
Duration: 2.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
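
The reported duration can be rechecked from the two timestamps. A small sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 16:41:00.812765")
finished = datetime.fromisoformat("2023-12-10 16:41:03.362070")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

The difference is 2.549305 seconds, which rounds to the 2.55 seconds shown above.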

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12053.277
Minimum: 2
Maximum: 24000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1185.95
Q1: 6101.5
Median: 12007.5
Q3: 18032.5
95-th percentile: 22793.1
Maximum: 24000
Range: 23998
Interquartile range (IQR): 11931

Descriptive statistics

Standard deviation: 6915.8449
Coefficient of variation (CV): 0.573773
Kurtosis: -1.1941323
Mean: 12053.277
Median Absolute Deviation (MAD): 5968
Skewness: -0.0013971201
Sum: 1.2053277 × 10^8
Variance: 47828910
Monotonicity: Not monotonic
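
Several of the 연번 statistics are functions of others in the same report, so they can be cross-checked. The sketch below recomputes them from the rounded values printed above, so small rounding differences are expected (the variable names are illustrative):

```python
# Values copied from the 연번 statistics in this report.
n = 10_000
mean = 12053.277
std = 6915.8449
q1, q3 = 6101.5, 18032.5
minimum, maximum = 2, 24000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"CV={cv:.6f} variance={variance:.0f} IQR={iqr} range={value_range} sum={total:.7e}")
```

Each recomputed value should match the corresponding report entry to within rounding.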
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value                 Count   Frequency (%)
23624                 1       < 0.1%
4402                  1       < 0.1%
11190                 1       < 0.1%
7472                  1       < 0.1%
8763                  1       < 0.1%
8077                  1       < 0.1%
3872                  1       < 0.1%
10766                 1       < 0.1%
23178                 1       < 0.1%
1968                  1       < 0.1%
Other values (9990)   9990    99.9%

Minimum 10 values

Value   Count   Frequency (%)
2       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
12      1       < 0.1%
14      1       < 0.1%
15      1       < 0.1%
16      1       < 0.1%
18      1       < 0.1%
19      1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
24000   1       < 0.1%
23996   1       < 0.1%
23995   1       < 0.1%
23993   1       < 0.1%
23991   1       < 0.1%
23989   1       < 0.1%
23988   1       < 0.1%
23986   1       < 0.1%
23984   1       < 0.1%
23980   1       < 0.1%

사업소코드 (office code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 299.216
Minimum: 101
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 244
Q1: 304
Median: 308
Q3: 311
95-th percentile: 312
Maximum: 312
Range: 211
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 23.455067
Coefficient of variation (CV): 0.078388411
Kurtosis: 3.057958
Mean: 299.216
Median Absolute Deviation (MAD): 3
Skewness: -2.092555
Sum: 2992160
Variance: 550.14016
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=13)]
Common values

Value              Count   Frequency (%)
311                2233    22.3%
312                2006    20.1%
244                1383    13.8%
306                941     9.4%
307                838     8.4%
304                743     7.4%
309                562     5.6%
308                523     5.2%
303                267     2.7%
302                242     2.4%
Other values (3)   262     2.6%

Minimum 10 values

Value   Count   Frequency (%)
101     1       < 0.1%
201     41      0.4%
244     1383    13.8%
301     220     2.2%
302     242     2.4%
303     267     2.7%
304     743     7.4%
306     941     9.4%
307     838     8.4%
308     523     5.2%

Maximum 10 values

Value   Count   Frequency (%)
312     2006    20.1%
311     2233    22.3%
309     562     5.6%
308     523     5.2%
307     838     8.4%
306     941     9.4%
304     743     7.4%
303     267     2.7%
302     242     2.4%
301     220     2.2%

사업소명 (office name)
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

강서 사업소: 2233
기장 사업소: 2006
동래통합사업소: 1383
남부 사업소: 941
북부 사업소: 838
Other values (8): 2599

Length

Max length: 9
Median length: 9
Mean length: 8.5662
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 남부 사업소
2nd row: 사하 사업소
3rd row: 남부 사업소
4th row: 기장 사업소
5th row: 남부 사업소

Common Values

Value              Count   Frequency (%)
강서 사업소        2233    22.3%
기장 사업소        2006    20.1%
동래통합사업소     1383    13.8%
남부 사업소        941     9.4%
북부 사업소        838     8.4%
부산진 사업소      743     7.4%
사하 사업소        562     5.6%
해운대 사업소      523     5.2%
영도 사업소        267     2.7%
서부 사업소        242     2.4%
Other values (3)   262     2.6%

Length

[Figure: histogram of lengths of the category]

Words

Value              Count   Frequency (%)
사업소             8575    46.2%
강서               2233    12.0%
기장               2006    10.8%
동래통합사업소     1383    7.4%
남부               941     5.1%
북부               838     4.5%
부산진             743     4.0%
사하               562     3.0%
해운대             523     2.8%
영도               267     1.4%
Other values (4)   504     2.7%

예산과목 (budget item)
Categorical

HIGH CORRELATION 

Distinct: 50
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 1257
준공검사료(25mm이하): 849
설계료(25mm이하): 826
공사방수량(신설): 774
자재비(신설,구경별정액제): 699
Other values (45): 5595

Length

Max length: 17
Median length: 13
Mean length: 10.6691
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 준공검사료(25mm이하)
2nd row: <NA>
3rd row: 설계료(비정액제)
4th row: 시공비(신설,정액제)
5th row: 복구비(구경확대)

Common Values

Value                        Count   Frequency (%)
<NA>                         1257    12.6%
준공검사료(25mm이하)         849     8.5%
설계료(25mm이하)             826     8.3%
공사방수량(신설)             774     7.7%
자재비(신설,구경별정액제)    699     7.0%
시공비(신설,정액제)          690     6.9%
복구비(신설,구경별정액제)    664     6.6%
공사방수료(개조)             444     4.4%
원인자부담금(구경별정액제)   383     3.8%
원인자부담금(시설부담금)     242     2.4%
Other values (40)            3172    31.7%
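
The report lists `<NA>` as the most frequent 예산과목 value (1257 rows) while also reporting 0 missing values for the column, which suggests the placeholder reached the profiler as a literal string. The sketch below shows one way such placeholders could be treated as missing before counting. The helper name `normalise_missing` and the set of placeholder spellings are assumptions for illustration, and the sample is a tiny hand-made list, not the real data.

```python
from collections import Counter

# Assumed spellings of "missing" that may have been stored as text.
PLACEHOLDERS = {"<NA>", "", "nan", "None"}

def normalise_missing(values):
    """Return a copy of `values` with placeholder strings replaced by None."""
    return [None if v in PLACEHOLDERS else v for v in values]

# Illustrative sample only; labels borrowed from the sample rows above.
sample = ["준공검사료(25mm이하)", "<NA>", "설계료(비정액제)", "<NA>"]

cleaned = normalise_missing(sample)
missing = sum(v is None for v in cleaned)               # count of missing entries
counts = Counter(v for v in cleaned if v is not None)   # frequencies of real labels
print(f"missing={missing}, distinct non-missing={len(counts)}")
```

Applied to the full column, a step like this would move the 1257 `<NA>` rows out of the category counts and into the missing-value counts.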

Length

[Figure: histogram of lengths of the category]

Words

Value                       Count   Frequency (%)
na                          1257    12.6%
준공검사료(25mm이하         849     8.5%
설계료(25mm이하             826     8.3%
공사방수량(신설             774     7.7%
자재비(신설,구경별정액제    699     7.0%
시공비(신설,정액제          690     6.9%
복구비(신설,구경별정액제    664     6.6%
공사방수료(개조             444     4.4%
원인자부담금(구경별정액제   383     3.8%
원인자부담금(시설부담금     242     2.4%
Other values (40)           3172    31.7%

분류코드 (classification code)
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 51
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

물이용부담금(일반용): 1255
준공검사료(25mm이하): 849
설계료(25mm이하): 826
공사방수량(신설): 774
자재비(신설,구경별정액제): 699
Other values (46): 5597

Length

Max length: 17
Median length: 14
Mean length: 11.5476
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 준공검사료(25mm이하)
2nd row: 물이용부담금(일반용)
3rd row: 설계료(비정액제)
4th row: 시공비(신설,정액제)
5th row: 복구비(구경확대)

Common Values

Value                        Count   Frequency (%)
물이용부담금(일반용)         1255    12.6%
준공검사료(25mm이하)         849     8.5%
설계료(25mm이하)             826     8.3%
공사방수량(신설)             774     7.7%
자재비(신설,구경별정액제)    699     7.0%
시공비(신설,정액제)          690     6.9%
복구비(신설,구경별정액제)    664     6.6%
공사방수료(개조)             444     4.4%
원인자부담금(구경별정액제)   383     3.8%
원인자부담금(시설부담금)     242     2.4%
Other values (41)            3174    31.7%

Length

[Figure: histogram of lengths of the category]

Words

Value                       Count   Frequency (%)
물이용부담금(일반용         1255    12.6%
준공검사료(25mm이하         849     8.5%
설계료(25mm이하             826     8.3%
공사방수량(신설             774     7.7%
자재비(신설,구경별정액제    699     7.0%
시공비(신설,정액제          690     6.9%
복구비(신설,구경별정액제    664     6.6%
공사방수료(개조             444     4.4%
원인자부담금(구경별정액제   383     3.8%
원인자부담금(시설부담금     242     2.4%
Other values (41)           3174    31.7%

실제공사비 (actual construction cost)
Real number (ℝ)

ZEROS 

Distinct: 662
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 171540.04
Minimum: 0
Maximum: 35671000
Zeros: 8705
Zeros (%): 87.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 368964
Maximum: 35671000
Range: 35671000
Interquartile range (IQR): 0
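
The zero quartiles and zero IQR are a direct consequence of the large share of zeros in 실제공사비. The sketch below works this out from the counts and the mean reported for this variable. The mean of the non-zero rows is derived here, not taken from the report, and is approximate because the reported mean is rounded (the variable names are illustrative):

```python
# Values copied from the 실제공사비 statistics in this report.
n = 10_000
zeros = 8_705
mean = 171540.04

zero_share = zeros / n  # fraction of rows equal to 0

# The minimum is 0, so any quantile level comfortably below the zero share is 0.
levels = (0.05, 0.25, 0.50, 0.75, 0.95)
zero_quantiles = [q for q in levels if q <= zero_share]

# Approximate mean over the non-zero rows only (derived, not reported).
mean_nonzero = mean * n / (n - zeros)

print(f"zero share={zero_share:.4f}, zero-valued quantile levels={zero_quantiles}")
print(f"approximate mean of non-zero rows={mean_nonzero:,.0f}")
```

Only the 95th percentile lies above the zero share, which matches the single non-zero quantile in the table above.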

Descriptive statistics

Standard deviation: 1268091.7
Coefficient of variation (CV): 7.3923953
Kurtosis: 243.35333
Mean: 171540.04
Median Absolute Deviation (MAD): 0
Skewness: 13.714255
Sum: 1.7154004 × 10^9
Variance: 1.6080567 × 10^12
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value                Count   Frequency (%)
0                    8705    87.1%
7000                 277     2.8%
150                  148     1.5%
1330                 141     1.4%
8000                 9       0.1%
42930                7       0.1%
326700               5       0.1%
10000                5       0.1%
214650               4       < 0.1%
1282600              4       < 0.1%
Other values (652)   695     7.0%

Minimum 10 values

Value   Count   Frequency (%)
0       8705    87.1%
130     1       < 0.1%
150     148     1.5%
700     1       < 0.1%
1190    1       < 0.1%
1330    141     1.4%
6800    1       < 0.1%
7000    277     2.8%
8000    9       0.1%
10000   5       0.1%

Maximum 10 values

Value      Count   Frequency (%)
35671000   1       < 0.1%
32942000   1       < 0.1%
28389350   1       < 0.1%
21958000   1       < 0.1%
21954000   1       < 0.1%
21920000   1       < 0.1%
21862000   1       < 0.1%
21790000   1       < 0.1%
21758000   1       < 0.1%
20096000   1       < 0.1%

Interactions

[Figure: interaction plots (9 panels)]

Correlations

[Figure: correlation heatmap]

             연번    사업소코드  사업소명  예산과목  분류코드  실제공사비
연번         1.000   0.035       0.145     0.000     0.000     0.023
사업소코드   0.035   1.000       1.000     0.559     0.549     0.000
사업소명     0.145   1.000       1.000     0.573     0.556     0.107
예산과목     0.000   0.559       0.573     1.000     1.000     0.383
분류코드     0.000   0.549       0.556     1.000     1.000     0.390
실제공사비   0.023   0.000       0.107     0.383     0.390     1.000

[Figure: correlation heatmap]

           예산과목  분류코드  사업소명
예산과목   1.000     1.000     0.208
분류코드   1.000     1.000     0.199
사업소명   0.208     0.199     1.000

[Figure: correlation heatmap]

             연번     사업소코드  실제공사비  사업소명  예산과목  분류코드
연번         1.000    -0.052      0.001       0.060     0.000     0.000
사업소코드   -0.052   1.000       0.154       1.000     0.313     0.305
실제공사비   0.001    0.154       1.000       0.046     0.147     0.150
사업소명     0.060    1.000       0.046       1.000     0.208     0.199
예산과목     0.000    0.313       0.147       0.208     1.000     1.000
분류코드     0.000    0.305       0.150       0.199     1.000     1.000
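
This text version does not preserve which association measure each matrix uses. As one illustration of how a coefficient of 1.000 can arise between a code column and its matching name column, the sketch below computes the uncorrected Cramér's V for two label sequences. It is an assumption-laden example and is not claimed to be the computation behind the matrices above. The function name `cramers_v` and the toy rows are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed counts per (x, y) pair
    row = Counter(xs)             # marginal counts of xs
    col = Counter(ys)             # marginal counts of ys
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy rows: each code maps to exactly one name, as the sample rows suggest.
codes = [306, 309, 306, 312, 311, 311]
names = ["남부 사업소", "사하 사업소", "남부 사업소",
         "기장 사업소", "강서 사업소", "강서 사업소"]
v_paired = cramers_v(codes, names)

# Toy rows with no association between the two columns.
v_unrelated = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_paired, v_unrelated)
```

A one-to-one mapping between two columns gives the maximum association, which is consistent with the 1.000 entries for 사업소코드 and 사업소명 and for 예산과목 and 분류코드.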

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  연번    사업소코드  사업소명       예산과목               분류코드               실제공사비
23623    23624   306         남부 사업소    준공검사료(25mm이하)   준공검사료(25mm이하)   0
7769     7770    309         사하 사업소    <NA>                   물이용부담금(일반용)   0
10820    10821   306         남부 사업소    설계료(비정액제)       설계료(비정액제)       0
5799     5800    312         기장 사업소    시공비(신설,정액제)    시공비(신설,정액제)    0
19313    19314   306         남부 사업소    복구비(구경확대)       복구비(구경확대)       0
22572    22573   312         기장 사업소    시공비(이동공사)       시공비(이동공사)       0
22557    22558   311         강서 사업소    준공검사료(25mm이하)   준공검사료(25mm이하)   7000
20653    20654   302         서부 사업소    공사방수료(개조)       공사방수료(개조)       0
10117    10118   311         강서 사업소    공사방수량(신설)       공사방수량(신설)       0
19698    19699   311         강서 사업소    시공비(신설,정액제)    시공비(신설,정액제)    1327700

Last rows

(index)  연번    사업소코드  사업소명        예산과목                     분류코드                     실제공사비
4009     4010    304         부산진 사업소   시공비(이동공사)             시공비(이동공사)             0
1636     1637    311         강서 사업소     설계료(32mm이상)             설계료(32mm이상)             10000
10021    10022   312         기장 사업소     공사방수료(개조)             공사방수료(개조)             0
18246    18247   308         해운대 사업소   설계료(공동주택)             설계료(공동주택)             0
4238     4239    306         남부 사업소     설계료(32mm이상)             설계료(32mm이상)             0
19232    19233   312         기장 사업소     원인자부담금(구경별정액제)   원인자부담금(구경별정액제)   0
20709    20710   244         동래통합사업소  공사방수료(개조)             공사방수료(개조)             0
21348    21349   307         북부 사업소     원인자부담금(구경별정액제)   원인자부담금(구경별정액제)   0
2966     2967    312         기장 사업소     시공비(이동공사)             시공비(이동공사)             0
21992    21993   308         해운대 사업소   복구비(신설,구경별정액제)    복구비(신설,구경별정액제)    0