Overview

Dataset statistics

Number of variables: 7
Number of observations: 2238
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 131.3 KiB
Average record size in memory: 60.1 B
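
The size and missing-cell figures above follow from simple arithmetic on the other reported numbers. The following is a minimal sketch, assuming 1 KiB = 1024 B; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 2238
missing_cells = 1
total_size_kib = 131.3

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Share of missing cells across the whole table (rows x columns).
missing_pct = missing_cells / (n_variables * n_observations) * 100

print(f"{avg_record_bytes:.1f} B per record")
print(f"{missing_pct:.4f}% of cells missing")
```

Under that assumption the result agrees with the reported 60.1 B per record, and the missing share is well below the 0.1% display threshold.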

Variable types

Numeric: 3
Categorical: 4

Dataset

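Description: Civil-complaint application data (leak detection costs) used in the customer information system that the Busan Metropolitan City Waterworks Headquarters operates to calculate and collect water and sewerage charges.
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: https://www.data.go.kr/data/15083449/fileData.do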

Alerts

High correlation: 사업소코드 is highly overall correlated with 사업소명 and 1 other field
High correlation: 사업소명 is highly overall correlated with 사업소코드 and 2 other fields
High correlation: 신청부서명 is highly overall correlated with 사업소코드 and 1 other field
High correlation: 건물형태 is highly overall correlated with 사업소명
Imbalance: 건물형태 is highly imbalanced (51.5%)
Imbalance: 누수탐지지급액 is highly imbalanced (96.4%)
Unique: 연번 has unique values
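
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category counts. That formula is an assumption inferred from the numbers, not something the report states, and the function name below is hypothetical. A minimal sketch using the category counts listed in the Common Values tables of this report:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, 1 for a constant one."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from the report ("<NA>" is counted as a category).
building_type = [1616, 472, 127, 15, 8]   # 건물형태
payout = [2216, 16, 3, 1, 1, 1]           # 누수탐지지급액

print(f"{imbalance_score(building_type):.3f}")
print(f"{imbalance_score(payout):.3f}")
```

Both scores land close to the reported 51.5% and 96.4%, which supports the assumed formula without proving it.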

Reproduction

Analysis started: 2024-03-14 18:48:17.134720
Analysis finished: 2024-03-14 18:48:20.740912
Duration: 3.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
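
The reported duration can be checked against the two timestamps. This is a small sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2024-03-14 18:48:17.134720", FMT)
finished = datetime.strptime("2024-03-14 18:48:20.740912", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```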

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 2238
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1119.5
Minimum: 1
Maximum: 2238
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 112.85
Q1: 560.25
Median: 1119.5
Q3: 1678.75
95-th percentile: 2126.15
Maximum: 2238
Range: 2237
Interquartile range (IQR): 1118.5
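
The fractional percentiles above are what linear interpolation between neighbouring order statistics produces. The sketch below assumes that 연번 holds exactly the integers 1 to 2238 (suggested by the minimum, maximum, and distinct count, but not stated outright); `percentile` is a hypothetical helper, not a function from the profiling tool.

```python
def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Assumed contents of 연번: a running index from 1 to 2238.
values = list(range(1, 2239))

q1 = percentile(values, 25)
q3 = percentile(values, 75)
print(percentile(values, 5), q1, percentile(values, 50), q3, percentile(values, 95))
print("IQR:", q3 - q1)
```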

Descriptive statistics

Standard deviation: 646.19927
Coefficient of variation (CV): 0.57722132
Kurtosis: -1.2
Mean: 1119.5
Median Absolute Deviation (MAD): 559.5
Skewness: 0
Sum: 2505441
Variance: 417573.5
Monotonicity: Strictly increasing
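
These descriptive statistics can be reproduced with the Python standard library, again under the assumption that 연번 is the sequence 1 to 2238. The sketch uses the sample variance (n - 1 denominator), which is the convention that matches the reported value.

```python
import statistics

values = list(range(1, 2239))  # assumed contents of 연번

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
cv = std / mean                          # coefficient of variation

print(sum(values), mean, variance)
print(round(std, 5), round(cv, 8), mad)
```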
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1496 | 1 | < 0.1%
1490 | 1 | < 0.1%
1491 | 1 | < 0.1%
1492 | 1 | < 0.1%
1493 | 1 | < 0.1%
1494 | 1 | < 0.1%
1495 | 1 | < 0.1%
1497 | 1 | < 0.1%
1505 | 1 | < 0.1%
Other values (2228) | 2228 | 99.6%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
2238 | 1 | < 0.1%
2237 | 1 | < 0.1%
2236 | 1 | < 0.1%
2235 | 1 | < 0.1%
2234 | 1 | < 0.1%
2233 | 1 | < 0.1%
2232 | 1 | < 0.1%
2231 | 1 | < 0.1%
2230 | 1 | < 0.1%
2229 | 1 | < 0.1%

사업소코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 289.06122
Minimum: 244
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 KiB

Quantile statistics

Minimum: 244
5-th percentile: 244
Q1: 244
Median: 304
Q3: 307
95-th percentile: 311
Maximum: 312
Range: 68
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 27.711963
Coefficient of variation (CV): 0.095868838
Kurtosis: -0.97162848
Mean: 289.06122
Median Absolute Deviation (MAD): 3
Skewness: -0.99235789
Sum: 646919
Variance: 767.95289
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
244 | 610 | 27.3%
304 | 272 | 12.2%
306 | 260 | 11.6%
307 | 235 | 10.5%
309 | 179 | 8.0%
301 | 147 | 6.6%
308 | 136 | 6.1%
303 | 117 | 5.2%
302 | 111 | 5.0%
312 | 86 | 3.8%

Value | Count | Frequency (%)
244 | 610 | 27.3%
301 | 147 | 6.6%
302 | 111 | 5.0%
303 | 117 | 5.2%
304 | 272 | 12.2%
306 | 260 | 11.6%
307 | 235 | 10.5%
308 | 136 | 6.1%
309 | 179 | 8.0%
311 | 85 | 3.8%

Value | Count | Frequency (%)
312 | 86 | 3.8%
311 | 85 | 3.8%
309 | 179 | 8.0%
308 | 136 | 6.1%
307 | 235 | 10.5%
306 | 260 | 11.6%
304 | 272 | 12.2%
303 | 117 | 5.2%
302 | 111 | 5.0%
301 | 147 | 6.6%
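
Because 사업소코드 has only 11 distinct values, its sum, mean, and median can be recomputed directly from the frequency tables. The sketch below uses the counts listed in this section; `value_at` is a hypothetical helper introduced for illustration.

```python
# Value -> count, copied from the 사업소코드 frequency tables.
counts = {244: 610, 301: 147, 302: 111, 303: 117, 304: 272, 306: 260,
          307: 235, 308: 136, 309: 179, 311: 85, 312: 86}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

def value_at(rank):
    """Value at the given 0-based rank of the sorted, expanded data."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if rank < seen:
            return value
    raise IndexError(rank)

# n is even, so the median averages the two middle observations.
median = (value_at(n // 2 - 1) + value_at(n // 2)) / 2
print(n, total, round(mean, 5), median)
```

The counts add up to the 2238 observations, and the recomputed sum, mean, and median agree with the reported 646919, 289.06122, and 304.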

사업소명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 17.6 KiB
동래통합사업소: 610
부산진 사업소: 272
남부사업소: 260
북부사업소: 235
사하사업소: 179
Other values (6): 682

Length

Max length: 9
Median length: 8
Mean length: 6.2345845
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산진 사업소
2nd row: 남부사업소
3rd row: 남부사업소
4th row: 남부사업소
5th row: 강서사업소

Common Values

Value | Count | Frequency (%)
동래통합사업소 | 610 | 27.3%
부산진 사업소 | 272 | 12.2%
남부사업소 | 260 | 11.6%
북부사업소 | 235 | 10.5%
사하사업소 | 179 | 8.0%
중동부사업소 | 147 | 6.6%
해운대사업소 | 136 | 6.1%
영도사업소 | 117 | 5.2%
서부 사업소 | 111 | 5.0%
기장사업소 | 86 | 3.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
동래통합사업소 | 610 | 23.3%
사업소 | 383 | 14.6%
부산진 | 272 | 10.4%
남부사업소 | 260 | 9.9%
북부사업소 | 235 | 9.0%
사하사업소 | 179 | 6.8%
중동부사업소 | 147 | 5.6%
해운대사업소 | 136 | 5.2%
영도사업소 | 117 | 4.5%
서부 | 111 | 4.2%
Other values (2) | 171 | 6.5%

신청부서명
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 17.6 KiB
급수운영팀: 603
요금1: 449
공무1: 282
요금: 270
공무: 259
Other values (5): 375

Length

Max length: 5
Median length: 4
Mean length: 3.3087578
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 요금1
2nd row: 요금2
3rd row: 요금2
4th row: 요금2
5th row: 공무

Common Values

Value | Count | Frequency (%)
급수운영팀 | 603 | 26.9%
요금1 | 449 | 20.1%
공무1 | 282 | 12.6%
요금 | 270 | 12.1%
공무 | 259 | 11.6%
요금2 | 257 | 11.5%
공무2 | 99 | 4.4%
<NA> | 15 | 0.7%
업무 | 3 | 0.1%
행정지원팀 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
급수운영팀 | 603 | 26.9%
요금1 | 449 | 20.1%
공무1 | 282 | 12.6%
요금 | 270 | 12.1%
공무 | 259 | 11.6%
요금2 | 257 | 11.5%
공무2 | 99 | 4.4%
na | 15 | 0.7%
업무 | 3 | 0.1%
행정지원팀 | 1 | < 0.1%

건물형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.6 KiB
단독주택: 1616
기타: 472
공동주택: 127
<NA>: 15
근린생활시설(상가 등): 8

Length

Max length: 12
Median length: 4
Mean length: 3.6067918
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타
2nd row: 단독주택
3rd row: 단독주택
4th row: 공동주택
5th row: 단독주택

Common Values

Value | Count | Frequency (%)
단독주택 | 1616 | 72.2%
기타 | 472 | 21.1%
공동주택 | 127 | 5.7%
<NA> | 15 | 0.7%
근린생활시설(상가 등) | 8 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
단독주택 | 1616 | 72.0%
기타 | 472 | 21.0%
공동주택 | 127 | 5.7%
na | 15 | 0.7%
근린생활시설(상가 | 8 | 0.4%
(token not recoverable) | 8 | 0.4%

누수탐지소요액
Real number (ℝ)

Distinct: 53
Distinct (%): 2.4%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 252241.39
Minimum: 0
Maximum: 2100000
Zeros: 16
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 100000
Q1: 100000
Median: 200000
Q3: 300000
95-th percentile: 500000
Maximum: 2100000
Range: 2100000
Interquartile range (IQR): 200000

Descriptive statistics

Standard deviation: 179018
Coefficient of variation (CV): 0.70970906
Kurtosis: 16.350013
Mean: 252241.39
Median Absolute Deviation (MAD): 100000
Skewness: 2.9538575
Sum: 5.64264 × 10^8
Variance: 3.2047445 × 10^10
Monotonicity: Not monotonic
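
Several of the figures above are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of non-missing observations. The sketch below checks these relationships using the rounded values printed in this report, so small rounding differences are expected.

```python
# Figures copied from the 누수탐지소요액 statistics (rounded as printed).
n_observations = 2238
n_missing = 1
mean = 252241.39
std = 179018.0

# Only non-missing observations contribute to the sum and the mean.
n_valid = n_observations - n_missing

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the squared standard deviation
total = mean * n_valid   # sum over non-missing observations

print(f"CV = {cv:.6f}")
print(f"variance = {variance:.5e}")
print(f"sum = {total:.5e}")
```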
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100000 | 508 | 22.7%
300000 | 445 | 19.9%
200000 | 346 | 15.5%
150000 | 202 | 9.0%
250000 | 174 | 7.8%
400000 | 135 | 6.0%
500000 | 86 | 3.8%
350000 | 80 | 3.6%
80000 | 44 | 2.0%
600000 | 35 | 1.6%
Other values (43) | 182 | 8.1%

Value | Count | Frequency (%)
0 | 16 | 0.7%
40000 | 1 | < 0.1%
50000 | 3 | 0.1%
60000 | 2 | 0.1%
70000 | 2 | 0.1%
80000 | 44 | 2.0%
90000 | 29 | 1.3%
100000 | 508 | 22.7%
110000 | 1 | < 0.1%
120000 | 3 | 0.1%

Value | Count | Frequency (%)
2100000 | 1 | < 0.1%
1700000 | 1 | < 0.1%
1680000 | 1 | < 0.1%
1540000 | 1 | < 0.1%
1500000 | 2 | 0.1%
1320000 | 3 | 0.1%
1300000 | 3 | 0.1%
1200000 | 1 | < 0.1%
1100000 | 1 | < 0.1%
1000000 | 9 | 0.4%

누수탐지지급액
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 17.6 KiB
40000: 2216
0: 16
25000: 3
20000: 1
<NA>: 1

Length

Max length: 5
Median length: 5
Mean length: 4.9709562
Min length: 1

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 40000
2nd row: 40000
3rd row: 40000
4th row: 40000
5th row: 40000

Common Values

Value | Count | Frequency (%)
40000 | 2216 | 99.0%
0 | 16 | 0.7%
25000 | 3 | 0.1%
20000 | 1 | < 0.1%
<NA> | 1 | < 0.1%
30000 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
40000 | 2216 | 99.0%
0 | 16 | 0.7%
25000 | 3 | 0.1%
20000 | 1 | < 0.1%
na | 1 | < 0.1%
30000 | 1 | < 0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
연번 | 1.000 | 0.146 | 0.220 | 0.165 | 0.126 | 0.078 | 0.182
사업소코드 | 0.146 | 1.000 | 1.000 | 0.573 | 0.489 | 0.000 | 0.000
사업소명 | 0.220 | 1.000 | 1.000 | 0.919 | 0.807 | 0.175 | 0.000
신청부서명 | 0.165 | 0.573 | 0.919 | 1.000 | 0.637 | 0.133 | 0.034
건물형태 | 0.126 | 0.489 | 0.807 | 0.637 | 1.000 | 0.084 | 0.000
누수탐지소요액 | 0.078 | 0.000 | 0.175 | 0.133 | 0.084 | 1.000 | 0.000
누수탐지지급액 | 0.182 | 0.000 | 0.000 | 0.034 | 0.000 | 0.000 | 1.000

Variable | 건물형태 | 신청부서명 | 사업소명 | 누수탐지지급액
건물형태 | 1.000 | 0.466 | 0.641 | 0.000
신청부서명 | 0.466 | 1.000 | 0.754 | 0.019
사업소명 | 0.641 | 0.754 | 1.000 | 0.000
누수탐지지급액 | 0.000 | 0.019 | 0.000 | 1.000

Variable | 연번 | 사업소코드 | 누수탐지소요액 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지지급액
연번 | 1.000 | 0.024 | -0.019 | 0.095 | 0.075 | 0.076 | 0.076
사업소코드 | 0.024 | 1.000 | 0.013 | 0.998 | 0.830 | 0.325 | 0.000
누수탐지소요액 | -0.019 | 0.013 | 1.000 | 0.075 | 0.060 | 0.050 | 0.000
사업소명 | 0.095 | 0.998 | 0.075 | 1.000 | 0.754 | 0.641 | 0.000
신청부서명 | 0.075 | 0.830 | 0.060 | 0.754 | 1.000 | 0.466 | 0.019
건물형태 | 0.076 | 0.325 | 0.050 | 0.641 | 0.466 | 1.000 | 0.000
누수탐지지급액 | 0.076 | 0.000 | 0.000 | 0.000 | 0.019 | 0.000 | 1.000
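
The high-correlation alerts at the top of the report can be reproduced from the last correlation matrix above by flagging every pair whose absolute coefficient exceeds a cut-off. The 0.5 threshold below is an assumption that happens to match the alerts, and `high_correlations` is a hypothetical helper, not part of the profiling tool.

```python
THRESHOLD = 0.5  # assumed cut-off for a "High correlation" alert

columns = ["연번", "사업소코드", "누수탐지소요액", "사업소명",
           "신청부서명", "건물형태", "누수탐지지급액"]

# Rows of the last correlation matrix, in the same column order.
matrix = [
    [1.000, 0.024, -0.019, 0.095, 0.075, 0.076, 0.076],
    [0.024, 1.000, 0.013, 0.998, 0.830, 0.325, 0.000],
    [-0.019, 0.013, 1.000, 0.075, 0.060, 0.050, 0.000],
    [0.095, 0.998, 0.075, 1.000, 0.754, 0.641, 0.000],
    [0.075, 0.830, 0.060, 0.754, 1.000, 0.466, 0.019],
    [0.076, 0.325, 0.050, 0.641, 0.466, 1.000, 0.000],
    [0.076, 0.000, 0.000, 0.000, 0.019, 0.000, 1.000],
]

def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns it correlates with above threshold."""
    flagged = {}
    for i, name in enumerate(columns):
        partners = [columns[j] for j, value in enumerate(matrix[i])
                    if j != i and abs(value) > threshold]
        if partners:
            flagged[name] = partners
    return flagged

flagged = high_correlations(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this threshold the flagged pairs line up with the four high-correlation alerts: 사업소코드 with two fields, 사업소명 with three, 신청부서명 with two, and 건물형태 with 사업소명 only.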

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

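(index) | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
0 | 1 | 304 | 부산진 사업소 | 요금1 | 기타 | 500000 | 40000
1 | 2 | 306 | 남부사업소 | 요금2 | 단독주택 | 150000 | 40000
2 | 3 | 306 | 남부사업소 | 요금2 | 단독주택 | 200000 | 40000
3 | 4 | 306 | 남부사업소 | 요금2 | 공동주택 | 200000 | 40000
4 | 5 | 311 | 강서사업소 | 공무 | 단독주택 | 250000 | 40000
5 | 6 | 311 | 강서사업소 | 공무 | 단독주택 | 600000 | 40000
6 | 7 | 311 | 강서사업소 | 공무 | 단독주택 | 300000 | 40000
7 | 8 | 311 | 강서사업소 | 공무 | 단독주택 | 200000 | 40000
8 | 9 | 309 | 사하사업소 | 요금1 | 단독주택 | 100000 | 40000
9 | 10 | 309 | 사하사업소 | 요금1 | 단독주택 | 80000 | 40000

(index) | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
2228 | 2229 | 307 | 북부사업소 | 공무1 | 단독주택 | 200000 | 40000
2229 | 2230 | 307 | 북부사업소 | 공무1 | 단독주택 | 300000 | 40000
2230 | 2231 | 307 | 북부사업소 | 공무1 | 단독주택 | 200000 | 40000
2231 | 2232 | 307 | 북부사업소 | 공무1 | 단독주택 | 150000 | 40000
2232 | 2233 | 311 | 강서사업소 | 요금 | 단독주택 | 150000 | 40000
2233 | 2234 | 311 | 강서사업소 | 요금 | 단독주택 | 250000 | 40000
2234 | 2235 | 311 | 강서사업소 | 요금 | 단독주택 | 300000 | 40000
2235 | 2236 | 307 | 북부사업소 | 공무1 | 단독주택 | 350000 | 40000
2236 | 2237 | 301 | 중동부사업소 | 공무2 | 단독주택 | 80000 | 40000
2237 | 2238 | 309 | 사하사업소 | 요금1 | 단독주택 | 120000 | 40000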