Overview

Dataset statistics

Number of variables: 7
Number of observations: 2593
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 152.1 KiB
Average record size in memory: 60.1 B
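
The derived figures in this block follow from the raw counts. The sketch below recomputes them, assuming the missing-cell rate is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative only.

```python
# Values copied from the dataset statistics in this report.
n_variables = 7
n_observations = 2593
missing_cells = 2
total_size_kib = 152.1

# Assumption: the missing-cell rate is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing-cell rate falls well below the 0.1% threshold, which would explain why the report shows "< 0.1%" rather than an exact value.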

Variable types

Numeric: 4
Categorical: 3

Dataset

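Description: 부산광역시상수도사업본부_수용가정보시스템_민원신청정보_누수탐지비_20220131 (Busan Metropolitan City Waterworks Headquarters, customer information system, civil-complaint application records, leak detection cost, 20220131)
Author: 부산광역시 상수도사업본부 (Busan Metropolitan City Waterworks Headquarters)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083449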

Alerts

사업소코드 is highly overall correlated with 사업소명 and 1 other field [High correlation]
사업소명 is highly overall correlated with 사업소코드 and 2 other fields [High correlation]
신청부서명 is highly overall correlated with 사업소코드 and 1 other field [High correlation]
건물형태 is highly overall correlated with 사업소명 [High correlation]
건물형태 is highly imbalanced (60.4%) [Imbalance]
누수탐지지급액 is highly skewed (γ1 = 35.07234848) [Skewed]
연번 has unique values [Unique]
누수탐지소요액 has 28 (1.1%) zeros [Zeros]
누수탐지지급액 has 27 (1.0%) zeros [Zeros]
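
The 60.4% imbalance figure for 건물형태 is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below checks that reading using only the counts listed in this report. Treating this as the formula behind the alert is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        return 0.0  # a single class has no meaningful imbalance in this sketch
    entropy = -sum(p * math.log(p) for p in shares)
    return 1 - entropy / math.log(len(shares))


# Category counts for 건물형태, copied from its frequency table.
building_type_counts = [2114, 276, 176, 25, 2]
score = imbalance_score(building_type_counts)
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes and values near 1 mean that one class dominates, so a single category holding 81.5% of the rows lands well above the midpoint.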

Reproduction

Analysis started: 2023-12-10 16:56:38.954178
Analysis finished: 2023-12-10 16:56:43.607695
Duration: 4.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
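
The reported duration can be cross-checked against the two timestamps. A minimal sketch using the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 16:56:38.954178")
finished = datetime.fromisoformat("2023-12-10 16:56:43.607695")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```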

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 2593
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1297
Minimum: 1
Maximum: 2593
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 130.6
Q1: 649
median: 1297
Q3: 1945
95-th percentile: 2463.4
Maximum: 2593
Range: 2592
Interquartile range (IQR): 1296

Descriptive statistics

Standard deviation: 748.67895
Coefficient of variation (CV): 0.57723897
Kurtosis: -1.2
Mean: 1297
Median Absolute Deviation (MAD): 648
Skewness: 0
Sum: 3363121
Variance: 560520.17
Monotonicity: Strictly increasing
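
Because 연번 runs 1, 2, ..., 2593 with no gaps, its statistics can be checked directly. The sketch below uses only the Python standard library. It assumes the reported variance is the sample variance (divisor n - 1) and that percentiles use linear interpolation, which is what the reported numbers are consistent with. `percentile` is a hypothetical helper.

```python
import statistics

values = list(range(1, 2594))  # 1, 2, ..., 2593

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)
cv = std / mean

# Median absolute deviation around the median.
center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
print(total, mean, round(variance, 2), round(p05, 1), q1, q3, round(p95, 1))
```

Since the column is just a row counter, none of these statistics say anything about the leak detection records themselves.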
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1743 | 1 | < 0.1%
1725 | 1 | < 0.1%
1726 | 1 | < 0.1%
1727 | 1 | < 0.1%
1728 | 1 | < 0.1%
1729 | 1 | < 0.1%
1730 | 1 | < 0.1%
1731 | 1 | < 0.1%
1732 | 1 | < 0.1%
Other values (2583) | 2583 | 99.6%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
2593 | 1 | < 0.1%
2592 | 1 | < 0.1%
2591 | 1 | < 0.1%
2590 | 1 | < 0.1%
2589 | 1 | < 0.1%
2588 | 1 | < 0.1%
2587 | 1 | < 0.1%
2586 | 1 | < 0.1%
2585 | 1 | < 0.1%
2584 | 1 | < 0.1%

사업소코드 (office code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 289.69148
Minimum: 244
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 244
5-th percentile: 244
Q1: 244
median: 304
Q3: 307
95-th percentile: 311
Maximum: 312
Range: 68
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 26.968941
Coefficient of variation (CV): 0.093095389
Kurtosis: -0.77714593
Mean: 289.69148
Median Absolute Deviation (MAD): 3
Skewness: -1.0829919
Sum: 751170
Variance: 727.32376
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Value | Count | Frequency (%)
244 | 665 | 25.6%
306 | 336 | 13.0%
304 | 271 | 10.5%
301 | 226 | 8.7%
307 | 222 | 8.6%
303 | 197 | 7.6%
302 | 181 | 7.0%
308 | 173 | 6.7%
309 | 165 | 6.4%
312 | 81 | 3.1%

Value | Count | Frequency (%)
244 | 665 | 25.6%
301 | 226 | 8.7%
302 | 181 | 7.0%
303 | 197 | 7.6%
304 | 271 | 10.5%
306 | 336 | 13.0%
307 | 222 | 8.6%
308 | 173 | 6.7%
309 | 165 | 6.4%
311 | 76 | 2.9%

Value | Count | Frequency (%)
312 | 81 | 3.1%
311 | 76 | 2.9%
309 | 165 | 6.4%
308 | 173 | 6.7%
307 | 222 | 8.6%
306 | 336 | 13.0%
304 | 271 | 10.5%
303 | 197 | 7.6%
302 | 181 | 7.0%
301 | 226 | 8.7%

사업소명 (office name)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

동래통합사업소: 665
남부 사업소: 336
부산진 사업소: 271
중동부 사업소: 226
북부 사업소: 222
Other values (6): 873

Length

Max length: 9
Median length: 8
Mean length: 8.2286926
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남부 사업소
2nd row: 남부 사업소
3rd row: 남부 사업소
4th row: 남부 사업소
5th row: 남부 사업소

Common Values

Value | Count | Frequency (%)
동래통합사업소 | 665 | 25.6%
남부 사업소 | 336 | 13.0%
부산진 사업소 | 271 | 10.5%
중동부 사업소 | 226 | 8.7%
북부 사업소 | 222 | 8.6%
영도 사업소 | 197 | 7.6%
서부 사업소 | 181 | 7.0%
해운대 사업소 | 173 | 6.7%
사하 사업소 | 165 | 6.4%
기장 사업소 | 81 | 3.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
사업소 | 1928 | 42.6%
동래통합사업소 | 665 | 14.7%
남부 | 336 | 7.4%
부산진 | 271 | 6.0%
중동부 | 226 | 5.0%
북부 | 222 | 4.9%
영도 | 197 | 4.4%
서부 | 181 | 4.0%
해운대 | 173 | 3.8%
사하 | 165 | 3.6%
Other values (2) | 157 | 3.5%

신청부서명 (applying department name)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

급수운영팀: 657
요금1: 497
요금: 424
요금2: 384
공무: 192
Other values (6): 439

Length

Max length: 5
Median length: 4
Mean length: 3.2398766
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 요금2
2nd row: 요금2
3rd row: 요금2
4th row: 요금2
5th row: 요금2

Common Values

Value | Count | Frequency (%)
급수운영팀 | 657 | 25.3%
요금1 | 497 | 19.2%
요금 | 424 | 16.4%
요금2 | 384 | 14.8%
공무 | 192 | 7.4%
공무1 | 170 | 6.6%
공무2 | 144 | 5.6%
업무 | 74 | 2.9%
서무 | 26 | 1.0%
<NA> | 24 | 0.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
급수운영팀 | 657 | 25.3%
요금1 | 497 | 19.2%
요금 | 424 | 16.4%
요금2 | 384 | 14.8%
공무 | 192 | 7.4%
공무1 | 170 | 6.6%
공무2 | 144 | 5.6%
업무 | 74 | 2.9%
서무 | 26 | 1.0%
na | 24 | 0.9%

건물형태 (building type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.4 KiB

단독주택: 2114
기타: 276
공동주택: 176
<NA>: 25
근린생활시설(상가 등): 2

Length

Max length: 12
Median length: 4
Mean length: 3.7932896
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단독주택
2nd row: 단독주택
3rd row: 단독주택
4th row: 단독주택
5th row: 단독주택

Common Values

Value | Count | Frequency (%)
단독주택 | 2114 | 81.5%
기타 | 276 | 10.6%
공동주택 | 176 | 6.8%
<NA> | 25 | 1.0%
근린생활시설(상가 등) | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
단독주택 | 2114 | 81.5%
기타 | 276 | 10.6%
공동주택 | 176 | 6.8%
na | 25 | 1.0%
근린생활시설(상가 | 2 | 0.1%
(value missing in source) | 2 | 0.1%

누수탐지소요액 (leak detection cost incurred)
Real number (ℝ)

ZEROS

Distinct: 59
Distinct (%): 2.3%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 244266.59
Minimum: 0
Maximum: 2500000
Zeros: 28
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 90000
Q1: 120000
median: 200000
Q3: 300000
95-th percentile: 550000
Maximum: 2500000
Range: 2500000
Interquartile range (IQR): 180000

Descriptive statistics

Standard deviation: 188978.67
Coefficient of variation (CV): 0.77365746
Kurtosis: 27.429373
Mean: 244266.59
Median Absolute Deviation (MAD): 100000
Skewness: 3.8231064
Sum: 6.33139 × 10^8
Variance: 3.5712937 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
100000 | 490 | 18.9%
150000 | 428 | 16.5%
200000 | 405 | 15.6%
300000 | 399 | 15.4%
250000 | 213 | 8.2%
400000 | 131 | 5.1%
500000 | 90 | 3.5%
350000 | 89 | 3.4%
80000 | 78 | 3.0%
600000 | 43 | 1.7%
Other values (49) | 226 | 8.7%

Value | Count | Frequency (%)
0 | 28 | 1.1%
20000 | 1 | < 0.1%
30000 | 1 | < 0.1%
40000 | 12 | 0.5%
50000 | 4 | 0.2%
60000 | 1 | < 0.1%
70000 | 3 | 0.1%
80000 | 78 | 3.0%
90000 | 26 | 1.0%
95000 | 1 | < 0.1%

Value | Count | Frequency (%)
2500000 | 1 | < 0.1%
2400000 | 1 | < 0.1%
2000000 | 1 | < 0.1%
1800000 | 1 | < 0.1%
1705000 | 1 | < 0.1%
1700000 | 1 | < 0.1%
1600000 | 1 | < 0.1%
1500000 | 4 | 0.2%
1300000 | 2 | 0.1%
1200000 | 4 | 0.2%

누수탐지지급액 (leak detection amount paid)
Real number (ℝ)

SKEWED, ZEROS

Distinct: 7
Distinct (%): 0.3%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 39706.79
Minimum: 0
Maximum: 450000
Zeros: 27
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40000
Q1: 40000
median: 40000
Q3: 40000
95-th percentile: 40000
Maximum: 450000
Range: 450000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 9052.5112
Coefficient of variation (CV): 0.22798396
Kurtosis: 1633.328
Mean: 39706.79
Median Absolute Deviation (MAD): 0
Skewness: 35.072348
Sum: 1.0292 × 10^8
Variance: 81947959
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
40000 | 2556 | 98.6%
0 | 27 | 1.0%
25000 | 3 | 0.1%
35000 | 3 | 0.1%
20000 | 1 | < 0.1%
450000 | 1 | < 0.1%
30000 | 1 | < 0.1%
(Missing) | 1 | < 0.1%

Value | Count | Frequency (%)
0 | 27 | 1.0%
20000 | 1 | < 0.1%
25000 | 3 | 0.1%
30000 | 1 | < 0.1%
35000 | 3 | 0.1%
40000 | 2556 | 98.6%
450000 | 1 | < 0.1%

Value | Count | Frequency (%)
450000 | 1 | < 0.1%
40000 | 2556 | 98.6%
35000 | 3 | 0.1%
30000 | 1 | < 0.1%
25000 | 3 | 0.1%
20000 | 1 | < 0.1%
0 | 27 | 1.0%
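
With only seven distinct values, the summary statistics for 누수탐지지급액 can be rebuilt from its frequency table alone. The sketch below does this with the standard library. It assumes the variance uses divisor n - 1 and that the skewness is the bias-adjusted Fisher-Pearson coefficient, which is what the reported figures are consistent with.

```python
import math

# Value -> count, copied from the frequency table for 누수탐지지급액
# (the single missing entry is excluded).
freq = {40000: 2556, 0: 27, 25000: 3, 35000: 3, 20000: 1, 450000: 1, 30000: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Second and third central moments, weighted by the counts.
m2 = sum(count * (value - mean) ** 2 for value, count in freq.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in freq.items()) / n

# Assumption: sample variance with divisor n - 1.
sample_variance = m2 * n / (n - 1)
std = math.sqrt(sample_variance)

# Assumption: bias-adjusted Fisher-Pearson skewness.
g1 = m3 / m2 ** 1.5
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)

print(f"n={n} sum={total} mean={mean:.2f} std={std:.4f} skewness={skewness:.3f}")
```

Almost all of the third moment comes from the single 450000 record, so the SKEWED alert reflects one extreme payment rather than a long tail of large values.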

Interactions

[Figure: 16 pairwise interaction plots]

Correlations

Variable | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
연번 | 1.000 | 0.182 | 0.221 | 0.290 | 0.106 | 0.000 | 0.003
사업소코드 | 0.182 | 1.000 | 1.000 | 0.820 | 0.657 | 0.109 | 0.000
사업소명 | 0.221 | 1.000 | 1.000 | 0.930 | 0.841 | 0.218 | 0.017
신청부서명 | 0.290 | 0.820 | 0.930 | 1.000 | 0.662 | 0.205 | 0.058
건물형태 | 0.106 | 0.657 | 0.841 | 0.662 | 1.000 | 0.000 | 0.000
누수탐지소요액 | 0.000 | 0.109 | 0.218 | 0.205 | 0.000 | 1.000 | 0.000
누수탐지지급액 | 0.003 | 0.000 | 0.017 | 0.058 | 0.000 | 0.000 | 1.000

Variable | 신청부서명 | 사업소명 | 건물형태
신청부서명 | 1.000 | 0.748 | 0.462
사업소명 | 0.748 | 1.000 | 0.692
건물형태 | 0.462 | 0.692 | 1.000

Variable | 연번 | 사업소코드 | 누수탐지소요액 | 누수탐지지급액 | 사업소명 | 신청부서명 | 건물형태
연번 | 1.000 | -0.031 | -0.028 | -0.092 | 0.096 | 0.093 | 0.064
사업소코드 | -0.031 | 1.000 | 0.001 | 0.007 | 0.998 | 0.839 | 0.366
누수탐지소요액 | -0.028 | 0.001 | 1.000 | 0.203 | 0.100 | 0.094 | 0.000
누수탐지지급액 | -0.092 | 0.007 | 0.203 | 1.000 | 0.016 | 0.045 | 0.000
사업소명 | 0.096 | 0.998 | 0.100 | 0.016 | 1.000 | 0.748 | 0.692
신청부서명 | 0.093 | 0.839 | 0.094 | 0.045 | 0.748 | 1.000 | 0.462
건물형태 | 0.064 | 0.366 | 0.000 | 0.000 | 0.692 | 0.462 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
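
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the is-missing indicators of two columns. Reading "nullity correlation" this way is an assumption, `nullity_correlation` is a hypothetical helper, and the toy columns are made up rather than taken from the dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)


# Made-up toy columns, not taken from the dataset.
dept = ["요금2", None, "공무", None, "요금1", "요금"]
building = [None, None, "기타", None, "단독주택", "단독주택"]

r = nullity_correlation(dept, building)
print(f"{r:.3f}")
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one is present when the other is absent, and a value near 0 means their gaps are unrelated.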

Sample

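Index | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
0 | 1 | 306 | 남부 사업소 | 요금2 | 단독주택 | 150000 | 40000
1 | 2 | 306 | 남부 사업소 | 요금2 | 단독주택 | 100000 | 40000
2 | 3 | 306 | 남부 사업소 | 요금2 | 단독주택 | 600000 | 40000
3 | 4 | 306 | 남부 사업소 | 요금2 | 단독주택 | 500000 | 40000
4 | 5 | 306 | 남부 사업소 | 요금2 | 단독주택 | 1200000 | 40000
5 | 6 | 306 | 남부 사업소 | 요금2 | 단독주택 | 350000 | 40000
6 | 7 | 306 | 남부 사업소 | 요금2 | 단독주택 | 300000 | 40000
7 | 8 | 308 | 해운대 사업소 | 공무 | 단독주택 | 300000 | 40000
8 | 9 | 304 | 부산진 사업소 | 요금2 | 기타 | 800000 | 40000
9 | 10 | 304 | 부산진 사업소 | 요금1 | 기타 | 200000 | 40000

Index | 연번 | 사업소코드 | 사업소명 | 신청부서명 | 건물형태 | 누수탐지소요액 | 누수탐지지급액
2583 | 2584 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 300000 | 40000
2584 | 2585 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 1000000 | 40000
2585 | 2586 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 150000 | 40000
2586 | 2587 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 200000 | 40000
2587 | 2588 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 500000 | 40000
2588 | 2589 | 244 | 동래통합사업소 | <NA> | <NA> | 0 | 0
2589 | 2590 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 700000 | 40000
2590 | 2591 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 300000 | 40000
2591 | 2592 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 150000 | 40000
2592 | 2593 | 244 | 동래통합사업소 | 급수운영팀 | 단독주택 | 150000 | 40000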