Overview

Dataset statistics

Number of variables: 8
Number of observations: 1087
Missing cells: 12
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 74.4 KiB
Average record size in memory: 70.1 B
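The percentage and per-record figures above follow from the raw counts. The short check below is only a sketch: the variable names are illustrative, and it assumes the report divides missing cells by variables × observations and converts KiB to bytes with a factor of 1024.

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 1087
missing_cells = 12
total_size_kib = 74.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average memory per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both derived values agree with the figures reported above.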

Variable types

Numeric: 5
Categorical: 3

Dataset

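Description: 부산광역시상수도사업본부_수용가정보시스템_민원신청정보_누수탐지비_20210601 (Busan Metropolitan City Waterworks Headquarters, customer information system, civil petition application data, leak detection costs, 20210601)
Author: 부산광역시 상수도사업본부 (Busan Metropolitan City Waterworks Headquarters)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083449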

Alerts

High correlation: 사업소코드 is highly overall correlated with 신청부서 and 2 other fields
High correlation: 신청부서 is highly overall correlated with 사업소코드 and 2 other fields
High correlation: 사업소명 is highly overall correlated with 사업소코드 and 3 other fields
High correlation: 신청부서명 is highly overall correlated with 사업소코드 and 2 other fields
High correlation: 건물형태 is highly overall correlated with 사업소명
Imbalance: 건물형태 is highly imbalanced (57.9%)
Missing: 신청부서 has 12 (1.1%) missing values
Skewed: 누수탐지지급액 is highly skewed (γ1 = 26.52450838)
Unique: 연번 has unique values
Zeros: 누수탐지소요액 has 15 (1.4%) zeros
Zeros: 누수탐지지급액 has 15 (1.4%) zeros
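The 57.9% imbalance figure for 건물형태 can be reproduced from the category counts listed in that variable's section of this report. The sketch below assumes the score is one minus the Shannon entropy normalised by the logarithm of the number of categories, with the literal <NA> string counted as its own category. The function name imbalance_score is made up for illustration, and the profiler's actual implementation may differ.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    0 means perfectly balanced categories, 1 means a single category.
    This definition is an assumption, not taken from the report itself.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Counts for the categories 1, 4, 2, <NA>, 3 of 건물형태.
counts = [872, 118, 82, 13, 2]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under these assumptions the score rounds to the alerted value, which suggests the 13 <NA> entries were treated as a regular category.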

Reproduction

Analysis started: 2023-12-10 16:56:48.392013
Analysis finished: 2023-12-10 16:56:53.853332
Duration: 5.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
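The duration is simply the difference between the two timestamps. A minimal check with the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction block above.
start = datetime.fromisoformat("2023-12-10 16:56:48.392013")
finish = datetime.fromisoformat("2023-12-10 16:56:53.853332")

# Elapsed wall-clock time in seconds.
duration = (finish - start).total_seconds()
print(f"{duration:.2f} seconds")
```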

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 1087
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 544
Minimum: 1
Maximum: 1087
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 55.3
Q1: 272.5
Median: 544
Q3: 815.5
95-th percentile: 1032.7
Maximum: 1087
Range: 1086
Interquartile range (IQR): 543

Descriptive statistics

Standard deviation: 313.93418
Coefficient of variation (CV): 0.57708488
Kurtosis: -1.2
Mean: 544
Median Absolute Deviation (MAD): 272
Skewness: 0
Sum: 591328
Variance: 98554.667
Monotonicity: Strictly increasing
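Because 연번 has 1087 distinct values running from 1 to 1087 in strictly increasing order, it must be the sequence 1, 2, ..., n, and its statistics have closed forms. The sketch below compares those closed forms with a direct computation; it assumes the report uses the sample (n - 1) variance, which the reported figures are consistent with.

```python
import math
import statistics

n = 1087
values = list(range(1, n + 1))

# Closed forms for the integers 1..n.
total = n * (n + 1) // 2          # sum
mean = (n + 1) / 2                # mean
variance = n * (n + 1) / 12       # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                   # coefficient of variation

# Direct computation for comparison.
direct_variance = statistics.variance(values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
```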
Histogram with fixed size bins (bins=50)
Most frequent values:
Value                Count  Frequency (%)
1                    1      0.1%
724                  1      0.1%
730                  1      0.1%
729                  1      0.1%
728                  1      0.1%
727                  1      0.1%
726                  1      0.1%
725                  1      0.1%
723                  1      0.1%
2                    1      0.1%
Other values (1077)  1077   99.1%

Smallest values:
Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Largest values:
Value  Count  Frequency (%)
1087   1      0.1%
1086   1      0.1%
1085   1      0.1%
1084   1      0.1%
1083   1      0.1%
1082   1      0.1%
1081   1      0.1%
1080   1      0.1%
1079   1      0.1%
1078   1      0.1%

사업소코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 289.18767
Minimum: 244
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 244
5-th percentile: 244
Q1: 244
Median: 304
Q3: 307
95-th percentile: 311
Maximum: 312
Range: 68
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 27.270585
Coefficient of variation (CV): 0.094300649
Kurtosis: -0.88245399
Mean: 289.18767
Median Absolute Deviation (MAD): 3
Skewness: -1.0344637
Sum: 314347
Variance: 743.68482
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Most frequent values:
Value  Count  Frequency (%)
244    288    26.5%
306    156    14.4%
304    114    10.5%
301    85     7.8%
303    83     7.6%
302    82     7.5%
308    78     7.2%
309    68     6.3%
307    64     5.9%
312    42     3.9%

Smallest values:
Value  Count  Frequency (%)
244    288    26.5%
301    85     7.8%
302    82     7.5%
303    83     7.6%
304    114    10.5%
306    156    14.4%
307    64     5.9%
308    78     7.2%
309    68     6.3%
311    27     2.5%

Largest values:
Value  Count  Frequency (%)
312    42     3.9%
311    27     2.5%
309    68     6.3%
308    78     7.2%
307    64     5.9%
306    156    14.4%
304    114    10.5%
303    83     7.6%
302    82     7.5%
301    85     7.8%

사업소명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Value             Count
북부통합사업소    288
남부 사업소       156
부산진 사업소     114
중동부 사업소     85
영도 사업소       83
Other values (6)  361

Length

Max length: 9
Median length: 8
Mean length: 8.2152714
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남부 사업소
2nd row: 남부 사업소
3rd row: 남부 사업소
4th row: 남부 사업소
5th row: 남부 사업소

Common Values

Value           Count  Frequency (%)
북부통합사업소  288    26.5%
남부 사업소     156    14.4%
부산진 사업소   114    10.5%
중동부 사업소   85     7.8%
영도 사업소     83     7.6%
서부 사업소     82     7.5%
해운대 사업소   78     7.2%
사하 사업소     68     6.3%
북부 사업소     64     5.9%
기장 사업소     42     3.9%

Length

Histogram of lengths of the category

Words:
Value             Count  Frequency (%)
사업소            799    42.4%
북부통합사업소    288    15.3%
남부              156    8.3%
부산진            114    6.0%
중동부            85     4.5%
영도              83     4.4%
서부              82     4.3%
해운대            78     4.1%
사하              68     3.6%
북부              64     3.4%
Other values (2)  69     3.7%

신청부서
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 2.0%
Missing: 12
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2889942.6
Minimum: 2440020
Maximum: 3120010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 2440020
5-th percentile: 2440030
Q1: 2440030
Median: 3040020
Q3: 3070010
95-th percentile: 3110020
Maximum: 3120010
Range: 679990
Interquartile range (IQR): 629980

Descriptive statistics

Standard deviation: 273561.29
Coefficient of variation (CV): 0.094659768
Kurtosis: -0.91748038
Mean: 2889942.6
Median Absolute Deviation (MAD): 30000
Skewness: -1.0180205
Sum: 3.1066883 × 10^9
Variance: 7.4835781 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Most frequent values:
Value              Count  Frequency (%)
2440030            287    26.4%
3060030            89     8.2%
3010040            83     7.6%
3030020            82     7.5%
3020020            80     7.4%
3080030            75     6.9%
3090020            68     6.3%
3060020            64     5.9%
3040030            59     5.4%
3040020            52     4.8%
Other values (12)  136    12.5%

Smallest values:
Value    Count  Frequency (%)
2440020  1      0.1%
2440030  287    26.4%
3010040  83     7.6%
3010050  1      0.1%
3020010  1      0.1%
3020020  80     7.4%
3020030  1      0.1%
3030020  82     7.5%
3040020  52     4.8%
3040030  59     5.4%

Largest values:
Value    Count  Frequency (%)
3120010  40     3.7%
3110020  25     2.3%
3110010  1      0.1%
3090020  68     6.3%
3080030  75     6.9%
3070040  15     1.4%
3070030  20     1.8%
3070020  13     1.2%
3070010  16     1.5%
3060040  2      0.2%

신청부서명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Value             Count
급수운영팀        287
요금1             197
요금              187
요금2             168
공무1             100
Other values (6)  148

Length

Max length: 5
Median length: 4
Mean length: 3.2437902
Min length: 2

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: 요금2
2nd row: 요금2
3rd row: 요금2
4th row: 요금2
5th row: 요금2

Common Values

Value       Count  Frequency (%)
급수운영팀  287    26.4%
요금1       197    18.1%
요금        187    17.2%
요금2       168    15.5%
공무1       100    9.2%
공무        76     7.0%
업무        57     5.2%
<NA>        12     1.1%
요금팀      1      0.1%
공무2       1      0.1%

Length

Histogram of lengths of the category

Words:
Value       Count  Frequency (%)
급수운영팀  287    26.4%
요금1       197    18.1%
요금        187    17.2%
요금2       168    15.5%
공무1       100    9.2%
공무        76     7.0%
업무        57     5.2%
na          12     1.1%
요금팀      1      0.1%
공무2       1      0.1%

건물형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Value  Count
1      872
4      118
2      82
<NA>   13
3      2

Length

Max length: 4
Median length: 1
Mean length: 1.0358786
Min length: 1
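The length statistics indicate that the 13 <NA> entries of 건물형태 were profiled as the literal four-character string "<NA>" rather than as missing values, which would also explain why Missing is reported as 0. The sketch below recomputes the length figures from the category counts in this section under that assumption; the dictionary and variable names are illustrative.

```python
# Category counts for 건물형태, with "<NA>" kept as a literal string.
counts = {"1": 872, "4": 118, "2": 82, "<NA>": 13, "3": 2}

n = sum(counts.values())

# Count-weighted mean of the string lengths.
mean_length = sum(len(value) * count for value, count in counts.items()) / n

# Longest and shortest category labels.
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(n, max_length, min_length, round(mean_length, 7))
```

If the placeholder were converted to a real missing value before profiling, the maximum length would presumably drop to 1 and the 13 rows would be counted under Missing instead.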

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      872    80.2%
4      118    10.9%
2      82     7.5%
<NA>   13     1.2%
3      2      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words:
Value  Count  Frequency (%)
1      872    80.2%
4      118    10.9%
2      82     7.5%
na     13     1.2%
3      2      0.2%

누수탐지소요액
Real number (ℝ)

ZEROS 

Distinct: 45
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 251226.31
Minimum: 0
Maximum: 2000000
Zeros: 15
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 80000
Q1: 100000
Median: 200000
Q3: 300000
95-th percentile: 600000
Maximum: 2000000
Range: 2000000
Interquartile range (IQR): 200000

Descriptive statistics

Standard deviation: 197240.67
Coefficient of variation (CV): 0.7851115
Kurtosis: 17.549875
Mean: 251226.31
Median Absolute Deviation (MAD): 100000
Skewness: 3.1959382
Sum: 2.73083 × 10^8
Variance: 3.8903881 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
Most frequent values:
Value              Count  Frequency (%)
100000             204    18.8%
150000             173    15.9%
300000             159    14.6%
200000             147    13.5%
250000             83     7.6%
400000             61     5.6%
500000             45     4.1%
80000              43     4.0%
350000             41     3.8%
600000             23     2.1%
Other values (35)  108    9.9%

Smallest values:
Value   Count  Frequency (%)
0       15     1.4%
40000   1      0.1%
50000   3      0.3%
70000   1      0.1%
80000   43     4.0%
90000   12     1.1%
95000   1      0.1%
100000  204    18.8%
110000  1      0.1%
120000  3      0.3%

Largest values:
Value    Count  Frequency (%)
2000000  1      0.1%
1800000  1      0.1%
1700000  1      0.1%
1500000  3      0.3%
1300000  1      0.1%
1200000  2      0.2%
1000000  5      0.5%
900000   2      0.2%
800000   6      0.6%
750000   2      0.2%

누수탐지지급액
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 6
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39774.609
Minimum: 0
Maximum: 450000
Zeros: 15
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40000
Q1: 40000
Median: 40000
Q3: 40000
95-th percentile: 40000
Maximum: 450000
Range: 450000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 13328.26
Coefficient of variation (CV): 0.33509469
Kurtosis: 829.03327
Mean: 39774.609
Median Absolute Deviation (MAD): 0
Skewness: 26.524508
Sum: 43235000
Variance: 1.7764252 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values:
Value   Count  Frequency (%)
40000   1067   98.2%
0       15     1.4%
25000   2      0.2%
20000   1      0.1%
450000  1      0.1%
35000   1      0.1%

Smallest values:
Value   Count  Frequency (%)
0       15     1.4%
20000   1      0.1%
25000   2      0.2%
35000   1      0.1%
40000   1067   98.2%
450000  1      0.1%

Largest values:
Value   Count  Frequency (%)
450000  1      0.1%
40000   1067   98.2%
35000   1      0.1%
25000   2      0.2%
20000   1      0.1%
0       15     1.4%
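Since 누수탐지지급액 has only six distinct values, its mean and skewness can be recomputed from the value counts alone. The sketch below assumes the reported skewness is the adjusted Fisher-Pearson coefficient (the bias-corrected estimator that pandas uses by default); the variable names are illustrative. It also shows how a single 450000 payment among 1067 payments of 40000 drives the extreme skew.

```python
import math

# Value counts for 누수탐지지급액 copied from the tables above.
counts = {40000: 1067, 0: 15, 25000: 2, 20000: 1, 450000: 1, 35000: 1}

n = sum(counts.values())
mean = sum(value * count for value, count in counts.items()) / n

# Second and third central moments (population form, divided by n).
m2 = sum(count * (value - mean) ** 2 for value, count in counts.items()) / n
m3 = sum(count * (value - mean) ** 3 for value, count in counts.items()) / n

# Biased skewness g1, then the adjusted (bias-corrected) coefficient G1.
g1 = m3 / m2 ** 1.5
adjusted_skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)

print(round(mean, 3), round(adjusted_skewness, 4))
```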

Interactions

[Figures: 25 pairwise interaction plots]

Correlations

                연번   사업소코드  사업소명  신청부서  신청부서명  건물형태  누수탐지소요액  누수탐지지급액
연번            1.000  0.138       0.386     0.250     0.388       0.100     0.073           0.000
사업소코드      0.138  1.000       1.000     1.000     0.820       0.651     0.188           0.000
사업소명        0.386  1.000       1.000     1.000     0.922       0.834     0.282           0.078
신청부서        0.250  1.000       1.000     1.000     0.982       0.372     0.235           0.000
신청부서명      0.388  0.820       0.922     0.982     1.000       0.663     0.342           0.035
건물형태        0.100  0.651       0.834     0.372     0.663       1.000     0.040           0.000
누수탐지소요액  0.073  0.188       0.282     0.235     0.342       0.040     1.000           0.063
누수탐지지급액  0.000  0.000       0.078     0.000     0.035       0.000     0.063           1.000

            신청부서명  사업소명  건물형태
신청부서명  1.000       0.726     0.462
사업소명    0.726       1.000     0.681
건물형태    0.462       0.681     1.000

                연번    사업소코드  신청부서  누수탐지소요액  누수탐지지급액  사업소명  신청부서명  건물형태
연번            1.000   -0.030      -0.042    -0.069          -0.103          0.175     0.128       0.060
사업소코드      -0.030  1.000       0.998     0.013           -0.062          0.996     0.837       0.362
신청부서        -0.042  0.998       1.000     0.026           -0.007          0.996     0.837       0.362
누수탐지소요액  -0.069  0.013       0.026     1.000           0.232           0.124     0.111       0.023
누수탐지지급액  -0.103  -0.062      -0.007    0.232           1.000           0.074     0.026       0.000
사업소명        0.175   0.996       0.996     0.124           0.074           1.000     0.726       0.681
신청부서명      0.128   0.837       0.837     0.111           0.026           0.726     1.000       0.462
건물형태        0.060   0.362       0.362     0.023           0.000           0.681     0.462       1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows:
Index  연번  사업소코드  사업소명       신청부서  신청부서명  건물형태  누수탐지소요액  누수탐지지급액
0      1     306         남부 사업소    3060030   요금2       1         150000          40000
1      2     306         남부 사업소    3060030   요금2       1         100000          40000
2      3     306         남부 사업소    3060030   요금2       1         600000          40000
3      4     306         남부 사업소    3060030   요금2       1         500000          40000
4      5     306         남부 사업소    3060030   요금2       1         1200000         40000
5      6     306         남부 사업소    3060030   요금2       1         350000          40000
6      7     306         남부 사업소    3060030   요금2       1         300000          40000
7      8     308         해운대 사업소  3080030   공무        1         300000          40000
8      9     304         부산진 사업소  3040030   요금2       4         800000          40000
9      10    304         부산진 사업소  3040020   요금1       4         200000          40000

Last rows:
Index  연번  사업소코드  사업소명       신청부서  신청부서명  건물형태  누수탐지소요액  누수탐지지급액
1077   1078  312         기장 사업소    3120010   업무        2         100000          40000
1078   1079  312         기장 사업소    3120010   업무        1         1000000         40000
1079   1080  302         서부 사업소    3020020   요금        1         80000           40000
1080   1081  302         서부 사업소    3020020   요금        1         150000          40000
1081   1082  302         서부 사업소    3020020   요금        1         150000          40000
1082   1083  301         중동부 사업소  3010040   공무1       1         300000          40000
1083   1084  301         중동부 사업소  3010040   공무1       1         100000          40000
1084   1085  304         부산진 사업소  3040020   요금1       4         100000          40000
1085   1086  302         서부 사업소    3020020   요금        1         150000          40000
1086   1087  312         기장 사업소    <NA>      <NA>        <NA>      0               0
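In the sample rows, the first three digits of each 신청부서 code repeat the 사업소코드 of the same row, which would be consistent with the near-perfect correlation reported between the two columns. This is an observation from the sample only, not a documented coding scheme. The sketch below checks it on the distinct (사업소코드, 신청부서) pairs visible in the sample; the helper name office_prefix is hypothetical.

```python
# Distinct (사업소코드, 신청부서) pairs that appear in the sample rows.
sample_pairs = [
    (306, 3060030),
    (308, 3080030),
    (304, 3040030),
    (304, 3040020),
    (312, 3120010),
    (302, 3020020),
    (301, 3010040),
]


def office_prefix(department_code):
    """Drop the last four digits of a 7-digit department code."""
    return department_code // 10_000


# Pairs where the prefix does not match the office code.
mismatches = [
    (office, dept) for office, dept in sample_pairs
    if office_prefix(dept) != office
]
print(mismatches)
```

An empty list here only says the pattern holds for these rows; the full dataset would need the same check before relying on it.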