Overview

Dataset statistics

Number of variables: 8
Number of observations: 280
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.0 KiB
Average record size in memory: 69.5 B
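
The average record size follows from the total memory footprint divided by the number of observations. A minimal check, assuming that 1 KiB = 1024 B and that the reported 19.0 KiB is itself a rounded figure:

```python
# Reported dataset statistics (values copied from the overview).
total_size_kib = 19.0   # rounded to one decimal place in the report
n_observations = 280

# Convert KiB to bytes (assumed 1 KiB = 1024 B) and divide by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B per record")  # prints "69.5 B per record"
```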

Variable types

Numeric: 4
DateTime: 3
Categorical: 1

Dataset

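Description: Busan Metropolitan City Waterworks Headquarters, customer information system, billing calculation information, back-billing calculation history, 20210601 (부산광역시상수도사업본부_수용가정보시스템_요금계산관련정보_추징계산이력_20210601)
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083669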

Alerts

추징금액(상) is highly overall correlated with 추징금액(하) and 1 other field (High correlation)
추징금액(하) is highly overall correlated with 추징금액(상) and 1 other field (High correlation)
추징금액(물) is highly overall correlated with 추징금액(상) and 1 other field (High correlation)
사용순번 is highly imbalanced (93.9%) (Imbalance)
연번 has unique values (Unique)
추징금액(상) has 32 (11.4%) zeros (Zeros)
추징금액(하) has 81 (28.9%) zeros (Zeros)
추징금액(물) has 109 (38.9%) zeros (Zeros)
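
The 93.9% imbalance figure for 사용순번 can be reproduced from its two class counts (278 and 2). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum. The report does not state this definition, and `imbalance_score` is a hypothetical helper name, but the assumption does reproduce the reported value.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts.

    Assumption: this is the definition behind the report's imbalance alert.
    0 means perfectly balanced classes, 1 means a single dominant class.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# 사용순번: value 1 occurs 278 times, value 2 occurs 2 times.
score = imbalance_score([278, 2])
print(f"{score:.1%}")  # prints "93.9%"
```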

Reproduction

Analysis started: 2023-12-10 17:14:21.265820
Analysis finished: 2023-12-10 17:14:26.236815
Duration: 4.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 280
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.5
Minimum: 1
Maximum: 280
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 14.95
Q1: 70.75
Median: 140.5
Q3: 210.25
95-th percentile: 266.05
Maximum: 280
Range: 279
Interquartile range (IQR): 139.5

Descriptive statistics

Standard deviation: 80.973247
Coefficient of variation (CV): 0.57632204
Kurtosis: -1.2
Mean: 140.5
Median Absolute Deviation (MAD): 70
Skewness: 0
Sum: 39340
Variance: 6556.6667
Monotonicity: Strictly increasing
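
Because 연번 is unique, strictly increasing, and runs from 1 to 280, its statistics can be recomputed without the raw file. The sketch below assumes the column is exactly the integers 1..280, uses the sample variance (divisor n - 1), and uses linearly interpolated quantiles. These assumptions are not stated in the report, but they match the reported figures.

```python
import statistics

# Assumption: 연번 is simply the integers 1..280.
values = list(range(1, 281))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)
# "inclusive" interpolates linearly between the minimum and the maximum.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print("sum:", total, "mean:", mean)
print("variance:", round(variance, 4), "std:", round(std, 6))
print("CV:", round(std / mean, 8), "MAD:", mad)
print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1)
```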
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1                   1      0.4%
186                 1      0.4%
192                 1      0.4%
191                 1      0.4%
190                 1      0.4%
189                 1      0.4%
188                 1      0.4%
187                 1      0.4%
185                 1      0.4%
194                 1      0.4%
Other values (270)  270    96.4%

Minimum 10 values

Value               Count  Frequency (%)
1                   1      0.4%
2                   1      0.4%
3                   1      0.4%
4                   1      0.4%
5                   1      0.4%
6                   1      0.4%
7                   1      0.4%
8                   1      0.4%
9                   1      0.4%
10                  1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
280                 1      0.4%
279                 1      0.4%
278                 1      0.4%
277                 1      0.4%
276                 1      0.4%
275                 1      0.4%
274                 1      0.4%
273                 1      0.4%
272                 1      0.4%
271                 1      0.4%

추징발생년월 (back-billing occurrence year-month)
DateTime

Distinct: 5
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2021-05-01 00:00:00
Histogram with fixed size bins (bins=5)

고지년월 (billing notice year-month)
DateTime

Distinct: 6
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
Minimum: 2021-01-01 00:00:00
Maximum: 2021-06-01 00:00:00
Histogram with fixed size bins (bins=6)

계산년월 (calculation year-month)
DateTime

Distinct: 6
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
Minimum: 2020-12-01 00:00:00
Maximum: 2021-05-01 00:00:00
Histogram with fixed size bins (bins=6)

추징금액(상) (back-billed amount; 상 likely denotes the water supply charge)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 108
Distinct (%): 38.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73189
Minimum: -4859400
Maximum: 2020720
Zeros: 32
Zeros (%): 11.4%
Negative: 218
Negative (%): 77.9%
Memory size: 2.6 KiB

Quantile statistics

Minimum: -4859400
5-th percentile: -381525
Q1: -10087.5
Median: -3040
Q3: -1200
95-th percentile: 460
Maximum: 2020720
Range: 6880120
Interquartile range (IQR): 8887.5

Descriptive statistics

Standard deviation: 531894
Coefficient of variation (CV): -7.2674035
Kurtosis: 49.437821
Mean: -73189
Median Absolute Deviation (MAD): 3040
Skewness: -5.2946609
Sum: -20492920
Variance: 2.8291122 × 10^11
Monotonicity: Not monotonic
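
Several of the figures for 추징금액(상) are derived from others, so they can be cross-checked by simple arithmetic. This is a minimal sketch that uses only the rounded values printed in the report, so the coefficient of variation agrees with the reported one only to a few decimal places:

```python
# Figures reported for 추징금액(상).
n = 280
total = -20492920
minimum, maximum = -4859400, 2020720
q1, q3 = -10087.5, -1200
std = 531894  # as rounded in the report

mean = total / n                 # sum divided by the number of observations
value_range = maximum - minimum  # largest value minus smallest value
iqr = q3 - q1                    # spread of the middle half of the data
cv = std / mean                  # negative because the mean is negative

print(mean, value_range, iqr, round(cv, 4))
```

The negative coefficient of variation is not an error: it has the sign of the mean, and most values in this column are negative.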
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
-1200               48     17.1%
0                   32     11.4%
-3040               24     8.6%
460                 18     6.4%
-14400              6      2.1%
-7200               5      1.8%
-5160               4      1.4%
-1920               4      1.4%
-12690              3      1.1%
-480                3      1.1%
Other values (98)   133    47.5%

Minimum 10 values

Value               Count  Frequency (%)
-4859400            1      0.4%
-4787640            1      0.4%
-2726880            1      0.4%
-1812720            1      0.4%
-1544400            1      0.4%
-1253000            1      0.4%
-1201200            1      0.4%
-1115400            1      0.4%
-1015560            1      0.4%
-940680             1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
2020720             1      0.4%
2020710             2      0.7%
194090              1      0.4%
155990              1      0.4%
82170               2      0.7%
36690               1      0.4%
22250               2      0.7%
10800               1      0.4%
7200                1      0.4%
460                 18     6.4%

추징금액(하) (back-billed amount; 하 likely denotes the sewerage charge)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 112
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -47971.071
Minimum: -3613400
Maximum: 2962700
Zeros: 81
Zeros (%): 28.9%
Negative: 184
Negative (%): 65.7%
Memory size: 2.6 KiB

Quantile statistics

Minimum: -3613400
5-th percentile: -438956.5
Q1: -7690
Median: -2400
Q3: 0
95-th percentile: 107
Maximum: 2962700
Range: 6576100
Interquartile range (IQR): 7690

Descriptive statistics

Standard deviation: 486749.64
Coefficient of variation (CV): -10.146733
Kurtosis: 35.089419
Mean: -47971.071
Median Absolute Deviation (MAD): 2400
Skewness: -0.71930299
Sum: -13431900
Variance: 2.3692521 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
0                   81     28.9%
-5340               13     4.6%
-2470               6      2.1%
-4500               5      1.8%
-450                5      1.8%
-900                5      1.8%
-7690               4      1.4%
-8270               4      1.4%
-5080               4      1.4%
-2700               4      1.4%
Other values (102)  149    53.2%

Minimum 10 values

Value               Count  Frequency (%)
-3613400            1      0.4%
-3560040            1      0.4%
-2027680            1      0.4%
-1347920            1      0.4%
-1148400            1      0.4%
-1120620            1      0.4%
-1100220            1      0.4%
-893200             1      0.4%
-829400             1      0.4%
-776860             1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
2962700             3      1.1%
212900              1      0.4%
209250              1      0.4%
208500              3      1.1%
19260               1      0.4%
17060               2      0.7%
4500                1      0.4%
240                 2      0.7%
100                 1      0.4%
0                   81     28.9%
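
The zero and negative counts for 추징금액(하) imply how many positive values the column holds, and the list of largest values happens to cover all of them. A small consistency check, using only counts printed in the report:

```python
# Counts reported for 추징금액(하).
n_observations = 280
n_zeros = 81
n_negative = 184

# Whatever is neither zero nor negative must be positive.
n_positive = n_observations - n_zeros - n_negative

# Positive entries among the ten largest values: value -> count.
largest_positive = {
    2962700: 3, 212900: 1, 209250: 1, 208500: 3, 19260: 1,
    17060: 2, 4500: 1, 240: 2, 100: 1,
}
listed_positive = sum(largest_positive.values())

print(n_positive, listed_positive)  # prints "15 15"
```

Since only 15 of 280 rows (about 5.4%) are positive, the 95-th percentile of 107 falls between the two smallest positive values, 100 and 240.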

추징금액(물) (back-billed amount; 물 likely denotes the water use charge)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 88
Distinct (%): 31.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -15248.714
Minimum: -928900
Maximum: 227750
Zeros: 109
Zeros (%): 38.9%
Negative: 160
Negative (%): 57.1%
Memory size: 2.6 KiB

Quantile statistics

Minimum: -928900
5-th percentile: -73279
Q1: -1722.5
Median: -440
Q3: 0
95-th percentile: 0
Maximum: 227750
Range: 1156650
Interquartile range (IQR): 1722.5

Descriptive statistics

Standard deviation: 95728.871
Coefficient of variation (CV): -6.2778323
Kurtosis: 60.388657
Mean: -15248.714
Median Absolute Deviation (MAD): 440
Skewness: -6.8988363
Sum: -4269640
Variance: 9.1640167 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
0                   109    38.9%
-800                23     8.2%
-440                7      2.5%
-820                6      2.1%
-140                5      1.8%
-1490               5      1.8%
-890                5      1.8%
-2310               4      1.4%
-1640               4      1.4%
-2230               4      1.4%
Other values (78)   108    38.6%

Minimum 10 values

Value               Count  Frequency (%)
-928900             1      0.4%
-915180             1      0.4%
-521250             1      0.4%
-346510             1      0.4%
-295220             1      0.4%
-229620             1      0.4%
-213210             1      0.4%
-194120             1      0.4%
-186820             1      0.4%
-179820             1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
227750              2      0.7%
227740              1      0.4%
18420               1      0.4%
12450               2      0.7%
5590                1      0.4%
3550                2      0.7%
1490                1      0.4%
1360                1      0.4%
0                   109    38.9%
-70                 2      0.7%

사용순번 (usage sequence number)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Value  Count
1      278
2      2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      278    99.3%
2      2      0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      278    99.3%
2      2      0.7%

Interactions


Correlations

              연번          추징발생년월  고지년월      계산년월      추징금액(상)  추징금액(하)  추징금액(물)  사용순번
연번          1.000         0.993         0.759         0.650         0.496         0.323         0.497         0.239
추징발생년월  0.993         1.000         0.777         0.600         0.498         0.288         0.501         0.064
고지년월      0.759         0.777         1.000         0.920         0.125         0.000         0.126         0.164
계산년월      0.650         0.600         0.920         1.000         0.261         0.095         0.241         0.134
추징금액(상)  0.496         0.498         0.125         0.261         1.000         0.948         1.000         0.000
추징금액(하)  0.323         0.288         0.000         0.095         0.948         1.000         0.952         0.554
추징금액(물)  0.497         0.501         0.126         0.241         1.000         0.952         1.000         0.000
사용순번      0.239         0.064         0.164         0.134         0.000         0.554         0.000         1.000

              연번          추징금액(상)  추징금액(하)  추징금액(물)  사용순번
연번          1.000         -0.446        -0.255        -0.394        0.181
추징금액(상)  -0.446        1.000         0.772         0.927         0.000
추징금액(하)  -0.255        0.772         1.000         0.821         0.398
추징금액(물)  -0.394        0.927         0.821         1.000         0.000
사용순번      0.181         0.000         0.398         0.000         1.000
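
One way to read the first (eight-variable) matrix is to confirm that it is symmetric and then list the strongest pairs. This is a sketch only: the 0.9 cut-off is a hypothetical value chosen for illustration, not the threshold the report uses for its high-correlation alerts.

```python
names = ["연번", "추징발생년월", "고지년월", "계산년월",
         "추징금액(상)", "추징금액(하)", "추징금액(물)", "사용순번"]

# Values transcribed from the eight-variable correlation matrix.
matrix = [
    [1.000, 0.993, 0.759, 0.650, 0.496, 0.323, 0.497, 0.239],
    [0.993, 1.000, 0.777, 0.600, 0.498, 0.288, 0.501, 0.064],
    [0.759, 0.777, 1.000, 0.920, 0.125, 0.000, 0.126, 0.164],
    [0.650, 0.600, 0.920, 1.000, 0.261, 0.095, 0.241, 0.134],
    [0.496, 0.498, 0.125, 0.261, 1.000, 0.948, 1.000, 0.000],
    [0.323, 0.288, 0.000, 0.095, 0.948, 1.000, 0.952, 0.554],
    [0.497, 0.501, 0.126, 0.241, 1.000, 0.952, 1.000, 0.000],
    [0.239, 0.064, 0.164, 0.134, 0.000, 0.554, 0.000, 1.000],
]
size = len(names)

# A correlation matrix is symmetric with ones on the diagonal.
is_symmetric = all(matrix[i][j] == matrix[j][i]
                   for i in range(size) for j in range(size))
has_unit_diagonal = all(matrix[i][i] == 1.0 for i in range(size))

THRESHOLD = 0.9  # hypothetical cut-off, for illustration only

# Visit each unordered pair once by scanning the upper triangle.
strong_pairs = [(names[i], names[j], matrix[i][j])
                for i in range(size) for j in range(i + 1, size)
                if matrix[i][j] >= THRESHOLD]

print(is_symmetric, has_unit_diagonal)
for first, second, value in strong_pairs:
    print(f"{first} / {second}: {value:.3f}")
```

With this cut-off the three 추징금액 columns all pair with one another, which is consistent with the high-correlation alerts raised for them.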

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번  추징발생년월  고지년월  계산년월  추징금액(상)  추징금액(하)  추징금액(물)  사용순번
0    1     2021-01       2021-01   2021-01   82170         240           12450         1
1    2     2021-01       2021-01   2020-12   -4320         -2320         -600          1
2    3     2021-01       2021-01   2021-01   -4320         -2320         -600          1
3    4     2021-01       2021-03   2021-02   -1200         0             0             1
4    5     2021-01       2021-03   2021-03   -1200         0             0             1
5    6     2021-01       2021-05   2021-04   -1200         0             0             1
6    7     2021-01       2021-05   2021-05   -2160         -860          0             1
7    8     2021-01       2021-02   2021-01   -12000        -10700        -1500         1
8    9     2021-01       2021-01   2021-01   -10000        0             -760          1
9    10    2021-01       2021-01   2021-01   -9540         -6240         -1050         1

Last rows

     연번  추징발생년월  고지년월  계산년월  추징금액(상)  추징금액(하)  추징금액(물)  사용순번
270  271   2021-05       2021-05   2021-05   -4787640      -3560040      -915180       1
271  272   2021-05       2021-05   2021-05   -79800        -48000        -8950         1
272  273   2021-05       2021-05   2021-05   -90440        -54400        -10140        1
273  274   2021-05       2021-05   2021-04   -1200         0             0             1
274  275   2021-05       2021-05   2021-05   -1200         0             0             1
275  276   2021-05       2021-05   2021-05   -4859400      -3613400      -928900       1
276  277   2021-05       2021-05   2021-05   -99750        -60000        -11190        1
277  278   2021-05       2021-05   2021-05   -1015560      -755160       -194120       1
278  279   2021-05       2021-05   2021-05   -717120       -448200       -148510       1
279  280   2021-05       2021-05   2021-05   -442800       -276750       -91690        1