Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 742.2 KiB |
Average record size in memory | 76.0 B |
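The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal check of that arithmetic, assuming KiB means 1024 bytes (the helper name is illustrative and not part of the report):

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row for a table of n_rows rows."""
    return total_kib * 1024 / n_rows

# Figures copied from the Dataset statistics table.
avg_bytes = average_record_size(742.2, 10_000)
print(f"{avg_bytes:.1f} B")
```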
Variable types
Numeric | 4 |
---|---|
Text | 1 |
Categorical | 3 |
Dataset
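Description | Busan Metropolitan City Waterworks Headquarters, customer information system, billing calculation information, back-charge calculation history, 20220609 (부산광역시상수도사업본부_수용가정보시스템_요금계산관련정보_추징계산이력_20220609) |
---|---|
Author | Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부) |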
URL | http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083669 |
추징금액(하) is highly overall correlated with 추징금액(물) | High correlation |
추징금액(물) is highly overall correlated with 추징금액(하) | High correlation |
추징발생년월 is highly overall correlated with 고지년월 and 1 other fields | High correlation |
고지년월 is highly overall correlated with 추징발생년월 and 1 other fields | High correlation |
계산년월 is highly overall correlated with 추징발생년월 and 1 other fields | High correlation |
추징발생년월 is highly imbalanced (90.9%) | Imbalance |
고지년월 is highly imbalanced (71.3%) | Imbalance |
계산년월 is highly imbalanced (93.3%) | Imbalance |
추징금액(상) is highly skewed (γ1 = -79.55228714) | Skewed |
추징금액(하) is highly skewed (γ1 = -22.92633547) | Skewed |
연번 has unique values | Unique |
추징금액(하) has 9963 (99.6%) zeros | Zeros |
추징금액(물) has 9970 (99.7%) zeros | Zeros |
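The Skewed alerts quote the sample skewness γ1. The sketch below computes the adjusted Fisher-Pearson coefficient with the standard library on made-up amounts. Whether the report uses exactly this estimator is an assumption, and the alert threshold is not stated in the report.

```python
import math

def sample_skewness(values: list[float]) -> float:
    """Adjusted Fisher-Pearson skewness G1 (needs at least 3 values)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # biased skewness
    return math.sqrt(n * (n - 1)) / (n - 2) * g1   # small-sample correction

# Hypothetical data: many small negative amounts and one very large one.
toy = [-10.0] * 50 + [-20.0] * 40 + [-5000.0]
print(sample_skewness(toy))
```

A single extreme negative value is enough to pull the coefficient far below zero, which is the pattern the two skewed amount columns show.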
Reproduction
Analysis started | 2023-12-10 17:13:49.178704 |
---|---|
Analysis finished | 2023-12-10 17:13:55.298172 |
Duration | 6.12 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
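The reported duration is the difference between the two timestamps. A small check using only the values in the table above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-12-10 17:13:49.178704")
finished = datetime.fromisoformat("2023-12-10 17:13:55.298172")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```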
연번 (serial number)
Real number (ℝ)
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 47590.43 |
Minimum | 5 |
---|---|
Maximum | 95392 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 5 |
---|---|
5-th percentile | 4804.95 |
Q1 | 23302.5 |
median | 47946 |
Q3 | 71434.75 |
95-th percentile | 90594.5 |
Maximum | 95392 |
Range | 95387 |
Interquartile range (IQR) | 48132.25 |
Descriptive statistics
Standard deviation | 27544.536 |
---|---|
Coefficient of variation (CV) | 0.57878308 |
Kurtosis | -1.2026119 |
Mean | 47590.43 |
Median Absolute Deviation (MAD) | 23977 |
Skewness | -0.0031175675 |
Sum | 4.759043 × 10^8 |
Variance | 7.5870145 × 10^8 |
Monotonicity | Not monotonic |
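Several of the figures above are derived from others in the same tables, so they can be cross-checked. A short sketch of that arithmetic for 연번, using the reported values as inputs (variable names are illustrative):

```python
# Inputs copied from the quantile and descriptive statistics tables.
n = 10_000
minimum, maximum = 5, 95_392
q1, q3 = 23_302.5, 71_434.75
mean, std = 47_590.43, 27_544.536

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
total = mean * n                 # Sum
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 4), f"{total:.6e}", f"{variance:.4e}")
```

Small differences in the last digits are expected because the report rounds the standard deviation before display.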
Common Values
Value | Count | Frequency (%) |
33739 | 1 | < 0.1% |
2204 | 1 | < 0.1% |
8859 | 1 | < 0.1% |
26291 | 1 | < 0.1% |
15637 | 1 | < 0.1% |
11995 | 1 | < 0.1% |
86813 | 1 | < 0.1% |
28844 | 1 | < 0.1% |
40752 | 1 | < 0.1% |
83714 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Minimum 10 values
Value | Count | Frequency (%) |
5 | 1 | < 0.1% |
7 | 1 | < 0.1% |
19 | 1 | < 0.1% |
28 | 1 | < 0.1% |
44 | 1 | < 0.1% |
50 | 1 | < 0.1% |
83 | 1 | < 0.1% |
85 | 1 | < 0.1% |
88 | 1 | < 0.1% |
97 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
95392 | 1 | < 0.1% |
95390 | 1 | < 0.1% |
95389 | 1 | < 0.1% |
95359 | 1 | < 0.1% |
95348 | 1 | < 0.1% |
95325 | 1 | < 0.1% |
95308 | 1 | < 0.1% |
95303 | 1 | < 0.1% |
95290 | 1 | < 0.1% |
95284 | 1 | < 0.1% |
고객번호 (customer number)
Text
Distinct | 5792 |
---|---|
Distinct (%) | 57.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
93*59 | 21 | 0.2% |
07*98 | 8 | 0.1% |
07*88 | 8 | 0.1% |
07*13 | 8 | 0.1% |
07*89 | 8 | 0.1% |
07*16 | 8 | 0.1% |
89*98 | 7 | 0.1% |
00*16 | 7 | 0.1% |
07*75 | 7 | 0.1% |
21*93 | 7 | 0.1% |
Other values (5782) | 9911 | 99.1% |
Most occurring characters
Value | Count | Frequency (%) |
* | 20000 | 33.3% |
0 | 5193 | 8.7% |
9 | 4402 | 7.3% |
1 | 4274 | 7.1% |
5 | 4048 | 6.7% |
3 | 3898 | 6.5% |
7 | 3856 | 6.4% |
2 | 3810 | 6.3% |
8 | 3715 | 6.2% |
4 | 3535 | 5.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 40000 | 66.7% |
Other Punctuation | 20000 | 33.3% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 5193 | |
9 | 4402 | |
1 | 4274 | |
5 | 4048 | |
3 | 3898 | |
7 | 3856 | |
2 | 3810 | |
8 | 3715 | |
4 | 3535 | |
6 | 3269 |
Other Punctuation
Value | Count | Frequency (%) |
* | 20000 |
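The character and category counts above can be reproduced in outline with the standard library. The sketch below uses five masked customer numbers copied from the sample rows at the end of this report. `Nd` and `Po` are the Unicode general-category codes that correspond to the labels Decimal Number and Other Punctuation. How the report computes these tables internally is not stated, so this is only an illustration.

```python
from collections import Counter
import unicodedata

# Masked customer numbers copied from the sample rows of this report.
values = ["*91*30", "*19*90", "*97*05", "*07*83", "*92*27"]

# Count individual characters across all values.
chars = Counter("".join(values))

# Count Unicode general categories of those characters.
categories = Counter(unicodedata.category(c) for v in values for c in v)

print(chars["*"], sum(chars.values()))
print(dict(categories))
```

Each value has six characters, two of them asterisks, which matches the report's totals of 60000 characters and 20000 asterisks for 10000 rows.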
Most occurring scripts
Value | Count | Frequency (%) |
Common | 60000 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 60000 | 100.0% |
추징발생년월 (back-charge occurrence year-month)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 8 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2021-07-01 | 9598 |
---|---|
2021-08-01 | 357 |
2021-03-01 | 23 |
2021-02-01 | 7 |
2021-06-01 | 4 |
Other values (3) | 11 |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2021-07-01 |
---|---|
2nd row | 2021-07-01 |
3rd row | 2021-07-01 |
4th row | 2021-07-01 |
5th row | 2021-07-01 |
Common Values
Value | Count | Frequency (%) |
2021-07-01 | 9598 | 96.0% |
2021-08-01 | 357 | 3.6% |
2021-03-01 | 23 | 0.2% |
2021-02-01 | 7 | 0.1% |
2021-06-01 | 4 | < 0.1% |
2021-05-01 | 4 | < 0.1% |
2021-01-01 | 4 | < 0.1% |
2021-04-01 | 3 | < 0.1% |
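The 90.9% imbalance reported for 추징발생년월 in the alerts is consistent with one minus the Shannon entropy of these counts, normalised by the logarithm of the number of classes. That definition is an assumption, since the report does not state its formula, but it reproduces the reported figure:

```python
import math

def imbalance(counts: list[int]) -> float:
    """1 minus Shannon entropy normalised by log(number of classes)."""
    total = sum(counts)
    entropy = -sum(c / total * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(len(counts))

# Counts copied from the Common Values table above.
counts = [9598, 357, 23, 7, 4, 4, 4, 3]
score = imbalance(counts)
print(f"{score:.1%}")
```

A score of 0 would mean all eight months are equally frequent. A score near 1 means almost every row falls in one month.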
고지년월 (billing notice year-month)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 15 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2021-07-01 | 5387 |
---|---|
2021-08-01 | 4457 |
2021-09-01 | 117 |
2021-03-01 | 8 |
2021-05-01 | 6 |
Other values (10) | 25 |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 4 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 2021-07-01 |
---|---|
2nd row | 2021-07-01 |
3rd row | 2021-08-01 |
4th row | 2021-07-01 |
5th row | 2021-07-01 |
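The year-month fields are stored as ten-character ISO date strings with the day fixed at 01, which is why every length statistic above is 10. A minimal sketch, not taken from the report, of how such strings could be compared across columns, for example the gap between 추징발생년월 and 고지년월 (the helper name is illustrative):

```python
from datetime import date

def months_between(start: str, end: str) -> int:
    """Whole calendar months from start to end, both 'YYYY-MM-DD' strings."""
    a, b = date.fromisoformat(start), date.fromisoformat(end)
    return (b.year - a.year) * 12 + (b.month - a.month)

# 추징발생년월 and 고지년월 as they appear together in some sample rows.
lag = months_between("2021-07-01", "2021-08-01")
print(lag)
```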
Common Values
Value | Count | Frequency (%) |
2021-07-01 | 5387 | 53.9% |
2021-08-01 | 4457 | 44.6% |
2021-09-01 | 117 | 1.2% |
2021-03-01 | 8 | 0.1% |
2021-05-01 | 6 | 0.1% |
2021-04-01 | 5 | 0.1% |
2022-05-01 | 4 | < 0.1% |
2021-02-01 | 4 | < 0.1% |
2021-11-01 | 3 | < 0.1% |
2022-03-01 | 3 | < 0.1% |
Other values (5) | 6 | 0.1% |
계산년월 (calculation year-month)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 18 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2021-07-01 | 9597 |
---|---|
2021-08-01 | 360 |
2021-03-01 | 10 |
2021-02-01 | 6 |
2021-05-01 | 5 |
Other values (13) | 22 |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 6 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 2021-07-01 |
---|---|
2nd row | 2021-07-01 |
3rd row | 2021-07-01 |
4th row | 2021-07-01 |
5th row | 2021-07-01 |
Common Values
Value | Count | Frequency (%) |
2021-07-01 | 9597 | 96.0% |
2021-08-01 | 360 | 3.6% |
2021-03-01 | 10 | 0.1% |
2021-02-01 | 6 | 0.1% |
2021-05-01 | 5 | 0.1% |
2021-06-01 | 4 | < 0.1% |
2022-02-01 | 2 | < 0.1% |
2022-05-01 | 2 | < 0.1% |
2021-10-01 | 2 | < 0.1% |
2021-11-01 | 2 | < 0.1% |
Other values (8) | 10 | 0.1% |
Common Values (Plot)
Value | Count | Frequency (%) |
2021-07-01 | 9597 | 96.0% |
2021-08-01 | 360 | 3.6% |
2021-03-01 | 10 | 0.1% |
2021-02-01 | 6 | 0.1% |
2021-05-01 | 5 | < 0.1% |
2021-06-01 | 4 | < 0.1% |
2022-03-01 | 2 | < 0.1% |
2022-04-01 | 2 | < 0.1% |
2021-11-01 | 2 | < 0.1% |
2021-10-01 | 2 | < 0.1% |
Other values (8) | 10 | 0.1% |
추징금액(상) (back-charge amount: water supply)
Real number (ℝ)
SKEWED
Distinct | 911 |
---|---|
Distinct (%) | 9.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -3093.195 |
Minimum | -5792950 |
---|---|
Maximum | 155990 |
Zeros | 15 |
Zeros (%) | 0.1% |
Negative | 9981 |
Negative (%) | 99.8% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | -5792950 |
---|---|
5-th percentile | -7211.5 |
Q1 | -1200 |
median | -430 |
Q3 | -140 |
95-th percentile | -20 |
Maximum | 155990 |
Range | 5948940 |
Interquartile range (IQR) | 1060 |
Descriptive statistics
Standard deviation | 63526.446 |
---|---|
Coefficient of variation (CV) | -20.537485 |
Kurtosis | 7013.8682 |
Mean | -3093.195 |
Median Absolute Deviation (MAD) | 350 |
Skewness | -79.552287 |
Sum | -30931950 |
Variance | 4.0356094 × 10^9 |
Monotonicity | Not monotonic |
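The mean of 추징금액(상) (about -3093) lies far from its median (-430) because a few very large negative amounts dominate the average, while the median and the median absolute deviation (MAD) are barely affected by them. A small sketch on made-up numbers shows the effect (the data and helper are illustrative only):

```python
from statistics import mean, median

def mad(values: list[float]) -> float:
    """Median absolute deviation around the median."""
    m = median(values)
    return median(abs(v - m) for v in values)

# Hypothetical amounts: five small values and one extreme outlier.
toy = [-10, -20, -30, -40, -50, -1_000_000]

print(mean(toy), median(toy), mad(toy))
```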
Common Values
Value | Count | Frequency (%) |
-10 | 286 | 2.9% |
-30 | 247 | 2.5% |
-40 | 245 | 2.5% |
-20 | 243 | 2.4% |
-90 | 231 | 2.3% |
-60 | 228 | 2.3% |
-80 | 219 | 2.2% |
-70 | 211 | 2.1% |
-120 | 190 | 1.9% |
-110 | 182 | 1.8% |
Other values (901) | 7718 | 77.2% |
Minimum 10 values
Value | Count | Frequency (%) |
-5792950 | 1 | < 0.1% |
-2040550 | 1 | < 0.1% |
-955010 | 1 | < 0.1% |
-463250 | 1 | < 0.1% |
-449300 | 1 | < 0.1% |
-367610 | 1 | < 0.1% |
-353690 | 1 | < 0.1% |
-347800 | 1 | < 0.1% |
-302290 | 1 | < 0.1% |
-299040 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
155990 | 1 | < 0.1% |
94630 | 1 | < 0.1% |
460 | 2 | < 0.1% |
0 | 15 | 0.1% |
-10 | 286 | 2.9% |
-20 | 243 | 2.4% |
-30 | 247 | 2.5% |
-40 | 245 | 2.5% |
-60 | 228 | 2.3% |
-70 | 211 | 2.1% |
추징금액(하) (back-charge amount: sewerage)
Real number (ℝ)
HIGH CORRELATION, SKEWED, ZEROS
Distinct | 32 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -19.946 |
Minimum | -356310 |
---|---|
Maximum | 212900 |
Zeros | 9963 |
Zeros (%) | 99.6% |
Negative | 34 |
Negative (%) | 0.3% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | -356310 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 0 |
Maximum | 212900 |
Range | 569210 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 5010.3727 |
---|---|
Coefficient of variation (CV) | -251.19687 |
Kurtosis | 3246.3427 |
Mean | -19.946 |
Median Absolute Deviation (MAD) | 0 |
Skewness | -22.926335 |
Sum | -199460 |
Variance | 25103835 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
0 | 9963 | 99.6% |
-1800 | 3 | < 0.1% |
-900 | 3 | < 0.1% |
-4500 | 2 | < 0.1% |
-220 | 2 | < 0.1% |
-190 | 1 | < 0.1% |
-670 | 1 | < 0.1% |
-25050 | 1 | < 0.1% |
-1150 | 1 | < 0.1% |
-8020 | 1 | < 0.1% |
Other values (22) | 22 | 0.2% |
Minimum 10 values
Value | Count | Frequency (%) |
-356310 | 1 | < 0.1% |
-116800 | 1 | < 0.1% |
-111690 | 1 | < 0.1% |
-25050 | 1 | < 0.1% |
-15820 | 1 | < 0.1% |
-11770 | 1 | < 0.1% |
-8020 | 1 | < 0.1% |
-7690 | 1 | < 0.1% |
-6200 | 1 | < 0.1% |
-5660 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
212900 | 1 | < 0.1% |
208500 | 1 | < 0.1% |
88320 | 1 | < 0.1% |
0 | 9963 | 99.6% |
-190 | 1 | < 0.1% |
-220 | 2 | < 0.1% |
-450 | 1 | < 0.1% |
-530 | 1 | < 0.1% |
-670 | 1 | < 0.1% |
-860 | 1 | < 0.1% |
추징금액(물) (back-charge amount: water usage charge)
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 26 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -3.425 |
Minimum | -21770 |
---|---|
Maximum | 18420 |
Zeros | 9970 |
Zeros (%) | 99.7% |
Negative | 28 |
Negative (%) | 0.3% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | -21770 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 0 |
Maximum | 18420 |
Range | 40190 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 362.61099 |
---|---|
Coefficient of variation (CV) | -105.87182 |
Kurtosis | 2388.6316 |
Mean | -3.425 |
Median Absolute Deviation (MAD) | 0 |
Skewness | -9.1915201 |
Sum | -34250 |
Variance | 131486.73 |
Monotonicity | Not monotonic |
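The zero and negative counts reported for 추징금액(하) and 추징금액(물) leave only a handful of rows with positive amounts. A quick consistency check on the reported counts (the helper and variable names are illustrative):

```python
def positive_count(n: int, zeros: int, negative: int) -> int:
    """Rows left over once zero and negative values are accounted for."""
    return n - zeros - negative

# Counts copied from the overview tables of the two variables.
pos_ha = positive_count(10_000, 9_963, 34)   # 추징금액(하)
pos_mul = positive_count(10_000, 9_970, 28)  # 추징금액(물)

print(pos_ha, pos_mul)
```

These remainders agree with the value tables of each variable, which list three positive amounts for 추징금액(하) and two for 추징금액(물).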
Common Values
Value | Count | Frequency (%) |
0 | 9970 | 99.7% |
-70 | 4 | < 0.1% |
-820 | 2 | < 0.1% |
-1500 | 2 | < 0.1% |
-750 | 1 | < 0.1% |
-12300 | 1 | < 0.1% |
-590 | 1 | < 0.1% |
-5210 | 1 | < 0.1% |
-290 | 1 | < 0.1% |
-1110 | 1 | < 0.1% |
Other values (16) | 16 | 0.2% |
Minimum 10 values
Value | Count | Frequency (%) |
-21770 | 1 | < 0.1% |
-12300 | 1 | < 0.1% |
-10210 | 1 | < 0.1% |
-5210 | 1 | < 0.1% |
-2310 | 1 | < 0.1% |
-1780 | 1 | < 0.1% |
-1640 | 1 | < 0.1% |
-1500 | 2 | < 0.1% |
-1110 | 1 | < 0.1% |
-830 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
18420 | 1 | < 0.1% |
14050 | 1 | < 0.1% |
0 | 9970 | 99.7% |
-60 | 1 | < 0.1% |
-70 | 4 | < 0.1% |
-140 | 1 | < 0.1% |
-220 | 1 | < 0.1% |
-290 | 1 | < 0.1% |
-300 | 1 | < 0.1% |
-370 | 1 | < 0.1% |
Variable | 연번 | 추징발생년월 | 고지년월 | 계산년월 | 추징금액(상) | 추징금액(하) | 추징금액(물) |
---|---|---|---|---|---|---|---|
연번 | 1.000 | 0.447 | 0.341 | 0.470 | 0.025 | 0.061 | 0.061 |
추징발생년월 | 0.447 | 1.000 | 0.904 | 0.917 | 0.000 | 0.511 | 0.478 |
고지년월 | 0.341 | 0.904 | 1.000 | 0.969 | 0.000 | 0.703 | 0.557 |
계산년월 | 0.470 | 0.917 | 0.969 | 1.000 | 0.000 | 0.649 | 0.613 |
추징금액(상) | 0.025 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
추징금액(하) | 0.061 | 0.511 | 0.703 | 0.649 | 0.000 | 1.000 | 0.966 |
추징금액(물) | 0.061 | 0.478 | 0.557 | 0.613 | 0.000 | 0.966 | 1.000 |
Variable | 고지년월 | 계산년월 | 추징발생년월 |
---|---|---|---|
고지년월 | 1.000 | 0.796 | 0.685 |
계산년월 | 0.796 | 1.000 | 0.711 |
추징발생년월 | 0.685 | 0.711 | 1.000 |
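The report does not say which coefficient this categorical-only matrix uses. Cramér's V is a common choice for pairs of categorical variables, and it is assumed here purely for illustration. The sketch implements the plain, uncorrected form on made-up data, so it should not be expected to reproduce the figures above.

```python
import math
from collections import Counter

def cramers_v(xs: list[str], ys: list[str]) -> float:
    """Plain (uncorrected) Cramér's V for two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row = Counter(xs)             # marginal counts of xs
    col = Counter(ys)             # marginal counts of ys
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1  # needs at least 2 levels per variable
    return math.sqrt(chi2 / (n * k))

# Hypothetical columns: identical months give perfect association.
months = ["2021-07-01"] * 4 + ["2021-08-01"] * 4
print(cramers_v(months, months))
```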
Variable | 연번 | 추징금액(상) | 추징금액(하) | 추징금액(물) | 추징발생년월 | 고지년월 | 계산년월 |
---|---|---|---|---|---|---|---|
연번 | 1.000 | 0.037 | 0.078 | 0.077 | 0.232 | 0.134 | 0.200 |
추징금액(상) | 0.037 | 1.000 | 0.011 | 0.023 | 0.000 | 0.000 | 0.000 |
추징금액(하) | 0.078 | 0.011 | 1.000 | 0.840 | 0.345 | 0.384 | 0.394 |
추징금액(물) | 0.077 | 0.023 | 0.840 | 1.000 | 0.362 | 0.300 | 0.381 |
추징발생년월 | 0.232 | 0.000 | 0.345 | 0.362 | 1.000 | 0.685 | 0.711 |
고지년월 | 0.134 | 0.000 | 0.384 | 0.300 | 0.685 | 1.000 | 0.796 |
계산년월 | 0.200 | 0.000 | 0.394 | 0.381 | 0.711 | 0.796 | 1.000 |
Row index | 연번 | 고객번호 | 추징발생년월 | 고지년월 | 계산년월 | 추징금액(상) | 추징금액(하) | 추징금액(물) |
---|---|---|---|---|---|---|---|---|
33738 | 33739 | *91*30 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -7290 | 0 | 0 |
69546 | 69547 | *19*90 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -80 | 0 | 0 |
81813 | 81814 | *97*05 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -90 | 0 | 0 |
87088 | 87089 | *07*83 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -2450 | 0 | 0 |
34038 | 34039 | *92*27 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -220 | 0 | 0 |
75260 | 75261 | *51*62 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -2750 | 0 | 0 |
72570 | 72571 | *35*54 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -2000 | 0 | 0 |
1115 | 1116 | *00*79 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -40 | 0 | 0 |
73298 | 73299 | *38*76 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -340 | 0 | 0 |
6636 | 6637 | *33*02 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -1350 | 0 | 0 |
Row index | 연번 | 고객번호 | 추징발생년월 | 고지년월 | 계산년월 | 추징금액(상) | 추징금액(하) | 추징금액(물) |
---|---|---|---|---|---|---|---|---|
16585 | 16586 | *93*18 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -220 | 0 | 0 |
28029 | 28030 | *55*71 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -250 | 0 | 0 |
43293 | 43294 | *42*65 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -300 | 0 | 0 |
16059 | 16060 | *89*23 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -540 | 0 | 0 |
64875 | 64876 | *99*52 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -170 | 0 | 0 |
59401 | 59402 | *13*32 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -280 | 0 | 0 |
77808 | 77809 | *90*97 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -40 | 0 | 0 |
35366 | 35367 | *98*68 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -130 | 0 | 0 |
80960 | 80961 | *96*92 | 2021-07-01 | 2021-07-01 | 2021-07-01 | -160 | 0 | 0 |
27573 | 27574 | *53*50 | 2021-07-01 | 2021-08-01 | 2021-07-01 | -160 | 0 | 0 |
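Each sample line above carries a leading row index followed by the eight variables. A minimal sketch of turning one such pipe-delimited line into typed fields is shown below. The class and function names are hypothetical and not part of the report.

```python
from dataclasses import dataclass

@dataclass
class SampleRow:
    index: int                     # leading row index
    serial: int                    # 연번
    customer: str                  # 고객번호 (masked)
    occurred: str                  # 추징발생년월
    notified: str                  # 고지년월
    calculated: str                # 계산년월
    amounts: tuple[int, int, int]  # 추징금액 (상), (하), (물)

def parse_row(line: str) -> SampleRow:
    """Split one pipe-delimited sample line into typed fields."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
    idx, serial, customer, occurred, notified, calculated, *amounts = cells
    return SampleRow(int(idx), int(serial), customer, occurred, notified,
                     calculated, tuple(int(a) for a in amounts))

row = parse_row("33738 | 33739 | *91*30 | 2021-07-01 | 2021-07-01 | "
                "2021-07-01 | -7290 | 0 | 0 |")
print(row)
```

In the rows shown, 연번 equals the row index plus one.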