Overview

Dataset statistics

Number of variables: 8
Number of observations: 737
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.1 KiB
Average record size in memory: 68.2 B

Variable types

Numeric: 4
Text: 1
DateTime: 3

Dataset

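Description: Busan Metropolitan City Waterworks Headquarters, customer information system, billing calculation information, back-charge calculation history, 2023-01-26 (부산광역시상수도사업본부_수용가정보시스템_요금계산관련정보_추징계산이력_20230126)
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083669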

Alerts

추징금액(상) is highly overall correlated with 추징금액(물) [High correlation]
추징금액(하) is highly overall correlated with 추징금액(물) [High correlation]
추징금액(물) is highly overall correlated with 추징금액(상) and 1 other field [High correlation]
추징금액(하) is highly skewed (γ1 = 24.31880472) [Skewed]
연번 has unique values [Unique]
추징금액(상) has 142 (19.3%) zeros [Zeros]
추징금액(하) has 198 (26.9%) zeros [Zeros]
추징금액(물) has 291 (39.5%) zeros [Zeros]
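
The zero percentages in these alerts follow directly from the zero counts and the 737 observations. A minimal sketch of that arithmetic is shown here; the helper name `share` is hypothetical and used only for illustration.

```python
# Zero counts per column as listed in the alerts, out of 737 observations.
N_OBSERVATIONS = 737
zero_counts = {"추징금액(상)": 142, "추징금액(하)": 198, "추징금액(물)": 291}


def share(count, total):
    """Return count / total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_shares = {name: share(count, N_OBSERVATIONS) for name, count in zero_counts.items()}
for name, pct in zero_shares.items():
    print(f"{name}: {pct}% zeros")
```

The rounded results agree with the 19.3%, 26.9% and 39.5% shown in the alerts.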

Reproduction

Analysis started: 2023-12-10 17:13:32.499279
Analysis finished: 2023-12-10 17:13:38.201687
Duration: 5.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
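
A report of this kind can be regenerated roughly as sketched below. This is not the configuration that produced the report: the function name `build_report`, the file paths, the `cp949` encoding and the choice of columns parsed as dates are all assumptions. Only `pandas.read_csv`, `ProfileReport(df, title=...)` and `ProfileReport.to_file` are existing library calls.

```python
def build_report(csv_path, html_path, encoding="cp949",
                 date_columns=("추징발생년월", "고지년월", "계산년월")):
    """Profile a CSV export and write an HTML report (sketch under stated assumptions)."""
    # Imported lazily so that defining the function does not require the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the three year-month columns are parsed as dates,
    # which would explain why the report types them as DateTime.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=list(date_columns))
    report = ProfileReport(df, title="추징계산이력 profile")
    report.to_file(html_path)
    return report
```

Calling `build_report("input.csv", "report.html")` with a local copy of the dataset (file names hypothetical) would write a new HTML report; timings and memory figures will differ from the ones listed here.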

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 737
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 369
Minimum: 1
Maximum: 737
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 37.8
Q1: 185
Median: 369
Q3: 553
95-th percentile: 700.2
Maximum: 737
Range: 736
Interquartile range (IQR): 368

Descriptive statistics

Standard deviation: 212.89786
Coefficient of variation (CV): 0.57695898
Kurtosis: -1.2
Mean: 369
Median Absolute Deviation (MAD): 184
Skewness: 0
Sum: 271953
Variance: 45325.5
Monotonicity: Strictly increasing
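
Because 연번 is unique, strictly increasing, and runs from 1 to 737, its statistics can be checked against the plain integer sequence 1..737. The sketch below assumes the column really is that sequence, and that the report uses the sample (n - 1) variance, which is what the closed form n(n + 1)/12 = 45325.5 suggests.

```python
import math
import statistics

serial = list(range(1, 738))  # 1, 2, ..., 737

total = sum(serial)                     # n(n + 1) / 2
mean = statistics.mean(serial)          # (n + 1) / 2
variance = statistics.variance(serial)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation

print(total, mean, variance, round(std_dev, 5), round(cv, 8))
```

The sum, mean and variance reproduce the reported 271953, 369 and 45325.5 exactly; the standard deviation and CV agree with the reported values to the printed precision.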
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1      0.1%
496                 1      0.1%
487                 1      0.1%
488                 1      0.1%
489                 1      0.1%
490                 1      0.1%
491                 1      0.1%
492                 1      0.1%
493                 1      0.1%
494                 1      0.1%
Other values (727)  727    98.6%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
737    1      0.1%
736    1      0.1%
735    1      0.1%
734    1      0.1%
733    1      0.1%
732    1      0.1%
731    1      0.1%
730    1      0.1%
729    1      0.1%
728    1      0.1%

고객번호 (customer number)
Text

Distinct: 104
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 4422
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): 6.9%

Sample

1st row: *17*10
2nd row: *26*63
3rd row: *54*01
4th row: *54*01
5th row: *54*01
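
The sample rows and the length statistics suggest a fixed masked layout: an asterisk, two digits, an asterisk, two digits. The sketch below checks that pattern against the five sample rows and against the reported character totals. The regular expression is an assumption inferred from those rows, not a documented format.

```python
import re

# Assumed layout of a masked customer number: '*', two digits, '*', two digits.
MASKED_ID = re.compile(r"\*\d{2}\*\d{2}")

sample_rows = ["*17*10", "*26*63", "*54*01", "*54*01", "*54*01"]
all_match = all(MASKED_ID.fullmatch(value) is not None for value in sample_rows)

n_rows = 737
total_chars = n_rows * 6   # every value has length 6
asterisks = n_rows * 2     # two '*' per value
digits = n_rows * 4        # four digits per value

print(all_match, total_chars, asterisks, digits)
```

Under that assumption the totals come out as 4422 characters, 1474 asterisks and 2948 digits, the same figures the report lists for this variable.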

Value              Count  Frequency (%)
30*58              242    32.8%
50*11              174    23.6%
54*01              26     3.5%
02*08              20     2.7%
78*85              18     2.4%
17*43              16     2.2%
18*35              13     1.8%
02*90              13     1.8%
12*84              12     1.6%
94*90              11     1.5%
Other values (94)  192    26.1%

Most occurring characters

Value  Count  Frequency (%)
*      1474   33.3%
0      622    14.1%
5      543    12.3%
1      523    11.8%
8      379    8.6%
3      343    7.8%
7      126    2.8%
9      126    2.8%
4      122    2.8%
2      96     2.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     2948   66.7%
Other Punctuation  1474   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      622    21.1%
5      543    18.4%
1      523    17.7%
8      379    12.9%
3      343    11.6%
7      126    4.3%
9      126    4.3%
4      122    4.1%
2      96     3.3%
6      68     2.3%

Other Punctuation
Value  Count  Frequency (%)
*      1474   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  4422   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      1474   33.3%
0      622    14.1%
5      543    12.3%
1      523    11.8%
8      379    8.6%
3      343    7.8%
7      126    2.8%
9      126    2.8%
4      122    2.8%
2      96     2.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4422   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      1474   33.3%
0      622    14.1%
5      543    12.3%
1      523    11.8%
8      379    8.6%
3      343    7.8%
7      126    2.8%
9      126    2.8%
4      122    2.8%
2      96     2.2%

추징발생년월 (back-charge occurrence year-month)
DateTime

Distinct: 12
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2022-12-01 00:00:00
Histogram with fixed size bins (bins=12)

고지년월 (billing notice year-month)
DateTime

Distinct: 13
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2023-01-01 00:00:00
Histogram with fixed size bins (bins=13)

계산년월 (calculation year-month)
DateTime

Distinct: 14
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
Minimum: 2021-12-01 00:00:00
Maximum: 2023-01-01 00:00:00
Histogram with fixed size bins (bins=14)
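
For the three DateTime variables above, the distinct counts (12, 13, 14) equal the number of calendar months between each reported minimum and maximum, which would mean every month in each range occurs at least once. A small sketch of that check follows, assuming all values are first-of-month dates as the minima and maxima suggest; `months_inclusive` is a hypothetical helper.

```python
from datetime import date


def months_inclusive(first, last):
    """Number of calendar months from first to last, counting both ends."""
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


# (minimum, maximum, reported distinct count), in the order the variables appear.
reported = [
    (date(2022, 1, 1), date(2022, 12, 1), 12),
    (date(2022, 1, 1), date(2023, 1, 1), 13),
    (date(2021, 12, 1), date(2023, 1, 1), 14),
]

month_counts = [months_inclusive(lo, hi) for lo, hi, _ in reported]
no_gaps = all(count == distinct for count, (_, _, distinct) in zip(month_counts, reported))
print(month_counts, no_gaps)
```

This also explains the histogram bin counts (12, 13, 14): one bin per month.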

추징금액(상) (back-charge amount, water supply)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 198
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2312.7273
Minimum: -3000000
Maximum: 3787810
Zeros: 142
Zeros (%): 19.3%
Negative: 531
Negative (%): 72.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: -3000000
5-th percentile: -23584
Q1: -6960
Median: -1920
Q3: 0
95-th percentile: 60490
Maximum: 3787810
Range: 6787810
Interquartile range (IQR): 6960
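
The quantiles show how far the extremes sit from the bulk of the data. As an illustration, the sketch below applies the common 1.5 × IQR fence rule to the reported quartiles. The rule is a conventional choice made here for illustration; it is not something the report itself computes.

```python
# Reported quantile statistics for 추징금액(상).
q1, q3 = -6960, 0
minimum, maximum = -3000000, 3787810

iqr = q3 - q1                 # interquartile range
lower_fence = q1 - 1.5 * iqr  # values below this are conventionally "outliers"
upper_fence = q3 + 1.5 * iqr  # values above this are conventionally "outliers"

print(iqr, lower_fence, upper_fence)
print(minimum < lower_fence, maximum > upper_fence)
```

Both the minimum and the maximum fall far outside the fences, which is consistent with the very large kurtosis reported for this variable.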

Descriptive statistics

Standard deviation: 230908.53
Coefficient of variation (CV): 99.842524
Kurtosis: 184.96832
Mean: 2312.7273
Median Absolute Deviation (MAD): 2050
Skewness: 6.1784181
Sum: 1704480
Variance: 5.3318748 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   142    19.3%
-1200               121    16.4%
-6960               17     2.3%
-15600              17     2.3%
-5880               14     1.9%
-5160               14     1.9%
-4800               14     1.9%
-7680               13     1.8%
-3090               12     1.6%
-7200               11     1.5%
Other values (188)  362    49.1%

Minimum 10 values

Value     Count  Frequency (%)
-3000000  1      0.1%
-1230380  1      0.1%
-1000000  1      0.1%
-686900   1      0.1%
-591200   1      0.1%
-515390   1      0.1%
-435240   1      0.1%
-194130   1      0.1%
-162270   1      0.1%
-142260   1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
3787810  1      0.1%
3159000  1      0.1%
600000   1      0.1%
598360   1      0.1%
571900   1      0.1%
526540   1      0.1%
329580   1      0.1%
200000   1      0.1%
198000   4      0.5%
187120   1      0.1%

추징금액(하) (back-charge amount, sewerage)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 236
Distinct (%): 32.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9332.4966
Minimum: -921370
Maximum: 12249360
Zeros: 198
Zeros (%): 26.9%
Negative: 492
Negative (%): 66.8%
Memory size: 6.6 KiB

Quantile statistics

Minimum: -921370
5-th percentile: -48958
Q1: -5950
Median: -2470
Q3: 0
95-th percentile: 14110
Maximum: 12249360
Range: 13170730
Interquartile range (IQR): 5950

Descriptive statistics

Standard deviation: 469409.46
Coefficient of variation (CV): 50.29838
Kurtosis: 631.30017
Mean: 9332.4966
Median Absolute Deviation (MAD): 2470
Skewness: 24.318805
Sum: 6878050
Variance: 2.2034524 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   198    26.9%
-4500               21     2.8%
-3600               18     2.4%
-4050               17     2.3%
-4270               17     2.3%
-2250               15     2.0%
-2470               14     1.9%
-2920               13     1.8%
-16530              10     1.4%
-220                9      1.2%
Other values (226)  405    55.0%

Minimum 10 values

Value    Count  Frequency (%)
-921370  1      0.1%
-911180  1      0.1%
-906750  1      0.1%
-826190  1      0.1%
-604500  1      0.1%
-380820  1      0.1%
-271800  1      0.1%
-257040  1      0.1%
-207560  1      0.1%
-187680  1      0.1%

Maximum 10 values

Value     Count  Frequency (%)
12249360  1      0.1%
2349000   1      0.1%
1452700   1      0.1%
198240    1      0.1%
174760    1      0.1%
150000    4      0.5%
100260    1      0.1%
74620     1      0.1%
65960     3      0.4%
62910     1      0.1%

추징금액(물) (back-charge amount, water-use charge)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 151
Distinct (%): 20.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -332.42877
Minimum: -380350
Maximum: 603860
Zeros: 291
Zeros (%): 39.5%
Negative: 397
Negative (%): 53.9%
Memory size: 6.6 KiB

Quantile statistics

Minimum: -380350
5-th percentile: -3034
Q1: -1200
Median: -150
Q3: 0
95-th percentile: 4780
Maximum: 603860
Range: 984210
Interquartile range (IQR): 1200

Descriptive statistics

Standard deviation: 29006.773
Coefficient of variation (CV): -87.257109
Kurtosis: 302.26003
Mean: -332.42877
Median Absolute Deviation (MAD): 320
Skewness: 8.4384567
Sum: -245000
Variance: 8.4139289 × 10^8
Monotonicity: Not monotonic
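
The coefficient of variation is the standard deviation divided by the mean, so it takes the sign of the mean and grows without bound as the mean approaches zero. That is why it is negative here and very large for all three amount columns. A minimal sketch using the reported figures (the function name is hypothetical):

```python
def coefficient_of_variation(std_dev, mean):
    """Return std_dev / mean; undefined when the mean is zero."""
    if mean == 0:
        raise ZeroDivisionError("CV is undefined for a zero mean")
    return std_dev / mean


# (standard deviation, mean, reported CV) for the three amount columns.
reported = {
    "추징금액(상)": (230908.53, 2312.7273, 99.842524),
    "추징금액(하)": (469409.46, 9332.4966, 50.29838),
    "추징금액(물)": (29006.773, -332.42877, -87.257109),
}

computed = {name: coefficient_of_variation(sd, mu) for name, (sd, mu, _) in reported.items()}
for name, value in computed.items():
    print(name, round(value, 3))
```

With means this close to zero relative to the spread, the CV says little beyond "the mean is small compared with the standard deviation"; the median and MAD are the more stable summaries for these columns.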
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   291    39.5%
-1200               20     2.7%
-1500               17     2.3%
-750                15     2.0%
-1920               14     1.9%
-820                14     1.9%
-70                 13     1.8%
-970                13     1.8%
-470                13     1.8%
-980                12     1.6%
Other values (141)  315    42.7%

Minimum 10 values

Value    Count  Frequency (%)
-380350  1      0.1%
-241900  1      0.1%
-103780  1      0.1%
-69360   1      0.1%
-60770   1      0.1%
-48580   1      0.1%
-26080   1      0.1%
-23440   1      0.1%
-19760   1      0.1%
-17610   1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
603860  1      0.1%
76070   1      0.1%
67080   1      0.1%
64110   1      0.1%
62360   1      0.1%
37370   1      0.1%
22340   1      0.1%
20700   4      0.5%
14910   6      0.8%
13690   1      0.1%

Interactions

[16 pairwise scatter plots of the four numeric variables; images not included]

Correlations

              연번     추징발생년월  고지년월  계산년월  추징금액(상)  추징금액(하)  추징금액(물)
연번           1.000  0.891        0.832     0.752     0.069         0.006         0.168
추징발생년월   0.891  1.000        0.921     0.811     0.114         0.000         0.184
고지년월       0.832  0.921        1.000     0.934     0.192         0.000         0.163
계산년월       0.752  0.811        0.934     1.000     0.160         0.000         0.116
추징금액(상)   0.069  0.114        0.192     0.160     1.000         0.706         0.916
추징금액(하)   0.006  0.000        0.000     0.000     0.706         1.000         0.658
추징금액(물)   0.168  0.184        0.163     0.116     0.916         0.658         1.000

              연번     추징금액(상)  추징금액(하)  추징금액(물)
연번           1.000  0.171         0.102         0.199
추징금액(상)   0.171  1.000         0.483         0.839
추징금액(하)   0.102  0.483         1.000         0.616
추징금액(물)   0.199  0.839         0.616         1.000
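
The high-correlation alerts listed near the top of the report can be reproduced from the second (4 × 4) matrix above by flagging every off-diagonal coefficient at or above a cutoff. The sketch below assumes a cutoff of 0.5. That value is an assumption chosen because it yields exactly the pairs named in the alerts; the report does not state the threshold or the correlation measure used.

```python
names = ["연번", "추징금액(상)", "추징금액(하)", "추징금액(물)"]
matrix = [
    [1.000, 0.171, 0.102, 0.199],
    [0.171, 1.000, 0.483, 0.839],
    [0.102, 0.483, 1.000, 0.616],
    [0.199, 0.839, 0.616, 1.000],
]
THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose |coefficient| >= threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


alerts = highly_correlated(names, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

Under this assumption 추징금액(상) and 추징금액(하) are each flagged only with 추징금액(물), while 추징금액(물) is flagged with both, which matches the alert wording. The 0.483 between 추징금액(상) and 추징금액(하) falls just below the assumed cutoff.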

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번  고객번호  추징발생년월  고지년월  계산년월  추징금액(상)  추징금액(하)  추징금액(물)
0    1     *17*10    2022-01       2022-01   2021-12   -9860         0             0
1    2     *26*63    2022-01       2022-01   2022-01   60490         50470         9210
2    3     *54*01    2022-01       2022-01   2021-12   -6070         0             0
3    4     *54*01    2022-01       2022-01   2022-01   -7050         0             0
4    5     *54*01    2022-01       2022-03   2022-02   -1280         -840          0
5    6     *54*01    2022-01       2022-03   2022-03   0             -2320         -610
6    7     *54*01    2022-01       2022-05   2022-04   0             -3190         -830
7    8     *54*01    2022-01       2022-05   2022-05   0             -2650         -830
8    9     *54*01    2022-01       2022-07   2022-06   0             0             -710
9    10    *59*95    2022-01       2022-03   2022-02   -770          0             0

Last rows

     연번  고객번호  추징발생년월  고지년월  계산년월  추징금액(상)  추징금액(하)  추징금액(물)
727  728   *03*91    2022-12       2022-12   2022-12   -11910        -7110         -2180
728  729   *14*39    2022-12       2022-12   2022-12   1280          0             0
729  730   *12*57    2022-12       2022-12   2022-12   -7200         -4500         -1500
730  731   *99*63    2022-12       2022-12   2022-11   21880         14110         4560
731  732   *11*09    2022-12       2022-12   2022-12   92760         45120         12180
732  733   *18*35    2022-12       2022-12   2022-11   -3550         -5280         -680
733  734   *18*35    2022-12       2022-12   2022-11   -3550         -5280         -680
734  735   *18*35    2022-12       2022-12   2022-11   -3550         -2700         -680
735  736   *18*35    2022-12       2022-12   2022-12   0             -2580         0
736  737   *18*35    2022-12       2022-12   2022-11   -3550         -5280         -680