Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 19953 |
Missing cells (%) | 28.5% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 644.5 KiB |
Average record size in memory | 66.0 B |
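The missing-cell share can be cross-checked from figures given elsewhere in this report: 7 variables × 10000 observations gives 70000 cells, and the two columns flagged as mostly missing contribute 9997 and 9956 missing values. A minimal sketch of that arithmetic, assuming (as the per-variable summaries indicate) that no other column has missing values; the variable names are illustrative only:

```python
# Figures copied from this report; the names are illustrative only.
n_variables = 7
n_observations = 10_000
missing_per_column = {
    "결재처리문서아이디(ID)": 9_997,
    "진척번호": 9_956,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, f"{missing_pct:.1f}%")
```

The result agrees with the 19953 missing cells (28.5%) listed above.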
Variable types
Numeric | 2 |
---|---|
Text | 3 |
DateTime | 1 |
Categorical | 1 |
Dataset
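Description | Busan Metropolitan City Waterworks Headquarters_Customer Information System_Meter Information_Meter Change History_20220131 (부산광역시상수도사업본부_수용가정보시스템_계량기정보_계량기변경이력정보_20220131) |
---|---|
Author | Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부) |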
URL | http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15100348 |
계량기형식 is highly imbalanced (82.1%) | Imbalance |
결재처리문서아이디(ID) has 9997 (> 99.9%) missing values | Missing |
진척번호 has 9956 (99.6%) missing values | Missing |
연번 has unique values | Unique |
Reproduction
Analysis started | 2023-12-10 16:15:35.309549 |
---|---|
Analysis finished | 2023-12-10 16:15:36.721380 |
Duration | 1.41 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
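The reported duration follows directly from the two timestamps. A small sketch of the subtraction using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-12-10 16:15:35.309549")
finished = datetime.fromisoformat("2023-12-10 16:15:36.721380")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```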
연번 (serial number)
Real number (ℝ)
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 36479.525 |
Minimum | 14 |
---|---|
Maximum | 73674 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 14 |
---|---|
5-th percentile | 3488.75 |
Q1 | 17993 |
median | 36206 |
Q3 | 54972.75 |
95-th percentile | 69731.4 |
Maximum | 73674 |
Range | 73660 |
Interquartile range (IQR) | 36979.75 |
Descriptive statistics
Standard deviation | 21260.434 |
---|---|
Coefficient of variation (CV) | 0.58280458 |
Kurtosis | -1.2021913 |
Mean | 36479.525 |
Median Absolute Deviation (MAD) | 18466 |
Skewness | 0.013469914 |
Sum | 3.6479525 × 10⁸ |
Variance | 4.5200605 × 10⁸ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
55573 | 1 | < 0.1% |
30614 | 1 | < 0.1% |
47581 | 1 | < 0.1% |
2882 | 1 | < 0.1% |
15921 | 1 | < 0.1% |
57616 | 1 | < 0.1% |
10360 | 1 | < 0.1% |
36393 | 1 | < 0.1% |
23975 | 1 | < 0.1% |
71280 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
14 | 1 | < 0.1% |
22 | 1 | < 0.1% |
23 | 1 | < 0.1% |
40 | 1 | < 0.1% |
45 | 1 | < 0.1% |
46 | 1 | < 0.1% |
54 | 1 | < 0.1% |
55 | 1 | < 0.1% |
87 | 1 | < 0.1% |
88 | 1 | < 0.1% |
Value | Count | Frequency (%) |
73674 | 1 | < 0.1% |
73665 | 1 | < 0.1% |
73647 | 1 | < 0.1% |
73646 | 1 | < 0.1% |
73636 | 1 | < 0.1% |
73631 | 1 | < 0.1% |
73616 | 1 | < 0.1% |
73612 | 1 | < 0.1% |
73607 | 1 | < 0.1% |
73599 | 1 | < 0.1% |
고객번호 (customer number)
Text
Distinct | 3439 |
---|---|
Distinct (%) | 34.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
49**88 | 20 | 0.2% |
49**11 | 20 | 0.2% |
49**15 | 18 | 0.2% |
50**76 | 17 | 0.2% |
49**31 | 17 | 0.2% |
49**73 | 17 | 0.2% |
51**63 | 16 | 0.2% |
50**44 | 16 | 0.2% |
50**94 | 16 | 0.2% |
50**05 | 16 | 0.2% |
Other values (3429) | 9827 | 98.3% |
Most occurring characters
Value | Count | Frequency (%) |
* | 20000 | |
4 | 5273 | 8.8% |
1 | 5111 | 8.5% |
2 | 4985 | 8.3% |
0 | 4914 | 8.2% |
5 | 4633 | 7.7% |
9 | 3764 | 6.3% |
3 | 3645 | 6.1% |
6 | 2718 | 4.5% |
8 | 2552 | 4.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 40000 | |
Other Punctuation | 20000 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
4 | 5273 | |
1 | 5111 | |
2 | 4985 | |
0 | 4914 | |
5 | 4633 | |
9 | 3764 | |
3 | 3645 | |
6 | 2718 | |
8 | 2552 | |
7 | 2405 |
Other Punctuation
Value | Count | Frequency (%) |
* | 20000 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 60000 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
* | 20000 | |
4 | 5273 | 8.8% |
1 | 5111 | 8.5% |
2 | 4985 | 8.3% |
0 | 4914 | 8.2% |
5 | 4633 | 7.7% |
9 | 3764 | 6.3% |
3 | 3645 | 6.1% |
6 | 2718 | 4.5% |
8 | 2552 | 4.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 60000 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
* | 20000 | |
4 | 5273 | 8.8% |
1 | 5111 | 8.5% |
2 | 4985 | 8.3% |
0 | 4914 | 8.2% |
5 | 4633 | 7.7% |
9 | 3764 | 6.3% |
3 | 3645 | 6.1% |
6 | 2718 | 4.5% |
8 | 2552 | 4.3% |
장치일 (installation date)
Date
Distinct | 335 |
---|---|
Distinct (%) | 3.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2021-01-01 00:00:00 |
---|---|
Maximum | 2021-12-31 00:00:00 |
구경 (meter bore size)
Real number (ℝ)
Distinct | 12 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 18.4716 |
Minimum | 15 |
---|---|
Maximum | 300 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 15 |
---|---|
5-th percentile | 15 |
Q1 | 15 |
median | 15 |
Q3 | 15 |
95-th percentile | 25 |
Maximum | 300 |
Range | 285 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 13.930312 |
---|---|
Coefficient of variation (CV) | 0.75414757 |
Kurtosis | 109.53858 |
Mean | 18.4716 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 8.8823022 |
Sum | 184716 |
Variance | 194.0536 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
15 | 7927 | 79.3% |
20 | 987 | 9.9% |
25 | 599 | 6.0% |
40 | 160 | 1.6% |
50 | 109 | 1.1% |
80 | 78 | 0.8% |
32 | 58 | 0.6% |
100 | 44 | 0.4% |
150 | 25 | 0.2% |
200 | 7 | 0.1% |
Other values (2) | 6 | 0.1% |
Value | Count | Frequency (%) |
15 | 7927 | 79.3% |
20 | 987 | 9.9% |
25 | 599 | 6.0% |
32 | 58 | 0.6% |
40 | 160 | 1.6% |
50 | 109 | 1.1% |
80 | 78 | 0.8% |
100 | 44 | 0.4% |
150 | 25 | 0.2% |
200 | 7 | 0.1% |
Value | Count | Frequency (%) |
300 | 2 | < 0.1% |
250 | 4 | < 0.1% |
200 | 7 | 0.1% |
150 | 25 | 0.2% |
100 | 44 | 0.4% |
80 | 78 | 0.8% |
50 | 109 | 1.1% |
40 | 160 | 1.6% |
32 | 58 | 0.6% |
25 | 599 | 6.0% |
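Because 구경 takes only 12 distinct values, its mean, variance and standard deviation can be rebuilt from the value counts listed above. The sketch below does this with the standard library only. Using the sample variance (dividing by n − 1) is an assumption; it is the choice that agrees with the reported 194.0536.

```python
import math

# Value counts for 구경 copied from the tables in this report.
counts = {15: 7927, 20: 987, 25: 599, 32: 58, 40: 160, 50: 109,
          80: 78, 100: 44, 150: 25, 200: 7, 250: 4, 300: 2}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, mean, round(variance, 4), round(std, 6))
```

The counts add up to the 10000 observations and the sum matches the reported 184716, which also confirms that the two values hidden under "Other values (2)" are 250 and 300.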
계량기형식 (meter type)
Categorical
IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
건식 | 9564 |
---|---|
습식 | 376 |
<NA> | 60 |
Length
Max length | 4 |
---|---|
Median length | 2 |
Mean length | 2.012 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 건식 |
---|---|
2nd row | 건식 |
3rd row | 건식 |
4th row | 건식 |
5th row | 건식 |
Common Values
Value | Count | Frequency (%) |
건식 | 9564 | 95.6% |
습식 | 376 | 3.8% |
<NA> | 60 | 0.6% |
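The 82.1% imbalance figure reported for 계량기형식 is consistent with one minus the normalized Shannon entropy of the category counts. That this is the formula behind the alert is an assumption based on the numbers agreeing, not something stated in the report. A minimal sketch:

```python
import math

# Category counts for 계량기형식 copied from this report.
counts = {"건식": 9564, "습식": 376, "<NA>": 60}

n = sum(counts.values())
probabilities = [count / n for count in counts.values()]

# Shannon entropy in bits, normalized by the maximum possible entropy
# for this number of classes (assumed definition of the imbalance score).
entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
max_entropy = math.log2(len(counts))
imbalance = 1 - entropy / max_entropy

print(f"{imbalance:.1%}")
```

A score of 0% would mean three equally frequent categories; a score near 100% means one category dominates, as 건식 does here.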
결재처리문서아이디(ID) (approval document ID)
Text
MISSING
Distinct | 2 |
---|---|
Distinct (%) | 66.7% |
Missing | 9997 |
Missing (%) | > 99.9% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
sudo_301_20210300161 | 2 | 66.7% |
sudo_301_20210300160 | 1 | 33.3% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 16 | |
1 | 11 | |
_ | 6 | 10.0% |
3 | 6 | 10.0% |
2 | 6 | 10.0% |
S | 3 | 5.0% |
U | 3 | 5.0% |
D | 3 | 5.0% |
O | 3 | 5.0% |
6 | 3 | 5.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 42 | |
Uppercase Letter | 12 | 20.0% |
Connector Punctuation | 6 | 10.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 16 | |
1 | 11 | |
3 | 6 | 14.3% |
2 | 6 | 14.3% |
6 | 3 | 7.1% |
Uppercase Letter
Value | Count | Frequency (%) |
S | 3 | |
U | 3 | |
D | 3 | |
O | 3 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 6 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 48 | |
Latin | 12 | 20.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 16 | |
1 | 11 | |
_ | 6 | 12.5% |
3 | 6 | 12.5% |
2 | 6 | 12.5% |
6 | 3 | 6.2% |
Latin
Value | Count | Frequency (%) |
S | 3 | |
U | 3 | |
D | 3 | |
O | 3 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 60 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 16 | |
1 | 11 | |
_ | 6 | 10.0% |
3 | 6 | 10.0% |
2 | 6 | 10.0% |
S | 3 | 5.0% |
U | 3 | 5.0% |
D | 3 | 5.0% |
O | 3 | 5.0% |
6 | 3 | 5.0% |
진척번호 (progress number)
Text
MISSING
Distinct | 44 |
---|---|
Distinct (%) | 100.0% |
Missing | 9956 |
Missing (%) | 99.6% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
21-278 | 1 | 2.3% |
21-33 | 1 | 2.3% |
21-58 | 1 | 2.3% |
21-912 | 1 | 2.3% |
21-162 | 1 | 2.3% |
21-470 | 1 | 2.3% |
553 | 1 | 2.3% |
21-483 | 1 | 2.3% |
683 | 1 | 2.3% |
21-588 | 1 | 2.3% |
Other values (34) | 34 | 77.3% |
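The frequencies in this table appear to be computed over non-missing values only rather than over all 10000 rows: each listed value occurs once and shows 2.3%. A short sketch of that arithmetic, using the missing count reported for 진척번호 (the variable names are illustrative):

```python
n_observations = 10_000
n_missing = 9_956  # 진척번호, from this report
n_present = n_observations - n_missing

single_value_pct = 100 * 1 / n_present
other_values_pct = 100 * 34 / n_present

print(n_present, f"{single_value_pct:.1f}%", f"{other_values_pct:.1f}%")
```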
Most occurring characters
Value | Count | Frequency (%) |
2 | 55 | |
1 | 49 | |
- | 39 | |
3 | 19 | 7.8% |
4 | 16 | 6.6% |
5 | 14 | 5.8% |
7 | 13 | 5.3% |
6 | 13 | 5.3% |
8 | 10 | 4.1% |
0 | 7 | 2.9% |
Other values (2) | 8 | 3.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 203 | |
Dash Punctuation | 39 | 16.0% |
Other Punctuation | 1 | 0.4% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
2 | 55 | |
1 | 49 | |
3 | 19 | 9.4% |
4 | 16 | 7.9% |
5 | 14 | 6.9% |
7 | 13 | 6.4% |
6 | 13 | 6.4% |
8 | 10 | 4.9% |
0 | 7 | 3.4% |
9 | 7 | 3.4% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 39 |
Other Punctuation
Value | Count | Frequency (%) |
. | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 243 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
2 | 55 | |
1 | 49 | |
- | 39 | |
3 | 19 | 7.8% |
4 | 16 | 6.6% |
5 | 14 | 5.8% |
7 | 13 | 5.3% |
6 | 13 | 5.3% |
8 | 10 | 4.1% |
0 | 7 | 2.9% |
Other values (2) | 8 | 3.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 243 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
2 | 55 | |
1 | 49 | |
- | 39 | |
3 | 19 | 7.8% |
4 | 16 | 6.6% |
5 | 14 | 5.8% |
7 | 13 | 5.3% |
6 | 13 | 5.3% |
8 | 10 | 4.1% |
0 | 7 | 2.9% |
Other values (2) | 8 | 3.3% |
Variable | 연번 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|
연번 | 1.000 | 0.056 | 0.097 | 1.000 | 1.000 |
구경 | 0.056 | 1.000 | 0.216 | NaN | NaN |
계량기형식 | 0.097 | 0.216 | 1.000 | NaN | 1.000 |
결재처리문서아이디(ID) | 1.000 | NaN | NaN | 1.000 | NaN |
진척번호 | 1.000 | NaN | 1.000 | NaN | 1.000 |
Variable | 연번 | 구경 | 계량기형식 |
---|---|---|---|
연번 | 1.000 | 0.057 | 0.074 |
구경 | 0.057 | 1.000 | 0.155 |
계량기형식 | 0.074 | 0.155 | 1.000 |
Index | 연번 | 고객번호 | 장치일 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|---|---|
55572 | 55573 | 19**85 | 2021-08-27 | 15 | 건식 | <NA> | <NA> |
28604 | 28605 | 45**64 | 2021-08-02 | 15 | 건식 | <NA> | <NA> |
4144 | 4145 | 39**18 | 2021-06-07 | 15 | 건식 | <NA> | <NA> |
3215 | 3216 | 40**36 | 2021-06-07 | 15 | 건식 | <NA> | <NA> |
51040 | 51041 | 25**77 | 2021-10-08 | 15 | 건식 | <NA> | <NA> |
16872 | 16873 | 49**37 | 2021-08-30 | 15 | 건식 | <NA> | <NA> |
53136 | 53137 | 40**24 | 2021-07-20 | 15 | 건식 | <NA> | <NA> |
27094 | 27095 | 45**14 | 2021-10-26 | 15 | 건식 | <NA> | <NA> |
47450 | 47451 | 15**71 | 2021-04-22 | 15 | 건식 | <NA> | <NA> |
60644 | 60645 | 45**86 | 2021-08-10 | 15 | 건식 | <NA> | <NA> |
Index | 연번 | 고객번호 | 장치일 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|---|---|
33872 | 33873 | 13**33 | 2021-07-05 | 25 | 습식 | <NA> | 21-1277 |
7091 | 7092 | 39**15 | 2021-05-11 | 15 | 건식 | <NA> | <NA> |
19541 | 19542 | 41**47 | 2021-11-11 | 15 | 건식 | <NA> | <NA> |
63073 | 63074 | 26**67 | 2021-11-29 | 15 | 건식 | <NA> | <NA> |
46346 | 46347 | 14**00 | 2021-10-20 | 15 | 건식 | <NA> | <NA> |
69313 | 69314 | 30**23 | 2021-12-01 | 25 | 습식 | <NA> | <NA> |
69935 | 69936 | 01**49 | 2021-11-05 | 15 | 건식 | <NA> | <NA> |
38646 | 38647 | 17**02 | 2021-04-29 | 15 | 건식 | <NA> | <NA> |
36923 | 36924 | 29**18 | 2021-06-17 | 15 | 건식 | <NA> | <NA> |
760 | 761 | 50**93 | 2021-05-11 | 15 | 건식 | <NA> | <NA> |