Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 9996 |
Missing cells (%) | 14.3% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 644.5 KiB |
Average record size in memory | 66.0 B |
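As a quick sanity check, the derived figures in this table follow from the raw counts. The sketch below recomputes the missing-cell percentage and the average record size from the values reported above; the variable names are illustrative and not part of the report.

```python
# Recompute derived dataset statistics from the raw counts reported above.
n_variables = 7
n_observations = 10_000
missing_cells = 9_996
total_size_kib = 644.5

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the table (14.3% and 66.0 B).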
Variable types
Numeric | 2 |
---|---|
Text | 2 |
DateTime | 1 |
Categorical | 2 |
Dataset
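Description | Busan Metropolitan City Waterworks Headquarters_Customer Information System_Meter Information_Meter Change History Information_20230125 (부산광역시상수도사업본부_수용가정보시스템_계량기정보_계량기변경이력정보_20230125) |
---|---|
Author | Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부) |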
URL | http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15100348 |
Alerts
계량기형식 is highly overall correlated with 결재처리문서아이디(ID) | High correlation |
결재처리문서아이디(ID) is highly overall correlated with 계량기형식 | High correlation |
계량기형식 is highly imbalanced (65.7%) | Imbalance |
결재처리문서아이디(ID) is highly imbalanced (97.6%) | Imbalance |
진척번호 has 9996 (> 99.9%) missing values | Missing |
연번 has unique values | Unique |
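The imbalance percentages in the alerts above are consistent with a score defined as one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that definition (it is inferred from the numbers, not stated in this report) and uses the class counts listed in the Common Values tables of the two categorical variables; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column gets a score of 0 here
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Class counts taken from the Common Values tables in this report.
meter_type_counts = [8866, 1074, 60]            # 계량기형식
document_id_counts = [9932, 55, 6, 2, 2, 2, 1]  # 결재처리문서아이디(ID)

print(f"{imbalance_score(meter_type_counts):.1%}")
print(f"{imbalance_score(document_id_counts):.1%}")
```

Under this assumption the two scores round to 65.7% and 97.6%, matching the alerts.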
Reproduction
Analysis started | 2023-12-10 16:15:28.436093 |
---|---|
Analysis finished | 2023-12-10 16:15:29.772919 |
Duration | 1.34 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
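A report like this one could be regenerated along the following lines. This is a minimal sketch, assuming the dataset is available as a local CSV file and that `pandas` and `ydata-profiling` are installed; the function name, file paths, and report title are hypothetical, and the `config.json` referenced above is not used here.

```python
def build_report(csv_path, output_html="report.html"):
    """Profile a CSV file and write the HTML report to output_html (sketch)."""
    # Imports are local so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: 장치일 is stored as a date string; an explicit encoding
    # argument may also be needed, depending on how the file was exported.
    df = pd.read_csv(csv_path, parse_dates=["장치일"])

    # The title is illustrative only.
    report = ProfileReport(df, title="Meter change history")
    report.to_file(output_html)
    return output_html
```

Calling `build_report("meter_change_history.csv")` (a placeholder file name) would write `report.html` to the working directory.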
연번 (serial number)
Real number (ℝ)
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 23636.912 |
Minimum | 7 |
---|---|
Maximum | 47196 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 7 |
---|---|
5-th percentile | 2553.95 |
Q1 | 11858.25 |
median | 23608.5 |
Q3 | 35331.25 |
95-th percentile | 44872.2 |
Maximum | 47196 |
Range | 47189 |
Interquartile range (IQR) | 23473 |
Descriptive statistics
Standard deviation | 13510.792 |
---|---|
Coefficient of variation (CV) | 0.57159714 |
Kurtosis | -1.1881096 |
Mean | 23636.912 |
Median Absolute Deviation (MAD) | 11734 |
Skewness | 0.0095790645 |
Sum | 2.3636912 × 10^8 |
Variance | 1.8254149 × 10^8 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
37711 | 1 | < 0.1% |
28484 | 1 | < 0.1% |
13445 | 1 | < 0.1% |
17868 | 1 | < 0.1% |
25097 | 1 | < 0.1% |
39549 | 1 | < 0.1% |
20162 | 1 | < 0.1% |
4412 | 1 | < 0.1% |
41528 | 1 | < 0.1% |
5028 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
7 | 1 | < 0.1% |
10 | 1 | < 0.1% |
16 | 1 | < 0.1% |
24 | 1 | < 0.1% |
27 | 1 | < 0.1% |
35 | 1 | < 0.1% |
46 | 1 | < 0.1% |
58 | 1 | < 0.1% |
59 | 1 | < 0.1% |
76 | 1 | < 0.1% |
Value | Count | Frequency (%) |
47196 | 1 | < 0.1% |
47187 | 1 | < 0.1% |
47184 | 1 | < 0.1% |
47183 | 1 | < 0.1% |
47182 | 1 | < 0.1% |
47177 | 1 | < 0.1% |
47176 | 1 | < 0.1% |
47175 | 1 | < 0.1% |
47171 | 1 | < 0.1% |
47167 | 1 | < 0.1% |
고객번호 (customer number)
Text
Distinct | 5514 |
---|---|
Distinct (%) | 55.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
52*54 | 17 | 0.2% |
27*05 | 16 | 0.2% |
28*78 | 14 | 0.1% |
80*96 | 14 | 0.1% |
15*91 | 12 | 0.1% |
02*78 | 12 | 0.1% |
54*92 | 12 | 0.1% |
33*85 | 11 | 0.1% |
03*28 | 11 | 0.1% |
34*75 | 11 | 0.1% |
Other values (5504) | 9870 | 98.7% |
Most occurring characters
Value | Count | Frequency (%) |
* | 20000 | 33.3% |
1 | 4552 | 7.6% |
0 | 4505 | 7.5% |
9 | 4479 | 7.5% |
2 | 4044 | 6.7% |
8 | 3937 | 6.6% |
7 | 3886 | 6.5% |
5 | 3863 | 6.4% |
3 | 3798 | 6.3% |
4 | 3695 | 6.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 40000 | 66.7% |
Other Punctuation | 20000 | 33.3% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 4552 | 11.4% |
0 | 4505 | 11.3% |
9 | 4479 | 11.2% |
2 | 4044 | 10.1% |
8 | 3937 | 9.8% |
7 | 3886 | 9.7% |
5 | 3863 | 9.7% |
3 | 3798 | 9.5% |
4 | 3695 | 9.2% |
6 | 3241 | 8.1% |
Other Punctuation
Value | Count | Frequency (%) |
* | 20000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 60000 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
* | 20000 | 33.3% |
1 | 4552 | 7.6% |
0 | 4505 | 7.5% |
9 | 4479 | 7.5% |
2 | 4044 | 6.7% |
8 | 3937 | 6.6% |
7 | 3886 | 6.5% |
5 | 3863 | 6.4% |
3 | 3798 | 6.3% |
4 | 3695 | 6.2% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 60000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
* | 20000 | 33.3% |
1 | 4552 | 7.6% |
0 | 4505 | 7.5% |
9 | 4479 | 7.5% |
2 | 4044 | 6.7% |
8 | 3937 | 6.6% |
7 | 3886 | 6.5% |
5 | 3863 | 6.4% |
3 | 3798 | 6.3% |
4 | 3695 | 6.2% |
장치일 (installation date)
Date
Distinct | 330 |
---|---|
Distinct (%) | 3.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2022-01-03 00:00:00 |
---|---|
Maximum | 2022-12-30 00:00:00 |
구경 (meter bore size)
Real number (ℝ)
Distinct | 10 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 18.1216 |
Minimum | 15 |
---|---|
Maximum | 200 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 15 |
---|---|
5-th percentile | 15 |
Q1 | 15 |
median | 15 |
Q3 | 15 |
95-th percentile | 32 |
Maximum | 200 |
Range | 185 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 12.021949 |
---|---|
Coefficient of variation (CV) | 0.66340441 |
Kurtosis | 77.605085 |
Mean | 18.1216 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 7.6888416 |
Sum | 181216 |
Variance | 144.52727 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
15 | 8224 | 82.2% |
20 | 647 | 6.5% |
25 | 604 | 6.0% |
40 | 203 | 2.0% |
50 | 127 | 1.3% |
32 | 73 | 0.7% |
80 | 52 | 0.5% |
100 | 39 | 0.4% |
150 | 25 | 0.2% |
200 | 6 | 0.1% |
Value | Count | Frequency (%) |
15 | 8224 | 82.2% |
20 | 647 | 6.5% |
25 | 604 | 6.0% |
32 | 73 | 0.7% |
40 | 203 | 2.0% |
50 | 127 | 1.3% |
80 | 52 | 0.5% |
100 | 39 | 0.4% |
150 | 25 | 0.2% |
200 | 6 | 0.1% |
Value | Count | Frequency (%) |
200 | 6 | 0.1% |
150 | 25 | 0.2% |
100 | 39 | 0.4% |
80 | 52 | 0.5% |
50 | 127 | 1.3% |
40 | 203 | 2.0% |
32 | 73 | 0.7% |
25 | 604 | 6.0% |
20 | 647 | 6.5% |
15 | 8224 | 82.2% |
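Because 구경 takes only ten distinct values, its summary statistics can be recomputed directly from the value counts in the tables above. The sketch below does this for the sum, mean, variance, and standard deviation. It assumes the sample variance (divisor n − 1), which is the convention that matches the reported figures.

```python
import math

# Value counts for 구경 as listed in the tables above: {value: count}
counts = {15: 8224, 20: 647, 25: 604, 32: 73, 40: 203,
          50: 127, 80: 52, 100: 39, 150: 25, 200: 6}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Sample variance: count-weighted squared deviations divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)

print(n, total, mean)
print(f"variance = {variance:.5f}, std = {std:.6f}")
```

The heavy concentration at 15 (8224 of 10000 rows) also explains why the median, Q1, and Q3 all equal 15 and the IQR is 0.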
계량기형식 (meter type)
Categorical
HIGH CORRELATION
IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
건식 (dry type) | 8866 |
---|---|
습식 (wet type) | 1074 |
<NA> | 60 |
Length
Max length | 4 |
---|---|
Median length | 2 |
Mean length | 2.012 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 건식 |
---|---|
2nd row | 습식 |
3rd row | 건식 |
4th row | 건식 |
5th row | 건식 |
Common Values
Value | Count | Frequency (%) |
건식 | 8866 | 88.7% |
습식 | 1074 | 10.7% |
<NA> | 60 | 0.6% |
결재처리문서아이디(ID) (approval-processing document ID)
Categorical
HIGH CORRELATION
IMBALANCE
Distinct | 7 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
<NA> | 9932 |
---|---|
SUDO_309_20220600016 | 55 |
SUDO_309_20220600017 | 6 |
SUDO_309_20220500002 | 2 |
SUDO_309_20220600014 | 2 |
Other values (2) | 3 |
Length
Max length | 20 |
---|---|
Median length | 4 |
Mean length | 4.1088 |
Min length | 4 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 9932 | 99.3% |
SUDO_309_20220600016 | 55 | 0.5% |
SUDO_309_20220600017 | 6 | 0.1% |
SUDO_309_20220500002 | 2 | < 0.1% |
SUDO_309_20220600014 | 2 | < 0.1% |
SUDO_309_20220600010 | 2 | < 0.1% |
SUDO_309_20220400001 | 1 | < 0.1% |
진척번호 (progress number)
Text
MISSING
Distinct | 4 |
---|---|
Distinct (%) | 100.0% |
Missing | 9996 |
Missing (%) | > 99.9% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
22-1921 | 1 | 25.0% |
22-2200 | 1 | 25.0% |
22-1947 | 1 | 25.0% |
22-984 | 1 | 25.0% |
Most occurring characters
Value | Count | Frequency (%) |
2 | 11 | 40.7% |
- | 4 | 14.8% |
1 | 3 | 11.1% |
9 | 3 | 11.1% |
0 | 2 | 7.4% |
4 | 2 | 7.4% |
7 | 1 | 3.7% |
8 | 1 | 3.7% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 23 | 85.2% |
Dash Punctuation | 4 | 14.8% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
2 | 11 | 47.8% |
1 | 3 | 13.0% |
9 | 3 | 13.0% |
0 | 2 | 8.7% |
4 | 2 | 8.7% |
7 | 1 | 4.3% |
8 | 1 | 4.3% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 4 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 27 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
2 | 11 | 40.7% |
- | 4 | 14.8% |
1 | 3 | 11.1% |
9 | 3 | 11.1% |
0 | 2 | 7.4% |
4 | 2 | 7.4% |
7 | 1 | 3.7% |
8 | 1 | 3.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 27 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
2 | 11 | 40.7% |
- | 4 | 14.8% |
1 | 3 | 11.1% |
9 | 3 | 11.1% |
0 | 2 | 7.4% |
4 | 2 | 7.4% |
7 | 1 | 3.7% |
8 | 1 | 3.7% |
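The character and category counts for 진척번호 can be reproduced from its four non-missing values. The sketch below uses Python's standard `unicodedata` module, where the general category `Nd` corresponds to "Decimal Number" and `Pd` to "Dash Punctuation". The approach is illustrative and not necessarily how the profiling tool computes these tables.

```python
import unicodedata
from collections import Counter

# The four non-missing 진척번호 values listed above.
values = ["22-1921", "22-2200", "22-1947", "22-984"]
text = "".join(values)

# Count individual characters and their Unicode general categories.
chars = Counter(text)
categories = Counter(unicodedata.category(ch) for ch in text)

total = sum(chars.values())
for ch, count in chars.most_common():
    print(f"{ch!r}: {count} ({count / total:.1%})")
print(dict(categories))
```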
Variable | 연번 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|
연번 | 1.000 | 0.000 | 0.087 | 0.497 | 1.000 |
구경 | 0.000 | 1.000 | 0.333 | 0.264 | NaN |
계량기형식 | 0.087 | 0.333 | 1.000 | 0.836 | 1.000 |
결재처리문서아이디(ID) | 0.497 | 0.264 | 0.836 | 1.000 | NaN |
진척번호 | 1.000 | NaN | 1.000 | NaN | 1.000 |
Variable | 계량기형식 | 결재처리문서아이디(ID) |
---|---|---|
계량기형식 | 1.000 | 0.621 |
결재처리문서아이디(ID) | 0.621 | 1.000 |
Variable | 연번 | 구경 | 계량기형식 | 결재처리문서아이디(ID) |
---|---|---|---|---|
연번 | 1.000 | 0.023 | 0.067 | 0.296 |
구경 | 0.023 | 1.000 | 0.239 | 0.181 |
계량기형식 | 0.067 | 0.239 | 1.000 | 0.621 |
결재처리문서아이디(ID) | 0.296 | 0.181 | 0.621 | 1.000 |
Index | 연번 | 고객번호 | 장치일 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|---|---|
37710 | 37711 | *71*29 | 2022-09-15 | 15 | 건식 | <NA> | <NA> |
25102 | 25103 | *36*87 | 2022-02-04 | 32 | 습식 | <NA> | <NA> |
3974 | 3975 | *76*15 | 2022-09-29 | 15 | 건식 | <NA> | <NA> |
1479 | 1480 | *18*50 | 2022-10-18 | 15 | 건식 | <NA> | <NA> |
21995 | 21996 | *56*55 | 2022-05-30 | 15 | 건식 | <NA> | <NA> |
12977 | 12978 | *18*62 | 2022-05-17 | 15 | 건식 | <NA> | <NA> |
2188 | 2189 | *26*92 | 2022-06-13 | 40 | 건식 | <NA> | <NA> |
42271 | 42272 | *03*63 | 2022-10-21 | 40 | 습식 | <NA> | <NA> |
8099 | 8100 | *95*24 | 2022-07-02 | 15 | 건식 | <NA> | <NA> |
40445 | 40446 | *91*57 | 2022-11-09 | 15 | 건식 | <NA> | <NA> |
Index | 연번 | 고객번호 | 장치일 | 구경 | 계량기형식 | 결재처리문서아이디(ID) | 진척번호 |
---|---|---|---|---|---|---|---|
13906 | 13907 | *17*57 | 2022-03-10 | 15 | 건식 | <NA> | <NA> |
45129 | 45130 | *19*31 | 2022-12-12 | 32 | 건식 | <NA> | <NA> |
36795 | 36796 | *81*89 | 2022-09-13 | 15 | 건식 | <NA> | <NA> |
38444 | 38445 | *95*68 | 2022-12-13 | 15 | 건식 | <NA> | <NA> |
18457 | 18458 | *01*90 | 2022-01-11 | 15 | 건식 | <NA> | <NA> |
3507 | 3508 | *86*94 | 2022-10-05 | 40 | 습식 | <NA> | <NA> |
14784 | 14785 | *83*33 | 2022-07-11 | 15 | 건식 | <NA> | <NA> |
39650 | 39651 | *75*49 | 2022-11-23 | 40 | 습식 | <NA> | <NA> |
28521 | 28522 | *36*36 | 2022-03-11 | 15 | 건식 | <NA> | <NA> |
17439 | 17440 | *11*39 | 2022-05-09 | 15 | 건식 | <NA> | <NA> |