Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 91.3 B

Variable types

Categorical: 6
Numeric: 4

Alerts

Alert | Type
협잡물함수율 has constant value "0" | Constant
액상슬러지함수율 has constant value "0" | Constant
권역 is highly overall correlated with 탈수슬러지처리량 and 3 other fields | High correlation
하수처리시설명 is highly overall correlated with 탈수슬러지처리량 and 4 other fields | High correlation
관리단 is highly overall correlated with 탈수슬러지처리량 and 3 other fields | High correlation
협잡물처리량 is highly overall correlated with 탈수슬러지처리량 and 3 other fields | High correlation
탈수슬러지처리량 is highly overall correlated with 탈수슬러지함수율 and 4 other fields | High correlation
탈수슬러지함수율 is highly overall correlated with 탈수슬러지처리량 and 1 other field | High correlation
권역 is highly imbalanced (71.4%) | Imbalance
관리단 is highly imbalanced (71.4%) | Imbalance
협잡물처리량 is highly imbalanced (82.3%) | Imbalance
탈수슬러지처리량 has 44 (44.0%) zeros | Zeros
액상슬러지처리량 has 89 (89.0%) zeros | Zeros
탈수슬러지함수율 has 56 (56.0%) zeros | Zeros
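
The imbalance percentages in the alerts are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the category counts and k is the number of distinct categories. The sketch below is an assumption about how the figure can be derived, not a quotation of the profiler's code, and the helper name `imbalance_score` is hypothetical. The counts are taken from the Common Values tables for 권역 and 협잡물처리량.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k) for a list of category counts (hypothetical helper)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single-category column gets a score of 0 in this sketch.
        return 0.0
    # Shannon entropy in bits; empty categories contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

region = imbalance_score([95, 5])        # 권역: value 90 in 95 rows, 91 in 5 rows
debris = imbalance_score([95, 3, 1, 1])  # 협잡물처리량: 0.0, 83.5, 83.4, 82.8
print(f"{region:.1%}", f"{debris:.1%}")
```

Under this definition both columns come out at the reported 71.4% and 82.3%. A score near 100% means almost all rows share one category, and 0% means the categories are evenly populated.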

Reproduction

Analysis started: 2023-12-10 13:03:21.265545
Analysis finished: 2023-12-10 13:03:23.718166
Duration: 2.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

권역
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90
2nd row: 90
3rd row: 90
4th row: 90
5th row: 90

Common Values

Value | Count | Frequency (%)
90 | 95 | 95.0%
91 | 5 | 5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

하수처리시설명
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 50002
2nd row: 50001
3rd row: 50002
4th row: 50002
5th row: 50001

Common Values

Value | Count | Frequency (%)
50001 | 51 | 51.0%
50002 | 23 | 23.0%
50003 | 21 | 21.0%
60001 | 5 | 5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

처리일자
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190516
Minimum: 20190501
Maximum: 20190531
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190501
5-th percentile: 20190502
Q1: 20190508
Median: 20190516
Q3: 20190523
95-th percentile: 20190530
Maximum: 20190531
Range: 30
Interquartile range (IQR): 15.25

Descriptive statistics

Standard deviation: 8.8986209
Coefficient of variation (CV): 4.4073271 × 10^-7
Kurtosis: -1.1767057
Mean: 20190516
Median Absolute Deviation (MAD): 8
Skewness: 0.014283806
Sum: 2.0190516 × 10^9
Variance: 79.185455
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
20190507 | 5 | 5.0%
20190524 | 5 | 5.0%
20190522 | 5 | 5.0%
20190515 | 5 | 5.0%
20190531 | 4 | 4.0%
20190520 | 4 | 4.0%
20190514 | 4 | 4.0%
20190503 | 4 | 4.0%
20190502 | 4 | 4.0%
20190513 | 4 | 4.0%
Other values (21) | 56 | 56.0%

Minimum 10 values

Value | Count | Frequency (%)
20190501 | 2 | 2.0%
20190502 | 4 | 4.0%
20190503 | 4 | 4.0%
20190504 | 2 | 2.0%
20190505 | 2 | 2.0%
20190506 | 3 | 3.0%
20190507 | 5 | 5.0%
20190508 | 4 | 4.0%
20190509 | 4 | 4.0%
20190510 | 4 | 4.0%

Maximum 10 values

Value | Count | Frequency (%)
20190531 | 4 | 4.0%
20190530 | 3 | 3.0%
20190529 | 3 | 3.0%
20190528 | 3 | 3.0%
20190527 | 4 | 4.0%
20190526 | 1 | 1.0%
20190525 | 2 | 2.0%
20190524 | 5 | 5.0%
20190523 | 3 | 3.0%
20190522 | 5 | 5.0%
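
처리일자 is profiled as a real number because the dates appear to be stored as integers of the form YYYYMMDD, so the mean (20190516) and range (30) are computed on that integer encoding. The following is a minimal sketch, assuming that encoding, of converting the values to calendar dates with the standard library; `parse_yyyymmdd` is a hypothetical helper name, not part of the report.

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20190501 to a date (assumes YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20190501)  # reported minimum
last = parse_yyyymmdd(20190531)   # reported maximum
span_days = (last - first).days   # calendar days between the two
print(first.isoformat(), last.isoformat(), span_days)
```

Within a single month the integer range and the calendar span agree (both 30 here). Across a month boundary they would not: 20190601 - 20190531 is 70 as integers but one day as dates, which is one reason to parse the column before computing statistics on it.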

관리단
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 500
2nd row: 500
3rd row: 500
4th row: 500
5th row: 500

Common Values

Value | Count | Frequency (%)
500 | 95 | 95.0%
600 | 5 | 5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

탈수슬러지처리량
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3212.79
Minimum: 0
Maximum: 10291
Zeros: 44
Zeros (%): 44.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 4675
Q3: 5892.5
95-th percentile: 6606.5
Maximum: 10291
Range: 10291
Interquartile range (IQR): 5892.5

Descriptive statistics

Standard deviation: 2976.1388
Coefficient of variation (CV): 0.9263409
Kurtosis: -1.6617748
Mean: 3212.79
Median Absolute Deviation (MAD): 1895
Skewness: -0.0088044262
Sum: 321279
Variance: 8857402.1
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 44 | 44.0%
6480 | 3 | 3.0%
5550 | 2 | 2.0%
3423 | 2 | 2.0%
6440 | 2 | 2.0%
5890 | 2 | 2.0%
6850 | 1 | 1.0%
4450 | 1 | 1.0%
4900 | 1 | 1.0%
5920 | 1 | 1.0%
Other values (41) | 41 | 41.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 44 | 44.0%
2923 | 1 | 1.0%
3249 | 1 | 1.0%
3423 | 2 | 2.0%
4450 | 1 | 1.0%
4600 | 1 | 1.0%
4750 | 1 | 1.0%
4870 | 1 | 1.0%
4900 | 1 | 1.0%
4930 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
10291 | 1 | 1.0%
7300 | 1 | 1.0%
6870 | 1 | 1.0%
6850 | 1 | 1.0%
6730 | 1 | 1.0%
6600 | 1 | 1.0%
6540 | 1 | 1.0%
6520 | 1 | 1.0%
6480 | 3 | 3.0%
6440 | 2 | 2.0%
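
Several of the statistics reported for 탈수슬러지처리량 are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. The sketch below re-derives them from the figures printed in this report as a consistency check. It uses only the rounded values shown here, so differences in the last digits are expected.

```python
# Values copied from the 탈수슬러지처리량 section of this report.
mean = 3212.79
std = 2976.1388
q1, q3 = 0.0, 5892.5
zeros, n_rows = 44, 100

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance as squared standard deviation
iqr = q3 - q1                      # interquartile range
zeros_pct = 100 * zeros / n_rows   # share of zero values, in percent

print(round(cv, 4), round(variance), iqr, zeros_pct)
```

The results agree with the reported CV (0.9263409), variance (8857402.1), IQR (5892.5) and zero share (44.0%) to within rounding. A CV close to 1 alongside a 44% zero share suggests the spread is driven largely by the split between zero and non-zero days rather than by variation among the non-zero values.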

액상슬러지처리량
Real number (ℝ)

ZEROS

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.4
Minimum: 0
Maximum: 1670
Zeros: 89
Zeros (%): 89.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1214.5
Maximum: 1670
Range: 1670
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 408.08476
Coefficient of variation (CV): 2.9065866
Kurtosis: 5.6992412
Mean: 140.4
Median Absolute Deviation (MAD): 0
Skewness: 2.6914033
Sum: 14040
Variance: 166533.17
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value | Count | Frequency (%)
0 | 89 | 89.0%
1520 | 2 | 2.0%
960 | 1 | 1.0%
1160 | 1 | 1.0%
1440 | 1 | 1.0%
1170 | 1 | 1.0%
1210 | 1 | 1.0%
1670 | 1 | 1.0%
1300 | 1 | 1.0%
1030 | 1 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 89 | 89.0%
960 | 1 | 1.0%
1030 | 1 | 1.0%
1060 | 1 | 1.0%
1160 | 1 | 1.0%
1170 | 1 | 1.0%
1210 | 1 | 1.0%
1300 | 1 | 1.0%
1440 | 1 | 1.0%
1520 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
1670 | 1 | 1.0%
1520 | 2 | 2.0%
1440 | 1 | 1.0%
1300 | 1 | 1.0%
1210 | 1 | 1.0%
1170 | 1 | 1.0%
1160 | 1 | 1.0%
1060 | 1 | 1.0%
1030 | 1 | 1.0%
960 | 1 | 1.0%

탈수슬러지함수율
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6938.8
Minimum: 0
Maximum: 35860
Zeros: 56
Zeros (%): 56.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 11952.5
95-th percentile: 21229
Maximum: 35860
Range: 35860
Interquartile range (IQR): 11952.5

Descriptive statistics

Standard deviation: 8859.3209
Coefficient of variation (CV): 1.27678
Kurtosis: -0.20089228
Mean: 6938.8
Median Absolute Deviation (MAD): 0
Skewness: 0.93287479
Sum: 693880
Variance: 78487566
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
0 | 56 | 56.0%
21060 | 4 | 4.0%
11970 | 3 | 3.0%
23900 | 2 | 2.0%
17920 | 2 | 2.0%
11930 | 2 | 2.0%
11940 | 2 | 2.0%
21040 | 2 | 2.0%
11950 | 2 | 2.0%
10510 | 2 | 2.0%
Other values (21) | 23 | 23.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 56 | 56.0%
5970 | 1 | 1.0%
5980 | 1 | 1.0%
10480 | 1 | 1.0%
10510 | 2 | 2.0%
10520 | 2 | 2.0%
10530 | 2 | 2.0%
10540 | 1 | 1.0%
10550 | 1 | 1.0%
10570 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
35860 | 1 | 1.0%
23900 | 2 | 2.0%
23890 | 1 | 1.0%
23870 | 1 | 1.0%
21090 | 1 | 1.0%
21080 | 1 | 1.0%
21060 | 4 | 4.0%
21050 | 1 | 1.0%
21040 | 2 | 2.0%
20880 | 1 | 1.0%

협잡물처리량
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.05
Min length: 3

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 95 | 95.0%
83.5 | 3 | 3.0%
83.4 | 1 | 1.0%
82.8 | 1 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

협잡물함수율
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

액상슬러지함수율
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

Interactions

[Figures: 16 pairwise interaction plots; images not reproduced]

Correlations

[Figure: correlation heatmap]

 | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량
권역 | 1.000 | 1.000 | 0.000 | 0.986 | 1.000 | 0.000 | 0.000 | 1.000
하수처리시설명 | 1.000 | 1.000 | 0.000 | 1.000 | 0.978 | 0.000 | 0.968 | 0.885
처리일자 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
관리단 | 0.986 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000
탈수슬러지처리량 | 1.000 | 0.978 | 0.000 | 1.000 | 1.000 | 0.191 | 0.670 | 0.983
액상슬러지처리량 | 0.000 | 0.000 | 0.000 | 0.000 | 0.191 | 1.000 | 0.000 | 0.000
탈수슬러지함수율 | 0.000 | 0.968 | 0.000 | 0.000 | 0.670 | 0.000 | 1.000 | 0.000
협잡물처리량 | 1.000 | 0.885 | 0.000 | 1.000 | 0.983 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
권역 | 1.000 | 0.990 | 0.894 | 0.990
하수처리시설명 | 0.990 | 1.000 | 0.990 | 0.559
관리단 | 0.894 | 0.990 | 1.000 | 0.990
협잡물처리량 | 0.990 | 0.559 | 0.990 | 1.000

[Figure: correlation heatmap]

 | 처리일자 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
처리일자 | 1.000 | -0.000 | -0.044 | 0.075 | 0.000 | 0.000 | 0.000 | 0.000
탈수슬러지처리량 | -0.000 | 1.000 | 0.411 | -0.851 | 0.969 | 0.784 | 0.969 | 0.808
액상슬러지처리량 | -0.044 | 0.411 | 1.000 | -0.294 | 0.000 | 0.000 | 0.000 | 0.000
탈수슬러지함수율 | 0.075 | -0.851 | -0.294 | 1.000 | 0.000 | 0.747 | 0.000 | 0.000
권역 | 0.000 | 0.969 | 0.000 | 0.000 | 1.000 | 0.990 | 0.894 | 0.990
하수처리시설명 | 0.000 | 0.784 | 0.000 | 0.747 | 0.990 | 1.000 | 0.990 | 0.559
관리단 | 0.000 | 0.969 | 0.000 | 0.000 | 0.894 | 0.990 | 1.000 | 0.990
협잡물처리량 | 0.000 | 0.808 | 0.000 | 0.000 | 0.990 | 0.559 | 0.990 | 1.000

Missing values

[Figure: nullity count per column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

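First rows (index 0 to 9)

index | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
0 | 90 | 50002 | 20190527 | 500 | 0 | 0 | 21080 | 0.0 | 0 | 0
1 | 90 | 50001 | 20190502 | 500 | 5660 | 0 | 0 | 0.0 | 0 | 0
2 | 90 | 50002 | 20190524 | 500 | 0 | 0 | 21060 | 0.0 | 0 | 0
3 | 90 | 50002 | 20190525 | 500 | 0 | 0 | 10530 | 0.0 | 0 | 0
4 | 90 | 50001 | 20190522 | 500 | 6040 | 0 | 0 | 0.0 | 0 | 0
5 | 90 | 50001 | 20190516 | 500 | 6060 | 0 | 0 | 0.0 | 0 | 0
6 | 90 | 50002 | 20190521 | 500 | 0 | 0 | 10520 | 0.0 | 0 | 0
7 | 90 | 50002 | 20190523 | 500 | 0 | 0 | 10510 | 0.0 | 0 | 0
8 | 90 | 50003 | 20190528 | 500 | 0 | 0 | 11960 | 0.0 | 0 | 0
9 | 90 | 50002 | 20190508 | 500 | 0 | 0 | 10580 | 0.0 | 0 | 0

Last rows (index 90 to 99)

index | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
90 | 90 | 50001 | 20190514 | 500 | 5610 | 0 | 0 | 0.0 | 0 | 0
91 | 90 | 50001 | 20190501 | 500 | 5760 | 0 | 0 | 0.0 | 0 | 0
92 | 90 | 50001 | 20190515 | 500 | 6440 | 1060 | 0 | 0.0 | 0 | 0
93 | 90 | 50002 | 20190503 | 500 | 0 | 0 | 21060 | 0.0 | 0 | 0
94 | 90 | 50001 | 20190515 | 500 | 5520 | 0 | 0 | 0.0 | 0 | 0
95 | 91 | 60001 | 20190506 | 600 | 3423 | 0 | 0 | 83.5 | 0 | 0
96 | 91 | 60001 | 20190524 | 600 | 3249 | 0 | 0 | 83.4 | 0 | 0
97 | 91 | 60001 | 20190507 | 600 | 10291 | 0 | 0 | 82.8 | 0 | 0
98 | 91 | 60001 | 20190515 | 600 | 3423 | 0 | 0 | 83.5 | 0 | 0
99 | 91 | 60001 | 20190522 | 600 | 2923 | 0 | 0 | 83.5 | 0 | 0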