Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 91.3 B

Variable types

Categorical: 6
Numeric: 4

Alerts

협잡물함수율 has constant value "0" [Constant]
액상슬러지함수율 has constant value "0" [Constant]
권역 is highly overall correlated with 탈수슬러지처리량 and 3 other fields [High correlation]
하수처리시설명 is highly overall correlated with 탈수슬러지처리량 and 4 other fields [High correlation]
관리단 is highly overall correlated with 탈수슬러지처리량 and 3 other fields [High correlation]
협잡물처리량 is highly overall correlated with 탈수슬러지처리량 and 3 other fields [High correlation]
탈수슬러지처리량 is highly overall correlated with 탈수슬러지함수율 and 4 other fields [High correlation]
탈수슬러지함수율 is highly overall correlated with 탈수슬러지처리량 and 1 other field [High correlation]
권역 is highly imbalanced (63.4%) [Imbalance]
관리단 is highly imbalanced (63.4%) [Imbalance]
협잡물처리량 is highly imbalanced (73.1%) [Imbalance]
탈수슬러지처리량 has 44 (44.0%) zeros [Zeros]
액상슬러지처리량 has 91 (91.0%) zeros [Zeros]
탈수슬러지함수율 has 56 (56.0%) zeros [Zeros]
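
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class shares. That reading is an assumption inferred from the numbers, not something the report states. The sketch below uses a hypothetical helper, `imbalance_score`, with the value counts from the Common Values tables (93/7 for 권역 and 관리단, 93/5/2 for 협잡물처리량); under that assumption it reproduces the reported 63.4% and 73.1%.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class shares and k is the number of classes (0 = balanced, 1 = one class)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single class (assumption)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(k)

# Value counts copied from the Common Values tables in this report.
print(round(100 * imbalance_score([93, 7]), 1))     # 권역 and 관리단
print(round(100 * imbalance_score([93, 5, 2]), 1))  # 협잡물처리량
```

A score near 1 means almost every row falls into one class, which is why a 93/7 split is flagged even though only two values exist.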

Reproduction

Analysis started: 2023-12-10 13:03:16.126413
Analysis finished: 2023-12-10 13:03:18.222502
Duration: 2.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

권역 (region)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90
2nd row: 90
3rd row: 90
4th row: 90
5th row: 90

Common Values

Value | Count | Frequency (%)
90 | 93 | 93.0%
91 | 7 | 7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

하수처리시설명 (sewage treatment facility name)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 50003
2nd row: 50001
3rd row: 50003
4th row: 50003
5th row: 50001

Common Values

Value | Count | Frequency (%)
50001 | 49 | 49.0%
50002 | 25 | 25.0%
50003 | 19 | 19.0%
60001 | 7 | 7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

처리일자 (processing date)
Real number (ℝ)

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190616
Minimum: 20190601
Maximum: 20190630
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190601
5-th percentile: 20190603
Q1: 20190609
median: 20190617
Q3: 20190624
95-th percentile: 20190629
Maximum: 20190630
Range: 29
Interquartile range (IQR): 15.25

Descriptive statistics

Standard deviation: 8.5070143
Coefficient of variation (CV): 4.2133505 × 10^-7
Kurtosis: -1.1958068
Mean: 20190616
Median Absolute Deviation (MAD): 7
Skewness: -0.036856072
Sum: 2.0190616 × 10^9
Variance: 72.369293
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
20190619 | 5 | 5.0%
20190614 | 5 | 5.0%
20190624 | 4 | 4.0%
20190606 | 4 | 4.0%
20190626 | 4 | 4.0%
20190630 | 4 | 4.0%
20190613 | 4 | 4.0%
20190610 | 4 | 4.0%
20190612 | 4 | 4.0%
20190618 | 4 | 4.0%
Other values (20) | 58 | 58.0%

Extreme values: minimum

Value | Count | Frequency (%)
20190601 | 2 | 2.0%
20190602 | 1 | 1.0%
20190603 | 4 | 4.0%
20190604 | 3 | 3.0%
20190605 | 4 | 4.0%
20190606 | 4 | 4.0%
20190607 | 4 | 4.0%
20190608 | 3 | 3.0%
20190609 | 1 | 1.0%
20190610 | 4 | 4.0%

Extreme values: maximum

Value | Count | Frequency (%)
20190630 | 4 | 4.0%
20190629 | 3 | 3.0%
20190628 | 4 | 4.0%
20190627 | 3 | 3.0%
20190626 | 4 | 4.0%
20190625 | 4 | 4.0%
20190624 | 4 | 4.0%
20190623 | 3 | 3.0%
20190622 | 1 | 1.0%
20190621 | 4 | 4.0%
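
The values of 처리일자 (20190601 to 20190630) look like calendar dates encoded as YYYYMMDD integers, yet the column was profiled as a real number, so its mean, variance and skewness describe the integer encoding rather than time. Assuming that reading is correct, a minimal sketch of converting such values before profiling could look like this; `parse_yyyymmdd` is a hypothetical helper, not part of the report or of ydata-profiling.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 20190601 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20190601)  # Minimum reported for 처리일자
last = parse_yyyymmdd(20190630)   # Maximum reported for 처리일자
print(first, last, (last - first).days)
```

Within a single month the day difference (29) equals the reported Range, but across a month boundary the integer encoding would jump (20190630 to 20190701 differs by 71), which is one reason to convert first.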

관리단 (management office)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 500
2nd row: 500
3rd row: 500
4th row: 500
5th row: 500

Common Values

Value | Count | Frequency (%)
500 | 93 | 93.0%
600 | 7 | 7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

탈수슬러지처리량 (dewatered sludge amount processed)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 52
Distinct (%): 52.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3066.74
Minimum: 0
Maximum: 7410
Zeros: 44
Zeros (%): 44.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 3431
Q3: 5840
95-th percentile: 6600
Maximum: 7410
Range: 7410
Interquartile range (IQR): 5840

Descriptive statistics

Standard deviation: 2855.256
Coefficient of variation (CV): 0.93103949
Kurtosis: -1.8568527
Mean: 3066.74
Median Absolute Deviation (MAD): 3149
Skewness: -0.033752482
Sum: 306674
Variance: 8152487.1
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 44 | 44.0%
5650 | 2 | 2.0%
5940 | 2 | 2.0%
5770 | 2 | 2.0%
6130 | 2 | 2.0%
6600 | 2 | 2.0%
7410 | 1 | 1.0%
4810 | 1 | 1.0%
3930 | 1 | 1.0%
4910 | 1 | 1.0%
Other values (42) | 42 | 42.0%

Extreme values: minimum

Value | Count | Frequency (%)
0 | 44 | 44.0%
2424 | 1 | 1.0%
2728 | 1 | 1.0%
3018 | 1 | 1.0%
3224 | 1 | 1.0%
3328 | 1 | 1.0%
3348 | 1 | 1.0%
3514 | 1 | 1.0%
3930 | 1 | 1.0%
4540 | 1 | 1.0%

Extreme values: maximum

Value | Count | Frequency (%)
7410 | 1 | 1.0%
6990 | 1 | 1.0%
6750 | 1 | 1.0%
6730 | 1 | 1.0%
6600 | 2 | 2.0%
6560 | 1 | 1.0%
6510 | 1 | 1.0%
6440 | 1 | 1.0%
6410 | 1 | 1.0%
6370 | 1 | 1.0%
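
The coefficient of variation listed for 탈수슬러지처리량 is the standard deviation divided by the mean. The short check below recomputes it from the rounded figures printed in the descriptive statistics, so agreement is expected only to about six decimal places; `coefficient_of_variation` is a hypothetical helper name.

```python
def coefficient_of_variation(std, mean):
    """CV = standard deviation / mean (undefined when the mean is 0)."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return std / mean

# Figures copied from the descriptive statistics of 탈수슬러지처리량.
cv = coefficient_of_variation(2855.256, 3066.74)
print(round(cv, 4))
```

A CV close to 1 here mostly reflects the 44 zero rows pulling the mean down while inflating the spread.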

액상슬러지처리량 (liquid sludge amount processed)
Real number (ℝ)

ZEROS 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134.4
Minimum: 0
Maximum: 2830
Zeros: 91
Zeros (%): 91.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1292
Maximum: 2830
Range: 2830
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 453.64365
Coefficient of variation (CV): 3.3753248
Kurtosis: 14.599807
Mean: 134.4
Median Absolute Deviation (MAD): 0
Skewness: 3.6726928
Sum: 13440
Variance: 205792.57
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
0 | 91 | 91.0%
1290 | 2 | 2.0%
1430 | 1 | 1.0%
1500 | 1 | 1.0%
1420 | 1 | 1.0%
1330 | 1 | 1.0%
1200 | 1 | 1.0%
1150 | 1 | 1.0%
2830 | 1 | 1.0%

Extreme values: minimum

Value | Count | Frequency (%)
0 | 91 | 91.0%
1150 | 1 | 1.0%
1200 | 1 | 1.0%
1290 | 2 | 2.0%
1330 | 1 | 1.0%
1420 | 1 | 1.0%
1430 | 1 | 1.0%
1500 | 1 | 1.0%
2830 | 1 | 1.0%

Extreme values: maximum

Value | Count | Frequency (%)
2830 | 1 | 1.0%
1500 | 1 | 1.0%
1430 | 1 | 1.0%
1420 | 1 | 1.0%
1330 | 1 | 1.0%
1290 | 2 | 2.0%
1200 | 1 | 1.0%
1150 | 1 | 1.0%
0 | 91 | 91.0%
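
Because 액상슬러지처리량 has only nine distinct values, its frequency table lists the whole column, so the summary statistics can be cross-checked directly. The sketch below rebuilds the 100 observations from that table and recomputes a few figures with Python's standard `statistics` module. It assumes the report uses the sample variance (n - 1) and linear interpolation for percentiles, which is what the printed numbers suggest.

```python
import statistics

# Value -> count, copied from the frequency table of 액상슬러지처리량.
value_counts = {0: 91, 1150: 1, 1200: 1, 1290: 2, 1330: 1,
                1420: 1, 1430: 1, 1500: 1, 2830: 1}

# Expand the table into a sorted list of the 100 observations.
column = sorted(v for v, n in value_counts.items() for _ in range(n))

mean = statistics.mean(column)
variance = statistics.variance(column)  # sample variance (n - 1)
median = statistics.median(column)
# "inclusive" interpolates linearly between order statistics.
p95 = statistics.quantiles(column, n=20, method="inclusive")[-1]

print(len(column), sum(column), mean, round(variance, 2), median, p95)
```

With 91% zeros, the median, Q3, IQR and MAD all collapse to 0, so the mean and the upper percentiles carry nearly all the information about this column.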

탈수슬러지함수율 (dewatered sludge moisture content)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 35
Distinct (%): 35.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5677.7
Minimum: 0
Maximum: 28370
Zeros: 56
Zeros (%): 56.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 11932.5
95-th percentile: 18954
Maximum: 28370
Range: 28370
Interquartile range (IQR): 11932.5

Descriptive statistics

Standard deviation: 7400.1448
Coefficient of variation (CV): 1.3033702
Kurtosis: 0.5243413
Mean: 5677.7
Median Absolute Deviation (MAD): 0
Skewness: 1.1100744
Sum: 567770
Variance: 54762143
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=35)]

Common values

Value | Count | Frequency (%)
0 | 56 | 56.0%
11940 | 4 | 4.0%
11950 | 3 | 3.0%
17930 | 2 | 2.0%
5970 | 2 | 2.0%
9490 | 2 | 2.0%
9420 | 2 | 2.0%
9470 | 2 | 2.0%
13160 | 1 | 1.0%
18280 | 1 | 1.0%
Other values (25) | 25 | 25.0%

Extreme values: minimum

Value | Count | Frequency (%)
0 | 56 | 56.0%
5570 | 1 | 1.0%
5970 | 2 | 2.0%
5980 | 1 | 1.0%
6410 | 1 | 1.0%
6520 | 1 | 1.0%
9390 | 1 | 1.0%
9420 | 2 | 2.0%
9430 | 1 | 1.0%
9440 | 1 | 1.0%

Extreme values: maximum

Value | Count | Frequency (%)
28370 | 1 | 1.0%
28280 | 1 | 1.0%
26320 | 1 | 1.0%
19070 | 1 | 1.0%
19030 | 1 | 1.0%
18950 | 1 | 1.0%
18280 | 1 | 1.0%
17930 | 2 | 2.0%
17910 | 1 | 1.0%
17890 | 1 | 1.0%

협잡물처리량 (screenings amount processed)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.07
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 93 | 93.0%
83.5 | 5 | 5.0%
83.4 | 2 | 2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

협잡물함수율 (screenings moisture content)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

액상슬러지함수율 (liquid sludge moisture content)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량
권역 | 1.000 | 1.000 | 0.000 | 0.993 | 1.000 | 0.495 | 0.000 | 1.000
하수처리시설명 | 1.000 | 1.000 | 0.000 | 1.000 | 0.978 | 0.501 | 0.776 | 0.667
처리일자 | 0.000 | 0.000 | 1.000 | 0.000 | 0.063 | 0.000 | 0.000 | 0.079
관리단 | 0.993 | 1.000 | 0.000 | 1.000 | 1.000 | 0.495 | 0.000 | 1.000
탈수슬러지처리량 | 1.000 | 0.978 | 0.063 | 1.000 | 1.000 | 0.819 | 0.535 | 0.793
액상슬러지처리량 | 0.495 | 0.501 | 0.000 | 0.495 | 0.819 | 1.000 | 0.000 | 0.277
탈수슬러지함수율 | 0.000 | 0.776 | 0.000 | 0.000 | 0.535 | 0.000 | 1.000 | 0.000
협잡물처리량 | 1.000 | 0.667 | 0.079 | 1.000 | 0.793 | 0.277 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
권역 | 1.000 | 0.990 | 0.922 | 0.995
하수처리시설명 | 0.990 | 1.000 | 0.990 | 0.692
관리단 | 0.922 | 0.990 | 1.000 | 0.995
협잡물처리량 | 0.995 | 0.692 | 0.995 | 1.000

[Figure: correlation heatmap]

Variable | 처리일자 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
처리일자 | 1.000 | -0.097 | 0.014 | -0.080 | 0.000 | 0.000 | 0.000 | 0.061
탈수슬러지처리량 | -0.097 | 1.000 | 0.251 | -0.851 | 0.969 | 0.784 | 0.969 | 0.696
액상슬러지처리량 | 0.014 | 0.251 | 1.000 | -0.264 | 0.332 | 0.213 | 0.332 | 0.264
탈수슬러지함수율 | -0.080 | -0.851 | -0.264 | 1.000 | 0.000 | 0.653 | 0.000 | 0.000
권역 | 0.000 | 0.969 | 0.332 | 0.000 | 1.000 | 0.990 | 0.922 | 0.995
하수처리시설명 | 0.000 | 0.784 | 0.213 | 0.653 | 0.990 | 1.000 | 0.990 | 0.692
관리단 | 0.000 | 0.969 | 0.332 | 0.000 | 0.922 | 0.990 | 1.000 | 0.995
협잡물처리량 | 0.061 | 0.696 | 0.264 | 0.000 | 0.995 | 0.692 | 0.995 | 1.000
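
The report does not name the measure behind each matrix. The 0.922 reported for the 권역/관리단 pair is, however, what a bias-corrected Cramér's V with Yates' continuity correction yields for a perfectly paired 93/7 split. The sketch below illustrates that calculation under two stated assumptions: that every 권역=90 row has 관리단=500 and every 권역=91 row has 관리단=600 (as the sample rows suggest), and that this is the measure in use. `cramers_v_corrected` is a hypothetical helper written with the standard library only.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows).
    Yates' continuity correction is applied when the table is 2x2."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    r, k = len(row_tot), len(col_tot)
    yates = (r - 1) * (k - 1) == 1
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                diff -= min(0.5, diff)  # shrink each deviation by up to 0.5
            chi2 += diff * diff / expected
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Assumed table: 93 rows pair 권역=90 with 관리단=500, 7 rows pair 91 with 600.
print(round(cramers_v_corrected([[93, 0], [0, 7]]), 3))
```

The two corrections explain why a pair that looks perfectly associated scores below 1.000: with only 7 rows in the minority class, both adjustments pull the estimate down.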

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

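First rows

(index) | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
0 | 90 | 50003 | 20190624 | 500 | 0 | 0 | 11950 | 0.0 | 0 | 0
1 | 90 | 50001 | 20190623 | 500 | 5800 | 0 | 0 | 0.0 | 0 | 0
2 | 90 | 50003 | 20190607 | 500 | 0 | 0 | 11940 | 0.0 | 0 | 0
3 | 90 | 50003 | 20190621 | 500 | 0 | 0 | 17930 | 0.0 | 0 | 0
4 | 90 | 50001 | 20190605 | 500 | 5650 | 0 | 0 | 0.0 | 0 | 0
5 | 90 | 50001 | 20190608 | 500 | 6310 | 0 | 0 | 0.0 | 0 | 0
6 | 90 | 50003 | 20190605 | 500 | 0 | 0 | 11950 | 0.0 | 0 | 0
7 | 90 | 50001 | 20190614 | 500 | 5410 | 0 | 0 | 0.0 | 0 | 0
8 | 90 | 50003 | 20190617 | 500 | 0 | 0 | 17930 | 0.0 | 0 | 0
9 | 90 | 50003 | 20190620 | 500 | 0 | 0 | 5970 | 0.0 | 0 | 0

Last rows

(index) | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
90 | 90 | 50001 | 20190607 | 500 | 5650 | 0 | 0 | 0.0 | 0 | 0
91 | 90 | 50001 | 20190621 | 500 | 6410 | 0 | 0 | 0.0 | 0 | 0
92 | 90 | 50001 | 20190622 | 500 | 5950 | 0 | 0 | 0.0 | 0 | 0
93 | 91 | 60001 | 20190620 | 600 | 3328 | 0 | 0 | 83.4 | 0 | 0
94 | 91 | 60001 | 20190619 | 600 | 2424 | 0 | 0 | 83.4 | 0 | 0
95 | 91 | 60001 | 20190608 | 600 | 3224 | 0 | 0 | 83.5 | 0 | 0
96 | 91 | 60001 | 20190614 | 600 | 3348 | 0 | 0 | 83.5 | 0 | 0
97 | 91 | 60001 | 20190629 | 600 | 2728 | 2830 | 0 | 83.5 | 0 | 0
98 | 91 | 60001 | 20190606 | 600 | 3018 | 0 | 0 | 83.5 | 0 | 0
99 | 91 | 60001 | 20190618 | 600 | 3514 | 0 | 0 | 83.5 | 0 | 0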