Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 91.3 B

Variable types

Categorical: 5
Numeric: 5

Alerts

협잡물함수율 has constant value "" [Constant]
액상슬러지함수율 has constant value "" [Constant]
권역 is highly overall correlated with 하수처리시설명 and 2 other fields [High correlation]
관리단 is highly overall correlated with 하수처리시설명 and 4 other fields [High correlation]
하수처리시설명 is highly overall correlated with 탈수슬러지처리량 and 4 other fields [High correlation]
탈수슬러지처리량 is highly overall correlated with 하수처리시설명 and 3 other fields [High correlation]
탈수슬러지함수율 is highly overall correlated with 탈수슬러지처리량 [High correlation]
협잡물처리량 is highly overall correlated with 하수처리시설명 and 2 other fields [High correlation]
액상슬러지처리량 is highly overall correlated with 하수처리시설명 and 2 other fields [High correlation]
액상슬러지처리량 is highly imbalanced (89.8%) [Imbalance]
탈수슬러지처리량 has 28 (28.0%) zeros [Zeros]
탈수슬러지함수율 has 72 (72.0%) zeros [Zeros]
협잡물처리량 has 71 (71.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 13:03:43.524677
Analysis finished: 2023-12-10 13:03:46.243609
Duration: 2.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

권역 (region)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90
2nd row: 90
3rd row: 90
4th row: 90
5th row: 90

Common Values

Value | Count | Frequency (%)
90 | 69 | 69.0%
91 | 31 | 31.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

하수처리시설명 (sewage treatment facility name)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53001.38
Minimum: 40001
Maximum: 70001
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 40001
5-th percentile: 50001
Q1: 50001
median: 50002
Q3: 60001
95-th percentile: 60001
Maximum: 70001
Range: 30000
Interquartile range (IQR): 10000

Descriptive statistics

Standard deviation: 5024.96
Coefficient of variation (CV): 0.094808097
Kurtosis: 0.059350478
Mean: 53001.38
Median Absolute Deviation (MAD): 1
Skewness: 0.87718521
Sum: 5300138
Variance: 25250223
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
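Most frequent values

Value | Count | Frequency (%)
50001 | 41 | 41.0%
60001 | 29 | 29.0%
50002 | 18 | 18.0%
50003 | 10 | 10.0%
70001 | 1 | 1.0%
40001 | 1 | 1.0%

Smallest values (ascending)

Value | Count | Frequency (%)
40001 | 1 | 1.0%
50001 | 41 | 41.0%
50002 | 18 | 18.0%
50003 | 10 | 10.0%
60001 | 29 | 29.0%
70001 | 1 | 1.0%

Largest values (descending)

Value | Count | Frequency (%)
70001 | 1 | 1.0%
60001 | 29 | 29.0%
50003 | 10 | 10.0%
50002 | 18 | 18.0%
50001 | 41 | 41.0%
40001 | 1 | 1.0%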
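
Because this variable has only six distinct values, the descriptive statistics reported for it can be cross-checked directly from the frequency table. The following is a minimal sketch using only the Python standard library. It assumes the reported variance and standard deviation are the sample versions (n - 1 denominator), which appears to be consistent with the figures listed here; the variable names are illustrative and not part of the report.

```python
from collections import Counter
import statistics

# Value counts copied from the frequency table for 하수처리시설명.
value_counts = Counter(
    {50001: 41, 60001: 29, 50002: 18, 50003: 10, 70001: 1, 40001: 1}
)

# Expand the counts back into the 100 individual observations.
values = sorted(value_counts.elements())

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator (assumption)
std_dev = statistics.stdev(values)      # sample standard deviation
cv = std_dev / mean                     # coefficient of variation

print("n        :", len(values))
print("sum      :", total)
print("mean     :", mean)
print("variance :", round(variance))
print("std dev  :", round(std_dev, 2))
print("CV       :", round(cv, 6))
```

The same approach works for any variable whose frequency table is listed in full; it does not work where the table ends in an "Other values" row, since the individual values are then not recoverable.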

처리일자 (treatment date)
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190117
Minimum: 20190101
Maximum: 20190131
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190101
5-th percentile: 20190103
Q1: 20190110
median: 20190117
Q3: 20190125
95-th percentile: 20190131
Maximum: 20190131
Range: 30
Interquartile range (IQR): 15.25

Descriptive statistics

Standard deviation: 9.0775781
Coefficient of variation (CV): 4.4960503 × 10^-7
Kurtosis: -1.1925558
Mean: 20190117
Median Absolute Deviation (MAD): 8
Skewness: -0.09445672
Sum: 2.0190117 × 10^9
Variance: 82.402424
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=31)]
Most frequent values

Value | Count | Frequency (%)
20190131 | 6 | 6.0%
20190115 | 5 | 5.0%
20190128 | 5 | 5.0%
20190107 | 5 | 5.0%
20190121 | 5 | 5.0%
20190114 | 4 | 4.0%
20190123 | 4 | 4.0%
20190125 | 4 | 4.0%
20190129 | 4 | 4.0%
20190124 | 4 | 4.0%
Other values (21) | 54 | 54.0%

Smallest values (ascending)

Value | Count | Frequency (%)
20190101 | 2 | 2.0%
20190102 | 3 | 3.0%
20190103 | 3 | 3.0%
20190104 | 4 | 4.0%
20190105 | 2 | 2.0%
20190106 | 1 | 1.0%
20190107 | 5 | 5.0%
20190108 | 3 | 3.0%
20190109 | 2 | 2.0%
20190110 | 4 | 4.0%

Largest values (descending)

Value | Count | Frequency (%)
20190131 | 6 | 6.0%
20190130 | 3 | 3.0%
20190129 | 4 | 4.0%
20190128 | 5 | 5.0%
20190127 | 2 | 2.0%
20190126 | 2 | 2.0%
20190125 | 4 | 4.0%
20190124 | 4 | 4.0%
20190123 | 4 | 4.0%
20190122 | 2 | 2.0%
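
The profile treats this variable as a real number, but its values (minimum 20190101, maximum 20190131, 31 distinct values) look like calendar dates encoded as YYYYMMDD integers. That reading is an assumption, not something the report states. Under that assumption, a minimal sketch of converting the integers to dates before analysis could look like the following; the helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20190122 into a date (assumes YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum taken from the quantile statistics for 처리일자.
first_day = parse_yyyymmdd(20190101)
last_day = parse_yyyymmdd(20190131)

# Number of days between the earliest and latest observation.
span_days = (last_day - first_day).days

print(first_day.isoformat(), last_day.isoformat(), span_days)
```

Once the column is parsed as dates, statistics such as the mean or the coefficient of variation of the raw integers are no longer meaningful and would normally be replaced by date-based summaries.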

관리단 (management unit)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 500
2nd row: 500
3rd row: 500
4th row: 500
5th row: 500

Common Values

Value | Count | Frequency (%)
500 | 69 | 69.0%
600 | 29 | 29.0%
700 | 1 | 1.0%
400 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

탈수슬러지처리량 (dewatered sludge treatment amount)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 69
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9675.09
Minimum: 0
Maximum: 626980
Zeros: 28
Zeros (%): 28.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2646
Q3: 5497.5
95-th percentile: 6143.5
Maximum: 626980
Range: 626980
Interquartile range (IQR): 5497.5

Descriptive statistics

Standard deviation: 62526.446
Coefficient of variation (CV): 6.4626217
Kurtosis: 98.868593
Mean: 9675.09
Median Absolute Deviation (MAD): 2646
Skewness: 9.9184069
Sum: 967509
Variance: 3.9095565 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Most frequent values

Value | Count | Frequency (%)
0 | 28 | 28.0%
6090 | 3 | 3.0%
2736 | 2 | 2.0%
2774 | 2 | 2.0%
2141 | 1 | 1.0%
2606 | 1 | 1.0%
2710 | 1 | 1.0%
2600 | 1 | 1.0%
2654 | 1 | 1.0%
1624 | 1 | 1.0%
Other values (59) | 59 | 59.0%

Smallest values (ascending)

Value | Count | Frequency (%)
0 | 28 | 28.0%
1624 | 1 | 1.0%
1851 | 1 | 1.0%
1938 | 1 | 1.0%
2040 | 1 | 1.0%
2052 | 1 | 1.0%
2078 | 1 | 1.0%
2090 | 1 | 1.0%
2128 | 1 | 1.0%
2141 | 1 | 1.0%

Largest values (descending)

Value | Count | Frequency (%)
626980 | 1 | 1.0%
43020 | 1 | 1.0%
6960 | 1 | 1.0%
6350 | 1 | 1.0%
6210 | 1 | 1.0%
6140 | 1 | 1.0%
6090 | 3 | 3.0%
6070 | 1 | 1.0%
5990 | 1 | 1.0%
5960 | 1 | 1.0%

액상슬러지처리량 (liquid sludge treatment amount)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 1
Mean length: 1.06
Min length: 1

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 98 | 98.0%
560 | 1 | 1.0%
14180 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]
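
The 89.8% imbalance figure flagged for this variable can be reproduced from the three value counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (the logarithm of the number of classes). That formula is an assumption about how the figure was derived, chosen because it matches the reported number; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for a list of class counts.

    H is the Shannon entropy of the class proportions and k the number of
    classes. 0 means perfectly balanced, values near 1 mean one class
    dominates. With a single class the ratio is undefined, so this sketch
    returns 0.0 by convention.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in proportions)
    return 1.0 - entropy / math.log(k)


# Counts from the Common Values table for 액상슬러지처리량: 0, 560, 14180.
score = imbalance_score([98, 1, 1])
print(f"{score:.1%}")
```

With 98 of 100 rows equal to 0, the entropy is very low relative to the maximum for three classes, which is why the score is close to 1.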

탈수슬러지함수율 (dewatered sludge moisture content)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4756.4
Minimum: 0
Maximum: 59540
Zeros: 72
Zeros (%): 72.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 10497.5
95-th percentile: 21050.5
Maximum: 59540
Range: 59540
Interquartile range (IQR): 10497.5

Descriptive statistics

Standard deviation: 9210.3739
Coefficient of variation (CV): 1.936417
Kurtosis: 11.742661
Mean: 4756.4
Median Absolute Deviation (MAD): 0
Skewness: 2.8102819
Sum: 475640
Variance: 84830987
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=22)]
Most frequent values

Value | Count | Frequency (%)
0 | 72 | 72.0%
11900 | 3 | 3.0%
10490 | 3 | 3.0%
21060 | 2 | 2.0%
10530 | 2 | 2.0%
10520 | 2 | 2.0%
10540 | 1 | 1.0%
10580 | 1 | 1.0%
10560 | 1 | 1.0%
20770 | 1 | 1.0%
Other values (12) | 12 | 12.0%

Smallest values (ascending)

Value | Count | Frequency (%)
0 | 72 | 72.0%
10490 | 3 | 3.0%
10520 | 2 | 2.0%
10530 | 2 | 2.0%
10540 | 1 | 1.0%
10560 | 1 | 1.0%
10580 | 1 | 1.0%
11900 | 3 | 3.0%
11910 | 1 | 1.0%
11920 | 1 | 1.0%

Largest values (descending)

Value | Count | Frequency (%)
59540 | 1 | 1.0%
23830 | 1 | 1.0%
23820 | 1 | 1.0%
21060 | 2 | 2.0%
21050 | 1 | 1.0%
21040 | 1 | 1.0%
21000 | 1 | 1.0%
20990 | 1 | 1.0%
20960 | 1 | 1.0%
20770 | 1 | 1.0%

협잡물처리량 (screenings treatment amount)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.147
Minimum: 0
Maximum: 83.5
Zeros: 71
Zeros (%): 71.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 83.2
95-th percentile: 83.4
Maximum: 83.5
Range: 83.5
Interquartile range (IQR): 83.2

Descriptive statistics

Standard deviation: 37.973261
Coefficient of variation (CV): 1.5725871
Kurtosis: -1.1399668
Mean: 24.147
Median Absolute Deviation (MAD): 0
Skewness: 0.93978348
Sum: 2414.7
Variance: 1441.9686
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
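Most frequent values

Value | Count | Frequency (%)
0.0 | 71 | 71.0%
83.3 | 14 | 14.0%
83.2 | 8 | 8.0%
83.4 | 3 | 3.0%
83.5 | 3 | 3.0%
82.2 | 1 | 1.0%

Smallest values (ascending)

Value | Count | Frequency (%)
0.0 | 71 | 71.0%
82.2 | 1 | 1.0%
83.2 | 8 | 8.0%
83.3 | 14 | 14.0%
83.4 | 3 | 3.0%
83.5 | 3 | 3.0%

Largest values (descending)

Value | Count | Frequency (%)
83.5 | 3 | 3.0%
83.4 | 3 | 3.0%
83.3 | 14 | 14.0%
83.2 | 8 | 8.0%
82.2 | 1 | 1.0%
0.0 | 71 | 71.0%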
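
The zero share and the quartiles for this variable follow from the frequency table. The sketch below rebuilds the 100 observations and computes them with the standard library, assuming quantiles are obtained by linear interpolation between order statistics over the full sample (the "inclusive" method of `statistics.quantiles`). That choice is an assumption that appears to agree with the quartiles reported here.

```python
from collections import Counter
import statistics

# Value counts copied from the frequency table for 협잡물처리량.
value_counts = Counter({0.0: 71, 83.3: 14, 83.2: 8, 83.4: 3, 83.5: 3, 82.2: 1})

# Expand the counts into the individual observations, sorted ascending.
values = sorted(value_counts.elements())

# Share of observations that are exactly zero, as a percentage.
zero_share = 100 * value_counts[0.0] / len(values)

# Quartiles by linear interpolation over the whole sample (assumption).
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print("zeros (%):", zero_share)
print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
```

Because more than half of the observations are zero, both Q1 and the median are zero, and the interquartile range equals Q3.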

협잡물함수율 (screenings moisture content)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

액상슬러지함수율 (liquid sludge moisture content)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

Interactions

[Figures: 25 interaction plots, not reproduced in this text version]

Correlations

[Figure: correlation heatmap]

 | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량
권역 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.050 | 0.502 | 0.994
하수처리시설명 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 0.941 | 0.422 | 1.000
처리일자 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
관리단 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 0.669 | 0.144 | 1.000
탈수슬러지처리량 | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000
액상슬러지처리량 | 0.050 | 0.941 | 0.000 | 0.669 | 1.000 | 1.000 | 0.000 | 0.000
탈수슬러지함수율 | 0.502 | 0.422 | 0.000 | 0.144 | 0.000 | 0.000 | 1.000 | 0.469
협잡물처리량 | 0.994 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.469 | 1.000

[Figure: correlation heatmap]

 | 액상슬러지처리량 | 권역 | 관리단
액상슬러지처리량 | 1.000 | 0.082 | 0.694
권역 | 0.082 | 1.000 | 0.990
관리단 | 0.694 | 0.990 | 1.000

[Figure: correlation heatmap]

 | 하수처리시설명 | 처리일자 | 탈수슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 권역 | 관리단 | 액상슬러지처리량
하수처리시설명 | 1.000 | -0.047 | -0.584 | 0.142 | 0.792 | 0.990 | 1.000 | 0.694
처리일자 | -0.047 | 1.000 | 0.051 | -0.036 | -0.018 | 0.000 | 0.000 | 0.000
탈수슬러지처리량 | -0.584 | 0.051 | 1.000 | -0.773 | -0.159 | 0.000 | 0.990 | 0.995
탈수슬러지함수율 | 0.142 | -0.036 | -0.773 | 1.000 | -0.385 | 0.354 | 0.089 | 0.000
협잡물처리량 | 0.792 | -0.018 | -0.159 | -0.385 | 1.000 | 0.929 | 0.990 | 0.000
권역 | 0.990 | 0.000 | 0.000 | 0.354 | 0.929 | 1.000 | 0.990 | 0.082
관리단 | 1.000 | 0.000 | 0.990 | 0.089 | 0.990 | 0.990 | 1.000 | 0.694
액상슬러지처리량 | 0.694 | 0.000 | 0.995 | 0.000 | 0.000 | 0.082 | 0.694 | 1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

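First rows

Row | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
0 | 90 | 50001 | 20190122 | 500 | 5760 | 0 | 0 | 0.0 | 0 | 0
1 | 90 | 50002 | 20190116 | 500 | 0 | 0 | 10490 | 0.0 | 0 | 0
2 | 90 | 50001 | 20190108 | 500 | 6140 | 0 | 0 | 0.0 | 0 | 0
3 | 90 | 50001 | 20190109 | 500 | 6090 | 0 | 0 | 0.0 | 0 | 0
4 | 90 | 50003 | 20190114 | 500 | 0 | 0 | 23820 | 0.0 | 0 | 0
5 | 90 | 50003 | 20190115 | 500 | 0 | 0 | 59540 | 0.0 | 0 | 0
6 | 90 | 50003 | 20190117 | 500 | 0 | 0 | 23830 | 0.0 | 0 | 0
7 | 90 | 50001 | 20190102 | 500 | 5690 | 0 | 0 | 0.0 | 0 | 0
8 | 90 | 50001 | 20190103 | 500 | 5590 | 0 | 0 | 0.0 | 0 | 0
9 | 90 | 50001 | 20190104 | 500 | 5290 | 0 | 0 | 0.0 | 0 | 0

Last rows

Row | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
90 | 91 | 60001 | 20190109 | 600 | 2308 | 0 | 0 | 83.5 | 0 | 0
91 | 91 | 60001 | 20190119 | 600 | 2774 | 0 | 0 | 83.3 | 0 | 0
92 | 91 | 60001 | 20190113 | 600 | 2191 | 0 | 0 | 83.2 | 0 | 0
93 | 91 | 60001 | 20190114 | 600 | 2052 | 0 | 0 | 83.3 | 0 | 0
94 | 91 | 60001 | 20190103 | 600 | 2128 | 0 | 0 | 83.3 | 0 | 0
95 | 91 | 60001 | 20190107 | 600 | 2090 | 0 | 0 | 83.3 | 0 | 0
96 | 91 | 40001 | 20190131 | 400 | 43020 | 0 | 0 | 0.0 | 0 | 0
97 | 91 | 60001 | 20190120 | 600 | 2774 | 0 | 0 | 83.3 | 0 | 0
98 | 91 | 60001 | 20190115 | 600 | 2166 | 0 | 0 | 83.3 | 0 | 0
99 | 91 | 60001 | 20190123 | 600 | 2622 | 0 | 0 | 83.3 | 0 | 0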