Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.9 KiB
Average record size in memory: 91.3 B

Variable types

Categorical: 6
Numeric: 4

Alerts

협잡물함수율 has constant value "0" (Constant)
액상슬러지함수율 has constant value "0" (Constant)
권역 is highly overall correlated with 하수처리시설명 and 2 other fields (High correlation)
하수처리시설명 is highly overall correlated with 탈수슬러지처리량 and 3 other fields (High correlation)
관리단 is highly overall correlated with 탈수슬러지처리량 and 3 other fields (High correlation)
협잡물처리량 is highly overall correlated with 권역 and 1 other field (High correlation)
탈수슬러지처리량 is highly overall correlated with 탈수슬러지함수율 and 2 other fields (High correlation)
탈수슬러지함수율 is highly overall correlated with 탈수슬러지처리량 and 1 other field (High correlation)
관리단 is highly imbalanced (56.6%) (Imbalance)
협잡물처리량 is highly imbalanced (64.9%) (Imbalance)
탈수슬러지처리량 has 38 (38.0%) zeros (Zeros)
액상슬러지처리량 has 93 (93.0%) zeros (Zeros)
탈수슬러지함수율 has 62 (62.0%) zeros (Zeros)
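
The two imbalance percentages can be reproduced from the value counts reported later for each variable. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy; that formula is an assumption that happens to match both figures, not something the report itself states. The function and variable names are illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class only)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Shannon entropy in bits of the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Normalise by the entropy of a perfectly balanced variable with the same classes.
    return 1.0 - entropy / math.log2(n_classes)


# Value counts taken from the Common Values tables of this report.
management_group_score = imbalance_score([84, 15, 1])        # 관리단
screenings_amount_score = imbalance_score([85, 10, 3, 1, 1])  # 협잡물처리량

print(f"{management_group_score:.1%}")   # 56.6%
print(f"{screenings_amount_score:.1%}")  # 64.9%
```

A score near 0% would mean the categories are evenly populated; both variables here are dominated by a single value.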

Reproduction

Analysis started: 2023-12-10 13:03:32.088287
Analysis finished: 2023-12-10 13:03:33.886182
Duration: 1.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

권역 (region)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90
2nd row: 90
3rd row: 90
4th row: 90
5th row: 90

Common Values

Value | Count | Frequency (%)
90 | 84 | 84.0%
91 | 16 | 16.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

하수처리시설명 (sewage treatment facility name)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 50001
2nd row: 50002
3rd row: 50001
4th row: 50001
5th row: 50002

Common Values

Value | Count | Frequency (%)
50001 | 46 | 46.0%
50002 | 24 | 24.0%
60001 | 15 | 15.0%
50003 | 14 | 14.0%
40001 | 1 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

처리일자 (treatment date)
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190316
Minimum: 20190301
Maximum: 20190331
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190301
5-th percentile: 20190303
Q1: 20190308
Median: 20190317
Q3: 20190323
95-th percentile: 20190330
Maximum: 20190331
Range: 30
Interquartile range (IQR): 15.25

Descriptive statistics

Standard deviation: 8.7604252
Coefficient of variation (CV): 4.3389242 × 10⁻⁷
Kurtosis: -1.1674996
Mean: 20190316
Median Absolute Deviation (MAD): 7
Skewness: -0.020959062
Sum: 2.0190316 × 10⁹
Variance: 76.745051
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=31)]

Common values

Value | Count | Frequency (%)
20190307 | 5 | 5.0%
20190304 | 5 | 5.0%
20190318 | 5 | 5.0%
20190305 | 4 | 4.0%
20190322 | 4 | 4.0%
20190319 | 4 | 4.0%
20190327 | 4 | 4.0%
20190325 | 4 | 4.0%
20190320 | 4 | 4.0%
20190331 | 4 | 4.0%
Other values (21) | 57 | 57.0%

Extreme values (minimum)

Value | Count | Frequency (%)
20190301 | 2 | 2.0%
20190302 | 2 | 2.0%
20190303 | 2 | 2.0%
20190304 | 5 | 5.0%
20190305 | 4 | 4.0%
20190306 | 3 | 3.0%
20190307 | 5 | 5.0%
20190308 | 3 | 3.0%
20190309 | 1 | 1.0%
20190310 | 2 | 2.0%

Extreme values (maximum)

Value | Count | Frequency (%)
20190331 | 4 | 4.0%
20190330 | 3 | 3.0%
20190329 | 3 | 3.0%
20190328 | 2 | 2.0%
20190327 | 4 | 4.0%
20190326 | 2 | 2.0%
20190325 | 4 | 4.0%
20190324 | 3 | 3.0%
20190323 | 3 | 3.0%
20190322 | 4 | 4.0%
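
The 처리일자 values look like calendar dates stored as YYYYMMDD integers, which would explain why the column is profiled as a real number and why statistics such as the mean and sum are hard to interpret. The sketch below assumes that encoding (it is not confirmed by the report) and shows one way to convert the reported minimum and maximum into dates with the standard library; `parse_yyyymmdd` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20190307 into a date, assuming YYYYMMDD encoding."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum as reported in the quantile statistics.
first_day = parse_yyyymmdd(20190301)
last_day = parse_yyyymmdd(20190331)

# Number of calendar days between the first and last observation.
span_days = (last_day - first_day).days

print(first_day.isoformat(), last_day.isoformat(), span_days)
```

Under this assumption the data cover March 2019, and the 31 distinct values correspond to one value per day of the month. The integer range of 30 matches the day span only because all values fall within a single month.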

관리단 (management group)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 500
2nd row: 500
3rd row: 500
4th row: 500
5th row: 500

Common Values

Value | Count | Frequency (%)
500 | 84 | 84.0%
600 | 15 | 15.0%
400 | 1 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

탈수슬러지처리량 (dewatered sludge treated amount)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3842.51
Minimum: 0
Maximum: 64150
Zeros: 38
Zeros (%): 38.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3644
Q3: 5780
95-th percentile: 6939.5
Maximum: 64150
Range: 64150
Interquartile range (IQR): 5780

Descriptive statistics

Standard deviation: 6764.1927
Coefficient of variation (CV): 1.7603579
Kurtosis: 64.803484
Mean: 3842.51
Median Absolute Deviation (MAD): 2841
Skewness: 7.2783706
Sum: 384251
Variance: 45754303
Monotonicity: Not monotonic
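
Several of the 탈수슬러지처리량 figures above are derived from others, so they can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded values printed in this report and the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1); small discrepancies from rounding are expected.

```python
import math

# Values copied from the statistics reported for 탈수슬러지처리량.
mean = 3842.51
std = 6764.1927
q1, q3 = 0, 5780
minimum, maximum = 0, 64150
total, n_rows = 384251, 100

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
mean_from_sum = total / n_rows  # mean recomputed from the sum

print(f"CV={cv:.7f} variance={variance:.0f} IQR={iqr} range={value_range}")
print(math.isclose(mean_from_sum, mean))
```

A CV well above 1 together with the single maximum of 64150 (the next largest value is 11098) suggests the spread is driven largely by one outlier and by the 38 zero rows.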

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 38 | 38.0%
4900 | 3 | 3.0%
5840 | 2 | 2.0%
2676 | 2 | 2.0%
4370 | 2 | 2.0%
4520 | 1 | 1.0%
5670 | 1 | 1.0%
4480 | 1 | 1.0%
6820 | 1 | 1.0%
5810 | 1 | 1.0%
Other values (48) | 48 | 48.0%

Extreme values (minimum)

Value | Count | Frequency (%)
0 | 38 | 38.0%
988 | 1 | 1.0%
2370 | 1 | 1.0%
2394 | 1 | 1.0%
2408 | 1 | 1.0%
2676 | 2 | 2.0%
2752 | 1 | 1.0%
3058 | 1 | 1.0%
3135 | 1 | 1.0%
3173 | 1 | 1.0%

Extreme values (maximum)

Value | Count | Frequency (%)
64150 | 1 | 1.0%
11098 | 1 | 1.0%
10666 | 1 | 1.0%
10600 | 1 | 1.0%
7500 | 1 | 1.0%
6910 | 1 | 1.0%
6830 | 1 | 1.0%
6820 | 1 | 1.0%
6760 | 1 | 1.0%
6590 | 1 | 1.0%

액상슬러지처리량 (liquid sludge treated amount)
Real number (ℝ)

ZEROS

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 168.4
Minimum: 0
Maximum: 3410
Zeros: 93
Zeros (%): 93.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1953.5
Maximum: 3410
Range: 3410
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 631.38571
Coefficient of variation (CV): 3.7493213
Kurtosis: 12.840404
Mean: 168.4
Median Absolute Deviation (MAD): 0
Skewness: 3.7088666
Sum: 16840
Variance: 398647.92
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]
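
Common values

Value | Count | Frequency (%)
0 | 93 | 93.0%
1730 | 1 | 1.0%
2580 | 1 | 1.0%
2600 | 1 | 1.0%
2210 | 1 | 1.0%
1940 | 1 | 1.0%
2370 | 1 | 1.0%
3410 | 1 | 1.0%

Extreme values (minimum)

Value | Count | Frequency (%)
0 | 93 | 93.0%
1730 | 1 | 1.0%
1940 | 1 | 1.0%
2210 | 1 | 1.0%
2370 | 1 | 1.0%
2580 | 1 | 1.0%
2600 | 1 | 1.0%
3410 | 1 | 1.0%

Extreme values (maximum)

Value | Count | Frequency (%)
3410 | 1 | 1.0%
2600 | 1 | 1.0%
2580 | 1 | 1.0%
2370 | 1 | 1.0%
2210 | 1 | 1.0%
1940 | 1 | 1.0%
1730 | 1 | 1.0%
0 | 93 | 93.0%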

탈수슬러지함수율 (dewatered sludge moisture content)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6140.5
Minimum: 0
Maximum: 41750
Zeros: 62
Zeros (%): 62.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 10560
95-th percentile: 24177
Maximum: 41750
Range: 41750
Interquartile range (IQR): 10560

Descriptive statistics

Standard deviation: 9409.4024
Coefficient of variation (CV): 1.5323512
Kurtosis: 2.1014971
Mean: 6140.5
Median Absolute Deviation (MAD): 0
Skewness: 1.5956329
Sum: 614050
Variance: 88536853
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
0 | 62 | 62.0%
10560 | 4 | 4.0%
10540 | 2 | 2.0%
10570 | 2 | 2.0%
10520 | 2 | 2.0%
10510 | 2 | 2.0%
11930 | 2 | 2.0%
10530 | 2 | 2.0%
5960 | 1 | 1.0%
29820 | 1 | 1.0%
Other values (20) | 20 | 20.0%

Extreme values (minimum)

Value | Count | Frequency (%)
0 | 62 | 62.0%
5960 | 1 | 1.0%
5970 | 1 | 1.0%
10370 | 1 | 1.0%
10490 | 1 | 1.0%
10510 | 2 | 2.0%
10520 | 2 | 2.0%
10530 | 2 | 2.0%
10540 | 2 | 2.0%
10560 | 4 | 4.0%

Extreme values (maximum)

Value | Count | Frequency (%)
41750 | 1 | 1.0%
31710 | 1 | 1.0%
31630 | 1 | 1.0%
31550 | 1 | 1.0%
29820 | 1 | 1.0%
23880 | 1 | 1.0%
23850 | 1 | 1.0%
21190 | 1 | 1.0%
21070 | 1 | 1.0%
21060 | 1 | 1.0%

협잡물처리량 (screenings treated amount)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.15
Min length: 3

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 85 | 85.0%
83.4 | 10 | 10.0%
83.3 | 3 | 3.0%
83.0 | 1 | 1.0%
82.9 | 1 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

협잡물함수율 (screenings moisture content)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

액상슬러지함수율 (liquid sludge moisture content)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts tabulated under Common Values]

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량
권역 | 1.000 | 1.000 | 0.000 | 1.000 | 0.127 | 0.000 | 0.230 | 0.841
하수처리시설명 | 1.000 | 1.000 | 0.000 | 1.000 | 0.730 | 0.000 | 0.673 | 0.836
처리일자 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.151 | 0.000 | 0.000
관리단 | 1.000 | 1.000 | 0.000 | 1.000 | 0.941 | 0.000 | 0.000 | 0.707
탈수슬러지처리량 | 0.127 | 0.730 | 0.000 | 0.941 | 1.000 | 0.000 | 0.000 | 0.253
액상슬러지처리량 | 0.000 | 0.000 | 0.151 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
탈수슬러지함수율 | 0.230 | 0.673 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
협잡물처리량 | 0.841 | 0.836 | 0.000 | 0.707 | 0.253 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
권역 | 1.000 | 0.985 | 0.995 | 0.946
하수처리시설명 | 0.985 | 1.000 | 0.990 | 0.467
관리단 | 0.995 | 0.990 | 1.000 | 0.685
협잡물처리량 | 0.946 | 0.467 | 0.685 | 1.000
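
The 0.946 entry for 권역 versus 협잡물처리량 in the categorical-only matrix is consistent with a bias-corrected Cramér's V. The sketch below is a hedged reconstruction, not the report's own code: the contingency table is an assumption inferred from the marginal counts (84/16 and 85/10/3/1/1) and from the sample rows, in which 권역 90 always pairs with 0.0, and the function name is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Small-sample bias correction applied to phi^2 and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Assumed cross-tabulation: rows are 권역 (90, 91),
# columns are 협잡물처리량 (0.0, 83.4, 83.3, 83.0, 82.9).
assumed_table = [
    [84, 0, 0, 0, 0],
    [1, 10, 3, 1, 1],
]

v = cramers_v_corrected(assumed_table)
print(round(v, 3))  # 0.946
```

If the true cross-tabulation differs from the assumed one, the result will differ too, so the match should be read as supporting evidence rather than proof of how the matrix was computed.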

[Figure: correlation heatmap]

Variable | 처리일자 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 권역 | 하수처리시설명 | 관리단 | 협잡물처리량
처리일자 | 1.000 | -0.008 | 0.072 | 0.049 | 0.000 | 0.000 | 0.000 | 0.000
탈수슬러지처리량 | -0.008 | 1.000 | 0.163 | -0.833 | 0.208 | 0.716 | 0.704 | 0.194
액상슬러지처리량 | 0.072 | 0.163 | 1.000 | -0.133 | 0.000 | 0.000 | 0.000 | 0.000
탈수슬러지함수율 | 0.049 | -0.833 | -0.133 | 1.000 | 0.238 | 0.512 | 0.000 | 0.000
권역 | 0.000 | 0.208 | 0.000 | 0.238 | 1.000 | 0.985 | 0.995 | 0.946
하수처리시설명 | 0.000 | 0.716 | 0.000 | 0.512 | 0.985 | 1.000 | 0.990 | 0.467
관리단 | 0.000 | 0.704 | 0.000 | 0.000 | 0.995 | 0.990 | 1.000 | 0.685
협잡물처리량 | 0.000 | 0.194 | 0.000 | 0.000 | 0.946 | 0.467 | 0.685 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

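First rows

(index) | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
0 | 90 | 50001 | 20190307 | 500 | 5000 | 0 | 0 | 0.0 | 0 | 0
1 | 90 | 50002 | 20190303 | 500 | 0 | 0 | 10540 | 0.0 | 0 | 0
2 | 90 | 50001 | 20190304 | 500 | 5880 | 0 | 0 | 0.0 | 0 | 0
3 | 90 | 50001 | 20190307 | 500 | 5920 | 0 | 0 | 0.0 | 0 | 0
4 | 90 | 50002 | 20190311 | 500 | 0 | 0 | 31550 | 0.0 | 0 | 0
5 | 90 | 50002 | 20190304 | 500 | 0 | 0 | 20160 | 0.0 | 0 | 0
6 | 90 | 50001 | 20190302 | 500 | 5770 | 0 | 0 | 0.0 | 0 | 0
7 | 90 | 50001 | 20190303 | 500 | 5910 | 0 | 0 | 0.0 | 0 | 0
8 | 90 | 50001 | 20190306 | 500 | 4370 | 0 | 0 | 0.0 | 0 | 0
9 | 90 | 50002 | 20190321 | 500 | 0 | 0 | 10570 | 0.0 | 0 | 0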
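
Last rows

(index) | 권역 | 하수처리시설명 | 처리일자 | 관리단 | 탈수슬러지처리량 | 액상슬러지처리량 | 탈수슬러지함수율 | 협잡물처리량 | 협잡물함수율 | 액상슬러지함수율
90 | 91 | 60001 | 20190324 | 600 | 2370 | 0 | 0 | 83.4 | 0 | 0
91 | 91 | 60001 | 20190323 | 600 | 2394 | 0 | 0 | 83.3 | 0 | 0
92 | 91 | 60001 | 20190322 | 600 | 3249 | 0 | 0 | 83.4 | 0 | 0
93 | 91 | 60001 | 20190307 | 600 | 3135 | 0 | 0 | 83.4 | 0 | 0
94 | 91 | 60001 | 20190305 | 600 | 3268 | 0 | 0 | 83.3 | 0 | 0
95 | 91 | 60001 | 20190318 | 600 | 11098 | 0 | 0 | 83.4 | 0 | 0
96 | 91 | 60001 | 20190310 | 600 | 2676 | 0 | 0 | 83.4 | 0 | 0
97 | 91 | 60001 | 20190304 | 600 | 10666 | 0 | 0 | 82.9 | 0 | 0
98 | 91 | 60001 | 20190330 | 600 | 2408 | 0 | 0 | 83.4 | 0 | 0
99 | 91 | 60001 | 20190312 | 600 | 2676 | 0 | 0 | 83.4 | 0 | 0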