Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 10000
Missing cells (%): 14.3%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B
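
The two derived figures above can be recomputed from the raw counts. This is a minimal sketch, assuming that the cell total is variables × observations and that 1 KiB = 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 10_000
missing_cells = 10_000
total_size_kib = 654.3

# Share of missing cells: missing / (variables * observations).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because one of the seven columns is entirely empty, the missing share is exactly 1/7 of all cells.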

Variable types

DateTime: 1
Numeric: 2
Categorical: 3
Unsupported: 1

Dataset

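Description: A service for querying hourly measurement readings for the Daedong sewage treatment facility of 김해도시개발공사 (Gimhae Urban Development Corporation). It provides the reference date (기준연월일), reference hour (기준시간), sewage treatment plant name (하수처리장구분명), measurement category name (계측구분명), measured value (계측값), and related fields.
Author: 김해시도시개발공사 (Gimhae City Urban Development Corporation)
URL: https://www.data.go.kr/data/15096552/fileData.do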

Alerts

하수처리장구분명 has constant value "대동 공공하수처리시설" [Constant]
Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
계측구분명 is highly overall correlated with 계측태그명 [High correlation]
계측태그명 is highly overall correlated with 계측구분명 [High correlation]
계측단위 has 10000 (100.0%) missing values [Missing]
계측단위 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
기준시간 has 409 (4.1%) zeros [Zeros]
계측값 has 2411 (24.1%) zeros [Zeros]
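
The constant, missing, and zeros alerts correspond to simple per-column checks. The sketch below is not ydata-profiling's implementation; it is a hand-written illustration on four rows taken from the Sample section, and the helper names (`column`, `is_constant`, `missing_share`, `zero_share`) are hypothetical.

```python
# Four illustrative rows from the Sample section; 계측단위 is missing in every row.
rows = [
    {"하수처리장구분명": "대동 공공하수처리시설", "계측단위": None, "계측값": 0.0},
    {"하수처리장구분명": "대동 공공하수처리시설", "계측단위": None, "계측값": 0.0},
    {"하수처리장구분명": "대동 공공하수처리시설", "계측단위": None, "계측값": 0.24},
    {"하수처리장구분명": "대동 공공하수처리시설", "계측단위": None, "계측값": 68.54},
]


def column(name):
    """Return all values of one column as a list."""
    return [row[name] for row in rows]


def is_constant(values):
    """True when the non-missing values all share a single value."""
    present = [v for v in values if v is not None]
    return len(set(present)) == 1


def missing_share(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(v == 0 for v in values if v is not None) / len(values)


print(is_constant(column("하수처리장구분명")))
print(missing_share(column("계측단위")))
print(zero_share(column("계측값")))
```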

Reproduction

Analysis started: 2023-12-12 01:44:08.051721
Analysis finished: 2023-12-12 01:44:09.203741
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
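
The reported duration is the difference between the two timestamps. A short check, using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-12 01:44:08.051721")
finished = datetime.fromisoformat("2023-12-12 01:44:09.203741")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```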

Variables

기준연월일
DateTime

Distinct: 621
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2019-04-23 00:00:00
Maximum: 2021-01-27 00:00:00
Histogram with fixed size bins (bins=50)

기준시간
Real number (ℝ)

ZEROS 

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.6283
Minimum: 0
Maximum: 23
Zeros: 409
Zeros (%): 4.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 6
Median: 12
Q3: 18
95th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.9496452
Coefficient of variation (CV): 0.59764929
Kurtosis: -1.2046169
Mean: 11.6283
Median Absolute Deviation (MAD): 6
Skewness: -0.034787371
Sum: 116283
Variance: 48.297569
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
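
Several of the statistics listed for 기준시간 are derived from others. The sketch below recomputes them from the reported figures using the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). It checks internal consistency only and does not reproduce the profiler's own code.

```python
# Reported figures for 기준시간, copied from the lists above.
mean = 11.6283
std = 6.9496452
q1, q3 = 6, 18
minimum, maximum = 0, 23
n = 10_000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range} sum={total:.0f}")
```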
Common values

Value              Count  Frequency (%)
18                   464   4.6%
22                   459   4.6%
17                   451   4.5%
1                    443   4.4%
15                   437   4.4%
2                    435   4.3%
23                   433   4.3%
9                    426   4.3%
6                    421   4.2%
10                   420   4.2%
Other values (14)   5611  56.1%

Minimum 10 values

Value  Count  Frequency (%)
0        409   4.1%
1        443   4.4%
2        435   4.3%
3        378   3.8%
4        408   4.1%
5        367   3.7%
6        421   4.2%
7        391   3.9%
8        392   3.9%
9        426   4.3%

Maximum 10 values

Value  Count  Frequency (%)
23       433   4.3%
22       459   4.6%
21       393   3.9%
20       401   4.0%
19       409   4.1%
18       464   4.6%
17       451   4.5%
16       415   4.2%
15       437   4.4%
14       418   4.2%

하수처리장구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

대동 공공하수처리시설: 10000

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대동 공공하수처리시설
2nd row: 대동 공공하수처리시설
3rd row: 대동 공공하수처리시설
4th row: 대동 공공하수처리시설
5th row: 대동 공공하수처리시설

Common Values

Value          Count  Frequency (%)
대동 공공하수처리시설  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot; image not preserved]

Words

Value      Count  Frequency (%)
대동         10000  50.0%
공공하수처리시설   10000  50.0%

계측구분명
Categorical

HIGH CORRELATION 

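Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

약품공급량 (chemical feed amount): 1603
유량조정조 수위 (flow equalization tank level): 1602
PAC저장탱크수위 (PAC storage tank level): 1599
약품용해탱크수위 (chemical dissolving tank level): 1593
NaOCl저장탱크수위 (NaOCl storage tank level): 1529
Other values (2): 2074

Length

Max length: 11
Median length: 8
Mean length: 7.4191
Min length: 4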
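
The length statistics for 계측구분명 can be recomputed from the seven category counts listed under Common Values for this variable. The sketch assumes that length means the number of Unicode characters, spaces included, which is what Python's `len` returns for these strings.

```python
from statistics import median

# Category counts for 계측구분명, copied from its Common Values table.
counts = {
    "약품공급량": 1603,
    "유량조정조 수위": 1602,
    "PAC저장탱크수위": 1599,
    "약품용해탱크수위": 1593,
    "NaOCl저장탱크수위": 1529,
    "유입유량": 1519,
    "PAC공급량": 555,
}

total = sum(counts.values())

# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

# One length per observation, so median/min/max match a per-row computation.
lengths = sorted(len(label) for label, n in counts.items() for _ in range(n))

print(total, mean_length, median(lengths), lengths[0], lengths[-1])
```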

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유입유량
2nd row: 약품공급량
3rd row: 약품공급량
4th row: NaOCl저장탱크수위
5th row: 유입유량

Common Values

Value          Count  Frequency (%)
약품공급량         1603  16.0%
유량조정조 수위      1602  16.0%
PAC저장탱크수위     1599  16.0%
약품용해탱크수위      1593  15.9%
NaOCl저장탱크수위   1529  15.3%
유입유량          1519  15.2%
PAC공급량         555   5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot; image not preserved]

Words

Value          Count  Frequency (%)
약품공급량         1603  13.8%
유량조정조         1602  13.8%
수위            1602  13.8%
pac저장탱크수위     1599  13.8%
약품용해탱크수위      1593  13.7%
naocl저장탱크수위   1529  13.2%
유입유량          1519  13.1%
pac공급량         555   4.8%
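
The final frequency table for 계측구분명 lists lowercased tokens, and its percentages differ from those under Common Values. The figures are consistent with splitting each category value on whitespace, lowercasing it, and dividing by the total token count. The sketch below reproduces the table under that assumption, which is inferred from the numbers and not stated in the report.

```python
from collections import Counter

# Category counts for 계측구분명, copied from its Common Values table.
counts = {
    "약품공급량": 1603,
    "유량조정조 수위": 1602,
    "PAC저장탱크수위": 1599,
    "약품용해탱크수위": 1593,
    "NaOCl저장탱크수위": 1529,
    "유입유량": 1519,
    "PAC공급량": 555,
}

# Assumed tokenization: lowercase, then split on whitespace.
words = Counter()
for label, n in counts.items():
    for token in label.lower().split():
        words[token] += n

# "유량조정조 수위" contributes two tokens per row, so tokens outnumber rows.
total_tokens = sum(words.values())
share = {token: 100 * n / total_tokens for token, n in words.items()}

for token, n in words.most_common():
    print(f"{token}\t{n}\t{share[token]:.1f}%")
```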

계측태그명
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

FIT-304: 1603
LIT-201: 1602
LIT-304: 1599
LIT-302: 1593
LIT-303: 1529
Other values (2): 2074

Length

Max length: 7
Median length: 7
Mean length: 6.9445
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FIT-201
2nd row: FIT-304
3rd row: FIT-304
4th row: LIT-303
5th row: FIT-201

Common Values

Value    Count  Frequency (%)
FIT-304   1603  16.0%
LIT-201   1602  16.0%
LIT-304   1599  16.0%
LIT-302   1593  15.9%
LIT-303   1529  15.3%
FIT-201   1519  15.2%
FIT309     555   5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot; image not preserved]

Words

Value    Count  Frequency (%)
fit-304   1603  16.0%
lit-201   1602  16.0%
lit-304   1599  16.0%
lit-302   1593  15.9%
lit-303   1529  15.3%
fit-201   1519  15.2%
fit309     555   5.5%

계측단위
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

계측값
Real number (ℝ)

ZEROS 

Distinct: 885
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.270668
Minimum: -10
Maximum: 154.8
Zeros: 2411
Zeros (%): 24.1%
Negative: 508
Negative (%): 5.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -10
5th percentile: -10
Q1: 0
Median: 0.51
Q3: 1
95th percentile: 59.9955
Maximum: 154.8
Range: 164.8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 16.726502
Coefficient of variation (CV): 3.916601
Kurtosis: 13.879139
Mean: 4.270668
Median Absolute Deviation (MAD): 0.51
Skewness: 3.7733259
Sum: 42706.68
Variance: 279.77588
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
0.0                  2411  24.1%
0.81                 1286  12.9%
-10.0                 508   5.1%
0.24                  405   4.0%
0.23                  357   3.6%
1.0                   342   3.4%
1.01                  317   3.2%
0.25                  241   2.4%
0.82                  238   2.4%
0.51                  181   1.8%
Other values (875)   3714  37.1%

Minimum 10 values

Value  Count  Frequency (%)
-10.0    508   5.1%
0.0     2411  24.1%
0.01      24   0.2%
0.03      12   0.1%
0.04      21   0.2%
0.05      19   0.2%
0.06      18   0.2%
0.08      14   0.1%
0.09      13   0.1%
0.1       75   0.8%

Maximum 10 values

Value   Count  Frequency (%)
154.8       4  < 0.1%
106.76      1  < 0.1%
88.71       1  < 0.1%
88.45       1  < 0.1%
87.23       1  < 0.1%
85.61       1  < 0.1%
85.33       1  < 0.1%
85.25       1  < 0.1%
85.05       1  < 0.1%
84.83       1  < 0.1%
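
The figures for 계측값 report 508 negative values and 508 occurrences of -10.0, so every negative reading is exactly -10.0. The source does not say whether -10.0 is a sensor fault code, a placeholder, or a valid reading; treating it as a sentinel is an assumption to confirm with the data owner. The hypothetical helper below shows one way to check whether the negatives in a column collapse to a single value; the `readings` list is made up for illustration.

```python
def negative_value_summary(values):
    """Summarize negative readings: how many there are and which values they take."""
    negatives = [v for v in values if v < 0]
    distinct = sorted(set(negatives))
    return {
        "count": len(negatives),
        "distinct": distinct,
        "single_value": len(distinct) == 1,
    }


# Made-up readings for illustration; not rows from the dataset.
readings = [0.0, 0.81, -10.0, 0.24, 68.54, -10.0, 1.01, 0.0]

summary = negative_value_summary(readings)
print(summary)

# If the sentinel reading is confirmed, statistics can be recomputed without it.
valid = [v for v in readings if v >= 0]
print(len(valid))
```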

Interactions

[Four interaction plots; images not preserved]

Correlations

[Correlation heatmap; image not preserved]

          기준시간   계측구분명  계측태그명  계측값
기준시간     1.000  0.000  0.000  0.046
계측구분명    0.000  1.000  1.000  0.428
계측태그명    0.000  1.000  1.000  0.428
계측값      0.046  0.428  0.428  1.000

[Correlation heatmap; image not preserved]

          계측구분명  계측태그명
계측구분명    1.000  1.000
계측태그명    1.000  1.000

[Correlation heatmap; image not preserved]

          기준시간   계측값    계측구분명  계측태그명
기준시간     1.000  0.013  0.000  0.000
계측값      0.013  1.000  0.246  0.246
계측구분명    0.000  0.246  1.000  1.000
계측태그명    0.000  0.246  1.000  1.000
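
A correlation of 1.000 between 계측구분명 and 계측태그명 is what one would expect if each measurement category maps to exactly one tag and vice versa. The sketch below tests that property on the (category, tag) pairs from the first ten rows of the Sample section. `is_one_to_one` is a hypothetical helper, and ten rows cannot confirm the mapping for the full dataset.

```python
# (계측구분명, 계측태그명) pairs from the first ten sample rows.
pairs = [
    ("유입유량", "FIT-201"),
    ("약품공급량", "FIT-304"),
    ("약품공급량", "FIT-304"),
    ("NaOCl저장탱크수위", "LIT-303"),
    ("유입유량", "FIT-201"),
    ("유량조정조 수위", "LIT-201"),
    ("유량조정조 수위", "LIT-201"),
    ("약품용해탱크수위", "LIT-302"),
    ("NaOCl저장탱크수위", "LIT-303"),
    ("약품용해탱크수위", "LIT-302"),
]


def is_one_to_one(pairs):
    """True when every left value maps to one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault stores the first partner seen and returns it on later calls.
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True


print(is_one_to_one(pairs))
```

When the mapping holds for the whole dataset, one of the two columns is redundant for modelling purposes.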

Missing values

[Plot; image not preserved]
A simple visualization of nullity by column.

[Plot; image not preserved]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  기준연월일  기준시간  하수처리장구분명  계측구분명  계측태그명  계측단위  계측값
27049  2020-09-26  11  대동 공공하수처리시설  유입유량  FIT-201  <NA>  0.0
87238  2020-10-09  0  대동 공공하수처리시설  약품공급량  FIT-304  <NA>  0.0
88569  2020-12-05  11  대동 공공하수처리시설  약품공급량  FIT-304  <NA>  0.0
45056  2019-04-28  22  대동 공공하수처리시설  NaOCl저장탱크수위  LIT-303  <NA>  0.24
17456  2019-08-06  15  대동 공공하수처리시설  유입유량  FIT-201  <NA>  68.54
11410  2020-08-28  15  대동 공공하수처리시설  유량조정조 수위  LIT-201  <NA>  3.64
4209  2019-10-18  11  대동 공공하수처리시설  유량조정조 수위  LIT-201  <NA>  1.7
38813  2020-05-12  19  대동 공공하수처리시설  약품용해탱크수위  LIT-302  <NA>  1.01
54569  2020-06-14  11  대동 공공하수처리시설  NaOCl저장탱크수위  LIT-303  <NA>  0.25
44764  2021-01-21  18  대동 공공하수처리시설  약품용해탱크수위  LIT-302  <NA>  0.58

Last rows

(index)  기준연월일  기준시간  하수처리장구분명  계측구분명  계측태그명  계측단위  계측값
67765  2020-03-23  11  대동 공공하수처리시설  PAC저장탱크수위  LIT-304  <NA>  0.81
82194  2020-02-28  20  대동 공공하수처리시설  약품공급량  FIT-304  <NA>  0.0
69011  2020-05-23  9  대동 공공하수처리시설  PAC저장탱크수위  LIT-304  <NA>  0.81
1239  2019-06-14  17  대동 공공하수처리시설  유량조정조 수위  LIT-201  <NA>  2.31
74584  2021-01-16  14  대동 공공하수처리시설  PAC저장탱크수위  LIT-304  <NA>  0.81
86029  2020-08-18  15  대동 공공하수처리시설  약품공급량  FIT-304  <NA>  0.0
63412  2019-09-20  0  대동 공공하수처리시설  PAC저장탱크수위  LIT-304  <NA>  0.81
49763  2019-11-15  3  대동 공공하수처리시설  NaOCl저장탱크수위  LIT-303  <NA>  0.23
80483  2019-12-17  13  대동 공공하수처리시설  약품공급량  FIT-304  <NA>  0.0
51989  2020-02-17  23  대동 공공하수처리시설  NaOCl저장탱크수위  LIT-303  <NA>  0.23

Duplicate rows

Most frequently occurring

(index)  기준연월일  기준시간  하수처리장구분명  계측구분명  계측태그명  계측값  # duplicates
0  2019-11-21  17  대동 공공하수처리시설  유입유량  FIT-201  0.0  2
1  2019-11-21  20  대동 공공하수처리시설  약품공급량  FIT-304  0.0  2
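
Duplicate rows like the two listed above can be found by counting identical records. This is a minimal standard-library sketch; the `duplicate_rows` helper is hypothetical, and the input reuses the two duplicated records plus one unrelated sample row for contrast.

```python
from collections import Counter

PLANT = "대동 공공하수처리시설"


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Columns: 기준연월일, 기준시간, 하수처리장구분명, 계측구분명, 계측태그명, 계측값
rows = [
    ("2019-11-21", 17, PLANT, "유입유량", "FIT-201", 0.0),
    ("2019-11-21", 20, PLANT, "약품공급량", "FIT-304", 0.0),
    ("2020-09-26", 11, PLANT, "유입유량", "FIT-201", 0.0),
    ("2019-11-21", 17, PLANT, "유입유량", "FIT-201", 0.0),
    ("2019-11-21", 20, PLANT, "약품공급량", "FIT-304", 0.0),
]

dups = duplicate_rows(rows)
for row, n in dups.items():
    print(n, row)
```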