Overview

Dataset statistics

Number of variables: 13
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 118.3 B

Variable types

Numeric: 3
Categorical: 10

Dataset

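Description: Monthly raw-water quality test results, collected under the Water Supply Act (수도법), for regional or local waterworks that use seawater as their raw water source. The data include the water quality test results, the testing institution, and related fields. For detailed records, see https://www.waternow.go.kr/web/lawData2/?pMENUID=96&ATTR_1=3106
URL: https://www.data.go.kr/data/15093992/fileData.do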

Alerts

수원 has constant value "해수" (Constant)
수은 has constant value "0" (Constant)
크롬 has constant value "0" (Constant)
지역 is highly overall correlated with 붕소 and 3 other fields (High correlation)
측정지점주소 is highly overall correlated with 지역 and 2 other fields (High correlation)
채수지점 is highly overall correlated with 지역 and 2 other fields (High correlation)
취수장명 is highly overall correlated with 지역 and 2 other fields (High correlation)
연번 is highly overall correlated with 검사년도 (High correlation)
검사년도 is highly overall correlated with 연번 (High correlation)
붕소 is highly overall correlated with 지역 (High correlation)
카드뮴 is highly imbalanced (72.4%) (Imbalance)
비소 is highly imbalanced (59.1%) (Imbalance)
(unnamed variable) is highly imbalanced (72.4%) (Imbalance)
연번 has unique values (Unique)
붕소 has 8 (38.1%) zeros (Zeros)
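
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the value counts. The sketch below assumes that definition (the report itself does not state the formula) and uses a hypothetical helper, `imbalance_score`. The counts are taken from the Common Values tables for 카드뮴 and 비소.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (hypothetical helper)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful balance; treated as 0 here
    # Shannon entropy in bits over the observed class proportions
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


cadmium = imbalance_score([20, 1])        # 카드뮴: 0.0 x20, 0.002 x1
arsenic = imbalance_score([18, 1, 1, 1])  # 비소: 0.0 x18, three single values

print(f"{cadmium:.1%}")  # 72.4%
print(f"{arsenic:.1%}")  # 59.1%
```

A score near 100% means almost all rows share one value; a score of 0% means the classes are evenly represented.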

Reproduction

Analysis started: 2023-12-12 17:48:13.845374
Analysis finished: 2023-12-12 17:48:16.098081
Duration: 2.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
Median: 11
Q3: 16
95-th percentile: 20
Maximum: 21
Range: 20
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.2048368
Coefficient of variation (CV): 0.56407607
Kurtosis: -1.2
Mean: 11
Median Absolute Deviation (MAD): 5
Skewness: 0
Sum: 231
Variance: 38.5
Monotonicity: Strictly increasing
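
Because 연번 is simply the serial number 1 to 21, the descriptive statistics can be checked by hand. The sketch below uses only the Python standard library and assumes the report uses the sample variance (dividing by n - 1), which is the convention that reproduces the reported figures.

```python
import math
import statistics

values = list(range(1, 22))  # 연번 runs from 1 to 21 with no gaps

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of distances from the median
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance, round(std_dev, 7), round(cv, 8), mad)
```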
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
1 | 1 | 4.8%
2 | 1 | 4.8%
21 | 1 | 4.8%
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
Other values (11) | 11 | 52.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 4.8%
2 | 1 | 4.8%
3 | 1 | 4.8%
4 | 1 | 4.8%
5 | 1 | 4.8%
6 | 1 | 4.8%
7 | 1 | 4.8%
8 | 1 | 4.8%
9 | 1 | 4.8%
10 | 1 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
21 | 1 | 4.8%
20 | 1 | 4.8%
19 | 1 | 4.8%
18 | 1 | 4.8%
17 | 1 | 4.8%
16 | 1 | 4.8%
15 | 1 | 4.8%
14 | 1 | 4.8%
13 | 1 | 4.8%
12 | 1 | 4.8%

검사년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 38.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.9524
Minimum: 2015
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 2015
5-th percentile: 2016
Q1: 2017
Median: 2019
Q3: 2021
95-th percentile: 2022
Maximum: 2022
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.0365704
Coefficient of variation (CV): 0.0010087263
Kurtosis: -1.0387834
Mean: 2018.9524
Median Absolute Deviation (MAD): 2
Skewness: -0.16452442
Sum: 42398
Variance: 4.147619
Monotonicity: Increasing
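
The sum, mean, and variance of 검사년도 can be recomputed from its frequency table alone. This is a minimal sketch, assuming the sample (n - 1) variance; the year counts are copied from the table for this variable.

```python
import math

# Year -> number of rows, copied from the frequency table for 검사년도
year_counts = {2015: 1, 2016: 1, 2017: 4, 2018: 4,
               2019: 1, 2020: 4, 2021: 4, 2022: 2}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n
# Weighted sum of squared deviations, divided by n - 1 (sample variance)
variance = sum(count * (year - mean) ** 2
               for year, count in year_counts.items()) / (n - 1)
std_dev = math.sqrt(variance)

print(n, total, round(mean, 4), round(variance, 6), round(std_dev, 7))
```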
Histogram with fixed size bins (bins=8)

Common values

Value | Count | Frequency (%)
2017 | 4 | 19.0%
2018 | 4 | 19.0%
2020 | 4 | 19.0%
2021 | 4 | 19.0%
2022 | 2 | 9.5%
2015 | 1 | 4.8%
2016 | 1 | 4.8%
2019 | 1 | 4.8%

Minimum 10 values

Value | Count | Frequency (%)
2015 | 1 | 4.8%
2016 | 1 | 4.8%
2017 | 4 | 19.0%
2018 | 4 | 19.0%
2019 | 1 | 4.8%
2020 | 4 | 19.0%
2021 | 4 | 19.0%
2022 | 2 | 9.5%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 2 | 9.5%
2021 | 4 | 19.0%
2020 | 4 | 19.0%
2019 | 1 | 4.8%
2018 | 4 | 19.0%
2017 | 4 | 19.0%
2016 | 1 | 4.8%
2015 | 1 | 4.8%

지역
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 8
Median length: 8
Mean length: 7.7142857
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도 여수시
2nd row: 전라남도 여수시
3rd row: 전라남도 여수시
4th row: 전라남도 진도군
5th row: 전라남도 진도군

Common Values

Value | Count | Frequency (%)
전라남도 진도군 | 8 | 38.1%
전라남도 여수시 | 7 | 33.3%
제주특별자치도 | 6 | 28.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
전라남도 | 15 | 41.7%
진도군 | 8 | 22.2%
여수시 | 7 | 19.4%
제주특별자치도 | 6 | 16.7%
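
The last table for 지역 counts whitespace-separated words rather than whole category values, which is why 전라남도 appears 15 times and the percentages are taken over 36 words instead of 21 rows. The sketch below reproduces those figures, assuming simple whitespace tokenization; the category counts are copied from the Common Values table.

```python
from collections import Counter

# Category value -> number of rows, copied from the Common Values table
region_counts = {"전라남도 진도군": 8, "전라남도 여수시": 7, "제주특별자치도": 6}

word_counts = Counter()
for region, rows in region_counts.items():
    for word in region.split():  # whitespace tokenization (an assumption)
        word_counts[word] += rows

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(word, count, f"{count / total_words:.1%}")
```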

취수장명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 5
Median length: 5
Mean length: 3.8095238
Min length: 2

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 거문도서도
2nd row: 거문도서도
3rd row: 거문도서도
4th row: 관사
5th row: 성남

Common Values

Value | Count | Frequency (%)
거문도서도 | 7 | 33.3%
추자담수장 | 5 | 23.8%
관사 | 4 | 19.0%
성남 | 4 | 19.0%
추자지구 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
거문도서도 | 7 | 33.3%
추자담수장 | 5 | 23.8%
관사 | 4 | 19.0%
성남 | 4 | 19.0%
추자지구 | 1 | 4.8%

측정지점주소
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 22
Median length: 21
Mean length: 21.095238
Min length: 18

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 전라남도 여수시 삼산면 덕촌리 3621
2nd row: 전라남도 여수시 삼산면 덕촌리 3621
3rd row: 전라남도 여수시 삼산면 덕촌리 3621
4th row: 전라남도 진도군 조도면 관사도리 175
5th row: 전라남도 진도군 조도면 성남도리 346

Common Values

Value | Count | Frequency (%)
전라남도 여수시 삼산면 덕촌리 3621 | 7 | 33.3%
제주특별자치도 제주시 추자면 묵리 619 | 5 | 23.8%
전라남도 진도군 조도면 관사도리 175 | 4 | 19.0%
전라남도 진도군 조도면 성남도리 346 | 4 | 19.0%
제주특별자치도 제주시 추자면 묵리 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
전라남도 | 15 | 14.4%
진도군 | 8 | 7.7%
조도면 | 8 | 7.7%
여수시 | 7 | 6.7%
삼산면 | 7 | 6.7%
덕촌리 | 7 | 6.7%
3621 | 7 | 6.7%
제주특별자치도 | 6 | 5.8%
제주시 | 6 | 5.8%
추자면 | 6 | 5.8%
Other values (6) | 27 | 26.0%

채수지점
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 취수구
2nd row: 취수구
3rd row: 취수구
4th row: 착수정
5th row: 착수정

Common Values

Value | Count | Frequency (%)
취수구 | 13 | 61.9%
착수정 | 8 | 38.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
취수구 | 13 | 61.9%
착수정 | 8 | 38.1%

수원
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 해수
2nd row: 해수
3rd row: 해수
4th row: 해수
5th row: 해수

Common Values

Value | Count | Frequency (%)
해수 | 21 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
해수 | 21 | 100.0%

카드뮴
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.0952381
Min length: 3

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 20 | 95.2%
0.002 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0.0 | 20 | 95.2%
0.002 | 1 | 4.8%

비소
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 19.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.2857143
Min length: 3

Unique

Unique: 3
Unique (%): 14.3%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 18 | 85.7%
0.012 | 1 | 4.8%
0.054 | 1 | 4.8%
0.064 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0.0 | 18 | 85.7%
0.012 | 1 | 4.8%
0.054 | 1 | 4.8%
0.064 | 1 | 4.8%

수은
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 21 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0 | 21 | 100.0%


Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.0952381
Min length: 3

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.011
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 20 | 95.2%
0.011 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0.0 | 20 | 95.2%
0.011 | 1 | 4.8%

크롬
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 21 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0 | 21 | 100.0%

붕소
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 13
Distinct (%): 61.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.55
Minimum: 0
Maximum: 4.98
Zeros: 8
Zeros (%): 38.1%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.32
Q3: 3.7
95-th percentile: 4.63
Maximum: 4.98
Range: 4.98
Interquartile range (IQR): 3.7

Descriptive statistics

Standard deviation: 1.9814565
Coefficient of variation (CV): 1.2783591
Kurtosis: -1.3257942
Mean: 1.55
Median Absolute Deviation (MAD): 0.32
Skewness: 0.75083323
Sum: 32.55
Variance: 3.92617
Monotonicity: Not monotonic
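
The 붕소 summary can be cross-checked by rebuilding the 21 observations from the frequency tables in this report, where the minimum and maximum value tables together list all 13 distinct values. The sketch below assumes the sample (n - 1) variance and uses only the standard library.

```python
import statistics

# Value -> count, reconstructed from the frequency tables for 붕소
boron_counts = {0.0: 8, 0.01: 2, 0.32: 1, 0.47: 1, 0.57: 1, 2.22: 1, 2.6: 1,
                3.7: 1, 4.26: 1, 4.29: 1, 4.49: 1, 4.63: 1, 4.98: 1}

# Expand the counts back into a sorted list of observations
values = sorted(v for v, c in boron_counts.items() for _ in range(c))

n = len(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (an assumption)
zero_share = values.count(0.0) / n
# Median absolute deviation around the median
mad = statistics.median(abs(v - median) for v in values)

print(n, round(mean, 2), median, round(variance, 5), f"{zero_share:.1%}", mad)
```

With more than a third of the readings at zero, the median (0.32) sits well below the mean (1.55), which matches the positive skewness reported for this variable.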
Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
0.0 | 8 | 38.1%
0.01 | 2 | 9.5%
0.47 | 1 | 4.8%
0.57 | 1 | 4.8%
4.29 | 1 | 4.8%
2.6 | 1 | 4.8%
4.63 | 1 | 4.8%
4.49 | 1 | 4.8%
0.32 | 1 | 4.8%
4.98 | 1 | 4.8%
Other values (3) | 3 | 14.3%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 8 | 38.1%
0.01 | 2 | 9.5%
0.32 | 1 | 4.8%
0.47 | 1 | 4.8%
0.57 | 1 | 4.8%
2.22 | 1 | 4.8%
2.6 | 1 | 4.8%
3.7 | 1 | 4.8%
4.26 | 1 | 4.8%
4.29 | 1 | 4.8%

Maximum 10 values

Value | Count | Frequency (%)
4.98 | 1 | 4.8%
4.63 | 1 | 4.8%
4.49 | 1 | 4.8%
4.29 | 1 | 4.8%
4.26 | 1 | 4.8%
3.7 | 1 | 4.8%
2.6 | 1 | 4.8%
2.22 | 1 | 4.8%
0.57 | 1 | 4.8%
0.47 | 1 | 4.8%

Interactions


Correlations

Variable | 연번 | 검사년도 | 지역 | 취수장명 | 측정지점주소 | 채수지점 | 카드뮴 | 비소 | (unnamed) | 붕소
연번 | 1.000 | 0.906 | 0.375 | 0.000 | 0.000 | 0.694 | 0.371 | 0.000 | 0.371 | 0.000
검사년도 | 0.906 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.777
지역 | 0.375 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.176 | 0.000 | 0.769
취수장명 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.135 | 0.000 | 0.135 | 0.595
측정지점주소 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.135 | 0.000 | 0.135 | 0.595
채수지점 | 0.694 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.463
카드뮴 | 0.371 | 0.000 | 0.000 | 0.135 | 0.135 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
비소 | 0.000 | 0.000 | 0.176 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.441
(unnamed) | 0.371 | 0.000 | 0.000 | 0.135 | 0.135 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
붕소 | 0.000 | 0.777 | 0.769 | 0.595 | 0.595 | 0.463 | 0.000 | 0.441 | 0.000 | 1.000

Variable | 비소 | 지역 | (unnamed) | 측정지점주소 | 채수지점 | 취수장명 | 카드뮴
비소 | 1.000 | 0.136 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
지역 | 0.136 | 1.000 | 0.000 | 0.943 | 0.973 | 0.943 | 0.000
(unnamed) | 0.000 | 0.000 | 1.000 | 0.115 | 0.000 | 0.115 | 0.000
측정지점주소 | 0.000 | 0.943 | 0.115 | 1.000 | 0.918 | 1.000 | 0.115
채수지점 | 0.000 | 0.973 | 0.000 | 0.918 | 1.000 | 0.918 | 0.000
취수장명 | 0.000 | 0.943 | 0.115 | 1.000 | 0.918 | 1.000 | 0.115
카드뮴 | 0.000 | 0.000 | 0.000 | 0.115 | 0.000 | 0.115 | 1.000

Variable | 연번 | 검사년도 | 붕소 | 지역 | 취수장명 | 측정지점주소 | 채수지점 | 카드뮴 | 비소 | (unnamed)
연번 | 1.000 | 0.987 | 0.071 | 0.358 | 0.189 | 0.189 | 0.392 | 0.162 | 0.162 | 0.162
검사년도 | 0.987 | 1.000 | 0.007 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
붕소 | 0.071 | 0.007 | 1.000 | 0.617 | 0.390 | 0.390 | 0.412 | 0.000 | 0.261 | 0.000
지역 | 0.358 | 0.000 | 0.617 | 1.000 | 0.943 | 0.943 | 0.973 | 0.000 | 0.136 | 0.000
취수장명 | 0.189 | 0.000 | 0.390 | 0.943 | 1.000 | 1.000 | 0.918 | 0.115 | 0.000 | 0.115
측정지점주소 | 0.189 | 0.000 | 0.390 | 0.943 | 1.000 | 1.000 | 0.918 | 0.115 | 0.000 | 0.115
채수지점 | 0.392 | 0.000 | 0.412 | 0.973 | 0.918 | 0.918 | 1.000 | 0.000 | 0.000 | 0.000
카드뮴 | 0.162 | 0.000 | 0.000 | 0.000 | 0.115 | 0.115 | 0.000 | 1.000 | 0.000 | 0.000
비소 | 0.162 | 0.000 | 0.261 | 0.136 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
(unnamed) | 0.162 | 0.000 | 0.000 | 0.000 | 0.115 | 0.115 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

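First rows

(index) | 연번 | 검사년도 | 지역 | 취수장명 | 측정지점주소 | 채수지점 | 수원 | 카드뮴 | 비소 | 수은 | (unnamed) | 크롬 | 붕소
0 | 1 | 2015 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.47
1 | 2 | 2016 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.57
2 | 3 | 2017 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.01
3 | 4 | 2017 | 전라남도 진도군 | 관사 | 전라남도 진도군 조도면 관사도리 175 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.011 | 0 | 0.0
4 | 5 | 2017 | 전라남도 진도군 | 성남 | 전라남도 진도군 조도면 성남도리 346 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.0
5 | 6 | 2017 | 제주특별자치도 | 추자지구 | 제주특별자치도 제주시 추자면 묵리 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 4.29
6 | 7 | 2018 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 2.6
7 | 8 | 2018 | 전라남도 진도군 | 관사 | 전라남도 진도군 조도면 관사도리 175 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.0
8 | 9 | 2018 | 전라남도 진도군 | 성남 | 전라남도 진도군 조도면 성남도리 346 | 착수정 | 해수 | 0.002 | 0.0 | 0 | 0.0 | 0 | 0.0
9 | 10 | 2018 | 제주특별자치도 | 추자담수장 | 제주특별자치도 제주시 추자면 묵리 619 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 4.63

Last rows

(index) | 연번 | 검사년도 | 지역 | 취수장명 | 측정지점주소 | 채수지점 | 수원 | 카드뮴 | 비소 | 수은 | (unnamed) | 크롬 | 붕소
11 | 12 | 2020 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.012 | 0 | 0.0 | 0 | 0.32
12 | 13 | 2020 | 전라남도 진도군 | 관사 | 전라남도 진도군 조도면 관사도리 175 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.0
13 | 14 | 2020 | 전라남도 진도군 | 성남 | 전라남도 진도군 조도면 성남도리 346 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.0
14 | 15 | 2020 | 제주특별자치도 | 추자담수장 | 제주특별자치도 제주시 추자면 묵리 619 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 4.98
15 | 16 | 2021 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.054 | 0 | 0.0 | 0 | 2.22
16 | 17 | 2021 | 전라남도 진도군 | 관사 | 전라남도 진도군 조도면 관사도리 175 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.0
17 | 18 | 2021 | 전라남도 진도군 | 성남 | 전라남도 진도군 조도면 성남도리 346 | 착수정 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0.01
18 | 19 | 2021 | 제주특별자치도 | 추자담수장 | 제주특별자치도 제주시 추자면 묵리 619 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 4.26
19 | 20 | 2022 | 전라남도 여수시 | 거문도서도 | 전라남도 여수시 삼산면 덕촌리 3621 | 취수구 | 해수 | 0.0 | 0.064 | 0 | 0.0 | 0 | 0.0
20 | 21 | 2022 | 제주특별자치도 | 추자담수장 | 제주특별자치도 제주시 추자면 묵리 619 | 취수구 | 해수 | 0.0 | 0.0 | 0 | 0.0 | 0 | 3.7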