Overview

Dataset statistics

Number of variables: 11
Number of observations: 118
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.8%
Total size in memory: 11.0 KiB
Average record size in memory: 95.1 B

Variable types

Categorical: 11

Dataset

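Description: Provides information on the water purification facilities, production volume, usage, equipment status, and related details of the Namdong-gu Seawater Supply Office
Author: 인천광역시남동구도시관리공단
URL: https://www.data.go.kr/data/15067364/fileData.do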

Alerts

Dataset has 1 (0.8%) duplicate row [Duplicates]
시설명 is highly overall correlated with 공급가격(원/t당) and 3 other fields [High correlation]
공급가격(원/t당) is highly overall correlated with 시설명 and 2 other fields [High correlation]
수질검사기관 is highly overall correlated with 시설명 and 2 other fields [High correlation]
해수공급시간(시간) is highly overall correlated with 시설명 and 1 other field [High correlation]
Unnamed: 10 is highly overall correlated with 시설명 and 3 other fields [High correlation]
시설명 is highly imbalanced (57.8%) [Imbalance]
일 최대생산량(톤) is highly imbalanced (92.9%) [Imbalance]
일일 사용량(톤) is highly imbalanced (92.9%) [Imbalance]
전기시설용량(kw) is highly imbalanced (92.9%) [Imbalance]
취수저장탱크(톤) is highly imbalanced (92.9%) [Imbalance]
정수탱크(톤) is highly imbalanced (92.9%) [Imbalance]
시설면적(㎡) is highly imbalanced (92.9%) [Imbalance]
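
The imbalance percentages in these alerts are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the value counts and k is the number of distinct values. The sketch below recomputes two of the figures from the value counts this report lists for 시설명 and 일 최대생산량(톤). The function name `imbalance_score` is illustrative, and the formula is an assumption about how the alert is derived, not something the report states.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts (assumed formula)."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits of the observed category distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts as listed in this report's Common Values tables.
facility_name = [76, 35, 1, 1, 1, 1, 1, 1, 1]  # 시설명: 9 distinct values
daily_max_output = [117, 1]                     # 일 최대생산량(톤): 2 distinct values

print(f"{imbalance_score(facility_name):.1%}")     # 57.8%
print(f"{imbalance_score(daily_max_output):.1%}")  # 92.9%
```

A score near 100% means almost every cell holds the same value, which is the case for the six columns with a single non-placeholder entry.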

Reproduction

Analysis started: 2023-12-11 23:00:31.719787
Analysis finished: 2023-12-11 23:00:32.666099
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 76
수영 | 35
해수사업소 | 1
남동수영장 | 1
종목구분 | 1
Other values (4) | 4

Length

Max length: 7
Median length: 4
Mean length: 3.4830508
Min length: 2

Unique

Unique: 7
Unique (%): 5.9%
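
As a cross-check, the distinct, unique, and length figures for 시설명 can be recomputed from its value counts. This is a minimal sketch that uses the counts listed in this report's Common Values table for the variable and assumes the `<NA>` placeholder is a literal four-character string. The variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median

# Value counts for 시설명 as listed in this report.
counts = Counter({
    "<NA>": 76, "수영": 35, "해수사업소": 1, "남동수영장": 1, "종목구분": 1,
    "아쿠아로빅A": 1, "아쿠아로빅B": 1, "자유수영": 1, "아쿠아로빅AB": 1,
})

values = list(counts.elements())    # expand back to the 118 cell values
lengths = [len(v) for v in values]  # character length of each cell

distinct = len(counts)                              # different values seen
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(distinct, unique)                             # 9 7
print(max(lengths), median(lengths), min(lengths))
print(round(mean(lengths), 7))                      # 3.4830508
```

"Distinct" counts every different value, while "Unique" counts only values that occur exactly once, which is why the two figures differ.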

Sample

1st row: 해수사업소
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 76 | 64.4%
수영 | 35 | 29.7%
해수사업소 | 1 | 0.8%
남동수영장 | 1 | 0.8%
종목구분 | 1 | 0.8%
아쿠아로빅A | 1 | 0.8%
아쿠아로빅B | 1 | 0.8%
자유수영 | 1 | 0.8%
아쿠아로빅AB | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 76 | 64.4%
수영 | 35 | 29.7%
해수사업소 | 1 | 0.8%
남동수영장 | 1 | 0.8%
종목구분 | 1 | 0.8%
아쿠아로빅a | 1 | 0.8%
아쿠아로빅b | 1 | 0.8%
자유수영 | 1 | 0.8%
아쿠아로빅ab | 1 | 0.8%

공급가격(원/t당)
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 11.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 77
새벽반 | 5
오전반 | 5
어머니1반 | 5
어머니2반 | 5
Other values (9) | 21

Length

Max length: 5
Median length: 4
Mean length: 3.9152542
Min length: 2

Unique

Unique: 6
Unique (%): 5.1%

Sample

1st row: 3,500
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 77 | 65.3%
새벽반 | 5 | 4.2%
오전반 | 5 | 4.2%
어머니1반 | 5 | 4.2%
어머니2반 | 5 | 4.2%
화합반 | 5 | 4.2%
저녁반 | 5 | 4.2%
직장인반 | 5 | 4.2%
3,500 | 1 | 0.8%
회원구분 | 1 | 0.8%
Other values (4) | 4 | 3.4%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
na | 77 | 65.3%
새벽반 | 5 | 4.2%
오전반 | 5 | 4.2%
어머니1반 | 5 | 4.2%
어머니2반 | 5 | 4.2%
화합반 | 5 | 4.2%
저녁반 | 5 | 4.2%
직장인반 | 5 | 4.2%
3,500 | 1 | 0.8%
회원구분 | 1 | 0.8%
Other values (4) | 4 | 3.4%

해수공급시간(시간)
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 78
초급 | 7
중급1 | 7
마스터즈 | 7
연수1 | 7
Other values (5) | 12

Length

Max length: 6
Median length: 4
Mean length: 3.6440678
Min length: 2

Unique

Unique: 3
Unique (%): 2.5%

Sample

1st row: 24
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 78 | 66.1%
초급 | 7 | 5.9%
중급1 | 7 | 5.9%
마스터즈 | 7 | 5.9%
연수1 | 7 | 5.9%
상급1 | 6 | 5.1%
성인 | 3 | 2.5%
24 | 1 | 0.8%
대상(등급) | 1 | 0.8%
상급 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 78 | 66.1%
초급 | 7 | 5.9%
중급1 | 7 | 5.9%
마스터즈 | 7 | 5.9%
연수1 | 7 | 5.9%
상급1 | 6 | 5.1%
성인 | 3 | 2.5%
24 | 1 | 0.8%
대상(등급 | 1 | 0.8%
상급 | 1 | 0.8%

일 최대생산량(톤)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
1000 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 1000
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
1000 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
1000 | 1 | 0.8%

일일 사용량(톤)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
480 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.9915254
Min length: 3

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 480
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
480 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
480 | 1 | 0.8%

전기시설용량(kw)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
900 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.9915254
Min length: 3

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 900
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
900 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
900 | 1 | 0.8%

취수저장탱크(톤)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
1500 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 1500
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
1500 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
1500 | 1 | 0.8%

정수탱크(톤)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
1575 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 1575
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
1575 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
1575 | 1 | 0.8%

시설면적(㎡)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 117
57.3 | 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 57.3
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 117 | 99.2%
57.3 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 117 | 99.2%
57.3 | 1 | 0.8%

수질검사기관
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 77
06:00-06:50 | 5
07:00-07:50 | 5
09:00-09:50 | 5
10:00-10:50 | 5
Other values (7) | 21

Length

Max length: 22
Median length: 4
Mean length: 6.5
Min length: 4

Unique

Unique: 3
Unique (%): 2.5%

Sample

1st row: 인천보건환경연구원, 한국화학융합시험연구원
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 77 | 65.3%
06:00-06:50 | 5 | 4.2%
07:00-07:50 | 5 | 4.2%
09:00-09:50 | 5 | 4.2%
10:00-10:50 | 5 | 4.2%
11:00-11:50 | 5 | 4.2%
19:00-19:50 | 5 | 4.2%
20:00-20:50 | 5 | 4.2%
15:00-15:50 | 3 | 2.5%
인천보건환경연구원, 한국화학융합시험연구원 | 1 | 0.8%
Other values (2) | 2 | 1.7%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
na | 77 | 64.7%
06:00-06:50 | 5 | 4.2%
07:00-07:50 | 5 | 4.2%
09:00-09:50 | 5 | 4.2%
10:00-10:50 | 5 | 4.2%
11:00-11:50 | 5 | 4.2%
19:00-19:50 | 5 | 4.2%
20:00-20:50 | 5 | 4.2%
15:00-15:50 | 3 | 2.5%
인천보건환경연구원 | 1 | 0.8%
Other values (3) | 3 | 2.5%

Unnamed: 10
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
<NA> | 78
56,000 | 22
60,000 | 15
회비 | 1
46,000 | 1

Length

Max length: 6
Median length: 4
Mean length: 4.6440678
Min length: 2

Unique

Unique: 3
Unique (%): 2.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 78 | 66.1%
56,000 | 22 | 18.6%
60,000 | 15 | 12.7%
회비 | 1 | 0.8%
46,000 | 1 | 0.8%
86,000 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
na | 78 | 66.1%
56,000 | 22 | 18.6%
60,000 | 15 | 12.7%
회비 | 1 | 0.8%
46,000 | 1 | 0.8%
86,000 | 1 | 0.8%

Correlations

[Figure: correlation plot]

 | 시설명 | 공급가격(원/t당) | 해수공급시간(시간) | 수질검사기관 | Unnamed: 10
시설명 | 1.000 | 1.000 | 0.899 | 0.904 | 0.907
공급가격(원/t당) | 1.000 | 1.000 | 0.727 | 1.000 | 0.920
해수공급시간(시간) | 0.899 | 0.727 | 1.000 | 0.758 | 0.985
수질검사기관 | 0.904 | 1.000 | 0.758 | 1.000 | 0.943
Unnamed: 10 | 0.907 | 0.920 | 0.985 | 0.943 | 1.000

[Figure: correlation plot]

 | 일 최대생산량(톤) | 시설면적(㎡) | 시설명 | 전기시설용량(kw) | 취수저장탱크(톤) | 해수공급시간(시간) | 정수탱크(톤) | 공급가격(원/t당) | 일일 사용량(톤) | 수질검사기관 | Unnamed: 10
일 최대생산량(톤) | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
시설면적(㎡) | NaN | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
시설명 | NaN | NaN | 1.000 | NaN | NaN | 0.673 | NaN | 0.907 | NaN | 0.700 | 0.842
전기시설용량(kw) | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN
취수저장탱크(톤) | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN
해수공급시간(시간) | NaN | NaN | 0.673 | NaN | NaN | 1.000 | NaN | 0.385 | NaN | 0.461 | 0.786
정수탱크(톤) | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN
공급가격(원/t당) | NaN | NaN | 0.907 | NaN | NaN | 0.385 | NaN | 1.000 | NaN | 0.966 | 0.733
일일 사용량(톤) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN
수질검사기관 | NaN | NaN | 0.700 | NaN | NaN | 0.461 | NaN | 0.966 | NaN | 1.000 | 0.623
Unnamed: 10 | NaN | NaN | 0.842 | NaN | NaN | 0.786 | NaN | 0.733 | NaN | 0.623 | 1.000

[Figure: correlation plot]

 | 시설명 | 공급가격(원/t당) | 해수공급시간(시간) | 일 최대생산량(톤) | 일일 사용량(톤) | 전기시설용량(kw) | 취수저장탱크(톤) | 정수탱크(톤) | 시설면적(㎡) | 수질검사기관 | Unnamed: 10
시설명 | 1.000 | 0.907 | 0.673 | NaN | NaN | NaN | NaN | NaN | NaN | 0.700 | 0.842
공급가격(원/t당) | 0.907 | 1.000 | 0.385 | NaN | NaN | NaN | NaN | NaN | NaN | 0.966 | 0.733
해수공급시간(시간) | 0.673 | 0.385 | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | 0.461 | 0.786
일 최대생산량(톤) | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | NaN | NaN | 0.000
일일 사용량(톤) | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | NaN | 0.000
전기시설용량(kw) | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | 0.000
취수저장탱크(톤) | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN | 0.000
정수탱크(톤) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | 0.000
시설면적(㎡) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | 0.000
수질검사기관 | 0.700 | 0.966 | 0.461 | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | 0.623
Unnamed: 10 | 0.842 | 0.733 | 0.786 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.623 | 1.000
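
All eleven variables are categorical, and association between categorical columns is commonly measured with Cramér's V. The report does not state which measure each matrix uses, so the sketch below is only an illustration of the uncorrected statistic on toy data. The function name `cramers_v` and the sample values are assumptions, not taken from the tool.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    if len(row) < 2 or len(col) < 2:
        return float("nan")  # undefined when either variable is constant
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(row), len(col)) - 1)))

# Toy data (not from the dataset): the class name fully determines the slot.
classes = ["새벽반", "새벽반", "오전반", "오전반", "저녁반", "저녁반"]
slots = ["06:00", "06:00", "09:00", "09:00", "19:00", "19:00"]
print(cramers_v(classes, slots))
```

When one column fully determines the other, as in the toy data, the statistic reaches its maximum of 1. Values close to 1 in the tables above would indicate a similarly tight relationship under this measure.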

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
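
The report counts 0 missing cells even though `<NA>` is the most frequent value in every column, which suggests the placeholder was profiled as an ordinary string. The sketch below counts placeholder cells per column under that assumption. The helper name `missing_per_column` and the three toy rows are illustrative and are not the actual file contents.

```python
PLACEHOLDER = "<NA>"  # assumed to be a literal string in the profiled data

def missing_per_column(rows, header):
    """Count cells that are None, empty, or equal to the placeholder."""
    counts = {name: 0 for name in header}
    for row in rows:
        for name, cell in zip(header, row):
            if cell is None or cell.strip() in ("", PLACEHOLDER):
                counts[name] += 1
    return counts

header = ["시설명", "공급가격(원/t당)", "Unnamed: 10"]
rows = [
    ["해수사업소", "3,500", "<NA>"],
    ["<NA>", "<NA>", "<NA>"],
    ["수영", "직장인반", "60,000"],
]
print(missing_per_column(rows, header))
# {'시설명': 1, '공급가격(원/t당)': 1, 'Unnamed: 10': 2}
```

Converting the placeholder to a true missing value before profiling would let the missing-cell statistics and nullity plots reflect the real gaps.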

Sample

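Row | 시설명 | 공급가격(원/t당) | 해수공급시간(시간) | 일 최대생산량(톤) | 일일 사용량(톤) | 전기시설용량(kw) | 취수저장탱크(톤) | 정수탱크(톤) | 시설면적(㎡) | 수질검사기관 | Unnamed: 10
0 | 해수사업소 | 3,500 | 24 | 1000 | 480 | 900 | 1500 | 1575 | 57.3 | 인천보건환경연구원, 한국화학융합시험연구원 | <NA>
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Row | 시설명 | 공급가격(원/t당) | 해수공급시간(시간) | 일 최대생산량(톤) | 일일 사용량(톤) | 전기시설용량(kw) | 취수저장탱크(톤) | 정수탱크(톤) | 시설면적(㎡) | 수질검사기관 | Unnamed: 10
108 | 수영 | 직장인반 | 연수1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 20:00-20:50 | 60,000
109 | 수영 | 직장인반 | 마스터즈 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 20:00-20:50 | 60,000
110 | 아쿠아로빅A | 아쿠아A | 성인 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 15:00-15:50 | 60,000
111 | 아쿠아로빅B | 아쿠아B | 성인 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 15:00-15:50 | 56,000
112 | 자유수영 | 전일 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 08:00-18:50 | 46,000
113 | 수영 | 화합반 | 상급1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 11:00-11:50 | 56,000
114 | 수영 | 저녁반 | 연수1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 19:00-19:50 | 60,000
115 | 수영 | 새벽반 | 연수1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 06:00-06:50 | 60,000
116 | 수영 | 어머니2반 | 상급1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 10:00-10:50 | 56,000
117 | 아쿠아로빅AB | 아쿠아AB | 성인 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 15:00-15:50 | 86,000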
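
The first and last rows look like two different tables stored in one file: a single seawater-facility record, then swimming-class rows. Values such as 종목구분, 회원구분, 대상(등급), and 회비 each appear once and read like an embedded second header. Under that assumption, the sketch below drops all-placeholder rows and splits the remainder at the embedded header. The helper names and the four-column toy rows are hypothetical, not the actual file layout.

```python
PLACEHOLDER = "<NA>"  # assumed literal placeholder for empty cells

def drop_empty_rows(rows):
    """Remove rows in which every cell is the placeholder."""
    return [row for row in rows if any(cell != PLACEHOLDER for cell in row)]

def split_at_header(rows, marker):
    """Split rows at the first row whose first cell equals `marker`.

    Returns (rows_before, header_row, rows_after); header_row is None
    when the marker is not found.
    """
    for i, row in enumerate(rows):
        if row and row[0] == marker:
            return rows[:i], row, rows[i + 1:]
    return rows, None, []

# Toy rows shaped like the sample above, reduced to four columns.
rows = [
    ["해수사업소", "3,500", "24", "<NA>"],
    ["<NA>", "<NA>", "<NA>", "<NA>"],
    ["종목구분", "회원구분", "대상(등급)", "회비"],  # assumed embedded header
    ["수영", "직장인반", "연수1", "60,000"],
    ["자유수영", "전일", "<NA>", "46,000"],
]

facility, header, classes = split_at_header(drop_empty_rows(rows), "종목구분")
print(header)
print(classes)
```

Profiling the two parts separately would avoid treating class names, time slots, and fees as values of the seawater-facility columns.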

Duplicate rows

Most frequently occurring

Row | 시설명 | 공급가격(원/t당) | 해수공급시간(시간) | 일 최대생산량(톤) | 일일 사용량(톤) | 전기시설용량(kw) | 취수저장탱크(톤) | 정수탱크(톤) | 시설면적(㎡) | 수질검사기관 | Unnamed: 10 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 76
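
The overview reports 1 duplicate row (0.8%), while this table shows that one row pattern occurring 76 times. The two figures agree if the headline counts distinct repeated patterns: 1 / 118 is about 0.8%. The sketch below illustrates that reading on synthetic rows. It is an interpretation of the numbers, not a statement of how the tool is implemented, and the function name `repeated_rows` is illustrative.

```python
from collections import Counter

def repeated_rows(rows):
    """Map each row pattern that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {pattern: n for pattern, n in counts.items() if n > 1}

# Synthetic stand-in for the dataset: 76 all-placeholder rows plus 42 distinct rows.
empty = ["<NA>"] * 11
rows = [list(empty) for _ in range(76)]
rows += [[f"row{i}"] + ["<NA>"] * 10 for i in range(42)]

repeats = repeated_rows(rows)
print(len(repeats), f"{len(repeats) / len(rows):.1%}")  # 1 0.8%
print(max(repeats.values()))                            # 76
```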