Overview

Dataset statistics

Number of variables: 8
Number of observations: 22
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.5 KiB
Average record size in memory: 72.0 B

Variable types

Categorical: 6
Text: 1
Numeric: 1

Dataset

Description: Water and sewerage rate information for Osan-si, Gyeonggi-do (as of 2023-08-01). The data provides the rates for water supply and sewerage by usage category.
URL: https://www.data.go.kr/data/15117673/fileData.do

Alerts

High correlation: 단계 is highly overall correlated with 데이터기준일자
High correlation: 구분 is highly overall correlated with 관리기관 and 2 other fields
High correlation: 연락처 is highly overall correlated with 구분 and 2 other fields
High correlation: 관리기관 is highly overall correlated with 구분 and 2 other fields
High correlation: 데이터기준일자 is highly overall correlated with 세제곱미터당 금액(원) and 5 other fields
High correlation: 업종 is highly overall correlated with 데이터기준일자
High correlation: 세제곱미터당 금액(원) is highly overall correlated with 데이터기준일자
Imbalance: 데이터기준일자 is highly imbalanced (73.3%)
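
The 73.3% imbalance figure for 데이터기준일자 is consistent with an entropy-based score: one minus the Shannon entropy of the value counts, divided by the logarithm of the number of distinct values. The sketch below assumes that definition (the report itself does not state the formula) and uses the counts 21 and 1 from the variable's Common Values table. `imbalance_score` is a hypothetical helper name, not part of any library.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class proportions and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean highly imbalanced.
    Assumed definition, chosen because it matches the reported figure."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# 데이터기준일자: 21 rows hold "2023-08-01" and 1 row holds "<NA>".
score = imbalance_score([21, 1])
print(f"{score:.1%}")
```

Under this assumption, a variable with two equally frequent values would score 0%, so the alert here is driven entirely by the single `<NA>` row.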

Reproduction

Analysis started: 2023-12-12 15:29:40.915178
Analysis finished: 2023-12-12 15:29:41.421773
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (service type: 상수도 = water supply, 하수도 = sewerage)
Categorical

High correlation

Distinct: 2
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상수도
2nd row: 상수도
3rd row: 상수도
4th row: 상수도
5th row: 상수도

Common Values

Value | Count | Frequency (%)
하수도 | 12 | 54.5%
상수도 | 10 | 45.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

업종 (usage category: 가정용 = household, 일반용 = general, 욕탕용 = bathhouse)
Categorical

High correlation

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가정용
2nd row: 일반용
3rd row: 일반용
4th row: 일반용
5th row: 일반용

Common Values

Value | Count | Frequency (%)
일반용 | 10 | 45.5%
욕탕용 | 8 | 36.4%
가정용 | 4 | 18.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

단계 (rate tier)
Categorical

High correlation

Distinct: 5
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4

Common Values

Value | Count | Frequency (%)
1 | 6 | 27.3%
2 | 5 | 22.7%
3 | 5 | 22.7%
4 | 4 | 18.2%
5 | 2 | 9.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

사용량(세제곱미터) (usage, cubic meters)
Text

Distinct: 14
Distinct (%): 63.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 9
Median length: 7
Mean length: 6.2272727
Min length: 4

Characters and Unicode

Total characters: 137
Distinct characters: 10
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
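
As an illustration of the kind of per-character analysis described here, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. It runs on a few values copied from this variable's rows, not the full column, so its totals will not match the report's. `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced for the example.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this variable.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# A few values of 사용량(세제곱미터) taken from the report's rows.
samples = ["1 이상", "1~50", "51~100", "1,501 이상"]

totals = Counter()
for value in samples:
    totals.update(category_counts(value))

for code, count in totals.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

The tilde in ranges such as `1~50` falls under Math Symbol, and the Hangul syllables in 이상 fall under Other Letter, which is why those two categories appear for a column that looks mostly numeric.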

Unique

Unique: 6
Unique (%): 27.3%

Sample

1st row: 1 이상
2nd row: 1~50
3rd row: 51~100
4th row: 101~300
5th row: 301~500

Value | Count | Frequency (%)
이상 | 6 | 21.4%
1~50 | 2 | 7.1%
51~100 | 2 | 7.1%
101~300 | 2 | 7.1%
301~500 | 2 | 7.1%
501 | 2 | 7.1%
1~500 | 2 | 7.1%
501~1000 | 2 | 7.1%
1001~1500 | 2 | 7.1%
1 | 1 | 3.6%
Other values (5) | 5 | 17.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 44 | 32.1%
1 | 34 | 24.8%
~ | 16 | 11.7%
5 | 16 | 11.7%
3 | 6 | 4.4%
(space) | 6 | 4.4%
이 | 6 | 4.4%
상 | 6 | 4.4%
2 | 2 | 1.5%
, | 1 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 102 | 74.5%
Math Symbol | 16 | 11.7%
Other Letter | 12 | 8.8%
Space Separator | 6 | 4.4%
Other Punctuation | 1 | 0.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 44 | 43.1%
1 | 34 | 33.3%
5 | 16 | 15.7%
3 | 6 | 5.9%
2 | 2 | 2.0%

Other Letter
Value | Count | Frequency (%)
이 | 6 | 50.0%
상 | 6 | 50.0%

Math Symbol
Value | Count | Frequency (%)
~ | 16 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 125 | 91.2%
Hangul | 12 | 8.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 44 | 35.2%
1 | 34 | 27.2%
~ | 16 | 12.8%
5 | 16 | 12.8%
3 | 6 | 4.8%
(space) | 6 | 4.8%
2 | 2 | 1.6%
, | 1 | 0.8%

Hangul
Value | Count | Frequency (%)
이 | 6 | 50.0%
상 | 6 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 125 | 91.2%
Hangul | 12 | 8.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 44 | 35.2%
1 | 34 | 27.2%
~ | 16 | 12.8%
5 | 16 | 12.8%
3 | 6 | 4.8%
(space) | 6 | 4.8%
2 | 2 | 1.6%
, | 1 | 0.8%

Hangul
Value | Count | Frequency (%)
이 | 6 | 50.0%
상 | 6 | 50.0%

세제곱미터당 금액(원) (price per cubic meter, KRW)
Real number (ℝ)

High correlation

Distinct: 20
Distinct (%): 90.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1087.7273
Minimum: 510
Maximum: 1920
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 510
5-th percentile: 586
Q1: 860
Median: 1025
Q3: 1270
95-th percentile: 1769.5
Maximum: 1920
Range: 1410
Interquartile range (IQR): 410
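
The quantile figures above can be checked by hand. The sketch below assumes linear interpolation between order statistics (the default in pandas, but not stated in this report) and uses the 22 prices recovered from the extreme-value tables of this variable. `quantile` is a hypothetical helper written for the example.

```python
def quantile(values, q):
    """Linearly interpolated quantile of `values` for 0 <= q <= 1."""
    data = sorted(values)
    pos = q * (len(data) - 1)   # fractional index into the sorted data
    lower = int(pos)
    frac = pos - lower
    if lower + 1 >= len(data):  # q == 1.0 lands on the last element
        return float(data[-1])
    return data[lower] + frac * (data[lower + 1] - data[lower])


# All 22 values of 세제곱미터당 금액(원), from the minimum and maximum lists.
prices = [510, 580, 700, 770, 860, 860, 860, 870, 940, 960, 990, 1060,
          1090, 1130, 1160, 1210, 1290, 1380, 1440, 1570, 1780, 1920]

q1 = quantile(prices, 0.25)
median = quantile(prices, 0.50)
q3 = quantile(prices, 0.75)
iqr = q3 - q1
value_range = max(prices) - min(prices)

print(round(q1, 1), round(median, 1), round(q3, 1), round(iqr, 1), value_range)
print(round(quantile(prices, 0.05), 1), round(quantile(prices, 0.95), 1))
```

With these inputs the interpolation reproduces the reported Q1, median, Q3, IQR, range, and 5th and 95th percentiles.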

Descriptive statistics

Standard deviation: 363.26472
Coefficient of variation (CV): 0.33396673
Kurtosis: 0.19609456
Mean: 1087.7273
Median Absolute Deviation (MAD): 175
Skewness: 0.69232137
Sum: 23930
Variance: 131961.26
Monotonicity: Not monotonic
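
Several of these figures are related by simple arithmetic: the CV is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the MAD is the median of absolute deviations from the median. The sketch below recomputes them with Python's `statistics` module, assuming sample (n - 1) estimators, which match the reported values. The price list is recovered from the extreme-value tables of this variable. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# All 22 values of 세제곱미터당 금액(원), from the minimum and maximum lists.
prices = [510, 580, 700, 770, 860, 860, 860, 870, 940, 960, 990, 1060,
          1090, 1130, 1160, 1210, 1290, 1380, 1440, 1570, 1780, 1920]

total = sum(prices)
mean = statistics.fmean(prices)
variance = statistics.variance(prices)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(prices)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
med = statistics.median(prices)
mad = statistics.median(abs(x - med) for x in prices)

print(total, round(mean, 4), round(std_dev, 5), round(cv, 5), mad)
```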

[Figure: Histogram with fixed size bins (bins=20)]

Common values

Value | Count | Frequency (%)
860 | 3 | 13.6%
580 | 1 | 4.5%
940 | 1 | 4.5%
770 | 1 | 4.5%
700 | 1 | 4.5%
1920 | 1 | 4.5%
1780 | 1 | 4.5%
1570 | 1 | 4.5%
1210 | 1 | 4.5%
1060 | 1 | 4.5%
Other values (10) | 10 | 45.5%

Minimum 10 values

Value | Count | Frequency (%)
510 | 1 | 4.5%
580 | 1 | 4.5%
700 | 1 | 4.5%
770 | 1 | 4.5%
860 | 3 | 13.6%
870 | 1 | 4.5%
940 | 1 | 4.5%
960 | 1 | 4.5%
990 | 1 | 4.5%
1060 | 1 | 4.5%

Maximum 10 values

Value | Count | Frequency (%)
1920 | 1 | 4.5%
1780 | 1 | 4.5%
1570 | 1 | 4.5%
1440 | 1 | 4.5%
1380 | 1 | 4.5%
1290 | 1 | 4.5%
1210 | 1 | 4.5%
1160 | 1 | 4.5%
1130 | 1 | 4.5%
1090 | 1 | 4.5%

관리기관 (managing agency)
Categorical

High correlation

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 11
Median length: 11
Mean length: 10.681818
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: 경기도 오산시 수도과
3rd row: 경기도 오산시 수도과
4th row: 경기도 오산시 수도과
5th row: 경기도 오산시 수도과

Common Values

Value | Count | Frequency (%)
경기도 오산시 하수과 | 12 | 54.5%
경기도 오산시 수도과 | 9 | 40.9%
<NA> | 1 | 4.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
경기도 | 21 | 32.8%
오산시 | 21 | 32.8%
하수과 | 12 | 18.8%
수도과 | 9 | 14.1%
na | 1 | 1.6%

연락처 (contact phone number)
Categorical

High correlation

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 13
Median length: 13
Mean length: 12.590909
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: 031-8036-6385
3rd row: 031-8036-6385
4th row: 031-8036-6385
5th row: 031-8036-6385

Common Values

Value | Count | Frequency (%)
031-8036-6122 | 12 | 54.5%
031-8036-6385 | 9 | 40.9%
<NA> | 1 | 4.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

데이터기준일자 (data reference date)
Categorical

High correlation, Imbalance

Distinct: 2
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 10
Median length: 10
Mean length: 9.7272727
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: 2023-08-01
3rd row: 2023-08-01
4th row: 2023-08-01
5th row: 2023-08-01

Common Values

Value | Count | Frequency (%)
2023-08-01 | 21 | 95.5%
<NA> | 1 | 4.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions


Correlations

Variable | 구분 | 업종 | 단계 | 사용량(세제곱미터) | 세제곱미터당 금액(원) | 관리기관 | 연락처
구분 | 1.000 | 0.000 | 0.000 | 0.000 | 0.478 | 0.987 | 0.987
업종 | 0.000 | 1.000 | 0.000 | 1.000 | 0.569 | 0.110 | 0.110
단계 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
사용량(세제곱미터) | 0.000 | 1.000 | 1.000 | 1.000 | 0.479 | 0.000 | 0.000
세제곱미터당 금액(원) | 0.478 | 0.569 | 0.000 | 0.479 | 1.000 | 0.623 | 0.623
관리기관 | 0.987 | 0.110 | 0.000 | 0.000 | 0.623 | 1.000 | 0.987
연락처 | 0.987 | 0.110 | 0.000 | 0.000 | 0.623 | 0.987 | 1.000

Variable | 단계 | 구분 | 연락처 | 관리기관 | 데이터기준일자 | 업종
단계 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
구분 | 0.000 | 1.000 | 0.897 | 0.897 | 1.000 | 0.000
연락처 | 0.000 | 0.897 | 1.000 | 0.897 | 1.000 | 0.162
관리기관 | 0.000 | 0.897 | 0.897 | 1.000 | 1.000 | 0.162
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 0.000 | 0.000 | 0.162 | 0.162 | 1.000 | 1.000

Variable | 세제곱미터당 금액(원) | 구분 | 업종 | 단계 | 관리기관 | 연락처 | 데이터기준일자
세제곱미터당 금액(원) | 1.000 | 0.269 | 0.363 | 0.000 | 0.346 | 0.346 | 1.000
구분 | 0.269 | 1.000 | 0.000 | 0.000 | 0.897 | 0.897 | 1.000
업종 | 0.363 | 0.000 | 1.000 | 0.000 | 0.162 | 0.162 | 1.000
단계 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000
관리기관 | 0.346 | 0.897 | 0.162 | 0.000 | 1.000 | 0.897 | 1.000
연락처 | 0.346 | 0.897 | 0.162 | 0.000 | 0.897 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Row | 구분 | 업종 | 단계 | 사용량(세제곱미터) | 세제곱미터당 금액(원) | 관리기관 | 연락처 | 데이터기준일자
0 | 상수도 | 가정용 | 1 | 1 이상 | 580 | <NA> | <NA> | <NA>
1 | 상수도 | 일반용 | 1 | 1~50 | 960 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
2 | 상수도 | 일반용 | 2 | 51~100 | 1130 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
3 | 상수도 | 일반용 | 3 | 101~300 | 1290 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
4 | 상수도 | 일반용 | 4 | 301~500 | 1380 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
5 | 상수도 | 일반용 | 5 | 501 이상 | 1440 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
6 | 상수도 | 욕탕용 | 1 | 1~500 | 870 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
7 | 상수도 | 욕탕용 | 2 | 501~1000 | 990 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
8 | 상수도 | 욕탕용 | 3 | 1001~1500 | 1090 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01
9 | 상수도 | 욕탕용 | 4 | 1,501 이상 | 1160 | 경기도 오산시 수도과 | 031-8036-6385 | 2023-08-01

Last rows

Row | 구분 | 업종 | 단계 | 사용량(세제곱미터) | 세제곱미터당 금액(원) | 관리기관 | 연락처 | 데이터기준일자
12 | 하수도 | 가정용 | 3 | 31 이상 | 1060 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
13 | 하수도 | 일반용 | 1 | 1~50 | 860 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
14 | 하수도 | 일반용 | 2 | 51~100 | 1210 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
15 | 하수도 | 일반용 | 3 | 101~300 | 1570 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
16 | 하수도 | 일반용 | 4 | 301~500 | 1780 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
17 | 하수도 | 일반용 | 5 | 501 이상 | 1920 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
18 | 하수도 | 욕탕용 | 1 | 1~500 | 700 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
19 | 하수도 | 욕탕용 | 2 | 501~1000 | 770 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
20 | 하수도 | 욕탕용 | 3 | 1001~1500 | 860 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01
21 | 하수도 | 욕탕용 | 4 | 1501 이상 | 940 | 경기도 오산시 하수과 | 031-8036-6122 | 2023-08-01