Overview

Dataset statistics

Number of variables: 7
Number of observations: 24
Missing cells: 3
Missing cells (%): 1.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 66.3 B
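
The missing-cell percentage follows from the counts in this block. A minimal sketch, using only the figures reported here (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics block.
n_variables = 7
n_observations = 24
missing_cells = 3

total_cells = n_variables * n_observations        # 7 * 24 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells, in percent

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```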

Variable types

Numeric: 2
DateTime: 1
Categorical: 3
Text: 1

Dataset

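Description: Busan Metropolitan City Waterworks Headquarters_Customer Information System_Water Use Charge Information_20220609 (부산광역시상수도사업본부_수용가정보시스템_물이용요금정보_20220609)
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15083446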

Alerts

연번 is highly overall correlated with 취수율 and 1 other field (High correlation)
취수율 is highly overall correlated with 연번 and 2 other fields (High correlation)
구분 is highly overall correlated with 취수율 and 1 other field (High correlation)
부과율 is highly overall correlated with 연번 and 2 other fields (High correlation)
설명 has 3 (12.5%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-21 07:09:51.312474
Analysis finished: 2024-04-21 07:09:53.162846
Duration: 1.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 24
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.5
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.15
Q1: 6.75
Median: 12.5
Q3: 18.25
95-th percentile: 22.85
Maximum: 24
Range: 23
Interquartile range (IQR): 11.5

Descriptive statistics

Standard deviation: 7.0710678
Coefficient of variation (CV): 0.56568542
Kurtosis: -1.2
Mean: 12.5
Median Absolute Deviation (MAD): 6
Skewness: 0
Sum: 300
Variance: 50
Monotonicity: Strictly increasing
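
Because 연번 is simply the sequence 1 to 24, these statistics can be recomputed by hand. The sketch below uses only the Python standard library. It assumes that the variance is the sample variance (divisor n - 1) and that percentiles use linear interpolation between neighbouring values; the report does not state either rule, but both assumptions agree with the figures listed. The helper `quantile` is illustrative.

```python
import statistics

values = list(range(1, 25))  # 연번: 1, 2, ..., 24


def quantile(sorted_vals, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (n - 1 divisor)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)

p5, q1 = quantile(values, 0.05), quantile(values, 0.25)
q3, p95 = quantile(values, 0.75), quantile(values, 0.95)
iqr = q3 - q1

print(mean, median, variance, round(std, 7), round(cv, 8), mad)
print(round(p5, 2), q1, q3, round(p95, 2), iqr)
```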
Histogram with fixed size bins (bins=24)

Common values

Value | Count | Frequency (%)
1 | 1 | 4.2%
14 | 1 | 4.2%
24 | 1 | 4.2%
23 | 1 | 4.2%
22 | 1 | 4.2%
21 | 1 | 4.2%
20 | 1 | 4.2%
19 | 1 | 4.2%
18 | 1 | 4.2%
17 | 1 | 4.2%
Other values (14) | 14 | 58.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 4.2%
2 | 1 | 4.2%
3 | 1 | 4.2%
4 | 1 | 4.2%
5 | 1 | 4.2%
6 | 1 | 4.2%
7 | 1 | 4.2%
8 | 1 | 4.2%
9 | 1 | 4.2%
10 | 1 | 4.2%

Maximum 10 values

Value | Count | Frequency (%)
24 | 1 | 4.2%
23 | 1 | 4.2%
22 | 1 | 4.2%
21 | 1 | 4.2%
20 | 1 | 4.2%
19 | 1 | 4.2%
18 | 1 | 4.2%
17 | 1 | 4.2%
16 | 1 | 4.2%
15 | 1 | 4.2%

적용년월 (effective year-month)
DateTime

Distinct: 22
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Memory size: 320.0 B
Minimum: 2010-02-01 00:00:00
Maximum: 2021-02-01 00:00:00
Histogram with fixed size bins (bins=22)

구분 (category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 320.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8
2nd row: 1
3rd row: 8
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 21 | 87.5%
8 | 3 | 12.5%

Length

Histogram of lengths of the category


취수율 (water intake rate)
Real number (ℝ)

HIGH CORRELATION

Distinct: 12
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.90775
Minimum: 0.877
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.0 B

Quantile statistics

Minimum: 0.877
5-th percentile: 0.88215
Q1: 0.888
Median: 0.8965
Q3: 0.92075
95-th percentile: 0.95195
Maximum: 1
Range: 0.123
Interquartile range (IQR): 0.03275

Descriptive statistics

Standard deviation: 0.028398178
Coefficient of variation (CV): 0.03128414
Kurtosis: 3.6534294
Mean: 0.90775
Median Absolute Deviation (MAD): 0.014
Skewness: 1.6822504
Sum: 21.786
Variance: 0.00080645652
Monotonicity: Not monotonic
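
The frequency tables for 취수율 list all 12 distinct values between them, so the summary figures can be rebuilt from value counts alone. The sketch below is a cross-check under that assumption, using only the standard library; it again assumes the sample variance (divisor n - 1), which the report does not state explicitly.

```python
import statistics

# Value -> count, copied from the frequency tables for 취수율.
counts = {
    0.877: 1, 0.882: 1, 0.883: 1, 0.888: 5, 0.889: 4, 0.904: 1,
    0.906: 1, 0.916: 3, 0.918: 1, 0.929: 4, 0.956: 1, 1.0: 1,
}

# Expand the counts back into the 24 individual observations.
values = sorted(v for v, c in counts.items() for _ in range(c))

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (n - 1 divisor)
std = statistics.stdev(values)

print(len(values), round(total, 3), round(mean, 5), round(median, 4))
print(round(variance, 11), round(std, 9))
```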
Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
0.888 | 5 | 20.8%
0.929 | 4 | 16.7%
0.889 | 4 | 16.7%
0.916 | 3 | 12.5%
1.0 | 1 | 4.2%
0.918 | 1 | 4.2%
0.904 | 1 | 4.2%
0.956 | 1 | 4.2%
0.883 | 1 | 4.2%
0.877 | 1 | 4.2%
Other values (2) | 2 | 8.3%

Minimum 10 values

Value | Count | Frequency (%)
0.877 | 1 | 4.2%
0.882 | 1 | 4.2%
0.883 | 1 | 4.2%
0.888 | 5 | 20.8%
0.889 | 4 | 16.7%
0.904 | 1 | 4.2%
0.906 | 1 | 4.2%
0.916 | 3 | 12.5%
0.918 | 1 | 4.2%
0.929 | 4 | 16.7%

Maximum 10 values

Value | Count | Frequency (%)
1.0 | 1 | 4.2%
0.956 | 1 | 4.2%
0.929 | 4 | 16.7%
0.918 | 1 | 4.2%
0.916 | 3 | 12.5%
0.906 | 1 | 4.2%
0.904 | 1 | 4.2%
0.889 | 4 | 16.7%
0.888 | 5 | 20.8%
0.883 | 1 | 4.2%

부과율 (levy rate)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 320.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 4.2%

Sample

1st row: 140
2nd row: 150
3rd row: 150
4th row: 150
5th row: 160

Common Values

Value | Count | Frequency (%)
160 | 12 | 50.0%
170 | 7 | 29.2%
150 | 4 | 16.7%
140 | 1 | 4.2%

Length

Histogram of lengths of the category


부과율(BOD) (levy rate, BOD)
Categorical

Distinct: 3
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 320.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.7916667
Min length: 2
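
The length statistics treat each category as a string, so they can be derived from the value counts reported for 부과율(BOD) (100 × 19, 70 × 4, 60 × 1). A minimal sketch under that reading; the variable names are illustrative:

```python
import statistics

# Category (as text) -> count, copied from the Common Values table for 부과율(BOD).
counts = {"100": 19, "70": 4, "60": 1}

# One string length per observation.
lengths = sorted(len(value) for value, c in counts.items() for _ in range(c))

max_len = max(lengths)
min_len = min(lengths)
median_len = statistics.median(lengths)
mean_len = statistics.mean(lengths)      # (19 * 3 + 5 * 2) / 24

print(max_len, median_len, round(mean_len, 7), min_len)
```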

Unique

Unique: 1
Unique (%): 4.2%

Sample

1st row: 100
2nd row: 60
3rd row: 100
4th row: 100
5th row: 100

Common Values

Value | Count | Frequency (%)
100 | 19 | 79.2%
70 | 4 | 16.7%
60 | 1 | 4.2%

Length

Histogram of lengths of the category


설명 (description)
Text

MISSING

Distinct: 21
Distinct (%): 100.0%
Missing: 3
Missing (%): 12.5%
Memory size: 320.0 B

Length

Max length: 50
Median length: 42
Mean length: 35.285714
Min length: 18

Characters and Unicode

Total characters: 741
Distinct characters: 64
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

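Unique: 21
Unique (%): 100.0%

Sample

1st row: 주한미군관련 시설 물이용부담금 부과단가 변경 요청(경영관리팀-1883,2019.4.5.) [Request to change the water use charge unit price for USFK-related facilities (Business Management Team-1883, 2019.4.5.)]
2nd row: 2010년 4월 납기 부과율 변경 [Levy rate change for the April 2010 payment due date]
3rd row: 2011년 2월 납기 부과율 변경 [Levy rate change for the February 2011 payment due date]
4th row: 2012년 2월 납기 부과율 변경 [Levy rate change for the February 2012 payment due date]
5th row: 2012년 3월 납기 부과율 변경 [Levy rate change for the March 2012 payment due date]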
Most frequent words

Value | Count | Frequency (%)
변경 (change) | 7 | 5.0%
2월 (February) | 7 | 5.0%
100 | 7 | 5.0%
납기분(격월 (billing (bimonthly) | 6 | 4.3%
(token not preserved) | 6 | 4.3%
70 | 6 | 4.3%
납기 (payment due date) | 6 | 4.3%
부과율 (levy rate) | 6 | 4.3%
1개월 (1 month) | 6 | 4.3%
물이용부담금 (water use charge) | 5 | 3.6%
Other values (53) | 77 | 55.4%

Most occurring characters

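Value | Count | Frequency (%)
(space) | 120 | 16.2%
1 | 62 | 8.4%
0 | 58 | 7.8%
2 | 53 | 7.2%
(Hangul character not preserved) | 39 | 5.3%
. | 24 | 3.2%
(Hangul character not preserved) | 20 | 2.7%
(Hangul character not preserved) | 20 | 2.7%
(Hangul character not preserved) | 18 | 2.4%
( | 17 | 2.3%
Other values (54) | 310 | 41.8%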

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 307 | 41.4%
Decimal Number | 230 | 31.0%
Space Separator | 120 | 16.2%
Other Punctuation | 44 | 5.9%
Open Punctuation | 17 | 2.3%
Close Punctuation | 16 | 2.2%
Dash Punctuation | 7 | 0.9%
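
The category names in this table are Unicode general categories. As an illustration of how a single value is classified, the sketch below counts categories for one sample row of 설명 in its original Korean form ("2010년 4월 납기 부과율 변경") with the standard-library `unicodedata` module. The `NAMES` mapping from two-letter codes to the labels used in this report is written out by hand here and is not taken from the profiling tool.

```python
import unicodedata
from collections import Counter

# One sample row of 설명, in its original form.
text = "2010년 4월 납기 부과율 변경"

# Two-letter general-category codes -> the labels used in the table above.
NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}

category_counts = Counter()
for ch in text:
    code = unicodedata.category(ch)          # e.g. "Lo" for a Hangul syllable
    category_counts[NAMES.get(code, code)] += 1

for name, count in category_counts.most_common():
    print(f"{name}: {count}")
```

The string has 18 characters, which matches the minimum length reported for 설명.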

Most frequent character per category

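Other Letter

Value | Count | Frequency (%)
(Hangul character not preserved) | 39 | 12.7%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 18 | 5.9%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 14 | 4.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
Other values (37) | 131 | 42.7%

Decimal Number

Value | Count | Frequency (%)
1 | 62 | 27.0%
0 | 58 | 25.2%
2 | 53 | 23.0%
3 | 11 | 4.8%
4 | 10 | 4.3%
5 | 9 | 3.9%
7 | 9 | 3.9%
9 | 7 | 3.0%
6 | 6 | 2.6%
8 | 5 | 2.2%

Other Punctuation

Value | Count | Frequency (%)
. | 24 | 54.5%
, | 13 | 29.5%
: | 7 | 15.9%

Space Separator

Value | Count | Frequency (%)
(space) | 120 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 17 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 16 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 7 | 100.0%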

Most occurring scripts

Value | Count | Frequency (%)
Common | 434 | 58.6%
Hangul | 307 | 41.4%

Most frequent character per script

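Hangul

Value | Count | Frequency (%)
(Hangul character not preserved) | 39 | 12.7%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 18 | 5.9%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 14 | 4.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
Other values (37) | 131 | 42.7%

Common

Value | Count | Frequency (%)
(space) | 120 | 27.6%
1 | 62 | 14.3%
0 | 58 | 13.4%
2 | 53 | 12.2%
. | 24 | 5.5%
( | 17 | 3.9%
) | 16 | 3.7%
, | 13 | 3.0%
3 | 11 | 2.5%
4 | 10 | 2.3%
Other values (7) | 50 | 11.5%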

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 434 | 58.6%
Hangul | 307 | 41.4%

Most frequent character per block

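ASCII

Value | Count | Frequency (%)
(space) | 120 | 27.6%
1 | 62 | 14.3%
0 | 58 | 13.4%
2 | 53 | 12.2%
. | 24 | 5.5%
( | 17 | 3.9%
) | 16 | 3.7%
, | 13 | 3.0%
3 | 11 | 2.5%
4 | 10 | 2.3%
Other values (7) | 50 | 11.5%

Hangul

Value | Count | Frequency (%)
(Hangul character not preserved) | 39 | 12.7%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 20 | 6.5%
(Hangul character not preserved) | 18 | 5.9%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 16 | 5.2%
(Hangul character not preserved) | 14 | 4.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
(Hangul character not preserved) | 11 | 3.6%
Other values (37) | 131 | 42.7%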

Interactions

[Figures: interaction plots; images not preserved]

Correlations

Variable | 연번 | 적용년월 | 구분 | 취수율 | 부과율 | 부과율(BOD) | 설명
연번 | 1.000 | 0.945 | 0.689 | 0.602 | 0.793 | 0.000 | 1.000
적용년월 | 0.945 | 1.000 | 0.000 | 0.936 | 0.910 | 0.000 | 1.000
구분 | 0.689 | 0.000 | 1.000 | 0.901 | 0.924 | 0.000 | 1.000
취수율 | 0.602 | 0.936 | 0.901 | 1.000 | 0.957 | 0.000 | 1.000
부과율 | 0.793 | 0.910 | 0.924 | 0.957 | 1.000 | 0.308 | 1.000
부과율(BOD) | 0.000 | 0.000 | 0.000 | 0.000 | 0.308 | 1.000 | 1.000
설명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 부과율 | 구분 | 부과율(BOD)
부과율 | 1.000 | 0.715 | 0.279
구분 | 0.715 | 1.000 | 0.000
부과율(BOD) | 0.279 | 0.000 | 1.000

Variable | 연번 | 취수율 | 구분 | 부과율 | 부과율(BOD)
연번 | 1.000 | -0.624 | 0.414 | 0.503 | 0.000
취수율 | -0.624 | 1.000 | 0.648 | 0.818 | 0.000
구분 | 0.414 | 0.648 | 1.000 | 0.715 | 0.000
부과율 | 0.503 | 0.818 | 0.715 | 1.000 | 0.279
부과율(BOD) | 0.000 | 0.000 | 0.000 | 0.279 | 1.000
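
The report does not name the measure behind each matrix. The values 0.715 (구분 vs 부과율) and 0.000 (구분 vs 부과율(BOD)) are, however, consistent with a bias-corrected Cramér's V, and the sketch below reproduces them under that assumption. The contingency counts are a reconstruction, not report output: all three 구분 = 8 observations appear in the Sample rows, and the 구분 = 1 counts follow by subtraction from the Common Values totals. The function name `cramers_v_corrected` is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions for small n.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Rows: 구분 = 1, 구분 = 8. Columns: 부과율 = 140, 150, 160, 170.
gubun_vs_rate = [[0, 2, 12, 7], [1, 2, 0, 0]]
# Rows: 구분 = 1, 구분 = 8. Columns: 부과율(BOD) = 100, 70, 60.
gubun_vs_bod = [[16, 4, 1], [3, 0, 0]]

v_rate = cramers_v_corrected(gubun_vs_rate)
v_bod = cramers_v_corrected(gubun_vs_bod)
print(round(v_rate, 3), round(v_bod, 3))
```

The correction term explains the exact zero for 부과율(BOD): with only 24 observations, the raw association is smaller than the expected bias, so the corrected value is clipped to 0.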

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
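
The nullity figures for 설명 can be checked against the rows shown in the Sample section. The sketch below is a minimal illustration with plain Python lists; it copies the 설명 values of the first ten sample rows, with `None` standing in for `<NA>`, and assumes (as the report's total of 3 implies) that no other row is missing.

```python
# 설명 for the first ten rows of the Sample section; None stands for <NA>.
first_rows = [
    "주한미군관련 시설 물이용부담금 부과단가 변경 요청(경영관리팀-1883,2019.4.5.)",
    None,
    None,
    "2010년 4월 납기 부과율 변경",
    "2011년 2월 납기 부과율 변경",
    None,
    "2012년 2월 납기 부과율 변경",
    "2012년 3월 납기 부과율 변경",
    "2012년 5월 납기 부과율 변경",
    "2013년 2월납기분 (2013.1.1 이후 사용량)부터",
]

# Positions (row indices) where 설명 is empty.
missing_positions = [i for i, value in enumerate(first_rows) if value is None]
n_missing = len(missing_positions)

n_observations = 24                       # from the Dataset statistics block
missing_pct = 100 * n_missing / n_observations

print(missing_positions, n_missing, f"{missing_pct:.1f}%")
```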

Sample

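First rows

Index | 연번 | 적용년월 | 구분 | 취수율 | 부과율 | 부과율(BOD) | 설명
0 | 1 | 2019-04-01 | 8 | 1.0 | 140 | 100 | 주한미군관련 시설 물이용부담금 부과단가 변경 요청(경영관리팀-1883,2019.4.5.) [Request to change the water use charge unit price for USFK-related facilities (Business Management Team-1883, 2019.4.5.)]
1 | 2 | 2010-02-01 | 1 | 0.929 | 150 | 60 | <NA>
2 | 3 | 2010-02-01 | 8 | 0.929 | 150 | 100 | <NA>
3 | 4 | 2010-04-01 | 1 | 0.929 | 150 | 100 | 2010년 4월 납기 부과율 변경 [Levy rate change for the April 2010 payment due date]
4 | 5 | 2011-02-01 | 1 | 0.918 | 160 | 100 | 2011년 2월 납기 부과율 변경 [Levy rate change for the February 2011 payment due date]
5 | 6 | 2011-02-01 | 8 | 0.929 | 150 | 100 | <NA>
6 | 7 | 2012-02-01 | 1 | 0.916 | 160 | 100 | 2012년 2월 납기 부과율 변경 [Levy rate change for the February 2012 payment due date]
7 | 8 | 2012-03-01 | 1 | 0.916 | 160 | 70 | 2012년 3월 납기 부과율 변경 [Levy rate change for the March 2012 payment due date]
8 | 9 | 2012-05-01 | 1 | 0.916 | 160 | 100 | 2012년 5월 납기 부과율 변경 [Levy rate change for the May 2012 payment due date]
9 | 10 | 2013-02-01 | 1 | 0.888 | 160 | 100 | 2013년 2월납기분 (2013.1.1 이후 사용량)부터 [From the February 2013 billing (usage on or after 2013.1.1)]

Last rows

Index | 연번 | 적용년월 | 구분 | 취수율 | 부과율 | 부과율(BOD) | 설명
14 | 15 | 2014-02-01 | 1 | 0.889 | 160 | 100 | 2014년 2월 납기분(격월 : 전1개월 100, 후1개월 100) [February 2014 billing (bimonthly: first month 100, second month 100)]
15 | 16 | 2014-04-01 | 1 | 0.889 | 160 | 70 | 2014년 4월 납기분(격월 : 전1개월 100, 후1개월 70) [April 2014 billing (bimonthly: first month 100, second month 70)]
16 | 17 | 2014-05-01 | 1 | 0.889 | 160 | 100 | 2014년 5월 납기분(격월:전1개월 70, 후1개월 100) [May 2014 billing (bimonthly: first month 70, second month 100)]
17 | 18 | 2015-02-01 | 1 | 0.904 | 170 | 100 | 2015년 2월 납기분 취수율 및 부과율 변경(2015. 1. 1이후 사용량부터) [Intake rate and levy rate change for the February 2015 billing (from usage on or after 2015. 1. 1)]
18 | 19 | 2016-02-01 | 1 | 0.956 | 170 | 100 | 2016년 2월 납기분부터 적용(경영관리팀-608,2016.01.29.) [Applied from the February 2016 billing (Business Management Team-608, 2016.01.29.)]
19 | 20 | 2017-02-01 | 1 | 0.883 | 170 | 100 | 2017년 2월분부터 적용(경영관리팀-639, 2017.2.1.) [Applied from February 2017 (Business Management Team-639, 2017.2.1.)]
20 | 21 | 2021-02-01 | 1 | 0.877 | 170 | 100 | 2021년 물이용부담금 부과계수 및 단가 변경 안내(경영관리팀-569, 2021.2.2.) [Notice of changes to the 2021 water use charge levy coefficient and unit price (Business Management Team-569, 2021.2.2.)]
21 | 22 | 2020-02-01 | 1 | 0.882 | 170 | 100 | 2020년 물이용부담금 부과단가 변경(경영관리팀-545(2020.1.30.) [2020 water use charge unit price change (Business Management Team-545 (2020.1.30.)]
22 | 23 | 2018-02-01 | 1 | 0.906 | 170 | 100 | 2018년 2월 물이용부담금 부과단가 변경사항 알림 (경영관리팀-471,2018.1.23) [Notice of February 2018 water use charge unit price changes (Business Management Team-471, 2018.1.23)]
23 | 24 | 2019-02-01 | 1 | 0.889 | 170 | 100 | 2019년 2월 납기 물이용부담금 부과단가 변경(경영관리팀-462, 2019.1.22.) [Water use charge unit price change for the February 2019 payment due date (Business Management Team-462, 2019.1.22.)]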