Overview

Dataset statistics

Number of variables: 5
Number of observations: 3617
Missing cells: 104
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 148.5 KiB
Average record size in memory: 42.0 B
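
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming "missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 3617
missing_cells = 104
total_memory_kib = 148.5

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing, {avg_record_bytes:.1f} B per record")
```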

Variable types

Numeric: 2
Categorical: 2
Text: 1

Dataset

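Description: Civil service application data (water supply termination) used in the customer information system that the Busan Metropolitan City Waterworks Headquarters operates to calculate and collect water and sewerage charges.
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: https://www.data.go.kr/data/15100353/fileData.do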

Alerts

사업소코드 is highly correlated overall with 사업소명 (High correlation)
사업소명 is highly correlated overall with 사업소코드 (High correlation)
폐전일자 has 104 (2.9%) missing values (Missing)
연번 has unique values (Unique)
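
A correlation this high between 사업소코드 and 사업소명 suggests that the two columns encode the same information. One way to check this is to test whether the pairing is one-to-one. The sketch below does that on the (code, name) pairs visible in the report's sample rows; `is_one_to_one` is a hypothetical helper, not part of the profiling tool.

```python
def is_one_to_one(pairs):
    """Return True if each left value maps to exactly one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault stores the first partner seen; any later mismatch breaks the mapping.
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True


# (사업소코드, 사업소명) pairs as they appear in the report's sample rows.
sample_pairs = [
    (306, "남부사업소"), (307, "북부사업소"), (309, "사하사업소"),
    (244, "동래통합사업소"), (304, "부산진 사업소"),
    (302, "서부 사업소"), (301, "중동부사업소"), (306, "남부사업소"),
]

print(is_one_to_one(sample_pairs))
```

If the check holds on the full data, one of the two columns could be dropped before modelling without losing information.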

Reproduction

Analysis started: 2024-03-14 14:16:05.949506
Analysis finished: 2024-03-14 14:16:08.064561
Duration: 2.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
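
The reported duration is the difference between the two timestamps. A small sketch of that calculation using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-14 14:16:05.949506")
finished = datetime.fromisoformat("2024-03-14 14:16:08.064561")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```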

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 3617
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1809
Minimum: 1
Maximum: 3617
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 181.8
Q1: 905
Median: 1809
Q3: 2713
95-th percentile: 3436.2
Maximum: 3617
Range: 3616
Interquartile range (IQR): 1808
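
Because 연번 runs from 1 to 3617 with no gaps, the quantiles above can be reproduced by hand. The sketch below assumes the linear-interpolation rule (the default in NumPy and pandas); the report does not state which rule it uses, and `quantile` is an illustrative helper.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


serial_numbers = list(range(1, 3618))  # 1, 2, ..., 3617
p5, q1, median, q3, p95 = (
    quantile(serial_numbers, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)
)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```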

Descriptive statistics

Standard deviation: 1044.2823
Coefficient of variation (CV): 0.57727048
Kurtosis: -1.2
Mean: 1809
Median Absolute Deviation (MAD): 904
Skewness: 0
Sum: 6543153
Variance: 1090525.5
Monotonicity: Strictly increasing
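
These values are what one would expect for a plain row counter 1, 2, ..., n. As a sanity check, the sketch below recomputes several of them with Python's `statistics` module, assuming the report uses the sample variance (n − 1 denominator). For such a sequence the sample variance has the closed form n(n + 1)/12.

```python
import statistics

serial_numbers = list(range(1, 3618))  # 1, 2, ..., 3617
n = len(serial_numbers)

mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(serial_numbers)
cv = std_dev / mean  # coefficient of variation

# Median absolute deviation around the median.
center = statistics.median(serial_numbers)
mad = statistics.median([abs(x - center) for x in serial_numbers])

# Closed form for the sample variance of 1..n.
closed_form_variance = n * (n + 1) / 12

print(mean, variance, round(std_dev, 4), round(cv, 8), mad, sum(serial_numbers))
```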
[Figure: histogram with fixed-size bins (bins=50)]

Value                Count  Frequency (%)
1                    1      < 0.1%
2431                 1      < 0.1%
2405                 1      < 0.1%
2406                 1      < 0.1%
2407                 1      < 0.1%
2408                 1      < 0.1%
2409                 1      < 0.1%
2410                 1      < 0.1%
2411                 1      < 0.1%
2412                 1      < 0.1%
Other values (3607)  3607   99.7%

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
3617   1      < 0.1%
3616   1      < 0.1%
3615   1      < 0.1%
3614   1      < 0.1%
3613   1      < 0.1%
3612   1      < 0.1%
3611   1      < 0.1%
3610   1      < 0.1%
3609   1      < 0.1%
3608   1      < 0.1%

사업소코드 (office code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 300.0047
Minimum: 244
Maximum: 312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 244
5-th percentile: 244
Q1: 303
Median: 306
Q3: 306
95-th percentile: 309
Maximum: 312
Range: 68
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 17.098014
Coefficient of variation (CV): 0.056992486
Kurtosis: 6.6790274
Mean: 300.0047
Median Absolute Deviation (MAD): 2
Skewness: -2.9039936
Sum: 1085117
Variance: 292.34207
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=11)]

Value  Count  Frequency (%)
306    1305   36.1%
304    710    19.6%
301    327    9.0%
244    303    8.4%
302    271    7.5%
307    219    6.1%
308    123    3.4%
303    110    3.0%
309    108    3.0%
312    76     2.1%

Value  Count  Frequency (%)
244    303    8.4%
301    327    9.0%
302    271    7.5%
303    110    3.0%
304    710    19.6%
306    1305   36.1%
307    219    6.1%
308    123    3.4%
309    108    3.0%
311    65     1.8%

Value  Count  Frequency (%)
312    76     2.1%
311    65     1.8%
309    108    3.0%
308    123    3.4%
307    219    6.1%
306    1305   36.1%
304    710    19.6%
303    110    3.0%
302    271    7.5%
301    327    9.0%

사업소명 (office name)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 28.4 KiB

남부사업소        1305
부산진 사업소     710
중동부사업소      327
동래통합사업소    303
서부 사업소       271
Other values (6)  701

Length

Max length: 9
Median length: 5
Mean length: 6.1805364
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남부사업소
2nd row: 남부사업소
3rd row: 남부사업소
4th row: 북부사업소
5th row: 남부사업소

Common Values

Value             Count  Frequency (%)
남부사업소        1305   36.1%
부산진 사업소     710    19.6%
중동부사업소      327    9.0%
동래통합사업소    303    8.4%
서부 사업소       271    7.5%
북부사업소        219    6.1%
해운대사업소      123    3.4%
영도사업소        110    3.0%
사하사업소        108    3.0%
기장사업소        76     2.1%

Length

[Figure: histogram of lengths of the category]

Value             Count  Frequency (%)
남부사업소        1305   28.4%
사업소            981    21.3%
부산진            710    15.4%
중동부사업소      327    7.1%
동래통합사업소    303    6.6%
서부              271    5.9%
북부사업소        219    4.8%
해운대사업소      123    2.7%
영도사업소        110    2.4%
사하사업소        108    2.3%
Other values (2)  141    3.1%
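
This table appears to count whitespace-separated words rather than whole category values. That would explain why 사업소 shows up on its own with 981 occurrences: it is the second word of both 부산진 사업소 (710) and 서부 사업소 (271). A minimal sketch of that tokenization, using the five most frequent categories from the Common Values table and assuming a plain whitespace split:

```python
from collections import Counter

# Category counts copied from the Common Values table (top five only).
category_counts = {
    "남부사업소": 1305,
    "부산진 사업소": 710,
    "중동부사업소": 327,
    "동래통합사업소": 303,
    "서부 사업소": 271,
}

word_counts = Counter()
for value, count in category_counts.items():
    # Each word of a category inherits that category's row count.
    for word in value.split():
        word_counts[word] += count

print(word_counts.most_common(3))
```

The inconsistent spacing in the source data (부산진 사업소 versus 남부사업소) is what splits some names into two tokens; normalizing the spacing would make the word and category tables agree.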

폐전사유 (reason for service termination)
Categorical

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 28.4 KiB

건물철거             2310
불필요등기타         684
직권폐전             394
도시계획 도로편입    164
<NA>                 42
Other values (2)     23

Length

Max length: 10
Median length: 4
Mean length: 4.6115565
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건물철거
2nd row: 건물철거
3rd row: 건물철거
4th row: 건물철거
5th row: 건물철거

Common Values

Value                Count  Frequency (%)
건물철거             2310   63.9%
불필요등기타         684    18.9%
직권폐전             394    10.9%
도시계획 도로편입    164    4.5%
<NA>                 42     1.2%
폐전분실             19     0.5%
중지후 건물신축포기  4      0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value         Count  Frequency (%)
건물철거      2310   61.0%
불필요등기타  684    18.1%
직권폐전      394    10.4%
도시계획      164    4.3%
도로편입      164    4.3%
na            42     1.1%
폐전분실      19     0.5%
중지후        4      0.1%
건물신축포기  4      0.1%

폐전일자 (service termination date)
Text

MISSING

Distinct: 277
Distinct (%): 7.9%
Missing: 104
Missing (%): 2.9%
Memory size: 28.4 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 35130
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
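
As an illustration of those character properties, the sketch below tallies the Unicode general category of every character in the five sample dates shown in this section, using the standard `unicodedata` module. `Nd` is the code for Decimal Number and `Pd` for Dash Punctuation, the two categories the report finds. This is a hand-rolled check, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values listed for 폐전일자.
sample_dates = ["2023-01-25", "2023-01-26", "2023-01-26", "2023-01-27", "2023-01-27"]

# Tally the general category of every character across all values.
category_counts = Counter(
    unicodedata.category(char) for value in sample_dates for char in value
)

print(dict(category_counts))
```

Each date has eight digits and two hyphens, which matches the report's 80% / 20% split between the two categories.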

Unique

Unique: 22
Unique (%): 0.6%

Sample

1st row: 2023-01-25
2nd row: 2023-01-26
3rd row: 2023-01-26
4th row: 2023-01-27
5th row: 2023-01-27

Value               Count  Frequency (%)
2023-10-18          90     2.6%
2023-09-27          75     2.1%
2023-10-20          50     1.4%
2023-06-08          42     1.2%
2023-11-24          42     1.2%
2023-07-04          36     1.0%
2023-10-16          35     1.0%
2023-02-14          32     0.9%
2023-01-06          31     0.9%
2023-04-18          28     0.8%
Other values (267)  3052   86.9%
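
The profiler typed 폐전일자 as Text, although every non-missing value is a 10-character string that looks like an ISO date (YYYY-MM-DD). A minimal sketch of converting such values to real dates before analysis; `parse_closure_date` is a hypothetical helper, and treating the literal string `<NA>` as a missing marker is an assumption based on how missing values are displayed in this report:

```python
from datetime import date


def parse_closure_date(text):
    """Return a date for 'YYYY-MM-DD' text, or None for a missing marker."""
    # Assumption: None, empty strings, and the literal "<NA>" all mean "missing".
    if text is None or text.strip() in ("", "<NA>"):
        return None
    return date.fromisoformat(text.strip())


raw_values = ["2023-01-25", "2023-01-26", "<NA>", "2023-10-18"]
parsed = [parse_closure_date(value) for value in raw_values]
missing = sum(d is None for d in parsed)

print(parsed, missing)
```

With proper date values, the column can be grouped by month or weekday instead of being analysed character by character.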

Most occurring characters

Value  Count  Frequency (%)
2      9137   26.0%
0      7738   22.0%
-      7026   20.0%
3      4203   12.0%
1      3222   9.2%
6      707    2.0%
8      701    2.0%
7      674    1.9%
4      672    1.9%
9      543    1.5%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    28104  80.0%
Dash Punctuation  7026   20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      9137   32.5%
0      7738   27.5%
3      4203   15.0%
1      3222   11.5%
6      707    2.5%
8      701    2.5%
7      674    2.4%
4      672    2.4%
9      543    1.9%
5      507    1.8%

Dash Punctuation
Value  Count  Frequency (%)
-      7026   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  35130  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      9137   26.0%
0      7738   22.0%
-      7026   20.0%
3      4203   12.0%
1      3222   9.2%
6      707    2.0%
8      701    2.0%
7      674    1.9%
4      672    1.9%
9      543    1.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  35130  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      9137   26.0%
0      7738   22.0%
-      7026   20.0%
3      4203   12.0%
1      3222   9.2%
6      707    2.0%
8      701    2.0%
7      674    1.9%
4      672    1.9%
9      543    1.5%

Interactions

[Figures: four interaction plots]

Correlations

            연번        사업소코드  사업소명    폐전사유
연번        1.000       0.241       0.147       0.152
사업소코드  0.241       1.000       1.000       0.502
사업소명    0.147       1.000       1.000       0.613
폐전사유    0.152       0.502       0.613       1.000

            폐전사유    사업소명
폐전사유    1.000       0.372
사업소명    0.372       1.000

            연번        사업소코드  사업소명    폐전사유
연번        1.000       -0.087      0.063       0.080
사업소코드  -0.087      1.000       0.999       0.404
사업소명    0.063       0.999       1.000       0.372
폐전사유    0.080       0.404       0.372       1.000

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
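
Both views can be approximated without any plotting library. The sketch below counts missing values per column and builds a boolean nullity matrix (True where a value is present) for the first five rows of the report's Sample section. The `<NA>` shown in the fourth row is represented here as `None`, which is an assumption about how the original data encodes missing values.

```python
columns = ["연번", "사업소코드", "사업소명", "폐전사유", "폐전일자"]

# First five rows from the report's Sample section.
rows = [
    (1, 306, "남부사업소", "건물철거", "2023-01-25"),
    (2, 306, "남부사업소", "건물철거", "2023-01-26"),
    (3, 306, "남부사업소", "건물철거", "2023-01-26"),
    (4, 307, "북부사업소", "건물철거", None),  # shown as <NA> in the report
    (5, 306, "남부사업소", "건물철거", "2023-01-27"),
]

# Nullity by column: how many values are missing in each column.
missing_per_column = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(columns)
}

# Nullity matrix: one row per record, True where the value is present.
nullity_matrix = [[value is not None for value in row] for row in rows]

print(missing_per_column)
```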

Sample

      연번  사업소코드  사업소명        폐전사유      폐전일자
0     1     306         남부사업소      건물철거      2023-01-25
1     2     306         남부사업소      건물철거      2023-01-26
2     3     306         남부사업소      건물철거      2023-01-26
3     4     307         북부사업소      건물철거      <NA>
4     5     306         남부사업소      건물철거      2023-01-27
5     6     306         남부사업소      건물철거      2023-01-27
6     7     306         남부사업소      건물철거      2023-01-27
7     8     306         남부사업소      건물철거      2023-01-27
8     9     306         남부사업소      건물철거      2023-01-27
9     10    309         사하사업소      불필요등기타  2023-01-27

      연번  사업소코드  사업소명        폐전사유      폐전일자
3607  3608  309         사하사업소      건물철거      2023-08-03
3608  3609  244         동래통합사업소  불필요등기타  2023-08-08
3609  3610  304         부산진 사업소   건물철거      2023-08-11
3610  3611  304         부산진 사업소   건물철거      2023-08-03
3611  3612  302         서부 사업소     직권폐전      2023-08-10
3612  3613  304         부산진 사업소   건물철거      2023-08-17
3613  3614  304         부산진 사업소   건물철거      2023-08-17
3614  3615  301         중동부사업소    건물철거      2023-08-18
3615  3616  301         중동부사업소    건물철거      2023-08-23
3616  3617  306         남부사업소      건물철거      2023-09-01