Overview

Dataset statistics

Number of variables: 6
Number of observations: 1050
Missing cells: 10
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 50.4 KiB
Average record size in memory: 49.1 B
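The percentage of missing cells and the average record size follow from the other figures in this block. The sketch below redoes that arithmetic; the variable names are illustrative, and because the report shows the memory size already rounded to 50.4 KiB, the recomputed record size only matches to within rounding.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 1050
missing_cells = 10
total_size_kib = 50.4  # already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells)                 # 6300
print(round(missing_pct, 1))       # 0.2
print(round(avg_record_bytes, 2))  # close to the reported 49.1 B
```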

Variable types

Numeric: 1
Text: 4
DateTime: 1

Dataset

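Description: Meter information (meter-number replacement records) from the customer information system that the Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부) operates to calculate and collect water and sewerage charges.
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: https://www.data.go.kr/data/15101349/fileData.do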

Alerts

연번 (serial number) has unique values [Unique]
삭제시간 (deletion time) has unique values [Unique]

Reproduction

Analysis started: 2024-03-14 08:59:07.213502
Analysis finished: 2024-03-14 08:59:08.428694
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 1050
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525.5
Minimum: 1
Maximum: 1050
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 53.45
Q1: 263.25
Median: 525.5
Q3: 787.75
95-th percentile: 997.55
Maximum: 1050
Range: 1049
Interquartile range (IQR): 524.5
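These quantiles can be reproduced by hand. The sketch below assumes 연번 is simply the sequence 1 to 1050, which the minimum, maximum and distinct count suggest but the report does not state outright, and it assumes linear interpolation between data points, here via `statistics.quantiles` with `method="inclusive"`.

```python
from statistics import median, quantiles

# Assumption: the column holds the integers 1..1050.
values = list(range(1, 1051))

# 99 cut points; index k-1 holds the k-th percentile.
# method="inclusive" treats the data as the full population and
# interpolates linearly between neighbouring values.
cuts = quantiles(values, n=100, method="inclusive")
p5, q1, q3, p95 = cuts[4], cuts[24], cuts[74], cuts[94]

value_range = max(values) - min(values)
iqr = q3 - q1

print(p5, q1, median(values), q3, p95)
print(value_range, iqr)
```

Under those assumptions the results agree with the table: 53.45, 263.25, 525.5, 787.75 and 997.55, with a range of 1049 and an IQR of 524.5.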

Descriptive statistics

Standard deviation: 303.25319
Coefficient of variation (CV): 0.57707554
Kurtosis: -1.2
Mean: 525.5
Median Absolute Deviation (MAD): 262.5
Skewness: 0
Sum: 551775
Variance: 91962.5
Monotonicity: Strictly increasing
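Most of these descriptive statistics can be checked with the standard library. This is a sketch under the assumption that 연번 is the sequence 1 to 1050; the reported variance of 91962.5 matches the sample variance (divisor n - 1), not the population variance, so `statistics.variance` is used.

```python
from math import sqrt
from statistics import mean, median, variance

# Assumption: the column holds the integers 1..1050 in file order.
values = list(range(1, 1051))

total = sum(values)
avg = mean(values)
var = variance(values)       # sample variance, divisor n - 1
std = sqrt(var)
cv = std / avg               # coefficient of variation
med = median(values)
mad = median(abs(v - med) for v in values)  # median absolute deviation
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, avg, var)
print(round(std, 5), round(cv, 8), mad, strictly_increasing)
```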
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
1                     1      0.1%
691                   1      0.1%
693                   1      0.1%
694                   1      0.1%
695                   1      0.1%
696                   1      0.1%
697                   1      0.1%
698                   1      0.1%
699                   1      0.1%
700                   1      0.1%
Other values (1040)   1040   99.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1050   1      0.1%
1049   1      0.1%
1048   1      0.1%
1047   1      0.1%
1046   1      0.1%
1045   1      0.1%
1044   1      0.1%
1043   1      0.1%
1042   1      0.1%
1041   1      0.1%

고객번호 (customer number)
Text

Distinct: 853
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6300
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
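As an illustration of how such character properties can be read, the sketch below tallies Unicode general categories with Python's `unicodedata` module. It is not the profiler's own code; the helper name `category_counts` is made up here, and the inputs are the five sample values listed for this variable. The category codes `Nd` and `Po` correspond to the report's "Decimal Number" and "Other Punctuation".

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values shown for this variable.
samples = ["*78*57", "*50*13", "*05*71", "*62*04", "*38*14"]
counts = category_counts(samples)
print(dict(counts))  # Nd: 20 digits, Po: 10 asterisks
```

The 2:1 ratio of digits to punctuation in these samples is the same ratio the full column shows (4200 decimal numbers to 2100 asterisks).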

Unique

Unique: 737
Unique (%): 70.2%
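"Distinct" and "Unique" measure different things, which is why the two counts differ for this variable. The sketch below assumes the usual reading: distinct is the number of different values, while unique is the number of values that occur exactly once. The toy values are hypothetical and not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical toy column: two values repeat, two appear once.
toy = ["*11*11", "*11*11", "*22*22", "*33*33", "*33*33", "*44*44"]
result = distinct_and_unique(toy)
print(result)  # (4, 2)
```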

Sample

1st row: *78*57
2nd row: *50*13
3rd row: *05*71
4th row: *62*04
5th row: *38*14
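The samples share a fixed masked shape: an asterisk, two digits, an asterisk, two digits. A format check could be sketched as below. The pattern is inferred from the five samples and the fixed length of 6; the publisher does not document it, so treat it as an assumption.

```python
import re

# Assumed mask format inferred from the samples: *dd*dd
MASKED_ID = re.compile(r"\*\d{2}\*\d{2}")

samples = ["*78*57", "*50*13", "*05*71", "*62*04", "*38*14"]
all_match = all(MASKED_ID.fullmatch(s) is not None for s in samples)
print(all_match)  # True
```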
Value                Count  Frequency (%)
39*24                19     1.8%
16*01                18     1.7%
54*39                7      0.7%
99*01                6      0.6%
59*73                5      0.5%
53*12                4      0.4%
48*27                4      0.4%
04*20                4      0.4%
10*68                4      0.4%
90*18                4      0.4%
Other values (843)   975    92.9%

Most occurring characters

Value  Count  Frequency (%)
*      2100   33.3%
1      536    8.5%
5      462    7.3%
9      436    6.9%
0      435    6.9%
2      432    6.9%
6      411    6.5%
3      403    6.4%
4      374    5.9%
8      362    5.7%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     4200   66.7%
Other Punctuation  2100   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      536    12.8%
5      462    11.0%
9      436    10.4%
0      435    10.4%
2      432    10.3%
6      411    9.8%
3      403    9.6%
4      374    8.9%
8      362    8.6%
7      349    8.3%
Other Punctuation
Value  Count  Frequency (%)
*      2100   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6300   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      2100   33.3%
1      536    8.5%
5      462    7.3%
9      436    6.9%
0      435    6.9%
2      432    6.9%
6      411    6.5%
3      403    6.4%
4      374    5.9%
8      362    5.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6300   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      2100   33.3%
1      536    8.5%
5      462    7.3%
9      436    6.9%
0      435    6.9%
2      432    6.9%
6      411    6.5%
3      403    6.4%
4      374    5.9%
8      362    5.7%

삭제시간 (deletion time)
Date

UNIQUE 

Distinct: 1050
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
Minimum: 2023-01-02 14:28:31
Maximum: 2023-12-29 17:36:23
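The minimum and maximum bound the period the records cover. A small sketch of that calculation with `datetime`, assuming the timestamps use the `YYYY-MM-DD HH:MM:SS` layout shown here:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # layout assumed from the values shown in the report

earliest = datetime.strptime("2023-01-02 14:28:31", FMT)
latest = datetime.strptime("2023-12-29 17:36:23", FMT)
span = latest - earliest

print(span)       # 361 days, 3:07:52
print(span.days)  # 361
```

The records therefore span just under one calendar year.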
Histogram with fixed size bins (bins=50)

변경전계량기번호 (meter number before change)
Text

Distinct: 1030
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 15
Median length: 8
Mean length: 8.1685714
Min length: 1

Characters and Unicode

Total characters: 8577
Distinct characters: 28
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1010
Unique (%): 96.2%

Sample

1st row: 2715023
2nd row: D21437461
3rd row: 62600883
4th row: 11742272
5th row: 0

Value                 Count  Frequency (%)
50504109              2      0.2%
20544059              2      0.2%
20634763              2      0.2%
30506844              2      0.2%
20528183              2      0.2%
50609577              2      0.2%
d11300160             2      0.2%
d31118990             2      0.2%
30921234              2      0.2%
d11300150             2      0.2%
Other values (1018)   1030   98.1%

Most occurring characters

Value              Count  Frequency (%)
0                  1811   21.1%
1                  1039   12.1%
5                  938    10.9%
2                  928    10.8%
3                  859    10.0%
4                  675    7.9%
6                  635    7.4%
7                  486    5.7%
8                  483    5.6%
9                  395    4.6%
Other values (18)  328    3.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     8249   96.2%
Uppercase Letter   293    3.4%
Other Letter       24     0.3%
Dash Punctuation   10     0.1%
Other Punctuation  1      < 0.1%

Most frequent character per category

Other Letter
Value             Count  Frequency (%)
(not preserved)   9      37.5%
(not preserved)   3      12.5%
(not preserved)   2      8.3%
(not preserved)   2      8.3%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
Other values (2)  2      8.3%
Decimal Number
Value  Count  Frequency (%)
0      1811   22.0%
1      1039   12.6%
5      938    11.4%
2      928    11.2%
3      859    10.4%
4      675    8.2%
6      635    7.7%
7      486    5.9%
8      483    5.9%
9      395    4.8%
Uppercase Letter
Value  Count  Frequency (%)
D      230    78.5%
B      31     10.6%
E      31     10.6%
A      1      0.3%
Dash Punctuation
Value  Count  Frequency (%)
-      10     100.0%
Other Punctuation
Value  Count  Frequency (%)
.      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8260   96.3%
Latin   293    3.4%
Hangul  24     0.3%

Most frequent character per script

Common
Value             Count  Frequency (%)
0                 1811   21.9%
1                 1039   12.6%
5                 938    11.4%
2                 928    11.2%
3                 859    10.4%
4                 675    8.2%
6                 635    7.7%
7                 486    5.9%
8                 483    5.8%
9                 395    4.8%
Other values (2)  11     0.1%
Hangul
Value             Count  Frequency (%)
(not preserved)   9      37.5%
(not preserved)   3      12.5%
(not preserved)   2      8.3%
(not preserved)   2      8.3%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
(not preserved)   1      4.2%
Other values (2)  2      8.3%
Latin
Value  Count  Frequency (%)
D      230    78.5%
B      31     10.6%
E      31     10.6%
A      1      0.3%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        8553   99.7%
Hangul       15     0.2%
Compat Jamo  9      0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 1811   21.2%
1                 1039   12.1%
5                 938    11.0%
2                 928    10.8%
3                 859    10.0%
4                 675    7.9%
6                 635    7.4%
7                 486    5.7%
8                 483    5.6%
9                 395    4.6%
Other values (6)  304    3.6%
Compat Jamo
Value            Count  Frequency (%)
(not preserved)  9      100.0%
Hangul
Value            Count  Frequency (%)
(not preserved)  3      20.0%
(not preserved)  2      13.3%
(not preserved)  2      13.3%
(not preserved)  1      6.7%
(not preserved)  1      6.7%
(not preserved)  1      6.7%
(not preserved)  1      6.7%
(not preserved)  1      6.7%
(not preserved)  1      6.7%
(not preserved)  1      6.7%

변경후계량기번호 (meter number after change)
Text

Distinct: 1033
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 15
Median length: 8
Mean length: 8.2466667
Min length: 1

Characters and Unicode

Total characters: 8659
Distinct characters: 22
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1016
Unique (%): 96.8%

Sample

1st row: 27115023
2nd row: D21437407
3rd row: 60620883
4th row: D11742272
5th row: 92800141

Value                 Count  Frequency (%)
d11300160             2      0.2%
d11300150             2      0.2%
30506844              2      0.2%
20634764              2      0.2%
d01100665             2      0.2%
50609597              2      0.2%
8888888               2      0.2%
70045836              2      0.2%
06-073809             2      0.2%
20004825              2      0.2%
Other values (1020)   1030   98.1%

Most occurring characters

Value              Count  Frequency (%)
0                  1845   21.3%
1                  1019   11.8%
5                  972    11.2%
2                  966    11.2%
3                  834    9.6%
4                  679    7.8%
6                  636    7.3%
8                  506    5.8%
7                  473    5.5%
9                  393    4.5%
Other values (12)  336    3.9%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    8323   96.1%
Uppercase Letter  304    3.5%
Dash Punctuation  21     0.2%
Other Letter      7      0.1%
Lowercase Letter  4      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1845   22.2%
1      1019   12.2%
5      972    11.7%
2      966    11.6%
3      834    10.0%
4      679    8.2%
6      636    7.6%
8      506    6.1%
7      473    5.7%
9      393    4.7%
Other Letter
Value            Count  Frequency (%)
(not preserved)  2      28.6%
(not preserved)  2      28.6%
(not preserved)  1      14.3%
(not preserved)  1      14.3%
(not preserved)  1      14.3%
Uppercase Letter
Value  Count  Frequency (%)
D      248    81.6%
B      27     8.9%
E      26     8.6%
A      3      1.0%
Lowercase Letter
Value  Count  Frequency (%)
e      2      50.0%
b      2      50.0%
Dash Punctuation
Value  Count  Frequency (%)
-      21     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8344   96.4%
Latin   308    3.6%
Hangul  7      0.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1845   22.1%
1      1019   12.2%
5      972    11.6%
2      966    11.6%
3      834    10.0%
4      679    8.1%
6      636    7.6%
8      506    6.1%
7      473    5.7%
9      393    4.7%
Latin
Value  Count  Frequency (%)
D      248    80.5%
B      27     8.8%
E      26     8.4%
A      3      1.0%
e      2      0.6%
b      2      0.6%
Hangul
Value            Count  Frequency (%)
(not preserved)  2      28.6%
(not preserved)  2      28.6%
(not preserved)  1      14.3%
(not preserved)  1      14.3%
(not preserved)  1      14.3%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        8652   99.9%
Hangul       5      0.1%
Compat Jamo  2      < 0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 1845   21.3%
1                 1019   11.8%
5                 972    11.2%
2                 966    11.2%
3                 834    9.6%
4                 679    7.8%
6                 636    7.4%
8                 506    5.8%
7                 473    5.5%
9                 393    4.5%
Other values (7)  329    3.8%
Hangul
Value            Count  Frequency (%)
(not preserved)  2      40.0%
(not preserved)  1      20.0%
(not preserved)  1      20.0%
(not preserved)  1      20.0%
Compat Jamo
Value            Count  Frequency (%)
(not preserved)  2      100.0%

변경사유 (reason for change)
Text

Distinct: 93
Distinct (%): 8.9%
Missing: 10
Missing (%): 1.0%
Memory size: 8.3 KiB

Length

Max length: 34
Median length: 2
Mean length: 4.1538462
Min length: 1

Characters and Unicode

Total characters: 4320
Distinct characters: 120
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 60
Unique (%): 5.8%

Sample

1st row: 오기입 (incorrect entry)
2nd row: 오기 (misentry)
3rd row: 계량기번호 오입력 (meter number entered incorrectly)
4th row: "D" 누락 ("D" omitted)
5th row: 타사업소 전산착오로 이중입력 확인후 정정 (corrected after confirming a duplicate entry caused by another office's computer error)

Value                            Count  Frequency (%)
오기 (misentry)                  366    24.8%
정정 (correction)                139    9.4%
오기입 (incorrect entry)         125    8.5%
기물번호 (device number)         107    7.3%
기입 (entry)                     97     6.6%
착오 (mistake)                   96     6.5%
오타 (typo)                      87     5.9%
착오입력 (erroneous input)       37     2.5%
수정 (revision)                  36     2.4%
입력 (input)                     32     2.2%
Other values (96)                351    23.8%
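The counts in this table add up to 1,473, which is more than the 1,040 non-missing rows, so the table appears to count tokens within values and not whole values. The sketch below shows one way such counts could arise, assuming plain whitespace tokenization; the profiler's actual tokenizer may differ (for example in how it handles case and punctuation), and `token_counts` is a name invented for this example.

```python
from collections import Counter


def token_counts(values):
    """Count whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.split())
    return counts


# The five sample values listed for this variable.
samples = [
    "오기입",
    "오기",
    "계량기번호 오입력",
    '"D" 누락',
    "타사업소 전산착오로 이중입력 확인후 정정",
]
counts = token_counts(samples)
n_tokens = sum(counts.values())
print(n_tokens)  # 11 tokens from 5 values
```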

Most occurring characters

Value               Count  Frequency (%)
(not preserved)     815    18.9%
(not preserved)     783    18.1%
(space)             434    10.0%
(not preserved)     341    7.9%
(not preserved)     330    7.6%
(not preserved)     158    3.7%
(not preserved)     155    3.6%
(not preserved)     149    3.4%
(not preserved)     140    3.2%
(not preserved)     102    2.4%
Other values (110)  913    21.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       3737   86.5%
Space Separator    434    10.0%
Decimal Number     59     1.4%
Lowercase Letter   45     1.0%
Other Punctuation  29     0.7%
Close Punctuation  5      0.1%
Open Punctuation   5      0.1%
Uppercase Letter   3      0.1%
Dash Punctuation   3      0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    815    21.8%
(not preserved)    783    21.0%
(not preserved)    341    9.1%
(not preserved)    330    8.8%
(not preserved)    158    4.2%
(not preserved)    155    4.1%
(not preserved)    149    4.0%
(not preserved)    140    3.7%
(not preserved)    102    2.7%
(not preserved)    91     2.4%
Other values (84)  673    18.0%
Decimal Number
Value  Count  Frequency (%)
1      14     23.7%
0      13     22.0%
8      7      11.9%
6      6      10.2%
4      5      8.5%
2      5      8.5%
3      4      6.8%
9      3      5.1%
7      2      3.4%
Lowercase Letter
Value  Count  Frequency (%)
r      11     24.4%
l      10     22.2%
h      10     22.2%
d      10     22.2%
k      1      2.2%
f      1      2.2%
n      1      2.2%
s      1      2.2%
Other Punctuation
Value            Count  Frequency (%)
.                25     86.2%
"                2      6.9%
(not preserved)  1      3.4%
,                1      3.4%
Space Separator
Value    Count  Frequency (%)
(space)  434    100.0%
Close Punctuation
Value  Count  Frequency (%)
)      5      100.0%
Open Punctuation
Value  Count  Frequency (%)
(      5      100.0%
Uppercase Letter
Value  Count  Frequency (%)
D      3      100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      3      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3737   86.5%
Common  535    12.4%
Latin   48     1.1%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not preserved)    815    21.8%
(not preserved)    783    21.0%
(not preserved)    341    9.1%
(not preserved)    330    8.8%
(not preserved)    158    4.2%
(not preserved)    155    4.1%
(not preserved)    149    4.0%
(not preserved)    140    3.7%
(not preserved)    102    2.7%
(not preserved)    91     2.4%
Other values (84)  673    18.0%
Common
Value             Count  Frequency (%)
(space)           434    81.1%
.                 25     4.7%
1                 14     2.6%
0                 13     2.4%
8                 7      1.3%
6                 6      1.1%
)                 5      0.9%
(                 5      0.9%
4                 5      0.9%
2                 5      0.9%
Other values (7)  16     3.0%
Latin
Value  Count  Frequency (%)
r      11     22.9%
l      10     20.8%
h      10     20.8%
d      10     20.8%
D      3      6.2%
k      1      2.1%
f      1      2.1%
n      1      2.1%
s      1      2.1%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       3732   86.4%
ASCII        582    13.5%
Compat Jamo  5      0.1%
None         1      < 0.1%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(not preserved)    815    21.8%
(not preserved)    783    21.0%
(not preserved)    341    9.1%
(not preserved)    330    8.8%
(not preserved)    158    4.2%
(not preserved)    155    4.2%
(not preserved)    149    4.0%
(not preserved)    140    3.8%
(not preserved)    102    2.7%
(not preserved)    91     2.4%
Other values (81)  668    17.9%
ASCII
Value              Count  Frequency (%)
(space)            434    74.6%
.                  25     4.3%
1                  14     2.4%
0                  13     2.2%
r                  11     1.9%
l                  10     1.7%
h                  10     1.7%
d                  10     1.7%
8                  7      1.2%
6                  6      1.0%
Other values (15)  42     7.2%
Compat Jamo
Value            Count  Frequency (%)
(not preserved)  2      40.0%
(not preserved)  2      40.0%
(not preserved)  1      20.0%
None
Value            Count  Frequency (%)
(not preserved)  1      100.0%

Interactions


Correlations

            연번     변경사유
연번        1.000    0.387
변경사유    0.387    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
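The per-column missing counts behind these charts are simple tallies. A minimal sketch, using hypothetical rows (the second row's missing 변경사유 is invented for illustration) and treating `None` and the empty string as missing, which is an assumption about how the source file encodes gaps:

```python
def missing_per_column(rows, columns):
    """Count rows where each column is absent, None, or an empty string."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }


# Hypothetical rows, not taken from the dataset.
rows = [
    {"연번": 1, "변경사유": "오기입"},
    {"연번": 2, "변경사유": None},
    {"연번": 3, "변경사유": "오기"},
]
result = missing_per_column(rows, ["연번", "변경사유"])
print(result)  # {'연번': 0, '변경사유': 1}
```

In the profiled data the same tally gives 10 missing values for 변경사유 and none for the other five columns.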

Sample

First rows

      연번  고객번호  삭제시간             변경전계량기번호  변경후계량기번호  변경사유
0     1     *78*57    2023-01-02 14:28:31  2715023           27115023          오기입
1     2     *50*13    2023-01-11 17:18:17  D21437461         D21437407         오기
2     3     *05*71    2023-01-12 14:06:56  62600883          60620883          계량기번호 오입력
3     4     *62*04    2023-01-13 15:58:00  11742272          D11742272         "D" 누락
4     5     *38*14    2023-01-17 15:35:55  0                 92800141          타사업소 전산착오로 이중입력 확인후 정정
5     6     *13*12    2023-01-30 15:48:08  9054111           90540111          오기
6     7     *53*44    2023-02-03 09:00:35  15800301          15800303          오기
7     8     *92*26    2023-02-08 15:31:22  D11766944         D11746944         오기
8     9     *91*99    2023-02-08 15:32:30  D12400409         D12404092         오기
9     10    *10*84    2023-02-08 15:46:22  2051266           20512626          오기입

Last rows

      연번  고객번호  삭제시간             변경전계량기번호  변경후계량기번호  변경사유
1040  1041  *05*15    2023-11-05 16:59:57  50513292          50513285          입력오류
1041  1042  *81*84    2023-11-08 09:52:30  30925876          30925875          오기
1042  1043  *82*39    2023-11-08 09:55:29  30926662          30926663          오기
1043  1044  *99*72    2023-11-13 11:10:20  D15400911         D12503154         오기입
1044  1045  *56*12    2023-11-14 14:28:18  56600377          56600733          계량기번호상이.
1045  1046  *03*95    2023-11-17 16:20:48  50512484          50512487          기록오류
1046  1047  *32*50    2023-11-21 10:54:40  3092084           30929084          누락
1047  1048  *75*19    2023-11-22 10:18:43  D31111860         D31111186         오기
1048  1049  *07*46    2023-11-28 09:10:07  60527063          60527053          오타
1049  1050  *00*78    2023-12-26 21:11:28  06-8905           06-008095         입력 오류