Overview

Dataset statistics

Number of variables: 4
Number of observations: 1493
Missing cells: 115
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 48.2 KiB
Average record size in memory: 33.1 B
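
The derived figures in this block can be recomputed from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 1493
missing_cells = 115
total_size_kib = 48.2

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"about {avg_record_bytes:.1f} B per record")
```

Both results agree with the reported 1.9% and 33.1 B after rounding to one decimal place.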

Variable types

Numeric: 1
Text: 3

Dataset

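Description: Busan Metropolitan City Waterworks Headquarters_Customer Information System_Meter Information_Meter Number Replacement Information_20220620
Author: Busan Metropolitan City Waterworks Headquarters
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15101349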

Alerts

변경사유 (reason for change) has 115 (7.7%) missing values [Missing]
연번 (serial number) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 16:32:27.449175
Analysis finished: 2023-12-10 16:32:27.937961
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 1493
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 747
Minimum: 1
Maximum: 1493
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 75.6
Q1: 374
Median: 747
Q3: 1120
95-th percentile: 1418.4
Maximum: 1493
Range: 1492
Interquartile range (IQR): 746
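
The quantile values above are consistent with linear interpolation between neighbouring sorted values. The sketch below is a minimal illustration, assuming the column holds exactly the integers 1 to 1493 (which matches the reported minimum, maximum, distinct count and sum); `percentile_linear` is a hypothetical helper, not a function of the profiling library.

```python
def percentile_linear(sorted_values, q):
    """Return the q-th percentile (0 <= q <= 100) by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: the serial-number column is the sequence 1..1493.
serial_numbers = list(range(1, 1494))

for q in (5, 25, 50, 75, 95):
    print(q, round(percentile_linear(serial_numbers, q), 1))

# Interquartile range = Q3 - Q1.
iqr = percentile_linear(serial_numbers, 75) - percentile_linear(serial_numbers, 25)
print("IQR", iqr)
```

Under that assumption the sketch gives 75.6, 374, 747, 1120 and 1418.4 for the five percentiles, and 746 for the interquartile range.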

Descriptive statistics

Standard deviation: 431.13629
Coefficient of variation (CV): 0.57715701
Kurtosis: -1.2
Mean: 747
Median Absolute Deviation (MAD): 373
Skewness: 0
Sum: 1115271
Variance: 185878.5
Monotonicity: Strictly increasing
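
Most of these descriptive statistics can be cross-checked with the Python standard library. This is a minimal sketch, again assuming the column is the integer sequence 1 to 1493 and that the variance uses the sample (n - 1) denominator; neither assumption is stated in the report, but both reproduce its figures.

```python
import statistics

# Assumption: the serial-number column is the sequence 1..1493.
serial_numbers = list(range(1, 1494))

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(serial_numbers)      # square root of the sample variance
cv = std_dev / mean                             # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(serial_numbers)
mad = statistics.median(abs(x - median) for x in serial_numbers)

print(total, mean, variance, round(std_dev, 5), round(cv, 8), mad)
```

Skewness of 0 and kurtosis near -1.2 are also what one would expect from evenly spaced values, since the distribution is symmetric and flat.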
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1                    1      0.1%
994                  1      0.1%
1003                 1      0.1%
1002                 1      0.1%
1001                 1      0.1%
1000                 1      0.1%
999                  1      0.1%
998                  1      0.1%
997                  1      0.1%
996                  1      0.1%
Other values (1483)  1483   99.3%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
1493   1      0.1%
1492   1      0.1%
1491   1      0.1%
1490   1      0.1%
1489   1      0.1%
1488   1      0.1%
1487   1      0.1%
1486   1      0.1%
1485   1      0.1%
1484   1      0.1%

변경전계량기번호 (meter number before change)
Text

Distinct: 1438
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 13
Median length: 12
Mean length: 8.5150703
Min length: 1

Characters and Unicode

Total characters: 12713
Distinct characters: 20
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
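
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The input is limited to the five sample rows reported for this variable, so the counts do not match the full-column totals; the `CATEGORY_NAMES` mapping is an illustrative subset that covers only the categories occurring in those five values.

```python
import unicodedata
from collections import Counter

# The five sample values reported for this variable.
sample_values = ["60526936", "522742", "D-011100596", "563445", "9380005"]

# Illustrative subset: short category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}

# Count the general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_values for ch in value
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```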

Unique

Unique: 1402
Unique (%): 93.9%

Sample

1st row: 60526936
2nd row: 522742
3rd row: D-011100596
4th row: 563445
5th row: 9380005
Value                Count  Frequency (%)
60620636             7      0.5%
신신                 6      0.4%
40514931             5      0.3%
0                    5      0.3%
548199               3      0.2%
563541               3      0.2%
d-00100501           3      0.2%
556069               3      0.2%
30514531             2      0.1%
2600822              2      0.1%
Other values (1427)  1454   97.4%

Most occurring characters

Value              Count  Frequency (%)
0                  2374   18.7%
1                  2146   16.9%
5                  1255   9.9%
3                  1131   8.9%
2                  975    7.7%
4                  928    7.3%
6                  908    7.1%
7                  698    5.5%
8                  655    5.2%
-                  607    4.8%
Other values (10)  1036   8.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     11602  91.3%
Dash Punctuation   607    4.8%
Uppercase Letter   486    3.8%
Other Letter       14     0.1%
Other Punctuation  3      < 0.1%
Math Symbol        1      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2374   20.5%
1      2146   18.5%
5      1255   10.8%
3      1131   9.7%
2      975    8.4%
4      928    8.0%
6      908    7.8%
7      698    6.0%
8      655    5.6%
9      532    4.6%

Uppercase Letter
Value  Count  Frequency (%)
D      483    99.4%
C      1      0.2%
I      1      0.2%
A      1      0.2%

Other Letter
Value            Count  Frequency (%)
(not preserved)  12     85.7%
(not preserved)  2      14.3%

Other Punctuation
Value  Count  Frequency (%)
'      2      66.7%
.      1      33.3%

Dash Punctuation
Value  Count  Frequency (%)
-      607    100.0%

Math Symbol
Value  Count  Frequency (%)
+      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  12213  96.1%
Latin   486    3.8%
Hangul  14     0.1%

Most frequent character per script

Common
Value             Count  Frequency (%)
0                 2374   19.4%
1                 2146   17.6%
5                 1255   10.3%
3                 1131   9.3%
2                 975    8.0%
4                 928    7.6%
6                 908    7.4%
7                 698    5.7%
8                 655    5.4%
-                 607    5.0%
Other values (4)  536    4.4%

Latin
Value  Count  Frequency (%)
D      483    99.4%
C      1      0.2%
I      1      0.2%
A      1      0.2%

Hangul
Value            Count  Frequency (%)
(not preserved)  12     85.7%
(not preserved)  2      14.3%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        12699  99.9%
Hangul       12     0.1%
Compat Jamo  2      < 0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 2374   18.7%
1                 2146   16.9%
5                 1255   9.9%
3                 1131   8.9%
2                 975    7.7%
4                 928    7.3%
6                 908    7.2%
7                 698    5.5%
8                 655    5.2%
-                 607    4.8%
Other values (8)  1022   8.0%

Hangul
Value            Count  Frequency (%)
(not preserved)  12     100.0%

Compat Jamo
Value            Count  Frequency (%)
(not preserved)  2      100.0%

변경후계량기번호 (meter number after change)
Text

Distinct: 1451
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 11.8 KiB

Length

Max length: 13
Median length: 12
Mean length: 8.7809779
Min length: 1

Characters and Unicode

Total characters: 13110
Distinct characters: 22
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1418
Unique (%): 95.0%

Sample

1st row: 60526536
2nd row: 522472
3rd row: D-01100596
4th row: 563448
5th row: 93800005
Value                Count  Frequency (%)
60620636             7      0.5%
40514931             5      0.3%
548199               3      0.2%
d-11300007           2      0.1%
d-10308288           2      0.1%
30524061             2      0.1%
19-150400            2      0.1%
d-15-01100359        2      0.1%
20688275             2      0.1%
523213               2      0.1%
Other values (1440)  1464   98.1%

Most occurring characters

Value              Count  Frequency (%)
0                  2370   18.1%
1                  2120   16.2%
5                  1261   9.6%
3                  1139   8.7%
2                  1007   7.7%
6                  942    7.2%
4                  925    7.1%
-                  748    5.7%
7                  706    5.4%
8                  684    5.2%
Other values (12)  1208   9.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     11685  89.1%
Dash Punctuation   748    5.7%
Uppercase Letter   665    5.1%
Other Punctuation  6      < 0.1%
Other Letter       4      < 0.1%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2370   20.3%
1      2120   18.1%
5      1261   10.8%
3      1139   9.7%
2      1007   8.6%
6      942    8.1%
4      925    7.9%
7      706    6.0%
8      684    5.9%
9      531    4.5%

Uppercase Letter
Value  Count  Frequency (%)
D      660    99.2%
B      2      0.3%
A      2      0.3%
C      1      0.2%

Other Letter
Value            Count  Frequency (%)
(not preserved)  2      50.0%
(not preserved)  1      25.0%
(not preserved)  1      25.0%

Other Punctuation
Value  Count  Frequency (%)
'      3      50.0%
.      3      50.0%

Dash Punctuation
Value  Count  Frequency (%)
-      748    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  12441  94.9%
Latin   665    5.1%
Hangul  4      < 0.1%

Most frequent character per script

Common
Value             Count  Frequency (%)
0                 2370   19.0%
1                 2120   17.0%
5                 1261   10.1%
3                 1139   9.2%
2                 1007   8.1%
6                 942    7.6%
4                 925    7.4%
-                 748    6.0%
7                 706    5.7%
8                 684    5.5%
Other values (5)  539    4.3%

Latin
Value  Count  Frequency (%)
D      660    99.2%
B      2      0.3%
A      2      0.3%
C      1      0.2%

Hangul
Value            Count  Frequency (%)
(not preserved)  2      50.0%
(not preserved)  1      25.0%
(not preserved)  1      25.0%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        13106  > 99.9%
Compat Jamo  2      < 0.1%
Hangul       2      < 0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 2370   18.1%
1                 2120   16.2%
5                 1261   9.6%
3                 1139   8.7%
2                 1007   7.7%
6                 942    7.2%
4                 925    7.1%
-                 748    5.7%
7                 706    5.4%
8                 684    5.2%
Other values (9)  1204   9.2%

Compat Jamo
Value            Count  Frequency (%)
(not preserved)  2      100.0%

Hangul
Value            Count  Frequency (%)
(not preserved)  1      50.0%
(not preserved)  1      50.0%

변경사유 (reason for change)
Text

MISSING

Distinct: 131
Distinct (%): 9.5%
Missing: 115
Missing (%): 7.7%
Memory size: 11.8 KiB

Length

Max length: 53
Median length: 2
Mean length: 3.9550073
Min length: 1

Characters and Unicode

Total characters: 5450
Distinct characters: 141
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 6.0%
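
The difference between "Distinct" (number of different values) and "Unique" (values that occur exactly once) can be sketched with a counter. The toy column below is hypothetical; the closing arithmetic uses figures from this report and suggests that the percentages and the mean length for this variable are taken over the 1378 non-missing rows (1493 minus 115) rather than over all rows.

```python
from collections import Counter

# Hypothetical toy column; None marks a missing value.
reasons = ["오기", "오기", "수정", "오타", None, "수정", "입력착오"]

present = [r for r in reasons if r is not None]
counts = Counter(present)

distinct = len(counts)                              # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
print(distinct, unique, len(present))

# Figures reported for this variable.
non_missing = 1493 - 115
distinct_pct = 100 * 131 / non_missing
unique_pct = 100 * 82 / non_missing
mean_length = 5450 / non_missing  # total characters / non-missing rows
print(round(distinct_pct, 1), round(unique_pct, 1), round(mean_length, 7))
```

Dividing by all 1493 rows instead would give about 8.8% distinct, which does not match the reported 9.5%.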

Sample

1st row: 계량기번호 착오기재 (meter number entered in error)
2nd row: 오기 (miswriting)
3rd row: 오기 (miswriting)
4th row: 오기 (miswriting)
5th row: 입력착오 (input mistake)
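Value                        Count  Frequency (%)
오기 (miswriting)            470    25.0%
수정 (correction)            286    15.2%
오타 (typo)                  100    5.3%
기물번호 (device number)     90     4.8%
착오기재 (entered in error)  87     4.6%
입력오기 (input miswriting)  79     4.2%
확인 (confirmed)             76     4.0%
입력오류 (input error)       67     3.6%
(not preserved)              46     2.5%
오류 (error)                 40     2.1%
Other values (152)           536    28.6%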

Most occurring characters

Value               Count  Frequency (%)
(not preserved)     923    16.9%
(not preserved)     856    15.7%
(space)             499    9.2%
(not preserved)     310    5.7%
(not preserved)     306    5.6%
(not preserved)     220    4.0%
(not preserved)     209    3.8%
(not preserved)     151    2.8%
(not preserved)     124    2.3%
(not preserved)     119    2.2%
Other values (131)  1733   31.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4508   82.7%
Space Separator    499    9.2%
Decimal Number     309    5.7%
Other Punctuation  92     1.7%
Lowercase Letter   11     0.2%
Open Punctuation   9      0.2%
Close Punctuation  9      0.2%
Dash Punctuation   7      0.1%
Math Symbol        6      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     923    20.5%
(not preserved)     856    19.0%
(not preserved)     310    6.9%
(not preserved)     306    6.8%
(not preserved)     220    4.9%
(not preserved)     209    4.6%
(not preserved)     151    3.3%
(not preserved)     124    2.8%
(not preserved)     119    2.6%
(not preserved)     118    2.6%
Other values (108)  1172   26.0%

Decimal Number
Value  Count  Frequency (%)
2      67     21.7%
0      65     21.0%
1      55     17.8%
7      26     8.4%
6      23     7.4%
4      18     5.8%
8      15     4.9%
5      15     4.9%
9      14     4.5%
3      11     3.6%

Lowercase Letter
Value  Count  Frequency (%)
d      3      27.3%
m      2      18.2%
a      2      18.2%
s      2      18.2%
h      2      18.2%

Other Punctuation
Value  Count  Frequency (%)
.      91     98.9%
/      1      1.1%

Math Symbol
Value  Count  Frequency (%)
<      3      50.0%
>      3      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  499    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      9      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      9      100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      7      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4508   82.7%
Common  931    17.1%
Latin   11     0.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     923    20.5%
(not preserved)     856    19.0%
(not preserved)     310    6.9%
(not preserved)     306    6.8%
(not preserved)     220    4.9%
(not preserved)     209    4.6%
(not preserved)     151    3.3%
(not preserved)     124    2.8%
(not preserved)     119    2.6%
(not preserved)     118    2.6%
Other values (108)  1172   26.0%

Common
Value             Count  Frequency (%)
(space)           499    53.6%
.                 91     9.8%
2                 67     7.2%
0                 65     7.0%
1                 55     5.9%
7                 26     2.8%
6                 23     2.5%
4                 18     1.9%
8                 15     1.6%
5                 15     1.6%
Other values (8)  57     6.1%

Latin
Value  Count  Frequency (%)
d      3      27.3%
m      2      18.2%
a      2      18.2%
s      2      18.2%
h      2      18.2%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       4504   82.6%
ASCII        942    17.3%
Compat Jamo  4      0.1%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not preserved)     923    20.5%
(not preserved)     856    19.0%
(not preserved)     310    6.9%
(not preserved)     306    6.8%
(not preserved)     220    4.9%
(not preserved)     209    4.6%
(not preserved)     151    3.4%
(not preserved)     124    2.8%
(not preserved)     119    2.6%
(not preserved)     118    2.6%
Other values (107)  1168   25.9%

ASCII
Value              Count  Frequency (%)
(space)            499    53.0%
.                  91     9.7%
2                  67     7.1%
0                  65     6.9%
1                  55     5.8%
7                  26     2.8%
6                  23     2.4%
4                  18     1.9%
8                  15     1.6%
5                  15     1.6%
Other values (13)  68     7.2%

Compat Jamo
Value            Count  Frequency (%)
(not preserved)  4      100.0%

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

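Row index | 연번 (serial number) | 변경전계량기번호 (meter number before change) | 변경후계량기번호 (meter number after change) | 변경사유 (reason for change)
0 | 1 | 60526936 | 60526536 | 계량기번호 착오기재 (meter number entered in error)
1 | 2 | 522742 | 522472 | 오기 (miswriting)
2 | 3 | D-011100596 | D-01100596 | 오기 (miswriting)
3 | 4 | 563445 | 563448 | 오기 (miswriting)
4 | 5 | 9380005 | 93800005 | 입력착오 (input mistake)
5 | 6 | 50501737 | 50501563 | 착오기재 (entered in error)
6 | 7 | 563749 | 563747 | 7
7 | 8 | 19-150325 | 19-150330 | 수정 (correction)
8 | 9 | D-1501100772 | D-15-01100772 | 수정 (correction)
9 | 10 | 19-150010 | 18-150010 | 수정 (correction)

Row index | 연번 (serial number) | 변경전계량기번호 (meter number before change) | 변경후계량기번호 (meter number after change) | 변경사유 (reason for change)
1483 | 1484 | 30515279 | 30514279 | 4
1484 | 1485 | D-1185956 | D-11851956 | 오기 (miswriting)
1485 | 1486 | D-11891139 | D-11851139 | 오기 (miswriting)
1486 | 1487 | D-11744237 | D-11744235 | 오기 (miswriting)
1487 | 1488 | D-11743791 | D-11743781 | 오기 (miswriting)
1488 | 1489 | 6200288 | 6200290 | 90
1489 | 1490 | 11435953 | D-11435953 | 오기 (miswriting)
1490 | 1491 | 7440036 | 74400336 | 2021.12.3. 바로서비스 수압확인 시 기번 상이함 확인. (2021.12.3. Meter number found to differ during a Baro Service water-pressure check.)
1491 | 1492 | 768701740 | 76801740 | 오타 (typo)
1492 | 1493 | 60529634 | 60629634 | 계량기 철거 시 기번 상이함 확인. (Meter number found to differ when the meter was removed.)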
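
To see what kind of correction each sample row represents, the before and after meter numbers can be compared character by character. The sketch below is a minimal illustration using three pairs taken from the sample values reported for the two meter-number variables; `describe_change` is a hypothetical helper, not part of the dataset or of the profiling tool.

```python
def describe_change(before: str, after: str) -> str:
    """Summarise how a corrected meter number differs from the original."""
    if len(before) != len(after):
        # A character was added or dropped, so positions no longer line up.
        return f"length {len(before)} -> {len(after)}"
    differing = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    return f"{len(differing)} position(s) differ: {differing}"


# Before/after pairs from the 2nd, 3rd and 4th sample rows.
pairs = [
    ("522742", "522472"),
    ("D-011100596", "D-01100596"),
    ("563445", "563448"),
]

for before, after in pairs:
    print(before, "->", after, ":", describe_change(before, after))
```

Positions are zero-based, so a pair such as 522742 and 522472 is reported as two differing positions, which is the signature of two swapped digits.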