Overview

Dataset statistics

Number of variables: 6
Number of observations: 1083
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 52.0 KiB
Average record size in memory: 49.1 B

Variable types

Numeric: 1
Text: 4
DateTime: 1

Dataset

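Description: Busan Metropolitan City Waterworks Headquarters, Customer Information System, Meter Information, Meter Number Replacement Information, 20230125 (부산광역시상수도사업본부_수용가정보시스템_계량기정보_계량기번호교체정보_20230125)
Author: Busan Metropolitan City Waterworks Headquarters (부산광역시 상수도사업본부)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15101349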

Alerts

연번 (serial number) has unique values [Unique]
삭제시간 (deletion time) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 16:32:21.848448
Analysis finished: 2023-12-10 16:32:22.321376
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
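
The reported duration is the difference between the two timestamps above. A minimal check in Python, using only the values printed in this section (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 16:32:21.848448", FMT)
finished = datetime.strptime("2023-12-10 16:32:22.321376", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```

The result rounds to the 0.47 seconds shown in the report.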

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 1083
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 542
Minimum: 1
Maximum: 1083
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 55.1
Q1: 271.5
Median: 542
Q3: 812.5
95-th percentile: 1028.9
Maximum: 1083
Range: 1082
Interquartile range (IQR): 541

Descriptive statistics

Standard deviation: 312.77948
Coefficient of variation (CV): 0.5770839
Kurtosis: -1.2
Mean: 542
Median Absolute Deviation (MAD): 271
Skewness: 0
Sum: 586986
Variance: 97831
Monotonicity: Strictly increasing
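
These figures are what one would expect if 연번 is simply the integer sequence 1 to 1083, which the reported minimum, maximum, distinct count and strictly increasing monotonicity suggest but do not state outright. A minimal sketch under that assumption, using only Python's standard library (the variable names are illustrative, and this is not the profiler's own code):

```python
import statistics

# Assumption: 연번 is the integer sequence 1..1083 (not confirmed by the report itself).
values = list(range(1, 1084))

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation

# Quantiles with linear interpolation between data points ("inclusive" method).
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
iqr = q3 - q1

# Median absolute deviation: median of the absolute distances from the median.
mad = statistics.median([abs(v - median) for v in values])

print(f"sum={total} mean={mean} median={median} variance={variance}")
print(f"std={std_dev:.5f} cv={cv:.7f} mad={mad}")
print(f"p5={p5} q1={q1} q3={q3} p95={p95} iqr={iqr}")
```

Under this assumption the sketch agrees with the sum, mean, variance, quantiles and MAD listed above. Whether ydata-profiling uses the same interpolation convention internally is not confirmed here.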
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1                    1      0.1%
729                  1      0.1%
715                  1      0.1%
716                  1      0.1%
717                  1      0.1%
718                  1      0.1%
719                  1      0.1%
720                  1      0.1%
721                  1      0.1%
722                  1      0.1%
Other values (1073)  1073   99.1%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
1083   1      0.1%
1082   1      0.1%
1081   1      0.1%
1080   1      0.1%
1079   1      0.1%
1078   1      0.1%
1077   1      0.1%
1076   1      0.1%
1075   1      0.1%
1074   1      0.1%

고객번호 (customer number)
Text

Distinct: 937
Distinct (%): 86.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6498
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
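
As an illustration of such a property lookup, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values shown for this variable. The `CATEGORY_NAMES` mapping and variable names are illustrative, and this is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable in the report.
samples = ["*01*22", "*13*68", "*20*71", "*04*27", "*97*87"]

# Long names for the two-letter general-category codes that occur here
# (illustrative mapping covering only these two codes).
CATEGORY_NAMES = {"Nd": "Decimal Number", "Po": "Other Punctuation"}

# Count the general category of every character across all samples.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

total_chars = sum(category_counts.values())
for code, count in category_counts.most_common():
    name = CATEGORY_NAMES.get(code, code)
    print(f"{name}: {count} ({count / total_chars:.1%})")
```

Each sample holds four digits and two asterisks, so the two-to-one split between Decimal Number and Other Punctuation matches the proportions reported for the full column.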

Unique

Unique: 826
Unique (%): 76.3%

Sample

1st row: *01*22
2nd row: *13*68
3rd row: *20*71
4th row: *04*27
5th row: *97*87

Value               Count  Frequency (%)
75*95               8      0.7%
54*84               7      0.6%
39*49               6      0.6%
52*06               5      0.5%
55*55               4      0.4%
15*08               4      0.4%
01*54               3      0.3%
09*94               3      0.3%
82*53               3      0.3%
12*36               3      0.3%
Other values (927)  1037   95.8%

Most occurring characters

Value  Count  Frequency (%)
*      2166   33.3%
0      493    7.6%
1      479    7.4%
9      475    7.3%
3      455    7.0%
5      434    6.7%
4      432    6.6%
2      429    6.6%
8      396    6.1%
6      375    5.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     4332   66.7%
Other Punctuation  2166   33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      493    11.4%
1      479    11.1%
9      475    11.0%
3      455    10.5%
5      434    10.0%
4      432    10.0%
2      429    9.9%
8      396    9.1%
6      375    8.7%
7      364    8.4%

Other Punctuation
Value  Count  Frequency (%)
*      2166   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  6498   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      2166   33.3%
0      493    7.6%
1      479    7.4%
9      475    7.3%
3      455    7.0%
5      434    6.7%
4      432    6.6%
2      429    6.6%
8      396    6.1%
6      375    5.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  6498   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      2166   33.3%
0      493    7.6%
1      479    7.4%
9      475    7.3%
3      455    7.0%
5      434    6.7%
4      432    6.6%
2      429    6.6%
8      396    6.1%
6      375    5.8%

삭제시간 (deletion time)
Date

UNIQUE

Distinct: 1083
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB
Minimum: 2022-01-03 14:30:29
Maximum: 2022-12-30 10:12:27
Histogram with fixed size bins (bins=50)

변경전계량기번호 (meter number before change)
Text

Distinct: 1069
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 12
Median length: 11
Mean length: 8.4062789
Min length: 2

Characters and Unicode

Total characters: 9104
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1057
Unique (%): 97.6%

Sample

1st row: 50600507
2nd row: 12402318
3rd row: D1720701
4th row: 602000846
5th row: D11303786

Value                Count  Frequency (%)
12345678             3      0.3%
40522311             3      0.3%
d16300063            2      0.2%
40040720             2      0.2%
d12706329            2      0.2%
d15400447            2      0.2%
d16500442            2      0.2%
00553385             2      0.2%
d16500441            2      0.2%
09553388             2      0.2%
Other values (1059)  1061   98.0%

Most occurring characters

Value              Count  Frequency (%)
0                  1661   18.2%
1                  1640   18.0%
4                  935    10.3%
2                  863    9.5%
5                  803    8.8%
3                  704    7.7%
6                  600    6.6%
D                  526    5.8%
7                  511    5.6%
8                  489    5.4%
Other values (10)  372    4.1%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    8562   94.0%
Uppercase Letter  529    5.8%
Other Letter      8      0.1%
Dash Punctuation  5      0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1661   19.4%
1      1640   19.2%
4      935    10.9%
2      863    10.1%
5      803    9.4%
3      704    8.2%
6      600    7.0%
7      511    6.0%
8      489    5.7%
9      356    4.2%
Other Letter
ValueCountFrequency (%)
2
25.0%
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Uppercase Letter
Value  Count  Frequency (%)
D      526    99.4%
R      2      0.4%
A      1      0.2%
Dash Punctuation
Value  Count  Frequency (%)
-      5      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8567   94.1%
Latin   529    5.8%
Hangul  8      0.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1661   19.4%
1      1640   19.1%
4      935    10.9%
2      863    10.1%
5      803    9.4%
3      704    8.2%
6      600    7.0%
7      511    6.0%
8      489    5.7%
9      356    4.2%
Hangul
ValueCountFrequency (%)
2
25.0%
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Latin
Value  Count  Frequency (%)
D      526    99.4%
R      2      0.4%
A      1      0.2%

Most occurring blocks

Value        Count  Frequency (%)
ASCII        9096   99.9%
Hangul       6      0.1%
Compat Jamo  2      < 0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 1661   18.3%
1                 1640   18.0%
4                 935    10.3%
2                 863    9.5%
5                 803    8.8%
3                 704    7.7%
6                 600    6.6%
D                 526    5.8%
7                 511    5.6%
8                 489    5.4%
Other values (4)  364    4.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
Hangul
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

변경후계량기번호 (meter number after change)
Text

Distinct: 1071
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 12
Median length: 9
Mean length: 8.5355494
Min length: 4

Characters and Unicode

Total characters: 9244
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1061
Unique (%): 98.0%

Sample

1st row: 50620570
2nd row: D12402318
3rd row: D17200701
4th row: 60200846
5th row: D11303789

Value                Count  Frequency (%)
12345678             4      0.4%
02-324884            2      0.2%
20009044             2      0.2%
d16500442            2      0.2%
20515908             2      0.2%
d15300063            2      0.2%
00553385             2      0.2%
40040728             2      0.2%
09553388             2      0.2%
d16500441            2      0.2%
Other values (1061)  1061   98.0%

Most occurring characters

Value             Count  Frequency (%)
0                 1677   18.1%
1                 1626   17.6%
4                 932    10.1%
2                 857    9.3%
5                 831    9.0%
3                 711    7.7%
6                 638    6.9%
D                 554    6.0%
7                 546    5.9%
8                 503    5.4%
Other values (8)  369    4.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     8662   93.7%
Uppercase Letter   556    6.0%
Dash Punctuation   20     0.2%
Other Letter       4      < 0.1%
Other Punctuation  2      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1677   19.4%
1      1626   18.8%
4      932    10.8%
2      857    9.9%
5      831    9.6%
3      711    8.2%
6      638    7.4%
7      546    6.3%
8      503    5.8%
9      341    3.9%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Uppercase Letter
Value  Count  Frequency (%)
D      554    99.6%
A      2      0.4%
Dash Punctuation
Value  Count  Frequency (%)
-      20     100.0%
Other Punctuation
Value  Count  Frequency (%)
.      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8684   93.9%
Latin   556    6.0%
Hangul  4      < 0.1%

Most frequent character per script

Common
Value             Count  Frequency (%)
0                 1677   19.3%
1                 1626   18.7%
4                 932    10.7%
2                 857    9.9%
5                 831    9.6%
3                 711    8.2%
6                 638    7.3%
7                 546    6.3%
8                 503    5.8%
9                 341    3.9%
Other values (2)  22     0.3%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Latin
Value  Count  Frequency (%)
D      554    99.6%
A      2      0.4%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   9240   > 99.9%
Hangul  4      < 0.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 1677   18.1%
1                 1626   17.6%
4                 932    10.1%
2                 857    9.3%
5                 831    9.0%
3                 711    7.7%
6                 638    6.9%
D                 554    6.0%
7                 546    5.9%
8                 503    5.4%
Other values (4)  365    4.0%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

변경사유 (reason for change)
Text

Distinct: 142
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 35
Median length: 32
Mean length: 4.2594645
Min length: 1

Characters and Unicode

Total characters: 4613
Distinct characters: 135
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 93
Unique (%): 8.6%

Sample

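1st row: 오기 (clerical error)
2nd row: 입력오기 (data-entry error)
3rd row: 통보결과 오류 (error in notification result)
4th row: 오기 (clerical error)
5th row: 계량기번호입력오류 (meter number input error)
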
Value                                       Count  Frequency (%)
오기 (clerical error)                         355    25.8%
착오입력 (mistaken input)                       151    11.0%
오타 (typo)                                   76     5.5%
오기입 (erroneous entry)                        69     5.0%
입력오기 (data-entry error)                     58     4.2%
계량기번호입력오류 (meter number input error)        52     3.8%
입력오류 (input error)                          47     3.4%
수정 (correction)                             37     2.7%
입력시 (at input time)                          35     2.5%
최초 (initial)                                34     2.5%
Other values (152)                          461    33.5%

Most occurring characters

ValueCountFrequency (%)
889
19.3%
637
13.8%
464
 
10.1%
390
 
8.5%
292
 
6.3%
169
 
3.7%
141
 
3.1%
139
 
3.0%
127
 
2.8%
109
 
2.4%
Other values (125) 1256
27.2%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4000   86.7%
Space Separator    292    6.3%
Decimal Number     234    5.1%
Other Punctuation  58     1.3%
Open Punctuation   7      0.2%
Close Punctuation  7      0.2%
Dash Punctuation   6      0.1%
Lowercase Letter   6      0.1%
Uppercase Letter   2      < 0.1%
Math Symbol        1      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
889
22.2%
637
15.9%
464
11.6%
390
9.8%
169
 
4.2%
141
 
3.5%
139
 
3.5%
127
 
3.2%
109
 
2.7%
109
 
2.7%
Other values (101) 826
20.6%
Decimal Number
Value  Count  Frequency (%)
2      49     20.9%
0      44     18.8%
1      31     13.2%
7      24     10.3%
4      17     7.3%
5      16     6.8%
6      16     6.8%
3      13     5.6%
9      12     5.1%
8      12     5.1%
Lowercase Letter
Value  Count  Frequency (%)
m      2      33.3%
l      1      16.7%
r      1      16.7%
h      1      16.7%
d      1      16.7%
Other Punctuation
Value  Count  Frequency (%)
.      53     91.4%
,      3      5.2%
'      2      3.4%
Space Separator
Value    Count  Frequency (%)
(space)  292    100.0%
Open Punctuation
Value  Count  Frequency (%)
(      7      100.0%
Close Punctuation
Value  Count  Frequency (%)
)      7      100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      6      100.0%
Uppercase Letter
Value  Count  Frequency (%)
D      2      100.0%
Math Symbol
Value  Count  Frequency (%)
>      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4000   86.7%
Common  605    13.1%
Latin   8      0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
889
22.2%
637
15.9%
464
11.6%
390
9.8%
169
 
4.2%
141
 
3.5%
139
 
3.5%
127
 
3.2%
109
 
2.7%
109
 
2.7%
Other values (101) 826
20.6%
Common
Value             Count  Frequency (%)
(space)           292    48.3%
.                 53     8.8%
2                 49     8.1%
0                 44     7.3%
1                 31     5.1%
7                 24     4.0%
4                 17     2.8%
5                 16     2.6%
6                 16     2.6%
3                 13     2.1%
Other values (8)  50     8.3%
Latin
Value  Count  Frequency (%)
m      2      25.0%
D      2      25.0%
l      1      12.5%
r      1      12.5%
h      1      12.5%
d      1      12.5%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       3998   86.7%
ASCII        613    13.3%
Compat Jamo  2      < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
889
22.2%
637
15.9%
464
11.6%
390
9.8%
169
 
4.2%
141
 
3.5%
139
 
3.5%
127
 
3.2%
109
 
2.7%
109
 
2.7%
Other values (100) 824
20.6%
ASCII
Value              Count  Frequency (%)
(space)            292    47.6%
.                  53     8.6%
2                  49     8.0%
0                  44     7.2%
1                  31     5.1%
7                  24     3.9%
4                  17     2.8%
5                  16     2.6%
6                  16     2.6%
3                  13     2.1%
Other values (14)  58     9.5%
Compat Jamo
ValueCountFrequency (%)
2
100.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
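
To make the idea concrete, the sketch below builds a small nullity matrix and per-column missing counts in plain Python. The rows are hypothetical: they reuse a few column names and values from this report, but the `None` entry is invented purely for illustration, since the real dataset has no missing cells.

```python
# Hypothetical rows for illustration only; the real dataset reports 0 missing cells.
rows = [
    {"연번": 1, "고객번호": "*01*22", "변경사유": "오기"},
    {"연번": 2, "고객번호": "*13*68", "변경사유": "입력오기"},
    {"연번": 3, "고객번호": None, "변경사유": "오타"},  # invented gap, not in the source data
]
columns = ["연번", "고객번호", "변경사유"]

# Nullity matrix: one row per record, True where a value is present.
nullity_matrix = [[row[col] is not None for col in columns] for row in rows]

# Missing count per column, the quantity behind the "nullity by column" chart.
missing_per_column = {
    col: sum(1 for row in rows if row[col] is None) for col in columns
}

for flags in nullity_matrix:
    # Render present cells as '#' and missing cells as '.'.
    print("".join("#" if present else "." for present in flags))
print(missing_per_column)
```

For the actual dataset every cell would render as present, which is consistent with the 0 missing cells reported in the overview.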

Sample

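Columns: 연번 (serial number), 고객번호 (customer number), 삭제시간 (deletion time), 변경전계량기번호 (meter number before change), 변경후계량기번호 (meter number after change), 변경사유 (reason for change).

      연번    고객번호   삭제시간                 변경전계량기번호  변경후계량기번호  변경사유
0     1     *01*22  2022-01-14 16:13:48  50600507   50620570   오기 (clerical error)
1     2     *13*68  2022-01-18 16:54:42  12402318   D12402318  입력오기 (data-entry error)
2     3     *20*71  2022-02-14 16:52:54  D1720701   D17200701  통보결과 오류 (error in notification result)
3     4     *04*27  2022-02-15 17:17:26  602000846  60200846   오기 (clerical error)
4     5     *97*87  2022-02-22 10:23:50  D11303786  D11303789  계량기번호입력오류 (meter number input error)
5     6     *94*22  2022-02-22 10:34:53  100000389  10000389   오타 (typo)
6     7     *26*23  2022-02-22 10:45:30  D17600336  D17600366  계량기번호입력오류 (meter number input error)
7     8     *95*53  2022-02-22 10:58:31  D11300564  D13300564  계량기번호입력오류 (meter number input error)
8     9     *56*72  2022-02-22 15:58:25  D11303026  D11303025  2022.2.22. 현장확인결과 기물번호 다름. (2022.2.22. On-site check found a different device number.)
9     10    *03*39  2022-02-25 16:48:44  D11300896  D13400896  오기 (clerical error)

      연번    고객번호   삭제시간                 변경전계량기번호  변경후계량기번호  변경사유
1073  1074  *44*55  2022-11-07 13:03:25  16801427   16801425   오기입 (erroneous entry)
1074  1075  *57*58  2022-11-07 15:22:26  22801236   22801235   오기입 (erroneous entry)
1075  1076  *37*86  2022-11-09 10:37:44  40521249   11212313   오기 (clerical error)
1076  1077  *37*55  2022-11-10 09:32:06  40511663   40511563   오기 (clerical error)
1077  1078  *92*98  2022-11-10 09:49:47  7007497    70007497   오기입 (erroneous entry)
1078  1079  *44*98  2022-11-10 17:42:58  40500501   40500512   오기입 (erroneous entry)
1079  1080  *63*86  2022-11-16 17:28:13  20517549   20517542   계량기 번호 상이. (Meter number differs.)
1080  1081  *33*80  2022-11-18 10:44:46  26801027   26801432   오타 (typo)
1081  1082  *05*48  2022-11-18 15:27:10  20000286   20000486   오타 (typo)
1082  1083  *09*66  2022-12-16 12:02:04  50505930   50505980   담당자 입력 오류 (input error by the person in charge)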