Overview

Dataset statistics

Number of variables: 6
Number of observations: 462
Missing cells: 13
Missing cells (%): 0.5%
Duplicate rows: 20
Duplicate rows (%): 4.3%
Total size in memory: 22.2 KiB
Average record size in memory: 49.3 B
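
The two percentages can be rebuilt from the raw counts. The sketch below assumes that missing cells are measured against all cells (observations × variables) and duplicate rows against the number of observations. The report does not spell out these formulas, but they reproduce its figures. The variable names are illustrative only.

```python
# Counts copied from the dataset statistics above. The formulas are
# assumptions that happen to reproduce the reported percentages.
n_variables = 6
n_observations = 462
missing_cells = 13
duplicate_rows = 20

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```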

Variable types

Text: 2
DateTime: 1
Categorical: 2
Numeric: 1

Dataset

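Description: Provides computerized water-supply manhole records registered in the spatial information system of Muan-gun, Jeollanam-do, including map sheet number (도엽번호), installation date (설치일자), specification (규격), manhole type (맨홀종류), manhole shape (맨홀형태), and legal district (법정동), among others.
URL: https://www.data.go.kr/data/15041000/fileData.do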

Alerts

Dataset has 20 (4.3%) duplicate rows [Duplicates]
맨홀형태 is highly imbalanced (68.4%) [Imbalance]
규격 has 13 (2.8%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 21:58:39.414536
Analysis finished: 2023-12-12 21:58:39.850280
Duration: 0.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
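
The duration is the difference between the two timestamps. A small sketch using only the Python standard library, with the timestamps copied from above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 21:58:39.414536")
finished = datetime.fromisoformat("2023-12-12 21:58:39.850280")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```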

Variables

도엽번호
Text

Distinct: 280
Distinct (%): 60.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4620
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
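
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the sample value 346020503A listed for this variable. It covers categories only, since the standard library does not expose script or block properties, and it is not necessarily how the report computes its figures.

```python
import unicodedata
from collections import Counter

# One sample value from the report.
value = "346020503A"

# unicodedata.category returns the two-letter general category,
# e.g. "Nd" (Decimal Number) or "Lu" (Uppercase Letter).
category_counts = Counter(unicodedata.category(ch) for ch in value)

for category, count in category_counts.items():
    print(category, count)
```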

Unique

Unique: 200
Unique (%): 43.3%
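
"Distinct" and "Unique" are different counts: distinct is the number of different values, while unique is read here as the number of values that occur exactly once. A minimal sketch of that reading, using the 도엽번호 values of the first ten sample rows shown later in this report:

```python
from collections import Counter

# 도엽번호 values of the first ten rows in the report's sample.
values = [
    "346020503A", "346020503D", "346020503C", "356142594D", "356142595C",
    "356142596D", "356142596D", "346020516A", "346020516A", "346020517A",
]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

On the full column the same two counts are 280 and 200.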

Sample

1st row: 346020503A
2nd row: 346020503D
3rd row: 346020503C
4th row: 356142594D
5th row: 356142595C

Value               Count  Frequency (%)
346022097a             13   2.8%
346022087d             11   2.4%
346020512c             10   2.2%
346022097b              9   1.9%
346022096d              9   1.9%
346022096b              8   1.7%
346020526b              7   1.5%
346022097d              6   1.3%
346022097c              6   1.3%
346022088c              6   1.3%
Other values (270)    377  81.6%

Most occurring characters

Value             Count  Frequency (%)
0                   741  16.0%
2                   673  14.6%
3                   620  13.4%
6                   585  12.7%
4                   560  12.1%
1                   288   6.2%
5                   248   5.4%
7                   174   3.8%
8                   143   3.1%
9                   126   2.7%
Other values (4)    462  10.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number     4158  90.0%
Uppercase Letter    462  10.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0        741  17.8%
2        673  16.2%
3        620  14.9%
6        585  14.1%
4        560  13.5%
1        288   6.9%
5        248   6.0%
7        174   4.2%
8        143   3.4%
9        126   3.0%

Uppercase Letter
Value  Count  Frequency (%)
A        118  25.5%
D        116  25.1%
C        115  24.9%
B        113  24.5%

Most occurring scripts

Value   Count  Frequency (%)
Common   4158  90.0%
Latin     462  10.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0        741  17.8%
2        673  16.2%
3        620  14.9%
6        585  14.1%
4        560  13.5%
1        288   6.9%
5        248   6.0%
7        174   4.2%
8        143   3.4%
9        126   3.0%

Latin
Value  Count  Frequency (%)
A        118  25.5%
D        116  25.1%
C        115  24.9%
B        113  24.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII   4620  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                   741  16.0%
2                   673  14.6%
3                   620  13.4%
6                   585  12.7%
4                   560  12.1%
1                   288   6.2%
5                   248   5.4%
7                   174   3.8%
8                   143   3.1%
9                   126   2.7%
Other values (4)    462  10.0%

설치일자
DateTime

Distinct: 19
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB
Minimum: 1900-01-01 00:00:00
Maximum: 2020-01-01 00:00:00
Histogram with fixed size bins (bins=19)

규격
Text

MISSING 

Distinct: 264
Distinct (%): 58.8%
Missing: 13
Missing (%): 2.8%
Memory size: 3.7 KiB

Length

Max length: 14
Median length: 11
Mean length: 11.151448
Min length: 8

Characters and Unicode

Total characters: 5007
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 199
Unique (%): 44.3%

Sample

1st row: 1.3x1.3x404
2nd row: 1.5x2.7x1.4
3rd row: 2.0x2x2.72
4th row: 1.2x1.2x1.6
5th row: 1.2x1.2x4.0
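
The 규격 values look like dimensions joined by "x", with some entries carrying a diameter prefix (Ø) or an uppercase X. The report does not document the format, so the following is only a sketch under that assumption. `parse_spec` is a hypothetical helper, not part of the dataset or of ydata-profiling.

```python
import re

# Hypothetical helper: split a spec string such as "1.5x2.7x1.4" or
# "ø1200x1.2" into floats. Assumes parts are separated by "x" or "X"
# and that a leading "ø"/"Ø" marks a diameter.
def parse_spec(spec: str) -> dict:
    text = spec.strip()
    is_diameter = text[:1] in ("ø", "Ø")
    if is_diameter:
        text = text[1:]
    parts = [float(p) for p in re.split(r"[xX]", text)]
    return {"diameter": is_diameter, "values": parts}

print(parse_spec("1.5x2.7x1.4"))
print(parse_spec("ø1200x1.2"))
```

A value such as 1.3x1.3x404 parses without error, so a range check on the parsed numbers may be worth adding before any analysis.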

Value               Count  Frequency (%)
0.9x0.9x1.0            22   4.9%
1.2x1.2x1.2            16   3.6%
2.5x2.0x1.5            13   2.9%
3.5x2.5x1.5            13   2.9%
1.2x1.2x1.5            12   2.7%
ø1200x1.2              10   2.2%
1.2x1.2x1.6             7   1.6%
1.3x1.5x0.63            7   1.6%
1.2x2.0x1.2             6   1.3%
2.0x2.5x1.5             6   1.3%
Other values (252)    337  75.1%

Most occurring characters

Value             Count  Frequency (%)
.                  1289  25.7%
1                   890  17.8%
x                   866  17.3%
2                   523  10.4%
0                   416   8.3%
5                   404   8.1%
3                   226   4.5%
9                    89   1.8%
4                    80   1.6%
6                    73   1.5%
Other values (4)    151   3.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number      2820  56.3%
Other Punctuation   1289  25.7%
Lowercase Letter     866  17.3%
Uppercase Letter      32   0.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1        890  31.6%
2        523  18.5%
0        416  14.8%
5        404  14.3%
3        226   8.0%
9         89   3.2%
4         80   2.8%
6         73   2.6%
8         63   2.2%
7         56   2.0%

Uppercase Letter
Value  Count  Frequency (%)
Ø         26  81.2%
X          6  18.8%

Other Punctuation
Value  Count  Frequency (%)
.       1289  100.0%

Lowercase Letter
Value  Count  Frequency (%)
x        866  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common   4109  82.1%
Latin     898  17.9%

Most frequent character per script

Common
Value  Count  Frequency (%)
.       1289  31.4%
1        890  21.7%
2        523  12.7%
0        416  10.1%
5        404   9.8%
3        226   5.5%
9         89   2.2%
4         80   1.9%
6         73   1.8%
8         63   1.5%

Latin
Value  Count  Frequency (%)
x        866  96.4%
Ø         26   2.9%
X          6   0.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII   4981  99.5%
None      26   0.5%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
.                  1289  25.9%
1                   890  17.9%
x                   866  17.4%
2                   523  10.5%
0                   416   8.4%
5                   404   8.1%
3                   226   4.5%
9                    89   1.8%
4                    80   1.6%
6                    73   1.5%
Other values (3)    125   2.5%

None
Value  Count  Frequency (%)
Ø         26  100.0%

맨홀종류
Categorical

Distinct: 15
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value              Count
SOM999               153
SOM040               107
SOM012                39
SOM903                37
SOM914                31
Other values (10)     95

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SOM040
2nd row: SOM040
3rd row: SOM040
4th row: SOM040
5th row: SOM040

Common Values

Value             Count  Frequency (%)
SOM999              153  33.1%
SOM040              107  23.2%
SOM012               39   8.4%
SOM903               37   8.0%
SOM914               31   6.7%
SOM915               25   5.4%
SOM002               25   5.4%
SOM000               13   2.8%
SOM013                9   1.9%
SOM015                7   1.5%
Other values (5)     16   3.5%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
som999              153  33.1%
som040              107  23.2%
som012               39   8.4%
som903               37   8.0%
som914               31   6.7%
som915               25   5.4%
som002               25   5.4%
som000               13   2.8%
som013                9   1.9%
som015                7   1.5%
Other values (5)     16   3.5%

맨홀형태
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value   Count
MHS003    414
MHS001     26
MHS000     13
MHS005      9

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MHS005
2nd row: MHS005
3rd row: MHS005
4th row: MHS005
5th row: MHS005

Common Values

Value   Count  Frequency (%)
MHS003    414  89.6%
MHS001     26   5.6%
MHS000     13   2.8%
MHS005      9   1.9%
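
The report flags this variable as highly imbalanced (68.4%) without stating how the score is defined. One definition that reproduces the figure from the counts above is one minus the Shannon entropy of the category distribution divided by its maximum possible value, log2 of the number of categories. The sketch below works under that assumption and is not a confirmed description of the tool's internals.

```python
import math

# Category counts for 맨홀형태, copied from the Common Values table.
counts = {"MHS003": 414, "MHS001": 26, "MHS000": 13, "MHS005": 9}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy in bits, normalised by the maximum entropy log2(k).
entropy = -sum(p * math.log2(p) for p in probabilities)
imbalance = 1 - entropy / math.log2(len(counts))

print(f"{imbalance:.1%}")
```

A perfectly even split over the four categories would score 0% under this definition, and a single dominant category pushes the score toward 100%.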

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
mhs003    414  89.6%
mhs001     26   5.6%
mhs000     13   2.8%
mhs005      9   1.9%

법정동
Real number (ℝ)

Distinct: 10
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6840279 × 10^9
Minimum: 4.684025 × 10^9
Maximum: 4.684037 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.2 KiB

Quantile statistics

Minimum: 4.684025 × 10^9
5-th percentile: 4.684025 × 10^9
Q1: 4.6840253 × 10^9
Median: 4.6840253 × 10^9
Q3: 4.684033 × 10^9
95-th percentile: 4.684035 × 10^9
Maximum: 4.684037 × 10^9
Range: 12000
Interquartile range (IQR): 7700

Descriptive statistics

Standard deviation: 4113.1302
Coefficient of variation (CV): 8.7811821 × 10^-7
Kurtosis: -0.66404703
Mean: 4.6840279 × 10^9
Median Absolute Deviation (MAD): 300
Skewness: 1.0657536
Sum: 2.1640209 × 10^12
Variance: 16917840
Monotonicity: Not monotonic
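
Several of these figures can be cross-checked from the value frequencies listed for this variable. The sketch below expands the frequency table back into 462 observations and recomputes the range, median, variance, standard deviation, and MAD with Python's `statistics` module. It assumes the sample (n − 1) variance, which matches the reported value.

```python
import statistics

# Value -> count for 법정동, copied from the frequency table in this section.
frequencies = {
    4684025000: 74, 4684025300: 172, 4684025600: 6, 4684025622: 78,
    4684032000: 14, 4684033000: 15, 4684034000: 72, 4684035000: 8,
    4684036000: 2, 4684037000: 21,
}

# Expand the table back into one value per observation.
values = sorted(v for v, n in frequencies.items() for _ in range(n))

value_range = max(values) - min(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1)
std_dev = statistics.stdev(values)
mad = statistics.median(abs(v - median) for v in values)

print(len(values), value_range, median, round(variance), round(std_dev, 4), mad)
```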
Histogram with fixed size bins (bins=10)

Sorted by frequency:
Value       Count  Frequency (%)
4684025300    172  37.2%
4684025622     78  16.9%
4684025000     74  16.0%
4684034000     72  15.6%
4684037000     21   4.5%
4684033000     15   3.2%
4684032000     14   3.0%
4684035000      8   1.7%
4684025600      6   1.3%
4684036000      2   0.4%

Sorted by value, ascending:
Value       Count  Frequency (%)
4684025000     74  16.0%
4684025300    172  37.2%
4684025600      6   1.3%
4684025622     78  16.9%
4684032000     14   3.0%
4684033000     15   3.2%
4684034000     72  15.6%
4684035000      8   1.7%
4684036000      2   0.4%
4684037000     21   4.5%

Sorted by value, descending:
Value       Count  Frequency (%)
4684037000     21   4.5%
4684036000      2   0.4%
4684035000      8   1.7%
4684034000     72  15.6%
4684033000     15   3.2%
4684032000     14   3.0%
4684025622     78  16.9%
4684025600      6   1.3%
4684025300    172  37.2%
4684025000     74  16.0%

Interactions

[Figure: interactions plot]

Correlations

            설치일자  맨홀종류  맨홀형태  법정동
설치일자    1.000     0.758     0.619     0.777
맨홀종류    0.758     1.000     0.331     0.440
맨홀형태    0.619     0.331     1.000     0.148
법정동      0.777     0.440     0.148     1.000

            맨홀종류  맨홀형태
맨홀종류    1.000     0.191
맨홀형태    0.191     1.000

            법정동  맨홀종류  맨홀형태
법정동      1.000   0.209     0.070
맨홀종류    0.209   1.000     0.191
맨홀형태    0.070   0.191     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  도엽번호    설치일자    규격         맨홀종류  맨홀형태  법정동
0      346020503A  1900-01-01  1.3x1.3x404  SOM040    MHS005    4684025300
1      346020503D  2007-01-01  1.5x2.7x1.4  SOM040    MHS005    4684025300
2      346020503C  2007-01-01  2.0x2x2.72   SOM040    MHS005    4684025300
3      356142594D  2010-01-01  1.2x1.2x1.6  SOM040    MHS005    4684025300
4      356142595C  2010-01-01  1.2x1.2x4.0  SOM040    MHS005    4684025300
5      356142596D  2010-01-01  1.2x1.5x3.0  SOM040    MHS005    4684025300
6      356142596D  2010-01-01  1.2x1.2x3.0  SOM040    MHS005    4684025300
7      346020516A  2007-01-01  1.4x0.9x4.2  SOM040    MHS005    4684025300
8      346020516A  2007-01-01  0.9x0.9x4.2  SOM040    MHS005    4684025300
9      346020517A  2007-01-01  2.5x2.8x1.9  SOM040    MHS003    4684025000

Index  도엽번호    설치일자    규격         맨홀종류  맨홀형태  법정동
452    346022033B  2007-01-01  1.7x2.4x1.7  SOM999    MHS000    4684025000
453    346021570B  2007-01-01  1.5x1.5x1.9  SOM999    MHS000    4684034000
454    346021500C  2007-01-01  1.5x1.5x1.3  SOM999    MHS000    4684034000
455    346031141A  2007-01-01  1.3x1.3x2.4  SOM999    MHS000    4684034000
456    346031171C  2007-01-01  1.3x1.3x2.4  SOM999    MHS000    4684034000
457    346021500D  2007-01-01  1.3x1.3x2.4  SOM999    MHS000    4684034000
458    346021587D  2007-01-01  1.2x1.2x2.0  SOM002    MHS000    4684025622
459    346021587A  2007-01-01  1.2x1.2x1.0  SOM002    MHS000    4684025300
460    346022016C  2007-01-01  1.0x1.5x0.9  SOM999    MHS000    4684025300
461    346022025B  2007-01-01  1.0x1.5x0.9  SOM999    MHS000    4684033000

Duplicate rows

Most frequently occurring

Index  도엽번호    설치일자    규격          맨홀종류  맨홀형태  법정동      # duplicates
12     346022088C  2019-01-01  3.5x2.5x1.5   SOM999    MHS003    4684025300  4
2      346020526B  1900-01-01  1.2x1.2x1.2   SOM040    MHS003    4684025300  3
14     346022096D  2019-01-01  2.5x2.0x1.5   SOM999    MHS003    4684025300  3
0      346020503A  2011-01-01  1.3x1.3x3.69  SOM040    MHS003    4684025000  2
1      346020526B  1900-01-01  1.2x1.2x1.2   SOM040    MHS003    4684025000  2
3      346020527D  1900-01-01  2.0x1.2x2.0   SOM040    MHS003    4684025000  2
4      346021440B  2007-01-01  1.2x1.2x4.7   SOM002    MHS001    4684025000  2
5      346022072B  2010-01-01  1.2x1.2x1.3   SOM012    MHS003    4684025622  2
6      346022082C  2010-01-01  1.2x1.2x1.2   SOM012    MHS003    4684025622  2
7      346022087B  2019-01-01  1.2x1.2x1.5   SOM903    MHS003    4684025300  2