Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Text: 2
Categorical: 6
Numeric: 1

Alerts

사업명 has constant value "지방상수도" [Constant]
관로길이 has constant value "0" [Constant]
취수원 has constant value "0" [Constant]
사업단계명 is highly overall correlated with 소재지 [High correlation]
소재지 is highly overall correlated with 사업단계명 [High correlation]
권역 is highly imbalanced (91.9%) [Imbalance]
시설용량 has 18 (18.0%) zeros [Zeros]
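
The 91.9% in the imbalance alert is consistent with a normalized-entropy score computed from the 권역 value counts (99 rows of "0", 1 row of "낙동강권역"). The sketch below assumes the score is 1 minus the Shannon entropy divided by log2 of the number of classes; the function name `imbalance_score` is illustrative and not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    This formula is an assumption that happens to reproduce the reported figure.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Value counts of 권역 as listed in the report: "0" x 99, "낙동강권역" x 1.
score = imbalance_score([99, 1])
print(f"{score:.1%}")  # 91.9%
```

A perfectly balanced two-class column (for example 50/50) scores 0.0 under the same formula.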

Reproduction

Analysis started: 2023-12-10 13:15:10.966078
Analysis finished: 2023-12-10 13:15:12.411367
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
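
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library, with the timestamp strings copied from the lines above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 13:15:10.966078", FMT)
finished = datetime.strptime("2023-12-10 13:15:12.411367", FMT)

# Elapsed wall-clock time between the start and the end of the analysis.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.45 seconds
```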

Variables

시설명
Text

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 5
Mean length: 4.91
Min length: 2

Characters and Unicode

Total characters: 491
Distinct characters: 100
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
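
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The category names used in this report map to the two-letter codes `Lo` (Other Letter), `Ps` (Open Punctuation), `Pe` (Close Punctuation), `Zs` (Space Separator), `Nd` (Decimal Number) and `Po` (Other Punctuation). The sketch below counts categories for one 시설약칭 value taken from the sample rows; the helper name `category_counts` is hypothetical. Script and block lookups are not available in the standard library, so they are not shown.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (two-letter codes) over the characters of text."""
    return Counter(unicodedata.category(ch) for ch in text)


# "대산(배)" appears as a 시설약칭 value in the sample rows of this report.
counts = category_counts("대산(배)")
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for the text variables.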

Unique

Unique: 98
Unique (%): 98.0%
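
"Distinct" counts the different values in a column, while "Unique" counts the values that occur exactly once. With 100 rows, 99 distinct values and 98 unique values, exactly one value must occur twice. The sketch below shows the two counts on made-up data with the same shape; the value names and the helper `distinct_and_unique` are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values and the
    number of values that appear exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical column: 98 names that occur once plus one name that occurs twice.
column = [f"name{i}" for i in range(98)] + ["repeated", "repeated"]
print(len(column), distinct_and_unique(column))  # 100 (99, 98)
```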

Sample

1st row: 검복
2nd row: 능평
3rd row: 도마
4th row: 산성
5th row: 오전

Value | Count | Frequency (%)
배수지 | 3 | 2.9%
가학배수지 | 2 | 1.9%
노안배수지 | 1 | 1.0%
논산배수지 | 1 | 1.0%
당포배수지 | 1 | 1.0%
당인배수지 | 1 | 1.0%
당산배수지 | 1 | 1.0%
달아배수지 | 1 | 1.0%
달도배수지 | 1 | 1.0%
단양배수지 | 1 | 1.0%
Other values (91) | 91 | 87.5%

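Most occurring characters

Value | Count | Frequency (%)
(Hangul, not preserved) | 88 | 17.9%
(Hangul, not preserved) | 88 | 17.9%
(Hangul, not preserved) | 87 | 17.7%
(Hangul, not preserved) | 12 | 2.4%
(Hangul, not preserved) | 10 | 2.0%
(Hangul, not preserved) | 8 | 1.6%
(Hangul, not preserved) | 7 | 1.4%
(Hangul, not preserved) | 6 | 1.2%
) | 6 | 1.2%
( | 6 | 1.2%
Other values (90) | 173 | 35.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 474 | 96.5%
Close Punctuation | 6 | 1.2%
Open Punctuation | 6 | 1.2%
Space Separator | 4 | 0.8%
Other Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 87 | 18.4%
(Hangul, not preserved) | 12 | 2.5%
(Hangul, not preserved) | 10 | 2.1%
(Hangul, not preserved) | 8 | 1.7%
(Hangul, not preserved) | 7 | 1.5%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 5 | 1.1%
Other values (86) | 157 | 33.1%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 474 | 96.5%
Common | 17 | 3.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 87 | 18.4%
(Hangul, not preserved) | 12 | 2.5%
(Hangul, not preserved) | 10 | 2.1%
(Hangul, not preserved) | 8 | 1.7%
(Hangul, not preserved) | 7 | 1.5%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 5 | 1.1%
Other values (86) | 157 | 33.1%

Common
Value | Count | Frequency (%)
) | 6 | 35.3%
( | 6 | 35.3%
(space) | 4 | 23.5%
(character not preserved) | 1 | 5.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 474 | 96.5%
ASCII | 16 | 3.3%
None | 1 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 88 | 18.6%
(Hangul, not preserved) | 87 | 18.4%
(Hangul, not preserved) | 12 | 2.5%
(Hangul, not preserved) | 10 | 2.1%
(Hangul, not preserved) | 8 | 1.7%
(Hangul, not preserved) | 7 | 1.5%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 6 | 1.3%
(Hangul, not preserved) | 5 | 1.1%
Other values (86) | 157 | 33.1%

ASCII
Value | Count | Frequency (%)
) | 6 | 37.5%
( | 6 | 37.5%
(space) | 4 | 25.0%

None
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%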

권역
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 1
Mean length: 1.04
Min length: 1

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 99 | 99.0%
낙동강권역 | 1 | 1.0%

Length

Histogram of lengths of the category

사업명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지방상수도
2nd row: 지방상수도
3rd row: 지방상수도
4th row: 지방상수도
5th row: 지방상수도

Common Values

Value | Count | Frequency (%)
지방상수도 | 100 | 100.0%

Length

Histogram of lengths of the category

사업단계명
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 7
Mean length: 6.88
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광주수도관리단
2nd row: 광주수도관리단
3rd row: 광주수도관리단
4th row: 광주수도관리단
5th row: 광주수도관리단

Common Values

Value | Count | Frequency (%)
완도수도관리단 | 24 | 24.0%
광주수도관리단 | 11 | 11.0%
고령권관리단 | 10 | 10.0%
진도수도관리단 | 7 | 7.0%
통영수도관리단 | 6 | 6.0%
경남서부권관리단 | 5 | 5.0%
거제권관리단 | 4 | 4.0%
고성수도관리단 | 4 | 4.0%
금산권관리단 | 3 | 3.0%
예천수도관리단 | 3 | 3.0%
Other values (8) | 23 | 23.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
완도수도관리단 | 24 | 24.0%
광주수도관리단 | 11 | 11.0%
고령권관리단 | 10 | 10.0%
진도수도관리단 | 7 | 7.0%
통영수도관리단 | 6 | 6.0%
경남서부권관리단 | 5 | 5.0%
거제권관리단 | 4 | 4.0%
고성수도관리단 | 4 | 4.0%
양주수도관리단 | 3 | 3.0%
단양수도관리단 | 3 | 3.0%
Other values (8) | 23 | 23.0%

시설약칭
Text

Distinct: 63
Distinct (%): 63.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 5
Mean length: 3.15
Min length: 1

Characters and Unicode

Total characters: 315
Distinct characters: 79
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 62.0%

Sample

1st row: 검복
2nd row: 능평
3rd row: 도마
4th row: 산성
5th row: 오전

Value | Count | Frequency (%)
0 | 38 | 38.0%
덕정(배 | 1 | 1.0%
대죽(배 | 1 | 1.0%
나리배 | 1 | 1.0%
남동배 | 1 | 1.0%
남평(배 | 1 | 1.0%
내장(배 | 1 | 1.0%
내행(배 | 1 | 1.0%
노안(배 | 1 | 1.0%
논산(배 | 1 | 1.0%
Other values (53) | 53 | 53.0%

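Most occurring characters

Value | Count | Frequency (%)
(Hangul, not preserved) | 51 | 16.2%
( | 47 | 14.9%
) | 47 | 14.9%
0 | 38 | 12.1%
(Hangul, not preserved) | 9 | 2.9%
(Hangul, not preserved) | 7 | 2.2%
(Hangul, not preserved) | 5 | 1.6%
(Hangul, not preserved) | 4 | 1.3%
(Hangul, not preserved) | 4 | 1.3%
(Hangul, not preserved) | 4 | 1.3%
Other values (69) | 99 | 31.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 183 | 58.1%
Open Punctuation | 47 | 14.9%
Close Punctuation | 47 | 14.9%
Decimal Number | 38 | 12.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 51 | 27.9%
(Hangul, not preserved) | 9 | 4.9%
(Hangul, not preserved) | 7 | 3.8%
(Hangul, not preserved) | 5 | 2.7%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
Other values (66) | 90 | 49.2%

Open Punctuation
Value | Count | Frequency (%)
( | 47 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 47 | 100.0%

Decimal Number
Value | Count | Frequency (%)
0 | 38 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 183 | 58.1%
Common | 132 | 41.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 51 | 27.9%
(Hangul, not preserved) | 9 | 4.9%
(Hangul, not preserved) | 7 | 3.8%
(Hangul, not preserved) | 5 | 2.7%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
Other values (66) | 90 | 49.2%

Common
Value | Count | Frequency (%)
( | 47 | 35.6%
) | 47 | 35.6%
0 | 38 | 28.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 183 | 58.1%
ASCII | 132 | 41.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 51 | 27.9%
(Hangul, not preserved) | 9 | 4.9%
(Hangul, not preserved) | 7 | 3.8%
(Hangul, not preserved) | 5 | 2.7%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 4 | 2.2%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
(Hangul, not preserved) | 3 | 1.6%
Other values (66) | 90 | 49.2%

ASCII
Value | Count | Frequency (%)
( | 47 | 35.6%
) | 47 | 35.6%
0 | 38 | 28.8%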

소재지
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 46
Median length: 34
Mean length: 25.09
Min length: 12

Unique

Unique: 5
Unique (%): 5.0%

Sample

1st row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
2nd row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
3rd row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
4th row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
5th row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)

Common Values

Value | Count | Frequency (%)
경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 11 | 11.0%
경북 고령군 고령읍 장기리 260-3번지 | 10 | 10.0%
경남 통영시 광도면 죽림리 1574-10 미래메디컬센터 9층 (죽림2로 49-10) | 6 | 6.0%
경남 사천시 축동면 배춘리 18번지 (수자원길 30) | 5 | 5.0%
전라남도 완도군 노화읍 | 4 | 4.0%
전라남도 완도군 완도읍 | 4 | 4.0%
경남 고성군 고성읍 기월리 603-4번지 (기월2길 59) | 4 | 4.0%
경남 거제시 장평동 1195-41번지 (장평로 16-5) | 4 | 4.0%
경북 예천군 예천읍 남본리 258-20 베스트프라자 B동 | 3 | 3.0%
전남 나주시 이창동 191번지 (예향로 3803) | 3 | 3.0%
Other values (21) | 46 | 46.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
전라남도 | 31 | 5.8%
완도군 | 24 | 4.5%
경기도 | 19 | 3.6%
경남 | 19 | 3.6%
경북 | 13 | 2.4%
송정동 | 11 | 2.1%
466-5번지 | 11 | 2.1%
회안대로 | 11 | 2.1%
1061-51 | 11 | 2.1%
광주시 | 11 | 2.1%
Other values (102) | 371 | 69.7%

시설용량
Real number (ℝ)

ZEROS 

Distinct: 54
Distinct (%): 54.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1684.82
Minimum: 0
Maximum: 32800
Zeros: 18
Zeros (%): 18.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 30
Median: 220
Q3: 1700
95-th percentile: 6150
Maximum: 32800
Range: 32800
Interquartile range (IQR): 1670

Descriptive statistics

Standard deviation: 4002.8583
Coefficient of variation (CV): 2.3758374
Kurtosis: 37.430143
Mean: 1684.82
Median Absolute Deviation (MAD): 220
Skewness: 5.3687872
Sum: 168482
Variance: 16022875
Monotonicity: Not monotonic
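
Several of the 시설용량 statistics are derived from one another, so they can be cross-checked with simple arithmetic: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, and range = maximum - minimum. The sketch below uses only the rounded figures printed in this report, so small rounding differences are expected; the variable names are illustrative.

```python
# Figures copied from the report (already rounded there).
n = 100
total = 168482
std = 4002.8583
q1, q3 = 30, 1700
minimum, maximum = 0, 32800

mean = total / n                  # reported mean: 1684.82
cv = std / mean                   # reported CV: 2.3758374
variance = std ** 2               # reported variance: 16022875
iqr = q3 - q1                     # reported IQR: 1670
value_range = maximum - minimum   # reported range: 32800

print(f"mean={mean:.2f} cv={cv:.4f} variance={variance:.0f} iqr={iqr} range={value_range}")
```

A CV above 2 together with a median (220) far below the mean (1684.82) is consistent with the strong right skew reported for this variable.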
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 18 | 18.0%
30 | 5 | 5.0%
50 | 4 | 4.0%
500 | 4 | 4.0%
200 | 3 | 3.0%
20 | 3 | 3.0%
100 | 3 | 3.0%
2000 | 3 | 3.0%
300 | 2 | 2.0%
140 | 2 | 2.0%
Other values (44) | 53 | 53.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 18 | 18.0%
3 | 1 | 1.0%
20 | 3 | 3.0%
30 | 5 | 5.0%
32 | 1 | 1.0%
40 | 1 | 1.0%
50 | 4 | 4.0%
60 | 2 | 2.0%
70 | 1 | 1.0%
72 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
32800 | 1 | 1.0%
13000 | 1 | 1.0%
10500 | 1 | 1.0%
10000 | 1 | 1.0%
9000 | 1 | 1.0%
6000 | 2 | 2.0%
5500 | 1 | 1.0%
5240 | 1 | 1.0%
5000 | 2 | 2.0%
4800 | 1 | 1.0%

관로길이
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

Histogram of lengths of the category

취수원
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

Histogram of lengths of the category

Interactions


Correlations

 | 시설명 | 권역 | 사업단계명 | 시설약칭 | 소재지 | 시설용량
시설명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
권역 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
사업단계명 | 1.000 | 0.000 | 1.000 | 0.986 | 1.000 | 0.706
시설약칭 | 1.000 | 0.000 | 0.986 | 1.000 | 0.000 | 0.991
소재지 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.501
시설용량 | 1.000 | 0.000 | 0.706 | 0.991 | 0.501 | 1.000

 | 사업단계명 | 소재지 | 권역
사업단계명 | 1.000 | 0.917 | 0.000
소재지 | 0.917 | 1.000 | 0.000
권역 | 0.000 | 0.000 | 1.000

 | 시설용량 | 권역 | 사업단계명 | 소재지
시설용량 | 1.000 | 0.000 | 0.423 | 0.217
권역 | 0.000 | 1.000 | 0.000 | 0.000
사업단계명 | 0.423 | 0.000 | 1.000 | 0.917
소재지 | 0.217 | 0.000 | 0.917 | 1.000
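
The correlation types behind these matrices are not labelled in this extraction. One common association measure for two categorical columns is Cramér's V, and it is only an assumption that any of the matrices uses it. The sketch below implements the plain, uncorrected form of Cramér's V with the standard library, to show why a column pair such as 사업단계명 and 소재지 scores close to 1 when each office maps to a single address. The function name `cramers_v` and the four-row example are illustrative.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences.

    V = sqrt(chi2 / (n * min(r - 1, c - 1))), where chi2 is Pearson's chi-squared
    statistic of the r x c contingency table. Returns 0.0 if either column is constant.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    rows = Counter(xs)
    cols = Counter(ys)
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k <= 0:
        return 0.0
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Two offices, each always paired with one address (values taken from this report).
offices = ["광주수도관리단", "광주수도관리단", "고령권관리단", "고령권관리단"]
addresses = [
    "경기도 광주시 송정동 466-5번지 (회안대로 1061-51)",
    "경기도 광주시 송정동 466-5번지 (회안대로 1061-51)",
    "경북 고령군 고령읍 장기리 260-3번지",
    "경북 고령군 고령읍 장기리 260-3번지",
]
print(cramers_v(offices, addresses))
```

The constant-column guard also matches the 0.000 entries involving 권역 only loosely: 권역 is not constant, so its zeros come from the report's own computation, not from this guard.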

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

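First rows

 | 시설명 | 권역 | 사업명 | 사업단계명 | 시설약칭 | 소재지 | 시설용량 | 관로길이 | 취수원
0 | 검복 | 0 | 지방상수도 | 광주수도관리단 | 검복 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 200 | 0 | 0
1 | 능평 | 0 | 지방상수도 | 광주수도관리단 | 능평 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 4000 | 0 | 0
2 | 도마 | 0 | 지방상수도 | 광주수도관리단 | 도마 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 2200 | 0 | 0
3 | 산성 | 0 | 지방상수도 | 광주수도관리단 | 산성 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 500 | 0 | 0
4 | 오전 | 0 | 지방상수도 | 광주수도관리단 | 오전 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 200 | 0 | 0
5 | 오향 | 0 | 지방상수도 | 광주수도관리단 | 오향 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 2000 | 0 | 0
6 | 진우 | 0 | 지방상수도 | 광주수도관리단 | 진우 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 2000 | 0 | 0
7 | 추자 | 0 | 지방상수도 | 광주수도관리단 | 추자 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
8 | 학동 | 0 | 지방상수도 | 광주수도관리단 | 학동 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
9 | 대산 배수지 | 0 | 지방상수도 | 서산권관리단 | 대산(배) | 충남 서산시 석림동 800-3 | 5240 | 0 | 0

Last rows

 | 시설명 | 권역 | 사업명 | 사업단계명 | 시설약칭 | 소재지 | 시설용량 | 관로길이 | 취수원
90 | 동문배수지 | 0 | 지방상수도 | 서산권관리단 | 동문(배) | 충남 서산시 석림동 800-3 | 100 | 0 | 0
91 | 동부배수지 | 0 | 지방상수도 | 거제권관리단 | 동부(배) | 경남 거제시 장평동 1195-41번지 (장평로 16-5) | 3 | 0 | 0
92 | 동안배수지 | 0 | 지방상수도 | 동두천수도관리단 | 동안(배) | 경기도 동두천시 하봉암동 155번지 (평화로 3208번길 1) | 10500 | 0 | 0
93 | 동외배수지 | 0 | 지방상수도 | 진도수도관리단 | 동외배 | 전라남도 진도군 진도읍 | 1050 | 0 | 0
94 | 동진배수지 | 0 | 지방상수도 | 완도수도관리단 | 0 | 전라남도 완도군 소안면 | 72 | 0 | 0
95 | 동천배수지 | 0 | 지방상수도 | 경남서부권관리단 | 동천(배) | 경남 사천시 축동면 배춘리 18번지 (수자원길 30) | 100 | 0 | 0
96 | 둔덕배수지 | 0 | 지방상수도 | 거제권관리단 | 둔덕(배) | 경남 거제시 장평동 1195-41번지 (장평로 16-5) | 700 | 0 | 0
97 | 득성배수지 | 0 | 지방상수도 | 고령권관리단 | 0 | 경북 고령군 고령읍 장기리 260-3번지 | 0 | 0 | 0
98 | 마곡배수지 | 0 | 지방상수도 | 정읍권관리단 | 마곡(배) | 전북 정읍시 농소동 78-22번지 (서부산업도로 418) | 13000 | 0 | 0
99 | 마동배수지 | 0 | 지방상수도 | 고성수도관리단 | 마동(배) | 경남 고성군 고성읍 기월리 603-4번지 (기월2길 59) | 200 | 0 | 0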