Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B

Variable types

Text: 2
Categorical: 6
Numeric: 1

Alerts

권역 has constant value "0" (Constant)
사업명 has constant value "지방상수도" (Constant)
관로길이 has constant value "0" (Constant)
취수원 has constant value "0" (Constant)
소재지 is highly overall correlated with 사업단계명 (High correlation)
사업단계명 is highly overall correlated with 소재지 (High correlation)
시설명 has unique values (Unique)
시설용량 has 43 (43.0%) zeros (Zeros)
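
The alert labels above follow from simple per-column counts. As a minimal sketch, assuming a column is available as a plain Python list, the hypothetical `column_alerts` helper below flags a column as Constant when it has a single distinct value, Unique when every value is distinct, and Zeros when any numeric value equals zero. These rules are assumptions chosen to mirror the alert names; they are not ydata-profiling's own implementation or thresholds.

```python
from collections import Counter


def column_alerts(values):
    """Return illustrative alert labels for one column.

    Hypothetical helper: the rules are assumptions that mirror the alert
    names in the report, not the profiler's actual logic.
    """
    counts = Counter(values)
    n = len(values)
    alerts = []
    if n and len(counts) == 1:
        alerts.append("Constant")      # a single distinct value
    if n > 1 and len(counts) == n:
        alerts.append("Unique")        # every value occurs exactly once
    zeros = sum(1 for v in values if isinstance(v, (int, float)) and v == 0)
    if zeros:
        alerts.append(f"Zeros: {zeros} ({100 * zeros / n:.1f}%)")
    return alerts


# A categorical column holding the string "0" in all 100 rows.
print(column_alerts(["0"] * 100))
# A numeric column with 43 zeros among 100 values.
print(column_alerts([0] * 43 + list(range(1, 58))))
```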

Reproduction

Analysis started: 2023-12-10 13:16:56.071089
Analysis finished: 2023-12-10 13:16:57.337176
Duration: 1.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
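
A report of this kind is typically generated with a few lines of Python. The sketch below assumes the data sits in a CSV file; the file names `facilities.csv` and `report.html` and the function name `build_report` are placeholders and do not come from this report. `ProfileReport` and its `to_file` method belong to ydata-profiling's public API, but the exact options used for this run are recorded only in config.json.

```python
def build_report(csv_path="facilities.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)                      # hypothetical input file
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(out_path)                       # writes the HTML report
    return out_path
```

Calling `build_report()` requires pandas and ydata-profiling to be installed and the input file to exist.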

Variables

시설명
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 12
Median length: 11
Mean length: 5.7
Min length: 2

Characters and Unicode

Total characters: 570
Distinct characters: 120
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
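
Python's standard `unicodedata` module exposes the same general-category property, so category counts like those in this section can be approximated directly. A minimal sketch, using the first 시설명 value from the sample rows as input; the `category_counts` helper is illustrative, and the profiler may group characters differently.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("여미리 소규모가압장")
# 'Lo' = Other Letter (Hangul syllables), 'Zs' = Space Separator
print(counts)

# Parentheses fall under 'Ps' (Open Punctuation) and 'Pe' (Close Punctuation).
print(category_counts("(가)"))
```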

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 여미리 소규모가압장
2nd row: 검복
3rd row: 광남
4th row: 목동
5th row: 목현

Value | Count | Frequency (%)
가압장 | 25 | 17.5%
소규모가압장 | 11 | 7.7%
소규모 | 3 | 2.1%
여미리 | 1 | 0.7%
정광마을 | 1 | 0.7%
고암가압장 | 1 | 0.7%
고봉가압장 | 1 | 0.7%
고막가압장 | 1 | 0.7%
객현가압장 | 1 | 0.7%
갈구가압장 | 1 | 0.7%
Other values (97) | 97 | 67.8%
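
The word table above is consistent with splitting each value on whitespace and counting the resulting tokens, with frequencies expressed as a share of all tokens rather than of rows. A minimal sketch over the first five 시설명 values from the Sample section; whitespace splitting is an assumption about how the profiler tokenizes, and `word_frequencies` is a hypothetical helper.

```python
from collections import Counter


def word_frequencies(values):
    """Return (word, count, percent of all words), most frequent first."""
    words = [w for v in values for w in v.split()]
    total = len(words)
    return [(w, c, round(100 * c / total, 1))
            for w, c in Counter(words).most_common()]


# First five 시설명 values from the Sample section.
names = ["여미리 소규모가압장", "검복", "광남", "목동", "목현"]
for row in word_frequencies(names):
    print(row)

# Cross-check against the table: the listed counts sum to 143 words.
total_words = 25 + 11 + 3 + 7 * 1 + 97
print(round(100 * 25 / total_words, 1))  # 17.5
```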

Most occurring characters

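Value | Count | Frequency (%)
[character not preserved] | 78 | 13.7%
[character not preserved] | 76 | 13.3%
[character not preserved] | 73 | 12.8%
(space) | 45 | 7.9%
[character not preserved] | 20 | 3.5%
[character not preserved] | 15 | 2.6%
[character not preserved] | 14 | 2.5%
[character not preserved] | 14 | 2.5%
[character not preserved] | 10 | 1.8%
[character not preserved] | 9 | 1.6%
Other values (110) | 216 | 37.9%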

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 519 | 91.1%
Space Separator | 45 | 7.9%
Close Punctuation | 3 | 0.5%
Open Punctuation | 3 | 0.5%

Most frequent character per category

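Other Letter

Value | Count | Frequency (%)
[character not preserved] | 78 | 15.0%
[character not preserved] | 76 | 14.6%
[character not preserved] | 73 | 14.1%
[character not preserved] | 20 | 3.9%
[character not preserved] | 15 | 2.9%
[character not preserved] | 14 | 2.7%
[character not preserved] | 14 | 2.7%
[character not preserved] | 10 | 1.9%
[character not preserved] | 9 | 1.7%
[character not preserved] | 8 | 1.5%
Other values (107) | 202 | 38.9%

Space Separator

Value | Count | Frequency (%)
(space) | 45 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 3 | 100.0%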

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 519 | 91.1%
Common | 51 | 8.9%

Most frequent character per script

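Hangul

Value | Count | Frequency (%)
[character not preserved] | 78 | 15.0%
[character not preserved] | 76 | 14.6%
[character not preserved] | 73 | 14.1%
[character not preserved] | 20 | 3.9%
[character not preserved] | 15 | 2.9%
[character not preserved] | 14 | 2.7%
[character not preserved] | 14 | 2.7%
[character not preserved] | 10 | 1.9%
[character not preserved] | 9 | 1.7%
[character not preserved] | 8 | 1.5%
Other values (107) | 202 | 38.9%

Common

Value | Count | Frequency (%)
(space) | 45 | 88.2%
) | 3 | 5.9%
( | 3 | 5.9%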

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 519 | 91.1%
ASCII | 51 | 8.9%

Most frequent character per block

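Hangul

Value | Count | Frequency (%)
[character not preserved] | 78 | 15.0%
[character not preserved] | 76 | 14.6%
[character not preserved] | 73 | 14.1%
[character not preserved] | 20 | 3.9%
[character not preserved] | 15 | 2.9%
[character not preserved] | 14 | 2.7%
[character not preserved] | 14 | 2.7%
[character not preserved] | 10 | 1.9%
[character not preserved] | 9 | 1.7%
[character not preserved] | 8 | 1.5%
Other values (107) | 202 | 38.9%

ASCII

Value | Count | Frequency (%)
(space) | 45 | 88.2%
) | 3 | 5.9%
( | 3 | 5.9%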

권역
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

Figure: Histogram of lengths of the category

사업명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지방상수도
2nd row: 지방상수도
3rd row: 지방상수도
4th row: 지방상수도
5th row: 지방상수도

Common Values

Value | Count | Frequency (%)
지방상수도 | 100 | 100.0%

Length

Figure: Histogram of lengths of the category

사업단계명
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 19.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 7
Mean length: 6.89
Min length: 6

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 서산권관리단
2nd row: 광주수도관리단
3rd row: 광주수도관리단
4th row: 광주수도관리단
5th row: 광주수도관리단

Common Values

Value | Count | Frequency (%)
광주수도관리단 | 23 | 23.0%
서산권관리단 | 14 | 14.0%
나주수도관리단 | 9 | 9.0%
완도수도관리단 | 7 | 7.0%
고성수도관리단 | 6 | 6.0%
예천수도관리단 | 6 | 6.0%
경남서부권관리단 | 5 | 5.0%
파주수도관리단 | 4 | 4.0%
진도수도관리단 | 4 | 4.0%
단양수도관리단 | 3 | 3.0%
Other values (9) | 19 | 19.0%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
광주수도관리단 | 23 | 23.0%
서산권관리단 | 14 | 14.0%
나주수도관리단 | 9 | 9.0%
완도수도관리단 | 7 | 7.0%
고성수도관리단 | 6 | 6.0%
예천수도관리단 | 6 | 6.0%
경남서부권관리단 | 5 | 5.0%
파주수도관리단 | 4 | 4.0%
진도수도관리단 | 4 | 4.0%
장흥수도관리단 | 3 | 3.0%
Other values (9) | 19 | 19.0%

시설약칭
Text

Distinct: 87
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 9
Mean length: 4.02
Min length: 1

Characters and Unicode

Total characters: 402
Distinct characters: 103
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 86
Unique (%): 86.0%

Sample

1st row: 여미(가)
2nd row: 검복
3rd row: 광남
4th row: 목동
5th row: 목현

Value | Count | Frequency (%)
0 | 14 | 13.1%
[value not preserved] | 5 | 4.7%
고봉(가 | 1 | 0.9%
고암(가 | 1 | 0.9%
객현(가 | 1 | 0.9%
갈구가압장 | 1 | 0.9%
갈곡(가 | 1 | 0.9%
가업(가 | 1 | 0.9%
가산(가 | 1 | 0.9%
정광마을(가 | 1 | 0.9%
Other values (80) | 80 | 74.8%

Most occurring characters

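Value | Count | Frequency (%)
[character not preserved] | 66 | 16.4%
) | 53 | 13.2%
( | 53 | 13.2%
0 | 14 | 3.5%
[character not preserved] | 8 | 2.0%
[character not preserved] | 8 | 2.0%
[character not preserved] | 8 | 2.0%
[character not preserved] | 7 | 1.7%
[character not preserved] | 7 | 1.7%
[character not preserved] | 7 | 1.7%
Other values (93) | 171 | 42.5%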

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 275 | 68.4%
Close Punctuation | 53 | 13.2%
Open Punctuation | 53 | 13.2%
Decimal Number | 14 | 3.5%
Space Separator | 7 | 1.7%

Most frequent character per category

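Other Letter

Value | Count | Frequency (%)
[character not preserved] | 66 | 24.0%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 7 | 2.5%
[character not preserved] | 7 | 2.5%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 5 | 1.8%
Other values (89) | 148 | 53.8%

Close Punctuation

Value | Count | Frequency (%)
) | 53 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 53 | 100.0%

Decimal Number

Value | Count | Frequency (%)
0 | 14 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 7 | 100.0%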

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 275 | 68.4%
Common | 127 | 31.6%

Most frequent character per script

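Hangul

Value | Count | Frequency (%)
[character not preserved] | 66 | 24.0%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 7 | 2.5%
[character not preserved] | 7 | 2.5%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 5 | 1.8%
Other values (89) | 148 | 53.8%

Common

Value | Count | Frequency (%)
) | 53 | 41.7%
( | 53 | 41.7%
0 | 14 | 11.0%
(space) | 7 | 5.5%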

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 275 | 68.4%
ASCII | 127 | 31.6%

Most frequent character per block

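Hangul

Value | Count | Frequency (%)
[character not preserved] | 66 | 24.0%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 8 | 2.9%
[character not preserved] | 7 | 2.5%
[character not preserved] | 7 | 2.5%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 6 | 2.2%
[character not preserved] | 5 | 1.8%
Other values (89) | 148 | 53.8%

ASCII

Value | Count | Frequency (%)
) | 53 | 41.7%
( | 53 | 41.7%
0 | 14 | 11.0%
(space) | 7 | 5.5%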

소재지
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 46
Median length: 34
Mean length: 26.81
Min length: 12

Unique

Unique: 9
Unique (%): 9.0%

Sample

1st row: 충남 서산시 석림동 800-3
2nd row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
3rd row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
4th row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)
5th row: 경기도 광주시 송정동 466-5번지 (회안대로 1061-51)

Common Values

Value | Count | Frequency (%)
경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 23 | 23.0%
충남 서산시 석림동 800-3 | 14 | 14.0%
전남 나주시 이창동 191번지 (예향로 3803) | 9 | 9.0%
경남 고성군 고성읍 기월리 603-4번지 (기월2길 59) | 6 | 6.0%
경북 예천군 예천읍 남본리 258-20 베스트프라자 B동 | 6 | 6.0%
경남 사천시 축동면 배춘리 18번지 (수자원길 30) | 5 | 5.0%
경기도 파주시 문산읍 선유리 343번지 (화석정로 43-2) | 4 | 4.0%
경기도 양주시 덕정동 162-9번지 (화합로 1402번길 9-24) | 3 | 3.0%
충북 단양군 단양읍 별곡리 637번지 (수변로 137) | 3 | 3.0%
전라남도 진도군 조도면 | 3 | 3.0%
Other values (16) | 24 | 24.0%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
경기도 | 33 | 5.9%
송정동 | 23 | 4.1%
466-5번지 | 23 | 4.1%
회안대로 | 23 | 4.1%
1061-51 | 23 | 4.1%
광주시 | 23 | 4.1%
충남 | 17 | 3.0%
800-3 | 14 | 2.5%
경남 | 14 | 2.5%
전라남도 | 14 | 2.5%
Other values (98) | 354 | 63.1%

시설용량
Real number (ℝ)

ZEROS 

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 390.25
Minimum: 0
Maximum: 5500
Zeros: 43
Zeros (%): 43.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 42
Q3: 197
95th percentile: 2544
Maximum: 5500
Range: 5500
Interquartile range (IQR): 197
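
The last two rows are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check using only the values listed above:

```python
# Quantile values copied from the report.
minimum, q1, median, q3, maximum = 0, 0, 42, 197, 5500

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50% of the data

print(value_range, iqr)  # 5500 197
```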

Descriptive statistics

Standard deviation: 1002.9213
Coefficient of variation (CV): 2.5699457
Kurtosis: 13.838977
Mean: 390.25
Median Absolute Deviation (MAD): 42
Skewness: 3.6919271
Sum: 39025
Variance: 1005851.2
Monotonicity: Not monotonic
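
Several of these figures are tied together by simple identities, which allows a quick consistency check. The sketch below uses only numbers printed in this report (n = 100 observations); small differences from the reported variance are expected because the standard deviation shown is rounded.

```python
# Values copied from the report.
n = 100
mean = 390.25
std = 1002.9213

total = mean * n      # Sum = mean times number of observations
variance = std ** 2   # Variance is the squared standard deviation
cv = std / mean       # Coefficient of variation = std / mean

print(total)          # 39025.0
print(round(cv, 5))
print(round(variance, 1))
```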
Figure: Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 43 | 43.0%
115 | 4 | 4.0%
144 | 2 | 2.0%
140 | 2 | 2.0%
96 | 2 | 2.0%
150 | 2 | 2.0%
3000 | 1 | 1.0%
5500 | 1 | 1.0%
1440 | 1 | 1.0%
650 | 1 | 1.0%
Other values (41) | 41 | 41.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 43 | 43.0%
6 | 1 | 1.0%
12 | 1 | 1.0%
14 | 1 | 1.0%
15 | 1 | 1.0%
20 | 1 | 1.0%
33 | 1 | 1.0%
36 | 1 | 1.0%
48 | 1 | 1.0%
50 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
5500 | 1 | 1.0%
5000 | 1 | 1.0%
4500 | 1 | 1.0%
3600 | 1 | 1.0%
3000 | 1 | 1.0%
2520 | 1 | 1.0%
1440 | 1 | 1.0%
1400 | 1 | 1.0%
1296 | 1 | 1.0%
1060 | 1 | 1.0%

관로길이
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

Figure: Histogram of lengths of the category

취수원
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 100 | 100.0%

Length

Figure: Histogram of lengths of the category

Interactions

Figure: interaction plot (image not preserved)

Correlations

Variable | 시설명 | 사업단계명 | 시설약칭 | 소재지 | 시설용량
시설명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
사업단계명 | 1.000 | 1.000 | 0.000 | 1.000 | 0.571
시설약칭 | 1.000 | 0.000 | 1.000 | 0.000 | 0.989
소재지 | 1.000 | 1.000 | 0.000 | 1.000 | 0.410
시설용량 | 1.000 | 0.571 | 0.989 | 0.410 | 1.000

Variable | 소재지 | 사업단계명
소재지 | 1.000 | 0.956
사업단계명 | 0.956 | 1.000

Variable | 시설용량 | 사업단계명 | 소재지
시설용량 | 1.000 | 0.261 | 0.150
사업단계명 | 0.261 | 1.000 | 0.956
소재지 | 0.150 | 0.956 | 1.000
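
The report text does not state which association measure underlies each matrix. For two categorical columns such as 소재지 and 사업단계명, one common choice is Cramér's V, computed from the chi-squared statistic of their contingency table. The sketch below is a plain, uncorrected Cramér's V in standard-library Python, applied to made-up toy labels; it is an assumption for illustration and will not necessarily reproduce the 0.956 shown above.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    row = Counter(xs)
    col = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0  # V is undefined here; 0.0 is returned by convention
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy labels (not from the dataset): each office maps to exactly one
# address, so the association is perfect and V is (numerically) 1.
office = ["A", "A", "B", "B", "C", "C"]
address = ["x", "x", "y", "y", "z", "z"]
print(cramers_v(office, address))
```

A value near 1 means one column almost determines the other, which matches the High correlation alert raised for 소재지 and 사업단계명.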

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

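First rows

Index | 시설명 | 권역 | 사업명 | 사업단계명 | 시설약칭 | 소재지 | 시설용량 | 관로길이 | 취수원
0 | 여미리 소규모가압장 | 0 | 지방상수도 | 서산권관리단 | 여미(가) | 충남 서산시 석림동 800-3 | 196 | 0 | 0
1 | 검복 | 0 | 지방상수도 | 광주수도관리단 | 검복 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
2 | 광남 | 0 | 지방상수도 | 광주수도관리단 | 광남 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
3 | 목동 | 0 | 지방상수도 | 광주수도관리단 | 목동 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
4 | 목현 | 0 | 지방상수도 | 광주수도관리단 | 목현 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
5 | 문형 | 0 | 지방상수도 | 광주수도관리단 | 문형 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
6 | 봉골 | 0 | 지방상수도 | 광주수도관리단 | 봉골 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
7 | 산성 | 0 | 지방상수도 | 광주수도관리단 | 산성 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
8 | 삼리 | 0 | 지방상수도 | 광주수도관리단 | 삼리 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0
9 | 송정 | 0 | 지방상수도 | 광주수도관리단 | 송정 | 경기도 광주시 송정동 466-5번지 (회안대로 1061-51) | 0 | 0 | 0

Last rows

Index | 시설명 | 권역 | 사업명 | 사업단계명 | 시설약칭 | 소재지 | 시설용량 | 관로길이 | 취수원
90 | 금곡가압장 | 0 | 지방상수도 | 파주수도관리단 | 금곡(가) | 경기도 파주시 문산읍 선유리 343번지 (화석정로 43-2) | 1400 | 0 | 0
91 | 금산가압장 | 0 | 지방상수도 | 장흥수도관리단 | 금산가 | 전라남도 장흥군 장흥읍 | 76 | 0 | 0
92 | 기동가압장 | 0 | 지방상수도 | 장흥수도관리단 | 기동가 | 전라남도 장흥군 장평면 | 290 | 0 | 0
93 | 기산가압장 | 0 | 지방상수도 | 양주수도관리단 | 기산(가) | 경기도 양주시 덕정동 162-9번지 (화합로 1402번길 9-24) | 540 | 0 | 0
94 | 기촌가압장 | 0 | 지방상수도 | 단양수도관리단 | 기촌(가) | 충북 단양군 단양읍 별곡리 637번지 (수변로 137) | 200 | 0 | 0
95 | 남영가압장 | 0 | 지방상수도 | 정읍권관리단 | 0 | 전북 정읍시 농소동 78-22번지 (서부산업도로 418) | 0 | 0 | 0
96 | 내동가압장 | 0 | 지방상수도 | 충남중부권관리단 | 내동(가) | 충남 논산시 내동 273-2번지 (중앙2로 14-27) | 0 | 0 | 0
97 | 내행가압장 | 0 | 지방상수도 | 동두천수도관리단 | 내행(가) | 경기도 동두천시 하봉암동 155번지 (평화로 3208번길 1) | 5000 | 0 | 0
98 | 노동가압장 | 0 | 지방상수도 | 나주수도관리단 | 노동(가) | 전남 나주시 이창동 191번지 (예향로 3803) | 0 | 0 | 0
99 | 다복가압장 | 0 | 지방상수도 | 금산권관리단 | 0 | 충남 금산군 금산읍 아인리 620-8번지 아인택지개발지구내 | 86 | 0 | 0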