Overview

Dataset statistics

Number of variables: 7
Number of observations: 1486
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 123
Duplicate rows (%): 8.3%
Total size in memory: 84.3 KiB
Average record size in memory: 58.1 B
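
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the reported 84.3 KiB is the total in-memory size and that 1 KiB is 1024 bytes (the function name `derived_stats` is illustrative and not part of the profiling tool):

```python
def derived_stats(n_rows: int, n_duplicates: int, total_kib: float) -> dict:
    """Recompute the derived overview figures from the raw counts."""
    return {
        # Share of duplicate rows, as a percentage of all observations.
        "duplicate_pct": round(100 * n_duplicates / n_rows, 1),
        # Average bytes per record, assuming 1 KiB = 1024 bytes.
        "avg_record_bytes": round(total_kib * 1024 / n_rows, 1),
    }


stats = derived_stats(n_rows=1486, n_duplicates=123, total_kib=84.3)
print(stats)
```

With the counts reported here, both values round to the figures shown in the overview.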

Variable types

Categorical: 5
DateTime: 1
Text: 1

Dataset

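Description: Information on water supply works carried out in each district by the Daejeon Waterworks Headquarters.
Author: Daejeon Metropolitan City Waterworks Headquarters (대전광역시 상수도사업본부)
URL: https://www.data.go.kr/data/15061987/fileData.do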

Alerts

Constant: 공사년도 has constant value "2019"
Constant: 민원접수년도 has constant value "2019"
Duplicates: Dataset has 123 (8.3%) duplicate rows
Imbalance: 공사구분 is highly imbalanced (52.8%)
Imbalance: 연장 is highly imbalanced (59.2%)
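
The imbalance percentage for 공사구분 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below is an assumption about how the figure can be derived, not a statement of the tool's internals; `imbalance_score` is an illustrative name, and the counts are the eight 공사구분 values listed under Common Values in this report.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # A single class has no meaningful balance to measure.
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# Counts for 공사구분 as listed in its Common Values table.
counts = [978, 291, 175, 20, 15, 3, 2, 2]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

The 연장 figure cannot be checked the same way from this report alone, because only the most frequent of its 40 distinct values are listed.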

Reproduction

Analysis started: 2023-12-12 05:21:35.813355
Analysis finished: 2023-12-12 05:21:36.371230
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공사년도 (construction year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
2019 | 1486

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2019 | 1486 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

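등록구 (registering district)
Categorical

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
대덕구 (Daedeok-gu) | 506
동구 (Dong-gu) | 474
중구 (Jung-gu) | 463
서구 (Seo-gu) | 43

Length

Max length: 3
Median length: 2
Mean length: 2.3405114
Min length: 2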
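
The length statistics above can be recomputed from the value counts alone, since every value contributes its character length once per occurrence. A minimal sketch, assuming length is measured in Unicode code points of the NFC-normalized string (the helper name `length_summary` is illustrative):

```python
import unicodedata
from statistics import mean, median


def length_summary(value_counts):
    """Summarize string lengths, weighting each value by its count."""
    lengths = []
    for value, count in value_counts.items():
        # Normalize so each Hangul syllable counts as one character.
        n_chars = len(unicodedata.normalize("NFC", value))
        lengths.extend([n_chars] * count)
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# Counts for 등록구 as reported in this section.
summary = length_summary({"대덕구": 506, "동구": 474, "중구": 463, "서구": 43})
print(summary)
```

Only 대덕구 has three characters, so the mean is (3 × 506 + 2 × 980) / 1486.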

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 동구
3rd row: 대덕구
4th row: 중구
5th row: 동구

Common Values

Value | Count | Frequency (%)
대덕구 | 506 | 34.1%
동구 | 474 | 31.9%
중구 | 463 | 31.2%
서구 | 43 | 2.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

민원접수년도 (year the civil request was received)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
2019 | 1486

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2019 | 1486 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

공사구분 (work type)
Categorical

IMBALANCE

Distinct: 8
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
분할 | 978
신규 | 291
구경확대 | 175
이설 | 20
신규(임시) | 15
Other values (3) | 7

Length

Max length: 8
Median length: 2
Mean length: 2.2934051
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구경확대
2nd row: 신규
3rd row: 분할
4th row: 이설
5th row: 구경확대

Common Values

Value | Count | Frequency (%)
분할 (split) | 978 | 65.8%
신규 (new) | 291 | 19.6%
구경확대 (diameter enlargement) | 175 | 11.8%
이설 (relocation) | 20 | 1.3%
신규(임시) (new, temporary) | 15 | 1.0%
신규(한시) (new, time-limited) | 3 | 0.2%
미분류 (unclassified) | 2 | 0.1%
분할(구경확대) (split with diameter enlargement) | 2 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

준공일 (completion date)
DateTime

Distinct: 209
Distinct (%): 14.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB
Minimum: 2019-01-14 00:00:00
Maximum: 2020-01-03 00:00:00

[Figure: histogram with fixed size bins (bins=50)]

연장 (pipe length)
Categorical

IMBALANCE

Distinct: 40
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
(empty) | 985
5m | 95
6m | 89
4m | 65
7m | 42
Other values (35) | 210

Length

Max length: 4
Median length: 1
Mean length: 1.3903096
Min length: 1

Unique

Unique: 16
Unique (%): 1.1%

Sample

1st row: 10m
2nd row: 5m
3rd row: (empty)
4th row: 7m
5th row: 7m

Common Values

Value | Count | Frequency (%)
(empty) | 985 | 66.3%
5m | 95 | 6.4%
6m | 89 | 6.0%
4m | 65 | 4.4%
7m | 42 | 2.8%
3m | 42 | 2.8%
9m | 39 | 2.6%
8m | 35 | 2.4%
10m | 16 | 1.1%
2m | 10 | 0.7%
Other values (30) | 68 | 4.6%
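
The column is profiled as categorical because its values carry an "m" suffix and about two thirds of them are empty. A minimal sketch of converting it to a numeric length in metres, assuming every non-empty value is a number followed by "m" (the report lists only the most frequent values, so this is unverified for the rest; `parse_length_m` is an illustrative helper):

```python
import re
from typing import Optional

# A number (optionally with a decimal part) followed by the unit "m".
_LENGTH_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*m\s*$")


def parse_length_m(raw: Optional[str]) -> Optional[float]:
    """Convert a value such as '10m' to 10.0; empty values become None."""
    if raw is None or not raw.strip():
        return None
    match = _LENGTH_RE.match(raw)
    if match is None:
        # Fail loudly rather than guess at an unexpected format.
        raise ValueError(f"unexpected length value: {raw!r}")
    return float(match.group(1))


parsed = [parse_length_m(v) for v in ["10m", "5m", "", "7m"]]
print(parsed)
```

Raising on unexpected formats keeps values outside the assumed pattern from being silently turned into missing data.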

Length

[Figure: histogram of lengths of the category]

Frequencies among the 501 non-empty values:

Value | Count | Frequency (%)
5m | 95 | 19.0%
6m | 89 | 17.8%
4m | 65 | 13.0%
7m | 42 | 8.4%
3m | 42 | 8.4%
9m | 39 | 7.8%
8m | 35 | 7.0%
10m | 16 | 3.2%
2m | 10 | 2.0%
11m | 9 | 1.8%
Other values (29) | 59 | 11.8%

법정동 (statutory neighborhood)
Text

Distinct: 97
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.9475101
Min length: 1

Characters and Unicode

Total characters: 4380
Distinct characters: 96
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 1.6%

Sample

1st row: 대흥동
2nd row: 낭월동
3rd row: 송촌동
4th row: 대흥동
5th row: 가양동

Value | Count | Frequency (%)
가양동 | 164 | 11.1%
오정동 | 90 | 6.1%
산성동 | 88 | 5.9%
석봉동 | 70 | 4.7%
평촌동 | 65 | 4.4%
석교동 | 54 | 3.6%
비래동 | 53 | 3.6%
덕암동 | 48 | 3.2%
옥계동 | 47 | 3.2%
중리동 | 40 | 2.7%
Other values (86) | 761 | 51.4%

Most occurring characters

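Value | Count | Frequency (%)
[lost] | 1480 | 33.8%
[lost] | 186 | 4.2%
[lost] | 185 | 4.2%
[lost] | 130 | 3.0%
[lost] | 124 | 2.8%
[lost] | 114 | 2.6%
[lost] | 106 | 2.4%
[lost] | 97 | 2.2%
[lost] | 94 | 2.1%
[lost] | 88 | 2.0%
Other values (86) | 1776 | 40.5%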

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4374 | 99.9%
Space Separator | 6 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost] | 1480 | 33.8%
[lost] | 186 | 4.3%
[lost] | 185 | 4.2%
[lost] | 130 | 3.0%
[lost] | 124 | 2.8%
[lost] | 114 | 2.6%
[lost] | 106 | 2.4%
[lost] | 97 | 2.2%
[lost] | 94 | 2.1%
[lost] | 88 | 2.0%
Other values (85) | 1770 | 40.5%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring scripts

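Value | Count | Frequency (%)
Hangul | 4374 | 99.9%
Common | 6 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[lost] | 1480 | 33.8%
[lost] | 186 | 4.3%
[lost] | 185 | 4.2%
[lost] | 130 | 3.0%
[lost] | 124 | 2.8%
[lost] | 114 | 2.6%
[lost] | 106 | 2.4%
[lost] | 97 | 2.2%
[lost] | 94 | 2.1%
[lost] | 88 | 2.0%
Other values (85) | 1770 | 40.5%

Common
Value | Count | Frequency (%)
(space) | 6 | 100.0%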

Most occurring blocks

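Value | Count | Frequency (%)
Hangul | 4374 | 99.9%
ASCII | 6 | 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[lost] | 1480 | 33.8%
[lost] | 186 | 4.3%
[lost] | 185 | 4.2%
[lost] | 130 | 3.0%
[lost] | 124 | 2.8%
[lost] | 114 | 2.6%
[lost] | 106 | 2.4%
[lost] | 97 | 2.2%
[lost] | 94 | 2.1%
[lost] | 88 | 2.0%
Other values (85) | 1770 | 40.5%

ASCII
Value | Count | Frequency (%)
(space) | 6 | 100.0%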

Correlations

[Figure: correlation matrix plot]
Variable | 등록구 | 공사구분 | 연장 | 법정동
등록구 | 1.000 | 0.317 | 0.332 | 0.999
공사구분 | 0.317 | 1.000 | 0.782 | 0.771
연장 | 0.332 | 0.782 | 1.000 | 0.932
법정동 | 0.999 | 0.771 | 0.932 | 1.000

[Figure: correlation matrix plot]
Variable | 연장 | 등록구 | 공사구분
연장 | 1.000 | 0.162 | 0.403
등록구 | 0.162 | 1.000 | 0.146
공사구분 | 0.403 | 0.146 | 1.000

[Figure: correlation matrix plot]
Variable | 등록구 | 공사구분 | 연장
등록구 | 1.000 | 0.146 | 0.162
공사구분 | 0.146 | 1.000 | 0.403
연장 | 0.162 | 0.403 | 1.000
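
The report does not say which association measure each matrix uses. For pairs of categorical variables, Cramér's V is one common choice; the sketch below shows the plain (uncorrected) statistic as an illustration only and is not claimed to reproduce the values above. The function name `cramers_v` and the toy inputs are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        # A constant variable carries no association information.
        return 0.0
    return sqrt(chi2 / (n * k))


perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 1, such as the 0.999 reported between 등록구 and 법정동, is what one would expect when one variable almost fully determines the other, as a neighborhood determines its district.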

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

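First rows

Index | 공사년도 | 등록구 | 민원접수년도 | 공사구분 | 준공일 | 연장 | 법정동
0 | 2019 | 중구 | 2019 | 구경확대 | 2019-01-17 | 10m | 대흥동
1 | 2019 | 동구 | 2019 | 신규 | 2019-01-19 | 5m | 낭월동
2 | 2019 | 대덕구 | 2019 | 분할 | 2019-01-18 | (empty) | 송촌동
3 | 2019 | 중구 | 2019 | 이설 | 2019-01-25 | 7m | 대흥동
4 | 2019 | 동구 | 2019 | 구경확대 | 2019-01-19 | 7m | 가양동
5 | 2019 | 대덕구 | 2019 | 분할 | 2019-01-18 | (empty) | 송촌동
6 | 2019 | 동구 | 2019 | 분할 | 2019-01-14 | (empty) | 가양동
7 | 2019 | 중구 | 2019 | 분할 | 2019-01-25 | (empty) | 유천동
8 | 2019 | 대덕구 | 2019 | 분할 | 2019-01-18 | (empty) | 송촌동
9 | 2019 | 동구 | 2019 | 분할 | 2019-01-14 | (empty) | 가양동

Last rows

Index | 공사년도 | 등록구 | 민원접수년도 | 공사구분 | 준공일 | 연장 | 법정동
1476 | 2019 | 서구 | 2019 | 신규 | 2019-10-16 | 2m | 가수원동
1477 | 2019 | 서구 | 2019 | 신규 | 2019-10-28 | 3m | 가수원동
1478 | 2019 | 서구 | 2019 | 분할 | 2019-10-24 | (empty) | 변동
1479 | 2019 | 서구 | 2019 | 신규 | 2019-11-11 | 8m | 월평동
1480 | 2019 | 서구 | 2019 | 신규 | 2019-11-06 | 5m | 가수원동
1481 | 2019 | 서구 | 2019 | 신규 | 2019-11-25 | 9m | 가수원동
1482 | 2019 | 서구 | 2019 | 분할 | 2019-11-21 | (empty) | 내동
1483 | 2019 | 서구 | 2019 | 신규 | 2019-12-09 | (empty) | 용문동
1484 | 2019 | 서구 | 2019 | 분할 | 2019-12-13 | (empty) | 관저동
1485 | 2019 | 서구 | 2019 | 분할 | 2019-12-23 | (empty) | 갈마동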

Duplicate rows

Most frequently occurring

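Index | 공사년도 | 등록구 | 민원접수년도 | 공사구분 | 준공일 | 연장 | 법정동 | # duplicates
39 | 2019 | 대덕구 | 2019 | 분할 | 2019-12-16 | (empty) | 오정동 | 31
18 | 2019 | 대덕구 | 2019 | 분할 | 2019-05-13 | (empty) | 평촌동 | 30
82 | 2019 | 동구 | 2019 | 분할 | 2019-12-13 | (empty) | 가양동 | 30
23 | 2019 | 대덕구 | 2019 | 분할 | 2019-06-10 | (empty) | 석봉동 | 18
97 | 2019 | 중구 | 2019 | 분할 | 2019-04-17 | (empty) | 사정동 | 18
108 | 2019 | 중구 | 2019 | 분할 | 2019-08-16 | (empty) | 호동 | 18
19 | 2019 | 대덕구 | 2019 | 분할 | 2019-05-17 | (empty) | 평촌동 | 16
30 | 2019 | 대덕구 | 2019 | 분할 | 2019-10-01 | (empty) | 평촌동 | 16
75 | 2019 | 동구 | 2019 | 분할 | 2019-09-27 | (empty) | 가양동 | 16
113 | 2019 | 중구 | 2019 | 분할 | 2019-10-29 | (empty) | 문화동 | 16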
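
A table like the one above can be built by counting identical rows and keeping those that occur more than once. A minimal sketch using only the standard library; the helper name `duplicate_groups` is illustrative, the input is the ten first rows shown in the Sample section, and the profiling tool's exact definition of a duplicate count may differ.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows occurring more than once,
    most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    groups = [(row, n) for row, n in counts.items() if n > 1]
    groups.sort(key=lambda item: item[1], reverse=True)
    return groups


# The ten first rows from the Sample section ("" marks an empty 연장).
first_rows = [
    ("2019", "중구", "2019", "구경확대", "2019-01-17", "10m", "대흥동"),
    ("2019", "동구", "2019", "신규", "2019-01-19", "5m", "낭월동"),
    ("2019", "대덕구", "2019", "분할", "2019-01-18", "", "송촌동"),
    ("2019", "중구", "2019", "이설", "2019-01-25", "7m", "대흥동"),
    ("2019", "동구", "2019", "구경확대", "2019-01-19", "7m", "가양동"),
    ("2019", "대덕구", "2019", "분할", "2019-01-18", "", "송촌동"),
    ("2019", "동구", "2019", "분할", "2019-01-14", "", "가양동"),
    ("2019", "중구", "2019", "분할", "2019-01-25", "", "유천동"),
    ("2019", "대덕구", "2019", "분할", "2019-01-18", "", "송촌동"),
    ("2019", "동구", "2019", "분할", "2019-01-14", "", "가양동"),
]

groups = duplicate_groups(first_rows)
for row, n in groups:
    print(n, row)
```

Even within the first ten rows, two groups repeat. Both are 분할 records with an empty 연장, the same pattern seen in every row of the table above.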