Overview

Dataset statistics

Number of variables: 8
Number of observations: 1471
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 94.9 KiB
Average record size in memory: 66.1 B

Variable types

Numeric: 2
Categorical: 4
Text: 1
DateTime: 1

Dataset

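Description: Information on the smart water grid operated by Sejong Special Self-Governing City (세종특별자치시): the administrative dong (neighborhood) with jurisdiction, installation location, business type, meter type, diameter, and whether a remote meter-reading terminal is installed.
URL: https://www.data.go.kr/data/15103441/fileData.do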

Alerts

원격검침단말기 설치여부 has constant value "설치완료" (Constant)
데이터기준일 has constant value "2023-07-11" (Constant)
연번 is highly overall correlated with 행정동 (High correlation)
구경 is highly overall correlated with 계량기종류 (High correlation)
행정동 is highly overall correlated with 연번 (High correlation)
계량기종류 is highly overall correlated with 구경 (High correlation)
업종 is highly imbalanced (50.2%) (Imbalance)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 08:24:33.775718
Analysis finished: 2023-12-12 08:24:34.700661
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
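
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration that produced this report: the function name `build_report`, the `cp949` default encoding, and the report title are assumptions, and it relies on the third-party pandas and ydata-profiling packages being installed.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> Path:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse 데이터기준일 as a date column (column name taken from this report;
    # the encoding default is an assumption about the published CSV).
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["데이터기준일"])
    report = ProfileReport(df, title="Smart water grid profile")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_report("smart_water_grid.csv", "report.html")` (both file names hypothetical) would write the HTML report next to the script.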

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1471
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 736
Minimum: 1
Maximum: 1471
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 74.5
Q1: 368.5
Median: 736
Q3: 1103.5
95-th percentile: 1397.5
Maximum: 1471
Range: 1470
Interquartile range (IQR): 735
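
The quantiles above are consistent with linear interpolation between neighboring ranks. The sketch below assumes that 연번 is exactly the integers 1 through 1471 (which the distinct count, minimum, maximum, and sum all suggest); the helper `quantile` is illustrative and is not the profiler's own code.

```python
def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)  # fractional zero-based rank
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


# Assumption: 연번 runs from 1 to 1471 without gaps.
serial = list(range(1, 1472))

q05, q1, med, q3, q95 = (quantile(serial, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = serial[-1] - serial[0]
print(q05, q1, med, q3, q95, iqr, value_range)
```

Under that assumption the computed values match the table up to floating-point error: the 25% rank falls at position 367.5, halfway between 368 and 369, giving Q1 = 368.5.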

Descriptive statistics

Standard deviation: 424.78544
Coefficient of variation (CV): 0.57715413
Kurtosis: -1.2
Mean: 736
Median Absolute Deviation (MAD): 368
Skewness: 0
Sum: 1082656
Variance: 180442.67
Monotonicity: Strictly increasing
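
These descriptive statistics can be recomputed by hand. The sketch below again assumes 연번 is the integers 1 through 1471 and uses the sample variance (divisor n - 1), which is what reproduces the reported 180442.67; the variable names are illustrative.

```python
import math

# Assumption: 연번 runs from 1 to 1471 without gaps.
serial = list(range(1, 1472))
n = len(serial)

total = sum(serial)
mean = total / n
variance = sum((x - mean) ** 2 for x in serial) / (n - 1)  # sample variance
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

# Median absolute deviation: median of |x - median| (n is odd here).
median = serial[n // 2]
mad = sorted(abs(x - median) for x in serial)[n // 2]

print(total, mean, round(variance, 2), round(std, 5), round(cv, 6), mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, so a value near 0.577 is what a uniformly spaced sequence starting at 1 is expected to give.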
[Figure: histogram with fixed size bins (bins=50)]
Common values
Value                Count  Frequency (%)
1                    1      0.1%
979                  1      0.1%
988                  1      0.1%
987                  1      0.1%
986                  1      0.1%
985                  1      0.1%
984                  1      0.1%
983                  1      0.1%
982                  1      0.1%
981                  1      0.1%
Other values (1461)  1461   99.3%
Minimum 10 values
Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%
Maximum 10 values
Value  Count  Frequency (%)
1471   1      0.1%
1470   1      0.1%
1469   1      0.1%
1468   1      0.1%
1467   1      0.1%
1466   1      0.1%
1465   1      0.1%
1464   1      0.1%
1463   1      0.1%
1462   1      0.1%

행정동 (administrative dong)
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB

고운동: 348
도담동: 216
아름동: 182
보람동: 127
한솔동: 98
Other values (8): 500

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가람동
2nd row: 가람동
3rd row: 가람동
4th row: 가람동
5th row: 가람동

Common Values

Value             Count  Frequency (%)
고운동            348    23.7%
도담동            216    14.7%
아름동            182    12.4%
보람동            127    8.6%
한솔동            98     6.7%
나성동            93     6.3%
어진동            92     6.3%
새롬동            73     5.0%
다정동            63     4.3%
소담동            63     4.3%
Other values (3)  116    7.9%

Length

[Figure: histogram of lengths of the category]

주소 (address)
Text

Distinct: 1252
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB

Length

Max length: 25
Median length: 21
Mean length: 8.6437797
Min length: 5

Characters and Unicode

Total characters: 12715
Distinct characters: 163
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
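
As a small illustration of those character properties, the sketch below counts the Unicode general categories in one 주소 value taken from the sample rows of this report. It uses only Python's standard `unicodedata` module; the helper `category_counts` and the short name mapping are illustrative and cover only the categories that occur in this example.

```python
import unicodedata
from collections import Counter

# Human-readable names for the general-category codes used in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("가람동 19-17번지")
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

For this address the Hangul syllables fall under Other Letter (5), the digits under Decimal Number (4), and the space and hyphen under Space Separator (1) and Dash Punctuation (1), the same categories that dominate the column as a whole.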

Unique

Unique: 1077
Unique (%): 73.2%

Sample

1st row: 가람동 765
2nd row: 금송로 625
3rd row: 금송로 625
4th row: 금송로 650
5th row: 가람동 19번지

Words

Value               Count  Frequency (%)
한누리대로          108    3.6%
마음로              52     1.7%
고운동              51     1.7%
나성동              44     1.5%
보듬3로             43     1.4%
남세종로            39     1.3%
만남로              33     1.1%
도담서5길           31     1.0%
시청대로            31     1.0%
어진동              29     1.0%
Other values (864)  2517   84.5%

Most occurring characters

ValueCountFrequency (%)
2233
17.6%
1 1042
 
8.2%
932
 
7.3%
3 617
 
4.9%
2 598
 
4.7%
4 416
 
3.3%
397
 
3.1%
5 375
 
2.9%
6 351
 
2.8%
- 306
 
2.4%
Other values (153) 5448
42.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       5587   43.9%
Decimal Number     4516   35.5%
Space Separator    2233   17.6%
Dash Punctuation   306    2.4%
Uppercase Letter   37     0.3%
Math Symbol        12     0.1%
Close Punctuation  9      0.1%
Open Punctuation   8      0.1%
Lowercase Letter   5      < 0.1%
Other Punctuation  2      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
932
 
16.7%
397
 
7.1%
221
 
4.0%
162
 
2.9%
162
 
2.9%
153
 
2.7%
152
 
2.7%
148
 
2.6%
126
 
2.3%
125
 
2.2%
Other values (119) 3009
53.9%
Decimal Number
Value  Count  Frequency (%)
1      1042   23.1%
3      617    13.7%
2      598    13.2%
4      416    9.2%
5      375    8.3%
6      351    7.8%
0      288    6.4%
8      283    6.3%
9      274    6.1%
7      272    6.0%
Uppercase Letter
Value  Count  Frequency (%)
C      11     29.7%
H      9      24.3%
D      5      13.5%
B      5      13.5%
A      2      5.4%
L      1      2.7%
R      1      2.7%
O      1      2.7%
S      1      2.7%
J      1      2.7%
Lowercase Letter
Value  Count  Frequency (%)
f      1      20.0%
g      1      20.0%
u      1      20.0%
a      1      20.0%
n      1      20.0%
Close Punctuation
Value  Count  Frequency (%)
)      8      88.9%
]      1      11.1%
Math Symbol
Value  Count  Frequency (%)
>      6      50.0%
<      6      50.0%
Other Punctuation
Value  Count  Frequency (%)
:      1      50.0%
,      1      50.0%
Space Separator
Value    Count  Frequency (%)
(space)  2233   100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      306    100.0%
Open Punctuation
Value  Count  Frequency (%)
(      8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  7086   55.7%
Hangul  5587   43.9%
Latin   42     0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
932
 
16.7%
397
 
7.1%
221
 
4.0%
162
 
2.9%
162
 
2.9%
153
 
2.7%
152
 
2.7%
148
 
2.6%
126
 
2.3%
125
 
2.2%
Other values (119) 3009
53.9%
Common
Value             Count  Frequency (%)
(space)           2233   31.5%
1                 1042   14.7%
3                 617    8.7%
2                 598    8.4%
4                 416    5.9%
5                 375    5.3%
6                 351    5.0%
-                 306    4.3%
0                 288    4.1%
8                 283    4.0%
Other values (9)  577    8.1%
Latin
Value             Count  Frequency (%)
C                 11     26.2%
H                 9      21.4%
D                 5      11.9%
B                 5      11.9%
A                 2      4.8%
f                 1      2.4%
L                 1      2.4%
g                 1      2.4%
R                 1      2.4%
O                 1      2.4%
Other values (5)  5      11.9%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   7128   56.1%
Hangul  5587   43.9%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            2233   31.3%
1                  1042   14.6%
3                  617    8.7%
2                  598    8.4%
4                  416    5.8%
5                  375    5.3%
6                  351    4.9%
-                  306    4.3%
0                  288    4.0%
8                  283    4.0%
Other values (24)  619    8.7%
Hangul
ValueCountFrequency (%)
932
 
16.7%
397
 
7.1%
221
 
4.0%
162
 
2.9%
162
 
2.9%
153
 
2.7%
152
 
2.7%
148
 
2.6%
126
 
2.3%
125
 
2.2%
Other values (119) 3009
53.9%

업종 (business type)
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB

일반용: 810
가정용: 588
일반(학교): 43
일반(유치원): 22
일반(어린이집): 6

Length

Max length: 8
Median length: 3
Mean length: 3.1692726
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반용
2nd row: 일반용
3rd row: 일반용
4th row: 일반용
5th row: 일반용

Common Values

Value           Count  Frequency (%)
일반용          810    55.1%
가정용          588    40.0%
일반(학교)      43     2.9%
일반(유치원)    22     1.5%
일반(어린이집)  6      0.4%
대중탕용        2      0.1%
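
The 50.2% imbalance reported for 업종 is consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below applies that formula to the counts in the table above; treating this as the profiler's exact formula is an assumption, and `imbalance_score` is an illustrative name.

```python
import math

# Class counts for 업종, copied from the common-values table above.
counts = {
    "일반용": 810,
    "가정용": 588,
    "일반(학교)": 43,
    "일반(유치원)": 22,
    "일반(어린이집)": 6,
    "대중탕용": 2,
}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, approaching 1 when one class dominates."""
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


score = imbalance_score(counts.values())
print(f"{score:.1%}")
```

Two classes (일반용 and 가정용) hold about 95% of the rows, which is why the entropy is roughly half of its maximum for six classes.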

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

계량기종류 (meter type)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB

디지털식: 1290
전자식: 181

Length

Max length: 4
Median length: 4
Mean length: 3.8769545
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 디지털식
2nd row: 전자식
3rd row: 전자식
4th row: 디지털식
5th row: 디지털식

Common Values

Value     Count  Frequency (%)
디지털식  1290   87.7%
전자식    181    12.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

구경 (diameter)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.840925
Minimum: 13
Maximum: 250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.1 KiB

Quantile statistics

Minimum: 13
5-th percentile: 13
Q1: 20
Median: 40
Q3: 50
95-th percentile: 100
Maximum: 250
Range: 237
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 28.903932
Coefficient of variation (CV): 0.64458822
Kurtosis: 5.3212065
Mean: 44.840925
Median Absolute Deviation (MAD): 20
Skewness: 1.6873373
Sum: 65961
Variance: 835.43726
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=11)]
Common values
Value  Count  Frequency (%)
50     376    25.6%
20     324    22.0%
75     221    15.0%
25     199    13.5%
13     101    6.9%
40     90     6.1%
100    86     5.8%
32     49     3.3%
150    21     1.4%
250    2      0.1%
Minimum 10 values
Value  Count  Frequency (%)
13     101    6.9%
20     324    22.0%
25     199    13.5%
32     49     3.3%
40     90     6.1%
50     376    25.6%
75     221    15.0%
100    86     5.8%
150    21     1.4%
200    2      0.1%
Maximum 10 values
Value  Count  Frequency (%)
250    2      0.1%
200    2      0.1%
150    21     1.4%
100    86     5.8%
75     221    15.0%
50     376    25.6%
40     90     6.1%
32     49     3.3%
25     199    13.5%
20     324    22.0%
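
Because 구경 takes only 11 distinct values, the value counts listed above are enough to recompute its summary statistics without the raw rows. The following is a sketch under that assumption: the counts are copied from the tables above, the variance uses the sample divisor n - 1, and the names are illustrative.

```python
import math

# Value counts for 구경, combined from the minimum/maximum tables above.
value_counts = {
    13: 101, 20: 324, 25: 199, 32: 49, 40: 90, 50: 376,
    75: 221, 100: 86, 150: 21, 200: 2, 250: 2,
}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)


def weighted_median(vc):
    """Median of the expanded data, found by walking the cumulative counts (odd n)."""
    middle = (sum(vc.values()) + 1) // 2  # 1-based rank of the middle element
    seen = 0
    for value in sorted(vc):
        seen += vc[value]
        if seen >= middle:
            return value


median = weighted_median(value_counts)
print(n, total, round(mean, 6), round(variance, 4), round(std, 5), median)
```

The counts add up to the 1471 observations and the weighted sum to the reported 65961, which is a quick consistency check on the frequency tables themselves.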

원격검침단말기 설치여부 (remote meter-reading terminal installed)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB

설치완료: 1471

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 설치완료
2nd row: 설치완료
3rd row: 설치완료
4th row: 설치완료
5th row: 설치완료

Common Values

Value     Count  Frequency (%)
설치완료  1471   100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

데이터기준일 (data reference date)
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.6 KiB
Minimum: 2023-07-11 00:00:00
Maximum: 2023-07-11 00:00:00
[Figure: histogram with fixed size bins (bins=1)]

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]
            연번   행정동  업종   계량기종류  구경
연번        1.000  0.944   0.291  0.355       0.363
행정동      0.944  1.000   0.388  0.318       0.472
업종        0.291  0.388   1.000  0.251       0.470
계량기종류  0.355  0.318   0.251  1.000       0.646
구경        0.363  0.472   0.470  0.646       1.000

[Figure: correlation heatmap]
            계량기종류  행정동  업종
계량기종류  1.000       0.295   0.180
행정동      0.295       1.000   0.203
업종        0.180       0.203   1.000

[Figure: correlation heatmap]
            연번   구경   행정동  업종   계량기종류
연번        1.000  0.170  0.790   0.157  0.271
구경        0.170  1.000  0.241   0.303  0.696
행정동      0.790  0.241  1.000   0.203  0.295
업종        0.157  0.303  0.203   1.000  0.180
계량기종류  0.271  0.696  0.295   0.180  1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
(index)  연번  행정동  주소  업종  계량기종류  구경  원격검침단말기 설치여부  데이터기준일
0  1   가람동  가람동 765        일반용  디지털식  25   설치완료  2023-07-11
1  2   가람동  금송로 625        일반용  전자식    250  설치완료  2023-07-11
2  3   가람동  금송로 625        일반용  전자식    75   설치완료  2023-07-11
3  4   가람동  금송로 650        일반용  디지털식  25   설치완료  2023-07-11
4  5   가람동  가람동 19번지     일반용  디지털식  20   설치완료  2023-07-11
5  6   가람동  가람동 19-17번지  일반용  디지털식  20   설치완료  2023-07-11
6  7   가람동  금송로 687        일반용  디지털식  75   설치완료  2023-07-11
7  8   가람동  라온로 66         일반용  전자식    75   설치완료  2023-07-11
8  9   가람동  라온로 121        일반용  디지털식  100  설치완료  2023-07-11
9  10  가람동  라온로 82         일반용  디지털식  50   설치완료  2023-07-11

Last rows
(index)  연번  행정동  주소  업종  계량기종류  구경  원격검침단말기 설치여부  데이터기준일
1461  1462  종촌동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1462  1463  고운동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1463  1464  고운동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1464  1465  도담동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1465  1466  고운동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1466  1467  한솔동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1467  1468  고운동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1468  1469  아름동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1469  1470  도담동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11
1470  1471  새롬동  시설사업소 녹지관리과  일반용  디지털식  13  설치완료  2023-07-11