Overview

Dataset statistics

Number of variables: 4
Number of observations: 4563
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 147.2 KiB
Average record size in memory: 33.0 B
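
The average record size appears to be the total in-memory size divided by the number of observations. A quick check of that arithmetic, assuming 1 KiB = 1024 B (the variable names are illustrative and do not come from the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 147.2
n_observations = 4563

# Assumption: KiB means 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 33.0 B shown in the report.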

Variable types

Numeric: 1
Categorical: 2
Text: 1

Dataset

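Description: In response to a request for smart water meter location data for Uiseong-gun, Gyeongsangbuk-do, this dataset provides the locations of the smart water meters.
URL: https://www.data.go.kr/data/15103010/fileData.do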

Alerts

검침방식 (meter reading method) has constant value "무선검침" (wireless meter reading): Constant
연번 (serial number) is highly overall correlated with 지역 (district): High correlation
지역 is highly overall correlated with 연번: High correlation
연번 has unique values: Unique
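
The Constant and Unique alerts can be read as simple statements about distinct counts. The sketch below is an illustration under that assumption, not ydata-profiling's actual implementation; `column_alerts` is a hypothetical helper, and the rows are four records taken from the Sample section of this report.

```python
def column_alerts(values):
    """Return simple alert labels for one column (illustrative rules only)."""
    distinct = len(set(values))
    alerts = []
    if distinct == 1:
        alerts.append("Constant")  # every row holds the same value
    if len(values) > 1 and distinct == len(values):
        alerts.append("Unique")    # no value repeats
    return alerts

# Two first and two last rows from the Sample section.
columns = ["연번", "지역", "주소", "검침방식"]
rows = [
    (1, "의성읍", "중앙길 57", "무선검침"),
    (2, "의성읍", "중앙길 63-1", "무선검침"),
    (4562, "춘산면", "옥정리 134-2 앞 삼거리", "무선검침"),
    (4563, "가음면", "현리리 1122-1 앞 골목", "무선검침"),
]

alerts = {
    name: column_alerts([row[i] for row in rows])
    for i, name in enumerate(columns)
}
print(alerts)
```

On this four-row excerpt 주소 also comes out as unique, which does not hold for the full dataset (96.1% distinct), so the labels only match the report for 연번 and 검침방식.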

Reproduction

Analysis started: 2023-12-12 10:37:10.496370
Analysis finished: 2023-12-12 10:37:11.161425
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
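
The reported duration is consistent with the difference between the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 10:37:10.496370")
finished = datetime.fromisoformat("2023-12-12 10:37:11.161425")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```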

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 4563
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2282
Minimum: 1
Maximum: 4563
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 40.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 229.1
Q1: 1141.5
Median: 2282
Q3: 3422.5
95-th percentile: 4334.9
Maximum: 4563
Range: 4562
Interquartile range (IQR): 2281

Descriptive statistics

Standard deviation: 1317.369
Coefficient of variation (CV): 0.57728702
Kurtosis: -1.2
Mean: 2282
Median Absolute Deviation (MAD): 1141
Skewness: 0
Sum: 10412766
Variance: 1735461
Monotonicity: Strictly increasing
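
Because 연번 is strictly increasing with 4563 distinct values between 1 and 4563, it is presumably the sequence 1, 2, ..., 4563, and most of the figures above then follow from closed forms. The sketch below checks them under that assumption; it treats the reported variance as the sample variance (n - 1 denominator) and the quantiles as linearly interpolated, which is what reproduces the reported numbers.

```python
import math
import statistics

n = 4563
values = range(1, n + 1)  # assumption: 연번 is exactly 1, 2, ..., n

total = n * (n + 1) // 2      # sum of 1..n
mean = (n + 1) / 2            # midpoint of the sequence
variance = n * (n + 1) / 12   # sample variance of 1..n
std = math.sqrt(variance)
cv = std / mean               # coefficient of variation

# Linear-interpolation quartiles and 5% steps.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 3), q1, median, q3, p5, p95, mad)
```

The results agree with the sum, mean, variance, standard deviation, CV, quartiles, percentiles, and MAD listed in the report.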
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3042 | 1 | < 0.1%
3048 | 1 | < 0.1%
3047 | 1 | < 0.1%
3046 | 1 | < 0.1%
3045 | 1 | < 0.1%
3044 | 1 | < 0.1%
3043 | 1 | < 0.1%
3041 | 1 | < 0.1%
3050 | 1 | < 0.1%
Other values (4553) | 4553 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4563 | 1 | < 0.1%
4562 | 1 | < 0.1%
4561 | 1 | < 0.1%
4560 | 1 | < 0.1%
4559 | 1 | < 0.1%
4558 | 1 | < 0.1%
4557 | 1 | < 0.1%
4556 | 1 | < 0.1%
4555 | 1 | < 0.1%
4554 | 1 | < 0.1%

지역 (district)
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 35.8 KiB

의성읍: 965
다인면: 673
안계면: 463
금성면: 452
단북면: 273
Other values (13): 1737

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 의성읍
2nd row: 의성읍
3rd row: 의성읍
4th row: 의성읍
5th row: 의성읍

Common Values

Value | Count | Frequency (%)
의성읍 | 965 | 21.1%
다인면 | 673 | 14.7%
안계면 | 463 | 10.1%
금성면 | 452 | 9.9%
단북면 | 273 | 6.0%
봉양면 | 226 | 5.0%
안평면 | 218 | 4.8%
비안면 | 210 | 4.6%
단촌면 | 210 | 4.6%
점곡면 | 209 | 4.6%
Other values (8) | 664 | 14.6%
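
The frequency column and the "Other values" row can be rederived from the counts. This sketch assumes each percentage is the count divided by the 4563 observations, rounded to one decimal, and that the remainder row collects the 8 districts outside the top ten (18 distinct values in total):

```python
# Counts copied from the Common Values table for 지역.
top_counts = {
    "의성읍": 965, "다인면": 673, "안계면": 463, "금성면": 452, "단북면": 273,
    "봉양면": 226, "안평면": 218, "비안면": 210, "단촌면": 210, "점곡면": 209,
}
n_rows = 4563      # number of observations
n_distinct = 18    # distinct values of 지역

# Share of each listed district, in percent.
shares = {name: round(100 * count / n_rows, 1) for name, count in top_counts.items()}

# Whatever is not listed ends up in the "Other values" row.
other_count = n_rows - sum(top_counts.values())
other_distinct = n_distinct - len(top_counts)
other_share = round(100 * other_count / n_rows, 1)

for name, share in shares.items():
    print(f"{name} | {top_counts[name]} | {share}%")
print(f"Other values ({other_distinct}) | {other_count} | {other_share}%")
```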

Length

[Figure: histogram of lengths of the category]

주소 (address)
Text

Distinct: 4383
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.8 KiB

Length

Max length: 34
Median length: 25
Mean length: 8.5318869
Min length: 3

Characters and Unicode

Total characters: 38931
Distinct characters: 249
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4315
Unique (%): 94.6%

Sample

1st row: 중앙길 57
2nd row: 중앙길 63-1
3rd row: 동서1길 20
4th row: 문소3길 36-3
5th row: 중앙길 56-3

Words

Value | Count | Frequency (%)
삼분2길 | 73 | 0.8%
문소1길 | 70 | 0.8%
양곡2길 | 66 | 0.7%
안계길 | 60 | 0.7%
서부로 | 58 | 0.6%
점곡길 | 57 | 0.6%
주암길 | 53 | 0.6%
자락길 | 51 | 0.6%
용연2길 | 51 | 0.6%
외정길 | 51 | 0.6%
Other values (3048) | 8511 | 93.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 4547 | 11.7%
(character not preserved) | 3834 | 9.8%
1 | 3799 | 9.8%
2 | 2738 | 7.0%
- | 2332 | 6.0%
3 | 1915 | 4.9%
5 | 1403 | 3.6%
4 | 1330 | 3.4%
6 | 1148 | 2.9%
7 | 1084 | 2.8%
Other values (239) | 14801 | 38.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 16300 | 41.9%
Other Letter | 15604 | 40.1%
Space Separator | 4547 | 11.7%
Dash Punctuation | 2332 | 6.0%
Open Punctuation | 60 | 0.2%
Close Punctuation | 60 | 0.2%
Uppercase Letter | 20 | 0.1%
Other Punctuation | 6 | < 0.1%
Math Symbol | 2 | < 0.1%
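
The category names in this table correspond to Unicode general categories, which Python exposes as two-letter codes through `unicodedata.category` (for example `Nd` for Decimal Number, `Lo` for Other Letter, `Zs` for Space Separator, `Pd` for Dash Punctuation). A minimal sketch that tallies categories for the five sample addresses shown in this report; `category_counts` is an illustrative helper, not part of ydata-profiling:

```python
import unicodedata
from collections import Counter

def category_counts(texts):
    """Count characters per Unicode general category code."""
    counts = Counter()
    for text in texts:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample rows of 주소 from the report.
sample = ["중앙길 57", "중앙길 63-1", "동서1길 20", "문소3길 36-3", "중앙길 56-3"]
counts = category_counts(sample)
print(counts.most_common())
```

Hangul syllables fall under `Lo`, which is why Other Letter is nearly as frequent as Decimal Number in the address column.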

Most frequent character per category

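Other Letter

Value | Count | Frequency (%)
(character not preserved) | 3834 | 24.6%
(character not preserved) | 660 | 4.2%
(character not preserved) | 459 | 2.9%
(character not preserved) | 390 | 2.5%
(character not preserved) | 331 | 2.1%
(character not preserved) | 318 | 2.0%
(character not preserved) | 278 | 1.8%
(character not preserved) | 270 | 1.7%
(character not preserved) | 231 | 1.5%
(character not preserved) | 216 | 1.4%
Other values (222) | 8617 | 55.2%

Decimal Number

Value | Count | Frequency (%)
1 | 3799 | 23.3%
2 | 2738 | 16.8%
3 | 1915 | 11.7%
5 | 1403 | 8.6%
4 | 1330 | 8.2%
6 | 1148 | 7.0%
7 | 1084 | 6.7%
8 | 1053 | 6.5%
9 | 927 | 5.7%
0 | 903 | 5.5%

Space Separator

Value | Count | Frequency (%)
(space) | 4547 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 2332 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 60 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 60 | 100.0%

Uppercase Letter

Value | Count | Frequency (%)
B | 20 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
/ | 6 | 100.0%

Math Symbol

Value | Count | Frequency (%)
> | 2 | 100.0%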

Most occurring scripts

Value | Count | Frequency (%)
Common | 23307 | 59.9%
Hangul | 15604 | 40.1%
Latin | 20 | 0.1%

Most frequent character per script

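Hangul

Value | Count | Frequency (%)
(character not preserved) | 3834 | 24.6%
(character not preserved) | 660 | 4.2%
(character not preserved) | 459 | 2.9%
(character not preserved) | 390 | 2.5%
(character not preserved) | 331 | 2.1%
(character not preserved) | 318 | 2.0%
(character not preserved) | 278 | 1.8%
(character not preserved) | 270 | 1.7%
(character not preserved) | 231 | 1.5%
(character not preserved) | 216 | 1.4%
Other values (222) | 8617 | 55.2%

Common

Value | Count | Frequency (%)
(space) | 4547 | 19.5%
1 | 3799 | 16.3%
2 | 2738 | 11.7%
- | 2332 | 10.0%
3 | 1915 | 8.2%
5 | 1403 | 6.0%
4 | 1330 | 5.7%
6 | 1148 | 4.9%
7 | 1084 | 4.7%
8 | 1053 | 4.5%
Other values (6) | 1958 | 8.4%

Latin

Value | Count | Frequency (%)
B | 20 | 100.0%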

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 23327 | 59.9%
Hangul | 15604 | 40.1%
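
Python's standard library has no lookup for Unicode blocks, but the two blocks reported here can be approximated by code point range. The sketch below assumes that "ASCII" means U+0000 to U+007F and that "Hangul" means the Hangul Syllables block, U+AC00 to U+D7AF; `block_of` is a hypothetical helper, and the labels simply mirror the report.

```python
from collections import Counter

def block_of(ch):
    """Classify a character into the two blocks seen in this report."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"    # assumed to match the report's ASCII label
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"   # Hangul Syllables block
    return "Other"

block_counts = Counter(block_of(ch) for ch in "중앙길 63-1")
print(block_counts)

# The two block totals in the report add up to the total character count.
reported = {"ASCII": 23327, "Hangul": 15604}
print(sum(reported.values()))
```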

Most frequent character per block

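ASCII

Value | Count | Frequency (%)
(space) | 4547 | 19.5%
1 | 3799 | 16.3%
2 | 2738 | 11.7%
- | 2332 | 10.0%
3 | 1915 | 8.2%
5 | 1403 | 6.0%
4 | 1330 | 5.7%
6 | 1148 | 4.9%
7 | 1084 | 4.6%
8 | 1053 | 4.5%
Other values (7) | 1978 | 8.5%

Hangul

Value | Count | Frequency (%)
(character not preserved) | 3834 | 24.6%
(character not preserved) | 660 | 4.2%
(character not preserved) | 459 | 2.9%
(character not preserved) | 390 | 2.5%
(character not preserved) | 331 | 2.1%
(character not preserved) | 318 | 2.0%
(character not preserved) | 278 | 1.8%
(character not preserved) | 270 | 1.7%
(character not preserved) | 231 | 1.5%
(character not preserved) | 216 | 1.4%
Other values (222) | 8617 | 55.2%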

검침방식 (meter reading method)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 35.8 KiB

무선검침 (wireless meter reading): 4563

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무선검침
2nd row: 무선검침
3rd row: 무선검침
4th row: 무선검침
5th row: 무선검침

Common Values

Value | Count | Frequency (%)
무선검침 | 4563 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 무선검침 accounts for all 4563 rows (100.0%)]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

 | 연번 | 지역
연번 | 1.000 | 0.966
지역 | 0.966 | 1.000

[Figure: correlation plot]

 | 연번 | 지역
연번 | 1.000 | 0.834
지역 | 0.834 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

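First rows

(index) | 연번 | 지역 | 주소 | 검침방식
0 | 1 | 의성읍 | 중앙길 57 | 무선검침
1 | 2 | 의성읍 | 중앙길 63-1 | 무선검침
2 | 3 | 의성읍 | 동서1길 20 | 무선검침
3 | 4 | 의성읍 | 문소3길 36-3 | 무선검침
4 | 5 | 의성읍 | 중앙길 56-3 | 무선검침
5 | 6 | 의성읍 | 중앙길 56-7 | 무선검침
6 | 7 | 의성읍 | 동서1길 31 | 무선검침
7 | 8 | 의성읍 | 중앙길 59 | 무선검침
8 | 9 | 의성읍 | 염매시장길 18 | 무선검침
9 | 10 | 의성읍 | 염매시장길 20 | 무선검침

Last rows

(index) | 연번 | 지역 | 주소 | 검침방식
4553 | 4554 | 춘산면 | 금천리 1199-3 | 무선검침
4554 | 4555 | 춘산면 | 옥정리 168-2 | 무선검침
4555 | 4556 | 춘산면 | 신흥리 1115-1 앞 도로 | 무선검침
4556 | 4557 | 춘산면 | 빙계리 435-2 앞 | 무선검침
4557 | 4558 | 춘산면 | 빙계리 588-1 앞 도로 | 무선검침
4558 | 4559 | 춘산면 | 금천리 554-1 앞 | 무선검침
4559 | 4560 | 춘산면 | 금천리 1224-57 다리 옆 | 무선검침
4560 | 4561 | 춘산면 | 옥정리 322-1 앞 | 무선검침
4561 | 4562 | 춘산면 | 옥정리 134-2 앞 삼거리 | 무선검침
4562 | 4563 | 가음면 | 현리리 1122-1 앞 골목 | 무선검침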