Overview

Dataset statistics

Number of variables: 6
Number of observations: 430
Missing cells: 418
Missing cells (%): 16.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 22.4 KiB
Average record size in memory: 53.3 B
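
The derived figures in this block follow from the raw counts. As a minimal sketch, assuming the per-column missing counts reported in the variable sections of this report (the dictionary and variable names are illustrative, not part of the report), the percentages can be re-derived like this:

```python
# Figures copied from the Dataset statistics and per-variable sections of this report.
n_variables = 6
n_observations = 430
total_size_kib = 22.4

# Missing values per column, as listed under each variable.
missing_per_column = {
    "원격검침 설치": 0,
    "2018년": 84,
    "2019년": 245,
    "2020년": 87,
    "2021년": 2,
    "합계": 0,
}

total_cells = n_variables * n_observations            # every cell in the table
missing_cells = sum(missing_per_column.values())      # cells with no value
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_cells} of {total_cells} ({missing_pct:.1f}%)")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The per-column counts add up to the reported 418 missing cells, which is 16.2% of the 2,580 cells in the table.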

Variable types

Text: 1
Numeric: 5

Dataset

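Description: Statistics by administrative dong (행정동) on smart remote meter reading units installed in Seoul between 2018 and 2021, as of July 29, 2022. The data is provided in response to a public data provision request.
Author: Seoul Metropolitan Government (서울특별시)
URL: https://www.data.go.kr/data/15102966/fileData.do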

Alerts

2021년 is highly overall correlated with 합계 (High correlation)
합계 is highly overall correlated with 2021년 (High correlation)
2018년 has 84 (19.5%) missing values (Missing)
2019년 has 245 (57.0%) missing values (Missing)
2020년 has 87 (20.2%) missing values (Missing)
원격검침 설치 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 04:10:50.071027
Analysis finished: 2023-12-12 04:10:54.125170
Duration: 4.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

원격검침 설치
Text

UNIQUE 

Distinct: 430
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

Length

Max length: 19
Median length: 17
Mean length: 12.353488
Min length: 9

Characters and Unicode

Total characters: 5312
Distinct characters: 189
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
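
As an illustration of these character properties, the sketch below classifies the characters of the first sample value with Python's standard `unicodedata` module. The `rough_script` helper is hypothetical: the standard library exposes no script property, so it approximates the script from the code point name, which may differ from how the report itself derives scripts.

```python
import unicodedata
from collections import Counter

value = "서울시 강남구 개포1동"  # first sample row of this variable

# General category per code point:
# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number
categories = Counter(unicodedata.category(ch) for ch in value)


def rough_script(ch):
    """Hypothetical helper: guess the script from the Unicode character name."""
    name = unicodedata.name(ch, "")
    return "Hangul" if name.startswith("HANGUL") else "Common"


scripts = Counter(rough_script(ch) for ch in value)

print(dict(categories))
print(dict(scripts))
```

For this one value, the nine Hangul syllables fall under Other Letter, while the two spaces and the digit are the kind of characters the report counts under the Common script.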

Unique

Unique: 430
Unique (%): 100.0%

Sample

1st row: 서울시 강남구 개포1동
2nd row: 서울시 강남구 개포2동
3rd row: 서울시 강남구 개포4동
4th row: 서울시 강남구 논현1동
5th row: 서울시 강남구 논현2동

Value                 Count   Frequency (%)
서울시                430     33.3%
송파구                24      1.9%
강남구                22      1.7%
노원구                22      1.7%
관악구                21      1.6%
성북구                20      1.6%
강서구                20      1.6%
마포구                20      1.6%
강동구                19      1.5%
서초구                18      1.4%
Other values (445)    674     52.2%
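
The word counts above are consistent with splitting each value on whitespace: every value has three tokens, so 430 values give 1,290 tokens, and 서울시 accounts for 430 of them (33.3%). A minimal sketch of that tokenization, using only the five sample rows listed above (the variable names are illustrative, and the report's own tokenizer may differ):

```python
from collections import Counter

# The five sample rows shown for this variable.
samples = [
    "서울시 강남구 개포1동",
    "서울시 강남구 개포2동",
    "서울시 강남구 개포4동",
    "서울시 강남구 논현1동",
    "서울시 강남구 논현2동",
]

# Split every value on whitespace and count each token.
tokens = [token for row in samples for token in row.split()]
counts = Counter(tokens)

# Share of all tokens taken by the city name.
city_share = counts["서울시"] / len(tokens)

for token, count in counts.most_common(3):
    print(token, count, f"{100 * count / len(tokens):.1f}%")
```

In this small sample 강남구 is as frequent as 서울시. Across all 430 rows the district tokens are spread over many districts, which is why 서울시 dominates the table.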

Most occurring characters

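Value                 Count   Frequency (%)
(space)               860     16.2%
(not preserved)       497     9.4%
(not preserved)       492     9.3%
(not preserved)       455     8.6%
(not preserved)       435     8.2%
(not preserved)       430     8.1%
(not preserved)       203     3.8%
2                     101     1.9%
1                     101     1.9%
(not preserved)       78      1.5%
Other values (179)    1660    31.2%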

Most occurring categories

Value                 Count   Frequency (%)
Other Letter          4138    77.9%
Space Separator       860     16.2%
Decimal Number        306     5.8%
Other Punctuation     8       0.2%

Most frequent character per category

Other Letter
Value                 Count   Frequency (%)
(not preserved)       497     12.0%
(not preserved)       492     11.9%
(not preserved)       455     11.0%
(not preserved)       435     10.5%
(not preserved)       430     10.4%
(not preserved)       203     4.9%
(not preserved)       78      1.9%
(not preserved)       53      1.3%
(not preserved)       48      1.2%
(not preserved)       45      1.1%
Other values (167)    1402    33.9%

Decimal Number
Value                 Count   Frequency (%)
2                     101     33.0%
1                     101     33.0%
3                     46      15.0%
4                     27      8.8%
5                     12      3.9%
6                     8       2.6%
7                     6       2.0%
8                     3       1.0%
9                     1       0.3%
0                     1       0.3%

Space Separator
Value                 Count   Frequency (%)
(space)               860     100.0%

Other Punctuation
Value                 Count   Frequency (%)
.                     8       100.0%

Most occurring scripts

Value                 Count   Frequency (%)
Hangul                4138    77.9%
Common                1174    22.1%

Most frequent character per script

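Hangul
Value                 Count   Frequency (%)
(not preserved)       497     12.0%
(not preserved)       492     11.9%
(not preserved)       455     11.0%
(not preserved)       435     10.5%
(not preserved)       430     10.4%
(not preserved)       203     4.9%
(not preserved)       78      1.9%
(not preserved)       53      1.3%
(not preserved)       48      1.2%
(not preserved)       45      1.1%
Other values (167)    1402    33.9%

Common
Value                 Count   Frequency (%)
(space)               860     73.3%
2                     101     8.6%
1                     101     8.6%
3                     46      3.9%
4                     27      2.3%
5                     12      1.0%
.                     8       0.7%
6                     8       0.7%
7                     6       0.5%
8                     3       0.3%
Other values (2)      2       0.2%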

Most occurring blocks

Value                 Count   Frequency (%)
Hangul                4138    77.9%
ASCII                 1174    22.1%

Most frequent character per block

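ASCII
Value                 Count   Frequency (%)
(space)               860     73.3%
2                     101     8.6%
1                     101     8.6%
3                     46      3.9%
4                     27      2.3%
5                     12      1.0%
.                     8       0.7%
6                     8       0.7%
7                     6       0.5%
8                     3       0.3%
Other values (2)      2       0.2%

Hangul
Value                 Count   Frequency (%)
(not preserved)       497     12.0%
(not preserved)       492     11.9%
(not preserved)       455     11.0%
(not preserved)       435     10.5%
(not preserved)       430     10.4%
(not preserved)       203     4.9%
(not preserved)       78      1.9%
(not preserved)       53      1.3%
(not preserved)       48      1.2%
(not preserved)       45      1.1%
Other values (167)    1402    33.9%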

2018년
Real number (ℝ)

MISSING 

Distinct: 29
Distinct (%): 8.4%
Missing: 84
Missing (%): 19.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.5867052
Minimum: 1
Maximum: 135
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1.25
Median: 3
Q3: 6
95-th percentile: 15
Maximum: 135
Range: 134
Interquartile range (IQR): 4.75

Descriptive statistics

Standard deviation: 9.8528152
Coefficient of variation (CV): 1.7636182
Kurtosis: 94.37923
Mean: 5.5867052
Median Absolute Deviation (MAD): 2
Skewness: 8.3081279
Sum: 1933
Variance: 97.077968
Monotonicity: Not monotonic
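
Several of these statistics are simple functions of one another. The sketch below re-derives them from the figures reported for 2018년, assuming the usual definitions (mean = sum / non-missing count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). The variable names are illustrative.

```python
# Figures copied from the 2018년 section of this report.
n_observations = 430
n_missing = 84
total = 1933
mean = 5.5867052
std = 9.8528152
q1, q3 = 1.25, 6
minimum, maximum = 1, 135

n_valid = n_observations - n_missing      # rows that actually have a value

mean_check = total / n_valid              # mean over non-missing rows
cv = std / mean                           # coefficient of variation
variance = std ** 2                       # variance from the standard deviation
iqr = q3 - q1                             # interquartile range
value_range = maximum - minimum           # range

print(f"valid rows: {n_valid}")
print(f"mean: {mean_check:.7f}  CV: {cv:.7f}  variance: {variance:.6f}")
print(f"IQR: {iqr}  range: {value_range}")
```

A CV well above 1, together with the large skewness and a maximum of 135 against a median of 3, points to a heavily right-skewed column in which a few districts account for most installations.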
Histogram with fixed size bins (bins=29)

Common values
Value                 Count   Frequency (%)
1                     87      20.2%
2                     56      13.0%
3                     38      8.8%
4                     35      8.1%
5                     24      5.6%
6                     21      4.9%
8                     18      4.2%
7                     16      3.7%
9                     10      2.3%
11                    8       1.9%
Other values (19)     33      7.7%
(Missing)             84      19.5%

Minimum 10 values
Value                 Count   Frequency (%)
1                     87      20.2%
2                     56      13.0%
3                     38      8.8%
4                     35      8.1%
5                     24      5.6%
6                     21      4.9%
7                     16      3.7%
8                     18      4.2%
9                     10      2.3%
10                    4       0.9%

Maximum 10 values
Value                 Count   Frequency (%)
135                   1       0.2%
75                    1       0.2%
45                    2       0.5%
36                    1       0.2%
35                    1       0.2%
29                    1       0.2%
28                    1       0.2%
25                    1       0.2%
24                    1       0.2%
21                    1       0.2%

2019년
Real number (ℝ)

MISSING 

Distinct: 13
Distinct (%): 7.0%
Missing: 245
Missing (%): 57.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6756757
Minimum: 1
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 8.8
Maximum: 33
Range: 32
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.5999249
Coefficient of variation (CV): 1.3454265
Kurtosis: 36.204179
Mean: 2.6756757
Median Absolute Deviation (MAD): 1
Skewness: 5.2321864
Sum: 495
Variance: 12.959459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)

Common values
Value                 Count   Frequency (%)
1                     91      21.2%
2                     41      9.5%
3                     17      4.0%
4                     12      2.8%
5                     8       1.9%
10                    3       0.7%
8                     3       0.7%
9                     3       0.7%
12                    2       0.5%
6                     2       0.5%
Other values (3)      3       0.7%
(Missing)             245     57.0%

Minimum 10 values
Value                 Count   Frequency (%)
1                     91      21.2%
2                     41      9.5%
3                     17      4.0%
4                     12      2.8%
5                     8       1.9%
6                     2       0.5%
7                     1       0.2%
8                     3       0.7%
9                     3       0.7%
10                    3       0.7%

Maximum 10 values
Value                 Count   Frequency (%)
33                    1       0.2%
26                    1       0.2%
12                    2       0.5%
10                    3       0.7%
9                     3       0.7%
8                     3       0.7%
7                     1       0.2%
6                     2       0.5%
5                     8       1.9%
4                     12      2.8%

2020년
Real number (ℝ)

MISSING 

Distinct: 47
Distinct (%): 13.7%
Missing: 87
Missing (%): 20.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.119534
Minimum: 1
Maximum: 313
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 10
95-th percentile: 37
Maximum: 313
Range: 312
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 21.000703
Coefficient of variation (CV): 2.0752639
Kurtosis: 128.43745
Mean: 10.119534
Median Absolute Deviation (MAD): 3
Skewness: 9.6218987
Sum: 3471
Variance: 441.02953
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Common values
Value                 Count   Frequency (%)
1                     56      13.0%
2                     49      11.4%
4                     31      7.2%
3                     30      7.0%
5                     23      5.3%
9                     21      4.9%
6                     20      4.7%
8                     13      3.0%
10                    11      2.6%
12                    10      2.3%
Other values (37)     79      18.4%
(Missing)             87      20.2%

Minimum 10 values
Value                 Count   Frequency (%)
1                     56      13.0%
2                     49      11.4%
3                     30      7.0%
4                     31      7.2%
5                     23      5.3%
6                     20      4.7%
7                     8       1.9%
8                     13      3.0%
9                     21      4.9%
10                    11      2.6%

Maximum 10 values
Value                 Count   Frequency (%)
313                   1       0.2%
103                   1       0.2%
88                    1       0.2%
77                    2       0.5%
56                    1       0.2%
55                    1       0.2%
50                    1       0.2%
49                    1       0.2%
47                    1       0.2%
45                    2       0.5%

2021년
Real number (ℝ)

HIGH CORRELATION 

Distinct: 232
Distinct (%): 54.2%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.35047
Minimum: 1
Maximum: 1829
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 27.75
Median: 81
Q3: 191.25
95-th percentile: 526.6
Maximum: 1829
Range: 1828
Interquartile range (IQR): 163.5

Descriptive statistics

Standard deviation: 202.49044
Coefficient of variation (CV): 1.3467896
Kurtosis: 17.955066
Mean: 150.35047
Median Absolute Deviation (MAD): 63
Skewness: 3.42894
Sum: 64350
Variance: 41002.378
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
6                     11      2.6%
2                     7       1.6%
19                    7       1.6%
22                    7       1.6%
5                     7       1.6%
33                    6       1.4%
16                    6       1.4%
71                    5       1.2%
36                    5       1.2%
46                    5       1.2%
Other values (222)    362     84.2%

Minimum 10 values
Value                 Count   Frequency (%)
1                     1       0.2%
2                     7       1.6%
3                     3       0.7%
4                     3       0.7%
5                     7       1.6%
6                     11      2.6%
7                     3       0.7%
8                     4       0.9%
9                     3       0.7%
10                    2       0.5%

Maximum 10 values
Value                 Count   Frequency (%)
1829                  1       0.2%
1487                  1       0.2%
1124                  1       0.2%
1094                  1       0.2%
927                   1       0.2%
875                   1       0.2%
843                   1       0.2%
760                   1       0.2%
744                   1       0.2%
732                   1       0.2%

합계
Real number (ℝ)

HIGH CORRELATION 

Distinct: 244
Distinct (%): 56.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 163.36977
Minimum: 1
Maximum: 1830
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 37
Median: 94
Q3: 202.25
95-th percentile: 553.6
Maximum: 1830
Range: 1829
Interquartile range (IQR): 165.25

Descriptive statistics

Standard deviation: 206.70783
Coefficient of variation (CV): 1.2652759
Kurtosis: 16.224858
Mean: 163.36977
Median Absolute Deviation (MAD): 69
Skewness: 3.2625022
Sum: 70249
Variance: 42728.126
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
6                     7       1.6%
11                    7       1.6%
42                    6       1.4%
28                    6       1.4%
31                    6       1.4%
56                    6       1.4%
198                   5       1.2%
64                    5       1.2%
9                     5       1.2%
34                    5       1.2%
Other values (234)    372     86.5%

Minimum 10 values
Value                 Count   Frequency (%)
1                     1       0.2%
2                     1       0.2%
3                     3       0.7%
4                     1       0.2%
5                     3       0.7%
6                     7       1.6%
7                     1       0.2%
8                     4       0.9%
9                     5       1.2%
10                    4       0.9%

Maximum 10 values
Value                 Count   Frequency (%)
1830                  1       0.2%
1488                  1       0.2%
1125                  1       0.2%
1096                  1       0.2%
1039                  1       0.2%
954                   1       0.2%
849                   1       0.2%
767                   1       0.2%
761                   1       0.2%
758                   1       0.2%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations

          2018년   2019년   2020년   2021년   합계
2018년    1.000    0.000    0.640    0.381    0.457
2019년    0.000    1.000    0.585    0.000    0.000
2020년    0.640    0.585    1.000    0.217    0.274
2021년    0.381    0.000    0.217    1.000    0.996
합계      0.457    0.000    0.274    0.996    1.000

          2018년   2019년   2020년   2021년   합계
2018년    1.000    0.086    0.219    0.127    0.191
2019년    0.086    1.000    0.199    0.174    0.213
2020년    0.219    0.199    1.000    0.262    0.354
2021년    0.127    0.174    0.262    1.000    0.989
합계      0.191    0.213    0.354    0.989    1.000
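
The High correlation alerts can be related back to these matrices by scanning for pairs above a cutoff. The sketch below does this for the first matrix. The cutoff of 0.9 is an assumption chosen for illustration. It is not taken from this report's config.json, and the report's actual alert rule may differ.

```python
columns = ["2018년", "2019년", "2020년", "2021년", "합계"]

# First correlation matrix from this section, row by row.
matrix = [
    [1.000, 0.000, 0.640, 0.381, 0.457],
    [0.000, 1.000, 0.585, 0.000, 0.000],
    [0.640, 0.585, 1.000, 0.217, 0.274],
    [0.381, 0.000, 0.217, 1.000, 0.996],
    [0.457, 0.000, 0.274, 0.996, 1.000],
]

THRESHOLD = 0.9  # assumed cutoff, for illustration only

# Walk the upper triangle so each pair is reported once.
flagged = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in flagged:
    print(f"{left} ~ {right}: {value:.3f}")
```

With this cutoff only the 2021년 / 합계 pair is flagged, which matches the two High correlation alerts in the Overview. The same pair is the only one above 0.9 in the second matrix (0.989).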

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
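
One common way to compute a nullity correlation is the Pearson correlation between per-column presence indicators. The sketch below applies that definition to the first ten sample rows of this report (index 0 to 9). It assumes this definition matches what the heatmap shows, and ten rows are far too few to say anything about the full dataset. The `pearson` helper and variable names are illustrative.

```python
import math

# 1 = value present, 0 = missing, for sample rows 0-9 of this report.
present = {
    "2018년": [1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    "2019년": [0, 1, 0, 0, 1, 1, 1, 1, 0, 0],
    "2020년": [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}


def pearson(x, y):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


r_2018_2019 = pearson(present["2018년"], present["2019년"])
r_2018_2020 = pearson(present["2018년"], present["2020년"])

print(f"2018년 vs 2019년: {r_2018_2019:.3f}")
print(f"2018년 vs 2020년: {r_2018_2020:.3f}")
```

A value near 1 would mean two columns tend to be present or missing together, a value near -1 that one tends to be present when the other is missing, and a value near 0 that their missingness is unrelated.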

Sample

      원격검침 설치             2018년   2019년   2020년   2021년   합계
0     서울시 강남구 개포1동     1        <NA>     <NA>     2        3
1     서울시 강남구 개포2동     <NA>     1        1        26       28
2     서울시 강남구 개포4동     1        <NA>     8        19       28
3     서울시 강남구 논현1동     21       <NA>     103      204      328
4     서울시 강남구 논현2동     7        2        29       116      154
5     서울시 강남구 대치1동     1        1        1        9        12
6     서울시 강남구 대치2동     4        1        10       16       31
7     서울시 강남구 대치4동     9        1        1        22       33
8     서울시 강남구 도곡1동     5        <NA>     2        16       23
9     서울시 강남구 도곡2동     4        <NA>     2        23       29

      원격검침 설치             2018년   2019년   2020년   2021년   합계
420   서울시 중랑구 면목제5동   <NA>     <NA>     2        9        11
421   서울시 중랑구 면목제7동   1        1        3        200      205
422   서울시 중랑구 묵제1동     3        3        12       33       51
423   서울시 중랑구 묵제2동     2        <NA>     3        6        11
424   서울시 중랑구 상봉제1동   <NA>     1        2        21       24
425   서울시 중랑구 상봉제2동   4        <NA>     3        390      397
426   서울시 중랑구 신내1동     5        <NA>     <NA>     14       19
427   서울시 중랑구 신내2동     7        <NA>     4        18       29
428   서울시 중랑구 중화제1동   <NA>     <NA>     7        58       65
429   서울시 중랑구 중화제2동   3        <NA>     <NA>     117      120
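
In the sample rows, 합계 equals the sum of the yearly columns when missing values are treated as zero, and the column sums reported earlier agree (1933 + 495 + 3471 + 64350 = 70249). The sketch below checks this for the first ten sample rows. It is an illustration under that treat-missing-as-zero assumption, not a statement about how the source data was compiled.

```python
NA = None  # stands in for <NA> in the sample table

# (district, 2018년, 2019년, 2020년, 2021년, 합계) for sample rows 0-9.
rows = [
    ("서울시 강남구 개포1동", 1, NA, NA, 2, 3),
    ("서울시 강남구 개포2동", NA, 1, 1, 26, 28),
    ("서울시 강남구 개포4동", 1, NA, 8, 19, 28),
    ("서울시 강남구 논현1동", 21, NA, 103, 204, 328),
    ("서울시 강남구 논현2동", 7, 2, 29, 116, 154),
    ("서울시 강남구 대치1동", 1, 1, 1, 9, 12),
    ("서울시 강남구 대치2동", 4, 1, 10, 16, 31),
    ("서울시 강남구 대치4동", 9, 1, 1, 22, 33),
    ("서울시 강남구 도곡1동", 5, NA, 2, 16, 23),
    ("서울시 강남구 도곡2동", 4, NA, 2, 23, 29),
]

# Rows where the yearly values (missing counted as 0) do not add up to 합계.
mismatches = [
    name
    for name, *years, total in rows
    if sum(v for v in years if v is not NA) != total
]

# Share of the total contributed by 2021년, from the column sums in this report.
share_2021 = 64350 / 70249

print("rows that do not add up:", mismatches)
print(f"2021년 share of 합계 across all rows: {share_2021:.1%}")
```

Because 2021년 contributes roughly nine tenths of the overall total, 합계 moves almost in lockstep with it, which is consistent with the High correlation alert between the two columns.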