Overview

Dataset statistics

Number of variables: 8
Number of observations: 86
Missing cells: 41
Missing cells (%): 6.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.6 KiB
Average record size in memory: 66.5 B
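
The headline figures above are related by simple arithmetic. A minimal sketch that rechecks two of them from the reported values (the variable names are illustrative and not part of the report):

```python
# Values copied from the Dataset statistics above.
n_variables = 8
n_observations = 86
missing_cells = 41
avg_record_bytes = 66.5

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size implied by the average record size, in KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size: {total_kib:.1f} KiB")
```

Both results round to the reported 6.0% and 5.6 KiB.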

Variable types

Categorical: 5
Text: 3

Dataset

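Description: Vacant-house statistics (by housing type) for each legal dong (법정동) within Yeongdeungpo-gu, Seoul: (1) location data: lot number, road name, latitude/longitude; (2) housing type: detached house, multi-household house, row house, apartment; (3) vacancy determination date; (4) gross floor area, unpermitted status,
Author: Yeongdeungpo-gu, Seoul (서울특별시 영등포구)
URL: https://www.data.go.kr/data/15036251/fileData.do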

Alerts

무허가 is highly overall correlated with 빈집판정일 (High correlation)
빈집판정일 is highly overall correlated with 무허가 (High correlation)
현황용도 is highly imbalanced (68.3%) (Imbalance)
도로명주소 has 8 (9.3%) missing values (Missing)
연면적 has 33 (38.4%) missing values (Missing)
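
The 68.3% imbalance figure for 현황용도 can be reproduced from the category counts reported in that variable's section (77, 5, 2, 2), assuming the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value. That formula is an assumption about how the profiler derives the number, and `imbalance_score` is an illustrative name:

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# Counts of 단독, 다가구, 다세대, 아파트 as reported for 현황용도.
score = imbalance_score([77, 5, 2, 2])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.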

Reproduction

Analysis started: 2023-12-12 12:11:40.383511
Analysis finished: 2023-12-12 12:11:41.258630
Duration: 0.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
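
A report of this kind can be regenerated with ydata-profiling. The sketch below is hedged: the CSV file name, the `cp949` encoding, and the report title are all assumptions, because the original settings live in config.json, which is not reproduced here.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded, as is common for Korean public data.
    df = pd.read_csv(csv_path, encoding="cp949")
    report = ProfileReport(df, title="Vacant houses in Yeongdeungpo-gu")
    report.to_file(html_path)
    return html_path


# Hypothetical usage; the file name is a placeholder:
# build_report("yeongdeungpo_vacant_houses.csv")
```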

Variables

행정동 (administrative dong)
Categorical

Distinct: 18
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
신길동 | 33
당산동1가 | 15
영등포동5가 | 8
영등포동6가 | 4
대림동 | 4
Other values (13) | 22

Length

Max length: 6
Median length: 5
Mean length: 4.255814
Min length: 3

Unique

Unique: 6
Unique (%): 7.0%
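
Distinct counts every different value, while Unique counts only the values that occur exactly once. A minimal sketch using per-value counts for 행정동: the first ten counts come from this variable's Common Values table, and the last eight are inferred from its "Other values (8)" row (10 rows in total) together with Unique = 6, so they are an assumption rather than reported data.

```python
# Occurrences per distinct value of 행정동.
counts = [
    33, 15, 8, 4, 4, 4, 2, 2, 2, 2,  # listed in Common Values
    2, 2, 1, 1, 1, 1, 1, 1,          # inferred split of the remaining 10 rows
]

n_rows = sum(counts)
distinct = len(counts)                     # number of different values
unique = sum(1 for c in counts if c == 1)  # values that appear exactly once

print(f"Distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n_rows:.1f}%)")
```

The printed percentages match the reported 20.9% and 7.0%.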

Sample

1st row: 당산동1가
2nd row: 당산동1가
3rd row: 당산동1가
4th row: 당산동1가
5th row: 당산동1가

Common Values

Value | Count | Frequency (%)
신길동 | 33 | 38.4%
당산동1가 | 15 | 17.4%
영등포동5가 | 8 | 9.3%
영등포동6가 | 4 | 4.7%
대림동 | 4 | 4.7%
양평동3가 | 4 | 4.7%
당산동4가 | 2 | 2.3%
문래동3가 | 2 | 2.3%
양평동1가 | 2 | 2.3%
여의도동 | 2 | 2.3%
Other values (8) | 10 | 11.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
신길동 | 33 | 38.4%
당산동1가 | 15 | 17.4%
영등포동5가 | 8 | 9.3%
영등포동6가 | 4 | 4.7%
대림동 | 4 | 4.7%
양평동3가 | 4 | 4.7%
당산동3가 | 2 | 2.3%
영등포동 | 2 | 2.3%
여의도동 | 2 | 2.3%
양평동1가 | 2 | 2.3%
Other values (8) | 10 | 11.6%

지번주소 (lot-number address)
Text

Distinct: 83
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Length

Max length: 24
Median length: 23
Mean length: 21.127907
Min length: 18

Characters and Unicode

Total characters: 1817
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
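
Python's standard `unicodedata` module exposes the same general-category property. A small sketch applied to one sample value of this variable; the long category names are mapped by hand for the four categories this report lists, and `category_counts` is an illustrative helper:

```python
import unicodedata
from collections import Counter

# Long names for the general categories that appear in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}


print(category_counts("서울특별시 영등포구 당산동1가 186-13"))
```

For this 23-character address the sketch finds 13 letters, 6 digits, 3 spaces, and 1 dash.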

Unique

Unique: 82
Unique (%): 95.3%

Sample

1st row: 서울특별시 영등포구 당산동1가 100
2nd row: 서울특별시 영등포구 당산동1가 100-1
3rd row: 서울특별시 영등포구 당산동1가 186-13
4th row: 서울특별시 영등포구 당산동1가 186-14
5th row: 서울특별시 영등포구 당산동1가 186-15

Words

Value | Count | Frequency (%)
서울특별시 | 86 | 25.0%
영등포구 | 86 | 25.0%
신길동 | 33 | 9.6%
당산동1가 | 15 | 4.4%
영등포동5가 | 8 | 2.3%
81 | 4 | 1.2%
영등포동6가 | 4 | 1.2%
양평동3가 | 4 | 1.2%
대림동 | 4 | 1.2%
문래동3가 | 2 | 0.6%
Other values (93) | 98 | 28.5%

Most occurring characters

ValueCountFrequency (%)
258
 
14.2%
102
 
5.6%
102
 
5.6%
102
 
5.6%
1 88
 
4.8%
86
 
4.7%
86
 
4.7%
86
 
4.7%
86
 
4.7%
86
 
4.7%
Other values (26) 735
40.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1095 | 60.3%
Decimal Number | 391 | 21.5%
Space Separator | 258 | 14.2%
Dash Punctuation | 73 | 4.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
102
9.3%
102
9.3%
102
9.3%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
Other values (14) 187
17.1%
Decimal Number
ValueCountFrequency (%)
1 88
22.5%
3 49
12.5%
2 47
12.0%
4 44
11.3%
8 41
10.5%
5 35
 
9.0%
6 33
 
8.4%
0 22
 
5.6%
7 18
 
4.6%
9 14
 
3.6%
Space Separator
ValueCountFrequency (%)
258
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 73
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1095
60.3%
Common 722
39.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
102
9.3%
102
9.3%
102
9.3%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
Other values (14) 187
17.1%
Common
ValueCountFrequency (%)
258
35.7%
1 88
 
12.2%
- 73
 
10.1%
3 49
 
6.8%
2 47
 
6.5%
4 44
 
6.1%
8 41
 
5.7%
5 35
 
4.8%
6 33
 
4.6%
0 22
 
3.0%
Other values (2) 32
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1095
60.3%
ASCII 722
39.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
258
35.7%
1 88
 
12.2%
- 73
 
10.1%
3 49
 
6.8%
2 47
 
6.5%
4 44
 
6.1%
8 41
 
5.7%
5 35
 
4.8%
6 33
 
4.6%
0 22
 
3.0%
Other values (2) 32
 
4.4%
Hangul
ValueCountFrequency (%)
102
9.3%
102
9.3%
102
9.3%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
86
7.9%
Other values (14) 187
17.1%

도로명주소 (road-name address)
Text

MISSING

Distinct: 74
Distinct (%): 94.9%
Missing: 8
Missing (%): 9.3%
Memory size: 820.0 B

Length

Max length: 34
Median length: 33
Mean length: 28.717949
Min length: 24

Characters and Unicode

Total characters: 2240
Distinct characters: 63
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 91.0%

Sample

1st row: 서울특별시 영등포구 영신로37길 4-1 (당산동1가)
2nd row: 서울특별시 영등포구 영신로37길 4-1 (당산동1가)
3rd row: 서울특별시 영등포구 영신로 167-1 (당산동1가)
4th row: 서울특별시 영등포구 영신로 167-3 (당산동1가)
5th row: 서울특별시 영등포구 영신로 167-5 (당산동1가)

Words

Value | Count | Frequency (%)
서울특별시 | 78 | 19.9%
영등포구 | 78 | 19.9%
신길동 | 30 | 7.7%
당산동1가 | 15 | 3.8%
영신로 | 11 | 2.8%
영등포동5가 | 8 | 2.0%
양평동3가 | 4 | 1.0%
선유동2로 | 4 | 1.0%
15-1 | 4 | 1.0%
도신로51길 | 3 | 0.8%
Other values (126) | 157 | 40.1%

Most occurring characters

ValueCountFrequency (%)
314
 
14.0%
121
 
5.4%
1 103
 
4.6%
99
 
4.4%
99
 
4.4%
97
 
4.3%
82
 
3.7%
78
 
3.5%
) 78
 
3.5%
78
 
3.5%
Other values (53) 1091
48.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1329 | 59.3%
Decimal Number | 375 | 16.7%
Space Separator | 314 | 14.0%
Close Punctuation | 78 | 3.5%
Open Punctuation | 78 | 3.5%
Dash Punctuation | 64 | 2.9%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
121
 
9.1%
99
 
7.4%
99
 
7.4%
97
 
7.3%
82
 
6.2%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
Other values (38) 441
33.2%
Decimal Number
ValueCountFrequency (%)
1 103
27.5%
2 47
12.5%
5 41
 
10.9%
4 38
 
10.1%
3 36
 
9.6%
6 31
 
8.3%
7 31
 
8.3%
9 20
 
5.3%
8 15
 
4.0%
0 13
 
3.5%
Space Separator
ValueCountFrequency (%)
314
100.0%
Close Punctuation
ValueCountFrequency (%)
) 78
100.0%
Open Punctuation
ValueCountFrequency (%)
( 78
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 64
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1329
59.3%
Common 911
40.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
121
 
9.1%
99
 
7.4%
99
 
7.4%
97
 
7.3%
82
 
6.2%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
Other values (38) 441
33.2%
Common
ValueCountFrequency (%)
314
34.5%
1 103
 
11.3%
) 78
 
8.6%
( 78
 
8.6%
- 64
 
7.0%
2 47
 
5.2%
5 41
 
4.5%
4 38
 
4.2%
3 36
 
4.0%
6 31
 
3.4%
Other values (5) 81
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1329
59.3%
ASCII 911
40.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
314
34.5%
1 103
 
11.3%
) 78
 
8.6%
( 78
 
8.6%
- 64
 
7.0%
2 47
 
5.2%
5 41
 
4.5%
4 38
 
4.2%
3 36
 
4.0%
6 31
 
3.4%
Other values (5) 81
 
8.9%
Hangul
ValueCountFrequency (%)
121
 
9.1%
99
 
7.4%
99
 
7.4%
97
 
7.3%
82
 
6.2%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
78
 
5.9%
Other values (38) 441
33.2%

연면적 (gross floor area)
Text

MISSING

Distinct: 50
Distinct (%): 94.3%
Missing: 33
Missing (%): 38.4%
Memory size: 820.0 B

Length

Max length: 10
Median length: 5
Mean length: 5.1886792
Min length: 4

Characters and Unicode

Total characters: 275
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 88.7%

Sample

1st row: 39.67
2nd row: 14.88
3rd row: 30.24
4th row: 49.59
5th row: 13.22

Words

Value | Count | Frequency (%)
33.06 | 2 | 3.8%
42.98 | 2 | 3.8%
57.85 | 2 | 3.8%
41.66 | 1 | 1.9%
210.97 | 1 | 1.9%
25.8 | 1 | 1.9%
17.28 | 1 | 1.9%
40.93 | 1 | 1.9%
46.61 | 1 | 1.9%
75.16 | 1 | 1.9%
Other values (40) | 40 | 75.5%

Most occurring characters

ValueCountFrequency (%)
. 53
19.3%
3 27
9.8%
6 26
9.5%
5 26
9.5%
1 25
9.1%
4 24
8.7%
2 24
8.7%
9 22
8.0%
8 21
 
7.6%
7 14
 
5.1%
Other values (2) 13
 
4.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 220 | 80.0%
Other Punctuation | 55 | 20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 27
12.3%
6 26
11.8%
5 26
11.8%
1 25
11.4%
4 24
10.9%
2 24
10.9%
9 22
10.0%
8 21
9.5%
7 14
6.4%
0 11
5.0%
Other Punctuation
ValueCountFrequency (%)
. 53
96.4%
, 2
 
3.6%

Most occurring scripts

ValueCountFrequency (%)
Common 275
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 53
19.3%
3 27
9.8%
6 26
9.5%
5 26
9.5%
1 25
9.1%
4 24
8.7%
2 24
8.7%
9 22
8.0%
8 21
 
7.6%
7 14
 
5.1%
Other values (2) 13
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 275
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 53
19.3%
3 27
9.8%
6 26
9.5%
5 26
9.5%
1 25
9.1%
4 24
8.7%
2 24
8.7%
9 22
8.0%
8 21
 
7.6%
7 14
 
5.1%
Other values (2) 13
 
4.7%
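
연면적 is profiled as Text rather than as a number, and its character tables show two commas among otherwise numeric characters. If those commas are thousands separators (an assumption; the report does not show the affected values), the column could be converted before profiling with a helper such as the hypothetical `parse_area`:

```python
def parse_area(value):
    """Convert a floor-area string to float; return None for missing values."""
    if value is None:
        return None
    text = str(value).strip()
    if text in ("", "<NA>"):
        return None
    # Assumption: commas are thousands separators, e.g. "1,234.5".
    return float(text.replace(",", ""))


print([parse_area(v) for v in ["39.67", "1,234.5", "<NA>", None]])
```

If the commas turn out to separate several areas within one cell, this helper would merge them into a wrong number, so the raw values should be inspected first.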

정비구역여부 (redevelopment zone status)
Categorical

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
해당없음 | 70
정비 | 16

Length

Max length: 4
Median length: 4
Mean length: 3.627907
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 해당없음
2nd row: 해당없음
3rd row: 해당없음
4th row: 해당없음
5th row: 해당없음

Common Values

Value | Count | Frequency (%)
해당없음 | 70 | 81.4%
정비 | 16 | 18.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

무허가 (unpermitted status)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
해당없음 | 56
무허가 | 30

Length

Max length: 4
Median length: 4
Mean length: 3.6511628
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 해당없음
2nd row: 무허가
3rd row: 무허가
4th row: 무허가
5th row: 무허가

Common Values

Value | Count | Frequency (%)
해당없음 | 56 | 65.1%
무허가 | 30 | 34.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

현황용도 (current use)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
단독 | 77
다가구 | 5
다세대 | 2
아파트 | 2

Length

Max length: 3
Median length: 2
Mean length: 2.1046512
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단독
2nd row: 단독
3rd row: 단독
4th row: 단독
5th row: 단독

Common Values

Value | Count | Frequency (%)
단독 | 77 | 89.5%
다가구 | 5 | 5.8%
다세대 | 2 | 2.3%
아파트 | 2 | 2.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

빈집판정일 (vacancy determination date)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Value | Count
2020 | 42
2021 | 23
2022 | 19
<NA> | 2

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 42 | 48.8%
2021 | 23 | 26.7%
2022 | 19 | 22.1%
<NA> | 2 | 2.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Correlations

 | 행정동 | 지번주소 | 도로명주소 | 연면적 | 정비구역여부 | 무허가 | 현황용도 | 빈집판정일
행정동 | 1.000 | 1.000 | 1.000 | 0.842 | 0.657 | 0.506 | 0.775 | 0.757
지번주소 | 1.000 | 1.000 | 1.000 | 0.992 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000 | 0.996 | 1.000 | 0.877 | 1.000 | 0.959
연면적 | 0.842 | 0.992 | 0.996 | 1.000 | 0.000 | 1.000 | 0.912 | 0.000
정비구역여부 | 0.657 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.318 | 0.288
무허가 | 0.506 | 1.000 | 0.877 | 1.000 | 0.000 | 1.000 | 0.255 | 0.482
현황용도 | 0.775 | 1.000 | 1.000 | 0.912 | 0.318 | 0.255 | 1.000 | 0.151
빈집판정일 | 0.757 | 1.000 | 0.959 | 0.000 | 0.288 | 0.482 | 0.151 | 1.000

 | 현황용도 | 정비구역여부 | 무허가 | 빈집판정일 | 행정동
현황용도 | 1.000 | 0.208 | 0.166 | 0.140 | 0.499
정비구역여부 | 0.208 | 1.000 | 0.000 | 0.462 | 0.471
무허가 | 0.166 | 0.000 | 1.000 | 0.733 | 0.358
빈집판정일 | 0.140 | 0.462 | 0.733 | 1.000 | 0.439
행정동 | 0.499 | 0.471 | 0.358 | 0.439 | 1.000

 | 행정동 | 정비구역여부 | 무허가 | 현황용도 | 빈집판정일
행정동 | 1.000 | 0.471 | 0.358 | 0.499 | 0.439
정비구역여부 | 0.471 | 1.000 | 0.000 | 0.208 | 0.462
무허가 | 0.358 | 0.000 | 1.000 | 0.166 | 0.733
현황용도 | 0.499 | 0.208 | 0.166 | 1.000 | 0.140
빈집판정일 | 0.439 | 0.462 | 0.733 | 0.140 | 1.000
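
The report does not state which association measure underlies each matrix. For pairs of categorical variables, Cramér's V is a common choice, so the sketch below assumes it (without bias correction) and applies it to the ten 무허가 and 빈집판정일 values shown in the first rows of the Sample section. Ten rows are far too few to reproduce the coefficients reported for all 86 observations; the point is only to show the computation.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return sqrt(chi2 / (n * k))


# Rows 0-9 of the Sample section (무허가 and 빈집판정일 columns).
unpermitted = ["해당없음"] + ["무허가"] * 7 + ["해당없음", "해당없음"]
year = ["2022"] + ["2020"] * 7 + ["2021", "2022"]
print(round(cramers_v(unpermitted, year), 3))
```

In this excerpt every 무허가 row has the year 2020 and no other row does, so the statistic reaches its maximum of 1.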

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | 행정동 | 지번주소 | 도로명주소 | 연면적 | 정비구역여부 | 무허가 | 현황용도 | 빈집판정일
0 | 당산동1가 | 서울특별시 영등포구 당산동1가 100 | 서울특별시 영등포구 영신로37길 4-1 (당산동1가) | 39.67 | 해당없음 | 해당없음 | 단독 | 2022
1 | 당산동1가 | 서울특별시 영등포구 당산동1가 100-1 | 서울특별시 영등포구 영신로37길 4-1 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
2 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-13 | 서울특별시 영등포구 영신로 167-1 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
3 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-14 | 서울특별시 영등포구 영신로 167-3 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
4 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-15 | 서울특별시 영등포구 영신로 167-5 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
5 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-16 | 서울특별시 영등포구 영신로 167-7 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
6 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-17 | 서울특별시 영등포구 영신로 167-9 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
7 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-19 | 서울특별시 영등포구 영신로 167-13 (당산동1가) | <NA> | 해당없음 | 무허가 | 단독 | 2020
8 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-33 | 서울특별시 영등포구 영신로 167-31 (당산동1가) | 14.88 | 해당없음 | 해당없음 | 단독 | 2021
9 | 당산동1가 | 서울특별시 영등포구 당산동1가 186-8 | 서울특별시 영등포구 영신로45길 1-1 (당산동1가) | 30.24 | 해당없음 | 해당없음 | 단독 | 2022

Last rows

Index | 행정동 | 지번주소 | 도로명주소 | 연면적 | 정비구역여부 | 무허가 | 현황용도 | 빈집판정일
76 | 영등포동5가 | 서울특별시 영등포구 영등포동5가 32-55 | 서울특별시 영등포구 영중로22길 18-2 (영등포동5가) | <NA> | 정비 | 무허가 | 단독 | 2020
77 | 영등포동5가 | 서울특별시 영등포구 영등포동5가 33-134 | 서울특별시 영등포구 영중로18길 21 (영등포동5가) | 40.93 | 정비 | 해당없음 | 다가구 | 2020
78 | 영등포동5가 | 서울특별시 영등포구 영등포동5가 34-13 | 서울특별시 영등포구 영등포로43가길 7-1 (영등포동5가) | 42.98 | 정비 | 해당없음 | 다가구 | 2020
79 | 영등포동5가 | 서울특별시 영등포구 영등포동5가 38-7 | 서울특별시 영등포구 영중로 64-1 (영등포동5가) | 41.66 | 해당없음 | 해당없음 | 단독 | 2021
80 | 영등포동5가 | 서울특별시 영등포구 영등포동5가 48-46 | 서울특별시 영등포구 영등포로43길 26-9 (영등포동5가) | <NA> | 정비 | 무허가 | 단독 | 2020
81 | 영등포동6가 | 서울특별시 영등포구 영등포동6가 81 | <NA> | 42.98 | 해당없음 | 해당없음 | 단독 | 2022
82 | 영등포동6가 | 서울특별시 영등포구 영등포동6가 81 | <NA> | 19.83 | 해당없음 | 해당없음 | 단독 | 2022
83 | 영등포동6가 | 서울특별시 영등포구 영등포동6가 81 | <NA> | 33.06 | 해당없음 | 해당없음 | 단독 | 2022
84 | 영등포동6가 | 서울특별시 영등포구 영등포동6가 81 | <NA> | <NA> | 해당없음 | 해당없음 | 단독 | 2022
85 | 영등포동8가 | 서울특별시 영등포구 영등포동8가 33-4 | 서울특별시 영등포구 영신로54길 3-3 (영등포동8가) | <NA> | 해당없음 | 무허가 | 단독 | 2020