Overview

Dataset statistics

Number of variables: 8
Number of observations: 369
Missing cells: 163
Missing cells (%): 5.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.3 KiB
Average record size in memory: 67.4 B
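
The percentage and per-record figures follow from the raw counts. A minimal sketch of that arithmetic, assuming the profiler counts every one of the 8 × 369 cells and that 1 KiB is 1024 bytes (the variable names are illustrative and do not come from the report):

```python
# Figures copied from the dataset statistics in this report.
n_variables = 8
n_observations = 369
missing_cells = 163
total_size_kib = 24.3

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

All 163 missing cells belong to a single variable (세대수), which is why the dataset-level rate is far lower than that column's own 44.2%.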

Variable types

Numeric: 2
Categorical: 5
Text: 1

Dataset

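Description: Officetel status data for Gangseo-gu, Seoul (서울특별시 강서구). Fields provided: year (연도), city/county/district (시군구), legal dong (법정동), main use (주용도), other use (기타용도), number of units (세대수), number of households (가구수), data reference date (데이터기준일자).
Author: 서울특별시 강서구 (Gangseo-gu, Seoul)
URL: https://www.data.go.kr/data/15107751/fileData.do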

Alerts

시군구 has constant value "서울특별시 강서구" [Constant]
데이터기준일자 has constant value "2022-10-28" [Constant]
연번 is highly overall correlated with 세대수 [High correlation]
세대수 is highly overall correlated with 연번 [High correlation]
가구수 is highly imbalanced (57.0%) [Imbalance]
세대수 has 163 (44.2%) missing values [Missing]
연번 has unique values [Unique]
세대수 has 73 (19.8%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 15:45:49.012111
Analysis finished: 2023-12-12 15:45:50.101287
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 369
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 185
Minimum: 1
Maximum: 369
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 19.4
Q1: 93
Median: 185
Q3: 277
95-th percentile: 350.6
Maximum: 369
Range: 368
Interquartile range (IQR): 184
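
These quantiles are consistent with linear interpolation between order statistics over the values 1 to 369. A minimal sketch, assuming that interpolation rule (the helper `percentile` is hypothetical and written only for illustration):

```python
import math


def percentile(sorted_values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 370))  # 연번 runs from 1 to 369

p5 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1

print(round(p5, 1), q1, median, q3, round(p95, 1), iqr)
```

For the 5-th percentile, the fractional position is 0.05 × 368 = 18.4, which falls 40% of the way between the 19th and 20th values (19 and 20), giving 19.4.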

Descriptive statistics

Standard deviation: 106.66536
Coefficient of variation (CV): 0.57656954
Kurtosis: -1.2
Mean: 185
Median Absolute Deviation (MAD): 92
Skewness: 0
Sum: 68265
Variance: 11377.5
Monotonicity: Strictly increasing
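
Because 연번 is simply the integers 1 through 369, these descriptive statistics can be checked by hand. A minimal sketch, assuming the sample variance (n − 1 denominator) and taking MAD as the median absolute deviation from the median; the variable names are illustrative:

```python
import math
import statistics

values = list(range(1, 370))  # 연번 runs from 1 to 369
n = len(values)

total = sum(values)
mean = total / n

# Sample variance with an (n - 1) denominator.
variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# Median absolute deviation around the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 5), round(cv, 8), mad)
```

A serial index like this carries no analytical signal of its own, so the High correlation alert between 연번 and 세대수 more likely reflects the ordering of the records than a property of the buildings.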
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
1                   1      0.3%
244                 1      0.3%
253                 1      0.3%
252                 1      0.3%
251                 1      0.3%
250                 1      0.3%
249                 1      0.3%
248                 1      0.3%
247                 1      0.3%
246                 1      0.3%
Other values (359)  359    97.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.3%
2      1      0.3%
3      1      0.3%
4      1      0.3%
5      1      0.3%
6      1      0.3%
7      1      0.3%
8      1      0.3%
9      1      0.3%
10     1      0.3%

Maximum 10 values

Value  Count  Frequency (%)
369    1      0.3%
368    1      0.3%
367    1      0.3%
366    1      0.3%
365    1      0.3%
364    1      0.3%
363    1      0.3%
362    1      0.3%
361    1      0.3%
360    1      0.3%

시군구
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value              Count
서울특별시 강서구  369

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시 강서구
2nd row: 서울특별시 강서구
3rd row: 서울특별시 강서구
4th row: 서울특별시 강서구
5th row: 서울특별시 강서구

Common Values

Value              Count  Frequency (%)
서울특별시 강서구  369    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
서울특별시  369    50.0%
강서구      369    50.0%

법정동
Categorical

Distinct: 8
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value             Count
화곡동            173
등촌동            61
염창동            40
마곡동            37
방화동            29
Other values (3)  29

Length

Max length: 4
Median length: 3
Mean length: 3.0108401
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 화곡동
2nd row: 등촌동
3rd row: 등촌동
4th row: 등촌동
5th row: 등촌동

Common Values

Value     Count  Frequency (%)
화곡동    173    46.9%
등촌동    61     16.5%
염창동    40     10.8%
마곡동    37     10.0%
방화동    29     7.9%
가양동    22     6.0%
내발산동  4      1.1%
공항동    3      0.8%

Length

Histogram of lengths of the category

주용도
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value              Count
업무시설           287
공동주택           80
제1종근린생활시설  2

Length

Max length: 9
Median length: 4
Mean length: 4.0271003
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 업무시설
2nd row: 업무시설
3rd row: 업무시설
4th row: 업무시설
5th row: 업무시설

Common Values

Value              Count  Frequency (%)
업무시설           287    77.8%
공동주택           80     21.7%
제1종근린생활시설  2      0.5%

Length

Histogram of lengths of the category


기타용도
Text

Distinct: 148
Distinct (%): 40.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 55
Median length: 40
Mean length: 16.01897
Min length: 4

Characters and Unicode

Total characters: 5911
Distinct characters: 65
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
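
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below tallies categories for a single 기타용도 value that appears in this column; it is an illustration only and not necessarily how the profiler computes its counts. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

# One 기타용도 value from this dataset.
sample = "업무시설(오피스텔),근린생활시설"

# Tally the Unicode general category of every character in the string.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the column's character counts.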

Unique

Unique: 120
Unique (%): 32.5%

Sample

1st row: 오피스텔
2nd row: 업무시설,오피스텔
3rd row: 업무시설,오피스텔
4th row: 업무시설,오피스텔
5th row: 업무시설,오피스텔

Value                            Count  Frequency (%)
업무시설(오피스텔                164    26.9%
오피스텔                         83     13.6%
[?]                              59     9.7%
근린생활시설                     53     8.7%
업무시설,오피스텔                28     4.6%
업무시설(오피스텔),근린생활시설  18     3.0%
다세대주택                       12     2.0%
도시형생활주택(단지형다세대      12     2.0%
도시형생활주택(원룸형            10     1.6%
공동주택(다세대주택              9      1.5%
Other values (103)               162    26.6%

Most occurring characters

Value              Count  Frequency (%)
[?]                464    7.8%
[?]                392    6.6%
[?]                369    6.2%
[?]                369    6.2%
[?]                369    6.2%
[?]                369    6.2%
)                  338    5.7%
(                  338    5.7%
[?]                275    4.7%
[?]                264    4.5%
Other values (55)  2364   40.0%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4665   78.9%
Close Punctuation  338    5.7%
Open Punctuation   338    5.7%
Other Punctuation  256    4.3%
Space Separator    241    4.1%
Decimal Number     66     1.1%
Dash Punctuation   7      0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[?]                464    9.9%
[?]                392    8.4%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                275    5.9%
[?]                264    5.7%
[?]                202    4.3%
[?]                197    4.2%
Other values (39)  1395   29.9%

Decimal Number
Value  Count  Frequency (%)
1      26     39.4%
2      22     33.3%
0      4      6.1%
6      4      6.1%
8      3      4.5%
4      2      3.0%
3      2      3.0%
5      2      3.0%
7      1      1.5%

Other Punctuation
Value  Count  Frequency (%)
,      239    93.4%
/      16     6.2%
.      1      0.4%

Close Punctuation
Value  Count  Frequency (%)
)      338    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      338    100.0%

Space Separator
Value    Count  Frequency (%)
(space)  241    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      7      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  4665   78.9%
Common  1246   21.1%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[?]                464    9.9%
[?]                392    8.4%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                275    5.9%
[?]                264    5.7%
[?]                202    4.3%
[?]                197    4.2%
Other values (39)  1395   29.9%

Common
Value             Count  Frequency (%)
)                 338    27.1%
(                 338    27.1%
(space)           241    19.3%
,                 239    19.2%
1                 26     2.1%
2                 22     1.8%
/                 16     1.3%
-                 7      0.6%
0                 4      0.3%
6                 4      0.3%
Other values (6)  11     0.9%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  4665   78.9%
ASCII   1246   21.1%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
[?]                464    9.9%
[?]                392    8.4%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                369    7.9%
[?]                275    5.9%
[?]                264    5.7%
[?]                202    4.3%
[?]                197    4.2%
Other values (39)  1395   29.9%

ASCII
Value             Count  Frequency (%)
)                 338    27.1%
(                 338    27.1%
(space)           241    19.3%
,                 239    19.2%
1                 26     2.1%
2                 22     1.8%
/                 16     1.3%
-                 7      0.6%
0                 4      0.3%
6                 4      0.3%
Other values (6)  11     0.9%

세대수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 42
Distinct (%): 20.4%
Missing: 163
Missing (%): 44.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.966019
Minimum: 0
Maximum: 299
Zeros: 73
Zeros (%): 19.8%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 12
Q3: 20
95-th percentile: 47
Maximum: 299
Range: 299
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 27.49286
Coefficient of variation (CV): 1.7219609
Kurtosis: 57.337198
Mean: 15.966019
Median Absolute Deviation (MAD): 12
Skewness: 6.3597164
Sum: 3289
Variance: 755.85738
Monotonicity: Not monotonic
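
The 세대수 statistics are computed over the 206 non-missing values only (369 observations minus 163 missing), while the missing and zero percentages are taken over all 369 observations. A minimal sketch of how the printed figures relate to one another, using only numbers that appear in this report (the variable names are illustrative):

```python
import math

# Figures copied from the 세대수 section of this report.
n_observations = 369
n_missing = 163
n_zeros = 73
n_distinct = 42
value_sum = 3289
variance = 755.85738

# Summary statistics use only the non-missing values.
n_valid = n_observations - n_missing
mean = value_sum / n_valid
std_dev = math.sqrt(variance)
cv = std_dev / mean

# Missing and zero shares are relative to all observations.
missing_pct = 100 * n_missing / n_observations
zeros_pct = 100 * n_zeros / n_observations

# The distinct share is relative to the non-missing values.
distinct_pct = 100 * n_distinct / n_valid

print(n_valid, round(mean, 6), round(std_dev, 5), round(cv, 5))
print(round(missing_pct, 1), round(zeros_pct, 1), round(distinct_pct, 1))
```

Among the non-missing values alone, zeros account for 73 of 206 records, so the 19.8% figure understates how common zero is wherever a value was recorded.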
Histogram with fixed size bins (bins=42)
Common Values

Value              Count  Frequency (%)
0                  73     19.8%
16                 24     6.5%
12                 20     5.4%
20                 13     3.5%
24                 6      1.6%
15                 5      1.4%
18                 5      1.4%
32                 4      1.1%
8                  4      1.1%
13                 4      1.1%
Other values (32)  48     13.0%
(Missing)          163    44.2%

Minimum 10 values

Value  Count  Frequency (%)
0      73     19.8%
1      2      0.5%
6      1      0.3%
7      3      0.8%
8      4      1.1%
9      1      0.3%
10     2      0.5%
11     2      0.5%
12     20     5.4%
13     4      1.1%

Maximum 10 values

Value  Count  Frequency (%)
299    1      0.3%
138    1      0.3%
126    1      0.3%
96     1      0.3%
75     1      0.3%
72     1      0.3%
63     1      0.3%
56     1      0.3%
50     1      0.3%
48     2      0.5%

가구수
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value  Count
<NA>   282
0      83
1      3
221    1

Length

Max length: 4
Median length: 4
Mean length: 3.298103
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: <NA>
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
<NA>   282    76.4%
0      83     22.5%
1      3      0.8%
221    1      0.3%
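
The 57.0% imbalance figure reported for 가구수 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies divided by the maximum entropy possible for that number of classes. The sketch below assumes that definition, which is not stated in the report itself, and the function name `imbalance_score` is hypothetical:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    if len(probabilities) < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    entropy = -sum(p * math.log2(p) for p in probabilities)
    max_entropy = math.log2(len(probabilities))
    return 1 - entropy / max_entropy


# Class counts for 가구수: "<NA>", "0", "1", "221".
counts = [282, 83, 3, 1]
score = imbalance_score(counts)

print(f"{100 * score:.1f}%")
```

Note that the dominant class here is the literal string `<NA>`, which the profiler treated as a category rather than as a missing value; that is why the column reports 0 missing cells.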

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     282    76.4%
0      83     22.5%
1      3      0.8%
221    1      0.3%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value       Count
2022-10-28  369

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-10-28
2nd row: 2022-10-28
3rd row: 2022-10-28
4th row: 2022-10-28
5th row: 2022-10-28

Common Values

Value       Count  Frequency (%)
2022-10-28  369    100.0%

Length

Histogram of lengths of the category

Correlations

        연번   법정동  주용도  세대수  가구수
연번    1.000  0.376   0.366   0.195   0.000
법정동  0.376  1.000   0.213   0.413   0.559
주용도  0.366  0.213   1.000   0.331   0.000
세대수  0.195  0.413   0.331   1.000   0.000
가구수  0.000  0.559   0.000   0.000   1.000

        법정동  주용도  가구수
법정동  1.000   0.137   0.437
주용도  0.137   1.000   0.000
가구수  0.437   0.000   1.000

        연번   세대수  법정동  주용도  가구수
연번    1.000  0.579   0.189   0.234   0.000
세대수  0.579  1.000   0.243   0.144   0.000
법정동  0.189  0.243   1.000   0.137   0.437
주용도  0.234  0.144   0.137   1.000   0.000
가구수  0.000  0.000   0.437   0.000   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번  시군구            법정동    주용도    기타용도                  세대수  가구수  데이터기준일자
0    1     서울특별시 강서구  화곡동    업무시설  오피스텔                  <NA>    <NA>    2022-10-28
1    2     서울특별시 강서구  등촌동    업무시설  업무시설,오피스텔         0       0       2022-10-28
2    3     서울특별시 강서구  등촌동    업무시설  업무시설,오피스텔         0       0       2022-10-28
3    4     서울특별시 강서구  등촌동    업무시설  업무시설,오피스텔         0       0       2022-10-28
4    5     서울특별시 강서구  등촌동    업무시설  업무시설,오피스텔         0       0       2022-10-28
5    6     서울특별시 강서구  화곡동    업무시설  오피스텔 및 근린생활시설  0       0       2022-10-28
6    7     서울특별시 강서구  염창동    업무시설  업무시설,오피스텔         0       0       2022-10-28
7    8     서울특별시 강서구  화곡동    업무시설  업무시설(오피스텔)        0       0       2022-10-28
8    9     서울특별시 강서구  화곡동    업무시설  업무시설(오피스텔)        0       0       2022-10-28
9    10    서울특별시 강서구  내발산동  업무시설  업무시설(오피스텔)        0       0       2022-10-28

Last rows

     연번  시군구            법정동    주용도    기타용도                                                세대수  가구수  데이터기준일자
359  360   서울특별시 강서구  등촌동    업무시설  오피스텔, 근린생활시설, 도시형생활주택(단지형다세대)   16      <NA>    2022-10-28
360  361   서울특별시 강서구  등촌동    업무시설  오피스텔, 도시형생활주택(단지형다세대)                 13      <NA>    2022-10-28
361  362   서울특별시 강서구  염창동    공동주택  근린생활시설, 오피스텔, 아파트                          23      <NA>    2022-10-28
362  363   서울특별시 강서구  내발산동  업무시설  오피스텔, 도시형생활주택(단지형 다세대), 근린생활시설  22      <NA>    2022-10-28
363  364   서울특별시 강서구  화곡동    업무시설  오피스텔                                                <NA>    <NA>    2022-10-28
364  365   서울특별시 강서구  등촌동    업무시설  업무시설(오피스텔)                                      12      <NA>    2022-10-28
365  366   서울특별시 강서구  화곡동    업무시설  업무시설(오피스텔8호)                                   0       0       2022-10-28
366  367   서울특별시 강서구  화곡동    업무시설  오피스텔                                                <NA>    <NA>    2022-10-28
367  368   서울특별시 강서구  화곡동    업무시설  업무시설(오피스텔), 제1,2종근린생활시설                 <NA>    <NA>    2022-10-28
368  369   서울특별시 강서구  화곡동    업무시설  오피스텔,공동주택(도시형생활주택)                       63      <NA>    2022-10-28