Overview

Dataset statistics

Number of variables: 5
Number of observations: 54
Missing cells: 6
Missing cells (%): 2.2%
Duplicate rows: 1
Duplicate rows (%): 1.9%
Total size in memory: 2.4 KiB
Average record size in memory: 45.4 B
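
The two percentages above can be recomputed from the raw counts. The following is a minimal sketch, assuming the missing-cell share is taken over all cells (rows × columns) and the duplicate share over all rows; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 5
n_observations = 54
missing_cells = 6
duplicate_rows = 1

# Assumption: the missing-cell share is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate-row share is relative to the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")    # Missing cells: 2.2%
print(f"Duplicate rows: {duplicate_pct:.1f}%")  # Duplicate rows: 1.9%
```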

Variable types

Categorical: 2
Numeric: 3

Dataset

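Description: Number of public rental housing units by type and by district (gu) in Seoul, managed by the Korea Land and Housing Corporation, for 2017-2019. The data is broken down by district, year, and supply type.
URL: https://www.data.go.kr/data/15063435/fileData.do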

Alerts

Dataset has 1 (1.9%) duplicate row [Duplicates]
2019 is highly overall correlated with 2018 and 1 other field [High correlation]
2018 is highly overall correlated with 2019 and 1 other field [High correlation]
2017 is highly overall correlated with 2019 and 1 other field [High correlation]
2019 has 2 (3.7%) missing values [Missing]
2018 has 2 (3.7%) missing values [Missing]
2017 has 2 (3.7%) missing values [Missing]
2019 has 3 (5.6%) zeros [Zeros]
2018 has 4 (7.4%) zeros [Zeros]
2017 has 2 (3.7%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-11 23:53:01.492408
Analysis finished: 2023-12-11 23:53:03.358435
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
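
A report of this kind could be regenerated along the following lines. This is only a sketch: the helper name `build_profile`, the file paths, the report title, and the CSV encoding are assumptions, since the report records the software version but not the invocation that produced it.

```python
from pathlib import Path


def build_profile(csv_path: str, out_path: str = "report.html",
                  encoding: str = "utf-8") -> Path:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the data was downloaded as a CSV file. Its encoding is not
    # stated in the report, so it is left as a parameter.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_profile("housing.csv")` with a locally saved copy of the data (the file name is a placeholder) would write `report.html` next to the script.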

Variables


Categorical

Distinct: 26
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
Category bar chart: 서초구, 강남구, 강서구, 노원구, 송파구 (3), Other values (21): 31

Length

Max length: 4
Median length: 3
Mean length: 3.0740741
Min length: 2

Unique

Unique: 11
Unique (%): 20.4%

Sample

1st row: 중구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 강남구

Common Values

Value | Count | Frequency (%)
서초구 | 6 | 11.1%
강남구 | 6 | 11.1%
강서구 | 4 | 7.4%
노원구 | 4 | 7.4%
송파구 | 3 | 5.6%
<NA> | 2 | 3.7%
강북구 | 2 | 3.7%
구로구 | 2 | 3.7%
도봉구 | 2 | 3.7%
관악구 | 2 | 3.7%
Other values (16) | 21 | 38.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
서초구 | 6 | 11.1%
강남구 | 6 | 11.1%
강서구 | 4 | 7.4%
노원구 | 4 | 7.4%
송파구 | 3 | 5.6%
마포구 | 2 | 3.7%
동작구 | 2 | 3.7%
은평구 | 2 | 3.7%
용산구 | 2 | 3.7%
중랑구 | 2 | 3.7%
Other values (16) | 21 | 38.9%

공급유형 (supply type)
Categorical

Distinct: 10
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
Category bar chart: 기존주택매입임대 (25), 공공임대(10년), 영구임대, 국민임대, 행복주택, Other values (5)

Length

Max length: 9
Median length: 8.5
Mean length: 6.7407407
Min length: 4

Unique

Unique: 1
Unique (%): 1.9%

Sample

1st row: 기존주택매입임대
2nd row: 공공임대(10년)
3rd row: 공공임대(분납)
4th row: 장기전세
5th row: 국민임대

Common Values

Value | Count | Frequency (%)
기존주택매입임대 | 25 | 46.3%
공공임대(10년) | 6 | 11.1%
영구임대 | 6 | 11.1%
국민임대 | 4 | 7.4%
행복주택 | 4 | 7.4%
공공임대(분납) | 2 | 3.7%
장기전세 | 2 | 3.7%
공공임대(50년) | 2 | 3.7%
<NA> | 2 | 3.7%
외인임대 | 1 | 1.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
기존주택매입임대 | 25 | 46.3%
공공임대(10년 | 6 | 11.1%
영구임대 | 6 | 11.1%
국민임대 | 4 | 7.4%
행복주택 | 4 | 7.4%
공공임대(분납 | 2 | 3.7%
장기전세 | 2 | 3.7%
공공임대(50년 | 2 | 3.7%
na | 2 | 3.7%
외인임대 | 1 | 1.9%

2019
Real number (ℝ)

Alerts: High correlation, Missing, Zeros

Distinct: 49
Distinct (%): 94.2%
Missing: 2
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 935.26923
Minimum: 0
Maximum: 7834
Zeros: 3
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 0
5th percentile: 22
Q1: 248.75
Median: 487.5
Q3: 894.75
95th percentile: 3397.8
Maximum: 7834
Range: 7834
Interquartile range (IQR): 646

Descriptive statistics

Standard deviation: 1526.1873
Coefficient of variation (CV): 1.631816
Kurtosis: 13.359377
Mean: 935.26923
Median Absolute Deviation (MAD): 325
Skewness: 3.5635182
Sum: 48634
Variance: 2329247.7
Monotonicity: Not monotonic
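
Several of the 2019 figures are tied together by simple identities, which makes them easy to sanity-check. The sketch below assumes the statistics are computed over the 52 non-missing values (54 rows minus 2 missing); the variable names are illustrative.

```python
import math

# Figures copied from the 2019 variable summary above.
n_rows = 54
n_missing = 2
total = 48634
mean = 935.26923
std = 1526.1873
variance = 2329247.7
cv = 1.631816
q1, q3 = 248.75, 894.75
iqr = 646
distinct = 49

# Assumption: statistics are computed over the non-missing values only.
n_valid = n_rows - n_missing

checks = {
    "mean = sum / n": math.isclose(total / n_valid, mean, rel_tol=1e-6),
    "variance = std^2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr),
    "distinct % = distinct / n": round(100 * distinct / n_valid, 1) == 94.2,
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

All five identities hold for the reported values, which supports the reading that percentages such as Distinct (%) are relative to the 52 non-missing observations rather than all 54 rows.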
Histogram with fixed size bins (bins=49)
Value | Count | Frequency (%)
0 | 3 | 5.6%
44 | 2 | 3.7%
512 | 1 | 1.9%
460 | 1 | 1.9%
299 | 1 | 1.9%
362 | 1 | 1.9%
203 | 1 | 1.9%
202 | 1 | 1.9%
222 | 1 | 1.9%
250 | 1 | 1.9%
Other values (39) | 39 | 72.2%
(Missing) | 2 | 3.7%

Value | Count | Frequency (%)
0 | 3 | 5.6%
40 | 1 | 1.9%
42 | 1 | 1.9%
44 | 2 | 3.7%
58 | 1 | 1.9%
100 | 1 | 1.9%
202 | 1 | 1.9%
203 | 1 | 1.9%
222 | 1 | 1.9%
245 | 1 | 1.9%

Value | Count | Frequency (%)
7834 | 1 | 1.9%
7329 | 1 | 1.9%
4181 | 1 | 1.9%
2757 | 1 | 1.9%
2200 | 1 | 1.9%
1327 | 1 | 1.9%
1285 | 1 | 1.9%
1242 | 1 | 1.9%
1131 | 1 | 1.9%
1084 | 1 | 1.9%

2018
Real number (ℝ)

Alerts: High correlation, Missing, Zeros

Distinct: 47
Distinct (%): 90.4%
Missing: 2
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 890.63462
Minimum: 0
Maximum: 7834
Zeros: 4
Zeros (%): 7.4%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 222.75
Median: 405.5
Q3: 852
95th percentile: 3397.8
Maximum: 7834
Range: 7834
Interquartile range (IQR): 629.25

Descriptive statistics

Standard deviation: 1536.1931
Coefficient of variation (CV): 1.7248298
Kurtosis: 13.381535
Mean: 890.63462
Median Absolute Deviation (MAD): 331.5
Skewness: 3.576293
Sum: 46313
Variance: 2359889.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value | Count | Frequency (%)
0 | 4 | 7.4%
737 | 2 | 3.7%
44 | 2 | 3.7%
58 | 1 | 1.9%
299 | 1 | 1.9%
362 | 1 | 1.9%
202 | 1 | 1.9%
222 | 1 | 1.9%
250 | 1 | 1.9%
440 | 1 | 1.9%
Other values (37) | 37 | 68.5%
(Missing) | 2 | 3.7%

Value | Count | Frequency (%)
0 | 4 | 7.4%
40 | 1 | 1.9%
42 | 1 | 1.9%
44 | 2 | 3.7%
58 | 1 | 1.9%
80 | 1 | 1.9%
103 | 1 | 1.9%
202 | 1 | 1.9%
222 | 1 | 1.9%
223 | 1 | 1.9%

Value | Count | Frequency (%)
7834 | 1 | 1.9%
7329 | 1 | 1.9%
4181 | 1 | 1.9%
2757 | 1 | 1.9%
2200 | 1 | 1.9%
1273 | 1 | 1.9%
1242 | 1 | 1.9%
1228 | 1 | 1.9%
1084 | 1 | 1.9%
1032 | 1 | 1.9%

2017
Real number (ℝ)

Alerts: High correlation, Missing, Zeros

Distinct: 47
Distinct (%): 90.4%
Missing: 2
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 856.11538
Minimum: 0
Maximum: 7834
Zeros: 2
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 618.0 B

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 201
Median: 369
Q3: 746
95th percentile: 3397.8
Maximum: 7834
Range: 7834
Interquartile range (IQR): 545

Descriptive statistics

Standard deviation: 1543.978
Coefficient of variation (CV): 1.8034695
Kurtosis: 13.396762
Mean: 856.11538
Median Absolute Deviation (MAD): 306.5
Skewness: 3.5864082
Sum: 44518
Variance: 2383868.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value | Count | Frequency (%)
44 | 2 | 3.7%
671 | 2 | 3.7%
1 | 2 | 3.7%
0 | 2 | 3.7%
238 | 2 | 3.7%
222 | 1 | 1.9%
250 | 1 | 1.9%
440 | 1 | 1.9%
1084 | 1 | 1.9%
368 | 1 | 1.9%
Other values (37) | 37 | 68.5%
(Missing) | 2 | 3.7%

Value | Count | Frequency (%)
0 | 2 | 3.7%
1 | 2 | 3.7%
9 | 1 | 1.9%
40 | 1 | 1.9%
42 | 1 | 1.9%
44 | 2 | 3.7%
58 | 1 | 1.9%
80 | 1 | 1.9%
103 | 1 | 1.9%
198 | 1 | 1.9%

Value | Count | Frequency (%)
7834 | 1 | 1.9%
7329 | 1 | 1.9%
4181 | 1 | 1.9%
2757 | 1 | 1.9%
2200 | 1 | 1.9%
1267 | 1 | 1.9%
1242 | 1 | 1.9%
1219 | 1 | 1.9%
1084 | 1 | 1.9%
935 | 1 | 1.9%

Interactions

[Nine scatter-plot images; not recoverable from the text export]

Correlations

 | (unnamed) | 공급유형 | 2019 | 2018 | 2017
(unnamed) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
공급유형 | 0.000 | 1.000 | 0.584 | 0.569 | 0.582
2019 | 0.000 | 0.584 | 1.000 | 0.999 | 0.998
2018 | 0.000 | 0.569 | 0.999 | 1.000 | 1.000
2017 | 0.000 | 0.582 | 0.998 | 1.000 | 1.000

 | (unnamed) | 공급유형
(unnamed) | 1.000 | 0.000
공급유형 | 0.000 | 1.000

 | 2019 | 2018 | 2017 | (unnamed) | 공급유형
2019 | 1.000 | 0.989 | 0.910 | 0.000 | 0.320
2018 | 0.989 | 1.000 | 0.924 | 0.000 | 0.309
2017 | 0.910 | 0.924 | 1.000 | 0.000 | 0.319
(unnamed) | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
공급유형 | 0.320 | 0.309 | 0.319 | 0.000 | 1.000
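
To see why the three year columns correlate so strongly, a plain Pearson coefficient can be computed on a handful of rows. The sketch below uses rows 0-9 of this report's sample; the split of each flattened row into three numbers is an interpretation of the extracted text, and Pearson is used only for illustration, since the report does not state which coefficient each matrix holds. The results are therefore not expected to reproduce the matrices above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Rows 0-9 of the sample (assumed column split, see the note above).
y2019 = [42, 1242, 550, 370, 873, 2757, 755, 1131, 4181, 909]
y2018 = [42, 1242, 550, 370, 873, 2757, 671, 1032, 4181, 774]
y2017 = [42, 1242, 550, 370, 873, 2757, 671, 935, 4181, 671]

print(f"2019 vs 2018: {pearson(y2019, y2018):.3f}")
print(f"2019 vs 2017: {pearson(y2019, y2017):.3f}")
print(f"2018 vs 2017: {pearson(y2018, y2017):.3f}")
```

Most of these rows carry the same count in all three years, and the rest change by a small amount relative to the overall spread, so the coefficients come out close to 1.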

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

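First rows

(index) | (unnamed) | 공급유형 | 2019 | 2018 | 2017
0 | 중구 | 기존주택매입임대 | 42 | 42 | 42
1 | 강남구 | 공공임대(10년) | 1242 | 1242 | 1242
2 | 강남구 | 공공임대(분납) | 550 | 550 | 550
3 | 강남구 | 장기전세 | 370 | 370 | 370
4 | 강남구 | 국민임대 | 873 | 873 | 873
5 | 강남구 | 영구임대 | 2757 | 2757 | 2757
6 | 강남구 | 기존주택매입임대 | 755 | 671 | 671
7 | 강동구 | 기존주택매입임대 | 1131 | 1032 | 935
8 | 강북구 | 영구임대 | 4181 | 4181 | 4181
9 | 강북구 | 기존주택매입임대 | 909 | 774 | 671

Last rows

(index) | (unnamed) | 공급유형 | 2019 | 2018 | 2017
44 | 은평구 | 기존주택매입임대 | 827 | 733 | 669
45 | 은평구 | 공공임대(10년) | 0 | 0 | 1
46 | 종로구 | 기존주택매입임대 | 270 | 80 | 80
47 | 중랑구 | 기존주택매입임대 | 790 | 675 | 682
48 | 중랑구 | 공공임대(10년) | 0 | 0 | 1
49 | 동대문구 | 기존주택매입임대 | 460 | 328 | 198
50 | 서대문구 | 기존주택매입임대 | 512 | 354 | 332
51 | 영등포구 | 기존주택매입임대 | 203 | 103 | 103
52 | <NA> | <NA> | <NA> | <NA> | <NA>
53 | <NA> | <NA> | <NA> | <NA> | <NA>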

Duplicate rows

Most frequently occurring

(index) | (unnamed) | 공급유형 | 2019 | 2018 | 2017 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
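
The duplicate here is the pair of fully empty rows (indices 52 and 53 in the sample). A minimal sketch of how such repeats can be counted follows; the three rows are illustrative stand-ins with `None` in place of `<NA>`, not the full dataset.

```python
from collections import Counter

# Illustrative rows: one populated row plus two fully empty rows,
# mirroring rows 52 and 53 of the sample. None stands in for <NA>.
rows = [
    ("중구", "기존주택매입임대", 42, 42, 42),
    (None, None, None, None, None),
    (None, None, None, None, None),
]

counts = Counter(rows)

# Keep every row value that occurs more than once, with its number of copies.
duplicated = {row: n for row, n in counts.items() if n > 1}

# Rows beyond the first copy of each repeated value.
extra_copies = sum(n - 1 for n in duplicated.values())

print(duplicated)    # {(None, None, None, None, None): 2}
print(extra_copies)  # 1
```

The report lists 1 duplicate row in the overview and 2 occurrences in this table. Both figures are consistent with one repeated row value that appears twice, although the report does not say whether the overview counts distinct repeated rows or copies beyond the first.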