Overview

Dataset statistics

Number of variables: 6
Number of observations: 560
Missing cells: 381
Missing cells (%): 11.3%
Duplicate rows: 32
Duplicate rows (%): 5.7%
Total size in memory: 28.0 KiB
Average record size in memory: 51.2 B
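
The percentages and the average record size follow from the raw counts. The sketch below recomputes them; the variable names are illustrative, and it assumes that the missing-cell share is taken over all cells (variables × observations).

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 560
missing_cells = 381
duplicate_rows = 32
total_size_kib = 28.0

total_cells = n_variables * n_observations             # 3360 cells in total
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, "
      f"{avg_record_bytes:.1f} B per record")
```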

Variable types

Categorical: 3
Numeric: 3

Dataset

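Description: Data on vacant houses by administrative dong in Michuhol-gu, Incheon Metropolitan City. It provides the administrative dong (행정동), housing type (주택유형), construction year (건축년도), building area (건축면적), land area (대지면적), and related fields.
URL: https://www.data.go.kr/data/15060861/fileData.do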

Alerts

Dataset has 32 (5.7%) duplicate rows [Duplicates]
건축년도 is highly overall correlated with 건축면적(제곱미터) [High correlation]
건축면적(제곱미터) is highly overall correlated with 건축년도 and 2 other fields [High correlation]
대지면적(제곱미터) is highly overall correlated with 건축면적(제곱미터) [High correlation]
주택유형 is highly overall correlated with 건축면적(제곱미터) [High correlation]
주택유형 is highly imbalanced (50.7%) [Imbalance]
건축년도 has 188 (33.6%) missing values [Missing]
건축면적(제곱미터) has 193 (34.5%) missing values [Missing]
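
The 50.7% imbalance figure for 주택유형 is consistent with one minus the normalized Shannon entropy of the category counts. The report does not state its formula, so treat the definition below as an assumption; `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, approaching 1 as one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)   # Shannon entropy (natural log)
    k = len(counts)
    return 1 - entropy / math.log(k) if k > 1 else 0.0

# Category counts of 주택유형 as listed in the report.
housing_type_counts = [383, 153, 12, 8, 4]
score = imbalance_score(housing_type_counts)
print(f"{100 * score:.1f}%")
```

With the five counts listed for 주택유형, this rounds to the 50.7% shown in the alert.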

Reproduction

Analysis started: 2023-12-12 18:29:37.181573
Analysis finished: 2023-12-12 18:29:38.785337
Duration: 1.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

행정동 (administrative dong)
Categorical

Distinct: 20
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

주안4동: 80
용현3동: 69
도화2.3동: 56
숭의2동: 51
숭의4동: 42
Other values (15): 262

Length

Max length: 6
Median length: 4
Mean length: 4.4267857
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숭의4동
2nd row: 숭의4동
3rd row: 숭의4동
4th row: 숭의4동
5th row: 숭의4동

Common Values

Value              Count   Frequency (%)
주안4동            80      14.3%
용현3동            69      12.3%
도화2.3동          56      10.0%
숭의2동            51      9.1%
숭의4동            42      7.5%
용현1.4동          42      7.5%
도화1동            37      6.6%
학익2동            26      4.6%
학익1동            25      4.5%
숭의1.3동          25      4.5%
Other values (10)  107     19.1%

Length

[Figure: Histogram of lengths of the category]

주택유형 (housing type)
Categorical

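Alerts: High correlation, Imbalance

Distinct: 5
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

단독주택 (detached house): 383
다세대 (multi-household house): 153
연립 (row house): 12
아파트 (apartment): 8
다가구주택 (multi-family house): 4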

Length

Max length: 5
Median length: 4
Mean length: 3.6767857
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단독주택
2nd row: 단독주택
3rd row: 단독주택
4th row: 단독주택
5th row: 단독주택

Common Values

Value        Count   Frequency (%)
단독주택     383     68.4%
다세대       153     27.3%
연립         12      2.1%
아파트       8       1.4%
다가구주택   4       0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot for 주택유형]

건축년도 (construction year)
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 49
Distinct (%): 13.2%
Missing: 188
Missing (%): 33.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1984.4086
Minimum: 1938
Maximum: 2011
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 1938
5th percentile: 1969
Q1: 1977
Median: 1985
Q3: 1991
95th percentile: 2001
Maximum: 2011
Range: 73
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.606906
Coefficient of variation (CV): 0.0053451218
Kurtosis: 0.7704058
Mean: 1984.4086
Median Absolute Deviation (MAD): 7
Skewness: -0.44824352
Sum: 738200
Variance: 112.50645
Monotonicity: Not monotonic
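
Several of the figures above are derived from others, which makes them easy to cross-check. The sketch below assumes the mean is taken over non-missing values only (560 − 188 = 372) and that CV is the standard deviation divided by the mean; the variable names are illustrative.

```python
# Figures copied from the 건축년도 statistics above.
n_total = 560
n_missing = 188
total_sum = 738200
std_dev = 10.606906
q1, q3 = 1977, 1991
minimum, maximum = 1938, 2011

n_valid = n_total - n_missing      # rows with a recorded construction year
mean = total_sum / n_valid         # should match the reported mean
cv = std_dev / mean                # coefficient of variation
variance = std_dev ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(n_valid, round(mean, 4), round(variance, 4), iqr, value_range)
```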
[Figure: Histogram with fixed size bins (bins=49)]

Common values

Value              Count   Frequency (%)
1985               78      13.9%
1970               24      4.3%
1988               23      4.1%
1989               22      3.9%
1990               17      3.0%
1991               17      3.0%
1997               16      2.9%
1977               12      2.1%
1974               12      2.1%
1971               11      2.0%
Other values (39)  140     25.0%
(Missing)          188     33.6%

Minimum 10 values

Value   Count   Frequency (%)
1938    1       0.2%
1952    1       0.2%
1954    2       0.4%
1957    1       0.2%
1959    1       0.2%
1960    2       0.4%
1962    1       0.2%
1963    1       0.2%
1965    2       0.4%
1966    1       0.2%

Maximum 10 values

Value   Count   Frequency (%)
2011    2       0.4%
2010    1       0.2%
2009    3       0.5%
2003    2       0.4%
2002    6       1.1%
2001    11      2.0%
1999    1       0.2%
1998    3       0.5%
1997    16      2.9%
1996    7       1.2%

건축면적(제곱미터) (building area, square meters)
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 306
Distinct (%): 83.4%
Missing: 193
Missing (%): 34.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 118.60463
Minimum: 0
Maximum: 4671.18
Zeros: 3
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 23.158
Q1: 45.935
Median: 76.12
Q3: 98.495
95th percentile: 168.053
Maximum: 4671.18
Range: 4671.18
Interquartile range (IQR): 52.56

Descriptive statistics

Standard deviation: 356.50352
Coefficient of variation (CV): 3.0058144
Kurtosis: 144.66713
Mean: 118.60463
Median Absolute Deviation (MAD): 27.46
Skewness: 11.580303
Sum: 43527.901
Variance: 127094.76
Monotonicity: Not monotonic
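
The mean (118.6) sits well above the median (76.12), and the skewness of 11.58 suggests that a few very large values dominate. One common way to flag such values is Tukey's 1.5 × IQR fences. The report itself does not apply this rule; the sketch below is only an illustration using the quartiles and the largest values listed for this variable.

```python
# Quartiles copied from the quantile statistics of 건축면적(제곱미터).
q1, q3 = 45.935, 98.495
iqr = q3 - q1                       # matches the reported IQR of 52.56

# Tukey's rule: an assumption for illustration, not part of the report.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest distinct values listed in the report.
largest_values = [4671.18, 4665.53, 970.84, 837.0, 714.04,
                  705.78, 642.64, 612.43, 265.0, 217.99]
flagged = [v for v in largest_values if v > upper_fence]

print(f"upper fence = {upper_fence:.3f}, flagged {len(flagged)} of {len(largest_values)}")
```

Under this rule every one of the ten largest listed values lies above the upper fence, so the median and IQR are likely more representative of a typical building than the mean and standard deviation.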
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
80.86               8       1.4%
116.29              5       0.9%
43.47               4       0.7%
82.48               4       0.7%
67.32               3       0.5%
83.76               3       0.5%
69.12               3       0.5%
0.0                 3       0.5%
82.32               3       0.5%
105.48              3       0.5%
Other values (296)  328     58.6%
(Missing)           193     34.5%

Minimum 10 values

Value   Count   Frequency (%)
0.0     3       0.5%
14.3    1       0.2%
17.0    2       0.4%
18.21   1       0.2%
20.72   1       0.2%
21.25   1       0.2%
21.48   1       0.2%
22.12   1       0.2%
22.32   1       0.2%
22.66   1       0.2%

Maximum 10 values

Value     Count   Frequency (%)
4671.18   1       0.2%
4665.53   1       0.2%
970.84    2       0.4%
837.0     1       0.2%
714.04    3       0.5%
705.78    1       0.2%
642.64    1       0.2%
612.43    1       0.2%
265.0     1       0.2%
217.99    1       0.2%

대지면적(제곱미터) (land area, square meters)
Real number (ℝ)

Alerts: High correlation

Distinct: 356
Distinct (%): 63.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 458.84911
Minimum: 10
Maximum: 13985.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.1 KiB

Quantile statistics

Minimum: 10
5th percentile: 26.95
Q1: 76
Median: 143.05
Q3: 235.35
95th percentile: 2623.1
Maximum: 13985.6
Range: 13975.6
Interquartile range (IQR): 159.35

Descriptive statistics

Standard deviation: 1120.9406
Coefficient of variation (CV): 2.4429396
Kurtosis: 47.682956
Mean: 458.84911
Median Absolute Deviation (MAD): 74.85
Skewness: 5.7455023
Sum: 256955.5
Variance: 1256507.9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
1762.3              21      3.8%
2623.1              11      2.0%
4826.1              8       1.4%
182.0               8       1.4%
66.0                7       1.2%
152.0               7       1.2%
81.0                6       1.1%
36.0                6       1.1%
89.0                5       0.9%
33.0                5       0.9%
Other values (346)  476     85.0%

Minimum 10 values

Value   Count   Frequency (%)
10.0    1       0.2%
10.6    2       0.4%
13.0    4       0.7%
16.0    1       0.2%
17.7    1       0.2%
19.0    1       0.2%
19.3    1       0.2%
20.0    4       0.7%
20.2    1       0.2%
20.5    1       0.2%

Maximum 10 values

Value     Count   Frequency (%)
13985.6   1       0.2%
8083.0    1       0.2%
6887.2    1       0.2%
6109.0    2       0.4%
4826.1    8       1.4%
3676.0    1       0.2%
3570.0    1       0.2%
3306.1    2       0.4%
3291.0    1       0.2%
2887.0    1       0.2%

등급판정결과 (grade assessment result)
Categorical

Distinct: 4
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

2등급: 270
3등급: 144
1등급: 101
4등급: 45

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2등급
2nd row: 2등급
3rd row: 3등급
4th row: 3등급
5th row: 3등급

Common Values

Value   Count   Frequency (%)
2등급   270     48.2%
3등급   144     25.7%
1등급   101     18.0%
4등급   45      8.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot for 등급판정결과]

Interactions

[Figures: interaction plots (9 panels)]

Correlations

[Figure: correlation heatmap, all six variables]

                     행정동   주택유형   건축년도   건축면적(제곱미터)   대지면적(제곱미터)   등급판정결과
행정동               1.000    0.571      0.583      0.285                0.606                0.431
주택유형             0.571    1.000      0.596      0.651                0.363                0.370
건축년도             0.583    0.596      1.000      0.191                0.000                0.474
건축면적(제곱미터)   0.285    0.651      0.191      1.000                0.839                0.000
대지면적(제곱미터)   0.606    0.363      0.000      0.839                1.000                0.121
등급판정결과         0.431    0.370      0.474      0.000                0.121                1.000

[Figure: correlation heatmap, categorical variables only]

               주택유형   행정동   등급판정결과
주택유형       1.000      0.279    0.310
행정동         0.279      1.000    0.213
등급판정결과   0.310      0.213    1.000

[Figure: correlation heatmap, all six variables]

                     건축년도   건축면적(제곱미터)   대지면적(제곱미터)   행정동   주택유형   등급판정결과
건축년도             1.000      0.587                0.309                0.259    0.392      0.337
건축면적(제곱미터)   0.587      1.000                0.726                0.134    0.581      0.000
대지면적(제곱미터)   0.309      0.726                1.000                0.310    0.242      0.083
행정동               0.259      0.134                0.310                1.000    0.279      0.213
주택유형             0.392      0.581                0.242                0.279    1.000      0.310
등급판정결과         0.337      0.000                0.083                0.213    0.310      1.000
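
The "High correlation" alerts in the overview line up with the last of the three matrices above if a cutoff of 0.5 is applied. That threshold is an assumption here (the report does not state it), and the sketch simply lists the variable pairs at or above it.

```python
# Values transcribed from the last correlation matrix in this section.
columns = ["건축년도", "건축면적(제곱미터)", "대지면적(제곱미터)",
           "행정동", "주택유형", "등급판정결과"]
matrix = [
    [1.000, 0.587, 0.309, 0.259, 0.392, 0.337],
    [0.587, 1.000, 0.726, 0.134, 0.581, 0.000],
    [0.309, 0.726, 1.000, 0.310, 0.242, 0.083],
    [0.259, 0.134, 0.310, 1.000, 0.279, 0.213],
    [0.392, 0.581, 0.242, 0.279, 1.000, 0.310],
    [0.337, 0.000, 0.083, 0.213, 0.310, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

# Walk the upper triangle only, so each pair is reported once.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

All three pairs found this way involve 건축면적(제곱미터), which agrees with the alert that it is correlated with 건축년도 and two other fields.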

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
[Figure: Correlation heatmap measuring nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]

Sample

First rows

Index   행정동    주택유형   건축년도   건축면적(제곱미터)   대지면적(제곱미터)   등급판정결과
0       숭의4동   단독주택   1985       103.7                10.0                 2등급
1       숭의4동   단독주택   1997       59.46                87.0                 2등급
2       숭의4동   단독주택   <NA>       <NA>                 90.0                 3등급
3       숭의4동   단독주택   <NA>       <NA>                 90.0                 3등급
4       숭의4동   단독주택   <NA>       <NA>                 90.0                 3등급
5       숭의4동   단독주택   1985       23.02                513.0                1등급
6       숭의4동   단독주택   <NA>       <NA>                 40.0                 3등급
7       숭의4동   단독주택   1985       31.37                66.0                 3등급
8       숭의4동   단독주택   1985       32.86                69.0                 2등급
9       숭의4동   단독주택   1985       55.44                152.0                3등급

Last rows

Index   행정동    주택유형   건축년도   건축면적(제곱미터)   대지면적(제곱미터)   등급판정결과
550     주안8동   다세대     1989       95.48                166.4                2등급
551     주안8동   다세대     1989       95.48                166.4                2등급
552     주안6동   다세대     1990       165.52               391.8                2등급
553     문학동    다세대     1997       169.97               355.0                2등급
554     문학동    다세대     1988       91.62                224.0                2등급
555     문학동    단독주택   <NA>       <NA>                 139.0                4등급
556     문학동    단독주택   <NA>       <NA>                 109.0                2등급
557     문학동    단독주택   <NA>       <NA>                 8083.0               4등급
558     문학동    다세대     2001       161.34               295.1                2등급
559     문학동    단독주택   <NA>       <NA>                 251.9                1등급

Duplicate rows

Most frequently occurring

Index   행정동      주택유형   건축년도   건축면적(제곱미터)   대지면적(제곱미터)   등급판정결과   # duplicates
22      주안4동     단독주택   <NA>       <NA>                 1762.3               2등급          16
2       도화2.3동   다세대     1988       80.86                182.0                2등급          6
24      주안4동     단독주택   <NA>       <NA>                 2623.1               2등급          6
18      용현3동     연립       2001       116.29               81.0                 1등급          5
25      주안4동     단독주택   <NA>       <NA>                 2623.1               3등급          5
9       숭의2동     단독주택   <NA>       <NA>                 4826.1               2등급          4
30      학익2동     단독주택   <NA>       <NA>                 1033.0               3등급          4
5       도화2.3동   다세대     1988       105.48               236.7                2등급          3
6       도화2.3동   다세대     1989       67.32                151.7                2등급          3
11      숭의4동     단독주택   <NA>       <NA>                 90.0                 3등급          3
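
The ten groups listed above already account for 55 row occurrences, so the headline figure of 32 duplicate rows appears to count distinct rows that occur more than once rather than every repeated occurrence. The sketch below illustrates that reading on six rows taken from the Sample section (indices 2, 3, 4, 6, 550, 551); the grouping logic is an assumption about how the count is formed, not the tool's documented behavior.

```python
from collections import Counter

# Rows copied from the Sample section; None stands in for <NA>.
rows = [
    ("숭의4동", "단독주택", None, None, 90.0, "3등급"),     # index 2
    ("숭의4동", "단독주택", None, None, 90.0, "3등급"),     # index 3
    ("숭의4동", "단독주택", None, None, 90.0, "3등급"),     # index 4
    ("숭의4동", "단독주택", None, None, 40.0, "3등급"),     # index 6
    ("주안8동", "다세대", 1989, 95.48, 166.4, "2등급"),     # index 550
    ("주안8동", "다세대", 1989, 95.48, 166.4, "2등급"),     # index 551
]

occurrences = Counter(rows)

# Keep only rows that appear more than once; the value is the "# duplicates" count.
duplicate_groups = {row: n for row, n in occurrences.items() if n > 1}

for row, n in sorted(duplicate_groups.items(), key=lambda item: -item[1]):
    print(n, row)
```

On this excerpt the result is two duplicate groups, with 3 and 2 occurrences. The group of three corresponds to the 숭의4동 entry with 3 duplicates in the table above.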