Overview

Dataset statistics

Number of variables: 6
Number of observations: 1967
Missing cells: 84
Missing cells (%): 0.7%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 101.9 KiB
Average record size in memory: 53.1 B
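
The missing-cell figures above can be cross-checked against the per-variable missing counts reported in the Variables section of this report (7 for the text variable; 21, 21, 14, 14, and 7 for 2019 through 2023). The following is a minimal sketch in Python that uses only numbers appearing in this report; the variable names are illustrative.

```python
# Per-variable missing counts, copied from the Variables section of this report.
missing_per_variable = {
    "지역 및 연령": 7,
    "2019": 21,
    "2020": 21,
    "2021": 14,
    "2022": 14,
    "2023": 7,
}

n_variables = 6
n_observations = 1967

# Missing cells is the sum of the per-variable missing counts.
missing_cells = sum(missing_per_variable.values())

# The percentage is taken over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 84
print(round(missing_pct, 1))  # 0.7
```

Seven of the missing values in every variable come from the seven entirely empty rows at the end of the data, which also account for the duplicate-row alert.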

Variable types

Text: 1
Numeric: 5

Dataset

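Description: Data from the real estate transaction statistics provided by the Korea Real Estate Board (한국부동산원, formerly the Korea Appraisal Board), covering apartment sale transactions by year and buyer age group. Unit: number of units (동(호)수). Release schedule: around the end of the following month.
Author: 한국부동산원 (Korea Real Estate Board)
URL: https://www.data.go.kr/data/15068658/fileData.do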

Alerts

Dataset has 1 (0.1%) duplicate rows [Duplicates]
2019 is highly overall correlated with 2020 and 3 other fields [High correlation]
2020 is highly overall correlated with 2019 and 3 other fields [High correlation]
2021 is highly overall correlated with 2019 and 3 other fields [High correlation]
2022 is highly overall correlated with 2019 and 3 other fields [High correlation]
2023 is highly overall correlated with 2019 and 3 other fields [High correlation]
2019 has 21 (1.1%) missing values [Missing]
2020 has 21 (1.1%) missing values [Missing]
2019 has 22 (1.1%) zeros [Zeros]
2021 has 29 (1.5%) zeros [Zeros]
2022 has 35 (1.8%) zeros [Zeros]
2023 has 38 (1.9%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-23 06:32:16.656756
Analysis finished: 2024-03-23 06:32:29.027291
Duration: 12.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

지역 및 연령 (Region and age)
Text

Distinct: 1960
Distinct (%): 100.0%
Missing: 7
Missing (%): 0.4%
Memory size: 15.5 KiB

Length

Max length: 17
Median length: 16
Mean length: 10.539286
Min length: 6

Characters and Unicode

Total characters: 20657
Distinct characters: 153
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
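
Python's standard `unicodedata` module exposes the general category of each code point, which is enough to produce the kind of category counts shown in this section. The following is a minimal sketch applied to the first sample value of this variable; `category_counts` is an illustrative helper, not part of the profiling tool.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of the text variable in this report.
sample = "전국 /20대이하"
counts = category_counts(sample)

# Lo = Other Letter, Nd = Decimal Number, Zs = Space Separator, Po = Other Punctuation
for code, count in sorted(counts.items()):
    print(code, count)
```

For this one value the sketch finds five letters, two digits, one space, and one punctuation mark, the same four categories the report lists for the whole column.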

Unique

Unique: 1960
Unique (%): 100.0%

Sample

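1st row: 전국 /20대이하 (Nationwide / 20s and under)
2nd row: 전국 /30대 (Nationwide / 30s)
3rd row: 전국 /40대 (Nationwide / 40s)
4th row: 전국 /50대 (Nationwide / 50s)
5th row: 전국 /60대 (Nationwide / 60s)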

Value                Count  Frequency (%)
경기                 343    8.8%
서울                 182    4.6%
경북                 182    4.6%
경남                 168    4.3%
전남                 161    4.1%
강원                 133    3.4%
충남                 126    3.2%
전북                 119    3.0%
부산                 119    3.0%
충북                 112    2.9%
Other values (1688)  2275   58.0%

Most occurring characters

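Value               Count  Frequency (%)
(space)             1960   9.5%
/                   1960   9.5%
(not preserved)     1820   8.8%
0                   1680   8.1%
(not preserved)     812    3.9%
(not preserved)     777    3.8%
(not preserved)     714    3.5%
(not preserved)     637    3.1%
(not preserved)     609    2.9%
(not preserved)     574    2.8%
Other values (143)  9114   44.1%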

Most occurring categories

Value              Count  Frequency (%)
Other Letter       13377  64.8%
Decimal Number     3360   16.3%
Space Separator    1960   9.5%
Other Punctuation  1960   9.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     1820   13.6%
(not preserved)     812    6.1%
(not preserved)     777    5.8%
(not preserved)     714    5.3%
(not preserved)     637    4.8%
(not preserved)     609    4.6%
(not preserved)     574    4.3%
(not preserved)     567    4.2%
(not preserved)     469    3.5%
(not preserved)     350    2.6%
Other values (134)  6048   45.2%

Decimal Number
Value  Count  Frequency (%)
0      1680   50.0%
3      280    8.3%
5      280    8.3%
6      280    8.3%
2      280    8.3%
7      280    8.3%
4      280    8.3%

Space Separator
Value    Count  Frequency (%)
(space)  1960   100.0%

Other Punctuation
Value  Count  Frequency (%)
/      1960   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  13377  64.8%
Common  7280   35.2%

Most frequent character per script

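Hangul
Value               Count  Frequency (%)
(not preserved)     1820   13.6%
(not preserved)     812    6.1%
(not preserved)     777    5.8%
(not preserved)     714    5.3%
(not preserved)     637    4.8%
(not preserved)     609    4.6%
(not preserved)     574    4.3%
(not preserved)     567    4.2%
(not preserved)     469    3.5%
(not preserved)     350    2.6%
Other values (134)  6048   45.2%

Common
Value    Count  Frequency (%)
(space)  1960   26.9%
/        1960   26.9%
0        1680   23.1%
3        280    3.8%
5        280    3.8%
6        280    3.8%
2        280    3.8%
7        280    3.8%
4        280    3.8%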

Most occurring blocks

Value   Count  Frequency (%)
Hangul  13377  64.8%
ASCII   7280   35.2%

Most frequent character per block

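ASCII
Value    Count  Frequency (%)
(space)  1960   26.9%
/        1960   26.9%
0        1680   23.1%
3        280    3.8%
5        280    3.8%
6        280    3.8%
2        280    3.8%
7        280    3.8%
4        280    3.8%

Hangul
Value               Count  Frequency (%)
(not preserved)     1820   13.6%
(not preserved)     812    6.1%
(not preserved)     777    5.8%
(not preserved)     714    5.3%
(not preserved)     637    4.8%
(not preserved)     609    4.6%
(not preserved)     574    4.3%
(not preserved)     567    4.2%
(not preserved)     469    3.5%
(not preserved)     350    2.6%
Other values (134)  6048   45.2%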

2019
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 872
Distinct (%): 44.8%
Missing: 21
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 897.90339
Minimum: 0
Maximum: 156664
Zeros: 22
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 17.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 4
Q1: 40
Median: 173
Q3: 577.75
95th percentile: 2268.75
Maximum: 156664
Range: 156664
Interquartile range (IQR): 537.75

Descriptive statistics

Standard deviation: 5886.3392
Coefficient of variation (CV): 6.5556487
Kurtosis: 455.13628
Mean: 897.90339
Median Absolute Deviation (MAD): 159
Skewness: 19.99382
Sum: 1747320
Variance: 34648989
Monotonicity: Not monotonic
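
Several of the statistics above are tied together by simple identities, which makes the reported values easy to sanity-check. The following is a minimal sketch in Python using only the figures reported for 2019; the variable names are illustrative, and the tolerances allow for the rounding in the report.

```python
import math

# Values reported for the 2019 variable in this report.
n_observations = 1967
n_missing = 21
total = 1747320          # Sum
mean = 897.90339
std = 5886.3392
variance = 34648989
cv = 6.5556487
q1, q3 = 40, 577.75
iqr = 537.75

# Statistics are computed over the non-missing values only.
n_valid = n_observations - n_missing

checks = {
    "mean = sum / count": math.isclose(total / n_valid, mean, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

A mean of about 898 against a median of 173, together with a CV above 6, reflects how strongly the few nationwide and province-level totals dominate the distribution.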
Histogram with fixed size bins (bins=50)
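
The caption above states that the histogram uses 50 fixed-size bins. The following is a minimal sketch of what such binning means, assuming equal-width bins that span the minimum to the maximum. `fixed_width_histogram` is an illustrative helper rather than the profiler's own code, and the input is a hand-picked subset of the quantiles and extreme values reported for 2019, not the full column.

```python
def fixed_width_histogram(values, bins=50):
    """Count values into `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use a single occupied bin.
        return [len(values)] + [0] * (bins - 1), 0.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Illustrative subset: quantiles and a few extreme values reported for 2019.
values = [0, 4, 40, 173, 577.75, 2268.75, 23398, 63429, 115110, 130914, 156664]
counts, width = fixed_width_histogram(values)

print(round(width, 2))   # width of each bin
print(counts[0])         # number of values that land in the first bin
```

Because each bin is roughly 3,133 units wide here, everything up to and beyond the 95th percentile falls into the first bin, which is why a histogram of such skewed data shows one tall bar and a long, nearly empty tail.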
Common values
Value               Count  Frequency (%)
4                   27     1.4%
7                   26     1.3%
6                   23     1.2%
1                   22     1.1%
0                   22     1.1%
8                   20     1.0%
3                   18     0.9%
13                  18     0.9%
2                   18     0.9%
21                  17     0.9%
Other values (862)  1735   88.2%
(Missing)           21     1.1%

Minimum 10 values
Value  Count  Frequency (%)
0      22     1.1%
1      22     1.1%
2      18     0.9%
3      18     0.9%
4      27     1.4%
5      15     0.8%
6      23     1.2%
7      26     1.3%
8      20     1.0%
9      13     0.7%

Maximum 10 values
Value   Count  Frequency (%)
156664  1      0.1%
130914  1      0.1%
115110  1      0.1%
63429   1      0.1%
42185   1      0.1%
34386   1      0.1%
29688   1      0.1%
28737   1      0.1%
26809   1      0.1%
23398   1      0.1%

2020
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 1055
Distinct (%): 54.2%
Missing: 21
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1544.7297
Minimum: 0
Maximum: 257112
Zeros: 18
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 17.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 5
Q1: 55
Median: 292
Q3: 991.25
95th percentile: 3825.75
Maximum: 257112
Range: 257112
Interquartile range (IQR): 936.25

Descriptive statistics

Standard deviation: 10011.34
Coefficient of variation (CV): 6.4809657
Kurtosis: 425.64331
Mean: 1544.7297
Median Absolute Deviation (MAD): 273
Skewness: 19.308694
Sum: 3006044
Variance: 1.0022693 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
3                    21     1.1%
5                    18     0.9%
0                    18     0.9%
7                    17     0.9%
9                    17     0.9%
2                    17     0.9%
4                    17     0.9%
1                    16     0.8%
14                   15     0.8%
6                    14     0.7%
Other values (1045)  1776   90.3%
(Missing)            21     1.1%

Minimum 10 values
Value  Count  Frequency (%)
0      18     0.9%
1      16     0.8%
2      17     0.9%
3      21     1.1%
4      17     0.9%
5      18     0.9%
6      14     0.7%
7      17     0.9%
8      11     0.6%
9      17     0.9%

Maximum 10 values
Value   Count  Frequency (%)
257112  1      0.1%
227768  1      0.1%
188046  1      0.1%
115249  1      0.1%
78637   1      0.1%
72071   1      0.1%
56163   1      0.1%
53088   1      0.1%
47945   1      0.1%
44870   1      0.1%

2021
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 959
Distinct (%): 49.1%
Missing: 14
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1099.6749
Minimum: 0
Maximum: 169838
Zeros: 29
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 17.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 4
Q1: 48
Median: 227
Q3: 714
95th percentile: 2948.8
Maximum: 169838
Range: 169838
Interquartile range (IQR): 666

Descriptive statistics

Standard deviation: 6937.9512
Coefficient of variation (CV): 6.3090932
Kurtosis: 414.47209
Mean: 1099.6749
Median Absolute Deviation (MAD): 211
Skewness: 19.069752
Sum: 2147665
Variance: 48135167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   29     1.5%
7                   26     1.3%
2                   23     1.2%
6                   19     1.0%
5                   17     0.9%
1                   17     0.9%
3                   16     0.8%
10                  16     0.8%
11                  16     0.8%
9                   15     0.8%
Other values (949)  1759   89.4%

Minimum 10 values
Value  Count  Frequency (%)
0      29     1.5%
1      17     0.9%
2      23     1.2%
3      16     0.8%
4      14     0.7%
5      17     0.9%
6      19     1.0%
7      26     1.3%
8      13     0.7%
9      15     0.8%

Maximum 10 values
Value   Count  Frequency (%)
169838  1      0.1%
166281  1      0.1%
127330  1      0.1%
86820   1      0.1%
51711   1      0.1%
46295   1      0.1%
44441   1      0.1%
41111   1      0.1%
33361   1      0.1%
32148   1      0.1%

2022
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 677
Distinct (%): 34.7%
Missing: 14
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 488.99334
Minimum: 0
Maximum: 71861
Zeros: 35
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 17.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 3
Q1: 28
Median: 103
Q3: 309
95th percentile: 1329.2
Maximum: 71861
Range: 71861
Interquartile range (IQR): 281

Descriptive statistics

Standard deviation: 2998.6797
Coefficient of variation (CV): 6.1323528
Kurtosis: 401.05252
Mean: 488.99334
Median Absolute Deviation (MAD): 91
Skewness: 18.899892
Sum: 955004
Variance: 8992080
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   35     1.8%
1                   31     1.6%
4                   31     1.6%
3                   25     1.3%
2                   22     1.1%
5                   22     1.1%
7                   22     1.1%
8                   22     1.1%
9                   21     1.1%
6                   20     1.0%
Other values (667)  1702   86.5%

Minimum 10 values
Value  Count  Frequency (%)
0      35     1.8%
1      31     1.6%
2      22     1.1%
3      25     1.3%
4      31     1.6%
5      22     1.1%
6      20     1.0%
7      22     1.1%
8      22     1.1%
9      21     1.1%

Maximum 10 values
Value  Count  Frequency (%)
71861  1      0.1%
66790  1      0.1%
62704  1      0.1%
41675  1      0.1%
20654  1      0.1%
18045  1      0.1%
16852  1      0.1%
15830  1      0.1%
13827  1      0.1%
10967  1      0.1%

2023
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 776
Distinct (%): 39.6%
Missing: 7
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 673.80765
Minimum: 0
Maximum: 109529
Zeros: 38
Zeros (%): 1.9%
Negative: 0
Negative (%): 0.0%
Memory size: 17.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 3
Q1: 26.75
Median: 123
Q3: 412
95th percentile: 1689.85
Maximum: 109529
Range: 109529
Interquartile range (IQR): 385.25

Descriptive statistics

Standard deviation: 4445.5488
Coefficient of variation (CV): 6.5976526
Kurtosis: 437.22859
Mean: 673.80765
Median Absolute Deviation (MAD): 113
Skewness: 19.771198
Sum: 1320663
Variance: 19762904
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   38     1.9%
1                   35     1.8%
5                   34     1.7%
6                   28     1.4%
8                   27     1.4%
11                  22     1.1%
7                   22     1.1%
3                   22     1.1%
4                   21     1.1%
16                  20     1.0%
Other values (766)  1691   86.0%

Minimum 10 values
Value  Count  Frequency (%)
0      38     1.9%
1      35     1.8%
2      20     1.0%
3      22     1.1%
4      21     1.1%
5      34     1.7%
6      28     1.4%
7      22     1.1%
8      27     1.4%
9      15     0.8%

Maximum 10 values
Value   Count  Frequency (%)
109529  1      0.1%
106272  1      0.1%
88516   1      0.1%
56233   1      0.1%
30935   1      0.1%
27118   1      0.1%
23825   1      0.1%
21363   1      0.1%
18772   1      0.1%
12949   1      0.1%

Interactions

[Figures: 25 pairwise interaction plots for the variables 2019, 2020, 2021, 2022, and 2023]

Correlations

      2019   2020   2021   2022   2023
2019  1.000  0.982  0.992  0.929  0.960
2020  0.982  1.000  0.942  0.957  0.974
2021  0.992  0.942  1.000  0.939  0.966
2022  0.929  0.957  0.939  1.000  0.990
2023  0.960  0.974  0.966  0.990  1.000

      2019   2020   2021   2022   2023
2019  1.000  0.970  0.929  0.881  0.934
2020  0.970  1.000  0.946  0.894  0.938
2021  0.929  0.946  1.000  0.952  0.947
2022  0.881  0.894  0.952  1.000  0.928
2023  0.934  0.938  0.947  0.928  1.000
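
The correlation matrices above are not labelled with the coefficient they contain. As a point of reference only, the sketch below computes two common coefficients, Pearson and Spearman, for 2019 against 2020 using just the seven nationwide (전국) rows shown in the Sample section. Because it uses seven rows instead of the full column, its results are not expected to match either matrix, and the helper names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks from 1 to n; assumes no ties, which holds for the rows below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Nationwide rows (20s and under through "other"), as listed in the Sample section.
y2019 = [23398, 130914, 156664, 115110, 63429, 28737, 26809]
y2020 = [44870, 227768, 257112, 188046, 115249, 53088, 47945]

print(round(pearson(y2019, y2020), 3))
print(round(spearman(y2019, y2020), 3))
```

Pearson responds to the size of the values, so a handful of very large totals can dominate it, whereas Spearman depends only on their ordering. That difference is one reason two correlation matrices for the same columns can disagree.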

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
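
Nullity correlation as described above can be illustrated as the Pearson correlation between 0/1 missing-value indicators of two columns. The following is a minimal sketch under that assumption; the three columns are hypothetical and not taken from the dataset, and the helper names are not part of any library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def nullity(column):
    """Map a column to 1 where the value is missing (None) and 0 otherwise."""
    return [1 if value is None else 0 for value in column]


# Hypothetical columns for illustration only.
col_a = [10, None, 30, None, 50, 60]
col_b = [1, None, 3, None, 5, 6]         # missing in exactly the same rows as col_a
col_c = [None, 2, None, 4, None, None]   # missing exactly where col_a is present

corr_ab = pearson(nullity(col_a), nullity(col_b))
corr_ac = pearson(nullity(col_a), nullity(col_c))

print(round(corr_ab, 3))  # 1.0: the two columns are always missing together
print(round(corr_ac, 3))  # -1.0: one is missing exactly when the other is present
```

In this report the seven fully empty rows make every column missing together in those rows, which pushes nullity correlations between columns toward the positive end.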

Sample

   지역 및 연령      2019    2020    2021   2022   2023
0  전국 /20대이하    23398   44870   41111  18045  18772
1  전국 /30대        130914  227768  166281 66790  109529
2  전국 /40대        156664  257112  169838 71861  106272
3  전국 /50대        115110  188046  127330 62704  88516
4  전국 /60대        63429   115249  86820  41675  56233
5  전국 /70대이상    28737   53088   44441  20654  23825
6  전국 /기타        26809   47945   33361  16852  8665
7  서울 /20대이하    2155    3622    2614   862    1257
8  서울 /30대        20691   31372   18116  4344   12048
9  서울 /40대        20562   25804   13146  3632   10425
지역 및 연령20192020202120222023
1957제주 서귀포시/60대94128164120101
1958제주 서귀포시/70대이상2558864035
1959제주 서귀포시/기타55167305333
1960<NA><NA><NA><NA><NA><NA>
1961<NA><NA><NA><NA><NA><NA>
1962<NA><NA><NA><NA><NA><NA>
1963<NA><NA><NA><NA><NA><NA>
1964<NA><NA><NA><NA><NA><NA>
1965<NA><NA><NA><NA><NA><NA>
1966<NA><NA><NA><NA><NA><NA>

Duplicate rows

Most frequently occurring

   지역 및 연령  2019  2020  2021  2022  2023  # duplicates
0  <NA>          <NA>  <NA>  <NA>  <NA>  <NA>  7