Overview

Dataset statistics

Number of variables: 6
Number of observations: 604
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.8 KiB
Average record size in memory: 52.2 B
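As a quick consistency check, the average record size follows from the total memory size divided by the number of observations. The sketch below assumes the report uses 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 30.8
n_observations = 604

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 52.2 B.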

Variable types

DateTime: 1
Categorical: 1
Numeric: 4

Dataset

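Description: Number of successful housing subscription applicants by region, provided by Cheongyak Home (청약홈), a service of the Korea Real Estate Board (한국부동산원, formerly the Korea Appraisal Board). Note: data up to the previous month is published on the 25th of each month, and the previous month's figures may be revised later.
Author: Korea Real Estate Board (한국부동산원)
URL: https://www.data.go.kr/data/15110976/fileData.do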

Alerts

30대 이하 is highly overall correlated with 40대 and 2 other fields (High correlation)
40대 is highly overall correlated with 30대 이하 and 2 other fields (High correlation)
50대 is highly overall correlated with 30대 이하 and 2 other fields (High correlation)
60대 이상 is highly overall correlated with 30대 이하 and 2 other fields (High correlation)
40대 has 16 (2.6%) zeros (Zeros)
50대 has 27 (4.5%) zeros (Zeros)
60대 이상 has 50 (8.3%) zeros (Zeros)
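The zero percentages in these alerts are consistent with each zero count divided by the 604 observations reported in the dataset statistics. A minimal sketch of that arithmetic, with a hypothetical helper `percent`:

```python
# Number of observations, from the dataset statistics.
n_observations = 604

# Zero counts per column, copied from the alerts above.
zero_counts = {"40대": 16, "50대": 27, "60대 이상": 50}


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_percentages = {col: percent(n, n_observations) for col, n in zero_counts.items()}

for column, pct in zero_percentages.items():
    print(f"{column}: {pct}%")
```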

Reproduction

Analysis started: 2024-05-25 19:19:48.310932
Analysis finished: 2024-05-25 19:19:53.646444
Duration: 5.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
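The reported duration can be recovered from the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-25 19:19:48.310932")
finished = datetime.fromisoformat("2024-05-25 19:19:53.646444")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```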

Variables

연월 (year-month)
Date

Distinct: 51
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB
Minimum: 2020-02-01 00:00:00
Maximum: 2024-04-01 00:00:00
[Figure: histogram with fixed size bins (bins=50)]
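The 51 distinct values match the number of calendar months from the minimum to the maximum date, which suggests that no month is absent from the range. The sketch below checks this with a hypothetical helper `months_inclusive`. It also compares the 604 observations against the full grid of months and regions, assuming each row is one (month, region) pair; 17 is the distinct count reported for 시도.

```python
from datetime import date


def months_inclusive(first, last):
    """Count calendar months from first to last, including both ends."""
    return (last.year - first.year) * 12 + (last.month - first.month) + 1


# Minimum and maximum of 연월, from the report.
n_months = months_inclusive(date(2020, 2, 1), date(2024, 4, 1))

# Distinct 시도 values and total observations, from the report.
n_regions = 17
n_observations = 604

# Assumption: each observation is one (month, region) pair.
full_grid = n_months * n_regions
absent_pairs = full_grid - n_observations

print(n_months, full_grid, absent_pairs)
```

Under that assumption, 263 of the 867 possible month and region combinations have no row, so not every region appears in every month.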

시도 (province or metropolitan city)
Categorical

Distinct: 17
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Value  Count
경기  50
부산  44
인천  44
서울  43
충남  43
Other values (12)  380

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울
2nd row: 부산
3rd row: 대구
4th row: 인천
5th row: 울산

Common Values

Value  Count  Frequency (%)
경기  50  8.3%
부산  44  7.3%
인천  44  7.3%
서울  43  7.1%
충남  43  7.1%
대구  39  6.5%
경남  39  6.5%
경북  37  6.1%
강원  36  6.0%
전북  36  6.0%
Other values (7)  193  32.0%
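The "Other values" row is consistent with what remains after the ten listed regions are subtracted from the 604 observations. A short sketch of that check, with the counts copied from the table above:

```python
# Counts of the ten listed 시도 values, in table order.
top_counts = [50, 44, 44, 43, 43, 39, 39, 37, 36, 36]

# Number of observations, from the dataset statistics.
n_observations = 604

# Rows that belong to the remaining 7 regions.
other_count = n_observations - sum(top_counts)
other_percent = round(100 * other_count / n_observations, 1)

print(other_count, f"{other_percent}%")
```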

Length

[Figure: histogram of lengths of the category]

Value  Count  Frequency (%)
경기  50  8.3%
인천  44  7.3%
부산  44  7.3%
서울  43  7.1%
충남  43  7.1%
대구  39  6.5%
경남  39  6.5%
경북  37  6.1%
전남  36  6.0%
강원  36  6.0%
Other values (7)  193  32.0%

30대 이하 (winners in their 30s or younger)
Real number (ℝ)

HIGH CORRELATION

Distinct: 466
Distinct (%): 77.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 638.52152
Minimum: 0
Maximum: 6103
Zeros: 4
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8
Q1: 93.5
Median: 322.5
Q3: 795.25
95-th percentile: 2353.6
Maximum: 6103
Range: 6103
Interquartile range (IQR): 701.75

Descriptive statistics

Standard deviation: 868.08345
Coefficient of variation (CV): 1.3595211
Kurtosis: 8.9292225
Mean: 638.52152
Median Absolute Deviation (MAD): 269
Skewness: 2.6762489
Sum: 385667
Variance: 753568.88
Monotonicity: Not monotonic
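Several of these statistics can be derived from one another, which is a handy way to sanity-check the report. The sketch below recomputes the mean, coefficient of variation, variance, IQR, and range for 30대 이하 from the other reported figures; small differences in the last digits come from the rounding of the inputs.

```python
# Reported figures for 30대 이하.
total = 385667          # Sum
n_observations = 604    # Number of observations
std_dev = 868.08345     # Standard deviation
q1, q3 = 93.5, 795.25   # First and third quartiles
minimum, maximum = 0, 6103

mean = total / n_observations   # Mean = sum / count
cv = std_dev / mean             # Coefficient of variation = std / mean
variance = std_dev ** 2         # Variance = std squared
iqr = q3 - q1                   # Interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.7f} variance={variance:.2f} iqr={iqr} range={value_range}")
```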
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
1  5  0.8%
8  5  0.8%
83  4  0.7%
58  4  0.7%
0  4  0.7%
97  4  0.7%
7  4  0.7%
31  4  0.7%
40  3  0.5%
6  3  0.5%
Other values (456)  564  93.4%

Minimum 10 values

Value  Count  Frequency (%)
0  4  0.7%
1  5  0.8%
2  3  0.5%
3  2  0.3%
4  3  0.5%
5  3  0.5%
6  3  0.5%
7  4  0.7%
8  5  0.8%
9  3  0.5%

Maximum 10 values

Value  Count  Frequency (%)
6103  1  0.2%
5312  1  0.2%
5248  1  0.2%
4900  1  0.2%
4562  1  0.2%
4547  1  0.2%
4541  1  0.2%
4052  1  0.2%
3990  1  0.2%
3810  1  0.2%

40대 (winners in their 40s)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 369
Distinct (%): 61.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 321.57616
Minimum: 0
Maximum: 2763
Zeros: 16
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 45.75
Median: 179
Q3: 401.25
95-th percentile: 1227.15
Maximum: 2763
Range: 2763
Interquartile range (IQR): 355.5

Descriptive statistics

Standard deviation: 429.54753
Coefficient of variation (CV): 1.3357568
Kurtosis: 8.0336197
Mean: 321.57616
Median Absolute Deviation (MAD): 151.5
Skewness: 2.5560734
Sum: 194232
Variance: 184511.08
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
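A histogram with fixed size bins splits the value range into equal-width intervals. The sketch below assumes the 50 bins span the reported minimum (0) to maximum (2763) of 40대, and that the top edge belongs to the last bin, as in NumPy's convention. The report itself does not state the bin edges, and `bin_index` is a hypothetical helper.

```python
def bin_index(value, lo, hi, bins):
    """Return the 0-based index of the equal-width bin that contains value.

    Assumption: bins are half-open [left, right), except the last bin,
    which also includes the upper edge hi.
    """
    if not lo <= value <= hi:
        raise ValueError("value lies outside the histogram range")
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)


# Reported range of 40대 and the bin count from the histogram caption.
LO, HI, BINS = 0, 2763, 50

bin_width = (HI - LO) / BINS
median_bin = bin_index(179, LO, HI, BINS)     # reported median
q3_bin = bin_index(401.25, LO, HI, BINS)      # reported Q3
max_bin = bin_index(2763, LO, HI, BINS)       # reported maximum

print(bin_width, median_bin, q3_bin, max_bin)
```

Under these assumptions each bin is about 55.26 wide and the third quartile (401.25) falls in bin 7, so at least three quarters of the observations sit in the first 8 of the 50 bins, in line with the strong right skew reported above.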
Common values

Value  Count  Frequency (%)
0  16  2.6%
1  12  2.0%
2  10  1.7%
7  7  1.2%
4  5  0.8%
15  5  0.8%
3  5  0.8%
48  5  0.8%
49  4  0.7%
28  4  0.7%
Other values (359)  531  87.9%

Minimum 10 values

Value  Count  Frequency (%)
0  16  2.6%
1  12  2.0%
2  10  1.7%
3  5  0.8%
4  5  0.8%
5  4  0.7%
6  3  0.5%
7  7  1.2%
8  3  0.5%
9  3  0.5%

Maximum 10 values

Value  Count  Frequency (%)
2763  1  0.2%
2744  1  0.2%
2581  1  0.2%
2452  1  0.2%
2441  1  0.2%
2288  1  0.2%
1898  1  0.2%
1872  1  0.2%
1870  1  0.2%
1729  1  0.2%

50대 (winners in their 50s)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 295
Distinct (%): 48.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 159.59437
Minimum: 0
Maximum: 1429
Zeros: 27
Zeros (%): 4.5%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 19
Median: 84
Q3: 200.25
95-th percentile: 588.65
Maximum: 1429
Range: 1429
Interquartile range (IQR): 181.25

Descriptive statistics

Standard deviation: 212.68622
Coefficient of variation (CV): 1.3326674
Kurtosis: 7.4300612
Mean: 159.59437
Median Absolute Deviation (MAD): 74
Skewness: 2.4735706
Sum: 96395
Variance: 45235.429
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value  Count  Frequency (%)
0  27  4.5%
1  18  3.0%
2  12  2.0%
3  12  2.0%
54  10  1.7%
10  9  1.5%
12  9  1.5%
19  8  1.3%
16  8  1.3%
5  7  1.2%
Other values (285)  484  80.1%

Minimum 10 values

Value  Count  Frequency (%)
0  27  4.5%
1  18  3.0%
2  12  2.0%
3  12  2.0%
4  6  1.0%
5  7  1.2%
6  4  0.7%
7  4  0.7%
8  3  0.5%
9  3  0.5%

Maximum 10 values

Value  Count  Frequency (%)
1429  1  0.2%
1292  1  0.2%
1193  1  0.2%
1131  1  0.2%
1077  1  0.2%
1056  1  0.2%
1026  1  0.2%
1009  1  0.2%
942  1  0.2%
915  1  0.2%

60대 이상 (winners in their 60s or older)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 218
Distinct (%): 36.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.639073
Minimum: 0
Maximum: 709
Zeros: 50
Zeros (%): 8.3%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
Median: 44
Q3: 104.25
95-th percentile: 314.7
Maximum: 709
Range: 709
Interquartile range (IQR): 94.25

Descriptive statistics

Standard deviation: 112.39816
Coefficient of variation (CV): 1.3601091
Kurtosis: 7.8390848
Mean: 82.639073
Median Absolute Deviation (MAD): 39
Skewness: 2.5220598
Sum: 49914
Variance: 12633.345
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value  Count  Frequency (%)
0  50  8.3%
1  20  3.3%
2  16  2.6%
12  15  2.5%
16  12  2.0%
5  12  2.0%
3  11  1.8%
4  9  1.5%
10  8  1.3%
34  8  1.3%
Other values (208)  443  73.3%

Minimum 10 values

Value  Count  Frequency (%)
0  50  8.3%
1  20  3.3%
2  16  2.6%
3  11  1.8%
4  9  1.5%
5  12  2.0%
6  8  1.3%
7  7  1.2%
8  7  1.2%
9  7  1.2%

Maximum 10 values

Value  Count  Frequency (%)
709  1  0.2%
675  1  0.2%
674  1  0.2%
669  1  0.2%
641  1  0.2%
610  1  0.2%
503  1  0.2%
501  1  0.2%
495  1  0.2%
485  1  0.2%

Interactions

[Figures: 16 interaction plots, not reproduced in this text export]

Correlations

[Figure: correlation heatmap]

           연월     시도     30대 이하  40대    50대    60대 이상
연월        1.000   0.000   0.000     0.000   0.000   0.000
시도        0.000   1.000   0.475     0.471   0.437   0.404
30대 이하   0.000   0.475   1.000     0.824   0.903   0.926
40대        0.000   0.471   0.824     1.000   0.890   0.854
50대        0.000   0.437   0.903     0.890   1.000   0.965
60대 이상   0.000   0.404   0.926     0.854   0.965   1.000

[Figure: correlation heatmap]

           30대 이하  40대    50대    60대 이상  시도
30대 이하   1.000     0.950   0.922   0.930     0.205
40대        0.950     1.000   0.979   0.972     0.207
50대        0.922     0.979   1.000   0.975     0.185
60대 이상   0.930     0.972   0.975   1.000     0.168
시도        0.205     0.207   0.185   0.168     1.000
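The high-correlation alerts can be related to the first matrix above by listing, for each column, the other columns whose coefficient reaches a cutoff. The sketch below uses a cutoff of 0.8, which is an assumption chosen for illustration and not necessarily the threshold ydata-profiling applies; `high_correlation_partners` is a hypothetical helper.

```python
# Column order and values of the first correlation matrix in the report.
columns = ["연월", "시도", "30대 이하", "40대", "50대", "60대 이상"]
matrix = [
    [1.000, 0.000, 0.000, 0.000, 0.000, 0.000],
    [0.000, 1.000, 0.475, 0.471, 0.437, 0.404],
    [0.000, 0.475, 1.000, 0.824, 0.903, 0.926],
    [0.000, 0.471, 0.824, 1.000, 0.890, 0.854],
    [0.000, 0.437, 0.903, 0.890, 1.000, 0.965],
    [0.000, 0.404, 0.926, 0.854, 0.965, 1.000],
]

# Assumption: illustrative cutoff, not taken from the report.
THRESHOLD = 0.8


def high_correlation_partners(names, values, threshold):
    """Map each column to the other columns with |coefficient| >= threshold."""
    partners = {}
    for i, name in enumerate(names):
        partners[name] = [
            other
            for j, other in enumerate(names)
            if i != j and abs(values[i][j]) >= threshold
        ]
    return partners


partners = high_correlation_partners(columns, matrix, THRESHOLD)

for name, others in partners.items():
    print(name, others)
```

With this cutoff each of the four age-group columns is paired with the other three, which matches the wording of the alerts ("... and 2 other fields"), while 연월 and 시도 have no partners.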

Missing values

[Figure omitted] A simple visualization of nullity by column.
[Figure omitted] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  연월  시도  30대 이하, 40대, 50대, 60대 이상 (digits run together in the source)
0  2020-02  서울  392291167112
1  2020-02  부산  4551506750
2  2020-02  대구  3192098840
3  2020-02  인천  6802299250
4  2020-02  울산  13412
5  2020-02  경기  22561296701352
6  2020-02  강원  2151147740
7  2020-02  충남  30618812167
8  2020-02  전남  1861536041
9  2020-02  경북  83582816

Last rows

Index  연월  시도  30대 이하, 40대, 50대, 60대 이상 (digits run together in the source)
594  2024-04  대구  62493011
595  2024-04  인천  58624316562
596  2024-04  광주  939943568311
597  2024-04  대전  47323715148
598  2024-04  경기  114138115070
599  2024-04  강원  142333212
600  2024-04  충남  890669295137
601  2024-04  전북  23928313869
602  2024-04  전남  6592318557
603  2024-04  경남  1000