Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 604 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 30.8 KiB |
Average record size in memory | 52.2 B |
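The average record size can be cross-checked from the two figures above. This is a minimal sketch, assuming that KiB means 1024 bytes and that the reported total is rounded to one decimal; the variable names are illustrative only.

```python
# Figures copied from the Dataset statistics table.
total_kib = 30.8   # Total size in memory (KiB), as rounded by the report
n_rows = 604       # Number of observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")
```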
Variable types
DateTime | 1 |
---|---|
Categorical | 1 |
Numeric | 4 |
Dataset
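Description | Number of housing subscription winners by region, as published by Cheongyak Home (청약홈), a service of the Korea Real Estate Board (한국부동산원, formerly the Korea Appraisal Board, 한국감정원). Note: data through the previous month is released on the 25th of each month, and the previous month's figures may be revised later. |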
---|---|
Author | 한국부동산원 (Korea Real Estate Board) |
URL | https://www.data.go.kr/data/15110976/fileData.do |
Alerts
30대 이하 is highly correlated overall with 40대 and 2 other fields | High correlation |
---|---|
40대 is highly correlated overall with 30대 이하 and 2 other fields | High correlation |
50대 is highly correlated overall with 30대 이하 and 2 other fields | High correlation |
60대 이상 is highly correlated overall with 30대 이하 and 2 other fields | High correlation |
40대 has 16 (2.6%) zeros | Zeros |
50대 has 27 (4.5%) zeros | Zeros |
60대 이상 has 50 (8.3%) zeros | Zeros |
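The zero percentages in these alerts are each column's zero count divided by the 604 observations. A small sketch of that arithmetic follows; the helper name `pct` is hypothetical and the one-decimal rounding is an assumption based on how the report displays its figures.

```python
n_rows = 604  # Number of observations, from Dataset statistics

# Zero counts copied from the Zeros alerts.
zero_counts = {"40대": 16, "50대": 27, "60대 이상": 50}

def pct(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

zero_pct = {col: pct(n, n_rows) for col, n in zero_counts.items()}
for col, p in zero_pct.items():
    print(f"{col}: {p}%")
```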
Reproduction
Analysis started | 2024-05-25 19:19:48.310932 |
---|---|
Analysis finished | 2024-05-25 19:19:53.646444 |
Duration | 5.34 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
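A report of this kind is typically produced by loading the source file into pandas and passing the frame to `ProfileReport`. The sketch below is not the configuration used for this report (that lives in config.json): the function name `build_report`, the file paths, the `cp949` encoding, and the report title are all assumptions made for illustration.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    Third-party imports live inside the function so the module can be
    loaded even when pandas or ydata-profiling is not installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a CSV whose 연월 column holds year-month values.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["연월"])

    # Default settings; the original run may have used different options.
    report = ProfileReport(df, title="Subscription winners by region")
    report.to_file(html_path)
    return report
```

A call such as `build_report("winners_by_region.csv", "report.html")` would write the HTML file, where both file names are placeholders.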
연월
Date
Distinct | 51 |
---|---|
Distinct (%) | 8.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.8 KiB |
Minimum | 2020-02-01 00:00:00 |
---|---|
Maximum | 2024-04-01 00:00:00 |
Histogram with fixed size bins (bins=50)
시도
Categorical
Distinct | 17 |
---|---|
Distinct (%) | 2.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.8 KiB |
Length
Max length | 2 |
---|---|
Median length | 2 |
Mean length | 2 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 서울 |
---|---|
2nd row | 부산 |
3rd row | 대구 |
4th row | 인천 |
5th row | 울산 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
경기 | 50 | 8.3% |
부산 | 44 | 7.3% |
인천 | 44 | 7.3% |
서울 | 43 | 7.1% |
충남 | 43 | 7.1% |
대구 | 39 | 6.5% |
경남 | 39 | 6.5% |
경북 | 37 | 6.1% |
강원 | 36 | 6.0% |
전북 | 36 | 6.0% |
Other values (7) | 193 | 32.0% |
Length
Histogram of lengths of the category
Value | Count | Frequency (%) |
---|---|---|
경기 | 50 | 8.3% |
인천 | 44 | 7.3% |
부산 | 44 | 7.3% |
서울 | 43 | 7.1% |
충남 | 43 | 7.1% |
대구 | 39 | 6.5% |
경남 | 39 | 6.5% |
경북 | 37 | 6.1% |
전남 | 36 | 6.0% |
강원 | 36 | 6.0% |
Other values (7) | 193 | 32.0% |
30대 이하
Real number (ℝ)
HIGH CORRELATION
Distinct | 466 |
---|---|
Distinct (%) | 77.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 638.52152 |
Minimum | 0 |
---|---|
Maximum | 6103 |
Zeros | 4 |
Zeros (%) | 0.7% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 5.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 8 |
Q1 | 93.5 |
median | 322.5 |
Q3 | 795.25 |
95-th percentile | 2353.6 |
Maximum | 6103 |
Range | 6103 |
Interquartile range (IQR) | 701.75 |
Descriptive statistics
Standard deviation | 868.08345 |
---|---|
Coefficient of variation (CV) | 1.3595211 |
Kurtosis | 8.9292225 |
Mean | 638.52152 |
Median Absolute Deviation (MAD) | 269 |
Skewness | 2.6762489 |
Sum | 385667 |
Variance | 753568.88 |
Monotonicity | Not monotonic |
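Several of the figures above are derived from one another, which gives a quick consistency check on the table. The sketch below recomputes those relations for 30대 이하 from the reported values; the tolerance is an assumption chosen to absorb the report's rounding.

```python
import math

# Figures copied from the 30대 이하 statistics tables.
n = 604
total = 385667
mean = 638.52152
std = 868.08345
variance = 753568.88
cv = 1.3595211
q1, q3, iqr = 93.5, 795.25, 701.75
minimum, maximum, value_range = 0, 6103, 6103

# Each entry is True when the reported figures agree with the relation.
checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr),
    "range = max - min": maximum - minimum == value_range,
}

for relation, ok in checks.items():
    print(f"{relation}: {'ok' if ok else 'mismatch'}")
```

The same checks apply to the other three age columns once their figures are substituted.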
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%) |
---|---|---|
1 | 5 | 0.8% |
8 | 5 | 0.8% |
83 | 4 | 0.7% |
58 | 4 | 0.7% |
0 | 4 | 0.7% |
97 | 4 | 0.7% |
7 | 4 | 0.7% |
31 | 4 | 0.7% |
40 | 3 | 0.5% |
6 | 3 | 0.5% |
Other values (456) | 564 | 93.4% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 4 | 0.7% |
1 | 5 | 0.8% |
2 | 3 | 0.5% |
3 | 2 | 0.3% |
4 | 3 | 0.5% |
5 | 3 | 0.5% |
6 | 3 | 0.5% |
7 | 4 | 0.7% |
8 | 5 | 0.8% |
9 | 3 | 0.5% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
6103 | 1 | 0.2% |
5312 | 1 | 0.2% |
5248 | 1 | 0.2% |
4900 | 1 | 0.2% |
4562 | 1 | 0.2% |
4547 | 1 | 0.2% |
4541 | 1 | 0.2% |
4052 | 1 | 0.2% |
3990 | 1 | 0.2% |
3810 | 1 | 0.2% |
40대
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 369 |
---|---|
Distinct (%) | 61.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 321.57616 |
Minimum | 0 |
---|---|
Maximum | 2763 |
Zeros | 16 |
Zeros (%) | 2.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 5.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 2 |
Q1 | 45.75 |
median | 179 |
Q3 | 401.25 |
95-th percentile | 1227.15 |
Maximum | 2763 |
Range | 2763 |
Interquartile range (IQR) | 355.5 |
Descriptive statistics
Standard deviation | 429.54753 |
---|---|
Coefficient of variation (CV) | 1.3357568 |
Kurtosis | 8.0336197 |
Mean | 321.57616 |
Median Absolute Deviation (MAD) | 151.5 |
Skewness | 2.5560734 |
Sum | 194232 |
Variance | 184511.08 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%) |
---|---|---|
0 | 16 | 2.6% |
1 | 12 | 2.0% |
2 | 10 | 1.7% |
7 | 7 | 1.2% |
4 | 5 | 0.8% |
15 | 5 | 0.8% |
3 | 5 | 0.8% |
48 | 5 | 0.8% |
49 | 4 | 0.7% |
28 | 4 | 0.7% |
Other values (359) | 531 | 87.9% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 16 | 2.6% |
1 | 12 | 2.0% |
2 | 10 | 1.7% |
3 | 5 | 0.8% |
4 | 5 | 0.8% |
5 | 4 | 0.7% |
6 | 3 | 0.5% |
7 | 7 | 1.2% |
8 | 3 | 0.5% |
9 | 3 | 0.5% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
2763 | 1 | 0.2% |
2744 | 1 | 0.2% |
2581 | 1 | 0.2% |
2452 | 1 | 0.2% |
2441 | 1 | 0.2% |
2288 | 1 | 0.2% |
1898 | 1 | 0.2% |
1872 | 1 | 0.2% |
1870 | 1 | 0.2% |
1729 | 1 | 0.2% |
50대
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 295 |
---|---|
Distinct (%) | 48.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 159.59437 |
Minimum | 0 |
---|---|
Maximum | 1429 |
Zeros | 27 |
Zeros (%) | 4.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 5.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 1 |
Q1 | 19 |
median | 84 |
Q3 | 200.25 |
95-th percentile | 588.65 |
Maximum | 1429 |
Range | 1429 |
Interquartile range (IQR) | 181.25 |
Descriptive statistics
Standard deviation | 212.68622 |
---|---|
Coefficient of variation (CV) | 1.3326674 |
Kurtosis | 7.4300612 |
Mean | 159.59437 |
Median Absolute Deviation (MAD) | 74 |
Skewness | 2.4735706 |
Sum | 96395 |
Variance | 45235.429 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%) |
---|---|---|
0 | 27 | 4.5% |
1 | 18 | 3.0% |
2 | 12 | 2.0% |
3 | 12 | 2.0% |
54 | 10 | 1.7% |
10 | 9 | 1.5% |
12 | 9 | 1.5% |
19 | 8 | 1.3% |
16 | 8 | 1.3% |
5 | 7 | 1.2% |
Other values (285) | 484 | 80.1% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 27 | 4.5% |
1 | 18 | 3.0% |
2 | 12 | 2.0% |
3 | 12 | 2.0% |
4 | 6 | 1.0% |
5 | 7 | 1.2% |
6 | 4 | 0.7% |
7 | 4 | 0.7% |
8 | 3 | 0.5% |
9 | 3 | 0.5% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
1429 | 1 | 0.2% |
1292 | 1 | 0.2% |
1193 | 1 | 0.2% |
1131 | 1 | 0.2% |
1077 | 1 | 0.2% |
1056 | 1 | 0.2% |
1026 | 1 | 0.2% |
1009 | 1 | 0.2% |
942 | 1 | 0.2% |
915 | 1 | 0.2% |
60대 이상
Real number (ℝ)
HIGH CORRELATION, ZEROS
Distinct | 218 |
---|---|
Distinct (%) | 36.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 82.639073 |
Minimum | 0 |
---|---|
Maximum | 709 |
Zeros | 50 |
Zeros (%) | 8.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 5.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 10 |
median | 44 |
Q3 | 104.25 |
95-th percentile | 314.7 |
Maximum | 709 |
Range | 709 |
Interquartile range (IQR) | 94.25 |
Descriptive statistics
Standard deviation | 112.39816 |
---|---|
Coefficient of variation (CV) | 1.3601091 |
Kurtosis | 7.8390848 |
Mean | 82.639073 |
Median Absolute Deviation (MAD) | 39 |
Skewness | 2.5220598 |
Sum | 49914 |
Variance | 12633.345 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%) |
---|---|---|
0 | 50 | 8.3% |
1 | 20 | 3.3% |
2 | 16 | 2.6% |
12 | 15 | 2.5% |
16 | 12 | 2.0% |
5 | 12 | 2.0% |
3 | 11 | 1.8% |
4 | 9 | 1.5% |
10 | 8 | 1.3% |
34 | 8 | 1.3% |
Other values (208) | 443 | 73.3% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
0 | 50 | 8.3% |
1 | 20 | 3.3% |
2 | 16 | 2.6% |
3 | 11 | 1.8% |
4 | 9 | 1.5% |
5 | 12 | 2.0% |
6 | 8 | 1.3% |
7 | 7 | 1.2% |
8 | 7 | 1.2% |
9 | 7 | 1.2% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
709 | 1 | 0.2% |
675 | 1 | 0.2% |
674 | 1 | 0.2% |
669 | 1 | 0.2% |
641 | 1 | 0.2% |
610 | 1 | 0.2% |
503 | 1 | 0.2% |
501 | 1 | 0.2% |
495 | 1 | 0.2% |
485 | 1 | 0.2% |
Correlations
Variable | 연월 | 시도 | 30대 이하 | 40대 | 50대 | 60대 이상 |
---|---|---|---|---|---|---|
연월 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
시도 | 0.000 | 1.000 | 0.475 | 0.471 | 0.437 | 0.404 |
30대 이하 | 0.000 | 0.475 | 1.000 | 0.824 | 0.903 | 0.926 |
40대 | 0.000 | 0.471 | 0.824 | 1.000 | 0.890 | 0.854 |
50대 | 0.000 | 0.437 | 0.903 | 0.890 | 1.000 | 0.965 |
60대 이상 | 0.000 | 0.404 | 0.926 | 0.854 | 0.965 | 1.000 |
Variable | 30대 이하 | 40대 | 50대 | 60대 이상 | 시도 |
---|---|---|---|---|---|
30대 이하 | 1.000 | 0.950 | 0.922 | 0.930 | 0.205 |
40대 | 0.950 | 1.000 | 0.979 | 0.972 | 0.207 |
50대 | 0.922 | 0.979 | 1.000 | 0.975 | 0.185 |
60대 이상 | 0.930 | 0.972 | 0.975 | 1.000 | 0.168 |
시도 | 0.205 | 0.207 | 0.185 | 0.168 | 1.000 |
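The report does not say which coefficient each matrix uses, so the values above cannot be reproduced from the text alone. As an illustration of how one pairwise entry is computed, the sketch below applies the Pearson coefficient (an assumption) to the ten first sample rows of the report for 30대 이하 and 40대. With only ten rows the result will not match either matrix; the function and variable names are hypothetical.

```python
import math

# Ten first sample rows of the report (2020-02), two of the four age columns.
age_30s_or_younger = [392, 455, 319, 680, 13, 2256, 215, 306, 186, 83]  # 30대 이하
age_40s = [291, 150, 209, 229, 4, 1296, 114, 188, 153, 58]              # 40대

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

r = pearson(age_30s_or_younger, age_40s)
print(f"Pearson r on ten sample rows: {r:.3f}")
```

The single large 경기 row dominates this small sample, which is one reason a rank-based coefficient is often preferred for skewed counts like these.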
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
Row | 연월 | 시도 | 30대 이하 | 40대 | 50대 | 60대 이상 |
---|---|---|---|---|---|---|
0 | 2020-02 | 서울 | 392 | 291 | 167 | 112 |
1 | 2020-02 | 부산 | 455 | 150 | 67 | 50 |
2 | 2020-02 | 대구 | 319 | 209 | 88 | 40 |
3 | 2020-02 | 인천 | 680 | 229 | 92 | 50 |
4 | 2020-02 | 울산 | 13 | 4 | 1 | 2 |
5 | 2020-02 | 경기 | 2256 | 1296 | 701 | 352 |
6 | 2020-02 | 강원 | 215 | 114 | 77 | 40 |
7 | 2020-02 | 충남 | 306 | 188 | 121 | 67 |
8 | 2020-02 | 전남 | 186 | 153 | 60 | 41 |
9 | 2020-02 | 경북 | 83 | 58 | 28 | 16 |
Last rows
Row | 연월 | 시도 | 30대 이하 | 40대 | 50대 | 60대 이상 |
---|---|---|---|---|---|---|
594 | 2024-04 | 대구 | 62 | 49 | 30 | 11 |
595 | 2024-04 | 인천 | 586 | 243 | 165 | 62 |
596 | 2024-04 | 광주 | 939 | 943 | 568 | 311 |
597 | 2024-04 | 대전 | 473 | 237 | 151 | 48 |
598 | 2024-04 | 경기 | 1141 | 381 | 150 | 70 |
599 | 2024-04 | 강원 | 142 | 33 | 32 | 12 |
600 | 2024-04 | 충남 | 890 | 669 | 295 | 137 |
601 | 2024-04 | 전북 | 239 | 283 | 138 | 69 |
602 | 2024-04 | 전남 | 659 | 231 | 85 | 57 |
603 | 2024-04 | 경남 | 1 | 0 | 0 | 0 |