Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 10000 |
Missing cells | 8189 |
Missing cells (%) | 10.2% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 664.2 KiB |
Average record size in memory | 68.0 B |
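The derived figures in this table follow from the raw counts. The short check below is an illustrative sketch that recomputes them from the values reported above, assuming "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes.

```python
# Figures copied from the Dataset statistics table.
n_variables = 8
n_observations = 10_000
missing_cells = 8_189
total_size_kib = 664.2

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```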
Variable types
Numeric | 4 |
---|---|
Categorical | 4 |
Dataset
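Description | Individual housing price data for Changwon-si, Gyeongsangnam-do, as of 2015. It publishes officially assessed housing prices by lot number for each eup, myeon, and dong. |
---|---|
Author | Gyeongsangnam-do (경상남도) |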
URL | https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15046125 |
읍면동 has a high cardinality: 158 distinct values | High cardinality |
리 has a high cardinality: 108 distinct values | High cardinality |
리 has 8189 (81.9%) missing values | Missing |
df_index has unique values | Unique |
부번 has 1136 (11.4%) zeros | Zeros |
Reproduction
Analysis started | 2022-08-11 19:32:18.394387 |
---|---|
Analysis finished | 2022-08-11 19:32:23.028554 |
Duration | 4.63 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
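A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the function name `build_report`, the CSV path, and the report title are hypothetical, and the settings stored in config.json are not reproduced here. It relies only on `ProfileReport` and its `to_file` method from pandas-profiling.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write the HTML report to output_path."""
    # Imported inside the function so the sketch can be loaded even when
    # pandas and pandas-profiling are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # hypothetical input file
    profile = ProfileReport(df, title="Dataset profile")  # hypothetical title
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("housing.csv")` with a real file path would write `report.html` to the working directory.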
df_index
Real number (ℝ≥0)
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 40750.2057 |
Minimum | 31 |
---|---|
Maximum | 81008 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 78.2 KiB |
Quantile statistics
Minimum | 31 |
---|---|
5-th percentile | 4192.6 |
Q1 | 20712.75 |
median | 40658 |
Q3 | 61242.25 |
95-th percentile | 77207.75 |
Maximum | 81008 |
Range | 80977 |
Interquartile range (IQR) | 40529.5 |
Descriptive statistics
Standard deviation | 23456.85859 |
---|---|
Coefficient of variation (CV) | 0.575625526 |
Kurtosis | -1.204165096 |
Mean | 40750.2057 |
Median Absolute Deviation (MAD) | 20317.5 |
Skewness | -0.00903744776 |
Sum | 407502057 |
Variance | 550224214.9 |
Monotonicity | Not monotonic |
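Several of the statistics above are simple functions of the others. The sketch below recomputes the range, interquartile range, coefficient of variation, and variance from the reported values, assuming the usual definitions (CV as standard deviation divided by mean, variance as the square of the standard deviation).

```python
# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 31, 81008
q1, q3 = 20712.75, 61242.25
mean = 40750.2057
std = 23456.85859

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance, 1))
```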
Value | Count | Frequency (%) |
75973 | 1 | < 0.1% |
23376 | 1 | < 0.1% |
6997 | 1 | < 0.1% |
53476 | 1 | < 0.1% |
28602 | 1 | < 0.1% |
76034 | 1 | < 0.1% |
68057 | 1 | < 0.1% |
6261 | 1 | < 0.1% |
31441 | 1 | < 0.1% |
54825 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
31 | 1 | < 0.1% |
42 | 1 | < 0.1% |
48 | 1 | < 0.1% |
49 | 1 | < 0.1% |
65 | 1 | < 0.1% |
77 | 1 | < 0.1% |
90 | 1 | < 0.1% |
98 | 1 | < 0.1% |
127 | 1 | < 0.1% |
134 | 1 | < 0.1% |
Value | Count | Frequency (%) |
81008 | 1 | < 0.1% |
81007 | 1 | < 0.1% |
80996 | 1 | < 0.1% |
80995 | 1 | < 0.1% |
80984 | 1 | < 0.1% |
80979 | 1 | < 0.1% |
80975 | 1 | < 0.1% |
80974 | 1 | < 0.1% |
80973 | 1 | < 0.1% |
80971 | 1 | < 0.1% |
시군구
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.2 KiB |
경상남도 창원시 마산합포구 | 2713 |
---|---|
경상남도 창원시 의창구 | 2467 |
경상남도 창원시 마산회원구 | 2078 |
경상남도 창원시 진해구 | 2046 |
경상남도 창원시 성산구 | 696 |
Length
Max length | 14 |
---|---|
Median length | 12 |
Mean length | 12.9582 |
Min length | 12 |
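The length statistics can be cross-checked against the value counts reported in this variable's Common Values table. The sketch below assumes that length is measured in characters, including the two spaces in each value.

```python
from statistics import median

# Value counts copied from the Common Values table of this variable.
counts = {
    "경상남도 창원시 마산합포구": 2713,
    "경상남도 창원시 의창구": 2467,
    "경상남도 창원시 마산회원구": 2078,
    "경상남도 창원시 진해구": 2046,
    "경상남도 창원시 성산구": 696,
}

n_rows = sum(counts.values())
# Character length of each value, weighted by how often it occurs.
mean_length = sum(len(value) * n for value, n in counts.items()) / n_rows
all_lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("Max length:", max(all_lengths))
print("Median length:", median(all_lengths))
print("Mean length:", mean_length)
print("Min length:", min(all_lengths))
```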
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 경상남도 창원시 진해구 |
---|---|
2nd row | 경상남도 창원시 마산회원구 |
3rd row | 경상남도 창원시 마산회원구 |
4th row | 경상남도 창원시 진해구 |
5th row | 경상남도 창원시 의창구 |
Common Values
Value | Count | Frequency (%) |
경상남도 창원시 마산합포구 | 2713 | 27.1% |
경상남도 창원시 의창구 | 2467 | 24.7% |
경상남도 창원시 마산회원구 | 2078 | 20.8% |
경상남도 창원시 진해구 | 2046 | 20.5% |
경상남도 창원시 성산구 | 696 | 7.0% |
Length
Category Frequency Plot
Value | Count | Frequency (%) |
경상남도 | 10000 | 33.3% |
창원시 | 10000 | 33.3% |
마산합포구 | 2713 | 9.0% |
의창구 | 2467 | 8.2% |
마산회원구 | 2078 | 6.9% |
진해구 | 2046 | 6.8% |
성산구 | 696 | 2.3% |
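The percentages in this last table are consistent with word-level frequencies: each value is split on whitespace, and each word's count is divided by the total number of words rather than the number of rows. The sketch below reproduces that reading; treating the table as a word count is an inference from the numbers, not something the report states.

```python
from collections import Counter

# Value counts copied from the Common Values table of this variable.
counts = {
    "경상남도 창원시 마산합포구": 2713,
    "경상남도 창원시 의창구": 2467,
    "경상남도 창원시 마산회원구": 2078,
    "경상남도 창원시 진해구": 2046,
    "경상남도 창원시 성산구": 696,
}

# Count every whitespace-separated word, weighted by its row count.
word_counts = Counter()
for value, n in counts.items():
    for word in value.split():
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(word, n, f"{100 * n / total_words:.1f}%")
```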
읍면동
Categorical
Distinct | 158 |
---|---|
Distinct (%) | 1.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.2 KiB |
회원동 | 493 |
---|---|
합성동 | 371 |
명서동 | 363 |
산호동 | 339 |
경화동 | 329 |
Other values (153) | 8105 |
Length
Max length | 5 |
---|---|
Median length | 3 |
Mean length | 2.9465 |
Min length | 2 |
Unique
Unique | 11 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 자은동 |
---|---|
2nd row | 회원동 |
3rd row | 구암동 |
4th row | 제황산동 |
5th row | 용호동 |
Common Values
Value | Count | Frequency (%) |
회원동 | 493 | 4.9% |
합성동 | 371 | 3.7% |
명서동 | 363 | 3.6% |
산호동 | 339 | 3.4% |
경화동 | 329 | 3.3% |
여좌동 | 325 | 3.2% |
구암동 | 287 | 2.9% |
북면 | 274 | 2.7% |
봉곡동 | 271 | 2.7% |
동읍 | 266 | 2.7% |
Other values (148) | 6682 | 66.8% |
리
Categorical
Distinct | 108 |
---|---|
Distinct (%) | 6.0% |
Missing | 8189 |
Missing (%) | 81.9% |
Memory size | 78.2 KiB |
진동리 | 69 |
---|---|
가술리 | 66 |
중리 | 59 |
삼계리 | 55 |
신촌리 | 45 |
Other values (103) | 1517 |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 2.945886251 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | 구복리 |
---|---|
2nd row | 평성리 |
3rd row | 우암리 |
4th row | 화천리 |
5th row | 대산리 |
Common Values
Value | Count | Frequency (%) |
진동리 | 69 | 0.7% |
가술리 | 66 | 0.7% |
중리 | 59 | 0.6% |
삼계리 | 55 | 0.5% |
신촌리 | 45 | 0.4% |
갈전리 | 43 | 0.4% |
오서리 | 40 | 0.4% |
심리 | 39 | 0.4% |
수정리 | 36 | 0.4% |
내포리 | 35 | 0.4% |
Other values (98) | 1324 | 13.2% |
(Missing) | 8189 | 81.9% |
Length
Value | Count | Frequency (%) |
진동리 | 69 | 3.8% |
가술리 | 66 | 3.6% |
중리 | 59 | 3.3% |
삼계리 | 55 | 3.0% |
신촌리 | 45 | 2.5% |
갈전리 | 43 | 2.4% |
오서리 | 40 | 2.2% |
심리 | 39 | 2.2% |
수정리 | 36 | 2.0% |
내포리 | 35 | 1.9% |
Other values (98) | 1324 | 73.1% |
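The two frequency tables for this variable use different denominators: Common Values divides by all 10000 rows, while the second table divides by the non-missing rows only. The sketch below shows both for the most frequent value, 진동리, and checks that Distinct (%) also uses the non-missing count.

```python
n_rows = 10_000
n_missing = 8_189
n_present = n_rows - n_missing  # rows where the variable has a value

count = 69        # occurrences of the most frequent value, 진동리
n_distinct = 108

share_of_all_rows = 100 * count / n_rows        # Common Values table
share_of_present = 100 * count / n_present      # second frequency table
distinct_pct = 100 * n_distinct / n_present     # Distinct (%)

print(f"{share_of_all_rows:.1f}% of all rows")
print(f"{share_of_present:.1f}% of non-missing rows")
print(f"Distinct (%): {distinct_pct:.1f}%")
```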
토지구분(1 일반, 2 산, 5 블록노트)
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.2 KiB |
1 | 9991 |
---|---|
2 | 9 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 9991 | 99.9% |
2 | 9 | 0.1% |
본번
Real number (ℝ≥0)
Distinct | 1171 |
---|---|
Distinct (%) | 11.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 299.0439 |
Minimum | 1 |
---|---|
Maximum | 1773 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 8 |
Q1 | 55 |
median | 155 |
Q3 | 451 |
95-th percentile | 1084.1 |
Maximum | 1773 |
Range | 1772 |
Interquartile range (IQR) | 396 |
Descriptive statistics
Standard deviation | 330.32938 |
---|---|
Coefficient of variation (CV) | 1.104618352 |
Kurtosis | 1.639337542 |
Mean | 299.0439 |
Median Absolute Deviation (MAD) | 130 |
Skewness | 1.483659625 |
Sum | 2990439 |
Variance | 109117.4993 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
12 | 94 | 0.9% |
1 | 86 | 0.9% |
3 | 75 | 0.8% |
28 | 73 | 0.7% |
14 | 66 | 0.7% |
4 | 64 | 0.6% |
5 | 63 | 0.6% |
26 | 61 | 0.6% |
8 | 60 | 0.6% |
6 | 57 | 0.6% |
Other values (1161) | 9301 | 93.0% |
Value | Count | Frequency (%) |
1 | 86 | 0.9% |
2 | 55 | 0.5% |
3 | 75 | 0.8% |
4 | 64 | 0.6% |
5 | 63 | 0.6% |
6 | 57 | 0.6% |
7 | 51 | 0.5% |
8 | 60 | 0.6% |
9 | 38 | 0.4% |
10 | 41 | 0.4% |
Value | Count | Frequency (%) |
1773 | 1 | < 0.1% |
1771 | 1 | < 0.1% |
1769 | 1 | < 0.1% |
1766 | 2 | < 0.1% |
1763 | 1 | < 0.1% |
1762 | 1 | < 0.1% |
1760 | 1 | < 0.1% |
1759 | 2 | < 0.1% |
1643 | 1 | < 0.1% |
1640 | 2 | < 0.1% |
부번
Real number (ℝ≥0)
Distinct | 278 |
---|---|
Distinct (%) | 2.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 19.3043 |
Minimum | 0 |
---|---|
Maximum | 624 |
Zeros | 1136 |
Zeros (%) | 11.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 2 |
median | 8 |
Q3 | 18 |
95-th percentile | 62 |
Maximum | 624 |
Range | 624 |
Interquartile range (IQR) | 16 |
Descriptive statistics
Standard deviation | 45.32950229 |
---|---|
Coefficient of variation (CV) | 2.348155711 |
Kurtosis | 63.3146324 |
Mean | 19.3043 |
Median Absolute Deviation (MAD) | 7 |
Skewness | 7.066784556 |
Sum | 193043 |
Variance | 2054.763778 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 1136 | 11.4% |
1 | 801 | 8.0% |
2 | 626 | 6.3% |
3 | 551 | 5.5% |
4 | 465 | 4.7% |
6 | 432 | 4.3% |
5 | 431 | 4.3% |
7 | 377 | 3.8% |
8 | 337 | 3.4% |
10 | 305 | 3.0% |
Other values (268) | 4539 | 45.4% |
Value | Count | Frequency (%) |
0 | 1136 | 11.4% |
1 | 801 | 8.0% |
2 | 626 | 6.3% |
3 | 551 | 5.5% |
4 | 465 | 4.7% |
5 | 431 | 4.3% |
6 | 432 | 4.3% |
7 | 377 | 3.8% |
8 | 337 | 3.4% |
9 | 300 | 3.0% |
Value | Count | Frequency (%) |
624 | 1 | < 0.1% |
603 | 1 | < 0.1% |
601 | 1 | < 0.1% |
600 | 1 | < 0.1% |
593 | 1 | < 0.1% |
568 | 1 | < 0.1% |
567 | 1 | < 0.1% |
566 | 1 | < 0.1% |
564 | 1 | < 0.1% |
556 | 1 | < 0.1% |
주택공시가격
Real number (ℝ≥0)
Distinct | 1379 |
---|---|
Distinct (%) | 13.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 115900613.3 |
Minimum | 582000 |
---|---|
Maximum | 5540000000 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 582000 |
---|---|
5-th percentile | 22000000 |
Q1 | 51600000 |
median | 81200000 |
Q3 | 161000000 |
95-th percentile | 312000000 |
Maximum | 5540000000 |
Range | 5539418000 |
Interquartile range (IQR) | 109400000 |
Descriptive statistics
Standard deviation | 108640030.4 |
---|---|
Coefficient of variation (CV) | 0.9373550954 |
Kurtosis | 622.4290489 |
Mean | 115900613.3 |
Median Absolute Deviation (MAD) | 38900000 |
Skewness | 13.5346705 |
Sum | 1.159006133 × 10¹² |
Variance | 1.180265621 × 10¹⁶ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
101000000 | 53 | 0.5% |
102000000 | 49 | 0.5% |
100000000 | 46 | 0.5% |
103000000 | 42 | 0.4% |
109000000 | 38 | 0.4% |
113000000 | 36 | 0.4% |
114000000 | 35 | 0.4% |
105000000 | 35 | 0.4% |
112000000 | 35 | 0.4% |
111000000 | 33 | 0.3% |
Other values (1369) | 9598 | 96.0% |
Value | Count | Frequency (%) |
582000 | 1 | < 0.1% |
646000 | 1 | < 0.1% |
749000 | 1 | < 0.1% |
886000 | 1 | < 0.1% |
942000 | 1 | < 0.1% |
948000 | 1 | < 0.1% |
1060000 | 1 | < 0.1% |
1070000 | 1 | < 0.1% |
1170000 | 1 | < 0.1% |
1210000 | 2 | < 0.1% |
Value | Count | Frequency (%) |
5540000000 | 1 | < 0.1% |
710000000 | 1 | < 0.1% |
697000000 | 1 | < 0.1% |
693000000 | 1 | < 0.1% |
684000000 | 1 | < 0.1% |
676000000 | 1 | < 0.1% |
665000000 | 1 | < 0.1% |
659000000 | 1 | < 0.1% |
650000000 | 1 | < 0.1% |
645000000 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
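The calculation described above can be sketched in a few lines. The function name `pearson_r` and the sample data are illustrative assumptions and are not taken from the dataset; population (divide-by-n) moments are used, which gives the same ratio as sample moments.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Illustrative data only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(xs, ys)
# Shifting a variable, or rescaling it by a positive factor, leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], ys)
print(round(r, 4), round(r_shifted, 4))
```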
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
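As a sketch of this definition, the code below ranks each variable (giving tied values their average rank, a common convention assumed here) and then applies the covariance-over-standard-deviations formula to the ranks. The function names and the sample data are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this example ρ is 1 up to floating-point rounding, while Pearson's r stays below 1 because the relationship is not linear.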
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
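The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. Library implementations commonly apply a tie adjustment (tau-b), so results can differ when the data contain ties. The function name and the sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in either variable count as neither.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Illustrative data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```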
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| df_index | 시군구 | 읍면동 | 리 | 토지구분(1 일반, 2 산, 5 블록노트) | 본번 | 부번 | 주택공시가격 |
---|---|---|---|---|---|---|---|---|
0 | 75973 | 경상남도 창원시 진해구 | 자은동 | <NA> | 1 | 888 | 5 | 124000000 |
1 | 61646 | 경상남도 창원시 마산회원구 | 회원동 | <NA> | 1 | 480 | 50 | 41100000 |
2 | 48730 | 경상남도 창원시 마산회원구 | 구암동 | <NA> | 1 | 90 | 34 | 66300000 |
3 | 65356 | 경상남도 창원시 진해구 | 제황산동 | <NA> | 1 | 27 | 14 | 47900000 |
4 | 12301 | 경상남도 창원시 의창구 | 용호동 | <NA> | 1 | 36 | 3 | 270000000 |
5 | 42517 | 경상남도 창원시 마산합포구 | 구산면 | 구복리 | 1 | 194 | 34 | 15100000 |
6 | 63102 | 경상남도 창원시 마산회원구 | 내서읍 | 평성리 | 1 | 266 | 2 | 48500000 |
7 | 24838 | 경상남도 창원시 성산구 | 사파동 | <NA> | 1 | 87 | 1 | 213000000 |
8 | 26425 | 경상남도 창원시 마산합포구 | 교방동 | <NA> | 1 | 137 | 2 | 122000000 |
9 | 52503 | 경상남도 창원시 마산회원구 | 석전동 | <NA> | 1 | 266 | 7 | 34900000 |
Last rows
| df_index | 시군구 | 읍면동 | 리 | 토지구분(1 일반, 2 산, 5 블록노트) | 본번 | 부번 | 주택공시가격 |
---|---|---|---|---|---|---|---|---|
9990 | 39007 | 경상남도 창원시 마산합포구 | 자산동 | <NA> | 1 | 323 | 33 | 100000000 |
9991 | 60831 | 경상남도 창원시 마산회원구 | 회원동 | <NA> | 1 | 317 | 20 | 56000000 |
9992 | 31634 | 경상남도 창원시 마산합포구 | 산호동 | <NA> | 1 | 514 | 10 | 55200000 |
9993 | 67768 | 경상남도 창원시 진해구 | 여좌동 | <NA> | 1 | 99 | 63 | 47000000 |
9994 | 24331 | 경상남도 창원시 성산구 | 사파동 | <NA> | 1 | 35 | 18 | 289000000 |
9995 | 66982 | 경상남도 창원시 진해구 | 인사동 | <NA> | 1 | 6 | 30 | 8440000 |
9996 | 66003 | 경상남도 창원시 진해구 | 안곡동 | <NA> | 1 | 4 | 22 | 11900000 |
9997 | 77607 | 경상남도 창원시 진해구 | 제덕동 | <NA> | 1 | 150 | 0 | 289000000 |
9998 | 31625 | 경상남도 창원시 마산합포구 | 산호동 | <NA> | 1 | 510 | 55 | 17600000 |
9999 | 43190 | 경상남도 창원시 마산합포구 | 진동면 | 고현리 | 1 | 742 | 0 | 7850000 |