Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 517.6 KiB
Average record size in memory: 53.0 B
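
The average record size can be checked from the two figures reported above. The following is a minimal sketch, assuming the report derives it as total memory divided by the number of observations; the variable names are illustrative only.

```python
# Values copied from the "Dataset statistics" table.
total_kib = 517.6      # total size in memory, in KiB
n_rows = 10_000        # number of observations

# 1 KiB = 1024 bytes; divide by the row count to get bytes per record.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_rows

print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 53.0 B shown in the table.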

Variable types

Numeric: 4
Categorical: 1

Dataset

Description: Individual officially assessed land prices (개별공시지가) for Miryang City
Author: Miryang City, Gyeongsangnam-do (경상남도 밀양시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15002417

Alerts

Imbalance: 구분 is highly imbalanced (64.0%)
Zeros: 부번 has 2661 (26.6%) zeros
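
The 64.0% imbalance figure can be reproduced from the category counts of 구분 (9316 and 684). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum, log2 of the number of classes. That formula is an assumption that happens to match the reported value; the report itself does not state how the score is computed, and the function name `imbalance_score` is hypothetical.

```python
import math

# Category counts for 구분, copied from the report.
counts = {"1": 9316, "2": 684}


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. (Assumed definition, not taken from the report.)
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")
```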

Reproduction

Analysis started: 2023-12-10 23:40:17.858541
Analysis finished: 2023-12-10 23:40:19.665906
Duration: 1.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
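
The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-12-10 23:40:17.858541")
finished = datetime.fromisoformat("2023-12-10 23:40:19.665906")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} s")
```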

Variables

지명코드명 (place-name code)
Real number (ℝ)

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21760.855
Minimum: 10100
Maximum: 31029
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10100
5-th percentile: 10200
Q1: 10800
median: 25028
Q3: 25323
95-th percentile: 31025
Maximum: 31029
Range: 20929
Interquartile range (IQR): 14523

Descriptive statistics

Standard deviation: 7376.8585
Coefficient of variation (CV): 0.33899672
Kurtosis: -1.0600878
Mean: 21760.855
Median Absolute Deviation (MAD): 296
Skewness: -0.71185241
Sum: 2.1760855 × 10^8
Variance: 54418042
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=38)]
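
Several of the statistics above are derived from others, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for 지명코드명 from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are rounded as printed in the report, so the recomputed values agree only up to that rounding.

```python
# Values copied from the report for 지명코드명.
n = 10_000
mean, std = 21760.855, 7376.8585
minimum, maximum = 10100, 31029
q1, q3 = 10800, 25323

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum is the mean times the number of rows

print(value_range, iqr)
print(f"CV={cv:.8f} variance={variance:.0f} sum={total:.7e}")
```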
Common values
Value              Count  Frequency (%)
10200              645    6.5%
10400              590    5.9%
25321              550    5.5%
10300              436    4.4%
25021              425    4.2%
10800              410    4.1%
25023              407    4.1%
25323              393    3.9%
25029              383    3.8%
25322              357    3.6%
Other values (28)  5404   54.0%

Minimum 10 values
Value  Count  Frequency (%)
10100  228    2.3%
10200  645    6.5%
10300  436    4.4%
10400  590    5.9%
10500  92     0.9%
10600  282    2.8%
10700  140    1.4%
10800  410    4.1%
25021  425    4.2%
25022  227    2.3%

Maximum 10 values
Value  Count  Frequency (%)
31029  31     0.3%
31028  182    1.8%
31027  156    1.6%
31026  76     0.8%
31025  135    1.4%
31024  172    1.7%
31023  118    1.2%
31022  257    2.6%
31021  190    1.9%
25328  119    1.2%

구분 (category)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      9316   93.2%
2      684    6.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure omitted]

본번 (main lot number)
Real number (ℝ)

Distinct: 1763
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 570.7276
Minimum: 1
Maximum: 3118
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35
Q1: 213
median: 470.5
Q3: 771
95-th percentile: 1518.05
Maximum: 3118
Range: 3117
Interquartile range (IQR): 558

Descriptive statistics

Standard deviation: 477.80559
Coefficient of variation (CV): 0.83718676
Kurtosis: 4.0667054
Mean: 570.7276
Median Absolute Deviation (MAD): 276.5
Skewness: 1.6432182
Sum: 5707276
Variance: 228298.18
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value                Count  Frequency (%)
431                  54     0.5%
728                  48     0.5%
184                  34     0.3%
156                  30     0.3%
17                   29     0.3%
111                  28     0.3%
709                  27     0.3%
15                   27     0.3%
693                  26     0.3%
575                  25     0.2%
Other values (1753)  9672   96.7%

Minimum 10 values
Value  Count  Frequency (%)
1      24     0.2%
2      12     0.1%
3      17     0.2%
4      19     0.2%
5      8      0.1%
6      7      0.1%
7      13     0.1%
8      15     0.1%
9      14     0.1%
10     18     0.2%

Maximum 10 values
Value  Count  Frequency (%)
3118   1      < 0.1%
3116   1      < 0.1%
3108   1      < 0.1%
3105   1      < 0.1%
3074   1      < 0.1%
3067   1      < 0.1%
3063   1      < 0.1%
3055   1      < 0.1%
3050   1      < 0.1%
3049   1      < 0.1%

부번 (sub-lot number)
Real number (ℝ)

ZEROS 

Distinct: 245
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.6715
Minimum: 0
Maximum: 539
Zeros: 2661
Zeros (%): 26.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 8
95-th percentile: 47
Maximum: 539
Range: 539
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 36.153267
Coefficient of variation (CV): 3.0975682
Kurtosis: 72.549566
Mean: 11.6715
Median Absolute Deviation (MAD): 2
Skewness: 7.5477274
Sum: 116715
Variance: 1307.0587
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value               Count  Frequency (%)
0                   2661   26.6%
1                   1314   13.1%
2                   1070   10.7%
3                   756    7.6%
4                   533    5.3%
5                   427    4.3%
6                   329    3.3%
7                   291    2.9%
8                   234    2.3%
9                   192    1.9%
Other values (235)  2193   21.9%
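
The frequency column and the "Other values" row of this table can be recomputed from the counts. A minimal sketch, assuming each percentage is the count divided by the 10000 observations and that "Other values" is whatever remains after the ten listed values:

```python
n_rows = 10_000

# Counts of the ten most common 부번 values, copied from the report.
top_counts = {0: 2661, 1: 1314, 2: 1070, 3: 756, 4: 533,
              5: 427, 6: 329, 7: 291, 8: 234, 9: 192}

# Share of each value as a percentage of all rows.
shares = {value: 100 * count / n_rows for value, count in top_counts.items()}

# Rows not covered by the ten listed values.
other_count = n_rows - sum(top_counts.values())
other_share = 100 * other_count / n_rows

for value, share in shares.items():
    print(f"{value}: {share:.1f}%")
print(f"Other values: {other_count} ({other_share:.1f}%)")
```

The zero share computed here is the same 26.6% that triggers the Zeros alert for this variable.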
Minimum 10 values
Value  Count  Frequency (%)
0      2661   26.6%
1      1314   13.1%
2      1070   10.7%
3      756    7.6%
4      533    5.3%
5      427    4.3%
6      329    3.3%
7      291    2.9%
8      234    2.3%
9      192    1.9%

Maximum 10 values
Value  Count  Frequency (%)
539    1      < 0.1%
532    1      < 0.1%
503    1      < 0.1%
486    1      < 0.1%
485    1      < 0.1%
479    1      < 0.1%
478    1      < 0.1%
477    1      < 0.1%
476    1      < 0.1%
464    1      < 0.1%

공시지가 (officially assessed land price)
Real number (ℝ)

Distinct: 2639
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97585.644
Minimum: 154
Maximum: 2488000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 154
5-th percentile: 1870
Q1: 15000
median: 34550
Q3: 91025
95-th percentile: 401025
Maximum: 2488000
Range: 2487846
Interquartile range (IQR): 76025

Descriptive statistics

Standard deviation: 184712.27
Coefficient of variation (CV): 1.8928221
Kurtosis: 47.751213
Mean: 97585.644
Median Absolute Deviation (MAD): 25310
Skewness: 5.5405397
Sum: 9.7585644 × 10^8
Variance: 3.4118621 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
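
The high skewness and kurtosis of 공시지가 point to a long right tail. One way to make that concrete is the conventional 1.5 × IQR fence rule applied to the reported quartiles. This rule is not part of the report; it is used here only as an illustration, and the multiplier 1.5 is the usual convention rather than a value taken from the source.

```python
# Quantiles copied from the report for 공시지가.
q1, q3 = 15000, 91025
p95 = 401025
maximum = 2488000

iqr = q3 - q1                     # matches the reported IQR of 76025
lower_fence = q1 - 1.5 * iqr      # below this, a value would be flagged
upper_fence = q3 + 1.5 * iqr      # above this, a value would be flagged

print(f"IQR={iqr}, fences=({lower_fence}, {upper_fence})")
print("95th percentile above upper fence:", p95 > upper_fence)
```

Because the 95-th percentile already lies above the upper fence, more than 5% of the observations would be flagged as high outliers under this rule, while the negative lower fence flags none.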
Common values
Value                Count  Frequency (%)
29000                94     0.9%
5110                 91     0.9%
8080                 82     0.8%
27000                81     0.8%
10000                79     0.8%
28000                64     0.6%
7920                 58     0.6%
27200                51     0.5%
9570                 49     0.5%
13800                49     0.5%
Other values (2629)  9302   93.0%

Minimum 10 values
Value  Count  Frequency (%)
154    1      < 0.1%
227    1      < 0.1%
297    1      < 0.1%
306    1      < 0.1%
307    1      < 0.1%
312    2      < 0.1%
313    2      < 0.1%
316    1      < 0.1%
320    1      < 0.1%
323    1      < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
2488000  1      < 0.1%
2454000  1      < 0.1%
2450000  2      < 0.1%
2446000  1      < 0.1%
2420000  2      < 0.1%
2400000  1      < 0.1%
2394000  1      < 0.1%
2320000  1      < 0.1%
2263000  1      < 0.1%
2254000  1      < 0.1%

Interactions

[Figures omitted: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

            지명코드명  구분    본번    부번    공시지가
지명코드명  1.000       0.057   0.240   0.163   0.390
구분        0.057       1.000   0.452   0.049   0.117
본번        0.240       0.452   1.000   0.063   0.189
부번        0.163       0.049   0.063   1.000   0.124
공시지가    0.390       0.117   0.189   0.124   1.000

[Figure: correlation heatmap]

            지명코드명  본번    부번    공시지가  구분
지명코드명  1.000       -0.051  -0.221  -0.332    0.102
본번        -0.051      1.000   -0.064  0.152     0.348
부번        -0.221      -0.064  1.000   0.276     0.038
공시지가    -0.332      0.152   0.276   1.000     0.090
구분        0.102       0.348   0.038   0.090     1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows
(row index)  지명코드명  구분  본번, 부번, 공시지가 (digits run together in the source)
3146         10200       1     32611180100
67561        25321       1     80529164300
85392        25328       1     48124400
50249        25028       1     505162200
59015        25031       1     871118700
74352        25323       1     1582120600
19734        10500       1     319169000
40656        25024       1     5611109600
63116        25033       1     1353027000
17742        10400       1     4031161000

Last rows
(row index)  지명코드명  구분  본번, 부번, 공시지가 (digits run together in the source)
36641        25023       1     385430300
30137        25021       1     2234845000
15709        10400       1     1872379500
303          10100       1     13541984000
23402        10700       1     306222100
36246        25023       1     27613015000
77386        25324       1     779114500
16863        10400       1     3426113500
41294        25024       1     7676111000
84979        25327       1     65921202700