Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 517.6 KiB
Average record size in memory: 53.0 B

Variable types

Numeric: 4
Categorical: 1

Dataset

Description: Individually assessed official land prices (개별공시지가) for Miryang City
Author: Miryang City, Gyeongsangnam-do
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15002417

Alerts

Imbalance: 구분 is highly imbalanced (65.1%)
Zeros: 부번 has 2706 (27.1%) zeros
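
The report does not show how the 65.1% imbalance figure for 구분 is obtained. As a rough check, the sketch below assumes the score is one minus the Shannon entropy of the class shares divided by its maximum (log2 of the number of classes). Under that assumption the counts listed for 구분 (9344 and 656) give about 65.1%. The function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    Assumption: this mirrors how the reported imbalance percentage is defined.
    0.0 means perfectly balanced classes, 1.0 means a single class holds all rows.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(counts))


# Class counts for 구분 as listed in this report: value 1 -> 9344, value 2 -> 656.
score = imbalance_score([9344, 656])
print(f"{score:.1%}")
```

A perfectly balanced two-class column scores 0 under this definition, so a value near 65% indicates that one class dominates without the column being constant.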

Reproduction

Analysis started: 2023-12-10 23:41:36.597155
Analysis finished: 2023-12-10 23:41:39.048794
Duration: 2.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
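
The reported duration can be recomputed from the two timestamps above. This is a small sketch using only the Python standard library. The format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values printed in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 23:41:36.597155", FMT)
finished = datetime.strptime("2023-12-10 23:41:39.048794", FMT)

# Elapsed wall-clock time between the two analysis timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```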

Variables

지명코드명 (place-name code)
Real number (ℝ)

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21715.135
Minimum: 10100
Maximum: 31029
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10100
5-th percentile: 10200
Q1: 10800
Median: 25027
Q3: 25323
95-th percentile: 31026
Maximum: 31029
Range: 20929
Interquartile range (IQR): 14523

Descriptive statistics

Standard deviation: 7426.1047
Coefficient of variation (CV): 0.34197828
Kurtosis: -1.0992069
Mean: 21715.135
Median Absolute Deviation (MAD): 297
Skewness: -0.68746122
Sum: 2.1715135 × 10^8
Variance: 55147030
Monotonicity: Not monotonic
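
Several of the figures above follow from others in the same block: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below rechecks those relations for 지명코드명 using only numbers printed in this report. The dictionary keys and the tolerance are illustrative choices.

```python
import math

# Figures copied from the 지명코드명 block of this report.
reported = {
    "min": 10100, "max": 31029, "q1": 10800, "q3": 25323,
    "mean": 21715.135, "std": 7426.1047, "n": 10000,
    "range": 20929, "iqr": 14523, "cv": 0.34197828,
    "variance": 55147030, "sum": 2.1715135e8,
}

# Recompute each dependent statistic from the more basic ones.
derived = {
    "range": reported["max"] - reported["min"],
    "iqr": reported["q3"] - reported["q1"],
    "cv": reported["std"] / reported["mean"],
    "variance": reported["std"] ** 2,
    "sum": reported["mean"] * reported["n"],
}

# The relative tolerance allows for the rounding applied to the printed values.
checks = {
    name: math.isclose(value, reported[name], rel_tol=1e-6)
    for name, value in derived.items()
}
for name, ok in checks.items():
    print(f"{name}: derived={derived[name]:.8g} reported={reported[name]:.8g} match={ok}")
```

The same relations can be checked for the other numeric variables by swapping in their figures.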
Histogram with fixed size bins (bins=38)

Common Values

Value              Count  Frequency (%)
10200                648   6.5%
10400                600   6.0%
25321                502   5.0%
10800                468   4.7%
25021                434   4.3%
10300                419   4.2%
25023                416   4.2%
25323                398   4.0%
25029                358   3.6%
25322                307   3.1%
Other values (28)   5450  54.5%

Minimum 10 values

Value  Count  Frequency (%)
10100    232  2.3%
10200    648  6.5%
10300    419  4.2%
10400    600  6.0%
10500     82  0.8%
10600    279  2.8%
10700    140  1.4%
10800    468  4.7%
25021    434  4.3%
25022    235  2.4%

Maximum 10 values

Value  Count  Frequency (%)
31029     71  0.7%
31028    201  2.0%
31027    154  1.5%
31026     75  0.8%
31025    162  1.6%
31024    173  1.7%
31023    108  1.1%
31022    227  2.3%
31021    182  1.8%
25328    114  1.1%

구분 (classification)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
1       9344
2        656

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1       9344  93.4%
2        656   6.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the common values (1: 9344, 93.4%; 2: 656, 6.6%)

본번 (main lot number)
Real number (ℝ)

Distinct: 1757
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 567.9961
Minimum: 1
Maximum: 3109
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 34
Q1: 206
Median: 468
Q3: 765
95-th percentile: 1516.05
Maximum: 3109
Range: 3108
Interquartile range (IQR): 559

Descriptive statistics

Standard deviation: 479.6418
Coefficient of variation (CV): 0.84444558
Kurtosis: 4.0083494
Mean: 567.9961
Median Absolute Deviation (MAD): 277
Skewness: 1.6532915
Sum: 5679961
Variance: 230056.26
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
431                     50   0.5%
728                     44   0.4%
156                     39   0.4%
184                     39   0.4%
141                     34   0.3%
118                     33   0.3%
34                      29   0.3%
111                     27   0.3%
814                     27   0.3%
4                       27   0.3%
Other values (1747)   9651  96.5%

Minimum 10 values

Value  Count  Frequency (%)
1         23  0.2%
2         12  0.1%
3         15  0.1%
4         27  0.3%
5         15  0.1%
6         12  0.1%
7         23  0.2%
8         12  0.1%
9         13  0.1%
10        15  0.1%

Maximum 10 values

Value  Count  Frequency (%)
3109       1  < 0.1%
3105       1  < 0.1%
3082       1  < 0.1%
3076       1  < 0.1%
3066       1  < 0.1%
3058       1  < 0.1%
3048       1  < 0.1%
3044       1  < 0.1%
3037       1  < 0.1%
3032       1  < 0.1%

부번 (sub-lot number)
Real number (ℝ)

ZEROS

Distinct: 255
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.553
Minimum: 0
Maximum: 567
Zeros: 2706
Zeros (%): 27.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 8
95-th percentile: 45
Maximum: 567
Range: 567
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 37.689615
Coefficient of variation (CV): 3.2623227
Kurtosis: 74.25663
Mean: 11.553
Median Absolute Deviation (MAD): 2
Skewness: 7.6986283
Sum: 115530
Variance: 1420.507
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
0                    2706  27.1%
1                    1360  13.6%
2                    1055  10.5%
3                     789   7.9%
4                     566   5.7%
5                     431   4.3%
6                     309   3.1%
7                     280   2.8%
8                     239   2.4%
9                     188   1.9%
Other values (245)   2077  20.8%

Minimum 10 values

Value  Count  Frequency (%)
0       2706  27.1%
1       1360  13.6%
2       1055  10.5%
3        789   7.9%
4        566   5.7%
5        431   4.3%
6        309   3.1%
7        280   2.8%
8        239   2.4%
9        188   1.9%

Maximum 10 values

Value  Count  Frequency (%)
567        1  < 0.1%
550        1  < 0.1%
527        1  < 0.1%
517        1  < 0.1%
514        1  < 0.1%
506        1  < 0.1%
501        1  < 0.1%
500        1  < 0.1%
492        1  < 0.1%
490        1  < 0.1%
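
The frequency rows for 부번 can be cross-checked against the dataset totals: the ten listed values plus the "Other values" row should account for all 10000 observations, and the count for value 0 should reproduce the 27.1% zeros figure. A minimal sketch using only counts printed in this report; the variable names are illustrative.

```python
# Counts copied from the 부번 frequency rows in this report.
common = {0: 2706, 1: 1360, 2: 1055, 3: 789, 4: 566,
          5: 431, 6: 309, 7: 280, 8: 239, 9: 188}
other = 2077            # the "Other values (245)" row
n_observations = 10000  # from the dataset statistics

listed = sum(common.values())       # rows covered by the ten listed values
total = listed + other              # should equal the number of observations
zero_share = common[0] / n_observations
listed_share = listed / n_observations

print(f"listed rows: {listed}, total: {total}")
print(f"zeros: {zero_share:.1%}, values 0-9 combined: {listed_share:.1%}")
```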

공시지가 (officially assessed land price)
Real number (ℝ)

Distinct: 2576
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 95940.343
Minimum: 201
Maximum: 2815000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 201
5-th percentile: 1739
Q1: 13475
Median: 32100
Q3: 86700
95-th percentile: 392620
Maximum: 2815000
Range: 2814799
Interquartile range (IQR): 73225

Descriptive statistics

Standard deviation: 189709.81
Coefficient of variation (CV): 1.9773726
Kurtosis: 45.553785
Mean: 95940.343
Median Absolute Deviation (MAD): 23520
Skewness: 5.5843118
Sum: 9.5940343 × 10^8
Variance: 3.598981 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
25000                  120   1.2%
9240                    91   0.9%
27000                   89   0.9%
28000                   83   0.8%
26000                   81   0.8%
4450                    80   0.8%
7420                    79   0.8%
31000                   70   0.7%
16800                   65   0.7%
6930                    60   0.6%
Other values (2566)   9182  91.8%

Minimum 10 values

Value  Count  Frequency (%)
201        1  < 0.1%
257        1  < 0.1%
260        1  < 0.1%
262        1  < 0.1%
270        1  < 0.1%
287        2  < 0.1%
288        1  < 0.1%
296        1  < 0.1%
297        1  < 0.1%
300        1  < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2815000      1  < 0.1%
2360000      2  < 0.1%
2347000      1  < 0.1%
2320000      1  < 0.1%
2243000      1  < 0.1%
2235000      1  < 0.1%
2184000      1  < 0.1%
2171000      1  < 0.1%
2130000      1  < 0.1%
2108000      1  < 0.1%

Interactions

16 interaction plots (SVG, Matplotlib v3.7.2, https://matplotlib.org/)

Correlations

            지명코드명  구분     본번     부번     공시지가
지명코드명  1.000       0.051  0.250  0.167  0.487
구분        0.051       1.000  0.441  0.056  0.095
본번        0.250       0.441  1.000  0.075  0.133
부번        0.167       0.056  0.075  1.000  0.098
공시지가    0.487       0.095  0.133  0.098  1.000

            지명코드명  본번    부번    공시지가  구분
지명코드명  1.000       -0.043  -0.231  -0.348    0.092
본번        -0.043      1.000   -0.075  0.144     0.339
부번        -0.231      -0.075  1.000   0.275     0.043
공시지가    -0.348      0.144   0.275   1.000     0.073
구분        0.092       0.339   0.043   0.073     1.000
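
To read a correlation matrix like these, it can help to list each column pair once, ordered by the absolute size of its coefficient. The sketch below does that for the first matrix, with the coefficients copied from this report. The helper name `ranked_pairs` is illustrative. The report text does not say which correlation measure each matrix uses, so the code makes no assumption about that.

```python
# Coefficients copied from the first correlation matrix in this report.
columns = ["지명코드명", "구분", "본번", "부번", "공시지가"]
matrix = [
    [1.000, 0.051, 0.250, 0.167, 0.487],
    [0.051, 1.000, 0.441, 0.056, 0.095],
    [0.250, 0.441, 1.000, 0.075, 0.133],
    [0.167, 0.056, 0.075, 1.000, 0.098],
    [0.487, 0.095, 0.133, 0.098, 1.000],
]


def ranked_pairs(names, values):
    """List each unordered column pair once, largest absolute coefficient first."""
    pairs = [
        (names[i], names[j], values[i][j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    ]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


pairs = ranked_pairs(columns, matrix)
for left, right, coef in pairs[:3]:
    print(f"{left} ~ {right}: {coef:+.3f}")
```

Sorting by absolute value means the same helper also works for the second matrix, where some coefficients are negative.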

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

지명코드명  구분  본번  부번  공시지가
440772502527581490
5306825029193503720
3813925023182309300
1756510400139216120100
50759250281802123600
6346825321163785800
85906253281403126300
93633310251133235700
683752532111168034000
6522325321141975770
지명코드명  구분  본번  부번  공시지가
21067106001271716500
569832503011039029700
2427910700236101610
75882253241109123100
627062503311159023000
386642502324219726
743802532311909027000
7969025325170701080
6368325321176451600
21233106001337146200