Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 517.6 KiB
Average record size in memory: 53.0 B
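
The average record size follows from the total memory footprint divided by the number of observations. A minimal check, assuming 1 KiB = 1024 bytes and using only the figures reported above:

```python
KIB = 1024  # bytes per kibibyte

total_kib = 517.6        # "Total size in memory" from the report
n_observations = 10_000  # "Number of observations" from the report

# Average bytes per row = total bytes / number of rows
avg_record_bytes = total_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")
```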

Variable types

Numeric: 4
Categorical: 1

Dataset

Description: Officially assessed individual land prices for Miryang City (밀양시 개별공시지가)
Author: Miryang City, Gyeongsangnam-do (경상남도 밀양시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15002417

Alerts

대장구분 is highly imbalanced (65.6%) (alert: Imbalance)
부번 has 2637 (26.4%) zeros (alert: Zeros)
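
Both alert figures can be recomputed from counts that appear elsewhere in this report. The sketch below assumes the imbalance score is one minus the normalized Shannon entropy of the class shares. That formula is an assumption about how the profiler derives the number, although it does reproduce the reported 65.6% for the 대장구분 counts (9358 and 642). The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(k)

ledger_counts = [9358, 642]   # 대장구분 value counts from the report
score = imbalance_score(ledger_counts)
zero_share = 2637 / 10_000    # 부번 zeros over all observations

print(f"imbalance: {score:.1%}, zeros: {zero_share:.1%}")
```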

Reproduction

Analysis started: 2023-12-10 23:41:46.003578
Analysis finished: 2023-12-10 23:41:48.285659
Duration: 2.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
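
The reported duration is simply the difference between the two timestamps. A small sketch using the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

started = datetime.fromisoformat("2023-12-10 23:41:46.003578")
finished = datetime.fromisoformat("2023-12-10 23:41:48.285659")

# timedelta.total_seconds() keeps the microsecond precision
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```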

Variables

지명코드 (place-name code)
Real number (ℝ)

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21827.262
Minimum: 10100
Maximum: 31029
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10100
5-th percentile: 10200
Q1: 10800
Median: 25028
Q3: 25323
95-th percentile: 31025
Maximum: 31029
Range: 20929
Interquartile range (IQR): 14523

Descriptive statistics

Standard deviation: 7370.0411
Coefficient of variation (CV): 0.33765303
Kurtosis: -1.0383326
Mean: 21827.262
Median Absolute Deviation (MAD): 297
Skewness: -0.72063955
Sum: 2.1827262 × 10^8
Variance: 54317506
Monotonicity: Not monotonic
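
Several of these statistics are derived from others in the same block: the range and IQR from the quantiles, the CV and variance from the mean and standard deviation, and the sum from the mean and the row count. A minimal sketch that recomputes them from the reported (rounded) values for 지명코드, so small rounding differences are expected:

```python
# Reported values for 지명코드, copied from the statistics above
minimum, maximum = 10100, 31029
q1, q3 = 10800, 25323
mean, std = 21827.262, 7370.0411
n = 10_000  # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr, round(cv, 6), round(variance), f"{total:.7e}")
```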
Histogram with fixed size bins (bins=38)
Value              Count  Frequency (%)
10200              602    6.0%
10400              585    5.9%
25321              547    5.5%
10300              448    4.5%
10800              447    4.5%
25021              438    4.4%
25023              409    4.1%
25029              402    4.0%
25323              393    3.9%
25327              336    3.4%
Other values (28)  5393   53.9%

Value  Count  Frequency (%)
10100  212    2.1%
10200  602    6.0%
10300  448    4.5%
10400  585    5.9%
10500  82     0.8%
10600  289    2.9%
10700  130    1.3%
10800  447    4.5%
25021  438    4.4%
25022  203    2.0%

Value  Count  Frequency (%)
31029  36     0.4%
31028  187    1.9%
31027  144    1.4%
31026  106    1.1%
31025  165    1.7%
31024  190    1.9%
31023  93     0.9%
31022  237    2.4%
31021  198    2.0%
25328  123    1.2%
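
The "Other values" row of the first frequency table for 지명코드 is the remainder after the ten most frequent values. A short check using only the counts reported above (10000 observations, 38 distinct values):

```python
# Counts of the ten most frequent 지명코드 values, in the order listed above
top10_counts = [602, 585, 547, 448, 447, 438, 409, 402, 393, 336]
n_observations = 10_000
n_distinct = 38

other_count = n_observations - sum(top10_counts)     # rows not covered by the top ten
other_distinct = n_distinct - len(top10_counts)      # distinct values not listed
other_share = 100 * other_count / n_observations     # percentage of all rows

print(f"Other values ({other_distinct}): {other_count} ({other_share:.1f}%)")
```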

대장구분 (ledger type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 9358
2: 642

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      9358   93.6%
2      642    6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      9358   93.6%
2      642    6.4%

본번 (main lot number)
Real number (ℝ)

Distinct: 1757
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 563.9671
Minimum: 1
Maximum: 3107
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35
Q1: 206
Median: 467
Q3: 767
95-th percentile: 1494.1
Maximum: 3107
Range: 3106
Interquartile range (IQR): 561
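
The 95-th percentile of 본번 is fractional (1494.1) even though the listed values are whole numbers, which suggests the quantiles are interpolated between neighbouring observations. The sketch below shows linear interpolation, the default of pandas `Series.quantile`. That the profiler uses this scheme is an assumption, and both the helper `percentile` and the toy data are hypothetical.

```python
import math

def percentile(values, q):
    """Linearly interpolated quantile of `values` for q in [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    data = sorted(values)
    pos = (len(data) - 1) * q              # fractional index into the sorted data
    lower = math.floor(pos)
    upper = min(lower + 1, len(data) - 1)  # clamp at the last element
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])

toy = [10, 20, 30, 40, 50]  # hypothetical integer data, not from the report
p95 = percentile(toy, 0.95)
print(p95)  # a non-integer result from integer inputs
```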

Descriptive statistics

Standard deviation: 471.88969
Coefficient of variation (CV): 0.83673267
Kurtosis: 4.0045846
Mean: 563.9671
Median Absolute Deviation (MAD): 276
Skewness: 1.6336768
Sum: 5639671
Variance: 222679.88
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
431                  47     0.5%
184                  36     0.4%
141                  32     0.3%
728                  31     0.3%
15                   30     0.3%
1467                 30     0.3%
118                  30     0.3%
17                   29     0.3%
709                  29     0.3%
467                  27     0.3%
Other values (1747)  9679   96.8%

Value  Count  Frequency (%)
1      22     0.2%
2      13     0.1%
3      18     0.2%
4      20     0.2%
5      10     0.1%
6      9      0.1%
7      15     0.1%
8      19     0.2%
9      14     0.1%
10     6      0.1%

Value  Count  Frequency (%)
3107   1      < 0.1%
3105   1      < 0.1%
3098   1      < 0.1%
3092   1      < 0.1%
3088   1      < 0.1%
3070   1      < 0.1%
3066   1      < 0.1%
3065   1      < 0.1%
3060   1      < 0.1%
3058   1      < 0.1%

부번 (sub lot number)
Real number (ℝ)

ZEROS

Distinct: 250
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.6534
Minimum: 0
Maximum: 595
Zeros: 2637
Zeros (%): 26.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 8
95-th percentile: 47
Maximum: 595
Range: 595
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 36.55748
Coefficient of variation (CV): 3.1370655
Kurtosis: 85.101812
Mean: 11.6534
Median Absolute Deviation (MAD): 2
Skewness: 7.9993649
Sum: 116534
Variance: 1336.4493
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   2637   26.4%
1                   1380   13.8%
2                   1066   10.7%
3                   741    7.4%
4                   534    5.3%
5                   460    4.6%
6                   323    3.2%
7                   262    2.6%
8                   238    2.4%
9                   207    2.1%
Other values (240)  2152   21.5%

Value  Count  Frequency (%)
0      2637   26.4%
1      1380   13.8%
2      1066   10.7%
3      741    7.4%
4      534    5.3%
5      460    4.6%
6      323    3.2%
7      262    2.6%
8      238    2.4%
9      207    2.1%

Value  Count  Frequency (%)
595    1      < 0.1%
590    1      < 0.1%
580    1      < 0.1%
571    1      < 0.1%
559    1      < 0.1%
535    1      < 0.1%
509    1      < 0.1%
499    1      < 0.1%
495    1      < 0.1%
490    1      < 0.1%

공시지가 (officially assessed land price)
Real number (ℝ)

Distinct: 2696
Distinct (%): 27.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98718.476
Minimum: 260
Maximum: 2560000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 260
5-th percentile: 2360
Q1: 16500
Median: 38800
Q3: 96125
95-th percentile: 369805
Maximum: 2560000
Range: 2559740
Interquartile range (IQR): 79625

Descriptive statistics

Standard deviation: 180167.77
Coefficient of variation (CV): 1.8250664
Kurtosis: 47.119073
Mean: 98718.476
Median Absolute Deviation (MAD): 28500
Skewness: 5.5452083
Sum: 9.8718476 × 10^8
Variance: 3.2460426 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
9240                 97     1.0%
5940                 92     0.9%
11500                90     0.9%
9400                 68     0.7%
31500                67     0.7%
30500                61     0.6%
29500                49     0.5%
40500                48     0.5%
35000                48     0.5%
29000                47     0.5%
Other values (2686)  9333   93.3%

Value  Count  Frequency (%)
260    1      < 0.1%
283    1      < 0.1%
320    1      < 0.1%
330    1      < 0.1%
356    1      < 0.1%
363    1      < 0.1%
364    2      < 0.1%
383    1      < 0.1%
390    1      < 0.1%
392    2      < 0.1%

Value    Count  Frequency (%)
2560000  1      < 0.1%
2483000  1      < 0.1%
2470000  2      < 0.1%
2420000  1      < 0.1%
2340000  1      < 0.1%
2176000  1      < 0.1%
2172000  1      < 0.1%
2128000  1      < 0.1%
2090000  1      < 0.1%
2066000  2      < 0.1%

Interactions

[16 interaction plots omitted]

Correlations

          지명코드  대장구분  본번     부번     공시지가
지명코드  1.000     0.047     0.237    0.161    0.384
대장구분  0.047     1.000     0.443    0.033    0.111
본번      0.237     0.443     1.000    0.064    0.171
부번      0.161     0.033     0.064    1.000    0.122
공시지가  0.384     0.111     0.171    0.122    1.000

          지명코드  본번      부번     공시지가  대장구분
지명코드  1.000     -0.046    -0.222   -0.329    0.084
본번      -0.046    1.000     -0.070   0.148     0.341
부번      -0.222    -0.070    1.000    0.257     0.026
공시지가  -0.329    0.148     0.257    1.000     0.085
대장구분  0.084     0.341     0.026    0.085     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

지명코드대장구분본번부번공시지가
1574110400122118287200
5232625029166316117800
100131030015950178200
765122532413861226700
4114025024186225110
6083325033165118900
254741080015495342700
52554250291782378000
5510725029239301860
61184250331257112000
지명코드대장구분본번부번공시지가
20371010016033801900
24483108001315340200
6951025322142945940
61605250331416036400
662222532115242111500
4960825028119512450
1458810400115717280200
91535310231479018500
9764831028167040000
92219310241151954000