Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B

Variable types

Numeric: 3
Categorical: 2
Text: 1

Dataset

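Description: Individual officially assessed land prices (개별공시지가) for Hanam-si, Gyeonggi-do, as of January 1, 2023. The data provides fields such as the legal dong (법정동), lot number (지번), and assessed price (결정지가).
URL: https://www.data.go.kr/data/3077419/fileData.do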

Alerts

연번 is highly overall correlated with 법정동 (High correlation)
법정동 is highly overall correlated with 연번 (High correlation)
구분 is highly imbalanced (73.6%) (Imbalance)
연번 has unique values (Unique)
부번 has 1374 (13.7%) zeros (Zeros)
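
The 73.6% imbalance figure for 구분 can be reproduced from the category counts given later in this report (9158, 841, and 1). The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value (the log of the number of classes). That formula matches the reported figure here, but the function name `imbalance_score` is illustrative and not taken from the tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed formula, chosen because it reproduces the reported 73.6%.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class cannot be normalized; treated as a special case
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))


# Counts of the three 구분 categories as listed under Common Values.
score = imbalance_score([9158, 841, 1])
print(f"{score:.1%}")
```

A perfectly even split, such as `imbalance_score([5, 5])`, gives a score of 0 under this definition.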

Reproduction

Analysis started: 2023-12-12 17:51:18.445852
Analysis finished: 2023-12-12 17:51:20.178021
Duration: 1.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
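
A report of this kind can be regenerated with ydata-profiling's `ProfileReport`. The sketch below is a minimal example under stated assumptions: the CSV file name, the `cp949` encoding (common for Korean public-data files, but not confirmed for this one), and the report title are placeholders, and the settings in config.json from the original run are not reproduced. It also shows that the reported duration is simply the difference between the two timestamps.

```python
from datetime import datetime


def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch; paths are placeholders)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="Hanam-si individual land prices (2023)")
    report.to_file(out_path)
    return out_path


# Duration check using the timestamps listed in this section.
started = datetime.fromisoformat("2023-12-12 17:51:18.445852")
finished = datetime.fromisoformat("2023-12-12 17:51:20.178021")
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")

# Example call (hypothetical file name):
# build_report("hanam_land_prices_2023.csv")
```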

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27331.506
Minimum: 6
Maximum: 54625
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6
5th percentile: 2884.65
Q1: 13868.75
Median: 27329
Q3: 40625.5
95th percentile: 51916.25
Maximum: 54625
Range: 54619
Interquartile range (IQR): 26756.75

Descriptive statistics

Standard deviation: 15634.432
Coefficient of variation (CV): 0.57202965
Kurtosis: -1.1794804
Mean: 27331.506
Median Absolute Deviation (MAD): 13388
Skewness: 0.0051262712
Sum: 2.7331506 × 10^8
Variance: 2.4443545 × 10^8
Monotonicity: Not monotonic
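
Several of the figures above are derived from others, which gives a quick way to sanity-check a transcription of this report. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative.

```python
# Values copied from the 연번 statistics in this report.
minimum, maximum = 6, 54625
q1, q3 = 13868.75, 40625.5
mean, std = 27331.506, 15634.432

value_range = maximum - minimum   # reported: 54619
iqr = q3 - q1                     # reported: 26756.75
cv = std / mean                   # reported: 0.57202965
variance = std ** 2               # reported: 2.4443545 × 10^8

print(value_range, iqr, round(cv, 5), f"{variance:.4e}")
```

Small differences in the last digits are expected, because the report rounds the mean and standard deviation before displaying them.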
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count    Frequency (%)
28151                  1        < 0.1%
35483                  1        < 0.1%
5595                   1        < 0.1%
5513                   1        < 0.1%
52170                  1        < 0.1%
20373                  1        < 0.1%
3925                   1        < 0.1%
3771                   1        < 0.1%
21134                  1        < 0.1%
5090                   1        < 0.1%
Other values (9990)    9990     99.9%

Minimum 10 values

Value    Count    Frequency (%)
6        1        < 0.1%
13       1        < 0.1%
16       1        < 0.1%
17       1        < 0.1%
25       1        < 0.1%
32       1        < 0.1%
34       1        < 0.1%
40       1        < 0.1%
42       1        < 0.1%
43       1        < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
54625    1        < 0.1%
54621    1        < 0.1%
54620    1        < 0.1%
54613    1        < 0.1%
54606    1        < 0.1%
54593    1        < 0.1%
54588    1        < 0.1%
54576    1        < 0.1%
54564    1        < 0.1%
54557    1        < 0.1%

법정동
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value                Count
덕풍동                1123
상산곡동              748
감북동                729
초이동                658
하산곡동              602
Other values (19)    6140

Length

Max length: 4
Median length: 3
Mean length: 3.1883
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미사동
2nd row: 창우동
3rd row: 풍산동
4th row: 감북동
5th row: 하사창동

Common Values

Value                Count    Frequency (%)
덕풍동                1123     11.2%
상산곡동              748      7.5%
감북동                729      7.3%
초이동                658      6.6%
하산곡동              602      6.0%
천현동                585      5.9%
신장동                566      5.7%
미사동                485      4.9%
춘궁동                480      4.8%
창우동                467      4.7%
Other values (14)    3557     35.6%

Length

Histogram of lengths of the category

구분
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value         Count
일반           9158
(blank)       841
블럭(롯트)     1

Length

Max length: 6
Median length: 2
Mean length: 1.9163
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value         Count    Frequency (%)
일반           9158     91.6%
(blank)       841      8.4%
블럭(롯트)     1        < 0.1%
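
The Length statistics for 구분 are consistent with these category counts. They also suggest that the category with no visible label is a one-character value, for example a single space. This is an inference from the numbers, not something the report states. A small check, with the length of 1 for the unlabeled category taken as an assumption:

```python
# (string length, count) for each 구분 category
categories = [
    (len("일반"), 9158),       # 2 characters
    (1, 841),                  # unlabeled category, assumed to be 1 character long
    (len("블럭(롯트)"), 1),    # 6 characters, including the parentheses
]

total = sum(count for _, count in categories)
mean_length = sum(length * count for length, count in categories) / total

print(total, mean_length)
```

Under that assumption the weighted mean equals the reported mean length of 1.9163, and the minimum and maximum lengths of 1 and 6 match as well.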

Length

Histogram of lengths of the category


본번
Text

Distinct: 912
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.7879
Min length: 1

Characters and Unicode

Total characters: 27879
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 169
Unique (%): 1.7%

Sample

1st row: 527
2nd row: 490
3rd row: 448
4th row: 56
5th row: 292

Common Values

Value                 Count    Frequency (%)
427                   83       0.8%
413                   63       0.6%
322                   60       0.6%
366                   58       0.6%
44                    44       0.4%
345                   43       0.4%
435                   43       0.4%
348                   42       0.4%
13                    41       0.4%
372                   41       0.4%
Other values (902)    9482     94.8%

Most occurring characters

Value    Count    Frequency (%)
3        4284     15.4%
1        3788     13.6%
2        3714     13.3%
4        3605     12.9%
5        2691     9.7%
6        2157     7.7%
7        2014     7.2%
8        1957     7.0%
9        1835     6.6%
0        1833     6.6%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    27878    > 99.9%
Other Letter      1        < 0.1%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
3        4284     15.4%
1        3788     13.6%
2        3714     13.3%
4        3605     12.9%
5        2691     9.7%
6        2157     7.7%
7        2014     7.2%
8        1957     7.0%
9        1835     6.6%
0        1833     6.6%
Other Letter
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    27878    > 99.9%
Hangul    1        < 0.1%

Most frequent character per script

Common
Value    Count    Frequency (%)
3        4284     15.4%
1        3788     13.6%
2        3714     13.3%
4        3605     12.9%
5        2691     9.7%
6        2157     7.7%
7        2014     7.2%
8        1957     7.0%
9        1835     6.6%
0        1833     6.6%
Hangul
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value     Count    Frequency (%)
ASCII     27878    > 99.9%
Hangul    1        < 0.1%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
3        4284     15.4%
1        3788     13.6%
2        3714     13.3%
4        3605     12.9%
5        2691     9.7%
6        2157     7.7%
7        2014     7.2%
8        1957     7.0%
9        1835     6.6%
0        1833     6.6%
Hangul
ValueCountFrequency (%)
1
100.0%

부번
Real number (ℝ)

ZEROS 

Distinct: 239
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.3783
Minimum: 0
Maximum: 551
Zeros: 1374
Zeros (%): 13.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 5
Q3: 13
95th percentile: 55
Maximum: 551
Range: 551
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 36.015806
Coefficient of variation (CV): 2.5048724
Kurtosis: 79.985844
Mean: 14.3783
Median Absolute Deviation (MAD): 4
Skewness: 7.7645594
Sum: 143783
Variance: 1297.1383
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
0                     1374     13.7%
1                     1142     11.4%
2                     955      9.6%
3                     731      7.3%
4                     626      6.3%
5                     525      5.2%
6                     422      4.2%
7                     365      3.6%
8                     319      3.2%
9                     280      2.8%
Other values (229)    3261     32.6%

Minimum 10 values

Value    Count    Frequency (%)
0        1374     13.7%
1        1142     11.4%
2        955      9.6%
3        731      7.3%
4        626      6.3%
5        525      5.2%
6        422      4.2%
7        365      3.6%
8        319      3.2%
9        280      2.8%

Maximum 10 values

Value    Count    Frequency (%)
551      1        < 0.1%
550      1        < 0.1%
543      1        < 0.1%
521      1        < 0.1%
501      1        < 0.1%
487      1        < 0.1%
481      1        < 0.1%
471      1        < 0.1%
470      1        < 0.1%
469      1        < 0.1%

결정지가(원)
Real number (ℝ)

Distinct: 3888
Distinct (%): 38.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1250239.7
Minimum: 2330
Maximum: 11970000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2330
5th percentile: 18195
Q1: 171975
Median: 644800
Q3: 1914250
95th percentile: 3882350
Maximum: 11970000
Range: 11967670
Interquartile range (IQR): 1742275

Descriptive statistics

Standard deviation: 1438128.7
Coefficient of variation (CV): 1.1502824
Kurtosis: 6.2568341
Mean: 1250239.7
Median Absolute Deviation (MAD): 547200
Skewness: 2.0241303
Sum: 1.2502397 × 10^10
Variance: 2.0682142 × 10^12
Monotonicity: Not monotonic
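
With a mean (1250239.7) almost twice the median (644800) and a skewness of about 2.02, this variable is right-skewed. One conventional way to see how far the maximum sits from the bulk of the data is Tukey's 1.5 × IQR rule. That rule is added here for illustration only; the report itself does not compute it.

```python
# Values copied from the 결정지가(원) quantile statistics in this report.
q1, q3 = 171975, 1914250
maximum = 11970000
p95 = 3882350

iqr = q3 - q1                   # 1742275, matches the reported IQR
upper_fence = q3 + 1.5 * iqr    # values above this are flagged by Tukey's rule
lower_fence = q1 - 1.5 * iqr    # negative here, so no low-side value is flagged

print(iqr, upper_fence, maximum > upper_fence, p95 > upper_fence)
```

The reported maximum lies well above the upper fence while the 95th percentile does not, so the flagged values all fall within the top 5% of parcels.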
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count    Frequency (%)
106000                 133      1.3%
141100                 108      1.1%
3820000                68       0.7%
209800                 65       0.7%
153300                 51       0.5%
65100                  47       0.5%
602900                 43       0.4%
467000                 42       0.4%
396000                 41       0.4%
118200                 39       0.4%
Other values (3878)    9363     93.6%

Minimum 10 values

Value    Count    Frequency (%)
2330     2        < 0.1%
2470     4        < 0.1%
2500     5        0.1%
2510     1        < 0.1%
2520     1        < 0.1%
2530     6        0.1%
2540     3        < 0.1%
2550     1        < 0.1%
2560     4        < 0.1%
2580     5        0.1%

Maximum 10 values

Value       Count    Frequency (%)
11970000    2        < 0.1%
11130000    1        < 0.1%
11080000    1        < 0.1%
10860000    3        < 0.1%
10740000    1        < 0.1%
10410000    4        < 0.1%
10020000    1        < 0.1%
9719000     1        < 0.1%
9699000     2        < 0.1%
9378000     1        < 0.1%

Interactions

[Figures: nine pairwise interaction plots; images not included]

Correlations

Variable        연번      법정동    구분      부번      결정지가(원)
연번            1.000    0.982    0.217    0.309    0.489
법정동          0.982    1.000    0.413    0.301    0.542
구분            0.217    0.413    1.000    0.059    0.255
부번            0.309    0.301    0.059    1.000    0.251
결정지가(원)    0.489    0.542    0.255    0.251    1.000
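
To read a matrix like the one above programmatically, one option is to list the off-diagonal pairs whose coefficient is at or above a cutoff. The sketch below uses the values from this first matrix. The 0.9 cutoff is an arbitrary illustrative choice, not the threshold configured for this report's High correlation alert, and `strong_pairs` is a hypothetical helper name.

```python
from itertools import combinations

names = ["연번", "법정동", "구분", "부번", "결정지가(원)"]
matrix = [
    [1.000, 0.982, 0.217, 0.309, 0.489],
    [0.982, 1.000, 0.413, 0.301, 0.542],
    [0.217, 0.413, 1.000, 0.059, 0.255],
    [0.309, 0.301, 0.059, 1.000, 0.251],
    [0.489, 0.542, 0.255, 0.251, 1.000],
]


def strong_pairs(names, matrix, cutoff):
    """Return (name_a, name_b, value) for each pair with |value| >= cutoff."""
    return [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) >= cutoff
    ]


print(strong_pairs(names, matrix, cutoff=0.9))
```

With this cutoff the only pair returned is 연번 and 법정동, the same pair named in the Alerts section.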

Variable    구분      법정동
구분        1.000    0.210
법정동      0.210    1.000

Variable        연번       부번       결정지가(원)    법정동    구분
연번            1.000     -0.106    -0.089          0.894    0.132
부번            -0.106    1.000     0.188           0.115    0.035
결정지가(원)    -0.089    0.188     1.000           0.231    0.158
법정동          0.894     0.115     0.231           1.000    0.210
구분            0.132     0.035     0.158           0.210    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

연번법정동구분본번부번결정지가(원)
2815028151미사동일반5270106000
90019002창우동일반490074700
2569125692풍산동일반44823297000
2930129302감북동일반561114400
4231542316하사창동일반2921372000
1244812449상산곡동일반502111316000
1632516326신장동일반437164390000
2645526456미사동일반12942428000
1288412885상산곡동일반56752227700
1719017191신장동일반51558481000
연번법정동구분본번부번결정지가(원)
3136931370감북동일반36812779000
4268742688하사창동일반3697395000
4941149412초이동일반6446327000
1493314934신장동일반27724425000
4794647947초일동일반6021002000
1312013121상산곡동일반6432111600
2357923580덕풍동일반93933502000
2972529726감북동일반165222053000
64356436하산곡동일반5541141100
5022850229초이동일반2162618000