Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 664.1 KiB
Average record size in memory: 68.0 B
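
The two memory figures can be cross-checked against each other: the average record size should be the total size divided by the number of observations. The sketch below assumes that KiB means 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 664.1
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the reported 68.0 B.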

Variable types

Numeric: 3
Categorical: 1
Text: 3

Dataset

Description: 2014 officially assessed individual land price (개별공시지가) data for Gwangmyeong City
Author: Gwangmyeong City, Gyeonggi Province
URL: https://www.data.go.kr/data/15054984/fileData.do

Alerts

No is highly overall correlated with 일련번호 and 1 other field [High correlation]
일련번호 is highly overall correlated with No and 1 other field [High correlation]
법정동 is highly overall correlated with No and 1 other field [High correlation]
구분 is highly imbalanced (79.5%) [Imbalance]
No has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 13:03:53.060380
Analysis finished: 2023-12-12 13:03:55.051569
Duration: 1.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
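
The duration is the difference between the two logged timestamps. A minimal sketch of that subtraction, using only the values printed in this section:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2023-12-12 13:03:53.060380")
finished = datetime.fromisoformat("2023-12-12 13:03:55.051569")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

The difference is 1.991189 s, which the report rounds to 1.99 seconds.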

Variables

No
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15021.442
Minimum: 3
Maximum: 30046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5th percentile: 1592.9
Q1: 7575.5
Median: 14946
Q3: 22463.75
95th percentile: 28477.1
Maximum: 30046
Range: 30043
Interquartile range (IQR): 14888.25

Descriptive statistics

Standard deviation: 8610.5536
Coefficient of variation (CV): 0.57321752
Kurtosis: -1.1910201
Mean: 15021.442
Median Absolute Deviation (MAD): 7448.5
Skewness: 0.009101913
Sum: 1.5021442 × 10^8
Variance: 74141634
Monotonicity: Not monotonic
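
Several of the figures reported for No follow directly from the others: the range and IQR are differences of quantiles, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in this report, so small gaps in the last digits are expected.

```python
# Figures copied from the quantile and descriptive statistics for No.
minimum, maximum = 3, 30046
q1, q3 = 7575.5, 22463.75
mean, std = 15021.442, 8610.5536

value_range = maximum - minimum   # reported: 30043
iqr = q3 - q1                     # reported: 14888.25
cv = std / mean                   # reported: 0.57321752
variance = std ** 2               # reported: 74141634 (std is rounded here)

print(value_range, iqr, round(cv, 4), round(variance))
```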
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
7598                  1       < 0.1%
16974                 1       < 0.1%
8529                  1       < 0.1%
15687                 1       < 0.1%
42                    1       < 0.1%
3041                  1       < 0.1%
1455                  1       < 0.1%
15683                 1       < 0.1%
5734                  1       < 0.1%
25966                 1       < 0.1%
Other values (9990)   9990    99.9%

Value                 Count   Frequency (%)
3                     1       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
10                    1       < 0.1%
11                    1       < 0.1%
13                    1       < 0.1%
22                    1       < 0.1%
24                    1       < 0.1%
31                    1       < 0.1%

Value                 Count   Frequency (%)
30046                 1       < 0.1%
30045                 1       < 0.1%
30044                 1       < 0.1%
30041                 1       < 0.1%
30033                 1       < 0.1%
30032                 1       < 0.1%
30029                 1       < 0.1%
30028                 1       < 0.1%
30023                 1       < 0.1%
30021                 1       < 0.1%

일련번호
Real number (ℝ)

HIGH CORRELATION

Distinct: 9752
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39194.952
Minimum: 3
Maximum: 999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5th percentile: 1601.65
Q1: 7592.75
Median: 14963
Q3: 22489.5
95th percentile: 28506.25
Maximum: 999999
Range: 999996
Interquartile range (IQR): 14896.75

Descriptive statistics

Standard deviation: 153767.59
Coefficient of variation (CV): 3.9231478
Kurtosis: 34.982682
Mean: 39194.952
Median Absolute Deviation (MAD): 7446
Skewness: 6.0709516
Sum: 3.9194952 × 10^8
Variance: 2.3644472 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
999999                249     2.5%
1408                  1       < 0.1%
4886                  1       < 0.1%
25971                 1       < 0.1%
19341                 1       < 0.1%
8253                  1       < 0.1%
15206                 1       < 0.1%
40                    1       < 0.1%
2948                  1       < 0.1%
7345                  1       < 0.1%
Other values (9742)   9742    97.4%

Value                 Count   Frequency (%)
3                     1       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
9                     1       < 0.1%
10                    1       < 0.1%
12                    1       < 0.1%
21                    1       < 0.1%
23                    1       < 0.1%
29                    1       < 0.1%

Value                 Count   Frequency (%)
999999                249     2.5%
29299                 1       < 0.1%
29298                 1       < 0.1%
29297                 1       < 0.1%
29294                 1       < 0.1%
29287                 1       < 0.1%
29286                 1       < 0.1%
29283                 1       < 0.1%
29282                 1       < 0.1%
29277                 1       < 0.1%
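
The value 999999 occurs 249 times (2.5%) in 일련번호 and sits far above the next-largest value, 29299. That pattern suggests a placeholder code rather than a real serial number, although the report itself does not say so. Treating that reading as an assumption, the sketch below estimates how much those rows pull the mean, using only figures printed in this report (the mean is rounded, so the result is approximate).

```python
# Figures copied from the 일련번호 statistics in this report.
n_rows = 10_000
reported_mean = 39194.952
sentinel_value = 999_999   # assumption: a placeholder code, not a real serial number
sentinel_count = 249

# Reconstruct the column total from the rounded mean, then drop the sentinel rows.
total = reported_mean * n_rows
remaining_total = total - sentinel_value * sentinel_count
remaining_rows = n_rows - sentinel_count

mean_without_sentinel = remaining_total / remaining_rows
print(f"mean excluding {sentinel_value}: {mean_without_sentinel:.1f}")
```

Without the 249 rows the mean falls to roughly 14660, close to the reported median of 14963, which is consistent with the high skewness and kurtosis coming from this one value.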

법정동
Real number (ℝ)

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.412
Minimum: 101
Maximum: 108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5th percentile: 101
Q1: 101
Median: 103
Q3: 105
95th percentile: 108
Maximum: 108
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.33782
Coefficient of variation (CV): 0.022606855
Kurtosis: -1.0746819
Mean: 103.412
Median Absolute Deviation (MAD): 2
Skewness: 0.48130914
Sum: 1034120
Variance: 5.4654025
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value   Count   Frequency (%)
101     3651    36.5%
104     1576    15.8%
105     1259    12.6%
107     1153    11.5%
102     801     8.0%
103     743     7.4%
108     533     5.3%
106     284     2.8%

Value   Count   Frequency (%)
101     3651    36.5%
102     801     8.0%
103     743     7.4%
104     1576    15.8%
105     1259    12.6%
106     284     2.8%
107     1153    11.5%
108     533     5.3%

Value   Count   Frequency (%)
108     533     5.3%
107     1153    11.5%
106     284     2.8%
105     1259    12.6%
104     1576    15.8%
103     743     7.4%
102     801     8.0%
101     3651    36.5%
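
Because 법정동 takes only eight values, its sum, mean and variance can be recomputed from the frequency table alone. The sketch below does that as a consistency check. The reported variance matches the sample variance (dividing by n - 1), which is noted here as an observation from the numbers rather than a documented setting.

```python
# Value counts for 법정동, copied from the frequency table in this report.
counts = {101: 3651, 102: 801, 103: 743, 104: 1576,
          105: 1259, 106: 284, 107: 1153, 108: 533}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: sum of squared deviations divided by n - 1.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)

print(n, total, mean, round(variance, 7))
```

The recomputed sum (1034120) and mean (103.412) agree with the descriptive statistics, and the variance agrees to the printed precision.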

구분
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       9437    94.4%
2       544     5.4%
9       19      0.2%
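
The 79.5% imbalance figure for 구분 is consistent with one minus the normalized Shannon entropy of the category counts. That definition is an assumption in this sketch, since the report does not state its formula; the counts are taken from the Common Values table.

```python
import math

# Category counts for 구분, copied from the Common Values table.
counts = {"1": 9437, "2": 544, "9": 19}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy of the observed distribution (natural log).
entropy = -sum(p * math.log(p) for p in probabilities)

# Assumed definition: 1 - entropy / maximum possible entropy for this many classes.
imbalance = 1 - entropy / math.log(len(counts))
print(f"{imbalance:.1%}")
```

A score of 0 would mean the three categories are equally frequent, and a score near 1 means almost all rows fall into one category, as they do here with 94.4% of rows in category 1.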

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
1       9437    94.4%
2       544     5.4%
9       19      0.2%

본번
Text

Distinct: 1177
Distinct (%): 11.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9975
Min length: 2

Characters and Unicode

Total characters: 39975
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 174
Unique (%): 1.7%

Sample

1st row: 0301
2nd row: 0689
3rd row: 0158
4th row: 0760
5th row: 0044

Value                 Count   Frequency (%)
0158                  459     4.6%
0467                  110     1.1%
0041                  105     1.1%
0472                  85      0.9%
0056                  77      0.8%
0032                  75      0.8%
0374                  61      0.6%
0033                  59      0.6%
0313                  56      0.6%
0077                  54      0.5%
Other values (1167)   8859    88.6%

Most occurring characters

Value              Count   Frequency (%)
0                  13664   34.2%
1                  4215    10.5%
3                  3371    8.4%
2                  3163    7.9%
5                  2882    7.2%
7                  2831    7.1%
4                  2718    6.8%
6                  2572    6.4%
8                  2520    6.3%
9                  2014    5.0%
Other values (7)   25      0.1%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   39950   99.9%
Other Letter     25      0.1%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       13664   34.2%
1       4215    10.6%
3       3371    8.4%
2       3163    7.9%
5       2882    7.2%
7       2831    7.1%
4       2718    6.8%
6       2572    6.4%
8       2520    6.3%
9       2014    5.0%
Other Letter
Value   Count   Frequency (%)
[?]     10      40.0%
[?]     4       16.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     1       4.0%
[?]     1       4.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   39950   99.9%
Hangul   25      0.1%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       13664   34.2%
1       4215    10.6%
3       3371    8.4%
2       3163    7.9%
5       2882    7.2%
7       2831    7.1%
4       2718    6.8%
6       2572    6.4%
8       2520    6.3%
9       2014    5.0%
Hangul
Value   Count   Frequency (%)
[?]     10      40.0%
[?]     4       16.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     1       4.0%
[?]     1       4.0%

Most occurring blocks

Value    Count   Frequency (%)
ASCII    39950   99.9%
Hangul   25      0.1%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       13664   34.2%
1       4215    10.6%
3       3371    8.4%
2       3163    7.9%
5       2882    7.2%
7       2831    7.1%
4       2718    6.8%
6       2572    6.4%
8       2520    6.3%
9       2014    5.0%
Hangul
Value   Count   Frequency (%)
[?]     10      40.0%
[?]     4       16.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     3       12.0%
[?]     1       4.0%
[?]     1       4.0%

부번
Text

Distinct: 691
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 40000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 420
Unique (%): 4.2%

Sample

1st row: 0140
2nd row: 0013
3rd row: 0945
4th row: 0003
5th row: 0029

Value                Count   Frequency (%)
0000                 1120    11.2%
0001                 933     9.3%
0002                 790     7.9%
0003                 636     6.4%
0004                 497     5.0%
0005                 417     4.2%
0006                 388     3.9%
0007                 311     3.1%
0008                 271     2.7%
0009                 248     2.5%
Other values (681)   4389    43.9%

Most occurring characters

Value   Count   Frequency (%)
0       26272   65.7%
1       3555    8.9%
2       2201    5.5%
3       1693    4.2%
4       1430    3.6%
5       1180    2.9%
6       1085    2.7%
7       917     2.3%
8       846     2.1%
9       819     2.0%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     39998   > 99.9%
Dash Punctuation   2       < 0.1%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       26272   65.7%
1       3555    8.9%
2       2201    5.5%
3       1693    4.2%
4       1430    3.6%
5       1180    3.0%
6       1085    2.7%
7       917     2.3%
8       846     2.1%
9       819     2.0%
Dash Punctuation
Value   Count   Frequency (%)
-       2       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   40000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       26272   65.7%
1       3555    8.9%
2       2201    5.5%
3       1693    4.2%
4       1430    3.6%
5       1180    2.9%
6       1085    2.7%
7       917     2.3%
8       846     2.1%
9       819     2.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   40000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       26272   65.7%
1       3555    8.9%
2       2201    5.5%
3       1693    4.2%
4       1430    3.6%
5       1180    2.9%
6       1085    2.7%
7       917     2.3%
8       846     2.1%
9       819     2.0%

결정지가
Text

Distinct: 3043
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 7.8701
Min length: 5

Characters and Unicode

Total characters: 78701
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1498
Unique (%): 15.0%

Sample

1st row: 590,700
2nd row: 175,500
3rd row: 1,825,000
4th row: 345,700
5th row: 980,400

Value                 Count   Frequency (%)
2,300,000             152     1.5%
551,100               73      0.7%
537,900               73      0.7%
1,670,000             64      0.6%
495,000               50      0.5%
1,548,000             50      0.5%
1,740,000             50      0.5%
208,500               48      0.5%
1,580,000             48      0.5%
169,600               46      0.5%
Other values (3033)   9346    93.5%
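
The price column is profiled as Text because its values carry thousands separators (for example 1,825,000), so it receives no numeric statistics and does not appear in the correlation tables. A minimal sketch of converting such strings to integers before profiling is shown here; `parse_price` is a hypothetical helper, not part of the dataset or of ydata-profiling.

```python
def parse_price(text: str) -> int:
    """Convert a comma-grouped string such as '1,825,000' to an integer."""
    return int(text.replace(",", ""))


# The five sample rows shown for the price column in this report.
samples = ["590,700", "175,500", "1,825,000", "345,700", "980,400"]
prices = [parse_price(s) for s in samples]
print(prices)
```

Applying a conversion like this to the whole column before profiling would let the tool treat it as numeric.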

Most occurring characters

Value   Count   Frequency (%)
0       29192   37.1%
,       14799   18.8%
1       6980    8.9%
2       4571    5.8%
5       4050    5.1%
4       3429    4.4%
3       3370    4.3%
7       3292    4.2%
6       3285    4.2%
8       3057    3.9%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      63902   81.2%
Other Punctuation   14799   18.8%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       29192   45.7%
1       6980    10.9%
2       4571    7.2%
5       4050    6.3%
4       3429    5.4%
3       3370    5.3%
7       3292    5.2%
6       3285    5.1%
8       3057    4.8%
9       2676    4.2%
Other Punctuation
Value   Count   Frequency (%)
,       14799   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   78701   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       29192   37.1%
,       14799   18.8%
1       6980    8.9%
2       4571    5.8%
5       4050    5.1%
4       3429    4.4%
3       3370    4.3%
7       3292    4.2%
6       3285    4.2%
8       3057    3.9%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   78701   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       29192   37.1%
,       14799   18.8%
1       6980    8.9%
2       4571    5.8%
5       4050    5.1%
4       3429    4.4%
3       3370    4.3%
7       3292    4.2%
6       3285    4.2%
8       3057    3.9%

Interactions

[Nine interaction plots; images not reproduced]

Correlations

           No      일련번호   법정동    구분
No         1.000   0.059   0.907   0.280
일련번호     0.059   1.000   0.064   0.000
법정동       0.907   0.064   1.000   0.267
구분        0.280   0.000   0.267   1.000

           No      일련번호   법정동    구분
No         1.000   0.939   0.971   0.174
일련번호     0.939   1.000   0.912   0.000
법정동       0.971   0.912   1.000   0.175
구분        0.174   0.000   0.175   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

        No      일련번호   법정동   구분   본번     부번     결정지가
7597    7598    7345    101    1    0301   0140   590,700
10225   10226   9924    101    1    0689   0013   175,500
4877    4878    4695    101    1    0158   0945   1,825,000
23124   23125   22500   105    1    0760   0003   345,700
28586   28587   27873   108    1    0044   0029   980,400
2203    2204    2137    101    1    0049   0092   1,882,000
9075    9076    8793    101    1    0374   0128   1,968,000
29928   29929   29185   108    1    0418   0046   273,500
18528   18529   17995   104    1    0927   0010   1,563,000
20940   20941   20353   105    1    0203   0005   939,200

        No      일련번호   법정동   구분   본번     부번     결정지가
4200    4201    4056    101    1    0158   0182   1,896,000
1614    1615    1564    101    1    0041   0159   1,668,000
13584   13585   13158   103    1    0240   0002   1,365,000
19019   19020   18476   104    1    1104   0003   1,463,000
14999   15000   14536   103    1    0778   0002   2,598,000
15531   15532   15054   104    1    0004   0009   147,800
28986   28987   28261   108    1    0142   0006   399,300
25843   25844   25170   107    1    0351   0001   235,500
12116   12117   11756   102    1    0227   0003   1,698,000
21085   21086   20495   105    1    0220   0025   1,035,000