Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B

Variable types

Numeric: 3
Text: 1
Categorical: 1

Dataset

Description: 경기부동산포털_토지_토지등급 (Gyeonggi Real Estate Portal, land, land grade)
Author: 경기도 (Gyeonggi Province)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=1JBAPAEYV2TEFH6N1Y7834239716&infSeq=1

Alerts

High correlation: 등급변동일자 is highly overall correlated with 등급 and 1 other field
High correlation: 등급 is highly overall correlated with 등급변동일자 and 1 other field
High correlation: 등급구분 is highly overall correlated with 등급변동일자 and 1 other field
Imbalance: 등급구분 is highly imbalanced (66.5%)
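
The imbalance figure can be sanity-checked against the class counts reported for 등급구분 (9382 and 618). The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That definition is an assumption rather than something stated in this report, but it does reproduce the 66.5% shown in the alert. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # The ratio is undefined for a single class; this sketch returns 0.0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts taken from the 등급구분 section of this report.
score = imbalance_score([9382, 618])
print(f"{score:.1%}")
```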

Reproduction

Analysis started: 2023-12-10 21:37:24.982023
Analysis finished: 2023-12-10 21:37:26.433957
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

토지고유번호
Real number (ℝ)

Distinct: 6560
Distinct (%): 65.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1111831 × 10^18
Minimum: 4.1111129 × 10^18
Maximum: 4.1113138 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4.1111129 × 10^18
5-th percentile: 4.1111129 × 10^18
Q1: 4.111113 × 10^18
Median: 4.1111132 × 10^18
Q3: 4.1113116 × 10^18
95-th percentile: 4.1113136 × 10^18
Maximum: 4.1113138 × 10^18
Range: 2.009007 × 10^14
Interquartile range (IQR): 1.9860071 × 10^14

Descriptive statistics

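Standard deviation: 9.5228739 × 10^13
Coefficient of variation (CV): 2.3163342 × 10^-5
Kurtosis: -1.6083813
Mean: 4.1111831 × 10^18
Median Absolute Deviation (MAD): 2.000017 × 10^11
Skewness: 0.62593701
Sum: -5.9620182 × 10^18 (negative although no value is negative, which is most likely signed 64-bit integer overflow; mean × 10000 observations gives about 4.1111831 × 10^22)
Variance: 9.0685127 × 10^27
Monotonicity: Not monotonic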
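
The negative Sum is easier to understand with a small demonstration of two's-complement wraparound. The sketch below is illustrative only: it sums 10,000 copies of the smallest identifier listed in this report instead of the real column, and `wrap_int64` is a hypothetical helper that mimics what a signed 64-bit accumulator would hold.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def wrap_int64(n: int) -> int:
    """Reduce an arbitrary Python int to the value a signed 64-bit integer would hold."""
    return (n - INT64_MIN) % 2**64 + INT64_MIN


# Illustrative input, not the real data: 10,000 copies of one 19-digit identifier.
values = [4111112900100020001] * 10_000

true_sum = sum(values)              # Python ints are arbitrary precision
wrapped_sum = wrap_int64(true_sum)  # what a 64-bit accumulator would report

print("true sum:   ", true_sum)
print("wrapped sum:", wrapped_sum)
print("overflowed: ", true_sum > INT64_MAX)
```

Because every identifier is already close to half of the 64-bit maximum, a fixed-width sum overflows after only a few rows, so the Sum line carries no usable information for this column.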
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
4111113200102300001      6  0.1%
4111113200103020003      6  0.1%
4111113200103040012      6  0.1%
4111113200100300002      5  0.1%
4111113200100370000      5  0.1%
4111113200102950001      5  0.1%
4111113200104330015      5  0.1%
4111113200104330071      5  0.1%
4111113200104330061      5  0.1%
4111113200104330084      5  0.1%
Other values (6550)   9947  99.5%

Minimum 10 values

Value                Count  Frequency (%)
4111112900100020001      1  < 0.1%
4111112900100030000      1  < 0.1%
4111112900100030002      1  < 0.1%
4111112900100030003      1  < 0.1%
4111112900100030004      2  < 0.1%
4111112900100030006      2  < 0.1%
4111112900100030008      1  < 0.1%
4111112900100030009      3  < 0.1%
4111112900100030011      2  < 0.1%
4111112900100030012      2  < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
4111313800800240002      1  < 0.1%
4111313800800090006      1  < 0.1%
4111313800800090004      1  < 0.1%
4111313800800060028      1  < 0.1%
4111313800106180001      1  < 0.1%
4111313800106170000      1  < 0.1%
4111313800106160000      2  < 0.1%
4111313800106150000      1  < 0.1%
4111313800106140001      1  < 0.1%
4111313800106130000      2  < 0.1%

토지소재지
Text

Distinct: 6560
Distinct (%): 65.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 19
Mean length: 16.8783
Min length: 13

Characters and Unicode

Total characters: 168783
Distinct characters: 53
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
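
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories for the first sample address in this section, using Python's standard `unicodedata` module. The short codes it prints correspond to the names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, and `Pd` is Dash Punctuation. The function name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by their Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of the 토지소재지 column.
sample = "수원시장안구 정자동 525-1"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```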

Unique

Unique: 3888
Unique (%): 38.9%

Sample

1st row: 수원시장안구 정자동 525-1
2nd row: 수원시권선구 고색동 141-25
3rd row: 수원시장안구 정자동 541-15
4th row: 수원시권선구 금곡동 610
5th row: 수원시장안구 이목동 404-3

Value                Count  Frequency (%)
수원시장안구          6493  21.5%
수원시권선구          3507  11.6%
정자동                2414  8.0%
율전동                1564  5.2%
파장동                1100  3.6%
이목동                 889  2.9%
매탄동                 758  2.5%
천천동                 463  1.5%
원천동                 444  1.5%
금곡동                 393  1.3%
Other values (5281)  12123  40.2%

Most occurring characters

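Value              Count  Frequency (%)
(space)            30296  17.9%
(character lost)   10444  6.2%
(character lost)   10000  5.9%
(character lost)   10000  5.9%
(character lost)   10000  5.9%
(character lost)   10000  5.9%
-                   8458  5.0%
(character lost)    7730  4.6%
1                   7071  4.2%
(character lost)    6493  3.8%
Other values (43)  58291  34.5%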

Most occurring categories

Value             Count  Frequency (%)
Other Letter      90844  53.8%
Decimal Number    39185  23.2%
Space Separator   30296  17.9%
Dash Punctuation   8458  5.0%

Most frequent character per category

Other Letter

Value              Count  Frequency (%)
(character lost)   10444  11.5%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)    7730  8.5%
(character lost)    6493  7.1%
(character lost)    3811  4.2%
(character lost)    3811  4.2%
(character lost)    2646  2.9%
Other values (31)  15909  17.5%

Decimal Number

Value  Count  Frequency (%)
1       7071  18.0%
3       5555  14.2%
2       5370  13.7%
4       4729  12.1%
5       3163  8.1%
6       3068  7.8%
7       2864  7.3%
9       2624  6.7%
8       2536  6.5%
0       2205  5.6%

Space Separator

Value    Count  Frequency (%)
(space)  30296  100.0%

Dash Punctuation

Value  Count  Frequency (%)
-       8458  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  90844  53.8%
Common  77939  46.2%

Most frequent character per script

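Hangul

Value              Count  Frequency (%)
(character lost)   10444  11.5%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)    7730  8.5%
(character lost)    6493  7.1%
(character lost)    3811  4.2%
(character lost)    3811  4.2%
(character lost)    2646  2.9%
Other values (31)  15909  17.5%

Common

Value             Count  Frequency (%)
(space)           30296  38.9%
-                  8458  10.9%
1                  7071  9.1%
3                  5555  7.1%
2                  5370  6.9%
4                  4729  6.1%
5                  3163  4.1%
6                  3068  3.9%
7                  2864  3.7%
9                  2624  3.4%
Other values (2)   4741  6.1%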

Most occurring blocks

Value   Count  Frequency (%)
Hangul  90844  53.8%
ASCII   77939  46.2%

Most frequent character per block

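ASCII

Value             Count  Frequency (%)
(space)           30296  38.9%
-                  8458  10.9%
1                  7071  9.1%
3                  5555  7.1%
2                  5370  6.9%
4                  4729  6.1%
5                  3163  4.1%
6                  3068  3.9%
7                  2864  3.7%
9                  2624  3.4%
Other values (2)   4741  6.1%

Hangul

Value              Count  Frequency (%)
(character lost)   10444  11.5%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)   10000  11.0%
(character lost)    7730  8.5%
(character lost)    6493  7.1%
(character lost)    3811  4.2%
(character lost)    3811  4.2%
(character lost)    2646  2.9%
Other values (31)  15909  17.5%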

등급변동일자
Real number (ℝ)

HIGH CORRELATION 

Distinct: 147
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19846051
Minimum: 19730401
Maximum: 19940101
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 19730401
5-th percentile: 19790910
Q1: 19810510
Median: 19840701
Q3: 19880501
95-th percentile: 19900401
Maximum: 19940101
Range: 209700
Interquartile range (IQR): 69991

Descriptive statistics

Standard deviation: 35103.17
Coefficient of variation (CV): 0.0017687736
Kurtosis: -0.64563228
Mean: 19846051
Median Absolute Deviation (MAD): 30191
Skewness: 0.11956714
Sum: 1.9846051 × 10^11
Variance: 1.2322326 × 10^9
Monotonicity: Not monotonic
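
The values of 등급변동일자 look like calendar dates packed into integers as YYYYMMDD (for example 19810510), so arithmetic statistics such as Range and Mean describe the encoding, not elapsed time. Assuming that reading is right, the sketch below converts the reported minimum and maximum into dates and measures the real span in days. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 19810510 as the date 1981-05-10."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum taken from the quantile statistics of this variable.
earliest = parse_yyyymmdd(19730401)
latest = parse_yyyymmdd(19940101)

span_days = (latest - earliest).days
print(earliest.isoformat(), latest.isoformat(), span_days)
```

Parsing the column into a date type before profiling would let the tool report date-aware minimum, maximum, and range values.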
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
19810510             1983  19.8%
19840701             1834  18.3%
19850815             1434  14.3%
19880501             1264  12.6%
19890501              606  6.1%
19790910              546  5.5%
19830901              388  3.9%
19900401              290  2.9%
19910101              281  2.8%
19850701              150  1.5%
Other values (137)   1224  12.2%

Minimum 10 values

Value     Count  Frequency (%)
19730401      2  < 0.1%
19740430      1  < 0.1%
19740930     10  0.1%
19750705      6  0.1%
19751108      2  < 0.1%
19760501     36  0.4%
19760620      3  < 0.1%
19760701      2  < 0.1%
19770219     62  0.6%
19781201      2  < 0.1%

Maximum 10 values

Value     Count  Frequency (%)
19940101     16  0.2%
19930823      1  < 0.1%
19930101     32  0.3%
19920101    122  1.2%
19911115      1  < 0.1%
19910101    281  2.8%
19901029      1  < 0.1%
19901023      1  < 0.1%
19900716      1  < 0.1%
19900707      2  < 0.1%

등급
Real number (ℝ)

HIGH CORRELATION 

Distinct: 180
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.2865
Minimum: 17
Maximum: 229
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 27
Q1: 63
Median: 133
Q3: 170
95-th percentile: 188
Maximum: 229
Range: 212
Interquartile range (IQR): 107

Descriptive statistics

Standard deviation: 54.73193
Coefficient of variation (CV): 0.45501307
Kurtosis: -1.3526229
Mean: 120.2865
Median Absolute Deviation (MAD): 45
Skewness: -0.31030829
Sum: 1202865
Variance: 2995.5842
Monotonicity: Not monotonic
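
Several of the figures above are derived from others, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, and variance for 등급 from the reported minimum, maximum, quartiles, mean, and standard deviation. It relies only on the standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared) and on the rounded values printed in this report, so small rounding differences are expected.

```python
import math

# Values copied from the statistics of 등급 in this report.
minimum, q1, q3, maximum = 17, 63, 170, 229
mean, std = 120.2865, 54.73193

value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

# Compare against the reported figures, allowing for rounding in the inputs.
checks = {
    "range": value_range == 212,
    "iqr": iqr == 107,
    "cv": math.isclose(cv, 0.45501307, rel_tol=1e-5),
    "variance": math.isclose(variance, 2995.5842, rel_tol=1e-5),
}
print(checks)
```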
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
170                   211  2.1%
51                    202  2.0%
176                   201  2.0%
167                   189  1.9%
181                   186  1.9%
66                    186  1.9%
26                    185  1.8%
174                   182  1.8%
123                   172  1.7%
65                    170  1.7%
Other values (170)   8116  81.2%

Minimum 10 values

Value  Count  Frequency (%)
17         1  < 0.1%
20         1  < 0.1%
22        65  0.7%
23        97  1.0%
24        77  0.8%
25        50  0.5%
26       185  1.8%
27        34  0.3%
28        98  1.0%
29         9  0.1%

Maximum 10 values

Value  Count  Frequency (%)
229        1  < 0.1%
226        2  < 0.1%
225        1  < 0.1%
224        1  < 0.1%
222        2  < 0.1%
220        2  < 0.1%
219        2  < 0.1%
218        3  < 0.1%
217        7  0.1%
216        3  < 0.1%

등급구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
토지등급: 9382
기준수확량등급: 618

Length

Max length: 7
Median length: 4
Mean length: 4.1854
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 토지등급
2nd row: 토지등급
3rd row: 토지등급
4th row: 토지등급
5th row: 토지등급

Common Values

Value           Count  Frequency (%)
토지등급         9382  93.8%
기준수확량등급    618  6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value           Count  Frequency (%)
토지등급         9382  93.8%
기준수확량등급    618  6.2%

Interactions


Correlations

              토지고유번호  등급변동일자  등급    등급구분
토지고유번호  1.000         0.251         0.378   0.035
등급변동일자  0.251         1.000         0.892   0.993
등급          0.378         0.892         1.000   1.000
등급구분      0.035         0.993         1.000   1.000

              토지고유번호  등급변동일자  등급    등급구분
토지고유번호  1.000         0.061         -0.175  0.022
등급변동일자  0.061         1.000         0.794   0.925
등급          -0.175        0.794         1.000   0.983
등급구분      0.022         0.925         0.983   1.000
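
To connect the first correlation matrix to the High correlation alerts, the sketch below lists every pair of variables whose absolute coefficient reaches a cutoff. The cutoff of 0.5 is an assumption chosen for illustration, not a value stated in this report, and `high_pairs` is a hypothetical helper. With that cutoff the three flagged pairs match the alerts: each of 등급변동일자, 등급, and 등급구분 is paired with the other two.

```python
from itertools import combinations

names = ["토지고유번호", "등급변동일자", "등급", "등급구분"]

# First correlation matrix from this report, rows and columns in `names` order.
matrix = [
    [1.000, 0.251, 0.378, 0.035],
    [0.251, 1.000, 0.892, 0.993],
    [0.378, 0.892, 1.000, 1.000],
    [0.035, 0.993, 1.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_pairs(names, matrix, threshold):
    """Return (name_a, name_b, coefficient) for each pair at or above the threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


flagged = high_pairs(names, matrix, THRESHOLD)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:.3f}")
```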

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  토지고유번호         토지소재지                    등급변동일자  등급  등급구분
15718  4111113000105250001  수원시장안구 정자동 525-1     19810510      67    토지등급
992    4111312800101410025  수원시권선구 고색동 141-25    19900401      174   토지등급
30667  4111113000105410015  수원시장안구 정자동 541-15    19850815      181   토지등급
1512   4111313400806100000  수원시권선구 금곡동 610       19900401      139   토지등급
30717  4111113100104040003  수원시장안구 이목동 404-3     19810510      56    토지등급
5361   4111313500803440000  수원시권선구 호매실동 344     19850701      110   토지등급
33811  4111112900200100001  수원시장안구 파장동 산 10-1   19840701      105   토지등급
27416  4111113000801700019  수원시장안구 정자동 170-19    19790910      28    기준수확량등급
16839  4111112900102110009  수원시장안구 파장동 211-9     19840701      162   토지등급
12442  4111311500109190006  수원시권선구 인계동 919-6     19850815      180   토지등급

Last rows

Index  토지고유번호         토지소재지                    등급변동일자  등급  등급구분
34382  4111112900101330004  수원시장안구 파장동 133-4     19830901      61    토지등급
3324   4111311600107170001  수원시권선구 매탄동 717-1     19810510      56    토지등급
10917  4111311500107600010  수원시권선구 인계동 760-10    19840701      207   토지등급
8442   4111313600800850001  수원시권선구 곡반정동 85-1    19850815      118   토지등급
14610  4111113000800640017  수원시장안구 정자동 64-17     19840701      137   토지등급
31188  4111113000802470001  수원시장안구 정자동 247-1     19810510      60    토지등급
34459  4111113100103430003  수원시장안구 이목동 343-3     19890501      184   토지등급
9540   4111312600104940001  수원시권선구 세류동 494-1     19920101      205   토지등급
29884  4111113100101040001  수원시장안구 이목동 104-1     19790910      22    기준수확량등급
5578   4111311600107260003  수원시권선구 매탄동 726-3     19840701      137   토지등급
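
The sample rows suggest that 토지고유번호 is a structured 19-digit code and not a quantity: in every row shown, digits 12 to 15 and 16 to 19 match the lot number at the end of 토지소재지 (for example 4111113000105250001 and 정자동 525-1). The sketch below splits an identifier on that assumption. The field layout (10-digit area code, 1-digit flag, 4-digit main number, 4-digit sub number) is inferred from these rows only, and the names `ParcelId`, `split_parcel_id`, and `lot_number` are hypothetical.

```python
from typing import NamedTuple


class ParcelId(NamedTuple):
    area_code: str    # first 10 digits (assumed administrative area code)
    land_flag: str    # 11th digit (meaning not stated in this report)
    main_number: int  # digits 12-15
    sub_number: int   # digits 16-19


def split_parcel_id(parcel_id: str) -> ParcelId:
    """Split a 19-digit identifier string into its assumed fields."""
    if len(parcel_id) != 19 or not parcel_id.isdigit():
        raise ValueError(f"expected 19 digits, got {parcel_id!r}")
    return ParcelId(
        area_code=parcel_id[:10],
        land_flag=parcel_id[10],
        main_number=int(parcel_id[11:15]),
        sub_number=int(parcel_id[15:]),
    )


def lot_number(parts: ParcelId) -> str:
    """Format the lot number the way it appears in the address column."""
    if parts.sub_number:
        return f"{parts.main_number}-{parts.sub_number}"
    return str(parts.main_number)


parts = split_parcel_id("4111113000105250001")
print(parts, lot_number(parts))
```

Keeping the identifier as a string also avoids precision loss: a 19-digit integer such as this one cannot be represented exactly as a 64-bit float, so treating the column as a real number can silently change its trailing digits.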