Overview

Dataset statistics

Number of variables: 7
Number of observations: 3610
Missing cells: 3610
Missing cells (%): 14.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 215.2 KiB
Average record size in memory: 61.0 B
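
The two derived figures above can be checked by hand. The sketch below assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 7
n_observations = 3610
missing_cells = 3610
total_size_kib = 215.2

# Assumption: every cell of the table counts towards the denominator.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes; average record size is memory per row.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The missing-cell count equals the number of observations, which is consistent with a single column being empty in every row.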

Variable types

Categorical: 2
Unsupported: 1
Text: 1
Numeric: 3

Dataset

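Description: Actual transaction price information for farmland (paddy fields, dry fields, orchards) reported under Article 27 of the Act on the Business of Licensed Real Estate Agents and the Reporting of Real Estate Transactions (average, minimum, and maximum prices by eup, myeon, and dong)
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220217000000002080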

Alerts

QU is highly overall correlated with YEAR (High correlation)
YEAR is highly overall correlated with QU (High correlation)
LNDPCL is highly overall correlated with MUMMPC (High correlation)
MUMMPC is highly overall correlated with LNDPCL and 1 other field (High correlation)
MXMMPC is highly overall correlated with MUMMPC (High correlation)
ORDR has 3610 (100.0%) missing values (Missing)
ORDR is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
LNDPCL has 169 (4.7%) zeros (Zeros)
MUMMPC has 173 (4.8%) zeros (Zeros)
MXMMPC has 169 (4.7%) zeros (Zeros)
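
The QU/YEAR alert is what one would expect if each QU label carries its year as a prefix, as the sample values in this report (for example "2011년 1분기") suggest. A minimal sketch of splitting such a label into year and quarter follows; the helper name `parse_qu` and the regular expression are assumptions made for illustration, not part of the dataset or of ydata-profiling.

```python
import re

# Assumed label layout: four-digit year + "년", a space, quarter digit + "분기".
QU_PATTERN = re.compile(r"^(\d{4})년 (\d)분기$")


def parse_qu(label: str) -> tuple[int, int]:
    """Split a QU label into (year, quarter). Hypothetical helper."""
    match = QU_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unexpected QU label: {label!r}")
    return int(match.group(1)), int(match.group(2))


# Labels taken from the QU values shown in the report.
samples = ["2011년 1분기", "2010년 3분기", "2014년 3분기"]
for label in samples:
    year, quarter = parse_qu(label)
    print(label, "->", year, quarter)
```

If the parsed year always equals YEAR, the QU column could be reduced to the quarter alone before further analysis.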

Reproduction

Analysis started: 2023-12-11 03:25:31.756884
Analysis finished: 2023-12-11 03:25:33.686652
Duration: 1.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

YEAR
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.3 KiB

Value  Count
2011   722
2010   722
2012   722
2013   722
2014   722

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2011
2nd row: 2011
3rd row: 2011
4th row: 2011
5th row: 2011

Common Values

Value  Count  Frequency (%)
2011   722    20.0%
2010   722    20.0%
2012   722    20.0%
2013   722    20.0%
2014   722    20.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

QU
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 28.3 KiB

Value              Count
2011년 3분기        242
2010년 3분기        242
2012년 3분기        242
2014년 3분기        242
2013년 3분기        242
Other values (10)  2400

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2011년 1분기
2nd row: 2011년 1분기
3rd row: 2011년 1분기
4th row: 2011년 1분기
5th row: 2011년 1분기

Common Values

Value             Count  Frequency (%)
2011년 3분기       242    6.7%
2010년 3분기       242    6.7%
2012년 3분기       242    6.7%
2014년 3분기       242    6.7%
2013년 3분기       242    6.7%
2011년 1분기       240    6.6%
2011년 2분기       240    6.6%
2010년 1분기       240    6.6%
2010년 2분기       240    6.6%
2012년 1분기       240    6.6%
Other values (5)  1200   33.2%

Length

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
3분기    1210   16.8%
1분기    1200   16.6%
2분기    1200   16.6%
2011년  722    10.0%
2010년  722    10.0%
2012년  722    10.0%
2014년  722    10.0%
2013년  722    10.0%

ORDR
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 3610
Missing (%): 100.0%
Memory size: 31.9 KiB

AREANM
Text

Distinct: 242
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 28.3 KiB

Length

Max length: 15
Median length: 8
Mean length: 8.5554017
Min length: 7

Characters and Unicode

Total characters: 30885
Distinct characters: 144
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상남도 창원시 진해구
2nd row: 경상남도 진주시
3rd row: 경상남도 통영시
4th row: 경상남도 사천시
5th row: 경상남도 김해시

Value               Count  Frequency (%)
경기도               675    8.7%
경상북도             360    4.7%
경상남도             330    4.3%
전라남도             330    4.3%
강원도               270    3.5%
충청남도             270    3.5%
서울특별시           225    2.9%
전라북도             225    2.9%
부산광역시           225    2.9%
충청북도             205    2.7%
Other values (240)  4610   59.7%

Most occurring characters

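Value               Count  Frequency (%)
(space)             4115   13.3%
(not preserved)     2770   9.0%
(not preserved)     2425   7.9%
(not preserved)     1495   4.8%
(not preserved)     1410   4.6%
(not preserved)     1335   4.3%
(not preserved)     1170   3.8%
(not preserved)     910    2.9%
(not preserved)     825    2.7%
(not preserved)     750    2.4%
Other values (134)  13680  44.3%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     26770  86.7%
Space Separator  4115   13.3%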

Most frequent character per category

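Other Letter

Value               Count  Frequency (%)
(not preserved)     2770   10.3%
(not preserved)     2425   9.1%
(not preserved)     1495   5.6%
(not preserved)     1410   5.3%
(not preserved)     1335   5.0%
(not preserved)     1170   4.4%
(not preserved)     910    3.4%
(not preserved)     825    3.1%
(not preserved)     750    2.8%
(not preserved)     720    2.7%
Other values (133)  12960  48.4%

Space Separator

Value    Count  Frequency (%)
(space)  4115   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  26770  86.7%
Common  4115   13.3%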

Most frequent character per script

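Hangul

Value               Count  Frequency (%)
(not preserved)     2770   10.3%
(not preserved)     2425   9.1%
(not preserved)     1495   5.6%
(not preserved)     1410   5.3%
(not preserved)     1335   5.0%
(not preserved)     1170   4.4%
(not preserved)     910    3.4%
(not preserved)     825    3.1%
(not preserved)     750    2.8%
(not preserved)     720    2.7%
Other values (133)  12960  48.4%

Common

Value    Count  Frequency (%)
(space)  4115   100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  26770  86.7%
ASCII   4115   13.3%

Most frequent character per block

ASCII

Value    Count  Frequency (%)
(space)  4115   100.0%

Hangul

Value               Count  Frequency (%)
(not preserved)     2770   10.3%
(not preserved)     2425   9.1%
(not preserved)     1495   5.6%
(not preserved)     1410   5.3%
(not preserved)     1335   5.0%
(not preserved)     1170   4.4%
(not preserved)     910    3.4%
(not preserved)     825    3.1%
(not preserved)     750    2.8%
(not preserved)     720    2.7%
Other values (133)  12960  48.4%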

LNDPCL
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 985
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 888.68781
Minimum: 0
Maximum: 5059
Zeros: 169
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 57.5
Median: 690
Q3: 1516
95-th percentile: 2548
Maximum: 5059
Range: 5059
Interquartile range (IQR): 1458.5

Descriptive statistics

Standard deviation: 889.29682
Coefficient of variation (CV): 1.0006853
Kurtosis: 0.1573235
Mean: 888.68781
Median Absolute Deviation (MAD): 659
Skewness: 0.8577825
Sum: 3208163
Variance: 790848.83
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
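
Several of the LNDPCL figures are derived from others in the same block. The sketch below recomputes three of them from the reported values, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). Small differences in the last digits can arise because the inputs are already rounded.

```python
# Values copied from the LNDPCL statistics in the report.
mean = 888.68781
std = 889.29682
q1, q3 = 57.5, 1516

# Assumed definitions; the report does not spell them out.
cv = std / mean          # coefficient of variation
variance = std ** 2      # variance as the square of the standard deviation
iqr = q3 - q1            # interquartile range

print(f"CV:       {cv:.7f}")
print(f"Variance: {variance:.2f}")
print(f"IQR:      {iqr}")
```

A CV close to 1 means the spread of LNDPCL is about as large as its mean.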

Common values

Value               Count  Frequency (%)
0                   169    4.7%
2                   67     1.9%
1                   58     1.6%
3                   56     1.6%
5                   40     1.1%
4                   39     1.1%
14                  25     0.7%
8                   23     0.6%
9                   22     0.6%
13                  20     0.6%
Other values (975)  3091   85.6%

Minimum 10 values

Value  Count  Frequency (%)
0      169    4.7%
1      58     1.6%
2      67     1.9%
3      56     1.6%
4      39     1.1%
5      40     1.1%
6      16     0.4%
7      17     0.5%
8      23     0.6%
9      22     0.6%

Maximum 10 values

Value  Count  Frequency (%)
5059   3      0.1%
4259   3      0.1%
4141   3      0.1%
4069   1      < 0.1%
3948   3      0.1%
3894   3      0.1%
3543   3      0.1%
3491   1      < 0.1%
3411   3      0.1%
3386   3      0.1%

MUMMPC
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1001
Distinct (%): 27.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45591.909
Minimum: 0
Maximum: 3040000
Zeros: 173
Zeros (%): 4.8%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 18
Q1: 571.75
Median: 2150
Q3: 18120
95-th percentile: 257380
Maximum: 3040000
Range: 3040000
Interquartile range (IQR): 17548.25

Descriptive statistics

Standard deviation: 143142.13
Coefficient of variation (CV): 3.1396388
Kurtosis: 101.75113
Mean: 45591.909
Median Absolute Deviation (MAD): 1939
Skewness: 7.8013975
Sum: 1.6458679 × 10^8
Variance: 2.0489668 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                   173    4.8%
470                 27     0.7%
220                 27     0.7%
790                 24     0.7%
300                 24     0.7%
240                 23     0.6%
380                 23     0.6%
400                 23     0.6%
480                 22     0.6%
370                 22     0.6%
Other values (991)  3222   89.3%

Minimum 10 values

Value  Count  Frequency (%)
0      173    4.8%
10     7      0.2%
18     2      0.1%
20     9      0.2%
30     9      0.2%
32     2      0.1%
40     9      0.2%
50     3      0.1%
70     3      0.1%
80     4      0.1%

Maximum 10 values

Value    Count  Frequency (%)
3040000  1      < 0.1%
2142860  3      0.1%
1115108  1      < 0.1%
1115100  1      < 0.1%
907550   3      0.1%
891890   3      0.1%
890000   3      0.1%
882500   3      0.1%
873730   3      0.1%
819849   2      0.1%

MXMMPC
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1436
Distinct (%): 39.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 353013.36
Minimum: 0
Maximum: 4000000
Zeros: 169
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 31.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 15150
Q1: 61930
Median: 188970
Q3: 465990
95-th percentile: 1202530
Maximum: 4000000
Range: 4000000
Interquartile range (IQR): 404060

Descriptive statistics

Standard deviation: 434411.07
Coefficient of variation (CV): 1.2305797
Kurtosis: 9.6292947
Mean: 353013.36
Median Absolute Deviation (MAD): 151710
Skewness: 2.5301982
Sum: 1.2743782 × 10^9
Variance: 1.8871298 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
0                    169    4.7%
45450                9      0.2%
303030               9      0.2%
600000               9      0.2%
100000               7      0.2%
33330                7      0.2%
19000                6      0.2%
153510               6      0.2%
72360                6      0.2%
46490                6      0.2%
Other values (1426)  3376   93.5%

Minimum 10 values

Value  Count  Frequency (%)
0      169    4.7%
11900  3      0.1%
12240  3      0.1%
12690  3      0.1%
13470  1      < 0.1%
15150  3      0.1%
15320  3      0.1%
15910  3      0.1%
16166  1      < 0.1%
16510  3      0.1%

Maximum 10 values

Value    Count  Frequency (%)
4000000  3      0.1%
3040000  1      < 0.1%
3032260  3      0.1%
2860220  3      0.1%
2629540  3      0.1%
2459020  3      0.1%
2441260  3      0.1%
2419942  1      < 0.1%
2272727  1      < 0.1%
2197802  2      0.1%

Interactions

[Figures: 9 interaction plots, images not preserved]

Correlations

[Figure: correlation heatmap]

        YEAR   QU     LNDPCL  MUMMPC  MXMMPC
YEAR    1.000  1.000  0.328   0.050   0.113
QU      1.000  1.000  0.261   0.000   0.000
LNDPCL  0.328  0.261  1.000   0.156   0.365
MUMMPC  0.050  0.000  0.156   1.000   0.551
MXMMPC  0.113  0.000  0.365   0.551   1.000

[Figure: correlation heatmap]

      QU     YEAR
QU    1.000  0.999
YEAR  0.999  1.000

[Figure: correlation heatmap]

        LNDPCL  MUMMPC  MXMMPC  YEAR   QU
LNDPCL  1.000   -0.508  -0.380  0.142  0.100
MUMMPC  -0.508  1.000   0.804   0.034  0.000
MXMMPC  -0.380  0.804   1.000   0.065  0.000
YEAR    0.142   0.034   0.065   1.000  0.999
QU      0.100   0.000   0.000   0.999  1.000
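
The near-perfect QU/YEAR association can be reproduced from the category counts alone. The sketch below assumes the association measure is Cramér's V; the report does not name the method, and its 0.999 may come from a bias-corrected variant that is not implemented here. The function `cramers_v` is an illustrative helper, and the toy data uses the per-quarter counts from the QU Common Values table (240, 240 and 242 rows per year).

```python
import math
from collections import Counter


def cramers_v(pairs):
    """Plain (uncorrected) Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    chi2 = 0.0
    # Pearson chi-squared over every cell of the contingency table.
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint[(x, y)]
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Each quarter label belongs to exactly one year.
pairs = []
for year in range(2010, 2015):
    for quarter, count in ((1, 240), (2, 240), (3, 242)):
        pairs += [(f"{year}년 {quarter}분기", str(year))] * count

v = cramers_v(pairs)
print(f"rows: {len(pairs)}, Cramér's V: {v:.3f}")
```

Because YEAR is fully determined by QU, the uncorrected statistic reaches its maximum, which matches the alert that the two columns carry overlapping information.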

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Index  YEAR  QU            ORDR  AREANM                  LNDPCL+MUMMPC+MXMMPC (digits fused)
0      2011  2011년 1분기  <NA>  경상남도 창원시 진해구  19620160273220
1      2011  2011년 1분기  <NA>  경상남도 진주시         2071720142860
2      2011  2011년 1분기  <NA>  경상남도 통영시         1131300242540
3      2011  2011년 1분기  <NA>  경상남도 사천시         11571430160680
4      2011  2011년 1분기  <NA>  경상남도 김해시         1892470438630
5      2011  2011년 1분기  <NA>  경상남도 밀양시         3015380165000
6      2011  2011년 1분기  <NA>  경상남도 거제시         18152700440250
7      2011  2011년 1분기  <NA>  경상남도 양산시         8895370303030
8      2011  2011년 1분기  <NA>  경상남도 의령군         131352051410
9      2011  2011년 1분기  <NA>  경상남도 함안군         2761960213330

Index  YEAR  QU            ORDR  AREANM             LNDPCL+MUMMPC+MXMMPC (digits fused)
3600   2013  2013년 1분기  <NA>  서울특별시 종로구  22991043450
3601   2013  2013년 1분기  <NA>  서울특별시 중랑구  28918901344260
3602   2013  2013년 1분기  <NA>  서울특별시 성북구  000
3603   2013  2013년 1분기  <NA>  서울특별시 강북구  73113201500000
3604   2013  2013년 1분기  <NA>  서울특별시 도봉구  27320201202530
3605   2013  2013년 1분기  <NA>  서울특별시 노원구  5163780412800
3606   2013  2013년 1분기  <NA>  서울특별시 은평구  42054802180000
3607   2013  2013년 1분기  <NA>  서울특별시 마포구  000
3608   2013  2013년 1분기  <NA>  서울특별시 강서구  131323301542310
3609   2013  2013년 1분기  <NA>  서울특별시 구로구  41176501058780