Overview

Dataset statistics

Number of variables: 7
Number of observations: 44
Missing cells: 24
Missing cells (%): 7.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.7 KiB
Average record size in memory: 62.0 B
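
The missing-cell percentage can be checked by hand from the figures above and the per-column missing counts listed under Alerts. A minimal sketch in Python; the name `missing_per_column` is illustrative and not part of the report:

```python
# Figures copied from this report.
n_variables = 7
n_observations = 44

# Per-column missing counts as listed in the Alerts section.
missing_per_column = {
    "필지번호": 3,
    "낙찰가격": 3,
    "낙찰단가(원/㎡)": 9,
    "낙찰단가(원/3.3㎡)": 9,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # 24 7.8%
```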

Variable types

Text: 3
Categorical: 1
Numeric: 3

Dataset

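Description: Bid results for land offered for supply in Jeju English Education City (as of July 1, 2014)
Author: Jeju Free International City Development Center (제주국제자유도시개발센터)
URL: https://www.data.go.kr/data/15044043/fileData.do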

Alerts

면적(㎡) (area) is highly overall correlated with 토지용도 (land use) [High correlation]
낙찰단가(원/㎡) (winning bid unit price, KRW per ㎡) is highly overall correlated with 낙찰단가(원/3.3㎡) (winning bid unit price, KRW per 3.3 ㎡) [High correlation]
낙찰단가(원/3.3㎡) is highly overall correlated with 낙찰단가(원/㎡) [High correlation]
토지용도 is highly overall correlated with 면적(㎡) [High correlation]
필지번호 (lot number) has 3 (6.8%) missing values [Missing]
낙찰가격 (winning bid price) has 3 (6.8%) missing values [Missing]
낙찰단가(원/㎡) has 9 (20.5%) missing values [Missing]
낙찰단가(원/3.3㎡) has 9 (20.5%) missing values [Missing]
연번 (serial number) has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 15:50:32.242233
Analysis finished: 2023-12-12 15:50:34.277816
Duration: 2.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
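
The reported duration is simply the difference between the two timestamps. A short check, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 15:50:32.242233")
finished = datetime.fromisoformat("2023-12-12 15:50:34.277816")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 2.04 seconds
```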

Variables

연번 (serial number)
Text

UNIQUE

Distinct: 44
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 2
Median length: 2
Mean length: 1.7727273
Min length: 1

Characters and Unicode

Total characters: 78
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
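
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module. It is not the profiler's own implementation; `CATEGORY_NAMES` and `category_counts` are hypothetical names, and the input strings are illustrative values drawn from several columns of this report rather than a full column.

```python
import unicodedata
from collections import Counter

# Map two-letter general-category codes to the long names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Illustrative values only, not a full column.
print(category_counts(["1", "25", "유찰", "A-35-1"]))
```

Digits fall under Decimal Number, Hangul syllables under Other Letter, the hyphen under Dash Punctuation, and Latin capitals under Uppercase Letter, which matches the category names that appear in the text-variable sections of this report.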

Unique

Unique: 44
Unique (%): 100.0%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Value              Count  Frequency (%)
1                  1      2.3%
2                  1      2.3%
33                 1      2.3%
25                 1      2.3%
26                 1      2.3%
27                 1      2.3%
28                 1      2.3%
29                 1      2.3%
30                 1      2.3%
31                 1      2.3%
Other values (34)  34     77.3%

Most occurring characters

Value             Count  Frequency (%)
1                 15     19.2%
2                 14     17.9%
3                 14     17.9%
4                 6      7.7%
5                 4      5.1%
6                 4      5.1%
7                 4      5.1%
8                 4      5.1%
9                 4      5.1%
0                 4      5.1%
Other values (4)  5      6.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  73     93.6%
Other Letter    5      6.4%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      15     20.5%
2      14     19.2%
3      14     19.2%
4      6      8.2%
5      4      5.5%
6      4      5.5%
7      4      5.5%
8      4      5.5%
9      4      5.5%
0      4      5.5%

Other Letter
Value                  Count  Frequency (%)
(character not shown)  2      40.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  73     93.6%
Hangul  5      6.4%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      15     20.5%
2      14     19.2%
3      14     19.2%
4      6      8.2%
5      4      5.5%
6      4      5.5%
7      4      5.5%
8      4      5.5%
9      4      5.5%
0      4      5.5%

Hangul
Value                  Count  Frequency (%)
(character not shown)  2      40.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   73     93.6%
Hangul  5      6.4%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      15     20.5%
2      14     19.2%
3      14     19.2%
4      6      8.2%
5      4      5.5%
6      4      5.5%
7      4      5.5%
8      4      5.5%
9      4      5.5%
0      4      5.5%

Hangul
Value                  Count  Frequency (%)
(character not shown)  2      40.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%
(character not shown)  1      20.0%

토지용도 (land use)
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 11
Median length: 11
Mean length: 9.0454545
Min length: 1

Unique

Unique: 6
Unique (%): 13.6%

Sample

1st row: 공동주택용지
2nd row: 공동주택용지
3rd row: 단독주택용지(필지형)
4th row: 단독주택용지(필지형)
5th row: 단독주택용지(필지형)

Common Values

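Value  Count  Frequency (%)
단독주택용지(필지형) (detached housing site, lot type)  24  54.5%
상업시설용지 (commercial facility site)  7  15.9%
단독주택용지(블록형) (detached housing site, block type)  5  11.4%
공동주택용지 (multi-family housing site)  2  4.5%
종교용지 (religious site)  1  2.3%
방송통신시설용지 (broadcasting and communications facility site)  1  2.3%
기타교육시설용지 (other educational facility site)  1  2.3%
34  1  2.3%
7  1  2.3%
41  1  2.3%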
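
The frequency column is each count divided by the 44 observations. The sketch below recomputes it from the counts listed above; the variable names are illustrative. The numeric-looking values 34, 7 and 41 each occur once, which suggests (an assumption, not something the report states) that they come from summary rows at the end of the file rather than from real land-use categories.

```python
from collections import Counter

# Counts as listed in the Common Values table for 토지용도.
counts = Counter({
    "단독주택용지(필지형)": 24,
    "상업시설용지": 7,
    "단독주택용지(블록형)": 5,
    "공동주택용지": 2,
    "종교용지": 1,
    "방송통신시설용지": 1,
    "기타교육시설용지": 1,
    "34": 1,
    "7": 1,
    "41": 1,
})

n_rows = sum(counts.values())  # should equal the number of observations
frequencies = {value: round(100 * n / n_rows, 1) for value, n in counts.items()}

for value, n in counts.most_common():
    print(value, n, f"{frequencies[value]}%")
```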

Length

[Figure: histogram of lengths of the category; image not reproduced.]

Common Values (Plot)

[Figure: plot of the values tabulated under Common Values; image not reproduced.]

필지번호 (lot number)
Text

MISSING

Distinct: 41
Distinct (%): 100.0%
Missing: 3
Missing (%): 6.8%
Memory size: 484.0 B

Length

Max length: 6
Median length: 6
Mean length: 5.2195122
Min length: 3

Characters and Unicode

Total characters: 214
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 41
Unique (%): 100.0%

Sample

1st row: D-3
2nd row: D-5
3rd row: A-35-1
4th row: A-35-2
5th row: A-35-3

Value              Count  Frequency (%)
a-38-1             1      2.4%
a-49-3             1      2.4%
a-49-4             1      2.4%
a-50-1             1      2.4%
a-50-2             1      2.4%
b-2                1      2.4%
b-3                1      2.4%
b-4                1      2.4%
b-5                1      2.4%
b-6                1      2.4%
Other values (31)  31     75.6%

Most occurring characters

Value             Count  Frequency (%)
-                 72     33.6%
A                 24     11.2%
3                 21     9.8%
1                 20     9.3%
4                 15     7.0%
8                 12     5.6%
5                 11     5.1%
2                 9      4.2%
E                 7      3.3%
9                 5      2.3%
Other values (8)  18     8.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    101    47.2%
Dash Punctuation  72     33.6%
Uppercase Letter  41     19.2%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      21     20.8%
1      20     19.8%
4      15     14.9%
8      12     11.9%
5      11     10.9%
2      9      8.9%
9      5      5.0%
0      3      3.0%
6      3      3.0%
7      2      2.0%

Uppercase Letter
Value  Count  Frequency (%)
A      24     58.5%
E      7      17.1%
B      5      12.2%
D      2      4.9%
I      1      2.4%
N      1      2.4%
S      1      2.4%

Dash Punctuation
Value  Count  Frequency (%)
-      72     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  173    80.8%
Latin   41     19.2%

Most frequent character per script

Common
Value  Count  Frequency (%)
-      72     41.6%
3      21     12.1%
1      20     11.6%
4      15     8.7%
8      12     6.9%
5      11     6.4%
2      9      5.2%
9      5      2.9%
0      3      1.7%
6      3      1.7%

Latin
Value  Count  Frequency (%)
A      24     58.5%
E      7      17.1%
B      5      12.2%
D      2      4.9%
I      1      2.4%
N      1      2.4%
S      1      2.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  214    100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
-                 72     33.6%
A                 24     11.2%
3                 21     9.8%
1                 20     9.3%
4                 15     7.0%
8                 12     5.6%
5                 11     5.1%
2                 9      4.2%
E                 7      3.3%
9                 5      2.3%
Other values (8)  18     8.4%

면적(㎡) (area in ㎡)
Real number (ℝ)

HIGH CORRELATION

Distinct: 39
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12090.75
Minimum: 262
Maximum: 177331
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 528.0 B

Quantile statistics

Minimum: 262
5-th percentile: 276
Q1: 290.25
Median: 442.5
Q3: 5130.425
95-th percentile: 67990.41
Maximum: 177331
Range: 177069
Interquartile range (IQR): 4840.175

Descriptive statistics

Standard deviation: 33226.879
Coefficient of variation (CV): 2.7481239
Kurtosis: 15.442563
Mean: 12090.75
Median Absolute Deviation (MAD): 172
Skewness: 3.7844955
Sum: 531993
Variance: 1.1040255 × 10^9
Monotonicity: Not monotonic
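
Several of these figures are derived from the others, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and mean from the reported values for 면적(㎡); the variable names are illustrative.

```python
# Summary figures for 면적(㎡) copied from this report.
minimum, maximum = 262, 177331
q1, q3 = 290.25, 5130.425
mean, std = 12090.75, 33226.879
total, n = 531993, 44

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
mean_check = total / n           # Mean = Sum / number of observations

print(value_range, round(iqr, 3), round(cv, 4), f"{variance:.4e}", mean_check)
```

The mean (12090.75) is far above the median (442.5), consistent with the strong positive skewness reported above: a few very large parcels dominate the total area.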
Histogram with fixed size bins (bins=39)
Most common values

Value              Count  Frequency (%)
1232.2             3      6.8%
291.0              2      4.5%
286.0              2      4.5%
276.0              2      4.5%
1232.5             1      2.3%
9098.0             1      2.3%
5735.0             1      2.3%
6657.0             1      2.3%
6574.0             1      2.3%
2464.7             1      2.3%
Other values (29)  29     65.9%

Smallest 10 values

Value  Count  Frequency (%)
262.0  1      2.3%
265.0  1      2.3%
276.0  2      4.5%
277.0  1      2.3%
278.0  1      2.3%
280.0  1      2.3%
282.0  1      2.3%
286.0  2      4.5%
288.0  1      2.3%
291.0  2      4.5%

Largest 10 values

Value     Count  Frequency (%)
177331.0  1      2.3%
108612.4  1      2.3%
68718.6   1      2.3%
63864.0   1      2.3%
37093.5   1      2.3%
12818.4   1      2.3%
9098.0    1      2.3%
7841.0    1      2.3%
6657.0    1      2.3%
6574.0    1      2.3%

낙찰가격 (winning bid price)
Text

MISSING

Distinct: 36
Distinct (%): 87.8%
Missing: 3
Missing (%): 6.8%
Memory size: 484.0 B

Length

Max length: 11
Median length: 9
Mean length: 8.2439024
Min length: 2

Characters and Unicode

Total characters: 338
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 35
Unique (%): 85.4%
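
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts the values that occur exactly once. Both percentages are taken over the non-missing cells. The sketch below reproduces the figures for this column with a hypothetical stand-in (35 one-off placeholder prices, "유찰" six times, and 3 missing cells); `distinct_and_unique` is an illustrative helper, not part of the profiler.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing values (None)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once
    return distinct, unique

# Hypothetical stand-in for the column, not the real prices.
column = [str(1000 + i) for i in range(35)] + ["유찰"] * 6 + [None] * 3

distinct, unique = distinct_and_unique(column)
present = sum(v is not None for v in column)
print(distinct, unique, f"{100 * distinct / present:.1f}%", f"{100 * unique / present:.1f}%")
# 36 35 87.8% 85.4%
```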

Sample

1st row: 18110000000
2nd row: 144290000
3rd row: 140320000
4th row: 140110000
5th row: 140130000

Value              Count  Frequency (%)
유찰 (failed bid)   6      14.6%
103060000          1      2.4%
172000000          1      2.4%
1997200000         1      2.4%
170000000          1      2.4%
137000000          1      2.4%
146990000          1      2.4%
201111000          1      2.4%
171000999          1      2.4%
18110000000        1      2.4%
Other values (26)  26     63.4%

Most occurring characters

Value             Count  Frequency (%)
0                 154    45.6%
1                 48     14.2%
2                 23     6.8%
3                 18     5.3%
8                 18     5.3%
9                 16     4.7%
4                 15     4.4%
6                 13     3.8%
7                 12     3.6%
5                 9      2.7%
Other values (2)  12     3.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  326    96.4%
Other Letter    12     3.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      154    47.2%
1      48     14.7%
2      23     7.1%
3      18     5.5%
8      18     5.5%
9      16     4.9%
4      15     4.6%
6      13     4.0%
7      12     3.7%
5      9      2.8%

Other Letter
Value                  Count  Frequency (%)
(character not shown)  6      50.0%
(character not shown)  6      50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  326    96.4%
Hangul  12     3.6%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      154    47.2%
1      48     14.7%
2      23     7.1%
3      18     5.5%
8      18     5.5%
9      16     4.9%
4      15     4.6%
6      13     4.0%
7      12     3.7%
5      9      2.8%

Hangul
Value                  Count  Frequency (%)
(character not shown)  6      50.0%
(character not shown)  6      50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   326    96.4%
Hangul  12     3.6%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      154    47.2%
1      48     14.7%
2      23     7.1%
3      18     5.5%
8      18     5.5%
9      16     4.9%
4      15     4.6%
6      13     4.0%
7      12     3.7%
5      9      2.8%

Hangul
Value                  Count  Frequency (%)
(character not shown)  6      50.0%
(character not shown)  6      50.0%

낙찰단가(원/㎡) (winning bid unit price, KRW per ㎡)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 542811.71
Minimum: 300015
Maximum: 1322516
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 528.0 B

Quantile statistics

Minimum: 300015
5-th percentile: 340282.1
Q1: 439751
Median: 507717
Q3: 537434.5
95-th percentile: 958930
Maximum: 1322516
Range: 1022501
Interquartile range (IQR): 97683.5

Descriptive statistics

Standard deviation: 209054.34
Coefficient of variation (CV): 0.38513234
Kurtosis: 6.1737757
Mean: 542811.71
Median Absolute Deviation (MAD): 60340
Skewness: 2.284462
Sum: 18998410
Variance: 4.3703719 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
Most common values

Value              Count  Frequency (%)
504510             1      2.3%
331719             1      2.3%
366559             1      2.3%
456034             1      2.3%
437343             1      2.3%
442159             1      2.3%
300015             1      2.3%
886900             1      2.3%
676200             1      2.3%
568057             1      2.3%
Other values (25)  25     56.8%
(Missing)          9      20.5%

Smallest 10 values

Value   Count  Frequency (%)
300015  1      2.3%
331719  1      2.3%
343952  1      2.3%
350544  1      2.3%
359100  1      2.3%
366559  1      2.3%
380071  1      2.3%
421836  1      2.3%
437343  1      2.3%
442159  1      2.3%

Largest 10 values

Value    Count  Frequency (%)
1322516  1      2.3%
1127000  1      2.3%
886900   1      2.3%
762848   1      2.3%
676200   1      2.3%
670936   1      2.3%
580230   1      2.3%
568057   1      2.3%
544810   1      2.3%
530059   1      2.3%

낙찰단가(원/3.3㎡) (winning bid unit price, KRW per 3.3 ㎡)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 35
Distinct (%): 100.0%
Missing: 9
Missing (%): 20.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1794418.9
Minimum: 991785
Maximum: 4371953
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 528.0 B

Quantile statistics

Minimum: 991785
5-th percentile: 1124899.3
Q1: 1453722.5
Median: 1678405
Q3: 1776643
95-th percentile: 3170016.7
Maximum: 4371953
Range: 3380168
Interquartile range (IQR): 322920.5

Descriptive statistics

Standard deviation: 691088.67
Coefficient of variation (CV): 0.3851323
Kurtosis: 6.1737731
Mean: 1794418.9
Median Absolute Deviation (MAD): 199469
Skewness: 2.2844616
Sum: 62804661
Variance: 4.7760355 × 10^11
Monotonicity: Not monotonic
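
This column appears to be a fixed rescaling of 낙찰단가(원/㎡), which would explain why the two share the same coefficient of variation, skewness, and kurtosis and correlate at 1.000. The sketch below assumes that "3.3㎡" denotes one pyeong, taken as 400/121 ㎡ (about 3.3058 ㎡), and that both unit prices are derived from 낙찰가격 and 면적(㎡) and rounded to the nearest won. Neither assumption is stated in the report; they are inferred from the rows in the Sample section, and `unit_prices` is a hypothetical helper.

```python
# Assumption: "3.3㎡" means one pyeong, taken here as 400/121 ㎡ (about 3.3058 ㎡).
SQM_PER_PYEONG = 400 / 121

def unit_prices(price_krw, area_sqm):
    """Return (KRW per ㎡, KRW per 3.3㎡), each rounded to the nearest won."""
    per_sqm = price_krw / area_sqm
    per_pyeong = per_sqm * SQM_PER_PYEONG
    return round(per_sqm), round(per_pyeong)

# (낙찰가격, 면적) pairs taken from the Sample section of this report.
rows = {
    "D-3": (18110000000, 37093.5),
    "A-35-1": (144290000, 286.0),
    "E-1-13": (1630000789, 1232.5),
}
for lot, (price, area) in rows.items():
    print(lot, unit_prices(price, area))
# D-3 (488226, 1613969)
# A-35-1 (504510, 1667803)
# E-1-13 (1322516, 4371953)
```

Under these assumptions the computed pairs match the values shown for those lots in the Sample section, and the ratio of the two reported means (1794418.9 / 542811.71) is also about 3.3058.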
Histogram with fixed size bins (bins=35)
Most common values

Value              Count  Frequency (%)
1667803            1      2.3%
1096592            1      2.3%
1211764            1      2.3%
1507550            1      2.3%
1445761            1      2.3%
1461684            1      2.3%
991785             1      2.3%
2931901            1      2.3%
2235372            1      2.3%
1877874            1      2.3%
Other values (25)  25     56.8%
(Missing)          9      20.5%

Smallest 10 values

Value    Count  Frequency (%)
991785   1      2.3%
1096592  1      2.3%
1137031  1      2.3%
1158824  1      2.3%
1187107  1      2.3%
1211764  1      2.3%
1256434  1      2.3%
1394500  1      2.3%
1445761  1      2.3%
1461684  1      2.3%

Largest 10 values

Value    Count  Frequency (%)
4371953  1      2.3%
3725620  1      2.3%
2931901  1      2.3%
2521811  1      2.3%
2235372  1      2.3%
2217971  1      2.3%
1918116  1      2.3%
1877874  1      2.3%
1801025  1      2.3%
1752261  1      2.3%

Interactions

[Figure: nine pairwise interaction plots for the three numeric variables; images not reproduced.]

Correlations

[Figure: correlation heatmap; image not reproduced.]

                    연번     토지용도  필지번호  면적(㎡)  낙찰가격  낙찰단가(원/㎡)  낙찰단가(원/3.3㎡)
연번                 1.000  1.000    1.000    1.000    1.000    1.000          1.000
토지용도              1.000  1.000    1.000    0.992    0.000    0.448          0.448
필지번호              1.000  1.000    1.000    1.000    1.000    1.000          1.000
면적(㎡)              1.000  0.992    1.000    1.000    1.000    0.000          0.000
낙찰가격              1.000  0.000    1.000    1.000    1.000    1.000          1.000
낙찰단가(원/㎡)        1.000  0.448    1.000    0.000    1.000    1.000          1.000
낙찰단가(원/3.3㎡)     1.000  0.448    1.000    0.000    1.000    1.000          1.000

[Figure: correlation heatmap; image not reproduced.]

                    면적(㎡)  낙찰단가(원/㎡)  낙찰단가(원/3.3㎡)  토지용도
면적(㎡)              1.000   0.240          0.240            0.816
낙찰단가(원/㎡)        0.240   1.000          1.000            0.197
낙찰단가(원/3.3㎡)     0.240   1.000          1.000            0.197
토지용도              0.816   0.197          0.197            1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First 10 rows

Index  연번  토지용도              필지번호  면적(㎡)   낙찰가격       낙찰단가(원/㎡)  낙찰단가(원/3.3㎡)
0      1    공동주택용지           D-3      37093.5   18110000000  488226         1613969
1      2    공동주택용지           D-5      63864.0   <NA>         <NA>           <NA>
2      3    단독주택용지(필지형)    A-35-1   286.0     144290000    504510         1667803
3      4    단독주택용지(필지형)    A-35-2   277.0     140320000    506570         1674613
4      5    단독주택용지(필지형)    A-35-3   276.0     140110000    507645         1678165
5      6    단독주택용지(필지형)    A-35-4   276.0     140130000    507717         1678405
6      7    단독주택용지(필지형)    A-35-5   278.0     141150000    507734         1678459
7      8    단독주택용지(필지형)    A-36-4   262.0     122200000    466412         1541859
8      9    단독주택용지(필지형)    A-37-2   265.0     138920000    524226         1732980
9      10   단독주택용지(필지형)    A-38-1   282.0     142730000    506135         1673173

Last 10 rows

Index  연번          토지용도           필지번호  면적(㎡)    낙찰가격       낙찰단가(원/㎡)  낙찰단가(원/3.3㎡)
34     35           상업시설용지        E-1-11   2465.0     1653858000   670936         2217971
35     36           상업시설용지        E-1-12   1232.2     1388689400   1127000        3725620
36     37           상업시설용지        E-1-13   1232.5     1630000789   1322516        4371953
37     38           상업시설용지        E-1-14   4928.9     3760001000   762848         2521811
38     39           종교용지           N-2      2682.0     유찰          <NA>           <NA>
39     40           방송통신시설용지     S-1      1981.4     711520740    359100         1187107
40     41           기타교육시설용지     I-1      12818.4    유찰          <NA>           <NA>
41     낙찰          34                <NA>     68718.6    36424910228  530059         1752261
42     유찰          7                 <NA>     108612.4   <NA>         <NA>           <NA>
43     (not shown)  41                <NA>     177331.0   <NA>         <NA>           <NA>
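
The last three rows look like summary rows rather than lots: they have no 필지번호, their 토지용도 values (34, 7, 41) read as counts of sold, unsold, and all lots, and their areas add up (68718.6 + 108612.4 = 177331.0). Assuming that reading is right, a cleaning step before analysis might drop those rows and convert 낙찰가격 to a number, treating "유찰" (failed bid) as missing. The sketch below is one way to do that in plain Python; `parse_price` and `is_lot_row` are hypothetical helpers and only a few rows are transcribed.

```python
def parse_price(raw):
    """Convert a 낙찰가격 cell to int; missing cells and "유찰" become None."""
    if raw is None or raw == "유찰":
        return None
    return int(raw)

def is_lot_row(row):
    """Keep rows that carry a lot number; the trailing summary rows do not."""
    return row["필지번호"] is not None

# A few rows transcribed from the sample above (None stands for <NA>).
rows = [
    {"연번": "1", "필지번호": "D-3", "낙찰가격": "18110000000"},
    {"연번": "2", "필지번호": "D-5", "낙찰가격": None},
    {"연번": "41", "필지번호": "I-1", "낙찰가격": "유찰"},
    {"연번": "낙찰", "필지번호": None, "낙찰가격": "36424910228"},
    {"연번": "유찰", "필지번호": None, "낙찰가격": None},
]

# Drop summary rows, then replace the text price with a numeric one.
lots = [
    {**row, "낙찰가격": parse_price(row["낙찰가격"])}
    for row in rows
    if is_lot_row(row)
]
print(lots)
```

Note that filtering on 필지번호 alone would also drop any genuine lot whose number is missing, so the rule should be checked against the full file before it is relied on.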