Overview

Dataset statistics

Number of variables: 3
Number of observations: 32
Missing cells: 4
Missing cells (%): 4.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 964.0 B
Average record size in memory: 30.1 B
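
The percentage and per-record figures above follow directly from the raw counts. A minimal Python sketch of that arithmetic, using only the numbers reported in this section (the variable names are illustrative and are not part of the report):

```python
# Counts reported under "Dataset statistics".
n_variables = 3
n_observations = 32
missing_cells = 4
total_bytes = 964.0

# Every cell is one (row, column) pair.
total_cells = n_variables * n_observations

# Share of cells that are missing, and memory used per row.
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

The four missing cells are the two missing values in each of the two numeric columns listed under Alerts.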

Variable types

Text: 1
Numeric: 2

Dataset

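Description: Land use plan table for the 3rd-generation new towns (3기신도시) managed by the Korea Land and Housing Corporation (한국토지주택공사). It provides the area and composition ratio of the overall land use plan and of each category, such as housing construction land, detached housing, and apartment housing.
URL: https://www.data.go.kr/data/15083597/fileData.do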

Alerts

[High correlation] 면적(천m2) is highly overall correlated with 구성비(비율)
[High correlation] 구성비(비율) is highly overall correlated with 면적(천m2)
[Missing] 면적(천m2) has 2 (6.2%) missing values
[Missing] 구성비(비율) has 2 (6.2%) missing values
[Zeros] 구성비(비율) has 1 (3.1%) zeros

Reproduction

Analysis started: 2023-12-12 06:37:55.753646
Analysis finished: 2023-12-12 06:37:56.794854
Duration: 1.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

토지이용계획 (land use plan)
Text

Distinct: 31
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Length

Max length: 12
Median length: 9
Mean length: 5.84375
Min length: 3

Characters and Unicode

Total characters: 187
Distinct characters: 64
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 93.8%
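
The report separates "distinct" (how many different values occur) from "unique" (how many values occur exactly once). With 32 rows, 31 distinct values and 30 unique values, exactly one label must appear twice. The sketch below illustrates the two counts; the labels are made-up stand-ins, not the actual column contents, and `distinct_and_unique` is a hypothetical helper name.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical stand-in data: 30 labels that occur once, plus one label that occurs twice.
labels = [f"label_{i}" for i in range(30)] + ["repeated", "repeated"]

distinct, unique = distinct_and_unique(labels)
print(len(labels), distinct, unique)
```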

Sample

1st row: 주택건설용지
2nd row: 단독주택 일반형
3rd row: 단독주택 블록형
4th row: 공동주택
5th row: 근린생활시설
Value              Count  Frequency (%)
도시지원           3      6.0%
복합용지           3      6.0%
소계               3      6.0%
[not extracted]    2      4.0%
단독주택           2      4.0%
[not extracted]    2      4.0%
교육시설           1      2.0%
커뮤니티시설       1      2.0%
종교시설           1      2.0%
위험물저장         1      2.0%
Other values (31)  31     62.0%

Most occurring characters

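Value              Count  Frequency (%)
(space)            18     9.6%
[not extracted]    14     7.5%
[not extracted]    14     7.5%
[not extracted]    13     7.0%
[not extracted]    9      4.8%
[not extracted]    8      4.3%
[not extracted]    6      3.2%
[not extracted]    5      2.7%
[not extracted]    5      2.7%
[not extracted]    5      2.7%
Other values (54)  90     48.1%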

Most occurring categories

Value            Count  Frequency (%)
Other Letter     169    90.4%
Space Separator  18     9.6%

Most frequent character per category

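Other Letter

Value              Count  Frequency (%)
[not extracted]    14     8.3%
[not extracted]    14     8.3%
[not extracted]    13     7.7%
[not extracted]    9      5.3%
[not extracted]    8      4.7%
[not extracted]    6      3.6%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
Other values (53)  85     50.3%

Space Separator

Value    Count  Frequency (%)
(space)  18     100.0%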

Most occurring scripts

Value   Count  Frequency (%)
Hangul  169    90.4%
Common  18     9.6%

Most frequent character per script

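Hangul

Value              Count  Frequency (%)
[not extracted]    14     8.3%
[not extracted]    14     8.3%
[not extracted]    13     7.7%
[not extracted]    9      5.3%
[not extracted]    8      4.7%
[not extracted]    6      3.6%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
Other values (53)  85     50.3%

Common

Value    Count  Frequency (%)
(space)  18     100.0%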

Most occurring blocks

Value   Count  Frequency (%)
Hangul  169    90.4%
ASCII   18     9.6%

Most frequent character per block

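ASCII

Value    Count  Frequency (%)
(space)  18     100.0%

Hangul

Value              Count  Frequency (%)
[not extracted]    14     8.3%
[not extracted]    14     8.3%
[not extracted]    13     7.7%
[not extracted]    9      5.3%
[not extracted]    8      4.7%
[not extracted]    6      3.6%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
[not extracted]    5      3.0%
Other values (53)  85     50.3%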

면적(천m2) (area, thousand m²)
Real number (ℝ)

Alerts: HIGH CORRELATION, MISSING

Distinct: 30
Distinct (%): 100.0%
Missing: 2
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 283.01673
Minimum: 1.02
Maximum: 2672.805
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 1.02
5-th percentile: 2.0766
Q1: 20.48125
Median: 35.728
Q3: 464.91775
95-th percentile: 803.40145
Maximum: 2672.805
Range: 2671.785
Interquartile range (IQR): 444.4365
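
The report does not state how the percentiles are interpolated. The sketch below assumes linear interpolation between neighbouring ranks (the default of pandas' `quantile`) and uses only the ten lowest and ten highest values listed for this variable, together with the 30 non-missing observations. Under that assumption the 5th percentile, Q1, Q3 and the 95th percentile come out equal to the reported figures; the median cannot be checked this way because the middle ten values are not listed. `quantile_linear` and `by_rank` are illustrative names.

```python
import math

N = 30  # non-missing values: 32 observations minus 2 missing

# Ten lowest (ascending) and ten highest (descending) values from the report.
lowest = [1.02, 1.812, 2.4, 3.794, 4.903, 12.0, 15.29, 19.456, 23.557, 23.758]
highest = [2672.805, 900.556, 684.657, 680.515, 660.899,
           658.909, 584.608, 539.27, 241.861, 207.758]

# Map each known 0-based rank in the sorted column to its value.
by_rank = {rank: value for rank, value in enumerate(lowest)}
by_rank.update({N - 1 - i: value for i, value in enumerate(highest)})

def quantile_linear(q):
    """Quantile q (0..1) by linear interpolation between neighbouring ranks."""
    pos = q * (N - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, N - 1)
    frac = pos - lo
    return by_rank[lo] + frac * (by_rank[hi] - by_rank[lo])

p5, q1, q3, p95 = (quantile_linear(q) for q in (0.05, 0.25, 0.75, 0.95))
print(round(p5, 5), round(q1, 5), round(q3, 5), round(p95, 5))
print("IQR:", round(q3 - q1, 5))
```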

Descriptive statistics

Standard deviation: 529.63944
Coefficient of variation (CV): 1.8714068
Kurtosis: 14.426006
Mean: 283.01673
Median Absolute Deviation (MAD): 33.622
Skewness: 3.4397495
Sum: 8490.502
Variance: 280517.93
Monotonicity: Not monotonic
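
Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of non-missing values, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short sketch that checks the reported figures against each other, assuming 30 non-missing values (32 observations minus 2 missing):

```python
# Figures reported for 면적(천m2).
n = 30                  # non-missing observations (assumed: 32 - 2)
total = 8490.502        # Sum
std = 529.63944         # Standard deviation

mean = total / n        # reported as 283.01673
variance = std ** 2     # reported as 280517.93
cv = std / mean         # reported as 1.8714068

print(f"mean={mean:.5f} variance={variance:.2f} cv={cv:.7f}")
```

Small differences in the last digit are expected, because the inputs are themselves rounded values taken from the report.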
Histogram with fixed size bins (bins=30)
Common values

Value              Count  Frequency (%)
38.545             1      3.1%
83.059             1      3.1%
1.812              1      3.1%
539.27             1      3.1%
31.85              1      3.1%
12.0               1      3.1%
2.4                1      3.1%
3.794              1      3.1%
4.903              1      3.1%
15.29              1      3.1%
Other values (20)  20     62.5%
(Missing)          2      6.2%

Minimum 10 values

Value    Count  Frequency (%)
1.02     1      3.1%
1.812    1      3.1%
2.4      1      3.1%
3.794    1      3.1%
4.903    1      3.1%
12.0     1      3.1%
15.29    1      3.1%
19.456   1      3.1%
23.557   1      3.1%
23.758   1      3.1%

Maximum 10 values

Value     Count  Frequency (%)
2672.805  1      3.1%
900.556   1      3.1%
684.657   1      3.1%
680.515   1      3.1%
660.899   1      3.1%
658.909   1      3.1%
584.608   1      3.1%
539.27    1      3.1%
241.861   1      3.1%
207.758   1      3.1%

구성비(비율) (composition ratio)
Real number (ℝ)

Alerts: HIGH CORRELATION, MISSING, ZEROS

Distinct: 22
Distinct (%): 73.3%
Missing: 2
Missing (%): 6.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.49
Minimum: 0
Maximum: 80.2
Zeros: 1
Zeros (%): 3.1%
Negative: 0
Negative (%): 0.0%
Memory size: 420.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0.1
Q1: 0.625
Median: 1.05
Q3: 13.95
95-th percentile: 24.075
Maximum: 80.2
Range: 80.2
Interquartile range (IQR): 13.325

Descriptive statistics

Standard deviation: 15.890712
Coefficient of variation (CV): 1.8716975
Kurtosis: 14.43477
Mean: 8.49
Median Absolute Deviation (MAD): 0.95
Skewness: 3.4410216
Sum: 254.7
Variance: 252.51472
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Common values

Value              Count  Frequency (%)
0.1                4      12.5%
19.8               2      6.2%
1.1                2      6.2%
0.8                2      6.2%
1.0                2      6.2%
0.7                2      6.2%
80.2               1      3.1%
0.0                1      3.1%
2.5                1      3.1%
16.2               1      3.1%
Other values (12)  12     37.5%
(Missing)          2      6.2%

Minimum 10 values

Value  Count  Frequency (%)
0.0    1      3.1%
0.1    4      12.5%
0.4    1      3.1%
0.5    1      3.1%
0.6    1      3.1%
0.7    2      6.2%
0.8    2      6.2%
0.9    1      3.1%
1.0    2      6.2%
1.1    2      6.2%

Maximum 10 values

Value  Count  Frequency (%)
80.2   1      3.1%
27.0   1      3.1%
20.5   1      3.1%
20.4   1      3.1%
19.8   2      6.2%
17.6   1      3.1%
16.2   1      3.1%
7.2    1      3.1%
6.2    1      3.1%
3.9    1      3.1%

Interactions

[Figures: 4 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap; image not preserved]

              토지이용계획  면적(천m2)  구성비(비율)
토지이용계획  1.000         1.000       1.000
면적(천m2)    1.000         1.000       1.000
구성비(비율)  1.000         1.000       1.000

[Figure: correlation heatmap; image not preserved]

              면적(천m2)  구성비(비율)
면적(천m2)    1.000       0.998
구성비(비율)  0.998       1.000
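
The extracted text does not say which correlation coefficient each matrix reports. As an illustration only, the sketch below computes a plain Pearson coefficient for 면적(천m2) and 구성비(비율) over the 20 rows shown in the Sample section, dropping the two rows where both values are missing. Because it uses Pearson and only a subset of the 32 rows, it is not expected to reproduce 0.998 exactly; it only shows that the two columns move together almost perfectly. `pearson` is a hypothetical helper, not a ydata-profiling function.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# (area in thousand m2, composition ratio) for sample rows 0-9 and 22-31.
# None marks the <NA> entries of rows 25 and 26.
sample = [
    (658.909, 19.8), (31.288, 0.9), (19.456, 0.6), (584.608, 17.6),
    (23.557, 0.7), (2672.805, 80.2), (37.353, 1.1), (241.861, 7.2),
    (207.758, 6.2), (34.103, 1.0), (4.903, 0.1), (3.794, 0.1),
    (2.4, 0.1), (None, None), (None, None), (12.0, 0.4),
    (31.85, 1.0), (539.27, 16.2), (1.812, 0.1), (83.059, 2.5),
]

# Keep only rows where both values are present.
pairs = [(a, r) for a, r in sample if a is not None and r is not None]
areas = [a for a, _ in pairs]
ratios = [r for _, r in pairs]

r = pearson(areas, ratios)
print(f"rows used: {len(pairs)}, Pearson r = {r:.4f}")
```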

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

    토지이용계획             면적(천m2)  구성비(비율)
0   주택건설용지             658.909     19.8
1   단독주택 일반형          31.288      0.9
2   단독주택 블록형          19.456      0.6
3   공동주택                 584.608     17.6
4   근린생활시설             23.557      0.7
5   공공시설용지             2672.805    80.2
6   상업시설                 37.353      1.1
7   복합용지 소계            241.861     7.2
8   복합용지 주상복합        207.758     6.2
9   복합용지 근린업무복합    34.103      1.0

Last rows

    토지이용계획             면적(천m2)  구성비(비율)
22  종교시설                 4.903       0.1
23  위험물저장 및 처리시설   3.794       0.1
24  전기공급설비             2.4         0.1
25  유수지                   <NA>        <NA>
26  재활용시설               <NA>        <NA>
27  자동차정류장             12.0        0.4
28  주차장                   31.85       1.0
29  도 로                    539.27      16.2
30  광 장                    1.812       0.1
31  유보지                   83.059      2.5