Overview

Dataset statistics

Number of variables: 6
Number of observations: 220
Missing cells: 11
Missing cells (%): 0.8%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 10.7 KiB
Average record size in memory: 49.6 B
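
The two percentages above can be reproduced from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and the duplicate percentage over all rows; the helper name `percent` is hypothetical and not part of the report.

```python
# Values copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 220
missing_cells = 11
duplicate_rows = 1


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = percent(duplicate_rows, n_observations)

print(missing_pct, duplicate_pct)
```

Under these assumptions the results round to the 0.8% and 0.5% shown in the report.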

Variable types

Categorical: 3
Text: 2
Numeric: 1

Dataset

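Description: Provides data on green spaces in Gimpo-si, including green space type (녹지구분), project district (사업지구), registered green space name (녹지등록명), address (주소), area (면적), and data reference date (데이터기준일자), among other fields.
Author: 경기도 김포시 (Gimpo-si, Gyeonggi-do)
URL: https://www.data.go.kr/data/15113607/fileData.do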

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
사업지구 is highly overall correlated with 데이터기준일자 [High correlation]
녹지구분 is highly overall correlated with 데이터기준일자 [High correlation]
데이터기준일자 is highly overall correlated with 면적(제곱미터) and 2 other fields [High correlation]
면적(제곱미터) is highly overall correlated with 데이터기준일자 [High correlation]
녹지구분 is highly imbalanced (51.4%) [Imbalance]
데이터기준일자 is highly imbalanced (86.9%) [Imbalance]
녹지등록명 has 3 (1.4%) missing values [Missing]
주소 has 4 (1.8%) missing values [Missing]
면적(제곱미터) has 4 (1.8%) missing values [Missing]
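
The two imbalance percentages are consistent with one minus the Shannon entropy of the category counts, normalized by log2 of the number of categories. The sketch below is an illustration under that assumption, not a statement of how the profiling tool computes the score; the function name `imbalance_score` and the handling of single-class inputs are choices made for this example. The counts are the category counts reported for 녹지구분 and 데이터기준일자.

```python
import math


def imbalance_score(counts):
    """One minus the normalized Shannon entropy of a list of class counts."""
    k = len(counts)
    if k < 2:
        # Convention chosen for this sketch: a single class scores 0.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# 녹지구분: 완충녹지 171, 경관녹지 39, 연결녹지 6, <NA> 4
green_type_score = imbalance_score([171, 39, 6, 4])
# 데이터기준일자: 2024-04-18 216, <NA> 4
reference_date_score = imbalance_score([216, 4])

print(f"{green_type_score:.1%}", f"{reference_date_score:.1%}")
```

A score of 0 corresponds to perfectly balanced classes and a score near 1 to a single dominant class, which matches the near-constant 데이터기준일자 column.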

Reproduction

Analysis started: 2024-04-29 23:10:05.213660
Analysis finished: 2024-04-29 23:10:07.207654
Duration: 1.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
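
The duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamps copied from this block:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-04-29 23:10:05.213660")
finished = datetime.fromisoformat("2024-04-29 23:10:07.207654")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```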

Variables

녹지구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

완충녹지: 171
경관녹지: 39
연결녹지: 6
<NA>: 4

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 완충녹지
2nd row: 완충녹지
3rd row: 완충녹지
4th row: 완충녹지
5th row: 완충녹지

Common Values

Value | Count | Frequency (%)
완충녹지 | 171 | 77.7%
경관녹지 | 39 | 17.7%
연결녹지 | 6 | 2.7%
<NA> | 4 | 1.8%
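
The frequency column is each count divided by the total number of rows. A short sketch, assuming the four counts above cover all 220 observations:

```python
# Counts copied from the Common Values table for 녹지구분.
counts = {"완충녹지": 171, "경관녹지": 39, "연결녹지": 6, "<NA>": 4}

total = sum(counts.values())

# Frequency of each value as a percentage of all rows, one decimal place.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequencies.items():
    print(f"{value} | {counts[value]} | {pct}%")
```

Note that `<NA>` is counted here as an ordinary category value, which is consistent with the variable reporting 0 missing values.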

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

사업지구
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

마송: 48
양곡: 44
학운: 25
신곡: 21
김포한강: 10
Other values (19): 72

Length

Max length: 4
Median length: 2
Mean length: 2.2409091
Min length: 2

Unique

Unique: 4
Unique (%): 1.8%

Sample

1st row: 고촌
2nd row: 고촌
3rd row: 청송
4th row: 걸포
5th row: 걸포

Common Values

Value | Count | Frequency (%)
마송 | 48 | 21.8%
양곡 | 44 | 20.0%
학운 | 25 | 11.4%
신곡 | 21 | 9.5%
김포한강 | 10 | 4.5%
걸포 | 9 | 4.1%
고촌물류 | 9 | 4.1%
향산 | 8 | 3.6%
고촌 | 7 | 3.2%
대포 | 5 | 2.3%
Other values (14) | 34 | 15.5%

Length

[Figure: histogram of lengths of the category]

녹지등록명
Text

MISSING 

Distinct: 217
Distinct (%): 100.0%
Missing: 3
Missing (%): 1.4%
Memory size: 1.8 KiB

Length

Max length: 20
Median length: 18
Mean length: 13.857143
Min length: 1

Characters and Unicode

Total characters: 3007
Distinct characters: 75
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 217
Unique (%): 100.0%

Sample

1st row: 고촌수기 (완충녹지 제25호)
2nd row: 고촌수기 (완충녹지 제26호)
3rd row: 청송현대 (완충녹지95호)
4th row: 걸포지구 (완충녹지27호)
5th row: 걸포지구 (완충녹지28호)

Value | Count | Frequency (%)
마송택지 | 37 | 8.7%
완충녹지 | 28 | 6.6%
양곡택지 | 24 | 5.7%
신곡6지구 | 18 | 4.2%
학운3산업단지 | 10 | 2.4%
고촌물류단지 | 9 | 2.1%
향산지구 | 8 | 1.9%
완충녹지4호 | 7 | 1.7%
완충녹지5호 | 7 | 1.7%
걸포지구 | 6 | 1.4%
Other values (178) | 270 | 63.7%

Most occurring characters

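Value | Count | Frequency (%)
(character not recovered) | 404 | 13.4%
(character not recovered) | 215 | 7.1%
(character not recovered) | 215 | 7.1%
(space) | 209 | 7.0%
(character not recovered) | 171 | 5.7%
(character not recovered) | 171 | 5.7%
) | 169 | 5.6%
( | 169 | 5.6%
(character not recovered) | 107 | 3.6%
1 | 84 | 2.8%
Other values (65) | 1093 | 36.3%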

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2036 | 67.7%
Decimal Number | 420 | 14.0%
Space Separator | 209 | 7.0%
Close Punctuation | 169 | 5.6%
Open Punctuation | 169 | 5.6%
Dash Punctuation | 2 | 0.1%
Other Punctuation | 2 | 0.1%

Most frequent character per category

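Other Letter
Value | Count | Frequency (%)
(character not recovered) | 404 | 19.8%
(character not recovered) | 215 | 10.6%
(character not recovered) | 215 | 10.6%
(character not recovered) | 171 | 8.4%
(character not recovered) | 171 | 8.4%
(character not recovered) | 107 | 5.3%
(character not recovered) | 59 | 2.9%
(character not recovered) | 49 | 2.4%
(character not recovered) | 48 | 2.4%
(character not recovered) | 48 | 2.4%
Other values (50) | 549 | 27.0%

Decimal Number
Value | Count | Frequency (%)
1 | 84 | 20.0%
2 | 68 | 16.2%
3 | 62 | 14.8%
4 | 55 | 13.1%
6 | 45 | 10.7%
5 | 36 | 8.6%
7 | 25 | 6.0%
8 | 19 | 4.5%
0 | 14 | 3.3%
9 | 12 | 2.9%

Space Separator
Value | Count | Frequency (%)
(space) | 209 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 169 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 169 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%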

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2036 | 67.7%
Common | 971 | 32.3%

Most frequent character per script

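Hangul
Value | Count | Frequency (%)
(character not recovered) | 404 | 19.8%
(character not recovered) | 215 | 10.6%
(character not recovered) | 215 | 10.6%
(character not recovered) | 171 | 8.4%
(character not recovered) | 171 | 8.4%
(character not recovered) | 107 | 5.3%
(character not recovered) | 59 | 2.9%
(character not recovered) | 49 | 2.4%
(character not recovered) | 48 | 2.4%
(character not recovered) | 48 | 2.4%
Other values (50) | 549 | 27.0%

Common
Value | Count | Frequency (%)
(space) | 209 | 21.5%
) | 169 | 17.4%
( | 169 | 17.4%
1 | 84 | 8.7%
2 | 68 | 7.0%
3 | 62 | 6.4%
4 | 55 | 5.7%
6 | 45 | 4.6%
5 | 36 | 3.7%
7 | 25 | 2.6%
Other values (5) | 49 | 5.0%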

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2036 | 67.7%
ASCII | 971 | 32.3%

Most frequent character per block

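Hangul
Value | Count | Frequency (%)
(character not recovered) | 404 | 19.8%
(character not recovered) | 215 | 10.6%
(character not recovered) | 215 | 10.6%
(character not recovered) | 171 | 8.4%
(character not recovered) | 171 | 8.4%
(character not recovered) | 107 | 5.3%
(character not recovered) | 59 | 2.9%
(character not recovered) | 49 | 2.4%
(character not recovered) | 48 | 2.4%
(character not recovered) | 48 | 2.4%
Other values (50) | 549 | 27.0%

ASCII
Value | Count | Frequency (%)
(space) | 209 | 21.5%
) | 169 | 17.4%
( | 169 | 17.4%
1 | 84 | 8.7%
2 | 68 | 7.0%
3 | 62 | 6.4%
4 | 55 | 5.7%
6 | 45 | 4.6%
5 | 36 | 3.7%
7 | 25 | 2.6%
Other values (5) | 49 | 5.0%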

주소
Text

MISSING 

Distinct: 214
Distinct (%): 99.1%
Missing: 4
Missing (%): 1.8%
Memory size: 1.8 KiB
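
The distinct percentage for this variable matches the distinct count divided by the number of non-missing rows (220 minus 4), not by all 220 rows. The sketch below checks that reading; the helper name `pct_of_non_missing` is hypothetical, and the choice of denominator is an assumption inferred from the reported figures. The same denominator also reproduces the unique percentage reported for this variable (212 unique values, 98.1%).

```python
def pct_of_non_missing(count: int, n_rows: int, n_missing: int) -> float:
    """Percentage of `count` relative to the rows that have a value."""
    return round(100 * count / (n_rows - n_missing), 1)


n_rows, n_missing = 220, 4  # observations and missing values for 주소

distinct_pct = pct_of_non_missing(214, n_rows, n_missing)
unique_pct = pct_of_non_missing(212, n_rows, n_missing)

# For comparison: dividing by all rows does not match the reported 99.1%.
distinct_pct_all_rows = round(100 * 214 / n_rows, 1)

print(distinct_pct, unique_pct, distinct_pct_all_rows)
```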

Length

Max length: 38
Median length: 29
Mean length: 13.930556
Min length: 6

Characters and Unicode

Total characters: 3009
Distinct characters: 63
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 212
Unique (%): 98.1%

Sample

1st row: 고촌읍 신곡리 1193 외8필
2nd row: 고촌읍 신곡리 1257 외4필
3rd row: 장기동 1316 외12필지
4th row: 걸포동 1559 공
5th row: 걸포동 1556 공

Value | Count | Frequency (%)
양촌읍 | 62 | 8.8%
통진읍 | 53 | 7.5%
마송리 | 49 | 7.0%
고촌읍 | 45 | 6.4%
양곡리 | 37 | 5.3%
일원 | 34 | 4.8%
학운리 | 30 | 4.3%
신곡리 | 24 | 3.4%
(value not recovered) | 24 | 3.4%
김포시 | 18 | 2.6%
Other values (254) | 326 | 46.4%

Most occurring characters

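Value | Count | Frequency (%)
(space) | 488 | 16.2%
1 | 197 | 6.5%
(character not recovered) | 174 | 5.8%
(character not recovered) | 160 | 5.3%
3 | 119 | 4.0%
5 | 116 | 3.9%
(character not recovered) | 109 | 3.6%
2 | 106 | 3.5%
(character not recovered) | 104 | 3.5%
(character not recovered) | 100 | 3.3%
Other values (53) | 1336 | 44.4%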

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1441 | 47.9%
Decimal Number | 955 | 31.7%
Space Separator | 488 | 16.2%
Dash Punctuation | 92 | 3.1%
Other Punctuation | 19 | 0.6%
Open Punctuation | 7 | 0.2%
Close Punctuation | 7 | 0.2%

Most frequent character per category

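Other Letter
Value | Count | Frequency (%)
(character not recovered) | 174 | 12.1%
(character not recovered) | 160 | 11.1%
(character not recovered) | 109 | 7.6%
(character not recovered) | 104 | 7.2%
(character not recovered) | 100 | 6.9%
(character not recovered) | 61 | 4.2%
(character not recovered) | 54 | 3.7%
(character not recovered) | 54 | 3.7%
(character not recovered) | 50 | 3.5%
(character not recovered) | 49 | 3.4%
Other values (38) | 526 | 36.5%

Decimal Number
Value | Count | Frequency (%)
1 | 197 | 20.6%
3 | 119 | 12.5%
5 | 116 | 12.1%
2 | 106 | 11.1%
6 | 95 | 9.9%
0 | 77 | 8.1%
4 | 70 | 7.3%
8 | 70 | 7.3%
9 | 62 | 6.5%
7 | 43 | 4.5%

Space Separator
Value | Count | Frequency (%)
(space) | 488 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 92 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 19 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%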

Most occurring scripts

Value | Count | Frequency (%)
Common | 1568 | 52.1%
Hangul | 1441 | 47.9%

Most frequent character per script

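Hangul
Value | Count | Frequency (%)
(character not recovered) | 174 | 12.1%
(character not recovered) | 160 | 11.1%
(character not recovered) | 109 | 7.6%
(character not recovered) | 104 | 7.2%
(character not recovered) | 100 | 6.9%
(character not recovered) | 61 | 4.2%
(character not recovered) | 54 | 3.7%
(character not recovered) | 54 | 3.7%
(character not recovered) | 50 | 3.5%
(character not recovered) | 49 | 3.4%
Other values (38) | 526 | 36.5%

Common
Value | Count | Frequency (%)
(space) | 488 | 31.1%
1 | 197 | 12.6%
3 | 119 | 7.6%
5 | 116 | 7.4%
2 | 106 | 6.8%
6 | 95 | 6.1%
- | 92 | 5.9%
0 | 77 | 4.9%
4 | 70 | 4.5%
8 | 70 | 4.5%
Other values (5) | 138 | 8.8%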

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1568 | 52.1%
Hangul | 1441 | 47.9%

Most frequent character per block

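ASCII
Value | Count | Frequency (%)
(space) | 488 | 31.1%
1 | 197 | 12.6%
3 | 119 | 7.6%
5 | 116 | 7.4%
2 | 106 | 6.8%
6 | 95 | 6.1%
- | 92 | 5.9%
0 | 77 | 4.9%
4 | 70 | 4.5%
8 | 70 | 4.5%
Other values (5) | 138 | 8.8%

Hangul
Value | Count | Frequency (%)
(character not recovered) | 174 | 12.1%
(character not recovered) | 160 | 11.1%
(character not recovered) | 109 | 7.6%
(character not recovered) | 104 | 7.2%
(character not recovered) | 100 | 6.9%
(character not recovered) | 61 | 4.2%
(character not recovered) | 54 | 3.7%
(character not recovered) | 54 | 3.7%
(character not recovered) | 50 | 3.5%
(character not recovered) | 49 | 3.4%
Other values (38) | 526 | 36.5%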

면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 212
Distinct (%): 98.1%
Missing: 4
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 4440.037
Minimum: 27
Maximum: 88033
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 27
5-th percentile: 306.25
Q1: 995.75
Median: 1868
Q3: 3363.5
95-th percentile: 16626.25
Maximum: 88033
Range: 88006
Interquartile range (IQR): 2367.75

Descriptive statistics

Standard deviation: 8995.8472
Coefficient of variation (CV): 2.0260748
Kurtosis: 39.17314
Mean: 4440.037
Median Absolute Deviation (MAD): 1148.5
Skewness: 5.4565923
Sum: 959048
Variance: 80925267
Monotonicity: Not monotonic
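
Several of these statistics are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, interquartile range, and range from the reported figures, assuming the mean is taken over the 216 non-missing rows. All inputs are copied from the report; small differences from the printed values come from rounding of the published standard deviation.

```python
# Figures copied from the report for 면적(제곱미터).
n_rows, n_missing = 220, 4
total_area = 959048            # Sum
std = 8995.8472                # Standard deviation
q1, q3 = 995.75, 3363.5        # Quartiles
minimum, maximum = 27, 88033

n_valid = n_rows - n_missing   # rows that actually have an area value

mean = total_area / n_valid    # Mean
cv = std / mean                # Coefficient of variation
variance = std ** 2            # Variance
iqr = q3 - q1                  # Interquartile range
value_range = maximum - minimum

print(round(mean, 3), round(cv, 7), round(variance), iqr, value_range)
```

A coefficient of variation above 2, together with the large skewness and kurtosis, indicates that a few very large green spaces dominate the total area.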
[Figure: histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
1638 | 2 | 0.9%
3300 | 2 | 0.9%
1868 | 2 | 0.9%
2512 | 2 | 0.9%
1339 | 1 | 0.5%
1501 | 1 | 0.5%
5717 | 1 | 0.5%
1743 | 1 | 0.5%
321 | 1 | 0.5%
557 | 1 | 0.5%
Other values (202) | 202 | 91.8%
(Missing) | 4 | 1.8%

Ten smallest values:
Value | Count | Frequency (%)
27 | 1 | 0.5%
135 | 1 | 0.5%
150 | 1 | 0.5%
180 | 1 | 0.5%
190 | 1 | 0.5%
226 | 1 | 0.5%
229 | 1 | 0.5%
251 | 1 | 0.5%
258 | 1 | 0.5%
267 | 1 | 0.5%

Ten largest values:
Value | Count | Frequency (%)
88033 | 1 | 0.5%
48013 | 1 | 0.5%
38589 | 1 | 0.5%
36312 | 1 | 0.5%
36297 | 1 | 0.5%
32448 | 1 | 0.5%
31548 | 1 | 0.5%
31058 | 1 | 0.5%
22836 | 1 | 0.5%
20697 | 1 | 0.5%

데이터기준일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

2024-04-18: 216
<NA>: 4

Length

Max length: 10
Median length: 10
Mean length: 9.8909091
Min length: 4
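
The length statistics suggest that the four `<NA>` entries in this column are stored as the literal four-character string rather than as true missing values, which would also explain why the variable reports 0 missing. The sketch below checks that reading against the counts reported for this variable (216 rows of `2024-04-18` and 4 rows of `<NA>`); it is an inference from the numbers, not something the report states.

```python
# Counts reported for 데이터기준일자.
counts = {"2024-04-18": 216, "<NA>": 4}

total = sum(counts.values())

# Weighted mean of the string lengths, treating "<NA>" as a 4-character string.
mean_length = sum(len(value) * n for value, n in counts.items()) / total
min_length = min(len(value) for value in counts)
max_length = max(len(value) for value in counts)

print(round(mean_length, 7), min_length, max_length)
```

The result agrees with the reported mean length of 9.8909091, minimum of 4, and maximum of 10.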

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2024-04-18
2nd row: 2024-04-18
3rd row: 2024-04-18
4th row: 2024-04-18
5th row: 2024-04-18

Common Values

Value | Count | Frequency (%)
2024-04-18 | 216 | 98.2%
<NA> | 4 | 1.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 녹지구분 | 사업지구 | 면적(제곱미터)
녹지구분 | 1.000 | 0.590 | 0.000
사업지구 | 0.590 | 1.000 | 0.756
면적(제곱미터) | 0.000 | 0.756 | 1.000

Variable | 사업지구 | 녹지구분 | 데이터기준일자
사업지구 | 1.000 | 0.359 | 1.000
녹지구분 | 0.359 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000

Variable | 면적(제곱미터) | 녹지구분 | 사업지구 | 데이터기준일자
면적(제곱미터) | 1.000 | 0.000 | 0.434 | 1.000
녹지구분 | 0.000 | 1.000 | 0.359 | 1.000
사업지구 | 0.434 | 0.359 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Row | 녹지구분 | 사업지구 | 녹지등록명 | 주소 | 면적(제곱미터) | 데이터기준일자
0 | 완충녹지 | 고촌 | 고촌수기 (완충녹지 제25호) | 고촌읍 신곡리 1193 외8필 | 6065 | 2024-04-18
1 | 완충녹지 | 고촌 | 고촌수기 (완충녹지 제26호) | 고촌읍 신곡리 1257 외4필 | 3079 | 2024-04-18
2 | 완충녹지 | 청송 | 청송현대 (완충녹지95호) | 장기동 1316 외12필지 | 6645 | 2024-04-18
3 | 완충녹지 | 걸포 | 걸포지구 (완충녹지27호) | 걸포동 1559 공 | 844 | 2024-04-18
4 | 완충녹지 | 걸포 | 걸포지구 (완충녹지28호) | 걸포동 1556 공 | 998 | 2024-04-18
5 | 완충녹지 | 걸포 | 걸포지구 (완충녹지29호) | 걸포동 1562 공 | 1367 | 2024-04-18
6 | 완충녹지 | 걸포 | 걸포지구 (완충녹지30호) | 걸포동 1564 공 | 430 | 2024-04-18
7 | 완충녹지 | 걸포 | 걸포지구 (완충녹지32호) | 걸포동 909-6 공 | 969 | 2024-04-18
8 | 완충녹지 | 걸포 | 걸포지구 (완충녹지33호) | 걸포동 910-4, 1583 공 | 1671 | 2024-04-18
9 | 완충녹지 | 북변 | 북변동양 (완충녹지31호) | 북변동 431-52 외6필 | 479 | 2024-04-18
녹지구분사업지구녹지등록명주소면적(제곱미터)데이터기준일자
210연결녹지양곡양곡택지(연결녹지30호)양촌읍 양곡리 1250 공19282024-04-18
211연결녹지양곡양곡택지(연결녹지31호)양촌읍 양곡리 1255 공16382024-04-18
212연결녹지고촌고촌행정타운(고촌1호)고촌읍 신곡리 129311502024-04-18
213연결녹지풍무풍무2지구(연결녹지 4호)풍무동 1100, 109915332024-04-18
214연결녹지풍무풍무2지구(연결녹지 5호)풍무동 938, 937, 939101462024-04-18
215연결녹지걸포걸포3지 연결녹지 1호걸포동 160783472024-04-18
216<NA><NA><NA><NA><NA><NA>
217<NA><NA><NA><NA><NA><NA>
218<NA><NA><NA><NA><NA><NA>
219<NA><NA><NA><NA><NA>

Duplicate rows

Most frequently occurring

Row | 녹지구분 | 사업지구 | 녹지등록명 | 주소 | 면적(제곱미터) | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3