Overview

Dataset statistics

Number of variables: 14
Number of observations: 75
Missing cells: 128
Missing cells (%): 12.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.7 KiB
Average record size in memory: 118.8 B

Variable types

Text: 5
Categorical: 5
Unsupported: 1
Numeric: 3

Dataset

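Description: 조서관리코드 (record management code), 프로젝트코드 (project code), 지자체 (local government), 조서유형(구분) (record type), 대분류 (major category), 중분류 (middle category), 소분류 (minor category), 위치명 (location name), 지역명 (area name), 면적기정 (previously determined area), 면적증감코드 (area increase/decrease code), 면적변경 (area change), 면적변경후 (area after change), 결정고시관리코드 (decision notice management code)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-20287/S/1/datasetView.do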

Alerts

대분류 has constant value "" [Constant]
중분류 has constant value "" [Constant]
지자체 is highly overall correlated with 면적변경 and 1 other field [High correlation]
면적증감코드 is highly overall correlated with 지자체 and 1 other field [High correlation]
면적기정 is highly overall correlated with 면적변경후 [High correlation]
면적변경 is highly overall correlated with 지자체 and 1 other field [High correlation]
면적변경후 is highly overall correlated with 면적기정 [High correlation]
조서유형(구분) is highly overall correlated with 면적변경 and 1 other field [High correlation]
지자체 is highly imbalanced (84.7%) [Imbalance]
프로젝트코드 has 1 (1.3%) missing values [Missing]
소분류 has 75 (100.0%) missing values [Missing]
지역명 has 1 (1.3%) missing values [Missing]
면적기정 has 12 (16.0%) missing values [Missing]
면적변경 has 36 (48.0%) missing values [Missing]
면적변경후 has 2 (2.7%) missing values [Missing]
결정고시관리코드 has 1 (1.3%) missing values [Missing]
조서관리코드 has unique values [Unique]
소분류 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
면적기정 has 5 (6.7%) zeros [Zeros]
면적변경후 has 1 (1.3%) zeros [Zeros]
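
The per-column missing counts in the alerts can be cross-checked against the dataset-level figures (128 missing cells, 12.2%). The sketch below is only an arithmetic check on numbers copied from this report; the variable names are illustrative and it does not reproduce how the profiler computes them internally.

```python
# Missing counts per column, copied from the "Missing" alerts in this report.
missing_by_column = {
    "프로젝트코드": 1,
    "소분류": 75,
    "지역명": 1,
    "면적기정": 12,
    "면적변경": 36,
    "면적변경후": 2,
    "결정고시관리코드": 1,
}
n_rows, n_cols = 75, 14  # observations and variables from "Dataset statistics"

# Dataset-level totals: missing cells over all rows x columns.
total_missing = sum(missing_by_column.values())
missing_pct = 100 * total_missing / (n_rows * n_cols)
print(total_missing, f"{missing_pct:.1f}%")

# Column-level percentages: missing cells over the number of rows.
for name, count in missing_by_column.items():
    print(f"{name}: {count} ({100 * count / n_rows:.1f}%)")
```

The 36 `<NA>` entries of 면적증감코드 do not appear here because the report treats them as an ordinary category value rather than as missing cells.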

Reproduction

Analysis started: 2024-05-11 05:49:28.921967
Analysis finished: 2024-05-11 05:49:33.281539
Duration: 4.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

조서관리코드
Text

UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 1500
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75
Unique (%): 100.0%

Sample

1st row: 11000AGZ202306210001
2nd row: 11000AGZ201512146968
3rd row: 11000AGZ200810136899
4th row: 11000AGZ201010056915
5th row: 11000AGZ202004130001
ValueCountFrequency (%)
11000agz202306210001 1
 
1.3%
11000agz200905126908 1
 
1.3%
11000agz201203156927 1
 
1.3%
11000agz202004130002 1
 
1.3%
11000agz200409184403 1
 
1.3%
11000agz200907176912 1
 
1.3%
11000agz201208236936 1
 
1.3%
11000agz201208226935 1
 
1.3%
11530agz202202160001 1
 
1.3%
11000agz200704106894 1
 
1.3%
Other values (65) 65
86.7%
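
Every sampled 조서관리코드 value is 20 characters long and appears to follow a fixed layout: five digits, three uppercase letters, eight digits that read like a YYYYMMDD date, and a four-digit suffix. The field meanings below (region, kind, date, serial) are assumptions inferred from the samples, not something the report or the data provider states. A minimal sketch of a parser under those assumptions:

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class ManagementCode(NamedTuple):
    region: str   # assumed: 5-digit area code, e.g. "11000"
    kind: str     # assumed: 3-letter record type, e.g. "AGZ", "PPL", "NTC"
    issued: date  # assumed: the 8 digits are a YYYYMMDD date
    serial: str   # assumed: 4-digit running number


# 5 digits + 3 uppercase letters + 8 digits + 4 digits = 20 characters.
CODE_PATTERN = re.compile(r"(\d{5})([A-Z]{3})(\d{8})(\d{4})")


def parse_code(code: str) -> ManagementCode:
    """Split a 20-character code into its assumed components."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unexpected code layout: {code!r}")
    region, kind, ymd, serial = match.groups()
    issued = datetime.strptime(ymd, "%Y%m%d").date()
    return ManagementCode(region, kind, issued, serial)


parsed = parse_code("11000AGZ202306210001")
print(parsed)
```

If the date assumption is wrong for some rows, `datetime.strptime` raises `ValueError`, which makes such rows easy to spot.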

Most occurring characters

ValueCountFrequency (%)
0 480
32.0%
1 286
19.1%
2 179
 
11.9%
9 76
 
5.1%
A 75
 
5.0%
G 75
 
5.0%
Z 75
 
5.0%
6 68
 
4.5%
3 49
 
3.3%
4 39
 
2.6%
Other values (3) 98
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1275
85.0%
Uppercase Letter 225
 
15.0%
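
The category split above (85.0% Decimal Number, 15.0% Uppercase Letter) can be reproduced on the sample values with Python's standard `unicodedata` module. This is a sketch of the idea only; it is not claimed to be the code path the profiler uses, and the helper name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters of all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two of the sample values shown for 조서관리코드.
sample = ["11000AGZ202306210001", "11000AGZ201512146968"]
counts = category_counts(sample)
total = sum(counts.values())
for category, n in counts.most_common():
    # "Nd" is Decimal Number, "Lu" is Uppercase Letter.
    print(category, n, f"{100 * n / total:.1f}%")
```

Each code contributes 17 digits and 3 letters, which gives the same 85/15 split as the full column.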

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 480
37.6%
1 286
22.4%
2 179
 
14.0%
9 76
 
6.0%
6 68
 
5.3%
3 49
 
3.8%
4 39
 
3.1%
8 38
 
3.0%
7 37
 
2.9%
5 23
 
1.8%
Uppercase Letter
ValueCountFrequency (%)
A 75
33.3%
G 75
33.3%
Z 75
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1275
85.0%
Latin 225
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 480
37.6%
1 286
22.4%
2 179
 
14.0%
9 76
 
6.0%
6 68
 
5.3%
3 49
 
3.8%
4 39
 
3.1%
8 38
 
3.0%
7 37
 
2.9%
5 23
 
1.8%
Latin
ValueCountFrequency (%)
A 75
33.3%
G 75
33.3%
Z 75
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 480
32.0%
1 286
19.1%
2 179
 
11.9%
9 76
 
5.1%
A 75
 
5.0%
G 75
 
5.0%
Z 75
 
5.0%
6 68
 
4.5%
3 49
 
3.3%
4 39
 
2.6%
Other values (3) 98
 
6.5%

프로젝트코드
Text

MISSING 

Distinct: 62
Distinct (%): 83.8%
Missing: 1
Missing (%): 1.3%
Memory size: 732.0 B

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 1480
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 55
Unique (%): 74.3%

Sample

1st row: 11000PPL202306210001
2nd row: 11000PPL201411277341
3rd row: 11000PPL200512159317
4th row: 11000PPL201002114513
5th row: 11000PPL202004130001
ValueCountFrequency (%)
11000ppl201210116657 4
 
5.4%
11000ppl201307306899 3
 
4.1%
11000ppl201109226238 3
 
4.1%
11000ppl202202280001 3
 
4.1%
11000ppl201009305436 2
 
2.7%
11000ppl202402150007 2
 
2.7%
11000ppl202004130001 2
 
2.7%
11530ppl202202160004 1
 
1.4%
11000ppl200907163591 1
 
1.4%
11000ppl200402255804 1
 
1.4%
Other values (52) 52
70.3%

Most occurring characters

ValueCountFrequency (%)
0 478
32.3%
1 294
19.9%
2 176
 
11.9%
P 148
 
10.0%
L 74
 
5.0%
6 51
 
3.4%
7 51
 
3.4%
3 50
 
3.4%
8 45
 
3.0%
9 41
 
2.8%
Other values (2) 72
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1258
85.0%
Uppercase Letter 222
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 478
38.0%
1 294
23.4%
2 176
 
14.0%
6 51
 
4.1%
7 51
 
4.1%
3 50
 
4.0%
8 45
 
3.6%
9 41
 
3.3%
5 37
 
2.9%
4 35
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
P 148
66.7%
L 74
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1258
85.0%
Latin 222
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 478
38.0%
1 294
23.4%
2 176
 
14.0%
6 51
 
4.1%
7 51
 
4.1%
3 50
 
4.0%
8 45
 
3.6%
9 41
 
3.3%
5 37
 
2.9%
4 35
 
2.8%
Latin
ValueCountFrequency (%)
P 148
66.7%
L 74
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 478
32.3%
1 294
19.9%
2 176
 
11.9%
P 148
 
10.0%
L 74
 
5.0%
6 51
 
3.4%
7 51
 
3.4%
3 50
 
3.4%
8 45
 
3.0%
9 41
 
2.8%
Other values (2) 72
 
4.9%

지자체
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 5
Median length: 5
Mean length: 4.92
Min length: 3

Unique

Unique: 3
Unique (%): 4.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 72 | 96.0%
양천구 | 1 | 1.3%
은평구 | 1 | 1.3%
구로구 | 1 | 1.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]
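
The 84.7% imbalance reported for 지자체 is consistent with a normalised-entropy score, 1 - H / log(k), where H is the entropy of the class frequencies and k the number of classes. That this is the exact formula behind the alert is an assumption; the sketch below only shows that the formula reproduces the reported figure from the counts 72, 1, 1, 1.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, close to 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy of the class frequencies (natural log; the base cancels out).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


score = imbalance_score([72, 1, 1, 1])  # 서울특별시, 양천구, 은평구, 구로구
print(f"{100 * score:.1f}%")
```

A perfectly balanced column such as `[1, 1, 1, 1]` scores 0 under the same formula.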

조서유형(구분)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.3%

Sample

1st row: 변경
2nd row: 변경
3rd row: 신설
4th row: 변경
5th row: 변경

Common Values

Value | Count | Frequency (%)
변경 | 54 | 72.0%
신설 | 17 | 22.7%
폐지 | 3 | 4.0%
기정 | 1 | 1.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

대분류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 의제처리구역
2nd row: 의제처리구역
3rd row: 의제처리구역
4th row: 의제처리구역
5th row: 의제처리구역

Common Values

Value | Count | Frequency (%)
의제처리구역 | 75 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

중분류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시개발구역
2nd row: 도시개발구역
3rd row: 도시개발구역
4th row: 도시개발구역
5th row: 도시개발구역

Common Values

Value | Count | Frequency (%)
도시개발구역 | 75 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

소분류
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 75
Missing (%): 100.0%
Memory size: 807.0 B

위치명
Text

Distinct: 55
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 43
Median length: 31
Mean length: 21.426667
Min length: 10

Characters and Unicode

Total characters: 1607
Distinct characters: 90
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 53.3%

Sample

1st row: 강남구 양재대로 478(개포동 567-1번지) 일원
2nd row: 구로구 천왕동, 오류동 일원
3rd row: 서울특별시 성동구 행당동 87-4 일원
4th row: 서울특별시 강서구 마곡동,가양동,공항동,방화동, 내_외발산동 일대
5th row: 서울특별시 강동구 강일동 360번지 일원
ValueCountFrequency (%)
일원 33
 
9.5%
일대 32
 
9.2%
강서구 17
 
4.9%
마곡동 17
 
4.9%
공항동 15
 
4.3%
가양동 14
 
4.0%
방화동 13
 
3.7%
서울특별시 13
 
3.7%
구로구 7
 
2.0%
강동구 7
 
2.0%
Other values (76) 181
51.9%

Most occurring characters

ValueCountFrequency (%)
282
 
17.5%
164
 
10.2%
, 80
 
5.0%
78
 
4.9%
74
 
4.6%
1 40
 
2.5%
39
 
2.4%
37
 
2.3%
35
 
2.2%
35
 
2.2%
Other values (80) 743
46.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1001
62.3%
Space Separator 282
 
17.5%
Decimal Number 190
 
11.8%
Other Punctuation 93
 
5.8%
Dash Punctuation 34
 
2.1%
Connector Punctuation 3
 
0.2%
Close Punctuation 2
 
0.1%
Open Punctuation 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
164
 
16.4%
78
 
7.8%
74
 
7.4%
39
 
3.9%
37
 
3.7%
35
 
3.5%
35
 
3.5%
35
 
3.5%
34
 
3.4%
22
 
2.2%
Other values (62) 448
44.8%
Decimal Number
ValueCountFrequency (%)
1 40
21.1%
7 21
11.1%
0 20
10.5%
6 20
10.5%
3 20
10.5%
4 19
10.0%
9 14
 
7.4%
8 14
 
7.4%
2 12
 
6.3%
5 10
 
5.3%
Other Punctuation
ValueCountFrequency (%)
, 80
86.0%
? 7
 
7.5%
. 6
 
6.5%
Space Separator
ValueCountFrequency (%)
282
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1001
62.3%
Common 606
37.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
164
 
16.4%
78
 
7.8%
74
 
7.4%
39
 
3.9%
37
 
3.7%
35
 
3.5%
35
 
3.5%
35
 
3.5%
34
 
3.4%
22
 
2.2%
Other values (62) 448
44.8%
Common
ValueCountFrequency (%)
282
46.5%
, 80
 
13.2%
1 40
 
6.6%
- 34
 
5.6%
7 21
 
3.5%
0 20
 
3.3%
6 20
 
3.3%
3 20
 
3.3%
4 19
 
3.1%
9 14
 
2.3%
Other values (8) 56
 
9.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1001
62.3%
ASCII 606
37.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
282
46.5%
, 80
 
13.2%
1 40
 
6.6%
- 34
 
5.6%
7 21
 
3.5%
0 20
 
3.3%
6 20
 
3.3%
3 20
 
3.3%
4 19
 
3.1%
9 14
 
2.3%
Other values (8) 56
 
9.2%
Hangul
ValueCountFrequency (%)
164
 
16.4%
78
 
7.8%
74
 
7.4%
39
 
3.9%
37
 
3.7%
35
 
3.5%
35
 
3.5%
35
 
3.5%
34
 
3.4%
22
 
2.2%
Other values (62) 448
44.8%

지역명
Text

MISSING 

Distinct: 43
Distinct (%): 58.1%
Missing: 1
Missing (%): 1.3%
Memory size: 732.0 B

Length

Max length: 18
Median length: 16.5
Mean length: 10.486486
Min length: 3

Characters and Unicode

Total characters: 776
Distinct characters: 76
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 36.5%

Sample

1st row: 개포(구룡마을) 도시개발구역
2nd row: 천왕도시개발사업
3rd row: 행당지구 도시개발구역
4th row: 마곡 도시개발구역
5th row: 강일 도시개발구역 3공구
ValueCountFrequency (%)
도시개발구역 29
24.2%
마곡도시개발구역 7
 
5.8%
강일도시개발구역 5
 
4.2%
문정도시개발구역 5
 
4.2%
마곡 4
 
3.3%
마곡구역 3
 
2.5%
2지구 3
 
2.5%
도시개발사업 3
 
2.5%
1지구 3
 
2.5%
행당 3
 
2.5%
Other values (37) 55
45.8%

Most occurring characters

ValueCountFrequency (%)
98
12.6%
72
 
9.3%
70
 
9.0%
67
 
8.6%
66
 
8.5%
66
 
8.5%
51
 
6.6%
27
 
3.5%
22
 
2.8%
20
 
2.6%
Other values (66) 217
28.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 687
88.5%
Space Separator 51
 
6.6%
Decimal Number 16
 
2.1%
Close Punctuation 10
 
1.3%
Open Punctuation 10
 
1.3%
Other Punctuation 2
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
98
14.3%
72
10.5%
70
10.2%
67
 
9.8%
66
 
9.6%
66
 
9.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
10
 
1.5%
Other values (58) 169
24.6%
Decimal Number
ValueCountFrequency (%)
1 6
37.5%
2 5
31.2%
3 4
25.0%
4 1
 
6.2%
Space Separator
ValueCountFrequency (%)
51
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10
100.0%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 687
88.5%
Common 89
 
11.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
98
14.3%
72
10.5%
70
10.2%
67
 
9.8%
66
 
9.6%
66
 
9.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
10
 
1.5%
Other values (58) 169
24.6%
Common
ValueCountFrequency (%)
51
57.3%
) 10
 
11.2%
( 10
 
11.2%
1 6
 
6.7%
2 5
 
5.6%
3 4
 
4.5%
? 2
 
2.2%
4 1
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 686
88.4%
ASCII 89
 
11.5%
Compat Jamo 1
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
98
14.3%
72
10.5%
70
10.2%
67
 
9.8%
66
 
9.6%
66
 
9.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
10
 
1.5%
Other values (57) 168
24.5%
ASCII
ValueCountFrequency (%)
51
57.3%
) 10
 
11.2%
( 10
 
11.2%
1 6
 
6.7%
2 5
 
5.6%
3 4
 
4.5%
? 2
 
2.2%
4 1
 
1.1%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

면적기정
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 48
Distinct (%): 76.2%
Missing: 12
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 995863.76
Minimum: 0
Maximum: 3668796
Zeros: 5
Zeros (%): 6.7%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 70771.5
median: 484992.5
Q3: 988817
95-th percentile: 3665336
Maximum: 3668796
Range: 3668796
Interquartile range (IQR): 918045.5

Descriptive statistics

Standard deviation: 1326934
Coefficient of variation (CV): 1.3324453
Kurtosis: 0.11459826
Mean: 995863.76
Median Absolute Deviation (MAD): 426436.9
Skewness: 1.3370937
Sum: 62739417
Variance: 1.7607538 × 10^12
Monotonicity: Not monotonic
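
Several of the 면적기정 statistics are derived from one another, so they can be sanity-checked with plain arithmetic. The sketch below uses only figures copied from this report (the variable names are illustrative); because the report rounds its values, the recomputed results agree only to the displayed precision.

```python
# Figures copied from the 면적기정 summary in this report.
n_rows, n_missing = 75, 12
total = 62739417           # Sum
std = 1326934              # Standard deviation
q1, q3 = 70771.5, 988817   # Q1 and Q3
minimum, maximum = 0, 3668796

n_valid = n_rows - n_missing   # statistics are computed over non-missing rows
mean = total / n_valid         # reported Mean: 995863.76
cv = std / mean                # reported CV: 1.3324453
iqr = q3 - q1                  # reported IQR: 918045.5
value_range = maximum - minimum
variance = std ** 2            # reported Variance: about 1.7607538e12

print(n_valid, round(mean, 2), round(cv, 4), iqr, value_range, f"{variance:.4e}")
```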
Histogram with fixed size bins (bins=48)
ValueCountFrequency (%)
0.0 5
 
6.7%
484992.5 3
 
4.0%
3665336.0 3
 
4.0%
14529.0 2
 
2.7%
74800.0 2
 
2.7%
70677.0 2
 
2.7%
510385.9 2
 
2.7%
267455.0 2
 
2.7%
12894.0 2
 
2.7%
548239.0 2
 
2.7%
Other values (38) 38
50.7%
(Missing) 12
 
16.0%
ValueCountFrequency (%)
0.0 5
6.7%
12894.0 2
 
2.7%
14529.0 2
 
2.7%
27423.0 1
 
1.3%
31584.6 1
 
1.3%
33788.6 1
 
1.3%
33844.0 1
 
1.3%
51242.9 1
 
1.3%
70677.0 2
 
2.7%
70866.0 1
 
1.3%
ValueCountFrequency (%)
3668796.0 1
 
1.3%
3666644.2 1
 
1.3%
3666582.0 1
 
1.3%
3665336.0 3
4.0%
3665086.0 1
 
1.3%
3664875.0 1
 
1.3%
3493421.0 1
 
1.3%
3492650.9 1
 
1.3%
3364000.0 1
 
1.3%
3363591.0 1
 
1.3%

면적증감코드
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 4
Median length: 1
Mean length: 2.44
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: <NA>
4th row: 1
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 36 | 48.0%
1 | 23 | 30.7%
2 | 16 | 21.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

면적변경
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 37
Distinct (%): 94.9%
Missing: 36
Missing (%): 48.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 214776.22
Minimum: 1.9
Maximum: 1763219
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 1.9
5-th percentile: 5.33
Q1: 202
median: 3745
Q3: 286929
95-th percentile: 927422.2
Maximum: 1763219
Range: 1763217.1
Interquartile range (IQR): 286727

Descriptive statistics

Standard deviation: 385112.39
Coefficient of variation (CV): 1.7930867
Kurtosis: 6.2547356
Mean: 214776.22
Median Absolute Deviation (MAD): 3743
Skewness: 2.3646759
Sum: 8376272.6
Variance: 1.4831155 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=37)
ValueCountFrequency (%)
510385.9 2
 
2.7%
286929.0 2
 
2.7%
139108.0 1
 
1.3%
211.0 1
 
1.3%
3745.0 1
 
1.3%
261.0 1
 
1.3%
74.0 1
 
1.3%
912000.0 1
 
1.3%
31587.6 1
 
1.3%
4206.0 1
 
1.3%
Other values (27) 27
36.0%
(Missing) 36
48.0%
ValueCountFrequency (%)
1.9 1
1.3%
2.0 1
1.3%
5.7 1
1.3%
19.2 1
1.3%
74.0 1
1.3%
84.0 1
1.3%
89.9 1
1.3%
127.0 1
1.3%
189.0 1
1.3%
198.0 1
1.3%
ValueCountFrequency (%)
1763219.0 1
1.3%
1066222.0 1
1.3%
912000.0 1
1.3%
835895.0 1
1.3%
684420.0 1
1.3%
548313.0 1
1.3%
510385.9 2
2.7%
301745.0 1
1.3%
286929.0 2
2.7%
266304.0 1
1.3%

면적변경후
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 62
Distinct (%): 84.9%
Missing: 2
Missing (%): 2.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1053632.9
Minimum: 0
Maximum: 3668801.7
Zeros: 1
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 0
5-th percentile: 13875.08
Q1: 87591.1
median: 485076.5
Q3: 1065634
95-th percentile: 3665490.4
Maximum: 3668801.7
Range: 3668801.7
Interquartile range (IQR): 978042.9

Descriptive statistics

Standard deviation: 1309576.9
Coefficient of variation (CV): 1.2429157
Kurtosis: 0.0029420342
Mean: 1053632.9
Median Absolute Deviation (MAD): 414027.5
Skewness: 1.284476
Sum: 76915204
Variance: 1.7149916 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3665336.0 3
 
4.0%
484992.5 3
 
4.0%
70823.3 2
 
2.7%
548239.0 2
 
2.7%
27423.0 2
 
2.7%
14529.4 2
 
2.7%
12893.6 2
 
2.7%
510385.9 2
 
2.7%
74800.0 2
 
2.7%
3.0 1
 
1.3%
Other values (52) 52
69.3%
(Missing) 2
 
2.7%
ValueCountFrequency (%)
0.0 1
1.3%
3.0 1
1.3%
12893.6 2
2.7%
14529.4 2
2.7%
27423.0 2
2.7%
33788.6 1
1.3%
33844.12 1
1.3%
70677.0 1
1.3%
70794.0 1
1.3%
70823.3 2
2.7%
ValueCountFrequency (%)
3668801.7 1
 
1.3%
3668796.0 1
 
1.3%
3666582.0 1
 
1.3%
3665722.0 1
 
1.3%
3665336.0 3
4.0%
3665086.0 1
 
1.3%
3664875.0 1
 
1.3%
3495248.0 1
 
1.3%
3492649.3 1
 
1.3%
3492421.0 1
 
1.3%

결정고시관리코드
Text

MISSING 

Distinct: 63
Distinct (%): 85.1%
Missing: 1
Missing (%): 1.3%
Memory size: 732.0 B

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 1480
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 77.0%

Sample

1st row: 11000NTC202306210001
2nd row: 11000NTC201411277341
3rd row: 11000NTC200512159317
4th row: 11000NTC201002114513
5th row: 11000NTC202004130001
ValueCountFrequency (%)
11000ntc201210116657 4
 
5.4%
11000ntc202202280001 3
 
4.1%
11000ntc201109226238 3
 
4.1%
11000ntc201307306899 3
 
4.1%
11000ntc202402150006 2
 
2.7%
11000ntc202004130001 2
 
2.7%
11000ntc200905143317 1
 
1.4%
11000ntc202110010003 1
 
1.4%
11000ntc202202160002 1
 
1.4%
11000ntc200610260460 1
 
1.4%
Other values (53) 53
71.6%

Most occurring characters

ValueCountFrequency (%)
0 483
32.6%
1 296
20.0%
2 179
 
12.1%
N 74
 
5.0%
T 74
 
5.0%
C 74
 
5.0%
6 52
 
3.5%
3 48
 
3.2%
7 47
 
3.2%
8 45
 
3.0%
Other values (3) 108
 
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1258
85.0%
Uppercase Letter 222
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 483
38.4%
1 296
23.5%
2 179
 
14.2%
6 52
 
4.1%
3 48
 
3.8%
7 47
 
3.7%
8 45
 
3.6%
9 40
 
3.2%
5 38
 
3.0%
4 30
 
2.4%
Uppercase Letter
ValueCountFrequency (%)
N 74
33.3%
T 74
33.3%
C 74
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1258
85.0%
Latin 222
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 483
38.4%
1 296
23.5%
2 179
 
14.2%
6 52
 
4.1%
3 48
 
3.8%
7 47
 
3.7%
8 45
 
3.6%
9 40
 
3.2%
5 38
 
3.0%
4 30
 
2.4%
Latin
ValueCountFrequency (%)
N 74
33.3%
T 74
33.3%
C 74
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 483
32.6%
1 296
20.0%
2 179
 
12.1%
N 74
 
5.0%
T 74
 
5.0%
C 74
 
5.0%
6 52
 
3.5%
3 48
 
3.2%
7 47
 
3.2%
8 45
 
3.0%
Other values (3) 108
 
7.3%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]
variable | 조서관리코드 | 프로젝트코드 | 지자체 | 조서유형(구분) | 위치명 | 지역명 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후 | 결정고시관리코드
조서관리코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
프로젝트코드 | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 | 0.000 | 0.000 | 0.419 | 0.000 | 0.000 | 1.000
지자체 | 1.000 | 1.000 | 1.000 | 0.603 | 0.865 | 0.897 | 0.000 | NaN | NaN | 0.000 | 1.000
조서유형(구분) | 1.000 | 1.000 | 0.603 | 1.000 | 0.973 | 0.858 | 0.000 | 0.329 | 0.718 | 0.000 | 1.000
위치명 | 1.000 | 0.994 | 0.865 | 0.973 | 1.000 | 0.940 | 0.000 | 0.000 | 0.000 | 0.000 | 0.993
지역명 | 1.000 | 0.000 | 0.897 | 0.858 | 0.940 | 1.000 | 0.889 | 0.000 | 0.941 | 1.000 | 0.000
면적기정 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.889 | 1.000 | 0.619 | 0.000 | 0.953 | 0.000
면적증감코드 | 1.000 | 0.419 | NaN | 0.329 | 0.000 | 0.000 | 0.619 | 1.000 | 0.000 | 0.000 | 0.467
면적변경 | 1.000 | 0.000 | NaN | 0.718 | 0.000 | 0.941 | 0.000 | 0.000 | 1.000 | 0.633 | 0.000
면적변경후 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.953 | 0.000 | 0.633 | 1.000 | 0.000
결정고시관리코드 | 1.000 | 1.000 | 1.000 | 1.000 | 0.993 | 0.000 | 0.000 | 0.467 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]
variable | 조서유형(구분) | 지자체 | 면적증감코드
조서유형(구분) | 1.000 | 0.272 | 0.519
지자체 | 0.272 | 1.000 | 1.000
면적증감코드 | 0.519 | 1.000 | 1.000

[Figure: correlation heatmap]
variable | 면적기정 | 면적변경 | 면적변경후 | 지자체 | 조서유형(구분) | 면적증감코드
면적기정 | 1.000 | -0.083 | 0.874 | 0.000 | 0.000 | 0.410
면적변경 | -0.083 | 1.000 | 0.025 | 1.000 | 0.564 | 0.000
면적변경후 | 0.874 | 0.025 | 1.000 | 0.000 | 0.000 | 0.000
지자체 | 0.000 | 1.000 | 0.000 | 1.000 | 0.272 | 1.000
조서유형(구분) | 0.000 | 0.564 | 0.000 | 0.272 | 1.000 | 0.519
면적증감코드 | 0.410 | 0.000 | 0.000 | 1.000 | 0.519 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

index | 조서관리코드 | 프로젝트코드 | 지자체 | 조서유형(구분) | 대분류 | 중분류 | 소분류 | 위치명 | 지역명 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후 | 결정고시관리코드
0 | 11000AGZ202306210001 | 11000PPL202306210001 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강남구 양재대로 478(개포동 567-1번지) 일원 | 개포(구룡마을) 도시개발구역 | 266502.0 | 1 | 964.4 | 267466.4 | 11000NTC202306210001
1 | 11000AGZ201512146968 | 11000PPL201411277341 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 구로구 천왕동, 오류동 일원 | 천왕도시개발사업 | 484992.5 | 2 | 515.4 | 484477.1 | 11000NTC201411277341
2 | 11000AGZ200810136899 | 11000PPL200512159317 | 서울특별시 | 신설 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 성동구 행당동 87-4 일원 | 행당지구 도시개발구역 | <NA> | <NA> | <NA> | 74800.0 | 11000NTC200512159317
3 | 11000AGZ201010056915 | 11000PPL201002114513 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 강서구 마곡동,가양동,공항동,방화동, 내_외발산동 일대 | 마곡 도시개발구역 | 3363591.0 | 1 | 301745.0 | 3665336.0 | 11000NTC201002114513
4 | 11000AGZ202004130001 | 11000PPL202004130001 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 강동구 강일동 360번지 일원 | 강일 도시개발구역 3공구 | 139334.0 | <NA> | <NA> | 102985.8 | 11000NTC202004130001
5 | 11000AGZ200807176897 | 11000PPL200712281706 | 서울특별시 | 신설 | 의제처리구역 | 도시개발구역 | <NA> | 강서구 마곡동, 가양동, 공항동, 방화동, 내외발산동 일대 | 마곡도시개발구역 | <NA> | <NA> | <NA> | 3364000.0 | 11000NTC200712281706
6 | 11000AGZ201310176952 | 11000PPL201307306899 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강서구 마곡동, 가양동, 공항동, 방화동, 내.외발산동 일대 | 마곡도시개발구역(1지구) | 1065634.0 | 2 | 2.0 | 1065632.0 | 11000NTC201307306899
7 | 11000AGZ201310176953 | 11000PPL201307306899 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강서구 마곡동, 가양동, 공항동, 방화동, 내.외발산동 일대 | 마곡도시개발구역(2지구) | 1902327.0 | 1 | 208.0 | 1902535.0 | 11000NTC201307306899
8 | 11000AGZ201310176954 | 11000PPL201307306899 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강서구 마곡동, 가양동, 공항동, 방화동, 내.외발산동 | 마곡도시개발구역(3지구) | 697125.0 | 2 | 206.0 | 696919.0 | 11000NTC201307306899
9 | 11000AGZ201104266917 | 11000PPL201009305436 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강서구 마곡동, 가양동, 공항동, 방화동, 내_외발산동 일대 | 마곡도시개발구역 | 3665336.0 | <NA> | <NA> | 3665336.0 | 11000NTC201009305436

index | 조서관리코드 | 프로젝트코드 | 지자체 | 조서유형(구분) | 대분류 | 중분류 | 소분류 | 위치명 | 지역명 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후 | 결정고시관리코드
65 | 11000AGZ201404226958 | 11000PPL201403207130 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 강동구 강일동 360번지 일원 | 강일도시개발구역 | 895618.4 | 1 | 19.2 | 895637.6 | 11000NTC201403207130
66 | 11000AGZ202104300001 | 11000PPL202104300001 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 송파구 문정동 649일원 | 문정도시개발구역 | 548239.0 | <NA> | <NA> | 548239.9 | 11000NTC202104300001
67 | 11000AGZ201909060001 | 11000PPL201909060005 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울시성동구 행당동87-4번지 일원 | 행당 도시개발구역 | 70677.0 | <NA> | <NA> | 70823.3 | 11200NTC201909060005
68 | 11000AGZ202209080001 | 11000PPL202209080002 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 마곡동 일원 | 마곡 도시개발사업 지구단위계획구역 | 3666582.0 | <NA> | <NA> | 3668796.0 | 11500NTC202209080001
69 | 11000AGZ200601186888 | 11000PPL200512299378 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 구로구 천왕동 27번지 일대 | 천왕도시개발구역 | 485000.0 | <NA> | <NA> | 485076.5 | 11000NTC200512299378
70 | 11710AGZ202205020001 | 11710PPL202205020003 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 송파구 문정동 649일원 | 문정도시개발구역 | 548239.9 | <NA> | <NA> | 548239.7 | 11000NTC202205020005
71 | 11000AGZ202110010001 | 11000PPL202110010003 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서초구 내곡동 374번지 일대 | 헌인마을 도시개발구역 | 132379.7 | <NA> | <NA> | 132523.0 | 11000NTC202110010003
72 | 11000AGZ202209080002 | 11500PPL202309150001 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 서울특별시 마곡동 일원 | 마곡 도시개발사업 지구단위계획구역 | 3668796.0 | 1 | 5.7 | 3668801.7 | 11500NTC202309150001
73 | 11000AGZ202202280004 | 11000PPL202402150007 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 도봉구 창동 1-9, 1-28 | 1지구 | 14529.0 | <NA> | <NA> | 14529.4 | 11000NTC202402150006
74 | 11000AGZ202202280005 | 11000PPL202402150007 | 서울특별시 | 변경 | 의제처리구역 | 도시개발구역 | <NA> | 도봉구 창동 1-9, 1-29 | 2지구 | 12894.0 | <NA> | <NA> | 12893.6 | 11000NTC202402150006
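
In the sample rows that carry a 면적증감코드, the three area columns look arithmetically linked: 면적변경후 equals 면적기정 plus 면적변경 when the code is 1, and 면적기정 minus 면적변경 when the code is 2. That reading of the codes (1 = increase, 2 = decrease) is an inference from these rows, not something the report states. A minimal consistency check under that assumption, using sample rows 0, 1, 3 and 8:

```python
import math

# (면적기정, 면적증감코드, 면적변경, 면적변경후) taken from sample rows 0, 1, 3 and 8.
rows = [
    (266502.0, "1", 964.4, 267466.4),
    (484992.5, "2", 515.4, 484477.1),
    (3363591.0, "1", 301745.0, 3665336.0),
    (697125.0, "2", 206.0, 696919.0),
]

# Assumption: code "1" means the area increased, code "2" means it decreased.
SIGN = {"1": 1, "2": -1}


def expected_after(before, code, change):
    """Area after the change implied by the previous area and the signed change."""
    return before + SIGN[code] * change


def is_consistent(row, tol=0.05):
    """True when the reported 면적변경후 matches the implied value within tol."""
    before, code, change, after = row
    return math.isclose(expected_after(before, code, change), after, abs_tol=tol)


for row in rows:
    print(row, is_consistent(row))
```

Rows whose 면적증감코드 is `<NA>` have no change recorded and cannot be checked this way.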