Overview

Dataset statistics

Number of variables: 14
Number of observations: 6133
Missing cells: 6190
Missing cells (%): 7.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 694.9 KiB
Average record size in memory: 116.0 B
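
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming "cells" means variables times observations and 1 KiB = 1024 bytes (the function names are illustrative, not part of the report):

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Share of missing cells, in percent, over all cells in the table."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


def avg_record_bytes(total_kib: float, n_observations: int) -> float:
    """Average in-memory size of one record, in bytes."""
    return total_kib * 1024 / n_observations


# Values copied from the dataset statistics above.
print(round(missing_cell_pct(6190, 14, 6133), 1))
print(round(avg_record_bytes(694.9, 6133), 1))
```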

Variable types

Text: 5
Categorical: 6
Numeric: 3

Dataset

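Description: 조서관리코드 (statement management code), 프로젝트코드 (project code), 지자체 (local government), 조서유형(구분) (statement type), 대분류 (major category), 중분류 (middle category), 소분류 (minor category), 위치명 (location name), 지역명 (area name), 면적기정 (previously determined area), 면적증감코드 (area increase/decrease code), 면적변경 (area change), 면적변경후 (area after change), 결정고시관리코드 (decision notice management code)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-20281/S/1/datasetView.do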

Alerts

대분류 has constant value "의제처리구역" (Constant)
중분류 has constant value "정비구역" (Constant)
면적기정 is highly overall correlated with 면적변경후 and 1 other field (High correlation)
면적변경후 is highly overall correlated with 면적기정 and 1 other field (High correlation)
조서유형(구분) is highly overall correlated with 면적증감코드 (High correlation)
면적증감코드 is highly overall correlated with 면적기정 and 2 other fields (High correlation)
지자체 is highly imbalanced (86.1%) (Imbalance)
위치명 has 386 (6.3%) missing values (Missing)
면적기정 has 994 (16.2%) missing values (Missing)
면적변경 has 4543 (74.1%) missing values (Missing)
면적변경후 has 189 (3.1%) missing values (Missing)
면적기정 is highly skewed (γ1 = 67.56132096) (Skewed)
면적변경후 is highly skewed (γ1 = 71.0317499) (Skewed)
조서관리코드 has unique values (Unique)
면적기정 has 1229 (20.0%) zeros (Zeros)
면적변경후 has 246 (4.0%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-29 20:07:46.348475
Analysis finished: 2024-04-29 20:07:49.930789
Duration: 3.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
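
A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions, not the configuration actually used: the CSV path, the report title, and the output file name are hypothetical, and the original config.json is not reproduced here. Only the dataset metadata is taken from the "Dataset" block above.

```python
# Dataset metadata as shown in the report's "Dataset" block.
DATASET_META = {
    "description": (
        "조서관리코드,프로젝트코드,지자체,조서유형(구분),대분류,중분류,소분류,"
        "위치명,지역명,면적기정,면적증감코드,면적변경,면적변경후,결정고시관리코드"
    ),
    "author": "서울특별시",
    "url": "https://data.seoul.go.kr/dataList/OA-20281/S/1/datasetView.do",
}


def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a local CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # csv_path is a hypothetical local export; its encoding may need to be
    # passed to read_csv explicitly, depending on how the file was saved.
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report", dataset=DATASET_META)
    profile.to_file(out_path)
```

Calling `build_report("export.csv")` with a real export should produce a report with the same sections, though exact numbers depend on the file version and on settings not shown here.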

Variables

조서관리코드
Text

UNIQUE 

Distinct6133
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size48.0 KiB

Length

Max length20
Median length20
Mean length20
Min length20

Characters and Unicode

Total characters122660
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6133 ?
Unique (%)100.0%

Sample

1st row11000AGZ198504032330
2nd row11000AGZ198504032331
3rd row11000AGZ198504032332
4th row11000AGZ198505012337
5th row11000AGZ198505142339
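
The sample values suggest a fixed 20-character layout: five digits, three uppercase letters, an eight-digit date, and a four-digit serial. That reading is an inference from the samples, not something the report documents, and the field names below are hypothetical. A minimal parsing sketch under that assumption:

```python
import re
from datetime import date

# Assumed layout: 5-digit area code, 3-letter tag, YYYYMMDD, 4-digit serial.
CODE_RE = re.compile(r"^(\d{5})([A-Z]{3})(\d{4})(\d{2})(\d{2})(\d{4})$")


def parse_code(code: str) -> dict:
    """Split a 20-character management code into its assumed parts."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"unexpected code format: {code!r}")
    area, tag, year, month, day, serial = match.groups()
    return {
        "area": area,
        "tag": tag,
        "date": date(int(year), int(month), int(day)),
        "serial": int(serial),
    }


print(parse_code("11000AGZ198504032330"))
```

The same pattern would also fit the PPL and NTC codes seen elsewhere in this report, if they share the layout.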
Value | Count | Frequency (%)
11000agz198504032330 | 1 | < 0.1%
11000agz197312013006 | 1 | < 0.1%
11000agz197312013002 | 1 | < 0.1%
11000agz197312013001 | 1 | < 0.1%
11000agz197312013000 | 1 | < 0.1%
11000agz197312012999 | 1 | < 0.1%
11000agz197312012998 | 1 | < 0.1%
11000agz197312012997 | 1 | < 0.1%
11000agz197312012996 | 1 | < 0.1%
11000agz197312012995 | 1 | < 0.1%
Other values (6123) | 6123 | 99.8%

Most occurring characters

ValueCountFrequency (%)
0 34373
28.0%
1 26362
21.5%
2 11227
 
9.2%
9 7834
 
6.4%
A 6133
 
5.0%
G 6133
 
5.0%
Z 6133
 
5.0%
8 4834
 
3.9%
3 4589
 
3.7%
7 4411
 
3.6%
Other values (3) 10631
 
8.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 104261
85.0%
Uppercase Letter 18399
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 34373
33.0%
1 26362
25.3%
2 11227
 
10.8%
9 7834
 
7.5%
8 4834
 
4.6%
3 4589
 
4.4%
7 4411
 
4.2%
4 4122
 
4.0%
6 3408
 
3.3%
5 3101
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
A 6133
33.3%
G 6133
33.3%
Z 6133
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 104261
85.0%
Latin 18399
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 34373
33.0%
1 26362
25.3%
2 11227
 
10.8%
9 7834
 
7.5%
8 4834
 
4.6%
3 4589
 
4.4%
7 4411
 
4.2%
4 4122
 
4.0%
6 3408
 
3.3%
5 3101
 
3.0%
Latin
ValueCountFrequency (%)
A 6133
33.3%
G 6133
33.3%
Z 6133
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 122660
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 34373
28.0%
1 26362
21.5%
2 11227
 
9.2%
9 7834
 
6.4%
A 6133
 
5.0%
G 6133
 
5.0%
Z 6133
 
5.0%
8 4834
 
3.9%
3 4589
 
3.7%
7 4411
 
3.6%
Other values (3) 10631
 
8.7%

프로젝트코드
Text

Distinct3739
Distinct (%)61.0%
Missing4
Missing (%)0.1%
Memory size48.0 KiB

Length

Max length20
Median length20
Mean length20
Min length20

Characters and Unicode

Total characters122580
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3226 ?
Unique (%)52.6%

Sample

1st row11000PPL198504036832
2nd row11000PPL198504036832
3rd row11000PPL198504036832
4th row11000PPL198505016898
5th row11000PPL198505146936
Value | Count | Frequency (%)
11000ppl197312011134 | 195 | 3.2%
11000ppl198404275987 | 85 | 1.4%
11000ppl198204264660 | 76 | 1.2%
11000ppl198209244882 | 76 | 1.2%
11000ppl201311216998 | 75 | 1.2%
11000ppl197704301991 | 63 | 1.0%
11000ppl197604071600 | 57 | 0.9%
11000ppl198002123610 | 40 | 0.7%
11000ppl198111134440 | 39 | 0.6%
11000ppl198510117299 | 37 | 0.6%
Other values (3729) | 5386 | 87.9%

Most occurring characters

ValueCountFrequency (%)
0 34171
27.9%
1 25875
21.1%
P 12258
 
10.0%
2 10526
 
8.6%
9 7861
 
6.4%
L 6129
 
5.0%
8 4802
 
3.9%
3 4508
 
3.7%
4 4477
 
3.7%
7 4382
 
3.6%
Other values (2) 7591
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 104193
85.0%
Uppercase Letter 18387
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 34171
32.8%
1 25875
24.8%
2 10526
 
10.1%
9 7861
 
7.5%
8 4802
 
4.6%
3 4508
 
4.3%
4 4477
 
4.3%
7 4382
 
4.2%
6 4040
 
3.9%
5 3551
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
P 12258
66.7%
L 6129
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 104193
85.0%
Latin 18387
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 34171
32.8%
1 25875
24.8%
2 10526
 
10.1%
9 7861
 
7.5%
8 4802
 
4.6%
3 4508
 
4.3%
4 4477
 
4.3%
7 4382
 
4.2%
6 4040
 
3.9%
5 3551
 
3.4%
Latin
ValueCountFrequency (%)
P 12258
66.7%
L 6129
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 122580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 34171
27.9%
1 25875
21.1%
P 12258
 
10.0%
2 10526
 
8.6%
9 7861
 
6.4%
L 6129
 
5.0%
8 4802
 
3.9%
3 4508
 
3.7%
4 4477
 
3.7%
7 4382
 
3.6%
Other values (2) 7591
 
6.2%

지자체
Categorical

IMBALANCE 

Distinct25
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size48.0 KiB
서울특별시: 5700
중구: 99
마포구: 45
종로구: 39
서대문구: 29
Other values (20): 221

Length

Max length5
Median length5
Mean length4.8543943
Min length2

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row서울특별시
2nd row서울특별시
3rd row서울특별시
4th row서울특별시
5th row서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 5700 | 92.9%
중구 | 99 | 1.6%
마포구 | 45 | 0.7%
종로구 | 39 | 0.6%
서대문구 | 29 | 0.5%
영등포구 | 26 | 0.4%
은평구 | 22 | 0.4%
용산구 | 21 | 0.3%
동작구 | 18 | 0.3%
동대문구 | 17 | 0.3%
Other values (15) | 117 | 1.9%

Length

Histogram of lengths of the category

조서유형(구분)
Categorical

HIGH CORRELATION 

Distinct6
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size48.0 KiB
변경: 3765
신설: 1836
폐지: 386
정정: 94
기정: 26

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row변경
2nd row변경
3rd row변경
4th row변경
5th row변경

Common Values

Value | Count | Frequency (%)
변경 | 3765 | 61.4%
신설 | 1836 | 29.9%
폐지 | 386 | 6.3%
정정 | 94 | 1.5%
기정 | 26 | 0.4%
실효 | 26 | 0.4%

Length

Histogram of lengths of the category


대분류
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size48.0 KiB
의제처리구역: 6133

Length

Max length6
Median length6
Mean length6
Min length6

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row의제처리구역
2nd row의제처리구역
3rd row의제처리구역
4th row의제처리구역
5th row의제처리구역

Common Values

Value | Count | Frequency (%)
의제처리구역 | 6133 | 100.0%

Length

Histogram of lengths of the category


중분류
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size48.0 KiB
정비구역: 6133

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row정비구역
2nd row정비구역
3rd row정비구역
4th row정비구역
5th row정비구역

Common Values

Value | Count | Frequency (%)
정비구역 | 6133 | 100.0%

Length

Histogram of lengths of the category


소분류
Categorical

Distinct13
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size48.0 KiB
주택재개발사업구역: 2266
도시환경정비사업지구: 899
도시환경정비사업구역: 716
재개발사업구역: 530
주거환경개선사업: 457
Other values (8): 1265

Length

Max length10
Median length9
Mean length8.6879178
Min length4

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row재개발사업지구
2nd row재개발사업지구
3rd row재개발사업지구
4th row주택재개발사업지구
5th row도시환경정비사업지구

Common Values

Value | Count | Frequency (%)
주택재개발사업구역 | 2266 | 36.9%
도시환경정비사업지구 | 899 | 14.7%
도시환경정비사업구역 | 716 | 11.7%
재개발사업구역 | 530 | 8.6%
주거환경개선사업 | 457 | 7.5%
재개발사업지구 | 434 | 7.1%
주택재건축사업 | 336 | 5.5%
재건축사업구역 | 261 | 4.3%
주택재개발사업지구 | 108 | 1.8%
주거환경개선사업구역 | 92 | 1.5%
Other values (3) | 34 | 0.6%

Length

Histogram of lengths of the category

위치명
Text

MISSING 

Distinct4477
Distinct (%)77.9%
Missing386
Missing (%)6.3%
Memory size48.0 KiB

Length

Max length221
Median length54
Mean length16.440926
Min length3

Characters and Unicode

Total characters94486
Distinct characters238
Distinct categories10 ?
Distinct scripts3 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3687 ?
Unique (%)64.2%

Sample

1st row소공동 17번지일대
2nd row소공동 111번지일대
3rd row태평로2가 43일대
4th row성동구 하왕십리동 890번지 일대
5th row서린동 33번지 일대(41필지)
Value | Count | Frequency (%)
일대 | 3325 | 15.8%
일원 | 451 | 2.1%
서대문구 | 438 | 2.1%
종로구 | 413 | 2.0%
중구 | 363 | 1.7%
동대문구 | 354 | 1.7%
성북구 | 341 | 1.6%
마포구 | 340 | 1.6%
성동구 | 318 | 1.5%
동작구 | 246 | 1.2%
Other values (3976) | 14435 | 68.7%

Most occurring characters

ValueCountFrequency (%)
15501
16.4%
6607
 
7.0%
5561
 
5.9%
5169
 
5.5%
4917
 
5.2%
1 4721
 
5.0%
4351
 
4.6%
4050
 
4.3%
2 2994
 
3.2%
- 2320
 
2.5%
Other values (228) 38295
40.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 55299
58.5%
Decimal Number 20581
 
21.8%
Space Separator 15501
 
16.4%
Dash Punctuation 2320
 
2.5%
Other Punctuation 652
 
0.7%
Open Punctuation 58
 
0.1%
Close Punctuation 58
 
0.1%
Math Symbol 10
 
< 0.1%
Control 6
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6607
 
11.9%
5561
 
10.1%
5169
 
9.3%
4917
 
8.9%
4351
 
7.9%
4050
 
7.3%
1135
 
2.1%
1133
 
2.0%
1089
 
2.0%
940
 
1.7%
Other values (207) 20347
36.8%
Decimal Number
ValueCountFrequency (%)
1 4721
22.9%
2 2994
14.5%
3 2246
10.9%
4 1833
 
8.9%
5 1827
 
8.9%
0 1633
 
7.9%
6 1490
 
7.2%
7 1408
 
6.8%
8 1251
 
6.1%
9 1178
 
5.7%
Other Punctuation
ValueCountFrequency (%)
, 643
98.6%
. 4
 
0.6%
? 3
 
0.5%
: 2
 
0.3%
Space Separator
ValueCountFrequency (%)
15501
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2320
100.0%
Open Punctuation
ValueCountFrequency (%)
( 58
100.0%
Close Punctuation
ValueCountFrequency (%)
) 58
100.0%
Math Symbol
ValueCountFrequency (%)
~ 10
100.0%
Control
ValueCountFrequency (%)
6
100.0%
Uppercase Letter
ValueCountFrequency (%)
H 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 55299
58.5%
Common 39186
41.5%
Latin 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6607
 
11.9%
5561
 
10.1%
5169
 
9.3%
4917
 
8.9%
4351
 
7.9%
4050
 
7.3%
1135
 
2.1%
1133
 
2.0%
1089
 
2.0%
940
 
1.7%
Other values (207) 20347
36.8%
Common
ValueCountFrequency (%)
15501
39.6%
1 4721
 
12.0%
2 2994
 
7.6%
- 2320
 
5.9%
3 2246
 
5.7%
4 1833
 
4.7%
5 1827
 
4.7%
0 1633
 
4.2%
6 1490
 
3.8%
7 1408
 
3.6%
Other values (10) 3213
 
8.2%
Latin
ValueCountFrequency (%)
H 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 55296
58.5%
ASCII 39187
41.5%
Compat Jamo 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
15501
39.6%
1 4721
 
12.0%
2 2994
 
7.6%
- 2320
 
5.9%
3 2246
 
5.7%
4 1833
 
4.7%
5 1827
 
4.7%
0 1633
 
4.2%
6 1490
 
3.8%
7 1408
 
3.6%
Other values (11) 3214
 
8.2%
Hangul
ValueCountFrequency (%)
6607
 
11.9%
5561
 
10.1%
5169
 
9.3%
4917
 
8.9%
4351
 
7.9%
4050
 
7.3%
1135
 
2.1%
1133
 
2.0%
1089
 
2.0%
940
 
1.7%
Other values (204) 20344
36.8%
Compat Jamo
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

지역명
Text

Distinct4454
Distinct (%)73.3%
Missing58
Missing (%)0.9%
Memory size48.0 KiB

Length

Max length43
Median length35
Mean length11.099588
Min length1

Characters and Unicode

Total characters67430
Distinct characters313
Distinct categories11 ?
Distinct scripts3 ?
Distinct blocks5 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3490 ?
Unique (%)57.4%

Sample

1st row소공4-1도시환경정비지구
2nd row소공4-2도시환경정비지구
3rd row소공4-3도시환경정비지구
4th row하왕제1구역제1지구
5th row서린구역제12지구
Value | Count | Frequency (%)
주택재건축 | 165 | 1.9%
정비구역 | 147 | 1.7%
도시환경정비구역 | 94 | 1.1%
정비사업 | 92 | 1.1%
재개발구역 | 88 | 1.0%
도시정비형 | 87 | 1.0%
주택재건축정비사업 | 62 | 0.7%
주택재개발정비사업 | 59 | 0.7%
재정비촉진구역 | 55 | 0.6%
도시환경정비사업 | 54 | 0.6%
Other values (4258) | 7609 | 89.4%

Most occurring characters

ValueCountFrequency (%)
6147
 
9.1%
4323
 
6.4%
2839
 
4.2%
2660
 
3.9%
2648
 
3.9%
1 2634
 
3.9%
2593
 
3.8%
2555
 
3.8%
2278
 
3.4%
2239
 
3.3%
Other values (303) 36514
54.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 55474
82.3%
Decimal Number 7811
 
11.6%
Space Separator 2593
 
3.8%
Dash Punctuation 908
 
1.3%
Open Punctuation 166
 
0.2%
Close Punctuation 166
 
0.2%
Other Punctuation 143
 
0.2%
Uppercase Letter 115
 
0.2%
Math Symbol 50
 
0.1%
Connector Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6147
 
11.1%
4323
 
7.8%
2839
 
5.1%
2660
 
4.8%
2648
 
4.8%
2555
 
4.6%
2278
 
4.1%
2239
 
4.0%
2079
 
3.7%
2067
 
3.7%
Other values (271) 25639
46.2%
Decimal Number
ValueCountFrequency (%)
1 2634
33.7%
2 1622
20.8%
3 913
 
11.7%
4 735
 
9.4%
5 513
 
6.6%
6 406
 
5.2%
7 316
 
4.0%
8 266
 
3.4%
9 213
 
2.7%
0 193
 
2.5%
Uppercase Letter
ValueCountFrequency (%)
A 38
33.0%
C 27
23.5%
H 19
16.5%
B 17
14.8%
I 6
 
5.2%
E 3
 
2.6%
D 3
 
2.6%
F 1
 
0.9%
G 1
 
0.9%
Other Punctuation
ValueCountFrequency (%)
, 64
44.8%
? 46
32.2%
. 31
21.7%
2
 
1.4%
Math Symbol
ValueCountFrequency (%)
~ 46
92.0%
< 2
 
4.0%
> 2
 
4.0%
Space Separator
ValueCountFrequency (%)
2593
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 908
100.0%
Open Punctuation
ValueCountFrequency (%)
( 166
100.0%
Close Punctuation
ValueCountFrequency (%)
) 166
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 55474
82.3%
Common 11841
 
17.6%
Latin 115
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6147
 
11.1%
4323
 
7.8%
2839
 
5.1%
2660
 
4.8%
2648
 
4.8%
2555
 
4.6%
2278
 
4.1%
2239
 
4.0%
2079
 
3.7%
2067
 
3.7%
Other values (271) 25639
46.2%
Common
ValueCountFrequency (%)
1 2634
22.2%
2593
21.9%
2 1622
13.7%
3 913
 
7.7%
- 908
 
7.7%
4 735
 
6.2%
5 513
 
4.3%
6 406
 
3.4%
7 316
 
2.7%
8 266
 
2.2%
Other values (13) 935
 
7.9%
Latin
ValueCountFrequency (%)
A 38
33.0%
C 27
23.5%
H 19
16.5%
B 17
14.8%
I 6
 
5.2%
E 3
 
2.6%
D 3
 
2.6%
F 1
 
0.9%
G 1
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 55472
82.3%
ASCII 11953
 
17.7%
None 2
 
< 0.1%
Compat Jamo 2
 
< 0.1%
CJK Compat 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6147
 
11.1%
4323
 
7.8%
2839
 
5.1%
2660
 
4.8%
2648
 
4.8%
2555
 
4.6%
2278
 
4.1%
2239
 
4.0%
2079
 
3.7%
2067
 
3.7%
Other values (269) 25637
46.2%
ASCII
ValueCountFrequency (%)
1 2634
22.0%
2593
21.7%
2 1622
13.6%
3 913
 
7.6%
- 908
 
7.6%
4 735
 
6.1%
5 513
 
4.3%
6 406
 
3.4%
7 316
 
2.6%
8 266
 
2.2%
Other values (20) 1047
 
8.8%
None
ValueCountFrequency (%)
2
100.0%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK Compat
ValueCountFrequency (%)
1
100.0%

면적기정
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED  ZEROS 

Distinct: 3311
Distinct (%): 64.4%
Missing: 994
Missing (%): 16.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 43132.833
Minimum: 0
Maximum: 23572124
Zeros: 1229
Zeros (%): 20.0%
Negative: 0
Negative (%): 0.0%
Memory size: 54.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1077.5
median: 13188.8
Q3: 52267.4
95-th percentile: 158183
Maximum: 23572124
Range: 23572124
Interquartile range (IQR): 51189.9

Descriptive statistics

Standard deviation: 334905.36
Coefficient of variation (CV): 7.7645111
Kurtosis: 4744.3267
Mean: 43132.833
Median Absolute Deviation (MAD): 13188.8
Skewness: 67.561321
Sum: 2.2165963 × 10^8
Variance: 1.121616 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
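
Several of the descriptive statistics for 면적기정 are derived from others in the same block. A small sketch, using the figures printed above, shows how the coefficient of variation, the interquartile range, and the variance relate to the mean, standard deviation, and quartiles (the function name is illustrative):

```python
def derived_stats(mean: float, std: float, q1: float, q3: float) -> dict:
    """Recompute CV, IQR, and variance from the basic summary statistics."""
    return {
        "cv": std / mean,        # coefficient of variation
        "iqr": q3 - q1,          # interquartile range
        "variance": std ** 2,    # variance is the squared standard deviation
    }


# Values copied from the 면적기정 statistics above.
stats = derived_stats(mean=43132.833, std=334905.36, q1=1077.5, q3=52267.4)
for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```

A CV well above 1 together with a mean far from the median is consistent with the "highly skewed" alert for this variable.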
Common values
Value | Count | Frequency (%)
0.0 | 1229 | 20.0%
1330.0 | 12 | 0.2%
1477.4 | 9 | 0.1%
88893.0 | 7 | 0.1%
12888.0 | 7 | 0.1%
1353.0 | 7 | 0.1%
243010.0 | 5 | 0.1%
49679.0 | 5 | 0.1%
91000.0 | 5 | 0.1%
244294.0 | 5 | 0.1%
Other values (3301) | 3848 | 62.7%
(Missing) | 994 | 16.2%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 1229 | 20.0%
15.46 | 1 | < 0.1%
29.94 | 1 | < 0.1%
64.0 | 1 | < 0.1%
65.7 | 1 | < 0.1%
89.3 | 1 | < 0.1%
127.6 | 1 | < 0.1%
140.85 | 1 | < 0.1%
209.8 | 1 | < 0.1%
223.5 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
23572124.0 | 1 | < 0.1%
1469460.8 | 1 | < 0.1%
1297882.0 | 1 | < 0.1%
898153.6 | 1 | < 0.1%
626232.5 | 1 | < 0.1%
472298.1 | 1 | < 0.1%
433000.0 | 1 | < 0.1%
431400.0 | 5 | 0.1%
405782.4 | 2 | < 0.1%
399900.0 | 1 | < 0.1%

면적증감코드
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size: 48.0 KiB
<NA>: 4606
1: 907
2: 620

Length

Max length4
Median length4
Mean length3.2530572
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

Value | Count | Frequency (%)
<NA> | 4606 | 75.1%
1 | 907 | 14.8%
2 | 620 | 10.1%

Length

Histogram of lengths of the category


면적변경
Real number (ℝ)

MISSING 

Distinct: 1218
Distinct (%): 76.6%
Missing: 4543
Missing (%): 74.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 19457.337
Minimum: 0
Maximum: 563151.2
Zeros: 10
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 54.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.445
Q1: 44.325
median: 469.9
Q3: 18374.64
95-th percentile: 98326.36
Maximum: 563151.2
Range: 563151.2
Interquartile range (IQR): 18330.315

Descriptive statistics

Standard deviation: 46484.506
Coefficient of variation (CV): 2.3890476
Kurtosis: 36.28086
Mean: 19457.337
Median Absolute Deviation (MAD): 468.9
Skewness: 5.0184567
Sum: 30937166
Variance: 2.1608093 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1.0 | 12 | 0.2%
0.1 | 12 | 0.2%
0.0 | 10 | 0.2%
8.0 | 8 | 0.1%
0.2 | 8 | 0.1%
0.4 | 8 | 0.1%
5.0 | 7 | 0.1%
3.0 | 7 | 0.1%
9.0 | 7 | 0.1%
6.0 | 6 | 0.1%
Other values (1208) | 1505 | 24.5%
(Missing) | 4543 | 74.1%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 10 | 0.2%
0.09 | 1 | < 0.1%
0.1 | 12 | 0.2%
0.2 | 8 | 0.1%
0.3 | 5 | 0.1%
0.4 | 8 | 0.1%
0.5 | 6 | 0.1%
0.6 | 3 | < 0.1%
0.7 | 2 | < 0.1%
0.8 | 4 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
563151.2 | 1 | < 0.1%
530399.0 | 1 | < 0.1%
399741.7 | 1 | < 0.1%
393729.0 | 2 | < 0.1%
332929.0 | 2 | < 0.1%
318415.0 | 1 | < 0.1%
287743.1 | 1 | < 0.1%
279472.0 | 1 | < 0.1%
279110.0 | 1 | < 0.1%
246208.3 | 1 | < 0.1%

면적변경후
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED  ZEROS 

Distinct: 4895
Distinct (%): 82.4%
Missing: 189
Missing (%): 3.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 48968.163
Minimum: 0
Maximum: 22952124
Zeros: 246
Zeros (%): 4.0%
Negative: 0
Negative (%): 0.0%
Memory size: 54.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 125.43
Q1: 4627.37
median: 22561.5
Q3: 58568.525
95-th percentile: 168431.81
Maximum: 22952124
Range: 22952124
Interquartile range (IQR): 53941.155

Descriptive statistics

Standard deviation: 305468.35
Coefficient of variation (CV): 6.2381011
Kurtosis: 5320.248
Mean: 48968.163
Median Absolute Deviation (MAD): 20313
Skewness: 71.03175
Sum: 2.9106676 × 10^8
Variance: 9.3310913 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0 | 246 | 4.0%
115172.0 | 6 | 0.1%
50076.8 | 5 | 0.1%
42009.0 | 4 | 0.1%
13600.0 | 4 | 0.1%
13221.5 | 4 | 0.1%
65191.0 | 4 | 0.1%
13188.8 | 4 | 0.1%
5997.0 | 4 | 0.1%
12926.8 | 4 | 0.1%
Other values (4885) | 5659 | 92.3%
(Missing) | 189 | 3.1%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 246 | 4.0%
11.6 | 1 | < 0.1%
29.94 | 1 | < 0.1%
41.3 | 1 | < 0.1%
50.34 | 1 | < 0.1%
59.1 | 1 | < 0.1%
59.2 | 2 | < 0.1%
60.0 | 1 | < 0.1%
62.0 | 1 | < 0.1%
65.4 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
22952124.0 | 1 | < 0.1%
1786304.0 | 1 | < 0.1%
1469460.7 | 1 | < 0.1%
1313984.0 | 1 | < 0.1%
898252.6 | 1 | < 0.1%
626232.5 | 3 | < 0.1%
563151.2 | 1 | < 0.1%
530399.0 | 1 | < 0.1%
433000.0 | 1 | < 0.1%
431400.0 | 1 | < 0.1%

결정고시관리코드
Text

Distinct3740
Distinct (%)61.1%
Missing16
Missing (%)0.3%
Memory size48.0 KiB

Length

Max length20
Median length20
Mean length20
Min length20

Characters and Unicode

Total characters122340
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3231 ?
Unique (%)52.8%

Sample

1st row11000NTC198504036832
2nd row11000NTC198504036832
3rd row11000NTC198504036832
4th row11000NTC198505016898
5th row11000NTC198505146936
Value | Count | Frequency (%)
11000ntc197312011134 | 195 | 3.2%
11000ntc198404275987 | 85 | 1.4%
11000ntc198209244882 | 76 | 1.2%
11000ntc198204264660 | 76 | 1.2%
11000ntc201311216998 | 75 | 1.2%
11000ntc197704301991 | 63 | 1.0%
11000ntc197604071600 | 57 | 0.9%
11000ntc198002123610 | 40 | 0.7%
11000ntc198111134440 | 39 | 0.6%
11000ntc198510117299 | 37 | 0.6%
Other values (3730) | 5374 | 87.9%

Most occurring characters

ValueCountFrequency (%)
0 34041
27.8%
1 25717
21.0%
2 10556
 
8.6%
9 7864
 
6.4%
N 6117
 
5.0%
T 6117
 
5.0%
C 6117
 
5.0%
8 4811
 
3.9%
3 4584
 
3.7%
4 4421
 
3.6%
Other values (3) 11995
 
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 103989
85.0%
Uppercase Letter 18351
 
15.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 34041
32.7%
1 25717
24.7%
2 10556
 
10.2%
9 7864
 
7.6%
8 4811
 
4.6%
3 4584
 
4.4%
4 4421
 
4.3%
7 4370
 
4.2%
6 4029
 
3.9%
5 3596
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
N 6117
33.3%
T 6117
33.3%
C 6117
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 103989
85.0%
Latin 18351
 
15.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 34041
32.7%
1 25717
24.7%
2 10556
 
10.2%
9 7864
 
7.6%
8 4811
 
4.6%
3 4584
 
4.4%
4 4421
 
4.3%
7 4370
 
4.2%
6 4029
 
3.9%
5 3596
 
3.5%
Latin
ValueCountFrequency (%)
N 6117
33.3%
T 6117
33.3%
C 6117
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 122340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 34041
27.8%
1 25717
21.0%
2 10556
 
8.6%
9 7864
 
6.4%
N 6117
 
5.0%
T 6117
 
5.0%
C 6117
 
5.0%
8 4811
 
3.9%
3 4584
 
3.7%
4 4421
 
3.6%
Other values (3) 11995
 
9.8%

Interactions


Correlations

Variable | 지자체 | 조서유형(구분) | 소분류 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후
지자체 | 1.000 | 0.149 | 0.407 | 0.000 | 0.076 | 0.000 | 0.000
조서유형(구분) | 0.149 | 1.000 | 0.311 | 0.000 | 0.773 | 0.348 | 0.000
소분류 | 0.407 | 0.311 | 1.000 | 0.000 | 0.186 | 0.168 | 0.000
면적기정 | 0.000 | 0.000 | 0.000 | 1.000 | NaN | NaN | 0.707
면적증감코드 | 0.076 | 0.773 | 0.186 | NaN | 1.000 | 0.088 | NaN
면적변경 | 0.000 | 0.348 | 0.168 | NaN | 0.088 | 1.000 | NaN
면적변경후 | 0.000 | 0.000 | 0.000 | 0.707 | NaN | NaN | 1.000

Variable | 지자체 | 조서유형(구분) | 소분류 | 면적증감코드
지자체 | 1.000 | 0.067 | 0.147 | 0.066
조서유형(구분) | 0.067 | 1.000 | 0.127 | 0.578
소분류 | 0.147 | 0.127 | 1.000 | 0.144
면적증감코드 | 0.066 | 0.578 | 0.144 | 1.000

Variable | 면적기정 | 면적변경 | 면적변경후 | 지자체 | 조서유형(구분) | 소분류 | 면적증감코드
면적기정 | 1.000 | -0.042 | 0.546 | 0.000 | 0.000 | 0.000 | 1.000
면적변경 | -0.042 | 1.000 | 0.101 | 0.000 | 0.181 | 0.072 | 0.094
면적변경후 | 0.546 | 0.101 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000
지자체 | 0.000 | 0.000 | 0.000 | 1.000 | 0.067 | 0.147 | 0.066
조서유형(구분) | 0.000 | 0.181 | 0.000 | 0.067 | 1.000 | 0.127 | 0.578
소분류 | 0.000 | 0.072 | 0.000 | 0.147 | 0.127 | 1.000 | 0.144
면적증감코드 | 1.000 | 0.094 | 1.000 | 0.066 | 0.578 | 0.144 | 1.000
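
The report text does not state which association measure each matrix uses. As one illustration of how association between two categorical columns (for example 조서유형(구분) and 면적증감코드) can be quantified, here is a plain, uncorrected Cramér's V in standard-library Python. Treating this as the measure behind any of the matrices above is an assumption, and a bias-corrected variant would give somewhat different values.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs: list, ys: list) -> float:
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return sqrt(chi2 / (n * k))


# Toy data: the first pair is perfectly associated, the second independent.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```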

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index | 조서관리코드 | 프로젝트코드 | 지자체 | 조서유형(구분) | 대분류 | 중분류 | 소분류 | 위치명 | 지역명 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후 | 결정고시관리코드
0 | 11000AGZ198504032330 | 11000PPL198504036832 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 재개발사업지구 | 소공동 17번지일대 | 소공4-1도시환경정비지구 | 3378.1 | <NA> | <NA> | 2472.0 | 11000NTC198504036832
1 | 11000AGZ198504032331 | 11000PPL198504036832 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 재개발사업지구 | 소공동 111번지일대 | 소공4-2도시환경정비지구 | 3115.0 | <NA> | <NA> | 4030.1 | 11000NTC198504036832
2 | 11000AGZ198504032332 | 11000PPL198504036832 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 재개발사업지구 | 태평로2가 43일대 | 소공4-3도시환경정비지구 | 3197.0 | <NA> | <NA> | 3197.0 | 11000NTC198504036832
3 | 11000AGZ198505012337 | 11000PPL198505016898 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 주택재개발사업지구 | 성동구 하왕십리동 890번지 일대 | 하왕제1구역제1지구 | <NA> | <NA> | <NA> | 63270.0 | 11000NTC198505016898
4 | 11000AGZ198505142339 | 11000PPL198505146936 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 도시환경정비사업지구 | 서린동 33번지 일대(41필지) | 서린구역제12지구 | 7363.4 | <NA> | <NA> | 7472.1 | 11000NTC198505146936
5 | 11000AGZ198505172340 | 11000PPL198505176942 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 강동구 천호동 410-90외 214필지 | 천호제2구역 | 8834.0 | <NA> | <NA> | 9169.0 | 11000NTC198505176942
6 | 11000AGZ198505232341 | 11000PPL198505236956 | 서울특별시 | 신설 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 강남구 반포동 539번지 일원 | 반포1구역 | 0.0 | <NA> | <NA> | 25023.0 | 11000NTC198505236956
7 | 11000AGZ198505232342 | 11000PPL198505236956 | 서울특별시 | 신설 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 서울특별시 강동구 천호동 410번지 일원 | 천호3구역 | 0.0 | <NA> | <NA> | 6970.0 | 11000NTC198505236956
8 | 11000AGZ198505232343 | 11000PPL198505236956 | 서울특별시 | 신설 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 서울특별시 서대문구 대현동 61번지 일원 | 대현1주택재개발정비구역 | 0.0 | <NA> | <NA> | 40870.0 | 11000NTC198505236956
9 | 11000AGZ198408132229 | 11000PPL198408136236 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 재개발사업지구 | 중구 남대문로 5가 301번지 일대 | 양동5도시환경정비지구 | 5907.0 | <NA> | <NA> | 6060.2 | 11000NTC198408136236

Last rows
Index | 조서관리코드 | 프로젝트코드 | 지자체 | 조서유형(구분) | 대분류 | 중분류 | 소분류 | 위치명 | 지역명 | 면적기정 | 면적증감코드 | 면적변경 | 면적변경후 | 결정고시관리코드
6123 | 11110AGZ201905150003 | 11110PPL202307110001 | 종로구 | 변경 | 의제처리구역 | 정비구역 | 도시환경정비사업구역 | 종로구 낙원동 283-15번지 일대 | 공평 도시정비형 재개발구역 소단위 공동개발지구 도시정비형 재개발사업 | 1771.3 | 1 | 415.4 | 2186.7 | 11110NTC202307170004
6124 | 11000AGZ201312119665 | 11110PPL202307110001 | 서울특별시 | 폐지 | 의제처리구역 | 정비구역 | 도시환경정비사업지구 | 종로구 낙원동 283-13대 | 공평B8도시환경정비사업 | 89.3 | 2 | 89.3 | 0.0 | 11110NTC202307170004
6125 | 11000AGZ201312119666 | 11110PPL202307110001 | 서울특별시 | 폐지 | 의제처리구역 | 정비구역 | 도시환경정비사업지구 | 종로구 낙원동 283-6대, 283-38 | 공평B9도시환경정비사업 | 209.8 | 2 | 209.8 | 0.0 | 11110NTC202307170004
6126 | 11000AGZ201312119667 | 11110PPL202307110001 | 서울특별시 | 폐지 | 의제처리구역 | 정비구역 | 도시환경정비사업지구 | 종로구 낙원동 283-35대 | 공평B10도시환경정비사업 | 65.7 | 2 | 65.7 | 0.0 | 11110NTC202307170004
6127 | 11000AGZ202403260002 | 11000PPL202403260002 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 성북구 장위동 68-37일대 | 장위10구역 | 94037.0 | 2 | 2675.0 | 91362.0 | 11000NTC202403260002
6128 | 11380AGZ202308160001 | 11380PPL202308160002 | 은평구 | 변경 | 의제처리구역 | 정비구역 | 주택재건축사업 | 은평구 불광동 19-3번지 일원 | 불광1 주택재건축 정비구역 | 25692.0 | 1 | 260.0 | 25952.0 | 11380NTC202308160004
6129 | 11000AGZ202307120002 | 11000PPL202307120005 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 재개발사업구역 | 상계동 95-3번지 일대 | 상계 6 재정비촉진구역 | 66084.9 | 2 | 400.0 | 65684.9 | 11000NTC202307120005
6130 | 11000AGZ202310130001 | 11000PPL202310130005 | 서울특별시 | 변경 | 의제처리구역 | 정비구역 | 주택재건축사업 | 강서구 개화동로25길 39(방화동 615-103) 일대 | 방화3재정비촉진구역(주택재건축사업) | 92152.0 | 2 | 13.0 | 92139.0 | 11000NTC202310130005
6131 | 11470AGZ202309070001 | 11470PPL202309070002 | 양천구 | 신설 | 의제처리구역 | 정비구역 | 주택재개발사업구역 | 서울특별시 양천구 신정동 1152번지 일대 | 신정동 1152번지 일대 주택정비형 재개발사업 | 0.0 | 1 | 44082.8 | 44082.8 | 11470NTC202309070003
6132 | 11170AGZ202003270002 | 11170PPL202311150002 | 용산구 | 변경 | 의제처리구역 | 정비구역 | 도시환경정비사업구역 | 용산구 한강로2가 2-194호 일대 | 신용산역 북측 2구역 도시정비형 재개발구역 | 22324.2 | 1 | 0.6 | 22324.8 | 11170NTC202311150001
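
In the last ten sample rows, the three area columns appear to obey a simple rule: 면적변경후 equals 면적기정 plus 면적변경 when 면적증감코드 is 1, and 면적기정 minus 면적변경 when it is 2. That reading of the codes (1 = increase, 2 = decrease) is inferred from these rows only and is not documented in the report. A minimal consistency check under that assumption:

```python
def expected_area_after(before: float, code: int, change: float) -> float:
    """Area after the change, assuming code 1 = increase and 2 = decrease."""
    if code == 1:
        return before + change
    if code == 2:
        return before - change
    raise ValueError(f"unknown area change code: {code!r}")


# (면적기정, 면적증감코드, 면적변경, 면적변경후) from the last ten sample rows.
SAMPLE_ROWS = [
    (1771.3, 1, 415.4, 2186.7),
    (89.3, 2, 89.3, 0.0),
    (209.8, 2, 209.8, 0.0),
    (65.7, 2, 65.7, 0.0),
    (94037.0, 2, 2675.0, 91362.0),
    (25692.0, 1, 260.0, 25952.0),
    (66084.9, 2, 400.0, 65684.9),
    (92152.0, 2, 13.0, 92139.0),
    (0.0, 1, 44082.8, 44082.8),
    (22324.2, 1, 0.6, 22324.8),
]


def inconsistent_rows(rows: list, tol: float = 0.05) -> list:
    """Return rows whose recorded area differs from the expected one by more than tol."""
    return [
        row for row in rows
        if abs(expected_area_after(row[0], row[1], row[2]) - row[3]) > tol
    ]


print(inconsistent_rows(SAMPLE_ROWS))
```

Rows where 면적증감코드 is <NA>, as in the first ten sample rows, fall outside this check and would need separate handling.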