Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
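
The average record size follows from the two figures reported with it. As a quick check, assuming the report uses binary units (1 KiB = 1024 B), dividing the total memory by the number of observations reproduces the reported value. The variable names below are illustrative only.

```python
# Figures copied from the dataset statistics.
total_size_kib = 468.8
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "48.0 B"
```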

Variable types

Text: 2
Categorical: 3

Dataset

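Description: Cadastral (land registry) data for Dongnae-gu, Busan Metropolitan City. It provides fields such as the lot number before the change (변경전지번), the lot number after the change (변경후지번), the change date (변경일자), the reason for the change (변경사유), and the project name (사업명).
Author: Dongnae-gu, Busan Metropolitan City (부산광역시 동래구)
URL: https://www.data.go.kr/data/3040254/fileData.do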

Alerts

Dataset has 1 (< 0.1%) duplicate row [Duplicates]
변경일자 is highly overall correlated with 변경사유 and 1 other field [High correlation]
사업명 is highly overall correlated with 변경일자 and 1 other field [High correlation]
변경사유 is highly overall correlated with 변경일자 and 1 other field [High correlation]
변경일자 is highly imbalanced (68.7%) [Imbalance]
변경사유 is highly imbalanced (78.5%) [Imbalance]
사업명 is highly imbalanced (68.2%) [Imbalance]
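
The imbalance percentages can be read as a measure of how far a category distribution is from uniform. The sketch below assumes the score is one minus the Shannon entropy of the category counts, normalized by its maximum (log2 of the number of categories). This is believed to be close to what ydata-profiling computes, but the exact formula is an assumption here, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for category counts (0 = uniform, near 1 = one category dominates)."""
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        return 0.0  # convention chosen here for a single category
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1.0 - entropy / math.log2(len(counts))

# Toy counts, not taken from the dataset.
print(imbalance_score([50, 50]))        # balanced
print(imbalance_score([9900, 50, 50]))  # one dominant category
```

Reproducing the reported 68.7%, 78.5%, and 68.2% would require the full count for every category, which the report only partly lists.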

Reproduction

Analysis started: 2024-03-14 16:22:37.792918
Analysis finished: 2024-03-14 16:22:39.015064
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
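
The reported duration is the difference between the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-14 16:22:37.792918")
finished = datetime.fromisoformat("2024-03-14 16:22:39.015064")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "1.22 seconds"
```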

Variables

변경전지번
Text

Distinct: 3453
Distinct (%): 34.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 9.7083
Min length: 6

Characters and Unicode

Total characters: 97083
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
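
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It runs only over the five sample values listed for this variable, not the full column, so the counts differ from the report's totals.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable.
samples = [
    "사직동 134 - 1",
    "온천동 968-15",
    "온천동 1031-22",
    "온천동1503-32",
    "온천동 928-61",
]

# Lo = Other Letter, Nd = Decimal Number, Zs = Space Separator, Pd = Dash Punctuation
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```

The four categories that appear here are the same four the report lists for the whole column.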

Unique

Unique: 1642
Unique (%): 16.4%

Sample

1st row: 사직동 134 - 1
2nd row: 온천동 968-15
3rd row: 온천동 1031-22
4th row: 온천동1503-32
5th row: 온천동 928-61

Value | Count | Frequency (%)
온천동 | 8028 | 37.4%
(empty) | 856 | 4.0%
명장동 | 602 | 2.8%
사직동 | 472 | 2.2%
안락동 | 325 | 1.5%
1 | 202 | 0.9%
명륜동 | 168 | 0.8%
2 | 137 | 0.6%
3 | 99 | 0.5%
수안동 | 90 | 0.4%
Other values (2998) | 10510 | 48.9%
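
The word counts include bare numbers such as `1` and an entry with an empty value (856 occurrences). One plausible explanation, sketched below, is that each value is lowercased, split on whitespace, and stripped of surrounding punctuation, so a spaced lot number like `사직동 134 - 1` yields an empty token for the dash. That tokenization is an assumption about how the report was produced; it is consistent with the lowercase `view` and the unclosed parentheses in the word tables of later variables.

```python
import string
from collections import Counter

# The five sample values listed for this variable.
samples = [
    "사직동 134 - 1",
    "온천동 968-15",
    "온천동 1031-22",
    "온천동1503-32",
    "온천동 928-61",
]

# Assumed tokenization: lowercase, whitespace split, strip edge punctuation.
strip_chars = string.punctuation + string.whitespace
tokens = [
    token.strip(strip_chars)
    for value in samples
    for token in value.lower().split()
]
word_counts = Counter(tokens)
print(word_counts.most_common(3))
```

Note that `온천동1503-32`, which has no space after the district name, is counted as a single token rather than as `온천동`.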

Most occurring characters

Value | Count | Frequency (%)
(space) | 11489 | 11.8%
(not shown) | 10000 | 10.3%
- | 9305 | 9.6%
(not shown) | 8262 | 8.5%
(not shown) | 8251 | 8.5%
1 | 8201 | 8.4%
9 | 6400 | 6.6%
8 | 5661 | 5.8%
2 | 5152 | 5.3%
7 | 4685 | 4.8%
Other values (26) | 19677 | 20.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 46251 | 47.6%
Other Letter | 30038 | 30.9%
Space Separator | 11489 | 11.8%
Dash Punctuation | 9305 | 9.6%

Most frequent character per category

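Other Letter

Value | Count | Frequency (%)
(not shown) | 10000 | 33.3%
(not shown) | 8262 | 27.5%
(not shown) | 8251 | 27.5%
(not shown) | 770 | 2.6%
(not shown) | 602 | 2.0%
(not shown) | 472 | 1.6%
(not shown) | 472 | 1.6%
(not shown) | 415 | 1.4%
(not shown) | 325 | 1.1%
(not shown) | 168 | 0.6%
Other values (14) | 301 | 1.0%

Decimal Number

Value | Count | Frequency (%)
1 | 8201 | 17.7%
9 | 6400 | 13.8%
8 | 5661 | 12.2%
2 | 5152 | 11.1%
7 | 4685 | 10.1%
0 | 3896 | 8.4%
6 | 3237 | 7.0%
3 | 3120 | 6.7%
4 | 3040 | 6.6%
5 | 2859 | 6.2%

Space Separator

Value | Count | Frequency (%)
(space) | 11489 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 9305 | 100.0%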

Most occurring scripts

Value | Count | Frequency (%)
Common | 67045 | 69.1%
Hangul | 30038 | 30.9%

Most frequent character per script

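Hangul

Value | Count | Frequency (%)
(not shown) | 10000 | 33.3%
(not shown) | 8262 | 27.5%
(not shown) | 8251 | 27.5%
(not shown) | 770 | 2.6%
(not shown) | 602 | 2.0%
(not shown) | 472 | 1.6%
(not shown) | 472 | 1.6%
(not shown) | 415 | 1.4%
(not shown) | 325 | 1.1%
(not shown) | 168 | 0.6%
Other values (14) | 301 | 1.0%

Common

Value | Count | Frequency (%)
(space) | 11489 | 17.1%
- | 9305 | 13.9%
1 | 8201 | 12.2%
9 | 6400 | 9.5%
8 | 5661 | 8.4%
2 | 5152 | 7.7%
7 | 4685 | 7.0%
0 | 3896 | 5.8%
6 | 3237 | 4.8%
3 | 3120 | 4.7%
Other values (2) | 5899 | 8.8%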

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 67045 | 69.1%
Hangul | 30038 | 30.9%

Most frequent character per block

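ASCII

Value | Count | Frequency (%)
(space) | 11489 | 17.1%
- | 9305 | 13.9%
1 | 8201 | 12.2%
9 | 6400 | 9.5%
8 | 5661 | 8.4%
2 | 5152 | 7.7%
7 | 4685 | 7.0%
0 | 3896 | 5.8%
6 | 3237 | 4.8%
3 | 3120 | 4.7%
Other values (2) | 5899 | 8.8%

Hangul

Value | Count | Frequency (%)
(not shown) | 10000 | 33.3%
(not shown) | 8262 | 27.5%
(not shown) | 8251 | 27.5%
(not shown) | 770 | 2.6%
(not shown) | 602 | 2.0%
(not shown) | 472 | 1.6%
(not shown) | 472 | 1.6%
(not shown) | 415 | 1.4%
(not shown) | 325 | 1.1%
(not shown) | 168 | 0.6%
Other values (14) | 301 | 1.0%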

변경후지번
Text

Distinct: 977
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 11
Mean length: 10.3008
Min length: 3

Characters and Unicode

Total characters: 103008
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 839
Unique (%): 8.4%

Sample

1st row: 사직동 110 - 3
2nd row: 온천동 1853-10
3rd row: 온천동 1853-26
4th row: 온천동 1849
5th row: 온천동 1853-24

Value | Count | Frequency (%)
온천동 | 8225 | 37.6%
(empty) | 926 | 4.2%
명장동 | 585 | 2.7%
사직동 | 480 | 2.2%
안락동 | 333 | 1.5%
1853-11 | 284 | 1.3%
1853-17 | 281 | 1.3%
1853-26 | 274 | 1.3%
1853-1 | 271 | 1.2%
1853-28 | 264 | 1.2%
Other values (429) | 9930 | 45.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 12764 | 12.4%
(space) | 11853 | 11.5%
(not shown) | 9986 | 9.7%
8 | 9412 | 9.1%
3 | 9402 | 9.1%
5 | 9311 | 9.0%
- | 8634 | 8.4%
(not shown) | 8228 | 8.0%
(not shown) | 8225 | 8.0%
2 | 4500 | 4.4%
Other values (24) | 10693 | 10.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 52491 | 51.0%
Other Letter | 30030 | 29.2%
Space Separator | 11853 | 11.5%
Dash Punctuation | 8634 | 8.4%

Most frequent character per category

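Other Letter

Value | Count | Frequency (%)
(not shown) | 9986 | 33.3%
(not shown) | 8228 | 27.4%
(not shown) | 8225 | 27.4%
(not shown) | 793 | 2.6%
(not shown) | 588 | 2.0%
(not shown) | 480 | 1.6%
(not shown) | 480 | 1.6%
(not shown) | 424 | 1.4%
(not shown) | 333 | 1.1%
(not shown) | 208 | 0.7%
Other values (12) | 285 | 0.9%

Decimal Number

Value | Count | Frequency (%)
1 | 12764 | 24.3%
8 | 9412 | 17.9%
3 | 9402 | 17.9%
5 | 9311 | 17.7%
2 | 4500 | 8.6%
4 | 1739 | 3.3%
6 | 1731 | 3.3%
0 | 1435 | 2.7%
7 | 1121 | 2.1%
9 | 1076 | 2.0%

Space Separator

Value | Count | Frequency (%)
(space) | 11853 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 8634 | 100.0%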

Most occurring scripts

Value | Count | Frequency (%)
Common | 72978 | 70.8%
Hangul | 30030 | 29.2%

Most frequent character per script

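Hangul

Value | Count | Frequency (%)
(not shown) | 9986 | 33.3%
(not shown) | 8228 | 27.4%
(not shown) | 8225 | 27.4%
(not shown) | 793 | 2.6%
(not shown) | 588 | 2.0%
(not shown) | 480 | 1.6%
(not shown) | 480 | 1.6%
(not shown) | 424 | 1.4%
(not shown) | 333 | 1.1%
(not shown) | 208 | 0.7%
Other values (12) | 285 | 0.9%

Common

Value | Count | Frequency (%)
1 | 12764 | 17.5%
(space) | 11853 | 16.2%
8 | 9412 | 12.9%
3 | 9402 | 12.9%
5 | 9311 | 12.8%
- | 8634 | 11.8%
2 | 4500 | 6.2%
4 | 1739 | 2.4%
6 | 1731 | 2.4%
0 | 1435 | 2.0%
Other values (2) | 2197 | 3.0%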

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 72978 | 70.8%
Hangul | 30030 | 29.2%

Most frequent character per block

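ASCII

Value | Count | Frequency (%)
1 | 12764 | 17.5%
(space) | 11853 | 16.2%
8 | 9412 | 12.9%
3 | 9402 | 12.9%
5 | 9311 | 12.8%
- | 8634 | 11.8%
2 | 4500 | 6.2%
4 | 1739 | 2.4%
6 | 1731 | 2.4%
0 | 1435 | 2.0%
Other values (2) | 2197 | 3.0%

Hangul

Value | Count | Frequency (%)
(not shown) | 9986 | 33.3%
(not shown) | 8228 | 27.4%
(not shown) | 8225 | 27.4%
(not shown) | 793 | 2.6%
(not shown) | 588 | 2.0%
(not shown) | 480 | 1.6%
(not shown) | 480 | 1.6%
(not shown) | 424 | 1.4%
(not shown) | 333 | 1.1%
(not shown) | 208 | 0.7%
Other values (12) | 285 | 0.9%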

변경일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 33
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
2022-11-03 | 7757
2020-04-03 | 577
1974-07-05 | 460
1974-12-31 | 194
2022-07-18 | 153
Other values (28) | 859

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1974-07-05
2nd row: 2022-11-03
3rd row: 2022-11-03
4th row: 2021-03-23
5th row: 2022-11-03

Common Values

Value | Count | Frequency (%)
2022-11-03 | 7757 | 77.6%
2020-04-03 | 577 | 5.8%
1974-07-05 | 460 | 4.6%
1974-12-31 | 194 | 1.9%
2022-07-18 | 153 | 1.5%
2021-03-23 | 124 | 1.2%
1980-06-16 | 120 | 1.2%
2020-04-16 | 105 | 1.1%
1972-02-26 | 74 | 0.7%
2021-12-30 | 66 | 0.7%
Other values (23) | 370 | 3.7%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
2022-11-03 | 7757 | 77.6%
2020-04-03 | 577 | 5.8%
1974-07-05 | 460 | 4.6%
1974-12-31 | 194 | 1.9%
2022-07-18 | 153 | 1.5%
2021-03-23 | 124 | 1.2%
1980-06-16 | 120 | 1.2%
2020-04-16 | 105 | 1.1%
1972-02-26 | 74 | 0.7%
2021-12-30 | 66 | 0.7%
Other values (23) | 370 | 3.7%

변경사유
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
주택재개발정비사업 | 8523
사직지구 구획정리사업 | 460
주택건설사업 | 437
안락북지구 구획정리사업 | 194
연산.수민.안락지구 구획정리사업 | 120
Other values (23) | 266

Length

Max length: 17
Median length: 9
Mean length: 9.2162
Min length: 6

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 사직지구 구획정리사업
2nd row: 주택재개발정비사업
3rd row: 주택재개발정비사업
4th row: 주택재개발정비사업
5th row: 주택재개발정비사업

Common Values

Value | Count | Frequency (%)
주택재개발정비사업 | 8523 | 85.2%
사직지구 구획정리사업 | 460 | 4.6%
주택건설사업 | 437 | 4.4%
안락북지구 구획정리사업 | 194 | 1.9%
연산.수민.안락지구 구획정리사업 | 120 | 1.2%
안락지구 구획정리사업 | 47 | 0.5%
수민지구 구획정리사업 | 27 | 0.3%
행정관할구역변경(재송동→안락동) | 23 | 0.2%
행정관할구역변경(온천동→명륜동) | 18 | 0.2%
명륜에스케이뷰 | 15 | 0.1%
Other values (18) | 136 | 1.4%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
주택재개발정비사업 | 8523 | 78.6%
구획정리사업 | 848 | 7.8%
사직지구 | 460 | 4.2%
주택건설사업 | 437 | 4.0%
안락북지구 | 194 | 1.8%
연산.수민.안락지구 | 120 | 1.1%
안락지구 | 47 | 0.4%
수민지구 | 27 | 0.2%
행정관할구역변경(재송동→안락동 | 23 | 0.2%
행정관할구역변경(온천동→명륜동 | 18 | 0.2%
Other values (19) | 151 | 1.4%

사업명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
동래래미안아이파크 | 7757
구획정리사업 | 848
e편한세상동래명장1단지 | 577
힐스테이트명륜트라디움 | 153
e편한세상 동래 아시아드 | 124
Other values (16) | 541

Length

Max length: 29
Median length: 9
Mean length: 9.2004
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 구획정리사업
2nd row: 동래래미안아이파크
3rd row: 동래래미안아이파크
4th row: e편한세상 동래 아시아드
5th row: 동래래미안아이파크

Common Values

Value | Count | Frequency (%)
동래래미안아이파크 | 7757 | 77.6%
구획정리사업 | 848 | 8.5%
e편한세상동래명장1단지 | 577 | 5.8%
힐스테이트명륜트라디움 | 153 | 1.5%
e편한세상 동래 아시아드 | 124 | 1.2%
행정관할구역변경 | 119 | 1.2%
쌍용더플래티넘사직아시아드 | 105 | 1.1%
동래 3차 SK VIEW | 66 | 0.7%
사직1구역 주택재개발정비사업(사직 롯데캐슬 더클래식) | 65 | 0.7%
주택건설사업 | 58 | 0.6%
Other values (11) | 128 | 1.3%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
동래래미안아이파크 | 7757 | 72.1%
구획정리사업 | 848 | 7.9%
e편한세상동래명장1단지 | 577 | 5.4%
동래 | 219 | 2.0%
힐스테이트명륜트라디움 | 153 | 1.4%
e편한세상 | 124 | 1.2%
아시아드 | 124 | 1.2%
행정관할구역변경 | 119 | 1.1%
쌍용더플래티넘사직아시아드 | 105 | 1.0%
view | 66 | 0.6%
Other values (26) | 669 | 6.2%

Correlations

 | 변경일자 | 변경사유 | 사업명
변경일자 | 1.000 | 0.989 | 1.000
변경사유 | 0.989 | 1.000 | 0.931
사업명 | 1.000 | 0.931 | 1.000

 | 변경일자 | 사업명 | 변경사유
변경일자 | 1.000 | 0.999 | 0.812
사업명 | 0.999 | 1.000 | 0.546
변경사유 | 0.812 | 0.546 | 1.000

 | 변경일자 | 변경사유 | 사업명
변경일자 | 1.000 | 0.812 | 0.999
변경사유 | 0.812 | 1.000 | 0.546
사업명 | 0.999 | 0.546 | 1.000
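
The report does not say which coefficient each matrix uses. For pairs of categorical variables, one widely used association measure is Cramér's V, which ranges from 0 (no association) to 1 (one variable determines the other). Whether any of these matrices was computed this way is an assumption; the sketch below shows the plain (uncorrected) form, and `cramers_v` is a hypothetical helper name.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0  # convention chosen here when a variable has one category
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy data: each date maps to exactly one project, so V is 1.
dates = ["2022-11-03", "2022-11-03", "2020-04-03", "2020-04-03"]
projects = ["동래래미안아이파크", "동래래미안아이파크",
            "e편한세상동래명장1단지", "e편한세상동래명장1단지"]
print(cramers_v(dates, projects))
```

A value near 1 between 변경일자 and 사업명 fits the data: most projects were registered on a single date.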

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Row index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명
1860 | 사직동 134 - 1 | 사직동 110 - 3 | 1974-07-05 | 사직지구 구획정리사업 | 구획정리사업
31013 | 온천동 968-15 | 온천동 1853-10 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
56560 | 온천동 1031-22 | 온천동 1853-26 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
12868 | 온천동1503-32 | 온천동 1849 | 2021-03-23 | 주택재개발정비사업 | e편한세상 동래 아시아드
52827 | 온천동 928-61 | 온천동 1853-24 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
9958 | 명장동 74-1 | 명장동 628 | 2020-04-03 | 주택재개발정비사업 | e편한세상동래명장1단지
50091 | 온천동 1002-4 | 온천동 1853-22 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
17473 | 온천동 1788-8 | 온천동 1853-1 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
31684 | 온천동 728-77 | 온천동 1853-11 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
16184 | 온천동 855 | 온천동 1853-1 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크

Row index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명
3712 | 안락동 485 - 7 | 안락동 429 - 16 | 1974-12-31 | 안락북지구 구획정리사업 | 구획정리사업
57260 | 온천동 882-30 | 온천동 1853-27 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
36737 | 온천동 871-67 | 온천동 1853-14 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
16237 | 온천동 868-5 | 온천동 1853-1 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
59407 | 온천동 979-7 | 온천동 1853-28 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
40075 | 온천동 905-2 | 온천동 1853-16 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
58366 | 온천동 728-35 | 온천동 1853-28 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
48496 | 온천동 996-7 | 온천동 1853-21 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
55739 | 온천동 891-3 | 온천동 1853-26 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
8956 | 명장동 70-1 | 명장동 626 | 2020-04-03 | 주택재개발정비사업 | e편한세상동래명장1단지

Duplicate rows

Most frequently occurring

Row index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명 | # duplicates
0 | 안락동 745 - 4 | 안락동 423 - 6 | 1974-12-31 | 안락북지구 구획정리사업 | 구획정리사업 | 2
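
The duplicate check amounts to counting identical rows across all five columns. A minimal sketch with the standard library follows; it assumes that "# duplicates" is the total number of occurrences of the row, which the report does not state. The `rows` list is a tiny illustrative stand-in built from values shown in this report, not the actual dataset.

```python
from collections import Counter

# Each row is a tuple of (변경전지번, 변경후지번, 변경일자, 변경사유, 사업명).
rows = [
    ("안락동 745 - 4", "안락동 423 - 6", "1974-12-31", "안락북지구 구획정리사업", "구획정리사업"),
    ("안락동 745 - 4", "안락동 423 - 6", "1974-12-31", "안락북지구 구획정리사업", "구획정리사업"),
    ("온천동 968-15", "온천동 1853-10", "2022-11-03", "주택재개발정비사업", "동래래미안아이파크"),
]

row_counts = Counter(rows)
# Keep only rows that occur more than once.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(" | ".join(row), "|", n)
```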