Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
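
The two memory figures agree with each other if the average record size is simply the total footprint divided by the number of observations. The check below is a minimal sketch under that assumption, taking 1 KiB = 1024 B; the function name is illustrative and not part of the profiler.

```python
def average_record_size(total_kib, n_rows):
    """Return bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows

# Figures taken from the dataset statistics above.
print(round(average_record_size(468.8, 10000), 1))  # 48.0
```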

Variable types

Text: 2
DateTime: 1
Categorical: 2

Dataset

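Description: 부산광역시_동래구_지적정보_20230210 (Busan Metropolitan City, Dongnae-gu, cadastral information, 2023-02-10)
Author: 부산광역시 동래구 (Dongnae-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3040254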

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
변경사유 is highly overall correlated with 사업명 [High correlation]
사업명 is highly overall correlated with 변경사유 [High correlation]
변경사유 is highly imbalanced (78.7%) [Imbalance]
사업명 is highly imbalanced (67.7%) [Imbalance]
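
The imbalance percentages can be read as a normalized-entropy score: 0 when all classes are equally frequent, approaching 1 when one class dominates. The sketch below assumes the score is 1 minus the Shannon entropy of the class counts divided by the maximum entropy log2(k). That formula is an assumption about how the profiler computes the alert, and the counts are toy values, since the report lists the class distributions only partially.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts."""
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no spread; treated as 0 here
    total = sum(counts)
    # Shannon entropy of the class distribution, in bits.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1.0 - entropy / math.log2(k)

print(imbalance_score([50, 50]))  # perfectly balanced
print(imbalance_score([99, 1]))   # dominated by one class
```

A column such as 변경사유, where one value covers most rows and the rest is spread over many rare classes, scores high on this measure.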

Reproduction

Analysis started: 2023-12-10 17:37:05.100259
Analysis finished: 2023-12-10 17:37:06.646612
Duration: 1.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
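
A report of this kind is typically produced by loading the data into a pandas DataFrame and passing it to `ProfileReport`. The sketch below is not the exact invocation behind this report: the file paths are hypothetical, the function name `build_report` is illustrative, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parsing 변경일자 as a date is an assumption based on its DateTime type.
    df = pd.read_csv(csv_path, parse_dates=["변경일자"])
    report = ProfileReport(df, title="부산광역시_동래구_지적정보_20230210")
    report.to_file(html_path)
    return report

# Example call with hypothetical file names:
# build_report("dongnae_cadastral.csv", "report.html")
```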

Variables

변경전지번 (lot number before change)
Text

Distinct: 3423
Distinct (%): 34.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 9.6969
Min length: 6

Characters and Unicode

Total characters: 96969
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
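
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two sample values from this report: `Lo` corresponds to "Other Letter", `Nd` to "Decimal Number", `Zs` to "Space Separator" and `Pd` to "Dash Punctuation". It is not the profiler's own implementation, and the standard library does not expose the script or block properties.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

samples = ["온천동 929-1", "수안동 477 - 1"]
print(category_counts(samples))
```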

Unique

Unique: 1598
Unique (%): 16.0%
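
"Distinct" and "Unique" measure different things on the usual reading: distinct is the number of different values in the column, while unique is the number of values that occur exactly once. The sketch below assumes the report follows those definitions; the function name is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Toy column: "온천동 1853-4" repeats, the other two values appear once.
column = ["온천동 1853-4", "온천동 1853-4", "온천동 1853-1", "명장동 627"]
print(distinct_and_unique(column))  # (3, 2)
```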

Sample

1st row: 온천동 929-1
2nd row: 온천동 1777-37
3rd row: 온천동 990-9
4th row: 수안동 477 - 1
5th row: 온천동 1008-16
Value | Count | Frequency (%)
온천동 | 7995 | 37.2%
(empty) | 839 | 3.9%
명장동 | 639 | 3.0%
사직동 | 490 | 2.3%
안락동 | 346 | 1.6%
1 | 191 | 0.9%
명륜동 | 155 | 0.7%
2 | 134 | 0.6%
3 | 115 | 0.5%
수안동 | 102 | 0.5%
Other values (2929) | 10473 | 48.8%
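
The bare digits and the empty value in this table are consistent with a tokenizer that splits each string on whitespace and then strips punctuation from the ends of every token, so that "수안동 477 - 1" contributes "477", an empty token (the stripped "-") and "1". The sketch below illustrates that behaviour; the tokenization rule is an assumption inferred from the table, not something the report states.

```python
import string
from collections import Counter

def word_counts(values):
    """Split on whitespace, strip edge punctuation, and count the tokens."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # A standalone "-" becomes the empty string here.
            counts[token.strip(string.punctuation)] += 1
    return counts

rows = ["온천동 929-1", "수안동 477 - 1"]
print(word_counts(rows))
```

A hyphen inside a token, as in "929-1", survives because only leading and trailing punctuation is removed.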

Most occurring characters

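Value | Count | Frequency (%)
(space) | 11479 | 11.8%
(not captured) | 10000 | 10.3%
- | 9316 | 9.6%
(not captured) | 8201 | 8.5%
(not captured) | 8194 | 8.5%
1 | 8124 | 8.4%
9 | 6440 | 6.6%
8 | 5701 | 5.9%
2 | 5104 | 5.3%
7 | 4687 | 4.8%
Other values (26) | 19723 | 20.3%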

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 46143 | 47.6%
Other Letter | 30031 | 31.0%
Space Separator | 11479 | 11.8%
Dash Punctuation | 9316 | 9.6%

Most frequent character per category

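Other Letter
Value | Count | Frequency (%)
(not captured) | 10000 | 33.3%
(not captured) | 8201 | 27.3%
(not captured) | 8194 | 27.3%
(not captured) | 794 | 2.6%
(not captured) | 639 | 2.1%
(not captured) | 490 | 1.6%
(not captured) | 490 | 1.6%
(not captured) | 448 | 1.5%
(not captured) | 346 | 1.2%
(not captured) | 155 | 0.5%
Other values (14) | 274 | 0.9%

Decimal Number
Value | Count | Frequency (%)
1 | 8124 | 17.6%
9 | 6440 | 14.0%
8 | 5701 | 12.4%
2 | 5104 | 11.1%
7 | 4687 | 10.2%
0 | 3740 | 8.1%
3 | 3266 | 7.1%
6 | 3245 | 7.0%
4 | 2968 | 6.4%
5 | 2868 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 11479 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9316 | 100.0%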

Most occurring scripts

Value | Count | Frequency (%)
Common | 66938 | 69.0%
Hangul | 30031 | 31.0%

Most frequent character per script

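Hangul
Value | Count | Frequency (%)
(not captured) | 10000 | 33.3%
(not captured) | 8201 | 27.3%
(not captured) | 8194 | 27.3%
(not captured) | 794 | 2.6%
(not captured) | 639 | 2.1%
(not captured) | 490 | 1.6%
(not captured) | 490 | 1.6%
(not captured) | 448 | 1.5%
(not captured) | 346 | 1.2%
(not captured) | 155 | 0.5%
Other values (14) | 274 | 0.9%

Common
Value | Count | Frequency (%)
(space) | 11479 | 17.1%
- | 9316 | 13.9%
1 | 8124 | 12.1%
9 | 6440 | 9.6%
8 | 5701 | 8.5%
2 | 5104 | 7.6%
7 | 4687 | 7.0%
0 | 3740 | 5.6%
3 | 3266 | 4.9%
6 | 3245 | 4.8%
Other values (2) | 5836 | 8.7%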

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66938 | 69.0%
Hangul | 30031 | 31.0%

Most frequent character per block

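ASCII
Value | Count | Frequency (%)
(space) | 11479 | 17.1%
- | 9316 | 13.9%
1 | 8124 | 12.1%
9 | 6440 | 9.6%
8 | 5701 | 8.5%
2 | 5104 | 7.6%
7 | 4687 | 7.0%
0 | 3740 | 5.6%
3 | 3266 | 4.9%
6 | 3245 | 4.8%
Other values (2) | 5836 | 8.7%

Hangul
Value | Count | Frequency (%)
(not captured) | 10000 | 33.3%
(not captured) | 8201 | 27.3%
(not captured) | 8194 | 27.3%
(not captured) | 794 | 2.6%
(not captured) | 639 | 2.1%
(not captured) | 490 | 1.6%
(not captured) | 490 | 1.6%
(not captured) | 448 | 1.5%
(not captured) | 346 | 1.2%
(not captured) | 155 | 0.5%
Other values (14) | 274 | 0.9%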

변경후지번 (lot number after change)
Text

Distinct: 971
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 11
Mean length: 10.2942
Min length: 3

Characters and Unicode

Total characters: 102942
Distinct characters: 36
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 841
Unique (%): 8.4%

Sample

1st row: 온천동 1853-4
2nd row: 온천동 1853-1
3rd row: 온천동 1853-15
4th row: 복천동 292 - 9
5th row: 온천동 1853-30
Value | Count | Frequency (%)
온천동 | 8185 | 37.4%
(empty) | 938 | 4.3%
명장동 | 632 | 2.9%
사직동 | 486 | 2.2%
안락동 | 357 | 1.6%
1853-15 | 277 | 1.3%
1853-30 | 272 | 1.2%
1853-28 | 270 | 1.2%
1853-26 | 270 | 1.2%
1853-29 | 269 | 1.2%
Other values (415) | 9915 | 45.3%

Most occurring characters

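Value | Count | Frequency (%)
1 | 12596 | 12.2%
(space) | 11871 | 11.5%
(not captured) | 9990 | 9.7%
3 | 9454 | 9.2%
8 | 9344 | 9.1%
5 | 9264 | 9.0%
- | 8627 | 8.4%
(not captured) | 8188 | 8.0%
(not captured) | 8185 | 8.0%
2 | 4532 | 4.4%
Other values (26) | 10891 | 10.6%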

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 52434 | 50.9%
Other Letter | 30010 | 29.2%
Space Separator | 11871 | 11.5%
Dash Punctuation | 8627 | 8.4%

Most frequent character per category

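Other Letter
Value | Count | Frequency (%)
(not captured) | 9990 | 33.3%
(not captured) | 8188 | 27.3%
(not captured) | 8185 | 27.3%
(not captured) | 821 | 2.7%
(not captured) | 636 | 2.1%
(not captured) | 486 | 1.6%
(not captured) | 486 | 1.6%
(not captured) | 451 | 1.5%
(not captured) | 357 | 1.2%
(not captured) | 189 | 0.6%
Other values (14) | 221 | 0.7%

Decimal Number
Value | Count | Frequency (%)
1 | 12596 | 24.0%
3 | 9454 | 18.0%
8 | 9344 | 17.8%
5 | 9264 | 17.7%
2 | 4532 | 8.6%
6 | 1823 | 3.5%
4 | 1756 | 3.3%
0 | 1447 | 2.8%
7 | 1126 | 2.1%
9 | 1092 | 2.1%

Space Separator
Value | Count | Frequency (%)
(space) | 11871 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8627 | 100.0%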

Most occurring scripts

Value | Count | Frequency (%)
Common | 72932 | 70.8%
Hangul | 30010 | 29.2%

Most frequent character per script

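Hangul
Value | Count | Frequency (%)
(not captured) | 9990 | 33.3%
(not captured) | 8188 | 27.3%
(not captured) | 8185 | 27.3%
(not captured) | 821 | 2.7%
(not captured) | 636 | 2.1%
(not captured) | 486 | 1.6%
(not captured) | 486 | 1.6%
(not captured) | 451 | 1.5%
(not captured) | 357 | 1.2%
(not captured) | 189 | 0.6%
Other values (14) | 221 | 0.7%

Common
Value | Count | Frequency (%)
1 | 12596 | 17.3%
(space) | 11871 | 16.3%
3 | 9454 | 13.0%
8 | 9344 | 12.8%
5 | 9264 | 12.7%
- | 8627 | 11.8%
2 | 4532 | 6.2%
6 | 1823 | 2.5%
4 | 1756 | 2.4%
0 | 1447 | 2.0%
Other values (2) | 2218 | 3.0%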

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 72932 | 70.8%
Hangul | 30010 | 29.2%

Most frequent character per block

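ASCII
Value | Count | Frequency (%)
1 | 12596 | 17.3%
(space) | 11871 | 16.3%
3 | 9454 | 13.0%
8 | 9344 | 12.8%
5 | 9264 | 12.7%
- | 8627 | 11.8%
2 | 4532 | 6.2%
6 | 1823 | 2.5%
4 | 1756 | 2.4%
0 | 1447 | 2.0%
Other values (2) | 2218 | 3.0%

Hangul
Value | Count | Frequency (%)
(not captured) | 9990 | 33.3%
(not captured) | 8188 | 27.3%
(not captured) | 8185 | 27.3%
(not captured) | 821 | 2.7%
(not captured) | 636 | 2.1%
(not captured) | 486 | 1.6%
(not captured) | 486 | 1.6%
(not captured) | 451 | 1.5%
(not captured) | 357 | 1.2%
(not captured) | 189 | 0.6%
Other values (14) | 221 | 0.7%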

변경일자 (change date)
DateTime

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1972-02-26 00:00:00
Maximum: 2022-11-03 00:00:00
Histogram with fixed size bins (bins=31)

변경사유 (reason for change)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 27
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
주택재개발정비사업 | 8552
사직지구 구획정리사업 | 458
주택건설사업 | 400
안락북지구 구획정리사업 | 205
연산.수민.안락지구 구획정리사업 | 129
Other values (22) | 256

Length

Max length: 17
Median length: 9
Mean length: 9.2225
Min length: 6

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 주택재개발정비사업
2nd row: 주택재개발정비사업
3rd row: 주택재개발정비사업
4th row: 행정관할구역변경(수안동→복천동)
5th row: 주택재개발정비사업

Common Values

Value | Count | Frequency (%)
주택재개발정비사업 | 8552 | 85.5%
사직지구 구획정리사업 | 458 | 4.6%
주택건설사업 | 400 | 4.0%
안락북지구 구획정리사업 | 205 | 2.1%
연산.수민.안락지구 구획정리사업 | 129 | 1.3%
안락지구 구획정리사업 | 59 | 0.6%
안락뜨란채1,2단지 | 20 | 0.2%
사직쌍용예가 | 20 | 0.2%
수민지구 구획정리사업 | 19 | 0.2%
행정관할구역변경(재송동→안락동) | 19 | 0.2%
Other values (17) | 119 | 1.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
주택재개발정비사업 | 8552 | 78.7%
구획정리사업 | 870 | 8.0%
사직지구 | 458 | 4.2%
주택건설사업 | 400 | 3.7%
안락북지구 | 205 | 1.9%
연산.수민.안락지구 | 129 | 1.2%
안락지구 | 59 | 0.5%
안락뜨란채1,2단지 | 20 | 0.2%
사직쌍용예가 | 20 | 0.2%
수민지구 | 19 | 0.2%
Other values (18) | 138 | 1.3%

사업명 (project name)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
동래래미안아이파크 | 7741
구획정리사업 | 870
e편한세상동래명장1단지 | 623
힐스테이트명륜트라디움 | 139
e편한세상 동래 아시아드 | 135
Other values (14) | 492

Length

Max length: 29
Median length: 9
Mean length: 9.209
Min length: 6

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 동래래미안아이파크
2nd row: 동래래미안아이파크
3rd row: 동래래미안아이파크
4th row: 행정관할구역변경
5th row: 동래래미안아이파크

Common Values

Value | Count | Frequency (%)
동래래미안아이파크 | 7741 | 77.4%
구획정리사업 | 870 | 8.7%
e편한세상동래명장1단지 | 623 | 6.2%
힐스테이트명륜트라디움 | 139 | 1.4%
e편한세상 동래 아시아드 | 135 | 1.4%
쌍용더플래티넘사직아시아드 | 121 | 1.2%
행정관할구역변경 | 99 | 1.0%
동래 3차 SK VIEW | 64 | 0.6%
주택건설사업 | 63 | 0.6%
사직1구역 주택재개발정비사업(사직 롯데캐슬 더클래식) | 53 | 0.5%
Other values (9) | 92 | 0.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
동래래미안아이파크 | 7741 | 72.3%
구획정리사업 | 870 | 8.1%
e편한세상동래명장1단지 | 623 | 5.8%
동래 | 200 | 1.9%
힐스테이트명륜트라디움 | 139 | 1.3%
e편한세상 | 135 | 1.3%
아시아드 | 135 | 1.3%
쌍용더플래티넘사직아시아드 | 121 | 1.1%
행정관할구역변경 | 99 | 0.9%
3차 | 64 | 0.6%
Other values (24) | 577 | 5.4%

Correlations

 | 변경일자 | 변경사유 | 사업명
변경일자 | 1.000 | 0.988 | 1.000
변경사유 | 0.988 | 1.000 | 0.911
사업명 | 1.000 | 0.911 | 1.000

 | 사업명 | 변경사유
사업명 | 1.000 | 0.525
변경사유 | 0.525 | 1.000

 | 변경사유 | 사업명
변경사유 | 1.000 | 0.525
사업명 | 0.525 | 1.000
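
The extracted text does not say which association measure backs each matrix. For two categorical columns such as 변경사유 and 사업명, one common choice is Cramér's V, which rescales the chi-squared statistic of the contingency table to the range 0 to 1. The sketch below implements the plain, uncorrected form; it is an illustration of the measure, not a reproduction of the profiler's computation, and the rows are toy values.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k < 1:
        return 0.0  # undefined with fewer than two categories; 0 by convention here
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

reason = ["주택재개발정비사업", "주택재개발정비사업", "행정관할구역변경", "행정관할구역변경"]
project = ["동래래미안아이파크", "동래래미안아이파크", "행정관할구역변경", "행정관할구역변경"]
print(cramers_v(reason, project))  # each reason maps to exactly one project
```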

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명
21427 | 온천동 929-1 | 온천동 1853-4 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
17437 | 온천동 1777-37 | 온천동 1853-1 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
39045 | 온천동 990-9 | 온천동 1853-15 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
5886 | 수안동 477 - 1 | 복천동 292 - 9 | 1998-03-25 | 행정관할구역변경(수안동→복천동) | 행정관할구역변경
62707 | 온천동 1008-16 | 온천동 1853-30 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
44734 | 온천동 885-18 | 온천동 1853-19 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
21228 | 온천동 913-4 | 온천동 1853-4 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
39056 | 온천동 992-7 | 온천동 1853-15 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
5135 | 안락동 715 - 3 | 낙민동 93 - 10 | 1980-06-16 | 연산.수민.안락지구 구획정리사업 | 구획정리사업
14479 | 온천동 799-25 | 온천동 1853 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크

Index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명
48423 | 온천동 984-2 | 온천동 1853-21 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
24718 | 온천동 968-4 | 온천동 1853-6 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
22337 | 온천동 799-28 | 온천동 1853-5 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
47290 | 온천동 1777-42 | 온천동 1853-20 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
38153 | 온천동 841-8 | 온천동 1853-15 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
59450 | 온천동 989-19 | 온천동 1853-28 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
56455 | 온천동 1014 | 온천동 1853-26 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
9923 | 명장동 산49-11 | 명장동 627 | 2020-04-03 | 주택재개발정비사업 | e편한세상동래명장1단지
61013 | 온천동 989-11 | 온천동 1853-29 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크
50215 | 온천동 1021-6 | 온천동 1853-22 | 2022-11-03 | 주택재개발정비사업 | 동래래미안아이파크

Duplicate rows

Most frequently occurring

Index | 변경전지번 | 변경후지번 | 변경일자 | 변경사유 | 사업명 | # duplicates
0 | 사직동 187 | 사직동 127 - 8 | 1974-07-05 | 사직지구 구획정리사업 | 구획정리사업 | 2
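
A duplicate row here is one whose values match another row in every column, and the last column gives how many times the row occurs. The sketch below finds such rows by counting whole rows as tuples. It is a minimal illustration with toy data built from rows shown in this report, not the profiler's own routine.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

dup = ("사직동 187", "사직동 127 - 8", "1974-07-05", "사직지구 구획정리사업", "구획정리사업")
other = ("명장동 산49-11", "명장동 627", "2020-04-03", "주택재개발정비사업", "e편한세상동래명장1단지")
print(duplicate_rows([dup, other, dup]))
```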