Overview

Dataset statistics

Number of variables: 6
Number of observations: 221
Missing cells: 10
Missing cells (%): 0.8%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 10.9 KiB
Average record size in memory: 50.6 B
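
The two percentage figures above follow from the raw counts. A minimal sketch, assuming the report divides missing cells by rows times columns and duplicate rows by the row count (the helper name `pct` is illustrative and not part of ydata-profiling):

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts taken from the dataset statistics above.
n_vars, n_obs = 6, 221
missing_cells, duplicate_rows = 10, 1

missing_pct = pct(missing_cells, n_vars * n_obs)  # 10 out of 1326 cells
duplicate_pct = pct(duplicate_rows, n_obs)        # 1 out of 221 rows

print(f"Missing cells (%): {missing_pct}%")
print(f"Duplicate rows (%): {duplicate_pct}%")
```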

Variable types

Categorical: 2
Numeric: 2
Text: 1
DateTime: 1

Dataset

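Description: 부산광역시해운대구_공동주택하자보수보증증권현황_20230818 (status of defect repair guarantee bonds for multi-unit housing, Haeundae-gu, Busan)
Author: 부산광역시 해운대구 (Haeundae-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3075583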

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
관리연도 is highly overall correlated with 관리시군구 [High correlation]
관리시군구 is highly overall correlated with 관리연도 [High correlation]
관리시군구 is highly imbalanced (50.4%) [Imbalance]
비고 is highly imbalanced (55.0%) [Imbalance]
사용 승인일자 has 9 (4.1%) missing values [Missing]
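
The imbalance percentages in the alerts can be reproduced from the value counts listed in the Common Values tables of this report. This is a sketch under one assumption that the report itself does not state: the score is one minus the Shannon entropy of the class frequencies, normalized by the logarithm of the number of classes. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """1 - H / log(k): 0.0 for a uniform column, near 1.0 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# 관리시군구: two distinct values with counts 197 and 24.
sigungu_counts = [197, 24]
# 비고: twelve distinct values; the last two counts are inferred from
# the "Other values (2) 2" row, which can only be 1 + 1.
bigo_counts = [144, 49, 8, 6, 6, 2, 1, 1, 1, 1, 1, 1]

sigungu_pct = round(100 * imbalance_score(sigungu_counts), 1)
bigo_pct = round(100 * imbalance_score(bigo_counts), 1)
print(sigungu_pct, bigo_pct)
```

Under that assumption both values agree with the alert text, which suggests the alerts are driven by how far each column is from a uniform class distribution.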

Reproduction

Analysis started: 2023-12-10 17:39:00.222351
Analysis finished: 2023-12-10 17:39:02.589586
Duration: 2.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관리시군구
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB
부산광역시 해운대구: 197
부산광역시 해운대구: 24

Length

Max length: 11
Median length: 10
Mean length: 10.108597
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 부산광역시 해운대구
2nd row 부산광역시 해운대구
3rd row 부산광역시 해운대구
4th row 부산광역시 해운대구
5th row 부산광역시 해운대구

Common Values

Value                Count  Frequency (%)
부산광역시 해운대구  197    89.1%
부산광역시 해운대구  24     10.9%
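
The two values above display identically, while the Length statistics report lengths of 10 and 11, so they probably differ only in whitespace. A minimal cleanup sketch follows; the trailing space in the second variant is an assumption made for illustration, since the report does not show where the extra character sits.

```python
from collections import Counter

def normalize_spaces(value: str) -> str:
    """Trim the ends and collapse internal whitespace runs to a single space."""
    return " ".join(value.split())

# Hypothetical raw column: 197 clean values plus 24 with an assumed trailing space.
raw = ["부산광역시 해운대구"] * 197 + ["부산광역시 해운대구 "] * 24

mean_len = sum(len(v) for v in raw) / len(raw)
before = Counter(raw)
after = Counter(normalize_spaces(v) for v in raw)

print(f"mean length before cleanup: {mean_len:.6f}")
print(f"distinct values: {len(before)} -> {len(after)}")
```

With one extra character on 24 of 221 rows, the mean length matches the figure reported above, which is consistent with the whitespace explanation without proving it.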

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
부산광역시  221    50.0%
해운대구    221    50.0%

관리연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.9276
Minimum: 2002
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 2002
5-th percentile: 2002
Q1: 2004
median: 2012
Q3: 2015
95-th percentile: 2020
Maximum: 2023
Range: 21
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.2423923
Coefficient of variation (CV): 0.0031042352
Kurtosis: -1.272047
Mean: 2010.9276
Median Absolute Deviation (MAD): 6
Skewness: -0.063735512
Sum: 444415
Variance: 38.967462
Monotonicity: Decreasing
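
Several of the 관리연도 statistics are derived from one another, which gives a quick consistency check. A minimal sketch using the rounded figures printed in this section, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1):

```python
# Rounded values copied from the 관리연도 statistics in this report.
mean, std = 2010.9276, 6.2423923
total, n = 444415, 221
q1, q3 = 2004, 2015
minimum, maximum = 2002, 2023

cv = std / mean                  # coefficient of variation
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum
mean_from_sum = total / n        # no missing values, so all 221 rows count

print(f"CV={cv:.10f} variance={variance:.6f} IQR={iqr} range={value_range}")
```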
Histogram with fixed size bins (bins=22)

Common values
Value              Count  Frequency (%)
2002               24     10.9%
2003               23     10.4%
2012               22     10.0%
2015               17     7.7%
2013               17     7.7%
2014               14     6.3%
2004               13     5.9%
2011               13     5.9%
2020               13     5.9%
2017               12     5.4%
Other values (12)  53     24.0%

Minimum 10 values
Value  Count  Frequency (%)
2002   24     10.9%
2003   23     10.4%
2004   13     5.9%
2005   9      4.1%
2006   8      3.6%
2007   1      0.5%
2008   3      1.4%
2009   1      0.5%
2010   2      0.9%
2011   13     5.9%

Maximum 10 values
Value  Count  Frequency (%)
2023   1      0.5%
2022   3      1.4%
2021   6      2.7%
2020   13     5.9%
2019   7      3.2%
2018   5      2.3%
2017   12     5.4%
2016   7      3.2%
2015   17     7.7%
2014   14     6.3%

건축물위치
Text

Distinct: 220
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 52
Median length: 42
Mean length: 25.357466
Min length: 16

Characters and Unicode

Total characters: 5604
Distinct characters: 119
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 219
Unique (%): 99.1%

Sample

1st row 부산광역시 해운대구 우동 587-1
2nd row 부산광역시 해운대구 반여동 985-1
3rd row 부산광역시 해운대구 중동 1369-8
4th row 부산광역시 해운대구 중동 236
5th row 부산광역시 해운대구 재송동 1056-2

Value               Count  Frequency (%)
부산광역시          219    19.2%
해운대구            217    19.0%
대지                101    8.8%
재송동              69     6.0%
중동                62     5.4%
우동                36     3.1%
2동                 32     2.8%
1동                 28     2.4%
송정동              23     2.0%
반여동              15     1.3%
Other values (286)  341    29.8%

Most occurring characters

Value               Count  Frequency (%)
(space)             938    16.7%
1                   352    6.3%
[lost]              331    5.9%
[lost]              290    5.2%
[lost]              225    4.0%
[lost]              224    4.0%
[lost]              223    4.0%
[lost]              222    4.0%
[lost]              221    3.9%
[lost]              221    3.9%
Other values (109)  2357   42.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       3113   55.5%
Decimal Number     1198   21.4%
Space Separator    938    16.7%
Dash Punctuation   208    3.7%
Close Punctuation  65     1.2%
Open Punctuation   65     1.2%
Other Punctuation  9      0.2%
Uppercase Letter   8      0.1%
0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[lost]             331    10.6%
[lost]             290    9.3%
[lost]             225    7.2%
[lost]             224    7.2%
[lost]             223    7.2%
[lost]             222    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
Other values (85)  714    22.9%

Decimal Number
Value  Count  Frequency (%)
1      352    29.4%
2      142    11.9%
0      111    9.3%
3      110    9.2%
4      99     8.3%
7      95     7.9%
5      79     6.6%
9      75     6.3%
8      74     6.2%
6      61     5.1%

Uppercase Letter
Value  Count  Frequency (%)
B      2      25.0%
T      1      12.5%
H      1      12.5%
E      1      12.5%
J      1      12.5%
S      1      12.5%
C      1      12.5%

Other Punctuation
Value  Count  Frequency (%)
,      5      55.6%
/      3      33.3%
#      1      11.1%

Space Separator
Value    Count  Frequency (%)
(space)  938    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      208    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      65     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      65     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3113   55.5%
Common  2483   44.3%
Latin   8      0.1%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[lost]             331    10.6%
[lost]             290    9.3%
[lost]             225    7.2%
[lost]             224    7.2%
[lost]             223    7.2%
[lost]             222    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
Other values (85)  714    22.9%

Common
Value             Count  Frequency (%)
(space)           938    37.8%
1                 352    14.2%
-                 208    8.4%
2                 142    5.7%
0                 111    4.5%
3                 110    4.4%
4                 99     4.0%
7                 95     3.8%
5                 79     3.2%
9                 75     3.0%
Other values (7)  274    11.0%

Latin
Value  Count  Frequency (%)
B      2      25.0%
T      1      12.5%
H      1      12.5%
E      1      12.5%
J      1      12.5%
S      1      12.5%
C      1      12.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3113   55.5%
ASCII   2491   44.5%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            938    37.7%
1                  352    14.1%
-                  208    8.4%
2                  142    5.7%
0                  111    4.5%
3                  110    4.4%
4                  99     4.0%
7                  95     3.8%
5                  79     3.2%
9                  75     3.0%
Other values (14)  282    11.3%

Hangul
Value              Count  Frequency (%)
[lost]             331    10.6%
[lost]             290    9.3%
[lost]             225    7.2%
[lost]             224    7.2%
[lost]             223    7.2%
[lost]             222    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
[lost]             221    7.1%
Other values (85)  714    22.9%

사용 승인일자
Date

MISSING 

Distinct: 192
Distinct (%): 90.6%
Missing: 9
Missing (%): 4.1%
Memory size: 1.9 KiB
Minimum: 2001-06-30 00:00:00
Maximum: 2023-01-20 00:00:00
Histogram with fixed size bins (bins=50)
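
The distinct, missing, minimum and maximum figures for a date column can be computed once the raw strings are parsed. A minimal sketch, assuming ISO-formatted dates and the `<NA>` marker that appears in the Sample section of this report; the ten input values are the 사용 승인일자 entries of the last rows shown there, and the function name is illustrative.

```python
from datetime import date, datetime
from typing import Optional

def parse_approval_date(text: str) -> Optional[date]:
    """Parse a YYYY-MM-DD string; treat '' and '<NA>' as missing."""
    if text in ("", "<NA>"):
        return None
    return datetime.strptime(text, "%Y-%m-%d").date()

# 사용 승인일자 values of rows 211-220 in the Sample section.
raw_dates = [
    "2002-12-20", "2002-08-10", "2002-07-22", "2002-12-20", "2002-12-26",
    "<NA>", "2002-02-26", "2002-12-02", "2002-10-16", "2002-03-25",
]

parsed = [parse_approval_date(t) for t in raw_dates]
present = [d for d in parsed if d is not None]

missing = len(parsed) - len(present)
distinct = len(set(present))
earliest, latest = min(present), max(present)
print(missing, distinct, earliest, latest)
```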

공동주택 세대수
Real number (ℝ)

Distinct: 63
Distinct (%): 28.6%
Missing: 1
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.795455
Minimum: 2
Maximum: 2752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 8
median: 14
Q3: 24
95-th percentile: 496.7
Maximum: 2752
Range: 2750
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 260.83769
Coefficient of variation (CV): 3.0402274
Kurtosis: 55.868841
Mean: 85.795455
Median Absolute Deviation (MAD): 6
Skewness: 6.5590652
Sum: 18875
Variance: 68036.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
8                  39     17.6%
16                 22     10.0%
7                  15     6.8%
12                 15     6.8%
4                  9      4.1%
14                 8      3.6%
20                 8      3.6%
6                  7      3.2%
19                 6      2.7%
11                 5      2.3%
Other values (53)  86     38.9%

Minimum 10 values
Value  Count  Frequency (%)
2      2      0.9%
3      1      0.5%
4      9      4.1%
5      1      0.5%
6      7      3.2%
7      15     6.8%
8      39     17.6%
9      4      1.8%
10     4      1.8%
11     5      2.3%

Maximum 10 values
Value  Count  Frequency (%)
2752   1      0.5%
1631   1      0.5%
998    1      0.5%
882    1      0.5%
828    1      0.5%
703    1      0.5%
564    3      1.4%
548    1      0.5%
510    1      0.5%
496    1      0.5%
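
The gap between the median (14) and the mean (85.795455), together with a skewness of 6.56, points to a small number of very large complexes. One common way to flag them is the 1.5 x IQR rule (Tukey fences). The report does not apply this rule itself, so the sketch below is an illustration using the quartiles and the ten largest values listed above.

```python
# Quartiles copied from the quantile statistics of 공동주택 세대수.
q1, q3 = 8, 24
iqr = q3 - q1

# Tukey fences: values outside [lower, upper] are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest distinct values from the extreme values table.
largest = [2752, 1631, 998, 882, 828, 703, 564, 548, 510, 496]
flagged = [v for v in largest if v > upper_fence]

print(f"fences: [{lower_fence}, {upper_fence}], flagged {len(flagged)} of {len(largest)}")
```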

비고
Categorical

IMBALANCE 

Distinct: 12
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB
서울보증보험주식회사: 144
건설공제조합: 49
대한주택보증주식회사: 8
주택도시보증공사: 6
서울보증보헙주식회사: 6
Other values (7): 8

Length

Max length: 12
Median length: 10
Mean length: 9.0045249
Min length: 5

Unique

Unique: 6
Unique (%): 2.7%

Sample

1st row 서울보증보험주식회사
2nd row 건설공제조합
3rd row 엔지니어링공제조합
4th row 서울보증보험주식회사
5th row 서울보증보험주식회사

Common Values

Value                 Count  Frequency (%)
서울보증보험주식회사  144    65.2%
건설공제조합          49     22.2%
대한주택보증주식회사  8      3.6%
주택도시보증공사      6      2.7%
서울보증보헙주식회사  6      2.7%
임대사업자            2      0.9%
엔지니어링공제조합    1      0.5%
주택임대사업자        1      0.5%
대힌주택보증주식회사  1      0.5%
서울보증 동래지점     1      0.5%
Other values (2)      2      0.9%
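
Two of the values above, 서울보증보헙주식회사 and 대힌주택보증주식회사, differ from a more frequent value by a single character and look like data-entry typos. A minimal sketch of folding such near-duplicates into a canonical spelling with the standard library's `difflib`. The canonical list and the 0.85 cutoff are assumptions chosen for this illustration, not something the report defines.

```python
import difflib
from collections import Counter

# Assumed canonical spellings: the four most frequent values in the table above.
CANONICAL = ["서울보증보험주식회사", "건설공제조합", "대한주택보증주식회사", "주택도시보증공사"]

def canonicalize(name: str, cutoff: float = 0.85) -> str:
    """Return the closest canonical name, or the input unchanged if none is close enough."""
    match = difflib.get_close_matches(name, CANONICAL, n=1, cutoff=cutoff)
    return match[0] if match else name

# Counts copied from the table above (the two unnamed "Other values" are omitted).
observed = {
    "서울보증보험주식회사": 144, "건설공제조합": 49, "대한주택보증주식회사": 8,
    "주택도시보증공사": 6, "서울보증보헙주식회사": 6, "임대사업자": 2,
    "엔지니어링공제조합": 1, "주택임대사업자": 1, "대힌주택보증주식회사": 1,
    "서울보증 동래지점": 1,
}

cleaned = Counter()
for name, count in observed.items():
    cleaned[canonicalize(name)] += count

for name, count in cleaned.most_common():
    print(name, count)
```

A fairly strict cutoff keeps genuinely different guarantors such as 엔지니어링공제조합 separate from 건설공제조합. Any automatic merge of this kind should still be reviewed by hand before the counts are reused.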

Length

Histogram of lengths of the category

Value                 Count  Frequency (%)
서울보증보험주식회사  144    64.3%
건설공제조합          51     22.8%
대한주택보증주식회사  8      3.6%
주택도시보증공사      6      2.7%
서울보증보헙주식회사  6      2.7%
임대사업자            2      0.9%
엔지니어링공제조합    1      0.4%
주택임대사업자        1      0.4%
대힌주택보증주식회사  1      0.4%
서울보증              1      0.4%
Other values (3)      3      1.3%

Interactions


Correlations

                 관리시군구  관리연도  공동주택 세대수  비고
관리시군구       1.000       0.980     0.220            0.418
관리연도         0.980       1.000     0.441            0.654
공동주택 세대수  0.220       0.441     1.000            0.628
비고             0.418       0.654     0.628            1.000

            관리시군구  비고
관리시군구  1.000       0.317
비고        0.317       1.000

                 관리연도  공동주택 세대수  관리시군구  비고
관리연도         1.000     0.154            0.863       0.356
공동주택 세대수  0.154     1.000            0.157       0.292
관리시군구       0.863     0.157            1.000       0.317
비고             0.356     0.292            0.317       1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
     관리시군구           관리연도  건축물위치                         사용 승인일자  공동주택 세대수  비고
0    부산광역시 해운대구  2023      부산광역시 해운대구 우동 587-1     2023-01-20     548              서울보증보험주식회사
1    부산광역시 해운대구  2022      부산광역시 해운대구 반여동 985-1   2022-06-28     8                건설공제조합
2    부산광역시 해운대구  2022      부산광역시 해운대구 중동 1369-8    2022-02-11     152              엔지니어링공제조합
3    부산광역시 해운대구  2022      부산광역시 해운대구 중동 236       2022-01-14     8                서울보증보험주식회사
4    부산광역시 해운대구  2021      부산광역시 해운대구 재송동 1056-2  2021-06-10     11               서울보증보험주식회사
5    부산광역시 해운대구  2021      부산광역시 해운대구 중동 1140      2021-05-27     298              주택도시보증공사
6    부산광역시 해운대구  2021      부산광역시 해운대구 우동 639-3     2021-03-18     39               서울보증보험주식회사
7    부산광역시 해운대구  2021      부산광역시 해운대구 중동 1521-10   2021-02-10     12               서울보증보험주식회사
8    부산광역시 해운대구  2021      부산광역시 해운대구 중동 846-1     2021-01-20     17               건설공제조합
9    부산광역시 해운대구  2021      부산광역시 해운대구 중동 843       2020-12-24     26               서울보증보험주식회사

Last rows
     관리시군구           관리연도  건축물위치                                                  사용 승인일자  공동주택 세대수  비고
211  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 1500-8 (2동)                  2002-12-20     13               서울보증보험주식회사
212  부산광역시 해운대구  2002      부산광역시 해운대구 재송동 대지 1127-20 (1동) 외 1필지      2002-08-10     14               서울보증보험주식회사
213  부산광역시 해운대구  2002      부산광역시 해운대구 우동 대지 1087-14 (2동)                 2002-07-22     14               서울보증보험주식회사
214  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 1500-13 (2동)                 2002-12-20     16               서울보증보험주식회사
215  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 423-3 (2동)                   2002-12-26     16               서울보증보험주식회사
216  부산광역시 해운대구  2002      부산광역시 해운대구 송정동 대지 422-3                       <NA>           16               서울보증보험주식회사
217  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 1495-1 (2동)                  2002-02-26     16               서울보증보험주식회사
218  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 1477-18 (1동) 경신윈트빌 1동  2002-12-02     17               서울보증보험주식회사
219  부산광역시 해운대구  2002      부산광역시 해운대구 중동 대지 1491-17 (2동)                 2002-10-16     17               서울보증보험주식회사
220  부산광역시 해운대구  2002      부산광역시 해운대구 반여동 대지 1373-8 (2동)                2002-03-25     19               서울보증보험주식회사

Duplicate rows

Most frequently occurring

     관리시군구           관리연도  건축물위치                             사용 승인일자  공동주택 세대수  비고          # duplicates
0    부산광역시 해운대구  2018      부산광역시 해운대구 송정동 200-11번지  2018-01-18     39               건설공제조합  2
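
A duplicate row here means a record whose six fields all match another record. A minimal sketch of that check with the standard library, using the duplicated record above twice and one unrelated record from the Sample section as stand-in data. Treating the overview figure "Duplicate rows 1" as the number of extra copies is an assumption.

```python
from collections import Counter

# Each record: (관리시군구, 관리연도, 건축물위치, 사용 승인일자, 공동주택 세대수, 비고)
dup = ("부산광역시 해운대구", 2018, "부산광역시 해운대구 송정동 200-11번지",
       "2018-01-18", 39, "건설공제조합")
other = ("부산광역시 해운대구", 2023, "부산광역시 해운대구 우동 587-1",
         "2023-01-20", 548, "서울보증보험주식회사")
rows = [dup, other, dup]

counts = Counter(rows)
# Records that occur more than once, with their number of occurrences.
duplicated = {row: n for row, n in counts.items() if n > 1}
# Extra copies beyond the first occurrence of each record.
extra_copies = sum(n - 1 for n in duplicated.values())

for row, n in duplicated.items():
    print(n, row)
```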