Overview

Dataset statistics

Number of variables: 7
Number of observations: 78
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.7 KiB
Average record size in memory: 61.6 B

Variable types

Text: 2
Numeric: 3
Categorical: 1
DateTime: 1

Dataset

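Description: 부산광역시_중구_공동주택현황_20190314 (Busan Metropolitan City, Jung-gu: multi-unit housing status, 2019-03-14)
Author: 부산광역시 중구 (Jung-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3072711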

Alerts

세대수 is highly overall correlated with 연면적(㎡) and 1 other field [High correlation]
연면적(㎡) is highly overall correlated with 세대수 [High correlation]
동수 is highly overall correlated with 세대수 [High correlation]
동수 is highly imbalanced (54.1%) [Imbalance]
소재지 has unique values [Unique]

Reproduction

Analysis started: 2024-04-21 11:16:30.147632
Analysis finished: 2024-04-21 11:16:33.635249
Duration: 3.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
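
The reported duration follows directly from the two timestamps above. A minimal Python check using only the standard library (the variable names are illustrative, not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2024-04-21 11:16:30.147632")
finished = datetime.fromisoformat("2024-04-21 11:16:33.635249")

# Elapsed wall-clock time in seconds, rounded to two decimals as in the report.
duration_s = round((finished - started).total_seconds(), 2)
print(duration_s)
```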

Variables

단지명 (complex name)
Text

Distinct: 77
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 752.0 B

Length

Max length: 13
Median length: 11
Mean length: 6.0641026
Min length: 3

Characters and Unicode

Total characters: 473
Distinct characters: 142
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
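
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, the input is only the five sample rows shown in this section, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# The five sample rows of this text variable, copied from the report.
sample = ["신창아파트", "신생아파트", "영주APT3블럭", "영주APT2블럭", "영주APT9블럭"]
counts = category_counts(sample)

# "Lo" = Other Letter (Hangul syllables), "Lu" = Uppercase Letter, "Nd" = Decimal Number.
print(counts)
```

The two-letter codes map onto the category names used in the tables of this report, for example `Lo` for "Other Letter" and `Nd` for "Decimal Number".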

Unique

Unique: 76
Unique (%): 97.4%

Sample

1st row: 신창아파트
2nd row: 신생아파트
3rd row: 영주APT3블럭
4th row: 영주APT2블럭
5th row: 영주APT9블럭
Value | Count | Frequency (%)
로하스 | 2 | 2.5%
봄여름가을겨울 | 2 | 2.5%
코모도에스테이트 | 1 | 1.2%
부평파크아파트 | 1 | 1.2%
수목하우스2차 | 1 | 1.2%
부평펠리스 | 1 | 1.2%
더휴빌라 | 1 | 1.2%
보수2차 | 1 | 1.2%
미소지움 | 1 | 1.2%
도경오벨리스 | 1 | 1.2%
Other values (69) | 69 | 85.2%

Most occurring characters

ValueCountFrequency (%)
29
 
6.1%
29
 
6.1%
27
 
5.7%
15
 
3.2%
14
 
3.0%
10
 
2.1%
10
 
2.1%
10
 
2.1%
9
 
1.9%
9
 
1.9%
Other values (132) 311
65.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 429 | 90.7%
Decimal Number | 19 | 4.0%
Uppercase Letter | 15 | 3.2%
Space Separator | 4 | 0.8%
Close Punctuation | 3 | 0.6%
Open Punctuation | 3 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
29
 
6.8%
29
 
6.8%
27
 
6.3%
15
 
3.5%
14
 
3.3%
10
 
2.3%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (121) 267
62.2%
Decimal Number
Value | Count | Frequency (%)
2 | 6 | 31.6%
1 | 6 | 31.6%
3 | 3 | 15.8%
0 | 2 | 10.5%
9 | 2 | 10.5%
Uppercase Letter
Value | Count | Frequency (%)
T | 5 | 33.3%
P | 5 | 33.3%
A | 5 | 33.3%
Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 429 | 90.7%
Common | 29 | 6.1%
Latin | 15 | 3.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
29
 
6.8%
29
 
6.8%
27
 
6.3%
15
 
3.5%
14
 
3.3%
10
 
2.3%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (121) 267
62.2%
Common
Value | Count | Frequency (%)
2 | 6 | 20.7%
1 | 6 | 20.7%
(space) | 4 | 13.8%
) | 3 | 10.3%
( | 3 | 10.3%
3 | 3 | 10.3%
0 | 2 | 6.9%
9 | 2 | 6.9%
Latin
Value | Count | Frequency (%)
T | 5 | 33.3%
P | 5 | 33.3%
A | 5 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 429 | 90.7%
ASCII | 44 | 9.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
29
 
6.8%
29
 
6.8%
27
 
6.3%
15
 
3.5%
14
 
3.3%
10
 
2.3%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
Other values (121) 267
62.2%
ASCII
Value | Count | Frequency (%)
2 | 6 | 13.6%
1 | 6 | 13.6%
T | 5 | 11.4%
P | 5 | 11.4%
A | 5 | 11.4%
(space) | 4 | 9.1%
) | 3 | 6.8%
( | 3 | 6.8%
3 | 3 | 6.8%
0 | 2 | 4.5%

소재지 (location)
Text

UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 752.0 B

Length

Max length: 16
Median length: 15
Mean length: 9.1538462
Min length: 4

Characters and Unicode

Total characters: 714
Distinct characters: 33
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 100.0%

Sample

1st row: 창선동1가9-1외1
2nd row: 신창동2가22
3rd row: 영주동73-1
4th row: 영주동72-4
5th row: 영주동93-4
Value | Count | Frequency (%)
보수동2가 | 11 | 9.5%
영주동 | 8 | 6.9%
부평동4가 | 7 | 6.0%
대청동1가 | 3 | 2.6%
91-1 | 2 | 1.7%
대청동2가 | 2 | 1.7%
대청동4가 | 2 | 1.7%
보수동3가 | 2 | 1.7%
37-1 | 2 | 1.7%
37-6 | 1 | 0.9%
Other values (76) | 76 | 65.5%

Most occurring characters

ValueCountFrequency (%)
78
 
10.9%
- 67
 
9.4%
1 67
 
9.4%
56
 
7.8%
2 54
 
7.6%
4 41
 
5.7%
41
 
5.7%
5 29
 
4.1%
3 26
 
3.6%
7 24
 
3.4%
Other values (23) 231
32.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 312 | 43.7%
Other Letter | 290 | 40.6%
Dash Punctuation | 67 | 9.4%
Space Separator | 41 | 5.7%
Other Punctuation | 4 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
78
26.9%
56
19.3%
22
 
7.6%
22
 
7.6%
21
 
7.2%
21
 
7.2%
13
 
4.5%
13
 
4.5%
13
 
4.5%
11
 
3.8%
Other values (10) 20
 
6.9%
Decimal Number
Value | Count | Frequency (%)
1 | 67 | 21.5%
2 | 54 | 17.3%
4 | 41 | 13.1%
5 | 29 | 9.3%
3 | 26 | 8.3%
7 | 24 | 7.7%
8 | 22 | 7.1%
6 | 18 | 5.8%
9 | 17 | 5.4%
0 | 14 | 4.5%
Dash Punctuation
Value | Count | Frequency (%)
- | 67 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 41 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 424 | 59.4%
Hangul | 290 | 40.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
78
26.9%
56
19.3%
22
 
7.6%
22
 
7.6%
21
 
7.2%
21
 
7.2%
13
 
4.5%
13
 
4.5%
13
 
4.5%
11
 
3.8%
Other values (10) 20
 
6.9%
Common
Value | Count | Frequency (%)
- | 67 | 15.8%
1 | 67 | 15.8%
2 | 54 | 12.7%
4 | 41 | 9.7%
(space) | 41 | 9.7%
5 | 29 | 6.8%
3 | 26 | 6.1%
7 | 24 | 5.7%
8 | 22 | 5.2%
6 | 18 | 4.2%
Other values (3) | 35 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 424 | 59.4%
Hangul | 290 | 40.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
78
26.9%
56
19.3%
22
 
7.6%
22
 
7.6%
21
 
7.2%
21
 
7.2%
13
 
4.5%
13
 
4.5%
13
 
4.5%
11
 
3.8%
Other values (10) 20
 
6.9%
ASCII
Value | Count | Frequency (%)
- | 67 | 15.8%
1 | 67 | 15.8%
2 | 54 | 12.7%
4 | 41 | 9.7%
(space) | 41 | 9.7%
5 | 29 | 6.8%
3 | 26 | 6.1%
7 | 24 | 5.7%
8 | 22 | 5.2%
6 | 18 | 4.2%
Other values (3) | 35 | 8.3%

층수 (number of floors)
Real number (ℝ)

Distinct: 15
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.6923077
Minimum: 4
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 830.0 B

Quantile statistics

Minimum: 4
5-th percentile: 4
Q1: 6
Median: 9
Q3: 14
95-th percentile: 16
Maximum: 21
Range: 17
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.3642815
Coefficient of variation (CV): 0.45028301
Kurtosis: -0.73855669
Mean: 9.6923077
Median Absolute Deviation (MAD): 4
Skewness: 0.54069049
Sum: 756
Variance: 19.046953
Monotonicity: Not monotonic
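
Several of the derived figures above can be re-derived from the other reported values. The following Python sketch checks the coefficient of variation, range, IQR, and variance for 층수, using only numbers copied from this report; the variable names are illustrative.

```python
# Summary values copied from the report for the variable 층수 (floors).
mean, std = 9.6923077, 4.3642815
minimum, maximum = 4, 21
q1, q3 = 6, 14

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
variance = std ** 2              # variance is the squared standard deviation

print(round(cv, 4), value_range, iqr, round(variance, 3))
```

Each result agrees with the corresponding reported value up to the rounding of the inputs.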
Histogram with fixed size bins (bins=15)
Common values
Value | Count | Frequency (%)
5 | 13 | 16.7%
15 | 12 | 15.4%
9 | 9 | 11.5%
8 | 9 | 11.5%
6 | 7 | 9.0%
4 | 5 | 6.4%
14 | 5 | 6.4%
11 | 4 | 5.1%
7 | 3 | 3.8%
10 | 3 | 3.8%
Other values (5) | 8 | 10.3%
Minimum 10 values
Value | Count | Frequency (%)
4 | 5 | 6.4%
5 | 13 | 16.7%
6 | 7 | 9.0%
7 | 3 | 3.8%
8 | 9 | 11.5%
9 | 9 | 11.5%
10 | 3 | 3.8%
11 | 4 | 5.1%
12 | 2 | 2.6%
14 | 5 | 6.4%
Maximum 10 values
Value | Count | Frequency (%)
21 | 1 | 1.3%
20 | 1 | 1.3%
18 | 1 | 1.3%
16 | 3 | 3.8%
15 | 12 | 15.4%
14 | 5 | 6.4%
12 | 2 | 2.6%
11 | 4 | 5.1%
10 | 3 | 3.8%
9 | 9 | 11.5%

동수 (number of buildings)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Memory size: 752.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.3%

Sample

1st row: 1
2nd row: 1
3rd row: 4
4th row: 4
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 62 | 79.5%
2 | 9 | 11.5%
4 | 3 | 3.8%
3 | 3 | 3.8%
5 | 1 | 1.3%
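
The 54.1% imbalance figure in the Alerts section can be reproduced from these counts with a normalized-entropy score, 1 - H / log(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. That this is the definition behind the alert is an assumption; the report does not state its formula. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log(k): 0 for a perfectly balanced variable, 1 for a single class.

    H is the Shannon entropy of the class frequencies; k is the number of classes.
    This definition is an assumption, not taken from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Value counts of 동수 from the Common Values table (values 1, 2, 4, 3, 5).
dongsu_counts = [62, 9, 3, 3, 1]
score = imbalance_score(dongsu_counts)
print(f"{score:.1%}")
```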

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

세대수 (number of households)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 51
Distinct (%): 65.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.525641
Minimum: 8
Maximum: 406
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 830.0 B

Quantile statistics

Minimum: 8
5-th percentile: 16
Q1: 22.5
Median: 43
Q3: 78.5
95-th percentile: 222.95
Maximum: 406
Range: 398
Interquartile range (IQR): 56

Descriptive statistics

Standard deviation: 77.363155
Coefficient of variation (CV): 1.0969508
Kurtosis: 6.2660376
Mean: 70.525641
Median Absolute Deviation (MAD): 23
Skewness: 2.4104284
Sum: 5501
Variance: 5985.0578
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
20 | 7 | 9.0%
24 | 4 | 5.1%
16 | 4 | 5.1%
60 | 4 | 5.1%
47 | 3 | 3.8%
28 | 3 | 3.8%
74 | 2 | 2.6%
48 | 2 | 2.6%
36 | 2 | 2.6%
18 | 2 | 2.6%
Other values (41) | 45 | 57.7%
Minimum 10 values
Value | Count | Frequency (%)
8 | 2 | 2.6%
12 | 1 | 1.3%
16 | 4 | 5.1%
18 | 2 | 2.6%
19 | 1 | 1.3%
20 | 7 | 9.0%
21 | 2 | 2.6%
22 | 1 | 1.3%
24 | 4 | 5.1%
27 | 1 | 1.3%
Maximum 10 values
Value | Count | Frequency (%)
406 | 1 | 1.3%
328 | 1 | 1.3%
322 | 1 | 1.3%
268 | 1 | 1.3%
215 | 1 | 1.3%
204 | 1 | 1.3%
192 | 1 | 1.3%
167 | 1 | 1.3%
147 | 1 | 1.3%
140 | 1 | 1.3%

연면적(㎡) (total floor area, m²)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 75
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6332.7821
Minimum: 662
Maximum: 36367
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 830.0 B

Quantile statistics

Minimum: 662
5-th percentile: 765.45
Q1: 1386.25
Median: 3182
Q3: 8148.5
95-th percentile: 20673.55
Maximum: 36367
Range: 35705
Interquartile range (IQR): 6762.25

Descriptive statistics

Standard deviation: 7354.5305
Coefficient of variation (CV): 1.1613428
Kurtosis: 5.1559534
Mean: 6332.7821
Median Absolute Deviation (MAD): 2213.5
Skewness: 2.1489546
Sum: 493957
Variance: 54089119
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
901 | 3 | 3.8%
2452 | 2 | 2.6%
1785 | 1 | 1.3%
1705 | 1 | 1.3%
897 | 1 | 1.3%
879 | 1 | 1.3%
2146 | 1 | 1.3%
841 | 1 | 1.3%
2525 | 1 | 1.3%
685 | 1 | 1.3%
Other values (65) | 65 | 83.3%
Minimum 10 values
Value | Count | Frequency (%)
662 | 1 | 1.3%
685 | 1 | 1.3%
716 | 1 | 1.3%
734 | 1 | 1.3%
771 | 1 | 1.3%
841 | 1 | 1.3%
879 | 1 | 1.3%
897 | 1 | 1.3%
900 | 1 | 1.3%
901 | 3 | 3.8%
Maximum 10 values
Value | Count | Frequency (%)
36367 | 1 | 1.3%
34055 | 1 | 1.3%
24625 | 1 | 1.3%
23408 | 1 | 1.3%
20191 | 1 | 1.3%
19027 | 1 | 1.3%
17612 | 1 | 1.3%
14764 | 1 | 1.3%
14390 | 1 | 1.3%
14004 | 1 | 1.3%

준공일자 (completion date)
DateTime

Distinct: 74
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Memory size: 752.0 B
Minimum: 1960-11-01 00:00:00
Maximum: 2018-07-13 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Nine interaction plots; the images are not preserved in this text.]

Correlations

 | 단지명 | 소재지 | 층수 | 동수 | 세대수 | 연면적(㎡) | 준공일자
단지명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997
소재지 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
층수 | 1.000 | 1.000 | 1.000 | 0.494 | 0.737 | 0.713 | 1.000
동수 | 1.000 | 1.000 | 0.494 | 1.000 | 0.945 | 0.422 | 0.000
세대수 | 1.000 | 1.000 | 0.737 | 0.945 | 1.000 | 0.778 | 0.000
연면적(㎡) | 1.000 | 1.000 | 0.713 | 0.422 | 0.778 | 1.000 | 0.000
준공일자 | 0.997 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000
 | 층수 | 세대수 | 연면적(㎡) | 동수
층수 | 1.000 | 0.069 | 0.390 | 0.216
세대수 | 0.069 | 1.000 | 0.767 | 0.656
연면적(㎡) | 0.390 | 0.767 | 1.000 | 0.270
동수 | 0.216 | 0.656 | 0.270 | 1.000
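
The high-correlation alerts in this report are consistent with flagging every pair in the second matrix above whose coefficient is at least 0.5. That cut-off is an assumption, since the report does not state the threshold it used, and the helper `high_pairs` is a hypothetical name. A minimal Python sketch:

```python
# Second correlation matrix from this report, in the order of its header row.
names = ["층수", "세대수", "연면적(㎡)", "동수"]
matrix = [
    [1.000, 0.069, 0.390, 0.216],
    [0.069, 1.000, 0.767, 0.656],
    [0.390, 0.767, 1.000, 0.270],
    [0.216, 0.656, 0.270, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def high_pairs(names, matrix, threshold):
    """Return each unordered pair whose absolute coefficient meets the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

flagged = high_pairs(names, matrix, THRESHOLD)
for a, b, r in flagged:
    print(a, b, r)
```

Under this assumption the flagged pairs are 세대수 with 연면적(㎡) and 세대수 with 동수, which matches the three high-correlation alerts listed in the Alerts section.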

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

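Row | 단지명 | 소재지 | 층수 | 동수 | 세대수 | 연면적(㎡) | 준공일자
0 | 신창아파트 | 창선동1가9-1외1 | 4 | 1 | 24 | 1785 | 1960-11-01
1 | 신생아파트 | 신창동2가22 | 6 | 1 | 29 | 1932 | 1967-12-27
2 | 영주APT3블럭 | 영주동73-1 | 4 | 4 | 192 | 901 | 1969-02-20
3 | 영주APT2블럭 | 영주동72-4 | 4 | 4 | 167 | 901 | 1969-04-15
4 | 영주APT9블럭 | 영주동93-4 | 4 | 2 | 72 | 901 | 1969-01-25
5 | 보수아파트 | 보수1가산3-148 | 5 | 5 | 406 | 14764 | 1969-12-30
6 | 부산데파트 | 동광동1가1-10 | 7 | 1 | 74 | 19027 | 1970-05-20
7 | 영주시민아파트 | 영주동 산1-200 | 4 | 4 | 215 | 8081 | 1971-03-07
8 | 동광아파트 | 동광동5가16-290 | 6 | 1 | 48 | 1904 | 1971-03-25
9 | 신보수아파트 | 보수동3가58 | 6 | 1 | 36 | 2452 | 1972-05-20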
단지명소재지층수동수세대수연면적(㎡)준공일자
68삼성팰리스보수동2가 79-1101209992016-03-28
69씨앤리진2차대청동1가 2281812052016-06-23
70수목하우스다동보수동2가 93-391189952016-07-13
71선경뉴빌아파트대창동1가 54-431514453892016-07-27
72에코하우스영주동 60-1481169782016-09-19
73대청포세이돈대청동4가 82-51112014022016-10-11
74수목하우스라동보수동2가 20-791209962017-07-14
75보수동 해마루아파트2차보수동2가 65-21512743062017-08-17
76보수3차 봄여름가을겨울보수동3가 73-351519897592017-10-31
77짐아트빌보수동2가 17-2091289622018-07-13