Overview

Dataset statistics

Number of variables: 8
Number of observations: 2109
Missing cells: 600
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 140.2 KiB
Average record size in memory: 68.1 B
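
The headline figures above can be cross-checked against the per-variable counts reported further down. The following is a minimal Python sketch using only numbers quoted in this report. It assumes that the three per-variable missing counts (577, 20, 3) are the only ones, and that the DateTime variable is the 준공일 column named in the sample header.

```python
# Cross-check of the dataset statistics using only numbers quoted in this report.
n_variables = 8
n_observations = 2109

# Per-variable missing counts as listed in the variable sections.
# Assumption: the DateTime block (3 missing) belongs to the 준공일 column.
missing_by_variable = {"건축물명": 577, "세대수": 20, "준공일": 3}

missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average record size: total memory (KiB converted to bytes) divided by row count.
total_bytes = 140.2 * 1024
avg_record_bytes = total_bytes / n_observations

print(missing_cells)               # 600
print(round(missing_pct, 1))       # 3.6
print(round(avg_record_bytes, 1))  # 68.1
```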

Variable types

Numeric: 4
Text: 1
Categorical: 2
DateTime: 1

Dataset

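Description: 부산광역시사하구_150세대미만공동주택현황_20230809 (Busan Saha-gu, status of multi-family housing with fewer than 150 households, 20230809)
Author: 부산광역시 사하구 (Saha-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15103369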

Alerts

시군구명 has constant value "부산광역시 사하구" (Constant)
연번 is highly overall correlated with 법정동명 (High correlation)
법정동명 is highly overall correlated with 연번 (High correlation)
건축물명 has 577 (27.4%) missing values (Missing)
연번 has unique values (Unique)
부번 has 206 (9.8%) zeros (Zeros)
세대수 has 37 (1.8%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 17:50:08.514211
Analysis finished: 2023-12-10 17:50:15.734195
Duration: 7.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
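
A report like this one could be regenerated along the following lines. This is a minimal sketch, not the exact command used for this run. The `build_report` helper, the CSV path argument, and the `cp949` encoding are assumptions for illustration, and the settings in config.json are not applied here. The duration check uses only the two timestamps listed above.

```python
from datetime import datetime


def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Hypothetical helper: profile a CSV file and write an HTML report."""
    # Imported lazily so that the rest of this sketch runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="부산광역시사하구_150세대미만공동주택현황_20230809")
    report.to_file(html_path)
    return report


# Check the reported duration against the start and finish timestamps.
started = datetime.fromisoformat("2023-12-10 17:50:08.514211")
finished = datetime.fromisoformat("2023-12-10 17:50:15.734195")
duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))  # 7.22
```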

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 2109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1055
Minimum: 1
Maximum: 2109
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 106.4
Q1: 528
Median: 1055
Q3: 1582
95-th percentile: 2003.6
Maximum: 2109
Range: 2108
Interquartile range (IQR): 1054

Descriptive statistics

Standard deviation: 608.96018
Coefficient of variation (CV): 0.57721344
Kurtosis: -1.2
Mean: 1055
Median Absolute Deviation (MAD): 527
Skewness: 0
Sum: 2224995
Variance: 370832.5
Monotonicity: Strictly increasing
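
The statistics above are what one would expect from a plain row counter. The sketch below assumes that 연번 holds exactly the integers 1 to 2109 (consistent with the reported minimum, maximum, distinct count, and sum) and recomputes the main figures with the Python standard library. The `percentile` helper is a hypothetical linear-interpolation implementation written for this example, and the variance uses the sample (n - 1) denominator.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..2109 with no gaps.
values = list(range(1, 2110))


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] (illustrative helper)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = variance ** 0.5
cv = std_dev / mean
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance)           # 2224995 1055.0 370832.5
print(round(std_dev, 5), round(cv, 8), mad)  # 608.96018 0.57721344 527
print(round(percentile(values, 0.05), 1))    # 106.4
print(round(percentile(values, 0.95), 1))    # 2003.6
```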
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
< 0.1%
1418 1
 
< 0.1%
1416 1
 
< 0.1%
1415 1
 
< 0.1%
1414 1
 
< 0.1%
1413 1
 
< 0.1%
1412 1
 
< 0.1%
1411 1
 
< 0.1%
1410 1
 
< 0.1%
1409 1
 
< 0.1%
Other values (2099) 2099
99.5%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
10 1
< 0.1%
ValueCountFrequency (%)
2109 1
< 0.1%
2108 1
< 0.1%
2107 1
< 0.1%
2106 1
< 0.1%
2105 1
< 0.1%
2104 1
< 0.1%
2103 1
< 0.1%
2102 1
< 0.1%
2101 1
< 0.1%
2100 1
< 0.1%

건축물명 (building name)
Text

MISSING

Distinct: 1003
Distinct (%): 65.5%
Missing: 577
Missing (%): 27.4%
Memory size: 16.6 KiB

Length

Max length: 26
Median length: 21
Mean length: 5.328329
Min length: 2

Characters and Unicode

Total characters: 8163
Distinct characters: 369
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 751
Unique (%): 49.0%
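
The percentages for this variable are taken over the non-missing rows, not over all 2109 observations. A small Python sketch, using only the counts quoted above, shows how the reported ratios follow from that; the variable names are illustrative.

```python
# Counts quoted in the report for 건축물명.
n_observations = 2109
n_missing = 577
n_distinct = 1003
n_unique = 751          # values that occur exactly once
total_characters = 8163

n_present = n_observations - n_missing  # rows with a building name

missing_pct = 100 * n_missing / n_observations
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(n_present)               # 1532
print(round(missing_pct, 1))   # 27.4
print(round(distinct_pct, 1))  # 65.5
print(round(unique_pct, 1))    # 49.0
print(round(mean_length, 6))   # 5.328329
```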

Sample

1st row: 조은빌
2nd row: 대티빌
3rd row: 작품 하나
4th row: 신태양아파트
5th row: 신태양아파트
ValueCountFrequency (%)
에덴대진연립 18
 
1.1%
대진빌라 14
 
0.9%
화인빌리지 13
 
0.8%
협진태양아파트 11
 
0.7%
11
 
0.7%
우신주택 10
 
0.6%
임호빌리지 10
 
0.6%
금성연립주택 9
 
0.5%
선광무지개빌라 9
 
0.5%
호성빌라맨션 9
 
0.5%
Other values (1027) 1524
93.0%

Most occurring characters

ValueCountFrequency (%)
814
 
10.0%
494
 
6.1%
403
 
4.9%
373
 
4.6%
328
 
4.0%
211
 
2.6%
176
 
2.2%
167
 
2.0%
153
 
1.9%
129
 
1.6%
Other values (359) 4915
60.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 7843
96.1%
Space Separator 111
 
1.4%
Decimal Number 105
 
1.3%
Uppercase Letter 74
 
0.9%
Close Punctuation 10
 
0.1%
Open Punctuation 10
 
0.1%
Dash Punctuation 6
 
0.1%
Other Punctuation 3
 
< 0.1%
Letter Number 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
814
 
10.4%
494
 
6.3%
403
 
5.1%
373
 
4.8%
328
 
4.2%
211
 
2.7%
176
 
2.2%
167
 
2.1%
153
 
2.0%
129
 
1.6%
Other values (321) 4595
58.6%
Uppercase Letter
ValueCountFrequency (%)
A 16
21.6%
B 12
16.2%
S 9
12.2%
M 5
 
6.8%
C 4
 
5.4%
H 4
 
5.4%
T 4
 
5.4%
K 3
 
4.1%
E 3
 
4.1%
W 2
 
2.7%
Other values (10) 12
16.2%
Decimal Number
ValueCountFrequency (%)
1 38
36.2%
2 18
17.1%
0 18
17.1%
3 7
 
6.7%
9 6
 
5.7%
5 5
 
4.8%
4 5
 
4.8%
7 3
 
2.9%
8 3
 
2.9%
6 2
 
1.9%
Other Punctuation
ValueCountFrequency (%)
. 1
33.3%
/ 1
33.3%
, 1
33.3%
Space Separator
ValueCountFrequency (%)
111
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 7843
96.1%
Common 245
 
3.0%
Latin 75
 
0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
814
 
10.4%
494
 
6.3%
403
 
5.1%
373
 
4.8%
328
 
4.2%
211
 
2.7%
176
 
2.2%
167
 
2.1%
153
 
2.0%
129
 
1.6%
Other values (321) 4595
58.6%
Latin
ValueCountFrequency (%)
A 16
21.3%
B 12
16.0%
S 9
12.0%
M 5
 
6.7%
C 4
 
5.3%
H 4
 
5.3%
T 4
 
5.3%
K 3
 
4.0%
E 3
 
4.0%
W 2
 
2.7%
Other values (11) 13
17.3%
Common
ValueCountFrequency (%)
111
45.3%
1 38
 
15.5%
2 18
 
7.3%
0 18
 
7.3%
) 10
 
4.1%
( 10
 
4.1%
3 7
 
2.9%
9 6
 
2.4%
- 6
 
2.4%
5 5
 
2.0%
Other values (7) 16
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 7843
96.1%
ASCII 319
 
3.9%
Number Forms 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
814
 
10.4%
494
 
6.3%
403
 
5.1%
373
 
4.8%
328
 
4.2%
211
 
2.7%
176
 
2.2%
167
 
2.1%
153
 
2.0%
129
 
1.6%
Other values (321) 4595
58.6%
ASCII
ValueCountFrequency (%)
111
34.8%
1 38
 
11.9%
2 18
 
5.6%
0 18
 
5.6%
A 16
 
5.0%
B 12
 
3.8%
) 10
 
3.1%
( 10
 
3.1%
S 9
 
2.8%
3 7
 
2.2%
Other values (27) 70
21.9%
Number Forms
ValueCountFrequency (%)
1
100.0%

시군구명 (city/county/district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Value | Count
부산광역시 사하구 | 2109

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시 사하구
2nd row: 부산광역시 사하구
3rd row: 부산광역시 사하구
4th row: 부산광역시 사하구
5th row: 부산광역시 사하구

Common Values

Value | Count | Frequency (%)
부산광역시 사하구 | 2109 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
부산광역시 2109
50.0%
사하구 2109
50.0%

법정동명 (legal dong name)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Value | Count
괴정동 | 614
하단동 | 533
당리동 | 293
감천동 | 222
장림동 | 190
Other values (3) | 257

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 감천동
2nd row: 하단동
3rd row: 당리동
4th row: 괴정동
5th row: 괴정동

Common Values

Value | Count | Frequency (%)
괴정동 | 614 | 29.1%
하단동 | 533 | 25.3%
당리동 | 293 | 13.9%
감천동 | 222 | 10.5%
장림동 | 190 | 9.0%
신평동 | 122 | 5.8%
다대동 | 85 | 4.0%
구평동 | 50 | 2.4%
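
The frequency column and the "Other values (3)" row in the summary can be reproduced from the eight counts alone. The Python sketch below does this; it uses only the counts listed above, and the variable names are illustrative.

```python
# Counts per 법정동명 value as listed in the Common Values table.
counts = {
    "괴정동": 614, "하단동": 533, "당리동": 293, "감천동": 222,
    "장림동": 190, "신평동": 122, "다대동": 85, "구평동": 50,
}

total = sum(counts.values())

# Share of each value, rounded to one decimal as in the report.
shares = {name: round(100 * c / total, 1) for name, c in counts.items()}

# The summary shows the five largest values and lumps the rest together.
top5 = sorted(counts.values(), reverse=True)[:5]
other_values = total - sum(top5)

print(total)         # 2109
print(shares)        # 괴정동 29.1, 하단동 25.3, 당리동 13.9, ...
print(other_values)  # 257
```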

Length

Histogram of lengths of the category


본번 (main lot number)
Real number (ℝ)

Distinct: 519
Distinct (%): 24.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 490.09483
Minimum: 1
Maximum: 1577
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 33
Q1: 305
Median: 487
Q3: 618
95-th percentile: 1061.8
Maximum: 1577
Range: 1576
Interquartile range (IQR): 313

Descriptive statistics

Standard deviation: 299.67852
Coefficient of variation (CV): 0.61147048
Kurtosis: -0.14435991
Mean: 490.09483
Median Absolute Deviation (MAD): 169
Skewness: 0.53120411
Sum: 1033610
Variance: 89807.218
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
260 85
 
4.0%
870 54
 
2.6%
625 25
 
1.2%
354 24
 
1.1%
325 23
 
1.1%
764 22
 
1.0%
321 22
 
1.0%
352 20
 
0.9%
520 19
 
0.9%
381 19
 
0.9%
Other values (509) 1796
85.2%
ValueCountFrequency (%)
1 6
0.3%
2 2
 
0.1%
4 1
 
< 0.1%
5 1
 
< 0.1%
11 10
0.5%
12 1
 
< 0.1%
13 4
 
0.2%
14 3
 
0.1%
15 9
0.4%
16 7
0.3%
ValueCountFrequency (%)
1577 1
 
< 0.1%
1551 2
 
0.1%
1548 1
 
< 0.1%
1282 1
 
< 0.1%
1275 1
 
< 0.1%
1273 1
 
< 0.1%
1270 8
0.4%
1233 1
 
< 0.1%
1230 1
 
< 0.1%
1229 3
 
0.1%

부번 (sub lot number)
Real number (ℝ)

ZEROS

Distinct: 214
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.223803
Minimum: 0
Maximum: 1105
Zeros: 206
Zeros (%): 9.8%
Negative: 0
Negative (%): 0.0%
Memory size: 18.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 8
Q3: 26
95-th percentile: 146
Maximum: 1105
Range: 1105
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 80.843133
Coefficient of variation (CV): 2.5088018
Kurtosis: 61.275896
Mean: 32.223803
Median Absolute Deviation (MAD): 7
Skewness: 6.6131698
Sum: 67960
Variance: 6535.6121
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 206
 
9.8%
1 181
 
8.6%
4 129
 
6.1%
2 122
 
5.8%
3 121
 
5.7%
5 89
 
4.2%
6 83
 
3.9%
7 70
 
3.3%
9 63
 
3.0%
8 58
 
2.8%
Other values (204) 987
46.8%
ValueCountFrequency (%)
0 206
9.8%
1 181
8.6%
2 122
5.8%
3 121
5.7%
4 129
6.1%
5 89
4.2%
6 83
3.9%
7 70
 
3.3%
8 58
 
2.8%
9 63
 
3.0%
ValueCountFrequency (%)
1105 1
< 0.1%
1000 1
< 0.1%
946 1
< 0.1%
939 1
< 0.1%
936 2
0.1%
698 1
< 0.1%
526 1
< 0.1%
501 1
< 0.1%
500 1
< 0.1%
494 1
< 0.1%

세대수 (number of households)
Real number (ℝ)

ZEROS

Distinct: 94
Distinct (%): 4.5%
Missing: 20
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.386309
Minimum: 0
Maximum: 146
Zeros: 37
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 18.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
Median: 8
Q3: 16
95-th percentile: 45.6
Maximum: 146
Range: 146
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 16.937332
Coefficient of variation (CV): 1.1773229
Kurtosis: 17.883843
Mean: 14.386309
Median Absolute Deviation (MAD): 4
Skewness: 3.7233531
Sum: 30053
Variance: 286.8732
Monotonicity: Not monotonic
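
The reported mean only matches the reported sum if the 20 missing rows are left out of the denominator. The Python sketch below checks this, together with the coefficient of variation and the variance, using only figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the report for 세대수.
n_observations = 2109
n_missing = 20
n_zeros = 37
total_sum = 30053
std_dev = 16.937332

n_present = n_observations - n_missing  # rows that contribute to the mean

mean = total_sum / n_present             # missing rows are excluded
cv = std_dev / mean                      # coefficient of variation
variance = std_dev ** 2

missing_pct = 100 * n_missing / n_observations
zeros_pct = 100 * n_zeros / n_observations

print(n_present)           # 2089
print(round(mean, 6))      # 14.386309
print(round(cv, 5))        # 1.17732
print(round(variance, 3))  # 286.873
print(round(missing_pct, 1), round(zeros_pct, 1))  # 0.9 1.8
```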
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
8 600
28.4%
4 218
 
10.3%
12 117
 
5.5%
6 105
 
5.0%
16 97
 
4.6%
7 91
 
4.3%
18 66
 
3.1%
19 56
 
2.7%
9 55
 
2.6%
10 52
 
2.5%
Other values (84) 632
30.0%
ValueCountFrequency (%)
0 37
 
1.8%
1 11
 
0.5%
2 44
 
2.1%
3 36
 
1.7%
4 218
 
10.3%
5 31
 
1.5%
6 105
 
5.0%
7 91
 
4.3%
8 600
28.4%
9 55
 
2.6%
ValueCountFrequency (%)
146 1
< 0.1%
142 2
0.1%
138 1
< 0.1%
132 1
< 0.1%
131 1
< 0.1%
130 1
< 0.1%
126 1
< 0.1%
123 1
< 0.1%
120 1
< 0.1%
119 2
0.1%

준공일 (completion date)
DateTime

Distinct: 1400
Distinct (%): 66.5%
Missing: 3
Missing (%): 0.1%
Memory size: 16.6 KiB
Minimum: 1945-04-11 00:00:00
Maximum: 2022-08-16 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: pairwise interaction plots]

Correlations

 | 연번 | 법정동명 | 본번 | 부번 | 세대수
연번 | 1.000 | 0.764 | 0.607 | 0.204 | 0.158
법정동명 | 0.764 | 1.000 | 0.629 | 0.266 | 0.151
본번 | 0.607 | 0.629 | 1.000 | 0.262 | 0.316
부번 | 0.204 | 0.266 | 0.262 | 1.000 | 0.079
세대수 | 0.158 | 0.151 | 0.316 | 0.079 | 1.000

 | 연번 | 본번 | 부번 | 세대수 | 법정동명
연번 | 1.000 | 0.071 | -0.035 | 0.100 | 0.502
본번 | 0.071 | 1.000 | -0.085 | 0.117 | 0.363
부번 | -0.035 | -0.085 | 1.000 | -0.163 | 0.092
세대수 | 0.100 | 0.117 | -0.163 | 1.000 | 0.074
법정동명 | 0.502 | 0.363 | 0.092 | 0.074 | 1.000
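
To relate the two matrices to the High correlation alert, one can list the variable pairs whose coefficient exceeds a cutoff. The Python sketch below does this with a cutoff of 0.5. That cutoff is an assumption: the report does not state which matrix or threshold produced the alert. The `pairs_above` helper is hypothetical.

```python
def pairs_above(names, matrix, threshold):
    """Return (a, b, value) for each off-diagonal pair with |value| >= threshold."""
    found = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                found.append((names[i], names[j], matrix[i][j]))
    return found


# First matrix, in the order shown in the report.
names_1 = ["연번", "법정동명", "본번", "부번", "세대수"]
matrix_1 = [
    [1.000, 0.764, 0.607, 0.204, 0.158],
    [0.764, 1.000, 0.629, 0.266, 0.151],
    [0.607, 0.629, 1.000, 0.262, 0.316],
    [0.204, 0.266, 0.262, 1.000, 0.079],
    [0.158, 0.151, 0.316, 0.079, 1.000],
]

# Second matrix, in the order shown in the report.
names_2 = ["연번", "본번", "부번", "세대수", "법정동명"]
matrix_2 = [
    [1.000, 0.071, -0.035, 0.100, 0.502],
    [0.071, 1.000, -0.085, 0.117, 0.363],
    [-0.035, -0.085, 1.000, -0.163, 0.092],
    [0.100, 0.117, -0.163, 1.000, 0.074],
    [0.502, 0.363, 0.092, 0.074, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

high_1 = pairs_above(names_1, matrix_1, THRESHOLD)
high_2 = pairs_above(names_2, matrix_2, THRESHOLD)
print(high_1)  # three pairs among 연번, 법정동명, 본번
print(high_2)  # only 연번 / 법정동명 (0.502)
```

Under this assumed cutoff, the second matrix yields exactly the pair named in the alerts (연번 and 법정동명), while the first matrix would also flag the pairs involving 본번.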

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

 | 연번 | 건축물명 | 시군구명 | 법정동명 | 본번부번세대수 | 준공일
0 | 1 | 조은빌 | 부산광역시 사하구 | 감천동 | 830198 | 2010-12-22
1 | 2 | <NA> | 부산광역시 사하구 | 하단동 | 4962424 | 2021-12-15
2 | 3 | <NA> | 부산광역시 사하구 | 당리동 | 2821220 | 2011-03-14
3 | 4 | 대티빌 | 부산광역시 사하구 | 괴정동 | 4946 | 1992-04-23
4 | 5 | <NA> | 부산광역시 사하구 | 괴정동 | 53890 | 1974-11-01
5 | 6 | 작품 하나 | 부산광역시 사하구 | 하단동 | 5008516 | 2019-11-19
6 | 7 | <NA> | 부산광역시 사하구 | 괴정동 | 23265 | 1979-11-10
7 | 8 | <NA> | 부산광역시 사하구 | 괴정동 | 23816 | 1982-08-24
8 | 9 | <NA> | 부산광역시 사하구 | 괴정동 | 238105 | 1980-07-14
9 | 10 | 신태양아파트 | 부산광역시 사하구 | 괴정동 | 2408030 | 1983-06-21

 | 연번 | 건축물명 | 시군구명 | 법정동명 | 본번부번세대수 | 준공일
2099 | 2100 | 더블유 | 부산광역시 사하구 | 괴정동 | 37820212 | 2020-06-01
2100 | 2101 | <NA> | 부산광역시 사하구 | 하단동 | 4873419 | 2020-07-20
2101 | 2102 | 동아파크 | 부산광역시 사하구 | 하단동 | 4962210 | 2021-06-24
2102 | 2103 | <NA> | 부산광역시 사하구 | 괴정동 | 5422110 | 2021-05-13
2103 | 2104 | 영웅빌A동 | 부산광역시 사하구 | 당리동 | 34827 | 2021-04-27
2104 | 2105 | 영웅빌B동 | 부산광역시 사하구 | 당리동 | 34828 | 2021-04-27
2105 | 2106 | <NA> | 부산광역시 사하구 | 괴정동 | 49448 | 2021-09-09
2106 | 2107 | 이한 | 부산광역시 사하구 | 하단동 | 5021228 | 2022-02-16
2107 | 2108 | 나산주택 | 부산광역시 사하구 | 괴정동 | 122724 | 2002-02-20
2108 | 2109 | <NA> | 부산광역시 사하구 | 괴정동 | 77549 | 2022-08-16