Overview

Dataset statistics

Number of variables: 5
Number of observations: 87
Missing cells: 87
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 43.5 B

Variable types

Numeric: 2
Text: 3

Dataset

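Description: 부산광역시사상구_기계설비성능점검대상건축물현황_20230120 (Busan Metropolitan City Sasang-gu, status of buildings subject to mechanical equipment performance inspection, 20230120)
Author: 부산광역시 사상구 (Sasang-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15112095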

Alerts

연번 (serial number) is highly overall correlated with 세대수 (number of households) [High correlation]
세대수 is highly overall correlated with 연번 [High correlation]
건축물 연면적 (building gross floor area) has 37 (42.5%) missing values [Missing]
세대수 has 50 (57.5%) missing values [Missing]
연번 has unique values [Unique]
상호명 (business name) has unique values [Unique]
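
The missing-value figures under Dataset statistics and Alerts are consistent with each other. A minimal Python sketch of the arithmetic, using only counts quoted in this report (the variable names are illustrative and not part of ydata-profiling):

```python
# Counts taken from the report; the names below are illustrative.
n_rows = 87
n_cols = 5
missing_per_column = {"건축물 연면적": 37, "세대수": 50}

# Per-column share of missing values, as listed in the alerts.
missing_pct = {
    col: round(100 * n / n_rows, 1) for col, n in missing_per_column.items()
}

# Dataset-level share: missing cells divided by all cells (rows x columns).
missing_cells = sum(missing_per_column.values())
missing_cells_pct = round(100 * missing_cells / (n_rows * n_cols), 1)

print(missing_pct)
print(missing_cells, missing_cells_pct)
```

The two columns with gaps account for all 87 missing cells; the other three columns are complete.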

Reproduction

Analysis started: 2023-12-10 16:55:18.410843
Analysis finished: 2023-12-10 16:55:20.255709
Duration: 1.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 87
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44
Minimum: 1
Maximum: 87
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.3
Q1: 22.5
Median: 44
Q3: 65.5
95-th percentile: 82.7
Maximum: 87
Range: 86
Interquartile range (IQR): 43
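
These quantiles are what linear interpolation gives for a plain 1..87 sequence. The sketch below assumes that 연번 is exactly that sequence (the report shows 87 distinct values, minimum 1, maximum 87, but never lists them all) and uses Python's `statistics.quantiles` with `method="inclusive"`. That may not be how the profiler computes them internally.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..87.
serial = list(range(1, 88))

# method="inclusive" interpolates linearly between the data's min and max.
q1, median, q3 = statistics.quantiles(serial, n=4, method="inclusive")
twentieths = statistics.quantiles(serial, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(p5, q1, median, q3, p95)
print("IQR:", q3 - q1)
```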

Descriptive statistics

Standard deviation: 25.258662
Coefficient of variation (CV): 0.5740605
Kurtosis: -1.2
Mean: 44
Median Absolute Deviation (MAD): 22
Skewness: 0
Sum: 3828
Variance: 638
Monotonicity: Strictly increasing
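
The descriptive statistics can be checked with the standard library alone. This is a sketch under the assumption that 연번 is the sequence 1..87, and it assumes the reported variance uses the sample (n - 1) denominator, which is what reproduces the value 638.

```python
import math
import statistics

# Assumption: 연번 is the integer sequence 1..87.
serial = list(range(1, 88))

total = sum(serial)
mean = statistics.mean(serial)
variance = statistics.variance(serial)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
median = statistics.median(serial)
# Median absolute deviation: median of absolute distances from the median.
mad = statistics.median([abs(v - median) for v in serial])

print(total, mean, variance, std, cv, mad)
```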
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1                  1      1.1%
2                  1      1.1%
65                 1      1.1%
64                 1      1.1%
63                 1      1.1%
62                 1      1.1%
61                 1      1.1%
60                 1      1.1%
59                 1      1.1%
58                 1      1.1%
Other values (77)  77     88.5%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.1%
2      1      1.1%
3      1      1.1%
4      1      1.1%
5      1      1.1%
6      1      1.1%
7      1      1.1%
8      1      1.1%
9      1      1.1%
10     1      1.1%

Maximum 10 values

Value  Count  Frequency (%)
87     1      1.1%
86     1      1.1%
85     1      1.1%
84     1      1.1%
83     1      1.1%
82     1      1.1%
81     1      1.1%
80     1      1.1%
79     1      1.1%
78     1      1.1%

상호명 (business name)
Text

UNIQUE

Distinct: 87
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 14
Median length: 11
Mean length: 7.6091954
Min length: 2

Characters and Unicode

Total characters: 662
Distinct characters: 189
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
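
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is not necessarily the library the profiler uses internally. The two strings are 상호명 values that appear in this report, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (two-letter codes) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# 상호명 values taken from the report.
print(category_counts("모라주공1단지"))
print(category_counts("현대무지개타운(상가)"))
```

The two-letter codes map onto the labels used in the report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation.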

Unique

Unique: 87
Unique (%): 100.0%

Sample

1st row: 모라주공1단지
2nd row: 모라주공3단지
3rd row: 산업용품 유통상가
4th row: 동서대학교
5th row: 경남정보대학
Value               Count  Frequency (%)
아파트                 3      2.7%
모라주공1단지             1      0.9%
(not shown)         1      0.9%
현대무지개타운             1      0.9%
덕포경동메르빌             1      0.9%
학장벽산아파트             1      0.9%
학장동2차삼성아파트          1      0.9%
사상강변동원아파트           1      0.9%
엄궁동쌍용스윗닷홈           1      0.9%
감전엘에이치아파트           1      0.9%
Other values (100)  100    89.3%

Most occurring characters

Value               Count  Frequency (%)
(not shown)         31     4.7%
(not shown)         31     4.7%
(not shown)         30     4.5%
(space)             25     3.8%
(not shown)         19     2.9%
(not shown)         16     2.4%
(not shown)         14     2.1%
(not shown)         14     2.1%
(not shown)         13     2.0%
(not shown)         11     1.7%
Other values (179)  458    69.2%

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           613    92.6%
Space Separator        25     3.8%
Decimal Number         10     1.5%
Open Punctuation       5      0.8%
Close Punctuation      5      0.8%
Other Symbol           2      0.3%
Connector Punctuation  2      0.3%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not shown)         31     5.1%
(not shown)         31     5.1%
(not shown)         30     4.9%
(not shown)         19     3.1%
(not shown)         16     2.6%
(not shown)         14     2.3%
(not shown)         14     2.3%
(not shown)         13     2.1%
(not shown)         11     1.8%
(not shown)         11     1.8%
Other values (170)  423    69.0%

Decimal Number
Value  Count  Frequency (%)
2      5      50.0%
1      3      30.0%
4      1      10.0%
3      1      10.0%

Space Separator
Value    Count  Frequency (%)
(space)  25     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      5      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      5      100.0%

Other Symbol
Value        Count  Frequency (%)
(not shown)  2      100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  615    92.9%
Common  47     7.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not shown)         31     5.0%
(not shown)         31     5.0%
(not shown)         30     4.9%
(not shown)         19     3.1%
(not shown)         16     2.6%
(not shown)         14     2.3%
(not shown)         14     2.3%
(not shown)         13     2.1%
(not shown)         11     1.8%
(not shown)         11     1.8%
Other values (171)  425    69.1%

Common
Value    Count  Frequency (%)
(space)  25     53.2%
(        5      10.6%
)        5      10.6%
2        5      10.6%
1        3      6.4%
_        2      4.3%
4        1      2.1%
3        1      2.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  613    92.6%
ASCII   47     7.1%
None    2      0.3%

Most frequent character per block

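Hangul
Value               Count  Frequency (%)
(not shown)         31     5.1%
(not shown)         31     5.1%
(not shown)         30     4.9%
(not shown)         19     3.1%
(not shown)         16     2.6%
(not shown)         14     2.3%
(not shown)         14     2.3%
(not shown)         13     2.1%
(not shown)         11     1.8%
(not shown)         11     1.8%
Other values (170)  423    69.0%

ASCII
Value    Count  Frequency (%)
(space)  25     53.2%
(        5      10.6%
)        5      10.6%
2        5      10.6%
1        3      6.4%
_        2      4.3%
4        1      2.1%
3        1      2.1%

None
Value        Count  Frequency (%)
(not shown)  2      100.0%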

건축물 연면적 (building gross floor area)
Text

MISSING

Distinct: 50
Distinct (%): 100.0%
Missing: 37
Missing (%): 42.5%
Memory size: 828.0 B

Length

Max length: 7
Median length: 6
Mean length: 6.08
Min length: 6

Characters and Unicode

Total characters: 304
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 50
Unique (%): 100.0%

Sample

1st row: 183,862
2nd row: 170,967
3rd row: 173,716
4th row: 142,225
5th row: 69,349
Value              Count  Frequency (%)
32,514             1      2.0%
16,372             1      2.0%
16,174             1      2.0%
15,279             1      2.0%
14,330             1      2.0%
14,744             1      2.0%
11,723             1      2.0%
12,982             1      2.0%
14,154             1      2.0%
12,829             1      2.0%
Other values (40)  40     80.0%

Most occurring characters

Value  Count  Frequency (%)
1      54     17.8%
,      50     16.4%
0      30     9.9%
2      28     9.2%
4      28     9.2%
7      25     8.2%
5      24     7.9%
6      19     6.2%
3      17     5.6%
9      15     4.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     254    83.6%
Other Punctuation  50     16.4%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      54     21.3%
0      30     11.8%
2      28     11.0%
4      28     11.0%
7      25     9.8%
5      24     9.4%
6      19     7.5%
3      17     6.7%
9      15     5.9%
8      14     5.5%

Other Punctuation
Value  Count  Frequency (%)
,      50     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  304    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      54     17.8%
,      50     16.4%
0      30     9.9%
2      28     9.2%
4      28     9.2%
7      25     8.2%
5      24     7.9%
6      19     6.2%
3      17     5.6%
9      15     4.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  304    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      54     17.8%
,      50     16.4%
0      30     9.9%
2      28     9.2%
4      28     9.2%
7      25     8.2%
5      24     7.9%
6      19     6.2%
3      17     5.6%
9      15     4.9%
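
The character statistics suggest why 건축물 연면적 was profiled as Text rather than as a number: every value consists of digits plus a comma used as a thousands separator. A minimal sketch of a conversion step, applied to the sample values above (`parse_area` is a hypothetical helper, and the report does not state the unit of measurement):

```python
from typing import Optional


def parse_area(raw: Optional[str]) -> Optional[int]:
    """Convert a string such as '183,862' to an int; keep missing values as None."""
    if raw is None:
        return None
    return int(raw.replace(",", ""))


# Sample values from the report, plus one missing entry.
samples = ["183,862", "170,967", "173,716", "142,225", "69,349", None]
areas = [parse_area(s) for s in samples]
print(areas)
```

With the column converted before profiling, it would likely be treated as numeric and receive quantile and descriptive statistics like 연번 and 세대수.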

세대수 (number of households)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 36
Distinct (%): 97.3%
Missing: 50
Missing (%): 57.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1004.0811
Minimum: 511
Maximum: 2529
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 915.0 B

Quantile statistics

Minimum: 511
5-th percentile: 524.2
Q1: 731
Median: 893
Q3: 1080
95-th percentile: 2046.8
Maximum: 2529
Range: 2018
Interquartile range (IQR): 349

Descriptive statistics

Standard deviation: 482.92192
Coefficient of variation (CV): 0.48095908
Kurtosis: 3.3064874
Mean: 1004.0811
Median Absolute Deviation (MAD): 173
Skewness: 1.8582064
Sum: 37151
Variance: 233213.58
Monotonicity: Decreasing
Histogram with fixed size bins (bins=36)
Common values

Value              Count  Frequency (%)
780                2      2.3%
874                1      1.1%
831                1      1.1%
795                1      1.1%
763                1      1.1%
741                1      1.1%
731                1      1.1%
720                1      1.1%
630                1      1.1%
622                1      1.1%
Other values (26)  26     29.9%
(Missing)          50     57.5%

Minimum 10 values

Value  Count  Frequency (%)
511    1      1.1%
517    1      1.1%
526    1      1.1%
549    1      1.1%
600    1      1.1%
607    1      1.1%
622    1      1.1%
630    1      1.1%
720    1      1.1%
731    1      1.1%

Maximum 10 values

Value  Count  Frequency (%)
2529   1      1.1%
2386   1      1.1%
1962   1      1.1%
1852   1      1.1%
1622   1      1.1%
1206   1      1.1%
1158   1      1.1%
1119   1      1.1%
1093   1      1.1%
1080   1      1.1%

주소 (address)
Text

Distinct: 82
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

Length

Max length: 22
Median length: 20
Mean length: 17.057471
Min length: 13

Characters and Unicode

Total characters: 1484
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 77
Unique (%): 88.5%

Sample

1st row: 부산시 사상구 모라로110번길 121
2nd row: 부산시 사상구 모라로192번길 20-21
3rd row: 부산시 사상구 괘감로 37
4th row: 부산시 사상구 주례로 47
5th row: 부산시 사상구 주례로 45
Value               Count  Frequency (%)
부산시                 87     25.0%
사상구                 87     25.0%
백양대로                12     3.4%
대동로                 7      2.0%
엄궁로                 6      1.7%
주례로                 5      1.4%
광장로                 5      1.4%
학감대로                4      1.1%
학감대로49번길            3      0.9%
낙동대로                3      0.9%
Other values (110)  129    37.1%

Most occurring characters

ValueCountFrequency (%)
(space)            261    17.6%
(not shown)        94     6.3%
(not shown)        93     6.3%
(not shown)        88     5.9%
(not shown)        88     5.9%
(not shown)        87     5.9%
(not shown)        87     5.9%
(not shown)        87     5.9%
1                  64     4.3%
(not shown)        49     3.3%
Other values (39)  486    32.7%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      891    60.0%
Decimal Number    321    21.6%
Space Separator   261    17.6%
Dash Punctuation  11     0.7%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not shown)        94     10.5%
(not shown)        93     10.4%
(not shown)        88     9.9%
(not shown)        88     9.9%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        49     5.5%
(not shown)        31     3.5%
(not shown)        31     3.5%
Other values (27)  156    17.5%

Decimal Number
Value  Count  Frequency (%)
1      64     19.9%
2      46     14.3%
4      40     12.5%
0      36     11.2%
3      31     9.7%
6      27     8.4%
7      24     7.5%
9      22     6.9%
5      17     5.3%
8      14     4.4%

Space Separator
Value    Count  Frequency (%)
(space)  261    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      11     100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul  891    60.0%
Common  593    40.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not shown)        94     10.5%
(not shown)        93     10.4%
(not shown)        88     9.9%
(not shown)        88     9.9%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        49     5.5%
(not shown)        31     3.5%
(not shown)        31     3.5%
Other values (27)  156    17.5%

Common
Value             Count  Frequency (%)
(space)           261    44.0%
1                 64     10.8%
2                 46     7.8%
4                 40     6.7%
0                 36     6.1%
3                 31     5.2%
6                 27     4.6%
7                 24     4.0%
9                 22     3.7%
5                 17     2.9%
Other values (2)  25     4.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  891    60.0%
ASCII   593    40.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           261    44.0%
1                 64     10.8%
2                 46     7.8%
4                 40     6.7%
0                 36     6.1%
3                 31     5.2%
6                 27     4.6%
7                 24     4.0%
9                 22     3.7%
5                 17     2.9%
Other values (2)  25     4.2%

Hangul
Value              Count  Frequency (%)
(not shown)        94     10.5%
(not shown)        93     10.4%
(not shown)        88     9.9%
(not shown)        88     9.9%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        87     9.8%
(not shown)        49     5.5%
(not shown)        31     3.5%
(not shown)        31     3.5%
Other values (27)  156    17.5%

Interactions

[Four interaction plots; images not included]

Correlations

             연번     상호명    건축물 연면적  세대수    주소
연번           1.000  1.000  1.000    0.797  0.863
상호명          1.000  1.000  1.000    1.000  1.000
건축물 연면적      1.000  1.000  1.000    NaN    1.000
세대수          0.797  1.000  NaN      1.000  0.664
주소           0.863  1.000  1.000    0.664  1.000

        연번      세대수
연번      1.000   -1.000
세대수     -1.000  1.000
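
The report does not say which coefficient the second matrix uses. One reading, consistent with 연번 being strictly increasing and 세대수 being monotonically decreasing, is that it is rank-based. The sketch below is a plain-Python Spearman rank correlation for data without ties, applied to the five largest 세대수 values listed in the report. The `order` list is only the relative order in which they would appear under that monotonicity, not actual 연번 values, and `ranks` and `spearman` are hypothetical helper names.

```python
def ranks(xs):
    """Return 1-based ranks of xs (assumes no tied values)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0] * len(xs)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation via the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


order = [1, 2, 3, 4, 5]                     # relative order, not real 연번 values
households = [2529, 2386, 1962, 1852, 1622]  # five largest 세대수 values
print(spearman(order, households))
```

Any strictly decreasing sequence gives the same result, because a rank coefficient depends only on ordering and not on the size of the gaps between values.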

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

(index)  연번  상호명  건축물 연면적  세대수  주소
0  1  모라주공1단지  <NA>  2529  부산시 사상구 모라로110번길 121
1  2  모라주공3단지  <NA>  2386  부산시 사상구 모라로192번길 20-21
2  3  산업용품 유통상가  183,862  <NA>  부산시 사상구 괘감로 37
3  4  동서대학교  170,967  <NA>  부산시 사상구 주례로 47
4  5  경남정보대학  173,716  <NA>  부산시 사상구 주례로 45
5  6  신라대학교  142,225  <NA>  부산시 사상구 백양대로700번길 140
6  7  르네시떼  69,349  <NA>  부산시 사상구 광장로 7
7  8  홈플러스  40,050  <NA>  부산시 사상구 광장로 7
8  9  마트월드  71,773  <NA>  부산시 사상구 낙동대로 910
9  10  부산벤처타워  55,054  <NA>  부산시 사상구 모라로 22

Last rows

(index)  연번  상호명  건축물 연면적  세대수  주소
77  78  큰솔1병원  11,275  <NA>  부산시 사상구 대동로 141
78  79  대동레미안 스마트시티  10,977  <NA>  부산시 사상구 사상로243번길 13
79  80  우창빌딩  10,961  <NA>  부산시 사상구 새벽로 133
80  81  동궁초등학교  10,806  <NA>  부산시 사상구 엄궁로 70-7
81  82  부산테크노파크 엄궁단지  10,569  <NA>  부산시 사상구 엄궁로 70-16
82  83  큰솔2병원  10,241  <NA>  부산시 사상구 학장로 189
83  84  부산솔빛학교  10,169  <NA>  부산시 사상구 삼덕로5번길 171
84  85  동주중학교  10,054  <NA>  부산시 사상구 주례로 107
85  86  현대무지개타운(상가)  10,125  <NA>  부산시 사상구 주례로 101
86  87  우신모라아파트(상가)  12,738  <NA>  부산시 사상구 백양대로 884