Overview

Dataset statistics

Number of variables: 7
Number of observations: 211
Missing cells: 31
Missing cells (%): 2.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.1 KiB
Average record size in memory: 58.6 B
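
The percentage and per-record figures in this overview follow from the raw counts. A minimal sketch of that arithmetic, using only numbers reported here (the variable names are illustrative, and the record-size estimate starts from the rounded KiB figure, so it only approximates the reported 58.6 B):

```python
# Figures copied from the dataset statistics in this overview.
n_variables = 7
n_observations = 211
missing_cells = 31
total_size_kib = 12.1  # already rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per record, derived from the rounded total size.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. record size: {avg_record_bytes:.1f} B")
```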

Variable types

Numeric: 2
Categorical: 1
Text: 4

Dataset

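Description: Data on officetels in Daegu Metropolitan City. Fields include district (gu/gun) name, officetel name, location, lot number, gross floor area, number of units, and use-approval date.
Author: Daegu Metropolitan City (대구광역시)
URL: https://www.data.go.kr/data/15100225/fileData.do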

Alerts

연번 is highly overall correlated with 시도 (High correlation)
시도 is highly overall correlated with 연번 (High correlation)
사업명 has 31 (14.7%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-23 07:07:31.674809
Analysis finished: 2023-12-23 07:07:37.654659
Duration: 5.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 211
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106
Minimum: 1
Maximum: 211
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11.5
Q1: 53.5
Median: 106
Q3: 158.5
95-th percentile: 200.5
Maximum: 211
Range: 210
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 61.05462
Coefficient of variation (CV): 0.57598698
Kurtosis: -1.2
Mean: 106
Median Absolute Deviation (MAD): 53
Skewness: 0
Sum: 22366
Variance: 3727.6667
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
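
The figures reported for 연번 are consistent with a plain running index. As a sketch, assuming the column holds the integers 1 through 211 with no gaps (which the distinct count, minimum, maximum, sum and strict monotonicity all suggest), the statistics can be recomputed with the standard library:

```python
import statistics

# Assumption: 연번 is the sequence 1..211 (211 distinct values, min 1, max 211).
values = list(range(1, 212))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates between the observed minimum and maximum.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(sum(values), mean, round(variance, 4), round(std, 5), round(cv, 8))
print(q1, q2, q3, q3 - q1, mad)
```

Under that assumption the recomputed sum, mean, variance, quartiles, IQR and MAD line up with the values listed above.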
Value | Count | Frequency (%)
1 | 1 | 0.5%
134 | 1 | 0.5%
136 | 1 | 0.5%
137 | 1 | 0.5%
138 | 1 | 0.5%
139 | 1 | 0.5%
140 | 1 | 0.5%
141 | 1 | 0.5%
142 | 1 | 0.5%
143 | 1 | 0.5%
Other values (201) | 201 | 95.3%

Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%

Value | Count | Frequency (%)
211 | 1 | 0.5%
210 | 1 | 0.5%
209 | 1 | 0.5%
208 | 1 | 0.5%
207 | 1 | 0.5%
206 | 1 | 0.5%
205 | 1 | 0.5%
204 | 1 | 0.5%
203 | 1 | 0.5%
202 | 1 | 0.5%

시도
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
달서구: 46
수성구: 40
중구: 37
동구: 32
북구: 24
Other values (3): 32

Length

Max length: 3
Median length: 2
Mean length: 2.492891
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value | Count | Frequency (%)
달서구 | 46 | 21.8%
수성구 | 40 | 19.0%
중구 | 37 | 17.5%
동구 | 32 | 15.2%
북구 | 24 | 11.4%
달성군 | 18 | 8.5%
남구 | 12 | 5.7%
서구 | 2 | 0.9%
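
The frequency column is each district's count divided by the number of observations. A short sketch that recomputes it from the counts listed above (the dictionary is simply those counts typed in by hand):

```python
# Counts per 시도 value, copied from the common-values table.
counts = {
    "달서구": 46, "수성구": 40, "중구": 37, "동구": 32,
    "북구": 24, "달성군": 18, "남구": 12, "서구": 2,
}

total = sum(counts.values())

# Share of all observations, rounded to one decimal as in the report.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# The three least frequent districts make up the "Other values (3)" bucket.
other_three = sum(sorted(counts.values())[:3])

for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share}%)")
print("total:", total, "other (3):", other_three)
```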

Length

Histogram of lengths of the category


사업명
Text

MISSING 

Distinct: 151
Distinct (%): 83.9%
Missing: 31
Missing (%): 14.7%
Memory size: 1.8 KiB

Length

Max length: 28
Median length: 23
Mean length: 9.8277778
Min length: 3

Characters and Unicode

Total characters: 1769
Distinct characters: 239
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
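
As an illustration of how such category counts can be obtained, the sketch below tallies the Unicode general category of every character using Python's `unicodedata` module. The helper `category_counts` and the label mapping are illustrative and are not taken from the profiling tool itself; the sample strings are the five 사업명 sample rows shown in this report.

```python
import unicodedata
from collections import Counter

# Readable labels for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Nl": "Letter Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation", "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["클래식명가", "진석타워", "움비어스오피스텔", "센트로펠리스", "대봉화성파크드림"]
print(category_counts(sample))
print(category_counts(["139-12"]))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the text variables in this dataset.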

Unique

Unique: 143
Unique (%): 79.4%
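
The percentages for 사업명 use two different denominators: the missing share is taken over all rows, while the distinct and unique shares (and the mean length) match a denominator of non-missing rows only. The sketch below checks this reading against the reported figures; it is an inference from the numbers, not a statement of how the tool is implemented.

```python
# Figures reported for 사업명 in this section.
n_rows = 211
n_missing = 31
n_distinct = 151
n_unique = 143
total_chars = 1769

n_present = n_rows - n_missing            # rows that actually hold a value

missing_pct = 100 * n_missing / n_rows     # over all rows
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_chars / n_present

print(n_present, round(missing_pct, 1), round(distinct_pct, 1),
      round(unique_pct, 1), round(mean_length, 7))
```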

Sample

1st row: 클래식명가
2nd row: 진석타워
3rd row: 움비어스오피스텔
4th row: 센트로펠리스
5th row: 대봉화성파크드림
Value | Count | Frequency (%)
주거복합 | 45 | 12.9%
오피스텔 | 21 | 6.0%
감삼동 | 10 | 2.9%
호산동 | 9 | 2.6%
힐스테이트 | 9 | 2.6%
센트럴 | 6 | 1.7%
대구역 | 5 | 1.4%
본동 | 5 | 1.4%
범어 | 5 | 1.4%
 | 5 | 1.4%
Other values (196) | 229 | 65.6%

Most occurring characters

ValueCountFrequency (%)
172
 
9.7%
99
 
5.6%
74
 
4.2%
56
 
3.2%
55
 
3.1%
55
 
3.1%
54
 
3.1%
51
 
2.9%
42
 
2.4%
40
 
2.3%
Other values (229) 1071
60.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1549 | 87.6%
Space Separator | 172 | 9.7%
Decimal Number | 24 | 1.4%
Uppercase Letter | 12 | 0.7%
Close Punctuation | 3 | 0.2%
Open Punctuation | 3 | 0.2%
Lowercase Letter | 3 | 0.2%
Dash Punctuation | 2 | 0.1%
Letter Number | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
99
 
6.4%
74
 
4.8%
56
 
3.6%
55
 
3.6%
55
 
3.6%
54
 
3.5%
51
 
3.3%
42
 
2.7%
40
 
2.6%
40
 
2.6%
Other values (206) 983
63.5%
Uppercase Letter
ValueCountFrequency (%)
B 2
16.7%
M 2
16.7%
K 2
16.7%
C 1
8.3%
T 1
8.3%
H 1
8.3%
D 1
8.3%
E 1
8.3%
W 1
8.3%
Decimal Number
ValueCountFrequency (%)
2 8
33.3%
1 7
29.2%
7 3
 
12.5%
5 3
 
12.5%
3 2
 
8.3%
8 1
 
4.2%
Lowercase Letter
ValueCountFrequency (%)
e 1
33.3%
d 1
33.3%
s 1
33.3%
Space Separator
ValueCountFrequency (%)
172
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1549 | 87.6%
Common | 204 | 11.5%
Latin | 16 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
99
 
6.4%
74
 
4.8%
56
 
3.6%
55
 
3.6%
55
 
3.6%
54
 
3.5%
51
 
3.3%
42
 
2.7%
40
 
2.6%
40
 
2.6%
Other values (206) 983
63.5%
Latin
ValueCountFrequency (%)
B 2
12.5%
M 2
12.5%
K 2
12.5%
C 1
 
6.2%
T 1
 
6.2%
H 1
 
6.2%
e 1
 
6.2%
d 1
 
6.2%
D 1
 
6.2%
1
 
6.2%
Other values (3) 3
18.8%
Common
ValueCountFrequency (%)
172
84.3%
2 8
 
3.9%
1 7
 
3.4%
) 3
 
1.5%
( 3
 
1.5%
7 3
 
1.5%
5 3
 
1.5%
3 2
 
1.0%
- 2
 
1.0%
8 1
 
0.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1549 | 87.6%
ASCII | 219 | 12.4%
Number Forms | 1 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
172
78.5%
2 8
 
3.7%
1 7
 
3.2%
) 3
 
1.4%
( 3
 
1.4%
7 3
 
1.4%
5 3
 
1.4%
B 2
 
0.9%
M 2
 
0.9%
K 2
 
0.9%
Other values (12) 14
 
6.4%
Hangul
ValueCountFrequency (%)
99
 
6.4%
74
 
4.8%
56
 
3.6%
55
 
3.6%
55
 
3.6%
54
 
3.5%
51
 
3.3%
42
 
2.7%
40
 
2.6%
40
 
2.6%
Other values (206) 983
63.5%
Number Forms
ValueCountFrequency (%)
1
100.0%

대지위치
Text

Distinct: 69
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.5687204
Min length: 2

Characters and Unicode

Total characters: 753
Distinct characters: 77
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 15.6%

Sample

1st row: 삼덕동2가
2nd row: 삼덕동2가
3rd row: 삼덕동3가
4th row: 대봉동
5th row: 대봉동
Value | Count | Frequency (%)
범어동 | 20 | 8.7%
신천동 | 15 | 6.6%
구지면 | 13 | 5.7%
내리 | 13 | 5.7%
감삼동 | 12 | 5.2%
칠성동2가 | 11 | 4.8%
호산동 | 9 | 3.9%
신암동 | 6 | 2.6%
남산동 | 6 | 2.6%
침산동 | 6 | 2.6%
Other values (63) | 118 | 51.5%

Most occurring characters

ValueCountFrequency (%)
192
25.5%
34
 
4.5%
32
 
4.2%
27
 
3.6%
22
 
2.9%
20
 
2.7%
20
 
2.7%
19
 
2.5%
2 19
 
2.5%
18
 
2.4%
Other values (67) 350
46.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 696 | 92.4%
Decimal Number | 39 | 5.2%
Space Separator | 18 | 2.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
192
27.6%
34
 
4.9%
32
 
4.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
20
 
2.9%
19
 
2.7%
18
 
2.6%
16
 
2.3%
Other values (61) 296
42.5%
Decimal Number
ValueCountFrequency (%)
2 19
48.7%
1 12
30.8%
3 5
 
12.8%
9 2
 
5.1%
7 1
 
2.6%
Space Separator
ValueCountFrequency (%)
18
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 696 | 92.4%
Common | 57 | 7.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
192
27.6%
34
 
4.9%
32
 
4.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
20
 
2.9%
19
 
2.7%
18
 
2.6%
16
 
2.3%
Other values (61) 296
42.5%
Common
ValueCountFrequency (%)
2 19
33.3%
18
31.6%
1 12
21.1%
3 5
 
8.8%
9 2
 
3.5%
7 1
 
1.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 696 | 92.4%
ASCII | 57 | 7.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
192
27.6%
34
 
4.9%
32
 
4.6%
27
 
3.9%
22
 
3.2%
20
 
2.9%
20
 
2.9%
19
 
2.7%
18
 
2.6%
16
 
2.3%
Other values (61) 296
42.5%
ASCII
ValueCountFrequency (%)
2 19
33.3%
18
31.6%
1 12
21.1%
3 5
 
8.8%
9 2
 
3.5%
7 1
 
1.8%

지번
Text

Distinct: 210
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 18
Median length: 13
Mean length: 7.1706161
Min length: 1

Characters and Unicode

Total characters: 1513
Distinct characters: 23
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 209
Unique (%): 99.1%

Sample

1st row: 139-12
2nd row: 210-1
3rd row: 121
4th row: 60-10
5th row: 152-24
Value | Count | Frequency (%)
 | 38 | 11.1%
일원 | 31 | 9.1%
1필 | 6 | 1.8%
2필지 | 5 | 1.5%
2필 | 4 | 1.2%
1필지 | 4 | 1.2%
4필지 | 3 | 0.9%
8필지 | 3 | 0.9%
대명동 | 3 | 0.9%
177-3 | 2 | 0.6%
Other values (235) | 242 | 71.0%

Most occurring characters

ValueCountFrequency (%)
1 208
13.7%
- 163
10.8%
2 135
 
8.9%
132
 
8.7%
5 94
 
6.2%
4 84
 
5.6%
77
 
5.1%
0 77
 
5.1%
3 75
 
5.0%
8 71
 
4.7%
Other values (13) 397
26.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 909 | 60.1%
Other Letter | 308 | 20.4%
Dash Punctuation | 163 | 10.8%
Space Separator | 132 | 8.7%
Math Symbol | 1 | 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 208
22.9%
2 135
14.9%
5 94
10.3%
4 84
9.2%
0 77
 
8.5%
3 75
 
8.3%
8 71
 
7.8%
7 71
 
7.8%
6 54
 
5.9%
9 40
 
4.4%
Other Letter
ValueCountFrequency (%)
77
25.0%
64
20.8%
55
17.9%
38
12.3%
32
10.4%
32
10.4%
3
 
1.0%
3
 
1.0%
3
 
1.0%
1
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 163
100.0%
Space Separator
ValueCountFrequency (%)
132
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1205 | 79.6%
Hangul | 308 | 20.4%

Most frequent character per script

Common
ValueCountFrequency (%)
1 208
17.3%
- 163
13.5%
2 135
11.2%
132
11.0%
5 94
7.8%
4 84
7.0%
0 77
 
6.4%
3 75
 
6.2%
8 71
 
5.9%
7 71
 
5.9%
Other values (3) 95
7.9%
Hangul
ValueCountFrequency (%)
77
25.0%
64
20.8%
55
17.9%
38
12.3%
32
10.4%
32
10.4%
3
 
1.0%
3
 
1.0%
3
 
1.0%
1
 
0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1205 | 79.6%
Hangul | 308 | 20.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 208
17.3%
- 163
13.5%
2 135
11.2%
132
11.0%
5 94
7.8%
4 84
7.0%
0 77
 
6.4%
3 75
 
6.2%
8 71
 
5.9%
7 71
 
5.9%
Other values (3) 95
7.9%
Hangul
ValueCountFrequency (%)
77
25.0%
64
20.8%
55
17.9%
38
12.3%
32
10.4%
32
10.4%
3
 
1.0%
3
 
1.0%
3
 
1.0%
1
 
0.3%

연면적
Text

Distinct: 207
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 17
Median length: 15
Mean length: 7.5971564
Min length: 4

Characters and Unicode

Total characters: 1603
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 204
Unique (%): 96.7%

Sample

1st row: 3297.6
2nd row: 46167.34
3rd row: 2890.08
4th row: 16336.29
5th row: 1669.27
Value | Count | Frequency (%)
917.5 | 3 | 1.4%
695.86 | 2 | 0.9%
861.08 | 2 | 0.9%
1215.76 | 1 | 0.5%
31720.53 | 1 | 0.5%
11703.37(오피스텔만 | 1 | 0.5%
26747.01 | 1 | 0.5%
3297.6 | 1 | 0.5%
89243.07 | 1 | 0.5%
94439.19 | 1 | 0.5%
Other values (197) | 197 | 93.4%
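
연면적 is profiled as Text rather than as a number, and the token `11703.37(오피스텔만` shows why: at least one value carries a parenthetical Korean annotation after the figure. One possible way to recover a numeric column is to keep only the leading number of each string. The helper below is a hypothetical sketch of that approach, not part of the report; it deliberately discards the annotation, which may or may not be acceptable for a given analysis.

```python
import re

# Leading unsigned number with an optional decimal part.
_LEADING_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)?)")

def parse_area(raw):
    """Return the leading number of a 연면적 string as a float, or None."""
    match = _LEADING_NUMBER.match(raw)
    return float(match.group(1)) if match else None

samples = ["3297.6", "46167.34", "11703.37(오피스텔만"]
print([parse_area(s) for s in samples])
```

Values that do not start with a digit come back as `None`, so they can be reviewed by hand instead of being coerced silently.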

Most occurring characters

ValueCountFrequency (%)
. 205
12.8%
1 181
11.3%
2 154
9.6%
5 154
9.6%
9 147
9.2%
3 145
9.0%
6 138
8.6%
7 126
7.9%
4 119
7.4%
8 114
7.1%
Other values (8) 120
7.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1384 | 86.3%
Other Punctuation | 205 | 12.8%
Other Letter | 10 | 0.6%
Open Punctuation | 2 | 0.1%
Close Punctuation | 2 | 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 181
13.1%
2 154
11.1%
5 154
11.1%
9 147
10.6%
3 145
10.5%
6 138
10.0%
7 126
9.1%
4 119
8.6%
8 114
8.2%
0 106
7.7%
Other Letter
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%
Other Punctuation
ValueCountFrequency (%)
. 205
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1593 | 99.4%
Hangul | 10 | 0.6%

Most frequent character per script

Common
ValueCountFrequency (%)
. 205
12.9%
1 181
11.4%
2 154
9.7%
5 154
9.7%
9 147
9.2%
3 145
9.1%
6 138
8.7%
7 126
7.9%
4 119
7.5%
8 114
7.2%
Other values (3) 110
6.9%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1593 | 99.4%
Hangul | 10 | 0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 205
12.9%
1 181
11.4%
2 154
9.7%
5 154
9.7%
9 147
9.2%
3 145
9.1%
6 138
8.7%
7 126
7.9%
4 119
7.5%
8 114
7.2%
Other values (3) 110
6.9%
Hangul
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%

오피스텔호실수
Real number (ℝ)

Distinct: 123
Distinct (%): 58.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 146.18957
Minimum: 3
Maximum: 1046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 21
Q1: 41
Median: 82
Q3: 164.5
95-th percentile: 497
Maximum: 1046
Range: 1043
Interquartile range (IQR): 123.5

Descriptive statistics

Standard deviation: 168.6756
Coefficient of variation (CV): 1.1538142
Kurtosis: 6.9863083
Mean: 146.18957
Median Absolute Deviation (MAD): 52
Skewness: 2.4372258
Sum: 30846
Variance: 28451.459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
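
Several of the 오피스텔호실수 statistics are derived from others in the same list, which gives a quick consistency check. The sketch below recomputes the mean, range, IQR, coefficient of variation and variance from the reported sum, extremes, quartiles and standard deviation; all inputs are copied from this section and the variable names are illustrative.

```python
# Reported figures for 오피스텔호실수.
n = 211
total = 30846
minimum, maximum = 3, 1046
q1, q3 = 41, 164.5
std = 168.6756

mean = total / n                 # should match the reported mean
value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as IQR
cv = std / mean                  # coefficient of variation
variance = std ** 2              # square of the standard deviation

print(round(mean, 5), value_range, iqr, round(cv, 5), round(variance, 2))
```

A mean (about 146) well above the median (82), together with a CV above 1, reflects the strong right skew reported for this variable.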
Value | Count | Frequency (%)
21 | 12 | 5.7%
30 | 10 | 4.7%
72 | 7 | 3.3%
63 | 5 | 2.4%
36 | 5 | 2.4%
52 | 5 | 2.4%
48 | 4 | 1.9%
46 | 3 | 1.4%
83 | 3 | 1.4%
60 | 3 | 1.4%
Other values (113) | 154 | 73.0%

Value | Count | Frequency (%)
3 | 1 | 0.5%
15 | 1 | 0.5%
20 | 2 | 0.9%
21 | 12 | 5.7%
22 | 3 | 1.4%
24 | 2 | 0.9%
25 | 2 | 0.9%
26 | 1 | 0.5%
27 | 2 | 0.9%
28 | 2 | 0.9%

Value | Count | Frequency (%)
1046 | 1 | 0.5%
928 | 1 | 0.5%
730 | 1 | 0.5%
713 | 1 | 0.5%
686 | 1 | 0.5%
672 | 1 | 0.5%
614 | 1 | 0.5%
596 | 1 | 0.5%
528 | 1 | 0.5%
510 | 1 | 0.5%

Interactions


Correlations

 | 연번 | 시도 | 대지위치 | 오피스텔호실수
연번 | 1.000 | 0.943 | 0.972 | 0.312
시도 | 0.943 | 1.000 | 1.000 | 0.173
대지위치 | 0.972 | 1.000 | 1.000 | 0.726
오피스텔호실수 | 0.312 | 0.173 | 0.726 | 1.000

 | 연번 | 오피스텔호실수 | 시도
연번 | 1.000 | -0.320 | 0.826
오피스텔호실수 | -0.320 | 1.000 | 0.084
시도 | 0.826 | 0.084 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

연번시도사업명대지위치지번연면적오피스텔호실수
01중구클래식명가삼덕동2가139-123297.658
12중구진석타워삼덕동2가210-146167.34138
23중구움비어스오피스텔삼덕동3가1212890.0855
34중구센트로펠리스대봉동60-1016336.29144
45중구대봉화성파크드림대봉동152-241669.2725
56중구세명오피스넬봉산동136-92300.8258
67중구동승오피스텔봉산동168-11 외 6필951.6925
78중구인터불고코아시스남산동437-113976.08118
89중구화성파크드림시티동인동2가51 외 9필지56487.24928
910중구노마즈하우스교동816894.76261
연번시도사업명대지위치지번연면적오피스텔호실수
201202달성군<NA>구지면 내리844-15917.521
202203달성군<NA>구지면 내리844-24917.521
203204달성군<NA>구지면 내리844-22960.4921
204205달성군<NA>구지면 내리844-20913.0321
205206달성군<NA>구지면 내리844-26917.521
206207달성군<NA>유가읍 봉리606-229816.57361
207208달성군<NA>구지면 내리842-551219.2830
208209달성군<NA>구지면 내리842-561215.7630
209210달성군<NA>다사읍 죽곡리1252925.6648
210211달성군<NA>구지면 내리8441256.5538