Overview

Dataset statistics

Number of variables: 5
Number of observations: 27
Missing cells: 11
Missing cells (%): 8.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 45.9 B
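
The missing-cell percentage follows from the counts above: 5 variables times 27 observations gives 135 cells, of which 11 are missing. A minimal sketch of that arithmetic (the variable names are illustrative and not taken from the profiling tool):

```python
# Counts as reported in the dataset statistics.
n_variables = 5
n_observations = 27
missing_cells = 11

total_cells = n_variables * n_observations        # 135 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{missing_pct:.1f}%")  # 8.1%
```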

Variable types

Text: 3
Numeric: 1
DateTime: 1

Dataset

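Description: 대구광역시 동구_오피스텔 현황_20220902 (Dong-gu, Daegu Metropolitan City: officetel status, 20220902)
Author: 대구광역시 동구 (Dong-gu, Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15076827&dataSetDetailId=150768271839c4ef2f8cd&provdMethod=FILE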

Alerts

도로명주소 has 2 (7.4%) missing values (Missing)
준공년월 has 9 (33.3%) missing values (Missing)
건물명 has unique values (Unique)
지번주소 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 20:02:55.534531
Analysis finished: 2023-12-10 20:02:56.482029
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
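
A report of this kind can be regenerated with the `ProfileReport` class from ydata-profiling. The sketch below is only an assumption about how that might look: the CSV file name, the `cp949` encoding, and the report title are hypothetical and are not taken from the original run.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is CP949-encoded; adjust if it is UTF-8.
    df = pd.read_csv(csv_path, encoding="cp949", parse_dates=["준공년월"])

    profile = ProfileReport(df, title="Dong-gu officetel profile")  # hypothetical title
    profile.to_file(html_path)


# Example call with hypothetical file names:
# build_report("officetel_20220902.csv", "report.html")
```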

Variables

건물명 (building name)
Text

UNIQUE 

Distinct: 27
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 21
Median length: 16
Mean length: 9.7037037
Min length: 4
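
As a rough illustration of how these length figures arise, the sketch below measures the five sample values listed later in this section and then reproduces the mean from the reported total of 262 characters over 27 rows. It assumes that length is counted in Unicode code points after NFC normalization; the report does not state this explicitly.

```python
import unicodedata
from statistics import mean, median

# The five sample values shown in this section (not the full column).
samples = ["서한코보스카운티", "유성푸르나임", "신천 까사밀라", "정성라임오피스텔", "더블유스퀘어"]

# len() counts code points, spaces included; NFC keeps each Hangul syllable as one code point.
lengths = [len(unicodedata.normalize("NFC", s)) for s in samples]
print(lengths)  # [8, 6, 7, 8, 6]
print(min(lengths), max(lengths), median(lengths), mean(lengths))

# Over the full column: mean length = total characters / non-missing rows.
mean_length = 262 / 27
print(round(mean_length, 7))  # 9.7037037
```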

Characters and Unicode

Total characters: 262
Distinct characters: 103
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
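
To make the idea of per-character properties concrete, here is a small sketch that tallies Unicode general categories with Python's standard `unicodedata` module. The three names are taken from this report's sample rows; the standard library exposes general categories only, so the script and block breakdowns would need another source.

```python
import unicodedata
from collections import Counter

# Three building names that appear in this report's sample rows.
names = ["ds빌딩", "각산역더클래스Ⅱ", "이안센트럴D오피스텔"]

# unicodedata.category returns the two-letter general category of a code point,
# e.g. "Lo" (Other Letter), "Ll" (Lowercase Letter), "Lu" (Uppercase Letter),
# "Nl" (Letter Number).
categories = Counter(
    unicodedata.category(ch)
    for name in names
    for ch in unicodedata.normalize("NFC", name)
)

for category, count in categories.most_common():
    print(category, count)
```

Here the Hangul syllables fall under Other Letter, `d` and `s` under Lowercase Letter, `D` under Uppercase Letter, and the Roman numeral `Ⅱ` under Letter Number, which matches the five categories listed for this variable.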

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: 서한코보스카운티
2nd row: 유성푸르나임
3rd row: 신천 까사밀라
4th row: 정성라임오피스텔
5th row: 더블유스퀘어
Value | Count | Frequency (%)
오피스텔 | 6 | 13.0%
동대구역 | 3 | 6.5%
태왕아너스 | 2 | 4.3%
서원프레쉬빌 | 1 | 2.2%
동광오피스텔 | 1 | 2.2%
대구신서 | 1 | 2.2%
혁신도시 | 1 | 2.2%
하우스디어반 | 1 | 2.2%
밀레니엄오피스텔 | 1 | 2.2%
아펠리체 | 1 | 2.2%
Other values (28) | 28 | 60.9%

Most occurring characters

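Value | Count | Frequency (%)
[not preserved] | 23 | 8.8%
(space) | 19 | 7.3%
[not preserved] | 13 | 5.0%
[not preserved] | 13 | 5.0%
[not preserved] | 12 | 4.6%
[not preserved] | 7 | 2.7%
[not preserved] | 7 | 2.7%
[not preserved] | 6 | 2.3%
[not preserved] | 5 | 1.9%
[not preserved] | 5 | 1.9%
Other values (93) | 152 | 58.0%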

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 239 | 91.2%
Space Separator | 19 | 7.3%
Lowercase Letter | 2 | 0.8%
Letter Number | 1 | 0.4%
Uppercase Letter | 1 | 0.4%

Most frequent character per category

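Other Letter
Value | Count | Frequency (%)
[not preserved] | 23 | 9.6%
[not preserved] | 13 | 5.4%
[not preserved] | 13 | 5.4%
[not preserved] | 12 | 5.0%
[not preserved] | 7 | 2.9%
[not preserved] | 7 | 2.9%
[not preserved] | 6 | 2.5%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
Other values (88) | 143 | 59.8%
Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 50.0%
d | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 19 | 100.0%
Letter Number
Value | Count | Frequency (%)
Ⅱ | 1 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
D | 1 | 100.0%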

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 239 | 91.2%
Common | 19 | 7.3%
Latin | 4 | 1.5%

Most frequent character per script

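Hangul
Value | Count | Frequency (%)
[not preserved] | 23 | 9.6%
[not preserved] | 13 | 5.4%
[not preserved] | 13 | 5.4%
[not preserved] | 12 | 5.0%
[not preserved] | 7 | 2.9%
[not preserved] | 7 | 2.9%
[not preserved] | 6 | 2.5%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
Other values (88) | 143 | 59.8%
Latin
Value | Count | Frequency (%)
Ⅱ | 1 | 25.0%
D | 1 | 25.0%
s | 1 | 25.0%
d | 1 | 25.0%
Common
Value | Count | Frequency (%)
(space) | 19 | 100.0%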

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 239 | 91.2%
ASCII | 22 | 8.4%
Number Forms | 1 | 0.4%

Most frequent character per block

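Hangul
Value | Count | Frequency (%)
[not preserved] | 23 | 9.6%
[not preserved] | 13 | 5.4%
[not preserved] | 13 | 5.4%
[not preserved] | 12 | 5.0%
[not preserved] | 7 | 2.9%
[not preserved] | 7 | 2.9%
[not preserved] | 6 | 2.5%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
[not preserved] | 5 | 2.1%
Other values (88) | 143 | 59.8%
ASCII
Value | Count | Frequency (%)
(space) | 19 | 86.4%
D | 1 | 4.5%
s | 1 | 4.5%
d | 1 | 4.5%
Number Forms
Value | Count | Frequency (%)
Ⅱ | 1 | 100.0%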

도로명주소 (road-name address)
Text

MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 2
Missing (%): 7.4%
Memory size: 348.0 B

Length

Max length: 20
Median length: 19
Mean length: 17.08
Min length: 14

Characters and Unicode

Total characters: 427
Distinct characters: 37
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 대구광역시 동구 동부로22길 2
2nd row: 대구광역시 동구 동부로22길 48
3rd row: 대구광역시 동구 동부로 33
4th row: 대구광역시 동구 신암남로 105
5th row: 대구광역시 동구 신암남로 103
Value | Count | Frequency (%)
대구광역시 | 25 | 25.0%
동구 | 25 | 25.0%
동부로22길 | 4 | 4.0%
동부로 | 4 | 4.0%
동부로26길 | 2 | 2.0%
33 | 2 | 2.0%
신암남로 | 2 | 2.0%
첨복로 | 2 | 2.0%
동대구로 | 2 | 2.0%
동촌로 | 2 | 2.0%
Other values (29) | 30 | 30.0%

Most occurring characters

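Value | Count | Frequency (%)
(space) | 75 | 17.6%
[not preserved] | 52 | 12.2%
[not preserved] | 41 | 9.6%
[not preserved] | 27 | 6.3%
[not preserved] | 25 | 5.9%
[not preserved] | 25 | 5.9%
[not preserved] | 25 | 5.9%
[not preserved] | 25 | 5.9%
2 | 16 | 3.7%
1 | 14 | 3.3%
Other values (27) | 102 | 23.9%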

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 270 | 63.2%
Decimal Number | 80 | 18.7%
Space Separator | 75 | 17.6%
Dash Punctuation | 2 | 0.5%

Most frequent character per category

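Other Letter
Value | Count | Frequency (%)
[not preserved] | 52 | 19.3%
[not preserved] | 41 | 15.2%
[not preserved] | 27 | 10.0%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 12 | 4.4%
[not preserved] | 9 | 3.3%
[not preserved] | 3 | 1.1%
Other values (15) | 26 | 9.6%
Decimal Number
Value | Count | Frequency (%)
2 | 16 | 20.0%
1 | 14 | 17.5%
3 | 14 | 17.5%
0 | 10 | 12.5%
5 | 8 | 10.0%
6 | 5 | 6.2%
7 | 4 | 5.0%
9 | 4 | 5.0%
4 | 3 | 3.8%
8 | 2 | 2.5%
Space Separator
Value | Count | Frequency (%)
(space) | 75 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%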

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 270 | 63.2%
Common | 157 | 36.8%

Most frequent character per script

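Hangul
Value | Count | Frequency (%)
[not preserved] | 52 | 19.3%
[not preserved] | 41 | 15.2%
[not preserved] | 27 | 10.0%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 12 | 4.4%
[not preserved] | 9 | 3.3%
[not preserved] | 3 | 1.1%
Other values (15) | 26 | 9.6%
Common
Value | Count | Frequency (%)
(space) | 75 | 47.8%
2 | 16 | 10.2%
1 | 14 | 8.9%
3 | 14 | 8.9%
0 | 10 | 6.4%
5 | 8 | 5.1%
6 | 5 | 3.2%
7 | 4 | 2.5%
9 | 4 | 2.5%
4 | 3 | 1.9%
Other values (2) | 4 | 2.5%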

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 270 | 63.2%
ASCII | 157 | 36.8%

Most frequent character per block

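ASCII
Value | Count | Frequency (%)
(space) | 75 | 47.8%
2 | 16 | 10.2%
1 | 14 | 8.9%
3 | 14 | 8.9%
0 | 10 | 6.4%
5 | 8 | 5.1%
6 | 5 | 3.2%
7 | 4 | 2.5%
9 | 4 | 2.5%
4 | 3 | 1.9%
Other values (2) | 4 | 2.5%
Hangul
Value | Count | Frequency (%)
[not preserved] | 52 | 19.3%
[not preserved] | 41 | 15.2%
[not preserved] | 27 | 10.0%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 25 | 9.3%
[not preserved] | 12 | 4.4%
[not preserved] | 9 | 3.3%
[not preserved] | 3 | 1.1%
Other values (15) | 26 | 9.6%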

지번주소 (lot-number address)
Text

UNIQUE 

Distinct: 27
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 348.0 B

Length

Max length: 20
Median length: 18
Mean length: 18
Min length: 16

Characters and Unicode

Total characters: 486
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 100.0%

Sample

1st row: 대구광역시 동구 신천동 285-1
2nd row: 대구광역시 동구 신천동 292-6
3rd row: 대구광역시 동구 신천동 538-15
4th row: 대구광역시 동구 신암동 259-7
5th row: 대구광역시 동구 신암동 259-6
Value | Count | Frequency (%)
대구광역시 | 27 | 25.0%
동구 | 27 | 25.0%
신천동 | 13 | 12.0%
신암동 | 4 | 3.7%
신서동 | 4 | 3.7%
방촌동 | 2 | 1.9%
70-1 | 1 | 0.9%
235-1 | 1 | 0.9%
90-1 | 1 | 0.9%
괴전동 | 1 | 0.9%
Other values (27) | 27 | 25.0%

Most occurring characters

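Value | Count | Frequency (%)
(space) | 81 | 16.7%
[not preserved] | 55 | 11.3%
[not preserved] | 54 | 11.1%
[not preserved] | 27 | 5.6%
[not preserved] | 27 | 5.6%
[not preserved] | 27 | 5.6%
[not preserved] | 27 | 5.6%
1 | 27 | 5.6%
- | 24 | 4.9%
[not preserved] | 21 | 4.3%
Other values (21) | 116 | 23.9%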

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 270 | 55.6%
Decimal Number | 111 | 22.8%
Space Separator | 81 | 16.7%
Dash Punctuation | 24 | 4.9%

Most frequent character per category

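Other Letter
Value | Count | Frequency (%)
[not preserved] | 55 | 20.4%
[not preserved] | 54 | 20.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 21 | 7.8%
[not preserved] | 13 | 4.8%
[not preserved] | 4 | 1.5%
[not preserved] | 4 | 1.5%
Other values (9) | 11 | 4.1%
Decimal Number
Value | Count | Frequency (%)
1 | 27 | 24.3%
2 | 17 | 15.3%
5 | 14 | 12.6%
3 | 13 | 11.7%
8 | 10 | 9.0%
9 | 9 | 8.1%
7 | 7 | 6.3%
4 | 5 | 4.5%
6 | 5 | 4.5%
0 | 4 | 3.6%
Space Separator
Value | Count | Frequency (%)
(space) | 81 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 24 | 100.0%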

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 270 | 55.6%
Common | 216 | 44.4%

Most frequent character per script

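Hangul
Value | Count | Frequency (%)
[not preserved] | 55 | 20.4%
[not preserved] | 54 | 20.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 21 | 7.8%
[not preserved] | 13 | 4.8%
[not preserved] | 4 | 1.5%
[not preserved] | 4 | 1.5%
Other values (9) | 11 | 4.1%
Common
Value | Count | Frequency (%)
(space) | 81 | 37.5%
1 | 27 | 12.5%
- | 24 | 11.1%
2 | 17 | 7.9%
5 | 14 | 6.5%
3 | 13 | 6.0%
8 | 10 | 4.6%
9 | 9 | 4.2%
7 | 7 | 3.2%
4 | 5 | 2.3%
Other values (2) | 9 | 4.2%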

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 270 | 55.6%
ASCII | 216 | 44.4%

Most frequent character per block

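ASCII
Value | Count | Frequency (%)
(space) | 81 | 37.5%
1 | 27 | 12.5%
- | 24 | 11.1%
2 | 17 | 7.9%
5 | 14 | 6.5%
3 | 13 | 6.0%
8 | 10 | 4.6%
9 | 9 | 4.2%
7 | 7 | 3.2%
4 | 5 | 2.3%
Other values (2) | 9 | 4.2%
Hangul
Value | Count | Frequency (%)
[not preserved] | 55 | 20.4%
[not preserved] | 54 | 20.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 27 | 10.0%
[not preserved] | 21 | 7.8%
[not preserved] | 13 | 4.8%
[not preserved] | 4 | 1.5%
[not preserved] | 4 | 1.5%
Other values (9) | 11 | 4.1%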

세대수 (number of households)
Real number (ℝ)

Distinct: 26
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 213.33333
Minimum: 21
Maximum: 1046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 375.0 B

Quantile statistics

Minimum: 21
5-th percentile: 24.6
Q1: 41
Median: 162
Q3: 256.5
95-th percentile: 623.4
Maximum: 1046
Range: 1025
Interquartile range (IQR): 215.5
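
The quantile figures can be checked with Python's standard library. In the sketch below, the 27 values are pieced together from this report's extreme-value tables and sample rows rather than read from the source file, so treat the list as an assumption; it does reproduce the reported sum of 5760. The `method="inclusive"` setting is also my choice: it interpolates linearly between order statistics and happens to match the reported numbers.

```python
from statistics import quantiles

# Assumed reconstruction of the 세대수 column (sorted), see the note above.
values = [21, 24, 26, 30, 31, 36, 40, 42, 50, 63, 83, 90, 151, 162, 180, 193,
          225, 225, 231, 253, 260, 308, 326, 482, 510, 672, 1046]
assert len(values) == 27 and sum(values) == 5760

# Quartiles with linear interpolation between neighbouring order statistics.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
print(q1, q2, q3)   # 41.0 162.0 256.5
print(q3 - q1)      # 215.5 (interquartile range)

# The 5th and 95th percentiles are the first and last of the 19 cut points for n=20.
twentieths = quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
print(p5, p95)      # 24.6 623.4

print(max(values) - min(values))  # 1025 (range)
```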

Descriptive statistics

Standard deviation: 235.75166
Coefficient of variation (CV): 1.1050859
Kurtosis: 5.2299957
Mean: 213.33333
Median Absolute Deviation (MAD): 120
Skewness: 2.0954378
Sum: 5760
Variance: 55578.846
Monotonicity: Not monotonic
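
Several of these figures are related by simple formulas: the variance here is the sample variance (n - 1 denominator), the standard deviation is its square root, CV is the standard deviation divided by the mean, and MAD is the median of absolute deviations from the median. A hedged sketch, again using a list of values reconstructed from this report's tables (an assumption, not the original data file):

```python
from statistics import mean, median, stdev, variance

# Assumed reconstruction of the 세대수 column; it matches the reported sum of 5760.
values = [21, 24, 26, 30, 31, 36, 40, 42, 50, 63, 83, 90, 151, 162, 180, 193,
          225, 225, 231, 253, 260, 308, 326, 482, 510, 672, 1046]

m = mean(values)        # arithmetic mean
var = variance(values)  # sample variance, divides by n - 1
sd = stdev(values)      # sample standard deviation
cv = sd / m             # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
center = median(values)
mad = median(abs(v - center) for v in values)

print(round(m, 5), round(var, 3), round(sd, 5), round(cv, 7), mad)
```

Skewness and kurtosis are left out because the standard library has no direct functions for them.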
Histogram with fixed size bins (bins=26)
Common values
Value | Count | Frequency (%)
225 | 2 | 7.4%
193 | 1 | 3.7%
63 | 1 | 3.7%
50 | 1 | 3.7%
42 | 1 | 3.7%
253 | 1 | 3.7%
180 | 1 | 3.7%
83 | 1 | 3.7%
24 | 1 | 3.7%
26 | 1 | 3.7%
Other values (16) | 16 | 59.3%

Minimum 10 values
Value | Count | Frequency (%)
21 | 1 | 3.7%
24 | 1 | 3.7%
26 | 1 | 3.7%
30 | 1 | 3.7%
31 | 1 | 3.7%
36 | 1 | 3.7%
40 | 1 | 3.7%
42 | 1 | 3.7%
50 | 1 | 3.7%
63 | 1 | 3.7%

Maximum 10 values
Value | Count | Frequency (%)
1046 | 1 | 3.7%
672 | 1 | 3.7%
510 | 1 | 3.7%
482 | 1 | 3.7%
326 | 1 | 3.7%
308 | 1 | 3.7%
260 | 1 | 3.7%
253 | 1 | 3.7%
231 | 1 | 3.7%
225 | 2 | 7.4%

준공년월 (completion date)
Date

MISSING 

Distinct: 17
Distinct (%): 94.4%
Missing: 9
Missing (%): 33.3%
Memory size: 348.0 B
Minimum: 1992-07-25 00:00:00
Maximum: 2020-03-04 00:00:00
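
The percentages above are consistent with Distinct (%) being taken over non-missing values and Missing (%) over all observations. That reading is inferred from the numbers, not stated by the report; a small sketch of the arithmetic:

```python
n_observations = 27
n_missing = 9
n_distinct = 17

non_missing = n_observations - n_missing         # 18 rows carry a date
distinct_pct = 100 * n_distinct / non_missing    # relative to non-missing rows
missing_pct = 100 * n_missing / n_observations   # relative to all rows

print(f"{distinct_pct:.1f}% distinct, {missing_pct:.1f}% missing")
# 94.4% distinct, 33.3% missing
```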
Histogram with fixed size bins (bins=17)

Interactions


Correlations

Variable | 건물명 | 도로명주소 | 지번주소 | 세대수 | 준공년월
건물명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지번주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
세대수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
준공년월 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
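
Nullity correlation can be sketched as the Pearson correlation between per-column "is missing" indicators. The example below is illustrative only: it uses the 20 rows displayed in the Sample section (rows 0 to 9 and 17 to 26), because rows 10 to 16 are not shown in this report, so the result will differ from the value plotted in the heatmap. The `pearson` helper is written here for the example and is not part of any library.

```python
from math import sqrt

# 1 = missing, 0 = present, for the 20 displayed rows (0-9, then 17-26).
road_missing = [0] * 10 + [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]   # 도로명주소
date_missing = [0] * 10 + [0, 1, 0, 0, 1, 1, 1, 1, 1, 1]   # 준공년월


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


r = pearson(road_missing, date_missing)
print(round(r, 3))
```

A positive value means that rows missing one field tend to be missing the other as well; in the displayed rows, both rows without 도로명주소 also lack 준공년월.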

Sample

First rows
Row | 건물명 | 도로명주소 | 지번주소 | 세대수 | 준공년월
0 | 서한코보스카운티 | 대구광역시 동구 동부로22길 2 | 대구광역시 동구 신천동 285-1 | 193 | 2014-07-30
1 | 유성푸르나임 | 대구광역시 동구 동부로22길 48 | 대구광역시 동구 신천동 292-6 | 672 | 2014-11-20
2 | 신천 까사밀라 | 대구광역시 동구 동부로 33 | 대구광역시 동구 신천동 538-15 | 40 | 2015-07-10
3 | 정성라임오피스텔 | 대구광역시 동구 신암남로 105 | 대구광역시 동구 신암동 259-7 | 162 | 2020-03-04
4 | 더블유스퀘어 | 대구광역시 동구 신암남로 103 | 대구광역시 동구 신암동 259-6 | 151 | 2017-12-13
5 | 국제오피스텔 | 대구광역시 동구 동대구로 432 | 대구광역시 동구 신천동 299-2 | 90 | 1992-07-25
6 | 방촌강남밸리스 | 대구광역시 동구 동촌로 299 | 대구광역시 동구 방촌동 857-120 | 31 | 2004-03-19
7 | ds빌딩 | 대구광역시 동구 동부로30길 91 | 대구광역시 동구 신천동 348-2 | 21 | 2005-05-30
8 | 동대구역부띠크시티 | 대구광역시 동구 동부로22길 14 | 대구광역시 동구 신천동 286-2 | 482 | 2015-02-10
9 | 부띠크시티 테라스 | 대구광역시 동구 동부로26길 5 | 대구광역시 동구 신천동 331-1 | 510 | 2016-02-26

Last rows
Row | 건물명 | 도로명주소 | 지번주소 | 세대수 | 준공년월
17 | 밀레니엄오피스텔 | 대구광역시 동구 동부로26길 31 | 대구광역시 동구 신천동 335-3 | 36 | 2002-06-11
18 | 동대구역 아펠리체 오피스텔 | 대구광역시 동구 동부로22길 9 | 대구광역시 동구 신천동 327-3 | 308 | <NA>
19 | 서원프레쉬빌 | 대구광역시 동구 반야월북로27길 21 | 대구광역시 동구 각산동 909 | 26 | 2016-06-22
20 | 각산역더클래스Ⅱ | 대구광역시 동구 반야월로 360 | 대구광역시 동구 신서동 518-1 | 24 | 2015-07-10
21 | 안심역삼정그린코아더베스트오피스텔 | 대구광역시 동구 반야월북로 355 | 대구광역시 동구 괴전동 90-1 | 83 | <NA>
22 | 이안센트럴D오피스텔 | 대구광역시 동구 동대구로 575 | 대구광역시 동구 신암동 235-1 | 180 | <NA>
23 | 동대구역 우방아이유쉘 오피스텔 | 대구광역시 동구 동부로 115 | 대구광역시 동구 신천동 70-1 | 253 | <NA>
24 | 방촌동 태왕아너스 오피세틀 | <NA> | 대구광역시 동구 방촌동 877-1 | 42 | <NA>
25 | 더샵 센터시티 오피스텔 | 대구광역시 동구 동부로 103-18 | 대구광역시 동구 신천동 55-1 | 50 | <NA>
26 | 동대구역 화성파크드림 오피스텔 | <NA> | 대구광역시 동구 신암동 255-14 | 225 | <NA>