Overview

Dataset statistics

Number of variables: 8
Number of observations: 115
Missing cells: 21
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 67.1 B

Variable types

Numeric: 2
Categorical: 2
Text: 3
DateTime: 1

Dataset

Description: 대구광역시_오피스텔현황_20220430 (Daegu Metropolitan City officetel status, 2022-04-30)
Author: 대구광역시 (Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15100225&dataSetDetailId=151002251d932a6e831de&provdMethod=FILE

Alerts

연번 is highly overall correlated with 구군명 and 1 other field (High correlation)
구군명 is highly overall correlated with 연번 and 1 other field (High correlation)
대지위치 is highly overall correlated with 연번 and 1 other field (High correlation)
오피스텔명 has 19 (16.5%) missing values (Missing)
연면적 has 2 (1.7%) missing values (Missing)
연번 has unique values (Unique)
지번 has unique values (Unique)
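
The missing-value figures in the overview and the alerts agree with one another, as a short arithmetic check shows. This is a minimal sketch that assumes the two columns named in the alerts are the only ones with missing cells; the variable names are illustrative and not part of the report.

```python
# Counts taken from the Overview and Alerts sections of this report.
n_rows = 115
n_columns = 8
missing_by_column = {"오피스텔명": 19, "연면적": 2}

# Assumption: no other column has missing cells, so the total is the sum.
missing_cells = sum(missing_by_column.values())
total_cells = n_rows * n_columns

# Share of all cells that are missing, as a percentage.
missing_cells_pct = 100 * missing_cells / total_cells

# Per-column share of missing rows, as a percentage.
missing_pct_by_column = {
    name: 100 * count / n_rows for name, count in missing_by_column.items()
}

print(missing_cells, f"{missing_cells_pct:.1f}%")
for name, pct in missing_pct_by_column.items():
    print(name, f"{pct:.1f}%")
```

Rounded to one decimal place, these values reproduce the 2.3%, 16.5% and 1.7% quoted above.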

Reproduction

Analysis started: 2024-04-20 21:56:24.531655
Analysis finished: 2024-04-20 21:56:26.227213
Duration: 1.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58
Minimum: 1
Maximum: 115
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 6.7
Q1: 29.5
Median: 58
Q3: 86.5
95th percentile: 109.3
Maximum: 115
Range: 114
Interquartile range (IQR): 57

Descriptive statistics

Standard deviation: 33.341666
Coefficient of variation (CV): 0.5748563
Kurtosis: -1.2
Mean: 58
Median Absolute Deviation (MAD): 29
Skewness: 0
Sum: 6670
Variance: 1111.6667
Monotonicity: Strictly increasing
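
Because 연번 is unique, strictly increasing, and spans 1 to 115 across 115 rows, its statistics should match those of the integers 1 to 115. The sketch below assumes the column is exactly that sequence (the report does not list every value) and recomputes the main figures with Python's standard `statistics` module.

```python
import statistics

# Assumption: 연번 is exactly the integers 1..115 (115 distinct values,
# minimum 1, maximum 115, strictly increasing).
values = list(range(1, 116))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

# Quartiles with linear interpolation between closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, round(variance, 4), round(std_dev, 6))
print(q1, median, q3, iqr, mad, round(cv, 7))
```

Under that assumption the sum, mean, variance, quartiles, IQR, and MAD agree with the figures reported above, which suggests 연번 is a plain row counter with no analytical content of its own.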
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
1                   1      0.9%
74                  1      0.9%
86                  1      0.9%
85                  1      0.9%
84                  1      0.9%
83                  1      0.9%
82                  1      0.9%
81                  1      0.9%
80                  1      0.9%
79                  1      0.9%
Other values (105)  105    91.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.9%
2      1      0.9%
3      1      0.9%
4      1      0.9%
5      1      0.9%
6      1      0.9%
7      1      0.9%
8      1      0.9%
9      1      0.9%
10     1      0.9%

Maximum 10 values

Value  Count  Frequency (%)
115    1      0.9%
114    1      0.9%
113    1      0.9%
112    1      0.9%
111    1      0.9%
110    1      0.9%
109    1      0.9%
108    1      0.9%
107    1      0.9%
106    1      0.9%

구군명 (district name)
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

달서구: 26
동구: 22
북구: 19
중구: 15
수성구: 14
Other values (2): 19

Length

Max length: 3
Median length: 2
Mean length: 2.4347826
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value    Count  Frequency (%)
달서구   26     22.6%
동구     22     19.1%
북구     19     16.5%
중구     15     13.0%
수성구   14     12.2%
달성군   10     8.7%
남구     9      7.8%

Length

Histogram of lengths of the category


오피스텔명 (officetel name)
Text

MISSING 

Distinct: 80
Distinct (%): 83.3%
Missing: 19
Missing (%): 16.5%
Memory size: 1.0 KiB

Length

Max length: 28
Median length: 26
Mean length: 8.375
Min length: 3

Characters and Unicode

Total characters: 804
Distinct characters: 179
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 74
Unique (%): 77.1%

Sample

1st row: 클래식명가
2nd row: 진석타워
3rd row: 움비어스오피스텔
4th row: 센트로펠리스
5th row: 대봉화성파크드림
Value              Count  Frequency (%)
주거복합           25     16.3%
호산동             9      5.9%
오피스텔           6      3.9%
감삼동             3      2.0%
이곡동             3      2.0%
상인동             3      2.0%
태왕아너스         2      1.3%
오페라             2      1.3%
범어               2      1.3%
대구역             2      1.3%
Other values (93)  96     62.7%
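
The counts in this table sum to 153, which equals the 96 non-missing names plus the 57 space characters reported further down, so the table appears to count whitespace-separated words rather than whole names. The sketch below shows one way such a word table could be produced with `collections.Counter`; the names in it are made up for illustration and are not rows from the dataset.

```python
from collections import Counter

# Hypothetical names for illustration only; they are not rows from the dataset.
names = ["호산동 주거복합", "감삼동 주거복합", "범어 오피스텔", None]

# Skip missing names, split on whitespace, and count every word.
word_counts = Counter(
    word for name in names if name is not None for word in name.split()
)
total_words = sum(word_counts.values())

# Print each word with its count and its share of all words.
for word, count in word_counts.most_common():
    print(word, count, f"{100 * count / total_words:.1f}%")
```

Read this way, the 25 occurrences of 주거복합 ("residential complex") indicate a generic descriptor used in many entries rather than one building name repeated 25 times.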

Most occurring characters

ValueCountFrequency (%)
57
 
7.1%
50
 
6.2%
32
 
4.0%
30
 
3.7%
27
 
3.4%
27
 
3.4%
27
 
3.4%
27
 
3.4%
22
 
2.7%
21
 
2.6%
Other values (169) 484
60.2%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       736    91.5%
Space Separator    57     7.1%
Close Punctuation  2      0.2%
Open Punctuation   2      0.2%
Decimal Number     2      0.2%
Lowercase Letter   2      0.2%
Uppercase Letter   2      0.2%
Letter Number      1      0.1%
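
The category names in this table are Unicode general categories. As a rough illustration of how such a breakdown can be derived, the sketch below uses Python's standard `unicodedata` module; the helper `category_counts`, the name mapping, and the two sample strings are hypothetical and not taken from the report or the dataset.

```python
import unicodedata
from collections import Counter

# Map two-letter general-category codes to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Ll": "Lowercase Letter", "Lu": "Uppercase Letter",
    "Nd": "Decimal Number", "Nl": "Letter Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
}

def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for char in text:
            code = unicodedata.category(char)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Hypothetical strings; "\u2160" is a Roman numeral from the Number Forms block.
sample = ["범어 오피스텔(1차)", "Ws 타워 \u2160"]
print(category_counts(sample))
```

Hangul syllables fall under Other Letter, which is why that category dominates the table, while a Roman-numeral character is classed as Letter Number rather than Decimal Number.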

Most frequent character per category

Other Letter
ValueCountFrequency (%)
50
 
6.8%
32
 
4.3%
30
 
4.1%
27
 
3.7%
27
 
3.7%
27
 
3.7%
27
 
3.7%
22
 
3.0%
21
 
2.9%
16
 
2.2%
Other values (160) 457
62.1%
Lowercase Letter
ValueCountFrequency (%)
d 1
50.0%
s 1
50.0%
Uppercase Letter
ValueCountFrequency (%)
W 1
50.0%
D 1
50.0%
Space Separator
ValueCountFrequency (%)
57
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 736
91.5%
Common 63
 
7.8%
Latin 5
 
0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
50
 
6.8%
32
 
4.3%
30
 
4.1%
27
 
3.7%
27
 
3.7%
27
 
3.7%
27
 
3.7%
22
 
3.0%
21
 
2.9%
16
 
2.2%
Other values (160) 457
62.1%
Latin
ValueCountFrequency (%)
d 1
20.0%
W 1
20.0%
s 1
20.0%
D 1
20.0%
1
20.0%
Common
ValueCountFrequency (%)
57
90.5%
) 2
 
3.2%
( 2
 
3.2%
1 2
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 736
91.5%
ASCII 67
 
8.3%
Number Forms 1
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
57
85.1%
) 2
 
3.0%
( 2
 
3.0%
1 2
 
3.0%
d 1
 
1.5%
W 1
 
1.5%
s 1
 
1.5%
D 1
 
1.5%
Hangul
ValueCountFrequency (%)
50
 
6.8%
32
 
4.3%
30
 
4.1%
27
 
3.7%
27
 
3.7%
27
 
3.7%
27
 
3.7%
22
 
3.0%
21
 
2.9%
16
 
2.2%
Other values (160) 457
62.1%
Number Forms
ValueCountFrequency (%)
1
100.0%

대지위치 (site location)
Categorical

HIGH CORRELATION 

Distinct: 45
Distinct (%): 39.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

범어동: 10
신천동: 9
호산동: 9
칠성동2가: 7
침산동: 6
Other values (40): 74

Length

Max length: 6
Median length: 3
Mean length: 3.6
Min length: 2

Unique

Unique: 20
Unique (%): 17.4%

Sample

1st row: 삼덕동2가
2nd row: 삼덕동2가
3rd row: 삼덕동3가
4th row: 대봉동
5th row: 대봉동

Common Values

Value              Count  Frequency (%)
범어동             10     8.7%
신천동             9      7.8%
호산동             9      7.8%
칠성동2가          7      6.1%
침산동             6      5.2%
대명동             4      3.5%
감삼동             4      3.5%
신서동             4      3.5%
이곡동             3      2.6%
구지면내리         3      2.6%
Other values (35)  56     48.7%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
범어동             10     8.4%
호산동             9      7.6%
신천동             9      7.6%
칠성동2가          7      5.9%
침산동             6      5.0%
대명동             4      3.4%
감삼동             4      3.4%
신서동             4      3.4%
고성동1가          3      2.5%
상인동             3      2.5%
Other values (37)  60     50.4%

지번 (lot number)
Text

UNIQUE 

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 13
Median length: 12
Mean length: 6.7130435
Min length: 1

Characters and Unicode

Total characters: 772
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 115
Unique (%): 100.0%

Sample

1st row: 139-12
2nd row: 210-1
3rd row: 121
4th row: 60-10
5th row: 152-24
ValueCountFrequency (%)
20
 
11.8%
1필 6
 
3.5%
일원 4
 
2.4%
2필 3
 
1.8%
2필지 3
 
1.8%
139-12 1
 
0.6%
2248 1
 
0.6%
175-1 1
 
0.6%
177-1 1
 
0.6%
1004-1 1
 
0.6%
Other values (129) 129
75.9%

Most occurring characters

ValueCountFrequency (%)
1 124
16.1%
- 95
12.3%
2 84
10.9%
57
 
7.4%
0 46
 
6.0%
7 44
 
5.7%
3 42
 
5.4%
5 42
 
5.4%
4 41
 
5.3%
8 36
 
4.7%
Other values (10) 161
20.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 515
66.7%
Other Letter 104
 
13.5%
Dash Punctuation 95
 
12.3%
Space Separator 57
 
7.4%
Other Punctuation 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 124
24.1%
2 84
16.3%
0 46
 
8.9%
7 44
 
8.5%
3 42
 
8.2%
5 42
 
8.2%
4 41
 
8.0%
8 36
 
7.0%
9 30
 
5.8%
6 26
 
5.0%
Other Letter
ValueCountFrequency (%)
31
29.8%
28
26.9%
23
22.1%
8
 
7.7%
7
 
6.7%
5
 
4.8%
2
 
1.9%
Dash Punctuation
ValueCountFrequency (%)
- 95
100.0%
Space Separator
ValueCountFrequency (%)
57
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 668
86.5%
Hangul 104
 
13.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 124
18.6%
- 95
14.2%
2 84
12.6%
57
8.5%
0 46
 
6.9%
7 44
 
6.6%
3 42
 
6.3%
5 42
 
6.3%
4 41
 
6.1%
8 36
 
5.4%
Other values (3) 57
8.5%
Hangul
ValueCountFrequency (%)
31
29.8%
28
26.9%
23
22.1%
8
 
7.7%
7
 
6.7%
5
 
4.8%
2
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 668
86.5%
Hangul 104
 
13.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 124
18.6%
- 95
14.2%
2 84
12.6%
57
8.5%
0 46
 
6.9%
7 44
 
6.6%
3 42
 
6.3%
5 42
 
6.3%
4 41
 
6.1%
8 36
 
5.4%
Other values (3) 57
8.5%
Hangul
ValueCountFrequency (%)
31
29.8%
28
26.9%
23
22.1%
8
 
7.7%
7
 
6.7%
5
 
4.8%
2
 
1.9%

연면적 (total floor area)
Text

MISSING 

Distinct: 110
Distinct (%): 97.3%
Missing: 2
Missing (%): 1.7%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 10
Mean length: 8.3097345
Min length: 5

Characters and Unicode

Total characters: 939
Distinct characters: 19
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 107
Unique (%): 94.7%

Sample

1st row: 3,297.60
2nd row: 46,167.34
3rd row: 2,890.08
4th row: 16,336.29
5th row: 1,669.27
Value               Count  Frequency (%)
695.86              2      1.8%
861.08              2      1.8%
917.5               2      1.8%
29,816.57           1      0.9%
88,760.50           1      0.9%
2,050.45            1      0.9%
33,154.45           1      0.9%
30,604.49           1      0.9%
36,312.23           1      0.9%
59,578.57           1      0.9%
Other values (100)  100    88.5%
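
연면적 is profiled as Text rather than as a number because its values carry thousands separators, and the character statistics show that a few entries also contain Hangul letters and parentheses. A minimal sketch of a conversion step follows; `parse_area` is a hypothetical helper, and returning `None` for anything that is not purely numeric is a design assumption so that such entries can be reviewed by hand.

```python
import re
from typing import Optional

# Digits with optional thousands separators and an optional decimal part,
# or plain digits without separators.
_AREA_PATTERN = re.compile(r"\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+(?:\.\d+)?")

def parse_area(raw: Optional[str]) -> Optional[float]:
    """Convert a string such as '3,297.60' to a float, or None if not purely numeric."""
    if raw is None:
        return None
    text = raw.strip()
    if not _AREA_PATTERN.fullmatch(text):
        return None
    return float(text.replace(",", ""))

# The first three inputs appear in this report; the last one is hypothetical.
for raw in ["3,297.60", "46,167.34", "917.5", "1,000.00(예정)"]:
    print(repr(raw), "->", parse_area(raw))
```

Once converted, the column could be profiled as a numeric variable, which would add quantiles, a histogram, and correlations to the report.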

Most occurring characters

ValueCountFrequency (%)
. 114
12.1%
, 98
10.4%
1 97
10.3%
2 78
8.3%
6 77
8.2%
5 73
7.8%
0 70
7.5%
9 70
7.5%
3 69
7.3%
7 64
6.8%
Other values (9) 129
13.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 720
76.7%
Other Punctuation 212
 
22.6%
Other Letter 5
 
0.5%
Open Punctuation 1
 
0.1%
Close Punctuation 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 97
13.5%
2 78
10.8%
6 77
10.7%
5 73
10.1%
0 70
9.7%
9 70
9.7%
3 69
9.6%
7 64
8.9%
4 63
8.8%
8 59
8.2%
Other Letter
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 114
53.8%
, 98
46.2%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 934
99.5%
Hangul 5
 
0.5%

Most frequent character per script

Common
ValueCountFrequency (%)
. 114
12.2%
, 98
10.5%
1 97
10.4%
2 78
8.4%
6 77
8.2%
5 73
7.8%
0 70
7.5%
9 70
7.5%
3 69
7.4%
7 64
6.9%
Other values (4) 124
13.3%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 934
99.5%
Hangul 5
 
0.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 114
12.2%
, 98
10.5%
1 97
10.4%
2 78
8.4%
6 77
8.2%
5 73
7.8%
0 70
7.5%
9 70
7.5%
3 69
7.4%
7 64
6.9%
Other values (4) 124
13.3%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

오피스텔호실수 (number of officetel units)
Real number (ℝ)

Distinct: 79
Distinct (%): 68.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.45217
Minimum: 20
Maximum: 1046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 20
5th percentile: 21
Q1: 32.5
Median: 72
Q3: 163.5
95th percentile: 497.4
Maximum: 1046
Range: 1026
Interquartile range (IQR): 131

Descriptive statistics

Standard deviation: 188.29308
Coefficient of variation (CV): 1.2515145
Kurtosis: 6.7227682
Mean: 150.45217
Median Absolute Deviation (MAD): 46
Skewness: 2.4204147
Sum: 17302
Variance: 35454.285
Monotonicity: Not monotonic
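
Several of these figures are derived from others, so they can be cross-checked with simple arithmetic. The sketch below recomputes the mean, range, IQR, and CV from the reported values, and then applies Tukey's 1.5 × IQR rule to locate outlier fences; that rule is an assumption added here for illustration and is not something the report itself applies.

```python
# Figures copied from the statistics above.
n = 115
total = 17302
minimum, maximum = 20, 1046
q1, q3 = 32.5, 163.5
std_dev = 188.29308

mean = total / n
value_range = maximum - minimum
iqr = q3 - q1
cv = std_dev / mean

# Tukey's rule (an assumption, not part of the report): values beyond
# 1.5 * IQR from the quartiles are flagged as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(round(mean, 5), value_range, iqr, round(cv, 7))
print(lower_fence, upper_fence)
```

The upper fence works out to 360 units, and the largest values in the column lie well above it. That is consistent with the strong right skew (2.42) and with a mean of about 150 sitting far above the median of 72.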
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
21                 8      7.0%
30                 6      5.2%
52                 4      3.5%
36                 4      3.5%
72                 3      2.6%
83                 3      2.6%
22                 3      2.6%
63                 3      2.6%
58                 2      1.7%
20                 2      1.7%
Other values (69)  77     67.0%

Minimum 10 values

Value  Count  Frequency (%)
20     2      1.7%
21     8      7.0%
22     3      2.6%
24     2      1.7%
25     2      1.7%
26     1      0.9%
27     2      1.7%
28     1      0.9%
30     6      5.2%
31     1      0.9%

Maximum 10 values

Value  Count  Frequency (%)
1046   1      0.9%
928    1      0.9%
730    1      0.9%
672    1      0.9%
614    1      0.9%
510    1      0.9%
492    1      0.9%
482    1      0.9%
449    1      0.9%
438    1      0.9%

사용검사(사용승인) (use inspection / use approval date)
DateTime

Distinct: 108
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
Minimum: 1992-07-25 00:00:00
Maximum: 2025-06-30 00:00:00
Histogram with fixed size bins (bins=50)
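
The maximum date, 2025-06-30, is later than the time this analysis was run, so the date range deserves a sanity check. The sketch below compares it with the analysis timestamp and with the "20220430" suffix of the dataset description; treating that suffix as a snapshot date is an assumption.

```python
from datetime import date, datetime

# Values copied from the report.
earliest = date(1992, 7, 25)
latest = date(2025, 6, 30)
analysis_finished = datetime(2024, 4, 20, 21, 56, 26).date()

# Assumption: the "20220430" suffix in the dataset description is a snapshot date.
snapshot = datetime.strptime("20220430", "%Y%m%d").date()

span_days = (latest - earliest).days
print("span in days:", span_days)
print("latest date is after the snapshot:", latest > snapshot)
print("latest date is after the analysis run:", latest > analysis_finished)
```

Both comparisons are true, so at least one row carries a date in the future relative to the snapshot. It may be a planned approval date or a data-entry error; the report alone cannot tell which.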

Interactions

[Interaction plots: 4 figures, not preserved in this text export]

Correlations

                 연번    구군명   오피스텔명   대지위치   오피스텔호실수
연번             1.000   0.957    0.849        0.964      0.227
구군명           0.957   1.000    1.000        1.000      0.000
오피스텔명       0.849   1.000    1.000        1.000      0.992
대지위치         0.964   1.000    1.000        1.000      0.586
오피스텔호실수   0.227   0.000    0.992        0.586      1.000

           구군명   대지위치
구군명     1.000    0.805
대지위치   0.805    1.000

                 연번     오피스텔호실수   구군명   대지위치
연번             1.000    -0.306           0.878    0.622
오피스텔호실수   -0.306   1.000            0.000    0.184
구군명           0.878    0.000            1.000    0.805
대지위치         0.622    0.184            0.805    1.000
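
The matrices above are not labelled in this text export. For a pair of categorical variables such as 구군명 and 대지위치, one widely used association measure is Cramér's V; whether it is the measure behind the 0.805 shown above is an assumption. The sketch below is a plain-Python, uncorrected version of Cramér's V on hypothetical toy data, intended only to show how such a coefficient is built from a contingency table.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy data: each neighbourhood lies in exactly one district.
districts = ["중구", "중구", "동구", "동구"]
neighbourhoods = ["대봉동", "대봉동", "신천동", "신천동"]
print(cramers_v(districts, neighbourhoods))
```

A high association between 구군명 and 대지위치 is expected by construction, since a neighbourhood generally belongs to a single district. The same reasoning applies to 연번 if the rows are ordered by district, so these "High correlation" alerts reflect the file's layout more than any finding about officetels.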

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

연번구군명오피스텔명대지위치지번연면적오피스텔호실수사용검사(사용승인)
01중구클래식명가삼덕동2가139-123,297.60582004-11-04
12중구진석타워삼덕동2가210-146,167.341381993-09-27
23중구움비어스오피스텔삼덕동3가1212,890.08552003-01-24
34중구센트로펠리스대봉동60-1016,336.291442007-03-23
45중구대봉화성파크드림대봉동152-241,669.27252006-05-22
56중구세명오피스넬봉산동136-92,300.82582011-04-29
67중구동승오피스텔봉산동168-11 외 6필951.69252010-11-05
78중구인터불고코아시스남산동437-113,976.081182014-02-07
89중구화성파크드림시티동인동2가51 외 9필지56,487.249282015-09-21
910중구노마즈하우스교동816,894.762612014-01-21
연번구군명오피스텔명대지위치지번연면적오피스텔호실수사용검사(사용승인)
105106달성군<NA>다사읍매곡리319 외 13필지4,968.891062013-07-29
106107달성군<NA>다사읍죽곡리210,2128,874.921522014-05-02
107108달성군<NA>구지면내리843-5734.41212014-04-14
108109달성군<NA>구지면내리844-1849.6212014-12-29
109110달성군<NA>구지면내리844-21869.04212014-12-29
110111달성군<NA>구지면 내리844-15917.5212014-12-19
111112달성군<NA>구지면 내리844-22960.49212016-07-28
112113달성군<NA>구지면 내리844-26917.5212015-02-06
113114달성군<NA>유가읍 봉리606-229,816.573612017-08-16
114115달성군<NA>다사읍죽곡리01월 25일2,925.66482018-10-04
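
The last sample row shows 지번 as "01월 25일" ("January 25"), which looks like a lot number that a spreadsheet converted into a date. A plausible original is "1-25", but that is an assumption the report cannot confirm. The sketch below flags such values and proposes a repaired form; `repair_lot_number` is a hypothetical helper, and any proposed repair should be verified against the source registry.

```python
import re
from typing import Optional

# Matches strings such as "01월 25일" (month marker 월, day marker 일).
_DATE_LIKE = re.compile(r"^(\d{1,2})월\s*(\d{1,2})일$")

def repair_lot_number(raw: str) -> Optional[str]:
    """Return a 'main-sub' lot number if raw looks like a mangled date, else None."""
    match = _DATE_LIKE.match(raw.strip())
    if match is None:
        return None
    month, day = (int(group) for group in match.groups())
    return f"{month}-{day}"

print(repair_lot_number("01월 25일"))
print(repair_lot_number("844-26"))
```

Ordinary lot numbers pass through as `None`, so only date-like strings are flagged for review.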