Overview

Dataset statistics

Number of variables: 7
Number of observations: 153
Missing cells: 298
Missing cells (%): 27.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.1 KiB
Average record size in memory: 60.9 B
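
The percentages in this summary follow from the counts reported here and in the Alerts section. A minimal sketch of the arithmetic, assuming missing cells are counted over all 153 × 7 cells and that 1 KiB is 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Counts copied from the report; the names are illustrative only.
n_rows, n_cols = 153, 7
missing_per_column = {"세대수": 134, "호수": 145, "가구수": 19}

# Missing cells are the sum of the per-column missing counts.
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# Assumption: the rounded 9.1 KiB total corresponds to 9.1 * 1024 bytes.
avg_record_bytes = 9.1 * 1024 / n_rows

print(missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

The three columns with missing values account for all 298 missing cells; the other four columns are complete.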

Variable types

Numeric: 4
Text: 1
Categorical: 2

Dataset

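Description: Nam-gu, Daegu Metropolitan City, multi-unit housing construction status, 2019-07-03 (original title: 대구광역시 남구_공동주택 공사현황_20190703)
Author: Nam-gu, Daegu Metropolitan City (대구광역시 남구)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15012482&dataSetDetailId=150124821966538a6bdaf_201907031025&provdMethod=FILE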

Alerts

세대수 is highly correlated overall with 호수 (High correlation)
호수 is highly correlated overall with 세대수 and 2 other fields (High correlation)
가구수 is highly correlated overall with 호수 and 1 other field (High correlation)
주용도 is highly correlated overall with 가구수 and 1 other field (High correlation)
부속용도 is highly correlated overall with 호수 and 1 other field (High correlation)
세대수 has 134 (87.6%) missing values (Missing)
호수 has 145 (94.8%) missing values (Missing)
가구수 has 19 (12.4%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-19 05:19:11.468629
Analysis finished: 2024-04-19 05:19:13.180224
Duration: 1.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
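
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used for this run (that is what config.json records). The function name build_report, the report title, and the cp949 default encoding are assumptions; cp949 is common for Korean public-data CSV files but has not been verified for this dataset.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    Assumptions: csv_path points at a local copy of the dataset and the
    file is cp949-encoded; pass a different encoding if that is wrong.
    """
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="공동주택 공사현황 profile")
    report.to_file(html_path)
    return report
```

A call such as build_report("namgu_housing_20190703.csv", "report.html") would write the report; the CSV file name here is hypothetical.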

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 153
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77
Minimum: 1
Maximum: 153
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.6
Q1: 39
Median: 77
Q3: 115
95-th percentile: 145.4
Maximum: 153
Range: 152
Interquartile range (IQR): 76

Descriptive statistics

Standard deviation: 44.311398
Coefficient of variation (CV): 0.5754727
Kurtosis: -1.2
Mean: 77
Median Absolute Deviation (MAD): 38
Skewness: 0
Sum: 11781
Variance: 1963.5
Monotonicity: Strictly increasing
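
Because 연번 is the unbroken sequence 1 to 153, these figures can be checked without the data file. A short sketch using only the Python standard library, assuming the report uses the sample (n − 1) variance; the variable names are illustrative:

```python
import statistics

values = list(range(1, 154))  # 연번 runs from 1 to 153 with no gaps

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(v - median) for v in values)
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance, round(std, 6), round(cv, 7), mad, strictly_increasing)
```

For a sequence 1..n the sample variance has the closed form n(n + 1)/12, which gives 153 × 154 / 12 = 1963.5.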

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
1 | 1 | 0.7%
106 | 1 | 0.7%
99 | 1 | 0.7%
100 | 1 | 0.7%
101 | 1 | 0.7%
102 | 1 | 0.7%
103 | 1 | 0.7%
104 | 1 | 0.7%
105 | 1 | 0.7%
107 | 1 | 0.7%
Other values (143) | 143 | 93.5%

Smallest values (ascending)
Value | Count | Frequency (%)
1 | 1 | 0.7%
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 1 | 0.7%
5 | 1 | 0.7%
6 | 1 | 0.7%
7 | 1 | 0.7%
8 | 1 | 0.7%
9 | 1 | 0.7%
10 | 1 | 0.7%

Largest values (descending)
Value | Count | Frequency (%)
153 | 1 | 0.7%
152 | 1 | 0.7%
151 | 1 | 0.7%
150 | 1 | 0.7%
149 | 1 | 0.7%
148 | 1 | 0.7%
147 | 1 | 0.7%
146 | 1 | 0.7%
145 | 1 | 0.7%
144 | 1 | 0.7%

대지위치 (site location)
Text

Distinct: 147
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 26
Median length: 24
Mean length: 20.738562
Min length: 16

Characters and Unicode

Total characters: 3173
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 142
Unique (%): 92.8%

Sample

1st row: 대구광역시 남구 대명동 917-3 외1필지
2nd row: 대구광역시 남구 대명동 790-13 외1필지
3rd row: 대구광역시 남구 대명동 2768-114 외1필지
4th row: 대구광역시 남구 대명동 1497-22 외1필지
5th row: 대구광역시 남구 대명동 1888-9

Word frequencies
Value | Count | Frequency (%)
대구광역시 | 153 | 23.0%
남구 | 153 | 23.0%
대명동 | 134 | 20.2%
외1필지 | 42 | 6.3%
봉덕동 | 14 | 2.1%
외2필지 | 10 | 1.5%
이천동 | 5 | 0.8%
1888-9 | 3 | 0.5%
1020-3 | 2 | 0.3%
581-1 | 2 | 0.3%
Other values (144) | 147 | 22.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 512 | 16.1%
 | 306 | 9.6%
 | 287 | 9.0%
1 | 219 | 6.9%
 | 153 | 4.8%
 | 153 | 4.8%
 | 153 | 4.8%
 | 153 | 4.8%
 | 153 | 4.8%
- | 149 | 4.7%
Other values (17) | 935 | 29.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1689 | 53.2%
Decimal Number | 823 | 25.9%
Space Separator | 512 | 16.1%
Dash Punctuation | 149 | 4.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 306 | 18.1%
 | 287 | 17.0%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 134 | 7.9%
 | 53 | 3.1%
 | 53 | 3.1%
Other values (5) | 91 | 5.4%

Decimal Number
Value | Count | Frequency (%)
1 | 219 | 26.6%
2 | 86 | 10.4%
4 | 80 | 9.7%
3 | 78 | 9.5%
6 | 75 | 9.1%
5 | 67 | 8.1%
0 | 59 | 7.2%
9 | 56 | 6.8%
8 | 56 | 6.8%
7 | 47 | 5.7%

Space Separator
Value | Count | Frequency (%)
(space) | 512 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 149 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1689 | 53.2%
Common | 1484 | 46.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 306 | 18.1%
 | 287 | 17.0%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 134 | 7.9%
 | 53 | 3.1%
 | 53 | 3.1%
Other values (5) | 91 | 5.4%

Common
Value | Count | Frequency (%)
(space) | 512 | 34.5%
1 | 219 | 14.8%
- | 149 | 10.0%
2 | 86 | 5.8%
4 | 80 | 5.4%
3 | 78 | 5.3%
6 | 75 | 5.1%
5 | 67 | 4.5%
0 | 59 | 4.0%
9 | 56 | 3.8%
Other values (2) | 103 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1689 | 53.2%
ASCII | 1484 | 46.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 512 | 34.5%
1 | 219 | 14.8%
- | 149 | 10.0%
2 | 86 | 5.8%
4 | 80 | 5.4%
3 | 78 | 5.3%
6 | 75 | 5.1%
5 | 67 | 4.5%
0 | 59 | 4.0%
9 | 56 | 3.8%
Other values (2) | 103 | 6.9%

Hangul
Value | Count | Frequency (%)
 | 306 | 18.1%
 | 287 | 17.0%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 153 | 9.1%
 | 134 | 7.9%
 | 53 | 3.1%
 | 53 | 3.1%
Other values (5) | 91 | 5.4%

주용도 (main use)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

단독주택 (detached house): 135
공동주택 (multi-unit housing): 18

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단독주택
2nd row: 단독주택
3rd row: 단독주택
4th row: 단독주택
5th row: 단독주택

Common Values

Value | Count | Frequency (%)
단독주택 | 135 | 88.2%
공동주택 | 18 | 11.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
단독주택 | 135 | 88.2%
공동주택 | 18 | 11.8%

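부속용도 (ancillary use)
Categorical

HIGH CORRELATION

Distinct: 34
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

다가구주택 (multi-household house): 75
<NA>: 25
다가구 (multi-household): 11
다세대주택 (multi-unit house): 7
제1종근린생활시설 (class 1 neighborhood living facility): 2
Other values (29): 33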

Length

Max length: 19
Median length: 5
Mean length: 5.7320261
Min length: 3

Unique

Unique: 25
Unique (%): 16.3%

Sample

1st row: 다가구주택
2nd row: 다가구주택
3rd row: 다가구주택
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
다가구주택 | 75 | 49.0%
<NA> | 25 | 16.3%
다가구 | 11 | 7.2%
다세대주택 | 7 | 4.6%
제1종근린생활시설 | 2 | 1.3%
다세대 | 2 | 1.3%
다가구주택, 소매점 | 2 | 1.3%
다가구주택,소매점 | 2 | 1.3%
다가구,사무소 | 2 | 1.3%
다가구/소매점 | 1 | 0.7%
Other values (24) | 24 | 15.7%

Length

[Figure: histogram of lengths of the category]

Word frequencies
Value | Count | Frequency (%)
다가구주택 | 80 | 48.8%
na | 25 | 15.2%
다가구 | 11 | 6.7%
다세대주택 | 8 | 4.9%
소매점 | 3 | 1.8%
 | 3 | 1.8%
근린생활시설 | 2 | 1.2%
제1종근린생활시설 | 2 | 1.2%
다세대 | 2 | 1.2%
다가구주택,소매점 | 2 | 1.2%
Other values (24) | 26 | 15.9%

세대수 (number of households, counted in 세대)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 12
Distinct (%): 63.2%
Missing: 134
Missing (%): 87.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.263158
Minimum: 6
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 6
5-th percentile: 6.9
Q1: 9
Median: 14
Q3: 16
95-th percentile: 27.2
Maximum: 29
Range: 23
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.8055705
Coefficient of variation (CV): 0.47714332
Kurtosis: -0.049540504
Mean: 14.263158
Median Absolute Deviation (MAD): 5
Skewness: 0.90544929
Sum: 271
Variance: 46.315789
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=12)]

Common values
Value | Count | Frequency (%)
16 | 4 | 2.6%
8 | 2 | 1.3%
14 | 2 | 1.3%
10 | 2 | 1.3%
9 | 2 | 1.3%
21 | 1 | 0.7%
29 | 1 | 0.7%
7 | 1 | 0.7%
6 | 1 | 0.7%
27 | 1 | 0.7%
Other values (2) | 2 | 1.3%
(Missing) | 134 | 87.6%

Smallest values (ascending)
Value | Count | Frequency (%)
6 | 1 | 0.7%
7 | 1 | 0.7%
8 | 2 | 1.3%
9 | 2 | 1.3%
10 | 2 | 1.3%
11 | 1 | 0.7%
14 | 2 | 1.3%
16 | 4 | 2.6%
21 | 1 | 0.7%
24 | 1 | 0.7%

Largest values (descending)
Value | Count | Frequency (%)
29 | 1 | 0.7%
27 | 1 | 0.7%
24 | 1 | 0.7%
21 | 1 | 0.7%
16 | 4 | 2.6%
14 | 2 | 1.3%
11 | 1 | 0.7%
10 | 2 | 1.3%
9 | 2 | 1.3%
8 | 2 | 1.3%

호수 (number of units, counted in 호)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 6
Distinct (%): 75.0%
Missing: 145
Missing (%): 94.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.375
Minimum: 2
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2.35
Q1: 3
Median: 6
Q3: 14.75
95-th percentile: 21.55
Maximum: 24
Range: 22
Interquartile range (IQR): 11.75

Descriptive statistics

Standard deviation: 8.0345237
Coefficient of variation (CV): 0.85701586
Kurtosis: -0.2836436
Mean: 9.375
Median Absolute Deviation (MAD): 3.5
Skewness: 0.9821034
Sum: 75
Variance: 64.553571
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
6 | 2 | 1.3%
3 | 2 | 1.3%
24 | 1 | 0.7%
2 | 1 | 0.7%
14 | 1 | 0.7%
17 | 1 | 0.7%
(Missing) | 145 | 94.8%

Smallest values (ascending)
Value | Count | Frequency (%)
2 | 1 | 0.7%
3 | 2 | 1.3%
6 | 2 | 1.3%
14 | 1 | 0.7%
17 | 1 | 0.7%
24 | 1 | 0.7%

Largest values (descending)
Value | Count | Frequency (%)
24 | 1 | 0.7%
17 | 1 | 0.7%
14 | 1 | 0.7%
6 | 2 | 1.3%
3 | 2 | 1.3%
2 | 1 | 0.7%
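
The eight non-missing 호수 values can be read off the frequency tables (2, 3, 3, 6, 6, 14, 17, 24), which is enough to reproduce the quantile statistics. The sketch below assumes percentiles are linearly interpolated between neighbouring order statistics; the helper name percentile is illustrative and is not the profiler's own code.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile of a sorted list, with q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Values reconstructed from the 호수 frequency table (value repeated by its count).
hosu = sorted([2, 3, 3, 6, 6, 14, 17, 24])

p05 = percentile(hosu, 0.05)
q1 = percentile(hosu, 0.25)
median = percentile(hosu, 0.50)
q3 = percentile(hosu, 0.75)
p95 = percentile(hosu, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```

With only eight observations, Q3 falls between the sixth and seventh sorted values (14 and 17), which is why the report shows the non-integer 14.75.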

가구수 (number of households, counted in 가구)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 17
Distinct (%): 12.7%
Missing: 19
Missing (%): 12.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.671642
Minimum: 2
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5.65
Q1: 9
Median: 10
Q3: 13
95-th percentile: 16.7
Maximum: 19
Range: 17
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.439654
Coefficient of variation (CV): 0.32231723
Kurtosis: 0.042263685
Mean: 10.671642
Median Absolute Deviation (MAD): 2
Skewness: 0.27998812
Sum: 1430
Variance: 11.83122
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=17)]

Common values
Value | Count | Frequency (%)
9 | 24 | 15.7%
12 | 15 | 9.8%
10 | 15 | 9.8%
11 | 15 | 9.8%
7 | 13 | 8.5%
13 | 10 | 6.5%
15 | 9 | 5.9%
6 | 5 | 3.3%
16 | 5 | 3.3%
8 | 5 | 3.3%
Other values (7) | 18 | 11.8%
(Missing) | 19 | 12.4%

Smallest values (ascending)
Value | Count | Frequency (%)
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 2 | 1.3%
5 | 3 | 2.0%
6 | 5 | 3.3%
7 | 13 | 8.5%
8 | 5 | 3.3%
9 | 24 | 15.7%
10 | 15 | 9.8%
11 | 15 | 9.8%

Largest values (descending)
Value | Count | Frequency (%)
19 | 3 | 2.0%
18 | 4 | 2.6%
16 | 5 | 3.3%
15 | 9 | 5.9%
14 | 4 | 2.6%
13 | 10 | 6.5%
12 | 15 | 9.8%
11 | 15 | 9.8%
10 | 15 | 9.8%
9 | 24 | 15.7%
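
Taken together, the frequency tables for 가구수 list all 17 distinct values with their counts, so the summary statistics can be recomputed from the counts alone. A sketch under the assumption that the report uses the sample (n − 1) variance; the dictionary name counts is illustrative:

```python
import math

# value -> count, combined from the 가구수 frequency tables
counts = {2: 1, 3: 1, 4: 2, 5: 3, 6: 5, 7: 13, 8: 5, 9: 24, 10: 15,
          11: 15, 12: 15, 13: 10, 14: 4, 15: 9, 16: 5, 18: 4, 19: 3}

n = sum(counts.values())                           # non-missing observations
total = sum(value * count for value, count in counts.items())
mean = total / n
# Sample variance from count-weighted squared deviations.
variance = sum(count * (value - mean) ** 2
               for value, count in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(n, total, round(mean, 6), round(variance, 5), round(std, 6))
```

The count of non-missing observations recovered this way, 134, matches 153 rows minus the 19 missing values reported for this variable.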

Interactions

[Figures: 16 interaction plots]

Correlations

Correlation matrix 1 (all six variables)
 | 연번 | 주용도 | 부속용도 | 세대수 | 호수 | 가구수
연번 | 1.000 | 0.328 | 0.349 | 0.862 | 0.840 | 0.424
주용도 | 0.328 | 1.000 | 1.000 | 0.537 | 0.565 | NaN
부속용도 | 0.349 | 1.000 | 1.000 | 0.735 | 0.964 | 0.000
세대수 | 0.862 | 0.537 | 0.735 | 1.000 | 0.925 | NaN
호수 | 0.840 | 0.565 | 0.964 | 0.925 | 1.000 | 0.000
가구수 | 0.424 | NaN | 0.000 | NaN | 0.000 | 1.000

Correlation matrix 2 (categorical variables only)
 | 부속용도 | 주용도
부속용도 | 1.000 | 0.868
주용도 | 0.868 | 1.000

Correlation matrix 3 (all six variables)
 | 연번 | 세대수 | 호수 | 가구수 | 주용도 | 부속용도
연번 | 1.000 | 0.132 | 0.120 | -0.220 | 0.239 | 0.112
세대수 | 0.132 | 1.000 | 1.000 | NaN | 0.297 | 0.399
호수 | 0.120 | 1.000 | 1.000 | 1.000 | 0.435 | 0.667
가구수 | -0.220 | NaN | 1.000 | 1.000 | 1.000 | 0.000
주용도 | 0.239 | 0.297 | 0.435 | 1.000 | 1.000 | 0.868
부속용도 | 0.112 | 0.399 | 0.667 | 0.000 | 0.868 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
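
Nullity correlation is the ordinary correlation between per-column missingness indicators. A minimal sketch on the last ten rows shown in the Sample section, where 호수 is present only in the final row and 가구수 is missing only there. The helper pearson is illustrative and is not the library's implementation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns None if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# 1 = missing, 0 = present, for the rows with 연번 144 to 153
hosu_missing = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
gagusu_missing = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sedaesu_missing = [1] * 10  # always missing here, so the indicator is constant

r = pearson(hosu_missing, gagusu_missing)
undefined = pearson(sedaesu_missing, gagusu_missing)
print(r, undefined)
```

On these ten rows the two indicators are perfectly anti-correlated: whenever 호수 is missing, 가구수 is present, and vice versa. A column that is always missing (or never missing) has a constant indicator, so no correlation can be computed for it.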

Sample

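First rows
(index) | 연번 | 대지위치 | 주용도 | 부속용도 | 세대수 | 호수 | 가구수
0 | 1 | 대구광역시 남구 대명동 917-3 외1필지 | 단독주택 | 다가구주택 | <NA> | <NA> | 18
1 | 2 | 대구광역시 남구 대명동 790-13 외1필지 | 단독주택 | 다가구주택 | <NA> | <NA> | 11
2 | 3 | 대구광역시 남구 대명동 2768-114 외1필지 | 단독주택 | 다가구주택 | <NA> | <NA> | 12
3 | 4 | 대구광역시 남구 대명동 1497-22 외1필지 | 단독주택 | <NA> | <NA> | <NA> | 10
4 | 5 | 대구광역시 남구 대명동 1888-9 | 단독주택 | <NA> | <NA> | <NA> | 11
5 | 6 | 대구광역시 남구 대명동 1888-9 | 단독주택 | <NA> | <NA> | <NA> | 11
6 | 7 | 대구광역시 남구 대명동 1888-9 | 단독주택 | <NA> | <NA> | <NA> | 11
7 | 8 | 대구광역시 남구 대명동 915-1 | 단독주택 | 다가구주택 | <NA> | <NA> | 9
8 | 9 | 대구광역시 남구 대명동 1077-15 외1필지 | 단독주택 | <NA> | <NA> | <NA> | 12
9 | 10 | 대구광역시 남구 대명동 1596-9 외1필지 | 단독주택 | <NA> | <NA> | <NA> | 10

Last rows
(index) | 연번 | 대지위치 | 주용도 | 부속용도 | 세대수 | 호수 | 가구수
143 | 144 | 대구광역시 남구 이천동 517-25 외1필지 | 단독주택 | 다가구주택(8가구) | <NA> | <NA> | 8
144 | 145 | 대구광역시 남구 대명동 1604-27 | 단독주택 | <NA> | <NA> | <NA> | 9
145 | 146 | 대구광역시 남구 이천동 517-33 | 단독주택 | 다가구주택(6가구) | <NA> | <NA> | 6
146 | 147 | 대구광역시 남구 대명동 1592-28 | 단독주택 | 다가구 | <NA> | <NA> | 7
147 | 148 | 대구광역시 남구 대명동 378-4 | 단독주택 | <NA> | <NA> | <NA> | 10
148 | 149 | 대구광역시 남구 봉덕동 739-3 | 단독주택 | <NA> | <NA> | <NA> | 10
149 | 150 | 대구광역시 남구 대명동 3040-10 | 단독주택 | 다가구주택 | <NA> | <NA> | 6
150 | 151 | 대구광역시 남구 대명동 919-1 | 단독주택 | 다가구주택 및 제2종근린새왈시설 | <NA> | <NA> | 9
151 | 152 | 대구광역시 남구 대명동 634-3 | 단독주택 | 다가구주택 | <NA> | <NA> | 10
152 | 153 | 대구광역시 남구 이천동 645-12 | 단독주택 | <NA> | <NA> | 17 | <NA>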