Overview

Dataset statistics

Number of variables: 7
Number of observations: 717
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.3 KiB
Average record size in memory: 56.2 B
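
The "< 0.1%" missing-cell figure follows from the counts above. A minimal sketch of the arithmetic, assuming the percentage is missing cells divided by variables times observations (the variable names are illustrative):

```python
# Values copied from the Dataset statistics above.
n_variables = 7
n_observations = 717
missing_cells = 2

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.3f}% missing")
```

The result is well under 0.1%, which matches the rounded figure shown in the report.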

Variable types

Categorical: 1
Text: 4
DateTime: 2

Dataset

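Description: 대구광역시_남구_ 건축인허가DB_20190630 (Daegu Metropolitan City, Nam-gu, building permit DB, 20190630)
Author: 대구광역시 남구 (Nam-gu, Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=3055124&dataSetDetailId=30551242875e637979f5&provdMethod=FILE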

Alerts

건축구분 is highly imbalanced (51.2%) [Imbalance]
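
The 51.2% figure is consistent with one minus the normalized Shannon entropy of the 건축구분 class counts listed under Variables (581, 73, 37, 26). The sketch below reproduces the number under that assumption. Whether ydata-profiling computes the score exactly this way is not stated in the report, and `imbalance_score` is an illustrative name.

```python
import math

# Class counts copied from the 건축구분 variable section of this report.
counts = {"신축": 581, "증축": 73, "용도변경": 37, "대수선": 26}


def imbalance_score(class_counts):
    """Return 1 - H/log2(k): 0 for a uniform distribution, 1 for a single class.

    Assumption: this normalized-entropy definition is what the alert reports.
    """
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

A score near 0 would indicate evenly sized classes. Here the dominant 신축 class pushes the score above one half.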

Reproduction

Analysis started: 2024-04-18 01:18:11.965051
Analysis finished: 2024-04-18 01:18:13.192312
Duration: 1.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

건축구분 (construction type)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

신축 (new construction): 581
증축 (extension): 73
용도변경 (change of use): 37
대수선 (major repair): 26

Length

Max length: 4
Median length: 2
Mean length: 2.13947
Min length: 2
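
These length statistics can be checked from the category counts alone. A short sketch, assuming length means the number of characters in each category label:

```python
import statistics

# Category labels and counts copied from this variable's Common Values.
counts = {"신축": 581, "증축": 73, "용도변경": 37, "대수선": 26}

# One length entry per observation (717 in total).
lengths = [len(label) for label, n in counts.items() for _ in range(n)]

mean_len = sum(lengths) / len(lengths)
median_len = statistics.median(lengths)

print(max(lengths), median_len, round(mean_len, 5), min(lengths))
```

Because 654 of the 717 labels have two characters, the median stays at 2 while the mean sits slightly above it.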

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신축
2nd row: 신축
3rd row: 신축
4th row: 신축
5th row: 신축

Common Values

Value      Count  Frequency (%)
신축       581    81.0%
증축       73     10.2%
용도변경   37     5.2%
대수선     26     3.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed in the Common Values table (신축 581, 증축 73, 용도변경 37, 대수선 26)]

대지위치 (site location)
Text

Distinct: 673
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 26
Median length: 25
Mean length: 19.746165
Min length: 16

Characters and Unicode

Total characters: 14158
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
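
As a hedged illustration of those character properties, the sketch below tallies the Unicode general category of each character in one address from this column, using Python's standard `unicodedata` module. The category codes map to the names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation). How ydata-profiling derives its own counts is not shown in the report.

```python
import unicodedata
from collections import Counter

# One sample value from the 대지위치 column of this report.
address = "대구광역시 남구 대명동 1020-3"

# unicodedata.category returns the two-letter general category code.
category_counts = Counter(unicodedata.category(ch) for ch in address)

for code, n in category_counts.most_common():
    print(code, n)
```

Script and block properties are not exposed by `unicodedata`, so this sketch covers categories only.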

Unique

Unique: 631
Unique (%): 88.0%

Sample

1st row: 대구광역시 남구 대명동 1019-2 외1필지
2nd row: 대구광역시 남구 대명동 1019-5 외1필지
3rd row: 대구광역시 남구 대명동 1020-3
4th row: 대구광역시 남구 대명동 1020-8
5th row: 대구광역시 남구 대명동 1024-1

Value               Count  Frequency (%)
대구광역시          717    24.1%
남구                717    24.1%
대명동              590    19.8%
봉덕동              97     3.3%
외1필지             71     2.4%
이천동              30     1.0%
외2필지             16     0.5%
317-1               5      0.2%
321-1               4      0.1%
(not preserved)     4      0.1%
Other values (670)  722    24.3%

Most occurring characters

Value              Count  Frequency (%)
(space)            2256   15.9%
(Hangul)           1434   10.1%
(Hangul)           1307   9.2%
1                  826    5.8%
(Hangul)           717    5.1%
(Hangul)           717    5.1%
(Hangul)           717    5.1%
(Hangul)           717    5.1%
(Hangul)           717    5.1%
-                  706    5.0%
Other values (18)  4044   28.6%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      7477   52.8%
Decimal Number    3719   26.3%
Space Separator   2256   15.9%
Dash Punctuation  706    5.0%

Most frequent character per category

Other Letter
Value             Count  Frequency (%)
(Hangul)          1434   19.2%
(Hangul)          1307   17.5%
(Hangul)          717    9.6%
(Hangul)          717    9.6%
(Hangul)          717    9.6%
(Hangul)          717    9.6%
(Hangul)          717    9.6%
(Hangul)          590    7.9%
(Hangul)          101    1.4%
(Hangul)          101    1.4%
Other values (6)  359    4.8%

Decimal Number
Value  Count  Frequency (%)
1      826    22.2%
2      428    11.5%
3      400    10.8%
4      341    9.2%
5      326    8.8%
6      319    8.6%
0      295    7.9%
7      275    7.4%
9      260    7.0%
8      249    6.7%

Space Separator
Value    Count  Frequency (%)
(space)  2256   100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      706    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  7477   52.8%
Common  6681   47.2%

Most frequent character per script

Hangul: the same 7477 characters as the Other Letter category, so the counts match the Other Letter table (1434, 1307, 717 five times, 590, 101 twice, other values (6) 359).

Common
Value             Count  Frequency (%)
(space)           2256   33.8%
1                 826    12.4%
-                 706    10.6%
2                 428    6.4%
3                 400    6.0%
4                 341    5.1%
5                 326    4.9%
6                 319    4.8%
0                 295    4.4%
7                 275    4.1%
Other values (2)  509    7.6%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  7477   52.8%
ASCII   6681   47.2%

Most frequent character per block

ASCII: the same 6681 characters as the Common script, so the counts match the Common table under Most frequent character per script.

Hangul: the same 7477 characters as the Hangul script, so the counts match the Hangul table under Most frequent character per script.

대지면적(㎡) (site area)
Text

Distinct: 609
Distinct (%): 85.2%
Missing: 2
Missing (%): 0.3%
Memory size: 5.7 KiB

Length

Max length: 8
Median length: 5
Mean length: 4.7244755
Min length: 1

Characters and Unicode

Total characters: 3378
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 521
Unique (%): 72.9%

Sample

1st row: 269
2nd row: 268
3rd row: 255.4
4th row: 255.4
5th row: 429.8

Value               Count  Frequency (%)
200                 6      0.8%
302.3               4      0.6%
195                 4      0.6%
271.7               3      0.4%
251                 3      0.4%
225.7               3      0.4%
268.4               3      0.4%
289.3               3      0.4%
221                 3      0.4%
324                 3      0.4%
Other values (599)  680    95.1%

Most occurring characters

Value             Count  Frequency (%)
.                 535    15.8%
2                 504    14.9%
3                 389    11.5%
1                 372    11.0%
4                 286    8.5%
6                 233    6.9%
5                 225    6.7%
8                 203    6.0%
0                 201    6.0%
7                 200    5.9%
Other values (2)  230    6.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     2812   83.2%
Other Punctuation  566    16.8%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      504    17.9%
3      389    13.8%
1      372    13.2%
4      286    10.2%
6      233    8.3%
5      225    8.0%
8      203    7.2%
0      201    7.1%
7      200    7.1%
9      199    7.1%

Other Punctuation
Value  Count  Frequency (%)
.      535    94.5%
,      31     5.5%
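
The 31 commas counted above suggest that some area values carry thousands separators (the sample rows at the end of this report include "1,099"), which is presumably why this column was profiled as Text rather than as a number. A minimal sketch of one way to convert such strings, assuming the comma is only ever a thousands separator and that blank or missing entries should stay missing (`parse_area` is a hypothetical helper, not part of the report):

```python
def parse_area(text):
    """Convert an area string such as '1,099' or '255.4' to a float.

    Assumption: commas are thousands separators, never decimal marks.
    Missing or blank values are returned as None.
    """
    if text is None:
        return None
    cleaned = text.strip().replace(",", "")
    if not cleaned:
        return None
    return float(cleaned)


# Values taken from this report's samples, plus one missing entry.
samples = ["269", "255.4", "1,099", None]
areas = [parse_area(s) for s in samples]
print(areas)
```

After a conversion along these lines, the column could be re-profiled as numeric to obtain quantiles and a histogram instead of character counts.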

Most occurring scripts

Value   Count  Frequency (%)
Common  3378   100.0%

Most frequent character per script

Common: all 3378 characters belong to this script, so the counts are the same as in the Most occurring characters table.

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3378   100.0%

Most frequent character per block

ASCII: all 3378 characters belong to this block, so the counts are the same as in the Most occurring characters table.

건축면적(㎡) (building area)
Text

Distinct: 656
Distinct (%): 91.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 9
Median length: 6
Mean length: 5.7196653
Min length: 2

Characters and Unicode

Total characters: 4101
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 596
Unique (%): 83.1%

Sample

1st row: 161.16
2nd row: 160.38
3rd row: 153
4th row: 153
5th row: 215.25

Value               Count  Frequency (%)
169.28              3      0.4%
196.29              2      0.3%
141.48              2      0.3%
155.7               2      0.3%
182.39              2      0.3%
168.96              2      0.3%
245.04              2      0.3%
150.02              2      0.3%
179.34              2      0.3%
214.2               2      0.3%
Other values (646)  696    97.1%

Most occurring characters

Value             Count  Frequency (%)
.                 701    17.1%
1                 674    16.4%
2                 477    11.6%
6                 329    8.0%
8                 309    7.5%
3                 293    7.1%
4                 291    7.1%
5                 289    7.0%
9                 250    6.1%
0                 239    5.8%
Other values (2)  249    6.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     3384   82.5%
Other Punctuation  717    17.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      674    19.9%
2      477    14.1%
6      329    9.7%
8      309    9.1%
3      293    8.7%
4      291    8.6%
5      289    8.5%
9      250    7.4%
0      239    7.1%
7      233    6.9%

Other Punctuation
Value  Count  Frequency (%)
.      701    97.8%
,      16     2.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  4101   100.0%

Most frequent character per script

Common: all 4101 characters belong to this script, so the counts are the same as in the Most occurring characters table.

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4101   100.0%

Most frequent character per block

ASCII: all 4101 characters belong to this block, so the counts are the same as in the Most occurring characters table.

연면적(㎡) (total floor area)
Text

Distinct: 692
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 10
Median length: 6
Mean length: 6.0376569
Min length: 3

Characters and Unicode

Total characters: 4329
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 667
Unique (%): 93.0%

Sample

1st row: 465.46
2nd row: 466.36
3rd row: 463.36
4th row: 463.36
5th row: 633.38

Value               Count  Frequency (%)
896.31              2      0.3%
514.51              2      0.3%
221.92              2      0.3%
659.65              2      0.3%
659.55              2      0.3%
476.38              2      0.3%
399.55              2      0.3%
349.06              2      0.3%
659.62              2      0.3%
659.88              2      0.3%
Other values (682)  697    97.2%

Most occurring characters

Value             Count  Frequency (%)
.                 703    16.2%
5                 445    10.3%
3                 422    9.7%
4                 417    9.6%
9                 391    9.0%
6                 390    9.0%
1                 356    8.2%
2                 354    8.2%
7                 318    7.3%
8                 305    7.0%
Other values (2)  228    5.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     3565   82.4%
Other Punctuation  764    17.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      445    12.5%
3      422    11.8%
4      417    11.7%
9      391    11.0%
6      390    10.9%
1      356    10.0%
2      354    9.9%
7      318    8.9%
8      305    8.6%
0      167    4.7%

Other Punctuation
Value  Count  Frequency (%)
.      703    92.0%
,      61     8.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  4329   100.0%

Most frequent character per script

Common: all 4329 characters belong to this script, so the counts are the same as in the Most occurring characters table.

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4329   100.0%

Most frequent character per block

ASCII: all 4329 characters belong to this block, so the counts are the same as in the Most occurring characters table.

허가일 (permit date)
DateTime

Distinct: 454
Distinct (%): 63.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB
Minimum: 2004-03-18 00:00:00
Maximum: 2017-10-23 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]
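
To give a sense of what 50 fixed-size bins mean for this date range, the sketch below divides the span between the reported minimum and maximum into equal parts. It assumes the histogram bins cover exactly the minimum-to-maximum range, which the report does not state.

```python
from datetime import datetime

# Minimum and maximum copied from this variable's summary.
minimum = datetime(2004, 3, 18)
maximum = datetime(2017, 10, 23)
bins = 50

span = maximum - minimum   # timedelta covering the whole range
bin_width = span / bins    # width of each equal-size bin

print(span.days, "days in total;", bin_width, "per bin")
```

Each bin therefore spans a little over three months. With most permits dated 2014 or later in the sample rows, a single early record such as the 2004 permit would stretch the axis and leave many bins empty.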

사용승인일 (use approval date)
DateTime

Distinct: 428
Distinct (%): 59.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB
Minimum: 2015-01-02 00:00:00
Maximum: 2017-12-29 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

Missing values

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     건축구분  대지위치                               대지면적(㎡)  건축면적(㎡)  연면적(㎡)  허가일      사용승인일
0    신축      대구광역시 남구 대명동 1019-2 외1필지   269           161.16        465.46      2015-12-31  2016-06-27
1    신축      대구광역시 남구 대명동 1019-5 외1필지   268           160.38        466.36      2016-07-08  2017-01-18
2    신축      대구광역시 남구 대명동 1020-3           255.4         153           463.36      2015-06-18  2015-12-14
3    신축      대구광역시 남구 대명동 1020-8           255.4         153           463.36      2015-06-18  2015-12-14
4    신축      대구광역시 남구 대명동 1024-1           429.8         215.25        633.38      2014-09-26  2015-01-28
5    신축      대구광역시 남구 대명동 1029-12          463.8         219.69        659.26      2015-04-28  2015-08-11
6    신축      대구광역시 남구 대명동 1029-3           532.2         224.37        659.51      2015-01-19  2015-06-09
7    신축      대구광역시 남구 대명동 1041-2           235.3         140.22        392.44      2016-04-22  2016-10-20
8    신축      대구광역시 남구 대명동 1047-3           288.9         169.28        479.02      2016-02-29  2016-08-23
9    신축      대구광역시 남구 대명동 1047-9           289           169.28        479.02      2016-02-29  2016-08-23

Last rows

     건축구분  대지위치                               대지면적(㎡)  건축면적(㎡)  연면적(㎡)  허가일      사용승인일
707  신축      대구광역시 남구 이천동 405-5            530           264.78        1,655.33    2014-05-15  2015-02-12
708  신축      대구광역시 남구 이천동 407-7            1,099         611.54        4,173.52    2004-03-18  2015-07-21
709  신축      대구광역시 남구 이천동 491-62           393           232.07        863.62      2015-08-12  2016-06-09
710  증축      대구광역시 남구 이천동 516-15           628           288.31        1,078.92    2017-08-25  2017-12-08
711  신축      대구광역시 남구 이천동 516-23 외1필지   499           288.31        944.13      2015-12-08  2017-06-26
712  신축      대구광역시 남구 이천동 542-13           172           101.08        197.505     2015-06-15  2016-03-09
713  신축      대구광역시 남구 이천동 584-5 외1필지    177           105.22        229.12      2017-04-05  2017-10-26
714  증축      대구광역시 남구 이천동 644-11           341           201.26        576.77      2015-06-26  2015-07-31
715  신축      대구광역시 남구 이천동 644-11           341           201.26        517.95      2014-11-27  2015-06-12
716  신축      대구광역시 남구 이천동 756-1            200           117.38        326.32      2017-06-01  2017-12-06