Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 679
Missing cells (%): 1.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
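
The two derived figures in this table follow from the others by simple arithmetic. The sketch below recomputes them from the values printed above; the variable names are illustrative and not part of the report. Note that the 1.7% is a share of all cells (rows times columns), which is why it is smaller than the 6.8% reported for the single column 도로명주소.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 10_000
missing_cells = 679
total_size_kib = 400.4

# Every row contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of empty cells across the whole table.
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; divide the total footprint by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```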

Variable types

Numeric: 1
Text: 2
Categorical: 1

Dataset

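Description: Data on buildings (general buildings) subject to maintenance management. It provides fields such as the lot-location address (대지위치주소), road-name address (도로명주소), building name, and main use (주용도).
URL: https://www.data.go.kr/data/15120022/fileData.do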

Alerts

주용도 is highly imbalanced (64.5%) [Imbalance]
도로명주소 has 679 (6.8%) missing values [Missing]
연번 has unique values [Unique]
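
The imbalance percentage can be read as an entropy-based score. The sketch below assumes the definition 1 - H / log2(k), where H is the Shannon entropy (base 2) of the class shares and k is the number of classes; this is an assumption about how the report derives the figure, and `imbalance_score` is a hypothetical helper name. Because the report groups 17 of the 27 categories of 주용도 into "Other values", the 64.5% cannot be reproduced exactly from the text, so the example uses made-up counts.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean
    that one class dominates. Assumed definition, see lead-in.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Illustrative counts, not taken from the dataset.
balanced = imbalance_score([25, 25, 25, 25])
skewed = imbalance_score([97, 1, 1, 1])
print(f"balanced: {balanced:.3f}, skewed: {skewed:.3f}")
```

Under this definition, a column in which one value covers about 70% of the rows and the remainder is spread thinly over many categories scores well above 0.5, which is consistent with the alert shown for 주용도.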

Reproduction

Analysis started: 2023-12-12 01:42:05.515335
Analysis finished: 2023-12-12 01:42:06.452547
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17582.389
Minimum: 2
Maximum: 35049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1925.8
Q1: 8869.25
Median: 17525.5
Q3: 26272.5
95-th percentile: 33341.1
Maximum: 35049
Range: 35047
Interquartile range (IQR): 17403.25

Descriptive statistics

Standard deviation: 10052.334
Coefficient of variation (CV): 0.57172741
Kurtosis: -1.1904994
Mean: 17582.389
Median Absolute Deviation (MAD): 8710.5
Skewness: 0.0084593059
Sum: 1.7582389 × 10^8
Variance: 1.0104941 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
31237                1      < 0.1%
12927                1      < 0.1%
16128                1      < 0.1%
29120                1      < 0.1%
3895                 1      < 0.1%
12081                1      < 0.1%
10168                1      < 0.1%
17375                1      < 0.1%
13177                1      < 0.1%
34333                1      < 0.1%
Other values (9990)  9990   99.9%

Value  Count  Frequency (%)
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
8      1      < 0.1%
18     1      < 0.1%
19     1      < 0.1%
20     1      < 0.1%
24     1      < 0.1%
26     1      < 0.1%
30     1      < 0.1%

Value  Count  Frequency (%)
35049  1      < 0.1%
35047  1      < 0.1%
35039  1      < 0.1%
35034  1      < 0.1%
35033  1      < 0.1%
35031  1      < 0.1%
35027  1      < 0.1%
35024  1      < 0.1%
35023  1      < 0.1%
35022  1      < 0.1%

대지위치주소
Text

Distinct: 8770
Distinct (%): 87.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 22
Mean length: 21.6657
Min length: 12

Characters and Unicode

Total characters: 216657
Distinct characters: 58
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7713
Unique (%): 77.1%

Sample

1st row: 대전광역시 중구 문창동 0122-0006
2nd row: 대전광역시 중구 태평동 0339-0001
3rd row: 대전광역시 중구 대사동 0247-0113
4th row: 대전광역시 중구 석교동 0014-0010
5th row: 대전광역시 중구 석교동 0073-0009

Value                Count  Frequency (%)
대전광역시           10000  25.0%
중구                 10000  25.0%
유천동               1356   3.4%
산성동               938    2.3%
문화동               902    2.3%
태평동               800    2.0%
부사동               633    1.6%
석교동               612    1.5%
선화동               569    1.4%
대흥동               562    1.4%
Other values (7299)  13661  34.1%

Most occurring characters

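Value              Count  Frequency (%)
0                  37375  17.3%
(space)            30033  13.9%
[not preserved]    10987  5.1%
[not preserved]    10332  4.8%
[not preserved]    10016  4.6%
[not preserved]    10000  4.6%
[not preserved]    10000  4.6%
[not preserved]    10000  4.6%
[not preserved]    10000  4.6%
[not preserved]    10000  4.6%
Other values (48)  67914  31.3%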

Most occurring categories

Value             Count  Frequency (%)
Other Letter      99594  46.0%
Decimal Number    77620  35.8%
Space Separator   30033  13.9%
Dash Punctuation  9410   4.3%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[not preserved]    10987  11.0%
[not preserved]    10332  10.4%
[not preserved]    10016  10.1%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    1471   1.5%
[not preserved]    1393   1.4%
Other values (36)  15395  15.5%

Decimal Number
Value  Count  Frequency (%)
0      37375  48.2%
1      8957   11.5%
2      6113   7.9%
3      5762   7.4%
4      4232   5.5%
6      3527   4.5%
5      3274   4.2%
7      3055   3.9%
8      2748   3.5%
9      2577   3.3%

Space Separator
Value    Count  Frequency (%)
(space)  30033  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      9410   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  117063  54.0%
Hangul  99594   46.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[not preserved]    10987  11.0%
[not preserved]    10332  10.4%
[not preserved]    10016  10.1%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    1471   1.5%
[not preserved]    1393   1.4%
Other values (36)  15395  15.5%

Common
Value             Count  Frequency (%)
0                 37375  31.9%
(space)           30033  25.7%
-                 9410   8.0%
1                 8957   7.7%
2                 6113   5.2%
3                 5762   4.9%
4                 4232   3.6%
6                 3527   3.0%
5                 3274   2.8%
7                 3055   2.6%
Other values (2)  5325   4.5%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   117063  54.0%
Hangul  99594   46.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 37375  31.9%
(space)           30033  25.7%
-                 9410   8.0%
1                 8957   7.7%
2                 6113   5.2%
3                 5762   4.9%
4                 4232   3.6%
6                 3527   3.0%
5                 3274   2.8%
7                 3055   2.6%
Other values (2)  5325   4.5%

Hangul
Value              Count  Frequency (%)
[not preserved]    10987  11.0%
[not preserved]    10332  10.4%
[not preserved]    10016  10.1%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    10000  10.0%
[not preserved]    1471   1.5%
[not preserved]    1393   1.4%
Other values (36)  15395  15.5%

도로명주소
Text

MISSING 

Distinct: 8249
Distinct (%): 88.5%
Missing: 679
Missing (%): 6.8%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 24
Mean length: 19.417874
Min length: 14

Characters and Unicode

Total characters: 180994
Distinct characters: 113
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7301
Unique (%): 78.3%
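
For a column with missing values, the percentages above use different denominators. The sketch below checks this against the printed figures for 도로명주소: the missing share is taken over all rows, while the distinct share, unique share, and mean length only match when computed over the non-missing rows. The variable names are illustrative and not part of the report.

```python
# Figures copied from the 도로명주소 section.
n_rows = 10_000
n_missing = 679
n_distinct = 8249
n_unique = 7301
total_characters = 180_994

# Rows that actually hold an address.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows          # over all rows
distinct_pct = 100 * n_distinct / n_present     # over non-missing rows
unique_pct = 100 * n_unique / n_present         # over non-missing rows
mean_length = total_characters / n_present      # characters per non-missing row

print(f"Non-missing rows: {n_present}")
print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.6f}")
```

The non-missing row count computed here also equals the count of the token 대전광역시 in the word frequency table for this column, since every present address starts with it.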

Sample

1st row: 대전광역시 중구 문창로18번길 17
2nd row: 대전광역시 중구 수침로 105
3rd row: 대전광역시 중구 솔밭로 8
4th row: 대전광역시 중구 보문로47번길 100
5th row: 대전광역시 중구 대전천서로181번길 20

Value                Count  Frequency (%)
대전광역시           9321   25.0%
중구                 9321   25.0%
대종로               256    0.7%
문화로               178    0.5%
보문로               156    0.4%
대전천서로           155    0.4%
유천로               116    0.3%
동서대로             116    0.3%
11                   115    0.3%
18                   113    0.3%
Other values (2934)  17437  46.8%

Most occurring characters

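Value               Count  Frequency (%)
(space)             27963  15.4%
[not preserved]     11299  6.2%
[not preserved]     9690   5.4%
[not preserved]     9534   5.3%
[not preserved]     9321   5.1%
[not preserved]     9321   5.1%
[not preserved]     9321   5.1%
[not preserved]     9321   5.1%
[not preserved]     9231   5.1%
1                   8876   4.9%
Other values (103)  67117  37.1%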

Most occurring categories

Value             Count   Frequency (%)
Other Letter      109243  60.4%
Decimal Number    40493   22.4%
Space Separator   27963   15.4%
Dash Punctuation  3295    1.8%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[not preserved]    11299  10.3%
[not preserved]    9690   8.9%
[not preserved]    9534   8.7%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9231   8.4%
[not preserved]    6436   5.9%
[not preserved]    6346   5.8%
Other values (91)  19423  17.8%

Decimal Number
Value  Count  Frequency (%)
1      8876   21.9%
2      4980   12.3%
3      4498   11.1%
4      3976   9.8%
5      3725   9.2%
6      3584   8.9%
7      3122   7.7%
8      2709   6.7%
9      2540   6.3%
0      2483   6.1%

Space Separator
Value    Count  Frequency (%)
(space)  27963  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      3295   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul  109243  60.4%
Common  71751   39.6%

Most frequent character per script

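Hangul
Value              Count  Frequency (%)
[not preserved]    11299  10.3%
[not preserved]    9690   8.9%
[not preserved]    9534   8.7%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9231   8.4%
[not preserved]    6436   5.9%
[not preserved]    6346   5.8%
Other values (91)  19423  17.8%

Common
Value             Count  Frequency (%)
(space)           27963  39.0%
1                 8876   12.4%
2                 4980   6.9%
3                 4498   6.3%
4                 3976   5.5%
5                 3725   5.2%
6                 3584   5.0%
-                 3295   4.6%
7                 3122   4.4%
8                 2709   3.8%
Other values (2)  5023   7.0%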

Most occurring blocks

Value   Count   Frequency (%)
Hangul  109243  60.4%
ASCII   71751   39.6%

Most frequent character per block

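ASCII
Value             Count  Frequency (%)
(space)           27963  39.0%
1                 8876   12.4%
2                 4980   6.9%
3                 4498   6.3%
4                 3976   5.5%
5                 3725   5.2%
6                 3584   5.0%
-                 3295   4.6%
7                 3122   4.4%
8                 2709   3.8%
Other values (2)  5023   7.0%

Hangul
Value              Count  Frequency (%)
[not preserved]    11299  10.3%
[not preserved]    9690   8.9%
[not preserved]    9534   8.7%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9321   8.5%
[not preserved]    9231   8.4%
[not preserved]    6436   5.9%
[not preserved]    6346   5.8%
Other values (91)  19423  17.8%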

주용도
Categorical

IMBALANCE 

Distinct: 27
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
단독주택: 7057
제2종근린생활시설: 957
제1종근린생활시설: 782
<NA>: 588
교육연구시설: 95
Other values (22): 521

Length

Max length: 10
Median length: 4
Mean length: 4.9446
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 단독주택
2nd row: 제2종근린생활시설
3rd row: 제2종근린생활시설
4th row: 단독주택
5th row: <NA>

Common Values

Value              Count  Frequency (%)
단독주택           7057   70.6%
제2종근린생활시설  957    9.6%
제1종근린생활시설  782    7.8%
<NA>               588    5.9%
교육연구시설       95     0.9%
숙박시설           88     0.9%
창고시설           74     0.7%
노유자시설         59     0.6%
업무시설           41     0.4%
문화및집회시설     41     0.4%
Other values (17)  218    2.2%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
단독주택           7057   70.6%
제2종근린생활시설  957    9.6%
제1종근린생활시설  782    7.8%
na                 588    5.9%
교육연구시설       95     0.9%
숙박시설           88     0.9%
창고시설           74     0.7%
노유자시설         59     0.6%
업무시설           41     0.4%
문화및집회시설     41     0.4%
Other values (17)  218    2.2%

Interactions


Correlations

        연번   주용도
연번    1.000  0.281
주용도  0.281  1.000

        연번   주용도
연번    1.000  0.105
주용도  0.105  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

(index)  연번   대지위치주소                     도로명주소                             주용도
31236    31237  대전광역시 중구 문창동 0122-0006  대전광역시 중구 문창로18번길 17          단독주택
24418    24419  대전광역시 중구 태평동 0339-0001  대전광역시 중구 수침로 105               제2종근린생활시설
24213    24214  대전광역시 중구 대사동 0247-0113  대전광역시 중구 솔밭로 8                 제2종근린생활시설
1904     1905   대전광역시 중구 석교동 0014-0010  대전광역시 중구 보문로47번길 100         단독주택
12789    12790  대전광역시 중구 석교동 0073-0009  대전광역시 중구 대전천서로181번길 20     <NA>
34094    34095  대전광역시 중구 용두동 0087-0002  <NA>                                   단독주택
6409     6410   대전광역시 중구 옥계동 0005-0005  대전광역시 중구 대종로62번길 112         제2종근린생활시설
400      401    대전광역시 중구 부사동 0224-0002  대전광역시 중구 보문로113번길 15-1       <NA>
15078    15079  대전광역시 중구 문창동 0367-0023  대전광역시 중구 보문로 18-6              단독주택
7465     7466   대전광역시 중구 문화동 0677-0015  대전광역시 중구 당디로50번길 71-13       단독주택

(index)  연번   대지위치주소                     도로명주소                             주용도
30648    30649  대전광역시 중구 중촌동 0071-0013  대전광역시 중구 대종로692번길 32-7       단독주택
33352    33353  대전광역시 중구 문창동 0364-0004  <NA>                                   단독주택
34301    34302  대전광역시 중구 태평동 0405-0003  <NA>                                   단독주택
23074    23075  대전광역시 중구 태평동 0346-0001  대전광역시 중구 태평로152번길 4          제2종근린생활시설
7116     7117   대전광역시 중구 부사동 0153-0025  대전광역시 중구 사득로79번길 76          단독주택
11438    11439  대전광역시 중구 목달동 0036       대전광역시 중구 남달미로95번길 109       단독주택
11024    11025  대전광역시 중구 태평동 0369-0021  대전광역시 중구 동서대로1193번길 51-2    단독주택
21919    21920  대전광역시 중구 유천동 0325-0025  대전광역시 중구 태평로14번길 21-24       단독주택
34369    34370  대전광역시 중구 태평동 0313-0049  <NA>                                   단독주택
290      291    대전광역시 중구 부사동 0230-0077  대전광역시 중구 보문로111번길 22         단독주택