Overview

Dataset statistics

Number of variables: 6
Number of observations: 91
Missing cells: 3
Missing cells (%): 0.5%
Duplicate rows: 6
Duplicate rows (%): 6.6%
Total size in memory: 4.4 KiB
Average record size in memory: 49.5 B
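
The derived figures above can be checked from the raw counts. This is a minimal sketch, assuming the missing-cell percentage is measured against variables × observations and that 1 KiB is 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Raw counts as reported in the dataset statistics.
n_variables = 6
n_observations = 91
missing_cells = 3
duplicate_rows = 6
total_size_kib = 4.4

# Assumption: the missing-cell percentage is relative to all cells in the table.
total_cells = n_variables * n_observations
missing_pct = missing_cells / total_cells * 100
duplicate_pct = duplicate_rows / n_observations * 100

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values round to the reported 0.5%, 6.6% and 49.5 B.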

Variable types

Text: 3
Categorical: 3

Dataset

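Description: Data on buildings subject to asbestos survey, with fields 건물명 (building name), 동면, 주소 (address), 구분 (classification), 소유자 (owner)
Author: 강원도 영월군 (Yeongwol County, Gangwon Province)
URL: https://www.data.go.kr/data/15053446/fileData.do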

Alerts

Dataset has 6 (6.6%) duplicate rows [Duplicates]
소유자 is highly overall correlated with 구분(대분류) [High correlation]
구분(소분류) is highly overall correlated with 구분(대분류) [High correlation]
구분(대분류) is highly overall correlated with 구분(소분류) and 1 other field [High correlation]
구분(대분류) is highly imbalanced (91.3%) [Imbalance]
건물명 has 1 (1.1%) missing value [Missing]
동명 has 1 (1.1%) missing value [Missing]
주소 has 1 (1.1%) missing value [Missing]
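
The 91.3% imbalance figure for 구분(대분류) is consistent with an entropy-based score of the form 1 − H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below assumes that formula; it is not taken from the report, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Assumption: this mirrors the kind of score behind the Imbalance alert.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts for 구분(대분류): 공공건축물 = 90, <NA> = 1.
score = imbalance_score([90, 1])
print(f"{score:.1%}")
```

A perfectly balanced column, for example `imbalance_score([50, 50])`, scores 0.0 under the same formula.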

Reproduction

Analysis started: 2023-12-12 16:35:41.651049
Analysis finished: 2023-12-12 16:35:42.571339
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

건물명
Text

MISSING 

Distinct: 63
Distinct (%): 70.0%
Missing: 1
Missing (%): 1.1%
Memory size: 860.0 B

Length

Max length: 20
Median length: 14
Mean length: 7.8111111
Min length: 4

Characters and Unicode

Total characters: 703
Distinct characters: 134
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
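
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This sketch counts categories for one sample value; the helper name `category_counts` is hypothetical, and the mapping of two-letter codes to the report's labels (Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation) follows the Unicode general category names.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (two-letter codes) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample value of 건물명.
sample = "(구)보건소"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Summing such counts over every value in a column would give a per-category breakdown of the kind shown under "Most occurring categories".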

Unique

Unique: 54
Unique (%): 60.0%

Sample

1st row: (구)보건소
2nd row: 강원도_병원
3rd row: 강원도_진규폐병동
4th row: 강원청 영월서
5th row: 김삿갓면복지회관

Value | Count | Frequency (%)
영월군 | 13 | 11.5%
영월교도소 | 10 | 8.8%
한국남부발전 | 10 | 8.8%
영월군청 | 3 | 2.7%
서남농업협동조합 | 3 | 2.7%
농기계연수동 | 2 | 1.8%
한국전력공사 | 2 | 1.8%
환경시설관리사업소 | 2 | 1.8%
영월교육지원청 | 2 | 1.8%
농업기술센터 | 2 | 1.8%
Other values (62) | 64 | 56.6%

Most occurring characters

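Value | Count | Frequency (%)
[character lost] | 50 | 7.1%
[character lost] | 50 | 7.1%
[character lost] | 27 | 3.8%
(space) | 24 | 3.4%
[character lost] | 22 | 3.1%
[character lost] | 21 | 3.0%
[character lost] | 19 | 2.7%
[character lost] | 18 | 2.6%
[character lost] | 18 | 2.6%
[character lost] | 17 | 2.4%
Other values (124) | 437 | 62.2%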

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 667 | 94.9%
Space Separator | 24 | 3.4%
Close Punctuation | 4 | 0.6%
Open Punctuation | 4 | 0.6%
Connector Punctuation | 2 | 0.3%
Dash Punctuation | 2 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 50 | 7.5%
[character lost] | 50 | 7.5%
[character lost] | 27 | 4.0%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.1%
[character lost] | 19 | 2.8%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.5%
[character lost] | 17 | 2.5%
Other values (119) | 408 | 61.2%

Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

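Value | Count | Frequency (%)
Hangul | 667 | 94.9%
Common | 36 | 5.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[character lost] | 50 | 7.5%
[character lost] | 50 | 7.5%
[character lost] | 27 | 4.0%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.1%
[character lost] | 19 | 2.8%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.5%
[character lost] | 17 | 2.5%
Other values (119) | 408 | 61.2%

Common
Value | Count | Frequency (%)
(space) | 24 | 66.7%
) | 4 | 11.1%
( | 4 | 11.1%
_ | 2 | 5.6%
- | 2 | 5.6%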

Most occurring blocks

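Value | Count | Frequency (%)
Hangul | 667 | 94.9%
ASCII | 36 | 5.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[character lost] | 50 | 7.5%
[character lost] | 50 | 7.5%
[character lost] | 27 | 4.0%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.1%
[character lost] | 19 | 2.8%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.5%
[character lost] | 17 | 2.5%
Other values (119) | 408 | 61.2%

ASCII
Value | Count | Frequency (%)
(space) | 24 | 66.7%
) | 4 | 11.1%
( | 4 | 11.1%
_ | 2 | 5.6%
- | 2 | 5.6%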

동명
Text

MISSING 

Distinct: 63
Distinct (%): 70.0%
Missing: 1
Missing (%): 1.1%
Memory size: 860.0 B

Length

Max length: 20
Median length: 14
Mean length: 7.7666667
Min length: 3

Characters and Unicode

Total characters: 699
Distinct characters: 131
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 60.0%

Sample

1st row: (구)보건소
2nd row: 강원도_병원
3rd row: 강원도_진규폐병동
4th row: 강원청 영월서
5th row: 김삿갓면복지회관

Value | Count | Frequency (%)
영월군 | 14 | 12.4%
영월교도소 | 10 | 8.8%
한국남부발전 | 10 | 8.8%
영월군청 | 3 | 2.7%
서남농업협동조합 | 3 | 2.7%
농기계연수동 | 2 | 1.8%
환경시설관리사업소 | 2 | 1.8%
영월교육지원청 | 2 | 1.8%
한국전력공사 | 2 | 1.8%
농업기술센터 | 2 | 1.8%
Other values (61) | 63 | 55.8%

Most occurring characters

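Value | Count | Frequency (%)
[character lost] | 51 | 7.3%
[character lost] | 51 | 7.3%
[character lost] | 27 | 3.9%
(space) | 24 | 3.4%
[character lost] | 22 | 3.1%
[character lost] | 21 | 3.0%
[character lost] | 20 | 2.9%
[character lost] | 18 | 2.6%
[character lost] | 18 | 2.6%
[character lost] | 17 | 2.4%
Other values (121) | 430 | 61.5%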

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 663 | 94.8%
Space Separator | 24 | 3.4%
Open Punctuation | 4 | 0.6%
Close Punctuation | 4 | 0.6%
Connector Punctuation | 2 | 0.3%
Dash Punctuation | 2 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 51 | 7.7%
[character lost] | 51 | 7.7%
[character lost] | 27 | 4.1%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.2%
[character lost] | 20 | 3.0%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.6%
[character lost] | 17 | 2.6%
Other values (116) | 401 | 60.5%

Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

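Value | Count | Frequency (%)
Hangul | 663 | 94.8%
Common | 36 | 5.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[character lost] | 51 | 7.7%
[character lost] | 51 | 7.7%
[character lost] | 27 | 4.1%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.2%
[character lost] | 20 | 3.0%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.6%
[character lost] | 17 | 2.6%
Other values (116) | 401 | 60.5%

Common
Value | Count | Frequency (%)
(space) | 24 | 66.7%
( | 4 | 11.1%
) | 4 | 11.1%
_ | 2 | 5.6%
- | 2 | 5.6%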

Most occurring blocks

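Value | Count | Frequency (%)
Hangul | 663 | 94.8%
ASCII | 36 | 5.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[character lost] | 51 | 7.7%
[character lost] | 51 | 7.7%
[character lost] | 27 | 4.1%
[character lost] | 22 | 3.3%
[character lost] | 21 | 3.2%
[character lost] | 20 | 3.0%
[character lost] | 18 | 2.7%
[character lost] | 18 | 2.7%
[character lost] | 17 | 2.6%
[character lost] | 17 | 2.6%
Other values (116) | 401 | 60.5%

ASCII
Value | Count | Frequency (%)
(space) | 24 | 66.7%
( | 4 | 11.1%
) | 4 | 11.1%
_ | 2 | 5.6%
- | 2 | 5.6%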

주소
Text

MISSING 

Distinct: 57
Distinct (%): 63.3%
Missing: 1
Missing (%): 1.1%
Memory size: 860.0 B

Length

Max length: 25
Median length: 24
Mean length: 21.577778
Min length: 19

Characters and Unicode

Total characters: 1942
Distinct characters: 80
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 47.8%

Sample

1st row: 강원도 영월군 영월읍 하송로 46-43
2nd row: 강원도 영월군 영월읍 단종로 47-4
3rd row: 강원도 영월군 영월읍 단종로 47-4
4th row: 강원도 영월군 영월읍 단종로 9-0
5th row: 강원도 영월군 김삿갓면 영월동로 1644-0

Value | Count | Frequency (%)
강원도 | 90 | 20.0%
영월군 | 90 | 20.0%
영월읍 | 60 | 13.3%
중앙로 | 13 | 2.9%
273-0 | 10 | 2.2%
팔괴로 | 10 | 2.2%
110-27 | 10 | 2.2%
주천면 | 7 | 1.6%
영월로 | 7 | 1.6%
단종로 | 7 | 1.6%
Other values (90) | 146 | 32.4%

Most occurring characters

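Value | Count | Frequency (%)
(space) | 360 | 18.5%
[character lost] | 161 | 8.3%
[character lost] | 161 | 8.3%
[character lost] | 95 | 4.9%
[character lost] | 92 | 4.7%
[character lost] | 90 | 4.6%
[character lost] | 90 | 4.6%
- | 90 | 4.6%
0 | 88 | 4.5%
[character lost] | 69 | 3.6%
Other values (70) | 646 | 33.3%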

Most occurring categories

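Value | Count | Frequency (%)
Other Letter | 1129 | 58.1%
Decimal Number | 363 | 18.7%
Space Separator | 360 | 18.5%
Dash Punctuation | 90 | 4.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[character lost] | 161 | 14.3%
[character lost] | 161 | 14.3%
[character lost] | 95 | 8.4%
[character lost] | 92 | 8.1%
[character lost] | 90 | 8.0%
[character lost] | 90 | 8.0%
[character lost] | 69 | 6.1%
[character lost] | 63 | 5.6%
[character lost] | 27 | 2.4%
[character lost] | 26 | 2.3%
Other values (58) | 255 | 22.6%

Decimal Number
Value | Count | Frequency (%)
0 | 88 | 24.2%
1 | 68 | 18.7%
2 | 40 | 11.0%
3 | 36 | 9.9%
7 | 35 | 9.6%
4 | 25 | 6.9%
9 | 23 | 6.3%
6 | 20 | 5.5%
8 | 15 | 4.1%
5 | 13 | 3.6%

Space Separator
Value | Count | Frequency (%)
(space) | 360 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 90 | 100.0%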

Most occurring scripts

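Value | Count | Frequency (%)
Hangul | 1129 | 58.1%
Common | 813 | 41.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[character lost] | 161 | 14.3%
[character lost] | 161 | 14.3%
[character lost] | 95 | 8.4%
[character lost] | 92 | 8.1%
[character lost] | 90 | 8.0%
[character lost] | 90 | 8.0%
[character lost] | 69 | 6.1%
[character lost] | 63 | 5.6%
[character lost] | 27 | 2.4%
[character lost] | 26 | 2.3%
Other values (58) | 255 | 22.6%

Common
Value | Count | Frequency (%)
(space) | 360 | 44.3%
- | 90 | 11.1%
0 | 88 | 10.8%
1 | 68 | 8.4%
2 | 40 | 4.9%
3 | 36 | 4.4%
7 | 35 | 4.3%
4 | 25 | 3.1%
9 | 23 | 2.8%
6 | 20 | 2.5%
Other values (2) | 28 | 3.4%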

Most occurring blocks

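Value | Count | Frequency (%)
Hangul | 1129 | 58.1%
ASCII | 813 | 41.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 360 | 44.3%
- | 90 | 11.1%
0 | 88 | 10.8%
1 | 68 | 8.4%
2 | 40 | 4.9%
3 | 36 | 4.4%
7 | 35 | 4.3%
4 | 25 | 3.1%
9 | 23 | 2.8%
6 | 20 | 2.5%
Other values (2) | 28 | 3.4%

Hangul
Value | Count | Frequency (%)
[character lost] | 161 | 14.3%
[character lost] | 161 | 14.3%
[character lost] | 95 | 8.4%
[character lost] | 92 | 8.1%
[character lost] | 90 | 8.0%
[character lost] | 90 | 8.0%
[character lost] | 69 | 6.1%
[character lost] | 63 | 5.6%
[character lost] | 27 | 2.4%
[character lost] | 26 | 2.3%
Other values (58) | 255 | 22.6%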

구분(대분류)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Value | Count
공공건축물 | 90
<NA> | 1

Length

Max length: 5
Median length: 5
Mean length: 4.989011
Min length: 4

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 공공건축물
2nd row: 공공건축물
3rd row: 공공건축물
4th row: 공공건축물
5th row: 공공건축물

Common Values

Value | Count | Frequency (%)
공공건축물 | 90 | 98.9%
<NA> | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공공건축물 | 90 | 98.9%
na | 1 | 1.1%

구분(소분류)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Value | Count
공공기관 | 67
특수법인 | 9
행정기관 | 8
지방공사.공단 | 6
<NA> | 1

Length

Max length: 7
Median length: 4
Mean length: 4.1978022
Min length: 4

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 공공기관
2nd row: 공공기관
3rd row: 공공기관
4th row: 공공기관
5th row: 공공기관

Common Values

Value | Count | Frequency (%)
공공기관 | 67 | 73.6%
특수법인 | 9 | 9.9%
행정기관 | 8 | 8.8%
지방공사.공단 | 6 | 6.6%
<NA> | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공공기관 | 67 | 73.6%
특수법인 | 9 | 9.9%
행정기관 | 8 | 8.8%
지방공사.공단 | 6 | 6.6%
na | 1 | 1.1%

소유자
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Value | Count
영월군 | 42
영월교도소 | 10
한국남부발전 | 10
강원지방우정청 | 3
서남농업협동조합 | 3
Other values (19) | 23

Length

Max length: 20
Median length: 16
Mean length: 5.2527473
Min length: 3

Unique

Unique: 16
Unique (%): 17.6%

Sample

1st row: 영월군
2nd row: 강원도_병원
3rd row: 강원도_진규폐병동
4th row: 강원청 영월서
5th row: 영월군

Common Values

Value | Count | Frequency (%)
영월군 | 42 | 46.2%
영월교도소 | 10 | 11.0%
한국남부발전 | 10 | 11.0%
강원지방우정청 | 3 | 3.3%
서남농업협동조합 | 3 | 3.3%
영월교육지원청 | 3 | 3.3%
지식경제부 | 2 | 2.2%
영월군산림조합 | 2 | 2.2%
강원도청 | 1 | 1.1%
강원도_진규폐병동 | 1 | 1.1%
Other values (14) | 14 | 15.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
영월군 | 42 | 43.3%
한국남부발전 | 10 | 10.3%
영월교도소 | 10 | 10.3%
강원지방우정청 | 3 | 3.1%
서남농업협동조합 | 3 | 3.1%
영월교육지원청 | 3 | 3.1%
동부지방산림청 | 2 | 2.1%
지식경제부 | 2 | 2.1%
영월군산림조합 | 2 | 2.1%
영월출장소 | 1 | 1.0%
Other values (19) | 19 | 19.6%

Correlations

 | 건물명 | 동명 | 주소 | 구분(소분류) | 소유자
건물명 | 1.000 | 1.000 | 0.999 | 0.986 | 0.997
동명 | 1.000 | 1.000 | 0.999 | 0.986 | 0.997
주소 | 0.999 | 0.999 | 1.000 | 0.961 | 0.985
구분(소분류) | 0.986 | 0.986 | 0.961 | 1.000 | 0.792
소유자 | 0.997 | 0.997 | 0.985 | 0.792 | 1.000

 | 소유자 | 구분(소분류) | 구분(대분류)
소유자 | 1.000 | 0.493 | 1.000
구분(소분류) | 0.493 | 1.000 | 1.000
구분(대분류) | 1.000 | 1.000 | 1.000

 | 구분(대분류) | 구분(소분류) | 소유자
구분(대분류) | 1.000 | 1.000 | 1.000
구분(소분류) | 1.000 | 1.000 | 0.493
소유자 | 1.000 | 0.493 | 1.000
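
The report does not state which association measure these matrices use. One common choice for pairs of categorical variables is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below is an uncorrected version written for illustration only; the function name `cramers_v` and the toy inputs are hypothetical, and it is not claimed to reproduce the values above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row = Counter(xs)              # marginal counts for xs
    col = Counter(ys)              # marginal counts for ys

    # Chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(chi2 / (n * k))

# Toy data: the first pair is perfectly associated, the second is independent.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

When one column is almost constant, as 구분(대분류) is here, a single deviating row can dominate such a statistic, so a very high coefficient is worth reading together with the imbalance alert.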

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

 | 건물명 | 동명 | 주소 | 구분(대분류) | 구분(소분류) | 소유자
0 | (구)보건소 | (구)보건소 | 강원도 영월군 영월읍 하송로 46-43 | 공공건축물 | 공공기관 | 영월군
1 | 강원도_병원 | 강원도_병원 | 강원도 영월군 영월읍 단종로 47-4 | 공공건축물 | 공공기관 | 강원도_병원
2 | 강원도_진규폐병동 | 강원도_진규폐병동 | 강원도 영월군 영월읍 단종로 47-4 | 공공건축물 | 공공기관 | 강원도_진규폐병동
3 | 강원청 영월서 | 강원청 영월서 | 강원도 영월군 영월읍 단종로 9-0 | 공공건축물 | 공공기관 | 강원청 영월서
4 | 김삿갓면복지회관 | 김삿갓면복지회관 | 강원도 영월군 김삿갓면 영월동로 1644-0 | 공공건축물 | 공공기관 | 영월군
5 | 김삿갓면사무소 | 김삿갓면사무소 | 강원도 영월군 김삿갓면 옥동장터길 34-0 | 공공건축물 | 공공기관 | 영월군
6 | 난고김삿갓문학관 | 영월군 | 강원도 영월군 김삿갓면 김삿갓로 216-22 | 공공건축물 | 공공기관 | 영월군
7 | 덕포씨름장 | 천하장사의집 | 강원도 영월군 영월읍 덕포우회길 26-0 | 공공건축물 | 지방공사.공단 | 영월군
8 | 동부지방산림청 영월국유림관리소(청사) | 동부지방산림청 영월국유림관리소(청사) | 강원도 영월군 영월읍 영월로 1909-1 | 공공건축물 | 공공기관 | 동부지방산림청 영월국유림관리소(청사)
9 | 동부지방산림청 영월국유림관리소(청사) | 동부지방산림청 영월국유림관리소(청사) | 강원도 영월군 영월읍 영월로 1909-1 | 공공건축물 | 행정기관 | 동부지방산림청 영월국유림관리소

 | 건물명 | 동명 | 주소 | 구분(대분류) | 구분(소분류) | 소유자
81 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전
82 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전
83 | 한국전력공사 영월지사-별관 | 한국전력공사 영월지사-별관 | 강원도 영월군 영월읍 중앙로 239-0 | 공공건축물 | 공공기관 | 지식경제부
84 | 한국전력공사 영월지사-본관 | 한국전력공사 영월지사-본관 | 강원도 영월군 영월읍 중앙로 239-0 | 공공건축물 | 특수법인 | 지식경제부
85 | 한국철도시설공단 | 한국철도시설공단 | 강원도 영월군 영월읍 중리2길 7-4 | 공공건축물 | 특수법인 | 한국철도시설공단
86 | 한반도면복지회관 | 한반도면복지회관 | 강원도 영월군 한반도면 서강로 793-0 | 공공건축물 | 행정기관 | 영월군
87 | 한반도면사무소 | 한반도면사무소 | 강원도 영월군 한반도면 신천길 6-6 | 공공건축물 | 공공기관 | 영월군
88 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군
89 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군
90 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

 | 건물명 | 동명 | 주소 | 구분(대분류) | 구분(소분류) | 소유자 | # duplicates
0 | 영월교도소 | 영월교도소 | 강원도 영월군 영월읍 팔괴로 110-27 | 공공건축물 | 공공기관 | 영월교도소 | 10
4 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전 | 10
1 | 영월군 농기계연수동 | 영월군 농기계연수동 | 강원도 영월군 북면 영월로 1315-0 | 공공건축물 | 공공기관 | 영월군 | 2
2 | 영월군 농업기술센터 | 영월군 농업기술센터 | 강원도 영월군 영월읍 덕포우회길 329-0 | 공공건축물 | 공공기관 | 영월군 | 2
3 | 영월군청 | 영월군청 | 강원도 영월군 영월읍 하송로 64-0 | 공공건축물 | 행정기관 | 영월군 | 2
5 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군 | 2
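
A summary like the one above can be produced by counting identical records and keeping those that occur more than once. This is a minimal sketch using only the standard library; the records are hypothetical, shortened to two fields (건물명, 소유자) rather than the full six columns, and the helper name `duplicate_summary` is illustrative.

```python
from collections import Counter

# Hypothetical, shortened records: (건물명, 소유자) only.
rows = [
    ("영월교도소", "영월교도소"),
    ("영월교도소", "영월교도소"),
    ("영월교도소", "영월교도소"),
    ("영월군청", "영월군"),
    ("영월군청", "영월군"),
    ("한국철도시설공단", "한국철도시설공단"),
]

def duplicate_summary(records):
    """Return (record, count) pairs for records seen more than once, most frequent first."""
    counts = Counter(records)
    repeated = [(rec, n) for rec, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: -item[1])

summary = duplicate_summary(rows)
for record, n in summary:
    print(record, n)
```

The table above lists six distinct rows, which matches the "6 duplicate rows" figure in the dataset statistics if that figure counts distinct repeated rows rather than every repeated occurrence.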