Overview

Dataset statistics

Number of variables: 7
Number of observations: 91
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 6
Duplicate rows (%): 6.6%
Total size in memory: 5.1 KiB
Average record size in memory: 57.5 B

Variable types

Unsupported: 1
Text: 3
Categorical: 3

Dataset

Description: Buildings subject to asbestos survey (석면조사 대상 건축물)
Author: Yeongwol-gun, Gangwon-do (강원도 영월군)
URL: https://www.data.go.kr/data/15053445/fileData.do

Alerts

Dataset has 6 (6.6%) duplicate rows [Duplicates]
Unnamed: 5 is highly overall correlated with Unnamed: 4 and 1 other fields [High correlation]
Unnamed: 4 is highly overall correlated with Unnamed: 5 and 1 other fields [High correlation]
Unnamed: 6 is highly overall correlated with Unnamed: 4 and 1 other fields [High correlation]
Unnamed: 4 is highly imbalanced (91.3%) [Imbalance]
석면조사대상건축물조회 리스트 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
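
The 91.3% imbalance figure for Unnamed: 4 is consistent with one minus the normalized Shannon entropy of that column's value counts (90 and 1, as listed under the variable). The following is a minimal sketch under that assumption; the function name `imbalance_score` and the choice to score a single-class column as 0 are illustrative and are not taken from the report.

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the value distribution and k is the number of distinct values.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    counts = Counter(values)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is scored 0 here
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1.0 - entropy / math.log2(k)


# Counts reported for Unnamed: 4: 90 rows of one value plus the stray label row.
column = ["공공건축물"] * 90 + ["구분 (대분류)"]
score = imbalance_score(column)
print(f"{score:.1%}")
```

Because the single odd value is the column label that was read as data, the imbalance would likely change once that row is treated as a header.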

Reproduction

Analysis started: 2023-12-12 06:07:39.375170
Analysis finished: 2023-12-12 06:07:40.130186
Duration: 0.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

석면조사대상건축물조회 리스트
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Unnamed: 1
Text

Distinct: 64
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 20
Median length: 15
Mean length: 7.7692308
Min length: 3

Characters and Unicode

Total characters: 707
Distinct characters: 136
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
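
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on three of the sample values from this report. The helper name `category_counts` is hypothetical, and the mapping covers only the general categories that this report lists.

```python
import unicodedata
from collections import Counter

# Short general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Cc": "Control",
}


def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        for ch in s:
            code = unicodedata.category(ch)
            # Fall back to the short code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["(구)보건소", "강원도_병원", "강원청 영월서"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for the text variables. Script and block names are not exposed by `unicodedata`, so they are left out of this sketch.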

Unique

Unique: 55
Unique (%): 60.4%

Sample

1st row: 건물명
2nd row: (구)보건소
3rd row: 강원도_병원
4th row: 강원도_진규폐병동
5th row: 강원청 영월서
Value | Count | Frequency (%)
영월군 | 13 | 11.4%
영월교도소 | 10 | 8.8%
한국남부발전 | 10 | 8.8%
영월군청 | 3 | 2.6%
서남농업협동조합 | 3 | 2.6%
농기계연수동 | 2 | 1.8%
환경시설관리사업소 | 2 | 1.8%
영월교육지원청 | 2 | 1.8%
한국전력공사 | 2 | 1.8%
농업기술센터 | 2 | 1.8%
Other values (63) | 65 | 57.0%

Most occurring characters

ValueCountFrequency (%)
50
 
7.1%
50
 
7.1%
27
 
3.8%
24
 
3.4%
22
 
3.1%
21
 
3.0%
19
 
2.7%
18
 
2.5%
18
 
2.5%
17
 
2.4%
Other values (126) 441
62.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 670
94.8%
Space Separator 24
 
3.4%
Open Punctuation 4
 
0.6%
Close Punctuation 4
 
0.6%
Dash Punctuation 2
 
0.3%
Connector Punctuation 2
 
0.3%
Control 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
50
 
7.5%
50
 
7.5%
27
 
4.0%
22
 
3.3%
21
 
3.1%
19
 
2.8%
18
 
2.7%
18
 
2.7%
17
 
2.5%
17
 
2.5%
Other values (120) 411
61.3%
Space Separator
ValueCountFrequency (%)
24
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 670
94.8%
Common 37
 
5.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
50
 
7.5%
50
 
7.5%
27
 
4.0%
22
 
3.3%
21
 
3.1%
19
 
2.8%
18
 
2.7%
18
 
2.7%
17
 
2.5%
17
 
2.5%
Other values (120) 411
61.3%
Common
ValueCountFrequency (%)
24
64.9%
( 4
 
10.8%
) 4
 
10.8%
- 2
 
5.4%
_ 2
 
5.4%
1
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 670
94.8%
ASCII 37
 
5.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
50
 
7.5%
50
 
7.5%
27
 
4.0%
22
 
3.3%
21
 
3.1%
19
 
2.8%
18
 
2.7%
18
 
2.7%
17
 
2.5%
17
 
2.5%
Other values (120) 411
61.3%
ASCII
ValueCountFrequency (%)
24
64.9%
( 4
 
10.8%
) 4
 
10.8%
- 2
 
5.4%
_ 2
 
5.4%
1
 
2.7%

Unnamed: 2
Text

Distinct: 64
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 20
Median length: 15
Mean length: 7.7142857
Min length: 2

Characters and Unicode

Total characters: 702
Distinct characters: 133
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 55
Unique (%): 60.4%

Sample

1st row: 동명
2nd row: (구)보건소
3rd row: 강원도_병원
4th row: 강원도_진규폐병동
5th row: 강원청 영월서
Value | Count | Frequency (%)
영월군 | 14 | 12.3%
영월교도소 | 10 | 8.8%
한국남부발전 | 10 | 8.8%
영월군청 | 3 | 2.6%
서남농업협동조합 | 3 | 2.6%
농기계연수동 | 2 | 1.8%
환경시설관리사업소 | 2 | 1.8%
영월교육지원청 | 2 | 1.8%
한국전력공사 | 2 | 1.8%
농업기술센터 | 2 | 1.8%
Other values (62) | 64 | 56.1%

Most occurring characters

ValueCountFrequency (%)
51
 
7.3%
51
 
7.3%
27
 
3.8%
24
 
3.4%
22
 
3.1%
21
 
3.0%
20
 
2.8%
18
 
2.6%
18
 
2.6%
18
 
2.6%
Other values (123) 432
61.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 665
94.7%
Space Separator 24
 
3.4%
Open Punctuation 4
 
0.6%
Close Punctuation 4
 
0.6%
Dash Punctuation 2
 
0.3%
Connector Punctuation 2
 
0.3%
Control 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
51
 
7.7%
51
 
7.7%
27
 
4.1%
22
 
3.3%
21
 
3.2%
20
 
3.0%
18
 
2.7%
18
 
2.7%
18
 
2.7%
17
 
2.6%
Other values (117) 402
60.5%
Space Separator
ValueCountFrequency (%)
24
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 665
94.7%
Common 37
 
5.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
51
 
7.7%
51
 
7.7%
27
 
4.1%
22
 
3.3%
21
 
3.2%
20
 
3.0%
18
 
2.7%
18
 
2.7%
18
 
2.7%
17
 
2.6%
Other values (117) 402
60.5%
Common
ValueCountFrequency (%)
24
64.9%
( 4
 
10.8%
) 4
 
10.8%
- 2
 
5.4%
_ 2
 
5.4%
1
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 665
94.7%
ASCII 37
 
5.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
51
 
7.7%
51
 
7.7%
27
 
4.1%
22
 
3.3%
21
 
3.2%
20
 
3.0%
18
 
2.7%
18
 
2.7%
18
 
2.7%
17
 
2.6%
Other values (117) 402
60.5%
ASCII
ValueCountFrequency (%)
24
64.9%
( 4
 
10.8%
) 4
 
10.8%
- 2
 
5.4%
_ 2
 
5.4%
1
 
2.7%

Unnamed: 3
Text

Distinct: 58
Distinct (%): 63.7%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 25
Median length: 24
Mean length: 21.362637
Min length: 2

Characters and Unicode

Total characters: 1944
Distinct characters: 81
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 48.4%

Sample

1st row: 주소
2nd row: 강원도 영월군 영월읍 하송로 46-43
3rd row: 강원도 영월군 영월읍 단종로 47-4
4th row: 강원도 영월군 영월읍 단종로 47-4
5th row: 강원도 영월군 영월읍 단종로 9-0
Value | Count | Frequency (%)
강원도 | 90 | 20.0%
영월군 | 90 | 20.0%
영월읍 | 60 | 13.3%
중앙로 | 13 | 2.9%
273-0 | 10 | 2.2%
팔괴로 | 10 | 2.2%
110-27 | 10 | 2.2%
영월로 | 7 | 1.6%
주천면 | 7 | 1.6%
단종로 | 7 | 1.6%
Other values (91) | 147 | 32.6%

Most occurring characters

ValueCountFrequency (%)
360
18.5%
161
 
8.3%
161
 
8.3%
95
 
4.9%
92
 
4.7%
90
 
4.6%
- 90
 
4.6%
90
 
4.6%
0 88
 
4.5%
69
 
3.5%
Other values (71) 648
33.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1131
58.2%
Decimal Number 363
 
18.7%
Space Separator 360
 
18.5%
Dash Punctuation 90
 
4.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
161
14.2%
161
14.2%
95
 
8.4%
92
 
8.1%
90
 
8.0%
90
 
8.0%
69
 
6.1%
63
 
5.6%
27
 
2.4%
26
 
2.3%
Other values (59) 257
22.7%
Decimal Number
ValueCountFrequency (%)
0 88
24.2%
1 68
18.7%
2 40
11.0%
3 36
9.9%
7 35
 
9.6%
4 25
 
6.9%
9 23
 
6.3%
6 20
 
5.5%
8 15
 
4.1%
5 13
 
3.6%
Space Separator
ValueCountFrequency (%)
360
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 90
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1131
58.2%
Common 813
41.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
161
14.2%
161
14.2%
95
 
8.4%
92
 
8.1%
90
 
8.0%
90
 
8.0%
69
 
6.1%
63
 
5.6%
27
 
2.4%
26
 
2.3%
Other values (59) 257
22.7%
Common
ValueCountFrequency (%)
360
44.3%
- 90
 
11.1%
0 88
 
10.8%
1 68
 
8.4%
2 40
 
4.9%
3 36
 
4.4%
7 35
 
4.3%
4 25
 
3.1%
9 23
 
2.8%
6 20
 
2.5%
Other values (2) 28
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1131
58.2%
ASCII 813
41.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
360
44.3%
- 90
 
11.1%
0 88
 
10.8%
1 68
 
8.4%
2 40
 
4.9%
3 36
 
4.4%
7 35
 
4.3%
4 25
 
3.1%
9 23
 
2.8%
6 20
 
2.5%
Other values (2) 28
 
3.4%
Hangul
ValueCountFrequency (%)
161
14.2%
161
14.2%
95
 
8.4%
92
 
8.1%
90
 
8.0%
90
 
8.0%
69
 
6.1%
63
 
5.6%
27
 
2.4%
26
 
2.3%
Other values (59) 257
22.7%

Unnamed: 4
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 8
Median length: 5
Mean length: 5.032967
Min length: 5

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 구분 (대분류)
2nd row: 공공건축물
3rd row: 공공건축물
4th row: 공공건축물
5th row: 공공건축물

Common Values

Value | Count | Frequency (%)
공공건축물 | 90 | 98.9%
구분 (대분류) | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공공건축물 | 90 | 97.8%
구분 | 1 | 1.1%
대분류 | 1 | 1.1%

Unnamed: 5
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 8
Median length: 4
Mean length: 4.2417582
Min length: 4

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 구분 (소분류)
2nd row: 공공기관
3rd row: 공공기관
4th row: 공공기관
5th row: 공공기관

Common Values

Value | Count | Frequency (%)
공공기관 | 67 | 73.6%
특수법인 | 9 | 9.9%
행정기관 | 8 | 8.8%
지방공사.공단 | 6 | 6.6%
구분 (소분류) | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공공기관 | 67 | 72.8%
특수법인 | 9 | 9.8%
행정기관 | 8 | 8.7%
지방공사.공단 | 6 | 6.5%
구분 | 1 | 1.1%
소분류 | 1 | 1.1%

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 20
Median length: 16
Mean length: 5.2527473
Min length: 3

Unique

Unique: 16
Unique (%): 17.6%

Sample

1st row: 소유자
2nd row: 영월군
3rd row: 강원도_병원
4th row: 강원도_진규폐병동
5th row: 강원청 영월서

Common Values

Value | Count | Frequency (%)
영월군 | 42 | 46.2%
영월교도소 | 10 | 11.0%
한국남부발전 | 10 | 11.0%
강원지방우정청 | 3 | 3.3%
서남농업협동조합 | 3 | 3.3%
영월교육지원청 | 3 | 3.3%
영월군산림조합 | 2 | 2.2%
지식경제부 | 2 | 2.2%
영월세무서 | 1 | 1.1%
강원도_병원 | 1 | 1.1%
Other values (14) | 14 | 15.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
영월군 | 42 | 43.3%
한국남부발전 | 10 | 10.3%
영월교도소 | 10 | 10.3%
강원지방우정청 | 3 | 3.1%
서남농업협동조합 | 3 | 3.1%
영월교육지원청 | 3 | 3.1%
영월군산림조합 | 2 | 2.1%
지식경제부 | 2 | 2.1%
동부지방산림청 | 2 | 2.1%
춘천지방검찰청 | 1 | 1.0%
Other values (19) | 19 | 19.6%

Correlations

Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 0.999 | 1.000 | 0.992 | 0.998
Unnamed: 2 | 1.000 | 1.000 | 0.999 | 1.000 | 0.992 | 0.998
Unnamed: 3 | 0.999 | 0.999 | 1.000 | 1.000 | 0.977 | 0.987
Unnamed: 4 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 5 | 0.992 | 0.992 | 0.977 | 1.000 | 1.000 | 0.895
Unnamed: 6 | 0.998 | 0.998 | 0.987 | 1.000 | 0.895 | 1.000

Variable | Unnamed: 6 | Unnamed: 5 | Unnamed: 4
Unnamed: 6 | 1.000 | 0.610 | 0.868
Unnamed: 5 | 0.610 | 1.000 | 0.983
Unnamed: 4 | 0.868 | 0.983 | 1.000

Variable | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
Unnamed: 4 | 1.000 | 0.983 | 0.868
Unnamed: 5 | 0.983 | 1.000 | 0.610
Unnamed: 6 | 0.868 | 0.610 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

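First rows

Index | 석면조사대상건축물조회 리스트 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | 번호 | 건물명 | 동명 | 주소 | 구분 (대분류) | 구분 (소분류) | 소유자
1 | 1 | (구)보건소 | (구)보건소 | 강원도 영월군 영월읍 하송로 46-43 | 공공건축물 | 공공기관 | 영월군
2 | 2 | 강원도_병원 | 강원도_병원 | 강원도 영월군 영월읍 단종로 47-4 | 공공건축물 | 공공기관 | 강원도_병원
3 | 3 | 강원도_진규폐병동 | 강원도_진규폐병동 | 강원도 영월군 영월읍 단종로 47-4 | 공공건축물 | 공공기관 | 강원도_진규폐병동
4 | 4 | 강원청 영월서 | 강원청 영월서 | 강원도 영월군 영월읍 단종로 9-0 | 공공건축물 | 공공기관 | 강원청 영월서
5 | 5 | 김삿갓면복지회관 | 김삿갓면복지회관 | 강원도 영월군 김삿갓면 영월동로 1644-0 | 공공건축물 | 공공기관 | 영월군
6 | 6 | 김삿갓면사무소 | 김삿갓면사무소 | 강원도 영월군 김삿갓면 옥동장터길 34-0 | 공공건축물 | 공공기관 | 영월군
7 | 7 | 난고김삿갓문학관 | 영월군 | 강원도 영월군 김삿갓면 김삿갓로 216-22 | 공공건축물 | 공공기관 | 영월군
8 | 8 | 덕포씨름장 | 천하장사의집 | 강원도 영월군 영월읍 덕포우회길 26-0 | 공공건축물 | 지방공사.공단 | 영월군
9 | 9 | 동부지방산림청 영월국유림관리소(청사) | 동부지방산림청 영월국유림관리소(청사) | 강원도 영월군 영월읍 영월로 1909-1 | 공공건축물 | 공공기관 | 동부지방산림청 영월국유림관리소(청사)

Row 0 holds the original column labels: 번호 (number), 건물명 (building name), 동명 (wing name), 주소 (address), 구분 (대분류) (category, major), 구분 (소분류) (category, minor), 소유자 (owner).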

Last rows

Index | 석면조사대상건축물조회 리스트 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
81 | 81 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전
82 | 82 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전
83 | 83 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전
84 | 84 | 한국전력공사 영월지사-별관 | 한국전력공사 영월지사-별관 | 강원도 영월군 영월읍 중앙로 239-0 | 공공건축물 | 공공기관 | 지식경제부
85 | 85 | 한국전력공사 영월지사-본관 | 한국전력공사 영월지사-본관 | 강원도 영월군 영월읍 중앙로 239-0 | 공공건축물 | 특수법인 | 지식경제부
86 | 86 | 한국철도시설공단 | 한국철도시설공단 | 강원도 영월군 영월읍 중리2길 7-4 | 공공건축물 | 특수법인 | 한국철도시설공단
87 | 87 | 한반도면복지회관 | 한반도면복지회관 | 강원도 영월군 한반도면 서강로 793-0 | 공공건축물 | 행정기관 | 영월군
88 | 88 | 한반도면사무소 | 한반도면사무소 | 강원도 영월군 한반도면 신천길 6-6 | 공공건축물 | 공공기관 | 영월군
89 | 89 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군
90 | 90 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군
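
The sample shows the real column labels (번호, 건물명, 동명, ...) stored as data in row 0, while the columns themselves are named after a title string and `Unnamed: N`. That pattern usually means the source file has a title line above its header row. The sketch below re-reads such a layout with Python's standard `csv` module; the `raw` excerpt is a hypothetical reconstruction of the file's first lines, since the original file layout is not shown in this report.

```python
import csv
import io

# Hypothetical excerpt: a title line, then the real header, then data rows.
raw = (
    "석면조사대상건축물조회 리스트,,,,,,\n"
    "번호,건물명,동명,주소,구분 (대분류),구분 (소분류),소유자\n"
    "1,(구)보건소,(구)보건소,강원도 영월군 영월읍 하송로 46-43,공공건축물,공공기관,영월군\n"
    "2,강원도_병원,강원도_병원,강원도 영월군 영월읍 단종로 47-4,공공건축물,공공기관,강원도_병원\n"
)

reader = csv.reader(io.StringIO(raw))
next(reader)            # skip the title line
header = next(reader)   # the second line carries the real column labels
rows = [dict(zip(header, record)) for record in reader]

print(header)
print(rows[0]["소유자"])
```

If the file is loaded with pandas instead, passing `header=1` to `read_csv` is one way to get the same effect. Re-profiling after that change would probably remove the stray label values that currently appear as a category in Unnamed: 4 and Unnamed: 5.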

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | # duplicates
0 | 영월교도소 | 영월교도소 | 강원도 영월군 영월읍 팔괴로 110-27 | 공공건축물 | 공공기관 | 영월교도소 | 10
4 | 한국남부발전 | 한국남부발전 | 강원도 영월군 영월읍 중앙로 273-0 | 공공건축물 | 공공기관 | 한국남부발전 | 10
1 | 영월군 농기계연수동 | 영월군 농기계연수동 | 강원도 영월군 북면 영월로 1315-0 | 공공건축물 | 공공기관 | 영월군 | 2
2 | 영월군 농업기술센터 | 영월군 농업기술센터 | 강원도 영월군 영월읍 덕포우회길 329-0 | 공공건축물 | 공공기관 | 영월군 | 2
3 | 영월군청 | 영월군청 | 강원도 영월군 영월읍 하송로 64-0 | 공공건축물 | 행정기관 | 영월군 | 2
5 | 환경시설관리사업소 | 환경시설관리사업소 | 강원도 영월군 북면 굴앞마을길 48-60 | 공공건축물 | 공공기관 | 영월군 | 2
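
The overview's figure of 6 duplicate rows (6.6%) appears to match the six distinct rows listed in this table rather than the 28 occurrences they add up to, and the first (unsupported) column is not part of the comparison. The sketch below shows one way to group repeated rows and reproduce the percentage. The helper `duplicate_groups` and the two-column toy rows are hypothetical and are only modelled on the table.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical toy rows (building, owner) modelled on the table above.
rows = (
    [("영월교도소", "영월교도소")] * 10
    + [("영월군청", "영월군")] * 2
    + [("영월세무서", "영월세무서")]
)
groups = duplicate_groups(rows)
print(len(groups))

# Share reported in the overview: 6 duplicated groups out of 91 observations.
share = 6 / 91
print(f"{share:.1%}")
```

Whether these repeats are data-entry duplicates or separate buildings at the same address cannot be told from the report alone, so dropping them would be a judgment call.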