Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 51
Duplicate rows (%): 0.5%
Total size in memory: 390.6 KiB
Average record size in memory: 40.0 B
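
The two derived figures in this block follow directly from the raw ones. A minimal arithmetic check, assuming 1 KiB = 1024 B and that the percentage is taken over the number of observations (the variable names are illustrative, not the profiler's):

```python
# Raw figures copied from the "Dataset statistics" block of this report.
n_rows = 10_000
duplicate_rows = 51
total_size_kib = 390.6

# Derived figures: share of duplicate rows and average bytes per record.
duplicate_pct = 100 * duplicate_rows / n_rows
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both results agree with the values reported above.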

Variable types

Text: 1
Categorical: 3

Dataset

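Description: Under the Special Act on the Safety Control of Public Structures (시설물의안전관리에관한특별법), this dataset covers public facilities (excluding multi-unit housing) registered in the Integrated Facility Information Management System (시설물정보관리종합시스템) operated by the Korea Authority of Land and Infrastructure Safety (국토안전관리원). It is limited to aging facilities completed 30 or more years ago and gives each facility's name (시설물명), category (시설물구분), and location (소재지).
Author: 공공데이터포털 (Public Data Portal)
URL: https://www.data.go.kr/data/15083081/fileData.do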

Alerts

Dataset has 51 (0.5%) duplicate rows (Duplicates)
시설물구분 is highly overall correlated with 시설물종류 (High correlation)
시설물종류 is highly overall correlated with 시설물구분 (High correlation)

Reproduction

Analysis started: 2024-04-18 00:12:01.634599
Analysis finished: 2024-04-18 00:12:03.098398
Duration: 1.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설물명
Text

Distinct: 9686
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 26
Mean length: 8.585
Min length: 2

Characters and Unicode

Total characters: 85850
Distinct characters: 604
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
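
The general-category counts reported for this variable can be approximated with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own implementation: `category_counts` is a hypothetical helper, the input is only the five sample rows listed in this report, and the module returns two-letter codes (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation, `Ps`/`Pe` = Open/Close Punctuation). Scripts and blocks are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown in this report for the text variable.
sample = ["1-014 신설동역", "괴곡천교(복)", "고역교", "가대2교", "월삼교"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

On these five rows, Hangul syllables (`Lo`) dominate, which is consistent with Other Letter being by far the largest category in the full column.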

Unique

Unique: 9457
Unique (%): 94.6%

Sample

1st row: 1-014 신설동역
2nd row: 괴곡천교(복)
3rd row: 고역교
4th row: 가대2교
5th row: 월삼교

Value   Count   Frequency (%)
본관동   426   2.8%
교사동   344   2.3%
본관   315   2.1%
교사   167   1.1%
본관교사   118   0.8%
옹벽   89   0.6%
교사1호동   85   0.6%
후관동   65   0.4%
본관교사동   59   0.4%
교사1동   52   0.3%
Other values (10381)   13508   88.7%

Most occurring characters

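Value   Count   Frequency (%)
(character lost)   8544   10.0%
(space)   5236   6.1%
(character lost)   3647   4.2%
(character lost)   3428   4.0%
(character lost)   2612   3.0%
(character lost)   2318   2.7%
(character lost)   2145   2.5%
(character lost)   1832   2.1%
)   1754   2.0%
(   1748   2.0%
Other values (594)   52586   61.3%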

Most occurring categories

Value   Count   Frequency (%)
Other Letter   70269   81.9%
Decimal Number   5252   6.1%
Space Separator   5236   6.1%
Close Punctuation   1806   2.1%
Open Punctuation   1800   2.1%
Uppercase Letter   698   0.8%
Dash Punctuation   354   0.4%
Other Punctuation   317   0.4%
Lowercase Letter   56   0.1%
Math Symbol   54   0.1%
Other values (3)   8   < 0.1%

Most frequent character per category

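Other Letter
Value   Count   Frequency (%)
(character lost)   8544   12.2%
(character lost)   3647   5.2%
(character lost)   3428   4.9%
(character lost)   2612   3.7%
(character lost)   2318   3.3%
(character lost)   2145   3.1%
(character lost)   1832   2.6%
(character lost)   1354   1.9%
(character lost)   1327   1.9%
(character lost)   1319   1.9%
Other values (530)   41743   59.4%
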
Uppercase Letter
Value   Count   Frequency (%)
A   99   14.2%
C   82   11.7%
D   81   11.6%
B   78   11.2%
U   67   9.6%
W   66   9.5%
I   51   7.3%
J   51   7.3%
S   23   3.3%
H   21   3.0%
Other values (12)   79   11.3%

Decimal Number
Value   Count   Frequency (%)
1   1536   29.2%
2   1092   20.8%
0   762   14.5%
3   570   10.9%
4   319   6.1%
5   251   4.8%
7   197   3.8%
8   192   3.7%
6   190   3.6%
9   143   2.7%

Lowercase Letter
Value   Count   Frequency (%)
k   36   64.3%
m   6   10.7%
a   3   5.4%
e   3   5.4%
p   2   3.6%
i   2   3.6%
t   1   1.8%
r   1   1.8%
w   1   1.8%
c   1   1.8%

Other Punctuation
Value   Count   Frequency (%)
.   145   45.7%
/   102   32.2%
,   59   18.6%
·   5   1.6%
&   2   0.6%
*   2   0.6%
:   1   0.3%
@   1   0.3%

Math Symbol
Value   Count   Frequency (%)
~   41   75.9%
|   9   16.7%
+   4   7.4%

Close Punctuation
Value   Count   Frequency (%)
)   1754   97.1%
]   52   2.9%

Open Punctuation
Value   Count   Frequency (%)
(   1748   97.1%
[   52   2.9%

Dash Punctuation
Value   Count   Frequency (%)
-   353   99.7%
(character lost)   1   0.3%

Other Symbol
Value   Count   Frequency (%)
(character lost)   3   75.0%
(character lost)   1   25.0%

Space Separator
Value   Count   Frequency (%)
(space)   5236   100.0%

Connector Punctuation
Value   Count   Frequency (%)
_   2   100.0%

Letter Number
Value   Count   Frequency (%)
(character lost)   2   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul   70269   81.9%
Common   14825   17.3%
Latin   756   0.9%

Most frequent character per script

Hangul
Value   Count   Frequency (%)
(character lost)   8544   12.2%
(character lost)   3647   5.2%
(character lost)   3428   4.9%
(character lost)   2612   3.7%
(character lost)   2318   3.3%
(character lost)   2145   3.1%
(character lost)   1832   2.6%
(character lost)   1354   1.9%
(character lost)   1327   1.9%
(character lost)   1319   1.9%
Other values (530)   41743   59.4%

Latin
Value   Count   Frequency (%)
A   99   13.1%
C   82   10.8%
D   81   10.7%
B   78   10.3%
U   67   8.9%
W   66   8.7%
I   51   6.7%
J   51   6.7%
k   36   4.8%
S   23   3.0%
Other values (23)   122   16.1%

Common
Value   Count   Frequency (%)
(space)   5236   35.3%
)   1754   11.8%
(   1748   11.8%
1   1536   10.4%
2   1092   7.4%
0   762   5.1%
3   570   3.8%
-   353   2.4%
4   319   2.2%
5   251   1.7%
Other values (21)   1204   8.1%

Most occurring blocks

Value   Count   Frequency (%)
Hangul   70269   81.9%
ASCII   15569   18.1%
None   6   < 0.1%
CJK Compat   4   < 0.1%
Number Forms   2   < 0.1%

Most frequent character per block

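Hangul
Value   Count   Frequency (%)
(character lost)   8544   12.2%
(character lost)   3647   5.2%
(character lost)   3428   4.9%
(character lost)   2612   3.7%
(character lost)   2318   3.3%
(character lost)   2145   3.1%
(character lost)   1832   2.6%
(character lost)   1354   1.9%
(character lost)   1327   1.9%
(character lost)   1319   1.9%
Other values (530)   41743   59.4%

ASCII
Value   Count   Frequency (%)
(space)   5236   33.6%
)   1754   11.3%
(   1748   11.2%
1   1536   9.9%
2   1092   7.0%
0   762   4.9%
3   570   3.7%
-   353   2.3%
4   319   2.0%
5   251   1.6%
Other values (49)   1948   12.5%

None
Value   Count   Frequency (%)
·   5   83.3%
(character lost)   1   16.7%

CJK Compat
Value   Count   Frequency (%)
(character lost)   3   75.0%
(character lost)   1   25.0%

Number Forms
Value   Count   Frequency (%)
(character lost)   2   100.0%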

시설물구분
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
건축물   4698
교량   3258
하천   738
터널   334
상하수도   246
Other values (6)   726

Length

Max length: 4
Median length: 3
Mean length: 2.5332
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건축물
2nd row: 교량
3rd row: 교량
4th row: 교량
5th row: 교량

Common Values

Value   Count   Frequency (%)
건축물   4698   47.0%
교량   3258   32.6%
하천   738   7.4%
터널   334   3.3%
상하수도   246   2.5%
옹벽   201   2.0%
(value lost)   193   1.9%
절토사면   165   1.7%
기타   116   1.2%
항만   46   0.5%

Length

[Figure: Histogram of lengths of the category]

Value   Count   Frequency (%)
건축물   4698   47.0%
교량   3258   32.6%
하천   738   7.4%
터널   334   3.3%
상하수도   246   2.5%
옹벽   201   2.0%
(value lost)   193   1.9%
절토사면   165   1.7%
기타   116   1.2%
항만   46   0.5%

시설물소재지
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
경기도   1341
서울특별시   1275
경상북도   1189
충청남도   778
충청북도   747
Other values (12)   4670

Length

Max length: 7
Median length: 5
Mean length: 4.4232
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 대전광역시
3rd row: 강원특별자치도
4th row: 충청북도
5th row: 충청남도

Common Values

Value   Count   Frequency (%)
경기도   1341   13.4%
서울특별시   1275   12.8%
경상북도   1189   11.9%
충청남도   778   7.8%
충청북도   747   7.5%
전라남도   705   7.0%
전라북도   688   6.9%
경상남도   683   6.8%
강원특별자치도   670   6.7%
부산광역시   560   5.6%
Other values (7)   1364   13.6%

Length

[Figure: Histogram of lengths of the category]

Value   Count   Frequency (%)
경기도   1341   13.4%
서울특별시   1275   12.8%
경상북도   1189   11.9%
충청남도   778   7.8%
충청북도   747   7.5%
전라남도   705   7.0%
전라북도   688   6.9%
경상남도   683   6.8%
강원특별자치도   670   6.7%
부산광역시   560   5.6%
Other values (7)   1364   13.6%

시설물종류
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
다중이용건축물   3401
도로교량   2533
대형건축물   846
철도교량   597
기타   468
Other values (30)   2155

Length

Max length: 12
Median length: 8
Mean length: 5.1934
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 철도역시설
2nd row: 철도교량
3rd row: 도로교량
4th row: 도로교량
5th row: 도로교량

Common Values

Value   Count   Frequency (%)
다중이용건축물   3401   34.0%
도로교량   2533   25.3%
대형건축물   846   8.5%
철도교량   597   6.0%
기타   468   4.7%
수문 및 통문   442   4.4%
지방상수도   230   2.3%
철도터널   195   1.9%
용수전용댐   182   1.8%
도로사면   161   1.6%
Other values (25)   945   9.4%

Length

[Figure: Histogram of lengths of the category]

Value   Count   Frequency (%)
다중이용건축물   3401   31.2%
도로교량   2533   23.2%
대형건축물   846   7.7%
철도교량   597   5.5%
기타   468   4.3%
(value lost)   459   4.2%
수문   442   4.0%
통문   442   4.0%
지방상수도   230   2.1%
철도터널   195   1.8%
Other values (28)   1305   12.0%

Correlations

(variable)   시설물구분   시설물소재지   시설물종류
시설물구분   1.000   0.391   0.996
시설물소재지   0.391   1.000   0.509
시설물종류   0.996   0.509   1.000

(variable)   시설물소재지   시설물구분   시설물종류
시설물소재지   1.000   0.157   0.161
시설물구분   0.157   1.000   0.959
시설물종류   0.161   0.959   1.000

(variable)   시설물구분   시설물소재지   시설물종류
시설물구분   1.000   0.157   0.959
시설물소재지   0.157   1.000   0.161
시설물종류   0.959   0.161   1.000
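
The report does not say which association measure each matrix uses. For pairs of categorical variables a common choice is Cramér's V, so the sketch below assumes that measure. `cramers_v` is a hypothetical helper written with the standard library only, it computes the uncorrected statistic, and the toy inputs are illustrative rather than taken from the dataset, so it is not expected to reproduce the figures above exactly.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    x_tot = Counter(xs)            # row totals
    y_tot = Counter(ys)            # column totals

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_tot.items():
        for y, ny in y_tot.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_tot), len(y_tot)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else float("nan")


# Toy data: each type belongs to exactly one category, as one would expect
# if 시설물종류 is a finer subdivision of 시설물구분.
category = ["건축물", "건축물", "교량", "교량"]
kind = ["다중이용건축물", "대형건축물", "도로교량", "철도교량"]
nested = cramers_v(category, kind)

# Toy data: the two variables are independent of each other.
unrelated = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])

print(nested, unrelated)
```

A value near 1 means one variable almost determines the other, which is the pattern behind the High correlation alerts for 시설물구분 and 시설물종류.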

Missing values

[Figure: Nullity matrix]
A nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

(index)   시설물명   시설물구분   시설물소재지   시설물종류
17915   1-014 신설동역   건축물   서울특별시   철도역시설
1458   괴곡천교(복)   교량   대전광역시   철도교량
480   고역교   교량   강원특별자치도   도로교량
20737   가대2교   교량   충청북도   도로교량
20016   월삼교   교량   충청남도   도로교량
10291   대전대흥초등학교 교사동   건축물   대전광역시   대형건축물
7773   매포초등학교 본관교사   건축물   충청북도   다중이용건축물
19902   죽산교   교량   충청남도   도로교량
2247   오창(통영)휴게소   건축물   충청북도   다중이용건축물
14218   대구광역시교육청 본관동   건축물   대구광역시   다중이용건축물

(index)   시설물명   시설물구분   시설물소재지   시설물종류
12805   온평초등학교 본관동   건축물   제주특별자치도   다중이용건축물
7331   한국교통대학교 화학생명관   건축물   충청북도   대형건축물
21193   본덕제(우안)   하천   광주광역시   제방
13084   서울구의초등학교 본관   건축물   서울특별시   다중이용건축물
333006SE37D25709   절토사면   경기도   도로사면
17794   구의역   건축물   서울특별시   철도역시설
7186   석포초등학교 교사(본관)   건축물   부산광역시   다중이용건축물
17124   오금교   교량   서울특별시   도로교량
18278   주안역지하도상가   건축물   인천광역시   지하도상가
14258   서동초 부지경계 옹벽   옹벽   부산광역시   건축물옹벽

Duplicate rows

Most frequently occurring

(index)   시설물명   시설물구분   시설물소재지   시설물종류   # duplicates
10   덕천교   교량   충청북도   도로교량   3
25   신기교   교량   경상남도   도로교량   3
34   용수교   교량   경기도   도로교량   3
35   용연교   교량   충청남도   도로교량   3
40   원동교   교량   경상남도   도로교량   3
0   갈마교   교량   전라북도   도로교량   2
1   고현교   교량   경상북도   도로교량   2
2   광덕교   교량   경상북도   도로교량   2
3   금곡교   교량   경기도   도로교량   2
4   금곡교   교량   충청남도   도로교량   2
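
A table like the one above can be produced by counting identical rows. The following is a minimal sketch using only the standard library. `duplicate_groups` is a hypothetical helper and the input is a small toy list built from values that appear in this report, not the real dataset. The report does not state whether its figure of 51 counts duplicated groups or surplus copies, so both are computed.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy input: (시설물명, 시설물구분, 시설물소재지, 시설물종류) tuples.
rows = [
    ("덕천교", "교량", "충청북도", "도로교량"),
    ("덕천교", "교량", "충청북도", "도로교량"),
    ("덕천교", "교량", "충청북도", "도로교량"),
    ("금곡교", "교량", "경기도", "도로교량"),
    ("금곡교", "교량", "경기도", "도로교량"),
    # Same name but a different location, so this is not a duplicate row.
    ("금곡교", "교량", "충청남도", "도로교량"),
    ("월삼교", "교량", "충청남도", "도로교량"),
]

groups = duplicate_groups(rows)
n_groups = len(groups)                                # distinct duplicated rows
surplus_copies = sum(n - 1 for n in groups.values())  # rows beyond the first copy

for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Rows are compared on all four columns at once, which is why two bridges that share a name but sit in different provinces are kept apart.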