Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 31
Duplicate rows (%): 0.3%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B

Variable types

Text: 2
Categorical: 2
DateTime: 1

Dataset

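Description: Information on Class 3 facilities (facility name, facility category, facility type (excluding multi-unit housing), designation date, and Class 3 designating agency), compiled and released as file data in CSV format.
URL: https://www.data.go.kr/data/15038344/fileData.do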

Alerts

Dataset has 31 (0.3%) duplicate rows [Duplicates]
시설물종류 is highly overall correlated with 시설물구분 [High correlation]
시설물구분 is highly overall correlated with 시설물종류 [High correlation]
시설물구분 is highly imbalanced (54.3%) [Imbalance]
시설물종류 is highly imbalanced (50.3%) [Imbalance]
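
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the value counts. The report does not state its formula, so the sketch below assumes that definition and applies it to the 시설물구분 counts listed in the Common Values table of this report. `imbalance_score` is a hypothetical helper name, not part of any library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Value counts for 시설물구분, copied from the Common Values table of this report.
facility_class_counts = [5050, 4519, 160, 145, 121, 4, 1]
print(f"{imbalance_score(facility_class_counts):.3f}")
```

For these counts the score comes out at roughly 0.543, in line with the 54.3% shown in the alert. A score of 0 would mean perfectly balanced classes, and values near 1 mean one class dominates.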

Reproduction

Analysis started: 2023-12-12 01:08:15.622047
Analysis finished: 2023-12-12 01:08:16.666846
Duration: 1.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설물명
Text

Distinct: 9562
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 46
Median length: 35
Mean length: 8.5164
Min length: 2

Characters and Unicode

Total characters: 85164
Distinct characters: 661
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
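
As a minimal sketch of how such a per-category tally can be produced, the snippet below uses Python's standard `unicodedata.category` on the five sample facility names shown in this report. It is not the report generator's own code. The standard library returns two-letter codes (`Lo`, `Ps`, `Pe`) rather than the long names used in the tables (Other Letter, Open Punctuation, Close Punctuation), and it does not expose script or block names. `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Facility names taken from the Sample rows of this report.
names = ["마성교", "금릉가도교(상)", "죽림천교", "안동교", "상운교(상)"]

totals = Counter()
for name in names:
    totals.update(category_counts(name))

for code, count in totals.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo` (Other Letter), which is why that category dominates the tables for this variable.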

Unique

Unique: 9274
Unique (%): 92.7%

Sample

1st row: 마성교
2nd row: 금릉가도교(상)
3rd row: 죽림천교
4th row: 안동교
5th row: 상운교(상)
Value  Count  Frequency (%)
교사동  446  3.0%
본관동  337  2.2%
본관  269  1.8%
교사  118  0.8%
체육관  106  0.7%
본관교사  70  0.5%
옹벽  67  0.4%
교사1호동  55  0.4%
후관동  51  0.3%
별관  40  0.3%
Other values (10655)  13445  89.6%

Most occurring characters

Value  Count  Frequency (%)
(not shown)  9878  11.6%
(space)  4995  5.9%
(not shown)  3972  4.7%
(not shown)  3442  4.0%
(not shown)  2539  3.0%
(not shown)  2489  2.9%
(not shown)  1904  2.2%
)  1895  2.2%
(  1892  2.2%
(not shown)  1483  1.7%
Other values (651)  50675  59.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  70292  82.5%
Space Separator  5004  5.9%
Decimal Number  4304  5.1%
Close Punctuation  1906  2.2%
Open Punctuation  1903  2.2%
Uppercase Letter  1153  1.4%
Other Punctuation  279  0.3%
Math Symbol  135  0.2%
Dash Punctuation  130  0.2%
Lowercase Letter  51  0.1%
Other values (5)  7  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not shown)  9878  14.1%
(not shown)  3972  5.7%
(not shown)  3442  4.9%
(not shown)  2539  3.6%
(not shown)  2489  3.5%
(not shown)  1904  2.7%
(not shown)  1483  2.1%
(not shown)  1302  1.9%
(not shown)  1165  1.7%
(not shown)  1045  1.5%
Other values (582)  41073  58.4%

Uppercase Letter
Value  Count  Frequency (%)
C  274  23.8%
I  213  18.5%
B  210  18.2%
U  131  11.4%
A  106  9.2%
T  35  3.0%
J  32  2.8%
E  23  2.0%
D  17  1.5%
R  17  1.5%
Other values (14)  95  8.2%

Lowercase Letter
Value  Count  Frequency (%)
k  10  19.6%
l  9  17.6%
m  6  11.8%
e  6  11.8%
a  5  9.8%
t  3  5.9%
p  3  5.9%
i  2  3.9%
c  2  3.9%
r  2  3.9%
Other values (2)  3  5.9%

Decimal Number
Value  Count  Frequency (%)
1  1300  30.2%
2  1012  23.5%
0  546  12.7%
3  451  10.5%
4  232  5.4%
5  180  4.2%
7  161  3.7%
8  144  3.3%
6  141  3.3%
9  137  3.2%

Other Punctuation
Value  Count  Frequency (%)
/  135  48.4%
.  80  28.7%
,  50  17.9%
#  5  1.8%
·  4  1.4%
:  3  1.1%
*  1  0.4%
"  1  0.4%

Math Symbol
Value  Count  Frequency (%)
|  116  85.9%
~  12  8.9%
+  7  5.2%

Space Separator
Value  Count  Frequency (%)
(space)  4995  99.8%
(non-ASCII space)  9  0.2%

Close Punctuation
Value  Count  Frequency (%)
)  1895  99.4%
]  11  0.6%

Open Punctuation
Value  Count  Frequency (%)
(  1892  99.4%
[  11  0.6%

Dash Punctuation
Value  Count  Frequency (%)
-  130  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_  3  100.0%

Control
Value  Count  Frequency (%)
(control character)  1  100.0%

Letter Number
Value  Count  Frequency (%)
(not shown)  1  100.0%

Other Number
Value  Count  Frequency (%)
(not shown)  1  100.0%

Initial Punctuation
Value  Count  Frequency (%)
(not shown)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  70291  82.5%
Common  13667  16.0%
Latin  1205  1.4%
Han  1  < 0.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not shown)  9878  14.1%
(not shown)  3972  5.7%
(not shown)  3442  4.9%
(not shown)  2539  3.6%
(not shown)  2489  3.5%
(not shown)  1904  2.7%
(not shown)  1483  2.1%
(not shown)  1302  1.9%
(not shown)  1165  1.7%
(not shown)  1045  1.5%
Other values (581)  41072  58.4%

Latin
Value  Count  Frequency (%)
C  274  22.7%
I  213  17.7%
B  210  17.4%
U  131  10.9%
A  106  8.8%
T  35  2.9%
J  32  2.7%
E  23  1.9%
D  17  1.4%
R  17  1.4%
Other values (27)  147  12.2%

Common
Value  Count  Frequency (%)
(space)  4995  36.5%
)  1895  13.9%
(  1892  13.8%
1  1300  9.5%
2  1012  7.4%
0  546  4.0%
3  451  3.3%
4  232  1.7%
5  180  1.3%
7  161  1.2%
Other values (22)  1003  7.3%

Han
Value  Count  Frequency (%)
(not shown)  1  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  70291  82.5%
ASCII  14856  17.4%
None  13  < 0.1%
CJK  1  < 0.1%
Number Forms  1  < 0.1%
Enclosed Alphanum  1  < 0.1%
Punctuation  1  < 0.1%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(not shown)  9878  14.1%
(not shown)  3972  5.7%
(not shown)  3442  4.9%
(not shown)  2539  3.6%
(not shown)  2489  3.5%
(not shown)  1904  2.7%
(not shown)  1483  2.1%
(not shown)  1302  1.9%
(not shown)  1165  1.7%
(not shown)  1045  1.5%
Other values (581)  41072  58.4%

ASCII
Value  Count  Frequency (%)
(space)  4995  33.6%
)  1895  12.8%
(  1892  12.7%
1  1300  8.8%
2  1012  6.8%
0  546  3.7%
3  451  3.0%
C  274  1.8%
4  232  1.6%
I  213  1.4%
Other values (54)  2046  13.8%

None
Value  Count  Frequency (%)
(non-ASCII space)  9  69.2%
·  4  30.8%

CJK
Value  Count  Frequency (%)
(not shown)  1  100.0%

Number Forms
Value  Count  Frequency (%)
(not shown)  1  100.0%

Enclosed Alphanum
Value  Count  Frequency (%)
(not shown)  1  100.0%

Punctuation
Value  Count  Frequency (%)
(not shown)  1  100.0%

시설물구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

건축물  5050
교량  4519
터널  160
옹벽  145
기타  121
Other values (2)  5

Length

Max length: 4
Median length: 3
Mean length: 2.5052
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 교량
2nd row: 교량
3rd row: 교량
4th row: 교량
5th row: 교량

Common Values

Value  Count  Frequency (%)
건축물  5050  50.5%
교량  4519  45.2%
터널  160  1.6%
옹벽  145  1.5%
기타  121  1.2%
하천  4  < 0.1%
절토사면  1  < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values; the counts match the Common Values table]

시설물종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

공동주택외건축물  4657
교량  4239
기타건축물  393
육교  279
옹벽  145
Other values (5)  287

Length

Max length: 8
Median length: 6
Mean length: 4.9768
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 교량
2nd row: 교량
3rd row: 교량
4th row: 교량
5th row: 교량

Common Values

Value  Count  Frequency (%)
공동주택외건축물  4657  46.6%
교량  4239  42.4%
기타건축물  393  3.9%
육교  279  2.8%
옹벽  145  1.5%
기타토목시설  121  1.2%
터널  85  0.9%
지하차도  75  0.8%
<NA>  5  0.1%
복개구조물  1  < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values; the counts match the Common Values table]

지정일자
DateTime

Distinct: 544
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2018-01-10 00:00:00
Maximum: 2022-12-28 00:00:00

[Figure: histogram with fixed size bins (bins=50)]
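
A histogram with fixed size bins divides the range between the minimum and maximum into equally wide intervals and counts the values in each. The sketch below illustrates that idea for dates, using the minimum and maximum reported above and a handful of designation dates from the Sample table. The exact bin edges used by the report are not stated, so day-based equal-width bins are an assumption, and `fixed_width_histogram` is a hypothetical helper name.

```python
from datetime import date

def fixed_width_histogram(values, lo, hi, bins=50):
    """Count `values` into `bins` equal-width intervals spanning [lo, hi].
    Values are assumed to lie within the range."""
    span_days = (hi - lo).days
    if span_days <= 0:
        raise ValueError("hi must be later than lo")
    counts = [0] * bins
    for d in values:
        offset = (d - lo).days
        # The maximum sits on the right edge, so clamp it into the last bin.
        index = min(offset * bins // span_days, bins - 1)
        counts[index] += 1
    return counts

# Minimum and maximum as reported for the DateTime variable.
lo, hi = date(2018, 1, 10), date(2022, 12, 28)
# A few designation dates taken from the Sample table, plus both endpoints.
sample_dates = [date(2019, 3, 28), date(2018, 7, 12), date(2020, 6, 30), lo, hi]
counts = fixed_width_histogram(sample_dates, lo, hi)
print(len(counts), sum(counts))
```

Integer arithmetic on day offsets avoids floating-point edge effects at bin boundaries.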

3종지정기관
Text

Distinct: 281
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 11
Mean length: 7.9231
Min length: 3

Characters and Unicode

Total characters: 79231
Distinct characters: 168
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 0.3%

Sample

1st row: 경상남도청
2nd row: 국토교통부 철도시설안전과
3rd row: 국토교통부 철도시설안전과
4th row: 충청북도 단양군청
5th row: 익산지방국토관리청
Value  Count  Frequency (%)
국토교통부  1213  8.9%
경기도  752  5.5%
경기도교육청  719  5.3%
첨단도로안전과  646  4.7%
서울특별시교육청  543  4.0%
철도시설안전과  470  3.4%
교육부  345  2.5%
경상북도  327  2.4%
경상북도교육청  321  2.4%
강원특별자치도청  297  2.2%
Other values (269)  8010  58.7%

Most occurring characters

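Value  Count  Frequency (%)
(not shown)  9211  11.6%
(not shown)  6507  8.2%
(not shown)  4917  6.2%
(not shown)  4051  5.1%
(not shown)  3717  4.7%
(space)  3643  4.6%
(not shown)  2895  3.7%
(not shown)  2562  3.2%
(not shown)  2325  2.9%
(not shown)  1770  2.2%
Other values (158)  37633  47.5%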

Most occurring categories

Value  Count  Frequency (%)
Other Letter  75582  95.4%
Space Separator  3643  4.6%
Open Punctuation  2  < 0.1%
Decimal Number  2  < 0.1%
Close Punctuation  2  < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not shown)  9211  12.2%
(not shown)  6507  8.6%
(not shown)  4917  6.5%
(not shown)  4051  5.4%
(not shown)  3717  4.9%
(not shown)  2895  3.8%
(not shown)  2562  3.4%
(not shown)  2325  3.1%
(not shown)  1770  2.3%
(not shown)  1753  2.3%
Other values (154)  35874  47.5%

Space Separator
Value  Count  Frequency (%)
(space)  3643  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  2  100.0%

Decimal Number
Value  Count  Frequency (%)
3  2  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  75582  95.4%
Common  3649  4.6%

Most frequent character per script

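Hangul
Value  Count  Frequency (%)
(not shown)  9211  12.2%
(not shown)  6507  8.6%
(not shown)  4917  6.5%
(not shown)  4051  5.4%
(not shown)  3717  4.9%
(not shown)  2895  3.8%
(not shown)  2562  3.4%
(not shown)  2325  3.1%
(not shown)  1770  2.3%
(not shown)  1753  2.3%
Other values (154)  35874  47.5%

Common
Value  Count  Frequency (%)
(space)  3643  99.8%
(  2  0.1%
3  2  0.1%
)  2  0.1%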

Most occurring blocks

Value  Count  Frequency (%)
Hangul  75582  95.4%
ASCII  3649  4.6%

Most frequent character per block

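Hangul
Value  Count  Frequency (%)
(not shown)  9211  12.2%
(not shown)  6507  8.6%
(not shown)  4917  6.5%
(not shown)  4051  5.4%
(not shown)  3717  4.9%
(not shown)  2895  3.8%
(not shown)  2562  3.4%
(not shown)  2325  3.1%
(not shown)  1770  2.3%
(not shown)  1753  2.3%
Other values (154)  35874  47.5%

ASCII
Value  Count  Frequency (%)
(space)  3643  99.8%
(  2  0.1%
3  2  0.1%
)  2  0.1%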

Correlations

(variable)  시설물구분  시설물종류
시설물구분  1.000  1.000
시설물종류  1.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
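
The per-column nullity behind such a chart is a count of missing cells in each column. The sketch below is a minimal illustration, assuming rows are held as dictionaries and that `None` or an empty string marks a missing cell. Both are assumptions, and `nullity_by_column` is a hypothetical helper name. The two records come from the Sample table of this report.

```python
def nullity_by_column(rows, columns):
    """Return {column: number of missing cells}.
    None and '' are treated as missing (an assumption)."""
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None or value == "":
                missing[col] += 1
    return missing

columns = ["시설물명", "시설물구분", "시설물종류", "지정일자", "3종지정기관"]
rows = [
    dict(zip(columns, ["마성교", "교량", "교량", "2019-03-28", "경상남도청"])),
    dict(zip(columns, ["안동교", "교량", "교량", "2018-06-29", "충청북도 단양군청"])),
]
print(nullity_by_column(rows, columns))
```

Every column reports zero missing cells for these rows, matching the report's figure of 0 missing cells.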

Sample

(index)  시설물명  시설물구분  시설물종류  지정일자  3종지정기관
8374  마성교  교량  교량  2019-03-28  경상남도청
13903  금릉가도교(상)  교량  교량  2018-07-12  국토교통부 철도시설안전과
14873  죽림천교  교량  교량  2018-07-12  국토교통부 철도시설안전과
30172  안동교  교량  교량  2018-06-29  충청북도 단양군청
44377  상운교(상)  교량  교량  2018-01-18  익산지방국토관리청
22439  태인중학교 본관동  건축물  공동주택외건축물  2018-06-29  전라북도교육청
31647  신부교  교량  교량  2018-06-19  충청남도 천안시청
5160  서울고은초등학교 옹벽(석축)  옹벽  옹벽  2020-06-30  서울특별시교육청
5628  경남여자고등학교 본관 뒤편 옹벽  옹벽  옹벽  2020-06-30  부산광역시교육청
12148  어론교  교량  교량  2018-08-24  강원특별자치도청

(index)  시설물명  시설물구분  시설물종류  지정일자  3종지정기관
3653  간촌2교  교량  교량  2021-01-04  경기도 남양주시청
8725  창녕상설시장  건축물  공동주택외건축물  2019-01-29  경상남도 창녕군청
25866  오천고등학교 3동(생활관)  건축물  공동주택외건축물  2018-06-29  경상북도교육청
23262  안덕중학교 본관동  건축물  공동주택외건축물  2018-06-29  제주특별자치도교육청
13873  군정천제1교(복)  교량  교량  2018-07-12  국토교통부 철도시설안전과
40590  용머리다리교  교량  교량  2018-04-12  경기도 수원시청
23891  도화초등학교 본관교사동  건축물  공동주택외건축물  2018-06-29  전라남도교육청
23691  금성초등학교 주1동  건축물  공동주택외건축물  2018-06-29  대전광역시교육청
36530  재동교  교량  교량  2018-06-22  전라남도청
44152  월성교  교량  교량  2018-05-10  대구광역시청

Duplicate rows

Most frequently occurring

(index)  시설물명  시설물구분  시설물종류  지정일자  3종지정기관  # duplicates
22  옹벽  기타  기타토목시설  2019-05-30  서울특별시청  4
29  학생회관  건축물  공동주택외건축물  2018-06-29  교육부  3
0  공동실험실습관  건축물  공동주택외건축물  2018-06-29  교육부  2
1  괴진교  교량  교량  2018-07-05  경상남도청  2
2  남천교  교량  교량  2018-01-18  부산지방국토관리청  2
3  대천교  교량  교량  2018-05-21  경상북도청  2
4  덕곡천교  교량  교량  2018-07-12  국토교통부 철도시설안전과  2
5  본관동  건축물  공동주택외건축물  2019-12-02  과학기술정보통신부  2
6  본관동  건축물  기타건축물  2019-12-02  과학기술정보통신부  2
7  봉덕교  교량  교량  2018-03-15  전라남도청  2
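
A table of this kind can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below is a minimal illustration with the standard library, not the report generator's own method. The report does not state how it derives its figure of 31, and `duplicate_groups` is a hypothetical helper name. The records are illustrative: two duplicated rows from the table above plus one unique row from the Sample table.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return [(row, count), ...] for rows that occur more than once,
    ordered from most to least frequent."""
    counts = Counter(tuple(row) for row in rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]

# Illustrative records: two duplicated rows and one unique row.
rows = (
    [("옹벽", "기타", "기타토목시설", "2019-05-30", "서울특별시청")] * 4
    + [("학생회관", "건축물", "공동주택외건축물", "2018-06-29", "교육부")] * 3
    + [("마성교", "교량", "교량", "2019-03-28", "경상남도청")]
)

for row, n in duplicate_groups(rows):
    print(n, row)
```

Rows are compared across all five columns, so two facilities that share a name but differ in any other field (such as the two 본관동 entries above) form separate groups.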