Overview

Dataset statistics

Number of variables: 5
Number of observations: 283
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 11.5 KiB
Average record size in memory: 41.5 B

Variable types

Categorical: 2
Text: 2
Numeric: 1

Dataset

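Description: Data provided by the Korea Authority of Land and Infrastructure Safety (국토안전관리원). It lists the research reports and other publications produced by the agency, with the fields material category (자료구분), title (자료명), project category (사업구분), and publication year (발행년도).
URL: https://www.data.go.kr/data/15018181/fileData.do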

Alerts

Dataset has 1 (0.4%) duplicate rows (Duplicates)
자료구분 is highly overall correlated with 사업구분 (High correlation)
사업구분 is highly overall correlated with 자료구분 (High correlation)

Reproduction

Analysis started: 2023-12-12 04:19:05.512971
Analysis finished: 2023-12-12 04:19:06.225819
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자료구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
정책/연구: 142
기술연구: 141

Length

Max length: 5
Median length: 5
Mean length: 4.5017668
Min length: 4
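
The length statistics above follow from the category counts reported for 자료구분 (142 rows of 정책/연구 and 141 rows of 기술연구). The following is a minimal sketch, assuming that length means the number of characters in each value and that the median is the ordinary sample median. The helper name `length_stats` is made up for this illustration and is not part of ydata-profiling.

```python
from statistics import mean, median


def length_stats(value_counts):
    """Return (max, median, mean, min) of the value lengths.

    value_counts maps each category value to its row count.
    """
    # Expand the counts into one length per row.
    lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]
    return max(lengths), median(lengths), mean(lengths), min(lengths)


# Counts taken from the report for 자료구분.
counts = {"정책/연구": 142, "기술연구": 141}
max_len, median_len, mean_len, min_len = length_stats(counts)
print(max_len, median_len, round(mean_len, 7), min_len)
```

The mean works out to (142 × 5 + 141 × 4) / 283 = 1274 / 283, which agrees with the reported 4.5017668.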

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기술연구
2nd row: 기술연구
3rd row: 기술연구
4th row: 정책/연구
5th row: 정책/연구

Common Values

Value | Count | Frequency (%)
정책/연구 | 142 | 50.2%
기술연구 | 141 | 49.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

자료명
Text

Distinct: 282
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 58
Median length: 43
Mean length: 28.39576
Min length: 7

Characters and Unicode

Total characters: 8036
Distinct characters: 370
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
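
As an illustration of those character properties, the sketch below counts the Unicode general category of every character in one of the sample titles, using Python's standard `unicodedata` module. It is not the profiler's own code. The `CATEGORY_NAMES` mapping and the function name `category_counts` are introduced here only to translate the two-letter codes into the long names shown in this report.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Nl": "Letter Number",
}


def category_counts(text):
    """Count the characters of text per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw code for categories missing from the mapping.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Third sample row of the 자료명 column.
title = "수직형 시설물의 AI기반 비진입 스캐닝 자동화 시스템 개발"
counts = category_counts(title)
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the column.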

Unique

Unique: 281
Unique (%): 99.3%

Sample

1st row: 시설물 영상취득 방안 체계화 및 손상분석 자동화 방안 연구
2nd row: 시설물 안전진단·점검의 자동화 추진방안 연구
3rd row: 수직형 시설물의 AI기반 비진입 스캐닝 자동화 시스템 개발
4th row: 건설안전 및 품질 관련 제도의 실효성 분석 및 개선 연구
5th row: 건설사고 재해율 저감을 위한 해외 선진사례 조사 및 분석 연구

Value | Count | Frequency (%)
[not preserved] | 135 | 7.2%
연구 | 97 | 5.2%
개발 | 88 | 4.7%
위한 | 44 | 2.3%
시설물 | 30 | 1.6%
유지관리 | 29 | 1.5%
시스템 | 23 | 1.2%
방안 | 21 | 1.1%
관한 | 20 | 1.1%
매뉴얼 | 18 | 1.0%
Other values (843) | 1371 | 73.1%
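
The counts in this word table sum to 1876, so each frequency appears to be a share of all tokens (for example 97 / 1876 ≈ 5.2%). The report does not state how the titles were tokenised. The sketch below assumes plain whitespace splitting and applies it to the five sample rows only, so its counts are for illustration and do not reproduce the full-column figures. The function name `word_frequencies` is hypothetical.

```python
from collections import Counter

# The five sample rows shown for the 자료명 column.
titles = [
    "시설물 영상취득 방안 체계화 및 손상분석 자동화 방안 연구",
    "시설물 안전진단·점검의 자동화 추진방안 연구",
    "수직형 시설물의 AI기반 비진입 스캐닝 자동화 시스템 개발",
    "건설안전 및 품질 관련 제도의 실효성 분석 및 개선 연구",
    "건설사고 재해율 저감을 위한 해외 선진사례 조사 및 분석 연구",
]


def word_frequencies(texts):
    """Count whitespace-separated tokens; return (counts, total tokens)."""
    counts = Counter(token for text in texts for token in text.split())
    return counts, sum(counts.values())


counts, total = word_frequencies(titles)
for word, count in counts.most_common(5):
    # Share of all tokens, formatted like the report's Frequency (%) column.
    print(f"{word}\t{count}\t{count / total:.1%}")
```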

Most occurring characters

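Value | Count | Frequency (%)
(space) | 1593 | 19.8%
[not preserved] | 172 | 2.1%
[not preserved] | 169 | 2.1%
[not preserved] | 162 | 2.0%
[not preserved] | 154 | 1.9%
[not preserved] | 139 | 1.7%
[not preserved] | 137 | 1.7%
[not preserved] | 130 | 1.6%
[not preserved] | 126 | 1.6%
[not preserved] | 126 | 1.6%
Other values (360) | 5128 | 63.8%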

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6025 | 75.0%
Space Separator | 1593 | 19.8%
Uppercase Letter | 116 | 1.4%
Decimal Number | 99 | 1.2%
Other Punctuation | 75 | 0.9%
Lowercase Letter | 44 | 0.5%
Close Punctuation | 40 | 0.5%
Open Punctuation | 40 | 0.5%
Dash Punctuation | 3 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

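Other Letter
Value | Count | Frequency (%)
[not preserved] | 172 | 2.9%
[not preserved] | 169 | 2.8%
[not preserved] | 162 | 2.7%
[not preserved] | 154 | 2.6%
[not preserved] | 139 | 2.3%
[not preserved] | 137 | 2.3%
[not preserved] | 130 | 2.2%
[not preserved] | 126 | 2.1%
[not preserved] | 126 | 2.1%
[not preserved] | 125 | 2.1%
Other values (308) | 4585 | 76.1%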

Uppercase Letter
Value | Count | Frequency (%)
C | 18 | 15.5%
I | 16 | 13.8%
S | 11 | 9.5%
L | 9 | 7.8%
D | 7 | 6.0%
B | 7 | 6.0%
M | 6 | 5.2%
A | 6 | 5.2%
T | 6 | 5.2%
G | 6 | 5.2%
Other values (8) | 24 | 20.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 8 | 18.2%
i | 6 | 13.6%
n | 5 | 11.4%
l | 4 | 9.1%
f | 4 | 9.1%
o | 3 | 6.8%
r | 3 | 6.8%
a | 2 | 4.5%
v | 2 | 4.5%
s | 2 | 4.5%
Other values (5) | 5 | 11.4%

Decimal Number
Value | Count | Frequency (%)
2 | 43 | 43.4%
8 | 30 | 30.3%
1 | 15 | 15.2%
3 | 6 | 6.1%
0 | 3 | 3.0%
4 | 1 | 1.0%
7 | 1 | 1.0%

Other Punctuation
Value | Count | Frequency (%)
& | 15 | 20.0%
# | 15 | 20.0%
, | 15 | 20.0%
/ | 13 | 17.3%
· | 10 | 13.3%
. | 4 | 5.3%
; | 3 | 4.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1593 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 40 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 40 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Letter Number
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6025 | 75.0%
Common | 1850 | 23.0%
Latin | 161 | 2.0%

Most frequent character per script

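Hangul
Value | Count | Frequency (%)
[not preserved] | 172 | 2.9%
[not preserved] | 169 | 2.8%
[not preserved] | 162 | 2.7%
[not preserved] | 154 | 2.6%
[not preserved] | 139 | 2.3%
[not preserved] | 137 | 2.3%
[not preserved] | 130 | 2.2%
[not preserved] | 126 | 2.1%
[not preserved] | 126 | 2.1%
[not preserved] | 125 | 2.1%
Other values (308) | 4585 | 76.1%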

Latin
Value | Count | Frequency (%)
C | 18 | 11.2%
I | 16 | 9.9%
S | 11 | 6.8%
L | 9 | 5.6%
e | 8 | 5.0%
D | 7 | 4.3%
B | 7 | 4.3%
M | 6 | 3.7%
A | 6 | 3.7%
i | 6 | 3.7%
Other values (24) | 67 | 41.6%

Common
Value | Count | Frequency (%)
(space) | 1593 | 86.1%
2 | 43 | 2.3%
) | 40 | 2.2%
( | 40 | 2.2%
8 | 30 | 1.6%
& | 15 | 0.8%
# | 15 | 0.8%
, | 15 | 0.8%
1 | 15 | 0.8%
/ | 13 | 0.7%
Other values (8) | 31 | 1.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6025 | 75.0%
ASCII | 2000 | 24.9%
None | 10 | 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1593 | 79.7%
2 | 43 | 2.1%
) | 40 | 2.0%
( | 40 | 2.0%
8 | 30 | 1.5%
C | 18 | 0.9%
I | 16 | 0.8%
& | 15 | 0.8%
# | 15 | 0.8%
, | 15 | 0.8%
Other values (40) | 175 | 8.8%
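
Hangul
Value | Count | Frequency (%)
[not preserved] | 172 | 2.9%
[not preserved] | 169 | 2.8%
[not preserved] | 162 | 2.7%
[not preserved] | 154 | 2.6%
[not preserved] | 139 | 2.3%
[not preserved] | 137 | 2.3%
[not preserved] | 130 | 2.2%
[not preserved] | 126 | 2.1%
[not preserved] | 126 | 2.1%
[not preserved] | 125 | 2.1%
Other values (308) | 4585 | 76.1%

None
Value | Count | Frequency (%)
· | 10 | 100.0%

Number Forms
Value | Count | Frequency (%)
[not preserved] | 1 | 100.0%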

사업구분
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
공통: 143
교량: 54
수리시설: 33
건축: 30
터널: 23

Length

Max length: 4
Median length: 2
Mean length: 2.2332155
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공통
2nd row: 공통
3rd row: 터널
4th row: 공통
5th row: 공통

Common Values

Value | Count | Frequency (%)
공통 | 143 | 50.5%
교량 | 54 | 19.1%
수리시설 | 33 | 11.7%
건축 | 30 | 10.6%
터널 | 23 | 8.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

등록년도
Real number (ℝ)

Distinct: 16
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.1378
Minimum: 2004
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 2004
5-th percentile: 2004
Q1: 2004
median: 2005
Q3: 2016
95-th percentile: 2021
Maximum: 2023
Range: 19
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.2170052
Coefficient of variation (CV): 0.0030943648
Kurtosis: -0.89497057
Mean: 2009.1378
Median Absolute Deviation (MAD): 1
Skewness: 0.81044442
Sum: 568586
Variance: 38.651154
Monotonicity: Not monotonic
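
Most of the quantile and descriptive figures for 등록년도 can be re-derived from the value counts the report lists for this variable, since its value tables between them cover all 16 distinct years. The sketch below uses only Python's `statistics` module. It assumes the sample (n − 1) variance and linear interpolation between order statistics for the quartiles, which is consistent with the reported numbers but is not confirmed by the report itself.

```python
from statistics import mean, median, quantiles, stdev, variance

# Value counts for 등록년도 as listed in the report's frequency tables.
year_counts = {
    2004: 107, 2005: 45, 2007: 26, 2008: 2, 2009: 1, 2011: 3,
    2012: 6, 2013: 5, 2014: 9, 2016: 36, 2017: 11, 2019: 9,
    2020: 7, 2021: 3, 2022: 6, 2023: 7,
}

# Expand to one sorted value per observation.
years = sorted(year for year, n in year_counts.items() for _ in range(n))

# Quartiles with linear interpolation between order statistics.
q1, q2, q3 = quantiles(years, n=4, method="inclusive")
med = median(years)

summary = {
    "n": len(years),
    "sum": sum(years),
    "mean": mean(years),
    "median": med,
    "range": max(years) - min(years),
    "iqr": q3 - q1,
    "variance": variance(years),  # sample variance, n - 1 denominator
    "std": stdev(years),          # sample standard deviation
    "cv": stdev(years) / mean(years),
    "mad": median(abs(year - med) for year in years),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The coefficient of variation is the standard deviation divided by the mean. It is tiny here because calendar years are large numbers with a comparatively small spread.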
[Figure: Histogram with fixed size bins (bins=16)]

Common Values
Value | Count | Frequency (%)
2004 | 107 | 37.8%
2005 | 45 | 15.9%
2016 | 36 | 12.7%
2007 | 26 | 9.2%
2017 | 11 | 3.9%
2014 | 9 | 3.2%
2019 | 9 | 3.2%
2023 | 7 | 2.5%
2020 | 7 | 2.5%
2022 | 6 | 2.1%
Other values (6) | 20 | 7.1%

Minimum 10 values
Value | Count | Frequency (%)
2004 | 107 | 37.8%
2005 | 45 | 15.9%
2007 | 26 | 9.2%
2008 | 2 | 0.7%
2009 | 1 | 0.4%
2011 | 3 | 1.1%
2012 | 6 | 2.1%
2013 | 5 | 1.8%
2014 | 9 | 3.2%
2016 | 36 | 12.7%

Maximum 10 values
Value | Count | Frequency (%)
2023 | 7 | 2.5%
2022 | 6 | 2.1%
2021 | 3 | 1.1%
2020 | 7 | 2.5%
2019 | 9 | 3.2%
2017 | 11 | 3.9%
2016 | 36 | 12.7%
2014 | 9 | 3.2%
2013 | 5 | 1.8%
2012 | 6 | 2.1%

수행기간
Text

Distinct: 194
Distinct (%): 68.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Characters and Unicode

Total characters: 5943
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 153
Unique (%): 54.1%

Sample

1st row: 2020-01-01~2022-12-31
2nd row: 2022-01-01~2022-12-31
3rd row: 2020-01-01~2022-12-31
4th row: 2022-01-01~2022-12-31
5th row: 2022-01-01~2022-12-31

Value | Count | Frequency (%)
1997-01-01~1997-12-31 | 10 | 3.5%
2012-01-01~2013-12-31 | 7 | 2.5%
1998-01-01~1998-12-31 | 6 | 2.1%
2014-01-01~2015-12-31 | 5 | 1.8%
1999-10-01~2001-02-28 | 5 | 1.8%
2003-12-22∼2005-12-21 | 5 | 1.8%
2013-01-01~2014-12-31 | 5 | 1.8%
2009-01-01~2010-12-31 | 4 | 1.4%
2022-01-01~2022-12-31 | 4 | 1.4%
2019-01-01~2019-12-31 | 4 | 1.4%
Other values (184) | 228 | 80.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 1345 | 22.6%
1 | 1161 | 19.5%
- | 1131 | 19.0%
2 | 925 | 15.6%
9 | 315 | 5.3%
3 | 285 | 4.8%
~ | 245 | 4.1%
6 | 117 | 2.0%
8 | 105 | 1.8%
5 | 99 | 1.7%
Other values (4) | 215 | 3.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4528 | 76.2%
Dash Punctuation | 1131 | 19.0%
Math Symbol | 283 | 4.8%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1345 | 29.7%
1 | 1161 | 25.6%
2 | 925 | 20.4%
9 | 315 | 7.0%
3 | 285 | 6.3%
6 | 117 | 2.6%
8 | 105 | 2.3%
5 | 99 | 2.2%
7 | 91 | 2.0%
4 | 85 | 1.9%

Math Symbol
Value | Count | Frequency (%)
~ | 245 | 86.6%
∼ | 38 | 13.4%

Dash Punctuation
Value | Count | Frequency (%)
- | 1131 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%
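
The category breakdown shows that 수행기간 does not use one uniform separator: most values use the ASCII tilde, while 38 Math Symbol characters belong to the Math Operators block. In the value 2003-12-22∼2005-12-21 that character appears to be the tilde operator (U+223C). The sketch below is one possible way to parse such values into a start and end date. The function name `parse_period` and the choice to accept exactly these two separators are assumptions made for this illustration.

```python
import re
from datetime import date

# Accept the ASCII tilde "~" (U+007E) and the tilde operator (U+223C).
SEPARATORS = re.compile("[~\u223c]")


def parse_period(text):
    """Split 'YYYY-MM-DD~YYYY-MM-DD' into (start, end) dates.

    Raises ValueError if the text does not hold exactly two ISO dates.
    """
    parts = SEPARATORS.split(text.strip())
    if len(parts) != 2:
        raise ValueError(f"expected exactly one separator in {text!r}")
    start, end = (date.fromisoformat(part) for part in parts)
    return start, end


print(parse_period("2020-01-01~2022-12-31"))
print(parse_period("2003-12-22\u223c2005-12-21"))
```

The Other Punctuation table reports a single comma in this column, so at least one value is likely malformed. With this sketch such a value would raise `ValueError` rather than being silently repaired.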

Most occurring scripts

Value | Count | Frequency (%)
Common | 5943 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1345 | 22.6%
1 | 1161 | 19.5%
- | 1131 | 19.0%
2 | 925 | 15.6%
9 | 315 | 5.3%
3 | 285 | 4.8%
~ | 245 | 4.1%
6 | 117 | 2.0%
8 | 105 | 1.8%
5 | 99 | 1.7%
Other values (4) | 215 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5905 | 99.4%
Math Operators | 38 | 0.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1345 | 22.8%
1 | 1161 | 19.7%
- | 1131 | 19.2%
2 | 925 | 15.7%
9 | 315 | 5.3%
3 | 285 | 4.8%
~ | 245 | 4.1%
6 | 117 | 2.0%
8 | 105 | 1.8%
5 | 99 | 1.7%
Other values (3) | 177 | 3.0%

Math Operators
Value | Count | Frequency (%)
∼ | 38 | 100.0%

Interactions

[Figure: interactions plot]

Correlations

Variable | 자료구분 | 사업구분 | 등록년도
자료구분 | 1.000 | 0.846 | 0.234
사업구분 | 0.846 | 1.000 | 0.157
등록년도 | 0.234 | 0.157 | 1.000

Variable | 자료구분 | 사업구분
자료구분 | 1.000 | 0.959
사업구분 | 0.959 | 1.000

Variable | 등록년도 | 자료구분 | 사업구분
등록년도 | 1.000 | 0.196 | 0.064
자료구분 | 0.196 | 1.000 | 0.959
사업구분 | 0.064 | 0.959 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows
Index | 자료구분 | 자료명 | 사업구분 | 등록년도 | 수행기간
0 | 기술연구 | 시설물 영상취득 방안 체계화 및 손상분석 자동화 방안 연구 | 공통 | 2023 | 2020-01-01~2022-12-31
1 | 기술연구 | 시설물 안전진단·점검의 자동화 추진방안 연구 | 공통 | 2023 | 2022-01-01~2022-12-31
2 | 기술연구 | 수직형 시설물의 AI기반 비진입 스캐닝 자동화 시스템 개발 | 터널 | 2023 | 2020-01-01~2022-12-31
3 | 정책/연구 | 건설안전 및 품질 관련 제도의 실효성 분석 및 개선 연구 | 공통 | 2023 | 2022-01-01~2022-12-31
4 | 정책/연구 | 건설사고 재해율 저감을 위한 해외 선진사례 조사 및 분석 연구 | 공통 | 2023 | 2022-01-01~2022-12-31
5 | 정책/연구 | 건설·시설안전 사회이슈 분석을 통한 시사점 도출 및 안전정책 발굴 연구 | 공통 | 2023 | 2022-01-01~2022-12-31
6 | 정책/연구 | 건설·시설 안전분야 정보관리체계 개선 및 활용방안 연구 | 공통 | 2023 | 2020-01-01~2022-12-31
7 | 정책/연구 | 통계 기반 국토안전 정책동향·전망 연구 | 공통 | 2022 | 2021-01-01~2021-12-31
8 | 정책/연구 | 드론 활용을 위한 시설물 안전점검 등에서의 시범적용 연구 | 공통 | 2022 | 2020-01-01~2021-12-31
9 | 기술연구 | 건축 마감재 안전점검 요령 및 보수·보강 기법 개발 | 건축 | 2022 | 2020-01-01~2021-12-31

Last rows
Index | 자료구분 | 자료명 | 사업구분 | 등록년도 | 수행기간
273 | 기술연구 | 건축물의 옥내주차장 구조물에 대한 안전성 평가기법에 관한 연구 | 건축 | 2004 | 1998-01-01~1999-12-31
274 | 기술연구 | 건축물의 재건축 판정을 위한 평가방안 | 건축 | 2004 | 1999-12-13~2000-12-12
275 | 기술연구 | 고층건축물 구조안정성 및 사용성 장기계측 시스템을 이용한 상시 안전진단 기술개발 | 건축 | 2004 | 1999-11-15~2001-11-14
276 | 기술연구 | 기존건축물의 종합성능 평가모델 개발 | 건축 | 2004 | 2001-06-25~2002-10-24
277 | 기술연구 | 대형서스펜션 구조물의 유지관리기법 개발 | 건축 | 2004 | 2001-06-25~2002-10-24
278 | 기술연구 | 공동주택 장수명화를 위한 유지관리 시스템 개발 | 건축 | 2004 | 2001-08-30~2003-08-29
279 | 기술연구 | 재건축 판정을 위한 안전진단 기준 및 절차에 관한 연구 | 건축 | 2004 | 2003-01-29~2003-05-28
280 | 기술연구 | 건설교통안전관리 개선방안 연구(건축부분) | 건축 | 2004 | 2003-03-12~2003-07-09
281 | 기술연구 | 열적외선 장비를 활용한 건축물 진단기법 개발 | 건축 | 2005 | 2003-01-01~2004-12-31
282 | 기술연구 | 건축물의 유지관리를 위한 진단평가시스템 개발 | 건축 | 2005 | 2002-02-01~2004-10-27

Duplicate rows

Most frequently occurring

Index | 자료구분 | 자료명 | 사업구분 | 등록년도 | 수행기간 | # duplicates
0 | 정책/연구 | 시설물 안전 및 유지관리 실태조사 방안 연구 | 공통 | 2016 | 2013-01-01~2014-12-31 | 2
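
One reading of these figures is that the record above occurs twice (# duplicates 2), which leaves one surplus row and matches the single duplicate row (1 / 283 ≈ 0.4%) reported in the overview. The sketch below shows that counting on a hypothetical three-row excerpt built from records quoted in this report. The function name `duplicate_summary` is made up for this illustration, and the excerpt is not the full dataset.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (rows occurring more than once with their counts, surplus row count)."""
    counts = Counter(rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    # Every occurrence beyond the first counts as one duplicate row.
    surplus = sum(n - 1 for n in repeated.values())
    return repeated, surplus


# Hypothetical excerpt: the duplicated record entered twice, plus the first sample row.
dup = ("정책/연구", "시설물 안전 및 유지관리 실태조사 방안 연구", "공통", 2016, "2013-01-01~2014-12-31")
other = ("기술연구", "시설물 영상취득 방안 체계화 및 손상분석 자동화 방안 연구", "공통", 2023, "2020-01-01~2022-12-31")
rows = [dup, other, dup]

repeated, surplus = duplicate_summary(rows)
print(repeated[dup], surplus)

# Share of duplicate rows in the full dataset of 283 observations.
share = f"{1 / 283:.1%}"
print(share)
```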