Overview

Dataset statistics

Number of variables: 9
Number of observations: 39
Missing cells: 140
Missing cells (%): 39.9%
Duplicate rows: 1
Duplicate rows (%): 2.6%
Total size in memory: 2.9 KiB
Average record size in memory: 77.4 B
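
The two percentages above can be recovered from the counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative and are not taken from the profiling tool.

```python
# Counts copied from the dataset statistics above.
n_variables = 9
n_observations = 39
missing_cells = 140
duplicate_rows = 1

# Assumption: percentages are relative to total cells and total rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```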

Variable types

Numeric: 2
Categorical: 4
Text: 1
DateTime: 2

Dataset

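Description: This dataset lists the research projects in Fourth Industrial Revolution fields carried out by the Korea Gas Safety Corporation (한국가스안전공사). It is published for the public and private sectors to encourage the discovery of collaborative projects with industry, academia, and research institutes.
Author: 한국가스안전공사 (Korea Gas Safety Corporation)
URL: https://www.data.go.kr/data/15064357/fileData.do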

Alerts

Dataset has 1 (2.6%) duplicate row [Duplicates]
과제구분 is highly overall correlated with 과제분야 and 1 other field [High correlation]
수탁-위탁기관 is highly overall correlated with 과제분야 and 1 other field [High correlation]
과제분야 is highly overall correlated with 번호 and 4 other fields [High correlation]
수행부서 is highly overall correlated with 번호 and 1 other field [High correlation]
번호 is highly overall correlated with 과제분야 and 1 other field [High correlation]
연구기간(월) is highly overall correlated with 과제분야 [High correlation]
번호 has 28 (71.8%) missing values [Missing]
과제명 has 28 (71.8%) missing values [Missing]
연구시작일 has 28 (71.8%) missing values [Missing]
연구종료일 has 28 (71.8%) missing values [Missing]
연구기간(월) has 28 (71.8%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 22:34:39.870739
Analysis finished: 2023-12-12 22:34:41.263293
Duration: 1.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
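
The reported duration is the difference between the two timestamps above. A small sketch of that subtraction using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-12 22:34:39.870739")
finished = datetime.fromisoformat("2023-12-12 22:34:41.263293")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```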

Variables

번호 (project number)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 11
Distinct (%): 100.0%
Missing: 28
Missing (%): 71.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 1
5th percentile: 1.5
Q1: 3.5
Median: 6
Q3: 8.5
95th percentile: 10.5
Maximum: 11
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.3166248
Coefficient of variation (CV): 0.5527708
Kurtosis: -1.2
Mean: 6
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 66
Variance: 11
Monotonicity: Strictly increasing
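
Several of the descriptive statistics above can be checked by hand. The sketch below assumes the 11 non-missing values of 번호 are the integers 1 to 11 (consistent with the reported minimum, maximum, distinct count, and sum) and that the variance uses the sample (n - 1) denominator; both are assumptions made for illustration, not statements about how the report was computed.

```python
import statistics

# Assumption: the non-missing values are exactly 1, 2, ..., 11.
values = list(range(1, 12))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(f"Sum: {sum(values)}, Mean: {mean}, Variance: {variance}")
print(f"Standard deviation: {std:.7f}, CV: {cv:.7f}, MAD: {mad}")
```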
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
1          1      2.6%
2          1      2.6%
3          1      2.6%
4          1      2.6%
5          1      2.6%
6          1      2.6%
7          1      2.6%
8          1      2.6%
9          1      2.6%
10         1      2.6%
(Missing)  28     71.8%

Value  Count  Frequency (%)
1      1      2.6%
2      1      2.6%
3      1      2.6%
4      1      2.6%
5      1      2.6%
6      1      2.6%
7      1      2.6%
8      1      2.6%
9      1      2.6%
10     1      2.6%

Value  Count  Frequency (%)
11     1      2.6%
10     1      2.6%
9      1      2.6%
8      1      2.6%
7      1      2.6%
6      1      2.6%
5      1      2.6%
4      1      2.6%
3      1      2.6%
2      1      2.6%

과제분야 (project field)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
<NA>: 28
4차산업혁명: 11

Length

Max length: 6
Median length: 4
Mean length: 4.5641026
Min length: 4
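
The length figures above are consistent with the 28 empty entries being measured as the four-character text "<NA>" rather than as missing values, which would also explain why this column reports 0 missing. That reading is an inference from the numbers, not something the report states. A minimal sketch under that assumption:

```python
import statistics

# Assumption: missing entries are counted as the literal text "<NA>".
values = ["<NA>"] * 28 + ["4차산업혁명"] * 11
lengths = [len(v) for v in values]

max_length = max(lengths)
median_length = statistics.median(lengths)
mean_length = sum(lengths) / len(lengths)
min_length = min(lengths)

print(f"Max: {max_length}, Median: {median_length}, "
      f"Mean: {mean_length:.7f}, Min: {min_length}")
```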

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4차산업혁명
2nd row: 4차산업혁명
3rd row: 4차산업혁명
4th row: 4차산업혁명
5th row: 4차산업혁명

Common Values

Value       Count  Frequency (%)
<NA>        28     71.8%
4차산업혁명  11     28.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
na          28     71.8%
4차산업혁명  11     28.2%

과제명 (project title)
Text

MISSING 

Distinct: 11
Distinct (%): 100.0%
Missing: 28
Missing (%): 71.8%
Memory size: 444.0 B

Length

Max length: 69
Median length: 46
Mean length: 44.454545
Min length: 26

Characters and Unicode

Total characters: 489
Distinct characters: 165
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
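
As an illustration of those character properties, the sketch below counts Unicode general categories for one of the sample titles using Python's standard `unicodedata` module. Note that `unicodedata.category` returns two-letter codes (for example `Lo` for Other Letter, `Zs` for Space Separator, `Ps` and `Pe` for Open and Close Punctuation) rather than the long names used in this report, and that the standard library does not expose script or block names, so those are not shown.

```python
import unicodedata
from collections import Counter

# One of the sample titles from this variable.
title = "드론을 활용한 도시가스배관 순회점검 기술기준(안) 개발"

# Count how many characters fall into each Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in title)

for code, count in categories.most_common():
    print(code, count)
```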

Unique

Unique: 11
Unique (%): 100.0%

Sample

1st row: IOT기반 볼타입 지능형 계측기 현장부합화 연구
2nd row: 도시가스 사용시설 스마트 안전관리 장치 및 서비스 플랫폼 기술 개발
3rd row: 산업용 하이브리드밸브 모니터링 시스템 및 밸브제어를 위한 인공지능형 제어 액추에이터 시스템 개발
4th row: 가스시설 무선 차단 제어 성능 평가 인프라 구축 및 제도 개선 - 충북 규제자유특구
5th row: 드론을 활용한 도시가스배관 순회점검 기술기준(안) 개발
Value              Count  Frequency (%)
개발                7      5.8%
                   6      5.0%
시스템              5      4.1%
안전                3      2.5%
기술                3      2.5%
스마트              3      2.5%
구축                2      1.7%
평가                2      1.7%
성능                2      1.7%
활용한              2      1.7%
Other values (79)  86     71.1%

Most occurring characters

ValueCountFrequency (%)
110
 
22.5%
16
 
3.3%
12
 
2.5%
11
 
2.2%
9
 
1.8%
9
 
1.8%
8
 
1.6%
8
 
1.6%
7
 
1.4%
7
 
1.4%
Other values (155) 292
59.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       352    72.0%
Space Separator    110    22.5%
Lowercase Letter   11     2.2%
Uppercase Letter   8      1.6%
Decimal Number     5      1.0%
Dash Punctuation   1      0.2%
Close Punctuation  1      0.2%
Open Punctuation   1      0.2%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    16     4.5%
                    12     3.4%
                    11     3.1%
                    9      2.6%
                    9      2.6%
                    8      2.3%
                    8      2.3%
                    7      2.0%
                    7      2.0%
                    6      1.7%
Other values (133)  259    73.6%

Lowercase Letter
Value  Count  Frequency (%)
r      2      18.2%
k      2      18.2%
g      1      9.1%
e      1      9.1%
n      1      9.1%
o      1      9.1%
t      1      9.1%
a      1      9.1%
m      1      9.1%

Uppercase Letter
Value  Count  Frequency (%)
I      2      25.0%
T      2      25.0%
W      1      12.5%
O      1      12.5%
D      1      12.5%
S      1      12.5%

Decimal Number
Value  Count  Frequency (%)
0      3      60.0%
2      1      20.0%
3      1      20.0%

Space Separator
Value    Count  Frequency (%)
(space)  110    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  352    72.0%
Common  118    24.1%
Latin   19     3.9%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    16     4.5%
                    12     3.4%
                    11     3.1%
                    9      2.6%
                    9      2.6%
                    8      2.3%
                    8      2.3%
                    7      2.0%
                    7      2.0%
                    6      1.7%
Other values (133)  259    73.6%

Latin
Value             Count  Frequency (%)
r                 2      10.5%
k                 2      10.5%
I                 2      10.5%
T                 2      10.5%
g                 1      5.3%
W                 1      5.3%
O                 1      5.3%
e                 1      5.3%
n                 1      5.3%
o                 1      5.3%
Other values (5)  5      26.3%

Common
Value    Count  Frequency (%)
(space)  110    93.2%
0        3      2.5%
2        1      0.8%
3        1      0.8%
-        1      0.8%
)        1      0.8%
(        1      0.8%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  352    72.0%
ASCII   137    28.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            110    80.3%
0                  3      2.2%
r                  2      1.5%
k                  2      1.5%
I                  2      1.5%
T                  2      1.5%
g                  1      0.7%
2                  1      0.7%
W                  1      0.7%
3                  1      0.7%
Other values (12)  12     8.8%

Hangul
Value               Count  Frequency (%)
                    16     4.5%
                    12     3.4%
                    11     3.1%
                    9      2.6%
                    9      2.6%
                    8      2.3%
                    8      2.3%
                    7      2.0%
                    7      2.0%
                    6      1.7%
Other values (133)  259    73.6%

수행부서 (performing department)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
<NA>: 28
시스템연구부: 6
기기연구부: 2
장치연구부: 1
2-수소인프라연구부: 1

Length

Max length: 10
Median length: 4
Mean length: 4.6410256
Min length: 4

Unique

Unique: 3
Unique (%): 7.7%

Sample

1st row: 시스템연구부
2nd row: 시스템연구부
3rd row: 시스템연구부
4th row: 시스템연구부
5th row: 장치연구부

Common Values

Value               Count  Frequency (%)
<NA>                28     71.8%
시스템연구부         6      15.4%
기기연구부           2      5.1%
장치연구부           1      2.6%
2-수소인프라연구부   1      2.6%
수소인프라연구부     1      2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
na                  28     71.8%
시스템연구부         6      15.4%
기기연구부           2      5.1%
장치연구부           1      2.6%
2-수소인프라연구부   1      2.6%
수소인프라연구부     1      2.6%

과제구분 (project type)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
<NA>: 28
정부수탁: 9
자체: 1
2-정부수탁: 1

Length

Max length: 6
Median length: 4
Mean length: 4
Min length: 2

Unique

Unique: 2
Unique (%): 5.1%

Sample

1st row: 자체
2nd row: 정부수탁
3rd row: 정부수탁
4th row: 정부수탁
5th row: 정부수탁

Common Values

Value       Count  Frequency (%)
<NA>        28     71.8%
정부수탁     9      23.1%
자체         1      2.6%
2-정부수탁   1      2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
na          28     71.8%
정부수탁     9      23.1%
자체         1      2.6%
2-정부수탁   1      2.6%

수탁-위탁기관 (commissioning agency)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
<NA>: 28
정부(산업부): 8
정부(중기부): 2
자체연구: 1

Length

Max length: 7
Median length: 4
Mean length: 4.7692308
Min length: 4

Unique

Unique: 1
Unique (%): 2.6%

Sample

1st row: 자체연구
2nd row: 정부(산업부)
3rd row: 정부(산업부)
4th row: 정부(중기부)
5th row: 정부(산업부)

Common Values

Value         Count  Frequency (%)
<NA>          28     71.8%
정부(산업부)   8      20.5%
정부(중기부)   2      5.1%
자체연구       1      2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count  Frequency (%)
na            28     71.8%
정부(산업부)   8      20.5%
정부(중기부)   2      5.1%
자체연구       1      2.6%

연구시작일 (research start date)
Date

MISSING 

Distinct: 10
Distinct (%): 90.9%
Missing: 28
Missing (%): 71.8%
Memory size: 444.0 B
Minimum: 2017-05-01 00:00:00
Maximum: 2021-11-01 00:00:00
Histogram with fixed size bins (bins=10)

연구종료일 (research end date)
Date

MISSING 

Distinct: 10
Distinct (%): 90.9%
Missing: 28
Missing (%): 71.8%
Memory size: 444.0 B
Minimum: 2020-04-30 00:00:00
Maximum: 2026-06-30 00:00:00
Histogram with fixed size bins (bins=10)

연구기간(월) (research period, in months)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 8
Distinct (%): 72.7%
Missing: 28
Missing (%): 71.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.181818
Minimum: 12
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 12
5th percentile: 18
Q1: 25
Median: 36
Q3: 40
95th percentile: 48.5
Maximum: 49
Range: 37
Interquartile range (IQR): 15
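
The quantiles above match what linear interpolation between neighbouring sorted values would give. The sketch below rebuilds the 11 non-missing values from the frequency table for this variable and applies a hand-written `percentile` helper; the helper is illustrative and is not the profiling tool's own routine.

```python
def percentile(sorted_values, q):
    """Return the q-th percentile (0-100) by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low_value = sorted_values[lower]
    return low_value + fraction * (sorted_values[upper] - low_value)

# Values rebuilt from the value counts reported for 연구기간(월).
months = sorted([36] * 3 + [24] * 2 + [12, 49, 26, 48, 44, 30])

q1 = percentile(months, 25)
q3 = percentile(months, 75)
print("5th percentile:", percentile(months, 5))
print("Q1:", q1, "Median:", percentile(months, 50), "Q3:", q3)
print("95th percentile:", percentile(months, 95))
print("IQR:", q3 - q1, "Range:", months[-1] - months[0])
```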

Descriptive statistics

Standard deviation: 11.356216
Coefficient of variation (CV): 0.34224212
Kurtosis: -0.38582405
Mean: 33.181818
Median Absolute Deviation (MAD): 10
Skewness: -0.24281094
Sum: 365
Variance: 128.96364
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value      Count  Frequency (%)
36         3      7.7%
24         2      5.1%
12         1      2.6%
49         1      2.6%
26         1      2.6%
48         1      2.6%
44         1      2.6%
30         1      2.6%
(Missing)  28     71.8%

Value  Count  Frequency (%)
12     1      2.6%
24     2      5.1%
26     1      2.6%
30     1      2.6%
36     3      7.7%
44     1      2.6%
48     1      2.6%
49     1      2.6%

Value  Count  Frequency (%)
49     1      2.6%
48     1      2.6%
44     1      2.6%
36     3      7.7%
30     1      2.6%
26     1      2.6%
24     2      5.1%
12     1      2.6%

Interactions


Correlations

              | 번호 | 과제명 | 수행부서 | 과제구분 | 수탁-위탁기관 | 연구시작일 | 연구종료일 | 연구기간(월)
번호          | 1.000 | 1.000 | 0.789 | 1.000 | 1.000 | 0.950 | 0.950 | 0.654
과제명        | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
수행부서      | 0.789 | 1.000 | 1.000 | 0.575 | 0.000 | 1.000 | 1.000 | 0.431
과제구분      | 1.000 | 1.000 | 0.575 | 1.000 | 0.911 | 1.000 | 1.000 | 0.000
수탁-위탁기관 | 1.000 | 1.000 | 0.000 | 0.911 | 1.000 | 1.000 | 1.000 | 0.000
연구시작일    | 0.950 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
연구종료일    | 0.950 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
연구기간(월)  | 0.654 | 1.000 | 0.431 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000

              | 과제구분 | 수탁-위탁기관 | 과제분야 | 수행부서
과제구분      | 1.000 | 0.626 | 1.000 | 0.414
수탁-위탁기관 | 0.626 | 1.000 | 1.000 | 0.000
과제분야      | 1.000 | 1.000 | 1.000 | 1.000
수행부서      | 0.414 | 0.000 | 1.000 | 1.000

              | 번호 | 연구기간(월) | 과제분야 | 수행부서 | 과제구분 | 수탁-위탁기관
번호          | 1.000 | 0.391 | 1.000 | 0.667 | 0.000 | 0.000
연구기간(월)  | 0.391 | 1.000 | 1.000 | 0.108 | 0.000 | 0.000
과제분야      | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
수행부서      | 0.667 | 0.108 | 1.000 | 1.000 | 0.414 | 0.000
과제구분      | 0.000 | 0.000 | 1.000 | 0.414 | 1.000 | 0.626
수탁-위탁기관 | 0.000 | 0.000 | 1.000 | 0.000 | 0.626 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Row | 번호 | 과제분야 | 과제명 | 수행부서 | 과제구분 | 수탁-위탁기관 | 연구시작일 | 연구종료일 | 연구기간(월)
0 | 1 | 4차산업혁명 | IOT기반 볼타입 지능형 계측기 현장부합화 연구 | 시스템연구부 | 자체 | 자체연구 | 2019-01-01 | 2020-12-31 | 24
1 | 2 | 4차산업혁명 | 도시가스 사용시설 스마트 안전관리 장치 및 서비스 플랫폼 기술 개발 | 시스템연구부 | 정부수탁 | 정부(산업부) | 2017-05-01 | 2020-04-30 | 36
2 | 3 | 4차산업혁명 | 산업용 하이브리드밸브 모니터링 시스템 및 밸브제어를 위한 인공지능형 제어 액추에이터 시스템 개발 | 시스템연구부 | 정부수탁 | 정부(산업부) | 2017-05-01 | 2020-04-30 | 36
3 | 4 | 4차산업혁명 | 가스시설 무선 차단 제어 성능 평가 인프라 구축 및 제도 개선 - 충북 규제자유특구 | 시스템연구부 | 정부수탁 | 정부(중기부) | 2019-08-09 | 2021-08-08 | 24
4 | 5 | 4차산업혁명 | 드론을 활용한 도시가스배관 순회점검 기술기준(안) 개발 | 장치연구부 | 정부수탁 | 정부(산업부) | 2020-04-01 | 2021-03-21 | 12
5 | 6 | 4차산업혁명 | Smart Drone기반 고층건물 및 교량첨가 가스배관 검사장비 상용화 개발 | 기기연구부 | 정부수탁 | 정부(산업부) | 2018-05-01 | 2021-04-30 | 36
6 | 7 | 4차산업혁명 | 지능형 통합 에너지 플랫폼 기반 복합에너지 허브 구축 사업 | 기기연구부 | 정부수탁 | 정부(중기부) | 2019-05-01 | 2023-05-31 | 49
7 | 8 | 4차산업혁명 | 디지털 트윈을 활용한 재생에너지 연계 알칼라인 수소생산 시스템 성능 표준화 및 운영 안전성 평가 기술 개발 | 2-수소인프라연구부 | 2-정부수탁 | 정부(산업부) | 2020-11-01 | 2022-12-31 | 26
8 | 9 | 4차산업혁명 | 최대이륙중량 200kg급 비행체용 순정격출력 30kW급 연료전지 파워팩 시스템 개발 | 수소인프라연구부 | 정부수탁 | 정부(산업부) | 2021-05-01 | 2025-04-30 | 48
9 | 10 | 4차산업혁명 | 고위험가스 밀집시설에서 현장상황에 따른 위험예측과 사고 대응이 가능한 차등적 안전 프로세스 중심의 스마트 안전관리 시스템개발 | 시스템연구부 | 정부수탁 | 정부(산업부) | 2021-11-01 | 2026-06-30 | 44

Row | 번호 | 과제분야 | 과제명 | 수행부서 | 과제구분 | 수탁-위탁기관 | 연구시작일 | 연구종료일 | 연구기간(월)
29 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
32 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
33 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
34 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
35 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
36 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
37 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
38 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | 번호 | 과제분야 | 과제명 | 수행부서 | 과제구분 | 수탁-위탁기관 | 연구시작일 | 연구종료일 | 연구기간(월) | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 28