Overview

Dataset statistics

Number of variables: 6
Number of observations: 87
Missing cells: 27
Missing cells (%): 5.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 49.5 B

Variable types

Categorical: 2
Text: 2
DateTime: 2

Dataset

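Description: Construction information from 경상북도개발공사 (Gyeongsangbuk-do Development Corporation), covering each project's name (사업명), construction type (공사구분), construction name (공사명), contractor (시공사), start date (착공일), and completion date (준공일).
Author: 경상북도개발공사
URL: https://www.data.go.kr/data/15011735/fileData.do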

Alerts

Missing: 공사명 has 1 (1.1%) missing values
Missing: 시공사 has 24 (27.6%) missing values
Missing: 착공일 has 1 (1.1%) missing values
Missing: 준공일 has 1 (1.1%) missing values
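
The percentages in the dataset statistics and alerts can be cross-checked with a few lines of arithmetic. The sketch below uses only figures printed in this report (6 variables, 87 observations, and the per-column missing counts); the variable names are illustrative.

```python
# Figures copied from the report; nothing here is read from the CSV itself.
n_variables = 6
n_observations = 87
missing_per_column = {
    "사업명": 0,
    "공사구분": 0,
    "공사명": 1,
    "시공사": 24,
    "착공일": 1,
    "준공일": 1,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

# Per-column missing share, rounded to one decimal as in the report.
column_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_per_column.items()
}

print(total_cells, missing_cells, round(missing_pct, 1))
print(column_pct)
```

The per-column counts sum to the reported 27 missing cells, and dividing by all cells (not by rows) gives the reported overall percentage.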

Reproduction

Analysis started: 2023-12-12 12:01:23.371224
Analysis finished: 2023-12-12 12:01:24.309524
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
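
A report like this one is typically produced by loading the file into pandas and passing the frame to ydata-profiling. The following is a minimal sketch, not the configuration actually used here: the function name, the file paths, the `cp949` encoding, and the report title are all assumptions, and the original `config.json` may set options this sketch does not.

```python
def build_profile(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report to html_path.

    The encoding default is an assumption; adjust it to match the file.
    """
    # Imported lazily so this module loads even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parse the two date columns so they are profiled as dates, not text.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["착공일", "준공일"])

    # The title is illustrative; the original report's title is not shown.
    profile = ProfileReport(df, title="경상북도개발공사 공사정보")
    profile.to_file(html_path)
    return profile
```

Calling `build_profile("data.csv", "report.html")` with a local copy of the dataset would regenerate an HTML report; both paths are placeholders.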

Variables

사업명
Categorical

Distinct: 30
Distinct (%): 34.5%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

경북도청신도시 건설사업(1단계): 11
세계유교선비문화공원 및 한국문화테마파크 조성사업: 11
경북도청신도시 건설사업(2단계): 5
경상북도 동해안 119특수구조단 건립사업: 4
봉화소방서 신축사업: 4
Other values (25): 52

Length

Max length: 26
Median length: 19
Mean length: 16.425287
Min length: 4

Unique

Unique: 12
Unique (%): 13.8%

Sample

1st row: 경상북도개발공사 사옥이전 신축사업
2nd row: 경상북도개발공사 사옥이전 신축사업
3rd row: 경상북도개발공사 사옥이전 신축사업
4th row: 경상북도개발공사 사옥이전 신축사업
5th row: 경북도청신도시 한옥시범주택 건설사업

Common Values

Value | Count | Frequency (%)
경북도청신도시 건설사업(1단계) | 11 | 12.6%
세계유교선비문화공원 및 한국문화테마파크 조성사업 | 11 | 12.6%
경북도청신도시 건설사업(2단계) | 5 | 5.7%
경상북도 동해안 119특수구조단 건립사업 | 4 | 4.6%
봉화소방서 신축사업 | 4 | 4.6%
청송소방서 신축사업 | 4 | 4.6%
의성세포배양산업지원센터건립사업 | 4 | 4.6%
경상북도개발공사 사옥이전 신축사업 | 4 | 4.6%
경북권역 재활병원 건립사업 | 4 | 4.6%
경상북도 보훈회관 건립공사 | 4 | 4.6%
Other values (20) | 32 | 36.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
경북도청신도시 | 19 | 9.0%
조성사업 | 16 | 7.5%
건립사업 | 12 | 5.7%
신축사업 | 12 | 5.7%
한국문화테마파크 | 11 | 5.2%
세계유교선비문화공원 | 11 | 5.2%
건설사업(1단계 | 11 | 5.2%
[unrecovered] | 11 | 5.2%
경상북도 | 9 | 4.2%
건설사업(2단계 | 5 | 2.4%
Other values (39) | 95 | 44.8%

공사구분
Categorical

Distinct: 19
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 828.0 B

토목: 14
전기: 14
통신: 12
건축: 8
시설공사: 6
Other values (14): 33

Length

Max length: 10
Median length: 2
Mean length: 2.9655172
Min length: 2

Unique

Unique: 3
Unique (%): 3.4%

Sample

1st row: 건축, 토목, 조경
2nd row: 전기
3rd row: 통신
4th row: 기계설비
5th row: 시설공사

Common Values

Value | Count | Frequency (%)
토목 | 14 | 16.1%
전기 | 14 | 16.1%
통신 | 12 | 13.8%
건축 | 8 | 9.2%
시설공사 | 6 | 6.9%
소방 | 5 | 5.7%
조경 | 5 | 5.7%
토목,조경 | 3 | 3.4%
건축, 토목, 조경 | 3 | 3.4%
임업 | 2 | 2.3%
Other values (9) | 15 | 17.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
토목 | 18 | 19.1%
전기 | 14 | 14.9%
통신 | 12 | 12.8%
건축 | 11 | 11.7%
조경 | 9 | 9.6%
시설공사 | 6 | 6.4%
소방 | 5 | 5.3%
토목,조경 | 3 | 3.2%
구조물해체 | 2 | 2.1%
통신공사 | 2 | 2.1%
Other values (7) | 12 | 12.8%

공사명
Text

MISSING 

Distinct: 76
Distinct (%): 88.4%
Missing: 1
Missing (%): 1.1%
Memory size: 828.0 B

Length

Max length: 35
Median length: 28
Mean length: 17.569767
Min length: 4

Characters and Unicode

Total characters: 1511
Distinct characters: 152
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
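
As an illustration of those character properties, the sketch below counts Unicode general categories for one of the 공사명 sample values using Python's standard `unicodedata` module. That module exposes the general category but not the script or block, so only categories are shown; the helper name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values listed for the 공사명 column.
sample = "경북도청이전신도시 한옥시범주택-Ⅱ"
counts = category_counts(sample)

# Lo = Other Letter, Zs = Space Separator,
# Pd = Dash Punctuation, Nl = Letter Number.
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the category names used in the tables: Hangul syllables fall under Other Letter, and the Roman numeral is a Letter Number rather than a Latin letter.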

Unique

Unique: 70
Unique (%): 81.4%

Sample

1st row: 경상북도개발공사 사옥이전 신축사업 시설공사
2nd row: 전기공사
3rd row: 통신공사
4th row: 기계설비공사
5th row: 경북도청이전신도시 한옥시범주택-Ⅱ

Value | Count | Frequency (%)
전기공사 | 14 | 6.0%
건립 | 13 | 5.6%
조성공사 | 11 | 4.7%
통신공사 | 10 | 4.3%
경북도청이전신도시 | 10 | 4.3%
신축사업 | 9 | 3.9%
시설공사 | 8 | 3.4%
소방공사 | 6 | 2.6%
동해안 | 5 | 2.1%
119특수구조단 | 5 | 2.1%
Other values (85) | 142 | 60.9%

Most occurring characters

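Value | Count | Frequency (%)
(space) | 147 | 9.7%
[unrecovered] | 117 | 7.7%
[unrecovered] | 102 | 6.8%
[unrecovered] | 56 | 3.7%
[unrecovered] | 47 | 3.1%
[unrecovered] | 44 | 2.9%
[unrecovered] | 37 | 2.4%
[unrecovered] | 37 | 2.4%
[unrecovered] | 36 | 2.4%
[unrecovered] | 31 | 2.1%
Other values (142) | 857 | 56.7%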

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1269 | 84.0%
Space Separator | 147 | 9.7%
Decimal Number | 34 | 2.3%
Close Punctuation | 14 | 0.9%
Open Punctuation | 14 | 0.9%
Lowercase Letter | 9 | 0.6%
Uppercase Letter | 9 | 0.6%
Dash Punctuation | 8 | 0.5%
Letter Number | 5 | 0.3%
Other Punctuation | 2 | 0.1%

Most frequent character per category

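Other Letter
Value | Count | Frequency (%)
[unrecovered] | 117 | 9.2%
[unrecovered] | 102 | 8.0%
[unrecovered] | 56 | 4.4%
[unrecovered] | 47 | 3.7%
[unrecovered] | 44 | 3.5%
[unrecovered] | 37 | 2.9%
[unrecovered] | 37 | 2.9%
[unrecovered] | 36 | 2.8%
[unrecovered] | 31 | 2.4%
[unrecovered] | 27 | 2.1%
Other values (122) | 735 | 57.9%

Decimal Number
Value | Count | Frequency (%)
1 | 14 | 41.2%
2 | 13 | 38.2%
9 | 5 | 14.7%
7 | 1 | 2.9%
3 | 1 | 2.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 33.3%
U | 3 | 33.3%
B | 2 | 22.2%
L | 1 | 11.1%

Lowercase Letter
Value | Count | Frequency (%)
t | 3 | 33.3%
y | 3 | 33.3%
i | 3 | 33.3%

Letter Number
Value | Count | Frequency (%)
[unrecovered] | 2 | 40.0%
[unrecovered] | 2 | 40.0%
[unrecovered] | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 147 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 14 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 14 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%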

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1269 | 84.0%
Common | 219 | 14.5%
Latin | 23 | 1.5%

Most frequent character per script

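Hangul
Value | Count | Frequency (%)
[unrecovered] | 117 | 9.2%
[unrecovered] | 102 | 8.0%
[unrecovered] | 56 | 4.4%
[unrecovered] | 47 | 3.7%
[unrecovered] | 44 | 3.5%
[unrecovered] | 37 | 2.9%
[unrecovered] | 37 | 2.9%
[unrecovered] | 36 | 2.8%
[unrecovered] | 31 | 2.4%
[unrecovered] | 27 | 2.1%
Other values (122) | 735 | 57.9%

Common
Value | Count | Frequency (%)
(space) | 147 | 67.1%
) | 14 | 6.4%
1 | 14 | 6.4%
( | 14 | 6.4%
2 | 13 | 5.9%
- | 8 | 3.7%
9 | 5 | 2.3%
, | 2 | 0.9%
7 | 1 | 0.5%
3 | 1 | 0.5%

Latin
Value | Count | Frequency (%)
t | 3 | 13.0%
y | 3 | 13.0%
C | 3 | 13.0%
i | 3 | 13.0%
U | 3 | 13.0%
B | 2 | 8.7%
[unrecovered] | 2 | 8.7%
[unrecovered] | 2 | 8.7%
[unrecovered] | 1 | 4.3%
L | 1 | 4.3%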

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1269 | 84.0%
ASCII | 237 | 15.7%
Number Forms | 5 | 0.3%

Most frequent character per block

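ASCII
Value | Count | Frequency (%)
(space) | 147 | 62.0%
) | 14 | 5.9%
1 | 14 | 5.9%
( | 14 | 5.9%
2 | 13 | 5.5%
- | 8 | 3.4%
9 | 5 | 2.1%
t | 3 | 1.3%
y | 3 | 1.3%
C | 3 | 1.3%
Other values (7) | 13 | 5.5%

Hangul
Value | Count | Frequency (%)
[unrecovered] | 117 | 9.2%
[unrecovered] | 102 | 8.0%
[unrecovered] | 56 | 4.4%
[unrecovered] | 47 | 3.7%
[unrecovered] | 44 | 3.5%
[unrecovered] | 37 | 2.9%
[unrecovered] | 37 | 2.9%
[unrecovered] | 36 | 2.8%
[unrecovered] | 31 | 2.4%
[unrecovered] | 27 | 2.1%
Other values (122) | 735 | 57.9%

Number Forms
Value | Count | Frequency (%)
[unrecovered] | 2 | 40.0%
[unrecovered] | 2 | 40.0%
[unrecovered] | 1 | 20.0%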

시공사
Text

MISSING 

Distinct: 58
Distinct (%): 92.1%
Missing: 24
Missing (%): 27.6%
Memory size: 828.0 B

Length

Max length: 17
Median length: 14
Mean length: 7.6666667
Min length: 3

Characters and Unicode

Total characters: 483
Distinct characters: 113
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 84.1%

Sample

1st row: 서원종합건설㈜ 외 1개사
2nd row: ㈜세림전력, 동양종합건설주식회사
3rd row: ㈜제일정보통신
4th row: 준에너지테크㈜
5th row: ㈜지성건설

Value | Count | Frequency (%)
[unrecovered] | 7 | 8.1%
1개사 | 5 | 5.8%
[unrecovered] | 3 | 3.5%
㈜신도시개발 | 2 | 2.3%
㈜신도시조경 | 2 | 2.3%
계룡건설산업㈜ | 2 | 2.3%
태령종합건설 | 2 | 2.3%
주식회사 | 2 | 2.3%
경북도청이전신도시주민생계조합㈜ | 2 | 2.3%
㈜태양전력공사 | 1 | 1.2%
Other values (58) | 58 | 67.4%

Most occurring characters

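Value | Count | Frequency (%)
[unrecovered] | 41 | 8.5%
[unrecovered] | 29 | 6.0%
[unrecovered] | 28 | 5.8%
(space) | 24 | 5.0%
[unrecovered] | 17 | 3.5%
[unrecovered] | 17 | 3.5%
[unrecovered] | 15 | 3.1%
[unrecovered] | 13 | 2.7%
[unrecovered] | 12 | 2.5%
[unrecovered] | 12 | 2.5%
Other values (103) | 275 | 56.9%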

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 395 | 81.8%
Other Symbol | 41 | 8.5%
Space Separator | 24 | 5.0%
Decimal Number | 8 | 1.7%
Close Punctuation | 6 | 1.2%
Open Punctuation | 6 | 1.2%
Other Punctuation | 3 | 0.6%

Most frequent character per category

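Other Letter
Value | Count | Frequency (%)
[unrecovered] | 29 | 7.3%
[unrecovered] | 28 | 7.1%
[unrecovered] | 17 | 4.3%
[unrecovered] | 17 | 4.3%
[unrecovered] | 15 | 3.8%
[unrecovered] | 13 | 3.3%
[unrecovered] | 12 | 3.0%
[unrecovered] | 12 | 3.0%
[unrecovered] | 12 | 3.0%
[unrecovered] | 10 | 2.5%
Other values (95) | 230 | 58.2%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 62.5%
2 | 2 | 25.0%
3 | 1 | 12.5%

Other Symbol
Value | Count | Frequency (%)
[unrecovered] | 41 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%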

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 436 | 90.3%
Common | 47 | 9.7%

Most frequent character per script

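Hangul
Value | Count | Frequency (%)
[unrecovered] | 41 | 9.4%
[unrecovered] | 29 | 6.7%
[unrecovered] | 28 | 6.4%
[unrecovered] | 17 | 3.9%
[unrecovered] | 17 | 3.9%
[unrecovered] | 15 | 3.4%
[unrecovered] | 13 | 3.0%
[unrecovered] | 12 | 2.8%
[unrecovered] | 12 | 2.8%
[unrecovered] | 12 | 2.8%
Other values (96) | 240 | 55.0%

Common
Value | Count | Frequency (%)
(space) | 24 | 51.1%
) | 6 | 12.8%
( | 6 | 12.8%
1 | 5 | 10.6%
, | 3 | 6.4%
2 | 2 | 4.3%
3 | 1 | 2.1%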

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 395 | 81.8%
ASCII | 47 | 9.7%
None | 41 | 8.5%

Most frequent character per block

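None
Value | Count | Frequency (%)
[unrecovered] | 41 | 100.0%

Hangul
Value | Count | Frequency (%)
[unrecovered] | 29 | 7.3%
[unrecovered] | 28 | 7.1%
[unrecovered] | 17 | 4.3%
[unrecovered] | 17 | 4.3%
[unrecovered] | 15 | 3.8%
[unrecovered] | 13 | 3.3%
[unrecovered] | 12 | 3.0%
[unrecovered] | 12 | 3.0%
[unrecovered] | 12 | 3.0%
[unrecovered] | 10 | 2.5%
Other values (95) | 230 | 58.2%

ASCII
Value | Count | Frequency (%)
(space) | 24 | 51.1%
) | 6 | 12.8%
( | 6 | 12.8%
1 | 5 | 10.6%
, | 3 | 6.4%
2 | 2 | 4.3%
3 | 1 | 2.1%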

착공일
Date

MISSING 

Distinct: 63
Distinct (%): 73.3%
Missing: 1
Missing (%): 1.1%
Memory size: 828.0 B
Minimum: 2014-12-15 00:00:00
Maximum: 2022-08-17 00:00:00
Histogram with fixed size bins (bins=50)

준공일
Date

MISSING 

Distinct: 57
Distinct (%): 66.3%
Missing: 1
Missing (%): 1.1%
Memory size: 828.0 B
Minimum: 2015-05-07 00:00:00
Maximum: 2023-10-28 00:00:00
Histogram with fixed size bins (bins=50)

Correlations

 | 사업명 | 공사구분 | 공사명 | 시공사 | 착공일 | 준공일
사업명 | 1.000 | 0.000 | 0.000 | 0.814 | 0.969 | 0.992
공사구분 | 0.000 | 1.000 | 0.896 | 1.000 | 0.938 | 0.867
공사명 | 0.000 | 0.896 | 1.000 | 0.967 | 0.919 | 0.993
시공사 | 0.814 | 1.000 | 0.967 | 1.000 | 0.995 | 0.968
착공일 | 0.969 | 0.938 | 0.919 | 0.995 | 1.000 | 0.995
준공일 | 0.992 | 0.867 | 0.993 | 0.968 | 0.995 | 1.000

 | 공사구분 | 사업명
공사구분 | 1.000 | 0.000
사업명 | 0.000 | 1.000

 | 사업명 | 공사구분
사업명 | 1.000 | 0.000
공사구분 | 0.000 | 1.000
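
The report does not state which coefficient fills these matrices. Cramér's V is a common choice for pairs of categorical variables, so the sketch below assumes it and shows a bias-corrected version in plain Python. The function name and the toy inputs are illustrative; this is not the code that produced the figures above.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical sequences.

    Pairs where either value is None are skipped.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0

    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Subtract the expected bias and clip at zero.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data: identical labels are fully associated, crossed labels are not.
same = cramers_v_corrected(list("aaaabbbb"), list("ccccdddd"))
crossed = cramers_v_corrected(list("aabb"), list("cdcd"))
print(same, crossed)
```

Because the corrected statistic is clipped at zero, a weak association across many categories and few rows can be reported as exactly 0.000. That may be relevant to the 사업명 and 공사구분 pair, though the report itself does not confirm it.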

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
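
Nullity correlation can be read as the ordinary Pearson correlation between per-column "is this value present?" indicators. The sketch below illustrates that idea in plain Python under that assumption; the function name and the toy columns are hypothetical, and the report's own heatmap values are not reproduced here.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators (1 = present, 0 = None).

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: both dates are missing in the same row.
start = ["2015-06-12", None, "2016-04-28", "2017-11-23"]
end = ["2017-05-31", None, "2016-09-04", "2019-08-26"]
together = nullity_correlation(start, end)
print(together)
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one is present when the other is missing.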

Sample

 | 사업명 | 공사구분 | 공사명 | 시공사 | 착공일 | 준공일
0 | 경상북도개발공사 사옥이전 신축사업 | 건축, 토목, 조경 | 경상북도개발공사 사옥이전 신축사업 시설공사 | 서원종합건설㈜ 외 1개사 | 2015-06-12 | 2017-05-31
1 | 경상북도개발공사 사옥이전 신축사업 | 전기 | 전기공사 | ㈜세림전력, 동양종합건설주식회사 | 2015-07-30 | 2017-02-28
2 | 경상북도개발공사 사옥이전 신축사업 | 통신 | 통신공사 | ㈜제일정보통신 | 2015-07-31 | 2017-02-28
3 | 경상북도개발공사 사옥이전 신축사업 | 기계설비 | 기계설비공사 | 준에너지테크㈜ | 2015-08-19 | 2017-02-28
4 | 경북도청신도시 한옥시범주택 건설사업 | 시설공사 | 경북도청이전신도시 한옥시범주택-Ⅱ | ㈜지성건설 | 2016-04-28 | 2016-09-04
5 | 경북도청신도시 한옥시범주택 건설사업 | 시설공사 | 경북도청이전신도시 한옥시범주택-Ⅰ,Ⅲ | ㈜일성건설 | 2016-06-10 | 2016-10-31
6 | 경북도청신도시 한옥시범주택 건설사업 | 시설공사 | 경북도청이전신도시 한옥시범주택-Ⅰ,Ⅲ호 잔여공사 | ㈜시동건설 | 2017-03-29 | 2017-06-30
7 | 포항청년주택건립사업 | 건축 | 포항청년주택건립공사 | <NA> | 2017-11-23 | 2019-08-26
8 | 경북도청이전신도시 B-7BL공공임대주택건립사업 | 건축 | 경북도청이전신도시 B-7BL공공임대주택건립공사 | 코오롱글로벌㈜ 외 3개사 | 2017-11-30 | 2020-05-28
9 | 경상북도 보훈회관 건립공사 | 시설공사 | 시설공사 | 해인건설㈜ | 2015-03-20 | 2016-08-31

 | 사업명 | 공사구분 | 공사명 | 시공사 | 착공일 | 준공일
77 | 호민지수변생태공원조성사업 | 통신 | 호민지수변생태공원조성사업통신공사 | <NA> | 2021-07-30 | 2021-12-31
78 | 호민지수변생태공원조성사업 | 토목,조경 | 호민지수변생태공업조성공사 | <NA> | 2020-04-28 | 2021-10-26
79 | 경북도청신도시건설사업(2단계) | 토목,조경 | 경북도청신도시건설사업(2단계) 조성공사 | <NA> | 2017-12-29 | 2023-10-28
80 | 경주 동천지구 도시개발사업 | 전기 | 경주동천지구 도시개발사업 전기공사 | (주)광진 | 2022-07-04 | 2022-12-31
81 | 경산 화장품특화단지 조성사업 | 전기 | 경산 화장품특화단지 조성사업 전기공사 | 합자회사 하양전기 | 2022-08-17 | 2023-04-30
82 | 봉화춘양 행복주택 건립사업 | 시설공사 | 봉화춘양 행복주택 건립사업 시설공사 | ㈜신화건설, ㈜현대사인개발 | 2021-10-20 | 2023-04-12
83 | 봉화춘양 행복주택 건립사업 | 전기공사 | 봉화춘양 행복주택 건립사업 전기공사 | ㈜대한전기 | 2021-11-11 | 2023-05-04
84 | 봉화춘양 행복주택 건립사업 | 통신공사 | 봉화춘양 행복주택 건립사업 통신공사 | 주식회사 누리정보기술 | 2021-11-12 | 2023-05-05
85 | 봉화춘양 행복주택 건립사업 | 소방공사 | 봉화춘양 행복주택 건립사업 소방공사 | 주식회사 성산전설 | 2021-11-12 | 2023-05-05
86 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
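
Since each record carries a start date (착공일) and a completion date (준공일), a construction duration can be derived per row. The sketch below does this for three of the sample rows using only the standard library; the helper name is illustrative, and the dates are copied from the sample rows.

```python
from datetime import date

# (row index, 착공일, 준공일) taken from the sample rows of this report.
sample_rows = [
    (0, "2015-06-12", "2017-05-31"),
    (4, "2016-04-28", "2016-09-04"),
    (79, "2017-12-29", "2023-10-28"),
]


def duration_days(start_iso, end_iso):
    """Number of days between two ISO-format dates (end minus start)."""
    return (date.fromisoformat(end_iso) - date.fromisoformat(start_iso)).days


for idx, start, end in sample_rows:
    print(idx, start, end, duration_days(start, end))
```

Rows with `<NA>` in either date column, such as row 86, would need to be skipped or handled before calling the helper.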