Overview

Dataset statistics

Number of variables: 6
Number of observations: 25
Missing cells: 36
Missing cells (%): 24.0%
Duplicate rows: 4
Duplicate rows (%): 16.0%
Total size in memory: 1.3 KiB
Average record size in memory: 54.3 B
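
The two percentages follow directly from the counts. A minimal Python sketch of that arithmetic, using only the figures reported in this block (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 25
missing_cells = 36
duplicate_rows = 4

# A "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```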

Variable types

Numeric: 1
Categorical: 1
DateTime: 1
Text: 3

Dataset

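Description: Records of contractor changes on construction contracts in Gyeongsangnam-do. It provides the construction year (공사년도), construction category (공사구분), contractor change date (도급변경일), change details (변경사항), change reason (변경사유), and change remarks (변경비고).
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15049522/fileData.do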

Alerts

Dataset has 4 (16.0%) duplicate rows [Duplicates]
공사구분 is highly imbalanced (59.8%) [Imbalance]
변경사항 has 3 (12.0%) missing values [Missing]
변경사유 has 10 (40.0%) missing values [Missing]
변경비고 has 23 (92.0%) missing values [Missing]
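
The imbalance figure can be cross-checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. That definition is an assumption about how the report derives the value, not something the report states; the function name `imbalance_score` is hypothetical. The counts (23 and 2) are the 공사구분 frequencies reported for that variable.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts (assumed definition)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Shannon entropy in bits of the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Maximum entropy is reached when all classes are equally frequent.
    return 1 - entropy / math.log2(n_classes)

score = imbalance_score([23, 2])   # 공사: 23, 용역: 2
print(f"{score:.1%}")
```

Under this assumption a perfectly balanced column scores 0 and a column dominated by one class approaches 1.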

Reproduction

Analysis started: 2023-12-12 16:32:10.390600
Analysis finished: 2023-12-12 16:32:11.171915
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공사년도
Real number (ℝ)

Distinct: 10
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011.52
Minimum: 2004
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 2004
5-th percentile: 2004
Q1: 2009
Median: 2011
Q3: 2015
95-th percentile: 2019.4
Maximum: 2020
Range: 16
Interquartile range (IQR): 6
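
These quantiles can be reproduced from the value counts reported for 공사년도. The sketch assumes linear interpolation between neighbouring order statistics, which is a common default but is not confirmed by the report; the helper `quantile` is hypothetical.

```python
# Value counts for 공사년도, copied from the frequency table for this variable.
year_counts = {2004: 3, 2005: 1, 2009: 3, 2010: 5, 2011: 2,
               2013: 3, 2014: 1, 2015: 3, 2017: 2, 2020: 2}

# Expand the counts into a sorted list of 25 observations.
years = sorted(y for y, n in year_counts.items() for _ in range(n))

def quantile(sorted_values, p):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = p * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

q1, median, q3 = (quantile(years, p) for p in (0.25, 0.5, 0.75))
p05 = quantile(years, 0.05)
p95 = quantile(years, 0.95)
iqr = q3 - q1
print(p05, q1, median, q3, round(p95, 1), iqr)
```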

Descriptive statistics

Standard deviation: 4.5471603
Coefficient of variation (CV): 0.0022605593
Kurtosis: -0.38692056
Mean: 2011.52
Median Absolute Deviation (MAD): 2
Skewness: 0.039186803
Sum: 50288
Variance: 20.676667
Monotonicity: Increasing
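
Several of these figures can be checked with the Python standard library. The sketch assumes the sample variance (division by n - 1), which matches the reported value; the value counts are the ones reported for 공사년도, and the variable names are illustrative.

```python
import statistics

# Value counts for 공사년도, copied from the frequency table for this variable.
year_counts = {2004: 3, 2005: 1, 2009: 3, 2010: 5, 2011: 2,
               2013: 3, 2014: 1, 2015: 3, 2017: 2, 2020: 2}
years = sorted(y for y, n in year_counts.items() for _ in range(n))

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance, divides by n - 1
std_dev = statistics.stdev(years)       # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median distance from the median.
med = statistics.median(years)
mad = statistics.median([abs(y - med) for y in years])

print(total, mean, round(variance, 6), round(std_dev, 7), mad)
```

The very small CV reflects that the values are calendar years: the spread of a few years is tiny relative to a mean near 2011.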
Histogram with fixed size bins (bins=10)
Common values
Value | Count | Frequency (%)
2010 | 5 | 20.0%
2004 | 3 | 12.0%
2009 | 3 | 12.0%
2013 | 3 | 12.0%
2015 | 3 | 12.0%
2011 | 2 | 8.0%
2017 | 2 | 8.0%
2020 | 2 | 8.0%
2005 | 1 | 4.0%
2014 | 1 | 4.0%

Minimum 10 values
Value | Count | Frequency (%)
2004 | 3 | 12.0%
2005 | 1 | 4.0%
2009 | 3 | 12.0%
2010 | 5 | 20.0%
2011 | 2 | 8.0%
2013 | 3 | 12.0%
2014 | 1 | 4.0%
2015 | 3 | 12.0%
2017 | 2 | 8.0%
2020 | 2 | 8.0%

Maximum 10 values
Value | Count | Frequency (%)
2020 | 2 | 8.0%
2017 | 2 | 8.0%
2015 | 3 | 12.0%
2014 | 1 | 4.0%
2013 | 3 | 12.0%
2011 | 2 | 8.0%
2010 | 5 | 20.0%
2009 | 3 | 12.0%
2005 | 1 | 4.0%
2004 | 3 | 12.0%

공사구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B
공사: 23
용역: 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공사
2nd row: 공사
3rd row: 공사
4th row: 용역
5th row: 공사

Common Values

Value | Count | Frequency (%)
공사 | 23 | 92.0%
용역 | 2 | 8.0%

Length

Histogram of lengths of the category

Common Values (Plot)

도급변경일
DateTime

Distinct: 17
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B
Minimum: 2005-01-18 00:00:00
Maximum: 2021-06-17 00:00:00
Histogram with fixed size bins (bins=17)

변경사항
Text

MISSING 

Distinct: 19
Distinct (%): 86.4%
Missing: 3
Missing (%): 12.0%
Memory size: 332.0 B

Length

Max length: 26
Median length: 17.5
Mean length: 13.636364
Min length: 4

Characters and Unicode

Total characters: 300
Distinct characters: 86
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 72.7%
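
Distinct counts the different values, Unique counts the values that occur exactly once, and both percentages are consistent with a denominator of non-missing rows (22 here: 19/22 = 86.4% and 16/22 = 72.7%). A small sketch of the distinction, using the ten 변경사항 values from the first rows of the Sample section; because this is only a subset of the column, its counts differ from the full-column figures. `None` stands in for `<NA>`.

```python
from collections import Counter

# 변경사항 values of rows 0-9 in the Sample section.
values = [
    "타절정산액 : 총체3214449천원",
    "업체명 주소 대표자",
    "진주시 평거동 7-1번지 2층, 사용인감 변경",
    "상호변경(2006.4.20.)신고",
    "삼영토건(주)->청호건설(주",
    "삼영토건(주)->청호건설㈜",
    "삼영토건(주)->청호건설㈜",
    "상호 대표자 주소변경",
    "주소변경(창원시 성산구 중앙동 12-11)",
    None,
]

non_missing = [v for v in values if v is not None]
counts = Counter(non_missing)

n_distinct = len(counts)                                 # different values
n_unique = sum(1 for n in counts.values() if n == 1)     # values seen exactly once

print(n_distinct, f"{100 * n_distinct / len(non_missing):.1f}%")
print(n_unique, f"{100 * n_unique / len(non_missing):.1f}%")
```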

Sample

1st row: 타절정산액 : 총체3214449천원
2nd row: 업체명 주소 대표자
3rd row: 진주시 평거동 7-1번지 2층, 사용인감 변경
4th row: 상호변경(2006.4.20.)신고
5th row: 삼영토건(주)->청호건설(주
Value | Count | Frequency (%)
대표자 | 5 | 10.2%
상호 | 3 | 6.1%
변경 | 3 | 6.1%
대표이사 | 2 | 4.1%
→(변경)미조건설(주 | 2 | 4.1%
삼영토건(주)->청호건설㈜ | 2 | 4.1%
당초)미조종합건설(주 | 2 | 4.1%
소재지 | 2 | 4.1%
계약상대자 | 1 | 2.0%
변경(이 | 1 | 2.0%
Other values (26) | 26 | 53.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 27 | 9.0%
( | 16 | 5.3%
) | 15 | 5.0%
 | 14 | 4.7%
 | 11 | 3.7%
 | 11 | 3.7%
 | 11 | 3.7%
 | 9 | 3.0%
 | 9 | 3.0%
 | 9 | 3.0%
Other values (76) | 168 | 56.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 199 | 66.3%
Space Separator | 27 | 9.0%
Decimal Number | 21 | 7.0%
Open Punctuation | 16 | 5.3%
Close Punctuation | 15 | 5.0%
Other Punctuation | 9 | 3.0%
Math Symbol | 6 | 2.0%
Dash Punctuation | 5 | 1.7%
Other Symbol | 2 | 0.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 14 | 7.0%
 | 11 | 5.5%
 | 11 | 5.5%
 | 11 | 5.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 8 | 4.0%
 | 8 | 4.0%
 | 5 | 2.5%
Other values (57) | 104 | 52.3%
Decimal Number
Value | Count | Frequency (%)
2 | 5 | 23.8%
1 | 5 | 23.8%
4 | 4 | 19.0%
0 | 3 | 14.3%
3 | 1 | 4.8%
9 | 1 | 4.8%
7 | 1 | 4.8%
6 | 1 | 4.8%
Other Punctuation
Value | Count | Frequency (%)
* | 4 | 44.4%
. | 3 | 33.3%
: | 1 | 11.1%
, | 1 | 11.1%
Math Symbol
Value | Count | Frequency (%)
→ | 3 | 50.0%
> | 3 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 27 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 15 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 5 | 100.0%
Other Symbol
Value | Count | Frequency (%)
㈜ | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 201 | 67.0%
Common | 99 | 33.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 14 | 7.0%
 | 11 | 5.5%
 | 11 | 5.5%
 | 11 | 5.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 8 | 4.0%
 | 8 | 4.0%
 | 5 | 2.5%
Other values (58) | 106 | 52.7%
Common
Value | Count | Frequency (%)
(space) | 27 | 27.3%
( | 16 | 16.2%
) | 15 | 15.2%
2 | 5 | 5.1%
- | 5 | 5.1%
1 | 5 | 5.1%
* | 4 | 4.0%
4 | 4 | 4.0%
→ | 3 | 3.0%
. | 3 | 3.0%
Other values (8) | 12 | 12.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 199 | 66.3%
ASCII | 96 | 32.0%
Arrows | 3 | 1.0%
None | 2 | 0.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 27 | 28.1%
( | 16 | 16.7%
) | 15 | 15.6%
2 | 5 | 5.2%
- | 5 | 5.2%
1 | 5 | 5.2%
* | 4 | 4.2%
4 | 4 | 4.2%
. | 3 | 3.1%
0 | 3 | 3.1%
Other values (7) | 9 | 9.4%
Hangul
Value | Count | Frequency (%)
 | 14 | 7.0%
 | 11 | 5.5%
 | 11 | 5.5%
 | 11 | 5.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 9 | 4.5%
 | 8 | 4.0%
 | 8 | 4.0%
 | 5 | 2.5%
Other values (57) | 104 | 52.3%
Arrows
Value | Count | Frequency (%)
→ | 3 | 100.0%
None
Value | Count | Frequency (%)
㈜ | 2 | 100.0%

변경사유
Text

MISSING 

Distinct: 12
Distinct (%): 80.0%
Missing: 10
Missing (%): 40.0%
Memory size: 332.0 B

Length

Max length: 15
Median length: 11
Mean length: 8.8666667
Min length: 2

Characters and Unicode

Total characters: 133
Distinct characters: 56
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 66.7%

Sample

1st row: 당해 961000천원
2nd row: 면허말소 타절정산
3rd row: 주소지 이전
4th row: 상호, 대표, 주소변경
5th row: 상호, 대표, 주소변경
Value | Count | Frequency (%)
상호 | 3 | 9.7%
주소변경 | 3 | 9.7%
대표 | 3 | 9.7%
대표자 | 2 | 6.5%
변경 | 2 | 6.5%
상호변경 | 2 | 6.5%
961000천원 | 1 | 3.2%
개인사업자에서 | 1 | 3.2%
사망 | 1 | 3.2%
2018.05.04 | 1 | 3.2%
Other values (12) | 12 | 38.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 16 | 12.0%
 | 7 | 5.3%
 | 7 | 5.3%
0 | 6 | 4.5%
, | 6 | 4.5%
 | 5 | 3.8%
 | 5 | 3.8%
 | 5 | 3.8%
 | 5 | 3.8%
 | 5 | 3.8%
Other values (46) | 66 | 49.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 94 | 70.7%
Space Separator | 16 | 12.0%
Decimal Number | 14 | 10.5%
Other Punctuation | 9 | 6.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 7 | 7.4%
 | 7 | 7.4%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.3%
 | 4 | 4.3%
 | 3 | 3.2%
Other values (35) | 44 | 46.8%
Decimal Number
Value | Count | Frequency (%)
0 | 6 | 42.9%
1 | 2 | 14.3%
4 | 1 | 7.1%
8 | 1 | 7.1%
2 | 1 | 7.1%
5 | 1 | 7.1%
9 | 1 | 7.1%
6 | 1 | 7.1%
Other Punctuation
Value | Count | Frequency (%)
, | 6 | 66.7%
. | 3 | 33.3%
Space Separator
Value | Count | Frequency (%)
(space) | 16 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 94 | 70.7%
Common | 39 | 29.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 7 | 7.4%
 | 7 | 7.4%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.3%
 | 4 | 4.3%
 | 3 | 3.2%
Other values (35) | 44 | 46.8%
Common
Value | Count | Frequency (%)
(space) | 16 | 41.0%
0 | 6 | 15.4%
, | 6 | 15.4%
. | 3 | 7.7%
1 | 2 | 5.1%
4 | 1 | 2.6%
8 | 1 | 2.6%
2 | 1 | 2.6%
5 | 1 | 2.6%
9 | 1 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 94 | 70.7%
ASCII | 39 | 29.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 16 | 41.0%
0 | 6 | 15.4%
, | 6 | 15.4%
. | 3 | 7.7%
1 | 2 | 5.1%
4 | 1 | 2.6%
8 | 1 | 2.6%
2 | 1 | 2.6%
5 | 1 | 2.6%
9 | 1 | 2.6%
Hangul
Value | Count | Frequency (%)
 | 7 | 7.4%
 | 7 | 7.4%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.3%
 | 4 | 4.3%
 | 3 | 3.2%
Other values (35) | 44 | 46.8%

변경비고
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 23
Missing (%): 92.0%
Memory size: 332.0 B

Length

Max length: 18
Median length: 16.5
Mean length: 16.5
Min length: 15

Characters and Unicode

Total characters: 33
Distinct characters: 17
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 2005.1.12.(타절정산일자)
2nd row: 변경일 2011.05.27.
Value | Count | Frequency (%)
2005.1.12.(타절정산일자 | 1 | 33.3%
변경일 | 1 | 33.3%
2011.05.27 | 1 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
. | 6 | 18.2%
2 | 4 | 12.1%
1 | 4 | 12.1%
0 | 4 | 12.1%
5 | 2 | 6.1%
일 | 2 | 6.1%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
Other values (7) | 7 | 21.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 15 | 45.5%
Other Letter | 9 | 27.3%
Other Punctuation | 6 | 18.2%
Space Separator | 1 | 3.0%
Close Punctuation | 1 | 3.0%
Open Punctuation | 1 | 3.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
일 | 2 | 22.2%
타 | 1 | 11.1%
절 | 1 | 11.1%
정 | 1 | 11.1%
산 | 1 | 11.1%
자 | 1 | 11.1%
변 | 1 | 11.1%
경 | 1 | 11.1%
Decimal Number
Value | Count | Frequency (%)
2 | 4 | 26.7%
1 | 4 | 26.7%
0 | 4 | 26.7%
5 | 2 | 13.3%
7 | 1 | 6.7%
Other Punctuation
Value | Count | Frequency (%)
. | 6 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24 | 72.7%
Hangul | 9 | 27.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 6 | 25.0%
2 | 4 | 16.7%
1 | 4 | 16.7%
0 | 4 | 16.7%
5 | 2 | 8.3%
(space) | 1 | 4.2%
) | 1 | 4.2%
( | 1 | 4.2%
7 | 1 | 4.2%
Hangul
Value | Count | Frequency (%)
일 | 2 | 22.2%
타 | 1 | 11.1%
절 | 1 | 11.1%
정 | 1 | 11.1%
산 | 1 | 11.1%
자 | 1 | 11.1%
변 | 1 | 11.1%
경 | 1 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24 | 72.7%
Hangul | 9 | 27.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 6 | 25.0%
2 | 4 | 16.7%
1 | 4 | 16.7%
0 | 4 | 16.7%
5 | 2 | 8.3%
(space) | 1 | 4.2%
) | 1 | 4.2%
( | 1 | 4.2%
7 | 1 | 4.2%
Hangul
Value | Count | Frequency (%)
일 | 2 | 22.2%
타 | 1 | 11.1%
절 | 1 | 11.1%
정 | 1 | 11.1%
산 | 1 | 11.1%
자 | 1 | 11.1%
변 | 1 | 11.1%
경 | 1 | 11.1%

Interactions


Correlations

 | 공사년도 | 공사구분 | 도급변경일 | 변경사항 | 변경사유 | 변경비고
공사년도 | 1.000 | 0.653 | 0.981 | 1.000 | 1.000 | NaN
공사구분 | 0.653 | 1.000 | 1.000 | 1.000 | 1.000 | NaN
도급변경일 | 0.981 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
변경사항 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
변경사유 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
변경비고 | NaN | NaN | 0.000 | 0.000 | 0.000 | 1.000

 | 공사년도 | 공사구분
공사년도 | 1.000 | 0.000
공사구분 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
 | 공사년도 | 공사구분 | 도급변경일 | 변경사항 | 변경사유 | 변경비고
0 | 2004 | 공사 | 2005-01-18 | 타절정산액 : 총체3214449천원 | 당해 961000천원 | <NA>
1 | 2004 | 공사 | 2005-01-18 | 업체명 주소 대표자 | 면허말소 타절정산 | 2005.1.12.(타절정산일자)
2 | 2004 | 공사 | 2005-02-22 | 진주시 평거동 7-1번지 2층, 사용인감 변경 | 주소지 이전 | <NA>
3 | 2005 | 용역 | 2006-04-12 | 상호변경(2006.4.20.)신고 | <NA> | <NA>
4 | 2009 | 공사 | 2010-01-14 | 삼영토건(주)->청호건설(주 | 상호, 대표, 주소변경 | <NA>
5 | 2009 | 공사 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | <NA>
6 | 2009 | 공사 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | <NA>
7 | 2010 | 공사 | 2010-12-13 | 상호 대표자 주소변경 | <NA> | <NA>
8 | 2010 | 공사 | 2011-04-27 | 주소변경(창원시 성산구 중앙동 12-11) | <NA> | <NA>
9 | 2010 | 공사 | 2010-09-30 | <NA> | <NA> | <NA>

Last rows
 | 공사년도 | 공사구분 | 도급변경일 | 변경사항 | 변경사유 | 변경비고
15 | 2013 | 공사 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | <NA>
16 | 2013 | 공사 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | <NA>
17 | 2014 | 공사 | 2015-01-21 | 한국전력주식회사 | 개인사업자에서 법인으로 전환 | <NA>
18 | 2015 | 공사 | 2016-03-16 | 대표이사 | <NA> | <NA>
19 | 2015 | 공사 | 2016-03-16 | 대표이사 | <NA> | <NA>
20 | 2015 | 공사 | 2016-06-15 | 계약상대자 변경 | 정보통신공사업 양도양수 | <NA>
21 | 2017 | 공사 | 2018-05-04 | <NA> | <NA> | <NA>
22 | 2017 | 공사 | 2018-05-04 | 업체명(주 거평건설) | 2018.05.04. | <NA>
23 | 2020 | 공사 | 2020-12-18 | 대표자변경 | 대표자 사망 | <NA>
24 | 2020 | 공사 | 2021-06-17 | 사업자등록번호 법인등록번호 | 합병 | <NA>

Duplicate rows

Most frequently occurring

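 | 공사년도 | 공사구분 | 도급변경일 | 변경사항 | 변경사유 | 변경비고 | # duplicates
0 | 2009 | 공사 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | <NA> | 2
1 | 2010 | 공사 | 2010-09-30 | <NA> | <NA> | <NA> | 2
2 | 2013 | 공사 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | <NA> | 2
3 | 2015 | 공사 | 2016-03-16 | 대표이사 | <NA> | <NA> | 2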
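
Duplicate detection of this kind can be sketched with a counter over whole rows. The five rows below are copied from the Sample section (indices 4, 5, 6, 18 and 19), with `None` standing in for `<NA>`. This illustrates the idea on a subset and is not the report's own implementation.

```python
from collections import Counter

# Each row is a tuple of all six columns, so two rows match only if every field matches.
rows = [
    (2009, "공사", "2010-01-14", "삼영토건(주)->청호건설(주", "상호, 대표, 주소변경", None),
    (2009, "공사", "2010-01-14", "삼영토건(주)->청호건설㈜", "상호, 대표, 주소변경", None),
    (2009, "공사", "2010-01-14", "삼영토건(주)->청호건설㈜", "상호, 대표, 주소변경", None),
    (2015, "공사", "2016-03-16", "대표이사", None, None),
    (2015, "공사", "2016-03-16", "대표이사", None, None),
]

row_counts = Counter(rows)

# Keep only rows that occur more than once, with their number of occurrences.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(n, row[:4])
```

The first row is not flagged even though it looks almost identical to the next two: its 변경사항 value ends in "(주" rather than the single character "㈜", so the comparison treats it as a different row.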