Overview

Dataset statistics

Number of variables: 9
Number of observations: 23
Missing cells: 1
Missing cells (%): 0.5%
Duplicate rows: 4
Duplicate rows (%): 17.4%
Total size in memory: 1.8 KiB
Average record size in memory: 80.7 B
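
The percentages in the dataset statistics follow from the raw counts. The sketch below recomputes them, assuming that missing cells are divided by all cells (variables × observations) and duplicate rows by the number of observations. The variable names are illustrative and do not come from the report.

```python
# Recompute the overview percentages from the raw counts in the report.
n_variables = 9
n_observations = 23
missing_cells = 1
duplicate_rows = 4
avg_record_bytes = 80.7

total_cells = n_variables * n_observations              # 207 cells in total
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicated rows
total_kib = avg_record_bytes * n_observations / 1024    # bytes converted to KiB

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Under these assumptions the three printed values round to the figures shown in the statistics above.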

Variable types

Numeric: 2
Categorical: 4
DateTime: 1
Text: 2

Dataset

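Description: Contractor change history data for Gyeongsangnam-do (경상남도_도급자변경내역). It includes fields such as 공사년도 (construction year), 공사구분 (construction type), 도급변경일 (contract change date), 변경사항 (change details), 변경사유 (reason for change), 변경비고 (change remarks), and 업체명 (company name).
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15049522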

Alerts

부서코드 has constant value "1" (Constant)
Dataset has 4 (17.4%) duplicate rows (Duplicates)
공사년도 is highly overall correlated with 변경사유 (High correlation)
변경사유 is highly overall correlated with 공사년도 and 1 other field (High correlation)
변경비고 is highly overall correlated with 변경사유 (High correlation)
공사구분 is highly imbalanced (57.4%) (Imbalance)
변경비고 is highly imbalanced (67.6%) (Imbalance)
업체명 has 1 (4.3%) missing value (Missing)
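
The two imbalance percentages are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below reproduces both figures under that assumption. `imbalance_score` is a hypothetical helper written for illustration, not a function taken from ydata-profiling.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: the score is undefined for a single class; return 0.0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts taken from the Common Values tables of each variable.
score_type = imbalance_score([21, 2])        # 공사구분: 공사 21, 용역 2
score_remark = imbalance_score([21, 1, 1])   # 변경비고: "-" 21, two singletons

print(f"공사구분: {100 * score_type:.1f}%")
print(f"변경비고: {100 * score_remark:.1f}%")
```

Both printed values match the alert text to one decimal place, which supports the entropy-based reading but does not prove it is the exact formula used.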

Reproduction

Analysis started: 2023-12-11 00:35:43.353914
Analysis finished: 2023-12-11 00:35:44.320203
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공사년도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 39.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.7826
Minimum: 2004
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 2004
5-th percentile: 2004
Q1: 2009
Median: 2010
Q3: 2013.5
95-th percentile: 2016.8
Maximum: 2017
Range: 13
Interquartile range (IQR): 4.5

Descriptive statistics

Standard deviation: 3.9307242
Coefficient of variation (CV): 0.0019548231
Kurtosis: -0.55005342
Mean: 2010.7826
Median Absolute Deviation (MAD): 3
Skewness: -0.30910293
Sum: 46248
Variance: 15.450593
Monotonicity: Increasing
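
The descriptive statistics for 공사년도 can be recomputed from the value counts listed for this variable. The sketch below uses only the Python standard library and assumes the sample (n − 1) variance and linearly interpolated quartiles, which is what the reported figures are consistent with. The names `year_counts` and `years` are illustrative.

```python
import statistics

# Value -> count pairs copied from the 공사년도 frequency table.
year_counts = {2004: 3, 2005: 1, 2009: 3, 2010: 5, 2011: 2,
               2013: 3, 2014: 1, 2015: 3, 2017: 2}
# Expand the counts back into the 23 individual observations.
years = sorted(y for y, c in year_counts.items() for _ in range(c))

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)           # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(years)
# "inclusive" interpolates linearly between the closest ranks.
q1, _, q3 = statistics.quantiles(years, n=4, method="inclusive")
mad = statistics.median(abs(y - median) for y in years)

print(total, round(mean, 4), round(variance, 6), round(std, 7))
print(median, q1, q3, q3 - q1, mad)
```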
Histogram with fixed size bins (bins=9)
Common values

Value | Count | Frequency (%)
2010 | 5 | 21.7%
2004 | 3 | 13.0%
2009 | 3 | 13.0%
2013 | 3 | 13.0%
2015 | 3 | 13.0%
2011 | 2 | 8.7%
2017 | 2 | 8.7%
2005 | 1 | 4.3%
2014 | 1 | 4.3%

Extreme values (smallest first)

Value | Count | Frequency (%)
2004 | 3 | 13.0%
2005 | 1 | 4.3%
2009 | 3 | 13.0%
2010 | 5 | 21.7%
2011 | 2 | 8.7%
2013 | 3 | 13.0%
2014 | 1 | 4.3%
2015 | 3 | 13.0%
2017 | 2 | 8.7%

Extreme values (largest first)

Value | Count | Frequency (%)
2017 | 2 | 8.7%
2015 | 3 | 13.0%
2014 | 1 | 4.3%
2013 | 3 | 13.0%
2011 | 2 | 8.7%
2010 | 5 | 21.7%
2009 | 3 | 13.0%
2005 | 1 | 4.3%
2004 | 3 | 13.0%

공사구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: 공사 (21); 용역 (2)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공사
2nd row: 공사
3rd row: 공사
4th row: 용역
5th row: 공사

Common Values

Value | Count | Frequency (%)
공사 | 21 | 91.3%
용역 | 2 | 8.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공사 | 21 | 91.3%
용역 | 2 | 8.7%

공사번호
Real number (ℝ)

Distinct: 15
Distinct (%): 65.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.78261
Minimum: 8
Maximum: 620
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 339.0 B

Quantile statistics

Minimum: 8
5-th percentile: 8
Q1: 38.5
Median: 84
Q3: 137.5
95-th percentile: 496
Maximum: 620
Range: 612
Interquartile range (IQR): 99

Descriptive statistics

Standard deviation: 190.69322
Coefficient of variation (CV): 1.1860314
Kurtosis: 0.52326366
Mean: 160.78261
Median Absolute Deviation (MAD): 55
Skewness: 1.3976042
Sum: 3698
Variance: 36363.905
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
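Common values

Value | Count | Frequency (%)
8 | 3 | 13.0%
120 | 3 | 13.0%
84 | 2 | 8.7%
496 | 2 | 8.7%
56 | 2 | 8.7%
10 | 2 | 8.7%
123 | 1 | 4.3%
48 | 1 | 4.3%
398 | 1 | 4.3%
489 | 1 | 4.3%
Other values (5) | 5 | 21.7%

Extreme values (smallest first)

Value | Count | Frequency (%)
8 | 3 | 13.0%
10 | 2 | 8.7%
29 | 1 | 4.3%
48 | 1 | 4.3%
50 | 1 | 4.3%
56 | 2 | 8.7%
84 | 2 | 8.7%
113 | 1 | 4.3%
120 | 3 | 13.0%
123 | 1 | 4.3%

Extreme values (largest first)

Value | Count | Frequency (%)
620 | 1 | 4.3%
496 | 2 | 8.7%
489 | 1 | 4.3%
398 | 1 | 4.3%
152 | 1 | 4.3%
123 | 1 | 4.3%
120 | 3 | 13.0%
113 | 1 | 4.3%
84 | 2 | 8.7%
56 | 2 | 8.7%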

부서코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: 1 (23)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 23 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 23 | 100.0%

도급변경일
DateTime

Distinct: 15
Distinct (%): 65.2%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Minimum: 2005-01-18 00:00:00
Maximum: 2018-05-04 00:00:00
Histogram with fixed size bins (bins=15)

변경사항
Text

Distinct: 18
Distinct (%): 78.3%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 26
Median length: 15
Mean length: 12.304348
Min length: 1

Characters and Unicode

Total characters: 283
Distinct characters: 85
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 60.9%

Sample

1st row: 타절정산액 ; 총체3214449천원
2nd row: 업체명 주소 대표자
3rd row: 진주시 평거동 7-1번지 2층, 사용인감 변경
4th row: 상호변경(2006.4.20)신고
5th row: 삼영토건(주)->청호건설(주
ValueCountFrequency (%)
대표자 5
 
10.2%
4
 
8.2%
변경 3
 
6.1%
상호 3
 
6.1%
대표이사 2
 
4.1%
→(변경)미조건설(주 2
 
4.1%
당초)미조종합건설(주 2
 
4.1%
삼영토건(주)->청호건설㈜ 2
 
4.1%
소재지 2
 
4.1%
중앙동 1
 
2.0%
Other values (23) 23
46.9%

Most occurring characters

ValueCountFrequency (%)
26
 
9.2%
( 16
 
5.7%
) 15
 
5.3%
14
 
4.9%
11
 
3.9%
10
 
3.5%
10
 
3.5%
8
 
2.8%
8
 
2.8%
- 8
 
2.8%
Other values (75) 157
55.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 185 | 65.4%
Space Separator | 26 | 9.2%
Decimal Number | 21 | 7.4%
Open Punctuation | 16 | 5.7%
Close Punctuation | 15 | 5.3%
Dash Punctuation | 8 | 2.8%
Math Symbol | 6 | 2.1%
Other Punctuation | 4 | 1.4%
Other Symbol | 2 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14
 
7.6%
11
 
5.9%
10
 
5.4%
10
 
5.4%
8
 
4.3%
8
 
4.3%
7
 
3.8%
7
 
3.8%
7
 
3.8%
5
 
2.7%
Other values (57) 98
53.0%
Decimal Number
ValueCountFrequency (%)
2 5
23.8%
1 5
23.8%
4 4
19.0%
0 3
14.3%
9 1
 
4.8%
3 1
 
4.8%
6 1
 
4.8%
7 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 2
50.0%
; 1
25.0%
, 1
25.0%
Math Symbol
ValueCountFrequency (%)
> 3
50.0%
3
50.0%
Space Separator
ValueCountFrequency (%)
26
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 187 | 66.1%
Common | 96 | 33.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14
 
7.5%
11
 
5.9%
10
 
5.3%
10
 
5.3%
8
 
4.3%
8
 
4.3%
7
 
3.7%
7
 
3.7%
7
 
3.7%
5
 
2.7%
Other values (58) 100
53.5%
Common
ValueCountFrequency (%)
26
27.1%
( 16
16.7%
) 15
15.6%
- 8
 
8.3%
2 5
 
5.2%
1 5
 
5.2%
4 4
 
4.2%
0 3
 
3.1%
> 3
 
3.1%
3
 
3.1%
Other values (7) 8
 
8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 185 | 65.4%
ASCII | 93 | 32.9%
Arrows | 3 | 1.1%
None | 2 | 0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26
28.0%
( 16
17.2%
) 15
16.1%
- 8
 
8.6%
2 5
 
5.4%
1 5
 
5.4%
4 4
 
4.3%
0 3
 
3.2%
> 3
 
3.2%
. 2
 
2.2%
Other values (6) 6
 
6.5%
Hangul
ValueCountFrequency (%)
14
 
7.6%
11
 
5.9%
10
 
5.4%
10
 
5.4%
8
 
4.3%
8
 
4.3%
7
 
3.8%
7
 
3.8%
7
 
3.8%
5
 
2.7%
Other values (57) 98
53.0%
Arrows
ValueCountFrequency (%)
3
100.0%
None
ValueCountFrequency (%)
2
100.0%

변경사유
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 47.8%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: - (10); 상호, 대표, 주소변경 (3); 상호변경 (2); 당해 961000천원 (1); 면허말소 타절정산 (1); Other values (6)

Length

Max length: 15
Median length: 12
Mean length: 5.7391304
Min length: 1

Unique

Unique: 8
Unique (%): 34.8%

Sample

1st row: 당해 961000천원
2nd row: 면허말소 타절정산
3rd row: 주소지 이전
4th row: -
5th row: 상호, 대표, 주소변경

Common Values

Value | Count | Frequency (%)
- | 10 | 43.5%
상호, 대표, 주소변경 | 3 | 13.0%
상호변경 | 2 | 8.7%
당해 961000천원 | 1 | 4.3%
면허말소 타절정산 | 1 | 4.3%
주소지 이전 | 1 | 4.3%
대표자 변경 | 1 | 4.3%
양도양수에 따른 변경 | 1 | 4.3%
개인사업자에서 법인으로 전환 | 1 | 4.3%
정보통신공사업 양도양수 | 1 | 4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10
26.3%
대표 3
 
7.9%
주소변경 3
 
7.9%
상호 3
 
7.9%
상호변경 2
 
5.3%
변경 2
 
5.3%
양도양수에 1
 
2.6%
양도양수 1
 
2.6%
정보통신공사업 1
 
2.6%
전환 1
 
2.6%
Other values (11) 11
28.9%

변경비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B
Categories: - (21); 2005.1.12(타절정산일자) (1); 변경일 2011-05,27 (1)

Length

Max length: 17
Median length: 1
Mean length: 2.2608696
Min length: 1

Unique

Unique: 2
Unique (%): 8.7%

Sample

1st row: -
2nd row: 2005.1.12(타절정산일자)
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 21 | 91.3%
2005.1.12(타절정산일자) | 1 | 4.3%
변경일 2011-05,27 | 1 | 4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
21
87.5%
2005.1.12(타절정산일자 1
 
4.2%
변경일 1
 
4.2%
2011-05,27 1
 
4.2%

업체명
Text

MISSING 

Distinct: 15
Distinct (%): 68.2%
Missing: 1
Missing (%): 4.3%
Memory size: 316.0 B
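
The Distinct (%) figure for 업체명 appears to use the non-missing count as its denominator rather than all 23 rows: 15 / 22 ≈ 68.2%, whereas 15 / 23 would give 65.2%. The short check below follows that reading, which is an inference from the reported numbers and not a statement about the profiler's internals. The variable names are illustrative.

```python
# Counts reported for 업체명 in this section.
n_rows = 23
n_missing = 1
n_distinct = 15
n_unique = 9   # values that occur exactly once (from the Unique figures)

n_present = n_rows - n_missing                 # 22 non-missing values
missing_pct = 100 * n_missing / n_rows         # missing share of all rows
distinct_pct = 100 * n_distinct / n_present    # distinct share of present values
unique_pct = 100 * n_unique / n_present        # unique share of present values

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
```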

Length

Max length: 9
Median length: 8
Mean length: 7.2272727
Min length: 4

Characters and Unicode

Total characters: 159
Distinct characters: 41
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 40.9%

Sample

1st row: (주)케이디
2nd row: 이호건설(주)
3rd row: (주)한국종합기술
4th row: 삼영토건(주)
5th row: 삼영토건(주)
Value | Count | Frequency (%)
삼영토건(주 | 3 | 13.6%
광득종합건설(주 | 2 | 9.1%
주)디엠 | 2 | 9.1%
미조종합건설(주 | 2 | 9.1%
남광토건 | 2 | 9.1%
진우종합건설(주 | 2 | 9.1%
주)케이디 | 1 | 4.5%
이호건설(주 | 1 | 4.5%
주)한국종합기술 | 1 | 4.5%
주)유진건설 | 1 | 4.5%
Other values (5) | 5 | 22.7%

Most occurring characters

ValueCountFrequency (%)
( 19
 
11.9%
19
 
11.9%
) 19
 
11.9%
14
 
8.8%
9
 
5.7%
8
 
5.0%
8
 
5.0%
5
 
3.1%
5
 
3.1%
4
 
2.5%
Other values (31) 49
30.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 121 | 76.1%
Open Punctuation | 19 | 11.9%
Close Punctuation | 19 | 11.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19
15.7%
14
 
11.6%
9
 
7.4%
8
 
6.6%
8
 
6.6%
5
 
4.1%
5
 
4.1%
4
 
3.3%
4
 
3.3%
3
 
2.5%
Other values (29) 42
34.7%
Open Punctuation
ValueCountFrequency (%)
( 19
100.0%
Close Punctuation
ValueCountFrequency (%)
) 19
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 121 | 76.1%
Common | 38 | 23.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
15.7%
14
 
11.6%
9
 
7.4%
8
 
6.6%
8
 
6.6%
5
 
4.1%
5
 
4.1%
4
 
3.3%
4
 
3.3%
3
 
2.5%
Other values (29) 42
34.7%
Common
ValueCountFrequency (%)
( 19
50.0%
) 19
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 121 | 76.1%
ASCII | 38 | 23.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 19
50.0%
) 19
50.0%
Hangul
ValueCountFrequency (%)
19
15.7%
14
 
11.6%
9
 
7.4%
8
 
6.6%
8
 
6.6%
5
 
4.1%
5
 
4.1%
4
 
3.3%
4
 
3.3%
3
 
2.5%
Other values (29) 42
34.7%

Interactions


Correlations

 | 공사년도 | 공사구분 | 공사번호 | 도급변경일 | 변경사항 | 변경사유 | 변경비고 | 업체명
공사년도 | 1.000 | 0.836 | 0.649 | 0.979 | 0.972 | 0.939 | 0.577 | 0.979
공사구분 | 0.836 | 1.000 | 0.000 | 1.000 | 1.000 | 0.373 | 0.000 | 1.000
공사번호 | 0.649 | 0.000 | 1.000 | 0.961 | 0.994 | 0.000 | 0.000 | 0.955
도급변경일 | 0.979 | 1.000 | 0.961 | 1.000 | 0.988 | 0.948 | 0.836 | 1.000
변경사항 | 0.972 | 1.000 | 0.994 | 0.988 | 1.000 | 1.000 | 1.000 | 0.987
변경사유 | 0.939 | 0.373 | 0.000 | 0.948 | 1.000 | 1.000 | 1.000 | 0.993
변경비고 | 0.577 | 0.000 | 0.000 | 0.836 | 1.000 | 1.000 | 1.000 | 1.000
업체명 | 0.979 | 1.000 | 0.955 | 1.000 | 0.987 | 0.993 | 1.000 | 1.000

 | 공사구분 | 변경비고 | 변경사유
공사구분 | 1.000 | 0.000 | 0.235
변경비고 | 0.000 | 1.000 | 0.775
변경사유 | 0.235 | 0.775 | 1.000

 | 공사년도 | 공사번호 | 공사구분 | 변경사유 | 변경비고
공사년도 | 1.000 | -0.060 | 0.000 | 0.590 | 0.173
공사번호 | -0.060 | 1.000 | 0.000 | 0.000 | 0.000
공사구분 | 0.000 | 0.000 | 1.000 | 0.235 | 0.000
변경사유 | 0.590 | 0.000 | 0.235 | 1.000 | 0.775
변경비고 | 0.173 | 0.000 | 0.000 | 0.775 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

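First rows

Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 도급변경일 | 변경사항 | 변경사유 | 변경비고 | 업체명
0 | 2004 | 공사 | 84 | 1 | 2005-01-18 | 타절정산액 ; 총체3214449천원 | 당해 961000천원 | - | <NA>
1 | 2004 | 공사 | 84 | 1 | 2005-01-18 | 업체명 주소 대표자 | 면허말소 타절정산 | 2005.1.12(타절정산일자) | (주)케이디
2 | 2004 | 공사 | 123 | 1 | 2005-02-22 | 진주시 평거동 7-1번지 2층, 사용인감 변경 | 주소지 이전 | - | 이호건설(주)
3 | 2005 | 용역 | 48 | 1 | 2006-04-12 | 상호변경(2006.4.20)신고 | - | - | (주)한국종합기술
4 | 2009 | 공사 | 8 | 1 | 2010-01-14 | 삼영토건(주)->청호건설(주 | 상호, 대표, 주소변경 | - | 삼영토건(주)
5 | 2009 | 공사 | 8 | 1 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | - | 삼영토건(주)
6 | 2009 | 공사 | 8 | 1 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | - | 삼영토건(주)
7 | 2010 | 공사 | 398 | 1 | 2010-12-13 | 상호 대표자 주소변경 | - | - | (주)유진건설
8 | 2010 | 공사 | 489 | 1 | 2011-04-27 | 주소변경(창원시 성산구 중앙동 12-11) | - | - | 영림종합건설(주)
9 | 2010 | 공사 | 496 | 1 | 2010-09-30 | - | - | - | 광득종합건설(주)

Last rows

Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 도급변경일 | 변경사항 | 변경사유 | 변경비고 | 업체명
13 | 2011 | 공사 | 120 | 1 | 2011-12-01 | 상호 대표자 소재지 변경 | - | - | (주)디엠
14 | 2013 | 용역 | 50 | 1 | 2013-04-26 | 도급자변경 | 양도양수에 따른 변경 | - | (주)태영기술단
15 | 2013 | 공사 | 120 | 1 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | - | 미조종합건설(주)
16 | 2013 | 공사 | 120 | 1 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | - | 미조종합건설(주)
17 | 2014 | 공사 | 152 | 1 | 2015-01-21 | 한국전력주식회사 | 개인사업자에서 법인으로 전환 | - | 부강전기
18 | 2015 | 공사 | 56 | 1 | 2016-03-16 | 대표이사 | - | - | 남광토건
19 | 2015 | 공사 | 56 | 1 | 2016-03-16 | 대표이사 | - | - | 남광토건
20 | 2015 | 공사 | 113 | 1 | 2016-06-15 | 계약상대자 변경 | 정보통신공사업 양도양수 | - | (주)선경전기
21 | 2017 | 공사 | 10 | 1 | 2018-05-04 | - | - | - | 진우종합건설(주)
22 | 2017 | 공사 | 10 | 1 | 2018-05-04 | 업체명(주 거평건설) | 20180504 | - | 진우종합건설(주)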

Duplicate rows

Most frequently occurring

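Index | 공사년도 | 공사구분 | 공사번호 | 부서코드 | 도급변경일 | 변경사항 | 변경사유 | 변경비고 | 업체명 | # duplicates
0 | 2009 | 공사 | 8 | 1 | 2010-01-14 | 삼영토건(주)->청호건설㈜ | 상호, 대표, 주소변경 | - | 삼영토건(주) | 2
1 | 2010 | 공사 | 496 | 1 | 2010-09-30 | - | - | - | 광득종합건설(주) | 2
2 | 2013 | 공사 | 120 | 1 | 2013-08-09 | (당초)미조종합건설(주) →(변경)미조건설(주) | 상호변경 | - | 미조종합건설(주) | 2
3 | 2015 | 공사 | 56 | 1 | 2016-03-16 | 대표이사 | - | - | 남광토건 | 2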
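
A minimal sketch of exact-match duplicate detection, using five rows copied from the sample. The report does not state how duplicates are detected, so comparing whole rows for equality is an assumption here, and the names `rows` and `duplicates` are illustrative. Under that assumption, the 2009 row whose 변경사항 ends in "(주" is not a duplicate of the two rows that end in "㈜", which agrees with the count of 2 for that group.

```python
from collections import Counter

# Shared fields of the three 2009 rows: everything except 변경사항.
head_2009 = (2009, "공사", 8, "1", "2010-01-14")
tail_2009 = ("상호, 대표, 주소변경", "-", "삼영토건(주)")

# Each row is a tuple in column order: 공사년도, 공사구분, 공사번호, 부서코드,
# 도급변경일, 변경사항, 변경사유, 변경비고, 업체명.
rows = [
    head_2009 + ("삼영토건(주)->청호건설(주",) + tail_2009,   # sample row 4
    head_2009 + ("삼영토건(주)->청호건설㈜",) + tail_2009,    # sample row 5
    head_2009 + ("삼영토건(주)->청호건설㈜",) + tail_2009,    # sample row 6
    (2015, "공사", 56, "1", "2016-03-16", "대표이사", "-", "-", "남광토건"),
    (2015, "공사", 56, "1", "2016-03-16", "대표이사", "-", "-", "남광토건"),
]

# Count identical tuples and keep only those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row[0], row[5], row[8], "occurrences:", n)
```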