Overview

Dataset statistics

Number of variables: 8
Number of observations: 67
Missing cells: 30
Missing cells (%): 5.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 69.0 B
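
The missing-cell percentage follows from the other figures in this block: 30 missing cells out of 8 × 67 = 536 cells. A minimal Python check, using only the reported numbers (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 67
missing_cells = 30

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

This prints 536 cells and a missing share that rounds to the reported 5.6%.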

Variable types

Numeric: 3
Text: 2
Categorical: 2
DateTime: 1

Dataset

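Description: 1) Content: discloses, for each project site where a housing sale guarantee (a policy guarantee) accident occurred, the defaulting company name, the location, the fulfillment method (refund fulfillment or sale fulfillment), and the subrogated payment amount. 2) Effect: makes it possible to track housing sale guarantee accidents and subrogated payment amounts by site. 3) Other: defect repair guarantees and housing sale guarantees differ in nature, so it is more efficient to publish the data separately for each guarantee type.
URL: https://www.data.go.kr/data/15011474/fileData.do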

Alerts

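순번 (serial number) is highly overall correlated with 연도 (year) [High correlation]
연도 (year) is highly overall correlated with 순번 (serial number) [High correlation]
지역 (region) is highly overall correlated with 이행방안 (fulfillment method) [High correlation]
이행방안 (fulfillment method) is highly overall correlated with 지역 (region) [High correlation]
대위변제액(백만원) (subrogated payment amount, million KRW) has 30 (44.8%) missing values [Missing]
순번 (serial number) has unique values [Unique]
사업장명 (site name) has unique values [Unique]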

Reproduction

Analysis started: 2023-12-12 05:08:53.512447
Analysis finished: 2023-12-12 05:08:55.456382
Duration: 1.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
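
The report records only the tool and version, not the command that produced it. A minimal sketch of how a comparable report could be generated with ydata-profiling follows; the function name `build_report`, the file paths, and the `cp949` encoding are assumptions for illustration, and default settings may differ from the `config.json` used for this run.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so that defining the function needs no third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the download is a CSV in a legacy Korean encoding;
    # pass encoding="utf-8" if the file differs.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(df, title="Profiling report")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.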

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 67
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34
Minimum: 1
Maximum: 67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.3
Q1: 17.5
Median: 34
Q3: 50.5
95-th percentile: 63.7
Maximum: 67
Range: 66
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 19.485037
Coefficient of variation (CV): 0.57308932
Kurtosis: -1.2
Mean: 34
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2278
Variance: 379.66667
Monotonicity: Strictly increasing
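
These figures are what one would expect if 순번 is simply the sequence 1 to 67. The sketch below recomputes them with Python's standard `statistics` module under that assumption; the report does not list all 67 values, so the sequence is inferred from the distinct count, minimum, maximum, and sum.

```python
import statistics

# Assumption: 순번 is the sequence 1, 2, ..., 67 (67 distinct values,
# minimum 1, maximum 67, strictly increasing).
values = list(range(1, 68))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

# method="inclusive" interpolates linearly between data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, round(variance, 5), round(std_dev, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```

Under this assumption the sum, mean, variance, quartiles, IQR, and MAD agree with the values reported for this variable.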
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 1.5%
44 | 1 | 1.5%
50 | 1 | 1.5%
49 | 1 | 1.5%
48 | 1 | 1.5%
47 | 1 | 1.5%
46 | 1 | 1.5%
45 | 1 | 1.5%
43 | 1 | 1.5%
2 | 1 | 1.5%
Other values (57) | 57 | 85.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.5%
2 | 1 | 1.5%
3 | 1 | 1.5%
4 | 1 | 1.5%
5 | 1 | 1.5%
6 | 1 | 1.5%
7 | 1 | 1.5%
8 | 1 | 1.5%
9 | 1 | 1.5%
10 | 1 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
67 | 1 | 1.5%
66 | 1 | 1.5%
65 | 1 | 1.5%
64 | 1 | 1.5%
63 | 1 | 1.5%
62 | 1 | 1.5%
61 | 1 | 1.5%
60 | 1 | 1.5%
59 | 1 | 1.5%
58 | 1 | 1.5%

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.9104
Minimum: 2010
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2010
Median: 2012
Q3: 2014
95-th percentile: 2020
Maximum: 2020
Range: 10
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.278633
Coefficient of variation (CV): 0.0016288022
Kurtosis: 0.31473115
Mean: 2012.9104
Median Absolute Deviation (MAD): 2
Skewness: 1.1905569
Sum: 134865
Variance: 10.749435
Monotonicity: Increasing
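
Because 연도 has only ten distinct values, its summary statistics can be rebuilt from the value counts reported for this variable (22 observations for 2010, 5 for 2011, and so on). The sketch below does this with the standard library; the dictionary name `year_counts` is illustrative.

```python
import statistics

# Counts per year, copied from the value counts reported for 연도.
year_counts = {2010: 22, 2011: 5, 2012: 12, 2013: 8, 2014: 6,
               2015: 3, 2016: 1, 2018: 1, 2019: 1, 2020: 8}

# Expand the counts back into one value per observation.
years = [year for year, count in year_counts.items() for _ in range(count)]

mean = statistics.mean(years)
variance = statistics.variance(years)    # sample variance (n - 1 denominator)
cv = statistics.stdev(years) / mean      # coefficient of variation

print(len(years), sum(years), round(mean, 4), round(variance, 6), f"{cv:.10f}")
```

The very small CV reflects the scale of the values (years near 2010) rather than a lack of spread; the standard deviation of about 3.3 years is the more informative figure here.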
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
2010 | 22 | 32.8%
2012 | 12 | 17.9%
2013 | 8 | 11.9%
2020 | 8 | 11.9%
2014 | 6 | 9.0%
2011 | 5 | 7.5%
2015 | 3 | 4.5%
2016 | 1 | 1.5%
2018 | 1 | 1.5%
2019 | 1 | 1.5%

Minimum 10 values

Value | Count | Frequency (%)
2010 | 22 | 32.8%
2011 | 5 | 7.5%
2012 | 12 | 17.9%
2013 | 8 | 11.9%
2014 | 6 | 9.0%
2015 | 3 | 4.5%
2016 | 1 | 1.5%
2018 | 1 | 1.5%
2019 | 1 | 1.5%
2020 | 8 | 11.9%

Maximum 10 values

Value | Count | Frequency (%)
2020 | 8 | 11.9%
2019 | 1 | 1.5%
2018 | 1 | 1.5%
2016 | 1 | 1.5%
2015 | 3 | 4.5%
2014 | 6 | 9.0%
2013 | 8 | 11.9%
2012 | 12 | 17.9%
2011 | 5 | 7.5%
2010 | 22 | 32.8%

사고업체
Text

Distinct: 51
Distinct (%): 76.1%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 17
Median length: 16
Mean length: 6.5970149
Min length: 3

Characters and Unicode

Total characters: 442
Distinct characters: 99
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 64.2%

Sample

1st row: ㈜유영
2nd row: ㈜솔연씨엔디
3rd row: ㈜미래피엔씨
4th row: 성원산업개발㈜
5th row: ㈜엘비씨앤씨

Value | Count | Frequency (%)
극동건설(주 | 4 | 5.6%
지안스건설㈜ | 4 | 5.6%
벽산건설㈜ | 4 | 5.6%
남양건설㈜ | 3 | 4.2%
㈜아택씨앤디 | 3 | 4.2%
㈜아택시오(합병 | 3 | 4.2%
대경종합건설㈜ | 2 | 2.8%
㈜대림산업 | 2 | 2.8%
엘아이지건설㈜ | 2 | 2.8%
유)청원토건 | 1 | 1.4%
Other values (43) | 43 | 60.6%

Most occurring characters

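Value | Count | Frequency (%)
(not preserved) | 61 | 13.8%
(not preserved) | 38 | 8.6%
(not preserved) | 36 | 8.1%
(not preserved) | 13 | 2.9%
( | 12 | 2.7%
) | 12 | 2.7%
(not preserved) | 10 | 2.3%
(not preserved) | 10 | 2.3%
(not preserved) | 9 | 2.0%
(not preserved) | 9 | 2.0%
Other values (89) | 232 | 52.5%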

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 349 | 79.0%
Other Symbol | 61 | 13.8%
Open Punctuation | 12 | 2.7%
Close Punctuation | 12 | 2.7%
Space Separator | 4 | 0.9%
Other Punctuation | 4 | 0.9%

Most frequent character per category

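Other Letter

Value | Count | Frequency (%)
(not preserved) | 38 | 10.9%
(not preserved) | 36 | 10.3%
(not preserved) | 13 | 3.7%
(not preserved) | 10 | 2.9%
(not preserved) | 10 | 2.9%
(not preserved) | 9 | 2.6%
(not preserved) | 9 | 2.6%
(not preserved) | 9 | 2.6%
(not preserved) | 8 | 2.3%
(not preserved) | 8 | 2.3%
Other values (84) | 199 | 57.0%

Other Symbol

Value | Count | Frequency (%)
(not preserved) | 61 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 12 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 12 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 4 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 4 | 100.0%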

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 410 | 92.8%
Common | 32 | 7.2%

Most frequent character per script

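Hangul

Value | Count | Frequency (%)
(not preserved) | 61 | 14.9%
(not preserved) | 38 | 9.3%
(not preserved) | 36 | 8.8%
(not preserved) | 13 | 3.2%
(not preserved) | 10 | 2.4%
(not preserved) | 10 | 2.4%
(not preserved) | 9 | 2.2%
(not preserved) | 9 | 2.2%
(not preserved) | 9 | 2.2%
(not preserved) | 8 | 2.0%
Other values (85) | 207 | 50.5%

Common

Value | Count | Frequency (%)
( | 12 | 37.5%
) | 12 | 37.5%
(space) | 4 | 12.5%
, | 4 | 12.5%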

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 349 | 79.0%
None | 61 | 13.8%
ASCII | 32 | 7.2%

Most frequent character per block

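None

Value | Count | Frequency (%)
(not preserved) | 61 | 100.0%

Hangul

Value | Count | Frequency (%)
(not preserved) | 38 | 10.9%
(not preserved) | 36 | 10.3%
(not preserved) | 13 | 3.7%
(not preserved) | 10 | 2.9%
(not preserved) | 10 | 2.9%
(not preserved) | 9 | 2.6%
(not preserved) | 9 | 2.6%
(not preserved) | 9 | 2.6%
(not preserved) | 8 | 2.3%
(not preserved) | 8 | 2.3%
Other values (84) | 199 | 57.0%

ASCII

Value | Count | Frequency (%)
( | 12 | 37.5%
) | 12 | 37.5%
(space) | 4 | 12.5%
, | 4 | 12.5%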

지역
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 23.9%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 6
Unique (%): 9.0%

Sample

1st row: 경기
2nd row: 경기
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 24 | 35.8%
서울 | 6 | 9.0%
충남 | 6 | 9.0%
전북 | 6 | 9.0%
경남 | 5 | 7.5%
충북 | 3 | 4.5%
경북 | 3 | 4.5%
부산 | 3 | 4.5%
세종 | 3 | 4.5%
강원 | 2 | 3.0%
Other values (6) | 6 | 9.0%

Length

Histogram of lengths of the category

사업장명
Text

UNIQUE 

Distinct: 67
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 24
Median length: 20
Mean length: 16.492537
Min length: 8

Characters and Unicode

Total characters: 1105
Distinct characters: 199
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): 100.0%

Sample

1st row: 경기 안양시 비산동 안양천 상떼빌
2nd row: 경기 용인시 신갈동 성원상떼빌
3rd row: 경기 고양시 한강능곡 성원 상떼빌 2차
4th row: 경기 용인시 풍덕천 성원 상떼빌
5th row: 경기 광주시 쌍령 상떼빌

Value | Count | Frequency (%)
경기 | 12 | 4.4%
아파트 | 6 | 2.2%
2차 | 6 | 2.2%
성원 | 4 | 1.5%
2단지 | 4 | 1.5%
1단지 | 4 | 1.5%
남양주시 | 4 | 1.5%
충남 | 4 | 1.5%
거제 | 4 | 1.5%
스타클래스 | 4 | 1.5%
Other values (171) | 218 | 80.7%

Most occurring characters

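Value | Count | Frequency (%)
(space) | 203 | 18.4%
(not preserved) | 37 | 3.3%
(not preserved) | 29 | 2.6%
(not preserved) | 29 | 2.6%
(not preserved) | 27 | 2.4%
(not preserved) | 18 | 1.6%
(not preserved) | 16 | 1.4%
(not preserved) | 16 | 1.4%
(not preserved) | 16 | 1.4%
(not preserved) | 16 | 1.4%
Other values (189) | 698 | 63.2%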

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 843 | 76.3%
Space Separator | 203 | 18.4%
Decimal Number | 33 | 3.0%
Uppercase Letter | 13 | 1.2%
Dash Punctuation | 5 | 0.5%
Close Punctuation | 3 | 0.3%
Open Punctuation | 3 | 0.3%
Lowercase Letter | 2 | 0.2%

Most frequent character per category

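Other Letter

Value | Count | Frequency (%)
(not preserved) | 37 | 4.4%
(not preserved) | 29 | 3.4%
(not preserved) | 29 | 3.4%
(not preserved) | 27 | 3.2%
(not preserved) | 18 | 2.1%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
Other values (169) | 623 | 73.9%

Uppercase Letter

Value | Count | Frequency (%)
I | 4 | 30.8%
L | 3 | 23.1%
E | 1 | 7.7%
M | 1 | 7.7%
G | 1 | 7.7%
S | 1 | 7.7%
X | 1 | 7.7%
T | 1 | 7.7%

Decimal Number

Value | Count | Frequency (%)
2 | 15 | 45.5%
1 | 7 | 21.2%
3 | 4 | 12.1%
0 | 2 | 6.1%
4 | 2 | 6.1%
5 | 2 | 6.1%
9 | 1 | 3.0%

Space Separator

Value | Count | Frequency (%)
(space) | 203 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 5 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 3 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 3 | 100.0%

Lowercase Letter

Value | Count | Frequency (%)
e | 2 | 100.0%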

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 843 | 76.3%
Common | 247 | 22.4%
Latin | 15 | 1.4%

Most frequent character per script

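Hangul

Value | Count | Frequency (%)
(not preserved) | 37 | 4.4%
(not preserved) | 29 | 3.4%
(not preserved) | 29 | 3.4%
(not preserved) | 27 | 3.2%
(not preserved) | 18 | 2.1%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
Other values (169) | 623 | 73.9%

Common

Value | Count | Frequency (%)
(space) | 203 | 82.2%
2 | 15 | 6.1%
1 | 7 | 2.8%
- | 5 | 2.0%
3 | 4 | 1.6%
) | 3 | 1.2%
( | 3 | 1.2%
0 | 2 | 0.8%
4 | 2 | 0.8%
5 | 2 | 0.8%

Latin

Value | Count | Frequency (%)
I | 4 | 26.7%
L | 3 | 20.0%
e | 2 | 13.3%
E | 1 | 6.7%
M | 1 | 6.7%
G | 1 | 6.7%
S | 1 | 6.7%
X | 1 | 6.7%
T | 1 | 6.7%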

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 843 | 76.3%
ASCII | 262 | 23.7%

Most frequent character per block

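ASCII

Value | Count | Frequency (%)
(space) | 203 | 77.5%
2 | 15 | 5.7%
1 | 7 | 2.7%
- | 5 | 1.9%
I | 4 | 1.5%
3 | 4 | 1.5%
L | 3 | 1.1%
) | 3 | 1.1%
( | 3 | 1.1%
0 | 2 | 0.8%
Other values (10) | 13 | 5.0%

Hangul

Value | Count | Frequency (%)
(not preserved) | 37 | 4.4%
(not preserved) | 29 | 3.4%
(not preserved) | 29 | 3.4%
(not preserved) | 27 | 3.2%
(not preserved) | 18 | 2.1%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
(not preserved) | 16 | 1.9%
Other values (169) | 623 | 73.9%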

사고일
DateTime

Distinct: 52
Distinct (%): 77.6%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B
Minimum: 2010-01-06 00:00:00
Maximum: 2020-05-25 00:00:00
Histogram with fixed size bins (bins=50)

이행방안
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 환급
2nd row: 환급
3rd row: 공사
4th row: 환급
5th row: 환급

Common Values

Value | Count | Frequency (%)
환급 | 32 | 47.8%
공사 | 16 | 23.9%
제외 | 16 | 23.9%
분양 | 3 | 4.5%

Length

Histogram of lengths of the category


대위변제액(백만원)
Real number (ℝ)

MISSING 

Distinct: 36
Distinct (%): 97.3%
Missing: 30
Missing (%): 44.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 36983.405
Minimum: 10
Maximum: 399678
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 10
5-th percentile: 256.6
Q1: 1114
Median: 11821
Q3: 36565
95-th percentile: 134846.6
Maximum: 399678
Range: 399668
Interquartile range (IQR): 35451

Descriptive statistics

Standard deviation: 72780.144
Coefficient of variation (CV): 1.9679135
Kurtosis: 17.413827
Mean: 36983.405
Median Absolute Deviation (MAD): 11337
Skewness: 3.8194022
Sum: 1368386
Variance: 5.2969493 × 10^9
Monotonicity: Not monotonic
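
Several of these figures are computed over the 37 non-missing values only, which is easy to overlook when 44.8% of the column is missing. The following sketch re-derives the mean, distinct percentage, CV, IQR, and range from the reported numbers; all inputs are copied from this section and the variable names are illustrative.

```python
# Figures copied from the 대위변제액(백만원) statistics.
n_observations = 67
n_missing = 30
n_distinct = 36
total = 1368386          # Sum
std_dev = 72780.144      # Standard deviation
q1, q3 = 1114, 36565
minimum, maximum = 10, 399678

# Statistics are taken over the non-missing observations only.
n_present = n_observations - n_missing
mean = total / n_present
distinct_pct = 100 * n_distinct / n_present
missing_pct = 100 * n_missing / n_observations
cv = std_dev / mean
iqr = q3 - q1
value_range = maximum - minimum

print(n_present, round(mean, 3), f"{distinct_pct:.1f}%", f"{missing_pct:.1f}%")
print(round(cv, 6), iqr, value_range)
```

The mean far above the median (36983.405 versus 11821), together with a skewness of about 3.8, indicates that a few very large payments dominate the total.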
Histogram with fixed size bins (bins=36)

Common values

Value | Count | Frequency (%)
484 | 2 | 3.0%
2889 | 1 | 1.5%
11821 | 1 | 1.5%
837 | 1 | 1.5%
31325 | 1 | 1.5%
15694 | 1 | 1.5%
14363 | 1 | 1.5%
131 | 1 | 1.5%
16080 | 1 | 1.5%
64279 | 1 | 1.5%
Other values (26) | 26 | 38.8%
(Missing) | 30 | 44.8%

Minimum 10 values

Value | Count | Frequency (%)
10 | 1 | 1.5%
131 | 1 | 1.5%
288 | 1 | 1.5%
299 | 1 | 1.5%
449 | 1 | 1.5%
484 | 2 | 3.0%
485 | 1 | 1.5%
837 | 1 | 1.5%
1114 | 1 | 1.5%
1594 | 1 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
399678 | 1 | 1.5%
161709 | 1 | 1.5%
128131 | 1 | 1.5%
109644 | 1 | 1.5%
85584 | 1 | 1.5%
71721 | 1 | 1.5%
64279 | 1 | 1.5%
60128 | 1 | 1.5%
39280 | 1 | 1.5%
36565 | 1 | 1.5%

Interactions

[Figure: 9 pairwise interaction plots]

Correlations

Variable | 순번 | 연도 | 사고업체 | 지역 | 사업장명 | 사고일 | 이행방안 | 대위변제액(백만원)
순번 | 1.000 | 0.932 | 1.000 | 0.722 | 1.000 | 1.000 | 0.587 | 0.339
연도 | 0.932 | 1.000 | 1.000 | 0.765 | 1.000 | 1.000 | 0.438 | 0.798
사고업체 | 1.000 | 1.000 | 1.000 | 0.959 | 1.000 | 0.999 | 0.915 | 1.000
지역 | 0.722 | 0.765 | 0.959 | 1.000 | 1.000 | 0.990 | 0.859 | 0.699
사업장명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
사고일 | 1.000 | 1.000 | 0.999 | 0.990 | 1.000 | 1.000 | 0.959 | 1.000
이행방안 | 0.587 | 0.438 | 0.915 | 0.859 | 1.000 | 0.959 | 1.000 | 0.438
대위변제액(백만원) | 0.339 | 0.798 | 1.000 | 0.699 | 1.000 | 1.000 | 0.438 | 1.000

Variable | 이행방안 | 지역
이행방안 | 1.000 | 0.515
지역 | 0.515 | 1.000

Variable | 순번 | 연도 | 대위변제액(백만원) | 지역 | 이행방안
순번 | 1.000 | 0.977 | -0.318 | 0.359 | 0.371
연도 | 0.977 | 1.000 | -0.309 | 0.404 | 0.323
대위변제액(백만원) | -0.318 | -0.309 | 1.000 | 0.300 | 0.182
지역 | 0.359 | 0.404 | 0.300 | 1.000 | 0.515
이행방안 | 0.371 | 0.323 | 0.182 | 0.515 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

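First rows

Index | 순번 | 연도 | 사고업체 | 지역 | 사업장명 | 사고일 | 이행방안 | 대위변제액(백만원)
0 | 1 | 2010 | ㈜유영 | 경기 | 경기 안양시 비산동 안양천 상떼빌 | 2010-01-06 | 환급 | 36565
1 | 2 | 2010 | ㈜솔연씨엔디 | 경기 | 경기 용인시 신갈동 성원상떼빌 | 2010-01-11 | 환급 | 85584
2 | 3 | 2010 | ㈜미래피엔씨 | 경기 | 경기 고양시 한강능곡 성원 상떼빌 2차 | 2010-01-12 | 공사 | 4397
3 | 4 | 2010 | 성원산업개발㈜ | 경기 | 경기 용인시 풍덕천 성원 상떼빌 | 2010-01-28 | 환급 | 12283
4 | 5 | 2010 | ㈜엘비씨앤씨 | 경기 | 경기 광주시 쌍령 상떼빌 | 2010-01-28 | 환급 | 39280
5 | 6 | 2010 | ㈜연수개발 | 서울 | 서울 상봉동 성원 상떼르시엘 | 2010-02-22 | 환급 | 161709
6 | 7 | 2010 | 한빌건설㈜ | 경기 | 경기 용인시 공세동 성원 상떼레이크뷰 | 2010-02-26 | 환급 | 109644
7 | 8 | 2010 | 남양건설㈜ | 경기 | 경기 남양주시 도농동 남양I-좋은집 1단지 | 2010-04-02 | 제외 | <NA>
8 | 9 | 2010 | 남양건설㈜ | 경기 | 경기 남양주시 도농동 남양I-좋은집 2단지 | 2010-04-02 | 제외 | <NA>
9 | 10 | 2010 | 남양건설㈜ | 경기 | 경기 남양주시 도농동 남양I-좋은집 3단지 | 2010-04-02 | 제외 | <NA>

Last rows

Index | 순번 | 연도 | 사고업체 | 지역 | 사업장명 | 사고일 | 이행방안 | 대위변제액(백만원)
57 | 58 | 2018 | 흥한산업㈜ | 전남 | 광양 흥한 에르가 1차 | 2018-08-16 | 공사 | <NA>
58 | 59 | 2019 | ㈜세종알엔디 | 경남 | 사천 유천지구 흥한에르가 2차 | 2019-01-28 | 환급 | 71721
59 | 60 | 2020 | ㈜진경건설 | 전북 | 군산시 개정면 수페리체 임대아파트 | 2020-01-03 | 환급 | 22733
60 | 61 | 2020 | 지안스건설㈜ | 전북 | 완주 이서 지역주택조합(이안이서로가 2단지) | 2020-02-11 | 환급 | 10503
61 | 62 | 2020 | ㈜남우아이디 | 전북 | 완주 이서 공동주택(이안이서로가 1단지) | 2020-02-24 | 환급 | 26249
62 | 63 | 2020 | 지안스건설㈜ | 충북 | 충북 진천군 광혜원 지역주택조합 신축공사 | 2020-04-06 | 환급 | 484
63 | 64 | 2020 | 지안스건설㈜ | 충북 | 진천 2차 지역주택조합 | 2020-04-06 | 분양 | 288
64 | 65 | 2020 | (유)모은 | 광주 | 광주 송정 숲안애 2차 | 2020-04-09 | 분양 | 2409
65 | 66 | 2020 | 지안스건설㈜ | 울산 | 울산 이안 지안스 지역주택조합아파트 | 2020-05-11 | 환급 | 1594
66 | 67 | 2020 | 슈에건설㈜ | 제주 | 제주 조천 레이크샤이어 | 2020-05-25 | 분양 | <NA>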