Overview

Dataset statistics

Number of variables 15
Number of observations 171
Missing cells 730
Missing cells (%) 28.5%
Duplicate rows 9
Duplicate rows (%) 5.3%
Total size in memory 20.8 KiB
Average record size in memory 124.8 B
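
The two percentages above follow directly from the counts. A minimal sketch of the arithmetic, using only the figures reported in this overview (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 15
n_observations = 171
missing_cells = 730
duplicate_rows = 9

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Note that the missing-cell share is taken over all 2565 cells, so a few very sparse columns can dominate it.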

Variable types

Categorical 5
Text 3
DateTime 3
Numeric 4

Dataset

Description Status of regional housing association projects in Gyeonggi Province (경기도 내 지역주택조합 사업 현황)
Author Gapyeong-gun (가평군)
URL https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=GW39KASIE2OW6TTSYND432117396&infSeq=1

Alerts

Dataset has 9 (5.3%) duplicate rows [Duplicates]
시군명 is highly overall correlated with 모집신고일자 and 3 other fields [High correlation]
관리기관전화번호 is highly overall correlated with 시군명 and 1 other fields [High correlation]
관리기관명 is highly overall correlated with 시군명 and 2 other fields [High correlation]
세대수(조합원) is highly overall correlated with 세대수(총분양) and 2 other fields [High correlation]
세대수(총분양) is highly overall correlated with 세대수(조합원) and 3 other fields [High correlation]
대지면적 is highly overall correlated with 세대수(조합원) and 2 other fields [High correlation]
연면적 is highly overall correlated with 세대수(조합원) and 2 other fields [High correlation]
모집신고일자 is highly overall correlated with 세대수(총분양) and 1 other fields [High correlation]
시공사명 is highly overall correlated with 시군명 and 1 other fields [High correlation]
시공사명 is highly imbalanced (51.1%) [Imbalance]
조합장명 has 103 (60.2%) missing values [Missing]
설립인가일자 has 40 (23.4%) missing values [Missing]
사업계획승인일 has 121 (70.8%) missing values [Missing]
착공신고일 has 125 (73.1%) missing values [Missing]
세대수(조합원) has 111 (64.9%) missing values [Missing]
대지면적 has 110 (64.3%) missing values [Missing]
연면적 has 120 (70.2%) missing values [Missing]
대지면적 has 2 (1.2%) zeros [Zeros]
연면적 has 4 (2.3%) zeros [Zeros]
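
The 51.1% imbalance figure for 시공사명 can be reproduced with an entropy-based score, 1 - H / log(k), where H is the Shannon entropy of the value counts and k the number of distinct values. This is a sketch under assumptions: the formula is one common definition and is assumed (not confirmed by the report) to be the one used here, and the tail of the count list is inferred from the "Other values" and "Unique" figures reported for 시공사명 further down.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Ten most common 시공사명 values as listed in the report ("<NA>" is counted
# as a category there), followed by the inferred tail: two more values that
# occur twice and seventeen values that occur once.
counts = [113, 11, 8, 4, 3, 3, 2, 2, 2, 2] + [2, 2] + [1] * 17

score = imbalance_score(counts)
print(f"{score:.1%}")
```

Because the placeholder "<NA>" accounts for 113 of the 171 rows, most of the imbalance comes from missing builder names rather than from one dominant builder.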

Reproduction

Analysis started 2023-12-10 22:17:05.423569
Analysis finished 2023-12-10 22:17:07.926646
Duration 2.5 seconds
Software version ydata-profiling v4.5.1
Download configuration config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct 24
Distinct (%) 14.0%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB
남양주시 25
평택시 19
하남시 14
화성시 13
용인시 12
Other values (19) 88

Length

Max length 4
Median length 3
Mean length 3.1637427
Min length 3

Unique

Unique 2
Unique (%) 1.2%

Sample

1st row 이천시
2nd row 이천시
3rd row 이천시
4th row 구리시
5th row 구리시

Common Values

Value Count Frequency (%)
남양주시 25 14.6%
평택시 19 11.1%
하남시 14 8.2%
화성시 13 7.6%
용인시 12 7.0%
구리시 12 7.0%
광주시 9 5.3%
파주시 9 5.3%
성남시 7 4.1%
이천시 6 3.5%
Other values (14) 45 26.3%
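
A table of this shape (top values, then one "Other values" row) can be built from a raw column with a counter. The sketch below uses a toy list rather than the real column, and `frequency_table` is a hypothetical helper name:

```python
from collections import Counter

def frequency_table(values, top_n=10):
    """Return (value, count, percent) rows for the top_n values plus an 'Other values' row."""
    counts = Counter(values)
    total = len(values)
    ranked = counts.most_common()
    rows = [(v, c, 100 * c / total) for v, c in ranked[:top_n]]
    rest = ranked[top_n:]
    if rest:
        other = sum(c for _, c in rest)
        rows.append((f"Other values ({len(rest)})", other, 100 * other / total))
    return rows

# Toy data, not taken from the dataset.
rows = frequency_table(["a", "a", "a", "b", "b", "c"], top_n=2)
for value, count, pct in rows:
    print(value, count, f"{pct:.1f}%")

# The first row of the table above: 25 of 171 observations.
top_share = 100 * 25 / 171
```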

Length

Histogram of lengths of the category

조합명
Text

Distinct 139
Distinct (%) 81.3%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB

Length

Max length 27
Median length 21
Mean length 11.754386
Min length 3

Characters and Unicode

Total characters 2010
Distinct characters 194
Distinct categories 7
Distinct scripts 4
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
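
As an illustration of those character properties, Python's standard `unicodedata` module returns the general category of each code point. The sketch below counts categories for one of the sample values of this variable; the two-letter codes map to the names used in this report (Lo = Other Letter, Nd = Decimal Number, Ps = Open Punctuation, Pe = Close Punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (two-letter codes) over the characters of text."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(가칭)인창대명2지역주택조합"  # one of the sample rows of this variable
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Script and block names are not exposed by `unicodedata`, so reproducing those parts of the report would need an additional lookup table or library.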

Unique

Unique 110
Unique (%) 64.3%

Sample

1st row 안흥동지역주택조합
2nd row 중리신도시현대지역주택조합
3rd row 중리신도시현대지역주택조합2
4th row (가칭)인창대명지역주택조합
5th row (가칭)인창대명2지역주택조합
ValueCountFrequency (%)
지역주택조합 50
 
20.7%
가칭)인창대명2지역주택조합 3
 
1.2%
인창동 3
 
1.2%
수택지역주택조합 3
 
1.2%
남양주 3
 
1.2%
벨리체 2
 
0.8%
금곡역 2
 
0.8%
안흥동지역주택조합 2
 
0.8%
마석우리2지역주택조합 2
 
0.8%
퇴계원역1차지역주택조합 2
 
0.8%
Other values (145) 170
70.2%

Most occurring characters

ValueCountFrequency (%)
187
 
9.3%
178
 
8.9%
172
 
8.6%
166
 
8.3%
158
 
7.9%
157
 
7.8%
71
 
3.5%
31
 
1.5%
) 26
 
1.3%
25
 
1.2%
Other values (184) 839
41.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1820
90.5%
Space Separator 71
 
3.5%
Decimal Number 53
 
2.6%
Close Punctuation 26
 
1.3%
Open Punctuation 23
 
1.1%
Uppercase Letter 14
 
0.7%
Dash Punctuation 3
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
187
 
10.3%
178
 
9.8%
172
 
9.5%
166
 
9.1%
158
 
8.7%
157
 
8.6%
31
 
1.7%
25
 
1.4%
23
 
1.3%
21
 
1.2%
Other values (168) 702
38.6%
Decimal Number
ValueCountFrequency (%)
2 23
43.4%
3 13
24.5%
1 10
18.9%
5 3
 
5.7%
4 3
 
5.7%
7 1
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
A 5
35.7%
B 3
21.4%
L 3
21.4%
G 1
 
7.1%
X 1
 
7.1%
T 1
 
7.1%
Space Separator
ValueCountFrequency (%)
71
100.0%
Close Punctuation
ValueCountFrequency (%)
) 26
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1819
90.5%
Common 176
 
8.8%
Latin 14
 
0.7%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
187
 
10.3%
178
 
9.8%
172
 
9.5%
166
 
9.1%
158
 
8.7%
157
 
8.6%
31
 
1.7%
25
 
1.4%
23
 
1.3%
21
 
1.2%
Other values (167) 701
38.5%
Common
ValueCountFrequency (%)
71
40.3%
) 26
 
14.8%
( 23
 
13.1%
2 23
 
13.1%
3 13
 
7.4%
1 10
 
5.7%
5 3
 
1.7%
- 3
 
1.7%
4 3
 
1.7%
7 1
 
0.6%
Latin
ValueCountFrequency (%)
A 5
35.7%
B 3
21.4%
L 3
21.4%
G 1
 
7.1%
X 1
 
7.1%
T 1
 
7.1%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1819
90.5%
ASCII 190
 
9.5%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
187
 
10.3%
178
 
9.8%
172
 
9.5%
166
 
9.1%
158
 
8.7%
157
 
8.6%
31
 
1.7%
25
 
1.4%
23
 
1.3%
21
 
1.2%
Other values (167) 701
38.5%
ASCII
ValueCountFrequency (%)
71
37.4%
) 26
 
13.7%
( 23
 
12.1%
2 23
 
12.1%
3 13
 
6.8%
1 10
 
5.3%
A 5
 
2.6%
5 3
 
1.6%
B 3
 
1.6%
L 3
 
1.6%
Other values (6) 10
 
5.3%
CJK
ValueCountFrequency (%)
1
100.0%

조합장명
Text

MISSING 

Distinct 67
Distinct (%) 98.5%
Missing 103
Missing (%) 60.2%
Memory size 1.5 KiB

Length

Max length 17
Median length 3
Mean length 3.1764706
Min length 2

Characters and Unicode

Total characters 216
Distinct characters 91
Distinct categories 5
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 66
Unique (%) 97.1%

Sample

1st row 남정민
2nd row 김광수
3rd row 서은석
4th row 이요한
5th row 윤성섭
ValueCountFrequency (%)
윤동규 2
 
2.9%
이호열 1
 
1.5%
김나연 1
 
1.5%
박웅 1
 
1.5%
이인동 1
 
1.5%
이한승 1
 
1.5%
강원진 1
 
1.5%
곽종근 1
 
1.5%
이경택 1
 
1.5%
부승균 1
 
1.5%
Other values (57) 57
83.8%

Most occurring characters

ValueCountFrequency (%)
15
 
6.9%
8
 
3.7%
8
 
3.7%
6
 
2.8%
6
 
2.8%
6
 
2.8%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.9%
Other values (81) 148
68.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 209
96.8%
Open Punctuation 2
 
0.9%
Close Punctuation 2
 
0.9%
Decimal Number 2
 
0.9%
Other Punctuation 1
 
0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
15
 
7.2%
8
 
3.8%
8
 
3.8%
6
 
2.9%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
5
 
2.4%
4
 
1.9%
Other values (76) 141
67.5%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 209
96.8%
Common 7
 
3.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
15
 
7.2%
8
 
3.8%
8
 
3.8%
6
 
2.9%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
5
 
2.4%
4
 
1.9%
Other values (76) 141
67.5%
Common
ValueCountFrequency (%)
( 2
28.6%
) 2
28.6%
1 1
14.3%
/ 1
14.3%
2 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 209
96.8%
ASCII 7
 
3.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
15
 
7.2%
8
 
3.8%
8
 
3.8%
6
 
2.9%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
5
 
2.4%
4
 
1.9%
Other values (76) 141
67.5%
ASCII
ValueCountFrequency (%)
( 2
28.6%
) 2
28.6%
1 1
14.3%
/ 1
14.3%
2 1
14.3%

모집신고일자
Categorical

HIGH CORRELATION 

Distinct 33
Distinct (%) 19.3%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB
<NA> 105
2017-06-03 5
2017-04-11 4
2020-04-23 3
2020-06-08 3
Other values (28) 51

Length

Max length 10
Median length 4
Mean length 6.3157895
Min length 4

Unique

Unique 8
Unique (%) 4.7%

Sample

1st row 2017-06-03
2nd row 2017-09-22
3rd row 2019-03-11
4th row 2020-04-23
5th row 2020-06-08

Common Values

Value Count Frequency (%)
<NA> 105 61.4%
2017-06-03 5 2.9%
2017-04-11 4 2.3%
2020-04-23 3 1.8%
2020-06-08 3 1.8%
2019-04-09 3 1.8%
2020-01-03 3 1.8%
2020-03-09 3 1.8%
2019-03-11 2 1.2%
2018-07-11 2 1.2%
Other values (23) 38 22.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 105
61.4%
2017-06-03 5
 
2.9%
2017-04-11 4
 
2.3%
2020-04-23 3
 
1.8%
2020-06-08 3
 
1.8%
2019-04-09 3
 
1.8%
2020-01-03 3
 
1.8%
2020-03-09 3
 
1.8%
2019-07-25 2
 
1.2%
2020-12-23 2
 
1.2%
Other values (23) 38
 
22.2%

설립인가일자
Date

MISSING 

Distinct 96
Distinct (%) 73.3%
Missing 40
Missing (%) 23.4%
Memory size 1.5 KiB
Minimum 2001-10-09 00:00:00
Maximum 2059-12-31 00:00:00
Histogram with fixed size bins (bins=50)
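
The reported maximum, 2059-12-31, lies decades after the analysis date (2023-12-10) for a field that records an approval that has already happened. The source does not say whether this is a placeholder or an entry error, so the sketch below only flags such values for review; the short list of dates is illustrative and `flag_future_dates` is a hypothetical helper.

```python
from datetime import date

def flag_future_dates(values, as_of):
    """Return the non-missing dates that fall after the reference date."""
    return [d for d in values if d is not None and d > as_of]

as_of = date(2023, 12, 10)  # analysis date given in the Reproduction section

# Illustrative values: the reported minimum, one sample-row date,
# a missing entry, and the reported maximum.
values = [date(2001, 10, 9), date(2018, 10, 19), None, date(2059, 12, 31)]

suspicious = flag_future_dates(values, as_of)
print(suspicious)
```

The same maximum appears for 사업계획승인일 and 착공신고일, so the check is worth running on all three date columns.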

사업계획승인일
Date

MISSING 

Distinct 35
Distinct (%) 70.0%
Missing 121
Missing (%) 70.8%
Memory size 1.5 KiB
Minimum 2008-06-04 00:00:00
Maximum 2059-12-31 00:00:00
Histogram with fixed size bins (bins=35)

착공신고일
Date

MISSING 

Distinct 29
Distinct (%) 63.0%
Missing 125
Missing (%) 73.1%
Memory size 1.5 KiB
Minimum 2008-11-03 00:00:00
Maximum 2059-12-31 00:00:00
Histogram with fixed size bins (bins=29)

사업지위치
Text

Distinct 155
Distinct (%) 90.6%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB

Length

Max length 41
Median length 29
Mean length 21.187135
Min length 7

Characters and Unicode

Total characters 3623
Distinct characters 156
Distinct categories 8
Distinct scripts 3
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 143
Unique (%) 83.6%

Sample

1st row 이천시 안흥동 279-1번지 일원
2nd row 이천시 증일동 60-5번지 일원(2단지)
3rd row 이천시 증일동 79-4번지 일원(1단지)
4th row 경기도 구리시 인창동 610-81번지 일원
5th row 경기도 구리시 인창동 610-20번지 일원
ValueCountFrequency (%)
경기도 139
 
16.5%
일원 66
 
7.9%
평택시 19
 
2.3%
15
 
1.8%
남양주시 14
 
1.7%
화성시 13
 
1.5%
용인시 12
 
1.4%
덕풍동 12
 
1.4%
구리시 12
 
1.4%
파주시 9
 
1.1%
Other values (308) 529
63.0%

Most occurring characters

ValueCountFrequency (%)
672
 
18.5%
155
 
4.3%
151
 
4.2%
144
 
4.0%
139
 
3.8%
1 120
 
3.3%
118
 
3.3%
- 114
 
3.1%
92
 
2.5%
2 91
 
2.5%
Other values (146) 1827
50.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2113
58.3%
Decimal Number 685
 
18.9%
Space Separator 672
 
18.5%
Dash Punctuation 114
 
3.1%
Close Punctuation 15
 
0.4%
Open Punctuation 15
 
0.4%
Uppercase Letter 5
 
0.1%
Other Punctuation 4
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
155
 
7.3%
151
 
7.1%
144
 
6.8%
139
 
6.6%
118
 
5.6%
92
 
4.4%
88
 
4.2%
85
 
4.0%
80
 
3.8%
60
 
2.8%
Other values (128) 1001
47.4%
Decimal Number
ValueCountFrequency (%)
1 120
17.5%
2 91
13.3%
3 82
12.0%
4 79
11.5%
7 62
9.1%
5 61
8.9%
9 53
7.7%
6 50
7.3%
0 49
7.2%
8 38
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
C 2
40.0%
R 2
40.0%
B 1
20.0%
Space Separator
ValueCountFrequency (%)
672
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 114
100.0%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Open Punctuation
ValueCountFrequency (%)
( 15
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2113
58.3%
Common 1505
41.5%
Latin 5
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
155
 
7.3%
151
 
7.1%
144
 
6.8%
139
 
6.6%
118
 
5.6%
92
 
4.4%
88
 
4.2%
85
 
4.0%
80
 
3.8%
60
 
2.8%
Other values (128) 1001
47.4%
Common
ValueCountFrequency (%)
672
44.7%
1 120
 
8.0%
- 114
 
7.6%
2 91
 
6.0%
3 82
 
5.4%
4 79
 
5.2%
7 62
 
4.1%
5 61
 
4.1%
9 53
 
3.5%
6 50
 
3.3%
Other values (5) 121
 
8.0%
Latin
ValueCountFrequency (%)
C 2
40.0%
R 2
40.0%
B 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2113
58.3%
ASCII 1510
41.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
672
44.5%
1 120
 
7.9%
- 114
 
7.5%
2 91
 
6.0%
3 82
 
5.4%
4 79
 
5.2%
7 62
 
4.1%
5 61
 
4.0%
9 53
 
3.5%
6 50
 
3.3%
Other values (8) 126
 
8.3%
Hangul
ValueCountFrequency (%)
155
 
7.3%
151
 
7.1%
144
 
6.8%
139
 
6.6%
118
 
5.6%
92
 
4.4%
88
 
4.2%
85
 
4.0%
80
 
3.8%
60
 
2.8%
Other values (128) 1001
47.4%

세대수(조합원)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct 60
Distinct (%) 100.0%
Missing 111
Missing (%) 64.9%
Infinite 0
Infinite (%) 0.0%
Mean 613.53333
Minimum 27
Maximum 2587
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.6 KiB

Quantile statistics

Minimum 27
5-th percentile 84.3
Q1 227.75
median 493
Q3 899
95-th percentile 1286.85
Maximum 2587
Range 2560
Interquartile range (IQR) 671.25

Descriptive statistics

Standard deviation 477.33847
Coefficient of variation (CV) 0.77801555
Kurtosis 3.4086815
Mean 613.53333
Median Absolute Deviation (MAD) 323
Skewness 1.3929669
Sum 36812
Variance 227852.02
Monotonicity Not monotonic
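
Several of the derived statistics above can be checked against each other. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported quartiles, extremes, mean and standard deviation; only figures shown in this block are used.

```python
# Figures reported for 세대수(조합원).
minimum, maximum = 27, 2587
q1, q3 = 227.75, 899
mean, std = 613.53333, 477.33847

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half of the data
cv = std / mean                   # standard deviation relative to the mean
variance = std ** 2               # variance is the squared standard deviation

print(value_range, iqr, round(cv, 4), round(variance, 1))
```

A CV near 0.78 together with positive skewness indicates a long right tail: the single value of 2587 is far above the 95th percentile of 1286.85.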
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
399 1
 
0.6%
2587 1
 
0.6%
377 1
 
0.6%
229 1
 
0.6%
325 1
 
0.6%
326 1
 
0.6%
145 1
 
0.6%
957 1
 
0.6%
890 1
 
0.6%
71 1
 
0.6%
Other values (50) 50
29.2%
(Missing) 111
64.9%
ValueCountFrequency (%)
27 1
0.6%
40 1
0.6%
71 1
0.6%
85 1
0.6%
94 1
0.6%
97 1
0.6%
145 1
0.6%
146 1
0.6%
152 1
0.6%
157 1
0.6%
ValueCountFrequency (%)
2587 1
0.6%
1486 1
0.6%
1398 1
0.6%
1281 1
0.6%
1273 1
0.6%
1263 1
0.6%
1246 1
0.6%
1170 1
0.6%
1161 1
0.6%
1125 1
0.6%

세대수(총분양)
Real number (ℝ)

HIGH CORRELATION 

Distinct 131
Distinct (%) 76.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 814.82456
Minimum 104
Maximum 2981
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.6 KiB

Quantile statistics

Minimum 104
5-th percentile 141
Q1 405
median 699
Q3 1045
95-th percentile 1878.5
Maximum 2981
Range 2877
Interquartile range (IQR) 640

Descriptive statistics

Standard deviation 569.33847
Coefficient of variation (CV) 0.69872522
Kurtosis 1.8014846
Mean 814.82456
Median Absolute Deviation (MAD) 315
Skewness 1.2641934
Sum 139335
Variance 324146.3
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
244 5
 
2.9%
472 4
 
2.3%
996 3
 
1.8%
432 3
 
1.8%
786 3
 
1.8%
140 3
 
1.8%
266 3
 
1.8%
420 3
 
1.8%
920 3
 
1.8%
135 2
 
1.2%
Other values (121) 139
81.3%
ValueCountFrequency (%)
104 1
 
0.6%
118 1
 
0.6%
122 1
 
0.6%
132 1
 
0.6%
135 2
1.2%
140 3
1.8%
142 2
1.2%
160 1
 
0.6%
192 1
 
0.6%
194 1
 
0.6%
ValueCountFrequency (%)
2981 1
0.6%
2908 1
0.6%
2581 1
0.6%
2329 1
0.6%
2280 1
0.6%
2124 1
0.6%
2090 1
0.6%
1963 1
0.6%
1885 1
0.6%
1872 1
0.6%

대지면적
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct 60
Distinct (%) 98.4%
Missing 110
Missing (%) 64.3%
Infinite 0
Infinite (%) 0.0%
Mean 41418.164
Minimum 0
Maximum 151775
Zeros 2
Zeros (%) 1.2%
Negative 0
Negative (%) 0.0%
Memory size 1.6 KiB
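
The percentages in this block do not share a denominator. Judging from the numbers, the missing and zero shares are taken over all 171 rows, while the distinct share is taken over the 61 non-missing values. A short sketch of that arithmetic, using only the figures above:

```python
n_rows = 171          # observations in the dataset
n_missing = 110       # missing 대지면적 values
n_distinct = 60
n_zeros = 2

n_present = n_rows - n_missing                 # values actually recorded

missing_pct = 100 * n_missing / n_rows         # share of all rows
zeros_pct = 100 * n_zeros / n_rows             # share of all rows
distinct_pct = 100 * n_distinct / n_present    # share of recorded values only
zeros_of_present_pct = 100 * n_zeros / n_present

print(n_present, round(missing_pct, 1), round(zeros_pct, 1), round(distinct_pct, 1))
```

Relative to the recorded values only, the two zeros make up about 3.3% rather than 1.2%, which matters when deciding whether a site area of 0 should be treated as missing.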

Quantile statistics

Minimum 0
5-th percentile 6007
Q1 16180
median 33789
Q3 61634
95-th percentile 97925
Maximum 151775
Range 151775
Interquartile range (IQR) 45454

Descriptive statistics

Standard deviation 31958.797
Coefficient of variation (CV) 0.77161307
Kurtosis 1.0338109
Mean 41418.164
Median Absolute Deviation (MAD) 19236
Skewness 1.0731278
Sum 2526508
Variance 1.0213647 × 10^9
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2
 
1.2%
19715 1
 
0.6%
15721 1
 
0.6%
73451 1
 
0.6%
79571 1
 
0.6%
61634 1
 
0.6%
34703 1
 
0.6%
43251 1
 
0.6%
71678 1
 
0.6%
92130 1
 
0.6%
Other values (50) 50
29.2%
(Missing) 110
64.3%
ValueCountFrequency (%)
0 2
1.2%
4306 1
0.6%
6007 1
0.6%
6029 1
0.6%
6130 1
0.6%
10197 1
0.6%
11242 1
0.6%
12252 1
0.6%
12470 1
0.6%
12543 1
0.6%
ValueCountFrequency (%)
151775 1
0.6%
105691 1
0.6%
104013 1
0.6%
97925 1
0.6%
92130 1
0.6%
89550 1
0.6%
83175 1
0.6%
80677 1
0.6%
79571 1
0.6%
77610 1
0.6%

연면적
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct 48
Distinct (%) 94.1%
Missing 120
Missing (%) 70.2%
Infinite 0
Infinite (%) 0.0%
Mean 136594.96
Minimum 0
Maximum 434955
Zeros 4
Zeros (%) 2.3%
Negative 0
Negative (%) 0.0%
Memory size 1.6 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 63077
median 119231
Q3 212314.5
95-th percentile 278801
Maximum 434955
Range 434955
Interquartile range (IQR) 149237.5

Descriptive statistics

Standard deviation 98889.459
Coefficient of variation (CV) 0.72396126
Kurtosis 0.62439932
Mean 136594.96
Median Absolute Deviation (MAD) 66686
Skewness 0.83490956
Sum 6966343
Variance 9.7791252 × 10^9
Monotonicity Not monotonic
Histogram with fixed size bins (bins=48)
ValueCountFrequency (%)
0 4
 
2.3%
89992 1
 
0.6%
212277 1
 
0.6%
119231 1
 
0.6%
212352 1
 
0.6%
227348 1
 
0.6%
59994 1
 
0.6%
52545 1
 
0.6%
389411 1
 
0.6%
66288 1
 
0.6%
Other values (38) 38
 
22.2%
(Missing) 120
70.2%
ValueCountFrequency (%)
0 4
2.3%
14928 1
 
0.6%
23493 1
 
0.6%
30905 1
 
0.6%
48202 1
 
0.6%
50992 1
 
0.6%
52545 1
 
0.6%
57600 1
 
0.6%
57767 1
 
0.6%
59994 1
 
0.6%
ValueCountFrequency (%)
434955 1
0.6%
389411 1
0.6%
289208 1
0.6%
268394 1
0.6%
264606 1
0.6%
245083 1
0.6%
242542 1
0.6%
240275 1
0.6%
229476 1
0.6%
227348 1
0.6%

시공사명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct 29
Distinct (%) 17.0%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB
<NA> 113
㈜서희건설 11
- 8
미정 4
현대건설㈜ 3
Other values (24) 32

Length

Max length 8
Median length 4
Mean length 4.1461988
Min length 1

Unique

Unique 17
Unique (%) 9.9%

Sample

1st row <NA>
2nd row <NA>
3rd row <NA>
4th row <NA>
5th row <NA>

Common Values

Value Count Frequency (%)
<NA> 113 66.1%
㈜서희건설 11 6.4%
- 8 4.7%
미정 4 2.3%
현대건설㈜ 3 1.8%
㈜포스코건설 3 1.8%
서희건설 2 1.2%
코오롱건설 2 1.2%
양우건설㈜ 2 1.2%
㈜양우건설 2 1.2%
Other values (19) 21 12.3%
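
The table shows two data-quality patterns: placeholder strings ("<NA>", "-", "미정") are counted as ordinary categories, which is why the variable reports 0 missing values, and the same builder appears under several spellings (㈜서희건설 and 서희건설; 양우건설㈜ and ㈜양우건설). A minimal cleaning sketch follows. Treating "미정" (undecided) as missing and stripping the corporate marker are assumptions about how one might want to analyse the column, and `normalize_builder` is a hypothetical helper.

```python
# Strings assumed to mean "no builder recorded".
PLACEHOLDERS = {"<NA>", "-", "미정", ""}
# Corporate-form markers that only differ in placement or encoding.
CORP_MARKERS = ("㈜", "(주)")

def normalize_builder(name):
    """Map placeholders to None and strip corporate markers from builder names."""
    if name is None:
        return None
    cleaned = name.strip()
    if cleaned in PLACEHOLDERS:
        return None
    for marker in CORP_MARKERS:
        cleaned = cleaned.replace(marker, "")
    return cleaned.strip() or None

raw = ["<NA>", "㈜서희건설", "서희건설", "-", "미정", "양우건설㈜", "㈜양우건설"]
cleaned = [normalize_builder(v) for v in raw]
print(cleaned)
```

With this mapping the two spellings of 서희건설 in the table would merge into a single category of 13 rows, and the three placeholders would be counted as missing instead of as values.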

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 113
65.7%
㈜서희건설 11
 
6.4%
8
 
4.7%
미정 4
 
2.3%
현대건설㈜ 3
 
1.7%
㈜포스코건설 3
 
1.7%
㈜양우건설 2
 
1.2%
㈜한양건설 2
 
1.2%
대림산업(주 2
 
1.2%
양우건설㈜ 2
 
1.2%
Other values (20) 22
 
12.8%

관리기관명
Categorical

HIGH CORRELATION 

Distinct 11
Distinct (%) 6.4%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB
<NA> 102
평택시 주택과 19
화성시 주택과 13
파주시 주택과 9
광주시청 9
Other values (6) 19

Length

Max length 16
Median length 4
Mean length 5.2807018
Min length 4

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row <NA>
2nd row <NA>
3rd row <NA>
4th row <NA>
5th row <NA>

Common Values

Value Count Frequency (%)
<NA> 102 59.6%
평택시 주택과 19 11.1%
화성시 주택과 13 7.6%
파주시 주택과 9 5.3%
광주시청 9 5.3%
김포시 주택과 4 2.3%
의정부시 주택과 3 1.8%
오산시 주택과 3 1.8%
포천시 건축과(꽁동주택허가팀) 3 1.8%
수원시 공동주택과 3 1.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 102
44.2%
주택과 51
22.1%
평택시 19
 
8.2%
화성시 13
 
5.6%
파주시 9
 
3.9%
광주시청 9
 
3.9%
김포시 4
 
1.7%
의정부시 3
 
1.3%
오산시 3
 
1.3%
포천시 3
 
1.3%
Other values (5) 15
 
6.5%

관리기관전화번호
Categorical

HIGH CORRELATION 

Distinct 17
Distinct (%) 9.9%
Missing 0
Missing (%) 0.0%
Memory size 1.5 KiB
<NA> 102
031-8024-4146 19
031-5189-6303 7
031-760-8661 4
031-940-4752 4
Other values (12) 35

Length

Max length 13
Median length 4
Mean length 7.4327485
Min length 4

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row <NA>
2nd row <NA>
3rd row <NA>
4th row <NA>
5th row <NA>

Common Values

Value Count Frequency (%)
<NA> 102 59.6%
031-8024-4146 19 11.1%
031-5189-6303 7 4.1%
031-760-8661 4 2.3%
031-940-4752 4 2.3%
031-980-2402 4 2.3%
031-828-4494 3 1.8%
031-8036-7768 3 1.8%
031-538-2384 3 1.8%
031-940-4770 3 1.8%
Other values (7) 19 11.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 102
59.6%
031-8024-4146 19
 
11.1%
031-5189-6303 7
 
4.1%
031-760-8661 4
 
2.3%
031-940-4752 4
 
2.3%
031-980-2402 4
 
2.3%
031-228-3388 3
 
1.8%
031-887-2402 3
 
1.8%
031-760-8664 3
 
1.8%
031-5189-2405 3
 
1.8%
Other values (7) 19
 
11.1%

Interactions

[16 pairwise interaction plots; images not preserved in this extraction]

Correlations

Variable | 시군명 | 조합장명 | 모집신고일자 | 설립인가일자 | 사업계획승인일 | 착공신고일 | 세대수(조합원) | 세대수(총분양) | 대지면적 | 연면적 | 시공사명 | 관리기관명 | 관리기관전화번호
시군명 | 1.000 | 1.000 | 0.990 | 0.998 | 0.986 | 0.984 | 0.489 | 0.664 | 0.323 | 0.672 | 0.950 | 1.000 | 1.000
조합장명 | 1.000 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
모집신고일자 | 0.990 | NaN | 1.000 | 0.995 | NaN | NaN | NaN | 0.919 | NaN | NaN | NaN | NaN | NaN
설립인가일자 | 0.998 | 1.000 | 0.995 | 1.000 | 1.000 | 1.000 | 0.000 | 0.980 | 0.000 | 0.861 | 1.000 | 1.000 | 0.947
사업계획승인일 | 0.986 | 1.000 | NaN | 1.000 | 1.000 | 0.998 | 0.000 | 0.000 | 0.000 | 0.547 | 0.994 | 0.986 | 0.916
착공신고일 | 0.984 | 1.000 | NaN | 1.000 | 0.998 | 1.000 | 0.324 | 0.000 | 0.000 | 0.542 | 0.993 | 0.984 | 0.938
세대수(조합원) | 0.489 | 1.000 | NaN | 0.000 | 0.000 | 0.324 | 1.000 | 0.852 | 0.905 | 0.787 | 0.484 | 0.489 | 0.481
세대수(총분양) | 0.664 | 1.000 | 0.919 | 0.980 | 0.000 | 0.000 | 0.852 | 1.000 | 0.820 | 0.920 | 0.000 | 0.634 | 0.437
대지면적 | 0.323 | 1.000 | NaN | 0.000 | 0.000 | 0.000 | 0.905 | 0.820 | 1.000 | 0.755 | 0.000 | 0.348 | 0.634
연면적 | 0.672 | 1.000 | NaN | 0.861 | 0.547 | 0.542 | 0.787 | 0.920 | 0.755 | 1.000 | 0.147 | 0.672 | 0.571
시공사명 | 0.950 | 1.000 | NaN | 1.000 | 0.994 | 0.993 | 0.484 | 0.000 | 0.000 | 0.147 | 1.000 | 0.950 | 0.907
관리기관명 | 1.000 | 1.000 | NaN | 1.000 | 0.986 | 0.984 | 0.489 | 0.634 | 0.348 | 0.672 | 0.950 | 1.000 | 1.000
관리기관전화번호 | 1.000 | 1.000 | NaN | 0.947 | 0.916 | 0.938 | 0.481 | 0.437 | 0.634 | 0.571 | 0.907 | 1.000 | 1.000

Variable | 시군명 | 모집신고일자 | 관리기관전화번호 | 관리기관명 | 시공사명
시군명 | 1.000 | 0.719 | 0.948 | 1.000 | 0.586
모집신고일자 | 0.719 | 1.000 | NaN | NaN | NaN
관리기관전화번호 | 0.948 | NaN | 1.000 | 0.948 | 0.454
관리기관명 | 1.000 | NaN | 0.948 | 1.000 | 0.586
시공사명 | 0.586 | NaN | 0.454 | 0.586 | 1.000

Variable | 세대수(조합원) | 세대수(총분양) | 대지면적 | 연면적 | 시군명 | 모집신고일자 | 시공사명 | 관리기관명 | 관리기관전화번호
세대수(조합원) | 1.000 | 0.934 | 0.761 | 0.683 | 0.261 | 0.000 | 0.112 | 0.273 | 0.220
세대수(총분양) | 0.934 | 1.000 | 0.759 | 0.794 | 0.298 | 0.532 | 0.000 | 0.236 | 0.167
대지면적 | 0.761 | 0.759 | 1.000 | 0.787 | 0.148 | 0.000 | 0.000 | 0.167 | 0.325
연면적 | 0.683 | 0.794 | 0.787 | 1.000 | 0.266 | 0.000 | 0.000 | 0.266 | 0.264
시군명 | 0.261 | 0.298 | 0.148 | 0.266 | 1.000 | 0.719 | 0.586 | 1.000 | 0.948
모집신고일자 | 0.000 | 0.532 | 0.000 | 0.000 | 0.719 | 1.000 | 0.000 | 0.000 | 0.000
시공사명 | 0.112 | 0.000 | 0.000 | 0.000 | 0.586 | 0.000 | 1.000 | 0.586 | 0.454
관리기관명 | 0.273 | 0.236 | 0.167 | 0.266 | 1.000 | 0.000 | 0.586 | 1.000 | 0.948
관리기관전화번호 | 0.220 | 0.167 | 0.325 | 0.264 | 0.948 | 0.000 | 0.454 | 0.948 | 1.000
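
This extraction does not say which coefficient each matrix reports. As an illustration only, one widely used association measure for a pair of categorical columns is Cramér's V, which is 1 when each value of one column determines the value of the other and 0 when the columns are independent. The sketch below implements the plain (uncorrected) form on toy data; it is not claimed to be the calculation behind the numbers above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k < 1:
        return float("nan")  # undefined when a column has a single category
    chi2 = 0.0
    for a, ra in row_totals.items():
        for b, cb in col_totals.items():
            expected = ra * cb / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy data: the second column is fully determined by the first ...
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
# ... and here the two columns are independent.
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value of 1.000 between 시군명 and 관리기관명 is what one would expect under any such measure if each city maps to exactly one managing office.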

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
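
Nullity correlation can be sketched as the Pearson correlation between two present/missing indicator columns: +1 means the two fields are always missing together, -1 means one is present exactly when the other is missing. The example uses toy columns with `None` for missing entries; `nullity_correlation` is a hypothetical helper, not a function of the profiling library.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the present (1) / missing (0) indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns: office and phone are missing on the same rows,
# while "other" is present exactly where they are missing.
office = [None, "x", None, "y"]
phone = [None, "1", None, "2"]
other = ["p", None, "q", None]

together = nullity_correlation(office, phone)
opposite = nullity_correlation(office, other)
print(together, opposite)
```

Note that this only works once placeholders such as "<NA>" have been converted to real missing values; columns where the placeholder is stored as a string appear fully populated.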

Sample

Row | 시군명 | 조합명 | 조합장명 | 모집신고일자 | 설립인가일자 | 사업계획승인일 | 착공신고일 | 사업지위치 | 세대수(조합원) | 세대수(총분양) | 대지면적 | 연면적 | 시공사명 | 관리기관명 | 관리기관전화번호
0 | 이천시 | 안흥동지역주택조합 | <NA> | 2017-06-03 | 2018-10-19 | <NA> | <NA> | 이천시 안흥동 279-1번지 일원 | <NA> | 945 | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 이천시 | 중리신도시현대지역주택조합 | <NA> | 2017-09-22 | 2019-02-08 | <NA> | <NA> | 이천시 증일동 60-5번지 일원(2단지) | <NA> | 885 | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 이천시 | 중리신도시현대지역주택조합2 | <NA> | 2019-03-11 | 2020-08-28 | <NA> | <NA> | 이천시 증일동 79-4번지 일원(1단지) | <NA> | 937 | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 구리시 | (가칭)인창대명지역주택조합 | <NA> | 2020-04-23 | <NA> | <NA> | <NA> | 경기도 구리시 인창동 610-81번지 일원 | <NA> | 239 | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 구리시 | (가칭)인창대명2지역주택조합 | <NA> | 2020-06-08 | <NA> | <NA> | <NA> | 경기도 구리시 인창동 610-20번지 일원 | <NA> | 140 | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 구리시 | 인창동 지역주택조합 | <NA> | 2019-04-09 | 2019-10-30 | <NA> | <NA> | 경기도 구리시 인창동 515-1번지 일원 | <NA> | 244 | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 구리시 | 수택지역주택조합 | <NA> | 2020-01-03 | 2020-05-22 | <NA> | <NA> | 경기도 구리시 수택동 266-2번지 일원 | <NA> | 266 | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 용인시 | 용인역북지역주택조합 | <NA> | <NA> | 2016-05-18 | <NA> | <NA> | 경기도 용인시 처인구 역북동 233번지 일원 | <NA> | 1872 | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 용인시 | 역북지역주택조합 | <NA> | <NA> | 2016-08-16 | <NA> | <NA> | 경기도 용인시 처인구 역북동 89-25번지 일원 | <NA> | 912 | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 용인시 | 용인역삼지역주택조합 | <NA> | <NA> | 2017-09-29 | <NA> | <NA> | 경기도 용인시 처인구 역삼도시개발구역 R1-3블록 | <NA> | 1042 | <NA> | <NA> | <NA> | <NA> | <NA>

Row | 시군명 | 조합명 | 조합장명 | 모집신고일자 | 설립인가일자 | 사업계획승인일 | 착공신고일 | 사업지위치 | 세대수(조합원) | 세대수(총분양) | 대지면적 | 연면적 | 시공사명 | 관리기관명 | 관리기관전화번호
161 | 남양주시 | 창현지역주택조합 | <NA> | 2020-06-26 | 2020-12-16 | <NA> | <NA> | 경기도 남양주시 화도읍 창현리 391-20 외 79필지 | <NA> | 1235 | <NA> | <NA> | <NA> | <NA> | <NA>
162 | 남양주시 | 가칭)부평2지구A-3BL지역주택조합 | <NA> | 2021-02-10 | <NA> | <NA> | <NA> | 경기도 남양주시 진접읍 부평리 734-37 일원 | <NA> | 704 | <NA> | <NA> | <NA> | <NA> | <NA>
163 | 남양주시 | 양지7지구 1단지 지역주택조합 | <NA> | 2021-08-20 | 2022-10-14 | <NA> | <NA> | 경기도 남양주시 오남읍 양지리 101-2번지 일원 | <NA> | 1308 | <NA> | <NA> | <NA> | <NA> | <NA>
164 | 남양주시 | (가칭)양지5지구2블럭지역주택조합추진위원회 | <NA> | 2022-08-12 | <NA> | <NA> | <NA> | 경기도 남양주시 오남읍 양지리 496번지 일원 | <NA> | 1080 | <NA> | <NA> | <NA> | <NA> | <NA>
165 | 남양주시 | (가칭)장현지구지역주택조합추진위원회 | <NA> | 2022-08-30 | <NA> | <NA> | <NA> | 경기도 남양주시 진접읍 장현리 593번지 외6필지 | <NA> | 387 | <NA> | <NA> | <NA> | <NA> | <NA>
166 | 남양주시 | (가칭)양지5지구 3블럭 지역주택조합추진위원회 | <NA> | 2022-10-20 | <NA> | <NA> | <NA> | 경기도 남양주시 오남읍 양지리 186-10번지 일원 | <NA> | 1054 | <NA> | <NA> | <NA> | <NA> | <NA>
167 | 남양주시 | (가칭)화도현대지역주택조합추진위원회 | <NA> | 2022-12-13 | <NA> | <NA> | <NA> | 경기도 남양주시 화도읍 창현리 산46번지 일원 | <NA> | 420 | <NA> | <NA> | <NA> | <NA> | <NA>
168 | 이천시 | 안흥동지역주택조합 | <NA> | <NA> | 2018-10-19 | <NA> | <NA> | 이천시 안흥동 279-1번지 일원 | <NA> | 945 | <NA> | <NA> | <NA> | <NA> | <NA>
169 | 이천시 | 중리신도시현대지역주택조합 | <NA> | 2017-09-22 | 2019-02-08 | <NA> | <NA> | 이천시 증일동 60-5번지 일원(2단지) | <NA> | 885 | <NA> | <NA> | <NA> | <NA> | <NA>
170 | 이천시 | 중리신도시현대지역주택조합2 | <NA> | 2019-03-11 | 2020-08-28 | <NA> | <NA> | 이천시 증일동 79-4번지 일원(1단지) | <NA> | 937 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | 시군명 | 조합명 | 조합장명 | 모집신고일자 | 설립인가일자 | 사업계획승인일 | 착공신고일 | 사업지위치 | 세대수(조합원) | 세대수(총분양) | 대지면적 | 연면적 | 시공사명 | 관리기관명 | 관리기관전화번호 | # duplicates
1 | 구리시 | (가칭)인창대명2지역주택조합 | <NA> | 2020-06-08 | <NA> | <NA> | <NA> | 경기도 구리시 인창동 610-20번지 일원 | <NA> | 140 | <NA> | <NA> | <NA> | <NA> | <NA> | 3
3 | 구리시 | 수택지역주택조합 | <NA> | 2020-01-03 | 2020-05-22 | <NA> | <NA> | 경기도 구리시 수택동 266-2번지 일원 | <NA> | 266 | <NA> | <NA> | <NA> | <NA> | <NA> | 3
4 | 구리시 | 인창동 지역주택조합 | <NA> | 2019-04-09 | 2019-10-30 | <NA> | <NA> | 경기도 구리시 인창동 515-1번지 일원 | <NA> | 244 | <NA> | <NA> | <NA> | <NA> | <NA> | 3
0 | 가평군 | 디엘본가평설악지역주택조합 | <NA> | 2021-01-21 | 2021-05-31 | <NA> | <NA> | 경기도 가평군 설악면 신천리 산45-27 일원 | <NA> | 420 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
2 | 구리시 | (가칭)인창대명지역주택조합 | <NA> | 2020-04-23 | <NA> | <NA> | <NA> | 경기도 구리시 인창동 610-81번지 일원 | <NA> | 239 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
5 | 안양시 | 안양수리산지역주택조합 | <NA> | 2020-05-26 | 2020-12-01 | <NA> | <NA> | 경기도 안양시 만안구 안양동 산53 일원 | <NA> | 472 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
6 | 안양시 | 평촌동 지역주택조합 | <NA> | 2017-06-03 | 2017-06-28 | <NA> | <NA> | 경기도 안양시 동안구 평촌동 54-1 일원 | <NA> | 472 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
7 | 이천시 | 중리신도시현대지역주택조합 | <NA> | 2017-09-22 | 2019-02-08 | <NA> | <NA> | 이천시 증일동 60-5번지 일원(2단지) | <NA> | 885 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
8 | 이천시 | 중리신도시현대지역주택조합2 | <NA> | 2019-03-11 | 2020-08-28 | <NA> | <NA> | 이천시 증일동 79-4번지 일원(1단지) | <NA> | 937 | <NA> | <NA> | <NA> | <NA> | <NA> | 2
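
The table lists nine distinct rows whose "# duplicates" counts add up to 21, so the "9 duplicate rows" in the overview counts distinct repeated rows rather than surplus copies. Exact matching also misses near-duplicates: in the sample above, rows 0 and 168 describe the same association but differ in 모집신고일자, so they are not reported here. The sketch below shows both points; the rows are abbreviated to four columns (시군명, 조합명, 모집신고일자, 설립인가일자) and `duplicate_groups` is a hypothetical helper.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# "# duplicates" column of the table above.
group_sizes = [3, 3, 3, 2, 2, 2, 2, 2, 2]
n_groups = len(group_sizes)               # distinct repeated rows
rows_involved = sum(group_sizes)          # rows that belong to some group
redundant = rows_involved - n_groups      # copies beyond the first of each group

# Sample rows 0, 168, 1 and 169, abbreviated to four columns.
rows = [
    ("이천시", "안흥동지역주택조합", "2017-06-03", "2018-10-19"),
    ("이천시", "안흥동지역주택조합", None, "2018-10-19"),
    ("이천시", "중리신도시현대지역주택조합", "2017-09-22", "2019-02-08"),
    ("이천시", "중리신도시현대지역주택조합", "2017-09-22", "2019-02-08"),
]

exact = duplicate_groups(rows)                        # full-row match
by_name = duplicate_groups([r[:2] for r in rows])     # match on city and name only

print(n_groups, rows_involved, redundant, len(exact), len(by_name))
```

Matching on a key such as city plus association name is an assumption about what identifies a project; it catches the 안흥동지역주택조합 pair that the exact comparison leaves out.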