Overview

Dataset statistics

Number of variables: 7
Number of observations: 93
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 1.1%
Total size in memory: 5.8 KiB
Average record size in memory: 63.4 B

Variable types

Categorical: 2
Numeric: 5

Dataset

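Description: Competition ratios and minimum scores (cutoffs) by housing type for the selection of standby tenants for rental housing, covering civil servants who wish to move into civil-servant rental housing.
Selection priority: 1st (no home owned nationwide / no prior benefit), 2nd (no home owned in the housing's locality / no prior benefit), 3rd (no home owned nationwide / prior benefit), 4th (no home owned in the housing's locality / prior benefit).
※ When priority and points are tied: (1) the household with the longer nationwide no-home-ownership period, then (2) the applicant with the earlier date of birth.
Author: 공무원연금공단 (Government Employees Pension Service)
URL: https://www.data.go.kr/data/15093383/fileData.do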
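
The ordering rules in the description can be read as a multi-key sort. The following is a minimal sketch, assuming that higher points rank first within a priority tier (the description does not state the direction explicitly). The `Applicant` class, its field names, and the four applicants are hypothetical and are not part of the dataset, which holds only aggregates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Applicant:
    # All field names are hypothetical illustrations.
    name: str
    priority: int        # 1 (best) to 4, per the selection priority tiers
    points: int          # assumed: higher is better
    no_home_months: int  # length of the nationwide no-home-ownership period
    birth_date: date


def selection_key(a: Applicant):
    # Lower priority number first, then higher points,
    # then longer no-home period, then earlier birth date.
    return (a.priority, -a.points, -a.no_home_months, a.birth_date)


applicants = [
    Applicant("A", 1, 32, 60, date(1985, 3, 1)),
    Applicant("B", 1, 32, 84, date(1990, 7, 15)),
    Applicant("C", 2, 40, 120, date(1980, 1, 1)),
    Applicant("D", 1, 32, 84, date(1988, 2, 2)),
]

ranked = sorted(applicants, key=selection_key)
print([a.name for a in ranked])  # ['D', 'B', 'A', 'C']
```

Applicants B and D tie on priority, points, and no-home period, so the earlier birth date places D first. C has the most points but sits in a lower priority tier.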

Alerts

Dataset has 1 (1.1%) duplicate rows [Duplicates]
모집세대수 is highly overall correlated with 신청세대수 [High correlation]
신청세대수 is highly overall correlated with 모집세대수 and 1 other fields [High correlation]
경쟁률 is highly overall correlated with 최저가점 and 1 other fields [High correlation]
최저가점 is highly overall correlated with 경쟁률 and 1 other fields [High correlation]
단지명 is highly overall correlated with 신청세대수 and 1 other fields [High correlation]
최저선정순위 is highly overall correlated with 최저가점 [High correlation]
최저선정순위 is highly imbalanced (85.0%) [Imbalance]
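
The 85.0% imbalance figure for 최저선정순위 is consistent with a score of one minus the normalized Shannon entropy of the class counts. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the reported value from the counts 91 and 2 listed for this variable. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 minus the Shannon entropy normalized by log2(number of classes)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is treated as not scoreable
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


score = imbalance_score([91, 2])
print(f"{score:.1%}")  # 85.0%
```

A perfectly balanced two-class variable would score 0.0 under this definition, and the score approaches 1.0 as one class dominates.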

Reproduction

Analysis started: 2023-12-12 17:07:19.176687
Analysis finished: 2023-12-12 17:07:22.255883
Duration: 3.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

단지명 (complex name)
Categorical

HIGH CORRELATION 

Distinct: 44
Distinct (%): 47.3%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value | Count
경북안동 상록아파트 | 6
화성동탄 상록리슈빌아파트 | 6
광교A20(임대) | 5
서귀포 강정 | 5
충남 내포(임대) | 5
Other values (39) | 66

Length

Max length: 15
Median length: 12
Mean length: 9.3763441
Min length: 4

Unique

Unique: 23
Unique (%): 24.7%

Sample

1st row: 서울 상계
2nd row: 서울 상계
3rd row: 인천가좌
4th row: 인천가좌
5th row: 부천 상동(18)

Common Values

Value | Count | Frequency (%)
경북안동 상록아파트 | 6 | 6.5%
화성동탄 상록리슈빌아파트 | 6 | 6.5%
광교A20(임대) | 5 | 5.4%
서귀포 강정 | 5 | 5.4%
충남 내포(임대) | 5 | 5.4%
부산화명(롯데캐슬카이저) | 4 | 4.3%
김포한강AB15(임대) | 4 | 4.3%
전주 반월(광신) | 4 | 4.3%
대구복현(명문세가) | 3 | 3.2%
남원 죽항 | 3 | 3.2%
Other values (34) | 48 | 51.6%

Length

[Figure: Histogram of lengths of the category]
Value | Count | Frequency (%)
경북안동 | 6 | 4.4%
화성동탄 | 6 | 4.4%
상록리슈빌아파트 | 6 | 4.4%
상록아파트 | 6 | 4.4%
광교a20(임대 | 5 | 3.7%
서귀포 | 5 | 3.7%
강정 | 5 | 3.7%
충남 | 5 | 3.7%
내포(임대 | 5 | 3.7%
전주 | 4 | 3.0%
Other values (49) | 82 | 60.7%

주택형 (housing type)
Real number (ℝ)

Distinct: 31
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.849462
Minimum: 14
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 14
5-th percentile: 19.6
Q1: 25
Median: 46
Q3: 70
95-th percentile: 84
Maximum: 85
Range: 71
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 23.915134
Coefficient of variation (CV): 0.489568
Kurtosis: -1.4319897
Mean: 48.849462
Median Absolute Deviation (MAD): 21
Skewness: 0.30225658
Sum: 4543
Variance: 571.93361
Monotonicity: Not monotonic
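
Several of the figures above are derived from one another, which gives a quick consistency check. The sketch below uses only the 주택형 values reported in this section and the observation count of 93 from the overview. The variable names are illustrative.

```python
import math

# Values copied from the 주택형 statistics in this report.
n = 93
total = 4543
mean = 48.849462
std = 23.915134
variance = 571.93361
q1, q3 = 25, 70
minimum, maximum = 14, 85

assert math.isclose(total / n, mean, rel_tol=1e-6)     # mean = sum / n
assert math.isclose(std ** 2, variance, rel_tol=1e-6)  # variance = std squared

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
print(round(cv, 6), iqr, value_range)
```

The same relationships should hold for the other numeric variables, within the rounding shown in the report.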
[Figure: Histogram with fixed size bins (bins=31)]
Common Values

Value | Count | Frequency (%)
84 | 18 | 19.4%
59 | 15 | 16.1%
24 | 8 | 8.6%
25 | 6 | 6.5%
30 | 4 | 4.3%
33 | 4 | 4.3%
32 | 3 | 3.2%
29 | 3 | 3.2%
23 | 3 | 3.2%
39 | 3 | 3.2%
Other values (21) | 26 | 28.0%

Minimum 10 values

Value | Count | Frequency (%)
14 | 1 | 1.1%
16 | 1 | 1.1%
17 | 1 | 1.1%
18 | 1 | 1.1%
19 | 1 | 1.1%
20 | 1 | 1.1%
22 | 1 | 1.1%
23 | 3 | 3.2%
24 | 8 | 8.6%
25 | 6 | 6.5%

Maximum 10 values

Value | Count | Frequency (%)
85 | 1 | 1.1%
84 | 18 | 19.4%
82 | 1 | 1.1%
74 | 2 | 2.2%
70 | 2 | 2.2%
69 | 2 | 2.2%
59 | 15 | 16.1%
58 | 1 | 1.1%
54 | 1 | 1.1%
50 | 2 | 2.2%

모집세대수 (households recruited)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 18
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.1397849
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 7
95-th percentile: 23.4
Maximum: 60
Range: 59
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 9.0500918
Coefficient of variation (CV): 1.4740079
Kurtosis: 16.652684
Mean: 6.1397849
Median Absolute Deviation (MAD): 2
Skewness: 3.7113128
Sum: 571
Variance: 81.904161
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=18)]
Common Values

Value | Count | Frequency (%)
2 | 24 | 25.8%
1 | 17 | 18.3%
3 | 10 | 10.8%
5 | 9 | 9.7%
8 | 8 | 8.6%
4 | 7 | 7.5%
7 | 3 | 3.2%
24 | 2 | 2.2%
15 | 2 | 2.2%
6 | 2 | 2.2%
Other values (8) | 9 | 9.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 17 | 18.3%
2 | 24 | 25.8%
3 | 10 | 10.8%
4 | 7 | 7.5%
5 | 9 | 9.7%
6 | 2 | 2.2%
7 | 3 | 3.2%
8 | 8 | 8.6%
9 | 2 | 2.2%
12 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
60 | 1 | 1.1%
45 | 1 | 1.1%
30 | 1 | 1.1%
24 | 2 | 2.2%
23 | 1 | 1.1%
22 | 1 | 1.1%
18 | 1 | 1.1%
15 | 2 | 2.2%
12 | 1 | 1.1%
9 | 2 | 2.2%

신청세대수 (households applied)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 62
Distinct (%): 66.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.086022
Minimum: 2
Maximum: 431
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 2
5-th percentile: 4.2
Q1: 18
Median: 31
Q3: 70
95-th percentile: 156.8
Maximum: 431
Range: 429
Interquartile range (IQR): 52

Descriptive statistics

Standard deviation: 72.969269
Coefficient of variation (CV): 1.2562277
Kurtosis: 12.117373
Mean: 58.086022
Median Absolute Deviation (MAD): 17
Skewness: 3.0854928
Sum: 5402
Variance: 5324.5143
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
29 | 4 | 4.3%
20 | 4 | 4.3%
10 | 3 | 3.2%
25 | 3 | 3.2%
14 | 3 | 3.2%
2 | 3 | 3.2%
34 | 3 | 3.2%
68 | 2 | 2.2%
27 | 2 | 2.2%
18 | 2 | 2.2%
Other values (52) | 64 | 68.8%

Minimum 10 values

Value | Count | Frequency (%)
2 | 3 | 3.2%
3 | 2 | 2.2%
5 | 1 | 1.1%
9 | 2 | 2.2%
10 | 3 | 3.2%
11 | 1 | 1.1%
13 | 2 | 2.2%
14 | 3 | 3.2%
15 | 2 | 2.2%
16 | 2 | 2.2%

Maximum 10 values

Value | Count | Frequency (%)
431 | 1 | 1.1%
410 | 1 | 1.1%
259 | 1 | 1.1%
171 | 1 | 1.1%
161 | 1 | 1.1%
154 | 1 | 1.1%
147 | 1 | 1.1%
146 | 1 | 1.1%
143 | 1 | 1.1%
137 | 1 | 1.1%

경쟁률 (competition ratio)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 64
Distinct (%): 68.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.390323
Minimum: 1
Maximum: 33.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.84
Q1: 7
Median: 10
Q3: 15
95-th percentile: 22.52
Maximum: 33.5
Range: 32.5
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 6.6070131
Coefficient of variation (CV): 0.58005496
Kurtosis: 1.2017141
Mean: 11.390323
Median Absolute Deviation (MAD): 3.7
Skewness: 0.98529418
Sum: 1059.3
Variance: 43.652623
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common Values

Value | Count | Frequency (%)
9.0 | 5 | 5.4%
10.0 | 4 | 4.3%
12.5 | 3 | 3.2%
11.3 | 3 | 3.2%
20.0 | 3 | 3.2%
8.5 | 3 | 3.2%
7.0 | 3 | 3.2%
3.6 | 2 | 2.2%
14.0 | 2 | 2.2%
15.0 | 2 | 2.2%
Other values (54) | 63 | 67.7%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 2 | 2.2%
1.5 | 1 | 1.1%
2.0 | 1 | 1.1%
2.6 | 1 | 1.1%
3.0 | 1 | 1.1%
3.3 | 1 | 1.1%
3.6 | 2 | 2.2%
3.8 | 1 | 1.1%
4.3 | 2 | 2.2%
4.5 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
33.5 | 1 | 1.1%
33.0 | 1 | 1.1%
25.6 | 1 | 1.1%
24.4 | 1 | 1.1%
23.0 | 1 | 1.1%
22.2 | 1 | 1.1%
21.3 | 1 | 1.1%
20.5 | 2 | 2.2%
20.4 | 1 | 1.1%
20.0 | 3 | 3.2%

최저선정순위 (lowest selected priority)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value | Count
1 | 91
2 | 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 91 | 97.8%
2 | 2 | 2.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
1 | 91 | 97.8%
2 | 2 | 2.2%

최저가점 (minimum points)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 18
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.806452
Minimum: 8
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 8
5-th percentile: 22
Q1: 26
Median: 32
Q3: 34
95-th percentile: 38.8
Maximum: 52
Range: 44
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 6.8813833
Coefficient of variation (CV): 0.22337475
Kurtosis: 1.3510395
Mean: 30.806452
Median Absolute Deviation (MAD): 4
Skewness: -0.24264332
Sum: 2865
Variance: 47.353436
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=18)]
Common Values

Value | Count | Frequency (%)
32 | 23 | 24.7%
34 | 12 | 12.9%
36 | 11 | 11.8%
24 | 10 | 10.8%
22 | 9 | 9.7%
38 | 7 | 7.5%
30 | 5 | 5.4%
26 | 5 | 5.4%
28 | 2 | 2.2%
18 | 1 | 1.1%
Other values (8) | 8 | 8.6%

Minimum 10 values

Value | Count | Frequency (%)
8 | 1 | 1.1%
13 | 1 | 1.1%
18 | 1 | 1.1%
20 | 1 | 1.1%
22 | 9 | 9.7%
24 | 10 | 10.8%
26 | 5 | 5.4%
28 | 2 | 2.2%
30 | 5 | 5.4%
32 | 23 | 24.7%

Maximum 10 values

Value | Count | Frequency (%)
52 | 1 | 1.1%
48 | 1 | 1.1%
44 | 1 | 1.1%
42 | 1 | 1.1%
40 | 1 | 1.1%
38 | 7 | 7.5%
36 | 11 | 11.8%
34 | 12 | 12.9%
32 | 23 | 24.7%
30 | 5 | 5.4%

Interactions

[Figures: 25 interaction plots]

Correlations

Variable | 단지명 | 주택형 | 모집세대수 | 신청세대수 | 경쟁률 | 최저선정순위 | 최저가점
단지명 | 1.000 | 0.810 | 0.886 | 0.962 | 0.967 | 0.000 | 0.758
주택형 | 0.810 | 1.000 | 0.227 | 0.204 | 0.211 | 0.000 | 0.000
모집세대수 | 0.886 | 0.227 | 1.000 | 0.761 | 0.000 | 0.000 | 0.000
신청세대수 | 0.962 | 0.204 | 0.761 | 1.000 | 0.301 | 0.000 | 0.000
경쟁률 | 0.967 | 0.211 | 0.000 | 0.301 | 1.000 | 0.325 | 0.568
최저선정순위 | 0.000 | 0.000 | 0.000 | 0.000 | 0.325 | 1.000 | 1.000
최저가점 | 0.758 | 0.000 | 0.000 | 0.000 | 0.568 | 1.000 | 1.000

Variable | 최저선정순위 | 단지명
최저선정순위 | 1.000 | 0.000
단지명 | 0.000 | 1.000

Variable | 주택형 | 모집세대수 | 신청세대수 | 경쟁률 | 최저가점 | 단지명 | 최저선정순위
주택형 | 1.000 | -0.187 | -0.113 | 0.116 | 0.306 | 0.328 | 0.000
모집세대수 | -0.187 | 1.000 | 0.809 | -0.220 | -0.048 | 0.459 | 0.000
신청세대수 | -0.113 | 0.809 | 1.000 | 0.335 | 0.259 | 0.609 | 0.000
경쟁률 | 0.116 | -0.220 | 0.335 | 1.000 | 0.573 | 0.626 | 0.311
최저가점 | 0.306 | -0.048 | 0.259 | 0.573 | 1.000 | 0.304 | 0.955
단지명 | 0.328 | 0.459 | 0.609 | 0.626 | 0.304 | 1.000 | 0.000
최저선정순위 | 0.000 | 0.000 | 0.000 | 0.311 | 0.955 | 0.000 | 1.000
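
The high-correlation alerts at the top of the report line up with the last correlation matrix above if a cutoff of 0.5 on the absolute value is applied. The report does not state the cutoff, so `THRESHOLD` below is an assumption, and the function name `highly_correlated` is illustrative. The matrix values are copied from this section.

```python
# Last correlation matrix from this report, in its original column order.
columns = ["주택형", "모집세대수", "신청세대수", "경쟁률", "최저가점", "단지명", "최저선정순위"]
matrix = [
    [1.000, -0.187, -0.113, 0.116, 0.306, 0.328, 0.000],
    [-0.187, 1.000, 0.809, -0.220, -0.048, 0.459, 0.000],
    [-0.113, 0.809, 1.000, 0.335, 0.259, 0.609, 0.000],
    [0.116, -0.220, 0.335, 1.000, 0.573, 0.626, 0.311],
    [0.306, -0.048, 0.259, 0.573, 1.000, 0.304, 0.955],
    [0.328, 0.459, 0.609, 0.626, 0.304, 1.000, 0.000],
    [0.000, 0.000, 0.000, 0.311, 0.955, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns whose |correlation| meets the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


flagged = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

Under this assumption six of the seven variables are flagged, and 주택형 is the only one with no partner at or above the cutoff, which matches the list of alerts.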

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

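First rows

Index | 단지명 | 주택형 | 모집세대수 | 신청세대수 | 경쟁률 | 최저선정순위 | 최저가점
0 | 서울 상계 | 18 | 45 | 410 | 9.1 | 1 | 30
1 | 서울 상계 | 22 | 60 | 431 | 7.2 | 1 | 32
2 | 인천가좌 | 24 | 8 | 103 | 12.9 | 1 | 36
3 | 인천가좌 | 28 | 3 | 29 | 9.7 | 1 | 32
4 | 부천 상동(18) | 24 | 22 | 154 | 7.0 | 1 | 34
5 | 부천 상동(19) | 24 | 24 | 161 | 6.7 | 1 | 32
6 | 안양석수 아이파크(임대) | 25 | 5 | 111 | 22.2 | 1 | 38
7 | 파주교하 | 24 | 24 | 111 | 4.6 | 1 | 24
8 | 화성동탄(임대) | 32 | 18 | 137 | 7.6 | 1 | 34
9 | 화성동탄 상록리슈빌아파트 | 59 | 5 | 70 | 14.0 | 1 | 34

Last rows

Index | 단지명 | 주택형 | 모집세대수 | 신청세대수 | 경쟁률 | 최저선정순위 | 최저가점
83 | 충남 내포(임대) | 59 | 15 | 99 | 6.6 | 1 | 24
84 | 충남 내포(임대) | 84 | 4 | 38 | 9.5 | 1 | 32
85 | 청주사직(푸르지오캐슬) | 33 | 5 | 128 | 25.6 | 1 | 38
86 | 세종시M2(임대) | 34 | 8 | 146 | 18.3 | 1 | 24
87 | 세종시M2(임대) | 70 | 8 | 133 | 16.6 | 1 | 32
88 | 세종시M2(임대) | 70 | 4 | 61 | 15.3 | 1 | 32
89 | 세종시M5(임대) | 36 | 12 | 136 | 11.3 | 1 | 22
90 | 세종시M5(임대) | 46 | 15 | 171 | 11.4 | 1 | 24
91 | 세종시M6(임대) | 84 | 2 | 41 | 20.5 | 1 | 38
92 | 세종시M6(임대) | 84 | 2 | 41 | 20.5 | 1 | 40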
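
The sample rows suggest that 경쟁률 is 신청세대수 divided by 모집세대수, rounded to one decimal place. The report does not state this definition, so the sketch below treats it as an assumption and checks it against rows 0 to 3 and row 6 of the sample. The function name `competition_ratio` is illustrative.

```python
# (모집세대수, 신청세대수, 경쟁률) read from sample rows 0, 1, 2, 3 and 6.
sample_rows = [
    (45, 410, 9.1),
    (60, 431, 7.2),
    (8, 103, 12.9),
    (3, 29, 9.7),
    (5, 111, 22.2),
]


def competition_ratio(recruited, applied):
    """Applicants per recruited household (assumed definition of 경쟁률)."""
    return applied / recruited


# Allow half a unit in the last reported decimal place, plus float slack.
TOLERANCE = 0.05 + 1e-9
mismatches = [
    row for row in sample_rows
    if abs(competition_ratio(row[0], row[1]) - row[2]) > TOLERANCE
]
print(mismatches)  # []
```

A tolerance is used instead of `round()` because the rounding rule applied by the data publisher is not known.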

Duplicate rows

Most frequently occurring

Index | 단지명 | 주택형 | 모집세대수 | 신청세대수 | 경쟁률 | 최저선정순위 | 최저가점 | # duplicates
0 | 전주 반월(광신) | 50 | 2 | 29 | 14.5 | 1 | 32 | 2
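
Exact duplicate rows like the one above can be found by counting identical tuples. This is an illustrative sketch with the standard library and is not a description of how the profiling tool counts duplicates. The three-row list is a made-up subset built from values shown in this report.

```python
from collections import Counter

# Rows as tuples: (단지명, 주택형, 모집세대수, 신청세대수, 경쟁률, 최저선정순위, 최저가점).
rows = [
    ("전주 반월(광신)", 50, 2, 29, 14.5, 1, 32),
    ("전주 반월(광신)", 50, 2, 29, 14.5, 1, 32),  # exact repeat of the row above
    ("서울 상계", 18, 45, 410, 9.1, 1, 30),
]

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicated.items():
    print(row[0], "appears", n, "times")

# Share of duplicate rows for the full dataset, using the report's totals.
print(f"{1 / 93:.1%}")  # 1.1%
```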