Overview

Dataset statistics

Number of variables: 12
Number of observations: 8374
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 809.7 KiB
Average record size in memory: 99.0 B
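
The derived figures above follow from the raw counts. As a rough check, the sketch below recomputes them, assuming 1 KiB = 1024 bytes and that the missing percentage is a simple ratio over all cells (variables × observations):

```python
# Recompute the derived overview figures from the raw counts in the report.
n_variables = 12
n_observations = 8374
missing_cells = 1
total_size_kib = 809.7  # as reported; assumes 1 KiB = 1024 bytes

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}% (displayed as '< 0.1%')")
print(f"average record size: {avg_record_bytes:.1f} B")
```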

Variable types

Categorical: 4
Numeric: 3
Boolean: 3
DateTime: 2

Dataset

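Description: Family-member records for jeonse lease (전세임대) housing from the Daegu Urban Development Corporation (대구도시개발공사). Because this is a metadata-based open public data release, the raw, unprocessed source table was registered as-is.
URL: https://www.data.go.kr/data/15120616/fileData.do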

Alerts

건축물소유여부 has constant value "" [Constant]
주택소유여부 is highly overall correlated with 일련번호 and 3 other fields [High correlation]
수정자번호 is highly overall correlated with 일련번호 and 3 other fields [High correlation]
등록자번호 is highly overall correlated with 일련번호 and 4 other fields [High correlation]
신청자계약자번호 is highly overall correlated with 신청자계약자구분 [High correlation]
일련번호 is highly overall correlated with 전세고객번호 and 3 other fields [High correlation]
전세고객번호 is highly overall correlated with 일련번호 and 3 other fields [High correlation]
신청자계약자구분 is highly overall correlated with 신청자계약자번호 and 1 other field [High correlation]
가족관계 is highly imbalanced (72.1%) [Imbalance]
토지소유여부 is highly imbalanced (89.5%) [Imbalance]
수정자번호 is highly imbalanced (50.6%) [Imbalance]
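
The imbalance percentages appear consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below is an assumption about how the score is derived, not a quote of the profiler's code, and `imbalance_score` is a hypothetical helper. It uses the 토지소유여부 counts reported in this document (8259 False, 115 True), which yield the 89.5% shown in the alert.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes (0 = balanced, 1 = one class)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

land_ownership = [8259, 115]  # False, True counts for 토지소유여부
print(f"{imbalance_score(land_ownership):.1%}")
```

The same check cannot be repeated for 가족관계 or 수정자번호 from this report alone, because only the most frequent categories are listed with counts.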

Reproduction

Analysis started: 2023-12-12 21:26:23.274242
Analysis finished: 2023-12-12 21:26:25.541401
Duration: 2.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
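
A report like this one can be regenerated with ydata-profiling. The sketch below is a minimal, hedged example: the file name `family_members.csv`, the report title, and the `cp949` encoding are assumptions (the report does not say how the CSV was loaded), and `build_report` is a hypothetical helper. It relies only on `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file`.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    report = ProfileReport(df, title="Family members profile")
    report.to_file(out_path)
    return out_path

# Example call (hypothetical file name):
# build_report("family_members.csv")
```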

Variables

신청자계약자구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
신규계약 (new contract): 6980
재계약 (renewal): 1394

Length

Max length: 4
Median length: 4
Mean length: 3.8335324
Min length: 3
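
The length statistics can be recomputed from the category counts. This is a small check, assuming length is measured in characters (Unicode code points), which matches the reported values:

```python
from statistics import median

# Category counts as reported for 신청자계약자구분.
counts = {"신규계약": 6980, "재계약": 1394}

# Expand into one length per row, then summarize.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
print(max(lengths), median(lengths), round(mean_length, 7), min(lengths))
```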

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신규계약
2nd row: 신규계약
3rd row: 신규계약
4th row: 신규계약
5th row: 신규계약

Common Values

Value | Count | Frequency (%)
신규계약 | 6980 | 83.4%
재계약 | 1394 | 16.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

신청자계약자번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4532
Distinct (%): 54.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1276261 × 10^8
Minimum: 12015001
Maximum: 8.2023001 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 73.7 KiB

Quantile statistics

Minimum: 12015001
5-th percentile: 42022002
Q1: 2.2016004 × 10^8
Median: 5.2013025 × 10^8
Q3: 6.2015003 × 10^8
95-th percentile: 8.2013002 × 10^8
Maximum: 8.2023001 × 10^8
Range: 8.0821501 × 10^8
Interquartile range (IQR): 3.9999 × 10^8

Descriptive statistics

Standard deviation: 2.4716781 × 10^8
Coefficient of variation (CV): 0.59881346
Kurtosis: -1.2471682
Mean: 4.1276261 × 10^8
Median Absolute Deviation (MAD): 2.0004979 × 10^8
Skewness: -0.15721017
Sum: 3.4564741 × 10^12
Variance: 6.1091926 × 10^16
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
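
Several of the statistics for 신청자계약자번호 are simple functions of others. The check below recomputes the range, IQR, coefficient of variation, and variance from the rounded values printed in the report, so agreement is only expected to the displayed precision:

```python
# Rounded values as printed for 신청자계약자번호.
minimum, maximum = 12015001, 8.2023001e8
q1, q3 = 2.2016004e8, 6.2015003e8
mean, std = 4.1276261e8, 2.4716781e8

value_range = maximum - minimum   # reported: 8.0821501e8
iqr = q3 - q1                     # reported: 3.9999e8
cv = std / mean                   # reported: 0.59881346
variance = std ** 2               # reported: 6.1091926e16

print(f"range={value_range:.7e} iqr={iqr:.4e} cv={cv:.6f} var={variance:.5e}")
```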

Common Values

Value | Count | Frequency (%)
220160062 | 9 | 0.1%
520160032 | 7 | 0.1%
620200079 | 7 | 0.1%
220190025 | 7 | 0.1%
420160013 | 7 | 0.1%
620200020 | 6 | 0.1%
820190012 | 6 | 0.1%
520200053 | 6 | 0.1%
120160012 | 6 | 0.1%
420190026 | 6 | 0.1%
Other values (4522) | 8307 | 99.2%

Minimum 10 values

Value | Count | Frequency (%)
12015001 | 1 | < 0.1%
12015002 | 2 | < 0.1%
12016001 | 4 | < 0.1%
12016004 | 3 | < 0.1%
12016009 | 1 | < 0.1%
12016010 | 1 | < 0.1%
12016011 | 2 | < 0.1%
12017003 | 3 | < 0.1%
12017005 | 1 | < 0.1%
12017007 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
820230009 | 2 | < 0.1%
820230008 | 2 | < 0.1%
820230007 | 1 | < 0.1%
820230003 | 1 | < 0.1%
820230001 | 1 | < 0.1%
820210033 | 2 | < 0.1%
820210032 | 2 | < 0.1%
820210031 | 2 | < 0.1%
820210028 | 2 | < 0.1%
820210027 | 2 | < 0.1%

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4569
Distinct (%): 54.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 858170.23
Minimum: 1
Maximum: 1857825
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 73.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 485.5
Q3: 1855729.8
95-th percentile: 1857404.4
Maximum: 1857825
Range: 1857824
Interquartile range (IQR): 1855727.8

Descriptive statistics

Standard deviation: 925333.94
Coefficient of variation (CV): 1.0782638
Kurtosis: -1.977692
Mean: 858170.23
Median Absolute Deviation (MAD): 484.5
Skewness: 0.15092235
Sum: 7.1863175 × 10^9
Variance: 8.562429 × 10^11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1 | 1953 | 23.3%
2 | 1199 | 14.3%
3 | 456 | 5.4%
4 | 139 | 1.7%
5 | 48 | 0.6%
6 | 9 | 0.1%
1857656 | 2 | < 0.1%
1857586 | 2 | < 0.1%
1857646 | 2 | < 0.1%
1857645 | 2 | < 0.1%
Other values (4559) | 4562 | 54.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1953 | 23.3%
2 | 1199 | 14.3%
3 | 456 | 5.4%
4 | 139 | 1.7%
5 | 48 | 0.6%
6 | 9 | 0.1%
7 | 2 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1857825 | 1 | < 0.1%
1857824 | 1 | < 0.1%
1857823 | 1 | < 0.1%
1857822 | 1 | < 0.1%
1857821 | 1 | < 0.1%
1857820 | 1 | < 0.1%
1857819 | 1 | < 0.1%
1857818 | 1 | < 0.1%
1857817 | 1 | < 0.1%
1857816 | 1 | < 0.1%

전세고객번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7334
Distinct (%): 87.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 667382.43
Minimum: 658324
Maximum: 678410
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 73.7 KiB

Quantile statistics

Minimum: 658324
5-th percentile: 662170
Q1: 663253
Median: 665196
Q3: 670977
95-th percentile: 677419.2
Maximum: 678410
Range: 20086
Interquartile range (IQR): 7724

Descriptive statistics

Standard deviation: 4887.6001
Coefficient of variation (CV): 0.0073235373
Kurtosis: -0.70235229
Mean: 667382.43
Median Absolute Deviation (MAD): 2771
Skewness: 0.74964711
Sum: 5.5879931 × 10^9
Variance: 23888635
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
662857 | 5 | 0.1%
661985 | 4 | < 0.1%
662889 | 4 | < 0.1%
662044 | 4 | < 0.1%
662530 | 4 | < 0.1%
661984 | 4 | < 0.1%
661983 | 4 | < 0.1%
663106 | 4 | < 0.1%
662817 | 4 | < 0.1%
662890 | 4 | < 0.1%
Other values (7324) | 8332 | 99.5%

Minimum 10 values

Value | Count | Frequency (%)
658324 | 1 | < 0.1%
658417 | 1 | < 0.1%
659093 | 1 | < 0.1%
659244 | 1 | < 0.1%
659347 | 1 | < 0.1%
659366 | 1 | < 0.1%
659520 | 1 | < 0.1%
659823 | 1 | < 0.1%
659877 | 1 | < 0.1%
660102 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
678410 | 1 | < 0.1%
678394 | 1 | < 0.1%
678378 | 1 | < 0.1%
678376 | 1 | < 0.1%
678370 | 1 | < 0.1%
678367 | 1 | < 0.1%
678366 | 1 | < 0.1%
678364 | 1 | < 0.1%
678363 | 1 | < 0.1%
678361 | 1 | < 0.1%

가족관계
Categorical

IMBALANCE 

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
자녀 (child): 6678
배우자 (spouse): 516
(label missing): 362
(label missing): 259
남편 (husband): 133
Other values (23): 426

Length

Max length: 4
Median length: 2
Mean length: 1.9802962
Min length: 1

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: 자녀
2nd row: 자녀
3rd row: (label missing)
4th row: 자녀
5th row: 자녀

Common Values

Value | Count | Frequency (%)
자녀 | 6678 | 79.7%
배우자 | 516 | 6.2%
(label missing) | 362 | 4.3%
(label missing) | 259 | 3.1%
남편 | 133 | 1.6%
손자 | 115 | 1.4%
(label missing) | 82 | 1.0%
동거인 | 46 | 0.5%
(label missing) | 42 | 0.5%
기타 | 32 | 0.4%
Other values (18) | 109 | 1.3%

Length

[Figure: histogram of lengths of the category]

주택소유여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Value | Count | Frequency (%)
False | 4570 | 54.6%
True | 3804 | 45.4%

토지소유여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Value | Count | Frequency (%)
False | 8259 | 98.6%
True | 115 | 1.4%

건축물소유여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Value | Count | Frequency (%)
False | 8374 | 100.0%

등록자번호
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
19920113: 2091
20159051: 1867
20050190: 957
admin: 671
20139023: 631
Other values (19): 2157

Length

Max length: 8
Median length: 8
Mean length: 7.7596131
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 20159051
2nd row: 20159051
3rd row: 20159051
4th row: 20159051
5th row: 20159051

Common Values

Value | Count | Frequency (%)
19920113 | 2091 | 25.0%
20159051 | 1867 | 22.3%
20050190 | 957 | 11.4%
admin | 671 | 8.0%
20139023 | 631 | 7.5%
20040187 | 581 | 6.9%
19880040 | 407 | 4.9%
20080209 | 354 | 4.2%
20139024 | 242 | 2.9%
20159042 | 155 | 1.9%
Other values (14) | 418 | 5.0%

Length

[Figure: histogram of lengths of the category]

등록일시
DateTime

Distinct: 849
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
Minimum: 2013-07-02 17:30:13
Maximum: 2023-06-22 15:37:53

[Figure: histogram with fixed size bins (bins=50)]

수정자번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 21
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
자료이관 (data migration): 4113
19920113: 2157
20050190: 957
20080209: 337
19880040: 334
Other values (16): 476

Length

Max length: 8
Median length: 8
Mean length: 6.0353475
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자료이관
2nd row: 자료이관
3rd row: 자료이관
4th row: 자료이관
5th row: 자료이관

Common Values

Value | Count | Frequency (%)
자료이관 | 4113 | 49.1%
19920113 | 2157 | 25.8%
20050190 | 957 | 11.4%
20080209 | 337 | 4.0%
19880040 | 334 | 4.0%
99999992 | 108 | 1.3%
20090219 | 56 | 0.7%
19920107 | 53 | 0.6%
20200305 | 51 | 0.6%
20179076 | 40 | 0.5%
Other values (11) | 168 | 2.0%

Length

[Figure: histogram of lengths of the category]

수정일시
DateTime

Distinct: 1070
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Memory size: 65.6 KiB
Minimum: 2013-07-02 17:30:13
Maximum: 2023-08-22 14:49:47

[Figure: histogram with fixed size bins (bins=50)]

Interactions

[Figures: nine interaction plots; images not preserved in this text version]

Correlations

[Figure: correlation heatmap 1]

Variable | 신청자계약자구분 | 신청자계약자번호 | 일련번호 | 전세고객번호 | 가족관계 | 주택소유여부 | 토지소유여부 | 등록자번호 | 수정자번호
신청자계약자구분 | 1.000 | 1.000 | 0.425 | 0.414 | 0.026 | 0.149 | 0.047 | 0.801 | 0.499
신청자계약자번호 | 1.000 | 1.000 | 0.456 | 0.464 | 0.214 | 0.375 | 0.074 | 0.680 | 0.530
일련번호 | 0.425 | 0.456 | 1.000 | 0.893 | 0.350 | 0.970 | 0.068 | 0.992 | 0.887
전세고객번호 | 0.414 | 0.464 | 0.893 | 1.000 | 0.321 | 0.977 | 0.087 | 0.906 | 0.879
가족관계 | 0.026 | 0.214 | 0.350 | 0.321 | 1.000 | 0.391 | 0.277 | 0.312 | 0.321
주택소유여부 | 0.149 | 0.375 | 0.970 | 0.977 | 0.391 | 1.000 | 0.100 | 0.998 | 0.972
토지소유여부 | 0.047 | 0.074 | 0.068 | 0.087 | 0.277 | 0.100 | 1.000 | 0.067 | 0.057
등록자번호 | 0.801 | 0.680 | 0.992 | 0.906 | 0.312 | 0.998 | 0.067 | 1.000 | 0.981
수정자번호 | 0.499 | 0.530 | 0.887 | 0.879 | 0.321 | 0.972 | 0.057 | 0.981 | 1.000

[Figure: correlation heatmap 2]

Variable | 주택소유여부 | 신청자계약자구분 | 수정자번호 | 토지소유여부 | 가족관계 | 등록자번호
주택소유여부 | 1.000 | 0.095 | 0.961 | 0.063 | 0.336 | 0.964
신청자계약자구분 | 0.095 | 1.000 | 0.440 | 0.030 | 0.023 | 0.659
수정자번호 | 0.961 | 0.440 | 1.000 | 0.050 | 0.090 | 0.796
토지소유여부 | 0.063 | 0.030 | 0.050 | 1.000 | 0.238 | 0.053
가족관계 | 0.336 | 0.023 | 0.090 | 0.238 | 1.000 | 0.083
등록자번호 | 0.964 | 0.659 | 0.796 | 0.053 | 0.083 | 1.000

[Figure: correlation heatmap 3]

Variable | 신청자계약자번호 | 일련번호 | 전세고객번호 | 신청자계약자구분 | 가족관계 | 주택소유여부 | 토지소유여부 | 등록자번호 | 수정자번호
신청자계약자번호 | 1.000 | -0.050 | 0.064 | 1.000 | 0.073 | 0.374 | 0.074 | 0.335 | 0.235
일련번호 | -0.050 | 1.000 | -0.542 | 0.280 | 0.301 | 0.844 | 0.043 | 0.927 | 0.831
전세고객번호 | 0.064 | -0.542 | 1.000 | 0.318 | 0.123 | 0.869 | 0.066 | 0.629 | 0.574
신청자계약자구분 | 1.000 | 0.280 | 0.318 | 1.000 | 0.023 | 0.095 | 0.030 | 0.659 | 0.440
가족관계 | 0.073 | 0.301 | 0.123 | 0.023 | 1.000 | 0.336 | 0.238 | 0.083 | 0.090
주택소유여부 | 0.374 | 0.844 | 0.869 | 0.095 | 0.336 | 1.000 | 0.063 | 0.964 | 0.961
토지소유여부 | 0.074 | 0.043 | 0.066 | 0.030 | 0.238 | 0.063 | 1.000 | 0.053 | 0.050
등록자번호 | 0.335 | 0.927 | 0.629 | 0.659 | 0.083 | 0.964 | 0.053 | 1.000 | 0.796
수정자번호 | 0.235 | 0.831 | 0.574 | 0.440 | 0.090 | 0.961 | 0.050 | 0.796 | 1.000
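
The High correlation alerts can be cross-checked against the first correlation matrix in this section. The sketch below assumes a cutoff of 0.8. That value is a guess, since the configured threshold is not stated in this report, but it reproduces the alert list (for example, 일련번호 with 4 related fields and 신청자계약자번호 with 1). `flag_high_correlation` is a hypothetical helper, not a profiler function.

```python
FIELDS = ["신청자계약자구분", "신청자계약자번호", "일련번호", "전세고객번호", "가족관계",
          "주택소유여부", "토지소유여부", "등록자번호", "수정자번호"]

# First correlation matrix from the report, rows in the same order as FIELDS.
MATRIX = [
    [1.000, 1.000, 0.425, 0.414, 0.026, 0.149, 0.047, 0.801, 0.499],
    [1.000, 1.000, 0.456, 0.464, 0.214, 0.375, 0.074, 0.680, 0.530],
    [0.425, 0.456, 1.000, 0.893, 0.350, 0.970, 0.068, 0.992, 0.887],
    [0.414, 0.464, 0.893, 1.000, 0.321, 0.977, 0.087, 0.906, 0.879],
    [0.026, 0.214, 0.350, 0.321, 1.000, 0.391, 0.277, 0.312, 0.321],
    [0.149, 0.375, 0.970, 0.977, 0.391, 1.000, 0.100, 0.998, 0.972],
    [0.047, 0.074, 0.068, 0.087, 0.277, 0.100, 1.000, 0.067, 0.057],
    [0.801, 0.680, 0.992, 0.906, 0.312, 0.998, 0.067, 1.000, 0.981],
    [0.499, 0.530, 0.887, 0.879, 0.321, 0.972, 0.057, 0.981, 1.000],
]

def flag_high_correlation(fields, matrix, threshold=0.8):
    """Map each field to the other fields whose |correlation| >= threshold."""
    flagged = {}
    for i, name in enumerate(fields):
        related = [fields[j] for j, value in enumerate(matrix[i])
                   if j != i and abs(value) >= threshold]
        if related:
            flagged[name] = related
    return flagged

for name, related in flag_high_correlation(FIELDS, MATRIX).items():
    print(f"{name}: {len(related)} field(s) -> {', '.join(related)}")
```

Many of the strongest pairs involve ID-like columns (일련번호, 전세고객번호, 등록자번호), so they may reflect how records were entered rather than a substantive relationship.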

Missing values

[Figure: nullity count chart]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Index | 신청자계약자구분 | 신청자계약자번호 | 일련번호 | 전세고객번호 | 가족관계 | 주택소유여부 | 토지소유여부 | 건축물소유여부 | 등록자번호 | 등록일시 | 수정자번호 | 수정일시
0 | 신규계약 | 220160011 | 1855790 | 663732 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
1 | 신규계약 | 220160011 | 1855791 | 663733 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
2 | 신규계약 | 220160012 | 1855792 | 662596 |  | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
3 | 신규계약 | 220160012 | 1855793 | 662595 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
4 | 신규계약 | 220160013 | 1855794 | 663734 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
5 | 신규계약 | 220160014 | 1855795 | 663735 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
6 | 신규계약 | 220160016 | 1855796 | 663736 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
7 | 신규계약 | 220160016 | 1855797 | 663737 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
8 | 신규계약 | 220160017 | 1855798 | 663738 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25
9 | 신규계약 | 220160017 | 1855799 | 663739 | 자녀 | N | N | N | 20159051 | 2016-03-24 16:22:23 | 자료이관 | 2016-09-03 18:09:25

Last rows

Index | 신청자계약자구분 | 신청자계약자번호 | 일련번호 | 전세고객번호 | 가족관계 | 주택소유여부 | 토지소유여부 | 건축물소유여부 | 등록자번호 | 등록일시 | 수정자번호 | 수정일시
8364 | 신규계약 | 120210086 | 2 | 675109 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8365 | 신규계약 | 120210087 | 1 | 675111 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8366 | 신규계약 | 120210087 | 2 | 675112 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8367 | 신규계약 | 120210088 | 1 | 675114 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8368 | 신규계약 | 120210088 | 2 | 675115 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8369 | 신규계약 | 120210089 | 1 | 675117 | 배우자 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8370 | 신규계약 | 120210089 | 2 | 675118 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8371 | 신규계약 | 120210089 | 3 | 675119 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8372 | 신규계약 | 120210090 | 1 | 675121 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
8373 | 신규계약 | 120210090 | 2 | 675122 | 자녀 | Y | N | N | 20050190 | 2021-08-11 11:18:30 | 20050190 | 2021-08-11 11:18:30
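
When loading the underlying file, the ID-like columns are probably better kept as strings than treated as real numbers, and the Y/N ownership flags can be mapped to booleans. The sketch below parses two rows taken from the sample (index 0 and index 8369) with the standard `csv` module. The comma-separated layout is an assumption about the source file, and `parse_row` is a hypothetical helper.

```python
import csv
import io
from datetime import datetime

COLUMNS = ["신청자계약자구분", "신청자계약자번호", "일련번호", "전세고객번호", "가족관계",
           "주택소유여부", "토지소유여부", "건축물소유여부", "등록자번호", "등록일시",
           "수정자번호", "수정일시"]

# Two rows copied from the sample; the CSV layout itself is assumed.
RAW = (
    ",".join(COLUMNS) + "\n"
    "신규계약,220160011,1855790,663732,자녀,N,N,N,20159051,2016-03-24 16:22:23,자료이관,2016-09-03 18:09:25\n"
    "신규계약,120210089,1,675117,배우자,Y,N,N,20050190,2021-08-11 11:18:30,20050190,2021-08-11 11:18:30\n"
)

def parse_row(row):
    """Convert Y/N flags to bool and timestamps to datetime; IDs stay strings."""
    out = dict(row)
    for key in ("주택소유여부", "토지소유여부", "건축물소유여부"):
        out[key] = row[key] == "Y"
    for key in ("등록일시", "수정일시"):
        out[key] = datetime.strptime(row[key], "%Y-%m-%d %H:%M:%S")
    out["일련번호"] = int(row["일련번호"])  # serial number kept as a plain integer
    return out

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(RAW))]
print(rows[1]["주택소유여부"], rows[0]["수정일시"] - rows[0]["등록일시"])
```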