Overview

Dataset statistics

Number of variables: 9
Number of observations: 186
Missing cells: 373
Missing cells (%): 22.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.9 KiB
Average record size in memory: 76.7 B
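
The missing-cell figures can be cross-checked against the per-column counts listed under Alerts. The sketch below only redoes the report's arithmetic with numbers copied from the report; the variable names are illustrative. Note that CNSL_DY contributes nothing here: the report lists it with 0 missing values, so its 174 "<NA>" entries are counted as a category rather than as missing cells.

```python
# Figures copied from the report's Overview and Alerts sections.
n_variables = 9
n_observations = 186
missing_per_column = {
    "PRS_TS": 168,
    "FIN_CNSL_YN": 61,
    "CNSL_HOPE_BNK_CD": 144,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```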

Variable types

Numeric: 2
Categorical: 3
DateTime: 3
Boolean: 1

Dataset

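Description: Public data on the work of the Housing Pension Department of the Korea Housing Finance Corporation (한국주택금융공사): publicly releasable source data from the databases related to that department's work
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15073084/fileData.do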

Alerts

CNSL_DY is highly overall correlated with REQ_DY and 3 other fields (High correlation)
PRS_DVCD is highly overall correlated with REQ_DY and 1 other fields (High correlation)
REQ_DY is highly overall correlated with PRS_DVCD and 1 other fields (High correlation)
CNSL_HOPE_BNK_CD is highly overall correlated with CNSL_DY (High correlation)
CTRL_BRCD is highly overall correlated with CNSL_DY (High correlation)
PRS_DVCD is highly imbalanced (65.5%) (Imbalance)
CNSL_DY is highly imbalanced (78.0%) (Imbalance)
PRS_TS has 168 (90.3%) missing values (Missing)
FIN_CNSL_YN has 61 (32.8%) missing values (Missing)
CNSL_HOPE_BNK_CD has 144 (77.4%) missing values (Missing)
REQ_TS has unique values (Unique)
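
The report does not say how the imbalance percentages are computed. One formula that reproduces both reported figures from the category counts in the variable sections is 1 - H / log2(k), where H is the Shannon entropy of the class frequencies in bits and k is the number of classes. The sketch below is an assumption checked only against these two numbers, and the function name is illustrative.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # illustrative convention for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts copied from the PRS_DVCD and CNSL_DY sections of the report.
prs_dvcd = imbalance_score([174, 12])       # values 1 and 2
cnsl_dy = imbalance_score([174, 6, 4, 2])   # <NA>, 20201022, 20201023, 20201026

print(f"PRS_DVCD: {prs_dvcd:.1%}, CNSL_DY: {cnsl_dy:.1%}")
```

A score of 0 means perfectly balanced classes; a score near 1 means almost every row falls in one class.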

Reproduction

Analysis started: 2023-12-12 07:04:47.709831
Analysis finished: 2023-12-12 07:04:49.076475
Duration: 1.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

REQ_DY
Real number (ℝ)

HIGH CORRELATION 

Distinct: 145
Distinct (%): 78.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20172228
Minimum: 20121021
Maximum: 20201025
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 20121021
5-th percentile: 20141147
Q1: 20160477
Median: 20170166
Q3: 20190918
95-th percentile: 20201021
Maximum: 20201025
Range: 80004
Interquartile range (IQR): 30440.75

Descriptive statistics

Standard deviation: 19878.089
Coefficient of variation (CV): 0.00098541858
Kurtosis: -0.42484298
Mean: 20172228
Median Absolute Deviation (MAD): 10054
Skewness: -0.16124294
Sum: 3.7520345 × 10^9
Variance: 3.951384 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value               Count  Frequency (%)
20201021            7      3.8%
20160425            5      2.7%
20201022            5      2.7%
20200106            3      1.6%
20170126            3      1.6%
20190531            3      1.6%
20160321            2      1.1%
20121206            2      1.1%
20121122            2      1.1%
20160421            2      1.1%
Other values (135)  152    81.7%

Smallest values

Value               Count  Frequency (%)
20121021            1      0.5%
20121122            2      1.1%
20121206            2      1.1%
20130506            1      0.5%
20130708            1      0.5%
20141119            1      0.5%
20141121            1      0.5%
20141128            1      0.5%
20141205            1      0.5%
20141210            1      0.5%

Largest values

Value               Count  Frequency (%)
20201025            1      0.5%
20201023            2      1.1%
20201022            5      2.7%
20201021            7      3.8%
20201004            1      0.5%
20200928            1      0.5%
20200925            1      0.5%
20200426            1      0.5%
20200410            1      0.5%
20200318            1      0.5%
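
REQ_DY is profiled as a real number, but its values look like dates written as YYYYMMDD integers, so statistics that interpolate or average them (Q1, mean) need not be valid dates, and the reported Range of 80004 is an integer difference rather than a number of days. A minimal sketch of decoding the values, assuming that encoding; parse_yyyymmdd is an illustrative helper, not part of the report or of ydata-profiling:

```python
from datetime import date
from typing import Optional


def parse_yyyymmdd(value: int) -> Optional[date]:
    """Decode an integer written as YYYYMMDD; return None if it is not a real date."""
    year, rest = divmod(value, 10_000)
    month, day = divmod(rest, 100)
    try:
        return date(year, month, day)
    except ValueError:
        return None


first = parse_yyyymmdd(20121021)   # reported Minimum
last = parse_yyyymmdd(20201025)    # reported Maximum
span_days = (last - first).days    # calendar span between the two dates

q1 = parse_yyyymmdd(20160477)      # reported Q1: month 04, day 77
mean = parse_yyyymmdd(20172228)    # reported Mean: month 22

print(first, last, span_days, q1, mean)
```

Converting the column to a date type before profiling would let it be summarized like REQ_TS instead.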

CTRL_BRCD
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 11.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Category           Count
TLB                46
TBA                43
TPA                15
XXX                10
ABN                10
Other values (17)  62

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 3
Unique (%): 1.6%

Sample

1st row: TRA
2nd row: QAD
3rd row: TLB
4th row: TRA
5th row: TAB

Common Values

Value              Count  Frequency (%)
TLB                46     24.7%
TBA                43     23.1%
TPA                15     8.1%
XXX                10     5.4%
ABN                10     5.4%
TAB                9      4.8%
THA                7      3.8%
TRA                5      2.7%
TAC                5      2.7%
THB                5      2.7%
Other values (12)  31     16.7%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
tlb                46     24.7%
tba                43     23.1%
tpa                15     8.1%
xxx                10     5.4%
abn                10     5.4%
tab                9      4.8%
tha                7      3.8%
tra                5      2.7%
tac                5      2.7%
thb                5      2.7%
Other values (12)  31     16.7%

PRS_DVCD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Category  Count
1         174
2         12

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      174    93.5%
2      12     6.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      174    93.5%
2      12     6.5%

CNSL_DY
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Category  Count
<NA>      174
20201022  6
20201023  4
20201026  2

Length

Max length: 8
Median length: 4
Mean length: 4.2580645
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value     Count  Frequency (%)
<NA>      174    93.5%
20201022  6      3.2%
20201023  4      2.2%
20201026  2      1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
na        174    93.5%
20201022  6      3.2%
20201023  4      2.2%
20201026  2      1.1%

REQ_TS
Date

UNIQUE 

Distinct: 186
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
Minimum: 2012-10-21 16:02:49
Maximum: 2020-10-25 12:44:27
Histogram with fixed size bins (bins=50)

REG_TS
Date

Distinct: 144
Distinct (%): 77.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
Minimum: 2012-10-22 02:31:32
Maximum: 2020-10-26 02:32:24
Histogram with fixed size bins (bins=50)

PRS_TS
Date

MISSING 

Distinct: 18
Distinct (%): 100.0%
Missing: 168
Missing (%): 90.3%
Memory size: 1.6 KiB
Minimum: 2013-05-10 14:47:38
Maximum: 2020-10-26 09:28:15
Histogram with fixed size bins (bins=18)

FIN_CNSL_YN
Boolean

MISSING 

Distinct: 2
Distinct (%): 1.6%
Missing: 61
Missing (%): 32.8%
Memory size: 504.0 B

Value      Count  Frequency (%)
True       65     34.9%
False      60     32.3%
(Missing)  61     32.8%

CNSL_HOPE_BNK_CD
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 52.4%
Missing: 144
Missing (%): 77.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 176926.4
Minimum: 34571
Maximum: 816760
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 34571
5-th percentile: 41713
Q1: 103265
Median: 114679
Q3: 204602
95-th percentile: 320213.35
Maximum: 816760
Range: 782189
Interquartile range (IQR): 101337

Descriptive statistics

Standard deviation: 169856.06
Coefficient of variation (CV): 0.96003796
Kurtosis: 8.3201124
Mean: 176926.4
Median Absolute Deviation (MAD): 54362
Skewness: 2.7060798
Sum: 7430909
Variance: 2.8851082 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)

Most frequent values

Value              Count  Frequency (%)
114679             11     5.9%
111436             3      1.6%
310020             3      1.6%
201647             2      1.1%
100719             2      1.1%
41713              2      1.1%
320201             2      1.1%
204602             2      1.1%
816760             2      1.1%
111135             1      0.5%
Other values (12)  12     6.5%
(Missing)          144    77.4%

Smallest values

Value              Count  Frequency (%)
34571              1      0.5%
41030              1      0.5%
41713              2      1.1%
46310              1      0.5%
48046              1      0.5%
55165              1      0.5%
65469              1      0.5%
66691              1      0.5%
100719             2      1.1%
110903             1      0.5%

Largest values

Value              Count  Frequency (%)
816760             2      1.1%
320214             1      0.5%
320201             2      1.1%
310062             1      0.5%
310020             3      1.6%
208527             1      0.5%
204602             2      1.1%
201647             2      1.1%
115665             1      0.5%
114679             11     5.9%
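
The percentages in this report use two different denominators. Distinct (%) appears to be taken over non-missing rows (22 distinct codes among 186 - 144 = 42 present values gives 52.4%), while the frequency tables are taken over all 186 rows (11 / 186 gives 5.9%). The sketch below checks that reading against three variables using figures copied from the report; distinct_pct is an illustrative helper.

```python
N_ROWS = 186  # Number of observations from the Overview


def distinct_pct(n_distinct: int, n_missing: int, n_rows: int = N_ROWS) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    return 100 * n_distinct / (n_rows - n_missing)


# (distinct, missing) pairs copied from the variable sections.
reported = {
    "CNSL_HOPE_BNK_CD": (22, 144),
    "PRS_TS": (18, 168),
    "FIN_CNSL_YN": (2, 61),
}
results = {name: distinct_pct(d, m) for name, (d, m) in reported.items()}

# Frequency (%) of the most common bank code, over all rows.
top_code_pct = 100 * 11 / N_ROWS

for name, pct in results.items():
    print(f"{name}: {pct:.1f}%")
print(f"114679: {top_code_pct:.1f}%")
```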

Interactions

[Figures: pairwise interaction plots (4 panels)]

Correlations

                  REQ_DY  CTRL_BRCD  PRS_DVCD  CNSL_DY  PRS_TS  FIN_CNSL_YN  CNSL_HOPE_BNK_CD
REQ_DY            1.000   0.830      0.699     NaN      1.000   0.439        0.439
CTRL_BRCD         0.830   1.000      0.568     1.000    1.000   0.231        0.588
PRS_DVCD          0.699   0.568      1.000     NaN      1.000   0.000        0.000
CNSL_DY           NaN     1.000      NaN       1.000    1.000   0.000        NaN
PRS_TS            1.000   1.000      1.000     1.000    1.000   1.000        1.000
FIN_CNSL_YN       0.439   0.231      0.000     0.000    1.000   1.000        0.140
CNSL_HOPE_BNK_CD  0.439   0.588      0.000     NaN      1.000   0.140        1.000

             CTRL_BRCD  CNSL_DY  PRS_DVCD  FIN_CNSL_YN
CTRL_BRCD    1.000      0.577    0.428     0.167
CNSL_DY      0.577      1.000    1.000     0.000
PRS_DVCD     0.428      1.000    1.000     0.000
FIN_CNSL_YN  0.167      0.000    0.000     1.000

                  REQ_DY  CNSL_HOPE_BNK_CD  CTRL_BRCD  PRS_DVCD  CNSL_DY  FIN_CNSL_YN
REQ_DY            1.000   -0.141            0.479      0.534     1.000    0.311
CNSL_HOPE_BNK_CD  -0.141  1.000             0.401      0.000     1.000    0.168
CTRL_BRCD         0.479   0.401             1.000      0.428     0.577    0.167
PRS_DVCD          0.534   0.000             0.428      1.000     1.000    0.000
CNSL_DY           1.000   1.000             0.577      1.000     1.000    0.000
FIN_CNSL_YN       0.311   0.168             0.167      0.000     0.000    1.000
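
The value of 1.000 between PRS_DVCD and CNSL_DY is consistent with one variable fully determining the other: PRS_DVCD = 2 occurs 12 times, and CNSL_DY has exactly 12 entries that are not "<NA>". The sketch below computes the uncorrected Cramér's V, a common association measure for two categorical variables, on an assumed cross-tabulation built from those marginal counts. The report gives neither the cross-tabulation nor the formula behind each matrix, so both are assumptions made for illustration.

```python
import math
from typing import List


def cramers_v(table: List[List[int]]) -> float:
    """Uncorrected Cramér's V for a contingency table with no empty row or column."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(row_totals), len(col_totals)) - 1)))


# ASSUMED cross-tabulation (not given in the report).
# Rows: PRS_DVCD = 1, 2. Columns: CNSL_DY = <NA>, 20201022, 20201023, 20201026.
assumed_table = [
    [174, 0, 0, 0],
    [0, 6, 4, 2],
]
v = cramers_v(assumed_table)
print(round(v, 3))
```

With only 12 rows carrying a CNSL_DY date, a perfect score here reflects the data's structure more than a general relationship.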

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

     REQ_DY    CTRL_BRCD  PRS_DVCD  CNSL_DY  REQ_TS               REG_TS               PRS_TS  FIN_CNSL_YN  CNSL_HOPE_BNK_CD
0    20141128  TRA        1         <NA>     2014/11/28 19:28:28  2014/11/29 02:31:29  <NA>    <NA>         <NA>
1    20141210  QAD        1         <NA>     2014/12/10 14:48:46  2014/12/11 02:35:33  <NA>    <NA>         <NA>
2    20141121  TLB        1         <NA>     2014/11/21 11:23:37  2014/11/22 02:31:30  <NA>    <NA>         <NA>
3    20141119  TRA        1         <NA>     2014/11/19 09:24:49  2014/11/20 02:31:28  <NA>    <NA>         <NA>
4    20141205  TAB        1         <NA>     2014/12/05 09:48:07  2014/12/06 02:31:29  <NA>    <NA>         <NA>
5    20150223  TAC        1         <NA>     2015/02/23 17:36:33  2015/02/24 02:31:27  <NA>    <NA>         <NA>
6    20150206  TAB        1         <NA>     2015/02/06 12:58:00  2015/02/07 02:36:02  <NA>    <NA>         <NA>
7    20150109  TRA        1         <NA>     2015/01/09 12:42:09  2015/01/10 02:31:31  <NA>    <NA>         <NA>
8    20150313  TBA        1         <NA>     2015/03/13 10:16:12  2015/03/14 02:31:30  <NA>    <NA>         <NA>
9    20150516  TPB        1         <NA>     2015/05/16 09:58:30  2015/05/17 02:31:22  <NA>    <NA>         <NA>

Last rows

     REQ_DY    CTRL_BRCD  PRS_DVCD  CNSL_DY  REQ_TS               REG_TS               PRS_TS  FIN_CNSL_YN  CNSL_HOPE_BNK_CD
176  20190918  TLB        1         <NA>     2019/09/18 14:07:02  2019/09/19 02:32:27  <NA>    Y            114679
177  20190917  TLB        1         <NA>     2019/09/17 16:38:40  2019/09/18 02:32:28  <NA>    Y            114679
178  20190923  TLB        1         <NA>     2019/09/23 18:51:36  2019/09/24 02:32:42  <NA>    N            <NA>
179  20190920  TLB        1         <NA>     2019/09/20 14:27:38  2019/09/21 02:32:27  <NA>    N            <NA>
180  20191025  TLB        1         <NA>     2019/10/25 13:49:33  2019/10/26 02:32:19  <NA>    N            <NA>
181  20190927  TLB        1         <NA>     2019/09/27 09:47:38  2019/09/28 02:32:17  <NA>    N            <NA>
182  20191008  TLB        1         <NA>     2019/10/08 02:47:12  2019/10/08 02:48:51  <NA>    N            <NA>
183  20191009  TLB        1         <NA>     2019/10/09 13:42:08  2019/10/10 02:32:39  <NA>    N            <NA>
184  20191014  TLB        1         <NA>     2019/10/14 10:28:58  2019/10/15 02:32:29  <NA>    Y            114679
185  20191014  TLB        1         <NA>     2019/10/14 11:24:28  2019/10/15 02:32:29  <NA>    N            <NA>
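
The sample rows suggest how the raw cells map to typed values: timestamps use the YYYY/MM/DD HH:MM:SS layout, FIN_CNSL_YN holds Y/N flags, and "<NA>" marks a missing cell. The sketch below parses those cells with the standard library and measures the gap between REQ_TS and REG_TS. Reading REQ_TS as a request time and REG_TS as a registration time is an assumption based on the column names, and the helper names are illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional

TS_FORMAT = "%Y/%m/%d %H:%M:%S"  # layout seen in the sample rows


def parse_ts(text: str) -> Optional[datetime]:
    """Parse a timestamp cell; the literal "<NA>" marks a missing value."""
    return None if text == "<NA>" else datetime.strptime(text, TS_FORMAT)


def parse_yn(text: str) -> Optional[bool]:
    """Map Y/N flags to booleans; anything else (such as "<NA>") becomes None."""
    return {"Y": True, "N": False}.get(text)


def registration_lag(req_ts: str, reg_ts: str) -> Optional[timedelta]:
    """Time from REQ_TS to REG_TS, or None if either cell is missing."""
    req, reg = parse_ts(req_ts), parse_ts(reg_ts)
    if req is None or reg is None:
        return None
    return reg - req


# Rows 0 and 182 from the sample.
lag_first = registration_lag("2014/11/28 19:28:28", "2014/11/29 02:31:29")
lag_short = registration_lag("2019/10/08 02:47:12", "2019/10/08 02:48:51")

print(lag_first, lag_short, parse_yn("Y"), parse_yn("<NA>"))
```

In most sample rows REG_TS falls shortly after 02:30 on the day after REQ_TS, which would fit an overnight batch load, though the report itself does not say so.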