Overview

Dataset statistics

Number of variables: 6
Number of observations: 1000
Missing cells: 950
Missing cells (%): 15.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 52.9 KiB
Average record size in memory: 54.1 B
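
The missing-cell percentage follows from the counts listed here: 950 missing cells out of 6 × 1000 cells in total. A minimal sketch of that arithmetic (the variable names are illustrative and do not come from the report):

```python
# Counts copied from the dataset statistics in this report.
n_variables = 6
n_observations = 1000
missing_cells = 950

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_pct:.1f}%")  # 15.8%
```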

Variable types

Numeric: 4
Categorical: 2

Dataset

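Description: Public open data related to the work of the Claims Management Department (채권관리부) of the Korea Housing Finance Corporation (source data, releasable to the public, from the databases related to that department's work)
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15072891/fileData.do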

Alerts

PETITN_ACPT_DY is highly overall correlated with CO_LAWST_POS_CD and 1 other field (High correlation)
CO_LAWST_POS_CD is highly overall correlated with PETITN_ACPT_DY (High correlation)
LAWST_CLSS_DVCD is highly overall correlated with PETITN_ACPT_DY (High correlation)
CO_LAWST_POS_CD is highly imbalanced (91.3%) (Imbalance)
LAWST_CLSS_DVCD is highly imbalanced (91.1%) (Imbalance)
PETITN_ACPT_DY has 950 (95.0%) missing values (Missing)
ACPT_PTNO has unique values (Unique)
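
The report does not say how the imbalance percentages are computed. One formula that reproduces both figures from the category counts given later in the report (989/11 and 981/18/1) is 1 minus the Shannon entropy normalized by log2 of the number of categories. The sketch below assumes that formula; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

co_lawst_pos_cd = [989, 11]     # counts for categories 1 and 2
lawst_clss_dvcd = [981, 18, 1]  # counts for categories 1, 2 and 3

print(f"{imbalance_score(co_lawst_pos_cd):.1%}")  # 91.3%
print(f"{imbalance_score(lawst_clss_dvcd):.1%}")  # 91.1%
```

A score near 100% means almost all rows fall in one category, which is the case for both columns.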

Reproduction

Analysis started: 2023-12-12 19:42:26.411663
Analysis finished: 2023-12-12 19:42:28.794334
Duration: 2.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
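
A report of this kind can be regenerated roughly as sketched below. This assumes `pandas` and `ydata-profiling` are installed and that the dataset has been saved as a CSV file; the function name, file names, and title are hypothetical, and the exact configuration used for this report (config.json) is not reproduced here.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (sketch; paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are missing.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)

# Example call, assuming the CSV was downloaded from the URL listed in this report:
# build_report("dataset.csv")
```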

Variables

ACPT_PTNO
Real number (ℝ)

UNIQUE 

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0201304 × 10^10
Minimum: 2.0201301 × 10^10
Maximum: 2.0201305 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 2.0201301 × 10^10
5-th percentile: 2.0201303 × 10^10
Q1: 2.0201303 × 10^10
Median: 2.0201304 × 10^10
Q3: 2.0201304 × 10^10
95-th percentile: 2.0201304 × 10^10
Maximum: 2.0201305 × 10^10
Range: 3171
Interquartile range (IQR): 786.5

Descriptive statistics

Standard deviation: 443.22577
Coefficient of variation (CV): 2.1940454 × 10^-8
Kurtosis: 0.0045814669
Mean: 2.0201304 × 10^10
Median Absolute Deviation (MAD): 394.5
Skewness: -0.29593241
Sum: 2.0201304 × 10^13
Variance: 196449.08
Monotonicity: Not monotonic
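
Two of these figures can be cross-checked from the others: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A small sketch using the rounded values printed in this report, so agreement is only to the displayed precision:

```python
# Rounded values copied from the ACPT_PTNO statistics.
std_dev = 443.22577
mean = 2.0201304e10

variance = std_dev ** 2  # should match the reported Variance
cv = std_dev / mean      # should match the reported CV

print(f"variance ≈ {variance:.2f}")  # variance ≈ 196449.08
print(f"CV ≈ {cv:.4e}")              # CV ≈ 2.1940e-08
```

The CV is tiny because the values are eleven-digit numbers that differ only in their last four digits.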
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
20201304549          1       0.1%
20201303574          1       0.1%
20201303591          1       0.1%
20201303589          1       0.1%
20201303588          1       0.1%
20201303603          1       0.1%
20201303586          1       0.1%
20201303587          1       0.1%
20201303584          1       0.1%
20201303581          1       0.1%
Other values (990)   990     99.0%

Minimum 10 values

Value                Count   Frequency (%)
20201301378          1       0.1%
20201301723          1       0.1%
20201302946          1       0.1%
20201303073          1       0.1%
20201303085          1       0.1%
20201303121          1       0.1%
20201303122          1       0.1%
20201303123          1       0.1%
20201303124          1       0.1%
20201303125          1       0.1%

Maximum 10 values

Value                Count   Frequency (%)
20201304549          1       0.1%
20201304548          1       0.1%
20201304545          1       0.1%
20201304544          1       0.1%
20201304543          1       0.1%
20201304542          1       0.1%
20201304540          1       0.1%
20201304539          1       0.1%
20201304538          1       0.1%
20201304537          1       0.1%

MDBTR_CUST_NO
Real number (ℝ)

Distinct: 880
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82278016
Minimum: 268952
Maximum: 1.362029 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 268952
5-th percentile: 24283638
Q1: 66891528
Median: 88117561
Q3: 1.01497 × 10^8
95-th percentile: 1.2122536 × 10^8
Maximum: 1.362029 × 10^8
Range: 1.3593395 × 10^8
Interquartile range (IQR): 34605476

Descriptive statistics

Standard deviation: 29403751
Coefficient of variation (CV): 0.35737069
Kurtosis: -0.24947517
Mean: 82278016
Median Absolute Deviation (MAD): 18052321
Skewness: -0.68862151
Sum: 8.2278016 × 10^10
Variance: 8.6458058 × 10^14
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
93908202             3       0.3%
118880218            3       0.3%
81675356             3       0.3%
21564211             3       0.3%
66870861             3       0.3%
65681214             3       0.3%
73584824             3       0.3%
77911778             3       0.3%
88825154             3       0.3%
101462546            3       0.3%
Other values (870)   970     97.0%

Minimum 10 values

Value                Count   Frequency (%)
268952               1       0.1%
782315               1       0.1%
3001912              1       0.1%
3792483              2       0.2%
6105860              1       0.1%
6867603              1       0.1%
8763695              1       0.1%
9088654              1       0.1%
10023174             1       0.1%
10295313             2       0.2%

Maximum 10 values

Value                Count   Frequency (%)
136202904            1       0.1%
135204725            1       0.1%
130928776            1       0.1%
128943598            1       0.1%
128881391            1       0.1%
128236111            1       0.1%
128050287            2       0.2%
127947126            1       0.1%
127129038            2       0.2%
126786456            1       0.1%

LAWST_TYP_CD
Real number (ℝ)

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.978
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 15
Q3: 15
95-th percentile: 16
Maximum: 99
Range: 98
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 11.288709
Coefficient of variation (CV): 0.94245358
Kurtosis: 36.387236
Mean: 11.978
Median Absolute Deviation (MAD): 0
Skewness: 4.8154287
Sum: 11978
Variance: 127.43495
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=14)]

Common values

Value                Count   Frequency (%)
15                   643     64.3%
1                    287     28.7%
21                   26      2.6%
99                   11      1.1%
20                   11      1.1%
16                   4       0.4%
2                    4       0.4%
6                    3       0.3%
11                   3       0.3%
7                    2       0.2%
Other values (4)     6       0.6%

Minimum 10 values

Value                Count   Frequency (%)
1                    287     28.7%
2                    4       0.4%
4                    1       0.1%
6                    3       0.3%
7                    2       0.2%
8                    2       0.2%
10                   1       0.1%
11                   3       0.3%
12                   2       0.2%
15                   643     64.3%

Maximum 10 values

Value                Count   Frequency (%)
99                   11      1.1%
21                   26      2.6%
20                   11      1.1%
16                   4       0.4%
15                   643     64.3%
12                   2       0.2%
11                   3       0.3%
10                   1       0.1%
8                    2       0.2%
7                    2       0.2%
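
Because LAWST_TYP_CD has only 14 distinct codes and all of them appear in the frequency tables, the summary statistics can be rebuilt from the counts alone. The sketch below does that with the standard library. It assumes the reported variance uses the sample (n - 1) denominator, which is what matches the printed figure.

```python
import statistics

# Code -> count, copied from the LAWST_TYP_CD frequency tables (14 distinct codes).
counts = {15: 643, 1: 287, 21: 26, 99: 11, 20: 11, 16: 4, 2: 4,
          6: 3, 11: 3, 7: 2, 8: 2, 12: 2, 4: 1, 10: 1}

# Expand the table back into one value per observation.
values = [code for code, n in counts.items() for _ in range(n)]

total = len(values)                     # 1000 observations
value_sum = sum(values)                 # reported Sum: 11978
mean = value_sum / total                # reported Mean: 11.978
median = statistics.median(values)      # reported median: 15
variance = statistics.variance(values)  # sample variance, reported 127.43495

print(total, value_sum, mean, median, round(variance, 5))
```

The mean, variance, and skewness treat these codes as quantities. If they are category labels, as the small number of distinct values suggests, the frequency table is the more meaningful summary.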

CO_LAWST_POS_CD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

[Figure: bar chart of category counts: 1 (989), 2 (11)]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       989     98.9%
2       11      1.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

LAWST_CLSS_DVCD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

[Figure: bar chart of category counts: 1 (981), 2 (18), 3 (1)]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%
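
The Distinct and Unique figures measure different things. Judging from the counts in this report, Distinct is the number of different categories, while Unique is the number of categories that occur exactly once (here only category 3). A short sketch under that reading:

```python
from collections import Counter

# LAWST_CLSS_DVCD category counts as given in this report.
category_counts = Counter({"1": 981, "2": 18, "3": 1})

distinct = len(category_counts)                              # different categories
unique = sum(1 for n in category_counts.values() if n == 1)  # categories seen once
unique_pct = 100 * unique / sum(category_counts.values())

print(distinct, unique, f"{unique_pct:.1f}%")  # 3 1 0.1%
```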

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       981     98.1%
2       18      1.8%
3       1       0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

PETITN_ACPT_DY
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 45
Distinct (%): 90.0%
Missing: 950
Missing (%): 95.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20197687
Minimum: 20120917
Maximum: 20201005
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 20120917
5-th percentile: 20190869
Q1: 20200347
Median: 20200610
Q3: 20200722
95-th percentile: 20200913
Maximum: 20201005
Range: 80088
Interquartile range (IQR): 374.5
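
The PETITN_ACPT_DY values look like dates written as YYYYMMDD integers (for example 20200717). That reading is an assumption based on the values themselves. If it holds, the numeric Range and quantiles are not date arithmetic: the Range of 80088 is not a number of days, and the interpolated Q1 of 20200347 is not a calendar date. A sketch that parses the values as dates (`parse_yyyymmdd` is an illustrative helper):

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Parse an integer such as 20200717 as a date; raises ValueError if invalid."""
    return datetime.strptime(str(value), "%Y%m%d").date()

earliest = parse_yyyymmdd(20120917)  # reported Minimum
latest = parse_yyyymmdd(20201005)    # reported Maximum
span_days = (latest - earliest).days
print(span_days)  # 2940 calendar days, versus a numeric Range of 80088

# The interpolated Q1 (20200347) does not correspond to a real date.
try:
    parse_yyyymmdd(20200347)
    q1_is_valid_date = True
except ValueError:
    q1_is_valid_date = False
print(q1_is_valid_date)  # False
```

Converting the column to a date type before profiling would give date-aware statistics instead.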

Descriptive statistics

Standard deviation: 11578.97
Coefficient of variation (CV): 0.00057328197
Kurtosis: 41.403365
Mean: 20197687
Median Absolute Deviation (MAD): 196.5
Skewness: -6.20946
Sum: 1.0098843 × 10^9
Variance: 1.3407254 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=45)]

Common values

Value                Count   Frequency (%)
20200717             2       0.2%
20200409             2       0.2%
20200615             2       0.2%
20200807             2       0.2%
20200629             2       0.2%
20191206             1       0.1%
20200828             1       0.1%
20190829             1       0.1%
20200722             1       0.1%
20200713             1       0.1%
Other values (35)    35      3.5%
(Missing)            950     95.0%

Minimum 10 values

Value                Count   Frequency (%)
20120917             1       0.1%
20190617             1       0.1%
20190829             1       0.1%
20190918             1       0.1%
20191122             1       0.1%
20191206             1       0.1%
20191213             1       0.1%
20191218             1       0.1%
20200302             1       0.1%
20200311             1       0.1%

Maximum 10 values

Value                Count   Frequency (%)
20201005             1       0.1%
20200923             1       0.1%
20200922             1       0.1%
20200902             1       0.1%
20200831             1       0.1%
20200828             1       0.1%
20200825             1       0.1%
20200813             1       0.1%
20200807             2       0.2%
20200724             1       0.1%

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

                 ACPT_PTNO  MDBTR_CUST_NO  LAWST_TYP_CD  CO_LAWST_POS_CD  LAWST_CLSS_DVCD  PETITN_ACPT_DY
ACPT_PTNO        1.000      0.000          0.156         0.000            0.000            0.000
MDBTR_CUST_NO    0.000      1.000          0.241         0.000            0.117            0.000
LAWST_TYP_CD     0.156      0.241          1.000         0.541            0.265            0.000
CO_LAWST_POS_CD  0.000      0.000          0.541         1.000            0.220            NaN
LAWST_CLSS_DVCD  0.000      0.117          0.265         0.220            1.000            NaN
PETITN_ACPT_DY   0.000      0.000          0.000         NaN              NaN              1.000

[Figure: correlation heatmap]

                 LAWST_CLSS_DVCD  CO_LAWST_POS_CD
LAWST_CLSS_DVCD  1.000            0.360
CO_LAWST_POS_CD  0.360            1.000

[Figure: correlation heatmap]

                 ACPT_PTNO  MDBTR_CUST_NO  LAWST_TYP_CD  PETITN_ACPT_DY  CO_LAWST_POS_CD  LAWST_CLSS_DVCD
ACPT_PTNO        1.000      -0.012         0.028         0.425           0.293            0.000
MDBTR_CUST_NO    -0.012     1.000          0.084         0.043           0.023            0.069
LAWST_TYP_CD     0.028      0.084          1.000         0.363           0.369            0.253
PETITN_ACPT_DY   0.425      0.043          0.363         1.000           1.000            1.000
CO_LAWST_POS_CD  0.293      0.023          0.369         1.000           1.000            0.360
LAWST_CLSS_DVCD  0.000      0.069          0.253         1.000           0.360            1.000
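
The high-correlation alerts at the top of the report are consistent with the last matrix in this section: the only off-diagonal entries at or above 0.5 involve PETITN_ACPT_DY. The sketch below scans that matrix for such pairs. The 0.5 cut-off is an assumption chosen for illustration, since the report does not state the threshold it used.

```python
from itertools import combinations

columns = ["ACPT_PTNO", "MDBTR_CUST_NO", "LAWST_TYP_CD",
           "PETITN_ACPT_DY", "CO_LAWST_POS_CD", "LAWST_CLSS_DVCD"]

# Values copied from the last correlation matrix in this section.
matrix = [
    [1.000, -0.012, 0.028, 0.425, 0.293, 0.000],
    [-0.012, 1.000, 0.084, 0.043, 0.023, 0.069],
    [0.028, 0.084, 1.000, 0.363, 0.369, 0.253],
    [0.425, 0.043, 0.363, 1.000, 1.000, 1.000],
    [0.293, 0.023, 0.369, 1.000, 1.000, 0.360],
    [0.000, 0.069, 0.253, 1.000, 0.360, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Visit each unordered pair once and keep those at or above the threshold.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

PETITN_ACPT_DY is present in only 50 of the 1000 rows, so coefficients of 1.000 involving it rest on a small subset and are worth checking against the raw data.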

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     ACPT_PTNO    MDBTR_CUST_NO  LAWST_TYP_CD  CO_LAWST_POS_CD  LAWST_CLSS_DVCD  PETITN_ACPT_DY
0    20201304549  98434771       1             1                1                <NA>
1    20201304548  47876459       99            1                1                <NA>
2    20201304545  46504654       1             1                1                <NA>
3    20201304544  87114989       1             1                1                <NA>
4    20201304543  83066963       15            1                1                <NA>
5    20201304542  100124306      15            1                1                <NA>
6    20201304539  77261716       1             1                1                20190617
7    20201304538  109950050      1             1                1                <NA>
8    20201304540  87879059       1             1                1                <NA>
9    20201304536  124895404      15            1                1                <NA>

Last rows

     ACPT_PTNO    MDBTR_CUST_NO  LAWST_TYP_CD  CO_LAWST_POS_CD  LAWST_CLSS_DVCD  PETITN_ACPT_DY
990  20201303142  97049413       15            1                1                <NA>
991  20201303133  113350206      20            1                1                <NA>
992  20201303132  118566749      15            1                1                <NA>
993  20201303131  20927396       11            1                1                <NA>
994  20201303135  25559141       1             1                1                <NA>
995  20201303125  125385036      15            1                1                20191218
996  20201303124  32122587       15            1                1                <NA>
997  20201303123  110531222      15            1                1                20200319
998  20201303073  3792483        2             1                1                <NA>
999  20201303121  126786456      15            1                1                <NA>