Overview

Dataset statistics

Number of variables: 9
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 79.2 KiB
Average record size in memory: 81.1 B
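
The average record size can be checked against the total size in memory: it is the total divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and do not come from the report):

```python
# Values copied from the dataset statistics.
total_size_kib = 79.2
n_observations = 1000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the reported figure after rounding to one decimal place.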

Variable types

Categorical: 5
Numeric: 4

Dataset

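Description: Raw data on the "My Region Housing Pension" (내지역주택연금) program, providing fields such as the housing-location city classification code, customer number, age bracket, and appraised amount.
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15073023/fileData.do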

Alerts

BASIS_DY has constant value "202006" [Constant]
AGE_SECTN has constant value "65" [Constant]
JUDGE_AMT is highly overall correlated with PNSN_PAYFORM_CD and 1 other field [High correlation]
PNSN_PAYFORM_CD is highly overall correlated with JUDGE_AMT [High correlation]
GUARNT_EXEC_AMT is highly overall correlated with JUDGE_AMT [High correlation]
PNSN_PROD_DVCD is highly imbalanced (50.1%) [Imbalance]
GUARNT_ISSUE_CNT is highly imbalanced (80.6%) [Imbalance]
GUARNT_EXEC_AMT is highly skewed (γ1 = 20.60116992) [Skewed]
CUST_NO has unique values [Unique]
GUARNT_EXEC_AMT has 13 (1.3%) zeros [Zeros]
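
The imbalance percentages in the alerts can be reproduced from the category counts listed in the variable sections. One formula that yields both figures is 1 minus the Shannon entropy of the class shares divided by the maximum possible entropy, log2(k). That this is the formula the profiling tool uses is an assumption here, and the function name is illustrative:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from the variable sections of this report.
pnsn_prod_dvcd = [782, 213, 5]
guarnt_issue_cnt = [970, 30]

print(f"PNSN_PROD_DVCD:   {imbalance_score(pnsn_prod_dvcd):.1%}")
print(f"GUARNT_ISSUE_CNT: {imbalance_score(guarnt_issue_cnt):.1%}")
```

A score of 0 means perfectly balanced classes; a score near 1 means a single class dominates.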

Reproduction

Analysis started: 2023-12-11 23:47:23.272990
Analysis finished: 2023-12-11 23:47:25.944130
Duration: 2.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
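
The reported duration is the difference between the two timestamps. A small sketch of that check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Assumed timestamp layout: date, time, and microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-11 23:47:23.272990", FMT)
finished = datetime.strptime("2023-12-11 23:47:25.944130", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```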

Variables

BASIS_DY
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202006
2nd row: 202006
3rd row: 202006
4th row: 202006
5th row: 202006

Common Values

Value   Count  Frequency (%)
202006  1000   100.0%

Length

Histogram of lengths of the category


HOUSE_LOC_CITY_DVCD
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30
2nd row: 30
3rd row: 30
4th row: 30
5th row: 30

Common Values

Value  Count  Frequency (%)
28     585    58.5%
29     247    24.7%
30     168    16.8%

Length

Histogram of lengths of the category

CUST_NO
Real number (ℝ)

UNIQUE 

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2132185 × 10⁸
Minimum: 7986907
Maximum: 1.42296 × 10⁸
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 7986907
5-th percentile: 93169678
Q1: 1.150409 × 10⁸
median: 1.2333852 × 10⁸
Q3: 1.300265 × 10⁸
95-th percentile: 1.408883 × 10⁸
Maximum: 1.42296 × 10⁸
Range: 1.3430909 × 10⁸
Interquartile range (IQR): 14985605

Descriptive statistics

Standard deviation: 16681368
Coefficient of variation (CV): 0.13749682
Kurtosis: 12.252093
Mean: 1.2132185 × 10⁸
Median Absolute Deviation (MAD): 7602420.5
Skewness: -2.6387022
Sum: 1.2132185 × 10¹¹
Variance: 2.7826803 × 10¹⁴
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
142071404           1      0.1%
125904996           1      0.1%
126428248           1      0.1%
126362896           1      0.1%
126353643           1      0.1%
126327763           1      0.1%
126294443           1      0.1%
126243122           1      0.1%
126236870           1      0.1%
126225124           1      0.1%
Other values (990)  990    99.0%

Smallest values

Value     Count  Frequency (%)
7986907   1      0.1%
10824027  1      0.1%
17564966  1      0.1%
17700940  1      0.1%
19911546  1      0.1%
23923038  1      0.1%
28611028  1      0.1%
29817818  1      0.1%
34981409  1      0.1%
45688678  1      0.1%

Largest values

Value      Count  Frequency (%)
142296001  1      0.1%
142192299  1      0.1%
142188384  1      0.1%
142130196  1      0.1%
142106450  1      0.1%
142101594  1      0.1%
142074126  1      0.1%
142071404  1      0.1%
142039310  1      0.1%
142024590  1      0.1%

AGE_SECTN
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 65
2nd row: 65
3rd row: 65
4th row: 65
5th row: 65

Common Values

Value  Count  Frequency (%)
65     1000   100.0%

Length

Histogram of lengths of the category

JUDGE_AMT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 535
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0070178 × 10⁸
Minimum: 20787000
Maximum: 8.9561783 × 10⁸
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 20787000
5-th percentile: 72000000
Q1: 1.15 × 10⁸
median: 1.615 × 10⁸
Q3: 2.4613986 × 10⁸
95-th percentile: 4.6525 × 10⁸
Maximum: 8.9561783 × 10⁸
Range: 8.7483083 × 10⁸
Interquartile range (IQR): 1.3113986 × 10⁸

Descriptive statistics

Standard deviation: 1.2986514 × 10⁸
Coefficient of variation (CV): 0.64705526
Kurtosis: 5.3751776
Mean: 2.0070178 × 10⁸
Median Absolute Deviation (MAD): 58500000
Skewness: 2.0167282
Sum: 2.0070178 × 10¹¹
Variance: 1.6864954 × 10¹⁶
Monotonicity: Not monotonic
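
Several JUDGE_AMT statistics are derived from others, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are the rounded figures printed in this report, so agreement is only expected to the printed precision; the variable names are illustrative:

```python
# Reported figures for JUDGE_AMT (Q3, mean, and std are printed to 8 significant digits).
minimum, maximum = 20_787_000, 895_617_830
q1, q3 = 115_000_000, 246_139_860
mean, std = 2.0070178e8, 1.2986514e8

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"range={value_range}, IQR={iqr}, CV={cv:.6f}, variance={variance:.5e}")
```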
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
125000000           17     1.7%
150000000           16     1.6%
110000000           15     1.5%
120000000           15     1.5%
135000000           15     1.5%
140000000           13     1.3%
100000000           13     1.3%
160000000           13     1.3%
130000000           11     1.1%
115000000           11     1.1%
Other values (525)  861    86.1%

Smallest values

Value     Count  Frequency (%)
20787000  1      0.1%
37000000  1      0.1%
38376000  1      0.1%
40326900  1      0.1%
44000000  2      0.2%
46813200  1      0.1%
47500000  1      0.1%
49000000  1      0.1%
50000000  2      0.2%
50500000  1      0.1%

Largest values

Value      Count  Frequency (%)
895617830  1      0.1%
894071630  1      0.1%
890983200  1      0.1%
831795720  1      0.1%
831070860  1      0.1%
805295000  1      0.1%
752500000  1      0.1%
731634000  1      0.1%
700000000  2      0.2%
699492000  1      0.1%

PNSN_PAYFORM_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.337
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 7
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 2.8725742
Coefficient of variation (CV): 0.86082536
Kurtosis: -1.5156261
Mean: 3.337
Median Absolute Deviation (MAD): 0
Skewness: 0.6014012
Sum: 3337
Variance: 8.2516827
Monotonicity: Not monotonic
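
Because PNSN_PAYFORM_CD takes only seven distinct values, its summary statistics can be recomputed from the value counts listed for this variable. A minimal sketch, assuming the variance is the sample variance with an n - 1 denominator (that choice is what reproduces the reported figure); the names are illustrative:

```python
# Value -> count, copied from the PNSN_PAYFORM_CD frequency listing.
counts = {1: 507, 2: 124, 4: 21, 5: 1, 6: 26, 7: 231, 8: 90}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Sample variance (n - 1 denominator), an assumption checked against the report.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = variance ** 0.5

# Median: expand the table into a sorted list and average the two middle values.
values = sorted(v for v, c in counts.items() for _ in range(c))
median = (values[n // 2 - 1] + values[n // 2]) / 2

print(n, total, mean, round(variance, 7), round(std, 7), median)
```

More than half of the observations carry code 1, which is why the median and Q1 both equal 1 and the median absolute deviation is 0.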
Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
1      507    50.7%
7      231    23.1%
2      124    12.4%
8      90     9.0%
6      26     2.6%
4      21     2.1%
5      1      0.1%

Smallest values

Value  Count  Frequency (%)
1      507    50.7%
2      124    12.4%
4      21     2.1%
5      1      0.1%
6      26     2.6%
7      231    23.1%
8      90     9.0%

Largest values

Value  Count  Frequency (%)
8      90     9.0%
7      231    23.1%
6      26     2.6%
5      1      0.1%
4      21     2.1%
2      124    12.4%
1      507    50.7%

PNSN_PROD_DVCD
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.218
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      782    78.2%
22     213    21.3%
21     5      0.5%

Length

Histogram of lengths of the category

GUARNT_ISSUE_CNT
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      970    97.0%
1      30     3.0%

Length

Histogram of lengths of the category

GUARNT_EXEC_AMT
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 988
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1372617.4
Minimum: 0
Maximum: 1.9732403 × 10⁸
Zeros: 13
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 243349.2
Q1: 486196.25
median: 688554
Q3: 1027116.5
95-th percentile: 2044521.4
Maximum: 1.9732403 × 10⁸
Range: 1.9732403 × 10⁸
Interquartile range (IQR): 540920.25

Descriptive statistics

Standard deviation: 7638776.7
Coefficient of variation (CV): 5.5651174
Kurtosis: 481.59876
Mean: 1372617.4
Median Absolute Deviation (MAD): 244927.5
Skewness: 20.60117
Sum: 1.3726174 × 10⁹
Variance: 5.835091 × 10¹³
Monotonicity: Not monotonic
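
The extreme skewness of GUARNT_EXEC_AMT comes from a handful of very large values. As an illustration only, the sketch below applies Tukey's 1.5 × IQR rule to the reported quartiles and tests the ten largest values listed for this variable. Tukey's rule is not part of the profiling output; it is brought in here as a conventional outlier heuristic, and the names are illustrative:

```python
# Quartiles copied from the quantile statistics of GUARNT_EXEC_AMT.
q1, q3 = 486196.25, 1027116.5
iqr = q3 - q1

# Assumption: the conventional Tukey multiplier of 1.5.
upper_fence = q3 + 1.5 * iqr

# Ten largest values, copied from this variable's listing of largest values.
largest = [197324028, 112929900, 43557540, 42973371, 39223485,
           31951370, 18172410, 16509663, 15973763, 10517378]
above = [v for v in largest if v > upper_fence]

print(f"IQR={iqr}, upper fence={upper_fence}, values above fence={len(above)} of {len(largest)}")
```

The mean (1372617.4) sits well above Q3, which is consistent with these few large values pulling it upward.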
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   13     1.3%
1229039             1      0.1%
885049              1      0.1%
580662              1      0.1%
769342              1      0.1%
774677              1      0.1%
504835              1      0.1%
1048581             1      0.1%
695919              1      0.1%
485627              1      0.1%
Other values (978)  978    97.8%

Smallest values

Value   Count  Frequency (%)
0       13     1.3%
27494   1      0.1%
49508   1      0.1%
95506   1      0.1%
111438  1      0.1%
113060  1      0.1%
122550  1      0.1%
129595  1      0.1%
164158  1      0.1%
168762  1      0.1%

Largest values

Value      Count  Frequency (%)
197324028  1      0.1%
112929900  1      0.1%
43557540   1      0.1%
42973371   1      0.1%
39223485   1      0.1%
31951370   1      0.1%
18172410   1      0.1%
16509663   1      0.1%
15973763   1      0.1%
10517378   1      0.1%

Interactions

[16 interaction plots; images not reproduced]

Correlations

Variable             HOUSE_LOC_CITY_DVCD  CUST_NO  JUDGE_AMT  PNSN_PAYFORM_CD  PNSN_PROD_DVCD  GUARNT_ISSUE_CNT  GUARNT_EXEC_AMT
HOUSE_LOC_CITY_DVCD  1.000                0.396    0.453      0.316            0.351           0.017             0.092
CUST_NO              0.396                1.000    0.000      0.416            0.303           0.335             0.185
JUDGE_AMT            0.453                0.000    1.000      0.456            0.264           0.000             0.428
PNSN_PAYFORM_CD      0.316                0.416    0.456      1.000            0.386           0.000             0.161
PNSN_PROD_DVCD       0.351                0.303    0.264      0.386            1.000           0.000             0.000
GUARNT_ISSUE_CNT     0.017                0.335    0.000      0.000            0.000           1.000             0.271
GUARNT_EXEC_AMT      0.092                0.185    0.428      0.161            0.000           0.271             1.000

Variable             HOUSE_LOC_CITY_DVCD  PNSN_PROD_DVCD  GUARNT_ISSUE_CNT
HOUSE_LOC_CITY_DVCD  1.000                0.123           0.029
PNSN_PROD_DVCD       0.123                1.000           0.000
GUARNT_ISSUE_CNT     0.029                0.000           1.000

Variable             CUST_NO  JUDGE_AMT  PNSN_PAYFORM_CD  GUARNT_EXEC_AMT  HOUSE_LOC_CITY_DVCD  PNSN_PROD_DVCD  GUARNT_ISSUE_CNT
CUST_NO              1.000    0.017      0.042            0.030            0.262                0.189           0.256
JUDGE_AMT            0.017    1.000      -0.621           0.694            0.305                0.163           0.000
PNSN_PAYFORM_CD      0.042    -0.621     1.000            -0.427           0.223                0.281           0.000
GUARNT_EXEC_AMT      0.030    0.694      -0.427           1.000            0.069                0.000           0.330
HOUSE_LOC_CITY_DVCD  0.262    0.305      0.223            0.069            1.000                0.123           0.029
PNSN_PROD_DVCD       0.189    0.163      0.281            0.000            0.123                1.000           0.000
GUARNT_ISSUE_CNT     0.256    0.000      0.000            0.330            0.029                0.000           1.000
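
The high-correlation alerts in the overview can be traced back to the last correlation matrix above (the one containing negative coefficients). The sketch below flags every pair whose absolute coefficient reaches a threshold. The threshold of 0.5 is an assumption rather than a value stated in this report, and the variable names are illustrative:

```python
# Variable order and coefficients copied from the last correlation matrix.
names = ["CUST_NO", "JUDGE_AMT", "PNSN_PAYFORM_CD", "GUARNT_EXEC_AMT",
         "HOUSE_LOC_CITY_DVCD", "PNSN_PROD_DVCD", "GUARNT_ISSUE_CNT"]
matrix = [
    [1.000, 0.017, 0.042, 0.030, 0.262, 0.189, 0.256],
    [0.017, 1.000, -0.621, 0.694, 0.305, 0.163, 0.000],
    [0.042, -0.621, 1.000, -0.427, 0.223, 0.281, 0.000],
    [0.030, 0.694, -0.427, 1.000, 0.069, 0.000, 0.330],
    [0.262, 0.305, 0.223, 0.069, 1.000, 0.123, 0.029],
    [0.189, 0.163, 0.281, 0.000, 0.123, 1.000, 0.000],
    [0.256, 0.000, 0.000, 0.330, 0.029, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation

# Scan the upper triangle so each pair is reported once.
flagged = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:+.3f}")
```

Under this assumed threshold the only flagged pairs involve JUDGE_AMT, paired with PNSN_PAYFORM_CD and with GUARNT_EXEC_AMT, which is consistent with the three high-correlation alerts.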

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  BASIS_DY  HOUSE_LOC_CITY_DVCD  CUST_NO    AGE_SECTN  JUDGE_AMT  PNSN_PAYFORM_CD  PNSN_PROD_DVCD  GUARNT_ISSUE_CNT  GUARNT_EXEC_AMT
0      202006    30                   142071404  65         50500000   7                1               1                 1229039
1      202006    30                   141930106  65         650000000  6                1               1                 197324028
2      202006    30                   141623332  65         115000000  8                1               1                 18172410
3      202006    30                   141213030  65         350000000  6                1               0                 476668
4      202006    30                   140235187  65         92500000   7                1               0                 415549
5      202006    30                   140164290  65         415000000  1                22              0                 1402689
6      202006    30                   140069052  65         275000000  1                22              0                 772177
7      202006    30                   138307645  65         700000000  1                1               0                 1739284
8      202006    30                   138039036  65         115000000  7                1               0                 379262
9      202006    30                   137778806  65         339922800  1                22              0                 0

Last rows

Index  BASIS_DY  HOUSE_LOC_CITY_DVCD  CUST_NO    AGE_SECTN  JUDGE_AMT  PNSN_PAYFORM_CD  PNSN_PROD_DVCD  GUARNT_ISSUE_CNT  GUARNT_EXEC_AMT
990    202006    28                   111008385  65         94096900   8                1               0                 454075
991    202006    28                   111007865  65         71750000   8                1               0                 453399
992    202006    28                   110988370  65         153500000  1                1               0                 991040
993    202006    28                   110942772  65         107500000  7                1               0                 405587
994    202006    28                   110892958  65         277500000  4                1               0                 1084847
995    202006    28                   110892330  65         96000000   8                1               0                 299727
996    202006    28                   110891917  65         255000000  1                22              0                 867374
997    202006    28                   110748251  65         127000000  7                1               0                 699968
998    202006    28                   110729911  65         108134600  7                1               0                 423328
999    202006    28                   110729526  65         77000000   8                1               0                 258434