Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 2.0%
Total size in memory: 7.0 KiB
Average record size in memory: 71.3 B

Variable types

Categorical: 6
Numeric: 1
DateTime: 1

Dataset

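Description: Provides data on the status of housing pension (주택연금) loan terms for customers of Bogeumjari Loan (보금자리론), one of the policy mortgage products operated by the Korea Housing Finance Corporation (한국주택금융공사).
URL: https://www.data.go.kr/data/15090220/fileData.do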

Alerts

최초등록부점 has constant value "999" (Constant)
Dataset has 2 (2.0%) duplicate rows (Duplicates)
상품대분류 is highly overall correlated with 상품중분류 and 1 other field (High correlation)
상품중분류 is highly overall correlated with 상품대분류 and 2 other fields (High correlation)
상품 is highly overall correlated with 상품대분류 and 1 other field (High correlation)
부점 is highly overall correlated with 상품중분류 (High correlation)
대출기간 is highly imbalanced (69.7%) (Imbalance)
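
The 69.7% imbalance figure for 대출기간 can be reproduced from the value counts listed in that variable's Common Values table, assuming the score is one minus the Shannon entropy divided by log2 of the number of classes. That formula is an assumption about how the profiler derives the alert, not something this report states, and the name `imbalance_score` is purely illustrative.

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. A column with a single class returns 0.0 here because the
    ratio is undefined (a convention chosen for this sketch).
    """
    counts = Counter(values)
    n = sum(counts.values())
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1.0 - entropy / math.log2(k)


# Value counts of 대출기간 as reported: 360 -> 90, 240 -> 5, 120 -> 4, 180 -> 1
loan_terms = ["360"] * 90 + ["240"] * 5 + ["120"] * 4 + ["180"] * 1
score = imbalance_score(loan_terms)
print(f"{score:.1%}")  # 69.7%
```

Under this assumption, the score depends only on the class proportions, so the same 90/5/4/1 split would give the same figure at any sample size.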

Reproduction

Analysis started: 2023-12-12 21:29:23.416021
Analysis finished: 2023-12-12 21:29:24.046731
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
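
A report like this one is typically produced with a few lines of Python. The sketch below shows one way to do it with `ydata_profiling.ProfileReport`; the helper name `build_report`, the file paths, the default encoding, and the `dtype` argument are all assumptions made for illustration, since the report does not record how the source file was loaded.

```python
from pathlib import Path


def build_report(csv_path, out_path="report.html", encoding="utf-8", dtype=None):
    """Profile a CSV file and write an HTML report; returns the output path.

    csv_path, out_path, encoding and dtype are placeholders: the actual
    file name and load options behind this report are not known.
    """
    # Imported lazily so the helper can be defined without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding, dtype=dtype)
    profile = ProfileReport(df, title=Path(csv_path).stem)
    profile.to_file(out_path)
    return Path(out_path)
```

Code-like columns such as 상품대분류 appear here as Categorical with length statistics, which suggests they were loaded as strings; passing a `dtype` mapping to `read_csv` is one way to get that behaviour.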

Variables

상품대분류
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 57 | 57.0%
1 | 43 | 43.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

상품중분류
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 53 | 53.0%
1 | 43 | 43.0%
2 | 4 | 4.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

상품
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 120301
2nd row: 120301
3rd row: 120301
4th row: 120301
5th row: 120301

Common Values

Value | Count | Frequency (%)
120301 | 33 | 33.0%
110108 | 24 | 24.0%
120305 | 20 | 20.0%
110101 | 19 | 19.0%
120201 | 4 | 4.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

금액
Real number (ℝ)

Distinct: 66
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.62106 × 10^8
Minimum: 20000000
Maximum: 4.35 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20000000
5-th percentile: 35700000
Q1: 80000000
Median: 1.46 × 10^8
Q3: 2.38 × 10^8
95-th percentile: 3.181 × 10^8
Maximum: 4.35 × 10^8
Range: 4.15 × 10^8
Interquartile range (IQR): 1.58 × 10^8

Descriptive statistics

Standard deviation: 1.0073767 × 10^8
Coefficient of variation (CV): 0.62143083
Kurtosis: -0.61748243
Mean: 1.62106 × 10^8
Median Absolute Deviation (MAD): 75000000
Skewness: 0.55546501
Sum: 1.62106 × 10^10
Variance: 1.0148078 × 10^16
Monotonicity: Not monotonic
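
Several of the 금액 statistics are derived from others, so the reported values can be cross-checked against each other. The sketch below recomputes the sum, range, IQR, coefficient of variation, and variance from the reported mean, standard deviation, extremes, and quartiles; the variable names are illustrative, and the tolerance only allows for the rounding visible in the report.

```python
import math

# Values copied from the 금액 statistics in this report.
n = 100                      # number of observations
mean = 1.62106e8
std = 1.0073767e8
minimum, maximum = 2.0e7, 4.35e8
q1, q3 = 8.0e7, 2.38e8

# Quantities that follow from the values above by definition.
derived = {
    "sum": mean * n,              # sum = mean * n
    "range": maximum - minimum,   # range = max - min
    "iqr": q3 - q1,               # IQR = Q3 - Q1
    "cv": std / mean,             # CV = std / mean
    "variance": std ** 2,         # variance = std^2
}

# The same quantities as printed in the report.
reported = {
    "sum": 1.62106e10,
    "range": 4.15e8,
    "iqr": 1.58e8,
    "cv": 0.62143083,
    "variance": 1.0148078e16,
}

for name, value in derived.items():
    # rel_tol covers the rounding of the reported figures.
    assert math.isclose(value, reported[name], rel_tol=1e-6), name
```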
Figure: Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
300000000 | 9 | 9.0%
100000000 | 6 | 6.0%
200000000 | 4 | 4.0%
156000000 | 4 | 4.0%
50000000 | 3 | 3.0%
140000000 | 3 | 3.0%
80000000 | 3 | 3.0%
270000000 | 3 | 3.0%
30000000 | 2 | 2.0%
90000000 | 2 | 2.0%
Other values (56) | 61 | 61.0%

10 smallest values

Value | Count | Frequency (%)
20000000 | 2 | 2.0%
23000000 | 1 | 1.0%
30000000 | 2 | 2.0%
36000000 | 1 | 1.0%
40000000 | 2 | 2.0%
41400000 | 1 | 1.0%
42000000 | 1 | 1.0%
45700000 | 1 | 1.0%
50000000 | 3 | 3.0%
52000000 | 1 | 1.0%

10 largest values

Value | Count | Frequency (%)
435000000 | 1 | 1.0%
412000000 | 1 | 1.0%
382000000 | 1 | 1.0%
332000000 | 1 | 1.0%
320000000 | 1 | 1.0%
318000000 | 1 | 1.0%
300000000 | 9 | 9.0%
294000000 | 1 | 1.0%
292000000 | 1 | 1.0%
281000000 | 1 | 1.0%

부점
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.98
Min length: 1

Unique

Unique: 5
Unique (%): 5.0%

Sample

1st row: QAD
2nd row: TAA
3rd row: TBA
4th row: TBA
5th row: QAD

Common Values

Value | Count | Frequency (%)
THO | 13 | 13.0%
THA | 12 | 12.0%
TLA | 9 | 9.0%
TPB | 8 | 8.0%
QAD | 7 | 7.0%
TAC | 7 | 7.0%
THB | 7 | 7.0%
TBA | 6 | 6.0%
TAB | 5 | 5.0%
TLB | 4 | 4.0%
Other values (12) | 22 | 22.0%

Length

Figure: Histogram of lengths of the category

최초등록일시
DateTime

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-06-07 00:00:00
Maximum: 2021-06-08 00:00:00

Figure: Histogram with fixed size bins (bins=2)

최초등록부점
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 999
2nd row: 999
3rd row: 999
4th row: 999
5th row: 999

Common Values

Value | Count | Frequency (%)
999 | 100 | 100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

대출기간
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 360
2nd row: 360
3rd row: 240
4th row: 360
5th row: 360

Common Values

Value | Count | Frequency (%)
360 | 90 | 90.0%
240 | 5 | 5.0%
120 | 4 | 4.0%
180 | 1 | 1.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

Interactions


Correlations

Figure: correlation heatmap for the matrix below

 | 상품대분류 | 상품중분류 | 상품 | 금액 | 부점 | 최초등록일시 | 대출기간
상품대분류 | 1.000 | 1.000 | 1.000 | 0.553 | 0.641 | 0.656 | 0.059
상품중분류 | 1.000 | 1.000 | 1.000 | 0.462 | 0.781 | 0.293 | 0.000
상품 | 1.000 | 1.000 | 1.000 | 0.653 | 0.691 | 0.376 | 0.000
금액 | 0.553 | 0.462 | 0.653 | 1.000 | 0.000 | 0.527 | 0.000
부점 | 0.641 | 0.781 | 0.691 | 0.000 | 1.000 | 0.166 | 0.000
최초등록일시 | 0.656 | 0.293 | 0.376 | 0.527 | 0.166 | 1.000 | 0.109
대출기간 | 0.059 | 0.000 | 0.000 | 0.000 | 0.000 | 0.109 | 1.000

Figure: correlation heatmap for the matrix below

 | 부점 | 상품대분류 | 상품중분류 | 상품 | 대출기간
부점 | 1.000 | 0.457 | 0.525 | 0.387 | 0.000
상품대분류 | 0.457 | 1.000 | 0.995 | 0.985 | 0.034
상품중분류 | 0.525 | 0.995 | 1.000 | 0.990 | 0.000
상품 | 0.387 | 0.985 | 0.990 | 1.000 | 0.000
대출기간 | 0.000 | 0.034 | 0.000 | 0.000 | 1.000

Figure: correlation heatmap for the matrix below

 | 금액 | 상품대분류 | 상품중분류 | 상품 | 부점 | 대출기간
금액 | 1.000 | 0.420 | 0.311 | 0.311 | 0.000 | 0.000
상품대분류 | 0.420 | 1.000 | 0.995 | 0.985 | 0.457 | 0.034
상품중분류 | 0.311 | 0.995 | 1.000 | 0.990 | 0.525 | 0.000
상품 | 0.311 | 0.985 | 0.990 | 1.000 | 0.387 | 0.000
부점 | 0.000 | 0.457 | 0.525 | 0.387 | 1.000 | 0.000
대출기간 | 0.000 | 0.034 | 0.000 | 0.000 | 0.000 | 1.000
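
The report does not say which coefficient each matrix uses. For pairs of categorical columns, Cramér's V is one common choice, and the sketch below shows its plain (uncorrected) form to illustrate why hierarchical codes such as 상품대분류, 상품중분류, and 상품 score close to 1. Treating any of the matrices above as Cramér's V is an assumption, the function name `cramers_v` is illustrative, and the inputs are toy data rather than rows from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row = Counter(xs)              # marginal counts of xs
    col = Counter(ys)              # marginal counts of ys
    k = min(len(row), len(col))
    if k < 2:
        return 0.0  # undefined for a constant column; 0.0 by convention here
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy data: one column fully determines the other, so V is 1.0.
nested = cramers_v(["1", "1", "2", "2"], ["a", "a", "b", "b"])
# Toy data: the columns are independent, so V is 0.0.
unrelated = cramers_v(["1", "1", "2", "2"], ["a", "b", "a", "b"])
print(nested, unrelated)  # 1.0 0.0
```

When one code is a prefix-style refinement of another, every fine-grained value maps to exactly one coarse value, which is the situation the first toy call mimics.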

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

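First rows

 | 상품대분류 | 상품중분류 | 상품 | 금액 | 부점 | 최초등록일시 | 최초등록부점 | 대출기간
0 | 2 | 3 | 120301 | 318000000 | QAD | 2021-06-08 | 999 | 360
1 | 2 | 3 | 120301 | 272000000 | TAA | 2021-06-08 | 999 | 360
2 | 2 | 3 | 120301 | 40000000 | TBA | 2021-06-08 | 999 | 240
3 | 2 | 3 | 120301 | 80000000 | TBA | 2021-06-08 | 999 | 360
4 | 2 | 3 | 120301 | 98000000 | QAD | 2021-06-08 | 999 | 360
5 | 2 | 3 | 120301 | 150000000 | THA | 2021-06-08 | 999 | 360
6 | 2 | 3 | 120301 | 435000000 | THA | 2021-06-08 | 999 | 360
7 | 2 | 3 | 120301 | 50000000 | THB | 2021-06-08 | 999 | 360
8 | 2 | 3 | 120301 | 41400000 | THB | 2021-06-08 | 999 | 360
9 | 2 | 3 | 120301 | 42000000 | TJA | 2021-06-08 | 999 | 360

Last rows

 | 상품대분류 | 상품중분류 | 상품 | 금액 | 부점 | 최초등록일시 | 최초등록부점 | 대출기간
90 | 2 | 3 | 120301 | 65000000 | TAA | 2021-06-07 | 999 | 360
91 | 2 | 3 | 120301 | 80500000 | TJA | 2021-06-07 | 999 | 360
92 | 2 | 3 | 120301 | 100000000 | THO | 2021-06-07 | 999 | 360
93 | 2 | 3 | 120301 | 300000000 | TLB | 2021-06-07 | 999 | 360
94 | 2 | 3 | 120301 | 120000000 | THB | 2021-06-07 | 999 | 360
95 | 2 | 3 | 120301 | 90000000 | TLA | 2021-06-07 | 999 | 360
96 | 2 | 3 | 120301 | 73500000 | TLA | 2021-06-07 | 999 | 360
97 | 2 | 3 | 120301 | 60000000 | THO | 2021-06-07 | 999 | 360
98 | 2 | 3 | 120301 | 45700000 | THO | 2021-06-07 | 999 | 360
99 | 2 | 3 | 120301 | 23000000 | TPB | 2021-06-07 | 999 | 360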

Duplicate rows

Most frequently occurring

 | 상품대분류 | 상품중분류 | 상품 | 금액 | 부점 | 최초등록일시 | 최초등록부점 | 대출기간 | # duplicates
0 | 1 | 1 | 110108 | 300000000 | THB | 2021-06-08 | 999 | 360 | 2
1 | 2 | 3 | 120305 | 156000000 | QAD | 2021-06-08 | 999 | 360 | 2
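
Duplicate detection of this kind amounts to counting identical rows across all columns. The sketch below does that with the standard library on a handful of rows: the two duplicated records from the table above, each entered twice, plus one non-duplicated record from the Sample section. The helper name `duplicate_rows` is illustrative, and with pandas the same check is usually written with `DataFrame.duplicated`.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: c for row, c in counts.items() if c > 1}


# Columns: 상품대분류, 상품중분류, 상품, 금액, 부점, 최초등록일시, 최초등록부점, 대출기간
rows = [
    ("1", "1", "110108", 300000000, "THB", "2021-06-08", "999", "360"),
    ("2", "3", "120305", 156000000, "QAD", "2021-06-08", "999", "360"),
    ("1", "1", "110108", 300000000, "THB", "2021-06-08", "999", "360"),
    ("2", "3", "120305", 156000000, "QAD", "2021-06-08", "999", "360"),
    ("2", "3", "120301", 318000000, "QAD", "2021-06-08", "999", "360"),
]

dups = duplicate_rows(rows)
for row, count in dups.items():
    print(count, row)
```

Because a loan record carries no customer identifier in this dataset, two identical rows may well be two distinct loans, so dropping them is a judgment call rather than an automatic cleanup step.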