Overview

Dataset statistics

Number of variables: 6
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.9 KiB
Average record size in memory: 51.1 B
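
The two memory figures above are consistent with each other. A minimal sketch of the arithmetic, assuming the average record size is simply the total in-memory size divided by the number of observations (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block, as rounded in the report.
total_size_kib = 49.9    # "Total size in memory"
n_observations = 1000    # "Number of observations"

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Because the total is itself rounded to one decimal place, the result only matches the reported value to that precision.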

Variable types

Text: 1
Categorical: 3
Numeric: 2

Dataset

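Description: Open public data related to the work of the Securitized Assets Department of the Korea Housing Finance Corporation (source data, cleared for release, from the databases related to that department's work)
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15073148/fileData.do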

Alerts

BASIS_MM has constant value "202009" (Constant)
RAMT_CNT is highly overall correlated with LOAN_RAMT (High correlation)
LOAN_RAMT is highly overall correlated with RAMT_CNT (High correlation)
LOAN_RAMT has unique values (Unique)
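
The Constant and Unique alerts follow from the distinct counts reported for each variable further down. The sketch below is one plausible reading of those two rules, not ydata-profiling's own implementation; the function name `distinct_count_alerts` is hypothetical, and the correlation alerts are deliberately left out because their threshold is not stated in this report.

```python
def distinct_count_alerts(n_rows, n_distinct):
    """Return the distinct-count alerts that would apply to one column."""
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")   # every row holds the same value
    if n_distinct == n_rows:
        alerts.append("Unique")     # no value is repeated
    return alerts

# "Distinct" figures as reported for each variable in this report.
distinct_counts = {
    "LIQD_PLAN_CD": 83,
    "TREAT_ORG_CD": 24,
    "PROD_GRP_CD": 4,
    "BASIS_MM": 1,
    "RAMT_CNT": 458,
    "LOAN_RAMT": 1000,
}

alerts = {name: distinct_count_alerts(1000, d) for name, d in distinct_counts.items()}
for name, found in alerts.items():
    print(name, found)
```

Under these assumptions only BASIS_MM and LOAN_RAMT are flagged, which agrees with the alert list.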

Reproduction

Analysis started: 2023-12-12 21:12:41.363982
Analysis finished: 2023-12-12 21:12:42.408618
Duration: 1.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

LIQD_PLAN_CD
Text

Distinct: 83
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 14000
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
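
As an illustration of those character properties, the sketch below classifies the characters of one value from this column with Python's standard `unicodedata` module. It covers only the general category (not script or block), and the choice of sample value is arbitrary.

```python
import unicodedata
from collections import Counter

sample = "KHFCMB2017S-26"   # first sample value of this column

# General category per character: Lu = Uppercase Letter,
# Nd = Decimal Number, Pd = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)

print(len(sample))
print(dict(categories))
```

This single value contains 7 uppercase letters, 6 decimal digits and 1 dash, the same 7 : 6 : 1 split as the 7000 / 6000 / 1000 column totals reported for the 1000 rows below.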

Unique

Unique: 5
Unique (%): 0.5%

Sample

1st row: KHFCMB2017S-26
2nd row: KHFCMB2017S-26
3rd row: KHFCMB2017S-26
4th row: KHFCMB2017S-26
5th row: KHFCMB2017S-26

Value              Count  Frequency (%)
khfcmb2017s-23        31   3.1%
khfcmb2016s-26        29   2.9%
khfcmb2016s-14        24   2.4%
khfcmb2016s-17        23   2.3%
khfcmb2016s-07        22   2.2%
khfcmb2016s-21        22   2.2%
khfcmb2017s-15        22   2.2%
khfcmb2015s-23        22   2.2%
khfcmb2016s-06        22   2.2%
khfcmb2016s-03        21   2.1%
Other values (73)    762  76.2%

Most occurring characters

Value             Count  Frequency (%)
1                  1537  11.0%
2                  1461  10.4%
0                  1392   9.9%
K                  1000   7.1%
F                  1000   7.1%
C                  1000   7.1%
M                  1000   7.1%
B                  1000   7.1%
H                  1000   7.1%
-                  1000   7.1%
Other values (9)   2610  18.6%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter   7000  50.0%
Decimal Number     6000  42.9%
Dash Punctuation   1000   7.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1       1537  25.6%
2       1461  24.3%
0       1392  23.2%
6        402   6.7%
5        356   5.9%
7        273   4.5%
4        216   3.6%
3        199   3.3%
9         86   1.4%
8         78   1.3%

Uppercase Letter

Value  Count  Frequency (%)
K       1000  14.3%
F       1000  14.3%
C       1000  14.3%
M       1000  14.3%
B       1000  14.3%
H       1000  14.3%
S        981  14.0%
L         19   0.3%

Dash Punctuation

Value  Count  Frequency (%)
-       1000  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common   7000  50.0%
Latin    7000  50.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
1       1537  22.0%
2       1461  20.9%
0       1392  19.9%
-       1000  14.3%
6        402   5.7%
5        356   5.1%
7        273   3.9%
4        216   3.1%
3        199   2.8%
9         86   1.2%

Latin

Value  Count  Frequency (%)
K       1000  14.3%
F       1000  14.3%
C       1000  14.3%
M       1000  14.3%
B       1000  14.3%
H       1000  14.3%
S        981  14.0%
L         19   0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  14000  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
1                  1537  11.0%
2                  1461  10.4%
0                  1392   9.9%
K                  1000   7.1%
F                  1000   7.1%
C                  1000   7.1%
M                  1000   7.1%
B                  1000   7.1%
H                  1000   7.1%
-                  1000   7.1%
Other values (9)   2610  18.6%

TREAT_ORG_CD
Categorical

Distinct: 24
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value              Count
B003                 107
B023                 103
B020                  98
B004                  95
B088                  80
Other values (19)    517

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: B020
2nd row: B020
3rd row: B010
4th row: B010
5th row: B007

Common Values

Value              Count  Frequency (%)
B003                 107  10.7%
B023                 103  10.3%
B020                  98   9.8%
B004                  95   9.5%
B088                  80   8.0%
B081                  80   8.0%
B010                  78   7.8%
B039                  71   7.1%
B027                  46   4.6%
B032                  45   4.5%
Other values (14)    197  19.7%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
b003                 107  10.7%
b023                 103  10.3%
b020                  98   9.8%
b004                  95   9.5%
b088                  80   8.0%
b081                  80   8.0%
b010                  78   7.8%
b039                  71   7.1%
b027                  46   4.6%
b032                  45   4.5%
Other values (14)    197  19.7%

PROD_GRP_CD
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
C        455
T        401
M        141
P          3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: C
3rd row: T
4th row: C
5th row: T

Common Values

Value  Count  Frequency (%)
C        455  45.5%
T        401  40.1%
M        141  14.1%
P          3   0.3%
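
The Frequency (%) column in these tables is each count divided by the number of observations. A small sketch of that calculation for PROD_GRP_CD, assuming the denominator is the sum of the listed counts:

```python
# Counts from the PROD_GRP_CD "Common Values" table.
counts = {"C": 455, "T": 401, "M": 141, "P": 3}
total = sum(counts.values())

# Share of each category, in percent of all rows.
shares = {value: 100 * count / total for value, count in counts.items()}

for value, count in counts.items():
    print(f"{value}  {count:>4}  {shares[value]:.1f}%")
```

The counts add up to the 1000 observations, so there is no "Other values" remainder for this variable.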

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
c        455  45.5%
t        401  40.1%
m        141  14.1%
p          3   0.3%

BASIS_MM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value   Count
202009   1000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202009
2nd row: 202009
3rd row: 202009
4th row: 202009
5th row: 202009

Common Values

Value   Count  Frequency (%)
202009   1000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
202009   1000  100.0%

RAMT_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 458
Distinct (%): 45.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 385.468
Minimum: 1
Maximum: 9664
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 11
Median: 87
Q3: 363
95th percentile: 1704
Maximum: 9664
Range: 9663
Interquartile range (IQR): 352

Descriptive statistics

Standard deviation: 915.86309
Coefficient of variation (CV): 2.375977
Kurtosis: 44.859759
Mean: 385.468
Median Absolute Deviation (MAD): 84
Skewness: 5.8564933
Sum: 385468
Variance: 838805.19
Monotonicity: Not monotonic
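
Several of the RAMT_CNT figures above are derived from others. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes, using the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared); it is a consistency check on the printed values, not a description of how ydata-profiling computes them internally.

```python
# Values copied from the RAMT_CNT statistics in this report.
mean, std = 385.468, 915.86309
q1, q3 = 11, 363
minimum, maximum = 1, 9664
n = 1000                      # number of observations

iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum
cv = std / mean               # coefficient of variation
variance = std ** 2
total = mean * n              # sum of all values

print(iqr, value_range)
print(f"{cv:.6f}  {variance:.2f}  {total:.0f}")
```

The recomputed values agree with the report up to the rounding of the inputs.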
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
1                      54   5.4%
3                      37   3.7%
2                      35   3.5%
4                      28   2.8%
5                      19   1.9%
9                      18   1.8%
6                      17   1.7%
7                      14   1.4%
11                     14   1.4%
8                      14   1.4%
Other values (448)    750  75.0%

Value  Count  Frequency (%)
1         54  5.4%
2         35  3.5%
3         37  3.7%
4         28  2.8%
5         19  1.9%
6         17  1.7%
7         14  1.4%
8         14  1.4%
9         18  1.8%
10         7  0.7%

Value  Count  Frequency (%)
9664       1  0.1%
9078       1  0.1%
8699       1  0.1%
8312       1  0.1%
8200       1  0.1%
8028       1  0.1%
7464       1  0.1%
4095       1  0.1%
4070       1  0.1%
3834       1  0.1%

LOAN_RAMT
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0473143 × 10^10
Minimum: 388920
Maximum: 5.91171 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 388920
5th percentile: 80990299
Q1: 8.0807957 × 10^8
Median: 6.0203335 × 10^9
Q3: 2.8098742 × 10^10
95th percentile: 1.5408615 × 10^11
Maximum: 5.91171 × 10^11
Range: 5.9117061 × 10^11
Interquartile range (IQR): 2.7290662 × 10^10

Descriptive statistics

Standard deviation: 6.6295212 × 10^10
Coefficient of variation (CV): 2.1755292
Kurtosis: 24.516749
Mean: 3.0473143 × 10^10
Median Absolute Deviation (MAD): 5.8012095 × 10^9
Skewness: 4.3587193
Sum: 3.0473143 × 10^13
Variance: 4.3950551 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
4088681210              1   0.1%
605784252               1   0.1%
9452144700              1   0.1%
671760312               1   0.1%
5427380143              1   0.1%
17811887304             1   0.1%
21761758208             1   0.1%
121513000000            1   0.1%
7401801930              1   0.1%
15363605602             1   0.1%
Other values (990)    990  99.0%

Value     Count  Frequency (%)
388920        1  0.1%
1978481       1  0.1%
2321930       1  0.1%
3308459       1  0.1%
3974950       1  0.1%
4878496       1  0.1%
7156247       1  0.1%
7753990       1  0.1%
10562701      1  0.1%
13178542      1  0.1%

Value         Count  Frequency (%)
591171000000      1  0.1%
547796000000      1  0.1%
543419000000      1  0.1%
530904000000      1  0.1%
530186000000      1  0.1%
486051000000      1  0.1%
452068000000      1  0.1%
336997000000      1  0.1%
290859000000      1  0.1%
286155000000      1  0.1%

Interactions


Correlations

              LIQD_PLAN_CD  TREAT_ORG_CD  PROD_GRP_CD  RAMT_CNT  LOAN_RAMT
LIQD_PLAN_CD         1.000         0.317        0.918     0.440      0.329
TREAT_ORG_CD         0.317         1.000        0.715     0.252      0.270
PROD_GRP_CD          0.918         0.715        1.000     0.231      0.178
RAMT_CNT             0.440         0.252        0.231     1.000      0.915
LOAN_RAMT            0.329         0.270        0.178     0.915      1.000

              TREAT_ORG_CD  PROD_GRP_CD
TREAT_ORG_CD         1.000        0.415
PROD_GRP_CD          0.415        1.000

              RAMT_CNT  LOAN_RAMT  TREAT_ORG_CD  PROD_GRP_CD
RAMT_CNT         1.000      0.987         0.091        0.105
LOAN_RAMT        0.987      1.000         0.105        0.115
TREAT_ORG_CD     0.091      0.105         1.000        0.415
PROD_GRP_CD      0.105      0.115         0.415        1.000
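
To pick out the strongest relationships from a matrix like the first one above, one can scan its upper triangle for coefficients at or above a cutoff. The sketch below does this with a cutoff of 0.9, which is an arbitrary value chosen for illustration and not the threshold ydata-profiling uses for its High correlation alert; `pairs_above` is a hypothetical helper.

```python
names = ["LIQD_PLAN_CD", "TREAT_ORG_CD", "PROD_GRP_CD", "RAMT_CNT", "LOAN_RAMT"]

# First correlation matrix of this report, row by row.
matrix = [
    [1.000, 0.317, 0.918, 0.440, 0.329],
    [0.317, 1.000, 0.715, 0.252, 0.270],
    [0.918, 0.715, 1.000, 0.231, 0.178],
    [0.440, 0.252, 0.231, 1.000, 0.915],
    [0.329, 0.270, 0.178, 0.915, 1.000],
]

def pairs_above(names, matrix, cutoff):
    """List (a, b, coefficient) for each distinct pair at or above the cutoff."""
    found = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle only
            if abs(matrix[i][j]) >= cutoff:
                found.append((names[i], names[j], matrix[i][j]))
    return found

strong_pairs = pairs_above(names, matrix, cutoff=0.9)
for a, b, r in strong_pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cutoff the scan returns two pairs: LIQD_PLAN_CD with PROD_GRP_CD, and RAMT_CNT with LOAN_RAMT.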

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row  LIQD_PLAN_CD    TREAT_ORG_CD  PROD_GRP_CD  BASIS_MM  RAMT_CNT and LOAN_RAMT (fused)
0    KHFCMB2017S-26  B020          M            202009    494088681210
1    KHFCMB2017S-26  B020          C            202009    47247986564042
2    KHFCMB2017S-26  B010          T            202009    58991544890410
3    KHFCMB2017S-26  B010          C            202009    2113350000
4    KHFCMB2017S-26  B007          T            202009    5654795464
5    KHFCMB2017S-26  B007          C            202009    11913401240479
6    KHFCMB2017S-26  B005          T            202009    3366835271
7    KHFCMB2017S-26  B004          T            202009    61662752465484
8    KHFCMB2017S-26  B004          C            202009    322260792179
9    KHFCMB2017S-26  B003          T            202009    10510500558038

Row  LIQD_PLAN_CD    TREAT_ORG_CD  PROD_GRP_CD  BASIS_MM  RAMT_CNT and LOAN_RAMT (fused)
990  KHFCMB2010S-16  B003          M            202009    2217588541
991  KHFCMB2012S-09  B027          C            202009    21968472060
992  KHFCMB2012S-09  B023          C            202009    873422004121
993  KHFCMB2010S-13  I001          M            202009    147181810
994  KHFCMB2010S-13  B088          M            202009    4192666994
995  KHFCMB2010S-13  B081          M            202009    24701461173
996  KHFCMB2010S-13  B039          M            202009    236408503
997  KHFCMB2010S-13  B023          M            202009    8715804165
998  KHFCMB2010S-13  B020          M            202009    13446890933
999  KHFCMB2010S-13  B010          M            202009    9144556804