Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 1000 |
Missing cells | 650 |
Missing cells (%) | 10.8% |
Duplicate rows | 44 |
Duplicate rows (%) | 4.4% |
Total size in memory | 51.9 KiB |
Average record size in memory | 53.1 B |
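The percentages and the average record size above follow directly from the raw counts. A minimal sketch of that arithmetic in plain Python, using only figures from this table (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 1000
missing_cells = 650
duplicate_rows = 44
total_size_kib = 51.9

# Missing cells are measured against every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size: total memory in bytes divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Note that the missing-cell percentage is relative to all 6,000 cells, so it is much lower than the per-column missing rates reported further down.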
Variable types
Numeric | 4 |
---|---|
Categorical | 1 |
DateTime | 1 |
Dataset
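Description | Public open data related to the work of the Securitized Assets Department of the Korea Housing Finance Corporation (source data that can be made public from the database related to that department's work) |
---|---|
Author | Korea Housing Finance Corporation (한국주택금융공사) |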
URL | https://www.data.go.kr/data/15073197/fileData.do |
Alerts
Dataset has 44 (4.4%) duplicate rows | Duplicates |
BASIS_DY is highly overall correlated with SEQ and 2 other fields | High correlation |
SEQ is highly overall correlated with BASIS_DY and 2 other fields | High correlation |
DISPOS_DY is highly overall correlated with FMLY_RELTN_CD | High correlation |
TELGRM_MAKE_DY is highly overall correlated with BASIS_DY and 2 other fields | High correlation |
FMLY_RELTN_CD is highly overall correlated with BASIS_DY and 3 other fields | High correlation |
FMLY_RELTN_CD is highly imbalanced (98.6%) | Imbalance |
DISPOS_DY has 631 (63.1%) missing values | Missing |
TELGRM_MAKE_DY has 19 (1.9%) missing values | Missing |
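The 98.6% imbalance reported for FMLY_RELTN_CD is consistent with scoring imbalance as one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition (an assumption about how the figure is derived, not something the report states) and uses the counts from the FMLY_RELTN_CD Common Values table. `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no imbalance to measure
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)  # Shannon entropy in bits
    return 1 - entropy / math.log2(k)

# Counts for the values <NA>, 5 and 1.
score = imbalance_score([998, 1, 1])
print(f"{score:.1%}")
```

With 998 of 1,000 rows holding the same value, the column carries almost no information, which is also why its correlation entries with other fields should be read with caution.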
Reproduction
Analysis started | 2023-12-12 16:43:04.412216 |
---|---|
Analysis finished | 2023-12-12 16:43:07.137232 |
Duration | 2.73 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
BASIS_DY
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 272 |
---|---|
Distinct (%) | 27.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20117750 |
Minimum | 20100902 |
---|---|
Maximum | 20150105 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.9 KiB |
Quantile statistics
Minimum | 20100902 |
---|---|
5-th percentile | 20101025 |
Q1 | 20110207 |
median | 20111219 |
Q3 | 20130624 |
95-th percentile | 20131127 |
Maximum | 20150105 |
Range | 49203 |
Interquartile range (IQR) | 20417 |
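BASIS_DY appears to store dates as YYYYMMDD integers (inferred from the value range; the report types the column as a real number), so the Range and IQR above are differences between integers, not elapsed time. A small sketch of the distinction, using the reported minimum and maximum. `parse_yyyymmdd` is an illustrative helper, not part of the report.

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20100902 as the calendar date 2010-09-02."""
    return datetime.strptime(str(value), "%Y%m%d").date()

minimum, maximum = 20100902, 20150105  # from the quantile table

# What the profiler reports as "Range": plain integer subtraction.
integer_range = maximum - minimum

# The actual time span between the two dates.
elapsed_days = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days

print(integer_range)  # 49203
print(elapsed_days)   # 1586
```

The same caveat applies to the mean, standard deviation, and interpolated percentiles of this column: they are computed on the integer encoding and need not correspond to real dates.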
Descriptive statistics
Standard deviation | 11598.707 |
---|---|
Coefficient of variation (CV) | 0.00057654094 |
Kurtosis | -1.3183842 |
Mean | 20117750 |
Median Absolute Deviation (MAD) | 10010 |
Skewness | 0.056871662 |
Sum | 2.011775 × 10¹⁰ |
Variance | 1.3453 × 10⁸ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
20101125 | 28 | 2.8% |
20130417 | 24 | 2.4% |
20110210 | 20 | 2.0% |
20110113 | 18 | 1.8% |
20110120 | 17 | 1.7% |
20131002 | 16 | 1.6% |
20130722 | 14 | 1.4% |
20130624 | 14 | 1.4% |
20101118 | 12 | 1.2% |
20110127 | 12 | 1.2% |
Other values (262) | 825 | 82.5% |
Value | Count | Frequency (%) |
20100902 | 2 | 0.2% |
20100906 | 5 | 0.5% |
20100909 | 11 | 1.1% |
20100913 | 8 | 0.8% |
20100930 | 5 | 0.5% |
20101004 | 5 | 0.5% |
20101011 | 4 | 0.4% |
20101018 | 6 | 0.6% |
20101025 | 6 | 0.6% |
20101028 | 8 | 0.8% |
Value | Count | Frequency (%) |
20150105 | 2 | 0.2% |
20141229 | 2 | 0.2% |
20141222 | 5 | 0.5% |
20141203 | 3 | 0.3% |
20140811 | 3 | 0.3% |
20140101 | 1 | 0.1% |
20131230 | 2 | 0.2% |
20131227 | 2 | 0.2% |
20131223 | 2 | 0.2% |
20131220 | 6 | 0.6% |
SEQ
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 83 |
---|---|
Distinct (%) | 8.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 19.505 |
Minimum | 0 |
---|---|
Maximum | 928 |
Zeros | 2 |
Zeros (%) | 0.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 6 |
Q3 | 15 |
95-th percentile | 51 |
Maximum | 928 |
Range | 928 |
Interquartile range (IQR) | 12 |
Descriptive statistics
Standard deviation | 79.996183 |
---|---|
Coefficient of variation (CV) | 4.1013168 |
Kurtosis | 105.2114 |
Mean | 19.505 |
Median Absolute Deviation (MAD) | 4 |
Skewness | 10.027157 |
Sum | 19505 |
Variance | 6399.3894 |
Monotonicity | Not monotonic |
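The descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. A short sketch that checks the reported SEQ figures against those identities (values copied from the tables above, tolerances chosen to allow for rounding in the report):

```python
import math

# Reported values for SEQ.
n = 1000
mean = 19.505
std = 79.996183
reported_cv = 4.1013168
reported_variance = 6399.3894
reported_sum = 19505

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

assert math.isclose(cv, reported_cv, rel_tol=1e-6)
assert math.isclose(variance, reported_variance, rel_tol=1e-6)
assert math.isclose(total, reported_sum, rel_tol=1e-9)
```

A CV above 4, together with a median of 6 against a mean of 19.505, points to a long right tail, which matches the high skewness and the handful of values above 500 in the extreme-value table.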
Value | Count | Frequency (%) |
2 | 144 | 14.4% |
3 | 97 | 9.7% |
4 | 90 | 9.0% |
5 | 78 | 7.8% |
1 | 77 | 7.7% |
7 | 57 | 5.7% |
6 | 52 | 5.2% |
8 | 34 | 3.4% |
10 | 31 | 3.1% |
15 | 28 | 2.8% |
Other values (73) | 312 | 31.2% |
Value | Count | Frequency (%) |
0 | 2 | 0.2% |
1 | 77 | 7.7% |
2 | 144 | 14.4% |
3 | 97 | 9.7% |
4 | 90 | 9.0% |
5 | 78 | 7.8% |
6 | 52 | 5.2% |
7 | 57 | 5.7% |
8 | 34 | 3.4% |
9 | 27 | 2.7% |
Value | Count | Frequency (%) |
928 | 1 | 0.1% |
927 | 1 | 0.1% |
926 | 1 | 0.1% |
925 | 1 | 0.1% |
924 | 1 | 0.1% |
923 | 1 | 0.1% |
568 | 1 | 0.1% |
567 | 1 | 0.1% |
566 | 1 | 0.1% |
565 | 1 | 0.1% |
FMLY_RELTN_CD
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 3 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Value | Count |
---|---|
<NA> | 998 |
5 | 1 |
1 | 1 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.994 |
Min length | 1 |
Unique
Unique | 2 |
---|---|
Unique (%) | 0.2% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 998 | 99.8% |
5 | 1 | 0.1% |
1 | 1 | 0.1% |
DISPOS_DY
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 250 |
---|---|
Distinct (%) | 67.8% |
Missing | 631 |
Missing (%) | 63.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 19936563 |
Minimum | 84.844 |
---|---|
Maximum | 20130628 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.9 KiB |
Quantile statistics
Minimum | 84.844 |
---|---|
5-th percentile | 19950342 |
Q1 | 20020325 |
median | 20060801 |
Q3 | 20080509 |
95-th percentile | 20110412 |
Maximum | 20130628 |
Range | 20130543 |
Interquartile range (IQR) | 60184 |
Descriptive statistics
Standard deviation | 1474528.6 |
---|---|
Coefficient of variation (CV) | 0.073961022 |
Kurtosis | 181.58045 |
Mean | 19936563 |
Median Absolute Deviation (MAD) | 29571 |
Skewness | -13.50542 |
Sum | 7.3565918 × 10⁹ |
Variance | 2.1742346 × 10¹² |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
20070907.0 | 42 | 4.2% |
19950829.0 | 7 | 0.7% |
20110831.0 | 6 | 0.6% |
20061113.0 | 4 | 0.4% |
20040503.0 | 4 | 0.4% |
20031230.0 | 4 | 0.4% |
20060823.0 | 4 | 0.4% |
20081210.0 | 4 | 0.4% |
20060214.0 | 4 | 0.4% |
20061020.0 | 3 | 0.3% |
Other values (240) | 287 | 28.7% |
(Missing) | 631 | 63.1% |
Value | Count | Frequency (%) |
84.844 | 1 | 0.1% |
99.44 | 1 | 0.1% |
19920814.0 | 1 | 0.1% |
19920929.0 | 1 | 0.1% |
19921007.0 | 1 | 0.1% |
19921231.0 | 1 | 0.1% |
19930307.0 | 2 | 0.2% |
19930701.0 | 2 | 0.2% |
19931112.0 | 1 | 0.1% |
19940218.0 | 1 | 0.1% |
Value | Count | Frequency (%) |
20130628.0 | 1 | 0.1% |
20120913.0 | 1 | 0.1% |
20120705.0 | 1 | 0.1% |
20111216.0 | 1 | 0.1% |
20111208.0 | 2 | 0.2% |
20111031.0 | 1 | 0.1% |
20111028.0 | 2 | 0.2% |
20111017.0 | 1 | 0.1% |
20110920.0 | 1 | 0.1% |
20110831.0 | 6 | 0.6% |
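DISPOS_DY mixes date-like values (for example 19920814.0) with two values that cannot be YYYYMMDD dates (84.844 and 99.44), and interpolated statistics such as the 5-th percentile 19950342 are not calendar dates either. Assuming the column is meant to hold YYYYMMDD dates, which the report does not state, a validation pass along these lines would separate the two groups. `to_date_or_none` is a hypothetical helper.

```python
import math
from datetime import date, datetime
from typing import Optional

def to_date_or_none(value: float) -> Optional[date]:
    """Return the calendar date encoded as YYYYMMDD, or None if the value is not one."""
    if not math.isfinite(value):
        return None  # missing values (NaN) and infinities
    if value != int(value):
        return None  # fractional values such as 84.844 cannot encode a date
    text = str(int(value))
    if len(text) != 8:
        return None
    try:
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        return None  # e.g. 19950342: there is no 42nd day of March

# Values taken from the DISPOS_DY tables above.
samples = [84.844, 99.44, 19920814.0, 19950342.0, 20130628.0]
for v in samples:
    print(v, to_date_or_none(v))
```

Rows where the helper returns None for a non-missing value are candidates for review at the source, since they also distort the minimum, range, skewness, and kurtosis reported for this column.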
REG_DT
Date
Distinct | 633 |
---|---|
Distinct (%) | 63.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 7.9 KiB |
Minimum | 2010-09-07 11:59:43 |
---|---|
Maximum | 2015-01-06 11:23:03 |
TELGRM_MAKE_DY
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 255 |
---|---|
Distinct (%) | 26.0% |
Missing | 19 |
Missing (%) | 1.9% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20117641 |
Minimum | 20100907 |
---|---|
Maximum | 20141231 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.9 KiB |
Quantile statistics
Minimum | 20100907 |
---|---|
5-th percentile | 20101101 |
Q1 | 20110208 |
median | 20111201 |
Q3 | 20130618 |
95-th percentile | 20131126 |
Maximum | 20141231 |
Range | 40324 |
Interquartile range (IQR) | 20410 |
Descriptive statistics
Standard deviation | 11456.704 |
---|---|
Coefficient of variation (CV) | 0.00056948546 |
Kurtosis | -1.3797833 |
Mean | 20117641 |
Median Absolute Deviation (MAD) | 9994 |
Skewness | 0.049704113 |
Sum | 1.9735406 × 10¹⁰ |
Variance | 1.3125607 × 10⁸ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
20101129 | 34 | 3.4% |
20110215 | 27 | 2.7% |
20130418 | 24 | 2.4% |
20100914 | 19 | 1.9% |
20110117 | 18 | 1.8% |
20110124 | 17 | 1.7% |
20131004 | 16 | 1.6% |
20130723 | 14 | 1.4% |
20130625 | 14 | 1.4% |
20110111 | 12 | 1.2% |
Other values (245) | 786 | 78.6% |
(Missing) | 19 | 1.9% |
Value | Count | Frequency (%) |
20100907 | 2 | 0.2% |
20100909 | 5 | 0.5% |
20100914 | 19 | 1.9% |
20101004 | 5 | 0.5% |
20101007 | 5 | 0.5% |
20101022 | 6 | 0.6% |
20101026 | 6 | 0.6% |
20101101 | 8 | 0.8% |
20101115 | 8 | 0.8% |
20101116 | 2 | 0.2% |
Value | Count | Frequency (%) |
20141231 | 2 | 0.2% |
20141223 | 5 | 0.5% |
20141204 | 3 | 0.3% |
20140812 | 3 | 0.3% |
20140103 | 1 | 0.1% |
20131230 | 2 | 0.2% |
20131227 | 6 | 0.6% |
20131224 | 2 | 0.2% |
20131219 | 2 | 0.2% |
20131217 | 2 | 0.2% |
Correlations
 | BASIS_DY | SEQ | FMLY_RELTN_CD | DISPOS_DY | TELGRM_MAKE_DY |
---|---|---|---|---|---|
BASIS_DY | 1.000 | 0.208 | NaN | 0.000 | 0.986 |
SEQ | 0.208 | 1.000 | NaN | 0.000 | 0.225 |
FMLY_RELTN_CD | NaN | NaN | 1.000 | NaN | NaN |
DISPOS_DY | 0.000 | 0.000 | NaN | 1.000 | NaN |
TELGRM_MAKE_DY | 0.986 | 0.225 | NaN | NaN | 1.000 |
 | BASIS_DY | SEQ | DISPOS_DY | TELGRM_MAKE_DY | FMLY_RELTN_CD |
---|---|---|---|---|---|
BASIS_DY | 1.000 | -0.700 | 0.188 | 1.000 | 1.000 |
SEQ | -0.700 | 1.000 | -0.187 | -0.695 | 1.000 |
DISPOS_DY | 0.188 | -0.187 | 1.000 | 0.202 | 1.000 |
TELGRM_MAKE_DY | 1.000 | -0.695 | 0.202 | 1.000 | 1.000 |
FMLY_RELTN_CD | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
First rows
 | BASIS_DY | SEQ | FMLY_RELTN_CD | DISPOS_DY | REG_DT | TELGRM_MAKE_DY |
---|---|---|---|---|---|---|
0 | 20150105 | 2 | <NA> | <NA> | 2015/01/06 11:23:03 | <NA> |
1 | 20150105 | 2 | <NA> | <NA> | 2015/01/06 11:23:02 | <NA> |
2 | 20141229 | 1 | <NA> | <NA> | 2014/12/31 11:04:32 | 20141231 |
3 | 20141229 | 1 | <NA> | <NA> | 2014/12/31 11:04:31 | 20141231 |
4 | 20141222 | 1 | <NA> | 20100531.0 | 2014/12/23 11:07:21 | 20141223 |
5 | 20141222 | 1 | <NA> | <NA> | 2014/12/23 11:07:00 | 20141223 |
6 | 20141222 | 1 | <NA> | 20070131.0 | 2014/12/23 11:07:00 | 20141223 |
7 | 20141222 | 1 | <NA> | 19980311.0 | 2014/12/23 11:07:00 | 20141223 |
8 | 20141222 | 1 | <NA> | <NA> | 2014/12/23 11:06:59 | 20141223 |
9 | 20141203 | 2 | <NA> | <NA> | 2014/12/04 11:13:35 | 20141204 |
Last rows
 | BASIS_DY | SEQ | FMLY_RELTN_CD | DISPOS_DY | REG_DT | TELGRM_MAKE_DY |
---|---|---|---|---|---|---|
990 | 20100909 | 70 | <NA> | 20020416.0 | 2010/09/14 11:35:58 | 20100914 |
991 | 20100909 | 69 | <NA> | <NA> | 2010/09/14 11:35:58 | 20100914 |
992 | 20100909 | 68 | <NA> | 19960506.0 | 2010/09/14 11:35:58 | 20100914 |
993 | 20100909 | 67 | <NA> | 20070718.0 | 2010/09/14 11:35:58 | 20100914 |
994 | 20100909 | 66 | <NA> | <NA> | 2010/09/14 11:35:58 | 20100914 |
995 | 20100909 | 65 | <NA> | <NA> | 2010/09/14 11:35:58 | 20100914 |
996 | 20100909 | 64 | <NA> | 20021002.0 | 2010/09/14 11:35:58 | 20100914 |
997 | 20100909 | 63 | <NA> | 20041213.0 | 2010/09/14 11:35:58 | 20100914 |
998 | 20100909 | 62 | <NA> | 20080509.0 | 2010/09/14 11:35:58 | 20100914 |
999 | 20100902 | 0 | 1 | 20000306.0 | 2010/09/07 11:59:43 | 20100907 |
Most frequently occurring
 | BASIS_DY | SEQ | FMLY_RELTN_CD | DISPOS_DY | REG_DT | TELGRM_MAKE_DY | # duplicates |
---|---|---|---|---|---|---|---|
23 | 20130624 | 3 | <NA> | 19950829.0 | 2013/06/25 11:29:19 | 20130625 | 5 |
0 | 20110127 | 15 | <NA> | 20040503.0 | 2011/01/31 15:55:02 | 20110131 | 3 |
3 | 20110210 | 37 | <NA> | <NA> | 2011/02/14 11:08:20 | 20110215 | 3 |
5 | 20110210 | 37 | <NA> | <NA> | 2011/02/14 11:08:22 | 20110215 | 3 |
6 | 20110221 | 17 | <NA> | <NA> | 2011/02/22 18:33:18 | 20110222 | 3 |
7 | 20110608 | 15 | <NA> | <NA> | 2011/06/09 11:10:56 | 20110609 | 3 |
10 | 20111019 | 4 | <NA> | <NA> | 2011/10/20 11:32:19 | 20111024 | 3 |
1 | 20110210 | 32 | <NA> | <NA> | 2011/02/14 11:08:24 | 20110215 | 2 |
2 | 20110210 | 37 | <NA> | <NA> | 2011/02/14 11:08:19 | 20110215 | 2 |
4 | 20110210 | 37 | <NA> | <NA> | 2011/02/14 11:08:21 | 20110215 | 2 |