Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B
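
As a quick consistency check on the figures above, the total size in memory should be roughly the average record size multiplied by the number of observations. The sketch below uses only the reported values; the variable names are illustrative and are not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 6
n_observations = 99
avg_record_bytes = 55.3  # reported average record size, in bytes

# Total number of cells, i.e. the denominator behind "Missing cells (%)".
total_cells = n_variables * n_observations

# Approximate total size in KiB (1 KiB = 1024 bytes).
total_kib = n_observations * avg_record_bytes / 1024

print(total_cells)
print(f"{total_kib:.1f} KiB")
```

The result agrees with the reported total once rounded to one decimal place.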

Variable types

Categorical: 3
Numeric: 3

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020210

Alerts

BS_YR_MON has constant value "201912" (Constant)
AGE_CD is highly overall correlated with SUM_LN_BAL (High correlation)
POP_CNT is highly overall correlated with SUM_LN_BAL (High correlation)
SUM_LN_BAL is highly overall correlated with AGE_CD and 1 other field (High correlation)
SUM_LN_BAL has unique values (Unique)
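
The Constant and Unique alerts are consistent with the distinct counts reported for each variable further down. The sketch below assumes a simple rule: a column is constant when it has exactly one distinct value and unique when every observation is distinct. This rule and the helper name `classify` are illustrative assumptions, not the profiler's actual implementation.

```python
# Distinct counts per variable, copied from the Variables section of this report.
N_OBSERVATIONS = 99
distinct_counts = {
    "BS_YR_MON": 1,
    "PRV_CD": 5,
    "GENDER": 2,
    "AGE_CD": 11,
    "POP_CNT": 96,
    "SUM_LN_BAL": 99,
}


def classify(n_distinct: int, n_rows: int) -> str:
    """Hypothetical rule: label a column from its distinct count."""
    if n_distinct == 1:
        return "Constant"
    if n_distinct == n_rows:
        return "Unique"
    return ""


alerts = {
    name: classify(n, N_OBSERVATIONS)
    for name, n in distinct_counts.items()
    if classify(n, N_OBSERVATIONS)
}
print(alerts)
```

Under this assumed rule, only BS_YR_MON and SUM_LN_BAL are flagged, which matches the alerts listed above.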

Reproduction

Analysis started: 2023-12-11 22:34:59.269903
Analysis finished: 2023-12-11 22:35:00.491371
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
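
The reported duration follows directly from the two timestamps. A minimal sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-11 22:34:59.269903", FMT)
finished = datetime.strptime("2023-12-11 22:35:00.491371", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```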

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value   Count  Frequency (%)
201912  99     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

PRV_CD
Categorical

Distinct: 5
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11110
2nd row: 11110
3rd row: 11110
4th row: 11110
5th row: 11110

Common Values

Value  Count  Frequency (%)
11110  22     22.2%
11140  22     22.2%
11170  22     22.2%
11200  22     22.2%
11215  11     11.1%
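
The percentages in the Common Values table are the counts divided by the 99 observations, and Distinct (%) is the number of distinct values over the same total. A small sketch, with the counts entered by hand from the table above and illustrative variable names:

```python
from collections import Counter

# PRV_CD counts as listed in the Common Values table.
counts = Counter({"11110": 22, "11140": 22, "11170": 22, "11200": 22, "11215": 11})
total = sum(counts.values())

# Frequency of each value as a percentage, rounded to one decimal place.
freq_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

# Share of distinct values among all observations.
distinct_pct = round(100 * len(counts) / total, 1)

print(total, freq_pct, distinct_pct)
```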

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

GENDER
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      55     55.6%
2      44     44.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

AGE_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.636364
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25
5th percentile: 25
Q1: 35
Median: 50
Q3: 65
95th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.346644
Coefficient of variation (CV): 0.30918147
Kurtosis: -1.2980204
Mean: 49.636364
Median Absolute Deviation (MAD): 15
Skewness: -0.092468713
Sum: 4914
Variance: 235.51948
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=11)]

Common values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Minimum 10 values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Maximum 10 values

Value  Count  Frequency (%)
71     9      9.1%
70     9      9.1%
65     9      9.1%
60     9      9.1%
55     9      9.1%
50     9      9.1%
45     9      9.1%
40     9      9.1%
35     9      9.1%
30     9      9.1%
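
Because AGE_CD takes only eleven values (25 to 70 in steps of 5, plus 71), each occurring 9 times, the whole column can be rebuilt from the frequency tables and its summary statistics recomputed. The sketch below uses Python's standard `statistics` module. It assumes the reported variance and standard deviation are sample statistics (n - 1 denominator) and that the quartiles use inclusive linear interpolation; both assumptions reproduce the reported figures here.

```python
import statistics

# The eleven distinct AGE_CD values; each appears 9 times in the frequency tables.
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 71]
data = sorted(age for age in ages for _ in range(9))

total = sum(data)
mean = statistics.fmean(data)
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(data)        # sample standard deviation
median = statistics.median(data)

# Quartiles with inclusive linear interpolation between neighbouring values.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median([abs(x - median) for x in data])

print(total, round(mean, 6), round(variance, 5), round(stdev, 6))
print(median, q1, q3, q3 - q1, mad)
```

Skewness and kurtosis are left out because the standard library has no direct function for them.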

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 96
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 854.41414
Minimum: 35
Maximum: 2186
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 35
5th percentile: 71.8
Q1: 498.5
Median: 811
Q3: 1193
95th percentile: 1858
Maximum: 2186
Range: 2151
Interquartile range (IQR): 694.5

Descriptive statistics

Standard deviation: 518.8596
Coefficient of variation (CV): 0.60726944
Kurtosis: -0.2896325
Mean: 854.41414
Median Absolute Deviation (MAD): 346
Skewness: 0.50519806
Sum: 84587
Variance: 269215.29
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
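
Several POP_CNT figures are derived from others in the same block: the range from the minimum and maximum, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the mean from the sum. The sketch below checks those relationships using only the reported values; the variable names are illustrative.

```python
# Values copied from the POP_CNT statistics above.
n = 99
minimum, maximum = 35, 2186
q1, q3 = 498.5, 1193
mean, std = 854.41414, 518.8596
total = 84587

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
mean_from_sum = total / n        # Mean recomputed from the sum

print(value_range, iqr, round(cv, 6), round(mean_from_sum, 5))
```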

Common values

Value              Count  Frequency (%)
1858               2      2.0%
777                2      2.0%
567                2      2.0%
47                 1      1.0%
1232               1      1.0%
1930               1      1.0%
1602               1      1.0%
1691               1      1.0%
872                1      1.0%
411                1      1.0%
Other values (86)  86     86.9%

Minimum 10 values

Value  Count  Frequency (%)
35     1      1.0%
43     1      1.0%
44     1      1.0%
47     1      1.0%
52     1      1.0%
74     1      1.0%
80     1      1.0%
81     1      1.0%
87     1      1.0%
204    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
2186   1      1.0%
2040   1      1.0%
2012   1      1.0%
1930   1      1.0%
1858   2      2.0%
1829   1      1.0%
1691   1      1.0%
1610   1      1.0%
1602   1      1.0%
1555   1      1.0%

SUM_LN_BAL
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 96633077
Minimum: 167279
Maximum: 2.855001 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 167279
5th percentile: 723402.6
Q1: 35609735
Median: 84018977
Q3: 1.4272992 × 10^8
95th percentile: 2.3738833 × 10^8
Maximum: 2.855001 × 10^8
Range: 2.8533283 × 10^8
Interquartile range (IQR): 1.0712018 × 10^8

Descriptive statistics

Standard deviation: 74253447
Coefficient of variation (CV): 0.76840611
Kurtosis: -0.53720982
Mean: 96633077
Median Absolute Deviation (MAD): 54274017
Skewness: 0.58621078
Sum: 9.5666746 × 10^9
Variance: 5.5135745 × 10^15
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
757588             1      1.0%
155302422          1      1.0%
202615164          1      1.0%
237962154          1      1.0%
234092241          1      1.0%
205325186          1      1.0%
226508212          1      1.0%
84941146           1      1.0%
13626140           1      1.0%
723627             1      1.0%
Other values (89)  89     89.9%

Minimum 10 values

Value    Count  Frequency (%)
167279   1      1.0%
197251   1      1.0%
214065   1      1.0%
446743   1      1.0%
721383   1      1.0%
723627   1      1.0%
757588   1      1.0%
932432   1      1.0%
2211873  1      1.0%
4230108  1      1.0%

Maximum 10 values

Value      Count  Frequency (%)
285500105  1      1.0%
255503820  1      1.0%
242748400  1      1.0%
242484604  1      1.0%
237962154  1      1.0%
237324568  1      1.0%
234640756  1      1.0%
234092241  1      1.0%
226508212  1      1.0%
205325186  1      1.0%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

            PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_LN_BAL
PRV_CD      1.000   0.204   0.000   0.689    0.538
GENDER      0.204   1.000   0.000   0.397    0.509
AGE_CD      0.000   0.000   1.000   0.744    0.720
POP_CNT     0.689   0.397   0.744   1.000    0.873
SUM_LN_BAL  0.538   0.509   0.720   0.873    1.000

[Figure: correlation heatmap]

        PRV_CD  GENDER
PRV_CD  1.000   0.245
GENDER  0.245   1.000

[Figure: correlation heatmap]

            AGE_CD  POP_CNT  SUM_LN_BAL  PRV_CD  GENDER
AGE_CD      1.000   0.361    0.574       0.000   0.000
POP_CNT     0.361   1.000    0.893       0.344   0.291
SUM_LN_BAL  0.574   0.893    1.000       0.241   0.370
PRV_CD      0.000   0.344    0.241       1.000   0.245
GENDER      0.000   0.291    0.370       0.245   1.000
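
The High correlation alerts near the top of the report line up with the last matrix above if a pair is flagged once its coefficient reaches 0.5. That cut-off is an assumption made for this sketch, not a value stated in the report, and the code only illustrates the thresholding idea rather than the profiler's own logic.

```python
from itertools import combinations

# Last correlation matrix of this section, entered by hand.
names = ["AGE_CD", "POP_CNT", "SUM_LN_BAL", "PRV_CD", "GENDER"]
matrix = [
    [1.000, 0.361, 0.574, 0.000, 0.000],
    [0.361, 1.000, 0.893, 0.344, 0.291],
    [0.574, 0.893, 1.000, 0.241, 0.370],
    [0.000, 0.344, 0.241, 1.000, 0.245],
    [0.000, 0.291, 0.370, 0.245, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

# Visit each unordered pair once and keep those at or above the threshold.
high_pairs = [
    (names[i], names[j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

print(high_pairs)
```

With this threshold, the flagged pairs are AGE_CD with SUM_LN_BAL and POP_CNT with SUM_LN_BAL, the same pairs named in the alerts.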

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_LN_BAL
0   201912     11110   1       25      47       757588
1   201912     11110   1       30      225      5436771
2   201912     11110   1       35      521      27710484
3   201912     11110   1       40      734      62684726
4   201912     11110   1       45      777      129186547
5   201912     11110   1       50      997      105383136
6   201912     11110   1       55      1022     116259168
7   201912     11110   1       60      1120     144205047
8   201912     11110   1       65      923      138292994
9   201912     11110   1       70      567      79484684

Last rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_LN_BAL
89  201912     11215   1       30      535      13366457
90  201912     11215   1       35      1049     55815531
91  201912     11215   1       40      1531     119324740
92  201912     11215   1       45      1610     144190873
93  201912     11215   1       50      2012     198909694
94  201912     11215   1       55      2040     194101186
95  201912     11215   1       60      2186     237324568
96  201912     11215   1       65      1829     189403270
97  201912     11215   1       70      966      141268958
98  201912     11215   1       71      954      123926181