Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B
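
The two memory figures are tied together by a simple division, which can be checked as a rough sketch. It assumes 1 KiB = 1024 bytes and that the total shown was rounded to one decimal place, so the recomputed average only agrees approximately with the reported one.

```python
# Figures copied from the Dataset statistics table.
total_kib = 5.3             # total size in memory (rounded in the report)
n_rows = 99                 # number of observations
reported_avg_bytes = 55.3   # average record size in memory

# Forward check: rounded total divided by the row count.
avg_bytes = total_kib * 1024 / n_rows

# Backward check: reported average times the row count, back in KiB.
implied_total_kib = reported_avg_bytes * n_rows / 1024

print(f"average from rounded total: {avg_bytes:.1f} B")
print(f"total implied by average:   {implied_total_kib:.3f} KiB")
```

The small gap between the recomputed average and 55.3 B is consistent with rounding: the implied total still rounds to 5.3 KiB.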

Variable types

Categorical: 2
Numeric: 4

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020213

Alerts

BS_YR_MON has constant value "201912" (Constant)
POP_CNT is highly overall correlated with SUM_LN_BAL and 1 other field (High correlation)
SUM_LN_BAL is highly overall correlated with POP_CNT (High correlation)
GENDER is highly overall correlated with POP_CNT (High correlation)
SUM_LN_BAL has 3 (3.0%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-21 01:30:23.609869
Analysis finished: 2024-04-21 01:30:27.824625
Duration: 4.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
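
The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-21 01:30:23.609869")
finished = datetime.fromisoformat("2024-04-21 01:30:27.824625")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 4.21 seconds
```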

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B
Value counts: 201912 (99)

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value     Count   Frequency (%)
201912    99      100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values (201912: 99, 100.0%)]

PRV_CD
Real number (ℝ)

Distinct: 7
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11185.455
Minimum: 11110
Maximum: 11260
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 11110
5-th percentile: 11110
Q1: 11140
Median: 11200
Q3: 11215
95-th percentile: 11233
Maximum: 11260
Range: 150
Interquartile range (IQR): 75

Descriptive statistics

Standard deviation: 44.199749
Coefficient of variation (CV): 0.003951538
Kurtosis: -0.96669026
Mean: 11185.455
Median Absolute Deviation (MAD): 30
Skewness: -0.38196219
Sum: 1107360
Variance: 1953.6178
Monotonicity: Increasing
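
Because PRV_CD has only seven distinct values, its quantile and descriptive statistics can be recomputed from the frequency table in this section. The sketch below uses only the Python standard library. It assumes the report uses the sample variance (n - 1 denominator) and linear interpolation between order statistics for percentiles; both assumptions are consistent with the figures shown here.

```python
import statistics

# Value counts copied from the PRV_CD frequency table in this report.
counts = {11110: 14, 11140: 13, 11170: 16, 11200: 17,
          11215: 16, 11230: 18, 11260: 5}

# Expand the table back into the 99 individual observations.
values = sorted(v for v, c in counts.items() for _ in range(c))

mean = statistics.fmean(values)
variance = statistics.variance(values)   # sample variance (n - 1)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# method="inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(f"sum={sum(values)} mean={mean:.3f} var={variance:.4f} std={std:.6f}")
print(f"cv={cv:.9f} mad={mad} q1={q1} median={median} q3={q3}")
print(f"p5={p5} p95={p95} iqr={q3 - q1} range={max(values) - min(values)}")
```

The same approach works for any low-cardinality numeric column whose full frequency table appears in the report.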
[Figure: histogram with fixed-size bins (bins=7)]

Common values (sorted by count)

Value    Count   Frequency (%)
11230    18      18.2%
11200    17      17.2%
11170    16      16.2%
11215    16      16.2%
11110    14      14.1%
11140    13      13.1%
11260    5       5.1%

Smallest values (ascending)

Value    Count   Frequency (%)
11110    14      14.1%
11140    13      13.1%
11170    16      16.2%
11200    17      17.2%
11215    16      16.2%
11230    18      18.2%
11260    5       5.1%

Largest values (descending)

Value    Count   Frequency (%)
11260    5       5.1%
11230    18      18.2%
11215    16      16.2%
11200    17      17.2%
11170    16      16.2%
11140    13      13.1%
11110    14      14.1%

GENDER
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B
Value counts: 1 (64), 2 (35)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       64      64.6%
2       35      35.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values (1: 64, 64.6%; 2: 35, 35.4%)]

AGE_CD
Real number (ℝ)

Distinct: 10
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.070707
Minimum: 30
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 30
5-th percentile: 30
Q1: 45
Median: 55
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 41
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.097166
Coefficient of variation (CV): 0.24678711
Kurtosis: -1.1315423
Mean: 53.070707
Median Absolute Deviation (MAD): 10
Skewness: -0.19934397
Sum: 5254
Variance: 171.53577
Monotonicity: Not monotonic
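
The skewness and kurtosis figures for AGE_CD can be reproduced from the frequency table in this section. The sketch below assumes the report uses the bias-corrected sample estimators (the adjusted Fisher-Pearson skewness and the bias-corrected excess kurtosis, which is what pandas' `skew` and `kurt` compute). That choice is an assumption, although it does agree with the values shown above.

```python
import math

# Value counts copied from the AGE_CD frequency table in this report.
counts = {30: 8, 35: 7, 40: 9, 45: 11, 50: 13,
          55: 10, 60: 11, 65: 11, 70: 10, 71: 9}

values = [v for v, c in counts.items() for _ in range(c)]
n = len(values)
mean = sum(values) / n


def central_moment(k):
    """Biased k-th central moment (divides by n)."""
    return sum((v - mean) ** k for v in values) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

# Adjusted Fisher-Pearson skewness: g1 scaled by sqrt(n(n-1)) / (n-2).
skewness = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)

# Bias-corrected excess kurtosis built from g2 = m4 / m2^2 - 3.
g2 = m4 / m2 ** 2 - 3
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(f"n={n} mean={mean:.6f}")
print(f"skewness={skewness:.6f} kurtosis={kurtosis:.6f}")
```

Both values are negative: the distribution leans slightly toward the higher age codes and is flatter than a normal distribution.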
[Figure: histogram with fixed-size bins (bins=10)]

Common values (sorted by count)

Value   Count   Frequency (%)
50      13      13.1%
45      11      11.1%
60      11      11.1%
65      11      11.1%
55      10      10.1%
70      10      10.1%
40      9       9.1%
71      9       9.1%
30      8       8.1%
35      7       7.1%

Smallest values (ascending)

Value   Count   Frequency (%)
30      8       8.1%
35      7       7.1%
40      9       9.1%
45      11      11.1%
50      13      13.1%
55      10      10.1%
60      11      11.1%
65      11      11.1%
70      10      10.1%
71      9       9.1%

Largest values (descending)

Value   Count   Frequency (%)
71      9       9.1%
70      10      10.1%
65      11      11.1%
60      11      11.1%
55      10      10.1%
50      13      13.1%
45      11      11.1%
40      9       9.1%
35      7       7.1%
30      8       8.1%

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 53
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.070707
Minimum: 3
Maximum: 185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 5.5
Median: 18
Q3: 45.5
95-th percentile: 131.8
Maximum: 185
Range: 182
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 44.41209
Coefficient of variation (CV): 1.2312509
Kurtosis: 2.2997154
Mean: 36.070707
Median Absolute Deviation (MAD): 14
Skewness: 1.7280239
Sum: 3571
Variance: 1972.4337
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values (sorted by count)

Value               Count   Frequency (%)
3                   10      10.1%
4                   9       9.1%
6                   6       6.1%
5                   6       6.1%
11                  5       5.1%
9                   3       3.0%
18                  3       3.0%
24                  2       2.0%
130                 2       2.0%
45                  2       2.0%
Other values (43)   51      51.5%

Smallest values (ascending)

Value   Count   Frequency (%)
3       10      10.1%
4       9       9.1%
5       6       6.1%
6       6       6.1%
7       2       2.0%
8       2       2.0%
9       3       3.0%
10      1       1.0%
11      5       5.1%
13      2       2.0%

Largest values (descending)

Value   Count   Frequency (%)
185     1       1.0%
184     1       1.0%
159     1       1.0%
150     1       1.0%
148     1       1.0%
130     2       2.0%
129     1       1.0%
117     1       1.0%
110     1       1.0%
105     1       1.0%

SUM_LN_BAL
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 97
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1928966.4
Minimum: 0
Maximum: 10091411
Zeros: 3
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 0
5-th percentile: 29750.3
Q1: 309323
Median: 818092
Q3: 2749857
95-th percentile: 6765305.4
Maximum: 10091411
Range: 10091411
Interquartile range (IQR): 2440534

Descriptive statistics

Standard deviation: 2281677.3
Coefficient of variation (CV): 1.1828497
Kurtosis: 2.203342
Mean: 1928966.4
Median Absolute Deviation (MAD): 767478
Skewness: 1.6172294
Sum: 1.9096767 × 10^8
Variance: 5.2060511 × 10^12
Monotonicity: Not monotonic
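
The Sum and Variance entries for SUM_LN_BAL are in scientific notation, with powers of ten of 8 and 12 respectively. A quick consistency check against the mean and standard deviation in this section confirms those magnitudes. The inputs are the rounded figures from the report, so agreement is only expected to within rounding.

```python
# Figures copied from the SUM_LN_BAL statistics in this report.
n = 99
mean = 1928966.4
std = 2281677.3
cv_reported = 1.1828497

total = mean * n        # should be close to 1.9096767e8
variance = std ** 2     # should be close to 5.2060511e12
cv = std / mean         # should be close to the reported CV

print(f"sum      = {total:.7e}")
print(f"variance = {variance:.7e}")
print(f"cv       = {cv:.7f}")
```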
[Figure: histogram with fixed-size bins (bins=50)]

Common values (sorted by count)

Value               Count   Frequency (%)
0                   3       3.0%
36196               1       1.0%
1389355             1       1.0%
91072               1       1.0%
594322              1       1.0%
386455              1       1.0%
33509               1       1.0%
5149079             1       1.0%
3732660             1       1.0%
7335830             1       1.0%
Other values (87)   87      87.9%

Smallest values (ascending)

Value      Count   Frequency (%)
0          3       3.0%
2470       1       1.0%
28646      1       1.0%
29873      1       1.0%
33509      1       1.0%
36196      1       1.0%
50614      1       1.0%
60211      1       1.0%
73914      1       1.0%
79270      1       1.0%

Largest values (descending)

Value      Count   Frequency (%)
10091411   1       1.0%
9020465    1       1.0%
8773742    1       1.0%
7335830    1       1.0%
6892794    1       1.0%
6751140    1       1.0%
6613142    1       1.0%
5953590    1       1.0%
5922112    1       1.0%
5739670    1       1.0%

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

             PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_LN_BAL
PRV_CD       1.000    0.000    0.000    0.139     0.000
GENDER       0.000    1.000    0.000    0.782     0.515
AGE_CD       0.000    0.000    1.000    0.060     0.000
POP_CNT      0.139    0.782    0.060    1.000     0.883
SUM_LN_BAL   0.000    0.515    0.000    0.883     1.000

[Figure: correlation heatmap]

             PRV_CD   AGE_CD   POP_CNT   SUM_LN_BAL   GENDER
PRV_CD       1.000    -0.105   0.315     0.214        0.000
AGE_CD       -0.105   1.000    0.236     0.310        0.000
POP_CNT      0.315    0.236    1.000     0.901        0.591
SUM_LN_BAL   0.214    0.310    0.901     1.000        0.497
GENDER       0.000    0.000    0.591     0.497        1.000
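
The High correlation alerts listed near the top of this report can be related to the second correlation matrix in this section (the one with GENDER in the last column). The sketch below flags every variable pair whose absolute coefficient reaches a cutoff. The cutoff of 0.5 is an assumption made for illustration, not a value stated in the report. With it, the flagged pairs line up with the alerts: POP_CNT with SUM_LN_BAL and GENDER, but not SUM_LN_BAL with GENDER (0.497).

```python
# Second correlation matrix, rows and columns in the order shown.
names = ["PRV_CD", "AGE_CD", "POP_CNT", "SUM_LN_BAL", "GENDER"]
matrix = [
    [1.000, -0.105, 0.315, 0.214, 0.000],
    [-0.105, 1.000, 0.236, 0.310, 0.000],
    [0.315, 0.236, 1.000, 0.901, 0.591],
    [0.214, 0.310, 0.901, 1.000, 0.497],
    [0.000, 0.000, 0.591, 0.497, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

# Walk the upper triangle so each unordered pair is reported once.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, r in high_pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```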

Missing values

[Figure: nullity bar chart]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First 10 rows

      BS_YR_MON   PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_LN_BAL
0     201912      11110    1        35       4         36196
1     201912      11110    1        40       6         623165
2     201912      11110    1        45       16        520401
3     201912      11110    1        50       19        818092
4     201912      11110    1        55       21        890756
5     201912      11110    1        60       44        2479834
6     201912      11110    1        65       31        1932908
7     201912      11110    1        70       26        1828442
8     201912      11110    1        71       16        789489
9     201912      11110    2        45       3         0

Last 10 rows

      BS_YR_MON   PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_LN_BAL
89    201912      11230    2        55       11        629331
90    201912      11230    2        60       11        266834
91    201912      11230    2        65       9         150167
92    201912      11230    2        70       5         109524
93    201912      11230    2        71       3         318748
94    201912      11260    1        30       18        1567929
95    201912      11260    1        35       24        818006
96    201912      11260    1        40       67        3860045
97    201912      11260    1        45       88        5085995
98    201912      11260    1        50       159       6751140