Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B

Variable types

Categorical: 3
Numeric: 3

Dataset

Description: Sample data
Author: 코리아크레딧뷰로 (Korea Credit Bureau) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020209

Alerts

BS_YR_MON has constant value "201912" (Constant)
POP_CNT is highly overall correlated with SUM_INCOME (High correlation)
SUM_INCOME is highly overall correlated with POP_CNT (High correlation)
SUM_INCOME has unique values (Unique)
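
The Constant and Unique alerts can be cross-checked against the distinct counts reported per variable. The sketch below is only an illustration: it assumes a column is "constant" when it has exactly one distinct value and "unique" when its distinct count equals the number of observations. The function name `distinct_alerts` is hypothetical and is not part of ydata-profiling.

```python
# Distinct counts as reported in the Variables section of this report.
N_OBSERVATIONS = 99
DISTINCT = {
    "BS_YR_MON": 1,
    "PRV_CD": 5,
    "GENDER": 2,
    "AGE_CD": 11,
    "POP_CNT": 96,
    "SUM_INCOME": 99,
}


def distinct_alerts(distinct, n_rows):
    """Return {column: alert} for columns that are constant or fully unique.

    Assumed rules (not taken from the profiler's source code):
    one distinct value -> "Constant"; n_rows distinct values -> "Unique".
    """
    alerts = {}
    for column, n_distinct in distinct.items():
        if n_distinct == 1:
            alerts[column] = "Constant"
        elif n_distinct == n_rows:
            alerts[column] = "Unique"
    return alerts


alerts = distinct_alerts(DISTINCT, N_OBSERVATIONS)
for column, alert in alerts.items():
    print(f"{column}: {alert}")
```

Under these assumed rules only BS_YR_MON and SUM_INCOME are flagged, which agrees with the alerts listed above. POP_CNT has 96 distinct values in 99 rows, so it is not flagged as unique.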

Reproduction

Analysis started: 2023-12-11 22:39:10.319838
Analysis finished: 2023-12-11 22:39:12.260129
Duration: 1.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

BS_YR_MON
Categorical

Alert: Constant

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value   Count  Frequency (%)
201912  99     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

PRV_CD
Categorical

Distinct: 5
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11110
2nd row: 11110
3rd row: 11110
4th row: 11110
5th row: 11110

Common Values

Value  Count  Frequency (%)
11110  22     22.2%
11140  22     22.2%
11170  22     22.2%
11200  22     22.2%
11215  11     11.1%
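
The frequency percentages are each count divided by the 99 observations. As a minimal sketch, the column can be rebuilt from the counts above and tabulated with the standard library; the helper name `frequency_table` is hypothetical.

```python
from collections import Counter

# PRV_CD rebuilt from the counts in the Common Values table (99 rows total).
prv_cd = (
    ["11110"] * 22
    + ["11140"] * 22
    + ["11170"] * 22
    + ["11200"] * 22
    + ["11215"] * 11
)


def frequency_table(values):
    """Return (value, count, percent) tuples, most frequent first."""
    total = len(values)
    return [
        (value, count, round(100 * count / total, 1))
        for value, count in Counter(values).most_common()
    ]


table = frequency_table(prv_cd)
for value, count, percent in table:
    print(f"{value}  {count}  {percent}%")
```

The same calculation gives the GENDER frequencies (55 of 99 and 44 of 99).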

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

GENDER
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      55     55.6%
2      44     44.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

AGE_CD
Real number (ℝ)

Distinct: 11
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.636364
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25
5th percentile: 25
Q1: 35
Median: 50
Q3: 65
95th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 30
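
These quantiles can be checked by hand because AGE_CD is fully described by its frequency tables: 11 distinct codes (25, 30, ..., 70, 71), each occurring 9 times. The sketch below rebuilds the column and uses linear interpolation between order statistics (`method="inclusive"` in Python's `statistics` module). That convention is an assumption made for illustration; it is not stated in the report.

```python
import statistics

# AGE_CD rebuilt from the frequency tables: 11 distinct codes, 9 rows each.
age_codes = list(range(25, 75, 5)) + [71]  # 25, 30, ..., 70, 71
age_cd = sorted(code for code in age_codes for _ in range(9))

# Quartiles with linear interpolation over the observed min..max range.
q1, median, q3 = statistics.quantiles(age_cd, n=4, method="inclusive")

# Cutting into 20 groups gives the 5th percentile first and the 95th last.
ventiles = statistics.quantiles(age_cd, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

value_range = max(age_cd) - min(age_cd)
iqr = q3 - q1

print(p5, q1, median, q3, p95, value_range, iqr)
```

Under this assumption the results match the figures above: Q1 = 35, median = 50, Q3 = 65, range = 46 and IQR = 30.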

Descriptive statistics

Standard deviation: 15.346644
Coefficient of variation (CV): 0.30918147
Kurtosis: -1.2980204
Mean: 49.636364
Median Absolute Deviation (MAD): 15
Skewness: -0.092468713
Sum: 4914
Variance: 235.51948
Monotonicity: Not monotonic
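
Several of these descriptive statistics follow directly from the frequency tables (11 codes, 9 rows each). The sketch below rebuilds AGE_CD and recomputes them with the standard library. It assumes the sample (n - 1) definitions of variance and standard deviation, which reproduce the reported figures; skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# AGE_CD rebuilt from the frequency tables: 11 distinct codes, 9 rows each.
age_codes = list(range(25, 75, 5)) + [71]  # 25, 30, ..., 70, 71
age_cd = [code for code in age_codes for _ in range(9)]

total = sum(age_cd)
mean = statistics.mean(age_cd)
variance = statistics.variance(age_cd)  # sample variance, divides by n - 1
std = statistics.stdev(age_cd)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median distance from the median.
med = statistics.median(age_cd)
mad = statistics.median(abs(x - med) for x in age_cd)

print(f"sum={total} mean={mean:.6f} variance={variance:.5f}")
print(f"std={std:.6f} cv={cv:.8f} mad={mad}")
```

With these definitions the sum is 4914, the variance about 235.51948, and the MAD 15, in agreement with the values listed above.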

[Figure: Histogram with fixed size bins (bins=11)]

Common Values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Minimum 10 values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Maximum 10 values

Value  Count  Frequency (%)
71     9      9.1%
70     9      9.1%
65     9      9.1%
60     9      9.1%
55     9      9.1%
50     9      9.1%
45     9      9.1%
40     9      9.1%
35     9      9.1%
30     9      9.1%

POP_CNT
Real number (ℝ)

Alert: High correlation

Distinct: 96
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 854.41414
Minimum: 35
Maximum: 2186
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 35
5th percentile: 71.8
Q1: 498.5
Median: 811
Q3: 1193
95th percentile: 1858
Maximum: 2186
Range: 2151
Interquartile range (IQR): 694.5

Descriptive statistics

Standard deviation: 518.8596
Coefficient of variation (CV): 0.60726944
Kurtosis: -0.2896325
Mean: 854.41414
Median Absolute Deviation (MAD): 346
Skewness: 0.50519806
Sum: 84587
Variance: 269215.29
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
1858               2      2.0%
777                2      2.0%
567                2      2.0%
47                 1      1.0%
1232               1      1.0%
1930               1      1.0%
1602               1      1.0%
1691               1      1.0%
872                1      1.0%
411                1      1.0%
Other values (86)  86     86.9%

Minimum 10 values

Value  Count  Frequency (%)
35     1      1.0%
43     1      1.0%
44     1      1.0%
47     1      1.0%
52     1      1.0%
74     1      1.0%
80     1      1.0%
81     1      1.0%
87     1      1.0%
204    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
2186   1      1.0%
2040   1      1.0%
2012   1      1.0%
1930   1      1.0%
1858   2      2.0%
1829   1      1.0%
1691   1      1.0%
1610   1      1.0%
1602   1      1.0%
1555   1      1.0%

SUM_INCOME
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3971252.1
Minimum: 65257
Maximum: 11698070
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 65257
5th percentile: 124149.2
Q1: 1668584.5
Median: 3506572
Q3: 5468869.5
95th percentile: 10280240
Maximum: 11698070
Range: 11632813
Interquartile range (IQR): 3800285

Descriptive statistics

Standard deviation: 3026972.9
Coefficient of variation (CV): 0.7622213
Kurtosis: -0.084556729
Mean: 3971252.1
Median Absolute Deviation (MAD): 1853922
Skewness: 0.80968604
Sum: 3.9315396 × 10^8
Variance: 9.1625651 × 10^12
Monotonicity: Not monotonic
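
The derived figures for SUM_INCOME can be checked against one another using only numbers printed in this report: sum = mean × n, variance = (standard deviation)², CV = standard deviation / mean, range = maximum - minimum, and IQR = Q3 - Q1. The sketch below is a consistency check on the rounded values, not a recomputation from the raw data, so it compares with a relative tolerance. The tolerance of 1e-6 is an arbitrary choice for this illustration.

```python
import math

# Values copied from the SUM_INCOME statistics in this report.
n = 99
mean = 3971252.1
std = 3026972.9
minimum, maximum = 65257, 11698070
q1, q3 = 1668584.5, 5468869.5

derived = {
    "sum": mean * n,
    "variance": std ** 2,
    "cv": std / mean,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}

reported = {
    "sum": 3.9315396e8,
    "variance": 9.1625651e12,
    "cv": 0.7622213,
    "range": 11632813,
    "iqr": 3800285,
}

# Each derived value should agree with the reported one up to rounding.
consistent = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in derived
}
for name, ok in consistent.items():
    print(f"{name}: derived={derived[name]:.7g} reported={reported[name]:.7g} ok={ok}")
```

The check also confirms the exponents in the sum and variance (10^8 and 10^12).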

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
82943              1      1.0%
4184504            1      1.0%
10265058           1      1.0%
10781818           1      1.0%
9914831            1      1.0%
8773560            1      1.0%
8355651            1      1.0%
3794841            1      1.0%
1010680            1      1.0%
148185             1      1.0%
Other values (89)  89     89.9%

Minimum 10 values

Value   Count  Frequency (%)
65257   1      1.0%
68318   1      1.0%
70154   1      1.0%
82943   1      1.0%
92714   1      1.0%
127642  1      1.0%
148185  1      1.0%
149131  1      1.0%
160477  1      1.0%
456664  1      1.0%

Maximum 10 values

Value     Count  Frequency (%)
11698070  1      1.0%
10853814  1      1.0%
10781818  1      1.0%
10535896  1      1.0%
10416883  1      1.0%
10265058  1      1.0%
10264579  1      1.0%
10097169  1      1.0%
9914831   1      1.0%
8773560   1      1.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

            PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_INCOME
PRV_CD      1.000   0.204   0.000   0.689    0.153
GENDER      0.204   1.000   0.000   0.397    0.484
AGE_CD      0.000   0.000   1.000   0.744    0.731
POP_CNT     0.689   0.397   0.744   1.000    0.934
SUM_INCOME  0.153   0.484   0.731   0.934    1.000

[Figure: correlation heatmap]

        PRV_CD  GENDER
PRV_CD  1.000   0.245
GENDER  0.245   1.000

[Figure: correlation heatmap]

            AGE_CD  POP_CNT  SUM_INCOME  PRV_CD  GENDER
AGE_CD      1.000   0.361    0.440       0.000   0.000
POP_CNT     0.361   1.000    0.966       0.344   0.291
SUM_INCOME  0.440   0.966    1.000       0.069   0.346
PRV_CD      0.000   0.344    0.069       1.000   0.245
GENDER      0.000   0.291    0.346       0.245   1.000
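
The extracted text does not say which coefficient each matrix uses, so the sketch below is a generic illustration only. It computes Pearson's r between POP_CNT and SUM_INCOME on the 20 rows listed in the Sample section of this report (first and last rows). Because it uses 20 of the 99 observations, and possibly a different coefficient, it is not expected to reproduce 0.934 or 0.966 exactly. The function name `pearson` is hypothetical.

```python
import math

# (POP_CNT, SUM_INCOME) pairs as read from the Sample section (20 of 99 rows).
pairs = [
    (47, 82943), (225, 536321), (521, 1757800), (734, 3050231),
    (777, 3952626), (997, 5895566), (1022, 5816924), (1120, 6543789),
    (923, 4918646), (567, 2866859), (535, 1256967), (1049, 3506572),
    (1531, 6308135), (1610, 7579790), (2012, 10853814), (2040, 10416883),
    (2186, 11698070), (1829, 8659672), (966, 4652031), (954, 5193461),
]


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


pop_cnt = [p for p, _ in pairs]
sum_income = [s for _, s in pairs]
r = pearson(pop_cnt, sum_income)
print(f"Pearson r on the sample rows: {r:.3f}")
```

Even on this subset the two columns move closely together, which is in line with the High correlation alert raised for POP_CNT and SUM_INCOME.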

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_INCOME
0   201912     11110   1       25      47       82943
1   201912     11110   1       30      225      536321
2   201912     11110   1       35      521      1757800
3   201912     11110   1       40      734      3050231
4   201912     11110   1       45      777      3952626
5   201912     11110   1       50      997      5895566
6   201912     11110   1       55      1022     5816924
7   201912     11110   1       60      1120     6543789
8   201912     11110   1       65      923      4918646
9   201912     11110   1       70      567      2866859

Last rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_INCOME
89  201912     11215   1       30      535      1256967
90  201912     11215   1       35      1049     3506572
91  201912     11215   1       40      1531     6308135
92  201912     11215   1       45      1610     7579790
93  201912     11215   1       50      2012     10853814
94  201912     11215   1       55      2040     10416883
95  201912     11215   1       60      2186     11698070
96  201912     11215   1       65      1829     8659672
97  201912     11215   1       70      966      4652031
98  201912     11215   1       71      954      5193461