Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B

Variable types

Categorical: 3
Numeric: 3

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020208

Alerts

BS_YR_MON has constant value "201912" (Constant)
POP_CNT is highly overall correlated with SUM_SCORE (High correlation)
SUM_SCORE is highly overall correlated with POP_CNT (High correlation)
SUM_SCORE has unique values (Unique)
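
The Constant and Unique alerts follow from the distinct counts reported for each variable in the Variables section. A minimal sketch, assuming two simple rules (one distinct value means constant, as many distinct values as rows means unique). The helper name `flag_alerts` is hypothetical, and this is not presented as ydata-profiling's own implementation:

```python
# Distinct counts per column, copied from the Variables section of this report.
DISTINCT = {
    "BS_YR_MON": 1,
    "PRV_CD": 5,
    "GENDER": 2,
    "AGE_CD": 11,
    "POP_CNT": 96,
    "SUM_SCORE": 99,
}
N_ROWS = 99  # Number of observations


def flag_alerts(distinct, n_rows):
    """Return {column: alert} using two assumed rules (hypothetical helper)."""
    alerts = {}
    for column, n_distinct in distinct.items():
        if n_distinct == 1:
            alerts[column] = "Constant"   # a single value in every row
        elif n_distinct == n_rows:
            alerts[column] = "Unique"     # no value is repeated
    return alerts


alerts = flag_alerts(DISTINCT, N_ROWS)
print(alerts)
```

Under these assumed rules only BS_YR_MON and SUM_SCORE are flagged, which agrees with the alerts listed above.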

Reproduction

Analysis started: 2023-12-11 22:34:10.209827
Analysis finished: 2023-12-11 22:34:11.042801
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
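
A report of this kind can be regenerated with ydata-profiling. The sketch below rests on assumptions that do not come from the source: the input file name `sample.csv`, the output path, the report title, and the choice to read the three code columns as strings so that they are typed as categorical, as they are in this report.

```python
def build_report(csv_path="sample.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report (both paths are hypothetical)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: code-like columns are read as strings so that they are
    # treated as categorical rather than numeric.
    df = pd.read_csv(
        csv_path,
        dtype={"BS_YR_MON": str, "PRV_CD": str, "GENDER": str},
    )
    profile = ProfileReport(df, title="Sample data")  # title is an assumption
    profile.to_file(out_path)
    return out_path
```

Calling `build_report()` requires `pandas` and `ydata-profiling` to be installed and a CSV file with the six columns described here.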

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value   Count  Frequency (%)
201912  99     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

PRV_CD
Categorical

Distinct: 5
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11110
2nd row: 11110
3rd row: 11110
4th row: 11110
5th row: 11110

Common Values

Value  Count  Frequency (%)
11110  22     22.2%
11140  22     22.2%
11170  22     22.2%
11200  22     22.2%
11215  11     11.1%
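
The Frequency (%) column is each count divided by the 99 observations. A short check of that arithmetic, using the PRV_CD counts from the table above (the helper name `frequency_pct` is illustrative):

```python
# Counts per PRV_CD value, copied from the Common Values table.
COUNTS = {"11110": 22, "11140": 22, "11170": 22, "11200": 22, "11215": 11}


def frequency_pct(counts):
    """Map each value to its share of all rows, as a one-decimal percentage string."""
    total = sum(counts.values())
    return {value: f"{100 * n / total:.1f}%" for value, n in counts.items()}


pct = frequency_pct(COUNTS)
for value, share in pct.items():
    print(value, COUNTS[value], share)
```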

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

GENDER
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      55     55.6%
2      44     44.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

AGE_CD
Real number (ℝ)

Distinct: 11
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.636364
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25
5th percentile: 25
Q1: 35
Median: 50
Q3: 65
95th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.346644
Coefficient of variation (CV): 0.30918147
Kurtosis: -1.2980204
Mean: 49.636364
Median Absolute Deviation (MAD): 15
Skewness: -0.092468713
Sum: 4914
Variance: 235.51948
Monotonicity: Not monotonic
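
Because AGE_CD has only 11 distinct values with 9 rows each, the column can be rebuilt from its frequency tables and the statistics above recomputed. A minimal sketch using only the Python standard library. The linear-interpolation `percentile` helper is an assumption about how the quantiles were computed, not something the report states:

```python
import statistics

# 11 distinct values, 9 rows each (from the AGE_CD frequency tables).
DISTINCT_VALUES = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 71]
values = sorted(v for v in DISTINCT_VALUES for _ in range(9))


def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring ranks (assumed method)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (pos - lower) * (sorted_values[upper] - sorted_values[lower])


mean = statistics.mean(values)
stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
median = statistics.median(values)
q1, q3 = percentile(values, 0.25), percentile(values, 0.75)
mad = statistics.median(abs(v - median) for v in values)

print("Sum:", sum(values))
print("Mean:", round(mean, 6))
print("Standard deviation:", round(stdev, 6))
print("CV:", round(stdev / mean, 8))
print("Q1, median, Q3:", q1, median, q3)
print("IQR:", q3 - q1, "Range:", values[-1] - values[0], "MAD:", mad)
```

The sample (n - 1) standard deviation is used here because it is the variant that agrees with the reported 15.346644.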
[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Minimum 10 values

Value  Count  Frequency (%)
25     9      9.1%
30     9      9.1%
35     9      9.1%
40     9      9.1%
45     9      9.1%
50     9      9.1%
55     9      9.1%
60     9      9.1%
65     9      9.1%
70     9      9.1%

Maximum 10 values

Value  Count  Frequency (%)
71     9      9.1%
70     9      9.1%
65     9      9.1%
60     9      9.1%
55     9      9.1%
50     9      9.1%
45     9      9.1%
40     9      9.1%
35     9      9.1%
30     9      9.1%

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 96
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 854.41414
Minimum: 35
Maximum: 2186
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 35
5th percentile: 71.8
Q1: 498.5
Median: 811
Q3: 1193
95th percentile: 1858
Maximum: 2186
Range: 2151
Interquartile range (IQR): 694.5

Descriptive statistics

Standard deviation: 518.8596
Coefficient of variation (CV): 0.60726944
Kurtosis: -0.2896325
Mean: 854.41414
Median Absolute Deviation (MAD): 346
Skewness: 0.50519806
Sum: 84587
Variance: 269215.29
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
1858               2      2.0%
777                2      2.0%
567                2      2.0%
47                 1      1.0%
1232               1      1.0%
1930               1      1.0%
1602               1      1.0%
1691               1      1.0%
872                1      1.0%
411                1      1.0%
Other values (86)  86     86.9%

Minimum 10 values

Value  Count  Frequency (%)
35     1      1.0%
43     1      1.0%
44     1      1.0%
47     1      1.0%
52     1      1.0%
74     1      1.0%
80     1      1.0%
81     1      1.0%
87     1      1.0%
204    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
2186   1      1.0%
2040   1      1.0%
2012   1      1.0%
1930   1      1.0%
1858   2      2.0%
1829   1      1.0%
1691   1      1.0%
1610   1      1.0%
1602   1      1.0%
1555   1      1.0%

SUM_SCORE
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 728640.04
Minimum: 25568
Maximum: 1880977
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25568
5th percentile: 55189.5
Q1: 421319
Median: 698445
Q3: 1029328.5
95th percentile: 1580049
Maximum: 1880977
Range: 1855409
Interquartile range (IQR): 608009.5

Descriptive statistics

Standard deviation: 443829.49
Coefficient of variation (CV): 0.60912036
Kurtosis: -0.32538681
Mean: 728640.04
Median Absolute Deviation (MAD): 314550
Skewness: 0.47721638
Sum: 72135364
Variance: 1.9698461 × 10^11
Monotonicity: Not monotonic
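
Several of the figures above are derived from one another: range is maximum minus minimum, IQR is Q3 minus Q1, CV is standard deviation over mean, variance is the squared standard deviation, and mean is sum over the 99 observations. A quick consistency check on the reported SUM_SCORE values follows. The relative tolerance is an illustrative choice that absorbs the rounding in the report:

```python
import math

# Values as reported for SUM_SCORE in this section.
n = 99
minimum, maximum = 25568, 1880977
q1, q3 = 421319, 1029328.5
mean, std = 728640.04, 443829.49
total = 72135364

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "mean": total / n,
}
reported = {
    "range": 1855409,
    "iqr": 608009.5,
    "cv": 0.60912036,
    "variance": 1.9698461e11,
    "mean": 728640.04,
}

for name, value in derived.items():
    # Relative tolerance absorbs the rounding applied to the reported figures.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]:.8g} match={ok}")
```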
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
36015              1      1.0%
896071             1      1.0%
1659172            1      1.0%
1570029            1      1.0%
1577093            1      1.0%
1352395            1      1.0%
1426308            1      1.0%
722779             1      1.0%
326676             1      1.0%
61034              1      1.0%
Other values (89)  89     89.9%

Minimum 10 values

Value   Count  Frequency (%)
25568   1      1.0%
33972   1      1.0%
35153   1      1.0%
36015   1      1.0%
38103   1      1.0%
57088   1      1.0%
59980   1      1.0%
61034   1      1.0%
67726   1      1.0%
170479  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
1880977  1      1.0%
1707899  1      1.0%
1664314  1      1.0%
1659172  1      1.0%
1606653  1      1.0%
1577093  1      1.0%
1570029  1      1.0%
1426308  1      1.0%
1372768  1      1.0%
1352395  1      1.0%

Interactions

[Figures: nine interaction plots (images not preserved in this text)]

Correlations

[Figure: correlation heatmap]

           PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_SCORE
PRV_CD     1.000   0.204   0.000   0.689    0.694
GENDER     0.204   1.000   0.000   0.397    0.412
AGE_CD     0.000   0.000   1.000   0.744    0.751
POP_CNT    0.689   0.397   0.744   1.000    0.997
SUM_SCORE  0.694   0.412   0.751   0.997    1.000

[Figure: correlation heatmap]

        GENDER  PRV_CD
GENDER  1.000   0.245
PRV_CD  0.245   1.000

[Figure: correlation heatmap]

           AGE_CD  POP_CNT  SUM_SCORE  PRV_CD  GENDER
AGE_CD     1.000   0.361    0.390      0.000   0.000
POP_CNT    0.361   1.000    0.998      0.344   0.291
SUM_SCORE  0.390   0.998    1.000      0.349   0.298
PRV_CD     0.000   0.344    0.349      1.000   0.245
GENDER     0.000   0.291    0.298      0.245   1.000
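
A High correlation alert can be read off a correlation matrix by listing the variable pairs whose coefficient reaches a threshold. The sketch below applies a threshold of 0.5 to the last matrix above. The threshold is an assumption, since the configured value is not shown in this report, and the helper name `high_pairs` is hypothetical:

```python
# Last correlation matrix in this section, in the order it is printed.
NAMES = ["AGE_CD", "POP_CNT", "SUM_SCORE", "PRV_CD", "GENDER"]
MATRIX = [
    [1.000, 0.361, 0.390, 0.000, 0.000],
    [0.361, 1.000, 0.998, 0.344, 0.291],
    [0.390, 0.998, 1.000, 0.349, 0.298],
    [0.000, 0.344, 0.349, 1.000, 0.245],
    [0.000, 0.291, 0.298, 0.245, 1.000],
]


def high_pairs(names, matrix, threshold=0.5):
    """List (a, b, coefficient) for each pair above the diagonal at or over the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


pairs = high_pairs(NAMES, MATRIX)
print(pairs)
```

Under that assumed threshold only POP_CNT and SUM_SCORE qualify, which matches the two High correlation alerts in the Alerts section.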

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_SCORE
0   201912     11110   1       25      47       36015
1   201912     11110   1       30      225      179763
2   201912     11110   1       35      521      429246
3   201912     11110   1       40      734      603279
4   201912     11110   1       45      777      655701
5   201912     11110   1       50      997      825086
6   201912     11110   1       55      1022     866228
7   201912     11110   1       60      1120     948339
8   201912     11110   1       65      923      794509
9   201912     11110   1       70      567      502415

Last rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT  SUM_SCORE
89  201912     11215   1       30      535      425490
90  201912     11215   1       35      1049     861312
91  201912     11215   1       40      1531     1277452
92  201912     11215   1       45      1610     1332787
93  201912     11215   1       50      2012     1664314
94  201912     11215   1       55      2040     1707899
95  201912     11215   1       60      2186     1880977
96  201912     11215   1       65      1829     1606653
97  201912     11215   1       70      966      863043
98  201912     11215   1       71      954      855250