Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B

Variable types

Categorical: 2
Numeric: 4

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020212

Alerts

BS_YR_MON has constant value "201912" (Constant)
POP_CNT is highly overall correlated with SUM_SCORE and 1 other field (High correlation)
SUM_SCORE is highly overall correlated with POP_CNT and 1 other field (High correlation)
GENDER is highly overall correlated with POP_CNT and 1 other field (High correlation)
SUM_SCORE has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 22:35:30.100084
Analysis finished: 2023-12-11 22:35:31.744037
Duration: 1.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
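
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration that was actually used (that is recorded in config.json): the function names, the input path, and the report title are hypothetical, and reading BS_YR_MON and GENDER as strings is an assumption made so that they are profiled as categorical. The `dataset` dictionary is assumed to populate the Description, Author, and URL fields of the Dataset panel.

```python
def dataset_metadata() -> dict:
    """Values for the report's Dataset panel, copied from the report itself."""
    return {
        "description": "샘플 데이터",
        "author": "코리아크레딧뷰로 / 장윤상",
        "url": "https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020212",
    }


def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so dataset_metadata() works without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the two code-like columns are read as strings.
    df = pd.read_csv(csv_path, dtype={"BS_YR_MON": str, "GENDER": str})
    profile = ProfileReport(df, title="Sample data profile", dataset=dataset_metadata())
    profile.to_file(out_path)
```

Calling `build_report("sample.csv")` with a suitable file would write `report.html`; the timings and memory figures will differ from run to run.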

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Value    Count
201912   99

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value    Count   Frequency (%)
201912   99      100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

PRV_CD
Real number (ℝ)

Distinct: 7
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11185.455
Minimum: 11110
Maximum: 11260
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 11110
5-th percentile: 11110
Q1: 11140
Median: 11200
Q3: 11215
95-th percentile: 11233
Maximum: 11260
Range: 150
Interquartile range (IQR): 75

Descriptive statistics

Standard deviation: 44.199749
Coefficient of variation (CV): 0.003951538
Kurtosis: -0.96669026
Mean: 11185.455
Median Absolute Deviation (MAD): 30
Skewness: -0.38196219
Sum: 1107360
Variance: 1953.6178
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=7)]

Common values

Value   Count   Frequency (%)
11230   18      18.2%
11200   17      17.2%
11170   16      16.2%
11215   16      16.2%
11110   14      14.1%
11140   13      13.1%
11260   5       5.1%

Minimum values

Value   Count   Frequency (%)
11110   14      14.1%
11140   13      13.1%
11170   16      16.2%
11200   17      17.2%
11215   16      16.2%
11230   18      18.2%
11260   5       5.1%

Maximum values

Value   Count   Frequency (%)
11260   5       5.1%
11230   18      18.2%
11215   16      16.2%
11200   17      17.2%
11170   16      16.2%
11140   13      13.1%
11110   14      14.1%

GENDER
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Value   Count
1       64
2       35

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       64      64.6%
2       35      35.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

AGE_CD
Real number (ℝ)

Distinct: 10
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.070707
Minimum: 30
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 30
5-th percentile: 30
Q1: 45
Median: 55
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 41
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.097166
Coefficient of variation (CV): 0.24678711
Kurtosis: -1.1315423
Mean: 53.070707
Median Absolute Deviation (MAD): 10
Skewness: -0.19934397
Sum: 5254
Variance: 171.53577
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value   Count   Frequency (%)
50      13      13.1%
45      11      11.1%
60      11      11.1%
65      11      11.1%
55      10      10.1%
70      10      10.1%
40      9       9.1%
71      9       9.1%
30      8       8.1%
35      7       7.1%

Minimum values

Value   Count   Frequency (%)
30      8       8.1%
35      7       7.1%
40      9       9.1%
45      11      11.1%
50      13      13.1%
55      10      10.1%
60      11      11.1%
65      11      11.1%
70      10      10.1%
71      9       9.1%

Maximum values

Value   Count   Frequency (%)
71      9       9.1%
70      10      10.1%
65      11      11.1%
60      11      11.1%
55      10      10.1%
50      13      13.1%
45      11      11.1%
40      9       9.1%
35      7       7.1%
30      8       8.1%
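
The shape statistics for AGE_CD can be recomputed from its frequency table. The sketch below assumes that Skewness and Kurtosis are the bias-adjusted sample skewness and the bias-adjusted excess kurtosis, and that MAD is the median of absolute deviations from the median. These definitions are assumptions about the tool, not something the report states; they were chosen because they appear to agree with the reported figures. All names are illustrative.

```python
import math
import statistics

# Frequency table for AGE_CD, copied from the report: value -> count.
AGE_CD_COUNTS = {30: 8, 35: 7, 40: 9, 45: 11, 50: 13,
                 55: 10, 60: 11, 65: 11, 70: 10, 71: 9}


def central_moment(data, mean, k):
    """Biased k-th central moment: sum((x - mean)^k) / n."""
    return sum((x - mean) ** k for x in data) / len(data)


values = [v for v, count in AGE_CD_COUNTS.items() for _ in range(count)]
n = len(values)
mean = statistics.fmean(values)

# Median absolute deviation around the median.
med = statistics.median(values)
mad = statistics.median([abs(x - med) for x in values])

m2 = central_moment(values, mean, 2)
m3 = central_moment(values, mean, 3)
m4 = central_moment(values, mean, 4)

# Bias-adjusted skewness and excess kurtosis (assumed definitions).
skewness = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)

print(n, sum(values), mad)
print(round(skewness, 6), round(kurtosis, 6))
```

Both shape statistics come out negative, consistent with a slightly left-skewed distribution that is flatter than a normal one.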

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 53
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.070707
Minimum: 3
Maximum: 185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 5.5
Median: 18
Q3: 45.5
95-th percentile: 131.8
Maximum: 185
Range: 182
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 44.41209
Coefficient of variation (CV): 1.2312509
Kurtosis: 2.2997154
Mean: 36.070707
Median Absolute Deviation (MAD): 14
Skewness: 1.7280239
Sum: 3571
Variance: 1972.4337
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
3                   10      10.1%
4                   9       9.1%
6                   6       6.1%
5                   6       6.1%
11                  5       5.1%
9                   3       3.0%
18                  3       3.0%
24                  2       2.0%
130                 2       2.0%
45                  2       2.0%
Other values (43)   51      51.5%

Minimum values

Value   Count   Frequency (%)
3       10      10.1%
4       9       9.1%
5       6       6.1%
6       6       6.1%
7       2       2.0%
8       2       2.0%
9       3       3.0%
10      1       1.0%
11      5       5.1%
13      2       2.0%

Maximum values

Value   Count   Frequency (%)
185     1       1.0%
184     1       1.0%
159     1       1.0%
150     1       1.0%
148     1       1.0%
130     2       2.0%
129     1       1.0%
117     1       1.0%
110     1       1.0%
105     1       1.0%

SUM_SCORE
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30191.788
Minimum: 2017
Maximum: 162458
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 2017
5-th percentile: 2370.3
Q1: 4575.5
Median: 13810
Q3: 40226.5
95-th percentile: 117547.1
Maximum: 162458
Range: 160441
Interquartile range (IQR): 35651

Descriptive statistics

Standard deviation: 37836.757
Coefficient of variation (CV): 1.2532135
Kurtosis: 2.6061654
Mean: 30191.788
Median Absolute Deviation (MAD): 10362
Skewness: 1.7904141
Sum: 2988987
Variance: 1.4316202 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
3369                1       1.0%
40139               1       1.0%
9583                1       1.0%
9107                1       1.0%
11794               1       1.0%
2398                1       1.0%
55202               1       1.0%
98546               1       1.0%
161000              1       1.0%
124207              1       1.0%
Other values (89)   89      89.9%

Minimum values

Value   Count   Frequency (%)
2017    1       1.0%
2109    1       1.0%
2277    1       1.0%
2315    1       1.0%
2355    1       1.0%
2372    1       1.0%
2398    1       1.0%
2408    1       1.0%
2461    1       1.0%
2747    1       1.0%

Maximum values

Value    Count   Frequency (%)
162458   1       1.0%
161000   1       1.0%
129447   1       1.0%
124207   1       1.0%
123443   1       1.0%
116892   1       1.0%
109475   1       1.0%
108290   1       1.0%
98546    1       1.0%
95382    1       1.0%
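
Several of the SUM_SCORE figures are derived from others, so they can be checked for internal consistency without the raw data. The short sketch below does that arithmetic with the numbers printed in this report; the variable names are illustrative, and small differences in the last digit are expected because the inputs are already rounded.

```python
# Summary figures for SUM_SCORE, copied from the report.
n_obs = 99
total = 2988987
minimum, maximum = 2017, 162458
q1, q3 = 4575.5, 40226.5
std = 37836.757

mean = total / n_obs             # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
variance = std ** 2              # Variance = (standard deviation)^2
cv = std / mean                  # CV = standard deviation / mean

print(f"mean={mean:.3f} range={value_range} iqr={iqr}")
print(f"variance={variance:.7e} cv={cv:.7f}")
```

A coefficient of variation above 1 means the standard deviation exceeds the mean, which fits the strong right skew reported for this variable.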

Interactions

[Figures: 16 pairwise interaction plots; images not reproduced here]

Correlations

[Figure: correlation heatmap]

            PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_SCORE
PRV_CD      1.000    0.000    0.000    0.139     0.132
GENDER      0.000    1.000    0.000    0.782     0.598
AGE_CD      0.000    0.000    1.000    0.060     0.000
POP_CNT     0.139    0.782    0.060    1.000     0.929
SUM_SCORE   0.132    0.598    0.000    0.929     1.000

[Figure: correlation heatmap]

            PRV_CD   AGE_CD   POP_CNT   SUM_SCORE   GENDER
PRV_CD      1.000    -0.105   0.315     0.304       0.000
AGE_CD      -0.105   1.000    0.236     0.263       0.000
POP_CNT     0.315    0.236    1.000     0.997       0.591
SUM_SCORE   0.304    0.263    0.997     1.000       0.585
GENDER      0.000    0.000    0.591     0.585       1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

     BS_YR_MON   PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_SCORE
0    201912      11110    1        35       4         3369
1    201912      11110    1        40       6         4730
2    201912      11110    1        45       16        12067
3    201912      11110    1        50       19        14847
4    201912      11110    1        55       21        17733
5    201912      11110    1        60       44        37132
6    201912      11110    1        65       31        25984
7    201912      11110    1        70       26        23400
8    201912      11110    1        71       16        14289
9    201912      11110    2        45       3         2747

Last rows

     BS_YR_MON   PRV_CD   GENDER   AGE_CD   POP_CNT   SUM_SCORE
89   201912      11230    2        55       11        8027
90   201912      11230    2        60       11        9610
91   201912      11230    2        65       9         8229
92   201912      11230    2        70       5         3689
93   201912      11230    2        71       3         2461
94   201912      11260    1        30       18        13800
95   201912      11260    1        35       24        18707
96   201912      11260    1        40       67        53113
97   201912      11260    1        45       88        64697
98   201912      11260    1        50       159       123443