Overview

Dataset statistics

Number of variables: 5
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 46.3 B

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020211

Alerts

BS_YR_MON has constant value "201912" (Constant)
POP_CNT is highly overall correlated with GENDER (High correlation)
GENDER is highly overall correlated with POP_CNT (High correlation)

Reproduction

Analysis started: 2024-01-14 00:46:02.997565
Analysis finished: 2024-01-14 00:46:08.667053
Duration: 5.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
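
A report of this kind can be regenerated with ydata-profiling, the software named above. The sketch below is a minimal example under stated assumptions: the input path `sample.csv`, the output path `report.html`, the report title, and the choice to read BS_YR_MON and GENDER as strings (so that they are profiled as categorical, as in this report) are all hypothetical and not taken from the report itself. The third-party imports are deferred into the function so that the snippet can be loaded without the libraries installed.

```python
def build_report(csv_path="sample.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report.

    csv_path and out_path are hypothetical placeholders.
    """
    # pandas and ydata-profiling are third-party packages, imported lazily.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: code-like columns are read as strings so that they are
    # treated as categorical rather than numeric.
    df = pd.read_csv(csv_path, dtype={"BS_YR_MON": str, "GENDER": str})

    profile = ProfileReport(df, title="Sample data profile")
    profile.to_file(out_path)
    return out_path
```

Calling `build_report()` with a real CSV path would write the HTML report to `out_path`; the exact configuration used for the report shown here is in the downloadable config.json, which is not reproduced in this text.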

Variables

BS_YR_MON
Categorical

Alert: Constant

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B
Category counts: 201912 (99)

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 201912
3rd row: 201912
4th row: 201912
5th row: 201912

Common Values

Value   Count  Frequency (%)
201912  99     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

PRV_CD
Real number (ℝ)

Distinct: 7
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11185.455
Minimum: 11110
Maximum: 11260
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 11110
5-th percentile: 11110
Q1: 11140
Median: 11200
Q3: 11215
95-th percentile: 11233
Maximum: 11260
Range: 150
Interquartile range (IQR): 75
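
The quantiles reported for PRV_CD can be checked against the value counts listed for this variable. The sketch below rebuilds the 99 observations from those counts and computes quantiles with linear interpolation between neighbouring ranks. The interpolation rule is an assumption (the report does not state it), and the helper names `expand` and `quantile` are illustrative.

```python
# Value counts for PRV_CD as listed in this report (value -> count).
PRV_CD_COUNTS = {11110: 14, 11140: 13, 11170: 16, 11200: 17,
                 11215: 16, 11230: 18, 11260: 5}


def expand(counts):
    """Rebuild the sorted list of observations from a value -> count mapping."""
    return [value for value in sorted(counts) for _ in range(counts[value])]


def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks (assumed rule)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)  # rank at or just below the target position
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


values = expand(PRV_CD_COUNTS)
q1, median, q3 = (quantile(values, q) for q in (0.25, 0.50, 0.75))
p95 = quantile(values, 0.95)
iqr = q3 - q1
value_range = values[-1] - values[0]
```

Under this rule the 95th percentile falls at rank position 93.1, one tenth of the way from 11230 to 11260, which gives the reported 11233.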

Descriptive statistics

Standard deviation: 44.199749
Coefficient of variation (CV): 0.003951538
Kurtosis: -0.96669026
Mean: 11185.455
Median Absolute Deviation (MAD): 30
Skewness: -0.38196219
Sum: 1107360
Variance: 1953.6178
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=7)]
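
Several of the descriptive statistics for PRV_CD follow directly from the value counts listed for this variable. The sketch below uses only the Python standard library. It assumes the variance is the sample variance (divisor n - 1) and that MAD is the median of absolute deviations from the median; both assumptions reproduce the reported figures but are not stated in the report.

```python
import math
import statistics

# Value counts for PRV_CD as listed in this report (value -> count).
PRV_CD_COUNTS = {11110: 14, 11140: 13, 11170: 16, 11200: 17,
                 11215: 16, 11230: 18, 11260: 5}

# Rebuild the 99 observations from the counts.
values = [v for v in sorted(PRV_CD_COUNTS) for _ in range(PRV_CD_COUNTS[v])]

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation

center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)
```

The very small CV (about 0.004) reflects that the codes cluster tightly around 11185, with a spread that is small relative to their magnitude.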

Common values

Value  Count  Frequency (%)
11230  18     18.2%
11200  17     17.2%
11170  16     16.2%
11215  16     16.2%
11110  14     14.1%
11140  13     13.1%
11260  5      5.1%

Smallest values

Value  Count  Frequency (%)
11110  14     14.1%
11140  13     13.1%
11170  16     16.2%
11200  17     17.2%
11215  16     16.2%
11230  18     18.2%
11260  5      5.1%

Largest values

Value  Count  Frequency (%)
11260  5      5.1%
11230  18     18.2%
11215  16     16.2%
11200  17     17.2%
11170  16     16.2%
11140  13     13.1%
11110  14     14.1%

GENDER
Categorical

Alert: High correlation

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 920.0 B
Category counts: 1 (64), 2 (35)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      64     64.6%
2      35     35.4%
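
The Frequency (%) column is each count divided by the number of observations. The sketch below recomputes it for GENDER from the two counts above; the function name `frequency_table` is illustrative, and one-decimal rounding is assumed to match the report's display.

```python
# Category counts for GENDER as listed in this report.
GENDER_COUNTS = {"1": 64, "2": 35}


def frequency_table(counts):
    """Return (value, count, share) rows ordered by descending count."""
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(value, count, f"{100 * count / total:.1f}%") for value, count in rows]


table = frequency_table(GENDER_COUNTS)
for value, count, share in table:
    print(value, count, share)
```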

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

AGE_CD
Real number (ℝ)

Distinct: 10
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.070707
Minimum: 30
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 30
5-th percentile: 30
Q1: 45
Median: 55
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 41
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.097166
Coefficient of variation (CV): 0.24678711
Kurtosis: -1.1315423
Mean: 53.070707
Median Absolute Deviation (MAD): 10
Skewness: -0.19934397
Sum: 5254
Variance: 171.53577
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value  Count  Frequency (%)
50     13     13.1%
45     11     11.1%
60     11     11.1%
65     11     11.1%
55     10     10.1%
70     10     10.1%
40     9      9.1%
71     9      9.1%
30     8      8.1%
35     7      7.1%

Smallest values

Value  Count  Frequency (%)
30     8      8.1%
35     7      7.1%
40     9      9.1%
45     11     11.1%
50     13     13.1%
55     10     10.1%
60     11     11.1%
65     11     11.1%
70     10     10.1%
71     9      9.1%

Largest values

Value  Count  Frequency (%)
71     9      9.1%
70     10     10.1%
65     11     11.1%
60     11     11.1%
55     10     10.1%
50     13     13.1%
45     11     11.1%
40     9      9.1%
35     7      7.1%
30     8      8.1%

POP_CNT
Real number (ℝ)

Alert: High correlation

Distinct: 53
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.070707
Minimum: 3
Maximum: 185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 5.5
Median: 18
Q3: 45.5
95-th percentile: 131.8
Maximum: 185
Range: 182
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 44.41209
Coefficient of variation (CV): 1.2312509
Kurtosis: 2.2997154
Mean: 36.070707
Median Absolute Deviation (MAD): 14
Skewness: 1.7280239
Sum: 3571
Variance: 1972.4337
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
3                  10     10.1%
4                  9      9.1%
6                  6      6.1%
5                  6      6.1%
11                 5      5.1%
9                  3      3.0%
18                 3      3.0%
24                 2      2.0%
130                2      2.0%
45                 2      2.0%
Other values (43)  51     51.5%

Smallest values

Value  Count  Frequency (%)
3      10     10.1%
4      9      9.1%
5      6      6.1%
6      6      6.1%
7      2      2.0%
8      2      2.0%
9      3      3.0%
10     1      1.0%
11     5      5.1%
13     2      2.0%

Largest values

Value  Count  Frequency (%)
185    1      1.0%
184    1      1.0%
159    1      1.0%
150    1      1.0%
148    1      1.0%
130    2      2.0%
129    1      1.0%
117    1      1.0%
110    1      1.0%
105    1      1.0%

Interactions

[Figures: nine pairwise interaction plots (images not reproduced)]

Correlations

[Figure: correlation heatmap]

         PRV_CD  GENDER  AGE_CD  POP_CNT
PRV_CD    1.000   0.000   0.000    0.139
GENDER    0.000   1.000   0.000    0.782
AGE_CD    0.000   0.000   1.000    0.060
POP_CNT   0.139   0.782   0.060    1.000

[Figure: correlation heatmap]

         PRV_CD  AGE_CD  POP_CNT  GENDER
PRV_CD    1.000  -0.105    0.315   0.000
AGE_CD   -0.105   1.000    0.236   0.000
POP_CNT   0.315   0.236    1.000   0.591
GENDER    0.000   0.000    0.591   1.000
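
The High correlation alerts near the top of the report name the only off-diagonal pair with a large coefficient. The sketch below scans the first matrix above for pairs whose absolute coefficient reaches a cut-off. The threshold of 0.5 is an assumption chosen for illustration (the report does not state the value it used), and the function name is hypothetical.

```python
from itertools import combinations

# First correlation matrix from this report, in the column order shown.
COLUMNS = ["PRV_CD", "GENDER", "AGE_CD", "POP_CNT"]
MATRIX = [
    [1.000, 0.000, 0.000, 0.139],
    [0.000, 1.000, 0.000, 0.782],
    [0.000, 0.000, 1.000, 0.060],
    [0.139, 0.782, 0.060, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlation_pairs(columns, matrix, threshold):
    """List each unordered column pair whose |coefficient| >= threshold."""
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


flagged = high_correlation_pairs(COLUMNS, MATRIX, THRESHOLD)
```

With these inputs only the GENDER and POP_CNT pair (0.782) is flagged, which agrees with the two alerts listed in the report.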

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT
0   201912     11110   1       35      4
1   201912     11110   1       40      6
2   201912     11110   1       45      16
3   201912     11110   1       50      19
4   201912     11110   1       55      21
5   201912     11110   1       60      44
6   201912     11110   1       65      31
7   201912     11110   1       70      26
8   201912     11110   1       71      16
9   201912     11110   2       45      3

Last rows

    BS_YR_MON  PRV_CD  GENDER  AGE_CD  POP_CNT
89  201912     11230   2       55      11
90  201912     11230   2       60      11
91  201912     11230   2       65      9
92  201912     11230   2       70      5
93  201912     11230   2       71      3
94  201912     11260   1       30      18
95  201912     11260   1       35      24
96  201912     11260   1       40      67
97  201912     11260   1       45      88
98  201912     11260   1       50      159