Overview

Dataset statistics

Number of variables: 5
Number of observations: 66
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.0 KiB
Average record size in memory: 47.0 B

Variable types

Categorical: 3
Numeric: 2

Dataset

Description: Sample data
Author: 코리아크레딧뷰로 (Korea Credit Bureau) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020195

Alerts

REGION_CD has constant value "1" (Constant)
AGE_CD is highly overall correlated with POP_CNT (High correlation)
POP_CNT is highly overall correlated with AGE_CD (High correlation)
POP_CNT has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 22:37:14.760099
Analysis finished: 2023-12-11 22:37:15.686274
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
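
The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library; the variable names are illustrative and not taken from the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 22:37:14.760099")
finished = datetime.fromisoformat("2023-12-11 22:37:15.686274")

# Elapsed wall-clock time in seconds, formatted to two decimals as in the report.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```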

Variables

BS_YR_MON
Categorical

Distinct: 3
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B
201912: 22
202012: 22
202112: 22

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 201912
2nd row: 202012
3rd row: 202112
4th row: 201912
5th row: 202012

Common Values

Value   Count  Frequency (%)
201912  22     33.3%
202012  22     33.3%
202112  22     33.3%
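
The Distinct (%) and Frequency (%) figures for BS_YR_MON follow directly from the counts. A minimal sketch, assuming the column holds each of the three values 22 times as reported; the helper `summarize_categorical` is hypothetical and not part of ydata-profiling.

```python
from collections import Counter

def summarize_categorical(values):
    """Return the distinct count, distinct %, and (count, frequency %) per value."""
    n = len(values)
    counts = Counter(values)
    return {
        "distinct": len(counts),
        "distinct_pct": round(100 * len(counts) / n, 1),
        "frequencies": {
            value: (count, round(100 * count / n, 1))
            for value, count in counts.most_common()
        },
    }

# Column rebuilt from the reported counts (the original row order is not preserved).
bs_yr_mon = ["201912"] * 22 + ["202012"] * 22 + ["202112"] * 22
summary = summarize_categorical(bs_yr_mon)
print(summary)
```

The same helper applies to REGION_CD and GENDER, whose counts are listed in their own sections.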

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts repeat the Common Values listing above]

REGION_CD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B
1: 66

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      66     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts repeat the Common Values listing above]

GENDER
Categorical

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B
1: 33
2: 33

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
1      33     50.0%
2      33     50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts repeat the Common Values listing above]

AGE_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.636364
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 25
5-th percentile: 25
Q1: 35
Median: 50
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.385944
Coefficient of variation (CV): 0.30997323
Kurtosis: -1.3002699
Mean: 49.636364
Median Absolute Deviation (MAD): 15
Skewness: -0.093193277
Sum: 3276
Variance: 236.72727
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=11)]
Common values

Value  Count  Frequency (%)
60     6      9.1%
30     6      9.1%
50     6      9.1%
40     6      9.1%
35     6      9.1%
45     6      9.1%
65     6      9.1%
70     6      9.1%
25     6      9.1%
71     6      9.1%

Minimum 10 values

Value  Count  Frequency (%)
25     6      9.1%
30     6      9.1%
35     6      9.1%
40     6      9.1%
45     6      9.1%
50     6      9.1%
55     6      9.1%
60     6      9.1%
65     6      9.1%
70     6      9.1%

Maximum 10 values

Value  Count  Frequency (%)
71     6      9.1%
70     6      9.1%
65     6      9.1%
60     6      9.1%
55     6      9.1%
50     6      9.1%
45     6      9.1%
40     6      9.1%
35     6      9.1%
30     6      9.1%
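
Because AGE_CD has only 11 distinct codes with 6 rows each, most of its reported statistics can be cross-checked from the value listings alone. A minimal sketch using the Python standard library; it assumes the report uses the sample (n - 1) variance and linearly interpolated quartiles, which is what reproduces the figures shown.

```python
import statistics

# Column rebuilt from the value listings: 11 distinct codes, 6 rows each.
age_codes = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 71]
age_cd = [code for code in age_codes for _ in range(6)]

mean = statistics.mean(age_cd)
median = statistics.median(age_cd)
variance = statistics.variance(age_cd)  # sample variance (n - 1 denominator)
std = statistics.stdev(age_cd)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# "inclusive" quartiles interpolate linearly between the sorted data points.
q1, _, q3 = statistics.quantiles(age_cd, n=4, method="inclusive")

# Median absolute deviation: median of distances from the median.
mad = statistics.median([abs(x - median) for x in age_cd])

print(sum(age_cd), mean, median, variance, std, cv, q1, q3, mad)
```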

POP_CNT
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 66
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93183.924
Minimum: 28138
Maximum: 190046
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 28138
5-th percentile: 35951.75
Q1: 67512.25
Median: 81877.5
Q3: 123823.75
95-th percentile: 180018.5
Maximum: 190046
Range: 161908
Interquartile range (IQR): 56311.5

Descriptive statistics

Standard deviation: 42721.961
Coefficient of variation (CV): 0.45846922
Kurtosis: -0.30539875
Mean: 93183.924
Median Absolute Deviation (MAD): 28789
Skewness: 0.63619849
Sum: 6150139
Variance: 1.825166 × 10^9
Monotonicity: Not monotonic
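
Several POP_CNT figures are derived from others in the same section, so they can be checked for internal consistency. A minimal sketch; the inputs are the rounded values printed in the report, so the comparisons use a small tolerance.

```python
import math

# Values copied from the POP_CNT section of the report.
n = 66
total = 6150139
minimum, maximum = 28138, 190046
q1, q3 = 67512.25, 123823.75
std = 42721.961

mean = total / n                 # reported as 93183.924
value_range = maximum - minimum  # reported as 161908
iqr = q3 - q1                    # reported as 56311.5
cv = std / mean                  # reported as 0.45846922
variance = std ** 2              # reported as 1.825166 × 10^9

print(mean, value_range, iqr, cv, variance)
```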
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value              Count  Frequency (%)
82047              1      1.5%
52554              1      1.5%
81475              1      1.5%
67817              1      1.5%
67075              1      1.5%
91317              1      1.5%
140298             1      1.5%
81708              1      1.5%
150526             1      1.5%
149496             1      1.5%
Other values (56)  56     84.8%

Minimum 10 values

Value  Count  Frequency (%)
28138  1      1.5%
29296  1      1.5%
33781  1      1.5%
34645  1      1.5%
39872  1      1.5%
40146  1      1.5%
40816  1      1.5%
41379  1      1.5%
44790  1      1.5%
45100  1      1.5%

Maximum 10 values

Value   Count  Frequency (%)
190046  1      1.5%
189910  1      1.5%
185087  1      1.5%
180591  1      1.5%
178301  1      1.5%
170805  1      1.5%
150526  1      1.5%
149496  1      1.5%
147533  1      1.5%
140298  1      1.5%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

           BS_YR_MON  GENDER  AGE_CD  POP_CNT
BS_YR_MON  1.000      0.000   0.000   0.000
GENDER     0.000      1.000   0.000   0.477
AGE_CD     0.000      0.000   1.000   0.917
POP_CNT    0.000      0.477   0.917   1.000

[Figure: correlation heatmap]

           GENDER  BS_YR_MON
GENDER     1.000   0.000
BS_YR_MON  0.000   1.000

[Figure: correlation heatmap]

           AGE_CD  POP_CNT  BS_YR_MON  GENDER
AGE_CD     1.000   -0.656   0.000      0.000
POP_CNT    -0.656  1.000    0.000      0.334
BS_YR_MON  0.000   0.000    1.000      0.000
GENDER     0.000   0.334    0.000      1.000
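
The report does not state which coefficient each matrix uses. As an illustration only, the sketch below computes a Pearson correlation between AGE_CD and POP_CNT on the 20 rows listed in the Sample section, not on all 66 observations, so its result is not expected to match any matrix entry exactly. The helper `pearson` is hypothetical and not part of ydata-profiling.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# (AGE_CD, POP_CNT) pairs from the first and last rows shown in the Sample section.
rows = [
    (60, 82047), (30, 124300), (50, 109093), (40, 110132), (35, 185087),
    (40, 190046), (45, 88161), (30, 128149), (60, 90695), (30, 136735),
    (70, 45777), (50, 74784), (50, 108520), (71, 34645), (65, 68092),
    (71, 28138), (55, 92687), (50, 71067), (55, 78288), (50, 106757),
]
ages = [age for age, _ in rows]
pops = [pop for _, pop in rows]

r = pearson(ages, pops)
print(f"Pearson r on the 20 sample rows: {r:.3f}")
```

On these rows the coefficient comes out negative: higher age codes go with smaller population counts.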

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    BS_YR_MON  REGION_CD  GENDER  AGE_CD  POP_CNT
0   201912     1          1       60      82047
1   202012     1          1       30      124300
2   202112     1          1       50      109093
3   201912     1          2       40      110132
4   202012     1          1       35      185087
5   202112     1          1       40      190046
6   202112     1          2       45      88161
7   201912     1          2       30      128149
8   202112     1          1       60      90695
9   201912     1          1       30      136735

Last rows

    BS_YR_MON  REGION_CD  GENDER  AGE_CD  POP_CNT
56  202012     1          2       70      45777
57  201912     1          2       50      74784
58  201912     1          1       50      108520
59  202012     1          1       71      34645
60  202012     1          1       65      68092
61  201912     1          2       71      28138
62  202012     1          1       55      92687
63  202012     1          2       50      71067
64  202112     1          2       55      78288
65  202012     1          1       50      106757