Overview

Dataset statistics

Number of variables: 5
Number of observations: 66
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.0 KiB
Average record size in memory: 47.0 B
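As an illustration of how overview counts like these can be derived, here is a minimal sketch in plain Python. It runs on the first 10 rows listed in the Sample section of this report, not on all 66 observations. The function name `dataset_statistics`, the use of `None` for a missing cell, and the rule "a duplicate is a row identical to an earlier row" are assumptions made for the example, not necessarily what ydata-profiling does internally.

```python
# First 10 rows from the Sample section, in column order
# (BS_YR_MON, REGION_CD, GENDER, AGE_CD, POP_CNT).
COLUMNS = ("BS_YR_MON", "REGION_CD", "GENDER", "AGE_CD", "POP_CNT")
ROWS = [
    ("201912", "1", "1", 25, 47285),
    ("202112", "1", "2", 70, 6521),
    ("202112", "1", "1", 25, 32091),
    ("201912", "1", "2", 45, 25002),
    ("201912", "1", "2", 30, 57243),
    ("201912", "1", "1", 40, 88484),
    ("201912", "1", "1", 55, 20328),
    ("201912", "1", "2", 50, 18634),
    ("202112", "1", "2", 50, 22270),
    ("201912", "1", "1", 50, 36459),
]


def dataset_statistics(columns, rows):
    """Return overview counts similar to the report's summary table."""
    n_cells = len(columns) * len(rows)
    # Assumption: a missing cell is represented as None.
    missing = sum(value is None for row in rows for value in row)
    # Assumption: a duplicate is a row identical to one seen earlier.
    duplicates = len(rows) - len(set(rows))
    return {
        "variables": len(columns),
        "observations": len(rows),
        "missing_cells": missing,
        "missing_cells_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100.0 * duplicates / len(rows) if rows else 0.0,
    }


stats = dataset_statistics(COLUMNS, ROWS)
print(stats)
```

On this 10-row excerpt the missing and duplicate counts are both zero, which is consistent with the full-dataset figures reported above.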

Variable types

Categorical: 3
Numeric: 2

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020197

Alerts

REGION_CD has constant value "1" (Constant)
AGE_CD is highly overall correlated with POP_CNT (High correlation)
POP_CNT is highly overall correlated with AGE_CD (High correlation)
POP_CNT has unique values (Unique)
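The Constant and Unique alerts can be illustrated with a short sketch. It uses the 20 rows shown in the Sample section of this report (first 10 and last 10), so it only demonstrates the idea. The function name `constant_and_unique_alerts` and the exact rules (one distinct value means Constant, all values distinct means Unique) are assumptions for the example. The High correlation alert is not covered here.

```python
# The 20 rows shown in the Sample section, stored column by column.
COLUMNS = {
    "BS_YR_MON": ["201912", "202112", "202112", "201912", "201912",
                  "201912", "201912", "201912", "202112", "201912",
                  "202112", "201912", "202012", "201912", "202112",
                  "201912", "202112", "201912", "202012", "202012"],
    "REGION_CD": ["1"] * 20,
    "GENDER": ["1", "2", "1", "2", "2", "1", "1", "2", "2", "1",
               "1", "2", "2", "1", "2", "1", "1", "1", "2", "2"],
    "AGE_CD": [25, 70, 25, 45, 30, 40, 55, 50, 50, 50,
               45, 35, 55, 71, 60, 60, 65, 30, 70, 71],
    "POP_CNT": [47285, 6521, 32091, 25002, 57243, 88484, 20328, 18634,
                22270, 36459, 71289, 63944, 15768, 4702, 14683, 13460,
                11101, 90161, 5627, 4217],
}


def constant_and_unique_alerts(columns):
    """Flag columns with a single distinct value or with no repeated value."""
    alerts = []
    for name, values in columns.items():
        distinct = len(set(values))
        if distinct == 1:
            alerts.append((name, "Constant"))
        elif distinct == len(values):
            alerts.append((name, "Unique"))
    return alerts


alerts = constant_and_unique_alerts(COLUMNS)
for name, kind in alerts:
    print(f"{name}: {kind}")
```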

Reproduction

Analysis started: 2024-04-19 05:48:16.188402
Analysis finished: 2024-04-19 05:48:16.782877
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
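The reported duration follows directly from the two timestamps. A small check in plain Python, with variable names chosen only for illustration:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-19 05:48:16.188402")
finished = datetime.fromisoformat("2024-04-19 05:48:16.782877")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "0.59 seconds"
```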

Variables

BS_YR_MON
Categorical

Distinct: 3
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value   Count
201912  22
202112  22
202012  22

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%
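The report distinguishes Distinct (the number of different values) from Unique (the number of values that occur exactly once). The sketch below reproduces the BS_YR_MON figures from its frequency table (three values, 22 occurrences each). The helper name `distinct_and_unique` is hypothetical, and rounding to one decimal is an assumption chosen to match how the report displays percentages.

```python
from collections import Counter


def distinct_and_unique(values):
    """Count distinct values and values that appear exactly once."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return {
        "distinct": distinct,
        "distinct_pct": round(100 * distinct / n, 1),
        "unique": unique,
        "unique_pct": round(100 * unique / n, 1),
    }


# Column rebuilt from the frequency table: 3 values x 22 rows = 66 rows.
bs_yr_mon = ["201912"] * 22 + ["202112"] * 22 + ["202012"] * 22
summary = distinct_and_unique(bs_yr_mon)
print(summary)
```

Because every value of BS_YR_MON repeats, the Unique count is 0 even though there are 3 distinct values.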

Sample

1st row: 201912
2nd row: 202112
3rd row: 202112
4th row: 201912
5th row: 201912

Common Values

Value   Count  Frequency (%)
201912  22     33.3%
202112  22     33.3%
202012  22     33.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

REGION_CD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value  Count
1      66

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      66     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

GENDER
Categorical

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value  Count
1      33
2      33

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
1      33     50.0%
2      33     50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

AGE_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.636364
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 25
5-th percentile: 25
Q1: 35
Median: 50
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 15.385944
Coefficient of variation (CV): 0.30997323
Kurtosis: -1.3002699
Mean: 49.636364
Median Absolute Deviation (MAD): 15
Skewness: -0.093193277
Sum: 3276
Variance: 236.72727
Monotonicity: Not monotonic
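Since AGE_CD has 11 distinct codes that each occur 6 times (see the value tables for this variable), the whole column can be rebuilt and most of the statistics above recomputed. The following is a minimal sketch using only the standard library. The match with the reported variance suggests the sample (n - 1) estimator, and the `"inclusive"` quantile method (linear interpolation between order statistics) is an assumption that happens to reproduce the reported quartiles. Skewness and kurtosis are left out.

```python
import statistics

# The 11 distinct AGE_CD codes; each appears 6 times (11 x 6 = 66 rows).
AGE_CODES = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 71]
age_cd = sorted(code for code in AGE_CODES for _ in range(6))

mean = statistics.mean(age_cd)
variance = statistics.variance(age_cd)  # sample variance, divides by n - 1
std = statistics.stdev(age_cd)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Quartiles by linear interpolation between order statistics.
q1, median, q3 = statistics.quantiles(age_cd, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(x - median) for x in age_cd)

print(f"sum={sum(age_cd)} mean={mean:.6f} variance={variance:.5f}")
print(f"std={std:.6f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr} MAD={mad}")
```

Dividing the sum of squared deviations by n instead of n - 1 would give a variance of about 233.14, which does not match the reported 236.72727.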
[Figure: histogram with fixed size bins (bins=11)]

Common values

Value  Count  Frequency (%)
25     6      9.1%
70     6      9.1%
45     6      9.1%
30     6      9.1%
40     6      9.1%
55     6      9.1%
50     6      9.1%
71     6      9.1%
65     6      9.1%
35     6      9.1%

Minimum 10 values

Value  Count  Frequency (%)
25     6      9.1%
30     6      9.1%
35     6      9.1%
40     6      9.1%
45     6      9.1%
50     6      9.1%
55     6      9.1%
60     6      9.1%
65     6      9.1%
70     6      9.1%

Maximum 10 values

Value  Count  Frequency (%)
71     6      9.1%
70     6      9.1%
65     6      9.1%
60     6      9.1%
55     6      9.1%
50     6      9.1%
45     6      9.1%
40     6      9.1%
35     6      9.1%
30     6      9.1%

POP_CNT
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 66
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36581.076
Minimum: 3425
Maximum: 115502
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 3425
5-th percentile: 4882
Q1: 10834.75
Median: 26521
Q3: 55892.25
95-th percentile: 104756.25
Maximum: 115502
Range: 112077
Interquartile range (IQR): 45057.5

Descriptive statistics

Standard deviation: 31764.183
Coefficient of variation (CV): 0.86832282
Kurtosis: 0.055282342
Mean: 36581.076
Median Absolute Deviation (MAD): 18595
Skewness: 1.0229641
Sum: 2414351
Variance: 1.0089633 × 10^9
Monotonicity: Not monotonic
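Only 30 or so of the 66 POP_CNT values appear in this report, so the column cannot be rebuilt. Several of the reported figures are derived from others, though, and can be cross-checked with simple arithmetic. The sketch below uses only numbers copied from the POP_CNT statistics above. The variable names are illustrative.

```python
# Figures copied from the POP_CNT statistics in this report.
n = 66
total = 2414351
minimum, maximum = 3425, 115502
q1, q3 = 10834.75, 55892.25
std = 31764.183

mean = total / n               # reported mean: 36581.076
value_range = maximum - minimum  # reported range: 112077
iqr = q3 - q1                  # reported IQR: 45057.5
cv = std / mean                # reported CV: 0.86832282
variance = std ** 2            # reported variance: 1.0089633 x 10^9

print(f"mean={mean:.3f} range={value_range} IQR={iqr}")
print(f"CV={cv:.8f} variance={variance:.7e}")
```

The derived values agree with the reported ones to the number of digits shown, which is a quick way to confirm that the extracted figures were not corrupted.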
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
47285              1      1.5%
107267             1      1.5%
16849              1      1.5%
4751               1      1.5%
8972               1      1.5%
68925              1      1.5%
71782              1      1.5%
14647              1      1.5%
40770              1      1.5%
83990              1      1.5%
Other values (56)  56     84.8%

Minimum 10 values

Value  Count  Frequency (%)
3425   1      1.5%
4217   1      1.5%
4702   1      1.5%
4751   1      1.5%
5275   1      1.5%
5600   1      1.5%
5627   1      1.5%
5812   1      1.5%
6521   1      1.5%
6727   1      1.5%

Maximum 10 values

Value   Count  Frequency (%)
115502  1      1.5%
113573  1      1.5%
111581  1      1.5%
107267  1      1.5%
97224   1      1.5%
90161   1      1.5%
88484   1      1.5%
85596   1      1.5%
83990   1      1.5%
71782   1      1.5%

Interactions

[Figure: interaction plots (4 panels)]

Correlations

[Figure: correlation heatmap]

           BS_YR_MON  GENDER  AGE_CD  POP_CNT
BS_YR_MON  1.000      0.000   0.000   0.000
GENDER     0.000      1.000   0.000   0.337
AGE_CD     0.000      0.000   1.000   0.895
POP_CNT    0.000      0.337   0.895   1.000

[Figure: correlation heatmap]

           GENDER  BS_YR_MON
GENDER     1.000   0.000
BS_YR_MON  0.000   1.000

[Figure: correlation heatmap]

           AGE_CD  POP_CNT  BS_YR_MON  GENDER
AGE_CD     1.000   -0.872   0.000      0.000
POP_CNT    -0.872  1.000    0.000      0.237
BS_YR_MON  0.000   0.000    1.000      0.000
GENDER     0.000   0.237    0.000      1.000
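The extracted text does not say which correlation measure produced each matrix, so the following is only a sketch of one common choice, the Pearson coefficient, applied to AGE_CD and POP_CNT. It uses the 20 rows listed in the Sample section rather than all 66 observations, so it will not reproduce the figures in the matrices. It does show the same direction: POP_CNT tends to fall as AGE_CD rises. The function name `pearson` is illustrative.

```python
from math import sqrt

# AGE_CD and POP_CNT for the 20 rows shown in the Sample section.
age_cd = [25, 70, 25, 45, 30, 40, 55, 50, 50, 50,
          45, 35, 55, 71, 60, 60, 65, 30, 70, 71]
pop_cnt = [47285, 6521, 32091, 25002, 57243, 88484, 20328, 18634,
           22270, 36459, 71289, 63944, 15768, 4702, 14683, 13460,
           11101, 90161, 5627, 4217]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


r = pearson(age_cd, pop_cnt)
print(f"Pearson r on the 20 sample rows: {r:.3f}")
```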

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First 10 rows

    BS_YR_MON  REGION_CD  GENDER  AGE_CD  POP_CNT
0   201912     1          1       25      47285
1   202112     1          2       70      6521
2   202112     1          1       25      32091
3   201912     1          2       45      25002
4   201912     1          2       30      57243
5   201912     1          1       40      88484
6   201912     1          1       55      20328
7   201912     1          2       50      18634
8   202112     1          2       50      22270
9   201912     1          1       50      36459

Last 10 rows

    BS_YR_MON  REGION_CD  GENDER  AGE_CD  POP_CNT
56  202112     1          1       45      71289
57  201912     1          2       35      63944
58  202012     1          2       55      15768
59  201912     1          1       71      4702
60  202112     1          2       60      14683
61  201912     1          1       60      13460
62  202112     1          1       65      11101
63  201912     1          1       30      90161
64  202012     1          2       70      5627
65  202012     1          2       71      4217