Overview

Dataset statistics

Number of variables: 5
Number of observations: 55
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 KiB
Average record size in memory: 47.3 B

Variable types

Categorical: 3
Numeric: 2

Dataset

Description: Sample data
Author: 코리아크레딧뷰로 (Korea Credit Bureau) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020200

Alerts

BS_YR_MON has constant value "202112" (Constant)
GENDER is highly imbalanced (77.5%) (Imbalance)
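
The 77.5% in the GENDER alert is consistent with an entropy-based imbalance score: 1 minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The report does not state its formula, so the definition below is an assumption, and `imbalance_score` is a hypothetical helper name. The counts (53 and 2) come from the GENDER Common Values table.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of classes.

    Assumed definition: 0.0 means perfectly balanced, 1.0 means a single
    class holds every observation.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: a single-class column is treated as not scorable.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


gender_counts = [53, 2]  # GENDER value 1 and value 2
score = imbalance_score(gender_counts)
print(f"GENDER imbalance: {score:.1%}")  # prints "GENDER imbalance: 77.5%"
```

Under this definition a 50/50 split scores 0%, so a score near 77.5% reflects how far the 53-to-2 split is from balance.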

Reproduction

Analysis started: 2024-01-14 00:46:38.563124
Analysis finished: 2024-01-14 00:46:40.620056
Duration: 2.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 568.0 B
Value counts: 202112 (55)

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%
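
Distinct and Unique are easy to confuse. Judging from the numbers in this report, Distinct counts how many different values occur, while Unique counts values that occur exactly once. The sketch below assumes those definitions (the report does not spell them out); `distinct_and_unique` is a hypothetical helper name, and the column is rebuilt from the single value and count shown for BS_YR_MON.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# BS_YR_MON holds the same value in all 55 rows.
bs_yr_mon = ["202112"] * 55
n_distinct, n_unique = distinct_and_unique(bs_yr_mon)

print(f"Distinct: {n_distinct} ({n_distinct / len(bs_yr_mon):.1%})")
print(f"Unique: {n_unique} ({n_unique / len(bs_yr_mon):.1%})")
```

A constant column therefore has one distinct value but zero unique values, because its only value repeats in every row.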

Sample

1st row: 202112
2nd row: 202112
3rd row: 202112
4th row: 202112
5th row: 202112

Common Values

Value               Count   Frequency (%)
202112              55      100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

ADM_CD
Real number (ℝ)

Distinct: 38
Distinct (%): 69.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40667044
Minimum: 28260515
Maximum: 41630250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 623.0 B

Quantile statistics

Minimum: 28260515
5-th percentile: 37257774
Q1: 41220256
Median: 41360545
Q3: 41550315
95-th percentile: 41610250
Maximum: 41630250
Range: 13369735
Interquartile range (IQR): 330059

Descriptive statistics

Standard deviation: 3011628.9
Coefficient of variation (CV): 0.074055761
Kurtosis: 14.710947
Mean: 40667044
Median Absolute Deviation (MAD): 169715
Skewness: -4.0149115
Sum: 2.2366874 × 10^9
Variance: 9.0699087 × 10^12
Monotonicity: Not monotonic
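
Several ADM_CD figures can be derived from others in the same tables, which gives a quick consistency check on the report. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported minimum, maximum, quartiles, mean and standard deviation. The relationships used (for example, CV as standard deviation divided by mean) are the usual textbook definitions and are assumed to match the report's. The tolerance allows for the rounding of the printed inputs.

```python
import math

# Inputs copied from the ADM_CD statistics in this report.
n = 55
minimum, maximum = 28260515, 41630250
q1, q3 = 41220256, 41550315
mean, std = 40667044, 3011628.9

# Quantities recomputed from the inputs above.
derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}

# The corresponding values as printed in the report.
reported = {
    "range": 13369735,
    "iqr": 330059,
    "cv": 0.074055761,
    "variance": 9.0699087e12,
    "sum": 2.2366874e9,
}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name:>8}: derived={value:.8g} reported={reported[name]:.8g} match={ok}")
```

The strongly negative skewness (-4.01) fits the same picture: a few codes near 28,260,000 sit far below the bulk of values around 41,000,000.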
[Figure: Histogram with fixed size bins (bins=38)]

Common Values

Value               Count   Frequency (%)
41610250            4       7.3%
41590259            3       5.5%
41590253            3       5.5%
41370540            2       3.6%
41360256            2       3.6%
41190795            2       3.6%
41461530            2       3.6%
41360545            2       3.6%
41190742            2       3.6%
41550250            2       3.6%
Other values (28)   31      56.4%

Smallest values

Value               Count   Frequency (%)
28260515            1       1.8%
28260575            1       1.8%
28260730            1       1.8%
41113650            1       1.8%
41190742            2       3.6%
41190744            1       1.8%
41190795            2       3.6%
41190800            1       1.8%
41190830            2       3.6%
41220253            1       1.8%

Largest values

Value               Count   Frequency (%)
41630250            1       1.8%
41610250            4       7.3%
41590262            1       1.8%
41590259            3       5.5%
41590253            3       5.5%
41570570            1       1.8%
41550380            1       1.8%
41550250            2       3.6%
41500515            1       1.8%
41500253            1       1.8%

GENDER
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 568.0 B
Value counts: 1 (53), 2 (2)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value               Count   Frequency (%)
1                   53      96.4%
2                   2       3.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

AGE_CD
Real number (ℝ)

Distinct: 7
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.545455
Minimum: 40
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 623.0 B

Quantile statistics

Minimum: 40
5-th percentile: 45
Q1: 47.5
Median: 55
Q3: 60
95-th percentile: 66.5
Maximum: 70
Range: 30
Interquartile range (IQR): 12.5

Descriptive statistics

Standard deviation: 7.7143881
Coefficient of variation (CV): 0.14143045
Kurtosis: -0.65043159
Mean: 54.545455
Median Absolute Deviation (MAD): 5
Skewness: 0.063307377
Sum: 3000
Variance: 59.511785
Monotonicity: Not monotonic
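
Because AGE_CD has only seven distinct values, the full column can be rebuilt from its value counts and the statistics recomputed with the Python standard library. The reported figures are consistent with a sample variance (n - 1 denominator) and linearly interpolated quartiles. That the report uses exactly these estimators is an assumption, not something it states.

```python
import statistics

# AGE_CD value counts, copied from the frequency table in this report.
age_counts = {40: 2, 45: 12, 50: 5, 55: 17, 60: 11, 65: 5, 70: 3}

# Expand the counts back into a sorted list of 55 observations.
ages = sorted(value for value, count in age_counts.items() for _ in range(count))

mean = statistics.mean(ages)
variance = statistics.variance(ages)  # sample variance, n - 1 denominator
std = statistics.stdev(ages)
cv = std / mean

# "inclusive" interpolates linearly between the two nearest observations.
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median(abs(a - median) for a in ages)

print(f"n={len(ages)} sum={sum(ages)} mean={mean:.6f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={q3 - q1} MAD={mad}")
```

The values step in increments of 5 from 40 to 70, which suggests AGE_CD is an age-band code rather than an exact age. The report does not confirm this.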
[Figure: Histogram with fixed size bins (bins=7)]

Common Values

Value               Count   Frequency (%)
55                  17      30.9%
45                  12      21.8%
60                  11      20.0%
65                  5       9.1%
50                  5       9.1%
70                  3       5.5%
40                  2       3.6%

Smallest values

Value               Count   Frequency (%)
40                  2       3.6%
45                  12      21.8%
50                  5       9.1%
55                  17      30.9%
60                  11      20.0%
65                  5       9.1%
70                  3       5.5%

Largest values

Value               Count   Frequency (%)
70                  3       5.5%
65                  5       9.1%
60                  11      20.0%
55                  17      30.9%
50                  5       9.1%
45                  12      21.8%
40                  2       3.6%

POP_CNT
Categorical

Distinct: 4
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Memory size: 568.0 B
Value counts: 4 (38), 5 (10), 7 (4), 6 (3)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 5

Common Values

Value               Count   Frequency (%)
4                   38      69.1%
5                   10      18.2%
7                   4       7.3%
6                   3       5.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Interactions

[Figures: 4 interaction plots]

Correlations

[Figure: Correlation heatmap]

            ADM_CD    GENDER    AGE_CD    POP_CNT
ADM_CD      1.000     0.000     0.152     0.000
GENDER      0.000     1.000     0.013     0.000
AGE_CD      0.152     0.013     1.000     0.000
POP_CNT     0.000     0.000     0.000     1.000

[Figure: Correlation heatmap]

            GENDER    POP_CNT
GENDER      1.000     0.000
POP_CNT     0.000     1.000

[Figure: Correlation heatmap]

            ADM_CD    AGE_CD    GENDER    POP_CNT
ADM_CD      1.000     -0.108    0.000     0.205
AGE_CD      -0.108    1.000     0.000     0.000
GENDER      0.000     0.000     1.000     0.000
POP_CNT     0.205     0.000     0.000     1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index   BS_YR_MON   ADM_CD      GENDER   AGE_CD   POP_CNT
0       202112      41590259    2        45       4
1       202112      41570570    1        45       4
2       202112      41190800    1        55       4
3       202112      41220630    1        45       4
4       202112      41590253    1        70       5
5       202112      41271600    1        65       4
6       202112      41360570    1        45       4
7       202112      41220256    1        50       4
8       202112      41461525    1        45       4
9       202112      41390597    1        55       4

Last rows

Index   BS_YR_MON   ADM_CD      GENDER   AGE_CD   POP_CNT
45      202112      41590262    1        55       4
46      202112      41610250    1        60       4
47      202112      41220253    1        55       4
48      202112      41590259    1        45       7
49      202112      41190830    1        60       4
50      202112      28260575    1        45       4
51      202112      28260515    1        55       6
52      202112      41590253    1        45       4
53      202112      41190795    1        55       4
54      202112      41220635    1        60       4