Overview

Dataset statistics

Number of variables: 5
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 46.3 B
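
The derived figures in this table follow from the raw counts. The sketch below recomputes them in Python. It assumes the percentages are plain ratios over all cells or rows, and that the total size is roughly the average record size times the row count; the variable names are illustrative and do not come from the report.

```python
# Values copied from the dataset statistics table.
n_variables = 5
n_observations = 99
missing_cells = 0
duplicate_rows = 0
avg_record_size_b = 46.3  # reported average record size, in bytes

# Percentages as plain ratios (assumption about how the report derives them).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: total size is about average record size * row count, in KiB.
total_size_kib = avg_record_size_b * n_observations / 1024

print(total_cells, missing_pct, duplicate_pct, round(total_size_kib, 1))
```

Under these assumptions the rounded total agrees with the reported 4.5 KiB.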

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample data
Author: 코리아크레딧뷰로 (Korea Credit Bureau) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020205

Alerts

BS_YR_MON has constant value "202112" (Constant)
AGE_CD is highly overall correlated with POP_CNT (High correlation)
POP_CNT is highly overall correlated with AGE_CD (High correlation)

Reproduction

Analysis started: 2023-12-11 22:39:55.245480
Analysis finished: 2023-12-11 22:39:56.017123
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

BS_YR_MON
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
202112: 99

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202112
2nd row: 202112
3rd row: 202112
4th row: 202112
5th row: 202112

Common Values

Value   Count  Frequency (%)
202112  99     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

ADM_CD
Real number (ℝ)

Distinct: 94
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28249644
Minimum: 26110545
Maximum: 31710310
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 26110545
5-th percentile: 26230567
Q1: 26485750
Median: 28200540
Q3: 29200610
95-th percentile: 31143624
Maximum: 31710310
Range: 5599765
Interquartile range (IQR): 2714860

Descriptive statistics

Standard deviation: 1628615.7
Coefficient of variation (CV): 0.05765084
Kurtosis: -1.1918648
Mean: 28249644
Median Absolute Deviation (MAD): 1699770
Skewness: 0.24183275
Sum: 2.7967147 × 10^9
Variance: 2.6523891 × 10^12
Monotonicity: Not monotonic
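
Several of the ADM_CD statistics are derived from others in the same tables. The sketch below recomputes them from the reported values using the standard definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). Because the inputs are rounded as printed in the report, the results agree only up to rounding.

```python
# Reported values for ADM_CD, copied from the tables above.
minimum, maximum = 26110545, 31710310
q1, q3 = 26485750, 29200610
mean = 28249644
std = 1628615.7
n = 99

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
approx_sum = mean * n             # Sum, up to rounding of the reported mean

print(value_range, iqr, round(cv, 8), f"{variance:.7e}", f"{approx_sum:.7e}")
```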
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
26500770           3      3.0%
29200565           3      3.0%
29170669           2      2.0%
26470730           1      1.0%
27260601           1      1.0%
31170550           1      1.0%
30140535           1      1.0%
28110585           1      1.0%
29200540           1      1.0%
30170540           1      1.0%
Other values (84)  84     84.8%

Minimum 10 values

Value     Count  Frequency (%)
26110545  1      1.0%
26110590  1      1.0%
26140660  1      1.0%
26200630  1      1.0%
26230540  1      1.0%
26230570  1      1.0%
26230660  1      1.0%
26260600  1      1.0%
26290550  1      1.0%
26290560  1      1.0%

Maximum 10 values

Value     Count  Frequency (%)
31710310  1      1.0%
31200580  1      1.0%
31200530  1      1.0%
31170550  1      1.0%
31170530  1      1.0%
31140635  1      1.0%
31110630  1      1.0%
30230543  1      1.0%
30200580  1      1.0%
30200550  1      1.0%

GENDER
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1: 54
2: 45

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
1      54     54.5%
2      45     45.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

AGE_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.515152
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25
5-th percentile: 29.5
Q1: 40
Median: 55
Q3: 65
95-th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 15.143102
Coefficient of variation (CV): 0.28835682
Kurtosis: -1.2515184
Mean: 52.515152
Median Absolute Deviation (MAD): 15
Skewness: -0.30774832
Sum: 5199
Variance: 229.31354
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=11)]

Common values

Value  Count  Frequency (%)
71     14     14.1%
60     12     12.1%
40     10     10.1%
70     10     10.1%
55     9      9.1%
65     9      9.1%
30     9      9.1%
50     7      7.1%
35     7      7.1%
45     7      7.1%

Minimum 10 values

Value  Count  Frequency (%)
25     5      5.1%
30     9      9.1%
35     7      7.1%
40     10     10.1%
45     7      7.1%
50     7      7.1%
55     9      9.1%
60     12     12.1%
65     9      9.1%
70     10     10.1%

Maximum 10 values

Value  Count  Frequency (%)
71     14     14.1%
70     10     10.1%
65     9      9.1%
60     12     12.1%
55     9      9.1%
50     7      7.1%
45     7      7.1%
40     10     10.1%
35     7      7.1%
30     9      9.1%
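
Because AGE_CD has only 11 distinct values, its complete frequency table can be read off the value tables above, and the summary statistics can be recomputed from it. The sketch below does so with the Python standard library. It assumes quantiles use linear interpolation between ranks and that the variance uses the sample (n − 1) denominator; the report does not state either choice, and the `quantile` helper is illustrative.

```python
import statistics

# Frequency table for AGE_CD (value -> count), copied from the value tables.
age_counts = {25: 5, 30: 9, 35: 7, 40: 10, 45: 7, 50: 7,
              55: 9, 60: 12, 65: 9, 70: 10, 71: 14}

# Expand the table into the sorted list of individual observations.
values = sorted(v for v, c in age_counts.items() for _ in range(c))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
p5 = quantile(values, 0.05)

print(len(values), total, round(mean, 6), median, round(variance, 5), q1, q3, p5)
```

With these assumptions the recomputed sum, mean, median, quartiles, and variance line up with the figures reported for AGE_CD.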

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 84
Distinct (%): 84.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 151.77778
Minimum: 4
Maximum: 565
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 4
5-th percentile: 5
Q1: 27
Median: 130
Q3: 230
95-th percentile: 387.6
Maximum: 565
Range: 561
Interquartile range (IQR): 203

Descriptive statistics

Standard deviation: 134.16655
Coefficient of variation (CV): 0.88396705
Kurtosis: 0.64405355
Mean: 151.77778
Median Absolute Deviation (MAD): 102
Skewness: 0.96647038
Sum: 15026
Variance: 18000.664
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
4                  4      4.0%
43                 3      3.0%
5                  3      3.0%
191                2      2.0%
8                  2      2.0%
194                2      2.0%
21                 2      2.0%
23                 2      2.0%
20                 2      2.0%
130                2      2.0%
Other values (74)  75     75.8%

Minimum 10 values

Value  Count  Frequency (%)
4      4      4.0%
5      3      3.0%
6      1      1.0%
7      1      1.0%
8      2      2.0%
10     1      1.0%
11     1      1.0%
13     1      1.0%
18     1      1.0%
19     1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
565    1      1.0%
551    1      1.0%
531    1      1.0%
439    1      1.0%
402    1      1.0%
386    1      1.0%
374    1      1.0%
368    1      1.0%
343    1      1.0%
341    1      1.0%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

         ADM_CD  GENDER  AGE_CD  POP_CNT
ADM_CD   1.000   0.212   0.000   0.285
GENDER   0.212   1.000   0.000   0.000
AGE_CD   0.000   0.000   1.000   0.420
POP_CNT  0.285   0.000   0.420   1.000

[Figure: correlation heatmap]

         ADM_CD  AGE_CD  POP_CNT  GENDER
ADM_CD   1.000   -0.096  -0.055   0.183
AGE_CD   -0.096  1.000   0.588    0.000
POP_CNT  -0.055  0.588   1.000    0.000
GENDER   0.183   0.000   0.000    1.000
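
The high-correlation alerts for AGE_CD and POP_CNT can be related to the second matrix with a simple threshold scan. The sketch below lists off-diagonal pairs whose absolute correlation reaches a cutoff. The cutoff of 0.5 is an assumption (the report does not state the threshold it used), and `high_correlation_pairs` is an illustrative helper, not part of ydata-profiling.

```python
# Second correlation matrix from the report; rows and columns share one order.
names = ["ADM_CD", "AGE_CD", "POP_CNT", "GENDER"]
matrix = [
    [1.000, -0.096, -0.055, 0.183],
    [-0.096, 1.000, 0.588, 0.000],
    [-0.055, 0.588, 1.000, 0.000],
    [0.183, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

def high_correlation_pairs(names, matrix, threshold):
    """Return (a, b, r) for each off-diagonal pair with |r| >= threshold."""
    pairs = []
    for i in range(len(names)):
        # Only scan the upper triangle so each pair is reported once.
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

flagged = high_correlation_pairs(names, matrix, THRESHOLD)
print(flagged)
```

With this cutoff, AGE_CD and POP_CNT are the only pair flagged, which is consistent with the two alerts listed in the report.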

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    BS_YR_MON  ADM_CD    GENDER  AGE_CD  POP_CNT
0   202112     27260520  2       55      235
1   202112     26290560  2       50      194
2   202112     27110670  1       60      129
3   202112     29140821  2       35      43
4   202112     26470700  1       71      368
5   202112     28237660  1       60      231
6   202112     26200630  2       65      68
7   202112     26110590  1       50      32
8   202112     26320543  1       30      23
9   202112     29170669  2       65      341

Last rows

    BS_YR_MON  ADM_CD    GENDER  AGE_CD  POP_CNT
89  202112     26410600  1       40      86
90  202112     26500770  2       45      150
91  202112     26530661  2       65      216
92  202112     28237648  2       35      43
93  202112     26290550  1       65      130
94  202112     29170673  1       50      191
95  202112     26380580  1       45      224
96  202112     29200565  2       65      343
97  202112     28260515  1       35      147
98  202112     27140540  1       40      20
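
The constant-value alert on BS_YR_MON can be illustrated on the sample rows themselves. The sketch below flags every column with a single distinct value. It runs only on the first ten rows shown in the sample, not on the full 99 observations, and `constant_columns` is an illustrative helper rather than the report's own check.

```python
columns = ["BS_YR_MON", "ADM_CD", "GENDER", "AGE_CD", "POP_CNT"]

# First ten sample rows from the report (BS_YR_MON kept as text).
rows = [
    ("202112", 27260520, 2, 55, 235),
    ("202112", 26290560, 2, 50, 194),
    ("202112", 27110670, 1, 60, 129),
    ("202112", 29140821, 2, 35, 43),
    ("202112", 26470700, 1, 71, 368),
    ("202112", 28237660, 1, 60, 231),
    ("202112", 26200630, 2, 65, 68),
    ("202112", 26110590, 1, 50, 32),
    ("202112", 26320543, 1, 30, 23),
    ("202112", 29170669, 2, 65, 341),
]

def constant_columns(columns, rows):
    """Return the names of columns that hold a single distinct value."""
    return [name for i, name in enumerate(columns)
            if len({row[i] for row in rows}) == 1]

constant = constant_columns(columns, rows)
print(constant)
```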