Overview

Dataset statistics

Number of variables: 6
Number of observations: 99
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 55.3 B

Variable types

Numeric: 5
Categorical: 1

Dataset

Description: Sample data
Author: Korea Credit Bureau (코리아크레딧뷰로) / 장윤상
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020198

Alerts

ADM_CD is highly overall correlated with POP_CNT and 1 other field (High correlation)
POP_CNT is highly overall correlated with ADM_CD and 1 other field (High correlation)
ECO_BH_CNT is highly overall correlated with ADM_CD and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-11 22:42:51.689252
Analysis finished: 2023-12-11 22:42:54.083613
Duration: 2.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

BS_YR_MON
Real number (ℝ)

Distinct: 10
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202057.78
Minimum: 201912
Maximum: 202203
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 201912
5th percentile: 201912
Q1: 202006
Median: 202012
Q3: 202109
95th percentile: 202203
Maximum: 202203
Range: 291
Interquartile range (IQR): 103
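
The quantiles above treat BS_YR_MON as a plain number, so the range of 291 and the IQR of 103 are differences between integers, not month counts. The following is a minimal sketch, assuming the values encode a year and month as YYYYMM (an assumption suggested by the observed values, not stated in the report). The helper names `to_month_index` and `months_between` are hypothetical.

```python
def to_month_index(yyyymm: int) -> int:
    """Convert a YYYYMM integer (assumed encoding) into a running month count."""
    year, month = divmod(yyyymm, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM value: {yyyymm}")
    return year * 12 + (month - 1)


def months_between(start: int, end: int) -> int:
    """Number of calendar months from start to end, both given as YYYYMM."""
    return to_month_index(end) - to_month_index(start)


# Values taken from the quantile statistics above.
minimum, q1, q3, maximum = 201912, 202006, 202109, 202203

numeric_range = maximum - minimum                 # what the report calls Range
calendar_span = months_between(minimum, maximum)  # span in months
calendar_iqr = months_between(q1, q3)             # IQR in months

print(numeric_range, calendar_span, calendar_iqr)
```

Under that assumption the data covers 27 calendar months, and the interquartile range corresponds to 15 months.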

Descriptive statistics

Standard deviation: 78.626287
Coefficient of variation (CV): 0.00038912774
Kurtosis: -0.52972509
Mean: 202057.78
Median Absolute Deviation (MAD): 91
Skewness: 0.1640678
Sum: 20003720
Variance: 6182.093
Monotonicity: Not monotonic
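
Several of the descriptive statistics can be derived from one another, which gives a quick consistency check on the report. The sketch below recomputes the mean, variance and coefficient of variation for BS_YR_MON from the reported sum, observation count and standard deviation. The variable names are illustrative only.

```python
# Figures copied from the report for BS_YR_MON.
n_observations = 99
total = 20003720
std_dev = 78.626287

mean = total / n_observations   # Mean = Sum / number of observations
variance = std_dev ** 2         # Variance = standard deviation squared
cv = std_dev / mean             # CV = standard deviation / mean

print(round(mean, 2))
print(round(variance, 3))
print(f"{cv:.8g}")
```

The recomputed values agree with the reported Mean, Variance and CV to the precision shown in the report.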
Histogram with fixed size bins (bins=10)
Common values
Value   Count  Frequency (%)
202003  13     13.1%
202112  11     11.1%
202006  11     11.1%
202009  11     11.1%
202203  11     11.1%
202106  10     10.1%
202012  9      9.1%
202103  8      8.1%
201912  8      8.1%
202109  7      7.1%

Minimum 10 values
Value   Count  Frequency (%)
201912  8      8.1%
202003  13     13.1%
202006  11     11.1%
202009  11     11.1%
202012  9      9.1%
202103  8      8.1%
202106  10     10.1%
202109  7      7.1%
202112  11     11.1%
202203  11     11.1%

Maximum 10 values
Value   Count  Frequency (%)
202203  11     11.1%
202112  11     11.1%
202109  7      7.1%
202106  10     10.1%
202103  8      8.1%
202012  9      9.1%
202009  11     11.1%
202006  11     11.1%
202003  13     13.1%
201912  8      8.1%
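
The Frequency (%) column is each count divided by the number of observations. As a small illustration, the sketch below recomputes the percentages for BS_YR_MON from the counts listed above. The dictionary name `counts` is illustrative.

```python
# Value -> count, copied from the BS_YR_MON frequency table.
counts = {
    201912: 8, 202003: 13, 202006: 11, 202009: 11, 202012: 9,
    202103: 8, 202106: 10, 202109: 7, 202112: 11, 202203: 11,
}

n_observations = sum(counts.values())

# Frequency (%) rounded to one decimal place, as displayed in the report.
frequency_pct = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

# Distinct (%) is the number of distinct values over the number of observations.
distinct_pct = round(100 * len(counts) / n_observations, 1)

for value, pct in frequency_pct.items():
    print(value, counts[value], f"{pct}%")
```

The counts add up to the 99 observations reported in the overview, and the ten distinct values give the Distinct (%) figure of 10.1%.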

ADM_CD
Real number (ℝ)

HIGH CORRELATION 

Distinct: 98
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37171433
Minimum: 11110690
Maximum: 50110560
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 11110690
5th percentile: 11440600
Q1: 28200620
Median: 42110620
Q3: 47170400
95th percentile: 48750379
Maximum: 50110560
Range: 38999870
Interquartile range (IQR): 18969780

Descriptive statistics

Standard deviation: 12656844
Coefficient of variation (CV): 0.34049923
Kurtosis: -0.22471058
Mean: 37171433
Median Absolute Deviation (MAD): 6000105
Skewness: -1.0557503
Sum: 3.6799718 × 10^9
Variance: 1.6019571 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
42130330           2      2.0%
11620630           1      1.0%
50110560           1      1.0%
41800310           1      1.0%
45140646           1      1.0%
46790395           1      1.0%
47210610           1      1.0%
44150570           1      1.0%
41480530           1      1.0%
47130250           1      1.0%
Other values (88)  88     88.9%

Minimum 10 values
Value     Count  Frequency (%)
11110690  1      1.0%
11140590  1      1.0%
11170530  1      1.0%
11170650  1      1.0%
11170690  1      1.0%
11470590  1      1.0%
11470611  1      1.0%
11530560  1      1.0%
11545510  1      1.0%
11560650  1      1.0%

Maximum 10 values
Value     Count  Frequency (%)
50110560  1      1.0%
48890430  1      1.0%
48880410  1      1.0%
48850410  1      1.0%
48840370  1      1.0%
48740380  1      1.0%
48740320  1      1.0%
48720330  1      1.0%
48310600  1      1.0%
48250320  1      1.0%
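
ADM_CD is profiled as a real number, but with 98 distinct values in 99 rows it behaves like an identifier, so its mean and standard deviation carry little meaning. The sketch below treats the codes as strings and groups them by their leading two digits. It assumes that ADM_CD is an administrative-area code whose first two digits identify a top-level region. This is an assumption, not something the report states, and the helper name `region_prefix` is hypothetical.

```python
from collections import Counter


def region_prefix(adm_cd: int, width: int = 2) -> str:
    """Return the leading digits of a code as a string (assumed region prefix)."""
    return str(adm_cd)[:width]


# The ten values listed in the ADM_CD common-values table above;
# 42130330 is the only one that occurs twice.
adm_codes = [
    42130330, 42130330, 11620630, 50110560, 41800310, 45140646,
    46790395, 47210610, 44150570, 41480530, 47130250,
]

prefix_counts = Counter(region_prefix(code) for code in adm_codes)

for prefix, count in sorted(prefix_counts.items()):
    print(prefix, count)
```

Grouping by a prefix like this gives a low-cardinality label that is easier to summarise than the raw code, provided the prefix assumption holds for this dataset.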

GENDER
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Value  Count
2      50
1      49

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
2      50     50.5%
1      49     49.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2      50     50.5%
1      49     49.5%

AGE_CD
Real number (ℝ)

Distinct: 11
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48
Minimum: 25
Maximum: 71
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 25
5th percentile: 25
Q1: 35
Median: 50
Q3: 60
95th percentile: 71
Maximum: 71
Range: 46
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.937284
Coefficient of variation (CV): 0.31119341
Kurtosis: -1.155448
Mean: 48
Median Absolute Deviation (MAD): 15
Skewness: -0.00018748811
Sum: 4752
Variance: 223.12245
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Common values
Value  Count  Frequency (%)
45     12     12.1%
25     12     12.1%
55     11     11.1%
35     11     11.1%
50     11     11.1%
60     9      9.1%
70     8      8.1%
71     7      7.1%
30     7      7.1%
40     6      6.1%

Minimum 10 values
Value  Count  Frequency (%)
25     12     12.1%
30     7      7.1%
35     11     11.1%
40     6      6.1%
45     12     12.1%
50     11     11.1%
55     11     11.1%
60     9      9.1%
65     5      5.1%
70     8      8.1%

Maximum 10 values
Value  Count  Frequency (%)
71     7      7.1%
70     8      8.1%
65     5      5.1%
60     9      9.1%
55     11     11.1%
50     11     11.1%
45     12     12.1%
40     6      6.1%
35     11     11.1%
30     7      7.1%

POP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 92
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 493.35354
Minimum: 12
Maximum: 1784
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 12
5th percentile: 26.7
Q1: 111
Median: 378
Q3: 744
95th percentile: 1373.2
Maximum: 1784
Range: 1772
Interquartile range (IQR): 633

Descriptive statistics

Standard deviation: 447.61145
Coefficient of variation (CV): 0.90728334
Kurtosis: 0.37086166
Mean: 493.35354
Median Absolute Deviation (MAD): 292
Skewness: 1.0259837
Sum: 48842
Variance: 200356.01
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
216                3      3.0%
359                2      2.0%
1611               2      2.0%
210                2      2.0%
65                 2      2.0%
561                2      2.0%
1187               1      1.0%
197                1      1.0%
748                1      1.0%
1091               1      1.0%
Other values (82)  82     82.8%

Minimum 10 values
Value  Count  Frequency (%)
12     1      1.0%
13     1      1.0%
17     1      1.0%
20     1      1.0%
24     1      1.0%
27     1      1.0%
29     1      1.0%
34     1      1.0%
49     1      1.0%
54     1      1.0%

Maximum 10 values
Value  Count  Frequency (%)
1784   1      1.0%
1706   1      1.0%
1611   2      2.0%
1564   1      1.0%
1352   1      1.0%
1297   1      1.0%
1187   1      1.0%
1164   1      1.0%
1137   1      1.0%
1102   1      1.0%

ECO_BH_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 93
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 414.41414
Minimum: 10
Maximum: 1580
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 10
5th percentile: 16.6
Q1: 82.5
Median: 272
Q3: 670
95th percentile: 1163.9
Maximum: 1580
Range: 1570
Interquartile range (IQR): 587.5

Descriptive statistics

Standard deviation: 392.70698
Coefficient of variation (CV): 0.94761965
Kurtosis: 0.43659719
Mean: 414.41414
Median Absolute Deviation (MAD): 225
Skewness: 1.0607633
Sum: 41027
Variance: 154218.78
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
139                3      3.0%
247                2      2.0%
54                 2      2.0%
13                 2      2.0%
277                2      2.0%
598                1      1.0%
161                1      1.0%
12                 1      1.0%
176                1      1.0%
666                1      1.0%
Other values (83)  83     83.8%

Minimum 10 values
Value  Count  Frequency (%)
10     1      1.0%
11     1      1.0%
12     1      1.0%
13     2      2.0%
17     1      1.0%
21     1      1.0%
23     1      1.0%
26     1      1.0%
39     1      1.0%
40     1      1.0%

Maximum 10 values
Value  Count  Frequency (%)
1580   1      1.0%
1567   1      1.0%
1421   1      1.0%
1291   1      1.0%
1226   1      1.0%
1157   1      1.0%
1156   1      1.0%
1103   1      1.0%
1040   1      1.0%
924    1      1.0%

Interactions

[Figure: pairwise interaction plots of the five numeric variables (25 panels)]

Correlations

            BS_YR_MON  ADM_CD  GENDER  AGE_CD  POP_CNT  ECO_BH_CNT
BS_YR_MON   1.000      0.000   0.000   0.000   0.000    0.000
ADM_CD      0.000      1.000   0.293   0.260   0.477    0.591
GENDER      0.000      0.293   1.000   0.000   0.433    0.000
AGE_CD      0.000      0.260   0.000   1.000   0.254    0.267
POP_CNT     0.000      0.477   0.433   0.254   1.000    0.971
ECO_BH_CNT  0.000      0.591   0.000   0.267   0.971    1.000

            BS_YR_MON  ADM_CD  AGE_CD  POP_CNT  ECO_BH_CNT  GENDER
BS_YR_MON   1.000      -0.055  0.064   0.057    0.038       0.000
ADM_CD      -0.055     1.000   0.019   -0.529   -0.513      0.221
AGE_CD      0.064      0.019   1.000   -0.085   -0.142      0.000
POP_CNT     0.057      -0.529  -0.085  1.000    0.990       0.317
ECO_BH_CNT  0.038      -0.513  -0.142  0.990    1.000       0.000
GENDER      0.000      0.221   0.000   0.317    0.000       1.000
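
The three High correlation alerts can be traced back to the second correlation table above (the one with negative entries). The sketch below flags every pair whose absolute coefficient is at least 0.5. The threshold of 0.5 is an assumption chosen because it reproduces the alerts in this report, and the function name `high_correlations` is hypothetical.

```python
from typing import Dict, List

COLUMNS = ["BS_YR_MON", "ADM_CD", "AGE_CD", "POP_CNT", "ECO_BH_CNT", "GENDER"]

# Coefficients copied from the second correlation table of the report.
MATRIX = [
    [1.000, -0.055, 0.064, 0.057, 0.038, 0.000],
    [-0.055, 1.000, 0.019, -0.529, -0.513, 0.221],
    [0.064, 0.019, 1.000, -0.085, -0.142, 0.000],
    [0.057, -0.529, -0.085, 1.000, 0.990, 0.317],
    [0.038, -0.513, -0.142, 0.990, 1.000, 0.000],
    [0.000, 0.221, 0.000, 0.317, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_correlations(
    columns: List[str], matrix: List[List[float]], threshold: float
) -> Dict[str, List[str]]:
    """Map each column to the other columns it is strongly correlated with."""
    flagged: Dict[str, List[str]] = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


alerts = high_correlations(COLUMNS, MATRIX, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

With this threshold, ADM_CD, POP_CNT and ECO_BH_CNT are each flagged with the other two, which matches the alerts listed in the overview. GENDER and POP_CNT (0.317) stay below the cut-off.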

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
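
The per-column nullity shown in the first plot is a count of missing cells for each variable. The sketch below is a minimal standard-library version, not how ydata-profiling computes it. The function name `nullity_by_column` is hypothetical, the first row is taken from the report's sample, and the second row is made up to show a missing value.

```python
from typing import Dict, List, Optional

COLUMNS = ["BS_YR_MON", "ADM_CD", "GENDER", "AGE_CD", "POP_CNT", "ECO_BH_CNT"]

Row = Dict[str, Optional[int]]


def nullity_by_column(rows: List[Row], columns: List[str]) -> Dict[str, int]:
    """Count, for every column, how many rows have no value (None or absent)."""
    return {
        column: sum(1 for row in rows if row.get(column) is None)
        for column in columns
    }


rows: List[Row] = [
    # First row of the report's sample.
    {"BS_YR_MON": 202112, "ADM_CD": 11620630, "GENDER": 2,
     "AGE_CD": 65, "POP_CNT": 692, "ECO_BH_CNT": 598},
    # Hypothetical row with a missing POP_CNT, for illustration only.
    {"BS_YR_MON": 202112, "ADM_CD": 11620630, "GENDER": 1,
     "AGE_CD": 65, "POP_CNT": None, "ECO_BH_CNT": 598},
]

missing = nullity_by_column(rows, COLUMNS)
print(missing)
```

For the dataset profiled here every count would be zero, since the overview reports 0 missing cells.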

Sample

First rows
    BS_YR_MON  ADM_CD    GENDER  AGE_CD  POP_CNT  ECO_BH_CNT
0   202112     11620630  2       65      692      598
1   202006     45130400  2       55      57       40
2   202012     42820250  1       71      378      247
3   202009     48740320  1       70      60       47
4   202112     48127545  2       45      548      501
5   202203     41173510  2       25      865      833
6   202106     48740380  2       35      90       43
7   202203     45740320  1       35      20       13
8   202103     27170600  1       30      270      258
9   202009     30110590  1       45      394      277

Last rows
    BS_YR_MON  ADM_CD    GENDER  AGE_CD  POP_CNT  ECO_BH_CNT
89  202012     29200550  2       45      160      139
90  202203     45140410  1       30      139      86
91  202006     48890430  2       25      65       63
92  202003     28177620  2       25      802      733
93  202112     42770320  2       45      77       71
94  202106     48850410  1       71      420      277
95  202109     48220665  2       25      259      253
96  202109     41220330  2       30      777      747
97  202003     46720370  2       35      54       41
98  202003     41220600  1       30      561      522