Overview

Dataset statistics

Number of variables: 21
Number of observations: 2000
Missing cells: 13780
Missing cells (%): 32.8%
Duplicate rows: 167
Duplicate rows (%): 8.3%
Total size in memory: 365.4 KiB
Average record size in memory: 187.1 B
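
The percentages and the per-record size in this block follow from the raw counts. The short check below recomputes them from the figures listed above; the variable names are illustrative only and are not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 21
n_observations = 2000
missing_cells = 13780
total_size_kib = 365.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```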

Variable types

Text: 1
Numeric: 9
Unsupported: 2
Categorical: 9

Dataset

Description: Sample data
Author: (주)모토브 / 신재훈
URL: https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020251

Alerts

register_at has constant value "2020-09-14 0:00" [Constant]
Dataset has 167 (8.3%) duplicate rows [Duplicates]
m10 is highly imbalanced (69.5%) [Imbalance]
m20 is highly imbalanced (78.2%) [Imbalance]
m70 is highly imbalanced (71.3%) [Imbalance]
f10 is highly imbalanced (88.2%) [Imbalance]
f30 is highly imbalanced (80.1%) [Imbalance]
f40 is highly imbalanced (63.2%) [Imbalance]
f60 is highly imbalanced (82.6%) [Imbalance]
f70 is highly imbalanced (85.3%) [Imbalance]
m00 has 2000 (100.0%) missing values [Missing]
m30 has 1655 (82.8%) missing values [Missing]
m40 has 1628 (81.4%) missing values [Missing]
m50 has 1563 (78.1%) missing values [Missing]
m60 has 1591 (79.5%) missing values [Missing]
f00 has 2000 (100.0%) missing values [Missing]
f20 has 1749 (87.5%) missing values [Missing]
f50 has 1594 (79.7%) missing values [Missing]
m00 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
f00 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
total has 467 (23.4%) zeros [Zeros]
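
The imbalance percentages are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the category frequencies and k is the number of categories. The sketch below assumes that formula (it is inferred from the numbers, not quoted from the profiling tool) and applies it to the m10 value counts listed in this report; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts of m10 as listed in this report ("<NA>" is counted as a category).
m10_counts = [1778, 95, 63, 32, 32]
m10_imbalance = imbalance_score(m10_counts)
print(f"m10 imbalance: {m10_imbalance:.1%}")
```

A score of 0 means all categories are equally frequent; a score near 1 means a single category dominates, as the "<NA>" category does here.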

Reproduction

Analysis started: 2023-12-11 22:34:23.478860
Analysis finished: 2023-12-11 22:34:24.115507
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
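
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example, assuming the data is available as a CSV file; the function name `build_profile`, the file paths, and the report title are placeholders, and the configuration actually used for this report (config.json) is not reproduced here.

```python
def build_profile(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (needs pandas and ydata-profiling)."""
    # Imported lazily so this module can be loaded without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Sample data")
    profile.to_file(html_path)

# Example call (placeholder path, not the dataset's real file name):
# build_profile("sample.csv")
```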

Variables

taxi_id
Text

Distinct: 64
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 20000
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
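
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for the five sample identifiers shown in this report. The two-letter codes correspond to the category names used here: `Nd` is Decimal Number, `Lu` is Uppercase Letter, and `Pc` is Connector Punctuation. `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

# The five sample taxi_id values shown in this report.
sample_ids = ["T_94385645", "T_49402838", "T_47205528", "T_97608367", "T_15961504"]

def category_counts(values):
    """Count characters per Unicode general category (for example 'Nd', 'Lu', 'Pc')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

counts = category_counts(sample_ids)
for category, n in counts.most_common():
    print(category, n)
```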

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: T_94385645
2nd row: T_49402838
3rd row: T_47205528
4th row: T_97608367
5th row: T_15961504

Value                  Count  Frequency (%)
t_94385645             32     1.6%
t_92408066             32     1.6%
t_22699923             32     1.6%
t_17060159             32     1.6%
t_47791477             32     1.6%
t_98633779             32     1.6%
t_47425259             32     1.6%
t_94605377             32     1.6%
t_17133403             32     1.6%
t_47352015             32     1.6%
Other values (54)      1680   84.0%

Most occurring characters

Value                  Count  Frequency (%)
4                      2093   10.5%
T                      2000   10.0%
_                      2000   10.0%
7                      1987   9.9%
8                      1776   8.9%
9                      1568   7.8%
6                      1537   7.7%
0                      1535   7.7%
3                      1532   7.7%
2                      1525   7.6%
Other values (2)       2447   12.2%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         16000  80.0%
Uppercase Letter       2000   10.0%
Connector Punctuation  2000   10.0%

Most frequent character per category

Decimal Number
Value                  Count  Frequency (%)
4                      2093   13.1%
7                      1987   12.4%
8                      1776   11.1%
9                      1568   9.8%
6                      1537   9.6%
0                      1535   9.6%
3                      1532   9.6%
2                      1525   9.5%
1                      1377   8.6%
5                      1070   6.7%
Uppercase Letter
Value                  Count  Frequency (%)
T                      2000   100.0%
Connector Punctuation
Value                  Count  Frequency (%)
_                      2000   100.0%

Most occurring scripts

Value                  Count  Frequency (%)
Common                 18000  90.0%
Latin                  2000   10.0%

Most frequent character per script

Common
Value                  Count  Frequency (%)
4                      2093   11.6%
_                      2000   11.1%
7                      1987   11.0%
8                      1776   9.9%
9                      1568   8.7%
6                      1537   8.5%
0                      1535   8.5%
3                      1532   8.5%
2                      1525   8.5%
1                      1377   7.6%
Latin
Value                  Count  Frequency (%)
T                      2000   100.0%

Most occurring blocks

Value                  Count  Frequency (%)
ASCII                  20000  100.0%

Most frequent character per block

ASCII
Value                  Count  Frequency (%)
4                      2093   10.5%
T                      2000   10.0%
_                      2000   10.0%
7                      1987   9.9%
8                      1776   8.9%
9                      1568   7.8%
6                      1537   7.7%
0                      1535   7.7%
3                      1532   7.7%
2                      1525   7.6%
Other values (2)       2447   12.2%

latitude
Real number (ℝ)

Distinct: 1353
Distinct (%): 67.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.513271
Minimum: 37.329357
Maximum: 37.75941
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 37.329357
5-th percentile: 37.43614
Q1: 37.473595
Median: 37.506542
Q3: 37.53827
95-th percentile: 37.64166
Maximum: 37.75941
Range: 0.430053
Interquartile range (IQR): 0.064675

Descriptive statistics

Standard deviation: 0.065224721
Coefficient of variation (CV): 0.0017387106
Kurtosis: 2.9835217
Mean: 37.513271
Median Absolute Deviation (MAD): 0.032947
Skewness: 0.59582938
Sum: 75026.543
Variance: 0.0042542642
Monotonicity: Not monotonic
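
Several of these figures are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the latitude values listed above; because the inputs are already rounded, the results agree with the report only up to rounding.

```python
# Values copied from the latitude statistics in this report.
minimum, maximum = 37.329357, 37.75941
q1, q3 = 37.473595, 37.53827
mean, std = 37.513271, 0.065224721

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"Range: {value_range:.6f}")
print(f"IQR: {iqr:.6f}")
print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.10f}")
```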
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
37.53594             31     1.6%
37.5649              31     1.6%
37.65057             31     1.6%
37.50444             21     1.1%
37.57089             21     1.1%
37.464               20     1.0%
37.503742            18     0.9%
37.64166             16     0.8%
37.530537            16     0.8%
37.4907              15     0.8%
Other values (1343)  1780   89.0%

Extreme Values (minimum)

Value                Count  Frequency (%)
37.329357            1      0.1%
37.32951             1      0.1%
37.32966             1      0.1%
37.329815            1      0.1%
37.330135            1      0.1%
37.33031             1      0.1%
37.330494            1      0.1%
37.330685            1      0.1%
37.330875            1      0.1%
37.33107             1      0.1%

Extreme Values (maximum)

Value                Count  Frequency (%)
37.75941             1      0.1%
37.75938             2      0.1%
37.759357            1      0.1%
37.75933             1      0.1%
37.75929             1      0.1%
37.75923             1      0.1%
37.75919             1      0.1%
37.75914             1      0.1%
37.75911             1      0.1%
37.75907             1      0.1%

longitude
Real number (ℝ)

Distinct: 1007
Distinct (%): 50.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.87146
Minimum: 126.63241
Maximum: 127.24984
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 126.63241
5-th percentile: 126.66715
Q1: 126.70485
Median: 126.85901
Q3: 127.02955
95-th percentile: 127.11921
Maximum: 127.24984
Range: 0.61743
Interquartile range (IQR): 0.3246975

Descriptive statistics

Standard deviation: 0.16829362
Coefficient of variation (CV): 0.0013264892
Kurtosis: -1.2990669
Mean: 126.87146
Median Absolute Deviation (MAD): 0.158245
Skewness: 0.22885201
Sum: 253742.92
Variance: 0.028322741
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
126.73594            32     1.6%
126.899445           31     1.6%
126.63241            31     1.6%
126.83376            31     1.6%
126.83854            30     1.5%
126.68101            27     1.4%
126.703354           26     1.3%
127.0187             24     1.2%
127.04489            21     1.1%
126.76331            19     0.9%
Other values (997)   1728   86.4%

Extreme Values (minimum)

Value                Count  Frequency (%)
126.63241            31     1.6%
126.64281            2      0.1%
126.64282            2      0.1%
126.64285            1      0.1%
126.64287            2      0.1%
126.64289            3      0.1%
126.64292            2      0.1%
126.64294            2      0.1%
126.64298            4      0.2%
126.643036           7      0.4%

Extreme Values (maximum)

Value                Count  Frequency (%)
127.24984            1      0.1%
127.24971            1      0.1%
127.24958            1      0.1%
127.24944            1      0.1%
127.24934            1      0.1%
127.24922            1      0.1%
127.24914            1      0.1%
127.24909            1      0.1%
127.249054           1      0.1%
127.24902            1      0.1%

m00
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2000
Missing (%): 100.0%
Memory size: 17.7 KiB

m10
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.016
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 50.0
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1778   88.9%
20.0        95     4.8%
50.0        63     3.1%
33.33       32     1.6%
12.5        32     1.6%
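
A Missing count of 0 next to an "<NA>" share of 88.9% suggests that missing entries in m10 were read as the literal four-character string "<NA>" rather than as real missing values. That reading is an inference from the numbers, not something the report states. The sketch below checks it against the reported mean length and shows the effective missing share; `NA_TOKEN` is an assumed marker.

```python
# Value counts of m10 as listed under "Common Values".
m10_counts = {"<NA>": 1778, "20.0": 95, "50.0": 63, "33.33": 32, "12.5": 32}

n_rows = sum(m10_counts.values())

# Mean string length, counting "<NA>" as an ordinary 4-character value.
mean_length = sum(len(value) * count for value, count in m10_counts.items()) / n_rows

# Assumption: this exact token marks a missing value.
NA_TOKEN = "<NA>"
effective_missing = m10_counts[NA_TOKEN]
effective_missing_pct = 100 * effective_missing / n_rows

print(f"Rows: {n_rows}")
print(f"Mean length: {mean_length}")
print(f"Effectively missing: {effective_missing} ({effective_missing_pct:.1f}%)")
```

The computed mean length matches the reported 4.016, which is what one would expect if the placeholder strings are included in the length statistics.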

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

m20
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.889
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1873   93.7%
50          63     3.1%
100         32     1.6%
20          32     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

m30
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 1.7%
Missing: 1655
Missing (%): 82.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.236406
Minimum: 12.5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 12.5
5-th percentile: 12.5
Q1: 16.67
Median: 33.33
Q3: 50
95-th percentile: 100
Maximum: 100
Range: 87.5
Interquartile range (IQR): 33.33

Descriptive statistics

Standard deviation: 24.148233
Coefficient of variation (CV): 0.68532056
Kurtosis: 2.0644617
Mean: 35.236406
Median Absolute Deviation (MAD): 16.66
Skewness: 1.6408368
Sum: 12156.56
Variance: 583.13717
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common Values

Value       Count  Frequency (%)
33.33       94     4.7%
50.0        63     3.1%
16.67       62     3.1%
20.0        62     3.1%
100.0       32     1.6%
12.5        32     1.6%
(Missing)   1655   82.8%

Extreme Values (minimum)

Value       Count  Frequency (%)
12.5        32     1.6%
16.67       62     3.1%
20.0        62     3.1%
33.33       94     4.7%
50.0        63     3.1%
100.0       32     1.6%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       32     1.6%
50.0        63     3.1%
33.33       94     4.7%
20.0        62     3.1%
16.67       62     3.1%
12.5        32     1.6%
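
Because m30 takes only six distinct values, its summary statistics can be recomputed directly from the value counts. The sketch below expands the frequency table into a list of observations and derives the missing count, sum, mean, and median using Python's standard library; it is a cross-check of the reported figures, not the profiling tool's own computation.

```python
import statistics

# Non-missing m30 values and their counts, copied from the value counts in this report.
m30_counts = {12.5: 32, 16.67: 62, 20.0: 62, 33.33: 94, 50.0: 63, 100.0: 32}
n_rows = 2000  # number of observations in the dataset

# Expand the frequency table into the underlying list of observations.
values = [value for value, count in m30_counts.items() for _ in range(count)]

n_present = len(values)
n_missing = n_rows - n_present
total = sum(values)
mean = total / n_present
median = statistics.median(values)

print(f"Present: {n_present}, missing: {n_missing}")
print(f"Sum: {total:.2f}, mean: {mean:.6f}, median: {median}")
```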

m40
Real number (ℝ)

MISSING 

Distinct: 7
Distinct (%): 1.9%
Missing: 1628
Missing (%): 81.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.703118
Minimum: 12.5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 12.5
5-th percentile: 12.5
Q1: 20
Median: 33.33
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 87.5
Interquartile range (IQR): 80

Descriptive statistics

Standard deviation: 35.461366
Coefficient of variation (CV): 0.69939221
Kurtosis: -1.4910759
Mean: 50.703118
Median Absolute Deviation (MAD): 16.66
Skewness: 0.55862873
Sum: 18861.56
Variance: 1257.5085
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value       Count  Frequency (%)
100.0       121    6.0%
25.0        63     3.1%
33.33       63     3.1%
12.5        32     1.6%
20.0        31     1.6%
16.67       31     1.6%
50.0        31     1.6%
(Missing)   1628   81.4%

Extreme Values (minimum)

Value       Count  Frequency (%)
12.5        32     1.6%
16.67       31     1.6%
20.0        31     1.6%
25.0        63     3.1%
33.33       63     3.1%
50.0        31     1.6%
100.0       121    6.0%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       121    6.0%
50.0        31     1.6%
33.33       63     3.1%
25.0        63     3.1%
20.0        31     1.6%
16.67       31     1.6%
12.5        32     1.6%

m50
Real number (ℝ)

MISSING 

Distinct: 7
Distinct (%): 1.6%
Missing: 1563
Missing (%): 78.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.063066
Minimum: 16.67
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 16.67
5-th percentile: 16.67
Q1: 33.33
Median: 40
Q3: 66.67
95-th percentile: 100
Maximum: 100
Range: 83.33
Interquartile range (IQR): 33.34

Descriptive statistics

Standard deviation: 27.556809
Coefficient of variation (CV): 0.5292967
Kurtosis: -0.695507
Mean: 52.063066
Median Absolute Deviation (MAD): 10
Skewness: 0.84551592
Sum: 22751.56
Variance: 759.37773
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value       Count  Frequency (%)
100.0       94     4.7%
33.33       94     4.7%
40.0        94     4.7%
50.0        62     3.1%
16.67       31     1.6%
66.67       31     1.6%
25.0        31     1.6%
(Missing)   1563   78.1%

Extreme Values (minimum)

Value       Count  Frequency (%)
16.67       31     1.6%
25.0        31     1.6%
33.33       94     4.7%
40.0        94     4.7%
50.0        62     3.1%
66.67       31     1.6%
100.0       94     4.7%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       94     4.7%
66.67       31     1.6%
50.0        62     3.1%
40.0        94     4.7%
33.33       94     4.7%
25.0        31     1.6%
16.67       31     1.6%

m60
Real number (ℝ)

MISSING 

Distinct: 7
Distinct (%): 1.7%
Missing: 1591
Missing (%): 79.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.948386
Minimum: 12.5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 12.5
5-th percentile: 12.5
Q1: 20
Median: 33.33
Q3: 50
95-th percentile: 100
Maximum: 100
Range: 87.5
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 23.696252
Coefficient of variation (CV): 0.60840139
Kurtosis: 0.9204954
Mean: 38.948386
Median Absolute Deviation (MAD): 16.66
Skewness: 1.1991378
Sum: 15929.89
Variance: 561.51237
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value       Count  Frequency (%)
33.33       95     4.8%
50.0        94     4.7%
20.0        94     4.7%
100.0       32     1.6%
12.5        32     1.6%
16.67       31     1.6%
66.67       31     1.6%
(Missing)   1591   79.5%

Extreme Values (minimum)

Value       Count  Frequency (%)
12.5        32     1.6%
16.67       31     1.6%
20.0        94     4.7%
33.33       95     4.8%
50.0        94     4.7%
66.67       31     1.6%
100.0       32     1.6%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       32     1.6%
66.67       31     1.6%
50.0        94     4.7%
33.33       95     4.8%
20.0        94     4.7%
16.67       31     1.6%
12.5        32     1.6%

m70
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.0625
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 16.67
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1782   89.1%
16.67       62     3.1%
20.0        62     3.1%
33.33       32     1.6%
50.0        31     1.6%
100.0       31     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

f00
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2000
Missing (%): 100.0%
Memory size: 17.7 KiB

f10
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.968
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1968   98.4%
20          32     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

f20
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 2.4%
Missing: 1749
Missing (%): 87.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.380717
Minimum: 12.5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 12.5
5-th percentile: 12.5
Q1: 20.835
Median: 33.33
Q3: 75
95-th percentile: 100
Maximum: 100
Range: 87.5
Interquartile range (IQR): 54.165

Descriptive statistics

Standard deviation: 32.895693
Coefficient of variation (CV): 0.70925365
Kurtosis: -0.95319718
Mean: 46.380717
Median Absolute Deviation (MAD): 16.67
Skewness: 0.816995
Sum: 11641.56
Variance: 1082.1266
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common Values

Value       Count  Frequency (%)
100.0       63     3.1%
33.33       63     3.1%
12.5        32     1.6%
25.0        31     1.6%
16.67       31     1.6%
50.0        31     1.6%
(Missing)   1749   87.5%

Extreme Values (minimum)

Value       Count  Frequency (%)
12.5        32     1.6%
16.67       31     1.6%
25.0        31     1.6%
33.33       63     3.1%
50.0        31     1.6%
100.0       63     3.1%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       63     3.1%
50.0        31     1.6%
33.33       63     3.1%
25.0        31     1.6%
16.67       31     1.6%
12.5        32     1.6%

f30
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.031
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1875   93.8%
12.5        32     1.6%
66.67       31     1.6%
33.33       31     1.6%
50.0        31     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

f40
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.765
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 50
4th row: 25
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1717   85.9%
100         96     4.8%
50          93     4.7%
25          63     3.1%
20          31     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

f50
Real number (ℝ)

MISSING 

Distinct: 7
Distinct (%): 1.7%
Missing: 1594
Missing (%): 79.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.130542
Minimum: 16.67
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 16.67
5-th percentile: 16.67
Q1: 25
Median: 40
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 83.33
Interquartile range (IQR): 75

Descriptive statistics

Standard deviation: 33.351262
Coefficient of variation (CV): 0.63976434
Kurtosis: -1.38986
Mean: 52.130542
Median Absolute Deviation (MAD): 20
Skewness: 0.57577563
Sum: 21165
Variance: 1112.3066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value       Count  Frequency (%)
100.0       124    6.2%
25.0        63     3.1%
20.0        63     3.1%
50.0        62     3.1%
40.0        32     1.6%
16.67       31     1.6%
33.33       31     1.6%
(Missing)   1594   79.7%

Extreme Values (minimum)

Value       Count  Frequency (%)
16.67       31     1.6%
20.0        63     3.1%
25.0        63     3.1%
33.33       31     1.6%
40.0        32     1.6%
50.0        62     3.1%
100.0       124    6.2%

Extreme Values (maximum)

Value       Count  Frequency (%)
100.0       124    6.2%
50.0        62     3.1%
40.0        32     1.6%
33.33       31     1.6%
25.0        63     3.1%
20.0        63     3.1%
16.67       31     1.6%

f60
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.906
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1906   95.3%
50          32     1.6%
20          31     1.6%
25          31     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

f70
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.0315
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value       Count  Frequency (%)
<NA>        1937   96.9%
33.33       32     1.6%
100.0       31     1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

total
Real number (ℝ)

ZEROS 

Distinct: 32
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.6665
Minimum: 0
Maximum: 56
Zeros: 467
Zeros (%): 23.4%
Negative: 0
Negative (%): 0.0%
Memory size: 17.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 20
Q3: 28
95-th percentile: 44
Maximum: 56
Range: 56
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 15.390637
Coefficient of variation (CV): 0.82450578
Kurtosis: -0.72634962
Mean: 18.6665
Median Absolute Deviation (MAD): 12
Skewness: 0.44046994
Sum: 37333
Variance: 236.87171
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)

Common Values

Value              Count  Frequency (%)
0                  467    23.4%
18                 126    6.3%
20                 126    6.3%
26                 124    6.2%
23                 94     4.7%
9                  88     4.4%
8                  64     3.2%
44                 63     3.1%
41                 62     3.1%
21                 62     3.1%
Other values (22)  724    36.2%

Extreme Values (minimum)

Value              Count  Frequency (%)
0                  467    23.4%
3                  32     1.6%
5                  32     1.6%
6                  62     3.1%
8                  64     3.2%
9                  88     4.4%
10                 32     1.6%
12                 32     1.6%
13                 31     1.6%
15                 31     1.6%

Extreme Values (maximum)

Value              Count  Frequency (%)
56                 32     1.6%
53                 31     1.6%
45                 31     1.6%
44                 63     3.1%
43                 32     1.6%
41                 62     3.1%
40                 31     1.6%
38                 32     1.6%
36                 32     1.6%
35                 32     1.6%

register_at
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-09-14 0:00
2nd row: 2020-09-14 0:00
3rd row: 2020-09-14 0:00
4th row: 2020-09-14 0:00
5th row: 2020-09-14 0:00

Common Values

Value            Count  Frequency (%)
2020-09-14 0:00  2000   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values listed under Common Values]

Sample

First rows

      taxi_id     latitude   longitude   m00    m10    m20    m30    m40    m50    m60    m70    f00    f10    f20    f30    f40    f50    f60    f70    total  register_at
0     T_94385645  37.423923  126.64315   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   28     2020-09-14 0:00
1     T_49402838  37.565292  127.05467   <NA>   <NA>   <NA>   16.67  <NA>   50.0   <NA>   16.67  <NA>   <NA>   <NA>   <NA>   <NA>   16.67  <NA>   <NA>   44     2020-09-14 0:00
2     T_47205528  37.65406   127.24984   <NA>   50.0   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50     <NA>   <NA>   <NA>   6      2020-09-14 0:00
3     T_97608367  37.5649    126.83376   <NA>   <NA>   <NA>   <NA>   25.0   <NA>   <NA>   <NA>   <NA>   <NA>   25.0   <NA>   25     25.0   <NA>   <NA>   18     2020-09-14 0:00
4     T_15961504  37.47155   126.70179   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   20     2020-09-14 0:00
5     T_49183107  37.543686  127.07258   <NA>   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   24     2020-09-14 0:00
6     T_48084452  37.343693  127.18032   <NA>   <NA>   <NA>   <NA>   <NA>   33.33  <NA>   <NA>   <NA>   <NA>   <NA>   66.67  <NA>   <NA>   <NA>   <NA>   40     2020-09-14 0:00
7     T_23725334  37.56094   126.85555   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   0      2020-09-14 0:00
8     T_73688712  37.47894   127.12365   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50.0   <NA>   <NA>   <NA>   <NA>   50     <NA>   <NA>   <NA>   13     2020-09-14 0:00
9     T_46180116  37.516018  127.10864   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50.0   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50.0   <NA>   <NA>   23     2020-09-14 0:00

Last rows

      taxi_id     latitude   longitude   m00    m10    m20    m30    m40    m50    m60    m70    f00    f10    f20    f30    f40    f50    f60    f70    total  register_at
1990  T_66584075  37.463818  126.681656  <NA>   <NA>   <NA>   33.33  <NA>   <NA>   33.33  33.33  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   23     2020-09-14 0:00
1991  T_44934973  37.5062    126.71486   <NA>   20.0   <NA>   <NA>   <NA>   <NA>   20.0   <NA>   <NA>   20     <NA>   <NA>   <NA>   40.0   <NA>   <NA>   38     2020-09-14 0:00
1992  T_43030638  37.521824  126.7048    <NA>   20.0   20     <NA>   <NA>   40.0   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   20.0   <NA>   <NA>   18     2020-09-14 0:00
1993  T_72443569  37.48125   126.91473   <NA>   33.33  <NA>   <NA>   <NA>   <NA>   33.33  <NA>   <NA>   <NA>   33.33  <NA>   <NA>   <NA>   <NA>   <NA>   35     2020-09-14 0:00
1994  T_94385645  37.42445   126.64308   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   28     2020-09-14 0:00
1995  T_15961504  37.471523  126.70176   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   20     2020-09-14 0:00
1996  T_92115092  37.525475  126.72923   <NA>   <NA>   <NA>   50.0   25.0   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   25     <NA>   <NA>   <NA>   44     2020-09-14 0:00
1997  T_94751864  37.471703  126.69085   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   18     2020-09-14 0:00
1998  T_68268680  37.446762  126.66715   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50.0   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   50     <NA>   20     2020-09-14 0:00
1999  T_43470100  37.439774  126.673744  <NA>   50.0   50     <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   43     2020-09-14 0:00

Duplicate rows

Most frequently occurring

      taxi_id     latitude   longitude   m10    m20    m30    m40    m50    m60    m70    f10    f20    f30    f40    f50    f60    f70    total  register_at      # duplicates
10    T_17353134  37.65057   126.63241   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   0      2020-09-14 0:00  31
152   T_97608367  37.5649    126.83376   <NA>   <NA>   <NA>   25.0   <NA>   <NA>   <NA>   <NA>   25.0   <NA>   25     25.0   <NA>   <NA>   18     2020-09-14 0:00  31
163   T_98487291  37.53594   126.899445  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   0      2020-09-14 0:00  31
145   T_94605377  37.57089   126.73594   <NA>   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   5      2020-09-14 0:00  21
88    T_68122192  37.50444   126.76331   <NA>   <NA>   33.33  <NA>   66.67  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   23     2020-09-14 0:00  19
98    T_69513822  37.464     126.68101   <NA>   <NA>   16.67  16.67  16.67  16.67  16.67  <NA>   16.67  <NA>   <NA>   <NA>   <NA>   <NA>   26     2020-09-14 0:00  19
48    T_44934973  37.503742  126.714584  20.0   <NA>   <NA>   <NA>   <NA>   20.0   <NA>   20     <NA>   <NA>   <NA>   40.0   <NA>   <NA>   38     2020-09-14 0:00  18
151   T_96875931  37.64166   127.029854  <NA>   <NA>   <NA>   <NA>   33.33  <NA>   <NA>   <NA>   <NA>   33.33  <NA>   33.33  <NA>   <NA>   30     2020-09-14 0:00  16
165   T_98633779  37.530537  126.84336   12.5   <NA>   12.5   12.5   <NA>   12.5   <NA>   <NA>   12.5   12.5   <NA>   25.0   <NA>   <NA>   56     2020-09-14 0:00  16
126   T_74494392  37.4907    126.98211   <NA>   <NA>   <NA>   100.0  <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   15     2020-09-14 0:00  15
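
A table like this one can be built by grouping identical rows and counting how often each group occurs. The sketch below does that with the standard library on a hypothetical mini-sample (the rows are invented for illustration and are not taken from the dataset); `duplicate_groups` is a hypothetical helper name. Row labels in the table above reach 165, which hints that the reported 167 duplicate rows may count distinct repeated groups rather than surplus copies, but the report does not say so explicitly.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return a Counter of the rows that occur more than once, with their occurrence counts."""
    counts = Counter(tuple(row) for row in rows)
    return Counter({row: n for row, n in counts.items() if n > 1})

# Hypothetical mini-sample (taxi_id, latitude, longitude, total), not rows from the dataset.
rows = [
    ("T_1", 37.50, 126.70, 0),
    ("T_1", 37.50, 126.70, 0),
    ("T_1", 37.50, 126.70, 0),
    ("T_2", 37.60, 126.80, 18),
    ("T_3", 37.40, 126.90, 5),
    ("T_3", 37.40, 126.90, 5),
]

repeated = duplicate_groups(rows)
print(f"Distinct repeated groups: {len(repeated)}")
for row, n in repeated.most_common():
    print(n, row)
```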