Dataset statistics
Number of variables | 21 |
---|---|
Number of observations | 2000 |
Missing cells | 13780 |
Missing cells (%) | 32.8% |
Duplicate rows | 167 |
Duplicate rows (%) | 8.3% |
Total size in memory | 365.4 KiB |
Average record size in memory | 187.1 B |
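The summary figures above can be cross-checked against each other with a little arithmetic. The sketch below recomputes the missing-cell total and percentage from the per-column missing counts listed in this report's alerts, and the average record size from the total memory figure. All variable names are illustrative.

```python
# Figures copied from this report; the names are illustrative only.
n_vars, n_obs = 21, 2000
missing_by_column = {
    "m00": 2000, "m30": 1655, "m40": 1628, "m50": 1563,
    "m60": 1591, "f00": 2000, "f20": 1749, "f50": 1594,
}

# Missing cells are the sum of the per-column missing counts.
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_vars * n_obs)

# 365.4 KiB spread over 2000 records, expressed in bytes.
avg_record_bytes = 365.4 * 1024 / n_obs

print(missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

The three printed values should line up with the "Missing cells", "Missing cells (%)" and "Average record size in memory" rows.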
Variable types
Text | 1 |
---|---|
Numeric | 9 |
Unsupported | 2 |
Categorical | 9 |
Dataset
Description | Sample data |
---|---|
Author | (주)모토브 / 신재훈 |
URL | https://www.bigdata-transportation.kr/frn/prdt/detail?prdtId=PRDTNUM_000000020251 |
register_at has constant value "2020-09-14 0:00" | Constant |
Dataset has 167 (8.3%) duplicate rows | Duplicates |
m10 is highly imbalanced (69.5%) | Imbalance |
m20 is highly imbalanced (78.2%) | Imbalance |
m70 is highly imbalanced (71.3%) | Imbalance |
f10 is highly imbalanced (88.2%) | Imbalance |
f30 is highly imbalanced (80.1%) | Imbalance |
f40 is highly imbalanced (63.2%) | Imbalance |
f60 is highly imbalanced (82.6%) | Imbalance |
f70 is highly imbalanced (85.3%) | Imbalance |
m00 has 2000 (100.0%) missing values | Missing |
m30 has 1655 (82.8%) missing values | Missing |
m40 has 1628 (81.4%) missing values | Missing |
m50 has 1563 (78.1%) missing values | Missing |
m60 has 1591 (79.5%) missing values | Missing |
f00 has 2000 (100.0%) missing values | Missing |
f20 has 1749 (87.5%) missing values | Missing |
f50 has 1594 (79.7%) missing values | Missing |
m00 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
f00 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
total has 467 (23.4%) zeros | Zeros |
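The report does not say how the imbalance percentages are computed. One definition that reproduces the figures reported for m10 and f10 is one minus the normalized Shannon entropy of the class frequencies, with the `<NA>` token counted as a class. The sketch below assumes that definition; `imbalance_score` is an illustrative name, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is not scored as imbalanced here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the "Common Values" tables of this report.
m10_counts = [1778, 95, 63, 32, 32]   # <NA>, 20.0, 50.0, 33.33, 12.5
f10_counts = [1968, 32]               # <NA>, 20

print(f"m10: {imbalance_score(m10_counts):.1%}")
print(f"f10: {imbalance_score(f10_counts):.1%}")
```

A score near 100% means almost all rows fall in one class, while 0% means the classes are equally frequent.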
Reproduction
Analysis started | 2023-12-11 22:34:23.478860 |
---|---|
Analysis finished | 2023-12-11 22:34:24.115507 |
Duration | 0.64 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
taxi_id
Text
Distinct | 64 |
---|---|
Distinct (%) | 3.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 20000 |
---|---|
Distinct characters | 12 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | T_94385645 |
---|---|
2nd row | T_49402838 |
3rd row | T_47205528 |
4th row | T_97608367 |
5th row | T_15961504 |
Value | Count | Frequency (%) |
t_94385645 | 32 | 1.6% |
t_92408066 | 32 | 1.6% |
t_22699923 | 32 | 1.6% |
t_17060159 | 32 | 1.6% |
t_47791477 | 32 | 1.6% |
t_98633779 | 32 | 1.6% |
t_47425259 | 32 | 1.6% |
t_94605377 | 32 | 1.6% |
t_17133403 | 32 | 1.6% |
t_47352015 | 32 | 1.6% |
Other values (54) | 1680 |
Most occurring characters
Value | Count | Frequency (%) |
4 | 2093 | |
T | 2000 | |
_ | 2000 | |
7 | 1987 | |
8 | 1776 | |
9 | 1568 | |
6 | 1537 | |
0 | 1535 | |
3 | 1532 | |
2 | 1525 | |
Other values (2) | 2447 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 16000 | 80.0% |
Uppercase Letter | 2000 | 10.0% |
Connector Punctuation | 2000 | 10.0% |
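The category names in this table are Unicode general categories, which Python's standard `unicodedata` module exposes as two-letter codes. The sketch below classifies the first sample ID and scales the result to 2000 rows, assuming every ID follows the same pattern of `T`, `_` and eight digits (the category totals are consistent with that).

```python
import unicodedata
from collections import Counter

sample_id = "T_94385645"  # first sample row of taxi_id

# unicodedata.category returns two-letter general-category codes:
# Nd = Decimal Number, Lu = Uppercase Letter, Pc = Connector Punctuation.
per_id = Counter(unicodedata.category(ch) for ch in sample_id)

# Assumption: all 2000 IDs share this layout, so per-ID counts scale linearly.
per_column = {cat: n * 2000 for cat, n in per_id.items()}
print(per_column)
```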
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
4 | 2093 | |
7 | 1987 | |
8 | 1776 | |
9 | 1568 | |
6 | 1537 | |
0 | 1535 | |
3 | 1532 | |
2 | 1525 | |
1 | 1377 | |
5 | 1070 |
Uppercase Letter
Value | Count | Frequency (%) |
T | 2000 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2000 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 18000 | 90.0% |
Latin | 2000 | 10.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
4 | 2093 | |
_ | 2000 | |
7 | 1987 | |
8 | 1776 | |
9 | 1568 | |
6 | 1537 | |
0 | 1535 | |
3 | 1532 | |
2 | 1525 | |
1 | 1377 |
Latin
Value | Count | Frequency (%) |
T | 2000 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 20000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
4 | 2093 | |
T | 2000 | |
_ | 2000 | |
7 | 1987 | |
8 | 1776 | |
9 | 1568 | |
6 | 1537 | |
0 | 1535 | |
3 | 1532 | |
2 | 1525 | |
Other values (2) | 2447 |
latitude
Real number (ℝ)
Distinct | 1353 |
---|---|
Distinct (%) | 67.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 37.513271 |
Minimum | 37.329357 |
---|---|
Maximum | 37.75941 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 37.329357 |
---|---|
5-th percentile | 37.43614 |
Q1 | 37.473595 |
median | 37.506542 |
Q3 | 37.53827 |
95-th percentile | 37.64166 |
Maximum | 37.75941 |
Range | 0.430053 |
Interquartile range (IQR) | 0.064675 |
Descriptive statistics
Standard deviation | 0.065224721 |
---|---|
Coefficient of variation (CV) | 0.0017387106 |
Kurtosis | 2.9835217 |
Mean | 37.513271 |
Median Absolute Deviation (MAD) | 0.032947 |
Skewness | 0.59582938 |
Sum | 75026.543 |
Variance | 0.0042542642 |
Monotonicity | Not monotonic |
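Several of the latitude statistics are simple functions of others in the same tables. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. It uses the rounded values printed in this report, so agreement is only expected up to rounding.

```python
# Values copied from the latitude tables in this report.
minimum, maximum = 37.329357, 37.75941
q1, q3 = 37.473595, 37.53827
mean, std = 37.513271, 0.065224721

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(f"range={value_range:.6f} iqr={iqr:.6f}")
print(f"cv={cv:.8f} variance={variance:.8f}")
```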
Value | Count | Frequency (%) |
37.53594 | 31 | 1.6% |
37.5649 | 31 | 1.6% |
37.65057 | 31 | 1.6% |
37.50444 | 21 | 1.1% |
37.57089 | 21 | 1.1% |
37.464 | 20 | 1.0% |
37.503742 | 18 | 0.9% |
37.64166 | 16 | 0.8% |
37.530537 | 16 | 0.8% |
37.4907 | 15 | 0.8% |
Other values (1343) | 1780 |
Value | Count | Frequency (%) |
37.329357 | 1 | |
37.32951 | 1 | |
37.32966 | 1 | |
37.329815 | 1 | |
37.330135 | 1 | |
37.33031 | 1 | |
37.330494 | 1 | |
37.330685 | 1 | |
37.330875 | 1 | |
37.33107 | 1 |
Value | Count | Frequency (%) |
37.75941 | 1 | |
37.75938 | 2 | |
37.759357 | 1 | |
37.75933 | 1 | |
37.75929 | 1 | |
37.75923 | 1 | |
37.75919 | 1 | |
37.75914 | 1 | |
37.75911 | 1 | |
37.75907 | 1 |
longitude
Real number (ℝ)
Distinct | 1007 |
---|---|
Distinct (%) | 50.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 126.87146 |
Minimum | 126.63241 |
---|---|
Maximum | 127.24984 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 126.63241 |
---|---|
5-th percentile | 126.66715 |
Q1 | 126.70485 |
median | 126.85901 |
Q3 | 127.02955 |
95-th percentile | 127.11921 |
Maximum | 127.24984 |
Range | 0.61743 |
Interquartile range (IQR) | 0.3246975 |
Descriptive statistics
Standard deviation | 0.16829362 |
---|---|
Coefficient of variation (CV) | 0.0013264892 |
Kurtosis | -1.2990669 |
Mean | 126.87146 |
Median Absolute Deviation (MAD) | 0.158245 |
Skewness | 0.22885201 |
Sum | 253742.92 |
Variance | 0.028322741 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
126.73594 | 32 | 1.6% |
126.899445 | 31 | 1.6% |
126.63241 | 31 | 1.6% |
126.83376 | 31 | 1.6% |
126.83854 | 30 | 1.5% |
126.68101 | 27 | 1.4% |
126.703354 | 26 | 1.3% |
127.0187 | 24 | 1.2% |
127.04489 | 21 | 1.1% |
126.76331 | 19 | 0.9% |
Other values (997) | 1728 |
Value | Count | Frequency (%) |
126.63241 | 31 | |
126.64281 | 2 | 0.1% |
126.64282 | 2 | 0.1% |
126.64285 | 1 | 0.1% |
126.64287 | 2 | 0.1% |
126.64289 | 3 | 0.1% |
126.64292 | 2 | 0.1% |
126.64294 | 2 | 0.1% |
126.64298 | 4 | 0.2% |
126.643036 | 7 | 0.4% |
Value | Count | Frequency (%) |
127.24984 | 1 | |
127.24971 | 1 | |
127.24958 | 1 | |
127.24944 | 1 | |
127.24934 | 1 | |
127.24922 | 1 | |
127.24914 | 1 | |
127.24909 | 1 | |
127.249054 | 1 | |
127.24902 | 1 |
m00
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 2000 |
---|---|
Missing (%) | 100.0% |
Memory size | 17.7 KiB |
m10
Categorical
IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
20.0 | 95 |
50.0 | 63 |
33.33 | 32 |
12.5 | 32 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.016 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | 50.0 |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1778 | 88.9% |
20.0 | 95 | 4.8% |
50.0 | 63 | 3.1% |
33.33 | 32 | 1.6% |
12.5 | 32 | 1.6% |
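m10 is reported with 0 missing values even though `<NA>` is its most common value, which suggests the token is stored as a literal four-character string rather than as a true null. The sketch below is consistent with that reading: treating `<NA>` as an ordinary string reproduces the reported mean length of 4.016. `parse_share` is a hypothetical helper showing one way to recover real missing values before numeric analysis.

```python
from typing import Optional

# Value counts copied from the m10 "Common Values" table.
m10_counts = {"<NA>": 1778, "20.0": 95, "50.0": 63, "33.33": 32, "12.5": 32}

n_rows = sum(m10_counts.values())
# Mean string length, with "<NA>" counted as a 4-character value.
mean_length = sum(len(v) * c for v, c in m10_counts.items()) / n_rows

def parse_share(raw: str) -> Optional[float]:
    """Hypothetical helper: map the "<NA>" token to None, anything else to float."""
    return None if raw == "<NA>" else float(raw)

# Rows that would become missing once the token is parsed as a null.
true_missing = sum(c for v, c in m10_counts.items() if parse_share(v) is None)
print(n_rows, mean_length, true_missing)
```

After such a conversion the column would look like the numeric m30 to m60 columns, with most rows missing instead of "Categorical" with a dominant `<NA>` class.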
m20
Categorical
IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
50 | 63 |
100 | 32 |
20 | 32 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.889 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1873 | |
50 | 63 | 3.1% |
100 | 32 | 1.6% |
20 | 32 | 1.6% |
m30
Real number (ℝ)
MISSING
 
Distinct | 6 |
---|---|
Distinct (%) | 1.7% |
Missing | 1655 |
Missing (%) | 82.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 35.236406 |
Minimum | 12.5 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 12.5 |
---|---|
5-th percentile | 12.5 |
Q1 | 16.67 |
median | 33.33 |
Q3 | 50 |
95-th percentile | 100 |
Maximum | 100 |
Range | 87.5 |
Interquartile range (IQR) | 33.33 |
Descriptive statistics
Standard deviation | 24.148233 |
---|---|
Coefficient of variation (CV) | 0.68532056 |
Kurtosis | 2.0644617 |
Mean | 35.236406 |
Median Absolute Deviation (MAD) | 16.66 |
Skewness | 1.6408368 |
Sum | 12156.56 |
Variance | 583.13717 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
33.33 | 94 | 4.7% |
50.0 | 63 | 3.1% |
16.67 | 62 | 3.1% |
20.0 | 62 | 3.1% |
100.0 | 32 | 1.6% |
12.5 | 32 | 1.6% |
(Missing) | 1655 |
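For numeric columns the statistics are computed over the non-missing rows only. The sketch below rebuilds the m30 count, sum, mean and distinct percentage from the value counts in this table, which shows in particular that "Distinct (%)" is relative to the 345 present values rather than to all 2000 rows. Variable names are illustrative.

```python
# Value counts copied from the m30 table; missing rows are excluded.
m30_counts = {33.33: 94, 50.0: 63, 16.67: 62, 20.0: 62, 100.0: 32, 12.5: 32}

n_obs = 2000
n_present = sum(m30_counts.values())
total = sum(value * count for value, count in m30_counts.items())

mean = total / n_present
distinct_pct = 100 * len(m30_counts) / n_present
missing = n_obs - n_present

print(n_present, missing, round(total, 2), round(mean, 6), round(distinct_pct, 1))
```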
Value | Count | Frequency (%) |
12.5 | 32 | 1.6% |
16.67 | 62 | |
20.0 | 62 | |
33.33 | 94 | |
50.0 | 63 | |
100.0 | 32 | 1.6% |
Value | Count | Frequency (%) |
100.0 | 32 | 1.6% |
50.0 | 63 | |
33.33 | 94 | |
20.0 | 62 | |
16.67 | 62 | |
12.5 | 32 | 1.6% |
m40
Real number (ℝ)
MISSING
 
Distinct | 7 |
---|---|
Distinct (%) | 1.9% |
Missing | 1628 |
Missing (%) | 81.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 50.703118 |
Minimum | 12.5 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 12.5 |
---|---|
5-th percentile | 12.5 |
Q1 | 20 |
median | 33.33 |
Q3 | 100 |
95-th percentile | 100 |
Maximum | 100 |
Range | 87.5 |
Interquartile range (IQR) | 80 |
Descriptive statistics
Standard deviation | 35.461366 |
---|---|
Coefficient of variation (CV) | 0.69939221 |
Kurtosis | -1.4910759 |
Mean | 50.703118 |
Median Absolute Deviation (MAD) | 16.66 |
Skewness | 0.55862873 |
Sum | 18861.56 |
Variance | 1257.5085 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
100.0 | 121 | 6.0% |
25.0 | 63 | 3.1% |
33.33 | 63 | 3.1% |
12.5 | 32 | 1.6% |
20.0 | 31 | 1.6% |
16.67 | 31 | 1.6% |
50.0 | 31 | 1.6% |
(Missing) | 1628 |
Value | Count | Frequency (%) |
12.5 | 32 | 1.6% |
16.67 | 31 | 1.6% |
20.0 | 31 | 1.6% |
25.0 | 63 | |
33.33 | 63 | |
50.0 | 31 | 1.6% |
100.0 | 121 |
Value | Count | Frequency (%) |
100.0 | 121 | |
50.0 | 31 | 1.6% |
33.33 | 63 | |
25.0 | 63 | |
20.0 | 31 | 1.6% |
16.67 | 31 | 1.6% |
12.5 | 32 | 1.6% |
m50
Real number (ℝ)
MISSING
 
Distinct | 7 |
---|---|
Distinct (%) | 1.6% |
Missing | 1563 |
Missing (%) | 78.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 52.063066 |
Minimum | 16.67 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 16.67 |
---|---|
5-th percentile | 16.67 |
Q1 | 33.33 |
median | 40 |
Q3 | 66.67 |
95-th percentile | 100 |
Maximum | 100 |
Range | 83.33 |
Interquartile range (IQR) | 33.34 |
Descriptive statistics
Standard deviation | 27.556809 |
---|---|
Coefficient of variation (CV) | 0.5292967 |
Kurtosis | -0.695507 |
Mean | 52.063066 |
Median Absolute Deviation (MAD) | 10 |
Skewness | 0.84551592 |
Sum | 22751.56 |
Variance | 759.37773 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
100.0 | 94 | 4.7% |
33.33 | 94 | 4.7% |
40.0 | 94 | 4.7% |
50.0 | 62 | 3.1% |
16.67 | 31 | 1.6% |
66.67 | 31 | 1.6% |
25.0 | 31 | 1.6% |
(Missing) | 1563 |
Value | Count | Frequency (%) |
16.67 | 31 | 1.6% |
25.0 | 31 | 1.6% |
33.33 | 94 | |
40.0 | 94 | |
50.0 | 62 | |
66.67 | 31 | 1.6% |
100.0 | 94 |
Value | Count | Frequency (%) |
100.0 | 94 | |
66.67 | 31 | 1.6% |
50.0 | 62 | |
40.0 | 94 | |
33.33 | 94 | |
25.0 | 31 | 1.6% |
16.67 | 31 | 1.6% |
m60
Real number (ℝ)
MISSING
 
Distinct | 7 |
---|---|
Distinct (%) | 1.7% |
Missing | 1591 |
Missing (%) | 79.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 38.948386 |
Minimum | 12.5 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 12.5 |
---|---|
5-th percentile | 12.5 |
Q1 | 20 |
median | 33.33 |
Q3 | 50 |
95-th percentile | 100 |
Maximum | 100 |
Range | 87.5 |
Interquartile range (IQR) | 30 |
Descriptive statistics
Standard deviation | 23.696252 |
---|---|
Coefficient of variation (CV) | 0.60840139 |
Kurtosis | 0.9204954 |
Mean | 38.948386 |
Median Absolute Deviation (MAD) | 16.66 |
Skewness | 1.1991378 |
Sum | 15929.89 |
Variance | 561.51237 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
33.33 | 95 | 4.8% |
50.0 | 94 | 4.7% |
20.0 | 94 | 4.7% |
100.0 | 32 | 1.6% |
12.5 | 32 | 1.6% |
16.67 | 31 | 1.6% |
66.67 | 31 | 1.6% |
(Missing) | 1591 |
Value | Count | Frequency (%) |
12.5 | 32 | 1.6% |
16.67 | 31 | 1.6% |
20.0 | 94 | |
33.33 | 95 | |
50.0 | 94 | |
66.67 | 31 | 1.6% |
100.0 | 32 | 1.6% |
Value | Count | Frequency (%) |
100.0 | 32 | 1.6% |
66.67 | 31 | 1.6% |
50.0 | 94 | |
33.33 | 95 | |
20.0 | 94 | |
16.67 | 31 | 1.6% |
12.5 | 32 | 1.6% |
m70
Categorical
IMBALANCE
 
Distinct | 6 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
16.67 | 62 |
20.0 | 62 |
33.33 | 32 |
50.0 | 31 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.0625 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | 16.67 |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1782 | 89.1% |
16.67 | 62 | 3.1% |
20.0 | 62 | 3.1% |
33.33 | 32 | 1.6% |
50.0 | 31 | 1.6% |
100.0 | 31 | 1.6% |
f00
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 2000 |
---|---|
Missing (%) | 100.0% |
Memory size | 17.7 KiB |
f10
Categorical
IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
20 | 32 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.968 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1968 | 98.4% |
20 | 32 | 1.6% |
f20
Real number (ℝ)
MISSING
 
Distinct | 6 |
---|---|
Distinct (%) | 2.4% |
Missing | 1749 |
Missing (%) | 87.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 46.380717 |
Minimum | 12.5 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 12.5 |
---|---|
5-th percentile | 12.5 |
Q1 | 20.835 |
median | 33.33 |
Q3 | 75 |
95-th percentile | 100 |
Maximum | 100 |
Range | 87.5 |
Interquartile range (IQR) | 54.165 |
Descriptive statistics
Standard deviation | 32.895693 |
---|---|
Coefficient of variation (CV) | 0.70925365 |
Kurtosis | -0.95319718 |
Mean | 46.380717 |
Median Absolute Deviation (MAD) | 16.67 |
Skewness | 0.816995 |
Sum | 11641.56 |
Variance | 1082.1266 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
100.0 | 63 | 3.1% |
33.33 | 63 | 3.1% |
12.5 | 32 | 1.6% |
25.0 | 31 | 1.6% |
16.67 | 31 | 1.6% |
50.0 | 31 | 1.6% |
(Missing) | 1749 |
Value | Count | Frequency (%) |
12.5 | 32 | |
16.67 | 31 | |
25.0 | 31 | |
33.33 | 63 | |
50.0 | 31 | |
100.0 | 63 |
Value | Count | Frequency (%) |
100.0 | 63 | |
50.0 | 31 | |
33.33 | 63 | |
25.0 | 31 | |
16.67 | 31 | |
12.5 | 32 |
f30
Categorical
IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
12.5 | 32 |
66.67 | 31 |
33.33 | 31 |
50.0 | 31 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.031 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1875 | |
12.5 | 32 | 1.6% |
66.67 | 31 | 1.6% |
33.33 | 31 | 1.6% |
50.0 | 31 | 1.6% |
f40
Categorical
IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
100 | 96 |
50 | 93 |
25 | 63 |
20 | 31 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.765 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | 50 |
4th row | 25 |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1717 | |
100 | 96 | 4.8% |
50 | 93 | 4.7% |
25 | 63 | 3.1% |
20 | 31 | 1.6% |
f50
Real number (ℝ)
MISSING
 
Distinct | 7 |
---|---|
Distinct (%) | 1.7% |
Missing | 1594 |
Missing (%) | 79.7% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 52.130542 |
Minimum | 16.67 |
---|---|
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 16.67 |
---|---|
5-th percentile | 16.67 |
Q1 | 25 |
median | 40 |
Q3 | 100 |
95-th percentile | 100 |
Maximum | 100 |
Range | 83.33 |
Interquartile range (IQR) | 75 |
Descriptive statistics
Standard deviation | 33.351262 |
---|---|
Coefficient of variation (CV) | 0.63976434 |
Kurtosis | -1.38986 |
Mean | 52.130542 |
Median Absolute Deviation (MAD) | 20 |
Skewness | 0.57577563 |
Sum | 21165 |
Variance | 1112.3066 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
100.0 | 124 | 6.2% |
25.0 | 63 | 3.1% |
20.0 | 63 | 3.1% |
50.0 | 62 | 3.1% |
40.0 | 32 | 1.6% |
16.67 | 31 | 1.6% |
33.33 | 31 | 1.6% |
(Missing) | 1594 |
Value | Count | Frequency (%) |
16.67 | 31 | 1.6% |
20.0 | 63 | |
25.0 | 63 | |
33.33 | 31 | 1.6% |
40.0 | 32 | 1.6% |
50.0 | 62 | |
100.0 | 124 |
Value | Count | Frequency (%) |
100.0 | 124 | |
50.0 | 62 | |
40.0 | 32 | 1.6% |
33.33 | 31 | 1.6% |
25.0 | 63 | |
20.0 | 63 | |
16.67 | 31 | 1.6% |
f60
Categorical
IMBALANCE
 
Distinct | 4 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
50 | 32 |
20 | 31 |
25 | 31 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.906 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1906 | 95.3% |
50 | 32 | 1.6% |
20 | 31 | 1.6% |
25 | 31 | 1.6% |
f70
Categorical
IMBALANCE
 
Distinct | 3 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
<NA> | |
---|---|
33.33 | 32 |
100.0 | 31 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.0315 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 1937 | |
33.33 | 32 | 1.6% |
100.0 | 31 | 1.6% |
total
Real number (ℝ)
ZEROS
 
Distinct | 32 |
---|---|
Distinct (%) | 1.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 18.6665 |
Minimum | 0 |
---|---|
Maximum | 56 |
Zeros | 467 |
Zeros (%) | 23.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 17.7 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 5 |
median | 20 |
Q3 | 28 |
95-th percentile | 44 |
Maximum | 56 |
Range | 56 |
Interquartile range (IQR) | 23 |
Descriptive statistics
Standard deviation | 15.390637 |
---|---|
Coefficient of variation (CV) | 0.82450578 |
Kurtosis | -0.72634962 |
Mean | 18.6665 |
Median Absolute Deviation (MAD) | 12 |
Skewness | 0.44046994 |
Sum | 37333 |
Variance | 236.87171 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 467 | 23.4% |
18 | 126 | 6.3% |
20 | 126 | 6.3% |
26 | 124 | 6.2% |
23 | 94 | 4.7% |
9 | 88 | 4.4% |
8 | 64 | 3.2% |
44 | 63 | 3.1% |
41 | 62 | 3.1% |
21 | 62 | 3.1% |
Other values (22) | 724 | 36.2% |
Value | Count | Frequency (%) |
0 | 467 | 23.4% |
3 | 32 | 1.6% |
5 | 32 | 1.6% |
6 | 62 | 3.1% |
8 | 64 | 3.2% |
9 | 88 | 4.4% |
10 | 32 | 1.6% |
12 | 32 | 1.6% |
13 | 31 | 1.6% |
15 | 31 | 1.6% |
Value | Count | Frequency (%) |
56 | 32 | |
53 | 31 | |
45 | 31 | |
44 | 63 | |
43 | 32 | |
41 | 62 | |
40 | 31 | |
38 | 32 | |
36 | 32 | |
35 | 32 |
register_at
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.8 KiB |
2020-09-14 0:00 |
---|
Length
Max length | 15 |
---|---|
Median length | 15 |
Mean length | 15 |
Min length | 15 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2020-09-14 0:00 |
---|---|
2nd row | 2020-09-14 0:00 |
3rd row | 2020-09-14 0:00 |
4th row | 2020-09-14 0:00 |
5th row | 2020-09-14 0:00 |
Common Values
Value | Count | Frequency (%) |
2020-09-14 0:00 | 2000 | 100.0% |
First rows
Index | taxi_id | latitude | longitude | m00 | m10 | m20 | m30 | m40 | m50 | m60 | m70 | f00 | f10 | f20 | f30 | f40 | f50 | f60 | f70 | total | register_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | T_94385645 | 37.423923 | 126.64315 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | 28 | 2020-09-14 0:00 |
1 | T_49402838 | 37.565292 | 127.05467 | <NA> | <NA> | <NA> | 16.67 | <NA> | 50.0 | <NA> | 16.67 | <NA> | <NA> | <NA> | <NA> | <NA> | 16.67 | <NA> | <NA> | 44 | 2020-09-14 0:00 |
2 | T_47205528 | 37.65406 | 127.24984 | <NA> | 50.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50 | <NA> | <NA> | <NA> | 6 | 2020-09-14 0:00 |
3 | T_97608367 | 37.5649 | 126.83376 | <NA> | <NA> | <NA> | <NA> | 25.0 | <NA> | <NA> | <NA> | <NA> | <NA> | 25.0 | <NA> | 25 | 25.0 | <NA> | <NA> | 18 | 2020-09-14 0:00 |
4 | T_15961504 | 37.47155 | 126.70179 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 20 | 2020-09-14 0:00 |
5 | T_49183107 | 37.543686 | 127.07258 | <NA> | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 24 | 2020-09-14 0:00 |
6 | T_48084452 | 37.343693 | 127.18032 | <NA> | <NA> | <NA> | <NA> | <NA> | 33.33 | <NA> | <NA> | <NA> | <NA> | <NA> | 66.67 | <NA> | <NA> | <NA> | <NA> | 40 | 2020-09-14 0:00 |
7 | T_23725334 | 37.56094 | 126.85555 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 0 | 2020-09-14 0:00 |
8 | T_73688712 | 37.47894 | 127.12365 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50.0 | <NA> | <NA> | <NA> | <NA> | 50 | <NA> | <NA> | <NA> | 13 | 2020-09-14 0:00 |
9 | T_46180116 | 37.516018 | 127.10864 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50.0 | <NA> | <NA> | 23 | 2020-09-14 0:00 |
Last rows
Index | taxi_id | latitude | longitude | m00 | m10 | m20 | m30 | m40 | m50 | m60 | m70 | f00 | f10 | f20 | f30 | f40 | f50 | f60 | f70 | total | register_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1990 | T_66584075 | 37.463818 | 126.681656 | <NA> | <NA> | <NA> | 33.33 | <NA> | <NA> | 33.33 | 33.33 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 23 | 2020-09-14 0:00 |
1991 | T_44934973 | 37.5062 | 126.71486 | <NA> | 20.0 | <NA> | <NA> | <NA> | <NA> | 20.0 | <NA> | <NA> | 20 | <NA> | <NA> | <NA> | 40.0 | <NA> | <NA> | 38 | 2020-09-14 0:00 |
1992 | T_43030638 | 37.521824 | 126.7048 | <NA> | 20.0 | 20 | <NA> | <NA> | 40.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 20.0 | <NA> | <NA> | 18 | 2020-09-14 0:00 |
1993 | T_72443569 | 37.48125 | 126.91473 | <NA> | 33.33 | <NA> | <NA> | <NA> | <NA> | 33.33 | <NA> | <NA> | <NA> | 33.33 | <NA> | <NA> | <NA> | <NA> | <NA> | 35 | 2020-09-14 0:00 |
1994 | T_94385645 | 37.42445 | 126.64308 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | 28 | 2020-09-14 0:00 |
1995 | T_15961504 | 37.471523 | 126.70176 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 20 | 2020-09-14 0:00 |
1996 | T_92115092 | 37.525475 | 126.72923 | <NA> | <NA> | <NA> | 50.0 | 25.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 25 | <NA> | <NA> | <NA> | 44 | 2020-09-14 0:00 |
1997 | T_94751864 | 37.471703 | 126.69085 | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 18 | 2020-09-14 0:00 |
1998 | T_68268680 | 37.446762 | 126.66715 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 50 | <NA> | 20 | 2020-09-14 0:00 |
1999 | T_43470100 | 37.439774 | 126.673744 | <NA> | 50.0 | 50 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 43 | 2020-09-14 0:00 |
Most frequently occurring
Index | taxi_id | latitude | longitude | m10 | m20 | m30 | m40 | m50 | m60 | m70 | f10 | f20 | f30 | f40 | f50 | f60 | f70 | total | register_at | # duplicates |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10 | T_17353134 | 37.65057 | 126.63241 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 0 | 2020-09-14 0:00 | 31 |
152 | T_97608367 | 37.5649 | 126.83376 | <NA> | <NA> | <NA> | 25.0 | <NA> | <NA> | <NA> | <NA> | 25.0 | <NA> | 25 | 25.0 | <NA> | <NA> | 18 | 2020-09-14 0:00 | 31 |
163 | T_98487291 | 37.53594 | 126.899445 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 0 | 2020-09-14 0:00 | 31 |
145 | T_94605377 | 37.57089 | 126.73594 | <NA> | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 5 | 2020-09-14 0:00 | 21 |
88 | T_68122192 | 37.50444 | 126.76331 | <NA> | <NA> | 33.33 | <NA> | 66.67 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 23 | 2020-09-14 0:00 | 19 |
98 | T_69513822 | 37.464 | 126.68101 | <NA> | <NA> | 16.67 | 16.67 | 16.67 | 16.67 | 16.67 | <NA> | 16.67 | <NA> | <NA> | <NA> | <NA> | <NA> | 26 | 2020-09-14 0:00 | 19 |
48 | T_44934973 | 37.503742 | 126.714584 | 20.0 | <NA> | <NA> | <NA> | <NA> | 20.0 | <NA> | 20 | <NA> | <NA> | <NA> | 40.0 | <NA> | <NA> | 38 | 2020-09-14 0:00 | 18 |
151 | T_96875931 | 37.64166 | 127.029854 | <NA> | <NA> | <NA> | <NA> | 33.33 | <NA> | <NA> | <NA> | <NA> | 33.33 | <NA> | 33.33 | <NA> | <NA> | 30 | 2020-09-14 0:00 | 16 |
165 | T_98633779 | 37.530537 | 126.84336 | 12.5 | <NA> | 12.5 | 12.5 | <NA> | 12.5 | <NA> | <NA> | 12.5 | 12.5 | <NA> | 25.0 | <NA> | <NA> | 56 | 2020-09-14 0:00 | 16 |
126 | T_74494392 | 37.4907 | 126.98211 | <NA> | <NA> | <NA> | 100.0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 15 | 2020-09-14 0:00 | 15 |
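The table above lists the rows that repeat most often, with the all-missing m00 and f00 columns left out. The report does not show how the grouping is done. One minimal way to produce a similar ranking with the standard library is to count identical row tuples, as sketched below on a hypothetical miniature of the dataset that keeps only four of the columns.

```python
from collections import Counter

# Hypothetical miniature of the dataset: (taxi_id, latitude, longitude, total).
rows = [
    ("T_17353134", 37.65057, 126.63241, 0),
    ("T_17353134", 37.65057, 126.63241, 0),
    ("T_97608367", 37.5649, 126.83376, 18),
    ("T_97608367", 37.5649, 126.83376, 18),
    ("T_97608367", 37.5649, 126.83376, 18),
    ("T_43470100", 37.439774, 126.673744, 43),
]

# Identical tuples hash to the same key, so Counter groups exact duplicates.
counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]
for row, n in duplicates:
    print(n, row)
```

Exact tuple equality is strict: two readings that differ in the last decimal of a coordinate count as different rows, which is also how rows such as 0 and 1994 in the samples above stay distinct.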