Overview

Dataset statistics

Number of variables: 6
Number of observations: 49
Missing cells: 1
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 53.7 B
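
The percentages and sizes above follow from the raw counts. The short sketch below recomputes two of them as a sanity check; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 49
missing_cells = 1
avg_record_bytes = 53.7

total_cells = n_variables * n_observations            # 6 * 49 = 294 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
total_kib = avg_record_bytes * n_observations / 1024  # bytes -> KiB

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Both values round to the figures the report shows (0.3% and 2.6 KiB).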

Variable types

Text: 1
Numeric: 3
DateTime: 2

Dataset

Description: Sample
Author: 올시데이터
URL: https://www.bigdata-sea.kr/datasearch/base/view.do?prodId=PROD_000429

Alerts

SHIP_CNT is highly overall correlated with NVGTN_DIST (High correlation)
NVGTN_DIST is highly overall correlated with SHIP_CNT (High correlation)
SHIP_KIND has 1 (2.0%) missing values (Missing)
DPTR_HMS has unique values (Unique)
NVGTN_DIST has unique values (Unique)
RN has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 14:50:31.986510
Analysis finished: 2023-12-10 14:50:33.722664
Duration: 1.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
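
A report of this kind can be regenerated with ydata-profiling. The sketch below is only an outline under stated assumptions: the CSV path, the output path, and the report title are placeholders, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the two timestamp columns are stored as text in the CSV.
    df = pd.read_csv(csv_path, parse_dates=["DPTR_HMS", "ARVL_HMS"])
    profile = ProfileReport(df, title="Sample")  # title is a placeholder
    profile.to_file(html_path)


# Example call (both paths are hypothetical):
# build_report("ship_sample.csv", "ship_sample_report.html")
```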

Variables

SHIP_KIND
Text

MISSING 

Distinct: 46
Distinct (%): 95.8%
Missing: 1
Missing (%): 2.0%
Memory size: 524.0 B

Length

Max length: 46
Median length: 30.5
Mean length: 20.916667
Min length: 3

Characters and Unicode

Total characters: 1004
Distinct characters: 53
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
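
As a rough illustration of those character properties, the sketch below counts Unicode general categories for the first sample value of SHIP_KIND using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and `unicodedata` returns two-letter codes (`Lu`, `Ll`, `Zs`, ...) rather than the long names shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lu', 'Ll', 'Zs') in text."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("Tanker(product oil)")
for code, n in counts.most_common():
    print(code, n)
```

Here `Ll` is Lowercase Letter, `Lu` Uppercase Letter, `Zs` Space Separator, `Ps` Open Punctuation, and `Pe` Close Punctuation.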

Unique

Unique: 44
Unique (%): 91.7%

Sample

1st row: Tanker(product oil)
2nd row: Tanker(chemical/oil product)
3rd row: Tanker(oil/chemical)
4th row: Tanker(chemical)
5th row: Tanker

Value               Count  Frequency (%)
tanker              27     17.3%
inland              16     10.3%
cargo               9      5.8%
oil                 6      3.8%
pushtow             6      3.8%
barges              5      3.2%
motor               5      3.2%
chemical            4      2.6%
liquid              4      2.6%
or                  3      1.9%
Other values (54)   71     45.5%

Most occurring characters

Value               Count  Frequency (%)
(space)             108    10.8%
a                   87     8.7%
n                   77     7.7%
e                   77     7.7%
r                   76     7.6%
o                   51     5.1%
l                   46     4.6%
i                   38     3.8%
t                   37     3.7%
T                   33     3.3%
Other values (43)   374    37.3%

Most occurring categories

Value               Count  Frequency (%)
Lowercase Letter    725    72.2%
Uppercase Letter    153    15.2%
Space Separator     108    10.8%
Open Punctuation    6      0.6%
Close Punctuation   6      0.6%
Other Punctuation   3      0.3%
Dash Punctuation    2      0.2%
Decimal Number      1      0.1%

Most frequent character per category

Lowercase Letter
Value               Count  Frequency (%)
a                   87     12.0%
n                   77     10.6%
e                   77     10.6%
r                   76     10.5%
o                   51     7.0%
l                   46     6.3%
i                   38     5.2%
t                   37     5.1%
k                   32     4.4%
s                   32     4.4%
Other values (16)   172    23.7%

Uppercase Letter
Value               Count  Frequency (%)
T                   33     21.6%
I                   20     13.1%
P                   16     10.5%
C                   13     8.5%
M                   9      5.9%
O                   8      5.2%
A                   6      3.9%
L                   5      3.3%
U                   5      3.3%
F                   5      3.3%
Other values (11)   33     21.6%

Space Separator
Value               Count  Frequency (%)
(space)             108    100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   6      100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   6      100.0%

Other Punctuation
Value               Count  Frequency (%)
/                   3      100.0%

Dash Punctuation
Value               Count  Frequency (%)
-                   2      100.0%

Decimal Number
Value               Count  Frequency (%)
2                   1      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Latin               878    87.5%
Common              126    12.5%

Most frequent character per script

Latin
Value               Count  Frequency (%)
a                   87     9.9%
n                   77     8.8%
e                   77     8.8%
r                   76     8.7%
o                   51     5.8%
l                   46     5.2%
i                   38     4.3%
t                   37     4.2%
T                   33     3.8%
k                   32     3.6%
Other values (37)   324    36.9%

Common
Value               Count  Frequency (%)
(space)             108    85.7%
(                   6      4.8%
)                   6      4.8%
/                   3      2.4%
-                   2      1.6%
2                   1      0.8%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               1004   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
(space)             108    10.8%
a                   87     8.7%
n                   77     7.7%
e                   77     7.7%
r                   76     7.6%
o                   51     5.1%
l                   46     4.6%
i                   38     3.8%
t                   37     3.7%
T                   33     3.3%
Other values (43)   374    37.3%

SHIP_CNT
Real number (ℝ)

HIGH CORRELATION 

Distinct: 40
Distinct (%): 81.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 531.38776
Minimum: 1
Maximum: 4436
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 7
Median: 59
Q3: 369
95-th percentile: 3189.8
Maximum: 4436
Range: 4435
Interquartile range (IQR): 362

Descriptive statistics

Standard deviation: 1078.4791
Coefficient of variation (CV): 2.029552
Kurtosis: 5.5759337
Mean: 531.38776
Median Absolute Deviation (MAD): 56
Skewness: 2.5226367
Sum: 26038
Variance: 1163117.2
Monotonicity: Not monotonic
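
Several of the SHIP_CNT figures are derived from one another. The sketch below recomputes the mean, coefficient of variation, variance, IQR, and range from the other reported values; the variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Inputs copied from the SHIP_CNT statistics in this report.
n = 49                       # number of observations
total = 26038                # Sum
std = 1078.4791              # Standard deviation
q1, q3 = 7, 369              # quartiles
minimum, maximum = 1, 4436

mean = total / n             # mean = sum / count
cv = std / mean              # coefficient of variation = std / mean
variance = std ** 2          # variance is the squared standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.6f} variance={variance:.1f}")
print(f"iqr={iqr} range={value_range}")
```

A CV above 2 together with a median (59) far below the mean (about 531) is consistent with the strong right skew reported for this variable.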
Histogram with fixed size bins (bins=40)

Value               Count  Frequency (%)
1                   4      8.2%
3                   4      8.2%
20                  2      4.1%
7                   2      4.1%
110                 2      4.1%
3471                1      2.0%
595                 1      2.0%
528                 1      2.0%
28                  1      2.0%
147                 1      2.0%
Other values (30)   30     61.2%

Value               Count  Frequency (%)
1                   4      8.2%
2                   1      2.0%
3                   4      8.2%
4                   1      2.0%
5                   1      2.0%
7                   2      4.1%
8                   1      2.0%
9                   1      2.0%
12                  1      2.0%
18                  1      2.0%

Value               Count  Frequency (%)
4436                1      2.0%
3991                1      2.0%
3471                1      2.0%
2768                1      2.0%
2749                1      2.0%
1358                1      2.0%
1325                1      2.0%
1205                1      2.0%
595                 1      2.0%
571                 1      2.0%

DPTR_HMS
Date

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B
Minimum: 2021-01-01 00:00:01
Maximum: 2023-01-03 01:11:39
Histogram with fixed size bins (bins=49)

ARVL_HMS
Date

Distinct: 39
Distinct (%): 79.6%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B
Minimum: 2021-01-01 05:25:04
Maximum: 2023-05-31 23:59:15
Histogram with fixed size bins (bins=39)

NVGTN_DIST
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3191287 × 10^10
Minimum: 2372370
Maximum: 2.93964 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 2372370
5-th percentile: 55633780
Q1: 1.83412 × 10^8
Median: 1.25437 × 10^9
Q3: 1.1618 × 10^10
95-th percentile: 1.02892 × 10^11
Maximum: 2.93964 × 10^11
Range: 2.9396163 × 10^11
Interquartile range (IQR): 1.1434588 × 10^10

Descriptive statistics

Standard deviation: 5.5534424 × 10^10
Coefficient of variation (CV): 2.3946245
Kurtosis: 13.098556
Mean: 2.3191287 × 10^10
Median Absolute Deviation (MAD): 1.177267 × 10^9
Skewness: 3.4352786
Sum: 1.136373 × 10^12
Variance: 3.0840722 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)

Value               Count  Frequency (%)
102760000000        1      2.0%
81064800000         1      2.0%
1254370000          1      2.0%
11791500000         1      2.0%
574511000           1      2.0%
86111200            1      2.0%
2619400000          1      2.0%
557365000           1      2.0%
102980000000        1      2.0%
2892830000          1      2.0%
Other values (39)   39     79.6%

Value               Count  Frequency (%)
2372370             1      2.0%
13496000            1      2.0%
51309500            1      2.0%
62120200            1      2.0%
77103000            1      2.0%
78135500            1      2.0%
80181800            1      2.0%
86111200            1      2.0%
86620900            1      2.0%
123551000           1      2.0%

Value               Count  Frequency (%)
293964000000        1      2.0%
203049000000        1      2.0%
102980000000        1      2.0%
102760000000        1      2.0%
100792000000        1      2.0%
81064800000         1      2.0%
73894500000         1      2.0%
32416900000         1      2.0%
30803400000         1      2.0%
24802900000         1      2.0%

RN
Real number (ℝ)

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26
Minimum: 2
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 2
5-th percentile: 4.4
Q1: 14
Median: 26
Q3: 38
95-th percentile: 47.6
Maximum: 50
Range: 48
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.28869
Coefficient of variation (CV): 0.54956501
Kurtosis: -1.2
Mean: 26
Median Absolute Deviation (MAD): 12
Skewness: 0
Sum: 1274
Variance: 204.16667
Monotonicity: Strictly increasing
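
The RN statistics match what a plain row counter would produce. The sketch below assumes RN takes exactly the integer values 2 through 50 (an inference from the 49 distinct values, minimum 2, maximum 50, and sum 1274, not something the report states) and recomputes the figures with Python's standard `statistics` module.

```python
import statistics

# Assumption: RN is the consecutive integer sequence 2, 3, ..., 50.
rn = list(range(2, 51))

mean = statistics.mean(rn)
variance = statistics.variance(rn)   # sample variance (n - 1 denominator)
std = statistics.stdev(rn)           # sample standard deviation
cv = std / mean

# "inclusive" interpolates linearly between the observed minimum and maximum.
ventiles = statistics.quantiles(rn, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]
q1, median, q3 = statistics.quantiles(rn, n=4, method="inclusive")

print(sum(rn), mean, round(variance, 5), round(std, 5), round(cv, 8))
print(p5, q1, median, q3, p95)
```

Under that assumption the sample variance (n - 1 denominator) and linearly interpolated percentiles agree with the values reported for RN.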
Histogram with fixed size bins (bins=49)

Value               Count  Frequency (%)
2                   1      2.0%
39                  1      2.0%
29                  1      2.0%
30                  1      2.0%
31                  1      2.0%
32                  1      2.0%
33                  1      2.0%
34                  1      2.0%
35                  1      2.0%
36                  1      2.0%
Other values (39)   39     79.6%

Value               Count  Frequency (%)
2                   1      2.0%
3                   1      2.0%
4                   1      2.0%
5                   1      2.0%
6                   1      2.0%
7                   1      2.0%
8                   1      2.0%
9                   1      2.0%
10                  1      2.0%
11                  1      2.0%

Value               Count  Frequency (%)
50                  1      2.0%
49                  1      2.0%
48                  1      2.0%
47                  1      2.0%
46                  1      2.0%
45                  1      2.0%
44                  1      2.0%
43                  1      2.0%
42                  1      2.0%
41                  1      2.0%

Interactions

[Interaction plots not reproduced in this text version.]

Correlations

            SHIP_KIND  SHIP_CNT  DPTR_HMS  ARVL_HMS  NVGTN_DIST  RN
SHIP_KIND   1.000      0.000     1.000     0.900     0.735       0.851
SHIP_CNT    0.000      1.000     1.000     0.881     0.895       0.000
DPTR_HMS    1.000      1.000     1.000     1.000     1.000       1.000
ARVL_HMS    0.900      0.881     1.000     1.000     0.876       0.000
NVGTN_DIST  0.735      0.895     1.000     0.876     1.000       0.000
RN          0.851      0.000     1.000     0.000     0.000       1.000

            SHIP_CNT  NVGTN_DIST  RN
SHIP_CNT    1.000     0.982       0.046
NVGTN_DIST  0.982     1.000       0.087
RN          0.046     0.087       1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    SHIP_KIND                                      SHIP_CNT  DPTR_HMS              ARVL_HMS              NVGTN_DIST    RN
0   Tanker(product oil)                            3471      01-Jan-2023 00:23:21  31-May-2023 23:51:22  102760000000  2
1   Tanker(chemical/oil product)                   436       01-Jan-2023 17:16:56  31-May-2023 23:50:02  16609600000   3
2   Tanker(oil/chemical)                           1205      01-Jan-2023 00:02:58  24-Jan-2023 16:33:04  32416900000   4
3   Tanker(chemical)                               334       01-Jan-2023 00:00:29  31-May-2023 23:59:15  11618000000   5
4   Tanker                                         39        01-Jan-2023 00:05:07  31-May-2023 10:06:23  946523000     6
5   Tanker - Hazard A (Major)                      1         01-Jan-2023 00:10:04  17-May-2023 23:55:58  13496000      7
6   Asphalt/Bitumen Tanker                         1         03-Jan-2023 01:11:39  31-May-2023 23:02:10  77103000      8
7   Edible Oil Tanker                              3         01-Jan-2023 00:09:20  31-May-2023 23:55:15  51309500      9
8   CO2 Tanker                                     3         01-Jan-2023 00:03:59  31-May-2023 23:53:28  123551000     10
9   FRUIT JUICE Tanker                             5         01-Jan-2023 00:10:49  31-May-2023 22:38:01  418478000     11

    SHIP_KIND                                      SHIP_CNT  DPTR_HMS              ARVL_HMS              NVGTN_DIST    RN
39  Floating Storage or Production                 110       01-Apr-2021 15:11:44  13-Oct-2021 02:25:00  1408310000    41
40  Oil Products Tanker                            2749      01-Jan-2021 00:04:47  13-Oct-2021 23:58:02  100792000000  42
41  Bunkering Tanker                               333       03-Jan-2021 14:37:28  13-Oct-2021 23:59:04  5554170000    43
42  Inland Pushtow six cargo barges                1         01-Jan-2021 00:01:11  16-Sep-2021 20:45:04  86620900      44
43  Inland Pushtow two barges at least one tanker  7         03-Jan-2021 19:31:59  16-Aug-2021 20:50:03  137903000     45
44  Oil or Chemical Tanker                         4436      02-Jan-2021 02:58:14  13-Oct-2021 23:55:00  293964000000  46
45  Shuttle Tanker                                 65        03-Jun-2021 10:46:40  13-Oct-2021 23:42:00  3262600000    47
46  Chemical Tanker                                571       06-Jan-2021 13:00:04  13-Oct-2021 23:57:03  30803400000   48
47  CHEMICAL TANKER                                27        01-Jan-2021 00:01:47  28-Apr-2021 21:40:22  652442000     49
48  LPG or Chemical Tanker                         7         01-Jan-2021 00:02:19  13-Oct-2021 23:58:05  266748000     50