Overview

Dataset statistics

Number of variables: 6
Number of observations: 49
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 53.7 B

Variable types

Text: 1
Numeric: 3
Categorical: 2

Dataset

Description: Sample
Author: 올시데이터
URL: https://www.bigdata-sea.kr/datasearch/base/view.do?prodId=PROD_000456

Alerts

DPTR_HMS is highly imbalanced (58.8%) [Imbalance]
SHIP_OWNER_NM has unique values [Unique]
RN has unique values [Unique]
NOX has 21 (42.9%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 15:01:28.653188
Analysis finished: 2023-12-10 15:01:31.026488
Duration: 2.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

SHIP_OWNER_NM
Text

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 30
Median length: 24
Mean length: 18.204082
Min length: 5

Characters and Unicode

Total characters: 892
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
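
As a rough illustration of the character properties mentioned above, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in the five sample values of SHIP_OWNER_NM shown in this report. It is only a sketch of the idea and is not claimed to be how the profiling tool computes its own tables; the `names` list covers the sample rows only, not all 49 records.

```python
import unicodedata
from collections import Counter

# The five sample values of SHIP_OWNER_NM listed in this report
# (a subset of the 49 records, used here only for illustration).
names = [
    "ANTIPAXOS SHIPPING",
    "HANARO SHIPPING SEOUL",
    "SHOKUYU TANKER",
    "MSI SHIPMANAGEMENT",
    "SEA COUNTESS MARINE",
]

# unicodedata.category returns the two-letter general category, e.g.
# "Lu" (Uppercase Letter), "Zs" (Space Separator),
# "Po" (Other Punctuation), "Nd" (Decimal Number).
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

total = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({count / total:.1%})")
```

On the full column the same tally would be expected to show the four categories reported here, since `&` falls under Other Punctuation and `8` under Decimal Number.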

Unique

Unique: 49
Unique (%): 100.0%

Sample

1st row: ANTIPAXOS SHIPPING
2nd row: HANARO SHIPPING SEOUL
3rd row: SHOKUYU TANKER
4th row: MSI SHIPMANAGEMENT
5th row: SEA COUNTESS MARINE
Value | Count | Frequency (%)
shipping | 18 | 14.3%
marine | 4 | 3.2%
(blank) | 3 | 2.4%
hong | 2 | 1.6%
ocean | 2 | 1.6%
trading | 2 | 1.6%
leasing | 2 | 1.6%
financial | 2 | 1.6%
shipmanagement | 2 | 1.6%
tankers | 2 | 1.6%
Other values (87) | 87 | 69.0%

Most occurring characters

Value | Count | Frequency (%)
I | 96 | 10.8%
N | 91 | 10.2%
A | 81 | 9.1%
(space) | 78 | 8.7%
S | 66 | 7.4%
E | 63 | 7.1%
P | 50 | 5.6%
R | 50 | 5.6%
O | 44 | 4.9%
H | 43 | 4.8%
Other values (18) | 230 | 25.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 810 | 90.8%
Space Separator | 78 | 8.7%
Other Punctuation | 3 | 0.3%
Decimal Number | 1 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
I | 96 | 11.9%
N | 91 | 11.2%
A | 81 | 10.0%
S | 66 | 8.1%
E | 63 | 7.8%
P | 50 | 6.2%
R | 50 | 6.2%
O | 44 | 5.4%
H | 43 | 5.3%
G | 39 | 4.8%
Other values (15) | 187 | 23.1%

Space Separator
Value | Count | Frequency (%)
(space) | 78 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 3 | 100.0%

Decimal Number
Value | Count | Frequency (%)
8 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 810 | 90.8%
Common | 82 | 9.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
I | 96 | 11.9%
N | 91 | 11.2%
A | 81 | 10.0%
S | 66 | 8.1%
E | 63 | 7.8%
P | 50 | 6.2%
R | 50 | 6.2%
O | 44 | 5.4%
H | 43 | 5.3%
G | 39 | 4.8%
Other values (15) | 187 | 23.1%

Common
Value | Count | Frequency (%)
(space) | 78 | 95.1%
& | 3 | 3.7%
8 | 1 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 892 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
I | 96 | 10.8%
N | 91 | 10.2%
A | 81 | 9.1%
(space) | 78 | 8.7%
S | 66 | 7.4%
E | 63 | 7.1%
P | 50 | 5.6%
R | 50 | 5.6%
O | 44 | 4.9%
H | 43 | 4.8%
Other values (18) | 230 | 25.8%

SHIP_CNT
Real number (ℝ)

Distinct: 8
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1020408
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 7.8
Maximum: 21
Range: 20
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.5251767
Coefficient of variation (CV): 1.6770258
Kurtosis: 18.859342
Mean: 2.1020408
Median Absolute Deviation (MAD): 0
Skewness: 4.1659594
Sum: 103
Variance: 12.426871
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]
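Common values
Value | Count | Frequency (%)
1 | 40 | 81.6%
2 | 3 | 6.1%
9 | 1 | 2.0%
6 | 1 | 2.0%
21 | 1 | 2.0%
3 | 1 | 2.0%
5 | 1 | 2.0%
13 | 1 | 2.0%

Smallest values (ascending)
Value | Count | Frequency (%)
1 | 40 | 81.6%
2 | 3 | 6.1%
3 | 1 | 2.0%
5 | 1 | 2.0%
6 | 1 | 2.0%
9 | 1 | 2.0%
13 | 1 | 2.0%
21 | 1 | 2.0%

Largest values (descending)
Value | Count | Frequency (%)
21 | 1 | 2.0%
13 | 1 | 2.0%
9 | 1 | 2.0%
6 | 1 | 2.0%
5 | 1 | 2.0%
3 | 1 | 2.0%
2 | 3 | 6.1%
1 | 40 | 81.6%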
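
Because the frequency table for SHIP_CNT lists every distinct value, the summary statistics above can be cross-checked by expanding it back into 49 observations. The sketch below assumes the sample (n - 1) variance and linear interpolation between closest ranks for percentiles; those conventions are assumptions on my part, though they do agree with the figures reported here. The `percentile` helper is a hypothetical name introduced for the illustration.

```python
import math
import statistics

# Frequency table for SHIP_CNT as listed in the report: value -> count.
freq = {1: 40, 2: 3, 3: 1, 5: 1, 6: 1, 9: 1, 13: 1, 21: 1}

# Expand the table into the 49 sorted observations.
values = sorted(v for v, n in freq.items() for _ in range(n))


def percentile(sorted_vals, q):
    """Percentile q in [0, 1] using linear interpolation between ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
p95 = percentile(values, 0.95)

print(f"n={len(values)} sum={sum(values)} mean={mean:.7f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.7f} p95={p95:.1f}")
```

The single owner with 21 ships and the one with 13 account for most of the variance, which is why the coefficient of variation exceeds 1 even though 40 of the 49 values equal 1.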

DPTR_HMS
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Unique

Unique: 3
Unique (%): 6.1%

Sample

1st row: 01-Jan-2021 00:00:00
2nd row: 01-Jan-2021 00:00:00
3rd row: 01-Jan-2021 00:00:00
4th row: 01-Jan-2021 00:00:00
5th row: 01-Jan-2021 00:00:00

Common Values

Value | Count | Frequency (%)
01-Jan-2021 00:00:00 | 40 | 81.6%
02-Jan-2021 00:00:00 | 4 | 8.2%
05-Jan-2021 00:00:00 | 2 | 4.1%
03-Jan-2021 00:00:00 | 1 | 2.0%
10-Jan-2021 00:00:00 | 1 | 2.0%
04-Jan-2021 00:00:00 | 1 | 2.0%
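
The 58.8% imbalance figure reported for DPTR_HMS can be reproduced from the six category counts above if the score is taken to be one minus the Shannon entropy normalised by its maximum for six classes. That definition is an assumption made for this sketch rather than something stated in the report, although it does yield roughly 58.8% on these counts. The function name `imbalance_score` is hypothetical.

```python
import math

# Category counts for DPTR_HMS as listed in the Common Values table.
counts = {
    "01-Jan-2021 00:00:00": 40,
    "02-Jan-2021 00:00:00": 4,
    "05-Jan-2021 00:00:00": 2,
    "03-Jan-2021 00:00:00": 1,
    "10-Jan-2021 00:00:00": 1,
    "04-Jan-2021 00:00:00": 1,
}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, 1 for a single class.

    Assumed definition, chosen because it matches the reported figure.
    """
    k = len(class_counts)
    if k < 2:
        return 0.0
    total = sum(class_counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(list(counts.values()))
print(f"Imbalance score: {score:.1%}")
```

Under this definition a perfectly even split across the six timestamps would score 0%, so a value near 59% reflects the fact that one timestamp covers 40 of the 49 rows.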

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
00:00:00 | 49 | 50.0%
01-jan-2021 | 40 | 40.8%
02-jan-2021 | 4 | 4.1%
05-jan-2021 | 2 | 2.0%
03-jan-2021 | 1 | 1.0%
10-jan-2021 | 1 | 1.0%
04-jan-2021 | 1 | 1.0%

ARVL_HMS
Categorical

Distinct: 16
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Unique

Unique: 13
Unique (%): 26.5%

Sample

1st row: 13-Oct-2021 18:00:00
2nd row: 13-Oct-2021 18:00:00
3rd row: 11-Oct-2021 12:00:00
4th row: 13-Oct-2021 18:00:00
5th row: 13-Oct-2021 18:00:00

Common Values

Value | Count | Frequency (%)
13-Oct-2021 18:00:00 | 29 | 59.2%
13-Oct-2021 12:00:00 | 4 | 8.2%
13-Oct-2021 06:00:00 | 3 | 6.1%
11-Oct-2021 12:00:00 | 1 | 2.0%
10-Aug-2021 12:00:00 | 1 | 2.0%
29-Sep-2021 06:00:00 | 1 | 2.0%
12-Oct-2021 18:00:00 | 1 | 2.0%
04-Aug-2021 06:00:00 | 1 | 2.0%
06-Oct-2021 06:00:00 | 1 | 2.0%
09-Sep-2021 06:00:00 | 1 | 2.0%
Other values (6) | 6 | 12.2%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
13-oct-2021 | 36 | 36.7%
18:00:00 | 31 | 31.6%
12:00:00 | 8 | 8.2%
06:00:00 | 7 | 7.1%
00:00:00 | 3 | 3.1%
12-oct-2021 | 2 | 2.0%
01-oct-2021 | 2 | 2.0%
11-oct-2021 | 1 | 1.0%
10-aug-2021 | 1 | 1.0%
29-sep-2021 | 1 | 1.0%
Other values (6) | 6 | 6.1%

NOX
Real number (ℝ)

ZEROS 

Distinct: 29
Distinct (%): 59.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5729715.4
Minimum: 0
Maximum: 2.16386 × 10^8
Zeros: 21
Zeros (%): 42.9%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 134182
Q3: 1554820
95-th percentile: 6987172
Maximum: 2.16386 × 10^8
Range: 2.16386 × 10^8
Interquartile range (IQR): 1554820

Descriptive statistics

Standard deviation: 30839485
Coefficient of variation (CV): 5.3823765
Kurtosis: 48.201194
Mean: 5729715.4
Median Absolute Deviation (MAD): 134182
Skewness: 6.9184963
Sum: 2.8075606 × 10^8
Variance: 9.5107386 × 10^14
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=29)]
Common values
Value | Count | Frequency (%)
0.0 | 21 | 42.9%
3622240.0 | 1 | 2.0%
6267040.0 | 1 | 2.0%
443371.0 | 1 | 2.0%
720683.0 | 1 | 2.0%
7467260.0 | 1 | 2.0%
612149.0 | 1 | 2.0%
1589500.0 | 1 | 2.0%
1651850.0 | 1 | 2.0%
1554820.0 | 1 | 2.0%
Other values (19) | 19 | 38.8%

Smallest values (ascending)
Value | Count | Frequency (%)
0.0 | 21 | 42.9%
838.925 | 1 | 2.0%
9307.34 | 1 | 2.0%
100141.0 | 1 | 2.0%
134182.0 | 1 | 2.0%
287515.0 | 1 | 2.0%
384970.0 | 1 | 2.0%
443371.0 | 1 | 2.0%
610403.0 | 1 | 2.0%
612149.0 | 1 | 2.0%

Largest values (descending)
Value | Count | Frequency (%)
216386000.0 | 1 | 2.0%
15734800.0 | 1 | 2.0%
7467260.0 | 1 | 2.0%
6267040.0 | 1 | 2.0%
5014260.0 | 1 | 2.0%
3700000.0 | 1 | 2.0%
3622240.0 | 1 | 2.0%
3418800.0 | 1 | 2.0%
3275940.0 | 1 | 2.0%
1975120.0 | 1 | 2.0%

RN
Real number (ℝ)

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26
Minimum: 2
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 2
5-th percentile: 4.4
Q1: 14
Median: 26
Q3: 38
95-th percentile: 47.6
Maximum: 50
Range: 48
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.28869
Coefficient of variation (CV): 0.54956501
Kurtosis: -1.2
Mean: 26
Median Absolute Deviation (MAD): 12
Skewness: 0
Sum: 1274
Variance: 204.16667
Monotonicity: Strictly increasing
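
The RN figures are what one would expect from a plain row counter. With 49 distinct values, a minimum of 2, a maximum of 50 and a strictly increasing order, the column is presumably the consecutive integers 2 through 50; that reading is an assumption, but the short check below shows it is consistent with the reported sum, median, variance and median absolute deviation.

```python
import statistics

# Assumed contents of RN: the consecutive integers 2..50 (49 values).
rn = list(range(2, 51))

median = statistics.median(rn)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(x - median) for x in rn)
variance = statistics.variance(rn)  # sample variance, divides by n - 1

print(f"n={len(rn)} sum={sum(rn)} median={median} mad={mad}")
print(f"variance={variance:.5f}")

# For n consecutive integers the sample variance has the closed form
# n * (n + 1) / 12, which gives the same number.
n = len(rn)
print(f"closed form={n * (n + 1) / 12:.5f}")
```

A column like this carries no information beyond row order, which is worth keeping in mind when reading its entries in the correlation tables.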

[Figure: Histogram with fixed size bins (bins=49)]
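Common values
Value | Count | Frequency (%)
2 | 1 | 2.0%
39 | 1 | 2.0%
29 | 1 | 2.0%
30 | 1 | 2.0%
31 | 1 | 2.0%
32 | 1 | 2.0%
33 | 1 | 2.0%
34 | 1 | 2.0%
35 | 1 | 2.0%
36 | 1 | 2.0%
Other values (39) | 39 | 79.6%

Smallest values (ascending)
Value | Count | Frequency (%)
2 | 1 | 2.0%
3 | 1 | 2.0%
4 | 1 | 2.0%
5 | 1 | 2.0%
6 | 1 | 2.0%
7 | 1 | 2.0%
8 | 1 | 2.0%
9 | 1 | 2.0%
10 | 1 | 2.0%
11 | 1 | 2.0%

Largest values (descending)
Value | Count | Frequency (%)
50 | 1 | 2.0%
49 | 1 | 2.0%
48 | 1 | 2.0%
47 | 1 | 2.0%
46 | 1 | 2.0%
45 | 1 | 2.0%
44 | 1 | 2.0%
43 | 1 | 2.0%
42 | 1 | 2.0%
41 | 1 | 2.0%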

Interactions

[Figures: 9 interaction plots]

Correlations

 | SHIP_OWNER_NM | SHIP_CNT | DPTR_HMS | ARVL_HMS | NOX | RN
SHIP_OWNER_NM | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
SHIP_CNT | 1.000 | 1.000 | 0.000 | 0.000 | 0.850 | 0.114
DPTR_HMS | 1.000 | 0.000 | 1.000 | 0.513 | 0.000 | 0.166
ARVL_HMS | 1.000 | 0.000 | 0.513 | 1.000 | 0.176 | 0.415
NOX | 1.000 | 0.850 | 0.000 | 0.176 | 1.000 | 0.325
RN | 1.000 | 0.114 | 0.166 | 0.415 | 0.325 | 1.000

 | DPTR_HMS | ARVL_HMS
DPTR_HMS | 1.000 | 0.228
ARVL_HMS | 0.228 | 1.000

 | SHIP_CNT | NOX | RN | DPTR_HMS | ARVL_HMS
SHIP_CNT | 1.000 | 0.387 | 0.240 | 0.000 | 0.161
NOX | 0.387 | 1.000 | 0.114 | 0.000 | 0.084
RN | 0.240 | 0.114 | 1.000 | 0.053 | 0.125
DPTR_HMS | 0.000 | 0.000 | 0.053 | 1.000 | 0.228
ARVL_HMS | 0.161 | 0.084 | 0.125 | 0.228 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
Index | SHIP_OWNER_NM | SHIP_CNT | DPTR_HMS | ARVL_HMS | NOX | RN
0 | ANTIPAXOS SHIPPING | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 3622240.0 | 2
1 | HANARO SHIPPING SEOUL | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 0.0 | 3
2 | SHOKUYU TANKER | 1 | 01-Jan-2021 00:00:00 | 11-Oct-2021 12:00:00 | 0.0 | 4
3 | MSI SHIPMANAGEMENT | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 0.0 | 5
4 | SEA COUNTESS MARINE | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 287515.0 | 6
5 | PROSAFE PRODUCTION SERVICES | 1 | 03-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 384970.0 | 7
6 | DAMICO TANKERS | 2 | 01-Jan-2021 00:00:00 | 10-Aug-2021 12:00:00 | 3700000.0 | 8
7 | SAN LORENZO SHIPPING | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 838.925 | 9
8 | OSAKA FLEET | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 5014260.0 | 10
9 | CONFORD TRADING | 1 | 05-Jan-2021 00:00:00 | 29-Sep-2021 06:00:00 | 0.0 | 11

Last rows
Index | SHIP_OWNER_NM | SHIP_CNT | DPTR_HMS | ARVL_HMS | NOX | RN
39 | SINO OCEAN SHIPPING | 1 | 02-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 1651850.0 | 41
40 | JINGHAI SHIPPING | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 12:00:00 | 0.0 | 42
41 | MAERSK SHIPPING HONG KONG | 1 | 02-Jan-2021 00:00:00 | 10-Jan-2021 18:00:00 | 0.0 | 43
42 | HUA HAI PETROLEUM TRANSPORT & | 1 | 01-Jan-2021 00:00:00 | 01-Oct-2021 12:00:00 | 1589500.0 | 44
43 | HAIHUA SHIPPING | 1 | 01-Jan-2021 00:00:00 | 13-Oct-2021 12:00:00 | 612149.0 | 45
44 | CHINA SHIPPING DEVELOPMENT | 5 | 01-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 7467260.0 | 46
45 | MINSHENG FINANCIAL LEASING | 2 | 01-Jan-2021 00:00:00 | 13-Oct-2021 12:00:00 | 720683.0 | 47
46 | CMB FINANCIAL LEASING | 1 | 04-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 443371.0 | 48
47 | SINOCHEM SHIPPING HAINAN | 13 | 02-Jan-2021 00:00:00 | 13-Oct-2021 18:00:00 | 6267040.0 | 49
48 | ANRUN SHIPPING | 1 | 01-Jan-2021 00:00:00 | 12-Oct-2021 00:00:00 | 0.0 | 50