Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 204
Duplicate rows (%): 2.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B
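
The duplicate percentage and the average record size follow directly from the raw counts. A minimal sketch that recomputes them from the figures reported in this section, assuming 1 KiB = 1024 bytes (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics in this report.
n_observations = 10_000
n_duplicate_rows = 204
total_size_kib = 566.4

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{duplicate_pct:.1f}%")      # 2.0%
print(f"{avg_record_bytes:.1f} B")  # 58.0 B
```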

Variable types

Numeric: 1
DateTime: 3
Categorical: 2

Dataset

Description: Scheduled berth arrival data extracted from the Ulsan Port Authority (울산항만공사) berth operation system. ※ Reference years: 2020 to 2023 (most recent three years)
URL: https://www.data.go.kr/data/15121199/fileData.do

Alerts

Duplicates: Dataset has 204 (2.0%) duplicate rows
High correlation: 배정부두코드명 is highly overall correlated with 배정항구코드명
High correlation: 배정항구코드명 is highly overall correlated with 배정부두코드명
Imbalance: 배정항구코드명 is highly imbalanced (55.8%)
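
The 55.8% imbalance figure can be reproduced from the three class counts reported for 배정항구코드명 (8123, 1874 and 3). The sketch below assumes the score is one minus the Shannon entropy normalized by the log of the number of classes. This is an assumption about how the report defines imbalance, although it does yield the value shown in the alert. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the class
    frequencies, k the number of classes. 0 means perfectly balanced;
    values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # The normalization is undefined for a single class; this sketch returns 0.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Class counts for 배정항구코드명, copied from the report.
port_code_counts = {"820": 8123, "823": 1874, "821": 3}
score = imbalance_score(list(port_code_counts.values()))
print(f"{score:.1%}")  # 55.8%
```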

Reproduction

Analysis started: 2023-12-12 02:52:11.286184
Analysis finished: 2023-12-12 02:52:12.219231
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

입항연도 (arrival year)
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2021.638
Minimum: 2000
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2020
Q1: 2021
Median: 2022
Q3: 2022
95-th percentile: 2023
Maximum: 2024
Range: 24
Interquartile range (IQR): 1
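
These quantiles can be checked against the value counts the report lists for 입항연도. The sketch below expands the counts into 10000 observations and uses linear interpolation between neighbouring ranks. The interpolation rule is an assumption, but because the data consist of long runs of identical years, the common rules give the same result here. The helper `quantile` is hypothetical.

```python
# Value counts for 입항연도, copied from the report.
year_counts = {2000: 2, 2002: 3, 2020: 1072, 2021: 3449,
               2022: 3398, 2023: 2075, 2024: 1}

# Expand the counts into a sorted list of individual observations.
values = sorted(v for v, c in year_counts.items() for _ in range(c))

def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

summary = {q: quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = summary[0.75] - summary[0.25]
value_range = values[-1] - values[0]

print(summary)           # 5%, Q1, median, Q3, 95%
print(iqr, value_range)  # interquartile range and full range
```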

Descriptive statistics

Standard deviation: 1.0331809
Coefficient of variation (CV): 0.00051106127
Kurtosis: 76.049171
Mean: 2021.638
Median Absolute Deviation (MAD): 1
Skewness: -3.9156578
Sum: 20216380
Variance: 1.0674627
Monotonicity: Not monotonic
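
The sum, mean, variance, standard deviation and coefficient of variation can be recomputed from the value counts the report lists for 입항연도. The sketch below uses Python's `statistics` module and assumes the sample (n - 1) denominator for the variance, which matches the reported figures. Skewness and kurtosis are not covered.

```python
import statistics

# Value counts for 입항연도, copied from the report.
year_counts = {2000: 2, 2002: 3, 2020: 1072, 2021: 3449,
               2022: 3398, 2023: 2075, 2024: 1}

# Expand the counts into individual observations.
values = [v for v, c in year_counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

print(total, mean)
print(round(variance, 7), round(std_dev, 7))
```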
Histogram with fixed size bins (bins=7)

Values sorted by count

Value  Count  Frequency (%)
2021   3449   34.5%
2022   3398   34.0%
2023   2075   20.8%
2020   1072   10.7%
2002   3      < 0.1%
2000   2      < 0.1%
2024   1      < 0.1%

Values in ascending order

Value  Count  Frequency (%)
2000   2      < 0.1%
2002   3      < 0.1%
2020   1072   10.7%
2021   3449   34.5%
2022   3398   34.0%
2023   2075   20.8%
2024   1      < 0.1%

Values in descending order

Value  Count  Frequency (%)
2024   1      < 0.1%
2023   2075   20.8%
2022   3398   34.0%
2021   3449   34.5%
2020   1072   10.7%
2002   3      < 0.1%
2000   2      < 0.1%

입항예정일시 (scheduled arrival date and time)
DateTime

Distinct: 7690
Distinct (%): 76.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-06-23 17:00:00
Maximum: 2023-08-27 15:00:00
Histogram with fixed size bins (bins=50)

배정항구코드명 (assigned port code name)
Categorical

Alerts: High correlation, Imbalance

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
820: 8123
823: 1874
821: 3

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 820
2nd row: 820
3rd row: 820
4th row: 823
5th row: 820

Common Values

Value  Count  Frequency (%)
820    8123   81.2%
823    1874   18.7%
821    3      < 0.1%

Length

Histogram of lengths of the category


배정부두코드명 (assigned wharf code name)
Categorical

Alerts: High correlation

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
MB6: 1690
MBN: 1183
MB4: 1023
MB3: 859
MB2: 826
Other values (16): 4419

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MB6
2nd row: MB8
3rd row: MB3
4th row: MBN
5th row: MB3

Common Values

Value              Count  Frequency (%)
MB6                1690   16.9%
MBN                1183   11.8%
MB4                1023   10.2%
MB3                859    8.6%
MB2                826    8.3%
MBG                694    6.9%
MDY                529    5.3%
MB8                520    5.2%
MBF                403    4.0%
MB7                372    3.7%
Other values (11)  1901   19.0%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
mb6                1690   16.9%
mbn                1183   11.8%
mb4                1023   10.2%
mb3                859    8.6%
mb2                826    8.3%
mbg                694    6.9%
mdy                529    5.3%
mb8                520    5.2%
mbf                403    4.0%
mb7                372    3.7%
Other values (11)  1901   19.0%

등록일 (registration date)
DateTime

Distinct: 879
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-06-23 00:00:00
Maximum: 2023-08-24 00:00:00
Histogram with fixed size bins (bins=50)

배정수정일 (assignment modification date)
DateTime

Distinct: 939
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-06-23 00:00:00
Maximum: 2023-08-24 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: interactions plot]

Correlations

                입항연도  배정항구코드명  배정부두코드명
입항연도        1.000     0.255           0.240
배정항구코드명  0.255     1.000           1.000
배정부두코드명  0.240     1.000           1.000

                배정부두코드명  배정항구코드명
배정부두코드명  1.000           0.999
배정항구코드명  0.999           1.000

                입항연도  배정항구코드명  배정부두코드명
입항연도        1.000     0.083           0.113
배정항구코드명  0.083     1.000           0.999
배정부두코드명  0.113     0.999           1.000
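
A near-perfect association between 배정항구코드명 and 배정부두코드명 is what one would expect if each wharf code belongs to exactly one port code. This text does not name the measure behind each matrix, so the sketch below uses Cramér's V purely as an illustration, since it is a common association measure for two categorical variables. Both contingency tables are hypothetical and not taken from the dataset; the function `cramers_v` is written for this example and applies no bias correction.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.
    Assumes no row or column is entirely zero."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical counts: rows are two port codes, columns are three wharf
# codes, and every wharf appears under exactly one port.
nested = [[50, 30, 0],
          [0, 0, 20]]

# Hypothetical counts where the wharf mix is identical for both ports.
independent = [[10, 20],
               [20, 40]]

v_nested = cramers_v(nested)            # wharf fully determines port
v_independent = cramers_v(independent)  # no association
print(v_nested, v_independent)
```

When one variable fully determines the other, as in the first table, the statistic reaches its maximum of 1, while identical row proportions give 0.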

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  입항연도  입항예정일시      배정항구코드명  배정부두코드명  등록일      배정수정일
1084   2020      2020-12-01 12:00  820             MB6             2020-11-30  2020-12-01
9729   2022      2022-06-08 16:00  820             MB8             2022-06-08  2022-06-08
3425   2021      2021-05-06 13:50  820             MB3             2021-05-04  2021-05-06
7862   2022      2022-02-15 16:00  823             MBN             2022-02-14  2022-02-14
6486   2021      2021-11-29 3:00   820             MB3             2021-11-28  2021-11-28
7612   2022      2022-01-23 14:00  820             MB3             2022-01-28  2022-01-28
9650   2022      2022-06-01 6:00   820             MBP             2022-06-02  2022-06-02
14125  2023      2023-03-21 4:00   820             MDY             2023-03-21  2023-03-21
7209   2022      2022-01-09 17:00  820             MDY             2022-01-07  2022-01-07
8877   2022      2022-04-14 3:00   820             MB3             2022-04-13  2022-04-14

Index  입항연도  입항예정일시      배정항구코드명  배정부두코드명  등록일      배정수정일
4117   2021      2021-06-27 11:00  820             MDY             2021-06-25  2021-06-25
3005   2021      2021-04-07 19:00  820             MB4             2021-04-07  2021-04-07
9294   2022      2022-05-10 15:00  820             MB4             2022-05-09  2022-05-09
9758   2022      2022-06-12 0:00   820             MB3             2022-06-10  2022-06-10
10895  2022      2022-08-29 8:00   820             MB4             2022-08-29  2022-08-29
11052  2022      2022-09-07 11:00  823             MBN             2022-09-08  2022-09-08
8620   2022      2022-03-30 10:00  823             MBF             2022-03-30  2022-03-30
15505  2023      2023-06-09 11:41  820             MB4             2023-06-09  2023-06-09
4860   2021      2021-08-14 23:00  820             MB9             2021-08-14  2021-08-14
5471   2021      2021-09-23 6:00   820             MB5             2021-09-22  2021-09-22

Duplicate rows

Most frequently occurring

Index  입항연도  입항예정일시      배정항구코드명  배정부두코드명  등록일      배정수정일  # duplicates
67     2022      2022-04-09 8:00   823             MBN             2022-04-08  2022-04-08  3
74     2022      2022-04-29 17:00  823             MBN             2022-04-29  2022-04-29  3
87     2022      2022-06-20 17:00  823             MBN             2022-06-20  2022-06-20  3
112    2022      2022-10-13 17:00  823             MBN             2022-10-13  2022-10-13  3
122    2022      2022-11-22 17:00  823             MBN             2022-11-22  2022-11-22  3
135    2022      2022-12-30 17:00  823             MBN             2022-12-30  2022-12-30  3
167    2023      2023-04-12 14:00  823             MBN             2023-04-12  2023-04-12  3
171    2023      2023-05-11 17:00  823             MBN             2023-05-11  2023-05-11  3
179    2023      2023-05-31 17:00  823             MBN             2023-05-31  2023-05-31  3
180    2023      2023-06-01 17:00  823             MBN             2023-06-01  2023-06-01  3