Overview

Dataset statistics

Number of variables: 5
Number of observations: 2552
Missing cells: 7
Missing cells (%): 0.1%
Duplicate rows: 67
Duplicate rows (%): 2.6%
Total size in memory: 102.3 KiB
Average record size in memory: 41.1 B
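
The two percentages above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (observations × variables) and the duplicate percentage over observations; the helper name `pct` is illustrative, not part of the report.

```python
# Counts copied from the dataset statistics of this report.
n_vars = 5
n_rows = 2552
missing_cells = 7
duplicate_rows = 67

def pct(part, whole):
    """Format part/whole as a percentage with one decimal place."""
    return f"{100 * part / whole:.1f}%"

# Assumption: missing cells are measured against rows x variables.
total_cells = n_vars * n_rows
missing_pct = pct(missing_cells, total_cells)
# Assumption: duplicate rows are measured against the number of rows.
duplicate_pct = pct(duplicate_rows, n_rows)

print(total_cells, missing_pct, duplicate_pct)
```

Under these assumptions the computed values agree with the 0.1% and 2.6% shown in the report.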

Variable types

Categorical: 1
DateTime: 2
Text: 1
Numeric: 1

Dataset

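Description: File data containing operation records for the port cleaning vessels (청항선) operated at 12 major ports nationwide, together with collection results for floating marine debris. It provides information such as vessel location, vessel name, branch office name, working time, and collected amount.
Author: 해양환경공단 (Korea Marine Environment Management Corporation)
URL: https://www.data.go.kr/data/15002252/fileData.do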

Alerts

Dataset has 67 (2.6%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 07:19:07.294714
Analysis finished: 2023-12-12 07:19:08.222392
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

선박명 (vessel name)
Categorical

Distinct: 22
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 20.1 KiB

Value              Count
청항1호             204
청항2호             198
항만정화1호         197
파란호              181
항만정화2호         170
Other values (17)  1602

Length

Max length: 8
Median length: 6
Mean length: 4.6398903
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 황금산호
2nd row: 포항해양3호
3rd row: 항만정화1호
4th row: 항만정화1호
5th row: 청항1호

Common Values

Value              Count  Frequency (%)
청항1호             204    8.0%
청항2호             198    7.8%
항만정화1호         197    7.7%
파란호              181    7.1%
항만정화2호         170    6.7%
부산936호           169    6.6%
청화호              126    4.9%
청누리호            125    4.9%
여청2호             124    4.9%
목포청해호          117    4.6%
Other values (12)  941    36.9%
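
As a consistency check on the common-values table for 선박명, the counts should add up to the 2552 observations, and each frequency should equal count / 2552. The sketch below assumes the frequencies are rounded to one decimal place; the dictionary is transcribed from the table.

```python
# Counts transcribed from the common-values table for 선박명.
counts = {
    "청항1호": 204,
    "청항2호": 198,
    "항만정화1호": 197,
    "파란호": 181,
    "항만정화2호": 170,
    "부산936호": 169,
    "청화호": 126,
    "청누리호": 125,
    "여청2호": 124,
    "목포청해호": 117,
    "Other values (12)": 941,
}
n_rows = 2552  # number of observations reported in the overview

total = sum(counts.values())
# Assumption: frequencies are count / n_rows, shown with one decimal.
freq = {name: f"{100 * c / n_rows:.1f}%" for name, c in counts.items()}

print(total)
for name, value in freq.items():
    print(name, value)
```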

Length

[Figure: Histogram of lengths of the category]

작업일자 (work date)
DateTime

Distinct: 265
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Memory size: 20.1 KiB
Minimum: 2022-01-03 00:00:00
Maximum: 2022-12-31 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

작업장소 (work location)
Text

Distinct: 1121
Distinct (%): 44.0%
Missing: 7
Missing (%): 0.3%
Memory size: 20.1 KiB

Length

Max length: 54
Median length: 33
Mean length: 8.5300589
Min length: 2

Characters and Unicode

Total characters: 21709
Distinct characters: 329
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
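
To illustrate the kind of character property referred to here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not necessarily how the profiling tool computes its tables; the two strings are 작업장소 values that appear elsewhere in this report, and the function name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two-letter codes map to the names used in the report, for example:
# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number,
# Lu = Uppercase Letter, Ps = Open Punctuation, Pe = Close Punctuation.
samples = ["여객선 항로(원거리)", "1구역 M묘지 부근"]
for s in samples:
    print(s, dict(category_counts(s)))
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables for this variable.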

Unique

Unique: 829
Unique (%): 32.6%

Sample

1st row: 대산항
2nd row: 신항
3rd row: 감만시민부두
4th row: 조도 방파제부근 해상
5th row: 인천 여객선 항로

Value               Count  Frequency (%)
해상                 320    6.4%
(unreadable)        226    4.5%
부근                 165    3.3%
북항                 157    3.1%
여객선               151    3.0%
항로                 133    2.7%
신항                 101    2.0%
(unreadable)        97     1.9%
부산항               96     1.9%
인천                 87     1.7%
Other values (967)  3465   69.3%

Most occurring characters

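Value               Count  Frequency (%)
(space)             2484   11.4%
(unreadable)        1486   6.8%
(unreadable)        1292   6.0%
(unreadable)        936    4.3%
(unreadable)        575    2.6%
,                   554    2.6%
(unreadable)        528    2.4%
(unreadable)        502    2.3%
(unreadable)        410    1.9%
(unreadable)        407    1.9%
Other values (319)  12535  57.7%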

Most occurring categories

Value              Count  Frequency (%)
Other Letter       17276  79.6%
Space Separator    2484   11.4%
Other Punctuation  604    2.8%
Decimal Number     532    2.5%
Dash Punctuation   357    1.6%
Uppercase Letter   178    0.8%
Close Punctuation  95     0.4%
Open Punctuation   95     0.4%
Lowercase Letter   76     0.4%
Math Symbol        12     0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(unreadable)        1486   8.6%
(unreadable)        1292   7.5%
(unreadable)        936    5.4%
(unreadable)        575    3.3%
(unreadable)        528    3.1%
(unreadable)        502    2.9%
(unreadable)        410    2.4%
(unreadable)        407    2.4%
(unreadable)        380    2.2%
(unreadable)        363    2.1%
Other values (262)  10397  60.2%

Uppercase Letter
Value              Count  Frequency (%)
S                  37     20.8%
M                  24     13.5%
W                  18     10.1%
P                  15     8.4%
K                  13     7.3%
A                  11     6.2%
T                  10     5.6%
C                  8      4.5%
I                  7      3.9%
O                  6      3.4%
Other values (10)  29     16.3%

Lowercase Letter
Value             Count  Frequency (%)
a                 15     19.7%
p                 12     15.8%
m                 10     13.2%
g                 9      11.8%
o                 7      9.2%
n                 5      6.6%
t                 5      6.6%
l                 3      3.9%
s                 2      2.6%
e                 2      2.6%
Other values (6)  6      7.9%

Decimal Number
Value  Count  Frequency (%)
1      142    26.7%
3      112    21.1%
2      102    19.2%
5      53     10.0%
4      51     9.6%
8      31     5.8%
7      19     3.6%
6      10     1.9%
9      7      1.3%
0      5      0.9%

Other Punctuation
Value  Count  Frequency (%)
,      554    91.7%
;      16     2.6%
#      15     2.5%
.      11     1.8%
&      8      1.3%

Math Symbol
Value  Count  Frequency (%)
~      10     83.3%
>      2      16.7%

Space Separator
Value    Count  Frequency (%)
(space)  2484   100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      357    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      95     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      95     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  17276  79.6%
Common  4179   19.3%
Latin   254    1.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(unreadable)        1486   8.6%
(unreadable)        1292   7.5%
(unreadable)        936    5.4%
(unreadable)        575    3.3%
(unreadable)        528    3.1%
(unreadable)        502    2.9%
(unreadable)        410    2.4%
(unreadable)        407    2.4%
(unreadable)        380    2.2%
(unreadable)        363    2.1%
Other values (262)  10397  60.2%

Latin
Value              Count  Frequency (%)
S                  37     14.6%
M                  24     9.4%
W                  18     7.1%
P                  15     5.9%
a                  15     5.9%
K                  13     5.1%
p                  12     4.7%
A                  11     4.3%
m                  10     3.9%
T                  10     3.9%
Other values (26)  89     35.0%

Common
Value              Count  Frequency (%)
(space)            2484   59.4%
,                  554    13.3%
-                  357    8.5%
1                  142    3.4%
3                  112    2.7%
2                  102    2.4%
)                  95     2.3%
(                  95     2.3%
5                  53     1.3%
4                  51     1.2%
Other values (11)  134    3.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  17276  79.6%
ASCII   4433   20.4%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            2484   56.0%
,                  554    12.5%
-                  357    8.1%
1                  142    3.2%
3                  112    2.5%
2                  102    2.3%
)                  95     2.1%
(                  95     2.1%
5                  53     1.2%
4                  51     1.2%
Other values (47)  388    8.8%

Hangul
Value               Count  Frequency (%)
(unreadable)        1486   8.6%
(unreadable)        1292   7.5%
(unreadable)        936    5.4%
(unreadable)        575    3.3%
(unreadable)        528    3.1%
(unreadable)        502    2.9%
(unreadable)        410    2.4%
(unreadable)        407    2.4%
(unreadable)        380    2.2%
(unreadable)        363    2.1%
Other values (262)  10397  60.2%

작업시간 (working time)
DateTime

Distinct: 46
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.1 KiB
Minimum: 2023-12-12 00:00:00
Maximum: 2023-12-12 16:00:00
[Figure: Histogram with fixed size bins (bins=46)]
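
The minimum and maximum of this variable carry the analysis date (2023-12-12), while the sample rows show 작업시간 as plain `HH:MM` strings such as `00:50`. This suggests, as an assumption rather than something the report states, that the strings were parsed as datetimes with the run date filled in. If the values are meant as durations, a minimal sketch for converting them is shown below; `parse_duration` is a hypothetical helper, not part of the dataset or the profiling tool.

```python
from datetime import timedelta

def parse_duration(text):
    """Convert an 'HH:MM' string into a timedelta (assumed to be a duration)."""
    hours, minutes = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))

# 작업시간 values taken from the sample rows of this report.
samples = ["00:50", "01:30", "02:40"]
durations = [parse_duration(s) for s in samples]
total_minutes = [int(d.total_seconds() // 60) for d in durations]

print(total_minutes)
```

Working in minutes avoids the artificial date component when relating working time to the collected amount.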

수거량 (collected amount)
Real number (ℝ)

Distinct: 81
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1789.3354
Minimum: 10
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 10
5-th percentile: 50
Q1: 300
median: 1000
Q3: 2000
95-th percentile: 6000
Maximum: 35000
Range: 34990
Interquartile range (IQR): 1700

Descriptive statistics

Standard deviation: 2802.6187
Coefficient of variation (CV): 1.5662903
Kurtosis: 32.77906
Mean: 1789.3354
Median Absolute Deviation (MAD): 800
Skewness: 4.7274205
Sum: 4566384
Variance: 7854671.6
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
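
Several of the statistics for 수거량 are linked by standard definitions: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation², and sum = mean × number of observations. The sketch below recomputes them from the rounded figures printed in the report, so small differences in the last digits are expected.

```python
# Figures copied from the quantile and descriptive statistics of 수거량.
n = 2552
mean = 1789.3354
std = 2802.6187
q1, q3 = 300, 2000
minimum, maximum = 10, 35000

value_range = maximum - minimum   # reported Range: 34990
iqr = q3 - q1                     # reported IQR: 1700
cv = std / mean                   # reported CV: 1.5662903
variance = std ** 2               # reported Variance: 7854671.6
total = mean * n                  # reported Sum: 4566384

print(value_range, iqr)
print(round(cv, 7), round(variance, 1), round(total))
```

A CV above 1 together with a skewness of about 4.7 indicates a strongly right-skewed distribution, consistent with a median of 1000 against a maximum of 35000.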

Common values

Value              Count  Frequency (%)
1000               543    21.3%
2000               349    13.7%
500                294    11.5%
200                173    6.8%
3000               160    6.3%
300                144    5.6%
100                137    5.4%
50                 128    5.0%
1500               96     3.8%
5000               93     3.6%
Other values (71)  435    17.0%

Minimum 10 values

Value  Count  Frequency (%)
10     5      0.2%
20     9      0.4%
30     10     0.4%
40     2      0.1%
50     128    5.0%
60     1      < 0.1%
100    137    5.4%
120    1      < 0.1%
130    1      < 0.1%
150    22     0.9%

Maximum 10 values

Value  Count  Frequency (%)
35000  1      < 0.1%
30000  2      0.1%
27000  1      < 0.1%
26000  1      < 0.1%
25000  3      0.1%
20000  5      0.2%
18000  1      < 0.1%
17000  2      0.1%
15000  13     0.5%
14000  1      < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

           선박명   작업시간  수거량
선박명      1.000   0.736    0.457
작업시간    0.736   1.000    0.654
수거량      0.457   0.654    1.000

[Figure: correlation heatmap]

           수거량   선박명
수거량      1.000   0.196
선박명      0.196   1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  선박명        작업일자     작업장소                                              작업시간  수거량
0      황금산호      2022-01-03  대산항                                                00:50    300
1      포항해양3호   2022-01-03  신항                                                  01:30    500
2      항만정화1호   2022-01-03  감만시민부두                                          01:00    1000
3      항만정화1호   2022-01-03  조도 방파제부근 해상                                  01:30    500
4      청항1호       2022-01-03  인천 여객선 항로                                      01:00    1000
5      청항1호       2022-01-03  석탄 부두 앞 해상                                     00:50    500
6      청항2호       2022-01-03  물치도                                                01:00    1500
7      청항2호       2022-01-03  북항                                                  01:00    1500
8      온바르호      2022-01-03  제주항내                                              01:20    50
9      부산936호     2022-01-04  감만시민부두,감만부두,동명부두,신선대부두,국제여객부두  02:00    3000

Last rows

Index  선박명        작업일자     작업장소               작업시간  수거량
2542   온바르호      2022-12-28  제주항내               02:40    8000
2543   목포청해호    2022-12-29  장좌도 옆 해상         00:00    150
2544   청화호        2022-12-29  e-3                    03:30    5000
2545   청항1호       2022-12-29  원거리 해역            01:00    100
2546   에코인천호    2022-12-29  여객선 항로(원거리)    01:00    500
2547   에코인천호    2022-12-29  인천 송도 신항 해상    01:00    500
2548   에코인천호    2022-12-29  인천 송도 신항 해상    01:00    500
2549   에코인천호    2022-12-29  여객선 항로(원거리)    01:00    500
2550   여청2호       2022-12-30  오동도                 03:00    1000
2551   온바르호      2022-12-31  제주항내               01:00    1000

Duplicate rows

Most frequently occurring

Index  선박명      작업일자     작업장소           작업시간  수거량  # duplicates
47     청화호      2022-09-07  1구역 M묘지 부근   00:30    2000    10
7      부산936호   2022-08-03  부산항             01:00    1000    8
8      부산936호   2022-08-10  부산항             01:00    1000    8
14     부산936호   2022-10-19  부산항             01:00    1000    8
9      부산936호   2022-08-11  부산항             01:00    1000    6
6      부산936호   2022-07-27  부산항             01:00    1000    5
12     부산936호   2022-09-07  감천항             01:00    1000    5
13     부산936호   2022-10-05  부산항             01:00    2000    5
50     청화호      2022-09-16  2구역 목도 부근    00:30    1000    5
2      부산936호   2022-07-06  부산항             01:00    1000    4
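
A row counts as a duplicate when all five fields match another row. As a small illustration of that idea, and not a reproduction of how the report arrives at its 67 duplicate rows, the sketch below groups identical rows with `collections.Counter`. The rows are taken from the tail of the sample (indices 2546 to 2550), which contains two identical pairs.

```python
from collections import Counter

# (선박명, 작업일자, 작업장소, 작업시간, 수거량) for sample rows 2546-2550.
rows = [
    ("에코인천호", "2022-12-29", "여객선 항로(원거리)", "01:00", 500),
    ("에코인천호", "2022-12-29", "인천 송도 신항 해상", "01:00", 500),
    ("에코인천호", "2022-12-29", "인천 송도 신항 해상", "01:00", 500),
    ("에코인천호", "2022-12-29", "여객선 항로(원거리)", "01:00", 500),
    ("여청2호", "2022-12-30", "오동도", "03:00", 1000),
]

# Count how often each exact row occurs and keep those seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicated.items():
    print(n, row)
```

Whether such repeats are data-entry duplicates or genuinely separate operations at the same place and time cannot be decided from the report alone.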