Overview

Dataset statistics

Number of variables: 7
Number of observations: 738
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.9 KiB
Average record size in memory: 58.2 B

Variable types

Numeric: 2
Categorical: 4
Text: 1

Dataset

Description: Status of excursion ships and ferries (유선 및 도선) as of 2021, with fields including managing agency, vessel name, business type (excursion ship/ferry), and tonnage.
URL: https://www.data.go.kr/data/15061883/fileData.do

Alerts

번호 is highly overall correlated with 관리기관 and 2 other fields [High correlation]
관리기관 is highly overall correlated with 번호 and 2 other fields [High correlation]
선박종류 is highly overall correlated with 번호 and 2 other fields [High correlation]
사업구분 is highly overall correlated with 번호 and 2 other fields [High correlation]
사업구분 is highly imbalanced (69.6%) [Imbalance]
등록구분 is highly imbalanced (86.0%) [Imbalance]
번호 has unique values [Unique]
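
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below recomputes them from the counts listed for 사업구분 (698 / 40) and 등록구분 (708 / 24 / 5 / 1) in this report. The function name `imbalance_score` is illustrative, and the formula is an assumption that reproduces the reported figures, not a quotation of the profiler's source code.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, approaching 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # The score is undefined for a single class; return 0 by convention (assumption).
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

business_type = [698, 40]        # 사업구분: 유선, 도선
registration = [708, 24, 5, 1]   # 등록구분: 갱신, 폐업, 휴업, 신규

print(f"사업구분: {imbalance_score(business_type):.1%}")
print(f"등록구분: {imbalance_score(registration):.1%}")
```

A perfectly balanced column scores 0, so both columns sit well toward the imbalanced end of the scale.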

Reproduction

Analysis started: 2023-12-11 23:15:01.750162
Analysis finished: 2023-12-11 23:15:02.815772
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
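
A report of this kind can be regenerated from the source CSV with ydata-profiling. The following is a minimal sketch, not the exact invocation used for this report: the function name `build_report`, the output file name, the report title, and the `cp949` encoding are all assumptions, and the configuration in config.json is not reproduced here.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original run may have used a different configuration.
    profile = ProfileReport(df, title="Excursion ship and ferry status (2021)")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("ships_2021.csv")` (a hypothetical file name) would write `report.html` next to the script.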

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 369.5
Minimum: 1
Maximum: 738
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 37.85
Q1: 185.25
Median: 369.5
Q3: 553.75
95th percentile: 701.15
Maximum: 738
Range: 737
Interquartile range (IQR): 368.5
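
Because 번호 is a gap-free sequence from 1 to 738, these quantiles can be reproduced with linear interpolation between neighbouring order statistics. The sketch below is illustrative; the helper name `quantile` is an assumption, and linear interpolation is assumed because it matches the reported values.

```python
def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted sequence, 0 <= p <= 1."""
    position = (len(sorted_values) - 1) * p
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

ids = list(range(1, 739))  # 번호 runs from 1 to 738

for label, p in [("5th", 0.05), ("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95th", 0.95)]:
    print(label, round(quantile(ids, p), 2))

# Interquartile range: distance between the third and first quartiles.
iqr = quantile(ids, 0.75) - quantile(ids, 0.25)
print("IQR", round(iqr, 2))
```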

Descriptive statistics

Standard deviation: 213.18654
Coefficient of variation (CV): 0.57695951
Kurtosis: -1.2
Mean: 369.5
Median Absolute Deviation (MAD): 184.5
Skewness: 0
Sum: 272691
Variance: 45448.5
Monotonicity: Strictly increasing
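
These figures follow directly from 번호 being the integers 1 to 738. The sketch below recomputes them with the Python standard library, assuming the sample (n - 1) variance, which is the convention that matches the reported 45448.5.

```python
import math
import statistics

ids = list(range(1, 739))  # 번호: 1..738

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                   # coefficient of variation
median = statistics.median(ids)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median([abs(x - median) for x in ids])

print("Sum", total)
print("Mean", mean)
print("Variance", variance)
print("Standard deviation", round(std_dev, 5))
print("CV", round(cv, 8))
print("MAD", mad)
```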
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1      0.1%
497                 1      0.1%
488                 1      0.1%
489                 1      0.1%
490                 1      0.1%
491                 1      0.1%
492                 1      0.1%
493                 1      0.1%
494                 1      0.1%
495                 1      0.1%
Other values (728)  728    98.6%

Minimum 10 values
Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values
Value  Count  Frequency (%)
738    1      0.1%
737    1      0.1%
736    1      0.1%
735    1      0.1%
734    1      0.1%
733    1      0.1%
732    1      0.1%
731    1      0.1%
730    1      0.1%
729    1      0.1%

관리기관
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
강원도 춘천시: 152
서울특별시 한강사업본부: 121
대구광역시 동구: 103
대구광역시 수성구: 74
경기도 의왕시: 39
Other values (30): 249

Length

Max length: 12
Median length: 9
Mean length: 8.3536585
Min length: 5

Unique

Unique: 6
Unique (%): 0.8%

Sample

1st row: 서울특별시 한강사업본부
2nd row: 서울특별시 한강사업본부
3rd row: 서울특별시 한강사업본부
4th row: 서울특별시 한강사업본부
5th row: 서울특별시 한강사업본부

Common Values

Value  Count  Frequency (%)
강원도 춘천시  152  20.6%
서울특별시 한강사업본부  121  16.4%
대구광역시 동구  103  14.0%
대구광역시 수성구  74  10.0%
경기도 의왕시  39  5.3%
경기도 가평군  33  4.5%
경상북도 구미시  30  4.1%
경기도 평택시  29  3.9%
충청북도 제천시  20  2.7%
경상북도 청도군  14  1.9%
Other values (25)  123  16.7%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
대구광역시  178  12.1%
강원도  175  11.9%
춘천시  152  10.3%
서울특별시  131  8.9%
경기도  124  8.4%
한강사업본부  121  8.2%
동구  103  7.0%
수성구  74  5.0%
경상북도  52  3.5%
의왕시  39  2.6%
Other values (35)  324  22.0%

선박명
Text

Distinct: 680
Distinct (%): 92.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Length

Max length: 10
Median length: 9
Mean length: 5.4322493
Min length: 3

Characters and Unicode

Total characters: 4009
Distinct characters: 173
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
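
As an illustration of that idea, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using only the five sample vessel names listed for this variable. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. This is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
names = ["한강아라호", "카약1호", "카약2호", "카약3호", "파라다이스1호(유)"]

# Count the general category of every character across all names.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.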

Unique

Unique: 667
Unique (%): 90.4%

Sample

1st row: 한강아라호
2nd row: 카약1호
3rd row: 카약2호
4th row: 카약3호
5th row: 파라다이스1호(유)
Value  Count  Frequency (%)
로멘스보트  30  3.6%
오리보트  27  3.2%
청평  15  1.8%
현대  14  1.7%
오리배  14  1.7%
로멘스  13  1.5%
오리  8  0.9%
보트  8  0.9%
5호  6  0.7%
9호  6  0.7%
Other values (625)  702  83.3%

Most occurring characters

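Value  Count  Frequency (%)
(not preserved)  612  15.3%
1  251  6.3%
2  188  4.7%
(not preserved)  150  3.7%
(not preserved)  136  3.4%
(not preserved)  136  3.4%
(not preserved)  135  3.4%
3  124  3.1%
(space)  105  2.6%
(not preserved)  93  2.3%
Other values (163)  2079  51.9%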

Most occurring categories

Value              Count  Frequency (%)
Other Letter       2830   70.6%
Decimal Number     1042   26.0%
Space Separator    105    2.6%
Dash Punctuation   18     0.4%
Uppercase Letter   12     0.3%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(not preserved)  612  21.6%
(not preserved)  150  5.3%
(not preserved)  136  4.8%
(not preserved)  136  4.8%
(not preserved)  135  4.8%
(not preserved)  93  3.3%
(not preserved)  82  2.9%
(not preserved)  71  2.5%
(not preserved)  63  2.2%
(not preserved)  57  2.0%
Other values (146)  1295  45.8%

Decimal Number
Value  Count  Frequency (%)
1      251    24.1%
2      188    18.0%
3      124    11.9%
5      83     8.0%
4      71     6.8%
7      69     6.6%
8      67     6.4%
6      64     6.1%
9      63     6.0%
0      62     6.0%

Uppercase Letter
Value  Count  Frequency (%)
U      4      33.3%
F      4      33.3%
O      4      33.3%

Space Separator
Value    Count  Frequency (%)
(space)  105    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      18     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2830   70.6%
Common  1167   29.1%
Latin   12     0.3%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(not preserved)  612  21.6%
(not preserved)  150  5.3%
(not preserved)  136  4.8%
(not preserved)  136  4.8%
(not preserved)  135  4.8%
(not preserved)  93  3.3%
(not preserved)  82  2.9%
(not preserved)  71  2.5%
(not preserved)  63  2.2%
(not preserved)  57  2.0%
Other values (146)  1295  45.8%

Common
Value             Count  Frequency (%)
1                 251    21.5%
2                 188    16.1%
3                 124    10.6%
(space)           105    9.0%
5                 83     7.1%
4                 71     6.1%
7                 69     5.9%
8                 67     5.7%
6                 64     5.5%
9                 63     5.4%
Other values (4)  82     7.0%

Latin
Value  Count  Frequency (%)
U      4      33.3%
F      4      33.3%
O      4      33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2830   70.6%
ASCII   1179   29.4%

Most frequent character per block

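Hangul
Value  Count  Frequency (%)
(not preserved)  612  21.6%
(not preserved)  150  5.3%
(not preserved)  136  4.8%
(not preserved)  136  4.8%
(not preserved)  135  4.8%
(not preserved)  93  3.3%
(not preserved)  82  2.9%
(not preserved)  71  2.5%
(not preserved)  63  2.2%
(not preserved)  57  2.0%
Other values (146)  1295  45.8%

ASCII
Value             Count  Frequency (%)
1                 251    21.3%
2                 188    15.9%
3                 124    10.5%
(space)           105    8.9%
5                 83     7.0%
4                 71     6.0%
7                 69     5.9%
8                 67     5.7%
6                 64     5.4%
9                 63     5.3%
Other values (7)  94     8.0%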

선박종류
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
무동력선: 619
동력선: 119

Length

Max length: 4
Median length: 4
Mean length: 3.8387534
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 동력선
2nd row: 동력선
3rd row: 동력선
4th row: 동력선
5th row: 동력선

Common Values

Value  Count  Frequency (%)
무동력선  619  83.9%
동력선  119  16.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
무동력선  619  83.9%
동력선  119  16.1%

사업구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
유선: 698
도선: 40

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 유선
2nd row: 유선
3rd row: 유선
4th row: 유선
5th row: 유선

Common Values

Value  Count  Frequency (%)
유선  698  94.6%
도선  40  5.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
유선  698  94.6%
도선  40  5.4%

등록구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB
갱신: 708
폐업: 24
휴업: 5
신규: 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 갱신
2nd row: 갱신
3rd row: 갱신
4th row: 갱신
5th row: 갱신

Common Values

Value  Count  Frequency (%)
갱신  708  95.9%
폐업  24  3.3%
휴업  5  0.7%
신규  1  0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
갱신  708  95.9%
폐업  24  3.3%
휴업  5  0.7%
신규  1  0.1%

톤수
Real number (ℝ)

Distinct: 94
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.8388062
Minimum: 0.023
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.023
5th percentile: 0.075
Q1: 0.1
Median: 0.29
Q3: 0.4
95th percentile: 17.15
Maximum: 999
Range: 998.977
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 51.92345
Coefficient of variation (CV): 7.5924728
Kurtosis: 229.14867
Mean: 6.8388062
Median Absolute Deviation (MAD): 0.19
Skewness: 13.980737
Sum: 5047.039
Variance: 2696.0447
Monotonicity: Not monotonic
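
The reported summary values for 톤수 can be cross-checked against each other without the raw data: the mean should equal the sum divided by the 738 observations, the variance should equal the squared standard deviation, and the CV should equal the standard deviation divided by the mean. The sketch below performs those checks on the numbers printed in this report; the helper name `consistent` and the tolerance are illustrative assumptions.

```python
import math

# Values copied from the report for 톤수.
n = 738
total = 5047.039
mean = 6.8388062
std_dev = 51.92345
variance = 2696.0447
cv = 7.5924728

def consistent(a, b, rel_tol=1e-6):
    """True when two reported figures agree up to rounding."""
    return math.isclose(a, b, rel_tol=rel_tol)

print("mean == sum / n:", consistent(mean, total / n))
print("variance == std^2:", consistent(variance, std_dev ** 2))
print("cv == std / mean:", consistent(cv, std_dev / mean))
```

The mean (about 6.84) lies far above the median (0.29) and Q3 (0.4), which matches the strongly positive skewness listed here.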
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
0.075              176    23.8%
0.29               108    14.6%
0.3                91     12.3%
0.2                49     6.6%
0.16               44     6.0%
0.65               30     4.1%
0.4                24     3.3%
0.1                21     2.8%
0.5                20     2.7%
0.15               12     1.6%
Other values (84)  163    22.1%

Minimum 10 values
Value  Count  Frequency (%)
0.023  3      0.4%
0.075  176    23.8%
0.1    21     2.8%
0.12   4      0.5%
0.15   12     1.6%
0.16   44     6.0%
0.17   7      0.9%
0.18   2      0.3%
0.2    49     6.6%
0.25   6      0.8%

Maximum 10 values
Value  Count  Frequency (%)
999.0  1      0.1%
688.0  1      0.1%
430.0  1      0.1%
299.0  1      0.1%
277.0  1      0.1%
247.0  1      0.1%
138.0  1      0.1%
136.0  1      0.1%
135.0  1      0.1%
117.0  1      0.1%

Interactions


Correlations

          번호  관리기관  선박종류  사업구분  등록구분  톤수
번호  1.000  0.965  0.651  0.879  0.506  0.161
관리기관  0.965  1.000  0.796  0.669  0.717  0.728
선박종류  0.651  0.796  1.000  0.747  0.000  0.374
사업구분  0.879  0.669  0.747  1.000  0.000  0.254
등록구분  0.506  0.717  0.000  0.000  1.000  0.000
톤수  0.161  0.728  0.374  0.254  0.000  1.000

          등록구분  사업구분  관리기관  선박종류
등록구분  1.000  0.000  0.446  0.000
사업구분  0.000  1.000  0.562  0.537
관리기관  0.446  0.562  1.000  0.683
선박종류  0.000  0.537  0.683  1.000

          번호  톤수  관리기관  선박종류  사업구분  등록구분
번호  1.000  0.475  0.760  0.503  0.709  0.325
톤수  0.475  1.000  0.412  0.268  0.183  0.000
관리기관  0.760  0.412  1.000  0.683  0.562  0.446
선박종류  0.503  0.268  0.683  1.000  0.537  0.000
사업구분  0.709  0.183  0.562  0.537  1.000  0.000
등록구분  0.325  0.000  0.446  0.000  0.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  번호  관리기관  선박명  선박종류  사업구분  등록구분  톤수
0  1  서울특별시 한강사업본부  한강아라호  동력선  유선  갱신  277.0
1  2  서울특별시 한강사업본부  카약1호  동력선  유선  갱신  247.0
2  3  서울특별시 한강사업본부  카약2호  동력선  유선  갱신  430.0
3  4  서울특별시 한강사업본부  카약3호  동력선  유선  갱신  299.0
4  5  서울특별시 한강사업본부  파라다이스1호(유)  동력선  유선  갱신  135.0
5  6  서울특별시 한강사업본부  오리보트21  동력선  유선  갱신  688.0
6  7  서울특별시 한강사업본부  오리보트22  무동력선  유선  갱신  0.023
7  8  서울특별시 한강사업본부  오리보트28  무동력선  유선  갱신  0.023
8  9  서울특별시 한강사업본부  오리보트18  무동력선  유선  갱신  0.023
9  10  서울특별시 한강사업본부  오리보트19  동력선  유선  갱신  0.77

Last rows
Index  번호  관리기관  선박명  선박종류  사업구분  등록구분  톤수
728  729  충청북도 괴산군  산막이1호  동력선  도선  갱신  1.66
729  730  충청북도 괴산군  산막이호  동력선  도선  갱신  1.66
730  731  충청북도 괴산군  산막이3호  동력선  도선  갱신  22.0
731  732  충청북도 괴산군  산막이5호  동력선  도선  갱신  15.0
732  733  충청북도 옥천군  오대호  동력선  도선  갱신  1.5
733  734  충청북도 옥천군  막지1호  동력선  도선  갱신  1.0
734  735  충청북도 단양군  황포돛배  동력선  도선  갱신  3.63
735  736  경상북도 안동시  경북 제703호  동력선  도선  갱신  6.18
736  737  경상북도 안동시  경북 제704호  동력선  도선  갱신  4.8
737  738  경상북도 안동시  경북 제705호  동력선  도선  갱신  36.0