Overview

Dataset statistics

Number of variables: 6
Number of observations: 1940
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 93.0 KiB
Average record size in memory: 49.1 B
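
The average record size can be derived from the total size in memory and the number of observations. The following is a minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics.
total_size_kib = 93.0   # total size in memory, in KiB
n_observations = 1940   # number of observations (rows)

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported average record size.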

Variable types

Categorical: 3
Text: 2
Numeric: 1

Dataset

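Description: Passenger ship voyage information from the corporation (공단), provided with the following fields: 지사명 (branch name), 여객선명 (passenger ship name), 항해구역 (navigation area), 선종 (ship type), 여객정원 (passenger capacity), 운항항로명 (operating route name).
URL: https://www.data.go.kr/data/15042771/fileData.do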

Alerts

항해구역 is highly overall correlated with 선종 (High correlation)
선종 is highly overall correlated with 항해구역 (High correlation)

Reproduction

Analysis started: 2023-12-12 15:24:21.739585
Analysis finished: 2023-12-12 15:24:22.479576
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
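
The reported duration is the difference between the two timestamps. A small sketch of that check, using only the Python standard library and illustrative variable names:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 15:24:21.739585")
finished = datetime.fromisoformat("2023-12-12 15:24:22.479576")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```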

Variables

지사명
Categorical

Distinct: 11
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Value | Count
목포 | 646
여수 | 346
통영 | 310
완도 | 233
인천 | 148
Other values (6) | 257

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원
2nd row: 강원
3rd row: 강원
4th row: 강원
5th row: 강원

Common Values

Value | Count | Frequency (%)
목포 | 646 | 33.3%
여수 | 346 | 17.8%
통영 | 310 | 16.0%
완도 | 233 | 12.0%
인천 | 148 | 7.6%
고흥 | 64 | 3.3%
전북 | 63 | 3.2%
보령 | 55 | 2.8%
제주 | 36 | 1.9%
경북 | 30 | 1.5%
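
Each frequency in the common values table is the count divided by the 1940 observations. The sketch below recomputes the percentages from the listed counts; the dictionary and variable names are illustrative. It also shows how many rows are left for the eleventh distinct value, which the table does not list.

```python
n_observations = 1940

# Counts copied from the common values table for 지사명.
counts = {
    "목포": 646, "여수": 346, "통영": 310, "완도": 233, "인천": 148,
    "고흥": 64, "전북": 63, "보령": 55, "제주": 36, "경북": 30,
}

# Frequency (%) rounded to one decimal place, as in the report.
frequency = {value: round(100 * count / n_observations, 1)
             for value, count in counts.items()}

# Rows not covered by the ten listed values.
remaining = n_observations - sum(counts.values())

for value, pct in frequency.items():
    print(f"{value}: {pct}%")
print("rows not listed:", remaining)
```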

Length

[Figure: histogram of lengths of the category]

여객선명
Text

Distinct: 163
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 11
Median length: 10
Mean length: 5.6412371
Min length: 2

Characters and Unicode

Total characters: 10944
Distinct characters: 170
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
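
As an illustration of how such character properties can be counted, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. This is an assumption-laden stand-in: the report does not state which library it uses, `unicodedata` does not expose scripts or blocks, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lo" = Other Letter, "Nd" = Decimal Number
            counts[unicodedata.category(ch)] += 1
    return counts


# Two ship names taken from the report.
sample = ["씨스타1호", "태평양3호"]
print(category_counts(sample))
```

Hangul syllables fall under "Lo" (Other Letter) and ASCII digits under "Nd" (Decimal Number), which matches the two largest categories reported for this variable.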

Unique

Unique: 9
Unique (%): 0.5%

Sample

1st row: 씨스타1호
2nd row: 씨스타1호
3rd row: 씨스타1호
4th row: 씨스타1호
5th row: 씨스타1호

Value | Count | Frequency (%)
태평양3호 | 71 | 3.7%
태평양1호 | 71 | 3.7%
대형카훼리3호 | 71 | 3.7%
파라다이스 | 43 | 2.2%
남신안농협2호 | 42 | 2.2%
퍼스트엔젤 | 42 | 2.2%
한솔3호 | 40 | 2.1%
웨스트그린호 | 37 | 1.9%
한솔1호 | 37 | 1.9%
한솔2호 | 37 | 1.9%
Other values (154) | 1451 | 74.7%

Most occurring characters

ValueCountFrequency (%)
1268
 
11.6%
704
 
6.4%
1 334
 
3.1%
310
 
2.8%
306
 
2.8%
277
 
2.5%
3 264
 
2.4%
242
 
2.2%
232
 
2.1%
2 232
 
2.1%
Other values (160) 6775
61.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9868 | 90.2%
Decimal Number | 1052 | 9.6%
Close Punctuation | 11 | 0.1%
Open Punctuation | 11 | 0.1%
Space Separator | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1268
 
12.8%
704
 
7.1%
310
 
3.1%
306
 
3.1%
277
 
2.8%
242
 
2.5%
232
 
2.4%
215
 
2.2%
200
 
2.0%
192
 
1.9%
Other values (148) 5922
60.0%
Decimal Number
Value | Count | Frequency (%)
1 | 334 | 31.7%
3 | 264 | 25.1%
2 | 232 | 22.1%
5 | 86 | 8.2%
9 | 57 | 5.4%
7 | 45 | 4.3%
6 | 22 | 2.1%
0 | 11 | 1.0%
8 | 1 | 0.1%
Close Punctuation
Value | Count | Frequency (%)
) | 11 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 11 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9857 | 90.1%
Common | 1076 | 9.8%
Han | 11 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1268
 
12.9%
704
 
7.1%
310
 
3.1%
306
 
3.1%
277
 
2.8%
242
 
2.5%
232
 
2.4%
215
 
2.2%
200
 
2.0%
192
 
1.9%
Other values (147) 5911
60.0%
Common
ValueCountFrequency (%)
1 334
31.0%
3 264
24.5%
2 232
21.6%
5 86
 
8.0%
9 57
 
5.3%
7 45
 
4.2%
6 22
 
2.0%
) 11
 
1.0%
( 11
 
1.0%
0 11
 
1.0%
Other values (2) 3
 
0.3%
Han
ValueCountFrequency (%)
11
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9857 | 90.1%
ASCII | 1076 | 9.8%
CJK | 11 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1268
 
12.9%
704
 
7.1%
310
 
3.1%
306
 
3.1%
277
 
2.8%
242
 
2.5%
232
 
2.4%
215
 
2.2%
200
 
2.0%
192
 
1.9%
Other values (147) 5911
60.0%
ASCII
ValueCountFrequency (%)
1 334
31.0%
3 264
24.5%
2 232
21.6%
5 86
 
8.0%
9 57
 
5.3%
7 45
 
4.2%
6 22
 
2.0%
) 11
 
1.0%
( 11
 
1.0%
0 11
 
1.0%
Other values (2) 3
 
0.3%
CJK
ValueCountFrequency (%)
11
100.0%

항해구역
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Value | Count
평수구역 | 1165
먼바다 | 486
앞바다 | 289

Length

Max length: 4
Median length: 4
Mean length: 3.6005155
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 먼바다
2nd row: 먼바다
3rd row: 먼바다
4th row: 먼바다
5th row: 먼바다

Common Values

Value | Count | Frequency (%)
평수구역 | 1165 | 60.1%
먼바다 | 486 | 25.1%
앞바다 | 289 | 14.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

선종
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Value | Count
차도선 | 1281
일반선 | 271
초쾌속선 | 168
쾌속선 | 162
일반카페리선 | 45

Length

Max length: 6
Median length: 3
Mean length: 3.1762887
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 초쾌속선
2nd row: 초쾌속선
3rd row: 초쾌속선
4th row: 초쾌속선
5th row: 초쾌속선

Common Values

Value | Count | Frequency (%)
차도선 | 1281 | 66.0%
일반선 | 271 | 14.0%
초쾌속선 | 168 | 8.7%
쾌속선 | 162 | 8.4%
일반카페리선 | 45 | 2.3%
쾌속카페리선 | 13 | 0.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

여객정원
Real number (ℝ)

Distinct: 121
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 274.96753
Minimum: 31
Maximum: 1284
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.2 KiB

Quantile statistics

Minimum: 31
5th percentile: 64.9
Q1: 160
Median: 250
Q3: 344
95th percentile: 600
Maximum: 1284
Range: 1253
Interquartile range (IQR): 184

Descriptive statistics

Standard deviation: 172.11534
Coefficient of variation (CV): 0.62594789
Kurtosis: 8.1260668
Mean: 274.96753
Median Absolute Deviation (MAD): 94
Skewness: 2.126615
Sum: 533437
Variance: 29623.691
Monotonicity: Not monotonic
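
Several of the reported statistics for 여객정원 are related by simple identities (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The sketch below recomputes them from the published figures as a consistency check; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the report for 여객정원.
n = 1940                 # number of observations
total = 533437           # Sum
std = 172.11534          # Standard deviation (rounded in the report)
q1, q3 = 160, 344        # quartiles
minimum, maximum = 31, 1284

mean = total / n               # arithmetic mean
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance from the rounded standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"mean     = {mean:.5f}")
print(f"CV       = {cv:.6f}")
print(f"variance = {variance:.2f}")
print(f"IQR      = {iqr}")
print(f"range    = {value_range}")
```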
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
250 | 93 | 4.8%
77 | 71 | 3.7%
249 | 71 | 3.7%
128 | 71 | 3.7%
344 | 59 | 3.0%
275 | 55 | 2.8%
280 | 46 | 2.4%
230 | 44 | 2.3%
234 | 42 | 2.2%
200 | 42 | 2.2%
Other values (111) | 1346 | 69.4%

Minimum 10 values

Value | Count | Frequency (%)
31 | 7 | 0.4%
46 | 8 | 0.4%
50 | 35 | 1.8%
54 | 8 | 0.4%
56 | 10 | 0.5%
60 | 20 | 1.0%
61 | 5 | 0.3%
63 | 4 | 0.2%
65 | 9 | 0.5%
66 | 2 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1284 | 4 | 0.2%
1200 | 10 | 0.5%
1180 | 4 | 0.2%
948 | 2 | 0.1%
930 | 4 | 0.2%
877 | 4 | 0.2%
860 | 3 | 0.2%
818 | 2 | 0.1%
810 | 2 | 0.1%
767 | 1 | 0.1%

운항항로명
Text

Distinct: 955
Distinct (%): 49.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.3 KiB

Length

Max length: 14
Median length: 11
Mean length: 8.707732
Min length: 3

Characters and Unicode

Total characters: 16893
Distinct characters: 255
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 441
Unique (%): 22.7%

Sample

1st row: 묵호-울릉
2nd row: 속초-묵호-울릉
3rd row: 울릉-독도-울릉
4th row: 묵호-울릉(사동)
5th row: 묵호-울릉(저동)

Value | Count | Frequency (%)
목포-안좌(읍동 | 9 | 0.5%
목포-도초 | 9 | 0.5%
흑산-가거도(소흑산 | 8 | 0.4%
목포-소흑산-홍도 | 8 | 0.4%
목포-가거도(2019 | 8 | 0.4%
목포-가거도(일시 | 8 | 0.4%
목포-홍도(일시 | 8 | 0.4%
목포-대둔도-대흑산 | 8 | 0.4%
목포-도초-흑산-홍도 | 8 | 0.4%
목포-가거도(직항 | 8 | 0.4%
Other values (954) | 1888 | 95.8%

Most occurring characters

ValueCountFrequency (%)
- 3035
 
18.0%
( 1027
 
6.1%
) 1018
 
6.0%
729
 
4.3%
474
 
2.8%
452
 
2.7%
357
 
2.1%
334
 
2.0%
268
 
1.6%
255
 
1.5%
Other values (245) 8944
52.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11249 | 66.6%
Dash Punctuation | 3035 | 18.0%
Open Punctuation | 1027 | 6.1%
Close Punctuation | 1018 | 6.0%
Decimal Number | 387 | 2.3%
Other Punctuation | 101 | 0.6%
Space Separator | 30 | 0.2%
Uppercase Letter | 24 | 0.1%
Connector Punctuation | 8 | < 0.1%
Lowercase Letter | 8 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
729
 
6.5%
474
 
4.2%
452
 
4.0%
357
 
3.2%
334
 
3.0%
268
 
2.4%
255
 
2.3%
209
 
1.9%
192
 
1.7%
171
 
1.5%
Other values (223) 7808
69.4%
Decimal Number
ValueCountFrequency (%)
2 127
32.8%
1 82
21.2%
3 53
13.7%
0 28
 
7.2%
4 25
 
6.5%
9 25
 
6.5%
6 14
 
3.6%
5 14
 
3.6%
7 10
 
2.6%
8 9
 
2.3%
Other Punctuation
ValueCountFrequency (%)
, 86
85.1%
/ 8
 
7.9%
. 7
 
6.9%
Uppercase Letter
ValueCountFrequency (%)
X 21
87.5%
O 3
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 3035
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1027
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1018
100.0%
Space Separator
ValueCountFrequency (%)
30
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8
100.0%
Lowercase Letter
ValueCountFrequency (%)
x 8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11249 | 66.6%
Common | 5612 | 33.2%
Latin | 32 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
729
 
6.5%
474
 
4.2%
452
 
4.0%
357
 
3.2%
334
 
3.0%
268
 
2.4%
255
 
2.3%
209
 
1.9%
192
 
1.7%
171
 
1.5%
Other values (223) 7808
69.4%
Common
ValueCountFrequency (%)
- 3035
54.1%
( 1027
 
18.3%
) 1018
 
18.1%
2 127
 
2.3%
, 86
 
1.5%
1 82
 
1.5%
3 53
 
0.9%
30
 
0.5%
0 28
 
0.5%
4 25
 
0.4%
Other values (9) 101
 
1.8%
Latin
ValueCountFrequency (%)
X 21
65.6%
x 8
 
25.0%
O 3
 
9.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11249 | 66.6%
ASCII | 5644 | 33.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 3035
53.8%
( 1027
 
18.2%
) 1018
 
18.0%
2 127
 
2.3%
, 86
 
1.5%
1 82
 
1.5%
3 53
 
0.9%
30
 
0.5%
0 28
 
0.5%
4 25
 
0.4%
Other values (12) 133
 
2.4%
Hangul
ValueCountFrequency (%)
729
 
6.5%
474
 
4.2%
452
 
4.0%
357
 
3.2%
334
 
3.0%
268
 
2.4%
255
 
2.3%
209
 
1.9%
192
 
1.7%
171
 
1.5%
Other values (223) 7808
69.4%

Interactions

[Figure: interactions plot]

Correlations

Variable | 지사명 | 항해구역 | 선종 | 여객정원
지사명 | 1.000 | 0.628 | 0.583 | 0.579
항해구역 | 0.628 | 1.000 | 0.828 | 0.564
선종 | 0.583 | 0.828 | 1.000 | 0.743
여객정원 | 0.579 | 0.564 | 0.743 | 1.000

Variable | 지사명 | 항해구역 | 선종
지사명 | 1.000 | 0.462 | 0.347
항해구역 | 0.462 | 1.000 | 0.514
선종 | 0.347 | 0.514 | 1.000

Variable | 여객정원 | 지사명 | 항해구역 | 선종
여객정원 | 1.000 | 0.308 | 0.298 | 0.482
지사명 | 0.308 | 1.000 | 0.462 | 0.347
항해구역 | 0.298 | 0.462 | 1.000 | 0.514
선종 | 0.482 | 0.347 | 0.514 | 1.000
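
The Alerts section flags only the 항해구역 / 선종 pair as highly correlated. The sketch below scans the first correlation matrix for off-diagonal pairs at or above a cutoff. The cutoff of 0.8 is a hypothetical value chosen for illustration and is not taken from the report's configuration; the helper name `pairs_above` is also an assumption.

```python
# First correlation matrix from the report (values copied as printed).
names = ["지사명", "항해구역", "선종", "여객정원"]
matrix = [
    [1.000, 0.628, 0.583, 0.579],
    [0.628, 1.000, 0.828, 0.564],
    [0.583, 0.828, 1.000, 0.743],
    [0.579, 0.564, 0.743, 1.000],
]

THRESHOLD = 0.8  # hypothetical cutoff, not read from config.json


def pairs_above(names, matrix, threshold):
    """Return (name_a, name_b, value) for each upper-triangle entry >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


flagged = pairs_above(names, matrix, THRESHOLD)
for a, b, value in flagged:
    print(f"{a} / {b}: {value}")
```

With this assumed cutoff, the only pair returned is 항해구역 / 선종, consistent with the two alerts listed in the report.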

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

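First rows

Index | 지사명 | 여객선명 | 항해구역 | 선종 | 여객정원 | 운항항로명
0 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 묵호-울릉
1 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 속초-묵호-울릉
2 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 울릉-독도-울릉
3 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 묵호-울릉(사동)
4 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 묵호-울릉(저동)
5 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 울릉(사동)-독도
6 | 강원 | 씨스타1호 | 먼바다 | 초쾌속선 | 442 | 울릉(저동)-독도
7 | 강원 | 씨스타5호 | 먼바다 | 초쾌속선 | 438 | 강릉-울릉(저동)
8 | 강원 | 씨스타5호 | 먼바다 | 초쾌속선 | 438 | 울릉(저동)-독도
9 | 경북 | 썬라이즈호 | 먼바다 | 초쾌속선 | 442 | 울릉-독도

Last rows

Index | 지사명 | 여객선명 | 항해구역 | 선종 | 여객정원 | 운항항로명
1930 | 통영 | 한산농협카페리2 | 평수구역 | 차도선 | 194 | 통영-용초(2호 2항차)
1931 | 통영 | 한산농협카페리2 | 평수구역 | 차도선 | 194 | 통영-용초(2호 3항차)
1932 | 통영 | 한산농협카페리2 | 평수구역 | 차도선 | 194 | 통영-용초(오후/진두)
1933 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-연(순)
1934 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-욕(편)
1935 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-연-욕(편)
1936 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-연-욕-삼(순)
1937 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-욕-두-삼(순)
1938 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-욕-연-삼(순)
1939 | 통영 | 욕지영동골드고속 | 평수구역 | 차도선 | 559 | 삼-두남-두북-삼(순)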