Overview

Dataset statistics

Number of variables: 6
Number of observations: 3254
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 162.2 KiB
Average record size in memory: 51.0 B
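
As a quick sanity check, the average record size should equal the total in-memory size divided by the number of observations. The sketch below assumes that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics.
total_size_kib = 162.2
n_observations = 3254

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```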

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

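Description: Information on passenger ferry usage by route, published by the corporation (공단). The following fields are provided: branch name (지사명), route name (항로명), year-month (연_월), total (합계), general passengers (일반), and island residents (도서민).
URL: https://www.data.go.kr/data/15117984/fileData.do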

Alerts

High correlation: 합계 is highly overall correlated with 일반 and 1 other field
High correlation: 일반 is highly overall correlated with 합계 and 1 other field
High correlation: 도서민 is highly overall correlated with 합계 and 1 other field
Zeros: 합계 has 159 (4.9%) zeros
Zeros: 일반 has 160 (4.9%) zeros
Zeros: 도서민 has 349 (10.7%) zeros
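
The zero percentages in these alerts follow directly from the zero counts and the 3254 observations. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Zero counts per numeric column, as listed in the alerts.
n_observations = 3254
zero_counts = {"합계": 159, "일반": 160, "도서민": 349}

# Share of zero-valued rows in percent, rounded to one decimal place.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}
print(zero_share)
```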

Reproduction

Analysis started: 2023-12-12 17:06:58.803995
Analysis finished: 2023-12-12 17:07:00.509234
Duration: 1.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
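
A report like this one can be regenerated along the following lines. This is only a sketch under assumptions: the CSV path, the `cp949` encoding (common for Korean public data files, but not confirmed for this one), the report title, and the helper name `build_report` are all hypothetical. Only `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file` come from the libraries themselves.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to output_path.

    The default encoding is an assumption; adjust it to match the file.
    """
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Passenger ferry usage by route")
    report.to_file(output_path)
    return report
```

For example, `build_report("ferry_usage.csv")` would write `report.html` next to the script, where `ferry_usage.csv` stands in for the file downloaded from the URL in the Dataset section.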

Variables

지사명
Categorical

Distinct: 12
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Value  Count
목포지사  777
인천지사  426
통영지사  408
완도지사  390
여수지사  258
Other values (7)  995

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산지사
2nd row: 부산지사
3rd row: 부산지사
4th row: 부산지사
5th row: 부산지사

Common Values

Value  Count  Frequency (%)
목포지사  777  23.9%
인천지사  426  13.1%
통영지사  408  12.5%
완도지사  390  12.0%
여수지사  258  7.9%
보령지사  210  6.5%
포항지사  196  6.0%
동해지사  151  4.6%
제주지사  150  4.6%
고흥지사  139  4.3%
Other values (2)  149  4.6%

Length

[Figure: Histogram of lengths of the category]

항로명
Text

Distinct: 126
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Length

Max length: 11
Median length: 5
Mean length: 5.8202213
Min length: 5

Characters and Unicode

Total characters: 18939
Distinct characters: 146
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
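
To illustrate how such character properties are read, the sketch below classifies the characters of a single route name, 울릉(사동)-독도, using Python's standard `unicodedata` module. The mapping from two-letter general category codes to long names covers only the six categories reported for this variable. This is an illustration and not necessarily how the profiling tool computes its counts.

```python
import unicodedata
from collections import Counter

# Long names for the general category codes that occur in this variable.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}

route = "울릉(사동)-독도"

# Count characters by Unicode general category.
category_counts = Counter(
    CATEGORY_NAMES[unicodedata.category(ch)] for ch in route
)
print(dict(category_counts))
```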

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 부산-제주
2nd row: 부산-제주
3rd row: 부산-제주
4th row: 부산-제주
5th row: 부산-제주

Value  Count  Frequency (%)
울릉(사동)-독도  50  1.5%
울릉(도동)-독도  45  1.4%
하리-서검  32  1.0%
통영-삼천포  31  1.0%
제주-우수영  30  0.9%
당목-일정  30  0.9%
울릉(저동)-독도  30  0.9%
완도-덕우  30  0.9%
완도-여서  30  0.9%
완도-모도  30  0.9%
Other values (117)  2919  89.6%

Most occurring characters

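Value  Count  Frequency (%)
-  3284  17.3%
[character missing]  1274  6.7%
[character missing]  655  3.5%
[character missing]  604  3.2%
[character missing]  547  2.9%
(  434  2.3%
)  434  2.3%
[character missing]  408  2.2%
[character missing]  374  2.0%
[character missing]  374  2.0%
Other values (136)  10551  55.7%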

Most occurring categories

Value  Count  Frequency (%)
Other Letter  14781  78.0%
Dash Punctuation  3284  17.3%
Open Punctuation  434  2.3%
Close Punctuation  434  2.3%
Space Separator  3  < 0.1%
Decimal Number  3  < 0.1%

Most frequent character per category

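Other Letter
Value  Count  Frequency (%)
[character missing]  1274  8.6%
[character missing]  655  4.4%
[character missing]  604  4.1%
[character missing]  547  3.7%
[character missing]  408  2.8%
[character missing]  374  2.5%
[character missing]  374  2.5%
[character missing]  340  2.3%
[character missing]  281  1.9%
[character missing]  261  1.8%
Other values (131)  9663  65.4%

Dash Punctuation
Value  Count  Frequency (%)
-  3284  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  434  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  434  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  3  100.0%

Decimal Number
Value  Count  Frequency (%)
1  3  100.0%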

Most occurring scripts

Value  Count  Frequency (%)
Hangul  14781  78.0%
Common  4158  22.0%

Most frequent character per script

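Hangul
Value  Count  Frequency (%)
[character missing]  1274  8.6%
[character missing]  655  4.4%
[character missing]  604  4.1%
[character missing]  547  3.7%
[character missing]  408  2.8%
[character missing]  374  2.5%
[character missing]  374  2.5%
[character missing]  340  2.3%
[character missing]  281  1.9%
[character missing]  261  1.8%
Other values (131)  9663  65.4%

Common
Value  Count  Frequency (%)
-  3284  79.0%
(  434  10.4%
)  434  10.4%
(space)  3  0.1%
1  3  0.1%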

Most occurring blocks

Value  Count  Frequency (%)
Hangul  14781  78.0%
ASCII  4158  22.0%

Most frequent character per block

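ASCII
Value  Count  Frequency (%)
-  3284  79.0%
(  434  10.4%
)  434  10.4%
(space)  3  0.1%
1  3  0.1%

Hangul
Value  Count  Frequency (%)
[character missing]  1274  8.6%
[character missing]  655  4.4%
[character missing]  604  4.1%
[character missing]  547  3.7%
[character missing]  408  2.8%
[character missing]  374  2.5%
[character missing]  374  2.5%
[character missing]  340  2.3%
[character missing]  281  1.9%
[character missing]  261  1.8%
Other values (131)  9663  65.4%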

연-월
Categorical

Distinct: 30
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 25.6 KiB

Value  Count
2023-05  112
2021-11  111
2021-12  111
2021-10  111
2021-08  110
Other values (25)  2699

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-01
2nd row: 2021-02
3rd row: 2021-03
4th row: 2021-04
5th row: 2021-05

Common Values

Value  Count  Frequency (%)
2023-05  112  3.4%
2021-11  111  3.4%
2021-12  111  3.4%
2021-10  111  3.4%
2021-08  110  3.4%
2021-09  110  3.4%
2022-01  110  3.4%
2021-07  110  3.4%
2021-03  110  3.4%
2022-04  109  3.3%
Other values (20)  2150  66.1%

Length

[Figure: Histogram of lengths of the category]

합계
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2807
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9948.6899
Minimum: 0
Maximum: 200042
Zeros: 159
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 28.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 38.3
Q1: 1386.5
Median: 5157
Q3: 12180
95-th percentile: 37332.9
Maximum: 200042
Range: 200042
Interquartile range (IQR): 10793.5

Descriptive statistics

Standard deviation: 13817.847
Coefficient of variation (CV): 1.3889113
Kurtosis: 26.896662
Mean: 9948.6899
Median Absolute Deviation (MAD): 4214
Skewness: 3.6654946
Sum: 32373037
Variance: 1.9093291 × 10^8
Monotonicity: Not monotonic
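
Several of the statistics reported for 합계 are related by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. The sketch below recomputes them from the reported figures. Small differences are expected because the inputs are rounded, and the variable names are illustrative.

```python
import math

# Figures copied from the report for 합계.
n_observations = 3254
total = 32373037
mean = 9948.6899
std = 13817.847
q1, q3 = 1386.5, 12180

derived = {
    "mean": total / n_observations,  # Sum / number of observations
    "cv": std / mean,                # coefficient of variation
    "variance": std ** 2,            # squared standard deviation
    "iqr": q3 - q1,                  # interquartile range
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```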
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  159  4.9%
243  3  0.1%
1197  3  0.1%
302  3  0.1%
1163  3  0.1%
641  3  0.1%
281  3  0.1%
1345  3  0.1%
3686  3  0.1%
849  3  0.1%
Other values (2797)  3068  94.3%

Minimum 10 values

Value  Count  Frequency (%)
0  159  4.9%
7  1  < 0.1%
21  1  < 0.1%
32  1  < 0.1%
37  1  < 0.1%
39  1  < 0.1%
60  1  < 0.1%
66  1  < 0.1%
93  1  < 0.1%
97  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
200042  1  < 0.1%
165165  1  < 0.1%
160182  1  < 0.1%
121809  1  < 0.1%
117633  1  < 0.1%
116978  1  < 0.1%
112912  1  < 0.1%
94608  1  < 0.1%
92283  1  < 0.1%
78044  1  < 0.1%

일반
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2694
Distinct (%): 82.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7596.9644
Minimum: 0
Maximum: 197878
Zeros: 160
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 28.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 14.95
Q1: 709
Median: 3163.5
Q3: 8909
95-th percentile: 29543.75
Maximum: 197878
Range: 197878
Interquartile range (IQR): 8200

Descriptive statistics

Standard deviation: 12358.899
Coefficient of variation (CV): 1.6268208
Kurtosis: 42.121802
Mean: 7596.9644
Median Absolute Deviation (MAD): 2761
Skewness: 4.7196704
Sum: 24720522
Variance: 1.5274239 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  160  4.9%
365  7  0.2%
417  6  0.2%
455  5  0.2%
901  5  0.2%
354  4  0.1%
328  4  0.1%
603  4  0.1%
308  4  0.1%
529  4  0.1%
Other values (2684)  3051  93.8%

Minimum 10 values

Value  Count  Frequency (%)
0  160  4.9%
9  2  0.1%
13  1  < 0.1%
16  2  0.1%
17  4  0.1%
18  3  0.1%
21  1  < 0.1%
24  2  0.1%
25  1  < 0.1%
26  2  0.1%

Maximum 10 values

Value  Count  Frequency (%)
197878  1  < 0.1%
162635  1  < 0.1%
158053  1  < 0.1%
119249  1  < 0.1%
116219  1  < 0.1%
110269  1  < 0.1%
108141  1  < 0.1%
94608  1  < 0.1%
90421  1  < 0.1%
77056  1  < 0.1%

도서민
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2082
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2351.7256
Minimum: 0
Maximum: 26674
Zeros: 349
Zeros (%): 10.7%
Negative: 0
Negative (%): 0.0%
Memory size: 28.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 317
Median: 938
Q3: 3303.25
95-th percentile: 8978
Maximum: 26674
Range: 26674
Interquartile range (IQR): 2986.25

Descriptive statistics

Standard deviation: 3431.6021
Coefficient of variation (CV): 1.4591848
Kurtosis: 11.1729
Mean: 2351.7256
Median Absolute Deviation (MAD): 903.5
Skewness: 2.9109961
Sum: 7652515
Variance: 11775893
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  349  10.7%
2  10  0.3%
4  9  0.3%
11  8  0.2%
417  7  0.2%
8  7  0.2%
382  6  0.2%
317  6  0.2%
1  6  0.2%
47  6  0.2%
Other values (2072)  2840  87.3%

Minimum 10 values

Value  Count  Frequency (%)
0  349  10.7%
1  6  0.2%
2  10  0.3%
3  5  0.2%
4  9  0.3%
5  4  0.1%
6  6  0.2%
7  5  0.2%
8  7  0.2%
10  5  0.2%

Maximum 10 values

Value  Count  Frequency (%)
26674  1  < 0.1%
25627  1  < 0.1%
25579  1  < 0.1%
25081  1  < 0.1%
24572  1  < 0.1%
24091  1  < 0.1%
22671  1  < 0.1%
22654  1  < 0.1%
22436  1  < 0.1%
22118  1  < 0.1%

Interactions

[Figures: interaction plots (9 panels)]

Correlations

Variable  지사명  연-월  합계  일반  도서민
지사명  1.000  0.000  0.319  0.339  0.404
연-월  0.000  1.000  0.153  0.170  0.000
합계  0.319  0.153  1.000  0.975  0.445
일반  0.339  0.170  0.975  1.000  0.316
도서민  0.404  0.000  0.445  0.316  1.000

Variable  연-월  지사명
연-월  1.000  0.000
지사명  0.000  1.000

Variable  합계  일반  도서민  지사명  연-월
합계  1.000  0.976  0.653  0.141  0.057
일반  0.976  1.000  0.519  0.149  0.055
도서민  0.653  0.519  1.000  0.182  0.000
지사명  0.141  0.149  0.182  1.000  0.000
연-월  0.057  0.055  0.000  0.000  1.000
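
The near-perfect correlation between 합계 and 일반 is what one would expect if 합계 (total) is the sum of 일반 (general passengers) and 도서민 (island residents), with general passengers dominating. The column sums reported in this document are consistent with that reading, as the sketch below checks. It is an inference from the reported figures, not something the report states.

```python
# Column sums copied from the descriptive statistics of each variable.
column_sums = {"합계": 32373037, "일반": 24720522, "도서민": 7652515}

# Does total equal general + islander at the level of column sums?
sums_match = column_sums["합계"] == column_sums["일반"] + column_sums["도서민"]

# Share of general passengers in the total, in percent.
general_share = round(100 * column_sums["일반"] / column_sums["합계"], 1)

print(sums_match, general_share)
```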

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index  지사명  항로명  연-월  합계  일반  도서민
0  부산지사  부산-제주  2021-01  1502  1502  0
1  부산지사  부산-제주  2021-02  0  0  0
2  부산지사  부산-제주  2021-03  1545  1545  0
3  부산지사  부산-제주  2021-04  1749  1749  0
4  부산지사  부산-제주  2021-05  2326  2326  0
5  부산지사  부산-제주  2021-06  2475  2475  0
6  부산지사  부산-제주  2021-07  3145  3145  0
7  부산지사  부산-제주  2021-08  3141  3141  0
8  부산지사  부산-제주  2021-09  2112  2112  0
9  부산지사  부산-제주  2021-10  2677  2677  0

Last rows

Index  지사명  항로명  연-월  합계  일반  도서민
3244  고흥지사  녹동-제주  2022-09  14160  14160  0
3245  고흥지사  녹동-제주  2022-10  18535  18535  0
3246  고흥지사  녹동-제주  2022-11  15518  15518  0
3247  고흥지사  녹동-제주  2022-12  11256  11256  0
3248  고흥지사  녹동-제주  2023-01  13247  13247  0
3249  고흥지사  녹동-제주  2023-02  11974  11974  0
3250  고흥지사  녹동-제주  2023-03  11422  11422  0
3251  고흥지사  녹동-제주  2023-04  12315  12315  0
3252  고흥지사  녹동-제주  2023-05  14457  14457  0
3253  고흥지사  녹동-제주  2023-06  4917  4917  0