Overview

Dataset statistics

Number of variables: 6
Number of observations: 112
Missing cells: 90
Missing cells (%): 13.4%
Duplicate rows: 1
Duplicate rows (%): 0.9%
Total size in memory: 5.4 KiB
Average record size in memory: 49.2 B
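
The percentages in this table follow from the raw counts. The short sketch below recomputes them, assuming that the missing-cell share is taken over all cells (variables times observations) and the duplicate share over all rows; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 112
missing_cells = 90
duplicate_rows = 1

# Assumption: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the denominator for duplicate rows is the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both results round to the values shown in the report (13.4% and 0.9%).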

Variable types

Categorical: 2
Text: 4

Dataset

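Description: Information on the fishing ports (harbors used for fishing) that currently exist in Namhae-gun, including the port classification (어항분류), port name (어항명), village name (마을명), and port address.
Author: 경상남도 남해군 (Namhae-gun, Gyeongsangnam-do)
URL: https://www.data.go.kr/data/15109498/fileData.do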

Alerts

데이터기준일자 has constant value "2022-10-28" (Constant)
Dataset has 1 (0.9%) duplicate row (Duplicates)
어항분류 is highly imbalanced (53.3%) (Imbalance)
지번주소 has 2 (1.8%) missing values (Missing)
도로명주소 has 88 (78.6%) missing values (Missing)
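
The 53.3% imbalance figure for 어항분류 is consistent with one minus the normalized Shannon entropy of the class counts (94, 15, 3). The sketch below works under that assumption; the function name `imbalance_score` is hypothetical and this is not necessarily the profiler's exact implementation.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    # Design choice for this sketch: a single-class column gets a score of 0.0.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

# Class counts of 어항분류 as reported in this document.
score = imbalance_score([94, 15, 3])
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes, and values near 1 mean that a single class dominates.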

Reproduction

Analysis started: 2023-12-12 09:56:45.188420
Analysis finished: 2023-12-12 09:56:45.752319
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

어항분류
Categorical

IMBALANCE

Distinct: 3
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

정주어항: 94
지방어항: 15
해양수산부: 3

Length

Max length: 5
Median length: 4
Mean length: 4.0267857
Min length: 4
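
These length statistics can be cross-checked against the category counts reported for 어항분류 (94, 15, and 3). The sketch below assumes that length is measured in characters (Unicode code points), which is what Python's `len` returns for a string.

```python
from statistics import mean, median

# Category counts as reported for 어항분류 in this document.
counts = {"정주어항": 94, "지방어항": 15, "해양수산부": 3}

# Expand the counts into one character length per row.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("min:", min(lengths))
print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
```

Only the three 해양수산부 rows have five characters, which is why the mean sits just above 4.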

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정주어항
2nd row: 정주어항
3rd row: 정주어항
4th row: 정주어항
5th row: 정주어항

Common Values

Value | Count | Frequency (%)
정주어항 | 94 | 83.9%
지방어항 | 15 | 13.4%
해양수산부 | 3 | 2.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of the common values of 어항분류]

어항명
Text

Distinct: 111
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.1607143
Min length: 3

Characters and Unicode

Total characters: 354
Distinct characters: 109
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
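
As a small illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in one port name from the dataset, 미조(북)항. The helper name `category_counts` is hypothetical. The standard library exposes the general category only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Tally the Unicode general category of every character in `text`."""
    return Counter(unicodedata.category(ch) for ch in text)

# "Lo" = Other Letter, "Ps" = Open Punctuation, "Pe" = Close Punctuation.
counts = category_counts("미조(북)항")
for category, n in counts.most_common():
    print(category, n)
```

The two-letter codes correspond to the category names used in the tables of this report, for example `Lo` for Other Letter and `Nd` for Decimal Number.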

Unique

Unique: 110
Unique (%): 98.2%
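
"Distinct" and "Unique" measure different things: the figures here are consistent with Distinct counting every different value and Unique counting only the values that occur exactly once (110 values seen once plus one value seen twice gives 111 distinct values over 112 rows). A minimal sketch of that reading, using a hypothetical helper and a toy column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Toy column: one name occurs twice, as 노구항 does in the real data.
sample = ["토촌항", "섬호항", "노구항", "노구항"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```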

Sample

1st row: 토촌항
2nd row: 섬호항
3rd row: 심천항
4th row: 화계항
5th row: 용소항

Value | Count | Frequency (%)
노구항 | 2 | 1.8%
동대항 | 1 | 0.9%
사촌항 | 1 | 0.9%
후리망골항 | 1 | 0.9%
단항항 | 1 | 0.9%
당항항 | 1 | 0.9%
곤유항 | 1 | 0.9%
동흥항 | 1 | 0.9%
왕지항 | 1 | 0.9%
수원늘항 | 1 | 0.9%
Other values (101) | 101 | 90.2%

Most occurring characters

ValueCountFrequency (%)
118
33.3%
11
 
3.1%
8
 
2.3%
8
 
2.3%
6
 
1.7%
6
 
1.7%
6
 
1.7%
5
 
1.4%
5
 
1.4%
5
 
1.4%
Other values (99) 176
49.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 348 | 98.3%
Open Punctuation | 2 | 0.6%
Decimal Number | 2 | 0.6%
Close Punctuation | 2 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
118
33.9%
11
 
3.2%
8
 
2.3%
8
 
2.3%
6
 
1.7%
6
 
1.7%
6
 
1.7%
5
 
1.4%
5
 
1.4%
5
 
1.4%
Other values (96) 170
48.9%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Decimal Number
ValueCountFrequency (%)
2 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 348 | 98.3%
Common | 6 | 1.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
118
33.9%
11
 
3.2%
8
 
2.3%
8
 
2.3%
6
 
1.7%
6
 
1.7%
6
 
1.7%
5
 
1.4%
5
 
1.4%
5
 
1.4%
Other values (96) 170
48.9%
Common
ValueCountFrequency (%)
( 2
33.3%
2 2
33.3%
) 2
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 348 | 98.3%
ASCII | 6 | 1.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
118
33.9%
11
 
3.2%
8
 
2.3%
8
 
2.3%
6
 
1.7%
6
 
1.7%
6
 
1.7%
5
 
1.4%
5
 
1.4%
5
 
1.4%
Other values (96) 170
48.9%
ASCII
ValueCountFrequency (%)
( 2
33.3%
2 2
33.3%
) 2
33.3%

마을명
Text

Distinct: 111
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.1696429
Min length: 4

Characters and Unicode

Total characters: 467
Distinct characters: 110
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 110
Unique (%): 98.2%

Sample

1st row: 토촌마을
2nd row: 섬호마을
3rd row: 심천마을
4th row: 화계마을
5th row: 용소마을

Value | Count | Frequency (%)
노구마을 | 2 | 1.8%
동대마을 | 1 | 0.9%
사촌마을 | 1 | 0.9%
후리망골마을 | 1 | 0.9%
단항마을 | 1 | 0.9%
당항마을 | 1 | 0.9%
곤유마을 | 1 | 0.9%
동흥마을 | 1 | 0.9%
왕지마을 | 1 | 0.9%
수원늘마을 | 1 | 0.9%
Other values (101) | 101 | 90.2%

Most occurring characters

ValueCountFrequency (%)
113
24.2%
112
24.0%
11
 
2.4%
9
 
1.9%
7
 
1.5%
7
 
1.5%
7
 
1.5%
7
 
1.5%
6
 
1.3%
6
 
1.3%
Other values (100) 182
39.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 462 | 98.9%
Decimal Number | 5 | 1.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
113
24.5%
112
24.2%
11
 
2.4%
9
 
1.9%
7
 
1.5%
7
 
1.5%
7
 
1.5%
7
 
1.5%
6
 
1.3%
6
 
1.3%
Other values (97) 177
38.3%
Decimal Number
ValueCountFrequency (%)
2 2
40.0%
1 2
40.0%
3 1
20.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 462 | 98.9%
Common | 5 | 1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
113
24.5%
112
24.2%
11
 
2.4%
9
 
1.9%
7
 
1.5%
7
 
1.5%
7
 
1.5%
7
 
1.5%
6
 
1.3%
6
 
1.3%
Other values (97) 177
38.3%
Common
ValueCountFrequency (%)
2 2
40.0%
1 2
40.0%
3 1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 462 | 98.9%
ASCII | 5 | 1.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
113
24.5%
112
24.2%
11
 
2.4%
9
 
1.9%
7
 
1.5%
7
 
1.5%
7
 
1.5%
7
 
1.5%
6
 
1.3%
6
 
1.3%
Other values (97) 177
38.3%
ASCII
ValueCountFrequency (%)
2 2
40.0%
1 2
40.0%
3 1
20.0%

지번주소
Text

MISSING

Distinct: 104
Distinct (%): 94.5%
Missing: 2
Missing (%): 1.8%
Memory size: 1.0 KiB

Length

Max length: 24
Median length: 23
Mean length: 21.772727
Min length: 16

Characters and Unicode

Total characters: 2395
Distinct characters: 91
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 89.1%

Sample

1st row: 경상남도 남해군 남해읍 토촌리 388-1
2nd row: 경상남도 남해군 남해읍 입현리 370-1
3rd row: 경상남도 남해군 남해읍 심천리 45-3
4th row: 경상남도 남해군 이동면 화계리 268-5
5th row: 경상남도 남해군 이동면 용소리 389-8

Value | Count | Frequency (%)
경상남도 | 110 | 20.1%
남해군 | 110 | 20.1%
창선면 | 25 | 4.6%
설천면 | 13 | 2.4%
삼동면 | 13 | 2.4%
남면 | 12 | 2.2%
미조면 | 11 | 2.0%
서면 | 9 | 1.6%
상주면 | 8 | 1.5%
고현면 | 7 | 1.3%
Other values (153) | 229 | 41.9%
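
The value table for 지번주소 counts words rather than whole addresses: 경상남도 and 남해군 each appear 110 times, once per non-missing row. The sketch below reproduces that kind of table for the five sample rows, assuming that words are obtained by splitting on whitespace; the profiler's actual tokenization may differ.

```python
from collections import Counter

# The five sample rows of 지번주소 shown in this report.
rows = [
    "경상남도 남해군 남해읍 토촌리 388-1",
    "경상남도 남해군 남해읍 입현리 370-1",
    "경상남도 남해군 남해읍 심천리 45-3",
    "경상남도 남해군 이동면 화계리 268-5",
    "경상남도 남해군 이동면 용소리 389-8",
]

# Assumption: a "word" is any whitespace-separated token.
word_counts = Counter(word for row in rows for word in row.split())
total = sum(word_counts.values())

for word, n in word_counts.most_common(4):
    print(word, n, f"{n / total:.1%}")
```

Under this reading, the frequency of a word is its share of all tokens in the column, not of all rows.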

Most occurring characters

ValueCountFrequency (%)
437
18.2%
238
 
9.9%
126
 
5.3%
114
 
4.8%
114
 
4.8%
111
 
4.6%
110
 
4.6%
110
 
4.6%
106
 
4.4%
- 97
 
4.1%
Other values (81) 832
34.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1410 | 58.9%
Decimal Number | 451 | 18.8%
Space Separator | 437 | 18.2%
Dash Punctuation | 97 | 4.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
238
16.9%
126
 
8.9%
114
 
8.1%
114
 
8.1%
111
 
7.9%
110
 
7.8%
110
 
7.8%
106
 
7.5%
27
 
1.9%
27
 
1.9%
Other values (69) 327
23.2%
Decimal Number
ValueCountFrequency (%)
1 96
21.3%
3 48
10.6%
7 44
9.8%
6 43
9.5%
2 42
9.3%
5 38
 
8.4%
0 38
 
8.4%
4 37
 
8.2%
8 36
 
8.0%
9 29
 
6.4%
Space Separator
ValueCountFrequency (%)
437
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 97
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1410 | 58.9%
Common | 985 | 41.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
238
16.9%
126
 
8.9%
114
 
8.1%
114
 
8.1%
111
 
7.9%
110
 
7.8%
110
 
7.8%
106
 
7.5%
27
 
1.9%
27
 
1.9%
Other values (69) 327
23.2%
Common
ValueCountFrequency (%)
437
44.4%
- 97
 
9.8%
1 96
 
9.7%
3 48
 
4.9%
7 44
 
4.5%
6 43
 
4.4%
2 42
 
4.3%
5 38
 
3.9%
0 38
 
3.9%
4 37
 
3.8%
Other values (2) 65
 
6.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1410 | 58.9%
ASCII | 985 | 41.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
437
44.4%
- 97
 
9.8%
1 96
 
9.7%
3 48
 
4.9%
7 44
 
4.5%
6 43
 
4.4%
2 42
 
4.3%
5 38
 
3.9%
0 38
 
3.9%
4 37
 
3.8%
Other values (2) 65
 
6.6%
Hangul
ValueCountFrequency (%)
238
16.9%
126
 
8.9%
114
 
8.1%
114
 
8.1%
111
 
7.9%
110
 
7.8%
110
 
7.8%
106
 
7.5%
27
 
1.9%
27
 
1.9%
Other values (69) 327
23.2%

도로명주소
Text

MISSING

Distinct: 22
Distinct (%): 91.7%
Missing: 88
Missing (%): 78.6%
Memory size: 1.0 KiB

Length

Max length: 29
Median length: 27.5
Mean length: 25.333333
Min length: 19

Characters and Unicode

Total characters: 608
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 83.3%
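
The percentages for 도로명주소 use two different denominators. The reported values are consistent with Distinct (%) and Unique (%) being computed over the 24 non-missing rows, while Missing (%) is computed over all 112 rows. A short check under that assumption, with a hypothetical `pct` helper:

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

n_rows, n_missing = 112, 88
n_present = n_rows - n_missing  # rows that actually hold a road-name address

distinct_pct = pct(22, n_present)
unique_pct = pct(20, n_present)
missing_pct = pct(n_missing, n_rows)

print(n_present, distinct_pct, unique_pct, missing_pct)
```

Dividing the 22 distinct values by all 112 rows would instead give about 19.6%, which does not match the report.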

Sample

1st row: 경상남도 남해군 상주면 남해대로1299번길 74-15
2nd row: 경상남도 남해군 미조면 조도길 5-6
3rd row: 경상남도 남해군 남면 남면로1103번길 43-16
4th row: 경상남도 남해군 남면 남면로1229번길 36-12
5th row: 경상남도 남해군 서면 남서대로2733번길 77

Value | Count | Frequency (%)
경상남도 | 24 | 18.3%
남해군 | 24 | 18.3%
창선면 | 6 | 4.6%
미조면 | 5 | 3.8%
동부대로 | 4 | 3.1%
서면 | 4 | 3.1%
남서대로 | 3 | 2.3%
흥선로 | 2 | 1.5%
남면 | 2 | 1.5%
삼동면 | 2 | 1.5%
Other values (51) | 55 | 42.0%

Most occurring characters

ValueCountFrequency (%)
107
17.6%
62
 
10.2%
1 27
 
4.4%
27
 
4.4%
26
 
4.3%
26
 
4.3%
25
 
4.1%
24
 
3.9%
24
 
3.9%
21
 
3.5%
Other values (33) 239
39.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 358 | 58.9%
Decimal Number | 131 | 21.5%
Space Separator | 107 | 17.6%
Dash Punctuation | 12 | 2.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
62
17.3%
27
 
7.5%
26
 
7.3%
26
 
7.3%
25
 
7.0%
24
 
6.7%
24
 
6.7%
21
 
5.9%
20
 
5.6%
18
 
5.0%
Other values (21) 85
23.7%
Decimal Number
ValueCountFrequency (%)
1 27
20.6%
2 18
13.7%
3 15
11.5%
9 13
9.9%
7 12
9.2%
5 12
9.2%
8 11
8.4%
6 10
 
7.6%
0 8
 
6.1%
4 5
 
3.8%
Space Separator
ValueCountFrequency (%)
107
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 358 | 58.9%
Common | 250 | 41.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
62
17.3%
27
 
7.5%
26
 
7.3%
26
 
7.3%
25
 
7.0%
24
 
6.7%
24
 
6.7%
21
 
5.9%
20
 
5.6%
18
 
5.0%
Other values (21) 85
23.7%
Common
ValueCountFrequency (%)
107
42.8%
1 27
 
10.8%
2 18
 
7.2%
3 15
 
6.0%
9 13
 
5.2%
7 12
 
4.8%
5 12
 
4.8%
- 12
 
4.8%
8 11
 
4.4%
6 10
 
4.0%
Other values (2) 13
 
5.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 358 | 58.9%
ASCII | 250 | 41.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
107
42.8%
1 27
 
10.8%
2 18
 
7.2%
3 15
 
6.0%
9 13
 
5.2%
7 12
 
4.8%
5 12
 
4.8%
- 12
 
4.8%
8 11
 
4.4%
6 10
 
4.0%
Other values (2) 13
 
5.2%
Hangul
ValueCountFrequency (%)
62
17.3%
27
 
7.5%
26
 
7.3%
26
 
7.3%
25
 
7.0%
24
 
6.7%
24
 
6.7%
21
 
5.9%
20
 
5.6%
18
 
5.0%
Other values (21) 85
23.7%

데이터기준일자
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

2022-10-28: 112

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-10-28
2nd row: 2022-10-28
3rd row: 2022-10-28
4th row: 2022-10-28
5th row: 2022-10-28

Common Values

Value | Count | Frequency (%)
2022-10-28 | 112 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of the common values of 데이터기준일자]

Correlations

[Figure: correlation heatmap]

Variable | 어항분류 | 도로명주소
어항분류 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
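
One common way to quantify nullity correlation is the Pearson correlation between the presence indicators (1 if a cell is filled, 0 if it is missing) of two columns. The sketch below takes that approach as an assumption; the function name and the toy columns are hypothetical and are not taken from the dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is fully present or fully missing has no variance.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: None marks a missing cell.
lot_address = ["a", "b", None, "d", "e", "f"]
road_address = [None, None, "x", None, "y", None]
r = nullity_correlation(lot_address, road_address)
print(round(r, 3))
```

A value near 1 means two columns tend to be filled or empty together, and a value near -1 means one tends to be filled where the other is empty.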

Sample

Index | 어항분류 | 어항명 | 마을명 | 지번주소 | 도로명주소 | 데이터기준일자
0 | 정주어항 | 토촌항 | 토촌마을 | 경상남도 남해군 남해읍 토촌리 388-1 | <NA> | 2022-10-28
1 | 정주어항 | 섬호항 | 섬호마을 | 경상남도 남해군 남해읍 입현리 370-1 | <NA> | 2022-10-28
2 | 정주어항 | 심천항 | 심천마을 | 경상남도 남해군 남해읍 심천리 45-3 | <NA> | 2022-10-28
3 | 정주어항 | 화계항 | 화계마을 | 경상남도 남해군 이동면 화계리 268-5 | <NA> | 2022-10-28
4 | 정주어항 | 용소항 | 용소마을 | 경상남도 남해군 이동면 용소리 389-8 | <NA> | 2022-10-28
5 | 정주어항 | 광두항 | 광두마을 | 경상남도 남해군 이동면 초음리 2-7 | <NA> | 2022-10-28
6 | 정주어항 | 초양항 | 초양마을 | 경상남도 남해군 이동면 초음리 739-5 | <NA> | 2022-10-28
7 | 정주어항 | 금평항 | 금평마을 | 경상남도 남해군 이동면 신전리 998-19 | <NA> | 2022-10-28
8 | 정주어항 | 고모항 | 고모마을 | 경상남도 남해군 이동면 초음리 80-2 | <NA> | 2022-10-28
9 | 정주어항 | 금포항 | 금포마을 | 경상남도 남해군 상주면 상주리 128-5 | <NA> | 2022-10-28

Index | 어항분류 | 어항명 | 마을명 | 지번주소 | 도로명주소 | 데이터기준일자
102 | 지방어항 | 동갈화항 | 돌갈화마을 | 경상남도 남해군 고현면 갈화리 1706-2 | 경상남도 남해군 고현면 남서대로 2979번길 67 | 2022-10-28
103 | 지방어항 | 당저항 | 당저마을 | 경상남도 남해군 창선면 당저리 산48 | 경상남도 남해군 창선면 동부대로 2084번길 72 | 2022-10-28
104 | 지방어항 | 장포항 | 장포마을 | 경상남도 남해군 창선면 진동리 136-2 | 경상남도 남해군 창선면 흥선로 1505번길 7-25 | 2022-10-28
105 | 지방어항 | 적량항 | 적량마을 | 경상남도 남해군 창선면 진동리 997-20 | 경상남도 남해군 창선면 흥선로 1190 | 2022-10-28
106 | 지방어항 | 냉천항 | 냉천마을 | 경상남도 남해군 창선면 당항리 180 | 경상남도 남해군 창선면 동부대로 2860번길 19 | 2022-10-28
107 | 지방어항 | 광천항 | 광천마을 | 경상남도 남해군 창선면 광천리 673-4 | 경상남도 남해군 창선면 서부로 591번길 39-8 | 2022-10-28
108 | 지방어항 | 대벽항 | 대벽마을 | <NA> | 경상남도 남해군 창선면 대벽리 859-4 | 2022-10-28
109 | 해양수산부 | 물건항 | 물건마을 | 경상남도 남해군 삼동면 물건리 | <NA> | 2022-10-28
110 | 해양수산부 | 미조(북)항 | 본촌마을 | 경상남도 남해군 미조면 미조리 | 경상남도 남해군 미조면 미조로 일원 | 2022-10-28
111 | 해양수산부 | 미조(남)항 | 사항마을 | 경상남도 남해군 미조면 미조리 | 경상남도 남해군 미조면 미조로 일원 | 2022-10-28

Duplicate rows

Most frequently occurring

Index | 어항분류 | 어항명 | 마을명 | 지번주소 | 도로명주소 | 데이터기준일자 | # duplicates
0 | 정주어항 | 노구항 | 노구마을 | 경상남도 남해군 미조면 송정리 124-9 | <NA> | 2022-10-28 | 2
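
A row counts as a duplicate when every field matches another row. The sketch below shows one way to find such rows with the standard library, using a hypothetical helper and three rows taken from this report, with `None` standing in for `<NA>`.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("정주어항", "토촌항", "토촌마을", "경상남도 남해군 남해읍 토촌리 388-1", None, "2022-10-28"),
    ("정주어항", "노구항", "노구마을", "경상남도 남해군 미조면 송정리 124-9", None, "2022-10-28"),
    ("정주어항", "노구항", "노구마을", "경상남도 남해군 미조면 송정리 124-9", None, "2022-10-28"),
]

dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row[1], n)
```

The occurrence count corresponds to the "# duplicates" column in the table, where the 노구항 row appears twice.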