Overview

Dataset statistics

Number of variables: 7
Number of observations: 29
Missing cells: 56
Missing cells (%): 27.6%
Duplicate rows: 1
Duplicate rows (%): 3.4%
Total size in memory: 1.7 KiB
Average record size in memory: 60.4 B
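
The two percentages can be checked from the counts. The following is a minimal sketch of that arithmetic, assuming (the report does not state this) that missing cells are divided by variables × observations and duplicate rows by the number of observations. The variable names are illustrative.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 7
n_observations = 29
missing_cells = 56
duplicate_rows = 1

# Assumption: the denominator for missing cells is the full grid of cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions the rounded values agree with the 27.6% and 3.4% reported.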

Variable types

Unsupported: 2
Text: 3
Categorical: 2

Dataset

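Description: 해상조난사고 상세데이터(해양경찰청) [Detailed maritime distress accident data (Korea Coast Guard)]
Author: 행정안전부 [Ministry of the Interior and Safety]
URL: https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30161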

Alerts

Dataset has 1 (3.4%) duplicate rows [Duplicates]
Unnamed: 6 is highly overall correlated with Unnamed: 3 [High correlation]
Unnamed: 3 is highly overall correlated with Unnamed: 6 [High correlation]
Unnamed: 1 has 10 (34.5%) missing values [Missing]
Unnamed: 2 has 9 (31.0%) missing values [Missing]
Unnamed: 4 has 10 (34.5%) missing values [Missing]
Unnamed: 5 has 27 (93.1%) missing values [Missing]
테이블정의서 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-04-21 08:19:33.446671
Analysis finished: 2024-04-21 08:19:34.780122
Duration: 1.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
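
The reported duration is the difference between the two timestamps. A small sketch using only the Python standard library shows the calculation; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-21 08:19:33.446671")
finished = datetime.fromisoformat("2024-04-21 08:19:34.780122")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```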

Variables

테이블정의서
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 360.0 B

Unnamed: 1
Text

MISSING 

Distinct: 19
Distinct (%): 100.0%
Missing: 10
Missing (%): 34.5%
Memory size: 360.0 B

Length

Max length: 10
Median length: 9
Mean length: 6.4210526
Min length: 1

Characters and Unicode

Total characters: 122
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
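
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample values listed for this variable, so its totals cover those five values rather than the whole column. The codes `Lu`, `Pc`, and `Lo` correspond to the report's labels Uppercase Letter, Connector Punctuation, and Other Letter. This is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# The five sample values shown for Unnamed: 1 (not the full column).
samples = ["컬럼ID", "OBJT_ID", "OCCU_YEAR", "OCCU_MT", "OCCU_DE"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```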

Unique

Unique: 19
Unique (%): 100.0%

Sample

1st row: 컬럼ID
2nd row: OBJT_ID
3rd row: OCCU_YEAR
4th row: OCCU_MT
5th row: OCCU_DE

Value | Count | Frequency (%)
컬럼id | 1 | 5.3%
occu_ca_de | 1 | 5.3%
y | 1 | 5.3%
x | 1 | 5.3%
occu_ty_cd | 1 | 5.3%
lon | 1 | 5.3%
lat | 1 | 5.3%
acdn_lc_cn | 1 | 5.3%
wether_sn | 1 | 5.3%
occu_ty | 1 | 5.3%
Other values (9) | 9 | 47.4%

Most occurring characters

Value | Count | Frequency (%)
C | 25 | 20.5%
_ | 16 | 13.1%
O | 12 | 9.8%
U | 9 | 7.4%
T | 8 | 6.6%
D | 7 | 5.7%
E | 6 | 4.9%
N | 5 | 4.1%
L | 5 | 4.1%
A | 5 | 4.1%
Other values (17) | 24 | 19.7%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 100 | 82.0%
Connector Punctuation | 16 | 13.1%
Other Letter | 6 | 4.9%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 25 | 25.0%
O | 12 | 12.0%
U | 9 | 9.0%
T | 8 | 8.0%
D | 7 | 7.0%
E | 6 | 6.0%
N | 5 | 5.0%
L | 5 | 5.0%
A | 5 | 5.0%
Y | 4 | 4.0%
Other values (10) | 14 | 14.0%

Other Letter
Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Connector Punctuation
Value | Count | Frequency (%)
_ | 16 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 100 | 82.0%
Common | 16 | 13.1%
Hangul | 6 | 4.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
C | 25 | 25.0%
O | 12 | 12.0%
U | 9 | 9.0%
T | 8 | 8.0%
D | 7 | 7.0%
E | 6 | 6.0%
N | 5 | 5.0%
L | 5 | 5.0%
A | 5 | 5.0%
Y | 4 | 4.0%
Other values (10) | 14 | 14.0%

Hangul
Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Common
Value | Count | Frequency (%)
_ | 16 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 116 | 95.1%
Hangul | 6 | 4.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 25 | 21.6%
_ | 16 | 13.8%
O | 12 | 10.3%
U | 9 | 7.8%
T | 8 | 6.9%
D | 7 | 6.0%
E | 6 | 5.2%
N | 5 | 4.3%
L | 5 | 4.3%
A | 5 | 4.3%
Other values (11) | 18 | 15.5%

Hangul
Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Unnamed: 2
Text

MISSING 

Distinct: 20
Distinct (%): 100.0%
Missing: 9
Missing (%): 31.0%
Memory size: 360.0 B

Length

Max length: 22
Median length: 12.5
Mean length: 7.25
Min length: 2

Characters and Unicode

Total characters: 145
Distinct characters: 68
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: A2SM_OceanAcdntSttus
2nd row: 생활안전지도 사고안전 해양사고발생현황정보
3rd row: 컬럼명
4th row: 일련번호
5th row: 발생년도(YYYY)

Value | Count | Frequency (%)
생활안전지도 | 1 | 4.5%
사고안전 | 1 | 4.5%
y좌표 | 1 | 4.5%
x좌표 | 1 | 4.5%
발생유형코드 | 1 | 4.5%
경도 | 1 | 4.5%
위도 | 1 | 4.5%
사고위치내용 | 1 | 4.5%
기상특보 | 1 | 4.5%
발생원인상세 | 1 | 4.5%
Other values (12) | 12 | 54.5%

Most occurring characters

Value | Count | Frequency (%)
 | 11 | 7.6%
 | 10 | 6.9%
Y | 9 | 6.2%
M | 7 | 4.8%
) | 5 | 3.4%
( | 5 | 3.4%
 | 4 | 2.8%
D | 4 | 2.8%
t | 3 | 2.1%
 | 3 | 2.1%
Other values (58) | 84 | 57.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 91 | 62.8%
Uppercase Letter | 28 | 19.3%
Lowercase Letter | 12 | 8.3%
Close Punctuation | 5 | 3.4%
Open Punctuation | 5 | 3.4%
Space Separator | 2 | 1.4%
Decimal Number | 1 | 0.7%
Connector Punctuation | 1 | 0.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 11 | 12.1%
 | 10 | 11.0%
 | 4 | 4.4%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 2 | 2.2%
 | 2 | 2.2%
 | 2 | 2.2%
Other values (37) | 48 | 52.7%

Uppercase Letter
Value | Count | Frequency (%)
Y | 9 | 32.1%
M | 7 | 25.0%
D | 4 | 14.3%
H | 2 | 7.1%
A | 2 | 7.1%
S | 2 | 7.1%
X | 1 | 3.6%
O | 1 | 3.6%

Lowercase Letter
Value | Count | Frequency (%)
t | 3 | 25.0%
c | 2 | 16.7%
n | 2 | 16.7%
s | 1 | 8.3%
u | 1 | 8.3%
d | 1 | 8.3%
e | 1 | 8.3%
a | 1 | 8.3%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 91 | 62.8%
Latin | 40 | 27.6%
Common | 14 | 9.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 11 | 12.1%
 | 10 | 11.0%
 | 4 | 4.4%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 2 | 2.2%
 | 2 | 2.2%
 | 2 | 2.2%
Other values (37) | 48 | 52.7%

Latin
Value | Count | Frequency (%)
Y | 9 | 22.5%
M | 7 | 17.5%
D | 4 | 10.0%
t | 3 | 7.5%
H | 2 | 5.0%
A | 2 | 5.0%
c | 2 | 5.0%
n | 2 | 5.0%
S | 2 | 5.0%
s | 1 | 2.5%
Other values (6) | 6 | 15.0%

Common
Value | Count | Frequency (%)
) | 5 | 35.7%
( | 5 | 35.7%
(space) | 2 | 14.3%
2 | 1 | 7.1%
_ | 1 | 7.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 91 | 62.8%
ASCII | 54 | 37.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 11 | 12.1%
 | 10 | 11.0%
 | 4 | 4.4%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 3 | 3.3%
 | 2 | 2.2%
 | 2 | 2.2%
 | 2 | 2.2%
Other values (37) | 48 | 52.7%

ASCII
Value | Count | Frequency (%)
Y | 9 | 16.7%
M | 7 | 13.0%
) | 5 | 9.3%
( | 5 | 9.3%
D | 4 | 7.4%
t | 3 | 5.6%
H | 2 | 3.7%
A | 2 | 3.7%
c | 2 | 3.7%
n | 2 | 3.7%
Other values (11) | 13 | 24.1%

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 360.0 B
VARCHAR2 | 12
<NA> | 10
NUMBER | 5
테이블명 | 1
데이터 타입 | 1

Length

Max length: 8
Median length: 6
Mean length: 6.0689655
Min length: 4

Unique

Unique: 2
Unique (%): 6.9%

Sample

1st row: 테이블명
2nd row: <NA>
3rd row: 데이터 타입
4th row: NUMBER
5th row: VARCHAR2

Common Values

Value | Count | Frequency (%)
VARCHAR2 | 12 | 41.4%
<NA> | 10 | 34.5%
NUMBER | 5 | 17.2%
테이블명 | 1 | 3.4%
데이터 타입 | 1 | 3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
varchar2 | 12 | 40.0%
na | 10 | 33.3%
number | 5 | 16.7%
테이블명 | 1 | 3.3%
데이터 | 1 | 3.3%
타입 | 1 | 3.3%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 34.5%
Memory size: 360.0 B

Unnamed: 5
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 27
Missing (%): 93.1%
Memory size: 360.0 B

Length

Max length: 3
Median length: 2.5
Mean length: 2.5
Min length: 2

Characters and Unicode

Total characters: 5
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Key
2nd row: PK

Value | Count | Frequency (%)
key | 1 | 50.0%
pk | 1 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
K | 2 | 40.0%
e | 1 | 20.0%
y | 1 | 20.0%
P | 1 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 3 | 60.0%
Lowercase Letter | 2 | 40.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
K | 2 | 66.7%
P | 1 | 33.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 50.0%
y | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
K | 2 | 40.0%
e | 1 | 20.0%
y | 1 | 20.0%
P | 1 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
K | 2 | 40.0%
e | 1 | 20.0%
y | 1 | 20.0%
P | 1 | 20.0%

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 360.0 B
<NA> | 16
NOT NULL | 12
NULL여부 | 1

Length

Max length: 8
Median length: 4
Mean length: 5.7241379
Min length: 4

Unique

Unique: 1
Unique (%): 3.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: NULL여부
4th row: NOT NULL
5th row: NOT NULL

Common Values

Value | Count | Frequency (%)
<NA> | 16 | 55.2%
NOT NULL | 12 | 41.4%
NULL여부 | 1 | 3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 16 | 39.0%
not | 12 | 29.3%
null | 12 | 29.3%
null여부 | 1 | 2.4%

Correlations

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
Unnamed: 5 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
Unnamed: 6 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
 | Unnamed: 6 | Unnamed: 3
Unnamed: 6 | 1.000 | 0.953
Unnamed: 3 | 0.953 | 1.000
 | Unnamed: 3 | Unnamed: 6
Unnamed: 3 | 1.000 | 0.953
Unnamed: 6 | 0.953 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
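
To make the idea of nullity correlation concrete, here is a minimal sketch that treats it as the Pearson correlation between per-column missingness indicators. That definition is an assumption; the report does not state how its heatmap is computed. The indicators are transcribed from the 20 rows listed in the Sample section (rows 0 to 9 and 19 to 28) for Unnamed: 1 and Unnamed: 2, so the result describes those rows only and need not match the heatmap.

```python
from math import sqrt

# 1 = value missing, 0 = value present, for the 20 rows listed in the
# Sample section (rows 0-9 and 19-28). The other nine rows are not listed.
unnamed_1_missing = [1, 1] + [0] * 8 + [0] + [1] * 8 + [0]
unnamed_2_missing = [0, 0] + [0] * 8 + [0] + [1] * 8 + [1]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


nullity_corr = pearson(unnamed_1_missing, unnamed_2_missing)
print(f"nullity correlation (listed rows only): {nullity_corr:.3f}")
```

A value near 1 means the two columns tend to be missing on the same rows; a value near 0 means their missingness is unrelated.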

Sample

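 | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | 테이블ID | <NA> | A2SM_OceanAcdntSttus | 테이블명 | 해양사고 발생현황 | <NA> | <NA>
1 | 테이블설명 | <NA> | 생활안전지도 사고안전 해양사고발생현황정보 | <NA> | NaN | <NA> | <NA>
2 | No. | 컬럼ID | 컬럼명 | 데이터 타입 | 길이 | Key | NULL여부
3 | 1 | OBJT_ID | 일련번호 | NUMBER | 10 | PK | NOT NULL
4 | 2 | OCCU_YEAR | 발생년도(YYYY) | VARCHAR2 | 4 | <NA> | NOT NULL
5 | 3 | OCCU_MT | 발생월(MM) | VARCHAR2 | 2 | <NA> | NOT NULL
6 | 4 | OCCU_DE | 발생일(DD) | VARCHAR2 | 2 | <NA> | NOT NULL
7 | 5 | OCCU_TM | 발생시간(HHMM) | VARCHAR2 | 4 | <NA> | NOT NULL
8 | 6 | OCCU_DATE | 발생년월일(YYYYMMDD) | VARCHAR2 | 8 | <NA> | NOT NULL
9 | 7 | POLC_NM | 관할해양경찰서 | VARCHAR2 | 20 | <NA> | <NA>

 | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
19 | 17 | Y | Y좌표 | NUMBER | 15,6 | <NA> | NOT NULL
20 | 18 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
21 | 19 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
22 | 20 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
23 | 21 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
24 | 22 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
25 | 23 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
26 | 24 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
27 | 25 | <NA> | <NA> | <NA> | NaN | <NA> | <NA>
28 | 기타 | 내용없음 | <NA> | <NA> | NaN | <NA> | <NA>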
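
The sample suggests why most variables are named `Unnamed: N`: the file appears to be a table definition sheet whose real column header (No., 컬럼ID, 컬럼명, ...) sits in the row with index 2, below two title rows. The sketch below, using only the standard library and the first five sample rows, shows one way to promote that row to a header before profiling. The names `rows`, `header`, and `records` are illustrative, and locating the header by the "No." marker is an assumption about this particular sheet.

```python
# First five rows of the sample, transcribed by hand; None stands for a missing cell.
rows = [
    ["테이블ID", None, "A2SM_OceanAcdntSttus", "테이블명", "해양사고 발생현황", None, None],
    ["테이블설명", None, "생활안전지도 사고안전 해양사고발생현황정보", None, None, None, None],
    ["No.", "컬럼ID", "컬럼명", "데이터 타입", "길이", "Key", "NULL여부"],
    ["1", "OBJT_ID", "일련번호", "NUMBER", "10", "PK", "NOT NULL"],
    ["2", "OCCU_YEAR", "발생년도(YYYY)", "VARCHAR2", "4", None, "NOT NULL"],
]

# Assumption: the header row is the first row whose first cell is "No.".
header_index = next(i for i, row in enumerate(rows) if row[0] == "No.")
header = rows[header_index]

# Every row after the header becomes a record keyed by the real column names.
records = [dict(zip(header, row)) for row in rows[header_index + 1:]]

for record in records:
    print(record["컬럼ID"], record["데이터 타입"], record["NULL여부"])
```

Profiling the records produced this way would describe the column definitions themselves rather than the layout of the sheet.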

Duplicate rows

Most frequently occurring

 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 8
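
The table lists a single duplicated pattern that occurs 8 times, while the overview reports 1 duplicate row. One plausible reading, offered as an assumption rather than a statement about the tool's internals, is that the overview counts distinct duplicated patterns. The sketch below reproduces that reading with `collections.Counter` on the last ten sample rows, restricted to the five columns shown in the duplicates table.

```python
from collections import Counter

# Last ten sample rows (indices 19-28), restricted to the columns
# Unnamed: 1, 2, 3, 5 and 6; None stands for a missing cell.
all_missing = (None, None, None, None, None)
rows = (
    [("Y", "Y좌표", "NUMBER", None, "NOT NULL")]
    + [all_missing] * 8
    + [("내용없음", None, None, None, None)]
)

# Keep only the patterns that occur more than once.
pattern_counts = Counter(rows)
duplicated = {pattern: n for pattern, n in pattern_counts.items() if n > 1}

print(f"distinct duplicated patterns: {len(duplicated)}")
for pattern, n in duplicated.items():
    print(pattern, n)
```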