Overview

Dataset statistics

Number of variables: 6
Number of observations: 5613
Missing cells: 59
Missing cells (%): 0.2%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 263.2 KiB
Average record size in memory: 48.0 B
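
The derived figures in this block follow from the raw counts. The sketch below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is total memory divided by the number of rows; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 6
n_observations = 5613
missing_cells = 59
duplicate_rows = 1
total_size_kib = 263.2

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes, averaged over every row.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.3f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the values reported here: 0.2% missing cells, under 0.1% duplicate rows, and 48.0 B per record.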

Variable types

Categorical: 2
DateTime: 1
Text: 3

Dataset

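Description: Information on certificates issued for ship deratting, disinfection, and related measures under the Quarantine Act (quarantine station, certificate type, issue date, ship name, applicant company, validity period)
Author: 질병관리청 (Korea Disease Control and Prevention Agency)
URL: https://www.data.go.kr/data/3074722/fileData.do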

Alerts

Duplicates: Dataset has 1 (< 0.1%) duplicate row
Imbalance: 증명서종류 is highly imbalanced (91.5%)
Missing: 유효기간 has 58 (1.0%) missing values
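
The 91.5% imbalance figure for 증명서종류 can be reproduced from the category counts listed for that variable. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies normalized by the logarithm of the number of classes; this is our reading of how the profiler derives it, not something the report states. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a perfectly balanced column,
    approaching 1 as a single class dominates.

    Assumption: H is the Shannon entropy of the class frequencies and
    k is the number of distinct classes.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Counts for 증명서종류 taken from this report.
cert_counts = [5478, 77, 44, 11, 3]
score = imbalance_score(cert_counts)
print(f"Imbalance: {score:.1%}")
```

The ratio is independent of the logarithm base, so natural logarithms are used throughout.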

Reproduction

Analysis started: 2023-12-12 07:20:19.683966
Analysis finished: 2023-12-12 07:20:20.369203
Duration: 0.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

검역소
Categorical

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Value | Count
국립부산검역소 | 1695
국립여수검역소 | 927
국립울산검역소 | 880
국립평택검역소 | 674
국립마산검역소 | 398
Other values (6) | 1039

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 국립부산검역소
2nd row: 국립마산검역소
3rd row: 국립부산검역소
4th row: 국립부산검역소
5th row: 국립부산검역소

Common Values

Value | Count | Frequency (%)
국립부산검역소 | 1695 | 30.2%
국립여수검역소 | 927 | 16.5%
국립울산검역소 | 880 | 15.7%
국립평택검역소 | 674 | 12.0%
국립마산검역소 | 398 | 7.1%
국립인천검역소 | 374 | 6.7%
국립포항검역소 | 209 | 3.7%
국립군산검역소 | 198 | 3.5%
국립동해검역소 | 141 | 2.5%
국립목포검역소 | 116 | 2.1%

Length

[Figure: Histogram of lengths of the category]

증명서종류
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Value | Count
선박위생면제증명서 | 5478
선박위생관리증명서 | 77
살균소독증명서 | 44
(운송수단)소독증명서 | 11
감염병매개체구제증명서 | 3

Length

Max length: 11
Median length: 9
Mean length: 8.9893105
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 선박위생면제증명서
2nd row: 선박위생면제증명서
3rd row: 선박위생면제증명서
4th row: 선박위생면제증명서
5th row: 선박위생면제증명서

Common Values

Value | Count | Frequency (%)
선박위생면제증명서 | 5478 | 97.6%
선박위생관리증명서 | 77 | 1.4%
살균소독증명서 | 44 | 0.8%
(운송수단)소독증명서 | 11 | 0.2%
감염병매개체구제증명서 | 3 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

발급일
DateTime

Distinct: 365
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2022-12-31 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

선박명
Text

Distinct: 4109
Distinct (%): 73.2%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Length

Max length: 29
Median length: 21
Mean length: 11.14698
Min length: 2

Characters and Unicode

Total characters: 62568
Distinct characters: 121
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
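
As a rough illustration of how such per-category counts can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code: the helper `category_counts` and the `CATEGORY_NAMES` mapping are introduced here for illustration, and the input is one ship name taken from the sample rows of this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    # Fall back to the two-letter code when no long name is listed.
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}

counts = category_counts("ORYONG NO.725")
for name, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(name, n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the Korean-language columns.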

Unique

Unique: 2814
Unique (%): 50.1%

Sample

1st row: SIMBA
2nd row: SM EAGLE
3rd row: MALIAKOS
4th row: SUNNY COSMOS
5th row: SKY AURORA

Value | Count | Frequency (%)
hyundai | 107 | 1.0%
maersk | 99 | 0.9%
star | 87 | 0.8%
msc | 86 | 0.8%
hl | 79 | 0.7%
sm | 72 | 0.7%
cgm | 72 | 0.7%
cma | 72 | 0.7%
kmtc | 71 | 0.6%
sun | 69 | 0.6%
Other values (3415) | 10142 | 92.6%

Most occurring characters

Value | Count | Frequency (%)
A | 6585 | 10.5%
(space) | 5350 | 8.6%
N | 5213 | 8.3%
E | 4956 | 7.9%
I | 4120 | 6.6%
O | 3975 | 6.4%
R | 3765 | 6.0%
S | 3627 | 5.8%
L | 2392 | 3.8%
G | 2289 | 3.7%
Other values (111) | 20296 | 32.4%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 55929 | 89.4%
Space Separator | 5350 | 8.6%
Decimal Number | 835 | 1.3%
Other Punctuation | 221 | 0.4%
Other Letter | 140 | 0.2%
Lowercase Letter | 46 | 0.1%
Dash Punctuation | 38 | 0.1%
Open Punctuation | 4 | < 0.1%
Close Punctuation | 4 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
21
 
15.0%
13
 
9.3%
9
 
6.4%
7
 
5.0%
5
 
3.6%
5
 
3.6%
4
 
2.9%
4
 
2.9%
4
 
2.9%
4
 
2.9%
Other values (49) 64
45.7%
Uppercase Letter
ValueCountFrequency (%)
A 6585
 
11.8%
N 5213
 
9.3%
E 4956
 
8.9%
I 4120
 
7.4%
O 3975
 
7.1%
R 3765
 
6.7%
S 3627
 
6.5%
L 2392
 
4.3%
G 2289
 
4.1%
T 2271
 
4.1%
Other values (16) 16736
29.9%
Lowercase Letter
ValueCountFrequency (%)
a 6
13.0%
e 5
10.9%
r 4
8.7%
l 4
8.7%
n 4
8.7%
i 4
8.7%
p 3
 
6.5%
o 3
 
6.5%
u 2
 
4.3%
m 2
 
4.3%
Other values (7) 9
19.6%
Decimal Number
ValueCountFrequency (%)
1 179
21.4%
2 113
13.5%
3 103
12.3%
0 103
12.3%
8 87
10.4%
5 81
9.7%
7 76
9.1%
6 60
 
7.2%
9 28
 
3.4%
4 5
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 215
97.3%
/ 4
 
1.8%
& 1
 
0.5%
; 1
 
0.5%
Space Separator
ValueCountFrequency (%)
5350
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 55976 | 89.5%
Common | 6452 | 10.3%
Hangul | 140 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
21
 
15.0%
13
 
9.3%
9
 
6.4%
7
 
5.0%
5
 
3.6%
5
 
3.6%
4
 
2.9%
4
 
2.9%
4
 
2.9%
4
 
2.9%
Other values (49) 64
45.7%
Latin
ValueCountFrequency (%)
A 6585
 
11.8%
N 5213
 
9.3%
E 4956
 
8.9%
I 4120
 
7.4%
O 3975
 
7.1%
R 3765
 
6.7%
S 3627
 
6.5%
L 2392
 
4.3%
G 2289
 
4.1%
T 2271
 
4.1%
Other values (34) 16783
30.0%
Common
ValueCountFrequency (%)
5350
82.9%
. 215
 
3.3%
1 179
 
2.8%
2 113
 
1.8%
3 103
 
1.6%
0 103
 
1.6%
8 87
 
1.3%
5 81
 
1.3%
7 76
 
1.2%
6 60
 
0.9%
Other values (8) 85
 
1.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 62427 | 99.8%
Hangul | 140 | 0.2%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 6585
 
10.5%
5350
 
8.6%
N 5213
 
8.4%
E 4956
 
7.9%
I 4120
 
6.6%
O 3975
 
6.4%
R 3765
 
6.0%
S 3627
 
5.8%
L 2392
 
3.8%
G 2289
 
3.7%
Other values (51) 20155
32.3%
Hangul
ValueCountFrequency (%)
21
 
15.0%
13
 
9.3%
9
 
6.4%
7
 
5.0%
5
 
3.6%
5
 
3.6%
4
 
2.9%
4
 
2.9%
4
 
2.9%
4
 
2.9%
Other values (49) 64
45.7%
Number Forms
ValueCountFrequency (%)
1
100.0%

신청업체
Text

Distinct: 489
Distinct (%): 8.7%
Missing: 1
Missing (%): < 0.1%
Memory size: 44.0 KiB

Length

Max length: 54
Median length: 46
Mean length: 8.3640413
Min length: 3

Characters and Unicode

Total characters: 46939
Distinct characters: 296
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 150
Unique (%): 2.7%

Sample

1st row: 코마텍유한
2nd row: 부영해운
3rd row: 케이앤제이오션(주)
4th row: 천경해운(주)
5th row: 천경해운(주)

Value | Count | Frequency (%)
주식회사 | 347 | 5.3%
주)코리아마린 | 231 | 3.5%
에이치디마린(주 | 117 | 1.8%
고려해운(주 | 112 | 1.7%
주)세바해운 | 111 | 1.7%
해왕해운(광양 | 91 | 1.4%
씨엠에이씨지엠코리아 | 91 | 1.4%
해양선박 | 86 | 1.3%
그레이트해운(주 | 83 | 1.3%
신진해운(주 | 81 | 1.2%
Other values (568) | 5202 | 79.4%

Most occurring characters

ValueCountFrequency (%)
4167
 
8.9%
) 4002
 
8.5%
( 3997
 
8.5%
2813
 
6.0%
2613
 
5.6%
1051
 
2.2%
1043
 
2.2%
985
 
2.1%
877
 
1.9%
858
 
1.8%
Other values (286) 24533
52.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 35708 | 76.1%
Close Punctuation | 4016 | 8.6%
Open Punctuation | 3997 | 8.5%
Uppercase Letter | 1984 | 4.2%
Space Separator | 985 | 2.1%
Other Punctuation | 209 | 0.4%
Lowercase Letter | 32 | 0.1%
Decimal Number | 5 | < 0.1%
Dash Punctuation | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4167
 
11.7%
2813
 
7.9%
2613
 
7.3%
1051
 
2.9%
1043
 
2.9%
877
 
2.5%
858
 
2.4%
839
 
2.3%
740
 
2.1%
716
 
2.0%
Other values (236) 19991
56.0%
Uppercase Letter
ValueCountFrequency (%)
N 209
10.5%
O 192
9.7%
A 186
 
9.4%
I 179
 
9.0%
C 150
 
7.6%
E 132
 
6.7%
P 130
 
6.6%
T 116
 
5.8%
L 114
 
5.7%
S 106
 
5.3%
Other values (15) 470
23.7%
Lowercase Letter
ValueCountFrequency (%)
e 6
18.8%
t 6
18.8%
o 4
12.5%
d 3
9.4%
n 3
9.4%
i 2
 
6.2%
s 2
 
6.2%
c 2
 
6.2%
v 1
 
3.1%
a 1
 
3.1%
Other values (2) 2
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 136
65.1%
, 65
31.1%
/ 4
 
1.9%
: 4
 
1.9%
Decimal Number
ValueCountFrequency (%)
6 2
40.0%
5 1
20.0%
3 1
20.0%
1 1
20.0%
Close Punctuation
ValueCountFrequency (%)
) 4002
99.7%
14
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 3997
100.0%
Space Separator
ValueCountFrequency (%)
985
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 35708 | 76.1%
Common | 9215 | 19.6%
Latin | 2016 | 4.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4167
 
11.7%
2813
 
7.9%
2613
 
7.3%
1051
 
2.9%
1043
 
2.9%
877
 
2.5%
858
 
2.4%
839
 
2.3%
740
 
2.1%
716
 
2.0%
Other values (236) 19991
56.0%
Latin
ValueCountFrequency (%)
N 209
10.4%
O 192
 
9.5%
A 186
 
9.2%
I 179
 
8.9%
C 150
 
7.4%
E 132
 
6.5%
P 130
 
6.4%
T 116
 
5.8%
L 114
 
5.7%
S 106
 
5.3%
Other values (27) 502
24.9%
Common
ValueCountFrequency (%)
) 4002
43.4%
( 3997
43.4%
985
 
10.7%
. 136
 
1.5%
, 65
 
0.7%
14
 
0.2%
/ 4
 
< 0.1%
: 4
 
< 0.1%
- 3
 
< 0.1%
6 2
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 35708 | 76.1%
ASCII | 11217 | 23.9%
None | 14 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4167
 
11.7%
2813
 
7.9%
2613
 
7.3%
1051
 
2.9%
1043
 
2.9%
877
 
2.5%
858
 
2.4%
839
 
2.3%
740
 
2.1%
716
 
2.0%
Other values (236) 19991
56.0%
ASCII
ValueCountFrequency (%)
) 4002
35.7%
( 3997
35.6%
985
 
8.8%
N 209
 
1.9%
O 192
 
1.7%
A 186
 
1.7%
I 179
 
1.6%
C 150
 
1.3%
. 136
 
1.2%
E 132
 
1.2%
Other values (39) 1049
 
9.4%
None
ValueCountFrequency (%)
14
100.0%

유효기간
Text

MISSING 

Distinct: 380
Distinct (%): 6.8%
Missing: 58
Missing (%): 1.0%
Memory size: 44.0 KiB

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Characters and Unicode

Total characters: 127765
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 0.3%

Sample

1st row: 2022-01-01 - 2022-06-30
2nd row: 2022-01-01 - 2022-06-30
3rd row: 2022-01-01 - 2022-06-30
4th row: 2022-01-01 - 2022-06-30
5th row: 2022-01-02 - 2022-07-01
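
Because 유효기간 is stored as text, the start and end dates have to be split out before any date arithmetic. The sketch below assumes every non-missing value follows the fixed 23-character `YYYY-MM-DD - YYYY-MM-DD` pattern seen in the sample rows; the helper `parse_validity` is hypothetical and not part of the dataset or the profiler.

```python
from datetime import date

def parse_validity(value):
    """Split 'YYYY-MM-DD - YYYY-MM-DD' into (start, end) date objects.

    Assumption: the two ISO dates are separated by ' - ' (space, hyphen,
    space), as in the sample rows of this report.
    """
    start_text, end_text = value.split(" - ")
    return date.fromisoformat(start_text), date.fromisoformat(end_text)

sample = "2022-01-01 - 2022-06-30"
start, end = parse_validity(sample)
days = (end - start).days

print(start, end, days)
```

The fixed format also explains why the minimum, median, mean, and maximum lengths are all 23, and why the total character count equals 23 times the 5555 non-missing values.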

Value | Count | Frequency (%)
(blank) | 5555 | 33.3%
2022-09-01 | 57 | 0.3%
2022-10-18 | 56 | 0.3%
2022-09-30 | 54 | 0.3%
2022-11-16 | 51 | 0.3%
2022-09-29 | 50 | 0.3%
2022-10-28 | 50 | 0.3%
2022-09-21 | 50 | 0.3%
2022-12-29 | 46 | 0.3%
2022-08-06 | 46 | 0.3%
Other values (539) | 10650 | 63.9%

Most occurring characters

ValueCountFrequency (%)
2 37181
29.1%
- 27775
21.7%
0 24590
19.2%
11110
 
8.7%
1 9694
 
7.6%
3 5245
 
4.1%
7 2081
 
1.6%
5 2064
 
1.6%
8 2058
 
1.6%
4 2053
 
1.6%
Other values (2) 3914
 
3.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 88880 | 69.6%
Dash Punctuation | 27775 | 21.7%
Space Separator | 11110 | 8.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 37181
41.8%
0 24590
27.7%
1 9694
 
10.9%
3 5245
 
5.9%
7 2081
 
2.3%
5 2064
 
2.3%
8 2058
 
2.3%
4 2053
 
2.3%
6 1975
 
2.2%
9 1939
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
- 27775
100.0%
Space Separator
ValueCountFrequency (%)
11110
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 127765
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 37181
29.1%
- 27775
21.7%
0 24590
19.2%
11110
 
8.7%
1 9694
 
7.6%
3 5245
 
4.1%
7 2081
 
1.6%
5 2064
 
1.6%
8 2058
 
1.6%
4 2053
 
1.6%
Other values (2) 3914
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 127765
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 37181
29.1%
- 27775
21.7%
0 24590
19.2%
11110
 
8.7%
1 9694
 
7.6%
3 5245
 
4.1%
7 2081
 
1.6%
5 2064
 
1.6%
8 2058
 
1.6%
4 2053
 
1.6%
Other values (2) 3914
 
3.1%

Correlations

 | 검역소 | 증명서종류
검역소 | 1.000 | 0.191
증명서종류 | 0.191 | 1.000

 | 증명서종류 | 검역소
증명서종류 | 1.000 | 0.106
검역소 | 0.106 | 1.000

 | 검역소 | 증명서종류
검역소 | 1.000 | 0.106
증명서종류 | 0.106 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
[Figure: Correlation heatmap of nullity, showing how strongly the presence or absence of one variable affects the presence of another.]

Sample

First rows

Row | 검역소 | 증명서종류 | 발급일 | 선박명 | 신청업체 | 유효기간
0 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-01 | SIMBA | 코마텍유한 | 2022-01-01 - 2022-06-30
1 | 국립마산검역소 | 선박위생면제증명서 | 2022-01-01 | SM EAGLE | 부영해운 | 2022-01-01 - 2022-06-30
2 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-01 | MALIAKOS | 케이앤제이오션(주) | 2022-01-01 - 2022-06-30
3 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-01 | SUNNY COSMOS | 천경해운(주) | 2022-01-01 - 2022-06-30
4 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-02 | SKY AURORA | 천경해운(주) | 2022-01-02 - 2022-07-01
5 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-02 | MSC ELAINE | (주)코리아마린서비스 | 2022-01-02 - 2022-07-01
6 | 국립평택검역소 | 선박위생면제증명서 | 2022-01-02 | AMBERJACK LNG | 유니푸로스해운 | 2022-01-02 - 2022-07-01
7 | 국립평택검역소 | 선박위생면제증명서 | 2022-01-02 | GAILLARDIA SW | 코리아해운(주) | 2022-01-02 - 2022-07-01
8 | 국립여수검역소 | 선박위생면제증명서 | 2022-01-02 | GLORIOUS SUNSHINE | 주식회사 뉴글로벌해운 | 2022-01-02 - 2022-07-01
9 | 국립부산검역소 | 선박위생면제증명서 | 2022-01-03 | KMTC ULSAN | 고려해운(주) | 2022-01-03 - 2022-07-02

Last rows

Row | 검역소 | 증명서종류 | 발급일 | 선박명 | 신청업체 | 유효기간
5603 | 국립포항검역소 | 선박위생면제증명서 | 2022-12-30 | EVER WISDOM | (주)신양선박 | 2022-12-30 - 2023-06-29
5604 | 국립여수검역소 | 선박위생면제증명서 | 2022-12-31 | BAYOU SUN | 여진해운 | 2022-12-31 - 2023-06-30
5605 | 국립여수검역소 | 선박위생면제증명서 | 2022-12-31 | NAVIG8 GUARD | (주)세바해운 | 2022-12-31 - 2023-06-30
5606 | 국립여수검역소 | 선박위생면제증명서 | 2022-12-31 | FRONT VEGA | (주)신성해운 | 2022-12-31 - 2023-06-30
5607 | 국립여수검역소 | 선박위생면제증명서 | 2022-12-31 | TALARA | 태원해운 | 2022-12-31 - 2023-06-30
5608 | 국립울산검역소 | 선박위생면제증명서 | 2022-12-31 | SEARANGER | 주식회사 네오글로벌쉬핑에이전시 | 2022-12-31 - 2023-06-30
5609 | 국립평택검역소 | 선박위생면제증명서 | 2022-12-31 | SHENG SHI 569 | 웨스턴해운 | 2022-12-31 - 2023-06-30
5610 | 국립평택검역소 | 선박위생면제증명서 | 2022-12-31 | MORNING PROSPERITY | 유라해운 | 2022-12-31 - 2023-06-30
5611 | 국립부산검역소 | 선박위생면제증명서 | 2022-12-31 | ORYONG NO.725 | 사조산업(주) | 2022-12-31 - 2023-06-30
5612 | 국립부산검역소 | 선박위생면제증명서 | 2022-12-31 | MAERSK LONDRINA | (주)코리아마린 | 2022-12-31 - 2023-06-30

Duplicate rows

Most frequently occurring

Row | 검역소 | 증명서종류 | 발급일 | 선박명 | 신청업체 | 유효기간 | # duplicates
0 | 국립목포검역소 | 선박위생면제증명서 | 2022-07-26 | RIO DE JANEIRO EXPRESS | 현대삼호중공업(주) | 2022-07-26 - 2023-01-25 | 2
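
A duplicate row is one whose values match another row in every column. The sketch below shows one way to count them with the standard library, assuming each row is represented as a tuple of its six column values. The three-row list is a tiny illustrative subset built from rows shown in this report, not the full dataset.

```python
from collections import Counter

# Illustrative subset: the duplicated record appears twice, plus one
# unrelated row taken from the sample rows of this report.
duplicated_record = (
    "국립목포검역소", "선박위생면제증명서", "2022-07-26",
    "RIO DE JANEIRO EXPRESS", "현대삼호중공업(주)", "2022-07-26 - 2023-01-25",
)
rows = [
    duplicated_record,
    duplicated_record,
    ("국립부산검역소", "선박위생면제증명서", "2022-01-01",
     "SIMBA", "코마텍유한", "2022-01-01 - 2022-06-30"),
]

# Count how often each full row occurs, then keep rows seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

# Every occurrence after the first counts as one duplicate row.
n_duplicate_rows = sum(n - 1 for n in duplicated.values())

print(n_duplicate_rows, list(duplicated.values()))
```

With this counting, a record that occurs twice yields an occurrence count of 2 and a single duplicate row, which matches the "# duplicates" value of 2 and the "Duplicate rows: 1" figure in the overview.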