Overview

Dataset statistics

Number of variables: 7
Number of observations: 3377
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 460
Duplicate rows (%): 13.6%
Total size in memory: 184.8 KiB
Average record size in memory: 56.0 B
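
The two derived figures above can be checked against the raw counts. The sketch below assumes that the duplicate percentage is duplicate rows over observations and that the average record size is total memory over observations; the report does not state either derivation, and the variable names are illustrative.

```python
# Raw figures copied from the Dataset statistics table.
n_observations = 3377
n_duplicate_rows = 460
total_size_kib = 184.8

# Assumed derivation: share of rows flagged as duplicates.
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

# Assumed derivation: total bytes divided by row count (1 KiB = 1024 B).
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(duplicate_pct, avg_record_bytes)  # 13.6 56.0
```

Both values agree with the reported 13.6% and 56.0 B.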

Variable types

Categorical: 3
Text: 2
DateTime: 2

Dataset

Description: Status of fishery product safety inspection results (수산물 안전성 검사결과 현황)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=6GXT8558R3S0M75HG3BO1427215&infSeq=1

Alerts

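Dataset has 460 (13.6%) duplicate rows [Duplicates]
시군명 (city/county name) is highly overall correlated with 수산물구분명 (fishery product category) [High correlation]
수산물구분명 (fishery product category) is highly overall correlated with 시군명 (city/county name) [High correlation]
검사결과적합여부 (inspection result) is highly imbalanced (96.4%) [Imbalance]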
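
The report does not say how the 96.4% imbalance figure is computed. One reading that reproduces it is one minus the normalised Shannon entropy of the class counts. The sketch below takes that formula as an assumption and uses the counts 3364 and 13 listed for 검사결과적합여부 later in this report; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Entropy in bits; zero counts contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Maximum entropy for n_classes equally likely classes is log2(n_classes).
    return 1.0 - entropy / math.log2(n_classes)

score = imbalance_score([3364, 13])
print(f"{score:.1%}")  # 96.4%
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one class approaches 1.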

Reproduction

Analysis started: 2024-04-20 18:13:01.217909
Analysis finished: 2024-04-20 18:13:02.874748
Duration: 1.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

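시군명 (city/county name)
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Value                   Count
화성시 (Hwaseong-si)      898
안산시 (Ansan-si)         636
파주시 (Paju-si)          335
안성시 (Anseong-si)       263
김포시 (Gimpo-si)         230
Other values (15)       1015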

Length

Max length: 4
Median length: 3
Mean length: 3.007403
Min length: 3

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 가평군 (Gapyeong-gun)
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

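Value                     Count   Frequency (%)
화성시 (Hwaseong-si)        898     26.6%
안산시 (Ansan-si)           636     18.8%
파주시 (Paju-si)            335     9.9%
안성시 (Anseong-si)         263     7.8%
김포시 (Gimpo-si)           230     6.8%
평택시 (Pyeongtaek-si)      225     6.7%
양평군 (Yangpyeong-gun)     192     5.7%
여주시 (Yeoju-si)           173     5.1%
가평군 (Gapyeong-gun)       131     3.9%
포천시 (Pocheon-si)         96      2.8%
Other values (10)         198     5.9%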

Length

[Figure: histogram of lengths of the category]

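수산물구분명 (fishery product category)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Value                               Count
양식 (aquaculture)                   1793
어획 (capture)                       979
양식(종자) (aquaculture, seed)        303
하천조사 (river survey)               302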

Length

Max length: 6
Median length: 2
Mean length: 2.5377554
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 양식 (aquaculture)
2nd row: 양식
3rd row: 양식
4th row: 양식
5th row: 양식

Common Values

Value                               Count   Frequency (%)
양식 (aquaculture)                   1793    53.1%
어획 (capture)                       979     29.0%
양식(종자) (aquaculture, seed)        303     9.0%
하천조사 (river survey)               302     8.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed under Common Values]

품명 (product name)
Text

Distinct: 137
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Length

Max length: 7
Median length: 6
Mean length: 2.7690258
Min length: 1

Characters and Unicode

Total characters: 9351
Distinct characters: 154
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
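
The reported mean length is consistent with total characters divided by the number of observations. The sketch below checks that relation. It assumes every one of the 3377 rows contributes to the character total, which fits the zero missing count but is not stated in the report.

```python
n_observations = 3377     # Number of observations (Dataset statistics)
total_characters = 9351   # Total characters reported for this variable

# Mean string length as characters per row.
mean_length = total_characters / n_observations

print(round(mean_length, 3))  # 2.769, in line with the reported 2.7690258
```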

Unique

Unique: 39
Unique (%): 1.2%

Sample

1st row: 송어 (trout)
2nd row: 송어
3rd row: 송어
4th row: 송어
5th row: 송어

Value                             Count   Frequency (%)
뱀장어 (eel)                        496     14.7%
철갑상어 (sturgeon)                  243     7.2%
(value not preserved)             231     6.8%
송어 (trout)                       230     6.8%
메기 (catfish)                     149     4.4%
민꽃게 (Asian paddle crab)          105     3.1%
꽃게 (blue crab)                   105     3.1%
동자개 (Korean bullhead)            98      2.9%
흰다리새우 (whiteleg shrimp)         93      2.8%
틸라피아 (tilapia)                  91      2.7%
Other values (127)                1536    45.5%

Most occurring characters

Value                       Count   Frequency (%)
(character not preserved)   1210    12.9%
(character not preserved)   543     5.8%
(character not preserved)   496     5.3%
(character not preserved)   269     2.9%
(character not preserved)   268     2.9%
(character not preserved)   260     2.8%
(character not preserved)   251     2.7%
(character not preserved)   245     2.6%
(character not preserved)   243     2.6%
(character not preserved)   230     2.5%
Other values (144)          5336    57.1%

Most occurring categories

Value          Count   Frequency (%)
Other Letter   9351    100.0%

Most frequent character per category

Other Letter: identical to the table under Most occurring characters.

Most occurring scripts

Value    Count   Frequency (%)
Hangul   9351    100.0%

Most frequent character per script

Hangul: identical to the table under Most occurring characters.

Most occurring blocks

Value    Count   Frequency (%)
Hangul   9351    100.0%

Most frequent character per block

Hangul: identical to the table under Most occurring characters.

검사접수일자 (inspection receipt date)
DateTime

Distinct: 653
Distinct (%): 19.3%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB
Minimum: 2017-01-02 00:00:00
Maximum: 2029-09-19 00:00:00

[Figure: histogram with fixed size bins (bins=50)]
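
The maximum reported above (2029-09-19) falls after the analysis start time given under Reproduction (2024-04-20), so at least one date in this column may be mistyped. The sketch below shows one way to flag such values; `is_future_dated` is a hypothetical helper, and using the analysis start as the reference point is an assumption.

```python
from datetime import datetime

# Timestamps copied from this report.
analysis_started = datetime.fromisoformat("2024-04-20 18:13:01.217909")
reported_maximum = datetime.fromisoformat("2029-09-19 00:00:00")

def is_future_dated(value, reference):
    """Return True when a recorded date falls after the reference time."""
    return value > reference

print(is_future_dated(reported_maximum, analysis_started))  # True
```

Applied row by row, the same comparison would list the records that need review.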

검사완료일자 (inspection completion date)
DateTime

Distinct: 445
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB
Minimum: 2017-01-06 00:00:00
Maximum: 2024-03-28 00:00:00

[Figure: histogram with fixed size bins (bins=50)]

검사항목내역 (inspection items)
Text

Distinct: 62
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Length

Max length: 22
Median length: 18
Mean length: 12.424637
Min length: 3

Characters and Unicode

Total characters: 41958
Distinct characters: 57
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.2%

Sample

1st row: 동물용의약품, 중금속 등 50종 (veterinary drugs, heavy metals, etc., 50 items)
2nd row: 동물용의약품, 중금속 등 50종
3rd row: 동물용의약품, 중금속 등 50종
4th row: 동물용의약품, 중금속 등 50종
5th row: 동물용의약품, 중금속 등 50종

Value                               Count   Frequency (%)
중금속 (heavy metals)                 2928    28.3%
(value not preserved)               1845    17.8%
동물용의약품 (veterinary drugs)        1367    13.2%
방사능 (radioactivity)                1254    12.1%
말라카이트그린 (malachite green)       449     4.3%
(value not preserved)               325     3.1%
3종 (3 items)                        245     2.4%
47종 (47 items)                      244     2.4%
44종 (44 items)                      162     1.6%
6종 (6 items)                        128     1.2%
Other values (38)                   1411    13.6%

Most occurring characters

Value                       Count   Frequency (%)
(space)                     6981    16.6%
(character not preserved)   2978    7.1%
(character not preserved)   2928    7.0%
(character not preserved)   2928    7.0%
,                           2603    6.2%
(character not preserved)   1855    4.4%
(character not preserved)   1845    4.4%
(character not preserved)   1419    3.4%
(character not preserved)   1371    3.3%
(character not preserved)   1369    3.3%
Other values (47)           15681   37.4%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        29053   69.2%
Space Separator     6981    16.6%
Decimal Number      3219    7.7%
Other Punctuation   2603    6.2%
Uppercase Letter    102     0.2%

Most frequent character per category

Other Letter
Value                       Count   Frequency (%)
(character not preserved)   2978    10.3%
(character not preserved)   2928    10.1%
(character not preserved)   2928    10.1%
(character not preserved)   1855    6.4%
(character not preserved)   1845    6.4%
(character not preserved)   1419    4.9%
(character not preserved)   1371    4.7%
(character not preserved)   1369    4.7%
(character not preserved)   1369    4.7%
(character not preserved)   1369    4.7%
Other values (33)           9622    33.1%

Decimal Number
Value   Count   Frequency (%)
4       1152    35.8%
3       556     17.3%
5       332     10.3%
7       276     8.6%
6       244     7.6%
9       211     6.6%
8       149     4.6%
0       145     4.5%
1       95      3.0%
2       59      1.8%

Uppercase Letter
Value   Count   Frequency (%)
G       51      50.0%
M       51      50.0%

Space Separator
Value     Count   Frequency (%)
(space)   6981    100.0%

Other Punctuation
Value   Count   Frequency (%)
,       2603    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   29053   69.2%
Common   12803   30.5%
Latin    102     0.2%

Most frequent character per script

Hangul: identical to the Other Letter table above.

Common
Value              Count   Frequency (%)
(space)            6981    54.5%
,                  2603    20.3%
4                  1152    9.0%
3                  556     4.3%
5                  332     2.6%
7                  276     2.2%
6                  244     1.9%
9                  211     1.6%
8                  149     1.2%
0                  145     1.1%
Other values (2)   154     1.2%

Latin
Value   Count   Frequency (%)
G       51      50.0%
M       51      50.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   29053   69.2%
ASCII    12905   30.8%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
(space)            6981    54.1%
,                  2603    20.2%
4                  1152    8.9%
3                  556     4.3%
5                  332     2.6%
7                  276     2.1%
6                  244     1.9%
9                  211     1.6%
8                  149     1.2%
0                  145     1.1%
Other values (4)   256     2.0%

Hangul: identical to the Other Letter table above.

검사결과적합여부 (inspection result, pass or fail)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 KiB

Value             Count
적합 (pass)        3364
부적합 (fail)      13

Length

Max length: 3
Median length: 2
Mean length: 2.0038496
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합 (pass)
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value             Count   Frequency (%)
적합 (pass)        3364    99.6%
부적합 (fail)      13      0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed under Common Values]

Correlations

[Figure: correlation heatmap]

                   시군명    수산물구분명   검사항목내역   검사결과적합여부
시군명              1.000    0.802        0.780        0.081
수산물구분명         0.802    1.000        0.978        0.058
검사항목내역         0.780    0.978        1.000        0.169
검사결과적합여부      0.081    0.058        0.169        1.000

[Figure: correlation heatmap]

                   수산물구분명   시군명    검사결과적합여부
수산물구분명         1.000        0.506    0.038
시군명              0.506        1.000    0.064
검사결과적합여부      0.038        0.064    1.000

[Figure: correlation heatmap]

                   시군명    수산물구분명   검사결과적합여부
시군명              1.000    0.506        0.064
수산물구분명         0.506    1.000        0.038
검사결과적합여부      0.064    0.038        1.000
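
The extracted text does not name the association measure behind each matrix. Cramér's V is one common measure for a pair of categorical columns, and the sketch below computes the plain (not bias-corrected) version from a contingency table. The counts are hypothetical and invented for illustration; they are not taken from this dataset, and `cramers_v` is an illustrative helper.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    # Normalise by sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical counts: one fully associated table, one independent table.
perfect = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]
print(cramers_v(perfect), cramers_v(independent))  # 1.0 0.0
```

Values near 1 indicate that one column almost determines the other, and values near 0 indicate little association.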

Missing values

[Figure: count of non-missing values per column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Columns: 시군명 (city/county name), 수산물구분명 (fishery product category), 품명 (product name), 검사접수일자 (inspection receipt date), 검사완료일자 (inspection completion date), 검사항목내역 (inspection items), 검사결과적합여부 (inspection result).

First rows

      시군명   수산물구분명   품명   검사접수일자   검사완료일자   검사항목내역   검사결과적합여부
0     가평군   양식   송어   2024-03-04   2024-03-14   동물용의약품, 중금속 등 50종   적합
1     가평군   양식   송어   2024-03-04   2024-03-14   동물용의약품, 중금속 등 50종   적합
2     가평군   양식   송어   2024-02-15   2024-02-27   동물용의약품, 중금속 등 50종   적합
3     가평군   양식   송어   2024-02-15   2024-02-27   동물용의약품, 중금속 등 50종   적합
4     가평군   양식   송어   2024-01-03   2024-01-16   동물용의약품, 중금속 등 50종   적합
5     가평군   양식   송어   2024-01-03   2024-01-16   동물용의약품, 중금속 등 50종   적합
6     가평군   양식   송어   2023-12-01   2023-12-14   동물용의약품, 중금속 등 49종   적합
7     가평군   양식   송어   2023-12-01   2023-12-14   동물용의약품, 중금속 등 49종   적합
8     가평군   양식   송어   2023-09-04   2023-09-21   동물용의약품, 중금속 등 51종   적합
9     가평군   양식   송어   2023-09-04   2023-09-21   동물용의약품, 중금속 등 51종   적합

Last rows

      시군명   수산물구분명   품명   검사접수일자   검사완료일자   검사항목내역   검사결과적합여부
3367  화성시   양식   (not preserved)   2017-01-13   2017-01-23   방사능 및 중금속   적합
3368  화성시   어획   가오리   2017-01-13   2017-01-23   방사능 및 중금속   적합
3369  화성시   양식   (not preserved)   2017-01-13   2017-01-23   방사능 및 중금속   적합
3370  화성시   어획   조피볼락   2017-01-13   2017-01-23   방사능 및 중금속   적합
3371  화성시   어획   가자미   2017-01-13   2017-01-23   방사능 및 중금속   적합
3372  화성시   어획   넙치   2017-01-13   2017-01-23   방사능 및 중금속   적합
3373  화성시   어획   삼세기   2017-01-13   2017-01-23   방사능 및 중금속   적합
3374  화성시   어획   점농어   2017-01-13   2017-01-23   방사능 및 중금속   적합
3375  화성시   양식   송어   2017-01-09   2017-01-17   중금속, 동물용의약품 등 35종   적합
3376  화성시   양식   철갑상어   2017-01-09   2017-01-17   중금속, 동물용의약품 등 35종   적합

Duplicate rows

Most frequently occurring

      시군명   수산물구분명   품명   검사접수일자   검사완료일자   검사항목내역   검사결과적합여부   # duplicates
424   화성시   양식   (not preserved)   2023-03-28   2023-04-07   중금속, 방사능   적합   12
429   화성시   양식   물김   2024-02-14   2024-02-19   중금속, 방사능   적합   12
421   화성시   양식   (not preserved)   2023-01-19   2023-02-02   중금속, 방사능   적합   10
401   화성시   양식   (not preserved)   2019-02-12   2019-02-21   중금속 및 방사능   적합   9
422   화성시   양식   (not preserved)   2023-02-10   2023-02-24   중금속, 방사능   적합   9
423   화성시   양식   (not preserved)   2023-03-22   2023-04-04   중금속, 방사능   적합   9
359   평택시   양식   메기   2017-11-02   2017-11-17   MG 및 퀴놀론계 동물용의약품   적합   8
365   평택시   양식   메기   2019-08-21   2019-09-05   중금속, 동물용의약품 등 42종   적합   8
88    안산시   양식   (not preserved)   2018-01-05   2018-01-12   중금속 및 방사능   적합   7
92    안산시   양식   (not preserved)   2020-02-13   2020-02-20   카드뮴, 방사능   적합   7