Overview

Dataset statistics

Number of variables: 14
Number of observations: 126
Missing cells: 892
Missing cells (%): 50.6%
Duplicate rows: 1
Duplicate rows (%): 0.8%
Total size in memory: 15.0 KiB
Average record size in memory: 122.0 B
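The percentages in this table follow from the raw counts. The short check below is a sketch, not part of the report: it recomputes them from the reported figures and, assuming the per-column missing counts listed under Alerts, shows how 892 missing cells can be accounted for. The variable names are illustrative.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 126
missing_cells = 892
duplicate_rows = 1
total_size_bytes = 15.0 * 1024  # 15.0 KiB, already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_bytes / n_observations

# Assumption taken from the Alerts section: seven fully empty "Unnamed"
# columns plus 5 missing values each in 발생건수 and 발생률(%).
missing_from_alerts = 7 * n_observations + 5 + 5

print(f"total cells:        {total_cells}")
print(f"missing cells (%):  {missing_pct:.1f}")
print(f"duplicate rows (%): {duplicate_pct:.1f}")
print(f"avg record size:    {avg_record_bytes:.1f} B (approximate)")
print(f"missing per alerts: {missing_from_alerts}")
```

The average record size comes out slightly below the reported 122.0 B because the total size is only given to one decimal place.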

Variable types

Categorical: 5
Numeric: 2
Unsupported: 7

Dataset

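Description: Monthly survey results on aquatic organism disease outbreaks in Gyeongsangnam-do. (Provides data on the affected species, disease, number of cases, and incidence rate for marine organisms, fish, and crustaceans.)
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3076265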

Alerts

Dataset has 1 (0.8%) duplicate rows [Duplicates]
품종 is highly overall correlated with 발생건수 and 3 other fields [High correlation]
발생질병 is highly overall correlated with 연도 [High correlation]
비 고 is highly overall correlated with 발생건수 and 3 other fields [High correlation]
연도 is highly overall correlated with 발생건수 and 5 other fields [High correlation]
The column with a blank name (month values such as 1월) is highly overall correlated with 연도 [High correlation]
발생건수 is highly overall correlated with 발생률(%) and 3 other fields [High correlation]
발생률(%) is highly overall correlated with 발생건수 and 3 other fields [High correlation]
연도 is highly imbalanced (75.9%) [Imbalance]
발생건수 has 5 (4.0%) missing values [Missing]
발생률(%) has 5 (4.0%) missing values [Missing]
Unnamed: 7 has 126 (100.0%) missing values [Missing]
Unnamed: 8 has 126 (100.0%) missing values [Missing]
Unnamed: 9 has 126 (100.0%) missing values [Missing]
Unnamed: 10 has 126 (100.0%) missing values [Missing]
Unnamed: 11 has 126 (100.0%) missing values [Missing]
Unnamed: 12 has 126 (100.0%) missing values [Missing]
Unnamed: 13 has 126 (100.0%) missing values [Missing]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
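The alerts point at two cleanup steps: dropping the seven fully empty "Unnamed" columns and dropping rows in which every field is missing. The following is a minimal standard-library sketch of that cleanup. The function name `drop_empty`, the set of missing markers, and the sample rows are assumptions made for illustration; they are not taken from the dataset's loading code.

```python
def drop_empty(rows, missing=(None, "", "<NA>")):
    """Drop rows and columns whose values are all missing.

    `rows` is a list of dicts that share the same keys. The `missing`
    markers are an assumption for this sketch (float NaN is not handled).
    """
    def is_missing(value):
        return value in missing

    # Remove rows where every cell is missing (e.g. trailing blank lines).
    kept_rows = [r for r in rows if not all(is_missing(v) for v in r.values())]
    # Keep only columns that hold at least one real value in the kept rows.
    columns = list(rows[0].keys()) if rows else []
    kept_cols = [c for c in columns
                 if any(not is_missing(r[c]) for r in kept_rows)]
    return [{c: r[c] for c in kept_cols} for r in kept_rows]


# Hypothetical rows shaped like the dataset, not actual records.
rows = [
    {"연도": "2018년", "품종": "넙치", "발생건수": 1, "Unnamed: 7": None},
    {"연도": "2018년", "품종": "참돔", "발생건수": None, "Unnamed: 7": None},
    {"연도": None, "품종": None, "발생건수": None, "Unnamed: 7": None},
]
cleaned = drop_empty(rows)
print(cleaned)
```

With pandas, the same effect is usually obtained with `DataFrame.dropna(axis=1, how="all")` followed by `DataFrame.dropna(how="all")`.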

Reproduction

Analysis started: 2023-12-11 00:05:50.116577
Analysis finished: 2023-12-11 00:05:51.332188
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

2018년: 121
<NA>: 5

Length

Max length: 5
Median length: 5
Mean length: 4.9603175
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018년
2nd row: 2018년
3rd row: 2018년
4th row: 2018년
5th row: 2018년

Common Values

Value | Count | Frequency (%)
2018년 | 121 | 96.0%
<NA> | 5 | 4.0%
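The 75.9% imbalance reported for 연도 is consistent with a normalized-entropy score: one minus the Shannon entropy of the value counts divided by the maximum possible entropy for that number of classes. Whether the profiler computes it exactly this way is an assumption; the sketch below only shows that this formula reproduces the reported figure from the counts 121 and 5. The function name is illustrative.

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - H / log2(k), where H is the entropy of the value counts
    and k is the number of distinct values. 0 means perfectly balanced."""
    counts = Counter(values)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no meaningful balance measure
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    return 1.0 - entropy / math.log2(n_classes)


# Counts from the Common Values table: 121 x "2018년", 5 x "<NA>".
year_values = ["2018년"] * 121 + ["<NA>"] * 5
score = imbalance_score(year_values)
print(f"{score:.1%}")  # 75.9%
```

Note that the five `<NA>` entries are counted as a category of their own here, which matches the report listing 2 distinct values and 0 missing for this column.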

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values (2018년: 121, 96.0%; missing: 5, 4.0%)]


(column with a blank name; values are months)
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

6월: 16
4월: 13
10월: 13
7월: 12
9월: 12
Other values (8): 60

Length

Max length: 4
Median length: 2
Mean length: 2.3174603
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1월
2nd row: 1월
3rd row: 1월
4th row: 1월
5th row: 2월

Common Values

Value | Count | Frequency (%)
6월 | 16 | 12.7%
4월 | 13 | 10.3%
10월 | 13 | 10.3%
7월 | 12 | 9.5%
9월 | 12 | 9.5%
5월 | 11 | 8.7%
8월 | 9 | 7.1%
11월 | 9 | 7.1%
3월 | 8 | 6.3%
12월 | 8 | 6.3%
Other values (3) | 15 | 11.9%

Length

[Figure: Histogram of lengths of the category]

품종
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

넙치: 31
조피볼락: 24
참돔: 14
돌돔: 14
감성돔: 7
Other values (23): 36

Length

Max length: 106
Median length: 2
Mean length: 9.8253968
Min length: 2

Unique

Unique: 18
Unique (%): 14.3%

Sample

1st row: 조피볼락
2nd row: 조피볼락
3rd row: 넙치
4th row: 숭어, 조피볼락, 참돔, 비단잉어, 점농어, 넙치, 방어, 강도다리, 볼락, 돌돔, 우렁이, 뱀장어, 향어, 은어, 메기, 징거미새우, 잉어, 붕어, 말쥐치, 쏘가리, 미꾸리, 자라, 은어
5th row: 조피볼락

Common Values

Value | Count | Frequency (%)
넙치 | 31 | 24.6%
조피볼락 | 24 | 19.0%
참돔 | 14 | 11.1%
돌돔 | 14 | 11.1%
감성돔 | 7 | 5.6%
숭어 | 5 | 4.0%
<NA> | 5 | 4.0%
방어 | 4 | 3.2%
은어 | 2 | 1.6%
버들치 | 2 | 1.6%
Other values (18) | 18 | 14.3%

Length

[Figure: Histogram of lengths of the category]
Value | Count | Frequency (%)
넙치 | 42 | 13.2%
조피볼락 | 36 | 11.4%
참돔 | 26 | 8.2%
돌돔 | 25 | 7.9%
감성돔 | 16 | 5.0%
숭어 | 16 | 5.0%
잉어 | 11 | 3.5%
뱀장어 | 11 | 3.5%
향어 | 10 | 3.2%
방어 | 9 | 2.8%
Other values (34) | 115 | 36.3%

발생질병
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

비브리오병: 30
연쇄구균병: 16
없음: 12
스쿠티카증: 11
아가미흡충병: 11
Other values (17): 46

Length

Max length: 10
Median length: 5
Mean length: 4.8412698
Min length: 2

Unique

Unique: 11
Unique (%): 8.7%

Sample

1st row: 아가미흡충
2nd row: 비브리오
3rd row: 트리코디나
4th row: 없음
5th row: 비브리오

Common Values

Value | Count | Frequency (%)
비브리오병 | 30 | 23.8%
연쇄구균병 | 16 | 12.7%
없음 | 12 | 9.5%
스쿠티카증 | 11 | 8.7%
아가미흡충병 | 11 | 8.7%
활주세균증 | 10 | 7.9%
비브리오 | 8 | 6.3%
트리코디나증 | 8 | 6.3%
<NA> | 5 | 4.0%
트리코디나 | 2 | 1.6%
Other values (12) | 13 | 10.3%

Length

[Figure: Histogram of lengths of the category]

발생건수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 26
Distinct (%): 21.5%
Missing: 5
Missing (%): 4.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.9586777
Minimum: 1
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 5
95-th percentile: 55
Maximum: 84
Range: 83
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 19.05798
Coefficient of variation (CV): 2.1273207
Kurtosis: 7.6003343
Mean: 8.9586777
Median Absolute Deviation (MAD): 1
Skewness: 2.8808121
Sum: 1084
Variance: 363.20661
Monotonicity: Not monotonic
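Several of the 발생건수 statistics are derived from one another, which gives a quick way to sanity-check the table. The sketch below recomputes the mean, variance, coefficient of variation, range, and IQR from the reported sum, standard deviation, and quantiles. The number of valid observations (126 rows minus 5 missing) is inferred from the figures above, and the variable names are illustrative.

```python
# Values copied from the 발생건수 statistics.
n_valid = 126 - 5          # observations minus missing values
total = 1084               # Sum
std = 19.05798             # Standard deviation
minimum, q1, q3, maximum = 1, 1, 5, 84

mean = total / n_valid           # Mean = Sum / n
variance = std ** 2              # Variance = std^2
cv = std / mean                  # Coefficient of variation = std / mean
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"mean:     {mean:.7f}")
print(f"variance: {variance:.5f}")
print(f"CV:       {cv:.7f}")
print(f"range:    {value_range}")
print(f"IQR:      {iqr}")
```

A CV above 2, together with a median of 2 against a maximum of 84, reflects the handful of large monthly "없음" (no disease) rows that dominate the distribution.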
[Figure: Histogram with fixed size bins (bins=26)]
Common values

Value | Count | Frequency (%)
1 | 60 | 47.6%
2 | 22 | 17.5%
5 | 8 | 6.3%
3 | 4 | 3.2%
4 | 3 | 2.4%
7 | 2 | 1.6%
8 | 2 | 1.6%
83 | 2 | 1.6%
44 | 1 | 0.8%
79 | 1 | 0.8%
Other values (16) | 16 | 12.7%
(Missing) | 5 | 4.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 60 | 47.6%
2 | 22 | 17.5%
3 | 4 | 3.2%
4 | 3 | 2.4%
5 | 8 | 6.3%
6 | 1 | 0.8%
7 | 2 | 1.6%
8 | 2 | 1.6%
9 | 1 | 0.8%
10 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
84 | 1 | 0.8%
83 | 2 | 1.6%
80 | 1 | 0.8%
79 | 1 | 0.8%
62 | 1 | 0.8%
55 | 1 | 0.8%
47 | 1 | 0.8%
44 | 1 | 0.8%
41 | 1 | 0.8%
40 | 1 | 0.8%

발생률(%)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 37
Distinct (%): 30.6%
Missing: 5
Missing (%): 4.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.9157025
Minimum: 1
Maximum: 96.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.1
Q1: 1.1
median: 2
Q3: 5
95-th percentile: 61.8
Maximum: 96.5
Range: 95.5
Interquartile range (IQR): 3.9

Descriptive statistics

Standard deviation: 21.301428
Coefficient of variation (CV): 2.148252
Kurtosis: 7.8546979
Mean: 9.9157025
Median Absolute Deviation (MAD): 0.9
Skewness: 2.9222206
Sum: 1199.8
Variance: 453.75083
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=37)]
Common values

Value | Count | Frequency (%)
1.1 | 51 | 40.5%
2.3 | 11 | 8.7%
2.2 | 7 | 5.6%
1.0 | 6 | 4.8%
5.7 | 4 | 3.2%
2.0 | 4 | 3.2%
1.2 | 3 | 2.4%
3.4 | 3 | 2.4%
5.0 | 2 | 1.6%
5.4 | 2 | 1.6%
Other values (27) | 28 | 22.2%
(Missing) | 5 | 4.0%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 6 | 4.8%
1.1 | 51 | 40.5%
1.2 | 3 | 2.4%
2.0 | 4 | 3.2%
2.2 | 7 | 5.6%
2.3 | 11 | 8.7%
3.0 | 1 | 0.8%
3.4 | 3 | 2.4%
4.0 | 1 | 0.8%
4.3 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
96.5 | 1 | 0.8%
93.1 | 1 | 0.8%
90.2 | 1 | 0.8%
89.8 | 1 | 0.8%
88.9 | 1 | 0.8%
71.6 | 1 | 0.8%
61.8 | 1 | 0.8%
50.5 | 1 | 0.8%
50.0 | 1 | 0.8%
45.5 | 1 | 0.8%

비 고
Categorical

HIGH CORRELATION 

Distinct: 29
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

통영: 60
남해: 13
거제: 7
통영, 남해: 6
<NA>: 5
Other values (24): 35

Length

Max length: 46
Median length: 2
Mean length: 5.6587302
Min length: 2

Unique

Unique: 18
Unique (%): 14.3%

Sample

1st row: 통영
2nd row: 통영
3rd row: 통영
4th row: 통영, 거제, 고성, 남해, 하동, 창원, 밀양, 창녕, 김해, 산청
5th row: 통영

Common Values

Value | Count | Frequency (%)
통영 | 60 | 47.6%
남해 | 13 | 10.3%
거제 | 7 | 5.6%
통영, 남해 | 6 | 4.8%
<NA> | 5 | 4.0%
고성 | 5 | 4.0%
하동 | 3 | 2.4%
남해, 하동 | 3 | 2.4%
합천 | 2 | 1.6%
밀양 | 2 | 1.6%
Other values (19) | 20 | 15.9%

Length

[Figure: Histogram of lengths of the category]
Value | Count | Frequency (%)
통영 | 79 | 33.1%
남해 | 38 | 15.9%
거제 | 25 | 10.5%
고성 | 20 | 8.4%
하동 | 18 | 7.5%
밀양 | 9 | 3.8%
양산 | 9 | 3.8%
창원 | 7 | 2.9%
합천 | 6 | 2.5%
함양 | 6 | 2.5%
Other values (5) | 22 | 9.2%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 126
Missing (%): 100.0%
Memory size: 1.2 KiB

Interactions


Correlations

[Figure: correlation heatmap]

 | (blank) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고
(blank) | 1.000 | 0.000 | 0.559 | 0.000 | 0.000 | 0.128
품종 | 0.000 | 1.000 | 0.654 | 0.973 | 0.978 | 0.983
발생질병 | 0.559 | 0.654 | 1.000 | 0.000 | 0.276 | 0.000
발생건수 | 0.000 | 0.973 | 0.000 | 1.000 | 1.000 | 0.962
발생률(%) | 0.000 | 0.978 | 0.276 | 1.000 | 1.000 | 0.966
비 고 | 0.128 | 0.983 | 0.000 | 0.962 | 0.966 | 1.000

[Figure: correlation heatmap]

 | 품종 | 발생질병 | 비 고 | 연도 | (blank)
품종 | 1.000 | 0.214 | 0.761 | 1.000 | 0.000
발생질병 | 0.214 | 1.000 | 0.000 | 1.000 | 0.214
비 고 | 0.761 | 0.000 | 1.000 | 1.000 | 0.000
연도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
(blank) | 0.000 | 0.214 | 0.000 | 1.000 | 1.000

[Figure: correlation heatmap]

 | 발생건수 | 발생률(%) | 연도 | (blank) | 품종 | 발생질병 | 비 고
발생건수 | 1.000 | 0.971 | 1.000 | 0.000 | 0.693 | 0.000 | 0.726
발생률(%) | 0.971 | 1.000 | 1.000 | 0.000 | 0.711 | 0.096 | 0.739
연도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
(blank) | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.214 | 0.000
품종 | 0.693 | 0.711 | 1.000 | 0.000 | 1.000 | 0.214 | 0.761
발생질병 | 0.000 | 0.096 | 1.000 | 0.214 | 0.214 | 1.000 | 0.000
비 고 | 0.726 | 0.739 | 1.000 | 0.000 | 0.761 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]

Sample

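First rows

(index) | 연도 | (blank) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13
0 | 2018년 | 1월 | 조피볼락 | 아가미흡충 | 1 | 1.2 | 통영 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 2018년 | 1월 | 조피볼락 | 비브리오 | 1 | 1.2 | 통영 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 2018년 | 1월 | 넙치 | 트리코디나 | 1 | 1.2 | 통영 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | 2018년 | 1월 | 숭어, 조피볼락, 참돔, 비단잉어, 점농어, 넙치, 방어, 강도다리, 볼락, 돌돔, 우렁이, 뱀장어, 향어, 은어, 메기, 징거미새우, 잉어, 붕어, 말쥐치, 쏘가리, 미꾸리, 자라, 은어 | 없음 | 83 | 96.5 | 통영, 거제, 고성, 남해, 하동, 창원, 밀양, 창녕, 김해, 산청 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 2018년 | 2월 | 조피볼락 | 비브리오 | 1 | 1.1 | 통영 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 2018년 | 2월 | 참돔 | 비브리오 | 1 | 1.1 | 통영 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | 2018년 | 2월 | 넙치 | 비브리오 | 1 | 1.1 | 고성 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | 2018년 | 2월 | 넙치 | 에드워드 | 1 | 1.1 | 거제 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | 2018년 | 2월 | 넙치 | 트리코디나 | 2 | 2.3 | 고성 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 2018년 | 2월 | 숭어, 조피볼락, 돌돔, 넙치, 능성어, 강도다리, 참돔, 볼락, 쥐치, 감성돔, 우렁이, 뱀장어, 향어, 열대어, 잉어, 붕어, 자라, 철갑상어, 미꾸리 | 없음 | 84 | 93.1 | 통영, 거제, 고성, 남해, 하동, 창원, 밀양, 양산, 김해, 함양, 산청 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

(index) | 연도 | (blank) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13
116 | 2018년 | 12월 | 넙치 | 트리코디나증 | 1 | 1.1 | 남해 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
117 | 2018년 | 12월 | 넙치 | 비브리오병 | 1 | 1.1 | 거제 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
118 | 2018년 | 12월 | 버들치 | 스쿠티카증 | 1 | 1.1 | 합천 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
119 | 2018년 | 12월 | 은어 | 활주세균증 | 2 | 2.3 | 밀양 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
120 | 2018년 | 12월 | 조피볼락, 참돔, 돌돔, 넙치, 볼락, 말쥐치, 민어, 무지개송어, 뱀장어, 해삼, 잉어, 향어, 점농어, 틸라피아, 방어, 메기, 감성돔, 숭어 | 없음 | 79 | 89.8 | 통영, 거제, 고성, 남해, 하동, 밀양, 창녕, 창원, 양산, 합천 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
121 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
122 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
123 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
124 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
125 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>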

Duplicate rows

Most frequently occurring

(index) | 연도 | (blank) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 5
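The overview reports 1 duplicate row (0.8%), while this table shows a single all-empty row occurring 5 times. One reading that reconciles the two, stated here as an assumption rather than a documented rule, is that the summary counts distinct duplicated rows (1 of 126 is about 0.8%) and the table counts their occurrences. The sketch below groups identical rows to show both numbers; the function name and the sample rows are illustrative.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative data: two distinct records plus five all-empty rows,
# mirroring the trailing blank rows at the end of the dataset.
rows = [
    ("2018년", "12월", "넙치"),
    ("2018년", "12월", "은어"),
] + [(None, None, None)] * 5

groups = duplicate_groups(rows)
print(len(groups))             # number of distinct duplicated rows
print(list(groups.values()))   # occurrences of each duplicated row
```

Dropping rows in which every field is empty before profiling would remove this duplicate group entirely.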