Overview

Dataset statistics

Number of variables: 17
Number of observations: 120
Missing cells: 1220
Missing cells (%): 59.8%
Duplicate rows: 1
Duplicate rows (%): 0.8%
Total size in memory: 17.5 KiB
Average record size in memory: 149.1 B
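
The missing-cell and duplicate percentages above follow from simple ratios. A minimal sketch of that arithmetic, assuming the missing-cell total is the sum of the per-column missing counts listed in the alerts (ten fully empty `Unnamed` columns plus 10 missing values each in 발생건수 and 발생률(%)); the variable names are illustrative only:

```python
# Figures copied from the report. The decomposition of the missing cells is an
# assumption based on the per-column counts in the Alerts section.
n_variables = 17
n_observations = 120

empty_columns = 10           # Unnamed: 7 .. Unnamed: 16, 120 missing each
partly_missing_columns = 2   # 발생건수 and 발생률(%), 10 missing each

missing_cells = empty_columns * n_observations + partly_missing_columns * 10
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * 1 / n_observations

print(missing_cells, f"{missing_pct:.1f}%", f"{duplicate_pct:.1f}%")
```

Under these assumptions the result reproduces the reported 1220 missing cells, 59.8% and 0.8%.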

Variable types

Categorical: 5
Numeric: 2
Unsupported: 10

Dataset

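Description: Monthly survey results on aquatic organism disease occurrence in Gyeongsangnam-do. (Provides data on the affected species, disease, number of cases, and incidence rate for marine organisms, fish, and crustaceans.)
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3076265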

Alerts

Dataset has 1 (0.8%) duplicate rows [Duplicates]
품종 is highly overall correlated with 발생건수 and 3 other fields [High correlation]
발생질병 is highly overall correlated with 연도 [High correlation]
비 고 is highly overall correlated with 발생건수 and 3 other fields [High correlation]
연도 is highly overall correlated with 발생건수 and 5 other fields [High correlation]
(unnamed column) is highly overall correlated with 연도 [High correlation]
발생건수 is highly overall correlated with 발생률(%) and 3 other fields [High correlation]
발생률(%) is highly overall correlated with 발생건수 and 3 other fields [High correlation]
연도 is highly imbalanced (58.6%) [Imbalance]
발생건수 has 10 (8.3%) missing values [Missing]
발생률(%) has 10 (8.3%) missing values [Missing]
Unnamed: 7 has 120 (100.0%) missing values [Missing]
Unnamed: 8 has 120 (100.0%) missing values [Missing]
Unnamed: 9 has 120 (100.0%) missing values [Missing]
Unnamed: 10 has 120 (100.0%) missing values [Missing]
Unnamed: 11 has 120 (100.0%) missing values [Missing]
Unnamed: 12 has 120 (100.0%) missing values [Missing]
Unnamed: 13 has 120 (100.0%) missing values [Missing]
Unnamed: 14 has 120 (100.0%) missing values [Missing]
Unnamed: 15 has 120 (100.0%) missing values [Missing]
Unnamed: 16 has 120 (100.0%) missing values [Missing]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
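
The alerts above report ten `Unnamed` columns that are entirely missing, and the sample further down suggests ten trailing rows that are entirely empty. A minimal cleaning sketch in plain Python on a list of dicts, assuming missing cells are represented as `None`; the helpers `drop_empty_columns` and `drop_empty_rows` and the tiny `rows` sample are hypothetical and not part of the report:

```python
def drop_empty_columns(rows):
    """Remove keys whose value is None in every row."""
    keep = [k for k in rows[0] if any(r[k] is not None for r in rows)]
    return [{k: r[k] for k in keep} for r in rows]


def drop_empty_rows(rows):
    """Remove rows in which every value is None."""
    return [r for r in rows if any(v is not None for v in r.values())]


# Hypothetical miniature of the dataset: two real rows, one empty row,
# and one column ("Unnamed: 7") that is never filled in.
rows = [
    {"연도": "2017년", "품종": "조피볼락", "발생건수": 1, "Unnamed: 7": None},
    {"연도": "2017년", "품종": "감성돔", "발생건수": 1, "Unnamed: 7": None},
    {"연도": None, "품종": None, "발생건수": None, "Unnamed: 7": None},
]

cleaned = drop_empty_rows(drop_empty_columns(rows))
print(cleaned)
```

Dropping the empty columns first keeps the row check from having to look at cells that can never hold data.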

Reproduction

Analysis started: 2023-12-11 00:05:45.374009
Analysis finished: 2023-12-11 00:05:46.879809
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

2017년: 110
<NA>: 10

Length

Max length: 5
Median length: 5
Mean length: 4.9166667
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017년
2nd row: 2017년
3rd row: 2017년
4th row: 2017년
5th row: 2017년

Common Values

Value | Count | Frequency (%)
2017년 | 110 | 91.7%
<NA> | 10 | 8.3%
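
The 58.6% imbalance flagged for 연도 is consistent with an entropy-based score, 1 - H / log2(k), computed over the value counts above, where H is the Shannon entropy in bits and k is the number of distinct values. A minimal sketch, assuming this is the formula behind the alert; the function name `imbalance_score` is illustrative and is not the library's API:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts for 연도: 110 rows of "2017년" and 10 rows of "<NA>".
score = imbalance_score([110, 10])
print(f"{score:.1%}")  # 58.6%
```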

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2017년 | 110 | 91.7%
na | 10 | 8.3%


(unnamed column; values are months)
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

6월: 17
3월: 12
5월: 11
10월: 11
7월: 10
Other values (8): 59

Length

Max length: 4
Median length: 2
Mean length: 2.3583333
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1월
2nd row: 1월
3rd row: 1월
4th row: 1월
5th row: 2월

Common Values

Value | Count | Frequency (%)
6월 | 17 | 14.2%
3월 | 12 | 10.0%
5월 | 11 | 9.2%
10월 | 11 | 9.2%
7월 | 10 | 8.3%
<NA> | 10 | 8.3%
4월 | 9 | 7.5%
8월 | 9 | 7.5%
9월 | 8 | 6.7%
11월 | 8 | 6.7%
Other values (3) | 15 | 12.5%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
6월 | 17 | 14.2%
3월 | 12 | 10.0%
5월 | 11 | 9.2%
10월 | 11 | 9.2%
7월 | 10 | 8.3%
na | 10 | 8.3%
4월 | 9 | 7.5%
8월 | 9 | 7.5%
9월 | 8 | 6.7%
11월 | 8 | 6.7%
Other values (3) | 15 | 12.5%
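
Because 연도 and the month column are stored as text such as `2017년` and `1월`, sorting or plotting them chronologically needs a numeric form. A minimal sketch, assuming every non-missing value is a run of digits followed by a single Korean unit suffix (`년` or `월`) and that missing values appear as the literal string `<NA>` or as `None`; `parse_korean_number` is a hypothetical helper:

```python
import re

# Digits followed by the year (년) or month (월) suffix, with optional spaces.
_PATTERN = re.compile(r"^\s*(\d+)\s*(년|월)\s*$")


def parse_korean_number(text):
    """Convert '2017년' to 2017 and '10월' to 10; return None if unparseable."""
    if text is None:
        return None
    match = _PATTERN.match(text)
    return int(match.group(1)) if match else None


samples = ["2017년", "1월", "10월", "<NA>"]
print([parse_korean_number(s) for s in samples])  # [2017, 1, 10, None]
```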

품종
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

조피볼락: 28
넙치: 21
참돔: 10
<NA>: 10
방어: 9
Other values (26): 42

Length

Max length: 136
Median length: 122
Mean length: 12.158333
Min length: 2

Unique

Unique: 20
Unique (%): 16.7%

Sample

1st row: 조피볼락
2nd row: 감성돔
3rd row: 넙치
4th row: 숭어, 방어, 쥐치, 농어, 조피볼락, 볼락, 참돔, 돌돔, 감성돔, 말쥐치, 넙치, 뱀장어, 잉어, 우렁이, 미꾸라지, 자라, 철갑상어, 동자개, 붕어, 이스라엘잉어, 점농어, 틸라피아, 금붕어, 징거미새우, 비단잉어
5th row: 점농어

Common Values

Value | Count | Frequency (%)
조피볼락 | 28 | 23.3%
넙치 | 21 | 17.5%
참돔 | 10 | 8.3%
<NA> | 10 | 8.3%
방어 | 9 | 7.5%
숭어 | 6 | 5.0%
점농어 | 5 | 4.2%
감성돔 | 4 | 3.3%
돌돔 | 3 | 2.5%
뱀장어 | 2 | 1.7%
Other values (21) | 22 | 18.3%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
조피볼락 | 40 | 11.2%
넙치 | 31 | 8.7%
참돔 | 22 | 6.2%
숭어 | 16 | 4.5%
방어 | 13 | 3.7%
감성돔 | 13 | 3.7%
돌돔 | 13 | 3.7%
뱀장어 | 13 | 3.7%
잉어 | 12 | 3.4%
붕어 | 12 | 3.4%
Other values (43) | 171 | 48.0%

발생질병
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

비브리오: 27
없음: 12
아가미흡충: 11
<NA>: 10
트리코디나: 7
Other values (21): 53

Length

Max length: 12
Median length: 10
Mean length: 4.55
Min length: 2

Unique

Unique: 9
Unique (%): 7.5%

Sample

1st row: 비브리오병
2nd row: 비브리오병
3rd row: 스쿠티카증
4th row: 없음
5th row: 활주세균, 비브리오

Common Values

Value | Count | Frequency (%)
비브리오 | 27 | 22.5%
없음 | 12 | 10.0%
아가미흡충 | 11 | 9.2%
<NA> | 10 | 8.3%
트리코디나 | 7 | 5.8%
연쇄구균 | 7 | 5.8%
베네데니아 | 5 | 4.2%
활주세균 | 5 | 4.2%
비브리오병 | 5 | 4.2%
스쿠치카 | 5 | 4.2%
Other values (16) | 26 | 21.7%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
비브리오 | 29 | 23.6%
아가미흡충 | 12 | 9.8%
없음 | 12 | 9.8%
na | 10 | 8.1%
트리코디나 | 7 | 5.7%
연쇄구균 | 7 | 5.7%
활주세균 | 7 | 5.7%
베네데니아 | 5 | 4.1%
비브리오병 | 5 | 4.1%
스쿠치카 | 5 | 4.1%
Other values (14) | 24 | 19.5%

발생건수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 25
Distinct (%): 22.7%
Missing: 10
Missing (%): 8.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.445455
Minimum: 1
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1.5
Q3: 6
95-th percentile: 69.7
Maximum: 91
Range: 90
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 21.837845
Coefficient of variation (CV): 2.0906553
Kurtosis: 5.7931488
Mean: 10.445455
Median Absolute Deviation (MAD): 0.5
Skewness: 2.6473117
Sum: 1149
Variance: 476.89149
Monotonicity: Not monotonic
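
Several of the figures reported for 발생건수 are derived from one another, which can be checked with a few lines of arithmetic. A sketch that uses only the values listed above; small differences in the last digits are expected because the report rounds its inputs:

```python
# Values copied from the 발생건수 summary above.
total, n_rows, n_missing = 1149, 120, 10
std = 21.837845
q1, q3 = 1, 6
minimum, maximum = 1, 91

count = n_rows - n_missing        # 110 non-missing observations
mean = total / count              # reported mean: 10.445455
cv = std / mean                   # reported CV: 2.0906553
variance = std ** 2               # reported variance: 476.89149
iqr = q3 - q1                     # reported IQR: 5
value_range = maximum - minimum   # reported range: 90

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
```

The same relationships (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared) apply to any numeric column in the report.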
Figure: Histogram with fixed size bins (bins=25)

Common values

Value | Count | Frequency (%)
1 | 55 | 45.8%
2 | 12 | 10.0%
6 | 5 | 4.2%
3 | 5 | 4.2%
5 | 5 | 4.2%
7 | 3 | 2.5%
8 | 3 | 2.5%
4 | 3 | 2.5%
16 | 2 | 1.7%
9 | 2 | 1.7%
Other values (15) | 15 | 12.5%
(Missing) | 10 | 8.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 55 | 45.8%
2 | 12 | 10.0%
3 | 5 | 4.2%
4 | 3 | 2.5%
5 | 5 | 4.2%
6 | 5 | 4.2%
7 | 3 | 2.5%
8 | 3 | 2.5%
9 | 2 | 1.7%
11 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
91 | 1 | 0.8%
89 | 1 | 0.8%
85 | 1 | 0.8%
84 | 1 | 0.8%
78 | 1 | 0.8%
76 | 1 | 0.8%
62 | 1 | 0.8%
60 | 1 | 0.8%
59 | 1 | 0.8%
55 | 1 | 0.8%

발생률(%)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 41
Distinct (%): 37.3%
Missing: 10
Missing (%): 8.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.895455
Minimum: 1
Maximum: 94.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1.1
median: 1.5
Q3: 6.3
95-th percentile: 74.33
Maximum: 94.9
Range: 93.9
Interquartile range (IQR): 5.2

Descriptive statistics

Standard deviation: 22.806077
Coefficient of variation (CV): 2.0931735
Kurtosis: 5.8638213
Mean: 10.895455
Median Absolute Deviation (MAD): 0.5
Skewness: 2.6594708
Sum: 1198.5
Variance: 520.11714
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=41)

Common values

Value | Count | Frequency (%)
1.1 | 38 | 31.7%
1.0 | 17 | 14.2%
2.0 | 7 | 5.8%
2.1 | 3 | 2.5%
6.4 | 3 | 2.5%
7.4 | 3 | 2.5%
9.0 | 3 | 2.5%
3.0 | 2 | 1.7%
5.3 | 2 | 1.7%
86.7 | 1 | 0.8%
Other values (31) | 31 | 25.8%
(Missing) | 10 | 8.3%

Minimum 10 values

Value | Count | Frequency (%)
1.0 | 17 | 14.2%
1.1 | 38 | 31.7%
1.9 | 1 | 0.8%
2.0 | 7 | 5.8%
2.1 | 3 | 2.5%
2.2 | 1 | 0.8%
3.0 | 2 | 1.7%
3.1 | 1 | 0.8%
3.2 | 1 | 0.8%
3.4 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
94.9 | 1 | 0.8%
93.3 | 1 | 0.8%
88.2 | 1 | 0.8%
86.7 | 1 | 0.8%
83.9 | 1 | 0.8%
80.9 | 1 | 0.8%
66.3 | 1 | 0.8%
60.6 | 1 | 0.8%
59.4 | 1 | 0.8%
56.4 | 1 | 0.8%

비 고
Categorical

HIGH CORRELATION 

Distinct: 36
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

통영: 32
남해: 14
거제: 12
<NA>: 10
하동: 8
Other values (31): 44

Length

Max length: 50
Median length: 2
Mean length: 6.9
Min length: 2

Unique

Unique: 26
Unique (%): 21.7%

Sample

1st row: 통영
2nd row: 남해
3rd row: 하동, 남해, 거제
4th row: 통영, 거제, 남해, 하동, 김해, 창원, 양산, 창녕, 진주, 함양, 함안, 산청
5th row: 남해, 하동

Common Values

Value | Count | Frequency (%)
통영 | 32 | 26.7%
남해 | 14 | 11.7%
거제 | 12 | 10.0%
<NA> | 10 | 8.3%
하동 | 8 | 6.7%
통영, 남해 | 7 | 5.8%
통영, 거제 | 5 | 4.2%
고성 | 2 | 1.7%
거제, 남해 | 2 | 1.7%
남해, 하동 | 2 | 1.7%
Other values (26) | 26 | 21.7%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
통영 | 57 | 21.8%
남해 | 43 | 16.4%
거제 | 36 | 13.7%
하동 | 23 | 8.8%
고성 | 16 | 6.1%
na | 10 | 3.8%
김해 | 10 | 3.8%
양산 | 10 | 3.8%
산청 | 10 | 3.8%
창원 | 8 | 3.1%
Other values (8) | 39 | 14.9%
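
Cells in 품종 and 비 고 often hold several comma-separated entries, which is why the second table for 비 고 counts 통영 57 times although only 32 rows contain exactly `통영`. A minimal sketch of that kind of token counting, applied to the five sample rows shown for 비 고; the comma-splitting rule and the helper name `count_tokens` are assumptions, and the report's own tokenizer may differ:

```python
from collections import Counter

# The five sample values listed for 비 고 in this report.
samples = [
    "통영",
    "남해",
    "하동, 남해, 거제",
    "통영, 거제, 남해, 하동, 김해, 창원, 양산, 창녕, 진주, 함양, 함안, 산청",
    "남해, 하동",
]


def count_tokens(values, sep=","):
    """Split each cell on `sep`, strip whitespace, and count the pieces."""
    counter = Counter()
    for value in values:
        counter.update(part.strip() for part in value.split(sep) if part.strip())
    return counter


token_counts = count_tokens(samples)
print(token_counts.most_common(3))
```

Counting tokens rather than whole cells gives a per-region or per-species view, at the cost of losing which entries were reported together.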

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 15
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Unnamed: 16
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 120
Missing (%): 100.0%
Memory size: 1.2 KiB

Interactions


Correlations

Variable | (unnamed) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고
(unnamed) | 1.000 | 0.428 | 0.342 | 0.000 | 0.000 | 0.546
품종 | 0.428 | 1.000 | 0.000 | 0.939 | 0.939 | 0.984
발생질병 | 0.342 | 0.000 | 1.000 | 0.077 | 0.077 | 0.729
발생건수 | 0.000 | 0.939 | 0.077 | 1.000 | 1.000 | 0.986
발생률(%) | 0.000 | 0.939 | 0.077 | 1.000 | 1.000 | 0.986
비 고 | 0.546 | 0.984 | 0.729 | 0.986 | 0.986 | 1.000

Variable | 품종 | 발생질병 | 비 고 | 연도 | (unnamed)
품종 | 1.000 | 0.000 | 0.709 | 1.000 | 0.125
발생질병 | 0.000 | 1.000 | 0.224 | 1.000 | 0.102
비 고 | 0.709 | 0.224 | 1.000 | 1.000 | 0.171
연도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
(unnamed) | 0.125 | 0.102 | 0.171 | 1.000 | 1.000

Variable | 발생건수 | 발생률(%) | 연도 | (unnamed) | 품종 | 발생질병 | 비 고
발생건수 | 1.000 | 0.956 | 1.000 | 0.000 | 0.662 | 0.000 | 0.706
발생률(%) | 0.956 | 1.000 | 1.000 | 0.000 | 0.663 | 0.000 | 0.718
연도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
(unnamed) | 0.000 | 0.000 | 1.000 | 1.000 | 0.125 | 0.102 | 0.171
품종 | 0.662 | 0.663 | 1.000 | 0.125 | 1.000 | 0.000 | 0.709
발생질병 | 0.000 | 0.000 | 1.000 | 0.102 | 0.000 | 1.000 | 0.224
비 고 | 0.706 | 0.718 | 1.000 | 0.171 | 0.709 | 0.224 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
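
Nullity correlation can be illustrated as the Pearson correlation between presence indicators (1 if a cell is filled, 0 if it is missing). A minimal sketch, assuming the 10 missing values of 발생건수 and 발생률(%) sit in the same rows (110 to 119, as the sample of the last rows suggests); the `pearson` helper is written out for illustration and is not taken from the report's tooling:

```python
import math


def pearson(xs, ys):
    """Pearson correlation, or None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None  # constant indicator: correlation is undefined
    return cov / math.sqrt(vx * vy)


n_rows = 120
# Assumption: both numeric columns are filled in rows 0..109 and missing after.
cases_present = [1 if i < 110 else 0 for i in range(n_rows)]
rate_present = list(cases_present)
# The Unnamed columns are never filled, so their indicator is constant.
unnamed_present = [0] * n_rows

print(pearson(cases_present, rate_present))
print(pearson(cases_present, unnamed_present))
```

Under that assumption the two numeric columns are missing together, so their nullity correlation is 1, while the always-empty `Unnamed` columns carry no nullity signal at all.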

Sample

(index) | 연도 | (unnamed) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | Unnamed: 7 to Unnamed: 16
0 | 2017년 | 1월 | 조피볼락 | 비브리오병 | 1 | 1.0 | 통영 | <NA> (all 10 columns)
1 | 2017년 | 1월 | 감성돔 | 비브리오병 | 1 | 1.0 | 남해 | <NA> (all 10 columns)
2 | 2017년 | 1월 | 넙치 | 스쿠티카증 | 3 | 3.1 | 하동, 남해, 거제 | <NA> (all 10 columns)
3 | 2017년 | 1월 | 숭어, 방어, 쥐치, 농어, 조피볼락, 볼락, 참돔, 돌돔, 감성돔, 말쥐치, 넙치, 뱀장어, 잉어, 우렁이, 미꾸라지, 자라, 철갑상어, 동자개, 붕어, 이스라엘잉어, 점농어, 틸라피아, 금붕어, 징거미새우, 비단잉어 | 없음 | 91 | 94.9 | 통영, 거제, 남해, 하동, 김해, 창원, 양산, 창녕, 진주, 함양, 함안, 산청 | <NA> (all 10 columns)
4 | 2017년 | 2월 | 점농어 | 활주세균, 비브리오 | 2 | 2.0 | 남해, 하동 | <NA> (all 10 columns)
5 | 2017년 | 2월 | 점농어 | 아가미흡충 | 1 | 1.0 | 남해 | <NA> (all 10 columns)
6 | 2017년 | 2월 | 조피볼락 | 비브리오병 | 5 | 4.9 | 통영, 남해 | <NA> (all 10 columns)
7 | 2017년 | 2월 | 철갑상어 | 곰팡이병 | 1 | 1.0 | 의령 | <NA> (all 10 columns)
8 | 2017년 | 2월 | 참돔 | 비브리오병 | 2 | 1.9 | 통영, 남해 | <NA> (all 10 columns)
9 | 2017년 | 2월 | 참돔 | 베네데니아증 | 1 | 1.0 | 거제 | <NA> (all 10 columns)

(index) | 연도 | (unnamed) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | Unnamed: 7 to Unnamed: 16
110 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
111 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
112 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
113 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
114 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
115 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
116 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
117 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
118 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)
119 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> (all 10 columns)

Duplicate rows

Most frequently occurring

(index) | 연도 | (unnamed) | 품종 | 발생질병 | 발생건수 | 발생률(%) | 비 고 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 10