Overview

Dataset statistics

Number of variables: 7
Number of observations: 116
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.6 KiB
Average record size in memory: 58.1 B

Variable types

Numeric: 1
Text: 3
Categorical: 3

Dataset

Description: Animal disease information
Author: Animal and Plant Quarantine Agency (농림축산검역본부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220214000000001884

Alerts

RGSDE is highly imbalanced (73.5%) [Imbalance]
DISS_NO has unique values [Unique]
DISS_NM has unique values [Unique]
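
The report does not state how the 73.5% imbalance figure is derived. One formula that reproduces it from the RGSDE counts listed under Common Values is one minus the Shannon entropy normalized by log2 of the number of classes. The sketch below is written under that assumption; the function name `imbalance_score` and the handling of single-class columns are illustrative choices, not taken from the report.

```python
from math import log2

# RGSDE value counts as listed in this report's Common Values table.
rgsde_counts = {
    "2007-09-18": 103, "2007-10-25": 4, "2007-11-07": 3, "2007-06-04": 2,
    "2007-06-12": 1, "2007-06-07": 1, "2007-05-28": 1, "2007-10-30": 1,
}

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is reported as 0 here
    # Shannon entropy in bits of the empirical class distribution.
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)

score = imbalance_score(list(rgsde_counts.values()))
print(f"{score:.1%}")
```

With 103 of 116 rows sharing one date, the entropy is small relative to its maximum of log2(8) = 3 bits, which is why the score sits well above the midpoint.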

Reproduction

Analysis started: 2023-12-11 03:46:38.485779
Analysis finished: 2023-12-11 03:46:39.192415
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
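
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the standard library; the format string is an assumption matched to how the timestamps are printed in this report.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-11 03:46:38.485779", FMT)
finished = datetime.strptime("2023-12-11 03:46:39.192415", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```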

Variables

DISS_NO
Real number (ℝ)

UNIQUE 

Distinct: 116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.051724
Minimum: 1
Maximum: 121
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 6.75
Q1: 30.75
Median: 59.5
Q3: 89.25
95th percentile: 115.25
Maximum: 121
Range: 120
Interquartile range (IQR): 58.5

Descriptive statistics

Standard deviation: 34.549988
Coefficient of variation (CV): 0.57533715
Kurtosis: -1.1629478
Mean: 60.051724
Median Absolute Deviation (MAD): 29.5
Skewness: 0.03094647
Sum: 6966
Variance: 1193.7016
Monotonicity: Not monotonic
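
Several of the DISS_NO figures can be derived from one another, which makes for a useful sanity check. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from other values printed in this report. It uses the standard textbook definitions, which are assumed (not confirmed by the report) to be the ones the profiler applies.

```python
# Figures copied from the DISS_NO statistics in this report.
n = 116
total = 6966
std = 34.549988
minimum, maximum = 1, 121
q1, q3 = 30.75, 89.25

mean = total / n                  # Sum divided by number of observations
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
value_range = maximum - minimum   # Range
iqr = q3 - q1                     # interquartile range

print(f"mean={mean:.6f} cv={cv:.8f} variance={variance:.4f}")
print(f"range={value_range} iqr={iqr}")
```

Small differences in the last printed digit are expected, because the inputs are themselves rounded in the report.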
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
28 | 1 | 0.9%
103 | 1 | 0.9%
118 | 1 | 0.9%
117 | 1 | 0.9%
116 | 1 | 0.9%
115 | 1 | 0.9%
113 | 1 | 0.9%
112 | 1 | 0.9%
110 | 1 | 0.9%
109 | 1 | 0.9%
Other values (106) | 106 | 91.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
11 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
121 | 1 | 0.9%
120 | 1 | 0.9%
119 | 1 | 0.9%
118 | 1 | 0.9%
117 | 1 | 0.9%
116 | 1 | 0.9%
115 | 1 | 0.9%
113 | 1 | 0.9%
112 | 1 | 0.9%
110 | 1 | 0.9%

DISS_NM
Text

UNIQUE 

Distinct: 116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 15
Median length: 10
Mean length: 5.9913793
Min length: 2

Characters and Unicode

Total characters: 695
Distinct characters: 198
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
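
To make the category counts concrete, the sketch below tallies the Unicode general category of each character in two sample values from this dataset, using Python's standard `unicodedata` module. How the profiler itself computes these tallies is not stated in the report, so this is an illustration of the idea only; the helper name `category_counts` is hypothetical. The two-letter codes map to the labels used in this report: `Lo` is Other Letter, `Lu` Uppercase Letter, `Ll` Lowercase Letter, `Zs` Space Separator, `Pd` Dash Punctuation, and `Nd` Decimal Number.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the Unicode general category (e.g. 'Lu', 'Lo', 'Zs') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)

# Sample values taken from the DISS_NM and ENG_DISS_NM columns of this report.
korean = category_counts("돼지단독")
english = category_counts("PCV-2 infection")

print(dict(korean))
print(dict(english))
```

Hangul syllables fall under Other Letter because the script has no upper or lower case, which is why that category dominates the Korean name column.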

Unique

Unique: 116
Unique (%): 100.0%

Sample

1st row: 돼지단독
2nd row: 돼지로타바이러스감염증
3rd row: 돼지생식기호흡기증후군
4th row: 돼지수포병
5th row: 돼지써코바이러스감염증

Value | Count | Frequency (%)
돼지단독 | 1 | 0.9%
전염성f낭병 | 1 | 0.9%
파스튜렐라폐렴 | 1 | 0.9%
소부제병 | 1 | 0.9%
개코로나바이러스감염증 | 1 | 0.9%
토끼바이러스성출혈병 | 1 | 0.9%
탄저 | 1 | 0.9%
클라미디아병 | 1 | 0.9%
크립토스포리디움증 | 1 | 0.9%
큐열 | 1 | 0.9%
Other values (107) | 107 | 91.5%

Most occurring characters

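Value | Count | Frequency (%)
(not preserved) | 38 | 5.5%
(not preserved) | 35 | 5.0%
(not preserved) | 32 | 4.6%
(not preserved) | 31 | 4.5%
(not preserved) | 25 | 3.6%
(not preserved) | 21 | 3.0%
(not preserved) | 21 | 3.0%
(not preserved) | 18 | 2.6%
(not preserved) | 17 | 2.4%
(not preserved) | 16 | 2.3%
Other values (188) | 441 | 63.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 691 | 99.4%
Uppercase Letter | 3 | 0.4%
Space Separator | 1 | 0.1%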

Most frequent character per category

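Other Letter
Value | Count | Frequency (%)
(not preserved) | 38 | 5.5%
(not preserved) | 35 | 5.1%
(not preserved) | 32 | 4.6%
(not preserved) | 31 | 4.5%
(not preserved) | 25 | 3.6%
(not preserved) | 21 | 3.0%
(not preserved) | 21 | 3.0%
(not preserved) | 18 | 2.6%
(not preserved) | 17 | 2.5%
(not preserved) | 16 | 2.3%
Other values (184) | 437 | 63.2%

Uppercase Letter
Value | Count | Frequency (%)
F | 1 | 33.3%
S | 1 | 33.3%
R | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 691 | 99.4%
Latin | 3 | 0.4%
Common | 1 | 0.1%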

Most frequent character per script

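Hangul
Value | Count | Frequency (%)
(not preserved) | 38 | 5.5%
(not preserved) | 35 | 5.1%
(not preserved) | 32 | 4.6%
(not preserved) | 31 | 4.5%
(not preserved) | 25 | 3.6%
(not preserved) | 21 | 3.0%
(not preserved) | 21 | 3.0%
(not preserved) | 18 | 2.6%
(not preserved) | 17 | 2.5%
(not preserved) | 16 | 2.3%
Other values (184) | 437 | 63.2%

Latin
Value | Count | Frequency (%)
F | 1 | 33.3%
S | 1 | 33.3%
R | 1 | 33.3%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 691 | 99.4%
ASCII | 4 | 0.6%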

Most frequent character per block

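Hangul
Value | Count | Frequency (%)
(not preserved) | 38 | 5.5%
(not preserved) | 35 | 5.1%
(not preserved) | 32 | 4.6%
(not preserved) | 31 | 4.5%
(not preserved) | 25 | 3.6%
(not preserved) | 21 | 3.0%
(not preserved) | 21 | 3.0%
(not preserved) | 18 | 2.6%
(not preserved) | 17 | 2.5%
(not preserved) | 16 | 2.3%
Other values (184) | 437 | 63.2%

ASCII
Value | Count | Frequency (%)
F | 1 | 25.0%
S | 1 | 25.0%
(space) | 1 | 25.0%
R | 1 | 25.0%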

ENG_DISS_NM
Text

Distinct: 115
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 44
Median length: 28
Mean length: 18.517241
Min length: 6

Characters and Unicode

Total characters: 2148
Distinct characters: 52
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 114
Unique (%): 98.3%

Sample

1st row: Swine erysipelas
2nd row: Porcine rotavirus infection
3rd row: Porcine reproductive and respiratory syndrom
4th row: Swine vesicular disease
5th row: PCV-2 infection

Value | Count | Frequency (%)
disease | 19 | 7.6%
infection | 14 | 5.6%
bovine | 8 | 3.2%
fever | 7 | 2.8%
porcine | 6 | 2.4%
swine | 5 | 2.0%
infectious | 4 | 1.6%
viral | 4 | 1.6%
fowl | 3 | 1.2%
respiratory | 3 | 1.2%
Other values (154) | 178 | 70.9%

Most occurring characters

Value | Count | Frequency (%)
i | 244 | 11.4%
e | 214 | 10.0%
s | 195 | 9.1%
o | 164 | 7.6%
a | 138 | 6.4%
n | 136 | 6.3%
(space) | 135 | 6.3%
r | 125 | 5.8%
t | 98 | 4.6%
l | 79 | 3.7%
Other values (42) | 620 | 28.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1874 | 87.2%
Space Separator | 135 | 6.3%
Uppercase Letter | 130 | 6.1%
Other Punctuation | 7 | 0.3%
Dash Punctuation | 1 | < 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 244 | 13.0%
e | 214 | 11.4%
s | 195 | 10.4%
o | 164 | 8.8%
a | 138 | 7.4%
n | 136 | 7.3%
r | 125 | 6.7%
t | 98 | 5.2%
l | 79 | 4.2%
c | 77 | 4.1%
Other values (16) | 404 | 21.6%

Uppercase Letter
Value | Count | Frequency (%)
C | 18 | 13.8%
A | 16 | 12.3%
P | 14 | 10.8%
B | 12 | 9.2%
S | 11 | 8.5%
E | 8 | 6.2%
R | 7 | 5.4%
F | 7 | 5.4%
L | 7 | 5.4%
I | 6 | 4.6%
Other values (9) | 24 | 18.5%

Other Punctuation
Value | Count | Frequency (%)
' | 4 | 57.1%
: | 1 | 14.3%
. | 1 | 14.3%
, | 1 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 135 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2004 | 93.3%
Common | 144 | 6.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 244 | 12.2%
e | 214 | 10.7%
s | 195 | 9.7%
o | 164 | 8.2%
a | 138 | 6.9%
n | 136 | 6.8%
r | 125 | 6.2%
t | 98 | 4.9%
l | 79 | 3.9%
c | 77 | 3.8%
Other values (35) | 534 | 26.6%

Common
Value | Count | Frequency (%)
(space) | 135 | 93.8%
' | 4 | 2.8%
- | 1 | 0.7%
: | 1 | 0.7%
. | 1 | 0.7%
, | 1 | 0.7%
2 | 1 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2148 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 244 | 11.4%
e | 214 | 10.0%
s | 195 | 9.1%
o | 164 | 7.6%
a | 138 | 6.4%
n | 136 | 6.3%
(space) | 135 | 6.3%
r | 125 | 5.8%
t | 98 | 4.6%
l | 79 | 3.7%
Other values (42) | 620 | 28.9%

INFO_OFFER_NM
Categorical

Distinct: 42
Distinct (%): 36.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

정우석: 9
예정용: 8
이희수: 7
최강석: 7
양동군: 5
Other values (37): 80

Length

Max length: 3
Median length: 3
Mean length: 2.9482759
Min length: 2

Unique

Unique: 9
Unique (%): 7.8%

Sample

1st row: 문진산
2nd row: 김병한
3rd row: 배유찬
4th row: 박종현
5th row: 양동군

Common Values

Value | Count | Frequency (%)
정우석 | 9 | 7.8%
예정용 | 8 | 6.9%
이희수 | 7 | 6.0%
최강석 | 7 | 6.0%
양동군 | 5 | 4.3%
장환 | 4 | 3.4%
박종현 | 4 | 3.4%
최세은 | 4 | 3.4%
조성준 | 4 | 3.4%
강현미 | 3 | 2.6%
Other values (32) | 61 | 52.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
정우석 | 9 | 7.8%
예정용 | 8 | 6.9%
이희수 | 7 | 6.0%
최강석 | 7 | 6.0%
양동군 | 5 | 4.3%
장환 | 4 | 3.4%
박종현 | 4 | 3.4%
최세은 | 4 | 3.4%
조성준 | 4 | 3.4%
정병열 | 3 | 2.6%
Other values (32) | 61 | 52.6%

RGSDE
Categorical

IMBALANCE 

Distinct: 8
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

2007-09-18: 103
2007-10-25: 4
2007-11-07: 3
2007-06-04: 2
2007-06-12: 1
Other values (3): 3

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 4
Unique (%): 3.4%
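
The report lists both a Distinct and a Unique count without defining them. A reading consistent with the RGSDE figures is that Distinct counts the different values present, while Unique counts the values that occur exactly once. The sketch below checks that reading against the value counts shown under Common Values; the interpretation is an assumption, not a definition taken from the report.

```python
from collections import Counter

n_rows = 116

# RGSDE value counts as listed in this report's Common Values table.
rgsde_counts = Counter({
    "2007-09-18": 103, "2007-10-25": 4, "2007-11-07": 3, "2007-06-04": 2,
    "2007-06-12": 1, "2007-06-07": 1, "2007-05-28": 1, "2007-10-30": 1,
})

distinct = len(rgsde_counts)                                # different values present
unique = sum(1 for c in rgsde_counts.values() if c == 1)   # values seen exactly once

print(distinct, f"{distinct / n_rows:.1%}")
print(unique, f"{unique / n_rows:.1%}")
```

Both percentages are taken relative to the 116 observations, not to the number of distinct values.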

Sample

1st row: 2007-09-18
2nd row: 2007-09-18
3rd row: 2007-09-18
4th row: 2007-09-18
5th row: 2007-09-18

Common Values

Value | Count | Frequency (%)
2007-09-18 | 103 | 88.8%
2007-10-25 | 4 | 3.4%
2007-11-07 | 3 | 2.6%
2007-06-04 | 2 | 1.7%
2007-06-12 | 1 | 0.9%
2007-06-07 | 1 | 0.9%
2007-05-28 | 1 | 0.9%
2007-10-30 | 1 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)


MAIN_INFC_ANIMAL
Text

Distinct: 51
Distinct (%): 44.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 38
Median length: 36
Mean length: 6.3534483
Min length: 1

Characters and Unicode

Total characters: 737
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 31.0%

Sample

1st row: 돼지
2nd row: 돼지
3rd row: 돼지
4th row: 돼지
5th row: 돼지

Value | Count | Frequency (%)
돼지 | 19 | 13.8%
(not preserved) | 18 | 13.0%
(not preserved) | 8 | 5.8%
미분류 | 8 | 5.8%
미분류,돼지 | 6 | 4.3%
야생조류-기타 | 4 | 2.9%
소,기타 | 4 | 2.9%
소,돼지 | 4 | 2.9%
쥐-랫트 | 4 | 2.9%
(not preserved) | 4 | 2.9%
Other values (47) | 59 | 42.8%

Most occurring characters

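Value | Count | Frequency (%)
, | 132 | 17.9%
(not preserved) | 50 | 6.8%
(not preserved) | 41 | 5.6%
(not preserved) | 41 | 5.6%
(not preserved) | 41 | 5.6%
(not preserved) | 34 | 4.6%
(not preserved) | 31 | 4.2%
(not preserved) | 29 | 3.9%
(not preserved) | 29 | 3.9%
- | 24 | 3.3%
Other values (36) | 285 | 38.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 559 | 75.8%
Other Punctuation | 132 | 17.9%
Dash Punctuation | 24 | 3.3%
Space Separator | 22 | 3.0%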

Most frequent character per category

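Other Letter
Value | Count | Frequency (%)
(not preserved) | 50 | 8.9%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 34 | 6.1%
(not preserved) | 31 | 5.5%
(not preserved) | 29 | 5.2%
(not preserved) | 29 | 5.2%
(not preserved) | 22 | 3.9%
(not preserved) | 22 | 3.9%
Other values (33) | 219 | 39.2%

Other Punctuation
Value | Count | Frequency (%)
, | 132 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 24 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 22 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 559 | 75.8%
Common | 178 | 24.2%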

Most frequent character per script

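Hangul
Value | Count | Frequency (%)
(not preserved) | 50 | 8.9%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 34 | 6.1%
(not preserved) | 31 | 5.5%
(not preserved) | 29 | 5.2%
(not preserved) | 29 | 5.2%
(not preserved) | 22 | 3.9%
(not preserved) | 22 | 3.9%
Other values (33) | 219 | 39.2%

Common
Value | Count | Frequency (%)
, | 132 | 74.2%
- | 24 | 13.5%
(space) | 22 | 12.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 559 | 75.8%
ASCII | 178 | 24.2%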

Most frequent character per block

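ASCII
Value | Count | Frequency (%)
, | 132 | 74.2%
- | 24 | 13.5%
(space) | 22 | 12.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 50 | 8.9%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 41 | 7.3%
(not preserved) | 34 | 6.1%
(not preserved) | 31 | 5.5%
(not preserved) | 29 | 5.2%
(not preserved) | 29 | 5.2%
(not preserved) | 22 | 3.9%
(not preserved) | 22 | 3.9%
Other values (33) | 219 | 39.2%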

CAUSE_CMMN_CL
Categorical

Distinct: 6
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

기타: 50
바이러스: 28
세균: 25
기생충: 9
곰팡이: 3

Length

Max length: 7
Median length: 2
Mean length: 2.6293103
Min length: 2
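
The length statistics for CAUSE_CMMN_CL can be reproduced from the value counts listed under Common Values, assuming length means the number of characters (code points) in each value. The sketch below expands the counts into one length per row and recomputes the four figures; that interpretation of length is an assumption that happens to match the reported numbers.

```python
from statistics import mean, median

# CAUSE_CMMN_CL value counts as listed in this report's Common Values table.
cause_counts = {
    "기타": 50, "바이러스": 28, "세균": 25,
    "기생충": 9, "곰팡이": 3, "기타,바이러스": 1,
}

# One character length per row: each value's length repeated by its count.
lengths = [len(value) for value, count in cause_counts.items() for _ in range(count)]

print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
```

The single seven-character value is the combined category 기타,바이러스, which is also the only value that occurs once.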

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: 세균
2nd row: 바이러스
3rd row: 바이러스
4th row: 기타
5th row: 바이러스

Common Values

Value | Count | Frequency (%)
기타 | 50 | 43.1%
바이러스 | 28 | 24.1%
세균 | 25 | 21.6%
기생충 | 9 | 7.8%
곰팡이 | 3 | 2.6%
기타,바이러스 | 1 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)


Interactions


Correlations

Variable | DISS_NO | INFO_OFFER_NM | RGSDE | MAIN_INFC_ANIMAL | CAUSE_CMMN_CL
DISS_NO | 1.000 | 0.448 | 0.473 | 0.840 | 0.214
INFO_OFFER_NM | 0.448 | 1.000 | 0.000 | 0.847 | 0.694
RGSDE | 0.473 | 0.000 | 1.000 | 0.000 | 0.246
MAIN_INFC_ANIMAL | 0.840 | 0.847 | 0.000 | 1.000 | 0.000
CAUSE_CMMN_CL | 0.214 | 0.694 | 0.246 | 0.000 | 1.000

Variable | RGSDE | CAUSE_CMMN_CL | INFO_OFFER_NM
RGSDE | 1.000 | 0.136 | 0.000
CAUSE_CMMN_CL | 0.136 | 1.000 | 0.291
INFO_OFFER_NM | 0.000 | 0.291 | 1.000

Variable | DISS_NO | INFO_OFFER_NM | RGSDE | CAUSE_CMMN_CL
DISS_NO | 1.000 | 0.150 | 0.244 | 0.120
INFO_OFFER_NM | 0.150 | 1.000 | 0.000 | 0.291
RGSDE | 0.244 | 0.000 | 1.000 | 0.136
CAUSE_CMMN_CL | 0.120 | 0.291 | 0.136 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | DISS_NO | DISS_NM | ENG_DISS_NM | INFO_OFFER_NM | RGSDE | MAIN_INFC_ANIMAL | CAUSE_CMMN_CL
0 | 28 | 돼지단독 | Swine erysipelas | 문진산 | 2007-09-18 | 돼지 | 세균
1 | 29 | 돼지로타바이러스감염증 | Porcine rotavirus infection | 김병한 | 2007-09-18 | 돼지 | 바이러스
2 | 30 | 돼지생식기호흡기증후군 | Porcine reproductive and respiratory syndrom | 배유찬 | 2007-09-18 | 돼지 | 바이러스
3 | 31 | 돼지수포병 | Swine vesicular disease | 박종현 | 2007-09-18 | 돼지 | 기타
4 | 32 | 돼지써코바이러스감염증 | PCV-2 infection | 양동군 | 2007-09-18 | 돼지 | 바이러스
5 | 33 | 돼지유행성설사병 | Porcine epidemic diarrhea | 현방훈 | 2007-09-18 | 돼지 | 바이러스
6 | 34 | 돼지적리 | Swine dysentery | 임숙경 | 2007-09-18 | 돼지 | 세균
7 | 35 | 돼지열병 | Classical swine fever | 송재영 | 2007-09-18 | 돼지 | 바이러스
8 | 36 | 돼지파보바이러스감염증 | Porcine parvovirus infection | 김성희 | 2007-09-18 | 돼지 | 기타
9 | 37 | 돼지호흡기코로나바이러스감염증 | Porcine respiratory coronaviral infection | 노인순 | 2007-09-18 | 돼지 | 바이러스

Last rows

Index | DISS_NO | DISS_NM | ENG_DISS_NM | INFO_OFFER_NM | RGSDE | MAIN_INFC_ANIMAL | CAUSE_CMMN_CL
106 | 18 | 네오스포라병 | Neosporosis | 정우석 | 2007-09-18 | 소,개,면양,기타 미분류 | 기생충
107 | 19 | 노제마병 | Nosema disease | 장환 | 2007-09-18 | 벌-꿀벌 | 기타
108 | 20 | 뇌척수염 | Avian encephalomyelitis | 이윤정 | 2007-09-18 | 닭,꿩,칠면조,메추리 | 바이러스
109 | 21 | 뉴캣슬병 | Newcastle disease | 최강석 | 2007-09-18 | 닭,꿩,메추리 | 기타
110 | 22 | 니파바이러스감염증 | Nipahvirus infection | 나진주 | 2007-09-18 | 고양이,개,돼지,산양,면양,쥐-랫트 | 기타
111 | 23 | 닭세망내피증 | Reticuloendotheliosis | 최강석 | 2007-09-18 | 닭,오리,칠면조 | 바이러스
112 | 24 | 닭콕시듐증 | Coccidiosis | 장환 | 2007-06-04 |  | 기타
113 | 25 | 대장균증 | Colibacillosis | 이희수 | 2007-09-18 | 소,돼지 | 세균
114 | 26 | 돼지게타바이러스감염증 | Porcine getahvirus disease | 최강석 | 2007-09-18 | 소,돼지,쥐-랫트 | 바이러스
115 | 27 | 돼지뇌심근염 | Encephalomyocarditis | 김성희 | 2007-09-18 | 돼지 | 기타,바이러스