Overview

Dataset statistics

Number of variables: 6
Number of observations: 1591
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 77.8 KiB
Average record size in memory: 50.1 B
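
The average record size appears to be simply the total in-memory size divided by the number of observations. A minimal check of that arithmetic, using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_kib = 77.8   # total size in memory, KiB
n_rows = 1591      # number of observations

total_bytes = total_kib * 1024            # KiB -> bytes
avg_record_bytes = total_bytes / n_rows   # bytes per row

print(f"{avg_record_bytes:.1f} B per record")
```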

Variable types

Numeric: 2
Text: 1
Categorical: 2
DateTime: 1

Dataset

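Description: Livestock farming status in Gimcheon-si, Gyeongsangbuk-do, provided as an Excel-compatible CSV file with the farm address (down to the eup/myeon/dong level), livestock type, number of animals raised, and related fields.
Author: Gimcheon-si, Gyeongsangbuk-do (경상북도 김천시)
URL: https://www.data.go.kr/data/15034364/fileData.do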

Alerts

데이터 기준일자 (data reference date) has constant value "" [Constant]
연번 (serial number) is highly overall correlated with 농장주소 (farm address) [High correlation]
농장주소 is highly overall correlated with 연번 [High correlation]
축종 (livestock type) is highly imbalanced (64.3%) [Imbalance]
연번 has unique values [Unique]
사육두수 (number of animals raised) has 25 (1.6%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-11 23:39:47.379161
Analysis finished: 2023-12-11 23:39:48.347478
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

Alerts: HIGH CORRELATION, UNIQUE

Distinct: 1591
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 796
Minimum: 1
Maximum: 1591
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 80.5
Q1: 398.5
Median: 796
Q3: 1193.5
95-th percentile: 1511.5
Maximum: 1591
Range: 1590
Interquartile range (IQR): 795
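
The half-integer quantiles above are what linear interpolation between neighbouring order statistics produces for the sequence 1..1591. The sketch below assumes that interpolation rule (it is not confirmed by the report itself, though it does reproduce the listed values); `quantile` is a helper written for this illustration.

```python
import math

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)          # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

# 연번 is strictly increasing from 1 to 1591, so it can be rebuilt directly.
serial = list(range(1, 1592))

p5 = quantile(serial, 0.05)
q1 = quantile(serial, 0.25)
median = quantile(serial, 0.50)
q3 = quantile(serial, 0.75)
p95 = quantile(serial, 0.95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```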

Descriptive statistics

Standard deviation: 459.42645
Coefficient of variation (CV): 0.57716891
Kurtosis: -1.2
Mean: 796
Median Absolute Deviation (MAD): 398
Skewness: 0
Sum: 1266436
Variance: 211072.67
Monotonicity: Strictly increasing

[Figure: histogram with fixed-size bins (bins=50)]
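
Because 연번 is just the integers 1 to 1591, its descriptive statistics can be re-derived without the original file. The following sketch uses Python's standard `statistics` module and assumes the report uses the sample (n - 1) variance, which is the assumption under which the numbers agree.

```python
import math
import statistics

serial = list(range(1, 1592))   # 연번: 1, 2, ..., 1591
n = len(serial)

total = sum(serial)                       # closed form: n(n+1)/2
mean = statistics.mean(serial)            # closed form: (n+1)/2
variance = statistics.variance(serial)    # sample variance, closed form: n(n+1)/12
std = math.sqrt(variance)
cv = std / mean                           # coefficient of variation
med = statistics.median(serial)
mad = statistics.median(abs(x - med) for x in serial)   # median absolute deviation

print(total, mean, round(variance, 2), round(std, 5), round(cv, 8), mad)
```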

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
1070 | 1 | 0.1%
1068 | 1 | 0.1%
1067 | 1 | 0.1%
1066 | 1 | 0.1%
1065 | 1 | 0.1%
1064 | 1 | 0.1%
1063 | 1 | 0.1%
1062 | 1 | 0.1%
1061 | 1 | 0.1%
Other values (1581) | 1581 | 99.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1591 | 1 | 0.1%
1590 | 1 | 0.1%
1589 | 1 | 0.1%
1588 | 1 | 0.1%
1587 | 1 | 0.1%
1586 | 1 | 0.1%
1585 | 1 | 0.1%
1584 | 1 | 0.1%
1583 | 1 | 0.1%
1582 | 1 | 0.1%

농장주 (farm owner)
Text

Distinct: 68
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 12.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 4773
Distinct characters: 69
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
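
As a rough illustration of how such character properties can be read off, the sketch below uses Python's standard `unicodedata` module on the five sample names shown in this report. The general category comes straight from the Unicode database; the "script" is approximated by the first word of the character name, which is a shortcut assumed for this example and not how the report necessarily derives it.

```python
import unicodedata
from collections import Counter

def char_profile(values):
    """Count Unicode general categories and (approximate) scripts over all characters."""
    categories, scripts = Counter(), Counter()
    for value in values:
        for ch in value:
            categories[unicodedata.category(ch)] += 1          # e.g. 'Lu', 'Lo'
            scripts[unicodedata.name(ch).split()[0]] += 1      # e.g. 'LATIN', 'HANGUL'
    return categories, scripts

# First five masked owner names from the sample rows.
sample = ["김OO", "황OO", "김OO", "이OO", "육OO"]
categories, scripts = char_profile(sample)

print(categories)   # 'Lu' = Uppercase Letter, 'Lo' = Other Letter
print(scripts)
```

On this sample two thirds of the characters are uppercase Latin letters and one third are Hangul syllables, the same 66.7% / 33.3% split the report shows for the full column.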

Unique

Unique: 16
Unique (%): 1.0%

Sample

1st row: 김OO
2nd row: 황OO
3rd row: 김OO
4th row: 이OO
5th row: 육OO

Value | Count | Frequency (%)
김oo | 372 | 23.4%
이oo | 277 | 17.4%
박oo | 134 | 8.4%
정oo | 89 | 5.6%
최oo | 82 | 5.2%
전oo | 57 | 3.6%
강oo | 45 | 2.8%
백oo | 34 | 2.1%
문oo | 31 | 1.9%
조oo | 30 | 1.9%
Other values (58) | 440 | 27.7%

Most occurring characters

Value | Count | Frequency (%)
O | 3182 | 66.7%
김 | 372 | 7.8%
이 | 277 | 5.8%
박 | 134 | 2.8%
정 | 89 | 1.9%
최 | 82 | 1.7%
전 | 57 | 1.2%
강 | 45 | 0.9%
백 | 34 | 0.7%
문 | 31 | 0.6%
Other values (59) | 470 | 9.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 3183 | 66.7%
Other Letter | 1590 | 33.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
김 | 372 | 23.4%
이 | 277 | 17.4%
박 | 134 | 8.4%
정 | 89 | 5.6%
최 | 82 | 5.2%
전 | 57 | 3.6%
강 | 45 | 2.8%
백 | 34 | 2.1%
문 | 31 | 1.9%
조 | 30 | 1.9%
Other values (57) | 439 | 27.6%

Uppercase Letter
Value | Count | Frequency (%)
O | 3182 | > 99.9%
D | 1 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3183 | 66.7%
Hangul | 1590 | 33.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
김 | 372 | 23.4%
이 | 277 | 17.4%
박 | 134 | 8.4%
정 | 89 | 5.6%
최 | 82 | 5.2%
전 | 57 | 3.6%
강 | 45 | 2.8%
백 | 34 | 2.1%
문 | 31 | 1.9%
조 | 30 | 1.9%
Other values (57) | 439 | 27.6%

Latin
Value | Count | Frequency (%)
O | 3182 | > 99.9%
D | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3183 | 66.7%
Hangul | 1590 | 33.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
O | 3182 | > 99.9%
D | 1 | < 0.1%

Hangul
Value | Count | Frequency (%)
김 | 372 | 23.4%
이 | 277 | 17.4%
박 | 134 | 8.4%
정 | 89 | 5.6%
최 | 82 | 5.2%
전 | 57 | 3.6%
강 | 45 | 2.8%
백 | 34 | 2.1%
문 | 31 | 1.9%
조 | 30 | 1.9%
Other values (57) | 439 | 27.6%

농장주소 (farm address)
Categorical

Alerts: HIGH CORRELATION

Distinct: 25
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 12.6 KiB

Value | Count
경상북도 김천시 대덕면 | 205
경상북도 김천시 감문면 | 183
경상북도 김천시 조마면 | 145
경상북도 김천시 구성면 | 132
경상북도 김천시 지례면 | 110
Other values (20) | 816

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 경상북도 김천시 어모면
2nd row: 경상북도 김천시 개령면
3rd row: 경상북도 김천시 남면
4th row: 경상북도 김천시 감문면
5th row: 경상북도 김천시 대항면

Common Values

Value | Count | Frequency (%)
경상북도 김천시 대덕면 | 205 | 12.9%
경상북도 김천시 감문면 | 183 | 11.5%
경상북도 김천시 조마면 | 145 | 9.1%
경상북도 김천시 구성면 | 132 | 8.3%
경상북도 김천시 지례면 | 110 | 6.9%
경상북도 김천시 개령면 | 105 | 6.6%
경상북도 김천시 증산면 | 99 | 6.2%
경상북도 김천시 아포읍 | 98 | 6.2%
경상북도 김천시 어모면 | 88 | 5.5%
경상북도 김천시 양천동 | 63 | 4.0%
Other values (15) | 363 | 22.8%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
경상북도 | 1591 | 33.3%
김천시 | 1591 | 33.3%
대덕면 | 205 | 4.3%
감문면 | 183 | 3.8%
조마면 | 145 | 3.0%
구성면 | 132 | 2.8%
지례면 | 110 | 2.3%
개령면 | 105 | 2.2%
증산면 | 99 | 2.1%
아포읍 | 98 | 2.1%
Other values (17) | 514 | 10.8%

축종 (livestock type)
Categorical

Alerts: IMBALANCE

Distinct: 12
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 12.6 KiB

Value | Count
한우 | 1269
산양 | 79
종계/산란계 | 60
젖소 | 55
돼지 | 49
Other values (7) | 79

Length

Max length: 6
Median length: 2
Mean length: 2.1533627
Min length: 2

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: 한우
2nd row: 한우
3rd row: 한우
4th row: 한우
5th row: 한우

Common Values

Value | Count | Frequency (%)
한우 | 1269 | 79.8%
산양 | 79 | 5.0%
종계/산란계 | 60 | 3.8%
젖소 | 55 | 3.5%
돼지 | 49 | 3.1%
육계 | 47 | 3.0%
육우 | 12 | 0.8%
염소 | 9 | 0.6%
사슴 | 8 | 0.5%
오리 | 1 | 0.1%
Other values (2) | 2 | 0.1%
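
The 64.3% in the Imbalance alert can be reproduced from these counts with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k the number of classes. The sketch below assumes that definition (it matches the reported figure, but the report itself does not spell it out). The two unnamed categories under "Other values (2)" are taken to have one row each, and their labels here are placeholders.

```python
import math

# Category counts from the common values table; the last two labels are placeholders.
counts = {
    "한우": 1269, "산양": 79, "종계/산란계": 60, "젖소": 55, "돼지": 49, "육계": 47,
    "육우": 12, "염소": 9, "사슴": 8, "오리": 1, "other_1": 1, "other_2": 1,
}

def imbalance_score(class_counts):
    """1 - (Shannon entropy / maximum entropy); 0 = perfectly balanced, 1 = single class."""
    total = sum(class_counts)
    entropy = -sum(c / total * math.log2(c / total) for c in class_counts)
    return 1 - entropy / math.log2(len(class_counts))

score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```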

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
한우 | 1269 | 79.8%
산양 | 79 | 5.0%
종계/산란계 | 60 | 3.8%
젖소 | 55 | 3.5%
돼지 | 49 | 3.1%
육계 | 47 | 3.0%
육우 | 12 | 0.8%
염소 | 9 | 0.6%
사슴 | 8 | 0.5%
오리 | 1 | 0.1%
Other values (2) | 2 | 0.1%

사육두수 (number of animals raised)
Real number (ℝ)

Alerts: ZEROS

Distinct: 209
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4008.4532
Minimum: 0
Maximum: 370000
Zeros: 25
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 8
Median: 23
Q3: 80
95-th percentile: 12000
Maximum: 370000
Range: 370000
Interquartile range (IQR): 72

Descriptive statistics

Standard deviation: 23377.187
Coefficient of variation (CV): 5.8319721
Kurtosis: 116.85231
Mean: 4008.4532
Median Absolute Deviation (MAD): 20
Skewness: 9.654533
Sum: 6377449
Variance: 5.4649288 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
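
Several of the 사육두수 figures are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, IQR and share of zeros from the other reported values only; nothing here touches the underlying data, and the variable names are illustrative.

```python
# Reported values for 사육두수, copied from the statistics above.
n_rows = 1591
total = 6377449
std = 23377.187
q1, q3 = 8, 80
zeros = 25

mean = total / n_rows        # Sum / number of observations
cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
iqr = q3 - q1                # interquartile range
zero_share = zeros / n_rows  # fraction of rows with zero animals

print(f"mean={mean:.4f} cv={cv:.7f} variance={variance:.7e} iqr={iqr} zeros={zero_share:.1%}")
```

A CV near 5.8 together with a median of 23 against a mean of about 4008 indicates that a handful of very large farms (the maximum 10 values reach 370000) dominate the mean.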

Common values

Value | Count | Frequency (%)
2 | 85 | 5.3%
10 | 78 | 4.9%
20 | 62 | 3.9%
30 | 59 | 3.7%
3 | 55 | 3.5%
1 | 54 | 3.4%
4 | 52 | 3.3%
5 | 45 | 2.8%
40 | 36 | 2.3%
15 | 35 | 2.2%
Other values (199) | 1030 | 64.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 25 | 1.6%
1 | 54 | 3.4%
2 | 85 | 5.3%
3 | 55 | 3.5%
4 | 52 | 3.3%
5 | 45 | 2.8%
6 | 29 | 1.8%
7 | 32 | 2.0%
8 | 32 | 2.0%
9 | 34 | 2.1%

Maximum 10 values

Value | Count | Frequency (%)
370000 | 1 | 0.1%
350000 | 1 | 0.1%
340000 | 1 | 0.1%
250000 | 1 | 0.1%
230000 | 1 | 0.1%
200000 | 1 | 0.1%
175000 | 1 | 0.1%
150000 | 1 | 0.1%
146000 | 1 | 0.1%
130000 | 2 | 0.1%

데이터 기준일자 (data reference date)
Date

Alerts: CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.6 KiB
Minimum: 2023-10-18 00:00:00
Maximum: 2023-10-18 00:00:00

[Figure: histogram with fixed-size bins (bins=1)]

Interactions

[Figure: interaction plots (4 panels)]

Correlations

[Figure: correlation heatmap]

Variable | 연번 | 농장주 | 농장주소 | 축종 | 사육두수
연번 | 1.000 | 0.391 | 0.971 | 0.209 | 0.077
농장주 | 0.391 | 1.000 | 0.464 | 0.544 | 0.441
농장주소 | 0.971 | 0.464 | 1.000 | 0.414 | 0.000
축종 | 0.209 | 0.544 | 0.414 | 1.000 | 0.596
사육두수 | 0.077 | 0.441 | 0.000 | 0.596 | 1.000

[Figure: correlation heatmap]

Variable | 농장주소 | 축종
농장주소 | 1.000 | 0.149
축종 | 0.149 | 1.000

[Figure: correlation heatmap]

Variable | 연번 | 사육두수 | 농장주소 | 축종
연번 | 1.000 | 0.029 | 0.805 | 0.089
사육두수 | 0.029 | 1.000 | 0.000 | 0.300
농장주소 | 0.805 | 0.000 | 1.000 | 0.149
축종 | 0.089 | 0.300 | 0.149 | 1.000

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

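First rows

(index) | 연번 | 농장주 | 농장주소 | 축종 | 사육두수 | 데이터 기준일자
0 | 1 | 김OO | 경상북도 김천시 어모면 | 한우 | 9 | 2023-10-18
1 | 2 | 황OO | 경상북도 김천시 개령면 | 한우 | 8 | 2023-10-18
2 | 3 | 김OO | 경상북도 김천시 남면 | 한우 | 6 | 2023-10-18
3 | 4 | 이OO | 경상북도 김천시 감문면 | 한우 | 18 | 2023-10-18
4 | 5 | 육OO | 경상북도 김천시 대항면 | 한우 | 80 | 2023-10-18
5 | 6 | 정OO | 경상북도 김천시 대덕면 | 한우 | 315 | 2023-10-18
6 | 7 | 이OO | 경상북도 김천시 대덕면 | 한우 | 1 | 2023-10-18
7 | 8 | 임OO | 경상북도 김천시 구성면 | 한우 | 3 | 2023-10-18
8 | 9 | 이OO | 경상북도 김천시 구성면 | 한우 | 2 | 2023-10-18
9 | 10 | 김OO | 경상북도 김천시 아포읍 | 한우 | 2 | 2023-10-18

Last rows

(index) | 연번 | 농장주 | 농장주소 | 축종 | 사육두수 | 데이터 기준일자
1581 | 1582 | 박OO | 경상북도 김천시 지례면 | 한우 | 125 | 2023-10-18
1582 | 1583 | 김OO | 경상북도 김천시 지례면 | 한우 | 35 | 2023-10-18
1583 | 1584 | 서OO | 경상북도 김천시 지례면 | 한우 | 16 | 2023-10-18
1584 | 1585 | 이OO | 경상북도 김천시 지례면 | 한우 | 40 | 2023-10-18
1585 | 1586 | 김OO | 경상북도 김천시 지례면 | 한우 | 0 | 2023-10-18
1586 | 1587 | 김OO | 경상북도 김천시 지좌동 | 한우 | 40 | 2023-10-18
1587 | 1588 | 이OO | 경상북도 김천시 지좌동 | 한우 | 30 | 2023-10-18
1588 | 1589 | 김OO | 경상북도 김천시 지좌동 | 육계 | 35000 | 2023-10-18
1589 | 1590 | 한OO | 경상북도 김천시 지좌동 | 산양 | 3 | 2023-10-18
1590 | 1591 | 김OO | 경상북도 김천시 지좌동 | 산양 | 20 | 2023-10-18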
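
Since the dataset is distributed as a CSV file, rows like the ones above can be read with Python's standard `csv` module. The sketch below re-types the last five sample rows as inline CSV text and totals 사육두수 per 축종. The comma delimiter and the header order are assumptions based on the column order in this report; the actual file's encoding and layout are not stated here and may differ.

```python
import csv
import io
from collections import defaultdict

# Last five sample rows from this report, re-typed as CSV (assumed layout).
csv_text = """연번,농장주,농장주소,축종,사육두수,데이터 기준일자
1587,김OO,경상북도 김천시 지좌동,한우,40,2023-10-18
1588,이OO,경상북도 김천시 지좌동,한우,30,2023-10-18
1589,김OO,경상북도 김천시 지좌동,육계,35000,2023-10-18
1590,한OO,경상북도 김천시 지좌동,산양,3,2023-10-18
1591,김OO,경상북도 김천시 지좌동,산양,20,2023-10-18
"""

# Sum the number of animals per livestock type.
heads_by_type = defaultdict(int)
for row in csv.DictReader(io.StringIO(csv_text)):
    heads_by_type[row["축종"]] += int(row["사육두수"])

for livestock_type, heads in heads_by_type.items():
    print(livestock_type, heads)
```

For the real file, `io.StringIO(csv_text)` would be replaced by an opened file handle with whatever encoding the download actually uses.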