Overview

Dataset statistics

Number of variables: 6
Number of observations: 1722
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 166
Duplicate rows (%): 9.6%
Total size in memory: 89.3 KiB
Average record size in memory: 53.1 B
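
The two derived figures above follow from the raw counts. A minimal arithmetic check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics above.
n_rows = 1722
n_duplicate_rows = 166
total_size_kib = 89.3

# Share of duplicate rows, in percent.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size in bytes (assumption: 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 9.6%
print(f"{avg_record_bytes:.1f} B")  # 53.1 B
```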

Variable types

Categorical: 1
Numeric: 5

Dataset

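Description: Lactating sow data from the pig-farming sector of smart livestock farming, provided by Smart Farm Korea (스마트팜코리아), a service of the Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries (농림수산식품교육문화정보원, EPIS).
Author: Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries (농림수산식품교육문화정보원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20210929000000001581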

Alerts

Dataset has 166 (9.6%) duplicate rows [Duplicates]
섭취량 (intake amount) has 56 (3.3%) zeros [Zeros]
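
Both alerts can be checked by hand on any copy of the data. The sketch below uses a few toy rows shaped like the report's sample tables, including one hypothetical zero-intake row. It counts duplicates two ways, because the report does not state here whether "duplicate rows" means distinct duplicated groups or surplus copies.

```python
from collections import Counter

# Toy rows in the column order of the report:
# (농장아이디, 개체 구별 번호, 교배일, 분만일, 설정량, 섭취량)
rows = [
    ("PF_0020440", 397, 20210608, 20210930, 2.5, 2.5),
    ("PF_0020440", 397, 20210608, 20210930, 2.5, 2.5),
    ("PF_0020440", 397, 20210608, 20210930, 2.5, 2.5),
    ("PF_0020239", 200, 20210516, 20210907, 5.2, 4.5),
    ("PF_0020239", 200, 20210516, 20210907, 5.3, 2.1),
    ("PF_0020239", 284, 20210528, 20210919, 0.6, 0.0),  # hypothetical zero-intake row
]

counts = Counter(rows)

# Number of distinct rows that occur more than once.
duplicate_groups = sum(1 for c in counts.values() if c > 1)

# Number of surplus copies (rows that would be dropped by de-duplication).
extra_copies = sum(c - 1 for c in counts.values() if c > 1)

# Rows whose 섭취량 (last column) is exactly zero.
zero_intake = sum(1 for r in rows if r[5] == 0)

print(duplicate_groups, extra_copies, zero_intake)  # 1 2 1
```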

Reproduction

Analysis started: 2022-08-12 14:47:07.098737
Analysis finished: 2022-08-12 14:47:13.703052
Duration: 6.6 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

농장아이디 (farm ID)
Categorical

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 13.6 KiB

Value             Count
PF_0020440        683
PF_0000347_01     456
PF_0021299        215
PF_0020426        138
PF_0021284        127
Other values (2)  103

Length

Max length: 13
Median length: 10
Mean length: 10.79442509
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PF_0020239
2nd row: PF_0020239
3rd row: PF_0020239
4th row: PF_0020239
5th row: PF_0020239

Common Values

Value          Count  Frequency (%)
PF_0020440     683    39.7%
PF_0000347_01  456    26.5%
PF_0021299     215    12.5%
PF_0020426     138    8.0%
PF_0021284     127    7.4%
PF_0020239     84     4.9%
PF_0021283     19     1.1%
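
The mean length reported for this variable (10.79442509) and the frequency percentages follow directly from these counts. A small check, with the counts copied from the table (the dictionary and variable names are illustrative):

```python
# Category counts copied from the Common Values table.
counts = {
    "PF_0020440": 683,
    "PF_0000347_01": 456,
    "PF_0021299": 215,
    "PF_0020426": 138,
    "PF_0021284": 127,
    "PF_0020239": 84,
    "PF_0021283": 19,
}

n_rows = sum(counts.values())

# Mean string length, weighted by how often each farm ID occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / n_rows

# Relative frequency of each category, in percent.
share = {value: count / n_rows * 100 for value, count in counts.items()}

print(n_rows)                         # 1722
print(round(mean_length, 8))          # 10.79442509
print(f"{share['PF_0020440']:.1f}%")  # 39.7%
```

Only PF_0000347_01 has 13 characters, so its 456 rows alone pull the mean above the median length of 10.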

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Value          Count  Frequency (%)
pf_0020440     683    39.7%
pf_0000347_01  456    26.5%
pf_0021299     215    12.5%
pf_0020426     138    8.0%
pf_0021284     127    7.4%
pf_0020239     84     4.9%
pf_0021283     19     1.1%

개체 구별 번호 (individual animal identification number)
Real number (ℝ≥0)

Distinct: 238
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 529.4779326
Minimum: 1
Maximum: 1751
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45
Q1: 180
Median: 386
Q3: 770.5
95-th percentile: 1643.85
Maximum: 1751
Range: 1750
Interquartile range (IQR): 590.5

Descriptive statistics

Standard deviation: 455.4949952
Coefficient of variation (CV): 0.8602719153
Kurtosis: 0.7640417999
Mean: 529.4779326
Median Absolute Deviation (MAD): 234
Skewness: 1.219057434
Sum: 911761
Variance: 207475.6907
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
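
Several of these statistics are simple functions of the others. As a consistency check, the sketch below recomputes the mean, IQR, coefficient of variation, and variance from the reported sum, quartiles, and standard deviation (the variable names are illustrative):

```python
# Figures copied from the statistics for 개체 구별 번호.
n_rows = 1722
total = 911761       # Sum
std = 455.4949952    # Standard deviation
q1, q3 = 180, 770.5  # Quartiles

mean = total / n_rows  # arithmetic mean
iqr = q3 - q1          # interquartile range
cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation

print(round(mean, 7))     # 529.4779326
print(iqr)                # 590.5
print(f"{cv:.6f}")        # 0.860272
print(f"{variance:.2f}")  # 207475.69
```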
Common values

Value               Count  Frequency (%)
533                 18     1.0%
576                 18     1.0%
536                 18     1.0%
45                  17     1.0%
47                  16     0.9%
155                 15     0.9%
92                  15     0.9%
41                  14     0.8%
302                 14     0.8%
26                  12     0.7%
Other values (228)  1565   90.9%

Minimum 10 values

Value  Count  Frequency (%)
1      9      0.5%
10     2      0.1%
13     5      0.3%
15     10     0.6%
25     7      0.4%
26     12     0.7%
27     1      0.1%
38     10     0.6%
39     7      0.4%
41     14     0.8%

Maximum 10 values

Value  Count  Frequency (%)
1751   11     0.6%
1746   8      0.5%
1740   8      0.5%
1738   8      0.5%
1724   10     0.6%
1711   10     0.6%
1672   11     0.6%
1656   10     0.6%
1645   11     0.6%
1622   11     0.6%

교배일 (mating date, stored as a YYYYMMDD integer)
Real number (ℝ≥0)

Distinct: 32
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210539.6
Minimum: 20210503
Maximum: 20210608
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.3 KiB

Quantile statistics

Minimum: 20210503
5-th percentile: 20210510
Q1: 20210517
Median: 20210525
Q3: 20210531
95-th percentile: 20210607
Maximum: 20210608
Range: 105
Interquartile range (IQR): 14
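
Because the dates are stored as YYYYMMDD integers, the range of 105 is a difference between integers, not a number of days. A minimal sketch of the conversion, assuming every value is a valid calendar date in that format (the helper name `parse_yyyymmdd` is hypothetical):

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20210503 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first, last = 20210503, 20210608  # Minimum and Maximum from the report

# Difference between the raw integers, as the report computes it.
integer_range = last - first

# Difference between the actual calendar dates.
range_in_days = (parse_yyyymmdd(last) - parse_yyyymmdd(first)).days

print(integer_range)  # 105
print(range_in_days)  # 36
```

The same caveat applies to the mean, standard deviation, and skewness of this column whenever the values span a month boundary.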

Descriptive statistics

Standard deviation: 35.45733327
Coefficient of variation (CV): 1.754398149 × 10⁻⁶
Kurtosis: -0.4496197337
Mean: 20210539.6
Median Absolute Deviation (MAD): 8
Skewness: 1.182470628
Sum: 3.48025492 × 10¹⁰
Variance: 1257.222483
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=32)]
Common values

Value              Count  Frequency (%)
20210601           137    8.0%
20210517           122    7.1%
20210510           113    6.6%
20210525           113    6.6%
20210602           109    6.3%
20210526           101    5.9%
20210518           97     5.6%
20210511           89     5.2%
20210524           86     5.0%
20210607           72     4.2%
Other values (22)  683    39.7%

Minimum 10 values

Value     Count  Frequency (%)
20210503  2      0.1%
20210508  11     0.6%
20210510  113    6.6%
20210511  89     5.2%
20210512  23     1.3%
20210513  40     2.3%
20210514  18     1.0%
20210515  15     0.9%
20210516  51     3.0%
20210517  122    7.1%

Maximum 10 values

Value     Count  Frequency (%)
20210608  34     2.0%
20210607  72     4.2%
20210606  1      0.1%
20210605  7      0.4%
20210604  5      0.3%
20210603  34     2.0%
20210602  109    6.3%
20210601  137    8.0%
20210531  68     3.9%
20210530  40     2.3%

분만일 (farrowing date, stored as a YYYYMMDD integer)
Real number (ℝ≥0)

Distinct: 30
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210914.95
Minimum: 20210901
Maximum: 20210930
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.3 KiB

Quantile statistics

Minimum: 20210901
5-th percentile: 20210903
Q1: 20210909
Median: 20210916
Q3: 20210922
95-th percentile: 20210929
Maximum: 20210930
Range: 29
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.001161208
Coefficient of variation (CV): 3.958831765 × 10⁻⁷
Kurtosis: -1.063423433
Mean: 20210914.95
Median Absolute Deviation (MAD): 7
Skewness: -0.0210376683
Sum: 3.480319554 × 10¹⁰
Variance: 64.01858068
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=30)]
Common values

Value              Count  Frequency (%)
20210923           131    7.6%
20210924           127    7.4%
20210916           118    6.9%
20210909           118    6.9%
20210917           101    5.9%
20210904           99     5.7%
20210918           94     5.5%
20210903           89     5.2%
20210929           72     4.2%
20210911           69     4.0%
Other values (20)  704    40.9%

Minimum 10 values

Value     Count  Frequency (%)
20210901  36     2.1%
20210902  43     2.5%
20210903  89     5.2%
20210904  99     5.7%
20210905  35     2.0%
20210906  23     1.3%
20210907  47     2.7%
20210908  41     2.4%
20210909  118    6.9%
20210910  65     3.8%

Maximum 10 values

Value     Count  Frequency (%)
20210930  34     2.0%
20210929  72     4.2%
20210928  1      0.1%
20210927  7      0.4%
20210926  7      0.4%
20210925  40     2.3%
20210924  127    7.4%
20210923  131    7.6%
20210922  50     2.9%
20210921  36     2.1%

설정량 (set amount)
Real number (ℝ≥0)

Distinct: 85
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.616550523
Minimum: 0
Maximum: 12.6
Zeros: 10
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 15.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.4
Q1: 2.5
Median: 3.5
Q3: 6.375
95-th percentile: 11.1
Maximum: 12.6
Range: 12.6
Interquartile range (IQR): 3.875

Descriptive statistics

Standard deviation: 2.979706523
Coefficient of variation (CV): 0.6454400333
Kurtosis: -0.2709947362
Mean: 4.616550523
Median Absolute Deviation (MAD): 2
Skewness: 0.8504445008
Sum: 7949.7
Variance: 8.878650965
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value              Count  Frequency (%)
1.5                129    7.5%
3.2                126    7.3%
2.5                104    6.0%
2                  85     4.9%
6                  77     4.5%
11.1               69     4.0%
5                  69     4.0%
3.5                66     3.8%
1                  66     3.8%
3                  66     3.8%
Other values (75)  865    50.2%

Minimum 10 values

Value  Count  Frequency (%)
0      10     0.6%
0.6    2      0.1%
0.7    2      0.1%
1      66     3.8%
1.2    2      0.1%
1.3    2      0.1%
1.4    66     3.8%
1.5    129    7.5%
1.6    18     1.0%
1.7    2      0.1%

Maximum 10 values

Value  Count  Frequency (%)
12.6   1      0.1%
12.2   1      0.1%
12     19     1.1%
11.8   3      0.2%
11.3   3      0.2%
11.1   69     4.0%
11     5      0.3%
10.8   18     1.0%
10.5   15     0.9%
10.4   2      0.1%

섭취량 (intake amount)
Real number (ℝ≥0)

Alerts: Zeros

Distinct: 108
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.813704994
Minimum: 0
Maximum: 51
Zeros: 56
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 15.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.8
Q1: 1.9
Median: 3.2
Q3: 5.3
95-th percentile: 8.7
Maximum: 51
Range: 51
Interquartile range (IQR): 3.4

Descriptive statistics

Standard deviation: 2.691650697
Coefficient of variation (CV): 0.7057836674
Kurtosis: 54.34521503
Mean: 3.813704994
Median Absolute Deviation (MAD): 1.6
Skewness: 3.764906344
Sum: 6567.2
Variance: 7.244983476
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Common values

Value              Count  Frequency (%)
1.5                99     5.7%
2.5                95     5.5%
3.2                94     5.5%
1.6                84     4.9%
2                  79     4.6%
6                  58     3.4%
3.1                57     3.3%
5                  57     3.3%
0                  56     3.3%
5.5                47     2.7%
Other values (98)  996    57.8%

Minimum 10 values

Value  Count  Frequency (%)
0      56     3.3%
0.3    2      0.1%
0.4    4      0.2%
0.5    2      0.1%
0.6    11     0.6%
0.7    7      0.4%
0.8    19     1.1%
0.9    12     0.7%
1      33     1.9%
1.1    2      0.1%

Maximum 10 values

Value  Count  Frequency (%)
51     1      0.1%
12     4      0.2%
11.6   1      0.1%
11.1   9      0.5%
11     6      0.3%
10.8   4      0.2%
10.7   2      0.1%
10.6   1      0.1%
10.5   3      0.2%
10.4   2      0.1%
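
The single value of 51 sits far above the rest of the distribution and likely drives the high kurtosis. One common way to flag such values is Tukey's 1.5 × IQR rule. The report itself does not apply this rule, so the sketch below is an illustrative assumption. It uses `Decimal` so that the boundary value 10.4 is compared exactly.

```python
from decimal import Decimal

# Quartiles copied from the quantile statistics for 섭취량.
q1, q3 = Decimal("1.9"), Decimal("5.3")
iqr = q3 - q1

# Tukey's upper fence (an assumption; not something the report computes).
upper_fence = q3 + Decimal("1.5") * iqr

# (value, count) pairs for the ten largest distinct values in the report.
largest = [
    ("51", 1), ("12", 4), ("11.6", 1), ("11.1", 9), ("11", 6),
    ("10.8", 4), ("10.7", 2), ("10.6", 1), ("10.5", 3), ("10.4", 2),
]

# Keep only the values strictly above the fence.
flagged = [(Decimal(v), c) for v, c in largest if Decimal(v) > upper_fence]
flagged_rows = sum(c for _, c in flagged)

print(iqr)           # 3.4
print(upper_fence)   # 10.40
print(flagged_rows)  # 31
```

Under this rule 31 rows lie above the fence, but only the value 51 also exceeds the largest 설정량 in the report (12.6).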

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
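
The definition above can be written out in a few lines. This is a minimal sketch with made-up data, not the routine the report uses. The helper name `pearson_r` is hypothetical, and the 1/n factors of the covariance and standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of products of deviations; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up example data (not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.9]

r = pearson_r(x, y)

# Shifting and rescaling x (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)

print(math.isclose(r, r_rescaled))  # True
```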

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
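
As a sketch of that recipe: rank both variables (ties receive the average of their positions), then apply the Pearson formula to the ranks. The helper names and the example data are illustrative assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (1/n cancels)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(math.isclose(rho, 1.0), r < 1.0)  # True True
```

The cubic example shows the difference between the two coefficients: the ranks agree perfectly, so ρ is 1, while Pearson's r falls below 1 because the relationship is not a straight line.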

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
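
The formula as stated can be sketched directly by enumerating all pairs. This is the plain variant without a tie correction (often called tau-a). Statistical libraries commonly use a tie-corrected variant instead, so results may differ on data with ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # A pair is concordant when x and y order the two observations the same way.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Made-up data: only the middle two observations are swapped in y.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(round(tau, 4))  # 0.6667
```

Of the six pairs, five are concordant and one is discordant, giving (5 - 1) / 6.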

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution.

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

   농장아이디   개체 구별 번호  교배일    분만일    설정량  섭취량
0  PF_0020239  200  20210516  20210907  5.2  4.5
1  PF_0020239  200  20210516  20210907  5.3  2.1
2  PF_0020239  200  20210516  20210907  4.7  3.5
3  PF_0020239  200  20210516  20210907  5.2  3.4
4  PF_0020239  284  20210528  20210919  0.6  0.6
5  PF_0020239  284  20210528  20210919  1.2  2.6
6  PF_0020239  284  20210528  20210919  1.7  1.8
7  PF_0020239  284  20210528  20210919  1.2  1.6
8  PF_0020239  295  20210524  20210918  1.5  2.2
9  PF_0020239  295  20210524  20210918  2.1  2.4
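
Since 교배일 and 분만일 are YYYYMMDD integers, subtracting them directly does not give the interval between mating and farrowing. A minimal sketch using the three animals in these first rows, assuming both columns always hold valid dates (the helper name `days_between` is hypothetical):

```python
from datetime import datetime


def days_between(mating: int, farrowing: int) -> int:
    """Calendar days from a YYYYMMDD mating date to a YYYYMMDD farrowing date."""
    fmt = "%Y%m%d"
    start = datetime.strptime(str(mating), fmt)
    end = datetime.strptime(str(farrowing), fmt)
    return (end - start).days


# (교배일, 분만일) for animals 200, 284 and 295 in the first rows.
pairs = [
    (20210516, 20210907),
    (20210528, 20210919),
    (20210524, 20210918),
]

intervals = [days_between(m, f) for m, f in pairs]

print(intervals)            # [114, 114, 117]
print(20210907 - 20210516)  # 391, the misleading integer difference
```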

Last rows

      농장아이디   개체 구별 번호  교배일    분만일    설정량  섭취량
1712  PF_0021283  1    20210521  20210912  9.0  9.0
1713  PF_0021283  1    20210521  20210912  9.0  9.0
1714  PF_0021283  27   20210511  20210902  9.0  6.9
1715  PF_0021283  399  20210511  20210902  9.0  8.4
1716  PF_0021283  416  20210528  20210919  3.0  4.0
1717  PF_0021283  416  20210528  20210919  6.0  6.1
1718  PF_0021283  422  20210524  20210915  7.0  5.7
1719  PF_0021283  422  20210524  20210915  9.0  6.1
1720  PF_0021283  423  20210523  20210914  8.0  7.1
1721  PF_0021283  423  20210528  20210919  6.0  6.1

Duplicate rows

Most frequently occurring

     농장아이디      개체 구별 번호  교배일    분만일    설정량  섭취량  # duplicates
73   PF_0020440     397  20210608  20210930  2.5   2.5   5
78   PF_0020440     524  20210608  20210930  2.5   2.5   5
85   PF_0020440     536  20210608  20210930  2.5   2.5   5
115  PF_0020440     871  20210608  20210930  2.5   2.5   5
1    PF_0000347_01  361  20210607  20210929  3.2   3.2   4
3    PF_0000347_01  386  20210511  20210902  11.1  11.1  4
5    PF_0000347_01  487  20210607  20210929  3.2   3.2   4
9    PF_0000347_01  510  20210607  20210929  3.2   3.2   4
15   PF_0000347_01  555  20210530  20210921  3.1   3.1   4
18   PF_0000347_01  567  20210510  20210904  11.1  11.1  4