Overview

Dataset statistics

Number of variables: 14
Number of observations: 3648
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 424.1 KiB
Average record size in memory: 119.0 B

Variable types

Numeric: 6
Categorical: 8

Dataset

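Description: Expected progeny difference (EPD) information for Hanwoo cattle in the smart livestock sector, provided by Smart Farm Korea, a service of the Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries (농림수산식품교육문화정보원).
Author: Korea Agency of Education, Promotion and Information Service in Food, Agriculture, Forestry and Fisheries (농림수산식품교육문화정보원)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20210929000000001583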

Alerts

자료갱신일자 has constant value "2021-08-02" (Constant)
생년월일 has a high cardinality: 1612 distinct values (High cardinality)
근내지방등급 is highly correlated with 자료갱신일자 (High correlation)
성별구분 is highly correlated with 자료갱신일자 (High correlation)
등지방두께등급 is highly correlated with 자료갱신일자 (High correlation)
계대 is highly correlated with 자료갱신일자 (High correlation)
냉도체중등급 is highly correlated with 자료갱신일자 (High correlation)
등심단면적등급 is highly correlated with 자료갱신일자 (High correlation)
자료갱신일자 is highly correlated with 근내지방등급 and 5 other fields (High correlation)

Reproduction

Analysis started: 2022-08-12 14:45:11.705193
Analysis finished: 2022-08-12 14:45:21.046318
Duration: 9.34 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

농장아이디 (farm ID)
Real number (ℝ≥0)

Distinct: 27
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20316.642
Minimum: 20135
Maximum: 21343
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.2 KiB

Quantile statistics

Minimum: 20135
5th percentile: 20152
Q1: 20192
Median: 20228
Q3: 20256
95th percentile: 21343
Maximum: 21343
Range: 1208
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 291.4163057
Coefficient of variation (CV): 0.01434372402
Kurtosis: 8.004287602
Mean: 20316.642
Median Absolute Deviation (MAD): 28
Skewness: 3.077032264
Sum: 74115110
Variance: 84923.46324
Monotonicity: Not monotonic
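
Several of the figures above are simple functions of the others. The sketch below re-derives the range, IQR, coefficient of variation, variance, and mean from the values reported for 농장아이디. It is only an arithmetic cross-check on the rounded figures shown in this report, not the code pandas-profiling runs internally, and the variable names are illustrative.

```python
# Summary values copied from the report for 농장아이디.
minimum, maximum = 20135, 21343
q1, q3 = 20192, 20256
mean = 20316.642        # rounded as displayed in the report
std = 291.4163057
n_obs, total = 3648, 74115110

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared
mean_from_sum = total / n_obs     # Mean = Sum / number of observations

print(value_range, iqr, round(cv, 8), round(variance, 3), round(mean_from_sum, 3))
```

Because CV divides by the mean, a column with a negative mean (such as 등지방두께지수 later in this report) gets a negative CV.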
Histogram with fixed size bins (bins=27)

Common values

Value              Count  Frequency (%)
20192              667    18.3%
20229              397    10.9%
20256              354    9.7%
21343              260    7.1%
20152              254    7.0%
20418              210    5.8%
20224              175    4.8%
20228              172    4.7%
20223              153    4.2%
20220              102    2.8%
Other values (17)  904    24.8%

Minimum 10 values

Value  Count  Frequency (%)
20135  11     0.3%
20152  254    7.0%
20177  22     0.6%
20192  667    18.3%
20215  22     0.6%
20219  99     2.7%
20220  102    2.8%
20222  75     2.1%
20223  153    4.2%
20224  175    4.8%

Maximum 10 values

Value  Count  Frequency (%)
21343  260    7.1%
20418  210    5.8%
20403  95     2.6%
20388  41     1.1%
20348  50     1.4%
20290  9      0.2%
20256  354    9.7%
20255  63     1.7%
20242  36     1.0%
20235  11     0.3%

개체식별번호 (individual animal identification number)
Real number (ℝ≥0)

Distinct: 3645
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.018129208 × 10^13
Minimum: 2.005072901 × 10^13
Maximum: 2.021082301 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.2 KiB

Quantile statistics

Minimum: 2.005072901 × 10^13
5th percentile: 2.013080851 × 10^13
Q1: 2.017061701 × 10^13
Median: 2.019042501 × 10^13
Q3: 2.020021826 × 10^13
95th percentile: 2.020090866 × 10^13
Maximum: 2.021082301 × 10^13
Range: 1.600940002 × 10^11
Interquartile range (IQR): 2.96012498 × 10^10

Descriptive statistics

Standard deviation: 2.253845662 × 10^10
Coefficient of variation (CV): 0.001116799485
Kurtosis: 2.454124771
Mean: 2.018129208 × 10^13
Median Absolute Deviation (MAD): 9999000590
Skewness: -1.571140758
Sum: 7.362135353 × 10^16
Variance: 5.079820266 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
2.017041701 × 10^13  2      0.1%
2.019050601 × 10^13  2      0.1%
2.019071301 × 10^13  2      0.1%
2.021020701 × 10^13  1      < 0.1%
2.019071901 × 10^13  1      < 0.1%
2.019072801 × 10^13  1      < 0.1%
2.019062901 × 10^13  1      < 0.1%
2.019080901 × 10^13  1      < 0.1%
2.016032001 × 10^13  1      < 0.1%
2.018080301 × 10^13  1      < 0.1%
Other values (3635)  3635   99.6%

Minimum 10 values

Value                Count  Frequency (%)
2.005072901 × 10^13  1      < 0.1%
2.006032401 × 10^13  1      < 0.1%
2.007031501 × 10^13  1      < 0.1%
2.007112001 × 10^13  1      < 0.1%
2.008060801 × 10^13  1      < 0.1%
2.008092601 × 10^13  1      < 0.1%
2.008120301 × 10^13  1      < 0.1%
2.009050101 × 10^13  1      < 0.1%
2.009062301 × 10^13  1      < 0.1%
2.009062801 × 10^13  1      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
2.021082301 × 10^13  1      < 0.1%
2.021082201 × 10^13  1      < 0.1%
2.021081201 × 10^13  1      < 0.1%
2.021081101 × 10^13  1      < 0.1%
2.021073001 × 10^13  1      < 0.1%
2.021072101 × 10^13  1      < 0.1%
2.021071701 × 10^13  1      < 0.1%
2.021071601 × 10^13  1      < 0.1%
2.021062601 × 10^13  1      < 0.1%
2.021062501 × 10^13  1      < 0.1%

생년월일 (date of birth)
Categorical

HIGH CARDINALITY

Distinct: 1612
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value                   Count
2019-04-16 12:00:00 AM  14
2019-06-20 12:00:00 AM  13
2019-04-20 12:00:00 AM  13
2019-03-27 12:00:00 AM  13
2019-05-02 12:00:00 AM  12
Other values (1607)     3583

Length

Max length: 22
Median length: 22
Mean length: 22
Min length: 22

Unique

Unique: 819
Unique (%): 22.5%

Sample

1st row: 2016-06-20 12:00:00 AM
2nd row: 2018-03-12 12:00:00 AM
3rd row: 2018-08-03 12:00:00 AM
4th row: 2018-08-05 12:00:00 AM
5th row: 2018-08-06 12:00:00 AM
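
The column is profiled as categorical because the dates are stored as 22-character strings with a constant "12:00:00 AM" time part. A minimal sketch of converting these strings to dates follows, assuming every value has the layout shown in the sample rows. The helper name `parse_birth_date` and the format constant are illustrative and not part of the dataset or of pandas-profiling.

```python
from datetime import date, datetime

# Assumed layout of every value: "YYYY-MM-DD hh:mm:ss AM/PM" (12-hour clock).
BIRTH_FORMAT = "%Y-%m-%d %I:%M:%S %p"

def parse_birth_date(text: str) -> date:
    """Parse one 생년월일 string and keep only the calendar date."""
    return datetime.strptime(text, BIRTH_FORMAT).date()

# The five sample rows shown above.
samples = [
    "2016-06-20 12:00:00 AM",
    "2018-03-12 12:00:00 AM",
    "2018-08-03 12:00:00 AM",
    "2018-08-05 12:00:00 AM",
    "2018-08-06 12:00:00 AM",
]
parsed = [parse_birth_date(s) for s in samples]
print(parsed[0].isoformat())
```

Note that "12:00:00 AM" on a 12-hour clock is midnight, so dropping the time part loses no information here.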

Common Values

Value                   Count  Frequency (%)
2019-04-16 12:00:00 AM  14     0.4%
2019-06-20 12:00:00 AM  13     0.4%
2019-04-20 12:00:00 AM  13     0.4%
2019-03-27 12:00:00 AM  13     0.4%
2019-05-02 12:00:00 AM  12     0.3%
2019-05-15 12:00:00 AM  11     0.3%
2019-05-31 12:00:00 AM  11     0.3%
2020-06-17 12:00:00 AM  11     0.3%
2020-04-10 12:00:00 AM  11     0.3%
2019-05-13 12:00:00 AM  11     0.3%
Other values (1602)     3528   96.7%

Length

Histogram of lengths of the category

Words

Value                Count  Frequency (%)
am                   3648   33.3%
12:00:00             3648   33.3%
2019-04-16           14     0.1%
2019-06-20           13     0.1%
2019-04-20           13     0.1%
2019-03-27           13     0.1%
2019-05-02           12     0.1%
2019-05-15           11     0.1%
2019-05-31           11     0.1%
2020-06-17           11     0.1%
Other values (1604)  3550   32.4%

성별구분 (sex code)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value  Count
1      2303
2      1345

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      2303   63.1%
2      1345   36.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value  Count  Frequency (%)
1      2303   63.1%
2      1345   36.9%

계대 (generation)
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value             Count
4 계대            735
3 계대            678
5 계대            565
2 계대            467
1 계대            423
Other values (8)  780

Length

Max length: 5
Median length: 4
Mean length: 4.011239035
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 6 계대
2nd row: 3 계대
3rd row: 4 계대
4th row: 5 계대
5th row: 4 계대

Common Values

Value             Count  Frequency (%)
4 계대            735    20.1%
3 계대            678    18.6%
5 계대            565    15.5%
2 계대            467    12.8%
1 계대            423    11.6%
6 계대            375    10.3%
7 계대            203    5.6%
8 계대            88     2.4%
9 계대            70     1.9%
10 계대           36     1.0%
Other values (3)  8      0.2%

Length

Histogram of lengths of the category

Words

Value             Count  Frequency (%)
계대              3648   50.0%
4                 735    10.1%
3                 678    9.3%
5                 565    7.7%
2                 467    6.4%
1                 423    5.8%
6                 375    5.1%
7                 203    2.8%
8                 88     1.2%
9                 70     1.0%
Other values (4)  44     0.6%

냉도체중등급 (cold carcass weight grade)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value  Count
A      1072
D      970
B      857
C      749

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: D
3rd row: B
4th row: C
5th row: C

Common Values

Value  Count  Frequency (%)
A      1072   29.4%
D      970    26.6%
B      857    23.5%
C      749    20.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value  Count  Frequency (%)
a      1072   29.4%
d      970    26.6%
b      857    23.5%
c      749    20.5%

냉도체중지수 (cold carcass weight index)
Real number (ℝ)

Distinct: 3233
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.78761502
Minimum: -7.2253
Maximum: 40.9789
Zeros: 0
Zeros (%): 0.0%
Negative: 81
Negative (%): 2.2%
Memory size: 32.2 KiB

Quantile statistics

Minimum: -7.2253
5th percentile: 1.43547
Q1: 7.277875
Median: 12.35335
Q3: 17.2917
95th percentile: 27.130935
Maximum: 40.9789
Range: 48.2042
Interquartile range (IQR): 10.013825

Descriptive statistics

Standard deviation: 7.675979647
Coefficient of variation (CV): 0.6002667139
Kurtosis: 0.01357711373
Mean: 12.78761502
Median Absolute Deviation (MAD): 4.98315
Skewness: 0.447969511
Sum: 46649.2196
Variance: 58.92066354
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0.0361               10     0.3%
26.0956              10     0.3%
2.5778               9      0.2%
13.0721              8      0.2%
8.9236               8      0.2%
22.6411              8      0.2%
1.3023               8      0.2%
13.3359              7      0.2%
5.1379               7      0.2%
11.8215              7      0.2%
Other values (3223)  3566   97.8%

Minimum 10 values

Value    Count  Frequency (%)
-7.2253  1      < 0.1%
-7.0547  1      < 0.1%
-6.4312  1      < 0.1%
-5.9761  1      < 0.1%
-4.5121  1      < 0.1%
-3.8958  2      0.1%
-3.5945  1      < 0.1%
-3.531   1      < 0.1%
-3.4912  1      < 0.1%
-3.2955  1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
40.9789  1      < 0.1%
38.9821  1      < 0.1%
38.0933  1      < 0.1%
37.6885  1      < 0.1%
37.5383  1      < 0.1%
37.4804  1      < 0.1%
37.2338  1      < 0.1%
37.2203  1      < 0.1%
37.0697  1      < 0.1%
36.6247  1      < 0.1%

등심단면적등급 (ribeye area grade)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value  Count
A      1123
B      992
D      774
C      759

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: D
3rd row: C
4th row: C
5th row: B

Common Values

Value  Count  Frequency (%)
A      1123   30.8%
B      992    27.2%
D      774    21.2%
C      759    20.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value  Count  Frequency (%)
a      1123   30.8%
b      992    27.2%
d      774    21.2%
c      759    20.8%

등심단면적지수 (ribeye area index)
Real number (ℝ)

Distinct: 3155
Distinct (%): 86.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.065396656
Minimum: -2.2253
Maximum: 8.248
Zeros: 0
Zeros (%): 0.0%
Negative: 79
Negative (%): 2.2%
Memory size: 32.2 KiB

Quantile statistics

Minimum: -2.2253
5th percentile: 0.51902
Q1: 2.0171
Median: 3.0844
Q3: 3.996125
95th percentile: 5.900015
Maximum: 8.248
Range: 10.4733
Interquartile range (IQR): 1.979025

Descriptive statistics

Standard deviation: 1.583003066
Coefficient of variation (CV): 0.516410515
Kurtosis: 0.3922586439
Mean: 3.065396656
Median Absolute Deviation (MAD): 0.993
Skewness: 0.279713422
Sum: 11182.567
Variance: 2.505898706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0.5518               10     0.3%
5.9011               10     0.3%
2.215                9      0.2%
2.0004               8      0.2%
2.0914               8      0.2%
2.7014               8      0.2%
0.0837               7      0.2%
2.1909               7      0.2%
2.3615               7      0.2%
-0.224               7      0.2%
Other values (3145)  3567   97.8%

Minimum 10 values

Value    Count  Frequency (%)
-2.2253  1      < 0.1%
-1.5303  1      < 0.1%
-1.5219  1      < 0.1%
-1.4993  1      < 0.1%
-1.3973  1      < 0.1%
-1.3361  1      < 0.1%
-1.3101  1      < 0.1%
-1.2765  1      < 0.1%
-1.0078  1      < 0.1%
-0.9452  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
8.248   1      < 0.1%
8.2393  1      < 0.1%
8.1558  2      0.1%
8.0918  1      < 0.1%
7.9252  2      0.1%
7.8627  1      < 0.1%
7.845   1      < 0.1%
7.8294  1      < 0.1%
7.8258  1      < 0.1%
7.8145  1      < 0.1%

등지방두께등급 (backfat thickness grade)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value  Count
B      1085
C      925
A      906
D      732

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: A
3rd row: C
4th row: D
5th row: B

Common Values

Value  Count  Frequency (%)
B      1085   29.7%
C      925    25.4%
A      906    24.8%
D      732    20.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value  Count  Frequency (%)
b      1085   29.7%
c      925    25.4%
a      906    24.8%
d      732    20.1%

등지방두께지수 (backfat thickness index)
Real number (ℝ)

Distinct: 2955
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.2960780154
Minimum: -1.6244
Maximum: 1.5144
Zeros: 0
Zeros (%): 0.0%
Negative: 2804
Negative (%): 76.9%
Memory size: 32.2 KiB

Quantile statistics

Minimum: -1.6244
5th percentile: -0.992565
Q1: -0.6336
Median: -0.348
Q3: -0.033375
95th percentile: 0.67308
Maximum: 1.5144
Range: 3.1388
Interquartile range (IQR): 0.600225

Descriptive statistics

Standard deviation: 0.4848757122
Coefficient of variation (CV): -1.637661991
Kurtosis: 0.5622538857
Mean: -0.2960780154
Median Absolute Deviation (MAD): 0.29855
Skewness: 0.6806292087
Sum: -1080.0926
Variance: 0.2351044563
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
-0.7312              11     0.3%
-0.142               10     0.3%
-0.6549              9      0.2%
-0.656               9      0.2%
-0.1845              9      0.2%
-0.5595              8      0.2%
-0.5316              8      0.2%
-0.2599              8      0.2%
-0.1374              8      0.2%
-0.1624              7      0.2%
Other values (2945)  3561   97.6%

Minimum 10 values

Value    Count  Frequency (%)
-1.6244  1      < 0.1%
-1.5304  1      < 0.1%
-1.4279  1      < 0.1%
-1.422   1      < 0.1%
-1.3934  2      0.1%
-1.3897  1      < 0.1%
-1.3866  1      < 0.1%
-1.3771  1      < 0.1%
-1.3687  1      < 0.1%
-1.3625  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1.5144  1      < 0.1%
1.4989  1      < 0.1%
1.4401  1      < 0.1%
1.4208  1      < 0.1%
1.3796  1      < 0.1%
1.3505  1      < 0.1%
1.3371  1      < 0.1%
1.3312  1      < 0.1%
1.3192  1      < 0.1%
1.3179  1      < 0.1%

근내지방등급 (marbling grade)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value  Count
A      1035
B      960
C      827
D      826

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D
2nd row: D
3rd row: B
4th row: D
5th row: B

Common Values

Value  Count  Frequency (%)
A      1035   28.4%
B      960    26.3%
C      827    22.7%
D      826    22.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value  Count  Frequency (%)
a      1035   28.4%
b      960    26.3%
c      827    22.7%
d      826    22.6%

근내지방지수 (marbling index)
Real number (ℝ)

Distinct: 2715
Distinct (%): 74.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4107618147
Minimum: -0.3707
Maximum: 1.2281
Zeros: 0
Zeros (%): 0.0%
Negative: 141
Negative (%): 3.9%
Memory size: 32.2 KiB

Quantile statistics

Minimum: -0.3707
5th percentile: 0.03082
Q1: 0.244075
Median: 0.39595
Q3: 0.57495
95th percentile: 0.827155
Maximum: 1.2281
Range: 1.5988
Interquartile range (IQR): 0.330875

Descriptive statistics

Standard deviation: 0.2458807169
Coefficient of variation (CV): 0.5985968222
Kurtosis: -0.06473310368
Mean: 0.4107618147
Median Absolute Deviation (MAD): 0.16735
Skewness: 0.1428652717
Sum: 1498.4591
Variance: 0.06045732696
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0.3204               14     0.4%
0.4883               10     0.3%
0.2536               10     0.3%
0.4163               9      0.2%
0.0493               8      0.2%
0.5104               8      0.2%
0.3439               8      0.2%
0.0265               8      0.2%
0.3059               7      0.2%
0.1123               7      0.2%
Other values (2705)  3559   97.6%

Minimum 10 values

Value    Count  Frequency (%)
-0.3707  1      < 0.1%
-0.3345  1      < 0.1%
-0.3293  1      < 0.1%
-0.3211  1      < 0.1%
-0.3172  1      < 0.1%
-0.2952  1      < 0.1%
-0.2842  1      < 0.1%
-0.2766  1      < 0.1%
-0.2489  1      < 0.1%
-0.2484  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
1.2281  1      < 0.1%
1.2174  1      < 0.1%
1.1905  1      < 0.1%
1.18    1      < 0.1%
1.1724  2      0.1%
1.1517  2      0.1%
1.0964  1      < 0.1%
1.0773  1      < 0.1%
1.0762  1      < 0.1%
1.0739  1      < 0.1%

자료갱신일자 (data update date)
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.6 KiB

Value       Count
2021-08-02  3648

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-08-02
2nd row: 2021-08-02
3rd row: 2021-08-02
4th row: 2021-08-02
5th row: 2021-08-02

Common Values

Value       Count  Frequency (%)
2021-08-02  3648   100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Words

Value       Count  Frequency (%)
2021-08-02  3648   100.0%

Interactions

[Figures omitted: 36 pairwise interaction plots, one per ordered pair of the 6 numeric variables.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
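
The definition above can be sketched in plain Python. This is an illustration of the formula only, not the routine pandas-profiling calls. The function name `pearson_r` is made up for this example, and the input values are the 냉도체중지수 and 등심단면적지수 columns of the first five sample rows shown at the end of this report.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and of both variances cancel,
    so plain sums of deviations are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# First five sample rows of this dataset (far too few for a reliable estimate).
carcass_weight = [-0.191, 1.5201, 12.5138, 9.0297, 8.905]
ribeye_area = [0.0807, 1.3112, 2.1654, 2.0783, 2.8374]

r = pearson_r(carcass_weight, ribeye_area)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(carcass_weight, [3.0 * v + 7.0 for v in ribeye_area])
print(round(r, 4), round(r_rescaled, 4))
```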

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
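
As a rough sketch of that recipe: replace each value by its rank (here tied values share the average of their ranks, a common convention that the text above does not spell out), then apply the Pearson formula to the ranks. The helper names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values all receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# A nonlinear but strictly increasing relationship gives rho = 1.
xs = [1, 2, 3, 4, 5]
rho_cubic = spearman_rho(xs, [v ** 3 for v in xs])
print(rho_cubic)
```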

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
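
The pair-counting rule just described can be written out directly. The sketch below implements exactly that simple form (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries commonly apply a tie-adjusted variant, so values can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it is counted in neither group.
    return (concordant - discordant) / len(pairs)

# One swapped neighbour out of six pairs: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```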

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
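
A minimal sketch of the bias-corrected estimator follows, written from the published formula rather than taken from the pandas-profiling source. The function name is illustrative. Returning 0.0 when one variable has a single category (so the corrected denominator is zero) is a choice made for this sketch, and other implementations may return NaN instead.

```python
import math
from collections import Counter

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    rows = Counter(a)
    cols = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_value, row_total in rows.items():
        for col_value, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    if denominator <= 0:
        return 0.0  # degenerate table, e.g. a constant column (sketch-only choice)
    return math.sqrt(phi2_corr / denominator)

# Perfectly associated labels give 1; independent labels give 0.
v_perfect = cramers_v_corrected(["A", "B"] * 50, ["x", "y"] * 50)
v_independent = cramers_v_corrected(["A", "A", "B", "B"] * 25, ["x", "y", "x", "y"] * 25)
print(v_perfect, v_independent)
```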

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  농장아이디  개체식별번호  생년월일  성별구분  계대  냉도체중등급  냉도체중지수  등심단면적등급  등심단면적지수  등지방두께등급  등지방두께지수  근내지방등급  근내지방지수  자료갱신일자
0  20152  20160620010001  2016-06-20 12:00:00 AM  1  6 계대  D  -0.191  D  0.0807  B  -0.3785  D  0.1861  2021-08-02
1  20152  20180312010020  2018-03-12 12:00:00 AM  1  3 계대  D  1.5201  D  1.3112  A  -0.6661  D  0.1743  2021-08-02
2  20152  20180803010027  2018-08-03 12:00:00 AM  1  4 계대  B  12.5138  C  2.1654  C  -0.0319  B  0.5375  2021-08-02
3  20152  20180805010028  2018-08-05 12:00:00 AM  1  5 계대  C  9.0297  C  2.0783  D  0.9257  D  0.1834  2021-08-02
4  20152  20180806010029  2018-08-06 12:00:00 AM  1  4 계대  C  8.905  B  2.8374  B  -0.6134  B  0.3937  2021-08-02
5  20152  20160406010037  2016-04-06 12:00:00 AM  1  4 계대  D  3.8491  B  3.1916  A  -0.6703  B  0.4593  2021-08-02
6  20152  20160406010038  2016-04-06 12:00:00 AM  1  3 계대  D  6.9869  B  2.8364  C  -0.1756  A  0.6163  2021-08-02
7  20152  20160615010039  2016-06-15 12:00:00 AM  1  1 계대  D  2.9025  D  -0.3491  B  -0.3374  D  0.0801  2021-08-02
8  20152  20160618010040  2016-06-18 12:00:00 AM  1  4 계대  D  6.377  D  1.7754  C  -0.1908  A  0.5815  2021-08-02
9  20152  20160611010047  2016-06-11 12:00:00 AM  1  3 계대  D  -1.1937  D  -0.8076  D  0.1973  D  0.0703  2021-08-02

Last rows

(index)  농장아이디  개체식별번호  생년월일  성별구분  계대  냉도체중등급  냉도체중지수  등심단면적등급  등심단면적지수  등지방두께등급  등지방두께지수  근내지방등급  근내지방지수  자료갱신일자
3638  20152  20200420010357  2020-04-20 12:00:00 AM  2  2 계대  B  12.484  A  4.2858  D  0.0992  D  0.024  2021-08-02
3639  20152  20200422010240  2020-04-22 12:00:00 AM  1  7 계대  C  10.3781  C  2.6997  C  -0.0557  B  0.4588  2021-08-02
3640  20152  20200606010356  2020-06-06 12:00:00 AM  2  3 계대  D  4.7004  D  1.7934  A  -0.7708  D  0.13  2021-08-02
3641  20152  20200809010333  2020-08-09 12:00:00 AM  2  3 계대  B  13.287  C  2.4151  D  0.1532  D  0.2118  2021-08-02
3642  20152  20200910010334  2020-09-10 12:00:00 AM  1  6 계대  B  15.6726  B  2.7924  D  0.4839  A  0.5502  2021-08-02
3643  20152  20200922010344  2020-09-22 12:00:00 AM  2  5 계대  C  10.7447  B  2.8871  D  0.0731  B  0.4967  2021-08-02
3644  20220  20180417010068  2018-04-17 12:00:00 AM  1  2 계대  D  3.294  B  3.5018  A  -0.865  B  0.3866  2021-08-02
3645  20220  20190902010083  2019-09-02 12:00:00 AM  1  3 계대  A  16.6321  A  4.5312  D  0.1303  A  0.9177  2021-08-02
3646  20220  20200720010102  2020-07-20 12:00:00 AM  1  7 계대  A  21.0272  A  4.1907  C  -0.1751  C  0.2898  2021-08-02
3647  20220  20210207010113  2021-02-07 12:00:00 AM  2  4 계대  B  12.8375  A  4.0162  C  -0.0426  A  0.722  2021-08-02