Overview

Dataset statistics

Number of variables: 10
Number of observations: 407
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.7 KiB
Average record size in memory: 87.3 B

Variable types

Numeric: 7
Categorical: 3

Dataset

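Description: Summary of contracted business performance for fruit (apples, pears, sweet persimmons, and mandarins) produced by producer cooperatives.
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220215000000001921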

Alerts

High correlation: CNTRCT_ACMSLT_VOLM_TON is highly correlated with SHIPMNT_ACMSLT_VOLM_TON and 1 other field
High correlation: CNTRCT_ACMSLT_AMOUNT is highly correlated with SHIPMNT_ACMSLT_AMOUNT
High correlation: SHIPMNT_ACMSLT_VOLM_TON is highly correlated with CNTRCT_ACMSLT_VOLM_TON and 1 other field
High correlation: SHIPMNT_ACMSLT_AMOUNT is highly correlated with CNTRCT_ACMSLT_VOLM_TON and 2 other fields
Zeros: CNTRCT_ACMSLT_VOLM_TON has 28 (6.9%) zeros
Zeros: CNTRCT_ACMSLT_AMOUNT has 28 (6.9%) zeros
Zeros: SHIPMNT_ACMSLT_VOLM_TON has 33 (8.1%) zeros
Zeros: SHIPMNT_ACMSLT_VOLM_PT has 33 (8.1%) zeros
Zeros: SHIPMNT_ACMSLT_AMOUNT has 33 (8.1%) zeros
Zeros: SHIPMNT_ACMSLT_AMOUNT_PT has 33 (8.1%) zeros
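
The zero percentages in these alerts follow directly from the zero counts and the 407 observations. A minimal sketch of that arithmetic, with the counts copied from the alerts and the names `ZERO_COUNTS` and `zero_share` chosen here only for illustration:

```python
# Zero counts per column, copied from the alerts above.
N_OBSERVATIONS = 407
ZERO_COUNTS = {
    "CNTRCT_ACMSLT_VOLM_TON": 28,
    "CNTRCT_ACMSLT_AMOUNT": 28,
    "SHIPMNT_ACMSLT_VOLM_TON": 33,
    "SHIPMNT_ACMSLT_VOLM_PT": 33,
    "SHIPMNT_ACMSLT_AMOUNT": 33,
    "SHIPMNT_ACMSLT_AMOUNT_PT": 33,
}


def zero_share(count: int, total: int) -> float:
    """Return the share of zeros as a percentage of all observations."""
    return 100.0 * count / total


# Round to one decimal place, as the report does.
zero_shares = {
    name: round(zero_share(count, N_OBSERVATIONS), 1)
    for name, count in ZERO_COUNTS.items()
}

for name, share in zero_shares.items():
    print(f"{name}: {share}%")
```

Both contract columns share one zero count and all four shipment columns share another, which suggests (but does not prove) that the zeros occur in the same rows.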

Reproduction

Analysis started: 2022-08-12 14:45:49.749938
Analysis finished: 2022-08-12 14:45:59.450272
Duration: 9.7 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
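
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string `TIMESTAMP_FORMAT` is an assumption about how the timestamps are laid out, inferred from the values shown:

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the Reproduction block.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-08-12 14:45:49.749938", TIMESTAMP_FORMAT)
finished = datetime.strptime("2022-08-12 14:45:59.450272", TIMESTAMP_FORMAT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```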

Variables

BSNS_YEAR
Real number (ℝ≥0)

Distinct: 14
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.597052
Minimum: 2003
Maximum: 2016
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 2003
5-th percentile: 2003
Q1: 2006.5
median: 2010
Q3: 2013
95-th percentile: 2016
Maximum: 2016
Range: 13
Interquartile range (IQR): 6.5

Descriptive statistics

Standard deviation: 3.917221187
Coefficient of variation (CV): 0.001949257033
Kurtosis: -1.061463099
Mean: 2009.597052
Median Absolute Deviation (MAD): 3
Skewness: 0.001055546074
Sum: 817906
Variance: 15.34462183
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)

Common values

Value | Count | Frequency (%)
2008 | 44 | 10.8%
2011 | 37 | 9.1%
2010 | 36 | 8.8%
2016 | 33 | 8.1%
2009 | 32 | 7.9%
2004 | 28 | 6.9%
2014 | 27 | 6.6%
2015 | 26 | 6.4%
2003 | 26 | 6.4%
2005 | 24 | 5.9%
Other values (4) | 94 | 23.1%

Minimum 10 values

Value | Count | Frequency (%)
2003 | 26 | 6.4%
2004 | 28 | 6.9%
2005 | 24 | 5.9%
2006 | 24 | 5.9%
2007 | 23 | 5.7%
2008 | 44 | 10.8%
2009 | 32 | 7.9%
2010 | 36 | 8.8%
2011 | 37 | 9.1%
2012 | 23 | 5.7%

Maximum 10 values

Value | Count | Frequency (%)
2016 | 33 | 8.1%
2015 | 26 | 6.4%
2014 | 27 | 6.6%
2013 | 24 | 5.9%
2012 | 23 | 5.7%
2011 | 37 | 9.1%
2010 | 36 | 8.8%
2009 | 32 | 7.9%
2008 | 44 | 10.8%
2007 | 23 | 5.7%
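
Because the lowest-value and highest-value tables for BSNS_YEAR together cover all 14 distinct years, the summary statistics can be recomputed from the counts alone. The sketch below does this with the standard library. It assumes the sample variance (dividing by n - 1) and linearly interpolated quartiles, which is what the reported figures appear to correspond to; `YEAR_COUNTS` is an illustrative name, not part of the report.

```python
import math
import statistics

# Count of rows per business year, read from the frequency tables above.
YEAR_COUNTS = {
    2003: 26, 2004: 28, 2005: 24, 2006: 24, 2007: 23, 2008: 44, 2009: 32,
    2010: 36, 2011: 37, 2012: 23, 2013: 24, 2014: 27, 2015: 26, 2016: 33,
}

# Expand the counts back into one sorted value per observation.
values = [year for year, count in sorted(YEAR_COUNTS.items()) for _ in range(count)]

n_observations = len(values)
total = sum(values)
mean = total / n_observations

# Sample variance (n - 1 in the denominator), an assumption that
# reproduces the reported figure.
variance = statistics.variance(values)
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
coefficient_of_variation = std_dev / mean

# Quartiles with linear interpolation between neighbouring observations.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(n_observations, total, round(mean, 6))
print(round(variance, 6), round(std_dev, 6), round(coefficient_of_variation, 9))
print(q1, median, q3, iqr)
```

The very small coefficient of variation is expected here: the years span only 13 units around a mean near 2010, so the ratio says little about the spread of the column.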

AREA_HDQRTRS_NM
Categorical

Distinct: 20
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

경남지역본부: 54
경북지역본부: 47
충남지역본부: 46
전남지역본부: 44
충북지역본부: 39
Other values (15): 177

Length

Max length: 10
Median length: 6
Mean length: 6.776412776
Min length: 6

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 전북지역본부
2nd row: 충남지역본부
3rd row: 강원지역본부
4th row: 제주지역본부
5th row: 경북지역본부

Common Values

Value | Count | Frequency (%)
경남지역본부 | 54 | 13.3%
경북지역본부 | 47 | 11.5%
충남지역본부 | 46 | 11.3%
전남지역본부 | 44 | 10.8%
충북지역본부 | 39 | 9.6%
전북지역본부 | 33 | 8.1%
경기지역본부 | 20 | 4.9%
전북지역본부(대표) | 19 | 4.7%
울산지역본부 | 18 | 4.4%
경북지역본부(대표) | 16 | 3.9%
Other values (10) | 71 | 17.4%

Length

Histogram of lengths of the category

BSNS_MTHD
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

수탁: 290
매취: 117

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수탁
2nd row: 수탁
3rd row: 수탁
4th row: 수탁
5th row: 수탁

Common Values

Value | Count | Frequency (%)
수탁 | 290 | 71.3%
매취 | 117 | 28.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Plot of the same counts as the Common Values table.

PRDLST_NM
Categorical

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

배: 189
사과: 159
단감: 38
감귤: 21

Length

Max length: 2
Median length: 2
Mean length: 1.535626536
Min length: 1
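
The label of the largest category (189 rows) did not survive in this copy of the report. The dataset description lists four fruits (사과, 배, 단감, 감귤), and the only one not shown by name is the single-character 배 (pear). The sketch below checks that this reading, which is an assumption and not something the report states, reproduces the reported length statistics; `CATEGORY_COUNTS` is an illustrative name.

```python
# Category counts for PRDLST_NM. The key "배" is an assumption: the report
# shows 189 rows whose label is missing, and 배 is the only fruit named in
# the dataset description that does not otherwise appear.
CATEGORY_COUNTS = {"배": 189, "사과": 159, "단감": 38, "감귤": 21}

total_rows = sum(CATEGORY_COUNTS.values())

# Mean label length in characters, weighted by the number of rows.
mean_length = sum(len(name) * count for name, count in CATEGORY_COUNTS.items()) / total_rows

min_length = min(len(name) for name in CATEGORY_COUNTS)
max_length = max(len(name) for name in CATEGORY_COUNTS)

print(total_rows, round(mean_length, 9), min_length, max_length)
```

A one-character label on 189 of 407 rows and two-character labels on the rest gives a mean length of 625 / 407, which agrees with the reported mean, minimum, and maximum lengths.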

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사과
2nd row: 배
3rd row: 배
4th row: 감귤
5th row: 사과

Common Values

Value | Count | Frequency (%)
배 | 189 | 46.4%
사과 | 159 | 39.1%
단감 | 38 | 9.3%
감귤 | 21 | 5.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Plot of the same counts as the Common Values table.

CNTRCT_ACMSLT_VOLM_TON
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 369
Distinct (%): 90.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5162.054054
Minimum: 0
Maximum: 81705
Zeros: 28
Zeros (%): 6.9%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 397.5
median: 1326
Q3: 5940
95-th percentile: 17594.7
Maximum: 81705
Range: 81705
Interquartile range (IQR): 5542.5

Descriptive statistics

Standard deviation: 10118.36084
Coefficient of variation (CV): 1.960142364
Kurtosis: 26.3794946
Mean: 5162.054054
Median Absolute Deviation (MAD): 1248
Skewness: 4.665298741
Sum: 2100956
Variance: 102381226.1
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 28 | 6.9%
402 | 2 | 0.5%
972 | 2 | 0.5%
540 | 2 | 0.5%
284 | 2 | 0.5%
467 | 2 | 0.5%
266 | 2 | 0.5%
935 | 2 | 0.5%
2201 | 2 | 0.5%
1104 | 2 | 0.5%
Other values (359) | 361 | 88.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 28 | 6.9%
5 | 1 | 0.2%
15 | 1 | 0.2%
33 | 1 | 0.2%
46 | 1 | 0.2%
50 | 1 | 0.2%
68 | 1 | 0.2%
75 | 1 | 0.2%
76 | 1 | 0.2%
78 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
81705 | 1 | 0.2%
74131 | 1 | 0.2%
69660 | 1 | 0.2%
68582 | 1 | 0.2%
67290 | 1 | 0.2%
55104 | 1 | 0.2%
53806 | 1 | 0.2%
46959 | 1 | 0.2%
42808 | 1 | 0.2%
25575 | 1 | 0.2%

CNTRCT_ACMSLT_AMOUNT
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 380
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7112456.391
Minimum: 0
Maximum: 57323848
Zeros: 28
Zeros (%): 6.9%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 540852.5
median: 2290000
Q3: 11065321
95-th percentile: 26578493.6
Maximum: 57323848
Range: 57323848
Interquartile range (IQR): 10524468.5

Descriptive statistics

Standard deviation: 9589044.834
Coefficient of variation (CV): 1.348204377
Kurtosis: 4.922259692
Mean: 7112456.391
Median Absolute Deviation (MAD): 2175720
Skewness: 1.997234882
Sum: 2894769751
Variance: 9.194978082 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 28 | 6.9%
12044510 | 1 | 0.2%
443750 | 1 | 0.2%
25540204 | 1 | 0.2%
19647585 | 1 | 0.2%
795000 | 1 | 0.2%
16838338 | 1 | 0.2%
14576422 | 1 | 0.2%
2265200 | 1 | 0.2%
4364698 | 1 | 0.2%
Other values (370) | 370 | 90.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 28 | 6.9%
6000 | 1 | 0.2%
31296 | 1 | 0.2%
61600 | 1 | 0.2%
87750 | 1 | 0.2%
95201 | 1 | 0.2%
96080 | 1 | 0.2%
96200 | 1 | 0.2%
101100 | 1 | 0.2%
114280 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
57323848 | 1 | 0.2%
56048491 | 1 | 0.2%
53496216 | 1 | 0.2%
44083371 | 1 | 0.2%
37743015 | 1 | 0.2%
34633591 | 1 | 0.2%
34410122 | 1 | 0.2%
33234636 | 1 | 0.2%
32977190 | 1 | 0.2%
32132199 | 1 | 0.2%

SHIPMNT_ACMSLT_VOLM_TON
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 362
Distinct (%): 88.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4453.542998
Minimum: 0
Maximum: 77067
Zeros: 33
Zeros (%): 8.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 299
median: 1093
Q3: 5009.5
95-th percentile: 14938.1
Maximum: 77067
Range: 77067
Interquartile range (IQR): 4710.5

Descriptive statistics

Standard deviation: 9112.974175
Coefficient of variation (CV): 2.046230199
Kurtosis: 28.73163671
Mean: 4453.542998
Median Absolute Deviation (MAD): 1068
Skewness: 4.848614598
Sum: 1812592
Variance: 83046298.31
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 33 | 8.1%
289 | 3 | 0.7%
228 | 3 | 0.7%
806 | 2 | 0.5%
411 | 2 | 0.5%
306 | 2 | 0.5%
427 | 2 | 0.5%
467 | 2 | 0.5%
181 | 2 | 0.5%
382 | 2 | 0.5%
Other values (352) | 354 | 87.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33 | 8.1%
8 | 1 | 0.2%
14 | 1 | 0.2%
17 | 1 | 0.2%
19 | 1 | 0.2%
20 | 1 | 0.2%
24 | 1 | 0.2%
25 | 1 | 0.2%
33 | 1 | 0.2%
34 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
77067 | 1 | 0.2%
73704 | 1 | 0.2%
59753 | 1 | 0.2%
59103 | 1 | 0.2%
57405 | 1 | 0.2%
46378 | 1 | 0.2%
45707 | 1 | 0.2%
43647 | 1 | 0.2%
39122 | 1 | 0.2%
25909 | 1 | 0.2%

SHIPMNT_ACMSLT_VOLM_PT
Real number (ℝ≥0)

ZEROS

Distinct: 85
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.00737101
Minimum: 0
Maximum: 117
Zeros: 33
Zeros (%): 8.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 67
median: 88
Q3: 100
95-th percentile: 107
Maximum: 117
Range: 117
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 31.87626083
Coefficient of variation (CV): 0.4139377882
Kurtosis: 0.6698445615
Mean: 77.00737101
Median Absolute Deviation (MAD): 13
Skewness: -1.323192812
Sum: 31342
Variance: 1016.096005
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 33 | 8.1%
101 | 20 | 4.9%
100 | 19 | 4.7%
102 | 14 | 3.4%
88 | 13 | 3.2%
103 | 13 | 3.2%
98 | 12 | 2.9%
99 | 11 | 2.7%
97 | 10 | 2.5%
105 | 9 | 2.2%
Other values (75) | 253 | 62.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33 | 8.1%
1 | 1 | 0.2%
7 | 2 | 0.5%
9 | 2 | 0.5%
11 | 1 | 0.2%
13 | 1 | 0.2%
14 | 1 | 0.2%
18 | 1 | 0.2%
21 | 1 | 0.2%
23 | 2 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
117 | 1 | 0.2%
113 | 1 | 0.2%
112 | 2 | 0.5%
111 | 3 | 0.7%
110 | 2 | 0.5%
109 | 5 | 1.2%
108 | 5 | 1.2%
107 | 5 | 1.2%
106 | 7 | 1.7%
105 | 9 | 2.2%

SHIPMNT_ACMSLT_AMOUNT
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 375
Distinct (%): 92.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7171254.732
Minimum: 0
Maximum: 68919581
Zeros: 33
Zeros (%): 8.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 472396
median: 1718350
Q3: 10557350
95-th percentile: 26118292.4
Maximum: 68919581
Range: 68919581
Interquartile range (IQR): 10084954

Descriptive statistics

Standard deviation: 10874548.75
Coefficient of variation (CV): 1.516408098
Kurtosis: 9.221635877
Mean: 7171254.732
Median Absolute Deviation (MAD): 1718350
Skewness: 2.666254612
Sum: 2918700676
Variance: 1.182558105 × 10^14
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 33 | 8.1%
208578 | 1 | 0.2%
67133526 | 1 | 0.2%
25486897 | 1 | 0.2%
1221101 | 1 | 0.2%
18881398 | 1 | 0.2%
17168563 | 1 | 0.2%
2166524 | 1 | 0.2%
3126681 | 1 | 0.2%
78250 | 1 | 0.2%
Other values (365) | 365 | 89.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33 | 8.1%
5571 | 1 | 0.2%
16653 | 1 | 0.2%
23578 | 1 | 0.2%
35427 | 1 | 0.2%
35725 | 1 | 0.2%
46467 | 1 | 0.2%
48152 | 1 | 0.2%
53916 | 1 | 0.2%
55036 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
68919581 | 1 | 0.2%
67133526 | 1 | 0.2%
65190024 | 1 | 0.2%
56138667 | 1 | 0.2%
55130984 | 1 | 0.2%
53672686 | 1 | 0.2%
49304753 | 1 | 0.2%
48153674 | 1 | 0.2%
46304132 | 1 | 0.2%
35297747 | 1 | 0.2%

SHIPMNT_ACMSLT_AMOUNT_PT
Real number (ℝ≥0)

ZEROS

Distinct: 129
Distinct (%): 31.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.08845209
Minimum: 0
Maximum: 263
Zeros: 33
Zeros (%): 8.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 69
median: 92
Q3: 107
95-th percentile: 144.7
Maximum: 263
Range: 263
Interquartile range (IQR): 38

Descriptive statistics

Standard deviation: 40.21533221
Coefficient of variation (CV): 0.467139683
Kurtosis: 1.036697433
Mean: 86.08845209
Median Absolute Deviation (MAD): 19
Skewness: -0.3264922995
Sum: 35038
Variance: 1617.272945
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 33 | 8.1%
95 | 11 | 2.7%
100 | 11 | 2.7%
90 | 10 | 2.5%
85 | 9 | 2.2%
115 | 8 | 2.0%
89 | 8 | 2.0%
92 | 8 | 2.0%
94 | 8 | 2.0%
106 | 7 | 1.7%
Other values (119) | 294 | 72.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33 | 8.1%
1 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
10 | 1 | 0.2%
11 | 1 | 0.2%
14 | 2 | 0.5%
19 | 1 | 0.2%
20 | 3 | 0.7%
23 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
263 | 1 | 0.2%
209 | 1 | 0.2%
185 | 1 | 0.2%
170 | 1 | 0.2%
167 | 1 | 0.2%
165 | 1 | 0.2%
163 | 2 | 0.5%
161 | 1 | 0.2%
156 | 1 | 0.2%
155 | 1 | 0.2%

Interactions

Grid of pairwise interaction plots for the numeric variables (49 panels; images not preserved).

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
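
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1 / (n - 1) factors of the covariance and of the two standard
    # deviations cancel, so plain sums of products are enough.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


# Toy data (hypothetical, not taken from the dataset).
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 2 for y in ys])
print(r, r_rescaled)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative number flips the sign of r.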

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
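
As a sketch of that recipe: rank both variables (ties receive the average of the ranks they span, a common convention assumed here), then apply the Pearson formula to the ranks. The names `average_ranks` and `spearman_rho` and the toy data are illustrative only.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation computed on the rank variables of xs and ys."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rank_x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in rank_y))
    return cross / (spread_x * spread_y)


# A monotonic but nonlinear relationship (hypothetical data): y = x ** 3.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
print(rho)
```

For this cubic relationship ρ reaches its maximum because the ordering of the two variables agrees exactly, even though the relationship is not a straight line.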

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
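
A direct translation of this definition is sketched below. It implements the plain pair-counting form (often called tau-a) and ignores ties; statistical libraries may apply a tie correction instead, so results can differ when the data contain ties. The name `kendall_tau_a` and the toy data are illustrative.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau_a(xs: Sequence[float], ys: Sequence[float]) -> float:
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both variables move the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Tied pairs (direction == 0) count toward neither group.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): 8 concordant and 2 discordant pairs out of 10.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
tau = kendall_tau_a(xs, ys)
print(tau)
```

The pairwise loop is quadratic in the number of observations, which is fine for a dataset of a few hundred rows such as this one.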

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
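
A minimal sketch of a bias-corrected Cramér's V in the style of Bergsma's correction follows. It computes the chi-squared statistic from a contingency table built with the standard library and applies no continuity correction, so it may not match the report's numbers exactly for 2 × 2 tables. The function name `cramers_v_corrected` and the toy data are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cramers_v_corrected(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Bias-corrected Cramér's V for two nominal variables of equal length."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("sequences must have equal length")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corrected - 1, r_corrected - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corrected / denominator)


# Hypothetical data: perfect association and exact independence.
perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["u"] * 50 + ["v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(perfect, independent)
```

The `max(0.0, ...)` clamp is what keeps the corrected estimate at zero for independent variables, where the uncorrected estimator would typically report a small positive value on finite samples.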

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

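Index | BSNS_YEAR | AREA_HDQRTRS_NM | BSNS_MTHD | PRDLST_NM | CNTRCT_ACMSLT_VOLM_TON | CNTRCT_ACMSLT_AMOUNT | SHIPMNT_ACMSLT_VOLM_TON | SHIPMNT_ACMSLT_VOLM_PT | SHIPMNT_ACMSLT_AMOUNT | SHIPMNT_ACMSLT_AMOUNT_PT
0 | 2004 | 전북지역본부 | 수탁 | 사과 | 972 | 3112366 | 1056 | 109 | 3362879 | 108
1 | 2004 | 충남지역본부 | 수탁 | 배 | 12009 | 15742010 | 12429 | 103 | 18139250 | 115
2 | 2004 | 강원지역본부 | 수탁 | 배 | 266 | 443750 | 228 | 86 | 423792 | 96
3 | 2004 | 제주지역본부 | 수탁 | 감귤 | 67290 | 25540204 | 59753 | 89 | 67133526 | 263
4 | 2004 | 경북지역본부 | 수탁 | 사과 | 15312 | 19647585 | 12063 | 79 | 25486897 | 130
5 | 2004 | 충남지역본부 | 수탁 | 사과 | 570 | 795000 | 586 | 103 | 1221101 | 154
6 | 2004 | 전남지역본부 | 수탁 | 배 | 11982 | 16838338 | 10184 | 85 | 18881398 | 112
7 | 2004 | 경기지역본부 | 수탁 | 배 | 10640 | 14576422 | 9822 | 92 | 17168563 | 118
8 | 2004 | 경북지역본부 | 수탁 | 배 | 1838 | 2265200 | 1311 | 71 | 2166524 | 96
9 | 2005 | 경북지역본부 | 매취 | 사과 | 2468 | 4364698 | 1650 | 67 | 3126681 | 72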

Last rows

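Index | BSNS_YEAR | AREA_HDQRTRS_NM | BSNS_MTHD | PRDLST_NM | CNTRCT_ACMSLT_VOLM_TON | CNTRCT_ACMSLT_AMOUNT | SHIPMNT_ACMSLT_VOLM_TON | SHIPMNT_ACMSLT_VOLM_PT | SHIPMNT_ACMSLT_AMOUNT | SHIPMNT_ACMSLT_AMOUNT_PT
397 | 2004 | 경북지역본부 | 수탁 | 단감 | 219 | 175320 | 104 | 47 | 183397 | 105
398 | 2004 | 경남지역본부 | 수탁 | 단감 | 8923 | 9443504 | 5426 | 61 | 7768325 | 82
399 | 2004 | 충북지역본부 | 수탁 | 배 | 853 | 947723 | 815 | 96 | 1085648 | 115
400 | 2004 | 충북지역본부 | 수탁 | 사과 | 10038 | 13258369 | 10173 | 101 | 18969861 | 143
401 | 2004 | 전북지역본부 | 수탁 | 배 | 3995 | 4166561 | 3261 | 82 | 4213426 | 101
402 | 2004 | 경남지역본부 | 수탁 | 배 | 892 | 1026246 | 719 | 81 | 982698 | 96
403 | 2004 | 경남지역본부 | 수탁 | 사과 | 7019 | 9960984 | 6465 | 92 | 12343183 | 124
404 | 2004 | 울산지역본부 | 수탁 | 배 | 1978 | 2664240 | 943 | 48 | 1646090 | 62
405 | 2004 | 경기지역본부 | 수탁 | 사과 | 266 | 233692 | 239 | 90 | 295758 | 127
406 | 2004 | 울산지역본부 | 수탁 | 단감 | 371 | 539140 | 34 | 9 | 55036 | 10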
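
In the sample rows, the two `_PT` columns appear to be the shipment figure expressed as a rounded percentage of the corresponding contract figure. This is an observation from a handful of rows, not something the report or the dataset description states, so treat the sketch below as a consistency check under that assumption. The row values are read from the first and last sample rows; `shipment_percent` is an illustrative name.

```python
# (contract value, shipment value, reported *_PT value) read from sample rows.
VOLUME_ROWS = [
    (972, 1056, 109),
    (12009, 12429, 103),
    (266, 228, 86),
    (67290, 59753, 89),
    (371, 34, 9),
]
AMOUNT_ROWS = [
    (3112366, 3362879, 108),
    (25540204, 67133526, 263),
    (539140, 55036, 10),
]


def shipment_percent(contract: int, shipment: int) -> int:
    """Shipment as a percentage of the contract figure, rounded to an integer.

    Assumes contract > 0; rows with a zero contract value would need
    separate handling.
    """
    return round(100 * shipment / contract)


volume_matches = all(shipment_percent(c, s) == pt for c, s, pt in VOLUME_ROWS)
amount_matches = all(shipment_percent(c, s) == pt for c, s, pt in AMOUNT_ROWS)
print(volume_matches, amount_matches)
```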