Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 468.9 KiB
Average record size in memory: 48.0 B

Variable types

Numeric: 1
Categorical: 5

Dataset

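Description: Mapping information showing which 2013 agricultural, livestock, and fishery product standard codes (item, market, unit, packaging, size, grade, origin) have the same meaning as the standard codes (item, market, unit, packaging, size, grade, origin) enacted or revised in 2015.
Author: 농림수산식품교육문화정보원
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20191011000000001245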

Alerts

업데이트일자 has constant value "2015-12-15" Constant
크기코드 has a high cardinality: 174 distinct values High cardinality
크기명 has a high cardinality: 106 distinct values High cardinality
구크기코드 has a high cardinality: 175 distinct values High cardinality
구크기명 has a high cardinality: 9539 distinct values High cardinality
df_index has unique values Unique

Reproduction

Analysis started: 2022-08-12 14:48:32.515738
Analysis finished: 2022-08-12 14:48:33.547420
Duration: 1.03 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8685.3458
Minimum: 0
Maximum: 17375
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 883.95
Q1: 4299.75
Median: 8671.5
Q3: 13087
95-th percentile: 16502.05
Maximum: 17375
Range: 17375
Interquartile range (IQR): 8787.25

Descriptive statistics

Standard deviation: 5036.75494
Coefficient of variation (CV): 0.5799141515
Kurtosis: -1.215331695
Mean: 8685.3458
Median Absolute Deviation (MAD): 4390
Skewness: 0.009336468874
Sum: 86853458
Variance: 25368900.32
Monotonicity: Not monotonic
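
Several of the df_index statistics are derived from others reported in this section. The sketch below recomputes them from the printed figures as a consistency check. The variable names are chosen here for illustration, and the inputs are the rounded values shown in the report, so small rounding differences are expected.

```python
# Values copied from the df_index statistics in this report (rounded as printed).
n_observations = 10000
mean = 8685.3458
std = 5036.75494
q1, q3 = 4299.75, 13087.0
minimum, maximum = 0.0, 17375.0

# Derived quantities, using their standard definitions.
cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
variance = std ** 2                # variance is the squared standard deviation
total = mean * n_observations      # sum of all values

print(f"CV       = {cv:.10f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
print(f"Variance = {variance:.2f}")
print(f"Sum      = {total:.0f}")
```

The recomputed values should agree with the reported CV, IQR, range, variance, and sum up to rounding of the inputs.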
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
11918                1      < 0.1%
15202                1      < 0.1%
14685                1      < 0.1%
16728                1      < 0.1%
9390                 1      < 0.1%
11924                1      < 0.1%
5936                 1      < 0.1%
4149                 1      < 0.1%
16185                1      < 0.1%
7309                 1      < 0.1%
Other values (9990)  9990   99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%
11     1      < 0.1%
12     1      < 0.1%
16     1      < 0.1%
18     1      < 0.1%

Value  Count  Frequency (%)
17375  1      < 0.1%
17373  1      < 0.1%
17372  1      < 0.1%
17371  1      < 0.1%
17369  1      < 0.1%
17368  1      < 0.1%
17367  1      < 0.1%
17366  1      < 0.1%
17363  1      < 0.1%
17362  1      < 0.1%

크기코드
Categorical

HIGH CARDINALITY

Distinct: 174
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value               Count
7ZZ                 444
124                 242
123                 221
125                 208
162                 146
Other values (169)  8739

Length

Max length: 10
Median length: 3
Mean length: 3.1327
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 782
2nd row: 162
3rd row: 312
4th row: 802
5th row: 172

Common Values

Value               Count  Frequency (%)
7ZZ                 444    4.4%
124                 242    2.4%
123                 221    2.2%
125                 208    2.1%
162                 146    1.5%
126                 140    1.4%
1ZZ                 125    1.2%
3ZZ                 96     1.0%
131                 86     0.9%
145                 82     0.8%
Other values (164)  8210   82.1%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
7zz                 444    4.4%
124                 242    2.4%
123                 221    2.2%
125                 208    2.1%
162                 146    1.5%
126                 140    1.4%
1zz                 125    1.2%
3zz                 96     1.0%
131                 86     0.9%
145                 82     0.8%
Other values (164)  8210   82.1%

크기명
Categorical

HIGH CARDINALITY

Distinct: 106
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value               Count
기타                676
40내                284
30내                271
50내                255
20내                199
Other values (101)  8315

Length

Max length: 16
Median length: 14
Mean length: 3.2037
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: 20내
3rd row: 12cm내외×1.2m내
4th row: 2급
5th row: 72

Common Values

Value              Count  Frequency (%)
기타               676    6.8%
40내               284    2.8%
30내               271    2.7%
50내               255    2.5%
20내               199    2.0%
60내               183    1.8%
180내              137    1.4%
6                  131    1.3%
110내              129    1.3%
15                 129    1.3%
Other values (96)  7606   76.1%

Length

Histogram of lengths of the category

구크기코드
Categorical

HIGH CARDINALITY

Distinct: 175
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value               Count
700                 91
131                 86
124                 84
152                 83
145                 82
Other values (170)  9574

Length

Max length: 10
Median length: 3
Mean length: 3.1327
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 782
2nd row: 162
3rd row: 312
4th row: 802
5th row: 172

Common Values

Value               Count  Frequency (%)
700                 91     0.9%
131                 86     0.9%
124                 84     0.8%
152                 83     0.8%
145                 82     0.8%
138                 82     0.8%
115                 81     0.8%
147                 81     0.8%
164                 80     0.8%
151                 80     0.8%
Other values (165)  9170   91.7%

Length

Histogram of lengths of the category

구크기명
Categorical

HIGH CARDINALITY

Distinct: 9539
Distinct (%): 95.4%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value                Count
기타                 4
상자 기타            3
ton 기타             3
kg PP대 기타         3
g PP대               3
Other values (9534)  9984

Length

Max length: 26
Median length: 22
Mean length: 9.037
Min length: 1

Unique

Unique: 9089
Unique (%): 90.9%

Sample

1st row: 두름 대
2nd row: ton 트럭 20내(5단위)
3rd row: l 재 12cm내외×1.2m내
4th row: 단 6개 2급
5th row: ml PE대 72개

Common Values

Value                Count  Frequency (%)
기타                 4      < 0.1%
상자 기타            3      < 0.1%
ton 기타             3      < 0.1%
kg PP대 기타         3      < 0.1%
g PP대               3      < 0.1%
kg 기타              3      < 0.1%
PP대                 3      < 0.1%
kg                   3      < 0.1%
ton PP대             3      < 0.1%
그물망 기타          3      < 0.1%
Other values (9529)  9969   99.7%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
g                   2032   7.4%
ton                 1987   7.3%
kg                  1975   7.2%
l                   946    3.5%
ml                  818    3.0%
기타                665    2.4%
그물망              499    1.8%
pp대                480    1.8%
(blank)             477    1.7%
상자                471    1.7%
Other values (175)  16929  62.1%

업데이트일자
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value       Count
2015-12-15  10000

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-12-15
2nd row: 2015-12-15
3rd row: 2015-12-15
4th row: 2015-12-15
5th row: 2015-12-15

Common Values

Value       Count  Frequency (%)
2015-12-15  10000  100.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
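
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the profiling tool runs: the function name `pearson_r` and the toy data are assumptions made here, and population (divide-by-n) moments are used, which gives the same ratio as sample moments.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data: y is an increasing linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
y_steeper = [50.0 * v - 7.0 for v in x]    # different slope and offset
y_decreasing = [-3.0 * v + 7.0 for v in x]

print(pearson_r(x, y))             # perfectly linear, increasing
print(pearson_r(x, y_steeper))     # a steeper line leaves r unchanged
print(pearson_r(x, y_decreasing))  # perfectly linear, decreasing
```

The first two calls illustrate the invariance to location and scale mentioned above: changing the slope and offset of an increasing line leaves r at 1, while a decreasing line gives -1.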

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
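
As a rough sketch of this definition, the values can first be replaced by their ranks and the Pearson formula then applied to those ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, and giving tied values the average of their ranks is an assumption made here. The report does not state how it handles ties.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical toy data: y = x**3 is monotonic but not linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

print(spearman_rho(x, y))  # the ranks agree exactly
print(pearson_r(x, y))     # below 1 because the relation is not linear
```

On this toy data the rank correlation is exactly 1 while Pearson's r falls below 1, which is the sense in which ρ captures nonlinear monotonic relationships.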

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
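
The pair-counting formula above can be sketched directly. Taken literally, it corresponds to the variant usually called tau-a. The function name `kendall_tau_a` is illustrative, and treating tied pairs as neither concordant nor discordant is an assumption made here. The report does not state which variant or tie correction it uses, so results on data with ties may differ.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

print(kendall_tau_a(x, y))  # 5 concordant, 1 discordant, 6 pairs in total
```

This brute-force version compares every pair, so its cost grows quadratically with the number of observations. It is meant to show the definition rather than to be efficient.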

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

   df_index  크기코드  크기명          구크기코드  구크기명               업데이트일자
0  11918     782       (blank)         782         두름 대                2015-12-15
1  7251      162       20내            162         ton 트럭 20내(5단위)  2015-12-15
2  9464      312       12cm내외×1.2m내  312         l 재 12cm내외×1.2m내   2015-12-15
3  16993     802       2급             802         단 6개 2급            2015-12-15
4  7725      172       72              172         ml PE대 72개          2015-12-15
5  5943      141       210내           141         l 210내               2015-12-15
6  14350     7D2       100내           7D2         kg 쾌 100내           2015-12-15
7  3563      125       50내            125         kg 접 50내            2015-12-15
8  13388     7B8       17              7B8         ton 17미              2015-12-15
9  6796      147       450내           147         g PE대 450내          2015-12-15

Last rows

      df_index  크기코드  크기명            구크기코드  구크기명                  업데이트일자
9990  13788     7C4       30내              7C4         ton 30내                  2015-12-15
9991  16092     7ZZ       기타              731         4P                        2015-12-15
9992  5364      136       160내             136         kg 개 160내               2015-12-15
9993  2879      123       30내              152         g 단 25내                 2015-12-15
9994  12536     7A6       6                 7A6         g 각 6미                  2015-12-15
9995  15800     7F2       2000내            7F2         kg 쾌 2000내              2015-12-15
9996  14446     7D3       110내             7D3         kg PAN(펜) 110내          2015-12-15
9997  1451      112       12                112         kg 포 12개                2015-12-15
9998  10808     715       6통               715         6통                       2015-12-15
9999  8764      303       12-18cm×3.6m이상  303         ton 주 12-18cm×3.6m이상   2015-12-15