Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
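
The average record size follows from the two figures above: total memory divided by the number of observations. A minimal check, assuming KiB means 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 742.2
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 76.0 B shown in the report.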

Variable types

Numeric: 4
Text: 2
Categorical: 2

Dataset

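Description: Monthly wholesale price data. - Prices of agricultural, livestock, and fishery products surveyed at 16 wholesale markets in 5 cities. - Monthly average prices for 69 wholesale items (116 varieties), surveyed by two grades (상품 and 중품, i.e. high and medium grade).
URL: https://www.data.go.kr/data/15087476/fileData.do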

Alerts

평균가격 is highly overall correlated with 유통단계별무게 (High correlation)
유통단계별무게 is highly overall correlated with 평균가격 (High correlation)
유통단계별단위명 is highly imbalanced (76.5%) (Imbalance)
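
The report does not say how the 76.5% imbalance figure is computed. One definition that reproduces it is one minus the normalized Shannon entropy of the category counts; treating that as the formula is an assumption here, and `imbalance_score` is a hypothetical helper name. The counts are copied from the Common Values table of 유통단계별단위명 later in this report.

```python
import math

# Category counts for 유통단계별단위명 (kg and four other units).
counts = [9182, 438, 185, 125, 70]


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits.

    0 means perfectly balanced classes, 1 means a single dominant class.
    Assumed definition, not confirmed by the report itself.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # balance is undefined for one class; use 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

With these counts the score comes out at about 0.765, matching the alert.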

Reproduction

Analysis started: 2023-12-12 13:55:12.127728
Analysis finished: 2023-12-12 13:55:15.309846
Duration: 3.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
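
The duration is the difference between the two timestamps above. A small sketch with the standard library, assuming both timestamps share one time zone:

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started/finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 13:55:12.127728", FMT)
finished = datetime.strptime("2023-12-12 13:55:15.309846", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```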

Variables

연도
Real number (ℝ)

Distinct28
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2010.078
Minimum1996
Maximum2023
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum1996
5-th percentile1997
Q12004
median2010
Q32017
95-th percentile2022
Maximum2023
Range27
Interquartile range (IQR)13

Descriptive statistics

Standard deviation7.8744598
Coefficient of variation (CV)0.0039174897
Kurtosis-1.1725846
Mean2010.078
Median Absolute Deviation (MAD)7
Skewness-0.10455391
Sum20100780
Variance62.007117
MonotonicityNot monotonic
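
Several of the figures above are derived from others, so they can be cross-checked. The sketch below recomputes them for 연도 from the reported mean, standard deviation, and quantiles; the variable names are illustrative.

```python
# Values copied from the 연도 statistics above.
n = 10_000
mean = 2010.078
std = 7.8744598
q1, q3 = 2004, 2017
minimum, maximum = 1996, 2023

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"CV={cv:.10f} variance={variance:.6f} IQR={iqr} range={value_range} sum={total:.0f}")
```

The recomputed values agree with the reported CV, variance, IQR, range, and sum up to rounding.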
Histogram with fixed size bins (bins=28)
Value              Count  Frequency (%)
2021                 477  4.8%
2022                 436  4.4%
2020                 429  4.3%
2017                 405  4.0%
2015                 397  4.0%
2007                 388  3.9%
2014                 386  3.9%
2005                 385  3.9%
2019                 383  3.8%
2010                 378  3.8%
Other values (18)   5936  59.4%

Value  Count  Frequency (%)
1996     293  2.9%
1997     300  3.0%
1998     289  2.9%
1999     308  3.1%
2000     318  3.2%
2001     316  3.2%
2002     337  3.4%
2003     324  3.2%
2004     374  3.7%
2005     385  3.9%

Value  Count  Frequency (%)
2023     187  1.9%
2022     436  4.4%
2021     477  4.8%
2020     429  4.3%
2019     383  3.8%
2018     368  3.7%
2017     405  4.0%
2016     378  3.8%
2015     397  4.0%
2014     386  3.9%


[variable name lost in extraction; values range from 1 to 12]
Real number (ℝ)

Distinct12
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.519
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median7
Q39
95-th percentile12
Maximum12
Range11
Interquartile range (IQR)5

Descriptive statistics

Standard deviation3.4220769
Coefficient of variation (CV)0.52493893
Kurtosis-1.2064036
Mean6.519
Median Absolute Deviation (MAD)3
Skewness-0.017971752
Sum65190
Variance11.71061
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
8                   885  8.8%
10                  879  8.8%
9                   846  8.5%
4                   841  8.4%
3                   836  8.4%
7                   834  8.3%
6                   831  8.3%
11                  825  8.2%
5                   818  8.2%
2                   816  8.2%
Other values (2)   1589  15.9%

Value  Count  Frequency (%)
1        805  8.1%
2        816  8.2%
3        836  8.4%
4        841  8.4%
5        818  8.2%
6        831  8.3%
7        834  8.3%
8        885  8.8%
9        846  8.5%
10       879  8.8%

Value  Count  Frequency (%)
12       784  7.8%
11       825  8.2%
10       879  8.8%
9        846  8.5%
8        885  8.8%
7        834  8.3%
6        831  8.3%
5        818  8.2%
4        841  8.4%
3        836  8.4%

품목명
Text

Distinct70
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length7
Median length5
Mean length2.4536
Min length1

Characters and Unicode

Total characters24536
Distinct characters110
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row풋고추
2nd row상추
3rd row참다래
4th row오이
5th row
Value  Count  Frequency (%)
오이  406  4.1%
풋고추  398  4.0%
피마늘  355  3.5%
건고추  333  3.3%
[label lost]  291  2.9%
[label lost]  284  2.8%
호박  279  2.8%
포도  262  2.6%
상추  245  2.5%
참다래  230  2.3%
Other values (60)  6917  69.2%

Most occurring characters

ValueCountFrequency (%)
1510
 
6.2%
1201
 
4.9%
946
 
3.9%
785
 
3.2%
630
 
2.6%
574
 
2.3%
564
 
2.3%
558
 
2.3%
554
 
2.3%
547
 
2.2%
Other values (100) 16667
67.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24192
98.6%
Open Punctuation 172
 
0.7%
Close Punctuation 172
 
0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1510
 
6.2%
1201
 
5.0%
946
 
3.9%
785
 
3.2%
630
 
2.6%
574
 
2.4%
564
 
2.3%
558
 
2.3%
554
 
2.3%
547
 
2.3%
Other values (98) 16323
67.5%
Open Punctuation
ValueCountFrequency (%)
( 172
100.0%
Close Punctuation
ValueCountFrequency (%)
) 172
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24192
98.6%
Common 344
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1510
 
6.2%
1201
 
5.0%
946
 
3.9%
785
 
3.2%
630
 
2.6%
574
 
2.4%
564
 
2.3%
558
 
2.3%
554
 
2.3%
547
 
2.3%
Other values (98) 16323
67.5%
Common
ValueCountFrequency (%)
( 172
50.0%
) 172
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24192
98.6%
ASCII 344
 
1.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1510
 
6.2%
1201
 
5.0%
946
 
3.9%
785
 
3.2%
630
 
2.6%
574
 
2.4%
564
 
2.3%
558
 
2.3%
554
 
2.3%
547
 
2.3%
Other values (98) 16323
67.5%
ASCII
ValueCountFrequency (%)
( 172
50.0%
) 172
50.0%

품종명
Text

Distinct130
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size156.2 KiB

Length

Max length9
Median length8
Mean length2.948
Min length1

Characters and Unicode

Total characters29480
Distinct characters182
Distinct categories10 ?
Distinct scripts3 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st row청양고추
2nd row
3rd row국산
4th row다다기계통
5th row흰 콩(국산)
Value  Count  Frequency (%)
수입  803  7.9%
국산  764  7.5%
냉동  294  2.9%
일반계  281  2.8%
[label lost]  246  2.4%
생선  228  2.2%
대파  168  1.7%
시금치  156  1.5%
무세척  155  1.5%
수미  154  1.5%
Other values (123)  6914  68.0%

Most occurring characters

ValueCountFrequency (%)
1415
 
4.8%
1365
 
4.6%
1333
 
4.5%
1113
 
3.8%
( 937
 
3.2%
) 937
 
3.2%
788
 
2.7%
752
 
2.6%
638
 
2.2%
579
 
2.0%
Other values (172) 19623
66.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 27105
91.9%
Open Punctuation 937
 
3.2%
Close Punctuation 937
 
3.2%
Space Separator 163
 
0.6%
Decimal Number 128
 
0.4%
Other Symbol 93
 
0.3%
Uppercase Letter 93
 
0.3%
Lowercase Letter 22
 
0.1%
Math Symbol 1
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1415
 
5.2%
1365
 
5.0%
1333
 
4.9%
1113
 
4.1%
788
 
2.9%
752
 
2.8%
638
 
2.4%
579
 
2.1%
575
 
2.1%
548
 
2.0%
Other values (157) 17999
66.4%
Decimal Number
ValueCountFrequency (%)
1 125
97.7%
2 1
 
0.8%
3 1
 
0.8%
5 1
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
A 31
33.3%
B 31
33.3%
M 31
33.3%
Lowercase Letter
ValueCountFrequency (%)
k 11
50.0%
g 11
50.0%
Open Punctuation
ValueCountFrequency (%)
( 937
100.0%
Close Punctuation
ValueCountFrequency (%)
) 937
100.0%
Space Separator
ValueCountFrequency (%)
163
100.0%
Other Symbol
ValueCountFrequency (%)
93
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 27105
91.9%
Common 2260
 
7.7%
Latin 115
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1415
 
5.2%
1365
 
5.0%
1333
 
4.9%
1113
 
4.1%
788
 
2.9%
752
 
2.8%
638
 
2.4%
579
 
2.1%
575
 
2.1%
548
 
2.0%
Other values (157) 17999
66.4%
Common
ValueCountFrequency (%)
( 937
41.5%
) 937
41.5%
163
 
7.2%
1 125
 
5.5%
93
 
4.1%
~ 1
 
< 0.1%
2 1
 
< 0.1%
3 1
 
< 0.1%
. 1
 
< 0.1%
5 1
 
< 0.1%
Latin
ValueCountFrequency (%)
A 31
27.0%
B 31
27.0%
M 31
27.0%
k 11
 
9.6%
g 11
 
9.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 27105
91.9%
ASCII 2282
 
7.7%
CJK Compat 93
 
0.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1415
 
5.2%
1365
 
5.0%
1333
 
4.9%
1113
 
4.1%
788
 
2.9%
752
 
2.8%
638
 
2.4%
579
 
2.1%
575
 
2.1%
548
 
2.0%
Other values (157) 17999
66.4%
ASCII
ValueCountFrequency (%)
( 937
41.1%
) 937
41.1%
163
 
7.1%
1 125
 
5.5%
A 31
 
1.4%
B 31
 
1.4%
M 31
 
1.4%
k 11
 
0.5%
g 11
 
0.5%
~ 1
 
< 0.1%
Other values (4) 4
 
0.2%
CJK Compat
ValueCountFrequency (%)
93
100.0%

평균가격
Real number (ℝ)

HIGH CORRELATION 

Distinct9207
Distinct (%)92.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean72336.942
Minimum166.952
Maximum1715500
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum166.952
5-th percentile2745.86
Q111140.116
median23947.619
Q354734.436
95-th percentile383611.54
Maximum1715500
Range1715333
Interquartile range (IQR)43594.321

Descriptive statistics

Standard deviation146863.26
Coefficient of variation (CV)2.0302663
Kurtosis23.952973
Mean72336.942
Median Absolute Deviation (MAD)15783.182
Skewness4.3120425
Sum7.2336942 × 10^8
Variance2.1568816 × 10^10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
12400.0                 10  0.1%
40000.0                  9  0.1%
12000.0                  8  0.1%
7400.0                   8  0.1%
5500.0                   8  0.1%
8800.0                   7  0.1%
8200.0                   7  0.1%
70200.0                  7  0.1%
14800.0                  6  0.1%
16000.0                  6  0.1%
Other values (9197)   9924  99.2%

Value    Count  Frequency (%)
166.952      1  < 0.1%
172.48       1  < 0.1%
172.769      1  < 0.1%
200.722      1  < 0.1%
203.538      1  < 0.1%
210.0        1  < 0.1%
212.286      1  < 0.1%
221.28       1  < 0.1%
222.0        1  < 0.1%
224.571      1  < 0.1%

Value        Count  Frequency (%)
1715500.0        1  < 0.1%
1661500.0        1  < 0.1%
1600909.091      1  < 0.1%
1580000.0        1  < 0.1%
1482631.579      1  < 0.1%
1434545.455      1  < 0.1%
1423636.364      1  < 0.1%
1404090.909      1  < 0.1%
1393750.0        1  < 0.1%
1387290.909      1  < 0.1%

등급명
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
중품  5653
상품  4328
M과  10
S과  9

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row중품
2nd row중품
3rd row상품
4th row중품
5th row중품

Common Values

Value  Count  Frequency (%)
중품  5653  56.5%
상품  4328  43.3%
M과  10  0.1%
S과  9  0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
중품  5653  56.5%
상품  4328  43.3%
m과  10  0.1%
s과  9  0.1%

유통단계별무게
Real number (ℝ)

HIGH CORRELATION 

Distinct25
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.63325
Minimum1
Maximum100
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size166.0 KiB

Quantile statistics

Minimum1
5-th percentile1
Q14
median10
Q320
95-th percentile45
Maximum100
Range99
Interquartile range (IQR)16

Descriptive statistics

Standard deviation15.354069
Coefficient of variation (CV)0.98214186
Kurtosis4.7254678
Mean15.63325
Median Absolute Deviation (MAD)8
Skewness1.8284913
Sum156332.5
Variance235.74744
MonotonicityNot monotonic
Histogram with fixed size bins (bins=25)
Value              Count  Frequency (%)
10.0                2597  26.0%
20.0                1498  15.0%
1.0                 1354  13.5%
4.0                  632  6.3%
40.0                 579  5.8%
2.0                  500  5.0%
30.0                 472  4.7%
5.0                  458  4.6%
35.0                 282  2.8%
60.0                 251  2.5%
Other values (15)   1377  13.8%

Value  Count  Frequency (%)
1.0     1354  13.5%
1.5       74  0.7%
2.0      500  5.0%
3.0       10  0.1%
4.0      632  6.3%
4.5       52  0.5%
5.0      458  4.6%
6.0       10  0.1%
7.5       12  0.1%
8.0      208  2.1%

Value  Count  Frequency (%)
100.0     53  0.5%
60.0     251  2.5%
50.0      89  0.9%
45.0     189  1.9%
40.0     579  5.8%
35.0     282  2.8%
30.0     472  4.7%
22.5       1  < 0.1%
20.0    1498  15.0%
18.0     218  2.2%

유통단계별단위명
Categorical

IMBALANCE 

Distinct5
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size156.2 KiB
kg  9182
[label lost]  438
마리  185
[label lost]  125
[label lost]  70

Length

Max length2
Median length2
Mean length1.9367
Min length1

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowkg
2nd rowkg
3rd rowkg
4th rowkg
5th rowkg

Common Values

Value  Count  Frequency (%)
kg  9182  91.8%
[label lost]  438  4.4%
마리  185  1.8%
[label lost]  125  1.2%
[label lost]  70  0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
kg  9182  91.8%
[label lost]  438  4.4%
마리  185  1.8%
[label lost]  125  1.2%
[label lost]  70  0.7%

Interactions


Correlations

(columns, in order)  연도  [name lost]  품목명  평균가격  등급명  유통단계별무게  유통단계별단위명
연도  1.000  0.000  0.320  0.162  0.078  0.104  0.135
[name lost]  0.000  1.000  0.179  0.060  0.000  0.000  0.061
품목명  0.320  0.179  1.000  0.740  0.581  0.987  0.984
평균가격  0.162  0.060  0.740  1.000  0.049  0.629  0.093
등급명  0.078  0.000  0.581  0.049  1.000  0.060  0.100
유통단계별무게  0.104  0.000  0.987  0.629  0.060  1.000  0.337
유통단계별단위명  0.135  0.061  0.984  0.093  0.100  0.337  1.000

(columns, in order)  유통단계별단위명  등급명
유통단계별단위명  1.000  0.081
등급명  0.081  1.000

(columns, in order)  연도  [name lost]  평균가격  유통단계별무게  등급명  유통단계별단위명
연도  1.000  -0.036  0.166  -0.037  0.047  0.059
[name lost]  -0.036  1.000  -0.024  -0.001  0.000  0.025
평균가격  0.166  -0.024  1.000  0.742  0.030  0.039
유통단계별무게  -0.037  -0.001  0.742  1.000  0.041  0.223
등급명  0.047  0.000  0.030  0.041  1.000  0.081
유통단계별단위명  0.059  0.025  0.039  0.223  0.081  1.000
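
To relate this matrix to the high-correlation alert, the sketch below lists every variable pair whose absolute coefficient reaches a cut-off. The 0.5 threshold and the helper name `high_pairs` are assumptions for illustration; the report does not state which matrix or threshold produced the alert.

```python
from itertools import combinations

# Names and coefficients copied from the matrix above. The second variable's
# name did not survive extraction, so a placeholder is used.
names = ["연도", "[name lost]", "평균가격", "유통단계별무게", "등급명", "유통단계별단위명"]
matrix = [
    [1.000, -0.036, 0.166, -0.037, 0.047, 0.059],
    [-0.036, 1.000, -0.024, -0.001, 0.000, 0.025],
    [0.166, -0.024, 1.000, 0.742, 0.030, 0.039],
    [-0.037, -0.001, 0.742, 1.000, 0.041, 0.223],
    [0.047, 0.000, 0.030, 0.041, 1.000, 0.081],
    [0.059, 0.025, 0.039, 0.223, 0.081, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_pairs(names, matrix, threshold):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


for a, b, r in high_pairs(names, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {r}")
```

With this cut-off the only pair returned is 평균가격 and 유통단계별무게 (0.742), the same pair named in the alerts.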

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

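[lost] marks a cell whose value did not survive extraction; in rows 1699 and 7168 the placement of the surviving name is inferred.

(index)  연도  [name lost]  품목명  품종명  평균가격  등급명  유통단계별무게  유통단계별단위명
36073  2018  5  풋고추  청양고추  30430.0  중품  10.0  kg
42200  2021  8  상추  [lost]  38351.429  중품  4.0  kg
8201  2001  10  참다래  국산  23187.5  상품  10.0  kg
22583  2010  8  오이  다다기계통  25409.136  중품  10.0  kg
38367  2019  8  [lost]  흰 콩(국산)  188600.0  중품  35.0  kg
38738  2019  10  열무  열무  7361.905  중품  4.0  kg
32134  2016  3  건고추  양건  907636.364  상품  60.0  kg
23758  2011  4  땅콩  국산  189276.19  상품  30.0  kg
2341  1997  9  참깨  백색(국산)  321650.609  중품  30.0  kg
33226  2016  10  포도  거봉  7353.846  중품  2.0  kg

(index)  연도  [name lost]  품목명  품종명  평균가격  등급명  유통단계별무게  유통단계별단위명
38087  2019  6  오이  다다기계통  23824.561  중품  100.0  [lost]
1699  1997  4  고구마  [lost]  9964.96  중품  10.0  kg
7168  2001  2  [lost]  대파  485.417  중품  1.0  kg
2625  1997  11  갈치  생선  11314.76  중품  1.0  kg
40190  2020  7  깐마늘(국산)  깐마늘(남도)  106956.522  상품  20.0  kg
19886  2008  12  피마늘  한지kg  29666.667  중품  10.0  kg
13379  2005  2  배추  월동  236.471  중품  1.0  kg
6493  2000  8  오렌지  수입  18561.538  중품  18.0  kg
42900  2021  12  피마늘  난지(남도)  70200.0  중품  10.0  kg
30941  2015  7  건고추  양건  1010000.0  상품  60.0  kg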