Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
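
The average record size follows from the total memory size and the number of observations. A minimal arithmetic check, assuming KiB means 1024 bytes:

```python
# Figures reported in the dataset statistics above.
total_kib = 742.2
n_rows = 10_000

bytes_per_kib = 1024  # assumption: KiB is the binary unit (1024 bytes)
avg_record_bytes = total_kib * bytes_per_kib / n_rows
print(f"{avg_record_bytes:.1f} B per record")
```

The result rounds to the 76.0 B reported above.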

Variable types

Numeric: 4
Text: 2
Categorical: 2

Dataset

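Description: Monthly retail price data. It covers prices of agricultural, livestock, and fishery products surveyed at 54 retail markets in 19 cities, reported as monthly average prices for 90 items and 143 varieties under two grades (상품, high grade; 중품, medium grade).
URL: https://www.data.go.kr/data/15087482/fileData.do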

Alerts

Imbalance: 등급명 is highly imbalanced (58.1%)
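
One way to arrive at a figure like 58.1% is to take one minus the normalized Shannon entropy of the class shares. The sketch below assumes that definition; the report itself does not state the formula. The counts come from the 등급명 Common Values table, and the even split of the 28 rows in the two unlisted categories is an assumption, so the result is only approximate.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts from the 등급명 Common Values table. The last two entries are an
# ASSUMED even split of the 28 rows grouped as "Other values (2)".
grade_counts = [4949, 4319, 185, 172, 171, 57, 53, 25, 23, 18, 14, 14]

score = imbalance_score(grade_counts)
print(f"imbalance = {score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by a single class approaches 1.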

Reproduction

Analysis started: 2023-12-12 07:16:06.093993
Analysis finished: 2023-12-12 07:16:09.138901
Duration: 3.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
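
A report of this kind can be regenerated roughly as follows. This is a sketch, not the configuration actually used: the function name, the file paths, the report title, and the `cp949` encoding are all assumptions, and the real settings live in the config.json mentioned above.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is a guess; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Monthly retail prices")
    report.to_file(html_path)
    return report
```

Calling `build_report("prices.csv")` with a local copy of the file (the file name is hypothetical) would write `report.html` next to the script.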

Variables

연도
Real number (ℝ)

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.4044
Minimum: 1996
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1996
5th percentile: 1997
Q1: 2013
Median: 2016
Q3: 2019
95th percentile: 2022
Maximum: 2023
Range: 27
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 8.530307
Coefficient of variation (CV): 0.0042367579
Kurtosis: -0.3907541
Mean: 2013.4044
Median Absolute Deviation (MAD): 3
Skewness: -1.0565768
Sum: 20134044
Variance: 72.766137
Monotonicity: Not monotonic
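
Several of the 연도 statistics are derived from one another, so they can be cross-checked. The sketch below recomputes the derived values from the reported ones; the tolerance is an arbitrary choice, and the number of observations is taken from the overview.

```python
import math

# Values copied from the 연도 statistics above.
reported = {
    "min": 1996, "max": 2023, "range": 27,
    "q1": 2013, "q3": 2019, "iqr": 6,
    "mean": 2013.4044, "std": 8.530307,
    "variance": 72.766137, "cv": 0.0042367579,
    "sum": 20134044, "n": 10000,
}

# Each entry pairs a recomputed value with the value stated in the report.
checks = {
    "range = max - min": (reported["max"] - reported["min"], reported["range"]),
    "IQR = Q3 - Q1": (reported["q3"] - reported["q1"], reported["iqr"]),
    "variance = std^2": (reported["std"] ** 2, reported["variance"]),
    "CV = std / mean": (reported["std"] / reported["mean"], reported["cv"]),
    "mean = sum / n": (reported["sum"] / reported["n"], reported["mean"]),
}

for name, (computed, stated) in checks.items():
    ok = math.isclose(computed, stated, rel_tol=1e-5)
    print(f"{name}: computed={computed:.8g} stated={stated} ok={ok}")
```

The same relations hold for the other numeric variables in this report.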
Histogram with fixed size bins (bins=16)
Value | Count | Frequency (%)
2017 | 829 | 8.3%
2016 | 769 | 7.7%
2021 | 766 | 7.7%
2018 | 766 | 7.7%
2019 | 756 | 7.6%
2014 | 754 | 7.5%
2020 | 747 | 7.5%
2015 | 744 | 7.4%
2013 | 737 | 7.4%
2022 | 640 | 6.4%
Other values (6) | 2492 | 24.9%

Value | Count | Frequency (%)
1996 | 405 | 4.0%
1997 | 423 | 4.2%
1998 | 439 | 4.4%
1999 | 455 | 4.5%
2000 | 452 | 4.5%
2013 | 737 | 7.4%
2014 | 754 | 7.5%
2015 | 744 | 7.4%
2016 | 769 | 7.7%
2017 | 829 | 8.3%

Value | Count | Frequency (%)
2023 | 318 | 3.2%
2022 | 640 | 6.4%
2021 | 766 | 7.7%
2020 | 747 | 7.5%
2019 | 756 | 7.6%
2018 | 766 | 7.7%
2017 | 829 | 8.3%
2016 | 769 | 7.7%
2015 | 744 | 7.4%
2014 | 754 | 7.5%

(month; column name missing in source)
Real number (ℝ)

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.483
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 6
Q3: 9
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.4304064
Coefficient of variation (CV): 0.52913873
Kurtosis: -1.1990082
Mean: 6.483
Median Absolute Deviation (MAD): 3
Skewness: 0.0079256673
Sum: 64830
Variance: 11.767688
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
5 | 887 | 8.9%
9 | 862 | 8.6%
8 | 855 | 8.6%
3 | 849 | 8.5%
10 | 837 | 8.4%
12 | 835 | 8.3%
4 | 830 | 8.3%
6 | 829 | 8.3%
1 | 827 | 8.3%
7 | 817 | 8.2%
Other values (2) | 1572 | 15.7%

Value | Count | Frequency (%)
1 | 827 | 8.3%
2 | 808 | 8.1%
3 | 849 | 8.5%
4 | 830 | 8.3%
5 | 887 | 8.9%
6 | 829 | 8.3%
7 | 817 | 8.2%
8 | 855 | 8.6%
9 | 862 | 8.6%
10 | 837 | 8.4%

Value | Count | Frequency (%)
12 | 835 | 8.3%
11 | 764 | 7.6%
10 | 837 | 8.4%
9 | 862 | 8.6%
8 | 855 | 8.6%
7 | 817 | 8.2%
6 | 829 | 8.3%
5 | 887 | 8.9%
4 | 830 | 8.3%
3 | 849 | 8.5%

품목명
Text

Distinct: 91
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 2.5745
Min length: 1

Characters and Unicode

Total characters: 25745
Distinct characters: 134
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 고등어
2nd row: 고구마
3rd row: 건고추
4th row: 갈치
5th row: 깐마늘(국산)
Value | Count | Frequency (%)
쇠고기 | 638 | 6.4%
오이 | 399 | 4.0%
풋고추 | 345 | 3.5%
건고추 | 278 | 2.8%
상추 | 278 | 2.8%
호박 | 271 | 2.7%
참깨 | 242 | 2.4%
[value lost] | 234 | 2.3%
땅콩 | 228 | 2.3%
[value lost] | 208 | 2.1%
Other values (81) | 6879 | 68.8%

Most occurring characters

ValueCountFrequency (%)
2108
 
8.2%
1446
 
5.6%
1017
 
4.0%
747
 
2.9%
667
 
2.6%
638
 
2.5%
619
 
2.4%
565
 
2.2%
548
 
2.1%
516
 
2.0%
Other values (124) 16874
65.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 25449
98.9%
Close Punctuation 148
 
0.6%
Open Punctuation 148
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2108
 
8.3%
1446
 
5.7%
1017
 
4.0%
747
 
2.9%
667
 
2.6%
638
 
2.5%
619
 
2.4%
565
 
2.2%
548
 
2.2%
516
 
2.0%
Other values (122) 16578
65.1%
Close Punctuation
ValueCountFrequency (%)
) 148
100.0%
Open Punctuation
ValueCountFrequency (%)
( 148
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 25449
98.9%
Common 296
 
1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2108
 
8.3%
1446
 
5.7%
1017
 
4.0%
747
 
2.9%
667
 
2.6%
638
 
2.5%
619
 
2.4%
565
 
2.2%
548
 
2.2%
516
 
2.0%
Other values (122) 16578
65.1%
Common
ValueCountFrequency (%)
) 148
50.0%
( 148
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 25449
98.9%
ASCII 296
 
1.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2108
 
8.3%
1446
 
5.7%
1017
 
4.0%
747
 
2.9%
667
 
2.6%
638
 
2.5%
619
 
2.4%
565
 
2.2%
548
 
2.2%
516
 
2.0%
Other values (122) 16578
65.1%
ASCII
ValueCountFrequency (%)
) 148
50.0%
( 148
50.0%

품종명
Text

Distinct: 148
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 17
Mean length: 3.2119
Min length: 1

Characters and Unicode

Total characters: 32119
Distinct characters: 228
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 생선
2nd row: [value lost]
3rd row: 화건
4th row: 생선
5th row: 깐마늘(국산)
Value | Count | Frequency (%)
수입 | 778 | 7.5%
국산 | 559 | 5.4%
냉동 | 269 | 2.6%
[value lost] | 248 | 2.4%
생선 | 204 | 2.0%
양배추 | 160 | 1.5%
[value lost] | 151 | 1.4%
일반계 | 150 | 1.4%
신고 | 147 | 1.4%
무세척 | 146 | 1.4%
Other values (163) | 7628 | 73.1%

Most occurring characters

ValueCountFrequency (%)
1588
 
4.9%
1521
 
4.7%
1203
 
3.7%
923
 
2.9%
871
 
2.7%
( 815
 
2.5%
) 815
 
2.5%
712
 
2.2%
649
 
2.0%
576
 
1.8%
Other values (218) 22446
69.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29551
92.0%
Open Punctuation 815
 
2.5%
Close Punctuation 815
 
2.5%
Space Separator 440
 
1.4%
Decimal Number 245
 
0.8%
Lowercase Letter 114
 
0.4%
Uppercase Letter 85
 
0.3%
Other Punctuation 27
 
0.1%
Math Symbol 27
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1588
 
5.4%
1521
 
5.1%
1203
 
4.1%
923
 
3.1%
871
 
2.9%
712
 
2.4%
649
 
2.2%
576
 
1.9%
553
 
1.9%
518
 
1.8%
Other values (195) 20437
69.2%
Decimal Number
ValueCountFrequency (%)
0 58
23.7%
1 53
21.6%
3 49
20.0%
2 27
11.0%
5 26
10.6%
8 18
 
7.3%
4 13
 
5.3%
7 1
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
B 19
22.4%
A 19
22.4%
M 19
22.4%
L 12
14.1%
C 7
 
8.2%
J 7
 
8.2%
E 1
 
1.2%
U 1
 
1.2%
Lowercase Letter
ValueCountFrequency (%)
g 77
67.5%
k 37
32.5%
Open Punctuation
ValueCountFrequency (%)
( 815
100.0%
Close Punctuation
ValueCountFrequency (%)
) 815
100.0%
Space Separator
ValueCountFrequency (%)
440
100.0%
Other Punctuation
ValueCountFrequency (%)
. 27
100.0%
Math Symbol
ValueCountFrequency (%)
× 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 29551
92.0%
Common 2369
 
7.4%
Latin 199
 
0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1588
 
5.4%
1521
 
5.1%
1203
 
4.1%
923
 
3.1%
871
 
2.9%
712
 
2.4%
649
 
2.2%
576
 
1.9%
553
 
1.9%
518
 
1.8%
Other values (195) 20437
69.2%
Common
ValueCountFrequency (%)
( 815
34.4%
) 815
34.4%
440
18.6%
0 58
 
2.4%
1 53
 
2.2%
3 49
 
2.1%
. 27
 
1.1%
2 27
 
1.1%
× 27
 
1.1%
5 26
 
1.1%
Other values (3) 32
 
1.4%
Latin
ValueCountFrequency (%)
g 77
38.7%
k 37
18.6%
B 19
 
9.5%
A 19
 
9.5%
M 19
 
9.5%
L 12
 
6.0%
C 7
 
3.5%
J 7
 
3.5%
E 1
 
0.5%
U 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 29551
92.0%
ASCII 2541
 
7.9%
None 27
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1588
 
5.4%
1521
 
5.1%
1203
 
4.1%
923
 
3.1%
871
 
2.9%
712
 
2.4%
649
 
2.2%
576
 
1.9%
553
 
1.9%
518
 
1.8%
Other values (195) 20437
69.2%
ASCII
ValueCountFrequency (%)
( 815
32.1%
) 815
32.1%
440
17.3%
g 77
 
3.0%
0 58
 
2.3%
1 53
 
2.1%
3 49
 
1.9%
k 37
 
1.5%
. 27
 
1.1%
2 27
 
1.1%
Other values (12) 143
 
5.6%
None
ValueCountFrequency (%)
× 27
100.0%

평균가격
Real number (ℝ)

Distinct: 9819
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5656.5182
Minimum: 80.417
Maximum: 68205.289
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 80.417
5th percentile: 428.5595
Q1: 1363.7967
Median: 3008.238
Q3: 6772.705
95th percentile: 19949.37
Maximum: 68205.289
Range: 68124.872
Interquartile range (IQR): 5408.9083

Descriptive statistics

Standard deviation: 7508.8231
Coefficient of variation (CV): 1.3274638
Kurtosis: 14.952322
Mean: 5656.5182
Median Absolute Deviation (MAD): 2079.1475
Skewness: 3.3040441
Sum: 56565182
Variance: 56382424
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2700.0 | 9 | 0.1%
1528.571 | 8 | 0.1%
7700.0 | 5 | 0.1%
2400.0 | 5 | 0.1%
6165.556 | 5 | 0.1%
1200.0 | 5 | 0.1%
9823.077 | 4 | < 0.1%
4043.0 | 4 | < 0.1%
1370.0 | 4 | < 0.1%
5200.0 | 4 | < 0.1%
Other values (9809) | 9947 | 99.5%

Value | Count | Frequency (%)
80.417 | 1 | < 0.1%
85.32 | 1 | < 0.1%
87.48 | 1 | < 0.1%
87.5 | 1 | < 0.1%
94.727 | 1 | < 0.1%
96.292 | 1 | < 0.1%
98.538 | 1 | < 0.1%
102.857 | 1 | < 0.1%
103.417 | 1 | < 0.1%
104.077 | 1 | < 0.1%

Value | Count | Frequency (%)
68205.289 | 1 | < 0.1%
65223.821 | 1 | < 0.1%
65187.018 | 1 | < 0.1%
64791.802 | 1 | < 0.1%
64597.294 | 1 | < 0.1%
64166.367 | 1 | < 0.1%
63721.617 | 1 | < 0.1%
63613.71 | 1 | < 0.1%
63044.358 | 1 | < 0.1%
62581.308 | 1 | < 0.1%

등급명
Categorical

IMBALANCE 

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
중품: 4949
상품: 4319
냉동: 185
냉장: 172
1등급: 171
Other values (7): 204

Length

Max length: 4
Median length: 2
Mean length: 2.0273
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중품
2nd row: 상품
3rd row: 상품
4th row: 중품
5th row: 중품

Common Values

Value | Count | Frequency (%)
중품 | 4949 | 49.5%
상품 | 4319 | 43.2%
냉동 | 185 | 1.8%
냉장 | 172 | 1.7%
1등급 | 171 | 1.7%
3등급 | 57 | 0.6%
1+등급 | 53 | 0.5%
[value lost] | 25 | 0.2%
[value lost] | 23 | 0.2%
S과 | 18 | 0.2%
Other values (2) | 28 | 0.3%

Length

Histogram of lengths of the category

유통단계별무게
Real number (ℝ)

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.6558
Minimum: 1
Maximum: 600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 10
Q3: 100
95th percentile: 500
Maximum: 600
Range: 599
Interquartile range (IQR): 99

Descriptive statistics

Standard deviation: 149.95142
Coefficient of variation (CV): 1.7924808
Kurtosis: 4.7451217
Mean: 83.6558
Median Absolute Deviation (MAD): 9
Skewness: 2.4125662
Sum: 836558
Variance: 22485.428
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
1 | 4051 | 40.5%
100 | 2935 | 29.3%
10 | 1549 | 15.5%
500 | 636 | 6.4%
600 | 278 | 2.8%
2 | 111 | 1.1%
150 | 100 | 1.0%
5 | 97 | 1.0%
200 | 97 | 1.0%
20 | 77 | 0.8%

Value | Count | Frequency (%)
1 | 4051 | 40.5%
2 | 111 | 1.1%
5 | 97 | 1.0%
10 | 1549 | 15.5%
20 | 77 | 0.8%
30 | 69 | 0.7%
100 | 2935 | 29.3%
150 | 100 | 1.0%
200 | 97 | 1.0%
500 | 636 | 6.4%

Value | Count | Frequency (%)
600 | 278 | 2.8%
500 | 636 | 6.4%
200 | 97 | 1.0%
150 | 100 | 1.0%
100 | 2935 | 29.3%
30 | 69 | 0.7%
20 | 77 | 0.8%
10 | 1549 | 15.5%
5 | 97 | 1.0%
2 | 111 | 1.1%

유통단계별단위
Categorical

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
g: 4046
[value lost]: 2346
kg: 2272
마리: 787
포기: 364
Other values (5): 185

Length

Max length: 2
Median length: 1
Mean length: 1.3496
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 마리
2nd row: kg
3rd row: g
4th row: 마리
5th row: kg

Common Values

Value | Count | Frequency (%)
g | 4046 | 40.5%
[value lost] | 2346 | 23.5%
kg | 2272 | 22.7%
마리 | 787 | 7.9%
포기 | 364 | 3.6%
[value lost] | 79 | 0.8%
리터 | 52 | 0.5%
묶음 | 21 | 0.2%
[value lost] | 20 | 0.2%
[value lost] | 13 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values of 유통단계별단위; image not included.]

Interactions

[16 pairwise interaction plots, one for each ordered pair of the four numeric variables; images not included.]

Correlations

Variable | 연도 | (month) | 품목명 | 평균가격 | 등급명 | 유통단계별무게 | 유통단계별단위
연도 | 1.000 | 0.068 | 0.415 | 0.145 | 0.308 | 0.206 | 0.169
(month) | 0.068 | 1.000 | 0.120 | 0.000 | 0.000 | 0.000 | 0.000
품목명 | 0.415 | 0.120 | 1.000 | 0.860 | 0.834 | 1.000 | 0.996
평균가격 | 0.145 | 0.000 | 0.860 | 1.000 | 0.194 | 0.316 | 0.440
등급명 | 0.308 | 0.000 | 0.834 | 0.194 | 1.000 | 0.433 | 0.393
유통단계별무게 | 0.206 | 0.000 | 1.000 | 0.316 | 0.433 | 1.000 | 0.683
유통단계별단위 | 0.169 | 0.000 | 0.996 | 0.440 | 0.393 | 0.683 | 1.000
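
To pick out the strongest associations from the seven-variable matrix, one can scan its upper triangle. The sketch below does that with the reported values. The 0.8 threshold and the `(month)` label for the unnamed column are assumptions, and the report does not state which coefficient each cell uses.

```python
labels = ["연도", "(month)", "품목명", "평균가격", "등급명", "유통단계별무게", "유통단계별단위"]

# Rows and columns follow the order of `labels`; values copied from the matrix above.
matrix = [
    [1.000, 0.068, 0.415, 0.145, 0.308, 0.206, 0.169],
    [0.068, 1.000, 0.120, 0.000, 0.000, 0.000, 0.000],
    [0.415, 0.120, 1.000, 0.860, 0.834, 1.000, 0.996],
    [0.145, 0.000, 0.860, 1.000, 0.194, 0.316, 0.440],
    [0.308, 0.000, 0.834, 0.194, 1.000, 0.433, 0.393],
    [0.206, 0.000, 1.000, 0.316, 0.433, 1.000, 0.683],
    [0.169, 0.000, 0.996, 0.440, 0.393, 0.683, 1.000],
]

def strongest_pairs(labels, matrix, threshold=0.8):
    """Return off-diagonal pairs with |value| >= threshold, strongest first."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)

for a, b, value in strongest_pairs(labels, matrix):
    print(f"{a} / {b}: {value:.3f}")
```

Every pair above the threshold involves 품목명, which suggests that the item largely determines the weight, unit, price level, and grade scheme of a record.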
Variable | 유통단계별단위 | 등급명
유통단계별단위 | 1.000 | 0.176
등급명 | 0.176 | 1.000
Variable | 연도 | (month) | 평균가격 | 유통단계별무게 | 등급명 | 유통단계별단위
연도 | 1.000 | -0.077 | 0.171 | 0.011 | 0.128 | 0.090
(month) | -0.077 | 1.000 | 0.009 | -0.013 | 0.000 | 0.000
평균가격 | 0.171 | 0.009 | 1.000 | -0.198 | 0.082 | 0.148
유통단계별무게 | 0.011 | -0.013 | -0.198 | 1.000 | 0.184 | 0.446
등급명 | 0.128 | 0.000 | 0.082 | 0.184 | 1.000 | 0.176
유통단계별단위 | 0.090 | 0.000 | 0.148 | 0.446 | 0.176 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

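(index) | 연도 | (month) | 품목명 | 품종명 | 평균가격 | 등급명 | 유통단계별무게 | 유통단계별단위
23001 | 2021 | 12 | 고등어 | 생선 | 3457.607 | 중품 | 1 | 마리
3495 | 1999 | 4 | 고구마 | [value lost] | 2194.75 | 상품 | 1 | kg
18263 | 2019 | 7 | 건고추 | 화건 | 16661.797 | 상품 | 600 | g
7058 | 2013 | 10 | 갈치 | 생선 | 8163.952 | 중품 | 1 | 마리
24918 | 2023 | 2 | 깐마늘(국산) | 깐마늘(국산) | 9433.805 | 중품 | 1 | kg
11387 | 2016 | 2 | 아몬드 | 수입 | 2040.806 | 중품 | 100 | g
1721 | 1997 | 8 | 참깨 | 중국 | 3123.32 | 중품 | 500 | g
7877 | 2014 | 4 | 딸기 | 딸기 | 865.091 | 상품 | 100 | g
10213 | 2015 | 7 | 멜론 | 멜론 | 6994.532 | 상품 | 1 | [value lost]
4566 | 2000 | 3 | [value lost] | 백태(국산) | 2265.0 | 상품 | 500 | g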
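
(index) | 연도 | (month) | 품목명 | 품종명 | 평균가격 | 등급명 | 유통단계별무게 | 유통단계별단위
13731 | 2017 | 4 | 사과 | 후지 | 13887.136 | 중품 | 10 | [value lost]
7141 | 2013 | 11 | 피망 | [value lost] | 551.714 | 중품 | 100 | g
8820 | 2014 | 10 | 붉은고추 | 붉은고추 | 1269.19 | 상품 | 100 | g
3894 | 1999 | 8 | 시금치 | 시금치 | 5487.115 | 상품 | 1 | kg
15646 | 2018 | 3 | 쇠고기 | 호주산갈비 | 1955.84 | 냉동 | 100 | g
22916 | 2021 | 12 | 풋고추 | 꽈리고추 | 1194.287 | 상품 | 100 | g
2582 | 1998 | 6 | 양배추 | 양배추 | 1987.708 | 상품 | 1 | 포기
7281 | 2013 | 12 | 풋고추 | 꽈리고추 | 1061.26 | 상품 | 100 | g
7394 | 2014 | 1 | 배추 | 월동 | 2013.569 | 중품 | 1 | 포기
1832 | 1997 | 9 | 고등어 | 냉동 | 1004.348 | 중품 | 1 | 마리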