Overview

Dataset statistics

Number of variables: 8
Number of observations: 2062
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 139.1 KiB
Average record size in memory: 69.1 B

Variable types

Text: 1
Categorical: 3
Numeric: 4

Dataset

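Description: 품목명 (item name), 등급 (grade), 수량 (quantity), 단위 (unit), 최고가 (highest price), 최저가 (lowest price), 평균가 (average price), 조사일 (survey date)
Author: 서울시농수산식품공사 (Seoul Agro-Fisheries & Food Corporation)
URL: https://data.seoul.go.kr/dataList/OA-2664/S/1/datasetView.do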

Alerts

조사일 has constant value "20240511" (Constant)
최고가 is highly correlated overall with 최저가 and 1 other field (High correlation)
최저가 is highly correlated overall with 최고가 and 1 other field (High correlation)
평균가 is highly correlated overall with 최고가 and 1 other field (High correlation)
단위 is highly imbalanced (66.2%) (Imbalance)
최고가 has 1289 (62.5%) zeros (Zeros)
최저가 has 1289 (62.5%) zeros (Zeros)
평균가 has 1289 (62.5%) zeros (Zeros)
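
The three Zeros alerts report the same count (1289 rows, 62.5%), which suggests, but does not prove, that a zero marks an item with no recorded trade on the survey date rather than a real price. The sketch below works under that assumption: it separates rows whose three price fields are all zero and summarizes only the remaining rows. The data are the price triples of the first ten rows in the Sample section; the helper name `split_untraded` is hypothetical.

```python
from statistics import mean

# (최고가, 최저가, 평균가) for the first ten rows shown in the Sample section.
rows = [
    (0, 0, 0), (0, 0, 0), (45000, 18300, 33024), (0, 0, 0),
    (55000, 26667, 36025), (0, 0, 0), (0, 0, 0), (0, 0, 0),
    (210000, 200000, 202500), (175000, 170000, 173333),
]


def split_untraded(records):
    """Separate rows whose three price fields are all zero from the rest.

    Assumption: an all-zero price triple means "no trade recorded",
    not a genuine price of zero.
    """
    traded = [r for r in records if any(r)]
    untraded = [r for r in records if not any(r)]
    return traded, untraded


traded, untraded = split_untraded(rows)
zero_share = len(untraded) / len(rows)
mean_avg_price_traded = mean(r[2] for r in traded)

print(f"all-zero rows: {len(untraded)} of {len(rows)} ({zero_share:.1%})")
print(f"mean 평균가 over traded rows: {mean_avg_price_traded:.1f}")
```

If the assumption holds, statistics computed on the traded rows alone describe market prices better than the full-column figures, which are dominated by the zeros.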

Reproduction

Analysis started: 2024-05-11 05:54:52.975868
Analysis finished: 2024-05-11 05:54:58.536549
Duration: 5.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

품목명
Text

Distinct: 429
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Memory size: 16.2 KiB

Length

Max length: 11
Median length: 8
Mean length: 5.6585839
Min length: 1

Characters and Unicode

Total characters: 11668
Distinct characters: 323
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
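
As an illustration of how such character properties can be tallied, the sketch below counts the Unicode general categories in one item name from this dataset. It uses Python's standard `unicodedata` module, which exposes the general category (for example `Lo` for Other Letter, `Zs` for Space Separator) but not the Script property, so the `rough_script` helper is a hypothetical stand-in that guesses the script from the character name. It is not how the report itself derives scripts.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


def rough_script(ch):
    """Rough script guess from the character name; not a true Script lookup."""
    name = unicodedata.name(ch, "")
    if name.startswith("HANGUL"):
        return "Hangul"
    if name.startswith("LATIN"):
        return "Latin"
    return "Common"


sample = "(냉)고등어 수입"
cats = category_counts(sample)
scripts = Counter(rough_script(ch) for ch in sample)

print(dict(cats))
print(dict(scripts))
```

Parentheses, spaces, and digits fall under Common here, matching how the report groups them in its script tables.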

Unique

Unique: 55
Unique (%): 2.7%
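
The report distinguishes Distinct (how many different values occur) from Unique (how many values occur exactly once). The sketch below shows the difference on the five sample rows of 품목명 listed in this report, and recomputes the two reported percentages from the counts and the number of observations. It illustrates the definitions only; the variable names are not taken from the profiling tool.

```python
from collections import Counter

# 품목명 values of the five sample rows shown in this report.
names = ["(냉)갈치", "(냉)갈치", "(냉)고등어", "(냉)고등어", "(냉)고등어 수입"]

counts = Counter(names)
distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

# Percentages as reported, relative to the 2062 observations.
n_observations = 2062
distinct_pct = round(100 * 429 / n_observations, 1)
unique_pct = round(100 * 55 / n_observations, 1)

print(distinct, unique, distinct_pct, unique_pct)
```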

Sample

1st row: (냉)갈치
2nd row: (냉)갈치
3rd row: (냉)고등어
4th row: (냉)고등어
5th row: (냉)고등어 수입
Value | Count | Frequency (%)
복숭아 (peach) | 292 | 8.8%
수입 (imported) | 171 | 5.1%
사과 (apple) | 140 | 4.2%
국산 (domestic) | 75 | 2.2%
포도 (grape) | 60 | 1.8%
딸기 (strawberry) | 56 | 1.7%
만감 (late-season citrus) | 52 | 1.6%
감귤 (mandarin) | 48 | 1.4%
양파 (onion) | 44 | 1.3%
자두 (plum) | 44 | 1.3%
Other values (421) | 2352 | 70.5%

Most occurring characters

Value | Count | Frequency (%)
[space] | 1272 | 10.9%
( | 488 | 4.2%
) | 488 | 4.2%
 | 353 | 3.0%
 | 302 | 2.6%
 | 298 | 2.6%
 | 282 | 2.4%
 | 261 | 2.2%
 | 195 | 1.7%
 | 192 | 1.6%
Other values (313) | 7537 | 64.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9390 | 80.5%
Space Separator | 1272 | 10.9%
Open Punctuation | 488 | 4.2%
Close Punctuation | 488 | 4.2%
Uppercase Letter | 24 | 0.2%
Decimal Number | 6 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 353 | 3.8%
 | 302 | 3.2%
 | 298 | 3.2%
 | 282 | 3.0%
 | 261 | 2.8%
 | 195 | 2.1%
 | 192 | 2.0%
 | 189 | 2.0%
 | 173 | 1.8%
 | 170 | 1.8%
Other values (306) | 6975 | 74.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 8 | 33.3%
M | 8 | 33.3%
B | 8 | 33.3%

Space Separator
Value | Count | Frequency (%)
[space] | 1272 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 488 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 488 | 100.0%

Decimal Number
Value | Count | Frequency (%)
5 | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9390 | 80.5%
Common | 2254 | 19.3%
Latin | 24 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 353 | 3.8%
 | 302 | 3.2%
 | 298 | 3.2%
 | 282 | 3.0%
 | 261 | 2.8%
 | 195 | 2.1%
 | 192 | 2.0%
 | 189 | 2.0%
 | 173 | 1.8%
 | 170 | 1.8%
Other values (306) | 6975 | 74.3%

Common
Value | Count | Frequency (%)
[space] | 1272 | 56.4%
( | 488 | 21.7%
) | 488 | 21.7%
5 | 6 | 0.3%

Latin
Value | Count | Frequency (%)
A | 8 | 33.3%
M | 8 | 33.3%
B | 8 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9390 | 80.5%
ASCII | 2278 | 19.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
[space] | 1272 | 55.8%
( | 488 | 21.4%
) | 488 | 21.4%
A | 8 | 0.4%
M | 8 | 0.4%
B | 8 | 0.4%
5 | 6 | 0.3%

Hangul
Value | Count | Frequency (%)
 | 353 | 3.8%
 | 302 | 3.2%
 | 298 | 3.2%
 | 282 | 3.0%
 | 261 | 2.8%
 | 195 | 2.1%
 | 192 | 2.0%
 | 189 | 2.0%
 | 173 | 1.8%
 | 170 | 1.8%
Other values (306) | 6975 | 74.3%

등급
Categorical

Distinct: 6
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 607 | 29.4%
 | 566 | 27.4%
 | 498 | 24.2%
 | 360 | 17.5%
 | 18 | 0.9%
 | 13 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]

수량
Real number (ℝ)

Distinct: 42
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.075558
Minimum: 0.05
Maximum: 10000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.3 KiB

Quantile statistics

Minimum: 0.05
5th percentile: 1
Q1: 3.125
Median: 5
Q3: 10
95th percentile: 20
Maximum: 10000
Range: 9999.95
Interquartile range (IQR): 6.875

Descriptive statistics

Standard deviation: 494.19904
Coefficient of variation (CV): 11.472841
Kurtosis: 337.44189
Mean: 43.075558
Median Absolute Deviation (MAD): 4
Skewness: 17.861002
Sum: 88821.8
Variance: 244232.69
Monotonicity: Not monotonic
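
Several of the figures above follow from the others by simple arithmetic. The sketch below recomputes the range, the IQR, the coefficient of variation, the variance, and the mean from the reported values of 수량, as a consistency check. It assumes the usual textbook definitions (CV as standard deviation divided by mean, variance as the squared standard deviation); the report does not state its formulas.

```python
import math

# Values copied from the 수량 statistics in this report.
n = 2062                        # number of observations
total = 88821.8                 # Sum
mean = 43.075558                # Mean
std = 494.19904                 # Standard deviation
q1, q3 = 3.125, 10              # Q1, Q3
minimum, maximum = 0.05, 10000  # Minimum, Maximum

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
mean_from_sum = total / n        # Mean recomputed from Sum

print(f"range={value_range:.2f} iqr={iqr:.3f} cv={cv:.6f}")
print(f"variance={variance:.2f} mean={mean_from_sum:.6f}")
```

A CV above 11 with a median of 5 and a maximum of 10000 indicates that a few very large quantities dominate the mean and the variance.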
Histogram with fixed size bins (bins=42)

Common values

Value | Count | Frequency (%)
10.0 | 480 | 23.3%
5.0 | 276 | 13.4%
1.0 | 213 | 10.3%
4.0 | 206 | 10.0%
2.0 | 140 | 6.8%
8.0 | 107 | 5.2%
15.0 | 100 | 4.8%
20.0 | 94 | 4.6%
3.0 | 68 | 3.3%
4.5 | 64 | 3.1%
Other values (32) | 314 | 15.2%

Minimum 10 values

Value | Count | Frequency (%)
0.05 | 3 | 0.1%
0.2 | 3 | 0.1%
0.25 | 4 | 0.2%
0.5 | 3 | 0.1%
0.75 | 3 | 0.1%
1.0 | 213 | 10.3%
1.5 | 31 | 1.5%
1.6 | 4 | 0.2%
2.0 | 140 | 6.8%
2.5 | 44 | 2.1%

Maximum 10 values

Value | Count | Frequency (%)
10000.0 | 4 | 0.2%
5000.0 | 4 | 0.2%
750.0 | 3 | 0.1%
700.0 | 3 | 0.1%
500.0 | 11 | 0.5%
400.0 | 5 | 0.2%
200.0 | 3 | 0.1%
150.0 | 3 | 0.1%
100.0 | 11 | 0.5%
50.0 | 12 | 0.6%

단위
Categorical

IMBALANCE 

Distinct: 16
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 16.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.5800194
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: kg상자
2nd row: kg상자
3rd row: kg상자
4th row: kg상자
5th row: kg상자

Common Values

Value | Count | Frequency (%)
kg상자 | 1537 | 74.5%
kg | 338 | 16.4%
Kg그물망 | 45 | 2.2%
 | 30 | 1.5%
g단 | 23 | 1.1%
kg단 | 22 | 1.1%
kg개 | 20 | 1.0%
 | 10 | 0.5%
마리 | 8 | 0.4%
kgPAN | 6 | 0.3%
Other values (6) | 23 | 1.1%
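
The Imbalance alert for 단위 reports a score of 66.2%. The report does not say how the score is computed. One plausible definition, assumed here, is one minus the Shannon entropy of the value counts divided by the maximum entropy for that number of classes. The sketch below applies that definition to the ten counts listed above. The split of the 23 rows in "Other values (6)" is not in the report, so the `assumed_rest` list is a hypothetical split used only to make the example complete.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    0 means a perfectly uniform split; values near 1 mean one class dominates.
    Assumed definition, not confirmed by the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Ten counts listed under Common Values for 단위.
visible = [1537, 338, 45, 30, 23, 22, 20, 10, 8, 6]
# Hypothetical split of the 23 rows in "Other values (6)".
assumed_rest = [5, 4, 4, 4, 3, 3]

score = imbalance_score(visible + assumed_rest)
print(f"imbalance score: {score:.3f}")
```

Under this assumed split the score comes out at roughly 0.66, which is consistent with the reported 66.2%. The exact figure shifts slightly with how the remaining 23 rows are distributed.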

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
kg상자 | 1537 | 74.5%
kg | 338 | 16.4%
kg그물망 | 45 | 2.2%
 | 30 | 1.5%
g단 | 23 | 1.1%
kg단 | 22 | 1.1%
kg개 | 20 | 1.0%
 | 10 | 0.5%
kgpp대 | 9 | 0.4%
마리 | 8 | 0.4%
Other values (5) | 20 | 1.0%

최고가
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 293
Distinct (%): 14.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10561.476
Minimum: 0
Maximum: 210000
Zeros: 1289
Zeros (%): 62.5%
Negative: 0
Negative (%): 0.0%
Memory size: 18.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 14000
95th percentile: 52000
Maximum: 210000
Range: 210000
Interquartile range (IQR): 14000

Descriptive statistics

Standard deviation: 22678.624
Coefficient of variation (CV): 2.1472968
Kurtosis: 19.591209
Mean: 10561.476
Median Absolute Deviation (MAD): 0
Skewness: 3.8406181
Sum: 21777764
Variance: 5.1432 × 10^8
Monotonicity: Not monotonic
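
The median and the MAD are both 0 because more than half of the values are zero (62.5%). Once zeros form a majority, the median is 0, and the median of the absolute deviations from 0 is again 0. The sketch below shows the effect on the 최고가 values of the first ten rows in the Sample section. It assumes the usual definition of MAD as the median of absolute deviations from the median.

```python
from statistics import median

# 최고가 for the first ten rows shown in the Sample section (six are zero).
high_prices = [0, 0, 45000, 0, 55000, 0, 0, 0, 210000, 175000]

med = median(high_prices)
mad = median(abs(x - med) for x in high_prices)

print(f"median={med} MAD={mad}")
```

For a column like this, the mean, Q3, and the upper percentiles say more about the price level than the median or MAD do.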
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
20000 | 23 | 1.1%
16000 | 17 | 0.8%
12000 | 16 | 0.8%
25000 | 15 | 0.7%
30000 | 14 | 0.7%
10000 | 13 | 0.6%
14000 | 13 | 0.6%
14500 | 12 | 0.6%
15000 | 12 | 0.6%
Other values (283) | 638 | 30.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
292 | 1 | < 0.1%
350 | 2 | 0.1%
800 | 1 | < 0.1%
900 | 1 | < 0.1%
1000 | 1 | < 0.1%
1050 | 2 | 0.1%
1100 | 1 | < 0.1%
1150 | 2 | 0.1%
1200 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
210000 | 1 | < 0.1%
190000 | 1 | < 0.1%
180000 | 1 | < 0.1%
178000 | 1 | < 0.1%
175000 | 1 | < 0.1%
173000 | 1 | < 0.1%
170000 | 1 | < 0.1%
160000 | 3 | 0.1%
147000 | 1 | < 0.1%
145000 | 2 | 0.1%

최저가
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 269
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7955.5572
Minimum: 0
Maximum: 200000
Zeros: 1289
Zeros (%): 62.5%
Negative: 0
Negative (%): 0.0%
Memory size: 18.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 8900
95th percentile: 40000
Maximum: 200000
Range: 200000
Interquartile range (IQR): 8900

Descriptive statistics

Standard deviation: 18575.057
Coefficient of variation (CV): 2.3348531
Kurtosis: 25.157777
Mean: 7955.5572
Median Absolute Deviation (MAD): 0
Skewness: 4.331056
Sum: 16404359
Variance: 3.4503275 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
20000 | 22 | 1.1%
10000 | 20 | 1.0%
12000 | 19 | 0.9%
16000 | 15 | 0.7%
4000 | 15 | 0.7%
3000 | 14 | 0.7%
6000 | 13 | 0.6%
25000 | 13 | 0.6%
15000 | 12 | 0.6%
Other values (259) | 630 | 30.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
100 | 1 | < 0.1%
200 | 1 | < 0.1%
250 | 1 | < 0.1%
292 | 1 | < 0.1%
350 | 1 | < 0.1%
400 | 1 | < 0.1%
500 | 3 | 0.1%
530 | 1 | < 0.1%
533 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
200000 | 1 | < 0.1%
170000 | 1 | < 0.1%
160000 | 1 | < 0.1%
150000 | 2 | 0.1%
145000 | 1 | < 0.1%
140000 | 1 | < 0.1%
126667 | 2 | 0.1%
125000 | 4 | 0.2%
121000 | 1 | < 0.1%
120000 | 2 | 0.1%

평균가
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 730
Distinct (%): 35.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9236.5213
Minimum: 0
Maximum: 202500
Zeros: 1289
Zeros (%): 62.5%
Negative: 0
Negative (%): 0.0%
Memory size: 18.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 11848
95th percentile: 45864.1
Maximum: 202500
Range: 202500
Interquartile range (IQR): 11848

Descriptive statistics

Standard deviation: 20223.823
Coefficient of variation (CV): 2.1895498
Kurtosis: 21.68513
Mean: 9236.5213
Median Absolute Deviation (MAD): 0
Skewness: 4.0118876
Sum: 19045707
Variance: 4.0900302 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
20000 | 5 | 0.2%
45000 | 4 | 0.2%
30000 | 4 | 0.2%
3500 | 3 | 0.1%
16000 | 2 | 0.1%
13860 | 2 | 0.1%
10839 | 2 | 0.1%
21557 | 2 | 0.1%
20559 | 2 | 0.1%
Other values (720) | 747 | 36.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1289 | 62.5%
264 | 1 | < 0.1%
347 | 1 | < 0.1%
350 | 1 | < 0.1%
715 | 1 | < 0.1%
742 | 1 | < 0.1%
846 | 1 | < 0.1%
852 | 1 | < 0.1%
908 | 1 | < 0.1%
932 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
202500 | 1 | < 0.1%
173333 | 1 | < 0.1%
169190 | 1 | < 0.1%
165429 | 1 | < 0.1%
160833 | 1 | < 0.1%
155295 | 1 | < 0.1%
152305 | 1 | < 0.1%
143334 | 1 | < 0.1%
140940 | 1 | < 0.1%
136423 | 1 | < 0.1%

조사일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20240511
2nd row: 20240511
3rd row: 20240511
4th row: 20240511
5th row: 20240511

Common Values

Value | Count | Frequency (%)
20240511 | 2062 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values]

Interactions

[Pairwise scatter plots of the numeric variables 수량, 최고가, 최저가, and 평균가]

Correlations

 | 등급 | 수량 | 단위 | 최고가 | 최저가 | 평균가
등급 | 1.000 | 0.000 | 0.266 | 0.075 | 0.119 | 0.055
수량 | 0.000 | 1.000 | 0.096 | 0.000 | 0.000 | 0.000
단위 | 0.266 | 0.096 | 1.000 | 0.000 | 0.000 | 0.000
최고가 | 0.075 | 0.000 | 0.000 | 1.000 | 0.960 | 0.988
최저가 | 0.119 | 0.000 | 0.000 | 0.960 | 1.000 | 0.983
평균가 | 0.055 | 0.000 | 0.000 | 0.988 | 0.983 | 1.000

 | 단위 | 등급
단위 | 1.000 | 0.131
등급 | 0.131 | 1.000

 | 수량 | 최고가 | 최저가 | 평균가 | 등급 | 단위
수량 | 1.000 | -0.010 | -0.018 | -0.009 | 0.000 | 0.052
최고가 | -0.010 | 1.000 | 0.994 | 0.999 | 0.039 | 0.000
최저가 | -0.018 | 0.994 | 1.000 | 0.997 | 0.062 | 0.000
평균가 | -0.009 | 0.999 | 0.997 | 1.000 | 0.029 | 0.000
등급 | 0.000 | 0.039 | 0.062 | 0.029 | 1.000 | 0.131
단위 | 0.052 | 0.000 | 0.000 | 0.000 | 0.131 | 1.000
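
The three price columns correlate at 0.96 or higher in the matrices above. Part of that may come from the 1289 rows in which all three prices are zero at once, since shared zeros pull every pair of columns toward the same point. The sketch below illustrates the idea with a plain Pearson coefficient on the first ten rows of the Sample section, computed once on all rows and once on rows with a non-zero price. Pearson is an assumption made for illustration: the matrices do not name their coefficients, and ten rows cannot reproduce the reported values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# 최고가 and 최저가 for the first ten rows shown in the Sample section.
high = [0, 0, 45000, 0, 55000, 0, 0, 0, 210000, 175000]
low = [0, 0, 18300, 0, 26667, 0, 0, 0, 200000, 170000]

r_all = pearson(high, low)

# Keep only rows where at least one of the two prices is non-zero.
pairs = [(h, l) for h, l in zip(high, low) if h or l]
r_traded = pearson([h for h, _ in pairs], [l for _, l in pairs])

print(f"all rows: {r_all:.3f}, non-zero rows only: {r_traded:.3f}")
```

Comparing the two coefficients on the full dataset would show how much of the High correlation alert depends on the zero rows.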

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

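First rows

 | 품목명 | 등급 | 수량 | 단위 | 최고가 | 최저가 | 평균가 | 조사일
0 | (냉)갈치 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511
1 | (냉)갈치 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511
2 | (냉)고등어 | | 10.0 | kg상자 | 45000 | 18300 | 33024 | 20240511
3 | (냉)고등어 | | 20.0 | kg상자 | 0 | 0 | 0 | 20240511
4 | (냉)고등어 수입 | | 10.0 | kg상자 | 55000 | 26667 | 36025 | 20240511
5 | (선)갈치 | | 3.0 | kg상자 | 0 | 0 | 0 | 20240511
6 | (선)갈치 | | 3.0 | kg상자 | 0 | 0 | 0 | 20240511
7 | (선)갈치 | | 3.0 | kg상자 | 0 | 0 | 0 | 20240511
8 | (선)갈치 | | 5.0 | kg상자 | 210000 | 200000 | 202500 | 20240511
9 | (선)갈치 | | 5.0 | kg상자 | 175000 | 170000 | 173333 | 20240511

Last rows

 | 품목명 | 등급 | 수량 | 단위 | 최고가 | 최저가 | 평균가 | 조사일
2052 | 황색멜론 | | 5.0 | kg상자 | 0 | 0 | 0 | 20240511
2053 | 황색멜론 | | 5.0 | kg상자 | 0 | 0 | 0 | 20240511
2054 | 황색멜론 | | 8.0 | kg상자 | 0 | 0 | 0 | 20240511
2055 | 황색멜론 | | 8.0 | kg상자 | 0 | 0 | 0 | 20240511
2056 | 황색멜론 | | 8.0 | kg상자 | 0 | 0 | 0 | 20240511
2057 | 황색멜론 | | 8.0 | kg상자 | 0 | 0 | 0 | 20240511
2058 | 황색멜론 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511
2059 | 황색멜론 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511
2060 | 황색멜론 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511
2061 | 황색멜론 | | 10.0 | kg상자 | 0 | 0 | 0 | 20240511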