Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 631
Duplicate rows (%): 6.3%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
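
As a quick sanity check, the percentage and per-record figures above follow from the raw counts. The sketch below recomputes them; the variable names are illustrative, and it assumes 1 KiB = 1024 bytes.

```python
# Recompute two derived figures from the dataset statistics above.
# Assumption: 1 KiB = 1024 bytes; variable names are illustrative.
n_observations = 10_000
n_duplicate_rows = 631
total_size_kib = 488.3

duplicate_pct = 100 * n_duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```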

Variable types

Text: 1
Categorical: 2
Numeric: 2

Dataset

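Description: Monthly auction price data for the Namchon Agricultural Products Wholesale Market in Incheon Metropolitan City, including item (품목), grade (등급), unit quantity (단량), unit (단위), and average price (평균가).
Author: Incheon Metropolitan City (인천광역시)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15051664&srcSe=7661IVAWM27C61E190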

Alerts

단위 has constant value "kg" [Constant]
Dataset has 631 (6.3%) duplicate rows [Duplicates]
단량 is highly overall correlated with 평균가 [High correlation]
평균가 is highly overall correlated with 단량 [High correlation]
등급 is highly imbalanced (54.3%) [Imbalance]

Reproduction

Analysis started: 2024-01-28 15:45:26.918085
Analysis finished: 2024-01-28 15:45:27.748706
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
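
A report like this one is typically produced by handing a pandas DataFrame to ydata-profiling. The sketch below is a minimal outline, not the exact invocation used here: the function name `build_report`, the `cp949` default encoding, and the report title are assumptions, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported inside the function so this module still loads where the
    # optional third-party dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the portal export is a CSV file. Korean public datasets
    # are often encoded as cp949, but check the actual file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: the title is illustrative, not taken from the report.
    profile = ProfileReport(df, title="Namchon wholesale market monthly auction prices")
    profile.to_file(html_path)
```

Calling `build_report("prices.csv", "report.html")` (hypothetical file names) would write the HTML report to the given path.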

Variables

품목
Text

Distinct: 404
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 17
Mean length: 9.3048
Min length: 5

Characters and Unicode

Total characters: 93048
Distinct characters: 298
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
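
To make the category breakdown concrete, the sketch below classifies the characters of one sample value from this column using Python's standard `unicodedata` module. It only covers the general category; the standard library does not expose script or block properties, so those tables would need another source.

```python
import unicodedata
from collections import Counter

# Sample value taken from the 품목 column shown in this report.
sample = "오이(가시오이)"

# unicodedata.category returns the two-letter general category code,
# e.g. "Lo" (Other Letter), "Ps" (Open Punctuation), "Pe" (Close Punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for ch in sample:
    print(ch, unicodedata.category(ch), unicodedata.name(ch))
```

The two-letter codes map onto the names used in the tables of this section: `Lo` is Other Letter, `Ps` and `Pe` are Open and Close Punctuation, `Zs` is Space Separator, `Po` is Other Punctuation, and `Nd` is Decimal Number.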

Unique

Unique: 54
Unique (%): 0.5%

Sample

1st row: 연근(기타)
2nd row: 풋고추(청초(일반))
3rd row: 오이(가시오이)
4th row: 밤(밤(일반))
5th row: 아보카도(아보카도(일반))

Words

Value | Count | Frequency (%)
표고버섯(생표고 | 270 | 2.6%
오이(백다다기 | 206 | 2.0%
기타(엽경채류(기타 | 183 | 1.8%
수박(수박(일반)(꼭지절단 | 173 | 1.7%
표고버섯(표고버섯(일반 | 151 | 1.5%
시금치(시금치(일반 | 146 | 1.4%
가지(가지(일반 | 137 | 1.3%
풋고추(청양 | 134 | 1.3%
새송이(새송이(일반 | 133 | 1.3%
호박(애호박 | 116 | 1.1%
Other values (398) | 8648 | 84.0%

Most occurring characters

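Value | Count | Frequency (%)
( | 14082 | 15.1%
) | 14082 | 15.1%
(character not preserved) | 3529 | 3.8%
(character not preserved) | 3470 | 3.7%
(character not preserved) | 3030 | 3.3%
(character not preserved) | 2887 | 3.1%
(character not preserved) | 2422 | 2.6%
(character not preserved) | 2097 | 2.3%
(character not preserved) | 1374 | 1.5%
(character not preserved) | 1299 | 1.4%
Other values (288) | 44776 | 48.1%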

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 64432 | 69.2%
Open Punctuation | 14082 | 15.1%
Close Punctuation | 14082 | 15.1%
Space Separator | 297 | 0.3%
Other Punctuation | 132 | 0.1%
Decimal Number | 23 | < 0.1%

Most frequent character per category

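Other Letter

Value | Count | Frequency (%)
(character not preserved) | 3529 | 5.5%
(character not preserved) | 3470 | 5.4%
(character not preserved) | 3030 | 4.7%
(character not preserved) | 2887 | 4.5%
(character not preserved) | 2422 | 3.8%
(character not preserved) | 2097 | 3.3%
(character not preserved) | 1374 | 2.1%
(character not preserved) | 1299 | 2.0%
(character not preserved) | 1229 | 1.9%
(character not preserved) | 1091 | 1.7%
Other values (282) | 42004 | 65.2%

Decimal Number

Value | Count | Frequency (%)
1 | 17 | 73.9%
8 | 6 | 26.1%

Open Punctuation

Value | Count | Frequency (%)
( | 14082 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 14082 | 100.0%

Space Separator

Value | Count | Frequency (%)
(space) | 297 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 132 | 100.0%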

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 64432 | 69.2%
Common | 28616 | 30.8%

Most frequent character per script

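Hangul

Value | Count | Frequency (%)
(character not preserved) | 3529 | 5.5%
(character not preserved) | 3470 | 5.4%
(character not preserved) | 3030 | 4.7%
(character not preserved) | 2887 | 4.5%
(character not preserved) | 2422 | 3.8%
(character not preserved) | 2097 | 3.3%
(character not preserved) | 1374 | 2.1%
(character not preserved) | 1299 | 2.0%
(character not preserved) | 1229 | 1.9%
(character not preserved) | 1091 | 1.7%
Other values (282) | 42004 | 65.2%

Common

Value | Count | Frequency (%)
( | 14082 | 49.2%
) | 14082 | 49.2%
(space) | 297 | 1.0%
, | 132 | 0.5%
1 | 17 | 0.1%
8 | 6 | < 0.1%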

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 64432 | 69.2%
ASCII | 28616 | 30.8%

Most frequent character per block

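ASCII

Value | Count | Frequency (%)
( | 14082 | 49.2%
) | 14082 | 49.2%
(space) | 297 | 1.0%
, | 132 | 0.5%
1 | 17 | 0.1%
8 | 6 | < 0.1%

Hangul

Value | Count | Frequency (%)
(character not preserved) | 3529 | 5.5%
(character not preserved) | 3470 | 5.4%
(character not preserved) | 3030 | 4.7%
(character not preserved) | 2887 | 4.5%
(character not preserved) | 2422 | 3.8%
(character not preserved) | 2097 | 3.3%
(character not preserved) | 1374 | 2.1%
(character not preserved) | 1299 | 2.0%
(character not preserved) | 1229 | 1.9%
(character not preserved) | 1091 | 1.7%
Other values (282) | 42004 | 65.2%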

등급
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
특(1등 | 6419
상(2등 | 2495
보통(3 | 458
4등 | 192
9등(등 | 179
Other values (5) | 257

Length

Max length: 17
Median length: 16
Mean length: 16.0358
Min length: 16

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 특(1등
2nd row: 특(1등
3rd row: 상(2등
4th row: 특(1등
5th row: 특(1등

Common Values

Value | Count | Frequency (%)
특(1등 | 6419 | 64.2%
상(2등 | 2495 | 24.9%
보통(3 | 458 | 4.6%
4등 | 192 | 1.9%
9등(등 | 179 | 1.8%
없음 | 91 | 0.9%
6등 | 51 | 0.5%
8등 | 45 | 0.4%
5등 | 44 | 0.4%
7등 | 26 | 0.3%
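
The imbalance figure reported for 등급 (54.3%) can be approximated from the ten counts in the Common Values table. The sketch below assumes the score is one minus the normalized Shannon entropy of the class frequencies (0 for a perfectly balanced column, approaching 1 when a single class dominates). That definition is an assumption about how the profiler scores imbalance, not something the report states.

```python
import math

# Counts per 등급 value, copied from the Common Values table.
counts = [6419, 2495, 458, 192, 179, 91, 51, 45, 44, 26]


def imbalance_score(class_counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the classes."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"Imbalance: {score:.1%}")
```

Under this assumption the score comes out at roughly 54%, in line with the alert; the last digit is sensitive to rounding.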

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

단량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 95
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.479726
Minimum: 0.01
Maximum: 136
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.5
Q1: 3
Median: 5
Q3: 10
95-th percentile: 15
Maximum: 136
Range: 135.99
Interquartile range (IQR): 7
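
The last two rows are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A minimal sketch using the values listed above (variable names are illustrative):

```python
# Quantile statistics for 단량, copied from the list above.
minimum, q1, median, q3, maximum = 0.01, 3, 5, 10, 136

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of rows

print(f"Range: {value_range:.2f}")
print(f"Interquartile range (IQR): {iqr}")
```

The range is far wider than the IQR, which indicates that a few extreme values sit well outside the bulk of the distribution.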

Descriptive statistics

Standard deviation: 5.1001723
Coefficient of variation (CV): 0.78709691
Kurtosis: 66.697926
Mean: 6.479726
Median Absolute Deviation (MAD): 3
Skewness: 3.8779021
Sum: 64797.26
Variance: 26.011757
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
10.0 | 2134 | 21.3%
4.0 | 2061 | 20.6%
2.0 | 1121 | 11.2%
5.0 | 822 | 8.2%
8.0 | 766 | 7.7%
1.0 | 487 | 4.9%
15.0 | 334 | 3.3%
3.0 | 297 | 3.0%
20.0 | 273 | 2.7%
0.5 | 247 | 2.5%
Other values (85) | 1458 | 14.6%

Minimum 10 values

Value | Count | Frequency (%)
0.01 | 20 | 0.2%
0.02 | 1 | < 0.1%
0.05 | 32 | 0.3%
0.06 | 14 | 0.1%
0.1 | 16 | 0.2%
0.12 | 4 | < 0.1%
0.15 | 3 | < 0.1%
0.16 | 14 | 0.1%
0.2 | 69 | 0.7%
0.25 | 5 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
136.0 | 1 | < 0.1%
102.0 | 1 | < 0.1%
89.0 | 1 | < 0.1%
85.0 | 1 | < 0.1%
51.0 | 1 | < 0.1%
40.0 | 1 | < 0.1%
34.0 | 1 | < 0.1%
25.0 | 3 | < 0.1%
21.0 | 2 | < 0.1%
20.0 | 273 | 2.7%

단위
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
kg | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: kg
2nd row: kg
3rd row: kg
4th row: kg
5th row: kg

Common Values

Value | Count | Frequency (%)
kg | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

평균가
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4461
Distinct (%): 44.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18350.455
Minimum: 100
Maximum: 1417000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 100
5-th percentile: 1500
Q1: 5821
Median: 12000
Q3: 22000
95-th percentile: 55250
Maximum: 1417000
Range: 1416900
Interquartile range (IQR): 16179

Descriptive statistics

Standard deviation: 28745.824
Coefficient of variation (CV): 1.5664911
Kurtosis: 759.69905
Mean: 18350.455
Median Absolute Deviation (MAD): 7463.5
Skewness: 19.111213
Sum: 1.8350455 × 10^8
Variance: 8.2632241 × 10^8
Monotonicity: Not monotonic
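
The Sum, Variance, and CV rows follow from the mean, the standard deviation, and the row count, which gives a quick way to check their order of magnitude. The sketch below recomputes them from the rounded values printed in the report, so agreement is only expected to about six significant digits.

```python
# Reported statistics for 평균가 (rounded as printed in the report).
n = 10_000
mean = 18350.455
std = 28745.824

total = mean * n     # Sum
variance = std ** 2  # Variance
cv = std / mean      # Coefficient of variation

print(f"Sum: {total:.7e}")
print(f"Variance: {variance:.4e}")
print(f"CV: {cv:.5f}")
```

Both the sum and the variance land on the order of 10^8, and the ratio of standard deviation to mean reproduces the reported CV of about 1.57.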

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
10000 | 206 | 2.1%
8000 | 196 | 2.0%
4000 | 182 | 1.8%
15000 | 156 | 1.6%
13000 | 151 | 1.5%
3000 | 147 | 1.5%
5000 | 146 | 1.5%
6000 | 144 | 1.4%
2000 | 142 | 1.4%
12000 | 140 | 1.4%
Other values (4451) | 8390 | 83.9%

Minimum 10 values

Value | Count | Frequency (%)
100 | 1 | < 0.1%
150 | 1 | < 0.1%
200 | 6 | 0.1%
300 | 21 | 0.2%
336 | 1 | < 0.1%
350 | 14 | 0.1%
367 | 1 | < 0.1%
375 | 1 | < 0.1%
400 | 11 | 0.1%
408 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1417000 | 1 | < 0.1%
1062500 | 1 | < 0.1%
531200 | 1 | < 0.1%
448177 | 1 | < 0.1%
406200 | 1 | < 0.1%
354200 | 1 | < 0.1%
240000 | 4 | < 0.1%
215000 | 2 | < 0.1%
210000 | 2 | < 0.1%
200000 | 7 | 0.1%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable | 등급 | 단량 | 평균가
등급 | 1.000 | 0.069 | 0.000
단량 | 0.069 | 1.000 | 0.921
평균가 | 0.000 | 0.921 | 1.000

[Figure: correlation heatmap]

Variable | 단량 | 평균가 | 등급
단량 | 1.000 | 0.585 | 0.035
평균가 | 0.585 | 1.000 | 0.000
등급 | 0.035 | 0.000 | 1.000
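
The two matrices give different values for the 단량/평균가 pair (0.921 and 0.585). The report text does not say which coefficient each matrix uses, but a gap like this is what one would expect when a rank-based measure is compared with a linear one on heavily skewed data. The sketch below illustrates that effect on made-up numbers (not rows from this dataset), using hand-written Pearson and Spearman helpers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank values from 1 to n (assumes no ties, which holds for the toy data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy data: y always rises with x, but the last point is an extreme outlier.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 1000]

linear = pearson(x, y)
rank_based = spearman(x, y)
print(f"Pearson: {linear:.3f}, Spearman: {rank_based:.3f}")
```

Because the toy relationship is perfectly monotonic, the rank-based coefficient is essentially 1, while the outlier pulls the linear coefficient well below that.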

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

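First rows

Index | 품목 | 등급 | 단량 | 단위 | 평균가
10260 | 연근(기타) | 특(1등 | 15.0 | kg | 50000
15205 | 풋고추(청초(일반)) | 특(1등 | 4.0 | kg | 3000
10487 | 오이(가시오이) | 상(2등 | 10.0 | kg | 12821
6330 | 밤(밤(일반)) | 특(1등 | 10.0 | kg | 39024
9255 | 아보카도(아보카도(일반)) | 특(1등 | 4.0 | kg | 9000
11318 | 전분 및 사료제조(기타) | 특(1등 | 5.0 | kg | 9500
6153 | 미역(줄기미역) | 특(1등 | 7.5 | kg | 11000
5308 | 망고(망고(일반)) | 상(2등 | 5.0 | kg | 18000
3680 | 단감(송본) | 특(1등 | 10.0 | kg | 27914
130 | 가지(가지(일반)) | 특(1등 | 10.0 | kg | 12533

Last rows

Index | 품목 | 등급 | 단량 | 단위 | 평균가
7178 | 비타민(비타민(일반)) | 상(2등 | 2.0 | kg | 7000
15320 | 피망(단고추)(청피망) | 상(2등 | 10.0 | kg | 30429
8273 | 생강(기타) | 특(1등 | 10.0 | kg | 110000
10921 | 오이(백다다기) | 상(2등 | 10.0 | kg | 5357
6433 | 방울토마토(대추방울) | 특(1등 | 4.0 | kg | 21078
12624 | 토란대(기타) | 특(1등 | 4.0 | kg | 2912
4676 | 떫은감(반시) | 5등 | 5.0 | kg | 8000
2914 | 깻잎(기타) | 특(1등 | 3.0 | kg | 13021
5291 | 만가닥(만가닥(일반)) | 특(1등 | 1.5 | kg | 6000
11420 | 전분 및 사료제조(청포묵) | 특(1등 | 6.0 | kg | 9000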

Duplicate rows

Most frequently occurring

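Index | 품목 | 등급 | 단량 | 단위 | 평균가 | # duplicates
73 | 곡물제조(연두부) | 특(1등 | 12.0 | kg | 17800 | 18
62 | 곡물제조(두부) | 특(1등 | 0.5 | kg | 1230 | 17
194 | 무청(건무청) | 특(1등 | 10.0 | kg | 20000 | 17
67 | 곡물제조(두부) | 특(1등 | 7.0 | kg | 7500 | 16
71 | 곡물제조(순두부) | 특(1등 | 16.0 | kg | 17800 | 16
313 | 숙주나물(숙주나물(일반)) | 특(1등 | 3.5 | kg | 4500 | 16
475 | 콩나물(콩나물(일반)) | 특(1등 | 5.0 | kg | 7500 | 16
66 | 곡물제조(두부) | 특(1등 | 3.0 | kg | 5300 | 15
100 | 꼬시래기(꼬시래기(일반)) | 특(1등 | 8.0 | kg | 10500 | 15
220 | 미역(줄기미역) | 특(1등 | 7.5 | kg | 11000 | 15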
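
A "most frequently occurring" listing of this kind can be built by counting identical rows. The sketch below does so with `collections.Counter` on a handful of hand-typed tuples. The rows are illustrative stand-ins shaped like this dataset's records (품목, 등급, 단량, 단위, 평균가), not the real data, and the profiler's own duplicate-counting logic may differ in detail.

```python
from collections import Counter

# Illustrative rows shaped like the dataset's records; not the real data.
rows = [
    ("곡물제조(두부)", "특(1등", 0.5, "kg", 1230),
    ("곡물제조(두부)", "특(1등", 0.5, "kg", 1230),
    ("곡물제조(두부)", "특(1등", 7.0, "kg", 7500),
    ("무청(건무청)", "특(1등", 10.0, "kg", 20000),
    ("무청(건무청)", "특(1등", 10.0, "kg", 20000),
    ("무청(건무청)", "특(1등", 10.0, "kg", 20000),
]

# Tuples are hashable, so identical rows collapse onto the same key.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in row_counts.most_common() if n > 1]

for row, n in duplicates:
    print(n, row)
```

Rows that appear only once are dropped, so the result lists just the repeated records with their occurrence counts.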