Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 622
Duplicate rows (%): 6.2%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
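
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and rounding to one decimal place (the helper names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_observations = 10_000
duplicate_rows = 622
total_size_kib = 488.3


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return round(total_kib * 1024 / n_rows, 1)


duplicate_pct = percentage(duplicate_rows, n_observations)
record_size = average_record_size_bytes(total_size_kib, n_observations)
print(duplicate_pct, record_size)
```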

Variable types

Text: 1
Categorical: 2
Numeric: 2

Dataset

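Description: Monthly auction price data for the Namchon Agricultural Products Wholesale Market in Incheon Metropolitan City. It covers item (품목), grade (등급), unit quantity (단량), unit (단위), average price (평균가), and related fields.
Author: Incheon Metropolitan City (인천광역시)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15051664&srcSe=7661IVAWM27C61E190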

Alerts

단위 has constant value "kg" (Constant)
Dataset has 622 (6.2%) duplicate rows (Duplicates)
단량 is highly overall correlated with 평균가 (High correlation)
평균가 is highly overall correlated with 단량 (High correlation)
등급 is highly imbalanced (54.6%) (Imbalance)
평균가 is highly skewed (γ1 = 99.26819391) (Skewed)

Reproduction

Analysis started: 2024-01-28 15:45:20.219465
Analysis finished: 2024-01-28 15:45:20.977115
Duration: 0.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

품목
Text

Distinct: 406
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 17
Mean length: 9.3021
Min length: 5

Characters and Unicode

Total characters: 93021
Distinct characters: 298
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
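
As an illustration of those character properties, the following sketch counts the characters of one sample value by Unicode general category using Python's standard `unicodedata` module. The mapping from category codes to long names covers only the six categories this report lists. The helper `category_counts` is a hypothetical name, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw two-letter code for categories not listed above.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# One of the sample values of 품목: five Hangul syllables and a pair of parentheses.
sample_counts = category_counts("풋고추(청양)")
print(sample_counts)
```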

Unique

Unique: 56
Unique (%): 0.6%

Sample

1st row: 파프리카(노랑파프리카)
2nd row: 풋고추(청양)
3rd row: 아욱(아욱(일반))
4th row: 양송이(기타)
5th row: 가지(가지(일반))

Value    Count    Frequency (%)
표고버섯(생표고    259    2.5%
오이(백다다기    214    2.1%
수박(수박(일반)(꼭지절단    190    1.8%
기타(엽경채류(기타    185    1.8%
풋고추(청양    147    1.4%
가지(가지(일반    146    1.4%
표고버섯(표고버섯(일반    142    1.4%
새송이(새송이(일반    132    1.3%
시금치(시금치(일반    132    1.3%
풋고추(청초(일반    113    1.1%
Other values (400)    8630    83.9%

Most occurring characters

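Value    Count    Frequency (%)
(    14063    15.1%
)    14063    15.1%
(not preserved)    3502    3.8%
(not preserved)    3453    3.7%
(not preserved)    2964    3.2%
(not preserved)    2954    3.2%
(not preserved)    2479    2.7%
(not preserved)    2130    2.3%
(not preserved)    1344    1.4%
(not preserved)    1290    1.4%
Other values (288)    44779    48.1%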

Most occurring categories

Value    Count    Frequency (%)
Other Letter    64454    69.3%
Open Punctuation    14063    15.1%
Close Punctuation    14063    15.1%
Space Separator    290    0.3%
Other Punctuation    131    0.1%
Decimal Number    20    < 0.1%

Most frequent character per category

Other Letter
Value    Count    Frequency (%)
(not preserved)    3502    5.4%
(not preserved)    3453    5.4%
(not preserved)    2964    4.6%
(not preserved)    2954    4.6%
(not preserved)    2479    3.8%
(not preserved)    2130    3.3%
(not preserved)    1344    2.1%
(not preserved)    1290    2.0%
(not preserved)    1240    1.9%
(not preserved)    1132    1.8%
Other values (282)    41966    65.1%

Decimal Number
Value    Count    Frequency (%)
1    13    65.0%
8    7    35.0%

Open Punctuation
Value    Count    Frequency (%)
(    14063    100.0%

Close Punctuation
Value    Count    Frequency (%)
)    14063    100.0%

Space Separator
Value    Count    Frequency (%)
(space)    290    100.0%

Other Punctuation
Value    Count    Frequency (%)
,    131    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    64454    69.3%
Common    28567    30.7%

Most frequent character per script

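Hangul
Value    Count    Frequency (%)
(not preserved)    3502    5.4%
(not preserved)    3453    5.4%
(not preserved)    2964    4.6%
(not preserved)    2954    4.6%
(not preserved)    2479    3.8%
(not preserved)    2130    3.3%
(not preserved)    1344    2.1%
(not preserved)    1290    2.0%
(not preserved)    1240    1.9%
(not preserved)    1132    1.8%
Other values (282)    41966    65.1%

Common
Value    Count    Frequency (%)
(    14063    49.2%
)    14063    49.2%
(space)    290    1.0%
,    131    0.5%
1    13    < 0.1%
8    7    < 0.1%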

Most occurring blocks

Value    Count    Frequency (%)
Hangul    64454    69.3%
ASCII    28567    30.7%

Most frequent character per block

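ASCII
Value    Count    Frequency (%)
(    14063    49.2%
)    14063    49.2%
(space)    290    1.0%
,    131    0.5%
1    13    < 0.1%
8    7    < 0.1%

Hangul
Value    Count    Frequency (%)
(not preserved)    3502    5.4%
(not preserved)    3453    5.4%
(not preserved)    2964    4.6%
(not preserved)    2954    4.6%
(not preserved)    2479    3.8%
(not preserved)    2130    3.3%
(not preserved)    1344    2.1%
(not preserved)    1290    2.0%
(not preserved)    1240    1.9%
(not preserved)    1132    1.8%
Other values (282)    41966    65.1%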

등급
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

특(1등: 6397
상(2등: 2549
보통(3: 434
4등: 200
9등(등: 178
Other values (5): 242

Length

Max length: 17
Median length: 16
Mean length: 16.0353
Min length: 16

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9등(등
2nd row: 상(2등
3rd row: 특(1등
4th row: 특(1등
5th row: 특(1등

Common Values

Value    Count    Frequency (%)
특(1등    6397    64.0%
상(2등    2549    25.5%
보통(3    434    4.3%
4등    200    2.0%
9등(등    178    1.8%
없음    89    0.9%
8등    54    0.5%
5등    45    0.4%
6등    36    0.4%
7등    18    0.2%
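
The 54.6% imbalance figure reported for 등급 can be approximated from these counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by the maximum possible entropy for the same number of classes. That definition is an assumption here, and `imbalance_score` is an illustrative name, not a confirmed part of the tool's API.

```python
import math

# Counts from the "Common Values" table for 등급.
grade_counts = [6397, 2549, 434, 200, 178, 89, 54, 45, 36, 18]


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch: one class is not scored
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(grade_counts)
print(f"{score:.1%}")
```

Under this assumed definition the result lands close to the reported value, which suggests the alert is driven mainly by the 64.0% share of the top grade.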

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the common values of 등급; the underlying counts are those in the Common Values table above.]

단량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 86
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.480144
Minimum: 0.01
Maximum: 136
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.5
Q1: 3
Median: 5
Q3: 10
95-th percentile: 15
Maximum: 136
Range: 135.99
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.0291108
Coefficient of variation (CV): 0.77608009
Kurtosis: 58.09765
Mean: 6.480144
Median Absolute Deviation (MAD): 3
Skewness: 3.4233407
Sum: 64801.44
Variance: 25.291955
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value    Count    Frequency (%)
10.0    2120    21.2%
4.0    2090    20.9%
2.0    1097    11.0%
5.0    808    8.1%
8.0    744    7.4%
1.0    471    4.7%
15.0    334    3.3%
3.0    316    3.2%
20.0    269    2.7%
0.5    238    2.4%
Other values (76)    1513    15.1%

Minimum 10 values
Value    Count    Frequency (%)
0.01    17    0.2%
0.02    1    < 0.1%
0.05    41    0.4%
0.06    10    0.1%
0.1    17    0.2%
0.12    4    < 0.1%
0.15    5    0.1%
0.16    14    0.1%
0.2    65    0.7%
0.25    3    < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
136.0    1    < 0.1%
89.0    1    < 0.1%
85.0    1    < 0.1%
51.0    1    < 0.1%
40.0    3    < 0.1%
34.0    1    < 0.1%
25.0    4    < 0.1%
21.0    2    < 0.1%
20.0    269    2.7%
18.0    74    0.7%

단위
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

kg: 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: kg
2nd row: kg
3rd row: kg
4th row: kg
5th row: kg

Common Values

Value    Count    Frequency (%)
kg    10000    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the common values of 단위; kg accounts for all 10000 rows (100.0%).]

평균가
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 4474
Distinct (%): 44.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22085.493
Minimum: 150
Maximum: 40008000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 150
5-th percentile: 1495.75
Q1: 5898.75
Median: 11942
Q3: 22000
95-th percentile: 54920.25
Maximum: 40008000
Range: 40007850
Interquartile range (IQR): 16101.25

Descriptive statistics

Standard deviation: 400887.28
Coefficient of variation (CV): 18.151611
Kurtosis: 9901.7552
Mean: 22085.493
Median Absolute Deviation (MAD): 7098.5
Skewness: 99.268194
Sum: 2.2085493 × 10^8
Variance: 1.6071061 × 10^11
Monotonicity: Not monotonic
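
Several of the figures reported for 평균가 are derived from others. As a quick consistency check, the sketch below recomputes them from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative only.

```python
# Values copied from the statistics reported for 평균가.
mean = 22085.493
std = 400887.28
q1, q3 = 5898.75, 22000
minimum, maximum = 150, 40008000

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(round(cv, 4), iqr, value_range, f"{variance:.4e}")
```

A coefficient of variation above 18 alongside an IQR of about 16000 indicates that the spread comes from a few extreme values rather than from the bulk of the prices.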
Histogram with fixed size bins (bins=50)

Common values
Value    Count    Frequency (%)
10000    218    2.2%
8000    175    1.8%
4000    172    1.7%
5000    162    1.6%
13000    159    1.6%
12000    159    1.6%
15000    156    1.6%
3000    154    1.5%
6000    153    1.5%
7000    146    1.5%
Other values (4464)    8346    83.5%

Minimum 10 values
Value    Count    Frequency (%)
150    1    < 0.1%
200    6    0.1%
250    1    < 0.1%
276    1    < 0.1%
300    25    0.2%
336    1    < 0.1%
350    9    0.1%
367    1    < 0.1%
400    12    0.1%
408    1    < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
40008000    1    < 0.1%
1417000    1    < 0.1%
604200    1    < 0.1%
532057    1    < 0.1%
531200    1    < 0.1%
448177    1    < 0.1%
406200    3    < 0.1%
354200    1    < 0.1%
240000    6    0.1%
215000    1    < 0.1%
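
The largest value, 40008000, is roughly 28 times the next largest (1417000), and a single value of that size is enough to produce a very large skewness. The sketch below illustrates the effect with the plain moment-based formula g1 = m3 / m2^1.5, without bias correction, so it may differ slightly from the estimator behind the report's γ1. The price list is hypothetical, chosen only to resemble the common values above.

```python
def skewness(values: list[float]) -> float:
    """Moment-based skewness g1 = m3 / m2**1.5 (population form, no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical prices in the typical range (100 values), not the real data.
typical = [3000, 4000, 5000, 6000, 7000, 8000, 10000, 12000, 13000, 15000] * 10
# The same prices plus one value the size of the reported maximum.
with_outlier = typical + [40008000]

skew_typical = skewness(typical)
skew_outlier = skewness(with_outlier)
print(skew_typical, skew_outlier)
```

In this toy sample the one extreme value moves the skewness from well under 1 to nearly 10, so it is worth checking whether the 40008000 record is a data-entry error before modelling 평균가.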

Interactions

[Figures: four interaction plots (images not preserved).]

Correlations

          등급      단량      평균가
등급      1.000    0.067    0.000
단량      0.067    1.000    0.000
평균가    0.000    0.000    1.000

          단량      평균가    등급
단량      1.000    0.571    0.035
평균가    0.571    1.000    0.000
등급      0.035    0.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index    품목    등급    단량    단위    평균가
12959    파프리카(노랑파프리카)    9등(등    5.0    kg    4000
15107    풋고추(청양)    상(2등    10.0    kg    48367
9294    아욱(아욱(일반))    특(1등    2.0    kg    1500
9706    양송이(기타)    특(1등    2.0    kg    12367
222    가지(가지(일반))    특(1등    10.0    kg    8810
15388    피망(단고추)(피망(일반))    특(1등    10.0    kg    26154
14681    풋고추(아삭이)    상(2등    10.0    kg    25000
6236    밤(밤(일반))    특(1등    10.0    kg    37214
9272    아스파라거스(녹색)    특(1등    1.0    kg    5000
13188    팥(기타)    특(1등    2.0    kg    7000

Index    품목    등급    단량    단위    평균가
189    가지(가지(일반))    상(2등    8.0    kg    5214
6694    배추(기타)    특(1등    15.0    kg    10536
2608    기타(엽경채류(기타))    상(2등    1.0    kg    3725
15247    풋고추(청초(일반))    특(1등    4.0    kg    11000
6019    미나리(미나리(일반))    특(1등    4.0    kg    10667
814    갓(돌산갓)    상(2등    2.0    kg    2601
5988    미나리(돌미나리)    상(2등    4.0    kg    7000
7638    사과(후지)    보통(3    10.0    kg    25750
993    강낭콩(강낭콩(일반))    특(1등    4.0    kg    19538
7864    상추(쫑상추)    상(2등    4.0    kg    10000

Duplicate rows

Most frequently occurring

Index    품목    등급    단량    단위    평균가    # duplicates
69    곡물제조(순두부)    특(1등    16.0    kg    17800    18
71    곡물제조(연두부)    특(1등    12.0    kg    17800    18
57    곡물제조(두부)    특(1등    0.5    kg    1230    17
218    미역(줄기미역)    특(1등    7.5    kg    11000    17
62    곡물제조(두부)    특(1등    7.0    kg    7500    16
181    무순(무순(일반))    특(1등    0.05    kg    300    16
61    곡물제조(두부)    특(1등    3.0    kg    5300    15
97    꼬시래기(꼬시래기(일반))    특(1등    8.0    kg    10500    15
190    무청(건무청)    특(1등    10.0    kg    20000    15
317    숙주나물(숙주나물(일반))    특(1등    3.5    kg    4500    15
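
A tally like the "# duplicates" column can be produced by grouping identical rows and counting each group. The sketch below does this with the standard library on a tiny hypothetical extract: the row values echo the table above, but the counts are made up for illustration. It does not claim to reproduce how the report arrives at the total of 622.

```python
from collections import Counter

# Rows as (품목, 등급, 단량, 단위, 평균가) tuples; a small hypothetical extract, not the real data.
rows = [
    ("곡물제조(순두부)", "특(1등", 16.0, "kg", 17800),
    ("곡물제조(순두부)", "특(1등", 16.0, "kg", 17800),
    ("곡물제조(두부)", "특(1등", 0.5, "kg", 1230),
    ("곡물제조(순두부)", "특(1등", 16.0, "kg", 17800),
    ("미역(줄기미역)", "특(1등", 7.5, "kg", 11000),
    ("곡물제조(두부)", "특(1등", 0.5, "kg", 1230),
]

# Count how often each distinct row occurs.
occurrences = Counter(rows)

# Keep only rows that appear more than once, most frequent first.
duplicates = [(row, n) for row, n in occurrences.most_common() if n > 1]
for row, n in duplicates:
    print(n, row)
```

Because 단위 is constant and prices are monthly averages, identical rows may be legitimate repeats across months rather than errors, so deduplicating should be a deliberate choice.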