Overview

Dataset statistics

Number of variables: 6
Number of observations: 230
Missing cells: 214
Missing cells (%): 15.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.1 KiB
Average record size in memory: 49.6 B

Variable types

Categorical: 4
Numeric: 1
Text: 1

Dataset

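Description: 부산광역시_사하구_주요생필품가격정보_20230523 (Busan Metropolitan City, Saha-gu, price information for major daily necessities, 2023-05-23)
Author: 부산광역시 사하구 (Saha-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3079289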

Alerts

데이터기준일자 (data reference date) has constant value "2023-05-23" [Constant]
단위 (unit) is highly overall correlated with 품목 (item) [High correlation]
품목 (item) is highly overall correlated with 단위 (unit) [High correlation]
비고 (remarks) has 214 (93.0%) missing values [Missing]
금액 (amount) has 4 (1.7%) zeros [Zeros]
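
The percentages in the dataset statistics and alerts follow directly from the reported counts. A minimal sketch that recomputes them, using only figures from this report (the variable names are illustrative):

```python
# Counts taken from the dataset statistics and alerts in this report.
n_variables = 6
n_observations = 230
missing_cells = 214   # every missing cell sits in the 비고 column
zero_amounts = 4      # zeros in the 금액 column

total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells
remarks_missing_pct = 100 * missing_cells / n_observations
zeros_pct = 100 * zero_amounts / n_observations

print(f"Missing cells: {missing_cells_pct:.1f}% of {total_cells} cells")
print(f"비고 missing: {remarks_missing_pct:.1f}% of rows")
print(f"금액 zeros: {zeros_pct:.1f}% of rows")
```

The cell-level figure divides by all 1380 cells, while the column-level alerts divide by the 230 rows, which is why the same 214 missing values appear as both 15.5% and 93.0%.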

Reproduction

Analysis started: 2023-12-10 17:49:00.514663
Analysis finished: 2023-12-10 17:49:01.595433
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
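
A report of this kind can usually be regenerated by loading the source CSV into pandas and passing the frame to ydata-profiling. The sketch below assumes a local copy of the file. The path, the `cp949` encoding, and the function name `build_report` are assumptions for illustration and are not taken from this report.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is cp949-encoded, as Korean public-data CSVs often are.
    # Switch to "utf-8" if decoding fails.
    df = pd.read_csv(csv_path, encoding="cp949")

    profile = ProfileReport(
        df, title="부산광역시_사하구_주요생필품가격정보_20230523"
    )
    profile.to_file(html_path)


# Example call (hypothetical local path):
# build_report("saha_gu_prices_20230523.csv")
```

Timings and memory figures will differ between runs and machines, so only the statistics themselves should be expected to match.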

Variables

품목 (item)
Categorical

HIGH CORRELATION

Distinct: 46
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
 | 5
고춧가루 | 5
달걀 | 5
라면 | 5
설탕 | 5
Other values (41) | 205

Length

Max length: 178
Median length: 8
Mean length: 8.0652174
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 5 | 2.2%
고춧가루 | 5 | 2.2%
달걀 | 5 | 2.2%
라면 | 5 | 2.2%
설탕 | 5 | 2.2%
식용유 | 5 | 2.2%
 | 5 | 2.2%
두부 | 5 | 2.2%
발효조미료 | 5 | 2.2%
참기름 | 5 | 2.2%
Other values (36) | 180 | 78.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
 | 5 | 2.2%
사과 | 5 | 2.2%
소금 | 5 | 2.2%
배추 | 5 | 2.2%
 | 5 | 2.2%
대파 | 5 | 2.2%
양파 | 5 | 2.2%
오이 | 5 | 2.2%
감자 | 5 | 2.2%
상추 | 5 | 2.2%
Other values (36) | 180 | 78.3%

단위 (unit)
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
1.0㎏ 잎없는것 | 10
25㎝정도 1마리 | 10
1.0㎏ | 10
시원소주 360㎖ 1병 | 5
오복왕표 0.9ℓ 1병 | 5
Other values (38) | 190

Length

Max length: 18
Median length: 14
Mean length: 10.891304
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 정미포장미 20㎏
2nd row 정미포장미 20㎏
3rd row 정미포장미 20㎏
4th row 정미포장미 20㎏
5th row 정미포장미 20㎏

Common Values

Value | Count | Frequency (%)
1.0㎏ 잎없는것 | 10 | 4.3%
25㎝정도 1마리 | 10 | 4.3%
1.0㎏ | 10 | 4.3%
시원소주 360㎖ 1병 | 5 | 2.2%
오복왕표 0.9ℓ 1병 | 5 | 2.2%
신라면 120g 5개 1봉지 | 5 | 2.2%
백설표 정백당 1㎏ | 5 | 2.2%
콩기름1.8ℓ | 5 | 2.2%
백태 500g | 5 | 2.2%
340g 풀무원 부침용 | 5 | 2.2%
Other values (33) | 165 | 71.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
1.0㎏ | 30 | 6.2%
500g | 25 | 5.2%
1마리 | 20 | 4.1%
1병 | 20 | 4.1%
100g | 20 | 4.1%
25㎝정도 | 15 | 3.1%
백설표 | 10 | 2.1%
1.5ℓ | 10 | 2.1%
잎없는것 | 10 | 2.1%
1개 | 10 | 2.1%
Other values (61) | 315 | 64.9%

조사업체 (surveyed vendor)
Categorical

Distinct: 5
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
괴정골목시장 | 46
장림골목시장 | 46
서원유통 탑스토아 | 46
롯데마트(사하점) | 46
홈플러스(장림점) | 46

Length

Max length: 9
Median length: 9
Mean length: 7.8
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 괴정골목시장
2nd row: 장림골목시장
3rd row: 서원유통 탑스토아
4th row: 롯데마트(사하점)
5th row: 홈플러스(장림점)

Common Values

Value | Count | Frequency (%)
괴정골목시장 | 46 | 20.0%
장림골목시장 | 46 | 20.0%
서원유통 탑스토아 | 46 | 20.0%
롯데마트(사하점) | 46 | 20.0%
홈플러스(장림점) | 46 | 20.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
괴정골목시장 | 46 | 16.7%
장림골목시장 | 46 | 16.7%
서원유통 | 46 | 16.7%
탑스토아 | 46 | 16.7%
롯데마트(사하점 | 46 | 16.7%
홈플러스(장림점 | 46 | 16.7%

금액 (amount)
Real number (ℝ)

ZEROS

Distinct: 157
Distinct (%): 68.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8098.913
Minimum: 0
Maximum: 91550
Zeros: 4
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 968
Q1: 2205
Median: 4100
Q3: 8247.5
95-th percentile: 34460
Maximum: 91550
Range: 91550
Interquartile range (IQR): 6042.5

Descriptive statistics

Standard deviation: 12480.112
Coefficient of variation (CV): 1.5409613
Kurtosis: 15.953357
Mean: 8098.913
Median Absolute Deviation (MAD): 2265
Skewness: 3.7014485
Sum: 1862750
Variance: 1.5575319 × 10^8
Monotonicity: Not monotonic
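
Several of these statistics are derived from one another, which gives a quick consistency check on the figures for 금액. The sketch below recomputes the mean, coefficient of variation, IQR, range, and variance from the values reported in this section. The inputs are copied from the report and the variable names are illustrative.

```python
from math import isclose

# Values copied from the statistics reported for 금액.
n = 230
total = 1862750
std = 12480.112
q1, q3 = 2205, 8247.5
minimum, maximum = 0, 91550

mean = total / n                 # arithmetic mean = sum / count
cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum
variance = std ** 2              # variance is the squared standard deviation

# Small tolerances absorb the rounding already present in the reported figures.
print(isclose(mean, 8098.913, abs_tol=1e-3))
print(isclose(cv, 1.5409613, abs_tol=1e-5))
print(iqr == 6042.5, value_range == 91550)
print(isclose(variance, 1.5575319e8, rel_tol=1e-6))
```

A CV above 1 together with a skewness near 3.7 indicates a strongly right-skewed distribution, consistent with a median of 4100 sitting well below the mean.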
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2000 | 6 | 2.6%
1980 | 5 | 2.2%
3000 | 4 | 1.7%
4200 | 4 | 1.7%
2500 | 4 | 1.7%
0 | 4 | 1.7%
8500 | 4 | 1.7%
6000 | 3 | 1.3%
5000 | 3 | 1.3%
1000 | 3 | 1.3%
Other values (147) | 190 | 82.6%

Value | Count | Frequency (%)
0 | 4 | 1.7%
700 | 1 | 0.4%
750 | 1 | 0.4%
860 | 1 | 0.4%
870 | 1 | 0.4%
910 | 1 | 0.4%
930 | 2 | 0.9%
950 | 1 | 0.4%
990 | 1 | 0.4%
1000 | 3 | 1.3%

Value | Count | Frequency (%)
91550 | 1 | 0.4%
74000 | 1 | 0.4%
64950 | 1 | 0.4%
57800 | 1 | 0.4%
57000 | 1 | 0.4%
49900 | 1 | 0.4%
49500 | 1 | 0.4%
47900 | 1 | 0.4%
42900 | 1 | 0.4%
39500 | 1 | 0.4%

비고 (remarks)
Text

MISSING

Distinct: 8
Distinct (%): 50.0%
Missing: 214
Missing (%): 93.0%
Memory size: 1.9 KiB

Length

Max length: 11
Median length: 8
Mean length: 5.0625
Min length: 2

Characters and Unicode

Total characters: 81
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 37.5%

Sample

1st row: 메뚜기쌀
2nd row: 황금메뚜기쌀
3rd row: 맥심
4th row: 맥심
5th row: 맥심

Value | Count | Frequency (%)
맥심 | 5 | 25.0%
델몬트오렌지 | 5 | 25.0%
400g | 2 | 10.0%
메뚜기쌀 | 1 | 5.0%
황금메뚜기쌀 | 1 | 5.0%
서울우유 | 1 | 5.0%
340g | 1 | 5.0%
풀무원 | 1 | 5.0%
해찬들 | 1 | 5.0%
순창 | 1 | 5.0%

Most occurring characters

Value | Count | Frequency (%)
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
0 | 5 | 6.2%
(space) | 4 | 4.9%
Other values (24) | 32 | 39.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65 | 80.2%
Decimal Number | 9 | 11.1%
Space Separator | 4 | 4.9%
Lowercase Letter | 3 | 3.7%
3.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 2 | 3.1%
 | 2 | 3.1%
Other values (19) | 21 | 32.3%

Decimal Number
Value | Count | Frequency (%)
0 | 5 | 55.6%
4 | 3 | 33.3%
3 | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
g | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65 | 80.2%
Common | 13 | 16.0%
Latin | 3 | 3.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 2 | 3.1%
 | 2 | 3.1%
Other values (19) | 21 | 32.3%

Common
Value | Count | Frequency (%)
0 | 5 | 38.5%
(space) | 4 | 30.8%
4 | 3 | 23.1%
3 | 1 | 7.7%

Latin
Value | Count | Frequency (%)
g | 3 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65 | 80.2%
ASCII | 16 | 19.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 5 | 7.7%
 | 2 | 3.1%
 | 2 | 3.1%
Other values (19) | 21 | 32.3%

ASCII
Value | Count | Frequency (%)
0 | 5 | 31.2%
(space) | 4 | 25.0%
4 | 3 | 18.8%
g | 3 | 18.8%
3 | 1 | 6.2%

데이터기준일자 (data reference date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
2023-05-23 | 230

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-05-23
2nd row: 2023-05-23
3rd row: 2023-05-23
4th row: 2023-05-23
5th row: 2023-05-23

Common Values

Value | Count | Frequency (%)
2023-05-23 | 230 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-05-23 | 230 | 100.0%

Interactions


Correlations

 | 품목 | 단위 | 조사업체 | 금액 | 비고
품목 | 1.000 | 1.000 | 0.000 | 0.793 | 1.000
단위 | 1.000 | 1.000 | 0.000 | 0.803 | 1.000
조사업체 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
금액 | 0.793 | 0.803 | 0.000 | 1.000 | 0.871
비고 | 1.000 | 1.000 | 0.000 | 0.871 | 1.000

 | 단위 | 품목 | 조사업체
단위 | 1.000 | 0.992 | 0.000
품목 | 0.992 | 1.000 | 0.000
조사업체 | 0.000 | 0.000 | 1.000

 | 금액 | 품목 | 단위 | 조사업체
금액 | 1.000 | 0.378 | 0.395 | 0.000
품목 | 0.378 | 1.000 | 0.992 | 0.000
단위 | 0.395 | 0.992 | 1.000 | 0.000
조사업체 | 0.000 | 0.000 | 0.000 | 1.000
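
The report does not state which coefficient each matrix uses. Assuming the categorical pairs are scored with a measure in the family of Cramér's V, the sketch below shows why 품목 and 단위 come out near 1 while 조사업체 scores 0 against both. It uses the plain (uncorrected) formula and a three-item subset taken from the sample rows; the function name `cramers_v` is illustrative, and a bias-corrected variant would give slightly lower values.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Subset of the sample rows: each item has one unit and is priced at all five vendors.
units = {"밀가루": "백설표 중력분1등3㎏", "소금": "맛소금 1kg", "케첩": "500g/개"}
vendors = ["괴정골목시장", "장림골목시장", "서원유통 탑스토아",
           "롯데마트(사하점)", "홈플러스(장림점)"]

item_col = [item for item in units for _ in vendors]
vendor_col = [vendor for _ in units for vendor in vendors]
unit_col = [units[item] for item in item_col]

v_item_unit = cramers_v(item_col, unit_col)
v_item_vendor = cramers_v(item_col, vendor_col)
print(round(v_item_unit, 3), round(v_item_vendor, 3))
```

Because every item determines its unit, the item and unit columns are perfectly associated, and because every item appears exactly once per vendor, the vendor column carries no information about the item.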

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 품목 | 단위 | 조사업체 | 금액 | 비고 | 데이터기준일자
0 |  | 정미포장미 20㎏ | 괴정골목시장 | 57000 | 메뚜기쌀 | 2023-05-23
1 |  | 정미포장미 20㎏ | 장림골목시장 | 49900 | 황금메뚜기쌀 | 2023-05-23
2 |  | 정미포장미 20㎏ | 서원유통 탑스토아 | 57800 | <NA> | 2023-05-23
3 |  | 정미포장미 20㎏ | 롯데마트(사하점) | 42900 | <NA> | 2023-05-23
4 |  | 정미포장미 20㎏ | 홈플러스(장림점) | 47900 | <NA> | 2023-05-23
5 | 밀가루 | 백설표 중력분1등3㎏ | 괴정골목시장 | 6200 | <NA> | 2023-05-23
6 | 밀가루 | 백설표 중력분1등3㎏ | 장림골목시장 | 6720 | <NA> | 2023-05-23
7 | 밀가루 | 백설표 중력분1등3㎏ | 서원유통 탑스토아 | 5350 | <NA> | 2023-05-23
8 | 밀가루 | 백설표 중력분1등3㎏ | 롯데마트(사하점) | 5700 | <NA> | 2023-05-23
9 | 밀가루 | 백설표 중력분1등3㎏ | 홈플러스(장림점) | 5700 | <NA> | 2023-05-23

 | 품목 | 단위 | 조사업체 | 금액 | 비고 | 데이터기준일자
220 | 소금 | 맛소금 1kg | 괴정골목시장 | 4700 | <NA> | 2023-05-23
221 | 소금 | 맛소금 1kg | 장림골목시장 | 4800 | <NA> | 2023-05-23
222 | 소금 | 맛소금 1kg | 서원유통 탑스토아 | 4180 | <NA> | 2023-05-23
223 | 소금 | 맛소금 1kg | 롯데마트(사하점) | 4180 | <NA> | 2023-05-23
224 | 소금 | 맛소금 1kg | 홈플러스(장림점) | 4180 | <NA> | 2023-05-23
225 | 케첩 | 500g/개 | 괴정골목시장 | 2700 | <NA> | 2023-05-23
226 | 케첩 | 500g/개 | 장림골목시장 | 2900 | <NA> | 2023-05-23
227 | 케첩 | 500g/개 | 서원유통 탑스토아 | 3180 | <NA> | 2023-05-23
228 | 케첩 | 500g/개 | 롯데마트(사하점) | 3180 | <NA> | 2023-05-23
229 | 케첩 | 500g/개 | 홈플러스(장림점) | 3180 | <NA> | 2023-05-23