Overview

Dataset statistics

Number of variables: 7
Number of observations: 86
Missing cells: 12
Missing cells (%): 2.0%
Duplicate rows: 1
Duplicate rows (%): 1.2%
Total size in memory: 5.0 KiB
Average record size in memory: 59.5 B
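
The percentages and the per-record size above follow from simple ratios. The sketch below recomputes them, assuming the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 86
missing_cells = 12
duplicate_rows = 1
total_size_bytes = 5.0 * 1024  # 5.0 KiB, assuming 1 KiB = 1024 B

# Total number of cells in the table (rows x columns).
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the reported total size is itself rounded, the recomputed record size is only approximate.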

Variable types

Categorical: 1
Text: 3
Numeric: 2
DateTime: 1

Dataset

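Description: Lowest and average prices per item for Gyeongsangbuk-do and Pohang-si, computed from 84 consumer price records collected at large supermarkets and intended to be provided to the public.
URL: https://www.data.go.kr/data/15048475/fileData.do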

Alerts

데이터기준일자 has constant value "2023-03-05" [Constant]
Dataset has 1 (1.2%) duplicate rows [Duplicates]
경상북도평균가 is highly overall correlated with 포항시평균가 [High correlation]
포항시평균가 is highly overall correlated with 경상북도평균가 [High correlation]
품명 has 2 (2.3%) missing values [Missing]
경상북도평균가 has 2 (2.3%) missing values [Missing]
경상북도최저가 has 2 (2.3%) missing values [Missing]
포항시평균가 has 2 (2.3%) missing values [Missing]
포항시최저가 has 2 (2.3%) missing values [Missing]
데이터기준일자 has 2 (2.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 10:07:28.319265
Analysis finished: 2023-12-12 10:07:29.987634
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
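
The reported duration is the difference between the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 10:07:28.319265")
finished = datetime.fromisoformat("2023-12-12 10:07:29.987634")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```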

Variables

구분
Categorical

Distinct: 15
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B
채소류 | 15
어류 | 10
유지·조미료 | 9
곡류 | 8
기타잡비 | 8
Other values (10) | 36

Length

Max length: 6
Median length: 4
Mean length: 3.244186
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 곡류
2nd row: 곡류
3rd row: 곡류
4th row: 곡류
5th row: 곡류

Common Values

Value | Count | Frequency (%)
채소류 | 15 | 17.4%
어류 | 10 | 11.6%
유지·조미료 | 9 | 10.5%
곡류 | 8 | 9.3%
기타잡비 | 8 | 9.3%
과실류 | 7 | 8.1%
빵및과자 | 6 | 7.0%
육류 | 4 | 4.7%
외식 | 4 | 4.7%
낙농품 | 3 | 3.5%
Other values (5) | 12 | 14.0%
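
Each frequency in the table is the category count divided by the 86 observations. The sketch below recomputes the percentages and the "Other values" remainder from the listed counts; the dictionary and variable names are illustrative.

```python
# Counts for the ten most common 구분 categories, copied from the table.
counts = {
    "채소류": 15, "어류": 10, "유지·조미료": 9, "곡류": 8, "기타잡비": 8,
    "과실류": 7, "빵및과자": 6, "육류": 4, "외식": 4, "낙농품": 3,
}
n_observations = 86

# Rows not covered by the ten listed categories.
other_count = n_observations - sum(counts.values())

# Frequency (%) per category, rounded to one decimal as in the report.
frequencies = {
    name: round(100 * count / n_observations, 1)
    for name, count in counts.items()
}
other_frequency = round(100 * other_count / n_observations, 1)

for name, pct in frequencies.items():
    print(f"{name}: {pct}%")
print(f"Other values: {other_count} ({other_frequency}%)")
```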

Length

Histogram of lengths of the category

품명
Text

MISSING 

Distinct: 84
Distinct (%): 100.0%
Missing: 2
Missing (%): 2.3%
Memory size: 820.0 B

Length

Max length: 8
Median length: 7.5
Mean length: 2.5714286
Min length: 1

Characters and Unicode

Total characters: 216
Distinct characters: 130
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 84
Unique (%): 100.0%

Sample

1st row:
2nd row: 보리쌀
3rd row: 찹쌀
4th row:
5th row: 밀가루
ValueCountFrequency (%)
미역 1
 
1.2%
1
 
1.2%
초코파이 1
 
1.2%
스낵과자 1
 
1.2%
식빵 1
 
1.2%
1
 
1.2%
1
 
1.2%
된장 1
 
1.2%
간장 1
 
1.2%
고추장 1
 
1.2%
Other values (76) 76
88.4%

Most occurring characters

ValueCountFrequency (%)
8
 
3.7%
6
 
2.8%
6
 
2.8%
5
 
2.3%
5
 
2.3%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
3
 
1.4%
Other values (120) 167
77.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 212
98.1%
Space Separator 2
 
0.9%
Open Punctuation 1
 
0.5%
Close Punctuation 1
 
0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
3.8%
6
 
2.8%
6
 
2.8%
5
 
2.4%
5
 
2.4%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
3
 
1.4%
Other values (117) 163
76.9%
Space Separator
ValueCountFrequency (%)
2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 212
98.1%
Common 4
 
1.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8
 
3.8%
6
 
2.8%
6
 
2.8%
5
 
2.4%
5
 
2.4%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
3
 
1.4%
Other values (117) 163
76.9%
Common
ValueCountFrequency (%)
2
50.0%
( 1
25.0%
) 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 212
98.1%
ASCII 4
 
1.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8
 
3.8%
6
 
2.8%
6
 
2.8%
5
 
2.4%
5
 
2.4%
4
 
1.9%
4
 
1.9%
4
 
1.9%
4
 
1.9%
3
 
1.4%
Other values (117) 163
76.9%
ASCII
ValueCountFrequency (%)
2
50.0%
( 1
25.0%
) 1
25.0%

경상북도평균가
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 84
Distinct (%): 100.0%
Missing: 2
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 8504.1548
Minimum: 255
Maximum: 54784
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 906.0 B

Quantile statistics

Minimum: 255
5-th percentile: 1063.7
Q1: 2644.5
median: 5066.5
Q3: 9936
95-th percentile: 32074.95
Maximum: 54784
Range: 54529
Interquartile range (IQR): 7291.5

Descriptive statistics

Standard deviation: 10328.04
Coefficient of variation (CV): 1.2144699
Kurtosis: 9.034031
Mean: 8504.1548
Median Absolute Deviation (MAD): 3312.5
Skewness: 2.8407367
Sum: 714349
Variance: 1.0666842 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1905 1
 
1.2%
4207 1
 
1.2%
2169 1
 
1.2%
3595 1
 
1.2%
1402 1
 
1.2%
12354 1
 
1.2%
3906 1
 
1.2%
5594 1
 
1.2%
9903 1
 
1.2%
2586 1
 
1.2%
Other values (74) 74
86.0%
(Missing) 2
 
2.3%
ValueCountFrequency (%)
255 1
1.2%
608 1
1.2%
801 1
1.2%
911 1
1.2%
1031 1
1.2%
1249 1
1.2%
1355 1
1.2%
1364 1
1.2%
1402 1
1.2%
1463 1
1.2%
ValueCountFrequency (%)
54784 1
1.2%
53726 1
1.2%
37270 1
1.2%
36746 1
1.2%
33336 1
1.2%
24929 1
1.2%
23044 1
1.2%
17500 1
1.2%
17475 1
1.2%
16146 1
1.2%

경상북도최저가
Text

MISSING 

Distinct: 73
Distinct (%): 86.9%
Missing: 2
Missing (%): 2.3%
Memory size: 820.0 B

Length

Max length: 6
Median length: 5
Mean length: 4.7857143
Min length: 3

Characters and Unicode

Total characters: 402
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): 73.8%

Sample

1st row: 47,900
2nd row: 1,950
3rd row: 2,630
4th row: 9,188
5th row: 4,990
ValueCountFrequency (%)
3,500 2
 
2.4%
700 2
 
2.4%
3,000 2
 
2.4%
990 2
 
2.4%
2,500 2
 
2.4%
4,100 2
 
2.4%
6,000 2
 
2.4%
2,800 2
 
2.4%
2,200 2
 
2.4%
1,120 2
 
2.4%
Other values (63) 64
76.2%

Most occurring characters

ValueCountFrequency (%)
0 118
29.4%
, 71
17.7%
9 40
 
10.0%
2 31
 
7.7%
1 29
 
7.2%
5 24
 
6.0%
6 22
 
5.5%
8 20
 
5.0%
4 19
 
4.7%
3 17
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 331
82.3%
Other Punctuation 71
 
17.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 118
35.6%
9 40
 
12.1%
2 31
 
9.4%
1 29
 
8.8%
5 24
 
7.3%
6 22
 
6.6%
8 20
 
6.0%
4 19
 
5.7%
3 17
 
5.1%
7 11
 
3.3%
Other Punctuation
ValueCountFrequency (%)
, 71
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 402
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 118
29.4%
, 71
17.7%
9 40
 
10.0%
2 31
 
7.7%
1 29
 
7.2%
5 24
 
6.0%
6 22
 
5.5%
8 20
 
5.0%
4 19
 
4.7%
3 17
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 402
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 118
29.4%
, 71
17.7%
9 40
 
10.0%
2 31
 
7.7%
1 29
 
7.2%
5 24
 
6.0%
6 22
 
5.5%
8 20
 
5.0%
4 19
 
4.7%
3 17
 
4.2%
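
The lowest-price columns are profiled as Text rather than as numbers, presumably because many values carry a thousands separator (for example "47,900"), while others in 포항시최저가 do not (for example "49800"). A minimal sketch of normalising such strings to integers before numeric analysis follows; `parse_price` is a hypothetical helper, not part of the report or of ydata-profiling.

```python
def parse_price(text):
    """Convert a price string such as "47,900" or "49800" to an int.

    Returns None for missing or empty input.
    """
    if text is None:
        return None
    # Drop thousands separators and surrounding whitespace.
    cleaned = text.replace(",", "").strip()
    return int(cleaned) if cleaned else None


# Sample values from the report, plus one missing entry.
samples = ["47,900", "1,950", "2,630", "9,188", "4,990", "49800", None]
parsed = [parse_price(s) for s in samples]
print(parsed)
```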

포항시평균가
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 83
Distinct (%): 98.8%
Missing: 2
Missing (%): 2.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 8478.3214
Minimum: 149
Maximum: 60226
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 906.0 B

Quantile statistics

Minimum: 149
5-th percentile: 900.65
Q1: 2902.75
median: 5017.5
Q3: 9474
95-th percentile: 31246.65
Maximum: 60226
Range: 60077
Interquartile range (IQR): 6571.25

Descriptive statistics

Standard deviation: 11025.663
Coefficient of variation (CV): 1.3004536
Kurtosis: 11.456407
Mean: 8478.3214
Median Absolute Deviation (MAD): 3077.5
Skewness: 3.202619
Sum: 712179
Variance: 1.2156525 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1284 2
 
2.3%
4320 1
 
1.2%
3380 1
 
1.2%
1295 1
 
1.2%
18380 1
 
1.2%
4140 1
 
1.2%
5040 1
 
1.2%
10141 1
 
1.2%
2009 1
 
1.2%
1089 1
 
1.2%
Other values (73) 73
84.9%
(Missing) 2
 
2.3%
ValueCountFrequency (%)
149 1
1.2%
617 1
1.2%
721 1
1.2%
794 1
1.2%
890 1
1.2%
961 1
1.2%
1089 1
1.2%
1284 2
2.3%
1295 1
1.2%
1411 1
1.2%
ValueCountFrequency (%)
60226 1
1.2%
59386 1
1.2%
45040 1
1.2%
34577 1
1.2%
33060 1
1.2%
20971 1
1.2%
18380 1
1.2%
18000 1
1.2%
17971 1
1.2%
14877 1
1.2%

포항시최저가
Text

MISSING 

Distinct: 79
Distinct (%): 94.0%
Missing: 2
Missing (%): 2.3%
Memory size: 820.0 B

Length

Max length: 6
Median length: 4
Mean length: 4.1190476
Min length: 3

Characters and Unicode

Total characters: 346
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 74
Unique (%): 88.1%

Sample

1st row: 49800
2nd row: 2990
3rd row: 2625
4th row: 9960
5th row: 5190
ValueCountFrequency (%)
500 2
 
2.4%
1980 2
 
2.4%
3980 2
 
2.4%
2990 2
 
2.4%
4950 2
 
2.4%
4320 1
 
1.2%
836 1
 
1.2%
2480 1
 
1.2%
15,000 1
 
1.2%
2630 1
 
1.2%
Other values (69) 69
82.1%

Most occurring characters

ValueCountFrequency (%)
0 108
31.2%
9 36
 
10.4%
5 33
 
9.5%
8 32
 
9.2%
2 32
 
9.2%
1 27
 
7.8%
3 21
 
6.1%
4 19
 
5.5%
6 17
 
4.9%
, 12
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 334
96.5%
Other Punctuation 12
 
3.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 108
32.3%
9 36
 
10.8%
5 33
 
9.9%
8 32
 
9.6%
2 32
 
9.6%
1 27
 
8.1%
3 21
 
6.3%
4 19
 
5.7%
6 17
 
5.1%
7 9
 
2.7%
Other Punctuation
ValueCountFrequency (%)
, 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 346
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 108
31.2%
9 36
 
10.4%
5 33
 
9.5%
8 32
 
9.2%
2 32
 
9.2%
1 27
 
7.8%
3 21
 
6.1%
4 19
 
5.5%
6 17
 
4.9%
, 12
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 346
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 108
31.2%
9 36
 
10.4%
5 33
 
9.5%
8 32
 
9.2%
2 32
 
9.2%
1 27
 
7.8%
3 21
 
6.1%
4 19
 
5.5%
6 17
 
4.9%
, 12
 
3.5%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 1.2%
Missing: 2
Missing (%): 2.3%
Memory size: 820.0 B
Minimum: 2023-03-05 00:00:00
Maximum: 2023-03-05 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Four interaction plots, not reproduced here]

Correlations

Variable | 구분 | 품명 | 경상북도평균가 | 경상북도최저가 | 포항시평균가 | 포항시최저가
구분 | 1.000 | 1.000 | 0.000 | 0.841 | 0.107 | 0.983
품명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
경상북도평균가 | 0.000 | 1.000 | 1.000 | 0.996 | 0.978 | 0.996
경상북도최저가 | 0.841 | 1.000 | 0.996 | 1.000 | 0.965 | 0.947
포항시평균가 | 0.107 | 1.000 | 0.978 | 0.965 | 1.000 | 0.991
포항시최저가 | 0.983 | 1.000 | 0.996 | 0.947 | 0.991 | 1.000
Variable | 경상북도평균가 | 포항시평균가 | 구분
경상북도평균가 | 1.000 | 0.980 | 0.000
포항시평균가 | 0.980 | 1.000 | 0.050
구분 | 0.000 | 0.050 | 1.000
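
To illustrate why the two average-price columns trigger the high-correlation alert, the sketch below computes a Pearson correlation coefficient on the ten first-row sample values of 경상북도평균가 and 포항시평균가 shown in this report. This is only an illustration: the report does not state which correlation measure produced the tables above, and ten rows are not the full dataset, so the result is not expected to match the reported coefficients. The function and variable names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Average prices from the first ten sample rows of the report.
gyeongbuk_avg = [53726, 3786, 4730, 12100, 5920, 3463, 801, 3458, 54784, 12890]
pohang_avg = [59386, 3880, 4665, 12376, 5630, 3783, 794, 3697, 60226, 12936]

r = pearson(gyeongbuk_avg, pohang_avg)
print(f"Pearson r on the ten sample rows: {r:.4f}")
```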

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

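Row | 구분 | 품명 | 경상북도평균가 | 경상북도최저가 | 포항시평균가 | 포항시최저가 | 데이터기준일자
0 | 곡류 |  | 53726 | 47,900 | 59386 | 49800 | 2023-03-05
1 | 곡류 | 보리쌀 | 3786 | 1,950 | 3880 | 2990 | 2023-03-05
2 | 곡류 | 찹쌀 | 4730 | 2,630 | 4665 | 2625 | 2023-03-05
3 | 곡류 |  | 12100 | 9,188 | 12376 | 9960 | 2023-03-05
4 | 곡류 | 밀가루 | 5920 | 4,990 | 5630 | 5190 | 2023-03-05
5 | 곡류 | 두부 | 3463 | 1,990 | 3783 | 1800 | 2023-03-05
6 | 곡류 | 라면 | 801 | 660 | 794 | 740 | 2023-03-05
7 | 곡류 | 국수 | 3458 | 2,590 | 3697 | 3550 | 2023-03-05
8 | 육류 | 쇠고기 (국산) | 54784 | 34,275 | 60226 | 36500 | 2023-03-05
9 | 육류 | 돼지고기 | 12890 | 9,800 | 12936 | 8450 | 2023-03-05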
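Row | 구분 | 품명 | 경상북도평균가 | 경상북도최저가 | 포항시평균가 | 포항시최저가 | 데이터기준일자
76 | 기타잡비 | 치약 | 3120 | 1,120 | 4070 | 1940 | 2023-03-05
77 | 기타잡비 | 로션 | 6773 | 4,100 | 5521 | 2218 | 2023-03-05
78 | 기타잡비 | 샴푸 | 7967 | 5,300 | 7079 | 2605 | 2023-03-05
79 | 기타잡비 | 손세정제 | 4906 | 2,800 | 4575 | 3250 | 2023-03-05
80 | 기타잡비 | 보건용 마스크 | 1031 | 700 | 890 | 890 | 2023-03-05
81 | 차와음료 | 커피 | 9502 | 1,490 | 7139 | 520 | 2023-03-05
82 | 차와음료 | 콜라 | 2796 | 2,290 | 3113 | 2290 | 2023-03-05
83 | 차와음료 | 과일주스 | 2564 | 1,290 | 3570 | 3280 | 2023-03-05
84 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
85 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>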

Duplicate rows

Most frequently occurring

Row | 구분 | 품명 | 경상북도평균가 | 경상북도최저가 | 포항시평균가 | 포항시최저가 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
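
The duplicated row consists entirely of missing values, which matches the two all-<NA> rows (84 and 85) in the sample. Those two rows would also account for the 12 missing cells reported in the overview (2 rows × 6 columns with missing values). A minimal sketch of dropping such rows before analysis follows, using plain dictionaries with `None` standing in for <NA>; the row data is abbreviated and the helper name is hypothetical.

```python
def is_empty_row(row):
    """True when every field of the row is missing (None)."""
    return all(value is None for value in row.values())


# Abbreviated stand-in for the tail of the dataset; None represents <NA>.
rows = [
    {"구분": "차와음료", "품명": "콜라", "경상북도평균가": 2796},
    {"구분": "차와음료", "품명": "과일주스", "경상북도평균가": 2564},
    {"구분": None, "품명": None, "경상북도평균가": None},
    {"구분": None, "품명": None, "경상북도평균가": None},
]

# Keep only rows that contain at least one non-missing value.
cleaned = [row for row in rows if not is_empty_row(row)]
print(f"Dropped {len(rows) - len(cleaned)} empty rows, kept {len(cleaned)}")
```

Removing the two empty rows would leave the 84 records mentioned in the dataset description.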