Overview

Dataset statistics

Number of variables: 11
Number of observations: 22
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 22.7%
Total size in memory: 2.2 KiB
Average record size in memory: 104.0 B
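The derived figures in this block can be cross-checked with a few lines of arithmetic. The sketch below assumes that the duplicate percentage is taken relative to the number of observations and that the total size is the average record size multiplied by the row count; the report does not state these formulas, but both assumptions agree with the numbers shown.

```python
# Figures copied from the Dataset statistics block.
n_variables = 11
n_observations = 22
missing_cells = 0
duplicate_rows = 5
avg_record_bytes = 104.0

# Assumed relationships (not stated in the report itself).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_kib = avg_record_bytes * n_observations / 1024

print(f"cells={total_cells}, missing={missing_pct:.1f}%")
print(f"duplicates={duplicate_pct:.1f}%")
print(f"total size={total_kib:.1f} KiB")
```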

Variable types

Categorical: 10
DateTime: 1

Dataset

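Description: Food waste sales by period for Bupyeong-gu, Incheon Metropolitan City (인천광역시 부평구). Columns: site코드 (site code), 전표일자 (slip date), 상품코드 (product code), 확정수량 (confirmed quantity), 매출단가 (sales unit price), 스티커시작번호 (sticker start number), 스티커최종번호 (sticker end number), 출고단위 (shipping unit), 출고단위수량 (shipping unit quantity), 묶음당매수 (sheets per bundle), BOX당묶음수량 (bundles per box)
Author: Bupyeong-gu, Incheon Metropolitan City (인천광역시 부평구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15062412&srcSe=7661IVAWM27C61E190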

Alerts

site코드 has constant value "306" (Constant)
전표일자 has constant value "2020-07-27" (Constant)
출고단위수량 has constant value "1" (Constant)
Dataset has 5 (22.7%) duplicate rows (Duplicates)
BOX당묶음수량 is highly overall correlated with 상품코드 and 5 other fields (High correlation)
스티커시작번호 is highly overall correlated with 상품코드 and 5 other fields (High correlation)
상품코드 is highly overall correlated with 확정수량 and 5 other fields (High correlation)
매출단가 is highly overall correlated with 상품코드 and 5 other fields (High correlation)
출고단위 is highly overall correlated with 확정수량 (High correlation)
스티커최종번호 is highly overall correlated with 상품코드 and 5 other fields (High correlation)
묶음당매수 is highly overall correlated with 상품코드 and 5 other fields (High correlation)
확정수량 is highly overall correlated with 상품코드 and 6 other fields (High correlation)
출고단위 is highly imbalanced (73.3%) (Imbalance)
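The 73.3% imbalance figure for 출고단위 is consistent with one minus the normalized Shannon entropy of the class distribution. The report does not state its formula, so the sketch below is an assumption that happens to reproduce the number; `imbalance_score` is a name chosen here for illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class distribution and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single-class column
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# 출고단위 value counts from the report: "2" occurs 21 times, "3" once.
unit_counts = {"2": 21, "3": 1}
score = imbalance_score(list(unit_counts.values()))
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as a single class dominates.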

Reproduction

Analysis started: 2024-04-17 09:51:06.130183
Analysis finished: 2024-04-17 09:51:06.712545
Duration: 0.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
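A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration used for this report: the file paths, the report title, and the `encoding` default are assumptions, and the settings stored in config.json are not reproduced, so the output may differ in detail.

```python
def build_profile(csv_path: str, html_path: str, encoding: str = "utf-8") -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so the function can be defined without the
    # third-party libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Bupyeong-gu food waste sales")
    report.to_file(html_path)

# Hypothetical usage (both paths are placeholders):
# build_profile("bupyeong_food_waste_sales.csv", "report.html")
```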

Variables

site코드
Categorical

CONSTANT

Distinct: 1
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 306
2nd row: 306
3rd row: 306
4th row: 306
5th row: 306

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 306 | 22 | 100.0% |

Figure: Histogram of lengths of the category

전표일자
Date

CONSTANT

Distinct: 1
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B
Minimum: 2020-07-27 00:00:00
Maximum: 2020-07-27 00:00:00
Figure: Histogram with fixed size bins (bins=1)

상품코드
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5050
2nd row: 5050
3rd row: 5050
4th row: 5050
5th row: 5050

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 5050 | 11 | 50.0% |
| 5005 | 5 | 22.7% |
| 7020 | 2 | 9.1% |
| 5010 | 2 | 9.1% |
| 6125 | 2 | 9.1% |

Figure: Histogram of lengths of the category

확정수량
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.0454545
Min length: 2

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: 10
2nd row: 10
3rd row: 10
4th row: 10
5th row: 10

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 12 | 54.5% |
| 20 | 5 | 22.7% |
| 50 | 4 | 18.2% |
| 200 | 1 | 4.5% |

Figure: Histogram of lengths of the category
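The summary figures for 확정수량 follow directly from its value counts. As a hedged illustration, the sketch below assumes that "Distinct" counts different values, that "Unique" counts values occurring exactly once, and that the length statistics are taken over the string form of each observation; these readings match the numbers reported for this column.

```python
from statistics import mean, median

# 확정수량 value counts as listed under Common Values.
value_counts = {"10": 12, "20": 5, "50": 4, "200": 1}

n = sum(value_counts.values())
distinct = len(value_counts)
unique = sum(1 for c in value_counts.values() if c == 1)

# Expand to one string length per observation.
lengths = [len(v) for v, c in value_counts.items() for _ in range(c)]

print(f"Distinct: {distinct} ({distinct / n:.1%})")
print(f"Unique: {unique} ({unique / n:.1%})")
print(f"Length: min={min(lengths)}, median={median(lengths)}, "
      f"mean={mean(lengths):.7f}, max={max(lengths)}")
```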

매출단가
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.5909091
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1699
2nd row: 1699
3rd row: 1699
4th row: 1699
5th row: 1699

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1699 | 11 | 50.0% |
| 193 | 5 | 22.7% |
| 689 | 2 | 9.1% |
| 363 | 2 | 9.1% |
| 7060 | 2 | 9.1% |

Figure: Histogram of lengths of the category

스티커시작번호
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20100000000000000
2nd row: 20100000000000000
3rd row: 20100000000000000
4th row: 20100000000000000
5th row: 20100000000000000

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 20100000000000000 | 11 | 50.0% |
| 20000000000000000 | 9 | 40.9% |
| 19100000000000000 | 2 | 9.1% |

Figure: Histogram of lengths of the category

스티커최종번호
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20100000000000000
2nd row: 20100000000000000
3rd row: 20100000000000000
4th row: 20100000000000000
5th row: 20100000000000000

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 20100000000000000 | 11 | 50.0% |
| 20000000000000000 | 9 | 40.9% |
| 19100000000000000 | 2 | 9.1% |

Figure: Histogram of lengths of the category

출고단위
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 21 | 95.5% |
| 3 | 1 | 4.5% |

Figure: Histogram of lengths of the category

출고단위수량
Categorical

CONSTANT

Distinct: 1
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 22 | 100.0% |

Figure: Histogram of lengths of the category

묶음당매수
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 10
3rd row: 10
4th row: 10
5th row: 10

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 13 | 59.1% |
| 20 | 5 | 22.7% |
| 50 | 4 | 18.2% |

Figure: Histogram of lengths of the category

BOX당묶음수량
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 15 | 68.2% |
| 50 | 5 | 22.7% |
| 10 | 2 | 9.1% |

Figure: Histogram of lengths of the category

Correlations

| | 상품코드 | 확정수량 | 매출단가 | 스티커시작번호 | 스티커최종번호 | 출고단위 | 묶음당매수 | BOX당묶음수량 |
|---|---|---|---|---|---|---|---|---|
| 상품코드 | 1.000 | 0.811 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 |
| 확정수량 | 0.811 | 1.000 | 0.811 | 0.634 | 0.634 | 1.000 | 1.000 | 0.649 |
| 매출단가 | 1.000 | 0.811 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 |
| 스티커시작번호 | 1.000 | 0.634 | 1.000 | 1.000 | 1.000 | 0.000 | 0.927 | 0.979 |
| 스티커최종번호 | 1.000 | 0.634 | 1.000 | 1.000 | 1.000 | 0.000 | 0.927 | 0.979 |
| 출고단위 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| 묶음당매수 | 1.000 | 1.000 | 1.000 | 0.927 | 0.927 | 0.000 | 1.000 | 0.935 |
| BOX당묶음수량 | 1.000 | 0.649 | 1.000 | 0.979 | 0.979 | 0.000 | 0.935 | 1.000 |

| | BOX당묶음수량 | 스티커시작번호 | 상품코드 | 매출단가 | 출고단위 | 스티커최종번호 | 묶음당매수 | 확정수량 |
|---|---|---|---|---|---|---|---|---|
| BOX당묶음수량 | 1.000 | 0.820 | 0.946 | 0.946 | 0.000 | 0.820 | 0.686 | 0.652 |
| 스티커시작번호 | 0.820 | 1.000 | 0.946 | 0.946 | 0.000 | 1.000 | 0.669 | 0.635 |
| 상품코드 | 0.946 | 0.946 | 1.000 | 1.000 | 0.000 | 0.946 | 0.946 | 0.749 |
| 매출단가 | 0.946 | 0.946 | 1.000 | 1.000 | 0.000 | 0.946 | 0.946 | 0.749 |
| 출고단위 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.949 |
| 스티커최종번호 | 0.820 | 1.000 | 0.946 | 0.946 | 0.000 | 1.000 | 0.669 | 0.635 |
| 묶음당매수 | 0.686 | 0.669 | 0.946 | 0.946 | 0.000 | 0.669 | 1.000 | 0.973 |
| 확정수량 | 0.652 | 0.635 | 0.749 | 0.749 | 0.949 | 0.635 | 0.973 | 1.000 |

| | 상품코드 | 확정수량 | 매출단가 | 스티커시작번호 | 스티커최종번호 | 출고단위 | 묶음당매수 | BOX당묶음수량 |
|---|---|---|---|---|---|---|---|---|
| 상품코드 | 1.000 | 0.749 | 1.000 | 0.946 | 0.946 | 0.000 | 0.946 | 0.946 |
| 확정수량 | 0.749 | 1.000 | 0.749 | 0.635 | 0.635 | 0.949 | 0.973 | 0.652 |
| 매출단가 | 1.000 | 0.749 | 1.000 | 0.946 | 0.946 | 0.000 | 0.946 | 0.946 |
| 스티커시작번호 | 0.946 | 0.635 | 0.946 | 1.000 | 1.000 | 0.000 | 0.669 | 0.820 |
| 스티커최종번호 | 0.946 | 0.635 | 0.946 | 1.000 | 1.000 | 0.000 | 0.669 | 0.820 |
| 출고단위 | 0.000 | 0.949 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| 묶음당매수 | 0.946 | 0.973 | 0.946 | 0.669 | 0.669 | 0.000 | 1.000 | 0.686 |
| BOX당묶음수량 | 0.946 | 0.652 | 0.946 | 0.820 | 0.820 | 0.000 | 0.686 | 1.000 |
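The extracted text does not name the coefficient behind each matrix. As an inference rather than a documented fact, several entries in the second and third matrices (0.949 for 확정수량 vs 출고단위, 0.749 for 상품코드 vs 확정수량, 0.000 for 상품코드 vs 출고단위) are consistent with bias-corrected Cramér's V. The sketch below rebuilds the three relevant columns from the row groups listed under Duplicate rows plus the single non-duplicated row (index 19); `cramers_v_corrected` is a helper written for this illustration.

```python
import math
from collections import Counter

def cramers_v_corrected(pairs):
    """Bias-corrected Cramér's V for a list of (x, y) categorical pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_tot = Counter(x for x, _ in pairs)
    col_tot = Counter(y for _, y in pairs)
    r, k = len(row_tot), len(col_tot)
    # Pearson chi-squared, using chi2 = n * (sum O^2 / (R * C) - 1).
    chi2 = n * (sum(o * o / (row_tot[x] * col_tot[y])
                    for (x, y), o in joint.items()) - 1)
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# (상품코드, 확정수량, 출고단위, number of rows), reconstructed from the report.
groups = [
    ("5050", "10", "2", 10),
    ("5050", "200", "3", 1),
    ("5005", "20", "2", 5),
    ("5010", "50", "2", 2),
    ("6125", "10", "2", 2),
    ("7020", "50", "2", 2),
]
rows = [(p, q, u) for p, q, u, c in groups for _ in range(c)]

v_qty_unit = cramers_v_corrected([(q, u) for _, q, u in rows])
v_prod_qty = cramers_v_corrected([(p, q) for p, q, _ in rows])
v_prod_unit = cramers_v_corrected([(p, u) for p, _, u in rows])
print(round(v_qty_unit, 3), round(v_prod_qty, 3), round(v_prod_unit, 3))
```

With only 22 observations the bias correction is substantial: without it, 확정수량 and 출고단위 would score exactly 1, because the single row with 출고단위 3 is also the only row with 확정수량 200.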

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

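| | site코드 | 전표일자 | 상품코드 | 확정수량 | 매출단가 | 스티커시작번호 | 스티커최종번호 | 출고단위 | 출고단위수량 | 묶음당매수 | BOX당묶음수량 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 1 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 2 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 3 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 4 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 5 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 6 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 7 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 8 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |
| 9 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 |

| | site코드 | 전표일자 | 상품코드 | 확정수량 | 매출단가 | 스티커시작번호 | 스티커최종번호 | 출고단위 | 출고단위수량 | 묶음당매수 | BOX당묶음수량 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 306 | 2020-07-27 | 5010 | 50 | 363 | 20000000000000000 | 20000000000000000 | 2 | 1 | 50 | 20 |
| 13 | 306 | 2020-07-27 | 5010 | 50 | 363 | 20000000000000000 | 20000000000000000 | 2 | 1 | 50 | 20 |
| 14 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 |
| 15 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 |
| 16 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 |
| 17 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 |
| 18 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 |
| 19 | 306 | 2020-07-27 | 5050 | 200 | 1699 | 20100000000000000 | 20100000000000000 | 3 | 1 | 10 | 20 |
| 20 | 306 | 2020-07-27 | 6125 | 10 | 7060 | 19100000000000000 | 19100000000000000 | 2 | 1 | 10 | 10 |
| 21 | 306 | 2020-07-27 | 6125 | 10 | 7060 | 19100000000000000 | 19100000000000000 | 2 | 1 | 10 | 10 |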

Duplicate rows

Most frequently occurring

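| | site코드 | 전표일자 | 상품코드 | 확정수량 | 매출단가 | 스티커시작번호 | 스티커최종번호 | 출고단위 | 출고단위수량 | 묶음당매수 | BOX당묶음수량 | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 306 | 2020-07-27 | 5050 | 10 | 1699 | 20100000000000000 | 20100000000000000 | 2 | 1 | 10 | 20 | 10 |
| 0 | 306 | 2020-07-27 | 5005 | 20 | 193 | 20000000000000000 | 20000000000000000 | 2 | 1 | 20 | 50 | 5 |
| 1 | 306 | 2020-07-27 | 5010 | 50 | 363 | 20000000000000000 | 20000000000000000 | 2 | 1 | 50 | 20 | 2 |
| 3 | 306 | 2020-07-27 | 6125 | 10 | 7060 | 19100000000000000 | 19100000000000000 | 2 | 1 | 10 | 10 | 2 |
| 4 | 306 | 2020-07-27 | 7020 | 50 | 689 | 20000000000000000 | 20000000000000000 | 2 | 1 | 50 | 20 | 2 |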
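The duplicate summary can be reproduced by counting identical rows. The sketch below rebuilds the 22 observations from the duplicate table plus the one non-duplicated row (index 19 in the sample); the two 7020 rows do not appear in the sample excerpts and are inferred from the duplicate table. It suggests that "Duplicate rows 5" counts distinct row patterns that occur more than once, not the number of repeated records.

```python
from collections import Counter

SITE, DATE, UNIT_QTY = "306", "2020-07-27", "1"

# 상품코드, 확정수량, 매출단가, sticker number (start and end are equal),
# 출고단위, 묶음당매수, BOX당묶음수량, number of rows
groups = [
    ("5050", "10", "1699", "20100000000000000", "2", "10", "20", 10),
    ("5050", "200", "1699", "20100000000000000", "3", "10", "20", 1),
    ("5005", "20", "193", "20000000000000000", "2", "20", "50", 5),
    ("5010", "50", "363", "20000000000000000", "2", "50", "20", 2),
    ("6125", "10", "7060", "19100000000000000", "2", "10", "10", 2),
    ("7020", "50", "689", "20000000000000000", "2", "50", "20", 2),
]
rows = [
    (SITE, DATE, prod, qty, price, sticker, sticker, unit, UNIT_QTY, per_bundle, per_box)
    for prod, qty, price, sticker, unit, per_bundle, per_box, count in groups
    for _ in range(count)
]

# Keep only row patterns that occur more than once.
duplicated = {row: c for row, c in Counter(rows).items() if c > 1}

print(len(rows), len(duplicated), f"{len(duplicated) / len(rows):.1%}")
for row, c in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(c, row[2], row[3])  # count, 상품코드, 확정수량
```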