Overview

Dataset statistics

Number of variables: 4
Number of observations: 33
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 37.0 B
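
The two memory figures above are consistent with each other: the average record size is the total in-memory size divided by the number of observations. A quick check, assuming the report rounds to one decimal place (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
n_observations = 33
avg_record_bytes = 37.0

# Total size implied by the average record size.
total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_bytes:.0f} B = {total_kib:.1f} KiB")  # 1221 B = 1.2 KiB
```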

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

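Description: Order records for bulk-waste stickers in Michuhol-gu, Incheon Metropolitan City. The dataset provides the order date (발주일자), vendor name (업체명), item (품목), and order quantity (발주수량).
Author: Michuhol-gu, Incheon Metropolitan City (인천광역시 미추홀구)
URL: https://www.data.go.kr/data/15090447/fileData.do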

Alerts

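발주수량 (order quantity) is highly overall correlated with 업체명 (vendor name). [High correlation]
발주일자 (order date) is highly overall correlated with 업체명 (vendor name). [High correlation]
업체명 (vendor name) is highly overall correlated with 발주수량 (order quantity) and 1 other field. [High correlation]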

Reproduction

Analysis started: 2023-12-12 08:16:22.692522
Analysis finished: 2023-12-12 08:16:23.092291
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

발주일자 (order date)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): 3.0%

Sample

1st row: 2022-01-08
2nd row: 2022-01-08
3rd row: 2022-01-12
4th row: 2022-01-12
5th row: 2022-01-12

Common Values

Value       Count  Frequency (%)
2022-01-12  5      15.2%
2022-04-08  5      15.2%
2022-01-19  4      12.1%
2022-03-16  4      12.1%
2022-03-04  3      9.1%
2022-07-13  3      9.1%
2022-01-08  2      6.1%
2022-02-22  2      6.1%
2022-03-31  2      6.1%
2022-07-25  2      6.1%
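
Each Frequency (%) value in the 발주일자 table is the count divided by the 33 observations. The sketch below recomputes the percentages from the listed counts, assuming one-decimal rounding; `frequency_pct` is a hypothetical helper, not part of ydata-profiling.

```python
# Counts copied from the Common Values table for 발주일자.
counts = {
    "2022-01-12": 5, "2022-04-08": 5, "2022-01-19": 4, "2022-03-16": 4,
    "2022-03-04": 3, "2022-07-13": 3, "2022-01-08": 2, "2022-02-22": 2,
    "2022-03-31": 2, "2022-07-25": 2,
}
n_observations = 33


def frequency_pct(count: int, total: int) -> float:
    """Share of rows holding a value, as a percentage with one decimal."""
    return round(100 * count / total, 1)


for value, count in counts.items():
    print(value, count, f"{frequency_pct(count, n_observations)}%")

# Rows not covered by the ten listed values.
remaining_rows = n_observations - sum(counts.values())
print("rows outside the table:", remaining_rows)
```

The ten listed values cover 32 of the 33 rows, which agrees with the report's figures of 11 distinct values and 1 unique value.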

Length

[Figure: histogram of lengths of the category]

업체명 (vendor name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 9
Median length: 4
Mean length: 4.6363636
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서구구립장애인재활
2nd row: 서구구립장애인재활
3rd row: 영광산업
4th row: 영광산업
5th row: 영광산업

Common Values

Value               Count  Frequency (%)
영광산업             20     60.6%
성광디자인           11     33.3%
서구구립장애인재활   2      6.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

품목 (item)
Text

Distinct: 18
Distinct (%): 54.5%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 12
Median length: 8
Mean length: 8.8181818
Min length: 6

Characters and Unicode

Total characters: 291
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
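
As an illustration of these character properties, the sketch below counts Unicode general categories for one value from this column using Python's standard `unicodedata` module. It is not necessarily how the report computes its figures; `category_counts` is a hypothetical helper. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "일반용 50L"  # one of the sample values of 품목
counts = category_counts(sample)
print(dict(counts))  # {'Lo': 3, 'Zs': 1, 'Nd': 2, 'Lu': 1}
```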

Unique

Unique: 7
Unique (%): 21.2%

Sample

1st row: 일반용 50L
2nd row: 일반용 75L
3rd row: 일반용 5L
4th row: 일반용 10L
5th row: 일반용 20L

Value                    Count  Frequency (%)
스티커 (sticker)          11     14.5%
원권 (won denomination)   11     14.5%
일반용 (general use)      8      10.5%
음식물 (food waste)       7      9.2%
10l                      7      9.2%
재사용 (reusable)         5      6.6%
20l                      4      5.3%
5l                       4      5.3%
3000                     3      3.9%
5000                     3      3.9%
Other values (10)        13     17.1%
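
The word table above appears to count whitespace-separated, lowercased tokens (note `10l` rather than `10L`). The sketch below shows that kind of tokenization on three 품목 values taken from the Sample section. The splitting rule is an assumption inferred from the table, and `word_counts` is a hypothetical helper rather than the report's own routine.

```python
from collections import Counter


def word_counts(values):
    """Lowercase each value, split on whitespace, and count the tokens."""
    tokens = []
    for value in values:
        tokens.extend(value.lower().split())
    return Counter(tokens)


# Three item values that appear in the Sample section of this report.
values = ["일반용 10L", "재사용 10L", "스티커 1000 원권"]
counts = word_counts(values)
print(counts.most_common())
```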

Most occurring characters

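Value                      Count  Frequency (%)
(space)                    62     21.3%
0                          48     16.5%
L                          22     7.6%
(character not preserved)  15     5.2%
1                          14     4.8%
(character not preserved)  11     3.8%
(character not preserved)  11     3.8%
(character not preserved)  11     3.8%
(character not preserved)  11     3.8%
(character not preserved)  11     3.8%
Other values (14)          75     25.8%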

Most occurring categories

Value             Count  Frequency (%)
Other Letter      123    42.3%
Decimal Number    84     28.9%
Space Separator   62     21.3%
Uppercase Letter  22     7.6%

Most frequent character per category

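Other Letter
Value                      Count  Frequency (%)
(character not preserved)  15     12.2%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  8      6.5%
(character not preserved)  8      6.5%
(character not preserved)  7      5.7%
(character not preserved)  7      5.7%
Other values (5)           23     18.7%

Decimal Number
Value  Count  Frequency (%)
0      48     57.1%
1      14     16.7%
5      10     11.9%
2      6      7.1%
3      4      4.8%
6      1      1.2%
7      1      1.2%

Space Separator
Value    Count  Frequency (%)
(space)  62     100.0%

Uppercase Letter
Value  Count  Frequency (%)
L      22     100.0%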

Most occurring scripts

Value   Count  Frequency (%)
Common  146    50.2%
Hangul  123    42.3%
Latin   22     7.6%

Most frequent character per script

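Hangul
Value                      Count  Frequency (%)
(character not preserved)  15     12.2%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  8      6.5%
(character not preserved)  8      6.5%
(character not preserved)  7      5.7%
(character not preserved)  7      5.7%
Other values (5)           23     18.7%

Common
Value    Count  Frequency (%)
(space)  62     42.5%
0        48     32.9%
1        14     9.6%
5        10     6.8%
2        6      4.1%
3        4      2.7%
6        1      0.7%
7        1      0.7%

Latin
Value  Count  Frequency (%)
L      22     100.0%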

Most occurring blocks

Value   Count  Frequency (%)
ASCII   168    57.7%
Hangul  123    42.3%

Most frequent character per block

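ASCII
Value    Count  Frequency (%)
(space)  62     36.9%
0        48     28.6%
L        22     13.1%
1        14     8.3%
5        10     6.0%
2        6      3.6%
3        4      2.4%
6        1      0.6%
7        1      0.6%

Hangul
Value                      Count  Frequency (%)
(character not preserved)  15     12.2%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  11     8.9%
(character not preserved)  8      6.5%
(character not preserved)  8      6.5%
(character not preserved)  7      5.7%
(character not preserved)  7      5.7%
Other values (5)           23     18.7%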

발주수량 (order quantity)
Real number (ℝ)

HIGH CORRELATION

Distinct: 21
Distinct (%): 63.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 536242.42
Minimum: 2000
Maximum: 3200000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 2000
5th percentile: 13200
Q1: 30000
Median: 250000
Q3: 900000
95th percentile: 1800000
Maximum: 3200000
Range: 3198000
Interquartile range (IQR): 870000

Descriptive statistics

Standard deviation: 745491.8
Coefficient of variation (CV): 1.3902141
Kurtosis: 3.9960219
Mean: 536242.42
Median Absolute Deviation (MAD): 221500
Skewness: 1.93905
Sum: 17696000
Variance: 5.5575802 × 10^11
Monotonicity: Not monotonic
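
Several of the statistics for 발주수량 are derived from others: the mean is the sum over the 33 observations, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are simple differences. The sketch below recomputes them from the reported figures as a consistency check. The variable names are illustrative, and small differences are expected because the report rounds its inputs.

```python
# Figures copied from the report for 발주수량.
n_observations = 33
total = 17_696_000            # Sum
std = 745_491.8               # Standard deviation
q1, q3 = 30_000, 900_000      # Q1 and Q3
minimum, maximum = 2_000, 3_200_000

mean = total / n_observations   # about 536242.42
cv = std / mean                 # about 1.3902
variance = std ** 2             # about 5.5576e11
iqr = q3 - q1                   # 870000
value_range = maximum - minimum # 3198000

print(f"mean={mean:.2f} cv={cv:.4f} variance={variance:.4e}")
print(f"iqr={iqr} range={value_range}")
```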
[Figure: histogram with fixed-size bins (bins=21)]

Common values

Value              Count  Frequency (%)
40000              4      12.1%
300000             3      9.1%
900000             3      9.1%
30000              3      9.1%
14000              2      6.1%
1800000            2      6.1%
500000             2      6.1%
1200000            1      3.0%
25500              1      3.0%
250000             1      3.0%
Other values (11)  11     33.3%

Minimum 10 values

Value   Count  Frequency (%)
2000    1      3.0%
12000   1      3.0%
14000   2      6.1%
25500   1      3.0%
28500   1      3.0%
30000   3      9.1%
40000   4      12.1%
50000   1      3.0%
150000  1      3.0%
200000  1      3.0%

Maximum 10 values

Value    Count  Frequency (%)
3200000  1      3.0%
1800000  2      6.1%
1700000  1      3.0%
1600000  1      3.0%
1200000  1      3.0%
900000   3      9.1%
500000   2      6.1%
450000   1      3.0%
350000   1      3.0%
300000   3      9.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

           발주일자  업체명  품목   발주수량
발주일자   1.000     1.000   0.486  0.534
업체명     1.000     1.000   1.000  0.653
품목       0.486     1.000   1.000  0.891
발주수량   0.534     0.653   0.891  1.000

[Figure: correlation heatmap]

           업체명  발주일자
업체명     1.000   0.856
발주일자   0.856   1.000

[Figure: correlation heatmap]

           발주수량  발주일자  업체명
발주수량   1.000     0.278     0.524
발주일자   0.278     1.000     0.856
업체명     0.524     0.856     1.000
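
The matrices above are not labelled with the correlation measure behind them in this extract. For two categorical columns such as 발주일자 and 업체명, one common measure of association is Cramér's V, and the sketch below implements the plain, uncorrected form as an illustration only. It is an assumption that this resembles what the report uses, and ydata-profiling may apply a different or bias-corrected formula. The input is the first ten rows shown in the Sample section, so the result is not expected to match the values above, which cover all 33 rows. `cramers_v` is a hypothetical helper.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_count in row_totals.items():
        for y, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# First ten rows of the Sample section: order date and vendor name.
dates = ["2022-01-08"] * 2 + ["2022-01-12"] * 5 + ["2022-01-19"] * 3
vendors = ["서구구립장애인재활"] * 2 + ["영광산업"] * 5 + ["성광디자인"] * 3

print(round(cramers_v(dates, vendors), 3))  # 1.0
```

In these ten rows each date maps to exactly one vendor, so the association is perfect and V is 1.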

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  발주일자    업체명              품목              발주수량
0      2022-01-08  서구구립장애인재활  일반용 50L        1200000
1      2022-01-08  서구구립장애인재활  일반용 75L        500000
2      2022-01-12  영광산업            일반용 5L         300000
3      2022-01-12  영광산업            일반용 10L        900000
4      2022-01-12  영광산업            일반용 20L        900000
5      2022-01-12  영광산업            재사용 10L        150000
6      2022-01-12  영광산업            재사용 20L        900000
7      2022-01-19  성광디자인          스티커 1000 원권  14000
8      2022-01-19  성광디자인          스티커 3000 원권  12000
9      2022-01-19  성광디자인          스티커 5000 원권  14000

Last rows

Index  발주일자    업체명      품목              발주수량
23     2022-04-08  영광산업    일반용 10L        1800000
24     2022-04-08  영광산업    일반용 20L        1800000
25     2022-04-08  영광산업    재사용 10L        500000
26     2022-04-08  영광산업    재사용 20L        1600000
27     2022-04-14  영광산업    재사용 10L        50000
28     2022-07-13  성광디자인  스티커 1000 원권  30000
29     2022-07-13  성광디자인  스티커 3000 원권  40000
30     2022-07-13  성광디자인  스티커 5000 원권  30000
31     2022-07-25  영광산업    음식물 5L         350000
32     2022-07-25  영광산업    음식물 10L        250000