Overview

Dataset statistics

Number of variables: 5
Number of observations: 31
Missing cells: 22
Missing cells (%): 14.2%
Duplicate rows: 1
Duplicate rows (%): 3.2%
Total size in memory: 1.4 KiB
Average record size in memory: 46.3 B
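
The two percentages above follow directly from the raw counts. The sketch below re-derives them as a sanity check; the variable names are illustrative, and treating the duplicate percentage as duplicate rows divided by observations is an assumption that happens to match the reported figure.

```python
# Counts copied from the dataset statistics above.
n_variables = 5
n_observations = 31
missing_cells = 22
duplicate_rows = 1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
# Assumption: the duplicate percentage is relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"{missing_pct:.1f}% missing cells, {duplicate_pct:.1f}% duplicate rows")
```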

Variable types

Categorical: 2
Text: 1
Numeric: 2

Dataset

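Description: Data on the receipt status of bulky-waste stickers in Michuhol-gu, Incheon Metropolitan City, providing the receipt date, item name, received quantity, and reference date fields.
URL: https://www.data.go.kr/data/15090441/fileData.do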

Alerts

Dataset has 1 (3.2%) duplicate rows [Duplicates]
발주일자 is highly overall correlated with 업체명 [High correlation]
업체명 is highly overall correlated with 발주일자 [High correlation]
발주수량 is highly overall correlated with 입고수량 [High correlation]
입고수량 is highly overall correlated with 발주수량 [High correlation]
품목 has 7 (22.6%) missing values [Missing]
발주수량 has 7 (22.6%) missing values [Missing]
입고수량 has 8 (25.8%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 02:50:50.296153
Analysis finished: 2023-12-12 02:50:51.366139
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
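
The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library, with illustrative variable names:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 02:50:50.296153")
finished = datetime.fromisoformat("2023-12-12 02:50:51.366139")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")
```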

Variables

발주일자 (order date)
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 10
Median length: 10
Mean length: 8.6451613
Min length: 4
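
The length statistics can be reproduced from the value counts listed under Common Values for this variable. The sketch below assumes that the literal string "<NA>" (4 characters) is counted as an ordinary category, which would explain why the minimum length is 4 while the variable reports 0 missing values. The dictionary name is illustrative.

```python
# Value counts copied from the Common Values table for 발주일자.
value_counts = {
    "<NA>": 7,
    "2023-01-05": 5,
    "2023-02-13": 5,
    "2023-02-23": 5,
    "2023-03-08": 4,
    "2023-02-09": 2,
    "2023-04-18": 2,
    "2023-04-20": 1,
}

n_values = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())

mean_length = total_chars / n_values
min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(n_values, min_length, max_length, round(mean_length, 7))
```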

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: 2023-01-05
2nd row: 2023-01-05
3rd row: 2023-01-05
4th row: 2023-01-05
5th row: 2023-01-05

Common Values

Value | Count | Frequency (%)
<NA> | 7 | 22.6%
2023-01-05 | 5 | 16.1%
2023-02-13 | 5 | 16.1%
2023-02-23 | 5 | 16.1%
2023-03-08 | 4 | 12.9%
2023-02-09 | 2 | 6.5%
2023-04-18 | 2 | 6.5%
2023-04-20 | 1 | 3.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

업체명 (vendor name)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 9
Median length: 4
Mean length: 4.8064516
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영광산업
2nd row: 영광산업
3rd row: 영광산업
4th row: 영광산업
5th row: 영광산업

Common Values

Value | Count | Frequency (%)
영광산업 | 12 | 38.7%
<NA> | 7 | 22.6%
에덴복지재단 | 5 | 16.1%
성광디자인 | 5 | 16.1%
서구구립장애인재활 | 2 | 6.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

품목 (item)
Text

MISSING 

Distinct: 18
Distinct (%): 75.0%
Missing: 7
Missing (%): 22.6%
Memory size: 380.0 B

Length

Max length: 12
Median length: 8
Mean length: 8.2916667
Min length: 6

Characters and Unicode

Total characters: 199
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 50.0%

Sample

1st row: 일반용 5L
2nd row: 일반용 10L
3rd row: 일반용 20L
4th row: 재사용 10L
5th row: 재사용 20L

Value | Count | Frequency (%)
일반용 | 8 | 15.1%
스티커 | 5 | 9.4%
음식물 | 5 | 9.4%
원권 | 5 | 9.4%
10l | 5 | 9.4%
재사용 | 4 | 7.5%
20l | 4 | 7.5%
5l | 3 | 5.7%
사업계용 | 2 | 3.8%
10000 | 2 | 3.8%
Other values (10) | 10 | 18.9%
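
The word counts above list lowercase tokens such as "10l" even though the data contains "10L". The sketch below shows one way to produce counts of that shape from the five sample rows. The tokenizer used by the report is not shown in this document, so splitting on whitespace and lowercasing is an assumption; the names `sample_items` and `word_counts` are illustrative.

```python
from collections import Counter

# The five sample rows listed for 품목.
sample_items = ["일반용 5L", "일반용 10L", "일반용 20L", "재사용 10L", "재사용 20L"]

# Assumed tokenization: split on whitespace, then lowercase each token.
tokens = [token.lower() for item in sample_items for token in item.split()]
word_counts = Counter(tokens)

for word, count in word_counts.most_common():
    print(word, count)
```

Because only five rows are used, these counts are smaller than the ones in the report, which cover all non-missing rows.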

Most occurring characters

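Value | Count | Frequency (%)
(space) | 42 | 21.1%
0 | 29 | 14.6%
L | 19 | 9.5%
[Hangul, not preserved] | 14 | 7.0%
1 | 9 | 4.5%
[Hangul, not preserved] | 8 | 4.0%
[Hangul, not preserved] | 8 | 4.0%
5 | 6 | 3.0%
[Hangul, not preserved] | 6 | 3.0%
[Hangul, not preserved] | 5 | 2.5%
Other values (14) | 53 | 26.6%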

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 84 | 42.2%
Decimal Number | 54 | 27.1%
Space Separator | 42 | 21.1%
Uppercase Letter | 19 | 9.5%
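
The category names in this table correspond to Unicode general categories: Other Letter is `Lo`, Decimal Number is `Nd`, Space Separator is `Zs`, and Uppercase Letter is `Lu`. The sketch below classifies the characters of the five sample rows with Python's standard `unicodedata` module. It illustrates the idea rather than the report's own implementation, and the names `sample_items` and `CATEGORY_NAMES` are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for 품목.
sample_items = ["일반용 5L", "일반용 10L", "일반용 20L", "재사용 10L", "재사용 20L"]

# Long names for the general-category codes that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
}

category_counts = Counter(
    unicodedata.category(char) for item in sample_items for char in item
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under `Lo` because they have no upper or lower case, which is why the Hangul count and the Other Letter count coincide in the tables of this section.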

Most frequent character per category

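Other Letter
Value | Count | Frequency (%)
[Hangul, not preserved] | 14 | 16.7%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 6 | 7.1%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
Other values (5) | 18 | 21.4%

Decimal Number
Value | Count | Frequency (%)
0 | 29 | 53.7%
1 | 9 | 16.7%
5 | 6 | 11.1%
2 | 5 | 9.3%
3 | 3 | 5.6%
6 | 1 | 1.9%
7 | 1 | 1.9%

Space Separator
Value | Count | Frequency (%)
(space) | 42 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 19 | 100.0%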

Most occurring scripts

Value | Count | Frequency (%)
Common | 96 | 48.2%
Hangul | 84 | 42.2%
Latin | 19 | 9.5%

Most frequent character per script

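Hangul
Value | Count | Frequency (%)
[Hangul, not preserved] | 14 | 16.7%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 6 | 7.1%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
Other values (5) | 18 | 21.4%

Common
Value | Count | Frequency (%)
(space) | 42 | 43.8%
0 | 29 | 30.2%
1 | 9 | 9.4%
5 | 6 | 6.2%
2 | 5 | 5.2%
3 | 3 | 3.1%
6 | 1 | 1.0%
7 | 1 | 1.0%

Latin
Value | Count | Frequency (%)
L | 19 | 100.0%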

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 115 | 57.8%
Hangul | 84 | 42.2%

Most frequent character per block

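ASCII
Value | Count | Frequency (%)
(space) | 42 | 36.5%
0 | 29 | 25.2%
L | 19 | 16.5%
1 | 9 | 7.8%
5 | 6 | 5.2%
2 | 5 | 4.3%
3 | 3 | 2.6%
6 | 1 | 0.9%
7 | 1 | 0.9%

Hangul
Value | Count | Frequency (%)
[Hangul, not preserved] | 14 | 16.7%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 8 | 9.5%
[Hangul, not preserved] | 6 | 7.1%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
[Hangul, not preserved] | 5 | 6.0%
Other values (5) | 18 | 21.4%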

발주수량 (order quantity)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 91.7%
Missing: 7
Missing (%): 22.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 677370.83
Minimum: 1000
Maximum: 3400000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 1000
5-th percentile: 16400
Q1: 63000
Median: 350000
Q3: 875000
95-th percentile: 2040000
Maximum: 3400000
Range: 3399000
Interquartile range (IQR): 812000

Descriptive statistics

Standard deviation: 845315.3
Coefficient of variation (CV): 1.2479358
Kurtosis: 3.5340875
Mean: 677370.83
Median Absolute Deviation (MAD): 295500
Skewness: 1.8406026
Sum: 16256900
Variance: 7.1455796 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=22)]
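
Several of the statistics above can be re-derived from the others, which makes for a quick consistency check. The sketch below uses only figures printed for 발주수량. The variable names are illustrative, and the non-missing count is assumed to be the 31 observations minus the 7 missing values.

```python
# Figures copied from the 발주수량 section.
n_observations = 31
n_missing = 7
total = 16_256_900
std = 845_315.3
q1, q3 = 63_000, 875_000
minimum, maximum = 1_000, 3_400_000

# Assumption: statistics are computed over the non-missing values only.
n_valid = n_observations - n_missing

mean = total / n_valid          # reported as 677370.83
cv = std / mean                 # reported as 1.2479358
iqr = q3 - q1                   # reported as 812000
value_range = maximum - minimum # reported as 3399000
variance = std ** 2             # reported as 7.1455796 × 10^11

print(f"mean={mean:.2f} cv={cv:.7f} iqr={iqr} range={value_range} variance={variance:.4e}")
```

Small differences in the last digits are expected because the standard deviation printed in the report is itself rounded.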
Most frequent values
Value | Count | Frequency (%)
1600000 | 2 | 6.5%
300000 | 2 | 6.5%
174000 | 1 | 3.2%
429000 | 1 | 3.2%
1000 | 1 | 3.2%
219900 | 1 | 3.2%
30000 | 1 | 3.2%
14000 | 1 | 3.2%
50000 | 1 | 3.2%
60000 | 1 | 3.2%
Other values (12) | 12 | 38.7%
(Missing) | 7 | 22.6%

Smallest 10 values
Value | Count | Frequency (%)
1000 | 1 | 3.2%
14000 | 1 | 3.2%
30000 | 1 | 3.2%
50000 | 1 | 3.2%
59000 | 1 | 3.2%
60000 | 1 | 3.2%
64000 | 1 | 3.2%
156000 | 1 | 3.2%
174000 | 1 | 3.2%
219900 | 1 | 3.2%

Largest 10 values
Value | Count | Frequency (%)
3400000 | 1 | 3.2%
2100000 | 1 | 3.2%
1700000 | 1 | 3.2%
1600000 | 2 | 6.5%
1100000 | 1 | 3.2%
800000 | 1 | 3.2%
650000 | 1 | 3.2%
550000 | 1 | 3.2%
500000 | 1 | 3.2%
429000 | 1 | 3.2%

입고수량 (received quantity)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 95.7%
Missing: 8
Missing (%): 25.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 340969.57
Minimum: 1000
Maximum: 1332000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 411.0 B

Quantile statistics

Minimum: 1000
5-th percentile: 14000
Q1: 59500
Median: 167000
Q3: 442000
95-th percentile: 1231900
Maximum: 1332000
Range: 1331000
Interquartile range (IQR): 382500

Descriptive statistics

Standard deviation: 418552.71
Coefficient of variation (CV): 1.2275369
Kurtosis: 0.91532006
Mean: 340969.57
Median Absolute Deviation (MAD): 133600
Skewness: 1.4684036
Sum: 7842300
Variance: 1.7518637 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=22)]
Most frequent values
Value | Count | Frequency (%)
14000 | 2 | 6.5%
205000 | 1 | 3.2%
1000 | 1 | 3.2%
20700 | 1 | 3.2%
58000 | 1 | 3.2%
60000 | 1 | 3.2%
64000 | 1 | 3.2%
145000 | 1 | 3.2%
75000 | 1 | 3.2%
495000 | 1 | 3.2%
Other values (12) | 12 | 38.7%
(Missing) | 8 | 25.8%

Smallest 10 values
Value | Count | Frequency (%)
1000 | 1 | 3.2%
14000 | 2 | 6.5%
20700 | 1 | 3.2%
58000 | 1 | 3.2%
59000 | 1 | 3.2%
60000 | 1 | 3.2%
64000 | 1 | 3.2%
75000 | 1 | 3.2%
76000 | 1 | 3.2%
145000 | 1 | 3.2%

Largest 10 values
Value | Count | Frequency (%)
1332000 | 1 | 3.2%
1245000 | 1 | 3.2%
1114000 | 1 | 3.2%
984000 | 1 | 3.2%
500000 | 1 | 3.2%
495000 | 1 | 3.2%
389000 | 1 | 3.2%
300600 | 1 | 3.2%
300000 | 1 | 3.2%
224000 | 1 | 3.2%

Interactions

[Figure: interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 발주일자 | 업체명 | 품목 | 발주수량 | 입고수량
발주일자 | 1.000 | 1.000 | 0.527 | 0.543 | 0.000
업체명 | 1.000 | 1.000 | 0.672 | 0.623 | 0.384
품목 | 0.527 | 0.672 | 1.000 | 0.000 | 0.000
발주수량 | 0.543 | 0.623 | 0.000 | 1.000 | 0.908
입고수량 | 0.000 | 0.384 | 0.000 | 0.908 | 1.000

[Figure: correlation heatmap]
Variable | 발주일자 | 업체명
발주일자 | 1.000 | 0.922
업체명 | 0.922 | 1.000

[Figure: correlation heatmap]
Variable | 발주수량 | 입고수량 | 발주일자 | 업체명
발주수량 | 1.000 | 0.900 | 0.230 | 0.457
입고수량 | 0.900 | 1.000 | 0.000 | 0.273
발주일자 | 0.230 | 0.000 | 1.000 | 0.922
업체명 | 0.457 | 0.273 | 0.922 | 1.000

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

[Figure: nullity correlation heatmap]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
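
To make nullity correlation concrete, the sketch below computes it for 발주수량 and 입고수량 over rows 21 to 30 of the sample table only. It assumes the measure is a Pearson correlation between 0/1 missing indicators; the report's heatmap covers all 31 rows, so its values will differ. The list and function names are illustrative.

```python
import math

# 1 = value missing, 0 = value present, for sample rows 21-30.
# Row 21 has an order quantity but no received quantity; rows 24-30 are fully missing.
order_qty_null = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
received_qty_null = [1, 0, 0, 1, 1, 1, 1, 1, 1, 1]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


nullity_corr = pearson(order_qty_null, received_qty_null)
print(round(nullity_corr, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, which is what the fully empty trailing rows produce here.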

Sample

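Row | 발주일자 | 업체명 | 품목 | 발주수량 | 입고수량
0 | 2023-01-05 | 영광산업 | 일반용 5L | 300000 | 167000
1 | 2023-01-05 | 영광산업 | 일반용 10L | 59000 | 59000
2 | 2023-01-05 | 영광산업 | 일반용 20L | 156000 | 76000
3 | 2023-01-05 | 영광산업 | 재사용 10L | 174000 | 14000
4 | 2023-01-05 | 영광산업 | 재사용 20L | 429000 | 389000
5 | 2023-02-09 | 서구구립장애인재활 | 일반용 50L | 800000 | 300600
6 | 2023-02-09 | 서구구립장애인재활 | 일반용 75L | 400000 | 224000
7 | 2023-02-13 | 에덴복지재단 | 일반용 5L | 300000 | 300000
8 | 2023-02-13 | 에덴복지재단 | 일반용 10L | 1600000 | 1114000
9 | 2023-02-13 | 에덴복지재단 | 일반용 20L | 1600000 | 984000

Row | 발주일자 | 업체명 | 품목 | 발주수량 | 입고수량
21 | 2023-04-18 | 영광산업 | 사업계용 30L | 30000 | <NA>
22 | 2023-04-18 | 영광산업 | 사업계용 60L | 219900 | 20700
23 | 2023-04-20 | 성광디자인 | 스티커 10000 원권 | 1000 | 1000
24 | <NA> | <NA> | <NA> | <NA> | <NA>
25 | <NA> | <NA> | <NA> | <NA> | <NA>
26 | <NA> | <NA> | <NA> | <NA> | <NA>
27 | <NA> | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA>
29 | <NA> | <NA> | <NA> | <NA> | <NA>
30 | <NA> | <NA> | <NA> | <NA> | <NA>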

Duplicate rows

Most frequently occurring

Row | 발주일자 | 업체명 | 품목 | 발주수량 | 입고수량 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 7
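
The duplicate table lists one distinct row pattern that occurs 7 times. The sketch below shows one way to find such patterns by counting identical rows, using rows 21 to 30 from the sample table. It uses `None` to stand in for `<NA>`, and the names `rows` and `duplicates` are illustrative; it is not the report's own implementation.

```python
from collections import Counter

NA = None  # stand-in for <NA> in the sample table

# Rows 21-30 of the sample: (발주일자, 업체명, 품목, 발주수량, 입고수량).
rows = [
    ("2023-04-18", "영광산업", "사업계용 30L", 30000, NA),
    ("2023-04-18", "영광산업", "사업계용 60L", 219900, 20700),
    ("2023-04-20", "성광디자인", "스티커 10000 원권", 1000, 1000),
] + [(NA, NA, NA, NA, NA)] * 7

# Count identical rows and keep only patterns that occur more than once.
row_counts = Counter(rows)
duplicates = {row: count for row, count in row_counts.items() if count > 1}

for row, count in duplicates.items():
    print(row, count)
```

Under this reading, the "1 duplicate row" in the overview is the single distinct pattern, and the 7 is the number of times it occurs.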