Overview

Dataset statistics

Number of variables: 7
Number of observations: 127
Missing cells: 2
Missing cells (%): 0.2%
Duplicate rows: 8
Duplicate rows (%): 6.3%
Total size in memory: 7.4 KiB
Average record size in memory: 60.0 B
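
The two percentages above can be re-derived from the raw counts. The sketch below assumes that the missing-cell share is taken over all cells (rows times columns) and the duplicate share over all rows; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 7
n_observations = 127
missing_cells = 2
duplicate_rows = 8

total_cells = n_variables * n_observations            # 889 cells in total
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations # share of duplicated rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions the rounded results agree with the 0.2% and 6.3% reported in the table.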

Variable types

Categorical: 3
Numeric: 2
Text: 2

Dataset

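Description: 한국서부발전(주)_폐기물 처리내역 및 현황 조회 서비스 (Korea Western Power Co., Ltd.: waste treatment records and status inquiry service)
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/bigdata/collect/view.chungnam?menuCd=DOM_000000201001001000&apiIdx=2843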

Alerts

Dataset has 8 (6.3%) duplicate rows [Duplicates]
처리량(재활용) is highly correlated overall with 처리량(자가처리) [High correlation]
처리량(자가처리) is highly correlated overall with 처리량(재활용) [High correlation]
처리량(자가처리) is highly imbalanced (90.1%) [Imbalance]
폐기물명 has 2 (1.6%) missing values [Missing]
처리량(위탁처리) has 53 (41.7%) zeros [Zeros]
처리량(재활용) has 91 (71.7%) zeros [Zeros]
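
The 90.1% imbalance figure can be reproduced from the value counts of 처리량(자가처리) listed in its variable section (0.0 occurs 124 times, three other values once each). The sketch below uses an entropy-based score, 1 minus the normalised Shannon entropy; that this is exactly the formula used by the profiling tool is an assumption, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts of 처리량(자가처리): 0.0 x 124 and three values seen once.
counts = [124, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score near 1 means almost all rows share one value; a score of 0 means the classes are evenly spread.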

Reproduction

Analysis started: 2024-01-09 22:15:24.004258
Analysis finished: 2024-01-09 22:15:24.698585
Duration: 0.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
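
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, assuming the dataset is available as a local CSV file; `build_report`, the file names, and the title are placeholders that do not come from the report, and the exact configuration used for the original run (config.json) is not reproduced here.

```python
def build_report(csv_path, html_path="report.html", title="Waste treatment profile"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # the encoding argument may need adjusting for Korean text
    report = ProfileReport(df, title=title)
    report.to_file(html_path)
    return html_path

# Hypothetical usage; "waste.csv" is a placeholder file name.
# build_report("waste.csv")
```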

Variables

발전소명 (power plant name)
Categorical

Distinct: 3
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.1889764
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 태안
2nd row: 태안
3rd row: 태안
4th row: 태안
5th row: 태안

Common Values

Value | Count | Frequency (%)
평택 | 60 | 47.2%
태안 | 55 | 43.3%
평택건설 | 12 | 9.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

처리량(자가처리) (amount treated, in-house)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.0944882
Min length: 3

Unique

Unique: 3
Unique (%): 2.4%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 124 | 97.6%
42555.2 | 1 | 0.8%
5347.01 | 1 | 0.8%
4704.73 | 1 | 0.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

처리량(위탁처리) (amount treated, outsourced)
Real number (ℝ)

ZEROS

Distinct: 75
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.51213
Minimum: 0
Maximum: 4830.76
Zeros: 53
Zeros (%): 41.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 5.08
Q3: 92.37
95-th percentile: 551.314
Maximum: 4830.76
Range: 4830.76
Interquartile range (IQR): 92.37

Descriptive statistics

Standard deviation: 544.69099
Coefficient of variation (CV): 3.5481952
Kurtosis: 47.578544
Mean: 153.51213
Median Absolute Deviation (MAD): 5.08
Skewness: 6.382232
Sum: 19496.04
Variance: 296688.28
Monotonicity: Not monotonic
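
Several of the statistics for 처리량(위탁처리) are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1). The sketch below checks the reported figures against those identities; the tolerance is an arbitrary choice to absorb display rounding.

```python
import math

# Values copied from the 처리량(위탁처리) tables.
n = 127
total = 19496.04
mean = 153.51213
std = 544.69099
variance = 296688.28
cv = 3.5481952
q1, q3, iqr = 0.0, 92.37, 92.37

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std^2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr),
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'mismatch'}")
```

The same checks can be applied to 처리량(재활용) by swapping in its figures.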

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.0 | 53 | 41.7%
19.94 | 1 | 0.8%
37.95 | 1 | 0.8%
0.97 | 1 | 0.8%
5.08 | 1 | 0.8%
42.54 | 1 | 0.8%
144.5 | 1 | 0.8%
11.4 | 1 | 0.8%
2.59 | 1 | 0.8%
104.42 | 1 | 0.8%
Other values (65) | 65 | 51.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 53 | 41.7%
0.25 | 1 | 0.8%
0.37 | 1 | 0.8%
0.39 | 1 | 0.8%
0.44 | 1 | 0.8%
0.97 | 1 | 0.8%
1.51 | 1 | 0.8%
1.82 | 1 | 0.8%
2.59 | 1 | 0.8%
4.37 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
4830.76 | 1 | 0.8%
2766.83 | 1 | 0.8%
1629.74 | 1 | 0.8%
1438.56 | 1 | 0.8%
1367.25 | 1 | 0.8%
1057.87 | 1 | 0.8%
562.39 | 1 | 0.8%
525.47 | 1 | 0.8%
468.32 | 1 | 0.8%
372.32 | 1 | 0.8%

처리량(재활용) (amount treated, recycled)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 37
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10678.85
Minimum: 0
Maximum: 1344863.9
Zeros: 91
Zeros (%): 71.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 11.71
95-th percentile: 229.661
Maximum: 1344863.9
Range: 1344863.9
Interquartile range (IQR): 11.71

Descriptive statistics

Standard deviation: 119330.28
Coefficient of variation (CV): 11.17445
Kurtosis: 126.9961
Mean: 10678.85
Median Absolute Deviation (MAD): 0
Skewness: 11.26917
Sum: 1356214
Variance: 1.4239715 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=37)]

Common values

Value | Count | Frequency (%)
0.0 | 91 | 71.7%
55.2 | 1 | 0.8%
9.01 | 1 | 0.8%
13.48 | 1 | 0.8%
3630.24 | 1 | 0.8%
59.46 | 1 | 0.8%
48.55 | 1 | 0.8%
81.43 | 1 | 0.8%
1561.82 | 1 | 0.8%
16.94 | 1 | 0.8%
Other values (27) | 27 | 21.3%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 91 | 71.7%
1.3 | 1 | 0.8%
6.62 | 1 | 0.8%
9.01 | 1 | 0.8%
9.94 | 1 | 0.8%
13.48 | 1 | 0.8%
13.75 | 1 | 0.8%
16.07 | 1 | 0.8%
16.94 | 1 | 0.8%
18.0 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
1344863.92 | 1 | 0.8%
3630.24 | 1 | 0.8%
3432.85 | 1 | 0.8%
1561.82 | 1 | 0.8%
464.06 | 1 | 0.8%
263.36 | 1 | 0.8%
245.03 | 1 | 0.8%
193.8 | 1 | 0.8%
164.85 | 1 | 0.8%
146.19 | 1 | 0.8%

발생량(톤) (amount generated, tons)
Text

Distinct: 104
Distinct (%): 81.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 10
Median length: 7
Mean length: 4.3858268
Min length: 1

Characters and Unicode

Total characters: 557
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 80.3%

Sample

1st row: 55.2
2nd row: 16.07
3rd row: 8.9
4th row: 138.36
5th row: 208.35

Value | Count | Frequency (%)
0 | 23 | 18.1%
지정폐기물 | 2 | 1.6%
55.2 | 1 | 0.8%
1.3 | 1 | 0.8%
42.54 | 1 | 0.8%
144.5 | 1 | 0.8%
11.4 | 1 | 0.8%
9.01 | 1 | 0.8%
2.59 | 1 | 0.8%
19.94 | 1 | 0.8%
Other values (94) | 94 | 74.0%
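
The value table lists 지정폐기물 twice among otherwise numeric entries, which is presumably why this column was profiled as Text rather than as a number. A minimal sketch for locating such entries before converting the column, using only the standard library; `non_numeric` and the short `sample` list are illustrative, not part of the report.

```python
def non_numeric(values):
    """Return the entries that cannot be parsed as a float."""
    bad = []
    for v in values:
        try:
            float(v)
        except ValueError:
            bad.append(v)
    return bad

# A few entries from the value table, kept as strings.
sample = ["0", "55.2", "지정폐기물", "1.3", "42.54", "지정폐기물"]
print(non_numeric(sample))
```

Rows flagged this way are candidates for shifted columns in the source file and are worth inspecting by hand.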

Most occurring characters

Value | Count | Frequency (%)
. | 100 | 18.0%
4 | 59 | 10.6%
1 | 57 | 10.2%
0 | 53 | 9.5%
5 | 45 | 8.1%
3 | 43 | 7.7%
8 | 42 | 7.5%
2 | 42 | 7.5%
6 | 38 | 6.8%
7 | 34 | 6.1%
Other values (6) | 44 | 7.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 447 | 80.3%
Other Punctuation | 100 | 18.0%
Other Letter | 10 | 1.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 59 | 13.2%
1 | 57 | 12.8%
0 | 53 | 11.9%
5 | 45 | 10.1%
3 | 43 | 9.6%
8 | 42 | 9.4%
2 | 42 | 9.4%
6 | 38 | 8.5%
7 | 34 | 7.6%
9 | 34 | 7.6%

Other Letter
Value | Count | Frequency (%)
지 | 2 | 20.0%
정 | 2 | 20.0%
폐 | 2 | 20.0%
기 | 2 | 20.0%
물 | 2 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
. | 100 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 547 | 98.2%
Hangul | 10 | 1.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 100 | 18.3%
4 | 59 | 10.8%
1 | 57 | 10.4%
0 | 53 | 9.7%
5 | 45 | 8.2%
3 | 43 | 7.9%
8 | 42 | 7.7%
2 | 42 | 7.7%
6 | 38 | 6.9%
7 | 34 | 6.2%

Hangul
Value | Count | Frequency (%)
지 | 2 | 20.0%
정 | 2 | 20.0%
폐 | 2 | 20.0%
기 | 2 | 20.0%
물 | 2 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 547 | 98.2%
Hangul | 10 | 1.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 100 | 18.3%
4 | 59 | 10.8%
1 | 57 | 10.4%
0 | 53 | 9.7%
5 | 45 | 8.2%
3 | 43 | 7.9%
8 | 42 | 7.7%
2 | 42 | 7.7%
6 | 38 | 6.9%
7 | 34 | 6.2%

Hangul
Value | Count | Frequency (%)
지 | 2 | 20.0%
정 | 2 | 20.0%
폐 | 2 | 20.0%
기 | 2 | 20.0%
물 | 2 | 20.0%

폐기물구분 (waste category)
Categorical

Distinct: 6
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 23
Median length: 5
Mean length: 5.1732283
Min length: 3

Unique

Unique: 2
Unique (%): 1.6%

Sample

1st row: 음식물
2nd row: 지정폐기물
3rd row: 지정폐기물
4th row: 지정폐기물
5th row: 지정폐기물

Common Values

Value | Count | Frequency (%)
일반폐기물 | 56 | 44.1%
지정폐기물 | 54 | 42.5%
건설폐기물 | 13 | 10.2%
음식물 | 2 | 1.6%
그밖의 폐유기용제(액상) | 1 | 0.8%
폐유(그밖에 달리 분류되지 아니하는 폐유) | 1 | 0.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
일반폐기물 | 56 | 42.4%
지정폐기물 | 54 | 40.9%
건설폐기물 | 13 | 9.8%
음식물 | 2 | 1.5%
그밖의 | 1 | 0.8%
폐유기용제(액상 | 1 | 0.8%
폐유(그밖에 | 1 | 0.8%
달리 | 1 | 0.8%
분류되지 | 1 | 0.8%
아니하는 | 1 | 0.8%

폐기물명 (waste name)
Text

MISSING

Distinct: 56
Distinct (%): 44.8%
Missing: 2
Missing (%): 1.6%
Memory size: 1.1 KiB

Length

Max length: 23
Median length: 16
Mean length: 11.264
Min length: 2

Characters and Unicode

Total characters: 1408
Distinct characters: 112
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 10.4%

Sample

1st row: 음식물
2nd row: 황산(폐축전지
3rd row: 폐페인트 및 폐락카(고상)
4th row: 폐촉매(고상)
5th row: 폐유(폐광물유(액상)

Value | Count | Frequency (%)
폐페인트 | 8 | 4.4%
(unreadable) | 8 | 4.4%
그밖의 | 7 | 3.9%
무기성오니류(폐수처리오니 | 6 | 3.3%
폐목재류 | 6 | 3.3%
동식물성 | 6 | 3.3%
폐합성고분자화합물(폐합성수지류 | 6 | 3.3%
폐광물유(고상 | 5 | 2.8%
폐유(그 | 5 | 2.8%
밖의 | 5 | 2.8%
Other values (57) | 119 | 65.7%

Most occurring characters

Value | Count | Frequency (%)
(unreadable) | 156 | 11.1%
( | 121 | 8.6%
) | 107 | 7.6%
(unreadable) | 66 | 4.7%
(space) | 56 | 4.0%
(unreadable) | 51 | 3.6%
(unreadable) | 50 | 3.6%
(unreadable) | 41 | 2.9%
(unreadable) | 36 | 2.6%
(unreadable) | 34 | 2.4%
Other values (102) | 690 | 49.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1102 | 78.3%
Open Punctuation | 121 | 8.6%
Close Punctuation | 107 | 7.6%
Space Separator | 56 | 4.0%
Uppercase Letter | 16 | 1.1%
Other Punctuation | 6 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unreadable) | 156 | 14.2%
(unreadable) | 66 | 6.0%
(unreadable) | 51 | 4.6%
(unreadable) | 50 | 4.5%
(unreadable) | 41 | 3.7%
(unreadable) | 36 | 3.3%
(unreadable) | 34 | 3.1%
(unreadable) | 32 | 2.9%
(unreadable) | 31 | 2.8%
(unreadable) | 22 | 2.0%
Other values (94) | 583 | 52.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 4 | 25.0%
C | 4 | 25.0%
P | 4 | 25.0%
B | 4 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 121 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 107 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 56 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1102 | 78.3%
Common | 290 | 20.6%
Latin | 16 | 1.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unreadable) | 156 | 14.2%
(unreadable) | 66 | 6.0%
(unreadable) | 51 | 4.6%
(unreadable) | 50 | 4.5%
(unreadable) | 41 | 3.7%
(unreadable) | 36 | 3.3%
(unreadable) | 34 | 3.1%
(unreadable) | 32 | 2.9%
(unreadable) | 31 | 2.8%
(unreadable) | 22 | 2.0%
Other values (94) | 583 | 52.9%

Common
Value | Count | Frequency (%)
( | 121 | 41.7%
) | 107 | 36.9%
(space) | 56 | 19.3%
· | 6 | 2.1%

Latin
Value | Count | Frequency (%)
S | 4 | 25.0%
C | 4 | 25.0%
P | 4 | 25.0%
B | 4 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1102 | 78.3%
ASCII | 300 | 21.3%
None | 6 | 0.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(unreadable) | 156 | 14.2%
(unreadable) | 66 | 6.0%
(unreadable) | 51 | 4.6%
(unreadable) | 50 | 4.5%
(unreadable) | 41 | 3.7%
(unreadable) | 36 | 3.3%
(unreadable) | 34 | 3.1%
(unreadable) | 32 | 2.9%
(unreadable) | 31 | 2.8%
(unreadable) | 22 | 2.0%
Other values (94) | 583 | 52.9%

ASCII
Value | Count | Frequency (%)
( | 121 | 40.3%
) | 107 | 35.7%
(space) | 56 | 18.7%
S | 4 | 1.3%
C | 4 | 1.3%
P | 4 | 1.3%
B | 4 | 1.3%

None
Value | Count | Frequency (%)
· | 6 | 100.0%

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 발전소명 | 처리량(자가처리) | 처리량(위탁처리) | 처리량(재활용) | 폐기물구분 | 폐기물명
발전소명 | 1.000 | 0.000 | 0.671 | 0.000 | 0.816 | 0.939
처리량(자가처리) | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.888
처리량(위탁처리) | 0.671 | 0.000 | 1.000 | 0.000 | 0.420 | 0.244
처리량(재활용) | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000
폐기물구분 | 0.816 | 0.000 | 0.420 | 0.000 | 1.000 | 0.996
폐기물명 | 0.939 | 0.888 | 0.244 | 1.000 | 0.996 | 1.000

[Figure: correlation heatmap]

Variable | 폐기물구분 | 발전소명 | 처리량(자가처리)
폐기물구분 | 1.000 | 0.494 | 0.000
발전소명 | 0.494 | 1.000 | 0.000
처리량(자가처리) | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 처리량(위탁처리) | 처리량(재활용) | 발전소명 | 처리량(자가처리) | 폐기물구분
처리량(위탁처리) | 1.000 | -0.388 | 0.353 | 0.000 | 0.162
처리량(재활용) | -0.388 | 1.000 | 0.000 | 0.992 | 0.000
발전소명 | 0.353 | 0.000 | 1.000 | 0.000 | 0.494
처리량(자가처리) | 0.000 | 0.992 | 0.000 | 1.000 | 0.000
폐기물구분 | 0.162 | 0.000 | 0.494 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.

Sample

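First rows

Index | 발전소명 | 처리량(자가처리) | 처리량(위탁처리) | 처리량(재활용) | 발생량(톤) | 폐기물구분 | 폐기물명
0 | 태안 | 0.0 | 0.0 | 55.2 | 55.2 | 음식물 | 음식물
1 | 태안 | 0.0 | 0.0 | 16.07 | 16.07 | 지정폐기물 | 황산(폐축전지
2 | 태안 | 0.0 | 8.9 | 0.0 | 8.9 | 지정폐기물 | 폐페인트 및 폐락카(고상)
3 | 태안 | 0.0 | 1.51 | 136.85 | 138.36 | 지정폐기물 | 폐촉매(고상)
4 | 태안 | 0.0 | 43.5 | 164.85 | 208.35 | 지정폐기물 | 폐유(폐광물유(액상)
5 | 태안 | 0.0 | 8.84 | 0.0 | 8.84 | 지정폐기물 | 폐유(그 밖의 폐광물유(고상)
6 | 태안 | 0.0 | 6.42 | 0.0 | 6.42 | 지정폐기물 | 폐석면(해체·제거시)
7 | 태안 | 0.0 | 0.0 | 0.0 | 지정폐기물 | 그밖의 폐유기용제(액상) | <NA>
8 | 태안 | 0.0 | 23.63 | 0.0 | 23.63 | 일반폐기물 | 폐흡착제
9 | 태안 | 0.0 | 300.48 | 0.0 | 300.48 | 일반폐기물 | 폐합성고분자화합물(폐합성수지류)

Last rows

Index | 발전소명 | 처리량(자가처리) | 처리량(위탁처리) | 처리량(재활용) | 발생량(톤) | 폐기물구분 | 폐기물명
117 | 평택건설 | 0.0 | 4830.76 | 0.0 | 4830.76 | 건설폐기물 | 폐콘크리트(불연성)
118 | 평택건설 | 0.0 | 8.56 | 0.0 | 8.56 | 지정폐기물 | 폐페인트 및 폐락카(고상)
119 | 평택건설 | 0.0 | 19.4 | 0.0 | 19.4 | 지정폐기물 | 폐유(그밖에 달리 분류되지 아니하는 폐유
120 | 평택건설 | 0.0 | 0.39 | 0.0 | 0.39 | 지정폐기물 | 폐석면(석면의 제거작업에 사용된 비닐시트
121 | 평택건설 | 0.0 | 4.46 | 0.0 | 4.46 | 건설폐기물 | 폐섬유(가연성)
122 | 평택건설 | 0.0 | 179.1 | 0.0 | 179.1 | 건설폐기물 | 폐합성수지(가연성)
123 | 평택건설 | 0.0 | 281.67 | 0.0 | 281.67 | 건설폐기물 | 혼합건설폐기물(가연성
124 | 평택건설 | 0.0 | 562.39 | 0.0 | 562.39 | 건설폐기물 | 건설오니(불연성
125 | 평택건설 | 0.0 | 2766.83 | 0.0 | 2766.83 | 건설폐기물 | 폐콘크리트(불연성)
126 | 평택건설 | 0.0 | 525.47 | 0.0 | 525.47 | 건설폐기물 | 폐목재류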

Duplicate rows

Most frequently occurring

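Index | 발전소명 | 처리량(자가처리) | 처리량(위탁처리) | 처리량(재활용) | 발생량(톤) | 폐기물구분 | 폐기물명 | # duplicates
0 | 태안 | 0.0 | 0.0 | 0.0 | 0 | 일반폐기물 | 폐석고 | 2
1 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 일반폐기물 | 폐콘크리트(일반) | 2
2 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | PCBS 함유폐기물(액상) | 2
3 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | PCBS 함유폐기물(액상이 아닌것(고상)) | 2
4 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | 폐유(폐오일필터) | 2
5 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | 폐유독물(고상) | 2
6 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | 폐페인트 및 폐락카(액상) | 2
7 | 평택 | 0.0 | 0.0 | 0.0 | 0 | 지정폐기물 | 황산(폐축전지(액상)) | 2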