Dataset statistics
Number of variables | 10 |
---|---|
Number of observations | 10000 |
Missing cells | 11528 |
Missing cells (%) | 11.5% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 908.2 KiB |
Average record size in memory | 93.0 B |
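The missing-cell totals above can be cross-checked against the per-variable missing counts listed in the alerts of this report. The following is a minimal sketch; the variable names in the code are illustrative, and it assumes the percentage is taken over all cells (variables × observations).

```python
# Per-variable missing counts as listed in the "Missing" alerts of this report.
missing_by_variable = {
    "폐기물정보중분류": 311,
    "폐기물정보소분류": 2938,
    "연소여부": 6617,
    "의료폐기물여부": 1662,
}
n_variables = 10
n_observations = 10_000

# Total missing cells, and their share of all cells in the table.
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)

print(missing_cells, f"{missing_pct:.1f}%")
```

The four counts add up to the reported 11528 missing cells, which is 11.5% of the 100,000 cells.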
Variable types
Numeric | 4 |
---|---|
Text | 1 |
Categorical | 2 |
Boolean | 3 |
Dataset
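Description | Data registered in the Waste Disposal Charge Management System (폐기물처분부담금관리시스템). It covers the waste classification codes registered for waste disposal charge declarations and the SMS transmission history for those declarations. |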
---|---|
Author | Korea Environment Corporation (한국환경공단) |
URL | https://www.data.go.kr/data/15092767/fileData.do |
실적년도 has constant value "2022" | Constant |
폐기물정보대분류 is highly overall correlated with 폐기물정보중분류 and 2 other fields | High correlation |
폐기물정보중분류 is highly overall correlated with 폐기물정보대분류 | High correlation |
폐기물정보소분류 is highly overall correlated with 폐기물분류코드 | High correlation |
폐기물분류코드 is highly overall correlated with 폐기물정보대분류 and 1 other field | High correlation |
연소여부 is highly overall correlated with 의료폐기물여부 | High correlation |
지정폐기물여부 is highly overall correlated with 폐기물정보대분류 | High correlation |
의료폐기물여부 is highly overall correlated with 연소여부 | High correlation |
폐기물분류코드 is highly imbalanced (50.1%) | Imbalance |
의료폐기물여부 is highly imbalanced (75.8%) | Imbalance |
폐기물정보중분류 has 311 (3.1%) missing values | Missing |
폐기물정보소분류 has 2938 (29.4%) missing values | Missing |
연소여부 has 6617 (66.2%) missing values | Missing |
의료폐기물여부 has 1662 (16.6%) missing values | Missing |
폐기물정보소분류 has 1411 (14.1%) zeros | Zeros |
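The two Imbalance percentages above are consistent with a normalized-entropy score, 1 − H / log2(k), where H is the Shannon entropy of the class shares and k is the number of classes. The sketch below is an illustration under that assumption, not a statement of how the profiling tool is implemented; `imbalance_score` is a hypothetical helper, and missing values are assumed to be excluded before counting.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (hypothetical helper).

    H is the Shannon entropy in bits of the class shares and k is the
    number of classes. 0 means perfectly balanced, 1 means a single class.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts taken from the Common Values tables of this report.
code_score = imbalance_score([8312, 1277, 411])   # 폐기물분류코드
medical_score = imbalance_score([8005, 333])      # 의료폐기물여부, missing excluded

print(f"{code_score:.1%}", f"{medical_score:.1%}")
```

With the counts reported for 폐기물분류코드 and 의료폐기물여부, this gives roughly 50.1% and 75.8%, matching the alerts.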
Reproduction
Analysis started | 2023-12-12 06:41:17.865107 |
---|---|
Analysis finished | 2023-12-12 06:41:20.788239 |
Duration | 2.92 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
시퀀스
Real number (ℝ)
Distinct | 273 |
---|---|
Distinct (%) | 2.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1458439.8 |
Minimum | 1367640 |
---|---|
Maximum | 1576405 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1367640 |
---|---|
5-th percentile | 1379902 |
Q1 | 1418528 |
median | 1460016 |
Q3 | 1479476 |
95-th percentile | 1558089 |
Maximum | 1576405 |
Range | 208765 |
Interquartile range (IQR) | 60948 |
Descriptive statistics
Standard deviation | 51636.283 |
---|---|
Coefficient of variation (CV) | 0.035405151 |
Kurtosis | -0.51673777 |
Mean | 1458439.8 |
Median Absolute Deviation (MAD) | 31115 |
Skewness | 0.35401196 |
Sum | 1.4584398 × 10^10 |
Variance | 2.6663058 × 10^9 |
Monotonicity | Not monotonic |
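Several of the statistics for 시퀀스 are derived from others in the same tables, so they can be sanity-checked with a few lines of arithmetic. This is a minimal sketch using the rounded values printed in this report, so small floating-point differences are expected.

```python
# Values copied from the quantile and descriptive statistics tables for 시퀀스.
minimum, maximum = 1367640, 1576405
q1, q3 = 1418528, 1479476
mean, std = 1458439.8, 51636.283

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(value_range, iqr, round(cv, 9), f"{variance:.7e}")
```

The range (208765) and IQR (60948) match exactly; the CV and variance agree with the reported values up to rounding of the inputs.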
Value | Count | Frequency (%) |
1381007 | 57 | 0.6% |
1450828 | 54 | 0.5% |
1453803 | 52 | 0.5% |
1478713 | 51 | 0.5% |
1530218 | 49 | 0.5% |
1428901 | 49 | 0.5% |
1549925 | 49 | 0.5% |
1467487 | 48 | 0.5% |
1448608 | 47 | 0.5% |
1419828 | 47 | 0.5% |
Other values (263) | 9497 | 95.0% |
Value | Count | Frequency (%) |
1367640 | 42 | 0.4% |
1368888 | 38 | 0.4% |
1370501 | 30 | 0.3% |
1370544 | 36 | 0.4% |
1371038 | 36 | 0.4% |
1372788 | 36 | 0.4% |
1372808 | 44 | 0.4% |
1373470 | 33 | 0.3% |
1373472 | 43 | 0.4% |
1374300 | 36 | 0.4% |
Value | Count | Frequency (%) |
1576405 | 38 | 0.4% |
1567487 | 41 | 0.4% |
1564181 | 30 | 0.3% |
1562983 | 39 | 0.4% |
1562521 | 32 | 0.3% |
1561820 | 34 | 0.3% |
1560845 | 37 | 0.4% |
1560575 | 30 | 0.3% |
1560415 | 44 | 0.4% |
1558361 | 38 | 0.4% |
폐기물분류번호
Text
Distinct | 367 |
---|---|
Distinct (%) | 3.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
03-01-03 | 43 | 0.4% |
51-08-05 | 41 | 0.4% |
51-20-05 | 39 | 0.4% |
51-38-01 | 39 | 0.4% |
06-01 | 39 | 0.4% |
51-18 | 38 | 0.4% |
51-29 | 38 | 0.4% |
03-08-02 | 37 | 0.4% |
91-15 | 37 | 0.4% |
06-01-07 | 37 | 0.4% |
Other values (357) | 9612 | 96.1% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 17331 | 24.7% |
- | 16751 | 23.8% |
1 | 12279 | 17.5% |
5 | 6085 | 8.7% |
2 | 4210 | 6.0% |
3 | 3762 | 5.4% |
9 | 3638 | 5.2% |
4 | 2209 | 3.1% |
7 | 1411 | 2.0% |
6 | 1409 | 2.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 53502 | 76.2% |
Dash Punctuation | 16751 | 23.8% |
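The category names above correspond to Unicode general categories: "Decimal Number" is `Nd` and "Dash Punctuation" is `Pd`. A character and category tally of this kind can be sketched with the standard library; `char_profile` is a hypothetical helper for illustration, applied here to a handful of values from the table above rather than to the full column.

```python
import unicodedata
from collections import Counter

def char_profile(values):
    """Count characters and their Unicode general categories (hypothetical helper)."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            # unicodedata.category returns codes such as "Nd" (decimal number)
            # or "Pd" (dash punctuation).
            categories[unicodedata.category(ch)] += 1
    return chars, categories

# A few 폐기물분류번호 values taken from the common-values table.
sample = ["03-01-03", "51-08-05", "06-01", "91-15"]
chars, categories = char_profile(sample)

print(chars.most_common(3))
print(dict(categories))
```

On the full column, the same tally would be expected to contain only `Nd` and `Pd`, since the report lists no other categories.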
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 17331 | 32.4% |
1 | 12279 | 23.0% |
5 | 6085 | 11.4% |
2 | 4210 | 7.9% |
3 | 3762 | 7.0% |
9 | 3638 | 6.8% |
4 | 2209 | 4.1% |
7 | 1411 | 2.6% |
6 | 1409 | 2.6% |
8 | 1168 | 2.2% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 16751 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 70253 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 17331 | 24.7% |
- | 16751 | 23.8% |
1 | 12279 | 17.5% |
5 | 6085 | 8.7% |
2 | 4210 | 6.0% |
3 | 3762 | 5.4% |
9 | 3638 | 5.2% |
4 | 2209 | 3.1% |
7 | 1411 | 2.0% |
6 | 1409 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 70253 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 17331 | 24.7% |
- | 16751 | 23.8% |
1 | 12279 | 17.5% |
5 | 6085 | 8.7% |
2 | 4210 | 6.0% |
3 | 3762 | 5.4% |
9 | 3638 | 5.2% |
4 | 2209 | 3.1% |
7 | 1411 | 2.0% |
6 | 1409 | 2.0% |
실적년도
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
2022 | 10000 |
---|---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2022 |
---|---|
2nd row | 2022 |
3rd row | 2022 |
4th row | 2022 |
5th row | 2022 |
Common Values
Value | Count | Frequency (%) |
2022 | 10000 | 100.0% |
폐기물분류코드
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
사업장폐기물 | 8312 |
---|---|
생활폐기물 | 1277 |
건설폐기물 | 411 |
Length
Max length | 6 |
---|---|
Median length | 6 |
Mean length | 5.8312 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 사업장폐기물 |
---|---|
2nd row | 사업장폐기물 |
3rd row | 사업장폐기물 |
4th row | 생활폐기물 |
5th row | 사업장폐기물 |
Common Values
Value | Count | Frequency (%) |
사업장폐기물 | 8312 | 83.1% |
생활폐기물 | 1277 | 12.8% |
건설폐기물 | 411 | 4.1% |
폐기물정보대분류
Real number (ℝ)
HIGH CORRELATION
Distinct | 13 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 39.8767 |
Minimum | 1 |
---|---|
Maximum | 91 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2 |
Q1 | 7 |
median | 51 |
Q3 | 51 |
95-th percentile | 91 |
Maximum | 91 |
Range | 90 |
Interquartile range (IQR) | 44 |
Descriptive statistics
Standard deviation | 28.631791 |
---|---|
Coefficient of variation (CV) | 0.71800804 |
Kurtosis | -0.88077618 |
Mean | 39.8767 |
Median Absolute Deviation (MAD) | 11 |
Skewness | 0.10959072 |
Sum | 398767 |
Variance | 819.77948 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
51 | 4908 | 49.1% |
91 | 1277 | 12.8% |
3 | 942 | 9.4% |
2 | 416 | 4.2% |
40 | 411 | 4.1% |
1 | 402 | 4.0% |
6 | 386 | 3.9% |
10 | 273 | 2.7% |
8 | 250 | 2.5% |
7 | 228 | 2.3% |
Other values (3) | 507 | 5.1% |
Value | Count | Frequency (%) |
1 | 402 | 4.0% |
2 | 416 | 4.2% |
3 | 942 | 9.4% |
4 | 141 | 1.4% |
5 | 187 | 1.9% |
6 | 386 | 3.9% |
7 | 228 | 2.3% |
8 | 250 | 2.5% |
9 | 179 | 1.8% |
10 | 273 | 2.7% |
Value | Count | Frequency (%) |
91 | 1277 | 12.8% |
51 | 4908 | 49.1% |
40 | 411 | 4.1% |
10 | 273 | 2.7% |
9 | 179 | 1.8% |
8 | 250 | 2.5% |
7 | 228 | 2.3% |
6 | 386 | 3.9% |
5 | 187 | 1.9% |
4 | 141 | 1.4% |
폐기물정보중분류
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 46 |
---|---|
Distinct (%) | 0.5% |
Missing | 311 |
Missing (%) | 3.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 13.46269 |
Minimum | 1 |
---|---|
Maximum | 99 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 8 |
Q3 | 18 |
95-th percentile | 41 |
Maximum | 99 |
Range | 98 |
Interquartile range (IQR) | 15 |
Descriptive statistics
Standard deviation | 15.707709 |
---|---|
Coefficient of variation (CV) | 1.1667586 |
Kurtosis | 9.9793014 |
Mean | 13.46269 |
Median Absolute Deviation (MAD) | 6 |
Skewness | 2.6169287 |
Sum | 130440 |
Variance | 246.73212 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 1369 | 13.7% |
3 | 910 | 9.1% |
2 | 906 | 9.1% |
4 | 505 | 5.1% |
17 | 490 | 4.9% |
6 | 417 | 4.2% |
8 | 350 | 3.5% |
5 | 315 | 3.1% |
12 | 304 | 3.0% |
9 | 257 | 2.6% |
Other values (36) | 3866 | 38.7% |
(Missing) | 311 | 3.1% |
Value | Count | Frequency (%) |
1 | 1369 | 13.7% |
2 | 906 | 9.1% |
3 | 910 | 9.1% |
4 | 505 | 5.1% |
5 | 315 | 3.1% |
6 | 417 | 4.2% |
7 | 252 | 2.5% |
8 | 350 | 3.5% |
9 | 257 | 2.6% |
10 | 206 | 2.1% |
Value | Count | Frequency (%) |
99 | 104 | 1.0% |
90 | 44 | 0.4% |
46 | 55 | 0.6% |
45 | 123 | 1.2% |
44 | 60 | 0.6% |
43 | 46 | 0.5% |
42 | 47 | 0.5% |
41 | 63 | 0.6% |
40 | 11 | 0.1% |
38 | 105 | 1.1% |
폐기물정보소분류
Real number (ℝ)
HIGH CORRELATION, MISSING, ZEROS
Distinct | 23 |
---|---|
Distinct (%) | 0.3% |
Missing | 2938 |
Missing (%) | 29.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 11.708581 |
Minimum | 0 |
---|---|
Maximum | 99 |
Zeros | 1411 |
Zeros (%) | 14.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 1 |
median | 2 |
Q3 | 5 |
95-th percentile | 99 |
Maximum | 99 |
Range | 99 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 27.546837 |
---|---|
Coefficient of variation (CV) | 2.352705 |
Kurtosis | 5.9119469 |
Mean | 11.708581 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 2.7772046 |
Sum | 82686 |
Variance | 758.82825 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 1411 | 14.1% |
1 | 1204 | 12.0% |
2 | 1189 | 11.9% |
3 | 818 | 8.2% |
99 | 607 | 6.1% |
4 | 467 | 4.7% |
5 | 320 | 3.2% |
6 | 252 | 2.5% |
7 | 187 | 1.9% |
8 | 127 | 1.3% |
Other values (13) | 480 | 4.8% |
(Missing) | 2938 | 29.4% |
Value | Count | Frequency (%) |
0 | 1411 | 14.1% |
1 | 1204 | 12.0% |
2 | 1189 | 11.9% |
3 | 818 | 8.2% |
4 | 467 | 4.7% |
5 | 320 | 3.2% |
6 | 252 | 2.5% |
7 | 187 | 1.9% |
8 | 127 | 1.3% |
9 | 78 | 0.8% |
Value | Count | Frequency (%) |
99 | 607 | 6.1% |
90 | 24 | 0.2% |
29 | 26 | 0.3% |
24 | 29 | 0.3% |
23 | 28 | 0.3% |
22 | 19 | 0.2% |
21 | 28 | 0.3% |
19 | 53 | 0.5% |
14 | 26 | 0.3% |
13 | 35 | 0.4% |
연소여부
Boolean
HIGH CORRELATION, MISSING
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 6617 |
Missing (%) | 66.2% |
Memory size | 97.7 KiB |
False | 2954 |
---|---|
True | 429 |
(Missing) | 6617 |
Value | Count | Frequency (%) |
False | 2954 | 29.5% |
True | 429 | 4.3% |
(Missing) | 6617 | 66.2% |
지정폐기물여부
Boolean
HIGH CORRELATION
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 87.9 KiB |
False | 7709 |
---|---|
True | 2291 |
Value | Count | Frequency (%) |
False | 7709 | 77.1% |
True | 2291 | 22.9% |
의료폐기물여부
Boolean
HIGH CORRELATION, IMBALANCE, MISSING
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 1662 |
Missing (%) | 16.6% |
Memory size | 97.7 KiB |
False | 8005 |
---|---|
True | 333 |
(Missing) | 1662 |
Value | Count | Frequency (%) |
False | 8005 | 80.1% |
True | 333 | 3.3% |
(Missing) | 1662 | 16.6% |
Variable | 시퀀스 | 폐기물분류코드 | 폐기물정보대분류 | 폐기물정보중분류 | 폐기물정보소분류 | 연소여부 | 지정폐기물여부 | 의료폐기물여부 |
---|---|---|---|---|---|---|---|---|
시퀀스 | 1.000 | 0.000 | 0.020 | 0.034 | 0.000 | 0.018 | 0.026 | 0.013 |
폐기물분류코드 | 0.000 | 1.000 | 1.000 | 0.508 | 0.504 | 0.099 | 0.149 | 0.409 |
폐기물정보대분류 | 0.020 | 1.000 | 1.000 | 0.515 | 0.490 | 0.899 | 0.631 | 0.996 |
폐기물정보중분류 | 0.034 | 0.508 | 0.515 | 1.000 | 0.263 | 0.320 | 0.549 | 0.412 |
폐기물정보소분류 | 0.000 | 0.504 | 0.490 | 0.263 | 1.000 | 0.056 | 0.246 | 0.117 |
연소여부 | 0.018 | 0.099 | 0.899 | 0.320 | 0.056 | 1.000 | 0.699 | 0.893 |
지정폐기물여부 | 0.026 | 0.149 | 0.631 | 0.549 | 0.246 | 0.699 | 1.000 | 0.288 |
의료폐기물여부 | 0.013 | 0.409 | 0.996 | 0.412 | 0.117 | 0.893 | 0.288 | 1.000 |
Variable | 폐기물분류코드 | 의료폐기물여부 | 지정폐기물여부 | 연소여부 |
---|---|---|---|---|
폐기물분류코드 | 1.000 | 0.269 | 0.245 | 0.063 |
의료폐기물여부 | 0.269 | 1.000 | 0.186 | 0.703 |
지정폐기물여부 | 0.245 | 0.186 | 1.000 | 0.493 |
연소여부 | 0.063 | 0.703 | 0.493 | 1.000 |
Variable | 시퀀스 | 폐기물정보대분류 | 폐기물정보중분류 | 폐기물정보소분류 | 폐기물분류코드 | 연소여부 | 지정폐기물여부 | 의료폐기물여부 |
---|---|---|---|---|---|---|---|---|
시퀀스 | 1.000 | -0.006 | -0.032 | 0.013 | 0.000 | 0.018 | 0.026 | 0.013 |
폐기물정보대분류 | -0.006 | 1.000 | 0.556 | -0.194 | 1.000 | 0.485 | 0.759 | 0.355 |
폐기물정보중분류 | -0.032 | 0.556 | 1.000 | -0.134 | 0.243 | 0.391 | 0.397 | 0.297 |
폐기물정보소분류 | 0.013 | -0.194 | -0.134 | 1.000 | 0.504 | 0.093 | 0.164 | 0.078 |
폐기물분류코드 | 0.000 | 1.000 | 0.243 | 0.504 | 1.000 | 0.063 | 0.245 | 0.269 |
연소여부 | 0.018 | 0.485 | 0.391 | 0.093 | 0.063 | 1.000 | 0.493 | 0.703 |
지정폐기물여부 | 0.026 | 0.759 | 0.397 | 0.164 | 0.245 | 0.493 | 1.000 | 0.186 |
의료폐기물여부 | 0.013 | 0.355 | 0.297 | 0.078 | 0.269 | 0.703 | 0.186 | 1.000 |
Index | 시퀀스 | 폐기물분류번호 | 실적년도 | 폐기물분류코드 | 폐기물정보대분류 | 폐기물정보중분류 | 폐기물정보소분류 | 연소여부 | 지정폐기물여부 | 의료폐기물여부 |
---|---|---|---|---|---|---|---|---|---|---|
86724 | 1398549 | 02-01-99 | 2022 | 사업장폐기물 | 2 | 1 | 99 | <NA> | Y | N |
77788 | 1375085 | 51-18-02 | 2022 | 사업장폐기물 | 51 | 18 | 2 | <NA> | N | N |
79569 | 1473556 | 51-08-05 | 2022 | 사업장폐기물 | 51 | 8 | 5 | N | N | N |
18541 | 1423773 | 91-10 | 2022 | 생활폐기물 | 91 | 10 | <NA> | <NA> | N | <NA> |
29964 | 1403544 | 51-42 | 2022 | 사업장폐기물 | 51 | 42 | <NA> | <NA> | N | N |
18550 | 1378655 | 91-10 | 2022 | 생활폐기물 | 91 | 10 | <NA> | <NA> | N | <NA> |
48935 | 1454821 | 51-03-06 | 2022 | 사업장폐기물 | 51 | 3 | 6 | <NA> | N | N |
20394 | 1399542 | 07-01 | 2022 | 사업장폐기물 | 7 | 1 | <NA> | <NA> | N | N |
14658 | 1399975 | 06-01-06 | 2022 | 사업장폐기물 | 6 | 1 | 6 | <NA> | Y | N |
2113 | 1396761 | 51-03-05 | 2022 | 사업장폐기물 | 51 | 3 | 5 | <NA> | N | N |
Index | 시퀀스 | 폐기물분류번호 | 실적년도 | 폐기물분류코드 | 폐기물정보대분류 | 폐기물정보중분류 | 폐기물정보소분류 | 연소여부 | 지정폐기물여부 | 의료폐기물여부 |
---|---|---|---|---|---|---|---|---|---|---|
51874 | 1528854 | 51-17-21 | 2022 | 사업장폐기물 | 51 | 17 | 21 | <NA> | N | N |
19965 | 1547873 | 07 | 2022 | 사업장폐기물 | 7 | <NA> | <NA> | <NA> | N | N |
33895 | 1487722 | 51-13-02 | 2022 | 사업장폐기물 | 51 | 13 | 2 | N | N | N |
72018 | 1418643 | 51-26-00 | 2022 | 사업장폐기물 | 51 | 26 | 0 | N | N | N |
7507 | 1478709 | 40-02-06 | 2022 | 건설폐기물 | 40 | 2 | 6 | <NA> | N | <NA> |
52860 | 1470698 | 51-17-29 | 2022 | 사업장폐기물 | 51 | 17 | 29 | <NA> | N | N |
29210 | 1370544 | 51-04-02 | 2022 | 사업장폐기물 | 51 | 4 | 2 | N | N | N |
44079 | 1473556 | 91-15-00 | 2022 | 생활폐기물 | 91 | 15 | 0 | <NA> | N | <NA> |
47822 | 1551218 | 10-11-00 | 2022 | 사업장폐기물 | 10 | 11 | 0 | Y | Y | Y |
71862 | 1454887 | 51-26-00 | 2022 | 사업장폐기물 | 51 | 26 | 0 | N | N | N |
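In the sample rows above, 폐기물분류번호 appears to be the hyphen-joined form of 폐기물정보대분류, 폐기물정보중분류 and 폐기물정보소분류, with absent levels shown as `<NA>`. That would also explain the high correlations reported between these fields. The sketch below illustrates the apparent relationship; `split_classification` is a hypothetical helper, and the one-to-three-level format is an assumption drawn only from the rows shown here.

```python
from typing import Optional, Tuple

def split_classification(number: str) -> Tuple[int, Optional[int], Optional[int]]:
    """Split a value such as "51-08-05" into (major, middle, minor) levels.

    Hypothetical helper: levels that are absent from the number are
    returned as None, mirroring the <NA> cells in the sample rows.
    """
    parts = [int(part) for part in number.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected classification number: {number!r}")
    major = parts[0]
    middle = parts[1] if len(parts) > 1 else None
    minor = parts[2] if len(parts) > 2 else None
    return major, middle, minor

# Values taken from the sample rows of this report.
for number in ["51-17-21", "91-10", "07", "51-26-00"]:
    print(number, split_classification(number))
```

Note that a minor level of `00` parses to 0, which is distinct from a missing minor level; this matches the Zeros alert on 폐기물정보소분류 alongside its separate missing count.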