Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 1117 |
Missing cells | 1096 |
Missing cells (%) | 14.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 67.8 KiB |
Average record size in memory | 62.1 B |
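The missing-cell figures above can be cross-checked against the per-variable missing counts reported later in this profile (902, 1, 90 and 103 for the four emission columns). A minimal sketch; the variable names are illustrative and not part of the report:

```python
# Per-column missing counts as listed in the variable sections of this report.
missing_per_column = {
    "총유기탄소 배출량": 902,
    "부유물질 배출량": 1,
    "총질소 배출량": 90,
    "총인 배출량": 103,
}
n_rows, n_cols = 1117, 7

missing_cells = sum(missing_per_column.values())
# The percentage is taken over all cells (rows x columns), not over rows.
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(missing_cells, f"{missing_pct:.1f}%")  # 1096 14.0%
```

Most of the 14.0% comes from a single column: 총유기탄소 배출량 alone accounts for 902 of the 1096 missing cells.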
Variable types
Categorical | 1 |
---|---|
Text | 1 |
Numeric | 5 |
Dataset
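Description | Emissions of water pollutants measured in real time at facilities equipped with a water-quality TMS (tele-monitoring system) are compiled into annual statistics and published through the system |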
---|---|
URL | https://www.data.go.kr/data/15106197/fileData.do |
연도 has constant value "2022" | Constant |
총유기탄소 배출량 is highly overall correlated with 부유물질 배출량 and 2 other fields | High correlation |
부유물질 배출량 is highly overall correlated with 총유기탄소 배출량 and 2 other fields | High correlation |
총질소 배출량 is highly overall correlated with 총유기탄소 배출량 and 2 other fields | High correlation |
총인 배출량 is highly overall correlated with 총유기탄소 배출량 and 2 other fields | High correlation |
총유기탄소 배출량 has 902 (80.8%) missing values | Missing |
총질소 배출량 has 90 (8.1%) missing values | Missing |
총인 배출량 has 103 (9.2%) missing values | Missing |
부유물질 배출량 has 58 (5.2%) zeros | Zeros |
총인 배출량 has 505 (45.2%) zeros | Zeros |
Reproduction
Analysis started | 2023-12-13 00:41:45.703216 |
---|---|
Analysis finished | 2023-12-13 00:41:48.071833 |
Duration | 2.37 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
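A report of this kind is typically generated with a few lines of Python. The sketch below assumes `pandas` and `ydata-profiling` are installed. The helper name `build_report`, the report title, and the `cp949` encoding are assumptions made for illustration; none of them is recorded in this report.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("emissions_2022.csv")` (a hypothetical file name) would write `report.html` next to the script.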
연도
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.9 KiB |
2022 |
---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2022 |
---|---|
2nd row | 2022 |
3rd row | 2022 |
4th row | 2022 |
5th row | 2022 |
Common Values
Value | Count | Frequency (%) |
2022 | 1117 | 100.0% |
사업장명
Text
Distinct | 1045 |
---|---|
Distinct (%) | 93.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.9 KiB |
Value | Count | Frequency (%) |
현대제철(당진 | 6 | 0.5% |
한국서부발전(태안 | 5 | 0.4% |
아산디스플레이시티1폐수 | 4 | 0.4% |
포스코(광양 | 4 | 0.4% |
풍산안강공장(경주 | 4 | 0.4% |
화성봉담하수 | 3 | 0.3% |
포스코인터내셔널(인천 | 3 | 0.3% |
삼성전자(화성 | 3 | 0.3% |
대전하수 | 3 | 0.3% |
한울원자력본부(울진 | 3 | 0.3% |
Other values (1042) | 1088 |
Most occurring characters
Value | Count | Frequency (%) |
수 | 884 | 11.6% |
하 | 678 | 8.9% |
( | 305 | 4.0% |
) | 305 | 4.0% |
산 | 254 | 3.3% |
주 | 190 | 2.5% |
천 | 189 | 2.5% |
폐 | 150 | 2.0% |
성 | 143 | 1.9% |
양 | 126 | 1.7% |
Other values (381) | 4405 | 57.7% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 6904 | 90.5% |
Open Punctuation | 305 | 4.0% |
Close Punctuation | 305 | 4.0% |
Decimal Number | 48 | 0.6% |
Uppercase Letter | 43 | 0.6% |
Space Separator | 12 | 0.2% |
Dash Punctuation | 6 | 0.1% |
Other Punctuation | 3 | < 0.1% |
Lowercase Letter | 3 | < 0.1% |
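The category names in this table are Unicode general categories: Other Letter is `Lo` (which covers Hangul syllables), Open and Close Punctuation are `Ps` and `Pe`, Decimal Number is `Nd`, and so on. As a rough illustration, the standard-library `unicodedata` module can reproduce the classification for two facility names that appear in this report; the helper name `category_counts` is made up for the example.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Ps') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two facility names taken from this report.
names = ["중앙특수제지(포천)", "아산디스플레이시티1폐수"]

totals = Counter()
for name in names:
    totals += category_counts(name)

for code, count in totals.most_common():
    print(code, count)
```

Summing such counts over all 1117 names is presumably how the per-category totals above arise, although the report does not describe its implementation.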
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
수 | 884 | 12.8% |
하 | 678 | 9.8% |
산 | 254 | 3.7% |
주 | 190 | 2.8% |
천 | 189 | 2.7% |
폐 | 150 | 2.2% |
성 | 143 | 2.1% |
양 | 126 | 1.8% |
안 | 100 | 1.4% |
진 | 91 | 1.3% |
Other values (356) | 4099 | 59.4% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 8 | 18.6% |
S | 8 | 18.6% |
L | 6 | 14.0% |
K | 6 | 14.0% |
I | 4 | 9.3% |
O | 3 | 7.0% |
D | 2 | 4.7% |
P | 2 | 4.7% |
A | 2 | 4.7% |
J | 1 | 2.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 21 | 43.8% |
2 | 20 | 41.7% |
3 | 3 | 6.2% |
4 | 3 | 6.2% |
5 | 1 | 2.1% |
Lowercase Letter
Value | Count | Frequency (%) |
t | 1 | 33.3% |
b | 1 | 33.3% |
h | 1 | 33.3% |
Other Punctuation
Value | Count | Frequency (%) |
# | 2 | 66.7% |
. | 1 | 33.3% |
Open Punctuation
Value | Count | Frequency (%) |
( | 305 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 305 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 12 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 6 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 6904 | 90.5% |
Common | 679 | 8.9% |
Latin | 46 | 0.6% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
수 | 884 | 12.8% |
하 | 678 | 9.8% |
산 | 254 | 3.7% |
주 | 190 | 2.8% |
천 | 189 | 2.7% |
폐 | 150 | 2.2% |
성 | 143 | 2.1% |
양 | 126 | 1.8% |
안 | 100 | 1.4% |
진 | 91 | 1.3% |
Other values (356) | 4099 | 59.4% |
Latin
Value | Count | Frequency (%) |
C | 8 | 17.4% |
S | 8 | 17.4% |
L | 6 | 13.0% |
K | 6 | 13.0% |
I | 4 | 8.7% |
O | 3 | 6.5% |
D | 2 | 4.3% |
P | 2 | 4.3% |
A | 2 | 4.3% |
J | 1 | 2.2% |
Other values (4) | 4 | 8.7% |
Common
Value | Count | Frequency (%) |
( | 305 | 44.9% |
) | 305 | 44.9% |
1 | 21 | 3.1% |
2 | 20 | 2.9% |
(space) | 12 | 1.8% |
- | 6 | 0.9% |
3 | 3 | 0.4% |
4 | 3 | 0.4% |
# | 2 | 0.3% |
. | 1 | 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 6904 | 90.5% |
ASCII | 725 | 9.5% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
수 | 884 | 12.8% |
하 | 678 | 9.8% |
산 | 254 | 3.7% |
주 | 190 | 2.8% |
천 | 189 | 2.7% |
폐 | 150 | 2.2% |
성 | 143 | 2.1% |
양 | 126 | 1.8% |
안 | 100 | 1.4% |
진 | 91 | 1.3% |
Other values (356) | 4099 | 59.4% |
ASCII
Value | Count | Frequency (%) |
( | 305 | 42.1% |
) | 305 | 42.1% |
1 | 21 | 2.9% |
2 | 20 | 2.8% |
(space) | 12 | 1.7% |
C | 8 | 1.1% |
S | 8 | 1.1% |
L | 6 | 0.8% |
- | 6 | 0.8% |
K | 6 | 0.8% |
Other values (15) | 28 | 3.9% |
방류구
Real number (ℝ)
Distinct | 6 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.1056401 |
Minimum | 1 |
---|---|
Maximum | 6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.9 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 1 |
95-th percentile | 2 |
Maximum | 6 |
Range | 5 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 0.44954623 |
---|---|
Coefficient of variation (CV) | 0.40659364 |
Kurtosis | 36.909075 |
Mean | 1.1056401 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 5.5451571 |
Sum | 1235 |
Variance | 0.20209182 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 1039 | 93.0% |
2 | 53 | 4.7% |
3 | 14 | 1.3% |
4 | 8 | 0.7% |
5 | 2 | 0.2% |
6 | 1 | 0.1% |
Value | Count | Frequency (%) |
1 | 1039 | 93.0% |
2 | 53 | 4.7% |
3 | 14 | 1.3% |
4 | 8 | 0.7% |
5 | 2 | 0.2% |
6 | 1 | 0.1% |
Value | Count | Frequency (%) |
6 | 1 | 0.1% |
5 | 2 | 0.2% |
4 | 8 | 0.7% |
3 | 14 | 1.3% |
2 | 53 | 4.7% |
1 | 1039 | 93.0% |
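Because 방류구 takes only six distinct values, its summary statistics can be recomputed directly from the frequency table. The sketch below uses only the value counts listed above. The choice of the sample variance (n - 1 denominator) is an inference from the fact that it reproduces the reported figure, not something the report states.

```python
from math import sqrt

# value -> count, copied from the frequency table for 방류구
counts = {1: 1039, 2: 53, 3: 14, 4: 8, 5: 2, 6: 1}

n = sum(counts.values())                         # number of observations
total = sum(v * c for v, c in counts.items())    # the "Sum" statistic
mean = total / n

# Sample variance (n - 1 denominator); this convention matches the report.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                  # coefficient of variation

print(n, total, mean, variance, std, cv)
```

The results agree with the reported Sum (1235), Mean (1.1056401), Variance (0.20209182) and CV (0.40659364) up to rounding.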
총유기탄소 배출량
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 205 |
---|---|
Distinct (%) | 95.3% |
Missing | 902 |
Missing (%) | 80.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 17626.42 |
Minimum | 0 |
---|---|
Maximum | 464456.4 |
Zeros | 7 |
Zeros (%) | 0.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0.47 |
Q1 | 157.85 |
median | 918.1 |
Q3 | 5387 |
95-th percentile | 81215.4 |
Maximum | 464456.4 |
Range | 464456.4 |
Interquartile range (IQR) | 5229.15 |
Descriptive statistics
Standard deviation | 62782.811 |
---|---|
Coefficient of variation (CV) | 3.5618583 |
Kurtosis | 29.193007 |
Mean | 17626.42 |
Median Absolute Deviation (MAD) | 896.7 |
Skewness | 5.2230009 |
Sum | 3789680.3 |
Variance | 3.9416813 × 10^9 |
Monotonicity | Not monotonic |
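Several of the statistics for 총유기탄소 배출량 are derived from one another, and they only reconcile when the 215 non-missing observations (1117 minus 902) are used as the base. A short sketch of those relationships, using the figures reported above; the dictionary keys are illustrative names:

```python
from math import isclose

n_rows, n_missing, n_distinct = 1117, 902, 205
mean, std = 17626.42, 62782.811
q1, q3 = 157.85, 5387.0

n_valid = n_rows - n_missing  # observations that actually carry a value

checks = {
    "sum": mean * n_valid,                        # reported: 3789680.3
    "variance": std ** 2,                         # reported: 3.9416813e9
    "cv": std / mean,                             # reported: 3.5618583
    "iqr": q3 - q1,                               # reported: 5229.15
    "distinct_pct": 100 * n_distinct / n_valid,   # reported: 95.3%
}

for name, value in checks.items():
    print(f"{name}: {value:.8g}")
```

Note that Distinct (%) is computed over the non-missing values, whereas Missing (%) and Zeros (%) are computed over all 1117 rows.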
Value | Count | Frequency (%) |
0.0 | 7 | 0.6% |
26.2 | 2 | 0.2% |
152.8 | 2 | 0.2% |
117.9 | 2 | 0.2% |
0.4 | 2 | 0.2% |
2402.2 | 1 | 0.1% |
1054.6 | 1 | 0.1% |
918.1 | 1 | 0.1% |
48.0 | 1 | 0.1% |
152.2 | 1 | 0.1% |
Other values (195) | 195 | 17.5% |
(Missing) | 902 | 80.8% |
Value | Count | Frequency (%) |
0.0 | 7 | 0.6% |
0.1 | 1 | 0.1% |
0.3 | 1 | 0.1% |
0.4 | 2 | 0.2% |
0.5 | 1 | 0.1% |
0.8 | 1 | 0.1% |
6.7 | 1 | 0.1% |
7.3 | 1 | 0.1% |
8.1 | 1 | 0.1% |
9.1 | 1 | 0.1% |
Value | Count | Frequency (%) |
464456.4 | 1 | 0.1% |
451185.3 | 1 | 0.1% |
360009.2 | 1 | 0.1% |
291364.0 | 1 | 0.1% |
278032.3 | 1 | 0.1% |
253415.1 | 1 | 0.1% |
205769.2 | 1 | 0.1% |
149978.1 | 1 | 0.1% |
149676.3 | 1 | 0.1% |
102796.8 | 1 | 0.1% |
부유물질 배출량
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 989 |
---|---|
Distinct (%) | 88.6% |
Missing | 1 |
Missing (%) | 0.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 17976.615 |
Minimum | 0 |
---|---|
Maximum | 1957909.5 |
Zeros | 58 |
Zeros (%) | 5.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 78.95 |
median | 991.6 |
Q3 | 5763.275 |
95-th percentile | 80083.125 |
Maximum | 1957909.5 |
Range | 1957909.5 |
Interquartile range (IQR) | 5684.325 |
Descriptive statistics
Standard deviation | 84442.932 |
---|---|
Coefficient of variation (CV) | 4.6973767 |
Kurtosis | 276.85937 |
Mean | 17976.615 |
Median Absolute Deviation (MAD) | 989.3 |
Skewness | 14.056499 |
Sum | 20061902 |
Variance | 7.1306087 × 10^9 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.0 | 58 | 5.2% |
0.1 | 8 | 0.7% |
0.2 | 7 | 0.6% |
0.9 | 4 | 0.4% |
1.6 | 3 | 0.3% |
1.5 | 3 | 0.3% |
0.4 | 3 | 0.3% |
2.3 | 3 | 0.3% |
9.5 | 3 | 0.3% |
3.3 | 3 | 0.3% |
Other values (979) | 1021 | 91.4% |
Value | Count | Frequency (%) |
0.0 | 58 | 5.2% |
0.1 | 8 | 0.7% |
0.2 | 7 | 0.6% |
0.3 | 3 | 0.3% |
0.4 | 3 | 0.3% |
0.5 | 2 | 0.2% |
0.6 | 1 | 0.1% |
0.7 | 1 | 0.1% |
0.8 | 2 | 0.2% |
0.9 | 4 | 0.4% |
Value | Count | Frequency (%) |
1957909.5 | 1 | 0.1% |
1075668.4 | 1 | 0.1% |
537200.4 | 1 | 0.1% |
499623.2 | 1 | 0.1% |
482916.9 | 1 | 0.1% |
458477.3 | 1 | 0.1% |
442631.9 | 1 | 0.1% |
426682.8 | 1 | 0.1% |
376532.3 | 1 | 0.1% |
361731.5 | 1 | 0.1% |
총질소 배출량
Real number (ℝ)
HIGH CORRELATION
  MISSING
 
Distinct | 1024 |
---|---|
Distinct (%) | 99.7% |
Missing | 90 |
Missing (%) | 8.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 76534.536 |
Minimum | 0 |
---|---|
Maximum | 7059490.7 |
Zeros | 3 |
Zeros (%) | 0.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 147.53 |
Q1 | 1204.25 |
median | 4845.7 |
Q3 | 26217.1 |
95-th percentile | 307782.98 |
Maximum | 7059490.7 |
Range | 7059490.7 |
Interquartile range (IQR) | 25012.85 |
Descriptive statistics
Standard deviation | 340796.62 |
---|---|
Coefficient of variation (CV) | 4.4528475 |
Kurtosis | 206.05173 |
Mean | 76534.536 |
Median Absolute Deviation (MAD) | 4399 |
Skewness | 12.246752 |
Sum | 78600969 |
Variance | 1.1614234 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.0 | 3 | 0.3% |
43.2 | 2 | 0.2% |
3127.7 | 1 | 0.1% |
8058.0 | 1 | 0.1% |
750.2 | 1 | 0.1% |
443.1 | 1 | 0.1% |
10.7 | 1 | 0.1% |
13010.9 | 1 | 0.1% |
5515.0 | 1 | 0.1% |
231.7 | 1 | 0.1% |
Other values (1014) | 1014 | 90.8% |
(Missing) | 90 | 8.1% |
Value | Count | Frequency (%) |
0.0 | 3 | 0.3% |
0.1 | 1 | 0.1% |
0.4 | 1 | 0.1% |
1.3 | 1 | 0.1% |
2.0 | 1 | 0.1% |
2.2 | 1 | 0.1% |
8.2 | 1 | 0.1% |
9.1 | 1 | 0.1% |
9.6 | 1 | 0.1% |
10.7 | 1 | 0.1% |
Value | Count | Frequency (%) |
7059490.7 | 1 | 0.1% |
4422279.3 | 1 | 0.1% |
2792729.4 | 1 | 0.1% |
2424608.6 | 1 | 0.1% |
1968647.7 | 1 | 0.1% |
1860595.5 | 1 | 0.1% |
1719874.8 | 1 | 0.1% |
1683698.8 | 1 | 0.1% |
1402254.4 | 1 | 0.1% |
1303658.1 | 1 | 0.1% |
총인 배출량
Real number (ℝ)
HIGH CORRELATION
  MISSING
  ZEROS
 
Distinct | 396 |
---|---|
Distinct (%) | 39.1% |
Missing | 103 |
Missing (%) | 9.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1605.5534 |
Minimum | 0 |
---|---|
Maximum | 109115.5 |
Zeros | 505 |
Zeros (%) | 45.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0.1 |
Q3 | 96.175 |
95-th percentile | 7259.9 |
Maximum | 109115.5 |
Range | 109115.5 |
Interquartile range (IQR) | 96.175 |
Descriptive statistics
Standard deviation | 7605.628 |
---|---|
Coefficient of variation (CV) | 4.7370759 |
Kurtosis | 80.526929 |
Mean | 1605.5534 |
Median Absolute Deviation (MAD) | 0.1 |
Skewness | 8.1761724 |
Sum | 1628031.1 |
Variance | 57845578 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0.0 | 505 | 45.2% |
0.2 | 18 | 1.6% |
0.3 | 14 | 1.3% |
0.4 | 11 | 1.0% |
0.1 | 9 | 0.8% |
0.9 | 7 | 0.6% |
2.6 | 6 | 0.5% |
0.7 | 6 | 0.5% |
1.1 | 5 | 0.4% |
1.3 | 5 | 0.4% |
Other values (386) | 428 | 38.3% |
(Missing) | 103 | 9.2% |
Value | Count | Frequency (%) |
0.0 | 505 | 45.2% |
0.1 | 9 | 0.8% |
0.2 | 18 | 1.6% |
0.3 | 14 | 1.3% |
0.4 | 11 | 1.0% |
0.5 | 4 | 0.4% |
0.6 | 2 | 0.2% |
0.7 | 6 | 0.5% |
0.8 | 3 | 0.3% |
0.9 | 7 | 0.6% |
Value | Count | Frequency (%) |
109115.5 | 1 | 0.1% |
86871.0 | 1 | 0.1% |
67944.0 | 1 | 0.1% |
66568.0 | 1 | 0.1% |
61266.6 | 1 | 0.1% |
59488.3 | 1 | 0.1% |
54510.7 | 1 | 0.1% |
51630.4 | 1 | 0.1% |
48112.9 | 1 | 0.1% |
47652.6 | 1 | 0.1% |
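The 총인 배출량 column is dominated by zeros, which explains why its median (0.1) sits so far below its mean (1605.5534). The sketch below rebuilds the lower tail of the sorted column from the minimum-values table and locates the median. It assumes the usual convention of averaging the two middle values when the number of observations is even.

```python
# Lowest values and their counts, copied from the minimum-values table.
lowest = [(0.0, 505), (0.1, 9), (0.2, 18), (0.3, 14), (0.4, 11),
          (0.5, 4), (0.6, 2), (0.7, 6), (0.8, 3), (0.9, 7)]
n_rows, n_missing = 1117, 103
n_valid = n_rows - n_missing  # 1014 non-missing observations

# Expand the counts into the sorted lower tail of the column.
tail = [value for value, count in lowest for _ in range(count)]

# With an even number of observations the median averages the two middle values.
lo_idx, hi_idx = n_valid // 2 - 1, n_valid // 2
assert hi_idx < len(tail), "median lies beyond the values listed in the table"
median = (tail[lo_idx] + tail[hi_idx]) / 2

zeros = lowest[0][1]
zeros_pct_all = round(100 * zeros / n_rows, 1)     # share of all rows
zeros_pct_valid = round(100 * zeros / n_valid, 1)  # share of non-missing rows

print(median, zeros_pct_all, zeros_pct_valid)  # 0.1 45.2 49.8
```

Zeros make up 45.2% of all rows but nearly half of the non-missing values, so the median falls just past the block of zeros.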
Variable | 방류구 | 총유기탄소 배출량 | 부유물질 배출량 | 총질소 배출량 | 총인 배출량 |
---|---|---|---|---|---|
방류구 | 1.000 | 0.190 | 0.048 | 0.083 | 0.000 |
총유기탄소 배출량 | 0.190 | 1.000 | 1.000 | 1.000 | 0.944 |
부유물질 배출량 | 0.048 | 1.000 | 1.000 | 0.896 | 0.850 |
총질소 배출량 | 0.083 | 1.000 | 0.896 | 1.000 | 0.883 |
총인 배출량 | 0.000 | 0.944 | 0.850 | 0.883 | 1.000 |
Variable | 방류구 | 총유기탄소 배출량 | 부유물질 배출량 | 총질소 배출량 | 총인 배출량 |
---|---|---|---|---|---|
방류구 | 1.000 | 0.214 | 0.117 | 0.152 | 0.127 |
총유기탄소 배출량 | 0.214 | 1.000 | 0.837 | 0.835 | 0.705 |
부유물질 배출량 | 0.117 | 0.837 | 1.000 | 0.912 | 0.742 |
총질소 배출량 | 0.152 | 0.835 | 0.912 | 1.000 | 0.735 |
총인 배출량 | 0.127 | 0.705 | 0.742 | 0.735 | 1.000 |
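The report does not say which coefficient each matrix above uses. Because 총유기탄소 배출량 is missing in 80.8% of rows, any coefficient involving it can only be computed on the rows where both variables are present. The sketch below illustrates that pairwise deletion with a Spearman rank correlation, chosen here purely as an example. The ten value pairs are the first rows of the dataset as shown in this report, so the result is not comparable to the full-data figures.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_pairwise(x, y):
    """Spearman correlation over rows where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(ranks(xs), ranks(ys)), len(pairs)

# First ten rows of the dataset; None marks a missing (<NA>) value.
toc = [None, None, None, 0.4, 323.6, 280.7, None, 1050.1, None, 75.7]
ss = [26.6, 76.6, 4.3, 0.3, 113.6, 7.6, 1586.8, 29.6, 486.8, 235.1]

rho, n_used = spearman_pairwise(toc, ss)
print(round(rho, 3), n_used)  # 0.3 5
```

Only 5 of the 10 rows contribute, which mirrors the situation in the full dataset, where at most 215 of the 1117 rows can enter any correlation involving 총유기탄소 배출량.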
Index | 연도 | 사업장명 | 방류구 | 총유기탄소 배출량 | 부유물질 배출량 | 총질소 배출량 | 총인 배출량 |
---|---|---|---|---|---|---|---|
0 | 2022 | 영천신녕하수 | 1 | <NA> | 26.6 | 1446.1 | 0.0 |
1 | 2022 | 성주폐수 | 1 | <NA> | 76.6 | 1721.1 | 0.0 |
2 | 2022 | 봉화춘양하수 | 1 | <NA> | 4.3 | 704.3 | 0.0 |
3 | 2022 | 함안군북하수 | 1 | 0.4 | 0.3 | 583.1 | 0.0 |
4 | 2022 | 사천곤양하수 | 1 | 323.6 | 113.6 | 2181.4 | 3.7 |
5 | 2022 | 고창대산하수 | 1 | 280.7 | 7.6 | 1706.0 | 0.0 |
6 | 2022 | 김천아포하수 | 1 | <NA> | 1586.8 | 2058.0 | 0.0 |
7 | 2022 | 중앙특수제지(포천) | 1 | 1050.1 | 29.6 | 522.1 | 0.0 |
8 | 2022 | 한국서부발전(태안) | 4 | <NA> | 486.8 | 1735.7 | 2.6 |
9 | 2022 | 괴산대제폐수 | 1 | 75.7 | 235.1 | 309.4 | 0.0 |
Index | 연도 | 사업장명 | 방류구 | 총유기탄소 배출량 | 부유물질 배출량 | 총질소 배출량 | 총인 배출량 |
---|---|---|---|---|---|---|---|
1107 | 2022 | 울진후포하수 | 1 | <NA> | 93.5 | 8189.5 | 9.7 |
1108 | 2022 | 화천산양하수 | 1 | <NA> | 290.5 | 2048.2 | 0.0 |
1109 | 2022 | 양주옥정하수 | 1 | <NA> | 15459.3 | 86176.0 | 330.1 |
1110 | 2022 | 나주산단폐수 | 1 | <NA> | 237.5 | 815.9 | 0.9 |
1111 | 2022 | 켐트로닉스(세종) | 1 | <NA> | 1158.8 | 814.1 | 0.0 |
1112 | 2022 | 화성서신하수 | 1 | <NA> | 294.1 | 1103.8 | 0.0 |
1113 | 2022 | 하남하수 | 1 | <NA> | 2928.8 | 51521.1 | 0.0 |
1114 | 2022 | 의성금성하수 | 1 | 835.9 | 276.2 | 2154.7 | 0.0 |
1115 | 2022 | 밀양정수(밀양) | 1 | <NA> | 269.2 | <NA> | <NA> |
1116 | 2022 | 화성매송하수 | 1 | 4338.7 | 524.0 | 1893.0 | 0.8 |