Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 10000 |
Missing cells | 8 |
Missing cells (%) | < 0.1% |
Duplicate rows | 21 |
Duplicate rows (%) | 0.2% |
Total size in memory | 556.6 KiB |
Average record size in memory | 57.0 B |
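The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming that 1 KiB is 1024 bytes and that the missing-cell share is taken over the full rows × columns grid (the variable names are illustrative, not part of the report):

```python
# Counts copied from the dataset statistics table.
n_vars = 6
n_obs = 10_000
missing_cells = 8
duplicate_rows = 21
total_kib = 556.6

# Share of empty cells over the whole rows x columns grid (assumption).
missing_pct = 100 * missing_cells / (n_vars * n_obs)
# Share of rows flagged as duplicates.
duplicate_pct = 100 * duplicate_rows / n_obs
# Average bytes per row, assuming 1 KiB = 1024 B.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results agree with the reported "< 0.1%", "0.2%" and "57.0 B".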
Variable types
Text | 2 |
---|---|
Categorical | 2 |
DateTime | 1 |
Numeric | 1 |
Dataset
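Description | Measured values and alarm history for cases where portable measuring instruments, which are easy to relocate and install, were deployed at sites of expected or actual water pollution incidents and a reading exceeded the threshold that each monitoring station sets for itself |
---|---|
Author | 한국환경공단 (Korea Environment Corporation) |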
URL | https://www.data.go.kr/data/15065132/fileData.do |
Reproduction
Analysis started | 2023-12-12 11:21:17.759963 |
---|---|
Analysis finished | 2023-12-12 11:21:19.109689 |
Duration | 1.35 seconds |
Software version | ydata-profiling v4.5.1 |
측정소명 (station name)
Text
Distinct | 55 |
---|---|
Distinct (%) | 0.6% |
Missing | 1 |
Missing (%) | < 0.1% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
석남천 | 8345 | |
김해 | 298 | 3.0% |
구미 | 187 | 1.9% |
성주 | 168 | 1.7% |
호남예비3 | 143 | 1.4% |
한강1 | 136 | 1.4% |
지석천 | 109 | 1.1% |
황룡강 | 104 | 1.0% |
왕숙천 | 103 | 1.0% |
고령 | 52 | 0.5% |
Other values (45) | 354 | 3.5% |
Most occurring characters
Value | Count | Frequency (%) |
천 | 8633 | 29.2% |
남 | 8535 | 28.9% |
석 | 8454 | 28.6% |
해 | 299 | 1.0% |
김 | 298 | 1.0% |
강 | 249 | 0.8% |
주 | 208 | 0.7% |
미 | 198 | 0.7% |
1 | 195 | 0.7% |
구 | 190 | 0.6% |
Other values (71) | 2278 | 7.7% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 29081 | 98.5% |
Decimal Number | 454 | 1.5% |
Dash Punctuation | 2 | < 0.1% |
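The category names in this table are Unicode general categories. As a rough illustration, Python's standard `unicodedata` module returns the two-letter codes (`Lo`, `Nd`, `Pd`) behind these names. The helper `category_counts` and the third sample string are made up for the example; this is a sketch, not necessarily how the profiler computes the table.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two station names from the report, plus one made-up name containing a dash.
sample = ["석남천", "호남예비3", "예시-1"]
print(category_counts(sample))
```

Hangul syllables fall under "Other Letter", which is why that category dominates a column of Korean station names.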
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
천 | 8633 | 29.7% |
남 | 8535 | 29.3% |
석 | 8454 | 29.1% |
해 | 299 | 1.0% |
김 | 298 | 1.0% |
강 | 249 | 0.9% |
주 | 208 | 0.7% |
미 | 198 | 0.7% |
구 | 190 | 0.7% |
성 | 178 | 0.6% |
Other values (64) | 1839 | 6.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 195 | 43.0% |
3 | 158 | 34.8% |
2 | 43 | 9.5% |
0 | 28 | 6.2% |
8 | 28 | 6.2% |
4 | 2 | 0.4% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 2 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 29081 | 98.5% |
Common | 456 | 1.5% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
천 | 8633 | 29.7% |
남 | 8535 | 29.3% |
석 | 8454 | 29.1% |
해 | 299 | 1.0% |
김 | 298 | 1.0% |
강 | 249 | 0.9% |
주 | 208 | 0.7% |
미 | 198 | 0.7% |
구 | 190 | 0.7% |
성 | 178 | 0.6% |
Other values (64) | 1839 | 6.3% |
Common
Value | Count | Frequency (%) |
1 | 195 | 42.8% |
3 | 158 | 34.6% |
2 | 43 | 9.4% |
0 | 28 | 6.1% |
8 | 28 | 6.1% |
4 | 2 | 0.4% |
- | 2 | 0.4% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 29081 | 98.5% |
ASCII | 456 | 1.5% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
천 | 8633 | 29.7% |
남 | 8535 | 29.3% |
석 | 8454 | 29.1% |
해 | 299 | 1.0% |
김 | 298 | 1.0% |
강 | 249 | 0.9% |
주 | 208 | 0.7% |
미 | 198 | 0.7% |
구 | 190 | 0.7% |
성 | 178 | 0.6% |
Other values (64) | 1839 | 6.3% |
ASCII
Value | Count | Frequency (%) |
1 | 195 | 42.8% |
3 | 158 | 34.6% |
2 | 43 | 9.4% |
0 | 28 | 6.1% |
8 | 28 | 6.1% |
4 | 2 | 0.4% |
- | 2 | 0.4% |
항목코드 (item code)
Categorical
Alerts: HIGH CORRELATION, IMBALANCE
Distinct | 4 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
DOW00 | 7914 |
---|---|
PHY00 | 2073 |
ETC | 7 |
CON00 | 6 |
Length
Max length | 5 |
---|---|
Median length | 5 |
Mean length | 4.9986 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | DOW00 |
---|---|
2nd row | DOW00 |
3rd row | DOW00 |
4th row | ETC |
5th row | DOW00 |
Common Values
Value | Count | Frequency (%) |
DOW00 | 7914 | 79.1% |
PHY00 | 2073 | 20.7% |
ETC | 7 | 0.1% |
CON00 | 6 | 0.1% |
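The IMBALANCE alert on this column reflects how strongly one value dominates the counts above. One common way to put a number on that is one minus the normalized Shannon entropy of the class shares. Whether ydata-profiling v4.5.1 uses exactly this formula, or which threshold triggers the alert, is an assumption here and not something the report states. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

# Counts copied from the Common Values table.
item_code_counts = {"DOW00": 7914, "PHY00": 2073, "ETC": 7, "CON00": 6}
score = imbalance_score(list(item_code_counts.values()))
print(f"imbalance score: {score:.3f}")
```

With these counts the score comes out at roughly 0.62, well away from the 0 that four equally frequent codes would give.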
항목명 (item name)
Categorical
Alerts: HIGH CORRELATION, IMBALANCE
Distinct | 4 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
용존산소 (dissolved oxygen) | 7914 |
---|---|
pH | 2073 |
기타 (other) | 7 |
전기전도도 (electrical conductivity) | 6 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 3.5846 |
Min length | 2 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 용존산소 |
---|---|
2nd row | 용존산소 |
3rd row | 용존산소 |
4th row | 기타 |
5th row | 용존산소 |
Common Values
Value | Count | Frequency (%) |
용존산소 | 7914 | 79.1% |
pH | 2073 | 20.7% |
기타 | 7 | 0.1% |
전기전도도 | 6 | 0.1% |
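The mean lengths reported for 항목코드 (4.9986) and 항목명 (3.5846) can be recovered from the value counts alone, since each distinct value has a fixed length. A small sketch, assuming length is counted in Unicode code points as Python's `len` does (the helper name `mean_length` is illustrative):

```python
def mean_length(value_counts):
    """Average string length, weighting each distinct value by its count."""
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    total_rows = sum(value_counts.values())
    return total_chars / total_rows

# Counts copied from the Common Values tables of the two columns.
item_code = {"DOW00": 7914, "PHY00": 2073, "ETC": 7, "CON00": 6}
item_name = {"용존산소": 7914, "pH": 2073, "기타": 7, "전기전도도": 6}

print(round(mean_length(item_code), 4))
print(round(mean_length(item_name), 4))
```

The seven three-character `ETC` rows are the only reason the 항목코드 mean falls just short of 5.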
측정일시 (measurement date and time)
Text
Distinct | 9880 |
---|---|
Distinct (%) | 98.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 19 |
---|---|
Median length | 19 |
Mean length | 19 |
Min length | 19 |
Characters and Unicode
Total characters | 190000 |
---|---|
Distinct characters | 13 |
Distinct categories | 4 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 9769 |
---|---|
Unique (%) | 97.7% |
Sample
1st row | 2021-11-26 10:20:00 |
---|---|
2nd row | 2022-04-24 03:50:00 |
3rd row | 2021-05-13 09:50:00 |
4th row | 2013-04-26 16:46:00 |
5th row | 2021-05-19 03:50:00 |
Value | Count | Frequency (%) |
05:50:00 | 132 | 0.7% |
07:20:00 | 120 | 0.6% |
01:20:00 | 118 | 0.6% |
06:20:00 | 117 | 0.6% |
09:20:00 | 115 | 0.6% |
02:50:00 | 115 | 0.6% |
05:20:00 | 114 | 0.6% |
00:50:00 | 114 | 0.6% |
04:20:00 | 114 | 0.6% |
03:20:00 | 113 | 0.6% |
Other values (1141) | 18828 |
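The counts in this table add up to 20000, twice the number of rows, which fits each 19-character value being split on whitespace into a date token and a time token. A minimal sketch of that tokenization on the five sample values, assuming plain whitespace splitting (the report itself does not describe its tokenizer):

```python
from collections import Counter

# First five sample values of the column, copied from the Sample table.
timestamps = [
    "2021-11-26 10:20:00",
    "2022-04-24 03:50:00",
    "2021-05-13 09:50:00",
    "2013-04-26 16:46:00",
    "2021-05-19 03:50:00",
]

# Split on whitespace so each value yields one date token and one time token.
tokens = [token for value in timestamps for token in value.split()]
token_counts = Counter(tokens)

for token, count in token_counts.most_common(3):
    print(token, count, f"{100 * count / len(tokens):.1f}%")
```

Because measurements repeat at fixed times of day while dates rarely repeat, time tokens dominate the top of such a ranking.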
Most occurring characters
Value | Count | Frequency (%) |
0 | 60645 | 31.9% |
2 | 27234 | 14.3% |
1 | 24142 | 12.7% |
- | 20000 | 10.5% |
: | 20000 | 10.5% |
(space) | 10000 | 5.3% |
9 | 5370 | 2.8% |
5 | 5334 | 2.8% |
4 | 5072 | 2.7% |
3 | 4947 | 2.6% |
Other values (3) | 7256 | 3.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 140000 | 73.7% |
Dash Punctuation | 20000 | 10.5% |
Other Punctuation | 20000 | 10.5% |
Space Separator | 10000 | 5.3% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 60645 | 43.3% |
2 | 27234 | 19.5% |
1 | 24142 | 17.2% |
9 | 5370 | 3.8% |
5 | 5334 | 3.8% |
4 | 5072 | 3.6% |
3 | 4947 | 3.5% |
6 | 2756 | 2.0% |
7 | 2309 | 1.6% |
8 | 2191 | 1.6% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 20000 | 100.0% |
Other Punctuation
Value | Count | Frequency (%) |
: | 20000 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 190000 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 60645 | 31.9% |
2 | 27234 | 14.3% |
1 | 24142 | 12.7% |
- | 20000 | 10.5% |
: | 20000 | 10.5% |
(space) | 10000 | 5.3% |
9 | 5370 | 2.8% |
5 | 5334 | 2.8% |
4 | 5072 | 2.7% |
3 | 4947 | 2.6% |
Other values (3) | 7256 | 3.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 190000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 60645 | 31.9% |
2 | 27234 | 14.3% |
1 | 24142 | 12.7% |
- | 20000 | 10.5% |
: | 20000 | 10.5% |
(space) | 10000 | 5.3% |
9 | 5370 | 2.8% |
5 | 5334 | 2.8% |
4 | 5072 | 2.7% |
3 | 4947 | 2.6% |
Other values (3) | 7256 | 3.8% |
경보발생시간 (alarm time)
Date
Distinct | 9931 |
---|---|
Distinct (%) | 99.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2011-02-23 13:45:00 |
---|---|
Maximum | 2022-04-25 09:50:00 |
측정값 (measured value)
Real number (ℝ)
Alerts: ZEROS
Distinct | 15 |
---|---|
Distinct (%) | 0.2% |
Missing | 7 |
Missing (%) | 0.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.698589 |
Minimum | 0 |
---|---|
Maximum | 30 |
Zeros | 1804 |
Zeros (%) | 18.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 2 |
median | 4 |
Q3 | 4 |
95-th percentile | 5 |
Maximum | 30 |
Range | 30 |
Interquartile range (IQR) | 2 |
Descriptive statistics
Standard deviation | 4.3764069 |
---|---|
Coefficient of variation (CV) | 1.1832639 |
Kurtosis | 22.515947 |
Mean | 3.698589 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 4.4153555 |
Sum | 36960 |
Variance | 19.152937 |
Monotonicity | Not monotonic |
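Several of these statistics are tied together by simple identities, which makes a quick consistency check possible. A short sketch using only the reported figures (the variable names are illustrative):

```python
# Values copied from the statistics above.
mean = 3.698589
std = 4.3764069
q1, q3 = 2, 4
n_obs = 10_000
n_missing = 7

cv = std / mean                     # coefficient of variation
variance = std ** 2                 # variance is the squared standard deviation
iqr = q3 - q1                       # interquartile range
total = mean * (n_obs - n_missing)  # sum over the non-missing values

print(f"CV = {cv:.7f}")
print(f"variance = {variance:.6f}")
print(f"IQR = {iqr}")
print(f"sum = {total:.0f}")
```

Each result agrees with the reported value to the precision shown, so the table is internally consistent.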
Common values
Value | Count | Frequency (%) |
4 | 2761 | 27.6% |
3 | 2001 | 20.0% |
5 | 1818 | 18.2% |
0 | 1804 | 18.0% |
2 | 851 | 8.5% |
1 | 306 | 3.1% |
28 | 171 | 1.7% |
9 | 136 | 1.4% |
29 | 75 | 0.8% |
6 | 46 | 0.5% |
Other values (5) | 24 | 0.2% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 1804 | 18.0% |
1 | 306 | 3.1% |
2 | 851 | 8.5% |
3 | 2001 | 20.0% |
4 | 2761 | 27.6% |
5 | 1818 | 18.2% |
6 | 46 | 0.5% |
8 | 5 | 0.1% |
9 | 136 | 1.4% |
10 | 12 | 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
30 | 5 | 0.1% |
29 | 75 | 0.8% |
28 | 171 | 1.7% |
24 | 1 | < 0.1% |
18 | 1 | < 0.1% |
10 | 12 | 0.1% |
9 | 136 | 1.4% |
8 | 5 | 0.1% |
6 | 46 | 0.5% |
5 | 1818 |
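Taken together, the value tables for 측정값 list all 15 distinct values, so the summary statistics can be rebuilt from the counts. The sketch below assumes a sample variance (n − 1 denominator) and a simple cumulative-count quantile without interpolation; both are assumptions about conventions, not statements about how the profiler computes them.

```python
# Value -> count for 측정값, combined from the value tables above.
value_counts = {
    0: 1804, 1: 306, 2: 851, 3: 2001, 4: 2761, 5: 1818, 6: 46, 8: 5,
    9: 136, 10: 12, 18: 1, 24: 1, 28: 171, 29: 75, 30: 5,
}

n = sum(value_counts.values())                        # non-missing observations
total = sum(v * c for v, c in value_counts.items())   # sum of all values
mean = total / n

# Sample variance (n - 1 denominator) from the grouped data.
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in value_counts.items())
variance = sum_sq_dev / (n - 1)

def quantile_value(counts, q):
    """Smallest value whose cumulative count reaches the share q."""
    target = q * sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= target:
            return value

print(n, total, round(mean, 6), round(variance, 6))
print([quantile_value(value_counts, q) for q in (0.25, 0.5, 0.75)])
```

The counts sum to 9993, which is the 10000 observations minus the 7 missing values, and the rebuilt mean, variance and quartiles line up with the reported statistics.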
| 측정소명 | 항목코드 | 항목명 | 측정값 |
---|---|---|---|---|
측정소명 | 1.000 | 0.882 | 0.882 | 0.750 |
항목코드 | 0.882 | 1.000 | 1.000 | 0.601 |
항목명 | 0.882 | 1.000 | 1.000 | 0.601 |
측정값 | 0.750 | 0.601 | 0.601 | 1.000 |
| 항목명 | 항목코드 |
---|---|---|
항목명 | 1.000 | 1.000 |
항목코드 | 1.000 | 1.000 |
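A coefficient of 1.000 between 항목명 and 항목코드 is what an association measure for categorical data yields when every code maps to exactly one name. The report text does not say which coefficient this matrix uses; as an assumption, the sketch below uses plain (not bias-corrected) Cramér's V. The one-to-one mapping is inferred from the matching counts of the two columns, and `cramers_v` is an illustrative helper.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Plain Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    x_totals = Counter(x for x, _ in pairs)
    y_totals = Counter(y for _, y in pairs)
    k = min(len(x_totals), len(y_totals))
    if k < 2:
        return 0.0
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# One-to-one code/name mapping with the counts reported for both columns.
mapping = {"DOW00": ("용존산소", 7914), "PHY00": ("pH", 2073),
           "ETC": ("기타", 7), "CON00": ("전기전도도", 6)}
pairs = [(code, name) for code, (name, count) in mapping.items()
         for _ in range(count)]
print(round(cramers_v(pairs), 3))
```

Two columns that carry the same information under different labels are redundant for modelling, which is what the HIGH CORRELATION alert on both columns points at.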
| 측정값 | 항목코드 | 항목명 |
---|---|---|---|
측정값 | 1.000 | 0.377 | 0.377 |
항목코드 | 0.377 | 1.000 | 1.000 |
항목명 | 0.377 | 1.000 | 1.000 |
| 측정소명 | 항목코드 | 항목명 | 측정일시 | 경보발생시간 | 측정값 |
---|---|---|---|---|---|---|
1774 | 석남천 | DOW00 | 용존산소 | 2021-11-26 10:20:00 | 2021-11-26 10:20:00 | 4 |
53 | 석남천 | DOW00 | 용존산소 | 2022-04-24 03:50:00 | 2022-04-24 03:50:00 | 3 |
5947 | 석남천 | DOW00 | 용존산소 | 2021-05-13 09:50:00 | 2021-05-13 09:50:00 | 5 |
17502 | 여주침사지 | ETC | 기타 | 2013-04-26 16:46:00 | 2013-04-26 16:51:00 | <NA> |
5757 | 석남천 | DOW00 | 용존산소 | 2021-05-19 03:50:00 | 2021-05-19 03:50:00 | 3 |
6851 | 석남천 | DOW00 | 용존산소 | 2021-04-14 03:50:00 | 2021-04-14 03:50:00 | 4 |
17001 | 호남예비3 | PHY00 | pH | 2015-06-24 02:00:00 | 2015-06-24 02:16:00 | 0 |
9971 | 석남천 | DOW00 | 용존산소 | 2020-10-07 20:40:00 | 2020-10-07 20:45:00 | 4 |
6836 | 석남천 | DOW00 | 용존산소 | 2021-04-14 12:40:00 | 2021-04-14 12:40:00 | 5 |
5483 | 석남천 | DOW00 | 용존산소 | 2021-05-25 12:20:00 | 2021-05-25 12:20:00 | 1 |
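In these rows 경보발생시간 is either equal to 측정일시 or a few minutes later. Since 측정일시 is stored as text, it has to be parsed before the lag can be computed. A minimal sketch with the standard library, assuming both columns use the 19-character `YYYY-MM-DD HH:MM:SS` layout shown above (the helper name `alarm_lag_minutes` is illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # 19-character layout seen in both columns

def alarm_lag_minutes(measured_at, alarm_at):
    """Minutes between the measurement time and the alarm time."""
    delta = datetime.strptime(alarm_at, FMT) - datetime.strptime(measured_at, FMT)
    return delta.total_seconds() / 60

# (측정일시, 경보발생시간) pairs copied from three of the rows above.
rows = [
    ("2021-11-26 10:20:00", "2021-11-26 10:20:00"),
    ("2013-04-26 16:46:00", "2013-04-26 16:51:00"),
    ("2015-06-24 02:00:00", "2015-06-24 02:16:00"),
]
for measured_at, alarm_at in rows:
    print(measured_at, "->", alarm_lag_minutes(measured_at, alarm_at), "min")
```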
| 측정소명 | 항목코드 | 항목명 | 측정일시 | 경보발생시간 | 측정값 |
---|---|---|---|---|---|---|
7177 | 구미 | PHY00 | pH | 2021-03-27 03:10:00 | 2021-03-27 03:15:00 | 0 |
10169 | 석남천 | DOW00 | 용존산소 | 2020-09-27 00:40:00 | 2020-09-27 00:45:00 | 4 |
12101 | 석남천 | DOW00 | 용존산소 | 2019-11-19 23:20:00 | 2019-11-19 23:25:00 | 3 |
5801 | 석남천 | DOW00 | 용존산소 | 2021-05-17 07:20:00 | 2021-05-17 07:20:00 | 3 |
13555 | 석남천 | DOW00 | 용존산소 | 2019-09-27 22:40:00 | 2019-09-27 22:45:00 | 4 |
380 | 석남천 | PHY00 | pH | 2022-04-15 05:40:00 | 2022-04-15 05:40:00 | 0 |
2037 | 석남천 | DOW00 | 용존산소 | 2021-11-17 18:20:00 | 2021-11-17 18:20:00 | 5 |
9645 | 고령 | PHY00 | pH | 2020-10-17 10:30:00 | 2020-10-17 10:35:00 | 0 |
17056 | 호남예비3 | PHY00 | pH | 2015-06-10 19:00:00 | 2015-06-10 19:56:00 | 0 |
12625 | 석남천 | DOW00 | 용존산소 | 2019-11-07 09:40:00 | 2019-11-07 09:45:00 | 4 |
Most frequently occurring
| 측정소명 | 항목코드 | 항목명 | 측정일시 | 경보발생시간 | 측정값 | # duplicates |
---|---|---|---|---|---|---|---|
3 | 석남천 | DOW00 | 용존산소 | 2021-04-30 07:20:00 | 2021-04-30 10:23:00 | 0 | 3 |
20 | 호남예비3 | PHY00 | pH | 2015-06-22 17:00:00 | 2015-06-22 19:10:00 | 0 | 3 |
0 | 감천 | PHY00 | pH | 2013-05-25 00:20:00 | 2013-05-25 00:56:00 | 0 | 2 |
1 | 금남 | PHY00 | pH | 2013-05-20 13:20:00 | 2013-05-20 13:53:00 | 0 | 2 |
2 | 노안 | PHY00 | pH | 2012-10-24 14:00:00 | 2012-10-24 14:20:00 | 0 | 2 |
4 | 성주 | PHY00 | pH | 2013-03-19 16:50:00 | 2013-03-19 17:25:00 | 0 | 2 |
5 | 승촌 | PHY00 | pH | 2013-03-28 11:50:00 | 2013-03-28 12:17:00 | 0 | 2 |
6 | 승촌 | PHY00 | pH | 2013-05-06 10:00:00 | 2013-05-06 10:41:00 | 3 | 2 |
7 | 승촌 | PHY00 | pH | 2013-05-10 16:50:00 | 2013-05-10 17:26:00 | 0 | 2 |
8 | 승촌 | PHY00 | pH | 2013-05-13 15:10:00 | 2013-05-13 16:04:00 | 0 | 2 |
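A table like this can be produced by counting identical rows and keeping those that occur more than once. A small sketch with `collections.Counter`: the input is a made-up four-row example built from values in the report, and `duplicate_groups` is an illustrative helper, not the profiler's own code.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Made-up input: the first row of the table repeated three times,
# plus one unrelated row that appears once.
rows = [
    ("석남천", "DOW00", "용존산소", "2021-04-30 07:20:00", "2021-04-30 10:23:00", 0),
    ("석남천", "DOW00", "용존산소", "2021-04-30 07:20:00", "2021-04-30 10:23:00", 0),
    ("석남천", "DOW00", "용존산소", "2021-04-30 07:20:00", "2021-04-30 10:23:00", 0),
    ("석남천", "DOW00", "용존산소", "2021-11-26 10:20:00", "2021-11-26 10:20:00", 4),
]
groups = duplicate_groups(rows)
print(groups)
```

Rows are compared as whole tuples, so two records count as duplicates only when all six fields match.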