Overview

Dataset statistics

Number of variables: 7
Number of observations: 106
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.3 KiB
Average record size in memory: 61.2 B

Variable types

Categorical: 1
Text: 2
Numeric: 4

Dataset

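Description: Water-quality measurement results from 2013 for valley streams inside national parks, provided in CSV format. Biochemical oxygen demand (BOD) is reported as quarterly measurements, in ppm.
Author: 국립공원공단 (Korea National Park Service)
URL: https://www.data.go.kr/data/15044519/fileData.do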

Alerts

생화학적산소요구량_1차(BOD) is highly overall correlated with 생화학적산소요구량_2차(BOD) and 2 other fields (High correlation)
생화학적산소요구량_2차(BOD) is highly overall correlated with 생화학적산소요구량_1차(BOD) and 3 other fields (High correlation)
생화학적산소요구량_3차(BOD) is highly overall correlated with 생화학적산소요구량_1차(BOD) and 3 other fields (High correlation)
생화학적산소요구량_4차(BOD) is highly overall correlated with 생화학적산소요구량_1차(BOD) and 3 other fields (High correlation)
공원명 is highly overall correlated with 생화학적산소요구량_2차(BOD) and 2 other fields (High correlation)
일련번호 has unique values (Unique)
생화학적산소요구량_2차(BOD) has 9 (8.5%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 09:14:37.636001
Analysis finished: 2023-12-12 09:14:39.614655
Duration: 1.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공원명 (park name)
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 19.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

지리산: 11
계룡산: 9
지리산남부: 8
북한산: 8
북한산도봉: 6
Other values (16): 64

Length

Max length: 5
Median length: 3
Mean length: 3.4528302
Min length: 2

Unique

Unique: 1
Unique (%): 0.9%
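
The difference between Distinct and Unique can be illustrated with a short sketch. It assumes the usual reading of these two fields: Distinct counts how many different values occur, while Unique counts values that occur exactly once. The input list is only the 공원명 column of the last ten rows shown in the Sample section, not the full dataset, and the variable names are illustrative.

```python
from collections import Counter

# 공원명 values of the last ten sample rows (indices 96 to 105) in this report.
parks = ["소백산"] * 2 + ["소백산북부"] * 4 + ["월출산"] * 3 + ["변산반도"]

counts = Counter(parks)

# Distinct: number of different values in the column.
distinct = len(counts)

# Unique: number of values that appear exactly once.
unique = sum(1 for c in counts.values() if c == 1)

print(distinct, unique)
```

On these ten rows there are four different park names and one of them appears only once, which mirrors how the full column can have 21 distinct values but only 1 unique value.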

Sample

1st row: 지리산
2nd row: 지리산
3rd row: 지리산
4th row: 지리산
5th row: 지리산

Common Values

Value | Count | Frequency (%)
지리산 | 11 | 10.4%
계룡산 | 9 | 8.5%
지리산남부 | 8 | 7.5%
북한산 | 8 | 7.5%
북한산도봉 | 6 | 5.7%
월악산 | 6 | 5.7%
덕유산 | 6 | 5.7%
속리산 | 6 | 5.7%
설악산 | 6 | 5.7%
치악산 | 4 | 3.8%
Other values (11) | 36 | 34.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
지리산 | 11 | 10.4%
계룡산 | 9 | 8.5%
지리산남부 | 8 | 7.5%
북한산 | 8 | 7.5%
북한산도봉 | 6 | 5.7%
월악산 | 6 | 5.7%
덕유산 | 6 | 5.7%
속리산 | 6 | 5.7%
설악산 | 6 | 5.7%
오대산 | 4 | 3.8%
Other values (11) | 36 | 34.0%

일련번호 (serial number)
Text

UNIQUE 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.0188679
Min length: 4

Characters and Unicode

Total characters: 426
Distinct characters: 40
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
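
As a rough illustration of these character properties, the sketch below classifies each character of the five sample identifiers listed for this variable, using Python's standard `unicodedata` module. This is not necessarily how the report itself was computed; it only shows where category names such as Other Letter, Decimal Number and Dash Punctuation come from.

```python
import unicodedata
from collections import Counter

# The five sample values of 일련번호 shown in this report.
ids = ["지리-1", "지리-2", "지리-3", "지리-4", "지리-5"]

# unicodedata.category returns a two-letter general category code:
#   Lo = Other Letter (Hangul syllables), Nd = Decimal Number,
#   Pd = Dash Punctuation (the ASCII hyphen-minus).
categories = Counter(unicodedata.category(ch) for s in ids for ch in s)

for code, count in categories.most_common():
    print(code, count)
```

Each identifier contributes two Hangul letters, one hyphen and one digit, so the three categories found here are the same three that the report lists for the whole column.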

Unique

Unique: 106
Unique (%): 100.0%

Sample

1st row: 지리-1
2nd row: 지리-2
3rd row: 지리-3
4th row: 지리-4
5th row: 지리-5

Value | Count | Frequency (%)
지리-1 | 1 | 0.9%
월악-5 | 1 | 0.9%
월악-3 | 1 | 0.9%
월악-2 | 1 | 0.9%
월악-1 | 1 | 0.9%
치악-4 | 1 | 0.9%
치악-3 | 1 | 0.9%
치악-2 | 1 | 0.9%
치악-1 | 1 | 0.9%
주왕-5 | 1 | 0.9%
Other values (96) | 96 | 90.6%

Most occurring characters

Value | Count | Frequency (%)
- | 106 | 24.9%
1 | 24 | 5.6%
 | 22 | 5.2%
 | 21 | 4.9%
3 | 18 | 4.2%
 | 17 | 4.0%
2 | 17 | 4.0%
 | 16 | 3.8%
4 | 14 | 3.3%
5 | 11 | 2.6%
Other values (30) | 160 | 37.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 212 | 49.8%
Decimal Number | 108 | 25.4%
Dash Punctuation | 106 | 24.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 22 | 10.4%
 | 21 | 9.9%
 | 17 | 8.0%
 | 16 | 7.5%
 | 9 | 4.2%
 | 9 | 4.2%
 | 9 | 4.2%
 | 8 | 3.8%
 | 8 | 3.8%
 | 8 | 3.8%
Other values (19) | 85 | 40.1%

Decimal Number
Value | Count | Frequency (%)
1 | 24 | 22.2%
3 | 18 | 16.7%
2 | 17 | 15.7%
4 | 14 | 13.0%
5 | 11 | 10.2%
6 | 10 | 9.3%
7 | 7 | 6.5%
8 | 4 | 3.7%
9 | 2 | 1.9%
0 | 1 | 0.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 106 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 214 | 50.2%
Hangul | 212 | 49.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 22 | 10.4%
 | 21 | 9.9%
 | 17 | 8.0%
 | 16 | 7.5%
 | 9 | 4.2%
 | 9 | 4.2%
 | 9 | 4.2%
 | 8 | 3.8%
 | 8 | 3.8%
 | 8 | 3.8%
Other values (19) | 85 | 40.1%

Common
Value | Count | Frequency (%)
- | 106 | 49.5%
1 | 24 | 11.2%
3 | 18 | 8.4%
2 | 17 | 7.9%
4 | 14 | 6.5%
5 | 11 | 5.1%
6 | 10 | 4.7%
7 | 7 | 3.3%
8 | 4 | 1.9%
9 | 2 | 0.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 214 | 50.2%
Hangul | 212 | 49.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 106 | 49.5%
1 | 24 | 11.2%
3 | 18 | 8.4%
2 | 17 | 7.9%
4 | 14 | 6.5%
5 | 11 | 5.1%
6 | 10 | 4.7%
7 | 7 | 3.3%
8 | 4 | 1.9%
9 | 2 | 0.9%

Hangul
Value | Count | Frequency (%)
 | 22 | 10.4%
 | 21 | 9.9%
 | 17 | 8.0%
 | 16 | 7.5%
 | 9 | 4.2%
 | 9 | 4.2%
 | 9 | 4.2%
 | 8 | 3.8%
 | 8 | 3.8%
 | 8 | 3.8%
Other values (19) | 85 | 40.1%

지점명 (site name)
Text

Distinct: 78
Distinct (%): 73.6%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 6
Median length: 5.5
Mean length: 4.2830189
Min length: 2

Characters and Unicode

Total characters: 454
Distinct characters: 99
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 53.8%

Sample

1st row: 유평계곡
2nd row: 중산리계곡
3rd row: 백무동계곡
4th row: 쌍계사계곡
5th row: 대성계곡

Value | Count | Frequency (%)
원당천 | 4 | 3.8%
화산계곡 | 3 | 2.8%
갑사계곡 | 3 | 2.8%
동학사계곡 | 3 | 2.8%
황룡동계곡 | 3 | 2.8%
홍류동계곡 | 3 | 2.8%
천은계곡 | 2 | 1.9%
소금강계곡 | 2 | 1.9%
구천동계곡 | 2 | 1.9%
구룡계곡 | 2 | 1.9%
Other values (68) | 79 | 74.5%

Most occurring characters

Value | Count | Frequency (%)
 | 93 | 20.5%
 | 91 | 20.0%
 | 24 | 5.3%
 | 16 | 3.5%
 | 15 | 3.3%
 | 8 | 1.8%
 | 7 | 1.5%
 | 7 | 1.5%
 | 6 | 1.3%
 | 6 | 1.3%
Other values (89) | 181 | 39.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 442 | 97.4%
Decimal Number | 9 | 2.0%
Dash Punctuation | 3 | 0.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 93 | 21.0%
 | 91 | 20.6%
 | 24 | 5.4%
 | 16 | 3.6%
 | 15 | 3.4%
 | 8 | 1.8%
 | 7 | 1.6%
 | 7 | 1.6%
 | 6 | 1.4%
 | 6 | 1.4%
Other values (85) | 169 | 38.2%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 55.6%
2 | 3 | 33.3%
3 | 1 | 11.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 442 | 97.4%
Common | 12 | 2.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 93 | 21.0%
 | 91 | 20.6%
 | 24 | 5.4%
 | 16 | 3.6%
 | 15 | 3.4%
 | 8 | 1.8%
 | 7 | 1.6%
 | 7 | 1.6%
 | 6 | 1.4%
 | 6 | 1.4%
Other values (85) | 169 | 38.2%

Common
Value | Count | Frequency (%)
1 | 5 | 41.7%
2 | 3 | 25.0%
- | 3 | 25.0%
3 | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 442 | 97.4%
ASCII | 12 | 2.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 93 | 21.0%
 | 91 | 20.6%
 | 24 | 5.4%
 | 16 | 3.6%
 | 15 | 3.4%
 | 8 | 1.8%
 | 7 | 1.6%
 | 7 | 1.6%
 | 6 | 1.4%
 | 6 | 1.4%
Other values (85) | 169 | 38.2%

ASCII
Value | Count | Frequency (%)
1 | 5 | 41.7%
2 | 3 | 25.0%
- | 3 | 25.0%
3 | 1 | 8.3%

생화학적산소요구량_1차(BOD) (biochemical oxygen demand, 1st measurement)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38679245
Minimum: 0.1
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0.1
5th percentile: 0.1
Q1: 0.2
Median: 0.4
Q3: 0.5
95th percentile: 0.7
Maximum: 1
Range: 0.9
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.20380919
Coefficient of variation (CV): 0.52692132
Kurtosis: -0.23912629
Mean: 0.38679245
Median Absolute Deviation (MAD): 0.2
Skewness: 0.53323637
Sum: 41
Variance: 0.041538185
Monotonicity: Not monotonic
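
Several of these descriptive statistics can be recomputed by hand from the value counts listed for this variable. The sketch below does so with the standard library only. It assumes the variance is the sample variance (dividing by n - 1), which is the choice that matches the reported figure; the variable names are illustrative.

```python
import math

# Value -> count, copied from the frequency table of 생화학적산소요구량_1차(BOD).
freq = {0.1: 11, 0.2: 25, 0.3: 13, 0.4: 16, 0.5: 19,
        0.6: 11, 0.7: 6, 0.8: 3, 0.9: 1, 1.0: 1}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # the "Sum" row
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, round(total, 1), round(mean, 8), round(variance, 9),
      round(std, 8), round(cv, 8))
```

The results agree with the Sum, Mean, Variance, Standard deviation and CV rows above to the precision shown in the report.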
[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value | Count | Frequency (%)
0.2 | 25 | 23.6%
0.5 | 19 | 17.9%
0.4 | 16 | 15.1%
0.3 | 13 | 12.3%
0.1 | 11 | 10.4%
0.6 | 11 | 10.4%
0.7 | 6 | 5.7%
0.8 | 3 | 2.8%
1.0 | 1 | 0.9%
0.9 | 1 | 0.9%

Extreme values: minimum

Value | Count | Frequency (%)
0.1 | 11 | 10.4%
0.2 | 25 | 23.6%
0.3 | 13 | 12.3%
0.4 | 16 | 15.1%
0.5 | 19 | 17.9%
0.6 | 11 | 10.4%
0.7 | 6 | 5.7%
0.8 | 3 | 2.8%
0.9 | 1 | 0.9%
1.0 | 1 | 0.9%

Extreme values: maximum

Value | Count | Frequency (%)
1.0 | 1 | 0.9%
0.9 | 1 | 0.9%
0.8 | 3 | 2.8%
0.7 | 6 | 5.7%
0.6 | 11 | 10.4%
0.5 | 19 | 17.9%
0.4 | 16 | 15.1%
0.3 | 13 | 12.3%
0.2 | 25 | 23.6%
0.1 | 11 | 10.4%

생화학적산소요구량_2차(BOD) (biochemical oxygen demand, 2nd measurement)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 10
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.34339623
Minimum: 0
Maximum: 0.9
Zeros: 9
Zeros (%): 8.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.1
Median: 0.35
Q3: 0.5
95th percentile: 0.675
Maximum: 0.9
Range: 0.9
Interquartile range (IQR): 0.4
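
Values such as a median of 0.35 or a 95th percentile of 0.675 do not occur in the data itself, where every measurement is a multiple of 0.1. They arise from interpolating between neighbouring observations. The sketch below reproduces them from the value counts listed for this variable, assuming linear interpolation between the two nearest ranks (the default behaviour of pandas quantiles); the `quantile` helper is written here for illustration only.

```python
import math

# Value -> count, copied from the frequency table of 생화학적산소요구량_2차(BOD).
freq = {0.0: 9, 0.1: 19, 0.2: 10, 0.3: 15, 0.4: 17,
        0.5: 16, 0.6: 14, 0.7: 2, 0.8: 2, 0.9: 2}

# Expand the counts into the sorted list of all observations.
data = sorted(v for v, c in freq.items() for _ in range(c))

def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)   # fractional zero-based rank
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

q1, median, q3, p95 = (quantile(data, q) for q in (0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1

print(round(q1, 3), round(median, 3), round(q3, 3), round(p95, 3), round(iqr, 3))
```

With 106 observations the median falls halfway between the 53rd value (0.3) and the 54th (0.4), and the 95th percentile falls three quarters of the way from 0.6 to 0.7.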

Descriptive statistics

Standard deviation: 0.21996488
Coefficient of variation (CV): 0.64055705
Kurtosis: -0.59863691
Mean: 0.34339623
Median Absolute Deviation (MAD): 0.15
Skewness: 0.23524709
Sum: 36.4
Variance: 0.048384546
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value | Count | Frequency (%)
0.1 | 19 | 17.9%
0.4 | 17 | 16.0%
0.5 | 16 | 15.1%
0.3 | 15 | 14.2%
0.6 | 14 | 13.2%
0.2 | 10 | 9.4%
0.0 | 9 | 8.5%
0.7 | 2 | 1.9%
0.8 | 2 | 1.9%
0.9 | 2 | 1.9%

Extreme values: minimum

Value | Count | Frequency (%)
0.0 | 9 | 8.5%
0.1 | 19 | 17.9%
0.2 | 10 | 9.4%
0.3 | 15 | 14.2%
0.4 | 17 | 16.0%
0.5 | 16 | 15.1%
0.6 | 14 | 13.2%
0.7 | 2 | 1.9%
0.8 | 2 | 1.9%
0.9 | 2 | 1.9%

Extreme values: maximum

Value | Count | Frequency (%)
0.9 | 2 | 1.9%
0.8 | 2 | 1.9%
0.7 | 2 | 1.9%
0.6 | 14 | 13.2%
0.5 | 16 | 15.1%
0.4 | 17 | 16.0%
0.3 | 15 | 14.2%
0.2 | 10 | 9.4%
0.1 | 19 | 17.9%
0.0 | 9 | 8.5%

생화학적산소요구량_3차(BOD) (biochemical oxygen demand, 3rd measurement)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.39528302
Minimum: 0.1
Maximum: 0.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0.1
5th percentile: 0.1
Q1: 0.2
Median: 0.4
Q3: 0.6
95th percentile: 0.775
Maximum: 0.9
Range: 0.8
Interquartile range (IQR): 0.4

Descriptive statistics

Standard deviation: 0.22012208
Coefficient of variation (CV): 0.55687208
Kurtosis: -0.98399725
Mean: 0.39528302
Median Absolute Deviation (MAD): 0.2
Skewness: 0.30671154
Sum: 41.9
Variance: 0.048453729
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
0.2 | 20 | 18.9%
0.6 | 17 | 16.0%
0.3 | 16 | 15.1%
0.1 | 16 | 15.1%
0.5 | 15 | 14.2%
0.7 | 8 | 7.5%
0.4 | 8 | 7.5%
0.8 | 4 | 3.8%
0.9 | 2 | 1.9%

Extreme values: minimum

Value | Count | Frequency (%)
0.1 | 16 | 15.1%
0.2 | 20 | 18.9%
0.3 | 16 | 15.1%
0.4 | 8 | 7.5%
0.5 | 15 | 14.2%
0.6 | 17 | 16.0%
0.7 | 8 | 7.5%
0.8 | 4 | 3.8%
0.9 | 2 | 1.9%

Extreme values: maximum

Value | Count | Frequency (%)
0.9 | 2 | 1.9%
0.8 | 4 | 3.8%
0.7 | 8 | 7.5%
0.6 | 17 | 16.0%
0.5 | 15 | 14.2%
0.4 | 8 | 7.5%
0.3 | 16 | 15.1%
0.2 | 20 | 18.9%
0.1 | 16 | 15.1%

생화학적산소요구량_4차(BOD) (biochemical oxygen demand, 4th measurement)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3745283
Minimum: 0.1
Maximum: 0.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0.1
5th percentile: 0.1
Q1: 0.2
Median: 0.4
Q3: 0.5
95th percentile: 0.6
Maximum: 0.8
Range: 0.7
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.18976744
Coefficient of variation (CV): 0.50668384
Kurtosis: -1.2948993
Mean: 0.3745283
Median Absolute Deviation (MAD): 0.2
Skewness: -0.11122697
Sum: 39.7
Variance: 0.03601168
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=8)]

Common values

Value | Count | Frequency (%)
0.5 | 23 | 21.7%
0.6 | 22 | 20.8%
0.1 | 19 | 17.9%
0.2 | 17 | 16.0%
0.4 | 13 | 12.3%
0.3 | 10 | 9.4%
0.8 | 1 | 0.9%
0.7 | 1 | 0.9%

Extreme values: minimum

Value | Count | Frequency (%)
0.1 | 19 | 17.9%
0.2 | 17 | 16.0%
0.3 | 10 | 9.4%
0.4 | 13 | 12.3%
0.5 | 23 | 21.7%
0.6 | 22 | 20.8%
0.7 | 1 | 0.9%
0.8 | 1 | 0.9%

Extreme values: maximum

Value | Count | Frequency (%)
0.8 | 1 | 0.9%
0.7 | 1 | 0.9%
0.6 | 22 | 20.8%
0.5 | 23 | 21.7%
0.4 | 13 | 12.3%
0.3 | 10 | 9.4%
0.2 | 17 | 16.0%
0.1 | 19 | 17.9%

Interactions


Correlations

 | 공원명 | 지점명 | 생화학적산소요구량_1차(BOD) | 생화학적산소요구량_2차(BOD) | 생화학적산소요구량_3차(BOD) | 생화학적산소요구량_4차(BOD)
공원명 | 1.000 | 1.000 | 0.755 | 0.892 | 0.869 | 0.848
지점명 | 1.000 | 1.000 | 0.941 | 0.936 | 0.869 | 0.982
생화학적산소요구량_1차(BOD) | 0.755 | 0.941 | 1.000 | 0.806 | 0.453 | 0.807
생화학적산소요구량_2차(BOD) | 0.892 | 0.936 | 0.806 | 1.000 | 0.561 | 0.794
생화학적산소요구량_3차(BOD) | 0.869 | 0.869 | 0.453 | 0.561 | 1.000 | 0.589
생화학적산소요구량_4차(BOD) | 0.848 | 0.982 | 0.807 | 0.794 | 0.589 | 1.000

 | 생화학적산소요구량_1차(BOD) | 생화학적산소요구량_2차(BOD) | 생화학적산소요구량_3차(BOD) | 생화학적산소요구량_4차(BOD) | 공원명
생화학적산소요구량_1차(BOD) | 1.000 | 0.656 | 0.633 | 0.880 | 0.371
생화학적산소요구량_2차(BOD) | 0.656 | 1.000 | 0.572 | 0.843 | 0.565
생화학적산소요구량_3차(BOD) | 0.633 | 0.572 | 1.000 | 0.823 | 0.534
생화학적산소요구량_4차(BOD) | 0.880 | 0.843 | 0.823 | 1.000 | 0.523
공원명 | 0.371 | 0.565 | 0.534 | 0.523 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

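First rows

 | 공원명 | 일련번호 | 지점명 | 생화학적산소요구량_1차(BOD) | 생화학적산소요구량_2차(BOD) | 생화학적산소요구량_3차(BOD) | 생화학적산소요구량_4차(BOD)
0 | 지리산 | 지리-1 | 유평계곡 | 0.3 | 0.3 | 0.3 | 0.3
1 | 지리산 | 지리-2 | 중산리계곡 | 0.4 | 0.3 | 0.2 | 0.3
2 | 지리산 | 지리-3 | 백무동계곡 | 0.4 | 0.3 | 0.3 | 0.3
3 | 지리산 | 지리-4 | 쌍계사계곡 | 0.5 | 0.3 | 0.1 | 0.3
4 | 지리산 | 지리-5 | 대성계곡 | 0.5 | 0.4 | 0.5 | 0.5
5 | 지리산 | 지리-6 | 단천계곡 | 0.5 | 0.4 | 0.2 | 0.4
6 | 지리산 | 지리-7 | 내원계곡 | 0.4 | 0.3 | 0.5 | 0.4
7 | 지리산 | 지리-8 | 장당계곡 | 0.7 | 0.5 | 0.2 | 0.5
8 | 지리산 | 지리-9 | 칠선계곡 | 0.4 | 0.4 | 0.5 | 0.4
9 | 지리산 | 지리-10 | 거림계곡 | 0.4 | 0.5 | 0.2 | 0.4

Last rows

 | 공원명 | 일련번호 | 지점명 | 생화학적산소요구량_1차(BOD) | 생화학적산소요구량_2차(BOD) | 생화학적산소요구량_3차(BOD) | 생화학적산소요구량_4차(BOD)
96 | 소백산 | 소백-3 | 삼가계곡1 | 0.6 | 0.5 | 0.6 | 0.6
97 | 소백산 | 소백-4 | 삼가계곡2 | 0.3 | 0.2 | 0.6 | 0.4
98 | 소백산북부 | 소북-1 | 천동계곡 | 0.2 | 0.5 | 0.3 | 0.3
99 | 소백산북부 | 소북-2 | 죽령계곡 | 0.3 | 0.5 | 0.3 | 0.4
100 | 소백산북부 | 소북-3 | 남천계곡 | 0.2 | 0.2 | 0.3 | 0.2
101 | 소백산북부 | 소북-4 | 어의계곡 | 0.2 | 0.3 | 0.2 | 0.2
102 | 월출산 | 월출-1 | 천황계곡 | 0.1 | 0.5 | 0.3 | 0.3
103 | 월출산 | 월출-2 | 도갑사계곡 | 0.2 | 0.4 | 0.4 | 0.3
104 | 월출산 | 월출-3 | 경포대계곡 | 0.3 | 0.4 | 0.4 | 0.4
105 | 변산반도 | 변산-1 | 직소천 | 0.3 | 0.8 | 0.5 | 0.5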