Overview

Dataset statistics

Number of variables: 4
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 867.0 B
Average record size in memory: 41.3 B

Variable types

Text: 1
Numeric: 3

Dataset

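Description: This file compiles the test items and fees for river water and lake water, as posted on the website of the Jeollanam-do Institute of Health and Environment.
Author: Jeollanam-do
URL: https://www.data.go.kr/data/15041960/fileData.do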

Alerts

수수료(원) (fee, KRW) is highly correlated overall with 하천수 (river water) [High correlation]
하천수 (river water) is highly correlated overall with 수수료(원) (fee, KRW) and 1 other field [High correlation]
호소수 (lake water) is highly correlated overall with 하천수 (river water) [High correlation]
검사항목 (test item) has unique values [Unique]
하천수 (river water) has 16 (76.2%) zeros [Zeros]
호소수 (lake water) has 16 (76.2%) zeros [Zeros]
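
The zero and distinct-value figures behind these alerts can be re-derived from the value counts that the report lists for 호소수. The following is a minimal Python sketch, assuming the frequency table for that column is complete (16 zeros plus one occurrence each of 800, 2800, 3400, 3700 and 7300); the variable names are illustrative and not part of the report.

```python
from collections import Counter

# Value counts for 호소수 as listed in the report's frequency table
# (assumed complete: 16 zeros plus five single non-zero fees).
value_counts = Counter({0: 16, 800: 1, 2800: 1, 3400: 1, 3700: 1, 7300: 1})

n_rows = sum(value_counts.values())   # number of observations
n_zeros = value_counts[0]             # rows whose value is exactly zero
n_distinct = len(value_counts)        # number of distinct values

zeros_pct = round(100 * n_zeros / n_rows, 1)
distinct_pct = round(100 * n_distinct / n_rows, 1)

print(f"rows={n_rows} zeros={n_zeros} ({zeros_pct}%) "
      f"distinct={n_distinct} ({distinct_pct}%)")
```

Under that assumption the sketch reproduces the 21 observations, the 76.2% share of zeros and the 28.6% share of distinct values reported for the column.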

Reproduction

Analysis started: 2023-12-12 16:43:24.487346
Analysis finished: 2023-12-12 16:43:25.926946
Duration: 1.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

검사항목 (test item)
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 47
Median length: 13
Mean length: 10.142857
Min length: 3

Characters and Unicode

Total characters: 213
Distinct characters: 96
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
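
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in the first five sample values of 검사항목. It is not the profiler's own implementation; the helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced here, and the mapping only covers the categories that occur in these five strings.

```python
import unicodedata
from collections import Counter

# First five sample values of 검사항목, copied from the report.
samples = [
    "수소이온농도(pH)",
    "화학적산소요구량(COD)",
    "생물화학적산소요구량(BOD)",
    "부유물질량(SS)",
    "용존산소량(DO)",
]

# Readable names for the two-letter general-category codes seen here.
CATEGORY_NAMES = {
    "Lo": "Other Letter",       # includes precomposed Hangul syllables
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


total = Counter()
for value in samples:
    total += category_counts(value)

for name, count in total.most_common():
    print(f"{name}: {count}")
```

The `unicodedata` module exposes general categories but not scripts or blocks, so the script and block breakdowns in this report would need a different data source.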

Unique

Unique: 21
Unique (%): 100.0%

Sample

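1st row: 수소이온농도(pH) (hydrogen ion concentration)
2nd row: 화학적산소요구량(COD) (chemical oxygen demand)
3rd row: 생물화학적산소요구량(BOD) (biochemical oxygen demand)
4th row: 부유물질량(SS) (suspended solids)
5th row: 용존산소량(DO) (dissolved oxygen)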

Value | Count | Frequency (%)
수소이온농도(ph | 1 | 4.2%
화학적산소요구량(cod | 1 | 4.2%
클로로필a | 1 | 4.2%
pce | 1 | 4.2%
1,2디클로로에탄,디클로로메탄 | 1 | 4.2%
클로로포름 | 1 | 4.2%
휘발성저급탄화수소류(사염화탄소 | 1 | 4.2%
음이온계면활성제(abs | 1 | 4.2%
폴리크로리네이티드비페닐(pcb | 1 | 4.2%
6가크롬(cr6 | 1 | 4.2%
Other values (14) | 14 | 58.3%

Most occurring characters

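Value | Count | Frequency (%)
( | 16 | 7.5%
) | 16 | 7.5%
[Hangul character] | 9 | 4.2%
[Hangul character] | 8 | 3.8%
C | 6 | 2.8%
, | 5 | 2.3%
[Hangul character] | 4 | 1.9%
P | 4 | 1.9%
[Hangul character] | 4 | 1.9%
[Hangul character] | 4 | 1.9%
Other values (86) | 137 | 64.3%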

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 128 | 60.1%
Uppercase Letter | 31 | 14.6%
Open Punctuation | 16 | 7.5%
Close Punctuation | 16 | 7.5%
Lowercase Letter | 7 | 3.3%
Other Punctuation | 5 | 2.3%
Decimal Number | 4 | 1.9%
Space Separator | 3 | 1.4%
Dash Punctuation | 2 | 0.9%
Math Symbol | 1 | 0.5%

Most frequent character per category

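Other Letter
Value | Count | Frequency (%)
[Hangul character] | 9 | 7.0%
[Hangul character] | 8 | 6.2%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
Other values (59) | 83 | 64.8%

Uppercase Letter
Value | Count | Frequency (%)
C | 6 | 19.4%
P | 4 | 12.9%
S | 3 | 9.7%
B | 3 | 9.7%
D | 3 | 9.7%
O | 3 | 9.7%
A | 2 | 6.5%
N | 2 | 6.5%
T | 2 | 6.5%
H | 2 | 6.5%

Lowercase Letter
Value | Count | Frequency (%)
p | 1 | 14.3%
a | 1 | 14.3%
d | 1 | 14.3%
g | 1 | 14.3%
b | 1 | 14.3%
r | 1 | 14.3%
s | 1 | 14.3%

Decimal Number
Value | Count | Frequency (%)
6 | 2 | 50.0%
1 | 1 | 25.0%
2 | 1 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 16 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 5 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%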

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 128 | 60.1%
Common | 47 | 22.1%
Latin | 38 | 17.8%

Most frequent character per script

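Hangul
Value | Count | Frequency (%)
[Hangul character] | 9 | 7.0%
[Hangul character] | 8 | 6.2%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
Other values (59) | 83 | 64.8%

Latin
Value | Count | Frequency (%)
C | 6 | 15.8%
P | 4 | 10.5%
S | 3 | 7.9%
B | 3 | 7.9%
D | 3 | 7.9%
O | 3 | 7.9%
A | 2 | 5.3%
N | 2 | 5.3%
T | 2 | 5.3%
H | 2 | 5.3%
Other values (8) | 8 | 21.1%

Common
Value | Count | Frequency (%)
( | 16 | 34.0%
) | 16 | 34.0%
, | 5 | 10.6%
(space) | 3 | 6.4%
6 | 2 | 4.3%
- | 2 | 4.3%
1 | 1 | 2.1%
2 | 1 | 2.1%
+ | 1 | 2.1%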

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 128 | 60.1%
ASCII | 85 | 39.9%

Most frequent character per block

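ASCII
Value | Count | Frequency (%)
( | 16 | 18.8%
) | 16 | 18.8%
C | 6 | 7.1%
, | 5 | 5.9%
P | 4 | 4.7%
S | 3 | 3.5%
B | 3 | 3.5%
D | 3 | 3.5%
O | 3 | 3.5%
(space) | 3 | 3.5%
Other values (17) | 23 | 27.1%

Hangul
Value | Count | Frequency (%)
[Hangul character] | 9 | 7.0%
[Hangul character] | 8 | 6.2%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 4 | 3.1%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
[Hangul character] | 3 | 2.3%
Other values (59) | 83 | 64.8%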

수수료(원) (fee, KRW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 18
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13785.714
Minimum: 800
Maximum: 125200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 800
5th percentile: 1300
Q1: 3400
Median: 6900
Q3: 13200
95th percentile: 20300
Maximum: 125200
Range: 124400
Interquartile range (IQR): 9800

Descriptive statistics

Standard deviation: 26091.92
Coefficient of variation (CV): 1.8926781
Kurtosis: 18.966082
Mean: 13785.714
Median Absolute Deviation (MAD): 4100
Skewness: 4.2662449
Sum: 289500
Variance: 6.8078829 × 10^8
Monotonicity: Not monotonic
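
Several of these figures can be checked with Python's standard `statistics` module. The sketch below assumes that the 21 fee values can be reconstructed from the sorted value tables listed below for this variable (the ten smallest and ten largest distinct values together cover every observation). It also assumes sample (n - 1) conventions for variance and standard deviation and linearly interpolated quartiles, which is what the reported numbers are consistent with.

```python
import statistics

# Fees reconstructed from the report's smallest-10 and largest-10 value
# tables, which together account for all 21 observations (an assumption).
fees = [800, 1300, 2800, 2800, 3000, 3400, 3700, 5800, 6900, 6900, 6900,
        7300, 10600, 11400, 13100, 13200, 13900, 14800, 15400, 20300, 125200]

mean = statistics.mean(fees)
std = statistics.stdev(fees)           # sample standard deviation (n - 1)
variance = statistics.variance(fees)   # sample variance (n - 1)
cv = std / mean                        # coefficient of variation
median = statistics.median(fees)
mad = statistics.median(abs(x - median) for x in fees)  # median absolute deviation

# "inclusive" interpolates linearly between the min and max of the data.
q1, _, q3 = statistics.quantiles(fees, n=4, method="inclusive")
iqr = q3 - q1

print(f"sum={sum(fees)} mean={mean:.3f} std={std:.2f} cv={cv:.7f}")
print(f"median={median} mad={mad} q1={q1} q3={q3} iqr={iqr}")
```

The single fee of 125200 is what drives the large standard deviation relative to the median of 6900.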
Histogram with fixed size bins (bins=18)

Common values
Value | Count | Frequency (%)
6900 | 3 | 14.3%
2800 | 2 | 9.5%
800 | 1 | 4.8%
10600 | 1 | 4.8%
3000 | 1 | 4.8%
1300 | 1 | 4.8%
15400 | 1 | 4.8%
13200 | 1 | 4.8%
125200 | 1 | 4.8%
20300 | 1 | 4.8%
Other values (8) | 8 | 38.1%

Extreme values (smallest first)
Value | Count | Frequency (%)
800 | 1 | 4.8%
1300 | 1 | 4.8%
2800 | 2 | 9.5%
3000 | 1 | 4.8%
3400 | 1 | 4.8%
3700 | 1 | 4.8%
5800 | 1 | 4.8%
6900 | 3 | 14.3%
7300 | 1 | 4.8%
10600 | 1 | 4.8%

Extreme values (largest first)
Value | Count | Frequency (%)
125200 | 1 | 4.8%
20300 | 1 | 4.8%
15400 | 1 | 4.8%
14800 | 1 | 4.8%
13900 | 1 | 4.8%
13200 | 1 | 4.8%
13100 | 1 | 4.8%
11400 | 1 | 4.8%
10600 | 1 | 4.8%
7300 | 1 | 4.8%

하천수 (river water)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 785.71429
Minimum: 0
Maximum: 5800
Zeros: 16
Zeros (%): 76.2%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 3700
Maximum: 5800
Range: 5800
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1649.3289
Coefficient of variation (CV): 2.0991458
Kurtosis: 3.5521224
Mean: 785.71429
Median Absolute Deviation (MAD): 0
Skewness: 2.0829061
Sum: 16500
Variance: 2720285.7
Monotonicity: Not monotonic
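
The skewness and kurtosis above are consistent with the bias-adjusted sample estimators (often written G1 and G2, with kurtosis reported as excess kurtosis). That reading is an inference from the numbers, not something the report states. The sketch below applies those formulas to the 하천수 values, assuming the column is exactly the frequency table listed for it (16 zeros plus 800, 2800, 3400, 3700 and 5800); the function names are illustrative.

```python
import math

# 하천수 values reconstructed from the report's frequency table
# (assumed complete: 16 zeros plus five single non-zero fees).
values = [0] * 16 + [800, 2800, 3400, 3700, 5800]


def central_moment(data, k):
    """k-th central moment using the population (1/n) convention."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def sample_skewness(data):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson, G1)."""
    n = len(data)
    g1 = central_moment(data, 3) / central_moment(data, 2) ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def sample_excess_kurtosis(data):
    """Bias-adjusted sample excess kurtosis (G2)."""
    n = len(data)
    g2 = central_moment(data, 4) / central_moment(data, 2) ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


skewness = sample_skewness(values)
kurtosis = sample_excess_kurtosis(values)
print(f"skewness={skewness:.4f} excess kurtosis={kurtosis:.4f}")
```

Both values are strongly positive because 16 of the 21 observations are zero and the remaining five form a long right tail.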
Histogram with fixed size bins (bins=6)
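
Common values
Value | Count | Frequency (%)
0 | 16 | 76.2%
800 | 1 | 4.8%
5800 | 1 | 4.8%
2800 | 1 | 4.8%
3700 | 1 | 4.8%
3400 | 1 | 4.8%

Extreme values (smallest first)
Value | Count | Frequency (%)
0 | 16 | 76.2%
800 | 1 | 4.8%
2800 | 1 | 4.8%
3400 | 1 | 4.8%
3700 | 1 | 4.8%
5800 | 1 | 4.8%

Extreme values (largest first)
Value | Count | Frequency (%)
5800 | 1 | 4.8%
3700 | 1 | 4.8%
3400 | 1 | 4.8%
2800 | 1 | 4.8%
800 | 1 | 4.8%
0 | 16 | 76.2%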

호소수 (lake water)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 857.14286
Minimum: 0
Maximum: 7300
Zeros: 16
Zeros (%): 76.2%
Negative: 0
Negative (%): 0.0%
Memory size: 321.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 3700
Maximum: 7300
Range: 7300
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1891.9755
Coefficient of variation (CV): 2.2073048
Kurtosis: 6.2597453
Mean: 857.14286
Median Absolute Deviation (MAD): 0
Skewness: 2.4816238
Sum: 18000
Variance: 3579571.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
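
Common values
Value | Count | Frequency (%)
0 | 16 | 76.2%
800 | 1 | 4.8%
7300 | 1 | 4.8%
2800 | 1 | 4.8%
3700 | 1 | 4.8%
3400 | 1 | 4.8%

Extreme values (smallest first)
Value | Count | Frequency (%)
0 | 16 | 76.2%
800 | 1 | 4.8%
2800 | 1 | 4.8%
3400 | 1 | 4.8%
3700 | 1 | 4.8%
7300 | 1 | 4.8%

Extreme values (largest first)
Value | Count | Frequency (%)
7300 | 1 | 4.8%
3700 | 1 | 4.8%
3400 | 1 | 4.8%
2800 | 1 | 4.8%
800 | 1 | 4.8%
0 | 16 | 76.2%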

Interactions

[Figures: 9 interaction plots]

Correlations

Variable | 검사항목 | 수수료(원) | 하천수 | 호소수
검사항목 | 1.000 | 1.000 | 1.000 | 1.000
수수료(원) | 1.000 | 1.000 | 0.000 | 0.000
하천수 | 1.000 | 0.000 | 1.000 | 0.991
호소수 | 1.000 | 0.000 | 0.991 | 1.000

Variable | 수수료(원) | 하천수 | 호소수
수수료(원) | 1.000 | -0.509 | -0.422
하천수 | -0.509 | 1.000 | 0.637
호소수 | -0.422 | 0.637 | 1.000
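
The extracted text does not say which coefficient the second matrix uses, but its values are consistent with Spearman's rank correlation computed with average ranks for ties. The sketch below illustrates that reading and is not the profiler's own code. It uses the rows from the Sample section; row 10 does not appear there, so its values (fee 13900, both other columns 0) are inferred from the column sums and zero counts. The helper names are hypothetical.

```python
import math

# Columns in row order 0..20. Row 10 is not shown in the report's sample;
# its values (13900, 0, 0) are inferred from the column sums and zero counts.
fee = [800, 7300, 5800, 2800, 2800, 14800, 11400, 3700, 3400, 6900, 13900,
       13100, 10600, 20300, 6900, 6900, 125200, 13200, 15400, 1300, 3000]
river = [800, 0, 5800, 2800, 0, 0, 0, 3700, 3400] + [0] * 12
lake = [800, 7300, 0, 2800, 0, 0, 0, 3700, 3400] + [0] * 12


def average_ranks(data):
    """Rank values from 1; tied values share the mean of their positions."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    ranks = [0.0] * len(data)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and data[order[end + 1]] == data[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


print(f"fee vs river:  {spearman(fee, river):.3f}")
print(f"fee vs lake:   {spearman(fee, lake):.3f}")
print(f"river vs lake: {spearman(river, lake):.3f}")
```

Because 16 of the 21 values in 하천수 and 호소수 are tied at zero, the coefficients depend heavily on how ties are ranked, which is why the sketch spells out the average-rank step.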

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

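First rows

Index | 검사항목 | English gloss | 수수료(원) | 하천수 | 호소수
0 | 수소이온농도(pH) | Hydrogen ion concentration (pH) | 800 | 800 | 800
1 | 화학적산소요구량(COD) | Chemical oxygen demand (COD) | 7300 | 0 | 7300
2 | 생물화학적산소요구량(BOD) | Biochemical oxygen demand (BOD) | 5800 | 5800 | 0
3 | 부유물질량(SS) | Suspended solids (SS) | 2800 | 2800 | 2800
4 | 용존산소량(DO) | Dissolved oxygen (DO) | 2800 | 0 | 0
5 | 총대장균군 | Total coliforms | 14800 | 0 | 0
6 | 분원성대장균군 | Fecal coliforms | 11400 | 0 | 0
7 | 총질소(T-N) | Total nitrogen (T-N) | 3700 | 3700 | 3700
8 | 총인(T-P) | Total phosphorus (T-P) | 3400 | 3400 | 3400
9 | 카드뮴(Cd) | Cadmium (Cd) | 6900 | 0 | 0

Last rows

Index | 검사항목 | English gloss | 수수료(원) | 하천수 | 호소수
11 | 시안(CN) | Cyanide (CN) | 13100 | 0 | 0
12 | 수은(Hg) | Mercury (Hg) | 10600 | 0 | 0
13 | 유기인 | Organophosphorus | 20300 | 0 | 0
14 | 납(Pb) | Lead (Pb) | 6900 | 0 | 0
15 | 6가크롬(Cr6+) | Hexavalent chromium (Cr6+) | 6900 | 0 | 0
16 | 폴리크로리네이티드비페닐(PCB) | Polychlorinated biphenyls (PCB) | 125200 | 0 | 0
17 | 음이온계면활성제(ABS) | Anionic surfactants (ABS) | 13200 | 0 | 0
18 | 휘발성저급탄화수소류(사염화탄소, 클로로포름, 1,2디클로로에탄,디클로로메탄, PCE) | Volatile low-molecular-weight hydrocarbons (carbon tetrachloride, chloroform, 1,2-dichloroethane, dichloromethane, PCE) | 15400 | 0 | 0
19 | 클로로필a | Chlorophyll a | 1300 | 0 | 0
20 | 전기전도도 | Electrical conductivity | 3000 | 0 | 0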