Overview

Dataset statistics

Number of variables: 16
Number of observations: 23
Missing cells: 156
Missing cells (%): 42.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 140.7 B
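
The missing-cell percentage can be checked by hand, since the number of cells is the number of variables times the number of observations. The sketch below uses only figures printed in this report (the per-column counts are the ones listed in the Alerts section); the variable names are illustrative and not part of the profiling tool.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 16
n_observations = 23
missing_cells = 156

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Per-column missing counts from the Alerts section:
# five fully empty columns, eight columns missing 5 rows, one column missing 1 row.
per_column_missing = [23] * 5 + [5] * 8 + [1]

print(f"total cells: {total_cells}")
print(f"missing: {missing_pct:.1f}%")
print(f"sum of per-column counts: {sum(per_column_missing)}")
```

The two categorical columns report zero missing values, which is why they do not appear in the per-column list.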

Variable types

Unsupported: 13
Text: 1
Categorical: 2

Dataset

Description: 2020 Okjeongho Lake major stream water quality survey results, March (2020년옥정호주요하천수질조사결과3월)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=204347

Alerts

Unnamed: 12 is highly overall correlated with Unnamed: 15 [High correlation]
Unnamed: 15 is highly overall correlated with Unnamed: 12 [High correlation]
Unnamed: 12 is highly imbalanced (53.6%) [Imbalance]
Unnamed: 15 is highly imbalanced (56.3%) [Imbalance]
Unnamed: 0 has 23 (100.0%) missing values [Missing]
2020년 옥정호 주요하천 수질조사 결과(3월) has 1 (4.3%) missing value [Missing]
Unnamed: 2 has 5 (21.7%) missing values [Missing]
Unnamed: 3 has 5 (21.7%) missing values [Missing]
Unnamed: 4 has 5 (21.7%) missing values [Missing]
Unnamed: 5 has 5 (21.7%) missing values [Missing]
Unnamed: 6 has 5 (21.7%) missing values [Missing]
Unnamed: 7 has 5 (21.7%) missing values [Missing]
Unnamed: 8 has 5 (21.7%) missing values [Missing]
Unnamed: 9 has 5 (21.7%) missing values [Missing]
Unnamed: 10 has 23 (100.0%) missing values [Missing]
Unnamed: 11 has 23 (100.0%) missing values [Missing]
Unnamed: 13 has 23 (100.0%) missing values [Missing]
Unnamed: 14 has 23 (100.0%) missing values [Missing]
Unnamed: 0 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 2 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
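
The report does not state how the imbalance percentage is computed. The two figures in the alerts are consistent with one minus the normalized Shannon entropy of the value counts, so the sketch below assumes that definition. The function name `imbalance_score` is hypothetical, and the counts are the ones this report lists under Common Values for Unnamed: 12 and Unnamed: 15.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts.

    Assumed definition: 0 means perfectly balanced classes, 1 means a single
    class holds every observation.
    """
    k = len(counts)
    if k <= 1:
        return 0.0  # assumption: a single class is treated as not imbalanced
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts as listed in this report (the "<NA>" category first).
unnamed_12 = [19, 2, 1, 1]
unnamed_15 = [19, 1, 1, 1, 1]

print(f"Unnamed: 12 -> {100 * imbalance_score(unnamed_12):.1f}%")
print(f"Unnamed: 15 -> {100 * imbalance_score(unnamed_15):.1f}%")
```

Under this assumption both columns score high mainly because 19 of 23 rows fall into the `<NA>` category.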

Reproduction

Analysis started: 2024-03-14 02:12:27.838297
Analysis finished: 2024-03-14 02:12:28.563590
Duration: 0.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

2020년 옥정호 주요하천 수질조사 결과(3월)
Text

Distinct: 22
Distinct (%): 100.0%
Missing: 1
Missing (%): 4.3%
Memory size: 316.0 B

Length

Max length: 58
Median length: 37.5
Mean length: 15
Min length: 2

Characters and Unicode

Total characters: 330
Distinct characters: 84
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
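
As an illustration of how such properties can be read per character, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch, not the profiling tool's own implementation: the helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical, the sample string assumes the subscript in "PO₄-P" is U+2084, and scripts and blocks are left out because the standard library does not provide them.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes, matching the labels used in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "No": "Other Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
    "Cc": "Control",
}

def category_counts(text):
    """Count characters per Unicode general category (two-letter codes)."""
    return Counter(unicodedata.category(ch) for ch in text)

# One row label from the sample; \u2084 is SUBSCRIPT FOUR (an assumption, see above).
label = "PO\u2084-P (mg/L)"
counts = category_counts(label)

for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Summing such counts over every value of a text column gives a table of the same shape as "Most occurring categories".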

Unique

Unique: 22
Unique (%): 100.0%

Sample

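1st row: 1. 조사지점 : 임실천, 오원천, 섬진강본류, 옥녀동천, 추령천, 도원천하류, 동진강상류, 섬진강댐하류 [1. Survey sites: Imsilcheon, Owoncheon, Seomjingang main stream, Oknyeodongcheon, Churyeongcheon, Dowoncheon downstream, Dongjingang upstream, Seomjingang Dam downstream]
2nd row: 2. 조사일자 : 2020년 3월 23일 [2. Survey date: March 23, 2020]
3rd row: 3. 조사항목 : 총 17항목(현장 측정: 수온, pH, DO, EC 와 BOD 등 13개) [3. Survey items: 17 in total (field measurements of water temperature, pH, DO, and EC, plus 13 others such as BOD)]
4th row: 4. 조사결과 [4. Survey results]
5th row: 지점명 [Site name]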
Value | Count | Frequency (%)
mg/l | 11 | 16.4%
 | 3 | 4.5%
bod | 2 | 3.0%
ec | 2 | 3.0%
do | 2 | 3.0%
ph | 2 | 3.0%
수온 [water temperature] | 2 | 3.0%
1 | 1 | 1.5%
ss | 1 | 1.5%
조사결과 [survey results] | 1 | 1.5%
Other values (40) | 40 | 59.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 29 | 8.8%
) | 17 | 5.2%
( | 17 | 5.2%
(control character) | 16 | 4.8%
/ | 15 | 4.5%
m | 15 | 4.5%
L | 13 | 3.9%
g | 12 | 3.6%
, | 10 | 3.0%
O | 9 | 2.7%
Other values (74) | 177 | 53.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 95 | 28.8%
Uppercase Letter | 55 | 16.7%
Lowercase Letter | 34 | 10.3%
Other Punctuation | 33 | 10.0%
Space Separator | 29 | 8.8%
Decimal Number | 24 | 7.3%
Close Punctuation | 17 | 5.2%
Open Punctuation | 17 | 5.2%
Control | 16 | 4.8%
Dash Punctuation | 7 | 2.1%
Other values (2) | 3 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 3 | 3.2%
Other values (37) | 54 | 56.8%

Uppercase Letter
Value | Count | Frequency (%)
L | 13 | 23.6%
O | 9 | 16.4%
N | 7 | 12.7%
D | 5 | 9.1%
C | 5 | 9.1%
P | 3 | 5.5%
T | 3 | 5.5%
H | 3 | 5.5%
S | 3 | 5.5%
B | 2 | 3.6%

Lowercase Letter
Value | Count | Frequency (%)
m | 15 | 44.1%
g | 12 | 35.3%
p | 2 | 5.9%
c | 1 | 2.9%
l | 1 | 2.9%
h | 1 | 2.9%
a | 1 | 2.9%
μ | 1 | 2.9%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 25.0%
3 | 6 | 25.0%
2 | 5 | 20.8%
1 | 5 | 20.8%
7 | 1 | 4.2%
4 | 1 | 4.2%

Other Punctuation
Value | Count | Frequency (%)
/ | 15 | 45.5%
, | 10 | 30.3%
: | 4 | 12.1%
. | 4 | 12.1%

Other Symbol
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 29 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 17 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 17 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 16 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Other Number
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 146 | 44.2%
Hangul | 95 | 28.8%
Latin | 88 | 26.7%
Greek | 1 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 3 | 3.2%
Other values (37) | 54 | 56.8%

Common
Value | Count | Frequency (%)
(space) | 29 | 19.9%
) | 17 | 11.6%
( | 17 | 11.6%
(control character) | 16 | 11.0%
/ | 15 | 10.3%
, | 10 | 6.8%
- | 7 | 4.8%
0 | 6 | 4.1%
3 | 6 | 4.1%
2 | 5 | 3.4%
Other values (8) | 18 | 12.3%

Latin
Value | Count | Frequency (%)
m | 15 | 17.0%
L | 13 | 14.8%
g | 12 | 13.6%
O | 9 | 10.2%
N | 7 | 8.0%
D | 5 | 5.7%
C | 5 | 5.7%
P | 3 | 3.4%
T | 3 | 3.4%
H | 3 | 3.4%
Other values (8) | 13 | 14.8%

Greek
Value | Count | Frequency (%)
μ | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 231 | 70.0%
Hangul | 95 | 28.8%
None | 2 | 0.6%
CJK Compat | 1 | 0.3%
Letterlike Symbols | 1 | 0.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 29 | 12.6%
) | 17 | 7.4%
( | 17 | 7.4%
(control character) | 16 | 6.9%
/ | 15 | 6.5%
m | 15 | 6.5%
L | 13 | 5.6%
g | 12 | 5.2%
, | 10 | 4.3%
O | 9 | 3.9%
Other values (23) | 78 | 33.8%

Hangul
Value | Count | Frequency (%)
 | 5 | 5.3%
 | 5 | 5.3%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 4 | 4.2%
 | 3 | 3.2%
Other values (37) | 54 | 56.8%

None
Value | Count | Frequency (%)
 | 1 | 50.0%
μ | 1 | 50.0%

CJK Compat
Value | Count | Frequency (%)
 | 1 | 100.0%

Letterlike Symbols
Value | Count | Frequency (%)
 | 1 | 100.0%

Unnamed: 2
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 3
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 21.7%
Memory size: 316.0 B

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

Unnamed: 12
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 18
Median length: 4
Mean length: 4.6086957
Min length: 4

Unique

Unique: 2
Unique (%): 8.7%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
13.7 | 2 | 8.7%
13.8 | 1 | 4.3%
13.733333333333334 | 1 | 4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 23
Missing (%): 100.0%
Memory size: 339.0 B

Unnamed: 15
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 316.0 B

Length

Max length: 18
Median length: 4
Mean length: 4.7391304
Min length: 4

Unique

Unique: 4
Unique (%): 17.4%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 19 | 82.6%
145.0 | 1 | 4.3%
148.0 | 1 | 4.3%
149.0 | 1 | 4.3%
147.33333333333334 | 1 | 4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

Correlations

[Figure: correlation heatmap]
Variable | 2020년 옥정호 주요하천 수질조사 결과(3월) | Unnamed: 12 | Unnamed: 15
2020년 옥정호 주요하천 수질조사 결과(3월) | 1.000 | 1.000 | 1.000
Unnamed: 12 | 1.000 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | Unnamed: 12 | Unnamed: 15
Unnamed: 12 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | Unnamed: 12 | Unnamed: 15
Unnamed: 12 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
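
One way to read "nullity correlation" is as the Pearson correlation between the missing-value indicators of two columns. The sketch below assumes that reading; the function name `nullity_correlation` and the toy columns are hypothetical and are not taken from the dataset or from the profiling tool's code.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    A value of 1 means the columns are missing in exactly the same rows,
    -1 means one is present exactly where the other is missing.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always or never missing has no variance to correlate.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Toy columns (None marks a missing cell).
x = [None, None, 1.2, 3.4, 5.6]
y = [None, None, 7.0, 8.0, 9.0]   # missing in the same rows as x
z = [1, 2, None, None, None]      # missing exactly where x is present

print(nullity_correlation(x, y))
print(nullity_correlation(x, z))
```

Columns that are entirely empty, such as Unnamed: 0 in this dataset, carry no information for this measure, which is why the sketch returns NaN for them.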

Sample

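First rows

Index | Unnamed: 0 | 2020년 옥정호 주요하천 수질조사 결과(3월) | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
0 | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | <NA> | 1. 조사지점 : 임실천, 오원천, 섬진강본류, 옥녀동천, 추령천, 도원천하류, 동진강상류, 섬진강댐하류 [1. Survey sites] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | <NA> | 2. 조사일자 : 2020년 3월 23일 [2. Survey date: March 23, 2020] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | <NA> | 3. 조사항목 : 총 17항목(현장 측정: 수온, pH, DO, EC 와 BOD 등 13개) [3. Survey items: 17 in total] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | <NA> | 4. 조사결과 [4. Survey results] | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | <NA> | 지점명 [Site name] | 임실천 | 오원천 | 섬진강본류 | 옥녀동천 | 추령천 | 도원천하류 | 동진강상류 | 섬진강댐하류 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | 수온 (℃) [Water temperature] | 13.7 | 15.4 | 12.3 | 11.4 | 11.4 | 9.7 | 7.3 | 9.2 | <NA> | <NA> | 13.8 | <NA> | <NA> | 145.0
7 | <NA> | pH | 6.9 | 7 | 7.1 | 7.4 | 7.4 | 7.3 | 7.5 | 8 | <NA> | <NA> | 13.7 | <NA> | <NA> | 148.0
8 | <NA> | DO (mg/L) | 12.6 | 12.7 | 13.4 | 13.4 | 11.8 | 12.8 | 13.4 | 12.3 | <NA> | <NA> | 13.7 | <NA> | <NA> | 149.0
9 | <NA> | EC (μS/cm) | 312 | 152 | 162 | 147 | 121 | 128 | 141 | 132 | <NA> | <NA> | 13.733333 | <NA> | <NA> | 147.333333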
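
Last rows

Index | Unnamed: 0 | 2020년 옥정호 주요하천 수질조사 결과(3월) | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15
13 | <NA> | T-N (mg/L) | 2.951 | 1.994 | 2.127 | 2.312 | 2.221 | 1.657 | 1.811 | 1.523 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
14 | <NA> | T-P (mg/L) | 0.039 | 0.019 | 0.025 | 0.018 | 0.01 | 0.01 | 0.008 | 0.006 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15 | <NA> | NH3-N (mg/L) | 0.838 | 0.079 | 0.033 | 0.03 | 0.035 | 0.053 | 0.028 | 0.024 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
16 | <NA> | NO2-N (mg/L) | 0.091 | 0.02 | 0.019 | 0.014 | 0.023 | 0.005 | 0.004 | 0.004 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
17 | <NA> | NO3-N (mg/L) | 1.893 | 1.098 | 1.913 | 1.718 | 1.72 | 0.845 | 0.955 | 0.86 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
18 | <NA> | PO₄-P (mg/L) | 0.017 | 0.006 | 불검출 [not detected] | 0.008 | 0.003 | 0.009 | 불검출 [not detected] | 불검출 [not detected] | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
19 | <NA> | Chl-a (mg/㎥) | 5.6 | 4.7 | 6.4 | 6.2 | 0.9 | 2.3 | 1.8 | 2.2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20 | <NA> | TOC (mg/L) | 2.5 | 1.6 | 2.1 | 1.7 | 1.5 | 2.2 | 1.9 | 2.2 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
21 | <NA> | 총대장균군 (총대장균군/100mL) [Total coliforms (per 100 mL)] | 20000 | 460 | 340 | 2000 | 1400 | 2600 | 2400 | 1400 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
22 | <NA> | 분원성대장균군 (분원성대장균군수/100mL) [Fecal coliforms (count per 100 mL)] | 2800 | 12 | 15 | 12 | 29 | 36 | 17 | 10 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>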