Overview

Dataset statistics

Number of variables: 4
Number of observations: 27
Missing cells: 58
Missing cells (%): 53.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 996.0 B
Average record size in memory: 36.9 B
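
The derived figures above follow from the raw counts. The sketch below recomputes them, assuming the per-column missing counts listed under Alerts (12, 20, 9, 17); the variable names are illustrative and are not part of the report.

```python
# Recompute the derived dataset statistics from the raw counts in this report.
n_variables = 4
n_observations = 27
total_bytes = 996.0

# Per-column missing counts, taken from the Alerts section.
missing_per_column = {
    "Unnamed: 0": 12,
    "Unnamed: 1": 20,
    "Unnamed: 2": 9,
    "Unnamed: 3": 17,
}

total_cells = n_variables * n_observations            # 4 * 27 = 108
missing_cells = sum(missing_per_column.values())      # 58
missing_pct = round(100 * missing_cells / total_cells, 1)
avg_record_bytes = round(total_bytes / n_observations, 1)

print(missing_cells, missing_pct, avg_record_bytes)   # 58 53.7 36.9
```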

Variable types

Text: 4

Dataset

Description: Drinking water test items and test frequency (먹는물검사항목및검사주기)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202685

Alerts

Unnamed: 0 has 12 (44.4%) missing values (Missing)
Unnamed: 1 has 20 (74.1%) missing values (Missing)
Unnamed: 2 has 9 (33.3%) missing values (Missing)
Unnamed: 3 has 17 (63.0%) missing values (Missing)

Reproduction

Analysis started: 2024-03-14 01:21:48.137137
Analysis finished: 2024-03-14 01:21:48.584247
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Text

MISSING 

Distinct: 14
Distinct (%): 93.3%
Missing: 12
Missing (%): 44.4%
Memory size: 348.0 B

Length

Max length: 180
Median length: 101
Mean length: 31.866667
Min length: 1

Characters and Unicode

Total characters: 478
Distinct characters: 119
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
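
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical. The standard library returns two-letter codes (`Lo` = Other Letter, `Nd` = Decimal Number, `Pe` = Close Punctuation) and has no script or block lookup, so only the category view is shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values of "Unnamed: 2" in this report.
sample = "매주검사1)"
counts = category_counts(sample)
print(dict(counts))  # {'Lo': 4, 'Nd': 1, 'Pe': 1}
```

The four Hangul syllables fall under Other Letter, the digit under Decimal Number, and the closing parenthesis under Close Punctuation, which are the labels used in the category tables of this report.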

Unique

Unique: 13
Unique (%): 86.7%
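
The Distinct and Unique percentages are consistent with being taken over non-missing values only: 27 rows minus 12 missing leaves 15, and 14/15 and 13/15 give 93.3% and 86.7%. The sketch below shows that reading on a hypothetical toy column; `distinct_and_unique` is an illustrative helper, not a ydata-profiling function.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, present) counts, ignoring missing values (None)."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)                               # values that occur at least once
    unique = sum(1 for n in freq.values() if n == 1)   # values that occur exactly once
    return distinct, unique, len(present)


# Hypothetical toy column: "b" repeats and one entry is missing.
toy = ["a", "b", "b", None, "c"]
distinct, unique, present = distinct_and_unique(toy)
print(distinct, unique, present)  # 3 2 4

# Figures reported for "Unnamed: 0": 27 rows, 12 missing, 14 distinct, 13 unique.
non_missing = 27 - 12
distinct_pct = round(100 * 14 / non_missing, 1)
unique_pct = round(100 * 13 / non_missing, 1)
print(distinct_pct, unique_pct)  # 93.3 86.7
```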

Sample

1st row: 먹는물 검사항목 및 검사주기 [Drinking water test items and test frequency]
2nd row: (whitespace)
3rd row: 구 분 [Category]
4th row: 광역 [Regional]
5th row: ·
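Value | Count | Frequency (%)
1회 [once] | 6 | 4.8%
검사 [test] | 5 | 4.0%
항목은 [items] | 4 | 3.2%
(not preserved) | 4 | 3.2%
대장균군 [coliforms] | 4 | 3.2%
따라 [according to] | 4 | 3.2%
지난 [past] | 3 | 2.4%
이상으로 [or more] | 3 | 2.4%
조정하여 [adjusted] | 3 | 2.4%
가능 [possible] | 3 | 2.4%
Other values (70) | 87 | 69.0%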

Most occurring characters

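Value | Count | Frequency (%)
(space) | 114 | 23.8%
, | 17 | 3.6%
(not preserved) | 14 | 2.9%
(not preserved) | 14 | 2.9%
(not preserved) | 10 | 2.1%
1 | 9 | 1.9%
(not preserved) | 8 | 1.7%
(not preserved) | 8 | 1.7%
(not preserved) | 8 | 1.7%
(not preserved) | 7 | 1.5%
Other values (109) | 269 | 56.3%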

Most occurring categories

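Value | Count | Frequency (%)
Other Letter | 324 | 67.8%
Space Separator | 115 | 24.1%
Other Punctuation | 20 | 4.2%
Decimal Number | 14 | 2.9%
Close Punctuation | 4 | 0.8%
Open Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 10 | 3.1%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
Other values (97) | 234 | 72.2%

Other Punctuation
Value | Count | Frequency (%)
, | 17 | 85.0%
(not preserved) | 1 | 5.0%
· | 1 | 5.0%
(not preserved) | 1 | 5.0%

Decimal Number
Value | Count | Frequency (%)
1 | 9 | 64.3%
3 | 3 | 21.4%
4 | 1 | 7.1%
2 | 1 | 7.1%

Space Separator
Value | Count | Frequency (%)
(space) | 114 | 99.1%
(non-ASCII space) | 1 | 0.9%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%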

Most occurring scripts

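Value | Count | Frequency (%)
Hangul | 324 | 67.8%
Common | 154 | 32.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 10 | 3.1%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
Other values (97) | 234 | 72.2%

Common
Value | Count | Frequency (%)
(space) | 114 | 74.0%
, | 17 | 11.0%
1 | 9 | 5.8%
) | 4 | 2.6%
3 | 3 | 1.9%
4 | 1 | 0.6%
(not preserved) | 1 | 0.6%
( | 1 | 0.6%
· | 1 | 0.6%
(not preserved) | 1 | 0.6%
Other values (2) | 2 | 1.3%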

Most occurring blocks

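Value | Count | Frequency (%)
Hangul | 324 | 67.8%
ASCII | 150 | 31.4%
Punctuation | 2 | 0.4%
None | 2 | 0.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 114 | 76.0%
, | 17 | 11.3%
1 | 9 | 6.0%
) | 4 | 2.7%
3 | 3 | 2.0%
4 | 1 | 0.7%
( | 1 | 0.7%
2 | 1 | 0.7%

Hangul
Value | Count | Frequency (%)
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 10 | 3.1%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 8 | 2.5%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
(not preserved) | 7 | 2.2%
Other values (97) | 234 | 72.2%

Punctuation
Value | Count | Frequency (%)
(not preserved) | 1 | 50.0%
(not preserved) | 1 | 50.0%

None
Value | Count | Frequency (%)
· | 1 | 50.0%
(non-ASCII space) | 1 | 50.0%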

Unnamed: 1
Text

MISSING 

Distinct: 6
Distinct (%): 85.7%
Missing: 20
Missing (%): 74.1%
Memory size: 348.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.4285714
Min length: 4

Characters and Unicode

Total characters: 31
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 5
Unique (%): 71.4%

Sample

1st row: 정 수 장 [Water purification plant]
2nd row: 수도꼭지 [Tap]
3rd row: 수 도 관 [Water pipe]
4th row: 노후지역 [Aging area]
5th row: 수도꼭지 [Tap]
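Value | Count | Frequency (%)
수도꼭지 [tap] | 2 | 16.7%
(not preserved) | 2 | 16.7%
(not preserved) | 1 | 8.3%
(not preserved) | 1 | 8.3%
(not preserved) | 1 | 8.3%
(not preserved) | 1 | 8.3%
노후지역 [aging area] | 1 | 8.3%
급수과정별 [by water supply stage] | 1 | 8.3%
(not preserved) | 1 | 8.3%
(not preserved) | 1 | 8.3%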

Most occurring characters

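Value | Count | Frequency (%)
(space) | 6 | 19.4%
(not preserved) | 5 | 16.1%
(not preserved) | 3 | 9.7%
(not preserved) | 3 | 9.7%
(not preserved) | 2 | 6.5%
(not preserved) | 2 | 6.5%
(not preserved) | 1 | 3.2%
(not preserved) | 1 | 3.2%
(not preserved) | 1 | 3.2%
(not preserved) | 1 | 3.2%
Other values (6) | 6 | 19.4%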

Most occurring categories

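Value | Count | Frequency (%)
Other Letter | 25 | 80.6%
Space Separator | 6 | 19.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 5 | 20.0%
(not preserved) | 3 | 12.0%
(not preserved) | 3 | 12.0%
(not preserved) | 2 | 8.0%
(not preserved) | 2 | 8.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
Other values (5) | 5 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%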

Most occurring scripts

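Value | Count | Frequency (%)
Hangul | 25 | 80.6%
Common | 6 | 19.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 5 | 20.0%
(not preserved) | 3 | 12.0%
(not preserved) | 3 | 12.0%
(not preserved) | 2 | 8.0%
(not preserved) | 2 | 8.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
Other values (5) | 5 | 20.0%

Common
Value | Count | Frequency (%)
(space) | 6 | 100.0%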

Most occurring blocks

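Value | Count | Frequency (%)
Hangul | 25 | 80.6%
ASCII | 6 | 19.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 5 | 20.0%
(not preserved) | 3 | 12.0%
(not preserved) | 3 | 12.0%
(not preserved) | 2 | 8.0%
(not preserved) | 2 | 8.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
(not preserved) | 1 | 4.0%
Other values (5) | 5 | 20.0%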

Unnamed: 2
Text

MISSING 

Distinct: 17
Distinct (%): 94.4%
Missing: 9
Missing (%): 33.3%
Memory size: 348.0 B

Length

Max length: 7
Median length: 6
Mean length: 5.2777778
Min length: 3

Characters and Unicode

Total characters: 95
Distinct characters: 23
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 16
Unique (%): 88.9%

Sample

1st row: 매일검사 [Daily test]
2nd row: (6항목) [(6 items)]
3rd row: 매주검사1) [Weekly test 1)]
4th row: (8항목) [(8 items)]
5th row: 매월검사2) [Monthly test 2)]
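Value | Count | Frequency (%)
매월검사 [monthly test] | 2 | 10.5%
(not preserved) | 1 | 5.3%
6항목 [6 items] | 1 | 5.3%
매주검사1 [weekly test 1] | 1 | 5.3%
8항목 [8 items] | 1 | 5.3%
매월검사2 [monthly test 2] | 1 | 5.3%
52항목 [52 items] | 1 | 5.3%
매분기 [quarterly] | 1 | 5.3%
7항목 [7 items] | 1 | 5.3%
매일검사 [daily test] | 1 | 5.3%
Other values (8) | 8 | 42.1%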

Most occurring characters

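Value | Count | Frequency (%)
) | 12 | 12.6%
(not preserved) | 10 | 10.5%
(not preserved) | 10 | 10.5%
( | 9 | 9.5%
(not preserved) | 8 | 8.4%
(not preserved) | 8 | 8.4%
(not preserved) | 7 | 7.4%
1 | 5 | 5.3%
(not preserved) | 3 | 3.2%
5 | 3 | 3.2%
Other values (13) | 20 | 21.1%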

Most occurring categories

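Value | Count | Frequency (%)
Other Letter | 56 | 58.9%
Decimal Number | 17 | 17.9%
Close Punctuation | 12 | 12.6%
Open Punctuation | 9 | 9.5%
Space Separator | 1 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 10 | 17.9%
(not preserved) | 10 | 17.9%
(not preserved) | 8 | 14.3%
(not preserved) | 8 | 14.3%
(not preserved) | 7 | 12.5%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 1 | 1.8%
(not preserved) | 1 | 1.8%
Other values (2) | 2 | 3.6%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 29.4%
5 | 3 | 17.6%
2 | 3 | 17.6%
6 | 2 | 11.8%
3 | 1 | 5.9%
7 | 1 | 5.9%
8 | 1 | 5.9%
9 | 1 | 5.9%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%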

Most occurring scripts

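Value | Count | Frequency (%)
Hangul | 56 | 58.9%
Common | 39 | 41.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 10 | 17.9%
(not preserved) | 10 | 17.9%
(not preserved) | 8 | 14.3%
(not preserved) | 8 | 14.3%
(not preserved) | 7 | 12.5%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 1 | 1.8%
(not preserved) | 1 | 1.8%
Other values (2) | 2 | 3.6%

Common
Value | Count | Frequency (%)
) | 12 | 30.8%
( | 9 | 23.1%
1 | 5 | 12.8%
5 | 3 | 7.7%
2 | 3 | 7.7%
6 | 2 | 5.1%
3 | 1 | 2.6%
(space) | 1 | 2.6%
7 | 1 | 2.6%
8 | 1 | 2.6%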

Most occurring blocks

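Value | Count | Frequency (%)
Hangul | 56 | 58.9%
ASCII | 39 | 41.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
) | 12 | 30.8%
( | 9 | 23.1%
1 | 5 | 12.8%
5 | 3 | 7.7%
2 | 3 | 7.7%
6 | 2 | 5.1%
3 | 1 | 2.6%
(space) | 1 | 2.6%
7 | 1 | 2.6%
8 | 1 | 2.6%

Hangul
Value | Count | Frequency (%)
(not preserved) | 10 | 17.9%
(not preserved) | 10 | 17.9%
(not preserved) | 8 | 14.3%
(not preserved) | 8 | 14.3%
(not preserved) | 7 | 12.5%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 3 | 5.4%
(not preserved) | 1 | 1.8%
(not preserved) | 1 | 1.8%
Other values (2) | 2 | 3.6%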

Unnamed: 3
Text

MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 17
Missing (%): 63.0%
Memory size: 348.0 B

Length

Max length: 363
Median length: 65.5
Mean length: 83.5
Min length: 7

Characters and Unicode

Total characters: 835
Distinct characters: 140
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 10
Unique (%): 100.0%

Sample

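1st row: 측 정 항 목 [Measured items]
2nd row: 냄새, 맛, 색도, 탁도, 수소이온 농도, 잔류염소 [Odor, taste, color, turbidity, hydrogen ion concentration, residual chlorine]
3rd row: 일반세균, 총 대장균군, 대장균 또는 분원성 대장균군, 암모니아성 질소, 질산성 질소, 과망간산칼륨 소비량, 증발잔류물 [Total bacterial count, total coliforms, E. coli or fecal coliforms, ammonia nitrogen, nitrate nitrogen, potassium permanganate consumption, evaporation residue]
4th row: 소독제 및 소독부산물질 중 분기검사항목 제외 일반세균, 총 대장균군, 대장균, 분원성 대장균군, 납, 불소, 비소, 셀레늄, 수은, 시안, 크롬, 암모니아성 질소, 질산성 질소, 카드뮴, 보론, 페놀, 다이아지논, 파라티온, 페니트로티온, 카바릴, 1,1,1-트리클로로에탄, 테트라클로로에틸렌, 트리클로로에틸렌, 디클로로메탄, 벤젠, 톨루엔, 에틸벤젠, 크실렌, 1,1-디클로로에틸렌, 사염화탄소, 1,2-디브로모-3-클로로프로판, 1,4-다이옥산, 경도, 과망간산칼륨, 냄새, 맛, 동, 색도, 세제, pH, 아연, 염소이온, 증발잔류물, 철, 망간, 탁도, 황산이온, 알루미늄, 총트리할로메탄, 클로로포름, 브로모디클로로메탄, 디브로모클로로메탄 [Excluding the quarterly test items among disinfectants and disinfection by-products: total bacterial count, total coliforms, E. coli, fecal coliforms, lead, fluoride, arsenic, selenium, mercury, cyanide, chromium, ammonia nitrogen, nitrate nitrogen, cadmium, boron, phenol, diazinon, parathion, fenitrothion, carbaryl, 1,1,1-trichloroethane, tetrachloroethylene, trichloroethylene, dichloromethane, benzene, toluene, ethylbenzene, xylene, 1,1-dichloroethylene, carbon tetrachloride, 1,2-dibromo-3-chloropropane, 1,4-dioxane, hardness, potassium permanganate, odor, taste, copper, color, detergents, pH, zinc, chloride ion, evaporation residue, iron, manganese, turbidity, sulfate ion, aluminum, total trihalomethanes, chloroform, bromodichloromethane, dibromochloromethane]
5th row: 10개 소독부산물중 7개항목(잔류염소, 클로랄하이드레이트, 디브로모아세토니트릴, 디클로로아세토니트릴, 트리클로로아세토니트릴, 할로아세틱에시드, 포름알데히드) [7 of the 10 disinfection by-product items (residual chlorine, chloral hydrate, dibromoacetonitrile, dichloroacetonitrile, trichloroacetonitrile, haloacetic acids, formaldehyde)]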
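Value | Count | Frequency (%)
대장균군 [coliforms] | 11 | 6.9%
질소 [nitrogen] | 7 | 4.4%
대장균 [E. coli] | 6 | 3.8%
분원성 [fecal] | 6 | 3.8%
일반세균 [total bacterial count] | 6 | 3.8%
잔류염소 [residual chlorine] | 5 | 3.1%
또는 [or] | 5 | 3.1%
(not preserved) | 5 | 3.1%
암모니아성 [ammonia] | 4 | 2.5%
탁도 [turbidity] | 4 | 2.5%
Other values (75) | 100 | 62.9%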

Most occurring characters

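Value | Count | Frequency (%)
(space) | 149 | 17.8%
, | 108 | 12.9%
(not preserved) | 32 | 3.8%
(not preserved) | 26 | 3.1%
(not preserved) | 24 | 2.9%
(not preserved) | 18 | 2.2%
(not preserved) | 18 | 2.2%
(not preserved) | 14 | 1.7%
(not preserved) | 13 | 1.6%
(not preserved) | 13 | 1.6%
Other values (130) | 420 | 50.3%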

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 549 | 65.7%
Space Separator | 149 | 17.8%
Other Punctuation | 108 | 12.9%
Decimal Number | 13 | 1.6%
Dash Punctuation | 6 | 0.7%
Control | 2 | 0.2%
Close Punctuation | 2 | 0.2%
Lowercase Letter | 2 | 0.2%
Uppercase Letter | 2 | 0.2%
Open Punctuation | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 32 | 5.8%
(not preserved) | 26 | 4.7%
(not preserved) | 24 | 4.4%
(not preserved) | 18 | 3.3%
(not preserved) | 18 | 3.3%
(not preserved) | 14 | 2.6%
(not preserved) | 13 | 2.4%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
Other values (116) | 367 | 66.8%

Decimal Number
Value | Count | Frequency (%)
1 | 8 | 61.5%
0 | 1 | 7.7%
7 | 1 | 7.7%
4 | 1 | 7.7%
2 | 1 | 7.7%
3 | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 149 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 108 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
p | 2 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
H | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Most occurring scripts

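Value | Count | Frequency (%)
Hangul | 549 | 65.7%
Common | 282 | 33.8%
Latin | 4 | 0.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 32 | 5.8%
(not preserved) | 26 | 4.7%
(not preserved) | 24 | 4.4%
(not preserved) | 18 | 3.3%
(not preserved) | 18 | 3.3%
(not preserved) | 14 | 2.6%
(not preserved) | 13 | 2.4%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
Other values (116) | 367 | 66.8%

Common
Value | Count | Frequency (%)
(space) | 149 | 52.8%
, | 108 | 38.3%
1 | 8 | 2.8%
- | 6 | 2.1%
(control character) | 2 | 0.7%
) | 2 | 0.7%
( | 2 | 0.7%
0 | 1 | 0.4%
7 | 1 | 0.4%
4 | 1 | 0.4%
Other values (2) | 2 | 0.7%

Latin
Value | Count | Frequency (%)
p | 2 | 50.0%
H | 2 | 50.0%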

Most occurring blocks

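Value | Count | Frequency (%)
Hangul | 549 | 65.7%
ASCII | 286 | 34.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 149 | 52.1%
, | 108 | 37.8%
1 | 8 | 2.8%
- | 6 | 2.1%
(control character) | 2 | 0.7%
) | 2 | 0.7%
p | 2 | 0.7%
H | 2 | 0.7%
( | 2 | 0.7%
0 | 1 | 0.3%
Other values (4) | 4 | 1.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 32 | 5.8%
(not preserved) | 26 | 4.7%
(not preserved) | 24 | 4.4%
(not preserved) | 18 | 3.3%
(not preserved) | 18 | 3.3%
(not preserved) | 14 | 2.6%
(not preserved) | 13 | 2.4%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
Other values (116) | 367 | 66.8%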

Correlations

Variable | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
Unnamed: 0 | 1.000 | NaN | 1.000 | 1.000
Unnamed: 1 | NaN | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
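
To make nullity correlation concrete, the sketch below computes it in plain Python, assuming it is the Pearson correlation between per-column "value is present" indicators. The function `nullity_correlation` and the toy columns are hypothetical and are not taken from this dataset or from the profiler's code.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


col_x = ["a", None, "b", None]
col_y = ["p", None, "q", None]   # missing in exactly the same rows as col_x
col_z = [None, "r", None, "s"]   # present only where col_x is missing

same = nullity_correlation(col_x, col_y)
opposite = nullity_correlation(col_x, col_z)
print(same, opposite)  # 1.0 -1.0
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present where the other is missing, and a value near 0 means no linear relationship between their missingness.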

Sample

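First rows

Index | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
0 | 먹는물 검사항목 및 검사주기 [Drinking water test items and test frequency] | <NA> | <NA> | <NA>
1 | (whitespace) | <NA> | <NA> | <NA>
2 | 구 분 [Category] | <NA> | <NA> | 측 정 항 목 [Measured items]
3 | 광역 [Regional] | 정 수 장 [Water purification plant] | 매일검사 [Daily test] | 냄새, 맛, 색도, 탁도, 수소이온 농도, 잔류염소 [Odor, taste, color, turbidity, hydrogen ion concentration, residual chlorine]
4 | · | <NA> | (6항목) [(6 items)] | <NA>
5 | 지방 [Local] | <NA> | 매주검사1) [Weekly test 1)] | 일반세균, 총 대장균군, 대장균 또는 분원성 대장균군, 암모니아성 질소, 질산성 질소, 과망간산칼륨 소비량, 증발잔류물 [Total bacterial count, total coliforms, E. coli or fecal coliforms, ammonia nitrogen, nitrate nitrogen, potassium permanganate consumption, evaporation residue]
6 | 상수도 [Waterworks] | <NA> | (8항목) [(8 items)] | <NA>
7 | <NA> | <NA> | 매월검사2) [Monthly test 2)] | 소독제 및 소독부산물질 중 분기검사항목 제외 일반세균, 총 대장균군, 대장균, 분원성 대장균군, 납, 불소, 비소, 셀레늄, 수은, 시안, 크롬, 암모니아성 질소, 질산성 질소, 카드뮴, 보론, 페놀, 다이아지논, 파라티온, 페니트로티온, 카바릴, 1,1,1-트리클로로에탄, 테트라클로로에틸렌, 트리클로로에틸렌, 디클로로메탄, 벤젠, 톨루엔, 에틸벤젠, 크실렌, 1,1-디클로로에틸렌, 사염화탄소, 1,2-디브로모-3-클로로프로판, 1,4-다이옥산, 경도, 과망간산칼륨, 냄새, 맛, 동, 색도, 세제, pH, 아연, 염소이온, 증발잔류물, 철, 망간, 탁도, 황산이온, 알루미늄, 총트리할로메탄, 클로로포름, 브로모디클로로메탄, 디브로모클로로메탄 [Excluding the quarterly test items among disinfectants and disinfection by-products: total bacterial count, total coliforms, E. coli, fecal coliforms, lead, fluoride, arsenic, selenium, mercury, cyanide, chromium, ammonia nitrogen, nitrate nitrogen, cadmium, boron, phenol, diazinon, parathion, fenitrothion, carbaryl, 1,1,1-trichloroethane, tetrachloroethylene, trichloroethylene, dichloromethane, benzene, toluene, ethylbenzene, xylene, 1,1-dichloroethylene, carbon tetrachloride, 1,2-dibromo-3-chloropropane, 1,4-dioxane, hardness, potassium permanganate, odor, taste, copper, color, detergents, pH, zinc, chloride ion, evaporation residue, iron, manganese, turbidity, sulfate ion, aluminum, total trihalomethanes, chloroform, bromodichloromethane, dibromochloromethane]
8 | <NA> | <NA> | (52항목) [(52 items)] | <NA>
9 | <NA> | <NA> | 매분기 [Quarterly] | 10개 소독부산물중 7개항목(잔류염소, 클로랄하이드레이트, 디브로모아세토니트릴, 디클로로아세토니트릴, 트리클로로아세토니트릴, 할로아세틱에시드, 포름알데히드) [7 of the 10 disinfection by-product items (residual chlorine, chloral hydrate, dibromoacetonitrile, dichloroacetonitrile, trichloroacetonitrile, haloacetic acids, formaldehyde)]

Last rows

Index | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
17 | <NA> | 시 설 [Facility] | (12항목) [(12 items)] | <NA>
18 | 마을․전용 [Village and private] | <NA> | 분기검사3) [Quarterly test 3)] | 일반세균, 총 대장균군, 대장균 또는 분원성 대장균군, 암모니아성 질소, 질산성 질소, 냄새, 맛, 색도, 탁도, 불소, 망간, 알루미늄, 잔류염소, 보론 및 염소이온(해수에 한함) [Total bacterial count, total coliforms, E. coli or fecal coliforms, ammonia nitrogen, nitrate nitrogen, odor, taste, color, turbidity, fluoride, manganese, aluminum, residual chlorine, boron and chloride ion (seawater only)]
19 | 상수도 [Waterworks] | <NA> | (16항목) [(16 items)] | <NA>
20 | 소규모 [Small-scale] | <NA> | <NA> | <NA>
21 | 급수시설 [Water supply facility] | <NA> | 연 전항목검사 [Annual full-item test] | 먹는물 수질기준 전항목 [All items of the drinking water quality standards]
22 | <NA> | <NA> | (59항목) [(59 items)] | <NA>
23 | 1) 일반세균, 총 대장균군, 대장균 또는 분원성 대장균군 항목은 반드시 매주 1회 이상 검사, 기타 항목은 지난 1년간의 수질검사결과에 따라 매월 1회 이상으로 조정하여 검사 가능 [1) Total bacterial count, total coliforms, and E. coli or fecal coliforms must be tested at least once a week; other items may be adjusted to at least once a month based on the water quality test results of the past year] | <NA> | <NA> | <NA>
24 | 2) 일반세균, 총 대장균군, 대장균 또는 분원성 대장균군, 암모니아성 질소, 질산성 질소, 과망간산칼륨 소비량, 냄새, 맛, 색도, 수소이온 농도, 염소이온, 망간, 탁도 및 알루미늄 항목은 반드시 매월 1회 이상 검사를 실시하고, 기타 항목은 지난 3년간의 수질검사 결과에 따라 매분기 1회 이상으로 조정하여 검사 가능 [2) Total bacterial count, total coliforms, E. coli or fecal coliforms, ammonia nitrogen, nitrate nitrogen, potassium permanganate consumption, odor, taste, color, hydrogen ion concentration, chloride ion, manganese, turbidity, and aluminum must be tested at least once a month; other items may be adjusted to at least once a quarter based on the water quality test results of the past 3 years] | <NA> | <NA> | <NA>
25 | 3) 지난 3년간의 수질검사 결과에 따라 매 반기 1회 이상으로 조정하여 검사 가능 [3) May be adjusted to at least once every half year based on the water quality test results of the past 3 years] | <NA> | <NA> | <NA>
26 | ※ 먹는물 수질기준 및 검사 등에 관한 규칙 제4조 및 별표1에 따라 실시, 마을상수도 등의 경우 매 분기검사중 연간 전항목 검사가 중복되는 분기는 연 1회 전항목 검사로 대체(전산입력 필수) [※ Conducted in accordance with Article 4 and Appended Table 1 of the Rules on Drinking Water Quality Standards and Testing; for village waterworks and similar facilities, in a quarter where the quarterly test overlaps with the annual full-item test, the annual full-item test replaces it (entry in the computer system is mandatory)] | <NA> | <NA> | <NA>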