Overview

Dataset statistics

Number of variables: 14
Number of observations: 290
Missing cells: 465
Missing cells (%): 11.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.0 KiB
Average record size in memory: 116.5 B
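
The derived figures in this block follow from the raw counts. The sketch below recomputes them, assuming (this is not stated in the report) that the missing-cell percentage is taken over variables × observations and that 1 KiB = 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 14
n_observations = 290
missing_cells = 465
total_size_kib = 33.0

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the reported 11.5% and 116.5 B.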

Variable types

Categorical: 8
Text: 3
Numeric: 3

Dataset

Description: Groundwater quality measurement results (지하수 수질측정 결과 현황)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=3PS7152M60Q0ZAG0427012039966&infSeq=1

Alerts

집계년도 has constant value "2023" [Constant]
주용도명 has constant value "음용" [Constant]
채수일자 is highly overall correlated with 채수위치우편번호 and 3 other fields [High correlation]
음용사용여부 is highly overall correlated with 측정결과적합여부 and 1 other field [High correlation]
초과항목명 is highly overall correlated with 시군명 and 1 other field [High correlation]
시군명 is highly overall correlated with 채수위치우편번호 and 5 other fields [High correlation]
초과항목결과내역 is highly overall correlated with 채수위치우편번호 and 7 other fields [High correlation]
측정결과적합여부 is highly overall correlated with 음용사용여부 and 1 other field [High correlation]
채수위치우편번호 is highly overall correlated with WGS84위도 and 3 other fields [High correlation]
WGS84위도 is highly overall correlated with 채수위치우편번호 and 3 other fields [High correlation]
WGS84경도 is highly overall correlated with 시군명 and 1 other field [High correlation]
초과항목명 is highly imbalanced (56.7%) [Imbalance]
채수위치우편번호 has 126 (43.4%) missing values [Missing]
채수위치지번주소 has 18 (6.2%) missing values [Missing]
채수위치도로명주소 has 243 (83.8%) missing values [Missing]
WGS84위도 has 39 (13.4%) missing values [Missing]
WGS84경도 has 39 (13.4%) missing values [Missing]
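
The 56.7% in the imbalance alert can be reproduced from the category counts of 초과항목명. The sketch below assumes the score is one minus the normalized Shannon entropy; the report does not state the formula, but this choice matches the reported figure. The function name `imbalance_score` is illustrative.

```python
import math

# Category counts for 초과항목명, copied from its Common Values table
# (the "<NA>" category is counted as a class, as in the report).
counts = [204, 59, 10, 6, 3, 2, 2, 2, 2]


def imbalance_score(counts):
    """Return 1 - H / log(k), with H the Shannon entropy and k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class gets score 0 in this sketch
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single class dominates.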

Reproduction

Analysis started: 2023-12-10 21:01:11.896057
Analysis finished: 2023-12-10 21:01:14.161140
Duration: 2.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
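
A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that lives in config.json): the function name `build_report`, the file paths passed to it, and the report title are all hypothetical, and it assumes the dataset has been exported as a CSV file that pandas can read.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported inside the function so the module can be loaded even where
    # pandas and ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the export is a readable CSV
    report = ProfileReport(df, title="Groundwater quality measurement results")
    report.to_file(html_path)


# Example call (both file names are hypothetical placeholders):
# build_report("groundwater_quality.csv", "groundwater_quality_report.html")
```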

Variables

집계년도
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
2023 | 290

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 290 | 100.0%


시군명
Categorical

HIGH CORRELATION

Distinct: 31
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
수원시 | 31
양주시 | 22
성남시 | 20
남양주시 | 16
의정부시 | 14
Other values (26) | 187

Length

Max length: 4
Median length: 3
Mean length: 3.1448276
Min length: 3

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 고양시
5th row: 고양시

Common Values

Value | Count | Frequency (%)
수원시 | 31 | 10.7%
양주시 | 22 | 7.6%
성남시 | 20 | 6.9%
남양주시 | 16 | 5.5%
의정부시 | 14 | 4.8%
광명시 | 14 | 4.8%
동두천시 | 12 | 4.1%
안양시 | 12 | 4.1%
과천시 | 12 | 4.1%
군포시 | 11 | 3.8%
Other values (21) | 126 | 43.4%


지점번호
Text

Distinct: 288
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 10
Median length: 8
Mean length: 5.7793103
Min length: 2

Characters and Unicode

Total characters: 1676
Distinct characters: 233
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 286
Unique (%): 98.6%

Sample

1st row: 가평군 경춘도로
2nd row: 가평군 수덕산
3rd row: 가평군 보납산1
4th row: 마두
5th row: 명봉

Words

Value | Count | Frequency (%)
양주시 | 20 | 4.4%
남양주시 | 16 | 3.5%
군포시 | 11 | 2.4%
동두천시 | 11 | 2.4%
이천시 | 9 | 2.0%
과천시 | 9 | 2.0%
성남시중원구 | 8 | 1.8%
부천시 | 8 | 1.8%
파주시 | 8 | 1.8%
김포시 | 7 | 1.5%
Other values (298) | 346 | 76.4%

Most occurring characters

ValueCountFrequency (%)
163
 
9.7%
150
 
8.9%
65
 
3.9%
64
 
3.8%
59
 
3.5%
48
 
2.9%
47
 
2.8%
42
 
2.5%
38
 
2.3%
32
 
1.9%
Other values (223) 968
57.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1458
87.0%
Space Separator 163
 
9.7%
Decimal Number 28
 
1.7%
Close Punctuation 12
 
0.7%
Open Punctuation 12
 
0.7%
Uppercase Letter 2
 
0.1%
Other Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
150
 
10.3%
65
 
4.5%
64
 
4.4%
59
 
4.0%
48
 
3.3%
47
 
3.2%
42
 
2.9%
38
 
2.6%
32
 
2.2%
31
 
2.1%
Other values (210) 882
60.5%
Decimal Number
ValueCountFrequency (%)
1 11
39.3%
2 9
32.1%
4 3
 
10.7%
8 2
 
7.1%
5 1
 
3.6%
6 1
 
3.6%
3 1
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
G 1
50.0%
L 1
50.0%
Space Separator
ValueCountFrequency (%)
163
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1458
87.0%
Common 216
 
12.9%
Latin 2
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
150
 
10.3%
65
 
4.5%
64
 
4.4%
59
 
4.0%
48
 
3.3%
47
 
3.2%
42
 
2.9%
38
 
2.6%
32
 
2.2%
31
 
2.1%
Other values (210) 882
60.5%
Common
ValueCountFrequency (%)
163
75.5%
) 12
 
5.6%
( 12
 
5.6%
1 11
 
5.1%
2 9
 
4.2%
4 3
 
1.4%
8 2
 
0.9%
5 1
 
0.5%
, 1
 
0.5%
6 1
 
0.5%
Latin
ValueCountFrequency (%)
G 1
50.0%
L 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1458
87.0%
ASCII 218
 
13.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
163
74.8%
) 12
 
5.5%
( 12
 
5.5%
1 11
 
5.0%
2 9
 
4.1%
4 3
 
1.4%
8 2
 
0.9%
G 1
 
0.5%
L 1
 
0.5%
5 1
 
0.5%
Other values (3) 3
 
1.4%
Hangul
ValueCountFrequency (%)
150
 
10.3%
65
 
4.5%
64
 
4.4%
59
 
4.0%
48
 
3.3%
47
 
3.2%
42
 
2.9%
38
 
2.6%
32
 
2.2%
31
 
2.1%
Other values (210) 882
60.5%

채수일자
Categorical

HIGH CORRELATION

Distinct: 19
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
2023-09-05 | 39
2023-09-12 | 32
2023-09-11 | 29
2023-09-03 | 25
2023-09-07 | 21
Other values (14) | 144

Length

Max length: 10
Median length: 10
Mean length: 9.6689655
Min length: 4

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 2023-09-19
2nd row: 2023-09-19
3rd row: 2023-09-19
4th row: 2023-09-25
5th row: 2023-09-25

Common Values

Value | Count | Frequency (%)
2023-09-05 | 39 | 13.4%
2023-09-12 | 32 | 11.0%
2023-09-11 | 29 | 10.0%
2023-09-03 | 25 | 8.6%
2023-09-07 | 21 | 7.2%
2023-09-06 | 18 | 6.2%
2023-09-19 | 17 | 5.9%
2023-09-25 | 17 | 5.9%
<NA> | 16 | 5.5%
2023-09-13 | 13 | 4.5%
Other values (9) | 63 | 21.7%


주용도명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
음용 | 290

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 음용
2nd row: 음용
3rd row: 음용
4th row: 음용
5th row: 음용

Common Values

Value | Count | Frequency (%)
음용 | 290 | 100.0%


음용사용여부
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
가능 | 187
불가능 | 103

Length

Max length: 3
Median length: 2
Mean length: 2.3551724
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가능
2nd row: 가능
3rd row: 가능
4th row: 가능
5th row: 불가능

Common Values

Value | Count | Frequency (%)
가능 | 187 | 64.5%
불가능 | 103 | 35.5%


측정결과적합여부
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
적합 | 186
부적합 | 85
미검사 | 19

Length

Max length: 3
Median length: 2
Mean length: 2.3586207
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 부적합

Common Values

Value | Count | Frequency (%)
적합 | 186 | 64.1%
부적합 | 85 | 29.3%
미검사 | 19 | 6.6%


초과항목명
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 9
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
<NA> | 204
총대장균군 | 59
총대장균균 | 10
총대장균군,대장균 | 6
총대장균군,분원성대장균,대장균 | 3
Other values (4) | 8

Length

Max length: 16
Median length: 4
Mean length: 4.6172414
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 총대장균군,대장균

Common Values

Value | Count | Frequency (%)
<NA> | 204 | 70.3%
총대장균군 | 59 | 20.3%
총대장균균 | 10 | 3.4%
총대장균군,대장균 | 6 | 2.1%
총대장균군,분원성대장균,대장균 | 3 | 1.0%
총대장균 | 2 | 0.7%
총대장균군,분원성대장균 | 2 | 0.7%
총대장균균,대장균 | 2 | 0.7%
총대장균군,분원성대장균군 | 2 | 0.7%
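
Several of these labels look like spelling variants of the same item (for example 총대장균균 and 총대장균 alongside 총대장균군). The sketch below shows one way to merge such variants before counting. The `CANONICAL` mapping is an assumption made for illustration and is not part of the report; check it against the data provider's terminology before relying on it.

```python
from collections import Counter

# Counts copied from the Common Values table ("<NA>" rows left out).
raw_counts = {
    "총대장균군": 59,
    "총대장균균": 10,
    "총대장균군,대장균": 6,
    "총대장균군,분원성대장균,대장균": 3,
    "총대장균": 2,
    "총대장균군,분원성대장균": 2,
    "총대장균균,대장균": 2,
    "총대장균군,분원성대장균군": 2,
}

# Assumed canonical spellings (hypothetical mapping, not from the report).
CANONICAL = {
    "총대장균균": "총대장균군",
    "총대장균": "총대장균군",
    "분원성대장균": "분원성대장균군",
}


def normalize(label: str) -> str:
    """Canonicalize each comma-separated item in a label."""
    items = [item.strip() for item in label.split(",")]
    return ",".join(CANONICAL.get(item, item) for item in items)


merged = Counter()
for label, count in raw_counts.items():
    merged[normalize(label)] += count

for label, count in merged.most_common():
    print(label, count)
```

With this mapping the eight non-missing labels collapse to four, and the total stays at 86, the number of rows marked 검출 in 초과항목결과내역.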


초과항목결과내역
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
<NA> | 204
검출 | 86

Length

Max length: 4
Median length: 4
Mean length: 3.4068966
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 검출

Common Values

Value | Count | Frequency (%)
<NA> | 204 | 70.3%
검출 | 86 | 29.7%


채수위치우편번호
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 127
Distinct (%): 77.4%
Missing: 126
Missing (%): 43.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 13777.72
Minimum: 10011
Maximum: 18549
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 10011
5th percentile: 10479
Q1: 11441
Median: 13025.5
Q3: 16211.25
95th percentile: 17709.85
Maximum: 18549
Range: 8538
Interquartile range (IQR): 4770.25

Descriptive statistics

Standard deviation: 2571.6412
Coefficient of variation (CV): 0.18665217
Kurtosis: -1.3530912
Mean: 13777.72
Median Absolute Deviation (MAD): 2023.5
Skewness: 0.29203045
Sum: 2259546
Variance: 6613338.5
Monotonicity: Not monotonic
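
Several of these statistics are linked by simple identities, which makes a useful consistency check. The sketch below recomputes the range, IQR, CV, variance, and mean from the other reported values, assuming the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, mean = sum / non-missing count).

```python
# Values copied from the statistics reported for 채수위치우편번호.
minimum, maximum = 10011, 18549
q1, q3 = 11441, 16211.25
mean, std = 13777.72, 2571.6412
total, n_missing, n_rows = 2259546, 126, 290

n_valid = n_rows - n_missing     # non-missing observations
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance as squared standard deviation
mean_from_sum = total / n_valid  # Mean recomputed from Sum

print(n_valid, value_range, iqr)
print(round(mean_from_sum, 2), round(cv, 5), round(variance, 1))
```

The recomputed values agree with the report to within rounding of the printed inputs.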

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
17379 | 6 | 2.1%
14334 | 5 | 1.7%
16201 | 4 | 1.4%
12700 | 4 | 1.4%
13800 | 4 | 1.4%
18549 | 3 | 1.0%
11304 | 3 | 1.0%
11441 | 3 | 1.0%
17586 | 3 | 1.0%
11641 | 2 | 0.7%
Other values (117) | 127 | 43.8%
(Missing) | 126 | 43.4%

Minimum 10 values

Value | Count | Frequency (%)
10011 | 1 | 0.3%
10069 | 1 | 0.3%
10124 | 1 | 0.3%
10263 | 1 | 0.3%
10296 | 1 | 0.3%
10299 | 1 | 0.3%
10330 | 1 | 0.3%
10409 | 1 | 0.3%
10479 | 2 | 0.7%
10539 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
18549 | 3 | 1.0%
18546 | 1 | 0.3%
18465 | 1 | 0.3%
18149 | 2 | 0.7%
18114 | 1 | 0.3%
17728 | 1 | 0.3%
17607 | 1 | 0.3%
17586 | 3 | 1.0%
17534 | 1 | 0.3%
17415 | 1 | 0.3%

채수위치지번주소
Text

MISSING

Distinct: 254
Distinct (%): 93.4%
Missing: 18
Missing (%): 6.2%
Memory size: 2.4 KiB

Length

Max length: 25
Median length: 22
Mean length: 18.797794
Min length: 14

Characters and Unicode

Total characters: 5113
Distinct characters: 177
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 238
Unique (%): 87.5%

Sample

1st row: 경기도 가평군 가평읍 읍내리 산6-1
2nd row: 경기도 가평군 북면 제령리 603-1
3rd row: 경기도 가평군 읍내리 산154
4th row: 경기도 고양시 일산동구 마두동 1121
5th row: 경기도 고양시 덕양구 내유동 769-3

Words

Value | Count | Frequency (%)
경기도 | 272 | 22.0%
수원시 | 31 | 2.5%
[value missing in source] | 22 | 1.8%
양주시 | 22 | 1.8%
남양주시 | 16 | 1.3%
장안구 | 16 | 1.3%
의정부시 | 14 | 1.1%
광명시 | 14 | 1.1%
동두천시 | 12 | 1.0%
안양시 | 12 | 1.0%
Other values (491) | 806 | 65.2%

Most occurring characters

ValueCountFrequency (%)
972
19.0%
276
 
5.4%
275
 
5.4%
273
 
5.3%
265
 
5.2%
1 247
 
4.8%
230
 
4.5%
226
 
4.4%
- 162
 
3.2%
2 102
 
2.0%
Other values (167) 2085
40.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3109
60.8%
Space Separator 972
 
19.0%
Decimal Number 870
 
17.0%
Dash Punctuation 162
 
3.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
276
 
8.9%
275
 
8.8%
273
 
8.8%
265
 
8.5%
230
 
7.4%
226
 
7.3%
93
 
3.0%
73
 
2.3%
65
 
2.1%
63
 
2.0%
Other values (155) 1270
40.8%
Decimal Number
ValueCountFrequency (%)
1 247
28.4%
2 102
11.7%
3 76
 
8.7%
6 69
 
7.9%
4 69
 
7.9%
5 65
 
7.5%
7 64
 
7.4%
0 63
 
7.2%
9 61
 
7.0%
8 54
 
6.2%
Space Separator
ValueCountFrequency (%)
972
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 162
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3109
60.8%
Common 2004
39.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
276
 
8.9%
275
 
8.8%
273
 
8.8%
265
 
8.5%
230
 
7.4%
226
 
7.3%
93
 
3.0%
73
 
2.3%
65
 
2.1%
63
 
2.0%
Other values (155) 1270
40.8%
Common
ValueCountFrequency (%)
972
48.5%
1 247
 
12.3%
- 162
 
8.1%
2 102
 
5.1%
3 76
 
3.8%
6 69
 
3.4%
4 69
 
3.4%
5 65
 
3.2%
7 64
 
3.2%
0 63
 
3.1%
Other values (2) 115
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3109
60.8%
ASCII 2004
39.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
972
48.5%
1 247
 
12.3%
- 162
 
8.1%
2 102
 
5.1%
3 76
 
3.8%
6 69
 
3.4%
4 69
 
3.4%
5 65
 
3.2%
7 64
 
3.2%
0 63
 
3.1%
Other values (2) 115
 
5.7%
Hangul
ValueCountFrequency (%)
276
 
8.9%
275
 
8.8%
273
 
8.8%
265
 
8.5%
230
 
7.4%
226
 
7.3%
93
 
3.0%
73
 
2.3%
65
 
2.1%
63
 
2.0%
Other values (155) 1270
40.8%

채수위치도로명주소
Text

MISSING

Distinct: 46
Distinct (%): 97.9%
Missing: 243
Missing (%): 83.8%
Memory size: 2.4 KiB

Length

Max length: 28
Median length: 23
Mean length: 20.446809
Min length: 14

Characters and Unicode

Total characters: 961
Distinct characters: 123
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 95.7%

Sample

1st row: 경기도 고양시 덕양구 중앙로46번길 54-52
2nd row: 경기도 과천시 아랫배랭이로 27
3rd row: 경기도 과천시 자하동길 64
4th row: 경기도 군포시 속달로110번길 25
5th row: 경기도 남양주시 가운로 47

Words

Value | Count | Frequency (%)
경기도 | 47 | 22.3%
파주시 | 7 | 3.3%
양주시 | 5 | 2.4%
수원시 | 5 | 2.4%
양평군 | 4 | 1.9%
안성시 | 4 | 1.9%
영통구 | 3 | 1.4%
안양시 | 2 | 0.9%
남파로 | 2 | 0.9%
의정부시 | 2 | 0.9%
Other values (121) | 130 | 61.6%

Most occurring characters

ValueCountFrequency (%)
164
 
17.1%
50
 
5.2%
47
 
4.9%
47
 
4.9%
46
 
4.8%
37
 
3.9%
1 34
 
3.5%
2 32
 
3.3%
28
 
2.9%
- 23
 
2.4%
Other values (113) 453
47.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 564
58.7%
Decimal Number 210
 
21.9%
Space Separator 164
 
17.1%
Dash Punctuation 23
 
2.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
50
 
8.9%
47
 
8.3%
47
 
8.3%
46
 
8.2%
37
 
6.6%
28
 
5.0%
18
 
3.2%
17
 
3.0%
14
 
2.5%
14
 
2.5%
Other values (101) 246
43.6%
Decimal Number
ValueCountFrequency (%)
1 34
16.2%
2 32
15.2%
0 23
11.0%
7 23
11.0%
4 21
10.0%
6 19
9.0%
3 17
8.1%
5 16
7.6%
9 13
 
6.2%
8 12
 
5.7%
Space Separator
ValueCountFrequency (%)
164
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 564
58.7%
Common 397
41.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
50
 
8.9%
47
 
8.3%
47
 
8.3%
46
 
8.2%
37
 
6.6%
28
 
5.0%
18
 
3.2%
17
 
3.0%
14
 
2.5%
14
 
2.5%
Other values (101) 246
43.6%
Common
ValueCountFrequency (%)
164
41.3%
1 34
 
8.6%
2 32
 
8.1%
- 23
 
5.8%
0 23
 
5.8%
7 23
 
5.8%
4 21
 
5.3%
6 19
 
4.8%
3 17
 
4.3%
5 16
 
4.0%
Other values (2) 25
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 564
58.7%
ASCII 397
41.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
164
41.3%
1 34
 
8.6%
2 32
 
8.1%
- 23
 
5.8%
0 23
 
5.8%
7 23
 
5.8%
4 21
 
5.3%
6 19
 
4.8%
3 17
 
4.3%
5 16
 
4.0%
Other values (2) 25
 
6.3%
Hangul
ValueCountFrequency (%)
50
 
8.9%
47
 
8.3%
47
 
8.3%
46
 
8.2%
37
 
6.6%
28
 
5.0%
18
 
3.2%
17
 
3.0%
14
 
2.5%
14
 
2.5%
Other values (101) 246
43.6%

WGS84위도
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 232
Distinct (%): 92.4%
Missing: 39
Missing (%): 13.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.51668
Minimum: 36.942395
Maximum: 38.21868
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 36.942395
5th percentile: 37.171988
Q1: 37.322068
Median: 37.455238
Q3: 37.71466
95th percentile: 37.917787
Maximum: 38.21868
Range: 1.2762848
Interquartile range (IQR): 0.39259157

Descriptive statistics

Standard deviation: 0.24533068
Coefficient of variation (CV): 0.0065392426
Kurtosis: -0.49424248
Mean: 37.51668
Median Absolute Deviation (MAD): 0.17049742
Skewness: 0.37486746
Sum: 9416.6868
Variance: 0.060187141
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
37.2744369868 | 5 | 1.7%
37.4395518457 | 4 | 1.4%
37.8996104605 | 3 | 1.0%
37.4363873373 | 2 | 0.7%
37.417166308 | 2 | 0.7%
37.0152713035 | 2 | 0.7%
37.3231519256 | 2 | 0.7%
37.9204636404 | 2 | 0.7%
37.9455774294 | 2 | 0.7%
37.4635402669 | 2 | 0.7%
Other values (222) | 225 | 77.6%
(Missing) | 39 | 13.4%

Minimum 10 values

Value | Count | Frequency (%)
36.9423947338 | 1 | 0.3%
36.9948286864 | 1 | 0.3%
37.0119880267 | 1 | 0.3%
37.0152713035 | 2 | 0.7%
37.0710671992 | 1 | 0.3%
37.1123200476 | 1 | 0.3%
37.1421717682 | 1 | 0.3%
37.1457219321 | 1 | 0.3%
37.1530825479 | 1 | 0.3%
37.1636942676 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
38.2186795679 | 1 | 0.3%
38.1757087557 | 1 | 0.3%
38.1501326308 | 1 | 0.3%
38.034087896 | 1 | 0.3%
38.0176240388 | 1 | 0.3%
37.976544202 | 1 | 0.3%
37.9455774294 | 2 | 0.7%
37.9333641424 | 1 | 0.3%
37.9282130919 | 1 | 0.3%
37.9204636404 | 2 | 0.7%

WGS84경도
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 232
Distinct (%): 92.4%
Missing: 39
Missing (%): 13.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.04364
Minimum: 126.54585
Maximum: 127.75945
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 126.54585
5th percentile: 126.77618
Q1: 126.91471
Median: 127.02322
Q3: 127.11733
95th percentile: 127.47494
Maximum: 127.75945
Range: 1.2135908
Interquartile range (IQR): 0.20262008

Descriptive statistics

Standard deviation: 0.20766556
Coefficient of variation (CV): 0.0016346001
Kurtosis: 1.3767509
Mean: 127.04364
Median Absolute Deviation (MAD): 0.099789819
Skewness: 0.91284603
Sum: 31887.955
Variance: 0.043124985
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
127.4176309338 | 5 | 1.7%
126.8629013283 | 4 | 1.4%
127.0749200907 | 3 | 1.0%
126.9729503151 | 2 | 0.7%
126.9639999748 | 2 | 0.7%
127.2755353914 | 2 | 0.7%
127.006616392 | 2 | 0.7%
127.0445407358 | 2 | 0.7%
127.0861141426 | 2 | 0.7%
126.8585726693 | 2 | 0.7%
Other values (222) | 225 | 77.6%
(Missing) | 39 | 13.4%

Minimum 10 values

Value | Count | Frequency (%)
126.5458549227 | 1 | 0.3%
126.588758433 | 1 | 0.3%
126.6209497424 | 1 | 0.3%
126.6406703204 | 1 | 0.3%
126.6885824183 | 1 | 0.3%
126.7015859929 | 1 | 0.3%
126.7059729621 | 1 | 0.3%
126.7099210717 | 1 | 0.3%
126.7415297734 | 1 | 0.3%
126.7420841528 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
127.7594457602 | 1 | 0.3%
127.7016478438 | 1 | 0.3%
127.6490832169 | 1 | 0.3%
127.6455985051 | 1 | 0.3%
127.6421145369 | 1 | 0.3%
127.6082677164 | 1 | 0.3%
127.6016991264 | 1 | 0.3%
127.5975576559 | 1 | 0.3%
127.5705883755 | 1 | 0.3%
127.5320474177 | 1 | 0.3%

Interactions

[9 interaction plots (SVG, Matplotlib v3.7.2); images not reproduced]

Correlations

[Correlation heatmap; image not reproduced]

 | 시군명 | 채수일자 | 음용사용여부 | 측정결과적합여부 | 초과항목명 | 채수위치우편번호 | 채수위치도로명주소 | WGS84위도 | WGS84경도
시군명 | 1.000 | 0.985 | 0.542 | 0.561 | 0.873 | 1.000 | 1.000 | 0.975 | 0.947
채수일자 | 0.985 | 1.000 | 0.287 | 0.251 | 0.828 | 0.868 | 1.000 | 0.866 | 0.732
음용사용여부 | 0.542 | 0.287 | 1.000 | 0.811 | 0.000 | 0.323 | 1.000 | 0.241 | 0.359
측정결과적합여부 | 0.561 | 0.251 | 0.811 | 1.000 | 0.000 | 0.407 | 1.000 | 0.194 | 0.363
초과항목명 | 0.873 | 0.828 | 0.000 | 0.000 | 1.000 | 0.590 | 1.000 | 0.685 | 0.468
채수위치우편번호 | 1.000 | 0.868 | 0.323 | 0.407 | 0.590 | 1.000 | 1.000 | 0.921 | 0.888
채수위치도로명주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
WGS84위도 | 0.975 | 0.866 | 0.241 | 0.194 | 0.685 | 0.921 | 1.000 | 1.000 | 0.762
WGS84경도 | 0.947 | 0.732 | 0.359 | 0.363 | 0.468 | 0.888 | 1.000 | 0.762 | 1.000

[Correlation heatmap; image not reproduced]

 | 채수일자 | 음용사용여부 | 초과항목명 | 시군명 | 초과항목결과내역 | 측정결과적합여부
채수일자 | 1.000 | 0.219 | 0.423 | 0.815 | 1.000 | 0.114
음용사용여부 | 0.219 | 1.000 | 0.000 | 0.440 | 1.000 | 0.991
초과항목명 | 0.423 | 0.000 | 1.000 | 0.537 | 1.000 | 0.000
시군명 | 0.815 | 0.440 | 0.537 | 1.000 | 1.000 | 0.325
초과항목결과내역 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
측정결과적합여부 | 0.114 | 0.991 | 0.000 | 0.325 | 1.000 | 1.000

[Correlation heatmap; image not reproduced]

 | 채수위치우편번호 | WGS84위도 | WGS84경도 | 시군명 | 채수일자 | 음용사용여부 | 측정결과적합여부 | 초과항목명 | 초과항목결과내역
채수위치우편번호 | 1.000 | -0.918 | 0.235 | 0.897 | 0.560 | 0.256 | 0.271 | 0.318 | 1.000
WGS84위도 | -0.918 | 1.000 | 0.009 | 0.797 | 0.556 | 0.182 | 0.115 | 0.296 | 1.000
WGS84경도 | 0.235 | 0.009 | 1.000 | 0.693 | 0.380 | 0.271 | 0.231 | 0.175 | 1.000
시군명 | 0.897 | 0.797 | 0.693 | 1.000 | 0.815 | 0.440 | 0.325 | 0.537 | 1.000
채수일자 | 0.560 | 0.556 | 0.380 | 0.815 | 1.000 | 0.219 | 0.114 | 0.423 | 1.000
음용사용여부 | 0.256 | 0.182 | 0.271 | 0.440 | 0.219 | 1.000 | 0.991 | 0.000 | 1.000
측정결과적합여부 | 0.271 | 0.115 | 0.231 | 0.325 | 0.114 | 0.991 | 1.000 | 0.000 | 1.000
초과항목명 | 0.318 | 0.296 | 0.175 | 0.537 | 0.423 | 0.000 | 0.000 | 1.000 | 1.000
초과항목결과내역 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
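
The extracted text does not say which measure produced each matrix. For pairs of categorical variables, one common choice (and one of the measures ydata-profiling offers) is Cramér's V. The sketch below implements the plain, uncorrected form on a contingency table; the function name `cramers_v` and the two small tables are illustrative and not taken from the dataset, so its output is not expected to match the values above.

```python
import math


def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0


# Hypothetical tables: one perfectly associated, one independent.
perfect = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]
print(cramers_v(perfect), cramers_v(independent))
```

A value near 1, such as the 0.991 between 음용사용여부 and 측정결과적합여부, indicates that one variable almost determines the other.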

Missing values

Count: A simple visualization of nullity by column.
Matrix: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
Heatmap: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
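
Nullity correlation can be sketched as the Pearson correlation between "is missing" indicators of two columns. The example below uses only the 20 rows printed in the Sample section (indices 0-9 and 280-289) for 채수위치우편번호 and 채수위치도로명주소, so the result illustrates the technique and will not match the full-data heatmap. The helper name `pearson` is illustrative.

```python
import math

# 1 = missing, 0 = present, for sample rows 0-9 followed by 280-289.
postal_null = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
road_null = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a column with no variation has no correlation
    return cov / math.sqrt(var_x * var_y)


r = pearson(postal_null, road_null)
print(round(r, 3))
```

A positive value means the two columns tend to be missing in the same rows. Columns that are never (or always) missing in the rows considered have no defined nullity correlation, which the sketch reports as NaN.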

Sample

First rows

# | 집계년도 | 시군명 | 지점번호 | 채수일자 | 주용도명 | 음용사용여부 | 측정결과적합여부 | 초과항목명 | 초과항목결과내역 | 채수위치우편번호 | 채수위치지번주소 | 채수위치도로명주소 | WGS84위도 | WGS84경도
0 | 2023 | 가평군 | 가평군 경춘도로 | 2023-09-19 | 음용 | 가능 | 적합 | <NA> | <NA> | <NA> | 경기도 가평군 가평읍 읍내리 산6-1 | <NA> | 37.845649 | 127.532047
1 | 2023 | 가평군 | 가평군 수덕산 | 2023-09-19 | 음용 | 가능 | 적합 | <NA> | <NA> | 12405 | 경기도 가평군 북면 제령리 603-1 | <NA> | 37.901535 | 127.517405
2 | 2023 | 가평군 | 가평군 보납산1 | 2023-09-19 | 음용 | 가능 | 적합 | <NA> | <NA> | <NA> | 경기도 가평군 읍내리 산154 | <NA> | 37.837429 | 127.514472
3 | 2023 | 고양시 | 마두 | 2023-09-25 | 음용 | 가능 | 적합 | <NA> | <NA> | 10409 | 경기도 고양시 일산동구 마두동 1121 | <NA> | 37.662779 | 126.771881
4 | 2023 | 고양시 | 명봉 | 2023-09-25 | 음용 | 불가능 | 부적합 | 총대장균군,대장균 | 검출 | 10263 | 경기도 고양시 덕양구 내유동 769-3 | <NA> | 37.729758 | 126.856424
5 | 2023 | 고양시 | 성라공원1 | 2023-09-25 | 음용 | 가능 | 적합 | <NA> | <NA> | 10479 | 경기도 고양시 덕양구 성사동 267 | <NA> | 37.647115 | 126.844309
6 | 2023 | 고양시 | 유곽골(화전) | 2023-09-25 | 음용 | 불가능 | 부적합 | 총대장균 | 검출 | 10539 | 경기도 고양시 덕양구 덕은동 2 | 경기도 고양시 덕양구 중앙로46번길 54-52 | 37.596734 | 126.883527
7 | 2023 | 고양시 | 천일 | 2023-09-25 | 음용 | 불가능 | 부적합 | 총대장균군,대장균 | 검출 | 10555 | 경기도 고양시 덕양구 원흥동 114-13 | <NA> | 37.659573 | 126.879806
8 | 2023 | 고양시 | 능안골 | 2023-09-25 | 음용 | 불가능 | 부적합 | 총대장균 | 검출 | 10299 | 경기도 고양시 덕양구 식사동 426-1 | <NA> | 37.673486 | 126.819174
9 | 2023 | 고양시 | 중산마을 | 2023-09-25 | 음용 | 가능 | 적합 | <NA> | <NA> | 10330 | 경기도 고양시 일산동구 중산동 138-14 | <NA> | 37.688573 | 126.784819

Last rows

# | 집계년도 | 시군명 | 지점번호 | 채수일자 | 주용도명 | 음용사용여부 | 측정결과적합여부 | 초과항목명 | 초과항목결과내역 | 채수위치우편번호 | 채수위치지번주소 | 채수위치도로명주소 | WGS84위도 | WGS84경도
280 | 2023 | 하남시 | 하남시 선법사 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | <NA> | 경기도 하남시 교산동 300 | <NA> | 37.518293 | 127.208311
281 | 2023 | 하남시 | 하남시 은고개 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 13028 | 경기도 하남시 상산곡동 산 189 | 경기도 하남시 하남대로 53-2 | 37.48169 | 127.238881
282 | 2023 | 하남시 | 하남시 일장천 | 2023-09-06 | 음용 | 불가능 | 부적합 | 총대장균균 | 검출 | <NA> | 경기도 하남시 학암동 산 24-1 | <NA> | 37.482462 | 127.167432
283 | 2023 | 하남시 | 하남시 참새골 | 2023-09-06 | 음용 | 불가능 | 부적합 | 총대장균균 | 검출 | <NA> | 경기도 하남시 감이동 29 | <NA> | 37.506019 | 127.170311
284 | 2023 | 하남시 | 곱돌광산 | 2023-09-06 | 음용 | 불가능 | 부적합 | 총대장균균 | 검출 | 13023 | 경기도 하남시 창우동 산 26-1 | 경기도 하남시 검단산로146번길 240 | 37.527831 | 127.238867
285 | 2023 | 화성시 | 화성시 독지리 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 18546 | 경기도 화성시 송산면 독지리 677-2 | <NA> | 37.253426 | 126.701586
286 | 2023 | 화성시 | 화성시 만의사 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 18465 | 경기도 화성시 동탄면 중리 141 | <NA> | 37.208612 | 127.149543
287 | 2023 | 화성시 | 화성시 매봉산1 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 18549 | 경기도 화성시 송산면 삼존리 산65-5 | <NA> | 37.22113 | 126.742084
288 | 2023 | 화성시 | 화성시 매봉산2 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 18549 | 경기도 화성시 송산면 삼존리 산65-2 | <NA> | 37.221795 | 126.74153
289 | 2023 | 화성시 | 화성시 매봉산4 | 2023-09-06 | 음용 | 가능 | 적합 | <NA> | <NA> | 18549 | 경기도 화성시 송산면 삼존리 422-1 | <NA> | 37.220129 | 126.750031