Overview

Dataset statistics

Number of variables: 8
Number of observations: 24
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 20.8%
Total size in memory: 1.7 KiB
Average record size in memory: 74.5 B

Variable types

Text: 2
Categorical: 6

Alerts

수온값 (water temperature value) has constant value "0" [Constant]
수위 (water level) has constant value "0" [Constant]
Dataset has 5 (20.8%) duplicate rows [Duplicates]
상부굴착구경(mm) (upper borehole diameter) is highly overall correlated with 설치일자 (installation date) and 2 other fields [High correlation]
하부굴착구경(mm) (lower borehole diameter) is highly overall correlated with 관리기관명 (managing agency name) and 1 other field [High correlation]
관리기관명 is highly overall correlated with 상부굴착구경(mm) and 1 other field [High correlation]
설치일자 is highly overall correlated with 상부굴착구경(mm) [High correlation]

Reproduction

Analysis started: 2023-12-10 11:42:27.992756
Analysis finished: 2023-12-10 11:42:28.642957
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관측소명
Text

Distinct: 19
Distinct (%): 79.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 9
Median length: 4
Mean length: 4.5833333
Min length: 4

Characters and Unicode

Total characters: 110
Distinct characters: 51
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
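
As a rough illustration of where the category and block counts in this report come from, Python's standard `unicodedata` module exposes the general category of each code point (`Lo` is Other Letter, `Nd` Decimal Number, `Pc` Connector Punctuation, `Zs` Space Separator). The sketch below assumes the counts are taken per character over all values. The helper names and the `HANGUL_SYLLABLES` range check are illustrative only; the range check is a simplified stand-in for a full block lookup, which the standard library does not provide.

```python
import unicodedata
from collections import Counter

# Hangul Syllables block (U+AC00..U+D7A3); a simplified stand-in for a
# real Unicode block lookup, which the standard library does not offer.
HANGUL_SYLLABLES = range(0xAC00, 0xD7A4)


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


def block_counts(values):
    """Split characters into 'Hangul' and 'ASCII' (anything else is 'Other')."""
    counts = Counter()
    for value in values:
        for ch in unicodedata.normalize("NFC", value):
            if ord(ch) in HANGUL_SYLLABLES:
                counts["Hangul"] += 1
            elif ord(ch) < 0x80:
                counts["ASCII"] += 1
            else:
                counts["Other"] += 1
    return counts


# Three station names taken from the sample rows of this report.
names = ["신촌동 동성아파트", "인제서화_신", "안산성곡1"]
print(category_counts(names))
print(block_counts(names))
```

For these three names the sketch counts 17 letters, one space, one underscore and one digit, which mirrors the four categories and two blocks the report lists for the full column.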

Unique

Unique: 14
Unique (%): 58.3%
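
The two figures measure different things: "Distinct" counts the different values, while "Unique" appears to count the values that occur exactly once. With 19 distinct values of which 14 are unique, the remaining 5 values each occur twice, and 14 + 5 × 2 = 24 observations. A minimal sketch of that distinction, assuming this definition and using a hypothetical helper name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# The last ten station names shown in the Sample section of this report.
last_rows = [
    "익산망성", "익산망성", "통영도산", "인제서화_신", "창원진전_신",
    "광주곤지암", "안산성곡1", "안산성곡1", "안산성곡2", "안산성곡2",
]
print(distinct_and_unique(last_rows))  # (7, 4)
```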

Sample

1st row: 신촌동 동성아파트
2nd row: 강화강화
3rd row: 강화선원
4th row: 강화화도
5th row: 금산진산

Value | Count | Frequency (%)
안산원시 | 2 | 8.0%
익산망성 | 2 | 8.0%
안산성곡2 | 2 | 8.0%
안산신길 | 2 | 8.0%
안산성곡1 | 2 | 8.0%
용인남곡 | 1 | 4.0%
신촌동 | 1 | 4.0%
보령오천 | 1 | 4.0%
사천곤명 | 1 | 4.0%
안산목내 | 1 | 4.0%
Other values (10) | 10 | 40.0%

Most occurring characters

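Value | Count | Frequency (%)
(not shown) | 14 | 12.7%
(not shown) | 9 | 8.2%
(not shown) | 7 | 6.4%
(not shown) | 6 | 5.5%
(not shown) | 5 | 4.5%
(not shown) | 5 | 4.5%
(not shown) | 4 | 3.6%
(not shown) | 4 | 3.6%
(not shown) | 2 | 1.8%
(not shown) | 2 | 1.8%
Other values (41) | 52 | 47.3%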

Most occurring categories

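Value | Count | Frequency (%)
Other Letter | 103 | 93.6%
Decimal Number | 4 | 3.6%
Connector Punctuation | 2 | 1.8%
Space Separator | 1 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 14 | 13.6%
(not shown) | 9 | 8.7%
(not shown) | 7 | 6.8%
(not shown) | 6 | 5.8%
(not shown) | 5 | 4.9%
(not shown) | 5 | 4.9%
(not shown) | 4 | 3.9%
(not shown) | 4 | 3.9%
(not shown) | 2 | 1.9%
(not shown) | 2 | 1.9%
Other values (37) | 45 | 43.7%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 50.0%
1 | 2 | 50.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%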

Most occurring scripts

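Value | Count | Frequency (%)
Hangul | 103 | 93.6%
Common | 7 | 6.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 14 | 13.6%
(not shown) | 9 | 8.7%
(not shown) | 7 | 6.8%
(not shown) | 6 | 5.8%
(not shown) | 5 | 4.9%
(not shown) | 5 | 4.9%
(not shown) | 4 | 3.9%
(not shown) | 4 | 3.9%
(not shown) | 2 | 1.9%
(not shown) | 2 | 1.9%
Other values (37) | 45 | 43.7%

Common
Value | Count | Frequency (%)
_ | 2 | 28.6%
2 | 2 | 28.6%
1 | 2 | 28.6%
(space) | 1 | 14.3%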

Most occurring blocks

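Value | Count | Frequency (%)
Hangul | 103 | 93.6%
ASCII | 7 | 6.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 14 | 13.6%
(not shown) | 9 | 8.7%
(not shown) | 7 | 6.8%
(not shown) | 6 | 5.8%
(not shown) | 5 | 4.9%
(not shown) | 5 | 4.9%
(not shown) | 4 | 3.9%
(not shown) | 4 | 3.9%
(not shown) | 2 | 1.9%
(not shown) | 2 | 1.9%
Other values (37) | 45 | 43.7%

ASCII
Value | Count | Frequency (%)
_ | 2 | 28.6%
2 | 2 | 28.6%
1 | 2 | 28.6%
(space) | 1 | 14.3%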

설치일자
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 2
Unique (%): 8.3%

Sample

1st row: 20200724
2nd row: 20200421
3rd row: 20200421
4th row: 20200421
5th row: 20201231

Common Values

Value | Count | Frequency (%)
20201231 | 19 | 79.2%
20200421 | 3 | 12.5%
20200724 | 1 | 4.2%
20201221 | 1 | 4.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]
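
All values of 설치일자 are eight-digit strings that read as YYYYMMDD dates, which would explain why the column is profiled as Categorical rather than as a date. Assuming that reading of the format is right, a small sketch of converting the values before profiling could look like this (`parse_yyyymmdd` is a hypothetical helper, not part of the report tooling):

```python
from datetime import date, datetime


def parse_yyyymmdd(text):
    """Parse an 8-digit YYYYMMDD string into a date."""
    return datetime.strptime(text, "%Y%m%d").date()


# The four distinct values reported for this variable.
values = ["20201231", "20200421", "20200724", "20201221"]
parsed = sorted(parse_yyyymmdd(v) for v in values)
print(parsed[0], parsed[-1])  # 2020-04-21 2020-12-31
```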

주소
Text

Distinct: 19
Distinct (%): 79.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 27
Median length: 24.5
Mean length: 21.208333
Min length: 18

Characters and Unicode

Total characters: 509
Distinct characters: 82
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 58.3%

Sample

1st row: 경상남도 창원시 성산구 신촌동 22-4
2nd row: 인천광역시 강화군 강화읍 남산리 438-5
3rd row: 인천광역시 강화군 선원면 선행리 190-5
4th row: 인천광역시 강화군 화도면 상방리 산 131
5th row: 충청남도 금산군 진산면 묵산리 175

Value | Count | Frequency (%)
경기도 | 12 | 9.9%
단원구 | 9 | 7.4%
안산시 | 9 | 7.4%
경상남도 | 4 | 3.3%
성곡동 | 4 | 3.3%
강화군 | 3 | 2.5%
인천광역시 | 3 | 2.5%
전라북도 | 2 | 1.7%
847 | 2 | 1.7%
망성면 | 2 | 1.7%
Other values (62) | 71 | 58.7%

Most occurring characters

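Value | Count | Frequency (%)
(space) | 98 | 19.3%
(not shown) | 25 | 4.9%
(not shown) | 23 | 4.5%
(not shown) | 22 | 4.3%
(not shown) | 16 | 3.1%
(not shown) | 16 | 3.1%
7 | 13 | 2.6%
1 | 13 | 2.6%
(not shown) | 12 | 2.4%
(not shown) | 12 | 2.4%
Other values (72) | 259 | 50.9%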

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 314 | 61.7%
Space Separator | 98 | 19.3%
Decimal Number | 87 | 17.1%
Dash Punctuation | 10 | 2.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 25 | 8.0%
(not shown) | 23 | 7.3%
(not shown) | 22 | 7.0%
(not shown) | 16 | 5.1%
(not shown) | 16 | 5.1%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 11 | 3.5%
(not shown) | 10 | 3.2%
Other values (60) | 155 | 49.4%

Decimal Number
Value | Count | Frequency (%)
7 | 13 | 14.9%
1 | 13 | 14.9%
2 | 12 | 13.8%
3 | 10 | 11.5%
4 | 9 | 10.3%
8 | 8 | 9.2%
5 | 6 | 6.9%
0 | 6 | 6.9%
9 | 6 | 6.9%
6 | 4 | 4.6%

Space Separator
Value | Count | Frequency (%)
(space) | 98 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Most occurring scripts

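Value | Count | Frequency (%)
Hangul | 314 | 61.7%
Common | 195 | 38.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 25 | 8.0%
(not shown) | 23 | 7.3%
(not shown) | 22 | 7.0%
(not shown) | 16 | 5.1%
(not shown) | 16 | 5.1%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 11 | 3.5%
(not shown) | 10 | 3.2%
Other values (60) | 155 | 49.4%

Common
Value | Count | Frequency (%)
(space) | 98 | 50.3%
7 | 13 | 6.7%
1 | 13 | 6.7%
2 | 12 | 6.2%
3 | 10 | 5.1%
- | 10 | 5.1%
4 | 9 | 4.6%
8 | 8 | 4.1%
5 | 6 | 3.1%
0 | 6 | 3.1%
Other values (2) | 10 | 5.1%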

Most occurring blocks

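Value | Count | Frequency (%)
Hangul | 314 | 61.7%
ASCII | 195 | 38.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 98 | 50.3%
7 | 13 | 6.7%
1 | 13 | 6.7%
2 | 12 | 6.2%
3 | 10 | 5.1%
- | 10 | 5.1%
4 | 9 | 4.6%
8 | 8 | 4.1%
5 | 6 | 3.1%
0 | 6 | 3.1%
Other values (2) | 10 | 5.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 25 | 8.0%
(not shown) | 23 | 7.3%
(not shown) | 22 | 7.0%
(not shown) | 16 | 5.1%
(not shown) | 16 | 5.1%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 12 | 3.8%
(not shown) | 11 | 3.5%
(not shown) | 10 | 3.2%
Other values (60) | 155 | 49.4%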

관리기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 10
Median length: 1
Mean length: 4.375
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 15 | 62.5%
환경부 한국환경공단 | 9 | 37.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
(not shown) | 15 | 45.5%
환경부 | 9 | 27.3%
한국환경공단 | 9 | 27.3%

상부굴착구경(mm)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 3
Median length: 1
Mean length: 1.8333333
Min length: 1

Unique

Unique: 1
Unique (%): 4.2%

Sample

1st row: 200
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 14 | 58.3%
300 | 9 | 37.5%
200 | 1 | 4.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

하부굴착구경(mm)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 3
Median length: 1
Mean length: 1.75
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 15 | 62.5%
250 | 9 | 37.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

수온값
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 24 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

수위
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 24 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Correlations

Variable | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm)
관측소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
설치일자 | 1.000 | 1.000 | 1.000 | 0.291 | 0.680 | 0.291
주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
관리기관명 | 1.000 | 0.291 | 1.000 | 1.000 | 1.000 | 0.989
상부굴착구경(mm) | 1.000 | 0.680 | 1.000 | 1.000 | 1.000 | 1.000
하부굴착구경(mm) | 1.000 | 0.291 | 1.000 | 0.989 | 1.000 | 1.000

Variable | 상부굴착구경(mm) | 하부굴착구경(mm) | 관리기관명 | 설치일자
상부굴착구경(mm) | 1.000 | 0.977 | 0.977 | 0.692
하부굴착구경(mm) | 0.977 | 1.000 | 0.907 | 0.169
관리기관명 | 0.977 | 0.907 | 1.000 | 0.169
설치일자 | 0.692 | 0.169 | 0.169 | 1.000

Variable | 설치일자 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm)
설치일자 | 1.000 | 0.169 | 0.692 | 0.169
관리기관명 | 0.169 | 1.000 | 0.977 | 0.907
상부굴착구경(mm) | 0.692 | 0.977 | 1.000 | 0.977
하부굴착구경(mm) | 0.169 | 0.907 | 0.977 | 1.000
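
The report does not name the measure behind these matrices. The 0.907 and 0.977 entries are, however, consistent with a bias-corrected Cramér's V computed on the contingency tables of the categorical columns, with Yates' continuity correction applied to 2×2 tables. The sketch below works under that assumption. The function name is hypothetical, and the cell counts are derived from the Common Values, Sample and Duplicate rows listings in this report.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows).

    Assumption: Yates' continuity correction is applied when the table
    has one degree of freedom, as SciPy's chi2_contingency does by default.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)
    yates = (r - 1) * (k - 1) == 1

    # Pearson chi-squared statistic, cell by cell.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= min(0.5, diff)
            chi2 += diff * diff / expected

    # Bias correction of phi squared and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return 1.0 if denom == 0 else math.sqrt(phi2_corr / denom)


# Rows: 관리기관명 ("-", "환경부 한국환경공단"); columns: 하부굴착구경(mm) (0, 250).
agency_vs_lower = [[15, 0], [0, 9]]
# Rows: 상부굴착구경(mm) (0, 200, 300); columns: 하부굴착구경(mm) (0, 250).
upper_vs_lower = [[14, 0], [1, 0], [0, 9]]

print(round(cramers_v_corrected(agency_vs_lower), 3))  # 0.907
print(round(cramers_v_corrected(upper_vs_lower), 3))   # 0.977
```

Both pairs are perfectly associated in the data, yet the coefficient stays below 1.000 because the bias correction (and, for the 2×2 case, the continuity correction) shrinks the estimate at n = 24.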

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

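First rows

Row | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위
0 | 신촌동 동성아파트 | 20200724 | 경상남도 창원시 성산구 신촌동 22-4 | - | 200 | 0 | 0 | 0
1 | 강화강화 | 20200421 | 인천광역시 강화군 강화읍 남산리 438-5 | - | 0 | 0 | 0 | 0
2 | 강화선원 | 20200421 | 인천광역시 강화군 선원면 선행리 190-5 | - | 0 | 0 | 0 | 0
3 | 강화화도 | 20200421 | 인천광역시 강화군 화도면 상방리 산 131 | - | 0 | 0 | 0 | 0
4 | 금산진산 | 20201231 | 충청남도 금산군 진산면 묵산리 175 | - | 0 | 0 | 0 | 0
5 | 보령오천 | 20201231 | 충청남도 보령시 오천면 원산도3길 108 | - | 0 | 0 | 0 | 0
6 | 사천곤명 | 20201231 | 경상남도 사천시 곤명면 37-1 | - | 0 | 0 | 0 | 0
7 | 안산목내 | 20201231 | 경기도 안산시 단원구 목내동 472 | 환경부 한국환경공단 | 300 | 250 | 0 | 0
8 | 안산신길 | 20201231 | 경기도 안산시 단원구 신길동 1053 | 환경부 한국환경공단 | 300 | 250 | 0 | 0
9 | 안산신길 | 20201231 | 경기도 안산시 단원구 신길동 1053 | 환경부 한국환경공단 | 300 | 250 | 0 | 0

Last rows

Row | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위
14 | 익산망성 | 20201231 | 전라북도 익산시 망성면 안성로 726 | - | 0 | 0 | 0 | 0
15 | 익산망성 | 20201231 | 전라북도 익산시 망성면 안성로 726 | - | 0 | 0 | 0 | 0
16 | 통영도산 | 20201231 | 경상남도 통영시 도산면 수월리 산257-4 | - | 0 | 0 | 0 | 0
17 | 인제서화_신 | 20201231 | 강원도 인제군 서화면 천도리 1092 | - | 0 | 0 | 0 | 0
18 | 창원진전_신 | 20201221 | 경상남도 창원시마산합포구 진전면 오서리 484번지 | - | 0 | 0 | 0 | 0
19 | 광주곤지암 | 20201231 | 경기도 광주시 곤지암읍 평촌길 12-137 | - | 0 | 0 | 0 | 0
20 | 안산성곡1 | 20201231 | 경기도 안산시 단원구 성곡동 627-2 | 환경부 한국환경공단 | 300 | 250 | 0 | 0
21 | 안산성곡1 | 20201231 | 경기도 안산시 단원구 성곡동 627-2 | 환경부 한국환경공단 | 300 | 250 | 0 | 0
22 | 안산성곡2 | 20201231 | 경기도 안산시 단원구 성곡동 793 | 환경부 한국환경공단 | 300 | 250 | 0 | 0
23 | 안산성곡2 | 20201231 | 경기도 안산시 단원구 성곡동 793 | 환경부 한국환경공단 | 300 | 250 | 0 | 0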

Duplicate rows

Most frequently occurring

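Row | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위 | # duplicates
0 | 안산성곡1 | 20201231 | 경기도 안산시 단원구 성곡동 627-2 | 환경부 한국환경공단 | 300 | 250 | 0 | 0 | 2
1 | 안산성곡2 | 20201231 | 경기도 안산시 단원구 성곡동 793 | 환경부 한국환경공단 | 300 | 250 | 0 | 0 | 2
2 | 안산신길 | 20201231 | 경기도 안산시 단원구 신길동 1053 | 환경부 한국환경공단 | 300 | 250 | 0 | 0 | 2
3 | 안산원시 | 20201231 | 경기도 안산시 단원구 원시동 847 | 환경부 한국환경공단 | 300 | 250 | 0 | 0 | 2
4 | 익산망성 | 20201231 | 전라북도 익산시 망성면 안성로 726 | - | 0 | 0 | 0 | 0 | 2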
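
The duplicate figure in the overview appears to count each repeated row once: five repeated rows out of 24 observations gives 20.8%. A minimal sketch of that grouping, using a hypothetical helper and only two of the eight columns to keep the example short:

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# (관측소명, 설치일자) pairs for the last ten rows of the Sample section;
# the remaining columns are left out here only to keep the example short.
rows = [
    ("익산망성", "20201231"), ("익산망성", "20201231"),
    ("통영도산", "20201231"), ("인제서화_신", "20201231"),
    ("창원진전_신", "20201221"), ("광주곤지암", "20201231"),
    ("안산성곡1", "20201231"), ("안산성곡1", "20201231"),
    ("안산성곡2", "20201231"), ("안산성곡2", "20201231"),
]
groups = duplicate_groups(rows)
print(len(groups))      # 3 repeated rows in this ten-row slice
print(f"{5 / 24:.1%}")  # 20.8%, the share reported for the full dataset
```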