Overview

Dataset statistics

Number of variables: 8
Number of observations: 41
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 72.2 B

Variable types

Text: 2
Categorical: 4
Numeric: 2

Alerts

High correlation: 하부굴착구경(mm) is highly overall correlated with 수온값 and 2 other fields
High correlation: 관리기관명 is highly overall correlated with 수온값 and 2 other fields
High correlation: 설치일자 is highly overall correlated with 수온값 and 2 other fields
High correlation: 수온값 is highly overall correlated with 설치일자 and 2 other fields
Imbalance: 상부굴착구경(mm) is highly imbalanced (83.5%)
Zeros: 수온값 has 8 (19.5%) zeros
Zeros: 수위 has 8 (19.5%) zeros
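
The percentages in the Imbalance and Zeros alerts can be recomputed from counts that appear elsewhere in this report. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class counts, normalized by log2 of the number of classes. That definition is an assumption, although it does reproduce 83.5% for the 40/1 split of 상부굴착구경(mm). The function names are illustrative only.

```python
import math

def imbalance_score(counts):
    """One minus normalized Shannon entropy of class counts (assumed definition)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # single class: returned as 0 by convention in this sketch
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))

def zero_share(n_zeros, n_rows):
    """Fraction of observations that are exactly zero."""
    return n_zeros / n_rows

# Counts taken from the report: 300 mm occurs 40 times, 400 mm once.
upper_diameter_counts = [40, 1]
print(f"imbalance: {imbalance_score(upper_diameter_counts):.1%}")
# 8 zeros among 41 observations, as reported for both 수온값 and 수위.
print(f"zeros: {zero_share(8, 41):.1%}")
```

A score near 100% means almost all rows fall in one class, which is why a column with a single deviating row is flagged.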

Reproduction

Analysis started: 2023-12-10 11:42:21.525196
Analysis finished: 2023-12-10 11:42:22.999067
Duration: 1.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
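
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 11:42:21.525196")
finished = datetime.fromisoformat("2023-12-10 11:42:22.999067")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```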

Variables

관측소명 (station name)
Text

Distinct: 29
Distinct (%): 70.7%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.195122
Min length: 4

Characters and Unicode

Total characters: 172
Distinct characters: 55
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
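
As an illustration of those character properties, the sketch below counts characters by Unicode general category for one station name and one address taken from the Sample section of this report. It uses Python's standard `unicodedata` module. The two-letter codes correspond to the category names used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, and `Pd` is Dash Punctuation. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

station = "부산송정1"                         # a 관측소명 value
address = "전라남도 고흥군 고흥읍 고소리 산6-5"  # a 주소 value

station_counts = category_counts(station)
address_counts = category_counts(address)
print(dict(station_counts))
print(dict(address_counts))
```

Summing such counters over all 41 rows would give per-category totals of the kind shown in the "Most occurring categories" tables.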

Unique

Unique: 17
Unique (%): 41.5%

Sample

1st row: 고흥고흥
2nd row: 고흥과역
3rd row: 고흥남양
4th row: 고흥봉래
5th row: 보성보성

Value | Count | Frequency (%)
울산화산 | 2 | 4.9%
창원팔용 | 2 | 4.9%
창원천선 | 2 | 4.9%
부산송정3 | 2 | 4.9%
부산송정1 | 2 | 4.9%
해남북일 | 2 | 4.9%
부산송정2 | 2 | 4.9%
울산원산 | 2 | 4.9%
창원신촌 | 2 | 4.9%
창원성산 | 2 | 4.9%
Other values (19) | 21 | 51.2%

Most occurring characters

ValueCountFrequency (%)
24
 
14.0%
11
 
6.4%
9
 
5.2%
9
 
5.2%
8
 
4.7%
8
 
4.7%
7
 
4.1%
5
 
2.9%
5
 
2.9%
5
 
2.9%
Other values (45) 81
47.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 164 | 95.3%
Decimal Number | 8 | 4.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
14.6%
11
 
6.7%
9
 
5.5%
9
 
5.5%
8
 
4.9%
8
 
4.9%
7
 
4.3%
5
 
3.0%
5
 
3.0%
5
 
3.0%
Other values (41) 73
44.5%
Decimal Number
ValueCountFrequency (%)
4 2
25.0%
3 2
25.0%
2 2
25.0%
1 2
25.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 164 | 95.3%
Common | 8 | 4.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
14.6%
11
 
6.7%
9
 
5.5%
9
 
5.5%
8
 
4.9%
8
 
4.9%
7
 
4.3%
5
 
3.0%
5
 
3.0%
5
 
3.0%
Other values (41) 73
44.5%
Common
ValueCountFrequency (%)
4 2
25.0%
3 2
25.0%
2 2
25.0%
1 2
25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 164 | 95.3%
ASCII | 8 | 4.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
24
 
14.6%
11
 
6.7%
9
 
5.5%
9
 
5.5%
8
 
4.9%
8
 
4.9%
7
 
4.3%
5
 
3.0%
5
 
3.0%
5
 
3.0%
Other values (41) 73
44.5%
ASCII
ValueCountFrequency (%)
4 2
25.0%
3 2
25.0%
2 2
25.0%
1 2
25.0%

설치일자 (installation date)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20211201
2nd row: 20211201
3rd row: 20211201
4th row: 20211201
5th row: 20211201

Common Values

Value | Count | Frequency (%)
20211214 | 23 | 56.1%
20211201 | 11 | 26.8%
20211216 | 7 | 17.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Plot of the common values

주소 (address)
Text

Distinct: 28
Distinct (%): 68.3%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 25
Median length: 24
Mean length: 20.634146
Min length: 16

Characters and Unicode

Total characters: 846
Distinct characters: 89
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 41.5%

Sample

1st row: 전라남도 고흥군 고흥읍 고소리 산6-5
2nd row: 전라남도 고흥군 과역면 신곡리 1117-9
3rd row: 전라남도 고흥군 남양면 장담리 3056
4th row: 전라남도 고흥군 봉래면 외초리 108-3
5th row: 전라남도 보성군 보성읍 대야리 산177-10

Value | Count | Frequency (%)
부산광역시 | 8 | 4.1%
송정동 | 8 | 4.1%
강서구 | 8 | 4.1%
경상남도 | 8 | 4.1%
창원시 | 8 | 4.1%
울산광역시 | 7 | 3.6%
전라남도 | 7 | 3.6%
온산읍 | 7 | 3.6%
울주군 | 7 | 3.6%
경상북도 | 7 | 3.6%
Other values (82) | 120 | 61.5%

Most occurring characters

ValueCountFrequency (%)
154
 
18.2%
1 41
 
4.8%
40
 
4.7%
29
 
3.4%
28
 
3.3%
23
 
2.7%
21
 
2.5%
- 21
 
2.5%
20
 
2.4%
5 19
 
2.2%
Other values (79) 450
53.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 517 | 61.1%
Space Separator | 154 | 18.2%
Decimal Number | 154 | 18.2%
Dash Punctuation | 21 | 2.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
40
 
7.7%
29
 
5.6%
28
 
5.4%
23
 
4.4%
21
 
4.1%
20
 
3.9%
19
 
3.7%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (67) 289
55.9%
Decimal Number
ValueCountFrequency (%)
1 41
26.6%
5 19
12.3%
2 18
11.7%
7 15
 
9.7%
0 13
 
8.4%
4 11
 
7.1%
9 11
 
7.1%
6 10
 
6.5%
3 10
 
6.5%
8 6
 
3.9%
Space Separator
ValueCountFrequency (%)
154
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 517 | 61.1%
Common | 329 | 38.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
40
 
7.7%
29
 
5.6%
28
 
5.4%
23
 
4.4%
21
 
4.1%
20
 
3.9%
19
 
3.7%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (67) 289
55.9%
Common
ValueCountFrequency (%)
154
46.8%
1 41
 
12.5%
- 21
 
6.4%
5 19
 
5.8%
2 18
 
5.5%
7 15
 
4.6%
0 13
 
4.0%
4 11
 
3.3%
9 11
 
3.3%
6 10
 
3.0%
Other values (2) 16
 
4.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 517 | 61.1%
ASCII | 329 | 38.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
154
46.8%
1 41
 
12.5%
- 21
 
6.4%
5 19
 
5.8%
2 18
 
5.5%
7 15
 
4.6%
0 13
 
4.0%
4 11
 
3.3%
9 11
 
3.3%
6 10
 
3.0%
Other values (2) 16
 
4.9%
Hangul
ValueCountFrequency (%)
40
 
7.7%
29
 
5.6%
28
 
5.4%
23
 
4.4%
21
 
4.1%
20
 
3.9%
19
 
3.7%
16
 
3.1%
16
 
3.1%
16
 
3.1%
Other values (67) 289
55.9%

관리기관명 (managing agency name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 12
Median length: 11
Mean length: 11.439024
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 환경부. 한국수자원공사
2nd row: 환경부. 한국수자원공사
3rd row: 환경부. 한국수자원공사
4th row: 환경부. 한국수자원공사
5th row: 환경부. 한국수자원공사

Common Values

Value | Count | Frequency (%)
환경부. 한국환경공단 | 23 | 56.1%
환경부. 한국수자원공사 | 18 | 43.9%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
환경부 | 41 | 50.0%
한국환경공단 | 23 | 28.0%
한국수자원공사 | 18 | 22.0%

상부굴착구경(mm) (upper drilling diameter)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 2.4%

Sample

1st row: 300
2nd row: 300
3rd row: 300
4th row: 300
5th row: 300

Common Values

Value | Count | Frequency (%)
300 | 40 | 97.6%
400 | 1 | 2.4%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Plot of the common values

하부굴착구경(mm) (lower drilling diameter)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 200
2nd row: 200
3rd row: 200
4th row: 200
5th row: 200

Common Values

Value | Count | Frequency (%)
300 | 24 | 58.5%
200 | 17 | 41.5%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Plot of the common values

수온값 (water temperature)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 27
Distinct (%): 65.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.207317
Minimum: 0
Maximum: 17.5
Zeros: 8
Zeros (%): 19.5%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 12.3
Median: 14.6
Q3: 16.7
95th percentile: 17.3
Maximum: 17.5
Range: 17.5
Interquartile range (IQR): 4.4

Descriptive statistics

Standard deviation: 6.3140474
Coefficient of variation (CV): 0.51723466
Kurtosis: 0.15930032
Mean: 12.207317
Median Absolute Deviation (MAD): 2.2
Skewness: -1.3327577
Sum: 500.5
Variance: 39.867195
Monotonicity: Not monotonic
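
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. The sketch below checks them against the values reported for 수온값. The variable names are illustrative, and the inputs are the rounded figures from this report, so agreement is only expected up to rounding.

```python
n = 41                        # number of observations
total = 500.5                 # reported Sum
std = 6.3140474               # reported standard deviation
q1, q3 = 12.3, 16.7           # reported quartiles
minimum, maximum = 0.0, 17.5  # reported extremes

mean = total / n
cv = std / mean
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.6f}  cv={cv:.6f}  variance={variance:.4f}")
print(f"iqr={iqr:.1f}  range={value_range:.1f}")
```

The same identities apply to any numeric column in the report.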
Figure: Histogram with fixed size bins (bins=27)

Common values

Value | Count | Frequency (%)
0.0 | 8 | 19.5%
17.3 | 3 | 7.3%
12.4 | 2 | 4.9%
13.9 | 2 | 4.9%
17.0 | 2 | 4.9%
16.1 | 2 | 4.9%
16.7 | 2 | 4.9%
17.2 | 1 | 2.4%
17.4 | 1 | 2.4%
17.1 | 1 | 2.4%
Other values (17) | 17 | 41.5%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 8 | 19.5%
11.7 | 1 | 2.4%
12.1 | 1 | 2.4%
12.3 | 1 | 2.4%
12.4 | 2 | 4.9%
12.9 | 1 | 2.4%
13.3 | 1 | 2.4%
13.4 | 1 | 2.4%
13.5 | 1 | 2.4%
13.9 | 2 | 4.9%

Maximum 10 values

Value | Count | Frequency (%)
17.5 | 1 | 2.4%
17.4 | 1 | 2.4%
17.3 | 3 | 7.3%
17.2 | 1 | 2.4%
17.1 | 1 | 2.4%
17.0 | 2 | 4.9%
16.8 | 1 | 2.4%
16.7 | 2 | 4.9%
16.1 | 2 | 4.9%
16.0 | 1 | 2.4%

수위 (water level)
Real number (ℝ)

ZEROS

Distinct: 31
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1960976
Minimum: 0
Maximum: 24.6
Zeros: 8
Zeros (%): 19.5%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1.66
Median: 3.4
Q3: 5.24
95th percentile: 10.77
Maximum: 24.6
Range: 24.6
Interquartile range (IQR): 3.58

Descriptive statistics

Standard deviation: 4.6549392
Coefficient of variation (CV): 1.1093496
Kurtosis: 9.1922179
Mean: 4.1960976
Median Absolute Deviation (MAD): 1.84
Skewness: 2.6395694
Sum: 172.04
Variance: 21.668459
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
0.0 | 8 | 19.5%
3.2 | 2 | 4.9%
5.0 | 2 | 4.9%
3.58 | 2 | 4.9%
1.95 | 1 | 2.4%
5.61 | 1 | 2.4%
5.46 | 1 | 2.4%
5.24 | 1 | 2.4%
24.6 | 1 | 2.4%
1.66 | 1 | 2.4%
Other values (21) | 21 | 51.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 8 | 19.5%
1.2 | 1 | 2.4%
1.31 | 1 | 2.4%
1.66 | 1 | 2.4%
1.7 | 1 | 2.4%
1.9 | 1 | 2.4%
1.95 | 1 | 2.4%
2.17 | 1 | 2.4%
2.78 | 1 | 2.4%
2.92 | 1 | 2.4%

Maximum 10 values

Value | Count | Frequency (%)
24.6 | 1 | 2.4%
16.3 | 1 | 2.4%
10.77 | 1 | 2.4%
9.9 | 1 | 2.4%
8.66 | 1 | 2.4%
6.1 | 1 | 2.4%
5.61 | 1 | 2.4%
5.54 | 1 | 2.4%
5.46 | 1 | 2.4%
5.41 | 1 | 2.4%
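
The strong right skew and high kurtosis reported for 수위 come from a few large readings. One common way to make that concrete is the 1.5 × IQR rule (Tukey fences). The report itself does not apply this rule, so the sketch below is an illustrative addition that uses the reported quartiles and the ten largest values listed for 수위.

```python
# Reported quartiles for 수위.
q1, q3 = 1.66, 5.24
iqr = q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest 수위 values listed in the report; every other value is smaller.
largest = [24.6, 16.3, 10.77, 9.9, 8.66, 6.1, 5.61, 5.54, 5.46, 5.41]
high_outliers = [v for v in largest if v > upper_fence]

print(f"upper fence: {upper_fence:.2f}")
print("values above the fence:", high_outliers)
```

The lower fence is negative while the minimum is 0, so this rule flags nothing at the low end. The eight zeros are still worth checking separately, since a water level of exactly 0.0 may stand for a missing reading.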

Interactions


Correlations

Variable | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위
관측소명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.958 | 0.938 | 0.931
설치일자 | 1.000 | 1.000 | 1.000 | 1.000 | 0.086 | 0.705 | 0.785 | 0.460
주소 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.991 | 0.941 | 0.960
관리기관명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.987 | 0.718 | 0.337
상부굴착구경(mm) | 0.000 | 0.086 | 0.000 | 0.000 | 1.000 | 0.000 | 0.064 | 0.000
하부굴착구경(mm) | 0.958 | 0.705 | 0.991 | 0.987 | 0.000 | 1.000 | 0.660 | 0.274
수온값 | 0.938 | 0.785 | 0.941 | 0.718 | 0.064 | 0.660 | 1.000 | 0.590
수위 | 0.931 | 0.460 | 0.960 | 0.337 | 0.000 | 0.274 | 0.590 | 1.000

Variable | 상부굴착구경(mm) | 하부굴착구경(mm) | 관리기관명 | 설치일자
상부굴착구경(mm) | 1.000 | 0.000 | 0.000 | 0.137
하부굴착구경(mm) | 0.000 | 1.000 | 0.899 | 0.938
관리기관명 | 0.000 | 0.899 | 1.000 | 0.987
설치일자 | 0.137 | 0.938 | 0.987 | 1.000

Variable | 수온값 | 수위 | 설치일자 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm)
수온값 | 1.000 | 0.285 | 0.777 | 0.818 | 0.057 | 0.759
수위 | 0.285 | 1.000 | 0.325 | 0.332 | 0.000 | 0.267
설치일자 | 0.777 | 0.325 | 1.000 | 0.987 | 0.137 | 0.938
관리기관명 | 0.818 | 0.332 | 0.987 | 1.000 | 0.000 | 0.899
상부굴착구경(mm) | 0.057 | 0.000 | 0.137 | 0.000 | 1.000 | 0.000
하부굴착구경(mm) | 0.759 | 0.267 | 0.938 | 0.899 | 0.000 | 1.000
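
The High correlation alerts near the top of the report appear consistent with the last matrix above. The sketch below flags, for each variable, the other variables whose coefficient exceeds a threshold. Both the threshold of 0.5 and the choice of this matrix as the source of the alerts are assumptions. Under them, the four flagged variables and their partner counts match the alerts ("X and 2 other fields" means three partners). The function name is illustrative.

```python
fields = ["수온값", "수위", "설치일자", "관리기관명", "상부굴착구경(mm)", "하부굴착구경(mm)"]

# Coefficients copied from the six-variable matrix in this report.
matrix = [
    [1.000, 0.285, 0.777, 0.818, 0.057, 0.759],
    [0.285, 1.000, 0.325, 0.332, 0.000, 0.267],
    [0.777, 0.325, 1.000, 0.987, 0.137, 0.938],
    [0.818, 0.332, 0.987, 1.000, 0.000, 0.899],
    [0.057, 0.000, 0.137, 0.000, 1.000, 0.000],
    [0.759, 0.267, 0.938, 0.899, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

def highly_correlated(names, coeffs, threshold):
    """Map each variable to the other variables it exceeds the threshold with."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(coeffs[i])
                    if j != i and value > threshold]
        if partners:
            flagged[name] = partners
    return flagged

flagged = highly_correlated(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

수위 and 상부굴착구경(mm) have no coefficient above 0.5 in this matrix, which agrees with the absence of High correlation alerts for them.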

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위
0 | 고흥고흥 | 20211201 | 전라남도 고흥군 고흥읍 고소리 산6-5 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
1 | 고흥과역 | 20211201 | 전라남도 고흥군 과역면 신곡리 1117-9 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
2 | 고흥남양 | 20211201 | 전라남도 고흥군 남양면 장담리 3056 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
3 | 고흥봉래 | 20211201 | 전라남도 고흥군 봉래면 외초리 108-3 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
4 | 보성보성 | 20211201 | 전라남도 보성군 보성읍 대야리 산177-10 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
5 | 안동남후 | 20211216 | 경상북도 안동시 남후면 광음리 427-1 | 환경부. 한국수자원공사 | 300 | 200 | 12.1 | 1.9
6 | 안동도산 | 20211216 | 경상북도 안동시 도산면 단천리 532 | 환경부. 한국수자원공사 | 300 | 200 | 13.3 | 16.3
7 | 안동송천 | 20211216 | 경상북도 안동시 송천동 1319-101 | 환경부. 한국수자원공사 | 300 | 200 | 12.4 | 5.0
8 | 영양석보 | 20211216 | 경상북도 영양군 석보면 요원리 282-1 | 환경부. 한국수자원공사 | 300 | 200 | 13.9 | 6.1
9 | 영양섬촌 | 20211216 | 경상북도 영양군 일월면 섬촌리 1-2 | 환경부. 한국수자원공사 | 300 | 200 | 12.9 | 4.6

Last rows

Index | 관측소명 | 설치일자 | 주소 | 관리기관명 | 상부굴착구경(mm) | 하부굴착구경(mm) | 수온값 | 수위
31 | 해남북일 | 20211201 | 전라남도 해남군 북일면 용일리 1550 | 환경부. 한국수자원공사 | 400 | 300 | 0.0 | 0.0
32 | 해남북일 | 20211201 | 전라남도 해남군 북일면 용일리 1550 | 환경부. 한국수자원공사 | 300 | 200 | 0.0 | 0.0
33 | 부산송정1 | 20211214 | 부산광역시 강서구 송정동 1456 | 환경부. 한국환경공단 | 300 | 300 | 17.1 | 1.95
34 | 부산송정1 | 20211214 | 부산광역시 강서구 송정동 1456 | 환경부. 한국환경공단 | 300 | 300 | 17.3 | 1.66
35 | 부산송정2 | 20211214 | 부산광역시 강서구 송정동 1499-2 | 환경부. 한국환경공단 | 300 | 300 | 16.7 | 3.2
36 | 부산송정2 | 20211214 | 부산광역시 강서구 송정동 1499-2 | 환경부. 한국환경공단 | 300 | 300 | 17.3 | 3.1
37 | 부산송정3 | 20211214 | 부산광역시 강서구 송정동 1718 | 환경부. 한국환경공단 | 300 | 300 | 17.0 | 3.93
38 | 부산송정3 | 20211214 | 부산광역시 강서구 송정동 1718 | 환경부. 한국환경공단 | 300 | 300 | 17.4 | 3.53
39 | 부산송정4 | 20211214 | 부산광역시 강서구 송정동 1718 | 환경부. 한국환경공단 | 300 | 300 | 17.0 | 3.61
40 | 부산송정4 | 20211214 | 부산광역시 강서구 송정동 1718 | 환경부. 한국환경공단 | 300 | 300 | 16.1 | 1.31