Overview

Dataset statistics

Number of variables: 8
Number of observations: 850
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 56.6 KiB
Average record size in memory: 68.2 B
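The average record size can be cross-checked from the two figures above it: total memory divided by the number of observations. The sketch below assumes that the report's KiB means 1024 bytes; the function name is illustrative and not part of any library.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_KIB = 56.6   # total size in memory, in KiB
N_OBSERVATIONS = 850    # number of rows


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Bytes per record, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")  # 68.2 B
```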

Variable types

Categorical: 2
Numeric: 4
Text: 2

Dataset

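Description: Produces basin-specific precipitation forecast information to support responses to hydrometeorological disasters such as floods and to assist agencies involved in water management. For details, see the Korea Meteorological Administration's hydrometeorological drought information system (https://hydro.kma.go.kr).
Author: Korea Meteorological Administration (기상청)
URL: https://www.data.go.kr/data/15068582/fileData.do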

Alerts

일자(연월일) has constant value "2020-08-01" (Constant)
표준유역코드 is highly overall correlated with 중권역코드 and 3 other fields (High correlation)
중권역코드 is highly overall correlated with 표준유역코드 and 3 other fields (High correlation)
대권역코드 is highly overall correlated with 표준유역코드 and 3 other fields (High correlation)
강수량(mm) is highly overall correlated with 표준유역코드 and 2 other fields (High correlation)
대권역 is highly overall correlated with 표준유역코드 and 2 other fields (High correlation)
표준유역코드 has unique values (Unique)
강수량(mm) has 176 (20.7%) zeros (Zeros)
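Three of these alerts (Constant, Unique, Zeros) follow directly from per-column counts reported in the variable sections. The sketch below reproduces them with deliberately simple rules. The rules and the helper `simple_alerts` are illustrative assumptions, not ydata-profiling's actual alert logic or thresholds.

```python
N_ROWS = 850

# Distinct and zero counts copied from the report's variable sections.
# Zero counts are reported for numeric columns only; None means "not reported".
columns = {
    "일자(연월일)": {"distinct": 1, "zeros": None},
    "표준유역코드": {"distinct": 850, "zeros": 0},
    "강수량(mm)": {"distinct": 341, "zeros": 176},
}


def simple_alerts(stats: dict, n_rows: int) -> list:
    """Hypothetical, simplified alert rules for illustration only."""
    alerts = []
    if stats["distinct"] == 1:
        alerts.append("Constant")          # every row holds the same value
    if stats["distinct"] == n_rows:
        alerts.append("Unique")            # every row holds a different value
    if stats["zeros"]:
        share = 100 * stats["zeros"] / n_rows
        alerts.append(f"Zeros ({share:.1f}%)")
    return alerts


for name, stats in columns.items():
    print(name, simple_alerts(stats, N_ROWS))
```

With the counts above, the precipitation column yields a zero share of 176 / 850, which rounds to the 20.7% shown in the alert.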

Reproduction

Analysis started: 2023-12-12 18:37:00.000758
Analysis finished: 2023-12-12 18:37:05.408864
Duration: 5.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
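The reported duration is the difference between the two timestamps. A minimal check with the standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2023-12-12 18:37:00.000758", FMT)
finished = datetime.strptime("2023-12-12 18:37:05.408864", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 5.41 seconds
```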

Variables

일자(연월일) (date, year-month-day)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
2020-08-01: 850

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-08-01
2nd row: 2020-08-01
3rd row: 2020-08-01
4th row: 2020-08-01
5th row: 2020-08-01

Common Values

Value | Count | Frequency (%)
2020-08-01 | 850 | 100.0%

Length

Histogram of lengths of the category


표준유역코드 (standard watershed code)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 850
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 236577.32
Minimum: 100101
Maximum: 600405
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 KiB

Quantile statistics

Minimum: 100101
5-th percentile: 100313.45
Q1: 102201.25
Median: 201554.5
Q3: 301377
95-th percentile: 510101.55
Maximum: 600405
Range: 500304
Interquartile range (IQR): 199175.75

Descriptive statistics

Standard deviation: 132570.7
Coefficient of variation (CV): 0.56036944
Kurtosis: 0.016057783
Mean: 236577.32
Median Absolute Deviation (MAD): 99648.5
Skewness: 0.90103938
Sum: 2.0109072 × 10^8
Variance: 1.7574991 × 10^10
Monotonicity: Strictly increasing
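Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digits are expected.

```python
import math

# Values copied from the 표준유역코드 statistics above.
minimum, maximum = 100101, 600405
q1, q3 = 102201.25, 301377
mean, std = 236577.32, 132570.7

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range)                # 500304
print(iqr)                        # 199175.75
# Compare against the reported values within a loose relative tolerance.
print(math.isclose(cv, 0.56036944, rel_tol=1e-5))          # True
print(math.isclose(variance, 1.7574991e10, rel_tol=1e-5))  # True
```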
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
100101 | 1 | 0.1%
250407 | 1 | 0.1%
250409 | 1 | 0.1%
300101 | 1 | 0.1%
300102 | 1 | 0.1%
300103 | 1 | 0.1%
300104 | 1 | 0.1%
300105 | 1 | 0.1%
300106 | 1 | 0.1%
300107 | 1 | 0.1%
Other values (840) | 840 | 98.8%

Minimum 10 values
Value | Count | Frequency (%)
100101 | 1 | 0.1%
100102 | 1 | 0.1%
100103 | 1 | 0.1%
100104 | 1 | 0.1%
100105 | 1 | 0.1%
100106 | 1 | 0.1%
100107 | 1 | 0.1%
100108 | 1 | 0.1%
100109 | 1 | 0.1%
100110 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
600405 | 1 | 0.1%
600404 | 1 | 0.1%
600403 | 1 | 0.1%
600402 | 1 | 0.1%
600401 | 1 | 0.1%
600304 | 1 | 0.1%
600303 | 1 | 0.1%
600302 | 1 | 0.1%
600301 | 1 | 0.1%
600204 | 1 | 0.1%

표준유역 (standard watershed name)
Text

Distinct: 820
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB

Length

Max length: 10
Median length: 9
Mean length: 3.9905882
Min length: 2

Characters and Unicode

Total characters: 3392
Distinct characters: 237
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 800
Unique (%): 94.1%

Sample

1st row: 광동댐
2nd row: 광동댐하류
3rd row: 임계천
4th row: 골지천중류
5th row: 도암댐

Value | Count | Frequency (%)
남천 | 5 | 0.6%
동천 | 4 | 0.5%
북천 | 4 | 0.5%
남대천 | 3 | 0.4%
광천 | 3 | 0.4%
한천 | 3 | 0.4%
석교천 | 2 | 0.2%
금천 | 2 | 0.2%
화양천 | 2 | 0.2%
사천천 | 2 | 0.2%
Other values (810) | 820 | 96.5%

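Most occurring characters

Value | Count | Frequency (%)
[missing] | 607 | 17.9%
[missing] | 288 | 8.5%
[missing] | 124 | 3.7%
[missing] | 102 | 3.0%
[missing] | 102 | 3.0%
[missing] | 95 | 2.8%
[missing] | 92 | 2.7%
[missing] | 79 | 2.3%
[missing] | 71 | 2.1%
[missing] | 69 | 2.0%
Other values (227) | 1763 | 52.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3390 | 99.9%
Decimal Number | 1 | < 0.1%
Other Punctuation | 1 | < 0.1%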

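Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 607 | 17.9%
[missing] | 288 | 8.5%
[missing] | 124 | 3.7%
[missing] | 102 | 3.0%
[missing] | 102 | 3.0%
[missing] | 95 | 2.8%
[missing] | 92 | 2.7%
[missing] | 79 | 2.3%
[missing] | 71 | 2.1%
[missing] | 69 | 2.0%
Other values (225) | 1761 | 51.9%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 1 | 100.0%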

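Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3390 | 99.9%
Common | 2 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[missing] | 607 | 17.9%
[missing] | 288 | 8.5%
[missing] | 124 | 3.7%
[missing] | 102 | 3.0%
[missing] | 102 | 3.0%
[missing] | 95 | 2.8%
[missing] | 92 | 2.7%
[missing] | 79 | 2.3%
[missing] | 71 | 2.1%
[missing] | 69 | 2.0%
Other values (225) | 1761 | 51.9%

Common
Value | Count | Frequency (%)
2 | 1 | 50.0%
· | 1 | 50.0%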

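Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3390 | 99.9%
ASCII | 1 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[missing] | 607 | 17.9%
[missing] | 288 | 8.5%
[missing] | 124 | 3.7%
[missing] | 102 | 3.0%
[missing] | 102 | 3.0%
[missing] | 95 | 2.8%
[missing] | 92 | 2.7%
[missing] | 79 | 2.3%
[missing] | 71 | 2.1%
[missing] | 69 | 2.0%
Other values (225) | 1761 | 51.9%

ASCII
Value | Count | Frequency (%)
2 | 1 | 100.0%

None
Value | Count | Frequency (%)
· | 1 | 100.0%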

중권역코드 (mid-level basin code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 117
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2365.7176
Minimum: 1001
Maximum: 6004
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 1003
Q1: 1022
Median: 2015.5
Q3: 3013.75
95-th percentile: 5101
Maximum: 6004
Range: 5003
Interquartile range (IQR): 1991.75

Descriptive statistics

Standard deviation: 1325.7181
Coefficient of variation (CV): 0.5603873
Kurtosis: 0.016056697
Mean: 2365.7176
Median Absolute Deviation (MAD): 996.5
Skewness: 0.90103739
Sum: 2010860
Variance: 1757528.6
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1003 | 20 | 2.4%
2012 | 19 | 2.2%
1007 | 19 | 2.2%
1101 | 18 | 2.1%
1001 | 17 | 2.0%
2004 | 17 | 2.0%
2002 | 16 | 1.9%
1018 | 16 | 1.9%
3101 | 16 | 1.9%
1022 | 15 | 1.8%
Other values (107) | 677 | 79.6%

Minimum 10 values
Value | Count | Frequency (%)
1001 | 17 | 2.0%
1002 | 13 | 1.5%
1003 | 20 | 2.4%
1004 | 14 | 1.6%
1005 | 5 | 0.6%
1006 | 10 | 1.2%
1007 | 19 | 2.2%
1008 | 13 | 1.5%
1009 | 4 | 0.5%
1010 | 12 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
6004 | 5 | 0.6%
6003 | 4 | 0.5%
6002 | 4 | 0.5%
6001 | 3 | 0.4%
5303 | 1 | 0.1%
5302 | 9 | 1.1%
5301 | 4 | 0.5%
5202 | 6 | 0.7%
5201 | 4 | 0.5%
5101 | 4 | 0.5%

중권역 (mid-level basin name)
Text

Distinct: 117
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.5105882
Min length: 2

Characters and Unicode

Total characters: 2984
Distinct characters: 109
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 0.9%

Sample

1st row: 남한강상류
2nd row: 남한강상류
3rd row: 남한강상류
4th row: 남한강상류
5th row: 남한강상류

Value | Count | Frequency (%)
충주댐 | 20 | 2.4%
남한강하류 | 19 | 2.2%
금호강 | 19 | 2.2%
안성천 | 18 | 2.1%
남한강상류 | 17 | 2.0%
내성천 | 17 | 2.0%
임하댐 | 16 | 1.9%
삽교천 | 16 | 1.9%
한강서울 | 16 | 1.9%
한탄강 | 15 | 1.8%
Other values (107) | 677 | 79.6%

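Most occurring characters

Value | Count | Frequency (%)
[missing] | 385 | 12.9%
[missing] | 291 | 9.8%
[missing] | 164 | 5.5%
[missing] | 111 | 3.7%
[missing] | 98 | 3.3%
[missing] | 92 | 3.1%
[missing] | 84 | 2.8%
[missing] | 74 | 2.5%
[missing] | 72 | 2.4%
[missing] | 72 | 2.4%
Other values (99) | 1541 | 51.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2984 | 100.0%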

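Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 385 | 12.9%
[missing] | 291 | 9.8%
[missing] | 164 | 5.5%
[missing] | 111 | 3.7%
[missing] | 98 | 3.3%
[missing] | 92 | 3.1%
[missing] | 84 | 2.8%
[missing] | 74 | 2.5%
[missing] | 72 | 2.4%
[missing] | 72 | 2.4%
Other values (99) | 1541 | 51.6%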

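Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2984 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[missing] | 385 | 12.9%
[missing] | 291 | 9.8%
[missing] | 164 | 5.5%
[missing] | 111 | 3.7%
[missing] | 98 | 3.3%
[missing] | 92 | 3.1%
[missing] | 84 | 2.8%
[missing] | 74 | 2.5%
[missing] | 72 | 2.4%
[missing] | 72 | 2.4%
Other values (99) | 1541 | 51.6%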

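Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2984 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[missing] | 385 | 12.9%
[missing] | 291 | 9.8%
[missing] | 164 | 5.5%
[missing] | 111 | 3.7%
[missing] | 98 | 3.3%
[missing] | 92 | 3.1%
[missing] | 84 | 2.8%
[missing] | 74 | 2.5%
[missing] | 72 | 2.4%
[missing] | 72 | 2.4%
Other values (99) | 1541 | 51.6%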

대권역코드 (major basin code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 26
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 237.36353
Minimum: 101
Maximum: 601
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 KiB

Quantile statistics

Minimum: 101
5-th percentile: 101
Q1: 104
Median: 203
Q3: 301
95-th percentile: 511
Maximum: 601
Range: 500
Interquartile range (IQR): 197

Descriptive statistics

Standard deviation: 132.44958
Coefficient of variation (CV): 0.55800308
Kurtosis: 0.022625932
Mean: 237.36353
Median Absolute Deviation (MAD): 99
Skewness: 0.90309447
Sum: 201759
Variance: 17542.891
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Common values
Value | Count | Frequency (%)
101 | 98 | 11.5%
301 | 78 | 9.2%
102 | 70 | 8.2%
203 | 68 | 8.0%
202 | 67 | 7.9%
201 | 60 | 7.1%
401 | 46 | 5.4%
104 | 38 | 4.5%
501 | 34 | 4.0%
103 | 31 | 3.6%
Other values (16) | 260 | 30.6%

Minimum 10 values
Value | Count | Frequency (%)
101 | 98 | 11.5%
102 | 70 | 8.2%
103 | 31 | 3.6%
104 | 38 | 4.5%
111 | 18 | 2.1%
121 | 14 | 1.6%
131 | 21 | 2.5%
201 | 60 | 7.1%
202 | 67 | 7.9%
203 | 68 | 8.0%

Maximum 10 values
Value | Count | Frequency (%)
601 | 16 | 1.9%
531 | 14 | 1.6%
521 | 10 | 1.2%
511 | 4 | 0.5%
501 | 34 | 4.0%
411 | 27 | 3.2%
401 | 46 | 5.4%
331 | 24 | 2.8%
321 | 19 | 2.2%
311 | 16 | 1.9%

대권역 (major basin name)
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB
남한강: 98
금강: 78
북한강: 70
낙동강하류: 68
낙동강중류: 67
Other values (21): 469

Length

Max length: 5
Median length: 4
Mean length: 3.7870588
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남한강
2nd row: 남한강
3rd row: 남한강
4th row: 남한강
5th row: 남한강

Common Values

Value | Count | Frequency (%)
남한강 | 98 | 11.5%
금강 | 78 | 9.2%
북한강 | 70 | 8.2%
낙동강하류 | 68 | 8.0%
낙동강중류 | 67 | 7.9%
낙동강상류 | 60 | 7.1%
섬진강 | 46 | 5.4%
임진강 | 38 | 4.5%
영산강 | 34 | 4.0%
팔당댐하류 | 31 | 3.6%
Other values (16) | 260 | 30.6%

Length

Histogram of lengths of the category

강수량(mm) (precipitation, mm)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 341
Distinct (%): 40.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.285059
Minimum: 0
Maximum: 118.4
Zeros: 176
Zeros (%): 20.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.4
Median: 7.35
Q3: 19.9
95-th percentile: 56.22
Maximum: 118.4
Range: 118.4
Interquartile range (IQR): 19.5

Descriptive statistics

Standard deviation: 19.392873
Coefficient of variation (CV): 1.3575634
Kurtosis: 5.536429
Mean: 14.285059
Median Absolute Deviation (MAD): 7.35
Skewness: 2.151067
Sum: 12142.3
Variance: 376.08353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
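
A histogram with fixed-size bins splits the value range into equal-width intervals and counts the values in each. The sketch below assumes 50 equal-width bins over the reported range of 0 to 118.4 mm, half-open except for the last bin. That convention and the helper `bin_index` are assumptions for illustration, not taken from the report. The input is the ten precipitation values from the report's first sample rows, not the full column.

```python
from collections import Counter


def bin_index(x: float, lo: float, hi: float, bins: int) -> int:
    """Index of the equal-width bin containing x; the last bin includes hi."""
    if not lo <= x <= hi:
        raise ValueError(f"{x} is outside [{lo}, {hi}]")
    width = (hi - lo) / bins
    return min(int((x - lo) / width), bins - 1)


LO, HI, BINS = 0.0, 118.4, 50
# Precipitation (mm) from the first ten sample rows of the report.
sample_mm = [7.9, 8.8, 9.3, 12.0, 5.2, 6.2, 12.0, 12.4, 12.6, 7.2]

counts = Counter(bin_index(x, LO, HI, BINS) for x in sample_mm)
width = (HI - LO) / BINS
for i in sorted(counts):
    print(f"[{LO + i * width:.3f}, {LO + (i + 1) * width:.3f}): {counts[i]}")
```

Each bin is 118.4 / 50 = 2.368 mm wide, so these ten values fall into only three of the 50 bins.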
Common values
Value | Count | Frequency (%)
0.0 | 176 | 20.7%
0.5 | 18 | 2.1%
0.2 | 18 | 2.1%
0.4 | 12 | 1.4%
1.0 | 10 | 1.2%
1.5 | 9 | 1.1%
0.1 | 9 | 1.1%
0.7 | 8 | 0.9%
0.9 | 6 | 0.7%
15.5 | 5 | 0.6%
Other values (331) | 579 | 68.1%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 176 | 20.7%
0.1 | 9 | 1.1%
0.2 | 18 | 2.1%
0.3 | 4 | 0.5%
0.4 | 12 | 1.4%
0.5 | 18 | 2.1%
0.6 | 3 | 0.4%
0.7 | 8 | 0.9%
0.8 | 5 | 0.6%
0.9 | 6 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
118.4 | 1 | 0.1%
113.5 | 1 | 0.1%
110.6 | 1 | 0.1%
108.0 | 1 | 0.1%
107.6 | 1 | 0.1%
107.5 | 1 | 0.1%
105.4 | 1 | 0.1%
95.2 | 1 | 0.1%
94.1 | 1 | 0.1%
80.7 | 1 | 0.1%

Interactions

[Figures: 16 pairwise interaction plots; images not preserved]

Correlations

Variable | 표준유역코드 | 중권역코드 | 대권역코드 | 대권역 | 강수량(mm)
표준유역코드 | 1.000 | 1.000 | 0.985 | 0.966 | 0.658
중권역코드 | 1.000 | 1.000 | 0.985 | 0.966 | 0.670
대권역코드 | 0.985 | 0.985 | 1.000 | 1.000 | 0.655
대권역 | 0.966 | 0.966 | 1.000 | 1.000 | 0.719
강수량(mm) | 0.658 | 0.670 | 0.655 | 0.719 | 1.000

Variable | 표준유역코드 | 중권역코드 | 대권역코드 | 강수량(mm) | 대권역
표준유역코드 | 1.000 | 1.000 | 0.998 | -0.573 | 0.806
중권역코드 | 1.000 | 1.000 | 0.998 | -0.573 | 0.806
대권역코드 | 0.998 | 0.998 | 1.000 | -0.580 | 0.990
강수량(mm) | -0.573 | -0.573 | -0.580 | 1.000 | 0.353
대권역 | 0.806 | 0.806 | 0.990 | 0.353 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

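First rows
Row | 일자(연월일) | 표준유역코드 | 표준유역 | 중권역코드 | 중권역 | 대권역코드 | 대권역 | 강수량(mm)
0 | 2020-08-01 | 100101 | 광동댐 | 1001 | 남한강상류 | 101 | 남한강 | 7.9
1 | 2020-08-01 | 100102 | 광동댐하류 | 1001 | 남한강상류 | 101 | 남한강 | 8.8
2 | 2020-08-01 | 100103 | 임계천 | 1001 | 남한강상류 | 101 | 남한강 | 9.3
3 | 2020-08-01 | 100104 | 골지천중류 | 1001 | 남한강상류 | 101 | 남한강 | 12.0
4 | 2020-08-01 | 100105 | 도암댐 | 1001 | 남한강상류 | 101 | 남한강 | 5.2
5 | 2020-08-01 | 100106 | 송천 | 1001 | 남한강상류 | 101 | 남한강 | 6.2
6 | 2020-08-01 | 100107 | 골지천하류 | 1001 | 남한강상류 | 101 | 남한강 | 12.0
7 | 2020-08-01 | 100108 | 오대천상류 | 1001 | 남한강상류 | 101 | 남한강 | 12.4
8 | 2020-08-01 | 100109 | 오대천하류 | 1001 | 남한강상류 | 101 | 남한강 | 12.6
9 | 2020-08-01 | 100110 | 어천상류 | 1001 | 남한강상류 | 101 | 남한강 | 7.2

Last rows
Row | 일자(연월일) | 표준유역코드 | 표준유역 | 중권역코드 | 중권역 | 대권역코드 | 대권역 | 강수량(mm)
840 | 2020-08-01 | 600204 | 화북천 | 6002 | 제주북해 | 601 | 제주도 | 0.0
841 | 2020-08-01 | 600301 | 창고천 | 6003 | 제주남해 | 601 | 제주도 | 0.0
842 | 2020-08-01 | 600302 | 예래천 | 6003 | 제주남해 | 601 | 제주도 | 0.0
843 | 2020-08-01 | 600303 | 도순천 | 6003 | 제주남해 | 601 | 제주도 | 0.0
844 | 2020-08-01 | 600304 | 신례천 | 6003 | 제주남해 | 601 | 제주도 | 0.0
845 | 2020-08-01 | 600401 | 조천읍 | 6004 | 제주동해 | 601 | 제주도 | 0.0
846 | 2020-08-01 | 600402 | 구좌읍 | 6004 | 제주동해 | 601 | 제주도 | 0.0
847 | 2020-08-01 | 600403 | 성산읍 | 6004 | 제주동해 | 601 | 제주도 | 0.0
848 | 2020-08-01 | 600404 | 천미천 | 6004 | 제주동해 | 601 | 제주도 | 0.0
849 | 2020-08-01 | 600405 | 종남천 | 6004 | 제주동해 | 601 | 제주도 | 0.0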
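
In the sample rows, each 중권역코드 equals the leading four digits of the corresponding six-digit 표준유역코드, which would explain the reported correlation of 1.000 between the two columns. The sketch below checks this on a few code triples copied from the sample. Whether the relation holds for all 850 rows is an assumption, and `mid_basin_prefix` is a hypothetical helper name.

```python
# (표준유역코드, 중권역코드, 대권역코드) triples copied from the sample rows.
sample_codes = [
    (100101, 1001, 101),
    (100110, 1001, 101),
    (600204, 6002, 601),
    (600301, 6003, 601),
    (600405, 6004, 601),
]


def mid_basin_prefix(watershed_code: int) -> int:
    """Leading four digits of a six-digit 표준유역코드."""
    return watershed_code // 100


# True for each sampled row if the prefix matches the reported 중권역코드.
matches = [mid_basin_prefix(w) == mid for w, mid, _major in sample_codes]
print(all(matches))
```

The sample does not show an equally simple rule for deriving 대권역코드 from 중권역코드, so that mapping is left out here.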