Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 49
Duplicate rows (%): 49.0%
Total size in memory: 7.1 KiB
Average record size in memory: 72.3 B
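
The duplicate-row figures can be cross-checked with a short sketch. It is illustrative only: the helper name `duplicate_summary` and the three-column rows are hypothetical stand-ins in the shape of this dataset, and the profiler's exact counting convention is not stated in the report. Two common conventions are shown, rows that repeat an earlier row and distinct rows that occur more than once. Both would give 49 here if 49 of the 51 distinct rows each occur exactly twice, which is what the value counts later in the report suggest.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (repeat_rows, duplicated_groups) for a list of row tuples."""
    counts = Counter(rows)
    # Rows that exactly repeat an earlier row (every occurrence after the first).
    repeat_rows = sum(c - 1 for c in counts.values())
    # Distinct rows that occur more than once.
    duplicated_groups = sum(1 for c in counts.values() if c > 1)
    return repeat_rows, duplicated_groups

# Hypothetical stand-in rows: (timestamp, inflow, outflow).
rows = [
    (2019050212, 1.135, 1.135),
    (2019050213, 6.107, 1.135),
    (2019050213, 6.107, 1.135),
    (2019050214, 1.133, 1.133),
    (2019050214, 1.133, 1.133),
]
repeat_rows, duplicated_groups = duplicate_summary(rows)
share = 100.0 * repeat_rows / len(rows)
print(repeat_rows, duplicated_groups, f"{share:.1f}%")
```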

Variable types

Categorical: 5
Numeric: 3

Alerts

댐이름 has constant value "군위" [Constant]
강우량(mm) has constant value "0" [Constant]
저수율 has constant value "45.9" [Constant]
Dataset has 49 (49.0%) duplicate rows [Duplicates]
저수위(m) is highly overall correlated with 저수량(백만m3) [High correlation]
저수량(백만m3) is highly overall correlated with 저수위(m) [High correlation]
일자/시간(t) is highly overall correlated with 유입량(ms) and 1 other field [High correlation]
유입량(ms) is highly overall correlated with 일자/시간(t) and 1 other field [High correlation]
방류량(ms) is highly overall correlated with 일자/시간(t) and 1 other field [High correlation]
저수위(m) is highly imbalanced (69.6%) [Imbalance]
저수량(백만m3) is highly imbalanced (69.6%) [Imbalance]
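
The 69.6% imbalance figure is consistent with an entropy-based score, 1 - H / log2(k), applied to the value counts 91, 8 and 1 reported for 저수위(m) and 저수량(백만m3). The sketch below is an assumption inferred from those numbers, not a statement of the profiler's implementation, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy (base 2) of the class proportions.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts reported for 저수위(m): 192.38 x 91, 192.37 x 8, 192.36 x 1.
score = imbalance_score([91, 8, 1])
print(f"{score:.1%}")
```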

Reproduction

Analysis started: 2023-12-10 11:59:50.940193
Analysis finished: 2023-12-10 11:59:52.518796
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

댐이름 (dam name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 군위
2nd row: 군위
3rd row: 군위
4th row: 군위
5th row: 군위

Common Values

Value            Count   Frequency (%)
군위 (Gunwi)     100     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

일자/시간(t) (date/time)
Real number (ℝ)

HIGH CORRELATION

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0190503 × 10^9
Minimum: 2.0190502 × 10^9
Maximum: 2.0190504 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2.0190502 × 10^9
5-th percentile: 2.0190502 × 10^9
Q1: 2.0190503 × 10^9
Median: 2.0190503 × 10^9
Q3: 2.0190504 × 10^9
95-th percentile: 2.0190504 × 10^9
Maximum: 2.0190504 × 10^9
Range: 202
Interquartile range (IQR): 119.5

Descriptive statistics

Standard deviation: 68.696004
Coefficient of variation (CV): 3.4023919 × 10^-8
Kurtosis: -1.0762718
Mean: 2.0190503 × 10^9
Median Absolute Deviation (MAD): 88.5
Skewness: -0.024521778
Sum: 2.0190503 × 10^11
Variance: 4719.141
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value               Count   Frequency (%)
2019050213          2       2.0%
2019050408          2       2.0%
2019050310          2       2.0%
2019050312          2       2.0%
2019050314          2       2.0%
2019050311          2       2.0%
2019050309          2       2.0%
2019050308          2       2.0%
2019050306          2       2.0%
2019050413          2       2.0%
Other values (41)   80      80.0%

Minimum 10 values

Value               Count   Frequency (%)
2019050212          1       1.0%
2019050213          2       2.0%
2019050214          2       2.0%
2019050215          2       2.0%
2019050216          2       2.0%
2019050217          2       2.0%
2019050218          2       2.0%
2019050219          2       2.0%
2019050220          2       2.0%
2019050221          2       2.0%

Maximum 10 values

Value               Count   Frequency (%)
2019050414          1       1.0%
2019050413          2       2.0%
2019050412          2       2.0%
2019050411          2       2.0%
2019050410          2       2.0%
2019050409          2       2.0%
2019050408          2       2.0%
2019050407          2       2.0%
2019050406          2       2.0%
2019050405          2       2.0%
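
The values in this column look like timestamps written as YYYYMMDDHH integers (for example 2019050213), so numeric statistics such as the range of 202 describe digit arithmetic rather than elapsed time. The sketch below parses the reported minimum and maximum under that assumption. `parse_stamp` is a hypothetical helper, and the format string would need special handling if the source ever records hour 24, which `%H` does not accept.

```python
from datetime import datetime

def parse_stamp(value):
    """Parse an integer such as 2019050213, assumed to mean YYYYMMDDHH."""
    return datetime.strptime(str(value), "%Y%m%d%H")

# Reported minimum and maximum of 일자/시간(t).
first = parse_stamp(2019050212)
last = parse_stamp(2019050414)

# Elapsed time in hours, compared with the raw integer difference.
span_hours = (last - first).total_seconds() / 3600
raw_range = 2019050414 - 2019050212
print(raw_range, span_hours)
```

Under this reading the 51 distinct values cover a 50-hour window at hourly resolution, which the integer range does not convey.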

저수위(m) (water level, m)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 192.37
2nd row: 192.37
3rd row: 192.38
4th row: 192.37
5th row: 192.37

Common Values

Value     Count   Frequency (%)
192.38    91      91.0%
192.37    8       8.0%
192.36    1       1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

강우량(mm) (rainfall, mm)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       100     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

유입량(ms) (inflow)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.32263
Minimum: 1.115
Maximum: 6.107
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.115
5-th percentile: 1.116
Q1: 1.119
Median: 1.1235
Q3: 1.129
95-th percentile: 1.13605
Maximum: 6.107
Range: 4.992
Interquartile range (IQR): 0.01

Descriptive statistics

Standard deviation: 0.97837212
Coefficient of variation (CV): 0.73971717
Kurtosis: 21.142466
Mean: 1.32263
Median Absolute Deviation (MAD): 0.0045
Skewness: 4.7663907
Sum: 132.263
Variance: 0.95721201
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]
Common values

Value               Count   Frequency (%)
1.132               10      10.0%
1.116               10      10.0%
1.126               10      10.0%
1.119               8       8.0%
1.12                8       8.0%
1.123               6       6.0%
1.117               6       6.0%
1.118               4       4.0%
1.122               4       4.0%
1.128               4       4.0%
Other values (13)   30      30.0%

Minimum 10 values

Value               Count   Frequency (%)
1.115               2       2.0%
1.116               10      10.0%
1.117               6       6.0%
1.118               4       4.0%
1.119               8       8.0%
1.12                8       8.0%
1.121               2       2.0%
1.122               4       4.0%
1.123               6       6.0%
1.124               4       4.0%

Maximum 10 values

Value               Count   Frequency (%)
6.107               2       2.0%
6.076               2       2.0%
1.156               1       1.0%
1.135               1       1.0%
1.134               2       2.0%
1.133               4       4.0%
1.132               10      10.0%
1.131               2       2.0%
1.129               2       2.0%
1.128               4       4.0%
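
The high skewness and kurtosis of 유입량(ms) trace back to a handful of values far above the bulk of the data. One way to make that concrete is Tukey's 1.5 × IQR rule applied to the reported quartiles. The sketch below is an illustration, not something the report computes, and `tukey_fences` is a hypothetical helper name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported for 유입량(ms).
low, high = tukey_fences(1.119, 1.129)

# The five largest distinct values from the extreme-values listing.
largest = [6.107, 6.076, 1.156, 1.135, 1.134]
flagged = [v for v in largest if v < low or v > high]
print(flagged)
```

With fences of roughly 1.104 and 1.144, the values 6.107, 6.076 and 1.156 fall outside, while 1.135 and below do not.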

방류량(ms) (outflow)
Real number (ℝ)

HIGH CORRELATION

Distinct: 21
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.12431
Minimum: 1.115
Maximum: 1.156
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.115
5-th percentile: 1.116
Q1: 1.119
Median: 1.1235
Q3: 1.129
95-th percentile: 1.134
Maximum: 1.156
Range: 0.041
Interquartile range (IQR): 0.01

Descriptive statistics

Standard deviation: 0.0067803862
Coefficient of variation (CV): 0.0060307087
Kurtosis: 3.161145
Mean: 1.12431
Median Absolute Deviation (MAD): 0.0045
Skewness: 1.1179861
Sum: 112.431
Variance: 4.5973636 × 10^-5
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=21)]
Common values

Value               Count   Frequency (%)
1.132               12      12.0%
1.116               10      10.0%
1.126               10      10.0%
1.12                8       8.0%
1.119               8       8.0%
1.123               6       6.0%
1.117               6       6.0%
1.118               4       4.0%
1.128               4       4.0%
1.122               4       4.0%
Other values (11)   28      28.0%

Minimum 10 values

Value               Count   Frequency (%)
1.115               2       2.0%
1.116               10      10.0%
1.117               6       6.0%
1.118               4       4.0%
1.119               8       8.0%
1.12                8       8.0%
1.121               2       2.0%
1.122               4       4.0%
1.123               6       6.0%
1.124               4       4.0%

Maximum 10 values

Value               Count   Frequency (%)
1.156               1       1.0%
1.135               3       3.0%
1.134               2       2.0%
1.133               4       4.0%
1.132               12      12.0%
1.131               2       2.0%
1.129               2       2.0%
1.128               4       4.0%
1.127               2       2.0%
1.126               10      10.0%
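
Several of the descriptive statistics for 방류량(ms) are linked by simple identities: mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below recomputes them from the reported sum, standard deviation and row count as a consistency check. The function name `derived_stats` is hypothetical.

```python
def derived_stats(total, std, n):
    """Recompute mean, coefficient of variation and variance from reported figures."""
    mean = total / n
    cv = std / mean        # coefficient of variation
    variance = std ** 2    # variance as the square of the standard deviation
    return mean, cv, variance

# Reported for 방류량(ms): Sum 112.431, Standard deviation 0.0067803862, 100 rows.
mean, cv, variance = derived_stats(112.431, 0.0067803862, 100)
print(f"{mean:.5f} {cv:.7f} {variance:.4e}")
```

The recomputed values agree with the reported Mean, CV and Variance to within rounding of the displayed digits.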

저수량(백만m3) (storage volume, million m3)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 5
Mean length: 5.09
Min length: 5

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 22.353
2nd row: 22.353
3rd row: 22.37
4th row: 22.353
5th row: 22.353

Common Values

Value     Count   Frequency (%)
22.37     91      91.0%
22.353    8       8.0%
22.335    1       1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

저수율 (storage rate)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 45.9
2nd row: 45.9
3rd row: 45.9
4th row: 45.9
5th row: 45.9

Common Values

Value   Count   Frequency (%)
45.9    100     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figures: nine pairwise interaction plots]

Correlations

Correlation matrix 1 (heatmap and table)

                 일자/시간(t)   저수위(m)   유입량(ms)   방류량(ms)   저수량(백만m3)
일자/시간(t)     1.000          0.706       0.478        0.972        0.706
저수위(m)        0.706          1.000       0.174        0.696        1.000
유입량(ms)       0.478          0.174       1.000        0.465        0.174
방류량(ms)       0.972          0.696       0.465        1.000        0.696
저수량(백만m3)   0.706          1.000       0.174        0.696        1.000

Correlation matrix 2 (heatmap and table)

                 저수위(m)   저수량(백만m3)
저수위(m)        1.000       1.000
저수량(백만m3)   1.000       1.000

Correlation matrix 3 (heatmap and table)

                 일자/시간(t)   유입량(ms)   방류량(ms)   저수위(m)   저수량(백만m3)
일자/시간(t)     1.000          -0.932       -0.927       0.347       0.347
유입량(ms)       -0.932         1.000        0.998        0.284       0.284
방류량(ms)       -0.927         0.998        1.000        0.372       0.372
저수위(m)        0.347          0.284        0.372        1.000       1.000
저수량(백만m3)   0.347          0.284        0.372        1.000       1.000
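
The matrix that covers only 저수위(m) and 저수량(백만m3) is consistent with a categorical association measure such as Cramér's V, although the report text does not name the method. As an illustration under that assumption, the sketch below computes the uncorrected Cramér's V for the three value pairings seen in the sample rows, weighted by the reported counts of 91, 8 and 1. `cramers_v` is a hypothetical helper, and a bias-corrected variant could give a slightly different number on other data.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (a, b) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    rows = Counter(a for a, _ in pairs)
    cols = Counter(b for _, b in pairs)
    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in rows.items():
        for b, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    dof = min(len(rows), len(cols)) - 1
    return math.sqrt(chi2 / (n * dof)) if dof > 0 else 0.0

# Each water level appears with exactly one storage volume.
pairs = (
    [("192.38", "22.37")] * 91
    + [("192.37", "22.353")] * 8
    + [("192.36", "22.335")] * 1
)
print(round(cramers_v(pairs), 3))
```

Because each level maps to exactly one volume, the association is perfect, which matches the 1.000 entries in that matrix.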

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     댐이름   일자/시간(t)   저수위(m)   강우량(mm)   유입량(ms)   방류량(ms)   저수량(백만m3)   저수율
0    군위     2019050213     192.37      0            6.107        1.135        22.353           45.9
1    군위     2019050214     192.37      0            1.133        1.133        22.353           45.9
2    군위     2019050217     192.38      0            6.076        1.132        22.37            45.9
3    군위     2019050216     192.37      0            1.133        1.133        22.353           45.9
4    군위     2019050215     192.37      0            1.132        1.132        22.353           45.9
5    군위     2019050214     192.37      0            1.133        1.133        22.353           45.9
6    군위     2019050219     192.38      0            1.131        1.131        22.37            45.9
7    군위     2019050218     192.38      0            1.134        1.134        22.37            45.9
8    군위     2019050217     192.38      0            6.076        1.132        22.37            45.9
9    군위     2019050216     192.37      0            1.133        1.133        22.353           45.9

Last rows

     댐이름   일자/시간(t)   저수위(m)   강우량(mm)   유입량(ms)   방류량(ms)   저수량(백만m3)   저수율
90   군위     2019050320     192.38      0            1.12         1.12         22.37            45.9
91   군위     2019050320     192.38      0            1.12         1.12         22.37            45.9
92   군위     2019050319     192.38      0            1.12         1.12         22.37            45.9
93   군위     2019050319     192.38      0            1.12         1.12         22.37            45.9
94   군위     2019050314     192.38      0            1.122        1.122        22.37            45.9
95   군위     2019050316     192.38      0            1.123        1.123        22.37            45.9
96   군위     2019050315     192.38      0            1.123        1.123        22.37            45.9
97   군위     2019050315     192.38      0            1.123        1.123        22.37            45.9
98   군위     2019050212     192.36      0            1.135        1.135        22.335           45.9
99   군위     2019050213     192.37      0            6.107        1.135        22.353           45.9

Duplicate rows

Most frequently occurring

     댐이름   일자/시간(t)   저수위(m)   강우량(mm)   유입량(ms)   방류량(ms)   저수량(백만m3)   저수율   # duplicates
0    군위     2019050213     192.37      0            6.107        1.135        22.353           45.9     2
1    군위     2019050214     192.37      0            1.133        1.133        22.353           45.9     2
2    군위     2019050215     192.37      0            1.132        1.132        22.353           45.9     2
3    군위     2019050216     192.37      0            1.133        1.133        22.353           45.9     2
4    군위     2019050217     192.38      0            6.076        1.132        22.37            45.9     2
5    군위     2019050218     192.38      0            1.134        1.134        22.37            45.9     2
6    군위     2019050219     192.38      0            1.131        1.131        22.37            45.9     2
7    군위     2019050220     192.38      0            1.132        1.132        22.37            45.9     2
8    군위     2019050221     192.38      0            1.132        1.132        22.37            45.9     2
9    군위     2019050222     192.38      0            1.132        1.132        22.37            45.9     2