Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 4.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Numeric: 2

Alerts

Dataset has 4 (4.0%) duplicate rows [Duplicates]
공급량 is highly overall correlated with 시설명 [High correlation]
시설명 is highly overall correlated with 공급량 [High correlation]

Reproduction

Analysis started: 2023-12-10 10:20:28.381038
Analysis finished: 2023-12-10 10:20:29.215654
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명 (facility name)
Categorical

Alert: HIGH CORRELATION

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

광동취수장: 28
다압취수장: 28
대청취수장: 28
덕소취수장: 16

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광동취수장
2nd row: 광동취수장
3rd row: 광동취수장
4th row: 광동취수장
5th row: 광동취수장

Common Values

Value        Count  Frequency (%)
광동취수장   28     28.0%
다압취수장   28     28.0%
대청취수장   28     28.0%
덕소취수장   16     16.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts are listed in the Common Values table]

공급날짜 (supply date)
Real number (ℝ)

Distinct: 28
Distinct (%): 28.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210214
Minimum: 20210201
Maximum: 20210228
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20210201
5-th percentile: 20210202
Q1: 20210207
Median: 20210214
Q3: 20210220
95-th percentile: 20210227
Maximum: 20210228
Range: 27
Interquartile range (IQR): 13.25

Descriptive statistics

Standard deviation: 8.0436373
Coefficient of variation (CV): 3.9799863 × 10^-7
Kurtosis: -1.1231209
Mean: 20210214
Median Absolute Deviation (MAD): 7
Skewness: 0.10845299
Sum: 2.0210214 × 10^9
Variance: 64.700101
Monotonicity: Not monotonic
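
The 공급날짜 values look like calendar dates written as YYYYMMDD integers. This is an assumption based on the minimum 20210201 and maximum 20210228, and it would explain why the column was profiled as a real number. If that is right, statistics such as the mean, variance, and CV describe the integer encoding rather than calendar time. A minimal sketch of converting such values with the Python standard library follows; the helper name `parse_yyyymmdd` is hypothetical and not part of the report.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20210214 into a date (assumes YYYYMMDD encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum copied from the report.
first = parse_yyyymmdd(20210201)
last = parse_yyyymmdd(20210228)

# Number of days between the first and last date; this agrees with the
# reported Range of 27 only because both dates fall in the same month.
span_days = (last - first).days
print(first, last, span_days)
```

With the column parsed this way, a profiler would have the information it needs to treat it as a date variable instead of a number.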
[Figure: Histogram with fixed size bins (bins=28)]
Common values

Value              Count  Frequency (%)
20210214           6      6.0%
20210202           6      6.0%
20210217           5      5.0%
20210206           5      5.0%
20210210           4      4.0%
20210201           4      4.0%
20210212           4      4.0%
20210209           4      4.0%
20210216           4      4.0%
20210203           4      4.0%
Other values (18)  54     54.0%

Minimum 10 values

Value              Count  Frequency (%)
20210201           4      4.0%
20210202           6      6.0%
20210203           4      4.0%
20210204           3      3.0%
20210205           3      3.0%
20210206           5      5.0%
20210207           3      3.0%
20210208           3      3.0%
20210209           4      4.0%
20210210           4      4.0%

Maximum 10 values

Value              Count  Frequency (%)
20210228           3      3.0%
20210227           3      3.0%
20210226           3      3.0%
20210225           3      3.0%
20210224           3      3.0%
20210223           3      3.0%
20210222           3      3.0%
20210221           3      3.0%
20210220           3      3.0%
20210219           3      3.0%

공급량 (supply volume)
Real number (ℝ)

Alert: HIGH CORRELATION

Distinct: 91
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 242288.18
Minimum: 27608
Maximum: 370900
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 27608
5-th percentile: 29997.1
Q1: 33090
Median: 305222
Q3: 336208
95-th percentile: 352050
Maximum: 370900
Range: 343292
Interquartile range (IQR): 303118

Descriptive statistics

Standard deviation: 133774.12
Coefficient of variation (CV): 0.55212812
Kurtosis: -1.0719854
Mean: 242288.18
Median Absolute Deviation (MAD): 31386
Skewness: -0.92025819
Sum: 24228818
Variance: 1.7895515 × 10^10
Monotonicity: Not monotonic
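
Several of the 공급량 figures can be derived from others, which gives a quick consistency check on the extracted numbers. The sketch below recomputes the range, IQR, CV, and variance from values copied out of this report. The variable names are illustrative only, and this is not how ydata-profiling computes them internally.

```python
import math

# Values copied from the 공급량 statistics in this report.
minimum, maximum = 27608, 370900
q1, q3 = 33090, 336208
mean, std = 242288.18, 133774.12

value_range = maximum - minimum   # reported Range: 343292
iqr = q3 - q1                     # reported IQR: 303118
cv = std / mean                   # reported CV: 0.55212812
variance = std ** 2               # reported Variance: 1.7895515 × 10^10

print(value_range, iqr, round(cv, 4), f"{variance:.4e}")
```

The IQR is nearly as large as the full range because Q1 (33090) sits far below the median (305222). The sample rows are consistent with this: the 광동취수장 rows are near 30,000 and the 덕소취수장 rows are near 350,000.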
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value              Count  Frequency (%)
348500             3      3.0%
352000             3      3.0%
336192             2      2.0%
351100             2      2.0%
336384             2      2.0%
348900             2      2.0%
336256             2      2.0%
30992              1      1.0%
304204             1      1.0%
303120             1      1.0%
Other values (81)  81     81.0%

Minimum 10 values

Value              Count  Frequency (%)
27608              1      1.0%
28586              1      1.0%
29190              1      1.0%
29226              1      1.0%
29676              1      1.0%
30014              1      1.0%
30156              1      1.0%
30330              1      1.0%
30542              1      1.0%
30730              1      1.0%

Maximum 10 values

Value              Count  Frequency (%)
370900             1      1.0%
367500             1      1.0%
355200             1      1.0%
353500             1      1.0%
353000             1      1.0%
352000             3      3.0%
351100             2      2.0%
348900             2      2.0%
348500             3      3.0%
339500             1      1.0%

Interactions

[Figures: four interaction plots; the images are not preserved in this text version]

Correlations

[Figure: correlation heatmap]

            시설명   공급날짜   공급량
시설명      1.000    0.000      0.835
공급날짜    0.000    1.000      0.393
공급량      0.835    0.393      1.000

[Figure: correlation heatmap]

            공급날짜   공급량   시설명
공급날짜    1.000      -0.273   0.000
공급량      -0.273     1.000    0.804
시설명      0.000      0.804    1.000

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

      시설명       공급날짜   공급량
0     광동취수장   20210208   30992
1     광동취수장   20210224   30156
2     광동취수장   20210218   32746
3     광동취수장   20210223   29190
4     광동취수장   20210206   32928
5     광동취수장   20210214   31480
6     광동취수장   20210225   27608
7     광동취수장   20210201   31824
8     광동취수장   20210220   32028
9     광동취수장   20210211   33304

Last rows

      시설명       공급날짜   공급량
90    덕소취수장   20210212   355200
91    덕소취수장   20210203   353500
92    덕소취수장   20210214   348500
93    덕소취수장   20210214   348500
94    덕소취수장   20210202   352000
95    덕소취수장   20210206   348900
96    덕소취수장   20210209   367500
97    덕소취수장   20210214   348500
98    덕소취수장   20210216   353000
99    덕소취수장   20210217   351100

Duplicate rows

Most frequently occurring

     시설명       공급날짜   공급량   # duplicates
0    덕소취수장   20210202   352000   3
2    덕소취수장   20210214   348500   3
1    덕소취수장   20210206   348900   2
3    덕소취수장   20210217   351100   2
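
The table lists 4 distinct rows that occur more than once, which matches the "Duplicate rows: 4" figure in the overview. Together they cover 3 + 3 + 2 + 2 = 10 of the 100 observations. As a rough illustration of how such a table can be built, the sketch below counts repeated rows with `collections.Counter`. It uses only the ten rows shown at the end of the Sample section (indices 90 to 99), so it finds just one of the four duplicated rows, and it is not the profiler's own implementation.

```python
from collections import Counter

# Rows 90-99 as shown in the Sample section: (시설명, 공급날짜, 공급량).
rows = [
    ("덕소취수장", 20210212, 355200),
    ("덕소취수장", 20210203, 353500),
    ("덕소취수장", 20210214, 348500),
    ("덕소취수장", 20210214, 348500),
    ("덕소취수장", 20210202, 352000),
    ("덕소취수장", 20210206, 348900),
    ("덕소취수장", 20210209, 367500),
    ("덕소취수장", 20210214, 348500),
    ("덕소취수장", 20210216, 353000),
    ("덕소취수장", 20210217, 351100),
]

# Count each distinct row, then keep only those seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicated.items():
    print(row, n)
```

Within these ten rows only (덕소취수장, 20210214, 348500) repeats, with 3 occurrences. The other duplicated rows in the table have their repeats elsewhere in the dataset.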