Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 2.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Numeric: 2

Alerts

Dataset has 2 (2.0%) duplicate rows [Duplicates]
공급량 is highly overall correlated with 시설명 [High correlation]
시설명 is highly overall correlated with 공급량 [High correlation]

Reproduction

Analysis started: 2023-12-10 10:20:31.387035
Analysis finished: 2023-12-10 10:20:32.297772
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명 (facility name)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

광동취수장: 31
다압취수장: 31
대청취수장: 31
덕소취수장: 7

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광동취수장
2nd row: 광동취수장
3rd row: 광동취수장
4th row: 광동취수장
5th row: 광동취수장

Common Values

Value       Count  Frequency (%)
광동취수장  31     31.0%
다압취수장  31     31.0%
대청취수장  31     31.0%
덕소취수장  7      7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 시설명]

공급날짜 (supply date)
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210116
Minimum: 20210101
Maximum: 20210131
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20210101
5-th percentile: 20210102
Q1: 20210109
Median: 20210116
Q3: 20210123
95-th percentile: 20210130
Maximum: 20210131
Range: 30
Interquartile range (IQR): 14.25

Descriptive statistics

Standard deviation: 8.7392994
Coefficient of variation (CV): 4.3242203 × 10^-7
Kurtosis: -1.1089088
Mean: 20210116
Median Absolute Deviation (MAD): 7.5
Skewness: 0.063304181
Sum: 2.0210116 × 10^9
Variance: 76.375354
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=31)]

Common values

Value              Count  Frequency (%)
20210112           5      5.0%
20210116           5      5.0%
20210113           4      4.0%
20210107           4      4.0%
20210114           4      4.0%
20210128           3      3.0%
20210115           3      3.0%
20210109           3      3.0%
20210108           3      3.0%
20210111           3      3.0%
Other values (21)  63     63.0%

Minimum 10 values

Value     Count  Frequency (%)
20210101  3      3.0%
20210102  3      3.0%
20210103  3      3.0%
20210104  3      3.0%
20210105  3      3.0%
20210106  3      3.0%
20210107  4      4.0%
20210108  3      3.0%
20210109  3      3.0%
20210110  3      3.0%

Maximum 10 values

Value     Count  Frequency (%)
20210131  3      3.0%
20210130  3      3.0%
20210129  3      3.0%
20210128  3      3.0%
20210127  3      3.0%
20210126  3      3.0%
20210125  3      3.0%
20210124  3      3.0%
20210123  3      3.0%
20210122  3      3.0%
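
The 공급날짜 values run from 20210101 to 20210131 with 31 distinct values, which suggests calendar dates encoded as YYYYMMDD integers rather than a genuine numeric quantity. If that reading is right, statistics such as the mean, variance, or skewness reported for this column carry little meaning. A minimal sketch of converting such values to dates before profiling, assuming the YYYYMMDD interpretation holds (the helper name `parse_supply_date` is hypothetical and not part of the report):

```python
from datetime import date, datetime


def parse_supply_date(value: int) -> date:
    """Convert an integer such as 20210101 to a date, assuming YYYYMMDD encoding."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum reported for 공급날짜 in the report.
first = parse_supply_date(20210101)
last = parse_supply_date(20210131)

# Number of days between the earliest and latest date.
span_days = (last - first).days
print(first.isoformat(), last.isoformat(), span_days)
```

Under this assumption the column covers January 2021, and the day span agrees with the Range of 30 reported in the quantile statistics.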

공급량 (supply volume)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 92
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 237635.4
Minimum: 30058
Maximum: 406600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 30058
5-th percentile: 31068.8
Q1: 34941
Median: 310014
Q3: 334992
95-th percentile: 384600
Maximum: 406600
Range: 376542
Interquartile range (IQR): 300051

Descriptive statistics

Standard deviation: 139727.61
Coefficient of variation (CV): 0.58799156
Kurtosis: -1.3319453
Mean: 237635.4
Median Absolute Deviation (MAD): 25314
Skewness: -0.7417169
Sum: 23763540
Variance: 1.9523805 × 10^10
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
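
Several of the 공급량 figures are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the values printed in the report as a consistency check. It works from the rounded figures shown here, not from the underlying data, so small rounding differences are to be expected.

```python
import math

# Values copied from the 공급량 statistics in the report.
minimum, maximum = 30058, 406600
q1, q3 = 34941, 334992
mean, std = 237635.4, 139727.61
n_observations = 100

value_range = maximum - minimum   # reported Range: 376542
iqr = q3 - q1                     # reported IQR: 300051
cv = std / mean                   # reported CV: 0.58799156
variance = std ** 2               # reported Variance: 1.9523805 × 10^10
total = mean * n_observations     # reported Sum: 23763540

print(value_range, iqr, round(cv, 6), f"{variance:.7e}", round(total))
```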

Common values

Value              Count  Frequency (%)
335232             3      3.0%
384600             2      2.0%
335360             2      2.0%
334912             2      2.0%
334272             2      2.0%
334784             2      2.0%
406600             2      2.0%
330496             1      1.0%
307240             1      1.0%
310240             1      1.0%
Other values (82)  82     82.0%

Minimum 10 values

Value  Count  Frequency (%)
30058  1      1.0%
30420  1      1.0%
30584  1      1.0%
30632  1      1.0%
30704  1      1.0%
31088  1      1.0%
31152  1      1.0%
31288  1      1.0%
31292  1      1.0%
31334  1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
406600  2      2.0%
405700  1      1.0%
396200  1      1.0%
384600  2      2.0%
381600  1      1.0%
339376  1      1.0%
337316  1      1.0%
336924  1      1.0%
336704  1      1.0%
335936  1      1.0%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

          시설명  공급날짜  공급량
시설명    1.000   0.000     0.917
공급날짜  0.000   1.000     0.000
공급량    0.917   0.000     1.000

[Figure: correlation heatmap]

          공급날짜  공급량  시설명
공급날짜  1.000     -0.114  0.000
공급량    -0.114    1.000   0.923
시설명    0.000     0.923   1.000
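
An informal way to see why 시설명 and 공급량 are flagged as highly correlated is to compare the mean 공급량 per facility. The sketch below does this for nine rows copied from the Sample section. It does not reproduce the correlation measure ydata-profiling computes, and 다압취수장 is left out because none of its rows appear in the sample.

```python
from collections import defaultdict
from statistics import mean

# (시설명, 공급량) pairs copied from nine rows of the Sample section.
rows = [
    ("광동취수장", 32840), ("광동취수장", 31152), ("광동취수장", 34438),
    ("대청취수장", 303052), ("대청취수장", 336924), ("대청취수장", 295660),
    ("덕소취수장", 396200), ("덕소취수장", 384600), ("덕소취수장", 406600),
]

# Group the supply volumes by facility name.
by_facility = defaultdict(list)
for facility, amount in rows:
    by_facility[facility].append(amount)

# Mean supply volume per facility.
facility_means = {name: mean(values) for name, values in by_facility.items()}
for name, avg in facility_means.items():
    print(name, round(avg))
```

In these rows the three facilities sit at clearly separated levels, so knowing the facility already narrows 공급량 to a small band, which is consistent with the high coefficients in the matrices above.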

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

    시설명      공급날짜  공급량
0   광동취수장  20210122  32840
1   광동취수장  20210123  31152
2   광동취수장  20210117  34438
3   광동취수장  20210105  31288
4   광동취수장  20210127  31504
5   광동취수장  20210125  31292
6   광동취수장  20210102  30420
7   광동취수장  20210114  39762
8   광동취수장  20210118  34490
9   광동취수장  20210119  33342

Last rows

    시설명      공급날짜  공급량
90  대청취수장  20210115  303052
91  대청취수장  20210109  336924
92  대청취수장  20210126  295660
93  덕소취수장  20210114  396200
94  덕소취수장  20210116  384600
95  덕소취수장  20210112  406600
96  덕소취수장  20210112  406600
97  덕소취수장  20210116  384600
98  덕소취수장  20210107  381600
99  덕소취수장  20210113  405700

Duplicate rows

Most frequently occurring

   시설명      공급날짜  공급량  # duplicates
0  덕소취수장  20210112  406600  2
1  덕소취수장  20210116  384600  2
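
The duplicate rows can be found by counting identical (시설명, 공급날짜, 공급량) tuples. The sketch below uses only the seven 덕소취수장 rows visible in the Sample section, not the full dataset, and is one simple way to do the check rather than the method the profiler itself uses.

```python
from collections import Counter

# The seven 덕소취수장 rows from the tail of the Sample section:
# (시설명, 공급날짜, 공급량).
rows = [
    ("덕소취수장", 20210114, 396200),
    ("덕소취수장", 20210116, 384600),
    ("덕소취수장", 20210112, 406600),
    ("덕소취수장", 20210112, 406600),
    ("덕소취수장", 20210116, 384600),
    ("덕소취수장", 20210107, 381600),
    ("덕소취수장", 20210113, 405700),
]

counts = Counter(rows)
# Keep only rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in counts.items() if n > 1}
for row, n in duplicates.items():
    print(row, n)
```

On these rows the check flags the same two records listed in the table above, each occurring twice.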