Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 2.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Numeric: 2

Alerts

Dataset has 2 (2.0%) duplicate rows [Duplicates]
공급량 is highly overall correlated with 시설명 [High correlation]
시설명 is highly overall correlated with 공급량 [High correlation]

Reproduction

Analysis started: 2023-12-10 10:20:25.161200
Analysis finished: 2023-12-10 10:20:26.210048
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

광동취수장: 31
다압취수장: 31
대청취수장: 31
덕소취수장: 7

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광동취수장
2nd row: 광동취수장
3rd row: 광동취수장
4th row: 광동취수장
5th row: 광동취수장

Common Values

Value       Count  Frequency (%)
광동취수장  31     31.0%
다압취수장  31     31.0%
대청취수장  31     31.0%
덕소취수장  7      7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts are the same as in the Common Values table above]

공급날짜
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210315
Minimum: 20210301
Maximum: 20210331
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20210301
5-th percentile: 20210302
Q1: 20210307
Median: 20210315
Q3: 20210323
95-th percentile: 20210330
Maximum: 20210331
Range: 30
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 9.3002661
Coefficient of variation (CV): 4.6017423 × 10^-7
Kurtosis: -1.2539811
Mean: 20210315
Median Absolute Deviation (MAD): 8
Skewness: 0.085680882
Sum: 2.0210315 × 10^9
Variance: 86.494949
Monotonicity: Not monotonic
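
The report profiles 공급날짜 as a real number, but its values (20210301 through 20210331) look like dates encoded as YYYYMMDD integers. That reading is an assumption, since the report itself does not say so. Under that assumption, the sketch below parses the values as calendar dates and shows why the numeric statistics only happen to be meaningful here: all observations fall within one month, so integer differences coincide with day differences. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20210301 as the date 2021-03-01."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum reported for 공급날짜.
first = parse_yyyymmdd(20210301)
last = parse_yyyymmdd(20210331)

# Within a single month the integer range and the day span agree.
span_days = (last - first).days
numeric_range = 20210331 - 20210301

# Across a month boundary they do not: consecutive days differ by 70 as integers.
numeric_gap = 20210401 - 20210331
day_gap = (parse_yyyymmdd(20210401) - parse_yyyymmdd(20210331)).days

print(span_days, numeric_range)
print(numeric_gap, day_gap)
```

If the column is converted to a date type before profiling, statistics such as the mean, standard deviation, and correlations with 공급량 would be computed on a proper time axis rather than on the raw integers.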
[Figure: histogram with fixed size bins (bins=31)]

Most frequent values

Value              Count  Frequency (%)
20210301           5      5.0%
20210302           5      5.0%
20210304           4      4.0%
20210303           4      4.0%
20210309           4      4.0%
20210306           3      3.0%
20210322           3      3.0%
20210324           3      3.0%
20210330           3      3.0%
20210311           3      3.0%
Other values (21)  63     63.0%

10 smallest values

Value     Count  Frequency (%)
20210301  5      5.0%
20210302  5      5.0%
20210303  4      4.0%
20210304  4      4.0%
20210305  3      3.0%
20210306  3      3.0%
20210307  3      3.0%
20210308  3      3.0%
20210309  4      4.0%
20210310  3      3.0%

10 largest values

Value     Count  Frequency (%)
20210331  3      3.0%
20210330  3      3.0%
20210329  3      3.0%
20210328  3      3.0%
20210327  3      3.0%
20210326  3      3.0%
20210325  3      3.0%
20210324  3      3.0%
20210323  3      3.0%
20210322  3      3.0%

공급량
Real number (ℝ)

HIGH CORRELATION 

Distinct: 93
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 227452.84
Minimum: 13990
Maximum: 364400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 13990
5-th percentile: 17970
Q1: 28916
Median: 304216
Q3: 331936
95-th percentile: 334534.4
Maximum: 364400
Range: 350410
Interquartile range (IQR): 303020

Descriptive statistics

Standard deviation: 137803.15
Coefficient of variation (CV): 0.60585373
Kurtosis: -1.3336414
Mean: 227452.84
Median Absolute Deviation (MAD): 28776
Skewness: -0.80196831
Sum: 22745284
Variance: 1.8989708 × 10^10
Monotonicity: Not monotonic
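
Several of the 공급량 figures are derived from others, so they can be cross-checked with plain arithmetic. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. It uses the rounded values printed in the report, so the comparisons allow a small relative tolerance; the variable names are illustrative.

```python
import math

# Values as printed in the report for 공급량.
minimum, maximum = 13990, 364400
q1, q3 = 28916, 331936
mean, std = 227452.84, 137803.15
n_observations = 100

value_range = maximum - minimum   # reported Range: 350410
iqr = q3 - q1                     # reported IQR: 303020
cv = std / mean                   # reported CV: 0.60585373
variance = std ** 2               # reported Variance: 1.8989708 × 10^10
total = mean * n_observations     # reported Sum: 22745284

print(value_range, iqr)
print(round(cv, 8), f"{variance:.7e}", round(total))

# The inputs are rounded, so compare with a relative tolerance.
print(math.isclose(cv, 0.60585373, rel_tol=1e-6))
print(math.isclose(variance, 1.8989708e10, rel_tol=1e-6))
```

The large gap between Q1 (28916) and the median (304216) is consistent with the 시설명 breakdown: the sampled 광동취수장 rows carry volumes in the tens of thousands, while the 대청취수장 and 덕소취수장 rows are above 300000.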
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values

Value              Count  Frequency (%)
333312             2      2.0%
323200             2      2.0%
330900             2      2.0%
325120             2      2.0%
333056             2      2.0%
334528             2      2.0%
334656             2      2.0%
28514              1      1.0%
301376             1      1.0%
306168             1      1.0%
Other values (83)  83     83.0%

10 smallest values

Value  Count  Frequency (%)
13990  1      1.0%
15850  1      1.0%
16314  1      1.0%
16736  1      1.0%
17400  1      1.0%
18000  1      1.0%
19496  1      1.0%
19568  1      1.0%
19838  1      1.0%
20098  1      1.0%

10 largest values

Value   Count  Frequency (%)
364400  1      1.0%
353100  1      1.0%
334848  1      1.0%
334656  2      2.0%
334528  2      2.0%
334272  1      1.0%
334080  1      1.0%
333952  1      1.0%
333888  1      1.0%
333760  1      1.0%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

          시설명  공급날짜  공급량
시설명    1.000   0.000     0.968
공급날짜  0.000   1.000     0.000
공급량    0.968   0.000     1.000

[Figure: correlation heatmap]

          공급날짜  공급량  시설명
공급날짜  1.000     -0.135  0.000
공급량    -0.135    1.000   0.758
시설명    0.000     0.758   1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    시설명      공급날짜  공급량
0   광동취수장  20210301  28514
1   광동취수장  20210320  16736
2   광동취수장  20210309  22574
3   광동취수장  20210321  15850
4   광동취수장  20210316  18000
5   광동취수장  20210313  20098
6   광동취수장  20210302  26016
7   광동취수장  20210329  31212
8   광동취수장  20210319  16314
9   광동취수장  20210314  19568

Last rows

    시설명      공급날짜  공급량
90  대청취수장  20210302  305400
91  대청취수장  20210314  305868
92  대청취수장  20210325  302088
93  덕소취수장  20210309  353100
94  덕소취수장  20210302  330900
95  덕소취수장  20210301  323200
96  덕소취수장  20210304  364400
97  덕소취수장  20210302  330900
98  덕소취수장  20210303  331000
99  덕소취수장  20210301  323200

Duplicate rows

Most frequently occurring

   시설명      공급날짜  공급량  # duplicates
0  덕소취수장  20210301  323200  2
1  덕소취수장  20210302  330900  2
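
The duplicate rows can be reproduced from the sample in this report. The sketch below counts exact repeats among rows 90 to 99 of the sample, which happen to contain both duplicated records. It is a minimal illustration using only the standard library; the full 100-row dataset is not included in the report, so this checks only that slice.

```python
from collections import Counter

# Rows 90-99 from the report's sample: (시설명, 공급날짜, 공급량).
rows = [
    ("대청취수장", 20210302, 305400),
    ("대청취수장", 20210314, 305868),
    ("대청취수장", 20210325, 302088),
    ("덕소취수장", 20210309, 353100),
    ("덕소취수장", 20210302, 330900),
    ("덕소취수장", 20210301, 323200),
    ("덕소취수장", 20210304, 364400),
    ("덕소취수장", 20210302, 330900),
    ("덕소취수장", 20210303, 331000),
    ("덕소취수장", 20210301, 323200),
]

# Count how often each complete row occurs and keep those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

# Number of rows that would be dropped if only one copy of each were kept.
extra_copies = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row, n)
print("extra copies:", extra_copies)
```

Both duplicated records belong to 덕소취수장 and each occurs twice, which matches the table above and the "Duplicate rows: 2" figure in the overview.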