Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 3.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Numeric: 2

Alerts

Dataset has 3 (3.0%) duplicate rows [Duplicates]
공급량 is highly overall correlated with 시설명 [High correlation]
시설명 is highly overall correlated with 공급량 [High correlation]

Reproduction

Analysis started: 2023-12-10 10:20:38.216820
Analysis finished: 2023-12-10 10:20:39.498818
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명 (facility name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

광동취수장: 30
다압취수장: 30
대청취수장: 30
덕소취수장: 10

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
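
The Distinct and Unique figures above are easy to confuse. Assuming the usual ydata-profiling definitions, Distinct counts how many different values occur, while Unique counts the values that occur exactly once. A minimal sketch using only the Python standard library, with the column rebuilt from the value counts reported in this section (the variable names are illustrative and not part of the report):

```python
from collections import Counter

# Rebuild the 시설명 column from the counts reported above (30/30/30/10).
reported_counts = {"광동취수장": 30, "다압취수장": 30, "대청취수장": 30, "덕소취수장": 10}
column = [name for name, n in reported_counts.items() for _ in range(n)]

counts = Counter(column)
n_rows = len(column)

# Distinct: how many different values occur at all.
n_distinct = len(counts)
# Unique: how many values occur exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(f"Distinct: {n_distinct} ({n_distinct / n_rows:.1%})")
print(f"Unique: {n_unique} ({n_unique / n_rows:.1%})")
```

Every facility appears at least ten times, so no value is unique even though there are four distinct values.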

Sample

1st row: 광동취수장
2nd row: 광동취수장
3rd row: 광동취수장
4th row: 광동취수장
5th row: 광동취수장

Common Values

Value       Count  Frequency (%)
광동취수장  30     30.0%
다압취수장  30     30.0%
대청취수장  30     30.0%
덕소취수장  10     10.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

공급날짜 (supply date)
Real number (ℝ)

Distinct: 30
Distinct (%): 30.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20201115
Minimum: 20201101
Maximum: 20201130
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20201101
5th percentile: 20201102
Q1: 20201108
Median: 20201114
Q3: 20201122
95th percentile: 20201129
Maximum: 20201130
Range: 29
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.6659382
Coefficient of variation (CV): 4.2898317 × 10^-7
Kurtosis: -1.1849691
Mean: 20201115
Median Absolute Deviation (MAD): 7
Skewness: 0.13773449
Sum: 2.0201115 × 10^9
Variance: 75.098485
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=30)]

Common values

Value              Count  Frequency (%)
20201109           6      6.0%
20201108           5      5.0%
20201103           5      5.0%
20201117           4      4.0%
20201102           4      4.0%
20201112           4      4.0%
20201105           3      3.0%
20201126           3      3.0%
20201125           3      3.0%
20201124           3      3.0%
Other values (20)  60     60.0%

Minimum 10 values

Value     Count  Frequency (%)
20201101  3      3.0%
20201102  4      4.0%
20201103  5      5.0%
20201104  3      3.0%
20201105  3      3.0%
20201106  3      3.0%
20201107  3      3.0%
20201108  5      5.0%
20201109  6      6.0%
20201110  3      3.0%

Maximum 10 values

Value     Count  Frequency (%)
20201130  3      3.0%
20201129  3      3.0%
20201128  3      3.0%
20201127  3      3.0%
20201126  3      3.0%
20201125  3      3.0%
20201124  3      3.0%
20201123  3      3.0%
20201122  3      3.0%
20201121  3      3.0%
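
공급날짜 is profiled as a real number because the dates are stored as YYYYMMDD integers, so statistics such as the mean or standard deviation describe the integer encoding rather than calendar time. A hedged sketch of converting the values to dates before analysis, using only the Python standard library; the helper name `parse_yyyymmdd` is hypothetical:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20201101 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum reported for 공급날짜.
first_day = parse_yyyymmdd(20201101)
last_day = parse_yyyymmdd(20201130)

# Within a single month the integer range and the day span agree.
span_days = (last_day - first_day).days

# Across a month boundary they do not: the integers differ by 71,
# but the two dates are one day apart.
integer_gap = 20201201 - 20201130
calendar_gap = (parse_yyyymmdd(20201201) - parse_yyyymmdd(20201130)).days

print(span_days, integer_gap, calendar_gap)
```

The reported Range of 29 matches the calendar span only because every observation falls in November 2020; data covering more than one month would need the conversion for the range and quantiles to be meaningful.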

공급량 (supply volume)
Real number (ℝ)

HIGH CORRELATION

Distinct: 91
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 228642.9
Minimum: 25242
Maximum: 376900
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 25242
5th percentile: 25967.6
Q1: 28464
Median: 307823
Q3: 335296
95th percentile: 366055
Maximum: 376900
Range: 351658
Interquartile range (IQR): 306832

Descriptive statistics

Standard deviation: 138264.78
Coefficient of variation (CV): 0.60471933
Kurtosis: -1.3941561
Mean: 228642.9
Median Absolute Deviation (MAD): 28561
Skewness: -0.68339156
Sum: 22864290
Variance: 1.911715 × 10^10
Monotonicity: Not monotonic
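
Several of the statistics above are derived from others, which allows a quick consistency check: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A small sketch that recomputes them from the reported 공급량 values; the inputs are rounded as shown in the report, so the results agree only to that precision:

```python
# Values copied from the 공급량 statistics reported above.
minimum, maximum = 25242, 376900
q1, q3 = 28464, 335296
mean, std = 228642.9, 138264.78

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
cv = std / mean                  # compare with the reported CV
variance = std ** 2              # compare with the reported Variance

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.6e}")
```

The IQR is almost as large as the full range, and the median lies far above Q1, which is consistent with a distribution split between one low-volume facility and several high-volume ones.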
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
376000             3      3.0%
28200              2      2.0%
365600             2      2.0%
336384             2      2.0%
335296             2      2.0%
364700             2      2.0%
336320             2      2.0%
335424             2      2.0%
295830             1      1.0%
306028             1      1.0%
Other values (81)  81     81.0%

Minimum 10 values

Value  Count  Frequency (%)
25242  1      1.0%
25652  1      1.0%
25660  1      1.0%
25740  1      1.0%
25960  1      1.0%
25968  1      1.0%
26164  1      1.0%
26266  1      1.0%
26444  1      1.0%
26520  1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
376900  1      1.0%
376000  3      3.0%
374700  1      1.0%
365600  2      2.0%
364700  2      2.0%
363300  1      1.0%
337280  1      1.0%
336960  1      1.0%
336512  1      1.0%
336384  2      2.0%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

          시설명  공급날짜  공급량
시설명    1.000   0.000     0.902
공급날짜  0.000   1.000     0.379
공급량    0.902   0.379     1.000

[Figure: correlation heatmap]

          공급날짜  공급량  시설명
공급날짜  1.000     -0.032  0.000
공급량    -0.032    1.000   0.848
시설명    0.000     0.848   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

    시설명      공급날짜  공급량
0   광동취수장  20201108  28200
1   광동취수장  20201103  27522
2   광동취수장  20201106  29316
3   광동취수장  20201120  27188
4   광동취수장  20201123  25968
5   광동취수장  20201110  25960
6   광동취수장  20201116  26164
7   광동취수장  20201109  26570
8   광동취수장  20201113  26266
9   광동취수장  20201119  26444

Last rows

    시설명      공급날짜  공급량
90  덕소취수장  20201109  376000
91  덕소취수장  20201109  376000
92  덕소취수장  20201108  364700
93  덕소취수장  20201109  376000
94  덕소취수장  20201112  374700
95  덕소취수장  20201117  376900
96  덕소취수장  20201108  364700
97  덕소취수장  20201103  365600
98  덕소취수장  20201102  363300
99  덕소취수장  20201103  365600

Duplicate rows

Most frequently occurring

   시설명      공급날짜  공급량  # duplicates
2  덕소취수장  20201109  376000  3
0  덕소취수장  20201103  365600  2
1  덕소취수장  20201108  364700  2
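
The report gives 3 duplicate rows, which appears to count the distinct rows that occur more than once rather than the total number of repeated records. A minimal sketch that reproduces the table above from the last ten sample rows shown in this report, using only the Python standard library (the variable names are illustrative):

```python
from collections import Counter

# Rows 90-99 from the sample: (시설명, 공급날짜, 공급량).
rows = [
    ("덕소취수장", 20201109, 376000),
    ("덕소취수장", 20201109, 376000),
    ("덕소취수장", 20201108, 364700),
    ("덕소취수장", 20201109, 376000),
    ("덕소취수장", 20201112, 374700),
    ("덕소취수장", 20201117, 376900),
    ("덕소취수장", 20201108, 364700),
    ("덕소취수장", 20201103, 365600),
    ("덕소취수장", 20201102, 363300),
    ("덕소취수장", 20201103, 365600),
]

# Keep only the rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in Counter(rows).items() if n > 1}

# List the most frequently occurring rows first, as in the table above.
for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(*row, n)
```

Three distinct rows are duplicated, covering seven records in total, so removing the repeats would drop four records and leave 96 observations.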