Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 43
Duplicate rows (%): 43.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Numeric: 2

Alerts

Dataset has 43 (43.0%) duplicate rows [Duplicates]
공급량 is highly overall correlated with 시설명 [High correlation]
시설명 is highly overall correlated with 공급량 [High correlation]

Reproduction

Analysis started: 2024-04-20 16:01:21.598178
Analysis finished: 2024-04-20 16:01:22.574106
Duration: 0.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명 (facility name)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 928.0 B

갑천가압장: 62
강진가압장: 38

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 갑천가압장
2nd row: 갑천가압장
3rd row: 갑천가압장
4th row: 갑천가압장
5th row: 갑천가압장

Common Values

Value         Count  Frequency (%)
갑천가압장    62     62.0%
강진가압장    38     38.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count  Frequency (%)
갑천가압장    62     62.0%
강진가압장    38     38.0%

공급날짜 (supply date)
Real number (ℝ)

Distinct: 31
Distinct (%): 31.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20210115
Minimum: 20210101
Maximum: 20210131
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20210101
5-th percentile: 20210102
Q1: 20210108
Median: 20210114
Q3: 20210123
95-th percentile: 20210130
Maximum: 20210131
Range: 30
Interquartile range (IQR): 15.25

Descriptive statistics

Standard deviation: 9.0083239
Coefficient of variation (CV): 4.4573343 × 10^-7
Kurtosis: -1.1397495
Mean: 20210115
Median Absolute Deviation (MAD): 7
Skewness: 0.20591149
Sum: 2.0210115 × 10^9
Variance: 81.149899
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value              Count  Frequency (%)
20210107           4      4.0%
20210113           4      4.0%
20210111           4      4.0%
20210108           4      4.0%
20210110           4      4.0%
20210109           4      4.0%
20210101           4      4.0%
20210102           4      4.0%
20210118           4      4.0%
20210114           4      4.0%
Other values (21)  60     60.0%

Minimum 10 values

Value     Count  Frequency (%)
20210101  4      4.0%
20210102  4      4.0%
20210103  3      3.0%
20210104  4      4.0%
20210105  3      3.0%
20210106  3      3.0%
20210107  4      4.0%
20210108  4      4.0%
20210109  4      4.0%
20210110  4      4.0%

Maximum 10 values

Value     Count  Frequency (%)
20210131  3      3.0%
20210130  3      3.0%
20210129  3      3.0%
20210128  3      3.0%
20210127  3      3.0%
20210126  3      3.0%
20210125  3      3.0%
20210124  3      3.0%
20210123  2      2.0%
20210122  2      2.0%
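
The profiler classifies 공급날짜 as a real number, but the values 20210101 through 20210131 look like dates encoded as YYYYMMDD integers. That reading is an assumption, not something the report states. Under it, statistics such as the mean or the coefficient of variation carry little meaning, and parsing the column first is more useful. A minimal sketch using only the standard library (the helper name `parse_yyyymmdd` is hypothetical):

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Hypothetical helper: turn an integer such as 20210131 into a date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20210101)  # reported minimum of 공급날짜
last = parse_yyyymmdd(20210131)   # reported maximum of 공급날짜
span_days = (last - first).days   # calendar days between the two

print(first.isoformat(), last.isoformat(), span_days)
```

Under this reading the span between the reported minimum and maximum is 30 days, which agrees with the reported Range of 30 and the 31 distinct values.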

공급량 (supply volume)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 52
Distinct (%): 52.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57887.34
Minimum: 1437
Maximum: 168572
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1437
5-th percentile: 1456
Q1: 1619
Median: 1677.5
Q3: 142120
95-th percentile: 167742.2
Maximum: 168572
Range: 167135
Interquartile range (IQR): 140501

Descriptive statistics

Standard deviation: 72577.95
Coefficient of variation (CV): 1.2537793
Kurtosis: -1.7096945
Mean: 57887.34
Median Absolute Deviation (MAD): 150
Skewness: 0.5315207
Sum: 5788734
Variance: 5.2675589 × 10^9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1666               6      6.0%
1684               4      4.0%
1620               4      4.0%
1618               4      4.0%
1544               2      2.0%
1701               2      2.0%
1456               2      2.0%
142024             2      2.0%
168128             2      2.0%
167728             2      2.0%
Other values (42)  70     70.0%

Minimum 10 values

Value  Count  Frequency (%)
1437   2      2.0%
1452   2      2.0%
1456   2      2.0%
1477   2      2.0%
1506   2      2.0%
1511   2      2.0%
1544   2      2.0%
1567   2      2.0%
1583   2      2.0%
1613   2      2.0%

Maximum 10 values

Value   Count  Frequency (%)
168572  1      1.0%
168128  2      2.0%
168012  2      2.0%
167728  2      2.0%
164732  1      1.0%
161788  1      1.0%
156912  2      2.0%
155792  2      2.0%
150940  2      2.0%
148456  2      2.0%
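
The derived figures for 공급량 follow from the basic ones by standard definitions: CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, and sum = mean × count. The sketch below re-checks the reported values against those definitions. The tolerances are an assumption, chosen to absorb the rounding in the report.

```python
import math

# Values copied from the 공급량 statistics in this report.
n = 100
mean, std = 57887.34, 72577.95
q1, q3 = 1619, 142120
minimum, maximum = 1437, 168572

# Each entry is True when the reported figure matches its definition.
checks = {
    "cv": math.isclose(std / mean, 1.2537793, rel_tol=1e-6),
    "variance": math.isclose(std ** 2, 5.2675589e9, rel_tol=1e-6),
    "iqr": q3 - q1 == 140501,
    "range": maximum - minimum == 167135,
    "sum": math.isclose(mean * n, 5788734, rel_tol=1e-9),
}

print(checks)
```

The large gap between the median (1677.5) and the mean (57887.34) is why the CV exceeds 1 here: the values fall into two widely separated groups.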

Interactions

[Figures: four pairwise interaction plots; images not preserved]

Correlations

            시설명    공급날짜    공급량
시설명      1.000     0.000       1.000
공급날짜    0.000     1.000       0.451
공급량      1.000     0.451       1.000

            공급날짜    공급량    시설명
공급날짜    1.000       0.055     0.000
공급량      0.055       1.000     0.990
시설명      0.000       0.990     1.000
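
One plausible reason for the High correlation alert between 시설명 and 공급량 is that the two facilities deliver volumes on very different scales. This is offered as an assumption, not something the report states. The sketch below checks it on the 20 rows listed in the Sample section of this report only, not on the full dataset.

```python
from collections import defaultdict

# (시설명, 공급량) pairs copied from the 20 rows in the Sample section.
sample = [
    ("갑천가압장", 1671), ("갑천가압장", 1456), ("갑천가압장", 1651),
    ("갑천가압장", 1689), ("갑천가압장", 1456), ("갑천가압장", 1620),
    ("갑천가압장", 1688), ("갑천가압장", 1647), ("갑천가압장", 1567),
    ("갑천가압장", 1511),
    ("강진가압장", 138080), ("강진가압장", 138136), ("강진가압장", 155792),
    ("강진가압장", 147484), ("강진가압장", 144312), ("강진가압장", 146200),
    ("강진가압장", 148456), ("강진가압장", 156912), ("강진가압장", 164732),
    ("강진가압장", 139644),
]

# Group the volumes by facility name.
by_facility = defaultdict(list)
for facility, volume in sample:
    by_facility[facility].append(volume)

# (min, max) of 공급량 per facility in the sampled rows.
ranges = {name: (min(v), max(v)) for name, v in by_facility.items()}
low_max = ranges["갑천가압장"][1]
high_min = ranges["강진가압장"][0]

print(ranges, "ranges overlap:", low_max >= high_min)
```

In these sampled rows the two ranges do not overlap, so knowing the facility almost determines the order of magnitude of 공급량.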

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    시설명        공급날짜    공급량
0   갑천가압장    20210125    1671
1   갑천가압장    20210103    1456
2   갑천가압장    20210131    1651
3   갑천가압장    20210111    1689
4   갑천가압장    20210103    1456
5   갑천가압장    20210119    1620
6   갑천가압장    20210124    1688
7   갑천가압장    20210130    1647
8   갑천가압장    20210109    1567
9   갑천가압장    20210101    1511

Last rows

    시설명        공급날짜    공급량
90  강진가압장    20210107    138080
91  강진가압장    20210124    138136
92  강진가압장    20210110    155792
93  강진가압장    20210109    147484
94  강진가압장    20210106    144312
95  강진가압장    20210108    146200
96  강진가압장    20210104    148456
97  강진가압장    20210111    156912
98  강진가압장    20210112    164732
99  강진가압장    20210103    139644

Duplicate rows

Most frequently occurring

    시설명        공급날짜    공급량    # duplicates
0   갑천가압장    20210101    1511      2
1   갑천가압장    20210102    1477      2
2   갑천가압장    20210103    1456      2
3   갑천가압장    20210104    1437      2
4   갑천가압장    20210105    1506      2
5   갑천가압장    20210106    1619      2
6   갑천가압장    20210107    1618      2
7   갑천가압장    20210108    1544      2
8   갑천가압장    20210109    1567      2
9   갑천가압장    20210110    1452      2
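
The report does not say how its duplicate count is derived, so the following is only a sketch of some common counting conventions, not ydata-profiling's own implementation. It uses the first ten rows from the Sample section, where (갑천가압장, 20210103, 1456) appears twice. The function name `duplicate_summary` is hypothetical.

```python
from collections import Counter

# First ten rows of the Sample section: (시설명, 공급날짜, 공급량).
rows = [
    ("갑천가압장", 20210125, 1671),
    ("갑천가압장", 20210103, 1456),
    ("갑천가압장", 20210131, 1651),
    ("갑천가압장", 20210111, 1689),
    ("갑천가압장", 20210103, 1456),
    ("갑천가압장", 20210119, 1620),
    ("갑천가압장", 20210124, 1688),
    ("갑천가압장", 20210130, 1647),
    ("갑천가압장", 20210109, 1567),
    ("갑천가압장", 20210101, 1511),
]


def duplicate_summary(records):
    """Hypothetical helper: summarize exact-duplicate rows in several ways."""
    counts = Counter(records)
    repeated = {row: n for row, n in counts.items() if n > 1}
    return {
        # Distinct row values that occur more than once.
        "distinct_repeated_rows": len(repeated),
        # All rows that belong to some repeated value.
        "rows_involved": sum(repeated.values()),
        # Rows that would be dropped when keeping one copy of each value.
        "extra_copies": sum(n - 1 for n in repeated.values()),
        # The single most frequent row and its count.
        "most_common": counts.most_common(1)[0],
    }


summary = duplicate_summary(rows)
print(summary)
```

Which of these conventions corresponds to the 43 in the Overview is not determined by the report text alone; running the same helper on the full dataset would show which one matches.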