Overview

Dataset statistics

Number of variables: 4
Number of observations: 46
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 2.2%
Total size in memory: 1.7 KiB
Average record size in memory: 37.9 B
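
The percentage and size figures in this block follow from the raw counts. A minimal sketch of that arithmetic, using only numbers printed above (the variable and function names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block of the report.
n_rows = 46
n_duplicate_rows = 1
avg_record_bytes = 37.9  # "Average record size in memory", in bytes


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole` as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


duplicate_pct = percent(n_duplicate_rows, n_rows)

# Total size is the per-record size times the row count (1 KiB = 1024 bytes).
total_kib = avg_record_bytes * n_rows / 1024

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Both results agree with the reported 2.2% and 1.7 KiB after rounding.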

Variable types

Categorical: 2
Numeric: 2

Dataset

Description: Sample data
Author: 한국기상산업기술원
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=3d1cac40-9fcf-11ee-a443-a7e161ec5b2c

Alerts

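지점번호 (station number) has constant value "1" [Constant]
Dataset has 1 (2.2%) duplicate row [Duplicates]
전운량 (total cloud cover) is highly overall correlated with 태양광발전량 (solar power generation) [High correlation]
태양광발전량 (solar power generation) is highly overall correlated with 전운량 (total cloud cover) [High correlation]
전운량 (total cloud cover) has 16 (34.8%) zeros [Zeros]
태양광발전량 (solar power generation) has 24 (52.2%) zeros [Zeros]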

Reproduction

Analysis started: 2024-01-05 20:27:00.135080
Analysis finished: 2024-01-05 20:27:05.977514
Duration: 5.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
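
A report of this kind can be regenerated with ydata-profiling. The sketch below is only an assumption about how that might look: the function name `build_report`, the file paths, the report title, and the decision to read 지점번호 as a string (so that it is profiled as categorical) are hypothetical and are not taken from the report or its config.json. `ProfileReport` and `to_file` are the library's documented entry points.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report (hypothetical helper)."""
    # Imported lazily so the module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: reading the station number as text keeps it categorical.
    df = pd.read_csv(csv_path, dtype={"지점번호": str})

    profile = ProfileReport(df, title="Sample data")
    profile.to_file(html_path)
    return Path(html_path)
```

Calling `build_report("sample.csv")` with a suitable file (the file name is a placeholder) would write `report.html` to the working directory.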

Variables

지점번호 (station number)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Value  Count
1      46

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      46     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

연월일시분 (date and time)
Categorical

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Value       Count
2020-01-01  23
2020-01-02  23

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-01
2nd row: 2020-01-01
3rd row: 2020-01-01
4th row: 2020-01-01
5th row: 2020-01-01

Common Values

Value       Count  Frequency (%)
2020-01-01  23     50.0%
2020-01-02  23     50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

전운량 (total cloud cover)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 30
Distinct (%): 65.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.98843961
Minimum: 0
Maximum: 9.5933333
Zeros: 16
Zeros (%): 34.8%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.25277778
Q3: 0.7515
95-th percentile: 6.76
Maximum: 9.5933333
Range: 9.5933333
Interquartile range (IQR): 0.7515

Descriptive statistics

Standard deviation: 2.2000849
Coefficient of variation (CV): 2.2258162
Kurtosis: 9.2923242
Mean: 0.98843961
Median Absolute Deviation (MAD): 0.25277778
Skewness: 3.1450237
Sum: 45.468222
Variance: 4.8403734
Monotonicity: Not monotonic
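
Several of the 전운량 statistics are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum). The sketch below checks the reported figures against each other. All numbers are copied from the report; the dictionary layout and the tolerance are illustrative choices.

```python
import math

# Figures copied from the 전운량 (total cloud cover) section of the report.
n = 46
total = 45.468222
mean = 0.98843961
std = 2.2000849
variance = 4.8403734
cv = 2.2258162
q1, q3, iqr = 0.0, 0.7515, 0.7515
minimum, maximum, value_range = 0.0, 9.5933333, 9.5933333

# Each entry pairs a value recomputed from other figures with the reported one.
checks = {
    "mean = sum / n": (total / n, mean),
    "variance = std^2": (std ** 2, variance),
    "CV = std / mean": (std / mean, cv),
    "IQR = Q3 - Q1": (q3 - q1, iqr),
    "range = max - min": (maximum - minimum, value_range),
}

# The report rounds to about eight significant digits, hence the loose tolerance.
results = {
    name: math.isclose(computed, reported, rel_tol=1e-6)
    for name, (computed, reported) in checks.items()
}

for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

That variance equals the squared standard deviation only confirms the two figures are mutually consistent; it does not by itself show which denominator (n or n - 1) the tool used.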
[Figure: Histogram with fixed size bins (bins=30)]
Common values

Value              Count  Frequency (%)
0.0                16     34.8%
0.093333333        2      4.3%
0.36               1      2.2%
0.486666667        1      2.2%
0.272222222        1      2.2%
0.233333333        1      2.2%
0.05               1      2.2%
0.044              1      2.2%
0.3                1      2.2%
0.99               1      2.2%
Other values (20)  20     43.5%

Minimum 10 values

Value        Count  Frequency (%)
0.0          16     34.8%
0.033333333  1      2.2%
0.044        1      2.2%
0.05         1      2.2%
0.093333333  2      4.3%
0.173333333  1      2.2%
0.233333333  1      2.2%
0.272222222  1      2.2%
0.3          1      2.2%
0.316666667  1      2.2%

Maximum 10 values

Value        Count  Frequency (%)
9.593333333  1      2.2%
8.93         1      2.2%
7.64         1      2.2%
4.12         1      2.2%
1.923333333  1      2.2%
1.72         1      2.2%
1.396666667  1      2.2%
1.326666667  1      2.2%
0.99         1      2.2%
0.88         1      2.2%

태양광발전량 (solar power generation)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 20
Distinct (%): 43.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.69565
Minimum: 0
Maximum: 577
Zeros: 24
Zeros (%): 52.2%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 294
95-th percentile: 539.75
Maximum: 577
Range: 577
Interquartile range (IQR): 294

Descriptive statistics

Standard deviation: 206.14039
Coefficient of variation (CV): 1.4651511
Kurtosis: -0.57846913
Mean: 140.69565
Median Absolute Deviation (MAD): 0
Skewness: 1.0622063
Sum: 6472
Variance: 42493.861
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=20)]
Common values

Value              Count  Frequency (%)
0                  24     52.2%
2                  3      6.5%
465                2      4.3%
115                1      2.2%
267                1      2.2%
86                 1      2.2%
273                1      2.2%
420                1      2.2%
493                1      2.2%
515                1      2.2%
Other values (10)  10     21.7%

Minimum 10 values

Value  Count  Frequency (%)
0      24     52.2%
2      3      6.5%
3      1      2.2%
86     1      2.2%
93     1      2.2%
97     1      2.2%
115    1      2.2%
267    1      2.2%
273    1      2.2%
301    1      2.2%

Maximum 10 values

Value  Count  Frequency (%)
577    1      2.2%
550    1      2.2%
548    1      2.2%
515    1      2.2%
493    1      2.2%
487    1      2.2%
465    2      4.3%
420    1      2.2%
405    1      2.2%
306    1      2.2%
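
Because 태양광발전량 has only 20 distinct values, and the ten smallest and ten largest values listed above do not overlap, the two lists together give the full value-to-count mapping for all 46 observations. The sketch below rebuilds the column from those counts and recomputes several reported statistics using only the Python standard library. The quantile method (`inclusive`, linear interpolation) and the use of the sample variance (n - 1 denominator) are assumptions about how the tool computes these figures; they do reproduce the reported values here.

```python
import statistics

# value -> count, merged from the ten smallest and ten largest values listed
# in the report (20 distinct values in total).
counts = {
    0: 24, 2: 3, 3: 1, 86: 1, 93: 1, 97: 1, 115: 1, 267: 1, 273: 1, 301: 1,
    306: 1, 405: 1, 420: 1, 465: 2, 487: 1, 493: 1, 515: 1, 548: 1, 550: 1,
    577: 1,
}

# Expand the counts back into a sorted list of observations.
values = sorted(v for v, c in counts.items() for _ in range(c))

n = len(values)
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)

# Quartiles with linear interpolation between order statistics.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Sample variance and standard deviation (n - 1 in the denominator).
variance = statistics.variance(values)
std = statistics.stdev(values)

zeros_pct = 100 * counts[0] / n

print(f"n={n} sum={total} mean={mean:.5f} median={median}")
print(f"Q1={q1} Q3={q3} IQR={q3 - q1}")
print(f"variance={variance:.3f} std={std:.5f} zeros={zeros_pct:.1f}%")
```

The rebuilt column matches the reported count (46), sum (6472), median (0), Q3 (294), and variance (42493.861).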

Interactions

[Figures: four interaction plots for the numeric variables 전운량 and 태양광발전량]

Correlations

[Figure: correlation heatmap]

              연월일시분  전운량  태양광발전량
연월일시분    1.000       0.220   0.264
전운량        0.220       1.000   0.000
태양광발전량  0.264       0.000   1.000

[Figure: correlation heatmap]

              전운량  태양광발전량  연월일시분
전운량        1.000   -0.764        0.143
태양광발전량  -0.764  1.000         0.261
연월일시분    0.143   0.261         1.000
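
The report does not say which coefficient produced each matrix. To illustrate how a rank-based (Spearman) coefficient between 전운량 and 태양광발전량 could be computed, the sketch below uses only the 20 rows printed in the report's Sample section (rows 0-9 and 36-45). It is therefore not expected to reproduce the -0.764 above, which was computed on all 46 rows. The helper names are hypothetical.

```python
def average_ranks(xs):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))


# 전운량 and 태양광발전량 for sample rows 0-9 and 36-45 of the report.
cloud = [0.36, 0.373333, 0.593333, 4.12, 8.93, 9.593333, 7.64, 1.396667,
         0.792, 0.033333, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.05,
         0.233333, 0.272222]
solar = [0, 0, 0, 0, 0, 0, 0, 3, 115, 306,
         515, 493, 420, 273, 86, 2, 0, 0, 0, 0]

rho = spearman(cloud, solar)
print(f"Spearman correlation on the 20 sample rows: {rho:.3f}")
```

On these rows the coefficient is negative, the same direction as the reported value: hours with more cloud cover tend to show less generation.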

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

    지점번호  연월일시분  전운량    태양광발전량
0   1         2020-01-01  0.36      0
1   1         2020-01-01  0.373333  0
2   1         2020-01-01  0.593333  0
3   1         2020-01-01  4.12      0
4   1         2020-01-01  8.93      0
5   1         2020-01-01  9.593333  0
6   1         2020-01-01  7.64      0
7   1         2020-01-01  1.396667  3
8   1         2020-01-01  0.792     115
9   1         2020-01-01  0.033333  306

Last rows

    지점번호  연월일시분  전운량    태양광발전량
36  1         2020-01-02  0.0       515
37  1         2020-01-02  0.0       493
38  1         2020-01-02  0.0       420
39  1         2020-01-02  0.0       273
40  1         2020-01-02  0.0       86
41  1         2020-01-02  0.0       2
42  1         2020-01-02  0.0       0
43  1         2020-01-02  0.05      0
44  1         2020-01-02  0.233333  0
45  1         2020-01-02  0.272222  0

Duplicate rows

Most frequently occurring

    지점번호  연월일시분  전운량  태양광발전량  # duplicates
0   1         2020-01-01  0.0     465           2