Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 487
Missing cells (%): 0.8%
Duplicate rows: 32
Duplicate rows (%): 0.3%
Total size in memory: 585.9 KiB
Average record size in memory: 60.0 B
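
The percentage and per-record figures above can be recomputed from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable and function names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 6
n_observations = 10_000
missing_cells = 487
duplicate_rows = 32
total_size_kib = 585.9


def derived_statistics():
    """Recompute the percentage and per-record figures from the raw counts."""
    total_cells = n_variables * n_observations
    return {
        # Share of all cells that are missing, in percent.
        "missing_cells_pct": round(100 * missing_cells / total_cells, 1),
        # Share of rows flagged as duplicates, in percent.
        "duplicate_rows_pct": round(100 * duplicate_rows / n_observations, 1),
        # Average bytes per row, assuming 1 KiB = 1024 bytes.
        "avg_record_size_bytes": round(total_size_kib * 1024 / n_observations, 1),
    }


print(derived_statistics())
```

Under these assumptions the recomputed values agree with the reported 0.8%, 0.3% and 60.0 B.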

Variable types

DateTime: 1
Numeric: 4
Categorical: 1

Dataset

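Description: Drought information for rice paddies in 2022 and the updated reservoir storage rates, reported by city/county (si/gun). Drought information is given relative to the reference date of preparation: (N: current) (1: 1 month) (2: 2 months) (3: 3 months)
URL: https://www.data.go.kr/data/15117185/fileData.do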

Alerts

Dataset has 32 (0.3%) duplicate rows [Duplicates]
저수율 is highly overall correlated with 평년 and 1 other field [High correlation]
평년 is highly overall correlated with 저수율 [High correlation]
평년대비 is highly overall correlated with 저수율 and 1 other field [High correlation]
가뭄단계 is highly overall correlated with 평년대비 [High correlation]
가뭄단계 is highly imbalanced (85.3%) [Imbalance]
표준코드 has 487 (4.9%) missing values [Missing]
저수율 has 710 (7.1%) zeros [Zeros]
평년 has 710 (7.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 10:19:09.821970
Analysis finished: 2023-12-12 10:19:12.693153
Duration: 2.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
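
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block above.
started = datetime.fromisoformat("2023-12-12 10:19:09.821970")
finished = datetime.fromisoformat("2023-12-12 10:19:12.693153")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```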

Variables

기준일자
DateTime

Distinct: 365
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2022-12-31 00:00:00
Histogram with fixed size bins (bins=50)

표준코드
Real number (ℝ)

MISSING 

Distinct: 159
Distinct (%): 1.7%
Missing: 487
Missing (%): 4.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 44510.158
Minimum: 26710
Maximum: 50130
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 26710
5-th percentile: 41170
Q1: 42170
Median: 44825
Q3: 47170
95-th percentile: 48840
Maximum: 50130
Range: 23420
Interquartile range (IQR): 5000

Descriptive statistics

Standard deviation: 3780.6259
Coefficient of variation (CV): 0.084938496
Kurtosis: 7.0310518
Mean: 44510.158
Median Absolute Deviation (MAD): 2425
Skewness: -2.0598151
Sum: 4.2342513 × 10^8
Variance: 14293132
Monotonicity: Not monotonic
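
Several of the 표준코드 figures above are functions of one another, which gives a quick consistency check. The sketch below assumes that the sum and the distinct percentage are computed over the non-missing values only (10000 − 487 rows); that assumption is not stated in the report, and the names used are illustrative.

```python
import math

# Figures copied from the 표준코드 summary above.
n_observations = 10_000
n_missing = 487
n_distinct = 159
mean = 44510.158
std = 3780.6259
minimum, maximum = 26710, 50130
q1, q3 = 42170, 47170

# Assumption: sum and distinct (%) are taken over non-missing values only.
n_valid = n_observations - n_missing

derived = {
    "range": maximum - minimum,          # maximum minus minimum
    "iqr": q3 - q1,                      # interquartile range
    "cv": std / mean,                    # coefficient of variation
    "variance": std ** 2,                # square of the standard deviation
    "sum": mean * n_valid,               # mean times number of valid rows
    "distinct_pct": round(100 * n_distinct / n_valid, 1),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

Under that assumption the derived values agree with the reported range, IQR, CV, variance, sum and distinct percentage to the precision shown.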
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
42170                 79      0.8%
42820                 75      0.8%
46890                 75      0.8%
46830                 73      0.7%
44250                 73      0.7%
50110                 72      0.7%
48860                 72      0.7%
44825                 71      0.7%
44760                 71      0.7%
42800                 70      0.7%
Other values (149)    8782    87.8%
(Missing)             487     4.9%

Value                 Count   Frequency (%)
26710                 57      0.6%
27710                 59      0.6%
28710                 56      0.6%
28720                 54      0.5%
31710                 53      0.5%
41110                 49      0.5%
41130                 65      0.7%
41150                 60      0.6%
41170                 54      0.5%
41190                 58      0.6%

Value                 Count   Frequency (%)
50130                 68      0.7%
50110                 72      0.7%
48890                 46      0.5%
48880                 63      0.6%
48870                 54      0.5%
48860                 72      0.7%
48850                 63      0.6%
48840                 62      0.6%
48820                 58      0.6%
48740                 49      0.5%

저수율
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 80
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.5093
Minimum: 0
Maximum: 110
Zeros: 710
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 60
Median: 79
Q3: 90
95-th percentile: 99
Maximum: 110
Range: 110
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 25.61492
Coefficient of variation (CV): 0.35820404
Kurtosis: 1.7193784
Mean: 71.5093
Median Absolute Deviation (MAD): 14
Skewness: -1.4385966
Sum: 715093
Variance: 656.12413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     710     7.1%
100                   343     3.4%
85                    277     2.8%
82                    271     2.7%
94                    262     2.6%
93                    261     2.6%
95                    260     2.6%
88                    254     2.5%
96                    247     2.5%
87                    245     2.5%
Other values (70)     6870    68.7%

Value                 Count   Frequency (%)
0                     710     7.1%
17                    1       < 0.1%
28                    1       < 0.1%
29                    1       < 0.1%
30                    7       0.1%
31                    6       0.1%
32                    6       0.1%
33                    9       0.1%
34                    13      0.1%
35                    14      0.1%

Value                 Count   Frequency (%)
110                   2       < 0.1%
109                   13      0.1%
108                   6       0.1%
107                   4       < 0.1%
106                   4       < 0.1%
100                   343     3.4%
99                    178     1.8%
98                    201     2.0%
97                    198     2.0%
96                    247     2.5%

평년
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 65
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.9144
Minimum: 0
Maximum: 100
Zeros: 710
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 64
Median: 74
Q3: 81
95-th percentile: 91
Maximum: 100
Range: 100
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 21.764279
Coefficient of variation (CV): 0.31581613
Kurtosis: 4.3712949
Mean: 68.9144
Median Absolute Deviation (MAD): 8
Skewness: -2.0939301
Sum: 689144
Variance: 473.68384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     710     7.1%
75                    378     3.8%
77                    372     3.7%
80                    350     3.5%
71                    341     3.4%
78                    332     3.3%
79                    309     3.1%
83                    307     3.1%
73                    304     3.0%
74                    294     2.9%
Other values (55)     6303    63.0%

Value                 Count   Frequency (%)
0                     710     7.1%
31                    1       < 0.1%
37                    3       < 0.1%
38                    10      0.1%
39                    12      0.1%
40                    8       0.1%
42                    2       < 0.1%
43                    2       < 0.1%
44                    3       < 0.1%
45                    9       0.1%

Value                 Count   Frequency (%)
100                   71      0.7%
99                    6       0.1%
98                    15      0.1%
97                    35      0.4%
96                    63      0.6%
95                    42      0.4%
94                    86      0.9%
93                    68      0.7%
92                    76      0.8%
91                    80      0.8%

평년대비
Real number (ℝ)

HIGH CORRELATION 

Distinct: 146
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.1749
Minimum: 34
Maximum: 283
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 34
5-th percentile: 71
Q1: 95
Median: 104
Q3: 116
95-th percentile: 134
Maximum: 283
Range: 249
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 19.404205
Coefficient of variation (CV): 0.18626564
Kurtosis: 2.0803914
Mean: 104.1749
Median Absolute Deviation (MAD): 11
Skewness: 0.23289693
Sum: 1041749
Variance: 376.52316
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
100                   1049    10.5%
105                   265     2.6%
107                   253     2.5%
106                   251     2.5%
104                   233     2.3%
109                   227     2.3%
103                   217     2.2%
102                   206     2.1%
101                   202     2.0%
110                   201     2.0%
Other values (136)    6896    69.0%

Value                 Count   Frequency (%)
34                    1       < 0.1%
39                    2       < 0.1%
42                    1       < 0.1%
43                    2       < 0.1%
44                    6       0.1%
45                    2       < 0.1%
46                    6       0.1%
47                    4       < 0.1%
48                    6       0.1%
49                    7       0.1%

Value                 Count   Frequency (%)
283                   1       < 0.1%
211                   1       < 0.1%
198                   1       < 0.1%
195                   2       < 0.1%
190                   1       < 0.1%
183                   1       < 0.1%
182                   5       0.1%
181                   1       < 0.1%
180                   5       0.1%
179                   7       0.1%

가뭄단계
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

정상: 9513
관심: 326
주의: 115
경계: 43
심각: 3

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정상
2nd row: 정상
3rd row: 정상
4th row: 정상
5th row: 정상

Common Values

Value                 Count   Frequency (%)
정상                  9513    95.1%
관심                  326     3.3%
주의                  115     1.1%
경계                  43      0.4%
심각                  3       < 0.1%
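
The 85.3% imbalance figure in the alerts is consistent with one minus the normalized Shannon entropy of the class counts above. The sketch below computes that score; treating it as the formula behind the alert is an assumption based on the numbers matching, not something the report states, and `imbalance_score` is an illustrative name.

```python
import math

# Class counts copied from the 가뭄단계 common-values table above.
counts = {"정상": 9513, "관심": 326, "주의": 115, "경계": 43, "심각": 3}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, 1 for a single class.

    H is the Shannon entropy of the class distribution in bits and k is
    the number of classes. Assumed formula, not confirmed by the report.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(k)


score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

With 95.1% of the rows in a single class, the entropy is far below its maximum of log2(5) bits, so the score is close to 1.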

Length

Histogram of lengths of the category


Interactions


Correlations

            표준코드    저수율    평년     평년대비    가뭄단계
표준코드     1.000     0.466    0.413    0.290     0.153
저수율       0.466     1.000    0.726    0.637     0.838
평년         0.413     0.726    1.000    0.441     0.128
평년대비     0.290     0.637    0.441    1.000     0.718
가뭄단계     0.153     0.838    0.128    0.718     1.000

            표준코드    저수율    평년     평년대비    가뭄단계
표준코드     1.000    -0.257   -0.207   -0.154     0.103
저수율      -0.257     1.000    0.691    0.694     0.500
평년        -0.207     0.691    1.000    0.040     0.078
평년대비    -0.154     0.694    0.040    1.000     0.521
가뭄단계     0.103     0.500    0.078    0.521     1.000
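
The High correlation alerts can be cross-checked against the second matrix above. The sketch below lists every pair whose absolute coefficient strictly exceeds a threshold. The threshold of 0.5 and the choice of the second matrix are assumptions, picked because they reproduce the alert list; the report does not state either.

```python
# Second correlation matrix above, row and column order as printed.
columns = ["표준코드", "저수율", "평년", "평년대비", "가뭄단계"]
matrix = [
    [1.000, -0.257, -0.207, -0.154, 0.103],
    [-0.257, 1.000, 0.691, 0.694, 0.500],
    [-0.207, 0.691, 1.000, 0.040, 0.078],
    [-0.154, 0.694, 0.040, 1.000, 0.521],
    [0.103, 0.500, 0.078, 0.521, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlation_pairs(names, values, threshold):
    """Return (name_a, name_b, coefficient) for each pair above the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(values[i][j]) > threshold:
                pairs.append((names[i], names[j], values[i][j]))
    return pairs


for a, b, r in high_correlation_pairs(columns, matrix, THRESHOLD):
    print(f"{a} - {b}: {r:.3f}")
```

The three pairs found (저수율 with 평년, 저수율 with 평년대비, 평년대비 with 가뭄단계) match the alerts. The 저수율 and 가뭄단계 entry is shown as 0.500 after rounding and does not pass a strict comparison, which fits the alerts not listing that pair.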

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index    기준일자      표준코드   저수율   평년   평년대비   가뭄단계
32202    2022-03-24    47850      93       84     111        정상
39674    2022-09-12    46170      43       56     76         정상
39437    2022-01-18    46170      68       58     118        정상
273      2022-10-01    42150      88       76     116        정상
35785    2022-01-16    <NA>       76       77     99         정상
42690    2022-12-17    46870      46       64     73         정상
6233     2022-01-29    42730      87       88     100        정상
24098    2022-01-09    48890      81       74     111        정상
40010    2022-08-14    46710      40       60     67         관심
43865    2022-03-07    46800      71       72     98         정상

Index    기준일자      표준코드   저수율   평년   평년대비   가뭄단계
19548    2022-07-23    48840      47       83     57         주의
21023    2022-08-07    48330      81       85     96         정상
1186     2022-04-02    42230      100      99     102        정상
12557    2022-05-28    41170      0        0      100        정상
2209     2022-01-20    42830      97       90     108        정상
52364    2022-06-19    44710      67       60     113        정상
33539    2022-11-21    27710      82       71     117        정상
52738    2022-06-28    44230      44       50     87         정상
27286    2022-10-04    47920      78       49     162        정상
27983    2022-09-01    47840      42       68     61         관심

Duplicate rows

Most frequently occurring

Index    기준일자      표준코드   저수율   평년   평년대비   가뭄단계   # duplicates
16       2022-06-11    <NA>       0        0      100        정상       3
17       2022-06-13    <NA>       0        0      100        정상       3
22       2022-08-24    <NA>       0        0      100        정상       3
0        2022-01-11    <NA>       0        0      100        정상       2
1        2022-01-12    <NA>       0        0      100        정상       2
2        2022-01-23    <NA>       0        0      100        정상       2
3        2022-02-05    <NA>       0        0      100        정상       2
4        2022-02-19    <NA>       0        0      100        정상       2
5        2022-02-28    <NA>       0        0      100        정상       2
6        2022-03-03    <NA>       0        0      100        정상       2
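
A "# duplicates" column like the one above can be obtained by counting identical rows. The following is a minimal standard-library sketch on a handful of hypothetical rows shaped like this dataset (missing 표준코드 represented as `None`); it is not the profiler's own implementation, and `duplicate_groups` is an illustrative name.

```python
from collections import Counter

# Hypothetical rows: (기준일자, 표준코드, 저수율, 평년, 평년대비, 가뭄단계).
rows = [
    ("2022-06-11", None, 0, 0, 100, "정상"),
    ("2022-06-11", None, 0, 0, 100, "정상"),
    ("2022-06-11", None, 0, 0, 100, "정상"),
    ("2022-01-11", None, 0, 0, 100, "정상"),
    ("2022-01-11", None, 0, 0, 100, "정상"),
    ("2022-03-24", 47850, 93, 84, 111, "정상"),
]


def duplicate_groups(records):
    """Return (row, count) for rows occurring more than once, most frequent first."""
    counts = Counter(records)
    return [(row, n) for row, n in counts.most_common() if n > 1]


for row, n in duplicate_groups(rows):
    print(row, n)
```

All duplicated rows listed in the table have a missing 표준코드 together with zero 저수율 and 평년, so once the identifying code is absent such rows are indistinguishable whenever they share a date.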