Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 43.3 B

Variable types

Categorical: 2
Text: 1
Numeric: 2

Alerts

평년(mm) is highly overall correlated with 부족량 (High correlation)
부족량 is highly overall correlated with 평년(mm) (High correlation)
부족량 is highly imbalanced (76.8%) (Imbalance)
평년(mm) has unique values (Unique)
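The 76.8% imbalance figure for 부족량 can be reproduced from the category counts reported in its variable section (one value occurring 90 times, ten values occurring once each). The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. That definition is an assumption that happens to match the reported number, and `imbalance_score` is an illustrative name, not a function taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class leaves nothing to compare against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# 부족량: "-" occurs 90 times, the other 10 distinct values occur once each.
deficit_counts = [90] + [1] * 10
score = imbalance_score(deficit_counts)
print(f"{score:.1%}")  # 76.8%
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one value approaches 1.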

Reproduction

Analysis started: 2023-12-10 10:53:24.272893
Analysis finished: 2023-12-10 10:53:25.558137
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도명 (province name)
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
경기도 | 31
경상북도 | 23
강원도 | 18
경상남도 | 18
전라남도 | 10

Length

Max length: 4
Median length: 4
Mean length: 3.51
Min length: 3
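The length statistics follow directly from the value counts listed above. A small check, assuming length is measured in Unicode code points (what Python's `len` returns for these precomposed Hangul strings):

```python
import statistics

# Value counts for 시도명 as reported in this section.
value_counts = {"경기도": 31, "경상북도": 23, "강원도": 18, "경상남도": 18, "전라남도": 10}

# Expand the counts into one length per observation.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

n = len(lengths)
mean_length = sum(lengths) / n
median_length = statistics.median(lengths)

print(n, min(lengths), max(lengths), median_length, round(mean_length, 2))
```

With 49 three-character names and 51 four-character names, the mean works out to 351 / 100 = 3.51.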

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
경기도 | 31 | 31.0%
경상북도 | 23 | 23.0%
강원도 | 18 | 18.0%
경상남도 | 18 | 18.0%
전라남도 | 10 | 10.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 시도명]

시군명 (city/county name)
Text

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.03
Min length: 3

Characters and Unicode

Total characters: 303
Distinct characters: 85
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 98.0%
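Distinct and Unique measure different things, which is why 시군명 has 99 distinct values but only 98 unique ones: 고성군 occurs twice. The sketch below assumes "distinct" counts different values and "unique" counts values that occur exactly once. The `name_0` ... `name_97` strings are hypothetical placeholders for the 98 single-occurrence names, which are not all listed in this report.

```python
from collections import Counter

# 고성군 appears twice; 98 placeholder names (hypothetical) each appear once.
values = ["고성군", "고성군"] + [f"name_{i}" for i in range(98)]

counts = Counter(values)
n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n_distinct, n_unique)  # 99 98
print(f"{n_distinct / len(values):.1%} {n_unique / len(values):.1%}")  # 99.0% 98.0%
```

The same reasoning explains why 시도명 reports 5 distinct values and 0 unique ones: every province name occurs at least ten times.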

Sample

1st row: 정선군
2nd row: 평창군
3rd row: 영월군
4th row: 횡성군
5th row: 홍천군

Value | Count | Frequency (%)
고성군 | 2 | 2.0%
경산시 | 1 | 1.0%
김해시 | 1 | 1.0%
영천시 | 1 | 1.0%
영주시 | 1 | 1.0%
구미시 | 1 | 1.0%
안동시 | 1 | 1.0%
경주시 | 1 | 1.0%
포항시 | 1 | 1.0%
김천시 | 1 | 1.0%
Other values (89) | 89 | 89.0%

Most occurring characters

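Value | Count | Frequency (%)
(not preserved) | 54 | 17.8%
(not preserved) | 49 | 16.2%
(not preserved) | 14 | 4.6%
(not preserved) | 12 | 4.0%
(not preserved) | 11 | 3.6%
(not preserved) | 9 | 3.0%
(not preserved) | 8 | 2.6%
(not preserved) | 6 | 2.0%
(not preserved) | 5 | 1.7%
(not preserved) | 5 | 1.7%
Other values (75) | 130 | 42.9%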

Most occurring categories

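Value | Count | Frequency (%)
Other Letter | 303 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 54 | 17.8%
(not preserved) | 49 | 16.2%
(not preserved) | 14 | 4.6%
(not preserved) | 12 | 4.0%
(not preserved) | 11 | 3.6%
(not preserved) | 9 | 3.0%
(not preserved) | 8 | 2.6%
(not preserved) | 6 | 2.0%
(not preserved) | 5 | 1.7%
(not preserved) | 5 | 1.7%
Other values (75) | 130 | 42.9%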

Most occurring scripts

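Value | Count | Frequency (%)
Hangul | 303 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 54 | 17.8%
(not preserved) | 49 | 16.2%
(not preserved) | 14 | 4.6%
(not preserved) | 12 | 4.0%
(not preserved) | 11 | 3.6%
(not preserved) | 9 | 3.0%
(not preserved) | 8 | 2.6%
(not preserved) | 6 | 2.0%
(not preserved) | 5 | 1.7%
(not preserved) | 5 | 1.7%
Other values (75) | 130 | 42.9%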

Most occurring blocks

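Value | Count | Frequency (%)
Hangul | 303 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 54 | 17.8%
(not preserved) | 49 | 16.2%
(not preserved) | 14 | 4.6%
(not preserved) | 12 | 4.0%
(not preserved) | 11 | 3.6%
(not preserved) | 9 | 3.0%
(not preserved) | 8 | 2.6%
(not preserved) | 6 | 2.0%
(not preserved) | 5 | 1.7%
(not preserved) | 5 | 1.7%
Other values (75) | 130 | 42.9%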

강수량(mm) (precipitation)
Real number (ℝ)

Distinct: 96
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 141.563
Minimum: 85.5
Maximum: 219.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 85.5
5-th percentile: 103.005
Q1: 118.675
Median: 144.6
Q3: 159.325
95-th percentile: 194.355
Maximum: 219.4
Range: 133.9
Interquartile range (IQR): 40.65

Descriptive statistics

Standard deviation: 29.090297
Coefficient of variation (CV): 0.20549365
Kurtosis: -0.42897445
Mean: 141.563
Median Absolute Deviation (MAD): 22.2
Skewness: 0.42866601
Sum: 14156.3
Variance: 846.24538
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
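Several of the figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for 강수량(mm) from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures reported for 강수량(mm) in this section.
mean, std = 141.563, 29.090297
minimum, maximum = 85.5, 219.4
q1, q3 = 118.675, 159.325
n = 100  # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(round(value_range, 1), round(iqr, 2), round(cv, 4),
      round(variance, 2), round(total, 1))
```

Each recomputed value agrees with the corresponding line of the report to the precision shown there.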

Common values

Value | Count | Frequency (%)
154.1 | 2 | 2.0%
160.8 | 2 | 2.0%
122.7 | 2 | 2.0%
146.5 | 2 | 2.0%
167.8 | 1 | 1.0%
99.0 | 1 | 1.0%
131.7 | 1 | 1.0%
118.9 | 1 | 1.0%
130.9 | 1 | 1.0%
103.4 | 1 | 1.0%
Other values (86) | 86 | 86.0%

Minimum 10 values

Value | Count | Frequency (%)
85.5 | 1 | 1.0%
96.0 | 1 | 1.0%
99.0 | 1 | 1.0%
99.2 | 1 | 1.0%
99.3 | 1 | 1.0%
103.2 | 1 | 1.0%
103.3 | 1 | 1.0%
103.4 | 1 | 1.0%
104.6 | 1 | 1.0%
106.0 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
219.4 | 1 | 1.0%
208.7 | 1 | 1.0%
203.9 | 1 | 1.0%
200.5 | 1 | 1.0%
197.3 | 1 | 1.0%
194.2 | 1 | 1.0%
189.4 | 1 | 1.0%
187.9 | 1 | 1.0%
186.4 | 1 | 1.0%
179.5 | 1 | 1.0%

평년(mm) (normal-year precipitation)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 96.80773
Minimum: 68.632584
Maximum: 188.50321
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 68.632584
5-th percentile: 76.752027
Q1: 86.219289
Median: 92.606276
Q3: 97.779805
95-th percentile: 138.26513
Maximum: 188.50321
Range: 119.87063
Interquartile range (IQR): 11.560516

Descriptive statistics

Standard deviation: 20.506555
Coefficient of variation (CV): 0.21182766
Kurtosis: 7.2266711
Mean: 96.80773
Median Absolute Deviation (MAD): 5.9944511
Skewness: 2.4252911
Sum: 9680.773
Variance: 420.5188
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
82.8623129366353 | 1 | 1.0%
138.18313869576 | 1 | 1.0%
91.3348975367648 | 1 | 1.0%
87.3246752029017 | 1 | 1.0%
102.72988885173 | 1 | 1.0%
75.7509603056666 | 1 | 1.0%
83.8207959045143 | 1 | 1.0%
91.9714718894515 | 1 | 1.0%
88.7211281235294 | 1 | 1.0%
76.7614982396149 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
68.6325840794395 | 1 | 1.0%
75.0143897559289 | 1 | 1.0%
75.0812207299508 | 1 | 1.0%
75.7509603056666 | 1 | 1.0%
76.5720678172201 | 1 | 1.0%
76.7614982396149 | 1 | 1.0%
77.7320492995171 | 1 | 1.0%
78.0538894182465 | 1 | 1.0%
78.3390855964504 | 1 | 1.0%
78.8589889749964 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
188.503213485952 | 1 | 1.0%
186.452556789326 | 1 | 1.0%
155.869025359211 | 1 | 1.0%
149.476219766065 | 1 | 1.0%
139.822931497422 | 1 | 1.0%
138.18313869576 | 1 | 1.0%
137.459993252233 | 1 | 1.0%
126.915162139388 | 1 | 1.0%
126.686122123757 | 1 | 1.0%
123.467527874487 | 1 | 1.0%

부족량 (shortfall)
Categorical

Alerts: High correlation, Imbalance

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
- | 90
17.6525567893256 | 1
19.0151621393881 | 1
18.4070998899012 | 1
12.7229314974216 | 1
Other values (6) | 6

Length

Max length: 16
Median length: 1
Mean length: 2.49
Min length: 1

Unique

Unique: 10
Unique (%): 10.0%

Sample

1st row: -
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 90 | 90.0%
17.6525567893256 | 1 | 1.0%
19.0151621393881 | 1 | 1.0%
18.4070998899012 | 1 | 1.0%
12.7229314974216 | 1 | 1.0%
35.2032134859524 | 1 | 1.0%
22.9762197660651 | 1 | 1.0%
39.1831386957602 | 1 | 1.0%
1.3690253592107 | 1 | 1.0%
17.4861221237571 | 1 | 1.0%
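부족량 is profiled as Categorical because 90 of its 100 cells hold the placeholder "-" rather than a number. A minimal sketch of converting such cells to numeric values before profiling, assuming "-" means that no shortfall was recorded (an interpretation, not something the report states). `parse_deficit` is a hypothetical helper name.

```python
from typing import Optional

def parse_deficit(raw: str) -> Optional[float]:
    """Convert a 부족량 cell to float, mapping the "-" placeholder to None."""
    text = raw.strip()
    if text == "-":
        return None
    return float(text)

# A few cells taken from the common-values table above.
cells = ["-", "17.6525567893256", "1.3690253592107", "-"]
parsed = [parse_deficit(c) for c in cells]
numeric = [v for v in parsed if v is not None]

print(parsed)
print(len(numeric), max(numeric))
```

Once the placeholder is mapped to a missing value, the column could be treated as numeric, and the missing-cell count would reflect the rows without a recorded shortfall.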

Length

[Figure: Histogram of lengths of the category]

Interactions

[Figures: four interaction plots]

Correlations

Variable | 시도명 | 시군명 | 강수량(mm) | 평년(mm) | 부족량
시도명 | 1.000 | 0.837 | 0.624 | 0.481 | 0.270
시군명 | 0.837 | 1.000 | 0.940 | 0.000 | 0.000
강수량(mm) | 0.624 | 0.940 | 1.000 | 0.349 | 0.365
평년(mm) | 0.481 | 0.000 | 0.349 | 1.000 | 0.896
부족량 | 0.270 | 0.000 | 0.365 | 0.896 | 1.000

Variable | 부족량 | 시도명
부족량 | 1.000 | 0.143
시도명 | 0.143 | 1.000

Variable | 강수량(mm) | 평년(mm) | 시도명 | 부족량
강수량(mm) | 1.000 | 0.129 | 0.298 | 0.160
평년(mm) | 0.129 | 1.000 | 0.295 | 0.691
시도명 | 0.298 | 0.295 | 1.000 | 0.143
부족량 | 0.160 | 0.691 | 0.143 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

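First rows

Index | 시도명 | 시군명 | 강수량(mm) | 평년(mm) | 부족량
0 | 강원도 | 정선군 | 167.8 | 82.862313 | -
1 | 강원도 | 평창군 | 154.1 | 94.630574 | -
2 | 강원도 | 영월군 | 157.6 | 85.907363 | -
3 | 강원도 | 횡성군 | 160.8 | 94.337923 | -
4 | 강원도 | 홍천군 | 179.2 | 94.162192 | -
5 | 강원도 | 삼척시 | 143.1 | 75.081221 | -
6 | 강원도 | 양양군 | 187.9 | 92.918088 | -
7 | 강원도 | 고성군 | 179.0 | 93.33054 | -
8 | 강원도 | 인제군 | 194.2 | 92.594555 | -
9 | 강원도 | 양구군 | 177.4 | 85.398638 | -

Last rows

Index | 시도명 | 시군명 | 강수량(mm) | 평년(mm) | 부족량
90 | 전라남도 | 화순군 | 113.9 | 99.748278 | -
91 | 전라남도 | 장흥군 | 133.1 | 123.467528 | -
92 | 전라남도 | 강진군 | 125.2 | 117.932519 | -
93 | 전라남도 | 해남군 | 118.0 | 104.373524 | -
94 | 전라남도 | 영암군 | 113.0 | 94.256854 | -
95 | 전라남도 | 무안군 | 106.9 | 87.574858 | -
96 | 전라남도 | 함평군 | 110.7 | 89.742362 | -
97 | 전라남도 | 영광군 | 96.0 | 91.106564 | -
98 | 전라남도 | 장성군 | 99.3 | 92.705086 | -
99 | 전라남도 | 완도군 | 146.4 | 137.459993 | -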